NAME
sitecopy - maintain remote copies of web sites
SYNOPSIS
sitecopy [options] [operation mode] sitename ...
DESCRIPTION
sitecopy is for copying locally stored web sites to remote
web servers. A single command will upload files to the
server which have changed locally, and delete files from
the server which have been removed locally, to keep the
remote site synchronized with the local site. The aim is
to remove the hassle of uploading and deleting individual
files using an FTP client. sitecopy will also optionally
try to spot files you move locally, and move them
remotely.
FTP, WebDAV and other HTTP-based authoring servers (for
instance, AOLserver and Netscape Enterprise) are sup-
ported.
GETTING STARTED
This section covers how to start maintaining a web site
using sitecopy. After introducing the basics, two situations
are covered: first, where you have already uploaded
the site to the remote server; second, where you haven't.
Lastly, normal site maintenance activities are explained.
Introducing the Basics
If you have not already done so, you need to create an
rcfile, which will store information about the sites you
wish to administer. You also need to create a storage
directory, which sitecopy uses to record the state of the
files on each of the remote sites. The rcfile and storage
directory must both be accessible only by you - sitecopy
will not run otherwise. To create the storage directory
with the correct permissions, use the command
mkdir -m 700 .sitecopy
from your home directory. To create the rcfile, use the
commands
touch .sitecopyrc
chmod 600 .sitecopyrc
from your home directory. Once this is done, edit the
rcfile to enter your site details as shown in the CONFIGU-
RATION section.
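The setup steps above can be combined into one short shell
sketch. It works inside a scratch directory (the variable
demo_home is an assumption made for illustration) so that
nothing in your real home directory is touched; use "$HOME"
in its place to do the setup for real. The stat -c calls
assume GNU coreutils.

```shell
#!/bin/sh
# Sketch: create the storage directory and rcfile with owner-only permissions.
# demo_home is a hypothetical scratch directory standing in for $HOME.
demo_home=$(mktemp -d)

# Storage directory: readable, writable and searchable by the owner only.
mkdir -m 700 "$demo_home/.sitecopy"

# rcfile: create it empty, then restrict it to the owner.
touch "$demo_home/.sitecopyrc"
chmod 600 "$demo_home/.sitecopyrc"

# Show the resulting modes (stat -c is the GNU coreutils form).
dir_mode=$(stat -c %a "$demo_home/.sitecopy")
rc_mode=$(stat -c %a "$demo_home/.sitecopyrc")
echo "storage directory mode: $dir_mode"
echo "rcfile mode: $rc_mode"
```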
Existing Remote Site
If you have already uploaded the site to the remote
server, ensure your local files are synchronized with the
remote copy. Then run
sitecopy --catchup sitename
where sitename is the name of the site you used after the
site keyword in the rcfile.
If you do not have a local copy of the remote site, then
you can use fetch mode to discover what is on the remote
site, and synchronize mode to download it. Fetch mode
works well for WebDAV servers, but has only limited
support for FTP servers. Run
sitecopy --fetch sitename
to fetch the site - if this succeeds, then run
sitecopy --synch sitename
to download a local copy. Do NOT do this if you already
have a local copy of your site.
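The two-step download can be wrapped in a small shell
function. This is only a sketch: the function name
first_download, the SITECOPY override variable and the
empty-directory check are all assumptions added for
illustration, and it further assumes that fetch mode
signals success with a zero exit status.

```shell
#!/bin/sh
# Sketch: download a site for which no local copy exists yet.
# SITECOPY can be pointed at another command (for a dry run, say).
SITECOPY=${SITECOPY:-sitecopy}

# first_download SITE LOCALDIR
# LOCALDIR is the directory named by the site's "local" key; it is passed
# in by hand because this sketch does not parse the rcfile.
first_download() {
    site=$1
    localdir=$2

    # Refuse to continue if the local directory already holds files,
    # since synchronize mode overwrites local files.
    if [ -d "$localdir" ] && [ -n "$(ls -A "$localdir")" ]; then
        echo "$localdir is not empty; not synchronizing" >&2
        return 1
    fi

    # Only synchronize if fetch mode succeeded.
    "$SITECOPY" --fetch "$site" && "$SITECOPY" --synch "$site"
}
```

A call might look like: first_download mysite /home/fred/html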
New Remote Site
Ensure that the root directory of the site has been cre-
ated on the server by the server administrator. Run
sitecopy --init sitename
where sitename is the name of the site you used after the
site keyword in the rcfile.
Site Maintenance
After setting up the site as described in one of the two
sections above, you can start editing your local files as
normal. When you have finished a set of changes, and you
want to update the remote copy of the site, run:
sitecopy --update sitename
and all the changed files will be uploaded to the server.
Any files you delete locally will be deleted remotely too,
unless the nodelete option is specified in the rcfile. If
you move any files between directories, the remote files
will be deleted from the server then uploaded again unless
you specify the checkmoved option in the rcfile.
At any time, if you wish to see what changes you have made
to the local site since the last update, you can run
sitecopy sitename
which will display the list of differences.
Synchronization Problems
In some circumstances, the actual files which make up the
remote site will be different from what sitecopy thinks is
on the remote site. This can happen, for instance, if the
connection to the server is broken during an update. When
this situation arises, Fetch Mode should be used to fetch
the list of files making up the site from the remote
server.
In normal operation, specify a single operation mode,
followed by any options you choose, then one or more site
names. For instance,
sitecopy --update --quiet mainsite anothersite
will quietly update the sites named 'mainsite' and 'anoth-
ersite'.
OPERATION MODES
-l, --list
List Mode - produces a listing of all the differ-
ences between the local files and the remote copy
for the specified sites.
-ll, --flatlist
Flat list Mode - like list mode, except the output
produced is suitable for parsing by an external
script or program. An AWK script, changes.awk, is
provided which produces an HTML page from this
mode.
-u, --update
Update Mode - updates the remote copy of the speci-
fied sites.
-f, --fetch
Fetch Mode - fetches the list of files from the
remote server. Note that this mode has only lim-
ited support in FTP - the server must accept the
MDTM command, and use a Unix-style 'ls' for LIST
implementation.
-s, --synchronize
Synchronize Mode - updates the local site from the
remote copy. WARNING: This mode overwrites local
files. Use with care.
-i, --initialize
Initialization Mode - initializes the sites speci-
fied - making sitecopy think there are NO files on
the remote server.
-c, --catchup
Catchup Mode - makes sitecopy think the local site
is exactly the same as the remote copy.
-v, --view
View Mode - displays all the site definitions from
the rcfile.
-h, --help
Display help information.
OPTIONS
-y, --prompting
Applicable in Update Mode only. Prompts the user for
confirmation for each update (e.g., creating a
directory or uploading a file).
-r RCFILE, --rcfile=RCFILE
Specify an alternate run control file location.
-p PATH, --storepath=PATH
Specify an alternate location to use for the remote
site storage directory.
-q, --quiet
Quiet output - display the filename only for each
update performed.
-qq, --silent
Very quiet output - display nothing for each update
performed.
-o, --show-progress
Applicable in Update Mode only, displays the
progress (percentage complete) of data transfer.
-k, --keep-going
Keep going past errors in Update Mode or Synchronize Mode.
-a, --allsites
Perform the given operation on all sites - applica-
ble for all modes except View Mode, for which it
has no effect.
-d KEY[,KEY...], --debug=KEY[,KEY...]
Turns on debugging. A list of comma-separated key-
words should be given. Each keyword may be one of:
socket     Socket handling
files      File handling
rcfile     rcfile parser
http       HTTP driver
httpbody   Display response bodies in HTTP
ftp        FTP driver
xml        XML parsing information
xmlparse   Low-level XML parsing information
httpauth   HTTP authentication information
cleartext  Display passwords in plain text
Passwords will be obscured in the debug output
unless the cleartext keyword is used. Debugging is
useful, for example, when diagnosing FTP fetch mode.
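As a sketch of what debugging FTP fetch mode might look
like, the function below combines the --debug and --fetch
options described above. The choice of the ftp and socket
keywords, the function name debug_fetch and the SITECOPY
override variable are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: run fetch mode with FTP-related debugging switched on.
# The keyword selection (ftp, socket) is an assumed, plausible choice.
SITECOPY=${SITECOPY:-sitecopy}
debug_keys="ftp,socket"

debug_fetch() {
    # $1 is the site name as given after the site keyword in the rcfile.
    "$SITECOPY" --debug="$debug_keys" --fetch "$1"
}
```

A call might look like: debug_fetch mysite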
CONCEPTS
The stored state of a site is the snapshot of the state of
the site saved into the storage directory (~/.sitecopy/).
The storage file is used to record this state between
invocations. In update mode, sitecopy builds up a files
list for each site by scanning the local directory, read-
ing in the stored state, and comparing the two - determin-
ing which files have changed, which have moved, and so on.
CONFIGURATION
Configuration is performed via the run control file
(rcfile). This file contains a set of site definitions.
A unique name is assigned to every site definition, which
is used on the command line to refer to the site.
Each site definition contains the details of the server
the site is stored on, how the site may be accessed at
that server, where the site is held locally and remotely,
and any other options for the site.
Site Definition
A site definition is made up of a series of lines:
site sitename
server server-name
remote remote-root-directory
local local-root-directory
[ port port-number ]
[ username username ]
[ password password ]
[ proxy-server proxy-name
proxy-port port-number ]
[ url siteURL ]
[ protocol { ftp | webdav } ]
[ ftp nopasv ]
[ ftp showquit ]
[ ftp { usecwd | nousecwd } ]
[ http expect ]
[ http secure ]
[ safe ]
[ state { checksum | timesize } ]
[ permissions { ignore | exec | all | dir } ]
[ symlinks { ignore | follow | maintain } ]
[ nodelete ]
[ nooverwrite ]
[ checkmoved [renames] ]
[ tempupload ]
[ exclude pattern ]...
[ ignore pattern ]...
[ ascii pattern ]...
Anything after a hash (#) on a line is ignored as a
comment. Values may be quoted and characters may be
backslash-escaped. For example, to use the exclude pattern
*#, use the following line:
exclude "*#"
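To see these pieces together, the sketch below writes a
small site definition to a scratch file. The path held in
rcfile_demo is hypothetical (the real rcfile is
~/.sitecopyrc), and the site details are borrowed from the
examples elsewhere in this page.

```shell
#!/bin/sh
# Sketch: write a small site definition to a scratch file.
# rcfile_demo is a hypothetical path; the real rcfile is ~/.sitecopyrc.
rcfile_demo=$(mktemp)

# The quoted 'EOF' stops the shell from expanding anything in the text.
cat > "$rcfile_demo" <<'EOF'
# Fred's site (this whole line is a comment)
site mysite
server my.server.com
remote ~/public_html/
local /home/fred/html/
# Quoted so that the hash is part of the pattern, not a comment:
exclude "*#"
EOF

# The rcfile must be accessible only by its owner.
chmod 600 "$rcfile_demo"
```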
Remote Server Options
The server key is used to specify the remote server the
site is stored on. This may be either a DNS name or IP
address. A connection is made to the default port for the
protocol used, or that given by the port key. sitecopy
supports the WebDAV or FTP protocols - the protocol key
specifies which to use, taking the value of either webdav
or ftp respectively. By default, FTP will be used.
The proxy-server and proxy-port keys may be used to spec-
ify a proxy server to use. Proxy servers are currently
only supported for WebDAV.
If the FTP server does not support passive (PASV) mode,
then the key ftp nopasv should be used. To display the
message returned by the server on closing the connection,
use the ftp showquit option. If the server only supports
uploading files in the current working directory, use the
key ftp usecwd (possible symptom: "overwrite permission
denied"). Note that the remote-directory (keyword remote)
must be an absolute path (starting with '/'), or usecwd
will be ignored.
If the WebDAV server correctly supports the 100-continue
expectation, e.g. Apache 1.3.9 and later, the key http
expect should be used. Doing so can save some bandwidth
and time in an update.
If the WebDAV server supports access via SSL, the key http
secure can be used. Doing so will cause the transfers
between sitecopy and the host to be performed using a
secure, encrypted link. The first time SSL is used to
access the server, the user will be prompted to verify the
SSL certificate, if it's not signed by a CA trusted in the
system's CA root bundle.
To authenticate the user with the server, the username and
password keys are used. If it exists, the ~/.netrc file will be
searched for a password if one is not specified. See
ftp(1) for the syntax of this file.
Basic and digest authentication are supported for WebDAV.
Note that basic authentication must not be used unless the
connection is known to be secure.
The full URL that is used to access the site can optionally
be specified with the url key, for use in 'Recent
Changes' pages. The URL must not have a trailing slash; a
valid example is
url http://www.site.com/mysite
If the tempupload option is given, new or changed files
are uploaded with a ".in." prefix, then moved to the true
filename when the upload is complete.
File State
File state is stored in the storage files (~/.sitecopy/*),
and is used to discover when a file has been changed. Two
methods are supported, and can be selected using the state
option, which takes one of two parameters: timesize (the
default) or checksum.
timesize uses the last-modification date and the size of
files to detect when they have changed. checksum uses an
MD5 checksum to detect any changes to the file contents.
Note that MD5 checksumming involves reading in the entire
file, and is slower than simply using the last-modification
date and size. It may be useful, for instance, if a
versioning system is in use which updates the
last-modification date on a 'checkout' without actually
changing the file contents.
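The difference between the two methods can be seen with
ordinary shell tools. The sketch below is not how sitecopy
itself is implemented; it only imitates the two kinds of
comparison on a made-up file, page.html, and assumes GNU
coreutils (md5sum and stat -c).

```shell
#!/bin/sh
# Sketch: a timestamp change without a content change.
# Assumes GNU coreutils (md5sum, stat -c); page.html is a made-up file name.
workdir=$(mktemp -d)
page="$workdir/page.html"
printf '<p>hello</p>\n' > "$page"

size_before=$(stat -c %s "$page")
mtime_before=$(stat -c %Y "$page")
sum_before=$(md5sum < "$page")

# Simulate a checkout that rewrites the timestamp but not the contents.
touch -t 200101010000 "$page"

size_after=$(stat -c %s "$page")
mtime_after=$(stat -c %Y "$page")
sum_after=$(md5sum < "$page")

# A time-and-size comparison sees a change; a checksum comparison does not.
if [ "$mtime_before" != "$mtime_after" ] || [ "$size_before" != "$size_after" ]; then
    echo "timesize: changed"
else
    echo "timesize: unchanged"
fi
if [ "$sum_before" != "$sum_after" ]; then
    echo "checksum: changed"
else
    echo "checksum: unchanged"
fi
```

With state timesize such a file would be treated as
changed; with state checksum it would not.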
Safe Mode
Safe Mode is enabled by using the safe key. When enabled,
each time a file is uploaded to the server, the modifica-
tion time of the file as on the server is recorded. Subse-
quently, whenever this file has been changed locally and
is to be uploaded again, the current modification time of
the file on the server is retrieved, and compared with the
stored value. If these differ, then the remote copy of the
file has been altered by a foreign party. A warning mes-
sage is issued, and your local copy of the file will not
be uploaded over it, to prevent losing any changes.
Safe Mode can be used with FTP or WebDAV servers, but if
Apache/mod_dav is used, mod_dav 0.9.11 or later is
required.
Note: Safe Mode cannot be used in conjunction with the
nooverwrite option (see below).
File Storage Locations
The remote key specifies the root directory of the remote
copy of the site. It may be in the form of an absolute
pathname, e.g.
remote /www/freda/
or as a path relative to the login directory, in which case
it must be prefixed by
"~/", for example:
remote ~/public_html/
The local key specifies the directory in which the site is
stored locally. This may be given relative to your home
directory (as given by the environment variable $HOME),
again using the "~/" prefix. For example, the lines
local ~/html/foosite/
local /home/fred/html/foosite/
are equivalent, if $HOME is set to "/home/fred".
For both the local and remote keywords, a trailing slash
may be used, but is not required.
File Permissions Handling
File permissions handling is dictated by the permissions
key, which may be given one of three values:
ignore to ignore file permissions completely,
exec to mirror the permissions of executable files only,
all to mirror the permissions of all files.
This can be used, for instance, to ensure the permissions
of CGI files are set. The option is currently ignored for
WebDAV servers. For FTP servers, a chmod is performed
remotely to set the permissions.
To handle the permissions of directories, the key:
permissions dir
may be used in addition to a permissions key of either
exec or all. Note that permissions all does not
imply permissions dir.
Symbolic Link Handling
Symlinks found in the local site can be either ignored,
followed, or maintained. In 'follow' mode, the files
referenced by the symlinks will be uploaded in their place.
In 'maintain' mode, the link will be created remotely as
well, see below for more information. The mode used for
each site is specified with the symlinks rcfile key, which
may take the value of ignore, follow or maintain to select
the mode as appropriate.
The default mode is follow, i.e. symbolic links found in
the local site are followed.
Maintain mode is currently supported only by the WebDAV
driver, and will work only with servers which implement
WebDAV Advanced Collections, which is a work-in-progress.
The target of the link on the server is literally copied
from the target of the symlink. Hint: you can use URLs if
you like:
ln -s "http://www.somewhere.org/" somewherehome
In this way, a "302 Redirect" can be easily set up from
the client, without having to alter the server configura-
tion.
Deleting and Moving Remote Files
The nodelete option may be used to prevent remote files
from ever being deleted. This may be useful if you keep
large amounts of data on the remote server which you do
not need to store locally as well.
If your server does not allow you to upload changed files
over existing files, then you can use the nooverwrite
option. When this is used, before uploading a changed
file, the remote file will be deleted.
If the checkmoved option is used, sitecopy will look for
any files which have been moved locally. If any are found,
when the remote site is updated, the files will be moved
remotely.
If the checkmoved renames option is used, sitecopy will
look for any files which have been moved or renamed
locally. This option may only be used in conjunction with
the state checksum option.
WARNING
If you are not using MD5 checksumming (i.e. the state
checksum option) to determine file state, do NOT use the
checkmoved option if you tend to hold files in different
directories with identical sizes, modification times and
names and ever move them about. This seems unlikely, but
don't say you haven't been warned.
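The situation this warning describes can be reproduced
with ordinary shell tools. The sketch below builds two
different files that share a name, size and modification
time; all paths are made up for illustration, and GNU
coreutils (stat -c and md5sum) is assumed.

```shell
#!/bin/sh
# Sketch: two different files that look identical by name, size and time.
# Assumes GNU coreutils (stat -c, md5sum); all paths are made up.
workdir=$(mktemp -d)
mkdir "$workdir/a" "$workdir/b"
printf 'AAAA\n' > "$workdir/a/index.html"
printf 'BBBB\n' > "$workdir/b/index.html"
touch -t 200101010000 "$workdir/a/index.html" "$workdir/b/index.html"

# describe FILE: print the name, size and modification time of FILE.
describe() {
    printf '%s %s %s\n' "$(basename "$1")" "$(stat -c %s "$1")" "$(stat -c %Y "$1")"
}

key_a=$(describe "$workdir/a/index.html")
key_b=$(describe "$workdir/b/index.html")
sum_a=$(md5sum < "$workdir/a/index.html")
sum_b=$(md5sum < "$workdir/b/index.html")

[ "$key_a" = "$key_b" ] && echo "name, size and time: identical"
[ "$sum_a" != "$sum_b" ] && echo "MD5 checksum: different"
```

Only the checksum tells the two files apart, which is why
state checksum is the safer companion to checkmoved.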
Excluding Files
Files may be excluded from the files list by use of the
exclude key, which accepts shell-style globbing patterns.
For example, use
exclude *.bak
exclude *~
exclude "#*#"
to exclude all files which have a .bak extension, end in a
tilde (~) character, or which begin and end with a hash.
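A quick way to get a feel for these patterns is to try
them with the shell's own pattern matching, as in the
sketch below. The function name is_excluded and the sample
file names are assumptions, and sitecopy's matcher may
differ from the shell's in detail.

```shell
#!/bin/sh
# Sketch: test file names against the three example patterns using the
# shell's own pattern matching. sitecopy's matcher may differ in detail.
is_excluded() {
    case $1 in
        *.bak|*~|'#'*'#') return 0 ;;   # matches one of the patterns
        *) return 1 ;;                  # matches none of them
    esac
}

for name in index.html index.html.bak notes.txt~ '#draft#'; do
    if is_excluded "$name"; then
        echo "$name: excluded"
    else
        echo "$name: kept"
    fi
done
```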
To exclude certain files within a particular directory,
simply prefix the pattern with the directory name -
including a leading slash. For instance:
exclude /docs/*.m4
exclude /files/*.gz
which will exclude all files with the .m4 extension in the
'docs' subdirectory of the site, and all files with the
.gz extension in the files subdirectory.
An entire directory can also be excluded - simply use the
directory name with no trailing slash. For example
exclude /foo/bar
exclude /where/else
to exclude the 'foo/bar' and 'where/else' subdirectories
of the site.
Exclude patterns are consulted when scanning the local
directory, and when scanning the remote site during a
--fetch. Any file which matches any exclude pattern is
not added to the files list. This means that a file which
has already been uploaded by sitecopy, and subsequently
matches an exclude pattern will be deleted from the
server.
Ignoring Local Changes to Files
The ignore option is used to instruct sitecopy to ignore
any local changes made to a file. If a change is made to
the contents of an ignored file, this file will not be
uploaded by update mode. Ignored files will be created,
moved and deleted as normal.
The ignore option is used in the same way as the exclude
option.
Note that synchronize mode will overwrite changes made to
ignored files.
FTP Transfer Mode
To specify the FTP transfer mode for files, use the ascii
key. Any files which are transferred using ASCII mode have
CRLF/LF translation performed appropriately. For example,
use
ascii *.pl
to upload all files with the .pl extension as ASCII text.
This key has no effect with WebDAV (currently).
RETURN VALUES
Return values are specified for different operation modes.
If multiple sites are specified on the command line, the
Update Mode
-1 ... update never even started - configuration problem
0 ... update was entirely successful.
1 ... update went wrong somewhere
2 ... could not connect or login to server
List Mode (default mode of operation)
-1 ... could not form list - configuration problem
0 ... the remote site does not need updating
1 ... the remote site needs updating
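These return values make it possible to script an update
that only runs when needed. The sketch below is one way
to do so; the function name update_if_needed and the
SITECOPY override variable are assumptions. Note that a
return value of -1 normally appears to the shell as exit
status 255.

```shell
#!/bin/sh
# Sketch: run update mode only when list mode reports pending changes.
# SITECOPY can be pointed at another command (for a dry run, say).
SITECOPY=${SITECOPY:-sitecopy}

update_if_needed() {
    site=$1
    "$SITECOPY" --list "$site" >/dev/null
    status=$?
    case $status in
        0)  echo "$site: remote site does not need updating" ;;
        1)  echo "$site: remote site needs updating"
            "$SITECOPY" --update "$site" ;;
        *)  # -1 from list mode is seen here as 255.
            echo "$site: could not form list (status $status)" >&2
            return "$status" ;;
    esac
}
```

A call might look like: update_if_needed mysite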
EXAMPLE RCFILE CONTENTS
FTP Server, Simple Usage
Fred's site is uploaded to the FTP server 'my.server.com'
and held in the directory 'public_html', which is in the
login directory. The site is stored locally in the direc-
tory /home/fred/html.
site mysite
server my.server.com
url http://www.server.com/fred
username fred
password juniper
local /home/fred/html/
remote ~/public_html/
FTP Server, Complex Usage
Here, Freda's site is uploaded to the FTP server
'ftp.elsewhere.com', where it is held in the directory
/www/freda/. The local site is stored in
/home/freda/sites/elsewhere/
site anothersite
server ftp.elsewhere.com
username freda
password blahblahblah
local /home/freda/sites/elsewhere/
remote /www/freda/
# Freda wants files with a .bak extension or a
# trailing ~ to be excluded:
exclude *.bak
exclude *~
WebDAV Server, Simple Usage
This example shows use of a WebDAV server.
site supersite
server dav.wow.com
protocol webdav
password zap
local /home/joe/www/super/
remote /
FILES
~/.sitecopyrc Default run control file location.
~/.sitecopy/ Remote site information storage directory
~/.netrc Remote server accounts information
BUGS
Known problems: Fetch + synch modes are NOT reliable for
FTP. If you need reliable operation of fetch or synch
modes, you shouldn't be using sitecopy. Try rsync
instead.
Please send bug reports and feature requests to <site-
copy@lyra.org> rather than to the author, since the mail-
ing list is archived and can be a useful resource for oth-
ers.
SEE ALSO
rsync(1), ftp(1), mirror(1)
STANDARDS
[Listed for reference only, no claim of compliance to any
of the below standards is made.]
RFC 959 - File Transfer Protocol (FTP)
RFC 1521 - Multipurpose Internet Mail Extensions Part One
RFC 1945 - Hypertext Transfer Protocol -- HTTP/1.0
RFC 2396 - Uniform Resource Identifiers: Generic Syntax
RFC 2518 - HTTP Extensions for Distributed Authoring --
WEBDAV
RFC 2616 - Hypertext Transfer Protocol -- HTTP/1.1
RFC 2617 - HTTP Authentication
REC-XML - Extensible Markup Language (XML) 1.0
REC-XML-NAMES - Namespaces in XML
DRAFT STANDARDS
draft-ietf-ftpext-mlst-05.txt - Extensions to FTP
draft-ietf-webdav-collections-protocol-03.txt - WebDAV
Advanced Collections Protocol
AUTHOR
Joe Orton and others.
e-mail: sitecopy@lyra.org