WWWOFFLE - World Wide Web Offline Explorer - Version 2.7
========================================================

The WWWOFFLE programs simplify World Wide Web browsing from computers that use
intermittent (dial-up) connections to the internet.

Description
-----------

The WWWOFFLE server is a proxy web server with special features for use with
dial-up internet links.  This means that it is possible to browse web pages and
read them without having to remain connected.

Basic Features
    - Caching of HTTP, FTP and finger protocols.
    - Allows the 'GET', 'HEAD', 'POST' and 'PUT' HTTP methods.
    - Interactive or command line control of online/offline/autodial status.
    - Highly configurable.
    - Low maintenance, start/stop and online/offline status can be automated.

While Online
    - Caching of pages that are viewed for later review.
    - Conditional fetching to only get pages that have changed.
        - Based on expiration date, time since last fetched or once per session.
    - Non cached support for SSL (Secure Socket Layer e.g. https).
    - Can be used with one or more external proxies based on web page.
    - Control which pages cannot be accessed.
        - Allow replacement of blocked pages.
    - Control which pages are not to be stored in the cache.
    - Requests compressed pages from web servers (compile time option).

While Offline
    - Can be configured to use dial-on-demand for pages that are not cached.
    - Selection of pages to download next time online
        - Using normal browser to follow links.
        - Command line interface to select pages for downloading.
    - Control which pages can be requested when offline.
    - Provides non-cached access to intranet servers.

Automated Download
    - Downloading of specified pages non-interactively.
    - Options to automatically fetch objects in requested pages
        - Understands various types of pages
            - HTML 4.0, Java classes, VRML (partial), XML (partial).
        - Options to fetch different classes of objects
            - Images, Stylesheets, Frames, Scripts, Java or other objects.
        - Option to not fetch webbug images (images of 1 pixel square).
    - Automatically follows links for pages that have been moved.
    - Can monitor pages at regular intervals to fetch those that have changed.
    - Recursive fetching
        - To specified depth.
        - On any host or limited to same server or same directory.
        - Chosen from command line or from browser.
        - Control over which links can be fetched recursively.

Convenience
    - Optional information footer on HTML pages showing date cached and options.
    - Options to modify HTML pages
        - Remove scripts.
        - Remove Java applets.
        - Remove stylesheets.
        - Remove shockwave flash animations.
        - Indicate cached and uncached links.
        - Remove the blink tag.
        - Remove refresh tags.
        - Remove links to pages that are in the DontGet list.
        - Remove inline frames (iframes) that are in the DontGet list.
        - Replace images that are in the DontGet list.
        - Replace webbug images (images of 1 pixel square).
        - Demoronise HTML character sets.
        - Stop animated GIFs.
    - Automatic proxy configuration for Netscape.
    - Searchable cache with the addition of the ht://Dig, mnoGoSearch
      (UdmSearch) or Namazu programs.
    - Built in simple web-server for local pages.
    - Timeouts to stop proxy lockups
        - DNS name lookups.
        - Remote server connection.
        - Data transfer.
    - Continue or stop downloads interrupted by client.
        - Based on file size or fraction downloaded.
    - Purging of pages from cache
        - Based on URL matching.
        - To keep the cache size below a specified limit.
        - To keep the free disk space above a specified limit.
        - Interactive or command line control.
        - Compression of cached pages based on age.
    - Provides compressed pages to web browser (compile time option).

Indexes
    - Multiple indexes of pages stored in cache
        - Servers for each protocol (http, ftp ...).
        - Pages on each server.
        - Pages waiting to be fetched.
        - Pages requested last time offline.
        - Pages fetched last time online.
        - Pages monitored on a regular basis.
    - Configurable indexes
        - Sorted by name, date, server domain name, type of file.
        - Options to delete, refresh or monitor pages.
        - Selection of complete list of pages or hide un-interesting pages.

Security
    - Works with pages that require basic username/password authentication.
    - Automates proxy authentication for external proxies that require it.
    - Control over access to the proxy
        - Defaults to local host access only.
        - Host access configured by hostname or IP address.
        - Optional proxy authentication for user level access control.
    - Optional password control for proxy management functions.
    - Can censor incoming and outgoing HTTP headers to maintain user privacy.

Configuration
    - All options controlled using a configuration file.
    - Interactive web page to allow editing of the configuration file.
    - User customisable error and information pages.

Changes
-------

Since version 2.6d:

Bug Fixes:
    IPv6 getnameinfo() bug fixed.
    IPv6 freeaddrinfo() bug fixed.
    IPv6 ftp bug fixed.
    Fix compile problem when not using zlib.
    Client acceptable compression type detection fixed.
    Upgrade config script fixed (purge age).
    Remove gcc warnings for xml.l and errors.c.
    Renamed UdmSearch to mnoGoSearch.
    Fixed bug with purge min-free option.
    Fix Javascript removal problem with ^M characters.
    Check the wildcards in URL-SPECIFICATIONs in the configuration file are valid.
    Distinguish between timeout and compression errors from the remote host.
    Delete the Content-Length headers from URLs that wwwoffle has uncompressed.
    Allow multi-line headers in requests.
    Don't request pages when monitor times changed.
    Give warning on error writing to cache (disk full?).
    Don't remove the If-Modified-Since header on non-cached requests.
    Fix iframe link finding error.
    Reply to conditional requests when online as unchanged if unchanged on server.
    Handle reserved HTML characters in FTP directory listings.
    Fix error finding iframes when finding HTML links.
    Allow the compiled in localhost to be changed.
    Change the default buffer size to 4KBytes to try and increase performance.
    Fix more Javascript removal problems when modifying HTML.
    Added a question about the order of URL-SPECIFICATIONs to the FAQ.
    Give an error if there is an equal sign in the conf file when not expected.
    Handle the 'Cache-Control: max-age=0' header from the client.
    Added an answer to the most common IPv6 questions to the FAQ.
    Another fix for javascript in HTML modifications.
    Stop uncompress-cache from crashing on zero length files.
    Include sys/time.h in various files.
    Fix another bug with script removal when modifying HTML.
    Make 'wwwoffle URL' put a request in outgoing in all cases.
    Fix the configuration editing when there are no entries for an item.
    Make 'wwwoffle URL' put a request in outgoing even if already cached.
    Choose client compression based on q factor.
    Fix bug with gethostbyaddr() parameters.
    Documentation and HTML message updates.
    HTML parser updates for Javascript.
    Improved socket error messages.
    Stop permanent lockup in case of parsing error.
    Configuration editing doesn't crash if file is not writable.
    Better handling of FTP servers that can't handle EPSV command.
    Fix ConfirmRequest option.

New Features:
    All message translations are installed and chosen by browser language settings.
    Added the option to use the Namazu and mknmz-wwwoffle programs to search cache.
    Removed the pages at /control/edit that edit the configuration file.
    Add a new set of pages to edit the configuration file, each item individually.
    Add the option to disable iframes that include URLs in the DontGet list.
    Add the option to remove shockwave/flash animations.
    Check the modification time of FTP files on the server if conditional request.
    Handle conditional requests that use If-None-Match headers.
    Add the option to cycle the lasttime/prevtime & lastout/prevout indexes daily.
    Add a timestamp to the lasttime/prevtime & lastout/prevout index pages.
    Added a summary of the quantity of files deleted and compressed by the purge.
    Added URL-SPECIFICATIONS for the options in the FetchOptions section.
    Added the option to re-request URLs that contain a redirection to another URL.
    Display the current value of the item in the configuration url page.
    Make the referer-self and referer-self-dir options add headers if none present.
    Disable HTML modifications when a 'Cache-Control: no-transform' header is seen.

Programs:
    Added a '-f' option to wwwoffled to simplify debugging (for development usage).
    Added '--help' and '--version' options to all programs.
    Add a '-status' option to 'wwwoffle' to get the current 'wwwoffled' status.
    The wwwoffle program exits with an error status when reporting an error message.
    Allow wwwoffle-tools to be run with an argument to specify operating mode.
    Allow multiple arguments to 'wwwoffle-ls' and 'wwwoffle-rm' programs.
    Add a wwwoffle-hash program (part of wwwoffle-tools) to print URL hash.
    Add '-c ' option to 'convert-cache' & 'uncompress-cache' programs.
    Make wwwoffle-ls output the time in the local language.
    Make wwwoffle-ls work if the whole directory path to the cache is specified.

Translations:
    Updated translated pages for French, Dutch and Polish Languages.

Availability
------------

Version 2.7 uploaded, but may not be available yet

FTP server:  ftp://metalab.unc.edu/pub/Linux/apps/www/servers/wwwoffle-2.7.tgz
FTP server:  ftp://ftp.demon.co.uk/pub/unix/httpd/wwwoffle-2.7.tgz
Web page:    http://www.gedanken.demon.co.uk/wwwoffle/

Author & Copyright
------------------

This program is copyright Andrew M. Bishop 1996,97,98,99,2000,01,02
(amb@gedanken.demon.co.uk) and distributed under GPL.

email: amb@gedanken.demon.co.uk [Please put wwwoffle in the subject line]