HTTP Replicator is a general purpose caching proxy server written in Python. It reduces bandwidth by merging concurrent downloads and by building a local 'replicated' file hierarchy, similar to wget -r. A web interface to the cache is planned but currently unsupported.
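Merging concurrent downloads means that when several clients request the same URL at the same time, only one upstream transfer takes place and every client is served from it. The following is a conceptual sketch of that idea (an illustration only, not Replicator's actual code; fetch is a stand-in for the real transfer):

import asyncio

downloads: dict[str, asyncio.Future] = {}   # url -> in-flight transfer

async def fetch(url: str) -> bytes:
    # Stand-in for the real upstream transfer.
    await asyncio.sleep(0.1)
    return b'body of ' + url.encode()

async def get(url: str) -> bytes:
    if url in downloads:
        # A transfer for this url is already running; wait for its result
        # instead of opening a second upstream connection. shield() keeps
        # one cancelled client from killing the shared transfer.
        return await asyncio.shield(downloads[url])
    future = asyncio.get_running_loop().create_future()
    downloads[url] = future
    try:
        body = await fetch(url)
        future.set_result(body)         # wake every waiting client
        return body
    except Exception as exc:
        future.set_exception(exc)       # propagate failure to waiters too
        raise
    finally:
        del downloads[url]              # transfer finished; forget it

async def main():
    # Three concurrent requests for one URL trigger a single fetch.
    bodies = await asyncio.gather(*(get('http://example.com/') for _ in range(3)))
    assert len(set(bodies)) == 1

asyncio.run(main())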
The following example session demonstrates basic usage.
~$ mkdir /tmp/cache
~$ http-replicator -r /tmp/cache -p 8888 --daemon /tmp/replicator.log
[process id]
~$ http_proxy=localhost:8888 wget http://www.python.org/index.html
100%[====================================>] 15,978
~$ find /tmp/cache
/tmp/cache
/tmp/cache/www.python.org:80
/tmp/cache/www.python.org:80/index.html
Replicator has reasonable defaults for all of its settings, so it can be run without command line arguments. In that case it listens on port 8080, does not detach from the terminal, and uses the current directory as the cache root. Files are cached under a top-level directory named host:port, where the port defaults to 80 for http and 21 for ftp, followed by a path corresponding to the URL (a sketch of this mapping follows the option list below). The following arguments can be used to change this default behaviour:
-h --help
Show this help message and exit.
-p --port PORT
Listen on this port for incoming connections, default 8080.
-r --root DIR
Set the cache root directory, default the current directory.
-v --verbose
Show HTTP headers and other info.
-t --timeout SEC
Break the connection after this many seconds of inactivity, default 15.
-6 --ipv6
Try IPv6 addresses if available.
--flat
Flat mode; cache all files directly in the root directory (dangerous!).
--static
Static mode; assume cached files never change.
--offline
Offline mode; never connect to a server.
--limit RATE
Limit the download rate to a fixed number of K/s.
--daemon LOG
Route output to the given log file and detach from the terminal.
--debug
Switch from the gather to the debug output module.
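To make the default layout concrete, here is a minimal sketch of how a URL maps onto a path under the cache root (cache_path is a hypothetical helper for illustration, not part of Replicator):

import os
from urllib.parse import urlsplit

DEFAULT_PORTS = {'http': 80, 'ftp': 21}    # ports implied when the URL names none

def cache_path(root, url):
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return os.path.join(root, '%s:%d' % (parts.hostname, port), parts.path.lstrip('/'))

print(cache_path('/tmp/cache', 'http://www.python.org/index.html'))
# prints: /tmp/cache/www.python.org:80/index.html

With --flat this host:port level disappears and every file is cached directly in the root, which is why same-named files from different servers can overwrite each other (hence 'dangerous').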