🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...
Updated Nov 22, 2024 - Python
😇 A Docker Compose bundle to run on servers with spare CPU, RAM, disk, and bandwidth to help the world. Includes Tor, ArchiveWarrior, BOINC, and more...
Official ArchiveBox browser extension: automatically/manually preserve your browsing history using ArchiveBox.
Desktop Electron app for ArchiveBox internet archiver. (ALPHA: not ready for general use)
🎭 An introduction to the Internet Archiving ecosystem, tooling, and some of the ethical dilemmas that the community faces.
Home of the official docker image for ArchiveBox
JavaScript/Node wrapper around Mozilla's Readability library so that ArchiveBox can call it as a one-shot CLI command to extract each page's article text.
An Obsidian plugin to submit file links to an ArchiveBox instance.
Homebrew formula for the ArchiveBox self-hosted internet archiving solution.
⬇️ A CLI tool to download all discovered content from a URL (like youtube-dl/yt-dlp, forum-dl, gallery-dl). 🎭 Uses headless Chrome to get HTML, JS, CSS, images/video/audio/subtitles, PDFs, screenshots, article text, git srcs, and more...
Export user data from your Reddit accounts. Supports submissions, comments, and saved and upvoted content.
Home of the official apt/deb package for Ubuntu/Debian-based systems.
Clean a series of links, resolving redirects and finding Wayback results if a page is gone. Originally written to aid with importing from ArchiveBox.
DigestBox takes any webpage URL (news article, video link, comment thread, etc.) and gives you just the raw content. It's powered by ArchiveBox.io under the hood.
Self-hosted internet archiving solution to collect, save, and view sites you want to preserve offline, for YunoHost.
Official ArchiveBox MITM proxy: saves URLs of all requests passing through to an ArchiveBox server for archival.
Source for the GitHub Wiki / ReadTheDocs documentation for ArchiveBox, the self-hosted internet archiving solution.
ArchiveBoxMatic: configure ArchiveBox with the simplicity of a YAML file.
Official Python package for ArchiveBox, the self-hosted internet archiving solution.
🧩 Proposal to allow user scripts like "expand comments", "hide popups", "fill out this form", etc. to be reusable across pure browser environments, puppeteer, playwright, extensions, AI tools, and many other contexts with minimal adjustment.