pivilartisant/arena-web-archiver

Are.na Web Archiver

A minimal, self-hosted website archiver for Are.na blocks, built with HTMX, FastAPI, and Wget.


Details

  • Websites are archived as HTML and WARC files.
  • If no WARC filename is provided, the Are.na block ID will be used.
  • Archives are saved in:
    • /tmp/arena_archives when running with Docker.
    • /tmp (inside the project repository) when running locally.
  • Mirror → creates a 1:1 copy of the entire website.
  • Snapshot → archives only a single page.
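As a rough illustration of the Mirror and Snapshot modes and the WARC filename fallback listed above, the sketch below builds a Wget argument list in Python. The helper name `build_wget_command`, its parameters, and the particular selection of Wget flags are assumptions made for illustration; the project's actual invocation may differ.

```python
from pathlib import PurePosixPath
from typing import List, Optional


def build_wget_command(
    url: str,
    block_id: int,
    output_dir: str = "/tmp/arena_archives",
    warc_name: Optional[str] = None,
    mirror: bool = False,
) -> List[str]:
    """Return a Wget argument list (hypothetical helper, not the project's code)."""
    # Fall back to the Are.na block ID when no WARC filename is given.
    name = warc_name or str(block_id)
    warc_path = PurePosixPath(output_dir) / name

    cmd = [
        "wget",
        "--page-requisites",   # also fetch images, CSS, and scripts
        "--convert-links",     # rewrite links for offline browsing
        "--adjust-extension",  # save HTML pages with an .html suffix
        f"--directory-prefix={output_dir}",
        f"--warc-file={warc_path}",  # Wget adds the .warc.gz suffix itself
    ]
    if mirror:
        # Mirror: follow links recursively, never ascending above the start directory.
        cmd += ["--recursive", "--level=inf", "--no-parent"]
    cmd.append(url)
    return cmd
```

Calling `build_wget_command("https://example.com", block_id=12345)` yields a single-page (Snapshot-style) command whose WARC file is named after the block ID; passing `mirror=True` adds the recursive flags.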

Recommendations

  • Use this tool to preserve old web content that might disappear one day.
  • It’s intentionally minimal: expect only the bare essentials.
  • Some sites may cause infinite loops when mirroring (this error is not currently handled).
  • For very large websites and archives, consider using a more robust archiving tool.
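Because mirroring loops are not handled by the tool, one possible safeguard is to put a wall-clock limit around a long-running archive process. The sketch below is an assumption about how that could be done with Python's standard library, not something the project implements; `run_with_timeout` is a hypothetical helper name.

```python
import subprocess
import sys
from typing import List


def run_with_timeout(cmd: List[str], timeout_s: float) -> bool:
    """Run cmd; return True if it finished in time, False if it was stopped."""
    try:
        subprocess.run(cmd, timeout=timeout_s, check=False)
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before re-raising.
        return False


if __name__ == "__main__":
    # Stand-in for a runaway job: a Python child that sleeps for 5 seconds.
    sleeper = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(sleeper, timeout_s=0.5))
```

A limit like this only bounds the running time; a mirror stopped partway through is still incomplete.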

Get Started

  1. Clone the repository
  2. Install Python, the uv package manager, and Wget on your machine
  3. Start the server with make server
  4. Open http://127.0.0.1:8000 in your browser
  5. See docs at http://127.0.0.1:8000/docs
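Before starting the server, it can help to confirm that the prerequisites are on the PATH. The following is a minimal sketch; the executable names `python3`, `uv`, `wget`, and `make` are assumptions and may differ on your system, and `find_tools` is a hypothetical helper.

```python
import shutil
from typing import Dict, Iterable


def find_tools(
    names: Iterable[str] = ("python3", "uv", "wget", "make"),
) -> Dict[str, bool]:
    """Map each executable name to whether it was found on the PATH."""
    return {name: shutil.which(name) is not None for name in names}


if __name__ == "__main__":
    for tool, found in find_tools().items():
        print(f"{tool}: {'found' if found else 'missing'}")
```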

Get Started with Docker

  1. Run make docker-create-volume
  2. Run make docker-build
  3. Run make docker-run
  4. Open http://127.0.0.1:8000 in your browser
  5. See docs at http://127.0.0.1:8000/docs

Contribute

See https://github.com/pivilartisant/arena-web-archiver/issues

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

