webarchive

Here are 74 public repositories matching this topic...

StrawberryMaster / wayback-machine-downloader

Download an entire website from the Wayback Machine.

ruby scraper osint internet-archive wayback-machine wayback webarchive archive-org waybackmachine osint-tool wayback-downloader archive-downloader

Updated Oct 28, 2025
Ruby

karust / gogetcrawl

Extract web archive data using Wayback Machine and Common Crawl

golang crawler concurrency wayback-machine webarchive commoncrawl

Updated Nov 4, 2024
Go

vegetableman / vandal

Navigator for Web Archive

chrome-extension firefox-addon wayback-machine webarchive internet-archiving

Updated Nov 23, 2023
JavaScript

helgeho / ArchiveSpark

An Apache Spark framework for easy data processing, extraction, and derivation for web archives and archival collections, developed at the Internet Archive.

spark internet-archive warc web-archiving webarchive archivespark spark-framework

Updated Oct 8, 2025
Scala

chatnoir-eu / chatnoir-resiliparse

A robust web archive analytics toolkit

python web cpp cython bigdata extraction warc webarchive htmlparser

Updated Oct 15, 2025
Cython

N0taN3rd / node-warc

Parse and create Web ARChive (WARC) files with Node.js

warc web-archiving webarchive web-archives webarchiving warc-files chrome-remote-interface pupeteer

Updated Jan 29, 2025
JavaScript

mathis2001 / WebHackUrls

Simple Python OSINT tool for URL recon using the Wayback Machine.

osint pentesting recon bugbounty wayback-machine webarchive

Updated Jun 19, 2023
Python

rcarmo / python-webarchive

Create WebKit/Safari .webarchive files on any platform

python3 asyncio webarchive

Updated Feb 4, 2020
Python

cipher387 / quickcacheandarchivesearch

Quick Cache and Archive search buttons

webarchive webarchiving google-cache yandex-cache baidu-cache

Updated May 11, 2024
JavaScript

rumca-js / RSS-Link-Database

Bookmarked archived links

rss links archive rss-feed webarchive link-aggregator link-aggregation rss-archive

Updated Nov 2, 2025

mhucka / devilfish

A utility for simultaneously creating full-page PDF snapshots and web archives of web pages in DEVONthink Pro.

pdf web archiving webarchive devonthink

Updated Jul 24, 2020
AppleScript

Mixnode / mixnode-warcreader-php

Read Web ARChive (WARC) files in PHP.

php warc webarchive

Updated Mar 10, 2017
PHP

gonejack / webarchive-to-singlefile

This command-line tool converts a .webarchive file to a single .html file with embedded resources.

html webarchive

Updated Dec 5, 2022
Go

WebarchivCZ / Seeder

Seeder - Czech webarchive curating tool and public site

government django tools czech czech-republic archive webarchive webarchiving webarchives

Updated Aug 29, 2025
Python

birbwatcher / wayback-machine-downloader

Wayback Machine Downloader for webmasters, OSINT researchers, and SEO specialists

scraper wayback-machine webarchive osint-tool wayback-machine-downloader webarchive-data-scraping

Updated Oct 28, 2025
JavaScript

toimik / WarcProtocol

Parser for WARC (aka WebArchive) files

warc webarchive webarchiving warc-files webarchives warc-format warc-reader warc-record

Updated Jul 9, 2024
C#

helgeho / HadoopConcatGz

A Splittable Hadoop InputFormat for Concatenated GZIP Files and *.(w)arc.gz

spark hadoop warc web-archiving webarchive

Updated Feb 7, 2018
Java

ticky / webarchive

📑 Rust utilities for working with Apple's Web Archive file format

safari rust-lang webarchive rust-crate

Updated Mar 11, 2022
Rust

q-m / scrapy-webarchive

A plugin for Scrapy that allows users to capture and export web archives in the WARC and WACZ formats during crawling.

scrapy warc webarchive webarchive-data-scraping wacz

Updated Sep 5, 2025
Python

nlnwa / docker-chrome-headless

webarchive

Updated Apr 6, 2018
Shell
