About stare.pub

stare.pub is a free, open tool for resolving U.S. legal citations to full case text and original scanned reporter pages. The entire application is static files served to your browser — there is no backend server, no database, no API to maintain or go down. It could run from IPFS, a USB stick, or any static file host. No tracking, no account required.

This project is built by lokkju, motivated by the belief that public law should be permanently, freely accessible — not dependent on any single institution's continued goodwill or funding.

The data

The case text and page scans come from the Harvard Caselaw Access Project (CAP), which digitized roughly 40,000 reporter volumes covering 6.7 million cases from 1658 to 2018. All of this data is public domain.

Harvard announced in 2024 that it was winding down CAP. The API is gone. Search is gone. The bulk data files at static.case.law are still available, but it is uncertain for how long. This project exists to preserve that data and keep it accessible.

How it works

When you look up a citation, your browser:

Fetches a small binary index file (~7 KB per volume) to find the case
Fetches the volume's case data (~400 KB zip) from static.case.law
Extracts and renders the case text entirely client-side using WebAssembly
Caches everything locally so repeat lookups are instant
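
The steps above can be sketched in a few lines. This is only an illustration in Python, not the project's Rust/WebAssembly code: the network is replaced by a fake fetch function so the example runs offline, the index lookup is reduced to an already-known member name, and the URL, the member name `cases/0001.json`, and the JSON layout are all hypothetical.

```python
import io
import json
import zipfile


def make_cached(fetch):
    """Wrap a fetch(url) -> bytes callable with a simple in-memory cache."""
    cache = {}

    def cached(url):
        if url not in cache:
            cache[url] = fetch(url)  # only hit the "network" on a miss
        return cache[url]

    return cached


def read_case(volume_zip, member):
    """Extract and decode one JSON member from a volume zip held in memory."""
    with zipfile.ZipFile(io.BytesIO(volume_zip)) as zf:
        return json.loads(zf.read(member))


def build_demo_zip():
    """Build a tiny stand-in for a volume zip (hypothetical layout)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("cases/0001.json", json.dumps({"name": "Example v. Example"}))
    return buf.getvalue()


calls = []  # records every real fetch so cache hits are visible


def fake_fetch(url):
    calls.append(url)
    return build_demo_zip()


fetch = make_cached(fake_fetch)
url = "https://example.invalid/volume.zip"  # placeholder, not a real endpoint
case = read_case(fetch(url), "cases/0001.json")
again = read_case(fetch(url), "cases/0001.json")  # served from the cache
```

The second lookup never reaches `fake_fetch`, which is the behavior the caching step describes.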

For the original page viewer, individual page images (~50 KB each) are fetched via HTTP range requests into the volume's tar archive — no need to download the full ~100 MB archive.
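
A rough Python sketch of the idea, assuming the byte position of each page image inside the tar is known in advance. Here the positions are read from an in-memory demo archive; how stare.pub actually records them is not covered on this page, and the file names and the `serve_range` stand-in for a static host are hypothetical.

```python
import io
import tarfile


def build_demo_tar(pages):
    """Pack {name: bytes} into an uncompressed tar held in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tf:
        for name, data in pages.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def member_ranges(archive):
    """Map member name -> (first_byte, last_byte) of its data, inclusive."""
    ranges = {}
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tf:
        for info in tf:
            ranges[info.name] = (info.offset_data, info.offset_data + info.size - 1)
    return ranges


def range_header(first, last):
    """Format the value of an HTTP Range header for one byte range."""
    return f"bytes={first}-{last}"


def serve_range(archive, header):
    """Stand-in for a static host answering a single-range request."""
    first, last = (int(x) for x in header.split("=", 1)[1].split("-"))
    return archive[first:last + 1]  # HTTP ranges are inclusive on both ends


pages = {"0001.jpg": b"page-one", "0002.jpg": b"page-two-bytes"}
archive = build_demo_tar(pages)
header = range_header(*member_ranges(archive)["0002.jpg"])
body = serve_range(archive, header)  # only this slice would cross the network
```

Because tar stores each member's data as one contiguous, uncompressed run, a single byte range is enough to recover one page without touching the rest of the archive.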

The WebAssembly module is about 280 KB in total. There is no backend server, no database, and no server-side compute, only static file hosting. The same files could be served from Cloudflare R2, IPFS, a local directory, or any web host that can serve files and honor HTTP range requests.

Current status

This is an early-stage project. We're actively working on:

Mirroring all data — downloading all 40,622 volumes (~4.4 TB) from Harvard to our own storage
Building the full index — currently only a handful of test volumes are indexed; the full corpus index will follow once the mirror completes
IPFS distribution — pinning the volume archives on IPFS for permanent, decentralized access
Citation parsing — a Rust port of eyeCite to accept natural citation strings like "31 A.2d 647"
Full-text search — serverless search via RoaringBuckets bitmap indexes on Cloudflare R2
Federal and state statutes — the same architecture works for statutes (U.S. Code, Statutes at Large, state codes) using public domain data from govinfo.gov and state legislatures
Fully decentralized hosting — the site itself (not just the data) served from IPFS, making the entire system uncensorable and permanent
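
To make the citation-parsing item concrete, here is a deliberately naive Python sketch that splits a string like "31 A.2d 647" into volume, reporter, and page. It is not the planned Rust port and does not use eyeCite's patterns; real citations have many more forms (pin cites, parallel cites, reporter variants) that this single regular expression ignores. The `Citation` type and `parse_citation` name are made up for illustration.

```python
import re
from typing import NamedTuple, Optional


class Citation(NamedTuple):
    volume: int
    reporter: str
    page: int


# volume number, then a reporter abbreviation, then a starting page
_CITATION = re.compile(r"^\s*(\d+)\s+(.+?)\s+(\d+)\s*$")


def parse_citation(text: str) -> Optional[Citation]:
    """Return a Citation for 'VOLUME REPORTER PAGE' strings, else None."""
    m = _CITATION.match(text)
    if m is None:
        return None
    reporter = " ".join(m.group(2).split())  # collapse repeated whitespace
    return Citation(int(m.group(1)), reporter, int(m.group(3)))


example = parse_citation("31 A.2d 647")
```

Reporters containing digits, such as "F. Supp. 2d", still parse because the last number in the string is taken as the page.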

Help preserve case law

This data is at risk. Harvard's hosting is a soft commitment from an academic lab in wind-down mode. The more copies that exist, the safer the data is.

If you can help, here's how:

Mirror the data — the full dataset is ~4.4 TB at static.case.law. If you have storage, grab a copy. Eventually, you'll be able to mirror the entire stare.pub system — site, search index, and data — from a single command.
Pin on IPFS — once we publish IPFS CIDs, pin them from your node to help distribute
Seed torrents — per-reporter torrents are planned for bulk distribution
Contribute code — the project is open source on GitHub
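
If you are sizing up a mirror, a back-of-the-envelope estimate may help. The sketch below uses only the figures quoted on this page (~4.4 TB, 40,622 volumes) and assumes decimal units and a perfectly sustained link, so real transfers will take longer.

```python
DATASET_BYTES = 4.4e12  # ~4.4 TB from the text, decimal terabytes assumed
VOLUMES = 40_622        # volume count from the text


def avg_volume_mb():
    """Average size of one volume in megabytes."""
    return DATASET_BYTES / VOLUMES / 1e6


def transfer_days(mbit_per_s):
    """Days needed to download the dataset at a sustained link speed."""
    seconds = DATASET_BYTES * 8 / (mbit_per_s * 1e6)
    return seconds / 86_400


per_volume = avg_volume_mb()      # close to the ~100 MB archive size quoted above
days_at_100 = transfer_days(100)  # a sustained 100 Mbit/s link
```

At 100 Mbit/s this works out to a little over four days, and the per-volume average lands near the ~100 MB archive size mentioned earlier.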

Technical details

Frontend: Astro + vanilla JS + CSS custom properties
Core: Rust, compiled to both native (CLI) and WebAssembly
Data: Harvard CAP zip/tar files — static files, hostable anywhere
Index: Custom binary format — 8 bytes per page, direct offset lookup
Hosting: Any static file host works — Cloudflare R2, S3, IPFS, a local web server. The entire system (site, index, and data) is designed to be trivially mirrorable.
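
The "8 bytes per page, direct offset lookup" design can be illustrated with a small Python sketch. The actual record layout is not specified on this page, so the sketch assumes each record is two little-endian 32-bit integers (a case number and a byte offset) and that the first record corresponds to page 1; both are assumptions for illustration only.

```python
import struct

# Hypothetical record: (case_id, byte_offset), two little-endian u32 values.
RECORD = struct.Struct("<II")


def build_index(records):
    """Concatenate fixed-size records, one per page, in page order."""
    return b"".join(RECORD.pack(case_id, offset) for case_id, offset in records)


def lookup(index, page, first_page=1):
    """Jump straight to a page's record; no scanning or searching needed."""
    pos = (page - first_page) * RECORD.size
    if pos < 0 or pos + RECORD.size > len(index):
        return None  # page is outside this volume
    return RECORD.unpack_from(index, pos)


# A 900-page demo volume with made-up case numbers and offsets.
index = build_index([(i // 10, i * 4096) for i in range(900)])
hit = lookup(index, 647)
```

With fixed-size records, a 900-page volume needs 900 × 8 = 7,200 bytes, which is consistent with the ~7 KB per-volume index size quoted above, and a lookup is a single multiplication rather than a search.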

Credits

Built on data from the Harvard Library Innovation Lab's Caselaw Access Project. Citation detection uses patterns from the Free Law Project's eyeCite library. All case law data is public domain (CC0).
