About stare.pub
stare.pub is a free, open tool for resolving U.S. legal citations to full case text and original scanned reporter pages. The entire application is static files served to your browser — there is no backend server, no database, no API to maintain or go down. It could run from IPFS, a USB stick, or any static file host. No tracking, no account required.
This project is built by lokkju, motivated by the belief that public law should be permanently, freely accessible — not dependent on any single institution's continued goodwill or funding.
The data
The case text and page scans come from the Harvard Caselaw Access Project (CAP), which digitized roughly 40,000 reporter volumes covering 6.7 million cases from 1658 to 2018. All of this data is public domain.
Harvard announced in 2024 that it was winding down CAP. The API is gone. Search is gone. The bulk data files at static.case.law are still available, but it is uncertain for how long. This project exists to preserve that data and keep it accessible.
How it works
When you look up a citation, your browser:
- Fetches a small binary index file (~7 KB per volume) to find the case
- Fetches the volume's case data (~400 KB zip) from static.case.law
- Extracts and renders the case text entirely client-side using WebAssembly
- Caches everything locally so repeat lookups are instant
For the original page viewer, individual page images (~50 KB each) are fetched via HTTP range requests into the volume's tar archive — no need to download the full ~100 MB archive.
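As a minimal sketch of the range-request idea, assume the index tells the browser where a page image's tar member begins and how many bytes of image data it holds. The function names here are hypothetical illustrations, not stare.pub's actual code. The only format facts relied on are that a tar member is a 512-byte header followed by its data, and that HTTP byte ranges are inclusive at both ends.

```rust
/// Build the value of an HTTP `Range` header covering `len` bytes
/// starting at `offset`. HTTP byte ranges are inclusive on both ends,
/// so the last byte requested is `offset + len - 1`.
/// Returns None for an empty range or if the arithmetic would overflow.
fn range_header(offset: u64, len: u64) -> Option<String> {
    if len == 0 {
        return None;
    }
    let last = offset.checked_add(len - 1)?;
    Some(format!("bytes={}-{}", offset, last))
}

/// A tar member is stored as a 512-byte header followed by its data,
/// so the data begins 512 bytes after the member's header offset.
/// (Hypothetical helper: the real index may store the data offset directly.)
fn tar_data_offset(header_offset: u64) -> u64 {
    header_offset + 512
}
```

With made-up numbers, a 50,000-byte image whose tar header sits at byte 1,048,576 would be requested with `range_header(tar_data_offset(1_048_576), 50_000)`, so the browser transfers only that image rather than the whole archive.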
The WebAssembly module is about 280 KB in total. Nothing runs server-side: there is no backend, no database, and no server compute, only static file hosting. The same files could be served from Cloudflare R2, IPFS, a local directory, or any web host that can serve files.
Current status
This is an early-stage project. We're actively working on:
- Mirroring all data — downloading all 40,622 volumes (~4.4 TB) from Harvard to our own storage
- Building the full index — currently only a handful of test volumes are indexed; the full corpus index will follow once the mirror completes
- IPFS distribution — pinning the volume archives on IPFS for permanent, decentralized access
- Citation parsing — a Rust port of eyeCite to accept natural citation strings like "31 A.2d 647"
- Full-text search — serverless search via RoaringBuckets bitmap indexes on Cloudflare R2
- Federal and state statutes — the same architecture works for statutes (U.S. Code, Statutes at Large, state codes) using public domain data from govinfo.gov and state legislatures
- Fully decentralized hosting — the site itself (not just the data) served from IPFS, making the entire system uncensorable and permanent
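To illustrate the citation-parsing item above, here is a deliberately naive sketch of splitting a string like "31 A.2d 647" into volume, reporter, and page. It is not the eyeCite algorithm and not the project's planned port. The `Citation` type and `parse_citation` function are hypothetical, and the sketch assumes the citation is already isolated from surrounding text.

```rust
/// A citation split into its three parts, e.g. "31 A.2d 647".
#[derive(Debug, PartialEq)]
struct Citation {
    volume: u32,
    reporter: String,
    page: u32,
}

/// Parse "<volume> <reporter> <page>". The reporter may itself contain
/// spaces (e.g. "F. Supp. 2d"), so everything between the first and
/// last whitespace-separated token is treated as the reporter.
/// Returns None if there are fewer than three tokens or if the
/// volume or page is not a plain number.
fn parse_citation(input: &str) -> Option<Citation> {
    let tokens: Vec<&str> = input.split_whitespace().collect();
    if tokens.len() < 3 {
        return None;
    }
    let volume = tokens[0].parse().ok()?;
    let page = tokens[tokens.len() - 1].parse().ok()?;
    let reporter = tokens[1..tokens.len() - 1].join(" ");
    Some(Citation { volume, reporter, page })
}
```

A real parser also has to validate the reporter against known abbreviations and handle pin cites, parallel citations, and citations embedded in prose, which is the work a pattern library like eyeCite does.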
Help preserve case law
This data is at risk. Harvard's hosting is a soft commitment from an academic lab in wind-down mode. The more copies that exist, the safer the data is.
If you can help, here's how:
- Mirror the data — the full dataset is ~4.4 TB at static.case.law. If you have storage, grab a copy. Eventually, you'll be able to mirror the entire stare.pub system — site, search index, and data — from a single command.
- Pin on IPFS — once we publish IPFS CIDs, pin them from your node to help distribute
- Seed torrents — per-reporter torrents are planned for bulk distribution
- Contribute code — the project is open source on GitHub
Technical details
- Frontend: Astro + vanilla JS + CSS custom properties
- Core: Rust, compiled to both native (CLI) and WebAssembly
- Data: Harvard CAP zip/tar files — static files, hostable anywhere
- Index: Custom binary format — 8 bytes per page, direct offset lookup
- Hosting: Any static file host works — Cloudflare R2, S3, IPFS, a local web server. The entire system (site, index, and data) is designed to be trivially mirrorable.
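The "8 bytes per page, direct offset lookup" design can be sketched as follows. The entry layout is an assumption made for illustration: each entry is treated as one little-endian u64, and entry 0 is taken to belong to the volume's first indexed page. The actual stare.pub format may pack its 8 bytes differently, and `lookup_entry` is a hypothetical name.

```rust
/// Look up the 8-byte entry for `page` in an index whose first entry
/// corresponds to `first_page`.
/// ASSUMPTION: each entry is a single little-endian u64; the real
/// format may split its 8 bytes into several fields.
/// Returns None if the page falls outside the index.
fn lookup_entry(index: &[u8], first_page: u32, page: u32) -> Option<u64> {
    // Fixed-width entries mean the slot position is pure arithmetic:
    // no searching or parsing of earlier entries is required.
    let slot = page.checked_sub(first_page)? as usize;
    let start = slot.checked_mul(8)?;
    let end = start.checked_add(8)?;
    let bytes = index.get(start..end)?;
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes);
    Some(u64::from_le_bytes(buf))
}
```

Because every entry has the same width, a lookup is a single slice read. At 8 bytes per page, a volume of roughly 900 pages gives an index of about 7 KB, which is consistent with the per-volume index size quoted above.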
Credits
Built on data from the Harvard Library Innovation Lab's Caselaw Access Project. Citation detection uses patterns from the Free Law Project's eyeCite library. All case law data is public domain (CC0).