> Please let me know how you find using the data and if you'd like to see any additions or changes!
I only just started yesterday but so far so good!
Having it in Parquet files on Git LFS makes a huge difference. It only took a few lines to add the entire dataset to our CI/CD cache, which is an improvement over the ingestion scripts we normally have to write, with change detection and all that. It took less than an hour to start running the cases through our pipeline. I wish all of the GovInfo bulk data were available this way!
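For anyone curious why Git LFS makes the caching part so short, here is a rough sketch (not our actual CI config). Git LFS stores a small text pointer in the repo for each large file, and that pointer carries the file's SHA-256, so change detection is basically free: hash the pointers and use the result as the cache key. The function names `parse_lfs_pointer` and `cache_key` are made up for illustration, and the sample oids are fake.

```python
import hashlib

# First line of every Git LFS pointer file, per the Git LFS pointer spec.
LFS_VERSION_LINE = "version https://git-lfs.github.com/spec/v1"


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dict, or return None if it is not one."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != LFS_VERSION_LINE:
        return None
    # Remaining lines are "key value" pairs, e.g. "oid sha256:..." and "size 123".
    fields = dict(line.split(" ", 1) for line in lines[1:] if " " in line)
    if "oid" not in fields or "size" not in fields:
        return None
    return {"oid": fields["oid"], "size": int(fields["size"])}


def cache_key(pointer_texts):
    """Derive one short key from many pointers.

    The key changes when a tracked file's content changes or when a file
    is added or removed; it does not depend on the order of the pointers.
    """
    pointers = (parse_lfs_pointer(t) for t in pointer_texts)
    oids = sorted(p["oid"] for p in pointers if p is not None)
    return hashlib.sha256("\n".join(oids).encode()).hexdigest()[:16]


if __name__ == "__main__":
    # Two fake pointers standing in for two Parquet files (hypothetical oids).
    sample = [
        f"{LFS_VERSION_LINE}\noid sha256:{'a' * 64}\nsize 1024\n",
        f"{LFS_VERSION_LINE}\noid sha256:{'b' * 64}\nsize 2048\n",
    ]
    print(cache_key(sample))
```

In a real pipeline the pointer texts would come from the checked-out repo before the large files are downloaded (for example, a clone made with `GIT_LFS_SKIP_SMUDGE=1`), so the cache lookup happens without pulling any Parquet data.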
To be clear, they are all in the public domain, Fastcase updates included. All of the proprietary info was redacted by hand, and the opinions themselves are not copyrightable. The throttling is a contractual obligation to a project partner that limits Harvard's distribution of the cases until February 2024, but that's it. There are also exceptions: cases where the publication is no longer in copyright, and jurisdictions that already publish their opinions online (there are 3 or 4). Those are accessible without throttling through the API and through bulk downloads right now.
This should have more up-to-date and accurate information than I do:
https://case.law/about
Harvard's Library Innovation Lab is a product studio building open-source tools and services for open knowledge, available to everyone and in the public interest.
We're looking for a Senior Product Designer to join our product team and help lead the brand and visual development of our lab and its products. Promising candidates will have a strong visual design portfolio and experience in user research, brainstorming, and wireframing products from inception to high fidelity. This is an opportunity for a design lead to use their craft to help positive ideas travel further and faster.
Apply at the link above or feel free to send us questions at lil@law.harvard.edu