
PGDP is the project that's doing high-quality book transcriptions. OpenLibrary is a distribution mechanism.


OpenLibrary also does automated OCR-type stuff, though generally it's subsidiary to the scans of the page.

But search and some recent tools for extracting data from books (e.g. they find URLs in books and then save them in the Wayback Machine so people can see what the book linked to) all rely on automated OCR, so any improvements will help.
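
To give a rough idea of what that URL step might look like, here's a minimal sketch. It is not OpenLibrary's actual code: the regex, the function names, and the sample text are all made up for illustration, and it only builds Save Page Now style links (assuming the usual `https://web.archive.org/save/<url>` form) rather than calling anything.

```python
import re

# Hypothetical pattern: http(s) URLs, stopping at whitespace or
# common closing punctuation. Real OCR text would need more care.
URL_RE = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_urls(ocr_text):
    """Return unique URLs found in OCR text, in order of first appearance."""
    seen = []
    for match in URL_RE.findall(ocr_text):
        url = match.rstrip(".,;:")  # drop trailing sentence punctuation
        if url not in seen:
            seen.append(url)
    return seen


def wayback_save_url(url):
    """Build a Save Page Now style link (assumed URL form, no request is made)."""
    return "https://web.archive.org/save/" + url


# Made-up sample standing in for a page of OCR output.
sample = (
    "See http://example.com/page. Also (https://example.org/a?b=1), "
    "and http://example.com/page again."
)

for u in extract_urls(sample):
    print(wayback_save_url(u))
```

The point is just that a mis-read character anywhere in a URL (say `l` vs `1`) means the wrong page gets archived, or none at all, which is why better OCR helps here.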



