Title: Herculaneum scrolls: A 20-year journey to read the unreadable
it goes a little bit into the technology of how this was done, deep learning finally cracked the code. They had the scans for a decade but it took ML training to be able to identify which parts were paper and which parts were the ink on top. This had been done on a different set of scrolls with easier to read higher contrasting materials like the video says, 20 years ago. Deep learning is cracking the code for these datasets we had previously thought were impossible to algorithmically solve.
Can't speak for the video, but this is a bit misleading actually. What cracked this was actually visual inspection looking for patterns which could then be used as better training data, which so far apparently hasn't found very many letters that were too hard to see. Read the OP describing the iterative process of hand-annotation guided by output of a model, then retraining the model with the additional data, it's a fascinating technique! Simply using deep learning on the initially available ground truths without knowing what features the models should be looking for actually pretty much didn't work!
Also, so far the process of virtually unrolling the scrolls is mostly manual and extremely labour intensive.
This is highly misleading. Deep learning was not what did the discovery, the find was handmade. They're trying to make a deep learning model do what was done by hand here, but so far they haven't had success in it finding actual letters.
These models certainly have found letters, although mostly they produce unreadable partial letters. Look at the images from the "What’s next?" section of the article! [1] They certainly seem better than human annotation, and more importantly don't hallucinate whole letters. Casey Handmer made a submission for this First Letters prize [2] based solely on hand-annotation and wasn't awarded it, because it's really unconvincing. His letters [3] don't look at all like the computer annotations.
Title: Herculaneum scrolls: A 20-year journey to read the unreadable
it goes a little bit into the technology of how this was done, deep learning finally cracked the code. They had the scans for a decade but it took ML training to be able to identify which parts were paper and which parts were the ink on top. This had been done on a different set of scrolls with easier to read higher contrasting materials like the video says, 20 years ago. Deep learning is cracking the code for these datasets we had previously thought were impossible to algorithmically solve.