
"A Data Capsule is a secure, virtual computer that allows what’s known as “non-consumptive” research, meaning that a scholar can do computational analysis of texts without downloading or reading them. The process respects copyright while enabling work based on copyrighted materials."

And that is completely ridiculous; a technical solution to an invented problem.
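
For concreteness, the “non-consumptive” model reduces to: the researcher’s code runs next to the corpus, and only aggregate results ever leave. A toy sketch of the idea in Python (all names hypothetical; this is not the actual Data Capsule API):

    from collections import Counter

    def run_in_capsule(corpus_paths, analysis):
        # Raw text is opened and read only in here; the caller gets
        # back nothing but the merged aggregate counts.
        totals = Counter()
        for path in corpus_paths:
            with open(path, encoding="utf-8") as f:
                totals.update(analysis(f.read()))
        return dict(totals)

    # The researcher submits this function and never sees the texts:
    def term_frequency(text):
        return Counter(word.lower() for word in text.split())

    stats = run_in_capsule(["vol1.txt", "vol2.txt"], term_frequency)

The real system wraps this in a locked-down VM; the sketch only shows the data-flow constraint being sold.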



There's an easier solution; we call it "my friend Ivan".

I have a friend, Ivan, who lives somewhere in the world - he's notoriously reticent about his location, and only communicates over OTR at strange times. Whenever I want to get some research done, I ask him if he happens to know this or that fact about this or that copyrighted database. I then cite him as a source if anyone has any questions.


> And that is completely ridiculous; a technical solution to an invented problem.

Congratulations, you’ve just described most software projects by libraries and archives.

It’s funny, because back in the 60s-80s, libraries were leaders in building shared data systems and networked infrastructure. The history of OCLC describes this well.

But once the web came around, they had an identity crisis, were unable to react to technology trends, and largely got conned into predatory and restrictive arrangements by service providers (Elsevier, ProQuest, etc.). The same thing happened earlier with microfiche, which led libraries to destroy huge, valuable portions of their collections that could have been better preserved once scanning technologies arrived.


Most librarians would agree with you that the current situation is terrible, and if you look at library literature they've generally felt that way from the beginning. They are highly constrained by copyright law.

re: microfiche, it lasts much longer than digital scans -- kept in good conditions it has a usable half-life of more than 100 years, while digital files need much more active maintenance, both to prevent bit-rot and to ensure the file format is still readable (e.g. countless file formats have been abandoned and are now only accessible via emulators of older machines).

And if you want to bring the cloud into this, most libraries don't have the funding to bring in the technical know-how to manage a private S3-style store accessible only from that building.
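
That "active maintenance" is mundane in practice: keep a checksum manifest from the moment of ingest, then periodically re-hash everything against it. A bare-bones fixity audit in Python, roughly what tooling built on the BagIt spec automates (the manifest layout here is invented for illustration):

    import hashlib
    import pathlib

    def sha256(path):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    def audit(root, manifest):
        # manifest: {relative_path: expected_hash}, recorded at ingest
        for rel, expected in manifest.items():
            p = pathlib.Path(root) / rel
            if not p.exists():
                print("MISSING", rel)
            elif sha256(p) != expected:
                print("CORRUPT", rel)  # bit-rot: restore from another copy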


> prevent bit-rot

What I do is regularly copy my files forward onto newer media. I started this back in the 1970s, and it is the only reason I still have a copy of the FORTRAN-10 source code of Empire:

https://github.com/DigitalMars/Empire-for-PDP-10

All the other stuff I wrote at the time is lost because I stored it on magtape, and the Caltech magtape drive had drifted so far out of spec that the tapes could only be read on that one machine, which itself was lost.

I managed to preserve the Empire source by copying it over a serial line to a PDP-11 and storing it on a PDP-11 floppy. I was later able to save my PDP-11 code by copying it, again over a serial line, to an IBM PC and putting it on 5.25" disks. As time went by, the files migrated to Zip drives, then CD-ROMs, then a long sequence of hard drives (my older hard drives can't be read with modern IDE interfaces, even when the connector fits; I have no idea why).

I remember reading boxes of 5.25" floppies and burning them onto CD-ROMs, a long and tedious process. Now nothing will read 5.25" floppies anymore, but copying a year-old hard drive to a new one is simple, especially since the new drive is usually much larger than the old one.

Hence I have most of the stuff I've worked on since the early 1980s. The old Zortech bulletin board stuff is gone, however, even though I still have its hard drive; nothing can read that old drive. Not that there's anything particularly interesting on it, but I enjoyed running the BBS for many years.
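
For what it's worth, the copy-forward routine needs nothing exotic. A minimal sketch, assuming plain files on two mounted drives (paths hypothetical): copy everything, then verify byte-for-byte before retiring the old medium.

    import filecmp
    import pathlib
    import shutil

    def copy_forward(old_root, new_root):
        # Mirror every file onto the new drive, then compare
        # byte-for-byte before trusting the copy.
        old, new = pathlib.Path(old_root), pathlib.Path(new_root)
        for src in old.rglob("*"):
            if not src.is_file():
                continue
            dst = new / src.relative_to(old)
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 preserves timestamps
            if not filecmp.cmp(src, dst, shallow=False):
                raise IOError(f"verification failed: {src}")

With shallow=False, filecmp actually reads both files, so a silent bad copy gets caught while the old drive can still be re-read.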


Copies of microfiche are lossy though, right? It may last longer between copies, but eventually you'll need to copy it.


> Most librarians would agree with you that the current situation is terrible, and if you look at library literature they've generally felt that way from the beginning.

I know! I’m a former librarian who has worked with a lot of “big players” on the institutional and software side of things.

> They are highly constrained by copyright law.

Here’s where I disagree with you: they’re largely constrained by the administrative bloat that has permeated all of academia, by administrators with no technical expertise who prefer corporate solutions and large-scale, dead-end projects for resume padding. I was on the receiving end of this so many times that I left the field.

> re: microfiche, it lasts much longer than digital scans -- kept in good conditions it has a usable half-life of more than 100 years, while digital files need much more active maintenance, both to prevent bit-rot and to ensure the file format is still readable (e.g. countless file formats have been abandoned and are now only accessible via emulators of older machines).

True on the digital files part, not so much on the microfiche/film, which in many cases proved to be of poor durability and prone to data loss. But my comment was more about how its adoption caused libraries to destroy huge parts of their collections, with little recourse once microfiche/film failed to live up to its marketing claims or was poorly implemented. I recommend Nicholson Baker’s “Double Fold” for a good account of all of this.


It is discouraging that so much effort has been spent in a futile attempt to impose the limitations of physical media onto digital information.

It's as if we passed laws to require that all email must be delayed for at least two days in order to preserve the business model of the post office.



