Hacker News

It's amazing how wholesale scraping of data was being celebrated, because the scrapers "were making the internet better".

This is the first topic I've seen where HN's usual libertarian bias disappears and no one challenges the underlying notion that CL is part of the Commons. Channelling Dagny Taggart, I'd say 3Taps & PadMapper were a bunch of moochers.



You're making it sound like the scraping was being done to replace the functionality of CL, which, yeah, would be pretty transparently shitty to do. But they weren't doing that, especially PadMapper: they were indexing the content to make it more accessible, an action that's been taken probably trillions of times and is pretty much the main reason most of the internet is even usable today. It's like accusing Google of plagiarizing your website because they linked to it.


That's a good point - does Craigslist have a robots.txt to prevent Google from crawling it? If not, isn't Google guilty of the very same thing, by aggregating the information via search results?


Craigslist doesn't prevent Google from crawling them. Not only that, Craigslist also sued at least one company for scraping Google results in order to index Craigslist postings.


> User-agent: *
> Disallow: /reply
> Disallow: /fb/
> Disallow: /suggest
> Disallow: /flag
> Disallow: /mf
> Disallow: /eaf

Nothing blocking listings... OR PadMapper...
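
For anyone who wants to check this themselves, here is a minimal sketch using Python's standard `urllib.robotparser`. It only uses the rules quoted above. The host `example.org`, the user agent `ExampleBot`, and the sample paths are placeholders I made up, not real Craigslist URLs.

```python
from urllib.robotparser import RobotFileParser

# Rules exactly as quoted in the comment above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /reply
Disallow: /fb/
Disallow: /suggest
Disallow: /flag
Disallow: /mf
Disallow: /eaf
"""


def is_allowed(path, user_agent="ExampleBot"):
    """Return True if the quoted rules permit `user_agent` to fetch `path`.

    The host and user agent are hypothetical; only the path matters here,
    because the rules apply to every agent ("User-agent: *").
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, "https://example.org" + path)


if __name__ == "__main__":
    # "/search/apa" is a made-up stand-in for a listings page.
    for path in ("/search/apa", "/reply/123", "/flag"):
        print(path, "allowed" if is_allowed(path) else "disallowed")
```

Under these rules, any path that doesn't start with one of the listed prefixes is open to every crawler. Keep in mind that robots.txt is only a crawling convention, so this says nothing about the legal question being argued in the thread.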


> This is the first topic I've seen where HN's usual libertarian bias disappears

Not even close.



