It's amazing how wholesale scraping of data was being celebrated, because the sc...

rev_bird · on July 2, 2015

You're making it sound like the scraping was being done to replace the functionality of CL, which, yeah, would be pretty transparently shitty to do. But they weren't doing that, especially PadMapper: They were indexing the content to make it more accessible, an action that's been taken probably trillions of times and is pretty much the main reason most of the internet is even usable today. It's like accusing Google of plagiarizing your website because they linked to it.

mark-r · on July 2, 2015

That's a good point - does craigslist have a robots.txt to prevent Google from crawling it? If not, isn't Google guilty of the very same thing, by aggregating the information via search results?

makomk · on July 2, 2015

Craigslist doesn't prevent Google from crawling them. Not only that, Craigslist also sued at least one company for scraping Google results in order to index Craigslist postings.

rev_bird · on July 2, 2015

> User-agent: *
> Disallow: /reply
> Disallow: /fb/
> Disallow: /suggest
> Disallow: /flag
> Disallow: /mf
> Disallow: /eaf

Nothing blocking listings... OR PadMapper...
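
If anyone wants to check this rather than eyeball it, here is a minimal sketch using Python's standard-library `urllib.robotparser` against the rules quoted above. The user-agent string `PadMapperBot` and the paths `/search/apa` and `/reply/123` are made up for illustration; I don't know what PadMapper's crawler actually identified itself as, and the rules are pasted in as a string instead of being fetched live.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted above, pasted in as text (not fetched from the live site).
ROBOTS_TXT = """\
User-agent: *
Disallow: /reply
Disallow: /fb/
Disallow: /suggest
Disallow: /flag
Disallow: /mf
Disallow: /eaf
"""


def is_allowed(user_agent: str, path: str) -> bool:
    """Return True if the quoted rules permit user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Hypothetical crawler name and paths, for illustration only.
    for path in ("/search/apa", "/reply/123", "/flag"):
        print(path, is_allowed("PadMapperBot", path))
```

Since the only group is `User-agent: *`, every crawler gets the same answer: anything not starting with one of the disallowed prefixes is permitted, which is the point about listings not being blocked.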

mikerichards · on July 2, 2015

This is the first topic I've seen where HN's usual Libertarian bias disappears

Not even close.