Hacker News | squatrito23's comments

Alternative data (data feeds for investment managers and hedge funds) seems to be heading towards consolidation.


It all comes down to what data is involved: PII (Personally Identifiable Information) and IP (Intellectual Property) are strongly protected. There is a lot of useful information that is neither PII nor IP and that can be collected, used, and sold, such as prices (the Ryanair case against Expedia being a crucial one).


I'm somewhat ignorant on this matter, but I was under the impression that in some jurisdictions (e.g. the UK) scraping as a wholesale concept is a pretty grey area at best, not just where PII and IP are concerned.


Not all content is covered by IP (intellectual property). Long texts (like an article), images, and videos are. But factual information (such as prices, a hot topic in web scraping) is not.


> Long texts (like an article)

Short ones, too.


Is it proprietary data or web scraped? This might change the strategy. If it is proprietary, has enough history, and is relevant to the market you cover (and this market has listed companies operating in it), you should consider selling it to hedge funds. There is an entire ecosystem for so-called "alternative data" that you should look at. But the big money is in 1) proprietary data and 2) long historical trends.


I’m interested in hearing more about this. It might not pertain to the data I’m thinking of specifically, as that would be scraped, but finding out where the holes are and creating proprietary data around them would be a consideration. I’m also interested in diving deeper into alternative data if you have some solid sources I can brush up on.


The issue with general data markets is that they are too generic. They can't address the quality assurance problem, so as a buyer you'd end up doing your own due diligence for every merchant on the marketplace. It is as if you had to visit each Airbnb in person before booking a night there. A marketplace dedicated to web scraping might address this; a generic one surely will not.


On the legal side, legislation in the US, UK, and EU is clear on PII (Personally Identifiable Information) and IP (Intellectual Property). Web scraping can be perfectly legal when these elements are taken into account. Here is my article: https://blog.databoutique.com/p/web-scraping-legal-context


Definition: There is no intrinsic difference in value based on who is doing it, as long as it's done well.


Hi, during my daily talks within the web scraping community, I have had the gut feeling that we’re all scraping the same websites. To test this, I decided to create the first Web Scraping Website Ranking, an anonymous poll to better understand where web scraping professionals are putting their efforts. It would be great if everyone involved, whether as users or as data providers, shared the names of the websites they are working on, in an anonymous and safe way. The results of this poll will be used as input for a series of reports about the web scraping industry.


Python library for dealing with proxies in your Scrapy project

