Wow we were just talking about selling shovels in a (ML) gold rush.

Incidentally, what is the open source alternative to this? Data is so cheap that it should be actually free, unlike counterfeit Nike shoes.

(Does a BitTorrent tracker specifically for research data exist? Edit: there's http://academictorrents.com/)



> What is the open source alternative to this?

There are many public-domain datasets:

* https://aws.amazon.com/opendata/public-datasets/

* https://cloud.google.com/public-datasets/

* https://github.com/awesomedata/awesome-public-datasets

> Data is so cheap that it should be actually free

I feel like you haven't had the opportunity to work with data that has value.


If a dataset is really valuable, it won't be available for public consumption/copying, and, like Nike, won't be selling itself on Amazon.


Sometimes it's not that the data itself doesn't have value; it's that getting the value out of that data takes a lot of blood, toil, and treasure in human capital.

That's how data companies make money, by doing that work for you.


> if a dataset is really valuable, it won't be available for public consumption

Data vendors sell valuable data all the time to essentially anyone who is willing to pay their asking price. I don't see how this is any different. Just because AWS is trying to run a marketplace doesn't suddenly make the data "public" in an open/free sense; there's still a price tag attached to it.

> and, like Nike, won't be selling itself on Amazon

Not every data vendor (or retailer since you keep bringing up Nike) has the necessary brand recognition, marketing budget, or technical proficiency to only sell direct to customers: that's why centralized marketplaces and alternate distribution channels exist.


> valuable data all the time to essentially anyone

If it's truly valuable, it isn't sold to 'essentially anyone'. Take high-value financial info or security, for example.


Sorry, I don't agree. There have been a number of successful start-ups that monetized publicly available data in their products. Real estate and finance come to mind, but there's a ton of data; you just have to grab it and make sense of it for someone who's willing to pay for that service.


Thomson Reuters, Bloomberg, FTSE, LSE, NYSE, Nasdaq, etc. all sell data that is very valuable.

I feel like you haven't had the opportunity to work with data that has value.


I doubt TR or Bloomberg will be selling on Amazon, considering what they do to other lines of business. Unless they're foolish, of course.


Bloomberg usually prevents customers from using their other services and platforms if the customers pay for data from a competing provider. The settlements from lawsuits that follow are only a fraction of what Bloomberg gains by compelling customers to use the Bloomberg suite, so they continue to conduct business this way.

It will be a very busy, but lucrative time for corporate law firms as they battle things out.


Emmmm. Wikipedia is pretty valuable yet it is free.


Sorry, I meant valuable in the monetary sense. Wikipedia is very valuable, but its market price is $0.


Thinking that there are no useful freely available datasets is beyond naive.


Data by itself is cheap.

Data that has a useful business or predictive purpose, and that is clean and constantly updated, is not cheap at all.


Why not sell them directly through an API instead of paying an Amazon tax?


Most people would host that API using AWS, and there you are.


Most people probably already are. The marketplace is just another distribution channel.


> Data is so cheap that it should be actually free, unlike counterfeit Nike shoes.

Getting data is not cheap, and maintaining a dataset is certainly not cheap either.


> Data is so cheap that it should be actually free, unlike counterfeit Nike shoes.

This really depends on the data. In pharma the right 50 bytes of data can be worth billions. Not all data is personal product preferences for ad targeting.


DBnomics (https://db.nomics.world/) is an example of a free & open API to source large volumes of data, in this case economic data. It's also completely open source.




