
Except that buried deep inside that forest of functions you just wrote is the same inner join that an embedded SQL query would do in a couple of lines.
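To make the comparison concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `users` and `orders` tables and their contents are made up for illustration; they are not from anyone's actual codebase. The first query is the inner join in SQL, and the second half does the same matching by hand in application code.

```python
import sqlite3

# Hypothetical schema and data, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
    INSERT INTO users  VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO orders VALUES (10, 1, 9.5), (11, 1, 20.0), (12, 2, 5.0), (13, 99, 1.0);
""")

# The SQL version: one inner join.
sql_rows = conn.execute("""
    SELECT u.name, o.total
    FROM users u
    JOIN orders o ON o.user_id = u.id
    ORDER BY o.id
""").fetchall()

# The procedural version: fetch both tables, then match them by hand.
users = {uid: name for uid, name in conn.execute("SELECT id, name FROM users")}
manual_rows = [
    (users[user_id], total)
    for _, user_id, total in conn.execute(
        "SELECT id, user_id, total FROM orders ORDER BY id"
    )
    if user_id in users  # inner-join semantics: drop orders with no matching user
]

print(sql_rows == manual_rows)
```

Both produce the same rows; the hand-written version just reimplements the join (here as a dictionary lookup) outside the database.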


More like: when it comes to complex data structures and logic, I'll do that outside of SQL. I'll do a join in SQL no problem, but by the time we're doing multiple inner joins I usually prefer to just run multiple SQL queries. I don't care about performance that much.


That often won't scale to millions of records. Letting the database optimizer find the best execution path, instead of doing it procedurally elsewhere, can be the difference between “finishes in 5 minutes” and “doesn’t finish overnight”.


This isn’t the 90s. Most hardware is way over-specced for the data sizes most people are dealing with.

The number of use cases which are too heavy to finish in hours but small enough to fit in a single instance is pretty limited.


Cost is another reason to optimize queries: long-running, inefficient queries will be a lot more expensive on things like Snowflake than efficient ones.


SQL is popular because it can be run on a map/reduce backend. So once you have written your code it can run on any number of machines.


a) SQL is not that popular on map/reduce backends. Most people are doing it in code.

b) Only basic SQL works on any database, and even then there are major differences in how they treat things like NULLs, type coercion, etc.
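A small sketch of the NULL point, run against SQLite only via Python's sqlite3 module. It shows how this one engine answers three NULL questions; it says nothing about how other engines behave, and Oracle, for instance, treats the empty string as NULL, so the last column would come out differently there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Three NULL checks as evaluated by SQLite specifically.
null_eq, null_is, empty_is_null = conn.execute(
    "SELECT NULL = NULL, NULL IS NULL, '' IS NULL"
).fetchone()

print(null_eq)        # None: comparing NULL with = yields NULL, not true
print(null_is)        # 1: IS NULL is the test that actually works
print(empty_is_null)  # 0: SQLite keeps '' distinct from NULL
```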


BigQuery? Athena/Redshift?


I usually only do one join at a time. But I separate them with CTEs ("WITH"). I can agree that many joins at once can make you grow grey hair.
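Roughly what that style looks like, as a sketch with Python's sqlite3 module. The three tables and the CTE names (`user_orders`, `named_orders`) are invented for the example; each CTE adds exactly one join on top of the previous step.

```python
import sqlite3

# Hypothetical schema and data, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users    (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders   (id INTEGER PRIMARY KEY, user_id INTEGER, product_id INTEGER);
    CREATE TABLE products (id INTEGER PRIMARY KEY, title TEXT);
    INSERT INTO users    VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO products VALUES (100, 'lamp'), (101, 'desk');
    INSERT INTO orders   VALUES (10, 1, 100), (11, 2, 101), (12, 1, 101);
""")

cte_rows = conn.execute("""
    WITH user_orders AS (          -- step 1: orders joined to users
        SELECT o.id AS order_id, u.name, o.product_id
        FROM orders o
        JOIN users u ON u.id = o.user_id
    ),
    named_orders AS (              -- step 2: add the product title
        SELECT uo.order_id, uo.name, p.title
        FROM user_orders uo
        JOIN products p ON p.id = uo.product_id
    )
    SELECT order_id, name, title
    FROM named_orders
    ORDER BY order_id
""").fetchall()

print(cte_rows)
```

Each intermediate step has a name and can be selected from on its own while debugging, which is most of the appeal.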



