
One of the big problems with science is the proliferation of Excel "databases". It is very easy to look at a lot of numbers and make quick calculations with them. However, when you start getting into extremely large datasets, your propensity to make mistakes increases. A2:A10 here, B3:B11 there, etc... This is one reason why recent versions of Excel warn you when your formulas aren't in sync.

However, all of this is fixed when you are using SQL to properly query a database. Why? Because you are forced to write a SQL statement that details exactly what you want done. With Excel it can all be hidden away behind the cells. With SQL, it's out in front, so it's easier to check.
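To make the "out in front" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `growth`, its columns, and the helper `average_growth` are all made up for illustration; the point is only that the filter and the aggregation sit in one readable statement instead of being spread across cell formulas.

```python
import sqlite3


def average_growth(rows, min_year, max_year):
    """Average the pct column per country for years in [min_year, max_year].

    The table and column names are hypothetical, chosen for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE growth (country TEXT, year INTEGER, pct REAL)")
    conn.executemany("INSERT INTO growth VALUES (?, ?, ?)", rows)

    # Everything that decides which rows are included is visible right here.
    query = """
        SELECT country, AVG(pct) AS avg_pct
        FROM growth
        WHERE year BETWEEN ? AND ?
        GROUP BY country
        ORDER BY country
    """
    result = conn.execute(query, (min_year, max_year)).fetchall()
    conn.close()
    return result


rows = [
    ("A", 2000, 1.0),
    ("A", 2001, 3.0),
    ("B", 1990, 10.0),  # outside the requested years, so excluded by the WHERE clause
    ("B", 2000, 2.0),
    ("B", 2001, 4.0),
]
print(average_growth(rows, 2000, 2001))
```

A reviewer can check the WHERE and GROUP BY clauses directly, rather than clicking through cells to find out which range each formula covers.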

People like to use Excel because they can get an answer quickly without all that "programming". The problems start to arise when you need better tools, but only know Excel. So, in this case, it's not a matter of a craftsman blaming their tools, it's closer to an amateur trying to pretend to be a professional.

Excel is a wonderful spreadsheet. It is a horrible database.



Unless you write SQL queries that use stored procedures or pre-calculated results tables, and then you just wind up in the same situation as Excel.

I mean Excel is at its heart a query language. So your logic equally applies to Excel, why not write a massive query in Excel that does all the calculations in one go so you can see the inner workings?

All you're really doing is playing musical chairs with the data. SQL query language for Excel query language, and data moved from tables to worksheets.

As I said earlier, unless there are procedural changes upstream nothing will change. A tool is just a tool. You can use it in a way to minimise mistakes, or not. Humans are the weak point.

Now, there are tools that automatically identify common mistakes, but neither Excel nor SQL/relational database engines are in that class.


> I mean Excel is at its heart a query language. So your logic equally applies to Excel

The least readable query language ever. Instead of names, everything must be addressed as column,row. Imagine a C program that, instead of declaring variables with sane names as needed, simply created a large array of each type and used constant numeric indices. That's what Excel requires.
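A rough sketch of that analogy, in Python rather than C to keep it short. The values and the names `price`, `tax_rate`, and `total` are invented for illustration; both halves compute the same thing, and only the second says what it is computing.

```python
# Position-addressed style: one big grid, constant numeric indices.
cells = [[0.0] * 3 for _ in range(3)]
cells[0][0] = 200.0                            # a price? a quantity? the index does not say
cells[0][1] = 0.07
cells[0][2] = cells[0][0] * (1 + cells[0][1])  # roughly "=A1*(1+B1)"

# Named style: the same calculation with variables declared as needed.
price = 200.0
tax_rate = 0.07
total = price * (1 + tax_rate)

print(cells[0][2], total)
```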


Excel cells have been nameable since forever... It also has supported types since forever. I'm literally talking Excel 2000 or older functionality[1][2].

With respect, do you even know how to use Excel? Because everything you just said is incorrect.

[1] http://www.computerhope.com/issues/ch000704.htm

[2] http://support.microsoft.com/kb/274504


Anecdotally I have to strongly disagree here.

Only one of the tens of heavy Excel users I have worked with knows of this feature or others like it. I am not a heavy user of Excel, but my theory is that this is due to a failure of explanation by the software. Excel does not, by its design, encourage good coding, and most Excel users have no programming background, so they do not realize that these features should even be there and won't know to look them up in the manual.


And it's easy to write horrifically bad, but working, code in C or C++. How exactly do you propose Excel butt into your workflow to expose advanced features?

If you plan to use Excel to aid you in making serious money (as countless businesses do), can't you fork out a thousand extra for an employee who actually knows how to use Excel properly?


Naming a single cell doesn't solve the problem of accidentally using a different range of values. In a normal programming language you would define a variable such as GDP containing a list of values for the countries of interest; in Excel, every formula has to restate the range, which is what caused the mistake.
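A small sketch of the difference I mean, with invented numbers and hypothetical names (`gdp_growth`, `mean`). The first calculation restates its own range and silently drops a row; the second refers to the collection by name, so there is no range to get wrong.

```python
# Hypothetical growth figures for four countries (made-up values).
gdp_growth = {"A": 2.0, "B": 4.0, "C": 6.0, "D": 8.0}


def mean(xs):
    xs = list(xs)
    return sum(xs) / len(xs)


# Range-style: the formula restates which rows to use, and here the
# slice stops one row short without any error being raised.
values = list(gdp_growth.values())
mean_restated = mean(values[0:3])

# Name-style: defined once, used whole.
mean_named = mean(gdp_growth.values())

print(mean_restated, mean_named)  # the two results differ
```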


That's not the case.

You can have named ranges in Excel - in fact, you can have named ranges which are dynamically calculated based on a formula.



