1. Software is easy to fix. Compared to other construction disciplines, software is the cheapest to fix.
2. The consequences of failure are typically minimal. Most software is not mission critical. Failure means minor inconvenience for a lot of people. Moreover, since your competitors are no more reliable than you are, you don't even stand to lose customers by having an occasional failure. (This is the one the article discusses.)
3. Software has more degrees of freedom. Unlike physical building disciplines where the types of materials are limited and generally well known, and where the number of physical dimensions is at most three, software takes place in an extremely complex and multidimensional operation space. Moreover, we have no evolved intuition about how things behave in that space.
4. Comparatively minor errors result in comparatively major failures. If you forget one rivet on a building, the engineering tolerances ensure the building will not fail as a result. But it is not really possible to make a server resilient to null pointer exceptions. It either works or it doesn't. Software fails partially much less often than physical things.
One of my favorite quotes from my old data structures professor in college was this: "software is non-linear."
If you're laying down the bricks for a house, there really aren't any cheap tricks or shortcuts. Building two houses takes twice as many bricks as building a single house, and so on. We could say that brick usage scales linearly with house construction. This is a bit of a bummer for bricklaying enthusiasts, but on the other hand, a single brick out of place is probably not going to bring your roof crashing down.
Software is different. Often, the distinction between processing 100 files and 1,000 files is minimal at best. This ability to scale effortlessly is often the best thing about software, but the same non-linearity is also the price we pay for power. A single subtle programming error can cause millions of dollars' worth of damage and bring massive systems to a grinding halt.
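The non-linearity described above can be made concrete with a small sketch (the function names and data are mine, purely for illustration): the same loop handles a thousand inputs as easily as ten, yet a one-character slip produces wrong results on every run without ever crashing.

```python
# Software scales "for free": the same code processes 100 files or
# 100,000 with no extra effort. But the same leverage applies to bugs.

def process(records):
    # Correct version: normalizes every record.
    return [r.strip().upper() for r in records]

def process_buggy(records):
    # One-character mistake: the slice [:-1] silently drops the last
    # record from every batch, at any scale.
    return [r.strip().upper() for r in records[:-1]]

batch = [f"record-{i}\n" for i in range(1000)]
print(len(process(batch)))        # 1000
print(len(process_buggy(batch)))  # 999 -- no error, just quietly wrong
```

Unlike the misplaced brick, the off-by-one costs nothing extra per unit: it corrupts one record per batch whether you run the code once or a million times.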
Overall, I think this is what makes software interesting and exciting to work with, but I can understand how it can pose a steep challenge once things like reliability and security become major concerns. The way I see it, much of research and development into better programming languages and tools has been to see how we can systematically mitigate these kinds of propagating errors and make our systems more robust to mistakes.
Yes, but aren't we past this "software development is a production process" misunderstanding by now? We have solved the production problem: I can churn out thousands of identical binaries per day through my build system. The production side of things is solved. Other disciplines look at us enviously for it.
That's not where the troubles come in. The problems we face are generally those of design.
Our production is done by absolute, complete idiots - computer programs. Compilers can't think and can't fix design mistakes for us.
Contrast that with high-complexity engineering projects, e.g. fighter planes or the Apollo program, where it usually turns out the official design specs tell at best half of the story, and most of the knowledge was contained in the heads of the builders, encoded in the shape of and adjustments to the tooling they used to build those vehicles. That of course creates a problem today, now that both the tooling and the builders are long gone - but it demonstrates clearly how a build process made of people can fix a lot of errors in the design.
Are you sure? Pretty much every book and methodology over the past two decades has focused on how software engineering needs to "grow up" and be more like other production industries. A cornerstone of the latest fad, "new agile", with books like The Phoenix Project, is that the software delivery process should learn from lean production industries.
This is very popular with managers and funders who desperately want the predictability. Those pressures aren't going to go away. Doesn't mean it can be made to fit in that box, though.
> The production side of things is solved. Other disciplines look at us enviously for it.
Actually, software's production process is a mess that's designed to mitigate the colossal mess that is the typical engineering culture in software development. I mean, the only reason a procedure was ever devised to deliver any fix automatically and immediately to a software system is that there was a need to repeatedly and systematically fix the product being delivered to production. That shit doesn't fly in engineering fields.
It's more that bricks have to follow the laws of physics. A building made with bricks that lack the required quality will collapse, and the builders will be found guilty.
Software doesn't fail so dramatically - small annoyances, for example, are easily dismissed, even if there are many of them.
Come on man, how can you be so stupid so as to forget something like that?
This stuff is easy, just look over the code- I mean blueprints before commi- I mean building and you'll be fine. Everyone knows you're supposed to put the weep holes in, so make sure to do that.
The most serious and expensive software quality problems tend to result from requirements analysis failures. Better programming languages and tools can't do much to mitigate those mistakes.
> Comparatively minor errors result in comparatively major failures. If you forget one rivet on a building, the engineering tolerances will make it such that the building will not fail as a result.
... but a systemic design error can cause it to fail, and this happens a lot when new techniques are developed and deployed. Change implies the increased possibility of failure.
I don't know why people keep holding up the construction industry as the paradigm here. Large civil engineering projects are notorious for cost overruns. The industry used to routinely get people killed during construction; only really post WW2 in the West has this been solved. It's still possible to have big post-construction disasters. One of the big political issues in the UK at the moment is the deployment of flammable cladding on multistorey buildings - this turns a fire in a single unit into potentially the loss of the whole building and many of its occupants.
You've all seen the Tacoma Narrows bridge video, right? But all sorts of innovations have their problems, mostly invisibly to the public. The last time someone attempted to disrupt tunneling before the "boring company", the "New Austrian Tunneling Method", it turned out to have serious problems: https://www.hse.gov.uk/pubns/natm.htm
> Software has more degrees of freedom. Unlike physical building disciplines where the types of materials are limited and generally well known,
The fact that materials used in construction are not that plentiful, and thus generally well known, is not a law of nature. In fact, it's the direct result of a radically different engineering culture. In construction, you only use building materials and techniques that are well known and demonstrably robust and performant. Even new building elements, materials, and techniques must first be subjected to a whole batch of certification procedures and approved by a bunch of regulatory bodies before they can be adopted by the public.
Hence, a brick is a brick is a brick, because bricks are standardized, and all the bricks you come into contact with are actually certified and must comply with those standards.
> you only use building materials and techniques that are well known and demonstrably robust
For better and worse. Most older building materials do not pass current building codes¹, even though many of them are perfectly safe and robust, while being way more ecological. Usually more labor intensive yet often still cheaper.
> For better and worse. Most older building materials do not pass current building codes¹,
That's primarily because they had to comply with the requirements of that time instead of requirements created in the future.
Nevertheless, rigorous compliance is not the point. The point is that the whole industry is focused on standardization, complying with specs, and using only tried-and-true technologies and techniques. That's not what the software development industry is about.
> 1. Software is easy to fix. Compared to other construction disciplines, software is the cheapest to fix.
You could just as easily say software is extremely costly to fix. Software is labor intensive, so fixing it can consume a lot of labor from extremely highly paid people.
But really, I think the way to put it is "software seems much easier to fix than a physical object." It seems that way to the software engineer and to the manager talking to the software engineer.
Not only are there no physical reminders of how difficult changing a given piece of software is, but the challenge is primarily in the unexpected and the illogical. Fixing routine X would be easy if it weren't unexpectedly and/or illogically tied into a variety of other functions. And this stream of unexpecteds is extremely easy to forget or discount.
Software fixes can be costly, but they are easier than most other fixes at scale.
Doing a recall on a popular hardware product is orders of magnitude harder than issuing a software update for that same product.
There have been a few scenarios at my job where it was easier to just solve a problem in hardware, but once a bunch of hardware gets shipped, only software can save us.
It's easier to update the software than to change the hardware. But then you have the problem that the software never really gets fixed across multiple iterations of updates - partly because of the complexity of the software, and partly because the company pushes new features (with new bugs) along with the bug fixes.
But my comment above wasn't really saying software was more or less costly than hardware. It was mostly saying that software is more unpredictable than hardware.
> You could just as easily say software is extremely costly to fix. Software is labor intensive, so fixing it can consume a lot of labor from extremely highly paid people.
Developing a fix for a software issue can be expensive; deploying the fix is relatively cheap in today's world of automated software updates.
Compare to e.g. fixing a design flaw in a car, where the development of a fix can be relatively cheap but deploying it (recalling all affected cars to the dealer to have new parts installed) is pretty much always expensive.
> Failure means minor inconvenience for a lot of people.
That alone makes software more critical than we give it credit for, especially when many people are affected.
Let's say you've got some mildly popular piece of software, with 10,000 users. Let's say a particular inconvenience means they each waste like 10 seconds per day. Total waste is 100,000 seconds per day, or 28 hours. Per day.
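The arithmetic behind that estimate, using the figures given above:

```python
# Back-of-the-envelope cost of a "minor" daily inconvenience,
# using the numbers from the comment above.
users = 10_000
seconds_wasted_per_user_per_day = 10

total_seconds = users * seconds_wasted_per_user_per_day
total_hours = total_seconds / 3600

print(total_seconds)          # 100000 seconds per day
print(round(total_hours, 1))  # 27.8 -- roughly 28 person-hours, every day
```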
It still adds up slower than many other engineering areas. A recent technical problem on my subway line (Paris Métro 6) delayed everyone by 30 minutes for about an hour. Based on average ridership figures from Wikipedia, this adds up to ~250 days lost. An article from 2016 [1] places the downtime at ~2h16 per month for that line. That's ~1000 days lost per month, for only one subway line.
The subway is not the norm, it's an outlier. Most engineering projects are not nearly as important or impactful. An apt comparison would be a fairly big internet provider, a popular game (1M users, not just 10,000), a popular productivity suite…
I have chosen the small end of the scale, but let's think about Microsoft Word for a second. Or how much time Adobe Photoshop takes to start up.
An online coffee machine? If that thing's got a vulnerability, I can already envision some corporate spy recording discussions, sending them home, and combing them for corporate secrets…
They've been putting listening bugs into appliances long before the IoT was a concept. The IoT "connect all the things" approach just makes it easy on a grand scale.
People always forget that software engineering is a relatively new thing and still rapidly evolving. If you look at buildings a couple of hundred years ago (not the ones still standing, but the average) or the golden age of flying ~1950, it was mostly the same. Better than not having those things, but nowhere near perfect.
>Software fails partially much less often than physical things.
If only this were true. The worst kind of errors are the silent kind: they're just correct enough not to crash or notify anyone, but wrong enough to corrupt results.
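A classic member of the silent-error class, sketched below: accumulating currency amounts as binary floats. Nothing crashes and nothing warns, but the total quietly drifts from the exact answer (the scenario and variable names are mine, purely for illustration).

```python
# Silent corruption: 0.1 has no exact binary float representation,
# so summing many of them drifts -- without raising any error.
from decimal import Decimal

payments = [0.10] * 1000  # a thousand ten-cent payments

float_total = sum(payments)                          # looks plausible
exact_total = sum(Decimal("0.10") for _ in range(1000))

print(float_total)   # close to 100, but not exactly 100 -- and no error was raised
print(exact_total)   # exactly 100.00
```

A crash would have been kinder: here the program happily reports a total that simply fails to balance the books.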