Ten pieces of low-level, generic advice. Too low-level to do any good in the larger
picture, and too generic to be applicable in real life as-is. The only
exception is not interrupting people when they are in the middle of work.
There is similarly generic high-level advice: break your
program up into subsystems with clearly defined interfaces and purposes, make the
subsystems composable by not loading them with too many purposes, don't mix
high-level and low-level purposes or operations in the same interface, avoid
dependencies on other modules or services unless necessary, and know
which operation belongs to which module. Those things make a good programmer, not
minutiae about creating identifiers.
I like to generalize it to "Understand and consistently apply the concepts and good practices of your chosen field".
This requires finding out what the good practices are, understanding why they are good practices, and then working on making them a habit so they reliably show up in your work.
The issue I have with the article myself is that it over-promises in the title and under-delivers in the body.
The rules are drawn at random from things that make good sense, with no attention to level, area, or importance. DRY is a pattern, not commenting code is an anti-pattern, and not interrupting co-workers is workplace etiquette. Some rules are obvious to the intermediate [DRY, no magic numbers], whilst others are confusing to the beginner [eliminating dependencies to make code easier to test]. This makes the list useful to neither.
Calling this "golden rules" seems dishonest. "My rules for becoming a better programmer" would be a more accurate, albeit more boring title.
Yeah, this article reads like someone was forced by their employer to write a blog post. If it was for self-promotion you'd think they'd put a bit more effort into it. Would be better if retitled to "10 things that just so happened to pop into my head".
>There is similarly generic high-level advice: break your program up into subsystems with clearly defined interfaces and purposes, make the subsystems composable by not loading them with too many purposes, don't mix high-level and low-level purposes or operations in the same interface, avoid dependencies on other modules or services unless necessary, and know which operation belongs to which module. Those things make a good programmer, not minutiae about creating identifiers.
A lot of this will happen during accretion. You can rationalize adding one low-level method to a high-level class because at the time you only have one request for it.
On the other hand, had you a request that warranted a whole bunch of low-level stuff in your high-level program, it's very easy to see the opportunity to separate those concerns.
If you have good instincts for your problem space, chances are you'll foresee a lot of these needs. You might even have the concerns separated long before the requests for them come in. Instincts take time to develop, so most people end up writing sub-optimal code; it's the cost of training someone. Code reviews stop it from actually getting committed, but the team is still incurring a cost, with the small chance it never pays off because the person doesn't adopt the reviewer's style.
How often do you sit and count the number of purposes a class has? The number of dependencies you're adding per unit time? The number of small things that sneak in because it's just one small change and we really want to go drinking tonight? Do you go through this ritual every time you open the source? After all, you don't just have to worry about your code, but also your co-workers'. Are they putting in the same effort as you are?
Even if you do that, what are your thresholds? Can you defend choosing a particular value? Maybe it's just inherited from someone with a bigger name than you, which is a good guideline, but not necessarily law.
To me, it will always come down to finding like-minded people in this area, which means there are people that just don't care about the quality as long as it works. It's a people problem. Since the compiler/interpreter is happy to run terrible looking code, maybe it should have hard coded thresholds as compiler warnings decided by dice rolls!
I find this is one of the biggest problems with object-oriented programming. The object is a terrible unit of abstraction, because it forces you to carve up your domain into rigid chunks which might map to your domain today, and fail to map to it tomorrow, and a terrible unit of reuse, because it couples methods to data, and bundles these methods together on top of that.
12. Never listen to anybody who tells you "You shouldn't need comments if your code is clear enough to read." - the next one reading your code (probably you, 10 years later) will love you for documenting the WHY just as much as s/he'll hate you for having to rely on documentation that explains the WHAT.
I like how Robert C. Martin puts it: before you write a comment, ask yourself whether you can write the code in a way that makes the comment unnecessary. If you cannot, then a comment is better than nothing at all.
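A minimal sketch of that idea (the domain and names here are invented for illustration): instead of a comment restating WHAT a condition checks, extract the condition into a well-named predicate.

```python
# Hedged sketch; the free-shipping rule and all names are hypothetical.

# Before: the comment restates WHAT the condition checks.
def ship_before(order):
    # order is eligible for free shipping
    if order["total"] >= 50 and order["country"] == "US":
        order["shipping"] = 0
    return order

# After: the predicate's name makes the WHAT-comment unnecessary.
def qualifies_for_free_shipping(order):
    return order["total"] >= 50 and order["country"] == "US"

def ship_after(order):
    if qualifies_for_free_shipping(order):
        order["shipping"] = 0
    return order
```

Note this only removes the WHAT-comment; a WHY-comment (say, why the threshold is 50) would still belong next to the predicate.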
Fair enough, so long as you understand that programming languages are not expressive enough, in general, to clearly explain purpose and rationale. In these matters, providing a hint through identifier naming is not an adequate substitute for a clear explanation in a human language.
As an aside, clearly written code does a pretty good job of explaining what it does, but not why. Sometimes there will have been reasons for avoiding a particular method, which isn't made obvious by the code itself.
However, 10 years later it is also likely you'll be reading the comments for the explanation and realise that what the comments say is different from what the code does (because over the span of 10 years the code was updated but not the comments), and that's just as confusing.
If the code is written with meaningful variable and function names, and functions no larger than a screen, then lack of comments isn't really a problem.
Yep, and often the code that breaks what the comment says is far away. Too easy to change something that subtly invalidates a comment in some other module. Especially if you didn't write that module.
It is also possible that your change subtly invalidates the semantics implied by the identifiers used in some other module. In either case, there is a nontrivial possibility that you have introduced a bug (and quite possibly a subtle one).
Variable and function names can be just as misleading as comments. I am not aware of any language which guarantees the semantics implied by your choice of identifiers.
> will love you for documenting the WHY just as much as s/he'll hate you for having to rely on documentation that explains the WHAT.
I've saved myself so many times with this. External services are bad about having odd-ball edge-case errors that require a little funny business on your end to keep things operational. So there are often little code snippets that look like they're bugs, but are actually necessary to prevent regressions. Without a comment as to why the code is there, some boy scout is going to "fix" it (and the now-failing unit test) and introduce a regression.
Some of these could serve as philosophical starting points but are a bit too dogmatic for "rules".
1. DRY
I like the Rule of Three better, and even that is flexible. It's important to find an underlying abstraction and not just extract common chunks of code because they happen to be similar.
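To illustrate the point about similarity vs. abstraction (an invented example): two loops can share a shape while encoding unrelated rules, and deduplicating on shape alone produces an abstraction with no domain meaning.

```python
# Hypothetical example: two loops that look alike but encode different rules.

def total_discount(items):
    total = 0.0
    for item in items:
        total += item["price"] * 0.10   # 10% promotional discount
    return total

def total_tax(items):
    total = 0.0
    for item in items:
        total += item["price"] * 0.08   # 8% sales tax
    return total

# Deduplicating on shape alone yields an abstraction with no domain meaning:
def total_scaled(items, factor):
    return sum(item["price"] * factor for item in items)
# If the tax rule later becomes per-category, "factor" no longer fits,
# and the two callers have to be untangled again.
```

That's the Rule of Three's point: wait until the underlying abstraction is real, not just until the code looks similar twice.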
4. No literals
Eh, not really. Most of the time I'd rather just see the value and not have to follow a reference to the definition of a constant.
5. Avoid dependencies
Yeah, probably true for application-level settings like he describes, but don't go crazy with it. The goal is to have clean, readable code and tests. It's just a headache when everything is a FooImpl of a FooInterface, for instance, and not really a win for testability.
Generally I agree with you here, but I think what the post is getting at is more the one case where you do want a variable in place of a literal - when passing the literal into a function where its purpose isn't immediately obvious. Wrapping the literal in a more meaningful variable name can provide cleaner, self-documenting code in cases where, for example, modifying the function isn't possible and writing an adaptor would be overkill.
Makes me want to poke my eyes out with a stick every time I see kEmpty used.
(The most egregious thing is spaces inside parens. But that's just a personal preference. In general I tend to be quite laissez-faire about tolerating different styles, and consider style guides to be a pox because they tend to give the subliminal message that your code is done when it complies with the style guide.)
I'm 100% with you, the variable name kEmpty here is completely tautological - it just describes the literal in a different, clunkier way, and doesn't add any information. Kind of similar to something like nameOfSteve = "steve"
The case where it's useful is where instead you'd call the variable referring to a constant something like dollarCurrencyCode = "USD", where you have to use an api which takes a currency code but you will always pass "USD". To the uninitiated reader, what purpose does the literal "USD" serve - what does it mean? It's not clear, because they don't know the api, and may not intuit currency codes. Putting it behind the dollarCurrencyCode variable name makes it a lot clearer on first glance. If they care to know what the dollarCurrencyCode is 'under the hood', they can then look, but this is a lower level detail of the underlying api and doesn't help with understanding.
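A quick sketch of that pattern (the `charge` API here is invented, standing in for a third-party function you can't modify):

```python
# Hypothetical third-party payment API; charge() and its signature are
# stand-ins, not a real library.
def charge(amount_cents, currency_code):
    return {"amount": amount_cents, "currency": currency_code}

# Unclear at the call site: what is "USD" to a reader who doesn't know the API?
receipt = charge(500, "USD")

# Clearer: the name carries the intent, and the literal lives in one place.
DOLLAR_CURRENCY_CODE = "USD"
receipt = charge(500, DOLLAR_CURRENCY_CODE)
```

Unlike kEmpty, the name here adds information the literal doesn't carry on its own.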
I always prefer to use literals in tests, I had the case once where I was checking that foo was equal to bar. In reality, both were null, so the test passed.
I couldn't agree more. This is a sure path to disaster and unhappiness. Fixing code requires understanding what it is doing in all except the most basic examples.
Unless you have a firm set of tests, or strict API constraints defined, don't ever follow this rule. It will only lead to a sad tale of how you created this huge bug that deleted all the customers' data because you thought this Boolean wasn't used anymore, so you removed it. It makes for a funny story, but a really bad couple of weeks.
Worst case, as you said, but the best case is...nothing. You're literally spending a bunch of time doing a high risk activity to produce nothing of value. It might be easier to read, but if the original developer is long gone, then it's likely nobody will look at it again for years (unless you broke something and the next guy rolls back your changes in VCS).
Taking on a foolish refactoring project is a rite of passage for developers.
I disagree. I think the boy scout rule applies but should be interpreted as follows: when you see technical debt, you should try to fix it, but when this would take too much time, at least fix it a bit. This way technical debt will be fixed eventually, but it doesn't disrupt the timeline of the current sprint/ project too much.
E.g.: suppose you are fixing a bug in some part of the application that was not written by you, for instance it could be written a while ago by a colleague who's no longer here.
And you notice that the code is written so badly that it takes you a long time to understand. What you should do is:
- write tests that verify the code behaves as expected
- fix the bug (so the tests succeed)
- refactor the code until you think the next colleague who's reading it will be able to understand it. If this is too much work to do all at once, at least fix the most urgent parts.
This is the well known 'red, green, refactor' mantra.
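The steps above can be sketched on a toy legacy function (everything here is invented; the bug is deliberately simple):

```python
# Step 0: the inherited, hard-to-read code. It has a bug: an empty
# name crashes, because "".split(" ") yields [""] and [""][0][0] raises.
def initials_legacy(name):
    return "".join([p[0] for p in name.split(" ")]).upper()

# Step 1 (red): pin the expected behavior down with tests first; the
# empty-name case fails against the legacy version.
# Step 2 (green): fix the bug so the tests pass.
# Step 3 (refactor): leave the code readable for the next colleague.
def initials(name):
    parts = [p for p in name.split(" ") if p]   # drop empty fragments
    return "".join(part[0].upper() for part in parts)
```

The tests written in step 1 are what make the step-3 refactor safe.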
This is the type of article that spawns the best and most informative comments. As a rather isolated programmer, I find that being able to browse through the HN comment section and read everyone's different perspective on "rules for becoming a better programmer" is infinitely more constructive than reading blog posts. It also somehow feels more credible... So thanks, I suppose.
Disagree strongly with the "boy-scout rule". If it's a bug, then fix it, but if it's messy code, then leave it! There's no reason to touch something that may be working fine and for you to create subtle bugs.
If you have the option of leaving it alone, then you probably should, however if you've already got to work on the messy part anyway, and if the code is messy to the point of being difficult to read, then you might be reducing the chance of introducing new bugs if the first thing you do is refactor the lot such that you don't cringe just looking at it.
I'm talking about cleaning up the code though, not the logic - refactor really isn't even the right word. Poor choices for variable names, misleading or just plain wrong comments, inconsistent use of whitespace, etc etc.
And even this you should only do if you really think it's necessary, since it will make it very difficult to use diff to suss out what your actual changes to the area were after you're done. Still, there is a time and a place for it.
If you use your VCS right, the diff issue doesn't have to be a problem. Do the tidying up in the upstream branch, or at least as a separate commit that precedes your actual logic changes.
I would add: Also research tools/concepts that you can't understand yet, and read programming books above your level.
Because after enough time your brain will have collected enough information to understand them (I call it the 'click' -- that moment where everything clicks and becomes clearer), and you will become a better programmer because of it.
Agreed. Everyone loves GoF design patterns, but the real gems for practitioners start a couple of levels below in abstraction: Code Complete basically shortcuts you five years of experience, and Fowler's Refactoring and Patterns of Enterprise Application Architecture ground you in building what's needed, not what's dreamed up.
It's been my experience that clever architecture tends to beat clever coding and tactics.
I don't think I started to have an idea of what being a better programmer was, until I had a relationship with multiple code bases that I had started or built and were 5-7 years old.
What came from it?
- idea of being kind to your future self
- coding in a way that the next person could quickly pick up on it vs building a tribute to my own cleverness
- not prematurely optimizing
- refactoring being expected
- learning to solve all sorts of small and not-small problems
- being open to the idea that few problems haven't been thought about or solved already
The article's advice is rather generic, though it did bring up some interesting responses. The rule about comments in code elicited a minor controversy, but I tend to agree with the sentiment that adding comments is more worthwhile than not.
The program logic may be clear to the author but there's a good chance it won't be to others using it, especially when 6 months later I'm one of those "others". At that point it's usually easier to determine if a comment really clarifies things and if not remove or modify it.
I'm surprised not much said here about rule 10. Giving feedback to team members is an essential part of teamwork, e.g., certainly it can't work if problems with code aren't expressed. However I think what the author is getting at is the fine line between "criticism" and "critique", the latter meaning communication re: mistakes, etc., in a constructive, generous and objective way.
Without due care in expression, teams risk degenerating into factions arguing over who's "right" and who's to blame, and then progress grinds to a halt. It's always obvious when other teams are in this state, but frequently we don't see it in ourselves or happening in our own group. Mastering the art of critique, including self-critique, is hard, but an amazingly valuable skill to learn.
1. Practice, practice, practice. This goes for any endeavor in life, not just programming - the more you do it, the better you get.
2. Never settle for your first attempt, always be thinking of ways to improve. Sometimes I'll come up with ways to redo something years after the opportunity to do so has passed, just because it's still on my mind. Hopefully when a similar situation presents itself the better method will be my first choice.
11. Take occasional breaks. Yes, it's sometimes a good idea to get out of the zone.
I can't tell you how many times I have written 500 or so lines of code and realized, after taking a break, that I was going down the wrong path or was thinking about the problem incorrectly.
I've seen lots of people make this assertion, but most of the code I've seen doesn't need more than the how. Can you give some examples of well-written code where there are significant amounts of comments that explain the why?
In certain types of code, the purpose of some logic is not clear by itself.
A comment can be very valuable to answer the question "why the heck is this being done here?"
It's valuable because it represents hard-earned knowledge.
That thing that is being done there wasn't being done there before, and it took weeks to figure out that it had to be done there, for the sake of such and such a race condition that can occur, when the interrupt goes off just when we are doing such and such elsewhere in the driver. It was hard to reproduce the ensuing problem and trace it to this.
The comment explains how the system will fall down if not for those lines of code.
Systems that have a lot of (or not even a lot of) concurrency going on can be very hard to understand from the static view. Our aim is to produce the dynamic view: to bring to life the collaboration or sequence diagram that we had on the whiteboard. But it's not obvious how (or whether) the code actually does that (and only that, ruling out unwanted scenarios).
"Why" comments can also make the code more navigable, because they mention other identifiers elsewhere in the code which are related to this code here in ways which are not obvious. The text editor can jump from these mentions to their definitions, which is a big time saver for someone trying to understand that situation.
Sometimes you have to work with a third-party system or interface that is sending your application dirty data, and you need to account for it in your code in creative ways that may not be obvious to someone who is just reading it.
For example, in our current project we're working on a front-end piece that connects to a CRM application where no rules or constraints have been set up or enforced for a long time. We're talking stuff like zip code fields having values like "I don't know" or "XXXXX". So for every function that receives such values from a lookup, we have to clearly document those edge cases in the comments and explain how the function deals with them.
    ZipCode GetZipCode()
    {
        var result = _crm.GetZip();
        return isValidZip(result) ? new ValidZipCode(result) : new InvalidZipCode(result);
    }

    class ValidZipCode : ZipCode { ... }
    class InvalidZipCode : ZipCode { ... }
Why do you return a different class for invalid zips? Is it OK to use invalid zips in all cases? If not, why not throw an error? If so, what benefit does the separate class give you?
Because that code doesn't handle bad data - something in the objects or some other function must handle it, and this code doesn't really tell us what is going to happen with a bad zip. It just tells us that the code noticed it.
See the null object pattern.
Encapsulate all such behavior you've mentioned with appropriate OO patterns, and stop writing procedural code spread all over the place that has to rely on comments to inform you about the right way to handle the bad data. Handle the bad data cases simply and specifically.
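For reference, a minimal null-object sketch in the spirit of the zip-code discussion (all names invented; the validity rule is a placeholder, not real US zip validation):

```python
class ZipCode:
    def is_valid(self):
        raise NotImplementedError

class ValidZipCode(ZipCode):
    def __init__(self, raw):
        self.raw = raw
    def is_valid(self):
        return True
    def for_display(self):
        return self.raw

class InvalidZipCode(ZipCode):
    # Null object: safe to pass around, behaves as "no usable zip".
    def __init__(self, raw):
        self.raw = raw
    def is_valid(self):
        return False
    def for_display(self):
        return ""   # callers never see "I don't know" or "XXXXX"

def parse_zip(raw):
    # placeholder validity rule for the sketch
    if raw and raw.isdigit() and len(raw) == 5:
        return ValidZipCode(raw)
    return InvalidZipCode(raw)
```

The point of the pattern: downstream code calls `for_display()` without branching on validity, so the bad-data handling lives in one place instead of in comments scattered across callers.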
Feel free to add some code that explains your use case and where you think such comments are required...
Yes, all over the place, explaining the business reasons behind the algorithms, or why some code was put in place to resolve a specific browser oddity (usually with dates and versions so we also know when it can be removed in the future). We also use comments to indicate when code is written to follow a specific decision from the CEO, and was not just an arbitrary design choice by us coders.
In short, we make sure that a coder can read the code and understand not just the code, but how it fits into the larger business.
Code is rarely an arbitrary design choice. Design-choice decisions like the ones you mention belong in your commit message, IMO, rarely in the code.
So when you are coding, and start looking at an existing function, you are telling me that you go back to your repository and read through the entire commit history for that function?
If any function has such a significant commit history that this is a problem in your code, there's probably a violation of the SRP right in front of you. Stop modifying it. Write modules that are small enough that their usage is obvious and you'll find it less likely that you're changing code so often that commit history on a function matters.
In the case of the CEO-mandated approach, write a test that explains it. Why comment something that can be tested?
    [Fact]
    public void FooReturnsBazBecauseJoeMadeUsDoThisInsteadOfBar()
    {
        var actual = _someObject.Foo();
        Assert.Equal("baz", actual);
    }
This way, if that code ever is changed, the comments and the code don't get out of whack with each other. Instead, you'll have a broken test. At that point it becomes obvious to reach for understanding as to why that test is there. The first step is read the commit history. If your commit history is useless, you talk to the dev that implemented it, or go back to the spec that drove it. These solutions are still 100x more useful than a comment.
Because at one point, the code was not optimized, so it already had a function name. Are you really going to rename all instances where that function is called so it is named based on a single commit, instead of its actual purpose?
There are many ways that are non breaking to implement this. I chose a particular one for example's sake. In reality, extracting an interface is a more valuable refactoring, and a likely valid one given that the optimization tradeoff is happening. So MemoryOptimizedFoo : IFoo, SpeedOptimizedFoo : IFoo...
In fact you might also want the un-optimized code around and run the optimized version "against" the un-optimized version (in tests) to see if they actually do the same thing.
Extract an interface (IFoo). Implement Foo : IFoo and FasterFoo : IFoo. Write tests to confirm all implementations of IFoo act equivalently. Write performance characteristic tests around FasterFoo.
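A sketch of that Foo/FasterFoo structure in Python (the interface and both implementations are toy stand-ins from this thread, not a real codebase):

```python
from abc import ABC, abstractmethod

class IFoo(ABC):
    @abstractmethod
    def foo(self, xs): ...

class Foo(IFoo):
    def foo(self, xs):
        # straightforward reference implementation: sum of squares
        total = 0
        for x in xs:
            total += x * x
        return total

class FasterFoo(IFoo):
    def foo(self, xs):
        # "optimized" variant; must stay behaviorally identical
        return sum(x * x for x in xs)

def check_equivalent(xs):
    # the test the thread suggests: run both, compare results
    return Foo().foo(xs) == FasterFoo().foo(xs)
```

Keeping the reference implementation around as the oracle is what makes the equivalence test cheap to write.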
I'm going to pull the "shut up and show me the code" plug on replies to this. If you're still writing prose in response to my question, you're in the exact same bucket as those I mentioned making the assertion.
You can't more often than not. Often the "why" is an implementation detail regarding a single statement, and those shouldn't make their way to a function summary.
If there's hidden meaning in an implementation detail regarding a single statement extract that value to one of the following:
a) a variable
b) a method
c) a class
All of these provide a mechanism for documenting the code without comments. Show me where you think this doesn't apply.
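A tiny sketch of (a) and (b) on an invented cryptic statement (the leap-year rule is a classic example of a one-liner that begs for a comment):

```python
def is_leap_year(year):
    # (b) a method: the name documents the single cryptic statement
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

def days_in_february(year):
    # (a) a variable: the intermediate name explains the branch
    leap = is_leap_year(year)
    return 29 if leap else 28
```

Whether this fully replaces a "why" comment is the disputed point; it does at least replace the "what" one.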
Nope. In my career I've never seen this mythical code that doesn't need comments. Tell me about preconditions, postconditions, performance characteristics, quirks, absolutely anything and everything that could affect the caller.
I've often seen assertions like this, that all/most code requires comments. I hold the opposite personal view, but can't seem to get anyone above to bite and show me useful real code that provides a counterpoint to my view.
Can you point to an open source repo somewhere that demonstrates your point that comments are preferable to writing clear code without comments?
I'm sure you can point to repos where poorly written code requires comments in order to be properly explained. This is not really what I'm after, as I'd contend it's likely that writing the code better would be more useful to that project.
The point is that writing comments isn't a substitute for writing clear code. I have seen way too much code that someone puked up without thinking about readability, and then tried to save it by sprinkling it with comments.
So the first thought when writing code that isn't obvious should not be to comment it, but to figure out if there is a way to write it clearer.
But of course, sometimes we need to write code that isn't obvious, and then a few well thought through comments will save the day.
One of my favorites was a hex string (ascii) to binary string (ascii) converter that ran all on one line. I think my colleague found it online, because they were nowhere near technical enough to implement it, let alone on a single line. Nary a comment or clue in the context to figure out why or what it was doing.
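For the curious, a one-liner of roughly that kind (this is a reconstruction of the genre, not the colleague's actual code), next to an unpacked version:

```python
# Dense one-line hex-string (ASCII) to binary-string (ASCII) converter;
# without a comment, its purpose is anyone's guess.
hex_to_bin = lambda s: "".join(bin(int(c, 16))[2:].zfill(4) for c in s)

# Unpacked, the same logic is far easier to follow:
def hex_to_bin_clear(hex_str):
    bits = []
    for ch in hex_str:
        nibble = int(ch, 16)              # one hex digit = 4 bits
        bits.append(format(nibble, "04b"))  # zero-padded binary
    return "".join(bits)
```

Either a comment or the unpacked form would have saved the next reader; having neither is the real sin.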
This is already a softer statement than the rule proclaimed in the article ("not many" vs "none").
And if you have to do something unusual due to limitations (say, a bug in a 3rd-party library) in your code I don't want to know about that when reading the method's documentation (what you refer to as "summary") to be able to use your code. I want to be bothered with it when I have to read its implementation because I have to fix some bug in it.