Fun fact: it was a design goal of the SHA-3 competition not to have the hash be vulnerable to length extension attacks, which obviates the need for constructions like HMAC that apply the hash function twice.
Sufficiently truncated hashes (SHA-384, SHA-512/256, SHA-512/224) in the SHA2 family should have the same property, right? You need the missing state to extend.
Yes, but that's a hack, and not nearly as well-studied as HMAC or SHA-3. If you decide to do it this way, at least be sure to make it very obvious that "upgrading" to SHA-512 completely breaks your cryptosystem.
It is fairly well-studied; it's called chopMD in the literature. It has been shown to be indifferentiable from a random oracle [1, Theorem 3], provided enough bits are chopped.
Thanks for pointing me at those papers! I'll admit it's been studied more than I thought.
(I was actually more worried about cryptanalytic attacks - the construction itself is obviously secure when instantiated with a random oracle - but I don't really see how that would work either. I maintain that it's ugly, but HMAC isn't the nicest construction in the world either, I suppose...)
I suppose there is some CYA value in HMAC, since there is less attacker control over what goes in the outer hash. If the case is that prefix-MAC (with chopMD) is vulnerable to some kind of cryptanalysis and HMAC is not, however, the hash should be considered broken, and not in the academic sense.
You're completely right, but it's not like people have transitioned off of SHA-1 (Wikipedia: "As of 2012, the most efficient attack against SHA-1 is considered to be the one by Marc Stevens with an estimated cost of $2.77M to break a single hash value by renting CPU power from cloud servers"; note that renting cloud server time is not a sensible way to approach this problem). Or even MD5.
Not a very helpful way to express the advice; if you tell people "don't hash secrets" the natural interpretation is "send them in the clear".
HMACs have their use case, and it's good to use them for that, but honestly it's a tiny corner of the possibilities for "hashing secrets", and an effective approximation (hash(message + hash(secret)), as mentioned towards the end of the article) is the obvious approach if you have any idea how hashes work, and just as effective as a dedicated HMAC.
It's terrible advice, just like that "Don't do cryptography" from some time ago. It just seems like people are trying to be too clever. I admit I learned something, but the tone of the article could be better.
One rule of thumb I've discovered: Whenever you say "why not X?", where "X" is some nonstandard construction, the answer is "because X is horribly broken in ways you can't even imagine".
For one thing, you're incurring one more SHA1 invocation than you need; for another, you're clumsily duplicating HMAC, which has other security features (albeit not hugely practical ones). But the core feature of HMAC is indeed hashing the secret separately from the message.
...if you tell people "don't hash secrets" the natural interpretation is "send them in the clear"
What? When you see a sign by the pool that says "Don't Run" do you get down on your belly and start crawling? Some assumptions are safe to make -- e.g. most people who care enough to read this know not to send secrets in the clear.
This article is right for the case of using HMACs for two parties checking the authenticity of a message passed between them (assuming they have a shared secret).
But it's wrong about the database password storage. These days the real threat there is the monstrous computing power available to compute tons of hashes simultaneously. To combat that, you want a hashing function specifically designed to be slow. And not just slow, but to have a tunable parameter so you can require more and more work as GPUs get faster and cheaper. PBKDF2 is such a function, and is used by Django as of a year or two ago. bcrypt is another.
Okay, not to put anyone off crypto, as it's really really important, but the one rule everyone needs to remember: Use the current best practices for each situation, don't make up your own crypto.
This usually means "use whatever is in OpenSSL, Keyczar, etc", that's what it's for.
A hash is a crypto primitive, it's not what you use for that. If you want to store a password, you use a KDF. If you want to fake-sign, you use an HMAC with a secret. You don't use a hash, just like you don't use plain AES to encrypt a file.
But he is hashing secrets. Salted and Hashed passwords are extremely important...and saying "Don't Hash Secrets" is just an attempt to get him more traffic.
There is another important aspect to salting that he neglects to say. Adding an 8byte salt is extremely effective (along with a proper password hashing algorithm) in increasing the complexity of generating and storing rainbow tables. PBKDF2 and bcrypt are both slow enough to protect a lot of that low hanging fruit that he is saying ISN'T protected....Sure if you use SHA1....they might get some of them....but everyone agrees you should not be using SHA1 ! (for passwords).
OK I have a probably naive cryptography question. It seems that one good way of transmitting secrets would be to do so with everyone thinking they are encrypted one way (or possibly not at all) when the truth is they are encrypted another way. I.e. a difficult problem to know when the message has been unencrypted. So basically there would be multiple ways to read the message which are plausible as the true unencrypted message. Is there any parallel to that sort of thing in modern cryptography?
> Here’s the deal: if I tell you that SHA1(foo) is X, then it turns out, in a lot of cases, to be quite easy for you to determine what SHA1(foo || bar) is. You don’t need to know what foo is. It’s just that, because SHA1 is iterative and works block by block, if you know the hash of foo, then you can extend the computation to determine the hash of foo || bar.
This is simply wrong. Padding of cryptographic hash functions protect against this behavior.
It is not simply wrong. The usual Merkle-Damgard padding only means that you get something like SHA1(foo || garbage || bar) instead, garbage being smaller than the hash's block size.
The salt is different for EVERY secret. When you hash/store a new password, you randomly generate a salt for THAT password, and you store each salt & password together.
Because each salt is ~unique, the attacker must rebuild a separate dictionary for every single stored secret. That's functionally the same as having no dictionary, at all.
Actually, it's underkill. HMAC is too easy to crack to be used as a password hashing scheme.
http://security.stackexchange.com/questions/16809/is-a-hmac-...
http://security.stackexchange.com/questions/3165/hmac-why-no...