Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Don’t Hash Secrets (2008) (benlog.com)
78 points by pizza on June 13, 2013 | hide | past | favorite | 34 comments


> In fact, I use HMAC for my password databases, too. It’s overkill, but it lets me use a standard library that likely makes me safer in the long run.

Actually, it's underkill. HMAC is too easy to crack to be used as a password hashing scheme.

http://security.stackexchange.com/questions/16809/is-a-hmac-...

http://security.stackexchange.com/questions/3165/hmac-why-no...


Fun fact: it was a design goal of the SHA-3 competition not to have the hash be vulnerable to length extension attacks, which obviates the need for constructions like HMAC that apply the hash function twice.


Sufficiently truncated hashes (SHA-384, SHA-512/256, SHA-512/224) in the SHA2 family should have the same property, right? You need the missing state to extend.


Yes, but that's a hack, and not nearly as well-studied as HMAC or SHA-3. If you decide to do it this way, at least be sure to make it very obvious that "upgrading" to SHA-512 completely breaks your cryptosystem.


It is fairly well-studied; it's called chopMD in the literature. It has been shown to be indifferentiable from a random oracle [1, Theorem 3], provided enough bits are chopped.

[1] http://www.cs.nyu.edu/~puniya/papers/merkle.pdf


Thanks for pointing me at those papers! I'll admit it's been studied more than I thought.

(I was actually more worried about cryptanalytic attacks - the construction itself is obviously secure when instantiated with a random oracle - but I don't really see how that would work either. I maintain that it's ugly, but HMAC isn't the nicest construction in the world either, I suppose...)


I suppose there is some CYA value in HMAC, since there is less attacker control over what goes in the outer hash. If the case is that prefix-MAC (with chopMD) is vulnerable to some kind of cryptanalysis and HMAC is not, however, the hash should be considered broken, and not in the academic sense.


You're completely right, but it's not like people have transitioned off of SHA-1 (Wikipedia: "As of 2012, the most efficient attack against SHA-1 is considered to be the one by Marc Stevens with an estimated cost of $2.77M to break a single hash value by renting CPU power from cloud servers"; note that renting cloud server time is not a sensible way to approach this problem). Or even MD5.


Yeah, this article is honestly painful to read with it's repetitive 'duh' style that doesn't apply to half the hash functions out there.

Hell, you could take the 'use bcrypt' meme out of context and you'd be fine.


Not a very helpful way to express the advice; if you tell people "don't hash secrets" the natural interpretation is "send them in the clear".

HMACs have their use case, and it's good to use them for that, but honestly it's a tiny corner of the possibilities for "hashing secrets", and an effective approximation (hash(message + hash(secret)), as mentioned towards the end of the article) is the obvious approach if you have any idea how hashes work, and just as effective as a dedicated HMAC.


It's terrible advice, just like that "Don't do cryptography" from some time ago. It just seems like people are trying to be too clever. I admit I learned something, but the tone of the article could be better.


I think the point is that that is not necessarily true for all hash functions, since reordering your example with SHA1 renders it weaker.

If I understood correctly sha1(sha1(secret) + message) is not safe, because from that hash you can calculate sha1(sha1(secret) + message + ANYTHING)


Correct; H(H(secret) || message) just scrambles the secret, but doesn't change the MAC construction.


So why not sha1(sha1(secret) + sha1(message))? Or is that essentially an MAC?


One rule of thumb I've discovered: Whenever you say "why not X?", where "X" is some nonstandard construction, the answer is "because X is horribly broken in ways you can't even imagine".


For one thing, you're incurring one more SHA1 invocation than you need; for another, you're clumsily duplicating HMAC, which has other security features (albeit not hugely practical ones). But the core feature of HMAC is indeed hashing the secret separately from the message.


...if you tell people "don't hash secrets" the natural interpretation is "send them in the clear"

What? When you see a sign by the pool that says "Don't Run" do you get down on your belly and start crawling? Some assumptions are safe to make -- e.g. most people who care enough to read this know not to send secrets in the clear.


I don't think that anyone who understands what "hashing secrets" means would think that "don't hash secrets" means "send them in the clear."


> Not a very helpful way to express the advice; if you tell people "don't hash secrets" the natural interpretation is "send them in the clear".

Yeah, a better title would be "Don't just hash secrets."


This article is right for the case of using HMACs for two parties checking the authenticity of a message passed between them (assuming they have a shared secret).

But it's wrong about the database password storage. These days the real threat there is the monstrous computing power available to compute tons of hashes simultaneously. To combat that, you want a hashing function specifically designed to be slow. And not just slow, but to have a tunable parameter so you can require more and more work as GPUs get faster and cheaper. PBKDF2 is such a function, and is used by Django as of a year or two ago. bcrypt is another.


This. Use bcrypt or scrypt for storing secrets. HMAC is for...hash(ing) message authentication code(s); read: http://security.stackexchange.com/questions/16809/is-a-hmac-...


Say it with me one more time: do not use fast hash functions for passwords.


Okay, not to put anyone off crypto, as it's really really important, but the one rule everyone needs to remember: Use the current best practices for each situation, don't make up your own crypto.

This usually means "use whatever is in OpenSSL, Keyczar, etc", that's what it's for.


The article makes the point that it is insufficient to use good crypto algorithms / implementations. You must also use it correctly.

"A component can only be secure under certain well-defined circumstances, not for any use that happens to look similar."


A hash is a crypto primitive, it's not what you use for that. If you want to store a password, you use a KDF. If you want to fake-sign, you use an HMAC with a secret. You don't use a hash, just like you don't use plain AES to encrypt a file.


> don't use plain AES to encrypt a file

What does "plain" mean in this context? (If you mean ECB then you are right, but it is ambiguous.)


I mean a mode without authenticated encryption:

http://en.wikipedia.org/wiki/Authenticated_encryption


But he is hashing secrets. Salted and Hashed passwords are extremely important...and saying "Don't Hash Secrets" is just an attempt to get him more traffic.

There is another important aspect to salting that he neglects to say. Adding an 8byte salt is extremely effective (along with a proper password hashing algorithm) in increasing the complexity of generating and storing rainbow tables. PBKDF2 and bcrypt are both slow enough to protect a lot of that low hanging fruit that he is saying ISN'T protected....Sure if you use SHA1....they might get some of them....but everyone agrees you should not be using SHA1 ! (for passwords).


OK I have a probably naive cryptography question. It seems that one good way of transmitting secrets would be to do so with everyone thinking they are encrypted one way (or possibly not at all) when the truth is they are encrypted another way. I.e. a difficult problem to know when the message has been unencrypted. So basically there would be multiple ways to read the message which are plausible as the true unencrypted message. Is there any parallel to that sort of thing in modern cryptography?


> Here’s the deal: if I tell you that SHA1(foo) is X, then it turns out, in a lot of cases, to be quite easy for you to determine what SHA1(foo || bar) is. You don’t need to know what foo is. It’s just that, because SHA1 is iterative and works block by block, if you know the hash of foo, then you can extend the computation to determine the hash of foo || bar.

This is simply wrong. Padding of cryptographic hash functions protect against this behavior.


It is not simply wrong. The usual Merkle-Damgard padding only means that you get something like SHA1(foo || garbage || bar) instead, garbage being smaller than the hash's block size.


Maybe disingenuous was a better word. The "garbage" in the middle still protects against most (though not all) attacks.


On salting, the author says that an attacker needs to rebuild the dictionary for each user they attack.

But doesn't the attacker only need to rebuild the dictionary once, using the salt they recovered?


The salt is different for EVERY secret. When you hash/store a new password, you randomly generate a salt for THAT password, and you store each salt & password together.

Because each salt is ~unique, the attacker must rebuild a separate dictionary for every single stored secret. That's functionally the same as having no dictionary, at all.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: