Randomness of hash of (password + suffix)



  • Suppose I have a password, say "thisIsThePassword".
    Then I have MD5 hashes of that password followed by an increasing numeric suffix:

    MD5("thisIsThePassword1") = c5c038617ea97315f137606c4123789e
    MD5("thisIsThePassword2") = 0a52caffe2e7904c832a257a406f903f
    MD5("thisIsThePassword3") = 06683298a9c3cdf8940518db54c43758
    ...
    

    The question is: are those hashes related to each other in any way, so that from one hash it is possible to compute another one without having to bruteforce the password? Hashes are supposed to be totally different and random as soon as you change one bit in the input, but I'm afraid there are some gotchas, maybe when comparing the hash of thisIsThePassword1 with the one of thisIsThePassword11 (and subsequent suffixes starting with "1"). Would it be better or make any difference if we used SHA instead, or make sure the input is always the same length? What function or method should I use if I wanted to make sure that you can't easily compute another of those hashes unless you bruteforce the password?



  • Most common hash algorithms are designed to produce outputs that are indistinguishable from random, and as a result, even related inputs are not supposed to produce related outputs. Therefore, generating multiple hashes with related input isn't intrinsically insecure. You can use KMAC (related to SHA-3) and BLAKE2b as MACs and they can be used securely with related inputs in this way.

    However, MD5, SHA-1, and SHA-2 are vulnerable to length-extension attacks. That means that it is possible, given a hash and the original length of the input, to produce more hash values with the same prefix without knowing about the original input. Therefore, you would not want to use this approach with one of those algorithms, because an attacker could possibly compute and compare additional values to see if they match. The SHA-3 algorithms and BLAKE2b don't have this problem.

    If your goal is to hash a password, though, you don't want to use a simple hash function using any construction. Passwords typically don't contain a lot of entropy, and so you want to use a good password hash such as scrypt or Argon2i with a large salt and suitable iteration count. You can either create a new salt for each use or generate a single salt and append a fixed-length counter.

    If your secret isn't actually a password but instead a randomly generated secret with enough entropy (at least 128 bits), then you're probably better off using something like HMAC with a strong hash function, or HKDF if you want a key derivation function. You can then provide the secret as the key and the counter as the message to hash, and this will both avoid length-extension attacks and also has much nicer provable security properties.

    And, of course, you would never want to use MD5 or SHA-1 for any purpose. They are both broken and you'd want to use something like SHA-2 (if you don't need to worry about length-extension attacks), SHA-3, or BLAKE2b.


Log in to reply
 

Suggested Topics

  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2