Understanding Hash Functions and Keeping Passwords Safe

From time to time, servers and databases are stolen or compromised. With this in mind, it is important to ensure that some crucial user data, such as passwords, cannot be recovered.

What Does “Hashing” Do?
Hashing converts a piece of data (either small or large) into a relatively short piece of data, such as a string or an integer.
This is accomplished by using a one-way hash function. “One-way” means that it is very difficult (or practically impossible) to reverse it.

MD5
With md5(), the result will always be a 32-character hexadecimal string. You may pass much longer strings and data to md5(), and you will still end up with a hash of this length.
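As a quick illustration of the fixed output length, here is a minimal Python sketch. It uses Python's hashlib as a stand-in for PHP's md5(), and the helper name md5_hex is made up for this example.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hexadecimal string."""
    return hashlib.md5(data).hexdigest()

short_hash = md5_hex(b"abc")            # 3 bytes of input
long_hash = md5_hex(b"abc" * 100_000)   # 300,000 bytes of input

# Both hashes have the same length, whatever the input size.
print(short_hash, len(short_hash))
print(long_hash, len(long_hash))
```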

Using a Hash Function for Storing Passwords
The usual process during user registration:
    User fills out the registration form, including the password field.
    The web script stores all of the information in a database.
    However, the password is run through a hash function before being stored.
    The original version of the password is not stored anywhere, so it is effectively discarded.

And the login process:
    User enters username (or e-mail) and password.
    The script runs the password through the same hashing function.
    The script finds the user record from the database, and reads the stored hashed password.
    Both of these values are compared, and the access is granted if they match.
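The registration and login steps above can be sketched roughly as follows. This is only an illustration: the users dictionary is a hypothetical stand-in for the database, the function names are made up, and SHA-256 is used merely as a placeholder one-way function. The problems discussed in the following sections still apply to this plain version.

```python
import hashlib
import hmac

# Hypothetical in-memory stand-in for the users table.
users = {}

def hash_password(password: str) -> str:
    # Placeholder one-way function; any hash function fits this flow.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def register(username: str, password: str) -> None:
    # Only the hash is stored; the original password is never saved.
    users[username] = hash_password(password)

def login(username: str, password: str) -> bool:
    stored = users.get(username)
    if stored is None:
        return False
    # Hash the submitted password the same way and compare the two hashes.
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(stored, hash_password(password))

register("alice", "correct horse")
print(login("alice", "correct horse"))
print(login("alice", "wrong guess"))
```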

Problem #1: Hash Collision
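A hash “collision” occurs when two different data inputs generate the same resulting hash. The likelihood of this happening depends on which function you use. If the function has a small range (a 32-bit checksum, for instance, has only about 4.3 billion possible outputs), a simple script can find another password that converts to the same hash value, and that password would be accepted at login.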
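To see how a small range invites this, here is a toy sketch. The function tiny_hash is invented for the example: it keeps only the first 16 bits of an MD5 digest so that the search finishes quickly. The example password is also made up.

```python
import hashlib
from itertools import count

def tiny_hash(text: str) -> str:
    """Toy hash with a 16-bit range: the first 4 hex digits of MD5."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()[:4]

def find_collision(password: str) -> str:
    """Return a different string whose tiny_hash equals that of password."""
    target = tiny_hash(password)
    for n in count():
        candidate = str(n)
        if candidate != password and tiny_hash(candidate) == target:
            return candidate

original = "supersecretpassword"
other = find_collision(original)

# A different input that the toy hash cannot tell apart from the original.
print(other, tiny_hash(other) == tiny_hash(original))
```

With only 65,536 possible outputs, the loop typically succeeds after tens of thousands of attempts. Every extra bit of hash output doubles the expected work.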

So we need a hash function that has a very big range.
For example, md5() might be suitable in this respect, as it generates 128-bit hashes. This translates into 2^128 = 340,282,366,920,938,463,463,374,607,431,768,211,456 possible outcomes, far too many to search through with such a script.

SHA1
sha1() has an even larger range: it generates a longer 160-bit hash value, returned as a 40-character hexadecimal string.
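The output sizes quoted for the two functions can be checked with a short Python sketch. Python's hashlib is used here in place of PHP's md5() and sha1(), and the helper name output_bits is made up for this example.

```python
import hashlib

def output_bits(name: str) -> int:
    """Size of the hash output in bits for a hashlib algorithm name."""
    return hashlib.new(name).digest_size * 8

for name in ("md5", "sha1"):
    bits = output_bits(name)
    hex_chars = len(hashlib.new(name, b"example").hexdigest())
    # 2 ** bits is the number of distinct hash values the function can produce.
    print(f"{name}: {bits} bits, {hex_chars} hex characters, {2 ** bits:,} possible outputs")
```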

Problem #2: Rainbow Tables
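A rainbow table is built by calculating the hash values of commonly used words and their combinations. For example, you can go through a dictionary, and generate hash values for every word. Anyone who obtains a stolen hash can then look it up in the table and read off the matching password.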
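The idea can be sketched with a plain lookup table, which is the simplest form of the precomputation described here. The word list is a hypothetical four-word sample; a real table would cover a full dictionary and many combinations.

```python
import hashlib

def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()

# Hypothetical tiny word list, for illustration only.
words = ["password", "letmein", "dragon", "sunshine"]

# Precompute hash -> word once; the table can be reused against any
# database that stores plain, unsalted MD5 hashes.
table = {md5_hex(word): word for word in words}

# Pretend this value was read from a stolen users table.
stolen_hash = md5_hex("dragon")

print(table.get(stolen_hash))
```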

How can this be prevented?
We can try adding a “salt”. What we basically do is concatenate the “salt” string with the passwords before hashing them. The resulting string obviously will not be on any pre-built rainbow table. But, we’re still not safe just yet!
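A minimal sketch of this, assuming a single site-wide salt. The salt value, word list and function names below are invented for the example.

```python
import hashlib

# Hypothetical site-wide salt; any hard-to-guess string would do.
SALT = "example-site-salt-9f3c"

def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def salted_hash(password: str, salt: str = SALT) -> str:
    # Concatenate the salt with the password before hashing.
    return md5_hex(salt + password)

# A table that was built in advance, without knowledge of the salt.
prebuilt_table = {md5_hex(w): w for w in ["password", "letmein", "dragon"]}

unsalted_lookup = prebuilt_table.get(md5_hex("dragon"))    # found
salted_lookup = prebuilt_table.get(salted_hash("dragon"))  # not found

print(unsalted_lookup, salted_lookup)
```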

Problem #3: Rainbow Tables (again)
Even if a salt was used, it may have been stolen along with the database. All the attacker has to do is generate a new rainbow table from scratch, this time concatenating the salt to every word put in the table.

How can this be prevented?
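We can use a “unique salt” instead, which changes for each user and is stored alongside that user's hash. The attacker would then have to build a separate table for every single user, which quickly becomes impractical.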
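One way this could look, as a sketch: each user record keeps its own random salt next to the hash. The record layout and function names are assumptions made for the example, and SHA-1 is used only because it appears earlier in the article.

```python
import hashlib
import hmac
import secrets

def salted_hash(password: str, salt: str) -> str:
    return hashlib.sha1((salt + password).encode("utf-8")).hexdigest()

def make_record(password: str) -> dict:
    # A fresh random salt per user: 16 random bytes, hex-encoded.
    salt = secrets.token_hex(16)
    return {"salt": salt, "hash": salted_hash(password, salt)}

def verify(record: dict, password: str) -> bool:
    # Re-hash the attempt with the salt stored in this user's record.
    return hmac.compare_digest(record["hash"], salted_hash(password, record["salt"]))

alice = make_record("dragon")
bob = make_record("dragon")

# Same password, different salts, so the stored hashes differ.
print(alice["hash"] != bob["hash"])
print(verify(alice, "dragon"), verify(bob, "wrong"))
```

Because two users with the same password end up with different hashes, a table built for one user's salt is useless against any other user.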

Problem #4: Hash Speed
Most hashing functions have been designed with speed in mind, because they are often used to calculate checksum values for large data sets and files, to check for data integrity.
If passwords may contain lowercase letters, uppercase letters and digits, there are 62 (26 + 26 + 10) possible characters. An 8-character string then has 62^8 possible versions. That is a little over 218 trillion. At a rate of 1 billion hashes per second, all of them can be tried in about 60 hours.
And for 6-character passwords, which are also quite common, it would take under 1 minute.

How can this be prevented?
Imagine that you use a hash function that can only run 1 million times per second on the same hardware, instead of 1 billion times per second. It would then take the attacker 1000 times longer to brute force a hash. 60 hours would turn into nearly 7 years!
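The estimates in this section can be reproduced with a few lines of arithmetic. The sketch assumes a 62-character alphabet (lowercase letters, uppercase letters and digits) and an attacker who tries every candidate; the function name is made up for the example.

```python
ALPHABET_SIZE = 26 + 26 + 10  # lowercase + uppercase + digits

def brute_force_seconds(length: int, hashes_per_second: float) -> float:
    """Worst-case time to try every password of the given length."""
    return ALPHABET_SIZE ** length / hashes_per_second

FAST = 1_000_000_000  # 1 billion hashes per second
SLOW = 1_000_000      # 1 million hashes per second

hours_8_fast = brute_force_seconds(8, FAST) / 3600
seconds_6_fast = brute_force_seconds(6, FAST)
years_8_slow = brute_force_seconds(8, SLOW) / (3600 * 24 * 365)

print(f"{ALPHABET_SIZE ** 8:,} candidates of length 8")
print(f"8 characters, fast hash: {hours_8_fast:.1f} hours")
print(f"6 characters, fast hash: {seconds_6_fast:.1f} seconds")
print(f"8 characters, slow hash: {years_8_slow:.1f} years")
```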

Or you may use an algorithm that supports a “cost parameter”, such as bcrypt, which is built on the Blowfish cipher. Raising the cost makes every hash slower to compute. In PHP, this can be done using the crypt() function.
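To show what a cost parameter does, here is a sketch in Python. Note the substitution: Python's standard library has no Blowfish-based hash, so PBKDF2 (hashlib.pbkdf2_hmac) stands in for it, with the iteration count playing the role of the cost. The record layout, function names and the default of 200,000 iterations are assumptions chosen for the example, not recommendations from the article.

```python
import hashlib
import hmac
import secrets

def slow_hash(password: str, salt: bytes, iterations: int) -> bytes:
    # PBKDF2-HMAC-SHA256: more iterations means more work per guess.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)

def make_record(password: str, iterations: int = 200_000) -> dict:
    salt = secrets.token_bytes(16)  # unique salt per user
    return {
        "salt": salt,
        "iterations": iterations,  # stored so the cost can be raised later
        "hash": slow_hash(password, salt, iterations),
    }

def verify(record: dict, password: str) -> bool:
    candidate = slow_hash(password, record["salt"], record["iterations"])
    return hmac.compare_digest(record["hash"], candidate)

record = make_record("dragon")
print(verify(record, "dragon"), verify(record, "Dragon"))
```

Storing the cost next to the hash means it can be increased for new passwords as hardware gets faster, while old records still verify with the value they were created with.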

Thanks and credit to:
http://net.tutsplus.com/tutorials/php/understanding-hash-functions-and-keeping-passwords-safe/