Password systems must be resilient and strong. By signing up to your online service, customers are trusting you with their passwords – betraying that trust can have serious impacts on your business.
It’s impossible to have flawless security, but having an understanding of what was acceptable in the past and the best practices today can help you keep your users protected.
Balancing Factors
There are many factors that need to be weighed up when building a great password system, so we’ll go into a few of the features and defences that a good system needs.
Database leaks
It’s unacceptable for a database leak to result in your users’ passwords being revealed, and yet it’s still a fairly common occurrence. It has to be assumed that your database containing users and passwords will be leaked and accessible to thousands of attackers around the world. Recently, Slack had a major database breach, but luckily (at least so far) it appears that their password security was strong enough to stop passwords from being cracked. Sony, on the other hand, was storing passwords in plain text.
To be secure against database leaks, cryptographically strong hashing functions need to be used. However, if the attacker can’t retrieve the users’ passwords from the database, they’ll often turn to brute-force.
Brute-force
A lot of users have passwords which are easy to guess, mainly in an attempt to make them easier to remember. If an attacker knows somebody is a user of your product and wants to gain access to their account, they can quite easily try to log in to that account by running through a list of common passwords (a dictionary attack).
Using AWS, a single cg1.4xlarge instance, which costs as little as $0.21/hour on spot pricing, can calculate 2.32 billion MD5 hashes per second – so in less than an hour every single 8 character lowercase alphanumeric password could be cracked. To protect against these attacks, it’s important to make guessing a password as resource intensive (CPU, memory, storage, bandwidth) as possible, but that can easily make it slow for your users.
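The "less than an hour" figure can be checked with a little arithmetic. This is a rough sketch: the hash rate is the one quoted above, and the function name is just for illustration.

```javascript
// Rough estimate of how long an exhaustive search takes at a given hash rate.
// The 2.32 billion MD5 hashes/second figure is the one quoted above.
function secondsToExhaust(alphabetSize, length, hashesPerSecond) {
  const keyspace = Math.pow(alphabetSize, length); // number of candidate passwords
  return keyspace / hashesPerSecond;
}

// 8 characters drawn from a-z and 0-9 (36 symbols)
const seconds = secondsToExhaust(36, 8, 2.32e9);
console.log(`${(seconds / 60).toFixed(1)} minutes`);
```

That works out to roughly 20 minutes for every 8-character combination, which is why a fast hash like MD5 offers so little protection.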
Speed
Logging into a service should be quick and easy for the end user, but strong hashing functions are designed to be slow and resource intensive to help mitigate brute-force attacks and make rainbow/lookup table generation too expensive (there’s a great StackExchange answer on rainbow tables which helps explain the concept). It’s a fine balance to strike, but it shouldn’t take any more than a couple of seconds for a user to log in.
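One way to find that balance is to time the hash on your own hardware. The sketch below assumes scrypt through Node's built-in crypto module; the cost values and the sample password are arbitrary choices for illustration, not recommendations.

```javascript
const crypto = require('crypto');

// Time a single scrypt derivation at a given CPU/memory cost N (a power of two).
// r and p are left at Node's defaults; the password and salt are throwaway values.
function timeScrypt(N) {
  const salt = crypto.randomBytes(32);
  const start = process.hrtime.bigint();
  crypto.scryptSync('correct horse battery staple', salt, 64, { N });
  const elapsedNs = process.hrtime.bigint() - start;
  return Number(elapsedNs) / 1e6; // milliseconds
}

const timings = [2 ** 12, 2 ** 13, 2 ** 14].map((N) => ({ N, ms: timeScrypt(N) }));
for (const { N, ms } of timings) {
  console.log(`N=${N}: ${ms.toFixed(1)} ms`);
}
```

The numbers will differ from machine to machine, so it's worth running this on hardware similar to production before settling on a cost.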
Maintainability
Security through obscurity, otherwise known as having a complex password system to try and throw attackers off, is often a bad idea. It must always be assumed that the attacker is smarter than you, and that they’ll be able to figure out the system with ease. All that an obscure system really achieves is to increase the likelihood of there being a security hole and to reduce maintainability. Good hashing algorithms and security practices are the result of years of research and have been tested by some of the world’s best scientists and mathematicians.
Evolution
Over time, password security has progressively had to improve as attacks have become more sophisticated and computing power and storage have become cheaper and more available.
1. Plain Text
The earliest, easiest and quickest method of storing passwords is also the least secure. Nobody who knows anything about security has ever recommended this solution, but it deserves a mention as it’s still used in places.
2. Unsalted Hash
The most basic concept in password security is hashing. The raw password goes through a one-way function that makes it impossible to get the raw password again, so the hashed password can be stored in your database and then when a user logs in, the function can be run again and the values can be compared.
Originally MD5 and SHA-1 were commonly used as the hashing algorithms, but they’ve both had vulnerabilities highlighted and it’s now far too cheap to crack the passwords. Hashed passwords are technically impossible to reverse/decrypt, but they are not impossible to crack (guess every possibility). Additionally, two users with the same password would have the same hash.
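The last point is easy to demonstrate. Here's a small sketch using Node's crypto module, with MD5 used purely to show the weakness; the function name is hypothetical.

```javascript
const crypto = require('crypto');

// Unsalted MD5, shown only to illustrate the weakness; do not use it for passwords.
function unsaltedHash(password) {
  return crypto.createHash('md5').update(password, 'utf8').digest('hex');
}

const alice = unsaltedHash('password');
const bob = unsaltedHash('password');

console.log(alice === bob); // true: identical passwords give identical hashes
console.log(alice);         // 5f4dcc3b5aa765d61d8327deb882cf99
```

Because the output is always the same for a given password, a precomputed lookup table for common passwords works against every user at once.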
3. Salted Hash
To multiply the complexity of hashed passwords, salts are used. These are random strings of data (generated and stored whenever the password is changed) which are stored in plain-text alongside the hash, retrieved on login and then used to generate the password hash (HASH(salt + password)). By using salts, an attacker has to run through the entire process once per-user, rather than once for the entire user-base.
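As a minimal sketch of the idea, assuming scrypt from Node's crypto module stands in for HASH (the text only specifies HASH(salt + password)); the function names and the 64-byte output length are our own choices for illustration.

```javascript
const crypto = require('crypto');

// Hash a password with a fresh random salt. scrypt stands in for HASH here.
function hashPassword(password) {
  const salt = crypto.randomBytes(32);
  const hash = crypto.scryptSync(password, salt, 64);
  return { salt, hash }; // both are stored in the user's row
}

// Recompute the hash with the stored salt and compare in constant time.
function verifyPassword(password, salt, expectedHash) {
  const candidate = crypto.scryptSync(password, salt, expectedHash.length);
  return crypto.timingSafeEqual(candidate, expectedHash);
}

const first = hashPassword('hunter2');
const second = hashPassword('hunter2');
console.log(first.hash.equals(second.hash)); // false: same password, different salts
console.log(verifyPassword('hunter2', first.salt, first.hash)); // true
```

Two users with the same password now end up with different hashes, so each one has to be attacked separately.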
A short four-character alphanumeric salt multiplies the size of the rainbow table required to crack the passwords by 36^4 (1,679,616). At GoSquared, we use 32-byte salts, so the multiplier is 16^64 (1.158×10^77). In the real world, however, the multiplier is only the number of unique salts in your system (number of users) as once your database has been compromised the attacker can see which salts are in use.
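Those multipliers can be reproduced exactly with BigInt arithmetic. This is only a worked check of the figures above, and the helper name is hypothetical.

```javascript
// Number of distinct salts for a given alphabet size and length, as an exact BigInt.
function saltSpace(alphabetSize, length) {
  return BigInt(alphabetSize) ** BigInt(length);
}

const fourCharAlphanumeric = saltSpace(36, 4);
const thirtyTwoBytes = saltSpace(256, 32); // identical to 16^64

console.log(fourCharAlphanumeric.toString()); // 1679616
console.log(thirtyTwoBytes === saltSpace(16, 64)); // true
console.log(thirtyTwoBytes.toString().length); // 78 digits, i.e. about 1.158 x 10^77
```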
This method has been recommended for a long time, and it’s still the go-to system for new applications. A strong hashing algorithm (such as scrypt, bcrypt or PBKDF2) along with a secure salt is usually enough to prevent attackers from cracking passwords (unless your users have chosen common passwords), but there’s still more that can be done.
4. Two-step salted hash
Using the salted hash method usually results in a users table which has id, email, salt and password columns, meaning that the passwords and salts are stored at a 1-1 ratio with your users. This means the complexity increase by using salts is only the number of users that you have (or the number of users that the attacker is interested in). Most of the time, the 1-1 ratio results in rainbow tables still being a feasible attack method.
A great solution is outlined on Jeremy Spilman’s blog, which suggests de-coupling your password hashes and users. The hashes table can then be massively bloated with millions (or billions) of rows of faux data to increase the cost and complexity of attacking the system. There are a couple of potential security risks with his initial design, and the follow-up post fixes these well. Here’s the solution (mostly taken from the post):
users table: [ id, salt1, hash2 ]
hashes table: [ hash1, salt2 ]
Both salts are cryptographically secure, and 32 bytes is a good length. In Node.js, they can be generated using crypto.randomBytes(32). The hashes are the output from your preferred hashing algorithm. Scrypt is our algorithm of choice as it helps protect against GPU-based attacks by increasing the amount of memory required – otherwise attackers can make billions of guesses per second using the multi-core parallelism in graphics cards.
hash1 = HASH(salt1, password)
hash2 = HASH(salt2, password)
System flow
Setting a new password:
- Generate both salts
- Calculate hash1 (HASH(salt1, password))
- Insert hash1 and salt2 into the hashes table: INSERT INTO hashes SET hash1 = [hash1], salt2 = [salt2]
- Calculate hash2 (HASH(salt2, password))
- Update hash2 and salt1 in the users table: UPDATE users SET salt1 = [salt1], hash2 = [hash2] WHERE id = [userID]
Note: there’s no need to delete the old data from the hashes table – the extra hash only increases the bloating of the table, helping to make it more secure.
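The steps for setting a password can be sketched in Node.js as follows. This is an illustration rather than our production code: two in-memory Maps stand in for the users and hashes tables, scrypt with a 64-byte output stands in for HASH, and the function names are hypothetical.

```javascript
const crypto = require('crypto');

// In-memory stand-ins for the two tables; a real system would use the SQL above.
const users = new Map();  // id -> { salt1, hash2 }
const hashes = new Map(); // hash1 (hex) -> salt2

// scrypt stands in for HASH(salt, password).
function HASH(salt, password) {
  return crypto.scryptSync(password, salt, 64);
}

function setPassword(userID, password) {
  // Generate both salts
  const salt1 = crypto.randomBytes(32);
  const salt2 = crypto.randomBytes(32);

  // hash1 goes into the hashes table together with salt2
  const hash1 = HASH(salt1, password);
  hashes.set(hash1.toString('hex'), salt2);

  // hash2 goes into the users table together with salt1
  const hash2 = HASH(salt2, password);
  users.set(userID, { salt1, hash2 });
}

setPassword(1, 'first password');
setPassword(1, 'second password'); // the old hashes row is left in place
console.log(users.size, hashes.size); // 1 2
```

Changing the password twice leaves one users row and two hashes rows, matching the note above about old data staying in the hashes table.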
Validating a login:
- Retrieve salt1 and hash2 from the users table: SELECT salt1, hash2 FROM users WHERE id = [userID]
- Calculate hash1: HASH(salt1, password)
- Retrieve salt2 from the hashes table: SELECT salt2 FROM hashes WHERE hash1 = [hash1]
- If no rows are returned, invalid login
- Calculate hash2: HASH(salt2, password)
- Verify that the calculated hash2 is equal to the hash2 from the users table
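Here's a sketch of the login check in Node.js, under the same assumptions as before: Maps stand in for the two tables, scrypt stands in for HASH, and the function names are hypothetical. The final comparison uses crypto.timingSafeEqual, which is our addition rather than part of the original design.

```javascript
const crypto = require('crypto');

// scrypt stands in for HASH(salt, password).
function HASH(salt, password) {
  return crypto.scryptSync(password, salt, 64);
}

// users: Map of id -> { salt1, hash2 }; hashes: Map of hex(hash1) -> salt2.
function validateLogin(users, hashes, userID, password) {
  const user = users.get(userID);
  if (!user) return false;

  const hash1 = HASH(user.salt1, password);
  const salt2 = hashes.get(hash1.toString('hex'));
  if (!salt2) return false; // no rows returned: invalid login

  const hash2 = HASH(salt2, password);
  return crypto.timingSafeEqual(hash2, user.hash2);
}

// Seed one user by hand so the check can be exercised.
const salt1 = crypto.randomBytes(32);
const salt2 = crypto.randomBytes(32);
const users = new Map([[1, { salt1, hash2: HASH(salt2, 'hunter2') }]]);
const hashes = new Map([[HASH(salt1, 'hunter2').toString('hex'), salt2]]);

console.log(validateLogin(users, hashes, 1, 'hunter2')); // true
console.log(validateLogin(users, hashes, 1, 'wrong'));   // false
```

A wrong password fails at the hashes lookup, because its hash1 doesn't match any row.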
Why is this good? It just looks more complicated…
As mentioned above, a simple/maintainable system is very useful. However, after a few minutes of thought this system is quite simple. The real strength of this system is that you can heavily bloat the hashes table with any amount of data. Scrypt is already very good at increasing CPU and memory requirements, so increasing the bandwidth and storage size needed to make an attack is a great addition.
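Generating the faux rows is straightforward. The sketch below assumes a 64-byte hash1 and a 32-byte salt2 stored as hex, so that fake rows have the same shape as real ones; those sizes and the function name are our assumptions for illustration.

```javascript
const crypto = require('crypto');

// Generate rows of faux data shaped like real hashes rows:
// a 64-byte hash1 and a 32-byte salt2, both random bytes.
function fauxRows(count) {
  const rows = [];
  for (let i = 0; i < count; i++) {
    rows.push({
      hash1: crypto.randomBytes(64).toString('hex'),
      salt2: crypto.randomBytes(32).toString('hex'),
    });
  }
  return rows;
}

const batch = fauxRows(1000);
console.log(batch.length); // 1000
```

Whatever sizes you use, the fake rows should match the real ones, otherwise an attacker could simply filter them out.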
Since early 2013, we’ve had this system in production and it’s served us very well.
Extras
To help identify and stop attacks, there are a few extra (fairly obvious) best practices that should be followed.
Log and monitor everything
…but don’t log passwords!
Logging whenever an incorrect password attempt is made and monitoring the number of successful/failed attempts is essential. The logs should include as much information as possible to help identify patterns if somebody starts trying to guess your users’ passwords.
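As a small sketch of what such a log entry might look like, the function below copies across only the fields it names, so the password never reaches the log. The field names and event labels are hypothetical.

```javascript
// Build a structured log entry for a login attempt.
// Fields are copied explicitly, so anything else on the attempt (such as the
// password) never reaches the log.
function loginAttemptEntry(attempt, success) {
  return {
    event: success ? 'login_success' : 'login_failure',
    userID: attempt.userID,
    ip: attempt.ip,
    userAgent: attempt.userAgent,
    at: new Date().toISOString(),
  };
}

const entry = loginAttemptEntry(
  { userID: 42, ip: '203.0.113.7', userAgent: 'curl/8.0', password: 'hunter2' },
  false
);
console.log(JSON.stringify(entry));
```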
Monitoring database activity is critical too, to help identify if a breach has been made. Exporting your huge hashes table all at once should be pretty clear on your bandwidth charts.
HTTPS/TLS
This shouldn’t need mentioning: never transfer sensitive data over insecure connections.
Rate-limit attempts
Even without a database breach, attackers can make as many password attempts as they like against your users. Hopefully, your logging and monitoring would highlight this activity, but it’s also a good idea to rate limit attempts by user and IP address so they get locked out before they can make too many attempts. More importantly, alert yourself so you can spot suspicious activity.
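A simple in-memory sketch of limiting by both user and IP address is shown below. The limits (5 per user and 20 per IP in 15 minutes) and all of the names are arbitrary assumptions, and a real deployment would more likely keep the counters in a shared store than in process memory.

```javascript
// Sliding-window limiter: at most `limit` attempts per key within `windowMs`.
class AttemptLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.attempts = new Map(); // key -> array of timestamps (ms)
  }

  // Returns true if the attempt is allowed; only allowed attempts are recorded.
  allow(key, now = Date.now()) {
    const recent = (this.attempts.get(key) || []).filter((t) => now - t < this.windowMs);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(now);
    this.attempts.set(key, recent);
    return allowed;
  }
}

const byUser = new AttemptLimiter(5, 15 * 60 * 1000);
const byIP = new AttemptLimiter(20, 15 * 60 * 1000);

function mayAttemptLogin(userID, ip, now = Date.now()) {
  // Evaluate both so each limiter sees the attempt.
  const userOk = byUser.allow(`user:${userID}`, now);
  const ipOk = byIP.allow(`ip:${ip}`, now);
  return userOk && ipOk;
}
```

The login handler would call mayAttemptLogin before checking the password and reject the request (and raise an alert) when it returns false.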
Two-factor authentication
All online services should offer two-factor authentication to their users. It massively reduces the chances of an attacker gaining access to an account by requiring both a password and a 6-character code generated on a mobile device. Unfortunately, it can be tough to build a secure and user-friendly solution, but we’ve written in detail about building our 2FA system. Alternatively, there are third-party services such as Authy which can make it easier to implement, but Authy has had security issues.
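For a sense of how the codes are produced, here's a sketch of the standard time-based one-time password algorithm (TOTP, RFC 6238, built on HOTP from RFC 4226) using Node's crypto module. We're assuming TOTP purely for illustration; this isn't a description of our own 2FA system, and the secret is the RFC's published test value.

```javascript
const crypto = require('crypto');

// HOTP (RFC 4226): HMAC-SHA1 over an 8-byte big-endian counter, dynamically truncated.
function hotp(secret, counter, digits = 6) {
  const message = Buffer.alloc(8);
  message.writeBigUInt64BE(BigInt(counter));
  const mac = crypto.createHmac('sha1', secret).update(message).digest();

  const offset = mac[mac.length - 1] & 0x0f;
  const binary = mac.readUInt32BE(offset) & 0x7fffffff;
  return String(binary % 10 ** digits).padStart(digits, '0');
}

// TOTP (RFC 6238): HOTP with the counter derived from the current 30-second window.
function totp(secret, nowMs = Date.now(), stepSeconds = 30) {
  return hotp(secret, Math.floor(nowMs / 1000 / stepSeconds));
}

const secret = Buffer.from('12345678901234567890', 'ascii'); // RFC test secret
console.log(hotp(secret, 0)); // 755224
console.log(totp(secret, 59 * 1000)); // 287082
```

The server and the mobile device share the secret, so both can compute the same code for the current time window without any network traffic.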
Conclusions
Despite the common claim that “passwords are dead” (started by Bill Gates in 2004), it’s still going to be quite some time until new solutions completely replace them. Therefore, software engineers need to keep improving password security to help keep their users safe.
Everyone has a responsibility to use secure passwords. Services like 1Password or LastPass can help with this, but even LastPass was hacked recently. Due to this, I’d recommend using an offline solution, such as a personal project of mine – PW, which hashes the service name and your master password to generate secure and unique passwords.
Password security is continuously evolving, and there are probably better systems out there now which we’d love to hear about in the comments or on Twitter.