The past two years have seen some dramatic leaks of
passwords including from well-known names such as LinkedIn and Adobe. These
events shone a light on how our passwords are being stored. If someone is daft
enough to store our passwords as plain text then they do not deserve to be
trusted with them. Most attempt to protect passwords by using a “hash” of our
passwords.
Hash functions have been around since the early 1950s and were developed to
allow, for example, fast comparison of database entries to see if there were
duplicates. Many hash functions have been developed over the years but they all
do basically the same thing: they take an arbitrarily long set of characters and
transform it into a fixed length, much shorter string of characters. For the
same set of input characters you will always end up with the same output.
However, the likelihood of ending up with the same shortened output from
differing inputs, known as a “collision”, should be negligible.
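The properties just described can be seen in a few lines of Python. This is only a minimal sketch: it uses SHA-256 from the standard `hashlib` module as a convenient example (a modern cryptographic hash, not one of the early functions discussed here), and the helper name `sha256_hex` is made up for illustration.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of text as a string of hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


short_digest = sha256_hex("dog")
long_digest = sha256_hex("my dog has big ears " * 1000)

# Inputs of very different lengths give outputs of the same fixed length.
print(len(short_digest), len(long_digest))

# The same input always gives the same output.
print(sha256_hex("dog") == short_digest)

# A one-character change in the input gives a different output.
print(sha256_hex("dog") == sha256_hex("dig"))
```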
Why does that help? Well, on the relatively slow machines of
the time it was better to compare shorter strings of characters when looking
for matches. Plus the development of hash functions focussed on making the hash
function very fast. Hence, producing hashes and using them to find, for
example, a match was significantly faster than trying to do so using the
original data.
Then came the development of “cryptographic hashes”, which most refer to today
simply as “hashes”. These secure hashes are like the original hash functions
except that they put extra emphasis on preventing someone from determining
anything about the input based solely on the hashed value: a one-way function.
Reversal was very difficult anyway, because in compressing the data to produce
the hash you always lose information, much as in so-called “lossy compression”.
But cryptographic hashes are tested specifically for their ability to resist
reversal.
An obvious use for these cryptographic hashes was password management. Instead
of storing our passwords in plain text, a system could now store only the hash.
When we log in it receives our password, hashes it and compares the result with
the stored hash. If the two match it is almost certain that the password we
sent was correct. Hash functions with names like SHA1 and MD5 appeared to do
this, with some becoming standards recommended by many governments for securing
passwords on their systems. As time has passed researchers have found that some
of these hash functions have weaknesses and so are not quite as “one way” as
had been hoped. Hence, you may have noticed that major vendors have begun to
retire certain algorithms in favour of newer ones.
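The store-and-compare scheme described above might look something like the sketch below. Everything in it is illustrative: the in-memory `stored_hashes` dictionary stands in for a real user database, the function names are invented, and SHA-256 is used simply because it is in Python's standard library. It shows the basic idea only; a plain fast hash like this is not enough on its own, for the reasons the rest of this article goes into.

```python
import hashlib
import hmac

# Hypothetical stand-in for a user table: username -> hash (never the password).
stored_hashes = {}


def hash_password(password: str) -> str:
    """Hash a password with SHA-256 and return it as hexadecimal text."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def register(username: str, password: str) -> None:
    """Store only the hash of the password for this user."""
    stored_hashes[username] = hash_password(password)


def check_login(username: str, password: str) -> bool:
    """Hash the submitted password and compare it with the stored hash."""
    stored = stored_hashes.get(username)
    if stored is None:
        return False
    # compare_digest compares in constant time to avoid leaking timing hints.
    return hmac.compare_digest(stored, hash_password(password))


register("alice", "my dog has big ears")
print(check_login("alice", "my dog has big ears"))
print(check_login("alice", "my cat has big ears"))
```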
Unfortunately, as time moved on computers became faster and faster, and faster
still. So much so that even your home computer is capable of undertaking
millions of comparisons a second. Plus the hashing algorithms have become well
known, if only because systems developers were encouraged to implement them to
protect passwords. This led to the development of what is known as the
“dictionary attack”, which relies upon simple brute force.
In essence it’s simple. You take a dictionary of words that might be used as
passwords, you hash each word yourself and you compare your resulting hashes
with the hashed password you have access to. When you have a match you look
back at your dictionary to see what the original plaintext word was, i.e. the
password. As it still takes an appreciable time to hash the dictionary you are
using to mount the attack, people began pre-computing the hashed forms of the
dictionary. The resulting sets of hashes became popularly known as “rainbow
tables” (strictly, a rainbow table is a particular space-saving way of storing
such pre-computed results). Now all you have to do is compare stolen hashed
passwords with your rainbow table, find a match and look back in your index to
find the original word/password.
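To make the attack concrete, here is a toy sketch of a pre-computed dictionary lookup. The four-word `dictionary`, the "stolen" hashes and the use of unsalted SHA-256 are all assumptions made for the example; a real attacker's word list would run to millions of entries, and this plain lookup table is the simple form of pre-computation rather than a true rainbow table.

```python
import hashlib


def sha256_hex(word: str) -> str:
    """Hash a candidate word the same way the (assumed) target system does."""
    return hashlib.sha256(word.encode("utf-8")).hexdigest()


# A tiny illustrative dictionary of likely passwords.
dictionary = ["password", "letmein", "qwerty", "dragon"]

# Pre-compute once: hash -> original word. This table can be reused
# against every stolen list that was hashed with the same function.
lookup = {sha256_hex(word): word for word in dictionary}

# Pretend these unsalted hashes were stolen from a system.
stolen = [sha256_hex("qwerty"), sha256_hex("Xk#9vT!q2")]

# Each lookup is a single dictionary access; None means no match was found.
recovered = [lookup.get(h) for h in stolen]
print(recovered)
```

The first password falls immediately because it is in the dictionary; the second survives only because nobody thought to put it in the word list.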
Using these techniques, hackers who have stolen huge sets of hashed passwords
(sometimes hundreds of thousands) can have computed the original passwords
almost before the keeper knows they are missing.
The answer is to add a touch of salt.
A “salt” is a randomly generated set of characters which you add before or
after your password characters; the combined string is then passed through your
hashing function. Now the hacker’s dictionary or rainbow tables should
theoretically be useless. But, as ever, whilst the theory is sound the way
system developers sprinkle their salt can give the hackers another route
in. Typical mistakes are:
1. Choosing a random character string that is not truly random. Computers have
great difficulty in generating anything that is random so this can be difficult,
and some developers in the past have taken short cuts assuming that no one will
guess how they have generated their “random” characters. They were wrong.
2. Choosing a random character string that is too short. If it is short enough
there are only so many values it could take, so an attacker can calculate all
possible values and simply add those to the dictionary.
3. Using the same random character set for every password. One of the greatest
favours a cryptographer can do a cryptanalyst (who is trying to break their
code) is to reuse the same string of characters. Once found, this salt will
allow the attacker to compute all the passwords almost as if the salt had never
been added.
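A sketch that avoids the three mistakes above might look like the following. The 16-byte salt length, the function names and the use of SHA-256 are assumptions chosen for illustration; `os.urandom` draws from the operating system's cryptographic random source rather than an ordinary pseudo-random generator, and a fresh salt is generated for every password.

```python
import hashlib
import os

SALT_BYTES = 16  # assumed length; long enough that listing every salt is impractical


def hash_with_salt(password: str, salt: bytes) -> str:
    """Prepend the salt to the password and hash the combination."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


def new_record(password: str):
    """Create a (salt, hash) pair using a fresh random salt for this password."""
    salt = os.urandom(SALT_BYTES)
    return salt, hash_with_salt(password, salt)


# Two users who pick the same password end up with different stored hashes,
# so one pre-computed table no longer matches both of them.
salt_a, hash_a = new_record("qwerty")
salt_b, hash_b = new_record("qwerty")
print(hash_a == hash_b)

# Checking a login means re-hashing with that user's own stored salt.
print(hash_with_salt("qwerty", salt_a) == hash_a)
```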
Ideally systems would store the salt on a separate system to the username and
hashed password. However, practical considerations often mean this is not done,
so the hacker might be able to obtain the salt as well as the username and
hashed password. From these they can of course still attempt to compute the
original passwords. However, because a separate set of hashes has to be
computed for each salt, it is a much slower process, and if a hacker is
attempting to crack thousands of passwords the process will take much longer
than they want. So, hackers have moved on from using computers as you might
recognise them to harnessing one particular part of your computer: the Graphics
Processing Unit (GPU).
Whilst most people have been aware that the processors in
their home computers have become faster and faster, the GPU has been silently
developing to achieve quite astronomical speeds. They can achieve such speeds because they are
dedicated to very specific types of computing such as decoding video or
generating 3D graphics. GPUs can be
optimised to dedicate more of their processing power to these graphics
functions: they don’t need to be able to do the general purpose functions that
your Central Processing Unit (CPU), which is the brain of your computer, must
be capable of.
However, for some time now hackers (or particularly “password crackers”) have
known how to combine many of these GPUs together to produce their own
mini-supercomputer. These machines sit on a desktop and can be built from parts
routinely available on the Internet. The software needed to run these GPUs in
parallel and the software to make use of them to crack the passwords as
explained above are freely available to download, if you know where to look.
Suddenly, although salted hashes make it more difficult, the arms race swings
back in favour of those seeking to find your password.
But, the war is not over. It might seem obvious, but it is only relatively
recently that those seeking to protect passwords have started to research
hashing functions that are deliberately slow. Whereas hash functions were,
because of their original purpose, always designed to be fast and efficient,
some of the latest are designed to take an appreciable time to compute. The
idea is that you cause the hackers/crackers so much inconvenience, even with
their home-built supercomputers, that they move on to easier targets. You can’t
stop them eventually calculating your password but you can make it take a long
time.
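As a rough illustration of a deliberately slow hash, the sketch below uses PBKDF2, which simply repeats an underlying hash many times and is available in Python's standard library as `hashlib.pbkdf2_hmac`. PBKDF2 is one example chosen here, not necessarily the kind of function the author has in mind (bcrypt, scrypt and Argon2 are others), and the iteration count of 200,000 is an assumed figure for the demonstration, not a recommendation.

```python
import hashlib
import os
import time

ITERATIONS = 200_000  # assumed work factor, chosen only for this demonstration


def slow_hash(password: str, salt: bytes, iterations: int = ITERATIONS) -> bytes:
    """Derive a hash using PBKDF2-HMAC-SHA256 with many iterations."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


salt = os.urandom(16)

# Time a single ordinary (fast) hash.
start = time.perf_counter()
hashlib.sha256(salt + b"qwerty").digest()
fast_seconds = time.perf_counter() - start

# Time the deliberately slow version of the same job.
start = time.perf_counter()
derived = slow_hash("qwerty", salt)
slow_seconds = time.perf_counter() - start

print(f"fast hash: {fast_seconds:.6f}s, slow hash: {slow_seconds:.6f}s")
```

A fraction of a second is unnoticeable to someone logging in once, but the attacker has to pay the same cost for every guess against every salt.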
There is one way that you can help enormously: choose a “strong password”,
which is simply a set of characters that is unlikely to appear in the hacker’s
dictionary. That’s why many systems insist that you use unusual characters in
your password. For example, if you chose a phrase like “my dog has big ears”,
you could write that as “Myd0ghasb!gears”. The other thing you can do is not to
reuse passwords. Much easier said than done, but sadly not all systems are
developed to the same high standards, so your password is only as secure as the
weakest of those systems: it is pointless having slow salted hashes on one
system if the same password is held on another system that stores it in
plaintext.
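For a feel of why length and unusual characters matter once a password is not in the dictionary, the back-of-envelope sketch below counts how many candidates a blind brute-force search would have to try. The guessing rate is a purely hypothetical figure, and the count is only an upper bound for an attacker trying every combination; it says nothing about passwords that do appear in a word list.

```python
SECONDS_PER_YEAR = 60 * 60 * 24 * 365
GUESSES_PER_SECOND = 1_000_000_000  # hypothetical attacker speed


def brute_force_candidates(alphabet_size: int, length: int) -> int:
    """Number of possible strings of the given length over the given alphabet."""
    return alphabet_size ** length


def years_to_exhaust(candidates: int, rate: int = GUESSES_PER_SECOND) -> float:
    """Years needed to try every candidate at the given guessing rate."""
    return candidates / rate / SECONDS_PER_YEAR


# Eight lowercase letters versus fifteen characters drawn from the
# 94 printable ASCII characters (letters, digits and symbols, no space).
lowercase_8 = brute_force_candidates(26, 8)
mixed_15 = brute_force_candidates(94, 15)

print(f"8 lowercase letters: {years_to_exhaust(lowercase_8):.6f} years")
print(f"15 mixed characters: {years_to_exhaust(mixed_15):.3e} years")
```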