Difference
between MD5 or SHA1 or RSA
MD5
is a cryptographic hash function. "SHA" is the name for a
family of hash functions; first a short-lived "SHA" which was renamed
"SHA-0", then "SHA-1" was defined (a best seller). Later
on, new members of the family were added, collectively designated as
"SHA-2", and consisting of SHA-224, SHA-256, SHA-384 and SHA-512.
Recently, a new SHA generation was designed, called "SHA-3" but also
"Keccak" (this was an open competition, Keccak being the codename of
one of the candidates, who ultimately won).
A cryptographic hash function is a
fully defined, deterministic function which uses no secret key. It takes as
input a message of arbitrary length (a stream of bits, any bits) and produces a fixed-size output.
Output size depends on the function; it is 128 bits for MD5, 160 bits for
SHA-1, 256 bits for SHA-256... Everybody can compute a given hash function on a
given input, and they all get the same results. Hash functions are also called digests because they somehow produce a kind of
"checksum" or "summary" of the input. Robust hash functions
must be such that nobody knows how to "invert" them or even find two
distinct inputs which yield the same output. The latter is called a collision and it is a mathematical necessity that
collisions exist (since the function can accept many more distinct inputs than
it can produce distinct outputs), but we require that it is unfeasible to find
even one collision.
MD5 turned out to be very broken with regards to
collisions (we can produce a collision in a few seconds of work on a PC) and
SHA-0 is also broken in that respect; SHA-1 is a bit flaky; the rest of the SHA
family appears to be robust so far. How a hash function achieves collision
resistance is a bit of a miracle since the whole function is completely known,
with no secret value; it just mixes the data too much for the best
cryptographers to unravel the process.
RSA
is two algorithms: an asymmetric encryption algorithm
and a digital signature algorithm. Although both algorithms build on
the same kind of mathematics, they are quite distinct (a lot of people describe
signatures as "encryption with the private key", which is a flaw
analogy and at best confusing, so don't do that). Both algorithms uses keys, i.e. pieces of data which must be kept secret.
It so happens that for RSA signatures, what is signed is not directly a given
message (a sequence of bits) but a hash of
the message: the message is first processed with a cryptographic hash function
like SHA-256, and the hash value is then used. This is done that way because
the mathematics of RSA can handle only values of moderate size, a few hundred
bits at best. Cryptographic hash functions are such that signing the hash is as
good as signing the original data.
That way, RSA and cryptographic hash
functions are often used together; but they are not the same thing at all.
Hexadecimal is a way to represent a sequence of bits into a
sequence of characters:
hexadecimal uses digits and letters from 'A' to 'F'; each character encodes
exactly four bits ('0' encodes '0000', '7' encodes '0111', 'D' encodes
'1101'...). Any sequence
of bits (and in particular the output of a hash function) can be converted to
hexadecimal and back. Hexadecimal is popular because human eyes and brain are
good at reading characters, and not at reading bits. Command-line tools which
compute cryptographic hash functions on files traditionally output hexadecimal
characters for that reason. However, there is nothing which intrinsically ties
hash functions with hexadecimal: everything which fits in a computer one way or
another is a sequence of bits, and so is amenable to hexadecimal; and a hash
function output is just a sequence of bits, which can be encoded in various
ways, hexadecimal being just the "traditional" way (although Base64 is also often encountered, especially when dealing
with databases).
No comments:
Post a Comment