sphlib is a library which contains implementations of various cryptographic hash functions. These pages have been generated with doxygen and document the API for the C implementations.
The API is described in the corresponding header files, which are available in the "Files" section. Each hash function family has its own header, whose name begins with "sph_" and contains the family name. For instance, the API for the RIPEMD hash functions is available in the header file sph_ripemd.h.
In full generality, hash functions operate over strings of bits. Individual bits are rarely encountered in C programming or actual communication protocols; most protocols converge on the ubiquitous "octet", which is a group of eight bits. Data is thus expressed as a stream of octets. The C programming language contains the notion of a "byte", which is a data unit managed under the type "unsigned char". The C standard prescribes that a byte should hold at least eight bits, but possibly more. Most modern architectures, even in the embedded world, feature eight-bit bytes, i.e. map bytes to octets.
Nevertheless, for some of the implemented hash functions, an extra API has been added, which allows the input of arbitrary sequences of bits: when the computation is about to be closed, 1 to 7 extra bits can be added. The functions for which this API is implemented include the SHA-2 functions and all SHA-3 candidates.
sphlib defines hash functions which hash octet streams, i.e. streams of bits where the number of bits is a multiple of eight. The data input functions in the sphlib API expect data as anonymous pointers ("const void *") with a length (of type "size_t") which gives the input data chunk length in bytes. A byte is assumed to be an octet; the sph_types.h header contains a compile-time test which prevents compilation on architectures where this property is not met.
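As an illustration of such a compile-time test, the following sketch rejects platforms whose bytes are not octets. It relies only on the standard CHAR_BIT macro from limits.h; the actual test in sph_types.h may be written differently, and the helper name bytes_are_octets is hypothetical.

```c
#include <limits.h>

/* Refuse to compile when a byte is not exactly eight bits wide.
 * Sketch only: the real check in sph_types.h may differ. */
#if CHAR_BIT != 8
#error "this code assumes that a byte is an octet (CHAR_BIT == 8)"
#endif

/* Hypothetical runtime counterpart, handy in a self-test. */
static int bytes_are_octets(void)
{
    return CHAR_BIT == 8;
}
```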
The hash function output is also converted into bytes. All currently implemented hash functions have an output width which is a multiple of eight, and this is likely to remain true for new designs.
Most hash functions internally convert input data into 32-bit or 64-bit words, using either little-endian or big-endian conversion. The hash output also often consists of such words, which are encoded into output bytes with a similar endianness convention. Some hash functions have been only loosely specified on that subject; when necessary, sphlib has been tested against published "reference" implementations in order to use the same conventions.
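To make the two conventions concrete, here is a minimal sketch of how four input bytes can be decoded into a 32-bit word in either byte order. The function names load32_be and load32_le are hypothetical and are not meant to describe the internal helpers of sphlib.

```c
#include <stdint.h>

/* Decode four bytes as a 32-bit word, most significant byte first
 * (big-endian convention). */
static uint32_t load32_be(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Decode four bytes as a 32-bit word, least significant byte first
 * (little-endian convention). */
static uint32_t load32_le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

Because the decoding is done byte by byte with shifts, the result does not depend on the byte order of the machine running the code, nor on the alignment of the input pointer.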
Each implemented hash function has a "short name" which is used internally to derive the identifiers for the functions and context structures which the function uses. For instance, MD5 has the short name "md5". Short names are listed in the next section, for the implemented hash functions. In subsequent sections, the short name will be assumed to be "XXX": replace with the actual hash function name to get the C identifier.
Note: some functions within the same family share the same core elements, such as update function or context structure. Correspondingly, some of the defined types or functions may actually be macros which transparently evaluate to another type or function name.
Each implemented hash function has its own context structure, available under the type name "sph_XXX_context" for the hash function with short name "XXX". This structure holds all needed state for a running hash computation.
The contents of these structures are meant to be opaque, and private to the implementation. However, these contents are specified in the header files so that application code which uses sphlib can obtain the size of those structures.
The caller is responsible for allocating the context structure, whether by dynamic allocation (malloc() or equivalent), static allocation (a global permanent variable), as an automatic variable ("on the stack"), or by any other means which ensures proper structure alignment. sphlib code performs no dynamic allocation by itself.
The context must be initialized before use, using the sph_XXX_init() function. This function sets the context state to proper initial values for hashing.
Since all state data is contained within the context structure, sphlib is thread-safe and reentrant: several hash computations may be performed in parallel, provided that they do not operate on the same context. Moreover, a running computation can be cloned by copying the context (with a simple memcpy()): the context and its clone are then independent and may be updated with new data and/or closed without interfering with each other. Similarly, a context structure can be moved in memory at will: context structures contain no pointer, in particular no pointer to themselves.
Hashed data is input with the sph_XXX() function, which takes as parameters a pointer to the context, a pointer to the data to hash, and the number of data bytes to hash. The context is updated with the new data.
Data can be input in one or several calls, with arbitrary input lengths. However, it is best, performance-wise, to input data in relatively big chunks (say a few kilobytes), because this allows sphlib to process full blocks directly and avoid internal copying.
When all data has been input, the context can be closed with sph_XXX_close(). The hash output is computed and written into the provided buffer. The caller must take care to provide a buffer of appropriate length; e.g., when using SHA-1, the output is a 20-byte word, therefore the output buffer must be at least 20 bytes long.
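The initialize, input and close sequence described above can be sketched as a generic helper. This is an illustration under stated assumptions, not sphlib code: the three function pointer types mirror the parameter lists described in the text, and hash_buffer, the toy_* functions and toy_demo are hypothetical names. The toy "hash" only counts input bytes so that the control flow can be exercised without linking against the library.

```c
#include <stddef.h>

/* Shapes assumed for sph_XXX_init(), sph_XXX() and sph_XXX_close(). */
typedef void (*hash_init_fn)(void *cc);
typedef void (*hash_update_fn)(void *cc, const void *data, size_t len);
typedef void (*hash_close_fn)(void *cc, void *dst);

/* Hash one buffer: initialize, feed the data in chunks, then close. */
static void hash_buffer(hash_init_fn init, hash_update_fn update,
                        hash_close_fn close_fn, void *cc,
                        const void *data, size_t len, size_t chunk,
                        void *dst)
{
    const unsigned char *p = data;

    if (chunk == 0)
        chunk = 4096;               /* arbitrary default, a few kilobytes */
    init(cc);
    while (len > 0) {
        size_t n = len < chunk ? len : chunk;

        update(cc, p, n);
        p += n;
        len -= n;
    }
    close_fn(cc, dst);
}

/* Placeholder "hash", NOT part of sphlib: it counts the input bytes and
 * writes the count as its output, then resets itself like a closed context. */
typedef struct { size_t total; } toy_context;

static void toy_init(void *cc)
{
    ((toy_context *)cc)->total = 0;
}

static void toy_update(void *cc, const void *data, size_t len)
{
    (void)data;
    ((toy_context *)cc)->total += len;
}

static void toy_close(void *cc, void *dst)
{
    *(size_t *)dst = ((toy_context *)cc)->total;
    toy_init(cc);
}

/* Run the helper over `len` zero bytes and return the toy output. */
static size_t toy_demo(size_t len, size_t chunk)
{
    static unsigned char buf[10000];
    toy_context cc;
    size_t out = 0;

    if (len > sizeof buf)
        len = sizeof buf;
    hash_buffer(toy_init, toy_update, toy_close, &cc, buf, len, chunk, &out);
    return out;
}
```

With sphlib, and assuming the SHA-1 functions have the shapes above, the same helper would presumably be called with sph_sha1_init, sph_sha1 and sph_sha1_close, a context of type sph_sha1_context, and an output buffer of at least 20 bytes; the exact prototypes should be checked in sph_sha1.h.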
For some hash functions, the sph_XXX_addbits_and_close() function can be used instead of sph_XXX_close(). This function can take a few extra bits to be added at the end of the input message. This allows hashing messages with a bit length which is not a multiple of 8. The extra bits are provided as an unsigned integer value, and a bit count. The bit count must be between 0 and 7, inclusive. The extra bits are provided starting at bit 7 and going down towards bit 0 (bits of numerical value 128, 64, 32, and so on down to 1), in that order. For instance, to add three bits of value 1, 1 and 0, the unsigned integer will have value 192 (1*128 + 1*64 + 0*32) and the bit count will be 3.
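The bit layout can be checked with a small sketch that packs a string of '0' and '1' characters into the unsigned value described above. The helper name pack_extra_bits is hypothetical; the bit count to pass alongside the value is simply the length of the string, which must not exceed 7.

```c
/* Pack up to seven extra bits, given as a string of '0'/'1' characters,
 * into an unsigned value: the first bit lands on numerical value 128,
 * the second on 64, and so on. Characters beyond the seventh are ignored. */
static unsigned pack_extra_bits(const char *bits)
{
    unsigned ub = 0;
    unsigned i;

    for (i = 0; i < 7 && bits[i] != '\0'; i++) {
        if (bits[i] == '1')
            ub |= 0x80u >> i;
    }
    return ub;
}
```

For the three bits 1, 1 and 0, pack_extra_bits("110") yields 192, which matches the example in the text. The value and the bit count would then be handed to sph_XXX_addbits_and_close() for the functions that provide it; the parameter order should be taken from the relevant header.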
The SPH_SIZE_XXX macro is defined for each hash function; it evaluates to the function output size, expressed in bits. For instance, SPH_SIZE_sha1 evaluates to 160.
When closed, the context is automatically reinitialized and can be immediately used for another computation. It is not necessary to call sph_XXX_init() after a close. Note that sph_XXX_init() can still be called to "reset" a context, i.e. forget previously input data, and get back to the initial state.
"Alignment" is a property of data, which is said to be "properly aligned" when its placement in memory is such that the data can be optimally read by full words. This depends on the type of access; basically, some hash functions will read data by 32-bit or 64-bit words. sphlib does not mandate such alignment for input data, but using aligned data can substantially improve performance.
As a rule, it is best to input data in chunks whose length (in bytes) is a multiple of eight, and which begin at "generally aligned" addresses, such as the base address returned by a call to malloc().
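An application that wants to verify this rule of thumb can test a pointer with a helper such as the one below. This is a sketch under assumptions: is_aligned is a hypothetical name, the uintptr_t type is optional in the C standard (though widely available), and the result of converting a pointer to an integer is implementation-defined.

```c
#include <stddef.h>
#include <stdint.h>

/* Return 1 if the address p is a multiple of `align` bytes, 0 otherwise.
 * `align` must be a power of two (e.g. 4 or 8). */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (uintptr_t)(align - 1)) == 0;
}
```

A buffer returned by malloc() is expected to pass is_aligned(buf, 8) on common platforms, whereas a pointer into the middle of a buffer (for instance buf + 1) generally will not.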
We give here the list of implemented functions. They are grouped by family; to each family corresponds a specific header file. Each individual function has its associated "short name". Please refer to the documentation for that header file to get details on the hash function denomination and provenance.
Note: the functions marked with a '(64)' in the list below are available only if the C compiler provides an integer type of length 64 bits or more. Such a type has been mandatory since the 1999 revision of the C standard (ISO 9899:1999, aka "C99") and is present in several older compilers as well, so chances are that such a type is available.
sph_haval.h
haval128_3
haval128_4
haval128_5
haval160_3
haval160_4
haval160_5
haval192_3
haval192_4
haval192_5
haval224_3
haval224_4
haval224_5
haval256_3
haval256_4
haval256_5
sph_md2.h, short name: md2
sph_md4.h, short name: md4
sph_md5.h, short name: md5
sph_panama.h, short name: panama
sph_radiogatun.h
radiogatun32
radiogatun64 (64)
sph_ripemd.h
ripemd
ripemd128
ripemd160
sph_sha0.h, short name: sha0
sph_sha1.h, short name: sha1
sph_sha2.h
sha224
sha256
sha384 (64)
sha512 (64)
sph_tiger.h
tiger (64)
tiger2 (64)
sph_whirlpool.h
whirlpool0 (64)
whirlpool1 (64)
whirlpool (64)
The fourteen second-round SHA-3 candidates are also implemented:
sph_blake.h
blake224
blake256
blake384 (64)
blake512 (64)
sph_bmw.h
bmw224
bmw256
bmw384 (64)
bmw512 (64)
sph_cubehash.h (specified as CubeHash16/32 in the CubeHash specification)
cubehash224
cubehash256
cubehash384
cubehash512
sph_echo.h
echo224
echo256
echo384
echo512
sph_fugue.h
fugue224
fugue256
fugue384
fugue512
sph_groestl.h
groestl224
groestl256
groestl384
groestl512
sph_hamsi.h
hamsi224
hamsi256
hamsi384
hamsi512
sph_jh.h
jh224
jh256
jh384
jh512
sph_keccak.h
keccak224
keccak256
keccak384
keccak512
sph_luffa.h
luffa224
luffa256
luffa384
luffa512
sph_shabal.h
shabal192
shabal224
shabal256
shabal384
shabal512
sph_shavite.h
shavite224
shavite256
shavite384
shavite512
sph_simd.h
simd224
simd256
simd384
simd512
sph_skein.h
skein224 (64)
skein256 (64)
skein384 (64)
skein512 (64)
For the second-round SHA-3 candidates, the functions are as specified for round 2, i.e. with the "tweaks" that some candidates added between round 1 and round 2. Also, some of the submitted packages for round 2 contained errors, in the specification, reference code, or both. sphlib implements the corrected versions.