[[User:Bluegol|Bluegol]] ([[User talk:Bluegol|talk]]) 17:47, 15 January 2013 (KST)
* locally sensitive hashing
* geometric hashing
* uniformity…
* variable range
** A common solution is to compute a fixed hash function with a very large range (say, 0 to 232 − 1), divide the result by n, and use the division’s remainder. If n is itself a power of 2, this can be done by bit masking and bit shifting. When this approach is used, the hash function must be chosen so that the result has fairly uniform distribution between 0 and n − 1, for any value of n that may occur in the application. Depending on the function, the remainder may be uniform only for certain values of n, e.g. odd or prime numbers
* Merkle–Damgård construction
** In cryptographic hash functions, a Merkle–Damgård construction is usually used. In general, the scheme for hashing such data is to break the input into a sequence of small units (bits, bytes, words, etc.) and combine all the units b[1], b[2], …, b[m] sequentially, as follows
** S ← S0; // Initialize the state.
** for k in 1, 2, …, m do // Scan the input data units:
** S ← F(S, b[k]); // Combine data unit k into the state.
** return G(S, n) // Extract the hash value from the state.
** This schema is also used in many text checksum and fingerprint algorithms. The state variable S may be a 32- or 64-bit unsigned integer; in that case, S0 can be 0, and G(S,n) can be just S mod n. The best choice of F is a complex issue and depends on the nature of the data. If the units b[k] are single bits, then F(S,b) could be, for instance
** if highbit(S) = 0 then
** return 2 * S + b
** else
** return (2 * S + b) ^ P
** Here highbit(S) denotes the most significant bit of S; the ‘*’ operator denotes unsigned integer multiplication with lost overflow; ‘^’ is the bitwise exclusive or operation applied to words; and P is a suitable fixed word.[4]
* cryptographic hash function
** The ideal cryptographic hash function has four main properties:
*** it is easy to compute the hash value for any given message
*** it is infeasible to generate a message that has a given hash
*** it is infeasible to modify a message without changing the hash
*** it is infeasible to find two different messages with the same hash