User Tools

Site Tools


chess:programming:zobrist_hashing

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
chess:programming:zobrist_hashing [2021/10/12 10:50] peterchess:programming:zobrist_hashing [2021/10/30 21:41] (current) – [Implementation Consideration] peter
Line 1: Line 1:
 ====== Chess - Programming - Zobrist Hashing ====== ====== Chess - Programming - Zobrist Hashing ======
  
-**Zobrist Hashing** is a technique to transform a board position of arbitrary size into a number of a set length, with an equal distribution over all possible numbers.+**Zobrist Hashing** is a hashing function that is widely used in 2 player board games.
  
   * The main purpose of Zobrist hash codes in chess programming is to get an almost unique index number for any chess position, with a very important requirement that two similar positions generate entirely different indices.   * The main purpose of Zobrist hash codes in chess programming is to get an almost unique index number for any chess position, with a very important requirement that two similar positions generate entirely different indices.
-  * These index numbers are used for faster and more space efficient Hash tables or databases, e.g. transposition tables and opening books.+  * It is the most common hashing function used in transposition tables; and sometimes opening books. 
 +  * Transposition tables basically store the evaluated values of previous board states, so that if they are encountered again we simply retrieve the stored value from the transposition table. 
 +  * These index numbers are used for faster and more space efficient Hash tables or databases.
  
 ---- ----
  
-==== Initialization ====+[[Chess:Programming:Zobrist Hashing:A program to illustrate the Zobrist Hashing Algorithm|A program to illustrate the Zobrist Hashing Algorithm]] 
 + 
 +---- 
 + 
 +===== Pseudocode ===== 
 + 
 +<code> 
 +// A matrix with random numbers initialized once. 
 +Table[#ofBoardCells][#ofPieces]  
 + 
 +// Returns Zobrist hash function for current configuration of board. 
 +function findhash(board): 
 +  hash = 0 
 +  for each cell on the board : 
 +    if cell is not empty : 
 +      piece = board[cell] 
 +      hash ^= table[cell][piece] 
 +  return hash 
 +</code> 
 + 
 +<WRAP info> 
 +**NOTE:**  Explanation : 
 + 
 +  * The idea behind Zobrist Hashing is that for a given board state, if there is a piece on a given cell, we use the random number of that piece from the corresponding cell in the table. 
 + 
 +  * If more bits are there in the random number the lesser chance of a hash collision. 
 +  * Therefore 64 bit numbers are commonly used as the standard and it is highly unlikely for a hash collision to occur with such large numbers. 
 +  * The table has to be initialized only once during the programs execution. 
 + 
 +  * Also the reason why Zobrist Hashing is widely used in board games is because when a player makes a move, it is not necessary to recalculate the hash value from scratch. 
 +  * Due to the nature of XOR operation we can simply use few XOR operations to recalculate the hash value. 
 + 
 +</WRAP> 
 + 
 +---- 
 + 
 +===== Initialization =====
  
 At program initialization, we generate an array of pseudorandom numbers: At program initialization, we generate an array of pseudorandom numbers:
Line 29: Line 67:
 ---- ----
  
 +===== Runtime =====
 +
 +To get the Zobrist hash code of a certain position, the hash key is initialized by xoring all random numbers linked to the given feature.
 +
 +For example, the starting position:
 +
 +<code>
 +[Hash for White Rook on a1] xor [White Knight on b1] xor [White Bishop on c1] xor ... ( all pieces )
 +... xor [White castling short] xor [White castling long] xor ... ( all castling rights )
 +</code>
 +
 +<WRAP info>
 +**NOTE:**  The fact that xor-operation is own inverse and can be undone by using the same xor-operation again, is often used by chess engines.
 +
 +  * It allows a fast incremental update of the hash key during make or unmake moves.
 +
 +</WRAP>
 +
 +
 +Another example, for a White Knight that jumps from b1 to c3 capturing a Black Bishop, these operations are performed:
 +
 +<code>
 +[Original Hash of position] xor [Hash for White Knight on b1] ... ( removing the knight from b1 )
 +... xor [Hash for Black Bishop on c3] ( removing the captured bishop from c3 )
 +... xor [Hash for White Knight on c3] ( placing the knight on the new square )
 +... xor [Hash for Black to move] ( change sides)
 +</code>
 +
 +----
 +
 +===== What size the hash keys should have =====
 +
 +Smaller hash keys are faster and more space efficient, while larger ones reduce the risk of a hash collision.
 +
 +  * A collision occurs if two positions map the same key.
 +  * Usually 64 bit is used as a standard size in modern chess programs.
 +
 +----
 +
 +===== Collisions =====
 +
 +Using too small a hash code, such as only 32 bits, may result in different positions being assigned the same hash code.
 +
 +Programs have various strategies for deciding what to do when a clash occurs.
 +
 +  * Some programs attempt to store the position at the next location.
 +    * If that location is found to be occupied also, the program can give up or it can replace the older of the two entries just examined.
 +  * Most programs try more than once to find an unoccupied location.
 +
 +A supposedly good hashing function workaround for clashing:
 +
 +  * Given a 64 bit (or whatever word size you want) take the low order "n" bits where "n" is log2 (tablesize) and use this as the "first" entry (x) to probe.
 +  * Take the next "m" bits (for Cray Blitz, m=9 and use this as a "vector stride" (y) and then load eight words, table(x), table(x+y), table(x+2y), ..., table(x+7y).
 +  * One of these will be the one chosen to be replaced by the current store, or else one of these will be the "match" for the current lookup.
 +  * The only minor difficulty to handle is that you do not dimension the table for (n) entries [table(n)] but, rather, we have to dimension the table larger [table(n+7*512)] to handle the positions where we hash to near the end of the table, and the vector loads would then step beyond the end... 
 +    * No, there is not a "modulus" mechanism for vector loads to make this load "wrap around" to the beginning.
 +
 +Remember that a table entry contains a "key", score, score type, best move, etc.
 +
 +  * This key must EXACTLY match the hash key before we consider this a "hit".
 +  * This means that the 64 bit hash key must match completely, even though we only used the low order "n" bits of this key to probe into the table.
 +  * In fact, since we used the low order "n" bits to prob into the table, we do not even bother storing them since we "know" what they were already (which saves table space.)
 +
 +<WRAP info>
 +**NOTE:**  It is highly recommended to use 64-bit keys to not have to deal with collisions.
 +
 +  * A value of 48 bits has shown to be just as safe as 64, but 48 is not a reasonable size as 64 bit signatures are the natural signature length for 64 bit processors.
 +
 +</WRAP>
 +
 +----
 +
 +===== Implementation Consideration =====
 +
 +<code cpp>
 +UINT64 getZoristKey(int piece, int color, int square)
 +{
 +  int index = piece*color*square;
 +  const char *p = ((char *)zorbrist) + index
 +  return *(UNIT64 *) p;
 +}
 +</code>
 +
 +This means that the array does not need:
 +
 +<code cpp>
 +(piece*color*square * 64) bits;
 +</code>
 +
 +  * The original and well-known implementation.
 +
 +but only
 +
 +<code cpp>
 +(piece*color*square * 8 + 3*8) bits;
 +</code>
 +
 +  * This will reduce array of about 8th in size.
 +
 +If a bit boundary is used, the array becomes:
 +
 +<code cpp>
 +(piece*color*square + 63) bits;
 +</code>
 +
 +To use this:  The index to the 64-bit value that is being searched for becomes an index to the first bit of the 64-bits needed.
 +
 +  * This gives the shortest possible array.
 +  * The access would be 64 + 8 bits; and then shift the results according to the bit index.
 +  * An easiest way is to use a byte boundary.
 +
 +----
 +
 +===== References =====
  
 +http://web.archive.org/web/20111202160335/http://www.cis.uab.edu/hyatt/collisions.html
chess/programming/zobrist_hashing.1634035857.txt.gz · Last modified: 2021/10/12 10:50 by peter

Donate Powered by PHP Valid HTML5 Valid CSS Driven by DokuWiki