
I am making a chess engine and trying to implement Zobrist tables in it. From what I've read, my understanding is that you first calculate the Zobrist hashcode, then take that hashcode modulo the size of your transposition table to find the storage index, and then store the corresponding evaluation values at that index. I have a 64-bit Zobrist hash function written, but I have some questions about how to implement the actual table and about Zobrist hashing in general:
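
Concretely, here is a minimal sketch of what I have in mind (Java; the Entry fields and table size are placeholders I haven't settled on):

    // Minimal sketch of my current understanding; fields and size are placeholders.
    class Entry {
        long zobristKey; // full 64-bit hash, kept to detect index collisions
        int  evaluation;
        int  depth;
    }

    class TranspositionTable {
        static final int TABLE_SIZE = 1 << 20; // placeholder size
        final Entry[] entries = new Entry[TABLE_SIZE];

        void store(long zobristKey, int evaluation, int depth) {
            // floorMod keeps the index non-negative even for negative longs
            int index = (int) Math.floorMod(zobristKey, (long) TABLE_SIZE);
            Entry e = new Entry();
            e.zobristKey = zobristKey;
            e.evaluation = evaluation;
            e.depth      = depth;
            entries[index] = e;
        }
    }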

  1. From what I've read, I should make an array of entries to store hash/eval info. How big should such an array be to fit all the hashed positions over an entire game?
  2. If the slot at a hash's index is already occupied, would I just follow regular open-addressing rules and probe every n elements from that index until I find an empty slot? Is that also the protocol for Zobrist hashing?
  3. Correct me if I'm wrong, but I've read that when searching through the hash array, if the stored depth of a hashed position differs from the depth you're currently searching, you're not supposed to use the past evaluation, even if its hash value matches your position's. I understand the logic behind this, but doesn't it practically make Zobrist hashing useless? Realistically, an engine is not going to evaluate a position at the exact same depth every time, right?
  4. Say a position with hashcode x and evaluation depth 5 exists in the hash table with a final evaluation of a. I evaluate the same position x again, this time at depth 6, and find the new evaluation to be b. Do I replace the stored depth-5 evaluation a with the depth-6 evaluation b?

1 Answer


I will try to answer some of the questions.

  1. Is the question how large your hash table should be? The simple answer is: make it as big as your system allows. If you have 16 GB of RAM, you can, for example, make the table 8 GB. Usually the number of entries is a power of 2 so that the index calculation becomes a simple bit operation, though other sizes work too (see the first sketch after this list). Most programs make this size configurable.

  2. This is not usually how it's implemented. For most applications it is essential that stored elements remain retrievable, but in chess you don't lose much by losing an entry: you can simply redo the search. Because of that, when writing to a cell that already contains a value, you can simply overwrite it without losing correctness. What I do in my program is keep a bucket of 4 entries at every index (16 bytes each, so together they fit nicely into a 64-byte cache line), which lets me store up to 4 entries per index (see the bucket sketch after this list). Of course, in a long enough search a bucket will eventually fill up, in which case I simply discard the least useful entry. (There you also have some options for how to value different entries against each other.)

  3. There are two possible approaches here. The standard approach is to use only entries whose stored depth is at least as high as the depth you currently need. So if you need a depth 8 value for a position and the table holds a depth 9 entry, you still use it. This is simple and efficient, and because of that it is the default option (see the probe sketch after this list). However, it has some undesired consequences: the most apparent one is that your search is no longer equivalent to minimax, since a minimax search at depth 8 would give a different result than one at depth 9. Depending on your evaluation, you might also see an odd-even effect, where the side that moved most recently tends to get a better evaluation by virtue of having used that last move; in that case depth 8 and depth 9 values do not compare nicely.

     If you want to avoid this, you can store one entry per depth, so you might have a depth 7, a depth 8, and a depth 9 entry for the same position. You can do this, for example, by storing the position at index (hash % table_size) + depth. This costs more memory, but it avoids the problems above.

     Are these exact matches even useful? Yes, because of transpositions. Assume, for example, you calculate 1.d4 Nf6 2.c4 and 1.c4 Nf6 2.d4 from the start position: for the second line you get an exact depth match. Furthermore, if you play against the engine, after making a move the engine still has the previous search results in the hash table, so when it starts calculating the next move it gets a "head start" by reusing the previous work.

  4. In the scenario where you keep only one entry per position, you update both the evaluation and the depth of that entry, as well as the best move and whatever else you might store (the bucket sketch below shows this overwrite). In the alternative approach of one entry per depth, you naturally don't touch the lower-depth entry but simply create a new one.
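
To make these points concrete, here are a few sketches in Java. The class and field names are placeholders I chose for illustration, not code from the question. First, the index calculation from point 1: with a power-of-2 number of entries, the modulo becomes a bit mask.

    // Sketch for point 1: with a power-of-2 number of entries, the index
    // calculation is a single bit mask instead of a modulo.
    class TableIndex {
        static final int NUM_ENTRIES = 1 << 24;        // 2^24 entries; choose to fit your RAM
        static final int INDEX_MASK  = NUM_ENTRIES - 1;

        static int indexFor(long zobristKey) {
            // Masking the low bits is equivalent to zobristKey % NUM_ENTRIES,
            // and it avoids the sign issues of % on a negative long.
            return (int) (zobristKey & INDEX_MASK);
        }
    }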
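
Next, a sketch of the bucket scheme from point 2, which also shows the overwrite from point 4. The replacement policy here (prefer the same position, then an empty slot, then the shallowest entry) is one simple choice among several, not the only correct one.

    // Sketch for points 2 and 4: a bucket of 4 entries per index. A store
    // first looks for the same position (and overwrites it, which answers
    // question 4), then for an empty slot, and otherwise replaces the
    // shallowest entry in the bucket.
    class BucketTable {
        static final int BUCKET_SIZE = 4;

        static final class Entry {
            long key;        // full 64-bit Zobrist key, kept to verify hits
            int  eval;
            int  depth = -1; // -1 marks an empty slot
        }

        final Entry[][] buckets;

        BucketTable(int numBuckets) {
            buckets = new Entry[numBuckets][BUCKET_SIZE];
            for (Entry[] bucket : buckets)
                for (int i = 0; i < BUCKET_SIZE; i++)
                    bucket[i] = new Entry();
        }

        void store(long key, int eval, int depth) {
            Entry[] bucket = buckets[(int) Math.floorMod(key, (long) buckets.length)];
            Entry victim = bucket[0];
            for (Entry e : bucket) {
                if (e.key == key || e.depth < 0) { victim = e; break; } // same position or empty slot
                if (e.depth < victim.depth) victim = e;                 // otherwise: shallowest so far
            }
            victim.key   = key;
            victim.eval  = eval;
            victim.depth = depth;
        }
    }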
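
Finally, a probe method that could be added to the BucketTable sketch, showing the standard depth rule from point 3: an entry is only used if it was searched at least as deep as currently needed. A real entry would also carry a bound type (exact/lower/upper) and a best move, which I omit here for brevity.

    // Probe for point 3: only use an entry searched at least as deep as
    // we need right now; a deeper entry (depth 9 when we need 8) is fine.
    Integer probe(long key, int neededDepth) {
        Entry[] bucket = buckets[(int) Math.floorMod(key, (long) buckets.length)];
        for (Entry e : bucket) {
            if (e.depth >= neededDepth && e.key == key)
                return e.eval; // usable hit (empty slots have depth -1 and never match)
        }
        return null; // miss: search this node as usual
    }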

And finally, a small general note: Zobrist hashing is one specific type of hash function; I think your questions are really about transposition tables in general.

  • @OP: An understanding of hash table management (which is usually independent of the actual hashing method) is important to have. Most books on algorithms or data structures cover the subject, and since they are usually tailored to a specific language, choose one that fits the programming language you use.
    – user30536
    Commented Aug 13, 2023 at 15:43
  • @koedem yeah, my questions are mainly about tables. In terms of the desired hash table size, how exactly would I calculate that? If I write an Entry class with a number of instance variables, would I just add up the memory of each instance variable (for example, say 5 ints as instance vars = 20 bytes per Entry) and then divide 8 GB by that to get the desired number of indices?
    – wdk23
    Commented Aug 14, 2023 at 0:14
  • @wdk23 Depending on the language you use, you can find out explicitly how large your entry is; in C++, for example, you can use sizeof. Alternatively, you can determine it by experiment: create an array of 10 million entries and check how much memory your program consumes. Usually you make the number of entries a power of 2, and often your Entry itself will also be a power of 2 in size (16 bytes is a typical entry size). If that is not the case and your entry is, say, 20 bytes, I'd still use a power of 2 for the number of entries, which might then give you 10 GB worth of entries.
    – koedem
    Commented Aug 15, 2023 at 11:06
