Endgame tablebase

An endgame tablebase is a computerized database that contains precalculated exhaustive analysis of chess endgame positions. It is typically used by a computer chess engine during play, or by a human or computer that is retrospectively analysing a game that has already been played.

The tablebase contains the game-theoretical value (win, loss, or draw) of each possible move in each possible position, and how many moves it would take to achieve that result with perfect play. Thus, the tablebase acts as an oracle, always providing the optimal moves. Typically the database records each possible position with certain pieces remaining on the board, and the best moves with White to move and with Black to move.

Tablebases are generated by retrograde analysis, working backwards from a checkmated position. By 2005, all chess positions with up to six pieces (including the two kings) had been solved. By August 2012, tablebases had solved chess for every position with up to seven pieces (the positions with a lone king versus a king and five pieces were omitted because they were considered uninteresting).

The solutions have profoundly advanced the chess community's understanding of endgame theory. Some positions which humans had analyzed as draws were proven to be winnable; the tablebase analysis could find a mate in more than five hundred moves, far beyond the horizon of humans, and beyond the capability of a computer during play. For this reason, tablebases have also called into question the fifty-move rule, since many positions are now known that are a win for one side but would be drawn because of that rule. Tablebases have enhanced competitive play and facilitated the composition of endgame studies. They provide a powerful analytical tool.

While endgame tablebases for other board games like checkers, chess variants or Nine Men's Morris exist, when a game is not specified, it is assumed to be chess.

Background

Physical limitations of computer hardware aside, in principle it is possible to solve any game under the condition that the complete state is known and there is no random chance. Strong solutions, i.e. algorithms that can produce perfect play from any position, are known for some simple games such as Tic Tac Toe (draw with perfect play) and Connect Four (first player wins). Weak solutions exist for somewhat more complex games, such as checkers (with perfect play on both sides the game is known to be a draw, but it is not known for every position created by less-than-perfect play what the perfect next move would be). Other games, such as chess (from the starting position) and Go, have not been solved because their game complexity is too vast for computers to evaluate all possible positions. To reduce the game complexity, researchers have modified these complex games by reducing the size of the board, or the number of pieces, or both.

Computer chess is one of the oldest domains of artificial intelligence, having begun in the early 1930s. Claude Shannon proposed formal criteria for evaluating chess moves in 1949. In 1951, Alan Turing designed a primitive chess playing program, which assigned values for material and mobility; the program "played" chess based on Turing's manual calculations. However, even as competent chess programs began to develop, they exhibited a glaring weakness in playing the endgame. Programmers added specific heuristics for the endgame - for example, the king should move to the center of the board. However, a more comprehensive solution was needed.

In 1965, Richard Bellman proposed the creation of a database to solve chess and checkers endgames using retrograde analysis. Instead of analyzing forward from the position currently on the board, the database would analyze backward from positions where one player was checkmated or stalemated. Thus, a chess computer would no longer need to analyze endgame positions during the game because they were solved beforehand. It would no longer make mistakes because the tablebase always played the best possible move.

In 1970, Thomas Ströhlein published a doctoral thesis with analysis of the following classes of endgame: KQK, KRK, KPK, KQKR, KRKB, and KRKN. In 1977, Ken Thompson's KQKR database was used in a match versus Grandmaster Walter Browne.

Ken Thompson and others helped extend tablebases to cover all four- and five-piece endgames, including in particular KBBKN, KQPKQ, and KRPKR. Lewis Stiller published a thesis with research on some six-piece tablebase endgames in 1995.

More recent contributors have included the following people:

Eugene Nalimov, after whom the popular Nalimov tablebases are named;
Eiko Bleicher, who has adapted the tablebase concept to a program called "Freezer" (see below);
Guy Haworth, an academic at the University of Reading, who has published extensively in the ICGA Journal and elsewhere;
Marc Bourzutschky and Yakov Konoval, who have collaborated to analyze endgames with seven pieces on the board;
Peter Karrer, who constructed a specialized seven-piece tablebase (KQPPKQP) for the endgame of the Kasparov versus The World online match;
Vladimir Makhnychev and Victor Zakharov from Moscow State University, who completed 4+3 DTM-tablebases (525 endings including KPPPKPP) in July 2012. The tablebases are named Lomonosov tablebases. The next set of 5+2 DTM-tablebases (350 endings including KPPPPKP) was completed during August 2012. The tablebases were generated quickly because the work ran on a supercomputer named Lomonosov (top500). The size of all tablebases up to seven-man is about 140 TB.

The tablebases of all endgames with up to six pieces are available for free download, and may also be queried using web interfaces (see the external links below). The complete Nalimov tablebases for up to six pieces require more than one terabyte of storage space.

Generating tablebases

Metrics: Depth to conversion and depth to mate

Example: DTC vs. DTM

Before creating a tablebase, a programmer must choose a metric of optimality - in other words, define the point at which a player has "won" the game. Every position can be defined by its distance (i.e. the number of moves) from the desired endpoint. Two metrics are generally used:

Depth to mate (DTM). A checkmate is the only way to win a game.
Depth to conversion (DTC). The stronger side can also win by capturing material, thus converting to a simpler endgame. For example, in KQKR, conversion occurs when White captures the Black rook.

Haworth has discussed two other metrics, namely "depth to zeroing-move" (DTZ) and "depth by the rule" (DTR). These metrics support the fifty-move rule, but DTR tablebases have not yet been computed. As of April 1, 2013, 5- and 6-man DTZ tablebases have been generated by Ronald de Man; both tablebases and generation code are available for download.

The difference between DTC and DTM can be understood by analyzing the diagram at right. How White should proceed depends on which metric is used.

Metric	Play	DTC	DTM
DTC	1. Qxd1 Kc8 2. Qd2 Kb8 3. Qd8 mate	1	3
DTM	1. Qc7+ Ka8 2. Qa7 mate	2	2

According to the DTC metric, White should capture the rook because that leads immediately to a position which will certainly win (DTC = 1), but it will take two more moves actually to checkmate (DTM = 3). In contrast, according to the DTM metric, White mates in two moves, so DTM = DTC = 2.

This difference is typical of many endgames. Usually DTC is smaller than DTM, but the DTM metric leads to the quickest checkmate. Exceptions occur where the weaker side has only a king, and in the unusual endgame of two knights versus one pawn; then DTC = DTM because either there is no defending material to capture or capturing the material does no good. (Indeed, capturing the defending pawn in the latter endgame results in a draw.)
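The choice between the two lines in the table above can be sketched in a few lines of Python. This is only an illustration: the `Line` record and the `best_move` function are hypothetical names, and the numbers are copied from the table rather than probed from a real tablebase.

```python
from typing import NamedTuple


class Line(NamedTuple):
    first_move: str
    dtc: int  # moves until conversion (capture or mate) along this line
    dtm: int  # moves until checkmate along this line


# Values taken from the DTC vs. DTM table above.
CANDIDATES = [
    Line("Qxd1", dtc=1, dtm=3),
    Line("Qc7+", dtc=2, dtm=2),
]


def best_move(lines, metric):
    """Return the first move of the line that minimises the chosen metric."""
    if metric not in ("dtc", "dtm"):
        raise ValueError("metric must be 'dtc' or 'dtm'")
    return min(lines, key=lambda line: getattr(line, metric)).first_move


print(best_move(CANDIDATES, "dtc"))
print(best_move(CANDIDATES, "dtm"))
```

A DTC tablebase would therefore recommend the capture, while a DTM tablebase would recommend the check that mates one move sooner.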

Step 1: Generating all possible positions

David Levy, How Computers Play Chess

The ten unique squares (with symmetry)

The twenty four unique pawn squares (with symmetry)

Once a metric is chosen, the first step is to generate all the positions with a given material. For example, to generate a DTM tablebase for the endgame of king and queen versus king (KQK), the computer must describe approximately 40,000 unique legal positions.

Levy and Newborn explain that the number 40,000 derives from a symmetry argument. The Black king can be placed on any of ten squares: a1, b1, c1, d1, b2, c2, d2, c3, d3, and d4 (see diagram). On any other square, its position can be considered equivalent by symmetry of rotation or reflection. Thus, there is no difference whether a Black king in a corner resides on a1, a8, h8, or h1. Multiply this number of 10 by at most 60 (legal remaining) squares for placing the White king and then by at most 62 squares for the White queen. The product is 10 × 60 × 62 = 37,200. Several hundred of these positions are illegal, impossible, or symmetrical reflections of each other, so the actual number is somewhat smaller.
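The symmetry argument can be checked with a short Python sketch. It folds every square into the a1-d1-d4 triangle by mirroring and diagonal reflection, then multiplies out the upper bound quoted above. The function name `canonical` and the coordinate convention (files and ranks numbered from 0) are assumptions made for this illustration.

```python
FILES = "abcdefgh"


def canonical(file_idx, rank_idx):
    """Map a square to its representative in the a1-d1-d4 triangle."""
    # Mirror left-right and top-bottom into the a1-d4 quadrant.
    f = min(file_idx, 7 - file_idx)
    r = min(rank_idx, 7 - rank_idx)
    # Reflect across the a1-h8 diagonal so that the rank never exceeds the file.
    if r > f:
        f, r = r, f
    return f, r


# Collect the distinct representatives, ordered by rank and then by file.
unique = sorted(
    {canonical(f, r) for f in range(8) for r in range(8)},
    key=lambda square: (square[1], square[0]),
)
names = [FILES[f] + str(r + 1) for f, r in unique]

# Upper bound for KQK: Black king squares x White king squares x queen squares.
upper_bound = len(names) * 60 * 62

print(names)
print(upper_bound)
```

The sketch only reproduces the counting argument; it does not filter out illegal positions, which is why the true figure is somewhat smaller.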

For each position, the tablebase evaluates the situation separately for White-to-move and Black-to-move. Assuming that White has the queen, almost all the positions are White wins, with checkmate forced in not more than ten moves. Some positions are draws because of stalemate or the unavoidable loss of the queen.

Each additional piece added to a pawnless endgame multiplies the number of unique positions by about a factor of sixty which is the approximate number of squares not already occupied by other pieces.

Endgames with one or more pawns increase the complexity because the symmetry argument is weakened. Since pawns can move forward but not sideways, rotation and vertical reflection of the board produce a fundamental change in the nature of the position. The best use of symmetry is achieved by limiting one pawn to the 24 squares in the rectangle a2-a7-d7-d2. All other pieces and pawns may be located in any of the 64 squares with respect to the pawn. Thus, an endgame with pawns has a complexity of 24/10 = 2.4 times a pawnless endgame with the same number of pieces.
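The pawn case can be worked through in the same way. In this Python sketch only the left-right mirror is applied, since that is the one symmetry that survives when a pawn is on the board; the variable names are illustrative.

```python
# With a pawn on the board only the left-right mirror remains valid, so the
# reference pawn can be confined to files a-d. Pawns never stand on the first
# or eighth rank, which leaves ranks 2 to 7.
pawn_squares = [file + str(rank) for file in "abcd" for rank in range(2, 8)]

pawnless_king_squares = 10  # the a1-d1-d4 triangle used for pawnless endings

ratio = len(pawn_squares) / pawnless_king_squares

print(len(pawn_squares))
print(ratio)
```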

Step 2: Evaluating positions using retrograde analysis

Tim Krabbé explains the process of generating a tablebase as follows:

"The idea is that a database is made with all possible positions with a given material [note: as in the preceding section]. Then a subdatabase is made of all positions where Black is mated. Then one where White can give mate. Then one where Black cannot stop White giving mate next move. Then one where White can always reach a position where Black cannot stop him from giving mate next move. And so on, always a ply further away from mate until all positions that are thus connected to mate have been found. Then all of these positions are linked back to mate by the shortest path through the database. That means that, apart from 'equi-optimal' moves, all the moves in such a path are perfect: White's move always leads to the quickest mate, Black's move always leads to the slowest mate."

The retrograde analysis is only necessary from the checkmated positions. Other positions need not be worked from because every position that is not reached from a checkmated position is a draw.

Figure 1 illustrates the idea of retrograde analysis. White mates in two moves with 1. Kc6, leading to the position in Figure 2. Then if 1...Kb8 2. Qb7 mate, and if 1...Kd8 2. Qd7 mate (Figure 3).

Figure 3, before White's second move, is defined as "mate in one ply." Figure 2, after White's first move, is "mate in two ply," regardless of how Black plays. Finally, the initial position in Figure 1 is "mate in three ply" (i.e., two moves) because it leads directly to Figure 2, which is already defined as "mate in two ply." This process, which links a current position to another position that could have existed one ply earlier, can continue indefinitely.

Each position is evaluated as a win or loss in a certain number of moves. At the end of the retrograde analysis, positions which are not designated as wins or losses are necessarily draws.
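The procedure can be sketched in Python on a tiny abstract game rather than on real chess positions. Everything here is an assumption made for illustration: positions are plain labels, `moves` lists the positions reachable in one ply, `mated` names the positions in which the side to move is checkmated, and values are given from the point of view of the side to move, with distances in plies.

```python
from collections import deque


def retrograde(moves, mated):
    """Label each position ('win', n), ('loss', n) or ('draw', None)."""
    predecessors = {p: [] for p in moves}
    for p, successors in moves.items():
        for s in successors:
            predecessors[s].append(p)

    # Number of successors not yet known to be wins for the opponent.
    unresolved = {p: len(successors) for p, successors in moves.items()}
    value = {p: ("loss", 0) for p in mated}
    queue = deque(mated)

    while queue:
        p = queue.popleft()
        outcome, plies = value[p]
        for q in predecessors[p]:
            if q in value:
                continue
            if outcome == "loss":
                # q can move into a position lost for the opponent.
                value[q] = ("win", plies + 1)
                queue.append(q)
            else:
                # One more move from q is known to lose; q is lost only
                # once every move has been shown to lose.
                unresolved[q] -= 1
                if unresolved[q] == 0:
                    value[q] = ("loss", plies + 1)
                    queue.append(q)

    # Positions never connected to a mate are draws.
    for p in moves:
        value.setdefault(p, ("draw", None))
    return value


# Hypothetical toy graph: "M" is a mated position, "S" a stalemate.
MOVES = {
    "M": [], "S": [],
    "A": ["M"], "B": ["A"], "C": ["B", "S"],
    "D": ["S", "A"], "E": ["A", "C"],
    "F": ["G"], "G": ["F"],
}
VALUES = retrograde(MOVES, mated=["M"])
print(VALUES)
```

Because the queue is processed in order of increasing distance, a winning position receives the shortest distance to mate and a losing position the longest, which matches the "quickest mate" and "slowest mate" rule in Krabbé's description. A real generator would also have to index chess positions and generate un-moves, which this sketch leaves out.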

Figure 1

White to move: mate in three ply (Kc6)

Figure 2

Black to move: mate in two ply (Kd8 or Kb8)

Figure 3

White to move: mate in one ply (Qd7)

Step 3: Verification

After the tablebase has been generated, and every position has been evaluated, the result must be verified independently. The purpose is to check the self-consistency of the tablebase results.

For example, in Figure 1 above, the verification program sees the evaluation "mate in three ply (Kc6)." It then looks at the position in Figure 2, after Kc6, and sees the evaluation "mate in two ply." These two evaluations are consistent with each other. If the evaluation of Figure 2 were anything else, it would be inconsistent with Figure 1, so the tablebase would need to be corrected.
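A self-consistency check of this kind might look like the following Python sketch. The position labels stand in for Figures 1 to 3 and include only the moves mentioned in the text, so the graph is a deliberately pruned, hypothetical example; the function name `inconsistencies` is likewise an assumption. Values are from the point of view of the side to move, in plies.

```python
def inconsistencies(moves, value):
    """Return the positions whose stored value disagrees with their successors."""
    problems = []
    for position, (outcome, plies) in value.items():
        children = [value[s] for s in moves[position]]
        losses = [n for o, n in children if o == "loss"]
        wins = [n for o, n in children if o == "win"]
        if outcome == "win":
            # Needs a move into a lost position; the fastest is one ply closer.
            ok = bool(losses) and min(losses) == plies - 1
        elif outcome == "loss" and plies == 0:
            ok = not children  # checkmated: no legal moves
        elif outcome == "loss":
            # Every move must reach a won position; the slowest is one ply closer.
            ok = (bool(children) and len(wins) == len(children)
                  and max(wins) == plies - 1)
        else:
            # Draw: no move reaches a lost position, and either there are no
            # moves at all (stalemate) or at least one move keeps the draw.
            keeps_draw = any(o == "draw" for o, _ in children)
            ok = not losses and (not children or keeps_draw)
        if not ok:
            problems.append(position)
    return problems


# Pruned graph for the line 1. Kc6 Kb8 (or Kd8) 2. Qb7 (or Qd7) mate.
MOVES = {
    "fig1": ["fig2"],
    "fig2": ["fig3_after_Kb8", "fig3_after_Kd8"],
    "fig3_after_Kb8": ["mated_Qb7"],
    "fig3_after_Kd8": ["mated_Qd7"],
    "mated_Qb7": [],
    "mated_Qd7": [],
}
VALUE = {
    "fig1": ("win", 3),
    "fig2": ("loss", 2),
    "fig3_after_Kb8": ("win", 1),
    "fig3_after_Kd8": ("win", 1),
    "mated_Qb7": ("loss", 0),
    "mated_Qd7": ("loss", 0),
}

print(inconsistencies(MOVES, VALUE))
# Corrupting the evaluation of Figure 2 makes both it and Figure 1 inconsistent.
print(inconsistencies(MOVES, dict(VALUE, fig2=("loss", 4))))
```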

Captures, pawn promotion, and special moves

A four-piece tablebase must rely on three-piece tablebases that could result if one piece is captured. Similarly, a tablebase containing a pawn must be able to rely on other tablebases that deal with the new set of material after pawn promotion to a queen or other piece. The retrograde analysis program must account for the possibility of a capture or pawn promotion on the previous move.
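The dependency between tablebases can be made concrete with a Python sketch that lists the endgames reachable from a material signature by a single capture or by a plain (non-capturing) promotion. Promotions that capture at the same time are omitted for brevity. The function names and the piece values, which are used only to decide which side is named first, are assumptions made for this illustration.

```python
ORDER = "QRBNP"  # conventional ordering of pieces within a signature
VALUE = {"Q": 9, "R": 5, "B": 3, "N": 3, "P": 1}  # used only for naming


def split(signature):
    """'KRPKB' -> ('RP', 'B'): the non-king pieces of each side."""
    second_king = signature.index("K", 1)
    return signature[1:second_king], signature[second_king + 1:]


def join(side_a, side_b):
    """Build a signature, naming the side with more material first."""
    def ordered(pieces):
        return "".join(sorted(pieces, key=ORDER.index))

    def strength(pieces):
        return (len(pieces), sum(VALUE[p] for p in pieces))

    first, second = ordered(side_a), ordered(side_b)
    if strength(second) > strength(first):
        first, second = second, first
    return "K" + first + "K" + second


def dependencies(signature):
    """Endgames reachable by one capture or one non-capturing promotion."""
    white, black = split(signature)
    result = set()
    for i in range(len(white)):  # a White piece is captured
        result.add(join(white[:i] + white[i + 1:], black))
    for i in range(len(black)):  # a Black piece is captured
        result.add(join(white, black[:i] + black[i + 1:]))
    for own, other in ((white, black), (black, white)):
        if "P" in own:  # one pawn promotes
            rest = own.replace("P", "", 1)
            for new_piece in "QRBN":
                result.add(join(rest + new_piece, other))
    return result


print(sorted(dependencies("KRKB")))
print(sorted(dependencies("KRPKB")))
```

For the four-piece ending KRKB the sketch yields the three-piece endings KBK and KRK, in line with the statement above that a four-piece tablebase relies on three-piece tablebases.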

Tablebases assume that castling is not possible for two reasons. First, in practical endgames, this assumption is almost always correct. (However, castling is allowed by convention in composed problems and studies.) Second, if the king and rook are on their original squares, castling may or may not be allowed. Because of this ambiguity, it would be necessary to make separate evaluations for states in which castling is or is not possible.

The same ambiguity exists for the en passant capture, since the possibility of en passant depends on the opponent's previous move. However, practical applications of en passant occur frequently in pawn endgames, so tablebases account for the possibility of en passant for positions where both sides have at least one pawn.

Using a priori information

An example of the KRP(a2)KBP(a3) endgame. White mates in 72 moves, starting with 1.Kh7! Other White moves draw.

According to the method described above, the tablebase must allow the possibility that a given piece might occupy any of the 64 squares. In some positions, it is possible to restrict the search space without affecting the result. This saves computational resources and enables searches which would otherwise be impossible.

An early analysis of this type was published in 1987, in the endgame KRP(a2)KBP(a3), where the Black bishop moves on the dark squares (see example position at right). In this position, we can make the following a priori assumptions:

1. If a piece is captured, we can look up the resulting position in the corresponding tablebase with five pieces. For example, if the Black pawn is captured, look up the newly created position in KRPKB.

2. The White pawn stays on a2; capture moves are handled by the 1st rule.

3. The Black pawn stays on a3; capture moves are handled by the 1st rule.

The result of this simplification is that, instead of searching for 48 * 47 = 2,256 permutations for the pawns' locations, there is only one permutation. Reducing the search space by a factor of 2,256 facilitates a much quicker calculation.
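The arithmetic behind this factor is short enough to spell out. In the following Python sketch the figure of 48 is the number of squares on ranks 2 to 7, where a pawn can stand; the variable names are illustrative.

```python
# A pawn can stand on ranks 2-7 of any of the eight files.
pawn_squares = 6 * 8

# Two pawns free to stand anywhere: the second has one square fewer available.
free_placements = pawn_squares * (pawn_squares - 1)

# With both pawns fixed (a2 and a3) there is a single placement.
frozen_placements = 1

reduction_factor = free_placements // frozen_placements
print(pawn_squares, free_placements, reduction_factor)
```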

Bleicher has designed a commercial program called "Freezer," which allows users to build new tablebases from existing Nalimov tablebases with a priori information. The program can produce a tablebase for positions with seven or more pieces with blocked pawns, even though tablebases for seven or more pieces are generally not available.

Applications

Correspondence chess

Kasparov vs The World, 1999

The position after 55.Qxb4; tablebases tell us White wins in 82 moves.

In correspondence chess, a player may consult a chess computer for assistance, provided that the etiquette of the competition allows this. A six-piece tablebase (KQQKQQ) was used to analyze the endgame that occurred in the correspondence game Kasparov versus The World. Players have also used tablebases to analyze endgames from over-the-board play after the game is over.

Competitive players need to know that tablebases ignore the fifty-move rule. According to that rule, if fifty moves have passed without a capture or a pawn move, either player may claim a draw. FIDE changed the rules several times, starting in 1974, to allow one hundred moves for endgames where fifty moves were insufficient to win. In 1988, FIDE allowed seventy-five moves for KBBKN, KNNKP, KQKBB, KQKNN, KRBKR, and KQPKQ with the pawn on the seventh rank, because tablebases had uncovered positions in these endgames requiring more than fifty moves to win. In 1992, FIDE canceled these exceptions and restored the fifty-move rule to its original standing. Thus a tablebase may identify a position as won or lost, when it is in fact drawn by the fifty-move rule. In 2013, ICCF changed the rules for correspondence chess tournaments starting from 2014; a player may claim a win or draw based on six-man tablebases. In this case the fifty-move rule is not applied, and the number of moves to mate is not taken into consideration.

Haworth has designed a tablebase that produces results consistent with the fifty-move rule. However most tablebases search for the theoretical limits of forced mate, even if it requires several hundred moves.

Computer chess

The knowledge contained in tablebases affords the computer a tremendous advantage in the endgame. Not only can computers play perfectly within an endgame, but they can simplify to a winning tablebase position from a more complicated endgame. For the latter purpose, some programs use "bitbases" which give the game-theoretical value of positions without the number of moves until conversion or mate - that is, they only reveal whether the position is won, lost or drawn. Sometimes even this data is compressed and the bitbase reveals only whether a position is won or not, making no distinction between a lost and a drawn game. Shredderbases, for example, used by the Shredder program, are a type of bitbase which fits all three-, four- and five-piece bitbases in 157 MB. This is a mere fraction of the 7.05 GB that the Nalimov tablebases require.

Some computer chess experts have observed practical drawbacks to the use of tablebases. In addition to ignoring the fifty-move rule, a computer in a difficult position might avoid the losing side of a tablebase ending even if the opponent cannot practically win without knowing the tablebase. The adverse effect could be a premature resignation, or an inferior line of play that loses with less resistance than play without a tablebase might offer.

Another drawback is that tablebases require a lot of memory to store the many thousands of positions. The Nalimov tablebases, which use advanced compression techniques, require 7.05 GB of hard disk space for all five-piece endings. The six-piece endings require approximately 1.2 TB. It was estimated that seven-piece tablebases would require between 50 and 200 TB of storage space; the completed Lomonosov tablebases occupy about 140 TB. Some computers play better overall if their memory is devoted instead to the ordinary search and evaluation function. Modern engines analyze far enough ahead conventionally to handle the elementary endgames without needing tablebases (i.e. without suffering from the horizon effect). It is only in more complicated endgames that tablebases will have any significant effect on an engine's performance.
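The storage saving of a bitbase comes from keeping only a few bits per position. The following Python sketch packs win/draw/loss values at two bits each. The encoding, the function names and the premise that positions have already been numbered 0, 1, 2, ... are assumptions made for illustration; they do not describe the actual Shredderbase or Nalimov formats.

```python
CODES = {"draw": 0, "win": 1, "loss": 2}
NAMES = {code: name for name, code in CODES.items()}


def pack(results):
    """Pack a list of 'win'/'draw'/'loss' values at two bits per position."""
    data = bytearray((len(results) + 3) // 4)
    for index, result in enumerate(results):
        data[index // 4] |= CODES[result] << (2 * (index % 4))
    return bytes(data)


def lookup(data, index):
    """Read back the value stored for the position with the given index."""
    return NAMES[(data[index // 4] >> (2 * (index % 4))) & 0b11]


results = ["win", "draw", "loss", "win", "win"]
packed = pack(results)
print(len(packed))
print([lookup(packed, i) for i in range(len(results))])
```

Four positions fit in each byte, whereas a distance-to-mate table typically needs at least a byte per position before compression. Dropping the distinction between a loss and a draw, as described above, would halve the size again.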

Syzygy tablebases were developed by Ronald de Man, released in April 2013, in a form optimized for use by a chess program during search. This variety consists of two tables per endgame: a smaller WDL table (win-draw-loss) which contains knowledge of the 50-move rule, and a larger DTZ table (distance to zero ply, i.e. pawn move or capture). The WDL tables were designed to be small enough to fit on a solid-state drive for quick access during search, whereas the DTZ form is for use at the root position to choose the game-theoretically quickest win instead of performing a search. Syzygy tablebases are available for all 5 piece endings and some 6 piece endings, and are now supported by many top engines, including Komodo 7, Deep Fritz 14, Houdini 4, and Stockfish 6.
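The interaction between a DTZ value and the fifty-move counter can be illustrated with a small Python sketch. It is a simplification and not de Man's probing code: the function name and parameters are hypothetical, distances are counted in plies, and only the current phase (up to the next capture or pawn move) is checked, whereas a real WDL table accounts for every phase of the winning line.

```python
def outcome_under_fifty_move_rule(wdl, dtz_plies, halfmove_clock):
    """Adjust a rule-free result for the fifty-move rule (current phase only).

    wdl            -- 'win', 'draw' or 'loss' for the side to move, ignoring the rule
    dtz_plies      -- plies until the next capture or pawn move with best play
    halfmove_clock -- plies already played since the last capture or pawn move
    """
    if wdl == "draw":
        return "draw"
    # A draw can be claimed once 100 plies pass without a zeroing move, so the
    # zeroing move has to arrive no later than the 100th ply.
    if halfmove_clock + dtz_plies > 100:
        return "draw"
    return wdl


print(outcome_under_fifty_move_rule("win", 40, 0))
print(outcome_under_fifty_move_rule("win", 40, 70))
```

In the second call the win exists in theory, but the defender can claim a draw before the next capture or pawn move is reached.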

Endgame theory

Lewis Stiller, 1991

White mates in 262 moves

In contexts where the fifty-move rule may be ignored, tablebases have answered longstanding questions about whether certain combinations of material are wins or draws. The following interesting results have emerged:

KBBKN - Bernhard Horwitz and Josef Kling (1851) proposed that Black can draw by entering a defensive fortress, but tablebases demonstrated a general win, with maximum DTC = 66 or 67 and maximum DTM = 78. (Also see pawnless chess endgame.)
KNNKP - Alexey Troitsky established this as a win for the knights if the pawn was blocked behind the Troitsky line. Analysis of the tablebases has clarified that even if the pawn has crossed the Troitsky line, White can sometimes win by forcing zugzwang. Maximum DTC = DTM = 115 moves.
KNNNNKQ - The knights win in 62.5 percent of positions, with maximum DTM = 85 moves.
KQRKQR - Despite the equality of material, the player to move wins in 67.74% of positions. The maximum DTC is 92, and the maximum DTM is 117. In both this endgame and KQQKQQ, the first player to check usually wins.
KRNKNN and KRBKNN - Friedrich Amelung had analyzed these two endgames in the 1900s. KRNKNN and KRBKNN are won for the stronger side in 78% and 95% of the cases, respectively. Stiller's DTC tablebase revealed several lengthy wins in these endgames. The longest win in KRBKNN has a DTC of 223 and a DTM of 238 moves (not shown). Even more amazing is the position at right, where White wins starting with 1. Ke6! Stiller reported the DTC as 243 moves, and the DTM was later found to be 262 moves.

For some years, this position held the record for the longest computer-generated forced mate. (Otto Blathy had composed a "mate in 292 moves" problem in 1889, albeit from an illegal starting position.) However, in May 2006, Bourzutschky and Konoval discovered a KQNKRBN position with an astonishing DTC of 517 moves. This was more than twice as long as Stiller's maximum, and almost 200 moves beyond the previous record of a 330 DTC for a position of KQBNKQB_1001. Bourzutschky wrote, "This was a big surprise for us and is a great tribute to the complexity of chess." Later, a similar position was shown to have a DTM of 545.

In August 2006, Bourzutschky released preliminary results from his analysis of the following seven-piece endgames: KQQPKQQ, KRRPKRR, and KBBPKNN.

Black to move wins in 154 moves

Many positions are winnable although at first sight they appear to be non-winnable. For example, this position is a win for Black in 154 moves (during which the white pawn is liquidated after around eighty moves).

In this position the White pawn's first move is at move 119 against optimal defense by Black.

Endgame studies

E. Pogosyants, EG 1978

White to play and win. The composer intended 1. Ne3 as a solution, but a tablebase revealed that 1. h4 also wins.

Harold van der Heijden, 2001

White to play and draw

Since many composed endgame studies deal with positions that exist in tablebases, their soundness can be checked using the tablebases. Some studies have been cooked, i.e. proved unsound, by the tablebases. That can be either because the composer's solution does not work, or else because there is an equally effective alternative that the composer did not consider. Another way tablebases cook studies is a change in the evaluation of an endgame. For instance, the endgame with a queen and bishop versus two rooks was thought to be a draw, but tablebases proved it to be a win for the queen and bishop, so almost all studies based on this endgame are unsound.

For example, Erik Pogosyants composed the study at right, with White to play and win. His intended main line was 1. Ne3 Rxh2 2. O-O-O mate! A tablebase discovered that 1. h4 also wins for White in 33 moves, even though Black can capture the pawn. (Capturing is not the best defense: Black then loses in 21 moves, while Kh1-g2 loses in 32 moves.) Incidentally, the tablebase does not recognize the composer's solution because it includes castling.

While tablebases have cooked some studies, they have assisted in the creation of other studies. Composers can search tablebases for interesting positions, such as zugzwang, using a method called data mining. For all three- to five-piece endgames and pawnless six-piece endgames, a complete list of mutual zugzwangs has been tabulated and published.
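Mining a tablebase for mutual zugzwangs amounts to comparing the value of each position with White to move against its value with Black to move. The Python sketch below does this on a toy table. The position labels and their results are hypothetical placeholders, not real tablebase data, and the sketch ignores positions that are illegal with one side to move.

```python
# Results are written from White's point of view and ranked for comparison.
SCORE = {"0-1": 0, "draw": 1, "1-0": 2}


def mutual_zugzwangs(table):
    """Return positions in which having the move is a disadvantage for both sides.

    table maps a position label to a pair
    (result with White to move, result with Black to move).
    """
    return [
        position
        for position, (white_to_move, black_to_move) in table.items()
        # White does worse when obliged to move than when Black must move;
        # seen from Black's side, the same inequality holds in reverse.
        if SCORE[white_to_move] < SCORE[black_to_move]
    ]


# Hypothetical entries for illustration only.
TABLE = {
    "pos_a": ("1-0", "1-0"),    # White wins whoever moves
    "pos_b": ("draw", "1-0"),   # whoever moves concedes half a point
    "pos_c": ("0-1", "1-0"),    # whoever moves loses
    "pos_d": ("1-0", "draw"),   # the move is an advantage here
}

print(mutual_zugzwangs(TABLE))
```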

There has been some controversy whether to allow endgame studies composed with tablebase assistance into composing tourneys. In 2003, the endgame composer and expert John Roycroft summarized the debate:

[N]ot only do opinions diverge widely, but they are frequently adhered to strongly, even vehemently: at one extreme is the view that since we can never be certain that a computer has been used it is pointless to attempt a distinction, so we should simply evaluate a 'study' on its content, without reference to its origins; at the other extreme is the view that using a 'mouse' to lift an interesting position from a ready-made computer-generated list is in no sense composing, so we should outlaw every such position.

Roycroft himself agrees with the latter approach. He continues, "One thing alone is clear to us: the distinction between classical composing and computer composing should be preserved for as long as possible: if there is a name associated with a study diagram that name is a claim of authorship."

Mark Dvoretsky, an International Master, chess trainer, and author, took a more permissive stance. He was commenting in 2006 on a study by Harold van der Heijden, published in 2001, which reached the position at right after three introductory moves. The drawing move for White is 4. Kb4!! (and not 4. Kb5), based on a mutual zugzwang that may occur three moves later.

Dvoretsky comments:

Here, we should touch on one delicate question. I am sure that this unique endgame position was discovered with the help of Thompson’s famous computer database. Is this a 'flaw,' diminishing the composer's achievement?

Yes, the computer database is an instrument, available to anyone nowadays. Out of it, no doubt, we could probably extract yet more unique positions - there are some chess composers who do so regularly. The standard for evaluation here should be the result achieved. Thus: miracles, based upon complex computer analysis rather than on their content of sharp ideas, are probably of interest only to certain aesthetes.

"Play chess with God"

On the Bell Labs website, Ken Thompson maintains a link to some of his tablebase data. The headline reads, "Play chess with God."

Regarding Stiller's long wins, Tim Krabbé struck a similar note:

A grandmaster wouldn't be better at these endgames than someone who had learned chess yesterday. It's a sort of chess that has nothing to do with chess, a chess that we could never have imagined without computers. The Stiller moves are awesome, almost scary, because you know they are the truth, God's Algorithm - it's like being revealed the Meaning of Life, but you don't understand one word.

Nomenclature

Originally, an endgame tablebase was called an "endgame data base" or "endgame database". This name appeared in both EG and the ICCA Journal starting in the 1970s, and is sometimes used today. According to Haworth, the ICCA Journal first used the word "tablebase" in connection with chess endgames in 1995. According to that source, a tablebase contains a complete set of information, but a database might lack some information.

Haworth prefers the term "Endgame Table", and has used it in the articles he has authored. Roycroft has used the term "oracle database" throughout his magazine, EG. Nonetheless, the mainstream chess community has adopted "endgame tablebase" as the most common name.

Books

John Nunn has written three books based on detailed analysis of endgame tablebases:

See also

Computer chess
EG magazine
Zobrist hashing
