The task involved creating a vectorized version of the cosine similarity in C++ to compare its performance against Python, NumPy, SciPy, and plain C++ implementations.
Below is my implementation and the average times for calculating the similarity between two random vectors of 640,000 floats. The full code can be found here.
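The timings below are averages over repeated executions. As a rough sketch of how such an average can be measured (the harness below is my illustration, with an assumed repetition count, not the post's exact benchmark code):

```python
import time

import numpy as np

EXECUTIONS = 100  # assumed repetition count, not stated in the post
SIZE = 640_000

A = np.random.rand(SIZE).astype(np.float32)
B = np.random.rand(SIZE).astype(np.float32)

accum = 0.0
for _ in range(EXECUTIONS):
    start = time.perf_counter()
    np.dot(A, B)  # the operation under test goes here
    accum += (time.perf_counter() - start) * 1000
print("%s ms" % (accum / EXECUTIONS))
```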
# Python
import math

def cosine(A, B):
    dot = denom_a = denom_b = 0.0
    for i in range(len(A)):
        dot += A[i] * B[i]
        denom_a += A[i] * A[i]
        denom_b += B[i] * B[i]
    return 1 - (dot / (math.sqrt(denom_a) * math.sqrt(denom_b)))
# Scipy
from scipy.spatial import distance

cos_sim = distance.cosine(A, B)

# Numpy
from numpy.linalg import norm

cos_sim = 1 - np.dot(A, B) / (norm(A) * norm(B))
// Plain C++
float cosine_similarity(float *A, float *B)
{
    float dot = 0.0, denom_a = 0.0, denom_b = 0.0;
    for (auto i = 0; i < SIZE; ++i)
    {
        dot += A[i] * B[i];
        denom_a += A[i] * A[i];
        denom_b += B[i] * B[i];
    }
    return dot / (sqrt(denom_a) * sqrt(denom_b));
}
// Vectorized C++
inline float simd_horizontal_sum(__m256 &r)
{
    __m128 r4 = _mm_add_ps(_mm256_castps256_ps128(r), _mm256_extractf128_ps(r, 1));
    __m128 r2 = _mm_add_ps(r4, _mm_movehl_ps(r4, r4));
    __m128 r1 = _mm_add_ss(r2, _mm_movehdup_ps(r2));
    return _mm_cvtss_f32(r1);
}

float cosine_similarity_simd(float *A, float *B)
{
    __m256 sum_dot = _mm256_setzero_ps();
    __m256 sum_A = _mm256_setzero_ps();
    __m256 sum_B = _mm256_setzero_ps();
    for (size_t i = 0; i < SIZE; i += 8)
    {
        __m256 buf1 = _mm256_loadu_ps(A + i);
        __m256 buf2 = _mm256_loadu_ps(B + i);
        sum_dot = _mm256_fmadd_ps(buf1, buf2, sum_dot);
        sum_A = _mm256_fmadd_ps(buf1, buf1, sum_A);
        sum_B = _mm256_fmadd_ps(buf2, buf2, sum_B);
    }
    float float_dot = simd_horizontal_sum(sum_dot);
    float float_A_norm = simd_horizontal_sum(sum_A);
    float float_B_norm = simd_horizontal_sum(sum_B);
    return float_dot / (sqrt(float_A_norm) * sqrt(float_B_norm));
}
| Implementation | Time (ms) |
|---|---|
| Vectorized C++ | 0.0936 |
| C++ | 0.5416 |
| SciPy | 0.5494 |
| NumPy | 0.6953 |
| Plain Python | 323.389 |
As expected, SIMD is the fastest and plain Python is frustratingly slow. Maybe we can do better with the plain Python implementation by making it more “pythonic”:
def cosine(A, B):
    dot = sum([a * b for a, b in zip(A, B)])
    denom_a = sum([x * x for x in A])
    denom_b = sum([x * x for x in B])
    return 1 - (dot / (math.sqrt(denom_a) * math.sqrt(denom_b)))
| Implementation | Time (ms) |
|---|---|
| Vectorized C++ | 0.0936 |
| C++ | 0.5416 |
| SciPy | 0.5494 |
| NumPy | 0.6953 |
| Plain Python, more pythonic | 271.683 |
| Plain Python | 323.389 |
Which results in a ~20% improvement, but still nowhere near the performance of the other implementations, so I’ll stop here and shift my focus to Python libraries.
Interestingly, Python’s libraries are comparable to plain C++ in performance, which isn’t bad in my opinion. Yes, the vectorized C++ version is an order of magnitude faster, but unless you opt to implement a processor-specific calculation in C++, the Python libraries are decently optimized for tasks like this.
However, SciPy is slightly faster than my NumPy implementation even though it is built on top of NumPy, and there shouldn’t be much magic in a cosine similarity implementation, so let me check its source here:
def correlation(u, v, w=None, centered=True):
    u = _validate_vector(u)
    v = _validate_vector(v)
    if w is not None:
        w = _validate_weights(w)
        w = w / w.sum()
    if centered:
        if w is not None:
            umu = np.dot(u, w)
            vmu = np.dot(v, w)
        else:
            umu = np.mean(u)
            vmu = np.mean(v)
        u = u - umu
        v = v - vmu
    if w is not None:
        vw = v * w
        uw = u * w
    else:
        vw, uw = v, u
    uv = np.dot(u, vw)
    uu = np.dot(u, uw)
    vv = np.dot(v, vw)
    dist = 1.0 - uv / math.sqrt(uu * vv)
    # Clip the result to avoid rounding error
    return np.clip(dist, 0.0, 2.0)


def cosine(u, v, w=None):
    return correlation(u, v, w=w, centered=False)
My initial thought was that the difference in performance was due to the denominator; I had no idea why, but that was the only difference I could see. My implementation uses the norm function from NumPy while SciPy uses two NumPy dot products and Python’s math.sqrt.
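For reference, the two denominators are mathematically identical: norm(A) is just sqrt(dot(A, A)). A quick check:

```python
import math

import numpy as np
from numpy.linalg import norm

a = np.array([3.0, 4.0])

# Both ways of computing the L2 norm give the same value
print(norm(a))                  # 5.0
print(math.sqrt(np.dot(a, a)))  # 5.0
```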
Before diving deeper into the difference, I simplified the correlation function by removing extra checks and generalizations that weren’t needed for my use case, making it easier to read and hopefully a bit faster:
def cosine(u, v):
    uv = np.dot(u, v)
    uu = np.dot(u, u)
    vv = np.dot(v, v)
    dist = 1.0 - uv / math.sqrt(uu * vv)
    return np.clip(dist, 0.0, 2.0)
Now, re-run it to ensure everything works as expected and to see if removing those checks improves performance.
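A quick sanity check I am adding here (not part of the original post): the simplified function should agree with the norm-based formula up to floating-point rounding.

```python
import math

import numpy as np
from numpy.linalg import norm


def cosine(u, v):
    uv = np.dot(u, v)
    uu = np.dot(u, u)
    vv = np.dot(v, v)
    return np.clip(1.0 - uv / math.sqrt(uu * vv), 0.0, 2.0)


rng = np.random.default_rng(0)
u = rng.random(1000)
v = rng.random(1000)

# Both formulas should agree up to floating-point rounding
assert abs(cosine(u, v) - (1 - np.dot(u, v) / (norm(u) * norm(v)))) < 1e-12
```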
| Implementation | Time (ms) |
|---|---|
| Vectorized C++ | 0.0936 |
| C++ | 0.5416 |
| SciPy | 0.5494 |
| NumPy as SciPy | 0.6714 |
| NumPy | 0.6953 |
| Plain Python, more pythonic | 271.683 |
| Plain Python | 323.389 |
And the result is ~0.7 ms, which, oh crap, is slower than the original SciPy version and almost identical to my original NumPy implementation. It turns out that I removed two important lines from the correlation function:
u = _validate_vector(u)
v = _validate_vector(v)
The name seems pretty explicit: validate the vectors, something I don’t need since I know what I am passing at runtime. But _validate_vector is implemented as:
def _validate_vector(u, dtype=None):
    u = np.asarray(u, dtype=dtype, order='c')
    if u.ndim == 1:
        return u
    raise ValueError("Input vector should be 1-D.")
This not only validates that the input is a 1 × n array but also makes the array contiguous in memory by calling np.asarray (see the order parameter in the NumPy documentation), something that can’t be guaranteed when loading a vector from disk as I am doing.
CPUs love working with contiguous data: it enables parallel computation and faster access from cache, and the underlying native libraries can take advantage of both. So the time spent making the vectors contiguous, on top of performing the calculations, is apparently worth it here, as the SciPy version ends up slightly faster.
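To see what contiguity means in practice, here is a small illustration: slicing a column of a 2-D array yields a strided view, and np.ascontiguousarray (equivalent to np.asarray with order='c') copies it into a contiguous buffer.

```python
import numpy as np

a = np.arange(12.0).reshape(3, 4)
col = a[:, 1]  # a strided view over the original buffer

print(col.flags['C_CONTIGUOUS'])  # False: elements are 4 floats apart

c = np.ascontiguousarray(col)     # copies into a dense buffer
print(c.flags['C_CONTIGUOUS'])    # True
```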
Just to be sure this is true, let’s reload the vectors, make them contiguous, and see whether my NumPy and SciPy versions get closer in execution time this way.
import math
import time

import numpy as np
from numpy.linalg import norm

file_data = np.genfromtxt('../tools/vectors.csv', delimiter=',')
A, B = np.moveaxis(file_data, 1, 0).astype('f')
A = np.asarray(A, dtype='float', order='c')
B = np.asarray(B, dtype='float', order='c')

accum = 0
for _ in range(EXECUTIONS):
    start_time = time.time()
    cos_sim = 1 - np.dot(A, B) / (norm(A) * norm(B))
    accum += (time.time() - start_time) * 1000
print(" %s ms" % (accum / EXECUTIONS))


def cosine(u, v):
    uv = np.dot(u, v)
    uu = np.dot(u, u)
    vv = np.dot(v, v)
    dist = 1.0 - uv / math.sqrt(uu * vv)
    # Clip the result to avoid rounding error
    return np.clip(dist, 0.0, 2.0)


accum = 0
for _ in range(EXECUTIONS):
    start_time = time.time()
    cos_sim = cosine(A, B)
    accum += (time.time() - start_time) * 1000
print(" %s ms" % (accum / EXECUTIONS))
| Implementation | Time (ms) |
|---|---|
| NumPy, contiguous array | 0.04193 |
| NumPy as SciPy, contiguous array | 0.04264 |
| Vectorized C++ | 0.0936 |
| SciPy | 0.549369 |
| NumPy | 0.695258 |
| C++ | 2.3839 |
| Plain Python, more pythonic | 271.683 |
| Plain Python | 323.389 |
The results show that making the arrays contiguous does improve performance significantly. Mmmmm… not only that: the NumPy versions are now faster than my SIMD implementation, twice as fast in fact.
Why? Because they are computed using the BLAS library available in the OS, which means that not even hand-written C++ SIMD code will give me a faster implementation than the one Python is using; I would probably have to write my own assembly with compiler-like tricks to go as fast as Python plus its native libraries.
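np.dot on 1-D float arrays dispatches to whatever BLAS build NumPy was linked against (OpenBLAS, MKL, Accelerate, …); np.show_config() reveals which one on a given machine:

```python
import numpy as np

# Prints the BLAS/LAPACK libraries this NumPy build is linked against
np.show_config()

a = np.asarray([1.0, 2.0, 3.0], order='c')
b = np.asarray([4.0, 5.0, 6.0], order='c')
print(np.dot(a, b))  # 32.0
```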
Honestly Python^{1} is lightning fast for my purposes and deadlines.
Where Python is the experience you get when you open VS Code and start coding—no specifics about language implementations, libraries, or interpreters. It’s all about diving into coding quickly. If you view Python differently, this might not be for you. ↩
Something I learned in CS that stuck in my head was Genetic Algorithms. I guess the reason is that GAs were one of the first (and few) applied things I saw in CS, and they seemed to me a simple, intuitive, and brilliant idea. Today I was bored at home and decided to play a little bit with them.
GAs are a search technique inspired by biological evolution and genetic mutation, used to prune parts of the search space. This is done by encoding the nodes of the space into a genetic representation and using a fitness function to evaluate them.
I started by implementing a useless but, I think, illustrative example of GAs: generating a sequence of random bits and then searching for it. Plotting it as an $n \times n$ matrix makes the process easier to visualize and debug.
import numpy as np
import matplotlib.pyplot as plt

img = np.random.randint(2, size=(15, 15))
plt.imshow(img, cmap=plt.cm.gray)
plt.show()
This is a 15x15 array, so 225 bits and therefore a space of 2^225 possible combinations. Next I define the fitness function, which is nothing more than the number of bits that have the same value in both images. In NumPy it would be:
def score(matrix1, matrix2):
    return (matrix1 == matrix2).sum()
The genetic algorithm itself (fitness function, crossover, and mutation) is implemented below, where the parameter population is the initial population size and mutations is the percentage of mutated bits.
import random

import numpy as np
import matplotlib.pyplot as plt


def ga(array, population, mutations):
    def score(matrix1, matrix2):
        return (matrix1 == matrix2).sum()

    rows = array.shape[0]
    columns = array.shape[1]
    mid = rows // 2
    mem = np.random.randint(2, size=(2 * population, rows, columns))
    scores = np.zeros((2 * population))
    bottom = list(range(len(mem)))
    for i in range(1000000):
        # When initialized, bottom contains a set of random individuals. Later it is
        # the bottom of the list of individuals sorted by score
        for k in bottom:
            scores[k] = score(mem[k], array)
        # Check if the solution has been found
        max_score = np.argmax(scores)
        if scores[max_score] == rows * columns:
            print(i)
            plt.imshow(mem[max_score], cmap=plt.cm.gray)
            plt.show()
            break
        # Select the population of individuals according to the score function
        top_n_scores = np.argpartition(scores, population)
        top = top_n_scores[population:]
        bottom = top_n_scores[:population]
        # Create #population new elements from the crossover and mutation
        for j in range(population):
            # Crossover -> Select parents from the top individuals
            #
            # I tried this with random.choice or just picking a random position from the
            # list and the next one. The result is the same for both but way faster with
            # the latter option. The reason it still works might be either the
            # randomization of the initial population or maybe the implementation of
            # argpartition? Or both?
            r = random.randrange(len(top))
            idx = [r, (r + 1) % len(top)]
            parents = [top[idx[0]], top[idx[1]]]
            mem[bottom[j]][0:mid] = mem[parents[0]][0:mid]
            mem[bottom[j]][-(mid + 1):] = mem[parents[1]][-(mid + 1):]
            # Mutation -> Mutate the bits
            #
            # The random choice of the bits to mutate is the most costly part of the
            # implementation. It seems there has to be some way to speed this up
            idx = np.random.choice([0, 1], p=[(1 - mutations), mutations], size=(rows, columns))
            mem[bottom[j]] = abs(mem[bottom[j]] - idx)
With some manual testing and trial and error I ended up with an initial population of 500 and a mutation rate of 0.5% as the fastest way to find the array of bits, which finishes in approximately 160 iterations and 3 seconds on my computer. Even if the experiment doesn’t make much sense, that seems pretty good to me considering the space has 2^225 states and it can be implemented in a few lines of code.
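The code comments note that drawing the mutation mask with np.random.choice is the costliest step. One possible speedup (a sketch I have not benchmarked against the original) is thresholding a uniform draw, which produces the same Bernoulli(p) mask, and flipping bits with XOR:

```python
import numpy as np

rng = np.random.default_rng(0)
rows, columns, mutations = 15, 15, 0.005

individual = rng.integers(0, 2, size=(rows, columns))

# Bernoulli(p) mask without building an explicit probability vector
mask = rng.random((rows, columns)) < mutations

# XOR flips exactly the masked bits, same effect as abs(individual - mask)
mutated = individual ^ mask.astype(individual.dtype)
```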
The entire implementation can be found here
Next I wanted to try something more useful: solving Mastermind with 6 colors and 4 pegs. The original game has to be solved within 12 moves.
The implementation is similar to the previous one, mostly changing the score function. To count the number of evaluated choices I memoize the score function.
My best approach is a population of 2, mutating 1 out of 4 pegs, with an average of ~36 evaluated choices, so three times more than the 12 choices of the original game. Interestingly, with only crossover and mutation, the information about pegs with the correct color but in the wrong spot does not seem to be relevant.
The entire implementation can be found here
import random

import numpy as np


def ga(array, population, mutations):
    score_memoization = {}

    def score(board, choice):
        key = tuple(choice)
        if key in score_memoization:
            return score_memoization[key]
        tmp_board = board.copy()
        in_place = 0
        in_place_list = set()
        same_color = 0
        for i in range(len(choice)):
            if choice[i] == board[i]:
                in_place += 1
                in_place_list.add(i)
                tmp_board[i] = -1
        for i, c in enumerate(choice):
            if i not in in_place_list and c in tmp_board:
                same_color += 1
        # In correct place and correct color
        # score = (2 * in_place) + same_color
        # Only in correct place
        score = in_place
        score_memoization[key] = score
        return score

    mid = 4 // 2
    mem = np.random.randint(6, size=(2 * population, 4))
    scores = np.zeros((2 * population))
    bottom = list(range(len(mem)))
    for i in range(1000000):
        # When initialized, bottom contains a set of random individuals. Later it is
        # the bottom of the list of individuals sorted by score
        for k in bottom:
            scores[k] = score(array, mem[k])
        # Check if the solution has been found: all 4 pegs in place
        max_score = np.argmax(scores)
        if scores[max_score] == 4:
            return i, len(score_memoization)
        # Select the population of individuals according to the score function
        top_n_scores = np.argpartition(scores, population)
        top = top_n_scores[population:]
        bottom = top_n_scores[:population]
        # Create #population new elements from the crossover and mutation
        for j in range(population):
            # Crossover -> Select parents from the top individuals
            #
            # I tried this with random.choice or just picking a random position from the
            # list and the next one. The result is the same for both but way faster with
            # the latter option. The reason it still works might be either the
            # randomization of the initial population or maybe the implementation of
            # argpartition? Or both?
            r = random.randrange(len(top))
            idx = [r, (r + 1) % len(top)]
            parents = [top[idx[0]], top[idx[1]]]
            mem[bottom[j]][0:mid] = mem[parents[0]][0:mid]
            mem[bottom[j]][-mid:] = mem[parents[1]][-mid:]
            # Mutation -> Mutate the pegs
            #
            # The random choice of the pegs to mutate is the most costly part of the
            # implementation. It seems there has to be some way to speed this up
            idx = np.random.choice([0, 1, 2, 3, 4, 5, 6],
                                   p=[1 - mutations] + [mutations / 6] * 6,
                                   size=(4))
            for k in range(len(idx)):
                mem[bottom[j]][k] = (mem[bottom[j]][k] + idx[k]) % 6
HNer 1:
> Most likely it has seen a similar sequence of moves in its training set.
Is this a joke making fun of the common way people dismiss other ChatGPT successes? This makes no sense with respect to chess, because every game is unique, and playing a move from a different game in a new game is nonsensical.

Me:
If it is not memorizing, how do you think it is doing it?

HNer 2:
by trying to learning the general rules that to explain the dataset and minimize its loss. That’s what machine learning is about, it’s not called machine memorizing.
Although I am not sure I completely understand what HNer 2 is saying, I think the idea is that LLMs can play games if they know the rules. And not just play, but in the case of chess, do so at a 1400 Elo level.
To me, LLMs are indeed language models that put one token after another based on what they have seen before, and that includes chess moves from games seen in the past. Knowing only the rules of chess and playing at a 1400 Elo level would mean LLMs have some embedded chess-playing function that was somehow learned from just the rules and a bunch of other crap.
Playing chess with ChatGPT was the third or fourth thing I tried once I had access to it, and my goal was to see how capable it was of retrieving moves from all the data it had ingested. I tried it a couple of times and it returned legal moves; I did not go further, but even that seemed impressive to me. Later I saw that it could play at a 1400 Elo level while still inventing moves. Its level surprised me; the invented moves did not.
To test the idea that an LLM can’t play a board game if it has not seen its moves in the past, I decided to create a new board game, give the rules to ChatGPT as context, and play against it.
Now, it is worth noting that I am not sure this is what HNer 2 meant, as they mention datasets and minimizing losses, and I guess the proper thing to do would be to retrain the model exactly as it is now, only adding the rules.
Obviously I can’t do that with ChatGPT, and I thought about trying it with one of the open LLMs, so I did some deep research to see what the internet has to say about it; this meant googling “llm prompt vs fine tuning” and clicking the first SO link. This is what came up:
It appears prompting an LLM may outperform fine tuning a smaller model on domain-specific tasks if the training data is small and vice versa if otherwise.
So I decided that my best approach was adding the rules to the prompt and seeing what ChatGPT does with that context. Although it might not be what HNer 2 meant, it still seemed worth the time.
The game goes as below. This is also going to be the input to ChatGPT (without the images):
The game is played using an 8x8 board
The cells of the board will be referred as python formatted tuples (row,column) for the 8x8 matrix with indices starting at 0
There are two players referred as black and white and each with its same set of pieces
The pieces for each player are: 8 pawns and 1 queen
Each turn one player moves one piece
White starts the game with first move
Each pawn position will be represented as a tuple (row,column) and all the pawns as a set of these tuples in python format
Queens positions will be represented as a tuple of (row, column)
A specific state of the board is represented as a tuple of 4 elements in this order (white queen position, white pawns, black queen position, black pawns)
When the game starts the board is ((7,0),{(5,0),(5,1),(5,2),(6,0),(6,1),(6,2),(7,1),(7,2)},(0,7),{(0,5),(0,6),(1,5),(1,6),(1,7),(2,5),(2,6),(2,7)})
The goal of the game is for each player to place the queen in its opponent queen cell and the first player that accomplishes the goal wins. This means that white has to place the queen in (0,7) and black in (7,0)
If none of the players has legal moves to make the game ends with a draw
White pawns can only move inside the board boundaries, can only move towards (0,7), one square at a time and only horizontally or vertically, they can’t move diagonally. This means they can only decrease the row index by 1 or increase the column index by 1 per move
As an example, all the possible legal moves for white pawns to start the game generate these new pawn positions: {(7, 1), (6, 1), (5, 1), (5, 0), (7, 2), (6, 0), (6, 3), (5, 2)}, {(6, 2), (7, 1), (6, 1), (5, 0), (7, 2), (6, 0), (4, 1), (5, 2)}, {(6, 2), (4, 0), (7, 1), (6, 1), (5, 1), (7, 2), (6, 0), (5, 2)}, {(6, 2), (7, 1), (6, 1), (5, 1), (7, 3), (5, 0), (6, 0), (5, 2)}, {(6, 2), (7, 1), (6, 1), (5, 1), (4, 2), (5, 0), (7, 2), (6, 0)}, {(6, 2), (7, 1), (6, 1), (5, 1), (5, 0), (7, 2), (6, 0), (5, 3)}
White queen can only move inside the board boundaries and towards (0,7) and only horizontally or vertically, it can’t move diagonally. This means it can only either decrease the row index or increase the column index per move.
The queen can only move one square at a time but can jump over contiguous white pawns that are next to it.
As an example, all the legal moves for white queen to start the game generate these new queen positions: (4, 0) and (7, 3)
White pieces can’t be around black queen, so if black queen is in (3,3) none of the white pieces can be at (2,2), (2,3), (2,4), (3,2), (3,4), (4,2), (4,3) or (4,4)
Black pawns can only move inside the board boundaries, can only move towards (7,0), one square at a time and only horizontally or vertically, they can’t move diagonally. This means they can only increase the row index by 1 or decrease the column index by 1 per move
As an example, all the possible new possible positions for black pawns for this board ((7, 0), {(7, 2), (6, 0), (6, 3), (5, 2)}, (0, 7), {(2, 7), (0, 6), (2, 5)}) are: {(3, 5), (2, 7), (0, 6)}, {(2, 4), (2, 7), (0, 6)}, {(3, 7), (2, 5), (0, 6)}, {(2, 5), (2, 6), (0, 6)}, {(1, 6), (2, 5), (2, 7)} and {(2, 5), (2, 7), (0, 5)}
Black queen can only move inside the board boundaries and towards (7,0) and only horizontally or vertically, it can’t move diagonally. This means it can only either increase the row index or decrease the column index per move.
The queen can only move one square at a time but can jump over contiguous white pawns that are next to it.
As an example, all the new possible positions for black queen for this board ((7, 0), {(7, 2), (6, 0), (6, 3), (5, 2)}, (0, 7), {(2, 7), (0, 6), (2, 5)}) are: (1, 7) and (0, 5)
As I don’t want to learn how to play this game myself, I created a simple Monte Carlo Tree Search engine in case ChatGPT is able to play and I have to respond. I also used the engine to see what happens when the game is played with random moves; I found it especially interesting to know the chances of a draw and of white winning, since white starts the game.
And here are the results after 50000 games:
| Black wins | White wins | Draws |
|---|---|---|
| 17440 | 17820 | 14740 |
| 34.9% | 35.6% | 29.5% |
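The percentages in the table follow directly from the raw counts:

```python
black, white, draws = 17440, 17820, 14740
total = black + white + draws

assert total == 50000
print(round(100 * black / total, 1))  # 34.9
print(round(100 * white / total, 1))  # 35.6
print(round(100 * draws / total, 1))  # 29.5
```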
It seems that with random moves, white starting does not give a huge advantage, and I found the chances of a draw pretty high. This makes me think that with similarly rated players a draw would be probable. Surprisingly to me, in chess the probability seems to be even higher (I am quite ignorant about chess).
This is the conversation (link to the conversation):
I will teach you a board game and then we will play
That sounds like a fun idea! I’m excited to learn and play a new board game with you. Please go ahead and teach me the rules of the game, and we can get started.
The game is played using an 8x8 board
The cells of the board will be referred as python formatted tuples (row,column) for the 8x8 matrix with indices starting at 0
There are two players referred as black and white and each with its same set of pieces
The pieces for each player are: 8 pawns and 1 queen
Each cell can only contain one piece
Each turn one player moves one piece
White starts the game with first move
Pawns positions will be represented as a set of tuples in python format
Queens positions will be represented as a tuple of (row, column)
A specific state of the board is represented as a tuple of 4 elements in this order (white queen position, white pawns, black queen position, black pawns)
When the game starts the board is ((7,0),{(5,0),(5,1),(5,2),(6,0),(6,1),(6,2),(7,1),(7,2)},(0,7),{(0,5),(0,6),(1,5),(1,6),(1,7),(2,5),(2,6),(2,7)})
The goal of the game is for each player to place the queen in its opponent queen cell and the first player that accomplishes the goal wins. This means that white has to place the queen in (0,7) and black in (7,0)
If none of the players has legal moves to make the game ends with a draw
White pawns can only move inside the board boundaries, can only move towards (0,7), one square at a time and only horizontally or vertically, they can’t move diagonally. This means they can only decrease the row index by 1 or increase the column index by 1 per move
As an example, all the possible legal moves for white pawns to start the game generate these new pawn positions: {(7, 1), (6, 1), (5, 1), (5, 0), (7, 2), (6, 0), (6, 3), (5, 2)}, {(6, 2), (7, 1), (6, 1), (5, 0), (7, 2), (6, 0), (4, 1), (5, 2)}, {(6, 2), (4, 0), (7, 1), (6, 1), (5, 1), (7, 2), (6, 0), (5, 2)}, {(6, 2), (7, 1), (6, 1), (5, 1), (7, 3), (5, 0), (6, 0), (5, 2)}, {(6, 2), (7, 1), (6, 1), (5, 1), (4, 2), (5, 0), (7, 2), (6, 0)}, {(6, 2), (7, 1), (6, 1), (5, 1), (5, 0), (7, 2), (6, 0), (5, 3)}
White queen can only move inside the board boundaries and towards (0,7) and only horizontally or vertically, it can’t move diagonally. This means it can only either decrease the row index or increase the column index per move.
The queen can only move one square at a time but can jump over contiguous white pawns that are next to it.
As an example, all the legal moves for white queen to start the game generate these new queen positions: (4, 0) and (7, 3)
When a white pawn moves where a black pawn is, the black pawn is dead and is no longer part of the game. As an example, with board ((7, 0), {(7, 2)}, (0, 7), {(7, 3)}) and white’s turn this ((7, 0), {(7, 3)}, (0, 7), {}) is a valid board
White pieces can’t be around black queen, so if black queen is in (3,3) none of the white pieces can be at (2,2), (2,3), (2,4), (3,2), (3,4), (4,2), (4,3) or (4,4)
Black pawns can only move inside the board boundaries, can only move towards (7,0), one square at a time and only horizontally or vertically, they can’t move diagonally. This means they can only increase the row index by 1 or decrease the column index by 1 per move
As an example, all the possible new possible positions for black pawns for this board ((7, 0), {(7, 2), (6, 0), (6, 3), (5, 2)}, (0, 7), {(2, 7), (0, 6), (2, 5)}) are: {(3, 5), (2, 7), (0, 6)}, {(2, 4), (2, 7), (0, 6)}, {(3, 7), (2, 5), (0, 6)}, {(2, 5), (2, 6), (0, 6)}, {(1, 6), (2, 5), (2, 7)} and {(2, 5), (2, 7), (0, 5)}
Black queen can only move inside the board boundaries and towards (7,0) and only horizontally or vertically, it can’t move diagonally. This means it can only either increase the row index or decrease the column index per move.
The queen can only move one square at a time but can jump over contiguous white pawns that are next to it.
As an example, all the new possible positions for black queen for this board ((7, 0), {(7, 2), (6, 0), (6, 3), (5, 2)}, (0, 7), {(2, 7), (0, 6), (2, 5)}) are: (1, 7) and (0, 5)
When a black pawn moves where a white pawn is, the white pawn is dead and is no longer part of the game. As an example, with board ((7, 0), {(7, 2)}, (0, 7), {(7, 3)}) and black’s turn this ((7, 0), {}, (0, 7), {(7, 2)}) is a valid board
Black pieces can’t be around white queen, so if white queen is in (3,3) none of the black pieces can be at (2,2), (2,3), (2,4), (3,2), (3,4), (4,2), (4,3) or (4,4)
Thank you for explaining the rules of the game. It seems like a strategic and intriguing board game. I understand that the goal of the game is for each player to place their queen in their opponent’s queen cell. The players take turns moving their pieces, with white starting first.
The board is an 8x8 grid, and each cell can only contain one piece. There are two players: black and white. Each player has 8 pawns and 1 queen. The positions of the pawns and queens are represented using tuples and sets in Python.
White pawns can move one square at a time, either horizontally or vertically, towards the cell (0,7). They cannot move diagonally. White pawns cannot be around the black queen.
White queen can also move one square at a time, either horizontally or vertically, towards the cell (0,7). The queen can jump over contiguous white pawns that are next to it.
Black pawns can move one square at a time, either horizontally or vertically, towards the cell (7,0). They cannot move diagonally. Black pawns cannot be around the white queen.
Black queen can also move one square at a time, either horizontally or vertically, towards the cell (7,0). The queen can jump over contiguous white pawns that are next to it.
When a pawn moves to a cell occupied by an opponent’s pawn, the opponent’s pawn is eliminated from the game.
The game ends in a draw if none of the players have legal moves.
Now that I have a good understanding of the rules, I’m ready to play the game with you. Shall we start with the initial board state you provided, and you can make the first move as white?
You are white, so you start the game
Alright, let’s begin the game with me playing as the white player. Here is the initial board state:
Board: ((7, 0), {(5, 0), (5, 1), (5, 2), (6, 0), (6, 1), (6, 2), (7, 1), (7, 2)}, (0, 7), {(0, 5), (0, 6), (1, 5), (1, 6), (1, 7), (2, 5), (2, 6), (2, 7)})
Your turn as black will come after I make my move. Please provide your move instructions in the format “piece to move: current position -> new position”.
I’ll make my move as white. I’ll start by moving one of my pawns. Let’s say I choose to move the pawn at position (7, 1) to (6, 1):
White pawn move: (7, 1) -> (6, 1)
Here’s the updated board state:
Board: ((7, 0), {(5, 0), (5, 1), (5, 2), (6, 0), (6, 2), (6, 1), (7, 1), (7, 2)}, (0, 7), {(0, 5), (0, 6), (1, 5), (1, 6), (1, 7), (2, 5), (2, 6), (2, 7)})
Now it’s your turn as black. Please provide your move instructions for the black pieces.
That is not a legal move, there already was one of your pawns in (6,1)
Apologies for the mistake. Thank you for pointing that out. Let me choose a different move.
I’ll move the white pawn at position (6, 2) to (5, 2):
White pawn move: (6, 2) -> (5, 2)
Here’s the updated board state:
Board: ((7, 0), {(5, 0), (5, 1), (5, 2), (6, 0), (6, 1), (5, 2), (7, 1), (7, 2)}, (0, 7), {(0, 5), (0, 6), (1, 5), (1, 6), (1, 7), (2, 5), (2, 6), (2, 7)})
Now it’s your turn as black. Please provide your move instructions for the black pieces.
That is not a legal move either
Apologies again for the oversight. Let me correct my move.
I’ll move the white pawn at position (7, 2) to (6, 2):
White pawn move: (7, 2) -> (6, 2)
Here’s the updated board state:
Board: ((7, 0), {(5, 0), (5, 1), (5, 2), (6, 0), (6, 1), (6, 2), (6, 2), (7, 1)}, (0, 7), {(0, 5), (0, 6), (1, 5), (1, 6), (1, 7), (2, 5), (2, 6), (2, 7)})
Now it’s your turn as black. Please provide your move instructions for the black pieces.
That is not a legal move either, you are moving pawns where you already have pieces
Apologies once again for the error. Let me rectify my move.
I’ll move the white pawn at position (6, 1) to (5, 1):
White pawn move: (6, 1) -> (5, 1)
Here’s the updated board state:
Board: ((7, 0), {(5, 0), (5, 1), (5, 2), (6, 0), (5, 1), (6, 2), (7, 1), (7, 2)}, (0, 7), {(0, 5), (0, 6), (1, 5), (1, 6), (1, 7), (2, 5), (2, 6), (2, 7)})
Now it’s your turn as black. Please provide your move instructions for the black pieces.
That is not a legal move either
Apologies for the mistake once again. Thank you for your patience. Let me correct my move.
I’ll move the white pawn at position (7, 1) to (6, 1):
White pawn move: (7, 1) -> (6, 1)
Here’s the updated board state:
Board: ((7, 0), {(5, 0), (5, 1), (5, 2), (6, 0), (6, 1), (6, 2), (7, 1), (7, 2)}, (0, 7), {(0, 5), (0, 6), (1, 5), (1, 6), (1, 7), (2, 5), (2, 6), (2, 7)})
Now it’s your turn as black. Please provide your move instructions for the black pieces.
It has not been able to make a single legal first move for white, even though the rules include an example listing all the possible moves white can make with its pawns to start the game.
I am not sure this is that surprising from a model that gets confused when seeing someone counting numbers. Link