feat: finish ps3
parent
7cdb61b354
commit
3287079e24
BIN
cs2109s/labs/ps3/imgs/black_heuristic.png
Executable file
Binary file not shown.
After Width: | Height: | Size: 48 KiB |
BIN
cs2109s/labs/ps3/imgs/breakthrough_board.png
Executable file
Binary file not shown.
After Width: | Height: | Size: 8.4 KiB |
BIN
cs2109s/labs/ps3/imgs/game_move.png
Executable file
Binary file not shown.
After Width: | Height: | Size: 7.6 KiB |
BIN
cs2109s/labs/ps3/imgs/invert_board.png
Executable file
Binary file not shown.
After Width: | Height: | Size: 17 KiB |
BIN
cs2109s/labs/ps3/imgs/white_heuristic.png
Executable file
Binary file not shown.
After Width: | Height: | Size: 50 KiB |
854
cs2109s/labs/ps3/ps3.ipynb
Normal file
@ -0,0 +1,854 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Problem Set 3: Minimax & Alpha-beta Pruning\n",
|
||||
"\n",
|
||||
"**Release Date:** 6 February 2024\n",
|
||||
"\n",
|
||||
"**Due Date:** 23:59, 21 February 2024"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Overview\n",
|
||||
"\n",
|
||||
"In class, we discussed a number of search algorithms to implement a two-player game playing agent. In this problem set, we get some hands-on practice by coding an AI to play the game **Breakthrough**.\n",
|
||||
"\n",
|
||||
"Breakthrough was the winner of the 2001 8 × 8 Game Design Competition, sponsored by *About.com* and *Abstract Games Magazine*. When Dan Troyka formulated it, it was originally for a 7×7 board. We’re going to play it on a 6×6 board to limit the complexity. In terms of our terminology for the agent environment, Breakthrough is a fully observable, strategic, deterministic game. The game always results in a win for one of the two players.\n",
|
||||
"\n",
|
||||
"How exactly do you design an agent to play this game and, most importantly, win? An agent takes sensory input and reasons about it, and then outputs an action at each time step. You thus need to create a program that can read in a representation of the board (that’s the input) and output a legal move in Breakthrough. You then need an evaluation function to evaluate how good a position is to your agent. The better your evaluation function, the better your agent will be at picking good moves.\n",
|
||||
"\n",
"Aside from the evaluation function, you also need to decide on a strategy for exploring the search space. In this problem set, you will first implement a minimax agent and then augment it with alpha-beta pruning. Additionally, you will be given a limited amount of time to make each move (for the contest), so you must devise a strategy for selecting the best move found so far once the allocated search time has expired.\n",
|
||||
"\n",
|
||||
"Required Files:\n",
|
||||
"\n",
|
||||
"* utils.py\n",
|
||||
"* ps3.py\n",
|
||||
"\n",
|
||||
"**Honour Code**: Note that plagiarism will not be condoned! You may discuss with your classmates and check the internet for references, but you MUST NOT submit code/report that is copied directly from other sources!\n",
|
||||
"\n",
|
||||
"**IMPORTANT**: While it is possible to write and run Python code directly in Jupyter notebook, we recommend that you do this Problem Set with an IDE using the `.py` file provided. An IDE will make debugging significantly easier."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Breakthrough Technical Description\n",
|
||||
"\n",
|
||||
"<pre>\n",
|
||||
"<p style=\"text-align: center;\">\n",
|
||||
"<img src=\"imgs/breakthrough_board.png\">\n",
|
||||
"Figure 1. Game Board\n",
|
||||
"</p>\n",
|
||||
"</pre>\n",
|
||||
"\n",
|
||||
"Figure 1 shows our typical game board. Black (**B**) wins by moving one piece to the opposite side, row index 5. White (**W**) wins by moving one piece to row index 0. A side also wins if their opponent has no pieces left. Kindly **follow the same indexing as provided in *Figure 1*, and write code only for moving black**. A simple board inversion will make black’s code work seamlessly for white as well.\n",
|
||||
"\n",
|
||||
"<pre>\n",
|
||||
"<p style=\"text-align: center;\">\n",
|
||||
"<img src=\"imgs/invert_board.png\">\n",
|
||||
"Figure 2. Board Inversion Illustration\n",
|
||||
"</p>\n",
|
||||
"</pre>\n",
|
||||
"\n",
|
||||
"Pieces move one space directly forward or diagonally forward, and only capture diagonally forward. The possible moves have been illustrated in *Figure 3*. In this figure, the black pawn at (3, 2) can go to any of the three spaces indicated forward. The black pawn at (0, 4) can either choose to move by going diagonally right or capture by going diagonally left. It cannot move or capture by moving forward; its forward move is blocked by the white pawn. Note that your move is not allowed to take your pawn outside the board.\n",
|
||||
"\n",
|
||||
"<pre>\n",
|
||||
"<p style=\"text-align: center;\">\n",
|
||||
"<img src=\"imgs/game_move.png\">\n",
|
||||
"Figure 3. Possible Moves\n",
|
||||
"</p>\n",
|
||||
"</pre>\n",
|
||||
"\n",
"Your program will always play **black**, whose objective is to move a black pawn to row index 5. Given a move request, your agent should output a pair of coordinates using the coordinate system shown in the figure. For example, to move the black pawn standing at (0, 4) in *Figure 3* to (1, 3), your agent should return two 2-tuples: (0, 4) and (1, 3).\n",
"\n",
"You will implement some basic components of the agent. Afterward, you can further improve your agent with your own design to compete with agents created by your fellow students in a contest."
|
||||
]
|
||||
},
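{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough illustration of the movement rules described above, the sketch below lists the squares a single black pawn could move to. It is a minimal sketch and not the reference solution for any task: `pawn_destinations` is a hypothetical helper name, and the board is assumed to be a 6×6 list of lists holding `'B'`, `'W'`, and `'_'`.\n",
"\n",
"```python\n",
"def pawn_destinations(board, r, c):\n",
"    # Hypothetical helper: destinations for the black pawn at (r, c).\n",
"    nr = r + 1  # black advances towards higher row indices\n",
"    if nr >= len(board):\n",
"        return []\n",
"    dests = []\n",
"    # Straight ahead is allowed only onto an empty cell (no forward capture).\n",
"    if board[nr][c] == '_':\n",
"        dests.append((nr, c))\n",
"    # Diagonals may be empty or hold a white pawn (a capture), but not a black pawn.\n",
"    for nc in (c - 1, c + 1):\n",
"        if 0 <= nc < len(board[nr]) and board[nr][nc] != 'B':\n",
"            dests.append((nr, nc))\n",
"    return dests\n",
"```\n",
"\n",
"For the pawn at (0, 4) in *Figure 3*, this would yield the diagonal-left capture and the diagonal-right move, but not the blocked forward move."
]
},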
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Provided Utility Functions\n",
|
||||
"\n",
"You can use the functions provided in the *utils.py* file as you see fit. These functions are mainly used by the game-playing framework to facilitate the two-player game. A short description of these functions is given below:\n",
|
||||
"\n",
|
||||
"- `generate_init_state()`: It generates the initial state (*Game Board in Figure 1*) at the start of the game.\n",
|
||||
"- `print_state(board)`: It takes in the board 2D list as parameter and prints out the current state of the board in a convenient way (sample shown in *Possible Moves in Figure 3*).\n",
|
||||
"- `is_game_over(board)`: Given a board configuration, it returns `True` if the game is over, `False` otherwise.\n",
|
||||
"- `is_valid_move(board, src, dst)`: It takes in the board configuration and the move source and move destination as its parameters. It returns `True` if the move is valid and returns `False` if the move is invalid.\n",
"- `state_change(curr_board, src, dst, in_place=True)`: Given a board configuration and a move source and move destination, this function changes the board configuration in accordance with the indicated move. This function updates the board configuration by modifying existing values if `in_place` is set to `True`, or by creating a new board with updated values if `in_place` is set to `False`.\n",
"- `invert_board(curr_board, in_place=True)`: It takes in the board 2D list as parameter and returns the inverted board. You should always code for black, not for white. The game playing agent has to make moves for both black and white using only black’s code. So, when it is time for white to make its move, we invert the board using this function to see everything from white’s perspective (done by inverting the colors of each pawn and by modifying the row indices). An example of inversion is shown in *Figure 2 Board Inversion Illustration*. In your minimax algorithm, you need to consider black and white alternately. Instead of writing the same code twice, once for black and once for white, you can use the `invert_board()` function to invert your board configuration, which lets you reuse black’s code for white pawns as well. This function inverts the board by modifying existing values if `in_place` is set to `True`, or by creating a new board with updated values if `in_place` is set to `False`.\n",
|
||||
"- `generate_rand_move(board)`: It takes in the board configuration as its parameter and generates an arbitrary valid move. You likely won’t need to use this function. This function is used by the game playing framework in one of two cases - (1) an invalid move has been made by the game playing agent or (2) the game playing agent has taken more than 3 seconds to make its move.\n",
|
||||
"\n",
|
||||
"Other functions are used to play the game or test your solution - you don't need to use those functions."
|
||||
]
|
||||
},
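{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the inversion idea concrete, here is one way such a function could look. This is a sketch for illustration only, not the code in `utils.py`: it assumes that inversion means reversing the row order and swapping `B` with `W`, as the description above suggests, and `invert_copy` is a hypothetical name.\n",
"\n",
"```python\n",
"def invert_copy(board):\n",
"    # Assumption: inversion = reverse the rows and swap the pawn colours.\n",
"    swap = {'B': 'W', 'W': 'B', '_': '_'}\n",
"    # Build a new board, leaving the input untouched (like in_place=False).\n",
"    return [[swap[cell] for cell in row] for row in reversed(board)]\n",
"```\n",
"\n",
"Under that assumption, inverting twice gives back the original position, which is why a search can invert the board before and after handling white's turn."
]
},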
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"\"\"\"\n",
|
||||
"Run this cell before you start!\n",
|
||||
"\"\"\"\n",
|
||||
"import utils\n",
|
||||
"from typing import Union\n",
|
||||
"\n",
|
||||
"Score = Union[int, float]\n",
|
||||
"Move = tuple[tuple[int, int], tuple[int, int]]\n",
|
||||
"Board = list[list[str]]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"To build your own agent, you will need a heuristic function to evaluate a position. One sample heuristic function is provided below."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# remember, we are black\n",
"def evaluate(board: Board) -> Score:\n",
"    \"\"\"\n",
"    Returns the score of the current position.\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    board: 2D list of lists. Contains characters \"B\", \"W\", and \"_\",\n",
"    representing black pawn, white pawn, and empty cell, respectively.\n",
"\n",
"    Returns\n",
"    -------\n",
"    An evaluation (as a Score).\n",
"    \"\"\"\n",
"    bcount = 0\n",
"    wcount = 0\n",
"    for r, row in enumerate(board):\n",
"        for tile in row:\n",
"            if tile == \"B\":\n",
"                if r == 5:\n",
"                    return utils.WIN\n",
"                bcount += 1\n",
"            elif tile == \"W\":\n",
"                if r == 0:\n",
"                    return -utils.WIN\n",
"                wcount += 1\n",
"    if wcount == 0:\n",
"        return utils.WIN\n",
"    if bcount == 0:\n",
"        return -utils.WIN\n",
"    return bcount - wcount"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"The provided heuristic function returns `utils.WIN` if black wins, and `-utils.WIN` if white wins. Otherwise, it takes the difference between the number of black pieces and the number of white pieces that are on the board. \n",
|
||||
"\n",
|
||||
"**Note**: On Coursemology, we will provide and use this heuristic function to test your code in task 2 and task 3."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Task 1.1: Implement a function to generate all valid moves\n",
|
||||
"\n",
|
||||
"It is useful to generate all the possible moves that black can make in a certain position. You may need this function when implementing the minimax algorithm.\n",
|
||||
"\n",
|
||||
"**Note**: On Coursemology, we will provide you with the correct implementation of `generate_valid_moves` in task 2 and task 3."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"def generate_valid_moves(board: Board) -> list[Move]:\n",
|
||||
" \"\"\"\n",
|
||||
" Generates a list containing all possible moves in a particular position for black.\n",
|
||||
"\n",
|
||||
" Parameters\n",
|
||||
" ----------\n",
|
||||
" board: 2D list of lists. Contains characters \"B\", \"W\", and \"_\",\n",
|
||||
" representing black pawn, white pawn, and empty cell, respectively.\n",
|
||||
"\n",
|
||||
" Returns\n",
|
||||
" -------\n",
|
||||
" A list of Moves.\n",
|
||||
" \"\"\"\n",
|
||||
" # TODO: Replace this with your own implementation\n",
|
||||
" raise NotImplementedError"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Note that order of the moves might be different\n",
|
||||
"\n",
|
||||
"board1 = [\n",
|
||||
" [\"_\", \"_\", \"B\", \"_\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"_\", \"_\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"_\", \"_\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"_\", \"_\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"_\", \"_\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"_\", \"_\", \"_\", \"_\"]\n",
|
||||
"]\n",
|
||||
"assert sorted(generate_valid_moves(board1)) == [((0, 2), (1, 1)), ((0, 2), (1, 2)), ((0, 2), (1, 3))], \"board1 test output is incorrect\"\n",
|
||||
"\n",
|
||||
"board2 = [\n",
|
||||
" [\"_\", \"_\", \"_\", \"_\", \"B\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"_\", \"_\", \"_\", \"B\"],\n",
|
||||
" [\"_\", \"W\", \"_\", \"_\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"W\", \"_\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"_\", \"_\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"B\", \"_\", \"_\", \"_\"]\n",
|
||||
"]\n",
|
||||
"assert sorted(generate_valid_moves(board2)) == [((0, 4), (1, 3)), ((0, 4), (1, 4)), ((1, 5), (2, 4)), ((1, 5), (2, 5))], \"board2 test output is incorrect\"\n",
|
||||
"\n",
|
||||
"board3 = [\n",
|
||||
" [\"_\", \"_\", \"_\", \"_\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"B\", \"_\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"W\", \"W\", \"W\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"_\", \"_\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"_\", \"_\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"_\", \"_\", \"_\", \"_\"]\n",
|
||||
"]\n",
|
||||
"assert sorted(generate_valid_moves(board3)) == [((1, 2), (2, 1)), ((1, 2), (2, 3))], \"board3 test output is incorrect\""
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Minimax Algorithm\n",
|
||||
"\n",
|
||||
"Your agent must be able to calculate the game state a few moves in advance, by implementing the **minimax** algorithm."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Task 2.1: Implement minimax\n",
|
||||
"\n",
|
||||
"In the lecture, you have seen the minimax algorithm without and with cutoff. We will implement a minimax algorithm with cutoff in this case, as the depth of the game and the branching factor in certain positions can be (very) large and we do not have the computational power to compute the entire game until the terminal states.\n",
|
||||
"\n",
|
||||
"Your minimax function should explore different game states, until either the depth is `max_depth`, or there is a winner. In these cases, your minimax algorithm should use the provided heuristic function to evaluate the position.\n",
|
||||
"\n",
|
||||
"You can reuse `generate_valid_moves` and `evaluate` to handle the white side if you use `invert_board` from `utils.py`. If you choose to do this, remember to invert the board again when you need to handle the black side.\n",
|
||||
"\n",
"**Note**: For tasks 2.1 to 3.2, if you are certain that your solution is correct but the test cases fail on Coursemology due to a timeout, just rerun your code. Depending on the load on Coursemology, a correct solution might still time out."
|
||||
]
|
||||
},
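{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cutoff idea can be sketched independently of Breakthrough. The version below is a generic outline under stated assumptions, not the expected solution: `moves`, `result`, `score`, and `is_over` are hypothetical callables supplied by the caller, and every non-terminal state is assumed to have at least one move.\n",
"\n",
"```python\n",
"def minimax_sketch(state, depth, max_depth, maximizing, moves, result, score, is_over):\n",
"    # Stop at the cutoff depth or at a finished game and fall back on the heuristic.\n",
"    if depth == max_depth or is_over(state):\n",
"        return score(state), None\n",
"    best = float('-inf') if maximizing else float('inf')\n",
"    best_move = None\n",
"    for move in moves(state):\n",
"        value, _ = minimax_sketch(result(state, move), depth + 1, max_depth,\n",
"                                  not maximizing, moves, result, score, is_over)\n",
"        # The maximizer keeps the largest value, the minimizer the smallest.\n",
"        if (maximizing and value > best) or (not maximizing and value < best):\n",
"            best, best_move = value, move\n",
"    return best, best_move\n",
"```\n",
"\n",
"In this notebook the roles of these callables would be played by `generate_valid_moves`, `utils.state_change`, `evaluate`, and `utils.is_game_over`, with board inversion handling the switch between black and white."
]
},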
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"def minimax(\n",
|
||||
" board: Board, \n",
|
||||
" depth: int, \n",
|
||||
" max_depth: int, \n",
|
||||
" is_black: bool\n",
|
||||
" ) -> tuple[Score, Move]:\n",
|
||||
" \"\"\"\n",
|
||||
" Finds the best move for the input board state.\n",
|
||||
" Note that you are black.\n",
|
||||
"\n",
|
||||
" Parameters\n",
|
||||
" ----------\n",
|
||||
" board: 2D list of lists. Contains characters \"B\", \"W\", and \"_\",\n",
|
||||
" representing black pawn, white pawn, and empty cell, respectively.\n",
|
||||
" \n",
|
||||
" depth: int, the depth to search for the best move. When this is equal\n",
|
||||
" to `max_depth`, you should get the evaluation of the position using \n",
|
||||
" the provided heuristic function.\n",
|
||||
"\n",
|
||||
" max_depth: int, the maximum depth for cutoff.\n",
|
||||
" \n",
|
||||
" is_black: bool. True when finding the best move for black, False \n",
|
||||
" otherwise.\n",
|
||||
"\n",
|
||||
" Returns\n",
|
||||
" -------\n",
" A tuple (evaluation, ((src_row, src_col), (dst_row, dst_col))):\n",
|
||||
" evaluation: the best score that black can achieve after this move.\n",
|
||||
" src_row, src_col: position of the pawn to move.\n",
|
||||
" dst_row, dst_col: position to move the pawn to.\n",
|
||||
" \"\"\"\n",
" # TODO: replace with your own implementation\n",
|
||||
" raise NotImplementedError"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
"# Several moves can be equally good, so the tests ignore the returned move (bound to _)\n",
|
||||
"\n",
|
||||
"board1 = [\n",
|
||||
" list(\"______\"),\n",
|
||||
" list(\"___B__\"),\n",
|
||||
" list(\"____BB\"),\n",
|
||||
" list(\"___WB_\"),\n",
|
||||
" list(\"_B__WW\"),\n",
|
||||
" list(\"_WW___\"),\n",
|
||||
"]\n",
|
||||
"score1, _ = minimax(board1, 0, 1, True)\n",
|
||||
"assert score1 == utils.WIN, \"black should win in 1\"\n",
|
||||
"\n",
|
||||
"board2 = [\n",
|
||||
" list(\"______\"),\n",
|
||||
" list(\"___B__\"),\n",
|
||||
" list(\"____BB\"),\n",
|
||||
" list(\"_BW_B_\"),\n",
|
||||
" list(\"____WW\"),\n",
|
||||
" list(\"_WW___\"),\n",
|
||||
"]\n",
|
||||
"score2, _ = minimax(board2, 0, 3, True)\n",
|
||||
"assert score2 == utils.WIN, \"black should win in 3\"\n",
|
||||
"\n",
|
||||
"board3 = [\n",
|
||||
" list(\"______\"),\n",
|
||||
" list(\"__B___\"),\n",
|
||||
" list(\"_WWW__\"),\n",
|
||||
" list(\"______\"),\n",
|
||||
" list(\"______\"),\n",
|
||||
" list(\"______\"),\n",
|
||||
"]\n",
|
||||
"score3, _ = minimax(board3, 0, 4, True)\n",
|
||||
"assert score3 == -utils.WIN, \"white should win in 4\""
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Task 2.2: Implement negamax\n",
|
||||
"\n",
"You may notice that Breakthrough is a zero-sum game. This means that the evaluation scores of the two players should sum to zero.\n",
|
||||
"\n",
|
||||
"For example, we can consider a position in which there are _9 black pawns_ and _6 white pawns_. Using the sample heuristic function given at the start:\n",
|
||||
"\n",
|
||||
"- The evaluation score of black in this position is `+3`.\n",
|
||||
"- The evaluation score of white in this position is `-3`.\n",
|
||||
"\n",
|
||||
"Using this property, we can simplify the implementation of minimax. Instead of taking the maximum and minimum scores for black and white respectively, we can negate the score of the opposite player and take the maximum score. This version is called **negamax**."
|
||||
]
|
||||
},
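{
"cell_type": "markdown",
"metadata": {},
"source": [
"The simplification rests on the identity $\\max(a, b) = -\\min(-a, -b)$: minimizing black's score is the same as maximizing white's. A generic sketch follows. It is an illustration rather than the expected solution: `moves`, `result`, `is_over`, and `score_for_mover` are hypothetical callables supplied by the caller, and `score_for_mover` is assumed to evaluate a state from the point of view of the side to move.\n",
"\n",
"```python\n",
"def negamax_sketch(state, depth, max_depth, moves, result, score_for_mover, is_over):\n",
"    if depth == max_depth or is_over(state):\n",
"        return score_for_mover(state), None\n",
"    best, best_move = float('-inf'), None\n",
"    for move in moves(state):\n",
"        value, _ = negamax_sketch(result(state, move), depth + 1, max_depth,\n",
"                                  moves, result, score_for_mover, is_over)\n",
"        value = -value  # the opponent's gain is the mover's loss\n",
"        if value > best:\n",
"            best, best_move = value, move\n",
"    return best, best_move\n",
"```\n",
"\n",
"Every level now takes a maximum, so the separate branch for the minimizing player disappears."
]
},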
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"def negamax(\n",
|
||||
" board: Board,\n",
|
||||
" depth: int, \n",
|
||||
" max_depth: int\n",
|
||||
" ) -> tuple[Score, Move]:\n",
|
||||
" \"\"\"\n",
|
||||
" Finds the best move for the input board state.\n",
|
||||
" Note that you are black.\n",
|
||||
"\n",
|
||||
" Parameters\n",
|
||||
" ----------\n",
|
||||
" board: 2D list of lists. Contains characters \"B\", \"W\", and \"_\",\n",
|
||||
" representing black pawn, white pawn, and empty cell, respectively.\n",
|
||||
" \n",
|
||||
" depth: int, the depth to search for the best move. When this is equal\n",
|
||||
" to `max_depth`, you should get the evaluation of the position using \n",
|
||||
" the provided heuristic function.\n",
|
||||
"\n",
|
||||
" max_depth: int, the maximum depth for cutoff.\n",
|
||||
"\n",
|
||||
" Notice that you no longer need the parameter `is_black`.\n",
|
||||
"\n",
|
||||
" Returns\n",
|
||||
" -------\n",
" A tuple (evaluation, ((src_row, src_col), (dst_row, dst_col))):\n",
|
||||
" evaluation: the best score that black can achieve after this move.\n",
|
||||
" src_row, src_col: position of the pawn to move.\n",
|
||||
" dst_row, dst_col: position to move the pawn to.\n",
|
||||
" \"\"\"\n",
|
||||
" # TODO: replace with your own implementation\n",
|
||||
" raise NotImplementedError"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
"# Several moves can be equally good, so the tests ignore the returned move (bound to _)\n",
|
||||
"\n",
|
||||
"board1 = [\n",
|
||||
" list(\"______\"),\n",
|
||||
" list(\"___B__\"),\n",
|
||||
" list(\"____BB\"),\n",
|
||||
" list(\"___WB_\"),\n",
|
||||
" list(\"_B__WW\"),\n",
|
||||
" list(\"_WW___\"),\n",
|
||||
"]\n",
|
||||
"score1, _ = negamax(board1, 0, 1)\n",
|
||||
"assert score1 == utils.WIN, \"black should win in 1\"\n",
|
||||
"\n",
|
||||
"board2 = [\n",
|
||||
" list(\"______\"),\n",
|
||||
" list(\"___B__\"),\n",
|
||||
" list(\"____BB\"),\n",
|
||||
" list(\"_BW_B_\"),\n",
|
||||
" list(\"____WW\"),\n",
|
||||
" list(\"_WW___\"),\n",
|
||||
"]\n",
|
||||
"score2, _ = negamax(board2, 0, 3)\n",
|
||||
"assert score2 == utils.WIN, \"black should win in 3\"\n",
|
||||
"\n",
|
||||
"board3 = [\n",
|
||||
" list(\"______\"),\n",
|
||||
" list(\"__B___\"),\n",
|
||||
" list(\"_WWW__\"),\n",
|
||||
" list(\"______\"),\n",
|
||||
" list(\"______\"),\n",
|
||||
" list(\"______\"),\n",
|
||||
"]\n",
|
||||
"score3, _ = negamax(board3, 0, 4)\n",
|
||||
"assert score3 == -utils.WIN, \"white should win in 4\""
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"If you implement negamax correctly, the code should be much more elegant compared to minimax."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Alpha-beta Pruning\n",
|
||||
"\n",
|
||||
"With minimax (or negamax), our agent can see the future within a few moves. However, the naive implementation of minimax (or negamax) may explore many redundant states, which slows down our agent. As discussed in the lecture, we can apply **alpha-beta pruning** to eliminate unnecessary states, thereby improving our agent's speed and its ability to see even further into the future. This will increase our agent's strength and its likelihood of winning the game. \n",
|
||||
"\n",
|
||||
"First, you should try to integrate alpha-beta pruning with the standard minimax algorithm."
|
||||
]
|
||||
},
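{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a reminder of how the pruning works, here is a generic sketch of minimax with an alpha-beta window. It is written under assumptions and is not the expected solution: `moves`, `result`, `score`, and `is_over` are hypothetical callables supplied by the caller.\n",
"\n",
"```python\n",
"def alphabeta_sketch(state, depth, max_depth, alpha, beta, maximizing,\n",
"                     moves, result, score, is_over):\n",
"    if depth == max_depth or is_over(state):\n",
"        return score(state), None\n",
"    best = float('-inf') if maximizing else float('inf')\n",
"    best_move = None\n",
"    for move in moves(state):\n",
"        value, _ = alphabeta_sketch(result(state, move), depth + 1, max_depth,\n",
"                                    alpha, beta, not maximizing,\n",
"                                    moves, result, score, is_over)\n",
"        if maximizing:\n",
"            if value > best:\n",
"                best, best_move = value, move\n",
"            alpha = max(alpha, best)  # what the maximizer can already guarantee\n",
"        else:\n",
"            if value < best:\n",
"                best, best_move = value, move\n",
"            beta = min(beta, best)  # what the minimizer can already guarantee\n",
"        if alpha >= beta:\n",
"            break  # the remaining siblings cannot affect the decision higher up\n",
"    return best, best_move\n",
"```\n",
"\n",
"The initial call would use the widest possible window, for example `-utils.INF` and `utils.INF`."
]
},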
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Task 3.1: Integrate alpha-beta pruning into minimax"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"def minimax_alpha_beta(\n",
|
||||
" board: Board,\n",
|
||||
" depth: int, \n",
|
||||
" max_depth: int, \n",
|
||||
" alpha: Score, \n",
|
||||
" beta: Score, \n",
|
||||
" is_black: bool\n",
|
||||
" ) -> tuple[Score, Move]:\n",
|
||||
" \"\"\"\n",
|
||||
" Finds the best move for the input board state.\n",
|
||||
" Note that you are black.\n",
|
||||
"\n",
|
||||
" Parameters\n",
|
||||
" ----------\n",
|
||||
" board: 2D list of lists. Contains characters \"B\", \"W\", and \"_\",\n",
|
||||
" representing black pawn, white pawn, and empty cell, respectively.\n",
|
||||
"\n",
|
||||
" depth: int, the depth to search for the best move. When this is equal\n",
|
||||
" to `max_depth`, you should get the evaluation of the position using\n",
|
||||
" the provided heuristic function.\n",
|
||||
"\n",
|
||||
" max_depth: int, the maximum depth for cutoff.\n",
|
||||
"\n",
|
||||
" alpha: Score. The alpha value in a given state.\n",
|
||||
"\n",
|
||||
" beta: Score. The beta value in a given state.\n",
|
||||
"\n",
|
||||
" is_black: bool. True when finding the best move for black, False\n",
|
||||
" otherwise.\n",
|
||||
"\n",
|
||||
" Returns\n",
|
||||
" -------\n",
" A tuple (evaluation, ((src_row, src_col), (dst_row, dst_col))):\n",
|
||||
" evaluation: the best score that black can achieve after this move.\n",
|
||||
" src_row, src_col: position of the pawn to move.\n",
|
||||
" dst_row, dst_col: position to move the pawn to.\n",
|
||||
" \"\"\"\n",
|
||||
" # TODO: Replace this with your own implementation\n",
|
||||
" raise NotImplementedError"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
"# Several moves can be equally good, so the tests ignore the returned move (bound to _)\n",
|
||||
"\n",
|
||||
"board1 = [\n",
|
||||
" list(\"______\"),\n",
|
||||
" list(\"__BB__\"),\n",
|
||||
" list(\"____BB\"),\n",
|
||||
" list(\"WBW_B_\"),\n",
|
||||
" list(\"____WW\"),\n",
|
||||
" list(\"_WW___\"),\n",
|
||||
"]\n",
|
||||
"score1, _ = minimax_alpha_beta(board1, 0, 3, -utils.INF, utils.INF, True)\n",
|
||||
"assert score1 == utils.WIN, \"black should win in 3\"\n",
|
||||
"\n",
|
||||
"board2 = [\n",
|
||||
" list(\"____B_\"),\n",
|
||||
" list(\"___B__\"),\n",
|
||||
" list(\"__B___\"),\n",
|
||||
" list(\"_WWW__\"),\n",
|
||||
" list(\"____W_\"),\n",
|
||||
" list(\"______\"),\n",
|
||||
"]\n",
|
||||
"score2, _ = minimax_alpha_beta(board2, 0, 5, -utils.INF, utils.INF, True)\n",
|
||||
"assert score2 == utils.WIN, \"black should win in 5\"\n",
|
||||
"\n",
|
||||
"board3 = [\n",
|
||||
" list(\"____B_\"),\n",
|
||||
" list(\"__BB__\"),\n",
|
||||
" list(\"______\"),\n",
|
||||
" list(\"_WWW__\"),\n",
|
||||
" list(\"____W_\"),\n",
|
||||
" list(\"______\"),\n",
|
||||
"]\n",
|
||||
"score3, _ = minimax_alpha_beta(board3, 0, 6, -utils.INF, utils.INF, True)\n",
|
||||
"assert score3 == -utils.WIN, \"white should win in 6\""
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Task 3.2: Integrate alpha-beta pruning into negamax\n",
|
||||
"\n",
|
||||
"At this stage, you may wonder: why don't we integrate alpha-beta pruning with the more elegant alternative, negamax? Of course, we can also incorporate alpha-beta pruning into our negamax algorithm to improve its performance.\n",
|
||||
"\n",
|
||||
"Remember, we exploited the zero-sum property of the game to implement negamax by negating the evaluation score, and always taking the maximum score instead of alternating between the maximum and minimum. You need to further exploit this property to correctly integrate alpha-beta pruning into negamax. To help you, these are some questions that you can try to answer:\n",
|
||||
"\n",
|
||||
"- What is the meaning of `alpha` and `beta`?\n",
|
||||
"- From the perspective of the opponent, what is the corresponding `alpha` and `beta`? Can I exploit the zero-sum property here?"
|
||||
]
|
||||
},
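{
"cell_type": "markdown",
"metadata": {},
"source": [
"One way to think about these questions: a score of at least `alpha` for the mover is a score of at most `-alpha` for the opponent, so the opponent's window is the mover's window negated and swapped. The sketch below shows this on a generic game. It is an illustration under assumptions, not the expected solution: `moves`, `result`, `is_over`, and `score_for_mover` are hypothetical callables, with `score_for_mover` evaluating a state from the point of view of the side to move.\n",
"\n",
"```python\n",
"def negamax_ab_sketch(state, depth, max_depth, alpha, beta,\n",
"                      moves, result, score_for_mover, is_over):\n",
"    if depth == max_depth or is_over(state):\n",
"        return score_for_mover(state), None\n",
"    best, best_move = float('-inf'), None\n",
"    for move in moves(state):\n",
"        # The opponent sees the window negated and swapped: (-beta, -alpha).\n",
"        value, _ = negamax_ab_sketch(result(state, move), depth + 1, max_depth,\n",
"                                     -beta, -alpha,\n",
"                                     moves, result, score_for_mover, is_over)\n",
"        value = -value  # convert the opponent's score back to the mover's view\n",
"        if value > best:\n",
"            best, best_move = value, move\n",
"        alpha = max(alpha, best)\n",
"        if alpha >= beta:\n",
"            break  # the opponent already has a better option elsewhere\n",
"    return best, best_move\n",
"```"
]
},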
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"def negamax_alpha_beta(\n",
|
||||
" board: Board, \n",
|
||||
" depth: int, \n",
|
||||
" max_depth: int, \n",
|
||||
" alpha: Score, \n",
|
||||
" beta: Score\n",
|
||||
" ) -> tuple[Score, Move]:\n",
|
||||
" \"\"\"\n",
|
||||
" Finds the best move for the input board state.\n",
|
||||
" Note that you are black.\n",
|
||||
"\n",
|
||||
" Parameters\n",
|
||||
" ----------\n",
|
||||
" board: 2D list of lists. Contains characters \"B\", \"W\", and \"_\",\n",
|
||||
" representing black pawn, white pawn, and empty cell, respectively.\n",
|
||||
"\n",
|
||||
" depth: int, the depth to search for the best move. When this is equal\n",
|
||||
" to `max_depth`, you should get the evaluation of the position using\n",
|
||||
" the provided heuristic function.\n",
|
||||
"\n",
|
||||
" max_depth: int, the maximum depth for cutoff.\n",
|
||||
"\n",
|
||||
" alpha: Score. The alpha value in a given state.\n",
|
||||
"\n",
|
||||
" beta: Score. The beta value in a given state.\n",
|
||||
"\n",
|
||||
" Notice that you no longer need the parameter `is_black`.\n",
|
||||
" \n",
|
||||
" Returns\n",
|
||||
" -------\n",
" A tuple (evaluation, ((src_row, src_col), (dst_row, dst_col))):\n",
|
||||
" evaluation: the best score that black can achieve after this move.\n",
|
||||
" src_row, src_col: position of the pawn to move.\n",
|
||||
" dst_row, dst_col: position to move the pawn to.\n",
|
||||
" \"\"\"\n",
|
||||
" # TODO: Replace this with your own implementation\n",
|
||||
" raise NotImplementedError"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
"# Several moves can be equally good, so the tests ignore the returned move (bound to _)\n",
|
||||
"\n",
|
||||
"board1 = [\n",
|
||||
" list(\"______\"),\n",
|
||||
" list(\"__BB__\"),\n",
|
||||
" list(\"____BB\"),\n",
|
||||
" list(\"WBW_B_\"),\n",
|
||||
" list(\"____WW\"),\n",
|
||||
" list(\"_WW___\"),\n",
|
||||
"]\n",
|
||||
"score1, _ = negamax_alpha_beta(board1, 0, 3, -utils.INF, utils.INF)\n",
|
||||
"assert score1 == utils.WIN, \"black should win in 3\"\n",
|
||||
"\n",
|
||||
"board2 = [\n",
|
||||
" list(\"____B_\"),\n",
|
||||
" list(\"___B__\"),\n",
|
||||
" list(\"__B___\"),\n",
|
||||
" list(\"_WWW__\"),\n",
|
||||
" list(\"____W_\"),\n",
|
||||
" list(\"______\"),\n",
|
||||
"]\n",
|
||||
"score2, _ = negamax_alpha_beta(board2, 0, 5, -utils.INF, utils.INF)\n",
|
||||
"assert score2 == utils.WIN, \"black should win in 5\"\n",
|
||||
"\n",
|
||||
"board3 = [\n",
|
||||
" list(\"____B_\"),\n",
|
||||
" list(\"__BB__\"),\n",
|
||||
" list(\"______\"),\n",
|
||||
" list(\"_WWW__\"),\n",
|
||||
" list(\"____W_\"),\n",
|
||||
" list(\"______\"),\n",
|
||||
"]\n",
|
||||
"score3, _ = negamax_alpha_beta(board3, 0, 6, -utils.INF, utils.INF)\n",
|
||||
"assert score3 == -utils.WIN, \"white should win in 6\""
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Heuristic Function\n",
|
||||
"\n",
"Phew, we have finished the search algorithms! However, our heuristic function is too simple: it may not give the best evaluation for a position, so we need a better one. Therefore, you will implement the not-as-simple heuristic function described below."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Task 4.1: Implement a less simple heuristic function\n",
|
||||
"\n",
"Recall that the heuristic function should return a larger value when black is closer to winning. If black is closer to winning, black should have more pieces close to row 5 than white has close to row 0. Of course, this is not necessarily the case, since you only need one piece to make it through while the rest remain behind, but this is just a heuristic after all. Thus, in the heuristic you are about to implement, we add more points the closer a black piece is to the end, and subtract more points the closer a white piece is to the end. The exact number of points is shown in the figures below.\n",
|
||||
"\n",
|
||||
"<pre>\n",
|
||||
"<p style=\"text-align: center;\">\n",
|
||||
"<img src=\"imgs/black_heuristic.png\", width = 300>\n",
|
||||
"Figure 4. Points to add for each black piece in the square\n",
|
||||
"<p style=\"text-align: center;\">\n",
|
||||
"<img src=\"imgs/white_heuristic.png\", width = 300>\n",
|
||||
"Figure 5. Points to subtract for each white piece in the square\n",
|
||||
"</p>\n",
|
||||
"</pre>\n",
|
||||
"\n",
|
||||
"So, for example, if there are two black pieces on (0, 4) and (3, 2), and a white piece on (1, 4), then the output of the heuristic function should be `10 + 40 - 50 = 0`. Additionally, return `utils.WIN` if any of black's pieces reach the end, and `-utils.WIN` if any of white's pieces reach the end. Similarly, if white has no pieces, return `utils.WIN`, and if black has no pieces, return `-utils.WIN`. The value of `utils.WIN` can be found in `utils.py` and has a value of `101010`."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"def evaluate(board: Board) -> Score:\n",
|
||||
" # TODO: replace this with your own implementation\n",
|
||||
" raise NotImplementedError"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"board1 = [\n",
|
||||
" [\"_\", \"_\", \"_\", \"B\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"_\", \"W\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"_\", \"_\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"B\", \"_\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"_\", \"_\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"_\", \"_\", \"_\", \"_\"],\n",
|
||||
" ['_', '_', '_', '_', '_', '_']\n",
|
||||
"]\n",
|
||||
"assert evaluate(board1) == 0\n",
|
||||
"\n",
|
||||
"board2 = [\n",
|
||||
" [\"_\", \"_\", \"_\", \"B\", \"W\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"_\", \"W\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"_\", \"_\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"_\", \"_\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"_\", \"_\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"_\", \"_\", \"_\", \"_\"]\n",
|
||||
"]\n",
|
||||
"assert evaluate(board2) == -utils.WIN\n",
|
||||
"\n",
|
||||
"board3 = [\n",
|
||||
" [\"_\", \"_\", \"_\", \"_\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"_\", \"_\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"_\", \"_\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"B\", \"_\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"_\", \"_\", \"_\", \"_\"],\n",
|
||||
" [\"_\", \"_\", \"_\", \"_\", \"_\", \"_\"]\n",
|
||||
"]\n",
|
||||
"assert evaluate(board3) == utils.WIN"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Free Implementation and Contest\n",
|
||||
"\n",
|
||||
"Finally, you can combine the implemented components together and create your own agent. But wait, there is something missing - we must deal with the time constraint! Your agent only has a limited amount of time for calculation before making a move.\n",
|
||||
"\n",
|
||||
"Your agent _**must not take more than 3 real-time seconds**_ to make a move in the contest. You should check for time passed during every recursive call in your algorithm to follow this 3 second rule. Whenever you see that 3 seconds is almost over, immediately return the best move you have at your disposal. This is really important because the machine where we will run your code may be much slower than your local machine."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"class PlayerAI:\n",
|
||||
"\n",
|
||||
" def make_move(self, board: Board) -> Move:\n",
|
||||
" \"\"\"\n",
|
||||
" This is the function that will be called from main.py\n",
|
||||
" You should combine the functions in the previous tasks\n",
|
||||
" to implement this function.\n",
|
||||
"\n",
|
||||
" Parameters\n",
|
||||
" ----------\n",
|
||||
" self: object instance itself, passed in automatically by Python.\n",
|
||||
" \n",
|
||||
" board: 2D list-of-lists. Contains characters \"B\", \"W\", and \"_\",\n",
|
||||
" representing black pawn, white pawn, and empty cell, respectively.\n",
|
||||
" \n",
|
||||
" Returns\n",
|
||||
" -------\n",
|
||||
" A tuple of tuples containing coordinates (row_index, col_index).\n",
|
||||
" The first tuple contains the source position of the black pawn\n",
|
||||
" to be moved, the second tuple contains the destination position.\n",
|
||||
" \"\"\"\n",
|
||||
" # TODO: Replace starter code with your AI\n",
|
||||
" ################\n",
|
||||
" # Starter code #\n",
|
||||
" ################\n",
|
||||
" for r in range(len(board)):\n",
|
||||
" for c in range(len(board[r])):\n",
|
||||
" # check if B can move forward directly\n",
|
||||
" if board[r][c] == \"B\" and board[r + 1][c] == \"_\":\n",
|
||||
" src = r, c\n",
|
||||
" dst = r + 1, c\n",
|
||||
" return src, dst # valid move\n",
|
||||
" return (0, 0), (0, 0) # invalid move"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"After this, you are free to further improve your agent with any technique, except for some that allow you to gain unfair advantages, including, but not limited to:\n",
|
||||
"- Change the testing framework / timer.\n",
|
||||
"- Use Python to compile C++ as this is hardware advantage.\n",
|
||||
"- Use of multi-process as this is hardware advantage.\n",
|
||||
"\n",
|
||||
"The maximum size of code (of your agent) that you can upload is 10MB.\n",
|
||||
"\n",
|
||||
"Ultimately, we shall be playing all the student designed agents against each other. So, it will be a small Breakthrough tournament. The top players will get some bonus XP."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Testing Your Game Playing Agent\n",
|
||||
"\n",
|
||||
"Fill in `make_move(board)` method of the `PlayerAI` class with your game playing agent code. The `PlayerNaive` class has been provided for you to test out your agent against another program. Always code for black (assume black as max player) in both these class functions. The game playing framework calls the `make_move(board)` method of each agent alternatively. After you complete `PlayerAI`, simply run the *template.py* file. You will see the two agents (`PlayerAI` and `PlayerNaive`) playing against each other.\n",
|
||||
"\n",
|
||||
"Your agent should always provide a legal move. Moves will be validated by the game playing framework. If your player makes an illegal move, the competition framework will choose the next available valid move on your behalf, so you will likely lose. Your agent must always make a move; it is not allowed to skip moves. Your program *cannot take more than 3 real-time seconds* to make a move. If your program does not output a coordinate within 3 seconds, the competition framework will choose the next available move too. You can read up the implementation to obtain the next available move by looking up the function `generate_rand_move` in `utils.py`.\n",
|
||||
"\n",
|
||||
"To maximise your chances of winning, you might want to optimise the following points:\n",
|
||||
"- The evaluation function used to evaluate a certain position.\n",
|
||||
"- Effective exploration strategy (for example: move ordering).\n",
|
||||
"- Modifying the alpha-beta pruning algorithm for more efficient search."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Submission\n",
|
||||
"\n",
|
||||
"Once you are done, please submit your work to Coursemology, by copying the right snippets of code into the corresponding box that says 'Your answer', and click 'Save'. After you save, you can make changes to your\n",
|
||||
"submission.\n",
|
||||
"\n",
|
||||
"Once you are satisfied with what you have uploaded, click 'Finalize submission.' **Note that once your submission is finalized, it is considered to be submitted for grading and cannot be changed**. If you need to undo\n",
|
||||
"this action, you will have to email your assigned tutor for help. Please do not finalize your submission until you are sure that you want to submit your solutions for grading. \n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.10.6"
|
||||
},
|
||||
"orig_nbformat": 4
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 2
|
||||
}
|
622
cs2109s/labs/ps3/ps3.py
Normal file
@ -0,0 +1,622 @@
|
||||
import utils
|
||||
from typing import Union
|
||||
|
||||
Score = Union[int, float]
|
||||
Move = tuple[tuple[int, int], tuple[int, int]]
|
||||
Board = list[list[str]]
|
||||
|
||||
def evaluate(board: Board) -> Score:
|
||||
"""
|
||||
Returns the score of the current position.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
board: 2D list of lists. Contains characters "B", "W", and "_",
|
||||
representing black pawn, white pawn, and empty cell, respectively.
|
||||
|
||||
Returns
|
||||
-------
|
||||
An evaluation (as a Score).
|
||||
"""
|
||||
bcount = 0
|
||||
wcount = 0
|
||||
for r, row in enumerate(board):
|
||||
for tile in row:
|
||||
if tile == "B":
|
||||
if r == 5:
|
||||
return utils.WIN
|
||||
bcount += 1
|
||||
elif tile == "W":
|
||||
if r == 0:
|
||||
return -utils.WIN
|
||||
wcount += 1
|
||||
if wcount == 0:
|
||||
return utils.WIN
|
||||
if bcount == 0:
|
||||
return -utils.WIN
|
||||
return bcount - wcount
|
||||
|
||||
def generate_valid_moves(board: Board) -> list[Move]:
|
||||
"""
|
||||
Generates a list containing all possible moves in a particular position for black.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
board: 2D list of lists. Contains characters "B", "W", and "_",
|
||||
representing black pawn, white pawn, and empty cell, respectively.
|
||||
|
||||
Returns
|
||||
-------
|
||||
A list of Moves.
|
||||
"""
|
||||
res = []
|
||||
for r, row in enumerate(board):
|
||||
for c, tile in enumerate(row):
|
||||
if tile != "B":
|
||||
continue
|
||||
for d in (-1, 0, 1):
|
||||
dst = r + 1, c + d
|
||||
if utils.is_valid_move(board, (r, c), dst):
|
||||
res.append(((r, c), dst))
|
||||
|
||||
return res
|
||||
|
||||
def test_11():
|
||||
board1 = [
|
||||
["_", "_", "B", "_", "_", "_"],
|
||||
["_", "_", "_", "_", "_", "_"],
|
||||
["_", "_", "_", "_", "_", "_"],
|
||||
["_", "_", "_", "_", "_", "_"],
|
||||
["_", "_", "_", "_", "_", "_"],
|
||||
["_", "_", "_", "_", "_", "_"]
|
||||
]
|
||||
assert sorted(generate_valid_moves(board1)) == [((0, 2), (1, 1)), ((0, 2), (1, 2)), ((0, 2), (1, 3))], "board1 test output is incorrect"
|
||||
|
||||
board2 = [
|
||||
["_", "_", "_", "_", "B", "_"],
|
||||
["_", "_", "_", "_", "_", "B"],
|
||||
["_", "W", "_", "_", "_", "_"],
|
||||
["_", "_", "W", "_", "_", "_"],
|
||||
["_", "_", "_", "_", "_", "_"],
|
||||
["_", "_", "B", "_", "_", "_"]
|
||||
]
|
||||
assert sorted(generate_valid_moves(board2)) == [((0, 4), (1, 3)), ((0, 4), (1, 4)), ((1, 5), (2, 4)), ((1, 5), (2, 5))], "board2 test output is incorrect"
|
||||
|
||||
board3 = [
|
||||
["_", "_", "_", "_", "_", "_"],
|
||||
["_", "_", "B", "_", "_", "_"],
|
||||
["_", "W", "W", "W", "_", "_"],
|
||||
["_", "_", "_", "_", "_", "_"],
|
||||
["_", "_", "_", "_", "_", "_"],
|
||||
["_", "_", "_", "_", "_", "_"]
|
||||
]
|
||||
assert sorted(generate_valid_moves(board3)) == [((1, 2), (2, 1)), ((1, 2), (2, 3))], "board3 test output is incorrect"
|
||||
|
||||
# test_11()
|
||||
|
||||
def minimax(
|
||||
board: Board,
|
||||
depth: int,
|
||||
max_depth: int,
|
||||
is_black: bool
|
||||
) -> tuple[Score, Move]:
|
||||
"""
|
||||
Finds the best move for the input board state.
|
||||
Note that you are black.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
board: 2D list of lists. Contains characters "B", "W", and "_",
|
||||
representing black pawn, white pawn, and empty cell, respectively.
|
||||
|
||||
depth: int, the depth to search for the best move. When this is equal
|
||||
to `max_depth`, you should get the evaluation of the position using
|
||||
the provided heuristic function.
|
||||
|
||||
max_depth: int, the maximum depth for cutoff.
|
||||
|
||||
is_black: bool. True when finding the best move for black, False
|
||||
otherwise.
|
||||
|
||||
Returns
|
||||
-------
|
||||
A tuple (evaluation, ((src_row, src_col), (dst_row, dst_col))):
|
||||
evaluation: the best score that black can achieve after this move.
|
||||
src_row, src_col: position of the pawn to move.
|
||||
dst_row, dst_col: position to move the pawn to.
|
||||
"""
|
||||
if depth == max_depth or utils.is_game_over(board):
|
||||
return evaluate(board), None
|
||||
# max value
|
||||
if is_black:
|
||||
v = -utils.INF
|
||||
best_move = None
|
||||
for action in generate_valid_moves(board):
|
||||
next_board = utils.state_change(board, action[0], action[1], False)
|
||||
new_v, _ = minimax(next_board, depth + 1, max_depth, False)
|
||||
if new_v > v:
|
||||
v = new_v
|
||||
best_move = action
|
||||
return v, best_move
|
||||
# min value
|
||||
else:
|
||||
v = utils.INF
|
||||
best_move = None
|
||||
w_board = utils.invert_board(board, False)
|
||||
for action in generate_valid_moves(w_board):
|
||||
next_board = utils.state_change(w_board, action[0], action[1], False)
|
||||
new_v, _ = minimax(utils.invert_board(next_board, False), depth + 1, max_depth, True)
|
||||
if new_v < v:
|
||||
v = new_v
|
||||
best_move = action
|
||||
return v, best_move
|
||||
|
||||
|
||||
def test_21():
|
||||
board1 = [
|
||||
list("______"),
|
||||
list("___B__"),
|
||||
list("____BB"),
|
||||
list("___WB_"),
|
||||
list("_B__WW"),
|
||||
list("_WW___"),
|
||||
]
|
||||
score1, _ = minimax(board1, 0, 1, True)
|
||||
assert score1 == utils.WIN, "black should win in 1"
|
||||
|
||||
board2 = [
|
||||
list("______"),
|
||||
list("___B__"),
|
||||
list("____BB"),
|
||||
list("_BW_B_"),
|
||||
list("____WW"),
|
||||
list("_WW___"),
|
||||
]
|
||||
score2, _ = minimax(board2, 0, 3, True)
|
||||
assert score2 == utils.WIN, "black should win in 3"
|
||||
|
||||
board3 = [
|
||||
list("______"),
|
||||
list("__B___"),
|
||||
list("_WWW__"),
|
||||
list("______"),
|
||||
list("______"),
|
||||
list("______"),
|
||||
]
|
||||
score3, _ = minimax(board3, 0, 4, True)
|
||||
assert score3 == -utils.WIN, "white should win in 4"
|
||||
|
||||
# test_21()
|
||||
|
||||
def negamax(
|
||||
board: Board,
|
||||
depth: int,
|
||||
max_depth: int
|
||||
) -> tuple[Score, Move]:
|
||||
"""
|
||||
Finds the best move for the input board state.
|
||||
Note that you are black.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
board: 2D list of lists. Contains characters "B", "W", and "_",
|
||||
representing black pawn, white pawn, and empty cell, respectively.
|
||||
|
||||
depth: int, the depth to search for the best move. When this is equal
|
||||
to `max_depth`, you should get the evaluation of the position using
|
||||
the provided heuristic function.
|
||||
|
||||
max_depth: int, the maximum depth for cutoff.
|
||||
|
||||
Notice that you no longer need the parameter `is_black`.
|
||||
|
||||
Returns
|
||||
-------
|
||||
A tuple (evaluation, ((src_row, src_col), (dst_row, dst_col))):
|
||||
evaluation: the best score that black can achieve after this move.
|
||||
src_row, src_col: position of the pawn to move.
|
||||
dst_row, dst_col: position to move the pawn to.
|
||||
"""
|
||||
if depth == max_depth or utils.is_game_over(board):
|
||||
return evaluate(board), None
|
||||
v = -utils.INF
|
||||
best_move = None
|
||||
for action in generate_valid_moves(board):
|
||||
# if current is black, then the next board is white. It is inverted after applying the black move
|
||||
next_board = utils.invert_board(utils.state_change(board, action[0], action[1], False), False)
|
||||
# the recursive call scores the position from the opponent's point of view, so the result is negated below
|
||||
new_v, _ = negamax(next_board, depth + 1, max_depth)
|
||||
# if the negated score is greater than the current max value, update the max value and the best move
|
||||
if -new_v > v:
|
||||
v = -new_v
|
||||
best_move = action
|
||||
return v, best_move
|
||||
|
||||
def test_22():
|
||||
board1 = [
|
||||
list("______"),
|
||||
list("___B__"),
|
||||
list("____BB"),
|
||||
list("___WB_"),
|
||||
list("_B__WW"),
|
||||
list("_WW___"),
|
||||
]
|
||||
score1, _ = negamax(board1, 0, 1)
|
||||
assert score1 == utils.WIN, "black should win in 1"
|
||||
|
||||
board2 = [
|
||||
list("______"),
|
||||
list("___B__"),
|
||||
list("____BB"),
|
||||
list("_BW_B_"),
|
||||
list("____WW"),
|
||||
list("_WW___"),
|
||||
]
|
||||
score2, _ = negamax(board2, 0, 3)
|
||||
assert score2 == utils.WIN, "black should win in 3"
|
||||
|
||||
board3 = [
|
||||
list("______"),
|
||||
list("__B___"),
|
||||
list("_WWW__"),
|
||||
list("______"),
|
||||
list("______"),
|
||||
list("______"),
|
||||
]
|
||||
score3, _ = negamax(board3, 0, 4)
|
||||
assert score3 == -utils.WIN, "white should win in 4"
|
||||
|
||||
# test_22()
|
||||
|
||||
def minimax_alpha_beta(
|
||||
board: Board,
|
||||
depth: int,
|
||||
max_depth: int,
|
||||
alpha: Score,
|
||||
beta: Score,
|
||||
is_black: bool
|
||||
) -> tuple[Score, Move]:
|
||||
"""
|
||||
Finds the best move for the input board state.
|
||||
Note that you are black.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
board: 2D list of lists. Contains characters "B", "W", and "_",
|
||||
representing black pawn, white pawn, and empty cell, respectively.
|
||||
|
||||
depth: int, the depth to search for the best move. When this is equal
|
||||
to `max_depth`, you should get the evaluation of the position using
|
||||
the provided heuristic function.
|
||||
|
||||
max_depth: int, the maximum depth for cutoff.
|
||||
|
||||
alpha: Score. The alpha value in a given state.
|
||||
|
||||
beta: Score. The beta value in a given state.
|
||||
|
||||
is_black: bool. True when finding the best move for black, False
|
||||
otherwise.
|
||||
|
||||
Returns
|
||||
-------
|
||||
A tuple (evaluation, ((src_row, src_col), (dst_row, dst_col))):
|
||||
evaluation: the best score that black can achieve after this move.
|
||||
src_row, src_col: position of the pawn to move.
|
||||
dst_row, dst_col: position to move the pawn to.
|
||||
"""
|
||||
# base case
|
||||
if depth == max_depth or utils.is_game_over(board):
|
||||
return evaluate(board), None
|
||||
|
||||
# max value
|
||||
if is_black:
|
||||
v = -utils.INF
|
||||
best_move = None
|
||||
for action in generate_valid_moves(board):
|
||||
next_board = utils.state_change(board, action[0], action[1], False)
|
||||
new_v, _ = minimax_alpha_beta(next_board, depth + 1, max_depth, alpha, beta, False)
|
||||
if new_v > v:
|
||||
v = new_v
|
||||
best_move = action
|
||||
if v >= beta:
|
||||
break
|
||||
alpha = max(alpha, v)
|
||||
return v, best_move
|
||||
# min value
|
||||
else:
|
||||
v = utils.INF
|
||||
best_move = None
|
||||
w_board = utils.invert_board(board, False)
|
||||
for action in generate_valid_moves(w_board):
|
||||
next_board = utils.invert_board(utils.state_change(w_board, action[0], action[1], False), False)
|
||||
new_v, _ = minimax_alpha_beta(next_board, depth + 1, max_depth, alpha, beta, True)
|
||||
if new_v < v:
|
||||
v = new_v
|
||||
best_move = action
|
||||
if v <= alpha:
|
||||
break
|
||||
beta = min(beta, v)
|
||||
return v, best_move
|
||||
|
||||
def test_31():
|
||||
board1 = [
|
||||
list("______"),
|
||||
list("__BB__"),
|
||||
list("____BB"),
|
||||
list("WBW_B_"),
|
||||
list("____WW"),
|
||||
list("_WW___"),
|
||||
]
|
||||
score1, _ = minimax_alpha_beta(board1, 0, 3, -utils.INF, utils.INF, True)
|
||||
assert score1 == utils.WIN, "black should win in 3"
|
||||
|
||||
|
||||
board2 = [
|
||||
list("____B_"),
|
||||
list("___B__"),
|
||||
list("__B___"),
|
||||
list("_WWW__"),
|
||||
list("____W_"),
|
||||
list("______"),
|
||||
]
|
||||
score2, _ = minimax_alpha_beta(board2, 0, 5, -utils.INF, utils.INF, True)
|
||||
assert score2 == utils.WIN, "black should win in 5"
|
||||
|
||||
board3 = [
|
||||
list("____B_"),
|
||||
list("__BB__"),
|
||||
list("______"),
|
||||
list("_WWW__"),
|
||||
list("____W_"),
|
||||
list("______"),
|
||||
]
|
||||
score3, m = minimax_alpha_beta(board3, 0, 6, -utils.INF, utils.INF, True)
|
||||
print(m)
|
||||
assert score3 == -utils.WIN, "white should win in 6"
|
||||
|
||||
# print("custom")
|
||||
# print(minimax_alpha_beta([['_', '_', '_', '_', 'B', '_'], ['_', '_', 'B', 'B', '_', '_'], ['_', '_', '_', '_', '_', '_'], ['_', 'W', 'W', 'W', '_', '_'], ['_', '_', '_', '_', 'W', '_'], ['_', '_', '_', '_', '_', '_']],
|
||||
# 0, 6, -utils.INF, utils.INF, True))
|
||||
# print("custom end")
|
||||
|
||||
def negamax_alpha_beta(
|
||||
board: Board,
|
||||
depth: int,
|
||||
max_depth: int,
|
||||
alpha: Score,
|
||||
beta: Score
|
||||
) -> tuple[Score, Move]:
|
||||
"""
|
||||
Finds the best move for the input board state.
|
||||
Note that you are black.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
board: 2D list of lists. Contains characters "B", "W", and "_",
|
||||
representing black pawn, white pawn, and empty cell, respectively.
|
||||
|
||||
depth: int, the depth to search for the best move. When this is equal
|
||||
to `max_depth`, you should get the evaluation of the position using
|
||||
the provided heuristic function.
|
||||
|
||||
max_depth: int, the maximum depth for cutoff.
|
||||
|
||||
alpha: Score. The alpha value in a given state.
|
||||
|
||||
beta: Score. The beta value in a given state.
|
||||
|
||||
Notice that you no longer need the parameter `is_black`.
|
||||
|
||||
Returns
|
||||
-------
|
||||
A tuple (evaluation, ((src_row, src_col), (dst_row, dst_col))):
|
||||
evaluation: the best score that black can achieve after this move.
|
||||
src_row, src_col: position of the pawn to move.
|
||||
dst_row, dst_col: position to move the pawn to.
|
||||
"""
|
||||
if depth == max_depth or utils.is_game_over(board):
|
||||
return evaluate(board), None
|
||||
best_move = None
|
||||
for action in generate_valid_moves(board):
|
||||
# if current is black, then the next board is white. It is inverted after applying the black move
|
||||
next_board = utils.invert_board(utils.state_change(board, action[0], action[1], False), False)
|
||||
# the recursive call scores the position from the opponent's point of view, so the result is negated below
|
||||
new_v, _ = negamax_alpha_beta(next_board, depth + 1, max_depth, -beta, -alpha)
|
||||
# if the negated score is greater than the current max value, update the max value and the best move
|
||||
if -new_v > alpha:
|
||||
alpha = -new_v
|
||||
best_move = action
|
||||
if alpha >= beta:
|
||||
break
|
||||
return alpha, best_move
|
||||
|
||||
def test_32():
|
||||
board1 = [
|
||||
list("______"),
|
||||
list("__BB__"),
|
||||
list("____BB"),
|
||||
list("WBW_B_"),
|
||||
list("____WW"),
|
||||
list("_WW___"),
|
||||
]
|
||||
score1, _ = negamax_alpha_beta(board1, 0, 3, -utils.INF, utils.INF)
|
||||
assert score1 == utils.WIN, "black should win in 3"
|
||||
|
||||
board2 = [
|
||||
list("____B_"),
|
||||
list("___B__"),
|
||||
list("__B___"),
|
||||
list("_WWW__"),
|
||||
list("____W_"),
|
||||
list("______"),
|
||||
]
|
||||
score2, _ = negamax_alpha_beta(board2, 0, 5, -utils.INF, utils.INF)
|
||||
assert score2 == utils.WIN, "black should win in 5"
|
||||
|
||||
board3 = [
|
||||
list("____B_"),
|
||||
list("__BB__"),
|
||||
list("______"),
|
||||
list("_WWW__"),
|
||||
list("____W_"),
|
||||
list("______"),
|
||||
]
|
||||
score3, _ = negamax_alpha_beta(board3, 0, 6, -utils.INF, utils.INF)
|
||||
assert score3 == -utils.WIN, "white should win in 6"
|
||||
|
||||
# test_32()
|
||||
|
||||
# Task 4.1: less simple heuristic.
# Note: this overrides the simple `evaluate` function defined earlier in this file.
|
||||
|
||||
def evaluate(board: Board) -> Score:
|
||||
score = 0
|
||||
bcount = 0
|
||||
wcount = 0
|
||||
for r, row in enumerate(board):
|
||||
for tile in row:
|
||||
if tile == "B":
|
||||
bcount += 1
|
||||
if r == 0:
|
||||
score += 10
|
||||
if r == 1:
|
||||
score += 10
|
||||
if r == 2:
|
||||
score += 20
|
||||
if r == 3:
|
||||
score += 40
|
||||
if r == 4:
|
||||
score += 50
|
||||
if r == 5:
|
||||
return utils.WIN
|
||||
elif tile == "W":
|
||||
wcount += 1
|
||||
if r == 0:
|
||||
return -utils.WIN
|
||||
if r == 1:
|
||||
score -= 50
|
||||
if r == 2:
|
||||
score -= 40
|
||||
if r == 3:
|
||||
score -= 20
|
||||
if r == 4:
|
||||
score -= 10
|
||||
if r == 5:
|
||||
score -= 10
|
||||
if wcount == 0:
|
||||
return utils.WIN
|
||||
if bcount == 0:
|
||||
return -utils.WIN
|
||||
return score
|
||||
|
||||
def test_41():
|
||||
board1 = [
|
||||
["_", "_", "_", "B", "_", "_"],
|
||||
["_", "_", "_", "W", "_", "_"],
|
||||
["_", "_", "_", "_", "_", "_"],
|
||||
["_", "_", "B", "_", "_", "_"],
|
||||
["_", "_", "_", "_", "_", "_"],
|
||||
["_", "_", "_", "_", "_", "_"]
|
||||
]
|
||||
assert evaluate(board1) == 0
|
||||
|
||||
board2 = [
|
||||
["_", "_", "_", "B", "W", "_"],
|
||||
["_", "_", "_", "W", "_", "_"],
|
||||
["_", "_", "_", "_", "_", "_"],
|
||||
["_", "_", "_", "_", "_", "_"],
|
||||
["_", "_", "_", "_", "_", "_"],
|
||||
["_", "_", "_", "_", "_", "_"]
|
||||
]
|
||||
assert evaluate(board2) == -utils.WIN
|
||||
|
||||
board3 = [
|
||||
["_", "_", "_", "_", "_", "_"],
|
||||
["_", "_", "_", "_", "_", "_"],
|
||||
["_", "_", "_", "_", "_", "_"],
|
||||
["_", "_", "B", "_", "_", "_"],
|
||||
["_", "_", "_", "_", "_", "_"],
|
||||
["_", "_", "_", "_", "_", "_"]
|
||||
]
|
||||
assert evaluate(board3) == utils.WIN
|
||||
|
||||
test_41()
|
||||
|
||||
class PlayerAI:
|
||||
|
||||
def make_move(self, board: Board) -> Move:
|
||||
"""
|
||||
This is the function that will be called from main.py
|
||||
You should combine the functions in the previous tasks
|
||||
to implement this function.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
self: object instance itself, passed in automatically by Python.
|
||||
|
||||
board: 2D list-of-lists. Contains characters "B", "W", and "_",
|
||||
representing black pawn, white pawn, and empty cell, respectively.
|
||||
|
||||
Returns
|
||||
-------
|
||||
A tuple of tuples containing coordinates (row_index, col_index).
|
||||
The first tuple contains the source position of the black pawn
|
||||
to be moved, the second tuple contains the destination position.
|
||||
"""
|
||||
# TODO: Replace starter code with your AI
|
||||
################
|
||||
# Starter code #
|
||||
################
|
||||
for r in range(len(board)):
|
||||
for c in range(len(board[r])):
|
||||
# check if B can move forward directly
|
||||
if board[r][c] == "B" and board[r + 1][c] == "_":
|
||||
src = r, c
|
||||
dst = r + 1, c
|
||||
return src, dst # valid move
|
||||
return (0, 0), (0, 0) # invalid move
|
||||
|
||||
class PlayerNaive:
|
||||
"""
|
||||
A naive agent that will always return the first available valid move.
|
||||
"""
|
||||
def make_move(self, board: Board) -> Move:
|
||||
return utils.generate_rand_move(board)
|
||||
|
||||
##########################
|
||||
# Game playing framework #
|
||||
##########################
|
||||
if __name__ == "__main__":
|
||||
assert utils.test_move([
|
||||
["B", "B", "B", "B", "B", "B"],
|
||||
["_", "B", "B", "B", "B", "B"],
|
||||
["_", "_", "_", "_", "_", "_"],
|
||||
["_", "B", "_", "_", "_", "_"],
|
||||
["_", "W", "W", "W", "W", "W"],
|
||||
["W", "W", "W", "W", "W", "W"]
|
||||
], PlayerAI())
|
||||
|
||||
assert utils.test_move([
|
||||
["_", "B", "B", "B", "B", "B"],
|
||||
["_", "B", "B", "B", "B", "B"],
|
||||
["_", "_", "_", "_", "_", "_"],
|
||||
["_", "B", "_", "_", "_", "_"],
|
||||
["W", "W", "W", "W", "W", "W"],
|
||||
["_", "_", "W", "W", "W", "W"]
|
||||
], PlayerAI())
|
||||
|
||||
assert utils.test_move([
|
||||
["_", "_", "B", "B", "B", "B"],
|
||||
["_", "B", "B", "B", "B", "B"],
|
||||
["_", "_", "_", "_", "_", "_"],
|
||||
["_", "B", "W", "_", "_", "_"],
|
||||
["_", "W", "W", "W", "W", "W"],
|
||||
["_", "_", "_", "W", "W", "W"]
|
||||
], PlayerAI())
|
||||
|
||||
# generates initial board
|
||||
# board = utils.generate_init_state()
|
||||
# res = utils.play(PlayerAI(), PlayerNaive(), board)
|
||||
# # Black wins means your agent wins.
|
||||
# print(res)
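
The 3-second limit described in the notebook is not handled by the starter `make_move` above. One common way to respect such a limit is iterative deepening with a deadline. The following is a minimal sketch under assumptions, not part of the committed solution. The names `OutOfTime`, `iterative_deepening`, `search_at_depth`, `fallback`, and `budget_s` are hypothetical; `search_at_depth(depth, deadline)` is assumed to wrap a depth-limited search such as `negamax_alpha_beta` and to raise `OutOfTime` in a recursive call once `time.monotonic()` passes `deadline`.

```python
import time
from typing import Callable, Optional, Tuple

Move = Tuple[Tuple[int, int], Tuple[int, int]]


class OutOfTime(Exception):
    """Hypothetical signal raised inside the search when the budget is used up."""


def iterative_deepening(
    search_at_depth: Callable[[int, float], Tuple[float, Optional[Move]]],
    fallback: Move,
    budget_s: float = 2.5,
    max_depth: int = 64,
) -> Move:
    """Search depth 1, 2, 3, ... and keep the move from the last completed depth."""
    deadline = time.monotonic() + budget_s
    best = fallback
    for depth in range(1, max_depth + 1):
        try:
            # The search is expected to compare time.monotonic() against
            # `deadline` in every recursive call and raise OutOfTime if exceeded.
            _, move = search_at_depth(depth, deadline)
        except OutOfTime:
            break  # discard the partial result of the interrupted depth
        if move is not None:
            best = move
        if time.monotonic() >= deadline:
            break
    return best
```

In `PlayerAI.make_move`, `fallback` could be the result of `utils.generate_rand_move(board)`, and the budget is deliberately kept below 3 seconds to leave headroom on a slower machine.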
|
355
cs2109s/labs/ps3/utils.py
Normal file
@ -0,0 +1,355 @@
|
||||
import os
|
||||
import sys
|
||||
import copy
|
||||
import time
|
||||
import traceback
|
||||
import multiprocessing
|
||||
import functools
|
||||
from threading import Thread
|
||||
|
||||
from typing import Callable, Any
|
||||
|
||||
# board row and column -> these are constant
|
||||
ROW, COL = 6, 6
|
||||
INF = 90129012
|
||||
WIN = 21092109
|
||||
MOVE_NONE = (-1, -1), (-1, -1)
|
||||
TIME_LIMIT = 3.05
|
||||
|
||||
Move = tuple[tuple[int, int], tuple[int, int]]
|
||||
Board = list[list[str]]
|
||||
|
||||
# generates initial state
|
||||
def generate_init_state() -> Board:
|
||||
"""
|
||||
Generates the initial state of the game.
|
||||
|
||||
Returns
|
||||
-------
|
||||
2D list-of-lists. Contains characters "B", "W", and "_"
|
||||
representing black pawn, white pawn, and empty cell respectively.
|
||||
|
||||
"""
|
||||
state = [
|
||||
list("BBBBBB"),
|
||||
list("BBBBBB"),
|
||||
list("______"),
|
||||
list("______"),
|
||||
list("WWWWWW"),
|
||||
list("WWWWWW"),
|
||||
]
|
||||
return state
|
||||
|
||||
# prints board
|
||||
def print_state(board: Board) -> None:
|
||||
horizontal_rule = "+" + ("-" * 5 + "+") * COL
|
||||
for row in board:
|
||||
print(horizontal_rule)
|
||||
print(f"| {' | '.join(' ' if tile == '_' else tile for tile in row)} |")
|
||||
print(horizontal_rule)
|
||||
|
||||
# inverts board by modifying board state, or returning a new board with updated board state
|
||||
def invert_board(board: Board, in_place: bool = True) -> Board:
|
||||
"""
|
||||
Inverts the board by modifying existing values if in_place is set to True,
|
||||
or creating a new board with updated values if in_place is set to False.
|
||||
"""
|
||||
if not in_place:
|
||||
board = copy.deepcopy(board)
|
||||
board.reverse()
|
||||
for r, row in enumerate(board):
|
||||
for c, tile in enumerate(row):
|
||||
if tile == "W":
|
||||
board[r][c] = "B"
|
||||
elif tile == "B":
|
||||
board[r][c] = "W"
|
||||
return board
|
||||
|
||||
# checks if a move made for black is valid or not.
# Move source: src (row, col), move destination: dst (row, col)
def is_valid_move(
    board: Board,
    src: tuple[int, int],
    dst: tuple[int, int]
) -> bool:
    """
    Checks whether the given move is a valid move for black.

    Parameters
    ----------
    board: 2D list-of-lists. Contains characters "B", "W", and "_"
    representing black pawn, white pawn, and empty cell respectively.

    src: tuple[int, int]. Source position of the pawn.

    dst: tuple[int, int]. Destination position of the pawn.

    Returns
    -------
    A boolean indicating whether the given move from `src` to `dst` is valid.
    """
    sr, sc = src
    dr, dc = dst
    if sr < 0 or sr >= ROW or sc < 0 or sc >= COL:
        # source lies outside the board (negative indices would otherwise wrap around)
        return False
    if board[sr][sc] != "B":
        # source tile does not hold a black pawn
        return False
    if dr < 0 or dr >= ROW or dc < 0 or dc >= COL:
        # destination lies outside the board
        return False
    if dr != sr + 1:
        # pawn must advance exactly one row
        return False
    if dc > sc + 1 or dc < sc - 1:
        # destination is more than one column to the left/right
        return False
    if dc == sc and board[dr][dc] != "_":
        # a straight move needs an empty tile (no capturing straight ahead)
        return False
    if (dc == sc + 1 or dc == sc - 1) and board[dr][dc] == "B":
        # a diagonal move cannot land on another black pawn
        return False
    return True
|
||||
|
||||
# generates the first available valid move for black
def generate_rand_move(board: Board) -> Move:
    """
    Generates a fallback move. Despite the name, the move is not random:
    it is the first valid move for black found when scanning the board row by row.

    Parameters
    ----------
    board: 2D list-of-lists. Contains characters "B", "W", and "_"
    representing black pawn, white pawn, and empty cell respectively.

    Returns
    -------
    A tuple ((src_row, src_col), (dst_row, dst_col)):
    src_row, src_col: position of the pawn to move.
    dst_row, dst_col: position to move the pawn to.

    Raises
    ------
    ValueError if black has no valid move.
    """
    for r, row in enumerate(board):
        for c, tile in enumerate(row):
            if tile != "B":
                continue
            src = r, c
            for d in (-1, 0, 1):
                dst = r + 1, c + d
                if is_valid_move(board, src, dst):
                    return src, dst
    raise ValueError("no valid move")
|
||||
|
||||
# makes a move effective on the board by modifying board state,
# or returning a new board with updated board state
def state_change(
    board: Board,
    src: tuple[int, int],
    dst: tuple[int, int],
    in_place: bool = True
) -> Board:
    """
    Updates the board configuration by modifying existing values if in_place is set to True,
    or creating a new board with updated values if in_place is set to False.
    If the move is not valid for black, the board is returned unchanged.

    Parameters
    ----------
    board: 2D list-of-lists. Contains characters "B", "W", and "_"
    representing black pawn, white pawn, and empty cell respectively.

    src: tuple[int, int]. Source position of the pawn.

    dst: tuple[int, int]. Destination position of the pawn.

    in_place: bool. Whether the modification is to be made in-place or to a deep copy of the given `board`.

    Returns
    -------
    The modified board.
    """
    if not in_place:
        board = copy.deepcopy(board)
    if is_valid_move(board, src, dst):
        sr, sc = src
        dr, dc = dst
        board[sr][sc] = "_"
        board[dr][dc] = "B"
    return board
|
||||
|
||||
# checks if game is over
def is_game_over(board: Board) -> bool:
    """
    Returns True if game is over, i.e. a pawn has reached the opponent's
    home row or one side has no pawns left.

    Parameters
    ----------
    board: 2D list-of-lists. Contains characters "B", "W", and "_"
    representing black pawn, white pawn, and empty cell respectively.

    Returns
    -------
    A bool representing whether the game is over.
    """
    # black wins by reaching the last row, white by reaching the first row
    if any(tile == "B" for tile in board[ROW - 1]) or any(tile == "W" for tile in board[0]):
        return True
    wcount, bcount = 0, 0
    for row in board:
        for tile in row:
            if tile == "B":
                bcount += 1
            elif tile == "W":
                wcount += 1
    return bcount == 0 or wcount == 0
|
||||
|
||||
|
||||
################################################
# Utility functions for game playing framework #
################################################
|
||||
|
||||
# move making function for game playing (runs in a child process)
def make_move_job_func(player, board: Board, queue) -> None:
    # disable stdout and stderr to prevent prints
    sys.stdout = open(os.devnull, "w")
    sys.stderr = open(os.devnull, "w")
    try:
        src, dst = player.make_move(board)
        queue.put((src, dst))
    except KeyboardInterrupt:
        sys.exit()
    except Exception as e:
        # hand the error to the parent and signal failure through exit code 1
        queue.put(e)
        sys.exit(1)
    finally:
        # reenable stdout and stderr
        sys.stdout = sys.__stdout__
        sys.stderr = sys.__stderr__
    return
|
||||
|
||||
# game playing function. Takes in the initial board.
# Returns True if black wins and False if white wins,
# or an error message (str) if a player fails.
def play(playerAI_A, playerAI_B, board: Board) -> "bool | str":
    colors = (black, white) = "Black(Student agent)", "White(Test agent)"
    players = []
    random_moves = 0
    move = 0

    # disable stdout for people who leave print statements in their code, disable stderr
    sys.stdout = open(os.devnull, "w")
    sys.stderr = open(os.devnull, "w")
    try:
        players.append(playerAI_A)
    except KeyboardInterrupt:
        sys.exit()
    except Exception:
        return f"{black} failed to initialise"
    finally:
        # reenable stdout and stderr
        sys.stdout = sys.__stdout__
        sys.stderr = sys.__stderr__

    # disable stdout for people who leave print statements in their code, disable stderr
    sys.stdout = open(os.devnull, "w")
    sys.stderr = open(os.devnull, "w")
    try:
        players.append(playerAI_B)
    except KeyboardInterrupt:
        sys.exit()
    except Exception:
        return f"{white} failed to initialise"
    finally:
        # reenable stdout and stderr
        sys.stdout = sys.__stdout__
        sys.stderr = sys.__stderr__

    # game starts
    color = None
    while not is_game_over(board):
        player = players[move % 2]
        color = colors[move % 2]
        src, dst = MOVE_NONE
        if color == white:
            # white plays on the inverted board, where its pawns appear as "B"
            invert_board(board)
            src, dst = player.make_move(board)
        else: # black
            result_queue = multiprocessing.Queue()
            board_copy = copy.deepcopy(board)
            mp = multiprocessing.Process(target=make_move_job_func, args=(player, board_copy, result_queue))
            mp.start()
            mp.join(timeout=3)
            exit_code = mp.exitcode
            if mp.is_alive():
                mp.terminate()
            if exit_code is None:
                # timed out: keep MOVE_NONE so that a fallback move is chosen below
                del result_queue
            elif exit_code == 1:
                e = result_queue.get()
                del result_queue
                return f"{black} returned err={e} during move"
            elif exit_code == 0:
                src, dst = result_queue.get()
                del result_queue
            else:
                del result_queue

        is_valid = False
        try:
            is_valid = is_valid_move(board, src, dst)
        except KeyboardInterrupt:
            sys.exit()
        except Exception:
            is_valid = False
        if not is_valid:
            # if move is invalid or time is exceeded, then we fall back to
            # the first valid move returned by generate_rand_move
            random_moves += 1
            src, dst = generate_rand_move(board)

        state_change(board, src, dst) # makes the move effective on the board
        if color == white:
            # restore the board to black's point of view
            invert_board(board)
        move += 1

    # return f"{color} win; Random move made by {black}: {random_moves};"
    return color == black
|
||||
|
||||
# wraps a test so that an exception is reported as a failure message instead of propagating
def wrap_test(func: Callable) -> Callable:
    def inner(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as e:
            traceback.print_exc()
            return f"FAILED, reason: {e}"
    return inner
|
||||
|
||||
if os.name == "nt":
    # thread-based fallback for Windows, used instead of timeout_decorator
    def timeout(timeout):
        def decorate(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                # default result: stays a TimeoutError unless func finishes in time
                ret = TimeoutError(f'Function [{func.__name__}] exceeded timeout of [{timeout} seconds]')
                def run_func():
                    nonlocal ret
                    try:
                        ret = func(*args, **kwargs)
                    except Exception as e:
                        ret = e
                t = Thread(target=run_func, daemon=True)
                try:
                    t.start()
                    t.join(timeout)
                except Exception:
                    traceback.print_exc()
                    raise
                if isinstance(ret, BaseException):
                    raise ret
                return ret
            return wrapper
        return decorate
else:
    from timeout_decorator import timeout
|
||||
|
||||
@wrap_test
@timeout(TIME_LIMIT)
def test_move(board: Board, playerAI) -> bool:
    board_copy = copy.deepcopy(board)
    start = time.time()
    src, dst = playerAI.make_move(board_copy)
    end = time.time()
    move_time = end - start
    valid = is_valid_move(board, src, dst)
    return valid and move_time <= 3
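
The framework above validates and applies every move as if it were black's: on white's turn `play` inverts the board, asks the agent for a move, applies it, and inverts the board back. The following is a minimal, self-contained sketch of that perspective trick. The helper `invert` is a hypothetical stand-in for `invert_board(board, in_place=False)`, and the starting layout (two rows of pawns per side on a 6×6 board) is an assumption made only for illustration.

```python
import copy

Board = list[list[str]]


def invert(board: Board) -> Board:
    """Hypothetical stand-in for invert_board(board, in_place=False)."""
    flipped = copy.deepcopy(board)
    flipped.reverse()  # flip the rows top to bottom
    swap = {"B": "W", "W": "B", "_": "_"}
    return [[swap[tile] for tile in row] for row in flipped]


# Assumed starting layout: black on the top two rows, white on the bottom two.
board = [
    list("BBBBBB"),
    list("BBBBBB"),
    list("______"),
    list("______"),
    list("WWWWWW"),
    list("WWWWWW"),
]

# White's turn: look at the board from white's side, where white's pawns appear as "B".
view = invert(board)
assert view[1][0] == "B"  # this is white's pawn at board[4][0]

# Advance it one row "down" in the inverted view, exactly as a black pawn would move.
view[1][0], view[2][0] = "_", "B"

# Invert back to recover the real board: the white pawn moved from row 4 to row 3.
after = invert(view)
assert after[4][0] == "_" and after[3][0] == "W"
```

Because inverting twice returns the original position, an agent only ever needs move logic for black, which is why `is_valid_move` and `state_change` handle `"B"` pawns exclusively.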
|