nus/cs2109s/labs/final/scratchpad.ipynb
2024-04-29 12:45:46 +08:00

{
"cells": [
{
"cell_type": "markdown",
"id": "7d017333",
"metadata": {},
"source": [
"# Final Assessment Scratch Pad"
]
},
{
"cell_type": "markdown",
"id": "d3d00386",
"metadata": {},
"source": [
"## Instructions"
]
},
{
"cell_type": "markdown",
"id": "ea516aa7",
"metadata": {},
"source": [
"1. Please use only this Jupyter notebook to work on your model, and **do not use any extra files**. If you need to define helper classes or functions, feel free to do so in this notebook.\n",
"2. This template is intended to be general, but it may not cover every use case. The sections are given so that it will be easier for us to grade your submission. If your specific use case isn't addressed, **you may add new Markdown or code blocks to this notebook**. However, please **don't delete any existing blocks**.\n",
"3. If you don't think a particular section of this template is necessary for your work, **you may skip it**. Be sure to explain clearly why you decided to do so."
]
},
{
"cell_type": "markdown",
"id": "022cb4cd",
"metadata": {},
"source": [
"## Report"
]
},
{
"cell_type": "markdown",
"id": "9c14a2d8",
"metadata": {},
"source": [
"##### Overview\n",
"https://chat.openai.com/share/ec6c6778-d7cc-48e2-98d1-24b7f6a6a769\n",
"\n",
"##### 1. Descriptive Analysis\n",
"At the start, I *did* not read the main.ipynb and jumped straight into teh scratchpad to read the data directly. After fiddling with the data for about an hour, being confused about the variable n (6-10) x 16 x 16 images, I decided to read the main.ipynb fully. I then realised that the data was a list of videos. With that in mind, I decided to plot out the images. However, on first observation, I did not come to realise teh images as japanese characters. I only came to this understanding at 9.30PM on Sunday.\n",
"\n",
"##### 2. Detection and Handling of Missing Values\n",
"There were quite a few NaNs for both the X and y values. Initially, I considered the y NaNs as a class of its own. On further reading of teh main file, I realised that the Nans were not a class, but rows to be removed. After filtering out the y values, there was still the X values to consider. Initially, to get a quick model out, I decided to zero out the values. In the end, I decided to replace the NaNs with the average of the frame. This was done by iterating through each frame, and replacing the NaNs with the average of the frame. This was done with the assistance of GPT for the code generation.\n",
"\n",
"##### 3. Detection and Handling of Outliers\n",
"There were quite a few outliers in teh dataset, which strayed away from the general min and max of 0 and 255. To resolve this, I used the np.clip(X, 0, 255) to upper bound and lowerbound the values. I did some experimentation with trying to make the different values more distinct, but this was not successful. With some rudimentary code, I was able to improve the performance of teh detection, somewhat, but it was not very consistent. \n",
"\n",
"I decided to use equalize also, which should make the data more prominent. However, with the equalize method, I hit time limit, and was not able to use that. \n",
"\n",
"##### 4. Detection and Handling of Class Imbalance \n",
"There was heavy class imbalance in the dataset. This was fixed with sampling the data. I used both upsampling and downsampling, where I upsampled data when it didn't hit the minimum I needed from each class, and downsampled data when it exceeded the maximum I needed from each class. \n",
"\n",
"##### 5. Understanding Relationship Between Variables\n",
"This analysis was not particularly done. Since I was processing it as images, I didn't plan to use dimentionality reduction techniques or other means. \n",
"\n",
"\n",
"##### 6. Data Visualization\n",
"The data visualisation can be seen below. Generally, I used data visualisation to directly see the images, and also see the distribution of the classes. \n",
"\n",
"Visualization was also helpful in viewing the outliers, especially with boxplots. However, I didn't use visualiastions to inspect the model output. \n",
"##### 7. General Preprocessing\n",
"The general preprocessing was done by reducing the data to 6 frames, and then processing the data to remove NaNs and clip the values. The data was then sampled to `600` elements, (which was obtained through hyperparameter optimisation). \n",
" \n",
"##### 8. Feature Selection \n",
"NO feature selection and engineering was done, generally the entire dataset was used. NaN values were replaced by frame averages. \n",
"\n",
"##### 9. Feature Engineering\n",
"No featuer engineering was also done. \n",
"\n",
"##### 10. Creating Models\n",
"The model that I eventually came up with after a lot of experimentation is a 2 layer CNN. I had experimented with a CNN -> RNN, but that didn't work very well. The first model I had created was a 2D CNN, but I struggled with mapping the Videos to a stream of photos effectively. I then created a 1 layer conv3d model, and that gave me a solid result. I then experimented with the possible 2nd layers for the model. I first used LSTM, as I thought that since this was video data, it would be more effective. However, the LSTM model did not give good results. I am predicting this is due to bad hyperparameter optimisation. When I switched the 2nd layer to another 3D CNN, it worked quite well. Implementing batch normalization also improved the results greatly. \n",
"##### 11. Model Evaluation\n",
"The model was evaludated using F1. I averaged out the results of 3 runs of the model with F1 before I decided to finalize this. \n",
"\n",
"##### 12. Hyperparameters Search\n",
"Optuna was used to search for hyperparameters. Learning this library was quite useful. However, I wasn't able to optimally implement this, due to large variances when running hte model multiple times. Figuring out how to take the average value of multiple runs for each parameter search would have been useful to find the most optimal parameters. I mainly used this to find parameters for lr, hidden layers and batch size. I wasn't able to use it optimally due to time constraints too. \n",
"\n",
"##### Conclusion\n",
"I spent way too long due to a bad understanding of a lot of concepts in AI. This exam helped me to learn a lot of stuff that we had used in class, but didn't fully understand. For example, in the psets, a lot of the code writing was very machanical. But this exam allowed us to be creative and problem solve. However, it would have been nicer if the duration was much shorter. \n",
"\n",
"The main aspects I struggled with was trying to figure out what to with the videos. The initial method of processing them as just images was not very effective. This was mainly due to my inability to code out a solution well. "
]
},
{
"cell_type": "markdown",
"id": "49dcaf29",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"id": "27103374",
"metadata": {},
"source": [
"# Workings (Not Graded)\n",
"\n",
"You will do your working below. Note that anything below this section will not be graded, but we might counter-check what you wrote in the report above with your workings to make sure that you actually did what you claimed to have done. "
]
},
{
"cell_type": "markdown",
"id": "0f4c6cd4",
"metadata": {},
"source": [
"## Import Packages\n",
"\n",
"Here, we import some packages necessary to run this notebook. In addition, you may import other packages as well. Do note that when submitting your model, you may only use packages that are available in Coursemology (see `main.ipynb`)."
]
},
{
"cell_type": "code",
"execution_count": 42,
"id": "cded1ed6",
"metadata": {
"ExecuteTime": {
"end_time": "2024-04-28T06:46:52.407375Z",
"start_time": "2024-04-28T06:46:52.405317Z"
}
},
"outputs": [],
"source": [
"import pandas as pd\n",
"import os\n",
"import numpy as np"
]
},
{
"cell_type": "markdown",
"id": "748c35d7",
"metadata": {},
"source": [
"## Load Dataset\n",
"\n",
"The dataset `data.npy` consists of $N$ grayscale videos and their corresponding labels. Each video has a shape of (L, H, W). L represents the length of the video, which may vary between videos. H and W represent the height and width, which are consistent across all videos. \n",
"\n",
"A code snippet that loads the data is provided below."
]
},
{
"cell_type": "markdown",
"id": "c09da291",
"metadata": {},
"source": [
"### Load Data"
]
},
{
"cell_type": "code",
"execution_count": 267,
"id": "6297e25a",
"metadata": {
"ExecuteTime": {
"end_time": "2024-04-28T13:26:08.028976Z",
"start_time": "2024-04-28T13:26:08.014557Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Number of data sample: 2500\n",
"Shape of the first data sample: (10, 16, 16)\n",
"Shape of the third data sample: (8, 16, 16)\n"
]
}
],
"source": [
"with open('data.npy', 'rb') as f:\n",
" data = np.load(f, allow_pickle=True).item()\n",
" X = data['data']\n",
" y = data['label']\n",
"\n",
"\n",
"print('Number of data sample:', len(X))\n",
"print('Shape of the first data sample:', X[0].shape)\n",
"print('Shape of the third data sample:', X[2].shape)"
]
},
{
"cell_type": "markdown",
"id": "cbe832b6",
"metadata": {},
"source": [
"## Data Exploration & Preparation"
]
},
{
"cell_type": "markdown",
"id": "2f6a464c",
"metadata": {},
"source": [
"### 1. Descriptive Analysis"
]
},
{
"cell_type": "code",
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"import torch\n",
"\n",
"def show_images(images, n_row=5, n_col=5, figsize=[12,12]):\n",
" _, axs = plt.subplots(n_row, n_col, figsize=figsize)\n",
" axs = axs.flatten()\n",
" for img, ax in zip(images, axs):\n",
" ax.imshow(img, cmap='gray')\n",
" plt.show()"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-04-28T15:03:57.588510Z",
"start_time": "2024-04-28T15:03:57.585485Z"
}
},
"id": "f8155151c5e660f5",
"execution_count": 330
},
{
"cell_type": "code",
"execution_count": 338,
"id": "3b1f62dd",
"metadata": {
"ExecuteTime": {
"end_time": "2024-04-28T15:07:52.672987Z",
"start_time": "2024-04-28T15:07:51.997288Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"torch.Size([6, 1, 16, 16])\n"
]
},
{
"data": {
"text/plain": "<Figure size 2000x2000 with 6 Axes>",
"image/png": ""
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": "<Figure size 2000x2000 with 6 Axes>",
"image/png": ""
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"tensor([227, 255, 255, 255, 212, 242, 255, 255, 247, 255, 255, 255, 208, 231,\n",
" 230, 223], dtype=torch.uint8)\n",
"[227. 255. 255. 255. 212. 242. 255. 255. 247. 255. 255. 255. 208. 231.\n",
" 230. 223.]\n"
]
}
],
"source": [
"# Remove NaN's from the input\n",
"import torchvision\n",
"not_nan_indices = np.argwhere(~np.isnan(np.array(y))).squeeze()\n",
"y = [y[i] for i in not_nan_indices]\n",
"X = [X[i] for i in not_nan_indices]\n",
"# Plot each image in a row\n",
"tmp = X[0][:6].copy()\n",
"# Set 255 to all values in X which are greater than 120\n",
"# Set 0 to all values in X which are less than 100\n",
"\n",
"tmp = np.array(tmp)\n",
"tmp = np.nan_to_num(tmp, 0)\n",
"tmp = np.clip(tmp, 0, 255)\n",
"tensor = torch.Tensor(tmp)\n",
"tensor = tensor.to(torch.uint8).reshape(-1, 1, 16, 16)\n",
"print(tensor.shape)\n",
"tensor = torchvision.transforms.functional.equalize(tensor)\n",
"tensor = tensor.reshape(6, 16, 16)\n",
"# 100 all values less than \n",
"show_images(tensor, n_row=1, n_col=6, figsize=[20, 20])\n",
"show_images(tmp, n_row=1, n_col=6, figsize=[20, 20])\n",
"print(tensor[0][0])\n",
"print(tmp[0][0])\n",
"# At 9.30PM on Sunday I've come to realies that this is japanese characters... A bit too late to figure that out...\n"
]
},
{
"cell_type": "code",
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1.0, 1.0, 4.0, 1.0, 1.0, 3.0, 1.0, 1.0, 1.0, 0.0]\n"
]
}
],
"source": [
"print(y[:10]) # y is just a list of values"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-04-28T14:16:29.783690Z",
"start_time": "2024-04-28T14:16:29.781229Z"
}
},
"id": "3b890da00340343f",
"execution_count": 294
},
{
"cell_type": "code",
"outputs": [],
"source": [
"pd.DataFrame(y).value_counts()\n",
"# From this, we know that we need to undersample or upsample the data. We will pick understampling as the data is quite large, and understampling will reduce the training time."
],
"metadata": {
"collapsed": false
},
"id": "dd66bb1efa4e602c",
"execution_count": null
},
{
"cell_type": "markdown",
"id": "adb61967",
"metadata": {},
"source": [
"### 2. Detection and Handling of Missing Values"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4bb9cdfb",
"metadata": {},
"outputs": [],
"source": [
"np.isnan(X6).sum() # We know that there is quite a few NaNs in the data. However, I will not be figuring out which column / nan has this value. Instead we can just take the average of each image, adn use that as the input to the nan"
]
},
{
"cell_type": "markdown",
"id": "8adcb9cd",
"metadata": {},
"source": [
"### 3. Detection and Handling of Outliers"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ed1c17a1",
"metadata": {},
"outputs": [],
"source": [
"# Check if there are outliers\n",
"# We can check if there are outliers by checking the max and min values of each video\n",
"np.max(X6, axis=3)\n",
"# From this we can see that there are values whic exceed 255, and thus, we can clip that."
]
},
{
"cell_type": "markdown",
"id": "d4916043",
"metadata": {},
"source": [
"### 4. Detection and Handling of Class Imbalance"
]
},
{
"cell_type": "code",
"execution_count": 83,
"id": "ad3ab20e",
"metadata": {
"ExecuteTime": {
"end_time": "2024-04-28T06:59:16.949196Z",
"start_time": "2024-04-28T06:59:16.943398Z"
}
},
"outputs": [
{
"data": {
"text/plain": "0\n0 300\n1 300\n2 300\n3 300\n4 300\n5 300\nName: count, dtype: int64"
},
"execution_count": 83,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Handling Undersampling\n",
"pd.DataFrame(y).value_counts()\n",
"# There is a class imbalance, and we will need to undersample the data"
]
},
{
"cell_type": "markdown",
"id": "2552a795",
"metadata": {},
"source": [
"### 5. Understanding Relationship Between Variables"
]
},
{
"cell_type": "code",
"execution_count": 44,
"id": "29ddbbcf",
"metadata": {
"ExecuteTime": {
"end_time": "2024-04-28T06:46:52.478781Z",
"start_time": "2024-04-28T06:46:52.477156Z"
}
},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "757fb315",
"metadata": {},
"source": [
"### 6. Data Visualization"
]
},
{
"cell_type": "code",
"execution_count": 349,
"id": "93f82e42",
"metadata": {
"ExecuteTime": {
"end_time": "2024-04-28T15:44:25.035919Z",
"start_time": "2024-04-28T15:44:24.956794Z"
}
},
"outputs": [
{
"data": {
"text/plain": "<Axes: >"
},
"execution_count": 349,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": "<Figure size 640x480 with 1 Axes>",
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAjwAAAGdCAYAAAAWp6lMAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/H5lhTAAAACXBIWXMAAA9hAAAPYQGoP6dpAAA9e0lEQVR4nO3dfXSU5Z3H/08yJEPghzlKSEIImtH41CbUFTyAmgyxJkqMNQ2xLSqnnnZ9KKLFQO2GdhdwK6kW0lVYUNeH7U8ruieM2TUGfowVkuEhlgfZEl01YMJjMBExwQB5uOf+/WFmDmNijQIZueb9OodznPv+ZvIdz0nuT677uq8ryrZtWwAAAAaLDncDAAAAZxqBBwAAGI/AAwAAjEfgAQAAxiPwAAAA4xF4AACA8Qg8AADAeAQeAABgvCHhbuDbwu/36+DBgxoxYoSioqLC3Q4AABgA27Z19OhRpaSkKDr6y8dxCDy9Dh48qLFjx4a7DQAA8A3s27dPqampX3qewNNrxIgRkj7/H3bOOeeEuRsAp1N3d7fWrl2rvLw8xcTEhLsdAKdRe3u7xo4dG7yOfxkCT6/AbaxzzjmHwAMYpru7W8OGDdM555xD4AEM9VXTUZi0DAAAjEfgAQAAxiPwAAAA4xF4AACA8Qg8AADAeAQeAABgPAIPAAAwHoEHAAAYj8ADwGiWZammpka1tbWqqamRZVnhbglAGBB4ABjL4/EoPT1dubm5Ki8vV25urtLT0+XxeMLdGoBBxtYSAIzk8XhUXFysm266SQ8++KAaGhp08cUXy+v1qri4WBUVFSoqKgp3mwAGSZRt23a4m/g2aG9vV3x8vNra2thLCzjLWZal9PR0JSQkqLW1VXv27Ameu+CCCzRq1CgdPnxYDQ0NcjgcYewUwKka6PWbW1oAjOPz+dTU1KStW7dq3Lhx8vl8WrlypXw+n8aNG6etW7eqsbFRPp8v3K0CGCQEHgDGOXDggCRp6tSpqqys1MSJExUXF6eJEyeqsrJSU6dODakDYD4CDwDjtLa2SpKKiooUHR36ay46OlqFhYUhdQDMR+ABYJxRo0ZJ+nzist/vDznn9/tVWVkZUgfAfAQeAMYZM2aMJGnNmjUqLCxUXV2djh8/rrq6OhUWFmrNmjUhdQDMx1NavXhKCzDHyU9pffzxx2pqagqec7lcGjlyJE9pAYYY6PWbdXgAGMfhcGjJkiVfug7P66+/roqKCsIOEEEIPACMVFRUpIqKCs2ZM0dVVVXB4y6Xi0UHgQjELa1e3NICzGRZltatW6fVq1dr6tSpysnJYWQHMAgLDwIAAPQi8AAwFpuHAggg8AAwUmDz0MzMzJCtJTIzM1VcXEzoASIMc3h6MYcHMEfgsfTMzExVVlbKsixVV1crPz9fDodDhYWFqq+v57F0wADM4QEQsQKbh86bN6/frSVKS0vZPBSIMAQeAMZpbm6WJGVkZPR7PnA8UAfAfAQeAMYZPXq0JKm+vr7f84HjgToA5iPwADBOVlaW0tLStGjRon43Dy0rK5PL5VJWVlaYOgQw2Ag8AIwT2Fqiqqqq381Dq6qqtHjxYiYsAxGErSUAGOnkrSWys7ODx9laAohMgzbCU1ZWpqioKM2ePTt4zLZtLViwQCkpKYqLi9OUKVP0zjvvhHxdZ2en7r//fiUkJGj48OH6wQ9+oP3794fUHDlyRDNmzFB8fLzi4+M1Y8YMffrpp4PwqQB8mxUVFWnXrl3yer0qKSmR1+tVQ0MDYQeIQIMSeLZs2aKnn35a48aNCzn+2GOPqby8XMuWLdOWLVuUnJys3NxcHT16NFgze/Zsvfrqq3r55Ze1YcMGffbZZyooKJBlWcGa2267TTt27NCaNWu0Zs0a7dixQzNmzBiMjwbgW87hcMjtdis7O1tut5vbWECkss+wo0eP2hdffLHt9Xptt9tt//KXv7Rt27b9fr+dnJxs//73vw/Wnjhxwo6Pj7effPJJ27
Zt+9NPP7VjYmLsl19+OVhz4MABOzo62l6zZo1t27b97rvv2pLsurq6YM3mzZttSfZ777034D7b2tpsSXZbW9upfFwA30JdXV12ZWWl3dXVFe5WAJxmA71+n/E5PPfdd59uuukmXX/99frd734XPN7Y2KhDhw4pLy8veMzpdMrtdmvTpk265557tG3bNnV3d4fUpKSkKCMjQ5s2bdINN9ygzZs3Kz4+XhMnTgzWTJo0SfHx8dq0aZMuvfTSfvvq7OxUZ2dn8HV7e7skqbu7W93d3aft8wMIv8DPND/bgHkG+nN9RgPPyy+/rO3bt2vLli19zh06dEiSlJSUFHI8KSlJe/bsCdbExsbq3HPP7VMT+PpDhw4pMTGxz/snJiYGa/pTVlamhQsX9jm+du1aDRs27Cs+GYCzkdfrDXcLAE6zY8eODajujAWeffv26Ze//KXWrl2roUOHfmldVFRUyGvbtvsc+6Iv1vRX/1XvU1paqpKSkuDr9vZ2jR07Vnl5eeylBRimu7tbXq9Xubm5iomJCXc7AE6jwB2ar3LGAs+2bdvU0tKi8ePHB49ZlqXa2lotW7ZM77//vqTPR2hOXu20paUlOOqTnJysrq4uHTlyJGSUp6WlRVdffXWw5qOPPurz/VtbW/uMHp3M6XTK6XT2OR4TE8MvRMBQ/HwD5hnoz/QZe0rr+9//vnbu3KkdO3YE/02YMEG33367duzYoQsvvFDJyckhQ8xdXV2qqakJhpnx48crJiYmpKa5uVn19fXBmsmTJ6utrU1//etfgzVvvfWW2tragjUAIpdlWaqpqVFtba1qampCnvAEEDnO2AjPiBEj+mzcN3z4cI0cOTJ4fPbs2Vq0aJEuvvhiXXzxxVq0aJGGDRum2267TZIUHx+vn//855ozZ45Gjhyp8847T3PnzlVmZqauv/56SdLll1+uG2+8UXfddZeeeuopSdLdd9+tgoKCL52wDCAyeDwezZkzR01NTZKk8vJypaWlacmSJazFA0SYsG4t8dBDD2n27NmaOXOmJkyYoAMHDmjt2rUaMWJEsOaPf/yjCgsL9aMf/UjXXHONhg0bptdeey1kLY0///nPyszMVF5envLy8jRu3Di98MIL4fhIAL4lPB6PiouLlZmZKZ/Pp5UrV8rn8ykzM1PFxcXyeDzhbhHAIIqybdsOdxPfBu3t7YqPj1dbWxuTloGznGVZSk9PV2ZmpiorK2VZlqqrq5Wfny+Hw6HCwkLV19eroaGBhQiBs9xAr99sHgrAOD6fT01NTZo3b56io0N/zUVHR6u0tFSNjY3y+Xxh6hDAYCPwADBOc3OzJPWZRxgQOB6oA2A+Ag8A4wSWuqivr+/3fOD4yUtiADAbgQeAcbKyspSWlqZFixbJ7/eHnPP7/SorK5PL5VJWVlaYOgQw2Ag8AIzjcDi0ZMkSVVVVqbCwUHV1dTp+/Ljq6upUWFioqqoqLV68mAnLQAQ545uHAkA4FBUVqaKiQnPmzFF2dnbwuMvlUkVFBevwABGGx9J78Vg6YCbLsrRu3TqtXr1aU6dOVU5ODiM7gEEGev1mhAeA0RwOh9xutzo6OuR2uwk7QIRiDg8AADAegQcAABiPwAMAAIxH4AEAAMYj8AAAAOMReAAAgPEIPAAAwHgEHgAAYDwCDwAAMB6BBwAAGI/AAwAAjEfgAQAAxiPwAAAA4xF4AACA8Qg8AADAeAQeAABgPAIPAAAwHoEHgNEsy1JNTY1qa2tVU1Mjy7LC3RKAMCDwADCWx+NRenq6cnNzVV5ertzcXKWnp8vj8YS7NQCDjMADwEgej0fFxcXKzMyUz+fTypUr5fP5lJmZqeLiYkIPEGEIPACMY1mW5syZo4KCAq1atUonTpzQli1bdOLECa1atUoFBQWaO3cut7eACELgAWAcn8+npqYmXX311brkkktCbmldcsklmjx5shobG+Xz+cLdKoBBQuABYJzm5mZJUmlpab+3tObNmxdSB8B8BB4Axk
lMTJQkXXvttaqsrNTEiRMVFxeniRMnqrKyUtdcc01IHQDzEXgARJyoqKhwtwBgkBF4ABinpaVFkrRx40YVFhaqrq5Ox48fV11dnQoLC7Vx48aQOgDmI/AAMM7o0aMlSYsWLdLOnTuVnZ2t6dOnKzs7W/X19XrkkUdC6gCYL8q2bTvcTXwbtLe3Kz4+Xm1tbTrnnHPC3Q6AU2BZltLT05WQkKDW1lbt2bMneO6CCy7QqFGjdPjwYTU0NMjhcISxUwCnaqDXb0Z4ABjH4XDo1ltv1datW3XixAmtWLFCzz//vFasWKETJ05o69atKi4uJuwAEYQRnl6M8ADm+HsjPGlpaUpISGCEBzAEIzwAIlZg4cGlS5dq9+7d8nq9Kikpkdfr1a5du/TEE0+w8CAQYQg8AIwTWFAwIyOj3/OB4yw8CESOIeFuAABOt8DTV8uWLdNTTz2lpqYmSVJ5ebnS0tJ09913h9QBMB9zeHoxhwcwh2VZGj16tFpbW1VQUKBf//rX2r9/v1JTU/Xoo4+qqqpKiYmJOnjwIHN4gLPcQK/fjPAAMFJgNWXbtrV9+3Y1NDTo4osvFn/jAZGJwAPAOD6fTy0tLbr99tv1yiuv6PXXXw+eGzJkiG677Ta99NJL8vl8mjJlSvgaBTBoCDwAjBOYjPznP/9ZBQUFys3NDY7weL1evfTSSyF1AMxH4AFgnJN3S//v//5vWZal6upq5efna9asWcrOztbGjRvZLR2IIDyWDiDisFs6EHkIPACMw27pAL6IwAPAOOyWDuCLCDwAjJOVlaW0tDRt2rRJH3zwQcjWEu+//742b94sl8ulrKyscLcKYJAQeAAYx+FwaMmSJaqqqtK0adPkdDp11VVXyel0atq0aaqqqtLixYtZdBCIIDylBcBIRUVFqqio0Jw5c5SdnR087nK5VFFRoaKiojB2B2CwsbVEL7aWAMxkWZbWrVun1atXa+rUqcrJyWFkBzAIW0sAgD6/veV2u9XR0SG3203YASIUc3gAAIDxCDwAjGZZlmpqalRbW6uamhpZlhXulgCEAYEHgLE8Ho/S09OVm5ur8vJy5ebmKj09XR6PJ9ytARhkBB4ARvJ4PCouLlZmZqZ8Pp9Wrlwpn8+nzMxMFRcXE3qACMNTWr14Sgswh2VZSk9PV2ZmpiorK0M2D3U4HCosLFR9fb0aGhqYxAyc5QZ6/WaEB4BxfD6fmpqaNG/ePNm2HTKHx7ZtlZaWqrGxUT6fL9ytAhgkZzTwlJWV6aqrrtKIESOUmJiowsJCvf/++yE1tm1rwYIFSklJUVxcnKZMmaJ33nknpKazs1P333+/EhISNHz4cP3gBz/Q/v37Q2qOHDmiGTNmKD4+XvHx8ZoxY4Y+/fTTM/nxAHxLNTc3S5J2797d7xyeDz/8MKQOgPnOaOCpqanRfffdp7q6Onm9XvX09CgvL08dHR3Bmscee0zl5eVatmyZtmzZouTkZOXm5uro0aPBmtmzZ+vVV1/Vyy+/rA0bNuizzz5TQUFByNMWt912m3bs2KE1a9ZozZo12rFjh2bMmHEmPx6Ab6nApqB33HFHv3N47rjjjpA6ABHAHkQtLS22JLumpsa2bdv2+/12cnKy/fvf/z5Yc+LECTs+Pt5+8sknbdu27U8//dSOiYmxX3755WDNgQMH7OjoaHvNmjW2bdv2u+++a0uy6+rqgjWbN2+2JdnvvffegHpra2uzJdltbW2n/DkBhFdnZ6c9ZMgQOykpye7u7ra7urrsyspKu6ury+7u7raTkpLsIUOG2J2dneFuFcApGuj1e1BXWm5ra5MknXfeeZKkxsZGHTp0SHl5ecEap9Mpt9utTZs26Z577tG2bdvU3d0dUpOSkqKMjAxt2rRJN9xwgzZv3qz4+HhNnDgxWDNp0iTFx8dr06ZNuvTSS/v00tnZqc7OzuDr9vZ2SVJ3d7e6u7tP7wcHMKhqa2vV09OjlpYWFRYWas6cOTp+/Lg2bNigJUuWqK
WlRbZtq7a2Vm63O9ztAjgFA71mD1rgsW1bJSUluvbaa5WRkSFJOnTokCQpKSkppDYpKUl79uwJ1sTGxurcc8/tUxP4+kOHDikxMbHP90xMTAzWfFFZWZkWLlzY5/jatWs1bNiwr/npAHyb1NbWSvr8dviLL76o6667LnguMTFRs2fP1h//+EetXr065BY7gLPPsWPHBlQ3aIFn1qxZ+tvf/qYNGzb0ORcVFRXy2rbtPse+6Is1/dX/vfcpLS1VSUlJ8HV7e7vGjh2rvLw8HksHznLDhw9XeXm5Ro4c2ecPmLi4uOAo89SpUxnhAc5ygTs0X2VQAs/999+v//mf/1Ftba1SU1ODx5OTkyV9PkJz8uTBlpaW4KhPcnKyurq6dOTIkZBRnpaWFl199dXBmo8++qjP921tbe0zehTgdDrldDr7HI+JiVFMTMw3+JQAvi1ycnI0atQo/fa3v1VBQYFefPFF7d+/X6mpqXr00Uf1z//8z0pMTGTndMAAA71mn9GntGzb1qxZs+TxePTmm2/K5XKFnHe5XEpOTpbX6w0e6+rqUk1NTTDMjB8/XjExMSE1zc3Nqq+vD9ZMnjxZbW1t+utf/xqseeutt9TW1hasARBZTh7dtXvXV7VZZxWIXGdy5vQvfvELOz4+3l6/fr3d3Nwc/Hfs2LFgze9//3s7Pj7e9ng89s6dO+3p06fbo0ePttvb24M19957r52ammq/8cYb9vbt2+3rrrvO/t73vmf39PQEa2688UZ73Lhx9ubNm+3NmzfbmZmZdkFBwYB75SktwBzr1q2zJdllZWV2WlqaLSn4z+Vy2YsWLbIl2evWrQt3qwBO0UCv32c08Jz8S+bkf88//3ywxu/32/Pnz7eTk5Ntp9NpZ2dn2zt37gx5n+PHj9uzZs2yzzvvPDsuLs4uKCiw9+7dG1Jz+PBh+/bbb7dHjBhhjxgxwr799tvtI0eODLhXAg9gjpdeesmWZB89etTu6emxvV6vXVJSYnu9Xrunp8dub2+3JdkvvfRSuFsFcIoGev1mL61e7KUFmGP9+vXKycnR5s2bdeWVV2rp0qV68803dd111+n+++/Xtm3bdPXVV2vdunWaMmVKuNsFcAoGev0m8PQi8ADmCGwe6nA41NTUFLIqu8PhUFpamvx+P5uHAgZg81AAEcvhcOh73/uedu/eLYfDoYceekjLly/XQw89JIfDod27d2vcuHGEHSCCMMLTixEewBxdXV0aPny4hg8frnPPPVdNTU3Bcy6XS5988ok6OjrU0dGh2NjY8DUK4JQxwgMgYi1fvlw9PT1avHixdu3aJa/Xq5KSEnm9XjU0NOixxx5TT0+Pli9fHu5WAQySQd1LCwAGw+7duyVJBQUFcjgccrvd6ujokNvtlsPhUEFBQUgdAPMxwgPAOBdddJEkqaqqqt/zgeOBOgDmYw5PL+bwAOYIzOEZOXKk9u/fL9u2VV1drfz8fEVFRSk1NVWHDx9mDg9gAObwAIhYsbGxevDBB/XRRx8pNTVVzzzzjD755BM988wzSk1N1UcffaQHH3yQsANEEObwADDSY489JkkqLy/XzJkzg8eHDBmiX/3qV8HzACIDIzwAjDVp0iSlpqaGHBszZowmTZoUpo4AhAuBB4CRPB6PiouLNW7cOPl8Pq1cuVI+n0/jxo1TcXGxPB5PuFsEMIiYtNyLScuAOQJbS2RmZqqyslKWZQUnLTscDhUWFqq+vp6tJQADMGkZQMTy+XxqamrSvHnzFB0d+msuOjpapaWlamxslM/nC1OHAAYbgQeAcZqbmyVJGRkZsixLNTU1qq2tVU1NjSzLUkZGRkgdAPPxlBYA44wePVqStGzZMj311FPBvbTKy8uVlpamu+++O6QOgPmYw9OLOTyAOSzLUkpKilpaWlRQUKBf//rX2r9/v1JTU/Xoo4+qqqpKiYmJOnjwIHN4gLMcc3gARLST/5YL/Dd/3wGRi8ADwDg+n0+tra0qKytTfX29sr
OzNX36dGVnZ+udd97RokWL1NLSwqRlIIIQeAAYJzAZeezYsX1Gdfx+v84///yQOgDmI/AAME5gMvIdd9zR78KDd9xxR0gdAPMxabkXk5YBc7BbOhA5mLQMIGJt2rRJPT09amlpUVFRkerq6nT8+HHV1dWpqKhILS0t6unp0aZNm8LdKoBBQuABYJzA3JwXXnhBO3fuDJm0XF9frxdeeCGkDoD5CDwAjBOYm3PRRRdp165d8nq9KikpkdfrVUNDgy688MKQOgDmYw5PL+bwAOZg81AgcjCHB0DEcjgcWrJkiaqqqlRYWBgyh6ewsFBVVVVavHgxYQeIIOylBcBIRUVFqqio0Jw5c5SdnR087nK5VFFRoaKiojB2B2CwcUurF7e0ADNZlqV169Zp9erVmjp1qnJychjZAQwy0Os3IzwAjOZwOOR2u9XR0SG3203YASIUc3gAGM2yLNXU1Ki2tlY1NTWyLCvcLQEIAwIPAGN5PB6lp6crNzdX5eXlys3NVXp6ujweT7hbAzDICDwAjOTxeFRcXKzMzMyQvbQyMzNVXFxM6AEiDJOWezFpGTAH6/AAkYN1eABELJ/Pp6amJs2bN0/R0aG/5qKjo1VaWqrGxkb5fL4wdQhgsBF4ABgnsEdWRkZGv+cDx9lLC4gcBB4AxgnskVVfX9/v+cBx9tICIgeBB4BxsrKylJaWpkWLFsnv94ec8/v9Kisrk8vlUlZWVpg6BDDYCDwAjMNeWgC+iJWWARiJvbQAnIwRHgBG++LKG1+8xQUgMhB4ABgpsPDguHHjQhYeHDduHAsPAhGIhQd7sfAgYA4WHgQiBwsPAohYJy88aNt2yOahtm2z8CAQgQg8AIwTWFBw9+7d/W4e+uGHH4bUATAfgQeAcQILCt5xxx39bh56xx13hNQBMB9zeHoxhwcwR1dXl4YPH66RI0dq//79sm07OIcnKipKqampOnz4sDo6OhQbGxvudgGcAubwAIhYmzZtUk9Pj1paWlRUVBSy8GBRUZFaWlrU09OjTZs2hbtVAIOEwAPAOIG5OS+88IJ27typ7OxsTZ8+XdnZ2aqvr9cLL7wQUgfAfAQeAMYJzM256KKLtGvXLnm9XpWUlMjr9aqhoUEXXnhhSB0A8zGHpxdzeABzsA4PEDmYwwMgYrF5KIAvYvNQAEZi81AAJ+OWVi9uaQFmsixL69at0+rVqzV16lTl5OQwsgMYZKDXb0Z4ABjN4XDI7Xaro6NDbrebsANEKObwAAAA4zHCA+C027Z/jw4e/eiU3qOz84QO7Nt7Wvrx+y29/977+lCfKjr61Ed4xow9X07n0FN+n5QRSRqfesEpvw+Ar0bgAXBaNX7coekv/1HOUX8JdyuhkqWNn752et7rk9PzNp2t39f/d+cjciUMPz1vCOBLEXgAnFYff9ap7k8nqiD9eo09d9g3fp/urk61Htx3Sr0sm//LLz03a+Hj3/h9R6WMVUys8xt/vSS1ftaplQ2fqqOz55TeB8DAEHgAnFa7Wz6T3XOOPHWSdPwU323MN/7KPY8W/N3zi++8Rxf8uuqbvfn/+XXqn02SztFwJ7+GgcHATxqA0yrvu8mSpIsS/x/FxYTniaj5D5VozwDqxu/zaOFj5We8ny8z3DmE21nAIDFqHZ7ly5frD3/4g5qbm/Xd735X//Zv/6asrKwBfS3r8ADfLseOHdN77733jb52/PjxA67dtm3b137/yy67TMOGffPbdQBOn4hbh+eVV17R7NmztXz5cl1zzTV66qmnNHXqVL377rs6//zzw90egK/pvffe+1rB5Zv6Jt9j27ZtuvLKK89ANwDOFGNGeCZOnKgrr7xSK1asCB67/PLLVVhYqLKysq/8ekZ4gG+X0znCc955IzXRnau3arz65JPDIecY4QHObhE1wtPV1aVt27bpn/7pn0KO5+XladOmTWHqCsCpGDZs2GkZRfnNb36j+fPnq7q6Wv/9yv+rhQsX6pFHHgmeZ6QGiAxGBJ
6PP/5YlmUpKSkp5HhSUpIOHTrU79d0dnaqs7Mz+Lq9vV2S1N3dre7u7jPXLIBB9cgjj4QEnC/i5x04uw30Z9iIwBMQFRUV8tq27T7HAsrKyrRw4cI+x9euXctQNRBBqqurw90CgFNw7NixAdUZEXgSEhLkcDj6jOa0tLT0GfUJKC0tVUlJSfB1e3u7xo4dq7y8PObwAGe5uLg4HT/+1evkxMXFKT8/fxA6AnCmBO7QfBUjAk9sbKzGjx8vr9erH/7wh8HjXq9Xt9xyS79f43Q65XT2XSk1JiZGMTExZ6xXAGdeQ0ODUlNTB1THzztwdhvoz7ARgUeSSkpKNGPGDE2YMEGTJ0/W008/rb179+ree+8Nd2sABtmYMWMUGxurrq6uL62JjY3VmDHffCVnAGcXYwLPj3/8Yx0+fFgPP/ywmpublZGRoerqal1wATsRA5Gos7NTMTEx6unpu1fVkCFDQh5aAGC+6HA3cDrNnDlTTU1N6uzs1LZt25SdnR3ulgCEicfjkWVZuv766zV06FBFRUVp6NChuv7662VZljweT7hbBDCIjFl48FSx8CBgDsuylJ6eroSEBLW0tGjv3r3Bc+eff74SExN1+PBhNTQ0yOEIz35fAE6PiFp4EABO5vP51NTUpKampj7n9u7dGwxAPp9PU6ZMGdzmAISFUbe0AECSDhw4EPzv6OjQX3Mnvz65DoDZCDwAjNPc3Bz87/z8fPl8Pq1cuVI+ny9k3Z2T6wCYjVtaAIyzfft2SdKIESP06quvyrZtHT58WBMnTtSrr76q8847T0ePHg3WATAfIzwAjBOYo/PZZ5+pqKhIdXV1On78uOrq6lRUVKTPPvsspA6A+RjhAWCctLQ0bdy4Uampqdq5c2fIEhUul0upqanat2+f0tLSwtckgEHFCA8A4/z0pz+VJO3bt0/f/e539fjjj2vWrFl6/PHH9Z3vfEf79u0LqQNgPtbh6cU6PIA5LMvSyJEj1dbWpqioKJ38ay46Olp+v1/x8fE6fPgw6/AAZ7mBXr8Z4QFgHIfDoeeee06S9MW/6fx+vyTpueeeI+wAEYTAA8BIRUVFWrVqVZ/99NLS0rRq1SoVFRWFqTMA4cAtrV7c0gLM1NXVpaVLl+rNN9/Uddddp/vvv1+xsbHhbgvAacItLQARz+Px6NJLL9XcuXNVXV2tuXPn6tJLL2XjUCACEXgAGMnj8ai4uFiZmZkhKy1nZmaquLiY0ANEGG5p9eKWFmCOwG7pmZmZqqyslGVZqq6uVn5+vhwOhwoLC1VfX89u6YABuKUFIGIFdkufN29ev5uHlpaWqrGxUT6fL0wdAhhsBB4AxglsCpqRkdHv+cBxNg8FIgeBB4BxRo8eLUmqr6/v93zgeKAOgPkIPACMk5WVpbS0NC1atCi40GCA3+9XWVmZXC6XsrKywtQhgMFG4AFgHIfDoSVLlqiqqkqFhYUhu6UXFhaqqqpKixcvZsIyEEHYLR2AkYqKilRRUaE5c+b02S29oqKClZaBCMNj6b14LB0wEystA2bjsXQAEY+VlgEEEHgAGCmw0nJGRoaeeOIJzZo1S0888YQyMjJYaRmIQNzS6sUtLcAcgZWWExIS9PHHH6upqSl4Li0tTQkJCTp8+DArLQMG4JYWgIgVWGl527Zt/e6ltW3bNlZaBiIMgQeAcQ4cOCBJuvHGG7Vq1SqdOHFCW7Zs0YkTJ7Rq1SrdeOONIXUAzMdj6QCM09raKunz21eXXHJJ8JZWeXm50tLSgoEnUAfAfIzwADDOqFGjJEkrVqxQRkZGyC2tjIwMPfnkkyF1AMzHCA8A4yQnJwf/27Ztbd++XQ0NDbr44ot18nMaJ9cBMBuBB4CxxowZozVr1uj1118PHhsyZIjGjBnD/B0gwhB4ABinpaVF0ueTkpOSknTbbbepo6NDw4cP10svvRQMO4E6AOYj8AAwTmJioiTp8ssv17Fjx/THP/4xeC4tLU2XXXaZ3nvvvWAdAPMxaR
mAsUaOHKmGhgZ5vV6VlJTI6/Xqgw8+0MiRI8PdGoBBRuABYJzAraoNGzZo2rRpcjqduuqqq+R0OjVt2jRt3LgxpA6A+Qg8AIwzevRoSVJZWZl27typ7OxsTZ8+XdnZ2aqvr9eiRYtC6gCYj720erGXFmCOk/fS+uijj7Rv377gubFjxyopKYm9tABDsJcWgIjlcDh06623auvWrX0ePz9w4IC2bt2q4uJiwg4QQQg8AIxjWZb+8z//U5LkdDpDzg0dOlSS9Kc//UmWZQ12awDChMADwDjr169Xa2urrr32WrW1tYU8pfXpp5/q2muvVUtLi9avXx/uVgEMEgIPAOMEgszChQsVExMjt9ut7Oxsud1uxcTEaP78+SF1AMxH4AEAAMYj8AAwzpQpUyRJ8+fPl9/vDznn9/u1YMGCkDoA5iPwADDOlClTlJiYqA0bNuiWW25RXV2djh8/rrq6Ot1yyy3auHGjEhMTCTxABGEvLQDGcTgcWrFihYqLi/WXv/xFVVVVwXPDhg1TVFSUVqxYwWPpQARhhAeAkYqKilRRUaGkpKSQ40lJSaqoqFBRUVGYOgMQDqy03IuVlgEzWZaldevWafXq1Zo6dapycnIY2QEMMtDrN7e0ABjN4XDI7Xaro6NDbrebsANEKG5pAQAA4xF4AACA8Qg8AADAeAQeAABgPAIPAAAwHoEHAAAYj8ADAACMR+ABYDTLslRTU6Pa2lrV1NTIsqxwtwQgDAg8AIzl8XiUnp6u3NxclZeXKzc3V+np6fJ4POFuDcAgI/AAMJLH41FxcbEyMzPl8/m0cuVK+Xw+ZWZmqri4mNADRBj20urFXlqAOSzLUnp6ujIzM1VZWSnLslRdXa38/Hw5HA4VFhaqvr5eDQ0NbDUBnOUGev1mhAeAcXw+n5qamjRv3jxFR4f+mouOjlZpaakaGxvl8/nC1CGAwUbgAWCc5uZmSVJGRka/5wPHA3UAzHfGAk9TU5N+/vOfy+VyKS4uThdddJHmz5+vrq6ukLq9e/fq5ptv1vDhw5WQkKAHHnigT83OnTvldrsVFxenMWPG6OGHH9YX78TV1NRo/PjxGjp0qC688EI9+eSTZ+qjAfiWGz16tCSpvr6+3/OB44E6AOYbcqbe+L333pPf79dTTz2l9PR01dfX66677lJHR4cWL14s6fP77DfddJNGjRqlDRs26PDhw/rpT38q27a1dOlSSZ/fm8vNzVVOTo62bNmiDz74QHfeeaeGDx+uOXPmSJIaGxuVn5+vu+66Sy+++KI2btyomTNnatSoUZo2bdqZ+ogAvqWysrKUlpamRYsWadWqVcHH0ocPHy63262ysjK5XC5lZWWFu1UAg8UeRI899pjtcrmCr6urq+3o6Gj7wIEDwWMrV660nU6n3dbWZtu2bS9fvtyOj4+3T5w4EawpKyuzU1JSbL/fb9u2bT/00EP2ZZddFvK97rnnHnvSpEkD7q2trc2WFPy+AM5uq1atsiXZcXFxtqTgv8DrVatWhbtFAKfBQK/fZ2yEpz9tbW0677zzgq83b96sjIwMpaSkBI/dcMMN6uzs1LZt25STk6PNmzfL7XbL6XSG1JSWlqqpqUkul0ubN29WXl5eyPe64YYb9Oyzz6q7u1sxMTF9euns7FRnZ2fwdXt7uySpu7tb3d3dp+0zAwiPnp4eRUVF9TkeFRWlqKgo9fT08LMOGGCgP8eDFnh2796tpUuXasmSJcFjhw4dUlJSUkjdueeeq9jYWB06dChYk5aWFlIT+JpDhw7J5XL1+z5JSUnq6enRxx9/3O99+rKyMi1cuLDP8bVr12rYsGHf6DMC+HawLEv333+/JkyYoIceekjvvfeejhw5onPPPVeXXXaZHnvsMT3wwAMaMmQIj6UDZ7ljx44NqO5rB54FCxb0GxROtmXLFk2YMCH4+uDBg7rxxht166236h//8R9Davv7C8y27ZDjX6yxeycsf92ak5WWlqqkpCT4ur29XWPHjl
VeXh7r8ABnuZqaGrW0tGjVqlWaOHGi8vPz5fV6lZubq5iYGCUlJSk7O1vnnHOO3G53uNsFcAoCd2i+ytcOPLNmzdJPfvKTv1tz8ojMwYMHlZOTo8mTJ+vpp58OqUtOTtZbb70VcuzIkSPq7u4OjtgkJycHR3sCWlpaJOkra4YMGaKRI0f226PT6Qy5TRYQExPT7y0wAGeP1tZWSdIVV1wR8vMc+Pm+4oorgnX8vANnt4H+DH/twJOQkKCEhIQB1R44cEA5OTkaP368nn/++T4LgE2ePFmPPPKImpubg7ed1q5dK6fTqfHjxwdr5s2bp66uLsXGxgZrUlJSgsFq8uTJeu2110Lee+3atZowYQK/zIAIdPJj6ZMmTepznsfSgchzxtbhOXjwoKZMmaKxY8dq8eLFam1t1aFDh0JGYvLy8vSd73xHM2bM0Ntvv62//OUvmjt3ru66667gbaXbbrtNTqdTd955p+rr6/Xqq69q0aJFKikpCd6uuvfee7Vnzx6VlJTo//7v//Tcc8/p2Wef1dy5c8/UxwPwLXbyY+l+vz/knN/v57F0IBKdqcfEnn/++ZBHQU/+d7I9e/bYN910kx0XF2efd9559qxZs0IeQbdt2/7b3/5mZ2Vl2U6n005OTrYXLFgQfCQ9YP369fY//MM/2LGxsXZaWpq9YsWKr9Uvj6UDZlm1apUdFRVl33zzzXZtba29cuVKu7a21r755pvtqKgoHksHDDHQ6zebh/Zi81DAPB6PR3PmzFFTU1PwmMvl0uLFi1VUVBS+xgCcNgO9fhN4ehF4ADNZlqV169Zp9erVmjp1qnJycngUHTDIQK/fg7rwIAAMNofDIbfbrY6ODrndbsIOEKHYLR0AABiPwAMAAIxH4AEAAMYj8AAAAOMReAAAgPEIPACMZlmWampqVFtbq5qaGlmWFe6WAIQBgQeAsTwej9LT05Wbm6vy8nLl5uYqPT1dHo8n3K0BGGQEHgBG8ng8Ki4uVmZmpnw+n1auXCmfz6fMzEwVFxcTeoAIw0rLvVhpGTCHZVlKT09XZmamKisrZVmWqqurlZ+fL4fDocLCQtXX16uhoYGFCIGz3ECv34zwADCOz+dTU1OT5s2bp+jo0F9z0dHRKi0tVWNjo3w+X5g6BDDYCDwAjNPc3CxJysjI6Pd84HigDoD5CDwAjDN69GhJUn19fb/nA8cDdQDMR+ABYJysrCylpaVp0aJF8vv9Ief8fr/KysrkcrmUlZUVpg4BDDYCDwDjOBwOLVmyRFVVVSosLFRdXZ2OHz+uuro6FRYWqqqqSosXL2bCMhBBhoS7AQA4E4qKilRRUaE5c+YoOzs7eNzlcqmiokJFRUVh7A7AYOOx9F48lg6YqaurS0uXLtWbb76p6667Tvfff79iY2PD3RaA04TH0gFEPI/Ho0svvVRz585VdXW15s6dq0svvZRFB4EIROABYCRWWgZwMm5p9eKWFmAOVloGIge3tABELFZaBvBFBB4AxmGlZQBfROABYJyTV1q2LEs1NTWqra1VTU2NLMtipWUgAjGHpxdzeABzBObwJCQkqLW1VXv27Ameu+CCCzRq1CgdPnyYOTyAAZjDAyBiORwO3Xrrrdq6datOnDihFStW6LnnntOKFSt04sQJbd26VcXFxYQdIIIwwtOLER7AHCeP8Hz88cdqamoKnnO5XBo5ciQjPIAhGOEBELECT2ktXbpU7777ru69915dccUVuvfee/XOO+/oiSee4CktIMKwlxYA4wSevnr55ZeVlZWlnp4eSdKOHTv0zDPP6L777gupA2A+Ag8A4wSevnr88ceVlJSkhQsXyul0qrOzU/Pnz9fjjz8eUgfAfMzh6cUcHsAcx48f17BhwxQbG6ujR48qKioquNKybdsaMWKEurq6dOzYMcXFxYW7XQCngDk8ACLWU089JenzndKLi4tVV1en48ePq66uTsXFxerq6gqpA2A+Ag8A4+zevVuS9Mwzz2jnzp3Kzs
7W9OnTlZ2drfr6ev3Hf/xHSB0A8xF4ABjnoosukiTZtq1du3bJ6/WqpKREXq9XDQ0N8vv9IXUAzEfgAWCcmTNnasiQIfrtb38bfEIroKenR//yL/+iIUOGaObMmWHqEMBg4yktAMaJjY3Vgw8+qD/84Q8aNmxYcESnvLxc0dHR8vv9+tWvfqXY2NgwdwpgsDDCA8BIkyZNkqRg2AkIvA6cBxAZeCy9F4+lA+awLEspKSlqaWlRfn6+LrzwQn3wwQe65JJL9OGHH6q6ulqJiYk6ePAgW0sAZ7mBXr+5pQXAOOvXr1dLS4uuvfZavfbaa7IsK7gOj8PhUHZ2tjZu3Kj169fr+9//frjbBTAIuKUFwDjr16+XJC1cuFDR0aG/5qKjo7VgwYKQOgDmI/AAAADjEXgAGGfKlCmSpPnz5/c7aXnhwoUhdQDMR+ABYJwpU6Zo1KhR2rBhg2655ZaQrSVuueUWbdiwQYmJiQQeIIIwaRmAcRwOh5588klNmzZNf/nLX1RVVRU8N2zYMEnSihUreEILiCCM8AAwUlFRkVatWqXExMSQ44mJiVq1apWKiorC1BmAcGAdnl6swwOYybIsrVu3TqtXr9bUqVOVk5PDyA5gENbhAQB9fnvL7Xaro6NDbrebsANEKG5pAQAA4xF4AACA8Qg8AADAeAQeAEazLEs1NTWqra1VTU2NLMsKd0sAwoDAA8BYHo9H6enpys3NVXl5uXJzc5Weni6PxxPu1gAMMgIPACN5PB4VFxcrMzNTPp9PK1eulM/nU2ZmpoqLiwk9QIRhHZ5erMMDmMOyLKWnpyszM1OVlZWyLEvV1dXKz8+Xw+FQYWGh6uvr1dDQwGPqwFluoNdvRngAGMfn86mpqUnz5s1TdHTor7no6GiVlpaqsbFRPp8vTB0CGGwEHgDGaW5uliRlZGT0O2k5IyMjpA6A+VhpGYBxRo8eLUlatmyZnnrqKTU1NUmSysvLlZaWprvvvjukDoD5mMPTizk8gDksy9Lo0aPV2tqqgoIC/frXv9b+/fuVmpqqRx99VFVVVUpMTNTBgweZwwOc5ZjDAyCiRUVFBf878Hcdf98BkYvAA8A4Pp9PLS0tKisrU319vbKzszV9+nRlZ2frnXfe0aJFi9TS0sKkZSCCDErg6ezs1BVXXKGoqCjt2LEj5NzevXt18803a/jw4UpISNADDzygrq6ukJqdO3fK7XYrLi5OY8aM0cMPP9znL7WamhqNHz9eQ4cO1YUXXqgnn3zyTH8sAN9SgcnIs2bN0q5du+T1elVSUiKv16uGhgbNmjUrpA6A+QZl0vJDDz2klJQU/e///m/IccuydNNNN2nUqFHasGGDDh8+rJ/+9KeybVtLly6V9Pm9udzcXOXk5GjLli364IMPdOedd2r48OGaM2eOJKmxsVH5+fm666679OKLL2rjxo2aOXOmRo0apWnTpg3GRwTwLRKYjFxfX69JkybJ7Xaro6NDbrdbDodD9fX1IXUAIoB9hlVXV9uXXXaZ/c4779iS7LfffjvkXHR0tH3gwIHgsZUrV9pOp9Nua2uzbdu2ly9fbsfHx9snTpwI1pSVldkpKSm23++3bdu2H3roIfuyyy4L+b733HOPPWnSpAH32dbWZksKfl8AZ6+enh47LS3Nvvnmm+2uri7b6/XaJSUlttfrtbu6uuybb77Zdrlcdk9PT7hbBXCKBnr9PqMjPB999JHuuusuVVZWatiwYX3Ob968WRkZGUpJSQkeu+GGG9TZ2alt27YpJydHmzdvltvtltPpDKkpLS1VU1OTXC6XNm/erLy8vJD3vuGGG/Tss8+qu7tbMTExfb53Z2enOjs7g6/b29slSd3d3eru7j7lzw4gvB599FH95Cc/UXx8vI4fPy7p88fS4+LidOLECb388svy+/3y+/1h7hTAqRjoNfuMBR7btnXnnXfq3nvv1YQJE4LrYJzs0KFDSkpKCjl27rnnKjY2VocOHQrWpKWlhdQEvu
bQoUNyuVz9vk9SUpJ6enr08ccf9ztsXVZWpoULF/Y5vnbt2n7DGYCzy/bt22Xbdp/d0f1+v2zb1vbt20P+kAJwdjp27NiA6r524FmwYEG/QeFkW7Zs0aZNm9Te3q7S0tK/W3vyo6MBtm2HHP9ijd07Yfnr1pystLRUJSUlwdft7e0aO3as8vLyWIcHOMtZlqXZs2frpptu0sqVK7VixQrV1NTI7XbrF7/4haZPn67/+q//0oIFC1iHBzjLBe7QfJWvHXhmzZqln/zkJ3+3Ji0tTb/73e9UV1fX5y+oCRMm6Pbbb9ef/vQnJScn66233go5f+TIEXV3dwdHbJKTk4OjPQEtLS2S9JU1Q4YM0ciRI/vt0el09vvXXUxMTL+3wACcPTZu3Kimpibdc889GjduXHCEubq6WitWrNDdd9+t119/XXV1dZoyZUpYewVwagZ6zf7agSchIUEJCQlfWffEE0/od7/7XfD1wYMHdcMNN+iVV17RxIkTJUmTJ0/WI488oubm5uBtp7Vr18rpdGr8+PHBmnnz5qmrq0uxsbHBmpSUlOCtrsmTJ+u1114L+f5r167VhAkTCC9ABAo8bj5v3jwVFBTohRdeCK60/Nhjj+k3v/lNSB2ACHDGp0/3amxs7POUVk9Pj52RkWF///vft7dv326/8cYbdmpqqj1r1qxgzaeffmonJSXZ06dPt3fu3Gl7PB77nHPOsRcvXhys+fDDD+1hw4bZDz74oP3uu+/azz77rB0TE2NXVFQMuD+e0gLM8cYbb9iS7Guvvda2LMvu6uqyKysr7a6uLtuyLPvaa6+1JdlvvPFGuFsFcIoGev0O60rLDodDr7/+uoYOHaprrrlGP/rRj1RYWKjFixcHa+Lj4+X1erV//35NmDBBM2fOVElJScj8G5fLperqaq1fv15XXHGF/vVf/1VPPPEEa/AA6JfNFhNAxBm03dLT0tL6/SVz/vnnq6qq6u9+bWZmpmpra/9ujdvt1vbt20+pRwBmCMzz27BhgwoLC/WrX/1Kx48fV11dnf7whz9o48aNIXUAzMdeWgCME5gTWFZWpp07d4bspVVfX69FixaF1AEwH4EHgHGysrKUlpamTZs26YMPPgjZS+v999/X5s2b5XK5lJWVFe5WAQwSAg8A4zgcDi1ZskRVVVWaNm2anE6nrrrqKjmdTk2bNk1VVVVavHgxa/AAEWTQ5vAAwGAqKipSRUWF5syZo+zs7OBxl8uliooKFRUVhbE7AIMtyuZxBUmfr9QYHx+vtrY2VloGDGJZltatW6fVq1dr6tSpysnJYWQHMMhAr9+M8AAwmsPhkNvtVkdHh9xuN2EHiFDM4QEAAMYj8AAAAOMReAAAgPEIPACMZlmWampqVFtbq5qaGlmWFe6WAIQBgQeAsTwej9LT05Wbm6vy8nLl5uYqPT1dHo8n3K0BGGQ8pQXASB6PR8XFxcrPz9fNN9+s999/X5deeqk+/PBDFRcXsxYPEGFYh6cX6/AA5rAsS+np6XI4HNqzZ496enqC54YMGaILLrhAfr9fDQ0NPKYOnOVYhwdAxPL5fGpqapIkJSUlaeHChXI6ners7NT8+fO1e/fuYN2UKVPC1yiAQcMcHgDG2bdvnyQpMTFR+/fv189+9jOde+65+tnPfqb9+/crMTExpA6A+Qg8AIzz1ltvSZJ+9rOfaciQ0IHsIUOG6M477wypA2A+bmkBME5gauK2bdt04sQJ/fu//7vefPNN7dq1S/fdd5/efvvtkDoA5iPwADDOxRdfLEnyer2Ki4sLHq+urtbcuXP71AEwH09p9eIpLcAcXV1dGjp06N8dwYmKitKJEycUGxs7iJ0BON14SgtARAuEndjYWP3whz/UsGHDdOzYMb366qvq6uridhYQYQg8AIyzbNkySdKoUaP0ySef6JVXXgmeczgcGjVqlFpbW7Vs2TKVlJSEq00Ag4intAAYx+fzSZKef/55HTt2TIsXL1Z+fr4WL16sY8
eO6ZlnngmpA2A+RngAGGfEiBGSpMbGRsXGxuqBBx5Qenq68vPzFRMTE1yUMFAHwHyM8AAwzowZMyRJ//Iv/xKyrYQk9fT0aMGCBSF1AMxH4AFgnOuuu07x8fE6cuSIxowZo2eeeUaffPKJnnnmGY0ZM0ZHjhxRfHy8rrvuunC3CmCQcEsLgHEcDoeee+45TZs2Ta2trZo5c2bwXFRUlCTpueeeY+NQIIIwwgPASEVFRVq1apXOP//8kOMXXHCBVq1apaKiojB1BiAcWHiwFwsPAmayLEvr1q3T6tWrNXXqVOXk5DCyAxiEhQcBQJ/f3nK73ero6JDb7SbsABGKW1oAAMB4BB4AAGA8Ag8AADAegQcAABiPwAMAAIxH4AEAAMYj8AAAAOMReAAAgPEIPAAAwHistNwrsMNGe3t7mDsBcLp1d3fr2LFjam9vV0xMTLjbAXAaBa7bX7VTFoGn19GjRyVJY8eODXMnAADg6zp69Kji4+O/9Dybh/by+/06ePCgRowYoaioqHC3A+A0am9v19ixY7Vv3z42BwYMY9u2jh49qpSUFEVHf/lMHQIPAOMNdDdlAOZi0jIAADAegQcAABiPwAPAeE6nU/Pnz5fT6Qx3KwDChDk8AADAeIzwAAAA4xF4AACA8Qg8AADAeAQeAABgPAIPAKMtX75cLpdLQ4cO1fjx4+Xz+cLdEoAwIPAAMNYrr7yi2bNn6ze/+Y3efvttZWVlaerUqdq7d2+4WwMwyHgsHYCxJk6cqCuvvFIrVqwIHrv88stVWFiosrKyMHYGYLAxwgPASF1dXdq2bZvy8vJCjufl5WnTpk1h6gpAuBB4ABjp448/lmVZSkpKCjmelJSkQ4cOhakrAOFC4AFgtKioqJDXtm33OQbAfAQeAEZKSEiQw+HoM5rT0tLSZ9QHgPkIPACMFBsbq/Hjx8vr9YYc93q9uvrqq8PUFYBwGRLuBgDgTCkpKdGMGTM0YcIETZ48WU8//bT27t2re++9N9ytARhkBB4Axvrxj3+sw4cP6+GHH1Zzc7MyMjJUXV2tCy64INytARhkrMMDAACMxxweAABgPAIPAAAwHoEHAAAYj8ADAACMR+ABAADGI/AAAADjEXgAAIDxCDwAAMB4BB4AAGA8Ag8AADAegQcAABiPwAMAAIz3/wMwatUTuhosSAAAAABJRU5ErkJggg=="
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Box plot of the pixel values in the first video, with all frames flattened into one column.\n",
"data_to_plot = pd.DataFrame(X[0].reshape(-1))\n",
"data_to_plot.boxplot()"
]
},
{
"cell_type": "code",
"outputs": [
{
"data": {
"text/plain": "<Axes: xlabel='0'>"
},
"execution_count": 351,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": "<Figure size 640x480 with 1 Axes>",
"image/png": ""
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"y_pd = pd.DataFrame(torch.tensor(y).int())\n",
"y_pd.value_counts().plot(kind='bar')"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-04-28T15:45:30.464473Z",
"start_time": "2024-04-28T15:45:30.370547Z"
}
},
"id": "1752fd9dbaef6786",
"execution_count": 351
},
{
"cell_type": "markdown",
"id": "2a7eebcf",
"metadata": {},
"source": [
"## Data Preprocessing"
]
},
{
"cell_type": "markdown",
"id": "ae3e3383",
"metadata": {},
"source": [
"### 7. General Preprocessing"
]
},
{
"cell_type": "code",
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"torch.Size([1800, 6, 16, 16])\n",
"(1800,)\n",
"<class 'numpy.ndarray'>\n"
]
}
],
"source": [
"import numpy as np\n",
"import torch\n",
"\n",
"# Keep only the first 6 frames of each video so that every sample has the same shape\n",
"X = np.array([video[:6] for video in X])\n",
"tensor_videos = torch.tensor(X, dtype=torch.float32)\n",
"# Clamp pixel values to [0, 255]; NaNs pass through and are imputed below\n",
"tensor_videos = torch.clamp(tensor_videos, 0, 255)\n",
"# Replace the NaNs in each frame with the mean of that frame's non-NaN pixels. This was generated with GPT\n",
"for i in range(tensor_videos.shape[0]):\n",
"    for j in range(tensor_videos.shape[1]):\n",
"        frame = tensor_videos[i, j]  # a view, so writes go back into tensor_videos\n",
"        nan_mask = torch.isnan(frame)\n",
"        frame[nan_mask] = frame[~nan_mask].mean()\n",
"\n",
"# Resample each of the 6 classes to exactly 300 samples. Sampling is with replacement,\n",
"# so large classes are undersampled and small classes are oversampled.\n",
"# Generated with the assistance of ChatGPT, with some modifications\n",
"# Get the indices of each class\n",
"indices = [np.argwhere(y == i).squeeze(1) for i in range(6)]\n",
"# Number of samples to draw per class\n",
"num_samples_to_take = 300\n",
"# Draw the sample indices for each class\n",
"indices_to_take = [np.random.choice(indices[i], num_samples_to_take, replace=True) for i in range(6)]\n",
"# Concatenate the indices\n",
"indices_to_take = np.concatenate(indices_to_take)\n",
"# Select the samples\n",
"tensor_videos = tensor_videos[indices_to_take]\n",
"y = y[indices_to_take]\n"
],
"metadata": {
"collapsed": false
},
"id": "19174365",
"execution_count": 82
},
{
"cell_type": "code",
"outputs": [
{
"data": {
"text/plain": "torch.Size([1800, 1, 6, 16, 16])"
},
"execution_count": 85,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Add the extra channel dimension that nn.Conv3d expects: (N, C, D, H, W)\n",
"tensor_videos = tensor_videos.unsqueeze(1)\n",
"tensor_videos.shape"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-04-28T07:01:44.496557Z",
"start_time": "2024-04-28T07:01:44.492973Z"
}
},
"id": "8b6bcf332c355e9d",
"execution_count": 85
},
{
"cell_type": "markdown",
"id": "fb3aa527",
"metadata": {},
"source": [
"### 8. Feature Selection"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a85808bf",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "4921e8ca",
"metadata": {},
"source": [
"### 9. Feature Engineering"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dbcde626",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "fa676c3f",
"metadata": {},
"source": [
"## Modeling & Evaluation"
]
},
{
"cell_type": "markdown",
"id": "589b37e4",
"metadata": {},
"source": [
"### 10. Creating models"
]
},
{
"cell_type": "code",
"execution_count": 238,
"id": "d8dffd7d",
"metadata": {
"ExecuteTime": {
"end_time": "2024-04-28T08:00:54.037178Z",
"start_time": "2024-04-28T08:00:54.027410Z"
}
},
"outputs": [],
"source": [
"import numpy as np\n",
"import torch\n",
"from torch import nn\n",
"\n",
"\n",
"class CNN3D(nn.Module):\n",
"    def __init__(self):\n",
"        super(CNN3D, self).__init__()\n",
"        self.conv1 = nn.Conv3d(1, 12, kernel_size=2, stride=1, padding=2)\n",
"        self.mp = nn.AvgPool3d(2)\n",
"        self.relu = nn.LeakyReLU()\n",
"        self.fc1 = nn.Linear(3888, 6)\n",
"        self.flatten = nn.Flatten()\n",
"\n",
"    def forward(self, x):\n",
"        x = self.conv1(x)  # (N, 1, 6, 16, 16) -> (N, 12, 9, 19, 19)\n",
"        x = self.mp(x)  # -> (N, 12, 4, 9, 9)\n",
"        x = self.relu(x)\n",
"        x = self.flatten(x)  # -> (N, 3888)\n",
"        # Project the flattened features to the 6 class logits\n",
"        return self.fc1(x)\n",
"\n",
"\n",
"def train(model, criterion, optimizer, loader, epochs=10):\n",
"    for epoch in range(epochs):\n",
"        for idx, (inputs, labels) in enumerate(loader):\n",
"            optimizer.zero_grad()\n",
"            outputs = model(inputs)\n",
"            loss = criterion(outputs, labels)\n",
"            loss.backward()\n",
"            optimizer.step()\n",
"        # Loss of the last batch in this epoch\n",
"        print(f'Epoch {epoch}, Loss: {loss.item()}')\n",
"    return model\n",
"\n",
"\n",
"def process_X(X):\n",
"    # Keep only the first 6 frames of each video so that every sample has the same shape\n",
"    X = np.array([video[:6] for video in X])\n",
"    tensor_videos = torch.tensor(X, dtype=torch.float32)\n",
"    # Clamp pixel values to [0, 255]; NaNs pass through and are imputed below\n",
"    tensor_videos = torch.clamp(tensor_videos, 0, 255)\n",
"    # Replace the NaNs in each frame with the mean of that frame's non-NaN pixels. This was generated with GPT\n",
"    for i in range(tensor_videos.shape[0]):\n",
"        for j in range(tensor_videos.shape[1]):\n",
"            frame = tensor_videos[i, j]  # a view, so writes go back into tensor_videos\n",
"            nan_mask = torch.isnan(frame)\n",
"            frame[nan_mask] = frame[~nan_mask].mean()\n",
"    return tensor_videos\n",
"\n",
"\n",
"def process_data(X, y):\n",
"    y = np.array(y)\n",
"    tensor_videos = process_X(X)\n",
"    # Resample each of the 6 classes to exactly 300 samples. Sampling is with replacement,\n",
"    # so large classes are undersampled and small classes are oversampled.\n",
"    # Rows with a NaN label match no class and are therefore dropped.\n",
"    # Generated with the assistance of ChatGPT, with some modifications\n",
"    indices = [np.argwhere(y == i).squeeze(1) for i in range(6)]\n",
"    # Number of samples to draw per class\n",
"    num_samples_to_take = 300\n",
"    # Draw the sample indices for each class\n",
"    indices_to_take = [np.random.choice(indices[i], num_samples_to_take, replace=True) for i in range(6)]\n",
"    # Concatenate the indices\n",
"    indices_to_take = np.concatenate(indices_to_take)\n",
"    # Select the samples and add the channel dimension for Conv3d\n",
"    tensor_videos = tensor_videos[indices_to_take].unsqueeze(1)\n",
"    y = y[indices_to_take]\n",
"    return tensor_videos, torch.Tensor(y).long()\n",
"\n",
"\n",
"class Model():\n",
"    def __init__(self):\n",
"        self.model = CNN3D()\n",
"        self.criterion = nn.CrossEntropyLoss()\n",
"        self.optimizer = torch.optim.Adam(self.model.parameters(), lr=0.001)\n",
"\n",
"    def fit(self, X, y):\n",
"        X, y = process_data(X, y)\n",
"        train_dataset = torch.utils.data.TensorDataset(X, y)\n",
"        train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=32, shuffle=True)\n",
"        train(self.model, self.criterion, self.optimizer, train_loader)\n",
"\n",
"    def predict(self, X):\n",
"        self.model.eval()\n",
"        with torch.no_grad():\n",
"            # Same cleaning as in training, plus the channel dimension for Conv3d\n",
"            X = process_X(X).unsqueeze(1)\n",
"            return np.argmax(self.model(X).numpy(), axis=1)\n"
]
},
{
"cell_type": "markdown",
"id": "495bf3c0",
"metadata": {},
"source": [
"### 11. Model Evaluation"
]
},
{
"cell_type": "code",
"execution_count": 239,
"id": "9245ab47",
"metadata": {
"ExecuteTime": {
"end_time": "2024-04-28T08:00:56.273946Z",
"start_time": "2024-04-28T08:00:56.253771Z"
}
},
"outputs": [],
"source": [
"import numpy as np\n",
"from sklearn.model_selection import train_test_split\n",
"\n",
"with open('data.npy', 'rb') as f:\n",
"    data = np.load(f, allow_pickle=True).item()\n",
"    X = data['data']\n",
"    y = data['label']\n",
"\n",
"\n",
"X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1)\n",
"\n",
"# Drop test samples whose label is NaN, since they cannot be scored\n",
"not_nan_indices = np.flatnonzero(~np.isnan(np.array(y_test, dtype=float)))\n",
"y_test = [y_test[i] for i in not_nan_indices]\n",
"X_test = [X_test[i] for i in not_nan_indices]"
]
},
{
"cell_type": "code",
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Epoch 0, Loss: 85.83575439453125\n",
"Epoch 1, Loss: 43.13077926635742\n",
"Epoch 2, Loss: 13.879751205444336\n",
"Epoch 3, Loss: 3.084989070892334\n",
"Epoch 4, Loss: 5.557327747344971\n",
"Epoch 5, Loss: 3.1260528564453125\n",
"Epoch 6, Loss: 3.4430527687072754\n",
"Epoch 7, Loss: 5.166628837585449\n",
"Epoch 8, Loss: 4.4976654052734375\n",
"Epoch 9, Loss: 5.530020236968994\n",
"F1 Score (macro): 0.02\n"
]
}
],
"source": [
"model = Model()\n",
"model.fit(X_train, y_train)\n",
"\n",
"from sklearn.metrics import f1_score\n",
"\n",
"y_pred = model.predict(X_test)\n",
"print(\"F1 Score (macro): {0:.2f}\".format(f1_score(y_test, y_pred, average='macro'))) # You may encounter errors, you are expected to figure out what's the issue.\n"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-04-28T08:01:04.071319Z",
"start_time": "2024-04-28T08:01:01.436939Z"
}
},
"id": "abb2d957f4a15bd2",
"execution_count": 241
},
{
"cell_type": "code",
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"F1 Score (macro): 0.60\n"
]
}
],
"source": [
"\n"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-04-28T07:57:16.355215Z",
"start_time": "2024-04-28T07:57:16.281540Z"
}
},
"id": "37ff28a8da9dba6c",
"execution_count": 232
},
{
"cell_type": "markdown",
"id": "8aa31404",
"metadata": {},
"source": [
"### 12. Hyperparameters Search"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "81addd51",
"metadata": {},
"outputs": [],
"source": [
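"# Optuna objective. It relies on the parameterised Model class defined in the next cell.\n",
"def objective(trial):\n",
"    # The sampled value is an exponent: the actual batch size is 2**batch (2 to 4096)\n",
"    batch = trial.suggest_int(\"batch_size\", 1, 12, log=True)\n",
"    epochs = trial.suggest_int(\"epochs\", 1, 20)\n",
"    model = Model(batch_size=2**batch, epochs=epochs)\n",
"    model.fit(X_train, y_train)\n",
"    pred = model.predict(X_test)\n",
"    # Optuna minimises by default, so return the negated macro F1.\n",
"    # Note: scoring on X_test tunes against the test split; a separate validation split would avoid that.\n",
"    return -f1_score(y_test, pred, average='macro')\n",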
"# Run optimization\n",
"# storage = optuna.storages.InMemoryStorage()\n",
"# study = optuna.create_study(storage=storage)\n",
"# study.optimize(objective, n_trials=10)\n",
"# \n",
"# best_score = study.best_value\n",
"# best_params = study.best_params\n",
"# \n",
"# print(best_score, best_params)\n"
]
},
{
"cell_type": "code",
"outputs": [],
"source": [
"import numpy as np\n",
"import torch\n",
"from torch import nn\n",
"from sklearn.metrics import f1_score\n",
"from sklearn.model_selection import train_test_split\n",
"# Only needed for the commented-out histogram equalization experiment below\n",
"from torchvision.transforms.functional import equalize\n",
"\n",
"with open('data.npy', 'rb') as f:\n",
"    data = np.load(f, allow_pickle=True).item()\n",
"    X = data['data']\n",
"    y = data['label']\n",
"\n",
"\n",
"class CNN3D(nn.Module):\n",
"    def __init__(self, hidden_size=32, dropout=0.0):\n",
"        super(CNN3D, self).__init__()\n",
"        self.conv1 = nn.Conv3d(1, hidden_size, kernel_size=3, stride=1, padding=1)\n",
"        self.batchnorm = nn.BatchNorm3d(hidden_size)\n",
"        self.conv2 = nn.Conv3d(hidden_size, hidden_size*2, kernel_size=3, stride=1, padding=1)\n",
"        self.relu = nn.ReLU()\n",
"        self.maxpool = nn.MaxPool3d(kernel_size=2, stride=2)\n",
"        # After conv2 and the second pooling the features are (2*hidden_size, 1, 4, 4) = hidden_size*32\n",
"        self.fc1 = nn.Linear(hidden_size*32, 256)\n",
"        self.fc2 = nn.Linear(256, 6)\n",
"        # Dropout is currently disabled, so the dropout argument has no effect\n",
"        # self.dropout = nn.Dropout(dropout)\n",
"\n",
"    def forward(self, x):\n",
"        x = self.conv1(x)\n",
"        x = self.relu(x)\n",
"        x = self.maxpool(x)\n",
"        x = self.batchnorm(x)\n",
"        x = self.conv2(x)\n",
"        x = self.relu(x)\n",
"        x = self.maxpool(x)\n",
"        # x = self.dropout(x)\n",
"\n",
"        x = x.view(x.size(0), -1)  # Flatten features for fully connected layers\n",
"        x = self.fc1(x)\n",
"        x = self.relu(x)\n",
"        x = self.fc2(x)\n",
"        return x\n",
"\n",
"\n",
"def train(model, criterion, optimizer, loader, epochs=5):\n",
"    model.train()\n",
"    for epoch in range(epochs):\n",
"        for idx, (inputs, labels) in enumerate(loader):\n",
"            optimizer.zero_grad()\n",
"            outputs = model(inputs)\n",
"            loss = criterion(outputs, labels)\n",
"            loss.backward()\n",
"            optimizer.step()\n",
"        # Loss of the last batch in this epoch\n",
"        print(f'Epoch {epoch}, Loss: {loss.item()}')\n",
"    return model\n",
"\n",
"\n",
"class Model():\n",
"    def __init__(self, batch_size=64, lr=0.001, epochs=10, dropout=0.0, hidden_size=32, n_samples=900):\n",
"        print(batch_size, epochs, lr, dropout, hidden_size, n_samples)\n",
"        self.batch_size = batch_size\n",
"        self.lr = lr\n",
"        self.epochs = epochs\n",
"        self.model = CNN3D(dropout=dropout, hidden_size=hidden_size)\n",
"        self.criterion = nn.CrossEntropyLoss()\n",
"        self.optimizer = torch.optim.Adam(self.model.parameters(), lr=self.lr)\n",
"        self.n_samples = n_samples\n",
"\n",
"    @staticmethod\n",
"    def clean_videos(X):\n",
"        # Keep only the first 6 frames of each video so that every sample has the same shape\n",
"        X = np.array([video[:6] for video in X])\n",
"        tensor_videos = torch.tensor(X, dtype=torch.float32)\n",
"        # Clamp pixel values to [0, 255]; NaNs pass through and are imputed below\n",
"        tensor_videos = torch.clamp(tensor_videos, 0, 255)\n",
"        # Replace the NaNs in each frame with the mean of that frame's non-NaN pixels. This was generated with GPT\n",
"        for i in range(tensor_videos.shape[0]):\n",
"            for j in range(tensor_videos.shape[1]):\n",
"                frame = tensor_videos[i, j]  # a view, so writes go back into tensor_videos\n",
"                nan_mask = torch.isnan(frame)\n",
"                frame[nan_mask] = frame[~nan_mask].mean()\n",
"        return tensor_videos\n",
"\n",
"    def fit(self, X, y):\n",
"        X, y = self.process_data(X, y)\n",
"        train_dataset = torch.utils.data.TensorDataset(X, y)\n",
"        train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=self.batch_size, shuffle=True)\n",
"        train(self.model, self.criterion, self.optimizer, train_loader, self.epochs)\n",
"\n",
"    def predict(self, X):\n",
"        self.model.eval()\n",
"        with torch.no_grad():\n",
"            tensor_videos = self.clean_videos(X)\n",
"            # Disabled experiment: histogram equalization to make the features more prominent\n",
"            # tensor_videos = torch.Tensor(tensor_videos).to(torch.uint8).reshape(-1, 1, 16, 16)\n",
"            # tensor_videos = equalize(tensor_videos).float().reshape(-1, 1, 6, 16, 16)\n",
"            # Add the channel dimension expected by Conv3d: (N, 1, 6, 16, 16)\n",
"            tensor_videos = tensor_videos.reshape(-1, 1, 6, 16, 16)\n",
"\n",
"            result = self.model(tensor_videos)\n",
"            return result.argmax(dim=1).numpy()\n",
"\n",
"    def process_data(self, X, y):\n",
"        y = np.array(y)\n",
"        tensor_videos = self.clean_videos(X)\n",
"        # Resample each of the 6 classes to exactly n_samples samples. Sampling is with replacement,\n",
"        # so large classes are undersampled and small classes are oversampled.\n",
"        # Rows with a NaN label match no class and are therefore dropped.\n",
"        # Generated with the assistance of ChatGPT, with some modifications\n",
"        indices = [np.argwhere(y == i).squeeze(1) for i in range(6)]\n",
"        # Draw the sample indices for each class\n",
"        indices_to_take = [np.random.choice(indices[i], self.n_samples, replace=True) for i in range(6)]\n",
"        # Concatenate the indices\n",
"        indices_to_take = np.concatenate(indices_to_take)\n",
"        # Select the samples\n",
"        tensor_videos = tensor_videos[indices_to_take]\n",
"\n",
"        # Add the channel dimension expected by Conv3d: (N, 1, 6, 16, 16)\n",
"        tensor_videos = tensor_videos.reshape(-1, 1, 6, 16, 16)\n",
"        # Disabled experiment: convert to uint8 and apply histogram equalization\n",
"        # tensor_videos = torch.Tensor(tensor_videos).to(torch.uint8).reshape(-1, 1, 16, 16)\n",
"        # tensor_videos = equalize(tensor_videos).float().reshape(-1, 1, 6, 16, 16)\n",
"\n",
"        y = y[indices_to_take]\n",
"        return tensor_videos, torch.Tensor(y).long()\n",
"\n",
"\n",
"X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1)\n",
"\n",
"# Drop test samples whose label is NaN, since they cannot be scored\n",
"not_nan_indices = np.flatnonzero(~np.isnan(np.array(y_test, dtype=float)))\n",
"y_test = [y_test[i] for i in not_nan_indices]\n",
"X_test = [X_test[i] for i in not_nan_indices]\n",
"\n",
"print(\"init model\")\n",
"model = Model()\n",
"model.fit(X_train, y_train)\n",
"\n",
"y_pred = model.predict(X_test)\n",
"print(\"F1 Score (macro): {0:.2f}\".format(f1_score(y_test, y_pred, average='macro'))) # You may encounter errors, you are expected to figure out what's the issue."
],
"metadata": {
"collapsed": false
},
"id": "a56142ab267bafaa"
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.18"
}
},
"nbformat": 4,
"nbformat_minor": 5
}