{
"cells": [
{
"cell_type": "markdown",
"id": "7d017333",
"metadata": {},
"source": [
"# Final Assessment Scratch Pad"
]
},
{
"cell_type": "markdown",
"id": "d3d00386",
"metadata": {},
"source": [
"## Instructions"
]
},
{
"cell_type": "markdown",
"id": "ea516aa7",
"metadata": {},
"source": [
"1. Please use only this Jupyter notebook to work on your model, and **do not use any extra files**. If you need to define helper classes or functions, feel free to do so in this notebook.\n",
"2. This template is intended to be general, but it may not cover every use case. The sections are given so that it will be easier for us to grade your submission. If your specific use case isn't addressed, **you may add new Markdown or code blocks to this notebook**. However, please **don't delete any existing blocks**.\n",
"3. If you don't think a particular section of this template is necessary for your work, **you may skip it**. Be sure to explain clearly why you decided to do so."
]
},
{
"cell_type": "markdown",
"id": "022cb4cd",
"metadata": {},
"source": [
"## Report"
]
},
{
"cell_type": "markdown",
"id": "9c14a2d8",
"metadata": {},
"source": [
"**[TODO]**\n",
"\n",
"Please provide a summary of the ideas and steps that led you to your final model. Someone reading this summary should understand why you chose to approach the problem in a particular way and be able to replicate your final model at a high level. Please ensure that your summary is detailed enough to convey your thought process and approach, yet concise enough to be easily understandable. Also, please follow the guidelines given in `main.ipynb`.\n",
"\n",
"This report should be no longer than **1-2 A4 pages (up to around 1,000 words)**. Marks will be deducted if you exceed this limit or otherwise do not follow the instructions.\n",
"\n",
"**[DELETE EVERYTHING FROM THE PREVIOUS TODO TO HERE BEFORE SUBMISSION]**\n",
"\n",
"##### Overview\n",
"**[TODO]**\n",
"\n",
"##### 1. Descriptive Analysis\n",
"**[TODO]**\n",
"\n",
"##### 2. Detection and Handling of Missing Values\n",
"**[TODO]**\n",
"\n",
"##### 3. Detection and Handling of Outliers\n",
"**[TODO]**\n",
"\n",
"##### 4. Detection and Handling of Class Imbalance\n",
"**[TODO]**\n",
"\n",
"##### 5. Understanding Relationship Between Variables\n",
"**[TODO]**\n",
"\n",
"##### 6. Data Visualization\n",
"**[TODO]**\n",
"\n",
"##### 7. General Preprocessing\n",
"**[TODO]**\n",
"\n",
"##### 8. Feature Selection\n",
"**[TODO]**\n",
"\n",
"##### 9. Feature Engineering\n",
"**[TODO]**\n",
"\n",
"##### 10. Creating Models\n",
"**[TODO]**\n",
"\n",
"##### 11. Model Evaluation\n",
"**[TODO]**\n",
"\n",
"##### 12. Hyperparameters Search\n",
"**[TODO]**\n",
"\n",
"##### Conclusion\n",
"**[TODO]**"
]
},
{
"cell_type": "markdown",
"id": "49dcaf29",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"id": "27103374",
"metadata": {},
"source": [
"# Workings (Not Graded)\n",
"\n",
"You will do your working below. Note that anything below this section will not be graded, but we might cross-check what you wrote in the report above against your workings to make sure that you actually did what you claimed to have done."
]
},
{
"cell_type": "markdown",
"id": "0f4c6cd4",
"metadata": {},
"source": [
"## Import Packages\n",
"\n",
"Here, we import some packages necessary to run this notebook. You may import other packages as well. Note that when submitting your model, you may only use packages that are available in Coursemology (see `main.ipynb`)."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "cded1ed6",
"metadata": {
"ExecuteTime": {
"end_time": "2024-04-20T14:33:43.165330Z",
"start_time": "2024-04-20T14:33:41.764757Z"
}
},
"outputs": [],
"source": [
"import pandas as pd\n",
"import os\n",
"import numpy as np\n",
"from util import show_images, dict_train_test_split"
]
},
{
"cell_type": "markdown",
"id": "748c35d7",
"metadata": {},
"source": [
"## Load Dataset\n",
"\n",
"The dataset provided is multimodal and contains two components: images and tabular data. The tabular dataset `tabular.csv` contains $N$ entries and $F$ columns, including the target feature. The image dataset `images.npy` has shape $(N, H, W)$, where $N$, $H$, and $W$ correspond to the number of samples, image height, and image width, respectively. Each image corresponds to the row at the same index in the tabular dataset. These datasets can be found in the `data/` folder in the given file structure.\n",
"\n",
"A code snippet that loads and displays some of the data is provided below.\n",
"\n",
"### Load Tabular Data"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "a88be725",
"metadata": {
"ExecuteTime": {
"end_time": "2024-04-20T14:33:46.117171Z",
"start_time": "2024-04-20T14:33:44.688985Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(357699, 61)\n"
]
},
{
"data": {
"text/plain": " V0 V1 V2 V3 V4 V5 V6 V7 \\\n0 8315.0 1784.0 21994.0 37115.0 317.0 105.016815 296559.0 321602.0 \n1 8315.0 1272.0 11114.0 18683.0 230.0 NaN 340059.0 368602.0 \n2 8315.0 3832.0 65514.0 147707.0 607.0 105.018240 279159.0 302802.0 \n3 8315.0 2296.0 32874.0 55547.0 404.0 NaN 313959.0 340402.0 \n4 11021.0 1784.0 21994.0 37115.0 375.0 105.024985 232701.0 252606.0 \n\n V8 V9 ... V51 V52 V53 V54 V55 V56 V57 V58 \\\n0 2470.0 C1 ... C4 C4 834148.0 C2 C6 1089 293 C2 \n1 2820.0 C0 ... C7 C7 401668.0 C5 C6 9801 1085 C7 \n2 2330.0 C1 ... C7 C7 820948.0 C5 C4 1485 304 C6 \n3 2610.0 C1 ... C7 C7 1664548.0 C5 C5 -495 711 C4 \n4 1490.0 C0 ... C7 C7 735748.0 C2 C9 1683 117 C0 \n\n V59 target \n0 7428.249334 300.0 \n1 9693.829502 200.0 \n2 7609.258214 50.0 \n3 4258.532609 140.0 \n4 9492.484802 20.0 \n\n[5 rows x 61 columns]",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>V0</th>\n <th>V1</th>\n <th>V2</th>\n <th>V3</th>\n <th>V4</th>\n <th>V5</th>\n <th>V6</th>\n <th>V7</th>\n <th>V8</th>\n <th>V9</th>\n <th>...</th>\n <th>V51</th>\n <th>V52</th>\n <th>V53</th>\n <th>V54</th>\n <th>V55</th>\n <th>V56</th>\n <th>V57</th>\n <th>V58</th>\n <th>V59</th>\n <th>target</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>8315.0</td>\n <td>1784.0</td>\n <td>21994.0</td>\n <td>37115.0</td>\n <td>317.0</td>\n <td>105.016815</td>\n <td>296559.0</td>\n <td>321602.0</td>\n <td>2470.0</td>\n <td>C1</td>\n <td>...</td>\n <td>C4</td>\n <td>C4</td>\n <td>834148.0</td>\n <td>C2</td>\n <td>C6</td>\n <td>1089</td>\n <td>293</td>\n <td>C2</td>\n <td>7428.249334</td>\n <td>300.0</td>\n </tr>\n <tr>\n <th>1</th>\n <td>8315.0</td>\n <td>1272.0</td>\n <td>11114.0</td>\n <td>18683.0</td>\n <td>230.0</td>\n <td>NaN</td>\n <td>340059.0</td>\n <td>368602.0</td>\n <td>2820.0</td>\n <td>C0</td>\n <td>...</td>\n <td>C7</td>\n <td>C7</td>\n <td>401668.0</td>\n <td>C5</td>\n <td>C6</td>\n <td>9801</td>\n <td>1085</td>\n <td>C7</td>\n <td>9693.829502</td>\n <td>200.0</td>\n </tr>\n <tr>\n <th>2</th>\n <td>8315.0</td>\n <td>3832.0</td>\n <td>65514.0</td>\n <td>147707.0</td>\n <td>607.0</td>\n <td>105.018240</td>\n <td>279159.0</td>\n <td>302802.0</td>\n <td>2330.0</td>\n <td>C1</td>\n <td>...</td>\n <td>C7</td>\n <td>C7</td>\n <td>820948.0</td>\n <td>C5</td>\n <td>C4</td>\n <td>1485</td>\n <td>304</td>\n <td>C6</td>\n <td>7609.258214</td>\n <td>50.0</td>\n </tr>\n <tr>\n <th>3</th>\n <td>8315.0</td>\n <td>2296.0</td>\n <td>32874.0</td>\n <td>55547.0</td>\n <td>404.0</td>\n <td>NaN</td>\n <td>313959.0</td>\n <td>340402.0</td>\n 
<td>2610.0</td>\n <td>C1</td>\n <td>...</td>\n <td>C7</td>\n <td>C7</td>\n <td>1664548.0</td>\n <td>C5</td>\n <td>C5</td>\n <td>-495</td>\n <td>711</td>\n <td>C4</td>\n <td>4258.532609</td>\n <td>140.0</td>\n </tr>\n <tr>\n <th>4</th>\n <td>11021.0</td>\n <td>1784.0</td>\n <td>21994.0</td>\n <td>37115.0</td>\n <td>375.0</td>\n <td>105.024985</td>\n <td>232701.0</td>\n <td>252606.0</td>\n <td>1490.0</td>\n <td>C0</td>\n <td>...</td>\n <td>C7</td>\n <td>C7</td>\n <td>735748.0</td>\n <td>C2</td>\n <td>C9</td>\n <td>1683</td>\n <td>117</td>\n <td>C0</td>\n <td>9492.484802</td>\n <td>20.0</td>\n </tr>\n </tbody>\n</table>\n<p>5 rows × 61 columns</p>\n</div>"
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = pd.read_csv(os.path.join('data', 'tabular.csv'))\n",
"print(df.shape)\n",
"df.head()"
]
},
{
"cell_type": "markdown",
"id": "c09da291",
"metadata": {},
"source": [
"### Load Image Data"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "6297e25a",
"metadata": {
"ExecuteTime": {
"end_time": "2024-04-20T14:34:00.060850Z",
"start_time": "2024-04-20T14:33:59.642045Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Shape: (357699, 8, 8)\n"
]
},
{
"data": {
"text/plain": "<Figure size 1200x500 with 15 Axes>",
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA5kAAAGsCAYAAABXfmMRAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy81sbWrAAAACXBIWXMAAA9hAAAPYQGoP6dpAABHhklEQVR4nO3dfXTU5Zn/8U/IwwQwicozEjHrYzSomKACYkHcdFnpHrXHogWKLlRpEcVoVZa6IrRET7uIXZf0QC31WVotBV0U01aErsXSiIpoAaWWIEQ2qAlPmYRkfn9wwm9TCM59zTeZh/v9OmfOKdPvx/vmO9c131yZYSYtEolEBAAAAABAALrEewMAAAAAgNTBkAkAAAAACAxDJgAAAAAgMAyZAAAAAIDAMGQCAAAAAALDkAkAAAAACAxDJgAAAAAgMBmdvWBLS4t27typnJwcpaWldfbyCEAkEtHevXvVv39/denC7ylcUP/Jj/q3o/5TAz1gRw8kP+rfjvpPfi713+lD5s6dO5Wfn9/Zy6IDVFdXa8CAAfHeRlKh/lMH9e+O+k8t9IA7eiB1UP/uqP/UEU39d/qQmZOTI0m66667FAqFnLKHDh0yrTlo0CBTTpLeeOMNU27hwoWm3KxZs0y5Xr16mXKS1LVrV6fjDx48qBkzZhx5LBG91nNWXV2t3Nxcp+wPfvAD05qVlZWmnCQNGzbMlCspKTHlduzYYcotWrTIlJOkTZs2OR1fX1+v/Px86t+g9Zw9/PDDzs87Tz/9tGnNL774wpSTpIwM2yVyxIgRpty+fftMuddff92Uk6TRo0c7ZxobG/WLX/yCHjCI5Rpwww03mNY8//zzTTlJ6tmzpylXU1Njyp100kmmnOvzyf/15ptvOh3f1NSkX//619S/Qes569evn/OrwCNHjjStOXz4cFNOkn7/+9+bcmeccYYp9+CDD5pyr776qiknSb/97W+djg+Hw3rkkUeiqv9OHzJbXx4PhULKzs52yjY1NZnW7NatmyknyXkQjpXrOWkVyxOsNctbHdy1nrPc3FznHzCstZienm7KxbKmtees9R/LW5ZcH4dW1L+71nPWtWtX5+cd68AXS/1bs9a+aWxsNOVi+TtmZWWZs/SAu1iuAZmZmaY1Y/k5xvrzgXXNePwMZO0B6t9d6znr0qWL83Xb+jjFUhvWnrPWsVX37t3NWWuvRlP/vJkcAAAAABAY05C5cOFCFRQUKDs7W8XFxVq7dm3Q+wISFvUP39ED8Bn1D9/RA4iG85C5dOlSzZgxQ7NmzdKGDRs0YsQIjRkzRtu3b++I/QEJhfqH7+gB+Iz6h+/oAUTLecicP3++Jk+erClTpqiwsFALFixQfn6+Kioqjnl8OBxWfX19mxuQrKh/+M6lB6h/pBquAfAd1wBEy2nIbGxsVFVVlUpLS9vcX1pa2u6nsJaXlysvL+/IjY8uRrKi/uE71x6g/pFKuAbAd1wD4MJpyKytrVVzc7P69OnT5v4+ffq0+3HVM2fOVF1d3ZFbdXW1fbdAHFH/8J1rD1D/SCVcA+A7rgFwYfpM+L//2NpIJNLuR9mGQqFO/xoQoCNR//BdtD1A/SMVcQ2A77gGIBpOr2T27NlT6enpR/22Yvfu3Uf9VgNINdQ/fEcPwGfUP3xHD8CF05CZlZWl4uJiVVZWtrm/srJSw4YNC3RjQKKh/uE7egA+o/7hO3oALpzfLltWVqaJEyeqpKREQ4cO1aJFi7R9+3ZNnTq1I/YHJBTqH76jB+Az6h++owcQLechc9y4cdqzZ4/mzJmjXbt2qaioSCtXrtTAgQM7Yn9AQqH+4Tt6AD6j/uE7egDRSotEIpHOXLC+vl55eXlasmSJunXr1ilrduni/HWgR3z++eem3J49e0y5NWvWmHIXX3yxKSdJW7ZscTq+qalJzz//vOrq6pSbm2te10et9X/llVcqI8P0uVvO9u7da8727dvXlHv++edNufa+a+7LfP
e73zXlJOnPf/6z0/H79u3TyJEjqX+D1vqfNGmSsrKyOmXNWD7NMBwOm3KnnnqqKVdSUmLKXXHFFaacJP3mN79xzjQ0NGju3Ln0gEFrD/z4xz9W165dO2XNWHrgrLPOMuVuuukmU+5nP/uZKRfLdW79+vVOx/MzkF1r/X/ve9/rtA8EimUGaG5uNuUGDRpkyi1dutSUO/HEE005SXrzzTedjm9ubtbmzZujqn/7mQcAAAAA4O8wZAIAAAAAAsOQCQAAAAAIDEMmAAAAACAwDJkAAAAAgMAwZAIAAAAAAsOQCQAAAAAIDEMmAAAAACAwDJkAAAAAgMAwZAIAAAAAAsOQCQAAAAAIDEMmAAAAACAwDJkAAAAAgMBkxGvhTz/9VNnZ2U6ZRYsWddBu2rdp0yZT7t///d9NuW984xum3N69e005ScrLy3M6vrGx0bwWDjv11FOVlZXllHF9nILw4IMPmnLz58835YYPH96p60nSJ5984nT8gQMHzGvhsEgkokgk4pSx1v95551nyknSj3/8Y1Pu7rvvNuUqKytNubVr15pyklRYWOicaWpqMq+Hw7KyspyvAdbnnn79+plyknTTTTeZco8++qgpV1tba8pt377dlJOkc8891+n4hoYG81o4bP/+/Tp06JBTZs+ePR20m/Y9/vjjptz3v/99U+4Pf/iDKXfDDTeYcpJ7/Tc1NWnz5s1RHcsrmQAAAACAwDBkAgAAAAACw5AJAAAAAAiM05BZXl6uIUOGKCcnR71799bVV18d9ftygWRH/cN39AB8Rv3Dd/QAXDgNma+//rqmTZumdevWqbKyUocOHVJpaan279/fUfsDEgb1D9/RA/AZ9Q/f0QNw4fTpsq+88kqbPy9ZskS9e/dWVVWVLr/88kA3BiQa6h++owfgM+ofvqMH4CKmrzCpq6uTJJ188sntHhMOhxUOh4/8ub6+PpYlgYRB/cN3X9YD1D9SGdcA+I5rAI7H/ME/kUhEZWVluuyyy1RUVNTuceXl5crLyztyy8/Pty4JJAzqH76Lpgeof6QqrgHwHdcAfBnzkHnrrbfq3Xff1bPPPnvc42bOnKm6urojt+rqauuSQMKg/uG7aHqA+keq4hoA33ENwJcxvV12+vTpWrFihdasWaMBAwYc99hQKKRQKGTaHJCIqH/4LtoeoP6RirgGwHdcAxANpyEzEolo+vTpWrZsmVavXq2CgoKO2heQcKh/+I4egM+of/iOHoALpyFz2rRpeuaZZ7R8+XLl5OSopqZGkpSXl6euXbt2yAaBREH9w3f0AHxG/cN39ABcOP2bzIqKCtXV1WnkyJHq16/fkdvSpUs7an9AwqD+4Tt6AD6j/uE7egAunN8uC/iK+ofv6AH4jPqH7+gBuIjpezJj8Z3vfEe5ublOmR07dpjWmj9/viknSbfccospd+KJJ5pyffr0MeUuvfRSU06SPvjgA6fj09LSzGvhsIcffti5/qdPn25aq7S01JSTpCuuuMKUKykpMeVmz55tysVSk7///e+djm9qajKvhcP+8z//07n+v/rVr5rWGj58uCknSe+++64pZ/0uuNraWlPu29/+tikn2a4d+/bt04MPPmheE9KkSZOce+Dhhx82rXXw4EFTTpK+9a1vmXJDhw415QoLC025WP5t4KpVq5yO5xoQu/Lycuf6v/fee01r/exnPzPlJOnGG2805c466yxTbsKECabcW2+9ZcpJUq9evczZL2P+ChMAAAAAAP4eQyYAAAAAIDAMmQAAAACAwDBkAgAAAAACw5AJAAAAAAgMQyYAAAAAIDAMmQAAAACAwDBkAgAAAAACw5AJAAAAAAgMQyYAAAAAIDAMmQAAAACAwDBkAgAAAAACw5AJAAAAAAhMRrwWzsvLc87cd999prXuuusuU06Szj33XFPu7bffNuU2b95syr344oumnCSdccYZTsc3NDSY18JhL7zwgrp16+aUKSgoMK21du1aU06STj/9dFPurLPOMu
W+/vWvm3K33nqrKSdJ1113ndPxDQ0NWrVqlXk9HH4uD4VCTpkLLrjAtNbOnTtNOUn6yU9+YsoVFhaacuvWrTPlbrzxRlNOkq699lrnTFNTk3k9HDZ37lznHujbt69prezsbFNOkkaNGmXKhcNhU27Tpk2mXE5OjiknSenp6U7HNzc3m9fCYeXl5c51mZFhG1nKyspMOUnavXu3Kffee++Zci+//LIpd+WVV5pyktSnTx+n4xsbG6M+llcyAQAAAACBYcgEAAAAAAQmpiGzvLxcaWlpmjFjRkDbAZIH9Q/f0QPwGfUPn1H/+DLmIXP9+vVatGiRzj///CD3AyQF6h++owfgM+ofPqP+EQ3TkLlv3z6NHz9eixcv1kknnRT0noCERv3Dd/QAfEb9w2fUP6JlGjKnTZumq666KqpPMwqHw6qvr29zA5IZ9Q/fRdsD1D9SEdcA+Iz6R7ScPw/4ueee01tvvaX169dHdXx5ebkeeOAB540BiYj6h+9ceoD6R6rhGgCfUf9w4fRKZnV1tW6//XY99dRTUX+/zcyZM1VXV3fkVl1dbdooEG/UP3zn2gPUP1IJ1wD4jPqHK6dXMquqqrR7924VFxcfua+5uVlr1qzRo48+qnA4fNSX2oZCIecvHAYSEfUP37n2APWPVMI1AD6j/uHKacgcPXq0Nm7c2Oa+m266Seecc47uueeeo4oLSCXUP3xHD8Bn1D98Rv3DldOQmZOTo6Kiojb3de/eXT169DjqfiDVUP/wHT0An1H/8Bn1D1fm78kEAAAAAODvOX+67N9bvXp1ANsAkhP1D9/RA/AZ9Q+fUf84npiHTKvPPvtMubm5Tpknn3yyg3bTvt69e5tyAwcONOVWrlxpyk2ePNmUk6RevXo5Hb9//37zWjissrJSWVlZTpnvfOc7HbSb9s2ePduUmzJlSqeud9ttt5lykrR161an48PhsHktHLZp0yZlZLhdfuLx4RG1tbWm3J///GdTbtmyZabc7bffbsrFsiZi88c//tG5B0aOHNkxmzmOwsJCU+6vf/2rKWftnVjerrlv3z6n45uamsxr4TDLdfSxxx7rgJ0c34QJE0y5l19+2ZSz/izT0tJiyknS3LlznY6vr6/X4sWLozqWt8sCAAAAAALDkAkAAAAACAxDJgAAAAAgMAyZAAAAAIDAMGQCAAAAAALDkAkAAAAACAxDJgAAAAAgMAyZAAAAAIDAMGQCAAAAAALDkAkAAAAACAxDJgAAAAAgMAyZAAAAAIDAMGQCAAAAAAKTEa+Fn3/+eXXt2tUp09LSYlrr4osvNuUkaeHChabcqFGjTLnJkyebcs8//7wpJ0mjR492Ov7gwYPmtXDYZZdd5lz/b7/9tmmtLl3sv0uqra015S688EJT7pvf/KYpl5mZacpJUq9evZyOp/5jd9pppykrK8sp09DQYForltpYvny5KTdz5kxTbsGCBabcs88+a8pJ0vjx450zTU1N+uUvf2leE9IFF1ygUCjklJk9e7ZprbKyMlNOkrZv327KzZ0715T73ve+Z8pt27bNlJPcrwGNjY3mtXBYWlqa0tLSnDI33nijaa2ioiJTTpJefPFFU+7JJ5805d58801T7rXXXjPlJPfnh3A4HPWxvJIJAAAAAAgMQyYAAAAAIDAMmQAAAACAwDgPmZ988okmTJigHj16qFu3brrwwgtVVVXVEXsDEg71D9/RA/AZ9Q/f0QOIltMH/3z++ecaPny4Ro0apZdfflm9e/fWRx99pBNPPLGDtgckDuofvqMH4DPqH76jB+DCach86KGHlJ+fryVLlhy577TTTjtuJhwOt/kkovr6ercdAgmC+ofvXHuA+kcq4RoA33ENgAunt8uuWLFCJSUluu6669S7d28NHjxYixcvPm6mvLxceXl5R275+fkxbRiIF+ofvnPtAeofqYRrAHzHNQAunIbMbdu2qaKiQmeeeaZWrVqlqVOn6rbbbtMTTzzRbmbmzJmqq6s7cq
uuro5500A8UP/wnWsPUP9IJVwD4DuuAXDh9HbZlpYWlZSUaN68eZKkwYMHa9OmTaqoqNC3vvWtY2ZCoZDzFw4DiYj6h+9ce4D6RyrhGgDfcQ2AC6dXMvv166dzzz23zX2FhYXavn17oJsCEhH1D9/RA/AZ9Q/f0QNw4TRkDh8+XJs3b25z35YtWzRw4MBANwUkIuofvqMH4DPqH76jB+DCaci84447tG7dOs2bN08ffvihnnnmGS1atEjTpk3rqP0BCYP6h+/oAfiM+ofv6AG4cBoyhwwZomXLlunZZ59VUVGR5s6dqwULFmj8+PEdtT8gYVD/8B09AJ9R//AdPQAXTh/8I0ljx47V2LFjO2IvQMKj/uE7egA+o/7hO3oA0XIeMoPyy1/+UpmZmU6Z733ve6a1rDlJevnll025OXPmmHK9e/c25UaPHm3KIT4qKiqUnp7ulCkvLzettXbtWlNOks4++2xTzrrXHj16mHJILlOmTNEJJ5zglLnvvvtMa3322WemnCR94xvfMOU+//xzU65Pnz6mXGlpqSmH+HnjjTecrwFlZWWmtbZu3WrKSfZavuaaa0y5L774wpQ7+eSTTTnER0NDgyKRiFNmzJgxprV+/vOfm3KSNGnSJFPOtbdb7d+/35S7+OKLTbmO5vR2WQAAAAAAjochEwAAAAAQGIZMAAAAAEBgGDIBAAAAAIFhyAQAAAAABIYhEwAAAAAQGIZMAAAAAEBgGDIBAAAAAIFhyAQAAAAABIYhEwAAAAAQGIZMAAAAAEBgGDIBAAAAAIHJ6OwFI5GIJOnQoUPO2f3795vWtKzVqr6+3pRraGgw5Q4ePGjKdabWv1vrY4notZ6z5uZm5+yBAwdMa4bDYVNOkpqamjp1Teo/tbWeM8tzubUWY3n+7+w1ret1ttZ90gPuYrkGWJ9XY6mrxsbGTl3Tul5nat0j9e+u9ZxZHmfrDBBL/Vt/7kpPTzflYvl5rbO41H9apJO7ZMeOHcrPz+/MJdFBqqurNWDAgHhvI6lQ/6mD+ndH/acWesAdPZA6qH931H/qiKb+O33IbGlp0c6dO5WTk6O0tLQ2/199fb3y8/NVXV2t3NzcztxWQku08xKJRLR37171799fXbrwjmsX1L9NIp0b6t/uePUvJdbjnEgS7bzQA3ZcA2wS6dxQ/3bUv00inRuX+u/0t8t26dLlSyff3NzcuJ/ERJRI5yUvLy/eW0hK1H9sEuXcUP820dS/lDiPc6JJpPNCD9hwDYhNopwb6t+G+o9NopybaOufX8EAAAAAAALDkAkAAAAACExCDZmhUEj333+/QqFQvLeSUDgvfuBxbh/nxg88zsfGefEDj3P7ODepj8e4fcl6bjr9g38AAAAAAKkroV7JBAAAAAAkN4ZMAAAAAEBgGDIBAAAAAIFhyAQAAAAABCZhhsyFCxeqoKBA2dnZKi4u1tq1a+O9pbibPXu20tLS2tz69u0b722hg9ADbVH/fqH+j0YP+IP6Pxr17xd64GjJ3gMJMWQuXbpUM2bM0KxZs7RhwwaNGDFCY8aM0fbt2+O9tbg777zztGvXriO3jRs3xntL6AD0wLFR/36g/ttHD6Q+6r991L8f6IH2JXMPJMSQOX/+fE2ePFlTpkxRYWGhFixYoPz8fFVUVMR7a3GXkZGhvn37Hrn16tUr3ltCB6AHjo369wP13z56IPVR/+2j/v1AD7QvmXsg7kNmY2OjqqqqVFpa2ub+0tJSvfHGG3HaVeLYunWr+vfvr4KCAl1//fXatm1bvLeEgNED7aP+Ux/1f3z0QGqj/o+P+k999MDxJXMPxH3IrK2tVXNzs/r06dPm/j59+qimpiZOu0oMl1xyiZ544gmtWrVKixcvVk1NjYYNG6Y9e/bEe2sIED1wbNS/H6j/9tEDqY/6bx/17wd6oH3J3gMZ8d5Aq7S0tDZ/jkQiR93nmzFjxhz534MGDdLQoUN1+umn6/HHH1dZWVkcd4aOQA+0Rf37hf
o/Gj3gD+r/aNS/X+iBoyV7D8T9lcyePXsqPT39qN9W7N69+6jfaviue/fuGjRokLZu3RrvrSBA9EB0qP/URP1Hjx5IPdR/9Kj/1EQPRC/ZeiDuQ2ZWVpaKi4tVWVnZ5v7KykoNGzYsTrtKTOFwWB988IH69esX760gQPRAdKj/1ET9R48eSD3Uf/So/9RED0Qv6XogkgCee+65SGZmZuSxxx6LvP/++5EZM2ZEunfvHvn444/jvbW4uvPOOyOrV6+ObNu2LbJu3brI2LFjIzk5Od6fl1REDxyN+vcH9X9s9IAfqP9jo/79QQ8cW7L3QEL8m8xx48Zpz549mjNnjnbt2qWioiKtXLlSAwcOjPfW4mrHjh264YYbVFtbq169eunSSy/VunXrvD8vqYgeOBr17w/q/9joAT9Q/8dG/fuDHji2ZO+BtEgkEunMBVtaWrRz507l5OR4/w96k1UkEtHevXvVv39/dekS93dcJxXqP/lR/3bUf2qgB+zogeRH/dtR/8nPpf47/ZXMnTt3Kj8/v7OXRQeorq7WgAED4r2NpEL9pw7q3x31n1roAXf0QOqg/t1R/6kjmvrv9CEzJydH0uHN5ebmOmUnT55sWvPzzz835SQpI8N2ir7yla+YctOmTTPlXnrpJVNOkvM/rN67d6/OP//8I48lohdL/T/99NOmNT/77DNTTrLX/+jRo0056yem/fu//7spJ0kTJ050Or6hoUHl5eXUv0Es9X/jjTea1ty9e7cpJ0mZmZmm3DXXXGPKWf+Of/zjH005SXrllVecM+FwWBUVFfSAQSw9sGzZMtOav/nNb0w56fA+Lc444wxTrri42JSL5dM2b775Zqfj9+3bp1GjRlH/BrHUv9Xdd99tzlrnhwsvvNCU27lzpyl30UUXmXKSNGfOHKfjW1patH379qjqv9OHzNaXx3Nzc50LzHrBt/6gHMua2dnZppy16bp162bKxbImb3VwF0v9d+3a1bSmtRYle++ccMIJppy1jtPT0005yX5+qH93yfb8b81ae9X6XNy9e3dTTpJCoZA5Sw+4i6UHrM+P1t6R7M+tWVlZppy1d6zrSfbrFfXvLpb6t4qlNqxZ688V1ufjWGYA61u+o6l/03954cKFKigoUHZ2toqLi7V27VrLfwZIStQ/fEcPwGfUP3xHDyAazkPm0qVLNWPGDM2aNUsbNmzQiBEjNGbMGG3fvr0j9gckFOofvqMH4DPqH76jBxAt5yFz/vz5mjx5sqZMmaLCwkItWLBA+fn5qqio6Ij9AQmF+ofv6AH4jPqH7+gBRMtpyGxsbFRVVZVKS0vb3F9aWqo33njjmJlwOKz6+vo2NyAZUf/wnWsPUP9IJVwD4DuuAXDhNGTW1taqublZffr0aXN/nz59VFNTc8xMeXm58vLyjtz46GIkK+ofvnPtAeofqYRrAHzHNQAuTB/88/efKBSJRNr9lKGZM2eqrq7uyM36cdhAoqD+4btoe4D6RyriGgDfcQ1ANJw+n71nz55KT08/6rcVu3fvPuq3Gq1CoVBMH5EOJArqH75z7QHqH6mEawB8xzUALpxeyczKylJxcbEqKyvb3F9ZWalhw4YFujEg0VD/8B09AJ9R//AdPQAXzt80XVZWpokTJ6qkpERDhw7VokWLtH37dk2dOrUj9gckFOofvqMH4DPqH76jBxAt5yFz3Lhx2rNnj+bMmaNdu3apqKhIK1eu1MCBAztif0BCof7hO3oAPqP+4Tt6ANFKi0Qikc5csL6+Xnl5ebrnnns67X3asXxBbFNTkyn31FNPmXIvvPCCKTd69GhTTpLuvfdep+MbGxu1ZMkS1dXVKTc317yuj1rrv7KyUt27d++UNT/++GNztrGx0ZSbNGmSKdfQ0GDKde3a1ZSTpPXr1zsdv2/fPo0aNYr6N2it//vuu0/Z2dmdsubmzZvN2XA4bMo999xzptyaNWtMuQMHDphykvQf//EfzplDhw5p9erV9IBBaw90pk
suucScbe/fmn6Z5cuXm3JXXHGFKderVy9TTpJuvfVWp+P379+vMWPGUP8GrfX/4IMPdto1wPpzvCT17dvXlJswYYIpN336dFMulr+j689Azc3Neuedd6Kqf9OnywIAAAAAcCwMmQAAAACAwDBkAgAAAAACw5AJAAAAAAgMQyYAAAAAIDAMmQAAAACAwDBkAgAAAAACw5AJAAAAAAgMQyYAAAAAIDAMmQAAAACAwDBkAgAAAAACw5AJAAAAAAgMQyYAAAAAIDAZ8Vo4EokoEok4ZfLy8kxrFRUVmXKSdOedd5pyDz/8sCmXkWF7SGbPnm3KSdKDDz7odHx9fb2WLFliXg+2+s/NzTWtNWjQIFNOsvfOX/7yF1PunHPOMeVWrlxpyknSxIkTnY5vbm42r4XDWlpa1NLS4pTp2bOnaa2hQ4eacpI0depUU27RokWmXK9evUy5tWvXmnKSNHnyZOfMgQMHtHr1avOakAoLC5Wenu6U6d69ewftpn3Lly835azXjp/+9Kem3N69e005SRoxYoTT8fX19ea1cFhWVpZCoZBTJh71P2HCBFOurKzMlMvPzzflPvvsM1NOkqZMmeJ0/MGDB6OejXglEwAAAAAQGIZMAAAAAEBgGDIBAAAAAIFhyAQAAAAABMZpyCwvL9eQIUOUk5Oj3r176+qrr9bmzZs7am9AQqH+4Tt6AD6j/uE7egAunIbM119/XdOmTdO6detUWVmpQ4cOqbS0VPv37++o/QEJg/qH7+gB+Iz6h+/oAbhw+r6MV155pc2flyxZot69e6uqqkqXX355oBsDEg31D9/RA/AZ9Q/f0QNwEdP3ZNbV1UmSTj755HaPCYfDCofDR/7M9wshVVD/8N2X9QD1j1TGNQC+4xqA4zF/8E8kElFZWZkuu+yy437pbnl5ufLy8o7crF80CiQS6h++i6YHqH+kKq4B8B3XAHwZ85B566236t1339Wzzz573ONmzpypurq6I7fq6mrrkkDCoP7hu2h6gPpHquIaAN9xDcCXMb1ddvr06VqxYoXWrFmjAQMGHPfYUCikUChk2hyQiKh/+C7aHqD+kYq4BsB3XAMQDachMxKJaPr06Vq2bJlWr16tgoKCjtoXkHCof/iOHoDPqH/4jh6AC6chc9q0aXrmmWe0fPly5eTkqKamRpKUl5enrl27dsgGgURB/cN39AB8Rv3Dd/QAXDj9m8yKigrV1dVp5MiR6tev35Hb0qVLO2p/QMKg/uE7egA+o/7hO3oALpzfLhuUWbNmKTc31ykzadIk01qXXXaZKSdJtbW1plzrxzq7evfdd025qVOnmnKStGjRIqfjGxoazGslsyDr/5JLLnGu/6eeesq01siRI005SbruuutMuTvvvNOU++lPf2rK/fGPfzTlJGno0KFOxzc2Nmrr1q3m9ZJZUD1w9913O9f/xIkTTWt97WtfM+Uk6bXXXjPlPvvsM1PupZdeMuV+8IMfmHKS9OijjzpnGhsbzeslsyCvAevWrXPugeHDh5vWOuOMM0w5yV5b5513nin3yCOPmHIlJSWmnCS99957Tsf7+jOQFFwPpKenKz093Smzc+dO01onnniiKSdJ1157rSmXl5dnylnr+KyzzjLlJOmZZ55xOr6pqSnqY82fLgsAAAAAwN9jyAQAAAAABIYhEwAAAAAQGIZMAAAAAEBgGDIBAAAAAIFhyAQAAAAABIYhEwAAAAAQGIZMAAAAAEBgGDIBAAAAAIFhyAQAAAAABIYhEwAAAAAQGIZMAAAAAEBgGDIBAAAAAIHJiNfC//Vf/6Xs7GynzIUXXmha65NPPjHlJOmRRx4x5QoLC025k046yZTLzMw05STp448/djq+sbHRvBYO27Jli0444QSnTElJiWmtHTt2mHKSdPvtt5ty1vq/9NJLTbkXXnjBlJOkW2+91en4AwcO6OmnnzavB+kXv/iFunbt6pQZMWKEaa1t27aZcpK0efNmU+7iiy825c444wxTLp
a/Y0aG+48BLS0t5vVw2M0336ysrCynTF1dnWmthoYGU06SnnrqKVPuK1/5iik3YcIEU+7ll1825SQpNzfX6fjm5mbzWjhs//79zufx008/Na31l7/8xZSTpCuuuMKUe//99025v/3tb6bchg0bTDlJGjBggNPx4XA46mN5JRMAAAAAEBiGTAAAAABAYBgyAQAAAACBiWnILC8vV1pammbMmBHQdoDkQf3Dd/QAfEb9w2fUP76Mechcv369Fi1apPPPPz/I/QBJgfqH7+gB+Iz6h8+of0TDNGTu27dP48eP1+LFi82fhgokK+ofvqMH4DPqHz6j/hEt05A5bdo0XXXVVbryyiu/9NhwOKz6+vo2NyCZUf/wXbQ9QP0jFXENgM+of0TL+QuynnvuOb311ltav359VMeXl5frgQcecN4YkIiof/jOpQeof6QargHwGfUPF06vZFZXV+v222/XU089pezs7KgyM2fOVF1d3ZFbdXW1aaNAvFH/8J1rD1D/SCVcA+Az6h+unF7JrKqq0u7du1VcXHzkvubmZq1Zs0aPPvqowuGw0tPT22RCoZBCoVAwuwXiiPqH71x7gPpHKuEaAJ9R/3DlNGSOHj1aGzdubHPfTTfdpHPOOUf33HPPUcUFpBLqH76jB+Az6h8+o/7hymnIzMnJUVFRUZv7unfvrh49ehx1P5BqqH/4jh6Az6h/+Iz6hyvz92QCAAAAAPD3nD9d9u+tXr3alPvd736njAy35a+66irTWj169DDlJOm1114z5Xr27GnKTZ061ZR7+OGHTTlJ+uKLL5yOb2pqMq+Vaqz1//bbb6tbt25OmW9+85umtWLx05/+1JS77LLLTLlvfetbptyPfvQjU06S3nnnHafjDx48aF4rFVl64He/+50yMzOdMuPHj3deR5J69eplyknSkiVLTLnCwkJTbvDgwabcSy+9ZMpJ0meffeac4Rrw/1mvARdddJG6du3qlLn55ptNa8XikUceMeUmTJhgyv3P//yPKffP//zPppwk1dTUOB1/4MAB81qpxlr/VVVVztcA6/Nj3759TTlJOvnkk025/Px8U2758uWm3LXXXmvKSdK2bducjm9sbIz6WF7JBAAAAAAEhiETAAAAABAYhkwAAAAAQGAYMgEAAAAAgWHIBAAAAAAEhiETAAAAABAYhkwAAAAAQGAYMgEAAAAAgWHIBAAAAAAEhiETAAAAABAYhkwAAAAAQGAYMgEAAAAAgWHIBAAAAAAEJiNeC19yySXKzs52ynz66aemtTIy7H/N6dOnm3IDBgww5R5++GFTbubMmaacJN11111Ox4fDYfNaOOz000/XCSec4JRx7ZdWq1evNuUk6YUXXjDlvv71r5tyZWVlptwDDzxgyknStGnTnI7fv3+/eS0cVlJS4lzPGzduNK2VmZlpyknSv/7rv5py3bp1M+VeeuklU2706NGmnCTV1NQ4Z7gGxG7QoEHq3r27U6axsdG0ViyP15lnnmnKnXvuuaZcWlqaKXfZZZeZcpL06quvOh2fnp5uXguHnXvuuc7XgIMHD5rWmj17tiknSdddd50p16WL7TW8a6+91pQ7++yzTTlJ+uKLL5yOd3k+4ZVMAAAAAEBgGDIBAAAAAIFhyAQAAAAABMZ5yPzkk080YcIE9ejRQ926ddOFF16oqqqqjtgbkHCof/iOHoDPqH/4jh5AtJw+Eefzzz/X8OHDNWrUKL388svq3bu3PvroI5144okdtD0gcVD/8B09AJ9R//AdPQAXTkPmQw89pPz8fC1ZsuTIfaeddlrQewISEvUP39ED8Bn1D9/RA3Dh9HbZFStWqKSkRNddd5169+6twYMHa/HixcfNhMNh1dfXt7kByYj6h+9ce4D6RyrhGgDfcQ2AC6chc9u2baqoqNCZZ56pVatWaerUqbrtttv0xBNPtJspLy9XXl7ekVt+fn7MmwbigfqH71x7gPpHKuEaAN9xDYALpyGzpaVFF110kebNm6
fBgwfrlltu0be//W1VVFS0m5k5c6bq6uqO3Kqrq2PeNBAP1D9859oD1D9SCdcA+I5rAFw4DZn9+vXTueee2+a+wsJCbd++vd1MKBRSbm5umxuQjKh/+M61B6h/pBKuAfAd1wC4cBoyhw8frs2bN7e5b8uWLRo4cGCgmwISEfUP39ED8Bn1D9/RA3DhNGTecccdWrdunebNm6cPP/xQzzzzjBYtWqRp06Z11P6AhEH9w3f0AHxG/cN39ABcOA2ZQ4YM0bJly/Tss8+qqKhIc+fO1YIFCzR+/PiO2h+QMKh/+I4egM+of/iOHoALp+/JlKSxY8dq7NixHbEXIOFR//AdPQCfUf/wHT2AaDkPmUHJyspSVlaWUyYtLc201saNG005Sfr88887NdenTx9T7q677jLlEB//9m//powMt/b70Y9+ZFrrvffeM+Uk6fbbbzflXnnlFVOuX79+phxv1UkutbW1CoVCThlr/V9zzTWmnCTt3r3blPv0009NOeu/a3r66adNOcRPv379lJOT45Q5/fTTTWv95Cc/MeUkafbs2abciy++aMq5npNWr776qimH+Fi9erXzz0D/+I//aFqrsbHRlJOkf/mXfzHlCgoKTLkvvvjClHvzzTdNuY7m9HZZAAAAAACOhyETAAAAABAYhkwAAAAAQGAYMgEAAAAAgWHIBAAAAAAEhiETAAAAABAYhkwAAAAAQGAYMgEAAAAAgWHIBAAAAAAEhiETAAAAABAYhkwAAAAAQGAYMgEAAAAAgcno7AUjkYgkqaGhwTnb1NRkWtOak6RwOGzKNTY2dup6nan179b6WCJ6refs0KFDztmDBw8GvZ0vdeDAAVPOutf9+/ebcp2p9ZxQ/+5az1lnPs/F4/nfumYyPP9L/3+f9IC71nO2b98+52x9fb1pTcvPW7Guab12dOmS+K99cA2wi+VnIGsdW2tYsu1TSu0ZwOX5Py3SyV2yY8cO5efnd+aS6CDV1dUaMGBAvLeRVKj/1EH9u6P+Uws94I4eSB3UvzvqP3VEU/+dPmS2tLRo586dysnJUVpaWpv/r76+Xvn5+aqurlZubm5nbiuhJdp5iUQi2rt3r/r3758Uv3VMJNS/TSKdG+rf7nj1LyXW45xIEu280AN2XANsEuncUP921L9NIp0bl/rv9LfLdunS5Usn39zc3LifxESUSOclLy8v3ltIStR/bBLl3FD/NtHUv5Q4j3OiSaTzQg/YcA2ITaKcG+rfhvqPTaKcm2jrn1/BAAAAAAACw5AJAAAAAAhMQg2ZoVBI999/v0KhULy3klA4L37gcW4f58YPPM7HxnnxA49z+zg3qY/HuH3Jem46/YN/AAAAAACpK6FeyQQAAAAAJDeGTAAAAABAYBgyAQAAAACBYcgEAAAAAASGIRMAAAAAEJiEGTIXLlyogoICZWdnq7i4WGvXro33luJu9uzZSktLa3Pr27dvvLeFDkIPtEX9+4X6Pxo94A/q/2jUv1/ogaMlew8kxJC5dOlSzZgxQ7NmzdKGDRs0YsQIjRkzRtu3b4/31uLuvPPO065du47cNm7cGO8toQPQA8dG/fuB+m8fPZD6qP/2Uf9+oAfal8w9kBBD5vz58zV58mRNmTJFhYWFWrBggfLz81VRURHvrcVdRkaG+vbte+TWq1eveG8JHYAeODbq3w/Uf/vogdRH/beP+vcDPdC+ZO6BuA+ZjY2NqqqqUmlpaZv7S0tL9cYbb8RpV4lj69at6t+/vwoKCnT99ddr27Zt8d4SAkYPtI/6T33U//HRA6mN+j8+6j/10QPHl8w9EPchs7a2Vs3NzerTp0+b+/v06aOampo47SoxXHLJJXriiSe0atUqLV68WDU1NRo2bJj27NkT760hQPTAsVH/fqD+20cPpD7qv33Uvx/ogfYlew9kxHsDrdLS0tr8ORKJHHWfb8aMGXPkfw8aNEhDhw7V6aefrscff1xlZWVx3Bk6Aj3QFvXvF+r/aPSAP6j/o1
H/fqEHjpbsPRD3VzJ79uyp9PT0o35bsXv37qN+q+G77t27a9CgQdq6dWu8t4IA0QPRof5TE/UfPXog9VD/0aP+UxM9EL1k64G4D5lZWVkqLi5WZWVlm/srKys1bNiwOO0qMYXDYX3wwQfq169fvLeCANED0aH+UxP1Hz16IPVQ/9Gj/lMTPRC9pOuBSAJ47rnnIpmZmZHHHnss8v7770dmzJgR6d69e+Tjjz+O99bi6s4774ysXr06sm3btsi6desiY8eOjeTk5Hh/XlIRPXA06t8f1P+x0QN+oP6Pjfr3Bz1wbMneAwnxbzLHjRunPXv2aM6cOdq1a5eKioq0cuVKDRw4MN5bi6sdO3bohhtuUG1trXr16qVLL71U69at8/68pCJ64GjUvz+o/2OjB/xA/R8b9e8PeuDYkr0H0iKRSKQzF2xpadHOnTuVk5Pj/T/oTVaRSER79+5V//791aVL3N9xnVSo/+RH/dtR/6mBHrCjB5If9W9H/Sc/l/rv9Fcyd+7cqfz8/M5eFh2gurpaAwYMiPc2kgr1nzqof3fUf2qhB9zRA6mD+ndH/aeOaOq/04fMnJwcSYc3l5ub65QdP368ac3a2lpTTpIyMzNNuauuusqU27Fjhyn3+uuvm3KSNG/ePKfjDxw4oHHjxh15LBG91nNWXFys9PR0p2zfvn1Na7a0tJhy0uEvSbZ49dVXTbnvfve7ptyQIUNMOUn68MMPnY4Ph8P68Y9/TP0btJ6z0tJS5+fWcDhsWnP06NGmnCTzD0MfffSRKff++++bcqeddpopJ0mvvPKKc6a5uVnvvfcePWDQes5GjhypjAy3H8Gsn7QZyxvWrNeA1157zZSbOHGiKRfLB6H84he/cDq+ublZW7Zsof4NYpkBfvOb35jWXLBggSknyfnntFhzb775pilXWFhoyknSk08+6XT8vn37NHLkyKjqv9OHzNaXx3Nzc50LzDrwuT6RB5HNzs425UKhkClnLWjp8EciW/BWB3et5yw9Pd25tqz1H8uQ2cnvpjfXf7du3cxrWnuV+nfXes4yMzOd69lax9bHV7LXlXXNrKwsU87aN1Js1w56wF3rOcvIyHDuAWt9xPI8bs1a30ZqreVY+tzaA9S/u1hmAOvzcSzPcdYZIJY1O3u9E044wZSLpv5NzwILFy5UQUGBsrOzVVxcrLVr11r+M0BSov7hO3oAPqP+4Tt6ANFwHjKXLl2qGTNmaNasWdqwYYNGjBihMWPGaPv27R2xPyChUP/wHT0An1H/8B09gGg5D5nz58/X5MmTNWXKFBUWFmrBggXKz89XRUVFR+wPSCjUP3xHD8Bn1D98Rw8gWk5DZmNjo6qqqlRaWtrm/tLSUr3xxhvHzITDYdXX17e5AcmI+ofvXHuA+kcq4RoA33ENgAunIbO2tlbNzc1HfcJZnz59VFNTc8xMeXm58vLyjtz46GIkK+ofvnPtAeofqYRrAHzHNQAuTB/88/efKBSJRNr9lKGZM2eqrq7uyK26utqyJJAwqH/4LtoeoP6RirgGwHdcAxANp8/m7dmzp9LT04/6bcXu3bvb/f6mUCgU08erA4mC+ofvXHuA+kcq4RoA33ENgAunVzKzsrJUXFysysrKNvdXVlZq2LBhgW4MSDTUP3xHD8Bn1D98Rw/AhfO3jJaVlWnixIkqKSnR0KFDtWjRIm3fvl1Tp07tiP0BCYX6h+/oAfiM+ofv6AFEy3nIHDdunPbs2aM5c+Zo165dKioq0sqVKzVw4MCO2B+QUKh/+I4egM+of/iOHkC00iKRSKQzF6yvr1deXp7uvvvuTnuf9ocffmjONjY2mnLPP/+8Kbdy5UpT7mc/+5kpJ0nLli0z5erq6pSbm2te10et9T9+/HhlZWV1ypp5eXnm7DvvvGPKzZw505T7/ve/b8rdc889ppwkzZo1y+n45uZmbd26lfo3aK3/Bx98UNnZ2Z2y5t69e83ZnJwcU669r7P4Mtddd50p9/Of/9
yUk6QNGzY4Z1paWrR79256wKC1B2655ZZO+xkolsdo06ZNptxtt91myj344IOm3JVXXmnKSdLjjz/udHxzc7M++OAD6t+gtf779++vLl1Mnz3qbMiQIeas9eenJUuWmHL333+/KbdlyxZTTpKuuuoqp+MPHjyom2++Oar675xHGAAAAADgBYZMAAAAAEBgGDIBAAAAAIFhyAQAAAAABIYhEwAAAAAQGIZMAAAAAEBgGDIBAAAAAIFhyAQAAAAABIYhEwAAAAAQGIZMAAAAAEBgGDIBAAAAAIFhyAQAAAAABIYhEwAAAAAQmIx4LRyJRBSJRJwyJ510kmmtkpISU06SysrKTLlHHnnElPvf//1fU65///6mnCQ99thjTscfPHhQt956q3k9SDt27FBGhlv7/eUvf+mg3bRvx44dptyQIUNMuQ0bNphyL774oiknSQMGDHA6/tChQ9q6dat5PUi5ubnq2rWrU6Zbt26mtfr162fKSdI3vvENU65Pnz6m3Pvvv2/KnXLKKaacJFVXVztnmpubtXv3bvOakHbt2qXMzEynTDyed9555x1T7qtf/aop9+qrr5pyZ511liknSePGjXM6vqGhQT/84Q/N60G66KKLnOv/7LPP7qDdtK+8vNyUs147TjjhBFNu0qRJppx0+LnIRVNTU9TH8komAAAAACAwDJkAAAAAgMAwZAIAAAAAAsOQCQAAAAAIjNOQWV5eriFDhignJ0e9e/fW1Vdfrc2bN3fU3oCEQv3Dd/QAfEb9w3f0AFw4DZmvv/66pk2bpnXr1qmyslKHDh1SaWmp9u/f31H7AxIG9Q/f0QPwGfUP39EDcOH0HQqvvPJKmz8vWbJEvXv3VlVVlS6//PJjZsLhsMLh8JE/19fXG7YJxB/1D9+59gD1j1TCNQC+4xoAFzH9m8y6ujpJ0sknn9zuMeXl5crLyztyy8/Pj2VJIGFQ//Ddl/UA9Y9UxjUAvuMagOMxD5mRSERlZWW67LLLVFRU1O5xM2fOVF1d3ZGb5YufgURD/cN30fQA9Y9UxTUAvuMagC/j9HbZ/+vWW2/Vu+++qz/84Q/HPS4UCikUClmXARIS9Q/fRdMD1D9SFdcA+I5rAL6MacicPn26VqxYoTVr1mjAgAFB7wlIaNQ/fEcPwGfUP3xHDyAaTkNmJBLR9OnTtWzZMq1evVoFBQUdtS8g4VD/8B09AJ9R//AdPQAXTkPmtGnT9Mwzz2j58uXKyclRTU2NJCkvL09du3btkA0CiYL6h+/oAfiM+ofv6AG4cPrgn4qKCtXV1WnkyJHq16/fkdvSpUs7an9AwqD+4Tt6AD6j/uE7egAunN8uG5Tvf//7ys3Ndcpcf/31prX+6Z/+yZSTpLFjx5pyJSUlptxbb71lyr3++uumnCStX7/e6fjm5mbzWsksyPpfsWKFc/0f7xMMj+f999835STpmmuuMeX69etnyp1yyimmnPXcSIc//c7Fvn37VFxcbF4vmQXVA+PHj3eu/6eeesq0VixfEn7eeeeZcvPmzTPlMjJsn8V3ww03mHKS9Kc//ck5E+RzYTIJ8u/95JNPOvfAiBEjTGtt2bLFlJOkSZMmmXK9e/c25SZOnGjKFRYWmnKSnJ/P9+3bpx/+8Ifm9ZJZUD2QkZHh/Hz361//2rRWLAPwD37wA1MuPT3dvKbFu+++a8526eL2RSMNDQ3R/7ddNwMAAAAAQHsYMgEAAAAAgWHIBAAAAAAEhiETAAAAABAYhkwAAAAAQGAYMgEAAAAAgWHIBAAAAAAEhiETAAAAABAYhkwAAAAAQGAYMgEAAAAAgWHIBAAAAAAEhiETAAAAABAYhkwAAAAAQGAy4rVwRUWFsrOznTJDhw41rfXxxx+bcpJ04YUXmnKDBg0y5Wpqaky5oqIiU06Szj//fKfjGxsbVVVVZV4PUl5ennOmpKTEtNbYsWNNOUlKS0sz5Zqamky5MWPGmHK1tbWmnCTNnTvX6fjGxkbzWjhswY
IFzs//p556qmmtvn37mnKSVFZWZsrt37/flGtoaDDl7rjjDlNOki644ALnTGNjozZt2mReE9I//MM/qEsXt9/zDx482LRWaWmpKSfJeY+twuGwKVdQUGDKbdiwwZSTpI0bNzodzzUgdrt27VJGhtsIYnmukqSHHnrIlJMO79PimmuuMeXmzJljyvXv39+Uk6QpU6Y4Hb93717dd999UR3LK5kAAAAAgMAwZAIAAAAAAsOQCQAAAAAITExDZnl5udLS0jRjxoyAtgMkD+ofvqMH4DPqHz6j/vFlzEPm+vXrtWjRIucPjQFSAfUP39ED8Bn1D59R/4iGacjct2+fxo8fr8WLF+ukk04Kek9AQqP+4Tt6AD6j/uEz6h/RMg2Z06ZN01VXXaUrr7zyS48Nh8Oqr69vcwOSGfUP30XbA9Q/UhHXAPiM+ke0nL8n87nnntNbb72l9evXR3V8eXm5HnjgAeeNAYmI+ofvXHqA+keq4RoAn1H/cOH0SmZ1dbVuv/12PfXUU1F/kfbMmTNVV1d35FZdXW3aKBBv1D9859oD1D9SCdcA+Iz6hyunVzKrqqq0e/duFRcXH7mvublZa9as0aOPPqpwOKz09PQ2mVAopFAoFMxugTii/uE71x6g/pFKuAbAZ9Q/XDkNmaNHj9bGjRvb3HfTTTfpnHPO0T333HNUcQGphPqH7+gB+Iz6h8+of7hyGjJzcnJUVFTU5r7u3burR48eR90PpBrqH76jB+Az6h8+o/7hyvw9mQAAAAAA/L20SCQS6cwF6+vrlZeXp0suuUQZGW4fbjty5MiO2dRxvPXWW6bcqaeeasqdccYZptyhQ4dMOUn605/+5HR8U1OTXnrpJdXV1Sk3N9e8ro9a6//KK690rv/KysoO2lX7xo4da8pdccUVptxf//pXU66lpcWUk6RJkyY5Hb9v3z595Stfof4NWut/8ODBzm+t+trXvtZBu2qfa4+2amhoMOXmzp1rys2ePduUk6QBAwY4Zw4ePKjp06fTAwatPTBmzBhlZmY6ZVesWNFBu2rfxIkTTblTTjnFlPvkk09MOUsdt7rhhhucjt+3b5+GDRtG/Ru01v+cOXOi/vCgVh988EEH7Sp4y5cvN+XuvvtuU2748OGmnCTV1NQ4HX/gwAFNmjQpqvrnlUwAAAAAQGAYMgEAAAAAgWHIBAAAAAAEhiETAAAAABAYhkwAAAAAQGAYMgEAAAAAgWHIBAAAAAAEhiETAAAAABAYhkwAAAAAQGAYMgEAAAAAgWHIBAAAAAAEhiETAAAAABAYhkwAAAAAQGAy4rXw6NGjlZ2d7ZQ5ePCgaa3MzExTTpL27Nljyv3tb38z5U477TRT7t577zXlJKmsrMzp+HA4bF4LhxUUFCgrK8spM3DgQNNa1pxk77m3337blMvJyTHlhg4daspJ0sKFC52Ob2xsNK+Fw0aNGqVQKOSU+eKLL0xruV5n/q///u//NuVeeOEFUy49Pd2Uu/zyy005SfrVr37lnKEHYnf22Wc798BZZ51lWmvAgAGmnCRt2rTJlPvoo49MuTPOOMOUq66uNuUk6ec//7nT8fwMFLuSkhJ1797dKfPuu++a1tqxY4cpJ0nDhw835T755BNT7s9//rMpZ70+SlJeXp7T8RkZ0Y+OvJIJAAAAAAgMQyYAAAAAIDAMmQAAAACAwDgPmZ988okmTJigHj16qFu3brrwwgtVVVXVEXsDEg71D9/RA/AZ9Q/f0QOIltMH/3z++ecaPny4Ro0apZdfflm9e/fWRx99pBNPPLGDtgckDuofvqMH4DPqH76jB+DCach86KGHlJ+fryVLlhy5z/ppqECyof7hO3oAPqP+4Tt6AC6c3i67YsUKlZSU6LrrrlPv3r01ePBgLV68+LiZcDis+vr6NjcgGVH/8J1rD1D/SCVcA+A7rgFw4TRkbtu2TRUVFTrzzDO1atUqTZ06VbfddpueeOKJdjPl5eXKy8s7csvPz49500A8UP
/wnWsPUP9IJVwD4DuuAXDhNGS2tLTooosu0rx58zR48GDdcsst+va3v62Kiop2MzNnzlRdXd2RWyxfmAvEE/UP37n2APWPVMI1AL7jGgAXTkNmv379dO6557a5r7CwUNu3b283EwqFlJub2+YGJCPqH75z7QHqH6mEawB8xzUALpyGzOHDh2vz5s1t7tuyZYsGDhwY6KaARET9w3f0AHxG/cN39ABcOA2Zd9xxh9atW6d58+bpww8/1DPPPKNFixZp2rRpHbU/IGFQ//AdPQCfUf/wHT0AF05D5pAhQ7Rs2TI9++yzKioq0ty5c7VgwQKNHz++o/YHJAzqH76jB+Az6h++owfgwul7MiVp7NixGjt2bEfsBUh41D98Rw/AZ9Q/fEcPIFrOQ2ZQtmzZoszMTKfMm2++aVqrqKjIlJOk4uJiU662ttaU2717tylXVlZmyiE+LrjgAnXt2tUpc99995nW6tWrlyknSffee68pN2fOHFNuypQpptxvf/tbUw7x8eijjyotLc0p09DQYFrrq1/9qikn2Z//77zzTlPulFNOMeV+9atfmXKIn8LCQudrQI8ePUxr/ehHPzLlJOn222835a6++mpT7o477jDlevbsacohPl566SWFQiGnzJNPPtlBu2lfY2OjKffSSy+Zcjk5Oaac67nsLE5vlwUAAAAA4HgYMgEAAAAAgWHIBAAAAAAEhiETAAAAABAYhkwAAAAAQGAYMgEAAAAAgWHIBAAAAAAEhiETAAAAABAYhkwAAAAAQGAYMgEAAAAAgWHIBAAAAAAEhiETAAAAABCYjM5eMBKJSJKampqcsy0tLaY1LWu1amxs7NQ1w+GwKdeZWs9J62OJ6LWes4MHDzpnrfXf3NxsyknSgQMHTLn6+npTztpvnYn6t2s9Z5ZzZ62pQ4cOmXJS5z//J0P9S/RALGK5Blifj2Ppgc6+BvAzUGprPWeW5zprTcXC+pxs7ZsuXRL/tb/Wv1s09Z8W6eQu2bFjh/Lz8ztzSXSQ6upqDRgwIN7bSCrUf+qg/t1R/6mFHnBHD6QO6t8d9Z86oqn/Th8yW1patHPnTuXk5CgtLa3N/1dfX6/8/HxVV1crNze3M7eV0BLtvEQiEe3du1f9+/dPit+6JBLq3yaRzg31b3e8+pcS63FOJIl2XugBO64BNol0bqh/O+rfJpHOjUv9d/rbZbt06fKlk29ubm7cT2IiSqTzkpeXF+8tJCXqPzaJcm6of5to6l9KnMc50STSeaEHbLgGxCZRzg31b0P9xyZRzk209c+vYAAAAAAAgWHIBAAAAAAEJqGGzFAopPvvv1+hUCjeW0konBc/8Di3j3PjBx7nY+O8+IHHuX2cm9THY9y+ZD03nf7BPwAAAACA1JVQr2QCAAAAAJIbQyYAAAAAIDAMmQAAAACAwDBkAgAAAAACw5AJAAAAAAhMwgyZCxcuVEFBgbKzs1VcXKy1a9fGe0txN3v2bKWlpbW59e3bN97bQgehB9qi/v1C/R+NHvAH9X806t8v9MDRkr0HEmLIXLp0qWbMmKFZs2Zpw4YNGjFihMaMGaPt27fHe2txd95552nXrl1Hbhs3boz3ltAB6IFjo/79QP23jx5IfdR/+6h/P9AD7UvmHkiIIXP+/PmaPHmypkyZosLCQi1YsED5+fmqqKiI99biLiMjQ3379j1y69WrV7y3hA5ADxwb9e8H6r999EDqo/7bR/37gR5oXzL3QNyHzMbGRlVVVam0tLTN/aWlpXrjjTfitKvEsXXrVvXv318FBQW6/vrrtW3btnhvCQGjB9pH/ac+6v/46IHURv0fH/Wf+uiB40vmHoj7kFlbW6vm5mb16dOnzf19+vRRTU1NnHaVGC655BI98cQTWrVqlRYvXqyamhoNGzZMe/bsiffWECB64Niofz9Q/+2jB1If9d8+6t8P9ED7kr0HMuK9gVZpaWlt/hyJRI66zzdjxow58r8HDRqkoUOH6vTTT9fjjz+usr
KyOO4MHYEeaIv69wv1fzR6wB/U/9Gof7/QA0dL9h6I+yuZPXv2VHp6+lG/rdi9e/dRv9XwXffu3TVo0CBt3bo13ltBgOiB6FD/qYn6jx49kHqo/+hR/6mJHohesvVA3IfMrKwsFRcXq7Kyss39lZWVGjZsWJx2lZjC4bA++OAD9evXL95bQYDogehQ/6mJ+o8ePZB6qP/oUf+piR6IXtL1QCQBPPfcc5HMzMzIY489Fnn//fcjM2bMiHTv3j3y8ccfx3trcXXnnXdGVq9eHdm2bVtk3bp1kbFjx0ZycnK8Py+piB44GvXvD+r/2OgBP1D/x0b9+4MeOLZk74GE+DeZ48aN0549ezRnzhzt2rVLRUVFWrlypQYOHBjvrcXVjh07dMMNN6i2tla9evXSpZdeqnXr1nl/XlIRPXA06t8f1P+x0QN+oP6Pjfr3Bz1wbMneA2mRSCQS700AAAAAAFJD3P9NJgAAAAAgdTBkAgAAAAACw5AJAAAAAAgMQyYAAAAAIDAMmQAAAACAwDBkAgAAAAACw5AJAAAAAAgMQyYAAAAAIDAMmQAAAACAwDBkAgAAAAACw5AJAAAAAAjM/wNLGjOwcsJ7lwAAAABJRU5ErkJggg=="
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"with open(os.path.join('data', 'images.npy'), 'rb') as f:\n",
"    images = np.load(f)\n",
"\n",
"print('Shape:', images.shape)\n",
"show_images(images[:18], n_row=3, n_col=5, figsize=[12,5])"
]
},
{
"cell_type": "markdown",
"id": "cbe832b6",
"metadata": {},
"source": [
"## Data Exploration & Preparation"
]
},
{
"cell_type": "markdown",
"id": "2f6a464c",
"metadata": {},
"source": [
"### 1. Descriptive Analysis"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "3b1f62dd",
"metadata": {
"ExecuteTime": {
"end_time": "2024-04-20T14:34:03.954499Z",
"start_time": "2024-04-20T14:34:03.584502Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(357699, 61)\n",
"V5 172305\n",
"V15 191109\n",
"V38 302903\n",
"V39 347413\n",
"shape (357699, 64)\n",
"nan 0 191109\n",
"nan 12 172305\n",
"nan 19 347413\n",
"nan 24 302903\n",
"0 55 357699\n",
"0 61 357699\n",
"0 62 357699\n",
"0 63 357699\n"
]
}
],
"source": [
"targets = df['target']\n",
"print(df.shape)\n",
"# List tabular columns with more than 100,000 missing values\n",
"for column in df:\n",
"    if df[column].isna().sum() > 100000:\n",
"        print(column, df[column].isna().sum())\n",
"# V38 and V39 are mostly NaN, so drop them; V5 and V15 (about half NaN) can be imputed or dropped\n",
"# Flatten each image into one row of pixel values\n",
"flattened_images = images.reshape(images.shape[0], -1)\n",
"print('shape', flattened_images.shape)\n",
"# Identify uninformative pixel columns (mostly NaN or mostly zero)\n",
"for i, col in enumerate(flattened_images.T):\n",
"    if np.isnan(col).sum() > 100000:\n",
"        print('nan', i, np.isnan(col).sum())\n",
"    if (col == 0).sum() > 100000:\n",
"        print('0', i, (col == 0).sum())\n",
"# Columns 19 and 24 are mostly NaN and columns 55, 61, 62, 63 are all zero, so drop them\n",
"# Columns 0 and 12 are about half NaN, so interpolate them instead\n"
]
},
{
"cell_type": "markdown",
"id": "adb61967",
"metadata": {},
"source": [
"### 2. Detection and Handling of Missing Values"
]
},
{
"cell_type": "code",
"execution_count": 85,
"id": "4bb9cdfb",
"metadata": {
"ExecuteTime": {
"end_time": "2024-04-16T09:06:12.435405Z",
"start_time": "2024-04-16T09:06:12.381979Z"
}
},
"outputs": [],
"source": [
"# Drop the mostly-NaN columns and the target; errors='ignore' keeps this cell safe to re-run\n",
"dropped_columns = ['V38', 'V39', 'target']\n",
"df = df.drop(columns=dropped_columns, errors='ignore')\n",
"# Rebuild from `images` so that re-running does not delete a second set of pixel columns\n",
"useless_pixels = [19, 24, 55, 61, 62, 63]\n",
"flattened_images = np.delete(images.reshape(images.shape[0], -1), useless_pixels, axis=1)"
]
},
{
"cell_type": "code",
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(357699, 58)\n"
]
}
],
"source": [
"print(flattened_images.shape)"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-04-16T09:10:48.089387Z",
"start_time": "2024-04-16T09:10:48.083321Z"
}
},
"id": "d996a04b28b2d1be",
"execution_count": 101
},
{
"cell_type": "markdown",
"id": "8adcb9cd",
"metadata": {},
"source": [
"### 3. Detection and Handling of Outliers"
]
},
{
"cell_type": "code",
"execution_count": 100,
"id": "ed1c17a1",
"metadata": {
"ExecuteTime": {
"end_time": "2024-04-16T09:10:46.281705Z",
"start_time": "2024-04-16T09:10:46.278864Z"
}
},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "d4916043",
"metadata": {},
"source": [
"### 4. Detection and Handling of Class Imbalance"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ad3ab20e",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "2552a795",
"metadata": {},
"source": [
"### 5. Understanding Relationship Between Variables"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "29ddbbcf",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "757fb315",
"metadata": {},
"source": [
"### 6. Data Visualization"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "93f82e42",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "2a7eebcf",
"metadata": {},
"source": [
"## Data Preprocessing"
]
},
{
"cell_type": "markdown",
"id": "ae3e3383",
"metadata": {},
"source": [
"### 7. General Preprocessing"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "19174365",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "fb3aa527",
"metadata": {},
"source": [
"### 8. Feature Selection"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a85808bf",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "4921e8ca",
"metadata": {},
"source": [
"### 9. Feature Engineering"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dbcde626",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "fa676c3f",
"metadata": {},
"source": [
"## Modeling & Evaluation"
]
},
{
"cell_type": "markdown",
"id": "589b37e4",
"metadata": {},
"source": [
"### 10. Creating Models"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d8dffd7d",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "495bf3c0",
"metadata": {},
"source": [
"### 11. Model Evaluation"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9245ab47",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "8aa31404",
"metadata": {},
"source": [
"### 12. Hyperparameter Search"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "81addd51",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.7"
}
},
"nbformat": 4,
"nbformat_minor": 5
}