{ "cells": [ { "cell_type": "markdown", "id": "cfbc8b83", "metadata": { "id": "cfbc8b83" }, "source": [ "# **模型訓練(分類問題)**\n", "此份程式碼會講解針對分類型任務在模型訓練上需要注意的細節。\n", "\n", "## 本章節內容大綱\n", "* ### 二元分類問題\n", " * ### [創建資料集/載入資料集(Dataset Creating/ Loading)](#DatasetCreating/Loading)\n", " * ### [資料前處理(Data Preprocessing)](#DataPreprocessing)\n", " * ### [模型建置(Model Building)](#ModelBuilding)\n", " * ### [模型訓練(Model Training)](#ModelTraining)\n", " * ### [模型評估(Model Evaluation)](#ModelEvaluation)\n", "* ### 多元分類問題\n", "---" ] }, { "cell_type": "markdown", "id": "6c1ac997", "metadata": { "id": "6c1ac997" }, "source": [ "## 匯入套件" ] }, { "cell_type": "code", "execution_count": null, "id": "1e78cf9f", "metadata": { "id": "1e78cf9f" }, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "from tqdm.auto import tqdm\n", "\n", "# PyTorch 相關套件\n", "import torch\n", "import torch.nn as nn" ] }, { "cell_type": "markdown", "id": "979f67f3", "metadata": { "id": "979f67f3" }, "source": [ "\n", "## 創建資料集/載入資料集(Dataset Creating / Loading)" ] }, { "cell_type": "code", "execution_count": null, "id": "MuTlaAi9FzqJ", "metadata": { "id": "MuTlaAi9FzqJ" }, "outputs": [], "source": [ "# 上傳資料\n", "!wget -q https://github.com/TA-aiacademy/course_3.0/releases/download/DL/Data_part2.zip\n", "!unzip -q Data_part2.zip" ] }, { "cell_type": "code", "execution_count": null, "id": "5cbef01a", "metadata": { "id": "5cbef01a" }, "outputs": [], "source": [ "train_df = pd.read_csv('./Data/FilmComment_train.csv')\n", "test_df = pd.read_csv('./Data/FilmComment_test.csv')" ] }, { "cell_type": "code", "execution_count": null, "id": "9c5cca18", "metadata": { "id": "9c5cca18" }, "outputs": [], "source": [ "train_df.head()" ] }, { "cell_type": "markdown", "id": "5d6bfd0d", "metadata": { "id": "5d6bfd0d" }, "source": [ "* #### 電影評論資料集\n", "訓練集,測試集分別為 6250,2500 筆,9997 種常用字詞,若在同一則評論中出現該字詞為 1,若否則為 0,y_label 標記評價正面與否。" ] }, { "cell_type": "code", "execution_count": null, "id": "32082d9c", "metadata": { "id": "32082d9c" }, "outputs": [], "source": [ "X_df = train_df.iloc[:, :-1].values\n", "y_df = train_df.y_label.values" ] }, { "cell_type": "code", "execution_count": null, "id": "55e1bc9b", "metadata": { "id": "55e1bc9b" }, "outputs": [], "source": [ "X_test = test_df.iloc[:, :-1].values\n", "y_test = test_df.y_label.values" ] }, { "cell_type": "markdown", "id": "f2133a95", "metadata": { "id": "f2133a95" }, "source": [ "\n", "## 資料前處理(Data Preprocessing)" ] }, { "cell_type": "markdown", "id": "858fc2ed", "metadata": { "id": "858fc2ed" }, "source": [ "* ### 資料正規化(Data Normalization)\n", "由於此資料集的數值範圍都介於 0-1,並且皆是以相同意義轉換特徵值,因此也可以使用原始的數值作為訓練資料。" ] }, { "cell_type": "markdown", "id": "f18e67ea", "metadata": { "id": "f18e67ea" }, "source": [ "* ### 資料切分(Data Splitting)" ] }, { "cell_type": "code", "execution_count": null, "id": "6fe5be4a", "metadata": { "id": "6fe5be4a" }, "outputs": [], "source": [ "# train, valid/test dataset split\n", "from sklearn.model_selection import train_test_split\n", "X_train, X_valid, y_train, y_valid = train_test_split(X_df, y_df, test_size=0.2, random_state=5566)" ] }, { "cell_type": "code", "execution_count": null, "id": "74f1ed43", "metadata": { "id": "74f1ed43" }, "outputs": [], "source": [ "print(f'X_train shape: {X_train.shape}')\n", "print(f'X_valid shape: {X_valid.shape}')\n", "print(f'y_train shape: {y_train.shape}')\n", "print(f'y_valid shape: {y_valid.shape}')" ] }, { "cell_type": "code", "execution_count": null, "id": "nQsKsEMdZ7ZR", "metadata": { "id": "nQsKsEMdZ7ZR" }, "outputs": [], "source": [ "# Build torch dataset and dataloader\n", "from torch.utils.data import TensorDataset, DataLoader\n", "\n", "BATCH_SIZE = 512\n", "\n", "train_dataset = TensorDataset(torch.from_numpy(X_train).float(),\n", " torch.from_numpy(y_train).unsqueeze(1).float())\n", "valid_dataset = TensorDataset(torch.from_numpy(X_valid).float(),\n", " torch.from_numpy(y_valid).unsqueeze(1).float())\n", "\n", "train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True)\n", "valid_loader = DataLoader(valid_dataset, batch_size=BATCH_SIZE, shuffle=False)" ] }, { "cell_type": "code", "execution_count": null, "id": "Z11U2tHZdYXr", "metadata": { "id": "Z11U2tHZdYXr" }, "outputs": [], "source": [ "for x, y in train_loader:\n", " print(x.shape, y.shape, y[:10])\n", " break" ] }, { "cell_type": "markdown", "id": "88a51105", "metadata": { "id": "88a51105" }, "source": [ "\n", "## 模型建置(Model Building)" ] }, { "cell_type": "code", "execution_count": null, "id": "OADvRuBYaGEc", "metadata": { "id": "OADvRuBYaGEc" }, "outputs": [], "source": [ "torch.manual_seed(5566)\n", "\n", "model = nn.Sequential(\n", " nn.Linear(X_train.shape[1], 16),\n", " nn.ReLU(),\n", " nn.Linear(16, 16),\n", " nn.ReLU(),\n", " nn.Linear(16, 1),\n", " nn.Sigmoid(),\n", ")\n", "\n", "print(model)" ] }, { "cell_type": "markdown", "id": "3a358982", "metadata": { "id": "3a358982" }, "source": [ "## 模型訓練(Model training)" ] }, { "cell_type": "markdown", "id": "2fd7204c", "metadata": { "id": "2fd7204c" }, "source": [ "* ### 設定模型訓練時,所需的優化器 (optimizer)、損失函數 (loss function)、評估指標 (metrics)" ] }, { "cell_type": "code", "execution_count": null, "id": "cKqRy_ZWae3Z", "metadata": { "id": "cKqRy_ZWae3Z" }, "outputs": [], "source": [ "optimizer = torch.optim.RMSprop(model.parameters(), lr=0.001)\n", "loss_fn = nn.BCELoss()\n", "\n", "device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')\n", "print(f'device: {device}')\n", "model = model.to(device)" ] }, { "cell_type": "code", "execution_count": null, "id": "i84dnY6MatnT", "metadata": { "id": "i84dnY6MatnT" }, "outputs": [], "source": [ "def train_epoch(model, optimizer, loss_fn, train_dataloader, val_dataloader):\n", " # 訓練一輪\n", " model.train()\n", " total_train_loss = 0\n", " total_train_correct = 0\n", " for x, y in tqdm(train_dataloader, leave=False):\n", " x, y = x.to(device), y.to(device) # 將資料移至GPU\n", " y_pred = model(x) # 計算預測值\n", " loss = loss_fn(y_pred, y) # 計算誤差\n", " optimizer.zero_grad() # 梯度歸零\n", " loss.backward() # 反向傳播計算梯度\n", " optimizer.step() # 更新模型參數\n", "\n", " total_train_loss += loss.item()\n", " total_train_correct += ((y_pred > 0.5) == (y > 0.5)).sum().item()\n", "\n", " # 驗證一輪\n", " model.eval()\n", " total_val_loss = 0\n", " total_val_correct = 0\n", " # 關閉梯度計算以加速\n", " with torch.no_grad():\n", " for x, y in val_dataloader:\n", " x, y = x.to(device), y.to(device)\n", " y_pred = model(x)\n", " loss = loss_fn(y_pred, y)\n", " total_val_loss += loss.item()\n", " total_val_correct += ((y_pred > 0.5) == (y > 0.5)).sum().item()\n", "\n", " avg_train_loss = total_train_loss / len(train_dataloader)\n", " avg_train_acc = total_train_correct / len(train_dataloader.dataset)\n", " avg_val_loss = total_val_loss / len(val_dataloader)\n", " avg_val_acc = total_val_correct / len(val_dataloader.dataset)\n", " return avg_train_loss, avg_val_loss, avg_train_acc, avg_val_acc" ] }, { "cell_type": "code", "execution_count": null, "id": "z2uvE80Havvy", "metadata": { "id": "z2uvE80Havvy" }, "outputs": [], "source": [ "train_loss_log = []\n", "val_loss_log = []\n", "train_acc_log = []\n", "val_acc_log = []\n", "for epoch in tqdm(range(20)):\n", " avg_train_loss, avg_val_loss, avg_train_acc, avg_val_acc = train_epoch(model, optimizer, loss_fn, train_loader, valid_loader)\n", " train_loss_log.append(avg_train_loss)\n", " val_loss_log.append(avg_val_loss)\n", " train_acc_log.append(avg_train_acc)\n", " val_acc_log.append(avg_val_acc)\n", " print(f'Epoch: {epoch}, Train Loss: {avg_train_loss:.3f}, Val Loss: {avg_val_loss:.3f} | Train Acc: {avg_train_acc:.3f}, Val Acc: {avg_val_acc:.3f}')" ] }, { "cell_type": "markdown", "id": "49d3db7d", "metadata": { "id": "49d3db7d" }, "source": [ "\n", "## 模型評估(Model Evaluation)" ] }, { "cell_type": "markdown", "id": "cb66d982", "metadata": { "id": "cb66d982" }, "source": [ "* ### 視覺化訓練過程的評估指標 (Visualization)" ] }, { "cell_type": "code", "execution_count": null, "id": "3e7050e9", "metadata": { "id": "3e7050e9" }, "outputs": [], "source": [ "plt.figure(figsize=(15, 4))\n", "plt.subplot(1, 2, 1)\n", "plt.plot(range(len(train_loss_log)), train_loss_log, label='train_loss')\n", "plt.plot(range(len(val_loss_log)), val_loss_log, label='valid_loss')\n", "plt.xlabel('Epochs')\n", "plt.ylabel('Binary crossentropy')\n", "plt.legend()\n", "\n", "plt.subplot(1, 2, 2)\n", "plt.plot(range(len(train_acc_log)), train_acc_log, label='train_acc')\n", "plt.plot(range(len(val_acc_log)), val_acc_log, label='valid_acc')\n", "plt.xlabel('Epochs')\n", "plt.ylabel('Accuracy')\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "daaebf99", "metadata": { "id": "daaebf99" }, "source": [ "* ### 模型預測(Model predictions)" ] }, { "cell_type": "code", "execution_count": null, "id": "79b4465f", "metadata": { "id": "79b4465f" }, "outputs": [], "source": [ "# predict all test data\n", "model.eval()\n", "\n", "y_pred = []\n", "with torch.no_grad():\n", " for x, _ in tqdm(valid_loader):\n", " x = x.to(device)\n", " y_pred.append(model(x))\n", "\n", "y_pred = torch.cat(y_pred).cpu()\n", "y_pred_class = (y_pred > 0.5).squeeze(1).int()\n", "\n", "print(y_pred_class, y_pred_class.shape)" ] }, { "cell_type": "markdown", "id": "8273fa68", "metadata": { "id": "8273fa68" }, "source": [ "* ### 視覺化結果" ] }, { "cell_type": "code", "execution_count": null, "id": "4a8c906c", "metadata": { "id": "4a8c906c" }, "outputs": [], "source": [ "plt.figure(figsize=(15, 4))\n", "plt.scatter(range(y_pred.shape[0]), y_pred)\n", "plt.hlines(0.5, 0, y_pred.shape[0], colors='red', label='y=0.5')\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "73309200", "metadata": { "id": "73309200" }, "source": [ "----------------\n", "此範例是二元分類,y 的表示方式可用一維陣列,分別以 0, 1 表示兩個類別(正面,負面評價)\n", "![](https://hackmd.io/_uploads/SyTA5tU-p.png)\n", "\n", "**若是多元分類又該如何表示?** pytorch中以**整數值**代表類別(0, 1, ... n)\n", "\n", "**對訓練有何影響?**\n", "跟 y 最直接相關的就是 Loss function,使用torch.nn.CrossEntropyLoss()\n", "\n", "\n", "* 預測值(y_pred):每筆資料n類別預測值\n", "* 解答(y): 0~n-1 的整數值代表解答類別\n", "\n", "\n", "\n" ] }, { "cell_type": "markdown", "id": "ccb031e0", "metadata": { "id": "ccb031e0" }, "source": [ "----------------------------" ] }, { "cell_type": "markdown", "id": "23ba626a", "metadata": { "id": "23ba626a" }, "source": [ "## 多元分類(Multi-class classification)" ] }, { "cell_type": "markdown", "id": "e5e7adb5", "metadata": { "id": "e5e7adb5" }, "source": [ "### 創建資料集/載入資料集(Dataset Creating / Loading)" ] }, { "cell_type": "code", "execution_count": null, "id": "2ada5ae4", "metadata": { "id": "2ada5ae4" }, "outputs": [], "source": [ "train_df = pd.read_csv('./Data/FilmComment_train.csv')\n", "test_df = pd.read_csv('./Data/FilmComment_test.csv')" ] }, { "cell_type": "code", "execution_count": null, "id": "5fa1ba27", "metadata": { "id": "5fa1ba27" }, "outputs": [], "source": [ "X_df = train_df.iloc[:, :-1].values\n", "y_df = train_df.y_label.values" ] }, { "cell_type": "code", "execution_count": null, "id": "1db3602b", "metadata": { "id": "1db3602b" }, "outputs": [], "source": [ "X_test = test_df.iloc[:, :-1].values\n", "y_test = test_df.y_label.values" ] }, { "cell_type": "markdown", "id": "47419c59", "metadata": { "id": "47419c59" }, "source": [ "\n", "## 資料前處理(Data Preprocessing)" ] }, { "cell_type": "code", "execution_count": null, "id": "94cd8466", "metadata": { "id": "94cd8466" }, "outputs": [], "source": [ "# train, valid/test dataset split\n", "from sklearn.model_selection import train_test_split\n", "X_train, X_valid, y_train, y_valid = train_test_split(X_df, y_df,\n", " test_size=0.2,\n", " random_state=5566)" ] }, { "cell_type": "code", "execution_count": null, "id": "b85e17bf", "metadata": { "id": "b85e17bf" }, "outputs": [], "source": [ "print(f'X_train shape: {X_train.shape}')\n", "print(f'X_valid shape: {X_valid.shape}')\n", "print(f'y_train shape: {y_train.shape}')\n", "print(f'y_valid shape: {y_valid.shape}')" ] }, { "cell_type": "code", "execution_count": null, "id": "pcuSsMtHhIFw", "metadata": { "id": "pcuSsMtHhIFw" }, "outputs": [], "source": [ "# Build torch dataset and dataloader\n", "from torch.utils.data import TensorDataset, DataLoader\n", "\n", "BATCH_SIZE = 512\n", "\n", "train_dataset = TensorDataset(torch.from_numpy(X_train).float(),\n", " torch.from_numpy(y_train).long())\n", "valid_dataset = TensorDataset(torch.from_numpy(X_valid).float(),\n", " torch.from_numpy(y_valid).long())\n", "\n", "train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True)\n", "valid_loader = DataLoader(valid_dataset, batch_size=BATCH_SIZE, shuffle=False)\n", "\n", "for x, y in train_loader:\n", " print(x.shape, y.shape, y[:10])\n", " break" ] }, { "cell_type": "markdown", "id": "80924567", "metadata": { "id": "80924567" }, "source": [ "### 模型建置(Model Building)" ] }, { "cell_type": "code", "execution_count": null, "id": "8391cbd3", "metadata": { "id": "8391cbd3" }, "outputs": [], "source": [ "torch.manual_seed(5566)\n", "\n", "# 不需要sigmoid, 分別輸出類別0, 1的值\n", "model = nn.Sequential(\n", " nn.Linear(X_train.shape[1], 16),\n", " nn.ReLU(),\n", " nn.Linear(16, 16),\n", " nn.ReLU(),\n", " nn.Linear(16, 2),\n", ")\n", "\n", "print(model)" ] }, { "cell_type": "markdown", "id": "f9ceea97", "metadata": { "id": "f9ceea97" }, "source": [ "### 模型訓練(Model training)" ] }, { "cell_type": "markdown", "id": "bb5925b9", "metadata": { "id": "bb5925b9" }, "source": [ "* #### 設定模型訓練時,所需的優化器 (optimizer)、損失函數 (loss function)" ] }, { "cell_type": "code", "execution_count": null, "id": "_vqU2S13hs6N", "metadata": { "id": "_vqU2S13hs6N" }, "outputs": [], "source": [ "optimizer = torch.optim.RMSprop(model.parameters(), lr=0.001)\n", "loss_fn = nn.CrossEntropyLoss() # 多元分類損失函數\n", "\n", "device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')\n", "print(f'device: {device}')\n", "model = model.to(device)" ] }, { "cell_type": "code", "execution_count": null, "id": "321b5sOqh5a9", "metadata": { "id": "321b5sOqh5a9" }, "outputs": [], "source": [ "def train_epoch(model, optimizer, loss_fn, train_dataloader, val_dataloader):\n", " # 訓練一輪\n", " model.train()\n", " total_train_loss = 0\n", " total_train_correct = 0\n", " for x, y in tqdm(train_dataloader, leave=False):\n", " x, y = x.to(device), y.to(device) # 將資料移至GPU\n", " y_pred = model(x) # 計算預測值\n", " loss = loss_fn(y_pred, y) # 計算誤差\n", " optimizer.zero_grad() # 梯度歸零\n", " loss.backward() # 反向傳播計算梯度\n", " optimizer.step() # 更新模型參數\n", "\n", " total_train_loss += loss.item()\n", " # 利用argmax計算最大值是第n個類別,與解答比對是否相同\n", " total_train_correct += ((y_pred.argmax(dim=1) == y).sum().item())\n", "\n", " # 驗證一輪\n", " model.eval()\n", " total_val_loss = 0\n", " total_val_correct = 0\n", " # 關閉梯度計算以加速\n", " with torch.no_grad():\n", " for x, y in val_dataloader:\n", " x, y = x.to(device), y.to(device)\n", " y_pred = model(x)\n", " loss = loss_fn(y_pred, y)\n", " total_val_loss += loss.item()\n", " # 利用argmax計算最大值是第n個類別,與解答比對是否相同\n", " total_val_correct += ((y_pred.argmax(dim=1) == y).sum().item())\n", "\n", " avg_train_loss = total_train_loss / len(train_dataloader)\n", " avg_train_acc = total_train_correct / len(train_dataloader.dataset)\n", " avg_val_loss = total_val_loss / len(val_dataloader)\n", " avg_val_acc = total_val_correct / len(val_dataloader.dataset)\n", "\n", " return avg_train_loss, avg_val_loss, avg_train_acc, avg_val_acc" ] }, { "cell_type": "code", "execution_count": null, "id": "sk_Qvl2uiP4U", "metadata": { "id": "sk_Qvl2uiP4U" }, "outputs": [], "source": [ "train_loss_log = []\n", "val_loss_log = []\n", "train_acc_log = []\n", "val_acc_log = []\n", "for epoch in tqdm(range(20)):\n", " avg_train_loss, avg_val_loss, avg_train_acc, avg_val_acc = train_epoch(model, optimizer, loss_fn, train_loader, valid_loader)\n", " train_loss_log.append(avg_train_loss)\n", " val_loss_log.append(avg_val_loss)\n", " train_acc_log.append(avg_train_acc)\n", " val_acc_log.append(avg_val_acc)\n", " print(f'Epoch: {epoch}, Train Loss: {avg_train_loss:.3f}, Val Loss: {avg_val_loss:.3f} | Train Acc: {avg_train_acc:.3f}, Val Acc: {avg_val_acc:.3f}')" ] }, { "cell_type": "markdown", "id": "f7941459", "metadata": { "id": "f7941459" }, "source": [ "### 模型評估(Model evalutation)" ] }, { "cell_type": "markdown", "id": "983598de", "metadata": { "id": "983598de" }, "source": [ "* #### 視覺化訓練過程的評估指標 (Visualization)" ] }, { "cell_type": "code", "execution_count": null, "id": "yqbJg154iXLv", "metadata": { "id": "yqbJg154iXLv" }, "outputs": [], "source": [ "plt.figure(figsize=(15, 4))\n", "plt.subplot(1, 2, 1)\n", "plt.plot(range(len(train_loss_log)), train_loss_log, label='train_loss')\n", "plt.plot(range(len(val_loss_log)), val_loss_log, label='valid_loss')\n", "plt.xlabel('Epochs')\n", "plt.ylabel('Binary crossentropy')\n", "plt.legend()\n", "\n", "plt.subplot(1, 2, 2)\n", "plt.plot(range(len(train_acc_log)), train_acc_log, label='train_acc')\n", "plt.plot(range(len(val_acc_log)), val_acc_log, label='valid_acc')\n", "plt.xlabel('Epochs')\n", "plt.ylabel('Accuracy')\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "faf85f9f", "metadata": { "id": "faf85f9f" }, "source": [ "* ### 模型預測(Model predictions)" ] }, { "cell_type": "code", "execution_count": null, "id": "ed795618", "metadata": { "id": "ed795618" }, "outputs": [], "source": [ "# predict all test data\n", "model.eval()\n", "\n", "y_pred = []\n", "with torch.no_grad():\n", " for x, _ in tqdm(valid_loader):\n", " x = x.to(device)\n", " y_pred.append(model(x))\n", "\n", "y_pred = torch.cat(y_pred).cpu()\n", "y_pred_class = y_pred.argmax(dim=1)\n", "\n", "print(y_pred_class, y_pred_class.shape)" ] }, { "cell_type": "markdown", "id": "702cec23", "metadata": { "id": "702cec23" }, "source": [ "### Quiz\n", "請試著利用 Data/pkgo_train.csv 做多元分類問題,預測五個種類的 pokemon,並調整模型(網路層數、神經元數目)得到更高的準確度。\n", "\n", "pkgo_train 為 Pokemon go 中 pokemon 出沒狀態描述的資料集,欄位說明如下:\n", "* latitude, longitude: 位置(經緯度)\n", "* local.xx: 時間(擷取格式 mm-dd'T'hh-mm-ss.ms'Z')\n", "* appearedTimeOfDay: night, evening, afternoon, morning 四種時段\n", "* appearedHour/Minute: 當地小時/分鐘\n", "* appearedDayOfWeek: Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday\n", "* appearedDay/Month: 當地日期/月份\n", "* terrainType: 地形種類\n", "* closeToWater: 是否接近水源(100 公尺內)\n", "* city: 城市\n", "* continent: 洲別\n", "* weather: 天氣種類(Foggy Clear, PartlyCloudy, MostlyCloudy, Overcast, Rain, BreezyandOvercast, LightRain, Drizzle, BreezyandPartlyCloudy, HeavyRain, BreezyandMostlyCloudy, Breezy, Windy, WindyandFoggy, Humid, Dry, WindyandPartlyCloudy, DryandMostlyCloudy, DryandPartlyCloudy, DrizzleandBreezy, LightRainandBreezy, HumidandPartlyCloudy, HumidandOvercast, RainandWindy)\n", "* temperature: 攝氏溫度\n", "* windSpeed: 風速(km/h)\n", "* windBearing: 風向\n", "* pressure: 氣壓\n", "* sunrise/sunsetXX: 日出日落相關訊息\n", "* population_density: 人口密集度\n", "* urban/suburban/midurban/rural: 出沒過的地點城市程度(人口密集度小於 200 為 rural, 大於等於 200 且小於 400 為 midUrban, 大於等於400 且小於 800 為 subUrban, 大於 800 為 urban)\n", "* gymDistanceKm: 最近道館的距離\n", "* gymInxx: 道館是否在指定距離內\n", "* cooc1-cooc151: 是否有其他 pokemon 在 24 小時內,出現在周圍 100 公尺之內\n", "* category: 種類" ] }, { "cell_type": "code", "execution_count": null, "id": "523db00d", "metadata": { "id": "523db00d" }, "outputs": [], "source": [] } ], "metadata": { "accelerator": "GPU", "colab": { "provenance": [] }, "gpuClass": "standard", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.12" } }, "nbformat": 4, "nbformat_minor": 5 }