{ "cells": [ { "cell_type": "markdown", "id": "e51223b0", "metadata": { "id": "e51223b0" }, "source": [ "# **模型調校(Model Tuning)**\n", "此份程式碼會提供針對某資料集的模型調校策略,以及比較其超參數的選擇。\n", "\n", "## 本章節內容大綱\n", "* ### [損失函數(Loss function)](#LossFunction)\n", "* ### [激活函數(Activation function)](#ActivationFunction)\n", "* ### [優化器(Optimizer)](#Optimizer)\n", "* ### [學習率(Learning rate)](#LearningRate)\n", "* ### [模型架構(Model architecture)](#ModelArchitecture)" ] }, { "cell_type": "markdown", "id": "33255f75", "metadata": { "id": "33255f75" }, "source": [ "## 匯入套件" ] }, { "cell_type": "code", "execution_count": null, "id": "bac12f0a", "metadata": { "id": "bac12f0a" }, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "from tqdm.auto import tqdm\n", "\n", "# PyTorch 相關套件\n", "import torch\n", "import torch.nn as nn\n", "import torch.nn.functional as F" ] }, { "cell_type": "markdown", "id": "495c8b24", "metadata": { "id": "495c8b24" }, "source": [ "## 創建資料集/載入資料集(Dataset Creating / Loading)" ] }, { "cell_type": "code", "source": [ "# 上傳資料\n", "!wget -q https://github.com/TA-aiacademy/course_3.0/releases/download/DL/Data_part3.zip\n", "!unzip -q Data_part3.zip" ], "metadata": { "id": "Ny4jdxCvv7XU" }, "id": "Ny4jdxCvv7XU", "execution_count": null, "outputs": [] }, { "cell_type": "code", "execution_count": null, "id": "899342e7", "metadata": { "id": "899342e7" }, "outputs": [], "source": [ "train_df = pd.read_csv('./Data/News_train.csv')\n", "test_df = pd.read_csv('./Data/News_test.csv')" ] }, { "cell_type": "code", "execution_count": null, "id": "8be123e5", "metadata": { "id": "8be123e5" }, "outputs": [], "source": [ "train_df.head()" ] }, { "cell_type": "markdown", "id": "e37d2723", "metadata": { "id": "e37d2723" }, "source": [ "* #### 新聞文章資料集\n", "訓練集,測試集分別為 7728,1907 筆,4081 種常用字詞,若在同一篇新聞中出現該字詞為 1,若否則為 0,y_category 標記文章類別,共 11 種類別。" ] }, { "cell_type": "code", "execution_count": null, "id": "7aace4ec", "metadata": { "id": "7aace4ec" }, "outputs": [], "source": [ "X_df = train_df.iloc[:, :-1].values\n", "y_df = train_df.y_category.values" ] }, { "cell_type": "code", "execution_count": null, "id": "2fdea65d", "metadata": { "id": "2fdea65d" }, "outputs": [], "source": [ "X_test = test_df.iloc[:, :-1].values\n", "y_test = test_df.y_category.values" ] }, { "cell_type": "markdown", "id": "a5ca0f5f", "metadata": { "id": "a5ca0f5f" }, "source": [ "## 資料前處理(Data Preprocessing)" ] }, { "cell_type": "code", "execution_count": null, "id": "2ead5bdb", "metadata": { "id": "2ead5bdb" }, "outputs": [], "source": [ "from sklearn.preprocessing import StandardScaler, MinMaxScaler\n", "# Feature scaling\n", "sc = StandardScaler()\n", "X_scale = sc.fit_transform(X_df, y_df)\n", "X_test_scale = sc.transform(X_test)" ] }, { "cell_type": "code", "execution_count": null, "id": "b00c5e3a", "metadata": { "id": "b00c5e3a" }, "outputs": [], "source": [ "# train, valid/test dataset split\n", "from sklearn.model_selection import train_test_split\n", "X_train, X_valid, y_train, y_valid = train_test_split(X_scale, y_df,\n", " test_size=0.2,\n", " random_state=5566,\n", " stratify=y_df)" ] }, { "cell_type": "code", "execution_count": null, "id": "6be0c87c", "metadata": { "id": "6be0c87c" }, "outputs": [], "source": [ "print(f'X_train shape: {X_train.shape}')\n", "print(f'X_valid shape: {X_valid.shape}')\n", "print(f'y_train shape: {y_train.shape}')\n", "print(f'y_valid shape: {y_valid.shape}')" ] }, { "cell_type": "code", "source": [ "# build dataset and dataloader\n", "train_ds = torch.utils.data.TensorDataset(torch.tensor(X_train, dtype=torch.float32),\n", " torch.tensor(y_train, dtype=torch.long))\n", "valid_ds = torch.utils.data.TensorDataset(torch.tensor(X_valid, dtype=torch.float32),\n", " torch.tensor(y_valid, dtype=torch.long))\n", "\n", "BATCH_SIZE = 64\n", "train_loader = torch.utils.data.DataLoader(train_ds, batch_size=BATCH_SIZE, shuffle=True)\n", "valid_loader = torch.utils.data.DataLoader(valid_ds, batch_size=BATCH_SIZE)" ], "metadata": { "id": "5jS0tCO_36E-" }, "id": "5jS0tCO_36E-", "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "id": "02cf5671", "metadata": { "id": "02cf5671" }, "source": [ "## 模型建置(Model Building)" ] }, { "cell_type": "code", "source": [ "NUM_CLASS = 11" ], "metadata": { "id": "VL3_42mD4eLI" }, "id": "VL3_42mD4eLI", "execution_count": null, "outputs": [] }, { "cell_type": "code", "execution_count": null, "id": "fc6e3e67", "metadata": { "id": "fc6e3e67" }, "outputs": [], "source": [ "torch.manual_seed(5566)\n", "\n", "def build_model(input_shape, num_class):\n", " model = nn.Sequential(\n", " nn.Linear(input_shape, 16),\n", " nn.Sigmoid(),\n", " nn.Linear(16, 16),\n", " nn.Sigmoid(),\n", " nn.Linear(16, num_class),\n", " )\n", " return model" ] }, { "cell_type": "code", "execution_count": null, "id": "5a41671f", "metadata": { "id": "5a41671f" }, "outputs": [], "source": [ "model = build_model(X_train.shape[1], NUM_CLASS)\n", "print(model)" ] }, { "cell_type": "markdown", "id": "fce8b56e", "metadata": { "id": "fce8b56e" }, "source": [ "## 模型訓練(Model Training)" ] }, { "cell_type": "code", "source": [ "optimizer = torch.optim.RMSprop(model.parameters(), lr=0.001)\n", "loss_fn = nn.CrossEntropyLoss() # 多元分類損失函數\n", "\n", "device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')\n", "print(f'device: {device}')\n", "model = model.to(device)" ], "metadata": { "id": "CD6tVSdq4woX" }, "id": "CD6tVSdq4woX", "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "def train_epoch(model, optimizer, loss_fn, train_dataloader, val_dataloader):\n", " # 訓練一輪\n", " model.train()\n", " total_train_loss = 0\n", " total_train_correct = 0\n", " for x, y in tqdm(train_dataloader, leave=False):\n", " x, y = x.to(device), y.to(device) # 將資料移至GPU\n", " y_pred = model(x) # 計算預測值\n", " if type(loss_fn) != nn.CrossEntropyLoss:\n", " y_pred = F.softmax(y_pred, dim=1)\n", " y = F.one_hot(y, num_classes=NUM_CLASS).float() # one-hot encoding\n", " loss = loss_fn(y_pred, y) # 計算誤差\n", " optimizer.zero_grad() # 梯度歸零\n", " loss.backward() # 反向傳播計算梯度\n", " optimizer.step() # 更新模型參數\n", "\n", " total_train_loss += loss.item()\n", " if type(loss_fn) != nn.CrossEntropyLoss:\n", " y = y.argmax(dim=1).long()\n", " # 利用argmax計算最大值是第n個類別,與解答比對是否相同\n", " total_train_correct += ((y_pred.argmax(dim=1) == y).sum().item())\n", "\n", " # 驗證一輪\n", " model.eval()\n", " total_val_loss = 0\n", " total_val_correct = 0\n", " # 關閉梯度計算以加速\n", " with torch.no_grad():\n", " for x, y in val_dataloader:\n", " x, y = x.to(device), y.to(device)\n", " y_pred = model(x)\n", " if type(loss_fn) != nn.CrossEntropyLoss:\n", " y_pred = F.softmax(y_pred, dim=1)\n", " y = F.one_hot(y, num_classes=NUM_CLASS).float() # one-hot encoding\n", " loss = loss_fn(y_pred, y)\n", " total_val_loss += loss.item()\n", " # 利用argmax計算最大值是第n個類別,與解答比對是否相同\n", " if type(loss_fn) != nn.CrossEntropyLoss:\n", " y = y.argmax(dim=1).long()\n", " total_val_correct += ((y_pred.argmax(dim=1) == y).sum().item())\n", "\n", " avg_train_loss = total_train_loss / len(train_dataloader)\n", " avg_train_acc = total_train_correct / len(train_dataloader.dataset)\n", " avg_val_loss = total_val_loss / len(val_dataloader)\n", " avg_val_acc = total_val_correct / len(val_dataloader.dataset)\n", "\n", " return avg_train_loss, avg_val_loss, avg_train_acc, avg_val_acc" ], "metadata": { "id": "CNAM22hQ5C4T" }, "id": "CNAM22hQ5C4T", "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "def run(model, optimizer, loss_fn, train_loader, valid_loader, verbose=1):\n", " train_loss_log = []\n", " val_loss_log = []\n", " train_acc_log = []\n", " val_acc_log = []\n", " for epoch in tqdm(range(20)):\n", " avg_train_loss, avg_val_loss, avg_train_acc, avg_val_acc = train_epoch(model, optimizer, loss_fn, train_loader, valid_loader)\n", " train_loss_log.append(avg_train_loss)\n", " val_loss_log.append(avg_val_loss)\n", " train_acc_log.append(avg_train_acc)\n", " val_acc_log.append(avg_val_acc)\n", " if verbose == 1:\n", " print(f'Epoch: {epoch}, Train Loss: {avg_train_loss:.3f}, Val Loss: {avg_val_loss:.3f} | Train Acc: {avg_train_acc:.3f}, Val Acc: {avg_val_acc:.3f}')\n", " return train_loss_log, train_acc_log, val_loss_log, val_acc_log" ], "metadata": { "id": "YNKeBk_n5Fn9" }, "id": "YNKeBk_n5Fn9", "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "train_loss_log, train_acc_log, val_loss_log, val_acc_log = run(model, optimizer, loss_fn, train_loader, valid_loader)" ], "metadata": { "id": "gFOsC-0M8Auw" }, "id": "gFOsC-0M8Auw", "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "id": "48403507", "metadata": { "id": "48403507" }, "source": [ "## 模型評估(Model Evaluation)" ] }, { "cell_type": "code", "execution_count": null, "id": "16a070b5", "metadata": { "id": "16a070b5" }, "outputs": [], "source": [ "plt.figure(figsize=(15, 4))\n", "plt.subplot(1, 2, 1)\n", "plt.plot(range(len(train_loss_log)), train_loss_log, label='train_loss')\n", "plt.plot(range(len(val_loss_log)), val_loss_log, label='valid_loss')\n", "plt.xlabel('Epochs')\n", "plt.ylabel('Binary crossentropy')\n", "plt.legend()\n", "\n", "plt.subplot(1, 2, 2)\n", "plt.plot(range(len(train_acc_log)), train_acc_log, label='train_acc')\n", "plt.plot(range(len(val_acc_log)), val_acc_log, label='valid_acc')\n", "plt.xlabel('Epochs')\n", "plt.ylabel('Accuracy')\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "98120751", "metadata": { "id": "98120751" }, "source": [ "## 模型調校" ] }, { "cell_type": "markdown", "id": "0316d055", "metadata": { "id": "0316d055" }, "source": [ "![](https://hackmd.io/_uploads/SyE5RYIbT.png)\n" ] }, { "cell_type": "markdown", "id": "ed21b1b9", "metadata": { "id": "ed21b1b9" }, "source": [ "\n", "* ## 損失函數(Loss function)\n", "torch.nn Loss function: https://pytorch.org/docs/stable/nn.html#loss-functions" ] }, { "cell_type": "code", "execution_count": null, "id": "a4ab32b2", "metadata": { "id": "a4ab32b2" }, "outputs": [], "source": [ "# 以下放置要比較的 loss function\n", "loss_funcs = [\n", " nn.MSELoss(),\n", " nn.CrossEntropyLoss(),\n", " nn.L1Loss(), # mean absolute error\n", "]\n", "\n", "# 建立兩個 list 記錄選用不同 loss function 的訓練結果\n", "all_loss, all_acc = [], []\n", "\n", "# 迭代不同的 loss function 去訓練模型\n", "for loss_fn in loss_funcs:\n", " print(f'Running model, loss = {loss_fn}')\n", "\n", " # 確保每次都是訓練新的模型,而不是接續上一輪的模型\n", " model = build_model(X_train.shape[1], NUM_CLASS)\n", "\n", " optimizer = torch.optim.RMSprop(model.parameters(), lr=0.001)\n", " model = model.to(device)\n", "\n", " # # 確保每次都設定一樣的參數\n", " history = run(model, optimizer, loss_fn, train_loader, valid_loader, verbose=0)\n", "\n", " # 將訓練過程記錄下來\n", " all_loss.append(history[0])\n", " all_acc.append(history[1])\n", "print('----------------- training done! -----------------')" ] }, { "cell_type": "code", "execution_count": null, "id": "e5ddd8df", "metadata": { "id": "e5ddd8df" }, "outputs": [], "source": [ "# 視覺化訓練過程\n", "plt.figure(figsize=(15, 7))\n", "\n", "# 繪製 Training loss\n", "plt.subplot(121)\n", "for k in range(len(loss_funcs)):\n", " plt.plot(range(len(all_loss[k])), all_loss[k], label=loss_funcs[k])\n", "plt.title('Loss')\n", "\n", "# 繪製 Training accuracy\n", "plt.subplot(122)\n", "for k in range(len(loss_funcs)):\n", " plt.plot(range(len(all_acc[k])), all_acc[k], label=loss_funcs[k])\n", "plt.title('Accuracy')\n", "\n", "plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=2.)\n", "plt.ylim((0, 1))\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "48d4b3d2", "metadata": { "id": "48d4b3d2" }, "source": [ "---\n", "![](https://hackmd.io/_uploads/BknsRtLZa.png)\n", "\n", "---" ] }, { "cell_type": "markdown", "id": "51977b51", "metadata": { "id": "51977b51" }, "source": [ "\n", "* ## 激活函數(Activation function)\n", "torch.nn: https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity" ] }, { "cell_type": "code", "execution_count": null, "id": "e0dc4d8c", "metadata": { "id": "e0dc4d8c" }, "outputs": [], "source": [ "def build_model_activation(input_shape, num_class, activation):\n", " torch.manual_seed(5566)\n", " # 重新建構一個可以更改 Activation 的模型\n", " model = nn.Sequential(\n", " nn.Linear(input_shape, 16),\n", " activation(),\n", " nn.Linear(16, 16),\n", " activation(),\n", " nn.Linear(16, num_class),\n", " )\n", " return model" ] }, { "cell_type": "code", "execution_count": null, "id": "e95785db", "metadata": { "id": "e95785db" }, "outputs": [], "source": [ "# 以下放置要比較的 activation function\n", "activation_funcs = [\n", " nn.Identity,\n", " nn.Sigmoid,\n", " nn.Tanh,\n", " nn.ReLU,\n", " nn.Softplus,\n", " nn.LeakyReLU,\n", " nn.Mish,\n", "]\n", "\n", "# 建立兩個 list 記錄選用不同 activation function 的訓練結果\n", "all_loss, all_acc = [], []\n", "\n", "# 迭代不同的 activation function 去訓練模型\n", "for activation_f in activation_funcs:\n", " print(f'Running model, activation = {activation_f}')\n", "\n", " # 確保每次都是訓練新的模型,而不是接續上一輪的模型\n", " model = build_model_activation(X_train.shape[1],\n", " NUM_CLASS,\n", " activation_f)\n", " model = model.to(device)\n", " optimizer = torch.optim.RMSprop(model.parameters(), lr=0.001)\n", " loss_fn = nn.CrossEntropyLoss() # 多元分類損失函數\n", "\n", " history = run(model, optimizer, loss_fn, train_loader, valid_loader, verbose=0)\n", "\n", " # 將訓練過程記錄下來\n", " all_loss.append(history[0])\n", " all_acc.append(history[1])\n", "print('----------------- training done! -----------------')" ] }, { "cell_type": "code", "execution_count": null, "id": "b428c947", "metadata": { "id": "b428c947" }, "outputs": [], "source": [ "# 視覺化訓練過程\n", "plt.figure(figsize=(15, 7))\n", "\n", "# 繪製 Training loss\n", "plt.subplot(121)\n", "for k in range(len(activation_funcs)):\n", " plt.plot(range(len(all_loss[k])), all_loss[k], label=activation_funcs[k])\n", "plt.title('Loss')\n", "\n", "# 繪製 Training accuracy\n", "plt.subplot(122)\n", "for k in range(len(activation_funcs)):\n", " plt.plot(range(len(all_acc[k])), all_acc[k], label=activation_funcs[k])\n", "plt.title('Accuracy')\n", "\n", "plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)\n", "\n", "plt.ylim((0, 1))\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "id": "6a37ff70", "metadata": { "id": "6a37ff70" }, "outputs": [], "source": [ "# 視覺化訓練過程\n", "plt.figure(figsize=(15, 4))\n", "\n", "# 繪製 Training loss\n", "plt.subplot(121)\n", "for k in range(len(activation_funcs)):\n", " plt.plot(range(len(all_loss[k])), all_loss[k], label=activation_funcs[k])\n", "plt.title('Loss')\n", "plt.ylim((0, 0.3))\n", "\n", "# 繪製 Training accuracy\n", "plt.subplot(122)\n", "for k in range(len(activation_funcs)):\n", " plt.plot(range(len(all_acc[k])), all_acc[k], label=activation_funcs[k])\n", "plt.title('Accuracy')\n", "\n", "plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)\n", "\n", "plt.ylim((0.95, 0.975))\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "4b3f2f31", "metadata": { "id": "4b3f2f31" }, "source": [ "---\n", "![](https://hackmd.io/_uploads/BJ1pAY8bT.png)\n", "\n", "---" ] }, { "cell_type": "markdown", "id": "aa781738", "metadata": { "id": "aa781738" }, "source": [ "\n", "* ## 優化器(Optimizer)\n", "torch.optim: https://pytorch.org/docs/stable/optim.html#algorithms" ] }, { "cell_type": "code", "execution_count": null, "id": "0d1d383c", "metadata": { "id": "0d1d383c" }, "outputs": [], "source": [ "# 以下放置要比較的 optimizer\n", "optimizer_funcs = [\n", " torch.optim.SGD,\n", " torch.optim.RMSprop,\n", " torch.optim.Adam,\n", " torch.optim.NAdam,\n", "]\n", "\n", "# 建立兩個 list 記錄選用不同 optimizer 的訓練結果\n", "all_loss, all_acc = [], []\n", "\n", "# 迭代不同的 optimizer 去訓練模型\n", "for optimizer_f in optimizer_funcs:\n", " print(f'Running model, optimizer = {optimizer_f}')\n", "\n", " # 確保每次都是訓練新的模型,而不是接續上一輪的模型\n", " model = build_model_activation(X_train.shape[1],\n", " NUM_CLASS,\n", " nn.Tanh)\n", " model = model.to(device)\n", " optimizer = optimizer_f(model.parameters())\n", " loss_fn = nn.CrossEntropyLoss() # 多元分類損失函數\n", "\n", " # 確保每次都設定一樣的參數\n", " history = run(model, optimizer, loss_fn, train_loader, valid_loader, verbose=0)\n", " # 將訓練過程記錄下來\n", " all_loss.append(history[0])\n", " all_acc.append(history[1])\n", "\n", "print('----------------- training done! -----------------')" ] }, { "cell_type": "code", "execution_count": null, "id": "5b8f7cf3", "metadata": { "id": "5b8f7cf3" }, "outputs": [], "source": [ "# 視覺化訓練過程\n", "plt.figure(figsize=(15, 7))\n", "\n", "# 繪製 Training loss\n", "plt.subplot(121)\n", "for k in range(len(optimizer_funcs)):\n", " plt.plot(range(len(all_loss[k])), all_loss[k], label=optimizer_funcs[k])\n", "plt.title('Loss')\n", "\n", "# 繪製 Training accuracy\n", "plt.subplot(122)\n", "for k in range(len(optimizer_funcs)):\n", " plt.plot(range(len(all_acc[k])), all_acc[k], label=optimizer_funcs[k])\n", "plt.title('Accuracy')\n", "\n", "plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=2.)\n", "plt.ylim((0, 1))\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "21e99f51", "metadata": { "id": "21e99f51" }, "source": [ "\n", "* ## 學習率(Learning rate)" ] }, { "cell_type": "code", "execution_count": null, "id": "f2cff10b", "metadata": { "id": "f2cff10b" }, "outputs": [], "source": [ "# 以下放置要比較的 learning rate\n", "lr_list = [0.1, 0.01, 0.001, 0.0001, 0.00001]\n", "\n", "# 建立兩個 list 記錄選用不同 learning rate 的訓練結果\n", "all_loss, all_acc = [], []\n", "\n", "# 迭代不同的 learning rate 去訓練模型\n", "for lr in lr_list:\n", " print(f'Running model, learning rate = {lr}')\n", "\n", " # 確保每次都是訓練新的模型,而不是接續上一輪的模型\n", " model = build_model_activation(X_train.shape[1],\n", " NUM_CLASS,\n", " nn.Tanh)\n", " model = model.to(device)\n", " optimizer = optimizer_f(model.parameters(), lr=lr)\n", " loss_fn = nn.CrossEntropyLoss() # 多元分類損失函數\n", "\n", " # 確保每次都設定一樣的參數\n", " history = run(model, optimizer, loss_fn, train_loader, valid_loader, verbose=0)\n", " # 將訓練過程記錄下來\n", " all_loss.append(history[0])\n", " all_acc.append(history[1])\n", "print('----------------- training done! -----------------')" ] }, { "cell_type": "code", "execution_count": null, "id": "288def33", "metadata": { "id": "288def33" }, "outputs": [], "source": [ "# 視覺化訓練過程\n", "plt.figure(figsize=(15, 7))\n", "\n", "# 繪製 Training loss\n", "plt.subplot(121)\n", "for k in range(len(lr_list)):\n", " plt.plot(range(len(all_loss[k])), all_loss[k], label=lr_list[k])\n", "plt.title('Loss')\n", "\n", "# 繪製 Training accuracy\n", "plt.subplot(122)\n", "for k in range(len(lr_list)):\n", " plt.plot(range(len(all_acc[k])), all_acc[k], label=lr_list[k])\n", "plt.title('Accuracy')\n", "\n", "plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=2.)\n", "plt.ylim((0, 1))\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "615b9c7b", "metadata": { "id": "615b9c7b" }, "source": [ "\n", "* ## 模型架構(Model architecture)" ] }, { "cell_type": "code", "execution_count": null, "id": "5bfb7e23", "metadata": { "id": "5bfb7e23" }, "outputs": [], "source": [ "def build_model_architecture(input_shape, num_class, layer, neuron):\n", " torch.manual_seed(5566)\n", " layers = []\n", " input_dim = input_shape\n", " for i in range(layer):\n", " layers.append(nn.Linear(input_dim, neuron))\n", " layers.append(nn.Tanh())\n", " input_dim = neuron\n", " layers.append(nn.Linear(input_dim, num_class))\n", " model = nn.Sequential(*layers)\n", " return model" ] }, { "cell_type": "code", "execution_count": null, "id": "bb933a0d", "metadata": { "id": "bb933a0d" }, "outputs": [], "source": [ "# 以下放置要比較的 layers/ neurons\n", "layers_num = [1, 2, 3]\n", "neurons_num = [16, 32, 64]\n", "\n", "batch_size = 64\n", "epochs = 20\n", "\n", "# 建立兩個 list 記錄選用不同 layers/ neurons 的訓練結果\n", "all_loss, all_acc = [], []\n", "\n", "# 迭代不同的 layers/ neurons 去訓練模型\n", "for layer in layers_num:\n", " for neuron in neurons_num:\n", " print(f'Running model, (layer, neuron) = {(layer, neuron)}')\n", "\n", " # 確保每次都是訓練新的模型,而不是接續上一輪的模型\n", " model = build_model_architecture(X_train.shape[1],\n", " NUM_CLASS,\n", " layer,\n", " neuron)\n", " model = model.to(device)\n", " optimizer = torch.optim.NAdam(model.parameters())\n", "\n", " # 確保每次都設定一樣的參數\n", " history = run(model, optimizer, loss_fn, train_loader, valid_loader, verbose=0)\n", " # 將訓練過程記錄下來\n", " all_loss.append(history[0])\n", " all_acc.append(history[1])\n", "print('----------------- training done! -----------------')" ] }, { "cell_type": "code", "execution_count": null, "id": "74298636", "metadata": { "id": "74298636" }, "outputs": [], "source": [ "layer_neuron = list(zip(sum([[i]*3 for i in layers_num], []), neurons_num*3))\n", "\n", "# 視覺化訓練過程\n", "plt.figure(figsize=(15, 7))\n", "\n", "# 繪製 Training loss\n", "plt.subplot(121)\n", "for k in range(len(layer_neuron)):\n", " plt.plot(range(len(all_loss[k])), all_loss[k], label=layer_neuron[k])\n", "plt.title('Loss')\n", "\n", "# 繪製 Training accuracy\n", "plt.subplot(122)\n", "for k in range(len(layer_neuron)):\n", " plt.plot(range(len(all_acc[k])), all_acc[k], label=layer_neuron[k])\n", "plt.title('Accuracy')\n", "\n", "plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)\n", "\n", "plt.ylim((0, 1))\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "id": "8e5036b3", "metadata": { "id": "8e5036b3" }, "outputs": [], "source": [ "# 視覺化訓練過程\n", "plt.figure(figsize=(15, 4))\n", "\n", "# 繪製 Training loss\n", "plt.subplot(121)\n", "for k in range(len(layer_neuron)):\n", " plt.plot(range(len(all_loss[k])), all_loss[k], label=layer_neuron[k])\n", "plt.title('Loss')\n", "plt.ylim((0.05, 0.1))\n", "\n", "# 繪製 Training accuracy\n", "plt.subplot(122)\n", "for k in range(len(layer_neuron)):\n", " plt.plot(range(len(all_acc[k])), all_acc[k], label=layer_neuron[k])\n", "plt.title('Accuracy')\n", "\n", "plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)\n", "plt.ylim((0.96, 1.))\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "5227388a", "metadata": { "id": "5227388a" }, "source": [ "---\n", "### Quiz\n", "請試著利用 Data/pkgo_train.csv 做多元分類問題,預測五個種類的 pokemon,並調整模型(網路層數、神經元數目、激活函數)以及訓練相關的參數得到更高的準確度。" ] }, { "cell_type": "code", "execution_count": null, "id": "5a3be942", "metadata": { "id": "5a3be942" }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.12" }, "colab": { "provenance": [] }, "accelerator": "GPU", "gpuClass": "standard" }, "nbformat": 4, "nbformat_minor": 5 }