{ "cells": [ { "cell_type": "markdown", "id": "9dec772c", "metadata": {}, "source": [ "# **可解釋 AI (XAI)**\n", "此份程式碼會介紹如何使用套件快速觀看模型的可解釋性結果。" ] }, { "cell_type": "code", "execution_count": null, "id": "c072c238", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "executionInfo": { "elapsed": 2610, "status": "ok", "timestamp": 1677134318954, "user": { "displayName": "吳承澔", "userId": "17428420001093174904" }, "user_tz": -480 }, "id": "6e368687", "outputId": "7629482e-133b-4f64-fb69-f699f4af86da" }, "outputs": [], "source": [ "!pip install tf_keras_vis" ] }, { "cell_type": "markdown", "id": "8dc9032a", "metadata": {}, "source": [ "## 匯入套件" ] }, { "cell_type": "code", "execution_count": null, "id": "4fe68cb2", "metadata": { "executionInfo": { "elapsed": 3618, "status": "ok", "timestamp": 1677134322570, "user": { "displayName": "吳承澔", "userId": "17428420001093174904" }, "user_tz": -480 }, "id": "TIqunw1JXkbR" }, "outputs": [], "source": [ "import numpy as np\n", "import cv2\n", "import tensorflow as tf\n", "\n", "import matplotlib.pyplot as plt\n", "from matplotlib import cm\n", "\n", "# 經典模型相關套件\n", "from tensorflow.keras.applications.vgg16 import VGG16\n", "from tensorflow.keras.applications.vgg16 import preprocess_input\n", "\n", "# 可解釋性模型相關套件\n", "from tf_keras_vis.gradcam import Gradcam\n", "from tf_keras_vis.gradcam_plus_plus import GradcamPlusPlus\n", "from tf_keras_vis.scorecam import Scorecam" ] }, { "cell_type": "markdown", "id": "4854ac46", "metadata": { "id": "301i4iu3Roj0" }, "source": [ "## 載入模型" ] }, { "cell_type": "code", "execution_count": null, "id": "93986cdb", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "executionInfo": { "elapsed": 4294, "status": "ok", "timestamp": 1677134326862, "user": { "displayName": "吳承澔", "userId": "17428420001093174904" }, "user_tz": -480 }, "id": "6f5146fe", "outputId": "7691e70a-a412-4f19-db36-199b2ccc5b24" }, "outputs": [], "source": [ "model = VGG16(include_top=True, # 是否要包含分類器\n", " weights='imagenet', # 在 imagenet 資料集上訓練的權重值\n", " input_shape=(224, 224, 3),\n", " classes=1000) # 分類數目\n", "model.summary()" ] }, { "cell_type": "code", "execution_count": null, "id": "938659f2", "metadata": { "id": "_RfoPr3q1AMB" }, "outputs": [], "source": [ "# upload Data\n", "!wget -q https://github.com/TA-aiacademy/course_3.0/releases/download/CVCNN_Data/CVCNN_part4.zip\n", "!unzip -q CVCNN_part4" ] }, { "cell_type": "code", "execution_count": null, "id": "178ea9d5", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 308 }, "executionInfo": { "elapsed": 3581, "status": "ok", "timestamp": 1677134337222, "user": { "displayName": "吳承澔", "userId": "17428420001093174904" }, "user_tz": -480 }, "id": "oUWPCERTubhK", "outputId": "0bcea8a6-dd60-4934-ce30-40ffda4be878" }, "outputs": [], "source": [ "# Load images\n", "img1 = cv2.imread('Golden_Retriever.jpg')\n", "img2 = cv2.imread('Border_Collie.jpg')\n", "img3 = cv2.imread('White_Wolf.jpg')\n", "\n", "image_titles = ['Golden Retriever', 'Border Collie', 'White Wolf']\n", "images = []\n", "for i, img in enumerate([img1, img2, img3]):\n", " img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)\n", " images.append(cv2.resize(img, (224, 224)))\n", "images = np.array(images) # shape: (3, 224, 224, 3)\n", "\n", "# Preparing input data for VGG16\n", "X = preprocess_input(images)\n", "\n", "# Plot images\n", "f, ax = plt.subplots(nrows=1, ncols=3, figsize=(12, 4))\n", "for i, title in enumerate(image_titles):\n", " ax[i].set_title(title, fontsize=16)\n", " ax[i].imshow(images[i])\n", " ax[i].axis('off')\n", "plt.tight_layout()\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "13e8cf74", "metadata": { "id": "ckzKo2_NWRMq" }, "source": [ "## Model modifier, Score function and Render" ] }, { "cell_type": "markdown", "id": "89ddc2e6", "metadata": { "id": "DaePKbbtWU0F" }, "source": [ "當 softmax 用於模型的最後一層時,可能會阻礙注意力圖像的生成,因此需替換成線性激活函數。\n", "\n", "然後,**必須** 定義 score 函式返回目標分數的分數函數 (對應類別的得分值)。\n", "\n", "- [IMAGENET 1000 Class List](https://deeplearning.cms.waikato.ac.nz/user-guide/class-maps/IMAGENET/)" ] }, { "cell_type": "code", "execution_count": null, "id": "ea745136", "metadata": { "executionInfo": { "elapsed": 5, "status": "ok", "timestamp": 1677134337222, "user": { "displayName": "吳承澔", "userId": "17428420001093174904" }, "user_tz": -480 }, "id": "zJW8weZKW37B" }, "outputs": [], "source": [ "def model_modifier_function(cloned_model):\n", " # Convert last activation from softmax to linear\n", " cloned_model.layers[-1].activation = tf.keras.activations.linear" ] }, { "cell_type": "code", "execution_count": null, "id": "9c81d66d", "metadata": { "executionInfo": { "elapsed": 5, "status": "ok", "timestamp": 1677134337223, "user": { "displayName": "吳承澔", "userId": "17428420001093174904" }, "user_tz": -480 }, "id": "S_E118eKWOGl" }, "outputs": [], "source": [ "def score_function(output):\n", " # output.shape: (samples, classes)\n", " # Golden Retriever: 207\n", " # Border Collie: 232\n", " # White Wolf: 270\n", " return (output[0][207], output[1][232], output[2][270])" ] }, { "cell_type": "code", "execution_count": null, "id": "ae576a02", "metadata": { "executionInfo": { "elapsed": 5, "status": "ok", "timestamp": 1677134337223, "user": { "displayName": "吳承澔", "userId": "17428420001093174904" }, "user_tz": -480 }, "id": "BfrF5hsmf70H" }, "outputs": [], "source": [ "def images_plot(image_titles, images_array, visualize_cam):\n", " f, ax = plt.subplots(nrows=1, ncols=3, figsize=(12, 4))\n", " for i, title in enumerate(image_titles):\n", " heatmap = np.uint8(cm.jet(visualize_cam[i])[..., :3] * 255)\n", " ax[i].set_title(title, fontsize=16)\n", " ax[i].imshow(images_array[i])\n", " ax[i].imshow(heatmap, cmap='jet', alpha=0.5)\n", " ax[i].axis('off')\n", " plt.tight_layout()\n", " plt.show()" ] }, { "cell_type": "markdown", "id": "a268455b", "metadata": { "id": "I5d-FzNHSKoJ" }, "source": [ "## GradCAM" ] }, { "cell_type": "markdown", "id": "820310a6", "metadata": { "id": "x-5N8mR4SYH-" }, "source": [ "GradCAM 是一種將輸入注意力可視化的方法。它不使用模型輸出的梯度,而是使用倒數第二層的輸出(dense layer 之前的 convolution layer)。" ] }, { "cell_type": "code", "execution_count": null, "id": "fec61f1a", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 308 }, "executionInfo": { "elapsed": 9522, "status": "ok", "timestamp": 1677134346740, "user": { "displayName": "吳承澔", "userId": "17428420001093174904" }, "user_tz": -480 }, "id": "lpgkqiiXSLR9", "outputId": "aa870227-7c44-434f-9139-3f52af2d5d74" }, "outputs": [], "source": [ "# Create Gradcam object\n", "gradcam = Gradcam(model, model_modifier=model_modifier_function)\n", "\n", "# Generate heatmap with GradCAM\n", "cam = gradcam(score_function, X, penultimate_layer=-1)\n", "\n", "# Plot results\n", "images_plot(image_titles, images, visualize_cam=cam)" ] }, { "cell_type": "markdown", "id": "c55a0c5a", "metadata": { "id": "HYLlNrVJSY42" }, "source": [ "如結果所示,GradCAM 可以直觀地了解模型的注意力在哪裡,但是會發現,可視化的注意力並沒有完全覆蓋圖片中的類別物件。" ] }, { "cell_type": "markdown", "id": "1eb9112e", "metadata": { "id": "8lJfE1cUSjHG" }, "source": [ "## GradCAM++" ] }, { "cell_type": "markdown", "id": "1003ae42", "metadata": { "id": "-UsWkXZESl3f" }, "source": [ "GradCAM++ 可以為 CNN 模型預測提供更好的視覺解釋。在多個同類別物件時,GradCAM 較無法正確定位或指定位出部分物件,GranCAM++ 改善了這個問題,對於每個像素的梯度上使用加權平均,而不是全局平均。" ] }, { "cell_type": "code", "execution_count": null, "id": "dfae0491", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 308 }, "executionInfo": { "elapsed": 8, "status": "ok", "timestamp": 1677134346741, "user": { "displayName": "吳承澔", "userId": "17428420001093174904" }, "user_tz": -480 }, "id": "76QkU7cnSZeU", "outputId": "73598915-deac-45f4-daca-8c6334b69115" }, "outputs": [], "source": [ "# Create GradCAM++ object\n", "gradcam = GradcamPlusPlus(model, model_modifier=model_modifier_function)\n", "\n", "# Generate heatmap with GradCAM++\n", "cam = gradcam(score_function, X, penultimate_layer=-1)\n", "\n", "# Plot results\n", "images_plot(image_titles, images, cam)" ] }, { "cell_type": "markdown", "id": "6fbf2a3f", "metadata": { "id": "nz7RsaB4TygC" }, "source": [ "## ScoreCAM" ] }, { "cell_type": "markdown", "id": "c19565ee", "metadata": { "id": "YF_IL3mcTyja" }, "source": [ "ScoreCAM 是另一種生成 Class Activation Map 的方法。 該方法的特點是不同於 GradCAM 和 GradCAM,他使用無梯度方法進行,類似從特徵圖中取得遮罩重新與輸入進到模型計算,得到所有特徵圖的權重。" ] }, { "cell_type": "code", "execution_count": null, "id": "066fc683", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 326 }, "executionInfo": { "elapsed": 27497, "status": "ok", "timestamp": 1677134374234, "user": { "displayName": "吳承澔", "userId": "17428420001093174904" }, "user_tz": -480 }, "id": "9vnu7NB3TzCD", "outputId": "45cbcce5-cb58-4da5-dedc-0e0dd8c98fcb" }, "outputs": [], "source": [ "# Create ScoreCAM object\n", "scorecam = Scorecam(model, model_modifier=model_modifier_function)\n", "\n", "# Generate heatmap with ScoreCAM\n", "cam = scorecam(score_function, X, penultimate_layer=-1)\n", "\n", "# Plot results\n", "images_plot(image_titles, images, cam)" ] }, { "cell_type": "markdown", "id": "8fff4aaa", "metadata": { "id": "-eC8hkcmT_Em" }, "source": [ "## Faster-ScoreCAM" ] }, { "cell_type": "markdown", "id": "aae59722", "metadata": { "id": "uAo-aeycT_IF" }, "source": [ "ScoreCAM 比其他方法需要更多的時間來處理。 \n", "比 ScoreCAM 更快速的 Faster-ScorecAM 由 @tabayashi0117 設計。\n", "\n", "https://github.com/tabayashi0117/Score-CAM/blob/master/README.md#faster-score-cam\n", "\n", "Faster-Score-CAM 在 Score-CAM 中增加了“只使用方差較大的 channel 作為 mask 圖像”的處理。 (max_N = -1 是原始的 Score-CAM)。" ] }, { "cell_type": "code", "execution_count": null, "id": "465c3e29", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 326 }, "executionInfo": { "elapsed": 5324, "status": "ok", "timestamp": 1677134379547, "user": { "displayName": "吳承澔", "userId": "17428420001093174904" }, "user_tz": -480 }, "id": "Q7YFDJpqT_Ti", "outputId": "7e14b5c2-ef1b-4d9d-8fe8-8d8cc4a8c919" }, "outputs": [], "source": [ "# Create ScoreCAM object\n", "scorecam = Scorecam(model, model_modifier=model_modifier_function)\n", "\n", "# Generate heatmap with Faster-ScoreCAM\n", "cam = scorecam(score_function, X, penultimate_layer=-1, max_N=10)\n", "\n", "# Plot results\n", "images_plot(image_titles, images, cam)" ] }, { "cell_type": "markdown", "id": "0aa95125", "metadata": { "id": "eypdhJVr8AEw" }, "source": [ "參考資源\n", "- [reference](https://github.com/keisen/tf-keras-vis)" ] } ], "metadata": { "accelerator": "GPU", "colab": { "provenance": [] }, "gpuClass": "standard", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.12" } }, "nbformat": 4, "nbformat_minor": 5 }