{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "Yx_wytHx0fDz" }, "source": [ "## $\\large{Quiz}$" ] }, { "cell_type": "markdown", "metadata": { "id": "YceKSrFj0fD4" }, "source": [ "Read every image under Path and organize the images into the DataFrame format below,
\n", "matching the my_dataframe table and the my_imageshow figure shown below." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "u6ftyJIt0fD4" }, "outputs": [], "source": [ "# download Simpson.zip\n", "!wget -q https://github.com/TA-aiacademy/course_3.0/releases/download/CVCNN_Data/Simpson.zip\n", "# unzip file\n", "!unzip -q Simpson.zip" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "k_bkgCqc0fD5" }, "outputs": [], "source": [ "Path = \"./Simpson\"" ] }, { "cell_type": "markdown", "metadata": { "id": "pvipLWYK0fD6" }, "source": [ "$\\begin{array}{c | c} class & label\\\\\\hline\n", " 0 & abraham\\_grampa\\_simpson\\\\\n", " 1 & agnes\\_skinner\\\\\n", " 2 & apu\\_nahasapeemapetilon\\\\\n", " 3 & bart\\_simpson\\\\\n", " 4 & carl\\_carlson \\end{array}$" ] }, { "cell_type": "markdown", "metadata": { "id": "OY4LUZf70fD6" }, "source": [ "## my_dataframe" ] }, { "cell_type": "markdown", "metadata": { "id": "EO5KBohu0fD6" }, "source": [ "" ] }, { "cell_type": "markdown", "metadata": { "id": "fiBDMyR50fD7" }, "source": [ "## my_imageshow" ] }, { "cell_type": "markdown", "metadata": { "id": "wajRHowZ0fD7" }, "source": [ "![](https://hackmd.io/_uploads/SkHqW3PMa.png)\n" ] }, { "cell_type": "markdown", "metadata": { "id": "rqXXrxU20fD7" }, "source": [ "# ANS\n", "First, take a look at the structure of the Simpson folder.
\n", "It is laid out as follows:\n", ">Simpson\n", ">>character name\n", ">>>image file name\n", "\n", "The goal of this quiz is to write a program that automatically builds a table containing each file's path and its corresponding class.
" ] }, { "cell_type": "markdown", "metadata": { "id": "JykA2rs70fD8" }, "source": [ "This quiz breaks down into four main steps:\n", "1. [Folder handling](#Folder-handling)\n", "2. [List files and their labels](#List-all-file-paths-and-their-character-names)\n", "3. [DataFrame processing](#DataFrame-processing)\n", "4. [Plotting](#Plotting)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "7XmgYYiY0fD8" }, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "import os\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "metadata": { "id": "Fz6VPuNC0fD8" }, "source": [ "### Folder handling\n", "When working with a directory tree, the first step is usually to find out which subfolders it contains.
\n", "In Python, os.listdir(folder_path) returns the names of the files and subfolders directly under the given path.
\n", "Other common approaches are discussed in this thread:
\n", "https://stackoverflow.com/questions/3207219/how-do-i-list-all-files-of-a-directory" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "7Z1i5T2D0fD8" }, "outputs": [], "source": [ "folder = [os.path.join(Path, each) for each in os.listdir(Path)]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "E4MtXP980fD9" }, "outputs": [], "source": [ "folder" ] }, { "cell_type": "markdown", "metadata": { "id": "ZuSniRiS0fD9" }, "source": [ "Now that we have the folder path for each character, the next step is to collect every image along with its character name." ] }, { "cell_type": "markdown", "metadata": { "id": "omCox68Z0fD9" }, "source": [ "### List all file paths and their character names" ] }, { "cell_type": "markdown", "metadata": { "id": "yTW-oJgA0fD9" }, "source": [ "### Way1 -- os.listdir()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ge33gX6j0fD9" }, "outputs": [], "source": [ "datalist = []\n", "datalabel = []\n", "for root in folder:\n", "    for file in os.listdir(root):\n", "        # Jupyter Notebook can create auxiliary files such as .ipynb_checkpoints;\n", "        # skip them so they do not end up in our data\n", "        if file.find('.ipynb_checkpoints') == -1:\n", "            datalist.append(os.path.join(root, file))\n", "            datalabel.append(os.path.basename(root))" ] }, { "cell_type": "markdown", "metadata": { "id": "MSlmi_tu0fD-" }, "source": [ "### Way2 -- os.walk()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "32-g_YVZ0fD-" }, "outputs": [], "source": [ "datalist = []\n", "datalabel = []\n", "for roots, dirs, files in os.walk(Path):\n", "    if roots.find(\".ipynb_checkpoints\") == -1:\n", "        for file in files:\n", "            datalist.append(os.path.join(roots, file))\n", "            # the label is the name of the folder currently being walked\n", "            datalabel.append(os.path.basename(roots))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "7437dhX00fD-" }, "outputs": [], "source": [ "print(*np.array((datalist, datalabel)).T, sep='\\n')" ] }, { "cell_type": "markdown", "metadata": { "id": "WCzpchlp0fD-" }, "source": [ "### DataFrame processing" ] }, { "cell_type": "markdown", "metadata": { "id": "tt8M-Gmp0fD_" }, 
"source": [ "With the paths and names in hand, all that remains is to assemble them into a DataFrame." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "sS6RSLyi0fD_" }, "outputs": [], "source": [ "data = {\"id_code\": datalist,\n", "        \"label\": datalabel}" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "1YApievO0fD_" }, "outputs": [], "source": [ "my_data = pd.DataFrame(data)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "gunEUsl-0fD_" }, "outputs": [], "source": [ "classlabel = {'abraham_grampa_simpson': 0,\n", "              'agnes_skinner': 1,\n", "              'apu_nahasapeemapetilon': 2,\n", "              'bart_simpson': 3,\n", "              'carl_carlson': 4}" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "GB0h_GdQ0fD_" }, "outputs": [], "source": [ "my_data[\"label\"] = my_data[\"label\"].map(classlabel)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "HoSCYfXz0fD_" }, "outputs": [], "source": [ "my_data = my_data.sort_values('label')" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "dgBBExSV0fD_" }, "outputs": [], "source": [ "my_data = my_data.reset_index(drop=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "t4HzO7vn0fEA" }, "outputs": [], "source": [ "my_data" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "XOagjJTw0fEA" }, "outputs": [], "source": [ "my_data.to_csv(\"quiz1_ans_simspon.csv\", index=False)" ] }, { "cell_type": "markdown", "metadata": { "id": "25SwBV5V0fEA" }, "source": [ "### Plotting" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "n4TaayFw0fEA" }, "outputs": [], "source": [ "plt.figure(figsize=(16, 20))\n", "for i in range(20):\n", "    ax = plt.subplot(5, 4, i+1)\n", "    plt.title(\"class {}\".format(my_data.loc[i, \"label\"]), fontsize=20)\n", "    plt.xticks([])\n", "    plt.yticks([])\n", "    ax.imshow(plt.imread(my_data.loc[i, \"id_code\"]))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": 
"ZPp7z3-R0fEA" }, "outputs": [], "source": [] } ], "metadata": { "colab": { "provenance": [] }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.12" } }, "nbformat": 4, "nbformat_minor": 0 }