{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/TA-aiacademy/course_3.0/blob/main/05_CVCNN/Part5_Object_Detection/01new_YOLOv7_Train.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "fab26a2f",
      "metadata": {
        "id": "fab26a2f"
      },
      "source": [
        "# 下載課程所需檔案 (YOLOv7, Dataset)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "eb76e2e5",
      "metadata": {
        "id": "eb76e2e5"
      },
      "outputs": [],
      "source": [
        "!wget https://github.com/TA-aiacademy/course_3.0/releases/download/CVCNN_Data/yolo_datasets.zip\n",
        "!unzip -q yolo_datasets.zip\n",
        "!wget https://github.com/TA-aiacademy/course_3.0/releases/download/CVCNN_Data/yolov7new.zip\n",
        "!unzip -q yolov7new.zip"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "4d098373",
      "metadata": {
        "id": "4d098373"
      },
      "source": [
        "# YOLOv7 實作\n"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "917fbba2",
      "metadata": {
        "id": "917fbba2"
      },
      "source": [
        "## [貓狗公開資料集](https://public.roboflow.com/object-detection/oxford-pets/2/images/fc82071578629d4d44696cb666898d45)\n",
        "![VnNscKi](https://hackmd.io/_uploads/HkGe-eSO6.png)\n",
        "這個貓狗公開資料集提供了 3680 張影像，為了訓練快一點，這邊只取了 250 張影像來訓練，檔案放在 datasets/pet.zip 中"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "72541f99",
      "metadata": {
        "id": "72541f99"
      },
      "source": [
        "## 1. 準備資料集\n",
        "    改變標籤格式\n",
        "    - 從 Pascal_voc(xml)->Yolo(txt)\n",
        "    - 從 Coco(json)->Yolo(txt)\n",
        "![eNWUWGQ](https://hackmd.io/_uploads/BJPb-gS_6.png)\n"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "670da330",
      "metadata": {
        "id": "670da330"
      },
      "source": [
        "* ### Pascal_voc(xml)->Yolo(txt)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "614f3e06",
      "metadata": {
        "id": "614f3e06"
      },
      "outputs": [],
      "source": [
        "import os\n",
        "import glob\n",
        "import random\n",
        "import shutil\n",
        "import xml.etree.ElementTree as ET\n",
        "\n",
        "#讀取資料夾的圖片名稱\n",
        "def getImagesInDir(dir_path):\n",
        "    img_formats = ['bmp', 'jpg', 'jpeg', 'png', 'tif', 'tiff', 'dng']\n",
        "    image_list = []\n",
        "    for img_format in img_formats:\n",
        "        for filename in glob.glob(dir_path + f'/*.{img_format}'):\n",
        "            image_list.append(filename)\n",
        "\n",
        "    return image_list\n",
        "\n",
        "# 座標轉換\n",
        "def convert(size, box):\n",
        "    dw = 1./(size[0])\n",
        "    dh = 1./(size[1])\n",
        "    x = (box[0] + box[1])/2.0 - 1\n",
        "    y = (box[2] + box[3])/2.0 - 1\n",
        "    w = box[1] - box[0]\n",
        "    h = box[3] - box[2]\n",
        "    x = x*dw\n",
        "    w = w*dw\n",
        "    y = y*dh\n",
        "    h = h*dh\n",
        "    return (x, y, w, h)\n",
        "\n",
        "# 讀取 annotation 檔案內容並轉換\n",
        "def convert_annotation(img_path, ann_dir,\n",
        "                       output_image_path, output_label_path):\n",
        "    basename = os.path.basename(img_path)\n",
        "    basename_no_ext = os.path.splitext(basename)[0]\n",
        "\n",
        "    # copy image\n",
        "    shutil.copyfile(img_path, os.path.join(output_image_path, basename))\n",
        "\n",
        "    in_file = open(ann_dir + '/' + basename_no_ext + '.xml')\n",
        "    out_file = open(output_label_path + basename_no_ext + '.txt', 'w')\n",
        "    tree = ET.parse(in_file)\n",
        "    root = tree.getroot()\n",
        "    size = root.find('size')\n",
        "    w = int(size.find('width').text)\n",
        "    h = int(size.find('height').text)\n",
        "\n",
        "    for obj in root.iter('object'):\n",
        "        difficult = obj.find('difficult').text\n",
        "        cls = obj.find('name').text\n",
        "        if cls not in classes or difficult == '1':\n",
        "            continue\n",
        "        cls_id = classes.index(cls)\n",
        "        xmlbox = obj.find('bndbox')\n",
        "        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text),\n",
        "             float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))\n",
        "        bb = convert((w, h), b)\n",
        "        out_file.write(str(cls_id) + \" \" + \" \".join(\n",
        "                        [str(a) for a in bb]) + '\\n')"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "b9450ef9",
      "metadata": {
        "id": "b9450ef9"
      },
      "outputs": [],
      "source": [
        "name = 'pet'  # 資料集名稱\n",
        "classes = ['cat', 'dog']  # 修改自己的類別\n",
        "train_test_split_rate = 0.2\n",
        "\n",
        "img_dir = 'datasets/pet_voc/JPEGImages/'  # 照片存放路徑\n",
        "ann_dir = 'datasets/pet_voc/Annotations/'  # 標籤存放路徑\n",
        "image_paths = getImagesInDir(img_dir)\n",
        "random.seed(2022)\n",
        "random.shuffle(image_paths)\n",
        "\n",
        "train_image_path = f'datasets/{name}/train/images/'\n",
        "train_label_path = f'datasets/{name}/train/labels/'\n",
        "valid_image_path = f'datasets/{name}/valid/images/'\n",
        "valid_label_path = f'datasets/{name}/valid/labels/'\n",
        "\n",
        "if not os.path.exists(train_image_path):\n",
        "    os.makedirs(train_image_path)\n",
        "if not os.path.exists(train_label_path):\n",
        "    os.makedirs(train_label_path)\n",
        "if not os.path.exists(valid_image_path):\n",
        "    os.makedirs(valid_image_path)\n",
        "if not os.path.exists(valid_label_path):\n",
        "    os.makedirs(valid_label_path)\n",
        "\n",
        "train_test_split = len(image_paths)*train_test_split_rate\n",
        "\n",
        "for i, img_path in enumerate(image_paths):\n",
        "    if i >= train_test_split:\n",
        "        # train\n",
        "        convert_annotation(img_path, ann_dir,\n",
        "                           train_image_path, train_label_path)\n",
        "    else:\n",
        "        # valid\n",
        "        convert_annotation(img_path, ann_dir,\n",
        "                           valid_image_path, valid_label_path)"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "2eb9e785",
      "metadata": {
        "id": "2eb9e785"
      },
      "source": [
        "* ### Coco(json)->Yolo(txt)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "90f15178",
      "metadata": {
        "id": "90f15178"
      },
      "outputs": [],
      "source": [
        "import os\n",
        "import glob\n",
        "import random\n",
        "import json\n",
        "import shutil"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "be5d0c72",
      "metadata": {
        "id": "be5d0c72"
      },
      "outputs": [],
      "source": [
        "def getImagesInDir(dir_path):\n",
        "    img_formats = ['bmp', 'jpg', 'jpeg', 'png', 'tif', 'tiff', 'dng']\n",
        "    image_list = []\n",
        "    for img_format in img_formats:\n",
        "        for filename in glob.glob(dir_path + f'/*.{img_format}'):\n",
        "            image_list.append(filename)\n",
        "\n",
        "    return image_list\n",
        "\n",
        "# 座標轉換\n",
        "def convert(size, box):\n",
        "    dw = 1./(size[0])\n",
        "    dh = 1./(size[1])\n",
        "    x = (box[0] + box[1])/2.0 - 1\n",
        "    y = (box[2] + box[3])/2.0 - 1\n",
        "    w = box[1] - box[0]\n",
        "    h = box[3] - box[2]\n",
        "    x = x*dw\n",
        "    w = w*dw\n",
        "    y = y*dh\n",
        "    h = h*dh\n",
        "    return (x, y, w, h)\n",
        "\n",
        "\n",
        "def convert_annotation(img_path, ann_dir,\n",
        "                       output_image_path, output_label_path):\n",
        "    basename = os.path.basename(img_path)\n",
        "    basename_no_ext = os.path.splitext(basename)[0]\n",
        "\n",
        "    # copy image\n",
        "    shutil.copyfile(img_path, os.path.join(output_image_path, basename))\n",
        "\n",
        "    # get json\n",
        "    in_file = json.load(open(ann_dir + '/' + basename_no_ext + '.json', encoding=\"utf-8\"))\n",
        "    out_file = open(output_label_path + basename_no_ext + '.txt', 'w')\n",
        "\n",
        "    bboxes = []\n",
        "    labels = []\n",
        "    for shape in in_file[\"shapes\"]:\n",
        "        class_name = shape[\"label\"]\n",
        "        cls_id = class_names.index(class_name)\n",
        "        (xmin, ymin), (xmax, ymax) = shape[\"points\"]\n",
        "        xmin, xmax = sorted([xmin, xmax])\n",
        "        ymin, ymax = sorted([ymin, ymax])\n",
        "        b = (float(xmin), float(xmax), float(ymin), float(ymax))\n",
        "        w = int(in_file[\"imageWidth\"])\n",
        "        h = int(in_file[\"imageHeight\"])\n",
        "        bb = convert((w, h), b)\n",
        "        out_file.write(str(cls_id) + \" \" + \" \".join(\n",
        "                        [str(a) for a in bb]) + '\\n')"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "6e3c346f",
      "metadata": {
        "id": "6e3c346f"
      },
      "outputs": [],
      "source": [
        "name = 'pet'  # 資料集名稱\n",
        "class_names = ['cat', 'dog']  # 修改自己的類別\n",
        "train_test_split_rate = 0.2\n",
        "\n",
        "img_dir = 'datasets/pet_coco/'  # 照片存放路徑\n",
        "ann_dir = 'datasets/pet_coco/'  # 標籤存放路徑\n",
        "image_paths = getImagesInDir(img_dir)\n",
        "random.seed(2022)\n",
        "random.shuffle(image_paths)\n",
        "\n",
        "train_image_path = f'datasets/{name}/train/images/'\n",
        "train_label_path = f'datasets/{name}/train/labels/'\n",
        "valid_image_path = f'datasets/{name}/valid/images/'\n",
        "valid_label_path = f'datasets/{name}/valid/labels/'\n",
        "\n",
        "if not os.path.exists(train_image_path):\n",
        "    os.makedirs(train_image_path)\n",
        "if not os.path.exists(train_label_path):\n",
        "    os.makedirs(train_label_path)\n",
        "if not os.path.exists(valid_image_path):\n",
        "    os.makedirs(valid_image_path)\n",
        "if not os.path.exists(valid_label_path):\n",
        "    os.makedirs(valid_label_path)\n",
        "\n",
        "train_test_split = len(image_paths)*train_test_split_rate\n",
        "\n",
        "\n",
        "for i, img_path in enumerate(image_paths):\n",
        "    if i >= train_test_split:\n",
        "        # train\n",
        "        convert_annotation(img_path, ann_dir,\n",
        "                           train_image_path, train_label_path)\n",
        "    else:\n",
        "        # valid\n",
        "        convert_annotation(img_path, ann_dir,\n",
        "                           valid_image_path, valid_label_path)"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "2378cee1",
      "metadata": {
        "id": "2378cee1"
      },
      "source": [
        "---"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "7df24cb3",
      "metadata": {
        "id": "7df24cb3"
      },
      "source": [
        "## 2. 更改設定檔案\n",
        "- 修改 cfg/training/yolov7.yaml\n",
        "- 修改 data/coco.yaml 製作一個自己資料集的 yaml"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "a567867b",
      "metadata": {
        "id": "a567867b"
      },
      "source": [
        "將yolov7.yaml 設定檔複製一份\n",
        "\n",
        "!cp 要複製的檔案 新檔案名稱"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "77522a59",
      "metadata": {
        "id": "77522a59"
      },
      "outputs": [],
      "source": [
        "!cp cfg/training/yolov7.yaml cfg/training/yolov7-pet.yaml"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "241971c0",
      "metadata": {
        "id": "241971c0"
      },
      "source": [
        "將class的地方改成自己的class數量\n",
        "\n",
        "!sed -n -e (顯示) 第幾行 檔案名稱"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "977a9391",
      "metadata": {
        "id": "977a9391"
      },
      "outputs": [],
      "source": [
        "!sed -n -e 2p cfg/training/yolov7-pet.yaml"
      ]
    },
    {
      "cell_type": "markdown",
      "source": [],
      "metadata": {
        "id": "edq_kVCqkdIa"
      },
      "id": "edq_kVCqkdIa"
    },
    {
      "cell_type": "markdown",
      "id": "9c9c52ed",
      "metadata": {
        "id": "9c9c52ed"
      },
      "source": [
        "\n",
        "\n",
        "!sed -i (修改) 第幾行/欲修改的字/目標字/ 檔案名稱"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "2395d21d",
      "metadata": {
        "id": "2395d21d"
      },
      "outputs": [],
      "source": [
        "!sed -i '2s/80/2/' cfg/training/yolov7-pet.yaml"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "1f0c5ff7",
      "metadata": {
        "id": "1f0c5ff7"
      },
      "outputs": [],
      "source": [
        "!sed -n -e 2p cfg/training/yolov7-pet.yaml"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "91f41ebe",
      "metadata": {
        "id": "91f41ebe"
      },
      "source": [
        "![image](https://hackmd.io/_uploads/Skkm-gBdp.png)\n"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "b4095ba9",
      "metadata": {
        "id": "b4095ba9"
      },
      "source": [
        "參考data/coco.yaml 製作一個自己資料集的yaml"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "d0c54ff6",
      "metadata": {
        "id": "d0c54ff6"
      },
      "outputs": [],
      "source": [
        "text = \\\n",
        "    \"\"\"\n",
        "    train: ./datasets/pet/train # 訓練資料夾位置\n",
        "    val: ./datasets/pet/valid # 驗證資料夾位置\n",
        "\n",
        "    # number of classes\n",
        "    nc: 2 # <-需修改乘自己的類別數量\n",
        "\n",
        "    # class names\n",
        "    names: [ 'cat','dog' ]\n",
        "    \"\"\""
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "9898983a",
      "metadata": {
        "id": "9898983a"
      },
      "outputs": [],
      "source": [
        "with open(f'data/{name}.yaml', 'w') as file:\n",
        "    file.write(text)"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "9a77d431",
      "metadata": {
        "id": "9a77d431"
      },
      "source": [
        "![image](https://hackmd.io/_uploads/H1Am-gHua.png)\n"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "04fe6f92",
      "metadata": {
        "id": "04fe6f92"
      },
      "source": [
        "下載預訓練權重檔案\n",
        "https://github.com/WongKinYiu/yolov7"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "67e1bb52",
      "metadata": {
        "id": "67e1bb52"
      },
      "source": [
        "![image](https://hackmd.io/_uploads/HJ0NZxHua.png)\n",
        "放置於weights/資料夾底下"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "c2154177",
      "metadata": {
        "id": "c2154177"
      },
      "source": [
        "執行訓練，訓練參數介紹：\n",
        "- --weights : 預先訓練的權重路徑(weights/yolov7_training.pt)\n",
        "- --cfg：模型設定檔案路徑(cfg/training/yolov7-pet.yaml)\n",
        "- --data：資料集設定檔案路徑(data/pet.yaml)\n",
        "- --device：GPU設定\n",
        "- --batch-size：一次訓練照片張數\n",
        "- --epoch： 訓練圈數\n",
        "\n",
        "其他可調控參數可置train.py中察看"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "fa136a0e",
      "metadata": {
        "id": "fa136a0e"
      },
      "outputs": [],
      "source": [
        "!python train.py --weights weights/yolov7_training.pt --cfg cfg/training/yolov7-pet.yaml --data data/pet.yaml --device 0 --batch-size 16 --epoch 50"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "6a716521",
      "metadata": {
        "id": "6a716521"
      },
      "outputs": [],
      "source": []
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.8.5"
    },
    "colab": {
      "provenance": [],
      "gpuType": "T4",
      "include_colab_link": true
    },
    "accelerator": "GPU"
  },
  "nbformat": 4,
  "nbformat_minor": 5
}