{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "view-in-github", "colab_type": "text" }, "source": [] }, { "cell_type": "markdown", "id": "fab26a2f", "metadata": { "id": "fab26a2f" }, "source": [ "# Download the course files (YOLOv7, dataset)" ] }, { "cell_type": "code", "execution_count": null, "id": "eb76e2e5", "metadata": { "id": "eb76e2e5" }, "outputs": [], "source": [ "!wget https://github.com/TA-aiacademy/course_3.0/releases/download/CVCNN_Data/yolo_datasets.zip\n", "!unzip -q yolo_datasets.zip\n", "!wget https://github.com/TA-aiacademy/course_3.0/releases/download/CVCNN_Data/yolov7new.zip\n", "!unzip -q yolov7new.zip" ] }, { "cell_type": "markdown", "id": "4d098373", "metadata": { "id": "4d098373" }, "source": [ "# YOLOv7 hands-on\n" ] }, { "cell_type": "markdown", "id": "917fbba2", "metadata": { "id": "917fbba2" }, "source": [ "## [Public cat and dog dataset](https://public.roboflow.com/object-detection/oxford-pets/2/images/fc82071578629d4d44696cb666898d45)\n", "![VnNscKi](https://hackmd.io/_uploads/HkGe-eSO6.png)\n", "This public cat and dog dataset provides 3680 images. To keep training fast, only 250 of them are used here; the files are in datasets/pet.zip." ] }, { "cell_type": "markdown", "id": "72541f99", "metadata": { "id": "72541f99" }, "source": [ "## 1. 
Prepare the dataset\n", "Convert the label format:\n", "- From Pascal VOC (xml) to YOLO (txt)\n", "- From COCO (json) to YOLO (txt)\n", "![eNWUWGQ](https://hackmd.io/_uploads/BJPb-gS_6.png)\n" ] }, { "cell_type": "markdown", "id": "670da330", "metadata": { "id": "670da330" }, "source": [ "* ### Pascal VOC (xml) -> YOLO (txt)" ] }, { "cell_type": "code", "execution_count": null, "id": "614f3e06", "metadata": { "id": "614f3e06" }, "outputs": [], "source": [ "import os\n", "import glob\n", "import random\n", "import shutil\n", "import xml.etree.ElementTree as ET\n", "\n", "\n", "# Collect the paths of all images in a folder\n", "def getImagesInDir(dir_path):\n", "    img_formats = ['bmp', 'jpg', 'jpeg', 'png', 'tif', 'tiff', 'dng']\n", "    image_list = []\n", "    for img_format in img_formats:\n", "        for filename in glob.glob(os.path.join(dir_path, f'*.{img_format}')):\n", "            image_list.append(filename)\n", "\n", "    return image_list\n", "\n", "\n", "# Coordinate conversion: pixel box (xmin, xmax, ymin, ymax) ->\n", "# YOLO box (x_center, y_center, width, height), normalized by image size\n", "def convert(size, box):\n", "    dw = 1./(size[0])\n", "    dh = 1./(size[1])\n", "    # Pascal VOC pixel coordinates are 1-based, hence the -1 on the center\n", "    x = (box[0] + box[1])/2.0 - 1\n", "    y = (box[2] + box[3])/2.0 - 1\n", "    w = box[1] - box[0]\n", "    h = box[3] - box[2]\n", "    x = x*dw\n", "    w = w*dw\n", "    y = y*dh\n", "    h = h*dh\n", "    return (x, y, w, h)\n", "\n", "\n", "# Read one annotation file, convert it, and write the YOLO label file\n", "def convert_annotation(img_path, ann_dir,\n", "                       output_image_path, output_label_path):\n", "    basename = os.path.basename(img_path)\n", "    basename_no_ext = os.path.splitext(basename)[0]\n", "\n", "    # copy the image into the output folder\n", "    shutil.copyfile(img_path, os.path.join(output_image_path, basename))\n", "\n", "    # ET.parse accepts a file path, so no input file handle is left open\n", "    tree = ET.parse(os.path.join(ann_dir, basename_no_ext + '.xml'))\n", "    out_file = open(os.path.join(output_label_path, basename_no_ext + '.txt'), 'w')\n", "    root = tree.getroot()\n", "    size = root.find('size')\n", "    w = int(size.find('width').text)\n", "    h = int(size.find('height').text)\n", "\n", "    for obj in root.iter('object'):\n", "        difficult = obj.find('difficult').text\n", "        cls = obj.find('name').text\n", "        # skip classes we do not train on and objects flagged as difficult\n", "        if cls not in classes or difficult == '1':\n", "            continue\n", "        cls_id = 
classes.index(cls)\n", "        xmlbox = obj.find('bndbox')\n", "        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text),\n", "             float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))\n", "        bb = convert((w, h), b)\n", "        out_file.write(str(cls_id) + \" \" + \" \".join(\n", "            [str(a) for a in bb]) + '\\n')\n", "\n", "    # close the label file so its contents are flushed to disk\n", "    out_file.close()" ] }, { "cell_type": "code", "execution_count": null, "id": "b9450ef9", "metadata": { "id": "b9450ef9" }, "outputs": [], "source": [ "name = 'pet'  # dataset name\n", "classes = ['cat', 'dog']  # change to your own classes\n", "train_test_split_rate = 0.2  # fraction of images used for validation\n", "\n", "img_dir = 'datasets/pet_voc/JPEGImages/'  # image folder\n", "ann_dir = 'datasets/pet_voc/Annotations/'  # annotation folder\n", "image_paths = getImagesInDir(img_dir)\n", "random.seed(2022)\n", "random.shuffle(image_paths)\n", "\n", "train_image_path = f'datasets/{name}/train/images/'\n", "train_label_path = f'datasets/{name}/train/labels/'\n", "valid_image_path = f'datasets/{name}/valid/images/'\n", "valid_label_path = f'datasets/{name}/valid/labels/'\n", "\n", "# create the output folders if they do not exist yet\n", "for path in (train_image_path, train_label_path,\n", "             valid_image_path, valid_label_path):\n", "    os.makedirs(path, exist_ok=True)\n", "\n", "train_test_split = len(image_paths)*train_test_split_rate\n", "\n", "# the first 20% of the shuffled images go to valid, the rest to train\n", "for i, img_path in enumerate(image_paths):\n", "    if i >= train_test_split:\n", "        # train\n", "        convert_annotation(img_path, ann_dir,\n", "                           train_image_path, train_label_path)\n", "    else:\n", "        # valid\n", "        convert_annotation(img_path, ann_dir,\n", "                           valid_image_path, valid_label_path)" ] }, { "cell_type": "markdown", "id": "2eb9e785", "metadata": { "id": "2eb9e785" }, "source": [ "* ### COCO (json) -> YOLO (txt)" ] }, { "cell_type": "code", "execution_count": null, "id": "90f15178", "metadata": { "id": "90f15178" }, "outputs": [], "source": [ "import os\n", "import glob\n", 
"import random\n", "import json\n", "import shutil" ] }, { "cell_type": "code", "execution_count": null, "id": "be5d0c72", "metadata": { "id": "be5d0c72" }, "outputs": [], "source": [ "# Collect the paths of all images in a folder\n", "def getImagesInDir(dir_path):\n", "    img_formats = ['bmp', 'jpg', 'jpeg', 'png', 'tif', 'tiff', 'dng']\n", "    image_list = []\n", "    for img_format in img_formats:\n", "        for filename in glob.glob(os.path.join(dir_path, f'*.{img_format}')):\n", "            image_list.append(filename)\n", "\n", "    return image_list\n", "\n", "\n", "# Coordinate conversion, same as in the VOC cell (including its -1 center offset)\n", "def convert(size, box):\n", "    dw = 1./(size[0])\n", "    dh = 1./(size[1])\n", "    x = (box[0] + box[1])/2.0 - 1\n", "    y = (box[2] + box[3])/2.0 - 1\n", "    w = box[1] - box[0]\n", "    h = box[3] - box[2]\n", "    x = x*dw\n", "    w = w*dw\n", "    y = y*dh\n", "    h = h*dh\n", "    return (x, y, w, h)\n", "\n", "\n", "# Read one json annotation, convert it, and write the YOLO label file.\n", "# Note: the keys read here (shapes, label, points, imageWidth, imageHeight)\n", "# look like LabelMe-style per-image json rather than a single COCO file.\n", "def convert_annotation(img_path, ann_dir,\n", "                       output_image_path, output_label_path):\n", "    basename = os.path.basename(img_path)\n", "    basename_no_ext = os.path.splitext(basename)[0]\n", "\n", "    # copy the image into the output folder\n", "    shutil.copyfile(img_path, os.path.join(output_image_path, basename))\n", "\n", "    # load the json annotation\n", "    json_path = os.path.join(ann_dir, basename_no_ext + '.json')\n", "    with open(json_path, encoding=\"utf-8\") as f:\n", "        in_file = json.load(f)\n", "    w = int(in_file[\"imageWidth\"])\n", "    h = int(in_file[\"imageHeight\"])\n", "\n", "    label_path = os.path.join(output_label_path, basename_no_ext + '.txt')\n", "    with open(label_path, 'w') as out_file:\n", "        for shape in in_file[\"shapes\"]:\n", "            class_name = shape[\"label\"]\n", "            cls_id = class_names.index(class_name)\n", "            # the two corner points may come in either order, so sort them\n", "            (xmin, ymin), (xmax, ymax) = shape[\"points\"]\n", "            xmin, xmax = sorted([xmin, xmax])\n", "            ymin, ymax = sorted([ymin, ymax])\n", "            b = (float(xmin), float(xmax), float(ymin), float(ymax))\n", "            bb = convert((w, h), b)\n", "            out_file.write(str(cls_id) + \" \" + \" \".join(\n", "                [str(a) for a in bb]) + '\\n')" ] }, { "cell_type": "code", "execution_count": null, "id": "6e3c346f", "metadata": { "id": "6e3c346f" }, "outputs": [], "source": [ "name = 'pet'  # dataset name\n", 
"class_names = ['cat', 'dog']  # change to your own classes\n", "train_test_split_rate = 0.2  # fraction of images used for validation\n", "\n", "img_dir = 'datasets/pet_coco/'  # image folder\n", "ann_dir = 'datasets/pet_coco/'  # annotation folder\n", "image_paths = getImagesInDir(img_dir)\n", "random.seed(2022)\n", "random.shuffle(image_paths)\n", "\n", "train_image_path = f'datasets/{name}/train/images/'\n", "train_label_path = f'datasets/{name}/train/labels/'\n", "valid_image_path = f'datasets/{name}/valid/images/'\n", "valid_label_path = f'datasets/{name}/valid/labels/'\n", "\n", "# create the output folders if they do not exist yet\n", "for path in (train_image_path, train_label_path,\n", "             valid_image_path, valid_label_path):\n", "    os.makedirs(path, exist_ok=True)\n", "\n", "train_test_split = len(image_paths)*train_test_split_rate\n", "\n", "# the first 20% of the shuffled images go to valid, the rest to train\n", "for i, img_path in enumerate(image_paths):\n", "    if i >= train_test_split:\n", "        # train\n", "        convert_annotation(img_path, ann_dir,\n", "                           train_image_path, train_label_path)\n", "    else:\n", "        # valid\n", "        convert_annotation(img_path, ann_dir,\n", "                           valid_image_path, valid_label_path)" ] }, { "cell_type": "markdown", "id": "2378cee1", "metadata": { "id": "2378cee1" }, "source": [ "---" ] }, { "cell_type": "markdown", "id": "7df24cb3", "metadata": { "id": "7df24cb3" }, "source": [ "## 2. 
Edit the configuration files\n", "- Modify cfg/training/yolov7.yaml\n", "- Use data/coco.yaml as a template to create a yaml for your own dataset" ] }, { "cell_type": "markdown", "id": "a567867b", "metadata": { "id": "a567867b" }, "source": [ "Make a copy of the yolov7.yaml config file\n", "\n", "!cp source_file new_file_name" ] }, { "cell_type": "code", "execution_count": null, "id": "77522a59", "metadata": { "id": "77522a59" }, "outputs": [], "source": [ "!cp cfg/training/yolov7.yaml cfg/training/yolov7-pet.yaml" ] }, { "cell_type": "markdown", "id": "241971c0", "metadata": { "id": "241971c0" }, "source": [ "Change the class count to the number of classes in your own dataset\n", "\n", "!sed -n -e (print) line_number file_name" ] }, { "cell_type": "code", "execution_count": null, "id": "977a9391", "metadata": { "id": "977a9391" }, "outputs": [], "source": [ "!sed -n -e 2p cfg/training/yolov7-pet.yaml" ] }, { "cell_type": "markdown", "source": [], "metadata": { "id": "edq_kVCqkdIa" }, "id": "edq_kVCqkdIa" }, { "cell_type": "markdown", "id": "9c9c52ed", "metadata": { "id": "9c9c52ed" }, "source": [ "!sed -i (edit in place) line_number/text_to_replace/replacement_text/ file_name" ] }, { "cell_type": "code", "execution_count": null, "id": "2395d21d", "metadata": { "id": "2395d21d" }, "outputs": [], "source": [ "!sed -i '2s/80/2/' cfg/training/yolov7-pet.yaml" ] }, { "cell_type": "code", "execution_count": null, "id": "1f0c5ff7", "metadata": { "id": "1f0c5ff7" }, "outputs": [], "source": [ "!sed -n -e 2p cfg/training/yolov7-pet.yaml" ] }, { "cell_type": "markdown", "id": "91f41ebe", "metadata": { "id": "91f41ebe" }, "source": [ "![image](https://hackmd.io/_uploads/Skkm-gBdp.png)\n" ] }, { "cell_type": "markdown", "id": "b4095ba9", "metadata": { "id": "b4095ba9" }, "source": [ "Using data/coco.yaml as a reference, create a yaml for your own dataset" ] }, { "cell_type": "code", "execution_count": null, "id": "d0c54ff6", "metadata": { "id": "d0c54ff6" }, "outputs": [], "source": [ "text = \"\"\"\n", "train: ./datasets/pet/train  # training folder\n", "val: ./datasets/pet/valid  # validation folder\n", "\n", "# number of classes\n", "nc: 2  # <- change to your own number of classes\n", "\n", "# class names\n", "names: [ 'cat','dog' 
]\n", "\"\"\"" ] }, { "cell_type": "code", "execution_count": null, "id": "9898983a", "metadata": { "id": "9898983a" }, "outputs": [], "source": [ "with open(f'data/{name}.yaml', 'w') as file:\n", "    file.write(text)" ] }, { "cell_type": "markdown", "id": "9a77d431", "metadata": { "id": "9a77d431" }, "source": [ "![image](https://hackmd.io/_uploads/H1Am-gHua.png)\n" ] }, { "cell_type": "markdown", "id": "04fe6f92", "metadata": { "id": "04fe6f92" }, "source": [ "Download the pretrained weights from\n", "https://github.com/WongKinYiu/yolov7" ] }, { "cell_type": "markdown", "id": "67e1bb52", "metadata": { "id": "67e1bb52" }, "source": [ "![image](https://hackmd.io/_uploads/HJ0NZxHua.png)\n", "Place the file in the weights/ folder" ] }, { "cell_type": "markdown", "id": "c2154177", "metadata": { "id": "c2154177" }, "source": [ "Run training. The training arguments are:\n", "- --weights: path to the pretrained weights (weights/yolov7_training.pt)\n", "- --cfg: path to the model config file (cfg/training/yolov7-pet.yaml)\n", "- --data: path to the dataset config file (data/pet.yaml)\n", "- --device: which GPU to use\n", "- --batch-size: number of images per training batch\n", "- --epochs: number of training epochs\n", "\n", "Other tunable arguments are listed in train.py" ] }, { "cell_type": "code", "execution_count": null, "id": "fa136a0e", "metadata": { "id": "fa136a0e" }, "outputs": [], "source": [ "!python train.py --weights weights/yolov7_training.pt --cfg cfg/training/yolov7-pet.yaml --data data/pet.yaml --device 0 --batch-size 16 --epochs 50" ] }, { "cell_type": "code", "execution_count": null, "id": "6a716521", "metadata": { "id": "6a716521" }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" }, "colab": { "provenance": [], "gpuType": "T4", "include_colab_link": true }, "accelerator": "GPU" }, "nbformat": 4, "nbformat_minor": 5 }