{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "EoAQKhRxKSOv" }, "source": [ "# The Basic of Image Preprocessing" ] }, { "cell_type": "markdown", "metadata": { "id": "c0zHj4cnKSOz" }, "source": [ "### Chapter Outline\n", "* [Resize (影像縮放)](#影像縮放-Resize)\n", "* [Shift (影像平移)](#影像平移)\n", "* [Rotation (影像旋轉)](#影像旋轉)\n", "* [Flip (影像翻轉)](#影像翻轉)\n", "* [Affine (影像仿射)](#影像仿射)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "pfJS9BeNKSOz" }, "outputs": [], "source": [ "# OpenCV's Python module is named cv2\n", "import cv2\n", "import matplotlib.pyplot as plt\n", "import numpy as np" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "_RfoPr3q1AMB" }, "outputs": [], "source": [ "# Download and unzip the course data\n", "!wget -q https://github.com/TA-aiacademy/course_3.0/releases/download/CVCNN_Data/CVCNN_part1.zip\n", "!unzip -q CVCNN_part1" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "bztVV4ACKSO0" }, "outputs": [], "source": [ "# cv2.imread returns BGR; reverse the channel axis to get RGB for matplotlib\n", "img = cv2.imread(\"poker.jpg\")[:, :, ::-1]\n", "\n", "plt.imshow(img)\n", "plt.axis(\"off\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "fMdp2xdDKSO1" }, "source": [ "* ### 影像縮放 Resize" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Bvye3FWtKSO1" }, "outputs": [], "source": [ "print(img.shape)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "pDNgs8ioKSO1" }, "outputs": [], "source": [ "# Resize to 200 x 300 (height x width); cv2.resize takes dsize as (width, height)\n", "resize_img1 = cv2.resize(img, (300, 200))\n", "\n", "plt.imshow(resize_img1)\n", "plt.title(\"shape {}\".format(resize_img1.shape))\n", "plt.axis(\"off\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "_Ykrbh0VKSO2" }, "source": [ "-- If you do not want to specify the output size, you can specify the scale factors fx and fy instead
\n", "-- Common interpolation methods are listed below
\n", "\n", "Name | Interpolation method\n", ":-----------------:|:--------------------:\n", "cv2.INTER_LINEAR | Bilinear interpolation (default) \n", "cv2.INTER_NEAREST | Nearest-neighbor interpolation\n", "cv2.INTER_AREA | Area-based resampling\n", "cv2.INTER_CUBIC | Bicubic interpolation" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "YKAhPH-5KSO2" }, "outputs": [], "source": [ "# dsize=(0, 0) lets OpenCV compute the output size from fx and fy\n", "resize_img2 = cv2.resize(img, (0, 0), fx=0.5, fy=0.2)\n", "\n", "img_list = [resize_img2]\n", "plt.figure(figsize=(16, 16))\n", "for i, each in enumerate(img_list):\n", "    plt.subplot(1, 2, i+1)\n", "    plt.imshow(each)\n", "    plt.title(\"shape: {}\".format(each.shape), fontsize=15)\n", "    plt.axis(\"off\")\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "JqqbHXWNKSO3" }, "outputs": [], "source": [ "resize_img3 = cv2.resize(img, (0, 0), fx=1.5, fy=1.8)\n", "\n", "# For comparison, use cv2.INTER_NEAREST (nearest-neighbor interpolation)\n", "resize_img4 = cv2.resize(img, (0, 0), fx=1.5, fy=1.8,\n", "                         interpolation=cv2.INTER_NEAREST)\n", "\n", "\n", "img_list = [resize_img3, resize_img4]\n", "plt.figure(figsize=(20, 20))\n", "for i, each in enumerate(img_list):\n", "    plt.subplot(1, 2, i+1)\n", "    plt.imshow(each)\n", "    plt.title(\"shape: {}\".format(each.shape), fontsize=15)\n", "    plt.axis(\"off\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "9090pmilKSO3" }, "source": [ "[(back...)](#The-Basic-of-Image-Preprocessing)" ] }, { "cell_type": "markdown", "metadata": { "id": "9VjcOnrcKSO3" }, "source": [ "* ### 影像平移\n", " [1] Define the translation matrix $M$
\n", " [2] Specify the horizontal and vertical shifts $t_x$ and $t_y$
\n", "\n", " $M = \\left[\\begin{array}{c c c} 1 & 0 & t_x \\\\ 0 & 1 & t_y\\end{array}\\right]$\n", "\n", " $M\\left[\\begin{array}{c} x \\\\ y\\\\1\\end{array}\\right] = \\left[\\begin{array}{c c c} 1 & 0 & t_x \\\\ 0 & 1 & t_y\\end{array}\\right]\\left[\\begin{array}{c} x \\\\ y\\\\1\\end{array}\\right] = \\left[\\begin{array}{c} x + t_x \\\\ y + t_y\\end{array}\\right]$" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "lTgIYSK7KSO3" }, "outputs": [], "source": [ "h, w, _ = img.shape\n", "tx, ty = 25, 50\n", "M1 = np.float32([[1, 0, tx],  # shift right by tx\n", "                 [0, 1, ty]])  # shift down by ty\n", "shift_img1 = cv2.warpAffine(img, M1, (w, h))  # output size is (width, height)\n", "\n", "\n", "img_list = [img, shift_img1]\n", "plt.figure(figsize=(16, 16))\n", "for i, each in enumerate(img_list):\n", "    plt.subplot(1, 2, i+1)\n", "    plt.imshow(each)\n", "    plt.title(\"shape: {}\".format(each.shape), fontsize=15)\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "Ii1WsyqdKSO4" }, "source": [ "Notice that in the right-hand image above, the region exposed by the shift is filled with black.
\n", "Sometimes we do not want the border to be filled this way.
\n", "In that case, the borderMode argument of warpAffine changes how the border is filled.
\n", "Common border modes are listed below.
\n", "\n", "Name | Border fill method\n", ":--------------------:|:-----------------------:\n", "cv2.BORDER_CONSTANT | Fill with a constant value (default) \n", "cv2.BORDER_REPLICATE | Replicate the nearest edge pixel\n", "cv2.BORDER_REFLECT | Mirror reflection\n", "cv2.BORDER_WRAP | Wrap around (tile the image)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "GtM74juqKSO4" }, "outputs": [], "source": [ "M2 = np.float32([[1, 0, 100],\n", "                 [0, 1, 50]])\n", "# Fill the exposed border by mirror reflection\n", "shift_img2 = cv2.warpAffine(img, M2, (w//2, h),\n", "                            borderMode=cv2.BORDER_REFLECT)\n", "\n", "# Default BORDER_CONSTANT mode with a custom fill color (RGB order here)\n", "shift_img3 = cv2.warpAffine(img, M2, (w//2, h),\n", "                            borderValue=(168, 0, 0))\n", "\n", "img_list = [img[:, :img.shape[1]//2], shift_img2, shift_img3]\n", "plt.figure(figsize=(16, 16))\n", "for i, each in enumerate(img_list):\n", "    plt.subplot(1, 3, i+1)\n", "    plt.imshow(each)\n", "    plt.title(\"shape: {}\".format(each.shape), fontsize=15)\n", "    plt.axis(\"off\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "WEAsy1qEKSO4" }, "source": [ "[(back...)](#The-Basic-of-Image-Preprocessing)" ] }, { "cell_type": "markdown", "metadata": { "id": "aCJd0_aUKSO4" }, "source": [ "* ### 影像旋轉\n", " [1] Recall the rotation matrix from linear algebra: $M = \\left[\\begin{array}{c c} \\cos\\theta & -\\sin\\theta \\\\ \\sin\\theta & \\cos\\theta \\end{array}\\right]$
\n", " \n", "\n", " $\\left[\\begin{array}{c}x'\\\\ y'\\end{array}\\right] =\n", " \\left[\\begin{array}{c}\\cos{(\\theta_1 + \\theta_2)}\\\\ \\sin{(\\theta_1 + \\theta_2)}\\end{array}\\right] =\n", " \\left[\\begin{array}{c}\\cos\\theta_1\\cos\\theta_2- \\sin\\theta_1\\sin\\theta_2\\\\\\cos\\theta_1\\sin\\theta_2+\\sin\\theta_1\\cos\\theta_2\\end{array}\\right] =\n", " \\left[\\begin{array}{c}x\\cos\\theta_2 - y\\sin\\theta_2\\\\ x\\sin\\theta_2 + y\\cos\\theta_2\\end{array}\\right] =\n", " \\left[\\begin{array}{c c} \\cos\\theta_2 & -\\sin\\theta_2 \\\\ \\sin\\theta_2 & \\cos\\theta_2 \\end{array}\\right]\n", " \\left[\\begin{array}{c}x\\\\ y\\end{array}\\right]=\n", " \\left[\\begin{array}{c c} \\alpha & -\\beta \\\\ \\beta & \\alpha \\end{array}\\right]\n", " \\left[\\begin{array}{c}x\\\\ y\\end{array}\\right]\n", " $\n", " \n", " Here $(x, y) = (\\cos\\theta_1, \\sin\\theta_1)$ is a point on the unit circle, $\\alpha = \\cos\\theta_2$, and $\\beta = \\sin\\theta_2$.\n", " \n", " [2] The rotation above is about the origin. OpenCV additionally allows any point $(center_x, center_y)$ to be the rotation center, so the rotation matrix becomes:\n", "\n", " $M = \\left[\\begin{array}{ccc} \\alpha & \\beta & (1-\\alpha)center_x -\\beta center_y \\\\ -\\beta & \\alpha & \\beta center_x+(1-\\alpha)center_y\\end{array}\\right]$\n", "\n", " In this matrix $\\alpha = scale \\cdot \\cos\\theta$ and $\\beta = scale \\cdot \\sin\\theta$; the sign of $\\beta$ is flipped relative to [1] because the image $y$ axis points downward." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "HU5ErF9_KSO5" }, "outputs": [], "source": [ "# First argument: center of rotation\n", "# Second argument: rotation angle in degrees (positive is counter-clockwise)\n", "# Third argument: scale factor\n", "\n", "center = (512//2, 512//2)\n", "M = cv2.getRotationMatrix2D(center, 30, 1)\n", "print(M)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "IOEJrB8HKSO5" }, "outputs": [], "source": [ "h, w, _ = img.shape\n", "center = (w//2, h//2)\n", "M = cv2.getRotationMatrix2D(center, 45, 1)\n", "rotate_img = cv2.warpAffine(img, M, (w, h))\n", "\n", "plt.figure(figsize=(8, 8))\n", "plt.imshow(rotate_img)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "KnaUZshyKSO5" }, "source": [ "[(back...)](#The-Basic-of-Image-Preprocessing)" ] }, { "cell_type": "markdown", "metadata": { "id": "ruXmBAhYKSO5" }, "source": [ "* ### 影像翻轉\n", "cv2.flip 
takes the input image as its first argument and the flip direction as its second argument
\n", " * 1: horizontal flip\n", " * 0: vertical flip\n", " * -1: horizontal and vertical flip" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "P8nkO0qJKSO5" }, "outputs": [], "source": [ "flipped_A = cv2.flip(img, 1)\n", "flipped_B = cv2.flip(img, 0)\n", "flipped_C = cv2.flip(img, -1)\n", "\n", "\n", "titles = ['Original Image',\n", "          'Horizontal Flip',\n", "          'Vertical Flip',\n", "          'Horizontal and Vertical Flip']\n", "img_list = [img, flipped_A, flipped_B, flipped_C]\n", "plt.figure(figsize=(20, 12))\n", "for i in range(4):\n", "    plt.subplot(2, 2, i+1)\n", "    plt.imshow(img_list[i])\n", "    plt.title(titles[i], fontsize=15)\n", "    plt.axis(\"off\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "UEo3VxQMKSO5" }, "source": [ "[(back...)](#The-Basic-of-Image-Preprocessing)" ] }, { "cell_type": "markdown", "metadata": { "id": "yUhYslTiKSO6" }, "source": [ "* ### 影像仿射\n" ] }, { "cell_type": "markdown", "metadata": { "id": "jFJslWr8KSO6" }, "source": [ "An affine transformation combines translation and rotation and, more generally, scaling and shear; it keeps straight lines straight and parallel lines parallel.
\n", "Earlier sections used warpAffine to translate an image; with a general transformation matrix $M$ we can apply any affine transformation." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "39rftT3PKSO6" }, "outputs": [], "source": [ "h, w, _ = img.shape\n", "# Three source points and the positions they should map to\n", "raw_loc = np.array([[0, 0], [w, 0], [0, h]]).astype(np.float32)\n", "new_loc = np.array([[0, 0],\n", "                    [w*0.85, h*0.25],\n", "                    [w*0.15, h*0.7]]).astype(np.float32)\n", "# Three point pairs determine the 2x3 affine matrix\n", "affine_transform = cv2.getAffineTransform(raw_loc, new_loc)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "sEMak1a3U8OY" }, "outputs": [], "source": [ "affine_transform" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "pR8yRbxrKSO6" }, "outputs": [], "source": [ "affine_img1 = cv2.warpAffine(img, affine_transform, (w, h))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ziFn7OnKKSO6" }, "outputs": [], "source": [ "img_list = [img, affine_img1]\n", "titles = ['Original Image', 'Affine transform']\n", "plt.figure(figsize=(20, 20))\n", "for i, each in enumerate(img_list):\n", "    plt.subplot(1, 2, i+1)\n", "    plt.imshow(each)\n", "    plt.title(\"{} shape: {}\".format(titles[i], each.shape), fontsize=15)\n", "    plt.axis(\"off\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "R32aULxCKSO7" }, "source": [ "[(back...)](#The-Basic-of-Image-Preprocessing)" ] } ], "metadata": { "colab": { "provenance": [] }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.12" } }, "nbformat": 4, "nbformat_minor": 0 }