{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "wGp_eGfBKTEW" }, "source": [ "# The Advanced of Image Preprocessing" ] }, { "cell_type": "markdown", "metadata": { "id": "tv3vNe-kKTEd" }, "source": [ "### Chapter Outline\n", "* [Image Binarization](#Image-Binarization)\n", "* [Finding Image Contours with cv2.findContours](#用-cv2.findContours-找影像輪廓)\n", "* [K-Means Clustering in OpenCV](#K-Means-Clustering-in-OpenCV)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "hYExLJCEKTEd" }, "outputs": [], "source": [ "# In Python, the OpenCV module is named cv2\n", "import cv2\n", "import matplotlib.pyplot as plt\n", "import numpy as np" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "rZ65aUaxN4Mg" }, "outputs": [], "source": [ "# Download and unpack the course data\n", "!wget -q https://github.com/TA-aiacademy/course_3.0/releases/download/CVCNN_Data/CVCNN_part1.zip\n", "!unzip -q CVCNN_part1.zip" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "5lWbDqm3KTEe" }, "outputs": [], "source": [ "# cv2.imread returns BGR; reversing the last axis converts it to RGB\n", "image = cv2.imread(\"aia_logo.png\")[:, :, ::-1]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "2GEhWeZLKTEe" }, "outputs": [], "source": [ "plt.imshow(image)\n", "plt.axis(\"off\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "J4h25mL-KTEf" }, "source": [ "* ### Image Binarization\n", " -- Images are fundamentally stored as binary data; the simplest such representation gives each pixel a value of 0 (black) or 1 (white).
\n", " -- A grayscale image can be converted to pure black and white with cv2.threshold.
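\n", " -- The function takes a threshold: pixels greater than the threshold are set to the maximum value (white), and all other pixels are set to 0 (black).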
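
The thresholding rule can be sketched in plain NumPy. This is only an illustration of the rule that cv2.THRESH_BINARY applies, not OpenCV's implementation; the helper name binary_threshold and the sample array are made up for this example.

```python
import numpy as np


def binary_threshold(gray, thresh, maxval=255):
    # Hypothetical helper, not part of OpenCV: pixels strictly above
    # thresh become maxval, every other pixel becomes 0.
    gray = np.asarray(gray)
    return np.where(gray > thresh, maxval, 0).astype(np.uint8)


# A tiny 2x3 grayscale image with values on both sides of the threshold
sample = np.array([[0, 100, 127], [128, 200, 255]], dtype=np.uint8)
binary = binary_threshold(sample, 127)
print(binary)
```

Note that a pixel exactly equal to the threshold (127 here) ends up black. Calling cv2.threshold(sample, 127, 255, cv2.THRESH_BINARY) should return the same array as its second value.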
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Zk1ZDzV6KTEf" }, "outputs": [], "source": [ "gray_img = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "KjZBFH31KTEf" }, "outputs": [], "source": [ "# 小於 127 設為黑\n", "_, thresh1 = cv2.threshold(gray_img, 127, 1, cv2.THRESH_BINARY)\n", "\n", "# 小於 200 設為黑\n", "_, thresh2 = cv2.threshold(gray_img, 200, 1, cv2.THRESH_BINARY)\n", "\n", "\n", "img_list = [gray_img, thresh1, thresh2]\n", "title = ['Gray Image', 'Threshold: 127', 'Threshold: 200']\n", "\n", "plt.figure(figsize=(12, 12))\n", "for i, each in enumerate(img_list):\n", " plt.subplot(1, 3, i+1)\n", " plt.imshow(each, cmap='gray')\n", " plt.title(title[i], fontsize=15)\n", " plt.axis(\"off\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "G_XZZ0dzKTEg" }, "source": [ "- 設定門檻值,讓灰階可以黑白分明,以下為各種不同函式的表現,可以觀察原圖與所使用的函式差異。" ] }, { "cell_type": "markdown", "metadata": { "id": "ZNAmpGy7T1q2" }, "source": [ "* #### Global Thresholding" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "AiVUyRubKTEg" }, "outputs": [], "source": [ "image = cv2.imread('bw.jpg', cv2.IMREAD_GRAYSCALE)\n", "(h, w) = image.shape[:2]\n", "center = (w/2, h/2)\n", "M = cv2.getRotationMatrix2D(center, 270, 2.0)\n", "rotated = cv2.warpAffine(image, M, (w, h))\n", "img = rotated\n", "\n", "_, thresh1 = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)\n", "_, thresh2 = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY_INV)\n", "_, thresh3 = cv2.threshold(img, 127, 255, cv2.THRESH_TRUNC)\n", "_, thresh4 = cv2.threshold(img, 127, 255, cv2.THRESH_TOZERO)\n", "_, thresh5 = cv2.threshold(img, 127, 255, cv2.THRESH_TOZERO_INV)\n", "\n", "titles = ['Original Image', 'BINARY', 'BINARY_INV',\n", " 'TRUNC', 'TOZERO', 'TOZERO_INV']\n", "images = [img, thresh1, thresh2, thresh3, thresh4, thresh5]\n", "\n", "plt.figure(figsize=(12, 6))\n", "for i in range(6):\n", " plt.subplot(2, 3, 
i+1)\n", " plt.imshow(images[i], 'gray')\n", " plt.title(titles[i])\n", " plt.axis(\"off\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "dC4U6VnpKTEh" }, "source": [ "[(back...)](#The-Advanced-of-Image-Preprocessing)" ] }, { "cell_type": "markdown", "metadata": { "id": "ZwYaE_V2T_1o" }, "source": [ "* #### Adaptive Thresholding\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "YF804tzqKTEh" }, "outputs": [], "source": [ "image = cv2.imread('bicycle.jpg', cv2.IMREAD_GRAYSCALE)\n", "\n", "# Global binarization (no blur / denoising)\n", "ret, th1 = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY)\n", "\n", "# Adaptive mean binarization (no blur / denoising)\n", "th2 = cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_MEAN_C,\n", "                            cv2.THRESH_BINARY, 11, 2)\n", "\n", "# Adaptive Gaussian binarization (no blur / denoising)\n", "th3 = cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,\n", "                            cv2.THRESH_BINARY, 11, 2)\n", "\n", "titles = ['Original Image', 'Global Thresholding (v = 127)',\n", "          'Adaptive Mean Thresholding', 'Adaptive Gaussian Thresholding']\n", "images = [image, th1, th2, th3]\n", "\n", "for i in range(4):\n", "    plt.subplot(2, 2, i+1)\n", "    plt.imshow(images[i], 'gray')\n", "    plt.title(titles[i])\n", "    plt.xticks([])\n", "    plt.yticks([])\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "vWrkSQ6VKTEh" }, "outputs": [], "source": [ "image = cv2.imread('bicycle.jpg', cv2.IMREAD_GRAYSCALE)\n", "\n", "# Blur the image first to reduce noise\n", "blur_img = cv2.medianBlur(image, 5)\n", "\n", "# Global binarization (with blur / denoising)\n", "ret, th4 = cv2.threshold(blur_img, 127, 255, cv2.THRESH_BINARY)\n", "\n", "# Adaptive binarization using the arithmetic mean (with blur / denoising)\n", "th5 = cv2.adaptiveThreshold(blur_img, 255, cv2.ADAPTIVE_THRESH_MEAN_C,\n", "                            cv2.THRESH_BINARY, 11, 2)\n", "\n", "# Adaptive binarization using a Gaussian-weighted mean (with blur / denoising)\n", "th6 = cv2.adaptiveThreshold(blur_img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,\n", "                            cv2.THRESH_BINARY, 11, 2)\n", "\n", "titles = ['Blurred Image', 'Global Thresholding (v = 127)',\n", "          'Adaptive Mean Thresholding', 'Adaptive Gaussian 
Thresholding']\n", "# Show the blurred image and the thresholds computed from it in this cell\n", "images = [blur_img, th4, th5, th6]\n", "\n", "for i in range(4):\n", "    plt.subplot(2, 2, i+1)\n", "    plt.imshow(images[i], 'gray')\n", "    plt.title(titles[i])\n", "    plt.xticks([])\n", "    plt.yticks([])\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "sp80rF8BKTEj" }, "source": [ "<a id='用-cv2.findContours-找影像輪廓'></a>\n", "* ### Finding Image Contours with cv2.findContours" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "4U5kgf8WKTEj" }, "outputs": [], "source": [ "image = cv2.imread('poker.jpg')[:, :, ::-1]\n", "gray_img = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)\n", "new_image = image.copy()\n", "\n", "# Threshold the grayscale image\n", "ret, thresh = cv2.threshold(gray_img, 127, 255, cv2.THRESH_BINARY)\n", "# Find contours, keeping only the outermost ones\n", "contours, hier = cv2.findContours(thresh, cv2.RETR_EXTERNAL,\n", "                                  cv2.CHAIN_APPROX_SIMPLE)\n", "\n", "\n", "# Draw every contour in yellow (the image is in RGB order)\n", "cv2.drawContours(new_image, contours, -1, (255, 255, 0), 3)\n", "\n", "titles = ['Original_image', 'Threshed_image', 'DrawContours']\n", "images = [image, thresh, new_image]\n", "\n", "\n", "plt.figure(figsize=(18, 9))\n", "for i in range(3):\n", "    plt.subplot(2, 3, i+1)\n", "    plt.imshow(images[i], 'gray')\n", "    plt.title(titles[i])\n", "    plt.axis(\"off\")\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "gVmdpxM0KTEj" }, "outputs": [], "source": [ "img = cv2.imread('poker.jpg', cv2.IMREAD_UNCHANGED)\n", "img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)\n", "\n", "# Create eight fresh RGB copies named img_11 ... img_42\n", "# (first digit: drawing type, second digit: 1 = no blur, 2 = blur)\n", "for i in range(4):\n", "    for a in range(2):\n", "        var_name = f'img_{i+1}{a+1}'\n", "        locals()[var_name] = cv2.imread('poker.jpg',\n", "                                        cv2.IMREAD_UNCHANGED)\n", "        locals()[var_name] = cv2.cvtColor(locals()[var_name],\n", "                                          cv2.COLOR_BGR2RGB)\n", "\n", "img_g = cv2.GaussianBlur(img, (15, 15), 5)\n", "\n", "for a in range(2):\n", "    # Binarize the image (a == 0: original, a == 1: Gaussian-blurred)\n", "    if a == 0:\n", "        ret, threshed_img = cv2.threshold(\n", "            cv2.cvtColor(img_11, cv2.COLOR_RGB2GRAY),\n", "            127, 255, cv2.THRESH_BINARY\n", "        )\n", "    if a == 1:\n", "        ret, threshed_img = 
cv2.threshold(\n", "            cv2.cvtColor(img_g, cv2.COLOR_RGB2GRAY),\n", "            127, 255, cv2.THRESH_BINARY\n", "        )\n", "\n", "    # Find the contours in the binarized image\n", "    contours, hier = cv2.findContours(threshed_img,\n", "                                      cv2.RETR_TREE,\n", "                                      cv2.CHAIN_APPROX_SIMPLE)\n", "\n", "    # Fit bounding shapes to each contour (colors are RGB):\n", "    # yellow: the contour itself\n", "    # green:  smallest upright (non-rotated) bounding rectangle\n", "    # blue:   smallest rotated bounding rectangle\n", "    # red:    smallest enclosing circle\n", "\n", "    for c in contours:\n", "        # Get the upright bounding rectangle\n", "        x, y, w, h = cv2.boundingRect(c)\n", "\n", "        # First group of images\n", "        # Draw a green rectangle to visualize the bounding rect\n", "        cv2.rectangle(locals()[f'img_1{a+1}'], (x, y),\n", "                      (x+w, y+h), (0, 255, 0), 3)\n", "\n", "        # Get the minimum-area (possibly rotated) rectangle around the object\n", "        rect = cv2.minAreaRect(c)\n", "        box = cv2.boxPoints(rect)\n", "\n", "        # Convert the corner coordinates from float to int\n", "        # (np.int0 was removed in NumPy 2.0; np.intp is the same type)\n", "        box = np.intp(box)\n", "\n", "        # Second group of images\n", "        # Draw the rotated rectangle in blue\n", "        cv2.drawContours(locals()[f'img_2{a+1}'], [box],\n", "                         0, (0, 0, 255), 3)\n", "\n", "        (x, y), radius = cv2.minEnclosingCircle(c)\n", "\n", "        center = (int(x), int(y))\n", "        radius = int(radius)\n", "\n", "        # Third group of images\n", "        # Draw the enclosing circle in red\n", "        cv2.circle(locals()['img_3{}'.format(a+1)],\n", "                   center, radius, (255, 0, 0), 3)\n", "\n", "    # Fourth group of images\n", "    # Draw all contours in yellow\n", "    cv2.drawContours(locals()['img_4{}'.format(a+1)],\n", "                     contours, -1, (255, 255, 0), 3)\n", "\n", "\n", "titles = ['Box', 'Box_blur', 'Box_2', 'Box_2_blur',\n", "          'Circle', 'Circle_blur', 'Contours', 'Contours_blur']\n", "images = [img_11, img_12, img_21, img_22,\n", "          img_31, img_32, img_41, img_42]\n", "plt.figure(figsize=(20, 20))\n", "for i in range(8):\n", "    plt.subplot(4, 2, i+1)\n", "    plt.imshow(images[i])\n", "    plt.title(titles[i])\n", "    plt.axis(\"off\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "RWHSFuHzKTEj" }, "source": [ "[(back...)](#The-Advanced-of-Image-Preprocessing)" ] }, { "cell_type": "markdown", "metadata": { "id": "x9shdDghKTEk" }, "source": [ "* ## K-Means Clustering in OpenCV\n", "Images are usually stored with three channels (RGB), and each channel takes a value in the range 
0 to 255 (8 bits).
\n", "A single pixel therefore takes 8*3=24 bits to store.
\n", "K-Means is one effective way to reduce the storage an image needs.
\n", "We can cluster all the colors in the image and then use each cluster's center to represent every point in that cluster.
\n", "If the colors are split into $2^5=32$ clusters, a pixel can be represented with only 5 bits (the index of its cluster).
\n", "That is roughly 24/5 = 4.8 times less memory per pixel than the original format, ignoring the small table of 32 cluster colors that must also be stored." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "_ub6Geu5KTEk" }, "outputs": [], "source": [ "img = cv2.imread('flower2.jpg')[:, :, ::-1]\n", "# Flatten the image into a list of RGB pixels, one row per pixel\n", "Z = img.reshape((-1, 3))\n", "\n", "# cv2.kmeans requires np.float32 input\n", "Z = np.float32(Z)\n", "\n", "# Stop after 10 iterations or when the centers move by less than 1.0\n", "criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)\n", "\n", "result = []\n", "title = []\n", "\n", "# Cluster the colors into 2, 4, 8, 16 and 32 groups\n", "for i in range(1, 6):\n", "    K = 2 ** i\n", "    ret, label, center = cv2.kmeans(Z, K, None,\n", "                                    criteria, 10,\n", "                                    cv2.KMEANS_RANDOM_CENTERS)\n", "\n", "    # Replace every pixel with the center of its cluster\n", "    center = np.uint8(center)\n", "    cluster = center[label.flatten()]\n", "    cluster = cluster.reshape((img.shape))\n", "\n", "    title.append(str(K)+\"-clustering\")\n", "    result.append(cluster)\n", "\n", "\n", "titles = ['Original_img'] + title\n", "images = [img] + result\n", "\n", "plt.figure(figsize=(16, 8))\n", "for i in range(6):\n", "    plt.subplot(2, 3, i+1)\n", "    plt.imshow(images[i])\n", "    plt.title(titles[i])\n", "    plt.axis(\"off\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "eWxMVi_BKTEk" }, "source": [ "[(back...)](#The-Advanced-of-Image-Preprocessing)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "AtOq8JhbKTEk" }, "outputs": [], "source": [] } ], "metadata": { "colab": { "provenance": [] }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.12" } }, "nbformat": 4, "nbformat_minor": 4 }