{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "wGp_eGfBKTEW"
},
"source": [
"# Advanced Image Preprocessing <a id=\"The-Advanced-of-Image-Preprocessing\"></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "tv3vNe-kKTEd"
},
"source": [
"### Chapter Outline\n",
"* [Image Binarization](#影像二值化)\n",
"* [Finding Image Contours with cv2.findContours](#用-cv2.findContours-找影像輪廓)\n",
"* [K-Means Clustering in OpenCV](#K-Means-Clustering-in-OpenCV)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "hYExLJCEKTEd"
},
"outputs": [],
"source": [
"# OpenCV's Python module is named cv2\n",
"import cv2\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "rZ65aUaxN4Mg"
},
"outputs": [],
"source": [
"# download and unzip the course data\n",
"!wget -q https://github.com/TA-aiacademy/course_3.0/releases/download/CVCNN_Data/CVCNN_part1.zip\n",
"!unzip -q CVCNN_part1.zip"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "5lWbDqm3KTEe"
},
"outputs": [],
"source": [
"# cv2.imread returns BGR; reverse the channel axis to get RGB for matplotlib\n",
"image = cv2.imread(\"aia_logo.png\")[:, :, ::-1]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "2GEhWeZLKTEe"
},
"outputs": [],
"source": [
"plt.imshow(image)\n",
"plt.axis(\"off\")\n",
"plt.show()"
]
},
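{
"cell_type": "markdown",
"metadata": {
"id": "J4h25mL-KTEf"
},
"source": [
"* ### Image Binarization <a id=\"影像二值化\"></a>\n",
" -- A binary image is the simplest image representation: each pixel holds one of two values, 0 (black) or 1 (white).  \n",
" -- A grayscale image can be converted to black and white with cv2.threshold.  \n",
" -- The function takes a threshold: pixels at or below it are set to black, and pixels above it are set to white."
]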
},
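{
"cell_type": "markdown",
"metadata": {},
"source": [
"The thresholding rule described above can be restated in a few lines of NumPy. This is only an illustrative sketch, not OpenCV's implementation: `binary_threshold` is a hypothetical helper name and the 2x3 array is made-up sample data. It assumes the `cv2.THRESH_BINARY` rule, in which pixels strictly greater than the threshold receive `maxval` and all others become 0."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"\n",
"def binary_threshold(gray, thresh, maxval):\n",
"    # Hypothetical helper mirroring the THRESH_BINARY rule:\n",
"    # pixels strictly above `thresh` become `maxval`, the rest become 0.\n",
"    return np.where(gray > thresh, maxval, 0).astype(np.uint8)\n",
"\n",
"\n",
"# Made-up 2x3 grayscale sample, for illustration only.\n",
"sample = np.array([[0, 127, 128],\n",
"                   [200, 201, 255]], dtype=np.uint8)\n",
"\n",
"binary_127 = binary_threshold(sample, 127, 1)\n",
"binary_200 = binary_threshold(sample, 200, 1)\n",
"print(binary_127)\n",
"print(binary_200)"
]
},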
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Zk1ZDzV6KTEf"
},
"outputs": [],
"source": [
"gray_img = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "KjZBFH31KTEf"
},
"outputs": [],
"source": [
"# pixels <= 127 become black (0); all others become 1\n",
"_, thresh1 = cv2.threshold(gray_img, 127, 1, cv2.THRESH_BINARY)\n",
"\n",
"# pixels <= 200 become black (0); all others become 1\n",
"_, thresh2 = cv2.threshold(gray_img, 200, 1, cv2.THRESH_BINARY)\n",
"\n",
"\n",
"img_list = [gray_img, thresh1, thresh2]\n",
"title = ['Gray Image', 'Threshold: 127', 'Threshold: 200']\n",
"\n",
"plt.figure(figsize=(12, 12))\n",
"for i, each in enumerate(img_list):\n",
"    plt.subplot(1, 3, i+1)\n",
"    plt.imshow(each, cmap='gray')\n",
"    plt.title(title[i], fontsize=15)\n",
"    plt.axis(\"off\")\n",
"plt.show()"
]
},
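{
"cell_type": "markdown",
"metadata": {
"id": "G_XZZ0dzKTEg"
},
"source": [
"- Setting a threshold separates a grayscale image cleanly into black and white. The cells below show how the different threshold types behave; compare each result with the original image."
]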
},
{
"cell_type": "markdown",
"metadata": {
"id": "ZNAmpGy7T1q2"
},
"source": [
"* #### Global Thresholding"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "AiVUyRubKTEg"
},
"outputs": [],
"source": [
"image = cv2.imread('bw.jpg', cv2.IMREAD_GRAYSCALE)\n",
"(h, w) = image.shape[:2]\n",
"center = (w/2, h/2)\n",
"# rotate 270 degrees about the image center and scale by 2\n",
"M = cv2.getRotationMatrix2D(center, 270, 2.0)\n",
"rotated = cv2.warpAffine(image, M, (w, h))\n",
"img = rotated\n",
"\n",
"# apply each global threshold type with threshold 127 and maxval 255\n",
"_, thresh1 = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)\n",
"_, thresh2 = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY_INV)\n",
"_, thresh3 = cv2.threshold(img, 127, 255, cv2.THRESH_TRUNC)\n",
"_, thresh4 = cv2.threshold(img, 127, 255, cv2.THRESH_TOZERO)\n",
"_, thresh5 = cv2.threshold(img, 127, 255, cv2.THRESH_TOZERO_INV)\n",
"\n",
"titles = ['Original Image', 'BINARY', 'BINARY_INV',\n",
"          'TRUNC', 'TOZERO', 'TOZERO_INV']\n",
"images = [img, thresh1, thresh2, thresh3, thresh4, thresh5]\n",
"\n",
"plt.figure(figsize=(12, 6))\n",
"for i in range(6):\n",
"    plt.subplot(2, 3, i+1)\n",
"    plt.imshow(images[i], 'gray')\n",
"    plt.title(titles[i])\n",
"    plt.axis(\"off\")\n",
"plt.show()"
]
},
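{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the differences between the five global threshold types concrete, the cell below restates each rule in NumPy. It is a sketch for reading alongside the plots, not OpenCV's own code: `threshold_modes` is a hypothetical helper, the dictionary keys simply echo the `cv2.THRESH_*` flag names, and the four-pixel row is made-up sample data."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"\n",
"def threshold_modes(src, thresh, maxval):\n",
"    # Hypothetical helper: NumPy restatement of the five global\n",
"    # threshold types, keyed by the cv2 flag name they correspond to.\n",
"    above = src > thresh\n",
"    return {\n",
"        'BINARY': np.where(above, maxval, 0),\n",
"        'BINARY_INV': np.where(above, 0, maxval),\n",
"        'TRUNC': np.where(above, thresh, src),\n",
"        'TOZERO': np.where(above, src, 0),\n",
"        'TOZERO_INV': np.where(above, 0, src),\n",
"    }\n",
"\n",
"\n",
"# Made-up one-row sample, for illustration only.\n",
"row = np.array([50, 127, 128, 200], dtype=np.uint8)\n",
"modes = threshold_modes(row, 127, 255)\n",
"for name, out in modes.items():\n",
"    print(name, out.tolist())"
]
},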
{
"cell_type": "markdown",
"metadata": {
"id": "dC4U6VnpKTEh"
},
"source": [
"[(back...)](#The-Advanced-of-Image-Preprocessing)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ZwYaE_V2T_1o"
},
"source": [
"* #### Adaptive Thresholding\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "YF804tzqKTEh"
},
"outputs": [],
"source": [
"image = cv2.imread('bicycle.jpg', cv2.IMREAD_GRAYSCALE)\n",
"\n",
"# global binarization (no blur / denoising)\n",
"ret, th1 = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY)\n",
"\n",
"# adaptive mean binarization (no blur); block size 11, constant C = 2\n",
"th2 = cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_MEAN_C,\n",
"                            cv2.THRESH_BINARY, 11, 2)\n",
"\n",
"# adaptive Gaussian binarization (no blur); block size 11, constant C = 2\n",
"th3 = cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,\n",
"                            cv2.THRESH_BINARY, 11, 2)\n",
"\n",
"titles = ['Original Image', 'Global Thresholding (v = 127)',\n",
"          'Adaptive Mean Thresholding', 'Adaptive Gaussian Thresholding']\n",
"images = [image, th1, th2, th3]\n",
"\n",
"for i in range(4):\n",
"    plt.subplot(2, 2, i+1)\n",
"    plt.imshow(images[i], 'gray')\n",
"    plt.title(titles[i])\n",
"    plt.xticks([])\n",
"    plt.yticks([])\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "vWrkSQ6VKTEh"
},
"outputs": [],
"source": [
"image = cv2.imread('bicycle.jpg', cv2.IMREAD_GRAYSCALE)\n",
"\n",
"# blur the image first to reduce noise\n",
"blur_img = cv2.medianBlur(image, 5)\n",
"\n",
"# global binarization (after blurring)\n",
"ret, th4 = cv2.threshold(blur_img, 127, 255, cv2.THRESH_BINARY)\n",
"\n",
"# adaptive binarization with the arithmetic mean (after blurring)\n",
"th5 = cv2.adaptiveThreshold(blur_img, 255, cv2.ADAPTIVE_THRESH_MEAN_C,\n",
"                            cv2.THRESH_BINARY, 11, 2)\n",
"\n",
"# adaptive binarization with the Gaussian-weighted mean (after blurring)\n",
"th6 = cv2.adaptiveThreshold(blur_img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,\n",
"                            cv2.THRESH_BINARY, 11, 2)\n",
"\n",
"titles = ['Blur Image', 'Global Thresholding (v = 127)',\n",
"          'Adaptive Mean Thresholding', 'Adaptive Gaussian Thresholding']\n",
"# show the blurred image and the results computed from it\n",
"images = [blur_img, th4, th5, th6]\n",
"\n",
"for i in range(4):\n",
"    plt.subplot(2, 2, i+1)\n",
"    plt.imshow(images[i], 'gray')\n",
"    plt.title(titles[i])\n",
"    plt.xticks([])\n",
"    plt.yticks([])\n",
"plt.show()"
]
},
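{
"cell_type": "markdown",
"metadata": {},
"source": [
"The idea behind adaptive mean thresholding can be sketched in NumPy: each pixel is compared with the mean of its own neighbourhood minus a constant, rather than with one global value. The cell below is a simplified illustration under stated assumptions, not a bit-exact copy of `cv2.adaptiveThreshold`: `adaptive_mean_threshold` is a hypothetical helper, border pixels are handled by replicating the edge, and OpenCV's rounding may differ. The 5x5 image is made-up sample data."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from numpy.lib.stride_tricks import sliding_window_view\n",
"\n",
"\n",
"def adaptive_mean_threshold(gray, maxval, block_size, c):\n",
"    # Hypothetical helper: threshold each pixel against the mean of\n",
"    # its block_size x block_size neighbourhood minus the constant c.\n",
"    pad = block_size // 2\n",
"    # Replicate edge pixels so border pixels also get a full window.\n",
"    padded = np.pad(gray.astype(np.float64), pad, mode='edge')\n",
"    windows = sliding_window_view(padded, (block_size, block_size))\n",
"    local_mean = windows.mean(axis=(-2, -1))\n",
"    return np.where(gray > local_mean - c, maxval, 0).astype(np.uint8)\n",
"\n",
"\n",
"# Made-up 5x5 sample: uniform background with one dark pixel.\n",
"flat = np.full((5, 5), 100, dtype=np.uint8)\n",
"flat[2, 2] = 50\n",
"adaptive = adaptive_mean_threshold(flat, 255, 3, 2)\n",
"print(adaptive)"
]
},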
{
"cell_type": "markdown",
"metadata": {
"id": "sp80rF8BKTEj"
},
"source": [
"* ### Finding Image Contours with cv2.findContours <a id=\"用-cv2.findContours-找影像輪廓\"></a>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "4U5kgf8WKTEj"
},
"outputs": [],
"source": [
"image = cv2.imread('poker.jpg')[:, :, ::-1]\n",
"gray_img = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)\n",
"new_image = image.copy()\n",
"\n",
"# threshold image\n",
"ret, thresh = cv2.threshold(gray_img, 127, 255, cv2.THRESH_BINARY)\n",
"# find contours and get the external one\n",
"contours, hier = cv2.findContours(thresh, cv2.RETR_EXTERNAL,\n",
"                                  cv2.CHAIN_APPROX_SIMPLE)\n",
"\n",
"\n",
"cv2.drawContours(new_image, contours, -1, (255, 255, 0), 3)\n",
"\n",
"titles = ['Original_image', 'Threshed_image', 'DrawContours']\n",
"images = [image, thresh, new_image]\n",
"\n",
"\n",
"plt.figure(figsize=(18, 9))\n",
"for i in range(3):\n",
"    plt.subplot(2, 3, i+1)\n",
"    plt.imshow(images[i], 'gray')\n",
"    plt.title(titles[i])\n",
"    plt.axis(\"off\")\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "gVmdpxM0KTEj"
},
"outputs": [],
"source": [
"img = cv2.imread('poker.jpg', cv2.IMREAD_UNCHANGED)\n",
"img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)\n",
"img_g = cv2.GaussianBlur(img, (15, 15), 5)\n",
"\n",
"# one drawing canvas per (shape, source) pair;\n",
"# source 0 = original image, source 1 = blurred image\n",
"shapes = ['Box', 'Box_2', 'Circle', 'Contours']\n",
"canvases = {(s, a): img.copy() for s in shapes for a in range(2)}\n",
"\n",
"for a, source in enumerate([img, img_g]):\n",
"    # binarize the image\n",
"    gray = cv2.cvtColor(source, cv2.COLOR_RGB2GRAY)\n",
"    ret, threshed_img = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)\n",
"\n",
"    # find the boundaries in the binarized image\n",
"    contours, hier = cv2.findContours(threshed_img,\n",
"                                      cv2.RETR_TREE,\n",
"                                      cv2.CHAIN_APPROX_SIMPLE)\n",
"\n",
"    # fit the smallest suitable bounding shape along each boundary\n",
"    # yellow: the contour itself\n",
"    # green: smallest upright (non-rotated) rectangle\n",
"    # blue: smallest rotated rectangle\n",
"    # red: smallest enclosing circle\n",
"\n",
"    for c in contours:\n",
"        # group 1: green upright bounding rectangle\n",
"        x, y, w, h = cv2.boundingRect(c)\n",
"        cv2.rectangle(canvases['Box', a], (x, y),\n",
"                      (x+w, y+h), (0, 255, 0), 3)\n",
"\n",
"        # group 2: blue minimum-area (rotated) rectangle\n",
"        rect = cv2.minAreaRect(c)\n",
"        box = cv2.boxPoints(rect)\n",
"\n",
"        # convert the corners from float to int\n",
"        # (np.int0 was removed in NumPy 2.0)\n",
"        box = box.astype(np.int32)\n",
"        cv2.drawContours(canvases['Box_2', a], [box],\n",
"                         0, (0, 0, 255), 3)\n",
"\n",
"        # group 3: red minimum enclosing circle\n",
"        (x, y), radius = cv2.minEnclosingCircle(c)\n",
"        center = (int(x), int(y))\n",
"        radius = int(radius)\n",
"        cv2.circle(canvases['Circle', a],\n",
"                   center, radius, (255, 0, 0), 3)\n",
"\n",
"    # group 4: all contours in yellow\n",
"    cv2.drawContours(canvases['Contours', a],\n",
"                     contours, -1, (255, 255, 0), 3)\n",
"\n",
"\n",
"titles = ['Box', 'Box_blur', 'Box_2', 'Box_2_blur',\n",
"          'Circle', 'Circle_blur', 'Contours', 'Contours_blur']\n",
"images = [canvases[s, a] for s in shapes for a in range(2)]\n",
"plt.figure(figsize=(20, 20))\n",
"for i in range(8):\n",
"    plt.subplot(4, 2, i+1)\n",
"    plt.imshow(images[i])\n",
"    plt.title(titles[i])\n",
"    plt.axis(\"off\")\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "RWHSFuHzKTEj"
},
"source": [
"[(back...)](#The-Advanced-of-Image-Preprocessing)"
]
},
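{
"cell_type": "markdown",
"metadata": {
"id": "x9shdDghKTEk"
},
"source": [
"* ## K-Means Clustering in OpenCV\n",
"Images are usually stored with three channels (RGB), each holding a value from 0 to 255 (8 bits).  \n",
"A single pixel therefore takes 8*3=24 bits to store.  \n",
"K-Means is an effective way to reduce the storage an image needs.  \n",
"We cluster all the colors in the image and represent every point in a cluster by that cluster's center.  \n",
"With the colors grouped into $2^5=32$ clusters, a pixel can be represented with only 5 bits.  \n",
"Compared with the original 24 bits per pixel, that is roughly a fivefold saving (24/5 = 4.8, ignoring the small table of 32 cluster centers)."
]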
},
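{
"cell_type": "markdown",
"metadata": {},
"source": [
"The storage arithmetic above can be checked with a short calculation. The cell below is a rough sketch under stated assumptions: `storage_bits` is a hypothetical helper, the 1000x1000 image size is made up, and the quantized size counts only the per-pixel cluster index plus a palette of `k` RGB centers, ignoring any file-format overhead."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def storage_bits(n_pixels, k, bits_per_channel=8, channels=3):\n",
"    # Hypothetical helper comparing raw RGB storage with storage as\n",
"    # per-pixel indices into a palette of k cluster centers.\n",
"    raw = n_pixels * bits_per_channel * channels\n",
"    index_bits = (k - 1).bit_length()  # bits needed per palette index\n",
"    palette = k * bits_per_channel * channels\n",
"    quantized = n_pixels * index_bits + palette\n",
"    return raw, quantized\n",
"\n",
"\n",
"# Made-up image size, for illustration only.\n",
"raw, quantized = storage_bits(1000 * 1000, 32)\n",
"ratio = raw / quantized\n",
"print(raw, quantized, round(ratio, 2))"
]
},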
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "_ub6Geu5KTEk"
},
"outputs": [],
"source": [
"img = cv2.imread('flower2.jpg')[:, :, ::-1]\n",
"Z = img.reshape((-1, 3))\n",
"\n",
"# convert to np.float32\n",
"Z = np.float32(Z)\n",
"\n",
"# define criteria, number of clusters(K) and apply kmeans()\n",
"criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)\n",
"\n",
"result = []\n",
"title = []\n",
"\n",
"# panels: original image, then K = 2, 4, 8, 16, 32 clusters\n",
"for i in range(1, 6):\n",
"    K = 2 ** i\n",
"    ret, label, center = cv2.kmeans(Z, K, None,\n",
"                                    criteria, 10,\n",
"                                    cv2.KMEANS_RANDOM_CENTERS)\n",
"\n",
"    # replace every pixel with the center of its cluster\n",
"    center = np.uint8(center)\n",
"    cluster = center[label.flatten()]\n",
"    cluster = cluster.reshape((img.shape))\n",
"\n",
"    title.append(str(K)+\"-clustering\")\n",
"    result.append(cluster)\n",
"\n",
"\n",
"titles = ['Original_img'] + title\n",
"images = [img] + result\n",
"\n",
"plt.figure(figsize=(16, 8))\n",
"for i in range(6):\n",
"    plt.subplot(2, 3, i+1)\n",
"    plt.imshow(images[i])\n",
"    plt.title(titles[i])\n",
"    plt.axis(\"off\")\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "eWxMVi_BKTEk"
},
"source": [
"[(back...)](#The-Advanced-of-Image-Preprocessing)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "AtOq8JhbKTEk"
},
"outputs": [],
"source": []
}
],
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.12"
}
},
"nbformat": 4,
"nbformat_minor": 4
}