{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": { "id": "WUZm-Jbl3V6n" }, "source": [ "## $\\Large{Data\\; Visualization \\;(part1)}$\n", "\n", "在Pandas教學中最後我們已經有簡單說明資料視覺化(Data Visualization)的重要性以及作了基本的練習,然而在Pandas中我們只簡單地教了大家如何繪製基本的統計圖表,但是若我們需要客製化圖表中的內容,例如增加圖形標題、在圖中增加文字、或是呈現許多子圖應該要怎麼作呢? 在資料視覺化章節的第一個部分我們將會教大家python內最廣泛被使用到的基礎繪圖套件matplotlib,雖然我們需要一步一步設定繪圖的內容而無法快速繪製完成,但這也代表任何我們想增加的內容都可客製化地作調整,另外,無論是先前的pandas以及後續會教的高階繪圖套件seaborn底層都是使用matplotlib,接下來就讓我們來看一下如何使用matplotlib吧。" ] }, { "cell_type": "markdown", "metadata": { "id": "CIKdmwomyjWG" }, "source": [ "### 本章節內容大綱\n", "* [事前準備](#事前準備)\n", " - [載入matplotlib套件與其他相關套件](#載入matplotlib套件與其它相關套件)\n", " - [繪圖前的客製化設定](#繪圖前的客製化設定)\n", "* [繪製你的第一張圖](#繪製你的第一張圖)\n", " - [基本繪圖模組](#基本繪圖模組)\n", " - [設定x軸、y軸標題與主標題](#設定x軸、y軸標題與主標題)\n", " - [設定色彩與線條類型](#設定色彩與線條類型)\n", " - [設定x軸與y軸範圍](#設定x軸與y軸範圍)\n", " - [加入文字與標註](#加入文字與標註)\n", " - [增加圖表說明](#增加圖表說明)\n", "* [物件式導向繪圖方式](#物件式導向繪圖方式)\n", "* [設定子圖](#設定子圖)\n", "* [常見圖表繪製](#常見圖表繪製)\n", " - [直方圖](#直方圖)\n", " - [盒型圖](#盒型圖)\n", " - [長條圖](#長條圖)\n", " - [圓餅圖](#圓餅圖)\n", " - [散佈圖](#散佈圖)" ] }, { "cell_type": "markdown", "metadata": { "id": "w-gWXcrEyjWI" }, "source": [ "\n", "## 事前準備" ] }, { "cell_type": "markdown", "metadata": { "id": "_v5879jtyjWK" }, "source": [ "\n", "- ### 載入matplotlib套件與其它相關套件" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "7U0ZWKjUyjWL" }, "outputs": [], "source": [ "# 載入matplotlib中的pypplot模組並且命名為plt\n", "import matplotlib.pyplot as plt\n", "\n", "# 由於繪圖需要資料,在此同時載入numpy套件\n", "import numpy as np\n", "\n", "# 額外載入rcParams函數協助我們進行繪圖的設定\n", "from matplotlib import rcParams\n", "\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": { "id": "IIRrrQt-yjWN" }, "source": [ "\n", "- ### 繪圖前的客製化設定\n", "\n", "在作資料視覺化時依照需求以及環境不同我們可能會希望預先調整一些設定使得後續畫圖較為美觀,例如預先更改字體大小、線條寬度、以及設定繪圖區域的大小,此時我們可以使用rcParams去作相關的設定更改。另外我們也可以透過設定style調整繪圖的風格。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "NyB-fAfRyjWO" }, "outputs": [], "source": [ "# 以rcParams設定圖的大小,預設值為(6.4, 4.8),單位為英吋inches\n", "rcParams['figure.figsize'] = (8, 6)\n", "\n", "# 以rcParams設定字體大小,預設值為10\n", "rcParams['font.size'] = 14\n", "\n", "# 設定style為ggplot風格\n", "plt.style.use('ggplot')" ] }, { "cell_type": "markdown", "metadata": { "id": "SWkuVB2iyjWP" }, "source": [ "---\n", "\n", "## 繪製你的第一張圖" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "kavYnSbjyjWQ" }, "outputs": [], "source": [ "# 製作要繪圖所使用的資料\n", "x = np.arange(start=0, stop=10, step=0.1)\n", "y = np.sin(x)" ] }, { "cell_type": "markdown", "metadata": { "id": "bz-pK8VSyjWS" }, "source": [ "\n", "- ### 基本繪圖模組\n", "\n", "在最簡單的狀況下,只要使用plt.plot就可以快速地繪製線圖。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "dM2sPk5VyjWT" }, "outputs": [], "source": [ "# 使用plt.plot繪圖\n", "plt.plot(x, y)\n", "\n", "# 使用plt.show()呈現這張圖\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "Ws8WOT65yjWT" }, "source": [ "\n", "- ### 設定x軸、y軸標題與主標題\n", "\n", "由於matplotlib是最底層的繪圖套件,因此我們可以彈性地增加我們需要的內容" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ss2_paXAyjWU" }, "outputs": [], "source": [ "# 設定x軸、y軸、以及主標題的內容\n", "plt.plot(x, y)\n", "plt.xlabel('x')\n", "plt.ylabel('sin(x)')\n", "plt.title('Example plot')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "TKWm4E0myjWU" }, "source": [ "\n", "- ### 設定色彩與線條類型\n", "在plt.plot中若我們想要更改線條的色彩與線條類型,我們可以在第三個參數位置以縮寫的方式進行設定(例如'b+'代表藍色的加號)。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "QNIXqUkeyjWV" }, "outputs": [], "source": [ "plt.plot(x, y, 'b+')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "HAbLUWhPyjWV" }, "source": [ "\n", "- ### 設定x軸與y軸範圍\n", "\n", "透過xlim與ylim,我們可以彈性調整圖形的呈現範圍" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "jC_zrSpzyjWW" }, "outputs": [], "source": [ "plt.plot(x, y, 'b+')\n", "plt.xlim((1, 5))\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "ZVqeRSS9yjWW" }, "source": [ "\n", "- ### 加入文字與標註\n", "\n", "使用text與annotate可以分別讓我們在圖上加入文字與標註" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "58K6lhbzyjWX" }, "outputs": [], "source": [ "# 繪製圖形\n", "plt.plot(x, y, 'b+')\n", "\n", "# 加入標註,需要指定標註位置以及箭號屬性\n", "plt.annotate('Start from here', xy=(0, np.sin(0)), xytext=(0+1, np.sin(0)-0.5),\n", " arrowprops=dict(facecolor='black'))\n", "\n", "# 加入文字,可以使用fontsize設定字體大小\n", "plt.text(x=3+1, y=np.sin(3), s='Set Text Here', fontsize=14)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "pTI59YCVyjWX" }, "source": [ "\n", "- ### 增加圖表說明\n", "\n", "若想要增加圖表說明,我們可以使用legend函數" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "BLXNs3AQyjWY" }, "outputs": [], "source": [ "x = np.arange(0, 10, step=0.1)\n", "sin_x = np.sin(x)\n", "cos_x = np.cos(x)\n", "\n", "# 繪製圖形,這次我們同時繪製Sin波與Cos波\n", "plt.plot(x, sin_x, 'b+')\n", "plt.plot(x, cos_x, 'r--')\n", "\n", "# 加入圖表說明,並透過loc與prop設定說明的位置與字體大小\n", "plt.legend(labels=['Sine Wave', 'Cosine Wave'],\n", " loc='upper center', prop={'size':12})\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "9TSqauaTyjWY" }, "source": [ "---\n", "\n", "## 物件式導向繪圖方式\n", "\n", "除了上述的畫圖方式外,matplotlib還有另外一種物件式導向的繪圖方式,在繪製多張圖形時非常適合使用這種方式建立你的圖形。\n", "\n", "在以下的範例我們除了畫一張基本的圖之外也嘗試著在圖上疊一張小圖,讓大家了解我們使用物件式導向繪圖時可以做到甚麼事情。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ZMYEYp33yjWZ" }, "outputs": [], "source": [ "x = np.arange(start=0, stop=10, step=0.1)\n", "y = np.sin(x)\n", "\n", "# 先建立一個圖的物件,可在此指定此張圖的相關參數\n", "fig = plt.figure(figsize=(4, 3))\n", "\n", "# 增加一個子區域,以及設定它在圖中的位置\n", "ax = fig.add_axes([0.1, 0.1, 0.9, 0.9])\n", "ax.plot(x, y)\n", "ax.set_title('Axes 1')\n", "ax.set_xlabel('x')\n", "ax.set_ylabel('sin(x)')\n", "\n", "# 再增加一個子區域,這次我們把位置設定在左下角\n", "ax2 = fig.add_axes([0.15, 0.15, 0.3, 0.3])\n", "ax2.plot(x, y, 'b--')\n", "\n", "# 將x軸與y軸的標記手動取代,讓畫面比較乾淨\n", "ax2.set_xticks([])\n", "ax2.set_yticks([])\n", "ax2.set_title('Axes 2')\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "yVKNKqkpyjWa" }, "source": [ "---\n", "\n", "## 設定子圖\n", "\n", "有時我們會需要同時觀察多張圖的資訊、或是將相關的圖形依照需求放置在同一區,雖然使用上面的方式自己設定子區域可以作到最細緻的客製化,但需要大量的程式碼才能作到這件事情。在matplotlib中也提供了另一個subbplot的功能,可以讓我們較方便的設定子圖的配置以及快速地畫出來。\n", "\n", "\n", "" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "VR5zNXjiyjWb" }, "outputs": [], "source": [ "# 設定一張圖中有 2 x 2 張子圖,目前畫第一張(將以先列後行的方式排序)\n", "plt.subplot(2, 2, 1)\n", "plt.plot(x, y)\n", "plt.title('subplot 1')\n", "\n", "# 切換到第二張子圖作繪製\n", "plt.subplot(2, 2, 2)\n", "plt.plot(x, y, 'g')\n", "plt.title('subplot 2')\n", "\n", "# 切換到第三張子圖作繪製\n", "plt.subplot(2, 2, 3)\n", "plt.plot(x, y, 'bo')\n", "plt.title('subplot 3')\n", "\n", "# 切換到第四張子圖作繪製\n", "plt.subplot(2, 2, 4)\n", "plt.plot(x, y, 'r--')\n", "plt.title('subplot 4')\n", "\n", "# 設定子圖時有可能標題或xy軸標籤會相互重疊,在此以tight_layout作自動的調整\n", "plt.tight_layout()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "V-1oBmFZyjWb" }, "source": [ "\n", "- ### 物件導向方式\n", "\n", "除了上述方式外,我們同樣可以使用物件導向的方式繪製子圖。我們現在來試試看同樣用另一個方式畫出上面的圖形。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "9wJaADvWyjWc" }, "outputs": [], "source": [ "# 先將子圖的配置設定好\n", "fig, ax = plt.subplots(nrows=2, ncols=2)\n", "\n", "# 在第一張圖上作繪製\n", "ax[0, 0].plot(x, y)\n", "ax[0, 0].set_title('subplot 1')\n", "\n", "# 在第二張圖上作繪製\n", "ax[0, 1].plot(x, y, 'g')\n", "ax[0, 1].set_title('subplot 2')\n", "\n", "# 在第三張圖上作繪製\n", "ax[1, 0].plot(x, y, 'bo')\n", "ax[1, 0].set_title('subplot 3')\n", "\n", "# 在第四張圖上作繪製\n", "ax[1, 1].plot(x, y, 'r--')\n", "ax[1, 1].set_title('subplot 4')\n", "\n", "plt.tight_layout()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "SVkwifRHyjWc" }, "source": [ "---\n", "\n", "## 常見圖表繪製\n", "\n", "上面我們畫的圖都是線圖或散佈圖,但matplotlib當然不只這樣! 接下來我們就來看如何畫一些常見的統計圖。" ] }, { "cell_type": "markdown", "metadata": { "id": "j8Nqk2wryjWc" }, "source": [ "\n", "- ### 直方圖" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "xspNeIaPyjWd" }, "outputs": [], "source": [ "# 產生要繪圖使用的資料\n", "normal_samples = np.random.normal(size=100)\n", "\n", "# 使用plt.hist繪製直方圖\n", "plt.hist(normal_samples)\n", "plt.show()\n", "\n", "# 我們也可以透過一些參數調整圖形的內容,例如只畫出直方圖的端點並且改為累積直方圖\n", "plt.hist(normal_samples, histtype='step', cumulative=True)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "I8tv_yA4yjWd" }, "source": [ "\n", "- ### 盒型圖" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "jXGqoVWhyjWd" }, "outputs": [], "source": [ "# 產生繪圖使用的資料\n", "dat = np.random.randn(200) ** 2\n", "plt.boxplot(dat)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "vaFmfXnryjWd" }, "source": [ "\n", "- ### 長條圖" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "gll_FcadyjWe" }, "outputs": [], "source": [ "# 產生繪圖要使用的資料\n", "labels = ['Physics', 'Chemistry', 'Literature', 'Math']\n", "x = [1, 2, 3, 4]\n", "height = [85, 82, 93, 67]\n", "\n", "# 依照資料畫出長條圖,並且設定顏色、寬度、與透明度\n", "plt.bar(x, height, color='blue', width=0.5, alpha=0.5)\n", "plt.xticks(x, labels)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "B5EUS2vzyjWe" }, "source": [ "\n", "- ### 圓餅圖" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "g4mVZfwGyjWe" }, "outputs": [], "source": [ "# 產生繪圖要使用的資料\n", "labels = ['Banana', 'Cherry', 'Apple', 'Orange']\n", "frac = [0.2, 0.1, 0.4, 0.3]\n", "\n", "plt.pie(frac, labels=labels)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "7biqgbx-yjWf" }, "source": [ "\n", "- ### 散佈圖" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "92PtyxxnyjWf" }, "outputs": [], "source": [ "# 產生繪圖要使用的資料\n", "x = np.arange(1, 10, step=0.1)\n", "y = 3 + x*x + np.random.randn(x.shape[0]) * 10\n", "\n", "# 畫出x與y的散佈圖,並且設定顏色和透明度\n", "plt.scatter(x, y, c='blue', alpha=0.5)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "ODXf4dzkyjWf" }, "source": [ "### 想挑戰更多 matplotlib 的圖形可參考 [python-matplotlib-plotting-examples-and-exercises](http://gree2.github.io/python/2015/04/10/python-matplotlib-plotting-examples-and-exercises)" ] } ], "metadata": { "colab": { "name": "Data_Visualization_part1.ipynb", "provenance": [] }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.10" } }, "nbformat": 4, "nbformat_minor": 0 }