{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ggFUZOmm3V8H"
      },
      "source": [
        "## $\\Large{Pandas\\; Exercises}$"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "6KNEKLNu3V8H"
      },
      "outputs": [],
      "source": [
        "import pandas as pd\n",
        "import numpy as np"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "oTwSt8wsthv7"
      },
      "source": [
        "## Sample Data"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "rac-V0Zzthv7"
      },
      "outputs": [],
      "source": [
        "data = {'animal': ['cat', 'cat', 'snake', 'dog', 'dog', 'cat', 'snake', 'cat', 'dog', 'dog'],\n",
        "        'age': [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3],\n",
        "        'visits': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],\n",
        "        'priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']}"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "vpvu3q3Tthv8"
      },
      "source": [
        "## Exercise 1\n",
        "Create a DataFrame from the dictionary above and name it `df`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "q13U4OLZthv8"
      },
      "outputs": [],
      "source": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "hMG5gNlmthv8"
      },
      "source": [
        "## Exercise 2\n",
        "Use `describe` to display summary statistics for `df`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "M0xW_f1Othv9"
      },
      "outputs": [],
      "source": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "V-tlQJKbthv9"
      },
      "source": [
        "## Exercise 3\n",
        "Select the `animal` and `priority` columns from `df`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "gadybog6thv9"
      },
      "outputs": [],
      "source": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "gErXk1Ybthv9"
      },
      "source": [
        "## Exercise 4\n",
        "Use `.loc` to select the rows of `df` with index 3, 4, and 8 and the columns `animal` and `age`.\n",
        "\n",
        "Sample output\n",
        "<img src=\"https://i.imgur.com/L0ErJVI.png\"/>"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "QpvdTIlBthv-"
      },
      "outputs": [],
      "source": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "lVHOq1Dwthv-"
      },
      "source": [
        "## Exercise 5\n",
        "Select all rows of `df` where the `age` column is not missing.\n",
        "\n",
        "Sample output\n",
        "<img src=\"https://i.imgur.com/6zcVSOn.png\"/>"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "i6wQUEOothv-"
      },
      "outputs": [],
      "source": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "fcsHpO2Rthv-"
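      },
      "source": [
        "## Exercise 6\n",
        "Continuing from the previous exercise, after dropping the rows with a missing `age`, sort `df` by the `age` column in ascending order."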
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "a04ZN9Jxthv_"
      },
      "outputs": [],
      "source": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "SFMiQXa3thv_"
      },
      "source": [
        "## Exercise 7\n",
        "Find the maximum value of `age` in `df`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "A4fEjIYmthv_"
      },
      "outputs": [],
      "source": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "4d45AtFJthv_"
      },
      "source": [
        "## Exercise 8\n",
        "Use the `groupby` method to group by the `priority` column and compute the mean of `visits`.\n",
        "\n",
        "Sample output\n",
        "<img src=\"https://i.imgur.com/yNLji6n.png\"/>"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "8a5d0qaXthv_"
      },
      "outputs": [],
      "source": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "jkbSaGPmthv_"
      },
      "source": [
        "## Exercise 9\n",
        "Draw a bar chart of the `animal` column in `df`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "f1yi6iUFthv_"
      },
      "outputs": [],
      "source": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "rJx7yq89thwA"
      },
      "source": [
        "## Exercise 10\n",
        "Plot the probability density function of the `age` column in `df`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "1sCHxph_thwA"
      },
      "outputs": [],
      "source": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ECE333NPthwA"
      },
      "source": [
        "---\n",
        "## Sample Data 2\n",
        "Data path: 'https://github.com/TA-aiacademy/course_3.0/releases/download/Python/airline.csv'"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "JuT8pRt-thwA"
      },
      "source": [
        "## Exercise 11\n",
        "\n",
        "Read the CSV file below into a pandas DataFrame and name it `df`. The file includes a header, and its first column is the index."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "WGDNBGrfthwA"
      },
      "outputs": [],
      "source": [
        "csv_path = 'https://github.com/TA-aiacademy/course_3.0/releases/download/Python/airline.csv'"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "yweE_yGGthwA"
      },
      "outputs": [],
      "source": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "xGABsDSlthwA"
      },
      "source": [
        "## Exercise 12\n",
        "Continuing from the previous exercise, randomly select 10 rows from `df` and print them."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "0afKE3rxthwB"
      },
      "outputs": [],
      "source": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "DtL3Ty_3thwB"
      },
      "source": [
        "## Exercise 13\n",
        "\n",
        "Remove the rows of `df` whose values are exact duplicates, and store the result back in `df`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "3Ti0f8aUthwB"
      },
      "outputs": [],
      "source": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "c7qxmXU0thwB"
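      },
      "source": [
        "## Exercise 14\n",
        "\n",
        "Print the number of missing values in each column of `df`, then remove the rows that contain missing values and store the result back in `df`."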
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "h1W0MtRothwB"
      },
      "outputs": [],
      "source": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "esS_1VhpthwB"
      },
      "source": [
        "## Exercise 15\n",
        "\n",
        "Replace the `src_airport` value in the 89th row of `df` (index=199) with `SFO`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "cs4Jt5lUthwB"
      },
      "outputs": [],
      "source": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "BqHy0Yq7thwB"
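      },
      "source": [
        "## Exercise 16\n",
        "\n",
        "Make a copy of `df` named `df2`, so that `df2` is a new table identical to the original.\n",
        "\n",
        "Note: try making some changes to `df`. If `df2` changes as well, you have not successfully created an independent copy."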
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "qkL3M6xethwC"
      },
      "outputs": [],
      "source": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "8CND-I1othwC"
      },
      "source": [
        "## Exercise 17\n",
        "Renumber the index of `df` so that it is consecutive (0, 1, 2, ...), keep the old index as an `index` column, and name the resulting table `df_new`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "7WZl6bwfthwC"
      },
      "outputs": [],
      "source": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "OvonmBFxthwC"
      },
      "source": [
        "## Exercise 18\n",
        "\n",
        "Convert `df` to a NumPy array and name it `df_np`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Xq-xuRYAthwC"
      },
      "outputs": [],
      "source": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Mg6ZFLI_thwC"
      },
      "source": [
        "### See [100-pandas-puzzles](https://github.com/ajcr/100-pandas-puzzles/blob/master/100-pandas-puzzles.ipynb) for more pandas data manipulation practice\n",
        "\n",
        "Many other resources are also available for practicing pandas or for reference.\n",
        "* [10 minutes to pandas](http://pandas.pydata.org/pandas-docs/stable/10min.html)\n",
        "* [pandas basics](http://pandas.pydata.org/pandas-docs/stable/basics.html)\n",
        "* [tutorials](http://pandas.pydata.org/pandas-docs/stable/tutorials.html)\n",
        "* [cookbook and idioms](http://pandas.pydata.org/pandas-docs/version/0.17.0/cookbook.html#cookbook)\n",
        "* [Guilherme Samora's pandas exercises](https://github.com/guipsamora/pandas_exercises)"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.7.3"
    },
    "colab": {
      "provenance": []
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}