{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/TA-aiacademy/course_3.0/blob/v2-5_gan/08_v2-5_GAN/Part4/01_Stylegan.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "maiBHYDsAycF"
      },
      "source": [
        "# StyleGAN"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "OhSQeY2tAycU"
      },
      "source": [
        "### 本章節內容大綱\n",
        "* [StyleGAN](#StyleGAN)\n",
        "* [StyleGAN in Anime dataset (by Gwern)](#StyleGAN-in-Anime-dataset-(by-Gwern))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "XKX-EhQAAycX"
      },
      "source": [
        "這個章節要來 demo 2019年初才釋出 weight 的 StyleGAN，本次的教學是由 https://github.com/NVlabs/stylegan clone下來的 weight 以及 code 再作一些修改，由於該模型是用 tf1.X 版本訓練的，故助教在這邊有修改一些版本的細節，學員如果要在本機端上clone github，記得使用tf 1.X版本來使用，或是將這份教材複製到本機端用tf2.0跑也是可以的。"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "xcgM4usGAyca"
      },
      "source": [
        "StyleGAN Generator的架構，可以理解為前面幾層是在勾勒輪廓，後面是在畫精細的細節。\n",
        "\n",
        "<img src=\"https://hackmd.io/_uploads/HJp5gETga.jpg\" width=500  />\n",
        "\n",
        "\n",
        "\n",
        "StyleGAN 承襲了 ProgressiveGAN 的 Discriminator，基本上也是用PatchGAN的概念。\n",
        "\n",
        "<img src=\"https://hackmd.io/_uploads/SkMpg4Tx6.png\" width=500  />"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "YM7J-tpdAycj"
      },
      "outputs": [],
      "source": [
        "# 上傳資料\n",
        "!wget -q https://github.com/TA-aiacademy/course_3.0/releases/download/v2.5_gan/GAN_part4.zip\n",
        "!unzip -q GAN_part4.zip"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "-U9UUylSAycd"
      },
      "outputs": [],
      "source": [
        "import os\n",
        "import pickle\n",
        "import numpy as np\n",
        "import PIL.Image\n",
        "import dnnlib\n",
        "import dnnlib.tflib as tflib\n",
        "import matplotlib.pyplot as plt\n",
        "\n",
        "import imageio\n",
        "import glob\n",
        "from IPython.display import display, Image\n",
        "import cv2"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "cDGJ7MauAych"
      },
      "source": [
        "### 讀入產生高畫質人臉圖片的權重"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "E_r2M3LDAycl"
      },
      "outputs": [],
      "source": [
        "url = 'cache/2019stylegan-ffhq-1024x1024_mod.pkl'\n",
        "\n",
        "tflib.init_tf()\n",
        "with open(url, 'rb') as f:\n",
        "    _G, _D, Gs = pickle.load(f)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "J9A71hbmAycn"
      },
      "source": [
        "### 產生隨機的圖片"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "nuR445fJAycq"
      },
      "outputs": [],
      "source": [
        "# 隨機sample一組潛在向量(latent vector)來產生圖片\n",
        "rnd = np.random.RandomState(420)\n",
        "latents = rnd.randn(1, Gs.input_shape[1])"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "IEwYLMnUAyct"
      },
      "outputs": [],
      "source": [
        "# Generate image\n",
        "fmt = dict(func=tflib.convert_images_to_uint8, nchw_to_nhwc=True)\n",
        "images = Gs.run(latents, None, truncation_psi=0.7, randomize_noise=True, output_transform=fmt)\n",
        "\n",
        "plt.figure(figsize=(10,10))\n",
        "plt.imshow(images[-1])\n",
        "plt.show()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "lYjOmnuaAycw"
      },
      "source": [
        "### 試著一次改變vector中的一個element來看看有什麼變化吧"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Tfx7gYpuAycx"
      },
      "outputs": [],
      "source": [
        "# 先固定一個值都是 1 的向量\n",
        "latents = np.ones((1, Gs.input_shape[1]))"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "rjYp09h2Aycz"
      },
      "outputs": [],
      "source": [
        "# 產生圖片\n",
        "fmt = dict(func=tflib.convert_images_to_uint8, nchw_to_nhwc=True)\n",
        "images = Gs.run(latents, None, truncation_psi=0.7, randomize_noise=True, output_transform=fmt)\n",
        "\n",
        "plt.figure(figsize=(10, 10))\n",
        "plt.imshow(images[-1])\n",
        "plt.show()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": true,
        "jupyter": {
          "outputs_hidden": true
        },
        "id": "wiFINIRjAyc0"
      },
      "outputs": [],
      "source": [
        "save_path = './exp_img/lat_18' # 向量總共有 512 維，如果想要改變第18維就修改成 lat_18\n",
        "\n",
        "if not os.path.exists(save_path):\n",
        "    os.makedirs(save_path)\n",
        "\n",
        "ind = int(save_path.split('_')[-1])\n",
        "\n",
        "for i in np.arange(-15, 16, 0.5): # 每次改變 0.5 的值看看\n",
        "    latents[0][ind] = i\n",
        "    fmt = dict(func=tflib.convert_images_to_uint8, nchw_to_nhwc=True)\n",
        "    images = Gs.run(latents, None, truncation_psi=0.7, randomize_noise=True, output_transform=fmt)\n",
        "\n",
        "    plt.figure(figsize=(5, 5))\n",
        "    plt.imshow(images[-1])\n",
        "\n",
        "    plt.savefig(os.path.join(save_path, 'image_{:03f}.png'.format(i)))\n",
        "\n",
        "    plt.show()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "6WxsIcqsAyc2"
      },
      "source": [
        "下面的gif可以方便我們觀察改動不同的 element ，會對於 output 有什麼影響"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "JtaoxlzUAyc3"
      },
      "outputs": [],
      "source": [
        "# 使用imageio製作gif圖\n",
        "anim_file = save_path + '/anim.gif'\n",
        "\n",
        "with imageio.get_writer(anim_file, mode='I') as writer:\n",
        "\n",
        "    filenames = glob.glob(save_path + '/image*.png')\n",
        "#     filenames = sorted(filenames)\n",
        "    filenames.sort(key=lambda x: os.path.getmtime(x))\n",
        "\n",
        "    last = -1\n",
        "    for i, filename in enumerate(filenames):\n",
        "        frame = 2*(i**0.5)\n",
        "        if round(frame) > round(last):\n",
        "            last = frame\n",
        "        else:\n",
        "            continue\n",
        "        image = imageio.imread(filename)\n",
        "        writer.append_data(image)\n",
        "    image = imageio.imread(filename)\n",
        "    writer.append_data(image)\n",
        "\n",
        "display(Image(filename=anim_file))"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "aoGntZyCAyc6"
      },
      "outputs": [],
      "source": [
        "# change the first element in latent vector\n",
        "display(Image(filename='./exp_img/lat_0/anim.gif'))"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "FN9oCJlUAyc8"
      },
      "outputs": [],
      "source": [
        "# change the 256th element in latent vector\n",
        "display(Image(filename='./exp_img/lat_255/anim.gif'))"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "qAEVAvegAyc9"
      },
      "outputs": [],
      "source": [
        "# change the last element in latent vector\n",
        "display(Image(filename='./exp_img/lat_511/anim.gif'))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "gE7_06o9Ayc-"
      },
      "source": [
        "# 風格混合(style mixing)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "-rhc2716Ayc_"
      },
      "source": [
        "StyleGAN 不只像是一般的 GAN 能隨機生成一張逼真的圖片，因為它一層層疊加的結構，讓它有辦法可以做各種不同細緻度的風格轉換。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "osQWl_BTAydA"
      },
      "outputs": [],
      "source": [
        "def draw_style_mixing_figure(png, Gs, w, h, src_seeds, dst_seeds, style_ranges):\n",
        "    print(png)\n",
        "    src_latents = np.stack(np.random.RandomState(seed).randn(Gs.input_shape[1]) for seed in src_seeds)\n",
        "    dst_latents = np.stack(np.random.RandomState(seed).randn(Gs.input_shape[1]) for seed in dst_seeds)\n",
        "    src_dlatents = Gs.components.mapping.run(src_latents, None)  # [seed, layer, component]\n",
        "    dst_dlatents = Gs.components.mapping.run(dst_latents, None)  # [seed, layer, component]\n",
        "    src_images = Gs.components.synthesis.run(src_dlatents, randomize_noise=False, **synthesis_kwargs)\n",
        "    dst_images = Gs.components.synthesis.run(dst_dlatents, randomize_noise=False, **synthesis_kwargs)\n",
        "\n",
        "    canvas = PIL.Image.new('RGB', (w * (len(src_seeds) + 1), h * (len(dst_seeds) + 1)), 'white')\n",
        "    for col, src_image in enumerate(list(src_images)):\n",
        "        canvas.paste(PIL.Image.fromarray(src_image, 'RGB'), ((col + 1) * w, 0))\n",
        "    for row, dst_image in enumerate(list(dst_images)):\n",
        "        canvas.paste(PIL.Image.fromarray(dst_image, 'RGB'), (0, (row + 1) * h))\n",
        "        row_dlatents = np.stack([dst_dlatents[row]] * len(src_seeds))\n",
        "        row_dlatents[:, style_ranges[row]] = src_dlatents[:, style_ranges[row]]\n",
        "        row_images = Gs.components.synthesis.run(row_dlatents, randomize_noise=False, **synthesis_kwargs)\n",
        "        for col, image in enumerate(list(row_images)):\n",
        "            canvas.paste(PIL.Image.fromarray(image, 'RGB'), ((col + 1) * w, (row + 1) * h))\n",
        "    canvas.save(png)\n",
        "\n",
        "synthesis_kwargs = dict(output_transform=dict(func=tflib.convert_images_to_uint8, nchw_to_nhwc=True),\n",
        "                        minibatch_size=8)\n",
        "\n",
        "result_dir = 'results'\n",
        "if not os.path.exists(result_dir):\n",
        "    os.makedirs(result_dir)\n",
        "\n",
        "draw_style_mixing_figure(os.path.join(result_dir, 'style-mixing-human8.png'), Gs, w=1024, h=1024,\n",
        "                         src_seeds=[639, 701, 687, 615, 2268],\n",
        "                         dst_seeds=[888, 829, 1898, 1733, 1614, 845, 1450, 2266],\n",
        "                         style_ranges=[range(0, 4)]*2+[range(4, 8)]*2+[range(8, 12)]*2+[range(12, 16)]*2)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "shWs4hY2AydB"
      },
      "source": [
        "### 轉換結果\n",
        "下圖的第一個 row 是隨機產生的來源圖片(source image)，第一個 column 是隨機產生的目標圖片(destination image)，透過將目標圖片的部分層的「中層潛在向量」(intermediate latent vector)替換成來源圖片的向量層，就可以達到變換風格的效果。下面的例子是以兩個 row 為一單位，每單位分別是變換第 0-3 層的向量、4-7 層...到第 15 層，每四個層去取代的結果，可以發現前幾層的改變幅度很大，會把整個臉型跟面向都改成另一個風格，然而後面幾層可能開始只改變五官、到最後幾層只改變整個色調細節而已。"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Wrm0vfDrAydC"
      },
      "outputs": [],
      "source": [
        "mix_img = cv2.imread('results/style-mixing-human8.png')\n",
        "plt.figure(figsize=(15, 25))\n",
        "plt.imshow(cv2.cvtColor(mix_img, cv2.COLOR_BGR2RGB))\n",
        "plt.show()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "gHrep_GNAydD"
      },
      "source": [
        "# StyleGAN in Anime dataset (by Gwern)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "oJSb5nsHAydE"
      },
      "source": [
        "看到 Nvidia 釋出如此強大的模型，各路大神也紛紛來試玩看看，而這位 Gwern 用爬蟲抓了一堆動漫的角色圖，前處理後丟進模型訓練，下面是他釋出的pre-train weight，有興趣的學員也可以玩玩看，連結如下:https://www.gwern.net/Faces#"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "hwclZ5LZAydF"
      },
      "outputs": [],
      "source": [
        "url = 'cache/2019-04-30-stylegan-danbooru2018-portraits-02095-066083_mod.pkl'\n",
        "\n",
        "tflib.init_tf()\n",
        "with open(url, 'rb') as f:\n",
        "    _G, _D, Gs = pickle.load(f)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "aX2nHNJtAydF"
      },
      "outputs": [],
      "source": [
        "# 產生 0 向量\n",
        "latents = np.zeros((1, Gs.input_shape[1]))"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "EMkf7ntYAydG"
      },
      "outputs": [],
      "source": [
        "# Generate image.\n",
        "fmt = dict(func=tflib.convert_images_to_uint8, nchw_to_nhwc=True)\n",
        "images = Gs.run(latents, None, truncation_psi=0.5, randomize_noise=True, output_transform=fmt)\n",
        "\n",
        "plt.figure(figsize=(10,10))\n",
        "plt.imshow(images[-1])\n",
        "plt.show()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "BPRg9FT4AydI"
      },
      "outputs": [],
      "source": [
        "save_path = './exp_img_anim/lat_255' # 向量總共有 512 維，如果想要改變第255維就修改成 lat_255\n",
        "\n",
        "plt.ioff() # 用這個 method 就能不要把圖 plot 出來\n",
        "\n",
        "if not os.path.exists(save_path):\n",
        "    os.makedirs(save_path)\n",
        "\n",
        "\n",
        "ind = int(save_path.split('_')[-1])\n",
        "\n",
        "for i in np.arange(-0.004,0.0041,0.0001):\n",
        "    latents[0][ind] = i\n",
        "    fmt = dict(func=tflib.convert_images_to_uint8, nchw_to_nhwc=True)\n",
        "    images = Gs.run(latents, None, truncation_psi=0.7, randomize_noise=False, output_transform=fmt)\n",
        "\n",
        "    plt.figure(figsize=(5, 5))\n",
        "    plt.imshow(images[-1])\n",
        "\n",
        "    plt.savefig(os.path.join(save_path, 'image_{:03f}.png'.format(i)))\n",
        "\n",
        "#     plt.show()\n",
        "    plt.close()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "cC1vTOubAydJ"
      },
      "outputs": [],
      "source": [
        "# 使用imageio製作gif圖\n",
        "anim_file = save_path + '/anim.gif'\n",
        "\n",
        "with imageio.get_writer(anim_file, mode='I') as writer:\n",
        "\n",
        "    filenames = glob.glob(save_path + '/image*.png')\n",
        "    filenames.sort(key=lambda x: os.path.getmtime(x))\n",
        "\n",
        "    last = -1\n",
        "    for i, filename in enumerate(filenames):\n",
        "        frame = 4*(i**0.5)\n",
        "        if round(frame) > round(last):\n",
        "            last = frame\n",
        "        else:\n",
        "            continue\n",
        "        image = imageio.imread(filename)\n",
        "        writer.append_data(image)\n",
        "    image = imageio.imread(filename)\n",
        "    writer.append_data(image)\n",
        "\n",
        "display(Image(filename=anim_file))"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "9cdClmeAAydK"
      },
      "outputs": [],
      "source": [
        "display(Image(filename='./exp_img_anim/lat_0/anim.gif'))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "S8AXAr0GAydL"
      },
      "source": [
        "# Style mixing"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "MZoKJrWZAydL"
      },
      "source": [
        "這部分與上面的人臉類似，就不在贅述"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "lL-uigScAydM"
      },
      "outputs": [],
      "source": [
        "# sample a vector from Normal distribution\n",
        "rnd = np.random.RandomState(0)\n",
        "latents = rnd.randn(1, Gs.input_shape[1])"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "1Bf7QBvhAydO"
      },
      "outputs": [],
      "source": [
        "def draw_style_mixing_figure(png, Gs, w, h, src_seeds, dst_seeds, style_ranges):\n",
        "    print(png)\n",
        "    src_latents = np.stack(np.random.RandomState(seed).randn(Gs.input_shape[1]) for seed in src_seeds)\n",
        "    dst_latents = np.stack(np.random.RandomState(seed).randn(Gs.input_shape[1]) for seed in dst_seeds)\n",
        "    src_dlatents = Gs.components.mapping.run(src_latents, None)  # [seed, layer, component]\n",
        "    dst_dlatents = Gs.components.mapping.run(dst_latents, None)  # [seed, layer, component]\n",
        "    src_images = Gs.components.synthesis.run(src_dlatents, randomize_noise=False, **synthesis_kwargs)\n",
        "    dst_images = Gs.components.synthesis.run(dst_dlatents, randomize_noise=False, **synthesis_kwargs)\n",
        "\n",
        "    canvas = PIL.Image.new('RGB', (w * (len(src_seeds) + 1), h * (len(dst_seeds) + 1)), 'white')\n",
        "    for col, src_image in enumerate(list(src_images)):\n",
        "        canvas.paste(PIL.Image.fromarray(src_image, 'RGB'), ((col + 1) * w, 0))\n",
        "    for row, dst_image in enumerate(list(dst_images)):\n",
        "        canvas.paste(PIL.Image.fromarray(dst_image, 'RGB'), (0, (row + 1) * h))\n",
        "        row_dlatents = np.stack([dst_dlatents[row]] * len(src_seeds))\n",
        "        row_dlatents[:, style_ranges[row]] = src_dlatents[:, style_ranges[row]]\n",
        "        row_images = Gs.components.synthesis.run(row_dlatents, randomize_noise=False, **synthesis_kwargs)\n",
        "        for col, image in enumerate(list(row_images)):\n",
        "            canvas.paste(PIL.Image.fromarray(image, 'RGB'), ((col + 1) * w, (row + 1) * h))\n",
        "    canvas.save(png)\n",
        "\n",
        "synthesis_kwargs = dict(output_transform=dict(func=tflib.convert_images_to_uint8, nchw_to_nhwc=True),\n",
        "                        minibatch_size=8)\n",
        "\n",
        "draw_style_mixing_figure(os.path.join(result_dir, 'style-mixing-anim_8.png'), Gs, w=512, h=512,\n",
        "                         src_seeds=[639, 701, 687, 615, 2268],\n",
        "                         dst_seeds=[888, 829, 1898, 1733, 1614, 845, 1450, 2266],\n",
        "                         style_ranges=[range(0, 4)]*2+[range(4, 8)]*2+[range(8, 12)]*2+[range(12, 16)]*2)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "PBIhpD5NAydP"
      },
      "outputs": [],
      "source": [
        "mix_img = cv2.imread('results/style-mixing-anim_8.png')\n",
        "plt.figure(figsize=(16, 24))\n",
        "plt.imshow(cv2.cvtColor(mix_img, cv2.COLOR_BGR2RGB))\n",
        "plt.show()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "1paYfOWIAydQ"
      },
      "outputs": [],
      "source": []
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.7.12"
    },
    "colab": {
      "provenance": [],
      "gpuType": "T4",
      "include_colab_link": true
    },
    "accelerator": "GPU"
  },
  "nbformat": 4,
  "nbformat_minor": 0
}