diff --git a/LICENSE b/LICENSE
index 5c9f2a0b8b..d633bd320d 100644
--- a/LICENSE
+++ b/LICENSE
@@ -1,6 +1,6 @@
 MIT License
 
-Copyright (c) 2017 François Chollet
+Copyright (c) 2017-present François Chollet
 
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal
diff --git a/README.md b/README.md
index 6039b0d710..a57c07e7c5 100644
--- a/README.md
+++ b/README.md
@@ -1,34 +1,99 @@
-# Companion Jupyter notebooks for the book "Deep Learning with Python"
+# Companion notebooks for Deep Learning with Python
 
-This repository contains Jupyter notebooks implementing the code samples found in the book [Deep Learning with Python (Manning Publications)](https://www.manning.com/books/deep-learning-with-python?a_aid=keras&a_bid=76564dff). Note that the original text of the book features far more content than you will find in these notebooks, in particular further explanations and figures. Here we have only included the code samples themselves and immediately related surrounding comments.
+This repository contains Jupyter notebooks implementing the code samples found in the book [Deep Learning with Python, third edition (2025)](https://www.manning.com/books/deep-learning-with-python-third-edition?a_aid=keras&a_bid=76564dff)
+by Francois Chollet and Matthew Watson. You will also find the legacy notebooks for the [second edition (2021)](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff)
+and the [first edition (2017)](https://www.manning.com/books/deep-learning-with-python?a_aid=keras&a_bid=76564dff).
 
-These notebooks use Python 3.6 and Keras 2.0.8. They were generated on a p2.xlarge EC2 instance.
+For readability, these notebooks only contain runnable code blocks and section titles, and omit everything else in the book: text paragraphs, figures, and pseudocode.
+**If you want to be able to follow what's going on, I recommend reading the notebooks side by side with your copy of the book.**
+
+## Running the code
+
+We recommend running these notebooks on [Colab](https://colab.google), which
+provides a hosted runtime with all the dependencies you will need. You can also
+run these notebooks locally, either by setting up your own Jupyter environment
+or by following Colab's instructions for
+[running locally](https://research.google.com/colaboratory/local-runtimes.html).
+
+By default, all notebooks will run on Colab's free-tier GPU runtime, which
+is sufficient to run all code in this book. Chapters 8-18 will benefit
+from a faster GPU if you have a Colab Pro subscription. You can change your
+runtime type using **Runtime -> Change runtime type** in Colab's dropdown menus.
+
+## Choosing a backend
+
+The code for the third edition is written using Keras 3. As such, it can be run
+with JAX, TensorFlow, or PyTorch as a backend. To set the backend, edit the
+backend name in the cell at the top of each notebook, which looks like this:
+
+```python
+import os
+os.environ["KERAS_BACKEND"] = "jax"
+```
+
+This must be done once per session, before importing Keras. If you are in the
+middle of running a notebook, you will need to restart the session and rerun
+all relevant cells. This can be done using **Runtime -> Restart Session** in
+Colab's dropdown menus.
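+
+For example, you can confirm which backend is active by calling
+`keras.backend.backend()`, which returns the name of the backend in use:
+
+```python
+import os
+os.environ["KERAS_BACKEND"] = "jax"  # must be set before Keras is imported
+
+import keras
+print(keras.backend.backend())  # prints "jax"
+```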
+
+## Using Kaggle data
+
+This book uses datasets and model weights provided by Kaggle, an online
+machine learning community and platform. You will need to create a Kaggle
+login to run Kaggle code in this book; instructions are given in Chapter 8.
+
+For chapters that need Kaggle data, you can log in to Kaggle once per session
+when you hit the notebook cell with `kagglehub.login()`. Alternatively,
+you can set up your Kaggle login information once as Colab secrets:
+
+ * Go to https://www.kaggle.com/ and sign in.
+ * Go to https://www.kaggle.com/settings and generate a Kaggle API key.
+ * Open the secrets tab in Colab by clicking the key icon on the left.
+ * Add two secrets, `KAGGLE_USERNAME` and `KAGGLE_KEY`, with the username and
+   key you just created.
+
+With this approach, you only need to copy your Kaggle API key once, though
+you will need to allow each notebook to access your secrets when running the
+relevant Kaggle code.
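+
+For instance, once your secrets are in place (or after an interactive login),
+downloading a dataset looks like the minimal sketch below; the dataset handle
+here is a placeholder, not one used in the book:
+
+```python
+import kagglehub
+
+kagglehub.login()  # optional if KAGGLE_USERNAME / KAGGLE_KEY are already set
+path = kagglehub.dataset_download("<owner>/<dataset>")  # placeholder handle
+print("Downloaded to:", path)
+```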
 
 ## Table of contents
 
-* Chapter 2:
-    * [2.1: A first look at a neural network](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/2.1-a-first-look-at-a-neural-network.ipynb)
-* Chapter 3:
-    * [3.5: Classifying movie reviews](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/3.5-classifying-movie-reviews.ipynb)
-    * [3.6: Classifying newswires](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/3.6-classifying-newswires.ipynb)
-    * [3.7: Predicting house prices](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/3.7-predicting-house-prices.ipynb)
-* Chapter 4:
-    * [4.4: Underfitting and overfitting](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/4.4-overfitting-and-underfitting.ipynb)
-* Chapter 5:
-    * [5.1: Introduction to convnets](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/5.1-introduction-to-convnets.ipynb)
-    * [5.2: Using convnets with small datasets](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/5.2-using-convnets-with-small-datasets.ipynb)
-    * [5.3: Using a pre-trained convnet](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/5.3-using-a-pretrained-convnet.ipynb)
-    * [5.4: Visualizing what convnets learn](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/5.4-visualizing-what-convnets-learn.ipynb)
-* Chapter 6:
-    * [6.1: One-hot encoding of words or characters](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/6.1-one-hot-encoding-of-words-or-characters.ipynb)
-    * [6.1: Using word embeddings](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/6.1-using-word-embeddings.ipynb)
-    * [6.2: Understanding RNNs](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/6.2-understanding-recurrent-neural-networks.ipynb)
-    * [6.3: Advanced usage of RNNs](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/6.3-advanced-usage-of-recurrent-neural-networks.ipynb)
-    * [6.4: Sequence processing with convnets](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/6.4-sequence-processing-with-convnets.ipynb)
-* Chapter 8:
-    * [8.1: Text generation with LSTM](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/8.1-text-generation-with-lstm.ipynb)
-    * [8.2: Deep dream](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/8.2-deep-dream.ipynb)
-    * [8.3: Neural style transfer](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/8.3-neural-style-transfer.ipynb)
-    * [8.4: Generating images with VAEs](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/8.4-generating-images-with-vaes.ipynb)
-    * [8.5: Introduction to GANs](http://nbviewer.jupyter.org/github/fchollet/deep-learning-with-python-notebooks/blob/master/8.5-introduction-to-gans.ipynb
-)
+* [Chapter 2: The mathematical building blocks of neural networks](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter02_mathematical-building-blocks.ipynb)
+* [Chapter 3: Introduction to TensorFlow, PyTorch, JAX, and Keras](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter03_introduction-to-ml-frameworks.ipynb)
+* [Chapter 4: Classification and regression](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter04_classification-and-regression.ipynb)
+* [Chapter 5: Fundamentals of machine learning](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter05_fundamentals-of-ml.ipynb)
+* [Chapter 7: A deep dive on Keras](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter07_deep-dive-keras.ipynb)
+* [Chapter 8: Image Classification](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter08_image-classification.ipynb)
+* [Chapter 9: Convnet architecture patterns](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter09_convnet-architecture-patterns.ipynb)
+* [Chapter 10: Interpreting what ConvNets learn](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter10_interpreting-what-convnets-learn.ipynb)
+* [Chapter 11: Image Segmentation](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter11_image-segmentation.ipynb)
+* [Chapter 12: Object Detection](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter12_object-detection.ipynb)
+* [Chapter 13: Timeseries Forecasting](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter13_timeseries-forecasting.ipynb)
+* [Chapter 14: Text Classification](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter14_text-classification.ipynb)
+* [Chapter 15: Language Models and the Transformer](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter15_language-models-and-the-transformer.ipynb)
+* [Chapter 16: Text Generation](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter16_text-generation.ipynb)
+* [Chapter 17: Image Generation](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter17_image-generation.ipynb)
+* [Chapter 18: Best practices for the real
world](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/chapter18_best-practices-for-the-real-world.ipynb) diff --git a/chapter02_mathematical-building-blocks.ipynb b/chapter02_mathematical-building-blocks.ipynb new file mode 100644 index 0000000000..de53b0dd64 --- /dev/null +++ b/chapter02_mathematical-building-blocks.ipynb @@ -0,0 +1,1457 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"tensorflow\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. 
To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## The mathematical building blocks of neural networks" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### A first look at a neural network" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras.datasets import mnist\n", + "\n", + "(train_images, train_labels), (test_images, test_labels) = mnist.load_data()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_images.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "len(train_labels)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_labels" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_images.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "len(test_labels)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_labels" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "from keras import layers\n", + "\n", + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\"),\n", + " ]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_images = train_images.reshape((60000, 28 * 28))\n", + "train_images = train_images.astype(\"float32\") / 255\n", + "test_images = test_images.reshape((10000, 28 * 28))\n", + "test_images = test_images.astype(\"float32\") / 255" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.fit(train_images, train_labels, epochs=5, batch_size=128)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_digits = test_images[0:10]\n", + "predictions = model.predict(test_digits)\n", + "predictions[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "predictions[0].argmax()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "predictions[0][7]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + 
"test_labels[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_loss, test_acc = model.evaluate(test_images, test_labels)\n", + "print(f\"test_acc: {test_acc}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Data representations for neural networks" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Scalars (rank-0 tensors)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "x = np.array(12)\n", + "x" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x.ndim" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Vectors (rank-1 tensors)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = np.array([12, 3, 6, 14, 7])\n", + "x" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x.ndim" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Matrices (rank-2 tensors)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = np.array([[5, 78, 2, 34, 0],\n", + " [6, 79, 3, 35, 1],\n", + " [7, 80, 4, 36, 2]])\n", + "x.ndim" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Rank-3 tensors and higher-rank tensors" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = np.array([[[5, 78, 2, 34, 0],\n", + " [6, 79, 3, 35, 1],\n", + " [7, 80, 4, 36, 2]],\n", + " [[5, 78, 2, 34, 0],\n", + " [6, 79, 3, 35, 1],\n", + " [7, 80, 4, 36, 2]],\n", + " [[5, 78, 2, 34, 0],\n", + " [6, 79, 3, 35, 1],\n", + " [7, 80, 4, 36, 2]]])\n", + "x.ndim" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Key attributes" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras.datasets import mnist\n", + "\n", + "(train_images, train_labels), (test_images, test_labels) = mnist.load_data()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_images.ndim" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_images.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_images.dtype" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "digit = train_images[4]\n", + "plt.imshow(digit, cmap=plt.cm.binary)\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_labels[4]" + ] + 
}, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Manipulating tensors in NumPy" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "my_slice = train_images[10:100]\n", + "my_slice.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "my_slice = train_images[10:100, :, :]\n", + "my_slice.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "my_slice = train_images[10:100, 0:28, 0:28]\n", + "my_slice.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "my_slice = train_images[:, 14:, 14:]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "my_slice = train_images[:, 7:-7, 7:-7]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The notion of data batches" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "batch = train_images[:128]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "batch = train_images[128:256]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "n = 3\n", + "batch = train_images[128 * n : 128 * (n + 1)]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Real-world examples of data tensors" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Vector data" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Timeseries data or sequence data" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Image data" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Video data" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The gears of neural networks: tensor operations" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Element-wise operations" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def naive_relu(x):\n", + " assert len(x.shape) == 2\n", + " x = x.copy()\n", + " for i in range(x.shape[0]):\n", + " for j in range(x.shape[1]):\n", + " x[i, j] = max(x[i, j], 0)\n", + " return x" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def naive_add(x, y):\n", + " assert len(x.shape) == 2\n", + " assert x.shape == y.shape\n", + " x = x.copy()\n", + " for i in range(x.shape[0]):\n", + " for j in range(x.shape[1]):\n", + " x[i, j] += y[i, j]\n", + " return x" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import time\n", + "\n", + "x = 
np.random.random((20, 100))\n", + "y = np.random.random((20, 100))\n", + "\n", + "t0 = time.time()\n", + "for _ in range(1000):\n", + " z = x + y\n", + " z = np.maximum(z, 0.0)\n", + "print(\"Took: {0:.2f} s\".format(time.time() - t0))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "t0 = time.time()\n", + "for _ in range(1000):\n", + " z = naive_add(x, y)\n", + " z = naive_relu(z)\n", + "print(\"Took: {0:.2f} s\".format(time.time() - t0))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Broadcasting" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "X = np.random.random((32, 10))\n", + "y = np.random.random((10,))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "y = np.expand_dims(y, axis=0)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "Y = np.tile(y, (32, 1))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def naive_add_matrix_and_vector(x, y):\n", + " assert len(x.shape) == 2\n", + " assert len(y.shape) == 1\n", + " assert x.shape[1] == y.shape[0]\n", + " x = x.copy()\n", + " for i in range(x.shape[0]):\n", + " for j in range(x.shape[1]):\n", + " x[i, j] += y[j]\n", + " return x" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "x = np.random.random((64, 3, 32, 10))\n", + "y = np.random.random((32, 10))\n", + "z = np.maximum(x, y)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Tensor product" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = np.random.random((32,))\n", + "y = np.random.random((32,))\n", + "\n", + "z = np.matmul(x, y)\n", + "z = x @ y" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def naive_vector_product(x, y):\n", + " assert len(x.shape) == 1\n", + " assert len(y.shape) == 1\n", + " assert x.shape[0] == y.shape[0]\n", + " z = 0.0\n", + " for i in range(x.shape[0]):\n", + " z += x[i] * y[i]\n", + " return z" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def naive_matrix_vector_product(x, y):\n", + " assert len(x.shape) == 2\n", + " assert len(y.shape) == 1\n", + " assert x.shape[1] == y.shape[0]\n", + " z = np.zeros(x.shape[0])\n", + " for i in range(x.shape[0]):\n", + " for j in range(x.shape[1]):\n", + " z[i] += x[i, j] * y[j]\n", + " return z" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def naive_matrix_vector_product(x, y):\n", + " z = np.zeros(x.shape[0])\n", + " for i in range(x.shape[0]):\n", + " z[i] = naive_vector_product(x[i, :], y)\n", + " return z" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + 
"def naive_matrix_product(x, y):\n", + " assert len(x.shape) == 2\n", + " assert len(y.shape) == 2\n", + " assert x.shape[1] == y.shape[0]\n", + " z = np.zeros((x.shape[0], y.shape[1]))\n", + " for i in range(x.shape[0]):\n", + " for j in range(y.shape[1]):\n", + " row_x = x[i, :]\n", + " column_y = y[:, j]\n", + " z[i, j] = naive_vector_product(row_x, column_y)\n", + " return z" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Tensor reshaping" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_images = train_images.reshape((60000, 28 * 28))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = np.array([[0., 1.],\n", + " [2., 3.],\n", + " [4., 5.]])\n", + "x.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = x.reshape((6, 1))\n", + "x" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = x.reshape((2, 3))\n", + "x" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = np.zeros((300, 20))\n", + "x = np.transpose(x)\n", + "x.shape" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Geometric interpretation of tensor operations" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### A geometric interpretation of deep learning" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The engine of neural networks: Gradient-based optimization" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### What's a derivative?" 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Derivative of a tensor operation: the gradient" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Stochastic gradient descent" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Chaining derivatives: The Backpropagation algorithm" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### The chain rule" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Automatic differentiation with computation graphs" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Looking back at our first example" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "(train_images, train_labels), (test_images, test_labels) = mnist.load_data()\n", + "train_images = train_images.reshape((60000, 28 * 28))\n", + "train_images = train_images.astype(\"float32\") / 255\n", + "test_images = test_images.reshape((10000, 28 * 28))\n", + "test_images = test_images.astype(\"float32\") / 255" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\"),\n", + " ]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.fit(\n", + " train_images,\n", + " train_labels,\n", + " epochs=5,\n", + " batch_size=128,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Reimplementing our first example from scratch" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### A simple Dense class" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "from keras import ops\n", + "\n", + "class NaiveDense:\n", + " def __init__(self, input_size, output_size, activation=None):\n", + " self.activation = activation\n", + " self.W = keras.Variable(\n", + " shape=(input_size, output_size), initializer=\"uniform\"\n", + " )\n", + " self.b = keras.Variable(shape=(output_size,), initializer=\"zeros\")\n", + "\n", + " def __call__(self, inputs):\n", + " x = ops.matmul(inputs, self.W)\n", + " x = x + self.b\n", + " if self.activation is not None:\n", + " x = self.activation(x)\n", + " return x\n", + "\n", + " @property\n", + " def weights(self):\n", + " return [self.W, self.b]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### A simple Sequential class" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "class NaiveSequential:\n", + " def __init__(self, layers):\n", + " self.layers = 
layers\n", + "\n", + " def __call__(self, inputs):\n", + " x = inputs\n", + " for layer in self.layers:\n", + " x = layer(x)\n", + " return x\n", + "\n", + " @property\n", + " def weights(self):\n", + " weights = []\n", + " for layer in self.layers:\n", + " weights += layer.weights\n", + " return weights" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = NaiveSequential(\n", + " [\n", + " NaiveDense(input_size=28 * 28, output_size=512, activation=ops.relu),\n", + " NaiveDense(input_size=512, output_size=10, activation=ops.softmax),\n", + " ]\n", + ")\n", + "assert len(model.weights) == 4" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### A batch generator" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import math\n", + "\n", + "class BatchGenerator:\n", + " def __init__(self, images, labels, batch_size=128):\n", + " assert len(images) == len(labels)\n", + " self.index = 0\n", + " self.images = images\n", + " self.labels = labels\n", + " self.batch_size = batch_size\n", + " self.num_batches = math.ceil(len(images) / batch_size)\n", + "\n", + " def next(self):\n", + " images = self.images[self.index : self.index + self.batch_size]\n", + " labels = self.labels[self.index : self.index + self.batch_size]\n", + " self.index += self.batch_size\n", + " return images, labels" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Running one training step" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### The weight update step" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "learning_rate = 1e-3\n", + "\n", + "def update_weights(gradients, weights):\n", + " for g, w in zip(gradients, weights):\n", + " w.assign(w - g * learning_rate)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras import optimizers\n", + "\n", + "optimizer = optimizers.SGD(learning_rate=1e-3)\n", + "\n", + "def update_weights(gradients, weights):\n", + " optimizer.apply_gradients(zip(gradients, weights))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Gradient computation" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend tensorflow\n", + "import tensorflow as tf\n", + "\n", + "x = tf.zeros(shape=())\n", + "with tf.GradientTape() as tape:\n", + " y = 2 * x + 3\n", + "grad_of_y_wrt_x = tape.gradient(y, x)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend tensorflow\n", + "def one_training_step(model, images_batch, labels_batch):\n", + " with tf.GradientTape() as tape:\n", + " predictions = model(images_batch)\n", + " loss = ops.sparse_categorical_crossentropy(labels_batch, predictions)\n", + " average_loss = ops.mean(loss)\n", + " gradients = tape.gradient(average_loss, model.weights)\n", + " update_weights(gradients, model.weights)\n", + " return average_loss" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + 
"source": [ + "#### The full training loop" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend tensorflow\n", + "def fit(model, images, labels, epochs, batch_size=128):\n", + " for epoch_counter in range(epochs):\n", + " print(f\"Epoch {epoch_counter}\")\n", + " batch_generator = BatchGenerator(images, labels)\n", + " for batch_counter in range(batch_generator.num_batches):\n", + " images_batch, labels_batch = batch_generator.next()\n", + " loss = one_training_step(model, images_batch, labels_batch)\n", + " if batch_counter % 100 == 0:\n", + " print(f\"loss at batch {batch_counter}: {loss:.2f}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend tensorflow\n", + "from keras.datasets import mnist\n", + "\n", + "(train_images, train_labels), (test_images, test_labels) = mnist.load_data()\n", + "\n", + "train_images = train_images.reshape((60000, 28 * 28))\n", + "train_images = train_images.astype(\"float32\") / 255\n", + "test_images = test_images.reshape((10000, 28 * 28))\n", + "test_images = test_images.astype(\"float32\") / 255\n", + "\n", + "fit(model, train_images, train_labels, epochs=10, batch_size=128)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Evaluating the model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend tensorflow\n", + "predictions = model(test_images)\n", + "predicted_labels = ops.argmax(predictions, axis=1)\n", + "matches = predicted_labels == test_labels\n", + "f\"accuracy: {ops.mean(matches):.2f}\"" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [], + "name": "chapter02_mathematical-building-blocks", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/chapter03_introduction-to-ml-frameworks.ipynb b/chapter03_introduction-to-ml-frameworks.ipynb new file mode 100644 index 0000000000..7d29c2f859 --- /dev/null +++ b/chapter03_introduction-to-ml-frameworks.ipynb @@ -0,0 +1,1779 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Introduction to TensorFlow, PyTorch, JAX, and Keras" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### A brief history of deep learning frameworks" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### How these frameworks relate to each other" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Introduction to TensorFlow" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### First steps with TensorFlow" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Tensors and variables in TensorFlow" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "###### Constant tensors" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "tf.ones(shape=(2, 1))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "tf.zeros(shape=(2, 1))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "tf.constant([1, 2, 3], dtype=\"float32\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "###### Random tensors" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = tf.random.normal(shape=(3, 1), mean=0., stddev=1.)\n", + "print(x)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = tf.random.uniform(shape=(3, 1), minval=0., maxval=1.)\n", + "print(x)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "###### Tensor assignment and the Variable class" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "x = np.ones(shape=(2, 2))\n", + "x[0, 0] = 0.0" + ] + }, + { + "cell_type": 
"code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "v = tf.Variable(initial_value=tf.random.normal(shape=(3, 1)))\n", + "print(v)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "v.assign(tf.ones((3, 1)))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "v[0, 0].assign(3.)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "v.assign_add(tf.ones((3, 1)))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Tensor operations: Doing math in TensorFlow" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "a = tf.ones((2, 2))\n", + "b = tf.square(a)\n", + "c = tf.sqrt(a)\n", + "d = b + c\n", + "e = tf.matmul(a, b)\n", + "f = tf.concat((a, b), axis=0)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def dense(inputs, W, b):\n", + " return tf.nn.relu(tf.matmul(inputs, W) + b)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Gradients in TensorFlow: A second look at the GradientTape API" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "input_var = tf.Variable(initial_value=3.0)\n", + "with tf.GradientTape() as tape:\n", + " result = tf.square(input_var)\n", + "gradient = tape.gradient(result, input_var)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "input_const = tf.constant(3.0)\n", + "with tf.GradientTape() as tape:\n", + " tape.watch(input_const)\n", + " result = tf.square(input_const)\n", + "gradient = tape.gradient(result, input_const)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "time = tf.Variable(0.0)\n", + "with tf.GradientTape() as outer_tape:\n", + " with tf.GradientTape() as inner_tape:\n", + " position = 4.9 * time**2\n", + " speed = inner_tape.gradient(position, time)\n", + "acceleration = outer_tape.gradient(speed, time)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Making TensorFlow functions fast using compilation" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "@tf.function\n", + "def dense(inputs, W, b):\n", + " return tf.nn.relu(tf.matmul(inputs, W) + b)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "@tf.function(jit_compile=True)\n", + "def dense(inputs, W, b):\n", + " return tf.nn.relu(tf.matmul(inputs, W) + b)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### An end-to-end example: A linear classifier in pure TensorFlow" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + 
"num_samples_per_class = 1000\n", + "negative_samples = np.random.multivariate_normal(\n", + " mean=[0, 3], cov=[[1, 0.5], [0.5, 1]], size=num_samples_per_class\n", + ")\n", + "positive_samples = np.random.multivariate_normal(\n", + " mean=[3, 0], cov=[[1, 0.5], [0.5, 1]], size=num_samples_per_class\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = np.vstack((negative_samples, positive_samples)).astype(np.float32)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "targets = np.vstack(\n", + " (\n", + " np.zeros((num_samples_per_class, 1), dtype=\"float32\"),\n", + " np.ones((num_samples_per_class, 1), dtype=\"float32\"),\n", + " )\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "plt.scatter(inputs[:, 0], inputs[:, 1], c=targets[:, 0])\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "input_dim = 2\n", + "output_dim = 1\n", + "W = tf.Variable(initial_value=tf.random.uniform(shape=(input_dim, output_dim)))\n", + "b = tf.Variable(initial_value=tf.zeros(shape=(output_dim,)))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def model(inputs, W, b):\n", + " return tf.matmul(inputs, W) + b" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def mean_squared_error(targets, predictions):\n", + " per_sample_losses = tf.square(targets - predictions)\n", + " return tf.reduce_mean(per_sample_losses)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "learning_rate = 0.1\n", + "\n", + "@tf.function(jit_compile=True)\n", + "def training_step(inputs, targets, W, b):\n", + " with tf.GradientTape() as tape:\n", + " predictions = model(inputs, W, b)\n", + " loss = mean_squared_error(predictions, targets)\n", + " grad_loss_wrt_W, grad_loss_wrt_b = tape.gradient(loss, [W, b])\n", + " W.assign_sub(grad_loss_wrt_W * learning_rate)\n", + " b.assign_sub(grad_loss_wrt_b * learning_rate)\n", + " return loss" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "for step in range(40):\n", + " loss = training_step(inputs, targets, W, b)\n", + " print(f\"Loss at step {step}: {loss:.4f}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "predictions = model(inputs, W, b)\n", + "plt.scatter(inputs[:, 0], inputs[:, 1], c=predictions[:, 0] > 0.5)\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = np.linspace(-1, 4, 100)\n", + "y = -W[0] / W[1] * x + (0.5 - b) / W[1]\n", + "plt.plot(x, y, \"-r\")\n", + "plt.scatter(inputs[:, 0], inputs[:, 1], c=predictions[:, 0] > 0.5)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### What makes the TensorFlow approach unique" + ] + }, + { + "cell_type": 
"markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Introduction to PyTorch" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### First steps with PyTorch" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Tensors and parameters in PyTorch" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "###### Constant tensors" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import torch\n", + "torch.ones(size=(2, 1))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "torch.zeros(size=(2, 1))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "torch.tensor([1, 2, 3], dtype=torch.float32)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "###### Random tensors" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "torch.normal(\n", + "mean=torch.zeros(size=(3, 1)),\n", + "std=torch.ones(size=(3, 1)))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "torch.rand(3, 1)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "###### Tensor assignment and the Parameter class" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = torch.zeros(size=(2, 1))\n", + "x[0, 0] = 1.\n", + "x" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = torch.zeros(size=(2, 1))\n", + "p = torch.nn.parameter.Parameter(data=x)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Tensor operations: Doing math in PyTorch" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "a = torch.ones((2, 2))\n", + "b = torch.square(a)\n", + "c = torch.sqrt(a)\n", + "d = b + c\n", + "e = torch.matmul(a, b)\n", + "f = torch.cat((a, b), dim=0)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def dense(inputs, W, b):\n", + " return torch.nn.relu(torch.matmul(inputs, W) + b)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Computing gradients with PyTorch" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "input_var = torch.tensor(3.0, requires_grad=True)\n", + "result = torch.square(input_var)\n", + "result.backward()\n", + "gradient = input_var.grad\n", + "gradient" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "result = torch.square(input_var)\n", + "result.backward()\n", + "input_var.grad" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": 
[], + "source": [ + "input_var.grad = None" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### An end-to-end example: A linear classifier in pure PyTorch" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "input_dim = 2\n", + "output_dim = 1\n", + "\n", + "W = torch.rand(input_dim, output_dim, requires_grad=True)\n", + "b = torch.zeros(output_dim, requires_grad=True)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def model(inputs, W, b):\n", + " return torch.matmul(inputs, W) + b" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def mean_squared_error(targets, predictions):\n", + " per_sample_losses = torch.square(targets - predictions)\n", + " return torch.mean(per_sample_losses)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "learning_rate = 0.1\n", + "\n", + "def training_step(inputs, targets, W, b):\n", + " predictions = model(inputs, W, b)\n", + " loss = mean_squared_error(targets, predictions)\n", + " loss.backward()\n", + " grad_loss_wrt_W, grad_loss_wrt_b = W.grad, b.grad\n", + " with torch.no_grad():\n", + " W -= grad_loss_wrt_W * learning_rate\n", + " b -= grad_loss_wrt_b * learning_rate\n", + " W.grad = None\n", + " b.grad = None\n", + " return loss" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Packaging state and computation with the Module class" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "class LinearModel(torch.nn.Module):\n", + " def __init__(self):\n", + " super().__init__()\n", + " self.W = torch.nn.Parameter(torch.rand(input_dim, output_dim))\n", + " self.b = torch.nn.Parameter(torch.zeros(output_dim))\n", + "\n", + " def forward(self, inputs):\n", + " return torch.matmul(inputs, self.W) + self.b" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = LinearModel()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "torch_inputs = torch.tensor(inputs)\n", + "output = model(torch_inputs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def training_step(inputs, targets):\n", + " predictions = model(inputs)\n", + " loss = mean_squared_error(targets, predictions)\n", + " loss.backward()\n", + " optimizer.step()\n", + " model.zero_grad()\n", + " return loss" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Making PyTorch modules fast using compilation" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "compiled_model = torch.compile(model)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": {
+ "colab_type": "code" + }, + "outputs": [], + "source": [ + "@torch.compile\n", + "def dense(inputs, W, b):\n", + " return torch.nn.relu(torch.matmul(inputs, W) + b)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### What makes the PyTorch approach unique" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Introduction to JAX" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### First steps with JAX" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Tensors in JAX" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from jax import numpy as jnp\n", + "jnp.ones(shape=(2, 1))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "jnp.zeros(shape=(2, 1))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "jnp.array([1, 2, 3], dtype=\"float32\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Random number generation in JAX" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "np.random.normal(size=(3,))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "np.random.normal(size=(3,))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def apply_noise(x, seed):\n", + " np.random.seed(seed)\n", + " x = x * np.random.normal((3,))\n", + " return x\n", + "\n", + "seed = 1337\n", + "y = apply_noise(x, seed)\n", + "seed += 1\n", + "z = apply_noise(x, seed)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import jax\n", + "\n", + "seed_key = jax.random.key(1337)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "seed_key = jax.random.key(0)\n", + "jax.random.normal(seed_key, shape=(3,))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "seed_key = jax.random.key(123)\n", + "jax.random.normal(seed_key, shape=(3,))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "jax.random.normal(seed_key, shape=(3,))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "seed_key = jax.random.key(123)\n", + "jax.random.normal(seed_key, shape=(3,))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "new_seed_key = jax.random.split(seed_key, num=1)[0]\n", + "jax.random.normal(new_seed_key, shape=(3,))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Tensor assignment" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + 
}, + "outputs": [], + "source": [ + "x = jnp.array([1, 2, 3], dtype=\"float32\")\n", + "new_x = x.at[0].set(10)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Tensor operations: Doing math in JAX" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "a = jnp.ones((2, 2))\n", + "b = jnp.square(a)\n", + "c = jnp.sqrt(a)\n", + "d = b + c\n", + "e = jnp.matmul(a, b)\n", + "e *= d" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def dense(inputs, W, b):\n", + " return jax.nn.relu(jnp.matmul(inputs, W) + b)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Computing gradients with JAX" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def compute_loss(input_var):\n", + " return jnp.square(input_var)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "grad_fn = jax.grad(compute_loss)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "input_var = jnp.array(3.0)\n", + "grad_of_loss_wrt_input_var = grad_fn(input_var)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### JAX gradient-computation best practices" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "###### Returning the loss value" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "grad_fn = jax.value_and_grad(compute_loss)\n", + "output, grad_of_loss_wrt_input_var = grad_fn(input_var)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "###### Getting gradients for a complex function" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "###### Returning auxiliary outputs" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Making JAX functions fast with @jax.jit" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "@jax.jit\n", + "def dense(inputs, W, b):\n", + " return jax.nn.relu(jnp.matmul(inputs, W) + b)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### An end-to-end example: A linear classifier in pure JAX" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def model(inputs, W, b):\n", + " return jnp.matmul(inputs, W) + b\n", + "\n", + "def mean_squared_error(targets, predictions):\n", + " per_sample_losses = jnp.square(targets - predictions)\n", + " return jnp.mean(per_sample_losses)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def compute_loss(state, inputs, targets):\n", + " W, b = state\n", + " predictions = model(inputs, W, b)\n", + " loss = mean_squared_error(targets, predictions)\n", + " return loss" + ] + }, + { + "cell_type": 
"code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "grad_fn = jax.value_and_grad(compute_loss)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "learning_rate = 0.1\n", + "\n", + "@jax.jit\n", + "def training_step(inputs, targets, W, b):\n", + " loss, grads = grad_fn((W, b), inputs, targets)\n", + " grad_wrt_W, grad_wrt_b = grads\n", + " W = W - grad_wrt_W * learning_rate\n", + " b = b - grad_wrt_b * learning_rate\n", + " return loss, W, b" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "input_dim = 2\n", + "output_dim = 1\n", + "\n", + "W = jax.numpy.array(np.random.uniform(size=(input_dim, output_dim)))\n", + "b = jax.numpy.array(np.zeros(shape=(output_dim,)))\n", + "state = (W, b)\n", + "for step in range(40):\n", + " loss, W, b = training_step(inputs, targets, W, b)\n", + " print(f\"Loss at step {step}: {loss:.4f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### What makes the JAX approach unique" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Introduction to Keras" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### First steps with Keras" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Picking a backend framework" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"\n", + "\n", + "import keras" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Layers: The building blocks of deep learning" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### The base `Layer` class in Keras" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "\n", + "class SimpleDense(keras.Layer):\n", + " def __init__(self, units, activation=None):\n", + " super().__init__()\n", + " self.units = units\n", + " self.activation = activation\n", + "\n", + " def build(self, input_shape):\n", + " batch_dim, input_dim = input_shape\n", + " self.W = self.add_weight(\n", + " shape=(input_dim, self.units), initializer=\"random_normal\"\n", + " )\n", + " self.b = self.add_weight(shape=(self.units,), initializer=\"zeros\")\n", + "\n", + " def call(self, inputs):\n", + " y = keras.ops.matmul(inputs, self.W) + self.b\n", + " if self.activation is not None:\n", + " y = self.activation(y)\n", + " return y" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "my_dense = SimpleDense(units=32, activation=keras.ops.relu)\n", + "input_tensor = keras.ops.ones(shape=(2, 784))\n", + "output_tensor = my_dense(input_tensor)\n", + "print(output_tensor.shape)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Automatic shape inference: Building layers on the fly" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + 
"source": [ + "from keras import layers\n", + "\n", + "layer = layers.Dense(32, activation=\"relu\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras import models\n", + "from keras import layers\n", + "\n", + "model = models.Sequential(\n", + " [\n", + " layers.Dense(32, activation=\"relu\"),\n", + " layers.Dense(32),\n", + " ]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential(\n", + " [\n", + " SimpleDense(32, activation=\"relu\"),\n", + " SimpleDense(64, activation=\"relu\"),\n", + " SimpleDense(32, activation=\"relu\"),\n", + " SimpleDense(10, activation=\"softmax\"),\n", + " ]\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### From layers to models" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The \"compile\" step: Configuring the learning process" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential([keras.layers.Dense(1)])\n", + "model.compile(\n", + " optimizer=\"rmsprop\",\n", + " loss=\"mean_squared_error\",\n", + " metrics=[\"accuracy\"],\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(\n", + " optimizer=keras.optimizers.RMSprop(),\n", + " loss=keras.losses.MeanSquaredError(),\n", + " metrics=[keras.metrics.BinaryAccuracy()],\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Picking a loss function" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Understanding the fit method" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "history = model.fit(\n", + " inputs,\n", + " targets,\n", + " epochs=5,\n", + " batch_size=128,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "history.history" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Monitoring loss and metrics on validation data" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential([keras.layers.Dense(1)])\n", + "model.compile(\n", + " optimizer=keras.optimizers.RMSprop(learning_rate=0.1),\n", + " loss=keras.losses.MeanSquaredError(),\n", + " metrics=[keras.metrics.BinaryAccuracy()],\n", + ")\n", + "\n", + "indices_permutation = np.random.permutation(len(inputs))\n", + "shuffled_inputs = inputs[indices_permutation]\n", + "shuffled_targets = targets[indices_permutation]\n", + "\n", + "num_validation_samples = int(0.3 * len(inputs))\n", + "val_inputs = shuffled_inputs[:num_validation_samples]\n", + "val_targets = shuffled_targets[:num_validation_samples]\n", + "training_inputs = shuffled_inputs[num_validation_samples:]\n", + "training_targets = shuffled_targets[num_validation_samples:]\n", + "model.fit(\n", + " training_inputs,\n", + " training_targets,\n", + " epochs=5,\n", + " batch_size=16,\n", + 
" validation_data=(val_inputs, val_targets),\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Inference: Using a model after training" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "predictions = model.predict(val_inputs, batch_size=128)\n", + "print(predictions[:10])" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [], + "name": "chapter03_introduction-to-ml-frameworks", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/chapter04_classification-and-regression.ipynb b/chapter04_classification-and-regression.ipynb new file mode 100644 index 0000000000..1e2e7e8225 --- /dev/null +++ b/chapter04_classification-and-regression.ipynb @@ -0,0 +1,1305 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. 
To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Classification and regression" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Classifying movie reviews: A binary classification example" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The IMDb dataset" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras.datasets import imdb\n", + "\n", + "(train_data, train_labels), (test_data, test_labels) = imdb.load_data(\n", + " num_words=10000\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_data[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_labels[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "max([max(sequence) for sequence in train_data])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "word_index = imdb.get_word_index()\n", + "reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])\n", + "decoded_review = \" \".join(\n", + " [reverse_word_index.get(i - 3, \"?\") for i in train_data[0]]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "decoded_review[:100]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Preparing the data" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "def multi_hot_encode(sequences, num_classes):\n", + " results = np.zeros((len(sequences), num_classes))\n", + " for i, sequence in enumerate(sequences):\n", + " results[i][sequence] = 1.0\n", + " return results\n", + "\n", + "x_train = multi_hot_encode(train_data, num_classes=10000)\n", + "x_test = multi_hot_encode(test_data, num_classes=10000)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x_train[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "y_train = train_labels.astype(\"float32\")\n", + "y_test = test_labels.astype(\"float32\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Building your model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "from keras import layers\n", + "\n", + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dense(1, activation=\"sigmoid\"),\n", + " ]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, 
+ "outputs": [], + "source": [ + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Validating your approach" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x_val = x_train[:10000]\n", + "partial_x_train = x_train[10000:]\n", + "y_val = y_train[:10000]\n", + "partial_y_train = y_train[10000:]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "history = model.fit(\n", + " partial_x_train,\n", + " partial_y_train,\n", + " epochs=20,\n", + " batch_size=512,\n", + " validation_data=(x_val, y_val),\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "history = model.fit(\n", + " x_train,\n", + " y_train,\n", + " epochs=20,\n", + " batch_size=512,\n", + " validation_split=0.2,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "history_dict = history.history\n", + "history_dict.keys()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "history_dict = history.history\n", + "loss_values = history_dict[\"loss\"]\n", + "val_loss_values = history_dict[\"val_loss\"]\n", + "epochs = range(1, len(loss_values) + 1)\n", + "plt.plot(epochs, loss_values, \"r--\", label=\"Training loss\")\n", + "plt.plot(epochs, val_loss_values, \"b\", label=\"Validation loss\")\n", + "plt.title(\"[IMDB] Training and validation loss\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.xticks(epochs)\n", + "plt.ylabel(\"Loss\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "plt.clf()\n", + "acc = history_dict[\"accuracy\"]\n", + "val_acc = history_dict[\"val_accuracy\"]\n", + "plt.plot(epochs, acc, \"r--\", label=\"Training acc\")\n", + "plt.plot(epochs, val_acc, \"b\", label=\"Validation acc\")\n", + "plt.title(\"[IMDB] Training and validation accuracy\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.xticks(epochs)\n", + "plt.ylabel(\"Accuracy\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dense(1, activation=\"sigmoid\"),\n", + " ]\n", + ")\n", + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "model.fit(x_train, y_train, epochs=4, batch_size=512)\n", + "results = model.evaluate(x_test, y_test)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "results" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Using a trained model to generate predictions on new data" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + 
"colab_type": "code" + }, + "outputs": [], + "source": [ + "model.predict(x_test)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Further experiments" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Wrapping up" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Classifying newswires: A multiclass classification example" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The Reuters dataset" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras.datasets import reuters\n", + "\n", + "(train_data, train_labels), (test_data, test_labels) = reuters.load_data(\n", + " num_words=10000\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "len(train_data)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "len(test_data)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_data[10]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "word_index = reuters.get_word_index()\n", + "reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])\n", + "decoded_newswire = \" \".join(\n", + " [reverse_word_index.get(i - 3, \"?\") for i in train_data[10]]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_labels[10]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Preparing the data" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x_train = multi_hot_encode(train_data, num_classes=10000)\n", + "x_test = multi_hot_encode(test_data, num_classes=10000)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def one_hot_encode(labels, num_classes=46):\n", + " results = np.zeros((len(labels), num_classes))\n", + " for i, label in enumerate(labels):\n", + " results[i, label] = 1.0\n", + " return results\n", + "\n", + "y_train = one_hot_encode(train_labels)\n", + "y_test = one_hot_encode(test_labels)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras.utils import to_categorical\n", + "\n", + "y_train = to_categorical(train_labels)\n", + "y_test = to_categorical(test_labels)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Building your model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(64, activation=\"relu\"),\n", + " layers.Dense(64, activation=\"relu\"),\n", + " layers.Dense(46, activation=\"softmax\"),\n", + " ]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": 
{ + "colab_type": "code" + }, + "outputs": [], + "source": [ + "top_3_accuracy = keras.metrics.TopKCategoricalAccuracy(\n", + " k=3, name=\"top_3_accuracy\"\n", + ")\n", + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"categorical_crossentropy\",\n", + " metrics=[\"accuracy\", top_3_accuracy],\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Validating your approach" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x_val = x_train[:1000]\n", + "partial_x_train = x_train[1000:]\n", + "y_val = y_train[:1000]\n", + "partial_y_train = y_train[1000:]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "history = model.fit(\n", + " partial_x_train,\n", + " partial_y_train,\n", + " epochs=20,\n", + " batch_size=512,\n", + " validation_data=(x_val, y_val),\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "loss = history.history[\"loss\"]\n", + "val_loss = history.history[\"val_loss\"]\n", + "epochs = range(1, len(loss) + 1)\n", + "plt.plot(epochs, loss, \"r--\", label=\"Training loss\")\n", + "plt.plot(epochs, val_loss, \"b\", label=\"Validation loss\")\n", + "plt.title(\"Training and validation loss\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.xticks(epochs)\n", + "plt.ylabel(\"Loss\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "plt.clf()\n", + "acc = history.history[\"accuracy\"]\n", + "val_acc = history.history[\"val_accuracy\"]\n", + "plt.plot(epochs, acc, \"r--\", label=\"Training accuracy\")\n", + "plt.plot(epochs, val_acc, \"b\", label=\"Validation accuracy\")\n", + "plt.title(\"Training and validation accuracy\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.xticks(epochs)\n", + "plt.ylabel(\"Accuracy\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "plt.clf()\n", + "acc = history.history[\"top_3_accuracy\"]\n", + "val_acc = history.history[\"val_top_3_accuracy\"]\n", + "plt.plot(epochs, acc, \"r--\", label=\"Training top-3 accuracy\")\n", + "plt.plot(epochs, val_acc, \"b\", label=\"Validation top-3 accuracy\")\n", + "plt.title(\"Training and validation top-3 accuracy\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.xticks(epochs)\n", + "plt.ylabel(\"Top-3 accuracy\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(64, activation=\"relu\"),\n", + " layers.Dense(64, activation=\"relu\"),\n", + " layers.Dense(46, activation=\"softmax\"),\n", + " ]\n", + ")\n", + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "model.fit(\n", + " x_train,\n", + " y_train,\n", + " epochs=9,\n", + " batch_size=512,\n", + ")\n", + "results = model.evaluate(x_test, y_test)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "results" + ] + }, + { + "cell_type": "code", 
+ "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import copy\n", + "test_labels_copy = copy.copy(test_labels)\n", + "np.random.shuffle(test_labels_copy)\n", + "hits_array = np.array(test_labels == test_labels_copy)\n", + "hits_array.mean()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Generating predictions on new data" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "predictions = model.predict(x_test)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "predictions[0].shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "np.sum(predictions[0])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "np.argmax(predictions[0])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### A different way to handle the labels and the loss" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "y_train = train_labels\n", + "y_test = test_labels" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The importance of having sufficiently large intermediate layers" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(64, activation=\"relu\"),\n", + " layers.Dense(4, activation=\"relu\"),\n", + " layers.Dense(46, activation=\"softmax\"),\n", + " ]\n", + ")\n", + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "model.fit(\n", + " partial_x_train,\n", + " partial_y_train,\n", + " epochs=20,\n", + " batch_size=128,\n", + " validation_data=(x_val, y_val),\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Further experiments" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Wrapping up" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Predicting house prices: a regression example" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The California Housing Price dataset" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras.datasets import california_housing\n", + "\n", + "(train_data, train_targets), (test_data, test_targets) = (\n", + " california_housing.load_data(version=\"small\")\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_data.shape" + ] + }, + { 
+ "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_data.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_targets" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Preparing the data" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "mean = train_data.mean(axis=0)\n", + "std = train_data.std(axis=0)\n", + "x_train = (train_data - mean) / std\n", + "x_test = (test_data - mean) / std" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "y_train = train_targets / 100000\n", + "y_test = test_targets / 100000" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Building your model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def get_model():\n", + " model = keras.Sequential(\n", + " [\n", + " layers.Dense(64, activation=\"relu\"),\n", + " layers.Dense(64, activation=\"relu\"),\n", + " layers.Dense(1),\n", + " ]\n", + " )\n", + " model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"mean_squared_error\",\n", + " metrics=[\"mean_absolute_error\"],\n", + " )\n", + " return model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Validating your approach using K-fold validation" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "k = 4\n", + "num_val_samples = len(x_train) // k\n", + "num_epochs = 50\n", + "all_scores = []\n", + "for i in range(k):\n", + " print(f\"Processing fold #{i + 1}\")\n", + " fold_x_val = x_train[i * num_val_samples : (i + 1) * num_val_samples]\n", + " fold_y_val = y_train[i * num_val_samples : (i + 1) * num_val_samples]\n", + " fold_x_train = np.concatenate(\n", + " [x_train[: i * num_val_samples], x_train[(i + 1) * num_val_samples :]],\n", + " axis=0,\n", + " )\n", + " fold_y_train = np.concatenate(\n", + " [y_train[: i * num_val_samples], y_train[(i + 1) * num_val_samples :]],\n", + " axis=0,\n", + " )\n", + " model = get_model()\n", + " model.fit(\n", + " fold_x_train,\n", + " fold_y_train,\n", + " epochs=num_epochs,\n", + " batch_size=16,\n", + " verbose=0,\n", + " )\n", + " scores = model.evaluate(fold_x_val, fold_y_val, verbose=0)\n", + " val_loss, val_mae = scores\n", + " all_scores.append(val_mae)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "[round(value, 3) for value in all_scores]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "round(np.mean(all_scores), 3)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "k = 4\n", + "num_val_samples = len(x_train) // k\n", + "num_epochs = 200\n", + "all_mae_histories = []\n", + "for i in range(k):\n", + " print(f\"Processing fold #{i + 1}\")\n", + " fold_x_val = x_train[i * num_val_samples : (i + 1) * num_val_samples]\n", + " fold_y_val = y_train[i * num_val_samples : 
(i + 1) * num_val_samples]\n", + " fold_x_train = np.concatenate(\n", + " [x_train[: i * num_val_samples], x_train[(i + 1) * num_val_samples :]],\n", + " axis=0,\n", + " )\n", + " fold_y_train = np.concatenate(\n", + " [y_train[: i * num_val_samples], y_train[(i + 1) * num_val_samples :]],\n", + " axis=0,\n", + " )\n", + " model = get_model()\n", + " history = model.fit(\n", + " fold_x_train,\n", + " fold_y_train,\n", + " validation_data=(fold_x_val, fold_y_val),\n", + " epochs=num_epochs,\n", + " batch_size=16,\n", + " verbose=0,\n", + " )\n", + " mae_history = history.history[\"val_mean_absolute_error\"]\n", + " all_mae_histories.append(mae_history)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "average_mae_history = [\n", + " np.mean([x[i] for x in all_mae_histories]) for i in range(num_epochs)\n", + "]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "epochs = range(1, len(average_mae_history) + 1)\n", + "plt.plot(epochs, average_mae_history)\n", + "plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Validation MAE\")\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "truncated_mae_history = average_mae_history[10:]\n", + "epochs = range(11, len(truncated_mae_history) + 11)\n", + "plt.plot(epochs, truncated_mae_history)\n", + "plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Validation MAE\")\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = get_model()\n", + "model.fit(x_train, y_train, epochs=130, batch_size=16, verbose=0)\n", + "test_mean_squared_error, test_mean_absolute_error = model.evaluate(\n", + " x_test, y_test\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "round(test_mean_absolute_error, 3)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Generating predictions on new data" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "predictions = model.predict(x_test)\n", + "predictions[0]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Wrapping up" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [], + "name": "chapter04_classification-and-regression", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/chapter05_fundamentals-of-ml.ipynb b/chapter05_fundamentals-of-ml.ipynb new file mode 100644 index 0000000000..2aadcc9b85 --- /dev/null +++ b/chapter05_fundamentals-of-ml.ipynb @@ -0,0 +1,985 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a 
companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Fundamentals of machine learning" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Generalization: The goal of machine learning" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Underfitting and overfitting" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Noisy training data" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Ambiguous features" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Rare features and spurious correlations" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras.datasets import mnist\n", + "import numpy as np\n", + "\n", + "(train_images, train_labels), _ = mnist.load_data()\n", + "train_images = train_images.reshape((60000, 28 * 28))\n", + "train_images = train_images.astype(\"float32\") / 255\n", + "\n", + "train_images_with_noise_channels = np.concatenate(\n", + " [train_images, np.random.random((len(train_images), 784))], axis=1\n", + ")\n", + "\n", + "train_images_with_zeros_channels = np.concatenate(\n", + " [train_images, np.zeros((len(train_images), 784))], axis=1\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "from keras import layers\n", + "\n", + "def get_model():\n", + " model = keras.Sequential(\n", + " [\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\"),\n", + " ]\n", + " )\n", + " model.compile(\n", + " 
optimizer=\"adam\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + " )\n", + " return model\n", + "\n", + "model = get_model()\n", + "history_noise = model.fit(\n", + " train_images_with_noise_channels,\n", + " train_labels,\n", + " epochs=10,\n", + " batch_size=128,\n", + " validation_split=0.2,\n", + ")\n", + "\n", + "model = get_model()\n", + "history_zeros = model.fit(\n", + " train_images_with_zeros_channels,\n", + " train_labels,\n", + " epochs=10,\n", + " batch_size=128,\n", + " validation_split=0.2,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "val_acc_noise = history_noise.history[\"val_accuracy\"]\n", + "val_acc_zeros = history_zeros.history[\"val_accuracy\"]\n", + "epochs = range(1, 11)\n", + "plt.plot(\n", + " epochs,\n", + " val_acc_noise,\n", + " \"b-\",\n", + " label=\"Validation accuracy with noise channels\",\n", + ")\n", + "plt.plot(\n", + " epochs,\n", + " val_acc_zeros,\n", + " \"r--\",\n", + " label=\"Validation accuracy with zeros channels\",\n", + ")\n", + "plt.title(\"Effect of noise channels on validation accuracy\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.xticks(epochs)\n", + "plt.ylabel(\"Accuracy\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The nature of generalization in deep learning" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "(train_images, train_labels), _ = mnist.load_data()\n", + "train_images = train_images.reshape((60000, 28 * 28))\n", + "train_images = train_images.astype(\"float32\") / 255\n", + "\n", + "random_train_labels = train_labels[:]\n", + "np.random.shuffle(random_train_labels)\n", + "\n", + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\"),\n", + " ]\n", + ")\n", + "model.compile(\n", + " optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "model.fit(\n", + " train_images,\n", + " random_train_labels,\n", + " epochs=100,\n", + " batch_size=128,\n", + " validation_split=0.2,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### The manifold hypothesis" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Interpolation as a source of generalization" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Why deep learning works" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Training data is paramount" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Evaluating machine-learning models" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Training, validation, and test sets" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Simple hold-out validation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### K-fold validation" + ] + }, + { + "cell_type": "markdown", + "metadata": 
{ + "colab_type": "text" + }, + "source": [ + "##### Iterated K-fold validation with shuffling" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Beating a common-sense baseline" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Things to keep in mind about model evaluation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Improving model fit" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Tuning key gradient descent parameters" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "(train_images, train_labels), _ = mnist.load_data()\n", + "train_images = train_images.reshape((60000, 28 * 28))\n", + "train_images = train_images.astype(\"float32\") / 255\n", + "\n", + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\"),\n", + " ]\n", + ")\n", + "model.compile(\n", + " optimizer=keras.optimizers.RMSprop(learning_rate=1.0),\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "model.fit(\n", + " train_images, train_labels, epochs=10, batch_size=128, validation_split=0.2\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\"),\n", + " ]\n", + ")\n", + "model.compile(\n", + " optimizer=keras.optimizers.RMSprop(learning_rate=1e-2),\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "model.fit(\n", + " train_images, train_labels, epochs=10, batch_size=128, validation_split=0.2\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Using better architecture priors" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Increasing model capacity" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential([layers.Dense(10, activation=\"softmax\")])\n", + "model.compile(\n", + " optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "history_small_model = model.fit(\n", + " train_images, train_labels, epochs=20, batch_size=128, validation_split=0.2\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "val_loss = history_small_model.history[\"val_loss\"]\n", + "epochs = range(1, 21)\n", + "plt.plot(epochs, val_loss, \"b-\", label=\"Validation loss\")\n", + "plt.title(\"Validation loss for a model with insufficient capacity\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Loss\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(128, activation=\"relu\"),\n", + " layers.Dense(128, activation=\"relu\"),\n", + " 
layers.Dense(10, activation=\"softmax\"),\n", + " ]\n", + ")\n", + "model.compile(\n", + " optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "history_large_model = model.fit(\n", + " train_images,\n", + " train_labels,\n", + " epochs=20,\n", + " batch_size=128,\n", + " validation_split=0.2,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "val_loss = history_large_model.history[\"val_loss\"]\n", + "epochs = range(1, 21)\n", + "plt.plot(epochs, val_loss, \"b-\", label=\"Validation loss\")\n", + "plt.title(\"Validation loss for a model with appropriate capacity\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Loss\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(2048, activation=\"relu\"),\n", + " layers.Dense(2048, activation=\"relu\"),\n", + " layers.Dense(2048, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\"),\n", + " ]\n", + ")\n", + "model.compile(\n", + " optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "history_very_large_model = model.fit(\n", + " train_images,\n", + " train_labels,\n", + " epochs=20,\n", + " batch_size=32,\n", + " validation_split=0.2,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "val_loss = history_very_large_model.history[\"val_loss\"]\n", + "epochs = range(1, 21)\n", + "plt.plot(epochs, val_loss, \"b-\", label=\"Validation loss\")\n", + "plt.title(\"Validation loss for a model with too much capacity\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Loss\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Improving generalization" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Dataset curation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Feature engineering" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Using early stopping" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Regularizing your model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Reducing the network's size" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras.datasets import imdb\n", + "\n", + "(train_data, train_labels), _ = imdb.load_data(num_words=10000)\n", + "\n", + "def vectorize_sequences(sequences, dimension=10000):\n", + " results = np.zeros((len(sequences), dimension))\n", + " for i, sequence in enumerate(sequences):\n", + " results[i, sequence] = 1.0\n", + " return results\n", + "\n", + "train_data = vectorize_sequences(train_data)\n", + "\n", + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dense(1, activation=\"sigmoid\"),\n", + " ]\n", + ")\n", + 
"model.compile(\n", + " optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "history_original = model.fit(\n", + " train_data,\n", + " train_labels,\n", + " epochs=20,\n", + " batch_size=512,\n", + " validation_split=0.4,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(4, activation=\"relu\"),\n", + " layers.Dense(4, activation=\"relu\"),\n", + " layers.Dense(1, activation=\"sigmoid\"),\n", + " ]\n", + ")\n", + "model.compile(\n", + " optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "history_smaller_model = model.fit(\n", + " train_data,\n", + " train_labels,\n", + " epochs=20,\n", + " batch_size=512,\n", + " validation_split=0.4,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "original_val_loss = history_original.history[\"val_loss\"]\n", + "smaller_model_val_loss = history_smaller_model.history[\"val_loss\"]\n", + "epochs = range(1, 21)\n", + "plt.plot(\n", + " epochs,\n", + " original_val_loss,\n", + " \"r--\",\n", + " label=\"Validation loss of original model\",\n", + ")\n", + "plt.plot(\n", + " epochs,\n", + " smaller_model_val_loss,\n", + " \"b-\",\n", + " label=\"Validation loss of smaller model\",\n", + ")\n", + "plt.title(\"Original model vs. smaller model (IMDB review classification)\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Loss\")\n", + "plt.xticks(epochs)\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(1, activation=\"sigmoid\"),\n", + " ]\n", + ")\n", + "model.compile(\n", + " optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "history_larger_model = model.fit(\n", + " train_data,\n", + " train_labels,\n", + " epochs=20,\n", + " batch_size=512,\n", + " validation_split=0.4,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "original_val_loss = history_original.history[\"val_loss\"]\n", + "larger_model_val_loss = history_larger_model.history[\"val_loss\"]\n", + "epochs = range(1, 21)\n", + "plt.plot(\n", + " epochs,\n", + " original_val_loss,\n", + " \"r--\",\n", + " label=\"Validation loss of original model\",\n", + ")\n", + "plt.plot(\n", + " epochs,\n", + " larger_model_val_loss,\n", + " \"b-\",\n", + " label=\"Validation loss of larger model\",\n", + ")\n", + "plt.title(\"Original model vs. 
larger model (IMDB review classification)\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Loss\")\n", + "plt.xticks(epochs)\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Adding weight regularization" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras.regularizers import l2\n", + "\n", + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(16, kernel_regularizer=l2(0.002), activation=\"relu\"),\n", + " layers.Dense(16, kernel_regularizer=l2(0.002), activation=\"relu\"),\n", + " layers.Dense(1, activation=\"sigmoid\"),\n", + " ]\n", + ")\n", + "model.compile(\n", + " optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "history_l2_reg = model.fit(\n", + " train_data,\n", + " train_labels,\n", + " epochs=20,\n", + " batch_size=512,\n", + " validation_split=0.4,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "original_val_loss = history_original.history[\"val_loss\"]\n", + "l2_val_loss = history_l2_reg.history[\"val_loss\"]\n", + "epochs = range(1, 21)\n", + "plt.plot(\n", + " epochs,\n", + " original_val_loss,\n", + " \"r--\",\n", + " label=\"Validation loss of original model\",\n", + ")\n", + "plt.plot(\n", + " epochs,\n", + " l2_val_loss,\n", + " \"b-\",\n", + " label=\"Validation loss of L2-regularized model\",\n", + ")\n", + "plt.title(\n", + " \"Original model vs. L2-regularized model (IMDB review classification)\"\n", + ")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Loss\")\n", + "plt.xticks(epochs)\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras import regularizers\n", + "\n", + "regularizers.l1(0.001)\n", + "regularizers.l1_l2(l1=0.001, l2=0.001)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Adding dropout" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dropout(0.5),\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dropout(0.5),\n", + " layers.Dense(1, activation=\"sigmoid\"),\n", + " ]\n", + ")\n", + "model.compile(\n", + " optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "history_dropout = model.fit(\n", + " train_data,\n", + " train_labels,\n", + " epochs=20,\n", + " batch_size=512,\n", + " validation_split=0.4,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "original_val_loss = history_original.history[\"val_loss\"]\n", + "dropout_val_loss = history_dropout.history[\"val_loss\"]\n", + "epochs = range(1, 21)\n", + "plt.plot(\n", + " epochs,\n", + " original_val_loss,\n", + " \"r--\",\n", + " label=\"Validation loss of original model\",\n", + ")\n", + "plt.plot(\n", + " epochs,\n", + " dropout_val_loss,\n", + " \"b-\",\n", + " label=\"Validation loss of dropout-regularized model\",\n", + ")\n", + "plt.title(\n", + " \"Original model vs. 
dropout-regularized model (IMDB review classification)\"\n", + ")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Loss\")\n", + "plt.xticks(epochs)\n", + "plt.legend()\n", + "plt.show()" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [], + "name": "chapter05_fundamentals-of-ml", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/chapter07_deep-dive-keras.ipynb b/chapter07_deep-dive-keras.ipynb new file mode 100644 index 0000000000..be5963473c --- /dev/null +++ b/chapter07_deep-dive-keras.ipynb @@ -0,0 +1,1834 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. 
To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## A deep dive on Keras" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### A spectrum of workflows" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Different ways to build Keras models" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The Sequential model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "from keras import layers\n", + "\n", + "model = keras.Sequential(\n", + " [\n", + " layers.Dense(64, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\"),\n", + " ]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential()\n", + "model.add(layers.Dense(64, activation=\"relu\"))\n", + "model.add(layers.Dense(10, activation=\"softmax\"))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.weights" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.build(input_shape=(None, 3))\n", + "model.weights" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.summary(line_length=80)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential(name=\"my_example_model\")\n", + "model.add(layers.Dense(64, activation=\"relu\", name=\"my_first_layer\"))\n", + "model.add(layers.Dense(10, activation=\"softmax\", name=\"my_last_layer\"))\n", + "model.build((None, 3))\n", + "model.summary(line_length=80)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential()\n", + "model.add(keras.Input(shape=(3,)))\n", + "model.add(layers.Dense(64, activation=\"relu\"))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.summary(line_length=80)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.add(layers.Dense(10, activation=\"softmax\"))\n", + "model.summary(line_length=80)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The Functional API" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### A simple example" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(3,), name=\"my_input\")\n", + "features = layers.Dense(64, activation=\"relu\")(inputs)\n", + "outputs = layers.Dense(10, activation=\"softmax\")(features)\n", + "model = keras.Model(inputs=inputs, outputs=outputs, 
name=\"my_functional_model\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(3,), name=\"my_input\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs.dtype" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "features = layers.Dense(64, activation=\"relu\")(inputs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "features.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "outputs = layers.Dense(10, activation=\"softmax\")(features)\n", + "model = keras.Model(inputs=inputs, outputs=outputs, name=\"my_functional_model\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.summary(line_length=80)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Multi-input, multi-output models" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "vocabulary_size = 10000\n", + "num_tags = 100\n", + "num_departments = 4\n", + "\n", + "title = keras.Input(shape=(vocabulary_size,), name=\"title\")\n", + "text_body = keras.Input(shape=(vocabulary_size,), name=\"text_body\")\n", + "tags = keras.Input(shape=(num_tags,), name=\"tags\")\n", + "\n", + "features = layers.Concatenate()([title, text_body, tags])\n", + "features = layers.Dense(64, activation=\"relu\", name=\"dense_features\")(features)\n", + "\n", + "priority = layers.Dense(1, activation=\"sigmoid\", name=\"priority\")(features)\n", + "department = layers.Dense(\n", + " num_departments, activation=\"softmax\", name=\"department\"\n", + ")(features)\n", + "\n", + "model = keras.Model(\n", + " inputs=[title, text_body, tags],\n", + " outputs=[priority, department],\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Training a multi-input, multi-output model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "num_samples = 1280\n", + "\n", + "title_data = np.random.randint(0, 2, size=(num_samples, vocabulary_size))\n", + "text_body_data = np.random.randint(0, 2, size=(num_samples, vocabulary_size))\n", + "tags_data = np.random.randint(0, 2, size=(num_samples, num_tags))\n", + "\n", + "priority_data = np.random.random(size=(num_samples, 1))\n", + "department_data = np.random.randint(0, num_departments, size=(num_samples, 1))\n", + "\n", + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=[\"mean_squared_error\", \"sparse_categorical_crossentropy\"],\n", + " metrics=[[\"mean_absolute_error\"], [\"accuracy\"]],\n", + ")\n", + "model.fit(\n", + " [title_data, text_body_data, tags_data],\n", + " [priority_data, department_data],\n", + " epochs=1,\n", + ")\n", + "model.evaluate(\n", + " [title_data, text_body_data, 
tags_data], [priority_data, department_data]\n", + ")\n", + "priority_preds, department_preds = model.predict(\n", + " [title_data, text_body_data, tags_data]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss={\n", + " \"priority\": \"mean_squared_error\",\n", + " \"department\": \"sparse_categorical_crossentropy\",\n", + " },\n", + " metrics={\n", + " \"priority\": [\"mean_absolute_error\"],\n", + " \"department\": [\"accuracy\"],\n", + " },\n", + ")\n", + "model.fit(\n", + " {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data},\n", + " {\"priority\": priority_data, \"department\": department_data},\n", + " epochs=1,\n", + ")\n", + "model.evaluate(\n", + " {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data},\n", + " {\"priority\": priority_data, \"department\": department_data},\n", + ")\n", + "priority_preds, department_preds = model.predict(\n", + " {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data}\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### The power of the Functional API: Access to layer connectivity" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "###### Plotting layer connectivity" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "keras.utils.plot_model(model, \"ticket_classifier.png\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "keras.utils.plot_model(\n", + " model,\n", + " \"ticket_classifier_with_shape_info.png\",\n", + " show_shapes=True,\n", + " show_layer_names=True,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "###### Feature extraction with a Functional model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.layers" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.layers[3].input" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.layers[3].output" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "features = model.layers[4].output\n", + "difficulty = layers.Dense(3, activation=\"softmax\", name=\"difficulty\")(features)\n", + "\n", + "new_model = keras.Model(\n", + " inputs=[title, text_body, tags], outputs=[priority, department, difficulty]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "keras.utils.plot_model(\n", + " new_model,\n", + " \"updated_ticket_classifier.png\",\n", + " show_shapes=True,\n", + " show_layer_names=True,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Subclassing the Model class" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Rewriting our previous example as a subclassed 
model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "class CustomerTicketModel(keras.Model):\n", + " def __init__(self, num_departments):\n", + " super().__init__()\n", + " self.concat_layer = layers.Concatenate()\n", + " self.mixing_layer = layers.Dense(64, activation=\"relu\")\n", + " self.priority_scorer = layers.Dense(1, activation=\"sigmoid\")\n", + " self.department_classifier = layers.Dense(\n", + " num_departments, activation=\"softmax\"\n", + " )\n", + "\n", + " def call(self, inputs):\n", + " title = inputs[\"title\"]\n", + " text_body = inputs[\"text_body\"]\n", + " tags = inputs[\"tags\"]\n", + "\n", + " features = self.concat_layer([title, text_body, tags])\n", + " features = self.mixing_layer(features)\n", + " priority = self.priority_scorer(features)\n", + " department = self.department_classifier(features)\n", + " return priority, department" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = CustomerTicketModel(num_departments=4)\n", + "\n", + "priority, department = model(\n", + " {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data}\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=[\"mean_squared_error\", \"sparse_categorical_crossentropy\"],\n", + " metrics=[[\"mean_absolute_error\"], [\"accuracy\"]],\n", + ")\n", + "model.fit(\n", + " {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data},\n", + " [priority_data, department_data],\n", + " epochs=1,\n", + ")\n", + "model.evaluate(\n", + " {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data},\n", + " [priority_data, department_data],\n", + ")\n", + "priority_preds, department_preds = model.predict(\n", + " {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data}\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Beware: What subclassed models don't support" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Mixing and matching different components" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "class Classifier(keras.Model):\n", + " def __init__(self, num_classes=2):\n", + " super().__init__()\n", + " if num_classes == 2:\n", + " num_units = 1\n", + " activation = \"sigmoid\"\n", + " else:\n", + " num_units = num_classes\n", + " activation = \"softmax\"\n", + " self.dense = layers.Dense(num_units, activation=activation)\n", + "\n", + " def call(self, inputs):\n", + " return self.dense(inputs)\n", + "\n", + "inputs = keras.Input(shape=(3,))\n", + "features = layers.Dense(64, activation=\"relu\")(inputs)\n", + "outputs = Classifier(num_classes=10)(features)\n", + "model = keras.Model(inputs=inputs, outputs=outputs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(64,))\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(inputs)\n", + "binary_classifier = keras.Model(inputs=inputs, outputs=outputs)\n", + "\n", + "class MyModel(keras.Model):\n", + " def __init__(self, 
num_classes=2):\n", + " super().__init__()\n", + " self.dense = layers.Dense(64, activation=\"relu\")\n", + " self.classifier = binary_classifier\n", + "\n", + " def call(self, inputs):\n", + " features = self.dense(inputs)\n", + " return self.classifier(features)\n", + "\n", + "model = MyModel()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Remember: Use the right tool for the job" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Using built-in training and evaluation loops" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras.datasets import mnist\n", + "\n", + "def get_mnist_model():\n", + " inputs = keras.Input(shape=(28 * 28,))\n", + " features = layers.Dense(512, activation=\"relu\")(inputs)\n", + " features = layers.Dropout(0.5)(features)\n", + " outputs = layers.Dense(10, activation=\"softmax\")(features)\n", + " model = keras.Model(inputs, outputs)\n", + " return model\n", + "\n", + "(images, labels), (test_images, test_labels) = mnist.load_data()\n", + "images = images.reshape((60000, 28 * 28)).astype(\"float32\") / 255\n", + "test_images = test_images.reshape((10000, 28 * 28)).astype(\"float32\") / 255\n", + "train_images, val_images = images[10000:], images[:10000]\n", + "train_labels, val_labels = labels[10000:], labels[:10000]\n", + "\n", + "model = get_mnist_model()\n", + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "model.fit(\n", + " train_images,\n", + " train_labels,\n", + " epochs=3,\n", + " validation_data=(val_images, val_labels),\n", + ")\n", + "test_metrics = model.evaluate(test_images, test_labels)\n", + "predictions = model.predict(test_images)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Writing your own metrics" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras import ops\n", + "\n", + "class RootMeanSquaredError(keras.metrics.Metric):\n", + " def __init__(self, name=\"rmse\", **kwargs):\n", + " super().__init__(name=name, **kwargs)\n", + " self.mse_sum = self.add_weight(name=\"mse_sum\", initializer=\"zeros\")\n", + " self.total_samples = self.add_weight(\n", + " name=\"total_samples\", initializer=\"zeros\"\n", + " )\n", + "\n", + " def update_state(self, y_true, y_pred, sample_weight=None):\n", + " y_true = ops.one_hot(y_true, num_classes=ops.shape(y_pred)[1])\n", + " mse = ops.sum(ops.square(y_true - y_pred))\n", + " self.mse_sum.assign_add(mse)\n", + " num_samples = ops.shape(y_pred)[0]\n", + " self.total_samples.assign_add(num_samples)\n", + "\n", + " def result(self):\n", + " return ops.sqrt(self.mse_sum / self.total_samples)\n", + "\n", + " def reset_state(self):\n", + " self.mse_sum.assign(0.)\n", + " self.total_samples.assign(0.)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = get_mnist_model()\n", + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\", RootMeanSquaredError()],\n", + ")\n", + "model.fit(\n", + " train_images,\n", + " train_labels,\n", + " epochs=3,\n", + " validation_data=(val_images, val_labels),\n", + 
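" # Training logs will also report rmse and val_rmse from the custom metric.\n", + 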
")\n", + "test_metrics = model.evaluate(test_images, test_labels)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Using callbacks" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### The EarlyStopping and ModelCheckpoint callbacks" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "callbacks_list = [\n", + " keras.callbacks.EarlyStopping(\n", + " monitor=\"accuracy\",\n", + " patience=1,\n", + " ),\n", + " keras.callbacks.ModelCheckpoint(\n", + " filepath=\"checkpoint_path.keras\",\n", + " monitor=\"val_loss\",\n", + " save_best_only=True,\n", + " ),\n", + "]\n", + "model = get_mnist_model()\n", + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "model.fit(\n", + " train_images,\n", + " train_labels,\n", + " epochs=10,\n", + " callbacks=callbacks_list,\n", + " validation_data=(val_images, val_labels),\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.models.load_model(\"checkpoint_path.keras\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Writing your own callbacks" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from matplotlib import pyplot as plt\n", + "\n", + "class LossHistory(keras.callbacks.Callback):\n", + " def on_train_begin(self, logs):\n", + " self.per_batch_losses = []\n", + "\n", + " def on_batch_end(self, batch, logs):\n", + " self.per_batch_losses.append(logs.get(\"loss\"))\n", + "\n", + " def on_epoch_end(self, epoch, logs):\n", + " plt.clf()\n", + " plt.plot(\n", + " range(len(self.per_batch_losses)),\n", + " self.per_batch_losses,\n", + " label=\"Training loss for each batch\",\n", + " )\n", + " plt.xlabel(f\"Batch (epoch {epoch})\")\n", + " plt.ylabel(\"Loss\")\n", + " plt.legend()\n", + " plt.savefig(f\"plot_at_epoch_{epoch}\", dpi=300)\n", + " self.per_batch_losses = []" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = get_mnist_model()\n", + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "model.fit(\n", + " train_images,\n", + " train_labels,\n", + " epochs=10,\n", + " callbacks=[LossHistory()],\n", + " validation_data=(val_images, val_labels),\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Monitoring and visualization with TensorBoard" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = get_mnist_model()\n", + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "\n", + "tensorboard = keras.callbacks.TensorBoard(\n", + " log_dir=\"/full_path_to_your_log_dir\",\n", + ")\n", + "model.fit(\n", + " train_images,\n", + " train_labels,\n", + " epochs=10,\n", + " validation_data=(val_images, val_labels),\n", + " callbacks=[tensorboard],\n", + ")" + ] + }, + { + "cell_type": "code", + 
"execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%load_ext tensorboard\n", + "%tensorboard --logdir /full_path_to_your_log_dir" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Writing your own training and evaluation loops" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Training vs. inference" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Writing custom training step functions" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### A TensorFlow training step function" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend tensorflow\n", + "import tensorflow as tf\n", + "\n", + "model = get_mnist_model()\n", + "loss_fn = keras.losses.SparseCategoricalCrossentropy()\n", + "optimizer = keras.optimizers.Adam()\n", + "\n", + "def train_step(inputs, targets):\n", + " with tf.GradientTape() as tape:\n", + " predictions = model(inputs, training=True)\n", + " loss = loss_fn(targets, predictions)\n", + " gradients = tape.gradient(loss, model.trainable_weights)\n", + " optimizer.apply(gradients, model.trainable_weights)\n", + " return loss" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend tensorflow\n", + "batch_size = 32\n", + "inputs = train_images[:batch_size]\n", + "targets = train_labels[:batch_size]\n", + "loss = train_step(inputs, targets)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### A PyTorch training step function" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend torch\n", + "import torch\n", + "\n", + "model = get_mnist_model()\n", + "loss_fn = keras.losses.SparseCategoricalCrossentropy()\n", + "optimizer = keras.optimizers.Adam()\n", + "\n", + "def train_step(inputs, targets):\n", + " predictions = model(inputs, training=True)\n", + " loss = loss_fn(targets, predictions)\n", + " loss.backward()\n", + " gradients = [weight.value.grad for weight in model.trainable_weights]\n", + " with torch.no_grad():\n", + " optimizer.apply(gradients, model.trainable_weights)\n", + " model.zero_grad()\n", + " return loss" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend torch\n", + "batch_size = 32\n", + "inputs = train_images[:batch_size]\n", + "targets = train_labels[:batch_size]\n", + "loss = train_step(inputs, targets)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### A JAX training step function" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend jax\n", + "model = get_mnist_model()\n", + "loss_fn = keras.losses.SparseCategoricalCrossentropy()\n", + "\n", + "def compute_loss_and_updates(\n", + " trainable_variables, non_trainable_variables, inputs, targets\n", + "):\n", + " outputs, non_trainable_variables = model.stateless_call(\n", + " trainable_variables, non_trainable_variables, inputs, training=True\n", + " )\n", + " loss = 
loss_fn(targets, outputs)\n", + " return loss, non_trainable_variables" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend jax\n", + "import jax\n", + "\n", + "grad_fn = jax.value_and_grad(compute_loss_and_updates, has_aux=True)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend jax\n", + "optimizer = keras.optimizers.Adam()\n", + "optimizer.build(model.trainable_variables)\n", + "\n", + "def train_step(state, inputs, targets):\n", + " (trainable_variables, non_trainable_variables, optimizer_variables) = state\n", + " (loss, non_trainable_variables), grads = grad_fn(\n", + " trainable_variables, non_trainable_variables, inputs, targets\n", + " )\n", + " trainable_variables, optimizer_variables = optimizer.stateless_apply(\n", + " optimizer_variables, grads, trainable_variables\n", + " )\n", + " return loss, (\n", + " trainable_variables,\n", + " non_trainable_variables,\n", + " optimizer_variables,\n", + " )" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend jax\n", + "batch_size = 32\n", + "inputs = train_images[:batch_size]\n", + "targets = train_labels[:batch_size]\n", + "\n", + "trainable_variables = [v.value for v in model.trainable_variables]\n", + "non_trainable_variables = [v.value for v in model.non_trainable_variables]\n", + "optimizer_variables = [v.value for v in optimizer.variables]\n", + "\n", + "state = (trainable_variables, non_trainable_variables, optimizer_variables)\n", + "loss, state = train_step(state, inputs, targets)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Low-level usage of metrics" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras import ops\n", + "\n", + "metric = keras.metrics.SparseCategoricalAccuracy()\n", + "targets = ops.array([0, 1, 2])\n", + "predictions = ops.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])\n", + "metric.update_state(targets, predictions)\n", + "current_result = metric.result()\n", + "print(f\"result: {current_result:.2f}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "values = ops.array([0, 1, 2, 3, 4])\n", + "mean_tracker = keras.metrics.Mean()\n", + "for value in values:\n", + " mean_tracker.update_state(value)\n", + "print(f\"Mean of values: {mean_tracker.result():.2f}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "metric = keras.metrics.SparseCategoricalAccuracy()\n", + "targets = ops.array([0, 1, 2])\n", + "predictions = ops.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])\n", + "\n", + "metric_variables = metric.variables\n", + "metric_variables = metric.stateless_update_state(\n", + " metric_variables, targets, predictions\n", + ")\n", + "current_result = metric.stateless_result(metric_variables)\n", + "print(f\"result: {current_result:.2f}\")\n", + "\n", + "metric_variables = metric.stateless_reset_state()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Using fit() with a custom training loop" + ] + }, + { + "cell_type": "markdown", + "metadata": { + 
"colab_type": "text" + }, + "source": [ + "##### Customizing fit() with TensorFlow" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend tensorflow\n", + "import keras\n", + "from keras import layers\n", + "\n", + "loss_fn = keras.losses.SparseCategoricalCrossentropy()\n", + "loss_tracker = keras.metrics.Mean(name=\"loss\")\n", + "\n", + "class CustomModel(keras.Model):\n", + " def train_step(self, data):\n", + " inputs, targets = data\n", + " with tf.GradientTape() as tape:\n", + " predictions = self(inputs, training=True)\n", + " loss = loss_fn(targets, predictions)\n", + " gradients = tape.gradient(loss, self.trainable_weights)\n", + " self.optimizer.apply(gradients, self.trainable_weights)\n", + "\n", + " loss_tracker.update_state(loss)\n", + " return {\"loss\": loss_tracker.result()}\n", + "\n", + " @property\n", + " def metrics(self):\n", + " return [loss_tracker]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend tensorflow\n", + "def get_custom_model():\n", + " inputs = keras.Input(shape=(28 * 28,))\n", + " features = layers.Dense(512, activation=\"relu\")(inputs)\n", + " features = layers.Dropout(0.5)(features)\n", + " outputs = layers.Dense(10, activation=\"softmax\")(features)\n", + " model = CustomModel(inputs, outputs)\n", + " model.compile(optimizer=keras.optimizers.Adam())\n", + " return model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend tensorflow\n", + "model = get_custom_model()\n", + "model.fit(train_images, train_labels, epochs=3)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Customizing fit() with PyTorch" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend torch\n", + "import keras\n", + "from keras import layers\n", + "\n", + "loss_fn = keras.losses.SparseCategoricalCrossentropy()\n", + "loss_tracker = keras.metrics.Mean(name=\"loss\")\n", + "\n", + "class CustomModel(keras.Model):\n", + " def train_step(self, data):\n", + " inputs, targets = data\n", + " predictions = self(inputs, training=True)\n", + " loss = loss_fn(targets, predictions)\n", + "\n", + " loss.backward()\n", + " trainable_weights = [v for v in self.trainable_weights]\n", + " gradients = [v.value.grad for v in trainable_weights]\n", + "\n", + " with torch.no_grad():\n", + " self.optimizer.apply(gradients, trainable_weights)\n", + "\n", + " loss_tracker.update_state(loss)\n", + " return {\"loss\": loss_tracker.result()}\n", + "\n", + " @property\n", + " def metrics(self):\n", + " return [loss_tracker]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend torch\n", + "def get_custom_model():\n", + " inputs = keras.Input(shape=(28 * 28,))\n", + " features = layers.Dense(512, activation=\"relu\")(inputs)\n", + " features = layers.Dropout(0.5)(features)\n", + " outputs = layers.Dense(10, activation=\"softmax\")(features)\n", + " model = CustomModel(inputs, outputs)\n", + " model.compile(optimizer=keras.optimizers.Adam())\n", + " return model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + 
"source": [ + "%%backend torch\n", + "model = get_custom_model()\n", + "model.fit(train_images, train_labels, epochs=3)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Customizing fit() with JAX" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend jax\n", + "import keras\n", + "from keras import layers\n", + "\n", + "loss_fn = keras.losses.SparseCategoricalCrossentropy()\n", + "\n", + "class CustomModel(keras.Model):\n", + " def compute_loss_and_updates(\n", + " self,\n", + " trainable_variables,\n", + " non_trainable_variables,\n", + " inputs,\n", + " targets,\n", + " training=False,\n", + " ):\n", + " predictions, non_trainable_variables = self.stateless_call(\n", + " trainable_variables,\n", + " non_trainable_variables,\n", + " inputs,\n", + " training=training,\n", + " )\n", + " loss = loss_fn(targets, predictions)\n", + " return loss, non_trainable_variables\n", + "\n", + " def train_step(self, state, data):\n", + " (\n", + " trainable_variables,\n", + " non_trainable_variables,\n", + " optimizer_variables,\n", + " metrics_variables,\n", + " ) = state\n", + " inputs, targets = data\n", + "\n", + " grad_fn = jax.value_and_grad(\n", + " self.compute_loss_and_updates, has_aux=True\n", + " )\n", + "\n", + " (loss, non_trainable_variables), grads = grad_fn(\n", + " trainable_variables,\n", + " non_trainable_variables,\n", + " inputs,\n", + " targets,\n", + " training=True,\n", + " )\n", + "\n", + " (\n", + " trainable_variables,\n", + " optimizer_variables,\n", + " ) = self.optimizer.stateless_apply(\n", + " optimizer_variables, grads, trainable_variables\n", + " )\n", + "\n", + " logs = {\"loss\": loss}\n", + " state = (\n", + " trainable_variables,\n", + " non_trainable_variables,\n", + " optimizer_variables,\n", + " metrics_variables,\n", + " )\n", + " return logs, state" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend jax\n", + "def get_custom_model():\n", + " inputs = keras.Input(shape=(28 * 28,))\n", + " features = layers.Dense(512, activation=\"relu\")(inputs)\n", + " features = layers.Dropout(0.5)(features)\n", + " outputs = layers.Dense(10, activation=\"softmax\")(features)\n", + " model = CustomModel(inputs, outputs)\n", + " model.compile(optimizer=keras.optimizers.Adam())\n", + " return model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend jax\n", + "model = get_custom_model()\n", + "model.fit(train_images, train_labels, epochs=3)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Handling metrics in a custom train_step()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### train_step() metrics handling with TensorFlow" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend tensorflow\n", + "import keras\n", + "from keras import layers\n", + "\n", + "class CustomModel(keras.Model):\n", + " def train_step(self, data):\n", + " inputs, targets = data\n", + " with tf.GradientTape() as tape:\n", + " predictions = self(inputs, training=True)\n", + " loss = self.compute_loss(y=targets, y_pred=predictions)\n", + "\n", + " gradients = 
tape.gradient(loss, self.trainable_weights)\n", + " self.optimizer.apply(gradients, self.trainable_weights)\n", + "\n", + " for metric in self.metrics:\n", + " if metric.name == \"loss\":\n", + " metric.update_state(loss)\n", + " else:\n", + " metric.update_state(targets, predictions)\n", + "\n", + " return {m.name: m.result() for m in self.metrics}" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend tensorflow\n", + "def get_custom_model():\n", + " inputs = keras.Input(shape=(28 * 28,))\n", + " features = layers.Dense(512, activation=\"relu\")(inputs)\n", + " features = layers.Dropout(0.5)(features)\n", + " outputs = layers.Dense(10, activation=\"softmax\")(features)\n", + " model = CustomModel(inputs, outputs)\n", + " model.compile(\n", + " optimizer=keras.optimizers.Adam(),\n", + " loss=keras.losses.SparseCategoricalCrossentropy(),\n", + " metrics=[keras.metrics.SparseCategoricalAccuracy()],\n", + " )\n", + " return model\n", + "\n", + "model = get_custom_model()\n", + "model.fit(train_images, train_labels, epochs=3)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### train_step() metrics handling with PyTorch" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend torch\n", + "import keras\n", + "from keras import layers\n", + "\n", + "class CustomModel(keras.Model):\n", + " def train_step(self, data):\n", + " inputs, targets = data\n", + " predictions = self(inputs, training=True)\n", + " loss = self.compute_loss(y=targets, y_pred=predictions)\n", + "\n", + " loss.backward()\n", + " trainable_weights = [v for v in self.trainable_weights]\n", + " gradients = [v.value.grad for v in trainable_weights]\n", + "\n", + " with torch.no_grad():\n", + " self.optimizer.apply(gradients, trainable_weights)\n", + "\n", + " for metric in self.metrics:\n", + " if metric.name == \"loss\":\n", + " metric.update_state(loss)\n", + " else:\n", + " metric.update_state(targets, predictions)\n", + "\n", + " return {m.name: m.result() for m in self.metrics}" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend torch\n", + "def get_custom_model():\n", + " inputs = keras.Input(shape=(28 * 28,))\n", + " features = layers.Dense(512, activation=\"relu\")(inputs)\n", + " features = layers.Dropout(0.5)(features)\n", + " outputs = layers.Dense(10, activation=\"softmax\")(features)\n", + " model = CustomModel(inputs, outputs)\n", + " model.compile(\n", + " optimizer=keras.optimizers.Adam(),\n", + " loss=keras.losses.SparseCategoricalCrossentropy(),\n", + " metrics=[keras.metrics.SparseCategoricalAccuracy()],\n", + " )\n", + " return model\n", + "\n", + "model = get_custom_model()\n", + "model.fit(train_images, train_labels, epochs=3)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### train_step() metrics handling with JAX" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend jax\n", + "import keras\n", + "from keras import layers\n", + "\n", + "class CustomModel(keras.Model):\n", + " def compute_loss_and_updates(\n", + " self,\n", + " trainable_variables,\n", + " non_trainable_variables,\n", + " inputs,\n", + " targets,\n", + " 
training=False,\n", + " ):\n", + " predictions, non_trainable_variables = self.stateless_call(\n", + " trainable_variables,\n", + " non_trainable_variables,\n", + " inputs,\n", + " training=training,\n", + " )\n", + " loss = self.compute_loss(y=targets, y_pred=predictions)\n", + " return loss, (predictions, non_trainable_variables)\n", + "\n", + " def train_step(self, state, data):\n", + " (\n", + " trainable_variables,\n", + " non_trainable_variables,\n", + " optimizer_variables,\n", + " metrics_variables,\n", + " ) = state\n", + " inputs, targets = data\n", + "\n", + " grad_fn = jax.value_and_grad(\n", + " self.compute_loss_and_updates, has_aux=True\n", + " )\n", + "\n", + " (loss, (predictions, non_trainable_variables)), grads = grad_fn(\n", + " trainable_variables,\n", + " non_trainable_variables,\n", + " inputs,\n", + " targets,\n", + " training=True,\n", + " )\n", + " (\n", + " trainable_variables,\n", + " optimizer_variables,\n", + " ) = self.optimizer.stateless_apply(\n", + " optimizer_variables, grads, trainable_variables\n", + " )\n", + "\n", + " new_metrics_vars = []\n", + " logs = {}\n", + " for metric in self.metrics:\n", + " num_prev = len(new_metrics_vars)\n", + " num_current = len(metric.variables)\n", + " current_vars = metrics_variables[num_prev : num_prev + num_current]\n", + " if metric.name == \"loss\":\n", + " current_vars = metric.stateless_update_state(current_vars, loss)\n", + " else:\n", + " current_vars = metric.stateless_update_state(\n", + " current_vars, targets, predictions\n", + " )\n", + " logs[metric.name] = metric.stateless_result(current_vars)\n", + " new_metrics_vars += current_vars\n", + "\n", + " state = (\n", + " trainable_variables,\n", + " non_trainable_variables,\n", + " optimizer_variables,\n", + " new_metrics_vars,\n", + " )\n", + " return logs, state" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [], + "name": "chapter07_deep-dive-keras", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/chapter08_image-classification.ipynb b/chapter08_image-classification.ipynb new file mode 100644 index 0000000000..3ce065e2b3 --- /dev/null +++ b/chapter08_image-classification.ipynb @@ -0,0 +1,1030 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Image classification" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Introduction to convnets" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "from keras import layers\n", + "\n", + "inputs = keras.Input(shape=(28, 28, 1))\n", + "x = layers.Conv2D(filters=64, kernel_size=3, activation=\"relu\")(inputs)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=128, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=256, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.GlobalAveragePooling2D()(x)\n", + "outputs = layers.Dense(10, activation=\"softmax\")(x)\n", + "model = keras.Model(inputs=inputs, outputs=outputs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.summary(line_length=80)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras.datasets import mnist\n", + "\n", + "(train_images, train_labels), (test_images, test_labels) = mnist.load_data()\n", + "train_images = train_images.reshape((60000, 28, 28, 1))\n", + "train_images = train_images.astype(\"float32\") / 255\n", + "test_images = test_images.reshape((10000, 28, 28, 1))\n", + "test_images = test_images.astype(\"float32\") / 255\n", + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "model.fit(train_images, train_labels, epochs=5, batch_size=64)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_loss, test_acc = model.evaluate(test_images, test_labels)\n", + "print(f\"Test accuracy: {test_acc:.3f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The convolution operation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Understanding border effects and padding" + ] + }, + { + "cell_type": "markdown", + "metadata": { + 
"colab_type": "text" + }, + "source": [ + "##### Understanding convolution strides" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The max-pooling operation" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(28, 28, 1))\n", + "x = layers.Conv2D(filters=64, kernel_size=3, activation=\"relu\")(inputs)\n", + "x = layers.Conv2D(filters=128, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.Conv2D(filters=256, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.GlobalAveragePooling2D()(x)\n", + "outputs = layers.Dense(10, activation=\"softmax\")(x)\n", + "model_no_max_pool = keras.Model(inputs=inputs, outputs=outputs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model_no_max_pool.summary(line_length=80)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Training a convnet from scratch on a small dataset" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The relevance of deep learning for small-data problems" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Downloading the data" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import kagglehub\n", + "\n", + "kagglehub.login()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "download_path = kagglehub.competition_download(\"dogs-vs-cats\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import zipfile\n", + "\n", + "with zipfile.ZipFile(download_path + \"/train.zip\", \"r\") as zip_ref:\n", + " zip_ref.extractall(\".\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os, shutil, pathlib\n", + "\n", + "original_dir = pathlib.Path(\"train\")\n", + "new_base_dir = pathlib.Path(\"dogs_vs_cats_small\")\n", + "\n", + "def make_subset(subset_name, start_index, end_index):\n", + " for category in (\"cat\", \"dog\"):\n", + " dir = new_base_dir / subset_name / category\n", + " os.makedirs(dir)\n", + " fnames = [f\"{category}.{i}.jpg\" for i in range(start_index, end_index)]\n", + " for fname in fnames:\n", + " shutil.copyfile(src=original_dir / fname, dst=dir / fname)\n", + "\n", + "make_subset(\"train\", start_index=0, end_index=1000)\n", + "make_subset(\"validation\", start_index=1000, end_index=1500)\n", + "make_subset(\"test\", start_index=1500, end_index=2500)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Building your model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "from keras import layers\n", + "\n", + "inputs = keras.Input(shape=(180, 180, 3))\n", + "x = layers.Rescaling(1.0 / 255)(inputs)\n", + "x = layers.Conv2D(filters=32, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=64, kernel_size=3, 
activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=128, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=256, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=512, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.GlobalAveragePooling2D()(x)\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs=inputs, outputs=outputs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.summary(line_length=80)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(\n", + " loss=\"binary_crossentropy\",\n", + " optimizer=\"adam\",\n", + " metrics=[\"accuracy\"],\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Data preprocessing" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras.utils import image_dataset_from_directory\n", + "\n", + "batch_size = 64\n", + "image_size = (180, 180)\n", + "train_dataset = image_dataset_from_directory(\n", + " new_base_dir / \"train\", image_size=image_size, batch_size=batch_size\n", + ")\n", + "validation_dataset = image_dataset_from_directory(\n", + " new_base_dir / \"validation\", image_size=image_size, batch_size=batch_size\n", + ")\n", + "test_dataset = image_dataset_from_directory(\n", + " new_base_dir / \"test\", image_size=image_size, batch_size=batch_size\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Understanding TensorFlow Dataset objects" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "import tensorflow as tf\n", + "\n", + "random_numbers = np.random.normal(size=(1000, 16))\n", + "dataset = tf.data.Dataset.from_tensor_slices(random_numbers)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "for i, element in enumerate(dataset):\n", + " print(element.shape)\n", + " if i >= 2:\n", + " break" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "batched_dataset = dataset.batch(32)\n", + "for i, element in enumerate(batched_dataset):\n", + " print(element.shape)\n", + " if i >= 2:\n", + " break" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "reshaped_dataset = dataset.map(\n", + " lambda x: tf.reshape(x, (4, 4)),\n", + " num_parallel_calls=8)\n", + "for i, element in enumerate(reshaped_dataset):\n", + " print(element.shape)\n", + " if i >= 2:\n", + " break" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Fitting the model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "for data_batch, labels_batch in train_dataset:\n", + " print(\"data batch shape:\", data_batch.shape)\n", + " print(\"labels 
batch shape:\", labels_batch.shape)\n", + " break" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\n", + " filepath=\"convnet_from_scratch.keras\",\n", + " save_best_only=True,\n", + " monitor=\"val_loss\",\n", + " )\n", + "]\n", + "history = model.fit(\n", + " train_dataset,\n", + " epochs=50,\n", + " validation_data=validation_dataset,\n", + " callbacks=callbacks,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "accuracy = history.history[\"accuracy\"]\n", + "val_accuracy = history.history[\"val_accuracy\"]\n", + "loss = history.history[\"loss\"]\n", + "val_loss = history.history[\"val_loss\"]\n", + "epochs = range(1, len(accuracy) + 1)\n", + "\n", + "plt.plot(epochs, accuracy, \"r--\", label=\"Training accuracy\")\n", + "plt.plot(epochs, val_accuracy, \"b\", label=\"Validation accuracy\")\n", + "plt.title(\"Training and validation accuracy\")\n", + "plt.legend()\n", + "plt.figure()\n", + "\n", + "plt.plot(epochs, loss, \"r--\", label=\"Training loss\")\n", + "plt.plot(epochs, val_loss, \"b\", label=\"Validation loss\")\n", + "plt.title(\"Training and validation loss\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_model = keras.models.load_model(\"convnet_from_scratch.keras\")\n", + "test_loss, test_acc = test_model.evaluate(test_dataset)\n", + "print(f\"Test accuracy: {test_acc:.3f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Using data augmentation" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "data_augmentation_layers = [\n", + " layers.RandomFlip(\"horizontal\"),\n", + " layers.RandomRotation(0.1),\n", + " layers.RandomZoom(0.2),\n", + "]\n", + "\n", + "def data_augmentation(images, targets):\n", + " for layer in data_augmentation_layers:\n", + " images = layer(images)\n", + " return images, targets\n", + "\n", + "augmented_train_dataset = train_dataset.map(\n", + " data_augmentation, num_parallel_calls=8\n", + ")\n", + "augmented_train_dataset = augmented_train_dataset.prefetch(tf.data.AUTOTUNE)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "plt.figure(figsize=(10, 10))\n", + "for image_batch, _ in train_dataset.take(1):\n", + " image = image_batch[0]\n", + " for i in range(9):\n", + " ax = plt.subplot(3, 3, i + 1)\n", + " augmented_image, _ = data_augmentation(image, None)\n", + " augmented_image = keras.ops.convert_to_numpy(augmented_image)\n", + " plt.imshow(augmented_image.astype(\"uint8\"))\n", + " plt.axis(\"off\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(180, 180, 3))\n", + "x = layers.Rescaling(1.0 / 255)(inputs)\n", + "x = layers.Conv2D(filters=32, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=64, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = 
layers.Conv2D(filters=128, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=256, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=512, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.GlobalAveragePooling2D()(x)\n", + "x = layers.Dropout(0.25)(x)\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs=inputs, outputs=outputs)\n", + "\n", + "model.compile(\n", + " loss=\"binary_crossentropy\",\n", + " optimizer=\"adam\",\n", + " metrics=[\"accuracy\"],\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\n", + " filepath=\"convnet_from_scratch_with_augmentation.keras\",\n", + " save_best_only=True,\n", + " monitor=\"val_loss\",\n", + " )\n", + "]\n", + "history = model.fit(\n", + " augmented_train_dataset,\n", + " epochs=100,\n", + " validation_data=validation_dataset,\n", + " callbacks=callbacks,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_model = keras.models.load_model(\n", + " \"convnet_from_scratch_with_augmentation.keras\"\n", + ")\n", + "test_loss, test_acc = test_model.evaluate(test_dataset)\n", + "print(f\"Test accuracy: {test_acc:.3f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Using a pretrained model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Feature extraction with a pretrained model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras_hub\n", + "\n", + "conv_base = keras_hub.models.Backbone.from_preset(\"xception_41_imagenet\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "preprocessor = keras_hub.layers.ImageConverter.from_preset(\n", + " \"xception_41_imagenet\",\n", + " image_size=(180, 180),\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Fast feature extraction without data augmentation" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def get_features_and_labels(dataset):\n", + " all_features = []\n", + " all_labels = []\n", + " for images, labels in dataset:\n", + " preprocessed_images = preprocessor(images)\n", + " features = conv_base.predict(preprocessed_images, verbose=0)\n", + " all_features.append(features)\n", + " all_labels.append(labels)\n", + " return np.concatenate(all_features), np.concatenate(all_labels)\n", + "\n", + "train_features, train_labels = get_features_and_labels(train_dataset)\n", + "val_features, val_labels = get_features_and_labels(validation_dataset)\n", + "test_features, test_labels = get_features_and_labels(test_dataset)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_features.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = 
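Why do the extracted features have shape `(samples, 6, 6, 2048)`? Xception's convolutional base reduces spatial resolution by a factor of roughly 32 overall (a property of the architecture, stated here as background rather than taken from these cells), so the 180x180 inputs come out as 6x6 maps with 2048 channels:

```python
import math

# 180 / 32 = 5.625, padded up to 6 in the final feature map.
print(math.ceil(180 / 32))  # 6
```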
keras.Input(shape=(6, 6, 2048))\n", + "x = layers.GlobalAveragePooling2D()(inputs)\n", + "x = layers.Dense(256, activation=\"relu\")(x)\n", + "x = layers.Dropout(0.25)(x)\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs, outputs)\n", + "model.compile(\n", + " loss=\"binary_crossentropy\",\n", + " optimizer=\"adam\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\n", + " filepath=\"feature_extraction.keras\",\n", + " save_best_only=True,\n", + " monitor=\"val_loss\",\n", + " )\n", + "]\n", + "history = model.fit(\n", + " train_features,\n", + " train_labels,\n", + " epochs=10,\n", + " validation_data=(val_features, val_labels),\n", + " callbacks=callbacks,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "acc = history.history[\"accuracy\"]\n", + "val_acc = history.history[\"val_accuracy\"]\n", + "loss = history.history[\"loss\"]\n", + "val_loss = history.history[\"val_loss\"]\n", + "epochs = range(1, len(acc) + 1)\n", + "plt.plot(epochs, acc, \"r--\", label=\"Training accuracy\")\n", + "plt.plot(epochs, val_acc, \"b\", label=\"Validation accuracy\")\n", + "plt.title(\"Training and validation accuracy\")\n", + "plt.legend()\n", + "plt.figure()\n", + "plt.plot(epochs, loss, \"r--\", label=\"Training loss\")\n", + "plt.plot(epochs, val_loss, \"b\", label=\"Validation loss\")\n", + "plt.title(\"Training and validation loss\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_model = keras.models.load_model(\"feature_extraction.keras\")\n", + "test_loss, test_acc = test_model.evaluate(test_features, test_labels)\n", + "print(f\"Test accuracy: {test_acc:.3f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Feature extraction together with data augmentation" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras_hub\n", + "\n", + "conv_base = keras_hub.models.Backbone.from_preset(\n", + " \"xception_41_imagenet\",\n", + " trainable=False,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "conv_base.trainable = True\n", + "len(conv_base.trainable_weights)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "conv_base.trainable = False\n", + "len(conv_base.trainable_weights)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(180, 180, 3))\n", + "x = preprocessor(inputs)\n", + "x = conv_base(x)\n", + "x = layers.GlobalAveragePooling2D()(x)\n", + "x = layers.Dense(256)(x)\n", + "x = layers.Dropout(0.25)(x)\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs, outputs)\n", + "model.compile(\n", + " loss=\"binary_crossentropy\",\n", + " optimizer=\"adam\",\n", + " metrics=[\"accuracy\"],\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + 
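The two `len(conv_base.trainable_weights)` cells above illustrate Keras's freezing semantics. A self-contained sketch of the same behavior on a toy layer (my own example, not the book's): setting `trainable = False` empties `trainable_weights`, so the optimizer leaves those parameters untouched; set the flag before `compile()`, or recompile afterwards. Fine-tuning later reverses this, re-enabling training with a much lower learning rate.

```python
import keras
from keras import layers

layer = layers.Dense(4)
layer.build((None, 2))               # create the kernel and bias
print(len(layer.trainable_weights))  # 2
layer.trainable = False              # freeze: weights kept, never updated
print(len(layer.trainable_weights))  # 0
```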
"callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\n", + " filepath=\"feature_extraction_with_data_augmentation.keras\",\n", + " save_best_only=True,\n", + " monitor=\"val_loss\",\n", + " )\n", + "]\n", + "history = model.fit(\n", + " augmented_train_dataset,\n", + " epochs=30,\n", + " validation_data=validation_dataset,\n", + " callbacks=callbacks,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_model = keras.models.load_model(\n", + " \"feature_extraction_with_data_augmentation.keras\"\n", + ")\n", + "test_loss, test_acc = test_model.evaluate(test_dataset)\n", + "print(f\"Test accuracy: {test_acc:.3f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Fine-tuning a pretrained model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(\n", + " loss=\"binary_crossentropy\",\n", + " optimizer=keras.optimizers.Adam(learning_rate=1e-5),\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\n", + " filepath=\"fine_tuning.keras\",\n", + " save_best_only=True,\n", + " monitor=\"val_loss\",\n", + " )\n", + "]\n", + "history = model.fit(\n", + " augmented_train_dataset,\n", + " epochs=30,\n", + " validation_data=validation_dataset,\n", + " callbacks=callbacks,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.models.load_model(\"fine_tuning.keras\")\n", + "test_loss, test_acc = model.evaluate(test_dataset)\n", + "print(f\"Test accuracy: {test_acc:.3f}\")" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [], + "name": "chapter08_image-classification", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/chapter09_convnet-architecture-patterns.ipynb b/chapter09_convnet-architecture-patterns.ipynb new file mode 100644 index 0000000000..218546a3c6 --- /dev/null +++ b/chapter09_convnet-architecture-patterns.ipynb @@ -0,0 +1,381 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Convnet architecture patterns" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Modularity, hierarchy, and reuse" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Residual connections" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "from keras import layers\n", + "\n", + "inputs = keras.Input(shape=(32, 32, 3))\n", + "x = layers.Conv2D(32, 3, activation=\"relu\")(inputs)\n", + "residual = x\n", + "x = layers.Conv2D(64, 3, activation=\"relu\", padding=\"same\")(x)\n", + "residual = layers.Conv2D(64, 1)(residual)\n", + "x = layers.add([x, residual])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(32, 32, 3))\n", + "x = layers.Conv2D(32, 3, activation=\"relu\")(inputs)\n", + "residual = x\n", + "x = layers.Conv2D(64, 3, activation=\"relu\", padding=\"same\")(x)\n", + "x = layers.MaxPooling2D(2, padding=\"same\")(x)\n", + "residual = layers.Conv2D(64, 1, strides=2)(residual)\n", + "x = layers.add([x, residual])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(32, 32, 3))\n", + "x = layers.Rescaling(1.0 / 255)(inputs)\n", + "\n", + "def residual_block(x, filters, pooling=False):\n", + " residual = x\n", + " x = layers.Conv2D(filters, 3, activation=\"relu\", padding=\"same\")(x)\n", + " x = layers.Conv2D(filters, 3, activation=\"relu\", padding=\"same\")(x)\n", + " if pooling:\n", + " x = layers.MaxPooling2D(2, padding=\"same\")(x)\n", + " residual = layers.Conv2D(filters, 1, strides=2)(residual)\n", + " elif filters != residual.shape[-1]:\n", + " residual = layers.Conv2D(filters, 1)(residual)\n", + " x = layers.add([x, residual])\n", + " return x\n", + "\n", + "x = residual_block(x, filters=32, pooling=True)\n", + "x = residual_block(x, filters=64, pooling=True)\n", + "x = residual_block(x, filters=128, pooling=False)\n", + "\n", + "x = layers.GlobalAveragePooling2D()(x)\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs=inputs, 
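The residual cells above apply a 1x1 `Conv2D` on the shortcut path. The reason is purely shape matching: `layers.add()` needs identical tensor shapes, so when the main path changes the channel count or downsamples, the residual must be projected (and strided) to match. A minimal sketch, separate from the book's code:

```python
import keras
from keras import layers

x = keras.Input(shape=(32, 32, 3))
main = layers.Conv2D(64, 3, strides=2, padding="same")(x)  # (16, 16, 64)
# A 1x1 convolution with the same stride reshapes the shortcut to match:
skip = layers.Conv2D(64, 1, strides=2)(x)                  # (16, 16, 64)
out = layers.add([main, skip])
print(out.shape)  # (None, 16, 16, 64)
```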
outputs=outputs)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Batch normalization" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Depthwise separable convolutions" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Putting it together: A mini Xception-like model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import kagglehub\n", + "\n", + "kagglehub.login()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import zipfile\n", + "\n", + "download_path = kagglehub.competition_download(\"dogs-vs-cats\")\n", + "\n", + "with zipfile.ZipFile(download_path + \"/train.zip\", \"r\") as zip_ref:\n", + " zip_ref.extractall(\".\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os, shutil, pathlib\n", + "from keras.utils import image_dataset_from_directory\n", + "\n", + "original_dir = pathlib.Path(\"train\")\n", + "new_base_dir = pathlib.Path(\"dogs_vs_cats_small\")\n", + "\n", + "def make_subset(subset_name, start_index, end_index):\n", + " for category in (\"cat\", \"dog\"):\n", + " dir = new_base_dir / subset_name / category\n", + " os.makedirs(dir)\n", + " fnames = [f\"{category}.{i}.jpg\" for i in range(start_index, end_index)]\n", + " for fname in fnames:\n", + " shutil.copyfile(src=original_dir / fname, dst=dir / fname)\n", + "\n", + "make_subset(\"train\", start_index=0, end_index=1000)\n", + "make_subset(\"validation\", start_index=1000, end_index=1500)\n", + "make_subset(\"test\", start_index=1500, end_index=2500)\n", + "\n", + "batch_size = 64\n", + "image_size = (180, 180)\n", + "train_dataset = image_dataset_from_directory(\n", + " new_base_dir / \"train\",\n", + " image_size=image_size,\n", + " batch_size=batch_size,\n", + ")\n", + "validation_dataset = image_dataset_from_directory(\n", + " new_base_dir / \"validation\",\n", + " image_size=image_size,\n", + " batch_size=batch_size,\n", + ")\n", + "test_dataset = image_dataset_from_directory(\n", + " new_base_dir / \"test\",\n", + " image_size=image_size,\n", + " batch_size=batch_size,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "from keras import layers\n", + "\n", + "data_augmentation_layers = [\n", + " layers.RandomFlip(\"horizontal\"),\n", + " layers.RandomRotation(0.1),\n", + " layers.RandomZoom(0.2),\n", + "]\n", + "\n", + "def data_augmentation(images, targets):\n", + " for layer in data_augmentation_layers:\n", + " images = layer(images)\n", + " return images, targets\n", + "\n", + "augmented_train_dataset = train_dataset.map(\n", + " data_augmentation, num_parallel_calls=8\n", + ")\n", + "augmented_train_dataset = augmented_train_dataset.prefetch(tf.data.AUTOTUNE)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "\n", + "inputs = keras.Input(shape=(180, 180, 3))\n", + "x = layers.Rescaling(1.0 / 255)(inputs)\n", + "x = layers.Conv2D(filters=32, kernel_size=5, use_bias=False)(x)\n", + "\n", + "for size in [32, 64, 128, 256, 512]:\n", + " residual = 
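The "Depthwise separable convolutions" section above is prose-only in the notebook. The key quantitative point, reproduced here as a back-of-the-envelope sketch, is the parameter saving: a separable convolution factors a KxK convolution into a per-channel spatial filter plus a 1x1 pointwise mix.

```python
# 3x3 kernels, 64 input channels, 128 output channels, biases ignored.
regular = 3 * 3 * 64 * 128          # 73,728 parameters
separable = 3 * 3 * 64 + 64 * 128   # 576 depthwise + 8,192 pointwise = 8,768
print(regular, separable, round(regular / separable, 1))  # ~8.4x fewer
```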
x\n", + "\n", + " x = layers.BatchNormalization()(x)\n", + " x = layers.Activation(\"relu\")(x)\n", + " x = layers.SeparableConv2D(size, 3, padding=\"same\", use_bias=False)(x)\n", + "\n", + " x = layers.BatchNormalization()(x)\n", + " x = layers.Activation(\"relu\")(x)\n", + " x = layers.SeparableConv2D(size, 3, padding=\"same\", use_bias=False)(x)\n", + "\n", + " x = layers.MaxPooling2D(3, strides=2, padding=\"same\")(x)\n", + "\n", + " residual = layers.Conv2D(\n", + " size, 1, strides=2, padding=\"same\", use_bias=False\n", + " )(residual)\n", + " x = layers.add([x, residual])\n", + "\n", + "x = layers.GlobalAveragePooling2D()(x)\n", + "x = layers.Dropout(0.5)(x)\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs=inputs, outputs=outputs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(\n", + " loss=\"binary_crossentropy\",\n", + " optimizer=\"adam\",\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "history = model.fit(\n", + " augmented_train_dataset,\n", + " epochs=100,\n", + " validation_data=validation_dataset,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Beyond convolution: Vision Transformers" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [], + "name": "chapter09_convnet-architecture-patterns", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/chapter10_interpreting-what-convnets-learn.ipynb b/chapter10_interpreting-what-convnets-learn.ipynb new file mode 100644 index 0000000000..1ff0164326 --- /dev/null +++ b/chapter10_interpreting-what-convnets-learn.ipynb @@ -0,0 +1,827 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Interpreting what convnets learn" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Visualizing intermediate activations" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from google.colab import files\n", + "\n", + "# You can use this to load the file\n", + "# \"convnet_from_scratch_with_augmentation.keras\"\n", + "# you obtained in the last chapter.\n", + "files.upload()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "model = keras.models.load_model(\n", + " \"convnet_from_scratch_with_augmentation.keras\"\n", + ")\n", + "model.summary(line_length=80)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "import numpy as np\n", + "\n", + "img_path = keras.utils.get_file(\n", + " fname=\"cat.jpg\", origin=\"https://img-datasets.s3.amazonaws.com/cat.jpg\"\n", + ")\n", + "\n", + "def get_img_array(img_path, target_size):\n", + " img = keras.utils.load_img(img_path, target_size=target_size)\n", + " array = keras.utils.img_to_array(img)\n", + " array = np.expand_dims(array, axis=0)\n", + " return array\n", + "\n", + "img_tensor = get_img_array(img_path, target_size=(180, 180))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "plt.axis(\"off\")\n", + "plt.imshow(img_tensor[0].astype(\"uint8\"))\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras import layers\n", + "\n", + "layer_outputs = []\n", + "layer_names = []\n", + "for layer in model.layers:\n", + " if isinstance(layer, (layers.Conv2D, layers.MaxPooling2D)):\n", + " layer_outputs.append(layer.output)\n", + " layer_names.append(layer.name)\n", + "activation_model = keras.Model(inputs=model.input, outputs=layer_outputs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "activations = 
activation_model.predict(img_tensor)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "first_layer_activation = activations[0]\n", + "print(first_layer_activation.shape)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "plt.matshow(first_layer_activation[0, :, :, 5], cmap=\"viridis\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "images_per_row = 16\n", + "for layer_name, layer_activation in zip(layer_names, activations):\n", + " n_features = layer_activation.shape[-1]\n", + " size = layer_activation.shape[1]\n", + " n_cols = n_features // images_per_row\n", + " display_grid = np.zeros(\n", + " ((size + 1) * n_cols - 1, images_per_row * (size + 1) - 1)\n", + " )\n", + " for col in range(n_cols):\n", + " for row in range(images_per_row):\n", + " channel_index = col * images_per_row + row\n", + " channel_image = layer_activation[0, :, :, channel_index].copy()\n", + " if channel_image.sum() != 0:\n", + " channel_image -= channel_image.mean()\n", + " channel_image /= channel_image.std()\n", + " channel_image *= 64\n", + " channel_image += 128\n", + " channel_image = np.clip(channel_image, 0, 255).astype(\"uint8\")\n", + " display_grid[\n", + " col * (size + 1) : (col + 1) * size + col,\n", + " row * (size + 1) : (row + 1) * size + row,\n", + " ] = channel_image\n", + " scale = 1.0 / size\n", + " plt.figure(\n", + " figsize=(scale * display_grid.shape[1], scale * display_grid.shape[0])\n", + " )\n", + " plt.title(layer_name)\n", + " plt.grid(False)\n", + " plt.axis(\"off\")\n", + " plt.imshow(display_grid, aspect=\"auto\", cmap=\"viridis\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Visualizing convnet filters" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras_hub\n", + "\n", + "model = keras_hub.models.Backbone.from_preset(\n", + " \"xception_41_imagenet\",\n", + ")\n", + "preprocessor = keras_hub.layers.ImageConverter.from_preset(\n", + " \"xception_41_imagenet\",\n", + " image_size=(180, 180),\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "for layer in model.layers:\n", + " if isinstance(layer, (keras.layers.Conv2D, keras.layers.SeparableConv2D)):\n", + " print(layer.name)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "layer_name = \"block3_sepconv1\"\n", + "layer = model.get_layer(name=layer_name)\n", + "feature_extractor = keras.Model(inputs=model.input, outputs=layer.output)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "activation = feature_extractor(preprocessor(img_tensor))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras import ops\n", + "\n", + "def compute_loss(image, filter_index):\n", + " activation = feature_extractor(image)\n", + " filter_activation = activation[:, 2:-2, 2:-2, filter_index]\n", + " return 
ops.mean(filter_activation)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Gradient ascent in TensorFlow" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend tensorflow\n", + "import tensorflow as tf\n", + "\n", + "@tf.function\n", + "def gradient_ascent_step(image, filter_index, learning_rate):\n", + " with tf.GradientTape() as tape:\n", + " tape.watch(image)\n", + " loss = compute_loss(image, filter_index)\n", + " grads = tape.gradient(loss, image)\n", + " grads = ops.normalize(grads)\n", + " image += learning_rate * grads\n", + " return image" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Gradient ascent in PyTorch" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend torch\n", + "import torch\n", + "\n", + "def gradient_ascent_step(image, filter_index, learning_rate):\n", + " image = image.clone().detach().requires_grad_(True)\n", + " loss = compute_loss(image, filter_index)\n", + " loss.backward()\n", + " grads = image.grad\n", + " grads = ops.normalize(grads)\n", + " image = image + learning_rate * grads\n", + " return image" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Gradient ascent in JAX" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend jax\n", + "import jax\n", + "\n", + "grad_fn = jax.grad(compute_loss)\n", + "\n", + "@jax.jit\n", + "def gradient_ascent_step(image, filter_index, learning_rate):\n", + " grads = grad_fn(image, filter_index)\n", + " grads = ops.normalize(grads)\n", + " image += learning_rate * grads\n", + " return image" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The filter visualization loop" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "img_width = 200\n", + "img_height = 200\n", + "\n", + "def generate_filter_pattern(filter_index):\n", + " iterations = 30\n", + " learning_rate = 10.0\n", + " image = keras.random.uniform(\n", + " minval=0.4, maxval=0.6, shape=(1, img_width, img_height, 3)\n", + " )\n", + " for i in range(iterations):\n", + " image = gradient_ascent_step(image, filter_index, learning_rate)\n", + " return image[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def deprocess_image(image):\n", + " image -= ops.mean(image)\n", + " image /= ops.std(image)\n", + " image *= 64\n", + " image += 128\n", + " image = ops.clip(image, 0, 255)\n", + " image = image[25:-25, 25:-25, :]\n", + " image = ops.cast(image, dtype=\"uint8\")\n", + " return ops.convert_to_numpy(image)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "plt.axis(\"off\")\n", + "plt.imshow(deprocess_image(generate_filter_pattern(filter_index=2)))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "all_images = []\n", + "for filter_index in range(64):\n", + " print(f\"Processing filter {filter_index}\")\n", + " 
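The three backend-specific cells above all implement the same one-line idea: gradient ascent, stepping the image along the normalized gradient so the filter's mean activation increases. A toy one-dimensional version for intuition (mine, not from the book):

```python
# Maximize f(x) = -(x - 3)^2, whose maximum sits at x = 3.
x, learning_rate = 0.0, 0.1
for _ in range(100):
    grad = -2 * (x - 3)        # derivative of f at x
    x += learning_rate * grad  # move uphill, not downhill
print(round(x, 3))  # ~3.0
```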
image = deprocess_image(generate_filter_pattern(filter_index))\n", + " all_images.append(image)\n", + "\n", + "margin = 5\n", + "n = 8\n", + "box_width = img_width - 25 * 2\n", + "box_height = img_height - 25 * 2\n", + "full_width = n * box_width + (n - 1) * margin\n", + "full_height = n * box_height + (n - 1) * margin\n", + "stitched_filters = np.zeros((full_width, full_height, 3))\n", + "\n", + "for i in range(n):\n", + " for j in range(n):\n", + " image = all_images[i * n + j]\n", + " stitched_filters[\n", + " (box_width + margin) * i : (box_width + margin) * i + box_width,\n", + " (box_height + margin) * j : (box_height + margin) * j + box_height,\n", + " :,\n", + " ] = image\n", + "\n", + "keras.utils.save_img(f\"filters_for_layer_{layer_name}.png\", stitched_filters)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Visualizing heatmaps of class activation" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "img_path = keras.utils.get_file(\n", + " fname=\"elephant.jpg\",\n", + " origin=\"https://img-datasets.s3.amazonaws.com/elephant.jpg\",\n", + ")\n", + "img = keras.utils.load_img(img_path)\n", + "img_array = np.expand_dims(img, axis=0)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras_hub.models.ImageClassifier.from_preset(\n", + " \"xception_41_imagenet\",\n", + " activation=\"softmax\",\n", + ")\n", + "preds = model.predict(img_array)\n", + "preds.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "keras_hub.utils.decode_imagenet_predictions(preds)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "np.argmax(preds[0])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "img_array = model.preprocessor(img_array)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "last_conv_layer_name = \"block14_sepconv2_act\"\n", + "last_conv_layer = model.backbone.get_layer(last_conv_layer_name)\n", + "last_conv_layer_model = keras.Model(model.inputs, last_conv_layer.output)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "classifier_input = last_conv_layer.output\n", + "x = classifier_input\n", + "for layer_name in [\"pooler\", \"predictions\"]:\n", + " x = model.get_layer(layer_name)(x)\n", + "classifier_model = keras.Model(classifier_input, x)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Getting the gradient of the top class: TensorFlow version" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend tensorflow\n", + "import tensorflow as tf\n", + "\n", + "def get_top_class_gradients(img_array):\n", + " last_conv_layer_output = last_conv_layer_model(img_array)\n", + " with tf.GradientTape() as tape:\n", + " tape.watch(last_conv_layer_output)\n", + " preds = classifier_model(last_conv_layer_output)\n", + " top_pred_index = ops.argmax(preds[0])\n", + " 
top_class_channel = preds[:, top_pred_index]\n", + "\n", + " grads = tape.gradient(top_class_channel, last_conv_layer_output)\n", + " return grads, last_conv_layer_output\n", + "\n", + "grads, last_conv_layer_output = get_top_class_gradients(img_array)\n", + "grads = ops.convert_to_numpy(grads)\n", + "last_conv_layer_output = ops.convert_to_numpy(last_conv_layer_output)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Getting the gradient of the top class: PyTorch version" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend torch\n", + "def get_top_class_gradients(img_array):\n", + " last_conv_layer_output = last_conv_layer_model(img_array)\n", + " last_conv_layer_output = (\n", + " last_conv_layer_output.clone().detach().requires_grad_(True)\n", + " )\n", + " preds = classifier_model(last_conv_layer_output)\n", + " top_pred_index = ops.argmax(preds[0])\n", + " top_class_channel = preds[:, top_pred_index]\n", + " top_class_channel.backward()\n", + " grads = last_conv_layer_output.grad\n", + " return grads, last_conv_layer_output\n", + "\n", + "grads, last_conv_layer_output = get_top_class_gradients(img_array)\n", + "grads = ops.convert_to_numpy(grads)\n", + "last_conv_layer_output = ops.convert_to_numpy(last_conv_layer_output)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Getting the gradient of the top class: JAX version" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%%backend jax\n", + "import jax\n", + "\n", + "def loss_fn(last_conv_layer_output):\n", + " preds = classifier_model(last_conv_layer_output)\n", + " top_pred_index = ops.argmax(preds[0])\n", + " top_class_channel = preds[:, top_pred_index]\n", + " return top_class_channel[0]\n", + "\n", + "grad_fn = jax.grad(loss_fn)\n", + "\n", + "def get_top_class_gradients(img_array):\n", + " last_conv_layer_output = last_conv_layer_model(img_array)\n", + " grads = grad_fn(last_conv_layer_output)\n", + " return grads, last_conv_layer_output\n", + "\n", + "grads, last_conv_layer_output = get_top_class_gradients(img_array)\n", + "grads = ops.convert_to_numpy(grads)\n", + "last_conv_layer_output = ops.convert_to_numpy(last_conv_layer_output)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Displaying the class activation heatmap" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "pooled_grads = np.mean(grads, axis=(0, 1, 2))\n", + "last_conv_layer_output = last_conv_layer_output[0].copy()\n", + "for i in range(pooled_grads.shape[-1]):\n", + " last_conv_layer_output[:, :, i] *= pooled_grads[i]\n", + "heatmap = np.mean(last_conv_layer_output, axis=-1)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "heatmap = np.maximum(heatmap, 0)\n", + "heatmap /= np.max(heatmap)\n", + "plt.matshow(heatmap)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.cm as cm\n", + "\n", + "img = keras.utils.load_img(img_path)\n", + "img = keras.utils.img_to_array(img)\n", + "\n", + "heatmap = np.uint8(255 * heatmap)\n", + "\n", + 
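The heatmap cells above implement the Grad-CAM recipe: average the top-class gradient over space to get one importance weight per channel, weight the feature maps by it, sum over channels, and rectify. A toy NumPy check of that weighting step (shapes assumed `(H, W, C)`; my example, not the book's):

```python
import numpy as np

feats = np.random.rand(6, 6, 4)    # stand-in feature maps
grads = np.random.rand(6, 6, 4)    # stand-in top-class gradients
weights = grads.mean(axis=(0, 1))  # one weight per channel
heatmap = np.maximum((feats * weights).sum(axis=-1), 0)  # weighted sum, ReLU
heatmap /= heatmap.max()
print(heatmap.shape)  # (6, 6)
```

One compatibility note on the cell that follows: `matplotlib.cm.get_cmap` was deprecated in matplotlib 3.7 and removed in 3.9, so on a current environment the colormap lookup may need the registry instead:

```python
import matplotlib

jet = matplotlib.colormaps["jet"]  # replacement for cm.get_cmap("jet")
```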
"jet = cm.get_cmap(\"jet\")\n", + "jet_colors = jet(np.arange(256))[:, :3]\n", + "jet_heatmap = jet_colors[heatmap]\n", + "\n", + "jet_heatmap = keras.utils.array_to_img(jet_heatmap)\n", + "jet_heatmap = jet_heatmap.resize((img.shape[1], img.shape[0]))\n", + "jet_heatmap = keras.utils.img_to_array(jet_heatmap)\n", + "\n", + "superimposed_img = jet_heatmap * 0.4 + img\n", + "superimposed_img = keras.utils.array_to_img(superimposed_img)\n", + "\n", + "plt.imshow(superimposed_img)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Visualizing the latent space of a convnet" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [], + "name": "chapter10_interpreting-what-convnets-learn", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/chapter11_image-segmentation.ipynb b/chapter11_image-segmentation.ipynb new file mode 100644 index 0000000000..74f4c6b13d --- /dev/null +++ b/chapter11_image-segmentation.ipynb @@ -0,0 +1,701 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. 
To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Image segmentation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Computer vision tasks" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Types of image segmentation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Training a segmentation model from scratch" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Downloading a segmentation dataset" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz\n", + "!wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz\n", + "!tar -xf images.tar.gz\n", + "!tar -xf annotations.tar.gz" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import pathlib\n", + "\n", + "input_dir = pathlib.Path(\"images\")\n", + "target_dir = pathlib.Path(\"annotations/trimaps\")\n", + "\n", + "input_img_paths = sorted(input_dir.glob(\"*.jpg\"))\n", + "target_paths = sorted(target_dir.glob(\"[!.]*.png\"))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "from keras.utils import load_img, img_to_array, array_to_img\n", + "\n", + "plt.axis(\"off\")\n", + "plt.imshow(load_img(input_img_paths[9]))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def display_target(target_array):\n", + " normalized_array = (target_array.astype(\"uint8\") - 1) * 127\n", + " plt.axis(\"off\")\n", + " plt.imshow(normalized_array[:, :, 0])\n", + "\n", + "img = img_to_array(load_img(target_paths[9], color_mode=\"grayscale\"))\n", + "display_target(img)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "import random\n", + "\n", + "img_size = (200, 200)\n", + "num_imgs = len(input_img_paths)\n", + "\n", + "random.Random(1337).shuffle(input_img_paths)\n", + "random.Random(1337).shuffle(target_paths)\n", + "\n", + "def path_to_input_image(path):\n", + " return img_to_array(load_img(path, target_size=img_size))\n", + "\n", + "def path_to_target(path):\n", + " img = img_to_array(\n", + " load_img(path, target_size=img_size, color_mode=\"grayscale\")\n", + " )\n", + " img = img.astype(\"uint8\") - 1\n", + " return img\n", + "\n", + "input_imgs = np.zeros((num_imgs,) + img_size + (3,), dtype=\"float32\")\n", + "targets = np.zeros((num_imgs,) + img_size + (1,), dtype=\"uint8\")\n", + "for i in range(num_imgs):\n", + " input_imgs[i] = path_to_input_image(input_img_paths[i])\n", + " targets[i] = path_to_target(target_paths[i])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "num_val_samples = 1000\n", + "train_input_imgs = input_imgs[:-num_val_samples]\n", + 
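In `path_to_target()` above, the `- 1` is doing real work: the Oxford-IIIT trimaps label pixels as 1 (foreground), 2 (background), and 3 (contour), and shifting them to {0, 1, 2} lets the targets feed straight into `sparse_categorical_crossentropy`. A one-line sanity check:

```python
import numpy as np

# Trimap labels {1, 2, 3} become class indices {0, 1, 2}.
print(np.unique(np.array([1, 2, 3], dtype="uint8") - 1))  # [0 1 2]
```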
"train_targets = targets[:-num_val_samples]\n", + "val_input_imgs = input_imgs[-num_val_samples:]\n", + "val_targets = targets[-num_val_samples:]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Building and training the segmentation model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "from keras.layers import Rescaling, Conv2D, Conv2DTranspose\n", + "\n", + "def get_model(img_size, num_classes):\n", + " inputs = keras.Input(shape=img_size + (3,))\n", + " x = Rescaling(1.0 / 255)(inputs)\n", + "\n", + " x = Conv2D(64, 3, strides=2, activation=\"relu\", padding=\"same\")(x)\n", + " x = Conv2D(64, 3, activation=\"relu\", padding=\"same\")(x)\n", + " x = Conv2D(128, 3, strides=2, activation=\"relu\", padding=\"same\")(x)\n", + " x = Conv2D(128, 3, activation=\"relu\", padding=\"same\")(x)\n", + " x = Conv2D(256, 3, strides=2, padding=\"same\", activation=\"relu\")(x)\n", + " x = Conv2D(256, 3, activation=\"relu\", padding=\"same\")(x)\n", + "\n", + " x = Conv2DTranspose(256, 3, activation=\"relu\", padding=\"same\")(x)\n", + " x = Conv2DTranspose(256, 3, strides=2, activation=\"relu\", padding=\"same\")(x)\n", + " x = Conv2DTranspose(128, 3, activation=\"relu\", padding=\"same\")(x)\n", + " x = Conv2DTranspose(128, 3, strides=2, activation=\"relu\", padding=\"same\")(x)\n", + " x = Conv2DTranspose(64, 3, activation=\"relu\", padding=\"same\")(x)\n", + " x = Conv2DTranspose(64, 3, strides=2, activation=\"relu\", padding=\"same\")(x)\n", + "\n", + " outputs = Conv2D(num_classes, 3, activation=\"softmax\", padding=\"same\")(x)\n", + "\n", + " return keras.Model(inputs, outputs)\n", + "\n", + "model = get_model(img_size=img_size, num_classes=3)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# \u26a0\ufe0fNOTE\u26a0\ufe0f: The following IoU metric is *very* slow on the PyTorch backend!\n", + "# If you are running with PyTorch, we recommend re-running the notebook with Jax\n", + "# or TensorFlow, or skipping to the next section of this chapter." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "foreground_iou = keras.metrics.IoU(\n", + " num_classes=3,\n", + " target_class_ids=(0,),\n", + " name=\"foreground_iou\",\n", + " sparse_y_true=True,\n", + " sparse_y_pred=False,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(\n", + " optimizer=\"adam\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[foreground_iou],\n", + ")\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\n", + " \"oxford_segmentation.keras\",\n", + " save_best_only=True,\n", + " ),\n", + "]\n", + "history = model.fit(\n", + " train_input_imgs,\n", + " train_targets,\n", + " epochs=50,\n", + " callbacks=callbacks,\n", + " batch_size=64,\n", + " validation_data=(val_input_imgs, val_targets),\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "epochs = range(1, len(history.history[\"loss\"]) + 1)\n", + "loss = history.history[\"loss\"]\n", + "val_loss = history.history[\"val_loss\"]\n", + "plt.figure()\n", + "plt.plot(epochs, loss, \"r--\", label=\"Training loss\")\n", + "plt.plot(epochs, val_loss, \"b\", label=\"Validation loss\")\n", + "plt.title(\"Training and validation loss\")\n", + "plt.legend()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.models.load_model(\"oxford_segmentation.keras\")\n", + "\n", + "i = 4\n", + "test_image = val_input_imgs[i]\n", + "plt.axis(\"off\")\n", + "plt.imshow(array_to_img(test_image))\n", + "\n", + "mask = model.predict(np.expand_dims(test_image, 0))[0]\n", + "\n", + "def display_mask(pred):\n", + " mask = np.argmax(pred, axis=-1)\n", + " mask *= 127\n", + " plt.axis(\"off\")\n", + " plt.imshow(mask)\n", + "\n", + "display_mask(mask)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Using a pretrained segmentation model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Downloading the Segment Anything Model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras_hub\n", + "\n", + "model = keras_hub.models.ImageSegmenter.from_preset(\"sam_huge_sa1b\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.count_params()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### How Segment Anything works" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Preparing a test image" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "path = keras.utils.get_file(\n", + " origin=\"https://s3.amazonaws.com/keras.io/img/book/fruits.jpg\"\n", + ")\n", + "pil_image = keras.utils.load_img(path)\n", + "image_array = keras.utils.img_to_array(pil_image)\n", + "\n", + "plt.imshow(image_array.astype(\"uint8\"))\n", + "plt.axis(\"off\")\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": 
"code" + }, + "outputs": [], + "source": [ + "from keras import ops\n", + "\n", + "image_size = (1024, 1024)\n", + "\n", + "def resize_and_pad(x):\n", + " return ops.image.resize(x, image_size, pad_to_aspect_ratio=True)\n", + "\n", + "image = resize_and_pad(image_array)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "from keras import ops\n", + "\n", + "def show_image(image, ax):\n", + " ax.imshow(ops.convert_to_numpy(image).astype(\"uint8\"))\n", + "\n", + "def show_mask(mask, ax):\n", + " color = np.array([30 / 255, 144 / 255, 255 / 255, 0.6])\n", + " h, w, _ = mask.shape\n", + " mask_image = mask.reshape(h, w, 1) * color.reshape(1, 1, -1)\n", + " ax.imshow(mask_image)\n", + "\n", + "def show_points(points, ax):\n", + " x, y = points[:, 0], points[:, 1]\n", + " ax.scatter(x, y, c=\"green\", marker=\"*\", s=375, ec=\"white\", lw=1.25)\n", + "\n", + "def show_box(box, ax):\n", + " box = box.reshape(-1)\n", + " x0, y0 = box[0], box[1]\n", + " w, h = box[2] - box[0], box[3] - box[1]\n", + " ax.add_patch(plt.Rectangle((x0, y0), w, h, ec=\"red\", fc=\"none\", lw=2))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Prompting the model with a target point" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "input_point = np.array([[580, 450]])\n", + "input_label = np.array([1])\n", + "\n", + "plt.figure(figsize=(10, 10))\n", + "show_image(image, plt.gca())\n", + "show_points(input_point, plt.gca())\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "outputs = model.predict(\n", + " {\n", + " \"images\": ops.expand_dims(image, axis=0),\n", + " \"points\": ops.expand_dims(input_point, axis=0),\n", + " \"labels\": ops.expand_dims(input_label, axis=0),\n", + " }\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "outputs[\"masks\"].shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def get_mask(sam_outputs, index=0):\n", + " mask = sam_outputs[\"masks\"][0][index]\n", + " mask = np.expand_dims(mask, axis=-1)\n", + " mask = resize_and_pad(mask)\n", + " return ops.convert_to_numpy(mask) > 0.0\n", + "\n", + "mask = get_mask(outputs, index=0)\n", + "\n", + "plt.figure(figsize=(10, 10))\n", + "show_image(image, plt.gca())\n", + "show_mask(mask, plt.gca())\n", + "show_points(input_point, plt.gca())\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "input_point = np.array([[300, 550]])\n", + "input_label = np.array([1])\n", + "\n", + "outputs = model.predict(\n", + " {\n", + " \"images\": ops.expand_dims(image, axis=0),\n", + " \"points\": ops.expand_dims(input_point, axis=0),\n", + " \"labels\": ops.expand_dims(input_label, axis=0),\n", + " }\n", + ")\n", + "mask = get_mask(outputs, index=0)\n", + "\n", + "plt.figure(figsize=(10, 10))\n", + "show_image(image, plt.gca())\n", + "show_mask(mask, plt.gca())\n", + "show_points(input_point, plt.gca())\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + 
"execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "fig, axes = plt.subplots(1, 3, figsize=(20, 60))\n", + "masks = outputs[\"masks\"][0][1:]\n", + "for i, mask in enumerate(masks):\n", + " show_image(image, axes[i])\n", + " show_points(input_point, axes[i])\n", + " mask = get_mask(outputs, index=i + 1)\n", + " show_mask(mask, axes[i])\n", + " axes[i].set_title(f\"Mask {i + 1}\", fontsize=16)\n", + " axes[i].axis(\"off\")\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Prompting the model with a target box" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "input_box = np.array(\n", + " [\n", + " [520, 180],\n", + " [770, 420],\n", + " ]\n", + ")\n", + "\n", + "plt.figure(figsize=(10, 10))\n", + "show_image(image, plt.gca())\n", + "show_box(input_box, plt.gca())\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "outputs = model.predict(\n", + " {\n", + " \"images\": ops.expand_dims(image, axis=0),\n", + " \"boxes\": ops.expand_dims(input_box, axis=(0, 1)),\n", + " }\n", + ")\n", + "mask = get_mask(outputs, 0)\n", + "plt.figure(figsize=(10, 10))\n", + "show_image(image, plt.gca())\n", + "show_mask(mask, plt.gca())\n", + "show_box(input_box, plt.gca())\n", + "plt.show()" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [], + "name": "chapter11_image-segmentation", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/chapter12_object-detection.ipynb b/chapter12_object-detection.ipynb new file mode 100644 index 0000000000..6b562082e3 --- /dev/null +++ b/chapter12_object-detection.ipynb @@ -0,0 +1,716 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Object detection" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Single-stage vs. two-stage object detectors" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Two-stage R-CNN detectors" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Single-stage detectors" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Training a YOLO model from scratch" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Downloading the COCO dataset" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "import keras_hub\n", + "\n", + "images_path = keras.utils.get_file(\n", + " \"coco\",\n", + " \"http://images.cocodataset.org/zips/train2017.zip\",\n", + " extract=True,\n", + ")\n", + "annotations_path = keras.utils.get_file(\n", + " \"annotations\",\n", + " \"http://images.cocodataset.org/annotations/annotations_trainval2017.zip\",\n", + " extract=True,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import json\n", + "\n", + "with open(f\"{annotations_path}/annotations/instances_train2017.json\", \"r\") as f:\n", + " annotations = json.load(f)\n", + "\n", + "images = {image[\"id\"]: image for image in annotations[\"images\"]}\n", + "\n", + "def scale_box(box, width, height):\n", + " scale = 1.0 / max(width, height)\n", + " x, y, w, h = [v * scale for v in box]\n", + " x += (height - width) * scale / 2 if height > width else 0\n", + " y += (width - height) * scale / 2 if width > height else 0\n", + " return [x, y, w, h]\n", + "\n", + "metadata = {}\n", + "for annotation in annotations[\"annotations\"]:\n", + " id = annotation[\"image_id\"]\n", + " if id not in metadata:\n", + " metadata[id] = {\"boxes\": [], \"labels\": []}\n", + " image = images[id]\n", + " box = scale_box(annotation[\"bbox\"], image[\"width\"], image[\"height\"])\n", + " metadata[id][\"boxes\"].append(box)\n", + " metadata[id][\"labels\"].append(annotation[\"category_id\"])\n", + " metadata[id][\"path\"] = 
images_path + \"/train2017/\" + image[\"file_name\"]\n", + "metadata = list(metadata.values())" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "len(metadata)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "min([len(x[\"boxes\"]) for x in metadata])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "max([len(x[\"boxes\"]) for x in metadata])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "max(max(x[\"labels\"]) for x in metadata) + 1" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "metadata[435]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "[keras_hub.utils.coco_id_to_name(x) for x in metadata[435][\"labels\"]]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "from matplotlib.colors import hsv_to_rgb\n", + "from matplotlib.patches import Rectangle\n", + "\n", + "color_map = {0: \"gray\"}\n", + "\n", + "def label_to_color(label):\n", + " if label not in color_map:\n", + " h, s, v = (len(color_map) * 0.618) % 1, 0.5, 0.9\n", + " color_map[label] = hsv_to_rgb((h, s, v))\n", + " return color_map[label]\n", + "\n", + "def draw_box(ax, box, text, color):\n", + " x, y, w, h = box\n", + " ax.add_patch(Rectangle((x, y), w, h, lw=2, ec=color, fc=\"none\"))\n", + " textbox = dict(fc=color, pad=1, ec=\"none\")\n", + " ax.text(x, y, text, c=\"white\", size=10, va=\"bottom\", bbox=textbox)\n", + "\n", + "def draw_image(ax, image):\n", + " ax.set(xlim=(0, 1), ylim=(1, 0), xticks=[], yticks=[], aspect=\"equal\")\n", + " image = plt.imread(image)\n", + " height, width = image.shape[:2]\n", + " hpad = (1 - height / width) / 2 if width > height else 0\n", + " wpad = (1 - width / height) / 2 if height > width else 0\n", + " extent = [wpad, 1 - wpad, 1 - hpad, hpad]\n", + " ax.imshow(image, extent=extent)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "sample = metadata[435]\n", + "ig, ax = plt.subplots(dpi=300)\n", + "draw_image(ax, sample[\"path\"])\n", + "for box, label in zip(sample[\"boxes\"], sample[\"labels\"]):\n", + " label_name = keras_hub.utils.coco_id_to_name(label)\n", + " draw_box(ax, box, label_name, label_to_color(label))\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import random\n", + "\n", + "metadata = list(filter(lambda x: len(x[\"boxes\"]) <= 4, metadata))\n", + "random.shuffle(metadata)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Creating a YOLO model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "image_size = 448\n", + "\n", + "backbone = keras_hub.models.Backbone.from_preset(\n", + " \"resnet_50_imagenet\",\n", + ")\n", + "preprocessor = 
keras_hub.layers.ImageConverter.from_preset(\n", + " \"resnet_50_imagenet\",\n", + " image_size=(image_size, image_size),\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras import layers\n", + "\n", + "grid_size = 6\n", + "num_labels = 91\n", + "\n", + "inputs = keras.Input(shape=(image_size, image_size, 3))\n", + "x = backbone(inputs)\n", + "x = layers.Conv2D(512, (3, 3), strides=(2, 2))(x)\n", + "x = keras.layers.Flatten()(x)\n", + "x = layers.Dense(2048, activation=\"relu\", kernel_initializer=\"glorot_normal\")(x)\n", + "x = layers.Dropout(0.5)(x)\n", + "x = layers.Dense(grid_size * grid_size * (num_labels + 5))(x)\n", + "x = layers.Reshape((grid_size, grid_size, num_labels + 5))(x)\n", + "box_predictions = x[..., :5]\n", + "class_predictions = layers.Activation(\"softmax\")(x[..., 5:])\n", + "outputs = {\"box\": box_predictions, \"class\": class_predictions}\n", + "model = keras.Model(inputs, outputs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.summary()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Readying the COCO data for the YOLO model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def to_grid(box):\n", + " x, y, w, h = box\n", + " cx, cy = (x + w / 2) * grid_size, (y + h / 2) * grid_size\n", + " ix, iy = int(cx), int(cy)\n", + " return (ix, iy), (cx - ix, cy - iy, w, h)\n", + "\n", + "def from_grid(loc, box):\n", + " (xi, yi), (x, y, w, h) = loc, box\n", + " x = (xi + x) / grid_size - w / 2\n", + " y = (yi + y) / grid_size - h / 2\n", + " return (x, y, w, h)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "import math\n", + "\n", + "class_array = np.zeros((len(metadata), grid_size, grid_size))\n", + "box_array = np.zeros((len(metadata), grid_size, grid_size, 5))\n", + "\n", + "for index, sample in enumerate(metadata):\n", + " boxes, labels = sample[\"boxes\"], sample[\"labels\"]\n", + " for box, label in zip(boxes, labels):\n", + " (x, y, w, h) = box\n", + " left, right = math.floor(x * grid_size), math.ceil((x + w) * grid_size)\n", + " bottom, top = math.floor(y * grid_size), math.ceil((y + h) * grid_size)\n", + " class_array[index, bottom:top, left:right] = label\n", + "\n", + "for index, sample in enumerate(metadata):\n", + " boxes, labels = sample[\"boxes\"], sample[\"labels\"]\n", + " for box, label in zip(boxes, labels):\n", + " (xi, yi), (grid_box) = to_grid(box)\n", + " box_array[index, yi, xi] = [*grid_box, 1.0]\n", + " class_array[index, yi, xi] = label" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def draw_prediction(image, boxes, classes, cutoff=None):\n", + " fig, ax = plt.subplots(dpi=300)\n", + " draw_image(ax, image)\n", + " for yi, row in enumerate(classes):\n", + " for xi, label in enumerate(row):\n", + " color = label_to_color(label) if label else \"none\"\n", + " x, y, w, h = (v / grid_size for v in (xi, yi, 1.0, 1.0))\n", + " r = Rectangle((x, y), w, h, lw=2, ec=\"black\", fc=color, alpha=0.5)\n", + " ax.add_patch(r)\n", + " for yi, row in enumerate(boxes):\n", + " for xi, box in 
enumerate(row):\n", + " box, confidence = box[:4], box[4]\n", + " if not cutoff or confidence >= cutoff:\n", + " box = from_grid((xi, yi), box)\n", + " label = classes[yi, xi]\n", + " color = label_to_color(label)\n", + " name = keras_hub.utils.coco_id_to_name(label)\n", + " draw_box(ax, box, f\"{name} {max(confidence, 0):.2f}\", color)\n", + " plt.show()\n", + "\n", + "draw_prediction(metadata[0][\"path\"], box_array[0], class_array[0], cutoff=1.0)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "\n", + "def load_image(path):\n", + " x = tf.io.read_file(path)\n", + " x = tf.image.decode_jpeg(x, channels=3)\n", + " return preprocessor(x)\n", + "\n", + "images = tf.data.Dataset.from_tensor_slices([x[\"path\"] for x in metadata])\n", + "images = images.map(load_image, num_parallel_calls=8)\n", + "labels = {\"box\": box_array, \"class\": class_array}\n", + "labels = tf.data.Dataset.from_tensor_slices(labels)\n", + "\n", + "dataset = tf.data.Dataset.zip(images, labels).batch(16).prefetch(2)\n", + "val_dataset, train_dataset = dataset.take(500), dataset.skip(500)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Training the YOLO model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras import ops\n", + "\n", + "def unpack(box):\n", + " return box[..., 0], box[..., 1], box[..., 2], box[..., 3]\n", + "\n", + "def intersection(box1, box2):\n", + " cx1, cy1, w1, h1 = unpack(box1)\n", + " cx2, cy2, w2, h2 = unpack(box2)\n", + " left = ops.maximum(cx1 - w1 / 2, cx2 - w2 / 2)\n", + " bottom = ops.maximum(cy1 - h1 / 2, cy2 - h2 / 2)\n", + " right = ops.minimum(cx1 + w1 / 2, cx2 + w2 / 2)\n", + " top = ops.minimum(cy1 + h1 / 2, cy2 + h2 / 2)\n", + " return ops.maximum(0.0, right - left) * ops.maximum(0.0, top - bottom)\n", + "\n", + "def intersection_over_union(box1, box2):\n", + " cx1, cy1, w1, h1 = unpack(box1)\n", + " cx2, cy2, w2, h2 = unpack(box2)\n", + " intersection_area = intersection(box1, box2)\n", + " a1 = ops.maximum(w1, 0.0) * ops.maximum(h1, 0.0)\n", + " a2 = ops.maximum(w2, 0.0) * ops.maximum(h2, 0.0)\n", + " union_area = a1 + a2 - intersection_area\n", + " return ops.divide_no_nan(intersection_area, union_area)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def signed_sqrt(x):\n", + " return ops.sign(x) * ops.sqrt(ops.absolute(x) + keras.config.epsilon())\n", + "\n", + "def box_loss(true, pred):\n", + " xy_true, wh_true, conf_true = true[..., :2], true[..., 2:4], true[..., 4:]\n", + " xy_pred, wh_pred, conf_pred = pred[..., :2], pred[..., 2:4], pred[..., 4:]\n", + " no_object = conf_true == 0.0\n", + " xy_error = ops.square(xy_true - xy_pred)\n", + " wh_error = ops.square(signed_sqrt(wh_true) - signed_sqrt(wh_pred))\n", + " iou = intersection_over_union(true, pred)\n", + " conf_target = ops.where(no_object, 0.0, ops.expand_dims(iou, -1))\n", + " conf_error = ops.square(conf_target - conf_pred)\n", + " error = ops.concatenate(\n", + " (\n", + " ops.where(no_object, 0.0, xy_error * 5.0),\n", + " ops.where(no_object, 0.0, wh_error * 5.0),\n", + " ops.where(no_object, conf_error * 0.5, conf_error),\n", + " ),\n", + " axis=-1,\n", + " )\n", + " return ops.sum(error, axis=(1, 2, 3))" + ] + }, + { + "cell_type": "code", + 
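"execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# Not from the book: a small sanity check of box_loss. For a 1x1 grid\n", + "# where the prediction equals the target, the xy/wh terms vanish and the\n", + "# confidence target is the IoU of identical boxes (1.0), so the loss is 0.\n", + "true_boxes = ops.convert_to_tensor([[[[0.5, 0.5, 0.2, 0.2, 1.0]]]])\n", + "pred_boxes = ops.convert_to_tensor([[[[0.5, 0.5, 0.2, 0.2, 1.0]]]])\n", + "print(box_loss(true_boxes, pred_boxes))" + ] + }, + { + "cell_type": "code", +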
"execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(\n", + " optimizer=keras.optimizers.Adam(2e-4),\n", + " loss={\"box\": box_loss, \"class\": \"sparse_categorical_crossentropy\"},\n", + ")\n", + "model.fit(\n", + " train_dataset,\n", + " validation_data=val_dataset,\n", + " epochs=4,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x, y = next(iter(val_dataset.rebatch(1)))\n", + "preds = model.predict(x)\n", + "boxes = preds[\"box\"][0]\n", + "classes = np.argmax(preds[\"class\"][0], axis=-1)\n", + "path = metadata[0][\"path\"]\n", + "draw_prediction(path, boxes, classes, cutoff=0.1)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "draw_prediction(path, boxes, classes, cutoff=None)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Using a pretrained RetinaNet detector" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "url = (\n", + " \"https://upload.wikimedia.org/wikipedia/commons/thumb/7/7d/\"\n", + " \"A_Sunday_on_La_Grande_Jatte%2C_Georges_Seurat%2C_1884.jpg/\"\n", + " \"1280px-A_Sunday_on_La_Grande_Jatte%2C_Georges_Seurat%2C_1884.jpg\"\n", + ")\n", + "path = keras.utils.get_file(origin=url)\n", + "image = np.array([keras.utils.load_img(path)])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "detector = keras_hub.models.ObjectDetector.from_preset(\n", + " \"retinanet_resnet50_fpn_v2_coco\",\n", + " bounding_box_format=\"rel_xywh\",\n", + ")\n", + "predictions = detector.predict(image)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "[(k, v.shape) for k, v in predictions.items()]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "predictions[\"boxes\"][0][0]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "fig, ax = plt.subplots(dpi=300)\n", + "draw_image(ax, path)\n", + "num_detections = predictions[\"num_detections\"][0]\n", + "for i in range(num_detections):\n", + " box = predictions[\"boxes\"][0][i]\n", + " label = predictions[\"labels\"][0][i]\n", + " label_name = keras_hub.utils.coco_id_to_name(label)\n", + " draw_box(ax, box, label_name, label_to_color(label))\n", + "plt.show()" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [], + "name": "chapter12_object-detection", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/chapter13_timeseries-forecasting.ipynb b/chapter13_timeseries-forecasting.ipynb new file mode 100644 index 
0000000000..0fe4788f17 --- /dev/null +++ b/chapter13_timeseries-forecasting.ipynb @@ -0,0 +1,714 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Timeseries forecasting" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Different kinds of timeseries tasks" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### A temperature forecasting example" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!wget https://s3.amazonaws.com/keras-datasets/jena_climate_2009_2016.csv.zip\n", + "!unzip jena_climate_2009_2016.csv.zip" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "\n", + "fname = os.path.join(\"jena_climate_2009_2016.csv\")\n", + "\n", + "with open(fname) as f:\n", + " data = f.read()\n", + "\n", + "lines = data.split(\"\\n\")\n", + "header = lines[0].split(\",\")\n", + "lines = lines[1:]\n", + "print(header)\n", + "print(len(lines))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "temperature = np.zeros((len(lines),))\n", + "raw_data = np.zeros((len(lines), len(header) - 1))\n", + "\n", + "for i, line in enumerate(lines):\n", + " values = [float(x) for x in line.split(\",\")[1:]]\n", + " temperature[i] = values[1]\n", + " raw_data[i, :] = values[:]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from matplotlib import 
pyplot as plt\n", + "\n", + "plt.plot(range(len(temperature)), temperature)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "plt.plot(range(1440), temperature[:1440])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "num_train_samples = int(0.5 * len(raw_data))\n", + "num_val_samples = int(0.25 * len(raw_data))\n", + "num_test_samples = len(raw_data) - num_train_samples - num_val_samples\n", + "print(\"num_train_samples:\", num_train_samples)\n", + "print(\"num_val_samples:\", num_val_samples)\n", + "print(\"num_test_samples:\", num_test_samples)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Preparing the data" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "mean = raw_data[:num_train_samples].mean(axis=0)\n", + "raw_data -= mean\n", + "std = raw_data[:num_train_samples].std(axis=0)\n", + "raw_data /= std" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "import keras\n", + "\n", + "int_sequence = np.arange(10)\n", + "dummy_dataset = keras.utils.timeseries_dataset_from_array(\n", + " data=int_sequence[:-3],\n", + " targets=int_sequence[3:],\n", + " sequence_length=3,\n", + " batch_size=2,\n", + ")\n", + "\n", + "for inputs, targets in dummy_dataset:\n", + " for i in range(inputs.shape[0]):\n", + " print([int(x) for x in inputs[i]], int(targets[i]))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "sampling_rate = 6\n", + "sequence_length = 120\n", + "delay = sampling_rate * (sequence_length + 24 - 1)\n", + "batch_size = 256\n", + "\n", + "train_dataset = keras.utils.timeseries_dataset_from_array(\n", + " raw_data[:-delay],\n", + " targets=temperature[delay:],\n", + " sampling_rate=sampling_rate,\n", + " sequence_length=sequence_length,\n", + " shuffle=True,\n", + " batch_size=batch_size,\n", + " start_index=0,\n", + " end_index=num_train_samples,\n", + ")\n", + "\n", + "val_dataset = keras.utils.timeseries_dataset_from_array(\n", + " raw_data[:-delay],\n", + " targets=temperature[delay:],\n", + " sampling_rate=sampling_rate,\n", + " sequence_length=sequence_length,\n", + " shuffle=True,\n", + " batch_size=batch_size,\n", + " start_index=num_train_samples,\n", + " end_index=num_train_samples + num_val_samples,\n", + ")\n", + "\n", + "test_dataset = keras.utils.timeseries_dataset_from_array(\n", + " raw_data[:-delay],\n", + " targets=temperature[delay:],\n", + " sampling_rate=sampling_rate,\n", + " sequence_length=sequence_length,\n", + " shuffle=True,\n", + " batch_size=batch_size,\n", + " start_index=num_train_samples + num_val_samples,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "for samples, targets in train_dataset:\n", + " print(\"samples shape:\", samples.shape)\n", + " print(\"targets shape:\", targets.shape)\n", + " break" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### A common-sense, non-machine-learning baseline" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + 
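"colab_type": "code" + }, + "outputs": [], + "source": [ + "# Not from the book: checking the offset arithmetic behind the datasets\n", + "# above. Rows arrive every 10 minutes, so sampling_rate=6 keeps one sample\n", + "# per hour. For a window starting at row i, the last input sample is row\n", + "# i + sampling_rate * (sequence_length - 1), and the target sits at row\n", + "# i + delay, i.e. 144 rows (24 hours) after the last input sample - the\n", + "# offset the common-sense baseline below exploits.\n", + "print(delay, delay - sampling_rate * (sequence_length - 1))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { +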
"colab_type": "code" + }, + "outputs": [], + "source": [ + "def evaluate_naive_method(dataset):\n", + " total_abs_err = 0.0\n", + " samples_seen = 0\n", + " for samples, targets in dataset:\n", + " preds = samples[:, -1, 1] * std[1] + mean[1]\n", + " total_abs_err += np.sum(np.abs(preds - targets))\n", + " samples_seen += samples.shape[0]\n", + " return total_abs_err / samples_seen\n", + "\n", + "print(f\"Validation MAE: {evaluate_naive_method(val_dataset):.2f}\")\n", + "print(f\"Test MAE: {evaluate_naive_method(test_dataset):.2f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Let's try a basic machine learning model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "from keras import layers\n", + "\n", + "inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))\n", + "x = layers.Flatten()(inputs)\n", + "x = layers.Dense(16, activation=\"relu\")(x)\n", + "outputs = layers.Dense(1)(x)\n", + "model = keras.Model(inputs, outputs)\n", + "\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\"jena_dense.keras\", save_best_only=True)\n", + "]\n", + "model.compile(optimizer=\"adam\", loss=\"mse\", metrics=[\"mae\"])\n", + "history = model.fit(\n", + " train_dataset,\n", + " epochs=10,\n", + " validation_data=val_dataset,\n", + " callbacks=callbacks,\n", + ")\n", + "\n", + "model = keras.models.load_model(\"jena_dense.keras\")\n", + "print(f\"Test MAE: {model.evaluate(test_dataset)[1]:.2f}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "loss = history.history[\"mae\"]\n", + "val_loss = history.history[\"val_mae\"]\n", + "epochs = range(1, len(loss) + 1)\n", + "plt.figure()\n", + "plt.plot(epochs, loss, \"r--\", label=\"Training MAE\")\n", + "plt.plot(epochs, val_loss, \"b\", label=\"Validation MAE\")\n", + "plt.title(\"Training and validation MAE\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Let's try a 1D convolutional model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))\n", + "x = layers.Conv1D(8, 24, activation=\"relu\")(inputs)\n", + "x = layers.MaxPooling1D(2)(x)\n", + "x = layers.Conv1D(8, 12, activation=\"relu\")(x)\n", + "x = layers.MaxPooling1D(2)(x)\n", + "x = layers.Conv1D(8, 6, activation=\"relu\")(x)\n", + "x = layers.GlobalAveragePooling1D()(x)\n", + "outputs = layers.Dense(1)(x)\n", + "model = keras.Model(inputs, outputs)\n", + "\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\"jena_conv.keras\", save_best_only=True)\n", + "]\n", + "model.compile(optimizer=\"adam\", loss=\"mse\", metrics=[\"mae\"])\n", + "history = model.fit(\n", + " train_dataset,\n", + " epochs=10,\n", + " validation_data=val_dataset,\n", + " callbacks=callbacks,\n", + ")\n", + "\n", + "model = keras.models.load_model(\"jena_conv.keras\")\n", + "print(f\"Test MAE: {model.evaluate(test_dataset)[1]:.2f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Recurrent neural networks" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + 
"colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))\n", + "x = layers.LSTM(16)(inputs)\n", + "outputs = layers.Dense(1)(x)\n", + "model = keras.Model(inputs, outputs)\n", + "\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\"jena_lstm.keras\", save_best_only=True)\n", + "]\n", + "model.compile(optimizer=\"adam\", loss=\"mse\", metrics=[\"mae\"])\n", + "history = model.fit(\n", + " train_dataset,\n", + " epochs=10,\n", + " validation_data=val_dataset,\n", + " callbacks=callbacks,\n", + ")\n", + "\n", + "model = keras.models.load_model(\"jena_lstm.keras\")\n", + "print(\"Test MAE: {model.evaluate(test_dataset)[1]:.2f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Understanding recurrent neural networks" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "timesteps = 100\n", + "input_features = 32\n", + "output_features = 64\n", + "inputs = np.random.random((timesteps, input_features))\n", + "state_t = np.zeros((output_features,))\n", + "W = np.random.random((output_features, input_features))\n", + "U = np.random.random((output_features, output_features))\n", + "b = np.random.random((output_features,))\n", + "successive_outputs = []\n", + "for input_t in inputs:\n", + " output_t = np.tanh(np.dot(W, input_t) + np.dot(U, state_t) + b)\n", + " successive_outputs.append(output_t)\n", + " state_t = output_t\n", + "final_output_sequence = np.concatenate(successive_outputs, axis=0)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### A recurrent layer in Keras" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "num_features = 14\n", + "inputs = keras.Input(shape=(None, num_features))\n", + "outputs = layers.SimpleRNN(16)(inputs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "num_features = 14\n", + "steps = 120\n", + "inputs = keras.Input(shape=(steps, num_features))\n", + "outputs = layers.SimpleRNN(16, return_sequences=False)(inputs)\n", + "print(outputs.shape)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "num_features = 14\n", + "steps = 120\n", + "inputs = keras.Input(shape=(steps, num_features))\n", + "outputs = layers.SimpleRNN(16, return_sequences=True)(inputs)\n", + "print(outputs.shape)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(steps, num_features))\n", + "x = layers.SimpleRNN(16, return_sequences=True)(inputs)\n", + "x = layers.SimpleRNN(16, return_sequences=True)(x)\n", + "outputs = layers.SimpleRNN(16)(x)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Getting the most out of recurrent neural networks" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Using recurrent dropout to fight overfitting" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = 
keras.Input(shape=(sequence_length, raw_data.shape[-1]))\n", + "x = layers.LSTM(32, recurrent_dropout=0.25)(inputs)\n", + "x = layers.Dropout(0.5)(x)\n", + "outputs = layers.Dense(1)(x)\n", + "model = keras.Model(inputs, outputs)\n", + "\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\n", + " \"jena_lstm_dropout.keras\", save_best_only=True\n", + " )\n", + "]\n", + "model.compile(optimizer=\"adam\", loss=\"mse\", metrics=[\"mae\"])\n", + "history = model.fit(\n", + " train_dataset,\n", + " epochs=50,\n", + " validation_data=val_dataset,\n", + " callbacks=callbacks,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Stacking recurrent layers" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))\n", + "x = layers.GRU(32, recurrent_dropout=0.5, return_sequences=True)(inputs)\n", + "x = layers.GRU(32, recurrent_dropout=0.5)(x)\n", + "x = layers.Dropout(0.5)(x)\n", + "outputs = layers.Dense(1)(x)\n", + "model = keras.Model(inputs, outputs)\n", + "\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\n", + " \"jena_stacked_gru_dropout.keras\", save_best_only=True\n", + " )\n", + "]\n", + "model.compile(optimizer=\"adam\", loss=\"mse\", metrics=[\"mae\"])\n", + "history = model.fit(\n", + " train_dataset,\n", + " epochs=50,\n", + " validation_data=val_dataset,\n", + " callbacks=callbacks,\n", + ")\n", + "model = keras.models.load_model(\"jena_stacked_gru_dropout.keras\")\n", + "print(f\"Test MAE: {model.evaluate(test_dataset)[1]:.2f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Using bidirectional RNNs" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))\n", + "x = layers.Bidirectional(layers.LSTM(16))(inputs)\n", + "outputs = layers.Dense(1)(x)\n", + "model = keras.Model(inputs, outputs)\n", + "\n", + "model.compile(optimizer=\"adam\", loss=\"mse\", metrics=[\"mae\"])\n", + "history = model.fit(\n", + " train_dataset,\n", + " epochs=10,\n", + " validation_data=val_dataset,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Going even further" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [], + "name": "chapter13_timeseries-forecasting", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/chapter14_text-classification.ipynb b/chapter14_text-classification.ipynb new file mode 100644 index 0000000000..15e34f0f0c --- /dev/null +++ b/chapter14_text-classification.ipynb @@ -0,0 +1,1439 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Third 
Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Text classification" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### A brief history of natural language processing" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Preparing text data" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import regex as re\n", + "\n", + "def split_chars(text):\n", + " return re.findall(r\".\", text)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "chars = split_chars(\"The quick brown fox jumped over the lazy dog.\")\n", + "chars[:12]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def split_words(text):\n", + " return re.findall(r\"[\\w]+|[.,!?;]\", text)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "split_words(\"The quick brown fox jumped over the dog.\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "vocabulary = {\n", + " \"[UNK]\": 0,\n", + " \"the\": 1,\n", + " \"quick\": 2,\n", + " \"brown\": 3,\n", + " \"fox\": 4,\n", + " \"jumped\": 5,\n", + " \"over\": 6,\n", + " \"dog\": 7,\n", + " \".\": 8,\n", + "}\n", + "words = split_words(\"The quick brown fox jumped over the lazy dog.\")\n", + "indices = [vocabulary.get(word, 0) for word in words]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Character and word tokenization" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + 
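"colab_type": "code" + }, + "outputs": [], + "source": [ + "# Not from the book: a look back at the indices computed above. \"The\"\n", + "# maps to [UNK] (0) because the lookup is case-sensitive, and \"lazy\"\n", + "# because it is missing from the toy vocabulary; the tokenizer classes\n", + "# below add a standardize() step that lowercases text before indexing.\n", + "print(words)\n", + "print(indices)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { +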
"colab_type": "code" + }, + "outputs": [], + "source": [ + "class CharTokenizer:\n", + " def __init__(self, vocabulary):\n", + " self.vocabulary = vocabulary\n", + " self.unk_id = vocabulary[\"[UNK]\"]\n", + "\n", + " def standardize(self, inputs):\n", + " return inputs.lower()\n", + "\n", + " def split(self, inputs):\n", + " return re.findall(r\".\", inputs)\n", + "\n", + " def index(self, tokens):\n", + " return [self.vocabulary.get(t, self.unk_id) for t in tokens]\n", + "\n", + " def __call__(self, inputs):\n", + " inputs = self.standardize(inputs)\n", + " tokens = self.split(inputs)\n", + " indices = self.index(tokens)\n", + " return indices" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import collections\n", + "\n", + "def compute_char_vocabulary(inputs, max_size):\n", + " char_counts = collections.Counter()\n", + " for x in inputs:\n", + " x = x.lower()\n", + " tokens = re.findall(r\".\", x)\n", + " char_counts.update(tokens)\n", + " vocabulary = [\"[UNK]\"]\n", + " most_common = char_counts.most_common(max_size - len(vocabulary))\n", + " for token, count in most_common:\n", + " vocabulary.append(token)\n", + " return dict((token, i) for i, token in enumerate(vocabulary))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "class WordTokenizer:\n", + " def __init__(self, vocabulary):\n", + " self.vocabulary = vocabulary\n", + " self.unk_id = vocabulary[\"[UNK]\"]\n", + "\n", + " def standardize(self, inputs):\n", + " return inputs.lower()\n", + "\n", + " def split(self, inputs):\n", + " return re.findall(r\"[\\w]+|[.,!?;]\", inputs)\n", + "\n", + " def index(self, tokens):\n", + " return [self.vocabulary.get(t, self.unk_id) for t in tokens]\n", + "\n", + " def __call__(self, inputs):\n", + " inputs = self.standardize(inputs)\n", + " tokens = self.split(inputs)\n", + " indices = self.index(tokens)\n", + " return indices" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def compute_word_vocabulary(inputs, max_size):\n", + " word_counts = collections.Counter()\n", + " for x in inputs:\n", + " x = x.lower()\n", + " tokens = re.findall(r\"[\\w]+|[.,!?;]\", x)\n", + " word_counts.update(tokens)\n", + " vocabulary = [\"[UNK]\"]\n", + " most_common = word_counts.most_common(max_size - len(vocabulary))\n", + " for token, count in most_common:\n", + " vocabulary.append(token)\n", + " return dict((token, i) for i, token in enumerate(vocabulary))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "\n", + "filename = keras.utils.get_file(\n", + " origin=\"https://www.gutenberg.org/files/2701/old/moby10b.txt\",\n", + ")\n", + "moby_dick = list(open(filename, \"r\"))\n", + "\n", + "vocabulary = compute_char_vocabulary(moby_dick, max_size=100)\n", + "char_tokenizer = CharTokenizer(vocabulary)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "print(\"Vocabulary length:\", len(vocabulary))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "print(\"Vocabulary start:\", list(vocabulary.keys())[:10])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + 
"metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "print(\"Vocabulary end:\", list(vocabulary.keys())[-10:])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "print(\"Line length:\", len(char_tokenizer(\n", + " \"Call me Ishmael. Some years ago--never mind how long precisely.\"\n", + ")))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "vocabulary = compute_word_vocabulary(moby_dick, max_size=2_000)\n", + "word_tokenizer = WordTokenizer(vocabulary)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "print(\"Vocabulary length:\", len(vocabulary))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "print(\"Vocabulary start:\", list(vocabulary.keys())[:5])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "print(\"Vocabulary end:\", list(vocabulary.keys())[-5:])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "print(\"Line length:\", len(word_tokenizer(\n", + " \"Call me Ishmael. Some years ago--never mind how long precisely.\"\n", + ")))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Subword tokenization" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "data = [\n", + " \"the quick brown fox\",\n", + " \"the slow brown fox\",\n", + " \"the quick brown foxhound\",\n", + "]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def count_and_split_words(data):\n", + " counts = collections.Counter()\n", + " for line in data:\n", + " line = line.lower()\n", + " for word in re.findall(r\"[\\w]+|[.,!?;]\", line):\n", + " chars = re.findall(r\".\", word)\n", + " split_word = \" \".join(chars)\n", + " counts[split_word] += 1\n", + " return dict(counts)\n", + "\n", + "counts = count_and_split_words(data)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "counts" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def count_pairs(counts):\n", + " pairs = collections.Counter()\n", + " for word, freq in counts.items():\n", + " symbols = word.split()\n", + " for pair in zip(symbols[:-1], symbols[1:]):\n", + " pairs[pair] += freq\n", + " return pairs\n", + "\n", + "def merge_pair(counts, first, second):\n", + " split = re.compile(f\"(?\")])\n", + "\n", + "def read_file(filename):\n", + " ds = tf.data.TextLineDataset(filename)\n", + " ds = ds.map(lambda x: tf.strings.regex_replace(x, r\"\\\\n\", \"\\n\"))\n", + " ds = ds.map(tokenizer, num_parallel_calls=8)\n", + " return ds.map(lambda x: tf.concat([x, suffix], -1))\n", + "\n", + "files = [str(file) for file in extract_dir.glob(\"*.txt\")]\n", + "ds = tf.data.Dataset.from_tensor_slices(files)\n", + "ds = ds.interleave(read_file, cycle_length=32, num_parallel_calls=32)\n", + "ds = ds.rebatch(sequence_length + 1, 
drop_remainder=True)\n", + "ds = ds.map(lambda x: (x[:-1], x[1:]))\n", + "ds = ds.batch(batch_size).prefetch(8)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "num_batches = 29373\n", + "num_val_batches = 500\n", + "num_train_batches = num_batches - num_val_batches\n", + "val_ds = ds.take(num_val_batches).repeat()\n", + "train_ds = ds.skip(num_val_batches).repeat()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Building the model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras import layers\n", + "\n", + "class TransformerDecoder(keras.Layer):\n", + " def __init__(self, hidden_dim, intermediate_dim, num_heads):\n", + " super().__init__()\n", + " key_dim = hidden_dim // num_heads\n", + " self.self_attention = layers.MultiHeadAttention(\n", + " num_heads, key_dim, dropout=0.1\n", + " )\n", + " self.self_attention_layernorm = layers.LayerNormalization()\n", + " self.feed_forward_1 = layers.Dense(intermediate_dim, activation=\"relu\")\n", + " self.feed_forward_2 = layers.Dense(hidden_dim)\n", + " self.feed_forward_layernorm = layers.LayerNormalization()\n", + " self.dropout = layers.Dropout(0.1)\n", + "\n", + " def call(self, inputs):\n", + " residual = x = inputs\n", + " x = self.self_attention(query=x, key=x, value=x, use_causal_mask=True)\n", + " x = self.dropout(x)\n", + " x = x + residual\n", + " x = self.self_attention_layernorm(x)\n", + " residual = x\n", + " x = self.feed_forward_1(x)\n", + " x = self.feed_forward_2(x)\n", + " x = self.dropout(x)\n", + " x = x + residual\n", + " x = self.feed_forward_layernorm(x)\n", + " return x" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras import ops\n", + "\n", + "class PositionalEmbedding(keras.Layer):\n", + " def __init__(self, sequence_length, input_dim, output_dim):\n", + " super().__init__()\n", + " self.token_embeddings = layers.Embedding(input_dim, output_dim)\n", + " self.position_embeddings = layers.Embedding(sequence_length, output_dim)\n", + "\n", + " def call(self, inputs, reverse=False):\n", + " if reverse:\n", + " token_embeddings = self.token_embeddings.embeddings\n", + " return ops.matmul(inputs, ops.transpose(token_embeddings))\n", + " positions = ops.cumsum(ops.ones_like(inputs), axis=-1) - 1\n", + " embedded_tokens = self.token_embeddings(inputs)\n", + " embedded_positions = self.position_embeddings(positions)\n", + " return embedded_tokens + embedded_positions" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "keras.config.set_dtype_policy(\"mixed_float16\")\n", + "\n", + "vocab_size = tokenizer.vocabulary_size()\n", + "hidden_dim = 512\n", + "intermediate_dim = 2056\n", + "num_heads = 8\n", + "num_layers = 8\n", + "\n", + "inputs = keras.Input(shape=(None,), dtype=\"int32\", name=\"inputs\")\n", + "embedding = PositionalEmbedding(sequence_length, vocab_size, hidden_dim)\n", + "x = embedding(inputs)\n", + "x = layers.LayerNormalization()(x)\n", + "for i in range(num_layers):\n", + " x = TransformerDecoder(hidden_dim, intermediate_dim, num_heads)(x)\n", + "outputs = embedding(x, reverse=True)\n", + "mini_gpt = keras.Model(inputs, outputs)" + ] + }, + { + "cell_type": "markdown", + "metadata": { 
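+ "colab_type": "text" + }, + "source": [ + "*(Not in the book: a quick shape check of the assembled model. The logits should have shape `(batch, sequence, vocab_size)`, since the token embedding is reused in reverse as the output projection.)*" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "ops.shape(mini_gpt(ops.ones((1, 8), dtype=\"int32\")))" + ] + }, + { + "cell_type": "markdown", + "metadata": {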
+ "colab_type": "text" + }, + "source": [ + "#### Pretraining the model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "class WarmupSchedule(keras.optimizers.schedules.LearningRateSchedule):\n", + " def __init__(self):\n", + " self.rate = 2e-4\n", + " self.warmup_steps = 1_000.0\n", + "\n", + " def __call__(self, step):\n", + " step = ops.cast(step, dtype=\"float32\")\n", + " scale = ops.minimum(step / self.warmup_steps, 1.0)\n", + " return self.rate * scale" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "schedule = WarmupSchedule()\n", + "x = range(0, 5_000, 100)\n", + "y = [ops.convert_to_numpy(schedule(step)) for step in x]\n", + "plt.plot(x, y)\n", + "plt.xlabel(\"Train Step\")\n", + "plt.ylabel(\"Learning Rate\")\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "num_epochs = 8\n", + "steps_per_epoch = num_train_batches // num_epochs\n", + "validation_steps = num_val_batches\n", + "\n", + "mini_gpt.compile(\n", + " optimizer=keras.optimizers.Adam(schedule),\n", + " loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),\n", + " metrics=[\"accuracy\"],\n", + ")\n", + "mini_gpt.fit(\n", + " train_ds,\n", + " validation_data=val_ds,\n", + " epochs=num_epochs,\n", + " steps_per_epoch=steps_per_epoch,\n", + " validation_steps=validation_steps,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Generative decoding" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def generate(prompt, max_length=64):\n", + " tokens = list(ops.convert_to_numpy(tokenizer(prompt)))\n", + " prompt_length = len(tokens)\n", + " for _ in range(max_length - prompt_length):\n", + " prediction = mini_gpt(ops.convert_to_numpy([tokens]))\n", + " prediction = ops.convert_to_numpy(prediction[0, -1])\n", + " tokens.append(np.argmax(prediction).item())\n", + " return tokenizer.detokenize(tokens)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "prompt = \"A piece of advice\"\n", + "generate(prompt)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def compiled_generate(prompt, max_length=64):\n", + " tokens = list(ops.convert_to_numpy(tokenizer(prompt)))\n", + " prompt_length = len(tokens)\n", + " tokens = tokens + [0] * (max_length - prompt_length)\n", + " for i in range(prompt_length, max_length):\n", + " prediction = mini_gpt.predict(np.array([tokens]), verbose=0)\n", + " prediction = prediction[0, i - 1]\n", + " tokens[i] = np.argmax(prediction).item()\n", + " return tokenizer.detokenize(tokens)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import timeit\n", + "tries = 10\n", + "timeit.timeit(lambda: compiled_generate(prompt), number=tries) / tries" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Sampling strategies" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": 
"code" + }, + "outputs": [], + "source": [ + "def compiled_generate(prompt, sample_fn, max_length=64):\n", + " tokens = list(ops.convert_to_numpy(tokenizer(prompt)))\n", + " prompt_length = len(tokens)\n", + " tokens = tokens + [0] * (max_length - prompt_length)\n", + " for i in range(prompt_length, max_length):\n", + " prediction = mini_gpt.predict(np.array([tokens]), verbose=0)\n", + " prediction = prediction[0, i - 1]\n", + " next_token = ops.convert_to_numpy(sample_fn(prediction))\n", + " tokens[i] = np.array(next_token).item()\n", + " return tokenizer.detokenize(tokens)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def greedy_search(preds):\n", + " return ops.argmax(preds)\n", + "\n", + "compiled_generate(prompt, greedy_search)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def random_sample(preds, temperature=1.0):\n", + " preds = preds / temperature\n", + " return keras.random.categorical(preds[None, :], num_samples=1)[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "compiled_generate(prompt, random_sample)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from functools import partial\n", + "compiled_generate(prompt, partial(random_sample, temperature=2.0))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "compiled_generate(prompt, partial(random_sample, temperature=0.8))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "compiled_generate(prompt, partial(random_sample, temperature=0.2))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def top_k(preds, k=5, temperature=1.0):\n", + " preds = preds / temperature\n", + " top_preds, top_indices = ops.top_k(preds, k=k, sorted=False)\n", + " choice = keras.random.categorical(top_preds[None, :], num_samples=1)[0]\n", + " return ops.take_along_axis(top_indices, choice, axis=-1)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "compiled_generate(prompt, partial(top_k, k=5))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "compiled_generate(prompt, partial(top_k, k=20))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "compiled_generate(prompt, partial(top_k, k=5, temperature=0.5))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Using a pretrained LLM" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Text generation with the Gemma model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import kagglehub\n", + "\n", + "kagglehub.login()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + 
"source": [ + "gemma_lm = keras_hub.models.CausalLM.from_preset(\n", + " \"gemma3_1b\",\n", + " dtype=\"float32\",\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "gemma_lm.summary(line_length=80)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "gemma_lm.compile(sampler=\"greedy\")\n", + "gemma_lm.generate(\"A piece of advice\", max_length=40)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "gemma_lm.generate(\"How can I make brownies?\", max_length=40)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "gemma_lm.generate(\n", + " \"The following brownie recipe is easy to make in just a few \"\n", + " \"steps.\\n\\nYou can start by\",\n", + " max_length=40,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "gemma_lm.generate(\n", + " \"Tell me about the 542nd president of the United States.\",\n", + " max_length=40,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Instruction fine-tuning" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import json\n", + "\n", + "PROMPT_TEMPLATE = \"\"\"\"[instruction]\\n{}[end]\\n[response]\\n\"\"\"\n", + "RESPONSE_TEMPLATE = \"\"\"{}[end]\"\"\"\n", + "\n", + "dataset_path = keras.utils.get_file(\n", + " origin=(\n", + " \"https://hf.co/datasets/databricks/databricks-dolly-15k/\"\n", + " \"resolve/main/databricks-dolly-15k.jsonl\"\n", + " ),\n", + ")\n", + "data = {\"prompts\": [], \"responses\": []}\n", + "with open(dataset_path) as file:\n", + " for line in file:\n", + " features = json.loads(line)\n", + " if features[\"context\"]:\n", + " continue\n", + " data[\"prompts\"].append(PROMPT_TEMPLATE.format(features[\"instruction\"]))\n", + " data[\"responses\"].append(RESPONSE_TEMPLATE.format(features[\"response\"]))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "data[\"prompts\"][0]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "data[\"responses\"][0]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "ds = tf.data.Dataset.from_tensor_slices(data).shuffle(2000).batch(2)\n", + "val_ds = ds.take(100)\n", + "train_ds = ds.skip(100)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "preprocessor = gemma_lm.preprocessor\n", + "preprocessor.sequence_length = 512\n", + "batch = next(iter(train_ds))\n", + "x, y, sample_weight = preprocessor(batch)\n", + "x[\"token_ids\"].shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x[\"padding_mask\"].shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "y.shape" + ] + }, + { + 
"cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "sample_weight.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x[\"token_ids\"][0, :5], y[0, :5]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Low-Rank Adaptation (LoRA)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "gemma_lm.backbone.enable_lora(rank=8)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "gemma_lm.summary(line_length=80)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "gemma_lm.compile(\n", + " loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),\n", + " optimizer=keras.optimizers.Adam(5e-5),\n", + " weighted_metrics=[keras.metrics.SparseCategoricalAccuracy()],\n", + ")\n", + "gemma_lm.fit(train_ds, validation_data=val_ds, epochs=1)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "gemma_lm.generate(\n", + " \"[instruction]\\nHow can I make brownies?[end]\\n\"\n", + " \"[response]\\n\",\n", + " max_length=512,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "gemma_lm.generate(\n", + " \"[instruction]\\nWhat is a proper noun?[end]\\n\"\n", + " \"[response]\\n\",\n", + " max_length=512,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "gemma_lm.generate(\n", + " \"[instruction]\\nWho is the 542nd president of the United States?[end]\\n\"\n", + " \"[response]\\n\",\n", + " max_length=512,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Going further with LLMs" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Reinforcement Learning with Human Feedback (RLHF)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Using a chatbot trained with RLHF" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# \u26a0\ufe0fNOTE\u26a0\ufe0f: If you are running on the free tier Colab GPUs, you will need to\n", + "# restart your runtime and run the notebook from here to free up memory for\n", + "# this 4 billion parameter model.\n", + "import os\n", + "\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"\n", + "# Free up more GPU memory on the Jax and TensorFlow backends.\n", + "os.environ[\"XLA_PYTHON_CLIENT_MEM_FRACTION\"] = \"1.00\"\n", + "\n", + "import keras\n", + "import keras_hub\n", + "import kagglehub\n", + "import numpy as np\n", + "\n", + "kagglehub.login()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "gemma_lm = keras_hub.models.CausalLM.from_preset(\n", + " \"gemma3_instruct_4b\",\n", + " dtype=\"bfloat16\",\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + 
"colab_type": "code" + }, + "outputs": [], + "source": [ + "PROMPT_TEMPLATE = \"\"\"user\n", + "{}\n", + "model\n", + "\"\"\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "prompt = \"Why can't you assign values in Jax tensors? Be brief!\"\n", + "gemma_lm.generate(PROMPT_TEMPLATE.format(prompt), max_length=512)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "prompt = \"Who is the 542nd president of the United States?\"\n", + "gemma_lm.generate(PROMPT_TEMPLATE.format(prompt), max_length=512)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Multimodal LLMs" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "image_url = (\n", + " \"https://github.com/mattdangerw/keras-nlp-scripts/\"\n", + " \"blob/main/learned-python.png?raw=true\"\n", + ")\n", + "image_path = keras.utils.get_file(origin=image_url)\n", + "\n", + "image = np.array(keras.utils.load_img(image_path))\n", + "plt.axis(\"off\")\n", + "plt.imshow(image)\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "gemma_lm.preprocessor.max_images_per_prompt = 1\n", + "gemma_lm.preprocessor.sequence_length = 512\n", + "prompt = \"What is going on in this image? Be concise!\"\n", + "gemma_lm.generate({\n", + " \"prompts\": PROMPT_TEMPLATE.format(prompt),\n", + " \"images\": [image],\n", + "})" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "prompt = \"What is the snake wearing?\"\n", + "gemma_lm.generate({\n", + " \"prompts\": PROMPT_TEMPLATE.format(prompt),\n", + " \"images\": [image],\n", + "})" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Foundation models" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Retrieval Augmented Generation (RAG)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### \"Reasoning\" models" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "prompt = \"\"\"Judy wrote a 2-page letter to 3 friends twice a week for 3 months.\n", + "How many letters did she write?\n", + "Be brief, and add \"ANSWER:\" before your final answer.\"\"\"\n", + "\n", + "gemma_lm.compile(sampler=\"random\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "gemma_lm.generate(PROMPT_TEMPLATE.format(prompt))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "gemma_lm.generate(PROMPT_TEMPLATE.format(prompt))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Where are LLMs heading next?" 
+ ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [], + "name": "chapter16_text-generation", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/chapter17_image-generation.ipynb b/chapter17_image-generation.ipynb new file mode 100644 index 0000000000..f78b560505 --- /dev/null +++ b/chapter17_image-generation.ipynb @@ -0,0 +1,902 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. 
To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Image generation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Deep learning for image generation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Sampling from latent spaces of images" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Variational autoencoders" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Implementing a VAE with Keras" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "from keras import layers\n", + "\n", + "latent_dim = 2\n", + "\n", + "image_inputs = keras.Input(shape=(28, 28, 1))\n", + "x = layers.Conv2D(32, 3, activation=\"relu\", strides=2, padding=\"same\")(\n", + " image_inputs\n", + ")\n", + "x = layers.Conv2D(64, 3, activation=\"relu\", strides=2, padding=\"same\")(x)\n", + "x = layers.Flatten()(x)\n", + "x = layers.Dense(16, activation=\"relu\")(x)\n", + "z_mean = layers.Dense(latent_dim, name=\"z_mean\")(x)\n", + "z_log_var = layers.Dense(latent_dim, name=\"z_log_var\")(x)\n", + "encoder = keras.Model(image_inputs, [z_mean, z_log_var], name=\"encoder\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "encoder.summary(line_length=80)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras import ops\n", + "\n", + "class Sampler(keras.Layer):\n", + " def __init__(self, **kwargs):\n", + " super().__init__(**kwargs)\n", + " self.seed_generator = keras.random.SeedGenerator()\n", + " self.built = True\n", + "\n", + " def call(self, z_mean, z_log_var):\n", + " batch_size = ops.shape(z_mean)[0]\n", + " z_size = ops.shape(z_mean)[1]\n", + " epsilon = keras.random.normal(\n", + " (batch_size, z_size), seed=self.seed_generator\n", + " )\n", + " return z_mean + ops.exp(0.5 * z_log_var) * epsilon" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "latent_inputs = keras.Input(shape=(latent_dim,))\n", + "x = layers.Dense(7 * 7 * 64, activation=\"relu\")(latent_inputs)\n", + "x = layers.Reshape((7, 7, 64))(x)\n", + "x = layers.Conv2DTranspose(64, 3, activation=\"relu\", strides=2, padding=\"same\")(\n", + " x\n", + ")\n", + "x = layers.Conv2DTranspose(32, 3, activation=\"relu\", strides=2, padding=\"same\")(\n", + " x\n", + ")\n", + "decoder_outputs = layers.Conv2D(1, 3, activation=\"sigmoid\", padding=\"same\")(x)\n", + "decoder = keras.Model(latent_inputs, decoder_outputs, name=\"decoder\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "decoder.summary(line_length=80)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "class VAE(keras.Model):\n", + " def __init__(self, encoder, decoder, **kwargs):\n", + " super().__init__(**kwargs)\n", + " 
self.encoder = encoder\n", + " self.decoder = decoder\n", + " self.sampler = Sampler()\n", + " self.reconstruction_loss_tracker = keras.metrics.Mean(\n", + " name=\"reconstruction_loss\"\n", + " )\n", + " self.kl_loss_tracker = keras.metrics.Mean(name=\"kl_loss\")\n", + "\n", + " def call(self, inputs):\n", + " return self.encoder(inputs)\n", + "\n", + " def compute_loss(self, x, y, y_pred, sample_weight=None, training=True):\n", + " original = x\n", + " z_mean, z_log_var = y_pred\n", + " reconstruction = self.decoder(self.sampler(z_mean, z_log_var))\n", + "\n", + " reconstruction_loss = ops.mean(\n", + " ops.sum(\n", + " keras.losses.binary_crossentropy(x, reconstruction), axis=(1, 2)\n", + " )\n", + " )\n", + " kl_loss = -0.5 * (\n", + " 1 + z_log_var - ops.square(z_mean) - ops.exp(z_log_var)\n", + " )\n", + " total_loss = reconstruction_loss + ops.mean(kl_loss)\n", + "\n", + " self.reconstruction_loss_tracker.update_state(reconstruction_loss)\n", + " self.kl_loss_tracker.update_state(kl_loss)\n", + " return total_loss" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "(x_train, _), (x_test, _) = keras.datasets.mnist.load_data()\n", + "mnist_digits = np.concatenate([x_train, x_test], axis=0)\n", + "mnist_digits = np.expand_dims(mnist_digits, -1).astype(\"float32\") / 255\n", + "\n", + "vae = VAE(encoder, decoder)\n", + "vae.compile(optimizer=keras.optimizers.Adam())\n", + "vae.fit(mnist_digits, epochs=30, batch_size=128)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "n = 30\n", + "digit_size = 28\n", + "figure = np.zeros((digit_size * n, digit_size * n))\n", + "\n", + "grid_x = np.linspace(-1, 1, n)\n", + "grid_y = np.linspace(-1, 1, n)[::-1]\n", + "\n", + "for i, yi in enumerate(grid_y):\n", + " for j, xi in enumerate(grid_x):\n", + " z_sample = np.array([[xi, yi]])\n", + " x_decoded = vae.decoder.predict(z_sample)\n", + " digit = x_decoded[0].reshape(digit_size, digit_size)\n", + " figure[\n", + " i * digit_size : (i + 1) * digit_size,\n", + " j * digit_size : (j + 1) * digit_size,\n", + " ] = digit\n", + "\n", + "plt.figure(figsize=(15, 15))\n", + "start_range = digit_size // 2\n", + "end_range = n * digit_size + start_range\n", + "pixel_range = np.arange(start_range, end_range, digit_size)\n", + "sample_range_x = np.round(grid_x, 1)\n", + "sample_range_y = np.round(grid_y, 1)\n", + "plt.xticks(pixel_range, sample_range_x)\n", + "plt.yticks(pixel_range, sample_range_y)\n", + "plt.xlabel(\"z[0]\")\n", + "plt.ylabel(\"z[1]\")\n", + "plt.axis(\"off\")\n", + "plt.imshow(figure, cmap=\"Greys_r\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Diffusion models" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The Oxford Flowers dataset" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "\n", + "fpath = keras.utils.get_file(\n", + " origin=\"https://www.robots.ox.ac.uk/~vgg/data/flowers/102/102flowers.tgz\",\n", + " extract=True,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "batch_size = 32\n", + "image_size = 128\n", + 
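The `kl_loss` term in `compute_loss` above is the closed-form KL divergence between the encoder's Gaussian N(z_mean, exp(z_log_var)) and a standard normal. A quick standalone check of that formula with made-up values:

```python
import numpy as np

z_mean, z_log_var = 0.5, np.log(0.8)
kl = -0.5 * (1 + z_log_var - z_mean**2 - np.exp(z_log_var))
print(kl)  # a small positive number; exactly 0 when z_mean=0 and z_log_var=0
```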
"images_dir = os.path.join(fpath, \"jpg\")\n", + "dataset = keras.utils.image_dataset_from_directory(\n", + " images_dir,\n", + " labels=None,\n", + " image_size=(image_size, image_size),\n", + " crop_to_aspect_ratio=True,\n", + ")\n", + "dataset = dataset.rebatch(\n", + " batch_size,\n", + " drop_remainder=True,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from matplotlib import pyplot as plt\n", + "\n", + "for batch in dataset:\n", + " img = batch.numpy()[0]\n", + " break\n", + "plt.imshow(img.astype(\"uint8\"))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### A U-Net denoising autoencoder" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def residual_block(x, width):\n", + " input_width = x.shape[3]\n", + " if input_width == width:\n", + " residual = x\n", + " else:\n", + " residual = layers.Conv2D(width, 1)(x)\n", + " x = layers.BatchNormalization(center=False, scale=False)(x)\n", + " x = layers.Conv2D(width, 3, padding=\"same\", activation=\"swish\")(x)\n", + " x = layers.Conv2D(width, 3, padding=\"same\")(x)\n", + " x = x + residual\n", + " return x\n", + "\n", + "def get_model(image_size, widths, block_depth):\n", + " noisy_images = keras.Input(shape=(image_size, image_size, 3))\n", + " noise_rates = keras.Input(shape=(1, 1, 1))\n", + "\n", + " x = layers.Conv2D(widths[0], 1)(noisy_images)\n", + " n = layers.UpSampling2D(image_size, interpolation=\"nearest\")(noise_rates)\n", + " x = layers.Concatenate()([x, n])\n", + "\n", + " skips = []\n", + " for width in widths[:-1]:\n", + " for _ in range(block_depth):\n", + " x = residual_block(x, width)\n", + " skips.append(x)\n", + " x = layers.AveragePooling2D(pool_size=2)(x)\n", + "\n", + " for _ in range(block_depth):\n", + " x = residual_block(x, widths[-1])\n", + "\n", + " for width in reversed(widths[:-1]):\n", + " x = layers.UpSampling2D(size=2, interpolation=\"bilinear\")(x)\n", + " for _ in range(block_depth):\n", + " x = layers.Concatenate()([x, skips.pop()])\n", + " x = residual_block(x, width)\n", + "\n", + " pred_noise_masks = layers.Conv2D(3, 1, kernel_initializer=\"zeros\")(x)\n", + "\n", + " return keras.Model([noisy_images, noise_rates], pred_noise_masks)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The concepts of diffusion time and diffusion schedule" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def diffusion_schedule(\n", + " diffusion_times,\n", + " min_signal_rate=0.02,\n", + " max_signal_rate=0.95,\n", + "):\n", + " start_angle = ops.cast(ops.arccos(max_signal_rate), \"float32\")\n", + " end_angle = ops.cast(ops.arccos(min_signal_rate), \"float32\")\n", + " diffusion_angles = start_angle + diffusion_times * (end_angle - start_angle)\n", + " signal_rates = ops.cos(diffusion_angles)\n", + " noise_rates = ops.sin(diffusion_angles)\n", + " return noise_rates, signal_rates" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "diffusion_times = ops.arange(0.0, 1.0, 0.01)\n", + "noise_rates, signal_rates = diffusion_schedule(diffusion_times)\n", + "\n", + "diffusion_times = ops.convert_to_numpy(diffusion_times)\n", + "noise_rates = 
ops.convert_to_numpy(noise_rates)\n", + "signal_rates = ops.convert_to_numpy(signal_rates)\n", + "\n", + "plt.plot(diffusion_times, noise_rates, label=\"Noise rate\")\n", + "plt.plot(diffusion_times, signal_rates, label=\"Signal rate\")\n", + "\n", + "plt.xlabel(\"Diffusion time\")\n", + "plt.legend()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The training process" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "class DiffusionModel(keras.Model):\n", + " def __init__(self, image_size, widths, block_depth, **kwargs):\n", + " super().__init__(**kwargs)\n", + " self.image_size = image_size\n", + " self.denoising_model = get_model(image_size, widths, block_depth)\n", + " self.seed_generator = keras.random.SeedGenerator()\n", + " self.loss = keras.losses.MeanAbsoluteError()\n", + " self.normalizer = keras.layers.Normalization()\n", + "\n", + " def denoise(self, noisy_images, noise_rates, signal_rates):\n", + " pred_noise_masks = self.denoising_model([noisy_images, noise_rates])\n", + " pred_images = (\n", + " noisy_images - noise_rates * pred_noise_masks\n", + " ) / signal_rates\n", + " return pred_images, pred_noise_masks\n", + "\n", + " def call(self, images):\n", + " images = self.normalizer(images)\n", + " noise_masks = keras.random.normal(\n", + " (batch_size, self.image_size, self.image_size, 3),\n", + " seed=self.seed_generator,\n", + " )\n", + " diffusion_times = keras.random.uniform(\n", + " (batch_size, 1, 1, 1),\n", + " minval=0.0,\n", + " maxval=1.0,\n", + " seed=self.seed_generator,\n", + " )\n", + " noise_rates, signal_rates = diffusion_schedule(diffusion_times)\n", + " noisy_images = signal_rates * images + noise_rates * noise_masks\n", + " pred_images, pred_noise_masks = self.denoise(\n", + " noisy_images, noise_rates, signal_rates\n", + " )\n", + " return pred_images, pred_noise_masks, noise_masks\n", + "\n", + " def compute_loss(self, x, y, y_pred, sample_weight=None, training=True):\n", + " _, pred_noise_masks, noise_masks = y_pred\n", + " return self.loss(noise_masks, pred_noise_masks)\n", + "\n", + " def generate(self, num_images, diffusion_steps):\n", + " noisy_images = keras.random.normal(\n", + " (num_images, self.image_size, self.image_size, 3),\n", + " seed=self.seed_generator,\n", + " )\n", + " step_size = 1.0 / diffusion_steps\n", + " for step in range(diffusion_steps):\n", + " diffusion_times = ops.ones((num_images, 1, 1, 1)) - step * step_size\n", + " noise_rates, signal_rates = diffusion_schedule(diffusion_times)\n", + " pred_images, pred_noises = self.denoise(\n", + " noisy_images, noise_rates, signal_rates\n", + " )\n", + " next_diffusion_times = diffusion_times - step_size\n", + " next_noise_rates, next_signal_rates = diffusion_schedule(\n", + " next_diffusion_times\n", + " )\n", + " noisy_images = (\n", + " next_signal_rates * pred_images + next_noise_rates * pred_noises\n", + " )\n", + " images = (\n", + " self.normalizer.mean + pred_images * self.normalizer.variance**0.5\n", + " )\n", + " return ops.clip(images, 0.0, 255.0)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The generation process" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Visualizing results with a custom callback" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + },
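A useful property of the cosine `diffusion_schedule` defined above is that the signal and noise rates always lie on the unit circle, so a noisy image keeps unit variance when the inputs are normalized. A quick standalone verification of that identity:

```python
import numpy as np

t = np.linspace(0.0, 1.0, 5)
start, end = np.arccos(0.95), np.arccos(0.02)  # from the max/min signal rates above
angles = start + t * (end - start)
signal_rates, noise_rates = np.cos(angles), np.sin(angles)
print(signal_rates**2 + noise_rates**2)  # ~[1. 1. 1. 1. 1.] at every diffusion time
```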
"outputs": [], + "source": [ + "class VisualizationCallback(keras.callbacks.Callback):\n", + " def __init__(self, diffusion_steps=20, num_rows=3, num_cols=6):\n", + " self.diffusion_steps = diffusion_steps\n", + " self.num_rows = num_rows\n", + " self.num_cols = num_cols\n", + "\n", + " def on_epoch_end(self, epoch=None, logs=None):\n", + " generated_images = self.model.generate(\n", + " num_images=self.num_rows * self.num_cols,\n", + " diffusion_steps=self.diffusion_steps,\n", + " )\n", + "\n", + " plt.figure(figsize=(self.num_cols * 2.0, self.num_rows * 2.0))\n", + " for row in range(self.num_rows):\n", + " for col in range(self.num_cols):\n", + " i = row * self.num_cols + col\n", + " plt.subplot(self.num_rows, self.num_cols, i + 1)\n", + " img = ops.convert_to_numpy(generated_images[i]).astype(\"uint8\")\n", + " plt.imshow(img)\n", + " plt.axis(\"off\")\n", + " plt.tight_layout()\n", + " plt.show()\n", + " plt.close()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### It's go time!" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = DiffusionModel(image_size, widths=[32, 64, 96, 128], block_depth=2)\n", + "model.normalizer.adapt(dataset)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(\n", + " optimizer=keras.optimizers.AdamW(\n", + " learning_rate=keras.optimizers.schedules.InverseTimeDecay(\n", + " initial_learning_rate=1e-3,\n", + " decay_steps=1000,\n", + " decay_rate=0.1,\n", + " ),\n", + " use_ema=True,\n", + " ema_overwrite_frequency=100,\n", + " ),\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.fit(\n", + " dataset,\n", + " epochs=100,\n", + " callbacks=[\n", + " VisualizationCallback(),\n", + " keras.callbacks.ModelCheckpoint(\n", + " filepath=\"diffusion_model.weights.h5\",\n", + " save_weights_only=True,\n", + " save_best_only=True,\n", + " ),\n", + " ],\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Text-to-image models" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "if keras.config.backend() == \"torch\":\n", + " # The rest of this chapter will not do any training. The following keeps\n", + " # PyTorch from using too much memory by disabling gradients. 
TensorFlow and\n", + " # JAX use a much smaller memory footprint and do not need this hack.\n", + " import torch\n", + "\n", + " torch.set_grad_enabled(False)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras_hub\n", + "\n", + "height, width = 512, 512\n", + "task = keras_hub.models.TextToImage.from_preset(\n", + " \"stable_diffusion_3_medium\",\n", + " image_shape=(height, width, 3),\n", + " dtype=\"float16\",\n", + ")\n", + "prompt = \"A NASA astronaut riding an origami elephant in New York City\"\n", + "task.generate(prompt)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "task.generate(\n", + " {\n", + " \"prompts\": prompt,\n", + " \"negative_prompts\": \"blue color\",\n", + " }\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "from PIL import Image\n", + "\n", + "def display(images):\n", + " return Image.fromarray(np.concatenate(images, axis=1))\n", + "\n", + "display([task.generate(prompt, num_steps=x) for x in [5, 10, 15, 20, 25]])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Exploring the latent space of a text-to-image model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras import random\n", + "\n", + "def get_text_embeddings(prompt):\n", + " token_ids = task.preprocessor.generate_preprocess([prompt])\n", + " negative_token_ids = task.preprocessor.generate_preprocess([\"\"])\n", + " return task.backbone.encode_text_step(token_ids, negative_token_ids)\n", + "\n", + "def denoise_with_text_embeddings(embeddings, num_steps=28, guidance_scale=7.0):\n", + " latents = random.normal((1, height // 8, width // 8, 16))\n", + " for step in range(num_steps):\n", + " latents = task.backbone.denoise_step(\n", + " latents,\n", + " embeddings,\n", + " step,\n", + " num_steps,\n", + " guidance_scale,\n", + " )\n", + " return task.backbone.decode_step(latents)[0]\n", + "\n", + "def scale_output(x):\n", + " x = ops.convert_to_numpy(x)\n", + " x = np.clip((x + 1.0) / 2.0, 0.0, 1.0)\n", + " return np.round(x * 255.0).astype(\"uint8\")\n", + "\n", + "embeddings = get_text_embeddings(prompt)\n", + "image = denoise_with_text_embeddings(embeddings)\n", + "scale_output(image)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "[x.shape for x in embeddings]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras import ops\n", + "\n", + "def slerp(t, v1, v2):\n", + " v1, v2 = ops.cast(v1, \"float32\"), ops.cast(v2, \"float32\")\n", + " v1_norm = ops.linalg.norm(ops.ravel(v1))\n", + " v2_norm = ops.linalg.norm(ops.ravel(v2))\n", + " dot = ops.sum(v1 * v2 / (v1_norm * v2_norm))\n", + " theta_0 = ops.arccos(dot)\n", + " sin_theta_0 = ops.sin(theta_0)\n", + " theta_t = theta_0 * t\n", + " sin_theta_t = ops.sin(theta_t)\n", + " s0 = ops.sin(theta_0 - theta_t) / sin_theta_0\n", + " s1 = sin_theta_t / sin_theta_0\n", + " return s0 * v1 + s1 * v2\n", + "\n", + "def interpolate_text_embeddings(e1, e2, start=0, stop=1, num=10):\n", + " embeddings = []\n", + " for t in 
np.linspace(start, stop, num):\n", + " embeddings.append(\n", + " (\n", + " slerp(t, e1[0], e2[0]),\n", + " e1[1],\n", + " slerp(t, e1[2], e2[2]),\n", + " e1[3],\n", + " )\n", + " )\n", + " return embeddings" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "prompt1 = \"A friendly dog looking up in a field of flowers\"\n", + "prompt2 = \"A horrifying, tentacled creature hovering over a field of flowers\"\n", + "e1 = get_text_embeddings(prompt1)\n", + "e2 = get_text_embeddings(prompt2)\n", + "\n", + "images = []\n", + "for et in interpolate_text_embeddings(e1, e2, start=0.5, stop=0.6, num=9):\n", + " image = denoise_with_text_embeddings(et)\n", + " images.append(scale_output(image))\n", + "display(images)" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [], + "name": "chapter17_image-generation", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/chapter18_best-practices-for-the-real-world.ipynb b/chapter18_best-practices-for-the-real-world.ipynb new file mode 100644 index 0000000000..d7e28359aa --- /dev/null +++ b/chapter18_best-practices-for-the-real-world.ipynb @@ -0,0 +1,598 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Third Edition](https://www.manning.com/books/deep-learning-with-python-third-edition). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThe book's contents are available online at [deeplearningwithpython.io](https://deeplearningwithpython.io)." + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras keras-hub --upgrade -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "os.environ[\"KERAS_BACKEND\"] = \"jax\"" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "cellView": "form", + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# @title\n", + "import os\n", + "from IPython.core.magic import register_cell_magic\n", + "\n", + "@register_cell_magic\n", + "def backend(line, cell):\n", + " current, required = os.environ.get(\"KERAS_BACKEND\", \"\"), line.split()[-1]\n", + " if current == required:\n", + " get_ipython().run_cell(cell)\n", + " else:\n", + " print(\n", + " f\"This cell requires the {required} backend. 
To run it, change KERAS_BACKEND to \"\n", + " f\"\\\"{required}\\\" at the top of the notebook, restart the runtime, and rerun the notebook.\"\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Best practices for the real world" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Getting the most out of your models" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Hyperparameter optimization" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Using KerasTuner" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras-tuner -q" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras\n", + "from keras import layers\n", + "\n", + "def build_model(hp):\n", + " units = hp.Int(name=\"units\", min_value=16, max_value=64, step=16)\n", + " model = keras.Sequential(\n", + " [\n", + " layers.Dense(units, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\"),\n", + " ]\n", + " )\n", + " optimizer = hp.Choice(name=\"optimizer\", values=[\"rmsprop\", \"adam\"])\n", + " model.compile(\n", + " optimizer=optimizer,\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + " )\n", + " return model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import keras_tuner as kt\n", + "\n", + "class SimpleMLP(kt.HyperModel):\n", + " def __init__(self, num_classes):\n", + " self.num_classes = num_classes\n", + "\n", + " def build(self, hp):\n", + " units = hp.Int(name=\"units\", min_value=16, max_value=64, step=16)\n", + " model = keras.Sequential(\n", + " [\n", + " layers.Dense(units, activation=\"relu\"),\n", + " layers.Dense(self.num_classes, activation=\"softmax\"),\n", + " ]\n", + " )\n", + " optimizer = hp.Choice(name=\"optimizer\", values=[\"rmsprop\", \"adam\"])\n", + " model.compile(\n", + " optimizer=optimizer,\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"],\n", + " )\n", + " return model\n", + "\n", + "hypermodel = SimpleMLP(num_classes=10)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "tuner = kt.BayesianOptimization(\n", + " build_model,\n", + " objective=\"val_accuracy\",\n", + " max_trials=20,\n", + " executions_per_trial=2,\n", + " directory=\"mnist_kt_test\",\n", + " overwrite=True,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "tuner.search_space_summary()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()\n", + "x_train = x_train.reshape((-1, 28 * 28)).astype(\"float32\") / 255\n", + "x_test = x_test.reshape((-1, 28 * 28)).astype(\"float32\") / 255\n", + "x_train_full = x_train[:]\n", + "y_train_full = y_train[:]\n", + "num_val_samples = 10000\n", + "x_train, x_val = x_train[:-num_val_samples], x_train[-num_val_samples:]\n", + "y_train, y_val = 
y_train[:-num_val_samples], y_train[-num_val_samples:]\n", + "callbacks = [\n", + " keras.callbacks.EarlyStopping(monitor=\"val_loss\", patience=5),\n", + "]\n", + "tuner.search(\n", + " x_train,\n", + " y_train,\n", + " batch_size=128,\n", + " epochs=100,\n", + " validation_data=(x_val, y_val),\n", + " callbacks=callbacks,\n", + " verbose=2,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "top_n = 4\n", + "best_hps = tuner.get_best_hyperparameters(top_n)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def get_best_epoch(hp):\n", + " model = build_model(hp)\n", + " callbacks = [\n", + " keras.callbacks.EarlyStopping(\n", + " monitor=\"val_loss\", mode=\"min\", patience=10\n", + " )\n", + " ]\n", + " history = model.fit(\n", + " x_train,\n", + " y_train,\n", + " validation_data=(x_val, y_val),\n", + " epochs=100,\n", + " batch_size=128,\n", + " callbacks=callbacks,\n", + " )\n", + " val_loss_per_epoch = history.history[\"val_loss\"]\n", + " best_epoch = val_loss_per_epoch.index(min(val_loss_per_epoch)) + 1\n", + " print(f\"Best epoch: {best_epoch}\")\n", + " return best_epoch" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def get_best_trained_model(hp):\n", + " best_epoch = get_best_epoch(hp)\n", + " model = build_model(hp)\n", + " model.fit(\n", + " x_train_full, y_train_full, batch_size=128, epochs=int(best_epoch * 1.2)\n", + " )\n", + " return model\n", + "\n", + "best_models = []\n", + "for hp in best_hps:\n", + " model = get_best_trained_model(hp)\n", + " model.evaluate(x_test, y_test)\n", + " best_models.append(model)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "best_models = tuner.get_best_models(top_n)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### The art of crafting the right search space" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### The future of hyperparameter tuning: automated machine learning" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Model ensembling" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Scaling up model training with multiple devices" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Multi-GPU training" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Data parallelism: Replicating your model on each GPU" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Model parallelism: Splitting your model across multiple GPUs" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Distributed training in practice" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Getting your hands on two or more GPUs" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Using data parallelism with JAX" + ] + }, + { + "cell_type": "markdown", + "metadata": { 
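The distributed-training sections above are headings-only in this notebook. As a minimal, hedged sketch of data parallelism with Keras 3 on the JAX backend (configuration only; it needs multiple accelerators to actually split batches, and the API shown is the `keras.distribution` namespace as of Keras 3):

```python
import keras

# Replicate the model on every available device; each batch is split across
# devices and gradients are aggregated automatically.
distribution = keras.distribution.DataParallel()
keras.distribution.set_distribution(distribution)
# Any model built and fit after this point uses the data-parallel layout.
```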
+ "colab_type": "text" + }, + "source": [ + "##### Using model parallelism with JAX" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "###### The DeviceMesh API" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "###### The LayoutMap API" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### TPU training" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Using step fusing to improve TPU utilization" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Speeding up training and inference with lower-precision computation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Understanding floating-point precision" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Float16 inference" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Mixed-precision training" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Using loss scaling with mixed precision" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "##### Beyond mixed precision: float8 training" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Faster inference with quantization" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from keras import ops\n", + "\n", + "x = ops.array([[0.1, 0.9], [1.2, -0.8]])\n", + "kernel = ops.array([[-0.1, -2.2], [1.1, 0.7]])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def abs_max_quantize(value):\n", + " abs_max = ops.max(ops.abs(value), keepdims=True)\n", + " scale = ops.divide(127, abs_max + 1e-7)\n", + " scaled_value = value * scale\n", + " scaled_value = ops.clip(ops.round(scaled_value), -127, 127)\n", + " scaled_value = ops.cast(scaled_value, dtype=\"int8\")\n", + " return scaled_value, scale\n", + "\n", + "int_x, x_scale = abs_max_quantize(x)\n", + "int_kernel, kernel_scale = abs_max_quantize(kernel)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "int_y = ops.matmul(int_x, int_kernel)\n", + "y = ops.cast(int_y, dtype=\"float32\") / (x_scale * kernel_scale)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "y" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "ops.matmul(x, kernel)" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "collapsed_sections": [], + "name": "chapter18_best-practices-for-the-real-world", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": 
"python", + "pygments_lexer": "ipython3", + "version": "3.10.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/2.1-a-first-look-at-a-neural-network.ipynb b/first_edition/2.1-a-first-look-at-a-neural-network.ipynb similarity index 100% rename from 2.1-a-first-look-at-a-neural-network.ipynb rename to first_edition/2.1-a-first-look-at-a-neural-network.ipynb diff --git a/3.5-classifying-movie-reviews.ipynb b/first_edition/3.5-classifying-movie-reviews.ipynb similarity index 100% rename from 3.5-classifying-movie-reviews.ipynb rename to first_edition/3.5-classifying-movie-reviews.ipynb diff --git a/3.6-classifying-newswires.ipynb b/first_edition/3.6-classifying-newswires.ipynb similarity index 100% rename from 3.6-classifying-newswires.ipynb rename to first_edition/3.6-classifying-newswires.ipynb diff --git a/3.7-predicting-house-prices.ipynb b/first_edition/3.7-predicting-house-prices.ipynb similarity index 100% rename from 3.7-predicting-house-prices.ipynb rename to first_edition/3.7-predicting-house-prices.ipynb diff --git a/4.4-overfitting-and-underfitting.ipynb b/first_edition/4.4-overfitting-and-underfitting.ipynb similarity index 100% rename from 4.4-overfitting-and-underfitting.ipynb rename to first_edition/4.4-overfitting-and-underfitting.ipynb diff --git a/5.1-introduction-to-convnets.ipynb b/first_edition/5.1-introduction-to-convnets.ipynb similarity index 100% rename from 5.1-introduction-to-convnets.ipynb rename to first_edition/5.1-introduction-to-convnets.ipynb diff --git a/5.2-using-convnets-with-small-datasets.ipynb b/first_edition/5.2-using-convnets-with-small-datasets.ipynb similarity index 100% rename from 5.2-using-convnets-with-small-datasets.ipynb rename to first_edition/5.2-using-convnets-with-small-datasets.ipynb diff --git a/5.3-using-a-pretrained-convnet.ipynb b/first_edition/5.3-using-a-pretrained-convnet.ipynb similarity index 100% rename from 5.3-using-a-pretrained-convnet.ipynb rename to first_edition/5.3-using-a-pretrained-convnet.ipynb diff --git a/5.4-visualizing-what-convnets-learn.ipynb b/first_edition/5.4-visualizing-what-convnets-learn.ipynb similarity index 100% rename from 5.4-visualizing-what-convnets-learn.ipynb rename to first_edition/5.4-visualizing-what-convnets-learn.ipynb diff --git a/6.1-one-hot-encoding-of-words-or-characters.ipynb b/first_edition/6.1-one-hot-encoding-of-words-or-characters.ipynb similarity index 100% rename from 6.1-one-hot-encoding-of-words-or-characters.ipynb rename to first_edition/6.1-one-hot-encoding-of-words-or-characters.ipynb diff --git a/6.1-using-word-embeddings.ipynb b/first_edition/6.1-using-word-embeddings.ipynb similarity index 100% rename from 6.1-using-word-embeddings.ipynb rename to first_edition/6.1-using-word-embeddings.ipynb diff --git a/6.2-understanding-recurrent-neural-networks.ipynb b/first_edition/6.2-understanding-recurrent-neural-networks.ipynb similarity index 100% rename from 6.2-understanding-recurrent-neural-networks.ipynb rename to first_edition/6.2-understanding-recurrent-neural-networks.ipynb diff --git a/6.3-advanced-usage-of-recurrent-neural-networks.ipynb b/first_edition/6.3-advanced-usage-of-recurrent-neural-networks.ipynb similarity index 100% rename from 6.3-advanced-usage-of-recurrent-neural-networks.ipynb rename to first_edition/6.3-advanced-usage-of-recurrent-neural-networks.ipynb diff --git a/6.4-sequence-processing-with-convnets.ipynb b/first_edition/6.4-sequence-processing-with-convnets.ipynb similarity index 100% rename from 
6.4-sequence-processing-with-convnets.ipynb rename to first_edition/6.4-sequence-processing-with-convnets.ipynb diff --git a/8.1-text-generation-with-lstm.ipynb b/first_edition/8.1-text-generation-with-lstm.ipynb similarity index 100% rename from 8.1-text-generation-with-lstm.ipynb rename to first_edition/8.1-text-generation-with-lstm.ipynb diff --git a/8.2-deep-dream.ipynb b/first_edition/8.2-deep-dream.ipynb similarity index 100% rename from 8.2-deep-dream.ipynb rename to first_edition/8.2-deep-dream.ipynb diff --git a/8.3-neural-style-transfer.ipynb b/first_edition/8.3-neural-style-transfer.ipynb similarity index 100% rename from 8.3-neural-style-transfer.ipynb rename to first_edition/8.3-neural-style-transfer.ipynb diff --git a/8.4-generating-images-with-vaes.ipynb b/first_edition/8.4-generating-images-with-vaes.ipynb similarity index 100% rename from 8.4-generating-images-with-vaes.ipynb rename to first_edition/8.4-generating-images-with-vaes.ipynb diff --git a/8.5-introduction-to-gans.ipynb b/first_edition/8.5-introduction-to-gans.ipynb similarity index 100% rename from 8.5-introduction-to-gans.ipynb rename to first_edition/8.5-introduction-to-gans.ipynb diff --git a/second_edition/README.md b/second_edition/README.md new file mode 100644 index 0000000000..53b72c363f --- /dev/null +++ b/second_edition/README.md @@ -0,0 +1,30 @@ +# Second edition notebooks + +These are the notebooks for the second edition of the book, originally published in 2021. These notebooks use `tf.keras` with TensorFlow 2.16. + +## Table of contents + +* [Chapter 2: The mathematical building blocks of neural networks](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter02_mathematical-building-blocks.ipynb) +* [Chapter 3: Introduction to Keras and TensorFlow](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter03_introduction-to-keras-and-tf.ipynb) +* [Chapter 4: Getting started with neural networks: classification and regression](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter04_getting-started-with-neural-networks.ipynb) +* [Chapter 5: Fundamentals of machine learning](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter05_fundamentals-of-ml.ipynb) +* [Chapter 7: Working with Keras: a deep dive](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter07_working-with-keras.ipynb) +* [Chapter 8: Introduction to deep learning for computer vision](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter08_intro-to-dl-for-computer-vision.ipynb) +* Chapter 9: Advanced deep learning for computer vision + - [Part 1: Image segmentation](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter09_part01_image-segmentation.ipynb) + - [Part 2: Modern convnet architecture patterns](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter09_part02_modern-convnet-architecture-patterns.ipynb) + - [Part 3: Interpreting what convnets 
learn](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter09_part03_interpreting-what-convnets-learn.ipynb) +* [Chapter 10: Deep learning for timeseries](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter10_dl-for-timeseries.ipynb) +* Chapter 11: Deep learning for text + - [Part 1: Introduction](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter11_part01_introduction.ipynb) + - [Part 2: Sequence models](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter11_part02_sequence-models.ipynb) + - [Part 3: Transformer](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter11_part03_transformer.ipynb) + - [Part 4: Sequence-to-sequence learning](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter11_part04_sequence-to-sequence-learning.ipynb) +* Chapter 12: Generative deep learning + - [Part 1: Text generation](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter12_part01_text-generation.ipynb) + - [Part 2: Deep Dream](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter12_part02_deep-dream.ipynb) + - [Part 3: Neural style transfer](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter12_part03_neural-style-transfer.ipynb) + - [Part 4: Variational autoencoders](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter12_part04_variational-autoencoders.ipynb) + - [Part 5: Generative adversarial networks](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter12_part05_gans.ipynb) +* [Chapter 13: Best practices for the real world](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter13_best-practices-for-the-real-world.ipynb) +* [Chapter 14: Conclusions](https://colab.research.google.com/github/fchollet/deep-learning-with-python-notebooks/blob/master/second_edition/chapter14_conclusions.ipynb) diff --git a/second_edition/chapter02_mathematical-building-blocks.ipynb b/second_edition/chapter02_mathematical-building-blocks.ipynb new file mode 100644 index 0000000000..01edc9becc --- /dev/null +++ b/second_edition/chapter02_mathematical-building-blocks.ipynb @@ -0,0 +1,1469 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "# The mathematical building blocks of neural networks" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## A first look at a neural network" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Loading the MNIST dataset in Keras**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras.datasets import mnist\n", + "(train_images, train_labels), (test_images, test_labels) = mnist.load_data()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_images.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "len(train_labels)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_labels" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_images.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "len(test_labels)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_labels" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**The network architecture**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow import keras\n", + "from tensorflow.keras import layers\n", + "model = keras.Sequential([\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\")\n", + "])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**The compilation step**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Preparing the image data**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_images = train_images.reshape((60000, 28 * 28))\n", + "train_images = train_images.astype(\"float32\") / 255\n", + "test_images = test_images.reshape((10000, 28 * 28))\n", + "test_images = test_images.astype(\"float32\") / 255" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**\"Fitting\" the model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.fit(train_images, train_labels, epochs=5, batch_size=128)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Using the model to make predictions**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + 
"colab_type": "code" + }, + "outputs": [], + "source": [ + "test_digits = test_images[0:10]\n", + "predictions = model.predict(test_digits)\n", + "predictions[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "predictions[0].argmax()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "predictions[0][7]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_labels[0]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Evaluating the model on new data**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_loss, test_acc = model.evaluate(test_images, test_labels)\n", + "print(f\"test_acc: {test_acc}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Data representations for neural networks" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Scalars (rank-0 tensors)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "x = np.array(12)\n", + "x" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x.ndim" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Vectors (rank-1 tensors)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = np.array([12, 3, 6, 14, 7])\n", + "x" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x.ndim" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Matrices (rank-2 tensors)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = np.array([[5, 78, 2, 34, 0],\n", + " [6, 79, 3, 35, 1],\n", + " [7, 80, 4, 36, 2]])\n", + "x.ndim" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Rank-3 and higher-rank tensors" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = np.array([[[5, 78, 2, 34, 0],\n", + " [6, 79, 3, 35, 1],\n", + " [7, 80, 4, 36, 2]],\n", + " [[5, 78, 2, 34, 0],\n", + " [6, 79, 3, 35, 1],\n", + " [7, 80, 4, 36, 2]],\n", + " [[5, 78, 2, 34, 0],\n", + " [6, 79, 3, 35, 1],\n", + " [7, 80, 4, 36, 2]]])\n", + "x.ndim" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Key attributes" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras.datasets import mnist\n", + "(train_images, train_labels), (test_images, test_labels) = mnist.load_data()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_images.ndim" + 
] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_images.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_images.dtype" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Displaying the fourth digit**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "digit = train_images[4]\n", + "plt.imshow(digit, cmap=plt.cm.binary)\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_labels[4]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Manipulating tensors in NumPy" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "my_slice = train_images[10:100]\n", + "my_slice.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "my_slice = train_images[10:100, :, :]\n", + "my_slice.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "my_slice = train_images[10:100, 0:28, 0:28]\n", + "my_slice.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "my_slice = train_images[:, 14:, 14:]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "my_slice = train_images[:, 7:-7, 7:-7]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The notion of data batches" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "batch = train_images[:128]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "batch = train_images[128:256]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "n = 3\n", + "batch = train_images[128 * n:128 * (n + 1)]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Real-world examples of data tensors" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Vector data" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Timeseries data or sequence data" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Image data" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Video data" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## The gears of neural networks: tensor operations" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Element-wise operations" + ] + }, + { + 
"cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def naive_relu(x):\n", + " assert len(x.shape) == 2\n", + " x = x.copy()\n", + " for i in range(x.shape[0]):\n", + " for j in range(x.shape[1]):\n", + " x[i, j] = max(x[i, j], 0)\n", + " return x" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def naive_add(x, y):\n", + " assert len(x.shape) == 2\n", + " assert x.shape == y.shape\n", + " x = x.copy()\n", + " for i in range(x.shape[0]):\n", + " for j in range(x.shape[1]):\n", + " x[i, j] += y[i, j]\n", + " return x" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import time\n", + "\n", + "x = np.random.random((20, 100))\n", + "y = np.random.random((20, 100))\n", + "\n", + "t0 = time.time()\n", + "for _ in range(1000):\n", + " z = x + y\n", + " z = np.maximum(z, 0.)\n", + "print(\"Took: {0:.2f} s\".format(time.time() - t0))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "t0 = time.time()\n", + "for _ in range(1000):\n", + " z = naive_add(x, y)\n", + " z = naive_relu(z)\n", + "print(\"Took: {0:.2f} s\".format(time.time() - t0))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Broadcasting" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "X = np.random.random((32, 10))\n", + "y = np.random.random((10,))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "y = np.expand_dims(y, axis=0)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "Y = np.concatenate([y] * 32, axis=0)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def naive_add_matrix_and_vector(x, y):\n", + " assert len(x.shape) == 2\n", + " assert len(y.shape) == 1\n", + " assert x.shape[1] == y.shape[0]\n", + " x = x.copy()\n", + " for i in range(x.shape[0]):\n", + " for j in range(x.shape[1]):\n", + " x[i, j] += y[j]\n", + " return x" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "x = np.random.random((64, 3, 32, 10))\n", + "y = np.random.random((32, 10))\n", + "z = np.maximum(x, y)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Tensor product" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = np.random.random((32,))\n", + "y = np.random.random((32,))\n", + "z = np.dot(x, y)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def naive_vector_dot(x, y):\n", + " assert len(x.shape) == 1\n", + " assert len(y.shape) == 1\n", + " assert x.shape[0] == y.shape[0]\n", + " z = 0.\n", + " for i in range(x.shape[0]):\n", + " z += x[i] * y[i]\n", + " return z" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + 
"metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def naive_matrix_vector_dot(x, y):\n", + " assert len(x.shape) == 2\n", + " assert len(y.shape) == 1\n", + " assert x.shape[1] == y.shape[0]\n", + " z = np.zeros(x.shape[0])\n", + " for i in range(x.shape[0]):\n", + " for j in range(x.shape[1]):\n", + " z[i] += x[i, j] * y[j]\n", + " return z" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def naive_matrix_vector_dot(x, y):\n", + " z = np.zeros(x.shape[0])\n", + " for i in range(x.shape[0]):\n", + " z[i] = naive_vector_dot(x[i, :], y)\n", + " return z" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def naive_matrix_dot(x, y):\n", + " assert len(x.shape) == 2\n", + " assert len(y.shape) == 2\n", + " assert x.shape[1] == y.shape[0]\n", + " z = np.zeros((x.shape[0], y.shape[1]))\n", + " for i in range(x.shape[0]):\n", + " for j in range(y.shape[1]):\n", + " row_x = x[i, :]\n", + " column_y = y[:, j]\n", + " z[i, j] = naive_vector_dot(row_x, column_y)\n", + " return z" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Tensor reshaping" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_images = train_images.reshape((60000, 28 * 28))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = np.array([[0., 1.],\n", + " [2., 3.],\n", + " [4., 5.]])\n", + "x.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = x.reshape((6, 1))\n", + "x" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = np.zeros((300, 20))\n", + "x = np.transpose(x)\n", + "x.shape" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Geometric interpretation of tensor operations" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### A geometric interpretation of deep learning" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## The engine of neural networks: gradient-based optimization" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### What's a derivative?" 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Derivative of a tensor operation: the gradient" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Stochastic gradient descent" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Chaining derivatives: The Backpropagation algorithm" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The chain rule" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Automatic differentiation with computation graphs" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The gradient tape in TensorFlow" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "x = tf.Variable(0.)\n", + "with tf.GradientTape() as tape:\n", + " y = 2 * x + 3\n", + "grad_of_y_wrt_x = tape.gradient(y, x)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = tf.Variable(tf.random.uniform((2, 2)))\n", + "with tf.GradientTape() as tape:\n", + " y = 2 * x + 3\n", + "grad_of_y_wrt_x = tape.gradient(y, x)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "W = tf.Variable(tf.random.uniform((2, 2)))\n", + "b = tf.Variable(tf.zeros((2,)))\n", + "x = tf.random.uniform((2, 2))\n", + "with tf.GradientTape() as tape:\n", + " y = tf.matmul(x, W) + b\n", + "grad_of_y_wrt_W_and_b = tape.gradient(y, [W, b])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Looking back at our first example" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "(train_images, train_labels), (test_images, test_labels) = mnist.load_data()\n", + "train_images = train_images.reshape((60000, 28 * 28))\n", + "train_images = train_images.astype(\"float32\") / 255\n", + "test_images = test_images.reshape((10000, 28 * 28))\n", + "test_images = test_images.astype(\"float32\") / 255" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential([\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\")\n", + "])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.fit(train_images, train_labels, epochs=5, batch_size=128)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Reimplementing our first example from scratch in TensorFlow" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### A simple Dense class" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" 
+ }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "\n", + "class NaiveDense:\n", + " def __init__(self, input_size, output_size, activation):\n", + " self.activation = activation\n", + "\n", + " w_shape = (input_size, output_size)\n", + " w_initial_value = tf.random.uniform(w_shape, minval=0, maxval=1e-1)\n", + " self.W = tf.Variable(w_initial_value)\n", + "\n", + " b_shape = (output_size,)\n", + " b_initial_value = tf.zeros(b_shape)\n", + " self.b = tf.Variable(b_initial_value)\n", + "\n", + " def __call__(self, inputs):\n", + " return self.activation(tf.matmul(inputs, self.W) + self.b)\n", + "\n", + " @property\n", + " def weights(self):\n", + " return [self.W, self.b]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### A simple Sequential class" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "class NaiveSequential:\n", + " def __init__(self, layers):\n", + " self.layers = layers\n", + "\n", + " def __call__(self, inputs):\n", + " x = inputs\n", + " for layer in self.layers:\n", + " x = layer(x)\n", + " return x\n", + "\n", + " @property\n", + " def weights(self):\n", + " weights = []\n", + " for layer in self.layers:\n", + " weights += layer.weights\n", + " return weights" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = NaiveSequential([\n", + " NaiveDense(input_size=28 * 28, output_size=512, activation=tf.nn.relu),\n", + " NaiveDense(input_size=512, output_size=10, activation=tf.nn.softmax)\n", + "])\n", + "assert len(model.weights) == 4" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### A batch generator" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import math\n", + "\n", + "class BatchGenerator:\n", + " def __init__(self, images, labels, batch_size=128):\n", + " assert len(images) == len(labels)\n", + " self.index = 0\n", + " self.images = images\n", + " self.labels = labels\n", + " self.batch_size = batch_size\n", + " self.num_batches = math.ceil(len(images) / batch_size)\n", + "\n", + " def next(self):\n", + " images = self.images[self.index : self.index + self.batch_size]\n", + " labels = self.labels[self.index : self.index + self.batch_size]\n", + " self.index += self.batch_size\n", + " return images, labels" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Running one training step" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def one_training_step(model, images_batch, labels_batch):\n", + " with tf.GradientTape() as tape:\n", + " predictions = model(images_batch)\n", + " per_sample_losses = tf.keras.losses.sparse_categorical_crossentropy(\n", + " labels_batch, predictions)\n", + " average_loss = tf.reduce_mean(per_sample_losses)\n", + " gradients = tape.gradient(average_loss, model.weights)\n", + " update_weights(gradients, model.weights)\n", + " return average_loss" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "learning_rate = 1e-3\n", + "\n", + "def update_weights(gradients, weights):\n", + " for g, w in zip(gradients, weights):\n", + 
" w.assign_sub(g * learning_rate)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras import optimizers\n", + "\n", + "optimizer = optimizers.SGD(learning_rate=1e-3)\n", + "\n", + "def update_weights(gradients, weights):\n", + " optimizer.apply_gradients(zip(gradients, weights))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The full training loop" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def fit(model, images, labels, epochs, batch_size=128):\n", + " for epoch_counter in range(epochs):\n", + " print(f\"Epoch {epoch_counter}\")\n", + " batch_generator = BatchGenerator(images, labels)\n", + " for batch_counter in range(batch_generator.num_batches):\n", + " images_batch, labels_batch = batch_generator.next()\n", + " loss = one_training_step(model, images_batch, labels_batch)\n", + " if batch_counter % 100 == 0:\n", + " print(f\"loss at batch {batch_counter}: {loss:.2f}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras.datasets import mnist\n", + "(train_images, train_labels), (test_images, test_labels) = mnist.load_data()\n", + "\n", + "train_images = train_images.reshape((60000, 28 * 28))\n", + "train_images = train_images.astype(\"float32\") / 255\n", + "test_images = test_images.reshape((10000, 28 * 28))\n", + "test_images = test_images.astype(\"float32\") / 255\n", + "\n", + "fit(model, train_images, train_labels, epochs=10, batch_size=128)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Evaluating the model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "predictions = model(test_images)\n", + "predictions = predictions.numpy()\n", + "predicted_labels = np.argmax(predictions, axis=1)\n", + "matches = predicted_labels == test_labels\n", + "print(f\"accuracy: {matches.mean():.2f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Summary" + ] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [], + "name": "chapter02_mathematical-building-blocks.i", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/second_edition/chapter03_introduction-to-keras-and-tf.ipynb b/second_edition/chapter03_introduction-to-keras-and-tf.ipynb new file mode 100644 index 0000000000..04c0d056eb --- /dev/null +++ b/second_edition/chapter03_introduction-to-keras-and-tf.ipynb @@ -0,0 +1,990 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Second 
Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "# Introduction to Keras and TensorFlow" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## What's TensorFlow?" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## What's Keras?" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Keras and TensorFlow: A brief history" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Setting up a deep-learning workspace" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Jupyter notebooks: The preferred way to run deep-learning experiments" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Using Colaboratory" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### First steps with Colaboratory" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Installing packages with pip" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Using the GPU runtime" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## First steps with TensorFlow" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Constant tensors and variables" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**All-ones or all-zeros tensors**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "x = tf.ones(shape=(2, 1))\n", + "print(x)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = tf.zeros(shape=(2, 1))\n", + "print(x)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Random tensors**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = tf.random.normal(shape=(3, 1), mean=0., stddev=1.)\n", + "print(x)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = tf.random.uniform(shape=(3, 1), minval=0., maxval=1.)\n", + "print(x)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**NumPy arrays are assignable**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "x = np.ones(shape=(2, 2))\n", + "x[0, 0] = 0." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Creating a TensorFlow variable**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "v = tf.Variable(initial_value=tf.random.normal(shape=(3, 1)))\n", + "print(v)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Assigning a value to a TensorFlow variable**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "v.assign(tf.ones((3, 1)))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Assigning a value to a subset of a TensorFlow variable**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "v[0, 0].assign(3.)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Using `assign_add`**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "v.assign_add(tf.ones((3, 1)))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Tensor operations: Doing math in TensorFlow" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**A few basic math operations**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "a = tf.ones((2, 2))\n", + "b = tf.square(a)\n", + "c = tf.sqrt(a)\n", + "d = b + c\n", + "e = tf.matmul(a, b)\n", + "e *= d" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### A second look at the GradientTape API" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Using the `GradientTape`**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "input_var = tf.Variable(initial_value=3.)\n", + "with tf.GradientTape() as tape:\n", + " result = tf.square(input_var)\n", + "gradient = tape.gradient(result, input_var)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Using `GradientTape` with constant tensor inputs**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "input_const = tf.constant(3.)\n", + "with tf.GradientTape() as tape:\n", + " tape.watch(input_const)\n", + " result = tf.square(input_const)\n", + "gradient = tape.gradient(result, input_const)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Using nested gradient tapes to compute second-order gradients**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "time = tf.Variable(0.)\n", + "with tf.GradientTape() as outer_tape:\n", + " with tf.GradientTape() as inner_tape:\n", + " position = 4.9 * time ** 2\n", + " speed = inner_tape.gradient(position, time)\n", + "acceleration = outer_tape.gradient(speed, time)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + 
"source": [ + "#### An end-to-end example: A linear classifier in pure TensorFlow" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Generating two classes of random points in a 2D plane**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "num_samples_per_class = 1000\n", + "negative_samples = np.random.multivariate_normal(\n", + " mean=[0, 3],\n", + " cov=[[1, 0.5],[0.5, 1]],\n", + " size=num_samples_per_class)\n", + "positive_samples = np.random.multivariate_normal(\n", + " mean=[3, 0],\n", + " cov=[[1, 0.5],[0.5, 1]],\n", + " size=num_samples_per_class)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Stacking the two classes into an array with shape (2000, 2)**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = np.vstack((negative_samples, positive_samples)).astype(np.float32)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Generating the corresponding targets (0 and 1)**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "targets = np.vstack((np.zeros((num_samples_per_class, 1), dtype=\"float32\"),\n", + " np.ones((num_samples_per_class, 1), dtype=\"float32\")))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Plotting the two point classes**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "plt.scatter(inputs[:, 0], inputs[:, 1], c=targets[:, 0])\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Creating the linear classifier variables**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "input_dim = 2\n", + "output_dim = 1\n", + "W = tf.Variable(initial_value=tf.random.uniform(shape=(input_dim, output_dim)))\n", + "b = tf.Variable(initial_value=tf.zeros(shape=(output_dim,)))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**The forward pass function**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def model(inputs):\n", + " return tf.matmul(inputs, W) + b" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**The mean squared error loss function**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def square_loss(targets, predictions):\n", + " per_sample_losses = tf.square(targets - predictions)\n", + " return tf.reduce_mean(per_sample_losses)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**The training step function**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "learning_rate = 0.1\n", + "\n", + "def training_step(inputs, targets):\n", + " with tf.GradientTape() as tape:\n", + " predictions = 
model(inputs)\n", + " loss = square_loss(targets, predictions)\n", + " grad_loss_wrt_W, grad_loss_wrt_b = tape.gradient(loss, [W, b])\n", + " W.assign_sub(grad_loss_wrt_W * learning_rate)\n", + " b.assign_sub(grad_loss_wrt_b * learning_rate)\n", + " return loss" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**The batch training loop**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "for step in range(40):\n", + " loss = training_step(inputs, targets)\n", + " print(f\"Loss at step {step}: {loss:.4f}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "predictions = model(inputs)\n", + "plt.scatter(inputs[:, 0], inputs[:, 1], c=predictions[:, 0] > 0.5)\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x = np.linspace(-1, 4, 100)\n", + "y = - W[0] / W[1] * x + (0.5 - b) / W[1]\n", + "plt.plot(x, y, \"-r\")\n", + "plt.scatter(inputs[:, 0], inputs[:, 1], c=predictions[:, 0] > 0.5)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Anatomy of a neural network: Understanding core Keras APIs" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Layers: The building blocks of deep learning" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The base Layer class in Keras" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**A `Dense` layer implemented as a `Layer` subclass**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow import keras\n", + "\n", + "class SimpleDense(keras.layers.Layer):\n", + "\n", + " def __init__(self, units, activation=None):\n", + " super().__init__()\n", + " self.units = units\n", + " self.activation = activation\n", + "\n", + " def build(self, input_shape):\n", + " input_dim = input_shape[-1]\n", + " self.W = self.add_weight(shape=(input_dim, self.units),\n", + " initializer=\"random_normal\")\n", + " self.b = self.add_weight(shape=(self.units,),\n", + " initializer=\"zeros\")\n", + "\n", + " def call(self, inputs):\n", + " y = tf.matmul(inputs, self.W) + self.b\n", + " if self.activation is not None:\n", + " y = self.activation(y)\n", + " return y" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "my_dense = SimpleDense(units=32, activation=tf.nn.relu)\n", + "input_tensor = tf.ones(shape=(2, 784))\n", + "output_tensor = my_dense(input_tensor)\n", + "print(output_tensor.shape)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Automatic shape inference: Building layers on the fly" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras import layers\n", + "layer = layers.Dense(32, activation=\"relu\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras import models\n", + "from 
tensorflow.keras import layers\n", + "model = models.Sequential([\n", + " layers.Dense(32, activation=\"relu\"),\n", + " layers.Dense(32)\n", + "])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential([\n", + " SimpleDense(32, activation=\"relu\"),\n", + " SimpleDense(64, activation=\"relu\"),\n", + " SimpleDense(32, activation=\"relu\"),\n", + " SimpleDense(10, activation=\"softmax\")\n", + "])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### From layers to models" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The \"compile\" step: Configuring the learning process" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential([keras.layers.Dense(1)])\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"mean_squared_error\",\n", + " metrics=[\"accuracy\"])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(optimizer=keras.optimizers.RMSprop(),\n", + " loss=keras.losses.MeanSquaredError(),\n", + " metrics=[keras.metrics.BinaryAccuracy()])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Picking a loss function" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Understanding the fit() method" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Calling `fit()` with NumPy data**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "history = model.fit(\n", + " inputs,\n", + " targets,\n", + " epochs=5,\n", + " batch_size=128\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "history.history" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Monitoring loss and metrics on validation data" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Using the `validation_data` argument**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential([keras.layers.Dense(1)])\n", + "model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=0.1),\n", + " loss=keras.losses.MeanSquaredError(),\n", + " metrics=[keras.metrics.BinaryAccuracy()])\n", + "\n", + "indices_permutation = np.random.permutation(len(inputs))\n", + "shuffled_inputs = inputs[indices_permutation]\n", + "shuffled_targets = targets[indices_permutation]\n", + "\n", + "num_validation_samples = int(0.3 * len(inputs))\n", + "val_inputs = shuffled_inputs[:num_validation_samples]\n", + "val_targets = shuffled_targets[:num_validation_samples]\n", + "training_inputs = shuffled_inputs[num_validation_samples:]\n", + "training_targets = shuffled_targets[num_validation_samples:]\n", + "model.fit(\n", + " training_inputs,\n", + " training_targets,\n", + " epochs=5,\n", + " batch_size=16,\n", + " validation_data=(val_inputs, val_targets)\n", + ")" + ] + }, + { + "cell_type": 
"markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Inference: Using a model after training" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "predictions = model.predict(val_inputs, batch_size=128)\n", + "print(predictions[:10])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Summary" + ] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [], + "name": "chapter03_introduction-to-keras-and-tf.i", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/second_edition/chapter04_getting-started-with-neural-networks.ipynb b/second_edition/chapter04_getting-started-with-neural-networks.ipynb new file mode 100644 index 0000000000..ba77a17d45 --- /dev/null +++ b/second_edition/chapter04_getting-started-with-neural-networks.ipynb @@ -0,0 +1,1413 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "# Getting started with neural networks: Classification and regression" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Classifying movie reviews: A binary classification example" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The IMDB dataset" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Loading the IMDB dataset**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras.datasets import imdb\n", + "(train_data, train_labels), (test_data, test_labels) = imdb.load_data(\n", + " num_words=10000)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_data[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_labels[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "max([max(sequence) for sequence in train_data])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Decoding reviews back to text**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "word_index = imdb.get_word_index()\n", + "reverse_word_index = dict(\n", + " [(value, key) for (key, value) in word_index.items()])\n", + "decoded_review = \" \".join(\n", + " [reverse_word_index.get(i - 3, \"?\") for i in train_data[0]])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Preparing the data" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Encoding the integer sequences via multi-hot encoding**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "def vectorize_sequences(sequences, dimension=10000):\n", + " results = np.zeros((len(sequences), dimension))\n", + " for i, sequence in enumerate(sequences):\n", + " for j in sequence:\n", + " results[i, j] = 1.\n", + " return results\n", + "x_train = vectorize_sequences(train_data)\n", + "x_test = vectorize_sequences(test_data)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x_train[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "y_train = np.asarray(train_labels).astype(\"float32\")\n", + "y_test = np.asarray(test_labels).astype(\"float32\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Building your model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Model definition**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow import keras\n", + "from tensorflow.keras import layers\n", + "\n", + "model = 
keras.Sequential([\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dense(1, activation=\"sigmoid\")\n", + "])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Compiling the model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Validating your approach" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Setting aside a validation set**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x_val = x_train[:10000]\n", + "partial_x_train = x_train[10000:]\n", + "y_val = y_train[:10000]\n", + "partial_y_train = y_train[10000:]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Training your model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "history = model.fit(partial_x_train,\n", + " partial_y_train,\n", + " epochs=20,\n", + " batch_size=512,\n", + " validation_data=(x_val, y_val))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "history_dict = history.history\n", + "history_dict.keys()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Plotting the training and validation loss**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "history_dict = history.history\n", + "loss_values = history_dict[\"loss\"]\n", + "val_loss_values = history_dict[\"val_loss\"]\n", + "epochs = range(1, len(loss_values) + 1)\n", + "plt.plot(epochs, loss_values, \"bo\", label=\"Training loss\")\n", + "plt.plot(epochs, val_loss_values, \"b\", label=\"Validation loss\")\n", + "plt.title(\"Training and validation loss\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Loss\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Plotting the training and validation accuracy**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "plt.clf()\n", + "acc = history_dict[\"accuracy\"]\n", + "val_acc = history_dict[\"val_accuracy\"]\n", + "plt.plot(epochs, acc, \"bo\", label=\"Training acc\")\n", + "plt.plot(epochs, val_acc, \"b\", label=\"Validation acc\")\n", + "plt.title(\"Training and validation accuracy\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Accuracy\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Retraining a model from scratch**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential([\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dense(16, 
activation=\"relu\"),\n", + " layers.Dense(1, activation=\"sigmoid\")\n", + "])\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "model.fit(x_train, y_train, epochs=4, batch_size=512)\n", + "results = model.evaluate(x_test, y_test)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "results" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Using a trained model to generate predictions on new data" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.predict(x_test)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Further experiments" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Wrapping up" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Classifying newswires: A multiclass classification example" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The Reuters dataset" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Loading the Reuters dataset**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras.datasets import reuters\n", + "(train_data, train_labels), (test_data, test_labels) = reuters.load_data(\n", + " num_words=10000)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "len(train_data)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "len(test_data)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_data[10]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Decoding newswires back to text**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "word_index = reuters.get_word_index()\n", + "reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])\n", + "decoded_newswire = \" \".join([reverse_word_index.get(i - 3, \"?\") for i in\n", + " train_data[0]])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_labels[10]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Preparing the data" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Encoding the input data**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x_train = vectorize_sequences(train_data)\n", + "x_test = vectorize_sequences(test_data)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Encoding the labels**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + 
"colab_type": "code" + }, + "outputs": [], + "source": [ + "def to_one_hot(labels, dimension=46):\n", + " results = np.zeros((len(labels), dimension))\n", + " for i, label in enumerate(labels):\n", + " results[i, label] = 1.\n", + " return results\n", + "y_train = to_one_hot(train_labels)\n", + "y_test = to_one_hot(test_labels)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras.utils import to_categorical\n", + "y_train = to_categorical(train_labels)\n", + "y_test = to_categorical(test_labels)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Building your model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Model definition**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential([\n", + " layers.Dense(64, activation=\"relu\"),\n", + " layers.Dense(64, activation=\"relu\"),\n", + " layers.Dense(46, activation=\"softmax\")\n", + "])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Compiling the model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Validating your approach" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Setting aside a validation set**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "x_val = x_train[:1000]\n", + "partial_x_train = x_train[1000:]\n", + "y_val = y_train[:1000]\n", + "partial_y_train = y_train[1000:]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Training the model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "history = model.fit(partial_x_train,\n", + " partial_y_train,\n", + " epochs=20,\n", + " batch_size=512,\n", + " validation_data=(x_val, y_val))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Plotting the training and validation loss**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "loss = history.history[\"loss\"]\n", + "val_loss = history.history[\"val_loss\"]\n", + "epochs = range(1, len(loss) + 1)\n", + "plt.plot(epochs, loss, \"bo\", label=\"Training loss\")\n", + "plt.plot(epochs, val_loss, \"b\", label=\"Validation loss\")\n", + "plt.title(\"Training and validation loss\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Loss\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Plotting the training and validation accuracy**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "plt.clf()\n", + "acc = history.history[\"accuracy\"]\n", + "val_acc = 
history.history[\"val_accuracy\"]\n", + "plt.plot(epochs, acc, \"bo\", label=\"Training accuracy\")\n", + "plt.plot(epochs, val_acc, \"b\", label=\"Validation accuracy\")\n", + "plt.title(\"Training and validation accuracy\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Accuracy\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Retraining a model from scratch**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential([\n", + " layers.Dense(64, activation=\"relu\"),\n", + " layers.Dense(64, activation=\"relu\"),\n", + " layers.Dense(46, activation=\"softmax\")\n", + "])\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "model.fit(x_train,\n", + " y_train,\n", + " epochs=9,\n", + " batch_size=512)\n", + "results = model.evaluate(x_test, y_test)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "results" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import copy\n", + "test_labels_copy = copy.copy(test_labels)\n", + "np.random.shuffle(test_labels_copy)\n", + "hits_array = np.array(test_labels) == np.array(test_labels_copy)\n", + "hits_array.mean()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Generating predictions on new data" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "predictions = model.predict(x_test)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "predictions[0].shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "np.sum(predictions[0])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "np.argmax(predictions[0])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### A different way to handle the labels and the loss" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "y_train = np.array(train_labels)\n", + "y_test = np.array(test_labels)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The importance of having sufficiently large intermediate layers" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**A model with an information bottleneck**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential([\n", + " layers.Dense(64, activation=\"relu\"),\n", + " layers.Dense(4, activation=\"relu\"),\n", + " layers.Dense(46, 
activation=\"softmax\")\n", + "])\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "model.fit(partial_x_train,\n", + " partial_y_train,\n", + " epochs=20,\n", + " batch_size=128,\n", + " validation_data=(x_val, y_val))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Further experiments" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Wrapping up" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Predicting house prices: A regression example" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The Boston Housing Price dataset" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Loading the Boston housing dataset**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras.datasets import boston_housing\n", + "(train_data, train_targets), (test_data, test_targets) = boston_housing.load_data()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_data.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_data.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_targets" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Preparing the data" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Normalizing the data**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "mean = train_data.mean(axis=0)\n", + "train_data -= mean\n", + "std = train_data.std(axis=0)\n", + "train_data /= std\n", + "test_data -= mean\n", + "test_data /= std" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Building your model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Model definition**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def build_model():\n", + " model = keras.Sequential([\n", + " layers.Dense(64, activation=\"relu\"),\n", + " layers.Dense(64, activation=\"relu\"),\n", + " layers.Dense(1)\n", + " ])\n", + " model.compile(optimizer=\"rmsprop\", loss=\"mse\", metrics=[\"mae\"])\n", + " return model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Validating your approach using K-fold validation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**K-fold validation**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "k = 4\n", + "num_val_samples = len(train_data) // k\n", + "num_epochs = 100\n", + "all_scores = []\n", + "for i in range(k):\n", + " print(f\"Processing fold #{i}\")\n", + " val_data = 
train_data[i * num_val_samples: (i + 1) * num_val_samples]\n", + " val_targets = train_targets[i * num_val_samples: (i + 1) * num_val_samples]\n", + " partial_train_data = np.concatenate(\n", + " [train_data[:i * num_val_samples],\n", + " train_data[(i + 1) * num_val_samples:]],\n", + " axis=0)\n", + " partial_train_targets = np.concatenate(\n", + " [train_targets[:i * num_val_samples],\n", + " train_targets[(i + 1) * num_val_samples:]],\n", + " axis=0)\n", + " model = build_model()\n", + " model.fit(partial_train_data, partial_train_targets,\n", + " epochs=num_epochs, batch_size=16, verbose=0)\n", + " val_mse, val_mae = model.evaluate(val_data, val_targets, verbose=0)\n", + " all_scores.append(val_mae)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "all_scores" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "np.mean(all_scores)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Saving the validation logs at each fold**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "num_epochs = 500\n", + "all_mae_histories = []\n", + "for i in range(k):\n", + " print(f\"Processing fold #{i}\")\n", + " val_data = train_data[i * num_val_samples: (i + 1) * num_val_samples]\n", + " val_targets = train_targets[i * num_val_samples: (i + 1) * num_val_samples]\n", + " partial_train_data = np.concatenate(\n", + " [train_data[:i * num_val_samples],\n", + " train_data[(i + 1) * num_val_samples:]],\n", + " axis=0)\n", + " partial_train_targets = np.concatenate(\n", + " [train_targets[:i * num_val_samples],\n", + " train_targets[(i + 1) * num_val_samples:]],\n", + " axis=0)\n", + " model = build_model()\n", + " history = model.fit(partial_train_data, partial_train_targets,\n", + " validation_data=(val_data, val_targets),\n", + " epochs=num_epochs, batch_size=16, verbose=0)\n", + " mae_history = history.history[\"val_mae\"]\n", + " all_mae_histories.append(mae_history)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Building the history of successive mean K-fold validation scores**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "average_mae_history = [\n", + " np.mean([x[i] for x in all_mae_histories]) for i in range(num_epochs)]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Plotting validation scores**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "plt.plot(range(1, len(average_mae_history) + 1), average_mae_history)\n", + "plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Validation MAE\")\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Plotting validation scores, excluding the first 10 data points**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "truncated_mae_history = average_mae_history[10:]\n", + "plt.plot(range(1, len(truncated_mae_history) + 1), truncated_mae_history)\n", + "plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Validation MAE\")\n", + 
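"\n", + "# Editor's illustrative extra (not in the book's notebook): smoothing the\n", + "# noisy MAE curve with an exponential moving average makes the epoch where\n", + "# overfitting begins easier to read off.\n", + "def smooth_curve(points, factor=0.9):\n", + "    smoothed_points = []\n", + "    for point in points:\n", + "        if smoothed_points:\n", + "            previous = smoothed_points[-1]\n", + "            smoothed_points.append(previous * factor + point * (1 - factor))\n", + "        else:\n", + "            smoothed_points.append(point)\n", + "    return smoothed_points\n", + "plt.plot(range(1, len(truncated_mae_history) + 1),\n", + "         smooth_curve(truncated_mae_history), \"b--\")\n", + 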
"plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Training the final model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = build_model()\n", + "model.fit(train_data, train_targets,\n", + " epochs=130, batch_size=16, verbose=0)\n", + "test_mse_score, test_mae_score = model.evaluate(test_data, test_targets)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_mae_score" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Generating predictions on new data" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "predictions = model.predict(test_data)\n", + "predictions[0]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Wrapping up" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Summary" + ] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [], + "name": "chapter04_getting-started-with-neural-networks.i", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/second_edition/chapter05_fundamentals-of-ml.ipynb b/second_edition/chapter05_fundamentals-of-ml.ipynb new file mode 100644 index 0000000000..dd61f4ead8 --- /dev/null +++ b/second_edition/chapter05_fundamentals-of-ml.ipynb @@ -0,0 +1,786 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "# Fundamentals of machine learning" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Generalization: The goal of machine learning" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Underfitting and overfitting" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Noisy training data" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Ambiguous features" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Rare features and spurious correlations" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Adding white-noise channels or all-zeros channels to MNIST**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras.datasets import mnist\n", + "import numpy as np\n", + "\n", + "(train_images, train_labels), _ = mnist.load_data()\n", + "train_images = train_images.reshape((60000, 28 * 28))\n", + "train_images = train_images.astype(\"float32\") / 255\n", + "\n", + "train_images_with_noise_channels = np.concatenate(\n", + " [train_images, np.random.random((len(train_images), 784))], axis=1)\n", + "\n", + "train_images_with_zeros_channels = np.concatenate(\n", + " [train_images, np.zeros((len(train_images), 784))], axis=1)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Training the same model on MNIST data with noise channels or all-zero channels**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow import keras\n", + "from tensorflow.keras import layers\n", + "\n", + "def get_model():\n", + " model = keras.Sequential([\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\")\n", + " ])\n", + " model.compile(optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + " return model\n", + "\n", + "model = get_model()\n", + "history_noise = model.fit(\n", + " train_images_with_noise_channels, train_labels,\n", + " epochs=10,\n", + " batch_size=128,\n", + " validation_split=0.2)\n", + "\n", + "model = get_model()\n", + "history_zeros = model.fit(\n", + " train_images_with_zeros_channels, train_labels,\n", + " epochs=10,\n", + " batch_size=128,\n", + " validation_split=0.2)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Plotting a validation accuracy comparison**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "val_acc_noise = history_noise.history[\"val_accuracy\"]\n", + "val_acc_zeros = history_zeros.history[\"val_accuracy\"]\n", + "epochs = range(1, 11)\n", + "plt.plot(epochs, val_acc_noise, \"b-\",\n", + " label=\"Validation accuracy with noise channels\")\n", + "plt.plot(epochs, val_acc_zeros, \"b--\",\n", + " label=\"Validation accuracy with zeros channels\")\n", + "plt.title(\"Effect of noise channels on validation accuracy\")\n", + 
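"# As the book notes, the noise-channel model should end up with noticeably\n", + "# lower validation accuracy: the random features invite spurious fits.\n", + 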
"plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Accuracy\")\n", + "plt.legend()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The nature of generalization in deep learning" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Fitting a MNIST model with randomly shuffled labels**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "(train_images, train_labels), _ = mnist.load_data()\n", + "train_images = train_images.reshape((60000, 28 * 28))\n", + "train_images = train_images.astype(\"float32\") / 255\n", + "\n", + "random_train_labels = train_labels[:]\n", + "np.random.shuffle(random_train_labels)\n", + "\n", + "model = keras.Sequential([\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\")\n", + "])\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "model.fit(train_images, random_train_labels,\n", + " epochs=100,\n", + " batch_size=128,\n", + " validation_split=0.2)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The manifold hypothesis" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Interpolation as a source of generalization" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Why deep learning works" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Training data is paramount" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Evaluating machine-learning models" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Training, validation, and test sets" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Simple hold-out validation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### K-fold validation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Iterated K-fold validation with shuffling" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Beating a common-sense baseline" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Things to keep in mind about model evaluation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Improving model fit" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Tuning key gradient descent parameters" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Training a MNIST model with an incorrectly high learning rate**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "(train_images, train_labels), _ = mnist.load_data()\n", + "train_images = train_images.reshape((60000, 28 * 28))\n", + "train_images = train_images.astype(\"float32\") / 255\n", + "\n", + "model = keras.Sequential([\n", + " layers.Dense(512, 
activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\")\n", + "])\n", + "model.compile(optimizer=keras.optimizers.RMSprop(1.),\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "model.fit(train_images, train_labels,\n", + " epochs=10,\n", + " batch_size=128,\n", + " validation_split=0.2)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**The same model with a more appropriate learning rate**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential([\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\")\n", + "])\n", + "model.compile(optimizer=keras.optimizers.RMSprop(1e-2),\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "model.fit(train_images, train_labels,\n", + " epochs=10,\n", + " batch_size=128,\n", + " validation_split=0.2)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Leveraging better architecture priors" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Increasing model capacity" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**A simple logistic regression on MNIST**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential([layers.Dense(10, activation=\"softmax\")])\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "history_small_model = model.fit(\n", + " train_images, train_labels,\n", + " epochs=20,\n", + " batch_size=128,\n", + " validation_split=0.2)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "val_loss = history_small_model.history[\"val_loss\"]\n", + "epochs = range(1, 21)\n", + "plt.plot(epochs, val_loss, \"b--\",\n", + " label=\"Validation loss\")\n", + "plt.title(\"Effect of insufficient model capacity on validation loss\")\n", + "plt.xlabel(\"Epochs\")\n", + "plt.ylabel(\"Loss\")\n", + "plt.legend()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential([\n", + " layers.Dense(96, activation=\"relu\"),\n", + " layers.Dense(96, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\"),\n", + "])\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "history_large_model = model.fit(\n", + " train_images, train_labels,\n", + " epochs=20,\n", + " batch_size=128,\n", + " validation_split=0.2)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Improving generalization" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Dataset curation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Feature engineering" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Using early stopping" + ] + }, + { + 
"cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Regularizing your model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Reducing the network's size" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Original model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras.datasets import imdb\n", + "(train_data, train_labels), _ = imdb.load_data(num_words=10000)\n", + "\n", + "def vectorize_sequences(sequences, dimension=10000):\n", + " results = np.zeros((len(sequences), dimension))\n", + " for i, sequence in enumerate(sequences):\n", + " results[i, sequence] = 1.\n", + " return results\n", + "train_data = vectorize_sequences(train_data)\n", + "\n", + "model = keras.Sequential([\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dense(1, activation=\"sigmoid\")\n", + "])\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "history_original = model.fit(train_data, train_labels,\n", + " epochs=20, batch_size=512, validation_split=0.4)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Version of the model with lower capacity**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential([\n", + " layers.Dense(4, activation=\"relu\"),\n", + " layers.Dense(4, activation=\"relu\"),\n", + " layers.Dense(1, activation=\"sigmoid\")\n", + "])\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "history_smaller_model = model.fit(\n", + " train_data, train_labels,\n", + " epochs=20, batch_size=512, validation_split=0.4)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Version of the model with higher capacity**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential([\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(512, activation=\"relu\"),\n", + " layers.Dense(1, activation=\"sigmoid\")\n", + "])\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "history_larger_model = model.fit(\n", + " train_data, train_labels,\n", + " epochs=20, batch_size=512, validation_split=0.4)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Adding weight regularization" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Adding L2 weight regularization to the model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras import regularizers\n", + "model = keras.Sequential([\n", + " layers.Dense(16,\n", + " kernel_regularizer=regularizers.l2(0.002),\n", + " activation=\"relu\"),\n", + " layers.Dense(16,\n", + " kernel_regularizer=regularizers.l2(0.002),\n", + " activation=\"relu\"),\n", + " layers.Dense(1, activation=\"sigmoid\")\n", + "])\n", + 
"model.compile(optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "history_l2_reg = model.fit(\n", + " train_data, train_labels,\n", + " epochs=20, batch_size=512, validation_split=0.4)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Different weight regularizers available in Keras**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras import regularizers\n", + "regularizers.l1(0.001)\n", + "regularizers.l1_l2(l1=0.001, l2=0.001)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Adding dropout" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Adding dropout to the IMDB model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential([\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dropout(0.5),\n", + " layers.Dense(16, activation=\"relu\"),\n", + " layers.Dropout(0.5),\n", + " layers.Dense(1, activation=\"sigmoid\")\n", + "])\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "history_dropout = model.fit(\n", + " train_data, train_labels,\n", + " epochs=20, batch_size=512, validation_split=0.4)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Summary" + ] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [], + "name": "chapter05_fundamentals-of-ml.i", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/second_edition/chapter07_working-with-keras.ipynb b/second_edition/chapter07_working-with-keras.ipynb new file mode 100644 index 0000000000..632d7c7e99 --- /dev/null +++ b/second_edition/chapter07_working-with-keras.ipynb @@ -0,0 +1,1439 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "# Working with Keras: A deep dive" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## A spectrum of workflows" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Different ways to build Keras models" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The Sequential model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**The `Sequential` class**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow import keras\n", + "from tensorflow.keras import layers\n", + "\n", + "model = keras.Sequential([\n", + " layers.Dense(64, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\")\n", + "])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Incrementally building a Sequential model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential()\n", + "model.add(layers.Dense(64, activation=\"relu\"))\n", + "model.add(layers.Dense(10, activation=\"softmax\"))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Calling a model for the first time to build it**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.build(input_shape=(None, 3))\n", + "model.weights" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**The summary method**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.summary()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Naming models and layers with the `name` argument**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential(name=\"my_example_model\")\n", + "model.add(layers.Dense(64, activation=\"relu\", name=\"my_first_layer\"))\n", + "model.add(layers.Dense(10, activation=\"softmax\", name=\"my_last_layer\"))\n", + "model.build((None, 3))\n", + "model.summary()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Specifying the input shape of your model in advance**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.Sequential()\n", + "model.add(keras.Input(shape=(3,)))\n", + "model.add(layers.Dense(64, activation=\"relu\"))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.summary()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.add(layers.Dense(10, activation=\"softmax\"))\n", + "model.summary()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The Functional 
API" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### A simple example" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**A simple Functional model with two `Dense` layers**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(3,), name=\"my_input\")\n", + "features = layers.Dense(64, activation=\"relu\")(inputs)\n", + "outputs = layers.Dense(10, activation=\"softmax\")(features)\n", + "model = keras.Model(inputs=inputs, outputs=outputs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(3,), name=\"my_input\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs.dtype" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "features = layers.Dense(64, activation=\"relu\")(inputs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "features.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "outputs = layers.Dense(10, activation=\"softmax\")(features)\n", + "model = keras.Model(inputs=inputs, outputs=outputs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.summary()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Multi-input, multi-output models" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**A multi-input, multi-output Functional model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "vocabulary_size = 10000\n", + "num_tags = 100\n", + "num_departments = 4\n", + "\n", + "title = keras.Input(shape=(vocabulary_size,), name=\"title\")\n", + "text_body = keras.Input(shape=(vocabulary_size,), name=\"text_body\")\n", + "tags = keras.Input(shape=(num_tags,), name=\"tags\")\n", + "\n", + "features = layers.Concatenate()([title, text_body, tags])\n", + "features = layers.Dense(64, activation=\"relu\")(features)\n", + "\n", + "priority = layers.Dense(1, activation=\"sigmoid\", name=\"priority\")(features)\n", + "department = layers.Dense(\n", + " num_departments, activation=\"softmax\", name=\"department\")(features)\n", + "\n", + "model = keras.Model(inputs=[title, text_body, tags], outputs=[priority, department])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Training a multi-input, multi-output model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Training a model by providing lists of input & target arrays**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import 
numpy as np\n", + "\n", + "num_samples = 1280\n", + "\n", + "title_data = np.random.randint(0, 2, size=(num_samples, vocabulary_size))\n", + "text_body_data = np.random.randint(0, 2, size=(num_samples, vocabulary_size))\n", + "tags_data = np.random.randint(0, 2, size=(num_samples, num_tags))\n", + "\n", + "priority_data = np.random.random(size=(num_samples, 1))\n", + "department_data = np.random.randint(0, 2, size=(num_samples, num_departments))\n", + "\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=[\"mean_squared_error\", \"categorical_crossentropy\"],\n", + " metrics=[[\"mean_absolute_error\"], [\"accuracy\"]])\n", + "model.fit([title_data, text_body_data, tags_data],\n", + " [priority_data, department_data],\n", + " epochs=1)\n", + "model.evaluate([title_data, text_body_data, tags_data],\n", + " [priority_data, department_data])\n", + "priority_preds, department_preds = model.predict([title_data, text_body_data, tags_data])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Training a model by providing dicts of input & target arrays**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(optimizer=\"rmsprop\",\n", + " loss={\"priority\": \"mean_squared_error\", \"department\": \"categorical_crossentropy\"},\n", + " metrics={\"priority\": [\"mean_absolute_error\"], \"department\": [\"accuracy\"]})\n", + "model.fit({\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data},\n", + " {\"priority\": priority_data, \"department\": department_data},\n", + " epochs=1)\n", + "model.evaluate({\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data},\n", + " {\"priority\": priority_data, \"department\": department_data})\n", + "priority_preds, department_preds = model.predict(\n", + " {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data})" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The power of the Functional API: Access to layer connectivity" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "keras.utils.plot_model(model, \"ticket_classifier.png\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "keras.utils.plot_model(model, \"ticket_classifier_with_shape_info.png\", show_shapes=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Retrieving the inputs or outputs of a layer in a Functional model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.layers" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.layers[3].input" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.layers[3].output" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Creating a new model by reusing intermediate layer outputs**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "features = 
model.layers[4].output\n", + "difficulty = layers.Dense(3, activation=\"softmax\", name=\"difficulty\")(features)\n", + "\n", + "new_model = keras.Model(\n", + " inputs=[title, text_body, tags],\n", + " outputs=[priority, department, difficulty])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "keras.utils.plot_model(new_model, \"updated_ticket_classifier.png\", show_shapes=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Subclassing the Model class" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Rewriting our previous example as a subclassed model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**A simple subclassed model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "class CustomerTicketModel(keras.Model):\n", + "\n", + " def __init__(self, num_departments):\n", + " super().__init__()\n", + " self.concat_layer = layers.Concatenate()\n", + " self.mixing_layer = layers.Dense(64, activation=\"relu\")\n", + " self.priority_scorer = layers.Dense(1, activation=\"sigmoid\")\n", + " self.department_classifier = layers.Dense(\n", + " num_departments, activation=\"softmax\")\n", + "\n", + " def call(self, inputs):\n", + " title = inputs[\"title\"]\n", + " text_body = inputs[\"text_body\"]\n", + " tags = inputs[\"tags\"]\n", + "\n", + " features = self.concat_layer([title, text_body, tags])\n", + " features = self.mixing_layer(features)\n", + " priority = self.priority_scorer(features)\n", + " department = self.department_classifier(features)\n", + " return priority, department" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = CustomerTicketModel(num_departments=4)\n", + "\n", + "priority, department = model(\n", + " {\"title\": title_data, \"text_body\": text_body_data, \"tags\": tags_data})" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(optimizer=\"rmsprop\",\n", + " loss=[\"mean_squared_error\", \"categorical_crossentropy\"],\n", + " metrics=[[\"mean_absolute_error\"], [\"accuracy\"]])\n", + "model.fit({\"title\": title_data,\n", + " \"text_body\": text_body_data,\n", + " \"tags\": tags_data},\n", + " [priority_data, department_data],\n", + " epochs=1)\n", + "model.evaluate({\"title\": title_data,\n", + " \"text_body\": text_body_data,\n", + " \"tags\": tags_data},\n", + " [priority_data, department_data])\n", + "priority_preds, department_preds = model.predict({\"title\": title_data,\n", + " \"text_body\": text_body_data,\n", + " \"tags\": tags_data})" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Beware: What subclassed models don't support" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Mixing and matching different components" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Creating a Functional model that includes a subclassed model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ 
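+ "# The subclassed Classifier below picks one sigmoid unit for binary\n", + "# classification and num_classes softmax units otherwise; it is then\n", + "# called like any ordinary layer inside the Functional graph.\n",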
+ "class Classifier(keras.Model):\n", + "\n", + " def __init__(self, num_classes=2):\n", + " super().__init__()\n", + " if num_classes == 2:\n", + " num_units = 1\n", + " activation = \"sigmoid\"\n", + " else:\n", + " num_units = num_classes\n", + " activation = \"softmax\"\n", + " self.dense = layers.Dense(num_units, activation=activation)\n", + "\n", + " def call(self, inputs):\n", + " return self.dense(inputs)\n", + "\n", + "inputs = keras.Input(shape=(3,))\n", + "features = layers.Dense(64, activation=\"relu\")(inputs)\n", + "outputs = Classifier(num_classes=10)(features)\n", + "model = keras.Model(inputs=inputs, outputs=outputs)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Creating a subclassed model that includes a Functional model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(64,))\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(inputs)\n", + "binary_classifier = keras.Model(inputs=inputs, outputs=outputs)\n", + "\n", + "class MyModel(keras.Model):\n", + "\n", + " def __init__(self, num_classes=2):\n", + " super().__init__()\n", + " self.dense = layers.Dense(64, activation=\"relu\")\n", + " self.classifier = binary_classifier\n", + "\n", + " def call(self, inputs):\n", + " features = self.dense(inputs)\n", + " return self.classifier(features)\n", + "\n", + "model = MyModel()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Remember: Use the right tool for the job" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Using built-in training and evaluation loops" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**The standard workflow: `compile()`, `fit()`, `evaluate()`, `predict()`**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras.datasets import mnist\n", + "\n", + "def get_mnist_model():\n", + " inputs = keras.Input(shape=(28 * 28,))\n", + " features = layers.Dense(512, activation=\"relu\")(inputs)\n", + " features = layers.Dropout(0.5)(features)\n", + " outputs = layers.Dense(10, activation=\"softmax\")(features)\n", + " model = keras.Model(inputs, outputs)\n", + " return model\n", + "\n", + "(images, labels), (test_images, test_labels) = mnist.load_data()\n", + "images = images.reshape((60000, 28 * 28)).astype(\"float32\") / 255\n", + "test_images = test_images.reshape((10000, 28 * 28)).astype(\"float32\") / 255\n", + "train_images, val_images = images[10000:], images[:10000]\n", + "train_labels, val_labels = labels[10000:], labels[:10000]\n", + "\n", + "model = get_mnist_model()\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "model.fit(train_images, train_labels,\n", + " epochs=3,\n", + " validation_data=(val_images, val_labels))\n", + "test_metrics = model.evaluate(test_images, test_labels)\n", + "predictions = model.predict(test_images)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Writing your own metrics" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Implementing a custom metric by subclassing the `Metric` class**" + ] + 
}, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "\n", + "class RootMeanSquaredError(keras.metrics.Metric):\n", + "\n", + " def __init__(self, name=\"rmse\", **kwargs):\n", + " super().__init__(name=name, **kwargs)\n", + " self.mse_sum = self.add_weight(name=\"mse_sum\", initializer=\"zeros\")\n", + " self.total_samples = self.add_weight(\n", + " name=\"total_samples\", initializer=\"zeros\", dtype=\"int32\")\n", + "\n", + " def update_state(self, y_true, y_pred, sample_weight=None):\n", + " y_true = tf.one_hot(y_true, depth=tf.shape(y_pred)[1])\n", + " mse = tf.reduce_sum(tf.square(y_true - y_pred))\n", + " self.mse_sum.assign_add(mse)\n", + " num_samples = tf.shape(y_pred)[0]\n", + " self.total_samples.assign_add(num_samples)\n", + "\n", + " def result(self):\n", + " return tf.sqrt(self.mse_sum / tf.cast(self.total_samples, tf.float32))\n", + "\n", + " def reset_state(self):\n", + " self.mse_sum.assign(0.)\n", + " self.total_samples.assign(0)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = get_mnist_model()\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\", RootMeanSquaredError()])\n", + "model.fit(train_images, train_labels,\n", + " epochs=3,\n", + " validation_data=(val_images, val_labels))\n", + "test_metrics = model.evaluate(test_images, test_labels)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Using callbacks" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The EarlyStopping and ModelCheckpoint callbacks" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Using the `callbacks` argument in the `fit()` method**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "callbacks_list = [\n", + " keras.callbacks.EarlyStopping(\n", + " monitor=\"val_accuracy\",\n", + " patience=2,\n", + " ),\n", + " keras.callbacks.ModelCheckpoint(\n", + " filepath=\"checkpoint_path.keras\",\n", + " monitor=\"val_loss\",\n", + " save_best_only=True,\n", + " )\n", + "]\n", + "model = get_mnist_model()\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "model.fit(train_images, train_labels,\n", + " epochs=10,\n", + " callbacks=callbacks_list,\n", + " validation_data=(val_images, val_labels))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.models.load_model(\"checkpoint_path.keras\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Writing your own callbacks" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Creating a custom callback by subclassing the `Callback` class**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from matplotlib import pyplot as plt\n", + "\n", + "class LossHistory(keras.callbacks.Callback):\n", + " def on_train_begin(self, logs):\n", + " self.per_batch_losses = []\n", + "\n", 
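+ "    # Keras calls this hook after every training batch; logs[\"loss\"]\n", + "    # holds the loss value for the batch that just finished.\n",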
+ " def on_batch_end(self, batch, logs):\n", + " self.per_batch_losses.append(logs.get(\"loss\"))\n", + "\n", + " def on_epoch_end(self, epoch, logs):\n", + " plt.clf()\n", + " plt.plot(range(len(self.per_batch_losses)), self.per_batch_losses,\n", + " label=\"Training loss for each batch\")\n", + " plt.xlabel(f\"Batch (epoch {epoch})\")\n", + " plt.ylabel(\"Loss\")\n", + " plt.legend()\n", + " plt.savefig(f\"plot_at_epoch_{epoch}\")\n", + " self.per_batch_losses = []" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = get_mnist_model()\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "model.fit(train_images, train_labels,\n", + " epochs=10,\n", + " callbacks=[LossHistory()],\n", + " validation_data=(val_images, val_labels))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Monitoring and visualization with TensorBoard" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = get_mnist_model()\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "\n", + "tensorboard = keras.callbacks.TensorBoard(\n", + " log_dir=\"/full_path_to_your_log_dir\",\n", + ")\n", + "model.fit(train_images, train_labels,\n", + " epochs=10,\n", + " validation_data=(val_images, val_labels),\n", + " callbacks=[tensorboard])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "%load_ext tensorboard\n", + "%tensorboard --logdir /full_path_to_your_log_dir" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Writing your own training and evaluation loops" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Training versus inference" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Low-level usage of metrics" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "metric = keras.metrics.SparseCategoricalAccuracy()\n", + "targets = [0, 1, 2]\n", + "predictions = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]\n", + "metric.update_state(targets, predictions)\n", + "current_result = metric.result()\n", + "print(f\"result: {current_result:.2f}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "values = [0, 1, 2, 3, 4]\n", + "mean_tracker = keras.metrics.Mean()\n", + "for value in values:\n", + " mean_tracker.update_state(value)\n", + "print(f\"Mean of values: {mean_tracker.result():.2f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### A complete training and evaluation loop" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Writing a step-by-step training loop: the training step function**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = get_mnist_model()\n", + "\n", + "loss_fn = keras.losses.SparseCategoricalCrossentropy()\n", 
+ "optimizer = keras.optimizers.RMSprop()\n", + "metrics = [keras.metrics.SparseCategoricalAccuracy()]\n", + "loss_tracking_metric = keras.metrics.Mean()\n", + "\n", + "def train_step(inputs, targets):\n", + " with tf.GradientTape() as tape:\n", + " predictions = model(inputs, training=True)\n", + " loss = loss_fn(targets, predictions)\n", + " gradients = tape.gradient(loss, model.trainable_weights)\n", + " optimizer.apply_gradients(zip(gradients, model.trainable_weights))\n", + "\n", + " logs = {}\n", + " for metric in metrics:\n", + " metric.update_state(targets, predictions)\n", + " logs[metric.name] = metric.result()\n", + "\n", + " loss_tracking_metric.update_state(loss)\n", + " logs[\"loss\"] = loss_tracking_metric.result()\n", + " return logs" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Writing a step-by-step training loop: resetting the metrics**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def reset_metrics():\n", + " for metric in metrics:\n", + " metric.reset_state()\n", + " loss_tracking_metric.reset_state()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Writing a step-by-step training loop: the loop itself**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "training_dataset = tf.data.Dataset.from_tensor_slices((train_images, train_labels))\n", + "training_dataset = training_dataset.batch(32)\n", + "epochs = 3\n", + "for epoch in range(epochs):\n", + " reset_metrics()\n", + " for inputs_batch, targets_batch in training_dataset:\n", + " logs = train_step(inputs_batch, targets_batch)\n", + " print(f\"Results at the end of epoch {epoch}\")\n", + " for key, value in logs.items():\n", + " print(f\"...{key}: {value:.4f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Writing a step-by-step evaluation loop**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def test_step(inputs, targets):\n", + " predictions = model(inputs, training=False)\n", + " loss = loss_fn(targets, predictions)\n", + "\n", + " logs = {}\n", + " for metric in metrics:\n", + " metric.update_state(targets, predictions)\n", + " logs[\"val_\" + metric.name] = metric.result()\n", + "\n", + " loss_tracking_metric.update_state(loss)\n", + " logs[\"val_loss\"] = loss_tracking_metric.result()\n", + " return logs\n", + "\n", + "val_dataset = tf.data.Dataset.from_tensor_slices((val_images, val_labels))\n", + "val_dataset = val_dataset.batch(32)\n", + "reset_metrics()\n", + "for inputs_batch, targets_batch in val_dataset:\n", + " logs = test_step(inputs_batch, targets_batch)\n", + "print(\"Evaluation results:\")\n", + "for key, value in logs.items():\n", + " print(f\"...{key}: {value:.4f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Make it fast with tf.function" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Adding a `tf.function` decorator to our evaluation-step function**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "@tf.function\n", + "def test_step(inputs, targets):\n", + 
" predictions = model(inputs, training=False)\n", + " loss = loss_fn(targets, predictions)\n", + "\n", + " logs = {}\n", + " for metric in metrics:\n", + " metric.update_state(targets, predictions)\n", + " logs[\"val_\" + metric.name] = metric.result()\n", + "\n", + " loss_tracking_metric.update_state(loss)\n", + " logs[\"val_loss\"] = loss_tracking_metric.result()\n", + " return logs\n", + "\n", + "val_dataset = tf.data.Dataset.from_tensor_slices((val_images, val_labels))\n", + "val_dataset = val_dataset.batch(32)\n", + "reset_metrics()\n", + "for inputs_batch, targets_batch in val_dataset:\n", + " logs = test_step(inputs_batch, targets_batch)\n", + "print(\"Evaluation results:\")\n", + "for key, value in logs.items():\n", + " print(f\"...{key}: {value:.4f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Leveraging fit() with a custom training loop" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Implementing a custom training step to use with `fit()`**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "loss_fn = keras.losses.SparseCategoricalCrossentropy()\n", + "loss_tracker = keras.metrics.Mean(name=\"loss\")\n", + "\n", + "class CustomModel(keras.Model):\n", + " def train_step(self, data):\n", + " inputs, targets = data\n", + " with tf.GradientTape() as tape:\n", + " predictions = self(inputs, training=True)\n", + " loss = loss_fn(targets, predictions)\n", + " gradients = tape.gradient(loss, self.trainable_weights)\n", + " self.optimizer.apply_gradients(zip(gradients, self.trainable_weights))\n", + "\n", + " loss_tracker.update_state(loss)\n", + " return {\"loss\": loss_tracker.result()}\n", + "\n", + " @property\n", + " def metrics(self):\n", + " return [loss_tracker]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(28 * 28,))\n", + "features = layers.Dense(512, activation=\"relu\")(inputs)\n", + "features = layers.Dropout(0.5)(features)\n", + "outputs = layers.Dense(10, activation=\"softmax\")(features)\n", + "model = CustomModel(inputs, outputs)\n", + "\n", + "model.compile(optimizer=keras.optimizers.RMSprop())\n", + "model.fit(train_images, train_labels, epochs=3)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "class CustomModel(keras.Model):\n", + " def train_step(self, data):\n", + " inputs, targets = data\n", + " with tf.GradientTape() as tape:\n", + " predictions = self(inputs, training=True)\n", + " loss = self.compiled_loss(targets, predictions)\n", + " gradients = tape.gradient(loss, self.trainable_weights)\n", + " self.optimizer.apply_gradients(zip(gradients, self.trainable_weights))\n", + " self.compiled_metrics.update_state(targets, predictions)\n", + " return {m.name: m.result() for m in self.metrics}" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(28 * 28,))\n", + "features = layers.Dense(512, activation=\"relu\")(inputs)\n", + "features = layers.Dropout(0.5)(features)\n", + "outputs = layers.Dense(10, activation=\"softmax\")(features)\n", + "model = CustomModel(inputs, outputs)\n", + "\n", + 
"model.compile(optimizer=keras.optimizers.RMSprop(),\n", + " loss=keras.losses.SparseCategoricalCrossentropy(),\n", + " metrics=[keras.metrics.SparseCategoricalAccuracy()])\n", + "model.fit(train_images, train_labels, epochs=3)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Summary" + ] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [], + "name": "chapter07_working-with-keras.i", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/second_edition/chapter08_intro-to-dl-for-computer-vision.ipynb b/second_edition/chapter08_intro-to-dl-for-computer-vision.ipynb new file mode 100644 index 0000000000..2459d444c4 --- /dev/null +++ b/second_edition/chapter08_intro-to-dl-for-computer-vision.ipynb @@ -0,0 +1,1224 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "# Introduction to deep learning for computer vision" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Introduction to convnets" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Instantiating a small convnet**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow import keras\n", + "from tensorflow.keras import layers\n", + "inputs = keras.Input(shape=(28, 28, 1))\n", + "x = layers.Conv2D(filters=32, kernel_size=3, activation=\"relu\")(inputs)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=64, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=128, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.Flatten()(x)\n", + "outputs = layers.Dense(10, activation=\"softmax\")(x)\n", + "model = keras.Model(inputs=inputs, outputs=outputs)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Displaying the model's summary**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.summary()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Training the convnet on MNIST images**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras.datasets import mnist\n", + "\n", + "(train_images, train_labels), (test_images, test_labels) = mnist.load_data()\n", + "train_images = train_images.reshape((60000, 28, 28, 1))\n", + "train_images = train_images.astype(\"float32\") / 255\n", + "test_images = test_images.reshape((10000, 28, 28, 1))\n", + "test_images = test_images.astype(\"float32\") / 255\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "model.fit(train_images, train_labels, epochs=5, batch_size=64)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Evaluating the convnet**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_loss, test_acc = model.evaluate(test_images, test_labels)\n", + "print(f\"Test accuracy: {test_acc:.3f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The convolution operation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Understanding border effects and padding" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Understanding convolution strides" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The max-pooling operation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**An incorrectly structured convnet missing its max-pooling layers**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + 
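"# A counterexample: with no max-pooling, the feature maps are never\n", + "# downsampled, so the network cannot learn a spatial hierarchy of features.\n", +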
"inputs = keras.Input(shape=(28, 28, 1))\n", + "x = layers.Conv2D(filters=32, kernel_size=3, activation=\"relu\")(inputs)\n", + "x = layers.Conv2D(filters=64, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.Conv2D(filters=128, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.Flatten()(x)\n", + "outputs = layers.Dense(10, activation=\"softmax\")(x)\n", + "model_no_max_pool = keras.Model(inputs=inputs, outputs=outputs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model_no_max_pool.summary()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Training a convnet from scratch on a small dataset" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The relevance of deep learning for small-data problems" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Downloading the data" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from google.colab import files\n", + "files.upload()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!mkdir ~/.kaggle\n", + "!cp kaggle.json ~/.kaggle/\n", + "!chmod 600 ~/.kaggle/kaggle.json" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!kaggle competitions download -c dogs-vs-cats" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!unzip -qq dogs-vs-cats.zip" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!unzip -qq train.zip" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Copying images to training, validation, and test directories**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os, shutil, pathlib\n", + "\n", + "original_dir = pathlib.Path(\"train\")\n", + "new_base_dir = pathlib.Path(\"cats_vs_dogs_small\")\n", + "\n", + "def make_subset(subset_name, start_index, end_index):\n", + " for category in (\"cat\", \"dog\"):\n", + " dir = new_base_dir / subset_name / category\n", + " os.makedirs(dir)\n", + " fnames = [f\"{category}.{i}.jpg\" for i in range(start_index, end_index)]\n", + " for fname in fnames:\n", + " shutil.copyfile(src=original_dir / fname,\n", + " dst=dir / fname)\n", + "\n", + "make_subset(\"train\", start_index=0, end_index=1000)\n", + "make_subset(\"validation\", start_index=1000, end_index=1500)\n", + "make_subset(\"test\", start_index=1500, end_index=2500)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Building the model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Instantiating a small convnet for dogs vs. 
cats classification**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow import keras\n", + "from tensorflow.keras import layers\n", + "\n", + "inputs = keras.Input(shape=(180, 180, 3))\n", + "x = layers.Rescaling(1./255)(inputs)\n", + "x = layers.Conv2D(filters=32, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=64, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=128, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=256, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=256, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.Flatten()(x)\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs=inputs, outputs=outputs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.summary()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Configuring the model for training**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(loss=\"binary_crossentropy\",\n", + " optimizer=\"rmsprop\",\n", + " metrics=[\"accuracy\"])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Data preprocessing" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Using `image_dataset_from_directory` to read images**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras.utils import image_dataset_from_directory\n", + "\n", + "train_dataset = image_dataset_from_directory(\n", + " new_base_dir / \"train\",\n", + " image_size=(180, 180),\n", + " batch_size=32)\n", + "validation_dataset = image_dataset_from_directory(\n", + " new_base_dir / \"validation\",\n", + " image_size=(180, 180),\n", + " batch_size=32)\n", + "test_dataset = image_dataset_from_directory(\n", + " new_base_dir / \"test\",\n", + " image_size=(180, 180),\n", + " batch_size=32)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "import tensorflow as tf\n", + "random_numbers = np.random.normal(size=(1000, 16))\n", + "dataset = tf.data.Dataset.from_tensor_slices(random_numbers)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "for i, element in enumerate(dataset):\n", + " print(element.shape)\n", + " if i >= 2:\n", + " break" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "batched_dataset = dataset.batch(32)\n", + "for i, element in enumerate(batched_dataset):\n", + " print(element.shape)\n", + " if i >= 2:\n", + " break" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "reshaped_dataset = dataset.map(lambda x: 
tf.reshape(x, (4, 4)))\n", + "for i, element in enumerate(reshaped_dataset):\n", + " print(element.shape)\n", + " if i >= 2:\n", + " break" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Displaying the shapes of the data and labels yielded by the `Dataset`**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "for data_batch, labels_batch in train_dataset:\n", + " print(\"data batch shape:\", data_batch.shape)\n", + " print(\"labels batch shape:\", labels_batch.shape)\n", + " break" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Fitting the model using a `Dataset`**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\n", + " filepath=\"convnet_from_scratch.keras\",\n", + " save_best_only=True,\n", + " monitor=\"val_loss\")\n", + "]\n", + "history = model.fit(\n", + " train_dataset,\n", + " epochs=30,\n", + " validation_data=validation_dataset,\n", + " callbacks=callbacks)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Displaying curves of loss and accuracy during training**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "accuracy = history.history[\"accuracy\"]\n", + "val_accuracy = history.history[\"val_accuracy\"]\n", + "loss = history.history[\"loss\"]\n", + "val_loss = history.history[\"val_loss\"]\n", + "epochs = range(1, len(accuracy) + 1)\n", + "plt.plot(epochs, accuracy, \"bo\", label=\"Training accuracy\")\n", + "plt.plot(epochs, val_accuracy, \"b\", label=\"Validation accuracy\")\n", + "plt.title(\"Training and validation accuracy\")\n", + "plt.legend()\n", + "plt.figure()\n", + "plt.plot(epochs, loss, \"bo\", label=\"Training loss\")\n", + "plt.plot(epochs, val_loss, \"b\", label=\"Validation loss\")\n", + "plt.title(\"Training and validation loss\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Evaluating the model on the test set**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_model = keras.models.load_model(\"convnet_from_scratch.keras\")\n", + "test_loss, test_acc = test_model.evaluate(test_dataset)\n", + "print(f\"Test accuracy: {test_acc:.3f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Using data augmentation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Define a data augmentation stage to add to an image model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "data_augmentation = keras.Sequential(\n", + " [\n", + " layers.RandomFlip(\"horizontal\"),\n", + " layers.RandomRotation(0.1),\n", + " layers.RandomZoom(0.2),\n", + " ]\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Displaying some randomly augmented training images**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + 
"metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "plt.figure(figsize=(10, 10))\n", + "for images, _ in train_dataset.take(1):\n", + " for i in range(9):\n", + " augmented_images = data_augmentation(images)\n", + " ax = plt.subplot(3, 3, i + 1)\n", + " plt.imshow(augmented_images[0].numpy().astype(\"uint8\"))\n", + " plt.axis(\"off\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Defining a new convnet that includes image augmentation and dropout**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(180, 180, 3))\n", + "x = data_augmentation(inputs)\n", + "x = layers.Rescaling(1./255)(x)\n", + "x = layers.Conv2D(filters=32, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=64, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=128, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=256, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(pool_size=2)(x)\n", + "x = layers.Conv2D(filters=256, kernel_size=3, activation=\"relu\")(x)\n", + "x = layers.Flatten()(x)\n", + "x = layers.Dropout(0.5)(x)\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs=inputs, outputs=outputs)\n", + "\n", + "model.compile(loss=\"binary_crossentropy\",\n", + " optimizer=\"rmsprop\",\n", + " metrics=[\"accuracy\"])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Training the regularized convnet**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\n", + " filepath=\"convnet_from_scratch_with_augmentation.keras\",\n", + " save_best_only=True,\n", + " monitor=\"val_loss\")\n", + "]\n", + "history = model.fit(\n", + " train_dataset,\n", + " epochs=100,\n", + " validation_data=validation_dataset,\n", + " callbacks=callbacks)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Evaluating the model on the test set**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_model = keras.models.load_model(\n", + " \"convnet_from_scratch_with_augmentation.keras\")\n", + "test_loss, test_acc = test_model.evaluate(test_dataset)\n", + "print(f\"Test accuracy: {test_acc:.3f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Leveraging a pretrained model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Feature extraction with a pretrained model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Instantiating the VGG16 convolutional base**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "conv_base = keras.applications.vgg16.VGG16(\n", + " weights=\"imagenet\",\n", + " include_top=False,\n", + " input_shape=(180, 180, 3))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": 
{ + "colab_type": "code" + }, + "outputs": [], + "source": [ + "conv_base.summary()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Fast feature extraction without data augmentation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Extracting the VGG16 features and corresponding labels**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "def get_features_and_labels(dataset):\n", + " all_features = []\n", + " all_labels = []\n", + " for images, labels in dataset:\n", + " preprocessed_images = keras.applications.vgg16.preprocess_input(images)\n", + " features = conv_base.predict(preprocessed_images)\n", + " all_features.append(features)\n", + " all_labels.append(labels)\n", + " return np.concatenate(all_features), np.concatenate(all_labels)\n", + "\n", + "train_features, train_labels = get_features_and_labels(train_dataset)\n", + "val_features, val_labels = get_features_and_labels(validation_dataset)\n", + "test_features, test_labels = get_features_and_labels(test_dataset)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "train_features.shape" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Defining and training the densely connected classifier**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(5, 5, 512))\n", + "x = layers.Flatten()(inputs)\n", + "x = layers.Dense(256)(x)\n", + "x = layers.Dropout(0.5)(x)\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs, outputs)\n", + "model.compile(loss=\"binary_crossentropy\",\n", + " optimizer=\"rmsprop\",\n", + " metrics=[\"accuracy\"])\n", + "\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\n", + " filepath=\"feature_extraction.keras\",\n", + " save_best_only=True,\n", + " monitor=\"val_loss\")\n", + "]\n", + "history = model.fit(\n", + " train_features, train_labels,\n", + " epochs=20,\n", + " validation_data=(val_features, val_labels),\n", + " callbacks=callbacks)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Plotting the results**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "acc = history.history[\"accuracy\"]\n", + "val_acc = history.history[\"val_accuracy\"]\n", + "loss = history.history[\"loss\"]\n", + "val_loss = history.history[\"val_loss\"]\n", + "epochs = range(1, len(acc) + 1)\n", + "plt.plot(epochs, acc, \"bo\", label=\"Training accuracy\")\n", + "plt.plot(epochs, val_acc, \"b\", label=\"Validation accuracy\")\n", + "plt.title(\"Training and validation accuracy\")\n", + "plt.legend()\n", + "plt.figure()\n", + "plt.plot(epochs, loss, \"bo\", label=\"Training loss\")\n", + "plt.plot(epochs, val_loss, \"b\", label=\"Validation loss\")\n", + "plt.title(\"Training and validation loss\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Feature extraction together with data augmentation" + ] + }, + { + "cell_type": 
"markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Instantiating and freezing the VGG16 convolutional base**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "conv_base = keras.applications.vgg16.VGG16(\n", + " weights=\"imagenet\",\n", + " include_top=False)\n", + "conv_base.trainable = False" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Printing the list of trainable weights before and after freezing**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "conv_base.trainable = True\n", + "print(\"This is the number of trainable weights \"\n", + " \"before freezing the conv base:\", len(conv_base.trainable_weights))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "conv_base.trainable = False\n", + "print(\"This is the number of trainable weights \"\n", + " \"after freezing the conv base:\", len(conv_base.trainable_weights))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Adding a data augmentation stage and a classifier to the convolutional base**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "data_augmentation = keras.Sequential(\n", + " [\n", + " layers.RandomFlip(\"horizontal\"),\n", + " layers.RandomRotation(0.1),\n", + " layers.RandomZoom(0.2),\n", + " ]\n", + ")\n", + "\n", + "inputs = keras.Input(shape=(180, 180, 3))\n", + "x = data_augmentation(inputs)\n", + "x = keras.applications.vgg16.preprocess_input(x)\n", + "x = conv_base(x)\n", + "x = layers.Flatten()(x)\n", + "x = layers.Dense(256)(x)\n", + "x = layers.Dropout(0.5)(x)\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs, outputs)\n", + "model.compile(loss=\"binary_crossentropy\",\n", + " optimizer=\"rmsprop\",\n", + " metrics=[\"accuracy\"])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\n", + " filepath=\"feature_extraction_with_data_augmentation.keras\",\n", + " save_best_only=True,\n", + " monitor=\"val_loss\")\n", + "]\n", + "history = model.fit(\n", + " train_dataset,\n", + " epochs=50,\n", + " validation_data=validation_dataset,\n", + " callbacks=callbacks)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Evaluating the model on the test set**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_model = keras.models.load_model(\n", + " \"feature_extraction_with_data_augmentation.keras\")\n", + "test_loss, test_acc = test_model.evaluate(test_dataset)\n", + "print(f\"Test accuracy: {test_acc:.3f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Fine-tuning a pretrained model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "conv_base.summary()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Freezing all 
layers until the fourth from the last**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "conv_base.trainable = True\n", + "for layer in conv_base.layers[:-4]:\n", + " layer.trainable = False" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Fine-tuning the model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(loss=\"binary_crossentropy\",\n", + " optimizer=keras.optimizers.RMSprop(learning_rate=1e-5),\n", + " metrics=[\"accuracy\"])\n", + "\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\n", + " filepath=\"fine_tuning.keras\",\n", + " save_best_only=True,\n", + " monitor=\"val_loss\")\n", + "]\n", + "history = model.fit(\n", + " train_dataset,\n", + " epochs=30,\n", + " validation_data=validation_dataset,\n", + " callbacks=callbacks)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.models.load_model(\"fine_tuning.keras\")\n", + "test_loss, test_acc = model.evaluate(test_dataset)\n", + "print(f\"Test accuracy: {test_acc:.3f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Summary" + ] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [], + "name": "chapter08_intro-to-dl-for-computer-vision.i", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} diff --git a/second_edition/chapter09_part01_image-segmentation.ipynb b/second_edition/chapter09_part01_image-segmentation.ipynb new file mode 100644 index 0000000000..7d266d9853 --- /dev/null +++ b/second_edition/chapter09_part01_image-segmentation.ipynb @@ -0,0 +1,282 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "# Advanced deep learning for computer vision" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Three essential computer vision tasks" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## An image segmentation example" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz\n", + "!wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz\n", + "!tar -xf images.tar.gz\n", + "!tar -xf annotations.tar.gz" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "\n", + "input_dir = \"images/\"\n", + "target_dir = \"annotations/trimaps/\"\n", + "\n", + "input_img_paths = sorted(\n", + " [os.path.join(input_dir, fname)\n", + " for fname in os.listdir(input_dir)\n", + " if fname.endswith(\".jpg\")])\n", + "target_paths = sorted(\n", + " [os.path.join(target_dir, fname)\n", + " for fname in os.listdir(target_dir)\n", + " if fname.endswith(\".png\") and not fname.startswith(\".\")])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "from tensorflow.keras.utils import load_img, img_to_array\n", + "\n", + "plt.axis(\"off\")\n", + "plt.imshow(load_img(input_img_paths[9]))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def display_target(target_array):\n", + " normalized_array = (target_array.astype(\"uint8\") - 1) * 127\n", + " plt.axis(\"off\")\n", + " plt.imshow(normalized_array[:, :, 0])\n", + "\n", + "img = img_to_array(load_img(target_paths[9], color_mode=\"grayscale\"))\n", + "display_target(img)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "import random\n", + "\n", + "img_size = (200, 200)\n", + "num_imgs = len(input_img_paths)\n", + "\n", + "random.Random(1337).shuffle(input_img_paths)\n", + "random.Random(1337).shuffle(target_paths)\n", + "\n", + "def path_to_input_image(path):\n", + " return img_to_array(load_img(path, target_size=img_size))\n", + "\n", + "def path_to_target(path):\n", + " img = img_to_array(\n", + " load_img(path, target_size=img_size, color_mode=\"grayscale\"))\n", + " img = img.astype(\"uint8\") - 1\n", + " return img\n", + "\n", + "input_imgs = np.zeros((num_imgs,) + img_size + (3,), dtype=\"float32\")\n", + "targets = np.zeros((num_imgs,) + img_size + (1,), dtype=\"uint8\")\n", + "for i in range(num_imgs):\n", + " input_imgs[i] = path_to_input_image(input_img_paths[i])\n", + " targets[i] = path_to_target(target_paths[i])\n", + "\n", + "num_val_samples = 1000\n", + "train_input_imgs = input_imgs[:-num_val_samples]\n", + "train_targets = targets[:-num_val_samples]\n", + "val_input_imgs = input_imgs[-num_val_samples:]\n", + "val_targets = targets[-num_val_samples:]" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow import keras\n", + "from tensorflow.keras import layers\n", + 
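"# Encoder-decoder structure: strided Conv2D layers downsample the image,\n", + "# and strided Conv2DTranspose layers upsample back to the input resolution,\n", + "# producing a per-pixel softmax over the segmentation classes.\n", +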
"\n", + "def get_model(img_size, num_classes):\n", + " inputs = keras.Input(shape=img_size + (3,))\n", + " x = layers.Rescaling(1./255)(inputs)\n", + "\n", + " x = layers.Conv2D(64, 3, strides=2, activation=\"relu\", padding=\"same\")(x)\n", + " x = layers.Conv2D(64, 3, activation=\"relu\", padding=\"same\")(x)\n", + " x = layers.Conv2D(128, 3, strides=2, activation=\"relu\", padding=\"same\")(x)\n", + " x = layers.Conv2D(128, 3, activation=\"relu\", padding=\"same\")(x)\n", + " x = layers.Conv2D(256, 3, strides=2, padding=\"same\", activation=\"relu\")(x)\n", + " x = layers.Conv2D(256, 3, activation=\"relu\", padding=\"same\")(x)\n", + "\n", + " x = layers.Conv2DTranspose(256, 3, activation=\"relu\", padding=\"same\")(x)\n", + " x = layers.Conv2DTranspose(256, 3, activation=\"relu\", padding=\"same\", strides=2)(x)\n", + " x = layers.Conv2DTranspose(128, 3, activation=\"relu\", padding=\"same\")(x)\n", + " x = layers.Conv2DTranspose(128, 3, activation=\"relu\", padding=\"same\", strides=2)(x)\n", + " x = layers.Conv2DTranspose(64, 3, activation=\"relu\", padding=\"same\")(x)\n", + " x = layers.Conv2DTranspose(64, 3, activation=\"relu\", padding=\"same\", strides=2)(x)\n", + "\n", + " outputs = layers.Conv2D(num_classes, 3, activation=\"softmax\", padding=\"same\")(x)\n", + "\n", + " model = keras.Model(inputs, outputs)\n", + " return model\n", + "\n", + "model = get_model(img_size=img_size, num_classes=3)\n", + "model.summary()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(optimizer=\"rmsprop\", loss=\"sparse_categorical_crossentropy\")\n", + "\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\"oxford_segmentation.keras\",\n", + " save_best_only=True)\n", + "]\n", + "\n", + "history = model.fit(train_input_imgs, train_targets,\n", + " epochs=50,\n", + " callbacks=callbacks,\n", + " batch_size=64,\n", + " validation_data=(val_input_imgs, val_targets))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "epochs = range(1, len(history.history[\"loss\"]) + 1)\n", + "loss = history.history[\"loss\"]\n", + "val_loss = history.history[\"val_loss\"]\n", + "plt.figure()\n", + "plt.plot(epochs, loss, \"bo\", label=\"Training loss\")\n", + "plt.plot(epochs, val_loss, \"b\", label=\"Validation loss\")\n", + "plt.title(\"Training and validation loss\")\n", + "plt.legend()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras.utils import array_to_img\n", + "\n", + "model = keras.models.load_model(\"oxford_segmentation.keras\")\n", + "\n", + "i = 4\n", + "test_image = val_input_imgs[i]\n", + "plt.axis(\"off\")\n", + "plt.imshow(array_to_img(test_image))\n", + "\n", + "mask = model.predict(np.expand_dims(test_image, 0))[0]\n", + "\n", + "def display_mask(pred):\n", + " mask = np.argmax(pred, axis=-1)\n", + " mask *= 127\n", + " plt.axis(\"off\")\n", + " plt.imshow(mask)\n", + "\n", + "display_mask(mask)" + ] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [], + "name": "chapter09_part01_image-segmentation.i", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": 
".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/second_edition/chapter09_part02_modern-convnet-architecture-patterns.ipynb b/second_edition/chapter09_part02_modern-convnet-architecture-patterns.ipynb new file mode 100644 index 0000000000..941946b2e2 --- /dev/null +++ b/second_edition/chapter09_part02_modern-convnet-architecture-patterns.ipynb @@ -0,0 +1,321 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Modern convnet architecture patterns" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Modularity, hierarchy, and reuse" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Residual connections" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Residual block where the number of filters changes**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow import keras\n", + "from tensorflow.keras import layers\n", + "\n", + "inputs = keras.Input(shape=(32, 32, 3))\n", + "x = layers.Conv2D(32, 3, activation=\"relu\")(inputs)\n", + "residual = x\n", + "x = layers.Conv2D(64, 3, activation=\"relu\", padding=\"same\")(x)\n", + "residual = layers.Conv2D(64, 1)(residual)\n", + "x = layers.add([x, residual])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Case where target block includes a max pooling layer**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(32, 32, 3))\n", + "x = layers.Conv2D(32, 3, activation=\"relu\")(inputs)\n", + "residual = x\n", + "x = layers.Conv2D(64, 3, activation=\"relu\", padding=\"same\")(x)\n", + "x = layers.MaxPooling2D(2, padding=\"same\")(x)\n", + "residual = layers.Conv2D(64, 1, strides=2)(residual)\n", + "x = layers.add([x, residual])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(32, 32, 3))\n", + "x = layers.Rescaling(1./255)(inputs)\n", + "\n", + "def residual_block(x, filters, pooling=False):\n", + " residual = x\n", + " x = layers.Conv2D(filters, 3, activation=\"relu\", padding=\"same\")(x)\n", + " x = layers.Conv2D(filters, 3, activation=\"relu\", padding=\"same\")(x)\n", + " if pooling:\n", + " x = layers.MaxPooling2D(2, padding=\"same\")(x)\n", + " residual = layers.Conv2D(filters, 1, strides=2)(residual)\n", + " elif filters != 
residual.shape[-1]:\n", + " residual = layers.Conv2D(filters, 1)(residual)\n", + " x = layers.add([x, residual])\n", + " return x\n", + "\n", + "x = residual_block(x, filters=32, pooling=True)\n", + "x = residual_block(x, filters=64, pooling=True)\n", + "x = residual_block(x, filters=128, pooling=False)\n", + "\n", + "x = layers.GlobalAveragePooling2D()(x)\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs=inputs, outputs=outputs)\n", + "model.summary()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Batch normalization" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Depthwise separable convolutions" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Putting it together: A mini Xception-like model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from google.colab import files\n", + "files.upload()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!mkdir ~/.kaggle\n", + "!cp kaggle.json ~/.kaggle/\n", + "!chmod 600 ~/.kaggle/kaggle.json\n", + "!kaggle competitions download -c dogs-vs-cats\n", + "!unzip -qq train.zip" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os, shutil, pathlib\n", + "from tensorflow.keras.utils import image_dataset_from_directory\n", + "\n", + "original_dir = pathlib.Path(\"train\")\n", + "new_base_dir = pathlib.Path(\"cats_vs_dogs_small\")\n", + "\n", + "def make_subset(subset_name, start_index, end_index):\n", + " for category in (\"cat\", \"dog\"):\n", + " dir = new_base_dir / subset_name / category\n", + " os.makedirs(dir)\n", + " fnames = [f\"{category}.{i}.jpg\" for i in range(start_index, end_index)]\n", + " for fname in fnames:\n", + " shutil.copyfile(src=original_dir / fname,\n", + " dst=dir / fname)\n", + "\n", + "make_subset(\"train\", start_index=0, end_index=1000)\n", + "make_subset(\"validation\", start_index=1000, end_index=1500)\n", + "make_subset(\"test\", start_index=1500, end_index=2500)\n", + "\n", + "train_dataset = image_dataset_from_directory(\n", + " new_base_dir / \"train\",\n", + " image_size=(180, 180),\n", + " batch_size=32)\n", + "validation_dataset = image_dataset_from_directory(\n", + " new_base_dir / \"validation\",\n", + " image_size=(180, 180),\n", + " batch_size=32)\n", + "test_dataset = image_dataset_from_directory(\n", + " new_base_dir / \"test\",\n", + " image_size=(180, 180),\n", + " batch_size=32)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "data_augmentation = keras.Sequential(\n", + " [\n", + " layers.RandomFlip(\"horizontal\"),\n", + " layers.RandomRotation(0.1),\n", + " layers.RandomZoom(0.2),\n", + " ]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(180, 180, 3))\n", + "x = data_augmentation(inputs)\n", + "\n", + "x = layers.Rescaling(1./255)(x)\n", + "x = layers.Conv2D(filters=32, kernel_size=5, use_bias=False)(x)\n", + "\n", + "for size in [32, 64, 128, 256, 512]:\n", + " residual = x\n", + "\n", + " x = 
layers.BatchNormalization()(x)\n", + " x = layers.Activation(\"relu\")(x)\n", + " x = layers.SeparableConv2D(size, 3, padding=\"same\", use_bias=False)(x)\n", + "\n", + " x = layers.BatchNormalization()(x)\n", + " x = layers.Activation(\"relu\")(x)\n", + " x = layers.SeparableConv2D(size, 3, padding=\"same\", use_bias=False)(x)\n", + "\n", + " x = layers.MaxPooling2D(3, strides=2, padding=\"same\")(x)\n", + "\n", + " residual = layers.Conv2D(\n", + " size, 1, strides=2, padding=\"same\", use_bias=False)(residual)\n", + " x = layers.add([x, residual])\n", + "\n", + "x = layers.GlobalAveragePooling2D()(x)\n", + "x = layers.Dropout(0.5)(x)\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs=inputs, outputs=outputs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.compile(loss=\"binary_crossentropy\",\n", + " optimizer=\"rmsprop\",\n", + " metrics=[\"accuracy\"])\n", + "history = model.fit(\n", + " train_dataset,\n", + " epochs=100,\n", + " validation_data=validation_dataset)" + ] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [], + "name": "chapter09_part02_modern-convnet-architecture-patterns.i", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/second_edition/chapter09_part03_interpreting-what-convnets-learn.ipynb b/second_edition/chapter09_part03_interpreting-what-convnets-learn.ipynb new file mode 100644 index 0000000000..0767d47e71 --- /dev/null +++ b/second_edition/chapter09_part03_interpreting-what-convnets-learn.ipynb @@ -0,0 +1,785 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Interpreting what convnets learn" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Visualizing intermediate activations" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "# You can use this to load the file \"convnet_from_scratch_with_augmentation.keras\"\n", + "# you obtained in the last chapter.\n", + "from google.colab import files\n", + "files.upload()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow import keras\n", + "model = keras.models.load_model(\"convnet_from_scratch_with_augmentation.keras\")\n", + "model.summary()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Preprocessing a single image**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow import keras\n", + "import numpy as np\n", + "\n", + "img_path = keras.utils.get_file(\n", + " fname=\"cat.jpg\",\n", + " origin=\"https://img-datasets.s3.amazonaws.com/cat.jpg\")\n", + "\n", + "def get_img_array(img_path, target_size):\n", + " img = keras.utils.load_img(\n", + " img_path, target_size=target_size)\n", + " array = keras.utils.img_to_array(img)\n", + " array = np.expand_dims(array, axis=0)\n", + " return array\n", + "\n", + "img_tensor = get_img_array(img_path, target_size=(180, 180))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Displaying the test picture**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "plt.axis(\"off\")\n", + "plt.imshow(img_tensor[0].astype(\"uint8\"))\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Instantiating a model that returns layer activations**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras import layers\n", + "\n", + "layer_outputs = []\n", + "layer_names = []\n", + "for layer in model.layers:\n", + " if isinstance(layer, (layers.Conv2D, layers.MaxPooling2D)):\n", + " layer_outputs.append(layer.output)\n", + " layer_names.append(layer.name)\n", + "activation_model = keras.Model(inputs=model.input, outputs=layer_outputs)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Using the model to compute layer activations**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "activations = activation_model.predict(img_tensor)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "first_layer_activation = activations[0]\n", + "print(first_layer_activation.shape)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Visualizing the fifth channel**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": 
[], + "source": [ + "import matplotlib.pyplot as plt\n", + "plt.matshow(first_layer_activation[0, :, :, 5], cmap=\"viridis\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Visualizing every channel in every intermediate activation**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "images_per_row = 16\n", + "for layer_name, layer_activation in zip(layer_names, activations):\n", + " n_features = layer_activation.shape[-1]\n", + " size = layer_activation.shape[1]\n", + " n_cols = n_features // images_per_row\n", + " display_grid = np.zeros(((size + 1) * n_cols - 1,\n", + " images_per_row * (size + 1) - 1))\n", + " for col in range(n_cols):\n", + " for row in range(images_per_row):\n", + " channel_index = col * images_per_row + row\n", + " channel_image = layer_activation[0, :, :, channel_index].copy()\n", + " if channel_image.sum() != 0:\n", + " channel_image -= channel_image.mean()\n", + " channel_image /= channel_image.std()\n", + " channel_image *= 64\n", + " channel_image += 128\n", + " channel_image = np.clip(channel_image, 0, 255).astype(\"uint8\")\n", + " display_grid[\n", + " col * (size + 1): (col + 1) * size + col,\n", + " row * (size + 1) : (row + 1) * size + row] = channel_image\n", + " scale = 1. / size\n", + " plt.figure(figsize=(scale * display_grid.shape[1],\n", + " scale * display_grid.shape[0]))\n", + " plt.title(layer_name)\n", + " plt.grid(False)\n", + " plt.axis(\"off\")\n", + " plt.imshow(display_grid, aspect=\"auto\", cmap=\"viridis\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Visualizing convnet filters" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Instantiating the Xception convolutional base**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.applications.xception.Xception(\n", + " weights=\"imagenet\",\n", + " include_top=False)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Printing the names of all convolutional layers in Xception**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "for layer in model.layers:\n", + " if isinstance(layer, (keras.layers.Conv2D, keras.layers.SeparableConv2D)):\n", + " print(layer.name)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Creating a feature extractor model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "layer_name = \"block3_sepconv1\"\n", + "layer = model.get_layer(name=layer_name)\n", + "feature_extractor = keras.Model(inputs=model.input, outputs=layer.output)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Using the feature extractor**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "activation = feature_extractor(\n", + " keras.applications.xception.preprocess_input(img_tensor)\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + 
"import tensorflow as tf\n", + "\n", + "def compute_loss(image, filter_index):\n", + " activation = feature_extractor(image)\n", + " filter_activation = activation[:, 2:-2, 2:-2, filter_index]\n", + " return tf.reduce_mean(filter_activation)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Loss maximization via stochastic gradient ascent**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "@tf.function\n", + "def gradient_ascent_step(image, filter_index, learning_rate):\n", + " with tf.GradientTape() as tape:\n", + " tape.watch(image)\n", + " loss = compute_loss(image, filter_index)\n", + " grads = tape.gradient(loss, image)\n", + " grads = tf.math.l2_normalize(grads)\n", + " image += learning_rate * grads\n", + " return image" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Function to generate filter visualizations**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "img_width = 200\n", + "img_height = 200\n", + "\n", + "def generate_filter_pattern(filter_index):\n", + " iterations = 30\n", + " learning_rate = 10.\n", + " image = tf.random.uniform(\n", + " minval=0.4,\n", + " maxval=0.6,\n", + " shape=(1, img_width, img_height, 3))\n", + " for i in range(iterations):\n", + " image = gradient_ascent_step(image, filter_index, learning_rate)\n", + " return image[0].numpy()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Utility function to convert a tensor into a valid image**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def deprocess_image(image):\n", + " image -= image.mean()\n", + " image /= image.std()\n", + " image *= 64\n", + " image += 128\n", + " image = np.clip(image, 0, 255).astype(\"uint8\")\n", + " image = image[25:-25, 25:-25, :]\n", + " return image" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "plt.axis(\"off\")\n", + "plt.imshow(deprocess_image(generate_filter_pattern(filter_index=2)))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Generating a grid of all filter response patterns in a layer**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "all_images = []\n", + "for filter_index in range(64):\n", + " print(f\"Processing filter {filter_index}\")\n", + " image = deprocess_image(\n", + " generate_filter_pattern(filter_index)\n", + " )\n", + " all_images.append(image)\n", + "\n", + "margin = 5\n", + "n = 8\n", + "cropped_width = img_width - 25 * 2\n", + "cropped_height = img_height - 25 * 2\n", + "width = n * cropped_width + (n - 1) * margin\n", + "height = n * cropped_height + (n - 1) * margin\n", + "stitched_filters = np.zeros((width, height, 3))\n", + "\n", + "for i in range(n):\n", + " for j in range(n):\n", + " image = all_images[i * n + j]\n", + " stitched_filters[\n", + " (cropped_width + margin) * i : (cropped_width + margin) * i + cropped_width,\n", + " (cropped_height + margin) * j : (cropped_height + margin) * j\n", + " + cropped_height,\n", + " :,\n", + " ] = image\n", + "\n", + 
"keras.utils.save_img(\n", + " f\"filters_for_layer_{layer_name}.png\", stitched_filters)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Visualizing heatmaps of class activation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Loading the Xception network with pretrained weights**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.applications.xception.Xception(weights=\"imagenet\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Preprocessing an input image for Xception**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "img_path = keras.utils.get_file(\n", + " fname=\"elephant.jpg\",\n", + " origin=\"https://img-datasets.s3.amazonaws.com/elephant.jpg\")\n", + "\n", + "def get_img_array(img_path, target_size):\n", + " img = keras.utils.load_img(img_path, target_size=target_size)\n", + " array = keras.utils.img_to_array(img)\n", + " array = np.expand_dims(array, axis=0)\n", + " array = keras.applications.xception.preprocess_input(array)\n", + " return array\n", + "\n", + "img_array = get_img_array(img_path, target_size=(299, 299))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "preds = model.predict(img_array)\n", + "print(keras.applications.xception.decode_predictions(preds, top=3)[0])" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "np.argmax(preds[0])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Setting up a model that returns the last convolutional output**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "last_conv_layer_name = \"block14_sepconv2_act\"\n", + "classifier_layer_names = [\n", + " \"avg_pool\",\n", + " \"predictions\",\n", + "]\n", + "last_conv_layer = model.get_layer(last_conv_layer_name)\n", + "last_conv_layer_model = keras.Model(model.inputs, last_conv_layer.output)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Reapplying the classifier on top of the last convolutional output**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "classifier_input = keras.Input(shape=last_conv_layer.output.shape[1:])\n", + "x = classifier_input\n", + "for layer_name in classifier_layer_names:\n", + " x = model.get_layer(layer_name)(x)\n", + "classifier_model = keras.Model(classifier_input, x)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Retrieving the gradients of the top predicted class**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "\n", + "with tf.GradientTape() as tape:\n", + " last_conv_layer_output = last_conv_layer_model(img_array)\n", + " tape.watch(last_conv_layer_output)\n", + " preds = classifier_model(last_conv_layer_output)\n", + " top_pred_index = tf.argmax(preds[0])\n", + " 
top_class_channel = preds[:, top_pred_index]\n", + "\n", + "grads = tape.gradient(top_class_channel, last_conv_layer_output)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Gradient pooling and channel-importance weighting**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "pooled_grads = tf.reduce_mean(grads, axis=(0, 1, 2)).numpy()\n", + "last_conv_layer_output = last_conv_layer_output.numpy()[0]\n", + "for i in range(pooled_grads.shape[-1]):\n", + " last_conv_layer_output[:, :, i] *= pooled_grads[i]\n", + "heatmap = np.mean(last_conv_layer_output, axis=-1)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Heatmap post-processing**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "heatmap = np.maximum(heatmap, 0)\n", + "heatmap /= np.max(heatmap)\n", + "plt.matshow(heatmap)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Superimposing the heatmap on the original picture**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.cm as cm\n", + "\n", + "img = keras.utils.load_img(img_path)\n", + "img = keras.utils.img_to_array(img)\n", + "\n", + "heatmap = np.uint8(255 * heatmap)\n", + "\n", + "jet = cm.get_cmap(\"jet\")\n", + "jet_colors = jet(np.arange(256))[:, :3]\n", + "jet_heatmap = jet_colors[heatmap]\n", + "\n", + "jet_heatmap = keras.utils.array_to_img(jet_heatmap)\n", + "jet_heatmap = jet_heatmap.resize((img.shape[1], img.shape[0]))\n", + "jet_heatmap = keras.utils.img_to_array(jet_heatmap)\n", + "\n", + "superimposed_img = jet_heatmap * 0.4 + img\n", + "superimposed_img = keras.utils.array_to_img(superimposed_img)\n", + "\n", + "save_path = \"elephant_cam.jpg\"\n", + "superimposed_img.save(save_path)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Summary" + ] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [], + "name": "chapter09_part03_interpreting-what-convnets-learn.i", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/second_edition/chapter10_dl-for-timeseries.ipynb b/second_edition/chapter10_dl-for-timeseries.ipynb new file mode 100644 index 0000000000..ee1eb236cf --- /dev/null +++ b/second_edition/chapter10_dl-for-timeseries.ipynb @@ -0,0 +1,845 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). 
For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "# Deep learning for timeseries" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Different kinds of timeseries tasks" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## A temperature-forecasting example" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!wget https://s3.amazonaws.com/keras-datasets/jena_climate_2009_2016.csv.zip\n", + "!unzip jena_climate_2009_2016.csv.zip" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Inspecting the data of the Jena weather dataset**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os\n", + "fname = os.path.join(\"jena_climate_2009_2016.csv\")\n", + "\n", + "with open(fname) as f:\n", + " data = f.read()\n", + "\n", + "lines = data.split(\"\\n\")\n", + "header = lines[0].split(\",\")\n", + "lines = lines[1:]\n", + "print(header)\n", + "print(len(lines))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Parsing the data**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "temperature = np.zeros((len(lines),))\n", + "raw_data = np.zeros((len(lines), len(header) - 1))\n", + "for i, line in enumerate(lines):\n", + " values = [float(x) for x in line.split(\",\")[1:]]\n", + " temperature[i] = values[1]\n", + " raw_data[i, :] = values[:]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Plotting the temperature timeseries**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from matplotlib import pyplot as plt\n", + "plt.plot(range(len(temperature)), temperature)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Plotting the first 10 days of the temperature timeseries**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "plt.plot(range(1440), temperature[:1440])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Computing the number of samples we'll use for each data split**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "num_train_samples = int(0.5 * len(raw_data))\n", + "num_val_samples = int(0.25 * len(raw_data))\n", + "num_test_samples = len(raw_data) - num_train_samples - num_val_samples\n", + "print(\"num_train_samples:\", num_train_samples)\n", + "print(\"num_val_samples:\", num_val_samples)\n", + "print(\"num_test_samples:\", num_test_samples)" + ] + }, + { + "cell_type": "markdown", + 
"metadata": { + "colab_type": "text" + }, + "source": [ + "### Preparing the data" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Normalizing the data**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "mean = raw_data[:num_train_samples].mean(axis=0)\n", + "raw_data -= mean\n", + "std = raw_data[:num_train_samples].std(axis=0)\n", + "raw_data /= std" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "from tensorflow import keras\n", + "int_sequence = np.arange(10)\n", + "dummy_dataset = keras.utils.timeseries_dataset_from_array(\n", + " data=int_sequence[:-3],\n", + " targets=int_sequence[3:],\n", + " sequence_length=3,\n", + " batch_size=2,\n", + ")\n", + "\n", + "for inputs, targets in dummy_dataset:\n", + " for i in range(inputs.shape[0]):\n", + " print([int(x) for x in inputs[i]], int(targets[i]))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Instantiating datasets for training, validation, and testing**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "sampling_rate = 6\n", + "sequence_length = 120\n", + "delay = sampling_rate * (sequence_length + 24 - 1)\n", + "batch_size = 256\n", + "\n", + "train_dataset = keras.utils.timeseries_dataset_from_array(\n", + " raw_data[:-delay],\n", + " targets=temperature[delay:],\n", + " sampling_rate=sampling_rate,\n", + " sequence_length=sequence_length,\n", + " shuffle=True,\n", + " batch_size=batch_size,\n", + " start_index=0,\n", + " end_index=num_train_samples)\n", + "\n", + "val_dataset = keras.utils.timeseries_dataset_from_array(\n", + " raw_data[:-delay],\n", + " targets=temperature[delay:],\n", + " sampling_rate=sampling_rate,\n", + " sequence_length=sequence_length,\n", + " shuffle=True,\n", + " batch_size=batch_size,\n", + " start_index=num_train_samples,\n", + " end_index=num_train_samples + num_val_samples)\n", + "\n", + "test_dataset = keras.utils.timeseries_dataset_from_array(\n", + " raw_data[:-delay],\n", + " targets=temperature[delay:],\n", + " sampling_rate=sampling_rate,\n", + " sequence_length=sequence_length,\n", + " shuffle=True,\n", + " batch_size=batch_size,\n", + " start_index=num_train_samples + num_val_samples)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Inspecting the output of one of our datasets**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "for samples, targets in train_dataset:\n", + " print(\"samples shape:\", samples.shape)\n", + " print(\"targets shape:\", targets.shape)\n", + " break" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### A common-sense, non-machine-learning baseline" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Computing the common-sense baseline MAE**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def evaluate_naive_method(dataset):\n", + " total_abs_err = 0.\n", + " samples_seen = 0\n", + " for samples, targets in dataset:\n", + " preds = 
samples[:, -1, 1] * std[1] + mean[1]\n", + " total_abs_err += np.sum(np.abs(preds - targets))\n", + " samples_seen += samples.shape[0]\n", + " return total_abs_err / samples_seen\n", + "\n", + "print(f\"Validation MAE: {evaluate_naive_method(val_dataset):.2f}\")\n", + "print(f\"Test MAE: {evaluate_naive_method(test_dataset):.2f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Let's try a basic machine-learning model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Training and evaluating a densely connected model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow import keras\n", + "from tensorflow.keras import layers\n", + "\n", + "inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))\n", + "x = layers.Flatten()(inputs)\n", + "x = layers.Dense(16, activation=\"relu\")(x)\n", + "outputs = layers.Dense(1)(x)\n", + "model = keras.Model(inputs, outputs)\n", + "\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\"jena_dense.keras\",\n", + " save_best_only=True)\n", + "]\n", + "model.compile(optimizer=\"rmsprop\", loss=\"mse\", metrics=[\"mae\"])\n", + "history = model.fit(train_dataset,\n", + " epochs=10,\n", + " validation_data=val_dataset,\n", + " callbacks=callbacks)\n", + "\n", + "model = keras.models.load_model(\"jena_dense.keras\")\n", + "print(f\"Test MAE: {model.evaluate(test_dataset)[1]:.2f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Plotting results**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "loss = history.history[\"mae\"]\n", + "val_loss = history.history[\"val_mae\"]\n", + "epochs = range(1, len(loss) + 1)\n", + "plt.figure()\n", + "plt.plot(epochs, loss, \"bo\", label=\"Training MAE\")\n", + "plt.plot(epochs, val_loss, \"b\", label=\"Validation MAE\")\n", + "plt.title(\"Training and validation MAE\")\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Let's try a 1D convolutional model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))\n", + "x = layers.Conv1D(8, 24, activation=\"relu\")(inputs)\n", + "x = layers.MaxPooling1D(2)(x)\n", + "x = layers.Conv1D(8, 12, activation=\"relu\")(x)\n", + "x = layers.MaxPooling1D(2)(x)\n", + "x = layers.Conv1D(8, 6, activation=\"relu\")(x)\n", + "x = layers.GlobalAveragePooling1D()(x)\n", + "outputs = layers.Dense(1)(x)\n", + "model = keras.Model(inputs, outputs)\n", + "\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\"jena_conv.keras\",\n", + " save_best_only=True)\n", + "]\n", + "model.compile(optimizer=\"rmsprop\", loss=\"mse\", metrics=[\"mae\"])\n", + "history = model.fit(train_dataset,\n", + " epochs=10,\n", + " validation_data=val_dataset,\n", + " callbacks=callbacks)\n", + "\n", + "model = keras.models.load_model(\"jena_conv.keras\")\n", + "print(f\"Test MAE: {model.evaluate(test_dataset)[1]:.2f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### A first recurrent baseline" + ] + }, + { 
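"cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Shape check for the recurrent baseline (illustrative sketch)**\n\nA minimal sketch, assuming the `(batch, 120, 14)` input sequences built above: without `return_sequences`, `LSTM(16)` collapses each whole sequence into a single 16-dimensional state vector, which the `Dense(1)` head in the next cell maps to one temperature prediction." + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "from tensorflow.keras import layers\n", + "\n", + "# Illustrative sketch (assumes the dataset shapes built above):\n", + "# a dummy batch of one all-zeros sequence, 120 timesteps x 14 features.\n", + "dummy = tf.zeros((1, 120, 14))\n", + "print(layers.LSTM(16)(dummy).shape)" + ] + }, + {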
+ "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**A simple LSTM-based model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))\n", + "x = layers.LSTM(16)(inputs)\n", + "outputs = layers.Dense(1)(x)\n", + "model = keras.Model(inputs, outputs)\n", + "\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\"jena_lstm.keras\",\n", + " save_best_only=True)\n", + "]\n", + "model.compile(optimizer=\"rmsprop\", loss=\"mse\", metrics=[\"mae\"])\n", + "history = model.fit(train_dataset,\n", + " epochs=10,\n", + " validation_data=val_dataset,\n", + " callbacks=callbacks)\n", + "\n", + "model = keras.models.load_model(\"jena_lstm.keras\")\n", + "print(f\"Test MAE: {model.evaluate(test_dataset)[1]:.2f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Understanding recurrent neural networks" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**NumPy implementation of a simple RNN**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "timesteps = 100\n", + "input_features = 32\n", + "output_features = 64\n", + "inputs = np.random.random((timesteps, input_features))\n", + "state_t = np.zeros((output_features,))\n", + "W = np.random.random((output_features, input_features))\n", + "U = np.random.random((output_features, output_features))\n", + "b = np.random.random((output_features,))\n", + "successive_outputs = []\n", + "for input_t in inputs:\n", + " output_t = np.tanh(np.dot(W, input_t) + np.dot(U, state_t) + b)\n", + " successive_outputs.append(output_t)\n", + " state_t = output_t\n", + "final_output_sequence = np.stack(successive_outputs, axis=0)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### A recurrent layer in Keras" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**An RNN layer that can process sequences of any length**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "num_features = 14\n", + "inputs = keras.Input(shape=(None, num_features))\n", + "outputs = layers.SimpleRNN(16)(inputs)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**An RNN layer that returns only its last output step**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "num_features = 14\n", + "steps = 120\n", + "inputs = keras.Input(shape=(steps, num_features))\n", + "outputs = layers.SimpleRNN(16, return_sequences=False)(inputs)\n", + "print(outputs.shape)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**An RNN layer that returns its full output sequence**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "num_features = 14\n", + "steps = 120\n", + "inputs = keras.Input(shape=(steps, num_features))\n", + "outputs = layers.SimpleRNN(16, return_sequences=True)(inputs)\n", + "print(outputs.shape)" + ] + }, + { + "cell_type": "markdown", + 
"metadata": { + "colab_type": "text" + }, + "source": [ + "**Stacking RNN layers**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(steps, num_features))\n", + "x = layers.SimpleRNN(16, return_sequences=True)(inputs)\n", + "x = layers.SimpleRNN(16, return_sequences=True)(x)\n", + "outputs = layers.SimpleRNN(16)(x)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Advanced use of recurrent neural networks" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Using recurrent dropout to fight overfitting" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Training and evaluating a dropout-regularized LSTM**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))\n", + "x = layers.LSTM(32, recurrent_dropout=0.25)(inputs)\n", + "x = layers.Dropout(0.5)(x)\n", + "outputs = layers.Dense(1)(x)\n", + "model = keras.Model(inputs, outputs)\n", + "\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\"jena_lstm_dropout.keras\",\n", + " save_best_only=True)\n", + "]\n", + "model.compile(optimizer=\"rmsprop\", loss=\"mse\", metrics=[\"mae\"])\n", + "history = model.fit(train_dataset,\n", + " epochs=50,\n", + " validation_data=val_dataset,\n", + " callbacks=callbacks)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(sequence_length, num_features))\n", + "x = layers.LSTM(32, recurrent_dropout=0.2, unroll=True)(inputs)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Stacking recurrent layers" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Training and evaluating a dropout-regularized, stacked GRU model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))\n", + "x = layers.GRU(32, recurrent_dropout=0.5, return_sequences=True)(inputs)\n", + "x = layers.GRU(32, recurrent_dropout=0.5)(x)\n", + "x = layers.Dropout(0.5)(x)\n", + "outputs = layers.Dense(1)(x)\n", + "model = keras.Model(inputs, outputs)\n", + "\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\"jena_stacked_gru_dropout.keras\",\n", + " save_best_only=True)\n", + "]\n", + "model.compile(optimizer=\"rmsprop\", loss=\"mse\", metrics=[\"mae\"])\n", + "history = model.fit(train_dataset,\n", + " epochs=50,\n", + " validation_data=val_dataset,\n", + " callbacks=callbacks)\n", + "model = keras.models.load_model(\"jena_stacked_gru_dropout.keras\")\n", + "print(f\"Test MAE: {model.evaluate(test_dataset)[1]:.2f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Using bidirectional RNNs" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Training and evaluating a bidirectional LSTM**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = 
keras.Input(shape=(sequence_length, raw_data.shape[-1]))\n", + "x = layers.Bidirectional(layers.LSTM(16))(inputs)\n", + "outputs = layers.Dense(1)(x)\n", + "model = keras.Model(inputs, outputs)\n", + "\n", + "model.compile(optimizer=\"rmsprop\", loss=\"mse\", metrics=[\"mae\"])\n", + "history = model.fit(train_dataset,\n", + " epochs=10,\n", + " validation_data=val_dataset)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Going even further" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Summary" + ] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [], + "name": "chapter10_dl-for-timeseries.i", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/second_edition/chapter11_part01_introduction.ipynb b/second_edition/chapter11_part01_introduction.ipynb new file mode 100644 index 0000000000..3ef20b7618 --- /dev/null +++ b/second_edition/chapter11_part01_introduction.ipynb @@ -0,0 +1,754 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "# Deep learning for text" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Natural-language processing: The bird's eye view" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Preparing text data" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Text standardization" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Text splitting (tokenization)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Vocabulary indexing" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Using the TextVectorization layer" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import string\n", + "\n", + "class Vectorizer:\n", + " def standardize(self, text):\n", + " text = text.lower()\n", + " return \"\".join(char for char in text if char not in string.punctuation)\n", + "\n", + " def tokenize(self, text):\n", + " text = self.standardize(text)\n", + " return text.split()\n", + "\n", + " def make_vocabulary(self, dataset):\n", + " self.vocabulary = {\"\": 0, \"[UNK]\": 1}\n", + " for text in dataset:\n", + " text = self.standardize(text)\n", + " tokens = self.tokenize(text)\n", + " for token in tokens:\n", + " if token not in self.vocabulary:\n", + " self.vocabulary[token] = len(self.vocabulary)\n", + " self.inverse_vocabulary = dict(\n", + " (v, k) for k, v in self.vocabulary.items())\n", + "\n", + " def encode(self, text):\n", + " text = self.standardize(text)\n", + " tokens = self.tokenize(text)\n", + " return [self.vocabulary.get(token, 1) for token in tokens]\n", + "\n", + " def decode(self, int_sequence):\n", + " return \" \".join(\n", + " self.inverse_vocabulary.get(i, \"[UNK]\") for i in int_sequence)\n", + "\n", + "vectorizer = Vectorizer()\n", + "dataset = [\n", + " \"I write, erase, rewrite\",\n", + " \"Erase again, and then\",\n", + " \"A poppy blooms.\",\n", + "]\n", + "vectorizer.make_vocabulary(dataset)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "test_sentence = \"I write, rewrite, and still rewrite again\"\n", + "encoded_sentence = vectorizer.encode(test_sentence)\n", + "print(encoded_sentence)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "decoded_sentence = vectorizer.decode(encoded_sentence)\n", + "print(decoded_sentence)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras.layers import TextVectorization\n", + "text_vectorization = TextVectorization(\n", + " output_mode=\"int\",\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import re\n", + "import string\n", + "import tensorflow as tf\n", + "\n", + "def custom_standardization_fn(string_tensor):\n", + " lowercase_string = tf.strings.lower(string_tensor)\n", + " return tf.strings.regex_replace(\n", + " lowercase_string, 
f\"[{re.escape(string.punctuation)}]\", \"\")\n", + "\n", + "def custom_split_fn(string_tensor):\n", + " return tf.strings.split(string_tensor)\n", + "\n", + "text_vectorization = TextVectorization(\n", + " output_mode=\"int\",\n", + " standardize=custom_standardization_fn,\n", + " split=custom_split_fn,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "dataset = [\n", + " \"I write, erase, rewrite\",\n", + " \"Erase again, and then\",\n", + " \"A poppy blooms.\",\n", + "]\n", + "text_vectorization.adapt(dataset)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Displaying the vocabulary**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "text_vectorization.get_vocabulary()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "vocabulary = text_vectorization.get_vocabulary()\n", + "test_sentence = \"I write, rewrite, and still rewrite again\"\n", + "encoded_sentence = text_vectorization(test_sentence)\n", + "print(encoded_sentence)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inverse_vocab = dict(enumerate(vocabulary))\n", + "decoded_sentence = \" \".join(inverse_vocab[int(i)] for i in encoded_sentence)\n", + "print(decoded_sentence)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Two approaches for representing groups of words: Sets and sequences" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Preparing the IMDB movie reviews data" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!curl -O https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz\n", + "!tar -xf aclImdb_v1.tar.gz" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!rm -r aclImdb/train/unsup" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!cat aclImdb/train/pos/4077_10.txt" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os, pathlib, shutil, random\n", + "\n", + "base_dir = pathlib.Path(\"aclImdb\")\n", + "val_dir = base_dir / \"val\"\n", + "train_dir = base_dir / \"train\"\n", + "for category in (\"neg\", \"pos\"):\n", + " os.makedirs(val_dir / category)\n", + " files = os.listdir(train_dir / category)\n", + " random.Random(1337).shuffle(files)\n", + " num_val_samples = int(0.2 * len(files))\n", + " val_files = files[-num_val_samples:]\n", + " for fname in val_files:\n", + " shutil.move(train_dir / category / fname,\n", + " val_dir / category / fname)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow import keras\n", + "batch_size = 32\n", + "\n", + "train_ds = keras.utils.text_dataset_from_directory(\n", + " \"aclImdb/train\", batch_size=batch_size\n", + ")\n", + "val_ds = keras.utils.text_dataset_from_directory(\n", + " 
\"aclImdb/val\", batch_size=batch_size\n", + ")\n", + "test_ds = keras.utils.text_dataset_from_directory(\n", + " \"aclImdb/test\", batch_size=batch_size\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Displaying the shapes and dtypes of the first batch**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "for inputs, targets in train_ds:\n", + " print(\"inputs.shape:\", inputs.shape)\n", + " print(\"inputs.dtype:\", inputs.dtype)\n", + " print(\"targets.shape:\", targets.shape)\n", + " print(\"targets.dtype:\", targets.dtype)\n", + " print(\"inputs[0]:\", inputs[0])\n", + " print(\"targets[0]:\", targets[0])\n", + " break" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Processing words as a set: The bag-of-words approach" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Single words (unigrams) with binary encoding" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Preprocessing our datasets with a `TextVectorization` layer**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "text_vectorization = TextVectorization(\n", + " max_tokens=20000,\n", + " output_mode=\"multi_hot\",\n", + ")\n", + "text_only_train_ds = train_ds.map(lambda x, y: x)\n", + "text_vectorization.adapt(text_only_train_ds)\n", + "\n", + "binary_1gram_train_ds = train_ds.map(\n", + " lambda x, y: (text_vectorization(x), y),\n", + " num_parallel_calls=4)\n", + "binary_1gram_val_ds = val_ds.map(\n", + " lambda x, y: (text_vectorization(x), y),\n", + " num_parallel_calls=4)\n", + "binary_1gram_test_ds = test_ds.map(\n", + " lambda x, y: (text_vectorization(x), y),\n", + " num_parallel_calls=4)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Inspecting the output of our binary unigram dataset**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "for inputs, targets in binary_1gram_train_ds:\n", + " print(\"inputs.shape:\", inputs.shape)\n", + " print(\"inputs.dtype:\", inputs.dtype)\n", + " print(\"targets.shape:\", targets.shape)\n", + " print(\"targets.dtype:\", targets.dtype)\n", + " print(\"inputs[0]:\", inputs[0])\n", + " print(\"targets[0]:\", targets[0])\n", + " break" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Our model-building utility**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow import keras\n", + "from tensorflow.keras import layers\n", + "\n", + "def get_model(max_tokens=20000, hidden_dim=16):\n", + " inputs = keras.Input(shape=(max_tokens,))\n", + " x = layers.Dense(hidden_dim, activation=\"relu\")(inputs)\n", + " x = layers.Dropout(0.5)(x)\n", + " outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", + " model = keras.Model(inputs, outputs)\n", + " model.compile(optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + " return model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Training and testing the 
binary unigram model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = get_model()\n", + "model.summary()\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\"binary_1gram.keras\",\n", + " save_best_only=True)\n", + "]\n", + "model.fit(binary_1gram_train_ds.cache(),\n", + " validation_data=binary_1gram_val_ds.cache(),\n", + " epochs=10,\n", + " callbacks=callbacks)\n", + "model = keras.models.load_model(\"binary_1gram.keras\")\n", + "print(f\"Test acc: {model.evaluate(binary_1gram_test_ds)[1]:.3f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Bigrams with binary encoding" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Configuring the `TextVectorization` layer to return bigrams**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "text_vectorization = TextVectorization(\n", + " ngrams=2,\n", + " max_tokens=20000,\n", + " output_mode=\"multi_hot\",\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Training and testing the binary bigram model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "text_vectorization.adapt(text_only_train_ds)\n", + "binary_2gram_train_ds = train_ds.map(\n", + " lambda x, y: (text_vectorization(x), y),\n", + " num_parallel_calls=4)\n", + "binary_2gram_val_ds = val_ds.map(\n", + " lambda x, y: (text_vectorization(x), y),\n", + " num_parallel_calls=4)\n", + "binary_2gram_test_ds = test_ds.map(\n", + " lambda x, y: (text_vectorization(x), y),\n", + " num_parallel_calls=4)\n", + "\n", + "model = get_model()\n", + "model.summary()\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\"binary_2gram.keras\",\n", + " save_best_only=True)\n", + "]\n", + "model.fit(binary_2gram_train_ds.cache(),\n", + " validation_data=binary_2gram_val_ds.cache(),\n", + " epochs=10,\n", + " callbacks=callbacks)\n", + "model = keras.models.load_model(\"binary_2gram.keras\")\n", + "print(f\"Test acc: {model.evaluate(binary_2gram_test_ds)[1]:.3f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Bigrams with TF-IDF encoding" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Configuring the `TextVectorization` layer to return token counts**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "text_vectorization = TextVectorization(\n", + " ngrams=2,\n", + " max_tokens=20000,\n", + " output_mode=\"count\"\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Configuring `TextVectorization` to return TF-IDF-weighted outputs**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "text_vectorization = TextVectorization(\n", + " ngrams=2,\n", + " max_tokens=20000,\n", + " output_mode=\"tf_idf\",\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Training and testing the TF-IDF bigram model**" + ] + }, + { + "cell_type": "code", + 
"execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "text_vectorization.adapt(text_only_train_ds)\n", + "\n", + "tfidf_2gram_train_ds = train_ds.map(\n", + " lambda x, y: (text_vectorization(x), y),\n", + " num_parallel_calls=4)\n", + "tfidf_2gram_val_ds = val_ds.map(\n", + " lambda x, y: (text_vectorization(x), y),\n", + " num_parallel_calls=4)\n", + "tfidf_2gram_test_ds = test_ds.map(\n", + " lambda x, y: (text_vectorization(x), y),\n", + " num_parallel_calls=4)\n", + "\n", + "model = get_model()\n", + "model.summary()\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\"tfidf_2gram.keras\",\n", + " save_best_only=True)\n", + "]\n", + "model.fit(tfidf_2gram_train_ds.cache(),\n", + " validation_data=tfidf_2gram_val_ds.cache(),\n", + " epochs=10,\n", + " callbacks=callbacks)\n", + "model = keras.models.load_model(\"tfidf_2gram.keras\")\n", + "print(f\"Test acc: {model.evaluate(tfidf_2gram_test_ds)[1]:.3f}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(1,), dtype=\"string\")\n", + "processed_inputs = text_vectorization(inputs)\n", + "outputs = model(processed_inputs)\n", + "inference_model = keras.Model(inputs, outputs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "raw_text_data = tf.convert_to_tensor([\n", + " [\"That was an excellent movie, I loved it.\"],\n", + "])\n", + "predictions = inference_model(raw_text_data)\n", + "print(f\"{float(predictions[0] * 100):.2f} percent positive\")" + ] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [], + "name": "chapter11_part01_introduction.i", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/second_edition/chapter11_part02_sequence-models.ipynb b/second_edition/chapter11_part02_sequence-models.ipynb new file mode 100644 index 0000000000..bfcf6237a2 --- /dev/null +++ b/second_edition/chapter11_part02_sequence-models.ipynb @@ -0,0 +1,478 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Processing words as a sequence: The sequence model approach" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### A first practical example" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Downloading the data**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!curl -O https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz\n", + "!tar -xf aclImdb_v1.tar.gz\n", + "!rm -r aclImdb/train/unsup" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Preparing the data**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os, pathlib, shutil, random\n", + "from tensorflow import keras\n", + "batch_size = 32\n", + "base_dir = pathlib.Path(\"aclImdb\")\n", + "val_dir = base_dir / \"val\"\n", + "train_dir = base_dir / \"train\"\n", + "for category in (\"neg\", \"pos\"):\n", + " os.makedirs(val_dir / category)\n", + " files = os.listdir(train_dir / category)\n", + " random.Random(1337).shuffle(files)\n", + " num_val_samples = int(0.2 * len(files))\n", + " val_files = files[-num_val_samples:]\n", + " for fname in val_files:\n", + " shutil.move(train_dir / category / fname,\n", + " val_dir / category / fname)\n", + "\n", + "train_ds = keras.utils.text_dataset_from_directory(\n", + " \"aclImdb/train\", batch_size=batch_size\n", + ")\n", + "val_ds = keras.utils.text_dataset_from_directory(\n", + " \"aclImdb/val\", batch_size=batch_size\n", + ")\n", + "test_ds = keras.utils.text_dataset_from_directory(\n", + " \"aclImdb/test\", batch_size=batch_size\n", + ")\n", + "text_only_train_ds = train_ds.map(lambda x, y: x)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Preparing integer sequence datasets**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras import layers\n", + "\n", + "max_length = 600\n", + "max_tokens = 20000\n", + "text_vectorization = layers.TextVectorization(\n", + " max_tokens=max_tokens,\n", + " output_mode=\"int\",\n", + " output_sequence_length=max_length,\n", + ")\n", + "text_vectorization.adapt(text_only_train_ds)\n", + "\n", + "int_train_ds = train_ds.map(\n", + " lambda x, y: (text_vectorization(x), y),\n", + " num_parallel_calls=4)\n", + "int_val_ds = val_ds.map(\n", + " lambda x, y: (text_vectorization(x), y),\n", + " num_parallel_calls=4)\n", + "int_test_ds = test_ds.map(\n", + " lambda x, y: (text_vectorization(x), y),\n", + " num_parallel_calls=4)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**A sequence model built on one-hot encoded vector sequences**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "inputs = keras.Input(shape=(None,), dtype=\"int64\")\n", + "embedded = tf.one_hot(inputs, depth=max_tokens)\n", + "x = layers.Bidirectional(layers.LSTM(32))(embedded)\n", + "x = layers.Dropout(0.5)(x)\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs, 
outputs)\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "model.summary()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Training a first basic sequence model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\"one_hot_bidir_lstm.keras\",\n", + " save_best_only=True)\n", + "]\n", + "model.fit(int_train_ds, validation_data=int_val_ds, epochs=10, callbacks=callbacks)\n", + "model = keras.models.load_model(\"one_hot_bidir_lstm.keras\")\n", + "print(f\"Test acc: {model.evaluate(int_test_ds)[1]:.3f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Understanding word embeddings" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Learning word embeddings with the Embedding layer" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Instantiating an `Embedding` layer**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "embedding_layer = layers.Embedding(input_dim=max_tokens, output_dim=256)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Model that uses an `Embedding` layer trained from scratch**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(None,), dtype=\"int64\")\n", + "embedded = layers.Embedding(input_dim=max_tokens, output_dim=256)(inputs)\n", + "x = layers.Bidirectional(layers.LSTM(32))(embedded)\n", + "x = layers.Dropout(0.5)(x)\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs, outputs)\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "model.summary()\n", + "\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\"embeddings_bidir_gru.keras\",\n", + " save_best_only=True)\n", + "]\n", + "model.fit(int_train_ds, validation_data=int_val_ds, epochs=10, callbacks=callbacks)\n", + "model = keras.models.load_model(\"embeddings_bidir_gru.keras\")\n", + "print(f\"Test acc: {model.evaluate(int_test_ds)[1]:.3f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Understanding padding and masking" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Using an `Embedding` layer with masking enabled**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(None,), dtype=\"int64\")\n", + "embedded = layers.Embedding(\n", + " input_dim=max_tokens, output_dim=256, mask_zero=True)(inputs)\n", + "x = layers.Bidirectional(layers.LSTM(32))(embedded)\n", + "x = layers.Dropout(0.5)(x)\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs, outputs)\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "model.summary()\n", + "\n", + "callbacks = [\n", + " 
keras.callbacks.ModelCheckpoint(\"embeddings_bidir_gru_with_masking.keras\",\n", + " save_best_only=True)\n", + "]\n", + "model.fit(int_train_ds, validation_data=int_val_ds, epochs=10, callbacks=callbacks)\n", + "model = keras.models.load_model(\"embeddings_bidir_gru_with_masking.keras\")\n", + "print(f\"Test acc: {model.evaluate(int_test_ds)[1]:.3f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Using pretrained word embeddings" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!wget http://nlp.stanford.edu/data/glove.6B.zip\n", + "!unzip -q glove.6B.zip" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Parsing the GloVe word-embeddings file**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "path_to_glove_file = \"glove.6B.100d.txt\"\n", + "\n", + "embeddings_index = {}\n", + "with open(path_to_glove_file) as f:\n", + " for line in f:\n", + " word, coefs = line.split(maxsplit=1)\n", + " coefs = np.fromstring(coefs, \"f\", sep=\" \")\n", + " embeddings_index[word] = coefs\n", + "\n", + "print(f\"Found {len(embeddings_index)} word vectors.\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Preparing the GloVe word-embeddings matrix**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "embedding_dim = 100\n", + "\n", + "vocabulary = text_vectorization.get_vocabulary()\n", + "word_index = dict(zip(vocabulary, range(len(vocabulary))))\n", + "\n", + "embedding_matrix = np.zeros((max_tokens, embedding_dim))\n", + "for word, i in word_index.items():\n", + " if i < max_tokens:\n", + " embedding_vector = embeddings_index.get(word)\n", + " if embedding_vector is not None:\n", + " embedding_matrix[i] = embedding_vector" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "embedding_layer = layers.Embedding(\n", + " max_tokens,\n", + " embedding_dim,\n", + " embeddings_initializer=keras.initializers.Constant(embedding_matrix),\n", + " trainable=False,\n", + " mask_zero=True,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Model that uses a pretrained Embedding layer**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(None,), dtype=\"int64\")\n", + "embedded = embedding_layer(inputs)\n", + "x = layers.Bidirectional(layers.LSTM(32))(embedded)\n", + "x = layers.Dropout(0.5)(x)\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs, outputs)\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "model.summary()\n", + "\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\"glove_embeddings_sequence_model.keras\",\n", + " save_best_only=True)\n", + "]\n", + "model.fit(int_train_ds, validation_data=int_val_ds, epochs=10, callbacks=callbacks)\n", + "model = keras.models.load_model(\"glove_embeddings_sequence_model.keras\")\n", + "print(f\"Test acc: 
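A minimal sketch (toy sizes) of the mask that `mask_zero=True` produces above: a boolean tensor marking real tokens versus padding (id 0), which downstream layers such as the LSTM use to skip padded steps.

```python
# Toy demonstration of Embedding.compute_mask; sizes are assumptions.
import tensorflow as tf
from tensorflow.keras import layers

embedding = layers.Embedding(input_dim=10, output_dim=4, mask_zero=True)
batch = tf.constant([[3, 7, 0, 0]])   # two real tokens, two padding tokens
print(embedding.compute_mask(batch))  # [[ True  True False False]]
```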
{model.evaluate(int_test_ds)[1]:.3f}\")" + ] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [], + "name": "chapter11_part02_sequence-models.i", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} diff --git a/second_edition/chapter11_part03_transformer.ipynb b/second_edition/chapter11_part03_transformer.ipynb new file mode 100644 index 0000000000..0cab099487 --- /dev/null +++ b/second_edition/chapter11_part03_transformer.ipynb @@ -0,0 +1,432 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## The Transformer architecture" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Understanding self-attention" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Generalized self-attention: the query-key-value model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Multi-head attention" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The Transformer encoder" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Getting the data**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!curl -O https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz\n", + "!tar -xf aclImdb_v1.tar.gz\n", + "!rm -r aclImdb/train/unsup" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Preparing the data**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import os, pathlib, shutil, random\n", + "from tensorflow import keras\n", + "batch_size = 32\n", + "base_dir = pathlib.Path(\"aclImdb\")\n", + "val_dir = base_dir / \"val\"\n", + "train_dir = base_dir / \"train\"\n", + "for category in (\"neg\", \"pos\"):\n", + " os.makedirs(val_dir / category)\n", + " files = os.listdir(train_dir / category)\n", + " random.Random(1337).shuffle(files)\n", + " num_val_samples = int(0.2 * len(files))\n", + " val_files = files[-num_val_samples:]\n", + " for fname in val_files:\n", + " shutil.move(train_dir / category / fname,\n", + " val_dir / category / fname)\n", + "\n", + "train_ds = keras.utils.text_dataset_from_directory(\n", + " 
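A minimal NumPy sketch (not from the book, toy dimensions) of the query-key-value attention described in this section: scores are softmax(QKᵀ/√d), and the output mixes value vectors per query.

```python
# Toy scaled dot-product attention; all sizes are assumed values.
import numpy as np

d = 4                     # key dimension (toy value)
Q = np.random.rand(3, d)  # 3 query positions
K = np.random.rand(5, d)  # 5 key positions
V = np.random.rand(5, d)  # one value vector per key

scores = Q @ K.T / np.sqrt(d)                   # (3, 5) attention logits
weights = np.exp(scores)
weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
output = weights @ V                            # (3, 4): values mixed per query
print(output.shape)
```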
\"aclImdb/train\", batch_size=batch_size\n", + ")\n", + "val_ds = keras.utils.text_dataset_from_directory(\n", + " \"aclImdb/val\", batch_size=batch_size\n", + ")\n", + "test_ds = keras.utils.text_dataset_from_directory(\n", + " \"aclImdb/test\", batch_size=batch_size\n", + ")\n", + "text_only_train_ds = train_ds.map(lambda x, y: x)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Vectorizing the data**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras import layers\n", + "\n", + "max_length = 600\n", + "max_tokens = 20000\n", + "text_vectorization = layers.TextVectorization(\n", + " max_tokens=max_tokens,\n", + " output_mode=\"int\",\n", + " output_sequence_length=max_length,\n", + ")\n", + "text_vectorization.adapt(text_only_train_ds)\n", + "\n", + "int_train_ds = train_ds.map(\n", + " lambda x, y: (text_vectorization(x), y),\n", + " num_parallel_calls=4)\n", + "int_val_ds = val_ds.map(\n", + " lambda x, y: (text_vectorization(x), y),\n", + " num_parallel_calls=4)\n", + "int_test_ds = test_ds.map(\n", + " lambda x, y: (text_vectorization(x), y),\n", + " num_parallel_calls=4)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Transformer encoder implemented as a subclassed `Layer`**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "from tensorflow import keras\n", + "from tensorflow.keras import layers\n", + "\n", + "class TransformerEncoder(layers.Layer):\n", + " def __init__(self, embed_dim, dense_dim, num_heads, **kwargs):\n", + " super().__init__(**kwargs)\n", + " self.embed_dim = embed_dim\n", + " self.dense_dim = dense_dim\n", + " self.num_heads = num_heads\n", + " self.attention = layers.MultiHeadAttention(\n", + " num_heads=num_heads, key_dim=embed_dim)\n", + " self.dense_proj = keras.Sequential(\n", + " [layers.Dense(dense_dim, activation=\"relu\"),\n", + " layers.Dense(embed_dim),]\n", + " )\n", + " self.layernorm_1 = layers.LayerNormalization()\n", + " self.layernorm_2 = layers.LayerNormalization()\n", + "\n", + " def call(self, inputs, mask=None):\n", + " if mask is not None:\n", + " mask = mask[:, tf.newaxis, :]\n", + " attention_output = self.attention(\n", + " inputs, inputs, attention_mask=mask)\n", + " proj_input = self.layernorm_1(inputs + attention_output)\n", + " proj_output = self.dense_proj(proj_input)\n", + " return self.layernorm_2(proj_input + proj_output)\n", + "\n", + " def get_config(self):\n", + " config = super().get_config()\n", + " config.update({\n", + " \"embed_dim\": self.embed_dim,\n", + " \"num_heads\": self.num_heads,\n", + " \"dense_dim\": self.dense_dim,\n", + " })\n", + " return config" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Using the Transformer encoder for text classification**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "vocab_size = 20000\n", + "embed_dim = 256\n", + "num_heads = 2\n", + "dense_dim = 32\n", + "\n", + "inputs = keras.Input(shape=(None,), dtype=\"int64\")\n", + "x = layers.Embedding(vocab_size, embed_dim)(inputs)\n", + "x = TransformerEncoder(embed_dim, dense_dim, num_heads)(x)\n", + "x = layers.GlobalMaxPooling1D()(x)\n", + "x = 
layers.Dropout(0.5)(x)\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs, outputs)\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "model.summary()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Training and evaluating the Transformer encoder based model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\"transformer_encoder.keras\",\n", + " save_best_only=True)\n", + "]\n", + "model.fit(int_train_ds, validation_data=int_val_ds, epochs=20, callbacks=callbacks)\n", + "model = keras.models.load_model(\n", + " \"transformer_encoder.keras\",\n", + " custom_objects={\"TransformerEncoder\": TransformerEncoder})\n", + "print(f\"Test acc: {model.evaluate(int_test_ds)[1]:.3f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Using positional encoding to re-inject order information" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Implementing positional embedding as a subclassed layer**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "class PositionalEmbedding(layers.Layer):\n", + " def __init__(self, sequence_length, input_dim, output_dim, **kwargs):\n", + " super().__init__(**kwargs)\n", + " self.token_embeddings = layers.Embedding(\n", + " input_dim=input_dim, output_dim=output_dim)\n", + " self.position_embeddings = layers.Embedding(\n", + " input_dim=sequence_length, output_dim=output_dim)\n", + " self.sequence_length = sequence_length\n", + " self.input_dim = input_dim\n", + " self.output_dim = output_dim\n", + "\n", + " def call(self, inputs):\n", + " length = tf.shape(inputs)[-1]\n", + " positions = tf.range(start=0, limit=length, delta=1)\n", + " embedded_tokens = self.token_embeddings(inputs)\n", + " embedded_positions = self.position_embeddings(positions)\n", + " return embedded_tokens + embedded_positions\n", + "\n", + " def compute_mask(self, inputs, mask=None):\n", + " return tf.math.not_equal(inputs, 0)\n", + "\n", + " def get_config(self):\n", + " config = super().get_config()\n", + " config.update({\n", + " \"output_dim\": self.output_dim,\n", + " \"sequence_length\": self.sequence_length,\n", + " \"input_dim\": self.input_dim,\n", + " })\n", + " return config" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Putting it all together: A text-classification Transformer" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Combining the Transformer encoder with positional embedding**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "vocab_size = 20000\n", + "sequence_length = 600\n", + "embed_dim = 256\n", + "num_heads = 2\n", + "dense_dim = 32\n", + "\n", + "inputs = keras.Input(shape=(None,), dtype=\"int64\")\n", + "x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(inputs)\n", + "x = TransformerEncoder(embed_dim, dense_dim, num_heads)(x)\n", + "x = layers.GlobalMaxPooling1D()(x)\n", + "x = layers.Dropout(0.5)(x)\n", + "outputs = layers.Dense(1, 
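A quick check (toy sizes, assuming the `PositionalEmbedding` class defined above is in scope): the position embedding is a second lookup table indexed by 0..length-1 and added to the token embeddings, so the same token id embeds differently at different positions.

```python
# Sketch only; sequence_length/input_dim/output_dim are assumed toy values.
import tensorflow as tf

pos_embed = PositionalEmbedding(sequence_length=10, input_dim=50, output_dim=8)
out = pos_embed(tf.constant([[7, 7]]))  # same token at positions 0 and 1
print(out.shape)                        # (1, 2, 8)
print(bool(tf.reduce_any(out[0, 0] != out[0, 1])))  # True (almost surely): vectors differ
```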
activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs, outputs)\n", + "model.compile(optimizer=\"rmsprop\",\n", + " loss=\"binary_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "model.summary()\n", + "\n", + "callbacks = [\n", + " keras.callbacks.ModelCheckpoint(\"full_transformer_encoder.keras\",\n", + " save_best_only=True)\n", + "]\n", + "model.fit(int_train_ds, validation_data=int_val_ds, epochs=20, callbacks=callbacks)\n", + "model = keras.models.load_model(\n", + " \"full_transformer_encoder.keras\",\n", + " custom_objects={\"TransformerEncoder\": TransformerEncoder,\n", + " \"PositionalEmbedding\": PositionalEmbedding})\n", + "print(f\"Test acc: {model.evaluate(int_test_ds)[1]:.3f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### When to use sequence models over bag-of-words models?" + ] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [], + "name": "chapter11_part03_transformer.i", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/second_edition/chapter11_part04_sequence-to-sequence-learning.ipynb b/second_edition/chapter11_part04_sequence-to-sequence-learning.ipynb new file mode 100644 index 0000000000..8f7bf72641 --- /dev/null +++ b/second_edition/chapter11_part04_sequence-to-sequence-learning.ipynb @@ -0,0 +1,625 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Beyond text classification: Sequence-to-sequence learning" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### A machine translation example" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!wget http://storage.googleapis.com/download.tensorflow.org/data/spa-eng.zip\n", + "!unzip -q spa-eng.zip" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "text_file = \"spa-eng/spa.txt\"\n", + "with open(text_file) as f:\n", + " lines = f.read().split(\"\\n\")[:-1]\n", + "text_pairs = []\n", + "for line in lines:\n", + " english, spanish = line.split(\"\\t\")\n", + " spanish = \"[start] \" + spanish + \" [end]\"\n", + " text_pairs.append((english, spanish))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import random\n", + "print(random.choice(text_pairs))" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import random\n", + "random.shuffle(text_pairs)\n", + "num_val_samples = int(0.15 * len(text_pairs))\n", + "num_train_samples = len(text_pairs) - 2 * num_val_samples\n", + "train_pairs = text_pairs[:num_train_samples]\n", + "val_pairs = text_pairs[num_train_samples:num_train_samples + num_val_samples]\n", + "test_pairs = text_pairs[num_train_samples + num_val_samples:]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Vectorizing the English and Spanish text pairs**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "import string\n", + "import re\n", + "from tensorflow import keras\n", + "from tensorflow.keras import layers\n", + "\n", + "strip_chars = string.punctuation + \"\u00bf\"\n", + "strip_chars = strip_chars.replace(\"[\", \"\")\n", + "strip_chars = strip_chars.replace(\"]\", \"\")\n", + "\n", + "def custom_standardization(input_string):\n", + " lowercase = tf.strings.lower(input_string)\n", + " return tf.strings.regex_replace(\n", + " lowercase, f\"[{re.escape(strip_chars)}]\", \"\")\n", + "\n", + "vocab_size = 15000\n", + "sequence_length = 20\n", + "\n", + "source_vectorization = layers.TextVectorization(\n", + " max_tokens=vocab_size,\n", + " output_mode=\"int\",\n", + " output_sequence_length=sequence_length,\n", + ")\n", + "target_vectorization = layers.TextVectorization(\n", + " max_tokens=vocab_size,\n", + " output_mode=\"int\",\n", + " output_sequence_length=sequence_length + 1,\n", + " standardize=custom_standardization,\n", + ")\n", + "train_english_texts = [pair[0] for pair in train_pairs]\n", + "train_spanish_texts = [pair[1] for pair in train_pairs]\n", + "source_vectorization.adapt(train_english_texts)\n", + "target_vectorization.adapt(train_spanish_texts)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Preparing datasets for the translation task**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "batch_size = 64\n", + "\n", + "def format_dataset(eng, 
spa):\n", + " eng = source_vectorization(eng)\n", + " spa = target_vectorization(spa)\n", + " return ({\n", + " \"english\": eng,\n", + " \"spanish\": spa[:, :-1],\n", + " }, spa[:, 1:])\n", + "\n", + "def make_dataset(pairs):\n", + " eng_texts, spa_texts = zip(*pairs)\n", + " eng_texts = list(eng_texts)\n", + " spa_texts = list(spa_texts)\n", + " dataset = tf.data.Dataset.from_tensor_slices((eng_texts, spa_texts))\n", + " dataset = dataset.batch(batch_size)\n", + " dataset = dataset.map(format_dataset, num_parallel_calls=4)\n", + " return dataset.shuffle(2048).prefetch(16).cache()\n", + "\n", + "train_ds = make_dataset(train_pairs)\n", + "val_ds = make_dataset(val_pairs)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "for inputs, targets in train_ds.take(1):\n", + " print(f\"inputs['english'].shape: {inputs['english'].shape}\")\n", + " print(f\"inputs['spanish'].shape: {inputs['spanish'].shape}\")\n", + " print(f\"targets.shape: {targets.shape}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Sequence-to-sequence learning with RNNs" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**GRU-based encoder**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow import keras\n", + "from tensorflow.keras import layers\n", + "\n", + "embed_dim = 256\n", + "latent_dim = 1024\n", + "\n", + "source = keras.Input(shape=(None,), dtype=\"int64\", name=\"english\")\n", + "x = layers.Embedding(vocab_size, embed_dim, mask_zero=True)(source)\n", + "encoded_source = layers.Bidirectional(\n", + " layers.GRU(latent_dim), merge_mode=\"sum\")(x)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**GRU-based decoder and the end-to-end model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "past_target = keras.Input(shape=(None,), dtype=\"int64\", name=\"spanish\")\n", + "x = layers.Embedding(vocab_size, embed_dim, mask_zero=True)(past_target)\n", + "decoder_gru = layers.GRU(latent_dim, return_sequences=True)\n", + "x = decoder_gru(x, initial_state=encoded_source)\n", + "x = layers.Dropout(0.5)(x)\n", + "target_next_step = layers.Dense(vocab_size, activation=\"softmax\")(x)\n", + "seq2seq_rnn = keras.Model([source, past_target], target_next_step)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Training our recurrent sequence-to-sequence model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "seq2seq_rnn.compile(\n", + " optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "seq2seq_rnn.fit(train_ds, epochs=15, validation_data=val_ds)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Translating new sentences with our RNN encoder and decoder**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "spa_vocab = target_vectorization.get_vocabulary()\n", + "spa_index_lookup = dict(zip(range(len(spa_vocab)), 
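A minimal sketch (toy tokens, not from the book) of the teacher-forcing alignment that `format_dataset` above builds: the decoder sees the target shifted one step left and must predict the same sequence shifted one step right.

```python
# Toy illustration of the (spa[:, :-1], spa[:, 1:]) offset.
spa = ["[start]", "vamos", "a", "casa", "[end]"]
decoder_input = spa[:-1]   # ['[start]', 'vamos', 'a', 'casa']
decoder_target = spa[1:]   # ['vamos', 'a', 'casa', '[end]']
for inp, tgt in zip(decoder_input, decoder_target):
    print(f"{inp!r:12} -> {tgt!r}")
```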
spa_vocab))\n", + "max_decoded_sentence_length = 20\n", + "\n", + "def decode_sequence(input_sentence):\n", + " tokenized_input_sentence = source_vectorization([input_sentence])\n", + " decoded_sentence = \"[start]\"\n", + " for i in range(max_decoded_sentence_length):\n", + " tokenized_target_sentence = target_vectorization([decoded_sentence])\n", + " next_token_predictions = seq2seq_rnn.predict(\n", + " [tokenized_input_sentence, tokenized_target_sentence])\n", + " sampled_token_index = np.argmax(next_token_predictions[0, i, :])\n", + " sampled_token = spa_index_lookup[sampled_token_index]\n", + " decoded_sentence += \" \" + sampled_token\n", + " if sampled_token == \"[end]\":\n", + " break\n", + " return decoded_sentence\n", + "\n", + "test_eng_texts = [pair[0] for pair in test_pairs]\n", + "for _ in range(20):\n", + " input_sentence = random.choice(test_eng_texts)\n", + " print(\"-\")\n", + " print(input_sentence)\n", + " print(decode_sequence(input_sentence))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Sequence-to-sequence learning with Transformer" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The Transformer decoder" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**The `TransformerDecoder`**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "class TransformerDecoder(layers.Layer):\n", + " def __init__(self, embed_dim, dense_dim, num_heads, **kwargs):\n", + " super().__init__(**kwargs)\n", + " self.embed_dim = embed_dim\n", + " self.dense_dim = dense_dim\n", + " self.num_heads = num_heads\n", + " self.attention_1 = layers.MultiHeadAttention(\n", + " num_heads=num_heads, key_dim=embed_dim)\n", + " self.attention_2 = layers.MultiHeadAttention(\n", + " num_heads=num_heads, key_dim=embed_dim)\n", + " self.dense_proj = keras.Sequential(\n", + " [layers.Dense(dense_dim, activation=\"relu\"),\n", + " layers.Dense(embed_dim),]\n", + " )\n", + " self.layernorm_1 = layers.LayerNormalization()\n", + " self.layernorm_2 = layers.LayerNormalization()\n", + " self.layernorm_3 = layers.LayerNormalization()\n", + " self.supports_masking = True\n", + "\n", + " def get_config(self):\n", + " config = super().get_config()\n", + " config.update({\n", + " \"embed_dim\": self.embed_dim,\n", + " \"num_heads\": self.num_heads,\n", + " \"dense_dim\": self.dense_dim,\n", + " })\n", + " return config\n", + "\n", + " def get_causal_attention_mask(self, inputs):\n", + " input_shape = tf.shape(inputs)\n", + " batch_size, sequence_length = input_shape[0], input_shape[1]\n", + " i = tf.range(sequence_length)[:, tf.newaxis]\n", + " j = tf.range(sequence_length)\n", + " mask = tf.cast(i >= j, dtype=\"int32\")\n", + " mask = tf.reshape(mask, (1, input_shape[1], input_shape[1]))\n", + " mult = tf.concat(\n", + " [tf.expand_dims(batch_size, -1),\n", + " tf.constant([1, 1], dtype=tf.int32)], axis=0)\n", + " return tf.tile(mask, mult)\n", + "\n", + " def call(self, inputs, encoder_outputs, mask=None):\n", + " causal_mask = self.get_causal_attention_mask(inputs)\n", + " if mask is not None:\n", + " padding_mask = tf.cast(\n", + " mask[:, tf.newaxis, :], dtype=\"int32\")\n", + " padding_mask = tf.minimum(padding_mask, causal_mask)\n", + " else:\n", + " padding_mask = mask\n", + " attention_output_1 = self.attention_1(\n", + " query=inputs,\n", + " value=inputs,\n", + 
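A minimal NumPy sketch of the causal mask that `get_causal_attention_mask` above constructs: position i may attend only to positions j <= i, i.e. a lower-triangular matrix.

```python
# Toy causal mask for sequence_length=4.
import numpy as np

sequence_length = 4
i = np.arange(sequence_length)[:, np.newaxis]
j = np.arange(sequence_length)
print((i >= j).astype("int32"))
# [[1 0 0 0]
#  [1 1 0 0]
#  [1 1 1 0]
#  [1 1 1 1]]
```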
" key=inputs,\n", + " attention_mask=causal_mask)\n", + " attention_output_1 = self.layernorm_1(inputs + attention_output_1)\n", + " attention_output_2 = self.attention_2(\n", + " query=attention_output_1,\n", + " value=encoder_outputs,\n", + " key=encoder_outputs,\n", + " attention_mask=padding_mask,\n", + " )\n", + " attention_output_2 = self.layernorm_2(\n", + " attention_output_1 + attention_output_2)\n", + " proj_output = self.dense_proj(attention_output_2)\n", + " return self.layernorm_3(attention_output_2 + proj_output)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Putting it all together: A Transformer for machine translation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**PositionalEmbedding layer**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "class PositionalEmbedding(layers.Layer):\n", + " def __init__(self, sequence_length, input_dim, output_dim, **kwargs):\n", + " super().__init__(**kwargs)\n", + " self.token_embeddings = layers.Embedding(\n", + " input_dim=input_dim, output_dim=output_dim)\n", + " self.position_embeddings = layers.Embedding(\n", + " input_dim=sequence_length, output_dim=output_dim)\n", + " self.sequence_length = sequence_length\n", + " self.input_dim = input_dim\n", + " self.output_dim = output_dim\n", + "\n", + " def call(self, inputs):\n", + " length = tf.shape(inputs)[-1]\n", + " positions = tf.range(start=0, limit=length, delta=1)\n", + " embedded_tokens = self.token_embeddings(inputs)\n", + " embedded_positions = self.position_embeddings(positions)\n", + " return embedded_tokens + embedded_positions\n", + "\n", + " def compute_mask(self, inputs, mask=None):\n", + " return tf.math.not_equal(inputs, 0)\n", + "\n", + " def get_config(self):\n", + " config = super(PositionalEmbedding, self).get_config()\n", + " config.update({\n", + " \"output_dim\": self.output_dim,\n", + " \"sequence_length\": self.sequence_length,\n", + " \"input_dim\": self.input_dim,\n", + " })\n", + " return config" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**End-to-end Transformer**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "embed_dim = 256\n", + "dense_dim = 2048\n", + "num_heads = 8\n", + "\n", + "encoder_inputs = keras.Input(shape=(None,), dtype=\"int64\", name=\"english\")\n", + "x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(encoder_inputs)\n", + "encoder_outputs = TransformerEncoder(embed_dim, dense_dim, num_heads)(x)\n", + "\n", + "decoder_inputs = keras.Input(shape=(None,), dtype=\"int64\", name=\"spanish\")\n", + "x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(decoder_inputs)\n", + "x = TransformerDecoder(embed_dim, dense_dim, num_heads)(x, encoder_outputs)\n", + "x = layers.Dropout(0.5)(x)\n", + "decoder_outputs = layers.Dense(vocab_size, activation=\"softmax\")(x)\n", + "transformer = keras.Model([encoder_inputs, decoder_inputs], decoder_outputs)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Training the sequence-to-sequence Transformer**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "transformer.compile(\n", + " 
optimizer=\"rmsprop\",\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + "transformer.fit(train_ds, epochs=30, validation_data=val_ds)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Translating new sentences with our Transformer model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "spa_vocab = target_vectorization.get_vocabulary()\n", + "spa_index_lookup = dict(zip(range(len(spa_vocab)), spa_vocab))\n", + "max_decoded_sentence_length = 20\n", + "\n", + "def decode_sequence(input_sentence):\n", + " tokenized_input_sentence = source_vectorization([input_sentence])\n", + " decoded_sentence = \"[start]\"\n", + " for i in range(max_decoded_sentence_length):\n", + " tokenized_target_sentence = target_vectorization(\n", + " [decoded_sentence])[:, :-1]\n", + " predictions = transformer(\n", + " [tokenized_input_sentence, tokenized_target_sentence])\n", + " sampled_token_index = np.argmax(predictions[0, i, :])\n", + " sampled_token = spa_index_lookup[sampled_token_index]\n", + " decoded_sentence += \" \" + sampled_token\n", + " if sampled_token == \"[end]\":\n", + " break\n", + " return decoded_sentence\n", + "\n", + "test_eng_texts = [pair[0] for pair in test_pairs]\n", + "for _ in range(20):\n", + " input_sentence = random.choice(test_eng_texts)\n", + " print(\"-\")\n", + " print(input_sentence)\n", + " print(decode_sequence(input_sentence))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Summary" + ] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [], + "name": "chapter11_part04_sequence-to-sequence-learning.i", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} diff --git a/second_edition/chapter12_part01_text-generation.ipynb b/second_edition/chapter12_part01_text-generation.ipynb new file mode 100644 index 0000000000..f683c1d73b --- /dev/null +++ b/second_edition/chapter12_part01_text-generation.ipynb @@ -0,0 +1,481 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "# Generative deep learning" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Text generation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### A brief history of generative deep learning for sequence generation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### How do you generate sequence data?" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The importance of the sampling strategy" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Reweighting a probability distribution to a different temperature**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "def reweight_distribution(original_distribution, temperature=0.5):\n", + " distribution = np.log(original_distribution) / temperature\n", + " distribution = np.exp(distribution)\n", + " return distribution / np.sum(distribution)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Implementing text generation with Keras" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Preparing the data" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Downloading and uncompressing the IMDB movie reviews dataset**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!wget https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz\n", + "!tar -xf aclImdb_v1.tar.gz" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Creating a dataset from text files (one file = one sample)**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "from tensorflow import keras\n", + "dataset = keras.utils.text_dataset_from_directory(\n", + " directory=\"aclImdb\", label_mode=None, batch_size=256)\n", + "dataset = dataset.map(lambda x: tf.strings.regex_replace(x, \"
<br />
\", \" \"))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Preparing a `TextVectorization` layer**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras.layers import TextVectorization\n", + "\n", + "sequence_length = 100\n", + "vocab_size = 15000\n", + "text_vectorization = TextVectorization(\n", + " max_tokens=vocab_size,\n", + " output_mode=\"int\",\n", + " output_sequence_length=sequence_length,\n", + ")\n", + "text_vectorization.adapt(dataset)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Setting up a language modeling dataset**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def prepare_lm_dataset(text_batch):\n", + " vectorized_sequences = text_vectorization(text_batch)\n", + " x = vectorized_sequences[:, :-1]\n", + " y = vectorized_sequences[:, 1:]\n", + " return x, y\n", + "\n", + "lm_dataset = dataset.map(prepare_lm_dataset, num_parallel_calls=4)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### A Transformer-based sequence-to-sequence model" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "from tensorflow.keras import layers\n", + "\n", + "class PositionalEmbedding(layers.Layer):\n", + " def __init__(self, sequence_length, input_dim, output_dim, **kwargs):\n", + " super().__init__(**kwargs)\n", + " self.token_embeddings = layers.Embedding(\n", + " input_dim=input_dim, output_dim=output_dim)\n", + " self.position_embeddings = layers.Embedding(\n", + " input_dim=sequence_length, output_dim=output_dim)\n", + " self.sequence_length = sequence_length\n", + " self.input_dim = input_dim\n", + " self.output_dim = output_dim\n", + "\n", + " def call(self, inputs):\n", + " length = tf.shape(inputs)[-1]\n", + " positions = tf.range(start=0, limit=length, delta=1)\n", + " embedded_tokens = self.token_embeddings(inputs)\n", + " embedded_positions = self.position_embeddings(positions)\n", + " return embedded_tokens + embedded_positions\n", + "\n", + " def compute_mask(self, inputs, mask=None):\n", + " return tf.math.not_equal(inputs, 0)\n", + "\n", + " def get_config(self):\n", + " config = super(PositionalEmbedding, self).get_config()\n", + " config.update({\n", + " \"output_dim\": self.output_dim,\n", + " \"sequence_length\": self.sequence_length,\n", + " \"input_dim\": self.input_dim,\n", + " })\n", + " return config\n", + "\n", + "\n", + "class TransformerDecoder(layers.Layer):\n", + " def __init__(self, embed_dim, dense_dim, num_heads, **kwargs):\n", + " super().__init__(**kwargs)\n", + " self.embed_dim = embed_dim\n", + " self.dense_dim = dense_dim\n", + " self.num_heads = num_heads\n", + " self.attention_1 = layers.MultiHeadAttention(\n", + " num_heads=num_heads, key_dim=embed_dim)\n", + " self.attention_2 = layers.MultiHeadAttention(\n", + " num_heads=num_heads, key_dim=embed_dim)\n", + " self.dense_proj = keras.Sequential(\n", + " [layers.Dense(dense_dim, activation=\"relu\"),\n", + " layers.Dense(embed_dim),]\n", + " )\n", + " self.layernorm_1 = layers.LayerNormalization()\n", + " self.layernorm_2 = layers.LayerNormalization()\n", + " self.layernorm_3 = layers.LayerNormalization()\n", + " 
self.supports_masking = True\n", + "\n", + " def get_config(self):\n", + " config = super(TransformerDecoder, self).get_config()\n", + " config.update({\n", + " \"embed_dim\": self.embed_dim,\n", + " \"num_heads\": self.num_heads,\n", + " \"dense_dim\": self.dense_dim,\n", + " })\n", + " return config\n", + "\n", + " def get_causal_attention_mask(self, inputs):\n", + " input_shape = tf.shape(inputs)\n", + " batch_size, sequence_length = input_shape[0], input_shape[1]\n", + " i = tf.range(sequence_length)[:, tf.newaxis]\n", + " j = tf.range(sequence_length)\n", + " mask = tf.cast(i >= j, dtype=\"int32\")\n", + " mask = tf.reshape(mask, (1, input_shape[1], input_shape[1]))\n", + " mult = tf.concat(\n", + " [tf.expand_dims(batch_size, -1),\n", + " tf.constant([1, 1], dtype=tf.int32)], axis=0)\n", + " return tf.tile(mask, mult)\n", + "\n", + " def call(self, inputs, encoder_outputs, mask=None):\n", + " causal_mask = self.get_causal_attention_mask(inputs)\n", + " if mask is not None:\n", + " padding_mask = tf.cast(\n", + " mask[:, tf.newaxis, :], dtype=\"int32\")\n", + " padding_mask = tf.minimum(padding_mask, causal_mask)\n", + " else:\n", + " padding_mask = mask\n", + " attention_output_1 = self.attention_1(\n", + " query=inputs,\n", + " value=inputs,\n", + " key=inputs,\n", + " attention_mask=causal_mask)\n", + " attention_output_1 = self.layernorm_1(inputs + attention_output_1)\n", + " attention_output_2 = self.attention_2(\n", + " query=attention_output_1,\n", + " value=encoder_outputs,\n", + " key=encoder_outputs,\n", + " attention_mask=padding_mask,\n", + " )\n", + " attention_output_2 = self.layernorm_2(\n", + " attention_output_1 + attention_output_2)\n", + " proj_output = self.dense_proj(attention_output_2)\n", + " return self.layernorm_3(attention_output_2 + proj_output)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**A simple Transformer-based language model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras import layers\n", + "embed_dim = 256\n", + "latent_dim = 2048\n", + "num_heads = 2\n", + "\n", + "inputs = keras.Input(shape=(None,), dtype=\"int64\")\n", + "x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(inputs)\n", + "x = TransformerDecoder(embed_dim, latent_dim, num_heads)(x, x)\n", + "outputs = layers.Dense(vocab_size, activation=\"softmax\")(x)\n", + "model = keras.Model(inputs, outputs)\n", + "model.compile(loss=\"sparse_categorical_crossentropy\", optimizer=\"rmsprop\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### A text-generation callback with variable-temperature sampling" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**The text-generation callback**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "tokens_index = dict(enumerate(text_vectorization.get_vocabulary()))\n", + "\n", + "def sample_next(predictions, temperature=1.0):\n", + " predictions = np.asarray(predictions).astype(\"float64\")\n", + " predictions = np.log(predictions) / temperature\n", + " exp_preds = np.exp(predictions)\n", + " predictions = exp_preds / np.sum(exp_preds)\n", + " probas = np.random.multinomial(1, predictions, 1)\n", + " return np.argmax(probas)\n", + "\n", + "class 
TextGenerator(keras.callbacks.Callback):\n", + " def __init__(self,\n", + " prompt,\n", + " generate_length,\n", + " model_input_length,\n", + " temperatures=(1.,),\n", + " print_freq=1):\n", + " self.prompt = prompt\n", + " self.generate_length = generate_length\n", + " self.model_input_length = model_input_length\n", + " self.temperatures = temperatures\n", + " self.print_freq = print_freq\n", + " vectorized_prompt = text_vectorization([prompt])[0].numpy()\n", + " self.prompt_length = np.nonzero(vectorized_prompt == 0)[0][0]\n", + "\n", + " def on_epoch_end(self, epoch, logs=None):\n", + " if (epoch + 1) % self.print_freq != 0:\n", + " return\n", + " for temperature in self.temperatures:\n", + " print(\"== Generating with temperature\", temperature)\n", + " sentence = self.prompt\n", + " for i in range(self.generate_length):\n", + " tokenized_sentence = text_vectorization([sentence])\n", + " predictions = self.model(tokenized_sentence)\n", + " next_token = sample_next(\n", + " predictions[0, self.prompt_length - 1 + i, :]\n", + " )\n", + " sampled_token = tokens_index[next_token]\n", + " sentence += \" \" + sampled_token\n", + " print(sentence)\n", + "\n", + "prompt = \"This movie\"\n", + "text_gen_callback = TextGenerator(\n", + " prompt,\n", + " generate_length=50,\n", + " model_input_length=sequence_length,\n", + " temperatures=(0.2, 0.5, 0.7, 1., 1.5))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Fitting the language model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model.fit(lm_dataset, epochs=200, callbacks=[text_gen_callback])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Wrapping up" + ] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [], + "name": "chapter12_part01_text-generation.i", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/second_edition/chapter12_part02_deep-dream.ipynb b/second_edition/chapter12_part02_deep-dream.ipynb new file mode 100644 index 0000000000..7e01d0fbee --- /dev/null +++ b/second_edition/chapter12_part02_deep-dream.ipynb @@ -0,0 +1,308 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6." 
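A minimal sketch (toy distribution) of the temperature reweighting used by `reweight_distribution` and `sample_next` above: low temperature sharpens the distribution, high temperature flattens it toward uniform.

```python
# Toy temperature sweep; the 3-way distribution is an assumed example.
import numpy as np

original = np.array([0.5, 0.3, 0.2])
for temperature in (0.2, 1.0, 2.0):
    d = np.exp(np.log(original) / temperature)
    print(temperature, np.round(d / d.sum(), 3))
# 0.2 -> [0.919 0.071 0.009]   (sharper)
# 1.0 -> [0.5   0.3   0.2  ]   (unchanged)
# 2.0 -> [0.415 0.322 0.263]   (flatter)
```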
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## DeepDream" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Implementing DeepDream in Keras" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Fetching the test image**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow import keras\n", + "import matplotlib.pyplot as plt\n", + "\n", + "base_image_path = keras.utils.get_file(\n", + " \"coast.jpg\", origin=\"https://img-datasets.s3.amazonaws.com/coast.jpg\")\n", + "\n", + "plt.axis(\"off\")\n", + "plt.imshow(keras.utils.load_img(base_image_path))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Instantiating a pretrained `InceptionV3` model**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras.applications import inception_v3\n", + "model = inception_v3.InceptionV3(weights=\"imagenet\", include_top=False)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Configuring the contribution of each layer to the DeepDream loss**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "layer_settings = {\n", + " \"mixed4\": 1.0,\n", + " \"mixed5\": 1.5,\n", + " \"mixed6\": 2.0,\n", + " \"mixed7\": 2.5,\n", + "}\n", + "outputs_dict = dict(\n", + " [\n", + " (layer.name, layer.output)\n", + " for layer in [model.get_layer(name) for name in layer_settings.keys()]\n", + " ]\n", + ")\n", + "feature_extractor = keras.Model(inputs=model.inputs, outputs=outputs_dict)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**The DeepDream loss**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def compute_loss(input_image):\n", + " features = feature_extractor(input_image)\n", + " loss = tf.zeros(shape=())\n", + " for name in features.keys():\n", + " coeff = layer_settings[name]\n", + " activation = features[name]\n", + " loss += coeff * tf.reduce_mean(tf.square(activation[:, 2:-2, 2:-2, :]))\n", + " return loss" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**The DeepDream gradient ascent process**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "\n", + "@tf.function\n", + "def gradient_ascent_step(image, learning_rate):\n", + " with tf.GradientTape() as tape:\n", + " tape.watch(image)\n", + " loss = compute_loss(image)\n", + " grads = tape.gradient(loss, image)\n", + " grads = tf.math.l2_normalize(grads)\n", + " image += learning_rate * grads\n", + " return loss, image\n", + "\n", + "\n", + "def gradient_ascent_loop(image, iterations, learning_rate, max_loss=None):\n", + " for i in range(iterations):\n", + " loss, image = gradient_ascent_step(image, learning_rate)\n", + " if max_loss is not None and loss > max_loss:\n", + " break\n", + " print(f\"... 
Loss value at step {i}: {loss:.2f}\")\n", + " return image" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "step = 20.\n", + "num_octave = 3\n", + "octave_scale = 1.4\n", + "iterations = 30\n", + "max_loss = 15." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Image processing utilities**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "def preprocess_image(image_path):\n", + " img = keras.utils.load_img(image_path)\n", + " img = keras.utils.img_to_array(img)\n", + " img = np.expand_dims(img, axis=0)\n", + " img = keras.applications.inception_v3.preprocess_input(img)\n", + " return img\n", + "\n", + "def deprocess_image(img):\n", + " img = img.reshape((img.shape[1], img.shape[2], 3))\n", + " img /= 2.0\n", + " img += 0.5\n", + " img *= 255.\n", + " img = np.clip(img, 0, 255).astype(\"uint8\")\n", + " return img" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Running gradient ascent over multiple successive \"octaves\"**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "original_img = preprocess_image(base_image_path)\n", + "original_shape = original_img.shape[1:3]\n", + "\n", + "successive_shapes = [original_shape]\n", + "for i in range(1, num_octave):\n", + " shape = tuple([int(dim / (octave_scale ** i)) for dim in original_shape])\n", + " successive_shapes.append(shape)\n", + "successive_shapes = successive_shapes[::-1]\n", + "\n", + "shrunk_original_img = tf.image.resize(original_img, successive_shapes[0])\n", + "\n", + "img = tf.identity(original_img)\n", + "for i, shape in enumerate(successive_shapes):\n", + " print(f\"Processing octave {i} with shape {shape}\")\n", + " img = tf.image.resize(img, shape)\n", + " img = gradient_ascent_loop(\n", + " img, iterations=iterations, learning_rate=step, max_loss=max_loss\n", + " )\n", + " upscaled_shrunk_original_img = tf.image.resize(shrunk_original_img, shape)\n", + " same_size_original = tf.image.resize(original_img, shape)\n", + " lost_detail = same_size_original - upscaled_shrunk_original_img\n", + " img += lost_detail\n", + " shrunk_original_img = tf.image.resize(original_img, shape)\n", + "\n", + "keras.utils.save_img(\"dream.png\", deprocess_image(img.numpy()))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Wrapping up" + ] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [], + "name": "chapter12_part02_deep-dream.i", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/second_edition/chapter12_part03_neural-style-transfer.ipynb b/second_edition/chapter12_part03_neural-style-transfer.ipynb new file mode 100644 index 0000000000..42fd13ef84 --- /dev/null +++ 
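A minimal sketch of the "octave" sizes computed in the loop above, assuming an original image of 900x600 with `num_octave=3` and `octave_scale=1.4`: processing runs from the smallest scale up to the original resolution.

```python
# Worked octave-shape arithmetic; the 900x600 image size is an assumption.
original_shape = (900, 600)
octave_scale, num_octave = 1.4, 3
shapes = [original_shape] + [
    tuple(int(dim / octave_scale ** i) for dim in original_shape)
    for i in range(1, num_octave)
]
print(shapes[::-1])  # [(459, 306), (642, 428), (900, 600)]
```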
b/second_edition/chapter12_part03_neural-style-transfer.ipynb @@ -0,0 +1,356 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Neural style transfer" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The content loss" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The style loss" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Neural style transfer in Keras" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Getting the style and content images**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow import keras\n", + "\n", + "base_image_path = keras.utils.get_file(\n", + " \"sf.jpg\", origin=\"https://img-datasets.s3.amazonaws.com/sf.jpg\")\n", + "style_reference_image_path = keras.utils.get_file(\n", + " \"starry_night.jpg\", origin=\"https://img-datasets.s3.amazonaws.com/starry_night.jpg\")\n", + "\n", + "original_width, original_height = keras.utils.load_img(base_image_path).size\n", + "img_height = 400\n", + "img_width = round(original_width * img_height / original_height)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Auxiliary functions**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "def preprocess_image(image_path):\n", + " img = keras.utils.load_img(\n", + " image_path, target_size=(img_height, img_width))\n", + " img = keras.utils.img_to_array(img)\n", + " img = np.expand_dims(img, axis=0)\n", + " img = keras.applications.vgg19.preprocess_input(img)\n", + " return img\n", + "\n", + "def deprocess_image(img):\n", + " img = img.reshape((img_height, img_width, 3))\n", + " img[:, :, 0] += 103.939\n", + " img[:, :, 1] += 116.779\n", + " img[:, :, 2] += 123.68\n", + " img = img[:, :, ::-1]\n", + " img = np.clip(img, 0, 255).astype(\"uint8\")\n", + " return img" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Using a pretrained VGG19 model to create a feature extractor**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "model = keras.applications.vgg19.VGG19(weights=\"imagenet\", include_top=False)\n", + "\n", + "outputs_dict = dict([(layer.name, layer.output) for layer in model.layers])\n", + "feature_extractor = keras.Model(inputs=model.inputs, outputs=outputs_dict)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + 
"**Content loss**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def content_loss(base_img, combination_img):\n", + " return tf.reduce_sum(tf.square(combination_img - base_img))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Style loss**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def gram_matrix(x):\n", + " x = tf.transpose(x, (2, 0, 1))\n", + " features = tf.reshape(x, (tf.shape(x)[0], -1))\n", + " gram = tf.matmul(features, tf.transpose(features))\n", + " return gram\n", + "\n", + "def style_loss(style_img, combination_img):\n", + " S = gram_matrix(style_img)\n", + " C = gram_matrix(combination_img)\n", + " channels = 3\n", + " size = img_height * img_width\n", + " return tf.reduce_sum(tf.square(S - C)) / (4.0 * (channels ** 2) * (size ** 2))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Total variation loss**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def total_variation_loss(x):\n", + " a = tf.square(\n", + " x[:, : img_height - 1, : img_width - 1, :] - x[:, 1:, : img_width - 1, :]\n", + " )\n", + " b = tf.square(\n", + " x[:, : img_height - 1, : img_width - 1, :] - x[:, : img_height - 1, 1:, :]\n", + " )\n", + " return tf.reduce_sum(tf.pow(a + b, 1.25))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Defining the final loss that you'll minimize**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "style_layer_names = [\n", + " \"block1_conv1\",\n", + " \"block2_conv1\",\n", + " \"block3_conv1\",\n", + " \"block4_conv1\",\n", + " \"block5_conv1\",\n", + "]\n", + "content_layer_name = \"block5_conv2\"\n", + "total_variation_weight = 1e-6\n", + "style_weight = 1e-6\n", + "content_weight = 2.5e-8\n", + "\n", + "def compute_loss(combination_image, base_image, style_reference_image):\n", + " input_tensor = tf.concat(\n", + " [base_image, style_reference_image, combination_image], axis=0\n", + " )\n", + " features = feature_extractor(input_tensor)\n", + " loss = tf.zeros(shape=())\n", + " layer_features = features[content_layer_name]\n", + " base_image_features = layer_features[0, :, :, :]\n", + " combination_features = layer_features[2, :, :, :]\n", + " loss = loss + content_weight * content_loss(\n", + " base_image_features, combination_features\n", + " )\n", + " for layer_name in style_layer_names:\n", + " layer_features = features[layer_name]\n", + " style_reference_features = layer_features[1, :, :, :]\n", + " combination_features = layer_features[2, :, :, :]\n", + " style_loss_value = style_loss(\n", + " style_reference_features, combination_features)\n", + " loss += (style_weight / len(style_layer_names)) * style_loss_value\n", + "\n", + " loss += total_variation_weight * total_variation_loss(combination_image)\n", + " return loss" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Setting up the gradient-descent process**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "\n", + 
"@tf.function\n", + "def compute_loss_and_grads(combination_image, base_image, style_reference_image):\n", + " with tf.GradientTape() as tape:\n", + " loss = compute_loss(combination_image, base_image, style_reference_image)\n", + " grads = tape.gradient(loss, combination_image)\n", + " return loss, grads\n", + "\n", + "optimizer = keras.optimizers.SGD(\n", + " keras.optimizers.schedules.ExponentialDecay(\n", + " initial_learning_rate=100.0, decay_steps=100, decay_rate=0.96\n", + " )\n", + ")\n", + "\n", + "base_image = preprocess_image(base_image_path)\n", + "style_reference_image = preprocess_image(style_reference_image_path)\n", + "combination_image = tf.Variable(preprocess_image(base_image_path))\n", + "\n", + "iterations = 4000\n", + "for i in range(1, iterations + 1):\n", + " loss, grads = compute_loss_and_grads(\n", + " combination_image, base_image, style_reference_image\n", + " )\n", + " optimizer.apply_gradients([(grads, combination_image)])\n", + " if i % 100 == 0:\n", + " print(f\"Iteration {i}: loss={loss:.2f}\")\n", + " img = deprocess_image(combination_image.numpy())\n", + " fname = f\"combination_image_at_iteration_{i}.png\"\n", + " keras.utils.save_img(fname, img)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Wrapping up" + ] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [], + "name": "chapter12_part03_neural-style-transfer.i", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/second_edition/chapter12_part04_variational-autoencoders.ipynb b/second_edition/chapter12_part04_variational-autoencoders.ipynb new file mode 100644 index 0000000000..fd6ae9c13a --- /dev/null +++ b/second_edition/chapter12_part04_variational-autoencoders.ipynb @@ -0,0 +1,339 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Generating images with variational autoencoders" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Sampling from latent spaces of images" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Concept vectors for image editing" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Variational autoencoders" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Implementing a VAE with Keras" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**VAE encoder network**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow import keras\n", + "from tensorflow.keras import layers\n", + "\n", + "latent_dim = 2\n", + "\n", + "encoder_inputs = keras.Input(shape=(28, 28, 1))\n", + "x = layers.Conv2D(32, 3, activation=\"relu\", strides=2, padding=\"same\")(encoder_inputs)\n", + "x = layers.Conv2D(64, 3, activation=\"relu\", strides=2, padding=\"same\")(x)\n", + "x = layers.Flatten()(x)\n", + "x = layers.Dense(16, activation=\"relu\")(x)\n", + "z_mean = layers.Dense(latent_dim, name=\"z_mean\")(x)\n", + "z_log_var = layers.Dense(latent_dim, name=\"z_log_var\")(x)\n", + "encoder = keras.Model(encoder_inputs, [z_mean, z_log_var], name=\"encoder\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "encoder.summary()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Latent-space-sampling layer**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "\n", + "class Sampler(layers.Layer):\n", + " def call(self, z_mean, z_log_var):\n", + " batch_size = tf.shape(z_mean)[0]\n", + " z_size = tf.shape(z_mean)[1]\n", + " epsilon = tf.random.normal(shape=(batch_size, z_size))\n", + " return z_mean + tf.exp(0.5 * z_log_var) * epsilon" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**VAE decoder network, mapping latent space points to images**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "latent_inputs = keras.Input(shape=(latent_dim,))\n", + "x = layers.Dense(7 * 7 * 64, activation=\"relu\")(latent_inputs)\n", + "x = layers.Reshape((7, 7, 64))(x)\n", + "x = layers.Conv2DTranspose(64, 3, activation=\"relu\", strides=2, padding=\"same\")(x)\n", + "x = layers.Conv2DTranspose(32, 3, activation=\"relu\", strides=2, padding=\"same\")(x)\n", + "decoder_outputs = layers.Conv2D(1, 3, activation=\"sigmoid\", padding=\"same\")(x)\n", + "decoder = keras.Model(latent_inputs, decoder_outputs, name=\"decoder\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "decoder.summary()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**VAE model with custom `train_step()`**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + 
"colab_type": "code" + }, + "outputs": [], + "source": [ + "class VAE(keras.Model):\n", + " def __init__(self, encoder, decoder, **kwargs):\n", + " super().__init__(**kwargs)\n", + " self.encoder = encoder\n", + " self.decoder = decoder\n", + " self.sampler = Sampler()\n", + " self.total_loss_tracker = keras.metrics.Mean(name=\"total_loss\")\n", + " self.reconstruction_loss_tracker = keras.metrics.Mean(\n", + " name=\"reconstruction_loss\")\n", + " self.kl_loss_tracker = keras.metrics.Mean(name=\"kl_loss\")\n", + "\n", + " @property\n", + " def metrics(self):\n", + " return [self.total_loss_tracker,\n", + " self.reconstruction_loss_tracker,\n", + " self.kl_loss_tracker]\n", + "\n", + " def train_step(self, data):\n", + " with tf.GradientTape() as tape:\n", + " z_mean, z_log_var = self.encoder(data)\n", + " z = self.sampler(z_mean, z_log_var)\n", + " reconstruction = decoder(z)\n", + " reconstruction_loss = tf.reduce_mean(\n", + " tf.reduce_sum(\n", + " keras.losses.binary_crossentropy(data, reconstruction),\n", + " axis=(1, 2)\n", + " )\n", + " )\n", + " kl_loss = -0.5 * (1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var))\n", + " total_loss = reconstruction_loss + tf.reduce_mean(kl_loss)\n", + " grads = tape.gradient(total_loss, self.trainable_weights)\n", + " self.optimizer.apply_gradients(zip(grads, self.trainable_weights))\n", + " self.total_loss_tracker.update_state(total_loss)\n", + " self.reconstruction_loss_tracker.update_state(reconstruction_loss)\n", + " self.kl_loss_tracker.update_state(kl_loss)\n", + " return {\n", + " \"total_loss\": self.total_loss_tracker.result(),\n", + " \"reconstruction_loss\": self.reconstruction_loss_tracker.result(),\n", + " \"kl_loss\": self.kl_loss_tracker.result(),\n", + " }" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Training the VAE**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "(x_train, _), (x_test, _) = keras.datasets.mnist.load_data()\n", + "mnist_digits = np.concatenate([x_train, x_test], axis=0)\n", + "mnist_digits = np.expand_dims(mnist_digits, -1).astype(\"float32\") / 255\n", + "\n", + "vae = VAE(encoder, decoder)\n", + "vae.compile(optimizer=keras.optimizers.Adam(), run_eagerly=True)\n", + "vae.fit(mnist_digits, epochs=30, batch_size=128)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Sampling a grid of images from the 2D latent space**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "n = 30\n", + "digit_size = 28\n", + "figure = np.zeros((digit_size * n, digit_size * n))\n", + "\n", + "grid_x = np.linspace(-1, 1, n)\n", + "grid_y = np.linspace(-1, 1, n)[::-1]\n", + "\n", + "for i, yi in enumerate(grid_y):\n", + " for j, xi in enumerate(grid_x):\n", + " z_sample = np.array([[xi, yi]])\n", + " x_decoded = vae.decoder.predict(z_sample)\n", + " digit = x_decoded[0].reshape(digit_size, digit_size)\n", + " figure[\n", + " i * digit_size : (i + 1) * digit_size,\n", + " j * digit_size : (j + 1) * digit_size,\n", + " ] = digit\n", + "\n", + "plt.figure(figsize=(15, 15))\n", + "start_range = digit_size // 2\n", + "end_range = n * digit_size + start_range\n", + "pixel_range = np.arange(start_range, end_range, digit_size)\n", + "sample_range_x = np.round(grid_x, 1)\n", + 
"sample_range_y = np.round(grid_y, 1)\n", + "plt.xticks(pixel_range, sample_range_x)\n", + "plt.yticks(pixel_range, sample_range_y)\n", + "plt.xlabel(\"z[0]\")\n", + "plt.ylabel(\"z[1]\")\n", + "plt.axis(\"off\")\n", + "plt.imshow(figure, cmap=\"Greys_r\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Wrapping up" + ] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [], + "name": "chapter12_part04_variational-autoencoders.i", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/second_edition/chapter12_part05_gans.ipynb b/second_edition/chapter12_part05_gans.ipynb new file mode 100644 index 0000000000..4b861c891a --- /dev/null +++ b/second_edition/chapter12_part05_gans.ipynb @@ -0,0 +1,447 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Introduction to generative adversarial networks" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### A schematic GAN implementation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### A bag of tricks" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Getting our hands on the CelebA dataset" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Getting the CelebA data**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!mkdir celeba_gan\n", + "!gdown --id 1O7m1010EJjLE5QxLZiM9Fpjs7Oj6e684 -O celeba_gan/data.zip\n", + "!unzip -qq celeba_gan/data.zip -d celeba_gan" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Creating a dataset from a directory of images**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow import keras\n", + "dataset = keras.utils.image_dataset_from_directory(\n", + " \"celeba_gan\",\n", + " label_mode=None,\n", + " image_size=(64, 64),\n", + " batch_size=32,\n", + " smart_resize=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Rescaling the images**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "dataset = dataset.map(lambda x: x / 255.)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Displaying the first image**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "for x in dataset:\n", + " plt.axis(\"off\")\n", + " plt.imshow((x.numpy() * 255).astype(\"int32\")[0])\n", + " break" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The discriminator" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**The GAN discriminator network**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow.keras import layers\n", + "\n", + "discriminator = keras.Sequential(\n", + " [\n", + " keras.Input(shape=(64, 64, 3)),\n", + " layers.Conv2D(64, kernel_size=4, strides=2, padding=\"same\"),\n", + " layers.LeakyReLU(alpha=0.2),\n", + " layers.Conv2D(128, kernel_size=4, strides=2, padding=\"same\"),\n", + " layers.LeakyReLU(alpha=0.2),\n", + " layers.Conv2D(128, kernel_size=4, strides=2, padding=\"same\"),\n", + " layers.LeakyReLU(alpha=0.2),\n", + " layers.Flatten(),\n", + " layers.Dropout(0.2),\n", + " layers.Dense(1, activation=\"sigmoid\"),\n", + " ],\n", + " name=\"discriminator\",\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "discriminator.summary()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The generator" + ] + }, + { + "cell_type": "markdown", 
+ "metadata": { + "colab_type": "text" + }, + "source": [ + "**GAN generator network**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "latent_dim = 128\n", + "\n", + "generator = keras.Sequential(\n", + " [\n", + " keras.Input(shape=(latent_dim,)),\n", + " layers.Dense(8 * 8 * 128),\n", + " layers.Reshape((8, 8, 128)),\n", + " layers.Conv2DTranspose(128, kernel_size=4, strides=2, padding=\"same\"),\n", + " layers.LeakyReLU(alpha=0.2),\n", + " layers.Conv2DTranspose(256, kernel_size=4, strides=2, padding=\"same\"),\n", + " layers.LeakyReLU(alpha=0.2),\n", + " layers.Conv2DTranspose(512, kernel_size=4, strides=2, padding=\"same\"),\n", + " layers.LeakyReLU(alpha=0.2),\n", + " layers.Conv2D(3, kernel_size=5, padding=\"same\", activation=\"sigmoid\"),\n", + " ],\n", + " name=\"generator\",\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "generator.summary()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The adversarial network" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**The GAN `Model`**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "class GAN(keras.Model):\n", + " def __init__(self, discriminator, generator, latent_dim):\n", + " super().__init__()\n", + " self.discriminator = discriminator\n", + " self.generator = generator\n", + " self.latent_dim = latent_dim\n", + " self.d_loss_metric = keras.metrics.Mean(name=\"d_loss\")\n", + " self.g_loss_metric = keras.metrics.Mean(name=\"g_loss\")\n", + "\n", + " def compile(self, d_optimizer, g_optimizer, loss_fn):\n", + " super(GAN, self).compile()\n", + " self.d_optimizer = d_optimizer\n", + " self.g_optimizer = g_optimizer\n", + " self.loss_fn = loss_fn\n", + "\n", + " @property\n", + " def metrics(self):\n", + " return [self.d_loss_metric, self.g_loss_metric]\n", + "\n", + " def train_step(self, real_images):\n", + " batch_size = tf.shape(real_images)[0]\n", + " random_latent_vectors = tf.random.normal(\n", + " shape=(batch_size, self.latent_dim))\n", + " generated_images = self.generator(random_latent_vectors)\n", + " combined_images = tf.concat([generated_images, real_images], axis=0)\n", + " labels = tf.concat(\n", + " [tf.ones((batch_size, 1)), tf.zeros((batch_size, 1))],\n", + " axis=0\n", + " )\n", + " labels += 0.05 * tf.random.uniform(tf.shape(labels))\n", + "\n", + " with tf.GradientTape() as tape:\n", + " predictions = self.discriminator(combined_images)\n", + " d_loss = self.loss_fn(labels, predictions)\n", + " grads = tape.gradient(d_loss, self.discriminator.trainable_weights)\n", + " self.d_optimizer.apply_gradients(\n", + " zip(grads, self.discriminator.trainable_weights)\n", + " )\n", + "\n", + " random_latent_vectors = tf.random.normal(\n", + " shape=(batch_size, self.latent_dim))\n", + "\n", + " misleading_labels = tf.zeros((batch_size, 1))\n", + "\n", + " with tf.GradientTape() as tape:\n", + " predictions = self.discriminator(\n", + " self.generator(random_latent_vectors))\n", + " g_loss = self.loss_fn(misleading_labels, predictions)\n", + " grads = tape.gradient(g_loss, self.generator.trainable_weights)\n", + " self.g_optimizer.apply_gradients(\n", + " zip(grads, self.generator.trainable_weights))\n", + 
"\n", + " self.d_loss_metric.update_state(d_loss)\n", + " self.g_loss_metric.update_state(g_loss)\n", + " return {\"d_loss\": self.d_loss_metric.result(),\n", + " \"g_loss\": self.g_loss_metric.result()}" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**A callback that samples generated images during training**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "class GANMonitor(keras.callbacks.Callback):\n", + " def __init__(self, num_img=3, latent_dim=128):\n", + " self.num_img = num_img\n", + " self.latent_dim = latent_dim\n", + "\n", + " def on_epoch_end(self, epoch, logs=None):\n", + " random_latent_vectors = tf.random.normal(shape=(self.num_img, self.latent_dim))\n", + " generated_images = self.model.generator(random_latent_vectors)\n", + " generated_images *= 255\n", + " generated_images.numpy()\n", + " for i in range(self.num_img):\n", + " img = keras.utils.array_to_img(generated_images[i])\n", + " img.save(f\"generated_img_{epoch:03d}_{i}.png\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Compiling and training the GAN**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "epochs = 100\n", + "\n", + "gan = GAN(discriminator=discriminator, generator=generator, latent_dim=latent_dim)\n", + "gan.compile(\n", + " d_optimizer=keras.optimizers.Adam(learning_rate=0.0001),\n", + " g_optimizer=keras.optimizers.Adam(learning_rate=0.0001),\n", + " loss_fn=keras.losses.BinaryCrossentropy(),\n", + ")\n", + "\n", + "gan.fit(\n", + " dataset, epochs=epochs, callbacks=[GANMonitor(num_img=10, latent_dim=latent_dim)]\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Wrapping up" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Summary" + ] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [], + "name": "chapter12_part05_gans.i", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} diff --git a/second_edition/chapter13_best-practices-for-the-real-world.ipynb b/second_edition/chapter13_best-practices-for-the-real-world.ipynb new file mode 100644 index 0000000000..1d4b3b28c6 --- /dev/null +++ b/second_edition/chapter13_best-practices-for-the-real-world.ipynb @@ -0,0 +1,466 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). 
For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "# Best practices for the real world" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Getting the most out of your models" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Hyperparameter optimization" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Using KerasTuner" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "!pip install keras-tuner -q" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**A KerasTuner model-building function**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow import keras\n", + "from tensorflow.keras import layers\n", + "\n", + "def build_model(hp):\n", + " units = hp.Int(name=\"units\", min_value=16, max_value=64, step=16)\n", + " model = keras.Sequential([\n", + " layers.Dense(units, activation=\"relu\"),\n", + " layers.Dense(10, activation=\"softmax\")\n", + " ])\n", + " optimizer = hp.Choice(name=\"optimizer\", values=[\"rmsprop\", \"adam\"])\n", + " model.compile(\n", + " optimizer=optimizer,\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + " return model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**A KerasTuner `HyperModel`**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "import kerastuner as kt\n", + "\n", + "class SimpleMLP(kt.HyperModel):\n", + " def __init__(self, num_classes):\n", + " self.num_classes = num_classes\n", + "\n", + " def build(self, hp):\n", + " units = hp.Int(name=\"units\", min_value=16, max_value=64, step=16)\n", + " model = keras.Sequential([\n", + " layers.Dense(units, activation=\"relu\"),\n", + " layers.Dense(self.num_classes, activation=\"softmax\")\n", + " ])\n", + " optimizer = hp.Choice(name=\"optimizer\", values=[\"rmsprop\", \"adam\"])\n", + " model.compile(\n", + " optimizer=optimizer,\n", + " loss=\"sparse_categorical_crossentropy\",\n", + " metrics=[\"accuracy\"])\n", + " return model\n", + "\n", + "hypermodel = SimpleMLP(num_classes=10)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "tuner = kt.BayesianOptimization(\n", + " build_model,\n", + " objective=\"val_accuracy\",\n", + " max_trials=100,\n", + " executions_per_trial=2,\n", + " directory=\"mnist_kt_test\",\n", + " overwrite=True,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "tuner.search_space_summary()" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "(x_train, 
y_train), (x_test, y_test) = keras.datasets.mnist.load_data()\n", + "x_train = x_train.reshape((-1, 28 * 28)).astype(\"float32\") / 255\n", + "x_test = x_test.reshape((-1, 28 * 28)).astype(\"float32\") / 255\n", + "x_train_full = x_train[:]\n", + "y_train_full = y_train[:]\n", + "num_val_samples = 10000\n", + "x_train, x_val = x_train[:-num_val_samples], x_train[-num_val_samples:]\n", + "y_train, y_val = y_train[:-num_val_samples], y_train[-num_val_samples:]\n", + "callbacks = [\n", + " keras.callbacks.EarlyStopping(monitor=\"val_loss\", patience=5),\n", + "]\n", + "tuner.search(\n", + " x_train, y_train,\n", + " batch_size=128,\n", + " epochs=100,\n", + " validation_data=(x_val, y_val),\n", + " callbacks=callbacks,\n", + " verbose=2,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "**Querying the best hyperparameter configurations**" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "top_n = 4\n", + "best_hps = tuner.get_best_hyperparameters(top_n)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def get_best_epoch(hp):\n", + " model = build_model(hp)\n", + " callbacks=[\n", + " keras.callbacks.EarlyStopping(\n", + " monitor=\"val_loss\", mode=\"min\", patience=10)\n", + " ]\n", + " history = model.fit(\n", + " x_train, y_train,\n", + " validation_data=(x_val, y_val),\n", + " epochs=100,\n", + " batch_size=128,\n", + " callbacks=callbacks)\n", + " val_loss_per_epoch = history.history[\"val_loss\"]\n", + " best_epoch = val_loss_per_epoch.index(min(val_loss_per_epoch)) + 1\n", + " print(f\"Best epoch: {best_epoch}\")\n", + " return best_epoch" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "def get_best_trained_model(hp):\n", + " best_epoch = get_best_epoch(hp)\n", + " model = build_model(hp)\n", + " model.fit(\n", + " x_train_full, y_train_full,\n", + " batch_size=128, epochs=int(best_epoch * 1.2))\n", + " return model\n", + "\n", + "best_models = []\n", + "for hp in best_hps:\n", + " model = get_best_trained_model(hp)\n", + " model.evaluate(x_test, y_test)\n", + " best_models.append(model)" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "best_models = tuner.get_best_models(top_n)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The art of crafting the right search space" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### The future of hyperparameter tuning: automated machine learning" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Model ensembling" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Scaling-up model training" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Speeding up training on GPU with mixed precision" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Understanding floating-point precision" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + 
"import tensorflow as tf\n", + "import numpy as np\n", + "np_array = np.zeros((2, 2))\n", + "tf_tensor = tf.convert_to_tensor(np_array)\n", + "tf_tensor.dtype" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "np_array = np.zeros((2, 2))\n", + "tf_tensor = tf.convert_to_tensor(np_array, dtype=\"float32\")\n", + "tf_tensor.dtype" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Mixed-precision training in practice" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow import keras\n", + "keras.mixed_precision.set_global_policy(\"mixed_float16\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Multi-GPU training" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Getting your hands on two or more GPUs" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Single-host, multi-device synchronous training" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### TPU training" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Using a TPU via Google Colab" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Leveraging step fusing to improve TPU utilization" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Summary" + ] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [], + "name": "chapter13_best-practices-for-the-real-world.i", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/second_edition/chapter14_conclusions.ipynb b/second_edition/chapter14_conclusions.ipynb new file mode 100644 index 0000000000..e8ce1e0b57 --- /dev/null +++ b/second_edition/chapter14_conclusions.ipynb @@ -0,0 +1,568 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.\n\n**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**\n\nThis notebook was generated for TensorFlow 2.6." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "# Conclusions" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Key concepts in review" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Various approaches to AI" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### What makes deep learning special within the field of machine learning" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### How to think about deep learning" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Key enabling technologies" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The universal machine-learning workflow" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Key network architectures" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Densely connected networks" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "from tensorflow import keras\n", + "from tensorflow.keras\u00a0import\u00a0layers\n", + "inputs = keras.Input(shape=(num_input_features,))\n", + "x = layers.Dense(32,\u00a0activation=\"relu\")(inputs)\n", + "x = layers.Dense(32,\u00a0activation=\"relu\")(x)\n", + "outputs = layers.Dense(1,\u00a0activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs, outputs)\n", + "model.compile(optimizer=\"rmsprop\",\u00a0loss=\"binary_crossentropy\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(num_input_features,))\n", + "x = layers.Dense(32,\u00a0activation=\"relu\")(inputs)\n", + "x = layers.Dense(32,\u00a0activation=\"relu\")(x)\n", + "outputs = layers.Dense(num_classes,\u00a0activation=\"softmax\")(x)\n", + "model = keras.Model(inputs, outputs)\n", + "model.compile(optimizer=\"rmsprop\",\u00a0loss=\"categorical_crossentropy\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(num_input_features,))\n", + "x = layers.Dense(32,\u00a0activation=\"relu\")(inputs)\n", + "x = layers.Dense(32,\u00a0activation=\"relu\")(x)\n", + "outputs = layers.Dense(num_classes,\u00a0activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs, outputs)\n", + "model.compile(optimizer=\"rmsprop\",\u00a0loss=\"binary_crossentropy\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(num_input_features,))\n", + "x = layers.Dense(32,\u00a0activation=\"relu\")(inputs)\n", + "x = layers.Dense(32,\u00a0activation=\"relu\")(x)\n", + "outputs layers.Dense(num_values)(x)\n", + "model = keras.Model(inputs, outputs)\n", + "model.compile(optimizer=\"rmsprop\",\u00a0loss=\"mse\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Convnets" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs 
= keras.Input(shape=(height,\u00a0width,\u00a0channels))\n", + "x = layers.SeparableConv2D(32,\u00a03,\u00a0activation=\"relu\")(inputs)\n", + "x = layers.SeparableConv2D(64,\u00a03,\u00a0activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(2)(x)\n", + "x = layers.SeparableConv2D(64,\u00a03,\u00a0activation=\"relu\")(x)\n", + "x = layers.SeparableConv2D(128,\u00a03,\u00a0activation=\"relu\")(x)\n", + "x = layers.MaxPooling2D(2)(x)\n", + "x = layers.SeparableConv2D(64,\u00a03,\u00a0activation=\"relu\")(x)\n", + "x = layers.SeparableConv2D(128,\u00a03,\u00a0activation=\"relu\")(x)\n", + "x = layers.GlobalAveragePooling2D()(x)\n", + "x = layers.Dense(32,\u00a0activation=\"relu\")(x)\n", + "outputs = layers.Dense(num_classes,\u00a0activation=\"softmax\")(x)\n", + "model = keras.Model(inputs, outputs)\n", + "model.compile(optimizer=\"rmsprop\",\u00a0loss=\"categorical_crossentropy\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### RNNs" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(num_timesteps,\u00a0num_features))\n", + "x = layers.LSTM(32)(inputs)\n", + "outputs = layers.Dense(num_classes,\u00a0activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs, outputs)\n", + "model.compile(optimizer=\"rmsprop\",\u00a0loss=\"binary_crossentropy\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(num_timesteps,\u00a0num_features))\n", + "x = layers.LSTM(32,\u00a0return_sequences=True)(inputs)\n", + "x = layers.LSTM(32,\u00a0return_sequences=True)(x)\n", + "x = layers.LSTM(32)(x)\n", + "outputs = layers.Dense(num_classes,\u00a0activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs, outputs)\n", + "model.compile(optimizer=\"rmsprop\",\u00a0loss=\"binary_crossentropy\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Transformers" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "encoder_inputs = keras.Input(shape=(sequence_length,), dtype=\"int64\")\n", + "x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(encoder_inputs)\n", + "encoder_outputs = TransformerEncoder(embed_dim, dense_dim, num_heads)(x)\n", + "decoder_inputs = keras.Input(shape=(None,), dtype=\"int64\")\n", + "x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(decoder_inputs)\n", + "x = TransformerDecoder(embed_dim, dense_dim, num_heads)(x, encoder_outputs)\n", + "decoder_outputs = layers.Dense(vocab_size, activation=\"softmax\")(x)\n", + "transformer = keras.Model([encoder_inputs, decoder_inputs], decoder_outputs)\n", + "transformer.compile(optimizer=\"rmsprop\", loss=\"categorical_crossentropy\")" + ] + }, + { + "cell_type": "code", + "execution_count": 0, + "metadata": { + "colab_type": "code" + }, + "outputs": [], + "source": [ + "inputs = keras.Input(shape=(sequence_length,), dtype=\"int64\")\n", + "x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(inputs)\n", + "x = TransformerEncoder(embed_dim, dense_dim, num_heads)(x)\n", + "x = layers.GlobalMaxPooling1D()(x)\n", + "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", + "model = keras.Model(inputs, outputs)\n", + "model.compile(optimizer=\"rmsprop\", loss=\"binary_crossentropy\")" + ] + }, + 
{ + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The space of possibilities" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## The limitations of deep learning" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The risk of anthropomorphizing machine-learning models" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Automatons vs. intelligent agents" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Local generalization vs. extreme generalization" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The purpose of intelligence" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Climbing the spectrum of generalization" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Setting the course toward greater generality in AI" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### On the importance of setting the right objective: The shortcut rule" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### A new target" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Implementing intelligence: The missing ingredients" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Intelligence as sensitivity to abstract analogies" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The two poles of abstraction" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Value-centric analogy" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Program-centric analogy" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Cognition as a combination of both kinds of abstraction" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The missing half of the picture" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## The future of deep learning" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Models as programs" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Blending together deep learning and program synthesis" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Integrating deep-learning modules and algorithmic modules into hybrid systems" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "#### Using deep learning to guide program search" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Lifelong learning and modular subroutine reuse" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### The long-term vision" + ] + }, + { + "cell_type": "markdown", + "metadata": { + 
"colab_type": "text" + }, + "source": [ + "## Staying up to date in a fast-moving field" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Practice on real-world problems using Kaggle" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Read about the latest developments on arXiv" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "### Explore the Keras ecosystem" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "colab_type": "text" + }, + "source": [ + "## Final words" + ] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [], + "name": "chapter14_conclusions.i", + "private_outputs": false, + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.0" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file