{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 0
   },
   "source": [
    "# Data Manipulation\n",
    ":label:`sec_ndarray`\n",
    "\n",
    "In order to get anything done, we need some way to store and manipulate data.\n",
    "Generally, there are two important things we need to do with data: (i) acquire\n",
    "them; and (ii) process them once they are inside the computer.  There is no\n",
    "point in acquiring data without some way to store it, so let us get our hands\n",
    "dirty first by playing with synthetic data.  To start, we introduce the\n",
    "$n$-dimensional array, which is also called the *tensor*.\n",
    "\n",
    "If you have worked with NumPy, the most widely-used\n",
    "scientific computing package in Python,\n",
    "then you will find this section familiar.\n",
    "No matter which framework you use,\n",
    "its *tensor class* (`ndarray` in MXNet,\n",
    "`Tensor` in both PyTorch and TensorFlow) is similar to NumPy's `ndarray` with\n",
    "a few killer features.\n",
    "First, GPU is well-supported to accelerate the computation\n",
    "whereas NumPy only supports CPU computation.\n",
    "Second, the tensor class\n",
    "supports automatic differentiation.\n",
    "These properties make the tensor class suitable for deep learning.\n",
    "Throughout the book, when we say tensors,\n",
    "we are referring to instances of the tensor class unless otherwise stated.\n",
    "\n",
    "## Getting Started\n",
    "\n",
    "In this section, we aim to get you up and running,\n",
    "equipping you with the basic math and numerical computing tools\n",
    "that you will build on as you progress through the book.\n",
    "Do not worry if you struggle to grok some of\n",
    "the mathematical concepts or library functions.\n",
    "The following sections will revisit this material\n",
    "in the context of practical examples and it will sink in.\n",
    "On the other hand, if you already have some background\n",
    "and want to go deeper into the mathematical content, just skip this section.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 3,
    "tab": [
     "tensorflow"
    ]
   },
   "source": [
    "To start, we import `tensorflow`. As the name is a little long, we often import\n",
    "it with a short alias `tf`.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "origin_pos": 6,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [],
   "source": [
    "import tensorflow as tf"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 7
   },
   "source": [
    "[**A tensor represents a (possibly multi-dimensional) array of numerical values.**]\n",
    "With one axis, a tensor is called a *vector*.\n",
    "With two axes, a tensor is called a *matrix*.\n",
    "With $k > 2$ axes, we drop the specialized names\n",
    "and just refer to the object as a $k^\\mathrm{th}$ *order tensor*.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 10,
    "tab": [
     "tensorflow"
    ]
   },
   "source": [
    "TensorFlow provides a variety of functions \n",
    "for creating new tensors \n",
    "prepopulated with values. \n",
    "For example, by invoking `range(n)`,\n",
    "we can create a vector of evenly spaced values,\n",
    "starting at 0 (included) \n",
    "and ending at `n` (not included).\n",
    "By default, the interval size is $1$.\n",
    "Unless otherwise specified, \n",
    "new tensors are stored in main memory \n",
    "and designated for CPU-based computation.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "origin_pos": 13,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<tf.Tensor: shape=(12,), dtype=float32, numpy=\n",
       "array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11.],\n",
       "      dtype=float32)>"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x = tf.range(12, dtype=tf.float32)\n",
    "x"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 14
   },
   "source": [
    "(**We can access a tensor's *shape***) (~~and the total number of elements~~) (the length along each axis)\n",
    "by inspecting its `shape` property.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "origin_pos": 15,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "TensorShape([12])"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 16
   },
   "source": [
    "If we just want to know the total number of elements in a tensor,\n",
    "i.e., the product of all of the shape elements,\n",
    "we can inspect its size.\n",
    "Because we are dealing with a vector here,\n",
    "the single element of its `shape` is identical to its size.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "origin_pos": 19,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<tf.Tensor: shape=(), dtype=int32, numpy=12>"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tf.size(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 20
   },
   "source": [
    "To [**change the shape of a tensor without altering\n",
    "either the number of elements or their values**],\n",
    "we can invoke the `reshape` function.\n",
    "For example, we can transform our tensor, `x`,\n",
    "from a row vector with shape (12,) to a matrix with shape (3, 4).\n",
    "This new tensor contains the exact same values,\n",
    "but views them as a matrix organized as 3 rows and 4 columns.\n",
    "To reiterate, although the shape has changed,\n",
    "the elements have not.\n",
    "Note that the size is unaltered by reshaping.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "origin_pos": 22,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<tf.Tensor: shape=(3, 4), dtype=float32, numpy=\n",
       "array([[ 0.,  1.,  2.,  3.],\n",
       "       [ 4.,  5.,  6.,  7.],\n",
       "       [ 8.,  9., 10., 11.]], dtype=float32)>"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X = tf.reshape(x, (3, 4))\n",
    "X"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 23
   },
   "source": [
    "Reshaping by manually specifying every dimension is unnecessary.\n",
    "If our target shape is a matrix with shape (height, width),\n",
    "then after we know the width, the height is given implicitly.\n",
    "Why should we have to perform the division ourselves?\n",
    "In the example above, to get a matrix with 3 rows,\n",
    "we specified both that it should have 3 rows and 4 columns.\n",
    "Fortunately, tensors can automatically work out one dimension given the rest.\n",
    "We invoke this capability by placing `-1` for the dimension\n",
    "that we would like tensors to automatically infer.\n",
    "In our case, instead of calling `x.reshape(3, 4)`,\n",
    "we could have equivalently called `x.reshape(-1, 4)` or `x.reshape(3, -1)`.\n",
    "\n",
    "Typically, we will want our matrices initialized\n",
    "either with zeros, ones, some other constants,\n",
    "or numbers randomly sampled from a specific distribution.\n",
    "[**We can create a tensor representing a tensor with all elements\n",
    "set to 0**] (~~or 1~~)\n",
    "and a shape of (2, 3, 4) as follows:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "origin_pos": 26,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<tf.Tensor: shape=(2, 3, 4), dtype=float32, numpy=\n",
       "array([[[0., 0., 0., 0.],\n",
       "        [0., 0., 0., 0.],\n",
       "        [0., 0., 0., 0.]],\n",
       "\n",
       "       [[0., 0., 0., 0.],\n",
       "        [0., 0., 0., 0.],\n",
       "        [0., 0., 0., 0.]]], dtype=float32)>"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tf.zeros((2, 3, 4))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 27
   },
   "source": [
    "Similarly, we can create tensors with each element set to 1 as follows:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "origin_pos": 30,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<tf.Tensor: shape=(2, 3, 4), dtype=float32, numpy=\n",
       "array([[[1., 1., 1., 1.],\n",
       "        [1., 1., 1., 1.],\n",
       "        [1., 1., 1., 1.]],\n",
       "\n",
       "       [[1., 1., 1., 1.],\n",
       "        [1., 1., 1., 1.],\n",
       "        [1., 1., 1., 1.]]], dtype=float32)>"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tf.ones((2, 3, 4))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 31
   },
   "source": [
    "Often, we want to [**randomly sample the values\n",
    "for each element in a tensor**]\n",
    "from some probability distribution.\n",
    "For example, when we construct arrays to serve\n",
    "as parameters in a neural network, we will\n",
    "typically initialize their values randomly.\n",
    "The following snippet creates a tensor with shape (3, 4).\n",
    "Each of its elements is randomly sampled\n",
    "from a standard Gaussian (normal) distribution\n",
    "with a mean of 0 and a standard deviation of 1.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "origin_pos": 34,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<tf.Tensor: shape=(3, 4), dtype=float32, numpy=\n",
       "array([[ 0.53423303, -0.20285548,  0.3730694 , -0.41860604],\n",
       "       [-0.8485273 ,  0.7166412 ,  1.0269386 , -0.12489712],\n",
       "       [ 0.19665839, -0.77016014, -0.32350102, -1.9406134 ]],\n",
       "      dtype=float32)>"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tf.random.normal(shape=[3, 4])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 35
   },
   "source": [
    "We can also [**specify the exact values for each element**] in the desired tensor\n",
    "by supplying a Python list (or list of lists) containing the numerical values.\n",
    "Here, the outermost list corresponds to axis 0, and the inner list to axis 1.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "origin_pos": 38,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<tf.Tensor: shape=(3, 4), dtype=int32, numpy=\n",
       "array([[2, 1, 4, 3],\n",
       "       [1, 2, 3, 4],\n",
       "       [4, 3, 2, 1]], dtype=int32)>"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tf.constant([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 39
   },
   "source": [
    "## Operations\n",
    "\n",
    "This book is not about software engineering.\n",
    "Our interests are not limited to simply\n",
    "reading and writing data from/to arrays.\n",
    "We want to perform mathematical operations on those arrays.\n",
    "Some of the simplest and most useful operations\n",
    "are the *elementwise* operations.\n",
    "These apply a standard scalar operation\n",
    "to each element of an array.\n",
    "For functions that take two arrays as inputs,\n",
    "elementwise operations apply some standard binary operator\n",
    "on each pair of corresponding elements from the two arrays.\n",
    "We can create an elementwise function from any function\n",
    "that maps from a scalar to a scalar.\n",
    "\n",
    "In mathematical notation, we would denote such\n",
    "a *unary* scalar operator (taking one input)\n",
    "by the signature $f: \\mathbb{R} \\rightarrow \\mathbb{R}$.\n",
    "This just means that the function is mapping\n",
    "from any real number ($\\mathbb{R}$) onto another.\n",
    "Likewise, we denote a *binary* scalar operator\n",
    "(taking two real inputs, and yielding one output)\n",
    "by the signature $f: \\mathbb{R}, \\mathbb{R} \\rightarrow \\mathbb{R}$.\n",
    "Given any two vectors $\\mathbf{u}$ and $\\mathbf{v}$ *of the same shape*,\n",
    "and a binary operator $f$, we can produce a vector\n",
    "$\\mathbf{c} = F(\\mathbf{u},\\mathbf{v})$\n",
    "by setting $c_i \\gets f(u_i, v_i)$ for all $i$,\n",
    "where $c_i, u_i$, and $v_i$ are the $i^\\mathrm{th}$ elements\n",
    "of vectors $\\mathbf{c}, \\mathbf{u}$, and $\\mathbf{v}$.\n",
    "Here, we produced the vector-valued\n",
    "$F: \\mathbb{R}^d, \\mathbb{R}^d \\rightarrow \\mathbb{R}^d$\n",
    "by *lifting* the scalar function to an elementwise vector operation.\n",
    "\n",
    "The common standard arithmetic operators\n",
    "(`+`, `-`, `*`, `/`, and `**`)\n",
    "have all been *lifted* to elementwise operations\n",
    "for any identically-shaped tensors of arbitrary shape.\n",
    "We can call elementwise operations on any two tensors of the same shape.\n",
    "In the following example, we use commas to formulate a 5-element tuple,\n",
    "where each element is the result of an elementwise operation.\n",
    "\n",
    "### Operations\n",
    "\n",
    "[**The common standard arithmetic operators\n",
    "(`+`, `-`, `*`, `/`, and `**`)\n",
    "have all been *lifted* to elementwise operations.**]\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "origin_pos": 42,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(<tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 3.,  4.,  6., 10.], dtype=float32)>,\n",
       " <tf.Tensor: shape=(4,), dtype=float32, numpy=array([-1.,  0.,  2.,  6.], dtype=float32)>,\n",
       " <tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 2.,  4.,  8., 16.], dtype=float32)>,\n",
       " <tf.Tensor: shape=(4,), dtype=float32, numpy=array([0.5, 1. , 2. , 4. ], dtype=float32)>,\n",
       " <tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 1.,  4., 16., 64.], dtype=float32)>)"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x = tf.constant([1.0, 2, 4, 8])\n",
    "y = tf.constant([2.0, 2, 2, 2])\n",
    "x + y, x - y, x * y, x / y, x ** y  # The ** operator is exponentiation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 43
   },
   "source": [
    "Many (**more operations can be applied elementwise**),\n",
    "including unary operators like exponentiation.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "origin_pos": 46,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<tf.Tensor: shape=(4,), dtype=float32, numpy=\n",
       "array([2.7182817e+00, 7.3890562e+00, 5.4598148e+01, 2.9809580e+03],\n",
       "      dtype=float32)>"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tf.exp(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 47
   },
   "source": [
    "In addition to elementwise computations,\n",
    "we can also perform linear algebra operations,\n",
    "including vector dot products and matrix multiplication.\n",
    "We will explain the crucial bits of linear algebra\n",
    "(with no assumed prior knowledge) in :numref:`sec_linear-algebra`.\n",
    "\n",
    "We can also [***concatenate* multiple tensors together,**]\n",
    "stacking them end-to-end to form a larger tensor.\n",
    "We just need to provide a list of tensors\n",
    "and tell the system along which axis to concatenate.\n",
    "The example below shows what happens when we concatenate\n",
    "two matrices along rows (axis 0, the first element of the shape)\n",
    "vs. columns (axis 1, the second element of the shape).\n",
    "We can see that the first output tensor's axis-0 length ($6$)\n",
    "is the sum of the two input tensors' axis-0 lengths ($3 + 3$);\n",
    "while the second output tensor's axis-1 length ($8$)\n",
    "is the sum of the two input tensors' axis-1 lengths ($4 + 4$).\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "origin_pos": 50,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(<tf.Tensor: shape=(6, 4), dtype=float32, numpy=\n",
       " array([[ 0.,  1.,  2.,  3.],\n",
       "        [ 4.,  5.,  6.,  7.],\n",
       "        [ 8.,  9., 10., 11.],\n",
       "        [ 2.,  1.,  4.,  3.],\n",
       "        [ 1.,  2.,  3.,  4.],\n",
       "        [ 4.,  3.,  2.,  1.]], dtype=float32)>,\n",
       " <tf.Tensor: shape=(3, 8), dtype=float32, numpy=\n",
       " array([[ 0.,  1.,  2.,  3.,  2.,  1.,  4.,  3.],\n",
       "        [ 4.,  5.,  6.,  7.,  1.,  2.,  3.,  4.],\n",
       "        [ 8.,  9., 10., 11.,  4.,  3.,  2.,  1.]], dtype=float32)>)"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X = tf.reshape(tf.range(12, dtype=tf.float32), (3, 4))\n",
    "Y = tf.constant([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])\n",
    "tf.concat([X, Y], axis=0), tf.concat([X, Y], axis=1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 51
   },
   "source": [
    "Sometimes, we want to [**construct a binary tensor via *logical statements*.**]\n",
    "Take `X == Y` as an example.\n",
    "For each position, if `X` and `Y` are equal at that position,\n",
    "the corresponding entry in the new tensor takes a value of 1,\n",
    "meaning that the logical statement `X == Y` is true at that position;\n",
    "otherwise that position takes 0.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "origin_pos": 52,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<tf.Tensor: shape=(3, 4), dtype=bool, numpy=\n",
       "array([[False,  True, False,  True],\n",
       "       [False, False, False, False],\n",
       "       [False, False, False, False]])>"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X == Y"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 53
   },
   "source": [
    "[**Summing all the elements in the tensor**] yields a tensor with only one element.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "origin_pos": 55,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<tf.Tensor: shape=(), dtype=float32, numpy=66.0>"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tf.reduce_sum(X)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 56
   },
   "source": [
    "## Broadcasting Mechanism\n",
    ":label:`subsec_broadcasting`\n",
    "\n",
    "In the above section, we saw how to perform elementwise operations\n",
    "on two tensors of the same shape. Under certain conditions,\n",
    "even when shapes differ, we can still [**perform elementwise operations\n",
    "by invoking the *broadcasting mechanism*.**]\n",
    "This mechanism works in the following way:\n",
    "First, expand one or both arrays\n",
    "by copying elements appropriately\n",
    "so that after this transformation,\n",
    "the two tensors have the same shape.\n",
    "Second, carry out the elementwise operations\n",
    "on the resulting arrays.\n",
    "\n",
    "In most cases, we broadcast along an axis where an array\n",
    "initially only has length 1, such as in the following example:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "origin_pos": 59,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(<tf.Tensor: shape=(3, 1), dtype=int32, numpy=\n",
       " array([[0],\n",
       "        [1],\n",
       "        [2]], dtype=int32)>,\n",
       " <tf.Tensor: shape=(1, 2), dtype=int32, numpy=array([[0, 1]], dtype=int32)>)"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a = tf.reshape(tf.range(3), (3, 1))\n",
    "b = tf.reshape(tf.range(2), (1, 2))\n",
    "a, b"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 60
   },
   "source": [
    "Since `a` and `b` are $3\\times1$ and $1\\times2$ matrices respectively,\n",
    "their shapes do not match up if we want to add them.\n",
    "We *broadcast* the entries of both matrices into a larger $3\\times2$ matrix as follows:\n",
    "for matrix `a` it replicates the columns\n",
    "and for matrix `b` it replicates the rows\n",
    "before adding up both elementwise.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "origin_pos": 61,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<tf.Tensor: shape=(3, 2), dtype=int32, numpy=\n",
       "array([[0, 1],\n",
       "       [1, 2],\n",
       "       [2, 3]], dtype=int32)>"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a + b"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 62
   },
   "source": [
    "## Indexing and Slicing\n",
    "\n",
    "Just as in any other Python array, elements in a tensor can be accessed by index.\n",
    "As in any Python array, the first element has index 0\n",
    "and ranges are specified to include the first but *before* the last element.\n",
    "As in standard Python lists, we can access elements\n",
    "according to their relative position to the end of the list\n",
    "by using negative indices.\n",
    "\n",
    "Thus, [**`[-1]` selects the last element and `[1:3]`\n",
    "selects the second and the third elements**] as follows:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "origin_pos": 63,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(<tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 8.,  9., 10., 11.], dtype=float32)>,\n",
       " <tf.Tensor: shape=(2, 4), dtype=float32, numpy=\n",
       " array([[ 4.,  5.,  6.,  7.],\n",
       "        [ 8.,  9., 10., 11.]], dtype=float32)>)"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X[-1], X[1:3]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 65,
    "tab": [
     "tensorflow"
    ]
   },
   "source": [
    "`Tensors` in TensorFlow are immutable, and cannot be assigned to.\n",
    "`Variables` in TensorFlow are mutable containers of state that support\n",
    "assignments. Keep in mind that gradients in TensorFlow do not flow backwards\n",
    "through `Variable` assignments.\n",
    "\n",
    "Beyond assigning a value to the entire `Variable`, we can write elements of a\n",
    "`Variable` by specifying indices.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "origin_pos": 67,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<tf.Variable 'Variable:0' shape=(3, 4) dtype=float32, numpy=\n",
       "array([[ 0.,  1.,  2.,  3.],\n",
       "       [ 4.,  5.,  9.,  7.],\n",
       "       [ 8.,  9., 10., 11.]], dtype=float32)>"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X_var = tf.Variable(X)\n",
    "X_var[1, 2].assign(9)\n",
    "X_var"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 68
   },
   "source": [
    "If we want [**to assign multiple elements the same value,\n",
    "we simply index all of them and then assign them the value.**]\n",
    "For instance, `[0:2, :]` accesses the first and second rows,\n",
    "where `:` takes all the elements along axis 1 (column).\n",
    "While we discussed indexing for matrices,\n",
    "this obviously also works for vectors\n",
    "and for tensors of more than 2 dimensions.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "origin_pos": 70,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<tf.Variable 'Variable:0' shape=(3, 4) dtype=float32, numpy=\n",
       "array([[12., 12., 12., 12.],\n",
       "       [12., 12., 12., 12.],\n",
       "       [ 8.,  9., 10., 11.]], dtype=float32)>"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X_var = tf.Variable(X)\n",
    "X_var[0:2, :].assign(tf.ones(X_var[0:2,:].shape, dtype = tf.float32) * 12)\n",
    "X_var"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 71
   },
   "source": [
    "## Saving Memory\n",
    "\n",
    "[**Running operations can cause new memory to be\n",
    "allocated to host results.**]\n",
    "For example, if we write `Y = X + Y`,\n",
    "we will dereference the tensor that `Y` used to point to\n",
    "and instead point `Y` at the newly allocated memory.\n",
    "In the following example, we demonstrate this with Python's `id()` function,\n",
    "which gives us the exact address of the referenced object in memory.\n",
    "After running `Y = Y + X`, we will find that `id(Y)` points to a different location.\n",
    "That is because Python first evaluates `Y + X`,\n",
    "allocating new memory for the result and then makes `Y`\n",
    "point to this new location in memory.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "origin_pos": 72,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "False"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "before = id(Y)\n",
    "Y = Y + X\n",
    "id(Y) == before"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 73
   },
   "source": [
    "This might be undesirable for two reasons.\n",
    "First, we do not want to run around\n",
    "allocating memory unnecessarily all the time.\n",
    "In machine learning, we might have\n",
    "hundreds of megabytes of parameters\n",
    "and update all of them multiple times per second.\n",
    "Typically, we will want to perform these updates *in place*.\n",
    "Second, we might point at the same parameters from multiple variables.\n",
    "If we do not update in place, other references will still point to\n",
    "the old memory location, making it possible for parts of our code\n",
    "to inadvertently reference stale parameters.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 75,
    "tab": [
     "tensorflow"
    ]
   },
   "source": [
    "`Variables` are mutable containers of state in TensorFlow. They provide\n",
    "a way to store your model parameters.\n",
    "We can assign the result of an operation\n",
    "to a `Variable` with `assign`.\n",
    "To illustrate this concept, we create a `Variable` `Z`\n",
    "with the same shape as another tensor `Y`,\n",
    "using `zeros_like` to allocate a block of $0$ entries.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "origin_pos": 78,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "id(Z): 139874995380176\n",
      "id(Z): 139874995380176\n"
     ]
    }
   ],
   "source": [
    "Z = tf.Variable(tf.zeros_like(Y))\n",
    "print('id(Z):', id(Z))\n",
    "Z.assign(X + Y)\n",
    "print('id(Z):', id(Z))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 80,
    "tab": [
     "tensorflow"
    ]
   },
   "source": [
    "Even once you store state persistently in a `Variable`, you\n",
    "may want to reduce your memory usage further by avoiding excess\n",
    "allocations for tensors that are not your model parameters.\n",
    "\n",
    "Because TensorFlow `Tensors` are immutable and gradients do not flow through\n",
    "`Variable` assignments, TensorFlow does not provide an explicit way to run\n",
    "an individual operation in-place.\n",
    "\n",
    "However, TensorFlow provides the `tf.function` decorator to wrap computation\n",
    "inside of a TensorFlow graph that gets compiled and optimized before running.\n",
    "This allows TensorFlow to prune unused values, and to re-use\n",
    "prior allocations that are no longer needed. This minimizes the memory\n",
    "overhead of TensorFlow computations.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "origin_pos": 82,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<tf.Tensor: shape=(3, 4), dtype=float32, numpy=\n",
       "array([[ 8.,  9., 26., 27.],\n",
       "       [24., 33., 42., 51.],\n",
       "       [56., 57., 58., 59.]], dtype=float32)>"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "@tf.function\n",
    "def computation(X, Y):\n",
    "    Z = tf.zeros_like(Y)  # This unused value will be pruned out\n",
    "    A = X + Y  # Allocations will be re-used when no longer needed\n",
    "    B = A + Y\n",
    "    C = B + Y\n",
    "    return C + Y\n",
    "\n",
    "computation(X, Y)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 83
   },
   "source": [
    "## Conversion to Other Python Objects\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 84,
    "tab": [
     "tensorflow"
    ]
   },
   "source": [
    "[**Converting to a NumPy tensor (`ndarray`)**], or vice versa, is easy.\n",
    "The converted result does not share memory.\n",
    "This minor inconvenience is actually quite important:\n",
    "when you perform operations on the CPU or on GPUs,\n",
    "you do not want to halt computation, waiting to see\n",
    "whether the NumPy package of Python might want to be doing something else\n",
    "with the same chunk of memory.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "origin_pos": 88,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(numpy.ndarray, tensorflow.python.framework.ops.EagerTensor)"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "A = X.numpy()\n",
    "B = tf.constant(A)\n",
    "type(A), type(B)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 89
   },
   "source": [
    "To (**convert a size-1 tensor to a Python scalar**),\n",
    "we can invoke the `item` function or Python's built-in functions.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "origin_pos": 92,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(array([3.5], dtype=float32), 3.5, 3.5, 3)"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a = tf.constant([3.5]).numpy()\n",
    "a, a.item(), float(a), int(a)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 93
   },
   "source": [
    "## Summary\n",
    "\n",
    "* The main interface to store and manipulate data for deep learning is the tensor ($n$-dimensional array). It provides a variety of functionalities including basic mathematics operations, broadcasting, indexing, slicing, memory saving, and conversion to other Python objects.\n",
    "\n",
    "\n",
    "## Exercises\n",
    "\n",
    "1. Run the code in this section. Change the conditional statement `X == Y` in this section to `X < Y` or `X > Y`, and then see what kind of tensor you can get.\n",
    "1. Replace the two tensors that operate by element in the broadcasting mechanism with other shapes, e.g., 3-dimensional tensors. Is the result the same as expected?\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 96,
    "tab": [
     "tensorflow"
    ]
   },
   "source": [
    "[Discussions](https://discuss.d2l.ai/t/187)\n"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}