{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 0
   },
   "source": [
    "# Convexity\n",
    ":label:`sec_convexity`\n",
    "\n",
    "Convexity plays a vital role in the design of optimization algorithms. \n",
    "This is largely due to the fact that it is much easier to analyze and test algorithms in such a context. \n",
    "In other words,\n",
    "if the algorithm performs poorly even in the convex setting,\n",
    "typically we should not hope to see great results otherwise. \n",
    "Furthermore, even though the optimization problems in deep learning are generally nonconvex, they often exhibit some properties of convex ones near local minima. This can lead to exciting new optimization variants such as :cite:`Izmailov.Podoprikhin.Garipov.ea.2018`.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "origin_pos": 3,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "import numpy as np\n",
    "import tensorflow as tf\n",
    "from mpl_toolkits import mplot3d\n",
    "from d2l import tensorflow as d2l"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 4
   },
   "source": [
    "## Definitions\n",
    "\n",
    "Before convex analysis,\n",
    "we need to define *convex sets* and *convex functions*.\n",
    "They lead to mathematical tools that are commonly applied to machine learning.\n",
    "\n",
    "\n",
    "### Convex Sets\n",
    "\n",
    "Sets are the basis of convexity. Simply put, a set $\\mathcal{X}$ in a vector space is *convex* if for any $a, b \\in \\mathcal{X}$ the line segment connecting $a$ and $b$ is also in $\\mathcal{X}$. In mathematical terms this means that for all $\\lambda \\in [0, 1]$ we have\n",
    "\n",
    "$$\\lambda  a + (1-\\lambda)  b \\in \\mathcal{X} \\text{ whenever } a, b \\in \\mathcal{X}.$$\n",
    "\n",
    "This sounds a bit abstract. Consider :numref:`fig_pacman`. The first set is not convex since there exist line segments that are not contained in it.\n",
    "The other two sets suffer no such problem.\n",
    "\n",
    "![The first set is nonconvex and the other two are convex.](../img/pacman.svg)\n",
    ":label:`fig_pacman`\n",
    "\n",
    "Definitions on their own are not particularly useful unless you can do something with them.\n",
    "In this case we can look at intersections as shown in :numref:`fig_convex_intersect`.\n",
    "Assume that $\\mathcal{X}$ and $\\mathcal{Y}$ are convex sets. Then $\\mathcal{X} \\cap \\mathcal{Y}$ is also convex. To see this, consider any $a, b \\in \\mathcal{X} \\cap \\mathcal{Y}$. Since $\\mathcal{X}$ and $\\mathcal{Y}$ are convex, the line segments connecting $a$ and $b$ are contained in both $\\mathcal{X}$ and $\\mathcal{Y}$. Given that, they also need to be contained in $\\mathcal{X} \\cap \\mathcal{Y}$, thus proving our theorem.\n",
    "\n",
    "![The intersection between two convex sets is convex.](../img/convex-intersect.svg)\n",
    ":label:`fig_convex_intersect`\n",
    "\n",
    "We can strengthen this result with little effort: given convex sets $\\mathcal{X}_i$, their intersection $\\cap_{i} \\mathcal{X}_i$ is convex.\n",
    "To see that the converse is not true, consider two disjoint sets $\\mathcal{X} \\cap \\mathcal{Y} = \\emptyset$. Now pick $a \\in \\mathcal{X}$ and $b \\in \\mathcal{Y}$. The line segment in :numref:`fig_nonconvex` connecting $a$ and $b$ needs to contain some part that is neither in $\\mathcal{X}$ nor in $\\mathcal{Y}$, since we assumed that $\\mathcal{X} \\cap \\mathcal{Y} = \\emptyset$. Hence the line segment is not in $\\mathcal{X} \\cup \\mathcal{Y}$ either, thus proving that in general unions of convex sets need not be convex.\n",
    "\n",
    "![The union of two convex sets need not be convex.](../img/nonconvex.svg)\n",
    ":label:`fig_nonconvex`\n",
    "\n",
    "Typically the problems in deep learning are defined on convex sets. For instance, $\\mathbb{R}^d$,\n",
    "the set of $d$-dimensional vectors of real numbers,\n",
    "is a convex set (after all, the line between any two points in $\\mathbb{R}^d$ remains in $\\mathbb{R}^d$). In some cases we work with variables of bounded length, such as balls of radius $r$ as defined by $\\{\\mathbf{x} | \\mathbf{x} \\in \\mathbb{R}^d \\text{ and } \\|\\mathbf{x}\\| \\leq r\\}$.\n",
    "\n",
    "### Convex Functions\n",
    "\n",
    "Now that we have convex sets we can introduce *convex functions* $f$.\n",
    "Given a convex set $\\mathcal{X}$, a function $f: \\mathcal{X} \\to \\mathbb{R}$ is *convex* if for all $x, x' \\in \\mathcal{X}$ and for all $\\lambda \\in [0, 1]$ we have\n",
    "\n",
    "$$\\lambda f(x) + (1-\\lambda) f(x') \\geq f(\\lambda x + (1-\\lambda) x').$$\n",
    "\n",
    "To illustrate this let us plot a few functions and check which ones satisfy the requirement.\n",
    "Below we define a few functions, both convex and nonconvex.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "origin_pos": 5,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [
    {
     "data": {
      "image/svg+xml": [
       "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
       "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
       "  \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
       "<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"539.503125pt\" height=\"194.158125pt\" viewBox=\"0 0 539.503125 194.158125\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
       " <metadata>\n",
       "  <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
       "   <cc:Work>\n",
       "    <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
       "    <dc:date>2022-03-24T13:09:24.603463</dc:date>\n",
       "    <dc:format>image/svg+xml</dc:format>\n",
       "    <dc:creator>\n",
       "     <cc:Agent>\n",
       "      <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
       "     </cc:Agent>\n",
       "    </dc:creator>\n",
       "   </cc:Work>\n",
       "  </rdf:RDF>\n",
       " </metadata>\n",
       " <defs>\n",
       "  <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
       " </defs>\n",
       " <g id=\"figure_1\">\n",
       "  <g id=\"patch_1\">\n",
       "   <path d=\"M 0 194.158125 \n",
       "L 539.503125 194.158125 \n",
       "L 539.503125 0 \n",
       "L 0 0 \n",
       "L 0 194.158125 \n",
       "z\n",
       "\" style=\"fill: none\"/>\n",
       "  </g>\n",
       "  <g id=\"axes_1\">\n",
       "   <g id=\"patch_2\">\n",
       "    <path d=\"M 30.103125 170.28 \n",
       "L 177.809007 170.28 \n",
       "L 177.809007 7.2 \n",
       "L 30.103125 7.2 \n",
       "z\n",
       "\" style=\"fill: #ffffff\"/>\n",
       "   </g>\n",
       "   <g id=\"matplotlib.axis_1\">\n",
       "    <g id=\"xtick_1\">\n",
       "     <g id=\"line2d_1\">\n",
       "      <path d=\"M 36.817029 170.28 \n",
       "L 36.817029 7.2 \n",
       "\" clip-path=\"url(#p8586ef34a1)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_2\">\n",
       "      <defs>\n",
       "       <path id=\"m154addf18e\" d=\"M 0 0 \n",
       "L 0 3.5 \n",
       "\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </defs>\n",
       "      <g>\n",
       "       <use xlink:href=\"#m154addf18e\" x=\"36.817029\" y=\"170.28\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_1\">\n",
       "      <!-- −2 -->\n",
       "      <g transform=\"translate(29.445935 184.878438)scale(0.1 -0.1)\">\n",
       "       <defs>\n",
       "        <path id=\"DejaVuSans-2212\" d=\"M 678 2272 \n",
       "L 4684 2272 \n",
       "L 4684 1741 \n",
       "L 678 1741 \n",
       "L 678 2272 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "        <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
       "L 3431 531 \n",
       "L 3431 0 \n",
       "L 469 0 \n",
       "L 469 531 \n",
       "Q 828 903 1448 1529 \n",
       "Q 2069 2156 2228 2338 \n",
       "Q 2531 2678 2651 2914 \n",
       "Q 2772 3150 2772 3378 \n",
       "Q 2772 3750 2511 3984 \n",
       "Q 2250 4219 1831 4219 \n",
       "Q 1534 4219 1204 4116 \n",
       "Q 875 4013 500 3803 \n",
       "L 500 4441 \n",
       "Q 881 4594 1212 4672 \n",
       "Q 1544 4750 1819 4750 \n",
       "Q 2544 4750 2975 4387 \n",
       "Q 3406 4025 3406 3419 \n",
       "Q 3406 3131 3298 2873 \n",
       "Q 3191 2616 2906 2266 \n",
       "Q 2828 2175 2409 1742 \n",
       "Q 1991 1309 1228 531 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "       </defs>\n",
       "       <use xlink:href=\"#DejaVuSans-2212\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-32\" x=\"83.789062\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"xtick_2\">\n",
       "     <g id=\"line2d_3\">\n",
       "      <path d=\"M 104.124389 170.28 \n",
       "L 104.124389 7.2 \n",
       "\" clip-path=\"url(#p8586ef34a1)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_4\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m154addf18e\" x=\"104.124389\" y=\"170.28\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_2\">\n",
       "      <!-- 0 -->\n",
       "      <g transform=\"translate(100.943139 184.878438)scale(0.1 -0.1)\">\n",
       "       <defs>\n",
       "        <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
       "Q 1547 4250 1301 3770 \n",
       "Q 1056 3291 1056 2328 \n",
       "Q 1056 1369 1301 889 \n",
       "Q 1547 409 2034 409 \n",
       "Q 2525 409 2770 889 \n",
       "Q 3016 1369 3016 2328 \n",
       "Q 3016 3291 2770 3770 \n",
       "Q 2525 4250 2034 4250 \n",
       "z\n",
       "M 2034 4750 \n",
       "Q 2819 4750 3233 4129 \n",
       "Q 3647 3509 3647 2328 \n",
       "Q 3647 1150 3233 529 \n",
       "Q 2819 -91 2034 -91 \n",
       "Q 1250 -91 836 529 \n",
       "Q 422 1150 422 2328 \n",
       "Q 422 3509 836 4129 \n",
       "Q 1250 4750 2034 4750 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "       </defs>\n",
       "       <use xlink:href=\"#DejaVuSans-30\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"xtick_3\">\n",
       "     <g id=\"line2d_5\">\n",
       "      <path d=\"M 171.431748 170.28 \n",
       "L 171.431748 7.2 \n",
       "\" clip-path=\"url(#p8586ef34a1)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_6\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m154addf18e\" x=\"171.431748\" y=\"170.28\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_3\">\n",
       "      <!-- 2 -->\n",
       "      <g transform=\"translate(168.250498 184.878438)scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-32\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "   </g>\n",
       "   <g id=\"matplotlib.axis_2\">\n",
       "    <g id=\"ytick_1\">\n",
       "     <g id=\"line2d_7\">\n",
       "      <path d=\"M 30.103125 162.867273 \n",
       "L 177.809007 162.867273 \n",
       "\" clip-path=\"url(#p8586ef34a1)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_8\">\n",
       "      <defs>\n",
       "       <path id=\"m030d4b8499\" d=\"M 0 0 \n",
       "L -3.5 0 \n",
       "\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </defs>\n",
       "      <g>\n",
       "       <use xlink:href=\"#m030d4b8499\" x=\"30.103125\" y=\"162.867273\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_4\">\n",
       "      <!-- 0.0 -->\n",
       "      <g transform=\"translate(7.2 166.666491)scale(0.1 -0.1)\">\n",
       "       <defs>\n",
       "        <path id=\"DejaVuSans-2e\" d=\"M 684 794 \n",
       "L 1344 794 \n",
       "L 1344 0 \n",
       "L 684 0 \n",
       "L 684 794 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "       </defs>\n",
       "       <use xlink:href=\"#DejaVuSans-30\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_2\">\n",
       "     <g id=\"line2d_9\">\n",
       "      <path d=\"M 30.103125 125.803636 \n",
       "L 177.809007 125.803636 \n",
       "\" clip-path=\"url(#p8586ef34a1)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_10\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m030d4b8499\" x=\"30.103125\" y=\"125.803636\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_5\">\n",
       "      <!-- 0.5 -->\n",
       "      <g transform=\"translate(7.2 129.602855)scale(0.1 -0.1)\">\n",
       "       <defs>\n",
       "        <path id=\"DejaVuSans-35\" d=\"M 691 4666 \n",
       "L 3169 4666 \n",
       "L 3169 4134 \n",
       "L 1269 4134 \n",
       "L 1269 2991 \n",
       "Q 1406 3038 1543 3061 \n",
       "Q 1681 3084 1819 3084 \n",
       "Q 2600 3084 3056 2656 \n",
       "Q 3513 2228 3513 1497 \n",
       "Q 3513 744 3044 326 \n",
       "Q 2575 -91 1722 -91 \n",
       "Q 1428 -91 1123 -41 \n",
       "Q 819 9 494 109 \n",
       "L 494 744 \n",
       "Q 775 591 1075 516 \n",
       "Q 1375 441 1709 441 \n",
       "Q 2250 441 2565 725 \n",
       "Q 2881 1009 2881 1497 \n",
       "Q 2881 1984 2565 2268 \n",
       "Q 2250 2553 1709 2553 \n",
       "Q 1456 2553 1204 2497 \n",
       "Q 953 2441 691 2322 \n",
       "L 691 4666 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "       </defs>\n",
       "       <use xlink:href=\"#DejaVuSans-30\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_3\">\n",
       "     <g id=\"line2d_11\">\n",
       "      <path d=\"M 30.103125 88.74 \n",
       "L 177.809007 88.74 \n",
       "\" clip-path=\"url(#p8586ef34a1)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_12\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m030d4b8499\" x=\"30.103125\" y=\"88.74\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_6\">\n",
       "      <!-- 1.0 -->\n",
       "      <g transform=\"translate(7.2 92.539219)scale(0.1 -0.1)\">\n",
       "       <defs>\n",
       "        <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
       "L 1825 531 \n",
       "L 1825 4091 \n",
       "L 703 3866 \n",
       "L 703 4441 \n",
       "L 1819 4666 \n",
       "L 2450 4666 \n",
       "L 2450 531 \n",
       "L 3481 531 \n",
       "L 3481 0 \n",
       "L 794 0 \n",
       "L 794 531 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "       </defs>\n",
       "       <use xlink:href=\"#DejaVuSans-31\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_4\">\n",
       "     <g id=\"line2d_13\">\n",
       "      <path d=\"M 30.103125 51.676364 \n",
       "L 177.809007 51.676364 \n",
       "\" clip-path=\"url(#p8586ef34a1)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_14\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m030d4b8499\" x=\"30.103125\" y=\"51.676364\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_7\">\n",
       "      <!-- 1.5 -->\n",
       "      <g transform=\"translate(7.2 55.475582)scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-31\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_5\">\n",
       "     <g id=\"line2d_15\">\n",
       "      <path d=\"M 30.103125 14.612727 \n",
       "L 177.809007 14.612727 \n",
       "\" clip-path=\"url(#p8586ef34a1)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_16\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m030d4b8499\" x=\"30.103125\" y=\"14.612727\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_8\">\n",
       "      <!-- 2.0 -->\n",
       "      <g transform=\"translate(7.2 18.411946)scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-32\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "   </g>\n",
       "   <g id=\"line2d_17\">\n",
       "    <path d=\"M 36.817029 14.612727 \n",
       "L 40.855466 31.869539 \n",
       "L 44.557368 46.75058 \n",
       "L 48.259269 60.734673 \n",
       "L 51.96117 73.821837 \n",
       "L 55.326535 84.940917 \n",
       "L 58.6919 95.318734 \n",
       "L 62.057265 104.955275 \n",
       "L 65.086093 112.994377 \n",
       "L 68.114921 120.433046 \n",
       "L 71.14375 127.271286 \n",
       "L 73.836041 132.845656 \n",
       "L 76.528333 137.945615 \n",
       "L 79.220625 142.57116 \n",
       "L 81.912917 146.72229 \n",
       "L 84.268672 149.965362 \n",
       "L 86.624428 152.845211 \n",
       "L 88.980183 155.361837 \n",
       "L 90.999402 157.22985 \n",
       "L 93.018621 158.831004 \n",
       "L 95.037839 160.165301 \n",
       "L 97.05706 161.232741 \n",
       "L 99.076282 162.033323 \n",
       "L 101.095503 162.567046 \n",
       "L 103.114724 162.833912 \n",
       "L 105.133944 162.833919 \n",
       "L 107.153165 162.567068 \n",
       "L 109.172386 162.033359 \n",
       "L 111.191607 161.232791 \n",
       "L 113.210828 160.165365 \n",
       "L 115.230047 158.831083 \n",
       "L 117.249266 157.229943 \n",
       "L 119.268485 155.361945 \n",
       "L 121.624241 152.845335 \n",
       "L 123.979997 149.965504 \n",
       "L 126.335752 146.722447 \n",
       "L 128.691507 143.11617 \n",
       "L 131.383799 138.549946 \n",
       "L 134.076091 133.50931 \n",
       "L 136.768383 127.994258 \n",
       "L 139.460675 122.004792 \n",
       "L 142.489503 114.699574 \n",
       "L 145.518331 106.793923 \n",
       "L 148.54716 98.287843 \n",
       "L 151.912524 88.132435 \n",
       "L 155.277889 77.235756 \n",
       "L 158.643254 65.597805 \n",
       "L 162.345155 51.939899 \n",
       "L 166.047056 37.385046 \n",
       "L 169.748958 21.933255 \n",
       "L 171.095104 16.092037 \n",
       "L 171.095104 16.092037 \n",
       "\" clip-path=\"url(#p8586ef34a1)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"line2d_18\">\n",
       "    <path d=\"M 53.643869 79.474091 \n",
       "L 137.778068 125.803636 \n",
       "\" clip-path=\"url(#p8586ef34a1)\" style=\"fill: none; stroke-dasharray: 5.55,2.4; stroke-dashoffset: 0; stroke: #bf00bf; stroke-width: 1.5\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_3\">\n",
       "    <path d=\"M 30.103125 170.28 \n",
       "L 30.103125 7.2 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_4\">\n",
       "    <path d=\"M 177.809007 170.28 \n",
       "L 177.809007 7.2 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_5\">\n",
       "    <path d=\"M 30.103125 170.28 \n",
       "L 177.809007 170.28 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_6\">\n",
       "    <path d=\"M 30.103125 7.2 \n",
       "L 177.809007 7.2 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "  </g>\n",
       "  <g id=\"axes_2\">\n",
       "   <g id=\"patch_7\">\n",
       "    <path d=\"M 207.350184 170.28 \n",
       "L 355.056066 170.28 \n",
       "L 355.056066 7.2 \n",
       "L 207.350184 7.2 \n",
       "z\n",
       "\" style=\"fill: #ffffff\"/>\n",
       "   </g>\n",
       "   <g id=\"matplotlib.axis_3\">\n",
       "    <g id=\"xtick_4\">\n",
       "     <g id=\"line2d_19\">\n",
       "      <path d=\"M 214.064088 170.28 \n",
       "L 214.064088 7.2 \n",
       "\" clip-path=\"url(#p065ad13977)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_20\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m154addf18e\" x=\"214.064088\" y=\"170.28\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_9\">\n",
       "      <!-- −2 -->\n",
       "      <g transform=\"translate(206.692994 184.878438)scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-2212\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-32\" x=\"83.789062\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"xtick_5\">\n",
       "     <g id=\"line2d_21\">\n",
       "      <path d=\"M 281.371447 170.28 \n",
       "L 281.371447 7.2 \n",
       "\" clip-path=\"url(#p065ad13977)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_22\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m154addf18e\" x=\"281.371447\" y=\"170.28\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_10\">\n",
       "      <!-- 0 -->\n",
       "      <g transform=\"translate(278.190197 184.878438)scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-30\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"xtick_6\">\n",
       "     <g id=\"line2d_23\">\n",
       "      <path d=\"M 348.678807 170.28 \n",
       "L 348.678807 7.2 \n",
       "\" clip-path=\"url(#p065ad13977)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_24\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m154addf18e\" x=\"348.678807\" y=\"170.28\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_11\">\n",
       "      <!-- 2 -->\n",
       "      <g transform=\"translate(345.497557 184.878438)scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-32\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "   </g>\n",
       "   <g id=\"matplotlib.axis_4\">\n",
       "    <g id=\"ytick_6\">\n",
       "     <g id=\"line2d_25\">\n",
       "      <path d=\"M 207.350184 162.867273 \n",
       "L 355.056066 162.867273 \n",
       "\" clip-path=\"url(#p065ad13977)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_26\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m030d4b8499\" x=\"207.350184\" y=\"162.867273\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_12\">\n",
       "      <!-- −1.0 -->\n",
       "      <g transform=\"translate(176.067371 166.666491)scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-2212\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-31\" x=\"83.789062\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"147.412109\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-30\" x=\"179.199219\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_7\">\n",
       "     <g id=\"line2d_27\">\n",
       "      <path d=\"M 207.350184 125.803636 \n",
       "L 355.056066 125.803636 \n",
       "\" clip-path=\"url(#p065ad13977)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_28\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m030d4b8499\" x=\"207.350184\" y=\"125.803636\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_13\">\n",
       "      <!-- −0.5 -->\n",
       "      <g transform=\"translate(176.067371 129.602855)scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-2212\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-30\" x=\"83.789062\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"147.412109\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-35\" x=\"179.199219\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_8\">\n",
       "     <g id=\"line2d_29\">\n",
       "      <path d=\"M 207.350184 88.74 \n",
       "L 355.056066 88.74 \n",
       "\" clip-path=\"url(#p065ad13977)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_30\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m030d4b8499\" x=\"207.350184\" y=\"88.74\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_14\">\n",
       "      <!-- 0.0 -->\n",
       "      <g transform=\"translate(184.447059 92.539219)scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-30\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_9\">\n",
       "     <g id=\"line2d_31\">\n",
       "      <path d=\"M 207.350184 51.676364 \n",
       "L 355.056066 51.676364 \n",
       "\" clip-path=\"url(#p065ad13977)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_32\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m030d4b8499\" x=\"207.350184\" y=\"51.676364\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_15\">\n",
       "      <!-- 0.5 -->\n",
       "      <g transform=\"translate(184.447059 55.475582)scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-30\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_10\">\n",
       "     <g id=\"line2d_33\">\n",
       "      <path d=\"M 207.350184 14.612727 \n",
       "L 355.056066 14.612727 \n",
       "\" clip-path=\"url(#p065ad13977)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_34\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m030d4b8499\" x=\"207.350184\" y=\"14.612727\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_16\">\n",
       "      <!-- 1.0 -->\n",
       "      <g transform=\"translate(184.447059 18.411946)scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-31\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "   </g>\n",
       "   <g id=\"line2d_35\">\n",
       "    <path d=\"M 214.064088 14.612727 \n",
       "L 214.737161 14.759 \n",
       "L 215.410233 15.197241 \n",
       "L 216.083306 15.925718 \n",
       "L 216.756379 16.941563 \n",
       "L 217.765989 18.995061 \n",
       "L 218.775598 21.667625 \n",
       "L 219.785208 24.935527 \n",
       "L 221.131354 30.167926 \n",
       "L 222.4775 36.324059 \n",
       "L 224.160182 45.169008 \n",
       "L 226.179401 57.178066 \n",
       "L 228.871693 74.849823 \n",
       "L 235.938959 122.392933 \n",
       "L 237.958178 134.173004 \n",
       "L 239.64086 142.776341 \n",
       "L 240.987006 148.710114 \n",
       "L 242.333152 153.698129 \n",
       "L 243.342761 156.77057 \n",
       "L 244.352371 159.239161 \n",
       "L 245.36198 161.082002 \n",
       "L 246.035053 161.954609 \n",
       "L 246.708126 162.538271 \n",
       "L 247.381199 162.830689 \n",
       "L 248.054272 162.830702 \n",
       "L 248.727345 162.538315 \n",
       "L 249.400418 161.95468 \n",
       "L 250.073491 161.082104 \n",
       "L 250.746564 159.92402 \n",
       "L 251.756173 157.661893 \n",
       "L 252.765783 154.787999 \n",
       "L 253.775392 151.327872 \n",
       "L 255.121538 145.856217 \n",
       "L 256.467684 139.483815 \n",
       "L 258.150366 130.405954 \n",
       "L 260.169585 118.179771 \n",
       "L 262.861877 100.336383 \n",
       "L 269.25607 57.178477 \n",
       "L 271.275289 45.169388 \n",
       "L 272.957971 36.324382 \n",
       "L 274.304119 30.168199 \n",
       "L 275.650267 24.935731 \n",
       "L 276.996415 20.709501 \n",
       "L 278.006025 18.240892 \n",
       "L 279.015635 16.398033 \n",
       "L 279.688709 15.525417 \n",
       "L 280.361782 14.941743 \n",
       "L 281.034856 14.649315 \n",
       "L 281.70793 14.649293 \n",
       "L 282.381003 14.941672 \n",
       "L 283.054077 15.525298 \n",
       "L 283.72715 16.397869 \n",
       "L 284.400224 17.55594 \n",
       "L 285.409834 19.818063 \n",
       "L 286.419444 22.691944 \n",
       "L 287.429055 26.15207 \n",
       "L 288.775203 31.623717 \n",
       "L 290.121351 37.996123 \n",
       "L 291.804033 47.07398 \n",
       "L 293.823252 59.300152 \n",
       "L 296.515544 77.143537 \n",
       "L 302.909738 120.301448 \n",
       "L 304.928957 132.310542 \n",
       "L 306.611639 141.155574 \n",
       "L 307.957785 147.311748 \n",
       "L 309.303931 152.544208 \n",
       "L 310.650077 156.77045 \n",
       "L 311.659686 159.239073 \n",
       "L 312.669296 161.081936 \n",
       "L 313.342369 161.954565 \n",
       "L 314.015442 162.538244 \n",
       "L 314.688515 162.83068 \n",
       "L 315.361588 162.830711 \n",
       "L 316.034661 162.538341 \n",
       "L 316.707733 161.954729 \n",
       "L 317.380806 161.082166 \n",
       "L 318.053879 159.9241 \n",
       "L 319.063489 157.661994 \n",
       "L 320.073098 154.788131 \n",
       "L 321.082708 151.328027 \n",
       "L 322.428854 145.856412 \n",
       "L 323.775 139.484027 \n",
       "L 325.457682 130.406193 \n",
       "L 327.476901 118.180054 \n",
       "L 330.169193 100.336658 \n",
       "L 336.563386 57.178737 \n",
       "L 338.582605 45.16964 \n",
       "L 340.265287 36.324585 \n",
       "L 341.611433 30.168381 \n",
       "L 342.957579 24.935903 \n",
       "L 344.303725 20.709634 \n",
       "L 345.313334 18.240993 \n",
       "L 346.322944 16.398112 \n",
       "L 346.996017 15.52547 \n",
       "L 347.669089 14.941774 \n",
       "L 348.342162 14.649329 \n",
       "L 348.342162 14.649329 \n",
       "\" clip-path=\"url(#p065ad13977)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"line2d_36\">\n",
       "    <path d=\"M 230.890928 88.739999 \n",
       "L 315.025127 162.867273 \n",
       "\" clip-path=\"url(#p065ad13977)\" style=\"fill: none; stroke-dasharray: 5.55,2.4; stroke-dashoffset: 0; stroke: #bf00bf; stroke-width: 1.5\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_8\">\n",
       "    <path d=\"M 207.350184 170.28 \n",
       "L 207.350184 7.2 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_9\">\n",
       "    <path d=\"M 355.056066 170.28 \n",
       "L 355.056066 7.2 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_10\">\n",
       "    <path d=\"M 207.350184 170.28 \n",
       "L 355.056066 170.28 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_11\">\n",
       "    <path d=\"M 207.350184 7.2 \n",
       "L 355.056066 7.2 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "  </g>\n",
       "  <g id=\"axes_3\">\n",
       "   <g id=\"patch_12\">\n",
       "    <path d=\"M 384.597243 170.28 \n",
       "L 532.303125 170.28 \n",
       "L 532.303125 7.2 \n",
       "L 384.597243 7.2 \n",
       "z\n",
       "\" style=\"fill: #ffffff\"/>\n",
       "   </g>\n",
       "   <g id=\"matplotlib.axis_5\">\n",
       "    <g id=\"xtick_7\">\n",
       "     <g id=\"line2d_37\">\n",
       "      <path d=\"M 391.311146 170.28 \n",
       "L 391.311146 7.2 \n",
       "\" clip-path=\"url(#pc42c309d73)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_38\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m154addf18e\" x=\"391.311146\" y=\"170.28\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_17\">\n",
       "      <!-- −2 -->\n",
       "      <g transform=\"translate(383.940053 184.878438)scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-2212\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-32\" x=\"83.789062\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"xtick_8\">\n",
       "     <g id=\"line2d_39\">\n",
       "      <path d=\"M 458.618506 170.28 \n",
       "L 458.618506 7.2 \n",
       "\" clip-path=\"url(#pc42c309d73)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_40\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m154addf18e\" x=\"458.618506\" y=\"170.28\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_18\">\n",
       "      <!-- 0 -->\n",
       "      <g transform=\"translate(455.437256 184.878438)scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-30\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"xtick_9\">\n",
       "     <g id=\"line2d_41\">\n",
       "      <path d=\"M 525.925866 170.28 \n",
       "L 525.925866 7.2 \n",
       "\" clip-path=\"url(#pc42c309d73)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_42\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m154addf18e\" x=\"525.925866\" y=\"170.28\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_19\">\n",
       "      <!-- 2 -->\n",
       "      <g transform=\"translate(522.744616 184.878438)scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-32\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "   </g>\n",
       "   <g id=\"matplotlib.axis_6\">\n",
       "    <g id=\"ytick_11\">\n",
       "     <g id=\"line2d_43\">\n",
       "      <path d=\"M 384.597243 154.485241 \n",
       "L 532.303125 154.485241 \n",
       "\" clip-path=\"url(#pc42c309d73)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_44\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m030d4b8499\" x=\"384.597243\" y=\"154.485241\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_20\">\n",
       "      <!-- 0.5 -->\n",
       "      <g transform=\"translate(361.694118 158.28446)scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-30\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_12\">\n",
       "     <g id=\"line2d_45\">\n",
       "      <path d=\"M 384.597243 122.76409 \n",
       "L 532.303125 122.76409 \n",
       "\" clip-path=\"url(#pc42c309d73)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_46\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m030d4b8499\" x=\"384.597243\" y=\"122.76409\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_21\">\n",
       "      <!-- 1.0 -->\n",
       "      <g transform=\"translate(361.694118 126.563309)scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-31\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_13\">\n",
       "     <g id=\"line2d_47\">\n",
       "      <path d=\"M 384.597243 91.042939 \n",
       "L 532.303125 91.042939 \n",
       "\" clip-path=\"url(#pc42c309d73)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_48\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m030d4b8499\" x=\"384.597243\" y=\"91.042939\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_22\">\n",
       "      <!-- 1.5 -->\n",
       "      <g transform=\"translate(361.694118 94.842158)scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-31\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_14\">\n",
       "     <g id=\"line2d_49\">\n",
       "      <path d=\"M 384.597243 59.321788 \n",
       "L 532.303125 59.321788 \n",
       "\" clip-path=\"url(#pc42c309d73)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_50\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m030d4b8499\" x=\"384.597243\" y=\"59.321788\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_23\">\n",
       "      <!-- 2.0 -->\n",
       "      <g transform=\"translate(361.694118 63.121006)scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-32\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_15\">\n",
       "     <g id=\"line2d_51\">\n",
       "      <path d=\"M 384.597243 27.600637 \n",
       "L 532.303125 27.600637 \n",
       "\" clip-path=\"url(#pc42c309d73)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_52\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m030d4b8499\" x=\"384.597243\" y=\"27.600637\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_24\">\n",
       "      <!-- 2.5 -->\n",
       "      <g transform=\"translate(361.694118 31.399855)scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-32\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "   </g>\n",
       "   <g id=\"line2d_53\">\n",
       "    <path d=\"M 391.311146 162.867273 \n",
       "L 398.041876 160.412679 \n",
       "L 404.436069 157.842109 \n",
       "L 410.493726 155.170927 \n",
       "L 416.214846 152.417552 \n",
       "L 421.935966 149.419909 \n",
       "L 427.32055 146.356075 \n",
       "L 432.705133 143.037061 \n",
       "L 437.75318 139.674859 \n",
       "L 442.801228 136.050795 \n",
       "L 447.849275 132.144471 \n",
       "L 452.560789 128.224539 \n",
       "L 457.272304 124.020384 \n",
       "L 461.983819 119.511387 \n",
       "L 466.695336 114.675457 \n",
       "L 471.070311 109.871511 \n",
       "L 475.445285 104.744938 \n",
       "L 479.82026 99.274064 \n",
       "L 484.195234 93.435784 \n",
       "L 488.570209 87.205411 \n",
       "L 492.945183 80.556606 \n",
       "L 496.983621 74.023597 \n",
       "L 501.022058 67.086607 \n",
       "L 505.060496 59.720656 \n",
       "L 509.098934 51.899213 \n",
       "L 513.137372 43.594153 \n",
       "L 517.175809 34.775511 \n",
       "L 521.214247 25.411552 \n",
       "L 525.252685 15.468561 \n",
       "L 525.589221 14.612727 \n",
       "L 525.589221 14.612727 \n",
       "\" clip-path=\"url(#pc42c309d73)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"line2d_54\">\n",
       "    <path d=\"M 408.137986 156.238371 \n",
       "L 492.272186 81.607722 \n",
       "\" clip-path=\"url(#pc42c309d73)\" style=\"fill: none; stroke-dasharray: 5.55,2.4; stroke-dashoffset: 0; stroke: #bf00bf; stroke-width: 1.5\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_13\">\n",
       "    <path d=\"M 384.597243 170.28 \n",
       "L 384.597243 7.2 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_14\">\n",
       "    <path d=\"M 532.303125 170.28 \n",
       "L 532.303125 7.2 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_15\">\n",
       "    <path d=\"M 384.597243 170.28 \n",
       "L 532.303125 170.28 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_16\">\n",
       "    <path d=\"M 384.597243 7.2 \n",
       "L 532.303125 7.2 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "  </g>\n",
       " </g>\n",
       " <defs>\n",
       "  <clipPath id=\"p8586ef34a1\">\n",
       "   <rect x=\"30.103125\" y=\"7.2\" width=\"147.705882\" height=\"163.08\"/>\n",
       "  </clipPath>\n",
       "  <clipPath id=\"p065ad13977\">\n",
       "   <rect x=\"207.350184\" y=\"7.2\" width=\"147.705882\" height=\"163.08\"/>\n",
       "  </clipPath>\n",
       "  <clipPath id=\"pc42c309d73\">\n",
       "   <rect x=\"384.597243\" y=\"7.2\" width=\"147.705882\" height=\"163.08\"/>\n",
       "  </clipPath>\n",
       " </defs>\n",
       "</svg>\n"
      ],
      "text/plain": [
       "<Figure size 648x216 with 3 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "f = lambda x: 0.5 * x**2  # Convex\n",
    "g = lambda x: tf.cos(np.pi * x)  # Nonconvex\n",
    "h = lambda x: tf.exp(0.5 * x)  # Convex\n",
    "\n",
    "x, segment = tf.range(-2, 2, 0.01), tf.constant([-1.5, 1])\n",
    "d2l.use_svg_display()\n",
    "_, axes = d2l.plt.subplots(1, 3, figsize=(9, 3))\n",
    "for ax, func in zip(axes, [f, g, h]):\n",
    "    d2l.plot([x, segment], [func(x), func(segment)], axes=ax)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 6
   },
   "source": [
    "As expected, the cosine function is *nonconvex*, whereas the parabola and the exponential function are. Note that the requirement that $\\mathcal{X}$ is a convex set is necessary for the condition to make sense. Otherwise the outcome of $f(\\lambda x + (1-\\lambda) x')$ might not be well defined.\n",
    "\n",
    "\n",
    "### Jensen's Inequality\n",
    "\n",
    "Given a convex function $f$,\n",
    "one of the most useful mathematical tools\n",
    "is *Jensen's inequality*.\n",
    "It amounts to a generalization of the definition of convexity:\n",
    "\n",
    "$$\\sum_i \\alpha_i f(x_i)  \\geq f\\left(\\sum_i \\alpha_i x_i\\right)    \\text{ and }    E_X[f(X)]  \\geq f\\left(E_X[X]\\right),$$\n",
    ":eqlabel:`eq_jensens-inequality`\n",
    "\n",
    "where $\\alpha_i$ are nonnegative real numbers such that $\\sum_i \\alpha_i = 1$ and $X$ is a random variable.\n",
    "In other words, the expectation of a convex function is no less than the convex function of an expectation, where the latter is usually a simpler expression. \n",
    "To prove the first inequality we repeatedly apply the definition of convexity to one term in the sum at a time.\n",
    "\n",
    "\n",
    "One of the common applications of Jensen's inequality is\n",
    "to bound a more complicated expression by a simpler one.\n",
    "For example,\n",
    "its application can be\n",
    "with regard to the log-likelihood of partially observed random variables. That is, we use\n",
    "\n",
    "$$E_{Y \\sim P(Y)}[-\\log P(X \\mid Y)] \\geq -\\log P(X),$$\n",
    "\n",
    "since $\\int P(Y) P(X \\mid Y) dY = P(X)$.\n",
    "This can be used in variational methods. Here $Y$ is typically the unobserved random variable, $P(Y)$ is the best guess of how it might be distributed, and $P(X)$ is the distribution with $Y$ integrated out. For instance, in clustering $Y$ might be the cluster labels and $P(X \\mid Y)$ is the generative model when applying cluster labels.\n",
    "\n",
    "\n",
    "\n",
    "## Properties\n",
    "\n",
    "Convex functions have many useful properties. We describe a few commonly-used ones below.\n",
    "\n",
    "\n",
    "### Local Minima Are Global Minima\n",
    "\n",
    "First and foremost, the local minima of convex functions are also the global minima. \n",
    "We can prove it by contradiction as follows.\n",
    "\n",
    "Consider a convex function $f$ defined on a convex set $\\mathcal{X}$.\n",
    "Suppose that $x^{\\ast} \\in \\mathcal{X}$ is a local minimum:\n",
    "there exists a small positive value $p$ so that for $x \\in \\mathcal{X}$ that satisfies $0 < |x - x^{\\ast}| \\leq p$ we have $f(x^{\\ast}) < f(x)$.\n",
    "\n",
    "Assume that the local minimum $x^{\\ast}$\n",
    "is not the global minumum of $f$:\n",
    "there exists $x' \\in \\mathcal{X}$ for which $f(x') < f(x^{\\ast})$. \n",
    "There also exists \n",
    "$\\lambda \\in [0, 1)$ such as $\\lambda = 1 - \\frac{p}{|x^{\\ast} - x'|}$\n",
    "so that\n",
    "$0 < |\\lambda x^{\\ast} + (1-\\lambda) x' - x^{\\ast}| \\leq p$. \n",
    "\n",
    "However,\n",
    "according to the definition of convex functions, we have\n",
    "\n",
    "$$\\begin{aligned}\n",
    "    f(\\lambda x^{\\ast} + (1-\\lambda) x') &\\leq \\lambda f(x^{\\ast}) + (1-\\lambda) f(x') \\\\\n",
    "    &< \\lambda f(x^{\\ast}) + (1-\\lambda) f(x^{\\ast}) \\\\\n",
    "    &= f(x^{\\ast}),\n",
    "\\end{aligned}$$\n",
    "\n",
    "which contradicts with our statement that $x^{\\ast}$ is a local minimum.\n",
    "Therefore, there does not exist $x' \\in \\mathcal{X}$ for which $f(x') < f(x^{\\ast})$. The local minimum $x^{\\ast}$ is also the global minimum.\n",
    "\n",
    "For instance, the convex function $f(x) = (x-1)^2$ has a local minimum at $x=1$, which is also the global minimum.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "origin_pos": 7,
    "tab": [
     "tensorflow"
    ]
   },
   "outputs": [
    {
     "data": {
      "image/svg+xml": [
       "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
       "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
       "  \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
       "<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"236.740625pt\" height=\"180.65625pt\" viewBox=\"0 0 236.740625 180.65625\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
       " <metadata>\n",
       "  <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
       "   <cc:Work>\n",
       "    <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
       "    <dc:date>2022-03-24T13:09:24.813892</dc:date>\n",
       "    <dc:format>image/svg+xml</dc:format>\n",
       "    <dc:creator>\n",
       "     <cc:Agent>\n",
       "      <dc:title>Matplotlib v3.5.1, https://matplotlib.org/</dc:title>\n",
       "     </cc:Agent>\n",
       "    </dc:creator>\n",
       "   </cc:Work>\n",
       "  </rdf:RDF>\n",
       " </metadata>\n",
       " <defs>\n",
       "  <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
       " </defs>\n",
       " <g id=\"figure_1\">\n",
       "  <g id=\"patch_1\">\n",
       "   <path d=\"M 0 180.65625 \n",
       "L 236.740625 180.65625 \n",
       "L 236.740625 0 \n",
       "L 0 0 \n",
       "L 0 180.65625 \n",
       "z\n",
       "\" style=\"fill: none\"/>\n",
       "  </g>\n",
       "  <g id=\"axes_1\">\n",
       "   <g id=\"patch_2\">\n",
       "    <path d=\"M 34.240625 143.1 \n",
       "L 229.540625 143.1 \n",
       "L 229.540625 7.2 \n",
       "L 34.240625 7.2 \n",
       "z\n",
       "\" style=\"fill: #ffffff\"/>\n",
       "   </g>\n",
       "   <g id=\"matplotlib.axis_1\">\n",
       "    <g id=\"xtick_1\">\n",
       "     <g id=\"line2d_1\">\n",
       "      <path d=\"M 43.117898 143.1 \n",
       "L 43.117898 7.2 \n",
       "\" clip-path=\"url(#p62fd75e551)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_2\">\n",
       "      <defs>\n",
       "       <path id=\"mcf5beb4e53\" d=\"M 0 0 \n",
       "L 0 3.5 \n",
       "\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </defs>\n",
       "      <g>\n",
       "       <use xlink:href=\"#mcf5beb4e53\" x=\"43.117898\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_1\">\n",
       "      <!-- −2 -->\n",
       "      <g transform=\"translate(35.746804 157.698438)scale(0.1 -0.1)\">\n",
       "       <defs>\n",
       "        <path id=\"DejaVuSans-2212\" d=\"M 678 2272 \n",
       "L 4684 2272 \n",
       "L 4684 1741 \n",
       "L 678 1741 \n",
       "L 678 2272 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "        <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
       "L 3431 531 \n",
       "L 3431 0 \n",
       "L 469 0 \n",
       "L 469 531 \n",
       "Q 828 903 1448 1529 \n",
       "Q 2069 2156 2228 2338 \n",
       "Q 2531 2678 2651 2914 \n",
       "Q 2772 3150 2772 3378 \n",
       "Q 2772 3750 2511 3984 \n",
       "Q 2250 4219 1831 4219 \n",
       "Q 1534 4219 1204 4116 \n",
       "Q 875 4013 500 3803 \n",
       "L 500 4441 \n",
       "Q 881 4594 1212 4672 \n",
       "Q 1544 4750 1819 4750 \n",
       "Q 2544 4750 2975 4387 \n",
       "Q 3406 4025 3406 3419 \n",
       "Q 3406 3131 3298 2873 \n",
       "Q 3191 2616 2906 2266 \n",
       "Q 2828 2175 2409 1742 \n",
       "Q 1991 1309 1228 531 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "       </defs>\n",
       "       <use xlink:href=\"#DejaVuSans-2212\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-32\" x=\"83.789062\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"xtick_2\">\n",
       "     <g id=\"line2d_3\">\n",
       "      <path d=\"M 87.615541 143.1 \n",
       "L 87.615541 7.2 \n",
       "\" clip-path=\"url(#p62fd75e551)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_4\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#mcf5beb4e53\" x=\"87.615541\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_2\">\n",
       "      <!-- −1 -->\n",
       "      <g transform=\"translate(80.244447 157.698438)scale(0.1 -0.1)\">\n",
       "       <defs>\n",
       "        <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
       "L 1825 531 \n",
       "L 1825 4091 \n",
       "L 703 3866 \n",
       "L 703 4441 \n",
       "L 1819 4666 \n",
       "L 2450 4666 \n",
       "L 2450 531 \n",
       "L 3481 531 \n",
       "L 3481 0 \n",
       "L 794 0 \n",
       "L 794 531 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "       </defs>\n",
       "       <use xlink:href=\"#DejaVuSans-2212\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-31\" x=\"83.789062\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"xtick_3\">\n",
       "     <g id=\"line2d_5\">\n",
       "      <path d=\"M 132.113185 143.1 \n",
       "L 132.113185 7.2 \n",
       "\" clip-path=\"url(#p62fd75e551)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_6\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#mcf5beb4e53\" x=\"132.113185\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_3\">\n",
       "      <!-- 0 -->\n",
       "      <g transform=\"translate(128.931935 157.698438)scale(0.1 -0.1)\">\n",
       "       <defs>\n",
       "        <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
       "Q 1547 4250 1301 3770 \n",
       "Q 1056 3291 1056 2328 \n",
       "Q 1056 1369 1301 889 \n",
       "Q 1547 409 2034 409 \n",
       "Q 2525 409 2770 889 \n",
       "Q 3016 1369 3016 2328 \n",
       "Q 3016 3291 2770 3770 \n",
       "Q 2525 4250 2034 4250 \n",
       "z\n",
       "M 2034 4750 \n",
       "Q 2819 4750 3233 4129 \n",
       "Q 3647 3509 3647 2328 \n",
       "Q 3647 1150 3233 529 \n",
       "Q 2819 -91 2034 -91 \n",
       "Q 1250 -91 836 529 \n",
       "Q 422 1150 422 2328 \n",
       "Q 422 3509 836 4129 \n",
       "Q 1250 4750 2034 4750 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "       </defs>\n",
       "       <use xlink:href=\"#DejaVuSans-30\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"xtick_4\">\n",
       "     <g id=\"line2d_7\">\n",
       "      <path d=\"M 176.610828 143.1 \n",
       "L 176.610828 7.2 \n",
       "\" clip-path=\"url(#p62fd75e551)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_8\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#mcf5beb4e53\" x=\"176.610828\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_4\">\n",
       "      <!-- 1 -->\n",
       "      <g transform=\"translate(173.429578 157.698438)scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-31\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"xtick_5\">\n",
       "     <g id=\"line2d_9\">\n",
       "      <path d=\"M 221.108472 143.1 \n",
       "L 221.108472 7.2 \n",
       "\" clip-path=\"url(#p62fd75e551)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_10\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#mcf5beb4e53\" x=\"221.108472\" y=\"143.1\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_5\">\n",
       "      <!-- 2 -->\n",
       "      <g transform=\"translate(217.927222 157.698438)scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-32\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"text_6\">\n",
       "     <!-- x -->\n",
       "     <g transform=\"translate(128.93125 171.376563)scale(0.1 -0.1)\">\n",
       "      <defs>\n",
       "       <path id=\"DejaVuSans-78\" d=\"M 3513 3500 \n",
       "L 2247 1797 \n",
       "L 3578 0 \n",
       "L 2900 0 \n",
       "L 1881 1375 \n",
       "L 863 0 \n",
       "L 184 0 \n",
       "L 1544 1831 \n",
       "L 300 3500 \n",
       "L 978 3500 \n",
       "L 1906 2253 \n",
       "L 2834 3500 \n",
       "L 3513 3500 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "      </defs>\n",
       "      <use xlink:href=\"#DejaVuSans-78\"/>\n",
       "     </g>\n",
       "    </g>\n",
       "   </g>\n",
       "   <g id=\"matplotlib.axis_2\">\n",
       "    <g id=\"ytick_1\">\n",
       "     <g id=\"line2d_11\">\n",
       "      <path d=\"M 34.240625 136.922727 \n",
       "L 229.540625 136.922727 \n",
       "\" clip-path=\"url(#p62fd75e551)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_12\">\n",
       "      <defs>\n",
       "       <path id=\"m791247d2c2\" d=\"M 0 0 \n",
       "L -3.5 0 \n",
       "\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </defs>\n",
       "      <g>\n",
       "       <use xlink:href=\"#m791247d2c2\" x=\"34.240625\" y=\"136.922727\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_7\">\n",
       "      <!-- 0 -->\n",
       "      <g transform=\"translate(20.878125 140.721946)scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-30\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_2\">\n",
       "     <g id=\"line2d_13\">\n",
       "      <path d=\"M 34.240625 109.468182 \n",
       "L 229.540625 109.468182 \n",
       "\" clip-path=\"url(#p62fd75e551)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_14\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m791247d2c2\" x=\"34.240625\" y=\"109.468182\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_8\">\n",
       "      <!-- 2 -->\n",
       "      <g transform=\"translate(20.878125 113.267401)scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-32\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_3\">\n",
       "     <g id=\"line2d_15\">\n",
       "      <path d=\"M 34.240625 82.013636 \n",
       "L 229.540625 82.013636 \n",
       "\" clip-path=\"url(#p62fd75e551)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_16\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m791247d2c2\" x=\"34.240625\" y=\"82.013636\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_9\">\n",
       "      <!-- 4 -->\n",
       "      <g transform=\"translate(20.878125 85.812855)scale(0.1 -0.1)\">\n",
       "       <defs>\n",
       "        <path id=\"DejaVuSans-34\" d=\"M 2419 4116 \n",
       "L 825 1625 \n",
       "L 2419 1625 \n",
       "L 2419 4116 \n",
       "z\n",
       "M 2253 4666 \n",
       "L 3047 4666 \n",
       "L 3047 1625 \n",
       "L 3713 1625 \n",
       "L 3713 1100 \n",
       "L 3047 1100 \n",
       "L 3047 0 \n",
       "L 2419 0 \n",
       "L 2419 1100 \n",
       "L 313 1100 \n",
       "L 313 1709 \n",
       "L 2253 4666 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "       </defs>\n",
       "       <use xlink:href=\"#DejaVuSans-34\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_4\">\n",
       "     <g id=\"line2d_17\">\n",
       "      <path d=\"M 34.240625 54.559091 \n",
       "L 229.540625 54.559091 \n",
       "\" clip-path=\"url(#p62fd75e551)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_18\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m791247d2c2\" x=\"34.240625\" y=\"54.559091\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_10\">\n",
       "      <!-- 6 -->\n",
       "      <g transform=\"translate(20.878125 58.35831)scale(0.1 -0.1)\">\n",
       "       <defs>\n",
       "        <path id=\"DejaVuSans-36\" d=\"M 2113 2584 \n",
       "Q 1688 2584 1439 2293 \n",
       "Q 1191 2003 1191 1497 \n",
       "Q 1191 994 1439 701 \n",
       "Q 1688 409 2113 409 \n",
       "Q 2538 409 2786 701 \n",
       "Q 3034 994 3034 1497 \n",
       "Q 3034 2003 2786 2293 \n",
       "Q 2538 2584 2113 2584 \n",
       "z\n",
       "M 3366 4563 \n",
       "L 3366 3988 \n",
       "Q 3128 4100 2886 4159 \n",
       "Q 2644 4219 2406 4219 \n",
       "Q 1781 4219 1451 3797 \n",
       "Q 1122 3375 1075 2522 \n",
       "Q 1259 2794 1537 2939 \n",
       "Q 1816 3084 2150 3084 \n",
       "Q 2853 3084 3261 2657 \n",
       "Q 3669 2231 3669 1497 \n",
       "Q 3669 778 3244 343 \n",
       "Q 2819 -91 2113 -91 \n",
       "Q 1303 -91 875 529 \n",
       "Q 447 1150 447 2328 \n",
       "Q 447 3434 972 4092 \n",
       "Q 1497 4750 2381 4750 \n",
       "Q 2619 4750 2861 4703 \n",
       "Q 3103 4656 3366 4563 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "       </defs>\n",
       "       <use xlink:href=\"#DejaVuSans-36\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_5\">\n",
       "     <g id=\"line2d_19\">\n",
       "      <path d=\"M 34.240625 27.104545 \n",
       "L 229.540625 27.104545 \n",
       "\" clip-path=\"url(#p62fd75e551)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_20\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m791247d2c2\" x=\"34.240625\" y=\"27.104545\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_11\">\n",
       "      <!-- 8 -->\n",
       "      <g transform=\"translate(20.878125 30.903764)scale(0.1 -0.1)\">\n",
       "       <defs>\n",
       "        <path id=\"DejaVuSans-38\" d=\"M 2034 2216 \n",
       "Q 1584 2216 1326 1975 \n",
       "Q 1069 1734 1069 1313 \n",
       "Q 1069 891 1326 650 \n",
       "Q 1584 409 2034 409 \n",
       "Q 2484 409 2743 651 \n",
       "Q 3003 894 3003 1313 \n",
       "Q 3003 1734 2745 1975 \n",
       "Q 2488 2216 2034 2216 \n",
       "z\n",
       "M 1403 2484 \n",
       "Q 997 2584 770 2862 \n",
       "Q 544 3141 544 3541 \n",
       "Q 544 4100 942 4425 \n",
       "Q 1341 4750 2034 4750 \n",
       "Q 2731 4750 3128 4425 \n",
       "Q 3525 4100 3525 3541 \n",
       "Q 3525 3141 3298 2862 \n",
       "Q 3072 2584 2669 2484 \n",
       "Q 3125 2378 3379 2068 \n",
       "Q 3634 1759 3634 1313 \n",
       "Q 3634 634 3220 271 \n",
       "Q 2806 -91 2034 -91 \n",
       "Q 1263 -91 848 271 \n",
       "Q 434 634 434 1313 \n",
       "Q 434 1759 690 2068 \n",
       "Q 947 2378 1403 2484 \n",
       "z\n",
       "M 1172 3481 \n",
       "Q 1172 3119 1398 2916 \n",
       "Q 1625 2713 2034 2713 \n",
       "Q 2441 2713 2670 2916 \n",
       "Q 2900 3119 2900 3481 \n",
       "Q 2900 3844 2670 4047 \n",
       "Q 2441 4250 2034 4250 \n",
       "Q 1625 4250 1398 4047 \n",
       "Q 1172 3844 1172 3481 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "       </defs>\n",
       "       <use xlink:href=\"#DejaVuSans-38\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"text_12\">\n",
       "     <!-- f(x) -->\n",
       "     <g transform=\"translate(14.798438 83.771094)rotate(-90)scale(0.1 -0.1)\">\n",
       "      <defs>\n",
       "       <path id=\"DejaVuSans-66\" d=\"M 2375 4863 \n",
       "L 2375 4384 \n",
       "L 1825 4384 \n",
       "Q 1516 4384 1395 4259 \n",
       "Q 1275 4134 1275 3809 \n",
       "L 1275 3500 \n",
       "L 2222 3500 \n",
       "L 2222 3053 \n",
       "L 1275 3053 \n",
       "L 1275 0 \n",
       "L 697 0 \n",
       "L 697 3053 \n",
       "L 147 3053 \n",
       "L 147 3500 \n",
       "L 697 3500 \n",
       "L 697 3744 \n",
       "Q 697 4328 969 4595 \n",
       "Q 1241 4863 1831 4863 \n",
       "L 2375 4863 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "       <path id=\"DejaVuSans-28\" d=\"M 1984 4856 \n",
       "Q 1566 4138 1362 3434 \n",
       "Q 1159 2731 1159 2009 \n",
       "Q 1159 1288 1364 580 \n",
       "Q 1569 -128 1984 -844 \n",
       "L 1484 -844 \n",
       "Q 1016 -109 783 600 \n",
       "Q 550 1309 550 2009 \n",
       "Q 550 2706 781 3412 \n",
       "Q 1013 4119 1484 4856 \n",
       "L 1984 4856 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "       <path id=\"DejaVuSans-29\" d=\"M 513 4856 \n",
       "L 1013 4856 \n",
       "Q 1481 4119 1714 3412 \n",
       "Q 1947 2706 1947 2009 \n",
       "Q 1947 1309 1714 600 \n",
       "Q 1481 -109 1013 -844 \n",
       "L 513 -844 \n",
       "Q 928 -128 1133 580 \n",
       "Q 1338 1288 1338 2009 \n",
       "Q 1338 2731 1133 3434 \n",
       "Q 928 4138 513 4856 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "      </defs>\n",
       "      <use xlink:href=\"#DejaVuSans-66\"/>\n",
       "      <use xlink:href=\"#DejaVuSans-28\" x=\"35.205078\"/>\n",
       "      <use xlink:href=\"#DejaVuSans-78\" x=\"74.21875\"/>\n",
       "      <use xlink:href=\"#DejaVuSans-29\" x=\"133.398438\"/>\n",
       "     </g>\n",
       "    </g>\n",
       "   </g>\n",
       "   <g id=\"line2d_21\">\n",
       "    <path d=\"M 43.117898 13.377273 \n",
       "L 48.902586 23.85255 \n",
       "L 54.687274 33.863837 \n",
       "L 60.471962 43.411147 \n",
       "L 65.811674 51.812232 \n",
       "L 71.151386 59.81797 \n",
       "L 76.491098 67.428363 \n",
       "L 81.830811 74.643417 \n",
       "L 87.170523 81.463119 \n",
       "L 92.065259 87.367219 \n",
       "L 96.959995 92.939115 \n",
       "L 101.854731 98.178816 \n",
       "L 106.749467 103.086314 \n",
       "L 111.644203 107.661615 \n",
       "L 116.093963 111.532705 \n",
       "L 120.543723 115.129252 \n",
       "L 124.99349 118.451257 \n",
       "L 129.443254 121.498716 \n",
       "L 133.893018 124.27163 \n",
       "L 138.342782 126.769998 \n",
       "L 142.792549 128.993821 \n",
       "L 146.797333 130.760524 \n",
       "L 150.802117 132.304845 \n",
       "L 154.806903 133.626785 \n",
       "L 158.811687 134.726343 \n",
       "L 162.816471 135.60352 \n",
       "L 166.821255 136.258315 \n",
       "L 170.826039 136.690729 \n",
       "L 174.830823 136.900761 \n",
       "L 178.835607 136.888412 \n",
       "L 182.840391 136.653682 \n",
       "L 186.845176 136.19657 \n",
       "L 190.84996 135.517077 \n",
       "L 194.854744 134.615203 \n",
       "L 198.859528 133.490947 \n",
       "L 202.864312 132.144309 \n",
       "L 206.869096 130.575291 \n",
       "L 210.87388 128.783891 \n",
       "L 214.878664 126.770109 \n",
       "L 219.328424 124.271756 \n",
       "L 220.663352 123.468714 \n",
       "L 220.663352 123.468714 \n",
       "\" clip-path=\"url(#p62fd75e551)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"line2d_22\">\n",
       "    <path d=\"M 65.366719 51.127273 \n",
       "L 176.610828 136.922727 \n",
       "\" clip-path=\"url(#p62fd75e551)\" style=\"fill: none; stroke-dasharray: 5.55,2.4; stroke-dashoffset: 0; stroke: #bf00bf; stroke-width: 1.5\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_3\">\n",
       "    <path d=\"M 34.240625 143.1 \n",
       "L 34.240625 7.2 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_4\">\n",
       "    <path d=\"M 229.540625 143.1 \n",
       "L 229.540625 7.2 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_5\">\n",
       "    <path d=\"M 34.240625 143.1 \n",
       "L 229.540625 143.1 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_6\">\n",
       "    <path d=\"M 34.240625 7.2 \n",
       "L 229.540625 7.2 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "  </g>\n",
       " </g>\n",
       " <defs>\n",
       "  <clipPath id=\"p62fd75e551\">\n",
       "   <rect x=\"34.240625\" y=\"7.2\" width=\"195.3\" height=\"135.9\"/>\n",
       "  </clipPath>\n",
       " </defs>\n",
       "</svg>\n"
      ],
      "text/plain": [
       "<Figure size 252x180 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "f = lambda x: (x - 1) ** 2\n",
    "d2l.set_figsize()\n",
    "d2l.plot([x, segment], [f(x), f(segment)], 'x', 'f(x)')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "origin_pos": 8
   },
   "source": [
    "The fact that the local minima for convex functions are also the global minima is very convenient. \n",
    "It means that if we minimize functions we cannot \"get stuck\". \n",
    "Note, though, that this does not mean that there cannot be more than one global minimum or that there might even exist one. For instance, the function $f(x) = \\mathrm{max}(|x|-1, 0)$ attains its minimum value over the interval $[-1, 1]$. Conversely, the function $f(x) = \\exp(x)$ does not attain a minimum value on $\\mathbb{R}$: for $x \\to -\\infty$ it asymptotes to $0$, but there is no $x$ for which $f(x) = 0$.\n",
    "\n",
    "### Below Sets of Convex Functions Are Convex\n",
    "\n",
    "We can conveniently \n",
    "define convex sets \n",
    "via *below sets* of convex functions.\n",
    "Concretely,\n",
    "given a convex function $f$ defined on a convex set $\\mathcal{X}$,\n",
    "any below set\n",
    "\n",
    "$$\\mathcal{S}_b := \\{x | x \\in \\mathcal{X} \\text{ and } f(x) \\leq b\\}$$\n",
    "\n",
    "is convex. \n",
    "\n",
    "Let us prove this quickly. Recall that for any $x, x' \\in \\mathcal{S}_b$ we need to show that $\\lambda x + (1-\\lambda) x' \\in \\mathcal{S}_b$ as long as $\\lambda \\in [0, 1]$. \n",
    "Since $f(x) \\leq b$ and $f(x') \\leq b$,\n",
    "by the definition of convexity we have \n",
    "\n",
    "$$f(\\lambda x + (1-\\lambda) x') \\leq \\lambda f(x) + (1-\\lambda) f(x') \\leq b.$$\n",
    "\n",
    "\n",
    "### Convexity and Second Derivatives\n",
    "\n",
    "Whenever the second derivative of a function $f: \\mathbb{R}^n \\rightarrow \\mathbb{R}$ exists it is very easy to check whether $f$ is convex. \n",
    "All we need to do is check whether the Hessian of $f$ is positive semidefinite: $\\nabla^2f \\succeq 0$, i.e., \n",
    "denoting the Hessian matrix $\\nabla^2f$ by $\\mathbf{H}$,\n",
    "$\\mathbf{x}^\\top \\mathbf{H} \\mathbf{x} \\geq 0$\n",
    "for all $\\mathbf{x} \\in \\mathbb{R}^n$.\n",
    "For instance, the function $f(\\mathbf{x}) = \\frac{1}{2} \\|\\mathbf{x}\\|^2$ is convex since $\\nabla^2 f = \\mathbf{1}$, i.e., its Hessian is an identity matrix.\n",
    "\n",
    "\n",
    "Formally, a twice-differentiable one-dimensional function $f: \\mathbb{R} \\rightarrow \\mathbb{R}$ is convex\n",
    "if and only if its second derivative $f'' \\geq 0$. For any twice-differentiable multi-dimensional function $f: \\mathbb{R}^{n} \\rightarrow \\mathbb{R}$,\n",
    "it is convex if and only if its Hessian $\\nabla^2f \\succeq 0$.\n",
    "\n",
    "First, we need to prove the one-dimensional case.\n",
    "To see that \n",
    "convexity of $f$ implies \n",
    "$f'' \\geq 0$  we use the fact that\n",
    "\n",
    "$$\\frac{1}{2} f(x + \\epsilon) + \\frac{1}{2} f(x - \\epsilon) \\geq f\\left(\\frac{x + \\epsilon}{2} + \\frac{x - \\epsilon}{2}\\right) = f(x).$$\n",
    "\n",
    "Since the second derivative is given by the limit over finite differences it follows that\n",
    "\n",
    "$$f''(x) = \\lim_{\\epsilon \\to 0} \\frac{f(x+\\epsilon) + f(x - \\epsilon) - 2f(x)}{\\epsilon^2} \\geq 0.$$\n",
    "\n",
    "To see that \n",
    "$f'' \\geq 0$ implies that $f$ is convex\n",
    "we use the fact that $f'' \\geq 0$ implies that $f'$ is a monotonically nondecreasing function. Let $a < x < b$ be three points in $\\mathbb{R}$,\n",
    "where $x = (1-\\lambda)a + \\lambda b$ and $\\lambda \\in (0, 1)$.\n",
    "According to the mean value theorem,\n",
    "there exist $\\alpha \\in [a, x]$ and $\\beta \\in [x, b]$\n",
    "such that\n",
    "\n",
    "$$f'(\\alpha) = \\frac{f(x) - f(a)}{x-a} \\text{ and } f'(\\beta) = \\frac{f(b) - f(x)}{b-x}.$$\n",
    "\n",
    "\n",
    "By monotonicity $f'(\\beta) \\geq f'(\\alpha)$, hence\n",
    "\n",
    "$$\\frac{x-a}{b-a}f(b) + \\frac{b-x}{b-a}f(a) \\geq f(x).$$\n",
    "\n",
    "Since $x = (1-\\lambda)a + \\lambda b$,\n",
    "we have\n",
    "\n",
    "$$\\lambda f(b) + (1-\\lambda)f(a) \\geq f((1-\\lambda)a + \\lambda b),$$\n",
    "\n",
    "thus proving convexity.\n",
    "\n",
    "Second, we need a lemma before \n",
    "proving the multi-dimensional case:\n",
    "$f: \\mathbb{R}^n \\rightarrow \\mathbb{R}$\n",
    "is convex if and only if for all $\\mathbf{x}, \\mathbf{y} \\in \\mathbb{R}^n$\n",
    "\n",
    "$$g(z) \\stackrel{\\mathrm{def}}{=} f(z \\mathbf{x} + (1-z)  \\mathbf{y}) \\text{ where } z \\in [0,1]$$ \n",
    "\n",
    "is convex.\n",
    "\n",
    "To prove that convexity of $f$ implies that $g$ is convex,\n",
    "we can show that for all $a, b, \\lambda \\in [0, 1]$ (thus\n",
    "$0 \\leq \\lambda a + (1-\\lambda) b \\leq 1$)\n",
    "\n",
    "$$\\begin{aligned} &g(\\lambda a + (1-\\lambda) b)\\\\\n",
    "=&f\\left(\\left(\\lambda a + (1-\\lambda) b\\right)\\mathbf{x} + \\left(1-\\lambda a - (1-\\lambda) b\\right)\\mathbf{y} \\right)\\\\\n",
    "=&f\\left(\\lambda \\left(a \\mathbf{x} + (1-a)  \\mathbf{y}\\right)  + (1-\\lambda) \\left(b \\mathbf{x} + (1-b)  \\mathbf{y}\\right) \\right)\\\\\n",
    "\\leq& \\lambda f\\left(a \\mathbf{x} + (1-a)  \\mathbf{y}\\right)  + (1-\\lambda) f\\left(b \\mathbf{x} + (1-b)  \\mathbf{y}\\right) \\\\\n",
    "=& \\lambda g(a) + (1-\\lambda) g(b).\n",
    "\\end{aligned}$$\n",
    "\n",
    "To prove the converse,\n",
    "we can show that for \n",
    "all $\\lambda \\in [0, 1]$ \n",
    "\n",
    "$$\\begin{aligned} &f(\\lambda \\mathbf{x} + (1-\\lambda) \\mathbf{y})\\\\\n",
    "=&g(\\lambda \\cdot 1 + (1-\\lambda) \\cdot 0)\\\\\n",
    "\\leq& \\lambda g(1)  + (1-\\lambda) g(0) \\\\\n",
    "=& \\lambda f(\\mathbf{x}) + (1-\\lambda) g(\\mathbf{y}).\n",
    "\\end{aligned}$$\n",
    "\n",
    "\n",
    "Finally,\n",
    "using the lemma above and the result of the one-dimensional case,\n",
    "the multi-dimensional case\n",
    "can be proven as follows.\n",
    "A multi-dimensional function $f: \\mathbb{R}^n \\rightarrow \\mathbb{R}$ is convex\n",
    "if and only if for all $\\mathbf{x}, \\mathbf{y} \\in \\mathbb{R}^n$ $g(z) \\stackrel{\\mathrm{def}}{=} f(z \\mathbf{x} + (1-z)  \\mathbf{y})$, where $z \\in [0,1]$,\n",
    "is convex.\n",
    "According to the one-dimensional case,\n",
    "this holds if and only if\n",
    "$g'' = (\\mathbf{x} - \\mathbf{y})^\\top \\mathbf{H}(\\mathbf{x} - \\mathbf{y}) \\geq 0$ ($\\mathbf{H} \\stackrel{\\mathrm{def}}{=} \\nabla^2f$)\n",
    "for all $\\mathbf{x}, \\mathbf{y} \\in \\mathbb{R}^n$,\n",
    "which is equivalent to $\\mathbf{H} \\succeq 0$\n",
    "per the definition of positive semidefinite matrices.\n",
    "\n",
    "\n",
    "## Constraints\n",
    "\n",
    "One of the nice properties of convex optimization is that it allows us to handle constraints efficiently. That is, it allows us to solve *constrained optimization* problems of the form:\n",
    "\n",
    "$$\\begin{aligned} \\mathop{\\mathrm{minimize~}}_{\\mathbf{x}} & f(\\mathbf{x}) \\\\\n",
    "    \\text{ subject to } & c_i(\\mathbf{x}) \\leq 0 \\text{ for all } i \\in \\{1, \\ldots, n\\},\n",
    "\\end{aligned}$$\n",
    "\n",
    "where $f$ is the objective and the functions $c_i$ are constraint functions. To see what this does consider the case where $c_1(\\mathbf{x}) = \\|\\mathbf{x}\\|_2 - 1$. In this case the parameters $\\mathbf{x}$ are constrained to the unit ball. If a second constraint is $c_2(\\mathbf{x}) = \\mathbf{v}^\\top \\mathbf{x} + b$, then this corresponds to all $\\mathbf{x}$ lying on a half-space. Satisfying both constraints simultaneously amounts to selecting a slice of a ball.\n",
    "\n",
    "### Lagrangian\n",
    "\n",
    "In general, solving a constrained optimization problem is difficult. One way of addressing it stems from physics with a rather simple intuition. Imagine a ball inside a box. The ball will roll to the place that is lowest and the forces of gravity will be balanced out with the forces that the sides of the box can impose on the ball. In short, the gradient of the objective function (i.e., gravity) will be offset by the gradient of the constraint function (the ball need to remain inside the box by virtue of the walls \"pushing back\"). \n",
    "Note that some constraints may not be active:\n",
    "the walls that are not touched by the ball\n",
    "will not be able to exert any force on the ball.\n",
    "\n",
    "\n",
    "Skipping over the derivation of the *Lagrangian* $L$,\n",
    "the above reasoning\n",
    "can be expressed via the following saddle point optimization problem:\n",
    "\n",
    "$$L(\\mathbf{x}, \\alpha_1, \\ldots, \\alpha_n) = f(\\mathbf{x}) + \\sum_{i=1}^n \\alpha_i c_i(\\mathbf{x}) \\text{ where } \\alpha_i \\geq 0.$$\n",
    "\n",
    "Here the variables $\\alpha_i$ ($i=1,\\ldots,n$) are the so-called *Lagrange multipliers* that ensure that constraints are properly enforced. They are chosen just large enough to ensure that $c_i(\\mathbf{x}) \\leq 0$ for all $i$. For instance, for any $\\mathbf{x}$ where $c_i(\\mathbf{x}) < 0$ naturally, we'd end up picking $\\alpha_i = 0$. Moreover, this is a saddle point optimization problem where one wants to *maximize* $L$ with respect to all $\\alpha_i$ and simultaneously *minimize* it with respect to $\\mathbf{x}$. There is a rich body of literature explaining how to arrive at the function $L(\\mathbf{x}, \\alpha_1, \\ldots, \\alpha_n)$. For our purposes it is sufficient to know that the saddle point of $L$ is where the original constrained optimization problem is solved optimally.\n",
    "\n",
    "### Penalties\n",
    "\n",
    "One way of satisfying constrained optimization problems at least *approximately* is to adapt the Lagrangian $L$. \n",
    "Rather than satisfying $c_i(\\mathbf{x}) \\leq 0$ we simply add $\\alpha_i c_i(\\mathbf{x})$ to the objective function $f(x)$. This ensures that the constraints will not be violated too badly.\n",
    "\n",
    "In fact, we have been using this trick all along. Consider weight decay in :numref:`sec_weight_decay`. In it we add $\\frac{\\lambda}{2} \\|\\mathbf{w}\\|^2$ to the objective function to ensure that $\\mathbf{w}$ does not grow too large. From the constrained optimization point of view we can see that this will ensure that $\\|\\mathbf{w}\\|^2 - r^2 \\leq 0$ for some radius $r$. Adjusting the value of $\\lambda$ allows us to vary the size of $\\mathbf{w}$.\n",
    "\n",
    "In general, adding penalties is a good way of ensuring approximate constraint satisfaction. In practice this turns out to be much more robust than exact satisfaction. Furthermore, for nonconvex problems many of the properties that make the exact approach so appealing in the convex case (e.g., optimality) no longer hold.\n",
    "\n",
    "### Projections\n",
    "\n",
    "An alternative strategy for satisfying constraints is projections. Again, we encountered them before, e.g., when dealing with gradient clipping in :numref:`sec_rnn_scratch`. There we ensured that a gradient has length bounded by $\\theta$ via\n",
    "\n",
    "$$\\mathbf{g} \\leftarrow \\mathbf{g} \\cdot \\mathrm{min}(1, \\theta/\\|\\mathbf{g}\\|).$$\n",
    "\n",
    "This turns out to be a *projection* of $\\mathbf{g}$ onto the ball of radius $\\theta$. More generally, a projection on a convex set $\\mathcal{X}$ is defined as\n",
    "\n",
    "$$\\mathrm{Proj}_\\mathcal{X}(\\mathbf{x}) = \\mathop{\\mathrm{argmin}}_{\\mathbf{x}' \\in \\mathcal{X}} \\|\\mathbf{x} - \\mathbf{x}'\\|,$$\n",
    "\n",
    "which is the closest point in $\\mathcal{X}$ to $\\mathbf{x}$. \n",
    "\n",
    "![Convex Projections.](../img/projections.svg)\n",
    ":label:`fig_projections`\n",
    "\n",
    "The mathematical definition of projections may sound a bit abstract. :numref:`fig_projections` explains it somewhat more clearly. In it we have two convex sets, a circle and a diamond. \n",
    "Points inside both sets (yellow) remain unchanged during projections. \n",
    "Points outside both sets (black) are projected to \n",
    "the points inside the sets (red) that are closet to the original points (black).\n",
    "While for $L_2$ balls this leaves the direction unchanged, this need not be the case in general, as can be seen in the case of the diamond.\n",
    "\n",
    "\n",
    "One of the uses for convex projections is to compute sparse weight vectors. In this case we project weight vectors onto an $L_1$ ball,\n",
    "which is a generalized version of the diamond case in :numref:`fig_projections`.\n",
    "\n",
    "\n",
    "## Summary\n",
    "\n",
    "In the context of deep learning the main purpose of convex functions is to motivate optimization algorithms and help us understand them in detail. In the following we will see how gradient descent and stochastic gradient descent can be derived accordingly.\n",
    "\n",
    "\n",
    "* Intersections of convex sets are convex. Unions are not.\n",
    "* The expectation of a convex function is no less than the convex function of an expectation (Jensen's inequality).\n",
    "* A twice-differentiable function is convex if and only if its Hessian (a matrix of second derivatives) is positive semidefinite.\n",
    "* Convex constraints can be added via the Lagrangian. In practice we may simply add them with a penalty to the objective function.\n",
    "* Projections map to points in the convex set closest to the original points.\n",
    "\n",
    "## Exercises\n",
    "\n",
    "1. Assume that we want to verify convexity of a set by drawing all lines between points within the set and checking whether the lines are contained.\n",
    "    1. Prove that it is sufficient to check only the points on the boundary.\n",
    "    1. Prove that it is sufficient to check only the vertices of the set.\n",
    "1. Denote by $\\mathcal{B}_p[r] \\stackrel{\\mathrm{def}}{=} \\{\\mathbf{x} | \\mathbf{x} \\in \\mathbb{R}^d \\text{ and } \\|\\mathbf{x}\\|_p \\leq r\\}$ the ball of radius $r$ using the $p$-norm. Prove that $\\mathcal{B}_p[r]$ is convex for all $p \\geq 1$.\n",
    "1. Given convex functions $f$ and $g$, show that $\\mathrm{max}(f, g)$ is convex, too. Prove that $\\mathrm{min}(f, g)$ is not convex.\n",
    "1. Prove that the normalization of the softmax function is convex. More specifically prove the convexity of\n",
    "    $f(x) = \\log \\sum_i \\exp(x_i)$.\n",
    "1. Prove that linear subspaces, i.e., $\\mathcal{X} = \\{\\mathbf{x} | \\mathbf{W} \\mathbf{x} = \\mathbf{b}\\}$, are convex sets.\n",
    "1. Prove that in the case of linear subspaces with $\\mathbf{b} = \\mathbf{0}$ the projection $\\mathrm{Proj}_\\mathcal{X}$ can be written as $\\mathbf{M} \\mathbf{x}$ for some matrix $\\mathbf{M}$.\n",
    "1. Show that for  twice-differentiable convex functions $f$ we can write $f(x + \\epsilon) = f(x) + \\epsilon f'(x) + \\frac{1}{2} \\epsilon^2 f''(x + \\xi)$ for some $\\xi \\in [0, \\epsilon]$.\n",
    "1. Given a vector $\\mathbf{w} \\in \\mathbb{R}^d$ with $\\|\\mathbf{w}\\|_1 > 1$ compute the projection on the $L_1$ unit ball.\n",
    "    1. As an intermediate step write out the penalized objective $\\|\\mathbf{w} - \\mathbf{w}'\\|^2 + \\lambda \\|\\mathbf{w}'\\|_1$ and compute the solution for a given $\\lambda > 0$.\n",
    "    1. Can you find the \"right\" value of $\\lambda$ without a lot of trial and error?\n",
    "1. Given a convex set $\\mathcal{X}$ and two vectors $\\mathbf{x}$ and $\\mathbf{y}$, prove that projections never increase distances, i.e., $\\|\\mathbf{x} - \\mathbf{y}\\| \\geq \\|\\mathrm{Proj}_\\mathcal{X}(\\mathbf{x}) - \\mathrm{Proj}_\\mathcal{X}(\\mathbf{y})\\|$.\n",
    "\n",
    "\n",
    "[Discussions](https://discuss.d2l.ai/t/350)\n"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}