{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Testing in a Data Analysis Workflow\n",
    "\n",
    "\n",
    "It's hard to do much programming without coming across the concept of *testing* — the practice of writing code whose sole purpose is to verify the programs you write actually do what you expect them to do. Indeed, testing is such a core part of programming that few people would consider posting their code publicly on a service like github without including a suite of tests. Some programmers have even adopted a practice known as *test-driven development* in which they *start* a project by writing tests that they want the code the plan to write to eventually be able to pass and then work until those tests pass.\n",
    "\n",
    "\n",
    "But while testing is deeply engrained in the practice of writing software, all too often, data scientists assume testing isn't relevant when they are writing simple scripts that are meant to run linearly to do things like run load, clean, merge, reshape, and analyze datasets using canned library functions. After all, if the point of tests is to ensure that the code you write is working correctly, and all you use in your simple scripts are functions that come from libraries like numpy and pandas (that are tested by the library maintainers), why would you need to use tests?\n",
    "\n",
    "\n",
    "The answer is twofold. First, it's easy to make coding mistakes even when you aren't writing complicated programs. Operations like merging, grouping, and reshaping can easily get wrong, so it's good to verify that the results of those operations are what you expect.\n",
    "\n",
    "\n",
    "But the second and bigger reason is that tests are needed in data analysis workflows to **test that what you think is true about your data is actually correct**. In other words, writing tests in a data analysis workflow is often not about ensuring you wrote the code you think you wrote, but rather about verifying your assumptions about the structure and properties of your data — and thus the behavior of the code you write — are correct.\n",
    "\n",
    "\n",
    "## A Simple Example\n",
    "\n",
    "\n",
    "To illustrate, consider the following simple example: I load a small subset of data from the World Bank World Development Indicators. The data includes data on countries, their GDP per capita, and Polity Scores (a measure of political freedom).\n",
    "\n",
    "\n",
    "In the code, I then try to compare the average Polity score for large oil producers to all other countries:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>country</th>\n",
       "      <th>region</th>\n",
       "      <th>gdppcap08</th>\n",
       "      <th>polityIV</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Albania</td>\n",
       "      <td>C&amp;E Europe</td>\n",
       "      <td>7715</td>\n",
       "      <td>17.8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Algeria</td>\n",
       "      <td>Africa</td>\n",
       "      <td>8033</td>\n",
       "      <td>10.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Angola</td>\n",
       "      <td>Africa</td>\n",
       "      <td>5899</td>\n",
       "      <td>8.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Argentina</td>\n",
       "      <td>S. America</td>\n",
       "      <td>14333</td>\n",
       "      <td>18.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Armenia</td>\n",
       "      <td>C&amp;E Europe</td>\n",
       "      <td>6070</td>\n",
       "      <td>15.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>140</th>\n",
       "      <td>Venezuela</td>\n",
       "      <td>S. America</td>\n",
       "      <td>12804</td>\n",
       "      <td>16.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>141</th>\n",
       "      <td>Vietnam</td>\n",
       "      <td>Asia-Pacific</td>\n",
       "      <td>2785</td>\n",
       "      <td>3.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>142</th>\n",
       "      <td>Yemen</td>\n",
       "      <td>Middle East</td>\n",
       "      <td>2400</td>\n",
       "      <td>8.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>143</th>\n",
       "      <td>Zambia</td>\n",
       "      <td>Africa</td>\n",
       "      <td>1356</td>\n",
       "      <td>15.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>144</th>\n",
       "      <td>Zimbabwe</td>\n",
       "      <td>Africa</td>\n",
       "      <td>188</td>\n",
       "      <td>6.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>145 rows × 4 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "       country        region  gdppcap08  polityIV\n",
       "0      Albania    C&E Europe       7715      17.8\n",
       "1      Algeria        Africa       8033      10.0\n",
       "2       Angola        Africa       5899       8.0\n",
       "3    Argentina    S. America      14333      18.0\n",
       "4      Armenia    C&E Europe       6070      15.0\n",
       "..         ...           ...        ...       ...\n",
       "140  Venezuela    S. America      12804      16.0\n",
       "141    Vietnam  Asia-Pacific       2785       3.0\n",
       "142      Yemen   Middle East       2400       8.0\n",
       "143     Zambia        Africa       1356      15.0\n",
       "144   Zimbabwe        Africa        188       6.0\n",
       "\n",
       "[145 rows x 4 columns]"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "import numpy as np\n",
    "\n",
    "pd.set_option(\"mode.copy_on_write\", True)\n",
    "wdi = pd.read_csv(\"data/world-small.csv\")\n",
    "wdi"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The average GDP per Capita of the ten largest oil producers is: $21,625.\n",
      "The average GDP per Capita of all other countries is: $12,698.\n"
     ]
    }
   ],
   "source": [
    "# Top 10 from Wikipedia: https://en.wikipedia.org/wiki/List_of_countries_by_oil_production\n",
    "\n",
    "list_of_top10_oil_producers = [\n",
    "    \"United States\",\n",
    "    \"Russia\",\n",
    "    \"Saudi Arabia\",\n",
    "    \"Canada\",\n",
    "    \"Iraq\",\n",
    "    \"China\",\n",
    "    \"Iran\",\n",
    "    \"Brazil\",\n",
    "    \"United Arab Emirates\",\n",
    "    \"Kuwait\",\n",
    "]\n",
    "\n",
    "\n",
    "wdi[\"large_oil_producers\"] = wdi[\"country\"].isin(list_of_top10_oil_producers)\n",
    "\n",
    "avg_income_of_large_producers = wdi.loc[wdi[\"large_oil_producers\"], \"gdppcap08\"].mean()\n",
    "avg_income_of_non_producers = wdi.loc[~wdi[\"large_oil_producers\"], \"gdppcap08\"].mean()\n",
    "\n",
    "print(\n",
    "    \"The average GDP per Capita of the ten largest \"\n",
    "    f\"oil producers is: ${avg_income_of_large_producers:,.0f}.\"\n",
    ")\n",
    "print(\n",
    "    \"The average GDP per Capita of all other \"\n",
    "    f\"countries is: ${avg_income_of_non_producers:,.0f}.\"\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Simple, right? Except... that answer is wrong. Why? Well, we were trying to subset for the 10 largest oil producers. But were we successful? Let's check."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "16            Brazil\n",
       "21            Canada\n",
       "25             China\n",
       "60              Iran\n",
       "61              Iraq\n",
       "71            Kuwait\n",
       "109           Russia\n",
       "111     Saudi Arabia\n",
       "137    United States\n",
       "Name: country, dtype: object"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Look at countries for which we subset:\n",
    "wdi.loc[wdi[\"large_oil_producers\"], \"country\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "9"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Wait... is that 10?\n",
    "len(wdi.loc[wdi[\"large_oil_producers\"], \"country\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Oops! What happened? Well, if we compare the list of country names we for which `large_oil_producers` is `True` to our original list, we can see what country is missing:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'United Arab Emirates'}"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "big_producers_in_df = wdi.loc[wdi[\"large_oil_producers\"], \"country\"]\n",
    "\n",
    "set(big_producers_in_df).symmetric_difference(list_of_top10_oil_producers)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And if we look at our `wdi` data more carefully, we can see the problem is that United Arab Emirates isn't `\"United Arab Emirates\"` in the World Development Indicator data, it's `\"UAE\"`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['Tunisia',\n",
       " 'Turkey',\n",
       " 'Turkmenistan',\n",
       " 'UAE',\n",
       " 'Uganda',\n",
       " 'Ukraine',\n",
       " 'United Kingdom',\n",
       " 'United States',\n",
       " 'Uruguay',\n",
       " 'Uzbekistan',\n",
       " 'Venezuela',\n",
       " 'Vietnam',\n",
       " 'Yemen',\n",
       " 'Zambia',\n",
       " 'Zimbabwe']"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(wdi[\"country\"][130:])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If this feels contrived, I can promise it's not! \"Messy data\" is a *constant* in data science, especially when you're trying to answer a question no one has ever answered before and are thus working with data in ways other people haven't explored before!\n",
    "\n",
    "So now let's correct that problem *and* write a test to ensure we fixed it correctly!\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "list_of_top10_oil_producers = [\n",
    "    \"United States\",\n",
    "    \"Russia\",\n",
    "    \"Saudi Arabia\",\n",
    "    \"Canada\",\n",
    "    \"Iraq\",\n",
    "    \"China\",\n",
    "    \"Iran\",\n",
    "    \"Brazil\",\n",
    "    \"UAE\",\n",
    "    \"Kuwait\",\n",
    "]\n",
    "\n",
    "\n",
    "wdi[\"large_oil_producers\"] = wdi[\"country\"].isin(list_of_top10_oil_producers)\n",
    "\n",
    "big_producers_in_df = wdi.loc[wdi[\"large_oil_producers\"], \"country\"]\n",
    "\n",
    "\n",
    "# Check all oil producers found in wdi\n",
    "assert (\n",
    "    set(big_producers_in_df).symmetric_difference(list_of_top10_oil_producers) == set()\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "```{note}\n",
    "Catching errors in your intermediate code is especially important when doing data analysis data science. Why? Because when analyzing data, you're trying to answer a question no one has answered before, meaning it's **inherent** to the undertaking that you can't verify your final answer directly! The only way to have confidence in your answer is by having confidence in the steps you took to get to it.\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## I check my code, why test?\n",
    "\n",
    "In my experience, most data analysts check that things are working correctly *interactively* as they go. For example, they might do exactly what I did above — type `len(wdi.loc[wdi[\"large_oil_producers\"], \"country\"])` and look at the result. But that's not a *test*. A test has to check something and alert you to the problem *automatically*, which requires, at the very least, an `assert` statement. Why?\n",
    "\n",
    "1.  **Tests are executed _every time your code is run_.** Most of us check things the first time we write a piece of code. But days, weeks, or months later, we may come back, modify the code that occurs earlier in our code stream, and then just re-run the code. If those changes lead to problems in later files, we may not be aware of them. But if you have tests in place, then those early changes will result in an error in the later files, and you can track down the problem.\n",
    "2.  **It gets you in the habit of always checking.** Most of us only stop to check aspects of our data when we suspect problems. But if you become accustomed to writing a handful of tests at the bottom of every file -- or after every execution of a certain operation (e.g., I always include tests after a merge or reshape), we get into the habit of _always_ stopping to think about what our data should look like.\n",
    "3.  **It helps you catch your problems faster.** This is less about code integrity than sanity, but a great upside to tests is that they ensure that if a mistake slips into your code, you become aware of it quickly, making it easier to identify and fix the changes that caused the problem.\n",
    "4.  **Tests catch more than anticipated problems.** When problems emerge in code, they often manifest in lots of different ways. Duplicate observations, for example, will not only lead to inaccurate observation counts but may also give rise to bizarre summary statistics, bad subsequent merges, etc. Thus, adding tests not only guards against errors we've thought of but may also guard against errors we don't anticipate during the test writing process."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Writing Tests\n",
    "\n",
    "Tests are easy to write in any language, but given this course's focus on Python, I will discuss Python here. For examples of tests in Stata and R, you can see [some resources I created towards the bottom of this site](http://www.nickeubank.com/replication).  \n",
    "\n",
    "In Python, the easiest way to add tests to a data analysis script is with the `assert` keyword:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "x = 7\n",
    "y = 5\n",
    "\n",
    "# Make sure that x is greater than y\n",
    "assert x > y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Make sure that x is odd\n",
    "assert x % 2 == 1"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`assert` can also be used with vectors, though doing so requires one additional step. Since logical tests applied to vectors return vectors of Booleans, we have to specify how to evaluate that whole vector using `.any()` (returns `True` if any entries in the vector are `True`) and `all()` (only returns `True` if *all* the entries are `True`). For example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Make sure everyone's GDP per capita estimates are positive:\n",
    "assert (wdi[\"gdppcap08\"] > 0).all()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you don't use `.all()` or `.any()`, you will get this error saying \"the truth value of a Series is ambiguous,\" meaning \"This vector may have both `True` and `False` values, but `assert` is looking for a single `True` or `False`. How should I interpret a mix of Trues and Falses?\"\n",
    "\n",
    "```python\n",
    "assert wdi[\"gdppcap08\"] > 0\n",
    "\n",
    "---------------------------------------------------------------------------\n",
    "ValueError                                Traceback (most recent call last)\n",
    "/var/folders/fs/h_8_rwsn5hvg9mhp0txgc_s9v6191b/T/ipykernel_48692/2951563830.py in ?()\n",
    "      1 # Make sure everyone's GDP per capita estimates are positive:\n",
    "----> 2 assert wdi[\"gdppcap08\"] > 0\n",
    "\n",
    "/users/nce8/miniforge3/lib/python3.12/site-packages/pandas/core/generic.py in ?(self)\n",
    "   1575     @final\n",
    "   1576     def __nonzero__(self) -> NoReturn:\n",
    "-> 1577         raise ValueError(\n",
    "   1578             f\"The truth value of a {type(self).__name__} is ambiguous. \"\n",
    "   1579             \"Use a.empty, a.bool(), a.item(), a.any() or a.all().\"\n",
    "   1580         )\n",
    "\n",
    "ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().\n",
    "```\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## When Should I Write Tests?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The best way to get into writing tests is to think about how you check your data interactively to make stuff work. After a `merge` or a `groupby` command, most people pause to browse the data and/or watch the code step by step or do a set of quick tabs or plots.  But these are not systematic, and you generally only do them once (when first writing the code).\n",
    "\n",
    "\n",
    "So, a good starting point is to examine what you do when you check your data interactively and convert the logic of those interactive interrogations into systematic assert statements. \n",
    "\n",
    "\n",
    "After `merge`: Nowhere are problems with data made clearer than in a merge. ALWAYS add tests after a merge! \n",
    "- After complicated manipulations: If you have to think more than a little about how to get Python or Pandas to do something, there's a chance you missed something. Add a test or two to make sure you did it right! Personally, for example, I seldom use `groupby` commands without adding tests — it's just not a natural way to think about things, so I know I may have screwed up (and often have!).\n",
    "- Before dropping observations: Dropping observations masks problems. Before you drop variables, add a test to count the number of observations you expect to drop or retain."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Common Test Examples\n",
    "\n",
    "Test number of observations is right:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>country</th>\n",
       "      <th>region</th>\n",
       "      <th>gdppcap08</th>\n",
       "      <th>polityIV</th>\n",
       "      <th>large_oil_producers</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>Brazil</td>\n",
       "      <td>S. America</td>\n",
       "      <td>10296</td>\n",
       "      <td>18.0</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>67</th>\n",
       "      <td>Jordan</td>\n",
       "      <td>Middle East</td>\n",
       "      <td>5283</td>\n",
       "      <td>8.0</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>78</th>\n",
       "      <td>Lithuania</td>\n",
       "      <td>C&amp;E Europe</td>\n",
       "      <td>18824</td>\n",
       "      <td>20.0</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>44</th>\n",
       "      <td>France</td>\n",
       "      <td>W. Europe</td>\n",
       "      <td>34045</td>\n",
       "      <td>19.0</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>55</th>\n",
       "      <td>Haiti</td>\n",
       "      <td>S. America</td>\n",
       "      <td>1177</td>\n",
       "      <td>8.0</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      country       region  gdppcap08  polityIV  large_oil_producers\n",
       "16     Brazil   S. America      10296      18.0                 True\n",
       "67     Jordan  Middle East       5283       8.0                False\n",
       "78  Lithuania   C&E Europe      18824      20.0                False\n",
       "44     France    W. Europe      34045      19.0                False\n",
       "55      Haiti   S. America       1177       8.0                False"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# We'll use `wdi` again:\n",
    "\n",
    "wdi.sample(5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "assert len(wdi) == 145"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Check var that should have no missing has no missing."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "assert pd.notnull(wdi[\"country\"]).all()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Or, the same test written with any instead of all\n",
    "assert not pd.isnull(wdi[\"country\"]).any()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Check what I *think* is my unique identifier is actually unique."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "assert not wdi[\"country\"].duplicated().any()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Make sure values of GDP Per Capita have a reasonable value. Note this is a “reasonableness” test, not an absolute test. It's possible this would fail and the data is ok, but this way if there's a problem your attention will be flagged so you can check."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "assert (0 < wdi.gdppcap08).all()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}