{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Solving Performance Issues\n", "\n", "So: you've done all the things suggested in the last page on [Performance basics](performance_understanding.ipynb) that you can do within the constraints of your project, and you still have a performance problem. Now what?\n", "\n", "## Do you *need* to optimize?\n", "\n", "So, before you start investing in optimizing your code, ask yourself whether you really *need* to optimize your code. \n", "\n", "In programming, there is often a trade-off between developer's time (i.e. the time you spend actively playing with your code) and hardware time (the amount of computer time you code occupies when running). If you have a block of code you need to run every day, then investing *your* time in making it fast may really help make you more productive, and so may be worth the investment. \n", "\n", "If you have a block of code you only plan to run once or a couple times (say, the initial ingest of a big dataset), then it probably isn't worth investing days making it 10x faster if the running time is less than insane (i.e. if you can just let it run overnight and it'll be done by morning).\n", "\n", "So when deciding whether to optimize something, ask first \"Will investing my time in optimizing this code really pay off in future productivity?\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Profiling Code\n", "\n", "**If you take nothing else away from this page, *please* read and remember this section!**\n", "\n", "There's no reason to tune a line of code that is only responsible for 1/100 of your running time, so before you invest in speeding up your code, figure out *exactly* what in your code is causing it to be slow -- a process known as \"profiling\". \n", "\n", "Thankfully, because this is so important, there are lots of tools (called profilers) for measuring exactly how long your computer is spending doing each step in a block of code. Here are a couple, with some demonstrations below:\n", "\n", "- Profiling in R: the two packages I've seen used most are [Rprof](http://www.stat.berkeley.edu/~nolan/stat133/Fall05/lectures/profilingEx.html) and [lineprof](http://adv-r.had.co.nz/Profiling.html#measure-perf).\n", "- Profiling in Python: if you use Jupyter Notebooks or Jupyter Labs, you can use the [prun tool](http://pynash.org/2013/03/06/timing-and-profiling.html). If for some reason you're not using Jupyter, here's a [guide to a few other tools](https://zapier.com/engineering/profiling-python-boss/).\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Profiling Example\n", "\n", "To illustrate, let's write a function (called `my_analysis`) which we can pretend is a big analysis that's causing me problems. Within this analysis we'll place several functions, most of which are fast, but one of which is slow. To make it *really* easy to see what is fast and what is slow, these functions will just call the `time.sleep()` function, which literally just tells the computer to pause for a given number of seconds (i.e. `time.sleep(10)` makes execution pause for 10 seconds). " ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "the result of my analysis is: 3\n" ] } ], "source": [ "import time\n", "\n", "def a_slow_function():\n", " time.sleep(5)\n", " return 1\n", " \n", "def a_medium_function():\n", " time.sleep(1)\n", " return 1\n", "\n", "def a_fast_function():\n", " return 1\n", " \n", "def my_analysis():\n", " x = 0\n", " x = x + a_slow_function()\n", " x = x + a_medium_function()\n", " x = x + a_fast_function()\n", " print(f'the result of my analysis is: {x}')\n", "\n", "my_analysis()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can profile this code with the IPython magic `%prun`:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "the result of my analysis is: 3\n", " " ] }, { "data": { "text/plain": [ " 44 function calls in 6.007 seconds\n", "\n", " Ordered by: internal time\n", "\n", " ncalls tottime percall cumtime percall filename:lineno(function)\n", " 2 6.007 3.003 6.007 3.003 {built-in method time.sleep}\n", " 3 0.000 0.000 0.000 0.000 socket.py:337(send)\n", " 1 0.000 0.000 6.007 6.007 :14(my_analysis)\n", " 3 0.000 0.000 0.000 0.000 iostream.py:197(schedule)\n", " 2 0.000 0.000 0.000 0.000 iostream.py:384(write)\n", " 1 0.000 0.000 6.007 6.007 {built-in method builtins.exec}\n", " 1 0.000 0.000 0.000 0.000 {built-in method builtins.print}\n", " 3 0.000 0.000 0.000 0.000 threading.py:1080(is_alive)\n", " 1 0.000 0.000 1.002 1.002 :7(a_medium_function)\n", " 3 0.000 0.000 0.000 0.000 threading.py:1038(_wait_for_tstate_lock)\n", " 1 0.000 0.000 5.005 5.005 :3(a_slow_function)\n", " 2 0.000 0.000 0.000 0.000 iostream.py:309(_is_master_process)\n", " 3 0.000 0.000 0.000 0.000 {method 'acquire' of '_thread.lock' objects}\n", " 3 0.000 0.000 0.000 0.000 iostream.py:93(_event_pipe)\n", " 2 0.000 0.000 0.000 0.000 iostream.py:322(_schedule_flush)\n", " 2 0.000 0.000 0.000 0.000 {built-in method builtins.isinstance}\n", " 2 0.000 0.000 0.000 0.000 {built-in method posix.getpid}\n", " 3 0.000 0.000 0.000 0.000 threading.py:507(is_set)\n", " 1 0.000 0.000 6.007 6.007 :1()\n", " 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}\n", " 1 0.000 0.000 0.000 0.000 :11(a_fast_function)\n", " 3 0.000 0.000 0.000 0.000 {method 'append' of 'collections.deque' objects}" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%prun my_analysis()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The output shows a number of things, but the most important are `tottime` and `cumtime`. \n", "\n", "From `tottime` we can see that 6 seconds was dedicated to running `time.sleep()`. \n", "\n", "From `cumtime`, you can also see *in which functions* `time.sleep()` took the most time. As you can see, `cumtime` is not equal to the total time the function took to run -- rather, it's all the time spent within each function. `time.sleep()` has a `cumtime` of 6.009 because a total of 6 seconds was spend while that function ran, but it is *also* the case that `a_slow_function` (listed as `:3(a_slow_function)`) has a `cumtime` of 5 seconds (because that function was in the process of executing when `time.sleep()` paused for 5 seconds). \n", "\n", "From this, we can deduce that `time.sleep()` was slowing down our code, and that the occurance of `time.sleep()` that slowed down our code the most was in `a_slow_function`. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Speeding Code with Cython\n", "\n", "There are two libraries designed to allow you to massively speed up Python code. The first is called `Cython`, and it is a way of writing code that is basically Python with type declarations. For example, if you wanted to add up all the numbers to a million in Python, you could write something like the following (obviously not the most concise way to do it, but you get the idea): " ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "def avg_numbers_up_to(N):\n", " adding_total = 0\n", " for i in range(N):\n", " adding_total = adding_total + i\n", " \n", " avg = adding_total / N\n", " \n", " return avg" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But in Cython, you would write:\n", "\n", "```cython \n", " def avg_numbers_up_to(int N):\n", " cdef int adding_total\n", "\n", " adding_total = 0\n", "\n", " for i in range(N):\n", " adding_total = adding_total + i\n", "\n", " return adding_total\n", "```\n", "\n", "Then to integrate this into your Python code, you would save this function definition into a new file (with the suffix `.pyx` (say, `avg_numbers.pyx`), and put this code at the top of your Python script: \n", "\n", "```python\n", "from distutils.core import setup\n", "from Cython.Build import cythonize\n", "\n", "setup(ext_modules=cythonize('avg_numbers.pyx'))\n", "```\n", "\n", "Then you can call your Cythonized function (`avg_number_up_to`) in your normal Python script, but you'll now find it runs ~10x - 100x faster! (Note that this speedup is only likely when compared to *pure python* code. If you're comparing Cython to a library function that was already written in C, youre Cythonized Python is unlikely be any faster (and may be slower) than that library function. \n", "\n", "Also, note that **in Cythonized code, loops are just as fast as vectorized code!**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To illustrate, we can actually compile Cython code directly in Jupyter Notebooks with the %%cythonize magic. This is good for teaching, but is not how you should plan on using Cython yourself -- it's just not robust to rely on jupyter magics for core functionality in your programs (and you should only cythonize something if it's a core part of your code!). " ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "%load_ext Cython" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "%%cython\n", "\n", "def cython_avg_numbers_up_to(long N):\n", " cdef long adding_total\n", " adding_total = 0\n", "\n", " cdef long i\n", " for i in range(N):\n", " adding_total += i\n", "\n", " return adding_total" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "45.4 s ± 1.4 s per loop (mean ± std. dev. of 7 runs, 1 loop each)\n" ] } ], "source": [ "%timeit avg_numbers_up_to(1000000000)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "46.1 ns ± 0.751 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)\n" ] } ], "source": [ "%timeit cython_avg_numbers_up_to(1000000000)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But while Cython is nice, it does have lots of limitations!\n", "\n", "The biggest is that can get integer overflows in Cython! Moreover, be aware that `int` is a C integer (32 bit, which overflows at about 2 billion), so super prone to overflowing. `long` is what you need to use to get a 64 bit integer! So if I made the accumulator an `int` rather than a `long`..." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "%%cython\n", "\n", "def cython_w_overflow_avg_numbers_up_to(long N):\n", " cdef int adding_total\n", " adding_total = 0\n", "\n", " cdef long i\n", " for i in range(N):\n", " adding_total += i\n", "\n", " return adding_total" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "-1243309312" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cython_w_overflow_avg_numbers_up_to(1000000000)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Yup... we reached the limit of the 32-bit integer, then keep adding things, making it go negative!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Cython limitations\n", "\n", "There are a few limitations to be aware of, however: \n", "\n", "- Cython is it's own language. It's very similar to Python, but you'll have to invest a little in learning it's quirks. \n", "- You can get integer overflows in Cython! Moreover, be aware that `int` is a C integer (32 bit, which overflows at about 2 billion), so super prone to overflowing. `long` is what you need to use to get a 64 bit integer!\n", "- Cython only really works with (a) native Python and (b) NumPy ([numpy instructions here](https://cython.readthedocs.io/en/latest/src/userguide/numpy_tutorial.html#numpy-tutorial)). *Some* other Python libraries are / can be supported, but it's not nearly as straightfoward as the example above. So if your code uses other libraries, all bets are off. \n", " - [Here's a guide to use with pandas.](https://pandas.pydata.org/pandas-docs/stable/user_guide/enhancingperf.html)\n", "- The function you write will *not* be dynamically typed, so if you said the function would accept integers, you can only give it integers. \n", "- Distributing code you write with Cython can be tricky...\n", "\n", "### Cython Advantages\n", "\n", "- Can make C libraries directly accessable from Python (if you know how to work with C libraries)\n", "- It is a very mature tool, so well supported and documented. Indeed, a lot of both pandas and libraries like scikit-learn are built on Cython to ensure performance!\n", "- Definitely easier to write one function in Cython than move all your code to C!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Speeding Code with Numba\n", "\n", "Another tool you can use is `numba`. Numba is a program that, when it works, is *super easy* and kinda magic, but can also be rather finicky. \n", "\n", "The idea of `numba` is that it treats each function like it's own little program, and tries to compile it to make it faster. \n", "\n", "It can operate in two modes. In the first (\"python mode\"), it achieves it's speed-up by saving the machine code that was used the first time a function is run. The speed benefits of this aren't huge -- Python still has to do all the work of doing type checking and de-referencing, but it only has to actually convert what it's doing to machine code once, so if you plan on using a function over and over, it can be beneficial. \n", "\n", "The second mode (\"nopython mode\") is blazing fast. In nopython mode, `numba` analyzes your function, and then makes inferences about the types of variables that it will encounter. For example, if you gave the following code to Python: \n", "\n", "```python\n", "def my_big_loop(N):\n", " accumulator = 0\n", " for i in range(N):\n", " accumulator = accumulator + i\n", " return accumulator\n", "```\n", "\n", "And then you ran that code with `N=1000000`, then `numba` would look at the function and think to itself \"ok, so `N` is an integer. And `accumulator` starts as an integer. And if I add integers, they will always stay integers. So `accumulator + i` will always be integer addition. So I don't have to think about types at every stop of this loop!\n", "\n", "The only catch is that `numba` can't always compile code in `nopython` mode. For example, `numba` isn't compatible with `pandas`, so if you put `pandas` code in a function you pass to `numba`, it can't work in `nopython` mode. \n", "\n", "But when it does work, it's magic, because instead of making a new file that has to be compiled seperately and which won't \"just work\" on other computers, to make `numba` work you just add a \"decorator\" to the function you want to speed up. For example:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "# Without numba\n", "def my_big_loop(N):\n", " accumulator = 0\n", " for i in range(N):\n", " accumulator = accumulator + i\n", " return accumulator" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "45.3 ms ± 598 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)\n" ] } ], "source": [ "%timeit my_big_loop(1000000)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "114 ns ± 0.883 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)\n" ] } ], "source": [ "# With numba\n", "from numba import jit\n", "\n", "# We just add this \"decorator\" (A line that starts with @ just above a function)\n", "# The \"nopython=True\" option says to jit \"tell me if you can't work in nopython more, \n", "# dont' just silently revert to Python mode.\"\n", "@jit(nopython=True) \n", "def my_big_loop(N):\n", " accumulator = 0\n", " for i in range(N):\n", " accumulator = accumulator + i\n", " return accumulator\n", "\n", "%timeit my_big_loop(1000000)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Not bad, huh? Add one line of code, and this speeds up by 600x!\n", "\n", "But as I said, it doesn't work with everything -- not even all regular Python datatypes (e.g. if you define your own class, you're apparently out of luck with Numba). Here's an [intro to numba](http://numba.pydata.org/numba-doc/latest/user/5minguide.html), and here's a [full list of things it can handle in nopython mode, and things it can't.](https://numba.pydata.org/numba-doc/dev/reference/numpysupported.html). \n", "\n", "Also, as with Cython, note that **in nopython Numba functions, loops are just as fast as vectorized code!**\n", "\n", "### Type Stability\n", "\n", "If you want `numba` to work well, there is one overriding rule: your code must be \"type-stable\". Type stability means that *conditional on the types of the input arguments*, the types of all variables in your code must be fully predictable. \n", "\n", "For example, consider the following code:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "def my_doubler(x):\n", " y = x * 2\n", " return y" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If `x` is an integer, than we know that doubling it will also generate an integer, so the type of `y` will always be \"integer\". \n", "\n", "Similarly, if `x` is a float, then we know that doubling it will also generate a float, sot he type of `y` will always be \"float\". \n", "\n", "In other words, *conditional on the type of the input arguments*, this function is type stable. \n", "\n", "But now consider the following code:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "def is_this_even(x):\n", " if x % 2 == 0:\n", " y = x\n", " else:\n", " y = 'not even, sorry!'\n", " return y" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This code is **not** type stable because the fact that `x` is an integer does not guarantee the type of `y`'. If `x` is an even integer, than `y` will also be an integer; and if `x` is an odd integer, then `y` will be a string. As a result, `numba` can't predict the type of `y` in advance of running the code, and so can't be as efficient. " ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "@jit(nopython=True)\n", "def is_this_even(x):\n", " if x % 2 == 0:\n", " y = x\n", " else:\n", " y = 'not even, sorry!'\n", " return y" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "ename": "TypingError", "evalue": "Failed in nopython mode pipeline (step: nopython frontend)\nCannot unify int64 and Literal[str](not even, sorry!) for 'y', defined at (4)\n\nFile \"\", line 4:\ndef is_this_even(x):\n \n if x % 2 == 0:\n y = x\n ^\n\n[1] During: typing of assignment at (6)\n\nFile \"\", line 6:\ndef is_this_even(x):\n \n else:\n y = 'not even, sorry!'\n ^\n\nThis is not usually a problem with Numba itself but instead often caused by\nthe use of unsupported features or an issue in resolving types.\n\nTo see Python/NumPy features supported by the latest release of Numba visit:\nhttp://numba.pydata.org/numba-doc/latest/reference/pysupported.html\nand\nhttp://numba.pydata.org/numba-doc/latest/reference/numpysupported.html\n\nFor more information about typing errors and how to debug them visit:\nhttp://numba.pydata.org/numba-doc/latest/user/troubleshoot.html#my-code-doesn-t-compile\n\nIf you think your code should work with Numba, please report the error message\nand traceback, along with a minimal reproducer at:\nhttps://github.com/numba/numba/issues/new\n", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypingError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mis_this_even\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m3\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;32m~/miniconda3/lib/python3.7/site-packages/numba/dispatcher.py\u001b[0m in \u001b[0;36m_compile_for_args\u001b[0;34m(self, *args, **kws)\u001b[0m\n\u001b[1;32m 374\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mpatch_message\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mmsg\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 375\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 376\u001b[0;31m \u001b[0merror_rewrite\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0me\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m'typing'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 377\u001b[0m \u001b[0;32mexcept\u001b[0m \u001b[0merrors\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mUnsupportedError\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 378\u001b[0m \u001b[0;31m# Something unsupported is present in the user code, add help info\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m~/miniconda3/lib/python3.7/site-packages/numba/dispatcher.py\u001b[0m in \u001b[0;36merror_rewrite\u001b[0;34m(e, issue_type)\u001b[0m\n\u001b[1;32m 341\u001b[0m \u001b[0;32mraise\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 342\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 343\u001b[0;31m \u001b[0mreraise\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtype\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0me\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 344\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 345\u001b[0m \u001b[0margtypes\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m~/miniconda3/lib/python3.7/site-packages/numba/six.py\u001b[0m in \u001b[0;36mreraise\u001b[0;34m(tp, value, tb)\u001b[0m\n\u001b[1;32m 656\u001b[0m \u001b[0mvalue\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtp\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 657\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mvalue\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m__traceback__\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0mtb\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 658\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mvalue\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mwith_traceback\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtb\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 659\u001b[0m \u001b[0;32mraise\u001b[0m \u001b[0mvalue\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 660\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mTypingError\u001b[0m: Failed in nopython mode pipeline (step: nopython frontend)\nCannot unify int64 and Literal[str](not even, sorry!) for 'y', defined at (4)\n\nFile \"\", line 4:\ndef is_this_even(x):\n \n if x % 2 == 0:\n y = x\n ^\n\n[1] During: typing of assignment at (6)\n\nFile \"\", line 6:\ndef is_this_even(x):\n \n else:\n y = 'not even, sorry!'\n ^\n\nThis is not usually a problem with Numba itself but instead often caused by\nthe use of unsupported features or an issue in resolving types.\n\nTo see Python/NumPy features supported by the latest release of Numba visit:\nhttp://numba.pydata.org/numba-doc/latest/reference/pysupported.html\nand\nhttp://numba.pydata.org/numba-doc/latest/reference/numpysupported.html\n\nFor more information about typing errors and how to debug them visit:\nhttp://numba.pydata.org/numba-doc/latest/user/troubleshoot.html#my-code-doesn-t-compile\n\nIf you think your code should work with Numba, please report the error message\nand traceback, along with a minimal reproducer at:\nhttps://github.com/numba/numba/issues/new\n" ] } ], "source": [ "is_this_even(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Use Julia\n", "\n", "I would be remiss at this point to not mention one other option for getting more speed: use the programming language [Julia](https://julialang.org/). Julia is a very new language that has syntax that is *very* similar to Python, but which [runs tens or hundreds of times faster out of the box](https://julialang.org/benchmarks/). Basically, it's kinda like an entire language built around the technology also used by `numba`, but where `numba` is kind of finiky because it's been tacked on to a language that was never built for speed, Julia was designed from the ground up for speed. \n", "\n", "If you want to know why I love Julia, [you can find a talk I gave on it here](https://www.youtube.com/watch?v=C4dMYHzW-SY). It's a little old (I refer to Julia 1.0 not being out yet, but Julia's up to 1.2 now), but the core arguments all still apply. \n", "\n", "To be clear, I wouldn't recommend jumping languages if you just have one function you need to speed up, but if you're doing work that causes you to have performance issues regularly, consider Julia. \n", "\n", "(Note: Julia also reaches peak speed when code is type stable, but is also super fast even if your code isn't type stable.)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## PyPy\n", "\n", "A quick note about something you may come across in the future: pypy! \n", "\n", "So while you might not realize it, there are actually several different back-end engines for running Python. Python itself is just a set of rules that dictate that code written in a certain way should behave in a certain way. BUt you can write different back-end engines that *implement* these rules in different ways. \n", "\n", "*By far* the dominant backend (and, honestly, the only one you should use because it's the most mature and most commonly used) is CPython, an implementation of Python written in C. But there are a number of other back-ends, like JPython (written in Java), IronPython (written in the .NET framework), etc. \n", "\n", "One of these is PyPy, an implementation of Python that is designed for speed. *When it works* PyPy is about 5x faster than CPython. That means that *the same code* run in CPython (what you're using) will run about 5x slower (on average) than when run in PyPy. \n", "\n", "BUT: PyPy has coverage problems. It works well with vanilla Python code, but it does not support core data science libraries like `numpy`. So... yes, it does provide speedups for free, but only if you aren't using things like `numpy`, which themselves offer much more than 5x speed ups. So it's unlikely to ever be a solution for you as a data scientist. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Parallelization\n", "\n", "If you've done all this and your code is *still* too slow, it's time to look into [parallelization, which we'll be doing soon!](parallelism.ipynb)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 4 }