The Object dtype and Performance

The Object dtype and Performance#

When you’re working with a pandas Series or numpy array of integers, all of the entries of those arrays live in the same place in memory. This is because it is easy for a computer to put, say, 100 integers side by side, since it knows that each integer requires exactly 64 bits and so it can put them all end to end and tell the computer that each number starts 64 bits after the last.

But an object array is a little different. Instead of laying all the contents of the array in one place, an object array puts a collection of memory addresses in one place. This is because objects can be of arbitrary size, so you can’t put them all in one place in memory reliably. For example, suppose you had an array that had the following strings: 'this', 'is', 'an', 'array'. The first entry is longer than the second, is the same as the third, and is shorter than the third. You can’t just put them end-to-end and tell the computer “each number starts X bits after the last”.

But you can do that with memory addresses. So an object array is a collection of memory addresses (each the same size) laid end to end.

But that means that to look at the second entry of an object array, you have to go to the second location in the array, read the address, then go to that address. Moreover, because object arrays can hold anything, you could have an integer in the first entry, and a string in the second entry, so you can’t assume that when you see code like:

my_array * 2

that * will mean the same thing for each entry. In one entry, you might be doing integer multiplication; in another you might be doing floating point multiplication, in another you may be doubling up a list!

Indeed we can see this is we make a numpy object array full of ints and compare it to a numpy integer array. The both have the same content, but they are organized in memory differently:

import numpy as np

object_numbers = np.array(np.arange(1000000), dtype="object")
numbers = np.arange(1000000)
%timeit object_numbers * 2
15.5 ms ± 155 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit numbers * 2
683 µs ± 15.7 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

See? Same operation (doubling each entry of arrays with the integers from 0 to 1,000,000), but the object array operation is ~25x slower.