Python Generators Part II - Under the hood

The main concept that has helped me grasp and use generators is the idea that a generator is an iterable function. I’m going to poke around briefly in the repl to show what methods are available to generators, and how Python uses them.

What is the original range at its core? It’s a function that returns a list of numbers. What is xrange at its core? It’s a function that…what? Brings one element into memory at a time. Okay, but what does that mean?

Let’s look. I’m going to switch to iPython because it’s much easier on the eyes. If you’ve never used iPython before, don’t worry. It’s just a repl with a different look:

In [1]: a = xrange(5)

In [2]: type(a)
Out[2]: xrange

In [3]: b = iter(a)

In [4]: type(b)
Out[4]: rangeiterator

In [5]: dir(b)
Out[5]:
['__class__',
 '__delattr__',
 '__doc__',
 '__format__',
 '__getattribute__',
 '__hash__',
 '__init__',
 '__iter__',
 '__length_hint__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'next']

In [6]: next(b)
Out[6]: 0

In [7]: next(b)
Out[7]: 1

It looks like xrange returns an ‘xrange’ object which is an iterable. An iterable is an object that can return an iterator. An iterator has the method next().

On the other hand, (the original) range returns a list, which is an iterable. Let’s imagine you want to use range in a for-loop like above:

In [8]: squares = []

In [9]: for x in range(5):
   ...:     squares.append(x)
   ...:

In [10]: squares
Out[10]: [0, 1, 2, 3, 4]

You’ll remember this from the article on iterables and iterators. This is roughly what’s happening when you iterate over a list, like in the above code:

In [11]: rng = range(5)

In [12]: iterator_rng = iter(rng)

In [13]: squares = []

In [14]: while 1:
    ...:     try:
    ...:         sqr = next(iterator_rng) ** 2
    ...:         squares.append(sqr)
    ...:     except StopIteration:
    ...:         break
    ...:

In [15]: squares
Out[16]: [0, 1, 4, 9, 16]

What about generators? I’ll use this custom generator_range function:

In [19]: def generator_range(n):
    ...:     x = 0
    ...:     while x < n:
    ...:         yield x
    ...:         x += 1
    ...:

In [20]: rng = generator_range(5)

In [21]: rng
Out[21]: <generator object generator_range at 0x7fac65d006e0>

In [22]: dir(rng)
Out[22]:
['__class__',
 '__delattr__',
 '__doc__',
 '__format__',
 '__getattribute__',
 '__hash__',
 '__init__',
 '__iter__',
 '__name__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'close',
 'gi_code',
 'gi_frame',
 'gi_running',
 'next',
 'send',
 'throw']

In [23]: rng[3] # Nope
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-23-90eaee0d7e42> in <module>()
----> 1 rng[3]

TypeError: 'generator' object has no attribute '__getitem__'

In [24]: next(rng)
Out[24]: 0

In [25]: next(rng)
Out[25]: 1

In [26]: rng = generator_range(5) # Reset rng

In [27]: sum(rng)
Out[27]: 10

In [28]: next(rng) # The generator is exhausted after calling sum() on it
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-28-a4b881fa7ec6> in <module>()
----> 1 next(rng)

StopIteration:

In [29]:

The generator_range function returns a generator object. Generator objects have the next() method which, like any iterator, holds its present state.

To compare xrange to my custom generator_range function, it seems that, roughly: xrange returns an object is an iterable, which in turn returns an iterator, whereas generator_range returns a generator object which is an iterator.

Okay, enough with xrange.




Written on July 30, 2016