Python Iterators and Generators


  • Description: The iterator protocol (__iter__/__next__/StopIteration), iter/next, generator functions with yield, generator expressions, yield from, and the itertools building blocks
  • My Notion Note ID: K2A-D1-11
  • Created: 2022-10-12
  • Updated: 2026-05-11
  • License: Reuse is very welcome. Please credit Yu Zhang and link back to the original on yuzhang.io

1. The Iterator Protocol

  • Iterable: defines __iter__(); pass it to iter() to get an iterator
  • Iterator: defines __next__(), which returns the next item or raises StopIteration when exhausted; its __iter__() returns self
class Countdown:
    def __init__(self, n): self.n = n
    def __iter__(self): return self            # iterable IS the iterator
    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1

for x in Countdown(3):
    print(x)        # 3, 2, 1

A for x in obj: loop desugars roughly to:

it = iter(obj)
while True:
    try:
        x = next(it)
    except StopIteration:
        break
    ...
  • Containers (list, tuple, dict, set, str) are iterables but NOT iterators
  • Each iter(container) creates a fresh iterator → loop twice works
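
A minimal sketch of the two bullets above, plus one common way to make a class re-iterable: write __iter__ as a generator function so every iter() call gets fresh state. The name CountdownIterable is hypothetical and only for illustration.

```python
nums = [1, 2, 3]

# A list is iterable but not an iterator: it has no __next__.
list_is_iterator = hasattr(nums, "__next__")    # False

# Each iter() call returns a new, independent iterator.
a, b = iter(nums), iter(nums)
first_a = next(a)       # advances only `a`
first_b = next(b)       # `b` still starts from the beginning

# An iterator returns itself from iter(), so it cannot be restarted.
same_object = iter(a) is a                      # True


class CountdownIterable:
    """Hypothetical re-iterable variant of Countdown."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # Generator function: a fresh generator (and fresh k) per iter() call.
        k = self.n
        while k > 0:
            yield k
            k -= 1


c = CountdownIterable(3)
first_pass = list(c)
second_pass = list(c)   # works again, unlike the self-returning Countdown
```

The trade-off: the self-returning Countdown is shorter, but a second loop over the same object silently yields nothing.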

2. iter() and next()

it = iter([1, 2, 3])
next(it)            # 1
next(it)            # 2
next(it)            # 3
next(it)            # raises StopIteration (iterator is exhausted)
next(it, "done")    # "done", default suppresses StopIteration

# Two-argument form: call a zero-arg callable until it returns sentinel
for line in iter(input, "STOP"):     # read lines until user types STOP
    process(line)
  • Two-arg iter() often pairs with functools.partial(f.read, size) and a b"" sentinel to read a binary file in fixed-size chunks
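
One common form of the chunked-read idiom, as a sketch: functools.partial turns f.read(size) into the zero-argument callable that two-arg iter() needs. io.BytesIO stands in for a real file here (an assumption made to keep the example self-contained), and the chunk size of 4 is arbitrary.

```python
import io
from functools import partial

# Stand-in for a real binary file, e.g. open(path, "rb").
f = io.BytesIO(b"abcdefghij")

# f.read(4) returns b"" at end of file, which is the sentinel that stops iteration.
chunks = list(iter(partial(f.read, 4), b""))
# chunks == [b"abcd", b"efgh", b"ij"]
```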

3. Generators with yield

  • A generator function has one or more yield expressions
  • Calling it doesn't run the body; it returns a generator iterator
  • Each next() runs the body up to the next yield and returns the yielded value
def counts(start: int):
    while True:
        yield start
        start += 1

g = counts(10)
next(g)          # 10
next(g)          # 11
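
To see that the body really is deferred, here is a small sketch that records side effects in a list. The names traced and events are hypothetical.

```python
events = []

def traced():
    events.append("start")
    yield 1
    events.append("between")
    yield 2
    events.append("end")

g = traced()
after_call = list(events)     # [] : calling traced() ran none of the body

v1 = next(g)                  # runs up to the first yield
after_first = list(events)    # ["start"]

v2 = next(g)                  # resumes after the first yield
rest = next(g, None)          # body finishes; the default replaces StopIteration
```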
  • Express lazy / infinite sequences without materializing a list:
def read_lines(path):
    with open(path) as f:
        for line in f:
            yield line.rstrip()

for line in read_lines("big.log"):
    if "ERROR" in line:
        print(line)
  • Closest C++ analog: co_yield in C++20 coroutines (PEP 255 generators, Python 2.2 in 2001, predate them by ~19 years)

4. Generator Expressions

  • Same syntax as a list comprehension but with () instead of []; lazy, so nothing is computed until iterated
squares = (x * x for x in range(10**6))   # constant memory
total = sum(squares)

# Passing a genexp as the only argument lets you drop the inner ()
total = sum(x * x for x in range(10**6))
  • Pipelines ending in sum/min/max/any/all/set/dict → prefer genexp over list comp to skip the intermediate list
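
A rough illustration of why this matters. The exact byte counts from sys.getsizeof are CPython implementation details and will vary; is_big and checked are hypothetical names used to show how much work any() actually does.

```python
import sys

n = 100_000
as_list = [x * x for x in range(n)]
as_gen = (x * x for x in range(n))

list_bytes = sys.getsizeof(as_list)   # grows with n (pointer array only)
gen_bytes = sys.getsizeof(as_gen)     # small and independent of n

# any() stops at the first truthy item, so the genexp never
# evaluates the remaining elements.
checked = []

def is_big(x):
    checked.append(x)
    return x > 2

found = any(is_big(x) for x in range(1_000_000))
# checked == [0, 1, 2, 3]
```

With a list comprehension inside any(), is_big would have been called a million times before any() saw the first value.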

5. yield from

  • Delegates iteration to another iterable; when that iterable is a generator, send() and throw() are forwarded to it as well
def chain(*iterables):
    for it in iterables:
        yield from it

list(chain([1, 2], (3, 4), "ab"))   # [1, 2, 3, 4, 'a', 'b']
  • PEP 380 (3.3+); the subgenerator's return value becomes the value of the yield from expression, which a plain for x in it: yield x loop cannot capture
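
A small sketch of the return-value forwarding. The names summarize and outer are hypothetical.

```python
def summarize():
    """Subgenerator: yields each item, then returns their mean."""
    total = 0
    count = 0
    for x in (10, 20, 30):
        total += x
        count += 1
        yield x
    return total / count          # carried out on StopIteration.value

def outer():
    # The yield from expression evaluates to the subgenerator's return value.
    mean = yield from summarize()
    yield f"mean={mean}"

result = list(outer())
# result == [10, 20, 30, "mean=20.0"]
```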

6. itertools Highlights

from itertools import (
    count, cycle, repeat,          # infinite
    chain, islice, tee,            # combining/slicing
    starmap, accumulate,           # adapters
    groupby, takewhile, dropwhile,
    product, permutations, combinations,
)

count(10, 2)             # 10, 12, 14, ...
cycle("AB")              # A, B, A, B, ...
repeat(0, 5)             # 0, 0, 0, 0, 0
chain([1, 2], [3, 4])    # 1, 2, 3, 4
islice(count(), 5, 15)   # 5..14

accumulate([1, 2, 3, 4])         # 1, 3, 6, 10 (running sum)
accumulate([1, 2, 3, 4], max)    # 1, 2, 3, 4 (running max)

# groupby groups *consecutive* equal keys; sort first if you want global groups
# (records is assumed to be an iterable of objects with a .team attribute)
data = sorted(records, key=lambda r: r.team)
for team, members in groupby(data, key=lambda r: r.team):
    print(team, list(members))

product([0, 1], repeat=3)        # (0,0,0), (0,0,1), ..., (1,1,1)
permutations("ABC", 2)           # ('A','B'), ('A','C'), ...
combinations("ABCD", 2)          # ('A','B'), ('A','C'), ('A','D'), ('B','C'), ...
  • The functional-pipeline answer to a lot of C++ range-v3 code
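
The listing above imports takewhile, dropwhile, and starmap without showing them, so here is a short sketch with small made-up inputs, followed by a lazy pipeline over an infinite source.

```python
from itertools import count, dropwhile, groupby, islice, starmap, takewhile

taken = list(takewhile(lambda x: x < 3, [1, 2, 3, 1]))    # stops at first failure
dropped = list(dropwhile(lambda x: x < 3, [1, 2, 3, 1]))  # drops the leading run only
powers = list(starmap(pow, [(2, 3), (3, 2)]))             # pow(2, 3), pow(3, 2)

# Without sorting, groupby yields one group per consecutive run.
runs = [(key, len(list(group))) for key, group in groupby("aabbbaa")]

# Lazy pipeline: the first three odd squares greater than 50.
squares = (n * n for n in count(1))             # infinite
odd = (s for s in squares if s % 2)
big = dropwhile(lambda s: s <= 50, odd)
first_three = list(islice(big, 3))              # islice makes it finite
```

Nothing is computed until list() pulls values through; without islice the last line would never terminate.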

7. Iterators vs Iterables

Type      Has __iter__        Has __next__      Reusable?
Iterable  yes                 no (typically)    yes, many iterations
Iterator  yes (returns self)  yes               no, single-consumption

Examples:

  • Iterable: list, tuple, dict, set, str, range
  • Iterator: generators, iter(list), file objects, the objects returned by zip/map/enumerate/reversed
xs = [1, 2, 3]
list(xs); list(xs)              # [1,2,3] twice, list is reusable

z = zip([1,2,3], [4,5,6])
list(z); list(z)                # first [(1,4),(2,5),(3,6)], second []
  • To consume a generator twice: materialize with list(gen), or use itertools.tee (it buffers every item one branch has consumed and the other has not, so memory grows if the branches drift apart)
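
Both options as a sketch. The generator numbers is a hypothetical stand-in for any one-shot source.

```python
from itertools import tee

def numbers():
    yield from (1, 2, 3)

# Option 1: materialize once, then reuse the list freely.
items = list(numbers())
total, largest = sum(items), max(items)

# Option 2: tee splits one iterator into independent branches.
# Do not touch the original iterator after calling tee.
a, b = tee(numbers())
next(b, None)                 # advance b by one item
pairs = list(zip(a, b))       # adjacent pairs: [(1, 2), (2, 3)]
```

tee pays off when the branches stay close together, as in the pairwise pattern above. If one branch would be fully consumed before the other starts, list() is simpler and costs the same memory.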