Python Iterators and Generators
- Description: The iterator protocol (`__iter__`/`__next__`/`StopIteration`), `iter`/`next`, generator functions with `yield`, generator expressions, `yield from`, and the `itertools` building blocks
- My Notion Note ID: K2A-D1-11
- Created: 2022-10-12
- Updated: 2026-05-11
- License: Reuse is very welcome. Please credit Yu Zhang and link back to the original on yuzhang.io
Table of Contents
- 1. The Iterator Protocol
- 2. `iter()` and `next()`
- 3. Generators with `yield`
- 4. Generator Expressions
- 5. `yield from`
- 6. `itertools` Highlights
- 7. Iterators vs Iterables
1. The Iterator Protocol
- Iterable: pass to `iter()` to get an iterator
- Iterator: has `__next__()`, raises `StopIteration` when exhausted
class Countdown:
    def __init__(self, n): self.n = n
    def __iter__(self): return self  # iterable IS the iterator
    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1

for x in Countdown(3):
    print(x)  # 3, 2, 1
`for x in obj:` desugars to:
it = iter(obj)
while True:
    try:
        x = next(it)
    except StopIteration:
        break
    ...  # loop body
- Containers (`list`, `tuple`, `dict`, `set`, `str`) are iterables but NOT iterators
- Each `iter(container)` creates a fresh iterator → looping twice works
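A minimal sketch of a reusable iterable, in contrast to the self-iterating `Countdown` above: `__iter__` is written as a generator function, so every `iter()` call returns a fresh iterator. The class name `CountdownRange` is made up for illustration.

```python
class CountdownRange:
    """Reusable iterable: each iter() call hands out an independent iterator."""

    def __init__(self, n: int) -> None:
        self.n = n

    def __iter__(self):
        # A generator function as __iter__: every call builds a new generator,
        # so the object itself carries no iteration state.
        current = self.n
        while current > 0:
            yield current
            current -= 1


countdown = CountdownRange(3)
first_pass = list(countdown)
second_pass = list(countdown)  # works again, unlike a one-shot iterator
print(first_pass, second_pass)
```

This mirrors how the built-in containers behave: the container stays untouched and only the iterator it hands out gets exhausted.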
2. iter() and next()
it = iter([1, 2, 3])
next(it) # 1
next(it) # 2
next(it) # 3
next(it)          # raises StopIteration (iterator is exhausted)
next(it, "done")  # "done", default suppresses StopIteration
# Two-argument form: call a zero-arg callable until it returns sentinel
for line in iter(input, "STOP"):  # read lines until user types STOP
    process(line)  # process: placeholder for your own handler
- Two-arg `iter()` often pairs with `f.readline` (sentinel `""`) or, for fixed-size reads, `functools.partial(f.read, size)` (sentinel `b""` in binary mode)
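The fixed-size read idiom can be sketched as follows. An in-memory `io.BytesIO` stands in for a real binary file, and the 4-byte chunk size is an arbitrary choice for the example.

```python
import io
from functools import partial

# In-memory stand-in for a file opened with open(path, "rb").
stream = io.BytesIO(b"abcdefghij")

# partial(stream.read, 4) is a zero-argument callable. read() returns b""
# at end of file, which is the sentinel that stops the iteration.
chunks = list(iter(partial(stream.read, 4), b""))
print(chunks)  # [b'abcd', b'efgh', b'ij']
```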
3. Generators with yield
- A generator function has one or more `yield` expressions
- Calling it doesn't run the body; it returns a generator iterator
- Each `next()` advances to the next `yield`
def counts(start: int):
    while True:
        yield start
        start += 1

g = counts(10)
next(g)  # 10
next(g)  # 11
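To make the "calling it doesn't run the body" point observable, here is a small sketch that records when each part of a generator body executes. The `traced` function and the `events` list are illustrative names, not part of any library.

```python
events = []

def traced():
    events.append("start")    # runs only on the first next()
    yield 1
    events.append("between")  # runs on the second next()
    yield 2
    events.append("end")      # runs when the generator is driven to exhaustion

gen = traced()
events_after_call = list(events)   # nothing has executed yet

first = next(gen)
events_after_first = list(events)  # body ran up to the first yield

second = next(gen)
leftover = next(gen, None)         # body finishes; the default replaces StopIteration
print(events_after_call, events_after_first, events)
```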
- Express lazy / infinite sequences without materializing a list:
def read_lines(path):
    with open(path) as f:
        for line in f:
            yield line.rstrip()

for line in read_lines("big.log"):
    if "ERROR" in line:
        print(line)
- Closest C++ analog: `co_yield` in C++20 coroutines (PEP 255 generators shipped in Python 2.2 in 2001, predating C++20 by almost two decades)
4. Generator Expressions
- Same syntax as a list comprehension but with `()`; lazy, nothing is computed until iterated
squares = (x * x for x in range(10**6)) # constant memory
total = sum(squares)
# Passing a genexp as the only argument lets you drop the inner ()
total = sum(x * x for x in range(10**6))
- Pipelines ending in `sum`/`min`/`max`/`any`/`all`/`set`/`dict` → prefer a genexp over a list comp to skip the intermediate list
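Besides saving memory, a generator expression lets short-circuiting consumers such as `any` stop early. The sketch below counts how many elements get examined in each style; `is_negative` and `checked` are hypothetical helpers added only to make the difference visible.

```python
checked = []

def is_negative(x):
    checked.append(x)  # record every element that actually gets tested
    return x < 0

values = [3, -1, 4, 1, 5]

# Generator expression: any() stops pulling items at the first True.
found_lazy = any(is_negative(x) for x in values)
checked_lazy = list(checked)

checked.clear()

# List comprehension: the whole list is built before any() sees it.
found_eager = any([is_negative(x) for x in values])
checked_eager = list(checked)

print(checked_lazy, checked_eager)
```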
5. yield from
- Delegates iteration to another iterable, propagating values and `send()`/`throw()` correctly
def chain(*iterables):
    for it in iterables:
        yield from it

list(chain([1, 2], (3, 4), "ab"))  # [1, 2, 3, 4, 'a', 'b']
- PEP 380 (Python 3.3+). Unlike `for x in it: yield x`, it also captures the subgenerator's `return` value as the value of the `yield from` expression
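A small sketch of the return-value forwarding: the subgenerator's `return` becomes the value of the `yield from` expression in the delegating generator. The names `running_totals` and `outer`, and the sample numbers, are made up for the example.

```python
def running_totals():
    """Subgenerator: yields running totals, then returns the mean."""
    total = 0
    count = 0
    for value in (10, 20, 30):
        total += value
        count += 1
        yield total
    return total / count  # carried on StopIteration.value

def outer(results):
    # yield from re-yields every value and evaluates to the return value.
    mean = yield from running_totals()
    results.append(mean)

results = []
totals = list(outer(results))
print(totals, results)  # [10, 30, 60] [20.0]
```

A plain `for x in running_totals(): yield x` loop would re-yield the same totals but silently drop the mean.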
6. itertools Highlights
from itertools import (
    count, cycle, repeat,          # infinite
    chain, islice, tee,            # combining/slicing
    starmap, accumulate,           # adapters
    groupby, takewhile, dropwhile,
    product, permutations, combinations,
)
count(10, 2) # 10, 12, 14, ...
cycle("AB") # A, B, A, B, ...
repeat(0, 5) # 0, 0, 0, 0, 0
chain([1, 2], [3, 4]) # 1, 2, 3, 4
islice(count(), 5, 15) # 5..14
accumulate([1, 2, 3, 4]) # 1, 3, 6, 10 (running sum)
accumulate([1, 2, 3, 4], max) # 1, 2, 3, 4 (running max)
# groupby groups *consecutive* equal keys, sort first if you want global groups
# (records: any iterable of objects with a .team attribute)
data = sorted(records, key=lambda r: r.team)
for team, members in groupby(data, key=lambda r: r.team):
    print(team, list(members))
product([0, 1], repeat=3) # (0,0,0), (0,0,1), ..., (1,1,1)
permutations("ABC", 2) # ('A','B'), ('A','C'), ...
combinations("ABCD", 2) # ('A','B'), ('A','C'), ('A','D'), ('B','C'), ...
- The functional-pipeline answer to a lot of C++ range-v3 code
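The import list above includes `takewhile`, `dropwhile`, and `starmap` without showing them in use. Here is a brief sketch with made-up sample data, plus `islice` applied to an infinite generator expression.

```python
from itertools import count, dropwhile, islice, starmap, takewhile

readings = [1, 3, 5, 12, 4, 2]

# takewhile yields items until the predicate first fails, then stops for good.
head = list(takewhile(lambda x: x < 10, readings))

# dropwhile skips items while the predicate holds, then yields everything else.
tail = list(dropwhile(lambda x: x < 10, readings))

# starmap unpacks each tuple into positional arguments: pow(2, 3), pow(3, 2), ...
powers = list(starmap(pow, [(2, 3), (3, 2), (10, 0)]))

# islice caps an infinite pipeline: squares of the first four even numbers.
even_squares = list(islice((n * n for n in count() if n % 2 == 0), 4))

print(head, tail, powers, even_squares)
```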
7. Iterators vs Iterables
| Type | Has `__iter__` | Has `__next__` | Reusable? |
|---|---|---|---|
| Iterable | yes | no (typically) | yes, many iterations |
| Iterator | yes (returns self) | yes | no, single-consumption |
Examples:
- Iterable: `list`, `tuple`, `dict`, `set`, `str`, `range`
- Iterator: generators, `iter(list)`, file objects, the return values of `zip`/`map`/`enumerate`/`reversed`
xs = [1, 2, 3]
list(xs); list(xs) # [1,2,3] twice, list is reusable
z = zip([1,2,3], [4,5,6])
list(z); list(z) # first [(1,4),(2,5),(3,6)], second []
- To consume a generator twice: materialize with `list(gen)`, or use `itertools.tee` (with memory cost)
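A minimal sketch of the `itertools.tee` option, assuming a small generator named `numbers` invented for the example. Once `tee` has been called, the original generator should not be advanced directly, or the copies will miss items.

```python
from itertools import tee

def numbers():
    yield from (1, 2, 3, 4)

# tee returns independent iterators that share one underlying source.
a, b = tee(numbers())

total = sum(a)    # consumes a fully; tee buffers the items b has not seen yet
largest = max(b)  # b replays the buffered items

print(total, largest)
```

When one copy runs far ahead of the other, as here, the buffer ends up holding every item, so `list(gen)` is usually the simpler choice in that case.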