
Python itertools

The itertools module provides memory-efficient building blocks for working with iterators. These functions produce results lazily: they generate values on demand rather than building entire lists in memory. This matters enormously when working with large datasets.

Why Itertools?

Instead of building a list like [(x, y) for x in range(1000) for y in range(1000)] (1,000,000 tuples in memory), you can iterate over itertools.product(range(1000), range(1000)) and produce one pair at a time.
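The contrast can be sketched as follows. This is a minimal illustration that uses a smaller range so it runs quickly; the variable names are hypothetical, and sys.getsizeof reports only each object's own footprint, not the elements it refers to.

```python
import itertools
import sys

# Eager: the list comprehension materializes every pair up front.
eager_pairs = [(x, y) for x in range(100) for y in range(100)]

# Lazy: product() returns an iterator that yields one pair at a time.
lazy_pairs = itertools.product(range(100), range(100))

print(len(eager_pairs))   # 10000
print(next(lazy_pairs))   # (0, 0)
print(next(lazy_pairs))   # (0, 1)

# The iterator object itself is far smaller than the fully built list.
print(sys.getsizeof(lazy_pairs) < sys.getsizeof(eager_pairs))  # True
```

An iterator can only be consumed once, so wrap it in list() only when you really need random access or multiple passes.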

itertools.chain

Chains multiple iterables together into a single stream:
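A short sketch of chain, with made-up sample data. It also shows chain.from_iterable, which does the same job when the iterables are themselves held in a single iterable.

```python
import itertools

letters = ["a", "b"]
numbers = (1, 2)
more = range(3, 5)

# chain() walks each iterable in turn without copying them into a new list.
combined = list(itertools.chain(letters, numbers, more))
print(combined)  # ['a', 'b', 1, 2, 3, 4]

# chain.from_iterable() flattens one level of nesting.
nested = [[1, 2], [3], [4, 5]]
flat = list(itertools.chain.from_iterable(nested))
print(flat)  # [1, 2, 3, 4, 5]
```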

itertools.product

The Cartesian product, i.e. every combination of one element from each of several iterables:
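For illustration, a small sketch with hypothetical size and color lists. The repeat argument takes the product of an iterable with itself.

```python
import itertools

sizes = ["S", "M"]
colors = ["red", "blue"]

# Equivalent to a nested for-loop: the rightmost iterable varies fastest.
variants = list(itertools.product(sizes, colors))
print(variants)
# [('S', 'red'), ('S', 'blue'), ('M', 'red'), ('M', 'blue')]

# repeat=2 is shorthand for product([0, 1], [0, 1]).
bits = list(itertools.product([0, 1], repeat=2))
print(bits)
# [(0, 0), (0, 1), (1, 0), (1, 1)]
```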

itertools.combinations

All combinations of r elements from an iterable (order doesn't matter, no repetition):
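As a quick sketch with illustrative data, choosing 2 items out of 3 gives 3 combinations, which matches math.comb(3, 2):

```python
import itertools
import math

items = ["a", "b", "c"]

# Each pair appears once; ('b', 'a') is not emitted because order is ignored.
pairs = list(itertools.combinations(items, 2))
print(pairs)  # [('a', 'b'), ('a', 'c'), ('b', 'c')]

# The count matches "n choose r".
print(len(pairs) == math.comb(len(items), 2))  # True
```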

itertools.permutations

All orderings of r elements (order matters):
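A comparable sketch for permutations, again with illustrative data. The same three items now yield six results, because ('a', 'b') and ('b', 'a') count as different orderings.

```python
import itertools

items = ["a", "b", "c"]

# Every ordered selection of 2 distinct positions from the input.
ordered_pairs = list(itertools.permutations(items, 2))
print(ordered_pairs)
# [('a', 'b'), ('a', 'c'), ('b', 'a'), ('b', 'c'), ('c', 'a'), ('c', 'b')]

# Omitting r permutes the whole input: 3! = 6 orderings.
full = list(itertools.permutations(items))
print(len(full))  # 6
```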

itertools.groupby

Groups consecutive elements by a key function. Important: the input must be sorted by the same key, or you'll get multiple groups for the same key value:
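The sorting requirement is easiest to see side by side. In this sketch the word list and the first_letter helper are hypothetical.

```python
import itertools

words = ["apple", "blueberry", "avocado", "banana", "cherry"]

def first_letter(word):
    """Key function: group words by their initial letter."""
    return word[0]

# Unsorted input: 'a' and 'b' each appear as two separate groups.
unsorted_keys = [key for key, _ in itertools.groupby(words, key=first_letter)]
print(unsorted_keys)  # ['a', 'b', 'a', 'b', 'c']

# Sorting by the same key first yields exactly one group per letter.
sorted_words = sorted(words, key=first_letter)
grouped = {
    key: list(group)  # consume each group before advancing to the next
    for key, group in itertools.groupby(sorted_words, key=first_letter)
}
print(grouped)
# {'a': ['apple', 'avocado'], 'b': ['blueberry', 'banana'], 'c': ['cherry']}
```

Each group is itself a lazy iterator that shares the underlying input, so convert it to a list (or otherwise consume it) before moving on to the next key.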

itertools.islice

Slices an iterator (like s[start:stop:step] but for any iterable):
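A minimal sketch, assuming a hypothetical infinite generator named naturals. Unlike list slicing, islice does not accept negative indices, and it consumes the underlying iterator as it goes.

```python
import itertools

def naturals():
    """Hypothetical infinite generator: 0, 1, 2, ..."""
    n = 0
    while True:
        yield n
        n += 1

# Take the first five values without ever building the full sequence.
first_five = list(itertools.islice(naturals(), 5))
print(first_five)  # [0, 1, 2, 3, 4]

# start=2, stop=10, step=3 picks the items at positions 2, 5, and 8.
stepped = list(itertools.islice(naturals(), 2, 10, 3))
print(stepped)  # [2, 5, 8]
```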

itertools.cycle

Cycles through an iterable indefinitely:
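Because cycle never stops on its own, it needs to be bounded by something else. Two common ways are sketched here with illustrative data: islice, or zip with a finite iterable.

```python
import itertools

# Bound the infinite cycle with islice.
colors = itertools.cycle(["red", "green", "blue"])
first_seven = list(itertools.islice(colors, 7))
print(first_seven)
# ['red', 'green', 'blue', 'red', 'green', 'blue', 'red']

# zip() stops when the shorter (finite) iterable is exhausted.
rows = ["r1", "r2", "r3", "r4"]
striped = list(zip(rows, itertools.cycle(["even", "odd"])))
print(striped)
# [('r1', 'even'), ('r2', 'odd'), ('r3', 'even'), ('r4', 'odd')]
```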

itertools.accumulate

Produces running totals (or any running aggregate):
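A short sketch with made-up numbers. With no second argument accumulate adds; passing any two-argument function (here max and operator.mul) changes the aggregate.

```python
import itertools
import operator

# Default behaviour: running sum.
deposits = [100, 50, -30, 20]
balances = list(itertools.accumulate(deposits))
print(balances)  # [100, 150, 120, 140]

# Running maximum.
running_max = list(itertools.accumulate([3, 1, 4, 1, 5], max))
print(running_max)  # [3, 3, 4, 4, 5]

# Running product.
running_product = list(itertools.accumulate([1, 2, 3, 4], operator.mul))
print(running_product)  # [1, 2, 6, 24]
```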

itertools.takewhile and itertools.dropwhile

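takewhile yields elements until the predicate first fails, then stops; dropwhile skips elements until the predicate first fails, then yields everything that remains: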
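A minimal sketch with illustrative readings and a hypothetical is_small predicate. In both functions the predicate only matters up to its first failure, so later small values are not filtered out by dropwhile.

```python
import itertools

readings = [1, 3, 5, 8, 2, 4]

def is_small(value):
    """Hypothetical predicate: True for values below 6."""
    return value < 6

# Yields the leading run of matching items and stops at 8.
leading = list(itertools.takewhile(is_small, readings))
print(leading)  # [1, 3, 5]

# Skips the leading run, then yields everything else, including 2 and 4.
rest = list(itertools.dropwhile(is_small, readings))
print(rest)  # [8, 2, 4]
```

If the goal is to keep or discard every matching element regardless of position, use filter() or itertools.filterfalse() instead.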

Combining Itertools: A Data Pipeline
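One way the functions above might be combined is sketched below. The sales records, region names, and pipeline steps are hypothetical, chosen only to show chain, groupby, islice, and accumulate working together.

```python
import itertools
from operator import itemgetter

# Hypothetical (region, amount) records from two separate sources.
store_sales = [("east", 120), ("west", 80), ("east", 40)]
online_sales = [("west", 60), ("north", 200), ("east", 10)]

# 1. chain: treat both sources as a single stream.
all_sales = itertools.chain(store_sales, online_sales)

# 2. groupby needs its input sorted by the grouping key.
by_region = sorted(all_sales, key=itemgetter(0))

# 3. groupby: total the amounts for each region.
totals = [
    (region, sum(amount for _, amount in group))
    for region, group in itertools.groupby(by_region, key=itemgetter(0))
]
print(totals)  # [('east', 170), ('north', 200), ('west', 140)]

# 4. islice: keep only the two largest regions.
ranked = sorted(totals, key=itemgetter(1), reverse=True)
top_two = list(itertools.islice(ranked, 2))
print(top_two)  # [('north', 200), ('east', 170)]

# 5. accumulate: running total across the ranked regions.
running = list(itertools.accumulate(amount for _, amount in ranked))
print(running)  # [200, 370, 510]
```

Note that sorted() builds a full list, so this pipeline is only lazy up to the sort. For data that does not fit in memory, the grouping step would need pre-sorted input or a different aggregation strategy.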

Knowledge Check

What is the key requirement for `itertools.groupby` to produce correct groups?

What is the difference between `itertools.combinations(['a','b','c'], 2)` and `itertools.permutations(['a','b','c'], 2)`?

Why is `itertools.islice(my_generator, 5)` preferred over converting the generator to a list and slicing?