Python iterator chunk generator function
Here’s a versatile Python function that splits any iterable into chunks of size n and yields them from a generator. It accepts simple iterables such as lists, but it can also take a generator and chunk it lazily, without consuming the whole thing up front.
This is quite useful when dealing with a database cursor, for example. You can iterate through the cursor in chunks of size n without having to load all of the objects into memory at once.
The function is generically typed so that your linter and IDE can follow the types going in and out of it.
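from itertools import chain, islice
from typing import Generator, Iterable, List, TypeVar

Item = TypeVar("Item")


def chunks(
    iterable: Iterable[Item], chunk_size: int
) -> Generator[List[Item], None, None]:
    # Use a single iterator so each chunk continues where the previous one stopped.
    iterator = iter(iterable)
    for first in iterator:
        # Build each chunk as a list so it matches the declared return type and
        # does not depend on the caller consuming it before asking for the next.
        yield list(chain([first], islice(iterator, chunk_size - 1)))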
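The database cursor case mentioned earlier might look roughly like the sketch below. It is only an illustration: the in-memory SQLite database, the `users` table and the `batch_sizes` list are hypothetical names made up for this example, and a list-yielding version of `chunks` is repeated so the snippet runs on its own.

```python
import sqlite3
from itertools import chain, islice
from typing import Iterable, Iterator, List, TypeVar

Item = TypeVar("Item")


def chunks(iterable: Iterable[Item], chunk_size: int) -> Iterator[List[Item]]:
    # List-yielding variant of the function from this post, repeated for a standalone sketch.
    iterator = iter(iterable)
    for first in iterator:
        yield list(chain([first], islice(iterator, chunk_size - 1)))


# Hypothetical in-memory table, used only for illustration.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
connection.executemany(
    "INSERT INTO users (id) VALUES (?)", [(i,) for i in range(1, 8)]
)

# A sqlite3 cursor is iterable, so it can be passed to chunks() directly.
cursor = connection.execute("SELECT id FROM users ORDER BY id")

batch_sizes = []
for batch in chunks(cursor, 3):
    # Each batch is a list holding at most 3 rows from the cursor.
    batch_sizes.append(len(batch))

connection.close()
```

With 7 rows and a chunk size of 3, the loop sees batches of 3, 3 and 1 rows. How much the driver itself buffers behind the cursor depends on the database library, so the memory benefit is an assumption worth checking for your own setup.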
Here’s a test case for it:
def test_chunks_generator():
    # Given we have an iterable generator of 16 items;
    def item_generator():
        for i in range(16):
            yield i

    # When we split it into chunks of size 5;
    chunks_generator = chunks(item_generator(), 5)

    # Then we should get 3 chunks of 5 items each and one final chunk.
    assert [[item for item in chunk] for chunk in chunks_generator] == [
        [0, 1, 2, 3, 4],
        [5, 6, 7, 8, 9],
        [10, 11, 12, 13, 14],
        [15]
    ]
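A few edge cases may also be worth covering. The tests below are a sketch rather than part of the original suite: the test names are made up, and they assume a version of `chunks` that yields lists, which is repeated here so the snippet is self-contained.

```python
from itertools import chain, islice
from typing import Iterable, Iterator, List, TypeVar

Item = TypeVar("Item")


def chunks(iterable: Iterable[Item], chunk_size: int) -> Iterator[List[Item]]:
    # List-yielding variant of the function from this post, repeated for a standalone sketch.
    iterator = iter(iterable)
    for first in iterator:
        yield list(chain([first], islice(iterator, chunk_size - 1)))


def test_chunks_empty_input():
    # An empty iterable should produce no chunks at all.
    assert list(chunks([], 5)) == []


def test_chunks_exact_multiple():
    # When the length divides evenly, there should be no trailing empty chunk.
    assert list(chunks(range(6), 3)) == [[0, 1, 2], [3, 4, 5]]


def test_chunks_size_one():
    # A chunk size of 1 should wrap every item in its own single-item list.
    assert list(chunks("abc", 1)) == [["a"], ["b"], ["c"]]
```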