Here’s a versatile Python function that splits any iterable into chunks of
size n, as a generator. It works on plain sequences such as lists, but it can
also take a generator and chunk it lazily, without unwinding the whole thing
up front. This is quite useful when dealing with a database cursor, for
example: you can iterate through the cursor in chunks of size n without having
to load all of the objects into memory at once.
The function is generically typed so that your linter and IDE can follow the
types going in and out of it.
from itertools import chain, islice
from typing import Generator, Iterable, Iterator, TypeVar

Item = TypeVar("Item")


def chunks(
    iterable: Iterable[Item], chunk_size: int
) -> Generator[Iterator[Item], None, None]:
    iterator = iter(iterable)
    for first in iterator:
        # Re-attach the item we just pulled, then lazily take up to
        # chunk_size - 1 more items from the same underlying iterator.
        yield chain([first], islice(iterator, chunk_size - 1))

Note that each yielded chunk is itself a lazy iterator, so consume each chunk
fully before advancing to the next one.
Here’s a test case for it:
def test_chunks_generator():
    # Given we have an iterable generator of 16 items;
    def item_generator():
        for i in range(16):
            yield i

    # When we split it into chunks of size 5;
    chunks_generator = chunks(item_generator(), 5)

    # Then we should get 3 chunks of 5 items each and one final chunk.
    assert [[item for item in chunk] for chunk in chunks_generator] == [
        [0, 1, 2, 3, 4],
        [5, 6, 7, 8, 9],
        [10, 11, 12, 13, 14],
        [15],
    ]
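And here’s a sketch of the cursor-style use case from the introduction. The `fetch_rows` generator below is a hypothetical stand-in for a real database cursor; the point is that only one batch of rows is materialized in memory at a time:

```python
from itertools import chain, islice
from typing import Generator, Iterable, Iterator, TypeVar

Item = TypeVar("Item")


def chunks(
    iterable: Iterable[Item], chunk_size: int
) -> Generator[Iterator[Item], None, None]:
    # Same function as above, repeated so this sketch is self-contained.
    iterator = iter(iterable)
    for first in iterator:
        yield chain([first], islice(iterator, chunk_size - 1))


def fetch_rows(total: int) -> Iterator[int]:
    # Hypothetical stand-in for a database cursor: yields rows one at a
    # time instead of loading the whole result set into memory.
    for row_id in range(total):
        yield row_id


batch_sizes = []
for batch in chunks(fetch_rows(2500), 1000):
    rows = list(batch)  # only this batch is held in memory
    batch_sizes.append(len(rows))

print(batch_sizes)  # → [1000, 1000, 500]
```

In a real application, `rows` would be handed to a bulk insert or a per-batch processing step before the loop pulls the next batch from the cursor.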