Generators and Iterators in Python (yield, Lazy Evaluation)

In Python, handling large amounts of data efficiently is a common requirement, especially in data processing, machine learning, and backend systems. This is where iterators and generators come into play. They provide a memory-efficient way to iterate over data without loading everything into memory at once.

Understanding Iterators

An iterator is an object that allows you to traverse through a sequence of elements one at a time. In Python, an iterator implements two special methods: __iter__() and __next__().

The __iter__() method returns the iterator object itself, while __next__() returns the next value in the sequence. When there are no more elements, it raises a StopIteration exception. The built-in iter() and next() functions call these methods for you. Let's look at a simple example:
numbers = [1, 2, 3]

iterator = iter(numbers)

print(next(iterator))  # 1
print(next(iterator))  # 2
print(next(iterator))  # 3
# print(next(iterator))  -- StopIteration exception
Output:
1
2
3
Here, the list is an iterable, and calling iter() on it returns an iterator. The next() function retrieves elements one by one.
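A for loop performs these same steps implicitly. The following is a minimal sketch of roughly what Python does behind the scenes when you write `for item in numbers:`:

```python
# Roughly what `for item in numbers:` does under the hood:
numbers = [1, 2, 3]
iterator = iter(numbers)

while True:
    try:
        item = next(iterator)     # fetch the next element
    except StopIteration:         # raised when the sequence is exhausted
        break
    print(item)
```

The loop quietly catches StopIteration for you, which is why you never see that exception when using a for loop directly.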

Iterable vs Iterator

It is important to distinguish between an iterable and an iterator.

An iterable is any object capable of returning an iterator. Examples include lists, tuples, strings, and dictionaries. An iterator, on the other hand, is the object that actually performs the iteration.
my_list = [10, 20, 30]

print(hasattr(my_list, '__iter__'))   # True (iterable)
print(hasattr(my_list, '__next__'))   # False (not iterator)

it = iter(my_list)

print(hasattr(it, '__next__'))        # True (iterator)

Creating a Custom Iterator

You can create your own iterator by defining a class with __iter__() and __next__() methods.
class CountUpTo:
    def __init__(self, max_value):
        self.max = max_value
        self.current = 1

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= self.max:
            value = self.current
            self.current += 1
            return value
        else:
            raise StopIteration

counter = CountUpTo(3)

for num in counter:
    print(num)
Output:
1
2
3
This demonstrates how iteration works internally in Python.
A dunder method (double underscore method) is a special Python method like __init__ or __str__ that defines how objects behave. Python calls these methods automatically for built-in operations (like object creation, printing, or iteration).

They let you customize how your objects interact with Python syntax (e.g., +, len(), for loop).
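As a small illustration, the hypothetical Playlist class below (not from the article's examples) defines two dunder methods so that len() and the `in` operator work on its instances:

```python
class Playlist:
    def __init__(self, songs):
        self.songs = songs

    def __len__(self):             # called automatically by len()
        return len(self.songs)

    def __contains__(self, song):  # called automatically by the `in` operator
        return song in self.songs

playlist = Playlist(["Song A", "Song B"])

print(len(playlist))         # 2
print("Song A" in playlist)  # True
```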

Introduction to Generators

While custom iterators are powerful, writing them can be verbose. Python provides a simpler way to create iterators using generators.

A generator is a special type of function that returns an iterator. Instead of using return, it uses the yield keyword to produce values one at a time.
def count_up_to(max_value):
    current = 1
    while current <= max_value:
        yield current
        current += 1

gen = count_up_to(3)

for num in gen:
    print(num)
Output:
1
2
3
Generators automatically handle __iter__() and __next__(), making them much easier to write and read.
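You can verify this yourself: the generator object returned by the function already satisfies the iterator protocol, with no extra code.

```python
def count_up_to(max_value):
    current = 1
    while current <= max_value:
        yield current
        current += 1

gen = count_up_to(3)

# The generator object already provides both iterator methods.
print(hasattr(gen, '__iter__'))  # True
print(hasattr(gen, '__next__'))  # True
print(iter(gen) is gen)          # True: it is its own iterator
```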

How yield Works

The yield keyword is what makes generators special. When Python encounters yield, it pauses the function's execution and returns the value to the caller. The function's state is saved, and when iteration continues, execution resumes from where it left off.
def simple_generator():
    print("Start")
    yield 1
    print("Middle")
    yield 2
    print("End")

gen = simple_generator()

print(next(gen))  # prints "Start", then 1
print(next(gen))  # prints "Middle", then 2
print(next(gen))  # prints "End", then raises StopIteration
Output:
Start
1
Middle
2
End
StopIteration
This behavior allows generators to produce values lazily and efficiently.

Lazy Evaluation

One of the biggest advantages of generators is lazy evaluation. Instead of computing all values at once, generators produce values only when needed.

Consider this example:
def large_numbers():
    for i in range(1_000_000):
        yield i
Here, numbers are generated one at a time, rather than storing a million values in memory. This makes generators extremely useful for handling large datasets, file processing, or streaming data.

Compare this with a list:
nums = [i for i in range(1_000_000)]  # consumes large memory
Generators are far more memory-efficient.
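The difference is easy to measure with sys.getsizeof. The list holds a million references, while the generator object stores only its paused execution state; exact byte counts vary by Python version and platform, so none are shown here.

```python
import sys

nums_list = [i for i in range(1_000_000)]      # all values in memory
nums_gen = (i for i in range(1_000_000))       # values produced on demand

# The list's size grows with its length; the generator's does not.
print(sys.getsizeof(nums_list))
print(sys.getsizeof(nums_gen))
```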

Generator Expressions

Python also provides a concise way to create generators using generator expressions, similar to list comprehensions.
gen = (x * x for x in range(5))

for value in gen:
    print(value)
Output:
0
1
4
9
16
Unlike list comprehensions, generator expressions do not store all values in memory.
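This makes generator expressions a natural fit for functions that consume an iterable, such as sum(), where no intermediate list is ever built:

```python
# The generator expression is fed straight into sum(); when it is the
# sole argument, the call's own parentheses are sufficient.
total = sum(x * x for x in range(5))
print(total)  # 30
```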

Practical Use Cases

Generators and iterators are widely used in real-world applications. One common use case is reading large files line by line.
def read_file(file_path):
    with open(file_path) as file:
        for line in file:
            yield line.strip()
This approach avoids loading the entire file into memory.

Another use case is infinite sequences:
def infinite_counter():
    i = 0
    while True:
        yield i
        i += 1

counter = infinite_counter()

for _ in range(5):
    print(next(counter))
Generators make it possible to represent potentially infinite data streams safely.
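The standard library's itertools module pairs well with such streams; for example, itertools.islice takes a bounded slice of an otherwise infinite generator:

```python
import itertools

def infinite_counter():
    i = 0
    while True:
        yield i
        i += 1

# islice stops after 5 items, so the infinite stream is consumed safely.
first_five = list(itertools.islice(infinite_counter(), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```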

Generators vs Lists

The key difference between generators and lists lies in memory usage and execution.

A list stores all its elements in memory and can be traversed or indexed any number of times, while a generator produces elements one at a time. This makes generators far more memory-efficient, but unsuitable for repeated or random access, since values are not stored.
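A practical consequence is that a generator can be consumed only once; iterating it a second time yields nothing:

```python
gen = (x * x for x in range(3))

print(list(gen))  # [0, 1, 4]
print(list(gen))  # []  (the generator is already exhausted)
```

If you need to iterate the same values repeatedly, materialize them into a list first.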

Iterators and generators are fundamental concepts in Python that enable efficient data processing through lazy evaluation. While iterators give you full control over iteration, generators provide a concise and elegant way to create them using the yield keyword.
Nagesh Chauhan
Principal Engineer | Java · Spring Boot · Python · Microservices · AI/ML