Functions and Modules¶
Version: 0.1 Year: 2026
Copyright Notice¶
Copyright (c) 2025-2026 Ryan Thomas Robson / Robworks Software LLC. Licensed under CC BY-NC-ND 4.0. You may share this material for non-commercial purposes with attribution, but you may not distribute modified versions.
You already know how to define functions with def and install packages with pip. This guide goes deeper on both fronts. Python treats functions as ordinary objects - you can pass them around, nest them, and transform them with decorators. On the module side, the import system that runs every time you write import os is sophisticated machinery of finders, loaders, and caches. Understanding both halves - functions as building blocks and modules as the organizational layer - is what separates scripts from maintainable projects.
flowchart LR
A["First-Class Functions"] --> B["Closures"]
B --> C["Decorators"]
C --> D["Generators"]
D --> E["Import System"]
E --> F["Packages"]
F --> G["Environments"]
The guide follows the path from left to right: advanced function features first, then the module and package system, and finally the tooling that ties it all together.
First-Class Functions¶
In Python, functions are objects. A def statement creates a function object and binds it to a name, but that name is just a variable like any other. You can assign it, store it in a data structure, or pass it as an argument.
def add(a, b):
    return a + b

def subtract(a, b):
    return a - b
# Functions stored in a dictionary
operations = {
    "add": add,
    "subtract": subtract,
}
result = operations["add"](10, 3) # 13
This pattern - a dispatch table - replaces long if/elif chains with a clean lookup. It appears everywhere in Python: web framework route registries, CLI argument handlers, and plugin systems.
Functions can also accept other functions as arguments. You have already used this with built-in functions like sorted():
servers = [
    {"name": "web-1", "load": 0.82},
    {"name": "db-1", "load": 0.45},
    {"name": "cache-1", "load": 0.91},
]
# Pass a function (lambda) as the sort key
by_load = sorted(servers, key=lambda s: s["load"])
And functions can return other functions. This is the foundation for closures and decorators, which you will see next.
Closures and Scope¶
When a function is defined inside another function, the inner function can reference variables from the enclosing scope. This creates a closure - the inner function "closes over" the enclosing variables, keeping them alive even after the outer function returns.
Python resolves variable names using the LEGB rule, checking four scopes in order:
- Local - names assigned inside the current function
- Enclosing - names in any enclosing function (for nested functions)
- Global - names at the module level
- Built-in - names in the builtins module (len, print, range, etc.)
limit = 100 # Global scope
def make_checker(threshold):  # threshold is in the enclosing scope
    def check(value):         # value is in the local scope
        return value > threshold
    return check
is_high = make_checker(80)
is_high(95) # True - threshold=80 is captured in the closure
is_high(50) # False
The function make_checker is a factory function - it creates and returns a new function each time it is called. Each returned function carries its own copy of threshold.
The nonlocal keyword
If an inner function needs to modify an enclosing variable (not just read it), use nonlocal. Without it, an assignment creates a new local variable instead of updating the enclosing one.
def make_counter(start=0):
    count = start
    def increment():
        nonlocal count
        count += 1
        return count
    return increment
counter = make_counter()
counter() # 1
counter() # 2
counter() # 3
A practical use of closures is building configurable retry logic:
import time
def make_retrier(max_attempts=3, delay=1.0):
    def retry(func, *args, **kwargs):
        last_error = None
        for attempt in range(1, max_attempts + 1):
            try:
                return func(*args, **kwargs)
            except Exception as e:
                last_error = e
                if attempt < max_attempts:
                    time.sleep(delay)
        raise last_error
    return retry
cautious_retry = make_retrier(max_attempts=5, delay=2.0)
result = cautious_retry(fetch_data, "https://api.example.com/health")
Each call to make_retrier captures max_attempts and delay in a closure, producing a self-contained retry function with its own configuration.
Decorators¶
A decorator is a function that takes a function as input and returns a modified version of it. You have already seen the building blocks - first-class functions and closures. A decorator combines them into a pattern for wrapping behavior around existing functions.
Start with the manual approach:
import time
def timing(func):
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} took {elapsed:.4f}s")
        return result
    return wrapper

def process_data(records):
    # ... expensive computation ...
    return sorted(records, key=lambda r: r["score"])
process_data = timing(process_data) # Manually wrap
The last line is what the @ syntax replaces. These two forms are equivalent:
# With @ syntax
@timing
def process_data(records):
    return sorted(records, key=lambda r: r["score"])

# Without @ syntax
def process_data(records):
    return sorted(records, key=lambda r: r["score"])

process_data = timing(process_data)
There is one problem with the simple wrapper above: the wrapped function loses its original name and docstring. Calling process_data.__name__ returns "wrapper" instead of "process_data". The functools.wraps decorator fixes this by copying metadata from the original function to the wrapper:
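A sketch of the corrected wrapper - the same timing decorator as above, with functools.wraps added:

```python
import functools
import time

def timing(func):
    @functools.wraps(func)  # copies __name__, __doc__, etc. from func to wrapper
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        print(f"{func.__name__} took {time.perf_counter() - start:.4f}s")
        return result
    return wrapper

@timing
def process_data(records):
    """Sort records by score."""
    return sorted(records, key=lambda r: r["score"])

print(process_data.__name__)  # process_data (not wrapper)
print(process_data.__doc__)   # Sort records by score.
```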
Decorators with Arguments¶
Sometimes you want a decorator that takes configuration. A decorator factory is a function that returns a decorator:
import functools
import time
def timing(threshold=0.0):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = func(*args, **kwargs)
            elapsed = time.perf_counter() - start
            if elapsed > threshold:
                print(f"SLOW: {func.__name__} took {elapsed:.4f}s")
            return result
        return wrapper
    return decorator
@timing(threshold=0.5)
def process_batch(items):
    # Only logs if execution exceeds 0.5 seconds
    ...
There are three layers: timing() returns decorator, which takes func and returns wrapper. The @timing(threshold=0.5) call executes the outer function first, producing the actual decorator.
Stacking Decorators¶
Multiple decorators execute from bottom to top:
@require_auth # 3rd: checks authentication
@validate_input # 2nd: validates the arguments
@timing() # 1st: wraps the innermost function
def create_user(username, email):
    ...
The function is first wrapped by @timing(), then that result is wrapped by @validate_input, and finally by @require_auth. When create_user() is called, require_auth runs first (outermost), then validate_input, then timing, then the original function.
Decorator order matters
If @timing() is above @require_auth, the timer includes the authentication check. If it is below, it only measures the core function. Place decorators deliberately based on what you want each one to see.
Generators and Iterators¶
A generator is a function that produces a sequence of values one at a time, pausing between each. Instead of building an entire list in memory and returning it, a generator yields values on demand.
def count_up(limit):
    n = 1
    while n <= limit:
        yield n
        n += 1

for number in count_up(5):
    print(number)  # 1, 2, 3, 4, 5
When Python encounters yield, the function's execution is suspended - local variables, instruction pointer, and all - until the next value is requested. This makes generators ideal for processing large datasets or infinite sequences without exhausting memory.
Generator State¶
A generator object has four possible states:
- Created - the generator function was called, but next() has not been called yet
- Suspended - paused at a yield expression, waiting for the next next() call
- Running - currently executing (between a next() call and the next yield)
- Closed - the function has returned or close() was called
gen = count_up(3) # Created
next(gen) # Running -> yields 1 -> Suspended
next(gen) # Running -> yields 2 -> Suspended
next(gen) # Running -> yields 3 -> Suspended
next(gen) # Running -> returns -> Closed, raises StopIteration
Generator Expressions¶
Just as list comprehensions provide a compact syntax for building lists, generator expressions produce values lazily:
# List comprehension - builds entire list in memory
squares_list = [x ** 2 for x in range(1_000_000)]
# Generator expression - produces values on demand
squares_gen = (x ** 2 for x in range(1_000_000))
The generator expression uses parentheses instead of brackets and consumes almost no memory regardless of the range size. You can iterate over it once, but you cannot index into it or get its length without consuming it.
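The gap is easy to measure with sys.getsizeof, which reports the object's own memory footprint (the figures in the comments are rough and assume CPython):

```python
import sys

squares_list = [x ** 2 for x in range(1_000_000)]
squares_gen = (x ** 2 for x in range(1_000_000))

print(sys.getsizeof(squares_list))  # several megabytes - every element is stored
print(sys.getsizeof(squares_gen))   # a couple hundred bytes, regardless of range

# The generator produces values only when asked
print(next(squares_gen))  # 0
print(next(squares_gen))  # 1
```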
yield from¶
When a generator needs to delegate to another generator or iterable, yield from passes values through directly:
def read_files(paths):
    for path in paths:
        yield from read_lines(path)

def read_lines(path):
    with open(path) as f:
        for line in f:
            yield line.rstrip("\n")
Without yield from, you would need an inner for loop that yields each value individually. yield from also propagates send() and throw() calls to the inner generator, which matters for advanced coroutine patterns.
When to use generators
Use a generator when you are processing data that could be large, you only need to iterate through it once, and you do not need random access. Common examples: reading files line by line, streaming API responses, transforming database result sets, and pipeline-style data processing.
The Import System¶
Every import statement triggers a multi-step process. Understanding it helps you debug import errors, avoid circular imports, and structure projects effectively.
How Import Works¶
When Python encounters import mymodule, it follows these steps:
1. Check the cache - look in sys.modules for an already-loaded module. If found, return it immediately.
2. Find the module - search sys.path using a series of finders (importlib meta path finders). The default finders check built-in modules, frozen modules, and the filesystem.
3. Load the module - once found, a loader reads the source, compiles it to bytecode (cached in __pycache__/), and executes the module's top-level code.
4. Cache the module - store the module object in sys.modules so future imports skip steps 2-3.
import sys
# After importing os, it is cached
import os
print("os" in sys.modules) # True
# Importing again returns the cached object - the module code does not re-execute
import os # No-op, returns cached module
Import Styles¶
import os # Access as os.path.join(...)
from os.path import join # Access as join(...)
from os.path import join as pjoin # Access as pjoin(...)
import os.path # Access as os.path.join(...)
The from form binds specific names into the current namespace. The plain import form binds the top-level module. Neither form is inherently better - from is convenient for frequently used names, while plain import makes the source module explicit at every call site.
Relative Imports¶
Inside a package, you can import from sibling or parent modules using dots:
# Inside mypackage/utils/helpers.py
from . import formatting # Same directory (mypackage/utils/)
from .formatting import bold # Specific name from sibling
from .. import config # Parent directory (mypackage/)
from ..core import engine # Sibling of parent (mypackage/core/)
A single dot (.) means the current package directory. Two dots (..) mean the parent. Relative imports only work inside packages - they fail in standalone scripts run directly with python script.py.
sys.path¶
sys.path is a list of directory paths that Python's filesystem finder searches, in order:
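You can print the search path directly; the output in the comment is a hypothetical Linux example, not what your machine will show:

```python
import sys

for entry in sys.path:
    print(entry)

# Hypothetical output:
#   /home/user/project
#   /usr/lib/python312.zip
#   /usr/lib/python3.12
#   /usr/lib/python3.12/lib-dynload
#   /home/user/project/.venv/lib/python3.12/site-packages
```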
The first entry is typically the directory containing the script being run (or an empty string "" for the interactive interpreter). The rest come from the PYTHONPATH environment variable, the site-packages directory (where pip installs packages), and the standard library path.
Modifying sys.path
You can append directories to sys.path at runtime, but doing so makes your code dependent on filesystem layout. Prefer installing packages properly (with pip install -e .) over sys.path manipulation.
Circular Imports¶
A circular import happens when module A imports module B, and module B imports module A. Python does not raise an error immediately - it returns the partially initialized module from sys.modules - but you may get ImportError or AttributeError if you try to use a name that has not been defined yet.
# models.py
from validators import validate_user # validators.py imports models.py too
class User:
    def save(self):
        validate_user(self)
# validators.py
from models import User # Circular: models.py imports validators.py
def validate_user(user):
    if not isinstance(user, User):
        raise TypeError("Expected a User instance")
Three strategies to break the cycle:
- Move the import inside the function that needs it (lazy import)
- Reorganize so shared types live in a third module both can import
- Use TYPE_CHECKING for type-hint-only imports that do not need runtime access
# Option 1: Lazy import
def validate_user(user):
    from models import User  # Imported only when this function runs
    if not isinstance(user, User):
        raise TypeError("Expected a User instance")
The __name__ Guard¶
When Python loads a module, it sets the __name__ attribute. If the module is the entry point (run directly), __name__ is "__main__". If it is imported, __name__ is the module's qualified name.
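A minimal sketch of the guard (main is a placeholder name - any entry-point function works):

```python
# mymodule.py
def main():
    print("Running as a script")

if __name__ == "__main__":
    main()  # Executes for `python mymodule.py`, but not for `import mymodule`
```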
This guard lets a file serve as both an importable module and a standalone script. Without it, the main() call would execute every time another module imports mymodule.
Creating Packages¶
A package is a directory that Python recognizes as a collection of modules. The simplest way to make one is to add an __init__.py file.
Package Structure¶
mypackage/
├── __init__.py      # Makes this directory a package
├── __main__.py      # Entry point for python -m mypackage
├── core.py
├── formatting.py
└── utils/
    ├── __init__.py
    └── helpers.py
The __init__.py file runs when the package is first imported. It can be empty, or it can define the package's public API:
# mypackage/__init__.py
from .core import Engine
from .formatting import bold, table
__all__ = ["Engine", "bold", "table"]
__all__ controls what from mypackage import * exports. Without it, a wildcard import brings in every public name defined in __init__.py.
Namespace Packages¶
Since Python 3.3, a directory without __init__.py can still act as a namespace package. This allows a single logical package to be split across multiple directories (useful for large organizations with distributed codebases). For most projects, use regular packages with __init__.py - namespace packages are a specialized tool.
Project Layouts¶
Two common layouts for distributable packages:
# Flat layout
myproject/
├── pyproject.toml
├── mypackage/
│   ├── __init__.py
│   └── core.py
└── tests/
    └── test_core.py

# src layout
myproject/
├── pyproject.toml
├── src/
│   └── mypackage/
│       ├── __init__.py
│       └── core.py
└── tests/
    └── test_core.py
The src layout prevents accidentally importing the local source directory instead of the installed package during testing. The flat layout is simpler and works well for smaller projects. The Python Packaging User Guide has a detailed comparison.
The __main__.py Entry Point¶
Adding __main__.py to a package lets you run it with python -m mypackage:
# mypackage/__main__.py
from .core import Engine
def main():
    engine = Engine()
    engine.run()

if __name__ == "__main__":
    main()
This is the standard way to make a package executable. The -m flag tells Python to find the package in sys.path and run its __main__.py.
Virtual Environments Deep Dive¶
The Introduction guide covered creating and activating virtual environments. Here you will look at what venv actually builds and how activation works under the hood.
What venv Creates¶
Running python3 -m venv myenv creates a directory structure like this:
myenv/
├── bin/                      # Scripts and symlinks (Linux/macOS)
│   ├── python -> python3.12  # Symlink to the base Python
│   ├── python3 -> python3.12
│   ├── python3.12
│   ├── pip
│   ├── pip3
│   └── activate              # Shell script that modifies PATH
├── include/                  # C headers (for compiling extensions)
├── lib/
│   └── python3.12/
│       └── site-packages/    # Where pip installs packages
├── lib64 -> lib              # Symlink (Linux only)
└── pyvenv.cfg                # Configuration file
The pyvenv.cfg file tells Python this is a virtual environment:
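Its contents are just a few key-value pairs - roughly like this (paths and version numbers will differ on your machine):

```
home = /usr/bin
include-system-site-packages = false
version = 3.12.3
```

The home key points at the base interpreter's directory; Python reads it at startup to locate the standard library.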
How Activation Works¶
The activate script does two things:
- Prepends myenv/bin/ to PATH - so running python or pip resolves to the virtual environment's copies first
- Sets VIRTUAL_ENV to the environment's root path
That is it. There is no system-level configuration change, no daemon, no container. Activation is just a PATH modification in your current shell session.
You can also use a virtual environment without activating it by calling its Python directly:
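For example (the paths and script name are illustrative):

```shell
python3 -m venv myenv             # create the environment

myenv/bin/python script.py        # run a script with the venv's interpreter
myenv/bin/pip install requests    # install into the venv, no activation needed
myenv/bin/python -m pytest        # -m works the same way
```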
This is common in CI pipelines and automation scripts where sourcing an activate script adds unnecessary complexity.
Multiple Python Versions¶
If you have multiple Python versions installed, create separate environments for each:
python3.11 -m venv env311
python3.12 -m venv env312
env311/bin/python --version # Python 3.11.x
env312/bin/python --version # Python 3.12.x
Each environment is independent. Packages installed in env311 are not visible to env312.
System site-packages
The --system-site-packages flag lets a virtual environment fall back to globally installed packages. This is useful when you want access to system-wide libraries (like those installed by the OS package manager) but still want to isolate project-specific dependencies: python3 -m venv --system-site-packages myenv
Advanced pip and poetry¶
The Testing and Tooling guide covered basic pip install and pyproject.toml. This section goes further into reproducible dependency management.
pip: Beyond Basic Install¶
Constraints files restrict package versions without installing them. They are useful when you want to pin transitive dependencies (dependencies of your dependencies) without listing them in your requirements:
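A sketch of the workflow - the file names are the conventional ones, and the pinned versions in the comments are illustrative:

```shell
# constraints.txt - pins transitive dependencies, installs nothing by itself:
#   urllib3==2.2.1
#   certifi==2024.2.2
#
# requirements.txt - lists only your direct dependencies:
#   requests>=2.31

pip install -r requirements.txt -c constraints.txt
```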
Hash checking verifies that downloaded packages match expected checksums, preventing supply-chain attacks:
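The flag is --require-hashes; pip also switches into hash-checking mode automatically as soon as any entry in the requirements file carries a hash:

```shell
pip install --require-hashes -r requirements.txt
```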
This requires every entry in requirements.txt to include a hash:
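An entry looks roughly like this - the digests shown are placeholders, not real checksums (pip hash or pip-compile --generate-hashes can produce real ones). A package typically lists one hash per distribution file:

```
requests==2.31.0 \
    --hash=sha256:<64-hex-digest-of-the-wheel> \
    --hash=sha256:<64-hex-digest-of-the-sdist>
```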
Editable installs link your local source directory into site-packages so changes take effect immediately without reinstalling:
pip install -e . # Install current project in editable mode
pip install -e ./mylib # Install a local dependency in editable mode
pip-tools: Compiled Requirements¶
pip-tools adds a two-file workflow that separates what you want from what you get:
1. Write your direct dependencies in requirements.in.
2. Compile a fully pinned requirements.txt with pip-compile.
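A sketch of the round trip, assuming a requirements.in with a couple of direct dependencies:

```shell
# requirements.in (illustrative contents):
#   requests>=2.31
#   flask

pip install pip-tools
pip-compile requirements.in                    # writes a pinned requirements.txt
pip-compile --generate-hashes requirements.in  # same, with --hash entries included
pip-sync requirements.txt                      # make the current env match exactly
```

pip-sync also uninstalls anything not listed, so the environment matches the compiled file exactly.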
This produces a requirements.txt with exact versions for every package, including transitive dependencies (add --generate-hashes to pin hashes as well). When you want to update, run pip-compile --upgrade.
poetry: Advanced Workflows¶
Poetry manages dependencies, virtual environments, and packaging in one tool.
Dependency groups separate production, dev, and optional dependencies:
# pyproject.toml
[tool.poetry.dependencies]
python = "^3.11"
requests = "^2.31"
[tool.poetry.group.dev.dependencies]
pytest = "^8.0"
ruff = "^0.3"
[tool.poetry.group.docs.dependencies]
mkdocs = "^1.5"
poetry install # Install all groups
poetry install --without docs # Skip docs group
poetry install --only dev # Only dev dependencies
Lock files (poetry.lock) pin every dependency to an exact version and hash. The lock file is checked into version control so every developer and CI runner uses identical packages. Run poetry lock to regenerate it after changing pyproject.toml.
PEP 723: Inline Script Metadata¶
PEP 723 lets you declare dependencies inside a script using a special comment block. Tools like pipx and uv read this metadata and install dependencies automatically:
# /// script
# requires-python = ">=3.12"
# dependencies = [
# "requests>=2.31",
# "rich>=13.0",
# ]
# ///
import requests
from rich import print
response = requests.get("https://api.example.com/status")
print(response.json())
This eliminates the need for a separate requirements.txt or virtual environment setup for one-off scripts.
Bringing It Together¶
The concepts from this guide - decorators, dynamic imports, and package structure - combine naturally in a plugin architecture. The following exercise asks you to build one from scratch.
Further Reading¶
- Python Data Model - Callable Objects - how Python determines whether an object can be called as a function
- functools Module - wraps, lru_cache, partial, and other function utilities
- The Import System - the official reference for finders, loaders, and module specs
- Python Packaging User Guide - the definitive guide to creating distributable packages
- PEP 723 - Inline Script Metadata - embedding dependency declarations in standalone scripts
- Real Python: Primer on Decorators - practical decorator tutorial with additional examples