Working with Files and APIs (Python)¶
Version: 0.2 Year: 2026
Copyright Notice¶
Copyright (c) 2025-2026 Ryan Thomas Robson / Robworks Software LLC. Licensed under CC BY-NC-ND 4.0. You may share this material for non-commercial purposes with attribution, but you may not distribute modified versions.
Sysadmin automation usually boils down to two things: reading data from somewhere and acting on it. The "somewhere" is either the local filesystem (config files, logs, CSVs) or a remote API (monitoring services, cloud providers, notification systems). Python handles both with clean, consistent patterns.
Local File Operations¶
Python uses the context manager pattern (with statement) to handle files safely. The file is guaranteed to close when the block exits, even if an error occurs mid-read.
Reading and Writing¶
open() is the built-in function for file access.
# Read an entire file into a string
with open("/etc/hostname", "r") as f:
    hostname = f.read().strip()

# Read line by line (memory-efficient for large files)
with open("/var/log/auth.log", "r") as f:
    for line in f:
        if "Failed password" in line:
            print(line.strip())

# Write to a file (overwrites existing content)
with open("inventory.txt", "w") as f:
    f.write("web01\nweb02\ndb01\n")

# Append to an existing file
with open("audit.log", "a") as f:
    f.write("2026-03-25: Updated server inventory.\n")
File Modes¶
| Mode | Description | Creates File? | Truncates? |
|---|---|---|---|
| "r" | Read (text) | No - raises FileNotFoundError | No |
| "w" | Write (text) | Yes | Yes - empties the file |
| "a" | Append (text) | Yes | No - adds to end |
| "x" | Exclusive create | Yes - raises FileExistsError if exists | N/A |
| "rb" | Read (binary) | No | No |
| "wb" | Write (binary) | Yes | Yes |
Binary modes ("rb", "wb") are needed for non-text files: images, compressed archives, protocol buffers, database dumps.
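Checksumming a file is a typical case: hashes operate on raw bytes, so the file must be opened in "rb" mode. A sketch using the standard hashlib module (the file_sha256 helper name and chunk size are arbitrary choices):

```python
import hashlib

def file_sha256(path, chunk_size=65536):
    """Hash a file of any size without loading it all into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:  # binary mode: reads bytes, not str
        while chunk := f.read(chunk_size):  # empty bytes ends the loop
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in fixed-size chunks keeps memory flat even for multi-gigabyte archives or database dumps.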
Modern Path Handling with pathlib¶
The pathlib module (Python 3.4+) provides an object-oriented interface for filesystem paths. It's cleaner and more portable than string manipulation with os.path.
from pathlib import Path
# Build paths without worrying about separators
log_dir = Path("/var/log")
auth_log = log_dir / "auth.log" # PosixPath('/var/log/auth.log')
# Check existence and type
auth_log.exists() # True
auth_log.is_file() # True
log_dir.is_dir() # True
# Read and write in one step
content = auth_log.read_text()
Path("output.txt").write_text("hello\n")
# List directory contents
# List directory contents
for p in log_dir.iterdir():
    if p.suffix == ".log":
        print(f"{p.name}: {p.stat().st_size} bytes")

# Glob for pattern matching
for p in log_dir.glob("*.log"):
    print(p)

# Recursive glob
for p in Path("/etc").rglob("*.conf"):
    print(p)
Prefer pathlib over os.path
os.path.join("/var", "log", "auth.log") works, but Path("/var") / "log" / "auth.log" is more readable and gives you methods like .read_text(), .exists(), and .glob() for free. Most modern Python code and libraries accept Path objects wherever a string path works.
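pathlib also covers directory creation, a frequent need in backup and deployment scripts. A short sketch (the backups/2026-03-25 path is a hypothetical example):

```python
from pathlib import Path

# Create nested directories safely: parents=True creates intermediate
# directories, exist_ok=True avoids an error if they already exist.
backup_dir = Path("backups") / "2026-03-25"
backup_dir.mkdir(parents=True, exist_ok=True)

# Path objects compose naturally with the read/write helpers
report = backup_dir / "report.txt"
report.write_text("backup completed\n")
print(report.read_text(), end="")
```

Without exist_ok=True, re-running the script would raise FileExistsError, so this pattern makes directory setup idempotent.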
Working with JSON¶
JSON is the standard format for configuration files and API responses. Python's json module converts JSON strings to Python dictionaries and lists, and vice versa.
import json

# Parse a JSON file
with open("config.json") as f:
    config = json.load(f)  # File -> dict/list

# Parse a JSON string
raw = '{"status": "ok", "count": 42}'
data = json.loads(raw)  # String -> dict/list

# Write Python data as JSON
new_config = {"debug": True, "port": 8080, "hosts": ["web01", "web02"]}
with open("settings.json", "w") as f:
    json.dump(new_config, f, indent=2)  # indent for human-readable output

# Convert Python data to a JSON string
json_str = json.dumps(new_config, indent=2)
json.load() reads from a file object; json.loads() parses a string - the "s" stands for "string." Mixing up the two is the most common source of confusion.
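Malformed JSON raises json.JSONDecodeError, which carries the line and column of the problem. A sketch of defensive parsing (the load_config helper is illustrative, not a standard function):

```python
import json

def load_config(text, default=None):
    """Parse a JSON string, falling back to a default on malformed input."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as e:
        # e.lineno / e.colno point at the offending character
        print(f"Invalid JSON at line {e.lineno}, column {e.colno}: {e.msg}")
        return default

load_config('{"port": 8080}')       # -> {'port': 8080}
load_config('{"port": 8080,}', {})  # trailing comma: prints error, returns {}
```

Trailing commas are the classic failure mode here: valid in Python literals, invalid in JSON.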
Working with CSV¶
Many sysadmin data sources (inventory exports, billing reports, monitoring data) come as CSV files.
import csv

# Read a CSV file
with open("servers.csv") as f:
    reader = csv.DictReader(f)  # Each row becomes a dict
    for row in reader:
        print(f"{row['hostname']}: {row['ip_address']}")

# Write a CSV file
servers = [
    {"hostname": "web01", "ip": "10.0.0.1", "role": "frontend"},
    {"hostname": "db01", "ip": "10.0.1.1", "role": "database"},
]
with open("inventory.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["hostname", "ip", "role"])
    writer.writeheader()
    writer.writerows(servers)
csv.DictReader is almost always what you want - it maps each row to a dictionary using the header row as keys, so you access fields by name instead of index.
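The same reader/writer pair also works on in-memory text via io.StringIO, which is handy for testing and for transforming data between systems without touching disk. A small sketch filtering rows by role (the sample data mirrors the inventory above):

```python
import csv
import io

# Filter a CSV in memory: keep only frontend servers.
raw = "hostname,ip,role\nweb01,10.0.0.1,frontend\ndb01,10.0.1.1,database\n"

reader = csv.DictReader(io.StringIO(raw))
frontends = [row for row in reader if row["role"] == "frontend"]

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["hostname", "ip", "role"])
writer.writeheader()
writer.writerows(frontends)
print(out.getvalue())
```

Because DictReader gives you plain dicts, the filter step is an ordinary list comprehension keyed by column name.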
Working with YAML¶
YAML is common in configuration management (Ansible, Kubernetes, Docker Compose). It's not in the standard library, so you need the PyYAML package.
import yaml

# Read a YAML file
with open("playbook.yml") as f:
    config = yaml.safe_load(f)  # Always use safe_load, never load()

# Write YAML
data = {"services": {"web": {"image": "nginx", "ports": ["80:80"]}}}
with open("compose.yml", "w") as f:
    yaml.dump(data, f, default_flow_style=False)
Always use yaml.safe_load()
yaml.load() (without safe_) can execute arbitrary Python code embedded in the YAML file. This is a remote code execution vulnerability if you're loading untrusted input. Always use yaml.safe_load() unless you have a specific, verified reason not to.
Interacting with APIs¶
While Python's standard library includes urllib, the requests library is the industry standard for HTTP calls. It handles encoding, sessions, headers, and error reporting with a clean interface.
GET Requests¶
import requests

response = requests.get("https://api.github.com/repos/python/cpython")
if response.status_code == 200:
    repo = response.json()  # Parse JSON response body
    print(f"Stars: {repo['stargazers_count']}")
    print(f"Language: {repo['language']}")
else:
    print(f"Error: HTTP {response.status_code}")
POST Requests¶
import requests

alert = {
    "severity": "critical",
    "message": "CPU usage exceeded 95% on app01",
    "timestamp": "2026-03-25T14:30:00Z"
}
response = requests.post(
    "https://hooks.slack.com/services/T00000/B00000/XXXXX",
    json=alert  # Automatically serializes and sets Content-Type
)
if response.ok:  # True for any status below 400 (2xx and 3xx)
    print("Alert sent successfully.")
Authentication and Headers¶
import requests

# API key in headers
headers = {
    "Authorization": "Bearer your-api-token-here",
    "Accept": "application/json"
}
response = requests.get(
    "https://api.cloudprovider.com/v1/instances",
    headers=headers
)

# Basic auth
response = requests.get(
    "https://monitoring.internal/api/status",
    auth=("username", "password")
)
Sessions and Connection Reuse¶
When making multiple requests to the same host, use a Session to reuse TCP connections and persist headers:
import requests
session = requests.Session()
session.headers.update({
    "Authorization": "Bearer your-token",
    "Accept": "application/json"
})
# All requests through this session include the headers above
instances = session.get("https://api.cloud.com/v1/instances").json()
volumes = session.get("https://api.cloud.com/v1/volumes").json()
Handling Timeouts and Errors¶
Network calls fail. Always set timeouts and handle errors:
import requests

try:
    response = requests.get(
        "https://api.example.com/status",
        timeout=10  # seconds, applied separately to connect and to read
    )
    response.raise_for_status()  # Raises HTTPError for 4xx/5xx
    data = response.json()
except requests.Timeout:
    # Catch Timeout before ConnectionError: ConnectTimeout subclasses both,
    # and should be reported as a timeout.
    print("Request timed out after 10 seconds.")
except requests.ConnectionError:
    print("Could not connect to the API.")
except requests.HTTPError as e:
    print(f"API returned error: {e}")
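Transient network failures often deserve a retry with backoff rather than an immediate error. A minimal, library-agnostic sketch (call_with_retries is an illustrative helper; production code might instead configure urllib3's Retry on a requests HTTPAdapter):

```python
import time

def call_with_retries(func, attempts=3, base_delay=1.0, retry_on=(Exception,)):
    """Call func(), retrying with exponential backoff on failure."""
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            time.sleep(base_delay * 2 ** attempt)  # wait 1s, 2s, 4s, ...
```

You would call it with something like call_with_retries(lambda: requests.get(url, timeout=10), retry_on=(requests.ConnectionError, requests.Timeout)). Doubling the delay each attempt avoids hammering a service that is already struggling.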
Pagination¶
Many APIs return results in pages. You need to loop until there are no more pages:
import requests

def get_all_items(base_url, headers):
    items = []
    url = base_url
    while url:
        response = requests.get(url, headers=headers, timeout=10)
        response.raise_for_status()
        data = response.json()
        items.extend(data["results"])
        url = data.get("next")  # None when there are no more pages
    return items
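If the result set is large, a generator variant avoids holding every item in memory at once. A sketch (the "results"/"next" keys match the response shape assumed by get_all_items above; fetch_page is injected so the paging logic can be exercised without a network):

```python
def iter_all_items(first_url, fetch_page):
    """Yield items one at a time across all pages.

    fetch_page(url) must return a dict with a "results" list and an
    optional "next" URL, the same shape get_all_items() assumes.
    """
    url = first_url
    while url:
        data = fetch_page(url)
        yield from data["results"]
        url = data.get("next")  # None ends the loop
```

With requests you would call it as iter_all_items(url, lambda u: requests.get(u, headers=headers, timeout=10).json()) and process each item as it arrives.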
Further Reading¶
- Python Docs: Reading and Writing Files - official guide to file I/O
- Python Docs: pathlib - object-oriented filesystem path handling
- Requests: Quickstart Guide - getting started with the requests library
- Real Python: Working with JSON - complete guide to JSON parsing, serialization, and best practices
Previous: Data Structures and Logic | Next: System Automation | Back to Index