DNS Fundamentals¶

This guide covers what DNS actually is, how it evolved from a single text file into the largest distributed database on Earth, and how every name resolution works from root servers down to the answer your browser needs.

Before DNS: The HOSTS.TXT Era¶

Before DNS existed, the entire internet ran on a single text file.

Every machine on the ARPANET had a file called HOSTS.TXT that mapped hostnames to IP addresses - the same concept as /etc/hosts on your machine today. The Stanford Research Institute's Network Information Center (SRI-NIC) maintained the master copy. If you added a new host to the network, you called SRI-NIC on the phone (during California business hours), asked them to add your entry, and then every other site on the network would periodically FTP the updated file from SRI-NIC.

By the early 1980s this system was falling apart. The ARPANET was growing exponentially, and HOSTS.TXT had fundamental problems:

No scalability - a single file can't track thousands of hosts
No delegation - SRI-NIC had to approve every name
No consistency - sites fetched updates at different times, so different machines had different views of the network
Name collisions - nothing prevented two sites from claiming the same hostname

In 1983, Paul Mockapetris at USC's Information Sciences Institute published RFC 882 and RFC 883, proposing a distributed, hierarchical naming system. His first implementation was called "Jeeves." In 1987, he refined the design into RFC 1034 and RFC 1035 - the specifications that still define DNS today.

What DNS Actually Does¶

DNS is a distributed hierarchical database that maps names to data. Most people think of it as "the phone book of the internet" that converts domain names to IP addresses, but that undersells it significantly.

DNS handles:

Name-to-address mapping - example.com to 93.184.216.34 (A records)
Mail routing - which servers accept email for a domain (MX records)
Service discovery - where to find SIP, LDAP, or XMPP servers (SRV records)
Certificate issuance policy - which certificate authorities may issue certs for a domain (CAA records)
Email authentication - SPF, DKIM, and DMARC policies (TXT records)
Delegation - which servers are authoritative for a zone (NS records)
Reverse lookups - IP address to hostname (PTR records)

Every HTTPS connection, every email delivery, every API call starts with a DNS query. DNS processes an estimated 2 trillion queries per day worldwide. It's the most queried database in existence.

The DNS Hierarchy¶

DNS organizes names into a tree structure, read from right to left. The name www.example.com. (note the trailing dot) breaks down like this:

graph TD
    root(". (root)")
    root --> com(".com")
    root --> org(".org")
    root --> uk(".uk")
    root --> arpa(".arpa")
    com --> example("example")
    org --> wikipedia("wikipedia")
    uk --> couk(".co.uk")
    example --> www("www")
    example --> mail("mail")
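To make the right-to-left reading concrete, here is a minimal Python sketch that walks a name from the root down to the full name. The function name `ancestors_from_root` is made up for illustration; it only manipulates strings and sends no queries.

```python
def ancestors_from_root(name: str) -> list[str]:
    """Return every node on the path from the root to `name`, root first."""
    # Drop the trailing dot (the root label) before splitting into labels.
    labels = [label for label in name.rstrip(".").split(".") if label]
    path = ["."]  # the root zone
    suffix = ""
    # Walk the labels right to left, building longer and longer names.
    for label in reversed(labels):
        suffix = f"{label}.{suffix}"
        path.append(suffix)
    return path


print(ancestors_from_root("www.example.com."))
# ['.', 'com.', 'example.com.', 'www.example.com.']
```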

The Root Zone¶

At the top of the tree is the root zone, represented by a single dot (.). This is the starting point for every DNS resolution.

You'll often hear that there are "13 root servers," named a.root-servers.net through m.root-servers.net. That's technically true but deeply misleading. The number 13 is a legacy constraint - early DNS responses had to fit in a single 512-byte UDP message, and 13 nameserver names with their IPv4 (A) addresses were about the most that would fit.

In reality, those 13 identities are served by roughly 1,954 instances deployed worldwide using anycast routing. Anycast means the same IP address is announced from multiple physical locations, and your query reaches whichever instance is closest in network terms. Verisign alone operates over 290 instances for a.root-servers.net and j.root-servers.net.

The root servers are operated by 12 independent organizations:

Letter	Operator
A	Verisign
B	USC-ISI
C	Cogent Communications
D	University of Maryland
E	NASA Ames Research Center
F	Internet Systems Consortium (ISC)
G	U.S. Department of Defense (DISA)
H	U.S. Army Research Lab
I	Netnod (Sweden)
J	Verisign
K	RIPE NCC (Netherlands)
L	ICANN
M	WIDE Project (Japan)

NASA and the U.S. Army Research Lab each operate a root server. The geographic and organizational diversity is deliberate - no single failure, attack, or political action can take down DNS.

Top-Level Domains¶

Below the root are top-level domains (TLDs). These fall into several categories:

Generic TLDs (gTLDs) - .com, .net, .org, .info, and since ICANN's 2012 expansion round, more than a thousand new gTLDs like .app, .dev, and .xyz.

Country-code TLDs (ccTLDs) - two-letter codes based on ISO 3166-1, with a few historical exceptions: .uk, .de, .jp, .au (the ISO code for the United Kingdom is actually GB). Some ccTLDs have taken on broader use - .io (British Indian Ocean Territory) became popular with tech companies, and .tv (Tuvalu) earns that country millions in licensing revenue.

Country-code TLDs occasionally outlive their countries. The .su TLD was assigned to the Soviet Union and is still active more than three decades after the USSR dissolved. Yugoslavia's .yu survived through multiple wars and name changes before finally being retired in 2010.

Infrastructure TLD - .arpa is used for reverse DNS lookups and other infrastructure purposes.
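As a small illustration of how .arpa is used for reverse lookups, the sketch below builds the name a PTR query would ask for. It relies on the `reverse_pointer` attribute from Python's standard `ipaddress` module; the helper name `ptr_name` is an assumption made for this example.

```python
import ipaddress


def ptr_name(ip: str) -> str:
    """Return the .arpa name that a reverse (PTR) lookup for `ip` would query."""
    # IPv4 octets are reversed under in-addr.arpa; IPv6 nibbles under ip6.arpa.
    return ipaddress.ip_address(ip).reverse_pointer


print(ptr_name("93.184.216.34"))  # 34.216.184.93.in-addr.arpa
```

The reversal mirrors the hierarchy: the most significant part of the address ends up closest to the root, so address blocks can be delegated the same way domains are.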

Zones vs Domains¶

These terms are often confused, but the distinction matters when you're running DNS servers.

A domain is a name in the hierarchy. example.com is a domain. mail.example.com is a subdomain of example.com.

A zone is the portion of the DNS tree that a particular server is authoritative for. If example.com delegates mail.example.com to a different set of nameservers, then example.com and mail.example.com are two separate zones even though they're part of the same domain.

A zone always starts at a delegation point (where NS records hand authority to specific nameservers) and extends down the tree until it hits another delegation point.
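One way to picture zone boundaries is as a "closest enclosing delegation point" search. The sketch below assumes a hypothetical set of zone apexes and returns the zone a name falls into; `owning_zone` and the sample data are illustrative, not part of any DNS library.

```python
def owning_zone(name: str, zone_apexes: set[str]) -> str:
    """Return the closest enclosing zone apex for `name`."""
    labels = name.rstrip(".").lower().split(".")
    # Try the longest suffix first, then strip one label at a time.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:]) + "."
        if candidate in zone_apexes:
            return candidate
    return "."  # anything not covered by a listed apex falls back to the root


# Hypothetical delegation points, matching the example in the text.
apexes = {"com.", "example.com.", "mail.example.com."}

print(owning_zone("www.example.com.", apexes))        # example.com.
print(owning_zone("imap.mail.example.com.", apexes))  # mail.example.com.
```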

How DNS Resolution Works¶

When you type www.example.com into a browser, a chain of queries and responses happens in milliseconds. Understanding this process is essential for troubleshooting DNS issues.

The Players¶

Stub resolver - the DNS client built into your operating system. When an application calls getaddrinfo() or similar functions, the stub resolver handles it. The stub resolver is simple - it sends a query to a configured recursive resolver and waits for the final answer. Check your configured resolver with:

# Linux
cat /etc/resolv.conf

# macOS
scutil --dns | head -20
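If you want the same information programmatically on Linux, a rough sketch is to pull the `nameserver` lines out of the file's text. The parser below is deliberately minimal and works on a string, so the sample contents are made up; real files may carry other directives that it simply ignores.

```python
def parse_nameservers(resolv_conf_text: str) -> list[str]:
    """Extract the addresses from `nameserver` lines in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines (starting with '#' or ';').
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


# Hypothetical file contents using documentation address ranges.
sample = """\
# Generated by a network manager
search home.example
nameserver 192.0.2.53
nameserver 2001:db8::53
options ndots:1
"""
print(parse_nameservers(sample))  # ['192.0.2.53', '2001:db8::53']
```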

Recursive resolver - the server that does the actual work of chasing down answers. Your ISP runs one, and public options include Google (8.8.8.8), Cloudflare (1.1.1.1), and Quad9 (9.9.9.9). When you point your machine at 8.8.8.8, you're telling your stub resolver to use Google's recursive resolver.

Authoritative nameserver - a server that holds the actual DNS data for a zone and can answer queries about it without asking anyone else.

The Resolution Process¶

Here's what happens when your recursive resolver needs to look up www.example.com and has nothing cached:

sequenceDiagram
    participant App as Application
    participant Stub as Stub Resolver
    participant Rec as Recursive Resolver
    participant Root as Root Server (.)
    participant TLD as .com TLD Server
    participant Auth as example.com Auth Server

    App->>Stub: www.example.com?
    Stub->>Rec: www.example.com? (rd flag set)
    Rec->>Root: www.example.com?
    Root-->>Rec: Referral: .com NS a.gtld-servers.net
    Rec->>TLD: www.example.com?
    TLD-->>Rec: Referral: example.com NS a.iana-servers.net
    Rec->>Auth: www.example.com?
    Auth-->>Rec: Answer: 93.184.216.34 (aa flag set)
    Note over Rec: Cache answer (TTL: 86400s)
    Rec-->>Stub: 93.184.216.34
    Stub-->>App: 93.184.216.34

Step 1 - Query the root. The resolver picks a root server and asks: "What is the A record for www.example.com?" The root server doesn't know the final answer, but it knows who handles .com. It responds with a referral - the NS records and IP addresses for the .com TLD servers.

Step 2 - Query the TLD. The resolver asks a .com TLD server the same question. The TLD server doesn't know the final answer either, but it knows who handles example.com. It responds with another referral - the NS records for example.com's authoritative nameservers.

Step 3 - Query the authoritative server. The resolver asks example.com's nameserver for the A record for www.example.com. This server owns the data, so it responds with the answer: 93.184.216.34.

Step 4 - Return and cache. The resolver sends the answer back to your stub resolver and caches it according to the response's TTL.

You can watch this entire process yourself with:

dig +trace www.example.com

; <<>> DiG 9.18.28 <<>> +trace www.example.com
;; global options: +cmd
.                       515195  IN      NS      a.root-servers.net.
.                       515195  IN      NS      b.root-servers.net.
;; [... other root servers ...]
;; Received 525 bytes from 127.0.0.1#53(127.0.0.1) in 0 ms

com.                    172800  IN      NS      a.gtld-servers.net.
com.                    172800  IN      NS      b.gtld-servers.net.
;; [... other .com servers ...]
;; Received 1170 bytes from 198.41.0.4#53(a.root-servers.net) in 24 ms

example.com.            172800  IN      NS      a.iana-servers.net.
example.com.            172800  IN      NS      b.iana-servers.net.
;; Received 326 bytes from 192.5.6.30#53(a.gtld-servers.net) in 12 ms

www.example.com.        86400   IN      A       93.184.216.34
;; Received 56 bytes from 199.43.135.53#53(a.iana-servers.net) in 88 ms

Each section shows a hop in the resolution chain. The IP addresses in the from field show you which server answered each step.
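The referral-chasing loop can also be sketched as a toy simulation. Nothing here touches the network: the `SERVERS` table is a made-up, in-memory stand-in for the root, TLD, and authoritative servers from the walkthrough, and `resolve` is a hypothetical helper rather than a real resolver.

```python
# Toy data: each "server" either refers a zone elsewhere or holds answers.
SERVERS = {
    "root": {"referrals": {"com.": "tld"}},
    "tld": {"referrals": {"example.com.": "auth"}},
    "auth": {"answers": {"www.example.com.": "93.184.216.34"}},
}


def resolve(name: str, start: str = "root", max_hops: int = 10):
    """Follow referrals from `start` until some server answers for `name`."""
    server = start
    hops = [server]
    for _ in range(max_hops):
        data = SERVERS[server]
        if name in data.get("answers", {}):
            return data["answers"][name], hops
        # Pick the most specific referral whose zone is a suffix of the name.
        matches = [
            zone for zone in data.get("referrals", {})
            if name == zone or name.endswith("." + zone)
        ]
        if not matches:
            raise LookupError(f"{server} has no answer or referral for {name}")
        server = data["referrals"][max(matches, key=len)]
        hops.append(server)
    raise RuntimeError("too many referrals")


print(resolve("www.example.com."))
# ('93.184.216.34', ['root', 'tld', 'auth'])
```

The hop list corresponds to the sections of the dig +trace output: one entry per server that had to be asked.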

Caching and TTLs¶

graph LR
    Auth["Authoritative Server<br/>example.com A 93.184.216.34<br/>TTL: 86400"]

    subgraph Resolver A
        direction TB
        RA_Q["Query at 09:00"] --> RA_C["Cached until 09:00 +24h"]
        RA_C --> RA_E["Cache expires 09:00 next day"]
    end

    subgraph Resolver B
        direction TB
        RB_Q["Query at 14:00"] --> RB_C["Cached until 14:00 +24h"]
        RB_C --> RB_E["Cache expires 14:00 next day"]
    end

    subgraph Resolver C
        direction TB
        RC_Q["Query at 22:00"] --> RC_C["Cached until 22:00 +24h"]
        RC_C --> RC_E["Cache expires 22:00 next day"]
    end

    Auth -.-> RA_Q
    Auth -.-> RB_Q
    Auth -.-> RC_Q

Each resolver's TTL countdown starts independently from when it first queried. There is no synchronization between them - this is why "DNS propagation" is a myth. It's really independent cache expiration.

If your resolver had to chase every query from the root, DNS would be unbearably slow. Caching makes the system practical.

Every DNS response includes a TTL (Time To Live) - a number in seconds that tells the resolver how long it may cache the answer. When you query for example.com and get a TTL of 86400, your resolver can serve that cached answer for up to 24 hours without asking anyone.

You can watch TTLs count down by querying the same name repeatedly:

dig example.com | grep -A1 "ANSWER SECTION"

;; ANSWER SECTION:
example.com.            86400   IN      A       93.184.216.34

Wait a few seconds and query again - the TTL will be lower. Your resolver is counting down until it expires and needs to ask again.
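The countdown can be modelled with a few lines of Python. This is a minimal sketch of a TTL cache, not how any particular resolver is implemented; the `TTLCache` class and its injectable `clock` parameter are assumptions chosen to keep the example testable.

```python
import time


class TTLCache:
    """Tiny cache that forgets entries once their TTL has elapsed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (value, absolute expiry time)

    def put(self, name, value, ttl):
        self._entries[name] = (value, self._clock() + ttl)

    def get(self, name):
        """Return (value, remaining_ttl), or None if missing or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        remaining = expires_at - self._clock()
        if remaining <= 0:
            del self._entries[name]  # expired: the next lookup must re-query
            return None
        return value, int(remaining)


# Drive the cache with a fake clock so the countdown is visible immediately.
now = [0.0]
cache = TTLCache(clock=lambda: now[0])
cache.put("example.com.", "93.184.216.34", 86400)
now[0] = 10
print(cache.get("example.com."))  # ('93.184.216.34', 86390)
```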

There is no "DNS propagation." This is one of the most misunderstood concepts in DNS. When you change a DNS record, there is no mechanism that pushes the change to resolvers around the world. What actually happens is cache expiration - resolvers everywhere are holding the old answer with a countdown timer. As each resolver's cache expires, it asks your authoritative server and gets the new answer.

This is why lowering your TTL before making a change is standard practice. If your TTL is 86400 (24 hours) and you change an A record, some resolvers may serve the old answer for up to 24 hours. If you instead lower the TTL to 300 (5 minutes) at least one full old TTL (24 hours here) before the change, every cache that honors TTLs will be holding the short-lived record by the time you switch, and the old answer will age out within about 5 minutes.
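The timing can be worked out explicitly. The sketch below assumes resolvers honor TTLs exactly, which real resolvers may not, and the function name is made up for illustration.

```python
def worst_case_stale_seconds(old_ttl: int, new_ttl: int,
                             lowered_seconds_before_change: int) -> int:
    """Longest time after a record change that a cache may serve the old answer."""
    # A cache filled just before the TTL was lowered still counts down the old TTL.
    leftover_old = max(0, old_ttl - lowered_seconds_before_change)
    # A cache filled just before the change counts down the lowered TTL.
    return max(leftover_old, new_ttl)


print(worst_case_stale_seconds(86400, 300, 86400))  # 300: lowered a full day ahead
print(worst_case_stale_seconds(86400, 300, 3600))   # 82800: lowered only an hour ahead
```

The second call shows why the lead time matters: lowering the TTL an hour before the change still leaves some caches holding the old 24-hour entry.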

Negative Caching¶

Resolvers also cache negative answers. When a name doesn't exist, the authoritative server returns an NXDOMAIN response along with the zone's SOA record. The resolver caches this too, for the lesser of the SOA record's own TTL and its minimum field (see the Zone Files and Records guide). This prevents repeated queries for names that don't exist.

RFC 2308 defines negative caching behavior. It's worth knowing this exists because it explains a common gotcha: if you query a name before you've created the record, the NXDOMAIN gets cached, and you may have to wait for that negative cache to expire before the new record is visible.
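RFC 2308 takes the negative-cache lifetime from the smaller of the SOA record's TTL and its MINIMUM field. A minimal sketch of that rule follows; the optional `resolver_cap` parameter is an assumption standing in for the upper limit that many resolvers apply locally, and the sample values are hypothetical.

```python
from typing import Optional


def negative_ttl(soa_ttl: int, soa_minimum: int,
                 resolver_cap: Optional[int] = None) -> int:
    """Seconds an NXDOMAIN may be cached under the RFC 2308 rule."""
    ttl = min(soa_ttl, soa_minimum)
    if resolver_cap is not None:
        # Local policy may shorten the lifetime further.
        ttl = min(ttl, resolver_cap)
    return ttl


print(negative_ttl(soa_ttl=3600, soa_minimum=900))  # 900
```

With those values, a name queried just before its record is created could stay "nonexistent" in that resolver's cache for up to 15 minutes.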

Authoritative vs Recursive¶

These are two fundamentally different roles, and understanding the distinction prevents a whole category of configuration mistakes.

An authoritative server owns the data. It has the zone file (or database) for a domain and answers queries from that data. When it answers, it sets the aa (authoritative answer) flag in the response. If it's asked about a domain it doesn't own, it doesn't try to find the answer - it simply refuses or returns a referral.

A recursive resolver owns nothing. It receives queries from clients, chases referrals from root to TLD to authoritative, caches the results, and returns the final answer. It never sets the aa flag (unless it happens to also be authoritative for the queried zone).

You can see the difference with dig:

# Query an authoritative server directly - note the "aa" flag
dig @a.iana-servers.net example.com

;; flags: qr aa rd; QUERY: 1, ANSWER: 1

# Query a recursive resolver - no "aa" flag
dig @8.8.8.8 example.com

;; flags: qr rd ra; QUERY: 1, ANSWER: 1

The aa flag is present in the first response and absent in the second. rd means "recursion desired" (the client asked for recursion), and ra means "recursion available" (the server supports it).
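These flags are single bits in the 16-bit flags word of the DNS header. The sketch below decodes that word into the abbreviations dig prints; the bit positions follow the header layout in RFC 1035 (plus the later ad and cd bits), while `decode_flags` itself is just an illustrative helper.

```python
# Bit masks within the 16-bit flags word of the DNS header.
FLAG_BITS = {
    "qr": 0x8000,  # response (not a query)
    "aa": 0x0400,  # authoritative answer
    "tc": 0x0200,  # truncated
    "rd": 0x0100,  # recursion desired
    "ra": 0x0080,  # recursion available
    "ad": 0x0020,  # authenticated data (DNSSEC)
    "cd": 0x0010,  # checking disabled (DNSSEC)
}


def decode_flags(flags_word: int) -> list[str]:
    """Return the names of the flag bits set in `flags_word`, in dig's order."""
    return [name for name, bit in FLAG_BITS.items() if flags_word & bit]


print(decode_flags(0x8500))  # ['qr', 'aa', 'rd'] - the authoritative reply
print(decode_flags(0x8180))  # ['qr', 'rd', 'ra'] - the recursive reply
```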

Why Not Both on the Same IP?¶

Running authoritative and recursive services on the same server at the same IP address creates problems:

Security - a recursive resolver should accept queries only from its own clients, while an authoritative server must accept queries from anyone. Combining them makes access control difficult.
Attack surface - recursive resolvers are targets for DNS amplification attacks. If your authoritative server is also an open resolver, attackers can use it to amplify DDoS attacks against third parties.
Performance - recursive resolution is CPU-intensive (chasing referrals, validating DNSSEC). Authoritative serving is I/O-bound (reading zone data). Mixing them causes unpredictable resource contention.

Many ISPs run resolvers that also happen to be authoritative for customer reverse DNS zones. That works because access is already restricted to their customers. But if you're building your own DNS infrastructure, keep them separate.

Open Resolvers¶

An open resolver is a recursive server that accepts queries from anyone on the internet. This is dangerous because DNS responses can be much larger than the queries that trigger them - an attacker can send a small query with a spoofed source IP, and the resolver sends a large response to the victim. A DNS amplification attack can achieve around 50x amplification: at that ratio, a 1 Mbps query stream becomes 50 Mbps hitting the target.
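The arithmetic behind the 50x figure is simple enough to write down. The packet sizes below are hypothetical round numbers picked to give that ratio, not measurements.

```python
def amplification_factor(query_bytes: int, response_bytes: int) -> float:
    """How many bytes reach the victim per byte the attacker sends."""
    return response_bytes / query_bytes


def amplified_mbps(query_mbps: float, factor: float) -> float:
    """Traffic arriving at the victim for a given spoofed query rate."""
    return query_mbps * factor


factor = amplification_factor(query_bytes=60, response_bytes=3000)
print(factor)                       # 50.0
print(amplified_mbps(1.0, factor))  # 50.0
```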

ISPs routinely operate resolvers, but they restrict access to their own customers' IP ranges. If you run your own recursive resolver, restrict it to your network with ACLs.
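Conceptually, such an ACL is a membership test against a list of client networks. The sketch below shows the idea with Python's standard `ipaddress` module; the networks listed are placeholder documentation and private ranges, and real resolvers express this in their own configuration syntax rather than in code like this.

```python
import ipaddress

# Hypothetical client ranges allowed to use recursion.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("2001:db8::/32"),
]


def recursion_allowed(client_ip: str) -> bool:
    """Return True if the client address falls inside an allowed network."""
    addr = ipaddress.ip_address(client_ip)
    # Membership is simply False when the address and network versions differ.
    return any(addr in net for net in ALLOWED_NETWORKS)


print(recursion_allowed("192.0.2.10"))   # True
print(recursion_allowed("203.0.113.5"))  # False
```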

Some ISPs also hijack NXDOMAIN responses. Instead of telling your browser that a domain doesn't exist, they redirect you to a search/advertising page. This goes against the IAB's guidance on DNS namespace mangling (RFC 4924, Section 2.5.2) and breaks applications that rely on NXDOMAIN behavior. Public resolvers like 1.1.1.1 and 9.9.9.9 don't do this.

Registrars, Registries, and Glue Records¶

When you "buy" a domain name, you're interacting with a specific chain of organizations:

A registry operates a TLD. Verisign operates the .com registry. The registry maintains the authoritative database of all domain names in that TLD. You never interact with the registry directly.

A registrar is an organization accredited to sell domain names. GoDaddy, Namecheap, Cloudflare Registrar, and Google Domains (now Squarespace) are registrars. When you register a domain through a registrar, they submit the registration to the registry on your behalf using the Extensible Provisioning Protocol (EPP).

The WHOIS system (and its modern replacement, RDAP) lets you look up registration information for a domain:

whois example.com

   Domain Name: EXAMPLE.COM
   Registry Domain ID: 2336799_DOMAIN_COM-VRSN
   Registrar WHOIS Server: whois.iana.org
   Updated Date: 2024-08-14T07:01:34Z
   Creation Date: 1995-08-14T04:00:00Z
   Registry Expiry Date: 2025-08-13T04:00:00Z
   Registrar: RESERVED-Internet Assigned Numbers Authority
   Name Server: A.IANA-SERVERS.NET
   Name Server: B.IANA-SERVERS.NET
   DNSSEC: signedDelegation

The Trailing Dot¶

Every fully qualified domain name (FQDN) ends with a dot: www.example.com. - that final dot represents the root zone. Most software adds it implicitly, so you never type it. But inside DNS zone files, the trailing dot is critical. Without it, the name is treated as relative and the zone's origin is appended.

This creates one of the most common DNS bugs. In a zone file for example.com:

www     IN  CNAME   other.example.com     ; WRONG - becomes other.example.com.example.com.
www     IN  CNAME   other.example.com.    ; RIGHT - the dot makes it absolute
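The rule the zone file parser applies can be sketched in a few lines. The `qualify` helper is hypothetical, and it covers only the two cases discussed here plus the `@` shorthand for the origin.

```python
def qualify(name: str, origin: str) -> str:
    """Expand a zone-file name the way a parser would, given the zone origin."""
    if name == "@":
        return origin                # "@" stands for the origin itself
    if name.endswith("."):
        return name                  # already absolute, leave untouched
    return f"{name}.{origin}"        # relative: append the origin


origin = "example.com."
print(qualify("other.example.com", origin))   # other.example.com.example.com.
print(qualify("other.example.com.", origin))  # other.example.com.
```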

This distinction also matters in Kubernetes. Pods typically get a resolv.conf with cluster search domains and ndots:5, so a lookup for example.com (without a trailing dot) may first try example.com.<namespace>.svc.cluster.local, example.com.svc.cluster.local, and example.com.cluster.local before trying the actual name. Adding the trailing dot (example.com.) skips the search list and goes straight to the intended name.
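The order of attempts can be sketched as follows. This models glibc-style search-list handling as an assumption (other resolver libraries differ in the details), and the search domains and ndots value are the usual defaults for a pod in the `default` namespace rather than anything read from a real cluster.

```python
def candidate_names(name: str, search_domains: list[str], ndots: int) -> list[str]:
    """List the fully qualified names a stub resolver would try, in order."""
    if name.endswith("."):
        return [name]  # absolute name: the search list is skipped entirely
    searched = [f"{name}.{domain}." for domain in search_domains]
    absolute = f"{name}."
    # With enough dots the name is tried as-is first; otherwise it goes last.
    if name.count(".") >= ndots:
        return [absolute] + searched
    return searched + [absolute]


search = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]
for candidate in candidate_names("example.com", search, ndots=5):
    print(candidate)
```

With ndots set to 5, example.com has too few dots to be tried first, so three lookups are spent on cluster suffixes before the real name is queried.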

Glue Records¶

There's a chicken-and-egg problem in DNS. Suppose example.com uses ns1.example.com as its nameserver. To resolve anything under example.com, you need to reach ns1.example.com. But to find the IP address of ns1.example.com, you need to query... the nameserver for example.com. That's circular.

Glue records break this cycle. When example.com is registered, the parent zone (.com) stores not just the NS record (example.com NS ns1.example.com) but also an A record for the nameserver itself (ns1.example.com A 198.51.100.1). These A records in the parent zone are the "glue" that bootstraps the delegation.

graph TD
    subgraph problem ["The Problem: Circular Dependency"]
        P1["Resolve anything under example.com"]
        P2["Need ns1.example.com IP"]
        P3["Query example.com nameserver"]
        P1 -->|"requires"| P2
        P2 -->|"requires"| P3
        P3 -->|"requires"| P1
    end

    subgraph solution ["The Solution: Glue Records"]
        S1[".com TLD Zone<br/><br/>example.com NS ns1.example.com<br/>ns1.example.com A 198.51.100.1 ← glue"]
        S2["Resolver gets NS + IP<br/>in the same referral"]
        S3["Connects directly to<br/>198.51.100.1"]
        S1 --> S2 --> S3
    end

Glue is only needed when a nameserver's name is within the zone it serves. If example.com uses ns1.dnsprovider.net as its nameserver, no glue is needed - the resolver can find ns1.dnsprovider.net by following the normal delegation chain through .net.
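That condition amounts to a suffix check on label boundaries. The helper below is a hypothetical illustration of it, not a function from any DNS toolkit.

```python
def needs_glue(zone: str, nameserver: str) -> bool:
    """True when the nameserver's own name sits at or below the delegated zone."""
    zone = zone.rstrip(".").lower()
    ns = nameserver.rstrip(".").lower()
    # Compare on a label boundary so notexample.com does not match example.com.
    return ns == zone or ns.endswith("." + zone)


print(needs_glue("example.com", "ns1.example.com"))      # True
print(needs_glue("example.com", "ns1.dnsprovider.net"))  # False
```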

You can see glue records in the additional section of a DNS referral:

dig +norec @a.gtld-servers.net example.com NS

;; AUTHORITY SECTION:
example.com.        172800  IN  NS  a.iana-servers.net.
example.com.        172800  IN  NS  b.iana-servers.net.

;; ADDITIONAL SECTION:
a.iana-servers.net. 172800  IN  A   199.43.135.53
b.iana-servers.net. 172800  IN  A   199.43.133.53

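The AUTHORITY section says "these are the nameservers." The ADDITIONAL section says "and here are their IP addresses so you don't have to look them up separately." Strictly speaking, glue is only required for nameserver names inside the delegated zone; the addresses show up here because the same gtld-servers also serve .net, where a.iana-servers.net lives.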

DNS Transport¶

DNS traditionally uses UDP on port 53 for messages up to 512 bytes and TCP on port 53 for larger responses and zone transfers. The 512-byte UDP limit dates from the 1980s: an IPv4 host is only guaranteed to accept datagrams of up to 576 bytes, so a 512-byte DNS message plus headers was a size every host could be relied on to handle.
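To get a feel for the sizes involved, the sketch below builds a plain DNS query in wire format using only the standard library. It follows the RFC 1035 message layout, omits EDNS0, and never sends the packet; `build_query` and the fixed query ID are assumptions made for the example.

```python
import struct


def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query message (qtype 1 = A record)."""
    # Header: ID, flags (only RD set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: length-prefixed labels, ending with the zero-length root label.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # QCLASS 1 = IN
    return header + question


packet = build_query("www.example.com")
print(len(packet))  # 33
```

A typical query is a few dozen bytes, far below the 512-byte limit; it is the responses that can grow past it. The final zero byte of the encoded name is the wire-format counterpart of the trailing dot.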

The Extension Mechanisms for DNS (EDNS0) specification raised the effective UDP message size. Modern resolvers advertise a buffer size of 1232-4096 bytes, allowing larger responses (like DNSSEC-signed answers) to travel over UDP.

Newer transport protocols are emerging for privacy:

DNS over TLS (DoT) - DNS queries encrypted with TLS on port 853
DNS over HTTPS (DoH) - DNS queries sent as HTTPS requests on port 443, making them hard to distinguish from normal web traffic
DNS over QUIC (DoQ) - DNS queries over the QUIC protocol, offering TLS encryption with lower latency than DoT

These encrypted transports prevent ISPs and network operators from seeing (or tampering with) your DNS queries. They don't change how DNS resolution works - they just encrypt the transport between your stub resolver and the recursive resolver.