
DNS: The Internet Phone Book That Everyone Trusts and Nobody Should

Every time you type a URL into a browser, your machine kicks off a process that was designed in 1983 by people who assumed everyone on the internet could be trusted. That process is DNS, and it is still, in 2026, largely running on unsigned, unauthenticated UDP packets. Let's walk through how it actually works, why it's broken, and what the various fixes actually fix.

The Resolution Chain

The "phone book" analogy is fine for a cocktail party. Here's what actually happens when your machine needs to resolve blog.example.com.

Step 1: Stub Resolver. Your operating system has a stub resolver (the code behind getaddrinfo()). It doesn't do any real resolution itself. It reads /etc/resolv.conf (or the system equivalent), finds the IP of a configured recursive resolver, and fires off a UDP packet to port 53 on that server. That's it. The stub resolver is deliberately simple.

Step 2: Recursive Resolver. This is the workhorse. Your ISP runs one, Cloudflare runs one at 1.1.1.1, Google runs one at 8.8.8.8. The recursive resolver checks its cache first. On a cache miss, it starts walking the DNS tree from the top.

Step 3: Root Servers. The recursive resolver sends a query to one of the 13 root server clusters. Why exactly 13? The answer is protocol math. A DNS response needs to fit in a single 512-byte UDP packet (the original RFC 1035 limit). Each root server's name and address record takes roughly 32 bytes. Thirteen entries, plus the header and other overhead, just barely fits. The 13 "servers" (named a.root-servers.net through m.root-servers.net) are actually clusters of hundreds of physical machines distributed globally using anycast routing. Any packet sent to, say, 198.41.0.4 (a.root-servers.net) gets routed to the nearest physical instance by BGP.

The root server doesn't know the answer. It returns a referral: "I don't know blog.example.com, but here are the authoritative servers for .com."

Step 4: TLD Servers. The recursive resolver now queries the .com TLD servers. These are run by Verisign. Again, it gets a referral: "I don't know blog.example.com, but the authoritative nameservers for example.com are ns1.example.com and ns2.example.com, and here are their IP addresses in the additional section."

Step 5: Authoritative Server. The recursive resolver queries ns1.example.com for blog.example.com. This server owns the zone. It returns the actual A record (or AAAA, or CNAME, or whatever was asked for). The recursive resolver caches the result, then sends it back to your stub resolver.

That entire chain, root to TLD to authoritative, typically completes in under 100 milliseconds. On a warm cache, it's a single round trip to your recursive resolver.
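
The walk above is just a loop over referrals. A toy model, with an invented server table standing in for real referral responses (a real resolver extracts referrals from the Authority and Additional sections of actual DNS messages; the IP is example.com's well-known documentation address):

```python
# Hypothetical referral table: each "server" either answers or points
# the resolver at the next server down the tree.
SERVERS = {
    "root":            {"referral": "tld-server"},       # "ask the .com servers"
    "tld-server":      {"referral": "ns1.example.com"},  # "ask example.com's NS"
    "ns1.example.com": {"answer": "93.184.216.34"},      # authoritative answer
}

def resolve(name, server="root"):
    """Follow referrals from the root until a server returns an answer."""
    hops = 0
    while True:
        record = SERVERS[server]
        if "answer" in record:
            return record["answer"], hops
        server = record["referral"]  # each referral gets us closer
        hops += 1

ip, hops = resolve("blog.example.com")
print(ip, hops)  # answer after two referrals: root -> TLD -> authoritative
```

On a warm cache, none of this runs: the recursive resolver answers from memory and the loop body never executes.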

The Wire Format

DNS packets follow the format defined in RFC 1035, and it has barely changed since 1987. Every DNS message, query or response, starts with a 12-byte header:

  • ID (16 bits): A random identifier. The response must echo this back so the resolver can match responses to queries. This is, depressingly, the primary "security" mechanism in plain DNS.
  • QR (1 bit): 0 for query, 1 for response.
  • OPCODE (4 bits): Usually 0 (standard query).
  • AA (1 bit): Authoritative Answer. Set when the responding server is authoritative for the zone.
  • TC (1 bit): Truncated. If the response exceeds 512 bytes, this flag is set and the client should retry over TCP.
  • RD (1 bit): Recursion Desired. The stub resolver sets this to ask the recursive resolver to do the full tree walk.
  • RA (1 bit): Recursion Available. The server sets this to indicate it supports recursion.
  • RCODE (4 bits): Response code. 0 is no error, 3 is NXDOMAIN (domain doesn't exist).

After the header, the message has four sections. The Question section contains the name being queried, the query type (A, AAAA, MX, etc.), and the class (almost always IN for internet). The Answer section contains the actual resource records. The Authority section contains NS records pointing to authoritative servers (used in referrals). The Additional section contains "glue" records, the IP addresses of nameservers mentioned in the authority section, so the resolver doesn't have to do a separate lookup to find them.

The whole thing is a compact binary format. Domain names are encoded as length-prefixed labels with pointer compression to avoid repeating strings. It's elegant, efficient, and completely trusting.
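
A minimal sketch of that encoding, using only the standard library (query ID and name are arbitrary; pointer compression is omitted, since queries rarely need it):

```python
import struct

def encode_qname(name):
    # Length-prefixed labels: "blog.example.com" becomes
    # \x04blog\x07example\x03com\x00
    out = b""
    for label in name.rstrip(".").split("."):
        out += bytes([len(label)]) + label.encode("ascii")
    return out + b"\x00"

def build_query(name, qtype=1, qid=0x1234):
    # 12-byte header: ID, flags (RD=1 -> 0x0100), QDCOUNT=1, rest zero
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    # Question section: QNAME, QTYPE (1 = A), QCLASS (1 = IN)
    question = encode_qname(name) + struct.pack("!HH", qtype, 1)
    return header + question

pkt = build_query("blog.example.com")
print(len(pkt))  # 34 bytes: a complete, valid A query
```

Sent over UDP to port 53 of any recursive resolver, those 34 bytes are a complete standard query. The asymmetry between that and a multi-kilobyte response is what makes the amplification attacks discussed later possible.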

The EDNS0 and Larger Packets Story

The original 512-byte UDP limit in RFC 1035 was a reasonable choice in 1987. It guaranteed that DNS responses could traverse any network path without fragmentation (every IPv4 host is required to reassemble datagrams of at least 576 bytes, and 512 bytes of payload plus UDP and IP headers fits comfortably under that floor). But as DNS grew, 512 bytes became a straitjacket.

EDNS0 (Extension Mechanisms for DNS, RFC 6891) solved this by adding an OPT pseudo-record to DNS messages. The client includes an OPT record in its query that advertises its UDP buffer size, typically 4096 bytes. The server can then send responses up to that size over UDP without falling back to TCP. EDNS0 was essential for DNSSEC, because signed responses include RRSIG, DNSKEY, and DS records that easily push responses well past 512 bytes. A DNSSEC-signed response for a moderately complex zone can be 2,000 to 4,000 bytes.
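
The OPT record doesn't define new wire fields; it repurposes the fixed resource-record fields. A sketch of the encoding per RFC 6891 (the 1,232-byte buffer size follows the second flag day's recommendation; a real query would append this to the Additional section and set ARCOUNT to 1):

```python
import struct

def opt_record(udp_size=1232):
    # OPT pseudo-record (RFC 6891), fixed RR fields repurposed:
    #   NAME  = root (a single zero byte)
    #   TYPE  = 41 (OPT)
    #   CLASS = advertised UDP buffer size (not a class at all)
    #   TTL   = extended RCODE and flags (0 here)
    #   RDLEN = 0 (no options attached)
    return b"\x00" + struct.pack("!HHIH", 41, udp_size, 0, 0)

opt = opt_record()
print(opt.hex())  # 11 bytes total
```

Overloading existing fields like this kept EDNS0 backward compatible: a pre-EDNS0 server sees a record type it doesn't know and (if correctly implemented) just ignores it.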

But larger UDP packets introduced a new set of problems. UDP packets over roughly 1,232 bytes (the IPv6 minimum MTU of 1,280 bytes minus 48 bytes of IPv6 and UDP headers, which makes it the practical limit for avoiding fragmentation on most paths) get fragmented at the IP layer. Fragmented UDP is handled poorly by many firewalls and NAT devices, some of which silently drop fragments. This means a perfectly valid DNSSEC response can simply vanish in transit, causing resolution failures that are maddening to debug because they depend on the specific network path.

The DNS community's response was the "DNS flag day" initiative. The first flag day (February 2019) was a coordinated effort by major DNS software vendors and operators to stop working around non-EDNS0-compliant servers. Previously, resolvers would fall back to plain DNS if an EDNS0 query got no response, masking broken implementations. After flag day, those broken servers simply stopped working, forcing their operators to fix them. A second flag day (October 2020) addressed issues with IP fragmentation and oversized UDP responses, pushing implementations toward smaller default buffer sizes (1,232 bytes) and more aggressive TCP fallback.

TCP support for DNS has technically been mandatory since the beginning (RFC 1035 says so), and RFC 7766 reinforced this, stating that all DNS implementations must support TCP. In practice, the vast majority of DNS queries still use UDP. TCP adds connection setup overhead (a three-way handshake before any data flows), which matters when you're doing millions of queries per second. But for large responses, DNSSEC validation, and zone transfers, TCP is not optional. The gradual shift toward treating TCP as a first-class transport for DNS, rather than a fallback for truncated responses, is one of those slow infrastructure changes that most people will never notice but that keeps the system functional.

Caching and TTL

Every DNS record comes with a TTL (Time To Live), a value in seconds that tells the resolver how long it can cache that record. When your recursive resolver gets an A record for example.com with a TTL of 3600, it stores it and answers all subsequent queries from cache for the next hour.

Caching happens at multiple layers. The recursive resolver caches. Most operating systems cache (Windows and macOS run system-level caching services; the traditional glibc stub on Linux does not, though systemd-resolved adds one). Your browser caches DNS results separately. Each layer decrements the TTL independently from the moment it received the record.

TTL choices have real engineering tradeoffs. A high TTL (like 86400, one day) means fewer queries to your authoritative servers and faster resolution for users, but if you need to change an IP address, you're waiting up to a day for the change to propagate. A low TTL (like 60 seconds) gives you fast failover, which is why CDNs and load balancers typically use TTLs of 30 to 60 seconds. The cost is that every client hits the recursive resolver every minute, and the resolver hits your authoritative servers constantly.

DNS as a Content Delivery Mechanism

CDNs have turned DNS into a traffic steering engine. When you resolve cdn.example.com, you aren't getting a single fixed IP address. The authoritative server is making a real-time decision about which edge server to send you to.

The technique is called GeoDNS. The authoritative nameserver examines the source IP of the incoming query (which is the IP of the recursive resolver, not the end user) and maps it to a geographic location. A query from a resolver in Frankfurt gets an IP for a European edge server. A query from a resolver in Tokyo gets an IP for an Asian edge server. Cloudflare, Akamai, and AWS CloudFront all do this. It's why your TTLs for CDN-hosted domains are typically 30 to 60 seconds: the CDN needs the ability to re-steer traffic quickly in response to load changes or server failures.

The obvious problem with GeoDNS: it keys on the resolver's location, not the user's location. If a user in Tokyo configures their machine to use a recursive resolver hosted in New York (maybe a corporate VPN resolver, or just a preference for a specific provider), the CDN sees a New York source IP and returns a New York edge server IP. The user in Tokyo now has their traffic routed across the Pacific for every request, adding hundreds of milliseconds of latency.

EDNS Client Subnet (ECS, defined in RFC 7871) partially fixes this. When a recursive resolver supports ECS, it includes a truncated version of the client's IP address (typically the first 24 bits for IPv4) in the DNS query sent to the authoritative server. The authoritative server can then use the client's approximate location rather than the resolver's location for its GeoDNS decision. The tradeoff is privacy: the authoritative server (often a CDN operated by a third party) now knows approximately where the end user is located. Some privacy-focused resolvers (like Cloudflare's 1.1.1.1) send minimal or no ECS data, preferring user privacy over CDN optimization. Others (like Google's 8.8.8.8) send ECS data by default. There's no universally right answer here; it depends on whether you prioritize latency or location privacy.
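
The truncation is the whole privacy compromise, so it's worth seeing in bytes. A sketch of the ECS option body per RFC 7871 (this would sit inside the OPT record's RDATA; the client address is a documentation-range example):

```python
import struct, ipaddress

def ecs_option(client_ip, prefix_len=24):
    # ECS option body (RFC 7871): family (1 = IPv4), source prefix
    # length, scope prefix length (0 in queries), then ONLY the first
    # ceil(prefix_len / 8) bytes of the address -- the host part is
    # never transmitted.
    addr = ipaddress.ip_address(client_ip).packed[: (prefix_len + 7) // 8]
    body = struct.pack("!HBB", 1, prefix_len, 0) + addr
    # Wrapped as option-code 8 plus option-length, inside OPT RDATA.
    return struct.pack("!HH", 8, len(body)) + body

opt = ecs_option("203.0.113.77")
print(opt.hex())  # the .77 host byte never appears on the wire
```

With a /24 prefix, the authoritative server learns the client sits somewhere in 203.0.113.0/24, enough for a GeoDNS decision, but not which of the 256 hosts it is.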

Split-Horizon DNS and Internal Leaks

Split-horizon DNS (also called split-brain DNS) is a configuration where the same domain name resolves to different IP addresses depending on where the query comes from. Inside the corporate network, internal.company.com resolves to 10.0.0.50 via the internal DNS server. From the public internet, it either returns a different public IP or, more commonly, returns NXDOMAIN as if the record doesn't exist.

This is a standard enterprise pattern. It lets organizations use the same domain names internally and externally without exposing internal infrastructure. The internal resolver knows about the private IP space; the external authoritative servers only publish public-facing records.

The security problem is DNS query leakage. When a machine on the corporate network sends a DNS query for internal.company.com to an external resolver instead of the internal one, that query exposes the existence of an internal hostname to a third party. VPN split-tunnel configurations are a common source of this. If the VPN is configured to only tunnel traffic destined for specific IP ranges, DNS queries may still go out over the user's regular internet connection to their ISP's resolver or a public resolver like 8.8.8.8. Every internal hostname the user's machine tries to resolve is now visible to that external resolver's operator.

For penetration testers and attackers, leaked internal DNS names are valuable reconnaissance. Hostnames like db-prod-01.internal.company.com, jenkins.internal.company.com, or vpn-gateway.internal.company.com reveal the internal service topology, technology stack, and naming conventions. Combine this with an accidentally exposed DNS zone transfer (AXFR) and the attacker gets a complete map of every host in the internal zone. AXFR should always be restricted to authorized secondary nameservers, but misconfigurations are common enough that checking for open zone transfers is a standard step in any security assessment.

The fix is straightforward in principle: ensure all internal DNS queries go to internal resolvers, block outbound DNS to external resolvers from the corporate network, and restrict zone transfers. In practice, the proliferation of remote work, BYOD devices, and misconfigured VPN clients makes this a constant operational headache.

Stale caching is a real operational hazard. If your authoritative servers go down, resolvers with cached entries will keep serving the old answers until the TTL expires. After that, resolution fails entirely. Some resolvers (like Unbound) support "serve-stale" (RFC 8767), where they'll return expired cache entries while attempting to refresh in the background. This is a pragmatic hack for a real problem: a brief authoritative outage shouldn't mean a global DNS outage for your domain.
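
The cache logic, including the serve-stale behavior, fits in a few lines. A simplified sketch (real resolvers also kick off a background refresh when they serve a stale entry; that part is only noted in a comment here):

```python
import time

class Cache:
    # Minimal TTL cache with optional RFC 8767 style serve-stale.
    def __init__(self, serve_stale=False):
        self.entries = {}  # name -> (value, expiry timestamp)
        self.serve_stale = serve_stale

    def put(self, name, value, ttl, now=None):
        now = time.time() if now is None else now
        self.entries[name] = (value, now + ttl)

    def get(self, name, now=None):
        now = time.time() if now is None else now
        value, expiry = self.entries.get(name, (None, 0))
        if value is None:
            return None
        if now < expiry:
            return value  # fresh hit
        if self.serve_stale:
            # Expired, but a stale answer beats SERVFAIL while a
            # background refresh (not modeled here) is in flight.
            return value
        return None

c = Cache(serve_stale=True)
c.put("example.com", "93.184.216.34", ttl=3600, now=0)
print(c.get("example.com", now=7200))  # an hour past expiry, still served
```

Without serve-stale, the same lookup at now=7200 returns nothing, which is exactly the failure mode during an authoritative outage.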

Why DNS Is Insecure By Design

Paul Mockapetris designed DNS in 1983 (RFC 882 and RFC 883, later refined in RFC 1034 and 1035). The internet was a research network. There were a few hundred hosts. Trust was assumed. DNS has no built-in mechanism for verifying that a response is authentic.

The security model, such as it is, works like this: your resolver sends a query with a random 16-bit ID to port 53. It accepts the first response that matches the ID and the query name. That's it. No signatures, no certificates, no authentication.

This makes several attacks trivial.

DNS Cache Poisoning (Kaminsky Attack). In 2008, Dan Kaminsky demonstrated a devastating flaw. An attacker sends a flood of forged responses to a recursive resolver, each with a different guessed transaction ID. Since the ID is only 16 bits, there are only 65,536 possibilities. The attacker queries the resolver for random subdomains of the target (like aaa.example.com, aab.example.com), ensuring cache misses. For each query, they race to provide a forged response before the real authoritative server does. If any forged response arrives first with the correct transaction ID, the resolver caches the attacker's data. Worse, the forged response can include a delegated authority record pointing to the attacker's nameserver, poisoning the cache for the entire domain. The fix was source port randomization (adding another ~16 bits of entropy), but this is mitigation, not a real solution.
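
The arithmetic behind that mitigation is easy to check. A back-of-envelope model (the per-race forged-packet count and number of races are illustrative, not measurements):

```python
# Each race: the attacker queries a random subdomain (guaranteed cache
# miss) and fires F forged responses before the real answer lands.
# Guess space: 2^16 transaction IDs, multiplied by ~2^16 randomized
# source ports after the post-Kaminsky fix.

def success_probability(forged_per_race, races, guess_space):
    p_per_race = min(1.0, forged_per_race / guess_space)
    return 1 - (1 - p_per_race) ** races

weak = success_probability(100, 650, 2**16)    # ID only
strong = success_probability(100, 650, 2**32)  # ID + source port
print(weak, strong)
```

With only the 16-bit ID, a few hundred races gives better-than-even odds of poisoning the cache; adding port randomization pushes the same effort down to a rounding error. But it's still a probability, not a proof, which is why the text calls it mitigation rather than a solution.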

DNS Spoofing on the Wire. On any network where an attacker can see your DNS traffic (public WiFi, a compromised router, a malicious ISP), they can simply intercept queries and send back forged responses. Plain DNS is UDP on port 53 with no encryption. It's a plaintext protocol.

ISP-Level DNS Hijacking. Many ISPs intercept DNS traffic transparently. Some do it to redirect NXDOMAIN responses to ad-laden search pages. Some do it for government-mandated censorship. Some do it for "security" filtering. Because DNS is unencrypted UDP, your ISP can inspect and modify it trivially, even if you've configured a third-party resolver like 1.1.1.1. They just intercept all traffic on port 53.

The fundamental problem is simple: DNS responses are unsigned, unauthenticated UDP packets. Anyone who can get a packet to your resolver with the right query ID gets trusted.

DNS Amplification Attacks

DNS isn't just vulnerable to spoofing and poisoning. It's also one of the most effective DDoS amplification vectors on the internet.

The attack works like this. The attacker sends a DNS query to an open recursive resolver, but spoofs the source IP address to be the victim's IP. The resolver processes the query and sends the response to the victim, not the attacker. The key is the amplification factor: a DNS query is small (roughly 60 bytes), but the response can be enormous. A well-crafted ANY query against a DNSSEC-signed zone can generate a response of 3,000 to 4,000 bytes. That's a 50-70x amplification factor. The attacker sends 1 Mbps of traffic and the victim receives 50-70 Mbps of DNS responses they never asked for.
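
The numbers from that paragraph, worked through:

```python
# Reflection economics: small spoofed query in, large response out,
# delivered to the victim. Sizes are the rough figures cited above.
query_bytes = 60       # typical query including IP/UDP headers
response_bytes = 3500  # ANY response from a DNSSEC-signed zone

amplification = response_bytes / query_bytes
victim_mbps = 1 * amplification  # per 1 Mbps the attacker sends
print(round(amplification, 1), round(victim_mbps, 1))
```

The attacker's cost is the small number; the victim's is the large one, and the open resolver in the middle pays the compute bill.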

Open recursive resolvers are the enabler. A properly configured resolver only answers queries from its own clients. But millions of resolvers on the internet are configured to answer queries from anyone. Each one is a potential amplification node. The attacker needs only a list of open resolvers (easily found with internet-wide scanning) and the ability to spoof source IP addresses.

Source address validation (BCP38, RFC 2827) would eliminate the IP spoofing that makes this possible. If every network verified that outbound packets had source addresses belonging to that network, spoofed packets would never leave the originating network. The standard has existed since 2000. Deployment is still incomplete because it requires coordination across thousands of independent network operators, and there's no direct incentive for any individual network to implement it. You bear the cost of deploying the filter, but the benefit goes to everyone else on the internet.

The scale of DNS amplification attacks is staggering. The 2013 attack against Spamhaus peaked at roughly 300 Gbps, almost entirely generated through DNS amplification. At the time, it was the largest DDoS attack ever recorded and caused collateral congestion across major internet exchange points in Europe. The technique has only gotten more refined since then, often combined with other amplification protocols (NTP, memcached) in multi-vector attacks.

DNS Rebinding Attacks

DNS rebinding is subtler than amplification, but in some ways more dangerous. It turns the victim's own browser into an attack tool against their local network.

Here's the setup. An attacker controls a domain, evil.com, and runs the authoritative DNS server for it. The victim visits evil.com in their browser (maybe through a phishing link, a malicious ad, or an iframe on a compromised site). On the initial DNS lookup, the attacker's server returns the real IP of the attacker's web server, and the browser loads a page containing JavaScript. Everything looks normal so far.

Now the attacker's DNS server changes its response. The TTL on the initial record was set very low (a few seconds). When the JavaScript on the page makes a subsequent request to evil.com, the browser re-resolves the domain. This time, the attacker's DNS returns 127.0.0.1, or 192.168.1.1, or any other internal IP address. The browser's same-origin policy considers this request valid, because the JavaScript originated from evil.com and it's still making requests to evil.com. But the actual HTTP request now hits the victim's localhost or a device on their internal network.

The result: attacker-controlled JavaScript making authenticated requests to services on your local network, behind your firewall. It can reach your router's admin panel, your NAS, your printer, your smart home hub, anything with a web interface on the local network. For IoT devices, which often have minimal authentication and unpatched firmware, this is devastating.

Countermeasures exist but require active configuration. Some recursive resolvers (like dnsmasq and Unbound) can be configured to block DNS responses that contain private IP ranges (RFC 1918 addresses) when the query was for an external domain. Modern browsers have added some protections, but coverage is inconsistent. The fundamental problem is that DNS rebinding exploits the gap between DNS resolution (which operates on IP addresses) and same-origin policy (which operates on domain names), and closing that gap completely without breaking legitimate use cases is difficult.
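
The resolver-side filter amounts to one check: did an external name just resolve to an internal address? A sketch of that dnsmasq/Unbound-style policy (the local-zone names are hypothetical placeholders for whatever zones a site considers internal):

```python
import ipaddress

# Zones whose names are allowed to resolve to internal addresses.
LOCAL_ZONES = {"home.arpa", "internal"}

def is_rebinding_suspect(query_name, answer_ip):
    name = query_name.rstrip(".")
    if any(name == z or name.endswith("." + z) for z in LOCAL_ZONES):
        return False  # internal names may legitimately hold internal IPs
    ip = ipaddress.ip_address(answer_ip)
    # External name answering with RFC 1918 / loopback / link-local:
    # the classic rebinding signature.
    return ip.is_private or ip.is_loopback or ip.is_link_local

print(is_rebinding_suspect("evil.com", "192.168.1.1"))    # True: drop it
print(is_rebinding_suspect("evil.com", "93.184.216.34"))  # False: fine
```

The legitimate-use caveat is real: some services (certain VPN and local-development setups) intentionally publish private IPs under public names, which is why this filter has to be opt-in and zone-aware rather than universal.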

DNSSEC: Signing, Not Encrypting

DNSSEC (DNS Security Extensions, RFC 4033-4035) adds cryptographic signatures to DNS records. Here's how the chain of trust works.

The root zone is signed with a root key (the Root Zone Signing Key). This key signs DS (Delegation Signer) records for each TLD. The .com zone has its own key pair, which signs DS records for each second-level domain. Each authoritative zone signs its own records with RRSIG records using keys published in DNSKEY records. A validating resolver can walk the chain: root key signs .com DS, .com key signs example.com DS, example.com key signs the actual A record. If any signature doesn't validate, the response is rejected.
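
The chain walk can be modeled in miniature. A toy model only: real DNSSEC uses public-key signatures (RSA or ECDSA) so that anyone can verify without holding the key; HMAC here just illustrates the structure, where each parent vouches for a digest of its child's key:

```python
import hmac, hashlib

def sign(key, data):
    return hmac.new(key, data, hashlib.sha256).digest()

def key_digest(key):
    return hashlib.sha256(key).digest()

root_key, com_key, example_key = b"root-ksk", b"com-zsk", b"example-zsk"

ds_com = sign(root_key, key_digest(com_key))         # root signs .com's DS
ds_example = sign(com_key, key_digest(example_key))  # .com signs example.com's DS
rrsig = sign(example_key, b"blog.example.com. A 93.184.216.34")  # zone signs its records

def validate(record):
    # Walk the chain top-down; every link must hold.
    ok = hmac.compare_digest(ds_com, sign(root_key, key_digest(com_key)))
    ok &= hmac.compare_digest(ds_example, sign(com_key, key_digest(example_key)))
    ok &= hmac.compare_digest(rrsig, sign(example_key, record))
    return bool(ok)

print(validate(b"blog.example.com. A 93.184.216.34"))  # True
print(validate(b"blog.example.com. A 6.6.6.6"))        # False: tampered
```

The all-or-nothing property is the point: a forged A record fails validation even if every other link in the chain is intact.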

What DNSSEC does: it lets a resolver verify that a response is authentic and hasn't been tampered with. It prevents cache poisoning and spoofing.

What DNSSEC does not do: it provides zero privacy. Queries and responses are still plaintext. Everyone on the network can still see what domains you're looking up. DNSSEC also doesn't encrypt anything. It only signs.

Adoption is still underwhelming. Only a small fraction of domains are signed (single-digit percentages in .com), and APNIC's measurements put the share of users behind validating resolvers at roughly a third. The operational burden is real. Zone signing requires generating and managing cryptographic keys, performing key rollovers on schedule, and re-signing the zone whenever records change. A botched key rollover can make your entire domain unresolvable. ICANN's own root key rollover in 2018 was a years-long project, delayed repeatedly over fears it would break resolution for a significant fraction of the internet.

DoH and DoT: Encrypting the Last Mile

DNS over TLS (DoT, RFC 7858) wraps DNS queries in a TLS connection on port 853. DNS over HTTPS (DoH, RFC 8484) sends DNS queries as HTTPS requests on port 443. Both encrypt the content of DNS queries and responses between the stub resolver and the recursive resolver.

What they protect: the privacy of your DNS queries from anyone between you and your recursive resolver. Your ISP can no longer see (or tamper with) which domains you're resolving. On public WiFi, your DNS lookups are encrypted. ISP-level DNS injection becomes impossible if you use DoH, because the traffic is indistinguishable from regular HTTPS traffic on port 443.

What they don't protect: the recursive resolver itself sees every query you make. If you're using Cloudflare's 1.1.1.1, Cloudflare sees your queries. If you're using Google's 8.8.8.8, Google sees them. You've moved the trust from your ISP to your DNS provider. Whether that's an improvement depends on who you trust less.

DoH has a specific advantage over DoT for censorship resistance. DoT uses a dedicated port (853), which is trivial for a firewall to block. DoH uses port 443, the same port as all HTTPS traffic. Blocking it means blocking HTTPS entirely. This is exactly why some governments and ISPs oppose DoH: it makes DNS censorship significantly harder.
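
That "indistinguishable from HTTPS" property comes from DoH carrying an ordinary DNS message as an HTTP payload. Per RFC 8484, a GET request encodes the raw message as unpadded base64url in a dns= parameter. A sketch (the resolver URL is a placeholder; the message ID is zero, which the RFC recommends for HTTP cache friendliness):

```python
import base64, struct

def qname(name):
    # Same length-prefixed label encoding as plain DNS.
    return b"".join(bytes([len(l)]) + l.encode() for l in name.split(".")) + b"\x00"

# A standard A query for example.com, ID 0, RD set -- byte-for-byte
# identical to what would go over UDP port 53.
msg = (struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
       + qname("example.com") + struct.pack("!HH", 1, 1))

# RFC 8484: base64url, padding stripped, in the "dns" query parameter.
param = base64.urlsafe_b64encode(msg).rstrip(b"=").decode()
url = f"https://dns.example/dns-query?dns={param}"
print(url)
```

The wire format never changed; only the transport did. That's why DoH could be deployed without touching any authoritative server on the planet.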

The privacy argument gets complicated. Encrypted DNS hides your queries from the network, but the resolver still sees them. Your browser still connects to the IP address it resolved, so anyone watching network traffic can often infer the domain from the destination IP (especially for sites that don't share IPs). SNI in TLS handshakes also leaks the domain name, though Encrypted Client Hello (ECH) is addressing that separately.

Real-World DNS Attacks

The 2016 Dyn Attack. On October 21, 2016, a massive DDoS attack targeted Dyn, a major managed DNS provider. The Mirai botnet, composed of hundreds of thousands of compromised IoT devices (cameras, DVRs, routers), flooded Dyn's infrastructure with traffic. Because Dyn provided authoritative DNS for major services, the attack made Twitter, Netflix, Reddit, GitHub, and dozens of other sites unreachable for hours. The sites themselves were fine. Their DNS provider was overwhelmed, so nobody could resolve their domain names. This exposed a critical concentration risk: too many major services depending on a single DNS provider.

BGP Hijacking of DNS. In 2018, attackers used BGP hijacking to reroute traffic destined for Amazon's Route 53 DNS service. By announcing more specific IP prefixes, they redirected DNS queries to their own servers, which served forged responses pointing myetherwallet.com to a phishing server. Users who happened to use the affected resolvers had their cryptocurrency stolen. The attack combined two layers of internet infrastructure (BGP and DNS), both of which lack authentication by default.

ISP DNS Injection. This isn't an "attack" in the traditional sense, but it's widespread. ISPs like Comcast, Vodafone, and others have been caught intercepting DNS queries and injecting modified responses. The most common variant redirects NXDOMAIN responses to ISP-operated search and advertising pages. Some ISPs inject tracking identifiers into DNS responses. This is possible because plain DNS is unencrypted, unauthenticated UDP.

DNS Tunneling

DNS can be used as a covert communication channel, and it is surprisingly effective. The basic idea: encode arbitrary data in DNS queries and responses. An attacker sets up an authoritative nameserver for a domain they control (say, tunnel.evil.com). The compromised machine inside the target network sends DNS queries like dGhpcyBpcyBlbmNvZGVk.tunnel.evil.com, where the subdomain is base64-encoded data. The attacker's nameserver receives the query, decodes the subdomain, and sends back a response with data encoded in TXT records or other record types. Full bidirectional communication, carried entirely over DNS.

This works because almost every network allows DNS traffic. Corporate firewalls, hotel captive portals, air-gapped-adjacent networks: they all need DNS to function, so port 53 is almost always open. Tools like iodine and dnscat2 implement complete TCP-over-DNS tunnels, letting you SSH through a network that blocks everything except DNS. The throughput is low (typically tens of kilobytes per second), but for command-and-control traffic or slow data exfiltration, it is more than sufficient.
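
The encoding side of such a tunnel is almost trivial. A sketch (tunnel.evil.com stands in for the attacker-controlled domain; base32 rather than base64 because hostnames are case-insensitive and '+', '/', '=' aren't hostname-safe; labels max out at 63 bytes and whole names at 255):

```python
import base64

TUNNEL_DOMAIN = "tunnel.evil.com"  # hypothetical attacker domain

def to_query_names(data, label_size=50):
    encoded = base64.b32encode(data).decode().rstrip("=").lower()
    chunks = [encoded[i:i + label_size]
              for i in range(0, len(encoded), label_size)]
    # A sequence-number label lets the far end reassemble lost or
    # reordered queries.
    return [f"{i}.{chunk}.{TUNNEL_DOMAIN}" for i, chunk in enumerate(chunks)]

names = to_query_names(b"4111-1111-1111-1111")  # exfiltrated card number
print(names)
```

Each resulting name is a syntactically valid hostname that any resolver will happily forward to the attacker's authoritative server, which simply base32-decodes the labels on arrival.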

Detecting DNS tunneling is harder than it sounds. The queries are syntactically valid DNS. But they have distinctive characteristics: unusually long subdomain labels, high query volumes to a single domain, heavy use of TXT record lookups, and entropy patterns in the query names that look nothing like normal hostnames. Catching this requires either deep packet inspection on DNS traffic or statistical analysis of query patterns, and most organizations do neither.
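
One of those statistical tells is cheap to compute: the Shannon entropy of the leftmost label. Encoded tunnel payloads look like random base32; human-chosen hostnames don't. A sketch (the threshold any real detector would use is a tuning decision, not shown):

```python
import math
from collections import Counter

def label_entropy(name):
    # Shannon entropy (bits per character) of the first label.
    label = name.split(".")[0]
    counts = Counter(label)
    n = len(label)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

normal = label_entropy("mail.example.com")
tunnel = label_entropy("gqydamrrge4s2mjryeaxmjfgs3dp.tunnel.evil.com")
print(round(normal, 2), round(tunnel, 2))  # the tunnel label scores far higher
```

Entropy alone produces false positives (CDN hostnames and DGA-resistant services also use random-looking labels), which is why real detection combines it with query volume, label length, and record-type distribution.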

Real malware has used this technique extensively. The FrameworkPOS malware exfiltrated stolen credit card data entirely through DNS queries, encoding card numbers as subdomain lookups. The Morto worm used DNS TXT records for C2 communication. Several APT groups have used DNS tunneling as a backup C2 channel that persists even when HTTP-based channels are blocked. If your network monitoring ignores DNS content (and most do), you have a blind spot large enough to exfiltrate a database through.

What You Can Actually Do

Run your own recursive resolver. Unbound is a solid, well-maintained recursive resolver. Install it, point your machines at it, and you no longer depend on your ISP's resolver or a third party. Unbound can validate DNSSEC, serve stale records, and forward to upstream resolvers over TLS if you want encrypted transport for the upstream leg. The configuration is straightforward:

server:
  interface: 127.0.0.1
  access-control: 127.0.0.0/8 allow
  auto-trust-anchor-file: "/var/lib/unbound/root.key"
  tls-cert-bundle: "/etc/ssl/certs/ca-certificates.crt"
 
forward-zone:
  name: "."
  forward-tls-upstream: yes
  forward-addr: 1.1.1.1@853#cloudflare-dns.com
  forward-addr: 1.0.0.1@853#cloudflare-dns.com

Use encrypted DNS. At minimum, configure DoH or DoT in your OS or browser. Firefox and Chrome both support DoH natively. On Linux, systemd-resolved supports DoT. This won't make you anonymous, but it stops your ISP from seeing and tampering with your lookups.

Enable DNSSEC validation. If you run your own resolver, enable DNSSEC validation. Unbound does this out of the box with the auto-trust-anchor-file directive. This won't help for the majority of domains that haven't signed their zones, but for the ones that have, it prevents cache poisoning.

Monitor for DNS leaks. If you're using a VPN, check that your DNS queries are actually going through the tunnel. DNS leak test sites exist for this purpose. A VPN that leaks DNS queries gives your ISP a complete log of every domain you visit, defeating much of the point.

Check your domain's DNS configuration. If you operate domains, sign your zones with DNSSEC. Use multiple DNS providers for redundancy (the Dyn attack taught us this). Set reasonable TTLs. Monitor your DNS resolution from external vantage points.

The Centralization Problem

DNS was designed as a distributed system. The hierarchy of root servers, TLD servers, and authoritative servers was supposed to spread both load and risk across many independent operators. In practice, DNS has become dangerously centralized.

On the recursive resolver side, a huge fraction of consumer DNS queries now flow through just two companies. Cloudflare's 1.1.1.1 and Google's 8.8.8.8 handle a dominant share of public recursive resolution. For authoritative DNS, the concentration is equally stark: AWS Route 53, Cloudflare DNS, and Google Cloud DNS host a massive percentage of all authoritative zones on the internet. The root servers, while anycasted across hundreds of physical nodes, are still operated by only 12 organizations.

This concentration creates exactly the kind of single points of failure that distributed systems are supposed to prevent. When AWS Route 53 has an incident, thousands of domains become unreachable simultaneously. Cloudflare's June 2022 outage took a significant slice of the web offline in a single event. These aren't hypotheticals. They keep happening.

The tension is real. Managed DNS is operationally easier: you get automatic failover, global anycast, DDoS mitigation, and a nice API. Running your own authoritative DNS is painful, and most organizations lack the expertise. So everyone migrates to the same three or four providers, and the "distributed" system becomes a system where a single provider's bad config push or capacity failure cascades across millions of domains. The convenience is genuine. The resilience tradeoff is rarely discussed until something breaks.

The Uncomfortable Truth

DNS is a 40-year-old protocol that underpins virtually everything on the internet. It was designed for a trusted network of a few hundred machines and now serves billions of devices in an adversarial environment. The fixes (DNSSEC, DoH, DoT) are real improvements, but they're layers of armor bolted onto a protocol that was never designed to wear it. DNSSEC adoption is glacially slow. Encrypted DNS shifts trust rather than eliminating it. And the entire system still depends on a handful of root server operators, TLD registries, and large DNS providers.

The DNS infrastructure works remarkably well for something this old and this fundamental. But "works well" and "is secure" are very different statements. Every DNS query you make is an act of trust in a system that has no built-in reason to be trusted. Understanding the mechanics of that system is the first step toward making informed decisions about how to harden it.