From dbd87f982b76e3f04c18e933500a0a081074986d Mon Sep 17 00:00:00 2001 From: bert hubert Date: Wed, 28 Mar 2018 22:20:38 +0200 Subject: [PATCH] wip --- README.md | 423 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 423 insertions(+) diff --git a/README.md b/README.md index 18de115..954e4ed 100644 --- a/README.md +++ b/README.md @@ -1,2 +1,425 @@ # hello-dns Hello and welcome to DNS! + +This document attempts to provide a correct introduction to the Domain Name +System as of 2018. The original RFCs remain the authoritative source of +normative text, but this document tries to be in full alignment with all +relevant and useful RFCs. + +Although we start from relatively basic principles, the reader is expected +to know what IP addresses are, what a (stub) resolver is and what an +authoritative server is supposed to do. When in doubt: authoritative servers +'host' DNS, 'resolvers' look up things over at authoritative servers and +clients run 'stub resolvers' to look things up over at resolvers. + +DNS was originally written down in August 1979 in 'IEN 116', a parallel +series of documents describing the internet. IEN 116 era DNS is not +compatible with today's DNS. In 1983, RFC 882 was released, and stunningly +enough, an implementation of this 35 year old document would function +on the internet and be interoperable. + +RFC attained its modern form in 1987 when RFC 1034 and 1035 were published. +Although most of 1034/1035 remains valid, these standards are not that easy +to read because they were written in a very different time. + +The main goal of this document is not to contradict 1034 and 1035 but to +provide an easier entrypoint into DNS. + +## Layout +We start off with a general introduction of DNS basics: what is a resource +record, what is a RRSET, what is a zone, what is a zone-cut, how are packets +laid out. This part is required reading for anyone ever wanting to query a +nameserver or emit a valid response. + +We then specialize into what applications can expect when they send +questions to a resolver, or what a stub-resolver can expect. + +The next part is about what an authoritative server is supposed to do. On +top of this, we describe in slightly less detail how a resolver could +operate. Finally, there is a section on DNSSEC. + +Note that this file, which describes DNS basics, absolutely must be read from +beginning to end in order for the rest of the documents (or DNS) to make +sense. + +## DNS Basics +In this section we will initially ignore optional extensions that were added +to DNS later, specifically EDNS and DNSSEC which requires EDNS to function. + +DNS is mostly used to serve IP addresses and mailserver details, but it can +contain arbitrary data. DNS is all about names. Every name can have data +of several *types*. The most well known externally useful types are *A* for +IPv4 addresses, *AAAA* for IPv6 addresses and *MX* for mailserver details. +DNS also has types that have meaning for its own use, like *NS*, *CNAME* and +*SOA*. + +When we ask a DNS question we call this a *query*. We call the reply the +*response*. These queries and responses are contained in DNS messages. When +UDP is used, the message is also the packet. + +A DNS message has: + + * A header + * A query name and query type + * An answer section + * An authority section + * An additional section + +The header has the following fields that are useful for queries and +responses: + + * ID: a 16 bit identifier used as part of the process of matching queries to responses + * QR: Set to 0 to identify a message as a query, 1 for a response + * OPCODE: 0 for a standard query, other opcodes also exist + * RD: Set to indicate that this question wants *recursion* + +Relevant for responses: + * AA: This answer has Authoritative Answers + * RA: Recursive service was available + * TC: Not all the required parts of the answer fit in the message + +In basic DNS, query messages should have no answer, authority or additional +sections. DNS queries are mostly sent over UDP, and UDP packets can easily +be spoofed. To recognize the authentic response to a query it is important +that the ID field is random or at least unpredictable. This is however not +enough protection, so the source port of a UDP DNS query must also be +unpredictable. + +The header of a question for the IPv6 address of www.ietf.org looks like +this: + +``` + 1 1 1 1 1 1 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | ID = random 16 bits | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + |QR| Opcode |AA|TC|RD|RA| Z | RCODE | + |0 | 0 |0 | 0| 0|0 | 0 | 0 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | QDCOUNT = 1 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | ANCOUNT = 0 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | NSCOUNT = 0 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | ARCOUNT = 0 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ +``` + +Note that we did not spend time on field Z, this is because it is defined to +be 0 at all times. This packets also requests recursion. QDCOUNT = 1 means +there is 1 question. In theory DNS supported several questions in one +message, but this has not been implemented. ANCOUNT, NSCOUNT and ARCOUNT are +all zero, indicating there as no answers in this question packet. + +Here is the actual question: + +``` + 1 1 1 1 1 1 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | 3 w | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | w w | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | 4 i | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | e t | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | f 3 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | o r | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | g 0 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | 0 28 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | 0 1 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ +``` + +This consists of the 'www.ietf.org' encoded in DNS wire format (for which +see below), followed by a 16 bit type field. For AAAA, which denotes the +IPv6 address, this is 28. This is then followed by the 'class' of the +question. It was originally intended that DNS records would exist in +different 'classes', but the semantics of this were not specified completely +and it was not really implemented. For now, always set class to 1. + +Of specific note is the somewhat unusual way the name 'www.ietf.org' is +serialized in DNS. 'www.ietf.org' consists of 3 labels of lenghts 3, 4 and +3. In DNS messages, this is encoded as the value 3, then www, then the +value 4, then ietf, then 3 followed by org. Then there is a trailing 0 +which denotes this is the end. + +This format is unusual, but has several highly attractive properties. For +example, it is binary safe and it needs no escaping. When writing DNS +software, it may be tempting to pass DNS names around as "ASCII". This then +leads to escaping an unescaping code in lots of places. It is highly +recommended to use the native DNS encoding to store DNS names. This will +save a lot of pain when processing DNS names with spaces or dots in them. + +Finally, DNS queries are case-insensitive. This however is defined rather +mechanically. Operators do not need to know that in some ASCII encodings a Ü +is equivalent to ü when compared case insensitively. For DNS purposes, the +fifth bit (0x20) is ignored when comparing octets within a-Z and A-Z. + +Note that individual labels of a name may only be 63 octets long. + +Next up, a DNS response. Note that this again is a DNS message, and it looks +a lot like the original DNS query. Here is the beginning of a response: + +``` + 1 1 1 1 1 1 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | ID = same random 16 bits | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + |QR| Opcode |AA|TC|RD|RA| Z | RCODE | + |1 | 0 | 1| 0| 0| 0| 0 | 0 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | QDCOUNT = 1 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | ANCOUNT = 1 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | NSCOUNT = 0 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | ARCOUNT = 0 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | 3 w | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | w w | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | 4 i | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | e t | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | f 3 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | o r | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | g 0 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | 0 28 (0x1c) | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | 0 1 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + +``` + +Note that QR is now set to 1 to denote a response. The 'AA' bit was set +because this answer came from a from a server authoriative for this name. + +In addition, ANCOUNT is now set to '1', indicating a single answer is to be +found in the message, immediately after the original question, which has been +repeated from the query message. + +To recognize the right response, check that the ID field is the same as the +query, make sure the answer arrives on the right source port and that the +query name and type match up with the original query. In addition, make sure +not to send out more than one equivalent query when still waiting for the +response, as doing so opens a security hole. + +After the header and the original question we find the answer: + +``` + 1 1 1 1 1 1 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | 0xc0 0x0c | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | 00 28 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | 00 01 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | TTL = 3600 | + | | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | RDLENGTH = 16 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--| + | 24 00 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | cb 00 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | 20 48 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | 00 01 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | 00 00 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | 00 00 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | 68 14 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ + | 00 55 | + +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ +``` + +The first two bytes (0xc0 0c0c) look rather mysterious. When DNS was +created, 512 octets was considered the maximum size of a UDP datagram and +thus the maximum size of a DNS message transported without using the (then +slow) TCP protocol. + +In order to squeeze as much information as possible into the 512 bytes, DNS +names can (and often MUST) be compressed. The details of this compression +are arcane and easy to get wrong, leading to infinite loops or buffer +overflows. So tread very carefully. If you remember one thing, make sure +that a pointer always has to go to a lower position in the packet. Also +beware of signed/unsigned arithmetic. + +In this case, the DNS name of the answer is encoded is '0xc0 0x0c'. The c0 +part has the two most significant bits set, indicating that the following +6+8 bits are a pointer to somewhere earlier in the message. In this case, +this points to position 12 (= 0x0c) within the packet, which is immediately +after the DNS header. There we find 'www.ietf.org'. + +So what this means is that the answer about the DNS name 'www.ietf.org' is +also called 'www.ietf.org'. + +This is then followed in the packet by '28', which denotes AAAA (IPv6), and +the usual 'class' of 1. Then a whole 32 bits are devoted to the Time To Live +of this record, followed by a 16 bits length field. Since this is an IPv6 +address, the actual answer payload length is 16 bytes (or 128 bits). + +This is then followed by the binary representation of the current IPv6 +address of www.ietf.org, 2400:cb00:2048:1::6814:55. + +## RRSETs +In the example above, the question for the AAAA record of 'www.ietf.org' had +exactly one corresponding resource record. In a human readable 'zone file', +this would stored as: + +``` +www.ietf.org IN AAAA 3600 2400:cb00:2048:1::6814:55 +``` + +It is however possible to have multiple AAAA records for the same name. Even +if there is only one record, the DNS specifications talk about 'Resource +Record Sets', or RRSETs. These operate in unity. So even though the encoding +in the DNS packet allows different TTL values within a single RRSET, this +should never happen. + +## Zone files +Zone files are one way of storing DNS data, but these are not integral to +the operation of a nameserver. The zone file format is standardised, but it +is highly non-trivial to parse. It is entirely possible to write useful +nameserver that do not read or write DNS zone files. When embarking on +parsing zonefiles, do not do so lightly. As an example, various fields +within a single line can appear in many orders. Most fields are optional, +and some will then be copied from the previous line. But not all. + +Of specific note, many people have attempted to write a grammar (parser) for +zonefiles and it is almost impossible. + +## Zones +The concept of a DNS zone is non-trivial and frequently misunderstood. +Despite writing 'www.ietf.org' from left to right, within DNS it is fairer +to describe it as 'org' below the root node, with below the 'org' node a +node called 'ietf'. Finally to the 'ietf' node is attached a node called +'www'. + + +Or in graphical form: + +``` + +-----+ + | . | + +-----+ + | + +-----+ + | ORG | + +-----+ + | + +------+ + | IETF | + +------+ + | + +-----+ + | WWW | + +-----+ +``` + +To make life confusing, 'www.ietf.org' could be defined in four different +places. It could be in the 'root' zone itself, fully written out: + +``` +www.ietf.org IN AAAA 3600 2400:cb00:2048:1::6814:55 +``` +Or it could be in the org zone, where it might look like this: + +``` +$origin ORG +www.ietf IN AAAA 3600 2400:cb00:2048:1::6814:55 +``` + +Or, (as is actually the case), this name could live in the 'ietf.org' zone: + +``` +$origin ietf.org +www IN AAAA 3600 2400:cb00:2048:1::6814:55 +``` + +And finally, it is even possible that there is a zone called 'www.ietf.org', +where the record lives like this: + +``` +$origin www.ietf.org +@ IN AAAA 3600 2400:cb00:2048:1::6814:55 +``` + +For each of these four scenarios, the 'tree' of nodes as shown above is +real. This for example means that if there is a name called +'www.fr.ietf.org' and a query comes in for 'fr.ietf.org', that name +exists - even though no records may be assigned to it. + +NOTE: This means that any implementation that sees DNS as a simple +'key/value' store, where only records that exist can match, is headed for +trouble down the line. + +## Start of Authority, zone cuts, delegations +As noted above, 'www.ietf.org' can live in four places: the root zone, the +org zone, the ietf.org zone and even in a zone that is itself called +'www.ietf.org'. + +A zone always starts with a SOA or Start Of Authority record. A SOA record +is DNS metadata. It stores various things that may be of interest about a +zone, like the email address of the maintainer, the name of the most +authoritative server. It also has time intervals that describe how a zone +needs to be replicated. Finally, the SOA record has a number that influences +TTL values for names that do or do not exist. + +There is only one SOA that is guaranteed to exist and that is the one for +the root zone (called '.'). As of 2018, it looks like this: + +``` +. 86400 IN SOA a.root-servers.net. nstld.verisign-grs.com. 2018032802 1800 900 604800 86400 +``` + +This says: the authoritative server for the root zone is called +'a.root-servers.net'. This name is however only used for diagnostics. +Secondly, nstld@verisign-grs.com is the email address of the zone +maintainer. Note that the '@' is replaced by a dot. Specifically, if the +email address had been 'nstld.maintainer@verisign-grs.com', this would have +been stored as nstld\.maintainer.verisign-grs.com. This name would then +still be 3 labels long, but the first one has a dot in it. + +The following field, 2018032802, is a serial number. Quite often, but by all +means not always, this is a date in proper order (YYYYMMDD), followed by two +digits of serial numbers. This serial number is used for replication +purposes, as are the following 3 numbers. + +Zones are hosted on 'masters'. Meanwhile, 'slave' servers poll the master +for updates, and pull down a new zone if they see new contents, as noted by +an increase in serial number. + +The numbers 1800 and 900 describe how often a zone should be checked for +updates (twice an hour), and that if an update check fails it should be +repeated after 900 seconds. Finally, 604800 says that if a master server +was unreachable for over a week, the zone should be deleted from the slave. +This is not a popular feature. + +The final number, 86400, denotes that if a response says a name or RRSET +does not exist, it will continue to not exist for the next day, and that +this knowledge may be cached. + +