rebase
This commit is contained in:
parent
c6382f05ec
commit
e2950ae08e
3
.gitmodules
vendored
Normal file
@ -0,0 +1,3 @@
|
||||
[submodule "tdns/ext/simplesocket"]
|
||||
path = tdns/ext/simplesocket
|
||||
url = https://github.com/ahupowerdns/simplesocket.git
|
650
README.md
@ -1,6 +1,5 @@
|
||||
<meta charset="utf-8" emacsmode="-*- markdown -*-">
|
||||
**A warm welcome to DNS**
|
||||
<link rel="stylesheet" href="https://casual-effects.com/markdeep/latest/apidoc.css?">
|
||||
|
||||
# Hello, and welcome to DNS!
|
||||
|
||||
@ -49,12 +48,15 @@ enthusiasm for improving the state of DNS.
|
||||
## Layout
|
||||
The content is spread out over several documents:
|
||||
|
||||
* The core of DNS (this file)
|
||||
* [The core of DNS](basic.md.html)
|
||||
* [Relevant to stub resolvers and applications](stub.md.html)
|
||||
* [Relevant to authoritative servers](auth.md.html)
|
||||
* [Relevant to resolvers](resolver.md.html)
|
||||
* Optional elements: [EDNS, TSIG, Dynamic Updates, DNAME, DNS Cookies](optional.md.html)
|
||||
* [Privacy related](privacy.md.html): QName minimization, DNS-over-TLS, DNS-over-HTTPS, EDNS Padding
|
||||
* [DNSSEC](dnssec.md.html)
|
||||
* [non-IETF standards](non-ietf.md.html): RRL and RPZ
|
||||
* [Rare parts of DNS](rare.md.html) - not obsolete, but not frequently encountered in production
|
||||
|
||||
We start off with a general introduction of DNS basics: what is a resource
|
||||
record, what is an RRSET, what is a zone, what is a zone-cut, how are packets
|
||||
@ -74,648 +76,6 @@ authoritative and resolver functions. This turns out to make both code and
|
||||
troubleshooting harder. Therefore, in these documents, the authoritative and
|
||||
caching functions are described separately.
|
||||
|
||||
Note that this file, which describes DNS basics, absolutely must be read from
|
||||
beginning to end in order for the rest of the documents (or DNS) to make
|
||||
sense.
|
||||
|
||||
# DNS Basics
|
||||
In this section we will initially ignore optional extensions that were added
|
||||
to DNS later, specifically EDNS and DNSSEC.
|
||||
|
||||
This file corresponds roughly to the fundamental parts of RFCs 1034, 1035,
|
||||
2181, 2308, 3596, 4343, 5452, 6604.
|
||||
|
||||
DNS is mostly used to serve IP addresses and mailserver details, but it can
|
||||
contain arbitrary data. DNS is all about names. Every name can have data
|
||||
of several *types*. The most well known externally useful types are *A* for
|
||||
IPv4 addresses, *AAAA* for IPv6 addresses and *MX* for mailserver details.
|
||||
DNS also has types that have meaning for its own use, like *NS*, *CNAME* and
|
||||
*SOA*.
|
||||
|
||||
When we ask a DNS question we call this a *query*. We call the reply the
|
||||
*response*. These queries and responses are contained in DNS messages. When
|
||||
UDP is used, the message is also the packet.
|
||||
|
||||
A DNS message has:
|
||||
|
||||
* A header
|
||||
* A query name and query type
|
||||
* An answer section
|
||||
* An authority section
|
||||
* An additional section
|
||||
|
||||
In basic DNS, query messages should have no answer, authority or additional
|
||||
sections.
|
||||
|
||||
The header has the following fields that are useful for queries and
|
||||
responses:
|
||||
|
||||
* ID: a 16 bit identifier used as part of the process of matching queries to responses
|
||||
* QR: Set to 0 to identify a message as a query, 1 for a response
|
||||
* OPCODE: 0 for a standard query, other opcodes also exist
|
||||
* RD: Set to indicate that this question wants *recursion*
|
||||
|
||||
Relevant for responses:
|
||||
* AA: This response has Authoritative Answers
|
||||
* RA: Recursive service was available
|
||||
* TC: Not all the required parts of the response fit in the UDP message
|
||||
* RCODE: Result code. 0 is ok, 2 is SERVFAIL, 3 is NXDOMAIN.
|
||||
|
||||
DNS queries are mostly sent over UDP, and UDP packets can easily be spoofed.
|
||||
To recognize the authentic response to a query it is important that the ID
|
||||
field is random or at least unpredictable. This is however not enough
|
||||
protection, so the source port of a UDP DNS query must also be
|
||||
unpredictable.
|
||||
|
||||
DNS messages can also be sent over TCP/IP. Because TCP is not a datagram
|
||||
oriented protocol, each DNS message in TCP/IP is preceded by a 16 bit
|
||||
network endian length field.
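
The length prefix described above can be sketched in a few lines of Python. The helper names `frame_for_tcp` and `split_tcp_stream` are hypothetical and chosen only for this illustration; the sketch assumes complete messages are handed in as `bytes`.

```python
import struct


def frame_for_tcp(message):
    """Prefix a DNS message with its 16 bit length in network byte order."""
    if len(message) > 0xFFFF:
        raise ValueError("DNS message too large for a 16 bit length field")
    return struct.pack("!H", len(message)) + message


def split_tcp_stream(buffer):
    """Split received TCP bytes into complete DNS messages.

    Returns (messages, leftover), where leftover holds the bytes of a
    message that has not fully arrived yet.
    """
    messages = []
    while len(buffer) >= 2:
        (length,) = struct.unpack("!H", buffer[:2])
        if len(buffer) < 2 + length:
            break  # wait for more data
        messages.append(buffer[2:2 + length])
        buffer = buffer[2 + length:]
    return messages, buffer
```

Because TCP may deliver a message in pieces, a reader has to keep the leftover bytes and retry once more data arrives.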
|
||||
|
||||
DNS servers must listen on both UDP and TCP, port 53.
|
||||
|
||||
The header of a question for the IPv6 address of www.ietf.org looks like
|
||||
this:
|
||||
|
||||
|
||||
***************************************************************
*                                  1  1  1  1  1  1
*    0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
*  |              ID = random 16 bits              |
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
*  |QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
*  |0 |     0     |0 | 0| 0|0 |   0    |     0     |
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
*  |                  QDCOUNT = 1                  |
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
*  |                  ANCOUNT = 0                  |
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
*  |                  NSCOUNT = 0                  |
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
*  |                  ARCOUNT = 0                  |
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
*
***************************************************************
|
||||
|
||||
|
||||
Note that we did not spend time on field Z, this is because it is defined to
|
||||
be 0 at all times. This packet does not request recursion. QDCOUNT = 1
|
||||
means there is 1 question. In theory DNS supported several questions in one
|
||||
message, but this has not been implemented. ANCOUNT, NSCOUNT and ARCOUNT
|
||||
are all zero, indicating there are no answers in this question packet.
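
As a minimal sketch of the header layout described above, the following Python packs the six 16 bit header fields of a query. The function name `build_query_header` is made up for this example, and the use of the `secrets` module for the ID is an assumption about how to obtain unpredictable bits.

```python
import secrets
import struct


def build_query_header(rd=False):
    """Pack the 12 byte DNS header of a standard query with one question."""
    query_id = secrets.randbits(16)  # unpredictable 16 bit ID
    flags = 0                        # QR=0, OPCODE=0, AA=TC=RA=0, Z=0, RCODE=0
    if rd:
        flags |= 0x0100              # RD: bit 7 when counting from the left
    # ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT, all in network byte order
    return struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
```

Everything except the ID is fixed for a plain query, which is why matching a response depends so heavily on the ID and the source port.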
|
||||
|
||||
Here is the actual question:
|
||||
|
||||
*******************************************************
*                                  1  1  1  1  1  1    *
*    0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5    *
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  *
*  |           3           |           w           |  *
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  *
*  |           w           |           w           |  *
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  *
*  |           4           |           i           |  *
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  *
*  |           e           |           t           |  *
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  *
*  |           f           |           3           |  *
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  *
*  |           o           |           r           |  *
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  *
*  |           g           |           0           |  *
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  *
*  |           0           |          28           |  *
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  *
*  |           0           |           1           |  *
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  *
*******************************************************
|
||||
|
||||
This consists of the name 'www.ietf.org' encoded in DNS wire format (for which
|
||||
see below), followed by a 16 bit type field. For AAAA, which denotes the
|
||||
IPv6 address, this is 28. This is then followed by the 'class' of the
|
||||
question. It was originally intended that DNS records would exist in
|
||||
different 'classes', but the semantics of this were not specified completely
|
||||
and it was not really implemented. For now, always set class to 1.
|
||||
|
||||
Of specific note is the somewhat unusual way the name 'www.ietf.org' is
|
||||
serialized in DNS. 'www.ietf.org' consists of 3 'labels' of lengths 3, 4
|
||||
and 3 respectively. In DNS messages, this is encoded as the value 3, then
|
||||
www, then the value 4, then ietf, then 3 followed by org. Then there is a
|
||||
trailing 0 which denotes this is the end.
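
The label encoding can be illustrated with a short Python sketch. `encode_name` is a hypothetical helper; it assumes a plain ASCII name without escapes, so it only covers the simple case of converting a typed-in name once, not general handling of names with dots or spaces inside labels.

```python
def encode_name(name):
    """Encode a dotted name such as 'www.ietf.org' into DNS wire format."""
    if name in ("", "."):
        return b"\x00"                    # the root name is a lone zero octet
    wire = bytearray()
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError("label must be 1 to 63 octets long")
        wire.append(len(raw))             # length octet
        wire += raw                       # label contents
    wire.append(0)                        # trailing 0 ends the name
    return bytes(wire)
```

For 'www.ietf.org' this yields the 14 octets shown in the diagram above: 3, 'www', 4, 'ietf', 3, 'org', 0.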
|
||||
|
||||
This format is unusual, but has several highly attractive properties. For
|
||||
example, it is binary safe and it needs no escaping. When writing DNS
|
||||
software, it may be tempting to pass DNS names around as "ASCII". This then
|
||||
leads to escaping and unescaping code in lots of places. It is highly
|
||||
recommended to use the native DNS encoding to store DNS names. This will
|
||||
save a lot of pain when processing DNS names with spaces or dots in them.
|
||||
|
||||
Finally, DNS queries are
|
||||
[case-insensitive](https://tools.ietf.org/html/rfc4343). This however is
|
||||
defined rather mechanically. Operators do not need to know that in some
|
||||
ASCII encodings a Ü is equivalent to ü when compared case insensitively.
|
||||
For DNS purposes, the 0x20 bit is ignored when comparing octets
within a-z and A-Z.
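
A mechanical comparison along these lines might look as follows in Python. The names `dns_lower` and `dns_equal` are invented for this sketch, which operates on names in wire format.

```python
def dns_lower(octets):
    """Fold A-Z to a-z by setting the 0x20 bit; leave every other octet alone."""
    return bytes((b | 0x20) if 0x41 <= b <= 0x5A else b for b in octets)


def dns_equal(a, b):
    """Compare two wire-format names the way DNS does."""
    return dns_lower(a) == dns_lower(b)
```

Label length octets are at most 63 (0x3f), below 'A' (0x41), so folding the whole wire-format name does not disturb them.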
|
||||
|
||||
Note that individual labels of a name may only be 63 octets long.
|
||||
|
||||
Next up, a DNS response. Note that this again is a DNS message, and it looks
|
||||
a lot like the original DNS query. Here is the beginning of a response:
|
||||
|
||||
|
||||
*******************************************************
*                                  1  1  1  1  1  1    *
*    0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5    *
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  *
*  |           ID = same random 16 bits            |  *
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  *
*  |QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |  *
*  |1 |     0     | 1| 0| 0| 0|   0    |     0     |  *
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  *
*  |                  QDCOUNT = 1                  |  *
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  *
*  |                  ANCOUNT = 1                  |  *
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  *
*  |                  NSCOUNT = 0                  |  *
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  *
*  |                  ARCOUNT = 0                  |  *
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  *
*  |           3           |           w           |  *
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  *
*  |           w           |           w           |  *
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  *
*  |           4           |           i           |  *
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  *
*  |           e           |           t           |  *
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  *
*  |           f           |           3           |  *
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  *
*  |           o           |           r           |  *
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  *
*  |           g           |           0           |  *
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  *
*  |           0           |      28 (= 0x1c)      |  *
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  *
*  |           0           |           1           |  *
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+  *
*******************************************************
|
||||
|
||||
Note that QR is now set to 1 to denote a response. The 'AA' bit was set
|
||||
because this answer came from a server authoritative for this name.
|
||||
|
||||
In addition, ANCOUNT is now set to '1', indicating a single answer is to be
|
||||
found in the message, immediately after the original question, which has been
|
||||
repeated from the query message.
|
||||
|
||||
To recognize the right response, check that the ID field is the same as the
|
||||
query, make sure the answer arrives on the right source port and that the
|
||||
query name and type match up with the original query. In addition, make sure
|
||||
not to send out more than one equivalent query when still waiting for the
|
||||
response, as doing so opens a security hole.
|
||||
|
||||
After the header and the original question we find the answer:
|
||||
|
||||
*****************************************************************
*                                  1  1  1  1  1  1
*    0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
*  |         0xc0          |         0x0c          |
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
*  |          00           |          28           |
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
*  |          00           |          01           |
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
*  |                  TTL = 3600                   |
*  |                                               |
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
*  |                 RDLENGTH = 16                 |
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
*  |          24           |          00           |
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
*  |          cb           |          00           |
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
*  |          20           |          48           |
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
*  |          00           |          01           |
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
*  |          00           |          00           |
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
*  |          00           |          00           |
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
*  |          68           |          14           |
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
*  |          00           |          55           |
*  +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
*****************************************************************
|
||||
The first two bytes (0xc0 0x0c) look rather mysterious. When DNS was
|
||||
created, 512 octets was considered the maximum size of a UDP datagram and
|
||||
thus the maximum size of a DNS message transported without using the (then
|
||||
slow) TCP protocol.
|
||||
|
||||
In order to squeeze as much information as possible into the 512 bytes, DNS
|
||||
names can (and often MUST) be compressed. The details of this compression
|
||||
are arcane and easy to get wrong, leading to infinite loops or buffer
|
||||
overflows. So tread very carefully. If you remember one thing, make sure
|
||||
that a pointer always has to go to a lower position in the packet. Also
|
||||
beware of signed/unsigned arithmetic.
|
||||
|
||||
In this case, the DNS name of the answer is encoded as '0xc0 0x0c'. The c0
|
||||
part has the two most significant bits set, indicating that the following
|
||||
6+8 bits are a pointer to somewhere earlier in the message. In this case,
|
||||
this points to position 12 (= 0x0c) within the packet, which is immediately
|
||||
after the DNS header. There we find 'www.ietf.org'.
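
To make the pointer rule concrete, here is a hedged Python sketch of reading a possibly compressed name. `read_name` is a hypothetical function, not taken from any existing server. It is stricter than "lower than the pointer itself": every pointer must land before the start of the stretch of labels currently being read, which is one simple way to guarantee termination.

```python
def read_name(message, offset):
    """Read a possibly compressed name starting at offset.

    Returns (labels, next_offset), where next_offset is the position just
    past the name at its original location in the message.
    """
    labels = []
    end = None       # where parsing continues after the name
    limit = offset   # pointers must target a position below this
    while True:
        if offset >= len(message):
            raise ValueError("name runs past the end of the message")
        length = message[offset]
        if (length & 0xC0) == 0xC0:               # two top bits set: pointer
            if offset + 1 >= len(message):
                raise ValueError("truncated compression pointer")
            target = ((length & 0x3F) << 8) | message[offset + 1]
            if target >= limit:
                raise ValueError("pointer does not point backwards")
            if end is None:
                end = offset + 2
            offset = limit = target               # each jump lowers the limit
        elif length & 0xC0:
            raise ValueError("unsupported label type")
        elif length == 0:                         # root label ends the name
            return labels, (offset + 1 if end is None else end)
        else:
            if offset + 1 + length > len(message):
                raise ValueError("label runs past the end of the message")
            labels.append(message[offset + 1:offset + 1 + length])
            offset += 1 + length
```

Since `limit` strictly decreases with every jump, a crafted packet cannot make this loop forever, and all indices are non-negative Python integers, which sidesteps the signed/unsigned pitfalls mentioned above.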
|
||||
|
||||
So what this means is that the answer about the DNS name 'www.ietf.org' is
|
||||
also called 'www.ietf.org'.
|
||||
|
||||
This is then followed in the packet by '28', which denotes AAAA (IPv6), and
|
||||
the usual 'class' of 1. Then a whole 32 bits are devoted to the Time To Live
|
||||
of this record, followed by a 16 bits length field. Since this is an IPv6
|
||||
address, the actual answer payload length is 16 bytes (or 128 bits).
|
||||
|
||||
This is then followed by the binary representation of the current IPv6
|
||||
address of www.ietf.org, 2400:cb00:2048:1::6814:55.
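
As a small illustration, the 16 payload octets from the diagram can be turned back into text with Python's standard `ipaddress` module. The wrapper name `parse_aaaa` is made up for this sketch.

```python
import ipaddress


def parse_aaaa(rdata):
    """Turn the 16 octet payload of an AAAA record into an IPv6Address."""
    if len(rdata) != 16:
        raise ValueError("AAAA payload must be exactly 16 octets")
    return ipaddress.IPv6Address(rdata)


# The payload octets from the answer shown above.
payload = bytes.fromhex("2400cb00204800010000000068140055")
print(parse_aaaa(payload))
```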
|
||||
|
||||
If there had been further answers, these would follow this first one, and
|
||||
the ANCOUNT would have been higher than 1. If there had been data in the
|
||||
'authority' and 'additional' sections, that would follow here too, with
|
||||
the corresponding adjustments to 'NSCOUNT' and 'ARCOUNT' fields. More about
|
||||
these sections later.
|
||||
|
||||
## RRSETs
|
||||
In the example above, the question for the AAAA record of 'www.ietf.org' had
|
||||
exactly one corresponding resource record. In a human readable 'zone file',
|
||||
this would be stored as:
|
||||
|
||||
```
www.ietf.org    3600    IN      AAAA    2400:cb00:2048:1::6814:55
```
|
||||
|
||||
It is however possible to have multiple AAAA records for the same name. Even
|
||||
if there is only one record, the DNS specifications talk about 'Resource
|
||||
Record Sets', or RRSETs. These operate in unity. So even though the encoding
|
||||
in the DNS packet allows different TTL values within a single RRSET, this
|
||||
should never happen.
|
||||
|
||||
## Zone files
|
||||
Zone files are one way of storing DNS data, but these are not integral to
|
||||
the operation of a nameserver. The zone file format is standardized, but it
|
||||
is highly non-trivial to parse. It is entirely possible to write a useful
|
||||
nameserver that does not read or write DNS zone files. When embarking on
|
||||
parsing zonefiles, do not do so lightly. As an example, various fields
|
||||
within a single line can appear in many orders. Most fields are optional,
|
||||
and some will then be copied from the previous line. But not all.
|
||||
|
||||
Of specific note, many people have attempted to write a grammar (parser) for
|
||||
zonefiles and it is almost impossible.
|
||||
|
||||
## DNS Names
|
||||
The concept of a DNS name is non-trivial and frequently misunderstood.
|
||||
Despite writing 'www.ietf.org' from left to right, within DNS it is fairer
|
||||
to describe it as 'org' below the root node, with below the 'org' node a
|
||||
node called 'ietf'. Finally to the 'ietf' node is attached a node called
|
||||
'www'.
|
||||
|
||||
|
||||
Or in graphical form:
|
||||
|
||||
***************************
*        +-----+
*        |     |
*        +--+--+
*           |
*        +--+--+
*        | ORG |
*        +--+--+
*           |
*        +--+---+
*        | IETF |
*        +--+---+
*           |
*        +--+--+
*        | WWW |
*        +-----+
***************************
|
||||
|
||||
The 'tree' of nodes as shown above is real and not just another way of
|
||||
visualizing a DNS name. This for example means that if there is a name
|
||||
called 'www.fr.ietf.org' and a query comes in for 'fr.ietf.org', that name
|
||||
exists - even though no records may be assigned to it.
|
||||
|
||||
NOTE: This means that any implementation that sees DNS as a simple
|
||||
'key/value' store, where only records that exist can match, is headed for
|
||||
trouble down the line.
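
A minimal sketch of such a tree in Python, as opposed to a flat key/value store, could look like this. The `Node` class and helper names are hypothetical, and text labels are used instead of wire-format names purely to keep the example short.

```python
class Node:
    """One node in the DNS tree; it may carry RRSETs, children, or both."""

    def __init__(self):
        self.children = {}  # label -> Node
        self.rrsets = {}    # type name -> list of record data


def _labels_from_root(name):
    # 'www.ietf.org' is walked as org, ietf, www
    return [l for l in reversed(name.lower().rstrip(".").split(".")) if l]


def add_record(root, name, rrtype, rdata):
    node = root
    for label in _labels_from_root(name):
        node = node.children.setdefault(label, Node())
    node.rrsets.setdefault(rrtype, []).append(rdata)


def find_node(root, name):
    """Return the node for name, or None if the name does not exist."""
    node = root
    for label in _labels_from_root(name):
        node = node.children.get(label)
        if node is None:
            return None
    return node
```

After adding a record for 'www.fr.ietf.org', `find_node(root, "fr.ietf.org")` returns a node with no RRSETs: the name exists even though nothing is attached to it, which a key/value lookup would miss.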
|
||||
|
||||
## Zones
|
||||
As noted, DNS is more complicated than a simple key/value store. This is
|
||||
not only because of the tree style nature of names but also because the same
|
||||
data can live in multiple places, but always lives in a 'zone'.
|
||||
|
||||
Various DNS implementations over time have found out that you can mostly
|
||||
ignore the concept of 'zone' for simple nameservers or load balancers, but
|
||||
not implementing zones correctly will eventually trip you up.
|
||||
|
||||
To make life confusing, 'www.ietf.org' can be defined in four different
|
||||
places. It could be in the 'root' zone itself, fully written out:
|
||||
|
||||
```
www.ietf.org    3600    IN      AAAA    2400:cb00:2048:1::6814:55
```
|
||||
Or it could be in the org zone, where it might look like this:
|
||||
|
||||
```
$origin ORG
www.ietf        3600    IN      AAAA    2400:cb00:2048:1::6814:55
```
|
||||
|
||||
Or, (as is actually the case), this name could live in the 'ietf.org' zone:
|
||||
|
||||
```
$origin ietf.org
www     3600    IN      AAAA    2400:cb00:2048:1::6814:55
```
|
||||
|
||||
And finally, it is even possible that there is a zone called 'www.ietf.org',
|
||||
where the record lives like this:
|
||||
|
||||
```
$origin www.ietf.org
@       3600    IN      AAAA    2400:cb00:2048:1::6814:55
```
|
||||
|
||||
|
||||
### Start of Authority
|
||||
A zone always starts with a SOA or Start Of Authority record. A SOA record
|
||||
is DNS metadata. It stores various things that may be of interest about a
|
||||
zone, like the email address of the maintainer and the name of the most
|
||||
authoritative server. It also has values that describe how or if a zone
|
||||
needs to be replicated. Finally, the SOA record has a number that
|
||||
influences TTL values for names that do not exist.
|
||||
|
||||
There is only one SOA that is guaranteed to exist on the internet and that
|
||||
is the one for the root zone (called '.'). As of 2018, it looks like this:
|
||||
|
||||
```
. 86400 IN SOA a.root-servers.net. nstld.verisign-grs.com. 2018032802 1800 900 604800 86400
```
|
||||
|
||||
For details of what all these fields mean, please see the [authoritative
|
||||
server document](auth.md.html).
|
||||
|
||||
The final number however is important here. 86400 denotes that if a
|
||||
response says a name or RRSET does not exist, it will continue to not exist
|
||||
for the next day, and that this knowledge may be cached.
|
||||
|
||||
### Zone cuts
|
||||
As noted, 'www.ietf.org' can live in four places. If it lives where it
|
||||
currently does, in the 'ietf.org' zone, it passes through two zone cuts:
|
||||
From . to org, from org to ietf.org.
|
||||
|
||||
When an authoritative server receives a query for 'www.ietf.org', it
|
||||
consults which zones it knows about and answers from the most specific zone
|
||||
it has available.
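
Picking the most specific zone amounts to stripping labels from the left until a known zone is found. A hedged Python sketch, where `best_zone` is an invented name and zones are assumed to be stored as lowercase text without a trailing dot (the root as '.'):

```python
def best_zone(zones, qname):
    """Return the most specific zone in zones that contains qname, or None."""
    labels = [l for l in qname.lower().rstrip(".").split(".") if l]
    for i in range(len(labels) + 1):
        # www.ietf.org, then ietf.org, then org, then the root
        candidate = ".".join(labels[i:]) or "."
        if candidate in zones:
            return candidate
    return None
```

A root-server in this model only has '.' in its set, so every query lands in the root zone; a server that hosts both 'org' and 'ietf.org' answers 'www.ietf.org' from 'ietf.org'.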
|
||||
|
||||
For a root-server, which only knows about the root zone, this means
|
||||
consulting the '.' zone. As noted, 'www.ietf.org' is actually a tree, 'org'
|
||||
-> 'ietf' -> 'www'. And as luck will have it, the first node 'org' is
|
||||
present in the root zone.
|
||||
|
||||
Attached to that node is an NS RRSET, which has the names of nameservers
|
||||
that host the ORG zone.
|
||||
|
||||
If we ask these servers about 'www.ietf.org', they too find the best zone to
|
||||
answer from, which in this case is 'org'. Within the 'org' zone they then
|
||||
find the 'ietf' node, which again contains an NS RRSET.
|
||||
|
||||
When we ask the servers named in that RRSET about 'www.ietf.org', they find
|
||||
a node called 'www' with several RRSETs on it, one of which is for AAAA and
|
||||
contains the IPv6 address we were looking for.
|
||||
|
||||
Any authoritative server which does not implement 'zones' in this way will
|
||||
eventually run into trouble. It is not enough to consult a list of known
|
||||
names and answer records attached to those names.
|
||||
|
||||
### NS Records
|
||||
These are a mandatory part of a zone, at the 'apex'. The 'apex' is the name
|
||||
of the zone, at which point there is also a SOA record. So a typical zone
|
||||
will start like this:
|
||||
|
||||
```
$ORIGIN ietf.org.
@       IN      SOA     ns1 admin 2018032802 1800 900 604800 86400
        IN      NS      ns1
        IN      NS      ns2
```
|
||||
|
||||
Note how in this zone file example names not ending on a '.' are interpreted
|
||||
as being part of ietf.org. The '@' is a way to specify the name of the
|
||||
apex. The two NS lines omit a name, so they default to '@' too.
|
||||
|
||||
This zone lists ns1.ietf.org and ns2.ietf.org as its nameservers.
|
||||
Being part of the zone, this data is *authoritative*. Any queries sent to
|
||||
this nameserver for the NS RRSET of 'ietf.org' will receive responses
|
||||
with the AA bit set.
|
||||
|
||||
Note however that above we learned that the parent zone, 'org' also needs to
|
||||
list the nameservers for ietf.org, and it does:
|
||||
|
||||
```
$ORIGIN org.
...
ietf    IN      NS      ns1.ietf
ietf    IN      NS      ns2.ietf
```
|
||||
|
||||
If we ask the 'org' nameservers for the NS RRSET of 'ietf.org', we receive a
|
||||
response with AA=0, indicating that the 'org' servers know they aren't
|
||||
'authoritative' for ietf.org.
|
||||
|
||||
### Glue records
|
||||
The astute reader will have spotted a chicken and egg problem here. If
|
||||
ns1.ietf.org is the nameserver for ietf.org., where do we get the IP
|
||||
address of ns1.ietf.org?
|
||||
|
||||
To solve this problem, the parent zone can provide a free chicken. In the
|
||||
org zone, we would actually find:
|
||||
|
||||
```
$ORIGIN org.
...
ietf            IN      NS      ns1.ietf
ietf            IN      NS      ns2.ietf
ns1.ietf        IN      A       192.0.2.1
ns2.ietf        IN      A       198.51.100.1
```
|
||||
|
||||
These entries are mirrored in the 'ietf.org' zone hosted on ns1.ietf.org and
|
||||
ns2.ietf.org. And as with the NS records, any queries for ns1.ietf.org sent
|
||||
to the org servers receive AA=0 answers, whereas ns1.ietf.org itself answers
|
||||
with AA=1.
|
||||
|
||||
Note that for various reasons the AA=0 answer from the parent zone may be
|
||||
different than the AA=1 answer, and resolvers must be aware of the
|
||||
difference.
|
||||
|
||||
|
||||
# Further aspects
|
||||
|
||||
The description up to this point is correct, but far from functionally
|
||||
complete even for basic DNS. The following sections describe additional
|
||||
aspects of basic DNS:
|
||||
|
||||
## CNAME
|
||||
A CNAME provides the 'Canonical Name' for another DNS name. For example:
|
||||
|
||||
```
www     IN      CNAME   www.ietf.org.cdn.cloudflare.net.
```
|
||||
|
||||
This is frequently used to redirect to a Content Distribution Network. The
|
||||
CNAME is for a name, and not for a type. This means that *any* query for
|
||||
www.ietf.org is sent to cloudflare. This simultaneously means that what
|
||||
everyone wants is impossible:
|
||||
|
||||
```
$ORIGIN ietf.org
@       IN      CNAME   this.does.not.work.int.
```
|
||||
|
||||
This collides with the SOA and NS records, which are then also redirected
|
||||
and not found. Frequently doing this 'apex CNAME' appears to work, but it
|
||||
really doesn't.
|
||||
|
||||
In hindsight, the CNAME should have been 'typed' to apply only to specific
|
||||
query types.
|
||||
|
||||
When a server encounters a CNAME at the name it was looking for,
|
||||
it will 'follow' the chain to where it points. And please be aware that this
|
||||
can loop.
|
||||
|
||||
## Wildcards
|
||||
Wildcards allow for the following:
|
||||
|
||||
```
$ORIGIN ietf.org.
*       IN      A       192.0.2.1
        IN      AAAA    2001:db8:85a3::8a2e:0370:7334
smtp    IN      A       192.0.2.222
```
|
||||
|
||||
A query for the A record of 'smtp.ietf.org' will return 192.0.2.222. A query
|
||||
for 'www.ietf.org' however will return 192.0.2.1.
|
||||
|
||||
Interestingly, as another example of how DNS really is a tree, a query for
|
||||
the AAAA record of smtp.ietf.org will return.. nothing. This is because
|
||||
the node 'smtp.ietf.org' does exist, and processing ends there. The
|
||||
wildcard match will not proceed to the '*' entry.
|
||||
|
||||
Wildcards synthesize new answers. This means that, unless explicitly
|
||||
queried, no '*.ietf.org' record will be served. Instead, a 'www.ietf.org'
|
||||
record is created on the fly.
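
The lookup order described above can be sketched in Python. This is a deliberately simplified model: it assumes a flat zone in which every name sits exactly one label below the apex, and both `ZONE` and `lookup` are names invented for the example.

```python
# The example zone from above, keyed by the label below 'ietf.org'.
ZONE = {
    "*":    {"A": ["192.0.2.1"], "AAAA": ["2001:db8:85a3::8a2e:370:7334"]},
    "smtp": {"A": ["192.0.2.222"]},
}


def lookup(zone, label, qtype):
    """Return the record data for label and qtype, synthesizing from '*'."""
    node = zone.get(label)
    if node is None:
        # Only a name that does not exist at all falls through to '*'.
        node = zone.get("*", {})
    # If the node exists, processing ends here, even when the type is absent.
    return list(node.get(qtype, []))
```

With this data, `lookup(ZONE, "smtp", "AAAA")` yields an empty list rather than the wildcard's AAAA record, matching the behaviour described in the text.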
|
||||
|
||||
## Truncation
|
||||
Without implementing the optional EDNS protocol extension, all UDP responses
|
||||
must fit in 512 bytes of payload. If on writing an answer a server finds
|
||||
itself exceeding this limit, it must truncate the packet and set the TC bit
|
||||
to 1.
|
||||
|
||||
The originator of the query will then resend the query over TCP.
|
||||
|
||||
Sometimes DNS responses contain optional data that could be left out, and
|
||||
this could be done to stay under the 512 byte limit.
|
||||
|
||||
It is recommended however to keep it simple and send an empty response
|
||||
packet with TC=1 whenever the byte limit is reached.
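
A sketch of such an empty TC=1 response in Python follows. `truncated_response` is a hypothetical helper; it assumes the query consists of the 12 byte header followed only by the question (basic DNS, no EDNS), so the question can be copied over verbatim.

```python
import struct


def truncated_response(query):
    """Build an empty response with TC=1 for the given query message."""
    if len(query) < 12:
        raise ValueError("message shorter than a DNS header")
    query_id, flags, qdcount = struct.unpack("!HHH", query[:6])
    # Keep OPCODE and RD from the query, then set QR (response) and TC.
    flags = (flags & 0x7900) | 0x8000 | 0x0200
    header = struct.pack("!HHHHHH", query_id, flags, qdcount, 0, 0, 0)
    return header + query[12:]  # repeat the question, add no records
```

The client sees TC=1 with no usable answer and retries the same question over TCP.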
|
||||
|
||||
## Names and nodes that do not exist
|
||||
DNS queries can fail to match in two ways: the whole node does not exist,
|
||||
or, the requested type is not present at that node.
|
||||
|
||||
As an example of the first case, 'doesnotexist.ietf.org' really does not
|
||||
exist, which leads to a response with RCODE NXDOMAIN and no answer records.
|
||||
|
||||
As an example of the second case, 'www.ietf.org' does exist, but has no MX
|
||||
record. The RCODE is normal, but there are no answer records.
|
||||
|
||||
Empty answers however are hard to cache. To alleviate this situation, in
|
||||
these cases the authoritative server sends a copy of the SOA record in the
|
||||
Authority section of the response. The TTL of that record tells us how long
|
||||
the knowledge of 'no such name' or 'no such data' can be cached.
|
||||
|
||||
## Query types that are not RRSET types
|
||||
In addition to the resource record types covered above, like A, AAAA, NS and
|
||||
SOA, three additional types exist that can only be used in queries: ANY, AXFR
|
||||
and IXFR.
|
||||
|
||||
An ANY query instructs a nameserver to return all types it immediately has
|
||||
available for a name. This 'immediately' qualification makes ANY queries
|
||||
unsuitable for talking to resolvers - there is no guarantee the response is in any
|
||||
way complete.
|
||||
|
||||
Because of the potential of creating huge answers, the use of ANY is
|
||||
problematic even when talking to authoritative servers, and it may no longer
|
||||
work well in the future.
|
||||
|
||||
AXFR and IXFR are requests for (incremental) zone transfers, almost always
|
||||
over TCP. This query asks an authoritative server to list an entire zone.
|
||||
Resolvers do not process AXFR or IXFR queries.
|
||||
|
||||
# That's it for basic DNS!
|
||||
This is the core of DNS. There are quite some parts that have not been
|
||||
discussed, but based on the explanations above, it is possible to write a
|
||||
compliant authoritative server.
|
||||
|
||||
## Further reading
|
||||
|
||||
### RFC 1034 / 1035
|
||||
|
||||
These ([1034](https://tools.ietf.org/html/rfc1034) &
|
||||
[1035](https://tools.ietf.org/html/rfc1035)) describe the core of DNS in
|
||||
1987 language. When reading, disregard mentions of IQUERY and experimental
|
||||
records. They did not survive. Also realize that in this world,
|
||||
authoritative and resolver service were described as a single function. We
|
||||
now know this to be confusing.
|
||||
|
||||
### RFC 2181: "Clarifications to the DNS Specification"
|
||||
From 1997, [2181](https://tools.ietf.org/html/rfc2181) performs a decade of
|
||||
cleanup work on 1034/1035. It also talks about an early version of DNSSEC
|
||||
(NXT, SIG, KEY records); skip these sections, as that material is
|
||||
unrelated to current DNSSEC (aka DNSSEC-bis).
|
||||
|
||||
Of specific note, section 5.4.1 describes very exact ordering rules for which data a
|
||||
server is supposed to prefer. This list becomes a lot simpler when split up
|
||||
between pure authoritative and pure resolver functions.
|
||||
|
||||
### RFC 2308: "Negative caching of DNS Queries (NCACHE)"
|
||||
This [rfc](https://tools.ietf.org/html/rfc2308) describes how negative
|
||||
responses are to be cached. The details matter for both authoritative servers and
|
||||
resolvers. Of specific note are the parts that dwell on CNAME chains which
|
||||
lead to a 'no data' or 'NXDOMAIN' situation.
|
||||
|
||||
As with 2181, this RFC speaks about an earlier version of DNSSEC, and these
|
||||
parts should be fully ignored.
|
||||
|
||||
### RFC 3596: "DNS Extensions to Support IP Version 6"
|
||||
This [rfc](https://tools.ietf.org/html/rfc3596) describes the AAAA record,
|
||||
which is core to DNS as it is required to look up addresses of nameservers.
|
||||
|
||||
### RFC 4343: "Domain Name System Case Insensitivity Clarification"
|
||||
[4343](https://tools.ietf.org/html/rfc4343) clarifies the somewhat odd
|
||||
case insensitivity of DNS but also writes out the escaping rules when using
|
||||
non-ASCII or whitespace in DNS names. As noted before, try not to have to
|
||||
use these rules except when reading DNS data from text files or showing DNS
|
||||
data meant for human consumption. Use native DNS names as much as possible,
|
||||
and create 4343-compliant comparison and equivalence functions.
|
||||
|
||||
### RFC 5452: "Measures for Making DNS More Resilient against Forged Answers"
|
||||
This [RFC](https://tools.ietf.org/html/rfc5452) makes source port
|
||||
randomization mandatory for UDP-based DNS messages and also has rules on
|
||||
preventing "birthday attacks".
|
||||
|
||||
### RFC 6604: "xNAME RCODE Clarification"
|
||||
[6604](https://tools.ietf.org/html/rfc6604) further describes the meanings
|
||||
of header bits (AA) and RCODEs when following CNAME chains. It also discusses
an earlier version of DNAME; these parts are best skipped in favor of
|
||||
(later) reading the newer DNAME specification.
|
||||
Next up: [DNS Basics](basic.md.html).
|
||||
|
||||
<!-- Markdeep: --><style class="fallback">body{visibility:hidden;white-space:pre;font-family:monospace}</style><script src="ext/markdeep.min.js"></script><script>window.alreadyProcessedMarkdeep||(document.body.style.visibility="visible")</script>
|
||||
|
4
auth.md
@ -1,6 +1,10 @@
|
||||
<meta charset="utf-8" emacsmode="-*- markdown -*-">
|
||||
**A warm welcome to DNS**
|
||||
|
||||
<link rel="stylesheet" href="https://casual-effects.com/markdeep/latest/apidoc.css?">
|
||||
|
||||
Note: this page is part of the
|
||||
'[hello-dns](https://powerdns.org/hello-dns/)' documentation effort.
|
||||
|
||||
# Authoritative servers
|
||||
|
||||
|
662
basic.md
Normal file
@ -0,0 +1,662 @@
|
||||
<meta charset="utf-8" emacsmode="-*- markdown -*-">
|
||||
**A warm welcome to DNS**
|
||||
<link rel="stylesheet" href="https://casual-effects.com/markdeep/latest/apidoc.css?">
|
||||
|
||||
Note: this page is part of the
|
||||
'[hello-dns](https://powerdns.org/hello-dns/)' documentation effort.
|
||||
|
||||
# DNS Basics
|
||||
In this section we will initially ignore optional extensions that were added
|
||||
to DNS later, specifically EDNS and DNSSEC.
|
||||
|
||||
This file corresponds roughly to the fundamental parts of RFCs 1034, 1035,
|
||||
2181, 2308, 3596, 4343, 5452, 6604 and 7766.
|
||||
|
||||
This file, which describes DNS basics, absolutely must be read from
|
||||
beginning to end in order for the rest of the documents (or DNS) to make
|
||||
sense.
|
||||
|
||||
DNS is mostly used to serve IP addresses and mailserver details, but it can
|
||||
contain arbitrary data. DNS is all about names. Every name can have data
|
||||
of several *types*. The most well known externally useful types are *A* for
|
||||
IPv4 addresses, *AAAA* for IPv6 addresses and *MX* for mailserver details.
|
||||
DNS also has types that have meaning for its own use, like *NS*, *CNAME* and
|
||||
*SOA*.
|
||||
|
||||
When we ask a DNS question we call this a *query*. We call the reply the
|
||||
*response*. These queries and responses are contained in DNS messages. When
|
||||
UDP is used, the message is also the packet. Note that [TCP support is
|
||||
mandatory](https://tools.ietf.org/html/rfc7766.txt) for DNS in 2018.
|
||||
|
||||
A DNS message has:
|
||||
|
||||
* A header
|
||||
* A query name and query type
|
||||
* An answer section
|
||||
* An authority section
|
||||
* An additional section
|
||||
|
||||
In basic DNS, query messages should have no answer, authority or additional
|
||||
sections.
|
||||
|
||||
The header has the following fields that are useful for queries and
|
||||
responses:
|
||||
|
||||
* ID: a 16 bit identifier used as part of the process of matching queries to responses
|
||||
* QR: Set to 0 to identify a message as a query, 1 for a response
|
||||
* OPCODE: 0 for a standard query, other opcodes also exist
|
||||
* RD: Set to indicate that this question wants *recursion*
|
||||
|
||||
Relevant for responses:
|
||||
* AA: This response has Authoritative Answers
|
||||
* RA: Recursive service was available
|
||||
* TC: Not all the required parts of the response fit in the UDP message
|
||||
* RCODE: Result code. 0 is ok, 2 is SERVFAIL, 3 is NXDOMAIN.
|
||||
|
||||
DNS queries are mostly sent over UDP, and UDP packets can easily be spoofed.
|
||||
To recognize the authentic response to a query it is important that the ID
|
||||
field is random or at least unpredictable. This is however not enough
|
||||
protection, so the source port of a UDP DNS query must also be
|
||||
unpredictable.
|
||||
|
||||
DNS messages can also be sent over TCP/IP. Because TCP is not a datagram
|
||||
oriented protocol, each DNS message in TCP/IP is preceded by a 16 bit
|
||||
network endian length field.
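
As an illustration of this framing, here is a minimal Python sketch using only the standard library. The helper names `frame_for_tcp`, `read_exact` and `read_framed` are made up for this example, and the "connection" is simulated with an in-memory byte stream.

```python
import io
import struct

def frame_for_tcp(message: bytes) -> bytes:
    """Prefix a DNS message with its length as a 16 bit big endian integer."""
    if len(message) > 0xFFFF:
        raise ValueError("DNS message too large for TCP framing")
    return struct.pack("!H", len(message)) + message

def read_exact(stream, count: int) -> bytes:
    """Read exactly count bytes; a single read may return fewer."""
    data = b""
    while len(data) < count:
        chunk = stream.read(count - len(data))
        if not chunk:
            raise EOFError("stream closed in the middle of a DNS message")
        data += chunk
    return data

def read_framed(stream) -> bytes:
    """Read one length-prefixed DNS message from a file-like byte stream."""
    (length,) = struct.unpack("!H", read_exact(stream, 2))
    return read_exact(stream, length)

# Two messages back to back on one connection, simulated with BytesIO.
wire = io.BytesIO(frame_for_tcp(b"first") + frame_for_tcp(b"second!"))
print(read_framed(wire), read_framed(wire))
```

With a real socket, `sock.makefile("rb")` would provide a comparable file-like object to read from.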
|
||||
|
||||
DNS servers must listen on both UDP and TCP, port 53.
|
||||
|
||||
The header of a question for the IPv6 address of www.ietf.org looks like
|
||||
this:
|
||||
|
||||
|
||||
***************************************************************
*                                 1  1  1  1  1  1
*   0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
* |              ID = random 16 bits              |
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
* |QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
* |0 |     0     |0 | 0| 0|0 |   0    |     0     |
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
* |                  QDCOUNT = 1                  |
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
* |                  ANCOUNT = 0                  |
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
* |                  NSCOUNT = 0                  |
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
* |                  ARCOUNT = 0                  |
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
*
***************************************************************
|
||||
|
||||
|
||||
Note that we did not spend time on field Z because it is defined to
|
||||
be 0 at all times. This packet does not request recursion. QDCOUNT = 1
|
||||
means there is 1 question. In theory DNS supported several questions in one
|
||||
message, but this has not been implemented. ANCOUNT, NSCOUNT and ARCOUNT
|
||||
are all zero, indicating there are no answers in this question packet.
|
||||
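
To make the header layout concrete, the following Python sketch packs the twelve header bytes of such a query with the standard `struct` module. The function name `build_query_header` and its parameter are invented for this example; the bit values follow the field order shown in the diagram.

```python
import secrets
import struct

def build_query_header(recursion_desired: bool = False) -> bytes:
    """Pack the 12-byte header of a standard query that carries one question."""
    message_id = secrets.randbits(16)   # unpredictable 16 bit ID
    flags = 0                           # QR=0, OPCODE=0, AA=TC=RD=RA=0, Z=0, RCODE=0
    if recursion_desired:
        flags |= 0x0100                 # RD bit
    # ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0 in network byte order
    return struct.pack("!HHHHHH", message_id, flags, 1, 0, 0, 0)

header = build_query_header()
print(len(header), header[2:].hex())
```

Only the ID differs from run to run; the remaining ten bytes are fixed for a query without recursion.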
|
||||
Here is the actual question:
|
||||
|
||||
*****************************************************
*                                 1  1  1  1  1  1  *
*   0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5  *
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ *
* |           3           |           w           | *
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ *
* |           w           |           w           | *
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ *
* |           4           |           i           | *
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ *
* |           e           |           t           | *
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ *
* |           f           |           3           | *
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ *
* |           o           |           r           | *
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ *
* |           g           |           0           | *
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ *
* |           0           |          28           | *
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ *
* |           0           |           1           | *
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ *
*****************************************************
|
||||
|
||||
This consists of the name 'www.ietf.org' encoded in DNS wire format (for which
|
||||
see below), followed by a 16 bit type field. For AAAA, which denotes the
|
||||
IPv6 address, this is 28. This is then followed by the 'class' of the
|
||||
question. It was originally intended that DNS records would exist in
|
||||
different 'classes', but the semantics of this were not specified completely
|
||||
and it was not really implemented. For now, always set class to 1.
|
||||
|
||||
The query name, type and class are also called 'qname', 'qtype' and 'qclass'
|
||||
respectively.
|
||||
|
||||
Of specific note is the somewhat unusual way the name 'www.ietf.org' is
|
||||
serialized in DNS. 'www.ietf.org' consists of 3 'labels' of lengths 3, 4
|
||||
and 3 respectively. In DNS messages, this is encoded as the value 3, then
|
||||
www, then the value 4, then ietf, then 3 followed by org. Then there is a
|
||||
trailing 0 which denotes this is the end.
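
As a sketch of this serialization, the Python below turns a dotted ASCII name into wire format and appends the type and class to form a complete question. The helpers `encode_name` and `encode_question` are hypothetical names chosen for this example. Note that a text-based helper like this cannot represent labels that themselves contain dots, which is one reason to keep names in wire format internally.

```python
import struct

def encode_name(name: str) -> bytes:
    """Encode a dotted ASCII name into DNS wire format (no compression)."""
    if name in ("", "."):
        return b"\x00"                      # the root name is a lone zero octet
    wire = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:         # DNS limits a label to 63 octets
            raise ValueError("labels must be 1 to 63 octets long")
        wire += bytes([len(raw)]) + raw     # length octet, then the label itself
    return wire + b"\x00"                   # trailing 0 ends the name

def encode_question(qname: str, qtype: int, qclass: int = 1) -> bytes:
    """Name, then 16 bit type and 16 bit class, both big endian."""
    return encode_name(qname) + struct.pack("!HH", qtype, qclass)

print(encode_question("www.ietf.org", 28).hex())
```

The printed bytes correspond one to one with the cells of the question diagram: 3, w, w, w, 4, i, e, t, f, 3, o, r, g, 0, then type 28 and class 1.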
|
||||
|
||||
This format is unusual, but has several highly attractive properties. For
|
||||
example, it is binary safe and it needs no escaping. When writing DNS
|
||||
software, it may be tempting to pass DNS names around as "ASCII". This then
|
||||
leads to escaping and unescaping code in lots of places. It is highly
|
||||
recommended to use the native DNS encoding to store DNS names. This will
|
||||
save a lot of pain when processing DNS names with spaces or dots in them.
|
||||
|
||||
Finally, DNS queries are
|
||||
[case-insensitive](https://tools.ietf.org/html/rfc4343). This however is
|
||||
defined rather mechanically. Operators do not need to know that in some
|
||||
ASCII encodings a Ü is equivalent to ü when compared case insensitively.
|
||||
For DNS purposes, the 0x20 bit is ignored when comparing octets
|
||||
within a-z and A-Z.
|
||||
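
A comparison function along these lines could look as follows in Python. This is only a sketch: the names `fold_octet`, `labels_equal` and `names_equal` are invented here, and a name is assumed to be held as a list of raw label byte strings.

```python
def fold_octet(octet: int) -> int:
    """Map A-Z (0x41-0x5A) to a-z by setting the 0x20 bit; leave the rest alone."""
    if 0x41 <= octet <= 0x5A:
        return octet | 0x20
    return octet

def labels_equal(a: bytes, b: bytes) -> bool:
    """Compare two labels octet by octet, ignoring ASCII case only."""
    return len(a) == len(b) and all(
        fold_octet(x) == fold_octet(y) for x, y in zip(a, b)
    )

def names_equal(a, b) -> bool:
    """Compare two names given as lists of raw labels."""
    return len(a) == len(b) and all(labels_equal(x, y) for x, y in zip(a, b))

print(names_equal([b"WWW", b"IETF", b"org"], [b"www", b"ietf", b"ORG"]))
# Octets outside A-Z are compared exactly, even if they differ only in bit 0x20.
print(labels_equal(b"[", b"{"))
```

Restricting the fold to the A-Z range is what keeps this mechanical: no knowledge of character sets or locales is involved.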
|
||||
Note that individual labels of a name may only be 63 octets long.
|
||||
|
||||
Next up, a DNS response. Note that this again is a DNS message, and it looks
|
||||
a lot like the original DNS query. Here is the beginning of a response:
|
||||
|
||||
|
||||
*****************************************************
*                                 1  1  1  1  1  1  *
*   0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5  *
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ *
* |           ID = same random 16 bits            | *
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ *
* |QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   | *
* |1 |     0     | 1| 0| 0| 0|   0    |     0     | *
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ *
* |                  QDCOUNT = 1                  | *
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ *
* |                  ANCOUNT = 1                  | *
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ *
* |                  NSCOUNT = 0                  | *
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ *
* |                  ARCOUNT = 0                  | *
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ *
* |           3           |           w           | *
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ *
* |           w           |           w           | *
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ *
* |           4           |           i           | *
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ *
* |           e           |           t           | *
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ *
* |           f           |           3           | *
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ *
* |           o           |           r           | *
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ *
* |           g           |           0           | *
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ *
* |           0           |      28 (= 0x1c)      | *
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ *
* |           0           |           1           | *
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ *
*****************************************************
|
||||
|
||||
Note that QR is now set to 1 to denote a response. The 'AA' bit was set
|
||||
because this answer came from a server authoritative for this name.
|
||||
|
||||
In addition, ANCOUNT is now set to '1', indicating a single answer is to be
|
||||
found in the message, immediately after the original question, which has been
|
||||
repeated from the query message.
|
||||
|
||||
To recognize the right response, check that the ID field is the same as the
|
||||
query, make sure the answer arrives on the right source port and that the
|
||||
query name and type match up with the original query. In addition, make sure
|
||||
not to send out more than one equivalent query when still waiting for the
|
||||
response, as doing so opens a security hole.
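
The message-level part of these checks might be sketched in Python as below. The function names `question_end` and `matches_query` are hypothetical, and the sketch deliberately covers only ID, QR bit and question; checking the source address and port, and avoiding duplicate outstanding queries, has to happen in the surrounding socket code.

```python
import struct

def question_end(message: bytes) -> int:
    """Return the offset just past the first question of a message.

    Assumes the question name is stored without compression."""
    pos = 12                                   # the question follows the header
    while message[pos] != 0:                   # IndexError if the message is cut short
        if message[pos] > 63:
            raise ValueError("bad label length in question")
        pos += 1 + message[pos]
    end = pos + 1 + 4                          # root label, then qtype and qclass
    if end > len(message):
        raise ValueError("question is truncated")
    return end

def matches_query(query: bytes, response: bytes) -> bool:
    """Check that response carries the ID and question of query, with QR=1."""
    if len(response) < 12:
        return False
    query_id, _ = struct.unpack("!HH", query[:4])
    response_id, response_flags = struct.unpack("!HH", response[:4])
    if response_id != query_id or not (response_flags & 0x8000):
        return False                           # wrong ID, or QR=0 (not a response)
    try:
        asked = query[12:question_end(query)]
        echoed = response[12:question_end(response)]
    except (IndexError, ValueError):
        return False
    # Names compare case-insensitively (ASCII only); qtype and qclass exactly.
    return asked[:-4].lower() == echoed[:-4].lower() and asked[-4:] == echoed[-4:]

question = b"\x03www\x04ietf\x03org\x00" + struct.pack("!HH", 28, 1)
query = struct.pack("!HHHHHH", 0x1234, 0x0000, 1, 0, 0, 0) + question
response = struct.pack("!HHHHHH", 0x1234, 0x8400, 1, 0, 0, 0) + question
print(matches_query(query, response))
```

Lowercasing the raw name bytes is safe here because length octets never exceed 63 and so can never fall in the A-Z range.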
|
||||
|
||||
After the header and the original question we find the answer:
|
||||
|
||||
***************************************************************
*                                 1  1  1  1  1  1
*   0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
* |         0xc0          |         0x0c          |
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
* |          00           |          28           |
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
* |          00           |          01           |
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
* |                  TTL = 3600                   |
* |                                               |
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
* |                 RDLENGTH = 16                 |
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
* |          24           |          00           |
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
* |          cb           |          00           |
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
* |          20           |          48           |
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
* |          00           |          01           |
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
* |          00           |          00           |
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
* |          00           |          00           |
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
* |          68           |          14           |
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
* |          00           |          55           |
* +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
***************************************************************
|
||||
The first two bytes (0xc0 0x0c) look rather mysterious. When DNS was
|
||||
created, 512 octets was considered the maximum size of a UDP datagram and
|
||||
thus the maximum size of a DNS message transported without using the (then
|
||||
slow) TCP protocol.
|
||||
|
||||
In order to squeeze as much information as possible into the 512 bytes, DNS
|
||||
names can (and often MUST) be compressed. The details of this compression
|
||||
are arcane and easy to get wrong, leading to infinite loops or buffer
|
||||
overflows. So tread very carefully. If you remember one thing, remember that
a pointer must always point to a lower position in the packet. Also
|
||||
beware of signed/unsigned arithmetic.
|
||||
|
||||
In this case, the DNS name of the answer is encoded as '0xc0 0x0c'. The c0
|
||||
part has the two most significant bits set, indicating that the following
|
||||
6+8 bits are a pointer to somewhere earlier in the message. In this case,
|
||||
this points to position 12 (= 0x0c) within the packet, which is immediately
|
||||
after the DNS header. There we find 'www.ietf.org'.
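
A cautious decompression routine might look like the following Python sketch. The function name `decode_name` is made up for this example. It enforces that every pointer lands below all data read so far for this name, and it caps the total name length at 255 octets, so malformed input cannot send it into a loop.

```python
def decode_name(message: bytes, pos: int):
    """Decode the (possibly compressed) name at pos.

    Returns (labels, next_pos), where next_pos is the offset right after the
    name as stored at pos. Raises ValueError on malformed input."""
    labels = []
    next_pos = None            # set when the first pointer is followed
    lowest = pos               # pointers must land below every segment seen so far
    wire_length = 1            # running length of the name, counting the final 0
    while True:
        if pos >= len(message):
            raise ValueError("name runs past the end of the message")
        length = message[pos]
        if (length & 0xC0) == 0xC0:                     # compression pointer
            if pos + 1 >= len(message):
                raise ValueError("truncated compression pointer")
            target = ((length & 0x3F) << 8) | message[pos + 1]
            if next_pos is None:
                next_pos = pos + 2
            if target >= lowest:                        # forbids forward jumps and loops
                raise ValueError("compression pointer does not point backwards")
            pos = lowest = target
        elif length & 0xC0:                             # 0x40 / 0x80: not a plain label
            raise ValueError("unsupported label type")
        elif length == 0:                               # root label: end of name
            return labels, (pos + 1 if next_pos is None else next_pos)
        else:
            wire_length += 1 + length
            if wire_length > 255 or pos + 1 + length > len(message):
                raise ValueError("name too long or truncated")
            labels.append(message[pos + 1 : pos + 1 + length])
            pos += 1 + length

# A zeroed header, the question for www.ietf.org, then a name that is only a pointer.
message = bytes(12) + b"\x03www\x04ietf\x03org\x00" + b"\x00\x1c\x00\x01" + b"\xc0\x0c"
print(decode_name(message, 12))
print(decode_name(message, 30))
```

Both calls yield the labels www, ietf and org; only the returned offset differs, because the second name occupies just the two pointer bytes.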
|
||||
|
||||
So what this means is that the answer about the DNS name 'www.ietf.org' is
|
||||
also called 'www.ietf.org'.
|
||||
|
||||
This is then followed in the packet by '28', which denotes AAAA (IPv6), and
|
||||
the usual 'class' of 1. Then a whole 32 bits are devoted to the Time To Live
|
||||
of this record, followed by a 16 bits length field. Since this is an IPv6
|
||||
address, the actual answer payload length is 16 bytes (or 128 bits).
|
||||
|
||||
This is then followed by the binary representation of the current IPv6
|
||||
address of www.ietf.org, 2400:cb00:2048:1::6814:55.
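
To tie the diagram to the text, this Python sketch unpacks the fixed fields of that answer record and renders the address with the standard `ipaddress` module. The byte string is copied from the diagram; the variable names are chosen for this example only.

```python
import ipaddress
import struct

# The answer record from the diagram, byte for byte.
record = bytes.fromhex(
    "c00c"                               # name: compression pointer to offset 12
    "001c"                               # type 28 (AAAA)
    "0001"                               # class 1 (IN)
    "00000e10"                           # TTL 3600
    "0010"                               # RDLENGTH 16
    "2400cb00204800010000000068140055"   # the IPv6 address
)

# After the 2-byte name: 16 bit type, 16 bit class, 32 bit TTL, 16 bit RDLENGTH.
rtype, rclass, ttl, rdlength = struct.unpack("!HHIH", record[2:12])
rdata = record[12 : 12 + rdlength]
address = ipaddress.IPv6Address(rdata)
print(rtype, rclass, ttl, rdlength, address)
```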
|
||||
|
||||
If there had been further answers, these would follow this first one, and
|
||||
the ANCOUNT would have been higher than 1. If there had been data in the
|
||||
'authority' and 'additional' sections, that would follow here too, with
|
||||
the corresponding adjustments to 'NSCOUNT' and 'ARCOUNT' fields. More about
|
||||
these sections later.
|
||||
|
||||
## RRSETs
|
||||
In the example above, the question for the AAAA record of 'www.ietf.org' had
|
||||
exactly one corresponding resource record. In a human readable 'zone file',
|
||||
this would be stored as:
|
||||
|
||||
```
www.ietf.org.   3600   IN   AAAA   2400:cb00:2048:1::6814:55
```
|
||||
|
||||
It is however possible to have multiple AAAA records for the same name. Even
|
||||
if there is only one record, the DNS specifications talk about 'Resource
|
||||
Record Sets', or RRSETs. These operate in unity. So even though the encoding
|
||||
in the DNS packet allows different TTL values within a single RRSET, this
|
||||
should never happen.
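
One way to treat records as sets is sketched below in Python. The function `group_rrsets` and the tuple layout are assumptions made for this example, and the second AAAA record and the A record use made-up documentation addresses. When TTLs within a set differ, this sketch applies the lowest one to the whole set, which is one conservative choice; rejecting the data is another.

```python
from collections import defaultdict

def group_rrsets(records):
    """Group (name, rtype, ttl, rdata) tuples into RRSETs keyed by (name, rtype).

    Names are plain ASCII text here; every set ends up with a single TTL."""
    rrsets = defaultdict(lambda: {"ttl": None, "rdatas": []})
    for name, rtype, ttl, rdata in records:
        rrset = rrsets[(name.lower(), rtype)]
        rrset["ttl"] = ttl if rrset["ttl"] is None else min(rrset["ttl"], ttl)
        rrset["rdatas"].append(rdata)
    return dict(rrsets)

records = [
    ("www.ietf.org", "AAAA", 3600, "2400:cb00:2048:1::6814:55"),
    ("WWW.ietf.org", "AAAA", 300, "2001:db8::1"),    # hypothetical second address
    ("www.ietf.org", "A", 3600, "192.0.2.1"),        # hypothetical A record
]
for key, rrset in group_rrsets(records).items():
    print(key, rrset)
```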
|
||||
|
||||
## Zone files
|
||||
Zone files are one way of storing DNS data, but these are not integral to
|
||||
the operation of a nameserver. The zone file format is standardized, but it
|
||||
is highly non-trivial to parse. It is entirely possible to write a useful
|
||||
nameserver that does not read or write DNS zone files. When embarking on
|
||||
parsing zonefiles, do not do so lightly. As an example, various fields
|
||||
within a single line can appear in many orders. Most fields are optional,
|
||||
and some will then be copied from the previous line. But not all.
|
||||
|
||||
Of specific note, many people have attempted to write a grammar (parser) for
|
||||
zonefiles, and it has proven almost impossible.
|
||||
|
||||
## DNS Names
|
||||
The concept of a DNS name is non-trivial and frequently misunderstood.
|
||||
Despite writing 'www.ietf.org' from left to right, within DNS it is fairer
|
||||
to describe it as 'org' below the root node, with below the 'org' node a
|
||||
node called 'ietf'. Finally to the 'ietf' node is attached a node called
|
||||
'www'.
|
||||
|
||||
|
||||
Or in graphical form:
|
||||
|
||||
***************************
*         +-----+
*         |     |
*         +--+--+
*            |
*         +--+--+
*         | ORG |
*         +--+--+
*            |
*         +--+---+
*         | IETF |
*         +--+---+
*            |
*         +--+--+
*         | WWW |
*         +-----+
***************************
```
|
||||
|
||||
The 'tree' of nodes as shown above is real and not just another way of
|
||||
visualizing a DNS name. This for example means that if there is a name
|
||||
called 'www.fr.ietf.org' and a query comes in for 'fr.ietf.org', that name
|
||||
exists - even though no records may be assigned to it.
|
||||
|
||||
NOTE: This means that any implementation that sees DNS as a simple
|
||||
'key/value' store, where only records that exist can match, is headed for
|
||||
trouble down the line.
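
The difference with a key/value store can be shown with a small Python sketch of such a tree. The `Node` class and the `add_record` and `tree_lookup` helpers are invented for this example, names are handled as dotted ASCII text for brevity, and the address is a documentation placeholder.

```python
class Node:
    """One node in the name tree; it may hold records, children, or both."""
    def __init__(self):
        self.children = {}   # label -> Node
        self.rrsets = {}     # type name -> list of record data

def labels_from_root(name):
    """'www.ietf.org' -> ['org', 'ietf', 'www']"""
    return list(reversed(name.lower().rstrip(".").split(".")))

def add_record(root, name, rtype, rdata):
    node = root
    for label in labels_from_root(name):
        node = node.children.setdefault(label, Node())   # creates missing parents
    node.rrsets.setdefault(rtype, []).append(rdata)

def tree_lookup(root, name, rtype):
    node = root
    for label in labels_from_root(name):
        node = node.children.get(label)
        if node is None:
            return "NXDOMAIN"            # no such node in the tree
    return node.rrsets.get(rtype) or "NODATA"   # node exists, type may not

root = Node()
add_record(root, "www.fr.ietf.org", "AAAA", "2001:db8::1")
print(tree_lookup(root, "www.fr.ietf.org", "AAAA"))
print(tree_lookup(root, "fr.ietf.org", "AAAA"))     # exists, but holds no records
print(tree_lookup(root, "de.ietf.org", "AAAA"))     # does not exist at all
```

A flat dictionary keyed on full names would report both of the last two lookups as missing, which is exactly the mistake the note above warns about.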
|
||||
|
||||
## Zones
|
||||
As noted, DNS is more complicated than a simple key/value store. This is
|
||||
not only because of the tree style nature of names but also because the same
|
||||
data can live in multiple places, but always lives in a 'zone'.
|
||||
|
||||
Various DNS implementations over time have found out that you can mostly
|
||||
ignore the concept of 'zone' for simple nameservers or load balancers, but
|
||||
not implementing zones correctly will eventually trip you up.
|
||||
|
||||
To make life confusing, 'www.ietf.org' can be defined in four different
|
||||
places. It could be in the 'root' zone itself, fully written out:
|
||||
|
||||
```
www.ietf.org.   3600   IN   AAAA   2400:cb00:2048:1::6814:55
```
|
||||
Or it could be in the org zone, where it might look like this:
|
||||
|
||||
```
$ORIGIN org.
www.ietf        3600   IN   AAAA   2400:cb00:2048:1::6814:55
```
|
||||
|
||||
Or, (as is actually the case), this name could live in the 'ietf.org' zone:
|
||||
|
||||
```
$ORIGIN ietf.org.
www             3600   IN   AAAA   2400:cb00:2048:1::6814:55
```
|
||||
|
||||
And finally, it is even possible that there is a zone called 'www.ietf.org',
|
||||
where the record lives like this:
|
||||
|
||||
```
$ORIGIN www.ietf.org.
@               3600   IN   AAAA   2400:cb00:2048:1::6814:55
```
|
||||
|
||||
|
||||
### Start of Authority
|
||||
A zone always starts with a SOA or Start Of Authority record. A SOA record
|
||||
is DNS metadata. It stores various things that may be of interest about a
|
||||
zone, like the email address of the maintainer and the name of the most
|
||||
authoritative server. It also has values that describe how or if a zone
|
||||
needs to be replicated. Finally, the SOA record has a number that
|
||||
influences TTL values for names that do not exist.
|
||||
|
||||
There is only one SOA that is guaranteed to exist on the internet and that
|
||||
is the one for the root zone (called '.'). As of 2018, it looks like this:
|
||||
|
||||
```
|
||||
. 86400 IN SOA a.root-servers.net. nstld.verisign-grs.com. 2018032802 1800 900 604800 86400
|
||||
```
|
||||
|
||||
For details of what all these fields mean, please see the [authoritative
|
||||
server document](auth.md.html).
|
||||
|
||||
The final number however is important here. 86400 denotes that if a
|
||||
response says a name or RRSET does not exist, it will continue to not exist
|
||||
for the next day, and that this knowledge may be cached.
|
||||
|
||||
### Zone cuts
|
||||
As noted, 'www.ietf.org' can live in four places. If it lives where it
|
||||
currently does, in the 'ietf.org' zone, it passes through two zone cuts:
|
||||
From . to org, from org to ietf.org.
|
||||
|
||||
When an authoritative server receives a query for 'www.ietf.org', it
|
||||
consults which zones it knows about and answers from the most specific zone
|
||||
it has available.
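
Selecting the most specific zone amounts to a longest-suffix match on labels, as in this Python sketch. The function `find_best_zone` is a hypothetical helper; names are represented as tuples of lowercase labels, with the empty tuple standing for the root zone.

```python
def find_best_zone(qname_labels, zones):
    """Return the longest zone name that is a suffix of the query name.

    qname_labels is a tuple such as ('www', 'ietf', 'org'); zones is a set of
    such tuples. Returns None if no loaded zone covers the name."""
    for start in range(len(qname_labels) + 1):
        candidate = tuple(qname_labels[start:])   # drop labels from the left
        if candidate in zones:
            return candidate
    return None

qname = ("www", "ietf", "org")
print(find_best_zone(qname, {(), ("org",), ("ietf", "org")}))   # server with all three
print(find_best_zone(qname, {()}))                              # a root server
print(find_best_zone(qname, {("example", "com")}))              # unrelated zone only
```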
|
||||
|
||||
For a root-server, which only knows about the root zone, this means
|
||||
consulting the '.' zone. As noted, 'www.ietf.org' is actually a tree, 'org'
|
||||
-> 'ietf' -> 'www'. And as luck will have it, the first node 'org' is
|
||||
present in the root zone.
|
||||
|
||||
Attached to that node is an NS RRSET, which has the names of nameservers
|
||||
that host the ORG zone.
|
||||
|
||||
If we ask these servers about 'www.ietf.org', they too find the best zone to
|
||||
answer from, which in this case is 'org'. Within the 'org' zone they then
|
||||
find the 'ietf' node, which again contains an NS RRSET.
|
||||
|
||||
When we ask the servers named in that RRSET about 'www.ietf.org', they find
|
||||
a node called 'www' with several RRSETs on it, one of which is for AAAA and
|
||||
contains the IPv6 address we were looking for.
|
||||
|
||||
Any authoritative server which does not implement 'zones' in this way will
|
||||
eventually run into trouble. It is not enough to consult a list of known
|
||||
names and answer records attached to those names.
|
||||
|
||||
### NS Records
|
||||
These are a mandatory part of a zone, at the 'apex'. The 'apex' is the name
|
||||
of the zone, at which point there is also a SOA record. So a typical zone
|
||||
will start like this:
|
||||
|
||||
```
$ORIGIN ietf.org.
@       IN      SOA     ns1 admin 2018032802 1800 900 604800 86400
        IN      NS      ns1
        IN      NS      ns2
```
|
||||
|
||||
Note how in this zone file example names not ending on a '.' are interpreted
|
||||
as being part of ietf.org. The '@' is a way to specify the name of the
|
||||
apex. The two NS lines omit a name, so they default to '@' too.
|
||||
|
||||
This zone lists ns1.ietf.org and ns2.ietf.org as its nameservers.
|
||||
Being part of the zone, this data is *authoritative*. Any queries sent to
|
||||
this nameserver for the NS RRSET of 'ietf.org' will receive responses
|
||||
with the AA bit set.
|
||||
|
||||
Note however that above we learned that the parent zone, 'org' also needs to
|
||||
list the nameservers for ietf.org, and it does:
|
||||
|
||||
```
|
||||
$ORIGIN org.
|
||||
...
|
||||
ietf IN NS ns1.ietf
|
||||
ietf IN NS ns2.ietf
|
||||
```
|
||||
|
||||
If we ask the 'org' nameservers for the NS RRSET of 'ietf.org', we receive a
|
||||
response with AA=0, indicating that the 'org' servers know they aren't
|
||||
'authoritative' for ietf.org.
|
||||
|
||||
### Glue records
|
||||
The astute reader will have spotted a chicken and egg problem here. If
|
||||
ns1.ietf.org is the nameserver for ietf.org.. where do we get the IP
|
||||
address of ns1.ietf.org?
|
||||
|
||||
To solve this problem, the parent zone can provide a free chicken. In the
|
||||
org zone, we would actually find:
|
||||
|
||||
```
|
||||
$ORIGIN org.
|
||||
...
|
||||
ietf IN NS ns1.ietf
|
||||
ietf IN NS ns2.ietf
|
||||
ns1.ietf IN A 192.0.2.1
|
||||
ns2.ietf IN A 198.51.100.1
|
||||
```
|
||||
|
||||
These entries are mirrored in the 'ietf.org' zone hosted on ns1.ietf.org and
|
||||
ns2.ietf.org. And as with the NS records, any queries for ns1.ietf.org sent
|
||||
to the org servers receive AA=0 answers, whereas ns1.ietf.org itself answers
|
||||
with AA=1.
|
||||
|
||||
Note that for various reasons the AA=0 answer from the parent zone may be
|
||||
different than the AA=1 answer, and resolvers must be aware of the
|
||||
difference.
|
||||
|
||||
|
||||
# Further aspects
|
||||
|
||||
The description up to this point is correct, but far from functionally
|
||||
complete even for basic DNS. The following sections describe additional
|
||||
aspects of basic DNS:
|
||||
|
||||
## CNAME
|
||||
A CNAME provides the 'Canonical Name' for another DNS name. For example:
|
||||
|
||||
```
|
||||
www IN CNAME www.ietf.org.cdn.cloudflare.net.
|
||||
```
|
||||
|
||||
This is frequently used to redirect to a Content Distribution Network. The
|
||||
CNAME is for a name, and not for a type. This means that *any* query for
|
||||
www.ietf.org is sent to Cloudflare. This simultaneously means that what
|
||||
everyone wants is impossible:
|
||||
|
||||
```
|
||||
$ORIGIN ietf.org.
|
||||
@ IN CNAME this.does.not.work.int.
|
||||
```
|
||||
|
||||
This collides with the SOA and NS records, which are then also redirected
|
||||
and not found. Frequently doing this 'apex CNAME' appears to work, but it
|
||||
really doesn't.
|
||||
|
||||
In hindsight, the CNAME should have been 'typed' to apply only to specific
|
||||
query types.
|
||||
|
||||
When a server encounters a CNAME at the name it was looking for,
|
||||
it will 'follow' the chain to where it points. And please be aware that this
|
||||
can loop.
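
A defensive way to follow such chains is sketched below in Python. The function `follow_cnames`, the dictionary layout, and the hop limit of 10 are all assumptions made for illustration, as are the `a.example` and `b.example` names.

```python
def follow_cnames(name, cnames, max_hops=10):
    """Follow CNAMEs starting at name and return the chain of names visited.

    cnames maps a name to its CNAME target. Raises ValueError on a loop or on
    a chain longer than max_hops."""
    seen = set()
    chain = [name]
    while name in cnames:
        if name in seen or len(chain) > max_hops:
            raise ValueError("CNAME loop or overly long chain")
        seen.add(name)
        name = cnames[name]
        chain.append(name)
    return chain

cnames = {
    "www.ietf.org.": "www.ietf.org.cdn.cloudflare.net.",
    "a.example.": "b.example.",      # hypothetical loop: a -> b -> a
    "b.example.": "a.example.",
}
print(follow_cnames("www.ietf.org.", cnames))
try:
    follow_cnames("a.example.", cnames)
except ValueError as error:
    print("refused:", error)
```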
|
||||
|
||||
## Wildcards
|
||||
Wildcards allow for the following:
|
||||
|
||||
```
$ORIGIN ietf.org.
*       IN      A       192.0.2.1
        IN      AAAA    2001:db8:85a3::8a2e:0370:7334
smtp    IN      A       192.0.2.222
```
|
||||
|
||||
A query for the A record of 'smtp.ietf.org' will return 192.0.2.222. A query
|
||||
for 'www.ietf.org' however will return 192.0.2.1.
|
||||
|
||||
Interestingly, as another example of how DNS really is a tree, a query for
|
||||
the AAAA record of smtp.ietf.org will return.. nothing. This is because
|
||||
the node 'smtp.ietf.org' does exist, and processing ends there. The
|
||||
wildcard match will not proceed to the '*' entry.
|
||||
|
||||
Wildcards synthesize new answers. This means that, unless explicitly
|
||||
queried, no '*.ietf.org' record will be served. Instead, a 'www.ietf.org'
|
||||
record is created on the fly.
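
The behaviour described in this section can be reproduced with a simplified Python sketch. It models the zone as a dictionary from dotted names to record types, which is an assumption for brevity; the helpers `node_exists` and `wildcard_lookup` are invented names, and the NS entry at the apex is a placeholder. An empty list stands for 'no data' and `None` for a name that does not exist.

```python
ZONE = {
    "ietf.org": {"NS": ["ns1.ietf.org"]},
    "*.ietf.org": {"A": ["192.0.2.1"], "AAAA": ["2001:db8:85a3::8a2e:0370:7334"]},
    "smtp.ietf.org": {"A": ["192.0.2.222"]},
}

def node_exists(zone, name):
    """A node exists if it has records or if any name below it does."""
    return name in zone or any(owner.endswith("." + name) for owner in zone)

def wildcard_lookup(zone, qname, qtype):
    if node_exists(zone, qname):
        # The node exists, so processing ends here: no wildcard is consulted.
        return zone.get(qname, {}).get(qtype, [])
    labels = qname.split(".")
    for i in range(1, len(labels)):
        ancestor = ".".join(labels[i:])
        if node_exists(zone, ancestor):          # closest existing ancestor
            wildcard = zone.get("*." + ancestor)
            if wildcard is None:
                return None                      # no wildcard there: no such name
            return wildcard.get(qtype, [])       # synthesized under qname
    return None

print(wildcard_lookup(ZONE, "smtp.ietf.org", "A"))
print(wildcard_lookup(ZONE, "www.ietf.org", "A"))
print(wildcard_lookup(ZONE, "smtp.ietf.org", "AAAA"))
```

The third lookup comes back empty even though the wildcard has an AAAA record, because 'smtp.ietf.org' exists as a node in its own right.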
|
||||
|
||||
## Truncation
|
||||
Without implementing the optional EDNS protocol extension, all UDP responses
|
||||
must fit in 512 bytes of payload. If on writing an answer a server finds
|
||||
itself exceeding this limit, it must truncate the packet and set the TC bit
|
||||
to 1.
|
||||
|
||||
The originator of the query will then resend the query over TCP.
|
||||
|
||||
Sometimes DNS responses contain optional data that could be left out, and
|
||||
this could be done to stay under the 512 byte limit.
|
||||
|
||||
It is recommended however to keep it simple and send an empty response
|
||||
packet with TC=1 whenever the byte limit is reached.
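
The simple approach recommended here could be sketched in Python as follows. The function `fit_for_udp` and the constant name are invented for this example, and the caller is assumed to know how long the question section is.

```python
import struct

MAX_UDP_PAYLOAD = 512   # limit without EDNS

def fit_for_udp(response: bytes, question_len: int) -> bytes:
    """Return response unchanged if it fits, else header plus question with TC=1.

    question_len is the length in bytes of the question section, which starts
    right after the 12-byte header."""
    if len(response) <= MAX_UDP_PAYLOAD:
        return response
    message_id, flags, qdcount = struct.unpack("!HHH", response[:6])
    flags |= 0x0200                                   # set the TC bit
    # Keep ID, flags and QDCOUNT; zero ANCOUNT, NSCOUNT and ARCOUNT.
    header = struct.pack("!HHHHHH", message_id, flags, qdcount, 0, 0, 0)
    return header + response[12 : 12 + question_len]

question = b"\x03www\x04ietf\x03org\x00\x00\x1c\x00\x01"
oversized = struct.pack("!HHHHHH", 0x1234, 0x8400, 1, 40, 0, 0) + question + bytes(600)
truncated = fit_for_udp(oversized, len(question))
print(len(oversized), len(truncated), truncated[:12].hex())
```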
|
||||
|
||||
## Names and nodes that do not exist
|
||||
DNS queries can fail to match in two ways: the whole node does not exist,
|
||||
or, the requested type is not present at that node.
|
||||
|
||||
As an example of the first case, 'doesnotexist.ietf.org' really does not
|
||||
exist, which leads to a response with RCODE NXDOMAIN and no answer records.
|
||||
|
||||
As an example of the second case, 'www.ietf.org' does exist, but has no MX
|
||||
record. The RCODE is normal, but there are no answer records.
|
||||
|
||||
Empty answers however are hard to cache. To alleviate this situation, in
|
||||
these cases the authoritative server sends a copy of the SOA record in the
|
||||
Authority section of the response. The TTL of that record tells us how long
|
||||
the knowledge of 'no such name' or 'no such data' can be cached.
|
||||
|
||||
## Query types that are not RRSET types
|
||||
In addition to the resource record types covered above, like A, AAAA, NS and
|
||||
SOA, three additional types exist that can only be used in queries: ANY, AXFR
|
||||
and IXFR.
|
||||
|
||||
An ANY query instructs a nameserver to return all types it immediately has
|
||||
available for a name. This 'immediately' qualification makes ANY queries
|
||||
unsuitable for talking to resolvers - there is no guarantee that the response
is in any way complete.
|
||||
|
||||
Because of the potential of creating huge answers, the use of ANY is
|
||||
problematic even when talking to authoritative servers, and it may no longer
|
||||
work well in the future.
|
||||
|
||||
AXFR and IXFR are requests for (incremental) zone transfers, almost always
|
||||
over TCP. This query asks an authoritative server to list an entire zone.
|
||||
Resolvers do not process AXFR or IXFR queries.
|
||||
|
||||
# That's it for basic DNS!
|
||||
This is the core of DNS. There are quite a few parts that have not been
|
||||
discussed, but based on the explanations above, it is possible to write a
|
||||
compliant authoritative server.
|
||||
|
||||
## Further reading
|
||||
|
||||
### RFC 1034 / 1035
|
||||
|
||||
These ([1034](https://tools.ietf.org/html/rfc1034) &
|
||||
[1035](https://tools.ietf.org/html/rfc1035)) describe the core of DNS in
|
||||
1987 language. When reading, disregard mentions of IQUERY and experimental
|
||||
records. They did not survive. Also realize that in this world,
|
||||
authoritative and resolver service were described as a single function. We
|
||||
now know this to be confusing.
|
||||
|
||||
### RFC 2181: "Clarifications to the DNS Specification"
|
||||
From 1997, [2181](https://tools.ietf.org/html/rfc2181) performs a decade of
|
||||
cleanup work on 1034/1035. It also talks about an early version of DNSSEC
|
||||
(NXT, SIG, KEY records). These sections should be skipped, as this is
|
||||
unrelated to current DNSSEC (aka DNSSEC-bis).
|
||||
|
||||
Of specific note, section 5.4.1 describes exact ranking rules for which data a
|
||||
server is supposed to prefer. This list becomes a lot simpler when split up
|
||||
between pure authoritative and pure resolver functions.
|
||||
|
||||
### RFC 2308: "Negative caching of DNS Queries (NCACHE)"
|
||||
This [rfc](https://tools.ietf.org/html/rfc2308) describes how negative
|
||||
responses are to be cached. The details matter for both authoritative servers and
|
||||
resolvers. Of specific note are the parts that dwell on CNAME chains which
|
||||
lead to a 'no data' or 'NXDOMAIN' situation.
|
||||
|
||||
As with 2181, this RFC speaks about an earlier version of DNSSEC, and these
|
||||
parts should be fully ignored.
|
||||
|
||||
### RFC 3596: "DNS Extensions to Support IP Version 6"
|
||||
This [rfc](https://tools.ietf.org/html/rfc3596) describes the AAAA record,
|
||||
which is core to DNS as it is required to look up addresses of nameservers.
|
||||
|
||||
### RFC 4343: "Domain Name System Case Insensitivity Clarification"
|
||||
[4343](https://tools.ietf.org/html/rfc4343) clarifies the somewhat odd
|
||||
case insensitivity of DNS but also writes out the escaping rules when using
|
||||
non-ASCII or whitespace in DNS names. As noted before, try not to have to
|
||||
use these rules except when reading DNS data from text files or showing DNS
|
||||
data meant for human consumption. Use native DNS names as much as possible,
|
||||
and create 4343-compliant comparison and equivalence functions.
|
||||
|
||||
### RFC 5452: "Measures for Making DNS More Resilient against Forged Answers"
|
||||
This [RFC](https://tools.ietf.org/html/rfc5452) makes source port
|
||||
randomization mandatory for UDP-based DNS messages and also has rules on
|
||||
preventing "birthday attacks".
|
||||
|
||||
### RFC 6604: "xNAME RCODE Clarification"
|
||||
[6604](https://tools.ietf.org/html/rfc6604) further describes the meanings
|
||||
of header bits (AA) and RCODEs when following CNAME chains. Also discusses
|
||||
an earlier version of DNAME; these parts are best ignored in favor of
|
||||
(later) reading the newer DNAME specification.
|
||||
|
||||
### RFC 7766: "DNS Transport over TCP - Implementation Requirements"
|
||||
[This RFC](https://tools.ietf.org/html/rfc7766.txt) updates 1034/1035 to
|
||||
state that TCP is a mandatory part of DNS and a first class citizen. It also
|
||||
updates timeout rules, recommending rather brief timeouts compared to the
|
||||
'minutes' noted in the original DNS standard.
|
||||
|
||||
<!-- Markdeep: --><style class="fallback">body{visibility:hidden;white-space:pre;font-family:monospace}</style><script src="ext/markdeep.min.js"></script><script>window.alreadyProcessedMarkdeep||(document.body.style.visibility="visible")</script>
|
1
basic.md.html
Symbolic link
@ -0,0 +1 @@
|
||||
basic.md
|
@ -1,7 +1,10 @@
|
||||
<meta charset="utf-8" emacsmode="-*- markdown -*-">
|
||||
**A warm welcome to DNS**
|
||||
Note: this page is part of the
|
||||
'[hello-dns](https://powerdns.org/hello-dns/)' documentation effort.
|
||||
|
||||
# DNSSEC
|
||||
For now, see [this page](https://ds9a.nl/dnssec/).
|
||||
|
||||
<!-- Markdeep: --><style class="fallback">body{visibility:hidden;white-space:pre;font-family:monospace}</style><script src="ext/markdeep.min.js"></script><script>window.alreadyProcessedMarkdeep||(document.body.style.visibility="visible")</script>
|
||||
|
||||
|
3
meta.md
@ -1,5 +1,7 @@
|
||||
<meta charset="utf-8" emacsmode="-*- markdown -*-">
|
||||
**A warm welcome to DNS**
|
||||
Note: this page is part of the
|
||||
'[hello-dns](https://powerdns.org/hello-dns/)' documentation effort.
|
||||
|
||||
# The why and what of these documents
|
||||
There are now between 1500 and 3000 pages of RFC documents describing DNS,
|
||||
@ -52,3 +54,4 @@ urge to write 'standardese' here.
|
||||
|
||||
|
||||
<!-- Markdeep: --><style class="fallback">body{visibility:hidden;white-space:pre;font-family:monospace}</style><script src="ext/markdeep.min.js"></script><script>window.alreadyProcessedMarkdeep||(document.body.style.visibility="visible")</script>
|
||||
|
||||
|
16
non-ietf.md
Normal file
@ -0,0 +1,16 @@
|
||||
<meta charset="utf-8" emacsmode="-*- markdown -*-">
|
||||
**A warm welcome to DNS**
|
||||
|
||||
<link rel="stylesheet" href="https://casual-effects.com/markdeep/latest/apidoc.css?">
|
||||
|
||||
Note: this page is part of the
|
||||
'[hello-dns](https://powerdns.org/hello-dns/)' documentation effort.
|
||||
|
||||
# Non-IETF DNS standards
|
||||
|
||||
* RPZ
|
||||
* RRL
|
||||
* DNSCrypt
|
||||
* CurveDNS
|
||||
|
||||
<!-- Markdeep: --><style class="fallback">body{visibility:hidden;white-space:pre;font-family:monospace}</style><script src="ext/markdeep.min.js"></script><script>window.alreadyProcessedMarkdeep||(document.body.style.visibility="visible")</script>
|
1
non-ietf.md.html
Symbolic link
@ -0,0 +1 @@
|
||||
non-ietf.md
|
16
privacy.md
Normal file
@ -0,0 +1,16 @@
|
||||
<meta charset="utf-8" emacsmode="-*- markdown -*-">
|
||||
**A warm welcome to DNS**
|
||||
|
||||
<link rel="stylesheet" href="https://casual-effects.com/markdeep/latest/apidoc.css?">
|
||||
|
||||
Note: this page is part of the
|
||||
'[hello-dns](https://powerdns.org/hello-dns/)' documentation effort.
|
||||
|
||||
# Privacy
|
||||
TBC
|
||||
|
||||
* DNS over TLS
|
||||
* DNS over HTTPS
|
||||
* Query name minimization
|
||||
|
||||
<!-- Markdeep: --><style class="fallback">body{visibility:hidden;white-space:pre;font-family:monospace}</style><script src="ext/markdeep.min.js"></script><script>window.alreadyProcessedMarkdeep||(document.body.style.visibility="visible")</script>
|
1
privacy.md.html
Symbolic link
@ -0,0 +1 @@
|
||||
privacy.md
|
24
rare.md
Normal file
@ -0,0 +1,24 @@
|
||||
<meta charset="utf-8" emacsmode="-*- markdown -*-">
|
||||
**A warm welcome to DNS**
|
||||
|
||||
<link rel="stylesheet" href="https://casual-effects.com/markdeep/latest/apidoc.css?">
|
||||
|
||||
Note: this page is part of the
|
||||
'[hello-dns](https://powerdns.org/hello-dns/)' documentation effort.
|
||||
|
||||
# Rare
|
||||
DNS is currently described in over 150 RFCs. Not all of these are
|
||||
operational, and it is uncertain if all of them should be.
|
||||
|
||||
This page is a menu of things that are specified, but not in wide
|
||||
production.
|
||||
|
||||
Standards listed here could be omitted from 2018 implementations
|
||||
without causing operational problems, but this may change in the future.
|
||||
|
||||
Applicable to authoritative servers:
|
||||
|
||||
* TKEY
|
||||
* SIG(0)
|
||||
|
||||
<!-- Markdeep: --><style class="fallback">body{visibility:hidden;white-space:pre;font-family:monospace}</style><script src="ext/markdeep.min.js"></script><script>window.alreadyProcessedMarkdeep||(document.body.style.visibility="visible")</script>
|
1
rare.md.html
Symbolic link
@ -0,0 +1 @@
|
||||
rare.md
|
@ -1,6 +1,9 @@
|
||||
<meta charset="utf-8" emacsmode="-*- markdown -*-">
|
||||
**A warm welcome to DNS**
|
||||
|
||||
Note: this page is part of the
|
||||
'[hello-dns](https://powerdns.org/hello-dns/)' documentation effort.
|
||||
|
||||
# Resolver
|
||||
Writing a modern resolver is the hardest part of DNS. A fully standards
|
||||
compliant DNS resolver is not a resolver that can be used in practice.
|
||||
|
16
tdns/Makefile
Normal file
@ -0,0 +1,16 @@
|
||||
CXXFLAGS:=-std=gnu++14 -Wall -O2 -MMD -MP -ggdb -Iext/simplesocket
|
||||
|
||||
PROGRAMS = tdns
|
||||
|
||||
all: $(PROGRAMS)
|
||||
|
||||
clean:
|
||||
rm -f *~ *.o *.d test $(PROGRAMS)
|
||||
|
||||
#check: mtests
|
||||
# ./mtests
|
||||
|
||||
-include *.d
|
||||
|
||||
tdns: tdns.o ext/simplesocket/comboaddress.o ext/simplesocket/sclasses.o ext/simplesocket/swrappers.o
|
||||
g++ -std=gnu++14 $^ -o $@
|
38
tdns/dns.hh
Normal file
@ -0,0 +1,38 @@
|
||||
#pragma once
|
||||
struct dnsheader {
|
||||
unsigned id :16; /* query identification number */
|
||||
#if BYTE_ORDER == BIG_ENDIAN
|
||||
/* fields in third byte */
|
||||
unsigned qr: 1; /* response flag */
|
||||
unsigned opcode: 4; /* purpose of message */
|
||||
unsigned aa: 1; /* authoritative answer */
|
||||
unsigned tc: 1; /* truncated message */
|
||||
unsigned rd: 1; /* recursion desired */
|
||||
/* fields in fourth byte */
|
||||
unsigned ra: 1; /* recursion available */
|
||||
unsigned unused :1; /* unused bits (MBZ as of 4.9.3a3) */
|
||||
unsigned ad: 1; /* authentic data from named */
|
||||
unsigned cd: 1; /* checking disabled by resolver */
|
||||
unsigned rcode :4; /* response code */
|
||||
#elif BYTE_ORDER == LITTLE_ENDIAN || BYTE_ORDER == PDP_ENDIAN
|
||||
/* fields in third byte */
|
||||
unsigned rd :1; /* recursion desired */
|
||||
unsigned tc :1; /* truncated message */
|
||||
unsigned aa :1; /* authoritative answer */
|
||||
unsigned opcode :4; /* purpose of message */
|
||||
unsigned qr :1; /* response flag */
|
||||
/* fields in fourth byte */
|
||||
unsigned rcode :4; /* response code */
|
||||
unsigned cd: 1; /* checking disabled by resolver */
|
||||
unsigned ad: 1; /* authentic data from named */
|
||||
unsigned unused :1; /* unused bits (MBZ as of 4.9.3a3) */
|
||||
unsigned ra :1; /* recursion available */
|
||||
#endif
|
||||
/* remaining bytes */
|
||||
unsigned qdcount :16; /* number of question entries */
|
||||
unsigned ancount :16; /* number of answer entries */
|
||||
unsigned nscount :16; /* number of authority entries */
|
||||
unsigned arcount :16; /* number of resource entries */
|
||||
};
|
||||
|
||||
static_assert(sizeof(dnsheader) == 12, "dnsheader size must be 12");
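
The bitfield layout above relies on compiler-specific bitfield ordering and on `BYTE_ORDER` being defined before the header is read. As a point of comparison, the same flags can be decoded with shifts and masks, which does not depend on either. This is only a sketch: `HeaderFlags` and `decodeFlags` are hypothetical names, and the caller is assumed to have checked that at least 12 octets are available.

```cpp
#include <cstdint>

struct HeaderFlags
{
  bool qr, aa, tc, rd, ra, ad, cd;
  uint8_t opcode, rcode;
};

// 'wire' must point at the start of a DNS message of at least 12 octets.
// Octet 2 holds QR, OPCODE, AA, TC and RD; octet 3 holds RA, one reserved
// bit, AD, CD and RCODE, most significant bit first.
HeaderFlags decodeFlags(const uint8_t* wire)
{
  const uint8_t b2 = wire[2], b3 = wire[3];
  HeaderFlags f;
  f.qr     = (b2 & 0x80) != 0;
  f.opcode = (b2 >> 3) & 0x0f;
  f.aa     = (b2 & 0x04) != 0;
  f.tc     = (b2 & 0x02) != 0;
  f.rd     = (b2 & 0x01) != 0;
  f.ra     = (b3 & 0x80) != 0;
  f.ad     = (b3 & 0x20) != 0;
  f.cd     = (b3 & 0x10) != 0;
  f.rcode  = b3 & 0x0f;
  return f;
}
```

The trade-off is that the bitfield struct can be overlaid on a packet with a single `memcpy`, while the mask version needs explicit decode and encode steps.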
|
1
tdns/ext/simplesocket
Submodule
@ -0,0 +1 @@
|
||||
Subproject commit 9829ce7772b52442669ba22762c49b4e69397988
|
281
tdns/tdns.cc
Normal file
@ -0,0 +1,281 @@
|
||||
/* Goal: a fully standards compliant basic authoritative server. In <500 lines.
|
||||
Non-goals: notifications, slaving zones, name compression, edns,
|
||||
performance
|
||||
*/
|
||||
#include <cstdint>
|
||||
#include <string>
|
||||
#include <vector>
|
||||
#include <deque>
|
||||
#include <map>
|
||||
#include <stdexcept>
|
||||
#include <array>
|
||||
#include <cstring>
|
||||
#include <iostream>
|
||||
#include "sclasses.hh"
|
||||
#include "dns.hh"
|
||||
|
||||
using namespace std;
|
||||
|
||||
typedef uint16_t dnstype;
|
||||
typedef std::string dnslabel;
|
||||
|
||||
enum class RCode
|
||||
{
|
||||
Refused=5
|
||||
};
|
||||
|
||||
enum class DNSType
|
||||
{
|
||||
A = 1,
|
||||
NS = 2,
|
||||
CNAME = 5,
|
||||
SOA=6,
|
||||
AAAA = 28
|
||||
};
|
||||
|
||||
typedef deque<dnslabel> dnsname;
|
||||
// this should perform escaping rules!
|
||||
static std::ostream & operator<<(std::ostream &os, const dnsname& d)
|
||||
{
|
||||
for(const auto& l : d) {
|
||||
os<<l<<".";
|
||||
}
|
||||
return os;
|
||||
}
|
||||
|
||||
struct DNSNode
|
||||
{
|
||||
DNSNode* find(dnsname& name, dnsname& last);
|
||||
DNSNode* add(dnsname name);
|
||||
map<dnslabel, DNSNode> children;
|
||||
map<dnstype, vector<string> > rrsets;
|
||||
|
||||
DNSNode* zone{0}; // if this is set, this node is a zone
|
||||
};
|
||||
|
||||
DNSNode* DNSNode::find(dnsname& name, dnsname& last)
|
||||
{
|
||||
cout<<"Lookup for '"<<name<<"', last is now '"<<last<<"'"<<endl;
|
||||
if(name.empty()) {
|
||||
if(!zone && rrsets.empty()) // only root zone can have this
|
||||
return 0;
|
||||
else
|
||||
return this;
|
||||
}
|
||||
auto iter = children.find(name.back());
|
||||
cout<<"Looked for child called '"<<name.back()<<"'"<<endl;
|
||||
if(iter == children.end()) {
|
||||
cout<<"Found nothing, returning leaf"<<endl;
|
||||
return this;
|
||||
}
|
||||
last.push_front(name.back());
|
||||
name.pop_back();
|
||||
return iter->second.find(name, last);
|
||||
}
|
||||
|
||||
DNSNode* DNSNode::add(dnsname name)
|
||||
{
|
||||
cout<<"Add for '"<<name<<"'"<<endl;
|
||||
if(name.size() == 1) {
|
||||
cout<<"Last label, adding "<<name.front()<<endl;
|
||||
return &children[name.front()];
|
||||
}
|
||||
|
||||
auto back = name.back();
|
||||
name.pop_back();
|
||||
auto iter = children.find(back);
|
||||
|
||||
if(iter == children.end()) {
|
||||
cout<<"Inserting new child for "<<back<<endl;
|
||||
return children[back].add(name);
|
||||
}
|
||||
return iter->second.add(name);
|
||||
}
|
||||
|
||||
|
||||
struct DNSMessage
|
||||
{
|
||||
struct dnsheader dh=dnsheader{};
|
||||
std::array<uint8_t, 500> payload;
|
||||
uint16_t payloadpos{0}, payloadsize{0};
|
||||
|
||||
dnsname getName();
|
||||
uint16_t getUInt16();
|
||||
uint32_t getUInt32();
|
||||
|
||||
void putName(const dnsname& name);
|
||||
void putUInt16(uint16_t val);
|
||||
void putUInt32(uint32_t val);
|
||||
void putBlob(const std::string& blob);
|
||||
|
||||
void getQuestion(dnsname& name, dnstype& type);
|
||||
void setQuestion(const dnsname& name, dnstype type);
|
||||
void putRR(const dnsname& name, uint16_t type, uint32_t ttl, const std::string& rr);
|
||||
std::string serialize() const;
|
||||
} __attribute__((packed));
|
||||
|
||||
dnsname DNSMessage::getName()
|
||||
{
|
||||
dnsname name;
|
||||
for(;;) {
|
||||
uint8_t labellen=payload.at(payloadpos++);
|
||||
if(labellen > 63)
|
||||
throw std::runtime_error("Got a compressed label");
|
||||
if(!labellen) // end of dnsname
|
||||
break;
|
||||
dnslabel label(&payload.at(payloadpos), &payload.at(payloadpos+labellen));
|
||||
payloadpos += labellen;
|
||||
name.push_back(label);
|
||||
}
|
||||
return name;
|
||||
}
|
||||
|
||||
uint16_t DNSMessage::getUInt16()
|
||||
{
|
||||
uint16_t ret;
|
||||
memcpy(&ret, &payload.at(payloadpos+2)-2, 2);
|
||||
payloadpos+=2;
|
||||
return ntohs(ret);
|
||||
}
|
||||
|
||||
void DNSMessage::getQuestion(dnsname& name, dnstype& type)
|
||||
{
|
||||
name=getName();
|
||||
type=getUInt16();
|
||||
}
|
||||
|
||||
void DNSMessage::putName(const dnsname& name)
|
||||
{
|
||||
for(const auto& l : name) {
|
||||
payload.at(payloadpos++)=l.size();
|
||||
for(const auto& a : l)
|
||||
payload.at(payloadpos++)=(uint8_t)a;
|
||||
}
|
||||
payload.at(payloadpos++)=0;
|
||||
}
|
||||
|
||||
void DNSMessage::putUInt16(uint16_t val)
|
||||
{
|
||||
val = htons(val);
|
||||
memcpy(&payload.at(payloadpos+2)-2, &val, 2);
|
||||
payloadpos+=2;
|
||||
}
|
||||
|
||||
void DNSMessage::putUInt32(uint32_t val)
|
||||
{
|
||||
val = htonl(val);
|
||||
memcpy(&payload.at(payloadpos+sizeof(val)) - sizeof(val), &val, sizeof(val));
|
||||
payloadpos += sizeof(val);
|
||||
}
|
||||
|
||||
|
||||
void DNSMessage::putBlob(const std::string& blob)
|
||||
{
|
||||
memcpy(&payload.at(payloadpos+blob.size()) - blob.size(), blob.c_str(), blob.size());
|
||||
payloadpos += blob.size();
|
||||
}
|
||||
|
||||
|
||||
void DNSMessage::putRR(const dnsname& name, uint16_t type, uint32_t ttl, const std::string& payload)
|
||||
{
|
||||
putName(name);
|
||||
putUInt16(type); putUInt16(1);
|
||||
putUInt32(ttl);
|
||||
putUInt16(payload.size()); // check for overflow!
|
||||
putBlob(payload);
|
||||
}
|
||||
|
||||
void DNSMessage::setQuestion(const dnsname& name, dnstype type)
|
||||
{
|
||||
putName(name);
|
||||
putUInt16(type);
|
||||
putUInt16(1); // class
|
||||
}
|
||||
|
||||
string DNSMessage::serialize() const
|
||||
{
|
||||
return string((const char*)this, (const char*)this + sizeof(dnsheader) + payloadpos);
|
||||
}
|
||||
|
||||
|
||||
static_assert(sizeof(DNSMessage) == 516, "dnsmessage size must be 516");
|
||||
|
||||
int main(int argc, char** argv)
|
||||
{
|
||||
ComboAddress local(argv[1], 53);
|
||||
Socket udplistener(local.sin4.sin_family, SOCK_DGRAM);
|
||||
SBind(udplistener, local);
|
||||
|
||||
DNSNode zones;
|
||||
auto zone = zones.add({"powerdns", "org"});
|
||||
zone->zone = new DNSNode(); // XXX ICK
|
||||
zone->zone->rrsets[(dnstype)DNSType::SOA]={"hello"};
|
||||
zone->zone->rrsets[(dnstype)DNSType::A]={"\x01\x02\x03\x04"};
|
||||
|
||||
zone->zone->add({"www"})->rrsets[(dnstype)DNSType::CNAME]={"\x03www\x02nl\x00"};
|
||||
|
||||
for(;;) {
|
||||
ComboAddress remote(local);
|
||||
DNSMessage dm;
|
||||
string message = SRecvfrom(udplistener, sizeof(dm), remote);
|
||||
if(message.size() < sizeof(dnsheader)) {
|
||||
cerr<<"Dropping query from "<<remote.toStringWithPort()<<", too short"<<endl;
|
||||
continue;
|
||||
}
|
||||
memcpy(&dm, message.c_str(), message.size());
|
||||
|
||||
if(dm.dh.qr || dm.dh.opcode) {
|
||||
cerr<<"Dropping non-query from "<<remote.toStringWithPort()<<endl;
|
||||
continue;
|
||||
}
|
||||
|
||||
dnsname name;
|
||||
dnstype type;
|
||||
dm.getQuestion(name, type);
|
||||
cout<<"Received a query from "<<remote.toStringWithPort()<<" for "<<name<<" and type "<<type<<endl;
|
||||
|
||||
DNSMessage response;
|
||||
response.dh = dm.dh;
|
||||
response.dh.ad = 0;
|
||||
response.dh.ra = 0;
|
||||
response.dh.aa = 0;
|
||||
response.dh.qr = 1;
|
||||
response.dh.ancount = response.dh.arcount = response.dh.nscount = 0;
|
||||
response.setQuestion(name, type);
|
||||
|
||||
dnsname zone;
|
||||
auto fnd = zones.find(name, zone);
|
||||
if(fnd && fnd->zone) {
|
||||
cout<<"Best zone: "<<zone<<", name now "<<name<<", loaded: "<<(void*)fnd->zone<<endl;
|
||||
|
||||
response.dh.aa = 1;
|
||||
|
||||
auto bestzone = fnd->zone;
|
||||
dnsname searchname(name), lastnode;
|
||||
auto rrsets = bestzone->find(searchname, lastnode);
|
||||
if(!rrsets) {
|
||||
cout<<"Found nothing in zone '"<<zone<<"' for lhs '"<<name<<"'"<<endl;
|
||||
}
|
||||
else {
|
||||
cout<<"Found something in zone '"<<zone<<"' for lhs '"<<name<<"', searchname now '"<<searchname<<"', lastnode '"<<lastnode<<"'"<<endl;
|
||||
if(rrsets->rrsets.count(type)) {
|
||||
cout<<"Had qtype too!"<<endl;
|
||||
for(const auto& rr : rrsets->rrsets[type]) {
|
||||
response.putRR({"powerdns", "org"}, type, 3600, rr);
|
||||
response.dh.ancount = htons(ntohs(response.dh.ancount)+1);
|
||||
}
|
||||
}
|
||||
else {
|
||||
cout<<"Node exists, but no matching qtype"<<endl;
|
||||
if(rrsets->rrsets.count((int)DNSType::CNAME)) {
|
||||
cout<<"We do have a CNAME!"<<endl;
|
||||
for(const auto& rr : rrsets->rrsets[(int)DNSType::CNAME]) {
|
||||
response.putRR({"www", "powerdns", "org"}, (dnstype)DNSType::CNAME, 3600, rr);
|
||||
response.dh.ancount = htons(ntohs(response.dh.ancount)+1);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
else {
|
||||
response.dh.rcode = (uint8_t)RCode::Refused;
|
||||
}
|
||||
SSendto(udplistener, response.serialize(), remote);
|
||||
}
|
||||
}
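
The label handling in `DNSMessage::getName` and `DNSMessage::putName` above can be tried in isolation. The following is a sketch under a few assumptions: it works on a `std::vector<uint8_t>` instead of the fixed `payload` array, the function names `parseName` and `serializeName` are invented for this example, and, like `tdns`, it rejects compressed names. It does not enforce the 255-octet limit on the total name length.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <stdexcept>
#include <string>
#include <vector>

typedef std::deque<std::string> dnsname;

// Parse an uncompressed name starting at 'pos'. On return, 'pos' points just
// past the zero octet that terminates the name.
dnsname parseName(const std::vector<uint8_t>& wire, std::size_t& pos)
{
  dnsname name;
  for(;;) {
    const uint8_t len = wire.at(pos++);  // at() throws if we run off the end
    if(len > 63)
      throw std::runtime_error("compressed or invalid label");
    if(!len)  // the empty root label ends the name
      break;
    if(pos + len > wire.size())
      throw std::out_of_range("label runs past end of message");
    name.emplace_back(wire.begin() + pos, wire.begin() + pos + len);
    pos += len;
  }
  return name;
}

// Serialize a name as length-prefixed labels followed by a zero octet.
std::vector<uint8_t> serializeName(const dnsname& name)
{
  std::vector<uint8_t> wire;
  for(const auto& label : name) {
    if(label.empty() || label.size() > 63)
      throw std::length_error("label must be 1-63 octets");
    wire.push_back(static_cast<uint8_t>(label.size()));
    wire.insert(wire.end(), label.begin(), label.end());
  }
  wire.push_back(0);
  return wire;
}
```

Serializing `{"www", "powerdns", "org"}` yields 18 octets: three length octets, fourteen label octets and the terminating zero. Parsing those octets gives the same three labels back.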
|