Domain Name System: Difference between revisions

From Citizendium
imported>Howard C. Berkowitz
(More intro)
mNo edit summary
 
(147 intermediate revisions by 17 users not shown)
Line 1: Line 1:
{{subpages}}
{{subpages}}
In the Internet, the '''Domain Name System (DNS)''' is a relatively old service, with the first specification, by Paul Mockapetris, going back to 1983.<ref name=RFC882>{{citation
On the [[Internet]], the '''[[Domain Name System]] (DNS)''' is a critically important [[directory service]] that translates between a raw [[IP address]] (such as ''207.46.197.32'') and a domain name (such as ''microsoft.com''). This allows people to interact with software via domain names, which are easier to remember than numerical IP addresses.
| author = Mockapetris, P.V.
| id = RFC882
| url = http://www.ietf.org/rfc/rfc0882.txt
| title = Domain names: Concepts and facilities
| publisher = Internet Engineering Task Force
| date = November 1983}}</ref> The first DNS implementation, written by Mockapetris, was called JEEVES. It replaced the arrangement used on the ARPANET (pre-Internet), which had few enough computers that a single file, <code>hosts.txt</code>, was sufficient to contain all connected computer names and their numeric addresses.<ref name=DNS-BIND>{{citation
| title = DNS and BIND, second edition
| first1 = Paul | last1 = Albitz | first2 = Cricket | last2 = Liu
| publisher = O'Reilly | year = 1997}} p. 9</ref> Its designers, however, did not think of it as anything like a search engine, with the ability to seek a name corresponding to an idea (e.g. "pizza"), but to work with explicit names already known by the application.


Berkeley Internet Name Domain (BIND), first deployed in BSD 4.3 UNIX and written by Kevin Dunlap, was the first widespread DNS implementation. BIND is now public domain code supported by the Internet Systems Consortium <ref>http://www.isc.org/index.pl</ref>.
More importantly, it allows computer-friendly but user-unfriendly IP addresses to change without affecting human users. Thus people can still expect to find the same information behind the user-friendly ''domain names'', and need not be concerned if Microsoft Corporation changes the IP address on one of its host computers, as the domain name ''microsoft.com'' is sufficient, thanks to DNS, to find their computers regardless of which IP address the Microsoft administrator has assigned to those hosts.


DNS is both a distributed database, and set of application protocols, with the original purpose of translating from human-readable '''domain names''' to '''Internet protocol (IP) addresses''' (i.e., '''forward DNS''') and from addresses to names (i.e., '''reverse DNS'''). <ref name=RFC1034>{{citation
DNS is a hierarchical [[federated database]], distributed widely across many host computers on the public Internet, and it also has a set of application protocols for interacting with the database. DNS names must comply with standards on the public Internet, but need not do so in a private internet where DNS is still useful.  The original purpose of DNS was to translate a domain name to an IP address ('''forward DNS'''), and an IP address to a domain name ('''[[reverse mapping|reverse DNS]]'''),<ref name=RFC1034>{{citation
  | id = RFC1034  
  | id = RFC1034  
  | title = Domain names - concepts and facilities
  | title = Domain names - concepts and facilities
Line 20: Line 11:
  | date = November 1987
  | date = November 1987
  | publisher = Internet Engineering Task Force
  | publisher = Internet Engineering Task Force
}}</ref>  Over the years, it has taken on more technical and administrative roles. The domain name space, as well as the address spaces both for [[Internet Protocol version 4]] and [[Internet Protocol version 6]] (IPv6) are under the authority of the [[Internet Corporation for Assigned Names and Numbers]] (ICANN), with much delegation of administration. The original system only handled IPv4, so one of the first steps for IPv6 support was defining how to represent IPv6 addresses in DNS. <ref name=RFC3363>{{citation
}}</ref> but in recent years there have been ongoing attempts to expand the purpose and functionality of DNS in the public Internet. Further, because the lookup process for DNS superficially appears to resemble the lookup process for searching on the world wide web, it has become easy to confuse the purposes of a DNS lookup with a search-engine lookup.  These two kinds of lookups have very different goals and occur at vastly different levels within the internet protocol stack.  This article will explain the functions and purposes of the Domain Name System, the nature of its distributed and hierarchical database, and the protocols for accessing it.  It will also note how the functions of DNS differ markedly from those of search engines, since this seems to be a matter of frequent confusion on the part of learners. In lay terms, you might think of DNS as like the ''white pages'' in a traditional phone book, and search engines as more like the ''yellow pages''.
  | id = RFC3363
 
  | title = Representing Internet Protocol version 6 (IPv6) Addresses in the  Domain Name System (DNS)
As the ''white page'' type lookup service of the public Internet, DNS has been attacked by hostile programs either attempting to disrupt Internet traffic or divert users to illicit host machines.  The distributed and simplistic approach taken by DNS has proved, historically, surprisingly resilient against such attacks, but as the size and importance of the public Internet has grown, so have the security concerns related to DNS.  This article, or its related sub-articles, will also address basic [[DNS security]] issues.
| author = Bush, R. ''et al.''
 
  | url = http://www.ietf.org/rfc/rfc3363.txt
{{TOC|right}}
  | date = August 2002
 
  | publisher = Internet Engineering Task Force
==History==
}}</ref>   
DNS was first introduced for use on the Internet in 1983, with the first specification written by Paul Mockapetris.<ref name=RFC882>{{citation
  | author = Mockapetris, P.V.
  | id = RFC882
  | url = http://www.ietf.org/rfc/rfc0882.txt
  | title = Domain names: Concepts and facilities
  | publisher = Internet Engineering Task Force
| date = November 1983}}</ref> Mockapetris' first DNS implementation was called JEEVES. It replaced the arrangement used on the ARPANET (pre-Internet), which had few enough computers that a single file, <code>hosts.txt</code>, was sufficient to contain all connected computer names and their numeric addresses.<ref name=DNS-BIND>{{citation
| title = DNS and BIND, second edition
| first1 = Paul | last1 = Albitz | first2 = Cricket | last2 = Liu
  | publisher = O'Reilly | year = 1997}} p. 9</ref> Its designers, however, did not think of it as anything like a search engine, with the ability to seek a name corresponding to an idea (e.g. "pizza"), but to work with explicit names already known by the application. Manually maintaining and sharing host files became impractical as the scale of the Internet grew, and DNS was designed and implemented as the solution to the problem of scalable [[host name]] resolution.


Later roles for DNS include providing additional information for the names and addresses, especially for security; the DNS infrastructure itself needed to be enhanced to be secure and trusted. <ref name=RFC4033>{{citation
'''Note well: all DNS was designed to do was replace the <code>hosts.txt</code> file that had the name to address mappings for <u>every</u> computer in the ARPANET.''' That's all. '''DNS was not designed to be a [[search engine]].''' Search engines hadn't been invented, since, after all, the Web had not been invented.  
| id = RFC4033
| title = DNS Security Introduction and Requirements
| author = Arends, R. ''et al.''
| url = http://www.ietf.org/rfc/rfc4033.txt
| date = March 2005
| publisher = Internet Engineering Task Force
}}</ref> DNS originally was manually configured, but there have needed to be a variety of extensions to allow dynamic operation, such as the temporary binding of an address to a name.  


Operationally, it was always expected that populating the Domain Name System data base would be cooperative.
{| class="wikitable"
{| class="wikitable"
<center>'''Original design goals for DNS'''</center>
|-
|-
! Protocol designers
! Protocol designers
Line 57: Line 50:
| Statements of the refresh policies desired
| Statements of the refresh policies desired
|}
|}
==New requirements==
[[Image:Security scope.png|thumb|350px|DNS security responsibilities]]
Over the years, it has taken on more technical and administrative roles. These include providing additional information for the names and addresses, especially for security; the DNS infrastructure itself needed to be enhanced to be secure and trusted. <ref name=RFC4033>{{citation
| id = RFC4033
| title = DNS Security Introduction and Requirements
| author = Arends, R. ''et al.''
| url = http://www.ietf.org/rfc/rfc4033.txt
| date = March 2005
| publisher = Internet Engineering Task Force
}}</ref> DNS originally was manually configured, but there have been a variety of extensions to allow dynamic operation, such as the temporary binding of an address to a name.
The domain name space, as well as the address spaces both for [[Internet Protocol version 4]] (IPv4) and [[Internet Protocol version 6]] (IPv6), are under the authority of the [[Internet Corporation for Assigned Names and Numbers]] (ICANN), with much delegation of administration. The original system only handled IPv4, so one of the first steps for IPv6 support was defining how to represent IPv6 addresses in DNS. <ref name=RFC3363>{{citation
| id = RFC3363
| title = Representing Internet Protocol version 6 (IPv6) Addresses in the  Domain Name System (DNS)
| author = Bush, R. ''et al.''
| url = http://www.ietf.org/rfc/rfc3363.txt
| date = August 2002
| publisher = Internet Engineering Task Force
}}</ref>  Berkeley Internet Name Domain (BIND), first deployed in BSD 4.3 UNIX and written by Kevin Dunlap, was the first widespread DNS implementation. BIND is now open-source software maintained by the Internet Systems Consortium.<ref>{{citation
| url = https://www.isc.org/software/bind
| publisher = Internet Systems Consortium
| title = BIND}}</ref>
In the years DNS has been in service, Internet technology and operational requirements have changed. When the IPv6 address format came into use, name-to-address mapping tools had to change to handle that format.
Less obvious, but still necessary, is the new requirement to track dynamically assigned addresses when there is no central address server. [[Domain Name System dynamic update]] can do such tracking, but dynamic update at this level is a security vulnerability. Address assignment spoofing is by no means the only threat to DNS, and an entire set of [[Domain Name System security]] ([[DNSSEC]]) extensions is being deployed.<ref name=RFC4033 />
The U.S. government now requires DNSSEC for all Federal information systems, effective December 2009.<ref name=OMB-DNSSEC>{{citation
| url = http://www.whitehouse.gov/omb/memoranda/fy2008/m08-23.pdf
| date = August 22, 2008
| first = Karen | last = Evans
| title = Securing the Federal Government’s Domain Name System Infrastructure (Submission of Draft Agency Plans Due by September 5, 2008)}}</ref>


==Domain name structure and schema==
==Domain name structure and schema==
Domain names are hierarchical. A name such as
[[Image:RevUKDNS-1.png|thumb|left|350px|Domain Name System tree section]]
<center><code>en.citizendium.com</code></center>
The DNS namespace is hierarchical. Individual domain and host names within it have a textual representation, from right to left, which mirrors the tree that makes up the schema of the DNS:
 
<center><code>en.citizendium.com</code></center><br>
 
appears to have three components, but actually has four. The naming hierarchy is a tree, with increasingly specific levels reading right to left.  
appears to have three components, but actually has four. The naming hierarchy is a tree, with increasingly specific levels reading right to left.  


From what can be seen in the example,
From what can be seen in the textual example,
*'''.com''' is a '''top-level domain (TLD)''' under the authority of a TLD registry.
*'''.com''' is a '''top-level domain (TLD)''' under the authority of a TLD registry.
*'''.citizendium''' is a '''second-level domain''' under the authority of a SLD registry (SLD)
*'''.citizendium''' is a '''second-level domain''' under the authority of a SLD registry (SLD)
*'''.en''' identifies either a subdomain or a host, as defined by the <code>citizendium.com</code> technical administrator.
*'''.en''' identifies either a subdomain or a host, as defined by the <code>citizendium.com</code> technical administrator.


What cannot be seen is the hierarchically highest part, the '''root'''. If a part usually suppressed were displayed,
What cannot be seen is the hierarchically highest, "zeroth" part, the '''root'''. If this usually suppressed part were displayed, the name would read:
<center><code>en.citizendium.com'''<u>.</u>'''</code></center>
<center><code>en.citizendium.com'''<u>.</u>'''</code></center>


The rightmost dot identifies the '''root''' of the DNS tree. In actual practice, there are multiple '''root servers''', for which addresses are in an explicit file, a representative of whih is found at <nowiki>http://www.internic.net/zones/named.root</nowiki>
The rightmost dot identifies the '''root''' of the DNS tree. In actual practice, there are multiple '''root servers''', for which addresses are in an explicit file, a representative of which is found at <code><nowiki>http://www.internic.net/zones/named.root</nowiki></code>


It is defined as: <blockquote>This file holds the information on root name servers needed to initialize cache of Internet domain name servers (e.g. reference this file in the "cache  .  <file>"  configuration file of BIND domain name servers).</blockquote>
It is defined as: <blockquote>This file holds the information on root name servers needed to initialize cache of Internet domain name servers (e.g. reference this file in the "cache  .  <file>"  configuration file of BIND domain name servers).</blockquote>
===Name servers===
Name servers are computers that contain information about domains, all the way up to the root. Be sure to understand the difference between the abstraction of a domain or subdomain namespace, and the '''zone file''' that describes the contents of that namespace and actually runs in a name server. Name  servers can contain more than one zone file; indeed, this is the usual case when there are domains with subdomains.


Zone files, minimally, contain four kinds of '''resource record''', of which the basics are:
A '''fully qualified domain name''' can be traced from the hierarchically lowest host name to the root. For example, <code>en.citizendium.org</code> goes from the host <code>en</code> all the way up to the top-level domain <code>.org</code>, which is connected to the root.
*'''Start of authority (SOA)''' Define the start of the zone file and the domain it describes
 
*'''Name server (NS)''': gives the IP address of a hierarchically higher name server to which the name server goes when it cannot complete a name-to-address or address-to-name mapping based on its own information
A computer within the second-level domain <code>citizendium.org</code> could refer to the subdomain <code>en</code>, which would be a '''relative domain name'''; most DNS applications would append the current domain to the right of the host name. <code>k12.en.citizendium.org</code> is a hypothetical subdomain of <code>en.citizendium.org</code>; an arbitrary host could be <code>larry.en.citizendium.org</code> and the DNS software would understand if it is dealing with a host or a domain.
*'''Address (A)''': map a name to an IP address. Basic A records deal with 32-bit [[Internet Protocol version 4]] addresses, while AAAA records handle 128-bit [[Internet Protocol version 6]].
 
*'''Pointer (PTR)''': do the reverse mapping of an address to a name
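The difference between a fully qualified and a relative domain name can be illustrated with a minimal Python sketch. The function names <code>qualify</code> and <code>labels_from_root</code> are hypothetical, chosen only for this illustration, and the sketch assumes that a trailing dot is the sole marker of a fully qualified name.

```python
from __future__ import annotations


def qualify(name: str, current_domain: str) -> str:
    """Return a fully qualified name ending in the root dot.

    A name that already ends in "." is treated as fully qualified;
    anything else is taken to be relative to current_domain.
    """
    if name.endswith("."):
        return name
    return f"{name}.{current_domain.rstrip('.')}."


def labels_from_root(fqdn: str) -> list[str]:
    """List the labels from the top-level domain down to the host."""
    return list(reversed(fqdn.rstrip(".").split(".")))


if __name__ == "__main__":
    print(qualify("en", "citizendium.org"))
    print(labels_from_root("en.citizendium.org."))
```

Real resolvers may apply a configurable search list rather than a single current domain; that refinement is left out here.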
==Domain name authority and issues==
===Name assignment===
The administrative process of DNS name assignment involves both '''DNS registries''' and '''DNS registrars'''
====DNS registries====
{{seealso|Domain Name System non-technical policy issues}}
DNS registries' fundamental role is to operate the data base for their top-level domain (TLD), and authorize registrars as "retail" agents to provide customer service. The bulk of TLDs are national, and use [[International Organization for Standardization]] (ISO) two-letter country codes (e.g., Canada = <tt>'''.ca'''</tt>, China = <tt>'''.cn'''</tt>, Germany = <tt>'''.de'''</tt>). In the majority of cases these country codes must be from the ISO 3166-1 list. However, there have been a few exceptions, usually for historical reasons. For example the ISO 3166-1 code for the United Kingdom is <tt>'''gb'''</tt>, but for historical reasons the assigned TLD is <tt>'''.uk'''</tt>. While the <tt>'''.gb'''</tt> TLD does exist, it has only one subdomain and does not accept new registrations.  A few country codes, such as Tuvalu's <tt>'''.tv'''</tt>, form attractive branding, and the country has few internal registrants but considerable income from outside registrants.  
 
New TLDs are created by the [[Internet Corporation for Assigned Names and Numbers]] (ICANN), who then delegates the registry function to an organization that contracts with ICANN. Some new or proposed TLDs have been quite controversial, such as the [[.xxx domain|<tt>'''.xxx'''</tt> domain]] for [[pornography]]. Others, which offer some competitive commercial service, may take much time and effort to create, since multiple organizations may want to be the registry.


The root name server zone file is expected to be retrieved, by anonymous [[FTP]], from various well-known sites approved by ICANN. In practice, most DNS implementations ship with a recent copy.  
Remember that the public Internet, while international from the start, began as a U.S. project. A small set of non-national TLDs were created for early convenience. Country codes were not, at first, used, and the majority of registrations still go into the best-known <tt>'''.com'''</tt>. While the "<tt>.cc</tt>" country codes had gradually been used, they were formalized in the 1998 U.S. Department of Commerce White Paper about moving the U.S. government out of Internet operations.


Root servers remain very busy. <ref name=DNS-BIND />
Some countries have a rational system where they use the "traditional" major suffix, or a variant of it, as a second-level domain, such as <tt>'''.co.uk'''</tt>, or <tt>'''.ac.uk'''</tt>. This, however, has not always been done in an intuitive or consistent manner. A relatively naive user might expect <tt>.com.uk</tt> to be correct in line with the international <tt>'''.com'''</tt>, but <tt>'''.co.uk'''</tt> is in fact correct. Based on this the user may then think that <tt>.or.uk</tt> would be the equivalent of <tt>'''.org'''</tt>, but in this case <tt>'''.org.uk'''</tt> is correct.<ref>{{citation
| url = http://www.nominet.org.uk/digitalAssets/3257_dotukrevisited.pdf
| author = Dyer, Stephen
| title = .UK – Revisited
| date = October 1, 2004}}</ref> Similarly one would expect that either <tt>.edu.uk</tt> or <tt>.ed.uk</tt> would correspond to <tt>'''.edu'''</tt>. But neither of these is correct; instead <tt>'''.ac.uk'''</tt> is used for higher education colleges and universities, and <tt>'''.sch.uk'''</tt> for primary and secondary schools.


{| class="wikitable"
<center>'''Representative non-national TLD registries'''</center>
|-
! Top-level domain
! Registry
! Comments
|-
| .aero
| Société Internationale de Télécommunications Aéronautiques SC, (SITA)
| Sponsored by air transport industry
|-
| .com
| Verisign
| Unsponsored
|-
| .edu
| Educause
| Under U.S. government agreement, ending in 2011
|-
| .net
| Verisign
| Unsponsored
|-
| .mil
| [[Defense Information Systems Agency]]
| U.S. government agency
|-
| .org
| Public Interest Registry (PIR)
| Unsponsored; not-for-profit
|-
| .biz
|  NeuLevel, Inc.
| Unsponsored
|-
|}
There is a continuing business, political, and technical argument about the desirability of more TLDs, especially from those that want TLDs that are suggestive of the business purpose of a registrant.  From a technical standpoint, while a proliferation of TLDs would not, as once suspected, seriously impact DNS performance, it would be likely to increase customer support cost due to the likelihood of making mistakes and getting the wrong domain.
There are also [[#legal issues|legal issues]] of [[intellectual property]] involved in domain disputes.
====DNS registrars====
Registrars are the "retail" side of DNS operation. In .com and many other TLDs, they are profit-making entities. They deal with organizations that wish to acquire particular domain names, verifying the name is available, and then handling the administrative interaction with the domain registry.
Most registrars are reasonable and ethical. They may be subdivisions of companies that can sell additional services, such as web server hosting, to domain registrants. Frequently, they have user support functions that will help new DNS administrators set up their zone files, or they may actually operate name servers on behalf of registrants. If there is a dispute over the rights to a domain name, one's registrar can be a valuable ally.
There are registrars that compete for the business of large hosting centers and other organizations that need many domain names, typically discounting the registration fee to multiple-domain customers.  It is to the advantage of a registrar to keep its existing customers, as most domains will be renewed, producing a continuing income stream. Registrars want to avoid "churn", a name for customers changing to other registrars.
Some registrars, unfortunately, act against the original Internet tradition of the network being a shared resource and DNS being a service. Domain registrations expire annually, although one can pay the registrar to renew them automatically. It is not uncommon for certain registrars to look for domain names that were registered by a different registrar and expire in the near term, and to send the domain administrators what appear to be legitimate renewal notices. If a notice is completed and returned with payment, such a registrar will indeed renew the domain name, but will transfer it away from the existing registrar.
===Legal and business issues associated with domain names===
When the [[ARPANET]], and then the [[Internet]], were new, DNS was seen as a simple mechanism to avoid memorizing or typing host addresses. As the Internet became more commercial, domain names acquired business value, since new users were apt to look for "company" at <code>company.com</code>. Indeed, as unpleasant to the DNS-knowledgeable ear as it may be, there are a substantial number of enterprises that have "dot-com", or sometimes other TLDs, as part of their corporate name.
Another argument, the details of which involve [[intellectual property]] issues beyond the scope of this article, is the legal theory that a [[trademark]] must be "defended" or risks going into the public domain. If a second-level domain is identical to a trademarked company name, does the company have exclusive rights to it? Intellectual property attorneys have often argued that a well-known-company is not "defending" its trademark if it allows a domain to be created with its name, so there has been a tendency that whenever some TLD "<tt>.new</tt>" is created, trademark holders rush to register "<tt>well-known-company.new</tt>".  Speculators, meanwhile, rush to do so before the trademark holder can do so, and, if successful, sell the rights to the domain at a very high price.
One especially hotly argued issue is whether sexually-oriented businesses should have a <tt>[[.xxx]]</tt> TLD; some of those arguing for it also want to restrict access to sexually-oriented content, which would be identified by the TLD. Obviously, there would be no way to enforce keeping sexually-oriented content in <tt>.xxx</tt>, but it could reasonably be assumed that, if a domain were in <tt>.xxx</tt>, it was sexually-oriented. After six years of debate the <tt>.xxx</tt> TLD was approved in June 2010, and is expected to be launched in early 2011.<ref>ICM Registry (June 25, 2010), [http://www.icmregistry.com/blog/?p=306 ICM Registry welcomes approval of .xxx]</ref>
==Name servers and zone files==
One of the most confusing things to newcomers to DNS is the difference between a domain and a zone. One way to look at it is that a domain declares a range of potential names, while the zone defines the names actually in use.  Formally, a [sub]domain is a '''namespace''' that need not have names in it. The basic source of name information that goes into a particular space is a '''zone file''', created manually or with software assistance.
Let us consider <tt>citizendium.org</tt>, which could have every valid character string as a subdomain, from (to keep the example short) <tt>aaaa.citizendium.org</tt> to <tt>zzzz.citizendium.org</tt>. These are domains, comparable to the Citizendium name spaces such as Main, Talk, User, and CZ, in the sense that, ignoring lengths, the Main or Talk name spaces can have articles from Aaaa to Zzzz. Not all those article names, however, are meaningful.
If, however, the only actual hosts are <tt>en.citizendium.org</tt>, <tt>test.citizendium.org</tt>, <tt>reid.citizendium.org</tt>, and <tt>locke.citizendium.org</tt>, Citizendium's zone file would have only four host entries.  To continue the analogy with CZ name spaces, the zone file would be the set of articles, in each name space, which actually exist.  Main: Zzzz is not an article; Main: Zero is an article.
[[Image:Initial population.png|thumb|250px|Populating a primary name server]]
Just as the DNS namespace is a tree of domains, the actual information in that namespace can be regarded as a tree of zone files.
'''Name servers''' are computers that hold information about domains, at every level up to the root. Be sure to understand the difference between the abstraction of a domain or subdomain namespace, and the '''[[#Basic Implementation|zone file]]''' that describes the contents of that namespace and is actually loaded by a name server. The '''primary name server''' is ''authoritative'' for a domain, and contains the master copy of the zone file for that domain.
Name  servers can contain more than one zone file; indeed, this is the usual case when there are domains with subdomains. 
Depending on the implementation, a name server may cache information in addition to what it learned from the zone file.  For example, a local cache file in a name server could contain data about name-address relationships outside the domain, but which have been needed by a client within that domain. The name server may also contain limited-lifetime dynamic name updates, which might or might not be accessible from outside the domain.
RFC1034, the basic DNS conceptual specification, describes two ways of looking up names, one required and one optional.<ref>RFC1034, pp. 3-4</ref> The same logic is relevant inside a domain that has caching nameservers.
*Iterative (required): the server refers the client to another server and lets the client pursue the query; the client is aware of multiple nameservers but is only interacting with one at a time
*Recursive (optional): the first server pursues the query for the client at another server; the client is aware of only one DNS server
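The two lookup styles can be contrasted with a toy Python simulation. Everything in it is an assumption made for illustration: the "servers" are in-memory dictionaries, the server names and the address <code>192.0.2.10</code> are invented, and no real DNS messages are exchanged.

```python
# Toy in-memory "servers": each either answers a name or refers to another server.
SERVERS = {
    "root": {"referrals": {"org.": "org-server"}},
    "org-server": {"referrals": {"citizendium.org.": "cz-server"}},
    "cz-server": {"answers": {"en.citizendium.org.": "192.0.2.10"}},
}


def query(server, name):
    """Ask one server one question; it answers, refers, or reports no such name."""
    data = SERVERS[server]
    if name in data.get("answers", {}):
        return ("answer", data["answers"][name])
    for suffix, next_server in data.get("referrals", {}).items():
        if name == suffix or name.endswith("." + suffix):
            return ("referral", next_server)
    return ("nxdomain", None)


def resolve_iteratively(name, start="root"):
    """The client follows each referral itself, one server at a time."""
    server, contacted = start, []
    while True:
        contacted.append(server)
        kind, value = query(server, name)
        if kind != "referral":
            return value, contacted
        server = value


def resolve_recursively(name, server="root"):
    """The server pursues referrals on the client's behalf and returns only the result."""
    kind, value = query(server, name)
    if kind == "referral":
        return resolve_recursively(name, value)
    return value


if __name__ == "__main__":
    print(resolve_iteratively("en.citizendium.org."))
    print(resolve_recursively("en.citizendium.org."))
```

In the iterative case the caller ends up knowing every server that was contacted; in the recursive case it sees only the final result, which mirrors the description above.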
===Domains versus zones===
===Domains versus zones===
At each of these levels is an abstract '''namespace'''.  No other second-level domain could have '''notcz.citizendium.com''', but the administrator of '''citizendium.com''' is not obligated to have any number of subordinate hosts or domains. There is a subtle distinction between the abstraction of a name space, and a '''zone file''' that actually defines the hosts and subdomains in the zone.
At each of the levels of the DNS hierarchy &mdash; top-level, second level, etc. &mdash; is an abstract '''namespace'''.  No other second-level domain could have <tt>notcz.citizendium.org</tt>, but the administrator of <tt>citizendium.org</tt> is not obligated to have any number of subordinate hosts or domains. There is a subtle distinction between the abstraction of a name space, and a '''zone file''' that actually defines the hosts and subdomains in the zone.  Name spaces define possible records; zone files contain actual records within that space, plus a few special cases such as "glue" records to name servers outside that space.  <tt>wikipedia.citizendium.org</tt> is part of the <tt>citizendium.org</tt> namespace, but, since there is no such host, it is not in any zone file.


*anycast
===Resource records===
*FQDN
Zone files are made up of '''resource records (RR)'''. All RRs have several common properties:
*relative domain name
*'''owner''': the domain in which the authoritative RR resides. This is often implicitly derived from context, perhaps relative to the current domain name
*'''type''': an encoded 16 bit value that defines the type of resource defined by the current records.  Some types are obsolete, while others continue to be added for new DNS functions.
*'''class''': an obsolete but required field, it is a 16 bit value for the protocol family with which the RR is associated. The only value used is the "''Internet''", textually represented as '''IN'''
*'''time to live''': commonly called '''TTL''', this parameter specifies how long the RR may be kept in a cache and assumed to be valid. It is a 32 bit integer, whose value is measured in seconds
*'''RDATA''': type-specific data about the resource
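The role of the time to live can be sketched as a small cache that forgets a record once its TTL has elapsed. The class name <code>RecordCache</code> and its methods are hypothetical; a real caching name server tracks considerably more state.

```python
import time


class RecordCache:
    """Minimal TTL-respecting cache keyed by (owner, type)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so that expiry can be tested
        self._entries = {}       # (owner, rr_type) -> (rdata, expiry time)

    def put(self, owner, rr_type, rdata, ttl):
        """Store a record that stays valid for ttl seconds from now."""
        self._entries[(owner, rr_type)] = (rdata, self._clock() + ttl)

    def get(self, owner, rr_type):
        """Return the cached RDATA, or None if it is absent or expired."""
        entry = self._entries.get((owner, rr_type))
        if entry is None:
            return None
        rdata, expires = entry
        if self._clock() >= expires:
            del self._entries[(owner, rr_type)]
            return None
        return rdata


if __name__ == "__main__":
    cache = RecordCache()
    cache.put("en.citizendium.org.", "A", "192.0.2.10", ttl=300)
    print(cache.get("en.citizendium.org.", "A"))
```

The address <code>192.0.2.10</code> is a documentation placeholder, not a real record.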


While there are many graphic tools for creating RRs, the basic textual syntax is:
<center><code>'''[owner]      [class]      [type]      [rdata]'''</code></center>
For example, the RR defining the address associated with the name XX.LCS.MIT.EDU, in class '''IN''' and of type '''A''', is:<ref>Note that the actual RR has a terminal period that does not appear when the DNS name is written in other uses</ref>
<center><code>'''XX.LCS.MIT.EDU. IN      A      10.0.0.44'''</code></center>
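The example record consists of whitespace-separated fields: owner, class, type and RDATA. A minimal Python sketch of splitting such a line follows. The names <code>ResourceRecord</code> and <code>parse_rr</code> are hypothetical, and the sketch assumes the simple one-line form with an optional numeric TTL after the owner; real zone files also allow omitted fields, comments and multi-line records, which it does not handle.

```python
from typing import NamedTuple, Optional


class ResourceRecord(NamedTuple):
    owner: str
    ttl: Optional[int]
    rr_class: str
    rr_type: str
    rdata: str


def parse_rr(line: str) -> ResourceRecord:
    """Split a one-line textual RR into its fields."""
    fields = line.split()
    owner, rest = fields[0], fields[1:]
    ttl = None
    if rest and rest[0].isdigit():       # optional TTL, in seconds
        ttl, rest = int(rest[0]), rest[1:]
    rr_class, rr_type = rest[0].upper(), rest[1].upper()
    rdata = " ".join(rest[2:])           # RDATA may itself contain several fields
    return ResourceRecord(owner, ttl, rr_class, rr_type, rdata)


if __name__ == "__main__":
    print(parse_rr("XX.LCS.MIT.EDU. IN A 10.0.0.44"))
```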
{| class="wikitable"
<center><u>'''RR types in current use'''</u></center>
|-
! Type
! RR Name
! Function
! Typical RDATA
|-
| SOA
| Start Of Authority
| Defines the start of a zone or a subzone; subordinate records inherit parameters
| Multiple fields
|-
| A
| Address [[IPv4]]
| Specifies the IPv4 address for a host
| IPv4 Address
|-
| AAAA
| Address [[IPv6]]
| Specifies the IPv6 address for a host
| IPv6 Address
|-
| PTR
| "Pointer"
| Reverse mapping of address to name
| Name
|-
| CNAME
| Canonical name
| Specifies that the owner name is an alias for another, canonical name
| Name
|-
| NS
| Name server
| Names a name server that is authoritative for the owner domain
| Name
|-
| MX
| Mail exchanger
| Identifies a host that accepts mail for the owner domain
| A 16 bit preference value (lower is better) followed by a host name willing to act as a mail exchange for the owner domain.
|-
|}
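The PTR row can be made concrete by showing the owner name under which a reverse-mapping record is looked up. This sketch relies on the <code>reverse_pointer</code> attribute of Python's standard <code>ipaddress</code> module; the helper name <code>reverse_name</code> is hypothetical, and the <code>in-addr.arpa</code> and <code>ip6.arpa</code> suffixes it produces are the conventional reverse-lookup domains, which the text above does not spell out.

```python
import ipaddress


def reverse_name(address: str) -> str:
    """Return the fully qualified owner name of the PTR record for an address."""
    return ipaddress.ip_address(address).reverse_pointer + "."


if __name__ == "__main__":
    # IPv4 octets are reversed and placed under in-addr.arpa.
    print(reverse_name("10.0.0.44"))
    # IPv6 addresses are expanded to hexadecimal digits, reversed, under ip6.arpa.
    print(reverse_name("2001:db8::1"))
```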
===Wildcards in Resource Records===
An additional complexity of RRs is that their owner names may contain '''wildcards'''. A wildcard is a " <tt>*</tt> " character used as the leftmost label of a name; it matches any name under that domain for which no explicit record exists. In specific situations, this is an extremely useful function, but it can complicate troubleshooting.<ref name=RFC4592>{{citation
| id=RFC4592
| title = The Role of Wildcards in the Domain Name System
| author = E. Lewis
| date = July 2006
| url = http://www.ietf.org/rfc/rfc4592.txt}}</ref>
In 2003, Verisign, which operates the .com registry, inserted a wildcard into the master DNS files, so that an undefined name, rather than returning an error message, would be redirected to one of the registry's commercial [[search engine]]s.<ref name=ICANN-Wild>{{citation
| url = http://www.icann.org/en/topics/wildcard-history.html
| title = Verisign's Wildcard Service Deployment
| author = [[Internet Corporation for Assigned Names and Numbers]]}}</ref> If the [[World Wide Web]] were the only application on the [[Internet]], this redirection, although revenue-generating, might have been useful. Unfortunately, many other applications depend on the DNS. In particular, [[messaging application protocols]] such as the [[Simple Mail Transfer Protocol]] (SMTP) use the "host not found" response to conclude that mail to a host is undeliverable; with the wildcard in place, undefined names no longer produced that response.
A wildcard is genuinely useful, however, in a [[split DNS]] application, with different name resolution policies on different sides of a firewall. On the public [[Internet]] side of the firewall, the DNS server for <code>example.com</code> would have explicit records for the organization's public web server, mail server, and other public servers. Any reference to "inside" addresses, however, would be handled by the record:
<center><code>'''*.example.com  IN A [outside address of the firewall]'''</code></center>
[[Domain Name System security]], however, does not have a complete solution to working with wildcarded RRs.
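The precedence between explicit records and a wildcard can be sketched as follows. This is a simplified illustration, not the full matching algorithm of RFC 4592 (which also considers names that exist between the query name and the wildcard). The zone contents, the function name <code>lookup_a</code>, and the documentation-range addresses are all assumptions made for the example.

```python
def lookup_a(zone, qname):
    """Return the A-record data for qname, falling back to a wildcard."""
    qname = qname.lower().rstrip(".")
    if qname in zone:                      # an explicit record always wins
        return zone[qname]
    labels = qname.split(".")
    # Try progressively shorter parents: *.b.example.com, *.example.com, ...
    for i in range(1, len(labels)):
        candidate = "*." + ".".join(labels[i:])
        if candidate in zone:
            return zone[candidate]
    return None                            # a real server would report "no such name"


# Hypothetical public-side zone for a split DNS deployment
public_zone = {
    "www.example.com": "192.0.2.10",
    "mail.example.com": "192.0.2.25",
    "*.example.com": "192.0.2.1",          # stands in for the firewall's outside address
}

print(lookup_a(public_zone, "www.example.com"))
print(lookup_a(public_zone, "payroll.example.com"))
```

The second lookup shows why troubleshooting gets harder: a mistyped name resolves to the firewall rather than failing.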
==Deploying DNS==
To understand basic DNS, assume that it is being used in a single organization, which has one technical and administrative authority in control. In other words, the domain and its subdomains are homogeneous. While there may be minor exceptions due to the existence of temporarily cached data in individual clients and servers, and not all clients and servers may be able to view all parts of the highest-level domain, a single organization's DNS is essentially a [[distributed database]], where there are multiple copies of a single "golden copy" of information.
Once one starts interconnecting domains under different authority, as in the Internet, both administrative and technical aspects change. First, it is understood that while the total collection of all domains conceptually has access to all public name information, no single domain will have a copy of all information. Rather than being a distributed database, the DNS has become a [[federated data base]], where there is a common indexing and retrieval model, but requests may need to go to multiple servers, in multiple domains and subdomains, before the request is satisfied.
Second, even between well-recognized business partner organizations, there are trust issues. Third, there are [[miscreant]]s actively attacking the DNS, for reasons from ideology to technical status to pure criminal revenue.
===Basic Implementation===
The administrator of a homogeneous domain (and its subdomains) starts by building a zone file that defines the names and addresses of hosts in that zone, optional additional information to be added to the responses, and a reference to a higher-level nameserver that helps connect the domain of the zone to other domains. For example, a host in <code>'''a.com'''</code> would have to go to the nameserver of <code>'''.com'''</code> to find the address of the <code>'''b.com'''</code> nameserver.
====SOA Resource Record====
The zone/domain name starts the record; it must end with a trailing period. Assume that it is <code>sub.example.com.</code>
In the resource data, the first field is the primary name server for this domain, the original source of its zone data, as opposed to the ''NS'' records, which list every name server authoritative for the domain. In this case, it might be <code>ns1.sub.example.com.</code>
Next comes the mail address of the person or [[role]] responsible for the data in this domain, written not in the conventional <code>user@domain</code>, but in the syntax of a DNS name in a zone file. To create a mail address, replace the leftmost period with an "@" symbol and remove the trailing period. <br>
" <code>administrator.sub.example.com.</code> " is changed to " <code>administrator@sub.example.com</code> ".
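The conversion rule just described is small enough to express directly. The function name <code>rname_to_email</code> is hypothetical, and the sketch assumes the local part contains no escaped periods, which zone-file syntax permits but which are not handled here.

```python
def rname_to_email(rname):
    """Convert the SOA responsible-person field to a conventional mail address."""
    name = rname[:-1] if rname.endswith(".") else rname   # remove the trailing period
    local, sep, domain = name.partition(".")              # split at the leftmost period
    if not sep or not domain:
        raise ValueError("expected at least two labels: %r" % rname)
    return local + "@" + domain


print(rname_to_email("administrator.sub.example.com."))
```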
Following the administrator are several parameters that may have defaults, but should be known. The first is the serial number of this version of the zone file, which will increase whenever this file is updated.
The next four are timers for the domain, specified in seconds:
*'''refresh interval''': Secondary name servers in the domain should check the primary for new data after this number of seconds expires
*'''retry interval''': If the secondary was unable to get an update when the refresh interval expires, this parameter tells the secondary how long to wait before retrying. The value in this field is usually less than the refresh interval
*'''expire interval''': If the secondary was unable to get an update before this timer expires, it should assume that all of the RR information in its copy of the zone file is out of date. If this timer triggers, the secondary server will stop responding to DNS requests for the zone
*'''TTL''': The default TTL for RRs in this zone. An appropriate TTL is controversial, and may be quite different on an internal nameserver versus one accessible from the Internet. The shorter the interval, the more accurate is the data, and, further, the better it is for name-based load distribution schemes. The longer the interval, the less DNS traffic is generated
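How the refresh, retry, and expire timers interact on a secondary can be sketched as a small decision function. This is an illustrative model only, not the behavior of any particular name server; the class <code>SoaTimers</code>, the function <code>secondary_action</code>, and the timer values are assumptions.

```python
from dataclasses import dataclass


@dataclass
class SoaTimers:
    refresh: int   # seconds between routine checks of the primary
    retry: int     # seconds to wait after a failed check
    expire: int    # seconds after which the local copy is no longer trusted


def secondary_action(timers, since_last_success, since_last_attempt):
    """Decide what a secondary should do, given elapsed times in seconds."""
    if since_last_success >= timers.expire:
        return "stop-answering"            # the zone copy has expired
    if since_last_success < timers.refresh:
        return "serve"                     # the copy is still fresh
    # A refresh is overdue: poll now, then again every `retry` seconds
    if since_last_attempt >= timers.retry:
        return "poll-primary"
    return "serve"                         # keep serving while waiting to retry


timers = SoaTimers(refresh=3600, retry=600, expire=604800)
print(secondary_action(timers, since_last_success=3900, since_last_attempt=300))
```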
====Other Resource Records====
; NS : gives the name of a name server that is authoritative for a domain. In a parent zone, NS records delegate a subdomain to its name servers, which is how a query that cannot be answered from local information is passed along the hierarchy.
; A ''and'' AAAA : give the authoritative host name and its address, and, optionally, the TTL if different from the zone TTL.
; PTR : gives an address (written as a name in the reverse-mapping domain) and the corresponding host name, and, optionally, the TTL if different from the zone TTL.
; CNAME : gives an alternative host name (an alias) and the canonical name to which it refers, and, optionally, the TTL if different from the zone TTL.
===Resource Record sets (RRsets)===
While no two RRs should have the same label, type, and data, it is perfectly possible to have RRs with the same label and type but different RDATA. For example, a physically multihomed server could have four network interface cards (NICs), each on a different subnet. The set of addresses for this host name (i.e., label) would reasonably form a set of four A records with different address data. Such a set of records is called a '''Resource Record Set''' (RRset).<ref name=RFC2181>{{citation
| id = RFC2181
| title = Clarifications to the DNS Specification
| author = R. Elz, R. Bush
| url = http://www.ietf.org/rfc/rfc2181.txt
| date = July 1997
| publisher = Internet Engineering Task Force
}}</ref>
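Grouping records into RRsets amounts to keying them by label and type. The following sketch assumes records are given as (label, type, data) tuples; the host name and the documentation-range addresses are hypothetical. Storing the data in a set also discards any record that repeats the same label, type, and data.

```python
from collections import defaultdict


def build_rrsets(records):
    """Group (label, type, data) tuples into RRsets keyed by (label, type)."""
    rrsets = defaultdict(set)
    for label, rtype, rdata in records:
        rrsets[(label.lower(), rtype.upper())].add(rdata)   # exact duplicates collapse
    return dict(rrsets)


# A multihomed host with four interfaces, plus one accidental duplicate line
records = [
    ("multi.example.com.", "A", "192.0.2.1"),
    ("multi.example.com.", "A", "198.51.100.1"),
    ("multi.example.com.", "A", "203.0.113.1"),
    ("multi.example.com.", "A", "203.0.113.129"),
    ("multi.example.com.", "A", "192.0.2.1"),
    ("multi.example.com.", "AAAA", "2001:db8::1"),
]
rrsets = build_rrsets(records)
print(sorted(rrsets[("multi.example.com.", "A")]))
```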
===Obtaining root information===
The root name server zone file is expected to be retrieved, by anonymous [[FTP]], from various well-known sites approved by ICANN. In practice, most DNS implementations ship with a recent copy. Root servers remain very busy. <ref name=DNS-BIND /> In fact, while the root server zone file mentioned above will give the names and addresses of root servers in the general form
<center><code>a.root-servers.net</code></center>
the address of a particular server is of the [[anycast]] type; <ref>{{citation
| url = http://www.root-servers.org/presentations/rootops-gac-rio.pdf
| title = Operation of the Root Name Servers
| author = Liman, Lars-Johan ''et al''}}</ref> there are multiple physical computers with that address, for fault tolerance and load sharing.


==Domain naming administration and issues==
===Name assignment===
*registry
*registrars
===Implementation===
For each domain, there must be at least one, and preferably more than one '''name server''' that holds the zone files. '''Primary''' domain servers have the authoritative zone files, and '''secondary''' domain servers keep an exact copy of the primary's zone file. Both types are assumed to have a disk or other storage from which they can restore the domain information.
[[Image:Initial population with trusted externals.png|thumb|300px|left|Zone transfer adds to populating a server database]]
A secondary server will use a '''zone transfer''' to obtain the primary zone file for its domain. There are various operational reasons why a physical server might act as primary and secondary for multiple zones; the important point here is that a zone transfer, as opposed to ordinary DNS retrieval, alters the contents of the definitions and must be treated as a sensitive operation.
[[Image:Including dynamic updateV2.png|250px|thumb|Adding trusted dynamic updates]]
The nameserver can also accept dynamic updates, which, strictly speaking, do not have to be secured; but dynamic update, especially in an IPv6 environment, is so open an invitation to miscreants that it should never be considered without being secured. DNS security is the normal way this might be done, but there are alternatives, such as an encrypted link between the update source and the nameserver.


There are also '''caching-only''' servers that contain only the names and addresses that have been recently looked up, and are still valid with respect to the '''time to live (TTL)''' parameter in the relevant records.
[[Image:Distribution within zone.png|300px|left|thumb|Resolvers, their caches, and their information sources]]
The program, on a host, which is the client of DNS servers is most often called a '''resolver'''.  Depending on the local network architectural implementation, a resolver may go to a caching-only server, a secondary server, or the primary server for its information. It may retain a cache of recently retrieved DNS information, clearing items from cache as their TTLs expire.
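A resolver cache that clears items as their TTLs expire can be sketched in a few lines. The class name <code>ResolverCache</code> and its methods are hypothetical; the clock is injectable so that expiry can be demonstrated without waiting.

```python
import time


class ResolverCache:
    """Minimal name-to-address cache that honours each entry's TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}                 # name -> (address, expiry time)

    def put(self, name, address, ttl):
        self._entries[name.lower()] = (address, self._clock() + ttl)

    def get(self, name):
        key = name.lower()
        entry = self._entries.get(key)
        if entry is None:
            return None                    # never cached: ask a server
        address, expires = entry
        if self._clock() >= expires:       # TTL has run out: clear the item
            del self._entries[key]
            return None
        return address


now = [0.0]                                # fake clock for the demonstration
cache = ResolverCache(clock=lambda: now[0])
cache.put("www.example.com", "192.0.2.10", ttl=300)
print(cache.get("www.example.com"))
now[0] = 300.0
print(cache.get("www.example.com"))
```

A cache miss, whether from expiry or from a name never seen, is the point at which the resolver would go to a caching-only, secondary, or primary server.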


===Heterogeneous DNS===
{{main|Split DNS|}}
While there will be different federated databases, DNS is certainly not limited to the public Internet. It is quite common for organizations to have '''split DNS''' "inside the firewall" and "outside the firewall". An inside user will query local DNS for the address of an internal machine and get the address of the actual host, but, if the user asks for the address of <code>citizendium.com</code>, the address returned by DNS may well be that of the "inside" interface of a [[firewall]], or other security [[middlebox]].<ref name=RFC3303>{{citation
| id = RFC3303
| title = Middlebox communication architecture and framework.
| author = P. Srisuresh, J. Kuthan, J. Rosenberg, A. Molitor, A. Rayhan
| date = August 2002
| url = http://www.ietf.org/rfc/rfc3303.txt
}}</ref> Depending on the firewall implementation, it may deny access, or create a proxy connection to the outside host. To establish that connection, the middlebox will query an "outside" DNS, which contains the addresses of the organization's public hosts, but primarily contains the addresses of external hosts. In some cases, that outside DNS enjoys some trust with an external organization, and may do secured zone transfers. More often, however, the outside DNS is primarily a cache of name-address information that it obtained by queries to the nameservers of other domains.


==DNS protocols==
The most basic DNS protocols are the '''lookup service''', which runs over port 53 of the connectionless [[User Datagram Protocol]], and the '''zone transfer service''', which also runs over port 53 of the connection-oriented [[Transmission Control Protocol]].<ref name=RFC1035>{{citation
  | id = RFC1035
  | title = Domain names - implementation and specification
  | url = http://www.ietf.org/rfc/rfc1035.txt
  | publisher = Internet Engineering Task Force
}}</ref> Lookup is a read-only function, while a zone transfer rewrites the receiving server's copy of the zone and should be implemented as a privileged, authenticated operation. Otherwise any client on a DNS server's network could request a [[zone transfer]] and receive a complete copy of a zone file, which is a security risk.
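To make the lookup service concrete, the following sketch builds the bytes of a simple query message in the RFC 1035 wire format: a 12-byte header followed by one question. It only constructs the message and does not contact any server. The function name <code>build_query</code>, the fixed query ID, and the choice of an A-record question are assumptions made for illustration.

```python
import struct


def build_query(qname, qtype=1, query_id=0x1234, recursion_desired=True):
    """Build a DNS query message: header plus a single question."""
    flags = 0x0100 if recursion_desired else 0x0000        # RD bit
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = b""
    for label in qname.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError("invalid label length in %r" % qname)
        question += bytes([len(raw)]) + raw                # length-prefixed label
    # Zero-length root label, then QTYPE and QCLASS (1 = IN)
    question += b"\x00" + struct.pack("!HH", qtype, 1)
    return header + question


message = build_query("example.com")
print(len(message), message.hex())
```

A client would typically send such a message in a single UDP datagram to port 53 of its configured server, for example with the <code>sendto</code> method of a datagram socket, and parse the reply using the same header layout.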


There are also protocols for dynamic update, so that [[network clients]] can automatically update their DNS servers to reflect correct hostnames (e.g. if they dynamically receive a different IP address via [[DHCP]]). This concept is also known as Dynamic DNS.<ref name=RFC2136>{{citation
  | id = RFC2136
  | title = Dynamic Updates in the Domain Name System (DNS UPDATE)
  | editor = Vixie, P.
  | url = http://www.ietf.org/rfc/rfc2136.txt
  | date = April 1997
  | publisher = Internet Engineering Task Force
}}</ref>
==Extended applications==
These include [[Domain Name System dynamic update]], use of the DNS as a data base in [[public key infrastructure|Public Key Infrastructure (PKI)]] for general security, [[Domain Name System security]] ([[DNSSEC]]) and name-based routing and load distribution.
==References==
{{reflist|2}}
 
[[Category:Flagged for Review]][[Category:Suggestion Bot Tag]]

Latest revision as of 06:01, 8 August 2024

This article has a Citable Version.
This editable Main Article has an approved citable version (see its Citable Version subpage). While we have done conscientious work, we cannot guarantee that this Main Article, or its citable version, is wholly free of mistakes. By helping to improve this editable Main Article, you will help the process of generating a new, improved citable version.

On the Internet, the Domain Name System (DNS) is a critically important directory service that translates to and from a raw IP address (such as 207.46.197.32) and a domain name (such as microsoft.com). This allows people to interact with software via domain names, which are easier to remember than numerical IP addresses.

More importantly, it allows computer-friendly but user-unfriendly IP addresses to change without affecting human users. Thus people can still expect to find the same information behind the user-friendly domain names, and need not be concerned if Microsoft Corporation changes the IP address on one of its host computers, as the domain name microsoft.com is sufficient, thanks to DNS, to find their computers regardless of which IP address the Microsoft administrator has assigned to those hosts.

DNS is a hierarchical federated database, distributed widely across many host computers on the public Internet, and it also has a set of application protocols for interacting with the database. DNS names must comply with standards on the public Internet, but need not do so in a private internet where DNS is still useful. The original purpose of DNS was to translate a domain name to an IP address (forward DNS), and an IP address to a domain name (reverse DNS),[1] but in recent years there have been ongoing attempts to expand the purpose and functionality of DNS in the public Internet. Further, because the lookup process for DNS superficially appears to resemble the lookup process for searching on the world wide web, it has become easy to confuse the purposes of a DNS lookup with a search-engine lookup. These two kinds of lookups have very different goals and occur at vastly different levels within the internet protocol stack. This article will explain the functions and purposes of the Domain Name System, the nature of its distributed and hierarchical database, and the protocols for accessing it. It will also note how the functions of DNS differ markedly from those of search engines, since this seems to be a matter of frequent confusion on the part of learners. In lay terms, you might think of DNS as like the white pages in a traditional phone book, and search engines as more like the yellow pages.

As the white page type lookup service of the public Internet, DNS has been attacked by hostile programs either attempting to disrupt Internet traffic or divert users to illicit host machines. The distributed and simplistic approach taken by DNS has proved, historically, surprisingly resilient against such attacks, but as the size and importance of the public Internet has grown, so have the security concerns related to DNS. This article, or its related sub-articles, will also address basic DNS security issues.

History

DNS was first introduced for use on the Internet in 1983, with the first specification written by Paul Mockapetris.[2] Mockapetris' first DNS implementation was called JEEVES. It replaced the approach used in the ARPANET (pre-Internet) environment, which had few enough computers that a single file, hosts.txt, was sufficient to contain all connected computer names and their numeric addresses.[3] Its designers, however, did not think of DNS as anything like a search engine, with the ability to seek a name corresponding to an idea (e.g. "pizza"); it was meant to work with explicit names already known by the application. Manually maintaining and sharing host files became impractical as the scale of the Internet grew, and DNS was designed and implemented as the solution to the problem of scalable host name resolution.

Note well: all DNS was designed to do was replace the hosts.txt file that had the name to address mappings for every computer in the ARPANET. That's all. DNS was not designed to be a search engine. Search engines hadn't been invented, since, after all, the Web had not been invented.

{| class="wikitable"
|+ Original design goals for DNS
|-
! Protocol designers
! Name & address authorities
! System administrators
|-
| Standard formats for resource data
| Addresses for the root servers
| The definition of zone boundaries
|-
| Standard methods for querying the database
| Unique assignments of domain names
| Master files of data (i.e., sets of Resource Records (RRs))
|-
| Standard methods for name servers to refresh local data from foreign name servers
| Operation, perhaps with delegation, of the root servers and top-level domain servers
| Statements of the refresh policies desired
|}

New requirements

DNS security responsibilities

Over the years, DNS has taken on more technical and administrative roles. These include providing additional information for the names and addresses, especially for security; the DNS infrastructure itself needed to be enhanced to be secure and trusted.[4] DNS originally was manually configured, but there have been a variety of extensions to allow dynamic operation, such as the temporary binding of an address to a name.

The domain name space, as well as the address spaces for both Internet Protocol version 4 (IPv4) and Internet Protocol version 6 (IPv6), is under the authority of the Internet Corporation for Assigned Names and Numbers (ICANN), with much delegation of administration. The original system only handled IPv4, so one of the first steps for IPv6 support was defining how to represent IPv6 addresses in DNS.[5] Berkeley Internet Name Domain (BIND), first deployed in BSD 4.3 UNIX and written by Kevin Dunlap, was the first widespread DNS implementation. BIND is now freely available open-source code supported by the Internet Systems Consortium.[6]

In the years DNS has served, Internet technology and operational issues changed. When the new IPv6 address format came into use, the need to change name-to-address mapping tools to handle that format is understandable.

Less obvious, but still necessary, is the new requirement to track dynamically assigned addresses when there is no central address server. Domain Name System dynamic update can do such tracking, but dynamic update at this level is a security vulnerability. Address assignment spoofing is by no means the only threat to DNS, and an entire set of Domain Name System security (DNSSEC) extensions is being deployed.[4]

The U.S. government now requires DNSSEC for all Federal information systems, effective December 2009.[7]

Domain name structure and schema

Domain Name System tree section

The DNS namespace is hierarchical. Individual domain and host names within it have a textual representation, from right to left, which mirrors the tree that makes up the schema of the DNS:

en.citizendium.org

appears to have three components, but actually has four. The naming hierarchy is a tree, with increasingly specific levels reading right to left.

From what can be seen in the textual example,

  • .org is a top-level domain (TLD) under the authority of a TLD registry.
  • .citizendium is a second-level domain (SLD) under the authority of an SLD registry.
  • .en identifies either a subdomain or a host, as defined by the citizendium.org technical administrator.

What cannot be seen is the hierarchically "zeroth" highest part, the root. If the part that is usually suppressed were displayed, the name would read:

en.citizendium.org.

The rightmost dot identifies the root of the DNS tree. In actual practice, there are multiple root servers, for which addresses are in an explicit file, a representative of which is found at http://www.internic.net/zones/named.root

It is defined as:

This file holds the information on root name servers needed to initialize cache of Internet domain name servers (e.g. reference this file in the "cache . <file>" configuration file of BIND domain name servers).

A fully qualified domain name can be traced from the hierarchically lowest host name to the root. For example, en.citizendium.org goes from the host en all the way up to the top-level domain .org, which is connected to the root.

A computer within the second-level domain citizendium.org could refer to the subdomain en, which would be a relative domain name; most DNS applications would append the current domain to the right of the host name. k12.en.citizendium.org is a hypothetical subdomain of en.citizendium.org; an arbitrary host could be larry.en.citizendium.org and the DNS software would understand if it is dealing with a host or a domain.
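The two ideas above, completing a relative name with the current domain and tracing a fully qualified name up to the root, can be sketched as follows. The function names <code>qualify</code> and <code>path_to_root</code> are hypothetical, and the sketch treats a trailing period as the marker of a fully qualified name.

```python
def qualify(name, current_domain):
    """Append the current domain to a relative name; leave absolute names alone."""
    if name.endswith("."):
        return name                        # already fully qualified
    return name + "." + current_domain.rstrip(".") + "."


def path_to_root(fqdn):
    """List each level from the given name up to the root, most specific first."""
    labels = fqdn.rstrip(".").split(".")
    path = [".".join(labels[i:]) + "." for i in range(len(labels))]
    path.append(".")                       # the usually suppressed root
    return path


name = qualify("en", "citizendium.org")
print(name)
print(path_to_root(name))
```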

Domain name authority and issues

Name assignment

The administrative process of DNS name assignment involves both DNS registries and DNS registrars

DNS registries

See also: Domain Name System non-technical policy issues

DNS registries' fundamental role is to operate the data base for their top-level domain (TLD), and authorize registrars as "retail" agents to provide customer service. The bulk of TLDs are national, and use International Organization for Standardization (ISO) two-letter country codes (e.g., Canada = .ca, China = .cn, Germany = .de). In the majority of cases these country codes must be from the ISO 3166-1 list. However, there have been a few exceptions, usually for historical reasons. For example the ISO 3166-1 code for the United Kingdom is gb, but for historical reasons the assigned TLD is .uk. While the .gb TLD does exist, it has only one subdomain and does not accept new registrations. A few country codes, such as Tuvalu's .tv, form attractive branding, and the country has few internal registrants but considerable income from outside registrants.

New TLDs are created by the Internet Corporation for Assigned Names and Numbers (ICANN), who then delegates the registry function to an organization that contracts with ICANN. Some new or proposed TLDs have been quite controversial, such as the .xxx domain for pornography. Others, which offer some competitive commercial service, may take much time and effort to create, since multiple organizations may want to be the registry.

Remember that the public Internet, while international from the start, began as a U.S. project. A small set of non-national TLDs were created for early convenience. Country codes were not, at first, used, and the majority of registrations still go into the best-known .com. While the ".cc" country codes had gradually been used, they were formalized in the 1998 U.S. Department of Commerce White Paper about moving the U.S. government out of Internet operations.

Some countries have a rational system where they use the "traditional" major suffix, or a variant of it, as a second-level domain, such as .co.uk, or .ac.uk. This, however, has not always been done in an intuitive or consistent manner. A relatively naive user might expect .com.uk to be correct in line with the international .com, but .co.uk is in fact correct. Based on this the user may then think that .or.uk would be the equivalent of .org, but in this case .org.uk is correct.[8] Similarly one would expect that either .edu.uk or .ed.uk would correspond to .edu. But neither of these is correct, and instead .ac.uk is used for higher education colleges and universities, and .sch.uk for primary and secondary schools.

{| class="wikitable"
|+ Representative non-national TLD registries
|-
! Top-level domain
! Registry
! Comments
|-
| .aero
| Société Internationale de Télécommunications Aéronautiques SC (SITA)
| Sponsored by air transport industry
|-
| .com
| Verisign
| Unsponsored
|-
| .edu
| Educause
| Under U.S. government agreement, ending in 2011
|-
| .net
| Verisign
| Unsponsored
|-
| .mil
| Defense Information Systems Agency
| U.S. government agency
|-
| .org
| Public Interest Registry (PIR)
| Unsponsored; not-for-profit
|-
| .biz
| NeuLevel, Inc.
| Unsponsored
|}

There is a continuing business, political, and technical argument about the desirability of more TLDs, especially from those that want TLDs that are suggestive of the business purpose of a registrant. From a technical standpoint, while a proliferation of TLDs would not, as once suspected, seriously impact DNS performance, it would be likely to increase customer support cost due to the likelihood of making mistakes and getting the wrong domain.

There are also legal issues of intellectual property involved in domain disputes.

DNS registrars

Registrars are the "retail" side of DNS operation. In .com and many other TLDs, they are profit-making entities. They deal with organizations that wish to acquire particular domain names, verifying the name is available, and then handling the administrative interaction with the domain registry.

Most registrars are reasonable and ethical. They may be subdivisions of companies that can sell additional services, such as web server hosting, to domain registrants. Frequently, they have user support functions that will help new DNS administrators set up their zone files, or they may actually operate name servers on behalf of registrants. If there is a dispute over the rights to a domain name, one's registrar can be a valuable ally.

There are registrars that compete for the business of large hosting centers and other organizations that need many domain names, typically discounting the registration fee to multiple-domain customers. It is to the advantage of a registrar to keep its existing customers, as most domains will be renewed, producing a continuing income stream. Registrars want to avoid "churn", a name for customers changing to other registrars.

Some registrars, unfortunately, act against the original Internet tradition of it being a shared resource, and DNS being a service. Domain registrations expire annually, although one can pay the registrar to renew it automatically. It is not uncommon for certain registrars to look for domain names that expire in the near term, domains that were registered by a different registrar, and send the domain administrators what appear to be legitimate renewal notices. If completed and returned with payment, such a registrar will indeed renew the domain name — but transfer it away from the existing registrar.

Legal and business issues associated with domain names

When the ARPANET, and then the Internet, were new, DNS was seen as a simple mechanism to avoid memorizing or typing host addresses. As the Internet became more commercial, domain names acquired business value, since new users were apt to look for "company" at company.com. Indeed, as unpleasant to the DNS-knowledgeable ear as it may be, there are a substantial number of enterprises that have "dot-com", or sometimes other TLDs, as part of their corporate name.

Another argument, the details of which involve intellectual property issues beyond the scope of this article, is the legal theory that a trademark must be "defended" or risks going into the public domain. If a second-level domain is identical to a trademarked company name, does the company have exclusive rights to it? Intellectual property attorneys have often argued that a well-known-company is not "defending" its trademark if it allows a domain to be created with its name, so there has been a tendency that whenever some TLD ".new" is created, trademark holders rush to register "well-known-company.new". Speculators, meanwhile, rush to do so before the trademark holder can do so, and, if successful, sell the rights to the domain at a very high price.

One especially hotly argued issue is whether sexually-oriented businesses should have a .xxx TLD; some of those arguing for it also want to restrict access to sexually-oriented content, which would be identified by the TLD. Obviously, there would be no way to enforce keeping sexually-oriented content in .xxx, but it could reasonably be assumed that, if a domain were in .xxx, it was sexually-oriented. After six years of debate the .xxx TLD was approved in June 2010, and is expected to be launched in early 2011.[9]

Name servers and zone files

One of the most confusing things to newcomers to DNS is the difference between a domain and a zone. One way to look at it is that a domain declares a range of potential names, while the zone defines the names actually in use. Formally, a [sub]domain is a namespace that need not have names in it. The basic source of name information that goes into a particular space is a zone file, created manually or with software assistance.

Let us consider citizendium.org, which could have every valid character string as a subdomain, from the shortened aaaa.citizendium.org to zzzz.citizendium.org. Those are domains, comparable to the Citizendium namespaces such as Main, Talk, User, and CZ, in the sense that, ignoring lengths, the Main or Talk namespaces can have articles from Aaaa to Zzzz. Not all those article names, however, are meaningful.

If, however, the only actual hosts are en.citizendium.org, test.citizendium.org, reid.citizendium.org, and locke.citizendium.org, Citizendium's zone file would have only four host entries. To continue the analogy with CZ namespaces, the zone file would be the set of articles, in each namespace, which actually exist. Main: Zzzz is not an article; Main: Zero is an article.

Populating a primary name server

Just as the DNS namespace is a tree of domains, the actual information in that namespace can be regarded as a tree of zone files.

Name servers are computers that contain information about domains, all the way up to the root. Be sure to understand the difference between the abstraction of a domain or subdomain namespace, and the zone file that describes the contents of that namespace and actually runs in a name server. The primary name server is authoritative for domains, and contains the master copy of the zone file for that domain.

Name servers can contain more than one zone file; indeed, this is the usual case when there are domains with subdomains.

Depending on the implementation, a name server may cache information in addition to what it learned from the zone file. For example, a local cache file in a name server could contain data about name-address relationships outside the domain, but which have been needed by a client within that domain. The name server may also contain limited-lifetime dynamic name updates, which might or might not be accessible from outside the domain.

RFC1034, the basic DNS conceptual specification, describes two ways of looking up names: the iterative method, which every name server must support, and the recursive method, which is optional.[10] The same logic is relevant inside a domain that has caching nameservers.

  • Iterative: the server refers the client to another server and lets the client pursue the query; the client is aware of multiple nameservers but is only interacting with one at a time
  • Recursive: the first server pursues the query for the client at another server; the client is aware of only one DNS server
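The difference between the two lookup styles can be sketched with a toy model in Python. Everything here is invented for illustration: the server names, the referral data, and the functions are assumptions, not part of any DNS software, and real root servers do not normally perform recursion for clients.

```python
# Toy model: each "server" holds either an answer or a referral to another server.
# All server names and data are invented for illustration.
SERVERS = {
    "root": {"referrals": {"org": "org-server"}, "answers": {}},
    "org-server": {"referrals": {"example.org": "example-server"}, "answers": {}},
    "example-server": {"referrals": {}, "answers": {"www.example.org": "192.0.2.10"}},
}


def query(server, name):
    """Ask one server; return ("answer", address), ("referral", server) or ("error", None)."""
    data = SERVERS[server]
    if name in data["answers"]:
        return ("answer", data["answers"][name])
    # Choose the referral whose suffix matches the most of the name.
    matches = [s for s in data["referrals"] if name == s or name.endswith("." + s)]
    if not matches:
        return ("error", None)
    return ("referral", data["referrals"][max(matches, key=len)])


def resolve_iterative(name, start="root"):
    """The client follows each referral itself, one server at a time."""
    contacted = []
    server = start
    while True:
        contacted.append(server)
        kind, value = query(server, name)
        if kind != "referral":
            return value, contacted
        server = value


def recursive_query(server, name):
    """The server pursues referrals on behalf of the client."""
    kind, value = query(server, name)
    if kind == "referral":
        return recursive_query(value, name)
    return value


def resolve_recursive(name, server="root"):
    """The client contacts a single server and waits for the final result."""
    return recursive_query(server, name), [server]
```

In the iterative case the client sees every server on the path; in the recursive case it sees only the first one, which matches the two descriptions above.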

Domains versus zones

At each level of the DNS hierarchy (top-level, second-level, and so on) there is an abstract namespace. No other second-level domain could contain notcz.citizendium.org, but the administrator of citizendium.org is not obligated to define any particular number of subordinate hosts or domains. There is a subtle distinction between the abstraction of a name space and a zone file that actually defines the hosts and subdomains in the zone. Name spaces define possible records; zone files contain actual records within that space, plus a few special cases such as "glue" records for name servers outside that space. wikipedia.citizendium.org is part of the citizendium.org namespace, but, since there is no such host, it is not in any zone file.
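As a minimal sketch of this distinction, the following Python fragment treats the namespace as a rule about what names are possible and the zone as a set of names that actually exist. The host names are the four used earlier in this article; the function names are hypothetical.

```python
DOMAIN = "citizendium.org"

# Names that actually have records (the four hosts named earlier in the article).
ZONE = {
    "en.citizendium.org",
    "test.citizendium.org",
    "reid.citizendium.org",
    "locke.citizendium.org",
}


def in_namespace(name, domain=DOMAIN):
    """True if the name could exist under the domain (the domain itself or a descendant)."""
    name = name.rstrip(".").lower()
    return name == domain or name.endswith("." + domain)


def in_zone(name, zone=ZONE):
    """True only if the zone actually defines the name."""
    return name.rstrip(".").lower() in zone
```

With these definitions, wikipedia.citizendium.org passes in_namespace but fails in_zone.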

Resource records

Zone files are made up of resource records (RR). All RRs have several common properties:

  • owner: the domain in which the authoritative RR resides. This is often implicitly derived from context, perhaps relative to the current domain name
  • type: an encoded 16 bit value that identifies the kind of resource described by this record. Some types are obsolete, while others continue to be added for new DNS functions.
  • class: a largely vestigial but required field, it is a 16 bit value for the protocol family with which the RR is associated. The only value in general use is the "Internet", textually represented as IN
  • time to live: commonly called TTL, this parameter specifies how long the RR may be kept in a cache and assumed to be valid. It is a 32 bit integer, whose value is measured in seconds
  • RDATA: type-specific data about the resource

While there are many graphic tools for creating RRs, the basic textual syntax is:

[owner] [TTL] [class] [type] [rdata]

where the TTL is optional and the class is almost always IN.

For example, the RR defining the address associated with the name XX.LCS.MIT.EDU[11] is:

XX.LCS.MIT.EDU. IN A 10.0.0.44
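
The common properties listed above map naturally onto a small data structure. The following Python sketch is illustrative only: ResourceRecord and parse_rr are hypothetical names rather than part of any DNS library, the class is assumed to default to IN when omitted, and the parser handles only a single-line record whose owner is written out in full.

```python
from dataclasses import dataclass
from typing import Optional

KNOWN_CLASSES = {"IN", "CH", "HS"}


@dataclass
class ResourceRecord:
    owner: str
    rr_class: str
    rr_type: str
    rdata: str
    ttl: Optional[int] = None  # None means "use the zone default"


def parse_rr(line):
    """Parse one record of the form: owner [ttl] [class] type rdata..."""
    fields = line.split()
    if not fields:
        raise ValueError("empty record")
    owner, rest = fields[0], fields[1:]
    ttl = None
    rr_class = "IN"  # assumed default when the class is omitted
    # TTL and class are both optional and may appear in either order.
    while rest and (rest[0].isdigit() or rest[0].upper() in KNOWN_CLASSES):
        token = rest.pop(0)
        if token.isdigit():
            ttl = int(token)
        else:
            rr_class = token.upper()
    if len(rest) < 2:
        raise ValueError(f"record needs a type and RDATA: {line!r}")
    return ResourceRecord(owner, rr_class, rest[0].upper(), " ".join(rest[1:]), ttl)
```

Applied to the example record above, parse_rr yields the owner XX.LCS.MIT.EDU., class IN, type A and RDATA 10.0.0.44, with no explicit TTL.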

RR types in current use

Type  | RR name            | Function                                                                           | Typical RDATA
SOA   | Start Of Authority | Defines the start of a zone or a subzone; subordinate records inherit parameters   | Multiple fields
A     | Address IPv4       | Specifies the IPv4 address for a host                                              | IPv4 address
AAAA  | Address IPv6       | Specifies the IPv6 address for a host                                              | IPv6 address
PTR   | "Pointer"          | Reverse mapping of address to name                                                 | Name
CNAME | Canonical name     | Declares the owner name to be an alias for another (canonical) name                | Name
NS    | Name server        | Names an authoritative name server for a zone; in a parent zone, delegates a subdomain to that server | Host name
MX    | Mail exchanger     | Names a host willing to act as a mail exchange for the owner domain                | A 16 bit preference value (lower is better) followed by a host name

Wildcards in Resource Records

An additional complexity of RRs is that they may contain wildcards. The simplest example is a " * " character that will match any string in a name expression. In specific situations, this is an extremely useful function, but it can complicate troubleshooting.[12]

In 2003, Verisign, which operates the .com registry, inserted a wildcard into the master DNS files, so that an undefined name, rather than returning an error message, would be redirected to one of the registry's commercial search engines.[13] If the World Wide Web were the only function on the Internet, this might, although revenue-generating, have been useful. Unfortunately, there are many other functions on the Internet. In particular, messaging application protocols such as the Simple Mail Transfer Protocol (SMTP) use the "host not found" information to conclude that mail to that host is undeliverable; with the wildcard in place, that error was never returned.

A legitimately useful application of a wildcard, however, would be in a split DNS application, with different name resolution policies on different sides of a firewall. On the public Internet side of the firewall, the DNS server for example.com would have explicit records for the organization's public web server, mail server, and other public servers. Any reference to "inside" addresses, however, would be handled by the record:

*.example.com IN A [outside address of the firewall]
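
How a server might choose between explicit records and a wildcard can be sketched in Python. This is a simplified model, assuming a zone held as a dictionary of owner names to addresses; the lookup and name_exists functions and all addresses are hypothetical, and record types other than A are ignored. It does follow one rule from RFC4592: a wildcard applies only directly beneath the closest existing ancestor of the queried name.

```python
def name_exists(zone, name):
    """A name exists if it owns a record or is an ancestor of a name that does."""
    return any(owner == name or owner.endswith("." + name) for owner in zone)


def lookup(zone, name):
    """Return the address for name: explicit record first, then a wildcard.

    zone maps owner names (without trailing period) to addresses.
    Returns None where a real server would report that there is no answer.
    """
    name = name.rstrip(".").lower()
    if name_exists(zone, name):
        # The name exists, so no wildcard applies, even if it has no address.
        return zone.get(name)
    labels = name.split(".")
    for i in range(1, len(labels)):
        ancestor = ".".join(labels[i:])
        if name_exists(zone, ancestor):
            # Closest existing ancestor found; only its own wildcard may match.
            return zone.get("*." + ancestor)
    return None


# Hypothetical public-side zone: two explicit hosts plus the firewall wildcard.
PUBLIC_ZONE = {
    "www.example.com": "203.0.113.10",
    "mail.example.com": "203.0.113.25",
    "*.example.com": "203.0.113.1",  # stands in for the firewall's outside address
}
```

Here lookup(PUBLIC_ZONE, "payroll.example.com") falls through to the wildcard, while the explicitly defined www and mail names keep their own addresses.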

Domain Name System security, however, does not have a complete solution to working with wildcarded RRs.

Deploying DNS

To understand basic DNS, assume that it is being used in a single organization, which has one technical and administrative authority in control. In other words, the domain and its subdomains are homogeneous. While there may be minor exceptions due to the existence of temporarily cached data in individual clients and servers, and not all clients and servers may be able to view all parts of the highest-level domain, a single organization's DNS is essentially a distributed database, where there are multiple copies of a single "golden copy" of information.

Once one starts interconnecting domains under different authority, as in the Internet, both administrative and technical aspects change. First, it is understood that while the total collection of all domains conceptually has access to all public name information, no single domain will have a copy of all information. Rather than being a distributed database, it has become a federated database, where there is a common indexing and retrieval model, but requests may need to go to multiple servers, in multiple domains and subdomains, before the request is satisfied.

Second, even between well-recognized business partner organizations, there are trust issues. Third, there are miscreants actively attacking the DNS, for reasons from ideology to technical status to pure criminal revenue.

Basic Implementation

The administrator of a homogeneous domain (and its subdomains) starts by building a zone file that defines the names and addresses of hosts in that zone, optional additional information to be added to the responses, and a reference to a higher-level nameserver that helps connect the domain of the zone to other domains. For example, if one were in a.com, one would have to go to the nameserver of .com to find the address of the b.com nameserver.

SOA Resource Record

The zone/domain name starts the record; it must end with a trailing period. Assume that it is sub.example.com.

In the resource data, the first field is the name of the primary name server for this zone. (The zone's NS records, by contrast, list all of its authoritative name servers, primary and secondary.) In this case, it might be ns1.sub.example.com.

Next comes the mail address of the person or role responsible for the data in this domain, written not in the conventional user@domain, but in the syntax of a DNS name in a zone file. To create a mail address, replace the leftmost period with an "@" symbol and remove the trailing period.
" administrator.sub.example.com. " is changed to " administrator@sub.example.com ".
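
The conversion rule just described is mechanical, and can be sketched in Python. The function names are hypothetical. A mailbox whose local part itself contains a period needs an escaped period in the zone file; this sketch rejects that case rather than handling it.

```python
def rname_to_email(rname):
    """Convert the SOA mailbox field to a conventional mail address."""
    name = rname.strip()
    if name.endswith("."):
        name = name[:-1]  # remove the trailing period
    local, separator, domain = name.partition(".")  # split at the leftmost period
    if not separator or not local or not domain:
        raise ValueError(f"not a usable mailbox name: {rname!r}")
    return f"{local}@{domain}"


def email_to_rname(email):
    """Convert a mail address to the form used in an SOA record."""
    local, separator, domain = email.strip().partition("@")
    if not separator or not local or not domain:
        raise ValueError(f"not a mail address: {email!r}")
    if "." in local:
        raise ValueError("a period in the local part needs escaping; not handled here")
    return f"{local}.{domain}."
```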

Following the administrator are several parameters that may have defaults, but should be known. The first is the serial number of this version of the zone file, which must be increased whenever the file is updated so that secondary name servers can detect the change.

The next four are timers for the domain, specified in seconds:

  • refresh interval: Secondary name servers in the domain should check the primary for new data after this number of seconds expires
  • retry interval: If the secondary was unable to get an update when the refresh interval expires, this parameter tells the secondary how long to wait before retrying. The value in this field is usually less than the refresh interval
  • expire interval: If the secondary was unable to get an update before this timer expires, it should assume that all of the RR information in its copy of the zone file is no longer valid. If this timer triggers, the secondary server will stop responding to DNS requests for the zone
  • TTL: The default TTL for RRs in this zone. An appropriate TTL is controversial, and may be quite different on an internal nameserver versus one accessible from the Internet. The shorter the interval, the more current the data, and, further, the better it is for name-based load distribution schemes. The longer the interval, the less DNS traffic is generated
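
The interaction of the serial number with the refresh, retry, and expire timers can be sketched as a small decision function in Python. This is an illustrative model rather than the behaviour of any particular name server: the names, the returned state strings, and the sample timer values are assumptions, and the serial comparison ignores the wrap-around arithmetic that real implementations use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SoaTimers:
    serial: int
    refresh: int  # seconds between routine checks of the primary
    retry: int    # seconds to wait after a failed check
    expire: int   # seconds after which unrefreshed data must be discarded


def secondary_state(soa, since_last_success, since_failed_attempt: Optional[int] = None):
    """Decide what a secondary should do next; all times are in seconds."""
    if since_last_success >= soa.expire:
        return "expired"  # stop answering queries for the zone
    if since_last_success < soa.refresh:
        return "serve"  # data is fresh enough; nothing to do
    # The refresh interval has passed, so the primary should be checked.
    if since_failed_attempt is None or since_failed_attempt >= soa.retry:
        return "check-primary"
    return "serve-and-wait"  # a check failed recently; wait out the retry interval


def needs_transfer(local_serial, primary_serial):
    """Simplified: transfer the zone when the primary's serial is larger."""
    return primary_serial > local_serial


# Sample values chosen only for illustration.
EXAMPLE = SoaTimers(serial=42, refresh=3600, retry=600, expire=604800)
```

Note that the sample retry value is smaller than the refresh value, as the list above recommends.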

Other Resource Records

NS
names an authoritative name server for a zone. NS records in a parent zone delegate a subdomain to the servers they name; the IP addresses of those servers are supplied by separate A or AAAA records.
A and AAAA
code the authoritative host name and its address, and, optionally, the TTL if different from the zone TTL.
PTR
code an address and the corresponding host name, and, optionally, the TTL if different from the zone TTL.
CNAME
code an alternative host name (an alias) and the canonical host name to which it refers, and, optionally, the TTL if different from the zone TTL.

Resource Record sets (RRsets)

While no two RRs should have the same label and type and data all equal, it is perfectly possible to have RRs with the same label and type, but different RDATA. For example, a physically multihomed server could have four network interface cards (NIC), each on a different subnet. The set of addresses for this host name (i.e., label) would reasonably form a set of four A records with different address data. Such a set of records is called a Resource Record Set (RRSet). [14]
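
Grouping records into RRsets amounts to collecting the RDATA of every record that shares a label and type. A minimal Python sketch follows; build_rrsets is a hypothetical name, the record tuples are a simplification that omits class and TTL, and the host name and addresses are invented.

```python
from collections import defaultdict


def build_rrsets(records):
    """Group (owner, type, rdata) tuples into RRsets keyed by (owner, type)."""
    rrsets = defaultdict(list)
    for owner, rr_type, rdata in records:
        key = (owner.lower(), rr_type.upper())
        if rdata not in rrsets[key]:  # identical label, type and data: keep one copy
            rrsets[key].append(rdata)
    return dict(rrsets)


# A multihomed server with four interfaces, each on a different subnet.
RECORDS = [
    ("server.example.com.", "A", "192.0.2.1"),
    ("server.example.com.", "A", "192.0.2.129"),
    ("server.example.com.", "A", "198.51.100.1"),
    ("server.example.com.", "A", "203.0.113.1"),
    ("server.example.com.", "A", "203.0.113.1"),  # exact duplicate, dropped
    ("server.example.com.", "AAAA", "2001:db8::1"),
]
```

For these records, the RRset for the A type contains four addresses and the RRset for the AAAA type contains one.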

Obtaining root information

The root name server zone file is expected to be retrieved, by anonymous FTP, from various well-known sites approved by ICANN. In practice, most DNS implementations ship with a recent copy. Root servers remain very busy. [3] In fact, while the root server zone file mentioned above will give the names and addresses of root servers in the general form

a.root-servers.net

the address of a particular server is of the anycast type; [15] there are multiple physical computers with that address, for fault tolerance and load sharing.

For each domain, there must be at least one, and preferably more than one name server that holds the zone files. Primary domain servers have the authoritative zone files, and secondary domain servers keep an exact copy of the primary's zone file. Both types are assumed to have a disk or other storage from which they can restore the domain information.

Zone transfer adds to populating a server database

A secondary server will use a zone transfer to obtain the primary zone file for its domain. There are various operational reasons why a physical server might act as primary and secondary for multiple zones; the important point here is that a zone transfer, as opposed to ordinary DNS retrieval, alters the zone data held by the receiving server and must be treated as a sensitive operation.

Adding trusted dynamic updates

The nameserver can also accept dynamic updates, which, strictly speaking, do not have to be secured, but dynamic update, especially in an IPv6 environment, is so open an invitation to miscreants that it should never be considered without being secured. DNS security is the normal way this might be done, but there are other alternatives, such as an encrypted link between the update source and the nameserver.

There are also caching-only servers that contain only the names and addresses that have been recently looked up, and are still valid with respect to the TTL parameter in the relevant records.

Resolvers, their caches, and their information sources

The program, on a host, which is the client of DNS servers is most often called a resolver. Depending on the local network architectural implementation, a resolver may go to a caching-only server, a secondary server, or the primary server for its information. It may retain a cache of recently retrieved DNS information, clearing items from cache as their TTLs expire.
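
A resolver cache that discards entries as their TTLs expire can be sketched in a few lines of Python. ResolverCache is a hypothetical class written for this illustration; it stores one address per name and takes an injectable clock so that its behaviour can be checked without waiting.

```python
import time


class ResolverCache:
    """Minimal name-to-address cache that honours each record's TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (address, expiry time)

    def put(self, name, address, ttl):
        """Store an answer; ttl is in seconds, as in the resource record."""
        self._entries[name.lower()] = (address, self._clock() + ttl)

    def get(self, name):
        """Return the cached address, or None if absent or expired."""
        key = name.lower()
        entry = self._entries.get(key)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # clear the item once its TTL has run out
            return None
        return address
```

A miss (None) is the point at which a real resolver would send a query to a caching-only, secondary, or primary server.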

Heterogeneous DNS

For more information, see: Split DNS.

While there will be different federated databases, DNS is certainly not limited to the public Internet. It is quite common for organizations to have split DNS "inside the firewall" and "outside the firewall". An inside user will query local DNS for the address of an internal machine and get the address of the actual host, but, if it asks for the address of citizendium.com, the address returned by DNS may well be that of the "inside" interface of a firewall, or other security middlebox.[16] Depending on the firewall implementation, it may deny access, or create a proxy connection to the outside host. To establish that connection, the middlebox will query an "outside" DNS, which contains the addresses of the organization's public hosts, but primarily contains the addresses of external hosts. In some cases, that outside DNS enjoys some trust with an external organization, and may do secured zone transfers. More often, however, the outside DNS is primarily a cache of name-address information that it obtained by queries to the nameservers of other domains.
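
One common way to realize split DNS is to select the answer set according to the address of the client. The Python sketch below assumes this approach; the internal network range, host names, addresses, and the split_lookup function are all invented for illustration and do not describe any specific product.

```python
import ipaddress

# Assumed internal address range; a real deployment would list its own networks.
INSIDE_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]

# Inside clients see real internal hosts; outside clients see only public ones.
INSIDE_VIEW = {
    "payroll.example.com": "10.1.2.3",
    "www.example.com": "10.1.2.80",
}
OUTSIDE_VIEW = {
    "www.example.com": "203.0.113.80",
}


def split_lookup(client_ip, name):
    """Answer from the inside or outside view depending on the client's address."""
    client = ipaddress.ip_address(client_ip)
    inside = any(client in network for network in INSIDE_NETWORKS)
    view = INSIDE_VIEW if inside else OUTSIDE_VIEW
    return view.get(name.rstrip(".").lower())
```

The same name can therefore resolve differently on each side of the firewall, and purely internal names are invisible from outside.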

DNS protocols

The most basic DNS protocols are the lookup service, which runs over port 53 of the connectionless User Datagram Protocol, and the zone transfer service, which uses the same port number, 53, but over the connection-oriented Transmission Control Protocol.[17] Lookup is a read-only function, while a zone transfer replaces the secondary's copy of the zone and discloses the zone's entire contents, so it should be implemented as a privileged, authenticated operation. Otherwise any client on a DNS server's network could request a zone transfer and receive a complete copy of a zone file, which is a security risk.
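
To make the lookup service concrete, the following Python sketch builds the bytes of a simple query message in the format defined by RFC1035: a 12-byte header followed by one question. The function name and the default query identifier are arbitrary choices for illustration; the sketch only constructs the message and does not send it.

```python
import struct


def build_query(name, query_id=0x1234, qtype=1, qclass=1):
    """Build a DNS query message; qtype 1 is A and qclass 1 is IN."""
    flags = 0x0100  # standard query with the "recursion desired" bit set
    # Header: ID, flags, then counts of questions, answers, authority, additional.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    qname = b""
    for label in name.rstrip(".").split("."):
        encoded = label.encode("ascii")
        if not 0 < len(encoded) <= 63:
            raise ValueError(f"label must be 1 to 63 bytes long: {label!r}")
        qname += bytes([len(encoded)]) + encoded  # length byte, then the label
    qname += b"\x00"  # the zero-length root label ends the name
    return header + qname + struct.pack("!HH", qtype, qclass)
```

A client would hand the resulting bytes to a UDP socket addressed to port 53 of its name server and parse the reply, which begins with the same 12-byte header layout.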

There are also protocols for dynamic update, so that network clients can automatically update their DNS servers to reflect correct hostnames (e.g. if they dynamically receive a different IP address via DHCP). This concept is also known as Dynamic DNS. [18]

Extended applications

These include Domain Name System dynamic update, use of the DNS as a data base in Public Key Infrastructure (PKI) for general security, Domain Name System security (DNSSEC) and name-based routing and load distribution.

References

  1. Mockapetris, P.V. (November 1987), Domain names - concepts and facilities, Internet Engineering Task Force, RFC1034
  2. Mockapetris, P.V. (November 1983), Domain names: Concepts and facilities, Internet Engineering Task Force, RFC882
  3. Albitz, Paul & Cricket Liu (1997), DNS and BIND, second edition, O'Reilly p. 9
  4. Arends, R. et al. (March 2005), DNS Security Introduction and Requirements, Internet Engineering Task Force, RFC4033
  5. Bush, R. et al. (August 2002), Representing Internet Protocol version 6 (IPv6) Addresses in the Domain Name System (DNS), Internet Engineering Task Force, RFC3363
  6. BIND, Internet Software Consortium
  7. Evans, Karen (August 22, 2008), Securing the Federal Government’s Domain Name System Infrastructure (Submission of Draft Agency Plans Due by September 5, 2008)
  8. Dyer, Stephen (October 1, 2004), .UK – Revisited
  9. ICM Registry (June 25, 2010), ICM Registry welcomes approval of .xxx
  10. RFC1034, pp. 3-4
  11. Note that the actual RR has a terminal period that does not appear when the DNS name is written in other uses
  12. E. Lewis (July 2006), The Role of Wildcards in the Domain Name System, RFC4592
  13. Internet Corporation for Assigned Names and Numbers, Verisign's Wildcard Service Deployment
  14. R. Elz, R. Bush (July 1997), Clarifications to the DNS Specification, Internet Engineering Task Force, RFC2181
  15. Liman, Lars-Johan et al, Operation of the Root Name Servers
  16. P. Srisuresh, J. Kuthan, J. Rosenberg, A. Molitor, A. Rayhan (August 2002), Middlebox communication architecture and framework, RFC3303
  17. Mockapetris, P.V. (November 1987), Domain names - implementation and specification, Internet Engineering Task Force, RFC1035
  18. Vixie, P., ed. (April 1997), Dynamic Updates in the Domain Name System (DNS UPDATE), Internet Engineering Task Force, RFC2136