This is Chapter 1, "Domain Name System," from DNS in Action: A detailed and practical guide to DNS implementation, configuration, and administration by Libor Dostálek and Alena Kabelová, published by Packt Publishing. For More Information: www.packtpub.com/DNS/book.
Domain Name System
All applications that provide communication between computers on the Internet use IP addresses to identify communicating hosts. However, IP addresses are difficult for human users to remember. That is why we use the name of a network interface instead of an IP address. For each IP address, there is a name of a network interface (computer)—or to be exact, a domain name. This domain name can be used in all commands where it is possible to use an IP address. (One exception, where only an IP address can be used, is the specification of an actual name server.) A single IP address can have several domain names affiliated with it.
The relationship between the name of a computer and an IP address is defined in the Domain Name System (DNS) database. The DNS database is distributed worldwide and contains individual records called Resource Records (RR). Individual parts of the DNS database, called zones, are placed on particular name servers.
If you want to use an Internet browser to browse to www.google.com with the IP address 126.96.36.199 (Figure 1.1), you enter the website name www.google.com in the browser address field.
Figure 1.1: It is necessary to translate a name to an IP address before establishing a connection
Just before the connection with the www.google.com web server is made, the www.google.com DNS name is translated into an IP address and only then is the connection actually established.
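The translation step can be observed directly from a program. The following is a minimal Python sketch that asks the system resolver for the addresses behind a name, using the standard library call socket.getaddrinfo. The helper name resolve_host is illustrative and does not come from the text.

```python
import socket


def resolve_host(host, port=80):
    """Return the distinct IP addresses the system resolver gives for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _socktype, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]  # the first element of sockaddr is the address string
        if ip not in addresses:
            addresses.append(ip)
    return addresses


# An IP address literal needs no DNS lookup and comes back unchanged.
print(resolve_host("127.0.0.1"))
# A domain name is translated first; the answer depends on your resolver.
# print(resolve_host("www.google.com"))
```

Only after such a lookup succeeds does an application open the actual connection to one of the returned addresses.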
It is practical to use an IP address instead of a domain name whenever we suspect that the DNS on the computer is not working correctly. Although it seems unusual, in this case, we can write something like:
ping 188.8.131.52
http://184.108.40.206
or send email to an address whose domain part is an IP address in square brackets.
However, the reaction can be unexpected, especially for the email, HTTP, and HTTPS protocols. Mail servers do not necessarily support transport to servers listed in brackets. HTTP will return the primary home page, and HTTPS will complain that the server name does not match the name in the server's certificate.
1.1 Domains and Subdomains
The entire Internet is divided into domains, i.e., name groups that logically belong together. The domains specify whether the names belong to a particular company, country, and so forth. It is possible to create subgroups within a domain; these are called subdomains. For example, it is possible to create department subdomains for a company domain. The domain name reflects a host's membership in a group and subgroup. Each group has a name affiliated with it. The domain name of a host is composed of the individual group names. For example, the name bob.company.com denotes a host named bob inside a subdomain called company, which is a subdomain of the domain com.
The domain name consists of strings separated by dots. The name is read from left to right, from the most specific part to the most general. The highest competent authority is the root domain, expressed by a dot (.) on the very right (this dot is often left out). Top Level Domains (TLDs) are defined in the root domain. There are two kinds of TLD: Generic Top Level Domains (gTLDs) and Country Code Top Level Domains (ccTLDs). Well-known gTLDs are edu, com, net, and mil, which are used mostly in the USA. According to ISO 3166, there are also two-letter ccTLDs for individual countries. For example, the us domain is affiliated with the USA. However, ccTLDs are used mostly outside the USA. A detailed list of ccTLDs is given in Appendix A.
The TLD domains are divided into subdomains for particular organizations, for example, cocacola.com, mcdonalds.com, google.com. Generally, a company subdomain can be divided into lower levels of subdomains, for example, the company Company Ltd. can have its subdomain as company.com and lower levels like bill.company.com for its billing department, sec.company.com for its security department, and head.company.com for its headquarters.
Figure 1.1a: The names in the DNS system create a tree structure
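The tree structure can be illustrated by walking from a name up to the root. The following Python sketch lists every domain that a given name belongs to. The function name ancestor_domains is an illustrative choice, not something defined in the text.

```python
def ancestor_domains(name):
    """List the domains a name belongs to, from the name itself up to the root."""
    stripped = name.rstrip(".")
    if not stripped:
        return ["."]  # the name was the root itself
    labels = stripped.split(".")
    # Drop one leading label at a time: bill.company.com. -> company.com. -> com.
    chain = [".".join(labels[i:]) + "." for i in range(len(labels))]
    chain.append(".")  # the root domain sits at the top of the tree
    return chain


print(ancestor_domains("bill.company.com"))
# ['bill.company.com.', 'company.com.', 'com.', '.']
```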
The following list contains some other registered gTLDs:
- The .org domain is intended to serve the noncommercial community.
- The .aero domain is reserved for members of the air transport industry.
- The .biz domain is reserved for businesses.
- The .coop domain is reserved for cooperative associations.
- The .int domain is only used for registering organizations established by international treaties between governments.
- The .museum domain is reserved for museums.
- The .name domain is reserved for individuals.
- The .pro domain is being established; it will be restricted to credited professionals and related entities.
1.2 Name Syntax
Names are listed in a dot notation (for example, abc.head.company.com). Names have the following general syntax:
string.string.string ... ... ...string.
where the first string is a computer name, followed by the name of the lowest inserted domain, then the name of a higher domain, and so on. To avoid ambiguity, a dot expressing the root domain is also listed at the end.
The entire name can have a maximum of 255 characters. An individual string can have a maximum of 63 characters. The string can consist of letters, numbers, and hyphens. A hyphen cannot be at the beginning or at the end of a string. There are also extensions specifying a richer repertoire of characters that can be used to create names. However, we usually avoid these additional characters because they are not supported by all applications.
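The basic rules above can be expressed as a short check. The following Python sketch covers only the limits named in this section (255 characters overall, 63 per string, letters, digits, and hyphens, no hyphen at either end) and ignores the extended character repertoires. The name is_valid_name is illustrative.

```python
import re

# One string (label): letters, digits, hyphens; no hyphen at either end; 1 to 63 chars.
LABEL_RE = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_valid_name(name):
    """Check a domain name against the basic syntax rules described above."""
    if len(name) > 255:
        return False
    # A single trailing dot denotes the root domain and is allowed.
    if name.endswith("."):
        name = name[:-1]
    if not name:
        return False  # this sketch does not treat the bare root as a host name
    return all(LABEL_RE.fullmatch(label) for label in name.split("."))


print(is_valid_name("abc.head.company.com."))  # True
print(is_valid_name("-abc.company.com"))       # False
```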
Both lower and upper case letters can be used, but the rules are not entirely simple. From the point of view of saving and processing in the DNS database, lower and upper case letters are not differentiated. In other words, the name newyork.com will be saved in the same place in a DNS database as NewYork.com or NEWYORK.com. Therefore, when translating a name to an IP address, it does not matter whether the user enters upper or lower case letters. However, the name is saved in the database with its original upper and lower case letters; so if NewYork.com was saved in the database, then during a query, the database will return "NewYork.com.". The final dot is part of the name.
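This behavior, where lookups ignore case but answers keep the stored spelling, can be modeled with a small table. The following Python sketch is a toy model only, not how any name server is implemented. The class name and the address 192.0.2.10 (a documentation address) are assumptions made for illustration.

```python
class CasePreservingStore:
    """Toy name table: lookups ignore case, answers keep the stored spelling."""

    def __init__(self):
        self._records = {}  # lowercased name -> (stored spelling, address)

    def add(self, name, address):
        self._records[name.lower()] = (name, address)

    def lookup(self, name):
        # Returns None when the name is unknown.
        return self._records.get(name.lower())


store = CasePreservingStore()
store.add("NewYork.com.", "192.0.2.10")  # hypothetical record
print(store.lookup("NEWYORK.COM."))
# ('NewYork.com.', '192.0.2.10')
```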
In some cases, the part of the name on the right can be omitted. We can almost always leave out the last part of the domain name in application programs. In databases describing domains the situation is more complicated:
- It is almost always possible to omit the last dot.
- It is usually possible to omit the end of the name, which is identical to the name of the domain, on computers inside the domain. For example, inside the company.com domain it is possible to just write computer.abc instead of computer.abc.company.com. (However, you cannot write a dot at the end!) The domains that the computer belongs to are directly defined by the domain and search commands in the resolver configuration file. There can be several domains of this kind defined (see Section 1.9).
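The completion of shortened names can be sketched as follows. This Python example builds the fully qualified candidates that a resolver might try for a given list of search domains. The function name candidate_names is illustrative, and the order of the candidates is an assumption, because a real resolver's order depends on its configuration.

```python
def candidate_names(name, search_domains):
    """Return the fully qualified names a resolver might try, in order."""
    if name.endswith("."):
        # A trailing dot marks the name as complete; nothing is appended.
        return [name]
    candidates = [f"{name}.{domain.rstrip('.')}." for domain in search_domains]
    candidates.append(name + ".")  # finally, the name exactly as typed
    return candidates


print(candidate_names("computer.abc", ["company.com"]))
# ['computer.abc.company.com.', 'computer.abc.']
print(candidate_names("computer.abc.", ["company.com"]))
# ['computer.abc.']
```

The second call shows why a dot must not be written at the end of a shortened name: the dot prevents the domain from being appended.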
1.3 Reverse Domains
We have already said that communication between hosts is based on IP addresses, not domain names. On the other hand, some applications need to find a name for an IP address—in other words, find the reverse record. This process is the translation of an IP address into a domain name, which is often called reverse translation.
As with domains, IP addresses also create a tree structure (see Figure 1.2). Domains created by IP addresses are often called reverse domains. The pseudodomains in-addr.arpa for IPv4 and ip6.arpa for IPv6 were created for the purpose of reverse translation. The name in-addr.arpa has historical origins; it is an acronym for inverse addresses in the Arpanet.
Under the domain in-addr.arpa, there are domains with the same name as the first number from the network IP address. For example, the in-addr.arpa domain has subdomains 0 to 255. Each of these subdomains also contains lower subdomains 0 to 255. For example, network 220.127.116.11/24 belongs to subdomain 195.in-addr.arpa. This actual subdomain belongs to domain 47.195.in-addr.arpa, and so forth. Note that the domains here are created like network IP addresses written backwards.
Figure 1.2: Reverse domain to IP address 18.104.22.168
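The backwards construction of reverse names can be reproduced with Python's standard ipaddress module. In this sketch, the addresses come from the documentation range 192.0.2.0/24 and are not taken from the text, and the helper names are illustrative.

```python
import ipaddress


def reverse_name(ip):
    """Reverse-lookup name for a single IPv4 or IPv6 address."""
    return ipaddress.ip_address(ip).reverse_pointer + "."


def reverse_zone_v4(network):
    """Reverse domain for an IPv4 network with a /8, /16, or /24 prefix."""
    net = ipaddress.ip_network(network)
    if net.version != 4 or net.prefixlen not in (8, 16, 24):
        raise ValueError("only IPv4 /8, /16, and /24 networks are handled")
    # Keep only the octets covered by the prefix, then write them backwards.
    octets = str(net.network_address).split(".")[: net.prefixlen // 8]
    return ".".join(reversed(octets)) + ".in-addr.arpa."


print(reverse_name("192.0.2.5"))        # 5.2.0.192.in-addr.arpa.
print(reverse_zone_v4("192.0.2.0/24"))  # 2.0.192.in-addr.arpa.
```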
This whole mechanism works if the IP addresses of classes A, B, or C are affiliated. But what should you do if you only have a subnetwork of class C affiliated? Can you even run your own name server for reverse translation? The answer is yes. Even though the IP address only has four bytes and a classic reverse domain has a maximum of three numbers (the fourth numbers are already elements of the domain—IP addresses), the reverse domains for subnets of class C are created with four numbers. For example, for subnetwork 22.214.171.124/28 we will use domain 126.96.36.199.in-addr.arpa. It is as if the IP address suddenly has five bytes! This was originally a mistake in the implementation of DNS, but later this mistake proved to be very practical so it was standardized as an RFC. We will discuss this in more detail in Chapter 7. You will learn more about reverse domains for IPv6 in Section 3.5.3.
1.4 Domain 0.0.127.in-addr.arpa
The IP address 127.0.0.1 presents an interesting complication. Network 127 is reserved for loopback, i.e., a software loop on each computer. While other IP addresses are unambiguous within the Internet, the address 127.0.0.1 occurs on every computer. Each name server is not only an authority for common domains, but also an authority (primary name server) for the domain 0.0.127.in-addr.arpa. We will consider this as given and will not list it in the chart, but be careful not to forget about it. For example, even a caching-only server is a primary server for this domain. Windows 2000 pretends to be the only exception to this rule, but it would not hurt for even Windows 2000 to establish a name server for zone 0.0.127.in-addr.arpa.
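As a small illustration, the following Python sketch checks whether an address is a loopback address and whether its reverse name falls inside the domain 0.0.127.in-addr.arpa. The function name loopback_zone_info is illustrative and does not come from the text.

```python
import ipaddress


def loopback_zone_info(ip):
    """Return (is loopback, reverse name lies in 0.0.127.in-addr.arpa)."""
    addr = ipaddress.ip_address(ip)
    name = addr.reverse_pointer + "."
    return addr.is_loopback, name.endswith(".0.0.127.in-addr.arpa.")


print(loopback_zone_info("127.0.0.1"))  # (True, True)
print(loopback_zone_info("192.0.2.1"))  # (False, False)
```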
1.5 Zone
We often come across the questions: What is a zone? What is the relation between a domain and a zone? Let us explain the relationship of these terms using the company.com domain.
As we have already said, a domain is a group of computers that share a common right side of their domain name. For example, a domain is a group of computers whose names end with company.com. However, the domain company.com is large. It is further divided into the subdomains bill.company.com, sec.company.com, sales.company.com, xyz.company.com, etc. We can administer the entire company.com domain on one name server, or we can create independent name servers for some subdomains. (In Figure 1.3, we have created subordinate name servers for the subdomains bill.company.com and head.company.com.) The original name server serves the domain company.com and the subdomains sec.company.com, sales.company.com, and xyz.company.com—in other words, the original name server administers the company.com zone. The zone is a part of the domain namespace that is administered by a particular name server.
Figure 1.3: Zone company.com
A zone containing data of a lower-level domain is usually called a subordinate zone.
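The difference between a domain and a zone can be made concrete with a short sketch. Given the zones from the company.com example, the following Python function finds the zone that holds a particular name by picking the longest matching zone name. The function name find_zone and the host names pc1 and www are hypothetical.

```python
def find_zone(name, zone_names):
    """Return the zone whose name is the longest suffix of name, or None."""
    labels = name.lower().rstrip(".").split(".")
    zones = {z.lower().rstrip(".") for z in zone_names}
    # Try the full name first, then strip one leading label at a time.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in zones:
            return candidate + "."
    return None


zones = ["company.com", "bill.company.com", "head.company.com"]
print(find_zone("pc1.bill.company.com", zones))   # bill.company.com.
print(find_zone("www.sales.company.com", zones))  # company.com.
```

Both names lie in the company.com domain, but only the second one is stored in the company.com zone, because sales.company.com has no name server of its own.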
1.5.1 Special Zones
Besides classic zones, which contain data about parts of the domains or subdomains, special zones are also used for DNS implementation. Specifically, the following zones are used:
- Zone stub: A zone stub is actually a subordinate zone that only contains information about which name servers administer a particular subdomain (it contains the NS records for the zone). The zone stub therefore does not contain the entire zone.
- Zone cache/hint: A zone hint contains a list of root name servers (non-authoritative data read into memory during the start of the name server). Only BIND version 8 and later use the name hint for this type of zone. In previous versions, a name cache zone was used. Remember that the root name servers are an authority for a root domain marked as a dot (.).