Author: Emmanuel KARTMANN.
Last Update: June 1st, 2002
This ATL COM component provides very simple Internet name resolving functionality (Domain Name System or DNS). For example, it can automagically find the email (SMTP) servers available for your machine (see "DNS Magic: Who's my email server?" for details).
WHAT'S NEW
This version has several bug fixes and finds the DNS domain better than before (it uses the complete "search list" as defined in the TCP/IP configuration).
Version 1.4 was a complete rewrite of the DNS component. Instead of using a port of the BIND 8 library (from UNIX to NT), it relies on the Microsoft Platform SDK (August 2001), which, at last, contains a decent DNS API. Please refer to the implementation section for more details.
For those of you who are using Windows 95/98/Millennium (where the Microsoft DNS API is not available), you have two solutions: get the older versions of my component (ask me) or simply copy the DNSAPI.DLL file from a Windows 2000 installation to the Windows\System directory of your system (thanks to Hans for this tip).
- WHAT'S NEW
- INTRODUCTION TO DNS
- What is DNS?
- The DNS Protocol
- Resource Records
- Well Known DNS Implementations
- Resolver Tool: nslookup
- DNS Magic: Who's my email server?
- COMPONENT FEATURES
- SAMPLE CODE (VBScript)
- TO DO LIST
- REFERENCE DOCUMENTATION
INTRODUCTION TO DNS
What is DNS?
The Domain Name System (DNS) is a distributed host information database used in the Internet.
Most (if not all) Internet software (e.g. ping, telnet, ftp, web browsers, etc.) uses the DNS database to resolve host names into IP addresses so that you, the user, can type the name of a machine instead of its IP address (user-friendly, isn't it?).
Let me give you an example: when you type a Web address in your favorite browser (e.g. "www.kartmann.org"), the browser fetches the corresponding IP address in the DNS database and uses this address to connect to the Web server.
Information held in the DNS database can be:
- an IP address (e.g. "220.127.116.11")
- a hostname (e.g. "www.kartmann.org")
- a canonical name (i.e. the real name of an IP alias)
- a mail server name
Information in the DNS is held in Resource Records (RR). RRs come in several types, which correspond to the varieties of data that can be contained in the DNS. Many RFCs (most of them still experimental) propose additional RR types, like the geographical location (RFC 1712), digital certificates (RFC 2538), cryptographic keys (RFC 2536), etc.
An application or library (or COM object) acting as a DNS client is called a resolver.
The DNS Protocol
Application programs can use the domain name system via a resolver library (or COM object in our case). The resolver sends queries corresponding to the library function, and waits for responses from the local name server. The local name server can either:
- Reply immediately if it knows the answer (i.e. if the query is about data in its name space).
- Reply immediately if the answer is in its cache (DNS data has a Time-To-Live; the data in the cache must not have expired yet).
- Reply with an alternate server name for the request (non-recursive queries). The resolver must then send the same query to the alternate server.
- Send queries to foreign name servers, wait for answers and transmit them to the resolver (recursive queries).
Queries and responses are usually sent via UDP (datagrams), in one or more packets (some implementations use TCP instead of UDP).
Resource Records
Information in the DNS is held in Resource Records (RR); when a server replies to a resolver, it sends resource records in its response. RRs come in different types and formats, as described in this section.
General Resource Record Format
All RRs have the same top level format shown below:
|NAME||an owner name, i.e., the name of the node to which this resource record pertains.|
|TYPE||two octets containing one of the RR TYPE codes. Valid types include: "A" (value 1), a host address; "CNAME" (value 5), the canonical name for an alias; "PTR" (value 12), a domain name pointer; "MX" (value 15), a mail exchanger.|
|CLASS||two octets containing one of the RR CLASS codes. Valid classes are: "IN" (value 1), the Internet; "CS" (value 2), the CSNET class (obsolete); "CH" (value 3), the CHAOS class (MIT); "HS" (value 4), Hesiod (MIT).|
|TTL||a 32 bit signed integer that specifies the time interval (in seconds) that the resource record may be cached before the source of the information should again be consulted. Zero values are interpreted to mean that the RR can only be used for the transaction in progress, and should not be cached.|
|RDLENGTH||an unsigned 16 bit integer that specifies the length in octets of the RDATA field.|
|RDATA||a variable length string of octets that describes the resource. The format of this information varies according to the TYPE and CLASS of the resource record. See below for a list of the most common types.|
Standard Resource Record Formats
Internet Address Format (A)
The A RR contains an IPv4 address (32 bits).
Canonical Name Format (CNAME)
The CNAME RR contains a resource name (sequence of labels).
Domain Name Pointer Format (PTR)
The PTR RR contains a resource name (sequence of labels).
Mail eXchanger Format (MX)
The MX RR contains a preference (integer, 16 bits) and a resource name (sequence of labels).
Well Known DNS Implementations
- Berkeley sockets: the Berkeley C library provides a very basic implementation of the DNS via the functions gethostbyname (resolves a hostname into its IP address) and gethostbyaddr (the reverse: from the IP address, finds the corresponding hostname).
- Winsock: on Windows, you have the same functions (and their asynchronous versions WSAAsyncGetHostByName and WSAAsyncGetHostByAddr).
- BIND: the Berkeley Internet Name Domain is the most popular implementation of the DNS specifications (full implementation with client and server software).
Resolver Tool: nslookup
Windows NT, Windows 2000 and UNIX systems provide a DNS resolver via the command-line program nslookup (probably a port of the nslookup program shipped with BIND). With this program, you can read information from the DNS database.
Here's an example session of the nslookup program:
- Start an MS-DOS command prompt and type the nslookup command:
    C:\> nslookup
    Default Server: mynameserver.mydomain.com
    Address: XXX.XXX.XXX.XXX

    > myhost.mydomain.com
    Server: mynameserver.mydomain.com
    Address: XXX.XXX.XXX.XXX

    Name: myhost.mydomain.com
    Address: YYY.YYY.YYY.YYY

    > yourhost.mydomain.com
    *** mynameserver.mydomain.com can't find yourhost.mydomain.com: Non-existent domain
- To look up mail exchangers, start an MS-DOS command prompt, type the nslookup command and set the query type to MX:
    C:\> nslookup
    Default Server: youserver.yourdomain
    Address: X.X.X.X

    > set type=MX
    > microsoft.com
    microsoft.com MX preference = 10, mail exchanger = mail1.microsoft.com
    microsoft.com MX preference = 20, mail exchanger = mail2.microsoft.com
    microsoft.com MX preference = 30, mail exchanger = mail3.microsoft.com
    microsoft.com MX preference = 40, mail exchanger = mail4.microsoft.com
    microsoft.com MX preference = 50, mail exchanger = mail5.microsoft.com
In Windows 2000, you'll find an implementation of nslookup. Type 'help' to
get the full syntax of nslookup commands:
Commands: (identifiers are shown in uppercase, [] means optional)
NAME - print info about the host/domain NAME using default server
NAME1 NAME2 - as above, but use NAME2 as server
help or ? - print info on common commands
set OPTION - set an option
all - print options, current server and host
[no]debug - print debugging information
[no]d2 - print exhaustive debugging information
[no]defname - append domain name to each query
[no]recurse - ask for recursive answer to query
[no]search - use domain search list
[no]vc - always use a virtual circuit
domain=NAME - set default domain name to NAME
srchlist=N1[/N2/.../N6] - set domain to N1 and search list to N1,N2, etc.
root=NAME - set root server to NAME
retry=X - set number of retries to X
timeout=X - set initial time-out interval to X seconds
type=X - set query type (ex. A,ANY,CNAME,MX,NS,PTR,SOA,SRV)
querytype=X - same as type
class=X - set query class (ex. IN (Internet), ANY)
[no]msxfr - use MS fast zone transfer
ixfrver=X - current version to use in IXFR transfer request
server NAME - set default server to NAME, using current default server
lserver NAME - set default server to NAME, using initial server
finger [USER] - finger the optional NAME at the current default host
root - set current default server to the root
ls [opt] DOMAIN [> FILE] - list addresses in DOMAIN (optional: output to FILE)
-a - list canonical names and aliases
-d - list all records
-t TYPE - list records of the given type (e.g. A,CNAME,MX,NS,PTR etc.)
view FILE - sort an 'ls' output file and view it with pg
exit - exit the program
DNS Magic: Who's my email server?
There's a strong link between Internet email and the DNS. Mail servers use the DNS information database to route email messages from the originator to the recipient.
Basically, when the recipient email address is "firstname.lastname@example.org", the mail server looks up the mail server (or Mail eXchanger) registered for domain "example.org" in the DNS database. It then connects to that mail server (on port 25) and sends the email using SMTP.
Mail servers are registered in the DNS database as "Mail eXchanger" ("MX") records. Note that there can be several registered mail servers for a given domain (with a preference assigned to each server).
Using nslookup with the query type set to MX, you can find the email servers of your domain, as shown in the MX sample session earlier in this article.
The lowest preference indicates the best (primary) mail server. A mailer would try it first and if it cannot connect to this server, it would use other servers (by order of preference).
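The ordering rule can be sketched in a few lines of Python. This is only an illustration, not what the component does internally: the helper names `mail_domain` and `delivery_order` are made up for this example, and the records are typed in by hand (taken from the microsoft.com MX session) rather than fetched from a DNS server.

```python
from typing import Iterable, List, Tuple


def mail_domain(address: str) -> str:
    """Return the domain part of an email address (the text after the last '@')."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    return domain.lower()


def delivery_order(mx_records: Iterable[Tuple[int, str]]) -> List[str]:
    """Sort (preference, exchanger) pairs so the lowest preference comes first."""
    return [host for _, host in sorted(mx_records, key=lambda record: record[0])]


# Hand-written sample data, deliberately out of order.
MX_RECORDS = [
    (30, "mail3.microsoft.com"),
    (10, "mail1.microsoft.com"),
    (20, "mail2.microsoft.com"),
]

if __name__ == "__main__":
    print(mail_domain("someone@microsoft.com"))
    for host in delivery_order(MX_RECORDS):
        print(host)
```

A mailer would walk the returned list from the top and stop at the first server that accepts the connection.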
You can test this "DNS Magic" by using the nslookup program shipped with Windows 2000, or with the DNS Magic HTML page. This component provides method ISimpleDNSClient::GetEmailServers() to find the registered email servers for a given domain.
For more about email and DNS, please see the reference documentation.
COMPONENT FEATURES
This component:
- implements the basics of DNS, as defined in RFC1034 and RFC1035,
- uses the resolver cache, queries first with UDP, then retries with TCP if the response is truncated
- asks the server to perform recursive resolution on behalf of the client to resolve the query
- finds DNS server addresses in local configuration (using API or Windows Registry)
- provides extended error information (ISupportErrorInfo and IErrorInfo are implemented)
- provides a very small executable: 52 KB (MinSize) to 60 KB (MinDependency)
- requires no Graphical User Interface: the component can be used in non-GUI applications, like a Windows NT Service.
- is integrated with SimpleEmailClient (another component): the latter calls method GetEmailServers to automatically find SMTP servers
- runs on Windows 2000 (relies on Windows DNS API from the Platform SDK, August 2001)
- compiles with VC++ 6.0 SP5
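Two of the features listed above (recursive resolution, and the UDP-then-TCP retry) map to single bits in the 12-byte DNS message header defined in RFC 1035: RD (recursion desired) in the query and TC (truncated) in the response. The Python sketch below only illustrates those bits; in the component itself this work is done by the Windows DNS API. The function names and the fixed query ID are assumptions made for the example.

```python
import struct

FLAG_QR = 0x8000  # message is a response
FLAG_TC = 0x0200  # response was truncated (retry over TCP)
FLAG_RD = 0x0100  # recursion desired


def build_query(name: str, qtype: int = 1, qclass: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a one-question DNS query with the RD bit set (default: class IN, type A)."""
    # Header: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT (six 16-bit fields).
    header = struct.pack("!HHHHHH", query_id, FLAG_RD, 1, 0, 0, 0)
    # Question name: each label is prefixed by its length, terminated by a zero octet.
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", qtype, qclass)
    return header + question


def is_truncated(message: bytes) -> bool:
    """Return True if the TC bit is set in the header of a DNS message."""
    (flags,) = struct.unpack_from("!H", message, 2)
    return bool(flags & FLAG_TC)


if __name__ == "__main__":
    query = build_query("www.kartmann.org")
    print(len(query), query.hex())
```

A resolver that receives a UDP response for which `is_truncated` is true would send the same query again over TCP.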
USAGE
To use this component:
- create an instance of the component,
- (optionally) put/get properties from interface ISimpleDNSClient (such as ServerAddresses and Separator, used in the sample code),
- call a method from interface ISimpleDNSClient (such as Resolve or GetEmailServers),
- handle errors (try/catch in C++, On Error Resume Next in VBScript).
SAMPLE CODE (VBScript)
    Dim oDNS
    ' Create object instance
    Set oDNS = CreateObject("Emmanuel.SimpleDNSClient.1")

    ' Declare output variable
    Dim found_names

    ' Set the server address(es)
    oDNS.ServerAddresses = "18.104.22.168"

    ' Set separator for output variable (if multiple results are found)
    oDNS.Separator = ", "

    ' (1) Find IP address of hostname "www.microsoft.com" (Internet class, type A)
    On Error Resume Next
    oDNS.Resolve "www.microsoft.com", found_names, "C_IN", "T_A"
    If Err <> 0 Then
        MsgBox Err.Description
    Else
        ' Show resolved names (within dialog box)
        MsgBox "Found names:" & vbCrLf & vbCrLf & found_names
    End If

    ' (2) Find Email Servers for domain "microsoft.com"
    On Error Resume Next
    oDNS.GetEmailServers "microsoft.com", found_names
    If Err <> 0 Then
        MsgBox Err.Description
    Else
        ' Show resolved names (within dialog box)
        MsgBox "Found names:" & vbCrLf & vbCrLf & found_names
    End If
IMPLEMENTATION
- Base API is the new Windows 2000 DNS API
This component relies on the Windows DNS API provided by the Windows Platform SDK (August 2001). The previous version was based on a port of the BIND resolver library. Due to the high maintenance cost associated with this library, I decided to drop it when the DNS API came out. See the Component online documentation for more details.
- Automatic Conversion of reverse lookups (PTR)
If your request is a reverse lookup (IP address -> host name, type PTR), the component silently converts the request to the in-addr.arpa format. That is, if you request a resolution of type PTR for IP address "22.214.171.124", the component sends a resolution request for "124.171.214.22.in-addr.arpa" (PTR). The result will be "www.international.microsoft.com".
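The conversion itself is simple: reverse the four octets and append "in-addr.arpa". Here is a minimal Python sketch of that rule (not the component's actual C++ code; the function name `reverse_lookup_name` is made up for this example, and the address is a documentation-range placeholder).

```python
import ipaddress


def reverse_lookup_name(ip: str) -> str:
    """Convert a dotted IPv4 address into the name queried for a PTR lookup."""
    address = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    octets = str(address).split(".")
    return ".".join(reversed(octets)) + ".in-addr.arpa"


if __name__ == "__main__":
    print(reverse_lookup_name("192.0.2.10"))
```

Python's standard library exposes the same rule as the `reverse_pointer` attribute of address objects, which is a handy cross-check.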
- Find Email Server automatically
The method GetEmailServers sends a request of type MX (Mail eXchanger) in order to find registered servers for a domain. It's only a shortcut for the Resolve method.
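To give an idea of what comes back for such a request, here is a Python sketch that decodes one MX resource record from its wire format (owner name, TYPE, CLASS, TTL, RDLENGTH, then a 16-bit preference and the exchanger name). It is a simplification under stated assumptions: names are not compressed (real responses usually use RFC 1035 compression pointers, which this sketch rejects), and `read_name` and `parse_mx_record` are names invented for the example. The component gets already-decoded records from the Windows DNS API.

```python
import struct
from typing import Tuple

TYPE_MX = 15


def read_name(data: bytes, pos: int) -> Tuple[str, int]:
    """Read an uncompressed domain name; return (name, position just after it)."""
    labels = []
    while True:
        length = data[pos]
        pos += 1
        if length == 0:
            return ".".join(labels), pos
        if length & 0xC0:
            raise ValueError("compressed names are not handled in this sketch")
        labels.append(data[pos:pos + length].decode("ascii"))
        pos += length


def parse_mx_record(record: bytes) -> dict:
    """Decode a single MX resource record that starts at offset 0."""
    owner, pos = read_name(record, 0)
    # TYPE (16 bits), CLASS (16 bits), TTL (32 bits, signed), RDLENGTH (16 bits).
    rr_type, rr_class, ttl, rdlength = struct.unpack_from("!HHiH", record, pos)
    pos += 10
    if rr_type != TYPE_MX:
        raise ValueError("not an MX record")
    rdata = record[pos:pos + rdlength]
    (preference,) = struct.unpack_from("!H", rdata, 0)
    exchanger, _ = read_name(rdata, 2)
    return {
        "owner": owner,
        "class": rr_class,
        "ttl": ttl,
        "preference": preference,
        "exchanger": exchanger,
    }


if __name__ == "__main__":
    # Hand-built sample record: microsoft.com MX 10 mail1.microsoft.com, TTL 3600.
    rdata = struct.pack("!H", 10) + b"\x05mail1\x09microsoft\x03com\x00"
    record = (b"\x09microsoft\x03com\x00"
              + struct.pack("!HHiH", TYPE_MX, 1, 3600, len(rdata))
              + rdata)
    print(parse_mx_record(record))
```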
- Ignored Parameter: Resource Class
Due to limitations in the current Windows DNS API, the parameter BResourceClass is ignored by the Resolve method (you should always use the default class "C_IN" for Internet Class).
- Ignored Property: ServerAddresses
Due to limitations (bugs?) in the current Windows DNS API, the property ServerAddresses is ignored by the Resolve method (the component always uses the local machine DNS configuration to find the DNS servers).
You can download the Microsoft Platform SDK from the Microsoft Web Site (see the "Download SDK" entry in the reference documentation).
TO DO LIST
- Test All DNS Resource Records (although all RR types are implemented, most of them couldn't be tested)
- Support DNS Security extensions (RFC 2535, "Domain Name System Security Extensions")
REFERENCE DOCUMENTATION
- Book: "DNS and BIND", by Paul Albitz & Cricket Liu, O'Reilly & Associates
If you have to deal with DNS, I strongly recommend reading this book: it's a very good presentation of DNS, from the protocol itself to the configuration tricks of a DNS server (using BIND).
- BIND (Berkeley Internet Name Domain), the reference implementation of the Domain Name System (DNS) protocols.
- A deprecated version of BIND that compiles and runs on Windows NT
- RFC 1034: "DOMAIN NAMES - CONCEPTS AND FACILITIES"
- RFC 1035: "DOMAIN NAMES - IMPLEMENTATION AND SPECIFICATION"
- RFC 0974: "Mail routing and the domain system"
- RFC 1712: "DNS Encoding of Geographical Location"
- RFC 2181: "Clarifications to the DNS Specification"
- RFC 2671: "Extension Mechanisms for DNS (EDNS0)"
- RFC 2535: "Domain Name System Security Extensions"
- RFC 2536: "DSA KEYs and SIGs in the Domain Name System (DNS)"
- RFC 2538: "Storing Certificates in the Domain Name System (DNS)"
- Microsoft Platform SDK - Domain Name System - "DNS Start Page"
- Microsoft Platform SDK - Domain Name System - "DNS Reference"
- Microsoft Platform SDK - Domain Name System - "DNS Functions"
- Microsoft Platform SDK - Domain Name System - "DNS Structures"
- Microsoft Platform SDK - Download SDK
Download Article and Source Code (202 KB).
Download self-extracting kit (191 KB).