XML, the extensible markup language, has
its share of critics as well as plenty of zealous proponents. I was
long in the former group, and only grudgingly incorporated XML into Nmap after
volunteers performed most of the work. Since then, I have learned to
appreciate the power and flexibility that XML offers, and even wrote
this book in the DocBook XML format. I strongly recommend that programmers
interact with Nmap through the XML interface rather than trying to
parse the normal, interactive, or grepable output. The XML format
includes more information than the others and is extensible enough
that new features can be added without breaking existing programs that
use it. It can be parsed by standard XML parsers, which are available
for all popular programming languages, usually for free. Editors,
validators, transformation systems, and many other applications
already know how to handle the format. Normal and interactive output,
on the other hand, are custom to Nmap and subject to regular changes
as I strive for a clearer presentation to end users. Grepable output
is also Nmap-specific and tougher to extend than XML. It is considered
deprecated, and many Nmap features, such as MAC address
detection, are not presented in this output format.
An example of Nmap XML output is shown in Example 13.9. Whitespace has been adjusted for
readability. In this case, XML was sent to stdout thanks to the -oX - construct.
Some programs that execute Nmap read the output that way, while others have
Nmap write the output to a file and then read that file after Nmap completes.
Example 13.9. An example of Nmap XML output
# nmap -T4 -A -p- -oX - scanme.nmap.org
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet href="/usr/share/nmap/nmap.xsl" type="text/xsl"?>
<!-- Nmap 4.68 scan initiated Tue Jul 15 07:27:26 2008 as:
nmap -T4 -A -p- -oX - scanme.nmap.org -->
<nmaprun scanner="nmap" args="nmap -T4 -A -p- -oX - scanme.nmap.org"
start="1216106846" startstr="Tue Jul 15 07:27:26 2008"
version="4.68" xmloutputversion="1.02">
<scaninfo type="syn" protocol="tcp" numservices="65535" services="1-65535" />
<verbose level="0" /> <debugging level="0" />
<host starttime="1216106846" endtime="1216106985">
<status state="up" reason="reset" />
<address addr="64.13.134.52" addrtype="ipv4" />
<hostnames><hostname name="scanme.nmap.org" type="PTR" /></hostnames>
<ports><extraports state="filtered" count="65529">
<extrareasons reason="no-responses" count="65529" /></extraports>
<port protocol="tcp" portid="22">
<state state="open" reason="syn-ack" reason_ttl="52" />
<service name="ssh" product="OpenSSH" version="4.3"
extrainfo="protocol 2.0" method="probed" conf="10" /> </port>
<!-- Several port elements removed for brevity -->
<port protocol="tcp" portid="80">
<state state="open" reason="syn-ack" reason_ttl="52" />
<service name="http" product="Apache httpd" version="2.2.2"
extrainfo="(Fedora)" method="probed" conf="10" />
<script id="HTML title" output="Go ahead and ScanMe!" /> </port>
<port protocol="tcp" portid="113">
<state state="closed" reason="reset" reason_ttl="52" />
<service name="auth" method="table" conf="3" /> </port> </ports>
<os>
<portused state="open" proto="tcp" portid="22" />
<portused state="closed" proto="tcp" portid="25" />
<osclass type="general purpose" vendor="Linux" osfamily="Linux"
osgen="2.6.X" accuracy="100" />
<osmatch name="Linux 2.6.17 - 2.6.21" accuracy="100" line="11886" />
<osmatch name="Linux 2.6.23" accuracy="100" line="13895" /> </os>
<uptime seconds="1104050" lastboot="Wed Jul 2 12:48:55 2008" />
<tcpsequence index="203" difficulty="Good luck!"
values="31F88BFB,327D2AA6,329B817C,329D4191,321A15D3,32B3D917" />
<ipidsequence class="All zeros" values="0,0,0,0,0,0" />
<tcptssequence class="1000HZ"
values="41CE58DD,41CE5941,41CE59A5,41CE5A09,41CE5A6D,41CE5AD5" />
<trace port="22" proto="tcp">
<hop ttl="1" rtt="2.98" ipaddr="132.239.1.113"
host="nodem-msfc-vl245-act-security-gw-1-113.ucsd.edu" />
<!-- Several hop elements removed for brevity -->
<hop ttl="11" rtt="13.34" ipaddr="64.13.134.52"
host="scanme.nmap.org" /> </trace>
<times srtt="14359" rttvar="1215" to="100000" /> </host>
<runstats><finished time="1216106985" timestr="Tue Jul 15 07:29:45 2008" />
<hosts up="1" down="0" total="1" />
<!-- Nmap done at Tue Jul 15 07:29:45 2008;
1 IP address (1 host up) scanned in 138.938 seconds -->
</runstats>
</nmaprun>
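The two collection strategies mentioned before Example 13.9 (reading XML from stdout, or having Nmap write a file that is read afterward) might be sketched in Python as follows. This is only an illustration: the helper names build_command, scan_via_stdout, and scan_via_file are hypothetical, and the sketch assumes that an nmap binary is on the PATH and that Python's standard subprocess and xml.etree.ElementTree modules are acceptable for the job.

```python
import subprocess
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def build_command(target, xml_dest="-", extra_args=()):
    """Return an Nmap argument list that writes XML to xml_dest ("-" means stdout)."""
    # Hypothetical helper: only the -oX option itself comes from the text above.
    return ["nmap", *extra_args, "-oX", str(xml_dest), target]


def scan_via_stdout(target, extra_args=()):
    """Run Nmap with -oX - and parse the XML captured from its stdout."""
    result = subprocess.run(build_command(target, "-", extra_args),
                            capture_output=True, check=True)
    # result.stdout is bytes, so the parser honors the declared encoding.
    return ET.fromstring(result.stdout)


def scan_via_file(target, extra_args=()):
    """Run Nmap with -oX <file>, then parse that file once Nmap has exited."""
    with tempfile.TemporaryDirectory() as tmp:
        xml_path = Path(tmp) / "scan.xml"
        # Normal output still goes to stdout in this mode; it is discarded here.
        subprocess.run(build_command(target, xml_path, extra_args),
                       stdout=subprocess.DEVNULL, check=True)
        return ET.parse(xml_path).getroot()
```

Under these assumptions, scan_via_stdout("scanme.nmap.org", ["-T4"]) would return the nmaprun root element of the scan. Which strategy is preferable depends on the calling program; the file-based variant leaves a copy of the results on disk if a permanent path is used instead of a temporary directory.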
Another advantage of XML is that its verbose nature makes it
easier to read and understand than other formats. Readers familiar
with Nmap in general can likely understand most of the XML output in Example 13.9, “An example of Nmap XML output” without further documentation. The
grepable output format, on the other hand, is tough to decipher
without its own reference guide.
There are a few aspects of the example XML output that may not
be self-explanatory. For example, look at the two
port elements in Example 13.10.
Example 13.10. Nmap XML port elements
<port protocol="tcp" portid="80">
<state state="open" reason="syn-ack" reason_ttl="52" />
<service name="http" product="Apache httpd" version="2.2.2"
extrainfo="(Fedora)" method="probed" conf="10" />
<script id="HTML title" output="Go ahead and ScanMe!" />
</port>
<port protocol="tcp" portid="113">
<state state="closed" reason="reset" reason_ttl="52" />
<service name="auth" method="table" conf="3" />
</port>
The port protocol, ID (port number), state, and service name are the
same as would be shown in the interactive output port table. The
service product, version, and extrainfo come from version detection
and are combined together into one field of the interactive output
port table. The method and conf
attributes aren't present in any other output types. The method can
be table, meaning the service name was simply
looked up in nmap-services based on the port
number and protocol, or it can be probed, meaning
that it was determined through the version detection system. The
conf attribute measures the confidence Nmap has
that the service name is correct. The values range from one (least
confident) to ten (most confident). Nmap has a confidence level of only three for
service names determined by table lookup, while it is highly confident (level
10) that port 80 of Example 13.10, “Nmap XML port elements” is Apache httpd, because Nmap connected to the port and found a server
exhibiting the HTTP protocol with Apache banners.
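As a rough sketch of how a program might use the method and conf attributes, the following Python snippet reads the two port elements of Example 13.10 and separates probed services from table lookups. The enclosing ports element mirrors Example 13.9; the function name describe_services and the dictionary layout are illustrative assumptions, not part of Nmap.

```python
import xml.etree.ElementTree as ET

# The two port elements from Example 13.10, wrapped in <ports> as in Example 13.9.
PORTS_XML = """
<ports>
  <port protocol="tcp" portid="80">
    <state state="open" reason="syn-ack" reason_ttl="52" />
    <service name="http" product="Apache httpd" version="2.2.2"
             extrainfo="(Fedora)" method="probed" conf="10" />
  </port>
  <port protocol="tcp" portid="113">
    <state state="closed" reason="reset" reason_ttl="52" />
    <service name="auth" method="table" conf="3" />
  </port>
</ports>
"""


def describe_services(ports_element):
    """Collect port number, service name, method, and conf for each port."""
    rows = []
    for port in ports_element.iter("port"):
        service = port.find("service")
        if service is None:  # skip ports that carry no service element
            continue
        rows.append({
            "port": int(port.get("portid")),
            "name": service.get("name"),
            "method": service.get("method"),
            "conf": int(service.get("conf")),
        })
    return rows


services = describe_services(ET.fromstring(PORTS_XML))
# Keep only services identified by version detection rather than table lookup.
probed = [s["port"] for s in services if s["method"] == "probed"]

for s in services:
    print(f'{s["port"]}/{s["name"]}: method={s["method"]} conf={s["conf"]}')
```

A report generator could, for instance, flag every service whose method is table so that readers know the name is only a guess based on the port number.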
One other aspect that some users find confusing is that the
attributes /nmaprun/@start and /nmaprun/runstats/finished/@time hold timestamps given in
Unix time, the number of seconds since January 1, 1970 (UTC). A plain integer
like this is often
easier for programs to handle. For the convenience of human readers,
versions 3.78 and newer include the equivalent calendar time written
out in the attributes /nmaprun/@startstr and
/nmaprun/runstats/finished/@timestr.
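To illustrate why Unix time is convenient for programs, this short Python sketch reads the two timestamps from a trimmed copy of Example 13.9, converts the start time to a calendar date, and computes the scan duration by simple subtraction. Printing the date in UTC is a choice made for this sketch, and the variable names are illustrative.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

# Trimmed from Example 13.9: only the elements that carry timestamps.
RUN_XML = """
<nmaprun start="1216106846" startstr="Tue Jul 15 07:27:26 2008">
  <runstats>
    <finished time="1216106985" timestr="Tue Jul 15 07:29:45 2008" />
  </runstats>
</nmaprun>
"""

root = ET.fromstring(RUN_XML)
start = int(root.get("start"))                             # /nmaprun/@start
finished = int(root.find("runstats/finished").get("time"))  # .../finished/@time

# Convert the integer timestamp into a timezone-aware calendar time.
start_utc = datetime.fromtimestamp(start, tz=timezone.utc)
# Durations need no date arithmetic at all: just subtract the integers.
elapsed = finished - start

print(start_utc.isoformat())  # 2008-07-15T07:27:26+00:00
print(elapsed)                # 139
```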
Nmap includes a document type definition (DTD) that allows XML
parsers to validate Nmap XML output. While it is primarily intended
for programmatic use, it can also help humans interpret Nmap XML
output. The DTD defines the legal elements of the format, and often
enumerates the attributes and values they can take on. It is
reproduced in Appendix A, Nmap XML Output DTD.
The Nmap XML format can be used in many powerful ways, though
few users take advantage of it. I believe this is due to
many users' inexperience with XML, combined with a lack of
practical, solution-oriented documentation on using the Nmap XML
format. This chapter provides several practical examples, including
the section called “Manipulating XML Output with Perl”,
the section called “Output to a Database”,
and the section called “Creating HTML Reports”.
A key advantage of XML is that you do not need to write your
own parser as you do for specialized Nmap output types such as
grepable and interactive output. Any general XML parser should
do.
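As one illustration of using a general-purpose parser, the sketch below feeds a trimmed copy of Example 13.9 to Python's standard xml.etree.ElementTree module and lists the open ports for each host. The function name open_ports and the shape of its return value are assumptions made for this example; any other XML parser could be substituted.

```python
import xml.etree.ElementTree as ET

# Trimmed from Example 13.9; a real scan would contain many more elements.
SCAN_XML = """
<nmaprun scanner="nmap" version="4.68" xmloutputversion="1.02">
  <host>
    <status state="up" reason="reset" />
    <address addr="64.13.134.52" addrtype="ipv4" />
    <hostnames><hostname name="scanme.nmap.org" type="PTR" /></hostnames>
    <ports>
      <port protocol="tcp" portid="22">
        <state state="open" reason="syn-ack" reason_ttl="52" />
        <service name="ssh" product="OpenSSH" version="4.3"
                 extrainfo="protocol 2.0" method="probed" conf="10" />
      </port>
      <port protocol="tcp" portid="113">
        <state state="closed" reason="reset" reason_ttl="52" />
        <service name="auth" method="table" conf="3" />
      </port>
    </ports>
  </host>
</nmaprun>
"""


def open_ports(root):
    """Map each host address to a list of (port number, service name) pairs."""
    result = {}
    for host in root.iter("host"):
        # Simplification: use the first address element found for the host.
        addr = host.find("address").get("addr")
        ports = []
        for port in host.iter("port"):
            state = port.find("state")
            if state is not None and state.get("state") == "open":
                service = port.find("service")
                name = service.get("name") if service is not None else None
                ports.append((int(port.get("portid")), name))
        result[addr] = ports
    return result


summary = open_ports(ET.fromstring(SCAN_XML))
print(summary)  # {'64.13.134.52': [(22, 'ssh')]}
```

No Nmap-specific parsing logic is needed here; the parser handles quoting, whitespace, and attribute order, so the program only has to name the elements and attributes it cares about.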
The XML parser that most people are familiar with is the one in
their web browser. Both Internet Explorer and Mozilla/Firefox include capable parsers
that can be used to view Nmap XML data. Using them is as simple as
typing the XML filename or URL into the address bar.
Figure 13.1 shows an example of XML
output rendered by a web browser. How this automatic rendering works and
how to save a permanent copy of an HTML report is covered in
the section called “Creating HTML Reports”.
Nmap XML output can of course be viewed in any text editor or XML
editor. Some spreadsheet programs, including Microsoft Excel, are able
to import Nmap XML data directly for viewing. These general-purpose XML
processors share the limitation that they treat Nmap XML generically,
just like any other XML file. They don't understand the relative
importance of elements, nor how to organize the data for a more useful
presentation. The use of specialized XML processors that make sense of
Nmap XML output is the subject of the following sections.