Library http

Implements the HTTP client protocol in a standard form that Nmap scripts can take advantage of.

Because HTTP has so many uses, there are a number of interfaces to this library. The most obvious and common ones are simply get, post, and head; or, if more control is required, generic_request can be used. These functions do what one would expect. The get_url helper function can be used to parse and retrieve a full URL.

These functions return a table of values, including:

  • status-line - A string representing the status, such as "HTTP/1.1 200 OK"
  • header - An associative array representing the header. Keys are all lowercase, and standard headers, such as 'date', 'content-length', etc. will typically be present.
  • rawheader - A numbered array of the headers, exactly as the server sent them. While header['content-type'] might be 'text/html', rawheader[3] might be 'Content-type: text/html'.
  • cookies - A numbered array of the cookies the server sent. Each cookie is a table with the following keys: name, value, path, domain, and expires.
  • body - The full body, as returned by the server.
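
The relationship between rawheader and header can be illustrated with a minimal sketch in plain Lua. The helper name build_header is hypothetical and is not part of the library; it only shows one way lowercase keys could be derived from the raw header lines.

```lua
-- Hypothetical helper (not part of the http library): derive a
-- lowercase-keyed header table from raw "Name: value" lines.
local function build_header(rawheader)
  local header = {}
  for _, line in ipairs(rawheader) do
    -- Split on the first colon; the value may itself contain colons.
    local name, value = line:match("^([^:]+):%s*(.*)$")
    if name then
      header[name:lower()] = value
    end
  end
  return header
end

-- Sample header lines, written by hand for this sketch.
local rawheader = {
  "Date: Sun, 06 Nov 1994 08:49:37 GMT",
  "Content-Length: 123",
  "Content-type: text/html",
}
local header = build_header(rawheader)
print(rawheader[3])            -- Content-type: text/html
print(header["content-type"])  -- text/html
```

Scripts should normally read header for lookups and fall back to rawheader only when the original case or order matters.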

If a script is planning on making a lot of requests, the pipelining functions can be helpful. pipeline_add queues requests in a table, and pipeline_go performs the requests, returning the results as an array, with the responses in the same order as the requests were added. As a simple example:

	-- Start by defining the 'all' variable as nil
	local all = nil

	-- Add three 'GET' requests to the queue ('GET' is the default method). These requests
	-- are not performed yet. The second parameter represents the 'options' table, which
	-- we don't need.
	all = http.pipeline_add('/book',          nil, all)
	all = http.pipeline_add('/test',          nil, all)
	all = http.pipeline_add('/monkeys',       nil, all)

	-- Perform all three requests, with as much parallelism as Nmap can manage
	local results = http.pipeline_go('nmap.org', 80, all)

At this point, results is an array with three elements. Each element is a table containing the HTTP result, as discussed above.
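
As a rough sketch of what a script might do next, the following plain Lua walks a results array and pairs each response with the path that produced it. The summarize helper and the sample data are assumptions made for illustration; real responses come from the pipeline call.

```lua
-- Hypothetical helper: pair each queued path with its response.
-- Responses arrive in the same order as the requests were queued.
local function summarize(paths, results)
  local lines = {}
  for i, response in ipairs(results) do
    lines[#lines + 1] = string.format("%s -> %s (%d bytes)",
      paths[i], tostring(response.status), #(response.body or ""))
  end
  return lines
end

-- Made-up sample data standing in for real pipeline output.
local paths = { "/book", "/test", "/monkeys" }
local sample_results = {
  { status = "200", body = "hello" },
  { status = "404", body = "" },
  { status = "200", body = "monkeys!" },
}

for _, line in ipairs(summarize(paths, sample_results)) do
  print(line)
end
```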

One more interface provided by the HTTP library helps scripts determine whether or not a page exists. The identify_404 function will try several URLs on the server to determine what the server's 404 pages look like. It will attempt to identify customized 404 pages that may not return the actual status code 404. If successful, the function page_exists can then be used to determine whether or not a page existed.

Some other miscellaneous functions that can come in handy are response_contains, can_use_head, and save_path. See the appropriate documentation for them.

The response to each function is typically a table on success or nil on failure. If a table is returned, the following keys will exist:

  • status-line: The HTTP status line; for example, "HTTP/1.1 200 OK" (note: this is followed by a newline).
  • status: The HTTP status value; for example, "200".
  • header: A table of header values, where the keys are lowercase and the values are exactly what the server sent.
  • rawheader: A list of header values as "name: value" strings, in the exact format and order that the server sent them.
  • cookies: A list of cookies that the server is sending. Each cookie is a table containing the keys name, value, and path. This table can be sent to the server in subsequent requests in the options table to any function (see below).
  • body: The body of the response.

Many of the functions optionally allow an 'options' table. This table can alter the HTTP headers or other values like the timeout. The following are valid values in 'options' (note: not all options will necessarily affect every function):

  • timeout: A timeout used for socket operations.
  • header: A table containing additional headers to be used for the request. For example, options['header']['Content-Type'] = 'text/xml'
  • content: The content of the message (content-length will be added -- set header['Content-Length'] to override). This can be either a string, which will be directly added as the body of the message, or a table, which will have each key=value pair added (like a normal POST request).
  • cookies: A list of cookies as either a string, which will be directly sent, or a table. If it's a table, the following fields are recognized:
      • name
      • value
      • path
  • auth: A table containing the keys username and password, which will be used for HTTP Basic authentication
  • bypass_cache: Do not perform a lookup in the local HTTP cache.
  • no_cache: Do not save the result of this request to the local HTTP cache.
  • no_cache_body: Do not save the body of the response to the local HTTP cache.
  • redirect_ok: Closure that overrides the default redirect_ok used to validate whether to follow HTTP redirects or not. False, if no HTTP redirects should be followed.
The following example shows how to write a custom closure that follows at most 5 consecutive redirects:
  redirect_ok = function(host,port)
    local c = 5
    return function(url)
      if ( c==0 ) then return false end
      c = c - 1
      return true
    end
  end
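
To see how the counter in that closure behaves, here is a small plain-Lua sketch that calls it repeatedly outside of Nmap. The host, port, and URL arguments are placeholders chosen for this illustration (the closure above ignores them), and the options table at the end only uses keys documented above.

```lua
local redirect_ok = function(host, port)
  local c = 5
  return function(url)
    if c == 0 then return false end
    c = c - 1
    return true
  end
end

-- Each call to redirect_ok creates a closure with its own counter.
-- The arguments are placeholders; this closure does not inspect them.
local check = redirect_ok("example.com", 80)
local followed = 0
while check("/next") do
  followed = followed + 1
end
print(followed)  -- 5

-- An options table that carries the custom closure and skips the cache lookup.
local options = {
  redirect_ok = redirect_ok,
  bypass_cache = true,
  header = { ["Content-Type"] = "text/xml" },
}
```

Because the counter lives inside the inner function's upvalue, every new request that builds a fresh closure starts again from 5.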
  

Source: http://nmap.org/svn/nselib/http.lua

Script Arguments

http.useragent

The value of the User-Agent header field sent with requests. By default it is "Mozilla/5.0 (compatible; Nmap Scripting Engine; http://nmap.org/book/nse.html)". A value of the empty string disables sending the User-Agent header field.

http-max-cache-size

The maximum memory size (in bytes) of the cache.

http.pipeline

If set, it represents the number of HTTP requests that will be pipelined (i.e., sent in a single request). This can be set low to make debugging easier, or it can be set high to test how a server reacts (the server's chosen maximum is ignored).

TODO Implement cache system for http pipelines

Functions

can_use_head (host, port, result_404, path)

Determine whether or not the server supports HEAD by requesting / and verifying that it returns 200, and doesn't return data. We implement the check like this because we can't always rely on OPTIONS to tell the truth.

Note: If identify_404 returns a 200 status, HEAD requests should be disabled. Sometimes, servers use a 200 status code with a message explaining that the page wasn't found. In this case, to actually identify a 404 page, we need the full body that a HEAD request doesn't supply. This is determined automatically if the result_404 field is set.

Parameters

  • host: The host object.
  • port: The port to use.
  • result_404: [optional] The result when an unknown page is requested. This is returned by identify_404. If the 404 page returns a 200 code, then we disable HEAD requests.
  • path: The path to request; by default, / is used.

Return values:

  1. A boolean value: true if HEAD is usable, false otherwise.
  2. If HEAD is usable, the result of the HEAD request is returned (so potentially, a script can avoid an extra call to HEAD).
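
A minimal sketch of how a script might choose between HEAD and GET based on this check. The fetch_cheaply helper is hypothetical, and the http library is passed in as a parameter so that the sketch can also be exercised with a stub outside of Nmap.

```lua
-- Hypothetical helper: prefer HEAD when the server supports it, else use GET.
-- `http` is the NSE http library (or a stub with the same function names).
-- `result_404` is the expected-404 code returned by identify_404, if known.
local function fetch_cheaply(http, host, port, path, result_404)
  local head_ok = http.can_use_head(host, port, result_404)
  if head_ok then
    return http.head(host, port, path)
  end
  return http.get(host, port, path)
end
```

Inside an NSE script this would be called as fetch_cheaply(http, host, port, '/some/path', result_404) after loading the library.
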
clean_404 (body)

Try and remove anything that might change within a 404. For example:

  • A file path (includes URI)
  • A time
  • A date
  • An execution time (numbers in general, really)

The intention is that two 404 pages from different URIs and taken hours apart should, whenever possible, look the same.

During this function, we're likely going to over-trim things. This is fine -- we want enough to match on that it'll a) be unique, and b) have the best chance of not changing. Even if we remove bits and pieces from the file, as long as it isn't a significant amount, it'll remain unique.

One case this doesn't cover is if the server generates a random haiku for the user.

Parameters

  • body: The body of the page.
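
The idea can be sketched in a few lines of plain Lua. This is a simplified illustration under stated assumptions, not the library's actual implementation: the name clean_404_sketch and the specific patterns are invented here, and they over-trim on purpose.

```lua
-- Hypothetical, simplified cleaner (not the library's implementation).
local function clean_404_sketch(body)
  body = body:lower()
  -- Drop anything that looks like a path or URI (any run containing '/').
  body = body:gsub("%S*/%S*", "")
  -- Drop all digits, which covers times, dates, and execution times.
  body = body:gsub("%d", "")
  -- Collapse runs of whitespace so spacing differences don't matter.
  body = body:gsub("%s+", " ")
  return body
end

-- Two made-up 404 bodies from different URIs, taken at different times.
local a = clean_404_sketch("Not Found: /a/b.php at 10:32:01 (0.003s)")
local b = clean_404_sketch("Not Found: /zzz/q.html at 23:59:59 (1.250s)")
print(a == b)  -- true
```
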
generic_request (host, port, method, path, options)

Do a single request with a given method. The response is returned as the standard response table (see the module documentation).

The get, head, and post functions are simple wrappers around generic_request.

Any 1XX (informational) responses are discarded.

Parameters

  • host: The host to connect to.
  • port: The port to connect to.
  • method: The method to use; for example, 'GET', 'HEAD', etc.
  • path: The path to retrieve.
  • options: [optional] A table that lets the caller control socket timeouts, HTTP headers, and other parameters. For full documentation, see the module documentation (above).

Return value:

nil if an error occurs; otherwise, a table as described in the module documentation.

get (host, port, path, options)

Fetches a resource with a GET request and returns the result as a table. This is a simple wrapper around generic_request, with the added benefit of having local caching and support for HTTP redirects. Redirects are followed only if they pass all the validation rules of the redirect_ok function. This function may be overridden by supplying a custom function in the redirect_ok field of the options table. The default function follows a redirect only if the destination:

  • Is within the same host or domain
  • Has the same port number
  • Stays within the current scheme
  • Does not exceed MAX_REDIRECT_COUNT redirects

Caching and redirects can be controlled in the options table; see the module documentation for more information.

Parameters

  • host: The host to connect to.
  • port: The port to connect to.
  • path: The path to retrieve.
  • options: [optional] A table that lets the caller control socket timeouts, HTTP headers, and other parameters. For full documentation, see the module documentation (above).

Return value:

nil if an error occurs; otherwise, a table as described in the module documentation.
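
A short sketch of calling get with an options table and handling the nil failure case. The fetch_body helper is hypothetical, and the http library is passed in as a parameter so the sketch can be tried with a stub; only option keys from the module documentation are used.

```lua
-- Hypothetical helper: fetch a page body, skipping the local cache lookup.
-- `http` is the NSE http library (or a stub exposing a compatible get).
local function fetch_body(http, host, port, path)
  local options = {
    header = { ["Accept"] = "text/html" },
    bypass_cache = true,
  }
  local response = http.get(host, port, path, options)
  if not response then
    -- get returns nil if an error occurs.
    return nil, "request failed"
  end
  return response.body, response.status
end
```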

get_status_string (data)

Take the data returned from an HTTP request and return the status string. Useful for stdnse.print_debug messages and even advanced output.

Parameters

  • data: The response table from any HTTP request

Return value:

The best status string we could find: either the actual status string, the status code, or "<unknown status>".
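
The fallback order described above can be sketched in plain Lua. The function name status_string_sketch is hypothetical, and this is an illustration of the documented behaviour rather than the library's own code.

```lua
-- Hypothetical re-creation of the documented fallback order.
local function status_string_sketch(data)
  if data == nil then
    return "<unknown status>"
  end
  if data["status-line"] then
    -- The status line is followed by a newline; keep only the first line.
    return (data["status-line"]:match("^[^\r\n]*"))
  end
  if data.status then
    return tostring(data.status)
  end
  return "<unknown status>"
end

print(status_string_sketch({ ["status-line"] = "HTTP/1.1 200 OK\r\n" }))
print(status_string_sketch({ status = 404 }))
print(status_string_sketch(nil))
```
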
get_url (u, options)

Parses a URL and calls http.get with the result. The URL can contain all the standard fields, protocol://host:port/path

Parameters

  • u: The URL of the host.
  • options: [optional] A table that lets the caller control socket timeouts, HTTP headers, and other parameters. For full documentation, see the module documentation (above).

Return value:

nil if an error occurs; otherwise, a table as described in the module documentation.

head (host, port, path, options)

Fetches a resource with a HEAD request. Like get, this is a simple wrapper around generic_request with response caching. This function also has support for HTTP redirects. Redirects are followed only if they pass all the validation rules of the redirect_ok function. This function may be overridden by supplying a custom function in the redirect_ok field of the options table. The default function follows a redirect only if the destination:

  • Is within the same host or domain
  • Has the same port number
  • Stays within the current scheme
  • Does not exceed MAX_REDIRECT_COUNT redirects

Caching and redirects can be controlled in the options table; see the module documentation for more information.

Parameters

  • host: The host to connect to.
  • port: The port to connect to.
  • path: The path to retrieve.
  • options: [optional] A table that lets the caller control socket timeouts, HTTP headers, and other parameters. For full documentation, see the module documentation (above).

Return value:

nil if an error occurs; otherwise, a table as described in the module documentation.

identify_404 (host, port)

Try requesting a non-existent file to determine how the server responds to unknown pages ("404 pages"), which a) tells us what to expect when a non-existent page is requested, and b) tells us if the server will be impossible to scan. If the server responds with a 404 status code, as it is supposed to, then this function simply returns 404. If it contains one of a series of common status codes, including unauthorized, moved, and others, it is returned like a 404.

I (Ron Bowes) have observed one host that responds differently for three scenarios:

  • A non-existent page, all lowercase (a login page)
  • A non-existent page, with uppercase (a weird error page that says, "Filesystem is corrupt.")
  • A page in a non-existent directory (a login page with different font colours)

As a result, I've devised three different 404 tests, one to check each of these conditions. If they all match, the tests can proceed; if any of them differ, we can't check 404s properly.

Parameters

  • host: The host object.
  • port: The port to which we are establishing the connection.

Return values:

  1. status Did we succeed?
  2. result If status is false, result is an error message. Otherwise, it's the code to expect (typically, but not necessarily, '404').
  3. body Body is a hash of the cleaned-up body that can be used when detecting a 404 page that doesn't return a 404 error code.
page_exists (data, result_404, known_404, page, displayall)

Determine whether or not the page that was returned is a 404 page. This is actually a pretty simple function, but it's best to keep this logic close to identify_404, since they will generally be used together.

Parameters

  • data: The data returned by the HTTP request
  • result_404: The status code to expect for non-existent pages. This is returned by identify_404.
  • known_404: The 404 page itself, if result_404 is 200. If result_404 is something else, this parameter is ignored and can be set to nil. This is returned by identify_404.
  • page: The page being requested (used in error messages).
  • displayall: [optional] If set to true, don't exclude non-404 errors (such as 500).

Return value:

A boolean value: true if the page appears to exist, and false if it does not.
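
Since identify_404 and page_exists are generally used together, the following sketch shows one plausible way to combine them. The check_page helper is hypothetical, and the http library is passed in as a parameter so the flow can be tried with a stub outside of Nmap.

```lua
-- Hypothetical helper: decide whether `path` exists on the server.
-- `http` is the NSE http library (or a stub with the same function names).
local function check_page(http, host, port, path)
  -- Learn what this server's "not found" responses look like.
  local ok, result_404, known_404 = http.identify_404(host, port)
  if not ok then
    -- On failure, the second return value is an error message.
    return nil, result_404
  end

  local response = http.get(host, port, path)
  if not response then
    return nil, "request failed"
  end

  return http.page_exists(response, result_404, known_404, path, false)
end
```
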
parse_date (s)

Parses an HTTP date string, in any of the following formats from section 3.3.1 of RFC 2616:

  • Sun, 06 Nov 1994 08:49:37 GMT (RFC 822, updated by RFC 1123)
  • Sunday, 06-Nov-94 08:49:37 GMT (RFC 850, obsoleted by RFC 1036)
  • Sun Nov 6 08:49:37 1994 (ANSI C's asctime() format)

Parameters

  • s: the date string.

Return value:

a table with keys year, month, day, hour, min, sec, and isdst, relative to GMT, suitable for input to os.time.
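
To make the shape of the returned table concrete, here is a plain-Lua sketch that handles only the first (RFC 1123) format. The function name parse_rfc1123_sketch is hypothetical and the code is not the library's implementation. Note that os.time interprets its argument in the local time zone.

```lua
local MONTHS = {
  Jan = 1, Feb = 2, Mar = 3, Apr = 4, May = 5, Jun = 6,
  Jul = 7, Aug = 8, Sep = 9, Oct = 10, Nov = 11, Dec = 12,
}

-- Hypothetical parser for dates like "Sun, 06 Nov 1994 08:49:37 GMT" only.
local function parse_rfc1123_sketch(s)
  local day, mon, year, hour, min, sec =
    s:match("^%a+, (%d+) (%a+) (%d+) (%d+):(%d+):(%d+) GMT$")
  if not day or not MONTHS[mon] then
    return nil
  end
  return {
    year = tonumber(year), month = MONTHS[mon], day = tonumber(day),
    hour = tonumber(hour), min = tonumber(min), sec = tonumber(sec),
    isdst = false,
  }
end

local t = parse_rfc1123_sketch("Sun, 06 Nov 1994 08:49:37 GMT")
print(t.year, t.month, t.day, t.hour, t.min, t.sec)
-- The table has the fields os.time expects.
print(type(os.time(t)))  -- number
```
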
parse_url (url)

Take a URI or URL in any form and convert it to its component parts. The URL can optionally have a protocol definition ('http://'), a server ('scanme.insecure.org'), a port (':80'), a URI ('/test/file.php'), and a query string ('?username=ron&password=turtle'). At the minimum, a path or protocol and url are required.

Parameters

  • url: The incoming URL to parse

Return value:

result A table containing the result, which can have the following fields: protocol, hostname, port, uri, querystring. All fields are strings except querystring, which is a table containing name=value pairs.
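
As an illustration of the querystring field, this plain-Lua sketch splits a query string into a table of name=value pairs. The helper name parse_query_sketch is hypothetical; it does no percent-decoding and is not the library's implementation.

```lua
-- Hypothetical helper: split "a=1&b=2" into { a = "1", b = "2" }.
local function parse_query_sketch(query)
  local result = {}
  for name, value in query:gmatch("([^&=]+)=([^&]*)") do
    result[name] = value
  end
  return result
end

local qs = parse_query_sketch("username=ron&password=turtle")
print(qs.username, qs.password)
```
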
parse_www_authenticate (s)

Parses the WWW-Authenticate header as described in RFC 2616, section 14.47 and RFC 2617, section 1.2. The return value is an array of challenges. Each challenge is a table with the keys scheme and params.

Parameters

  • s: The header value text.

Return value:

An array of challenges, or nil on error.
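
A small sketch of how a script might consume the returned array. The offers_scheme helper is hypothetical, the sample challenges are hand-written, and the layout of params (a table of name/value pairs) is an assumption made for this illustration.

```lua
-- Hypothetical helper (not part of the http library): report whether a given
-- authentication scheme appears among the parsed challenges.
local function offers_scheme(challenges, wanted)
  for _, challenge in ipairs(challenges or {}) do
    if challenge.scheme:lower() == wanted:lower() then
      return true, challenge.params
    end
  end
  return false
end

-- Hand-written sample in the documented shape (keys scheme and params).
-- The contents of `params` are assumed for this sketch.
local challenges = {
  { scheme = "Basic",  params = { realm = "WallyWorld" } },
  { scheme = "Digest", params = { realm = "WallyWorld", qop = "auth" } },
}
local found, params = offers_scheme(challenges, "basic")
print(found, params and params.realm)
```
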
pipeline_add (path, options, all_requests, method)

Adds a pending request to the HTTP pipeline. The HTTP pipeline is a set of requests that will all be sent at the same time, or as close as the server allows. This allows more efficient code, since requests are automatically buffered and sent simultaneously.

The all_requests argument contains the current list of queued requests (if this is the first time calling pipeline_add, it should be nil). After adding the request to the end of the queue, the queue is returned and can be passed to the next pipeline_add call.

When all requests have been queued, call pipeline_go with the all_requests table that has been built.

Parameters

  • path: The path to retrieve.
  • options: [optional] A table that lets the caller control socket timeouts, HTTP headers, and other parameters. For full documentation, see the module documentation (above).
  • all_requests: [optional] The current pipeline queue (returned from a previous pipeline_add call), or nil if it's the first call.
  • method: [optional] The HTTP method ('get', 'head', 'post', etc). Default: 'get'.

Return value:

The table of queued pipeline requests, including this new one.

pipeline_go (host, port, all_requests)

Performs all queued requests in the all_requests variable (created by the pipeline_add function). Returns an array of responses, each of which is a table as defined in the module documentation above.

Parameters

  • host: The host to connect to.
  • port: The port to connect to.
  • all_requests: A table with all the previously built pipeline requests

Return value:

A list of responses, in the same order as the requests were queued. Each response is a table as described in the module documentation.
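
Putting pipeline_add and pipeline_go together, a script might wrap the queue-then-send pattern in a helper like the one below. The fetch_all name is hypothetical, and the http library is passed in as a parameter so the sketch can be exercised with a stub outside of Nmap.

```lua
-- Hypothetical helper: queue one request per path, send them all, and
-- return the responses keyed by path.
-- `http` is the NSE http library (or a stub with the same function names).
local function fetch_all(http, host, port, paths)
  local all = nil
  for _, path in ipairs(paths) do
    -- The queue is returned each time and fed into the next call.
    all = http.pipeline_add(path, nil, all)
  end

  local responses = http.pipeline_go(host, port, all)

  -- Responses come back in the same order as the requests were queued.
  local by_path = {}
  for i, path in ipairs(paths) do
    by_path[path] = responses and responses[i]
  end
  return by_path
end
```
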
post (host, port, path, options, ignored, postdata)

Fetches a resource with a POST request. Like get, this is a simple wrapper around generic_request except that postdata is handled properly.

Parameters

  • host: The host to connect to.
  • port: The port to connect to.
  • path: The path to retrieve.
  • options: [optional] A table that lets the caller control socket timeouts, HTTP headers, and other parameters. For full documentation, see the module documentation (above).
  • ignored: Ignored for backwards compatibility.
  • postdata: A string or a table of data to be posted. If a table, the keys and values must be strings, and they will be encoded into an application/x-www-form-urlencoded form submission.

Return value:

nil if an error occurs; otherwise, a table as described in the module documentation.
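
To show what encoding a postdata table into a form submission involves, here is a plain-Lua sketch. The urlencode and encode_form helpers are hypothetical and are not the library's implementation; in particular, this sketch writes a space as %20, and the library's exact escaping may differ.

```lua
-- Hypothetical helper: percent-encode everything except unreserved characters.
local function urlencode(s)
  return (s:gsub("[^%w%-%._~]", function(c)
    return string.format("%%%02X", string.byte(c))
  end))
end

-- Hypothetical helper: turn a table of string keys and values into a
-- "name=value&name=value" body.
local function encode_form(fields)
  local keys = {}
  for k in pairs(fields) do
    keys[#keys + 1] = k
  end
  table.sort(keys)  -- sorted only to make the example output deterministic

  local parts = {}
  for _, k in ipairs(keys) do
    parts[#parts + 1] = urlencode(k) .. "=" .. urlencode(fields[k])
  end
  return table.concat(parts, "&")
end

print(encode_form({ username = "ron", password = "tur tle&1" }))
-- password=tur%20tle%261&username=ron
```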

put (host, port, path, options, putdata)

Uploads a file using the PUT method and returns a result table. This is a simple wrapper around generic_request.

Parameters

  • host: The host to connect to.
  • port: The port to connect to.
  • path: The path to retrieve.
  • options: [optional] A table that lets the caller control socket timeouts, HTTP headers, and other parameters. For full documentation, see the module documentation (above).
  • putdata: The contents of the file to upload

Return value:

nil if an error occurs; otherwise, a table as described in the module documentation.

response_contains (response, pattern, case_sensitive)

Check if the response variable, which could be a return from an http.get, http.post, http.pipeline_go, etc., contains the given text. The text can be:

  • Part of a header ('content-type', 'text/html', '200 OK', etc)
  • An entire header ('Content-type: text/html', 'Content-length: 123', etc)
  • Part of the body

The search text is treated as a Lua pattern.

Parameters

  • response: The full response table from an HTTP request.
  • pattern: The pattern we're searching for. Don't forget to escape '-', for example, 'Content%-type'. The pattern can also contain captures, like 'abc(.*)def', which will be returned if successful.
  • case_sensitive: [optional] Set to true for case-sensitive searches. Default: not case sensitive.

Return values:

  1. result True if the string matched, false otherwise
  2. matches An array of captures from the match, if any
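
The reason '-' must be escaped is that it is a repetition operator in Lua patterns. The following plain-Lua sketch uses only the standard string functions (not response_contains itself) to show the difference, along with a capture.

```lua
local header_line = "Content-type: text/html"

-- Unescaped, "t-" means "as few 't' characters as possible", so the
-- literal hyphen in the header is never matched.
print(header_line:find("Content-type"))   -- nil

-- Escaped with '%', the hyphen is matched literally.
print(header_line:find("Content%-type"))  -- 1  12

-- Captures return the text matched inside the parentheses.
local capture = ("abcXYZdef"):match("abc(.*)def")
print(capture)  -- XYZ
```
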
save_path (host, port, path, status, links_to, linked_from, contenttype)

This function should be called whenever a valid path (a path that doesn't contain a known 404 page) is discovered. It will add the path to the registry in several ways, allowing other scripts to take advantage of it in interesting ways.

Parameters

  • host: The host the path was discovered on (not necessarily the host being scanned).
  • port: The port the path was discovered on (not necessarily the port being scanned).
  • path: The path discovered. Calling this more than once with the same path is okay; it'll update the data as much as possible instead of adding a duplicate entry
  • status: [optional] The status code (200, 404, 500, etc). This can be left off if it isn't known.
  • links_to: [optional] A table of paths that this page links to.
  • linked_from: [optional] A table of paths that link to this page.
  • contenttype: [optional] The content-type value for the path, if it's known.
