Library http

Implements the HTTP client protocol in a standard form that Nmap scripts can take advantage of.

Because HTTP has so many uses, there are a number of interfaces to this library. The most obvious and common ones are simply get, post, and head; or, if more control is required, generic_request can be used. These functions do what one would expect. The get_url helper function can be used to parse and retrieve a full URL.

HTTPS support is transparent. The library uses comm.tryssl to determine whether SSL is required for a request.

These functions return a table of values, including:

  • status-line - A string representing the status, such as "HTTP/1.1 200 OK". In case of an error, a description will be provided in this line.
  • status - The HTTP status value; for example, "200". If an error occurs during a request, this value is nil.
  • header - An associative array representing the header. Keys are all lowercase, and standard headers, such as 'date', 'content-length', etc. will typically be present.
  • rawheader - A numbered array of the headers, exactly as the server sent them. While header['content-type'] might be 'text/html', rawheader[3] might be 'Content-type: text/html'.
  • cookies - A numbered array of the cookies the server sent. Each cookie is a table with the following keys: name, value, path, domain, and expires.
  • body - The full body, as returned by the server.
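As a minimal sketch of how a script might read such a table: the `response` value below is hand-built sample data shaped like the fields listed above (it is not real server output), and `describe` is a hypothetical helper that is not part of the library.

```lua
-- Hypothetical sample shaped like the documented response table.
local response = {
  ["status-line"] = "HTTP/1.1 200 OK",
  status = 200,
  header = { ["content-type"] = "text/html", ["content-length"] = "13" },
  rawheader = { "Content-Type: text/html", "Content-Length: 13" },
  cookies = { { name = "session", value = "abc123", path = "/" } },
  body = "<html></html>",
}

-- Hypothetical helper: summarize a response, handling the error case
-- where status is nil and status-line carries the error description.
local function describe(resp)
  if resp.status == nil then
    return "request failed: " .. tostring(resp["status-line"])
  end
  local ctype = resp.header["content-type"] or "unknown"
  return string.format("%s (%s, %d body bytes)",
    tostring(resp.status), ctype, #resp.body)
end

print(describe(response))
print(describe({ ["status-line"] = "connection refused" }))
```

Checking `status` for nil before touching `header` or `body` matters because, per the description above, a failed request only guarantees an error description in `status-line`.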

If a script is planning on making a lot of requests, the pipelining functions can be helpful. pipeline_add queues requests in a table, and pipeline_go performs the requests, returning the results as an array, with the responses in the same order as the queries were added. As a simple example:

 -- Start by defining the 'all' variable as nil
 local all = nil

 -- Add two 'GET' requests and one 'HEAD' to the queue. These requests are not performed
 -- yet. The second parameter represents the 'options' table, which we don't need.
 all = http.pipeline_add('/book',    nil, all)
 all = http.pipeline_add('/test',    nil, all)
 all = http.pipeline_add('/monkeys', nil, all, 'HEAD')

 -- Perform all three requests with as much parallelism as Nmap can manage
 local results = http.pipeline_go('nmap.org', 80, all)

At this point, results is an array with three elements. Each element is a table containing the HTTP result, as discussed above.

One more interface provided by the HTTP library helps scripts determine whether or not a page exists. The identify_404 function will try several URLs on the server to determine what the server's 404 pages look like. It will attempt to identify customized 404 pages that may not return the actual status code 404. If successful, the function page_exists can then be used to determine whether or not a page existed.

Some other miscellaneous functions that can come in handy are response_contains, can_use_head, and save_path. See the appropriate documentation for them.

The response to each function is typically a table with the following keys:

  • status-line: The HTTP status line; for example, "HTTP/1.1 200 OK" (note: this is followed by a newline). In case of an error, a description will be provided in this line.
  • status: The HTTP status value; for example, "200". If an error occurs during a request, this value is nil.
  • header: A table of header values, where the keys are lowercase and the values are exactly what the server sent.
  • rawheader: A list of header values as "name: value" strings, in the exact format and order that the server sent them.
  • cookies: A list of cookies that the server is sending. Each cookie is a table containing the keys name, value, and path. This table can be sent back to the server in subsequent requests through the options table of any function (see below).
  • body: The body of the response.

Many of the functions optionally allow an 'options' table. This table can alter the HTTP headers or other values like the timeout. The following are valid values in 'options' (note: not all options will necessarily affect every function):

  • timeout: A timeout used for socket operations.
  • header: A table containing additional headers to be used for the request. For example, options['header']['Content-Type'] = 'text/xml'
  • content: The content of the message (content-length will be added -- set header['Content-Length'] to override). This can be either a string, which will be directly added as the body of the message, or a table, which will have each key=value pair added (like a normal POST request).
  • cookies: A list of cookies as either a string, which will be directly sent, or a table. If it's a table, the following fields are recognized:
      • name
      • value
      • path
      • expires
    Only the name and value fields are required.
  • auth: A table containing the keys username and password, which will be used for HTTP Basic authentication.
If a server requires HTTP Digest authentication, then there must also be a key digest, with value true.
  • bypass_cache: Do not perform a lookup in the local HTTP cache.
  • no_cache: Do not save the result of this request to the local HTTP cache.
  • no_cache_body: Do not save the body of the response to the local HTTP cache.
  • redirect_ok: Closure that overrides the default redirect_ok used to validate whether to follow HTTP redirects or not. False, if no HTTP redirects should be followed.
The following example shows how to write a custom closure that follows 5 consecutive redirects:
  redirect_ok = function(host,port)
    local c = 5
    return function(url)
      if ( c==0 ) then return false end
      c = c - 1
      return true
    end
  end
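The sketch below puts several of these options into one table and exercises a redirect closure written in the style of the example above. It is an illustration only: `limited_redirects` is a hypothetical helper, every value is made up, and the millisecond unit of `timeout` is an assumption rather than something this page states.

```lua
-- Factory in the style of the example above: each call to the returned
-- function yields a fresh closure with its own countdown.
local function limited_redirects(max)
  return function(host, port)
    local remaining = max
    return function(url)
      if remaining == 0 then return false end
      remaining = remaining - 1
      return true
    end
  end
end

-- An options table combining several of the fields listed above.
-- All values are made up for illustration.
local options = {
  timeout = 5000,                       -- assumed to be milliseconds
  header = { ["Accept"] = "text/html" },
  cookies = { { name = "session", value = "abc123" } },
  bypass_cache = true,
  redirect_ok = limited_redirects(2),
}

-- Simulate how the closure behaves across three consecutive redirects.
local check = options.redirect_ok("example.com", 80)
print(check("/a"), check("/b"), check("/c"))
```

Because the counter lives inside the inner closure, each request that calls `redirect_ok(host, port)` starts with a full allowance instead of sharing one counter across requests.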
  

Source: http://nmap.org/svn/nselib/http.lua

Script Arguments

http.useragent

The value of the User-Agent header field sent with requests. By default it is "Mozilla/5.0 (compatible; Nmap Scripting Engine; http://nmap.org/book/nse.html)". A value of the empty string disables sending the User-Agent header field.

http.max-cache-size

The maximum memory size (in bytes) of the cache.

http.max-pipeline

If set, it represents the number of outstanding HTTP requests that should be pipelined. Defaults to http.pipeline (if set), or to the value that the getPipelineMax function returns.

TODO Implement cache system for http pipelines

http.pipeline

If set, it represents the number of HTTP requests that'll be sent on one connection. This can be set low to make debugging easier, or it can be set high to test how a server reacts (its chosen max is ignored).

Functions

can_use_head (host, port, result_404, path)

Determine whether or not the server supports HEAD.

clean_404 (body)

Try to remove anything that might change within a 404.

generic_request (host, port, method, path, options)

Do a single request with a given method. The response is returned as the standard response table (see the module documentation).

get (host, port, path, options)

Fetches a resource with a GET request and returns the result as a table.

get_status_string (data)

Take the data returned from an HTTP request and return the status string. Useful for stdnse.debug messages and even advanced output.

get_url (u, options)

Parses a URL and calls http.get with the result. The URL can contain all the standard fields, protocol://host:port/path

grab_forms (body)

Finds forms in HTML code.

head (host, port, path, options)

Fetches a resource with a HEAD request.

identify_404 (host, port)

Try requesting a non-existent file to determine how the server responds to unknown pages ("404 pages")

page_exists (data, result_404, known_404, page, displayall)

Determine whether or not the page that was returned is a 404 page.

parse_date (s)

Parses an HTTP date string

parse_form (form)

Parses a form, that is, finds its action and fields.

parse_redirect (host, port, path, response)

Handles an HTTP redirect.

parse_url (url)

Take a URI or URL in any form and convert it to its component parts.

parse_www_authenticate (s)

Parses the WWW-Authenticate header as described in RFC 2616, section 14.47 and RFC 2617, section 1.2.

pipeline_add (path, options, all_requests, method)

Adds a pending request to the HTTP pipeline.

pipeline_go (host, port, all_requests)

Performs all queued requests in the all_requests variable (created by the pipeline_add function).

post (host, port, path, options, ignored, postdata)

Fetches a resource with a POST request.

put (host, port, path, options, putdata)

Uploads a file using the PUT method and returns a result table. This is a simple wrapper around generic_request

response_contains (response, pattern, case_sensitive)

Check if the response variable contains the given text.

save_path (host, port, path, status, links_to, linked_from, contenttype)

This function should be called whenever a valid path (a path that doesn't contain a known 404 page) is discovered.

tag_pattern (tag, endtag)

Create a pattern to find a tag



Functions

can_use_head (host, port, result_404, path)

Determine whether or not the server supports HEAD.

Tests by requesting / and verifying that it returns 200 and doesn't return data. We implement the check like this because we can't always rely on OPTIONS to tell the truth.

Note: If identify_404 returns a 200 status, HEAD requests should be disabled. Sometimes, servers use a 200 status code with a message explaining that the page wasn't found. In this case, to actually identify a 404 page, we need the full body that a HEAD request doesn't supply. This is determined automatically if the result_404 field is set.

Parameters

  • host: The host object.
  • port: The port to use.
  • result_404: [optional] The result when an unknown page is requested. This is returned by identify_404. If the 404 page returns a 200 code, then we disable HEAD requests.
  • path: The path to request; by default, / is used.

Return values:

  1. A boolean value: true if HEAD is usable, false otherwise.
  2. If HEAD is usable, the result of the HEAD request is returned (so potentially, a script can avoid an extra call to HEAD).

clean_404 (body)

Try to remove anything that might change within a 404.

For example:

  • A file path (includes URI)
  • A time
  • A date
  • An execution time (numbers in general, really)

The intention is that two 404 pages from different URIs and taken hours apart should, whenever possible, look the same.

During this function, we're likely going to over-trim things. This is fine -- we want enough to match on that it'll a) be unique, and b) have the best chance of not changing. Even if we remove bits and pieces from the file, as long as it isn't a significant amount, it'll remain unique.

One case this doesn't cover is if the server generates a random haiku for the user.

Parameters

  • body: The body of the page.
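To make the idea concrete, here is a rough sketch of this kind of normalization. It is not the library's code: `rough_clean` is a hypothetical function that only removes the requested path and every run of digits, which is far cruder than what the description above implies.

```lua
-- Hypothetical illustration, not the library's code: remove the requested
-- path and any run of digits so that two 404 bodies compare equal.
local function rough_clean(body, path)
  if path ~= "" then
    -- Plain-text search (4th argument true disables pattern matching).
    local start, stop = body:find(path, 1, true)
    while start do
      body = body:sub(1, start - 1) .. body:sub(stop + 1)
      start, stop = body:find(path, start, true)
    end
  end
  -- Times, dates, and execution times are mostly digits; drop them all.
  return (body:gsub("%d+", ""))
end

local a = rough_clean("Not found: /abc (0.013s at 10:42:07)", "/abc")
local b = rough_clean("Not found: /xyz123 (0.200s at 23:59:59)", "/xyz123")
print(a == b)
```

Over-trimming is visible here: all digits vanish, yet the remaining text is still enough to recognize the same error page on a second request.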
generic_request (host, port, method, path, options)

Do a single request with a given method. The response is returned as the standard response table (see the module documentation).

The get, head, and post functions are simple wrappers around generic_request.

Any 1XX (informational) responses are discarded.

Parameters

  • host: The host to connect to.
  • port: The port to connect to.
  • method: The method to use; for example, 'GET', 'HEAD', etc.
  • path: The path to retrieve.
  • options: [optional] A table that lets the caller control socket timeouts, HTTP headers, and other parameters. For full documentation, see the module documentation (above).

Return value:

A response table, see module documentation for description.

get (host, port, path, options)

Fetches a resource with a GET request and returns the result as a table.

This is a simple wrapper around generic_request, with the added benefit of having local caching and support for HTTP redirects. Redirects are followed only if they pass all the validation rules of the redirect_ok function. This function may be overridden by supplying a custom function in the redirect_ok field of the options array. The default function redirects the request if the destination is:

  • Within the same host or domain
  • Has the same port number
  • Stays within the current scheme
  • Does not exceed MAX_REDIRECT_COUNT count of redirects

Caching and redirects can be controlled in the options array, see module documentation for more information.

Parameters

  • host: The host to connect to.
  • port: The port to connect to.
  • path: The path to retrieve.
  • options: [optional] A table that lets the caller control socket timeouts, HTTP headers, and other parameters. For full documentation, see the module documentation (above).

Return value:

A response table, see module documentation for description.

get_status_string (data)

Take the data returned from an HTTP request and return the status string. Useful for stdnse.debug messages and even advanced output.

Parameters

  • data: The response table from any HTTP request

Return value:

The best status string we could find: either the actual status string, the status code, or "<unknown status>".
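The fallback order described in the return value can be sketched as follows. `status_string` is a hypothetical re-creation for illustration; the library's own implementation may differ in detail.

```lua
-- Hypothetical re-creation of the documented fallback order:
-- status line first, then status code, then a fixed placeholder.
local function status_string(data)
  if type(data) ~= "table" then return "<unknown status>" end
  local line = data["status-line"]
  if line then
    -- Trim the trailing newline or whitespace that may follow the line.
    return (line:gsub("%s+$", ""))
  end
  if data.status then return tostring(data.status) end
  return "<unknown status>"
end

print(status_string({ ["status-line"] = "HTTP/1.1 404 Not Found\r\n", status = 404 }))
print(status_string({ status = 500 }))
print(status_string({}))
```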
get_url (u, options)

Parses a URL and calls http.get with the result. The URL can contain all the standard fields, protocol://host:port/path

Parameters

  • u: The URL of the host.
  • options: [optional] A table that lets the caller control socket timeouts, HTTP headers, and other parameters. For full documentation, see the module documentation (above).

Return value:

A response table, see module documentation for description.

grab_forms (body)

Finds forms in HTML code.

Returns a table of the forms found, in plaintext.

Parameters

  • body: A response.body in which to search for forms

Return value:

A list of forms.

head (host, port, path, options)

Fetches a resource with a HEAD request.

Like get, this is a simple wrapper around generic_request with response caching. This function also has support for HTTP redirects. Redirects are followed only if they pass all the validation rules of the redirect_ok function. This function may be overridden by supplying a custom function in the redirect_ok field of the options array. The default function redirects the request if the destination is:

  • Within the same host or domain
  • Has the same port number
  • Stays within the current scheme
  • Does not exceed MAX_REDIRECT_COUNT count of redirects

Caching and redirects can be controlled in the options array, see module documentation for more information.

Parameters

  • host: The host to connect to.
  • port: The port to connect to.
  • path: The path to retrieve.
  • options: [optional] A table that lets the caller control socket timeouts, HTTP headers, and other parameters. For full documentation, see the module documentation (above).

Return value:

A response table, see module documentation for description.

identify_404 (host, port)

Try requesting a non-existent file to determine how the server responds to unknown pages ("404 pages")

This tells us

  • what to expect when a non-existent page is requested, and
  • if the server will be impossible to scan.

If the server responds with a 404 status code, as it is supposed to, then this function simply returns 404. If the response carries one of a series of common status codes, including unauthorized, moved, and others, that code is returned and treated like a 404.

I (Ron Bowes) have observed one host that responds differently for three scenarios:

  • A non-existent page, all lowercase (a login page)
  • A non-existent page, with uppercase (a weird error page that says, "Filesystem is corrupt.")
  • A page in a non-existent directory (a login page with different font colours)

As a result, I've devised three different 404 tests, one to check each of these conditions. If they all match, the tests can proceed; if any of them differ, we can't check 404s properly.

Parameters

  • host: The host object.
  • port: The port to which we are establishing the connection.

Return values:

  1. status Did we succeed?
  2. result If status is false, result is an error message. Otherwise, it's the code to expect (typically, but not necessarily, '404').
  3. body Body is a hash of the cleaned-up body that can be used when detecting a 404 page that doesn't return a 404 error code.

page_exists (data, result_404, known_404, page, displayall)

Determine whether or not the page that was returned is a 404 page.

This is actually a pretty simple function, but it's best to keep this logic close to identify_404, since they will generally be used together.

Parameters

  • data: The data returned by the HTTP request
  • result_404: The status code to expect for non-existent pages. This is returned by identify_404.
  • known_404: The 404 page itself, if result_404 is 200. If result_404 is something else, this parameter is ignored and can be set to nil. This is returned by identify_404.
  • page: The page being requested (used in error messages).
  • displayall: [optional] If set to true, don't exclude non-404 errors (such as 500).

Return value:

A boolean value: true if the page appears to exist, and false if it does not.
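A hedged sketch of how identify_404, get, and page_exists might be combined, using only the signatures documented on this page. `path_exists` is a hypothetical wrapper, and the library is passed in as the `http` argument so the function can also be exercised with a stand-in table outside Nmap.

```lua
-- Sketch: decide whether `path` exists on a server by first learning
-- what the server's "not found" responses look like.
-- `http` is the library (or a stand-in exposing the same three functions).
local function path_exists(http, host, port, path)
  local ok, result_404, known_404 = http.identify_404(host, port)
  if not ok then
    -- When the first value is false, the second is an error message.
    return nil, result_404
  end
  local response = http.get(host, port, path)
  -- displayall = false: non-404 errors such as 500 count as "not there".
  return http.page_exists(response, result_404, known_404, path, false)
end
```

Inside an NSE script the first argument would be the library itself, for example `path_exists(require "http", host, port, "/admin/")`. A script probing many paths would call identify_404 once and reuse its results rather than repeating it per path as this short wrapper does.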
parse_date (s)

Parses an HTTP date string

Supports any of the following formats from section 3.3.1 of RFC 2616:

  • Sun, 06 Nov 1994 08:49:37 GMT (RFC 822, updated by RFC 1123)
  • Sunday, 06-Nov-94 08:49:37 GMT (RFC 850, obsoleted by RFC 1036)
  • Sun Nov 6 08:49:37 1994 (ANSI C's asctime() format)

Parameters

  • s: the date string.

Return value:

a table with keys year, month, day, hour, min, sec, and isdst, relative to GMT, suitable for input to os.time.
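For illustration, the first of the three formats can be parsed with a single Lua pattern. This is a hypothetical sketch covering only the RFC 1123 form, not the library's implementation; `parse_rfc1123` and `MONTHS` are names invented here.

```lua
-- Hypothetical sketch covering only the RFC 1123 form,
-- e.g. "Sun, 06 Nov 1994 08:49:37 GMT".
local MONTHS = { Jan = 1, Feb = 2, Mar = 3, Apr = 4, May = 5, Jun = 6,
                 Jul = 7, Aug = 8, Sep = 9, Oct = 10, Nov = 11, Dec = 12 }

local function parse_rfc1123(s)
  local day, mon, year, hour, min, sec =
    s:match("^%a+, (%d%d) (%a%a%a) (%d%d%d%d) (%d%d):(%d%d):(%d%d) GMT$")
  if not day or not MONTHS[mon] then return nil end
  -- Same keys as the table described in the return value above.
  return {
    year = tonumber(year), month = MONTHS[mon], day = tonumber(day),
    hour = tonumber(hour), min = tonumber(min), sec = tonumber(sec),
    isdst = false,
  }
end

local t = parse_rfc1123("Sun, 06 Nov 1994 08:49:37 GMT")
print(t.year, t.month, t.day, t.hour, t.min, t.sec)
```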
parse_form (form)

Parses a form, that is, finds its action and fields.

Parameters

  • form: A plaintext representation of form

Return value:

A dictionary with the keys action, method (if one is specified), and fields. fields is a list of the fields found in the form, each of which has a name attribute and, if specified, a type.

parse_redirect (host, port, path, response)

Handles an HTTP redirect.

Parameters

  • host: table as received by the script action function
  • port: table as received by the script action function
  • path: string
  • response: table as returned by http.get or http.head

Return value:

url table as returned by url.parse or nil if there's no redirect taking place

parse_url (url)

Take a URI or URL in any form and convert it to its component parts.

The URL can optionally have a protocol definition ('http://'), a server ('scanme.insecure.org'), a port (':80'), a URI ('/test/file.php'), and a query string ('?username=ron&password=turtle'). At the minimum, a path or protocol and url are required.

Parameters

  • url: The incoming URL to parse

Return value:

A table containing the result, which can have the following fields:

  • protocol
  • hostname
  • port
  • uri
  • querystring

All fields are strings except querystring, which is a table containing name=value pairs.

parse_www_authenticate (s)

Parses the WWW-Authenticate header as described in RFC 2616, section 14.47 and RFC 2617, section 1.2.

The return value is an array of challenges. Each challenge is a table with the keys scheme and params.

Parameters

  • s: The header value text.

Return value:

An array of challenges, or nil on error.
pipeline_add (path, options, all_requests, method)

Adds a pending request to the HTTP pipeline.

The HTTP pipeline is a set of requests that will all be sent at the same time, or as close as the server allows. This allows more efficient code, since requests are automatically buffered and sent simultaneously.

The all_requests argument contains the current list of queued requests (if this is the first time calling pipeline_add, it should be nil). After adding the request to the end of the queue, the queue is returned and can be passed to the next pipeline_add call.

When all requests have been queued, call pipeline_go with the all_requests table that has been built.

Parameters

  • path: The path to retrieve.
  • options: [optional] A table that lets the caller control socket timeouts, HTTP headers, and other parameters. For full documentation, see the module documentation (above).
  • all_requests: [optional] The current pipeline queue (returned from a previous pipeline_add call), or nil if it's the first call.
  • method: [optional] The HTTP method ('GET', 'HEAD', 'POST', etc). Default: 'GET'.

Return value:

A table with the pipeline requests queued so far (plus this new one).

pipeline_go (host, port, all_requests)

Performs all queued requests in the all_requests variable (created by the pipeline_add function).

Returns an array of responses, each of which is a table as defined in the module documentation above.

Parameters

  • host: The host to connect to.
  • port: The port to connect to.
  • all_requests: A table with all the previously built pipeline requests

Return value:

A list of responses, in the same order as the requests were queued. Each response is a table as described in the module documentation.

post (host, port, path, options, ignored, postdata)

Fetches a resource with a POST request.

Like get, this is a simple wrapper around generic_request except that postdata is handled properly.

Parameters

  • host: The host to connect to.
  • port: The port to connect to.
  • path: The path to retrieve.
  • options: [optional] A table that lets the caller control socket timeouts, HTTP headers, and other parameters. For full documentation, see the module documentation (above).
  • ignored: Ignored for backwards compatibility.
  • postdata: A string or a table of data to be posted. If a table, the keys and values must be strings, and they will be encoded into an application/x-www-form-urlencoded form submission.

Return value:

A response table, see module documentation for description.
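To show what encoding a postdata table into a form submission involves, here is a standalone sketch. `form_escape` and `form_encode` are hypothetical helpers written for this illustration and are not the library's code; the library performs its own encoding when given a table.

```lua
-- Hypothetical helper: percent-encode a string for
-- application/x-www-form-urlencoded (a space becomes "+").
local function form_escape(s)
  s = s:gsub("[^%w%-%._~ ]", function(c)
    return string.format("%%%02X", string.byte(c))
  end)
  return (s:gsub(" ", "+"))
end

-- Encode a table of string keys and values. Keys are sorted only to make
-- the output deterministic; pairs() order is otherwise unspecified.
local function form_encode(fields)
  local keys = {}
  for k in pairs(fields) do keys[#keys + 1] = k end
  table.sort(keys)
  local parts = {}
  for _, k in ipairs(keys) do
    parts[#parts + 1] = form_escape(k) .. "=" .. form_escape(fields[k])
  end
  return table.concat(parts, "&")
end

print(form_encode({ username = "ron", password = "tur tle&1" }))
```

The escaping step is the reason keys and values must be strings: a literal "&" or "=" inside a value would otherwise be read as a field separator by the server.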

put (host, port, path, options, putdata)

Uploads a file using the PUT method and returns a result table. This is a simple wrapper around generic_request

Parameters

  • host: The host to connect to.
  • port: The port to connect to.
  • path: The path to retrieve.
  • options: [optional] A table that lets the caller control socket timeouts, HTTP headers, and other parameters. For full documentation, see the module documentation (above).
  • putdata: The contents of the file to upload

Return value:

A response table, see module documentation for description.

response_contains (response, pattern, case_sensitive)

Check if the response variable contains the given text.

The response variable can be the return value of http.get or http.post, an element of the list returned by http.pipeline_go, etc. The text can be:

  • Part of a header ('content-type', 'text/html', '200 OK', etc)
  • An entire header ('Content-type: text/html', 'Content-length: 123', etc)
  • Part of the body

The search text is treated as a Lua pattern.

Parameters

  • response: The full response table from an HTTP request.
  • pattern: The pattern we're searching for. Don't forget to escape '-', for example, 'Content%-type'. The pattern can also contain captures, like 'abc(.*)def', which will be returned if successful.
  • case_sensitive: [optional] Set to true for case-sensitive searches. Default: not case sensitive.

Return values:

  1. result True if the string matched, false otherwise
  2. matches An array of captures from the match, if any
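The advice about escaping '-' follows from how Lua patterns work, which plain string.match can demonstrate without the library. The header line below is made-up sample data, and lowercasing the subject is only a stand-in for the library's default case-insensitive search.

```lua
-- Sample header line; made up for illustration.
local header_line = "Content-Type: text/html; charset=utf-8"
local subject = header_line:lower()

-- Unescaped, "-" is a lazy repetition operator: "content-type" means
-- "conten", then as few "t" as possible, then "type". It does not match.
local unescaped = string.match(subject, "content-type")

-- Escaping the hyphen with "%" makes it match literally.
local escaped = string.match(subject, "content%-type")

-- A capture returns only the captured part.
local captured = string.match(subject, "content%-type: ([^;]+)")

print(unescaped, escaped, captured)
```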
save_path (host, port, path, status, links_to, linked_from, contenttype)

This function should be called whenever a valid path (a path that doesn't contain a known 404 page) is discovered.

It will add the path to the registry in several ways, allowing other scripts to take advantage of it in interesting ways.

Parameters

  • host: The host the path was discovered on (not necessarily the host being scanned).
  • port: The port the path was discovered on (not necessarily the port being scanned).
  • path: The path discovered. Calling this more than once with the same path is okay; it'll update the data as much as possible instead of adding a duplicate entry
  • status: [optional] The status code (200, 404, 500, etc). This can be left off if it isn't known.
  • links_to: [optional] A table of paths that this page links to.
  • linked_from: [optional] A table of paths that link to this page.
  • contenttype: [optional] The content-type value for the path, if it's known.

tag_pattern (tag, endtag)

Create a pattern to find a tag

Case-insensitive search for tags

Parameters

  • tag: The name of the tag to find
  • endtag: Boolean true if you are looking for an end tag, otherwise it will look for a start tag

Return value:

A pattern to find the tag
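Lua patterns have no case-insensitive flag, so a pattern like this has to spell out both cases of every letter. The sketch below shows one way to do that. `simple_tag_pattern` is a hypothetical simplification and not the library's code; in particular, it would also match longer tag names that begin with the same letters.

```lua
-- Hypothetical sketch: build a Lua pattern that matches a start or end
-- tag regardless of letter case.
local function simple_tag_pattern(tag, endtag)
  local parts = { endtag and "</%s*" or "<%s*" }
  for i = 1, #tag do
    local c = tag:sub(i, i)
    if c:match("%a") then
      -- One character class per letter, e.g. "f" becomes "[Ff]".
      parts[#parts + 1] = "[" .. c:upper() .. c:lower() .. "]"
    else
      -- Escape punctuation; digits pass through unchanged.
      parts[#parts + 1] = (c:gsub("%p", "%%%0"))
    end
  end
  parts[#parts + 1] = "[^>]*>"   -- attributes, then the closing bracket
  return table.concat(parts)
end

local html = '<FORM action="/login"><input name="u"></Form>'
print(html:find(simple_tag_pattern("form")))
print(html:find(simple_tag_pattern("form", true)))
```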
