Library httpspider

A smallish httpspider library providing basic spidering capabilities. It consists of the following classes:

  • Options
    This class is responsible for handling library options.

  • LinkExtractor
    This class contains code responsible for extracting URLs from web pages.

  • URL
    This class contains code to parse and process URLs.

  • UrlQueue
    This class contains a queue of the next links to process.

  • Crawler
    This class is responsible for the actual crawling.
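To make the role of a link queue more concrete, here is a minimal sketch of a first-in, first-out URL queue that ignores links it has already seen. The Queue table and its add and next methods are hypothetical names chosen for illustration; they are not the library's UrlQueue API, and whether the library de-duplicates links in this way is an assumption.

```lua
-- Hypothetical FIFO queue of URLs that skips duplicates.
-- This is an illustration only, not the httpspider UrlQueue class.
local Queue = {}
Queue.__index = Queue

function Queue.new()
  return setmetatable({ items = {}, head = 1, seen = {} }, Queue)
end

-- Add a URL unless it has been queued before.
-- Returns true if the URL was added, false if it was a duplicate.
function Queue:add(url)
  if self.seen[url] then
    return false
  end
  self.seen[url] = true
  self.items[#self.items + 1] = url
  return true
end

-- Return the next URL in insertion order, or nil when the queue is empty.
function Queue:next()
  local url = self.items[self.head]
  if url ~= nil then
    self.head = self.head + 1
  end
  return url
end

local q = Queue.new()
q:add("/")
q:add("/index.html")
q:add("/")  -- duplicate, ignored
print(q:next(), q:next(), q:next())
```

Tracking seen links in a separate table keeps the duplicate check cheap even after a URL has already been taken off the queue.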

The following sample code shows how the spider could be used:

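  -- str_match is a placeholder for the Lua pattern to look for in each page
  local crawler = httpspider.Crawler:new( host, port, '/', { scriptname = SCRIPT_NAME } )
  crawler:set_timeout(10000)

  local result
  while(true) do
    -- fetch the next page; stop once crawl() no longer returns a result
    local status, r = crawler:crawl()
    if ( not(status) ) then
      break
    end
    -- guard against responses without a body before matching
    if ( r.response and r.response.body and r.response.body:match(str_match) ) then
      crawler:stop()
      result = r.url
      break
    end
  end

  return result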

Author:
Patrik Karlsson <patrik@cqure.net>

Source: http://nmap.org/svn/nselib/httpspider.lua

Script Arguments

httpspider.url

the URL at which to start spidering. This is a URL relative to the scanned host, e.g. /default.html (default: /)

httpspider.maxpagecount

the maximum number of pages to visit. A negative value disables the limit (default: 20)

httpspider.useheadfornonwebfiles

if set, the crawler uses HEAD instead of GET for files whose extensions do not indicate that they are web pages (the list of web page extensions is located in nselib/data/http-web-files-extensions.lst)
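The choice between HEAD and GET described here can be sketched as an extension lookup. This is a minimal illustration, not the library's code: pick_method is a hypothetical helper, the web_exts table is an assumed stand-in for the contents of the extensions file, and treating paths without an extension as web pages is also an assumption.

```lua
-- Assumed stand-in for the list of web page extensions; the real list
-- lives in nselib/data/http-web-files-extensions.lst.
local web_exts = { html = true, htm = true, php = true, asp = true }

-- Hypothetical helper: choose the HTTP method for a URL path.
local function pick_method(path)
  -- drop any query string or fragment before looking at the extension
  local clean = path:match("^[^?#]*")
  local ext = clean:match("%.(%w+)$")
  -- assumption: paths without an extension (e.g. "/" or "/docs/")
  -- are treated as web pages and fetched with GET
  if ext == nil or web_exts[ext:lower()] then
    return "GET"
  end
  return "HEAD"
end

print(pick_method("/index.php?id=1"))
print(pick_method("/img/logo.png"))
```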

httpspider.noblacklist

if set, doesn't load the default blacklist

httpspider.maxdepth

the maximum number of directory levels beneath the initial URL to spider. A negative value disables the limit. (default: 3)
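One way to read this limit is as a count of directory levels between the start URL and a candidate link. The sketch below illustrates that reading under stated assumptions: dir_of, depth_below and within_depth are hypothetical helpers, and the library may compute depth differently.

```lua
-- Hypothetical helper: directory part of a path ("/a/b/c.html" -> "/a/b/").
local function dir_of(path)
  return path:match("^(.*/)") or "/"
end

-- Hypothetical helper: number of directory levels that `path` lies beneath
-- the directory of `base`, or nil if it is not beneath it at all.
local function depth_below(base, path)
  local base_dir, dir = dir_of(base), dir_of(path)
  if dir:sub(1, #base_dir) ~= base_dir then
    return nil
  end
  -- count the slashes in the part of the directory below the base
  local _, count = dir:sub(#base_dir + 1):gsub("/", "")
  return count
end

-- Hypothetical helper: apply the limit; a negative maxdepth disables it.
local function within_depth(base, path, maxdepth)
  if maxdepth < 0 then
    return true
  end
  local d = depth_below(base, path)
  return d ~= nil and d <= maxdepth
end

print(depth_below("/", "/a/b/c.html"))
print(within_depth("/", "/a/b/c/d/e.html", 3))
```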

httpspider.withinhost

only spider URLs within the same host. (default: true)

httpspider.withindomain

only spider URLs within the same domain. This widens the scope from withinhost and cannot be used in combination with it. (default: false)
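The difference between the two scopes can be sketched as an exact host comparison versus a domain suffix comparison. The within_host and within_domain functions below are hypothetical helpers written for illustration, the domain is passed in explicitly rather than derived from the host name, and the library's own checks may differ.

```lua
-- Hypothetical helper: true only if `host` is exactly the starting host.
local function within_host(start_host, host)
  return host:lower() == start_host:lower()
end

-- Hypothetical helper: true if `host` is the domain itself or any host
-- name ending in "." followed by the domain.
local function within_domain(domain, host)
  domain, host = domain:lower(), host:lower()
  return host == domain or host:sub(-(#domain + 1)) == "." .. domain
end

print(within_host("www.example.com", "mail.example.com"))
print(within_domain("example.com", "mail.example.com"))
```

Matching on "." plus the domain, rather than on the bare suffix, keeps an unrelated host such as badexample.com out of scope.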
