Library pcre
Perl Compatible Regular Expressions.
One of Lua's quirks is its string patterns. While they have great performance and are tightly integrated into the Lua interpreter, they are very different in syntax and not as powerful as standard regular expressions. So we have integrated Perl-compatible regular expressions into Lua using PCRE and a modified version of the Lua PCRE library written by Reuben Thomas and Shmuel Zeigerman. These are the same sort of regular expressions used by Nmap version detection. The main modification to their library is that the NSE version only supports PCRE expressions instead of both PCRE and POSIX patterns.
In order to maintain a high script execution speed, the library interfacing with PCRE is kept very thin. It is not integrated as seamlessly as the Lua string pattern API. This allows script authors to decide when to use PCRE expressions versus Lua patterns. The use of PCRE involves a separate pattern compilation step, which saves execution time when patterns are reused. Compiled patterns can be cached in the NSE registry and reused by other scripts.
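The compile-once, reuse-many idea can be sketched as a small memoizing helper. This is only an illustration: cached_regex, the registry key, and fake_compile are hypothetical names, and the stand-in table and compile function exist only so the sketch runs outside Nmap. Inside a script one would presumably pass nmap.registry (or a sub-table of it) and a function wrapping pcre.new.

```lua
-- Hypothetical helper: compile a pattern at most once per registry key.
-- `registry` is any table; `compile` is any function(pattern) -> compiled object.
local function cached_regex(registry, key, pattern, compile)
  local regex = registry[key]
  if regex == nil then
    regex = compile(pattern)
    registry[key] = regex
  end
  return regex
end

-- Stand-ins so the sketch is self-contained (assumptions, not NSE APIs).
local registry = {}
local compile_calls = 0
local function fake_compile(pattern)
  compile_calls = compile_calls + 1
  return { pattern = pattern }
end

local a = cached_regex(registry, "example.status", "^HTTP/1\\.[01] (\\d+)", fake_compile)
local b = cached_regex(registry, "example.status", "^HTTP/1\\.[01] (\\d+)", fake_compile)
-- a and b are the same object; fake_compile ran only once.
```

Choosing a key that is unlikely to collide with other scripts (for example, one prefixed with the script name) keeps unrelated scripts from overwriting each other's cached patterns.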
The documentation for this module is derived from that supplied by the PCRE Lua lib.
Warning: PCRE has a history of security vulnerabilities allowing attackers who are able to compile arbitrary regular expressions to execute arbitrary code. More such vulnerabilities may be discovered in the future. These have never affected Nmap because it doesn't give attackers any control over the regular expressions it uses. Similarly, NSE scripts should never build regular expressions with untrusted network input. Matching hardcoded regular expressions against the untrusted input is fine.
Functions
- exec (string, start, flags)
Matches a string against a compiled regular expression, returning positions of substring matches.
- flags ()
Returns a table of the available PCRE option flags (numbers) keyed by their names (strings).
- match (string, start, flags)
Matches a string against a compiled regular expression.
- new (pattern, flags, locale)
Returns a compiled regular expression.
- pcre_obj:gmatch (string, func, n, ef)
Matches a string against a regular expression multiple times.
- version ()
Returns the version of the PCRE library in use as a string.
Functions
- exec (string, start, flags)
Matches a string against a compiled regular expression, returning positions of substring matches.
This function is like match except that the table returned as a third result contains offsets of substring matches rather than the substring matches themselves. That table will not contain string keys, even if named sub-patterns are used. For example, if the whole match is at offsets 10, 20 and substring matches are at offsets 12, 14 and 16, 19 then the function returns 10, 20, {12,14,16,19}.
Parameters
- string
- the string to match against.
- start
- where to start the match in the string (optional).
- flags
- execution flags (optional).
Usage:
i, j, substrings = regex:exec("string to be searched", 0, 0)
if (i) then ... end
Return values:
- nil if no match, otherwise the start point of the match of the whole string.
- the end point of the match of the whole string.
- a table containing a list of substring match start and end positions.
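Because exec returns a flat list of start and end positions, a script usually has to pair them up before extracting text. The following is a minimal sketch using the offsets from the example above; substrings_from_offsets is a hypothetical helper, and it assumes the offsets are 1-based inclusive positions that string.sub accepts, which this page does not state explicitly. It also assumes an unmatched sub-pattern shows up as false, as it does in the table returned by match.

```lua
-- Hypothetical helper: turn a flat {start1, end1, start2, end2, ...} table
-- into a list of substrings. Assumes 1-based inclusive offsets.
local function substrings_from_offsets(s, offsets)
  local out = {}
  for k = 1, #offsets, 2 do
    if offsets[k] then
      out[#out + 1] = string.sub(s, offsets[k], offsets[k + 1])
    else
      out[#out + 1] = false -- assumed marker for an unmatched sub-pattern
    end
  end
  return out
end

-- Hand-written sample data standing in for the third result of regex:exec.
local subject = "abcdefghijklmnopqrstuvwxyz"
local caps = substrings_from_offsets(subject, { 12, 14, 16, 19 })
-- caps[1] is "lmn" and caps[2] is "pqrs"
```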
- flags ()
Returns a table of the available PCRE option flags (numbers) keyed by their names (strings).
The names of the available flags can be found in the documentation of the PCRE library that Nmap is linked against. Each key is the option name from the manual minus the PCRE_ prefix. For example, PCRE_CASELESS becomes CASELESS.
- match (string, start, flags)
Matches a string against a compiled regular expression.
Returns the start point and the end point of the first match of the compiled regular expression in the string.
Parameters
- string
- the string to match against.
- start
- where to start the match in the string (optional).
- flags
- execution flags (optional).
Usage:
i, j = regex:match("string to be searched", 0, 0)
if (i) then ... end
Return values:
- nil if no match, otherwise the start point of the first match.
- the end point of the first match.
- a table of substring matches, which contains false in the positions where the corresponding sub-pattern did not match. If named sub-patterns were used, the table also contains substring matches keyed by their sub-pattern name.
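Since unmatched sub-patterns appear as false rather than nil, code that reads the table needs to treat both as "no value". The sketch below uses a hand-written table in place of a real match result; capture_or is a hypothetical helper, and the sample data only illustrates the shape described above for a pattern with one named and two unmatched sub-patterns.

```lua
-- Hypothetical helper: read a capture by index or name, falling back to a
-- default when the sub-pattern did not match (false) or does not exist (nil).
local function capture_or(captures, key, default)
  local value = captures[key]
  if value == nil or value == false then
    return default
  end
  return value
end

-- Hand-written stand-in for the third result of regex:match.
local captures = { "alice", false, false, user = "alice" }

local user = capture_or(captures, "user", "unknown")
local second = capture_or(captures, 2, "none")
-- user is "alice"; second is "none"
```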
- new (pattern, flags, locale)
Returns a compiled regular expression.
The resulting compiled regular expression is ready to be matched against strings. Compiled regular expressions are subject to Lua's garbage collection.
The compilation flags are set bitwise. For example, to set the 3rd bit (corresponding to the number 4) and the 1st bit (corresponding to 1), you would pass the number 5 as the second argument. The compilation flags accepted are those of the PCRE C library. These include flags for case-insensitive matching (1), matching line beginnings (^) and endings ($) even in multiline strings (i.e. strings containing newlines) (2), and a flag for matching across line boundaries (4). Passing no compilation flags yields the default value of 0.
Parameters
- pattern
- a string describing the pattern, such as "^foo$".
- flags
- a number describing which compilation flags are set.
- locale
- a string describing the locale which should be used to compile the regular expression (optional). The value is passed to the C standard library function setlocale. For more information on this argument refer to the documentation of setlocale.
Usage:
local regex = pcre.new("pcre-pattern",0,"C")
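The bitwise flag arithmetic described for new can be sketched with plain addition, which behaves like a bitwise OR as long as each single-bit flag is added only once. In the sketch, combine_flags is a hypothetical helper and the flags table is a hand-written stand-in for what flags() returns; the MULTILINE and DOTALL names are the standard PCRE option names for the values 2 and 4 described above, not names taken from this page.

```lua
-- Stand-in for the table returned by flags(); values follow the text above.
local flags = { CASELESS = 1, MULTILINE = 2, DOTALL = 4 }

-- Hypothetical helper: sum distinct single-bit flags by name.
-- Each name is counted once, so the sum equals the bitwise OR.
local function combine_flags(flag_table, ...)
  local seen, total = {}, 0
  for _, name in ipairs({ ... }) do
    local value = flag_table[name]
    assert(value, "unknown flag: " .. tostring(name))
    if not seen[name] then
      seen[name] = true
      total = total + value
    end
  end
  return total
end

local cflags = combine_flags(flags, "CASELESS", "DOTALL")
-- cflags is 5, matching the "3rd bit plus 1st bit" example above.
-- Inside an NSE script this value would be passed as the second argument,
-- e.g. pcre.new("^foo$", cflags, "C").
```

Looking flags up by name rather than hardcoding numbers keeps a script readable and avoids depending on the numeric values of a particular PCRE build.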
- pcre_obj:gmatch (string, func, n, ef)
Matches a string against a regular expression multiple times.
Tries to match the regular expression pcre_obj against string up to n times (or as many as possible if n is not given or is not a positive number), subject to the execution flags ef. Each time there is a match, func is called as func(m, t), where m is the matched string and t is a table of substring matches. This table contains false in the positions where the corresponding sub-pattern did not match. If named sub-patterns are used then the table also contains substring matches keyed by their corresponding sub-pattern names (strings). If func returns a true value, then gmatch immediately returns; gmatch returns the number of matches made.
Parameters
- string
- the string to match against.
- func
- the function to call for each match.
- n
- the maximum number of matches to do (optional).
- ef
- execution flags (optional).
Usage:
local t = {}
local function match(m) t[#t + 1] = m end
local n = regex:gmatch("string to be searched", match)
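A callback can also use its return value to stop gmatch early, as described above. The following sketch builds such a callback; make_collector is a hypothetical helper, and the two direct calls at the end only simulate what gmatch would do so that the example runs without the pcre module.

```lua
-- Hypothetical helper: returns a callback suitable for gmatch plus the table
-- it fills. The callback returns true once `limit` matches are collected,
-- which asks gmatch to return immediately.
local function make_collector(limit)
  local found = {}
  local function on_match(m, t)
    found[#found + 1] = m
    return #found >= limit
  end
  return on_match, found
end

local on_match, found = make_collector(2)

-- Simulated invocations; inside NSE one would instead call
-- regex:gmatch("string to be searched", on_match).
local stop_first = on_match("foo", {})
local stop_second = on_match("bar", {})
-- stop_first is false, stop_second is true, found holds "foo" and "bar".
```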
Return value:
the number of matches made.
- version ()
Returns the version of the PCRE library in use as a string.
For example, "6.4 05-Sep-2005".