
hyperscan: Hyperscan for nginx-module-lua

Installation

If you haven't set up an RPM repository subscription yet, sign up first. Then proceed with the following steps.

CentOS/RHEL 7 or Amazon Linux 2

yum -y install https://extras.getpagespeed.com/release-latest.rpm
yum -y install lua-resty-hyperscan

CentOS/RHEL 8+, Fedora Linux, Amazon Linux 2023

yum -y install https://extras.getpagespeed.com/release-latest.rpm
yum -y install lua5.1-resty-hyperscan

To use this Lua library with NGINX, ensure that nginx-module-lua is installed.

This document describes lua-resty-hyperscan v0.3 released on Apr 14 2022.


lua-resty-hyperscan - Hyperscan for OpenResty

!!! The old branch ran into a "too many callbacks" problem because LuaJIT does not fully support callbacks, so a C wrapper is now used to handle them.

Status

This library is still under development.

Block mode and vectored mode are currently supported.


First, install OpenResty. Then build, install, and test the library:

git clone git@github.com:LubinLew/lua-resty-hyperscan.git
cd lua-resty-hyperscan
make
make install
make test

## Synopsis

Configuration example:

```nginx
user  nobody;
worker_processes  auto;
error_log logs/error.log error;

events {
    worker_connections  1024;
}

http {
    include       mime.types;
    default_type  application/octet-stream;

    access_log  logs/access.log;

    init_by_lua_block {
        local whs, err = require('hyperscan')
        if not whs then
            ngx.log(ngx.ERR, "Failure:", err)
            return
        end

        -- create a block-mode instance
        local obj, new_err = whs.block_new("a-uniq-name", true) -- true: enable debug mode
        if not obj then
            ngx.log(ngx.ERR, "hyperscan block_new failed, ", new_err)
            return
        end

        local patterns = {
            {id = 1001, pattern = "\\d3",       flag = "iu"},
            {id = 1002, pattern = "\\s{3,5}",   flag = "u"},
            {id = 1003, pattern = "[a-d]{2,7}", flag = ""}
        }

        -- compile the patterns into a database (locals avoid leaking globals)
        local ok, compile_err = obj:compile(patterns)
        if not ok then
            ngx.log(ngx.ERR, "hyperscan block compile failed, ", compile_err)
            return
        end
    }

    server {
        listen       80;
        server_name  localhost;

        location / {
            content_by_lua_block {
                local whs = require('hyperscan')
                local obj = whs.block_get("a-uniq-name")
                -- scan
                local ret, id, from, to = obj:scan(ngx.var.uri)
                if ret then
                    return ngx.print("[", ngx.var.uri,"] match: ", id, " zone [", from, " - ", to, ").\n")
                else
                    return ngx.print("[", ngx.var.uri, "] not match any rule.\n")
                end
            }
        }
    }
}
```

Test cases:

$ curl http://localhost
[/] not match any rule.

$ curl http://localhost/131111111
[/131111111] match: 1001 zone [0 - 3).

$ curl "http://localhost/      end"
[/      end] match: 1002 zone [0 - 4).

$ curl http://localhost/aaaaaaa
[/aaaaaaa] match: 1003 zone [0 - 3).

Methods

Load the library:

local whs,err = require('hyperscan')
if not whs then
    ngx.log(ngx.ERR, "reason: ", err)
end

block_new

Create a hyperscan instance for block mode.

local handle, err = whs.block_new(name, debug)
if not handle then
    ngx.log(ngx.ERR, "reason: ", err)
end

| Field | Name | Lua Type | Description |
|---|---|---|---|
| Parameter | name | string | instance name, mainly for logging |
| | debug | boolean | enable/disable writing debug logs to syslog |
| Return Value | handle | table/nil | instance reference |
| | err | string | reason for failure |

block_free

Destroy a hyperscan instance for block mode.

whs.block_free(name)

block_get

Get the instance reference by name.

local handle = whs.block_get(name)

| Field | Name | Lua Type | Description |
|---|---|---|---|
| Parameter | name | string | instance name |
| Return Value | handle | table/nil | instance reference |

vector_new

Create a hyperscan instance for vector mode.

local handle, err = whs.vector_new(name, debug)
if not handle then
    ngx.log(ngx.ERR, "reason: ", err)
end

| Field | Name | Lua Type | Description |
|---|---|---|---|
| Parameter | name | string | instance name, mainly for logging |
| | debug | boolean | enable/disable writing debug logs to syslog |
| Return Value | handle | table/nil | instance reference |
| | err | string | reason for failure |

vector_free

Destroy a hyperscan instance for vector mode.

whs.vector_free(name)

vector_get

Get the instance reference by name.

local handle = whs.vector_get(name)

| Field | Name | Lua Type | Description |
|---|---|---|---|
| Parameter | name | string | instance name |
| Return Value | handle | table/nil | instance reference |

handle:compile

Compile regular expressions into a Hyperscan database.

--local handle = whs.block_new(name, debug)
local ok, err = handle:compile(patterns)
if not ok then
    ngx.log(ngx.ERR, "reason: ", err)
end

| Field | Name | Lua Type | Description |
|---|---|---|---|
| Parameter | patterns | table | pattern list |
| Return Value | ok | boolean | success/failure |
| | err | string | reason for failure |

Pattern List

Example
local patterns = {
    {id = 1001, pattern = "\\d3",       flag = "iu"   },
    {id = 1002, pattern = "\\s{3,5}",   flag = "dmsu" },
    {id = 1003, pattern = "[a-d]{2,7}", flag = ""     }
}
Flags

| Flag | Hyperscan Value | Description |
|---|---|---|
| 'i' | HS_FLAG_CASELESS | Set case-insensitive matching. |
| 'd' | HS_FLAG_DOTALL | Matching a . will not exclude newlines. |
| 'm' | HS_FLAG_MULTILINE | Set multi-line anchoring. |
| 's' | HS_FLAG_SINGLEMATCH | Set single-match only mode. |
| 'e' | HS_FLAG_ALLOWEMPTY | Allow expressions that can match against empty buffers. |
| 'u' | HS_FLAG_UTF8 | Enable UTF-8 mode for this expression. |
| 'p' | HS_FLAG_UCP | Enable Unicode property support for this expression. |
| 'f' | HS_FLAG_PREFILTER | Enable prefiltering mode for this expression. |
| 'l' | HS_FLAG_SOM_LEFTMOST | Enable leftmost start of match reporting. |
| 'c' | HS_FLAG_COMBINATION | Logical combination. |
| 'q' | HS_FLAG_QUIET | Don't do any match reporting. |
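
Because a pattern list is a plain Lua table, it can be checked before it is handed to handle:compile. The following is a minimal sketch of such a check. The helper name check_patterns and its error messages are hypothetical and not part of the library, and whether the library itself rejects unknown flag letters is not stated here.

```lua
-- Flag letters taken from the Flags table in this document.
local KNOWN_FLAGS = {
    i = "HS_FLAG_CASELESS",
    d = "HS_FLAG_DOTALL",
    m = "HS_FLAG_MULTILINE",
    s = "HS_FLAG_SINGLEMATCH",
    e = "HS_FLAG_ALLOWEMPTY",
    u = "HS_FLAG_UTF8",
    p = "HS_FLAG_UCP",
    f = "HS_FLAG_PREFILTER",
    l = "HS_FLAG_SOM_LEFTMOST",
    c = "HS_FLAG_COMBINATION",
    q = "HS_FLAG_QUIET",
}

-- Hypothetical helper (not a library function): returns true when every
-- entry has a numeric id, a string pattern, and only known flag letters.
-- Otherwise returns nil plus a description of the first problem found.
local function check_patterns(patterns)
    for index, entry in ipairs(patterns) do
        if type(entry.id) ~= "number" then
            return nil, "entry " .. index .. ": id must be a number"
        end
        if type(entry.pattern) ~= "string" then
            return nil, "entry " .. index .. ": pattern must be a string"
        end
        local flag = entry.flag or ""
        if type(flag) ~= "string" then
            return nil, "entry " .. index .. ": flag must be a string"
        end
        -- Walk the flag string one character at a time.
        for letter in flag:gmatch(".") do
            if not KNOWN_FLAGS[letter] then
                return nil, "entry " .. index .. ": unknown flag '" .. letter .. "'"
            end
        end
    end
    return true
end
```

Calling check_patterns(patterns) in init_by_lua_block before handle:compile would surface a mistyped flag letter with the index of the offending entry.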

handle:scan

Scan data against a block-mode pattern database:

--local handle = whs.block_get(name)
local ok, id, from, to = handle:scan(data)
if ok then
    ngx.log(ngx.INFO, "match success: id=", id, ", from=", from, ", to=", to)
end

Scan data against a vector-mode pattern database:

--local handle = whs.vector_get(name)
--local data = {"s","s2"}
--local data = "s"
local ok, id, dataindex, to = handle:scan(data)
if ok then
    ngx.log(ngx.INFO, "match success: id=", id, ", dataindex=", dataindex, ", to=", to)
end

| Field | Name | Lua Type | Description |
|---|---|---|---|
| Parameter | data | string/string[] | data to scan (string[] in vector mode only) |
| Return Value | ok | boolean | true on a match, false otherwise |
| | id | number | ID of the matched pattern |
| | from | number | byte array index where the match starts (inclusive) |
| | to | number | byte array index where the match ends (exclusive) |
| | dataindex | number | index of the matched data item (vector mode only) |
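
The from and to values describe a half-open byte range, while Lua strings are indexed from 1 with inclusive ends. Here is a small sketch of the conversion, assuming the offsets are zero-based as the sample output in the Synopsis suggests (for example `[0 - 3)`). The helper name match_slice is hypothetical.

```lua
-- Hypothetical helper: convert the zero-based, half-open [from, to) byte
-- range reported by scan into the one-based, inclusive range that
-- string.sub expects, and return the covered bytes.
local function match_slice(data, from, to)
    return data:sub(from + 1, to)
end
```

With the Synopsis test case, match_slice("/131111111", 0, 3) yields "/13". In that sample output from is 0 for every pattern, so the slice runs from the start of the scanned data to the end of the match.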

handle:free

Destroy a hyperscan instance.

--local handle = whs.block_get(name)
handle:free()
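
The methods above can be combined into a full create, compile, scan, and free cycle. The sketch below does this for vector mode using only the calls documented in this section. The function name scan_once is hypothetical, and the module is passed in as a parameter so the function can be exercised without OpenResty. It is meant as an illustration of the call order, not as a recommended per-request pattern.

```lua
-- Hypothetical helper: build a throwaway vector-mode instance, scan the
-- given parts once, then release the instance.
-- `whs` is the module returned by require('hyperscan').
local function scan_once(whs, name, patterns, parts)
    local handle, err = whs.vector_new(name, false) -- false: no debug log
    if not handle then
        return nil, "vector_new failed: " .. tostring(err)
    end

    local ok, compile_err = handle:compile(patterns)
    if not ok then
        handle:free() -- release the instance before reporting the error
        return nil, "compile failed: " .. tostring(compile_err)
    end

    -- Vector mode accepts a table of strings and reports which item matched.
    local matched, id, dataindex, to = handle:scan(parts)
    handle:free()

    if not matched then
        return false
    end
    return true, { id = id, dataindex = dataindex, to = to }
end
```

In a real deployment the Synopsis approach is likely preferable: compile once in init_by_lua_block and fetch the instance with vector_get in request handlers, since compiling a database on every call would repeat the most expensive step.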

GitHub

You may find additional configuration tips and documentation for this module in the GitHub repository for nginx-module-hyperscan.