# HTML Sanitization

Sanitize untrusted HTML to prevent XSS attacks. Based on [bluemonday](https://github.com/microcosm-cc/bluemonday).

Sanitization works by parsing HTML and filtering it through a whitelist policy. Elements and attributes not explicitly allowed are removed. The output is always well-formed HTML.

## Loading

```lua
local html = require("html")
```

## Preset Policies

Three built-in policies for common use cases:

| Policy | Use Case | Allows |
|--------|----------|--------|
| `new_policy` | Custom sanitization | Nothing (build from scratch) |
| `ugc_policy` | User comments, forums | Common formatting (`p`, `b`, `i`, `a`, lists, etc.) |
| `strict_policy` | Plain text extraction | Nothing (strips all HTML) |

### Empty Policy

Creates a policy that allows nothing. Use this to build a custom whitelist from scratch.

```lua
local policy, err = html.sanitize.new_policy()

policy:allow_elements("p", "strong", "em")
policy:allow_attrs("class"):globally()

local clean = policy:sanitize(user_input)
```

**Returns:** `Policy, error`

### User Content Policy

Pre-configured for user-generated content. Allows common formatting elements.

```lua
local policy = html.sanitize.ugc_policy()

local safe = policy:sanitize('<p>Hello <b>world</b></p>')
-- '<p>Hello <b>world</b></p>'

local xss = policy:sanitize('<p>Hello<script>alert(1)</script></p>')
-- '<p>Hello</p>'
```

**Returns:** `Policy, error`

### Strict Policy

Strips all HTML and returns plain text only.

```lua
local policy = html.sanitize.strict_policy()

local text = policy:sanitize('<p>Hello <b>world</b>!</p>')
-- 'Hello world!'
```

**Returns:** `Policy, error`

### Allow Elements

Whitelist specific HTML elements.

```lua
local policy = html.sanitize.new_policy()

policy:allow_elements("p", "strong", "em", "br")
policy:allow_elements("h1", "h2", "h3")
policy:allow_elements("a", "img")

local result = policy:sanitize('<p>Hello <strong>world</strong></p>')
-- '<p>Hello <strong>world</strong></p>'
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `...` | string | Element tag names |

**Returns:** `Policy`

### Allow Attributes

Start an attribute permission. Chain with `on_elements()` or `globally()`.

```lua
policy:allow_attrs("href"):on_elements("a")
policy:allow_attrs("src", "alt"):on_elements("img")
policy:allow_attrs("class", "id"):globally()
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `...` | string | Attribute names |

**Returns:** `AttrBuilder`

### On Specific Elements

Allow attributes only on specific elements.

```lua
policy:allow_elements("a", "img")
policy:allow_attrs("href", "target"):on_elements("a")
policy:allow_attrs("src", "alt", "width", "height"):on_elements("img")
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `...` | string | Element tag names |

**Returns:** `Policy`

### On All Elements

Allow attributes globally on any permitted element.

```lua
policy:allow_attrs("class"):globally()
policy:allow_attrs("id"):globally()
```

**Returns:** `Policy`

### With Pattern Matching

Validate attribute values against a regex pattern. Values that do not match are dropped.

```lua
-- Only allow hex colors in style
local builder, err = policy:allow_attrs("style"):matching("^color:#[0-9a-fA-F]{6}$")
if err then
    return nil, err
end
builder:on_elements("span")

policy:sanitize('<span style="color:#ff0000">Red</span>')
-- '<span style="color:#ff0000">Red</span>'

policy:sanitize('<span style="background:red">Bad</span>')
-- '<span>Bad</span>'
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `pattern` | string | Regex pattern |

**Returns:** `AttrBuilder, error`

### Standard URLs

Enable URL handling with security defaults.

```lua
policy:allow_elements("a")
policy:allow_attrs("href"):on_elements("a")
policy:allow_standard_urls()
```

**Returns:** `Policy`

### URL Schemes

Restrict which URL schemes are allowed.
```lua
policy:allow_url_schemes("https", "mailto")

policy:sanitize('<a href="https://example.com">OK</a>')
-- '<a href="https://example.com">OK</a>'

policy:sanitize('<a href="javascript:alert(1)">XSS</a>')
-- 'XSS'
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `...` | string | Allowed schemes |

**Returns:** `Policy`

### Relative URLs

Allow or disallow relative URLs.

```lua
policy:allow_relative_urls(true)

policy:sanitize('<a href="/page">Link</a>')
-- '<a href="/page">Link</a>'
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `allow` | boolean | Allow relative URLs |

**Returns:** `Policy`

### Nofollow Links

Add `rel="nofollow"` to all links. Prevents SEO spam.

```lua
policy:allow_attrs("href", "rel"):on_elements("a")
policy:require_nofollow_on_links(true)

policy:sanitize('<a href="https://example.com">Link</a>')
-- '<a href="https://example.com" rel="nofollow">Link</a>'
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `require` | boolean | Add nofollow |

**Returns:** `Policy`

### Noreferrer Links

Add `rel="noreferrer"` to all links. Prevents referrer leakage.

```lua
policy:require_noreferrer_on_links(true)
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `require` | boolean | Add noreferrer |

**Returns:** `Policy`

### External Links in New Tab

Add `target="_blank"` to links with fully qualified URLs.

```lua
policy:allow_attrs("href", "target"):on_elements("a")
policy:add_target_blank_to_fully_qualified_links(true)

policy:sanitize('<a href="https://example.com">Link</a>')
-- the link is emitted with target="_blank"
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `add` | boolean | Add target blank |

**Returns:** `Policy`

### Allow Images

Permit `<img>` with standard attributes.

```lua
policy:allow_images()

policy:sanitize('<img src="https://example.com/photo.jpg" alt="Photo">')
-- '<img src="https://example.com/photo.jpg" alt="Photo">'
```

**Returns:** `Policy`

### Allow Data URI Images

Permit base64-embedded images.

```lua
policy:allow_elements("img")
policy:allow_attrs("src"):on_elements("img")
policy:allow_data_uri_images()

policy:sanitize('<img src="data:image/png;base64,iVBORw0KGgo=">')
-- '<img src="data:image/png;base64,iVBORw0KGgo=">'
```

**Returns:** `Policy`

### Allow Lists

Permit list elements: `ul`, `ol`, `li`, `dl`, `dt`, `dd`.

```lua
policy:allow_lists()

policy:sanitize('<ul><li>Item 1</li><li>Item 2</li></ul>')
-- '<ul><li>Item 1</li><li>Item 2</li></ul>'
```

**Returns:** `Policy`

### Allow Tables

Permit table elements: `table`, `thead`, `tbody`, `tfoot`, `tr`, `td`, `th`, `caption`.

```lua
policy:allow_tables()

policy:sanitize('<table><tr><td>Cell</td></tr></table>')
-- '<table><tr><td>Cell</td></tr></table>'
```

**Returns:** `Policy`

### Allow Standard Attributes

Permit common attributes: `id`, `class`, `title`, `dir`, `lang`.

```lua
policy:allow_elements("p")
policy:allow_standard_attributes()

policy:sanitize('<p id="intro" class="lead">Hello</p>')
-- '<p id="intro" class="lead">Hello</p>'
```

**Returns:** `Policy`

## Sanitize

Apply the policy to an HTML string.

```lua
local policy = html.sanitize.ugc_policy()
policy:require_nofollow_on_links(true)

local dirty = '<p>Hello<script>alert(1)</script></p>'
local clean = policy:sanitize(dirty)
-- '<p>Hello</p>'
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `html` | string | HTML to sanitize |

**Returns:** `string`

## Errors

| Condition | Kind | Retryable |
|-----------|------|-----------|
| Invalid regex pattern | `errors.INVALID` | no |

See [Error Handling](lua/core/errors.md) for working with errors.
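
The policy methods documented on this page can be combined into a single builder function. The sketch below is one possible arrangement for blog comments, not a recommended configuration: the helper name `build_comment_policy` is hypothetical, and the module is passed in as an argument (an assumption made here so the function can be exercised with a stand-in outside the runtime). It uses only calls listed on this page and returns early when `new_policy` or `matching` reports an error, such as an invalid regex pattern.

```lua
-- Hypothetical helper: assembles a whitelist policy for blog comments.
-- `html` is expected to be the module returned by require("html").
local function build_comment_policy(html)
    local policy, err = html.sanitize.new_policy()
    if err then
        return nil, err
    end

    -- Basic text formatting and lists.
    policy:allow_elements("p", "strong", "em", "br")
    policy:allow_lists()

    -- Links: allow href on <a>, permit the https and mailto schemes,
    -- and ask for rel="nofollow" on every link.
    policy:allow_attrs("href"):on_elements("a")
    policy:allow_url_schemes("https", "mailto")
    policy:require_nofollow_on_links(true)

    -- Hex colors only in style on <span>. matching() returns an error
    -- as its second value when the pattern is not a valid regex.
    local builder, match_err = policy:allow_attrs("style"):matching("^color:#[0-9a-fA-F]{6}$")
    if match_err then
        return nil, match_err
    end
    builder:on_elements("span")

    return policy
end

-- Usage inside the runtime (user_input is a placeholder):
--   local policy, err = build_comment_policy(require("html"))
--   if err then return nil, err end
--   local clean = policy:sanitize(user_input)
```

Building the policy once and reusing it for every `sanitize` call keeps the whitelist in one place, so a change to the allowed elements is made in a single function.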