Oihana PHP

sanitize.php

Functions

sanitize()

Sanitize a string based on configurable flags.

sanitize(string|null $source[, int $flags = SanitizeFlag::DEFAULT ][, array<string|int, mixed> $options = [] ]) : string|null

This function acts as a comprehensive filter chain for string data. It can perform operations ranging from simple trimming to complex HTML stripping, Unicode normalization, and invisible character removal.

Available flags (SanitizeFlag)

Cleaning & Security:

  • SanitizeFlag::STRIP_TAGS : Remove HTML/PHP tags (and content of <script>/<style>).
  • SanitizeFlag::DECODE_ENTITIES : Decode HTML entities (e.g., &amp; -> &).
  • SanitizeFlag::REMOVE_CONTROL_CHARS : Remove non-printable ASCII characters (0-31, 127) except line breaks/tabs.
  • SanitizeFlag::REMOVE_INVISIBLE : Remove invisible Unicode characters (zero-width, BOM, etc.) and normalize non-breaking spaces.

Formatting & Normalization:

  • SanitizeFlag::NORMALIZE_UNICODE : Normalize string to Unicode Normalization Form C (NFC) by default.
  • SanitizeFlag::NORMALIZE_LINE_BREAKS : Convert Windows (\r\n) and Mac (\r) line endings to Unix (\n).
  • SanitizeFlag::REMOVE_EXTRA_LINE_BREAKS : Collapse multiple consecutive line breaks into a single one.
  • SanitizeFlag::COLLAPSE_SPACES : Collapse multiple consecutive horizontal spaces into a single space.
  • SanitizeFlag::TRIM : Remove whitespace from the start and end of the string.

Output Control:

  • SanitizeFlag::NULLIFY : Return null if the resulting string is empty.
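
Because $flags is an integer bitmask, flags are combined with | and tested with &. The sketch below illustrates the mechanism with a hypothetical DemoFlag class and a hypothetical hasFlag() helper. The constant values are made up for this example and are not the real SanitizeFlag values.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for SanitizeFlag: the names mirror the list above,
// but the numeric values are assumptions chosen for this example only.
final class DemoFlag
{
    public const STRIP_TAGS      = 1 << 0;
    public const DECODE_ENTITIES = 1 << 1;
    public const TRIM            = 1 << 2;
    public const NULLIFY         = 1 << 3;
}

// Returns true when every bit of $flag is set in the bitmask $flags.
function hasFlag(int $flags, int $flag): bool
{
    return ($flags & $flag) === $flag;
}

$flags = DemoFlag::STRIP_TAGS | DemoFlag::TRIM; // combine with bitwise OR

var_dump(hasFlag($flags, DemoFlag::TRIM));    // bool(true)
var_dump(hasFlag($flags, DemoFlag::NULLIFY)); // bool(false)

$flags &= ~DemoFlag::TRIM;                    // clear a single flag
var_dump(hasFlag($flags, DemoFlag::TRIM));    // bool(false)
```

Giving each flag its own bit is what lets several operations be requested through one integer argument.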

Processing order

Operations are applied in this specific order to ensure data integrity:

  1. DECODE_ENTITIES: Decode HTML entities first (to expose hidden tags or chars).
  2. STRIP_TAGS: Remove scripts/styles content, then strip tags.
  3. REMOVE_CONTROL_CHARS: Clean basic ASCII control noise.
  4. REMOVE_INVISIBLE: Aggressively clean invisible Unicode characters.
  5. NORMALIZE_UNICODE: Standardize Unicode composition.
  6. NORMALIZE_LINE_BREAKS: Standardize line endings to \n.
  7. REMOVE_EXTRA_LINE_BREAKS: Collapse vertical spacing.
  8. COLLAPSE_SPACES: Collapse horizontal spacing.
  9. TRIM: Clean edges.
  10. NULLIFY: Final check for emptiness.
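
To see why the order matters, here is a minimal sketch built only on PHP built-ins (html_entity_decode, strip_tags, preg_replace, str_replace, trim). It is not the library's implementation: demoSanitize() is a hypothetical helper that always applies a subset of the steps above and ignores flags entirely.

```php
<?php
declare(strict_types=1);

// Hypothetical helper, not part of oihana: applies a subset of the steps
// above in the documented order, using PHP built-ins only.
function demoSanitize(string $source): ?string
{
    // 1. Decode entities first so that encoded markup becomes visible.
    $s = html_entity_decode($source, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // 2. Drop <script>/<style> blocks with their content, then strip tags.
    $s = preg_replace('#<(script|style)\b[^>]*>.*?</\1\s*>#is', '', $s) ?? $s;
    $s = strip_tags($s);

    // 6. Normalize Windows (\r\n) and old Mac (\r) line endings to \n.
    $s = str_replace(["\r\n", "\r"], "\n", $s);

    // 7. Collapse runs of line breaks into a single one.
    $s = preg_replace('/\n{2,}/', "\n", $s) ?? $s;

    // 8. Collapse runs of spaces and tabs into a single space.
    $s = preg_replace('/[ \t]+/', ' ', $s) ?? $s;

    // 9. Trim the edges, then 10. return null for an empty result.
    $s = trim($s);
    return $s === '' ? null : $s;
}

// The <b> tag is entity-encoded; decoding before stripping removes it.
var_dump(demoSanitize("  &lt;b&gt;Hello&lt;/b&gt; \t World\r\n\r\nBye  ")); // "Hello World\nBye"

// Only a style block and whitespace: nothing is left, so null is returned.
var_dump(demoSanitize(" <style>p{}</style> ")); // NULL
```

If the tags were stripped before the entities were decoded, the encoded <b> markup in the first call would survive as literal tags in the output.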

Parameters

$source : string|null

The string to sanitize. Can be null.

$flags : int = SanitizeFlag::DEFAULT

A bitmask of SanitizeFlag constants. Defaults to SanitizeFlag::DEFAULT.

$options : array<string|int, mixed> = []

Optional parameters for specific flags. The example below passes 'allowed_tags' together with STRIP_TAGS to keep selected tags.

Tags
throws
InvalidArgumentException

If the flags parameter contains invalid flag values.
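
One common way to detect invalid flag values is to compare the bitmask against the union of all known flags. The sketch below assumes that approach. assertValidFlags() and the DEMO_* constants are hypothetical, and the library may validate differently.

```php
<?php
declare(strict_types=1);

// Hypothetical flag values for illustration; not the real SanitizeFlag values.
const DEMO_TRIM    = 1 << 0;
const DEMO_NULLIFY = 1 << 1;
const DEMO_ALL     = DEMO_TRIM | DEMO_NULLIFY; // union of every known flag

// Throws when $flags is negative or contains bits outside DEMO_ALL.
function assertValidFlags(int $flags): void
{
    if ($flags < 0 || ($flags & ~DEMO_ALL) !== 0) {
        throw new InvalidArgumentException("Invalid flags: $flags");
    }
}

assertValidFlags(DEMO_TRIM | DEMO_NULLIFY); // passes silently

try {
    assertValidFlags(1 << 5); // bit 5 is not a known flag
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), PHP_EOL; // Invalid flags: 32
}
```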

example
use function oihana\core\strings\sanitize;
use oihana\core\strings\SanitizeFlag;

// 1. Basic usage (TRIM | REMOVE_INVISIBLE by default)
sanitize("  Hello\u{200B}World  "); // "HelloWorld"

// 2. HTML Cleaning
$html = "<script>alert('xss')</script><p>Hello &amp; <b>World</b></p>";
$flags = SanitizeFlag::STRIP_TAGS | SanitizeFlag::DECODE_ENTITIES | SanitizeFlag::TRIM;
sanitize($html, $flags); // "Hello & World"

// 3. Keeping specific tags
sanitize($html, $flags, ['allowed_tags' => '<b>']); // "Hello & <b>World</b>"

// 4. Text Formatting
$text = "Line 1  \t  Content\n\n\nLine 2";
$flags = SanitizeFlag::COLLAPSE_SPACES | SanitizeFlag::REMOVE_EXTRA_LINE_BREAKS | SanitizeFlag::TRIM;
sanitize($text, $flags); // "Line 1 Content\nLine 2"

author

Marc Alcaraz

since
1.0.8

Return values

string|null

The sanitized string, or null if NULLIFY is enabled and the result is empty.
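
Since the return type is nullable, callers usually need a fallback. The sketch below uses a hypothetical trimOrNull() stand-in that mimics TRIM | NULLIFY behaviour so that it runs without the library. Returning null for a null input is an assumption of this sketch, not documented behaviour.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for sanitize() with TRIM | NULLIFY behaviour,
// used here only to show how a caller can handle the nullable result.
function trimOrNull(?string $source): ?string
{
    if ($source === null) {
        return null; // assumption: null input yields null
    }
    $trimmed = trim($source);
    return $trimmed === '' ? null : $trimmed;
}

// The null coalescing operator supplies a fallback for empty input.
$name = trimOrNull("   ") ?? 'anonymous';
echo $name, PHP_EOL; // anonymous

$name = trimOrNull("  Ada ") ?? 'anonymous';
echo $name, PHP_EOL; // Ada
```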


        