Character Encoding

UTF-8 encoding requirements for Protecto API requests and how to handle international characters and Unicode text.

Protecto APIs use UTF-8 encoded JSON for all requests and responses.

Requirements

Item	Value
Encoding	UTF-8
Content-Type header	`application/json; charset=utf-8`
Input text	Arbitrary Unicode strings

Always set the Content-Type header explicitly:

Content-Type: application/json; charset=utf-8
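As a minimal sketch of the requirement above, the following Python snippet builds (but does not send) a request whose body is UTF-8 encoded JSON and whose Content-Type header is set explicitly. The endpoint URL, the `build_request` helper, and the sample payload are assumptions made for illustration; authentication headers are omitted, so substitute the values from your own Protecto setup.

```python
import json
import urllib.request

# Hypothetical endpoint, for illustration only; replace with your real API URL.
API_URL = "https://example.invalid/mask"


def build_request(payload: dict, url: str = API_URL) -> urllib.request.Request:
    """Build a POST request with a UTF-8 JSON body and an explicit Content-Type."""
    # ensure_ascii=False keeps non-ASCII characters literal instead of \uXXXX
    # escapes; both forms are valid JSON. The explicit encode() guarantees the
    # bytes on the wire are UTF-8.
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


# Illustrative payload containing accented Latin and Japanese characters.
payload = {"mask": [{"value": "José García, 東京"}]}
req = build_request(payload)
```

Encoding the body yourself, rather than relying on a client library's default, makes the charset declared in the header match the bytes that are actually sent.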

Protecto can process text in any language that can be encoded as UTF-8.

The detection engine works on semantic content, so detection accuracy may vary by language for built-in entities.

All masking inputs must be strings, even for numeric data:

{
  "mask": [
    { "value": "9876543210", "token_name": "Numeric Token" }
  ]
}

A JSON number cannot carry leading zeros, spacing, or punctuation. Sending a JSON number (9876543210) instead of a string ("9876543210") may therefore cause that formatting to be lost or the value to be rejected.

If your system stores numeric identifiers as integers, convert them to strings before sending them to the Mask API.
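The conversion can be sketched in Python as follows. The `to_mask_item` helper is a hypothetical name introduced here, not part of any Protecto SDK; the payload shape and the "Numeric Token" token name are copied from the example above.

```python
import json


def to_mask_item(value, token_name: str) -> dict:
    """Return one entry for the "mask" list, with the value as a string."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool):
        raise TypeError("boolean values cannot be masked")
    if isinstance(value, int):
        value = str(value)  # 9876543210 -> "9876543210"
    if not isinstance(value, str):
        raise TypeError(f"expected str or int, got {type(value).__name__}")
    return {"value": value, "token_name": token_name}


# An integer identifier and a string identifier with a leading zero.
items = [
    to_mask_item(9876543210, "Numeric Token"),
    to_mask_item("0123456789", "Numeric Token"),
]
body = json.dumps({"mask": items})
```

Converting an integer to a string cannot restore formatting that was already dropped when the value was stored as an integer, such as leading zeros. Where that formatting matters, keep the identifier as a string at its source.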
