Tokens vs Formats

Two masking strategies: tokens replace values entirely, while formats preserve the structure and mask the content. Both are deterministic and reversible.

When Protecto masks sensitive data, it can replace the original value in one of two ways: with a token or with a format-preserving value. Both are deterministic and reversible, but they serve different purposes.

Tokens

A token replaces the original value with a new, non-sensitive string.

Original               Tokenized
john.doe@example.com   <EMAIL>0gN3SkjL@0ffM3CDS</EMAIL>

The token does not resemble the original value. It is safe to store and transmit, and can be reversed only if policy allows.
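
The behaviour described here (deterministic, unlike the original, reversible only under policy) can be illustrated with a minimal Python sketch. This is not Protecto's implementation or API: the `ToyTokenVault` class, its method names, and the `allowed` flag are hypothetical, and the keyed hash plus in-memory lookup table is just one simple way to get these properties.

```python
import base64
import hashlib
import hmac


class ToyTokenVault:
    """Illustrative only: deterministic, reversible tokens (hypothetical design)."""

    def __init__(self, secret_key: bytes):
        self._key = secret_key
        self._reverse = {}  # token body -> original value

    def tokenize(self, value: str, entity: str) -> str:
        # A keyed hash makes the token deterministic: same input, same token.
        digest = hmac.new(self._key, value.encode("utf-8"), hashlib.sha256).digest()
        body = base64.urlsafe_b64encode(digest)[:16].decode("ascii")
        # Remember the mapping so the token can be reversed later.
        self._reverse[body] = value
        return f"<{entity}>{body}</{entity}>"

    def detokenize(self, token: str, allowed: bool) -> str:
        # `allowed` stands in for a real policy and permission check.
        if not allowed:
            raise PermissionError("policy does not allow unmasking")
        body = token[token.index(">") + 1 : token.rindex("<")]
        return self._reverse[body]


vault = ToyTokenVault(b"demo-key")  # hypothetical key, for illustration only
token = vault.tokenize("john.doe@example.com", "EMAIL")
print(token)
print(vault.detokenize(token, allowed=True))
```

The token body has a fixed length and alphabet regardless of the input, which is why this style suits free-form text but not systems that validate the shape of a value.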

Use tokens when:

  • Masking free-form text
  • Storing values for analytics
  • Structure is not required
  • Simplicity is preferred

Formats

A format preserves the structure of the original value while masking the actual data.

Original     Format-masked
9876543210   <PHONE>8349201756</PHONE>

The output looks like a valid phone number and has the same length and character set as the original, but it does not reveal the original value.
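
To make "same length and character set" concrete, here is a toy Python sketch of format-preserving masking for digit strings. It is an assumption-laden illustration, not Protecto's algorithm: the function names and key are hypothetical, and the per-position keyed digit shift is far too weak for real use. Production systems would typically rely on a vetted format-preserving encryption scheme such as NIST FF1.

```python
import hashlib
import hmac


def _digit_offsets(key: bytes, length: int) -> list:
    """Derive one offset per position from the key (depends only on key and length)."""
    stream = b""
    counter = 0
    while len(stream) < length:
        msg = f"{length}:{counter}".encode("ascii")
        stream += hmac.new(key, msg, hashlib.sha256).digest()
        counter += 1
    return [b % 10 for b in stream[:length]]


def mask_digits(value: str, key: bytes) -> str:
    """Shift each digit by a key-derived offset; output stays all digits, same length."""
    if not (value.isascii() and value.isdigit()):
        raise ValueError("this toy example only handles ASCII digit strings")
    offsets = _digit_offsets(key, len(value))
    return "".join(str((int(d) + o) % 10) for d, o in zip(value, offsets))


def unmask_digits(masked: str, key: bytes) -> str:
    """Reverse mask_digits by subtracting the same offsets."""
    offsets = _digit_offsets(key, len(masked))
    return "".join(str((int(d) - o) % 10) for d, o in zip(masked, offsets))


key = b"demo-key"  # hypothetical key, for illustration only
masked = mask_digits("9876543210", key)
print(masked)                      # a 10-digit string
print(unmask_digits(masked, key))  # 9876543210
```

Because the result is still a 10-digit string, a downstream check such as "exactly ten digits" keeps passing, which a token with a different length and alphabet would fail.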

Use formats when:

  • Masked data must pass downstream validation
  • IDs or numbers have a fixed length requirement
  • Downstream systems enforce patterns

Key differences

Property                   Token           Format
Goal                       Replace value   Preserve structure
Deterministic              Yes             Yes
Reversible                 Yes             Yes
Preserves length/pattern   No              Yes
Looks like original type   No              Often yes

Both approaches are governed by policy and permissions.

Using both in the same system

It's common to use both strategies together. For example:

  • Emails masked with tokens (structure not important)
  • Phone numbers masked with formats (downstream validation requires it)
  • Internal IDs masked with custom tokens

Protecto allows mixing approaches without changing APIs.
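
One way to picture mixing strategies behind a single entry point is a per-entity policy table. The sketch below is hypothetical throughout: `POLICY`, `mask`, the masker functions, and the entity names are illustrative stand-ins rather than Protecto's API, and the "custom token" is modelled simply as a token with a different length.

```python
import hashlib
import hmac

KEY = b"demo-key"  # hypothetical key, for illustration only


def token_masker(length: int):
    """Build a masker that returns a fixed-length keyed-hash token."""
    def _mask(value: str, entity: str) -> str:
        body = hmac.new(KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()[:length]
        return f"<{entity}>{body}</{entity}>"
    return _mask


def format_masker(value: str, entity: str) -> str:
    """Toy digit shift that keeps length and character set (not secure)."""
    if not (value.isascii() and value.isdigit()):
        raise ValueError("this toy example only handles ASCII digit strings")
    offsets = hmac.new(KEY, entity.encode("utf-8"), hashlib.sha256).digest()
    body = "".join(
        str((int(d) + offsets[i % len(offsets)]) % 10) for i, d in enumerate(value)
    )
    return f"<{entity}>{body}</{entity}>"


# The strategy is chosen per entity type; callers always use mask().
POLICY = {
    "EMAIL": token_masker(length=16),       # structure not important
    "PHONE": format_masker,                 # downstream validation needs digits
    "INTERNAL_ID": token_masker(length=8),  # stand-in for a custom token
}


def mask(entity: str, value: str) -> str:
    return POLICY[entity](value, entity)


print(mask("EMAIL", "john.doe@example.com"))
print(mask("PHONE", "9876543210"))
print(mask("INTERNAL_ID", "emp-00417"))
```

Changing the strategy for an entity type then means editing one policy entry, while every call site keeps calling `mask` unchanged.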