Sensitive data protection for LLM requests

Mask sensitive data before it reaches an LLM.

Globesword is a locally deployable filtering gateway that detects configured PII, financial identifiers, credentials, national IDs, and customer-specific sensitive data before an LLM request is sent.

Sensitive values are replaced with request-scoped tokens. Approved values can be restored after inference through a controlled mapping layer.

Discuss a PoV

See how it works

Local or private deployment

Client-specific policies

Measurable PoV results

Request inspection

Globesword Masking Gateway

POLICY ACTIVE

Original request

Contact maya@example.com and transfer funds to IBAN DE89 3704 0044 0532 0130 00.

Payload sent to the LLM

Contact [EMAIL_1] and transfer funds to IBAN [IBAN_1].

Email: masked (exact span)

IBAN: masked (MOD-97 valid)

Restoration is limited to authorized tokens associated with the current request context.

Baseline functional validation

Initial evidence, presented with its scope.

These results demonstrate functional correctness for the tested examples. They are not presented as universal production accuracy.

Baseline scenarios

End-to-end functional tests

Expected entities

All detected with exact spans

Observed leaks

Within the baseline suite

0.154 ms

Deterministic p95

For tested short-text inputs

Client PoVs use larger customer-specific datasets containing positive, negative, malformed, overlap, multilingual, and document-specific test cases.
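
A p95 latency figure is a percentile over repeated timings of the same operation. The sketch below shows one way such a number could be collected. It assumes a hypothetical `mask_fn` callable standing in for the masking step; the run count and the inclusive percentile method are illustrative choices, not the methodology behind the baseline result.

```python
import statistics
import time
from typing import Callable, List


def p95(samples: List[float]) -> float:
    """Return the 95th percentile of the samples.

    statistics.quantiles with n=100 returns 99 cut points; index 94 is the 95th.
    """
    return statistics.quantiles(samples, n=100, method="inclusive")[94]


def time_calls_ms(mask_fn: Callable[[str], str], text: str, runs: int = 1000) -> List[float]:
    """Time `runs` calls of mask_fn (a hypothetical masking callable), in milliseconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter_ns()
        mask_fn(text)
        timings.append((time.perf_counter_ns() - start) / 1_000_000)
    return timings
```

Calling `p95(time_calls_ms(my_mask, sample_text))` with a real masking function and representative inputs would give a comparable figure; results depend on hardware, input length, and active policies.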

How it works

A controlled layer between enterprise data and the model.

Globesword is designed to sit in the request path before data is sent to an external or internally hosted LLM.

Inspect the request

The gateway scans prompts, messages, and extracted file text before an LLM request is created.

Apply layered detection

Deterministic patterns, checksum validators, contextual detection, and tenant rules identify configured data classes.

Replace sensitive values

Detected values are replaced with request-scoped typed tokens before the content reaches the target model.

Restore approved values

Authorized tokens can be restored after inference through a separate, controlled mapping layer.
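
The four steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the gateway's implementation: `RequestContext`, `mask`, `restore`, and the simple `EMAIL_RE` pattern are hypothetical names, and authorization is modelled as a set of allowed token kinds.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

# Hypothetical detector: a deliberately simple email pattern for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


@dataclass
class RequestContext:
    """Token-to-value mapping that lives only for a single request."""
    mapping: Dict[str, Tuple[str, str]] = field(default_factory=dict)
    counters: Dict[str, int] = field(default_factory=dict)

    def token_for(self, kind: str, value: str) -> str:
        # Reuse the existing token when the same value repeats within the request.
        for token, (known_kind, known_value) in self.mapping.items():
            if known_kind == kind and known_value == value:
                return token
        index = self.counters.get(kind, 0) + 1
        self.counters[kind] = index
        token = f"[{kind}_{index}]"
        self.mapping[token] = (kind, value)
        return token


def mask(text: str, ctx: RequestContext) -> str:
    """Replace each detected email with a typed, request-scoped token."""
    return EMAIL_RE.sub(lambda m: ctx.token_for("EMAIL", m.group(0)), text)


def restore(text: str, ctx: RequestContext, allowed_kinds: Set[str]) -> str:
    """Restore only tokens from this request whose kind is authorized."""
    for token, (kind, value) in ctx.mapping.items():
        if kind in allowed_kinds:
            text = text.replace(token, value)
    return text
```

Because the mapping is held in the request context rather than in the masked text, a token presented with a different context, or with a kind that is not authorized, is left unchanged.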

Detection coverage

Broad detector library. Narrow deployment policy.

The platform includes reusable detectors, but each deployment activates only what is relevant to the customer.

A German manufacturer, a US financial institution, and an Indian healthcare provider should not run the same national identifier policy.

Personal and contact data

Detect common personal identifiers before prompt content leaves the application boundary.

Email addresses

Phone numbers

Dates of birth with label context

Usernames and account identifiers
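
Label context means a date is treated as a date of birth only when a nearby label says so. A minimal sketch, assuming a hypothetical pattern `DOB_WITH_LABEL` with a small set of English labels and two date formats; a real policy would need locale-specific labels and formats.

```python
import re
from typing import List, Tuple

# Hypothetical pattern: a label, an optional separator, then the date itself.
# Only the date (group 1) is reported, so the label stays in the text.
DOB_WITH_LABEL = re.compile(
    r"\b(?:date of birth|dob|born(?: on)?)\b\s*[:\-]?\s*"
    r"(\d{4}-\d{2}-\d{2}|\d{1,2}[/.]\d{1,2}[/.]\d{4})",
    re.IGNORECASE,
)


def find_labelled_dobs(text: str) -> List[Tuple[int, int]]:
    """Return (start, end) character offsets of dates that follow a birth-date label."""
    return [m.span(1) for m in DOB_WITH_LABEL.finditer(text)]
```

A date without a label, such as an invoice date, is not matched, which is the point of requiring context for an otherwise ambiguous value.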

Financial identifiers

Use structural and checksum-aware detection for common financial data.

Payment cards

IBAN and SWIFT/BIC

Bank routing numbers

Cryptocurrency addresses

Government identifiers

Enable only the country-specific identifiers relevant to each deployment.

US SSN, EIN, and ITIN

India Aadhaar and PAN

Brazil CPF and CNPJ

Selected UK and European identifiers

Credentials and secrets

Identify provider-specific credentials and explicitly labelled secrets.

API keys and access tokens

Bearer tokens and JWTs

Passwords and client secrets

Private key blocks

Network and infrastructure data

Protect technical identifiers that may expose private systems or environments.

IPv4 and IPv6 addresses

MAC addresses

URL-embedded credentials

Internal identifiers through custom policies
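
URL-embedded credentials are a case where parsing is more reliable than pattern matching alone. The sketch below uses Python's standard `urllib.parse.urlsplit`; the function name `redact_url_credentials` and the token text are hypothetical and shown only for illustration.

```python
from urllib.parse import urlsplit


def redact_url_credentials(url: str, token: str = "[URL_CREDENTIAL_1]") -> str:
    """Replace the user:password part of a URL with a token, keeping host and path."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URL (for example an unbalanced IPv6 bracket): leave it untouched.
        return url
    if parts.password is None:
        return url
    # Everything before the last "@" in the network location is the userinfo part.
    userinfo = parts.netloc.rpartition("@")[0]
    return url.replace(userinfo + "@", token + "@", 1)
```

For example, `postgres://app:s3cret@db.internal:5432/orders` becomes `postgres://[URL_CREDENTIAL_1]@db.internal:5432/orders`, while a URL without a password is returned unchanged.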

Customer-defined identifiers

Add organization-specific patterns without retraining the detection model.

Employee IDs

Customer and patient IDs

Matter and project codes

Facility and device identifiers
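
Customer-defined identifiers of this kind can be expressed as named patterns that are loaded as configuration. A minimal sketch, assuming hypothetical ID formats (`EMP-` plus six digits, `MAT-` plus a two-letter code and four digits); the actual formats would come from the customer.

```python
import re
from typing import Dict, List, Pattern, Tuple

# Hypothetical organization-specific formats, supplied as configuration.
CUSTOM_PATTERNS: Dict[str, str] = {
    "EMPLOYEE_ID": r"\bEMP-\d{6}\b",
    "MATTER_CODE": r"\bMAT-[A-Z]{2}-\d{4}\b",
}


def compile_detectors(patterns: Dict[str, str]) -> Dict[str, Pattern[str]]:
    """Compile each configured pattern once, keyed by its data class name."""
    return {kind: re.compile(pattern) for kind, pattern in patterns.items()}


def detect(text: str, detectors: Dict[str, Pattern[str]]) -> List[Tuple[int, int, str]]:
    """Return (start, end, kind) for every match, ordered by position in the text."""
    findings = [
        (m.start(), m.end(), kind)
        for kind, pattern in detectors.items()
        for m in pattern.finditer(text)
    ]
    return sorted(findings)
```

Adding a new identifier is then a configuration change: one more entry in the pattern table, with no model retraining involved.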

Technical approach

Built for measurable risk reduction, not absolute claims.

Structured identifiers, contextual entities, and ambiguous data classes require different detection and validation strategies.

Policy-driven detection

Only the detectors relevant to the customer, geography, document type, and workflow need to be enabled.

Checksum validation

Where supported, regex candidates are validated with checksum algorithms such as Luhn, MOD-97, and Verhoeff, plus identifier-specific checks.
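
Two of the named checks are short enough to show. The sketch below implements the Luhn check used for payment card numbers and the MOD-97 check used for IBANs; the function names are illustrative, and a production validator would also check country-specific IBAN lengths, which this sketch does not.

```python
def luhn_valid(number: str) -> bool:
    """Luhn check: double every second digit from the right; the sum must end in 0."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def iban_mod97_valid(iban: str) -> bool:
    """MOD-97 check: move the first four characters to the end, map letters
    to numbers (A=10 ... Z=35), and require the result modulo 97 to equal 1."""
    compact = "".join(iban.split()).upper()
    if len(compact) < 5 or not (compact.isascii() and compact.isalnum()):
        return False
    rearranged = compact[4:] + compact[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1
```

A regex candidate that fails its checksum can be discarded or down-weighted, which reduces false positives on digit strings that merely look like card numbers or IBANs.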

Client-specific tuning

Confidence thresholds, custom identifiers, token prefixes, severity, and active policies can be configured per deployment.
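
A per-deployment profile of this kind could be represented as a small immutable configuration object. The sketch below is hypothetical: `PolicyProfile`, `Finding`, their field names, and the default threshold are assumptions made for illustration, not the product's configuration schema.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, NamedTuple


class Finding(NamedTuple):
    kind: str          # data class, for example "EMAIL" or "IBAN"
    start: int         # character offset where the value begins
    end: int           # character offset just past the value
    confidence: float  # detector confidence between 0 and 1


@dataclass(frozen=True)
class PolicyProfile:
    name: str
    enabled_kinds: FrozenSet[str]
    min_confidence: float = 0.8  # illustrative default, tuned per deployment
    token_prefix: str = ""

    def accepts(self, finding: Finding) -> bool:
        """A finding is masked only if its kind is enabled and confident enough."""
        return (finding.kind in self.enabled_kinds
                and finding.confidence >= self.min_confidence)

    def token(self, kind: str, index: int) -> str:
        """Build a typed token carrying the deployment's prefix."""
        return f"[{self.token_prefix}{kind}_{index}]"


def apply_policy(findings: List[Finding], profile: PolicyProfile) -> List[Finding]:
    """Keep only the findings the active profile accepts."""
    return [f for f in findings if profile.accepts(f)]
```

With a profile that enables only `EMAIL` and `IBAN` at a 0.9 threshold, a high-confidence US SSN finding and a 0.85-confidence IBAN finding would both be ignored.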

Private deployment

The detection layer is designed to run locally, in a private cloud, or inside an enterprise-controlled container environment.

Exact-span masking

The system tracks character offsets so only the identified value is replaced while surrounding text remains intact.
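
Offset-based replacement can be sketched as follows. The function name `apply_spans` and the tuple layout are assumptions; the idea is to copy the untouched text between spans verbatim and to reject overlapping spans rather than guess which one wins.

```python
from typing import List, Tuple

# (start, end, token): end is exclusive, offsets refer to the original text.
Span = Tuple[int, int, str]


def apply_spans(text: str, spans: List[Span]) -> str:
    """Replace each span with its token, leaving all other characters intact."""
    ordered = sorted(spans, key=lambda span: span[0])
    for previous, current in zip(ordered, ordered[1:]):
        if current[0] < previous[1]:
            raise ValueError("overlapping spans must be resolved before masking")
    pieces = []
    cursor = 0
    for start, end, token in ordered:
        pieces.append(text[cursor:start])  # unchanged text before the span
        pieces.append(token)
        cursor = end
    pieces.append(text[cursor:])           # unchanged text after the last span
    return "".join(pieces)
```

Applied to the request shown earlier with one email span and one IBAN span, this yields "Contact [EMAIL_1] and transfer funds to IBAN [IBAN_1]." with the surrounding words and punctuation untouched.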

Measurable evaluation

PoV results are reported using precision, recall, F1, exact-span accuracy, leakage, restoration accuracy, and latency.
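
For reference, precision, recall, and F1 under exact-span matching can be computed as below. This is a sketch assuming each entity is a `(start, end, kind)` tuple and that a prediction counts as correct only when all three fields match; the function name `span_metrics` is illustrative.

```python
from typing import Dict, Set, Tuple

Entity = Tuple[int, int, str]  # (start, end, kind)


def span_metrics(expected: Set[Entity], predicted: Set[Entity]) -> Dict[str, float]:
    """Precision, recall, and F1 where only exact (start, end, kind) matches count."""
    true_pos = len(expected & predicted)
    false_pos = len(predicted - expected)
    false_neg = len(expected - predicted)  # missed entities, i.e. potential leaks
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```

Under this strict definition, a prediction that is off by one character counts as both a false positive and a false negative, which is why exact-span accuracy is reported alongside the other metrics.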

Where it fits

Add masking before the workflow reaches the model.

The gateway can be integrated into applications that send user input, documents, retrieved context, logs, or tool output to an LLM.

Document and knowledge assistants

Mask sensitive content extracted from PDFs, DOCX files, spreadsheets, emails, tickets, and knowledge-base records.

Developer copilots

Detect credentials, private keys, tokens, network identifiers, and labelled secrets before code or logs reach an LLM.

Internal enterprise assistants

Apply department-specific policies for HR, finance, customer support, legal, healthcare, and operational workflows.

RAG and agent workflows

Place a masking layer in front of retrieved context, agent messages, tool calls, and external model requests.

Proof of Value

Validate it against the client’s real risk profile.

Every PoV is scoped around the customer’s countries, departments, document types, identifiers, model workflows, and acceptable risk.

Scope

Selected workflows and data classes

Policy

Customer-specific detector profile

Evaluation

Positive and negative test datasets

Outcome

Measured results and limitations

PoV deliverables

Customer-specific detector and policy profile

Synthetic or approved sanitized evaluation dataset

Positive, negative, malformed, and overlap test cases

Precision, recall, F1, and exact-span results

Sensitive-data leakage and restoration measurements

Latency and throughput measurements

Known limitations and remediation recommendations

Deployment and integration handoff

Grounded security claims.

The product is evaluated against defined policies and datasets. Results are reported with their scope and known limitations.

No claim that regex alone can identify every form of sensitive information

Ambiguous detectors are enabled only when relevant to the customer workflow

Structured and contextual detection results are evaluated separately

Baseline results are not presented as universal production accuracy

Client-specific validation is completed before broader deployment

Evaluate what reaches your LLM before expanding its access.

Start with one workflow, one policy profile, and a measurable customer-specific dataset.

Request a PoV discussion

Request technical overview

Private deployment options

Scoped evaluation

Integration handoff