
[Cloudflare One] Improve clarity of DLP profiles documentation#29549

Open
Oxyjun wants to merge 1 commit into production from jun/eli5/dlp-profiles


@Oxyjun Oxyjun commented Apr 2, 2026

Simplify jargon, define technical terms on first use, and restructure key sections for readability across all four DLP profile pages. Changes were generated via an ELI5 review with adversarial fact-checking; all flagged claims were corrected before applying.

  • index.mdx — Rewrite intro to lead with purpose ("defines what sensitive data looks like") before composition. Detection entry types now align with the detection-entries.mdx taxonomy (datasets, document fingerprints, AI prompt topic classifiers). Link predefined profiles on first mention.
  • predefined-profiles.mdx — Replace vague "detection granularity" with "detection accuracy." Add a paragraph before the Financial Information table defining the three validation methods (Luhn, checksum, regex) so readers do not need to leave the page. Clarify the SSN entry to explain that Social Security numbers have no built-in checksum, so DLP validates format only.
  • integration-profiles.mdx — Replace abstract "data classification providers" with a concrete example (Microsoft Purview sensitivity labels). Remove "simply" (style guide violation). Reframe intro to explain why you would use an integration profile (avoid recreating rules manually).
  • advanced-settings.mdx — Reframe match count as a threshold for clarity. Add an illustrative example to AI context analysis while preserving the original confidence-adjustment mechanism (not binary true/false). Explicitly note that vector embeddings, not raw text, are sent to and stored by Workers AI. Restructure confidence thresholds so the inclusive/exclusive behavior (Low includes Medium + High) appears in the level definitions before the configuration steps. Define "proximity keywords" inline with an SSN example.

Simplify jargon, define technical terms on first use, and restructure
sections for readability across all four DLP profile pages. Changes
include leading with purpose over composition, defining validation
methods (Luhn, checksum, regex) before the predefined profiles table,
clarifying match count as a threshold, restructuring confidence
thresholds so inclusive/exclusive behavior is stated before config
steps, and preserving vector embedding terminology for AI context
analysis privacy properties.

github-actions bot commented Apr 2, 2026

This pull request requires reviews from CODEOWNERS as it changes files that match the following patterns:

| Pattern | Owners |
| --- | --- |
| /src/content/docs/cloudflare-one/data-loss-prevention/ | @Maddy-Cloudflare, @codyanthony850, @cloudflare/pcx-technical-writing |

@Oxyjun Oxyjun self-assigned this Apr 2, 2026
Confidence threshold is set on the DLP profile. When you select a confidence threshold in Cloudflare One, you will see which DLP entries will be affected by the confidence threshold. Entries that do not reflect a confidence threshold in Cloudflare One are not yet supported or are not applicable.
When you set a confidence threshold on a profile, DLP only triggers on detections at that level or higher:

- **Low** (default) — Based on regular expressions with few proximity keywords. This is the most inclusive setting and also includes Medium and High confidence detections.
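
To make the inclusive behavior in this hunk concrete, here is a minimal sketch of "that level or higher" as an ordered comparison. The `Confidence` enum and `triggers` function are hypothetical names invented for illustration; they are not part of any Cloudflare API, and the real DLP engine may represent confidence differently.

```python
from enum import IntEnum


class Confidence(IntEnum):
    """Hypothetical ordering of the three levels named in the docs."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def triggers(detection: Confidence, threshold: Confidence = Confidence.LOW) -> bool:
    """Return True when a detection meets or exceeds the profile threshold.

    Assumption: the threshold is a simple lower bound, so a Low threshold
    (the default) also admits Medium and High detections.
    """
    return detection >= threshold


# A Low threshold is the most inclusive setting.
print([level.name for level in Confidence if triggers(level, Confidence.LOW)])
# A High threshold admits only High-confidence detections.
print([level.name for level in Confidence if triggers(level, Confidence.HIGH)])
```

Modeling the levels as an ordered enum keeps the "Low includes Medium and High" rule in one comparison rather than in three separate cases.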

@alexamavrogianis alexamavrogianis Apr 2, 2026


How we describe these levels in the dash:

Low: Applies basic regex-based detection with validation with high tolerance for false positives.
Medium: Applies some additional validation with medium tolerance for false positives.
High: Applies rigorous contextual validation for minimal false positives.

I think a nice mix of the two would be best.


Credit card numbers begin with a six or eight-digit Issuer Identification Number (IIN) and are followed by up to 23 additional digits. Card verification values (CVVs) are not validated.

In the table below, entries use one of three validation methods. [Luhn's algorithm](https://en.wikipedia.org/wiki/Luhn_algorithm) is a checksum formula used to verify credit card numbers. Entries validated "with checksum" use an arithmetic check specific to that number format. Entries validated "with regex" match a known text pattern without performing a mathematical check.
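
For readers who want to see what the Luhn check in this paragraph actually computes, the following is a small sketch of the standard public algorithm. The function name `luhn_is_valid` is made up for this example, and nothing here is claimed to match how Cloudflare DLP implements its validation internally.

```python
def luhn_is_valid(number: str) -> bool:
    """Check a digit string against the Luhn checksum.

    Non-digit characters (spaces, hyphens) are ignored. This is the generic
    algorithm only; it does not check issuer prefixes or card length.
    """
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False

    total = 0
    # Walk from the rightmost digit; double every second digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                # Equivalent to summing the two digits of the product.
                digit -= 9
        total += digit

    # A valid number has a total that is a multiple of 10.
    return total % 10 == 0


print(luhn_is_valid("4111 1111 1111 1111"))  # widely used test card number
print(luhn_is_valid("4111 1111 1111 1112"))  # same number with the last digit changed
```

Changing any single digit breaks the checksum, which is why a Luhn-validated entry produces fewer false positives than a pattern-only ("with regex") match on 16 arbitrary digits.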

This is such a good add


4 participants