[Cloudflare One] Improve clarity of DLP profiles documentation #29549
Open
Oxyjun wants to merge 1 commit into production
Simplify jargon, define technical terms on first use, and restructure sections for readability across all four DLP profile pages. Changes include leading with purpose over composition, defining validation methods (Luhn, checksum, regex) before the predefined profiles table, clarifying match count as a threshold, restructuring confidence thresholds so inclusive/exclusive behavior is stated before config steps, and preserving vector embedding terminology for AI context analysis privacy properties.
Contributor
This pull request requires reviews from CODEOWNERS as it changes files that match the following patterns:
Contributor
Confidence threshold is set on the DLP profile. When you select a confidence threshold in Cloudflare One, you will see which DLP entries will be affected by the confidence threshold. Entries that do not reflect a confidence threshold in Cloudflare One are not yet supported or are not applicable.
When you set a confidence threshold on a profile, DLP only triggers on detections at that level or higher:
- **Low** (default): Based on regular expressions with few proximity keywords. This is the most inclusive setting and also includes Medium and High confidence detections.
Contributor
How we describe these levels in the dash:
Low: Applies basic regex-based detection with validation with high tolerance for false positives.
Medium: Applies some additional validation with medium tolerance for false positives.
High: Applies rigorous contextual validation for minimal false positives.
I think a nice mix of the two would be best.
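The inclusive behavior in the proposed wording (DLP triggers on detections at the selected level or higher, so Low also admits Medium and High) can be illustrated with a small sketch. This is only a model of the rule as the diff states it, not Cloudflare's implementation; the `Confidence` enum and the `triggers` function are hypothetical names introduced for the example.

```python
from enum import IntEnum


class Confidence(IntEnum):
    """Hypothetical ordering of confidence levels, lowest to highest."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def triggers(detection: Confidence, threshold: Confidence = Confidence.LOW) -> bool:
    """Return True when a detection meets or exceeds the profile threshold.

    Assumption: the threshold is a simple "this level or higher" cutoff,
    as described in the proposed documentation text.
    """
    return detection >= threshold


# A Low threshold is the most inclusive: every detection level passes.
low_results = [triggers(level, Confidence.LOW) for level in Confidence]

# A High threshold only passes High-confidence detections.
high_results = [triggers(level, Confidence.HIGH) for level in Confidence]
```

Under this model, raising the threshold can only shrink the set of detections that trigger, which is the property the reordered section is trying to make obvious before the configuration steps.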
Credit card numbers begin with a six or eight-digit Issuer Identification Number (IIN) and are followed by up to 23 additional digits. Card verification values (CVVs) are not validated.
In the table below, entries use one of three validation methods. [Luhn's algorithm](https://en.wikipedia.org/wiki/Luhn_algorithm) is a checksum formula used to verify credit card numbers. Entries validated "with checksum" use an arithmetic check specific to that number format. Entries validated "with regex" match a known text pattern without performing a mathematical check.
Contributor
This is such a good add
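For readers of this thread who want to see the difference between the validation methods the new paragraph defines, here is a minimal sketch. `luhn_valid` is the standard Luhn checksum; `looks_like_ssn` is a hypothetical format-only regex check, and its pattern is an assumption for illustration, not the pattern Cloudflare DLP uses.

```python
import re


def luhn_valid(number: str) -> bool:
    """Check a digit string against the Luhn checksum.

    Non-digit characters (spaces, dashes) are ignored.
    """
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0


# Assumed pattern: three digits, two digits, four digits, dash-separated.
SSN_FORMAT = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def looks_like_ssn(text: str) -> bool:
    """Format-only match: no arithmetic check is performed."""
    return SSN_FORMAT.search(text) is not None


valid_card = luhn_valid("4111 1111 1111 1111")    # well-known test number
invalid_card = luhn_valid("4111 1111 1111 1112")  # last digit altered
ssn_like = looks_like_ssn("ID: 123-45-6789")
```

The checksum rejects a number with a single altered digit, while the regex accepts any string with the right shape. That is why format-only entries tend to need more context, such as proximity keywords, to limit false positives.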
Simplify jargon, define technical terms on first use, and restructure key sections for readability across all four DLP profile pages. Changes were generated via an ELI5 review with adversarial fact-checking; all flagged claims were corrected before applying.
- `index.mdx`: Rewrite intro to lead with purpose ("defines what sensitive data looks like") before composition. Detection entry types now align with the `detection-entries.mdx` taxonomy (datasets, document fingerprints, AI prompt topic classifiers). Link predefined profiles on first mention.
- `predefined-profiles.mdx`: Replace vague "detection granularity" with "detection accuracy." Add a paragraph before the Financial Information table defining the three validation methods (Luhn, checksum, regex) so readers do not need to leave the page. Clarify the SSN entry to explain that Social Security numbers have no built-in checksum, so DLP validates format only.
- `integration-profiles.mdx`: Replace abstract "data classification providers" with a concrete example (Microsoft Purview sensitivity labels). Remove "simply" (style guide violation). Reframe intro to explain why you would use an integration profile (avoid recreating rules manually).
- `advanced-settings.mdx`: Reframe match count as a threshold for clarity. Add an illustrative example to AI context analysis while preserving the original confidence-adjustment mechanism (not binary true/false). Explicitly note that vector embeddings, not raw text, are sent to and stored by Workers AI. Restructure confidence thresholds so the inclusive/exclusive behavior (Low includes Medium + High) appears in the level definitions before the configuration steps. Define "proximity keywords" inline with an SSN example.