Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ The following advanced detection settings are available for predefined and custo

### Match count

Match count refers to the number of times that any enabled entry in the profile can be detected before an action is triggered, such as blocking or logging. For example, if you select a match count of 10, the scanned file or HTTP body must contain 11 or more matching strings. Detections do not have to be unique.
Match count sets a minimum threshold for detections. DLP does not trigger an action (such as blocking or logging) until the number of detections exceeds the match count. For example, if you set a match count of 10, the scanned file or HTTP body must contain 11 or more matching strings before the action triggers. Detections do not have to be unique.
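
The threshold rule above can be sketched in a few lines of Python. This is a minimal illustration of the documented comparison, not Cloudflare's implementation; the function name `should_trigger` and its parameters are hypothetical.

```python
def should_trigger(detection_count: int, match_count: int) -> bool:
    """Return True when the number of detections exceeds the match count.

    Hypothetical helper for illustration only. It mirrors the documented
    rule that detections must exceed (not merely equal) the match count.
    """
    return detection_count > match_count


# Detections do not have to be unique, so repeated matches all count.
detections = ["matching-string"] * 11
print(should_trigger(len(detections), 10))  # 11 detections exceed a match count of 10
print(should_trigger(10, 10))               # 10 detections do not exceed 10
```

The strict `>` comparison is the point of the sketch: a match count of 10 means the action triggers on the eleventh detection.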

### Optical Character Recognition (OCR)

Expand All @@ -38,9 +38,9 @@ OCR supports scanning `.jpg`/`.jpeg` and `.png` files between 4 KB and 1 MB in s
AI context analysis only supports Gateway HTTP and HTTPS traffic.
:::

AI context analysis uses a pretrained model to analyze and adjust the confidence in a detection based on its surrounding context. DLP will log any matches that are above your confidence threshold.
AI context analysis uses a pretrained model to analyze surrounding context and adjust the confidence level of a detection. For example, a number that matches a credit card pattern may receive a lower confidence score if it appears in a context where credit card numbers are unlikely. DLP will log any matches that are above your confidence threshold.

DLP redacts any matched text, then submits the context as an AI text embedding vector to [Cloudflare Workers AI](/workers-ai/). Vectors are stored in user-specific private namespaces for up to six months, along with hit count and the [false positive/negative report](/cloudflare-one/data-loss-prevention/dlp-policies/logging-options/#report-false-and-true-positives-to-ai-context-analysis).
DLP redacts any matched text, then converts the surrounding context into a vector embedding and submits it to [Cloudflare Workers AI](/workers-ai/). Vector embeddings (not raw text) are stored in user-specific private namespaces for up to six months, along with hit count and the [false positive/negative report](/cloudflare-one/data-loss-prevention/dlp-policies/logging-options/#report-false-and-true-positives-to-ai-context-analysis).

To use AI context analysis:

Expand All @@ -52,11 +52,15 @@ AI context analysis results will appear in the payload section of your [DLP logs

### Confidence thresholds

Confidence thresholds indicate how confident Cloudflare DLP is in a DLP detection. DLP determines the confidence by inspecting the content for proximity keywords around the detection.
Confidence thresholds indicate how confident Cloudflare DLP is in a detection. DLP determines the confidence level by inspecting the content for proximity keywords — related terms that appear near the detected data. For example, the word "SSN" appearing near a 9-digit number increases confidence that the number is a Social Security number.

Confidence threshold is set on the DLP profile. When you select a confidence threshold in Cloudflare One, you will see which DLP entries will be affected by the confidence threshold. Entries that do not reflect a confidence threshold in Cloudflare One are not yet supported or are not applicable.
When you set a confidence threshold on a profile, DLP only triggers on detections at that level or higher:

- **Low** (default): Based on regular expressions with few proximity keywords. This is the most inclusive setting, with a high tolerance for false positives.
- **Medium**: Applies additional validations to filter out low-confidence detections. This setting has a medium tolerance for false positives.
- **High**: Applies rigorous contextual validation to minimize false positives, so detections are more likely to be accurate.
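
As a rough sketch of the "at that level or higher" behavior, the following Python snippet filters a list of detections by threshold. The `Confidence` enum, the `filter_detections` function, and the sample values are all hypothetical and exist only to illustrate the ordering of the three levels.

```python
from enum import IntEnum


class Confidence(IntEnum):
    """Hypothetical ordering of the three documented confidence levels."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def filter_detections(detections, threshold):
    """Keep only detections whose confidence is at or above the threshold.

    Each detection is a (matched_text, confidence) tuple.
    """
    return [d for d in detections if d[1] >= threshold]


# Illustrative sample data, not real DLP output.
detections = [
    ("value-a", Confidence.LOW),
    ("value-b", Confidence.MEDIUM),
    ("value-c", Confidence.HIGH),
]

for level in Confidence:
    kept = filter_detections(detections, level)
    print(level.name, [text for text, _ in kept])
```

With a Low threshold all three sample detections are kept, while a High threshold keeps only the High confidence detection.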

DLP confidence detections consist of Low, Medium, and High confidence thresholds. DLP will default to Low confidence detections, which are based on regular expressions, require few keywords, and will trigger more often. Medium and High confidence detections require more keywords, will trigger less often, and have a higher likelihood of accuracy.
Confidence threshold is set on the DLP profile. When you select a confidence threshold in Cloudflare One, you will see which DLP entries will be affected by the confidence threshold. Entries that do not reflect a confidence threshold in Cloudflare One are not yet supported or are not applicable.

To change the confidence threshold of a DLP profile:

Expand All @@ -65,8 +69,6 @@ To change the confidence threshold of a DLP profile:
3. In **Settings** > **Confidence threshold**, choose a new confidence threshold from the dropdown menu.
4. Select **Save profile**.

Setting the confidence to Low will also consider Medium and High confidence detections as matches. Setting the confidence to Medium or High will filter out lower confidence detections.

#### Gateway detections

For inline detections in Gateway, to display Low and Medium confidence detections but block High confidence detections, Cloudflare recommends creating two HTTP policies. The first policy should use a Low confidence DLP profile with an Allow action. The second policy should use a High confidence DLP profile with a Block action. For example:
Expand Up @@ -8,7 +8,9 @@ sidebar:

import { Render } from "~/components";

A DLP profile is a collection of regular expressions and [detection entries](/cloudflare-one/data-loss-prevention/detection-entries/) that define the data patterns you want to detect. Cloudflare DLP provides predefined profiles for common detections, or you can build custom DLP profiles specific to your data, organization, and risk tolerance.
A DLP profile defines what sensitive data looks like so that DLP can detect it in your traffic. Each profile contains one or more [detection entries](/cloudflare-one/data-loss-prevention/detection-entries/) — patterns such as uploaded datasets, document fingerprints, and AI prompt topic classifiers that match specific types of data.

Cloudflare DLP provides [predefined profiles](/cloudflare-one/data-loss-prevention/dlp-profiles/predefined-profiles/) for common sensitive data types such as credit card numbers and national identifiers. You can also build custom DLP profiles specific to your data, organization, and risk tolerance.

## Configure a predefined profile

Expand Up @@ -14,9 +14,9 @@ import { Render } from "~/components";
Integration profiles require [Cloudflare CASB](/cloudflare-one/integrations/cloud-and-saas/).
:::

Cloudflare DLP integration profiles enable data loss prevention support for third-party data classification providers. Data classification information is retrieved from the third-party platform and populated into a DLP Profile. You can then enable detection entries in the profile and create a DLP policy to allow or block matching data.
Integration profiles let you use data classifications from a third-party platform (such as Microsoft Purview sensitivity labels) directly in Cloudflare DLP. Instead of recreating classification rules in Cloudflare, DLP retrieves them from the third-party platform and populates them as detection entries in a DLP profile. You can then enable the entries you want and create a DLP policy to allow or block matching data.

Detection entries in integration profiles are managed by the third-party platform and cannot be manually added, edited, or deleted within Cloudflare DLP.
Detection entries in integration profiles are managed by the third-party platform. You cannot manually add, edit, or delete these entries within Cloudflare DLP.

## Microsoft Purview Information Protection (MIP) sensitivity labels

Expand All @@ -26,7 +26,7 @@ Microsoft provides [Purview Information Protection sensitivity labels](https://l

### Setup

To add MIP sensitivity labels to a DLP Profile, simply integrate your Microsoft account with [Cloudflare CASB](/cloudflare-one/integrations/cloud-and-saas/microsoft-365/). A new integration profile will appear under **Data loss prevention** > **DLP profiles**. The profile is named **MIP Sensitivity Labels** followed by the name of the CASB integration.
To add MIP sensitivity labels to a DLP profile, integrate your Microsoft account with [Cloudflare CASB](/cloudflare-one/integrations/cloud-and-saas/microsoft-365/). A new integration profile will appear under **Data loss prevention** > **DLP profiles**. The profile is named **MIP Sensitivity Labels** followed by the name of the CASB integration.

MIP sensitivity labels can also be added to a [custom DLP profile](/cloudflare-one/data-loss-prevention/dlp-profiles/#build-a-custom-profile) as an existing entry.

Expand Up @@ -7,7 +7,7 @@ sidebar:

import { Render } from "~/components";

Cloudflare Zero Trust provides predefined DLP profiles for common types of sensitive data. Some profiles include built-in validation checks to increase detection granularity. Additionally, you can configure [advanced settings](/cloudflare-one/data-loss-prevention/dlp-profiles/advanced-settings/) for predefined profiles.
Cloudflare Zero Trust provides predefined DLP profiles for common types of sensitive data. Some profiles include built-in validation checks that increase detection accuracy. You can also configure [advanced settings](/cloudflare-one/data-loss-prevention/dlp-profiles/advanced-settings/) for predefined profiles.

## AI Prompt

Expand Down Expand Up @@ -39,6 +39,8 @@ The following secrets are validated with regex.

Credit card numbers begin with a six or eight-digit Issuer Identification Number (IIN) and are followed by up to 23 additional digits. Card verification values (CVVs) are not validated.

In the table below, entries use one of three validation methods. [Luhn's algorithm](https://en.wikipedia.org/wiki/Luhn_algorithm) is a checksum formula used to verify credit card numbers. Entries validated "with checksum" use an arithmetic check specific to that number format. Entries validated "with regex" match a known text pattern without performing a mathematical check.
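
For readers unfamiliar with Luhn's algorithm, here is a minimal Python sketch of the standard checksum. It shows the general technique only; the function name `luhn_valid` is hypothetical, and this is not a description of how Cloudflare DLP implements the check.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum.

    Non-digit characters such as spaces or hyphens are ignored.
    """
    digits = [int(ch) for ch in number if ch.isdecimal()]
    if len(digits) < 2:
        return False

    total = 0
    # Walk from the rightmost digit and double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit

    # A valid number has a total that is a multiple of 10.
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))  # widely used test card number
print(luhn_valid("4111 1111 1111 1112"))  # same number with the last digit changed
```

A pattern match alone would accept both sample numbers; the checksum is what rejects the second one.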

| Detection entry | Notes |
| -------------------------------- | --------------------------------------------------------------------------------- |
| American Express Card Number | Validated using [Luhn's algorithm](https://en.wikipedia.org/wiki/Luhn_algorithm). |
Expand Down Expand Up @@ -71,22 +73,22 @@ The following diagnosis and medication names are checked for surrounding ASCII c

The following national identifier detections are validated algorithmically when possible.

| Detection entry | Notes |
| ---------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| United States SSN Numeric Detection | Commonly used separators are required to match the detection entry. For example, `000-00-0000` matches but `000000000` does not. Social security numbers do not adhere to algorithmic validation. |
| Social Security Number Text | Text matching `ssn` or `social security`. |
| Australia Tax File Number | Validated with checksum. |
| Canada Social Insurance Number | Validated using Luhn's algorithm. |
| France Social Security Number | Validated with regex. |
| Hong Kong Identity Card (HKIC) Number | Validated with checksum. |
| Indonesia Identity Card Number | Validated with regex. |
| Malaysian National Identity Card Number | Validated with regex. |
| Philippines Unified Multi-Purpose ID (UMID) Number | Validated with regex. |
| Singapore National Registration Identity Card Number | Validated with checksum. |
| Taiwan National Identification Number | Validated with checksum. |
| Thai Identity Card Number | Validated with checksum. |
| United Kingdom NHS Number | Validated with checksum. |
| United Kingdom National Insurance Number | Validated with regex. |
| Detection entry | Notes |
| ---------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| United States SSN Numeric Detection | Matched values must include commonly used separators. For example, `000-00-0000` matches but `000000000` does not. Unlike credit card numbers, Social Security numbers have no built-in checksum, so DLP validates the format only. |
| Social Security Number Text | Text matching `ssn` or `social security`. |
| Australia Tax File Number | Validated with checksum. |
| Canada Social Insurance Number | Validated using Luhn's algorithm. |
| France Social Security Number | Validated with regex. |
| Hong Kong Identity Card (HKIC) Number | Validated with checksum. |
| Indonesia Identity Card Number | Validated with regex. |
| Malaysian National Identity Card Number | Validated with regex. |
| Philippines Unified Multi-Purpose ID (UMID) Number | Validated with regex. |
| Singapore National Registration Identity Card Number | Validated with checksum. |
| Taiwan National Identification Number | Validated with checksum. |
| Thai Identity Card Number | Validated with checksum. |
| United Kingdom NHS Number | Validated with checksum. |
| United Kingdom National Insurance Number | Validated with regex. |
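
The separator requirement for the United States SSN numeric detection can be illustrated with a short Python regex sketch. The pattern below is an assumption made for illustration: it treats a hyphen or a space as a "commonly used separator" and is not the expression Cloudflare DLP uses. The names `SSN_WITH_SEPARATORS` and `looks_like_ssn` are hypothetical.

```python
import re

# Assumed pattern: three digits, two digits, four digits, with a hyphen or
# space between each group. The real DLP pattern is not published here.
SSN_WITH_SEPARATORS = re.compile(r"\b\d{3}[- ]\d{2}[- ]\d{4}\b")


def looks_like_ssn(text: str) -> bool:
    """Return True if `text` contains a separator-delimited 9-digit number.

    This is a format check only; Social Security numbers have no checksum.
    """
    return SSN_WITH_SEPARATORS.search(text) is not None


print(looks_like_ssn("000-00-0000"))  # separators present
print(looks_like_ssn("000000000"))    # no separators
```

Because the check is format-only, it cannot distinguish a real Social Security number from any other value written in the same layout.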

## Source Code
