Commit 4c0070c

add draft documentation
1 parent d931917 commit 4c0070c

2 files changed: +58 -1 lines changed


docs/utilities/data_masking.md

Lines changed: 20 additions & 1 deletion
@@ -43,7 +43,7 @@ stateDiagram-v2
4343

4444
## Terminology
4545

46-
**Erasing** replaces sensitive information **irreversibly** with a non-sensitive placeholder _(`*****`)_. This operation replaces data in-memory, making it a one-way action.
46+
**Erasing** replaces sensitive information **irreversibly** with a non-sensitive placeholder _(`*****`)_, or with a customized mask. This operation replaces data in-memory, making it a one-way action.
4747

4848
**Encrypting** transforms plaintext into ciphertext using an encryption algorithm and a cryptographic key. It allows you to encrypt any sensitive data so that only authorized personnel can decrypt it. Learn more about encryption [here](https://aws.amazon.com/blogs/security/importance-of-encryption-and-how-aws-can-help/){target="_blank"}.
4949

@@ -117,6 +117,25 @@ Erasing will remove the original data and replace it with a `*****`. This means
117117
--8<-- "examples/data_masking/src/getting_started_erase_data_output.json"
118118
```
119119

120+
The `erase` method also supports additional flags for more advanced and flexible masking:
121+
122+
| Flag | Behavior |
123+
| ---------------- | ----------------------------------------------------------|
124+
| `dynamic_mask` (bool) | When set to `True`, enables custom masking behavior, activating advanced masking techniques such as pattern-based or regex-based masking. |
125+
| `custom_mask` (str) | Specifies a simple pattern for masking data. The pattern is applied directly to the input string and replaces all of the original characters. For example, a `custom_mask` of `"XX-XX"` applied to `"12345"` results in `"XX-XX"`. |
126+
| `regex_pattern` (str) | Defines a regular expression that identifies the parts of the input string to mask, allowing more complex and flexible masking rules. Use it together with `mask_format`. |
127+
| `mask_format` (str) | Specifies the replacement format for the parts of the string matched by `regex_pattern`. It can include backreferences (such as `\1`, `\2`) to groups captured by the regex pattern, so that some parts of the original string are preserved. |
128+
| `masking_rules` (dict) | Maps each data field to its own set of masking flags, so different fields can be masked in different ways. |
129+
130+
=== "custom_data_masking.py"
131+
```python hl_lines="13 17 21 25 36"
132+
--8<-- "examples/data_masking/src/custom_data_masking.py"
133+
```
134+
=== "generic_data_input.json"
135+
```json hl_lines="6 7 9 12"
136+
--8<-- "examples/data_masking/src/generic_data_input.json"
137+
```
138+
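
The pairing of `regex_pattern` and `mask_format` described in the table reads like a regular-expression substitution with backreferences. The sketch below illustrates that idea using only Python's standard `re` module. It is not the Powertools implementation: `mask_with_regex` is a hypothetical helper, and the sample address is an assumed input chosen to match the `j****@example.com` result shown in the example file.

```python
import re


def mask_with_regex(value: str, regex_pattern: str, mask_format: str) -> str:
    """Replace whatever `regex_pattern` matches in `value` with `mask_format`.

    Hypothetical helper: backreferences such as \\1 and \\3 in `mask_format`
    keep the corresponding captured groups from the original string.
    """
    return re.sub(regex_pattern, mask_format, value)


# Keep the first character and the domain; mask everything in between.
masked_email = mask_with_regex("john@example.com", r"(.)(.*)(@.*)", r"\1****\3")
print(masked_email)  # j****@example.com

# A value the pattern does not match is returned unchanged.
unmatched = mask_with_regex("no-at-sign", r"(.)(.*)(@.*)", r"\1****\3")
```

Note that in this sketch a value the pattern does not match passes through unmasked, so a pattern-based rule is only as safe as the pattern's coverage of the expected inputs.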
120139
### Encrypting data
121140

122141
???+ note "About static typing and encryption"

examples/data_masking/src/custom_data_masking.py

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
from __future__ import annotations

from aws_lambda_powertools.utilities.data_masking import DataMasking
from aws_lambda_powertools.utilities.typing import LambdaContext

data_masker = DataMasking()


def lambda_handler(event: dict, context: LambdaContext) -> dict:
    data: dict = event.get("body", {})

    # Default erase (*****)
    default_erased = data_masker.erase(data, fields=["address.zip"])
    # 'zip': '*****'

    # dynamic_mask
    dynamic_mask = data_masker.erase(data, fields=["address.street"], dynamic_mask=True)
    # 'street': '*** **** **'

    # custom_mask
    custom_mask = data_masker.erase(data, fields=["address.zip"], custom_mask="XX")
    # 'zip': 'XX'

    # regex_pattern and mask_format
    regex_pattern = data_masker.erase(data, fields=["email"], regex_pattern=r"(.)(.*)(@.*)", mask_format=r"\1****\3")
    # 'email': 'j****@example.com'

    # Masking rules for each field
    masking_rules = {
        "email": {"regex_pattern": r"(.)(.*)(@.*)", "mask_format": r"\1****\3"},
        "age": {"dynamic_mask": True},
        "address.zip": {"dynamic_mask": True, "custom_mask": "xxx"},
        "address.street": {"dynamic_mask": False},
    }

    masking_rules_erase = data_masker.erase(data, masking_rules=masking_rules)

    # Return a dict so the result matches the declared return type
    return {
        "default_erased": default_erased,
        "dynamic_mask": dynamic_mask,
        "custom_mask": custom_mask,
        "regex_pattern": regex_pattern,
        "masking_rules_erase": masking_rules_erase,
    }
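
To make the per-field `masking_rules` idea concrete, here is a minimal standard-library sketch of how rules keyed by dotted paths could be applied to a nested dictionary. Everything in it is an assumption made for illustration: `mask_value` and `apply_masking_rules` are hypothetical helpers rather than part of Powertools, the sample payload is invented, and `dynamic_mask` is deliberately not modelled because its exact output is not specified here.

```python
from __future__ import annotations

import copy
import re
from typing import Any

DEFAULT_MASK = "*****"


def mask_value(value: Any, rule: dict[str, Any]) -> str:
    """Mask one value according to one rule (assumed semantics)."""
    if "regex_pattern" in rule and "mask_format" in rule:
        # Regex substitution; backreferences keep captured groups.
        return re.sub(rule["regex_pattern"], rule["mask_format"], str(value))
    if "custom_mask" in rule:
        # The custom mask replaces the whole value.
        return rule["custom_mask"]
    # No recognised flag: fall back to the default placeholder.
    return DEFAULT_MASK


def apply_masking_rules(data: dict[str, Any], rules: dict[str, dict[str, Any]]) -> dict[str, Any]:
    """Return a masked deep copy of `data`; keys of `rules` are dotted paths."""
    masked = copy.deepcopy(data)
    for path, rule in rules.items():
        *parents, leaf = path.split(".")
        node: Any = masked
        for key in parents:
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break
        # Paths that do not exist in the payload are skipped silently.
        if isinstance(node, dict) and leaf in node:
            node[leaf] = mask_value(node[leaf], rule)
    return masked


# Illustrative payload and rules (assumed, not taken from the example input file).
sample = {"email": "john@example.com", "address": {"zip": "12345", "street": "123 Main St"}}
rules = {
    "email": {"regex_pattern": r"(.)(.*)(@.*)", "mask_format": r"\1****\3"},
    "address.zip": {"custom_mask": "xxx"},
    "address.street": {},
}
result = apply_masking_rules(sample, rules)
```

The sketch works on a deep copy so that the input stays intact, which makes it easy to compare several masking strategies against the same payload. Whether the real `erase` method copies or modifies its input should be checked against the library before relying on either behavior.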
