
{{< product-availability >}}

{{% observability_pipelines/processors/add_env_vars %}}
## Overview

Use this processor to add an environment variable field name and value to the log message.

## Setup

To set up this processor:

1. Define a [filter query](#filter-query-syntax). Only logs that match the specified filter query are processed. All logs, regardless of whether they match the filter query, are sent to the next step in the pipeline.
1. Enter the field name for the environment variable.
1. Enter the environment variable name.
1. Click **Add Environment Variable** if you want to add another environment variable.

##### Blocked environment variables

Environment variables that match any of the following patterns are blocked from being added to log messages because they could contain sensitive data.

- `CONNECTIONSTRING` / `CONNECTION-STRING` / `CONNECTION_STRING`
- `AUTH`
- `CERT`
- `CLIENTID` / `CLIENT-ID` / `CLIENT_ID`
- `CREDENTIALS`
- `DATABASEURL` / `DATABASE-URL` / `DATABASE_URL`
- `DBURL` / `DB-URL` / `DB_URL`
- `KEY`
- `OAUTH`
- `PASSWORD`
- `PWD`
- `ROOT`
- `SECRET`
- `TOKEN`
- `USER`

Environment variable names are matched against these patterns as substrings, not as whole words. For example, the `PASSWORD` pattern blocks environment variables such as `USER_PASSWORD` and `PASSWORD_SECRET` from being added to the log messages.
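As an illustrative sketch, the substring-style matching described above could look like the following. This is not the Observability Pipelines Worker's actual implementation:

```python
# Illustrative sketch of the blocked-pattern check described above.
# Not the Observability Pipelines Worker's actual implementation.
BLOCKED_PATTERNS = [
    "CONNECTIONSTRING", "CONNECTION-STRING", "CONNECTION_STRING",
    "AUTH", "CERT", "CLIENTID", "CLIENT-ID", "CLIENT_ID",
    "CREDENTIALS", "DATABASEURL", "DATABASE-URL", "DATABASE_URL",
    "DBURL", "DB-URL", "DB_URL", "KEY", "OAUTH", "PASSWORD",
    "PWD", "ROOT", "SECRET", "TOKEN", "USER",
]

def is_blocked(env_var_name: str) -> bool:
    """Return True if the variable name contains any blocked pattern."""
    upper = env_var_name.upper()
    return any(pattern in upper for pattern in BLOCKED_PATTERNS)
```

For example, `is_blocked("USER_PASSWORD")` is `True` because the name contains both the `USER` and `PASSWORD` patterns.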

##### Allowlist

After you have added processors to your pipeline and clicked **Next: Install**, in the **Add environment variable processor(s) allowlist** field, enter a comma-separated list of environment variables you want to pull values from and use with this processor.

The allowlist is stored in the environment variable `DD_OP_PROCESSOR_ADD_ENV_VARS_ALLOWLIST`.
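For example, to allow this processor to read two variables, the allowlist could be set in the Worker's environment like this (the variable names `REGION` and `DEPLOYMENT_STAGE` are placeholders; replace them with the environment variables you actually use):

```shell
# Hypothetical example: allow the processor to read these two variables.
# REGION and DEPLOYMENT_STAGE are placeholder names.
export DD_OP_PROCESSOR_ADD_ENV_VARS_ALLOWLIST="REGION,DEPLOYMENT_STAGE"
```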

{{% observability_pipelines/processors/filter_syntax %}}

{{< product-availability >}}

{{% observability_pipelines/processors/add_hostname %}}
## Overview

This processor adds a field with the name of the host that sent the log. For example, `hostname: 613e197f3526`. **Note**: If the `hostname` already exists, the Worker throws an error and does not overwrite the existing `hostname`.

## Setup

To set up this processor:
- Define a **filter query**. Only logs that match the specified [filter query](#filter-query-syntax) are processed. All logs, regardless of whether they match the filter query, are sent to the next step in the pipeline.
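The behavior described above can be sketched as follows. This is an illustrative approximation, not the Worker's actual code; `socket.gethostname()` stands in for the name of the host that sent the log:

```python
import socket

def add_hostname(log: dict) -> dict:
    """Add a hostname field; refuse to overwrite an existing one."""
    if "hostname" in log:
        # The Worker throws an error instead of overwriting `hostname`.
        raise ValueError("hostname already exists; not overwriting")
    log["hostname"] = socket.gethostname()
    return log
```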

{{% observability_pipelines/processors/filter_syntax %}}

{{< product-availability >}}

{{% observability_pipelines/processors/custom_processor %}}
## Overview

Use this processor with Vector Remap Language (VRL) to modify and enrich your logs. VRL is an expression-oriented, domain-specific language designed for transforming logs, with built-in functions for observability use cases. You can use custom functions in the following ways:

- Manipulate [arrays](#array), [strings](#string), and other data types.
- Encode and decode values using [Codec](#codec).
- [Encrypt](#encrypt) and [decrypt](#decrypt) values.
- [Coerce](#coerce) one datatype to another datatype (for example, from an integer to a string).
- [Convert syslog values](#convert) to human-readable values.
- Enrich values by using [enrichment tables](#enrichment).
- [Manipulate IP values](#ip).
- [Parse](#parse) values with custom rules (for example, grok, regex, and so on) and out-of-the-box functions (for example, syslog, apache, VPC flow logs, and so on).
- Manipulate event [paths](#path).

See [Custom functions][1] for the full list of available functions.

See [Remap Reserved Attributes][2] for information on how to use the Custom Processor to manually and dynamically remap attributes.

## Setup

To set up this processor:

- If you have not created any functions yet, click **Add custom processor** and follow the instructions in [Add a function](#add-a-function) to create a function.
- If you have already added custom functions, click **Manage custom processors**. Click on a function in the list to edit or delete it. You can use the search bar to find a function by its name. Click **Add Custom Processor** to [add a function](#add-a-function).

##### Add a function

1. Enter a name for your custom processor.
1. Add your script to modify your logs using [custom functions][1]. You can also click **Autofill with Example** and select one of the common use cases to get started; click the copy icon to copy the example script and paste it into your script. See [Get Started with the Custom Processor][3] for more information.
1. Optionally, check **Drop events on error** if you want to drop events that encounter an error during processing.
1. Enter a sample log event.
1. Click **Run** to preview how the functions process the log. After the script has run, you can see the output for the log.
1. Click **Save**.
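To give a feel for what a script can do, here is a hypothetical VRL sketch that parses a JSON `message`, normalizes a `status` field, and drops a sensitive field. The field names are placeholders, not a prescribed schema:

```vrl
# Hypothetical example script; field names are placeholders.
.message = parse_json!(string!(.message))   # decode the JSON payload
.message.status = upcase(string!(.message.status))  # normalize the status value
del(.message.password)                      # remove a sensitive field
```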

[1]: /observability_pipelines/processors/custom_processor#custom-functions
[2]: /observability_pipelines/guide/remap_reserved_attributes
[3]: /observability_pipelines/guide/get_started_with_the_custom_processor

## Custom functions

58 changes: 57 additions & 1 deletion content/en/observability_pipelines/processors/edit_fields.md

{{< product-availability >}}

{{% observability_pipelines/processors/remap %}}
## Overview

The Edit Fields processor can add, drop, or rename fields within your individual log data. Use this processor to enrich your logs with additional context, remove low-value fields to reduce volume, and standardize naming across important attributes. Select **add field**, **drop field**, or **rename field** in the dropdown menu to get started.

See the [Remap Reserved Attributes][1] guide on how to use the Edit Fields processor to remap attributes.

## Setup

##### Add field
Use **add field** to append a new key-value field to your log.

To set up the add field processor:
1. Define a **filter query**. Only logs that match the specified [filter query](#filter-query-syntax) are processed. All logs, regardless of whether they match the filter query, are sent to the next step in the pipeline.
1. Enter the field and value you want to add. To specify a nested field for your key, use the [path notation](#path-notation-example-remap): `<OUTER_FIELD>.<INNER_FIELD>`. All values are stored as strings.
**Note**: If the field you want to add already exists, the Worker throws an error and the existing field remains unchanged.

##### Drop field

Use **drop field** to remove a field from logs that match the specified filter. The processor can delete objects, so you can also use it to drop nested keys.

To set up the drop field processor:
1. Define a **filter query**. Only logs that match the specified [filter query](#filter-query-syntax) are processed. All logs, regardless of whether they match the filter query, are sent to the next step in the pipeline.
1. Enter the key of the field you want to drop. To specify a nested field for your specified key, use the [path notation](#path-notation-example-remap): `<OUTER_FIELD>.<INNER_FIELD>`.
**Note**: If your specified key does not exist, the log is unchanged.

##### Rename field

Use **rename field** to rename a field within your log.

To set up the rename field processor:
1. Define a **filter query**. Only logs that match the specified [filter query](#filter-query-syntax) are processed. All logs, regardless of whether they match the filter query, are sent to the next step in the pipeline.
1. Enter the name of the field you want to rename in the **Source field**. To specify a nested field for your key, use the [path notation](#path-notation-example-remap): `<OUTER_FIELD>.<INNER_FIELD>`. After it is renamed, your original field is deleted unless you enable the **Preserve source tag** checkbox described below.<br>**Note**: If the source key you specify doesn't exist, a default `null` value is applied to your target.
1. In the **Target field**, enter the name you want the source field to be renamed to. To specify a nested field for your specified key, use the [path notation](#path-notation-example-remap): `<OUTER_FIELD>.<INNER_FIELD>`.<br>**Note**: If the target field you specify already exists, the Worker throws an error and does not overwrite the existing target field.
1. Optionally, check the **Preserve source tag** box if you want to retain the original source field and duplicate the information from your source key to your specified target key. If this box is not checked, the source key is dropped after it is renamed.

##### Path notation example {#path-notation-example-remap}

For the following message structure:

```json
{
  "outer_key": {
    "inner_key": "inner_value",
    "a": {
      "double_inner_key": "double_inner_value",
      "b": "b value"
    },
    "c": "c value"
  },
  "d": "d value"
}
```

- Use `outer_key.inner_key` to access the key with the value `inner_value`.
- Use `outer_key.a.double_inner_key` to access the key with the value `double_inner_value`.
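As an illustrative sketch (not the Worker's implementation), resolving a dotted path against a nested log event could look like this:

```python
def get_path(event: dict, path: str):
    """Resolve a dotted path such as 'outer_key.a.b' against a nested dict."""
    current = event
    for part in path.split("."):
        current = current[part]  # descend one level per path segment
    return current
```

Applied to the message above, `get_path(event, "outer_key.c")` returns `"c value"`.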

[1]: /observability_pipelines/guide/remap_reserved_attributes

{{% observability_pipelines/processors/filter_syntax %}}


{{< product-availability >}}

{{% observability_pipelines/processors/generate_metrics %}}
## Overview

Many types of logs are meant to be used for telemetry to track trends, such as KPIs, over long periods of time. Generating metrics from your logs is a cost-effective way to summarize log data from high-volume logs, such as CDN logs, VPC flow logs, firewall logs, and network logs. Use the generate metrics processor to generate either a count metric of logs that match a query or a distribution metric of a numeric value contained in the logs, such as a request duration.

**Note**: The metrics generated are [custom metrics][1] and billed accordingly. See [Custom Metrics Billing][2] for more information.

## Setup

To set up the processor:

Click **Manage Metrics** to create new metrics or edit existing metrics. This opens a side panel.

- If you have not created any metrics yet, enter the metric parameters as described in the [Add a metric](#add-a-metric) section to create a metric.
- If you have already created metrics, click on the metric's row in the overview table to edit or delete it. Use the search bar to find a specific metric by its name, and then select the metric to edit or delete it. Click **Add Metric** to add another metric.

##### Add a metric

1. Enter a [filter query](#filter-query-syntax). Only logs that match the specified filter query are processed. All logs, regardless of whether they match the filter query, are sent to the next step in the pipeline. **Note**: Since a single processor can generate multiple metrics, you can define a different filter query for each metric.
1. Enter a name for the metric.
1. In the **Define parameters** section, select the metric type (count, gauge, or distribution). See the [Count metric example](#count-metric-example) and [Distribution metric example](#distribution-metric-example). Also see [Metrics Types](#metrics-types) for more information.
- For gauge and distribution metric types, select a log field which has a numeric (or parseable numeric string) value that is used for the value of the generated metric.
- For the distribution metric type, the log field's value can be an array of (parseable) numerics, which is used for the generated metric's sample set.
- The **Group by** field determines how the metric values are grouped together. For example, if you have hundreds of hosts spread across four regions, grouping by region allows you to graph one line for every region. The fields listed in the **Group by** setting are set as tags on the configured metric.
1. Click **Add Metric**.

##### Metrics types

You can generate these types of metrics for your logs. See the [Metrics types][3] and [Distributions][4] documentation for more details.

| Metric type | Description | Example |
| ------------ | ----------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------- |
| COUNT | Represents the total number of event occurrences in one time interval. This value can be reset to zero, but cannot be decreased. | You want to count the number of logs with `status:error`. |
| GAUGE | Represents a snapshot of events in one time interval. | You want to measure the latest CPU utilization per host for all logs in the production environment. |
| DISTRIBUTION | Represents the global statistical distribution of a set of values calculated across your entire distributed infrastructure in one time interval. | You want to measure the average time it takes for an API call to be made. |

##### Count metric example

For this `status:error` log example:

```json
{"status": "error", "env": "prod", "host": "ip-172-25-222-111.ec2.internal"}
```

To create a count metric that counts the number of logs that contain `"status":"error"` and groups them by `env` and `host`, enter the following information:

| Input parameters | Value |
|------------------|---------------------|
| Filter query | `@status:error` |
| Metric name | `status_error_total`|
| Metric type | Count |
| Group by         | `env`, `host`       |
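The count example above can be sketched in Python. This is an illustrative approximation of how matching logs roll up into tagged counts, not the Worker's implementation:

```python
from collections import Counter

def count_by_group(logs: list, group_by: list) -> Counter:
    """Count logs with status 'error', grouped by the given tag fields."""
    counts = Counter()
    for log in logs:
        if log.get("status") != "error":  # the filter query @status:error
            continue
        # The group-by fields become tags on the generated metric.
        key = tuple((field, log.get(field)) for field in group_by)
        counts[key] += 1  # one increment per matching log
    return counts
```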

##### Distribution metric example

For this example of an API response log:

```json
{
  "timestamp": "2018-10-15T17:01:33Z",
  "method": "GET",
  "status": 200,
  "request_body": "{\"information\"}",
  "response_time_seconds": 10
}
```

To create a distribution metric that measures the average time it takes for an API call to be made, enter the following information:

| Input parameters | Value |
|------------------------|-------------------------|
| Filter query | `@method` |
| Metric name | `status_200_response` |
| Metric type | Distribution |
| Select a log attribute | `response_time_seconds` |
| Group by | `method` |
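An illustrative sketch of how a distribution's sample set could be collected from a log field, including parseable numeric strings and arrays of numerics as described above (not the Worker's implementation):

```python
def collect_samples(logs: list, field: str) -> list:
    """Gather numeric samples for a distribution metric from a log field."""
    samples = []
    for log in logs:
        value = log.get(field)
        if value is None:
            continue
        # A single value or an array of values both contribute samples.
        values = value if isinstance(value, list) else [value]
        for v in values:
            try:
                samples.append(float(v))  # accepts numerics and numeric strings
            except (TypeError, ValueError):
                pass  # skip values that cannot be parsed as numbers
    return samples
```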

[1]: /metrics/custom_metrics/
[2]: /account_management/billing/custom_metrics/
[3]: /metrics/types/
[4]: /metrics/distributions/

{{% observability_pipelines/processors/filter_syntax %}}
37 changes: 36 additions & 1 deletion content/en/observability_pipelines/processors/grok_parser.md

{{< product-availability >}}

{{% observability_pipelines/processors/grok_parser %}}
## Overview

This processor parses logs using the grok parsing rules that are available for a set of sources. The rules are automatically applied to logs based on the log source. Therefore, logs must have a `source` field with the source name. If this field is not added when the log is sent to the Observability Pipelines Worker, you can use the **Add field** processor to add it.

If the `source` field of a log matches one of the grok parsing rule sets, the log's `message` field is checked against those rules. If a rule matches, the resulting parsed data is added to the `message` field as a JSON object, overwriting the original `message`.

If there isn't a `source` field on the log, or no rule matches the log `message`, then no changes are made to the log and it is sent to the next step in the pipeline.
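The source-based dispatch described above can be sketched as follows. This is a simplified illustration that uses a regular expression in place of real grok rules, and the `nginx` rule is hypothetical; it is not the Worker's implementation:

```python
import re

# One hypothetical rule set per source; real rule sets use grok patterns.
RULES = {
    "nginx": re.compile(r"(?P<ip>\S+) (?P<method>[A-Z]+) (?P<path>\S+)"),
}

def parse(log: dict) -> dict:
    """If the log's source has a rule set and a rule matches, parse message."""
    rule = RULES.get(log.get("source", ""))
    if rule:
        match = rule.match(log.get("message", ""))
        if match:
            # Parsed data overwrites the original message as an object.
            log["message"] = match.groupdict()
    return log  # logs with no source or no matching rule pass through unchanged
```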

Datadog's Grok implementation differs from standard Grok in that it provides:
- Matchers that include options for how you define parsing rules
- Filters for post-processing of extracted data
- A set of built-in patterns tailored to common log formats

See [Parsing][1] for more information on Datadog's Grok patterns.

## Setup

To set up the grok parser, define a **filter query**. Only logs that match the specified [filter query](#filter-query-syntax) are processed. All logs, regardless of whether they match the filter query, are sent to the next step in the pipeline.

To test log samples for out-of-the-box rules:
1. Click the **Preview Library Rules** button.
1. Search or select a source in the dropdown menu.
1. Enter a log sample to test the parsing rules for that source.

To add a custom parsing rule:

1. Click **Add Custom Rule**.
1. If you want to clone a library rule, select **Clone library rule** and then the library source from the dropdown menu.
1. If you want to create a custom rule, select **Custom** and then enter the `source`. The parsing rules are applied to logs with that `source`.
1. Enter log samples to test the parsing rules.
1. Enter the rules for parsing the logs. See [Parsing][1] for more information on writing parsing rules with Datadog Grok patterns.<br>**Note**: The `url`, `useragent`, and `csv` filters are not available.
1. Click **Advanced Settings** if you want to add helper rules. See [Using helper rules to reuse common patterns][2] for more information.
1. Click **Add Rule**.

[1]: /logs/log_configuration/parsing/
[2]: /logs/log_configuration/parsing/?tab=matchers#using-helper-rules-to-reuse-common-patterns

{{% observability_pipelines/processors/filter_syntax %}}
