docs/standard/base-types/anchors-in-regular-expressions.md
Lines changed: 4 additions & 0 deletions
@@ -63,6 +63,8 @@ Anchors, or atomic zero-width assertions, specify a position in the string where
The `$` anchor specifies that the preceding pattern must occur at the end of the input string, or before `\n` at the end of the input string.
If you use `$` with the <xref:System.Text.RegularExpressions.RegexOptions.Multiline?displayProperty=nameWithType> option, the match can also occur at the end of a line. Note that `$` is satisfied at `\n` but not at `\r\n` (the combination of carriage return and newline characters, or CR/LF). To handle the CR/LF character combination, include `\r?$` in the regular expression pattern. Note that `\r?$` will include any `\r` in the match.
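The CR/LF behavior described above can be sketched as follows. The two-line input string is a hypothetical example, not taken from the article's own sample.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical two-line input with Windows (CR/LF) line endings.
string crlfInput = "foo\r\nbar";

// With Multiline, $ is satisfied before \n but not before \r\n,
// so the first line is skipped and only the last line matches.
Match plainDollar = Regex.Match(crlfInput, @"\w+$", RegexOptions.Multiline);
Console.WriteLine(plainDollar.Value);                    // bar

// \r?$ lets the first line match, but \r becomes part of the match.
Match workaround = Regex.Match(crlfInput, @"\w+\r?$", RegexOptions.Multiline);
Console.WriteLine(workaround.Value.Length);              // 4 ("foo" plus \r)
Console.WriteLine(workaround.Value.TrimEnd('\r'));       // foo
```

Trimming the trailing `\r` afterward, as in the last line, is one way to clean up the match; wrapping the wanted text in a capturing group is another.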
Starting with .NET 11, you can use <xref:System.Text.RegularExpressions.RegexOptions.AnyNewLine?displayProperty=nameWithType> to make `$` recognize all common newline sequences instead of only `\n`. Unlike the `\r?$` workaround, `AnyNewLine` treats `\r\n` as an atomic sequence, so `\r` is not included in the match. For more information, see [AnyNewLine mode](regular-expression-options.md#anynewline-mode).
The following example adds the `$` anchor to the regular expression pattern used in the example in the [Start of String or Line](#start-of-string-or-line-) section. When used with the original input string, which includes five lines of text, the <xref:System.Text.RegularExpressions.Regex.Matches%28System.String%2CSystem.String%29?displayProperty=nameWithType> method is unable to find a match, because the end of the first line does not match the `$` pattern. When the original input string is split into a string array, the <xref:System.Text.RegularExpressions.Regex.Matches%28System.String%2CSystem.String%29?displayProperty=nameWithType> method succeeds in matching each of the five lines. When the <xref:System.Text.RegularExpressions.Regex.Matches%28System.String%2CSystem.String%2CSystem.Text.RegularExpressions.RegexOptions%29?displayProperty=nameWithType> method is called with the `options` parameter set to <xref:System.Text.RegularExpressions.RegexOptions.Multiline?displayProperty=nameWithType>, no matches are found because the regular expression pattern does not account for the carriage return character `\r`. However, when the regular expression pattern is modified by replacing `$` with `\r?$`, calling the <xref:System.Text.RegularExpressions.Regex.Matches%28System.String%2CSystem.String%2CSystem.Text.RegularExpressions.RegexOptions%29?displayProperty=nameWithType> method with the `options` parameter set to <xref:System.Text.RegularExpressions.RegexOptions.Multiline?displayProperty=nameWithType> again finds five matches.
@@ -83,6 +85,8 @@ Anchors, or atomic zero-width assertions, specify a position in the string where
The `\Z` anchor specifies that a match must occur at the end of the input string, or before `\n` at the end of the input string. It is identical to the `$` anchor, except that `\Z` ignores the <xref:System.Text.RegularExpressions.RegexOptions.Multiline?displayProperty=nameWithType> option. Therefore, in a multiline string, it can only be satisfied by the end of the last line, or the last line before `\n`.
Note that `\Z` is satisfied at `\n` but is not satisfied at `\r\n` (the CR/LF character combination). To treat CR/LF as if it were `\n`, include `\r?\Z` in the regular expression pattern. Note that this will make the `\r` part of the match.
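As a minimal sketch of this point, the snippet below checks `\Z` against a string that ends with CR/LF. The team name is a hypothetical input chosen for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input that ends with a CR/LF pair.
string lastLine = "Brooklyn Dodgers\r\n";

// \Z is satisfied only at the end of the string or before a final \n,
// so the \r that follows "Dodgers" prevents a match.
bool bareZ = Regex.IsMatch(lastLine, @"Dodgers\Z");
Console.WriteLine(bareZ);                          // False

// \r?\Z consumes the carriage return, which then appears in the match.
Match withCr = Regex.Match(lastLine, @"Dodgers\r?\Z");
Console.WriteLine(withCr.Success);                 // True
Console.WriteLine(withCr.Value.EndsWith("\r"));    // True
```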
Starting with .NET 11, you can use <xref:System.Text.RegularExpressions.RegexOptions.AnyNewLine?displayProperty=nameWithType> to make `\Z` recognize all common newline sequences instead of only `\n`. Unlike the `\r?\Z` workaround, `AnyNewLine` treats `\r\n` as an atomic sequence, so `\r` is not included in the match. For more information, see [AnyNewLine mode](regular-expression-options.md#anynewline-mode).
The following example uses the `\Z` anchor in a regular expression that is similar to the example in the [Start of String or Line](#start-of-string-or-line-) section, which extracts information about the years during which some professional baseball teams existed. The subexpression `\r?\Z` in the regular expression `^((\w+(\s?)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))?,?)+\r?\Z` is satisfied at the end of a string, and also at the end of a string that ends with `\n` or `\r\n`. As a result, each element in the array matches the regular expression pattern.
docs/standard/base-types/character-classes-in-regular-expressions.md
Lines changed: 2 additions & 0 deletions
@@ -162,6 +162,8 @@ The period character (.) matches any character except `\n` (the newline characte
- If a regular expression pattern is modified by the <xref:System.Text.RegularExpressions.RegexOptions.Singleline?displayProperty=nameWithType> option, or if the portion of the pattern that contains the `.` character class is modified by the `s` option, `.` matches any character. For more information, see [Regular Expression Options](regular-expression-options.md).
- Starting with .NET 11, if the <xref:System.Text.RegularExpressions.RegexOptions.AnyNewLine?displayProperty=nameWithType> option is specified, `.` excludes all common newline characters instead of only `\n`. If both `Singleline` and `AnyNewLine` are specified, `Singleline` takes precedence and `.` matches every character. For more information, see [AnyNewLine mode](regular-expression-options.md#anynewline-mode).
The following example illustrates the different behavior of the `.` character class by default and with the <xref:System.Text.RegularExpressions.RegexOptions.Singleline?displayProperty=nameWithType> option. The regular expression `^.+` starts at the beginning of the string and matches every character. By default, the match ends at the end of the first line; the regular expression pattern matches the carriage return character, `\r`, but it doesn't match `\n`. Because the <xref:System.Text.RegularExpressions.RegexOptions.Singleline?displayProperty=nameWithType> option interprets the entire input string as a single line, it matches every character in the input string, including `\n`.
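A compact sketch of that comparison is shown here. The input string is an assumption made for illustration and may differ from the one used in the article's full sample.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical two-line input with a CR/LF line ending.
string twoLines = "This is one line and\r\nthis is the second.";

// By default, . matches \r but stops at \n.
Match byDefault = Regex.Match(twoLines, "^.+");
Console.WriteLine(Regex.Escape(byDefault.Value));
// This\ is\ one\ line\ and\r

// With Singleline, . matches every character, including \n.
Match singleLine = Regex.Match(twoLines, "^.+", RegexOptions.Singleline);
Console.WriteLine(singleLine.Value == twoLines);   // True
```

`Regex.Escape` is used only to make the trailing carriage return visible in the console output.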
docs/standard/base-types/regular-expression-options.md
Lines changed: 47 additions & 1 deletion
@@ -30,6 +30,7 @@ By default, the comparison of an input string with any literal characters in a r
|<xref:System.Text.RegularExpressions.RegexOptions.ECMAScript>| Not available | Enable ECMAScript-compliant behavior for the expression. |[ECMAScript matching behavior](#ecmascript-matching-behavior)|
|<xref:System.Text.RegularExpressions.RegexOptions.CultureInvariant>| Not available | Ignore cultural differences in language. |[Comparison using the invariant culture](#compare-using-the-invariant-culture)|
|<xref:System.Text.RegularExpressions.RegexOptions.NonBacktracking>| Not available | Match using an approach that avoids backtracking and guarantees linear-time processing in the length of the input. (Available in .NET 7 and later versions.)|[Nonbacktracking mode](#nonbacktracking-mode)|
|<xref:System.Text.RegularExpressions.RegexOptions.AnyNewLine>| Not available | Make `^`, `$`, `\Z`, and `.` recognize all common newline sequences instead of only `\n`. (Available in .NET 11 and later versions.) |[AnyNewLine mode](#anynewline-mode)|
## Specify options
@@ -75,7 +76,7 @@ The following five regular expression options can be set both with the options p
You can determine which options were provided to a <xref:System.Text.RegularExpressions.Regex> object when it was instantiated by retrieving the value of the read-only <xref:System.Text.RegularExpressions.Regex.Options%2A?displayProperty=nameWithType> property.
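For instance, a minimal sketch of reading the property back might look like this. The pattern and the chosen options are arbitrary placeholders.

```csharp
using System;
using System.Text.RegularExpressions;

// Arbitrary pattern and options, chosen only for illustration.
var numberRegex = new Regex(@"\d+", RegexOptions.IgnoreCase | RegexOptions.Multiline);

// Options returns the flags that were passed to the constructor.
RegexOptions configured = numberRegex.Options;
Console.WriteLine(configured.HasFlag(RegexOptions.Multiline));    // True
Console.WriteLine(configured.HasFlag(RegexOptions.Singleline));   // False
```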
@@ -150,6 +155,9 @@ By default, `$` will be satisfied only at the end of the input string. If you sp
In neither case does `$` recognize the carriage return/line feed character combination (`\r\n`). `$` always ignores any carriage return (`\r`). To end your match with either `\r\n` or `\n`, use the subexpression `\r?$` instead of just `$`. Note that this will make the `\r` part of the match.
> [!TIP]
> Starting with .NET 11, you can use <xref:System.Text.RegularExpressions.RegexOptions.AnyNewLine?displayProperty=nameWithType> to make `^`, `$`, `\Z`, and `.` recognize all common newline sequences instead of only `\n`, removing the need for `\r?` workarounds. `AnyNewLine` also treats `\r\n` as an atomic newline sequence, so `\r` is never included in the match. For more information, see the [AnyNewLine mode](#anynewline-mode) section.
The following example extracts bowlers' names and scores and adds them to a <xref:System.Collections.Generic.SortedList%602> collection that sorts them in descending order. The <xref:System.Text.RegularExpressions.Regex.Matches%2A> method is called twice. In the first method call, the regular expression is `^(\w+)\s(\d+)$` and no options are set. As the output shows, because the regular expression engine cannot match the input pattern along with the beginning and end of the input string, no matches are found. In the second method call, the regular expression is changed to `^(\w+)\s(\d+)\r?$` and the options are set to <xref:System.Text.RegularExpressions.RegexOptions.Multiline?displayProperty=nameWithType>. As the output shows, the names and scores are successfully matched, and the scores are displayed in descending order.
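The two method calls described above can be sketched in reduced form as follows. The names and scores are made-up data, and the sorting step with <xref:System.Collections.Generic.SortedList%602> is left out to keep the sketch short.

```csharp
using System;
using System.Text.RegularExpressions;

// Made-up input: one bowler per line, CR/LF line endings.
string scores = "Joe 164\r\nSam 208\r\n";

// First call: no options, so ^ and $ apply to the whole string only.
int basicCount = Regex.Matches(scores, @"^(\w+)\s(\d+)$").Count;
Console.WriteLine(basicCount);                     // 0

// Second call: Multiline plus \r?$ matches each line.
MatchCollection perLine = Regex.Matches(scores, @"^(\w+)\s(\d+)\r?$",
                                        RegexOptions.Multiline);
foreach (Match m in perLine)
{
    // The capturing groups exclude the \r consumed by \r?.
    Console.WriteLine($"{m.Groups[1].Value}: {int.Parse(m.Groups[2].Value)}");
}
// Joe: 164
// Sam: 208
```

Because the name and score are read from capturing groups, the carriage return that `\r?` consumes doesn't leak into the extracted values, even though it is part of the overall match.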
@@ -405,6 +413,44 @@ The <xref:System.Text.RegularExpressions.RegexOptions.NonBacktracking?displayPro
For more information about backtracking, see [Backtracking in regular expressions](backtracking-in-regular-expressions.md).
## AnyNewLine mode
By default, .NET's regular expression engine treats only `\n` as a newline character. The anchors `^` and `$` (in <xref:System.Text.RegularExpressions.RegexOptions.Multiline?displayProperty=nameWithType> mode), `\Z`, and the wildcard `.` all use `\n` as the sole line boundary. This means that `$` doesn't match before `\r\n` (Windows-style line endings), and `.` matches `\r`, which leads to common bugs when processing text with mixed or non-Unix line endings.
The <xref:System.Text.RegularExpressions.RegexOptions.AnyNewLine?displayProperty=nameWithType> option, which was introduced in .NET 11, makes these constructs recognize all common newline sequences: `\r\n` (CR+LF), `\r` (CR), `\n` (LF), `\u0085` (NEL), `\u2028` (LS), and `\u2029` (PS). This is consistent with [Unicode TR18 RL1.6](https://unicode.org/reports/tr18/#RL1.6).
For example, without `AnyNewLine`, matching lines in a string with Windows line endings requires manual workarounds like `\r?$`:
```csharp
// BUG: .+$ captures trailing \r on Windows line endings
var match = Regex.Match("foo\r\nbar", @".+$", RegexOptions.Multiline);
Console.WriteLine(match.Value); // "foo\r" -- not "foo"!
```
With `AnyNewLine`, the anchors handle all newline types automatically:
```csharp
var match = Regex.Match("foo\r\nbar", @".+$",
    RegexOptions.Multiline | RegexOptions.AnyNewLine);
Console.WriteLine(match.Value); // "foo"
```
The following table summarizes how `AnyNewLine` affects each construct:
|Construct|Default behavior|With `AnyNewLine`|
|---------|----------------|-----------------|
|`.`|Matches any character except `\n`|Matches any character except `\r`, `\n`, `\u0085`, `\u2028`, `\u2029`|
|`$` (Multiline)|Matches before `\n`|Matches before `\r\n`, `\r`, `\n`, `\u0085`, `\u2028`, `\u2029`|
|`^` (Multiline)|Matches after `\n`|Matches after `\r\n`, `\r`, `\n`, `\u0085`, `\u2028`, `\u2029`|
|`$` (default) / `\Z`|Matches before `\n` at end of string|Matches before any newline sequence at end of string|
Key design points:
- **`\r\n` is treated atomically**: `$` matches before the full `\r\n` sequence, never between `\r` and `\n`.
- **`Singleline` takes precedence**: `.` with both `Singleline` and `AnyNewLine` matches every character (including newlines), consistent with `Singleline`'s existing behavior.
- **`\A` and `\z` are unaffected**: Absolute start-of-string and end-of-string anchors don't change.
- **Incompatible options**: `AnyNewLine` cannot be combined with <xref:System.Text.RegularExpressions.RegexOptions.NonBacktracking?displayProperty=nameWithType> or <xref:System.Text.RegularExpressions.RegexOptions.ECMAScript?displayProperty=nameWithType>. Attempting to do so throws an <xref:System.ArgumentOutOfRangeException>.
## See also
- [Regular Expression Language - Quick Reference](regular-expression-language-quick-reference.md)