Fix uuid() accepting {, }, and prefixes inside the string#359
Conversation
Instead of stripping "urn:", "uuid:", "{" and "}" from anywhere in the value before matching.
So strings like "5{}50e8400-..." or an interior "urn:" passed the assertion and the un-normalised string was returned to the caller.
Moved the validated into the regex:
- The prefixes are only accepted at the start
- A leading "{" must be paired with a closing "}" using a PCRE conditional
| // Accepts the plain form (including the nil UUID with all 128 bits | ||
| // set to zero), optionally preceded by "urn:" and/or "uuid:" and | ||
| // optionally wrapped in a matching pair of curly braces. | ||
| if (!\preg_match('/^(?:urn:)?(?:uuid:)?(\{)?[0-9A-Fa-f]{8}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{12}(?(1)\})$/D', $value)) { |
There was a problem hiding this comment.
I think it would be better to break out these different forms, such that it looks like:
if (\str_starts_with($value, 'urn:uuid:') && \preg_match(...)) {
return $value;
}
if (\str_starts_with($value, 'uuid:') && \preg_match(...)) {
return $value;
}
if (\str_starts_with($value, '{') && \str_ends_with('}') && \preg_match(...)) {
return $value;
}
if (\preg_match(...)) {
return $value;
}
// resolve message, report invalidWhile more verbose, having separate forks for each form makes intent much more clear and provides more meaningful code coverage.
There was a problem hiding this comment.
Given uuid:{ff6f8cb0-c57d-21e1-9b21-0800200c9a66} should stay valid at least the brace matching should be kept then? Or would you rather see the prefixes and braces removed and then just the raw uuid validated?
So do you want something like this that reuses the main regex:
diff --git a/src/Assert.php b/src/Assert.php
index 089b020..31c861b 100644
--- a/src/Assert.php
+++ b/src/Assert.php
@@ -2442,18 +2442,33 @@ class Assert
{
static::string($value, $message);
- // Accepts the plain form (including the nil UUID with all 128 bits
- // set to zero), optionally preceded by "urn:" and/or "uuid:" and
- // optionally wrapped in a matching pair of curly braces.
- if (!\preg_match('/^(?:urn:)?(?:uuid:)?(\{)?[0-9A-Fa-f]{8}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{12}(?(1)\})$/D', $value)) {
- $message = self::resolveMessage($message);
- static::reportInvalidArgument(\sprintf(
- $message ?: 'Value %s is not a valid UUID.',
- static::valueToString($value)
- ));
+ $uuid = '[0-9A-Fa-f]{8}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{12}';
+
+ if (\str_starts_with($value, 'urn:uuid:') && \preg_match('/^urn:uuid:'.$uuid.'$/D', $value)) {
+ return $value;
}
- return $value;
+ if (\str_starts_with($value, 'uuid:') && \preg_match('/^uuid:(?:'.$uuid.'|\{'.$uuid.'\})$/D', $value)) {
+ return $value;
+ }
+
+ if (\str_starts_with($value, '{') && \str_ends_with($value, '}') && \preg_match('/^\{'.$uuid.'\}$/D', $value)) {
+ return $value;
+ }
+
+ if (\preg_match('/^'.$uuid.'$/D', $value)) {
+ return $value;
+ }
+
+ $message = self::resolveMessage($message);
+ static::reportInvalidArgument(\sprintf(
+ $message ?: 'Value %s is not a valid UUID.',
+ static::valueToString($value)
+ ));
}Alternatively stripping the str_starts_with matched parts from $value is possible.
I'm happy to do whatever you want here, no stake in the specific solution. Just trying to understand :)
There was a problem hiding this comment.
Your suggested code looks good to me.
| @@ -588,7 +588,9 @@ public static function getTests(): array | |||
| ['isNonEmptyMap', [[1, 2, 3]], false], | |||
| ['uuid', ['00000000-0000-0000-0000-000000000000'], true], | |||
| ['uuid', ['urn:ff6f8cb0-c57d-21e1-9b21-0800200c9a66'], true], | |||
There was a problem hiding this comment.
This should actually be removed, it is technically not a valid urn: per RFC 4122/9562, since RFC 2141 defines a URN as urn:<NID>:<NSS>.
Hi,
A hopefully small fix for UUIDs being allowed to contain the prefixes and {} inside the string.
If you'd prefer the regular expression to be multiple and commented inline, I'd be happy to do that as well.
I hope the test cases are self-explanatory regarding the behavioral changes.
Moved the validation into the regex: