Conversation
|
Thanks for the PR! First of all, terminology:
I would strongly suggest to use the wording that I've already suggested in our call last week: "reject unknown fields". Secondly, about the implementation approach: Currently it's a setting on a validataclass that gets read by the DataclassValidator. That means a validataclass will always either allow or reject unknown fields. Instead, I would suggest to leave the validataclasses untouched and add the setting to the DataclassValidator directly. This allows more flexibility, because you can use the same validataclass with a different validator to either allow or reject unknown fields. Which also makes it easier to allow/reject based on the context or environment, because you don't need to change the validataclass for that, just the validator. Which brings me to the third point: "One could just take For example, if you have a project where unknown fields should always (or in most cases) be rejected, you can subclass the DataclassValidator and just set Would you agree with all this or do you see problems / have better ideas? |
|
Will rename the field accordingly, I just came from the JSON Schema wirld where this makes sense. About the About os.env: I like the idea of having a CI testing in struct mode, and normal operations not. By setting the CI in struct mode, you can catch accidental fields in test data which were not handled by accident. Reason for this in general: I like systems which check my code, not manual work. Also, it makes a lot of sense for DATEX2 validation again: DATEX2 is multiple hundred validataclasses. You don't want to subclass all of them and rebuild the whole validataclass tree if you want to have a not so strict normal input validation, but a strict web validator which actually checks if your DATEX2 is valid. This apples to any other complex data model, too: subclassing the specific validataclass means you have to rebuild the whole tree of validators. Subclassing to add a debug attribute means that one has to replace all usages in a project with the sublassed |
So the validataclass can define the default behaviour and the DataclassValidator can override that. Yes, I think that makes sense. I thought about that too but thought it's a bit redundant, and my assumption was that in your project you would probably need that setting for every validataclass and not just for a specific subset of them. But if you have a good use case where it makes sense to have that as an inherent property of a validataclass, that makes sense.
No, I think you misunderstood me. I wasn't speaking of subclassing the validataclass to add a debug flag, that really doesn't make sense. But it does make sense to define a custom DataclassValidator for your specific application that you can adjust to your needs, rather than enforcing a very specific way to enable a feature via env variable. That's also one of the main design goals of this library: Keep things simple but extendable. It's not a workaround to subclass the DataclassValidator, that's intended usage. You can also keep the name "DataclassValidator", then you don't need to change every usage but only the import lines. What complicates things is also that you might not actually want "debug mode" to affect every DataclassValidator. Unknown fields are not always an error case, sometimes you explicitly want to validate only a subset of fields (think of validation for responses of outgoing API requests - you may be interested in only 2 fields of the entire response, why validate the entire request?) This use case can be easily solved with the subclassing approach, just use the regular DataclassValidator for those objects. If it's a built-in feature of the DataclassValidator that reads There may be ways to provide a simple and usage-agnostic way to do this, like some sort of global context/configuration of library behavior, but I think it's out of scope for this PR because there's a lot of things to consider to do that right. For now, please just leave it at 1. an option in the validataclass decorator to set the default behaviour, 2. an option in the DataclassValidator to override the default behaviour. |
14ca883 to
81e5b42
Compare
|
Changed the naming and added the |
| ] | ||
|
|
||
|
|
||
| class UnknownFieldsError(ValidationError): |
There was a problem hiding this comment.
I'm unsure about this validation error. It feels a bit inconsistent to me, because usually validation errors about a specific field in a dictionary/object are collected in a DictFieldsValidationError.
Also we already have the FieldNotAllowedError that is raised by the RejectValidator, and essentially it describes the same kind of error: A field was given in a dictionary that is not allowed and therefore rejected. And basically the reject_unknown_fields feature can be seen as having RejectValidator as the default value for unknown fields.
(Actually, you could literally implement this feature by setting default_validator=RejectValidator('Unknown field') or similar in the DictValidator that the DataclassValidator is based on.)
Do you see a specific reason why the UnknownFieldsError with a list of unknown fields makes more sense or is easier to handle than the regular DictFieldsValidationError with FieldNotAllowedError for every unknown field?
|
|
||
| def decorator(_cls: type[_T]) -> type[_T]: | ||
| # Pop validataclass-specific options before passing kwargs to @dataclass | ||
| reject_unknown_fields = kwargs.pop('reject_unknown_fields', False) |
There was a problem hiding this comment.
I would suggest to use None as the default value here, and only set the __reject_unknown_fields__ if the parameter is explicitly set to True or False. You already use getattr with False as the default value in the DataclassValidator, so the DataclassValidator doesn't need to be adjusted for this.
I think this makes it a bit more futureproof to allow for some "global debug setting" or similar in the future, which would only be applied to dataclasses that don't set this parameter.
In other words, reject_unknown_fields=True means unknown fields are always rejected (unless overridden by the DataclassValidator), =False means unknown fields are always allowed (unless overridden by the DataclassValidator), and not setting this parameter either on the dataclass or in the validator means "use the default behaviour". Right now the default behaviour is the same as False, but in the future the default behaviour could be overridden (e.g. using env).
For strict validation (eg validation services), it makes sense to be able to prevent additional properties. This MR provides this feature.
Open question: it might make sense to enable prevent additional attributes via env or something else, too, to make a strict validation in tests and therefore find inconsistencies. One could just take
os.envfor that. Do you think this is a good idea? And if yes, how should we call the env var? justVALIDATACLASS_PREVENT_ADDITIONAL_ATTRIBUTES?