[Serializer] Add ability to collect denormalization errors #38472

julienfalque · 2020-10-07T18:37:49Z

| Q             | A
| ------------- | ---
| Branch?       | 5.x
| Bug fix?      | no
| New feature?  | yes
| Deprecations? | no
| Tickets       | #37419
| License       | MIT
| Doc PR        | -

This PR is a follow-up to #38165.

I tried a different approach: instead of throwing special exceptions, denormalizers/deserializers must always return an instance of the new DenormalizationResult class which wraps the denormalized value or the collected errors.
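To make the idea concrete, here is a minimal sketch of such a result wrapper and of how a caller might inspect it. It is only loosely modeled on the diff fragments quoted in this conversation: the `InvariantViolation` stub, its constructor arguments, and the `isSucceeded()` helper are assumptions made for illustration, not the PR's actual code.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the InvariantViolation class mentioned in the PR.
final class InvariantViolation
{
    private string $path;
    private string $message;

    public function __construct(string $path, string $message)
    {
        $this->path = $path;
        $this->message = $message;
    }

    public function getPath(): string
    {
        return $this->path;
    }

    public function getMessage(): string
    {
        return $this->message;
    }
}

// Sketch of a result object wrapping either a value or the collected errors.
final class DenormalizationResult
{
    /** @var mixed */
    private $denormalizedValue;

    /** @var InvariantViolation[] */
    private array $invariantViolations = [];

    private function __construct()
    {
    }

    /** @param mixed $denormalizedValue */
    public static function success($denormalizedValue): self
    {
        $result = new self();
        $result->denormalizedValue = $denormalizedValue;

        return $result;
    }

    /** @param InvariantViolation[] $invariantViolations */
    public static function failure(array $invariantViolations): self
    {
        $result = new self();
        $result->invariantViolations = $invariantViolations;

        return $result;
    }

    // Assumed helper: lets callers tell success from failure even when
    // null is a legitimate denormalized value.
    public function isSucceeded(): bool
    {
        return [] === $this->invariantViolations;
    }

    /** @return mixed */
    public function getDenormalizedValue()
    {
        return $this->denormalizedValue;
    }

    /** @return InvariantViolation[] */
    public function getInvariantViolations(): array
    {
        return $this->invariantViolations;
    }
}

$ok = DenormalizationResult::success(null); // null is a valid value here
$ko = DenormalizationResult::failure([new InvariantViolation('age', 'Expected an integer.')]);
```

With a helper like `isSucceeded()`, the caller does not have to compare the value against null to find out whether denormalization worked.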

src/Symfony/Component/Serializer/Tests/SerializerTest.php

greedyivan · 2020-11-03T19:34:22Z

My objection to the proposed solution.

Collecting denormalization exceptions should be the usual way to use the serializer, because it is a very useful feature for APIs.

Here we have to check the result -- null !== $result->getDenormalizedValue() or empty($result->getInvariantViolations()) in case null is a valid result of denormalization. It is not obvious and not OOP-style at all.

It is very likely that this is a return to the json_last_error universe.

julienfalque · 2020-11-04T08:04:39Z

> Here we have to check result -- null !== $result->getDenormalizedValue() or empty($result->getInvariantViolations()) in case null is a valid result of denormalization.

My first approach was to throw specific exceptions instead (see #38165). The implementation was not very different and there was still overhead to check what happened in the nested denormalizers. Instead of checking what kind of result was returned, one would have to check whether a specific exception was thrown.

My main concern about that is that throwing an exception breaks the regular execution flow and you have no guarantee that it comes from the denormalizer that was just called. Maybe some deeper denormalizer actually threw it and it wasn't caught by the denormalizer you called. This could lead to unexpected results.

Moreover, using exceptions seemed more like a hack than an actual solution to me.

> It is not obvious and not OOP-style at all.

Can you elaborate on that?

One benefit of returning "result" objects is that this can more easily be checked: if a denormalizer doesn't return a DenormalizationResult instance, we could throw an exception immediately, which could help third-party denormalizers implement the new workflow. Later we could even decide that the new workflow should become the default and drop the collect_invariant_violations option.

Maybe the "result" object API can be improved though. I'm open to suggestions.
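As an illustration of the "fail fast when the contract is broken" point, here is a rough sketch of a guard around a denormalizer call. The `Result` class and the `denormalizeStrictly()` function are hypothetical names invented for this example; the PR may perform this check differently.

```php
<?php

declare(strict_types=1);

// Hypothetical minimal result object, standing in for DenormalizationResult.
final class Result
{
    /** @var mixed */
    private $value;

    /** @param mixed $value */
    public function __construct($value)
    {
        $this->value = $value;
    }

    /** @return mixed */
    public function getValue()
    {
        return $this->value;
    }
}

/**
 * Calls a denormalizer and throws immediately if it does not honor the
 * "always return a result object" contract.
 *
 * @param mixed $data
 */
function denormalizeStrictly(callable $denormalizer, $data): Result
{
    $result = $denormalizer($data);

    if (!$result instanceof Result) {
        $given = is_object($result) ? get_class($result) : gettype($result);

        throw new \LogicException(sprintf('Expected an instance of "%s", got "%s".', Result::class, $given));
    }

    return $result;
}
```

A third-party denormalizer that still returns a bare value would then be reported at the first call instead of producing a silently wrong result further up the chain.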

greedyivan · 2020-11-04T12:16:44Z

> My main concern about that is that throwing an exception breaks the regular execution flow and you have no guarantee that it comes from the denormalizer that was just called. Maybe some deeper denormalizer actually threw it and it wasn't caught by the denormalizer you called. This could lead to unexpected results.
>
> Moreover, using exceptions seemed more like a hack than an actual solution to me.

That is a return to the time when you had to call json_last_error after json_decode.

That boilerplate code should be encapsulated in the library and should not bother the client each time it wants to denormalize something.

I wrote another simple solution for that: #38968

* It collects exceptions from both internal objects and types.
* It collects only those exceptions that are explicitly marked for collection.
* It collects exceptions raised during initialization with the constructor.
* It is more backward compatible, because it only affects cases that explicitly expect some kind of exception from the serializer. It introduces a new one.

I don't insist, but that solution is better suited to the JSON-to-DTO task, because it returns the full list of fields that cannot be denormalized.

src/Symfony/Component/Serializer/Normalizer/AbstractObjectNormalizer.php

camilledejoye · 2020-11-21T16:01:18Z

src/Symfony/Component/Serializer/DenormalizationResult.php

+    public static function failure(array $invariantViolations): self
+    {
+        $result = new self();
+        $result->invariantViolations = $invariantViolations;


It could be interesting to validate that the list really contains only instances of InvariantViolation

IMO this should be covered with static analysis tools such as PHPStan or Psalm, not with runtime assertions.

The fact that you added a string to the $invariantViolations variable when there are extra attributes seems to prove that static analysis was not enough.
I don't see any reason not to enforce types. If it's because you think it will add unwanted overhead in production, maybe we can go halfway by adding an assert(self::areViolationsValid($invariantViolations))?

Whoops, indeed that's an epic fail :)

I still think this should not be a runtime assertion though (which would not have detected this mistake anyway). There is no static analysis on the Symfony codebase as far as I know, which means I have to write tests that cover those scenarios instead.

> which would not have detected this mistake anyway

What do you mean?
Assertions should be enabled in development environments and only disabled in production.
This means that when running your tests with a php.ini that enables assertions, an error would have been thrown.
In case the static method implementation was not clear:

    private static function areViolationsValid(array $violations): bool
    {
        foreach ($violations as $violation) {
            if (!$violation instanceof InvariantViolation) {
                return false;
            }
        }

        return true;
    }

> which would not have detected this mistake anyway
>
> What do you mean?

I mean that adding those assertions would not have prevented me from making this mistake: there are currently no tests that run that code, so I would have received no alert about it anyway.

> Assertions should be enabled in development environments and only disabled in production.

I would not allow disabling runtime assertions in production: the point is to prevent the application from silently ignoring errors and running in an invalid state. IMO detecting those error scenarios is more efficiently done using static analysis if possible, with tests otherwise. Here we have no static analysis so I'll go with tests.
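For readers comparing the two positions, the sketch below puts an `assert()`-based check next to an always-on check. The `Violation` class and both `failureWith…()` functions are hypothetical names used only for illustration; neither variant is claimed to be what the PR ended up doing.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for InvariantViolation.
final class Violation
{
}

function areViolationsValid(array $violations): bool
{
    foreach ($violations as $violation) {
        if (!$violation instanceof Violation) {
            return false;
        }
    }

    return true;
}

// Variant A: the check runs only when zend.assertions is set to 1.
// With zend.assertions set to 0 or -1 the assert() call is skipped.
function failureWithAssert(array $violations): array
{
    assert(areViolationsValid($violations), 'Expected only Violation instances.');

    return $violations;
}

// Variant B: the check always runs, whatever the ini settings are.
function failureWithException(array $violations): array
{
    if (!areViolationsValid($violations)) {
        throw new \InvalidArgumentException('Expected only Violation instances.');
    }

    return $violations;
}
```

Variant A costs nothing in production but only helps if some test actually exercises the faulty call with assertions enabled; variant B catches the mistake everywhere at the price of a loop on every call.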

camilledejoye

Good job!
Just a few more ideas :)

camilledejoye · 2020-11-22T09:54:21Z

src/Symfony/Component/Serializer/DenormalizationResult.php

+    public static function failure(array $invariantViolations): self
+    {
+        $result = new self();
+        $result->invariantViolations = $invariantViolations;


The fact that you added a string to the $invariantViolations variable when there are extra attributes seems to prove that static analysis was not enough.
I don't see any reason not to enforce types. If it's because you think it will add unwanted overhead in production, maybe we can go halfway by adding an assert(self::areViolationsValid($invariantViolations))?

src/Symfony/Component/Serializer/Normalizer/AbstractObjectNormalizer.php

src/Symfony/Component/Serializer/Serializer.php

src/Symfony/Component/Serializer/Normalizer/DataUriNormalizer.php

src/Symfony/Component/Serializer/Normalizer/DateIntervalNormalizer.php

src/Symfony/Component/Serializer/Normalizer/AbstractObjectNormalizer.php

julienfalque · 2020-11-24T10:18:52Z

I think I'm quite happy with the current API, but this PR is in competition with #38968 and I'd like some feedback to know whether I should invest more time in it. @dunglas may I kindly request a quick review from you please?

camilledejoye · 2020-11-26T10:56:24Z

I personally like the operation result pattern. It's not common in PHP compared to languages with generics, but it adds some value in this case.

I think there is a strong difference between expected and exceptional errors.
We often hear two things:

* Exceptions should be kept for exceptional cases
* We should never trust user input

The errors we want to collect are errors related to invalid data in the payload provided by the user.
It's an input and therefore we expect it to fail relatively often; exceptions don't really fit in this case.

We might also have errors related to an invalid configuration of the serializer or invalid arguments (there is no guarantee the support method will always be called before the denormalize one); these are really exceptional to me.

The benefit of your approach is that it clearly separates those two concerns, which avoids having to wonder how to handle each error based only on its type.
For instance when denormalizing, do we have an InvalidArgumentException because the $type is invalid or because the $data is?
If it's the type, then it should not be delayed: the exception should be propagated.
But if it's the data, it might be better to collect it because it's a validation error.

On the other hand, the exception approach handles more cases out of the box and is less intrusive.
With your approach, we must update all our custom denormalizers when migrating.
IMO it's a good trade-off in order to have fine-grained error handling.
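The distinction between the two kinds of errors can be sketched roughly as follows. `Outcome` and `denormalizeInt()` are hypothetical names, and plain strings stand in for violation objects to keep the example short.

```php
<?php

declare(strict_types=1);

// Hypothetical result object: holds either a value or a list of error messages.
final class Outcome
{
    /** @var mixed */
    private $value;

    /** @var string[] */
    private array $errors;

    /** @param mixed $value */
    private function __construct($value, array $errors)
    {
        $this->value = $value;
        $this->errors = $errors;
    }

    /** @param mixed $value */
    public static function success($value): self
    {
        return new self($value, []);
    }

    public static function failure(array $errors): self
    {
        return new self(null, $errors);
    }

    /** @return mixed */
    public function getValue()
    {
        return $this->value;
    }

    /** @return string[] */
    public function getErrors(): array
    {
        return $this->errors;
    }
}

/** @param mixed $data */
function denormalizeInt($data, string $type): Outcome
{
    if ('int' !== $type) {
        // Invalid $type: a programming or configuration error, so it is
        // propagated immediately as an exception.
        throw new \InvalidArgumentException(sprintf('Unsupported type "%s".', $type));
    }

    if (!is_int($data)) {
        // Invalid $data: untrusted input, so the error is collected instead.
        return Outcome::failure([sprintf('Expected an integer, got "%s".', gettype($data))]);
    }

    return Outcome::success($data);
}
```

The caller never has to guess from the exception class whether an error should be shown to the user: anything thrown is a bug, anything returned in the failure list is feedback on the payload.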

julienfalque · 2020-11-26T11:12:15Z

Thanks @camilledejoye, you put this in better words than I would do :)

Another reason I don't like using exceptions is that it relies on the parent denormalizers handling them appropriately. If they don't, the whole process can easily fail or silently behave abnormally, without the developer being aware. With the DenormalizationResult, any uncaught exception is a guarantee that some denormalizer didn't handle it appropriately, and it immediately makes this fact obvious.

dunglas · 2021-01-03T08:32:28Z

src/Symfony/Component/Serializer/Normalizer/DenormalizerInterface.php

    /**
     * Denormalizes data back into an object of the given class.
     *
+     * When context option `collect_invariant_violations` is enabled, the


Could we deprecate not enabling this option? So in 6.0 we will have only one code path again.

Also, shouldn't we add a similar API to NormalizerInterface for consistency?

@symfony/mergers I'm interested in your opinions about this PR because it's an important change in the API.

> Could we deprecate not enabling this option? So in 6.0 we will have only one code path again.

I thought it would be better to have this as an experimental feature so that users are not forced into a big API change. But thinking about it again, keeping it as an opt-in feature probably means third-party normalizers won't make a move to be compatible with the new API (because maintainers don't know about it or they can't/don't want to update). Deprecating the current API in favor of this one will also allow enforcing the API at type level by making DenormalizerInterface::denormalize() return a DenormalizationResult.

What I'm afraid of is that the new API might introduce some overhead to the denormalization process. Is that something we want to be the new default?

Also: if the new API is the only one in 6.0, this new option shall be deprecated in 6.0 and removed in 7.0, right?

> Also, shouldn't we add a similar API to NormalizerInterface for consistency?

The use case of the new API for denormalization is that input data can be untrusted and one might want to show detailed error messages rather than a technical exception to the user that provided that input. Is there a similar scenario for normalization? I'm totally ok to apply the same API changes to normalization, I just want to be sure it's actually worth it.

It's OK to deprecate in 5.x and remove in 6.

I'm not sure about the use cases for NormalizerInterface, but I'm sure we can find some. For instance, in API Platform we often have to store metadata along with the normalization result. An example: we store the IRIs of visited documents to generate cache tags. We currently use mighty tricks involving the serialization context; having a result object would allow us to do this more cleanly. We'll probably need an extension point for this result object by the way (it can start as a simple context map).

That being said, the key point here is consistency IMHO. Normalization and denormalization must work in a similar way. Consistent APIs are easier to learn and remember.

dunglas · 2021-01-03T08:35:57Z

src/Symfony/Component/Serializer/DenormalizationResult.php

+    private $denormalizedValue;
+    private $invariantViolations = [];
+
+    private function __construct()


Maybe we could use a public constructor with named arguments? It's more idiomatic than named factory methods in PHP 8.

Also, this would allow users to access both the normalized value and the errors. It could be useful in some cases to access the partially denormalized data even in case of error.

I disagree: named factories are more expressive about the intention of the caller and thus simplify validating arguments. As an example, I could improve things here by validating that $invariantViolations is not empty in the failure() factory. It would not be possible with a single constructor that supports both cases because we lose the intention of the caller: since we don't know if we're in a success or failure case, we can only accept empty arrays in all cases and infer the use case depending on what the variable contains.

We could mitigate this by allowing e.g. null for $invariantViolations, but then validation would become more complicated and the API less clear.

I added a $partiallyDenormalizedValue argument to the failure() factory, which is a different name than in success(); that would not be possible with a single constructor.
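The argument for named factories can be illustrated with the sketch below. `FactoryResult` is a hypothetical class name, and the exact validation rule (rejecting an empty violation list in `failure()`) is the one suggested in the comment above, not necessarily what the PR implements.

```php
<?php

declare(strict_types=1);

// Hypothetical result class showing per-factory validation and naming.
final class FactoryResult
{
    /** @var mixed */
    private $value;

    private array $violations;

    /** @param mixed $value */
    private function __construct($value, array $violations)
    {
        $this->value = $value;
        $this->violations = $violations;
    }

    /** @param mixed $denormalizedValue */
    public static function success($denormalizedValue): self
    {
        return new self($denormalizedValue, []);
    }

    /** @param mixed $partiallyDenormalizedValue */
    public static function failure(array $violations, $partiallyDenormalizedValue = null): self
    {
        // The factory name tells us the caller's intent, so an empty list
        // can be rejected here; a shared constructor could not do that.
        if ([] === $violations) {
            throw new \InvalidArgumentException('A failure needs at least one violation.');
        }

        return new self($partiallyDenormalizedValue, $violations);
    }

    /** @return mixed */
    public function getValue()
    {
        return $this->value;
    }

    public function getViolations(): array
    {
        return $this->violations;
    }
}
```

For example, `FactoryResult::failure(['age: expected an integer'], ['name' => 'Jane'])` keeps the partially denormalized data available to the caller while still marking the result as failed.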

src/Symfony/Component/Serializer/InvariantViolation.php

fabpot · 2021-09-10T07:20:07Z

see #42502

…ror during denormalization (lyrixx)

This PR was merged into the 5.4 branch.

Discussion
----------

[Serializer] Add support for collecting type error during denormalization

| Q             | A
| ------------- | ---
| Branch?       | 5.4
| Bug fix?      | no
| New feature?  | yes
| Deprecations? | no
| Tickets       | Fix symfony#27824, Fix symfony#42236, Fix symfony#38472, Fix symfony#37419 Fix symfony#38968
| License       | MIT
| Doc PR        |

---

There is something that I don't like about the (de)Serializer. It's about the way it deals with typed properties. As soon as you add a type to a property, the API can return 500.

Let's consider the following code:

```php
class MyDto
{
    public string $string;
    public int $int;
    public float $float;
    public bool $bool;
    public \DateTime $dateTime;
    public \DateTimeImmutable $dateTimeImmutable;
    public \DateTimeZone $dateTimeZone;
    public \SplFileInfo $splFileInfo;
    public Uuid $uuid;
    public array $array;
    /** @var MyDto[] */
    public array $collection;
}
```

and the following JSON:

```json
{
    "string": null,
    "int": null,
    "float": null,
    "bool": null,
    "dateTime": null,
    "dateTimeImmutable": null,
    "dateTimeZone": null,
    "splFileInfo": null,
    "uuid": null,
    "array": null,
    "collection": [
        {
            "string": "string"
        },
        {
            "string": null
        }
    ]
}
```

**By default**, I got a 500:

![image](https://user-images.githubusercontent.com/408368/129211588-0ce9064e-171d-42f2-89ac-b126fc3f9eab.png)

It's the same with the prod environment. This is far from perfect when you try to make a public API :/

ATM, the only solution, is to remove all typehints and add assertions (validator component). With that, the public API is nice, but the internal PHP is not so good (PHP 7.4+ FTW!)

In APIP, they have support for transforming to [something](https://github.com/api-platform/core/blob/53837eee3ebdea861ffc1c9c7f052eecca114757/src/Core/Serializer/AbstractItemNormalizer.php#L233-L237) they can handle gracefully.

But the deserialization stop on the first error (so the end user must fix the error, try again, fix the second error, try again etc.). And the raw exception message is leaked to the end user. So the API can return something like `The type of the "string" attribute for class "App\Dto\MyDto" must be one of "string" ("null" given).`. Really not cool :/

So ATM, building a nice public API is not cool.

That's why I propose this PR that address all issues reported

* be able to collect all error
* with their property path associated
* don't leak anymore internal

In order to not break the BC, I had to use some fancy code to make it work 🐒

With the following code, I'm able to collect all errors, transform them in `ConstraintViolationList` and render them properly, as expected.

![image](https://user-images.githubusercontent.com/408368/129215560-b0254a4e-fec7-4422-bee0-95cf9f9eda6c.png)

```php
#[Route('/api', methods:['POST'])]
public function apiPost(SerializerInterface $serializer, Request $request): Response
{
    $context = ['not_normalizable_value_exceptions' => []];
    $exceptions = &$context['not_normalizable_value_exceptions'];

    $dto = $serializer->deserialize($request->getContent(), MyDto::class, 'json', $context);

    if ($exceptions) {
        $violations = new ConstraintViolationList();

        /** @var NotNormalizableValueException */
        foreach ($exceptions as $exception) {
            $message = sprintf('The type must be one of "%s" ("%s" given).', implode(', ', $exception->getExpectedTypes()), $exception->getCurrentType());
            $parameters = [];
            if ($exception->canUseMessageForUser()) {
                $parameters['hint'] = $exception->getMessage();
            }
            $violations->add(new ConstraintViolation($message, '', $parameters, null, $exception->getPath(), null));
        };

        return $this->json($violations, 400);
    }

    return $this->json($dto);
}
```

If this PR got accepted, the above code could be transferred to APIP to handle correctly the deserialization

Commits
-------

ebe6551 [Serializer] Add support for collecting type error during denormalization

carsonbot added Status: Needs Review Serializer Feature labels Oct 7, 2020

julienfalque mentioned this pull request Oct 7, 2020

[Serializer] Add ability to collect denormalization errors #38165

Closed

3 tasks

nicolas-grekas added this to the 5.x milestone Oct 12, 2020

greedyivan mentioned this pull request Nov 2, 2020

[Serializer] Add ability to collect denormalization errors #38968

Closed

julienfalque force-pushed the serializer-error-collection branch from 489f341 to fc20acd Compare November 3, 2020 12:30

julienfalque marked this pull request as ready for review November 3, 2020 12:40

julienfalque requested a review from dunglas as a code owner November 3, 2020 12:40

greedyivan reviewed Nov 3, 2020

src/Symfony/Component/Serializer/Tests/SerializerTest.php Outdated

camilledejoye reviewed Nov 21, 2020

julienfalque force-pushed the serializer-error-collection branch from 112fdfc to f8e564c Compare November 22, 2020 09:24

camilledejoye reviewed Nov 22, 2020

julienfalque force-pushed the serializer-error-collection branch from f8e564c to d2b9674 Compare November 22, 2020 10:50

camilledejoye reviewed Nov 22, 2020

src/Symfony/Component/Serializer/Normalizer/AbstractObjectNormalizer.php Outdated

julienfalque force-pushed the serializer-error-collection branch from d2b9674 to 2ef0003 Compare November 22, 2020 11:23

camilledejoye reviewed Nov 22, 2020

src/Symfony/Component/Serializer/Normalizer/AbstractObjectNormalizer.php Outdated

src/Symfony/Component/Serializer/Normalizer/AbstractObjectNormalizer.php Outdated

julienfalque force-pushed the serializer-error-collection branch 5 times, most recently from 436021c to b0a559f Compare November 23, 2020 21:09

julienfalque mentioned this pull request Dec 18, 2020

[Serializer] Do not throw exception in the DateTimeNormalizer if it's not a date #27824

Closed

falkenhawk mentioned this pull request Dec 20, 2020

[Serializer][RFC] New context option to avoid throw UnexpectedValueException #28358

Closed

This was referenced Dec 20, 2020

[RFC][Serializer] Ability to collect denormalization failures #37419

Closed

[Serializer] Allow to provide a "normalizer" callable to normalize a property during denormalization #27933

Open

julienfalque mentioned this pull request Jan 2, 2021

[Serializer] allow ObjectNormalizer to provide ConstraintViolationListInterface from ExtraAttributesException #25729

Closed

dunglas reviewed Jan 3, 2021

julienfalque force-pushed the serializer-error-collection branch from b0a559f to 934e707 Compare January 3, 2021 09:48

julienfalque added 5 commits January 26, 2021 19:19

[Serializer] Add ability to collect denormalization errors

d6db18d

Exception -> Throwable

5dc3490

Allow passing partially denormalized value in failure result

30b4ef4

Move DenormalizationResult to Result namespace

1abf68a

NormalizationResult

096c487

julienfalque force-pushed the serializer-error-collection branch from 934e707 to 096c487 Compare January 26, 2021 18:24

julienfalque requested a review from sroze as a code owner January 26, 2021 18:24

julienfalque marked this pull request as draft January 26, 2021 18:24

lyrixx mentioned this pull request Aug 12, 2021

[Serializer] Add support for collecting type error during denormalization #42502

Merged

fabpot closed this in b0fbe93 Sep 10, 2021

julienfalque deleted the serializer-error-collection branch September 10, 2021 07:27
