[Cache] Allow to configure serializator for cache instances. #27484
9a2b820 to 2e76ae4
    |  | ||
| use Symfony\Component\Cache\SerializerInterface; | ||
|  | ||
| class NullSerializer implements SerializerInterface | 
Maybe this should be called IdentitySerializer, I expected a NullSerializer to always return null.
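To make the naming point concrete, here is a minimal sketch of what an identity serializer could look like. The `SerializerInterface` below is a hypothetical stand-in declared locally (only a `serialize()` call is visible in the diff, so the `unserialize()` method is an assumption), and `IdentitySerializer` is the name suggested in the comment, not a class known to exist in the component.

```php
<?php

// Hypothetical stand-in for the PR's SerializerInterface; method names are assumed.
interface SerializerInterface
{
    public function serialize($value);

    public function unserialize($serialized);
}

// Passes values through untouched, for backends that serialize on their own.
final class IdentitySerializer implements SerializerInterface
{
    public function serialize($value)
    {
        return $value;
    }

    public function unserialize($serialized)
    {
        return $serialized;
    }
}

$serializer = new IdentitySerializer();
$data = ['a' => 1, 'b' => null];

var_dump($serializer->unserialize($serializer->serialize($data)) === $data); // bool(true)
```

A serializer named "null" could reasonably be read as discarding its input, which is the confusion the comment points at; "identity" describes the pass-through behaviour shown here.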
| } catch (\Exception $e) { | ||
| throw new InvalidArgumentException(sprintf('Cache key "%s" has non-serializable array value.', $key), 0, $e); | ||
| $exportedValue = $exportSerializer->serialize($value); | ||
| $valuePart = 'array("isSerialized" => false, "value" => '.$exportedValue.')'; | 
For me, an important criterion for this PR should be preserving the original data format, so that we keep compatibility with existing dumped values. This change alters it and adds storage overhead, filling it with metadata that wasn't needed before.
I think the opcache-based storages should just not allow configuring the serializer. WDYT?
The previous implementation favors scalars and arrays of scalars. Nulls and objects were stored in serialized form. That adds a performance hit on every cache read, or forces using opcache for primitive types only.
It is possible to store values directly, but all checks of the form isset($this->values[$key]) would have to be rewritten.
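The concern about `isset()` can be illustrated with a short sketch: `isset()` reports `false` for a key whose value is `null`, so a pool that stored raw values would need something like `array_key_exists()` to tell a stored `null` from a miss. The `$values` array here is a hypothetical stand-in for an adapter's internal storage.

```php
<?php

// Hypothetical stand-in for an adapter's in-memory values.
$values = ['present' => 'foo', 'stored_null' => null];

var_dump(isset($values['stored_null']));            // bool(false): looks like a miss
var_dump(array_key_exists('stored_null', $values)); // bool(true): the key is there
var_dump(isset($values['missing']));                // bool(false)
var_dump(array_key_exists('missing', $values));     // bool(false)
```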
> The previous implementation favors scalars and arrays of scalars. Nulls and objects were stored in serialized form.
That's intended: only these benefit from opcache's shared memory.
> That adds a performance hit on every cache read, or forces using opcache for primitive types only.
Do you have numbers to illustrate this statement?
Test: https://gist.github.com/palex-fpt/82fc3deed09c2de2a9023080a9613ce3
master results:
Duration (first/next) (largeArrayOfNulls): 14.200/2.300 s
Duration (first/next) (largeArrayOfStrings): 12.200/2.500 s
Duration (first/next) (largeArrayOfSmallObjects): 18.600/5.000 s
Duration (first/next) (largeObject): 130.400/112.300 s
branch results:
Duration (first/next) (largeArrayOfNulls): 16.500/3.500 s
Duration (first/next) (largeArrayOfStrings): 11.300/2.800 s
Duration (first/next) (largeArrayOfSmallObjects): 30.200/5.200 s
Duration (first/next) (largeObject): 145.700/3.400 s
Duration (first/next) (largeExportableObject): 7.500/3.300 s
|  | ||
| EOF; | ||
|  | ||
| $exportSerializer = new PhpExportSerializer(); | 
PhpExportSerializer is less capable than the current code, which accepts all serializable objects.
We should preserve this property.
PhpExportSerializer is backed by the instance serializer: when data cannot be var_export-ed, it is serialized.
I can merge PhpExportSerializer into PhpArrayTrait. It is not used in other caches, and it is not part of the serialization contract of PhpArrayTrait.
| $unserializeCallbackHandler = ini_set('unserialize_callback_func', __CLASS__.'::handleUnserializeCallback'); | ||
| try { | ||
| $value = unserialize($serialized); | ||
| if (false === $value && serialize(false) !== $serialized) { | 
This is performance-critical code; the previous logic should be preserved (checking for the serialized "false" before unserializing).
It only does an unnecessary unserialize for a false value. Is storing false as an opcache entry really that frequent? But OK, I will change it back.
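The ordering under discussion can be sketched as follows, assuming native PHP serialization. `unserialize()` returns `false` both for a stored `false` and for corrupt input, so the two cases have to be told apart; comparing the payload with the literal `'b:0;'` first avoids calling `unserialize()` (and `serialize(false)`) at all for that value. The function name `unserializeValue` and the exception type are hypothetical choices for this example.

```php
<?php

function unserializeValue(string $serialized)
{
    // 'b:0;' is the native serialized form of false; handle it before unserializing.
    if ('b:0;' === $serialized) {
        return false;
    }

    // Any remaining false result means the payload could not be decoded.
    $value = @unserialize($serialized);
    if (false === $value) {
        throw new \DomainException('Failed to unserialize cached value.');
    }

    return $value;
}

var_dump(unserializeValue(serialize(false)));   // bool(false)
var_dump(unserializeValue(serialize('hello'))); // string(5) "hello"

try {
    unserializeValue('garbage');
} catch (\DomainException $e) {
    echo $e->getMessage(), "\n";
}
```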
| { | ||
| $unserializeCallbackHandler = ini_set('unserialize_callback_func', __CLASS__.'::handleUnserializeCallback'); | ||
| try { | ||
| return $this->checkResultCode($this->getClient()->getMulti($ids)); | 
Serialization is also handled by the extension itself, so Memcached should be given an IdentitySerializer and this code should be kept as is, shouldn't it?
I did not add SerializerTrait to caches that handle serialization themselves: Memcached, Apcu, Doctrine. Adding an IdentitySerializer to them looks like a good idea.
it would allow using a compressing serializer if needed, so would be great yes
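As a rough idea of what a compressing serializer might do, the sketch below wraps native serialization with `gzdeflate()`/`gzinflate()` from the zlib extension. The function names are hypothetical, and nothing here reflects how the PR or the component would wire such a serializer into an adapter.

```php
<?php

// Requires the zlib extension for gzdeflate()/gzinflate().
function compressSerialize($value): string
{
    return gzdeflate(serialize($value));
}

function compressUnserialize(string $data)
{
    return unserialize(gzinflate($data));
}

$value = str_repeat('cache ', 1000);
$packed = compressSerialize($value);

var_dump(strlen($packed) < strlen(serialize($value))); // bool(true)
var_dump(compressUnserialize($packed) === $value);     // bool(true)
```

Compression trades CPU time for payload size, which is mostly of interest for remote backends where transfer cost dominates.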
| Add test result at https://gist.github.com/palex-fpt/82fc3deed09c2de2a9023080a9613ce3 | 
| I just realized we cannot use the var_export() strategy with PhpArrayAdapter, because this strategy requires instantiating all objects in the pool, when typically only a few are needed per run. | 
| For PhpFileAdapter, var_export() poses another issue, which is deep cloning, which is broken. :( | 
26ba1cf to a4b045f
    | Was PhpArrayAdapter designed with that access strategy in mind? | 
| Objects should be tested against the serialization method. There is no universal serializer. | 
| Oh yes it was. PhpArray is basically free on memory usage, because opcache puts the structure in shared memory, which is out of quota. | 
a4b045f to 6401ce7
    | My recommendation for this PR would be to remove the export-based serializers and leave the PHP ones as is. That would make a first important step mergeable. Then, if we can find a better serialization strategy, let's do it in another PR. | 
6401ce7 to defdc50
    | So, for PhpArrayCache we want to store scalar/serialized data and unserialize it on demand. | 
| For PhpArray, we want to create the objects on-demand yes. | 
| PhpFile uses the same strategy as PhpArray. It stores non-scalar values in serialized form. That adds an unserialization performance hit on access. | 
| The main point of this PR was to speed up access to opcache entries. This can be achieved by removing the serialization part (allowing objects with __set_state to be var_exported) or by changing the serializer (e.g. igbinary). It is possible to add extension points to select which types are var_exported, or to select the serializer. But it is not possible to do that in PhpFilesCache while preserving the current data format: PhpFilesCache detects types by inspecting the serialized content. Should it be done in a fresh implementation of CacheInterface? Can we change the current data format? | 
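As a rough illustration of the `__set_state` approach discussed in this thread: `var_export()` emits PHP code that rebuilds an object through a static `__set_state()` call, so loading the exported code creates the object without `unserialize()`. The `Point` class is a hypothetical example, and `eval()` stands in for including a dumped file.

```php
<?php

final class Point
{
    public $x;
    public $y;

    public function __construct(int $x, int $y)
    {
        $this->x = $x;
        $this->y = $y;
    }

    // Called by the code that var_export() generates for this class.
    public static function __set_state(array $state)
    {
        return new self($state['x'], $state['y']);
    }
}

$code = var_export(['origin' => new Point(0, 0), 'p' => new Point(3, 4)], true);

// Evaluating (or including) the dump instantiates every object it contains at once.
$values = eval('return '.$code.';');

var_dump($values['p'] instanceof Point);    // bool(true)
var_dump($values['p']->x, $values['p']->y); // int(3) int(4)
```

The last step also shows the drawback raised earlier in the thread: every object in the dump is created on load, even when only a few entries are read.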
| 
 I had a different goal, which you already achieved here (having a swappable serialization format, of special interest when using remote backends IMHO, to allow using e.g. igbinary.) For OPcache entries, I propose #27543 instead. | 
| 
 OK. I will clean up all changes in PhpFilesCache and PhpArrayCache. IMHO we don't need IdentitySerializers in Apcu, Doctrine, Memcached, Redis. | 
f1b3fe3 to b279460
b279460 to 9a94e6c
    | How should we move this forward? Taking #27543 into account, I would suggest reverting all changes to  | 
…sible (nicolas-grekas)

This PR was merged into the 4.2-dev branch.

Discussion
----------

[Cache] serialize objects using native arrays when possible

| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | no
| New feature?  | no
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | -
| License       | MIT
| Doc PR        | -

This PR allows leveraging OPCache shared memory when storing objects in `Php*` pool storages (as done by default for all system caches). This improves performance a bit further when loading e.g. annotations, etc. (bench coming);

Instead of using native php serialization, this uses a marshaller that represents objects in plain static arrays. Unmarshalling these arrays is faster than unserializing the corresponding PHP strings (because it works with copy-on-write, while unserialize cannot.)

php-serialization is still a possible format because we have to use it when serializing structures with internal references or with objects implementing `Serializable`. The best serialization format is selected automatically so this is completely seamless.

ping @palex-fpt since you gave me the push to work on this, and are pursuing a similar goal in #27484. I'd be thrilled to get some benchmarks on your scenarios.

Commits
-------

866420e [Cache] serialize objects using native arrays when possible
| #27543 is now merged, this can be rebased :) | 
| Php* changes were already reverted. After reverting Array*, only two caches are left: FilesystemCache and PdoCache. My end goal is to use igbinary for the filesystem cache. Can we just add a boolean flag to the Filesystem* constructor to use igbinary? | 
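A minimal sketch of the kind of switch being asked about, assuming the igbinary extension's `igbinary_serialize()`/`igbinary_unserialize()` functions and falling back to native serialization when the extension is not loaded. The helper names and the `$useIgbinary` flag are hypothetical and not part of the Cache component.

```php
<?php

function encodeValue($value, bool $useIgbinary): string
{
    // Use igbinary only when requested and when the extension is available.
    if ($useIgbinary && \function_exists('igbinary_serialize')) {
        return igbinary_serialize($value);
    }

    return serialize($value);
}

function decodeValue(string $data, bool $useIgbinary)
{
    if ($useIgbinary && \function_exists('igbinary_unserialize')) {
        return igbinary_unserialize($data);
    }

    return unserialize($data);
}

$value = ['id' => 42, 'tags' => ['a', 'b']];

var_dump(decodeValue(encodeValue($value, true), true) === $value); // bool(true)
```

Note that a payload written in one format cannot be read back with the other, so a plain flag like this would make existing cache files unreadable after it is toggled unless the format is detected on read.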
| /** @var SerializerInterface */ | ||
| private $serializer; | ||
|  | ||
| public function setSerializer(SerializerInterface $serializer) | 
does this need to be public? it's currently not used in a public way.
| I'm closing in favor of #27645, which provides auto-adaptive igbinary support. It does not provide serializer injection, but I'm not sure we actually need any. I would like to thank you for providing this PR anyway, and for your comments on the other cache-related PRs; it's been helping A LOT to improve the component. Let's continue on #27645. | 
Extract object serialization logic into SerializerInterface.
Allow opcache related caches to store __set_state enabled objects without serialization.