DomCrawler memory leak with filter method #10879
Comments
This is weird, because
In this case it would appear that the CssSelector component is leaking somewhere inside the toXPath method.
I have taken a look at this issue. I have added a

    function removeAll()
    {
        unset($this->mainParser);
        unset($this->shortcutParsers);
        unset($this->extensions);
        unset($this->nodeTranslators);
        unset($this->combinationTranslators);
        unset($this->functionTranslators);
        unset($this->pseudoClassTranslators);
        unset($this->attributeMatchingTranslators);
    }

I also updated the

NB: I think that @jpauli could be of great help here
@geoffrey-brier you are unsetting the properties from the class definitions here, not only unsetting their value. This means that later usages of the class would increase the memory usage (on PHP 5.4+ at least). I think one of the issues is that the CssSelector object graph contains circular references in many places, so reference counting alone cannot free the objects after the CSS is turned into XPath. The garbage collector that handles circular references likely runs less often.
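A minimal sketch of the effect described above, using a hypothetical Node class rather than the real CssSelector classes. The cycle collector is switched off so that only reference counting can free memory; the payload size and loop count are arbitrary assumptions.

```php
<?php
// Hypothetical stand-in for two objects that may point at each other
// (for example a translator and one of its extensions).
class Node
{
    public $peer;     // reference to another Node, used to build a cycle
    public $payload;  // sizeable string so memory changes are easy to see

    public function __construct()
    {
        $this->payload = str_repeat('x', 10000);
    }
}

gc_disable(); // rule out the cycle collector: only refcounting frees memory now

// Acyclic case: unset() drops the refcount to zero, memory is released at once.
$before = memory_get_usage();
for ($i = 0; $i < 100; $i++) {
    $a = new Node();
    unset($a);
}
$acyclicGrowth = memory_get_usage() - $before;

// Cyclic case: each object keeps the other alive, so unset() frees nothing.
$before = memory_get_usage();
for ($i = 0; $i < 100; $i++) {
    $a = new Node();
    $b = new Node();
    $a->peer = $b;
    $b->peer = $a;
    unset($a, $b);
}
$cyclicGrowth = memory_get_usage() - $before;

// Only an explicit (or scheduled) cycle collection reclaims the cyclic objects.
gc_enable();
$collected = gc_collect_cycles();

printf("acyclic growth: %d bytes\n", $acyclicGrowth);
printf("cyclic growth:  %d bytes\n", $cyclicGrowth);
printf("collected by gc_collect_cycles(): %d\n", $collected);
```

The cyclic loop should show clearly more growth than the acyclic one, and that memory only comes back once the cycle collector runs, which matches the pattern reported for filter().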
@geoffrey-brier I implemented the same behavior in pull request #11221, but in my case it fully fixes the leak (see https://github.com/symfony/symfony/pull/11221/files#diff-116347d1689bd54d19e1f9a6901ef8a7R401). What do you mean by "the memory leak quite well but it still leaks"? Do you have a test case?
…cular object graph (stof)

This PR was merged into the 2.3 branch.

Discussion
----------

[CssSelector] Refactored the CssSelector to remove the circular object graph

| Q             | A
| ------------- | ---
| Bug fix?      | yes
| New feature?  | no
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | #10879, replaces #11221
| License       | MIT
| Doc PR        | n/a

This allows the translator and its extensions to be garbage collected based on the refcount rather than requiring the garbage collector run, making it much more likely to happen at the end of the ``CssSelector::toXPath`` call.

Node translators now receive the Translator as second argument, instead of requiring to inject it in the extension to keep a reference to it. This way, the Translator is referenced nowhere inside it, only by the caller, and so will be destructed at the end of the usage (and extensions will then be destructed after it when not used anymore).

Commits
-------

994f81f Refactored the CssSelector to remove the circular object graph
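The idea in the PR description can be sketched roughly as follows. SketchTranslator, SketchExtension and the free function toXPath() are hypothetical, heavily simplified stand-ins and not the actual Symfony classes. The only point illustrated is that the extension never stores the translator, so no cycle exists and the destructor runs as soon as the call returns, even with the cycle collector disabled.

```php
<?php
// Hypothetical, simplified classes; not the real CssSelector implementation.
class SketchExtension
{
    // The extension holds no translator. Each node translator callback
    // receives the translator as its second argument instead.
    public function getNodeTranslators()
    {
        return array(
            'Element' => function ($node, SketchTranslator $translator) {
                return '//' . $node;
            },
        );
    }
}

class SketchTranslator
{
    public static $destroyed = 0; // counts destructor calls, for the demo only

    private $nodeTranslators = array();

    public function registerExtension(SketchExtension $extension)
    {
        $this->nodeTranslators = array_merge(
            $this->nodeTranslators,
            $extension->getNodeTranslators()
        );
    }

    public function translate($type, $node)
    {
        // Pass $this on every call rather than injecting it into the extension.
        return call_user_func($this->nodeTranslators[$type], $node, $this);
    }

    public function __destruct()
    {
        self::$destroyed++;
    }
}

function toXPath($selector)
{
    $translator = new SketchTranslator();
    $translator->registerExtension(new SketchExtension());

    // After this return, nothing references $translator any more.
    return $translator->translate('Element', $selector);
}

gc_disable(); // show that plain refcounting is enough here
$xpath = toXPath('td');
$destroyedAfterCall = SketchTranslator::$destroyed;
gc_enable();

printf("xpath: %s, translators destroyed: %d\n", $xpath, $destroyedAfterCall);
```

If the extension kept a reference back to the translator, the same call would leave both objects alive until a cycle collection, which is the situation the refactoring is meant to avoid.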
Hi,
recently I noticed that DomCrawler might leak memory when using the filter method (in contrast to the filterXPath method).
Here is example memory usage for both methods (after 100 loops):
Before $crawler->filterXPath("//tr/td"): 9175040
elements:800
After $crawler->filterXPath("//tr/td"): 9175040
Before $crawler->filter("tr > td"): 9175040
elements:800
After $crawler->filter("tr > td"): 12320768
Example code for both methods: