[DomCrawler] Add a way to filter direct children #28221

Merged

Einenlum
Contributor

| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | no
| New feature?  | yes
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | #28171
| License       | MIT
| Doc PR        | -

The DomCrawler component only has a filter() method (which filters the node and all its descendants) and a children() method that returns direct children.
There is currently no easy way to filter the direct children of a node with a selector, as jQuery allows by passing a selector to its .children([selector]) method.

This PR adds a way to optionally filter direct children using a CSS selector. Here is a usage example:

use Symfony\Component\DomCrawler\Crawler;

$html = <<<'HTML'
<html>
    <body>
        <div id="foo">
            <p class="lorem" id="p1"></p>
            <p class="lorem" id="p2"></p>
            <div id="nested">
                <p class="lorem" id="p3"></p>
            </div>
        </div>
    </body>
</html>
HTML;

$crawler = new Crawler($html);
$foo = $crawler->filter('#foo');

$foo->children(); // selects `#p1`, `#p2` and `#nested`
$foo->children('p'); // selects `#p1` and `#p2`
$foo->children('.lorem'); // selects `#p1` and `#p2`

This PR adds only an optional parameter and adds no BC break.

Member

@nicolas-grekas left a comment

thanks, here are some comments.

  */
- public function children()
+ public function children($selector = null)
Member

Actually, this is a BC break: child classes overriding this method would break with this change.
In these situations we fetch the argument using \func_get_arg(); see e.g. the Finder::sortByName() method.
The new signature should be public function children(/* string $selector = null */).
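
The pattern suggested in this comment can be sketched roughly as follows. `NodeList` and `LegacyNodeList` are hypothetical stand-ins invented for illustration, not Symfony classes, and the method bodies return placeholder strings only. The point is that the declared signature stays empty, so an existing override written against the old signature remains compatible, while callers can already pass a selector.

```php
<?php

// Hypothetical stand-in for a library class; not Symfony's actual Crawler.
class NodeList
{
    /**
     * @param string|null $selector An optional selector, read via func_get_arg()
     */
    public function children(/* string $selector = null */)
    {
        // No parameter is declared, so the argument is fetched manually.
        $selector = 0 < \func_num_args() ? \func_get_arg(0) : null;

        return null === $selector ? 'all children' : "children matching '$selector'";
    }
}

// A pre-existing subclass written against the old, argument-less signature.
class LegacyNodeList extends NodeList
{
    public function children()
    {
        return 'legacy: all children';
    }
}

$list = new NodeList();
$all = $list->children();
$filtered = $list->children('p');
$legacy = (new LegacyNodeList())->children();
```

Had the parent declared `children($selector = null)` directly, PHP would report the argument-less override in `LegacyNodeList` as an incompatible declaration, which is the break the comment refers to.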

@@ -1148,4 +1154,20 @@ private function createSubCrawler($nodes)

return $crawler;
}

/**
* @param bool $html Whether HTML support should be enabled. Disable it for XML documents
Member

this argument should be removed: read $this->isHtml instead.

/**
* @param bool $html Whether HTML support should be enabled. Disable it for XML documents
*
* @return CssSelectorConverter A CssSelectorConverter instance
Member

should be removed and replaced by a real return type on the method

*
* @return CssSelectorConverter A CssSelectorConverter instance
*
* @throws \RuntimeException if the CssSelector Component is not available
Member

uppercase "If" (same in other places)

@Einenlum
Contributor Author

Thank you very much for your feedback @nicolas-grekas :).
I addressed your comments.
A test is broken on 7.2 on Travis, but it seems to be unrelated to this PR (link).

Member

@nicolas-grekas left a comment

(with minor comments)

@@ -501,16 +501,28 @@ public function parents()
/**
* Returns the children nodes of the current selection.
*
* @param $selector string|null An optional CSS selector to filter children
Member

swap needed: @param string|null $selector

if (!$this->nodes) {
throw new \InvalidArgumentException('The current node list is empty.');
}

if ($selector) {
Member

if (null !== $selector) {?
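
To illustrate why the strict comparison may matter, here is a small sketch contrasting the two guards. The helper names `usesSelectorLoose` and `usesSelectorStrict` are hypothetical and exist only for this comparison.

```php
<?php

// Hypothetical helpers contrasting the two guards discussed in the review.
function usesSelectorLoose($selector): bool
{
    return (bool) $selector; // mirrors `if ($selector)`
}

function usesSelectorStrict($selector): bool
{
    return null !== $selector; // mirrors `if (null !== $selector)`
}

$cases = [null, '', '0', 'p'];
$loose = array_map('usesSelectorLoose', $cases);   // falsy strings are skipped
$strict = array_map('usesSelectorStrict', $cases); // only null is skipped
```

With the loose guard, falsy strings such as `''` and `'0'` are treated as if no selector had been passed; with the strict guard they reach the filtering path, so only an omitted (null) selector returns all children.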

@@ -1148,4 +1156,16 @@ private function createSubCrawler($nodes)

return $crawler;
}

/**
* @throws \RuntimeException if the CssSelector Component is not available
Member

uppercase If

@Einenlum
Contributor Author

Thanks @nicolas-grekas :). Fixed.

$converter = $this->createCssSelectorConverter($this->isHtml);
$xpath = $converter->toXPath($selector, 'child::*/');

return $this->filterXPath($xpath);
Member

This could be made much more efficient by using child:: as the prefix when converting the CSS selector and then calling filterRelativeXPath, instead of going through relativize to modify the XPath again.
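
As a rough illustration of the `child::` axis mentioned here, the sketch below evaluates relative XPath expressions against the `#foo` node from the PR description. It uses PHP's built-in DOM extension rather than DomCrawler internals, and the helper `matchedIds` is a hypothetical name introduced for this example.

```php
<?php

$html = <<<'HTML'
<html>
    <body>
        <div id="foo">
            <p class="lorem" id="p1"></p>
            <p class="lorem" id="p2"></p>
            <div id="nested">
                <p class="lorem" id="p3"></p>
            </div>
        </div>
    </body>
</html>
HTML;

$document = new DOMDocument();
$document->loadHTML($html);
$xpath = new DOMXPath($document);

// Context node: the #foo element.
$foo = $xpath->query('//*[@id="foo"]')->item(0);

/** Collects the id attribute of every element matched relative to $context. */
function matchedIds(DOMXPath $xpath, string $expression, DOMNode $context): array
{
    $ids = [];
    foreach ($xpath->query($expression, $context) as $node) {
        $ids[] = $node->getAttribute('id');
    }

    return $ids;
}

// child:: only looks one level below the context node.
$direct = matchedIds($xpath, 'child::p', $foo);
// descendant:: also reaches the paragraph inside #nested.
$nested = matchedIds($xpath, 'descendant::p', $foo);
```

Because the expression is already relative to the context node, no further rewriting of the XPath is needed before evaluating it.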

@Einenlum
Contributor Author

@stof Thank you for your feedback. I fixed it.

  */
- public function children()
+ public function children(/* string $selector = null */)
Member

@nicolas-grekas shouldn't we also detect cases where get_class($this) !== __CLASS__ (plus the special handling of common mock libraries) to trigger a deprecation warning if the child class does not declare the argument? Otherwise, we don't have a continuous migration path.
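
One possible shape for such a check is sketched below. The classes are hypothetical and this is not Symfony's implementation: the check lives in the constructor for brevity, messages are collected in a static array instead of being passed to `trigger_error()`, and the mock-library handling mentioned in the comment is left out.

```php
<?php

// Hypothetical base class; the check is a sketch, not Symfony's implementation.
class BaseList
{
    /** Messages collected here instead of calling trigger_error(), to keep the sketch observable. */
    public static $deprecations = [];

    public function __construct()
    {
        if (__CLASS__ !== \get_class($this)) {
            $method = new \ReflectionMethod($this, 'children');
            // Flag only overrides declared in a child class without any parameter.
            if (__CLASS__ !== $method->getDeclaringClass()->getName() && 0 === $method->getNumberOfParameters()) {
                self::$deprecations[] = sprintf('%s::children() should accept a $selector argument.', \get_class($this));
            }
        }
    }

    public function children(/* string $selector = null */)
    {
        return [];
    }
}

// Override still using the old signature: flagged.
class OutdatedList extends BaseList
{
    public function children()
    {
        return [];
    }
}

// Override that already declares the new optional argument: not flagged.
class UpdatedList extends BaseList
{
    public function children($selector = null)
    {
        return [];
    }
}

new BaseList();
new OutdatedList();
new UpdatedList();
```

Only `OutdatedList` ends up in `BaseList::$deprecations`, which would give subclass authors a warning before the argument becomes part of the real signature.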

Member

Looking at the source, we forgot such deprecations in many places.
Maybe merge as is (the continuous upgrade path would simply be not adding this argument in v5)
and fix the whole code base at once afterwards?

@nicolas-grekas force-pushed the feature/dom-crawler-filter-children branch from 6f303ec to f634afd on August 24, 2018 10:00
@nicolas-grekas
Member

Thank you @Einenlum.

@nicolas-grekas merged commit f634afd into symfony:master on Aug 24, 2018
nicolas-grekas added a commit that referenced this pull request Aug 24, 2018
…nlum)

This PR was squashed before being merged into the 4.2-dev branch (closes #28221).

Commits
-------

f634afd [DomCrawler] Add a way to filter direct children
@Einenlum deleted the feature/dom-crawler-filter-children branch on August 24, 2018 10:05
@Einenlum
Contributor Author

Thank you very much @nicolas-grekas! :)

@javiereguiluz
Member

@Einenlum thanks for this feature! We've created symfony/symfony-docs#10288 to not forget about documenting this new feature. It'd be great if you could provide the docs for it. If you need any help doing that, ask us in the Symfony Docs repository. Thanks!

@Einenlum
Contributor Author

Einenlum commented Sep 7, 2018

@javiereguiluz Thank you for your comment. I will create a PR on Symfony Docs in the coming days, no problem!

@nicolas-grekas modified the milestones: next, 4.2 on Nov 1, 2018
@fabpot fabpot mentioned this pull request Nov 3, 2018
@fabpot fabpot mentioned this pull request Nov 3, 2018