hdq is an XGo package for processing HTML documents.
How do you collect all the links on an HTML page? With hdq, it is very easy:

    import "github.com/goplus/hdq"

    func links(url any) []string {
        doc := hdq.Source(url)
        return [link for a in doc.any.a if link := a.href?:""; link != ""]
    }

First, we call hdq.Source(url) to create a node set named doc. It contains only one node, the root node.
Next, we select all a elements with doc.any.a. Here doc.any means all nodes in the HTML document.
Then we visit each of these a elements, get its href attribute value, and assign it to the variable link. If link is not empty, we collect it.
Finally, we return all collected links. See tutorial/01-Links for the full source code.
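
For comparison, the same steps (visit every a element, read its href, keep the non-empty values) can be sketched in plain Go. This is not how hdq is implemented. It is only a minimal illustration that assumes well-formed markup and uses the standard library's encoding/xml decoder in its lenient mode as a stand-in for a real HTML parser. The name collectLinks is hypothetical.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// collectLinks is a hypothetical helper: it returns the non-empty href
// values of all <a> elements found in src, in document order.
func collectLinks(src string) []string {
	dec := xml.NewDecoder(strings.NewReader(src))
	// Lenient settings that encoding/xml documents for HTML-like input.
	dec.Strict = false
	dec.AutoClose = xml.HTMLAutoClose
	dec.Entity = xml.HTMLEntity

	var links []string
	for {
		tok, err := dec.Token()
		if err != nil { // end of input, or markup the decoder cannot read
			break
		}
		start, ok := tok.(xml.StartElement)
		if !ok || !strings.EqualFold(start.Name.Local, "a") {
			continue // not the start of an <a> element
		}
		for _, attr := range start.Attr {
			if strings.EqualFold(attr.Name.Local, "href") {
				if attr.Value != "" {
					links = append(links, attr.Value)
				}
				break // only the first href of an element is considered
			}
		}
	}
	return links
}

func main() {
	page := `<html><body>
<a href="/a">A</a>
<a>no href</a>
<a href="">empty href</a>
<a href="https://example.com">B</a>
</body></html>`
	fmt.Println(collectLinks(page)) // prints [/a https://example.com]
}
```

The explicit loop, the type check, and the attribute scan are what the single hdq expression doc.any.a with the link != "" filter replaces.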