@phutelmyer phutelmyer commented Jan 26, 2023

Describe the change
With no limit set for the hyperlinks field in ScanHtml, events have been observed with well over 1000 hyperlinks. While this may be useful for some use cases, it may be advisable not to collect so many objects by default. The following modifications were implemented:

  • A default limit (50) on the number of hyperlinks collected from a ScanHtml session. Configurable in backend.yml.
  • A hyperlinks_count field is added to provide context on how many links were observed during that session, even if the limit is reached.
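
For reviewers who want the behavior at a glance, here is a minimal sketch of the capping logic described above. It is not the code from this PR: the function name `limit_hyperlinks` and the parameter `max_hyperlinks` are hypothetical, and de-duplicating links before counting is an assumption made for the sketch.

```python
from typing import Iterable


def limit_hyperlinks(links: Iterable[str], max_hyperlinks: int = 50) -> dict:
    """Return a capped list of links plus the total number observed.

    Hypothetical helper for illustration; not the actual ScanHtml code.
    """
    # Assumption: duplicates are dropped, keeping first-seen order.
    unique_links = list(dict.fromkeys(links))
    return {
        # Only the first `max_hyperlinks` entries are kept in the event.
        "hyperlinks": unique_links[:max_hyperlinks],
        # The count reflects every link observed, even past the limit.
        "hyperlinks_count": len(unique_links),
    }


# Example: 1200 observed links, capped at the default-style limit of 50.
observed = [f"https://www.example.com/page{i}" for i in range(1200)]
result = limit_hyperlinks(observed, max_hyperlinks=50)
```

Keeping the count separate from the list means an analyst can still tell that a page was link-heavy even though only the first 50 entries are stored.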

Describe testing procedures
Created the test test_scan_html_hyperlinks in test_scan_html.py, which matches on hyperlinks_count and ensures that the limit functions properly.
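
The shape of such a test can be sketched as follows. This is a self-contained illustration using only the standard library, not the contents of test_scan_html.py: `LinkCollector`, `scan_links`, and `test_hyperlink_limit` are hypothetical names, and the real test runs against the ScanHtml scanner and its fixtures.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from anchor tags (illustrative stand-in)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def scan_links(html: str, max_hyperlinks: int = 50) -> dict:
    """Hypothetical scan step: cap the stored links, report the full count."""
    parser = LinkCollector()
    parser.feed(html)
    return {
        "hyperlinks": parser.links[:max_hyperlinks],
        "hyperlinks_count": len(parser.links),
    }


def test_hyperlink_limit():
    # Seven distinct links, but a limit of five.
    html = "".join(
        f'<a href="https://www.example.com/{i}">link</a>' for i in range(7)
    )
    event = scan_links(html, max_hyperlinks=5)
    assert len(event["hyperlinks"]) == 5  # limit is enforced
    assert event["hyperlinks_count"] == 7  # count ignores the limit


test_hyperlink_limit()
```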

Sample output

...
    },
    "html": {
      "elapsed": 0.002508,
      "hyperlinks": [
        "https://www.example4.com",
        "https://www.example3.com",
        "https://www.example2.com",
        "https://www.example.com/downloads/example.pdf",
        "https://www.example.com/images/example.jpg",
        "https://www.example.com",
        "https://www.example5.com"
      ],
      "hyperlinks_count": 7,
      "title": "Sample HTML File",
      "total": {
        "extracted": 0,
        "forms": 0,
        "frames": 0,
        "inputs": 0,
        "scripts": 0,
        "spans": 0
      }
...

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of and tested my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings

@phutelmyer phutelmyer merged commit 1d24d9d into master Jan 27, 2023
@phutelmyer phutelmyer deleted the scan-html-link-limits branch January 30, 2023 12:54