Arkoi

A cross-engine SEO poisoning detector for software downloads.


Why this exists

SEO poisoning is a real and underappreciated attack vector. Threat actors register convincing-looking domains, stuff them with the right keywords, and buy or manipulate their way into the top results on Google, Bing, or Brave. An unsuspecting user searches for "Siemens TIA Portal V17 download", clicks the third result, and downloads a trojanised installer.

The key insight behind Arkoi: it's hard to poison every search engine simultaneously at scale.

A malicious domain might crack the top 3 on Google. But if it is absent from Bing, Brave, DuckDuckGo, Yahoo, and Yandex for the same query, that inconsistency is itself a signal. Arkoi cross-references results across all six engines and asks a different question than any URL scanner:

Given that I searched for X, does this result actually belong here — or was it placed here to deceive me?
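The cross-engine idea can be sketched in a few lines. This is an illustration only, not Arkoi's actual code: the function names, the input shape (engine name mapped to an ordered list of result URLs), and the "top 3 on exactly one engine" rule are all assumptions made for the example.

```python
from urllib.parse import urlsplit


def domain_of(url: str) -> str:
    """Return the lowercase hostname of a URL without a leading 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def consensus(results_by_engine: dict[str, list[str]]) -> dict[str, dict[str, int]]:
    """Map each domain to {engine: best 1-based rank} across responding engines."""
    seen: dict[str, dict[str, int]] = {}
    for engine, urls in results_by_engine.items():
        for rank, url in enumerate(urls, start=1):
            ranks = seen.setdefault(domain_of(url), {})
            ranks.setdefault(engine, rank)  # first hit per engine is the best rank
    return seen


def is_single_engine_promotion(ranks: dict[str, int], engines_responded: int,
                               top_n: int = 3) -> bool:
    """Hypothetical rule: top-N on exactly one engine while >= 3 engines responded."""
    return engines_responded >= 3 and len(ranks) == 1 and min(ranks.values()) <= top_n
```

A domain that every responding engine returns gets one rank entry per engine; a domain with a single entry near the top is the pattern worth a closer look.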

This is a personal experiment and an ongoing demo. It is not production security software.


How it works

Query → Parse intent (vendor, software, version)
      → Fetch all engines in parallel (async)
      → For each result, run signals concurrently:
            ① Vendor domain verification
            ② Cross-engine consensus scoring
            ③ Rank anomaly detection
            ④ Query-result relevance + path analysis
            ⑤ URLhaus threat intel lookup
            ⑥ Domain age (WHOIS)
      → Assemble verdict from signals
      → Render results + summary
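
As a rough sketch of the flow above, the two levels of concurrency (engines in parallel, then signals per result) map naturally onto asyncio.gather. The fetchers and checks below are stand-ins that return canned values, and every name here is hypothetical; the real modules are fetcher.py and signals.py.

```python
import asyncio


async def fetch_engine(engine: str, query: str) -> list[str]:
    """Stand-in for a real search call; 'yandex' simulates an engine that fails."""
    await asyncio.sleep(0)
    if engine == "yandex":
        raise TimeoutError(f"{engine} did not respond")
    return [f"https://example.com/{engine}/result"]


async def check_vendor(url: str) -> tuple[str, str]:
    await asyncio.sleep(0)  # placeholder for real work
    return ("vendor", "UNRELATED")


async def check_age(url: str) -> tuple[str, str]:
    await asyncio.sleep(0)  # placeholder for a WHOIS lookup
    return ("age", "UNKNOWN")


async def analyse(url: str) -> dict[str, str]:
    """Run all signal checks for one result concurrently."""
    return dict(await asyncio.gather(check_vendor(url), check_age(url)))


async def run(query: str, engines: list[str]) -> tuple[dict[str, dict[str, str]], int]:
    # return_exceptions=True keeps one failing engine from cancelling the rest.
    batches = await asyncio.gather(
        *(fetch_engine(e, query) for e in engines), return_exceptions=True
    )
    responded = [b for b in batches if not isinstance(b, BaseException)]
    urls = sorted({u for batch in responded for u in batch})
    signals = await asyncio.gather(*(analyse(u) for u in urls))
    return dict(zip(urls, signals)), len(responded)
```

Calling asyncio.run(run("Wireshark install", ["google", "bing", "yandex"])) returns the per-URL signal dicts plus the number of engines that responded, which is the denominator the consensus signal needs.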

No numeric risk scores. Verdicts are categorical with explicit, human-readable reasoning:

Verdict         Meaning
✓ TRUSTED       Official vendor domain or trusted partner, consistent across engines
? UNVERIFIED    No red flags found, but no relationship to queried vendor confirmed
⚠ SUSPICIOUS    One or more moderate signals: new domain, SEO anomaly, suspicious path
✗ DECEPTIVE     Strong indicators of deceptive placement: impersonation, piracy signals, single-engine promotion
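
One way to picture categorical verdicts with attached reasons is a small precedence chain: a verified vendor domain wins, any strong signal yields DECEPTIVE, any moderate signal yields SUSPICIOUS, and everything else is UNVERIFIED. The sketch below assumes that ordering; the Signals fields and some of the reason strings are invented for illustration, and verdict.py may weigh things differently.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    TRUSTED = "✓ TRUSTED"
    UNVERIFIED = "? UNVERIFIED"
    SUSPICIOUS = "⚠ SUSPICIOUS"
    DECEPTIVE = "✗ DECEPTIVE"


@dataclass
class Signals:
    """Hypothetical bundle of per-result signal outcomes."""
    vendor_match: bool = False
    impersonation: bool = False
    piracy_path: bool = False
    single_engine_promotion: bool = False
    new_domain: bool = False
    rank_anomaly: bool = False
    engines_listing: int = 0


def assemble(s: Signals) -> tuple[Verdict, list[str]]:
    """Return a verdict plus the human-readable reasons behind it."""
    if s.vendor_match:
        return Verdict.TRUSTED, [
            "Official vendor domain",
            f"Consistent across {s.engines_listing} search engine(s)",
        ]
    strong = [reason for flag, reason in (
        (s.impersonation, "Domain imitates the queried vendor"),
        (s.piracy_path, "URL path contains piracy/bypass signals on an unverified domain"),
        (s.single_engine_promotion, "Promoted on a single engine only"),
    ) if flag]
    if strong:
        return Verdict.DECEPTIVE, strong
    moderate = [reason for flag, reason in (
        (s.new_domain, "Domain was registered recently"),
        (s.rank_anomaly, "Rank differs sharply between engines"),
    ) if flag]
    if moderate:
        return Verdict.SUSPICIOUS, moderate
    return Verdict.UNVERIFIED, ["No relationship to the queried vendor confirmed"]
```

Returning the reasons alongside the verdict is what lets the renderer print the explanation tree under each result instead of a bare score.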

Requirements

  • Python 3.10+
  • A running SearXNG instance on http://127.0.0.1:8080

SearXNG is a self-hosted metasearch engine. Arkoi queries it for Google, Bing, Brave, DuckDuckGo, Yahoo, and Yandex results simultaneously. You need to run your own instance; this keeps queries private and avoids API rate limits.

Quick SearXNG setup with Docker:

docker run -d -p 8080:8080 searxng/searxng
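
To check that the instance answers programmatic queries, a small synchronous probe like the one below may help. It assumes Arkoi talks to SearXNG's /search endpoint with format=json, which the README does not state; note that SearXNG only serves JSON when json is listed under search.formats in its settings.yml, and a fresh container may reject the request until that is enabled. The engine names and the result fields read here (url, engines, engine) reflect SearXNG's JSON output as commonly documented and should be verified against your instance.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

SEARXNG_URL = "http://127.0.0.1:8080"  # address from the Requirements section
ENGINES = ("google", "bing", "brave", "duckduckgo", "yahoo", "yandex")  # assumed names


def build_search_url(query: str, engines=ENGINES, base: str = SEARXNG_URL) -> str:
    """Build a SearXNG search URL that asks for JSON from specific engines."""
    params = {"q": query, "format": "json", "engines": ",".join(engines)}
    return f"{base}/search?{urlencode(params)}"


def group_by_engine(payload: dict) -> dict[str, list[str]]:
    """Group result URLs by the engine(s) that returned them, keeping order."""
    grouped: dict[str, list[str]] = {}
    for item in payload.get("results", []):
        for engine in item.get("engines") or [item.get("engine", "unknown")]:
            grouped.setdefault(engine, []).append(item["url"])
    return grouped


def search(query: str) -> dict[str, list[str]]:
    """Blocking helper for a quick manual check; requires a running instance."""
    with urlopen(build_search_url(query), timeout=10) as resp:
        return group_by_engine(json.load(resp))
```

Running search("Wireshark install") against a correctly configured instance should return one URL list per engine that responded.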

Installation

git clone https://github.com/404saint/arkoi.git
cd arkoi
python -m venv venv
source venv/bin/activate      # Windows: venv\Scripts\activate
pip install -r requirements.txt

Usage

# Interactive
python arkoi.py

# Direct query
python arkoi.py "Siemens TIA Portal V17 download"
python arkoi.py "AutoCAD 2025 download"
python arkoi.py "Wireshark install"
python arkoi.py "PyCharm professional"

Example output

══════════════════════════════════════════════════════════════════════════════
  ARKOI — SEO Poisoning Detector
══════════════════════════════════════════════════════════════════════════════
  Query  : Siemens TIA Portal V17 download
  Vendor : Siemens
  Version: V17

  Fetching results from all engines... done in 3.0s (21 unique domains, 3/6 engines responded)
  Running signal checks... done in 6.0s

[✓ TRUSTED    ] sieportal.siemens.com
  Engines: bing #8  ·  google #2  ·  brave #1  (3/3 engines)
  Vendor : VENDOR_MATCH
  ├─ Official vendor domain
  └─ Consistent across 3 search engine(s)
  ↳ This is a verified source.

[✗ DECEPTIVE  ] plc4me.com
  Engines: google #3  (1/3 engines)
  Vendor : UNRELATED
  └─ URL path contains piracy/bypass signals on an unverified domain
  ↳ Avoid this result. Do not download anything from this domain.

  ✓ Safest result : sieportal.siemens.com
  ✗ Avoid         : plc4me.com, plcshare.com
  Total runtime   : 9.1s
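
The "piracy/bypass signals" reason in the output above could come from a token check on the URL path. The sketch below is a guess at the general shape: the token list is invented for illustration, and the example URLs are made up rather than taken from any real result.

```python
import re
from urllib.parse import urlsplit

# Hypothetical token list; the real signal set in signals.py may differ.
PIRACY_TOKENS = {"crack", "cracked", "keygen", "activator", "torrent", "warez", "nulled"}


def piracy_tokens_in(url: str) -> set[str]:
    """Return the piracy-related tokens found in a URL's path and query string."""
    parts = urlsplit(url)
    text = f"{parts.path} {parts.query}".lower()
    # Split on anything that is not a letter or digit so 'full-crack' yields 'crack'.
    return PIRACY_TOKENS & set(re.split(r"[^a-z0-9]+", text))
```

Matching whole tokens rather than substrings avoids flagging unrelated words that merely contain one of the listed strings.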

Project structure

arkoi/
├── arkoi.py          # Entry point and pipeline orchestrator
├── query_parser.py   # Extracts vendor, version, tokens from query
│                     # Contains the vendor registry and product aliases
├── fetcher.py        # Async multi-engine search via SearXNG
├── signals.py        # All signal checks (vendor, consensus, age, malware...)
├── verdict.py        # Assembles signals into verdict + reasons
├── renderer.py       # Terminal output formatting
└── requirements.txt

Vendor coverage

Arkoi currently recognises vendors and products across:

  • Industrial / PLC: Siemens, Rockwell, Schneider, Honeywell, Beckhoff, Omron, ABB
  • CAD / CAE / Simulation: Autodesk, ANSYS, PTC, Dassault, SolidWorks, Altair
  • Developer tools: Microsoft, JetBrains, HashiCorp, Docker, Atlassian, GitHub, GitLab
  • Creative: Adobe, Affinity, Blender, Blackmagic, Foundry, Maxon
  • Scientific / Data: MathWorks, NI, Wolfram, ESRI, Anaconda
  • Networking / Security: Cisco, Palo Alto, Fortinet, Wireshark, Nmap, PuTTY
  • Remote access: AnyDesk, TeamViewer, Zoom, Slack
  • Virtualisation / OS: VMware, VirtualBox, Ubuntu, Red Hat
  • Cloud: AWS, Google Cloud, Apple
  • Databases: Oracle, MySQL, PostgreSQL, MongoDB, Elastic

Product aliases are supported — searching for "autocad" automatically resolves to the Autodesk vendor profile. See query_parser.py for the full registry.


Known limitations

  • SearXNG engine availability: Not all six engines respond on every query. Consensus scoring adapts to however many responded, but results vary based on your SearXNG configuration.
  • WHOIS coverage: Many domains show UNKNOWN age due to privacy protection or WHOIS rate limiting. Age is a supporting signal, not a primary one.
  • No vendor in query: If the query doesn't match any known vendor or product alias, vendor verification is skipped and all results fall back to consensus + anomaly scoring only.
  • Not a replacement for VirusTotal: Arkoi is specifically designed to catch SEO poisoning through search result analysis. For deep malware analysis of a specific file or URL, use dedicated tools.

Open issues

See Issues for known bugs and planned improvements. Contributions are welcome — please read CONTRIBUTING.md first.


License

MIT
