
PhishIntention: Phishing detection through webpage intention


PhishIntention

  • This is the official implementation of "Inferring Phishing Intention via Webpage Appearance and Dynamics: A Deep Vision-Based Approach" (USENIX'22): link to paper, link to our website

  • Existing reference-based phishing detectors:

    • ❌ Prone to false positives because they capture only brand intention
  • The contributions of our paper:

    • ✅ We propose a reference-based phishing detection system that captures both brand intention and credential-taking intention. To the best of our knowledge, this is the first work that systematically analyzes both brand intention and credential-taking intention for phishing detection.
    • ✅ We set up a phishing monitoring system. It reports phishing webpages daily with the highest precision among the state-of-the-art phishing detection solutions compared.

Framework

Input: a screenshot, Output: Phish/Benign, Phishing target

  • Step 1: Run the Abstract Layout detector to get the predicted elements

  • Step 2: Enter Siamese Logo Comparison

    • If the Siamese model reports no target, return Benign, None
    • Else (the Siamese model reports a target), go to Step 3: CRP (credential-requiring page) classifier
  • Step 3: CRP classifier

    • If the CRP classifier reports that it is a CRP page, go to Step 5: Return
    • Else if it is not a CRP page and the CRP Locator has not been executed before, go to Step 4: CRP Locator
    • Else (it is not a CRP page, but the CRP Locator has already been executed), return Benign, None
  • Step 4: CRP Locator

    • Find login/signup links and click them; if a CRP page is reached at the end, go back to Step 1 (Abstract Layout detector) with the updated URL and screenshot
    • Else (no CRP page can be reached), return Benign, None
  • Step 5: Return

    • If a CRP page is reached and the Siamese model reports a target: return Phish, Phishing target
    • Else return Benign, None
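The control flow of the five steps above can be sketched roughly as follows. This is only an illustration of the decision logic, not the repository's actual code: the function name `decide` and the four callables (`detect_layout`, `match_logo`, `is_crp`, `locate_crp`) are hypothetical stand-ins for the layout detector, Siamese logo comparison, CRP classifier, and CRP locator.

```python
from typing import Callable, List, Optional, Tuple

Page = Tuple[str, str]  # (url, screenshot_path); a hypothetical representation


def decide(
    page: Page,
    detect_layout: Callable[[str], List],                # hypothetical: Step 1
    match_logo: Callable[[str, List], Optional[str]],    # hypothetical: Step 2
    is_crp: Callable[[str, List], bool],                 # hypothetical: Step 3
    locate_crp: Callable[[Page], Optional[Page]],        # hypothetical: Step 4
) -> Tuple[str, Optional[str]]:
    """Return ("Phish", target) or ("Benign", None) following Steps 1-5."""
    url, shot = page
    locator_used = False
    while True:
        elements = detect_layout(shot)           # Step 1: predicted elements
        target = match_logo(shot, elements)      # Step 2: logo comparison
        if target is None:
            return "Benign", None                # no brand target reported
        if is_crp(shot, elements):               # Step 3: CRP classifier
            return "Phish", target               # Step 5: CRP + target
        if locator_used:
            return "Benign", None                # locator already tried once
        locator_used = True
        new_page = locate_crp((url, shot))       # Step 4: follow login links
        if new_page is None:
            return "Benign", None                # no CRP page reachable
        url, shot = new_page                     # back to Step 1, updated page
```

Passing the components in as callables keeps the sketch independent of any model code, so the branching can be checked with simple stubs before the real detectors are plugged in.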

Project structure

|_ configs: Configuration files for the object detection models and the global configurations
|_ modules: Inference code for layout detector, CRP classifier, CRP locator, and OCR-aided siamese model
|_ models: the model weights and reference list
|_ ocr_lib: external code for the OCR encoder
|_ utils
|_ configs.py: load configuration files
|_ phishintention.py: main script

Setup

Step 1: Install dependencies:

  • Prerequisite: Pixi installed

  • For Linux/Mac,

    export KMP_DUPLICATE_LIB_OK=TRUE
    git clone https://github.com/lindsey98/PhishIntention.git
    cd PhishIntention
    pixi install
    chmod +x setup.sh
    ./setup.sh
  • For Windows,

    git clone https://github.com/lindsey98/PhishIntention.git
    cd PhishIntention
    pixi install
    setup.bat

Step 2: Install chromedriver:

  • Check your Chrome binary version, either by opening chrome://version/ in your browser or by running google-chrome --version from the command line.
  • Download the corresponding chromedriver from this repository. For example, if you are using 135.0.7049.42 on Linux, then you should look for 135.0.7049.42 chromedriver-linux64.zip.
  • Unzip the downloaded archive and put the chromedriver binary (chromedriver on Linux/Mac, chromedriver.exe on Windows) under ./chromedriver-linux64/.
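As a quick sanity check for the version matching described above, the sketch below compares the version strings printed by the browser and the driver. It is not part of the repository: the helper names are hypothetical, and it assumes both `--version` outputs contain a four-part version number such as 135.0.7049.42.

```python
import re
import subprocess
from typing import List, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Extract the first four-part version number (e.g. 135.0.7049.42)."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in: {text!r}")
    return tuple(int(part) for part in match.groups())


def versions_match(chrome_output: str, driver_output: str) -> bool:
    """True when both outputs report exactly the same version."""
    return parse_version(chrome_output) == parse_version(driver_output)


def read_version(command: List[str]) -> Tuple[int, ...]:
    """Run a '--version' command and parse its standard output."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return parse_version(result.stdout)


# Example usage (assumes both binaries exist at these locations):
#   chrome = read_version(["google-chrome", "--version"])
#   driver = read_version(["./chromedriver-linux64/chromedriver", "--version"])
#   print("match" if chrome == driver else "mismatch", chrome, driver)
```

The comparison here is exact, mirroring the example in the list above; relaxing it to the major version only would be a different policy and is not something this README states.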

Running PhishIntention from Command Line

When you run the scripts for the first time, the reference list needs to be loaded, which may take some time.

pixi run python phishintention.py --folder <folder you want to test e.g. datasets/test_sites> --output_txt <where you want to save the results e.g. test.txt>

The testing folder should have the following structure:

test_site_1
|__ info.txt (Write the URL)
|__ shot.png (Save the screenshot)
|__ html.txt (HTML source code, optional)
test_site_2
|__ info.txt (Write the URL)
|__ shot.png (Save the screenshot)
|__ html.txt (HTML source code, optional)
......
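A folder in this layout can be prepared and checked with a few lines of Python. The sketch below is an illustration only: `add_test_site` and `incomplete_sites` are hypothetical helpers, not functions provided by the repository, and the screenshot bytes are assumed to come from whatever capture tool you use.

```python
from pathlib import Path
from typing import List, Optional


def add_test_site(root: str, name: str, url: str,
                  screenshot_png: bytes, html: Optional[str] = None) -> Path:
    """Create <root>/<name>/ with info.txt, shot.png and optional html.txt."""
    site = Path(root) / name
    site.mkdir(parents=True, exist_ok=True)
    (site / "info.txt").write_text(url, encoding="utf-8")   # the URL
    (site / "shot.png").write_bytes(screenshot_png)         # the screenshot
    if html is not None:
        (site / "html.txt").write_text(html, encoding="utf-8")  # optional
    return site


def incomplete_sites(root: str) -> List[str]:
    """List site folders that lack the required info.txt or shot.png."""
    missing = []
    for site in sorted(Path(root).iterdir()):
        if not site.is_dir():
            continue
        if not (site / "info.txt").is_file() or not (site / "shot.png").is_file():
            missing.append(site.name)
    return missing
```

Running `incomplete_sites("datasets/test_sites")` before the main script gives a quick list of folders that would be missing required inputs.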

Miscellaneous

  • In our paper, we also implement several phishing detection and identification baselines; see here

Citation

Please consider citing our work :)

@inproceedings{liu2022inferring,
  title={Inferring Phishing Intention via Webpage Appearance and Dynamics: A Deep Vision-Based Approach},
  author={Liu, Ruofan and Lin, Yun and Yang, Xianglin and Ng, Siang Hwee and Divakaran, Dinil Mon and Dong, Jin Song},
  booktitle={31st {USENIX} Security Symposium ({USENIX} Security 22)},
  year={2022}
}

If you have any issues running our code, you can raise an issue or send an email to [email protected], [email protected], [email protected]
