
Machine Generated Code Detection

This repository provides a solution for detecting machine-generated code using AI-based models. It employs pretrained language models and fine-tuning techniques to analyze whether a given piece of code is AI-generated or human-written.

Overview

The project leverages transformer models from Hugging Face to determine the origin of code (machine-generated vs. human-written). It provides a Flask-based backend for serving the analysis and a minimalistic HTML frontend for interacting with the API. This project builds upon the research presented in the paper Binoculars. While the original implementation of Binoculars lacked the capability to detect AI-generated code, we have extended its functionality to include robust AI code detection.
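
As we understand the Binoculars approach, the detector compares how two closely related language models (an "observer" and a "performer") score the same text: machine-generated text tends to look far less surprising to the observer relative to the performer's own predictions. The sketch below illustrates this idea; the model names, score formula, and interpretation are our assumptions, not necessarily the repository's implementation.

    # Hypothetical sketch of a Binoculars-style score; the repository's
    # implementation may differ in model choice and score definition.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    OBSERVER = "HuggingFaceTB/SmolLM-360M"            # assumed observer model
    PERFORMER = "HuggingFaceTB/SmolLM-360M-Instruct"  # assumed performer model

    tok = AutoTokenizer.from_pretrained(OBSERVER)
    observer = AutoModelForCausalLM.from_pretrained(OBSERVER).eval()
    performer = AutoModelForCausalLM.from_pretrained(PERFORMER).eval()

    @torch.no_grad()
    def binoculars_score(code: str) -> float:
        ids = tok(code, return_tensors="pt").input_ids
        obs_logits = observer(ids).logits[:, :-1]
        perf_logits = performer(ids).logits[:, :-1]
        targets = ids[:, 1:]

        # Log-perplexity of the performer on the code.
        log_ppl = torch.nn.functional.cross_entropy(
            perf_logits.transpose(1, 2), targets)

        # Cross-perplexity: the observer's next-token distribution
        # scored against the performer's predictions.
        x_ppl = -(obs_logits.softmax(-1)
                  * perf_logits.log_softmax(-1)).sum(-1).mean()

        # Lower scores suggest machine-generated code.
        return (log_ppl / x_ppl).item()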


Features

  • Model Integration: Uses Hugging Face's pretrained models (e.g., SmolLM-360M) for analysis.
  • Frontend: A simple HTML page to upload code and display the results.
  • Backend API: Flask server that processes the requests and returns AI analysis results.
  • Custom Model Fine-Tuning: Scripts for fine-tuning the models using specific datasets.
  • Cross-Origin Resource Sharing (CORS): Enables integration with external services.

Frameworks and Libraries

  • transformers (by Hugging Face) for pretrained language models (e.g., SmolLM)
  • torch for model training and inference
  • sklearn for evaluation metrics

Setup Instructions

Prerequisites

  1. Python 3.8 or higher.
  2. A valid Hugging Face authentication token.
  3. GPU support is required for running large models.

Installation

  1. Clone the repository:

    git clone https://github.com/your_username/Machine_Generated_Code_Detection.git
    cd Machine_Generated_Code_Detection
  2. Create a .gitignore file:

    # Ignore Python virtual environments
    venv/
    __pycache__/
    
    # Ignore Hugging Face token
    hugging_face_auth_token.txt
  3. Create a virtual environment:

    python3 -m venv env_name
  4. Activate the virtual environment:

    source env_name/bin/activate
  5. Install required Python packages:

    pip install -r requirements.txt
  6. Add your Hugging Face authentication token:

    • Save the token in the hugging_face_auth_token.txt file (see the loading sketch below).
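
A minimal sketch of how the token file might be consumed at startup; the file name comes from this README, while the loading code itself is an assumption:

    # Read the Hugging Face token from the (gitignored) file and log in.
    from huggingface_hub import login

    with open("hugging_face_auth_token.txt") as f:
        token = f.read().strip()
    login(token=token)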

Usage

Fine-Tuning the Model

  1. Open the file model_fine_tuning.py and make the changes below (optional).
  2. Select the model and dataset of your choice:
    # MODEL_TO_FINETUNE = "HuggingFaceTB/SmolLM-360M"
    # MODEL_TO_FINETUNE = "HuggingFaceTB/SmolLM-360M-Instruct"
    # SAVE_NAME = "SmolLM-360M-LORA"
    
    # FINETUNE_DATASET = "ise-uiuc/Magicoder-Evol-Instruct-110K"
    # FINETUNE_DATASET = "bigcode/starcoderdata"
    # FINETUNE_DATASET = "iamtarun/code_instructions_120k_alpaca"
  3. Set the number of epochs of your choice.
  4. Execute the file:
     python model_fine_tuning.py
  5. After fine-tuning completes, the script saves the model under fine_tuned_model and creates the results and log directories with their contents (a sketch of this workflow follows).
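
For orientation, below is a minimal LoRA fine-tuning sketch of the kind of workflow model_fine_tuning.py implements; the hyperparameters, dataset field names, and target modules are assumptions, and the actual script may differ.

    # Hypothetical LoRA fine-tuning sketch; model_fine_tuning.py may differ.
    from datasets import load_dataset
    from peft import LoraConfig, get_peft_model
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling,
                              Trainer, TrainingArguments)

    MODEL_TO_FINETUNE = "HuggingFaceTB/SmolLM-360M"
    FINETUNE_DATASET = "iamtarun/code_instructions_120k_alpaca"

    tokenizer = AutoTokenizer.from_pretrained(MODEL_TO_FINETUNE)
    tokenizer.pad_token = tokenizer.eos_token  # causal LMs often lack a pad token
    model = AutoModelForCausalLM.from_pretrained(MODEL_TO_FINETUNE)

    # Wrap the base model with LoRA adapters so only a small fraction
    # of the parameters is trained.
    model = get_peft_model(model, LoraConfig(
        r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
        task_type="CAUSAL_LM"))

    def tokenize(batch):
        # "output" is an assumed field name for the code text.
        return tokenizer(batch["output"], truncation=True, max_length=512)

    dataset = load_dataset(FINETUNE_DATASET, split="train")
    dataset = dataset.map(tokenize, batched=True,
                          remove_columns=dataset.column_names)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="results", logging_dir="log",
                               num_train_epochs=3,
                               per_device_train_batch_size=4),
        train_dataset=dataset,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()
    model.save_pretrained("fine_tuned_model")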

Running the Server

  1. Start the Flask server:
    python backend.py
  2. Open the frontend in a browser:
    • The server runs by default on http://localhost:5000.
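
For context, backend.py exposes an /analyze endpoint along these lines; the sketch below is our assumption of its shape (the classify helper is a placeholder), matching the request/response formats documented under API Endpoints.

    # Hypothetical sketch of backend.py; the actual file may differ.
    from flask import Flask, jsonify, request
    from flask_cors import CORS

    app = Flask(__name__)
    CORS(app)  # allow the HTML frontend to call the API cross-origin

    def classify(code: str) -> float:
        # Placeholder stand-in for the real model-based classifier.
        return 0.95

    @app.route("/analyze", methods=["POST"])
    def analyze():
        payload = request.get_json()
        code = payload.get("content", "")
        score = classify(code)

        # 0.5 is an assumed decision threshold, not necessarily the one used.
        verdict = "AI Generated" if score >= 0.5 else "Human Written"
        return jsonify({"codeclassifier": {
            "is_ai_generated": "yes" if score >= 0.5 else "no",
            "score": score,
            "result": f"{verdict} (Score: {score:.4f})",
        }})

    if __name__ == "__main__":
        app.run(port=5000)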

Frontend

  • Paste the code you want to analyze into the text box and click "Analyze Code".
  • The result will display whether the code is AI-generated, along with the confidence score.

Folder and File Structure
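
Key files and directories referenced elsewhere in this README (gathered from the sections below; the full repository layout may contain more):

  • backend.py: Flask server
  • model_fine_tuning.py: fine-tuning script
  • codeclassifier: classifier used for integration testing
  • code_detector_validation_pipeline.py: code detection validation pipeline
  • test_prompt.txt: integration-test input
  • validate_dataset/TestDataset.csv: system-test dataset
  • hugging_face_auth_token.txt: Hugging Face token (gitignored)
  • fine_tuned_model/: saved fine-tuned model
  • results/ and log/: training outputs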


API Endpoints

/analyze

  • Method: POST
  • Description: Analyze the submitted code to determine if it is machine-generated.
  • Request Format:
    {
        "content": "<code to analyze>",
        "type": "code"
    }
  • Response Format:
    {
        "codeclassifier": {
            "is_ai_generated": "yes/no",
            "score": 0.95,
            "result": "AI Generated (Score: 0.9500)"
        }
    }
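
For example, the endpoint can be exercised from Python with requests (assuming the server is running locally on the default port):

    import requests

    resp = requests.post(
        "http://localhost:5000/analyze",
        json={"content": "def add(a, b):\n    return a + b", "type": "code"},
    )
    print(resp.json()["codeclassifier"]["result"])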

Datasets

Dataset Description

  • The project uses datasets containing human-written and machine-generated code for model training and validation, generated following the approach described in the research paper.
  • Sources: open-source repositories, GPT-generated code snippets, and the referenced research paper.
  • Format: JSON or text files, where each entry contains:
    • Code snippet.
    • Label specifying if it's machine-generated (1) or human-written (0).
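
As an illustration, a single dataset entry might look like the following; the field names are our assumption, and the actual files may use different keys:

    # Hypothetical dataset entry matching the format described above.
    import json

    entry = {
        "code": "for i in range(10):\n    print(i)",
        "label": 1,  # 1 = machine-generated, 0 = human-written
    }
    print(json.dumps(entry, indent=2))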

Test

  • Integration Testing: performed by running the codeclassifier file, which reads its input from test_prompt.txt.
  • System Testing: The tests for the code detection pipeline (code_detector_validation_pipeline.py) are provided in validate_dataset/TestDataset.csv.
  • Test Cases:
    • Valid machine-generated code is labelled as 1.
    • Valid human-written code is labelled as 0.

Evaluation

  • Accuracy, precision, and recall are computed with sklearn, and the confusion matrix is plotted via matplotlib and seaborn (a sketch follows).
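
A minimal sketch of how these metrics might be computed and plotted; the label arrays below are placeholders, not real results:

    # Evaluation sketch; y_true/y_pred stand in for real pipeline outputs.
    import matplotlib.pyplot as plt
    import seaborn as sns
    from sklearn.metrics import (accuracy_score, confusion_matrix,
                                 precision_score, recall_score)

    y_true = [1, 0, 1, 1, 0]  # ground-truth labels (1 = machine, 0 = human)
    y_pred = [1, 0, 1, 0, 0]  # model predictions

    print("accuracy :", accuracy_score(y_true, y_pred))
    print("precision:", precision_score(y_true, y_pred))
    print("recall   :", recall_score(y_true, y_pred))

    sns.heatmap(confusion_matrix(y_true, y_pred), annot=True, fmt="d",
                xticklabels=["human", "AI"], yticklabels=["human", "AI"])
    plt.xlabel("Predicted")
    plt.ylabel("Actual")
    plt.show()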

Results

  • The model achieves 87% accuracy in distinguishing machine-generated code from human-written code (see the ROC AUC plot in the repository).

Challenges

  • Optimizing Fine-Tuning with Limited Resources:
    The model fine-tuning process was constrained by limited GPU, CPU, and computational resources. As a result, we were able to fine-tune the model over a limited number of epochs.

  • Long Training Times vs. Resource Availability:
    Fine-tuning the model for 3 epochs required approximately 18 hours. However, the project was executed on a Hopper system, where the maximum session availability was restricted to 12 hours, presenting a significant challenge.

  • Hyperparameter Optimization and Threshold Tuning:
    Since the algorithms were implemented from scratch with custom improvements, determining the optimal thresholds and hyperparameters to accurately detect AI-generated content was a challenging, highly experimental process.

  • Curating High-Quality Datasets:
    Identifying and sourcing high-quality datasets with a balanced mix of human-generated and machine-generated code required significant effort.

  • Addressing Dataset Bias:
    Special attention was given to mitigating potential biases present in machine-generated code datasets to ensure fairness and accuracy in the model’s predictions.


Hardware Resources

  • NVIDIA A100 (80GB VRAM) on an HPC cluster.
  • Only one GPU with 40GB was available per session.

Conclusion

This project demonstrates the feasibility of detecting machine-generated code using state-of-the-art transformer models. Future work involves refining models, expanding datasets, and deploying the solution in production environments.


Contributors

  • Suhas
  • Manish
  • Kashish

References

Citation

@article{hans2024spotting,
  title={Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text},
  author={Hans, Abhimanyu and Schwarzschild, Avi and Cherepanova, Valeriia and Kazemi, Hamid and Saha, Aniruddha and Goldblum, Micah and Geiping, Jonas and Goldstein, Tom},
  journal={arXiv preprint arXiv:2401.12070},
  year={2024}
}
