MaheepChaudhary
🛡️ Busy making DENTS in the MULTIVERSE


Implementation of the paper--SafetyNet: Detecting Harmful Outputs in LLMs by Modeling and Monitoring Deceptive Behaviors

HTML 1 Updated Nov 10, 2025
Python 11 2 Updated Mar 24, 2025

Language model alignment-focused deep learning curriculum

1,490 119 Updated Aug 19, 2024
Jupyter Notebook 17 3 Updated Nov 15, 2024

Codebase for Obfuscated Activations Bypass LLM Latent-Space Defenses

Jupyter Notebook 25 4 Updated Feb 11, 2025

Safety at Scale: A Comprehensive Survey of Large Model Safety

204 5 Updated Feb 19, 2025

s1: Simple test-time scaling

Python 6,593 762 Updated Jun 25, 2025

An interactive HTML pretty-printer for machine learning research in IPython notebooks.

Python 451 24 Updated Aug 8, 2025

🤗 smolagents: a barebones library for agents that think in code.

Python 23,900 2,116 Updated Nov 12, 2025

🌐 Make websites accessible for AI agents. Automate tasks online with ease.

Python 72,440 8,610 Updated Nov 12, 2025

A Python SDK for LLM fine-tuning and inference on RunPod infrastructure

Python 16 2 Updated Nov 3, 2025

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.

Python 48,196 3,953 Updated Nov 12, 2025

Kura is a simple reproduction of the CLIO paper which uses language models to label user behaviour before clustering them based on embeddings recursively. This helps us understand user behaviour on…

Python 359 36 Updated Sep 10, 2025

Situational Awareness Dataset

HTML 41 6 Updated Dec 14, 2024

Deep learning for dummies. All the practical details and useful utilities that go into working with real models.

Python 822 42 Updated Jul 29, 2025

Code for reproducing sections 4 and 6.2 of the paper "Obfuscated Activations Bypass LLM Latent-Space Defenses"

Jupyter Notebook 2 3 Updated Feb 8, 2025

Attribution-based Parameter Decomposition

Python 31 8 Updated Jun 11, 2025

LLM Transparency Tool (LLM-TT), an open-source interactive toolkit for analyzing internal workings of Transformer-based language models. *Check out demo at* https://huggingface.co/spaces/facebook/l…

Python 843 68 Updated Dec 3, 2024

Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"

1,795 149 Updated Jun 17, 2025

A library for efficient patching and automatic circuit discovery.

Python 79 19 Updated Jul 21, 2025

Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from easy questions to hard

Python 28 5 Updated May 23, 2024

A library for mechanistic anomaly detection

Python 22 12 Updated Jan 9, 2025
Jupyter Notebook 22 2 Updated Dec 11, 2024
Python 134 33 Updated Oct 19, 2025

Tips for Writing a Research Paper using LaTeX

TeX 3,619 403 Updated May 4, 2023

Accompanies NeurIPS'24 poster "Dense Associative Memory Through the Lens of Random Features"

Python 6 Updated Dec 2, 2024

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

Python 33,638 7,847 Updated Oct 27, 2025

A survey on harmful fine-tuning attacks for large language models

219 6 Updated Nov 6, 2025

An Extensible Continual Learning Framework Focused on Language Models (LMs)

Python 290 22 Updated Jan 28, 2024