- London, UK
- in/sruthikuriakose
Lists (20)
AI Safety
Coding
complexity
DL
EEG
general
GUI
i💥
Interp
LLMs
medtech
ML
NeuroAI
NeuroAI decoding
Neuroscience
Neurotech
NIMHANS
Different projects I've worked on at NIMHANS
RL
Useful
Webdev
Stars
Stop AI agents from half-building features. Ship complete code in one session.
This repository accompanies the research paper "Sandbagging Auditing Games" on detecting sandbagging in frontier AI systems. We provide access to the model organisms used in the paper and tools for…
James' cookbook of evaluations and finetuning experiments
✨ Monorepo containing most of BlueDot Impact's custom software.
Inference API for many LLMs and other useful tools for empirical research
Code for the paper: Linear Control of Test Awareness Reveals Differential Compliance in Reasoning Models
ControlArena is a collection of settings, model organisms and protocols - for running control experiments.
MedRAX: Medical Reasoning Agent for Chest X-ray - ICML 2025
Optimize prompts, code, and more with AI-powered Reflective Text Evolution
Repository for the "Chain-of-Thought Reasoning In The Wild Is Not Always Faithful" paper
[NeurIPS 2024] CoSy is an automatic evaluation framework for textual explanations of neurons.
A curated list of foundation models, datasets, and tools for biosignals
Real-time webcam demo with SmolVLM and llama.cpp server
⚓️ Repository for the "Thought Anchors: Which LLM Reasoning Steps Matter?" paper.
The Python Risk Identification Tool for generative AI (PyRIT) is an open source framework built to empower security professionals and engineers to proactively identify risks in generative AI systems.
Weekly seminar on transformers/LLMs at the University of Wyoming
pix2tex: Using a ViT to convert images of equations into LaTeX code.
Experimental LLM interface exploring new ways to use AI to improve human thinking
The 2024 edition of The Nature of Code with p5.js. Includes Notion workflow and build system.
A Python implementation of a project that classifies the valence and context of several thousand pig calls, extended to have a web interface.