Roman Lutz (romanlutz)

Responsible AI Engineer | AI Red Teaming | Open Source

216 followers · 225 following

Microsoft
Greater Seattle Area, WA, USA
https://romanlutz.github.io
in/romanlutz
@romanlutz.bsky.social

Starred repositories

Azure / azure-sdk-for-python

This repository is for active development of the Azure SDK for Python. For consumers of the SDK we recommend visiting our public developer docs at https://learn.microsoft.com/python/azure/ or our v…

Python · 5,431 stars · 3,206 forks · Updated Jan 12, 2026

microsoft / playwright-mcp

Playwright MCP server

TypeScript · 25,408 stars · 2,069 forks · Updated Jan 9, 2026

OSU-NLP-Group / RedTeamCUA

RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS Environments

Python · 30 stars · 5 forks · Updated Jun 25, 2025

microsoft / SecRL

Benchmarking LLM agents on Cyber Threat Investigation.

Jupyter Notebook · 110 stars · 16 forks · Updated Jan 8, 2026

ReversecLabs / spikee

HTML · 127 stars · 31 forks · Updated Jan 5, 2026

stefanv / npdoc2json

Recursively scan a Python module and export numpydoc docstrings to JSON

TypeScript · 3 stars · Updated May 14, 2025

cicscareers / OSAP

Open-Source Apprenticeship Program

3 stars · Updated May 22, 2025

allenai / wildguard

Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs

Python · 103 stars · 12 forks · Updated Dec 2, 2024

KutalVolkan / ai_recruiter

Python · 10 stars · 3 forks · Updated Jun 3, 2025

Tinche / aiofiles

File support for asyncio

Python · 3,211 stars · 163 forks · Updated Oct 9, 2025

microsoft / OmniParser

A simple screen parsing tool towards pure vision based GUI agent

Jupyter Notebook · 24,185 stars · 2,076 forks · Updated Sep 12, 2025

microsoft / AI-Red-Teaming-Playground-Labs

AI Red Teaming playground labs to run AI Red Teaming trainings including infrastructure.

TypeScript · 1,780 stars · 263 forks · Updated Jan 9, 2026

github / issue-metrics

Gather metrics on issues/prs/discussions such as time to first response, count of issues opened, closed, etc.

Python · 512 stars · 91 forks · Updated Jan 5, 2026

microsoft / debug-gym

A Text-Based Environment for Interactive Debugging

Python · 288 stars · 37 forks · Updated Jan 9, 2026

CarsonDon / Multilingual-Vuln-LLMs

Jupyter Notebook · 4 stars · Updated Feb 25, 2025

microsoft / agdebugger

TypeScript · 65 stars · 7 forks · Updated Nov 16, 2025

python-websockets / websockets

Library for building WebSocket servers and clients in Python

Python · 5,615 stars · 579 forks · Updated Jan 10, 2026

showlab / computer_use_ootb

Out-of-the-box (OOTB) GUI Agent for Windows and macOS

Python · 1,860 stars · 201 forks · Updated May 21, 2025

AI-secure / DecodingTrust

A Comprehensive Assessment of Trustworthiness in GPT Models

Python · 311 stars · 61 forks · Updated Sep 16, 2024

0din-ai / 0Din-Curated-Monthly-White-Papers

This repository curates a collection of monthly white papers focused on the latest LLM attack and defenses.

26 stars · 2 forks · Updated Oct 10, 2024

jennifermarsman / FocusedNPC

Creating a non-player character in a game backed by generative AI that will stay focused on its goals

Python · 4 stars · Updated Sep 27, 2024

alanaqrawi / STCA

Results and Analysis of Single-Turn Crescendo Attacks (STCA) on Large Language Models: Evaluating vulnerabilities in content moderation through adversarial techniques.

2 stars · 1 fork · Updated Sep 12, 2024

microsoft / eureka-ml-insights

A framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings.

Python · 173 stars · 36 forks · Updated Jan 1, 2026

promptfoo / promptfoo

Test your prompts, agents, and RAGs. AI Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with co…

TypeScript · 9,854 stars · 860 forks · Updated Jan 12, 2026

usnistgov / dioptra

Test Software for the Characterization of AI Technologies

Python · 270 stars · 49 forks · Updated Jan 9, 2026

danielmiessler / SecLists

SecLists is the security tester's companion. It's a collection of multiple types of lists used during security assessments, collected in one place. List types include usernames, passwords, URLs, se…

PHP · 68,103 stars · 24,863 forks · Updated Jan 12, 2026