
Hi there 👋 I am Tiansheng Huang

  • I’m currently a fourth-year PhD candidate at Georgia Tech.
  • I work on safety alignment for large language models. In particular, I am interested in red-teaming attacks and defenses for LLMs.

Selected Publications

  • [2025/3/01] Safety Tax: Safety Alignment Makes Your Large Reasoning Models Less Reasonable arXiv [paper] [code]
  • [2025/1/30] Virus: Harmful Fine-tuning Attack for Large Language Models bypassing Guardrail Moderation arXiv [paper] [code]
  • [2024/9/26] Harmful Fine-tuning Attacks and Defenses for Large Language Models: A Survey arXiv [paper] [repo]
  • [2024/9/3] Booster: Tackling harmful fine-tuning for large language models via attenuating harmful perturbation ICLR2025 [paper] [code] [Openreview]
  • [2024/8/18] Antidote: Post-fine-tuning safety alignment for large language models against harmful fine-tuning ICML2025 [paper] [code]
  • [2024/5/28] Lazy safety alignment for large language models against harmful fine-tuning NeurIPS2024 [paper] [code]
  • [2024/2/2] Vaccine: Perturbation-aware alignment for large language model against harmful fine-tuning NeurIPS2024 [paper] [code]
  • [2023/12/01] Lockdown: Backdoor Defense for Federated Learning with Isolated Subspace Training NeurIPS2023 [paper] [code]

Pinned Repositories

  1. git-disl/awesome_LLM-harmful-fine-tuning-papers

     A survey on harmful fine-tuning attacks for large language models.

  2. git-disl/Virus

     Official code for the paper "Virus: Harmful Fine-tuning Attack for Large Language Models Bypassing Guardrail Moderation".

  3. git-disl/Booster

     Official code for the paper "Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturbation" (ICLR2025 Oral).

  4. git-disl/Antidote

     Unofficial re-implementation of "Antidote: Post-fine-tuning Safety Alignment for Large Language Models against Harmful Fine-tuning Attack" (ICML2025).

  5. git-disl/Lisa

     Official code for the paper "Lazy Safety Alignment for Large Language Models against Harmful Fine-tuning" (NeurIPS2024).

  6. git-disl/Vaccine

     Official code for the paper "Vaccine: Perturbation-aware Alignment for Large Language Models" (NeurIPS2024).