RedactLess

RedactLess is an AI-powered smart redaction and anonymization tool that detects, masks, and synthesizes sensitive data, ensuring privacy without compromising usability.
Built for enterprises in finance, healthcare, and law that handle sensitive data, and powered by IBM Z.


Why RedactLess? · Tech Stack · Features · Getting Started · Privacy & Ethics · Learnings


🎬 Demo

demo.mp4

🔗 Git Repository: github.com/NSArjun/RedactLess
📺 Video Explanation: https://youtu.be/vhkbdD8TPO8?si=pxynPT0YuRVF2jkc


Why RedactLess? (⚙ Tech for Good!)

Every day, organizations in healthcare, finance, and law handle massive amounts of sensitive data.
Manual redaction remains slow, inconsistent, and error-prone; even one missed field can lead to a privacy disaster ⚠️

Existing tools often:

  • ❌ Miss complex or nested PII formats
  • 🐢 Break data usability or structure
  • 💢 Require tedious manual cleanup

That’s where RedactLess steps in:

Because Privacy Isn’t Optional.


⚙ Tech Stack

  • Frontend
    • Next.js (TypeScript)
    • Tailwind CSS
  • Backend
    • Flask API
    • Python (Hugging Face Transformers, Faker)
  • Environment
    • IBM Z Jupyter Notebook (AI model execution)
    • IBM LinuxONE system (development)

🚀 Features

  1. AI-Powered Redaction

    • Uses a fine-tuned transformer model to identify and mask Personally Identifiable Information (PII).
    • Labels entities such as FULLNAME, EMAIL, PHONE, ADDRESS, and more.
  2. Sensitivity Levels

    • Light: Masks key numerical identifiers (Aadhaar, Phone, Govt IDs).
    • Medium: Everything in Light, plus names and addresses.
    • Heavy: Redacts every identifiable element.
  3. Smart Data Synthesis

    • Extracts schema from CSV/JSON input.
    • Generates realistic synthetic data using Faker.
    • Keeps the structure intact for testing and analytics.
  4. Mainframe Integration

    • Model tested and executed on IBM Z Jupyter Notebook for reliability and scalability.
    • Demonstrates seamless AI–Mainframe interoperability.
  5. Web Interface

    • User-friendly web app to upload files and select sensitivity levels.
    • Built-in synthetic data generation page.
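The sensitivity levels described above can be sketched as a filter over detected entities. This is a minimal illustration, not the project's actual code: the `Entity` tuple, the `LEVELS` mapping, the `AADHAAR` and `GOVT_ID` label names, and the choice to mask `EMAIL` only at the heavy level are all assumptions. The spans are supplied by hand here; in the real pipeline they would come from the fine-tuned model.

```python
from typing import List, NamedTuple, Optional, Set


class Entity(NamedTuple):
    """A detected PII span (hypothetical shape of the model output)."""
    start: int
    end: int
    label: str


# Assumed mapping from sensitivity level to the labels it masks.
LIGHT: Set[str] = {"PHONE", "AADHAAR", "GOVT_ID"}
MEDIUM: Set[str] = LIGHT | {"FULLNAME", "ADDRESS"}
LEVELS = {"light": LIGHT, "medium": MEDIUM, "heavy": None}  # None = mask all labels


def span(text: str, value: str, label: str) -> Entity:
    """Helper for the example: locate `value` in `text` and tag it."""
    start = text.index(value)
    return Entity(start, start + len(value), label)


def redact(text: str, entities: List[Entity], level: str) -> str:
    """Replace each entity covered by `level` with a [LABEL] placeholder."""
    if level not in LEVELS:
        raise ValueError(f"unknown sensitivity level: {level!r}")
    allowed: Optional[Set[str]] = LEVELS[level]
    pieces = []
    cursor = 0
    for ent in sorted(entities, key=lambda e: e.start):
        if allowed is not None and ent.label not in allowed:
            continue  # this level leaves the label untouched
        if ent.start < cursor:
            continue  # overlaps a span that was already masked
        pieces.append(text[cursor:ent.start])
        pieces.append(f"[{ent.label}]")
        cursor = ent.end
    pieces.append(text[cursor:])
    return "".join(pieces)


text = "Contact Asha Rao at asha@example.com or 9876543210."
entities = [
    span(text, "Asha Rao", "FULLNAME"),
    span(text, "asha@example.com", "EMAIL"),
    span(text, "9876543210", "PHONE"),
]
for level in ("light", "medium", "heavy"):
    print(level, "->", redact(text, entities, level))
```

Keeping detection and masking separate means the level only changes which labels are filtered, so the same model output can serve all three levels.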

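The schema-then-synthesize flow under Smart Data Synthesis might look roughly like the sketch below. It uses only the Python standard library, with `random` standing in for Faker, so the generated values are structurally valid but not realistic. The function names `infer_schema` and `synthesize` and the three-type schema are hypothetical; the project itself would presumably swap in Faker providers for names, addresses, and similar fields.

```python
import csv
import io
import random
import string
from typing import Dict, List


def _kind(values: List[str]) -> str:
    """Classify a column as 'int', 'float', or 'str' from its raw values."""
    if not values:
        return "str"
    for caster, name in ((int, "int"), (float, "float")):
        try:
            for value in values:
                caster(value)
            return name
        except ValueError:
            continue  # at least one value did not parse; try the next type
    return "str"


def infer_schema(csv_text: str) -> Dict[str, str]:
    """Map each CSV column name to an inferred type name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    return {col: _kind([row[col] for row in rows]) for col in reader.fieldnames or []}


def synthesize(schema: Dict[str, str], n: int, seed: int = 0) -> List[dict]:
    """Generate n rows that follow the schema (random stand-in for Faker)."""
    rng = random.Random(seed)
    makers = {
        "int": lambda: rng.randint(0, 99999),
        "float": lambda: round(rng.uniform(0, 1000), 2),
        "str": lambda: "".join(rng.choices(string.ascii_lowercase, k=8)),
    }
    return [{col: makers[kind]() for col, kind in schema.items()} for _ in range(n)]


sample = "name,age,balance\nAsha,34,1200.50\nRavi,41,87.25\n"
schema = infer_schema(sample)
rows = synthesize(schema, n=3)
print(schema)
print(rows[0])
```

Because only the column names and types are carried over, no original value can leak into the synthetic rows, while downstream code that depends on the structure keeps working.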
🧩 Getting Started

Prerequisites

  • Python 3.9+
  • Node.js + npm (for the frontend)
  • IBM Z Jupyter access (optional, for model execution)

Installation

Backend

git clone https://github.com/NSArjun/RedactLess
cd RedactLess/backend
pip install -r requirements.txt
python app.py

Frontend

cd ../frontend
npm install
npm run dev

🔐 Privacy & Ethics

  • All processing happens locally or in IBM Z’s secure environment.
  • Built with a strong focus on compliance (GDPR, HIPAA) and data governance.

🧠 Learnings

  • Successfully executed and tested the AI model in an IBM Z Jupyter Notebook, demonstrating mainframe–AI integration.
  • Connected Flask backend to ML inference pipeline on IBM Z.
  • Understood secure data flow between frontend, API, and model.
  • Deployed a Hugging Face transformer model for text redaction and synthesis.
  • Resolved dependency, I/O, and cross-platform issues.
  • Strengthened understanding of privacy-first AI and compliance in enterprise systems.

🌟 Team SAV015 – Maverick (IBM Z Datathon)

Name | Role
Arjun N S | ML Engineer
Keerthi K P | ML Engineer
Shri Sai Aravind | ML Engineer
Shivaa Palaniyappan V | Full Stack Developer
Yathin Reddy | SDE / Backend Developer
Sowmya Badoni | Data Analyst & UI/UX Designer

📈 Future Plans

  • Add support for document and image redaction (PDF, DOCX).
  • Integrate Faker-based multi-language synthesis.
  • Expand deployment to IBM Cloud & Z Containers.
  • Build an enterprise dashboard for compliance tracking.

💬 Future Ideas

  • Replace stick figure with a 2D or 3D avatar
  • Add recording & playback features
  • Integrate sound-reactive effects
  • Stream live pose data into Unity or Blender

📜 License

RedactLess is licensed under the MIT License: you’re free to use, modify, and distribute it with attribution.


Made with ❤️ by Team MAVERICK for the IBM Z Datathon

Arjun · Keerthi · Aravind · Shivaa · Yathin · Sowmya