This is my solution for the McCarren Applied AI Challenge. It is a Word add-in that protects sensitive information in documents by automatically redacting email addresses, phone numbers, and Social Security numbers. It also adds a confidentiality header and enables Track Changes.
When you open a Word document and click the redaction button, the add-in will:
- Scan through your entire document looking for sensitive information
- Replace any emails, phone numbers, and SSNs with [REDACTED] markers
- Add a "CONFIDENTIAL DOCUMENT" header at the top
- Enable Track Changes so you can see what was modified
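The steps above can be sketched with Office.js roughly as follows. This is a minimal illustration, not the add-in's actual code: `redactDocument` and `MARKER` are hypothetical names, the `declare` line only stands in for the real Office.js typings, and two things are assumptions. The sketch turns tracking on before editing so the replacements are recorded, and it places the banner in the first section's primary header.

```typescript
// Stand-in so the sketch compiles on its own; in a real add-in project,
// drop this line and use the typings from @types/office-js.
declare const Word: any;

const MARKER = "[REDACTED]";

// Hypothetical helper (not the add-in's actual function): redacts every match
// of `patterns` and resolves to the number of replaced ranges.
async function redactDocument(patterns: RegExp[]): Promise<number> {
  return Word.run(async (context: any) => {
    const doc = context.document;
    const body = doc.body;

    // Assumption: tracking is switched on first so later edits are recorded.
    // changeTrackingMode belongs to the WordApi 1.4 requirement set.
    doc.changeTrackingMode = "TrackAll";

    // Step 1: read the body text and collect the distinct sensitive strings.
    body.load("text");
    await context.sync();
    const found = new Set<string>();
    for (const pattern of patterns) {
      // String.match returns every hit only when the global flag is set.
      const flags = pattern.flags.includes("g") ? pattern.flags : pattern.flags + "g";
      const hits: string[] = (body.text as string).match(new RegExp(pattern.source, flags)) || [];
      hits.forEach((hit) => found.add(hit));
    }

    // Step 2: queue a literal search for each string, then sync once.
    const searches = Array.from(found).map((text) => {
      const ranges = body.search(text, { matchCase: true });
      ranges.load("items");
      return ranges;
    });
    await context.sync();

    // Step 3: replace every located range with the marker.
    let replaced = 0;
    for (const ranges of searches) {
      for (const range of ranges.items) {
        range.insertText(MARKER, "Replace");
        replaced++;
      }
    }

    // Step 4: put the banner in the first section's primary header.
    const header = doc.sections.getFirst().getHeader("Primary");
    header.insertParagraph("CONFIDENTIAL DOCUMENT", "Start");

    await context.sync();
    return replaced;
  });
}
```

Collecting the strings with a regex first and then calling `body.search` for each one is a deliberate choice here: Word's built-in search does not accept JavaScript regular expressions, so the regex pass runs on the loaded text and the search only locates the literal hits.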
I built this to make document redaction simple and reliable, especially when you need to share documents but want to protect personal information.
The solution uses:
- TypeScript for type-safe code
- Office.js API for Word integration
- Custom CSS for a clean, professional interface
- Regex patterns to identify sensitive data
- Word's Track Changes API (when available)
You'll need:
- Node.js (version 14 or higher)
- npm
- Word Online or Word Desktop
- Clone this repository:
    git clone https://github.com/namo507/office-doc-redactor.git
    cd office-doc-redactor
- Install dependencies:
    npm install
- Start the development server:
    npm start
This will:
- Start a local server on port 3000
- Compile the TypeScript files
- Try to sideload the add-in into Word
If automatic sideloading doesn't work, you can manually sideload the manifest.xml file following Microsoft's guide.
- Open a Word document
- Look for the "Document Redactor" tab in the ribbon
- Click "Show Taskpane" to open the add-in
- Click the "Redact Document" button
- The add-in will process your document and show you what was redacted
The add-in identifies and redacts:
- Email addresses (for example, name@example.com)
- Phone numbers (various formats including xxx-xxx-xxxx, (xxx) xxx-xxxx)
- Social Security Numbers (xxx-xx-xxxx format)
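To make the three categories above concrete, here is a small sketch of regex-based redaction on plain text. The expressions are illustrative guesses at reasonable patterns, not the ones shipped in `taskpane.ts`, and `SENSITIVE_PATTERNS` and `redactText` are hypothetical names.

```typescript
// Illustrative patterns only; the add-in's actual expressions may differ.
const SENSITIVE_PATTERNS: Record<string, RegExp> = {
  email: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g,
  // xxx-xx-xxxx
  ssn: /\b\d{3}-\d{2}-\d{4}\b/g,
  // (xxx) xxx-xxxx, xxx-xxx-xxxx, xxx.xxx.xxxx, xxx xxx xxxx
  phone: /(?:\(\d{3}\)\s?|\b\d{3}[-.\s])\d{3}[-.\s]\d{4}\b/g,
};

interface RedactionResult {
  text: string;                   // input with every match replaced
  counts: Record<string, number>; // number of matches per category
}

// Replaces every match with the marker and counts hits per category.
function redactText(input: string): RedactionResult {
  const counts: Record<string, number> = {};
  let text = input;
  for (const [label, pattern] of Object.entries(SENSITIVE_PATTERNS)) {
    counts[label] = 0;
    text = text.replace(pattern, () => {
      counts[label]++;
      return "[REDACTED]";
    });
  }
  return { text, counts };
}

const sample = "Reach jane.doe@example.com or (555) 123-4567; SSN 123-45-6789.";
console.log(redactText(sample).text);
// prints "Reach [REDACTED] or [REDACTED]; SSN [REDACTED]."
```

The per-category counts are one possible basis for the summary the task pane shows after a run.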
office-doc-redactor/
├── src/
│   ├── taskpane/
│   │   ├── taskpane.html   # The UI
│   │   ├── taskpane.ts     # Main logic
│   │   └── taskpane.css    # Styling
├── manifest.xml            # Add-in configuration
├── package.json            # Dependencies
├── tsconfig.json           # TypeScript config
└── README.md               # This file
I focused on making the code clean and maintainable. The redaction logic is modular, so it's easy to add new patterns if needed. I also made sure to handle edge cases, like documents that don't support Track Changes.
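As a rough illustration of both points, the sketch below keeps patterns in a list so that a new data type is a single entry, and checks for Track Changes support before relying on it. All names (`RedactionRule`, `redactionRules`, `applyRules`, `canTrackChanges`, `trackingNotice`) and the IPv4 rule are invented for the example, and the `declare` line only stands in for the real Office.js global. The one external fact it relies on is that `Document.changeTrackingMode` is part of the WordApi 1.4 requirement set.

```typescript
// Stand-in so the sketch compiles on its own; real projects get this
// global from Office.js and its typings from @types/office-js.
declare const Office: any;

interface RedactionRule {
  label: string;   // used for reporting
  pattern: RegExp; // must carry the global flag
}

// Hypothetical rule list; the add-in's own structure may differ.
const redactionRules: RedactionRule[] = [
  { label: "ssn", pattern: /\b\d{3}-\d{2}-\d{4}\b/g },
];

// Supporting a new data type is one more entry (IPv4 is an invented example).
redactionRules.push({ label: "ipv4", pattern: /\b(?:\d{1,3}\.){3}\d{1,3}\b/g });

// Applies each rule in turn to the text.
function applyRules(text: string, rules: RedactionRule[]): string {
  return rules.reduce(
    (current, rule) => current.replace(rule.pattern, "[REDACTED]"),
    text
  );
}

// True only when Office.js is loaded and the host exposes WordApi 1.4.
function canTrackChanges(): boolean {
  return (
    typeof Office !== "undefined" &&
    Office.context.requirements.isSetSupported("WordApi", "1.4") === true
  );
}

// Message for the task pane so the user knows whether edits were tracked.
function trackingNotice(): string {
  return canTrackChanges()
    ? "Track Changes enabled."
    : "Track Changes is not available in this version of Word.";
}
```

Calling `canTrackChanges()` before setting `changeTrackingMode` lets the redaction proceed on older hosts instead of failing on the unsupported property.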
The UI is straightforward but professional. I used custom CSS instead of a library to keep things lightweight and to show attention to design details.
I tested this with various documents containing different types of sensitive information. The regex patterns are robust enough to handle different formatting styles while avoiding false positives.
If I had more time, I would add:
- Support for more sensitive data types
- Undo functionality
- Batch processing for multiple documents
- Export redacted content report