Precision Data Extraction & Analysis
SDT is a fast and easy tool for cleaning, extracting, and organizing domain-based data. Whether you're handling emails, websites, or logs, SDT makes the process straightforward with minimal setup and maximum efficiency.
- Install Python (version 3.8 recommended; see https://python.org)
- Clone the repository:
git clone https://github.com/xyndez/SDT.git
cd SDT
- Install dependencies:
pip install -r requirements.txt
- Run!
python source.py
- Sort Data by Domains: Finds and groups emails, websites, and logins by company (Google, Facebook, etc.), while automatically removing duplicates.
- Extract Text with Patterns: Pulls specific information using simple search rules across text files, CSVs, and logs.
- Grab Random Samples: Quickly extracts a sample from large files, taking lines from the start, the end, or at random, so you can share a small subset.
- Merge Files Easily: Combines multiple files into one, with an option to skip duplicate lines.
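To make the ideas behind these features concrete, here is a minimal Python sketch of domain grouping, duplicate-free merging, and line sampling. It is an illustration only, not SDT's actual code: the function names (`group_by_domain`, `merge_unique`, `sample_lines`) are hypothetical, and it assumes plain email-style lines where the domain is whatever follows the last `@`.

```python
import random
from collections import defaultdict


def group_by_domain(lines):
    """Group email-like lines by domain, dropping exact duplicate lines."""
    groups = defaultdict(list)
    seen = set()
    for raw in lines:
        line = raw.strip()
        if not line or line in seen:
            continue
        seen.add(line)
        # Assumption: the domain is the text after the last "@".
        _, sep, domain = line.rpartition("@")
        if not sep or not domain:
            continue  # not an email-like line, skip it
        groups[domain.lower()].append(line)
    return dict(groups)


def merge_unique(sources):
    """Concatenate several lists of lines, keeping the first copy of each line."""
    merged = []
    seen = set()
    for lines in sources:
        for line in lines:
            if line not in seen:
                seen.add(line)
                merged.append(line)
    return merged


def sample_lines(lines, count, mode="random", seed=None):
    """Return up to `count` lines from the start, the end, or at random."""
    if count <= 0:
        return []
    if mode == "start":
        return lines[:count]
    if mode == "end":
        return lines[-count:]
    if mode == "random":
        rng = random.Random(seed)  # pass a seed for repeatable samples
        return rng.sample(lines, min(count, len(lines)))
    raise ValueError(f"unknown mode: {mode!r}")
```

For example, `group_by_domain(["a@gmail.com", "a@gmail.com", "b@yahoo.com"])` yields one entry under `gmail.com` and one under `yahoo.com`. Tracking seen lines in a set keeps deduplication cheap even for large inputs, at the cost of holding the unique lines in memory.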
- No setup needed – Ready to go right away.
- Light & fast – Handles large files smoothly.
- Simple to use – No confusing options.
- Quick data cleaning
- Extracting specific info from logs
- Organizing messy text files (databases)
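As a rough illustration of the log-extraction use case, the sketch below pulls unique matches of a regular expression out of a list of log lines. It is not taken from SDT: `extract_matches` and `EMAIL_PATTERN` are hypothetical names, and the email regex is a deliberately simple approximation rather than a full address validator.

```python
import re

# Assumption: a simple, permissive email pattern is good enough for log scraping.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_matches(lines, pattern=EMAIL_PATTERN):
    """Return every unique match of `pattern`, in order of first appearance."""
    found = []
    seen = set()
    for line in lines:
        for match in pattern.finditer(line):
            text = match.group(0)  # the full matched text, ignoring any groups
            if text not in seen:
                seen.add(text)
                found.append(text)
    return found


if __name__ == "__main__":
    log = [
        "2024-01-01 login ok user=alice@example.com ip=10.0.0.1",
        "2024-01-02 login failed user=bob@test.org",
        "2024-01-03 login ok user=alice@example.com",
    ]
    for email in extract_matches(log):
        print(email)
```

Any other compiled pattern can be passed as the second argument, so the same helper would work for IP addresses, URLs, or whatever field a given log format contains.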
From a database worker to database workers. I was tired of dealing with messy data. As a database worker, I needed something that could quickly sort, clean, and organize my files without all the setup and headaches. So I made it just for people like me (and you, of course)!
Telegram
If you find SDT useful, consider supporting the project through donations, which will help it expand further:
- Bitcoin (BTC):
bc1q0sjlvsv68yfn95qf3jqhrtnssk7e5rvhj98ewn
- Ethereum (ETH):
0x51adDbc22358604B8B6412E744C49EEf69B329b5
- Litecoin (LTC):
LfjEjnbrj8Sp9NBg5Q4QoM7atVGS1gd73g
- Monero (XMR):
447TRPHego4R2E9j2BUmw9HM4yPMAu1sy3Dhr7d72UiNJNoNqEMEArtQRDfzTfytmeKRgKMKytjQbGdj13SKFRAoLjoWNTh
- Tether (ERC20):
0x51adDbc22358604B8B6412E744C49EEf69B329b5
SDT is an open-source project, licensed under the MIT License.