Welcome to TextTableScoop 🌟, a versatile tool for extracting text from files and CSV tables, with a particular focus on Office files such as Excel and PowerPoint. This project is part of the 'ProjectText' suite, which also includes ProjectTextAgent and ProjectDataBaseQnA.
- Specializes in extracting text from various file formats, including Office files.
- Designed to work on both Windows (via COM) and Linux (via LibreOffice + PyUNO).
- The current implementation supports Linux with LibreOffice + PyUNO.
- Windows support through COM is planned, for robust file handling.
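Because the current implementation relies on LibreOffice and PyUNO, it can be useful to confirm that both are visible to your Python interpreter before running the tool. The following is a minimal sketch under stated assumptions, not part of TextTableScoop itself: the function name `check_linux_prerequisites` is hypothetical, `uno` is the module that PyUNO normally provides, and `soffice` is the usual name of the LibreOffice launcher (some distributions expose it as `libreoffice` instead).

```python
import importlib.util
import shutil


def check_linux_prerequisites() -> dict:
    """Report whether PyUNO and a LibreOffice launcher appear to be available.

    Hypothetical helper for illustration only; TextTableScoop may perform
    its own checks or none at all.
    """
    # PyUNO exposes a top-level module named "uno"; find_spec returns None
    # when the module cannot be located by this interpreter.
    has_pyuno = importlib.util.find_spec("uno") is not None

    # Assumption: the launcher is on PATH as "soffice" or "libreoffice".
    launcher = shutil.which("soffice") or shutil.which("libreoffice")

    return {"pyuno": has_pyuno, "libreoffice": launcher is not None}


if __name__ == "__main__":
    for name, available in check_linux_prerequisites().items():
        print(f"{name}: {'found' if available else 'missing'}")
```

If `pyuno` is reported as missing, the usual cause is that the interpreter in use is not the one LibreOffice's Python bindings were installed for.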
To install TextTableScoop, use the following pip command:

    pip3 install git+https://github.com/Flagro/TextTableScoop.git

Run texttablescoop from the bin folder with these arguments:
- path: Path to the file or directory to process.
- -t or --temp: (Optional) Path to a custom temporary folder.
- -p or --project: (Optional) Path to the project folder the file belongs to.
- --ignore: (Optional) Comma-separated list of patterns to ignore.
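When calling the tool from a script, the arguments listed above can be assembled into a command line programmatically. This is a minimal sketch, assuming the `texttablescoop` executable is on your PATH; the helper `build_command` is hypothetical and not part of the project. It only uses the flags documented here.

```python
import shlex
import subprocess
from typing import List, Optional, Sequence


def build_command(
    path: str,
    temp: Optional[str] = None,
    project: Optional[str] = None,
    ignore: Optional[Sequence[str]] = None,
    executable: str = "texttablescoop",  # assumption: resolvable on PATH
) -> List[str]:
    """Build the argument list for a texttablescoop invocation."""
    cmd = [executable, path]
    if temp:
        cmd += ["--temp", temp]
    if project:
        cmd += ["--project", project]
    if ignore:
        # --ignore takes a single comma-separated value.
        cmd += ["--ignore", ",".join(ignore)]
    return cmd


if __name__ == "__main__":
    cmd = build_command(
        "path/to/file",
        temp="path/to/temp",
        project="path/to/project",
        ignore=["pattern1", "pattern2"],
    )
    # Show the command as it would be typed in a shell.
    print(shlex.join(cmd))
    # Uncomment to actually run the tool (requires it to be installed):
    # subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces. The same call written directly at the shell: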

    texttablescoop 'path/to/file' --temp 'path/to/temp' --project 'path/to/project' --ignore 'pattern1,pattern2'

The project is open for collaboration; check the issues page for discussions.
Here's how you can contribute:
- Fork the Project.
- Create your Feature Branch (git checkout -b feature/AmazingFeature).
- Commit your Changes (git commit -m 'Add some AmazingFeature').
- Push to the Branch (git push origin feature/AmazingFeature).
- Open a Pull Request.