A Streamlit application that calculates the similarity between texts in a given CSV or Excel file, visualizes the results as a network graph, and helps users identify relationships between similar texts.
This app allows users to upload a CSV or Excel file containing a column of text data. The application computes the similarity between each pair of texts using a chosen similarity metric and generates a network graph representing the relationships among similar texts. This can be particularly useful for tasks such as clustering, text analysis, and data exploration.
To run this project locally, follow these steps:
- Clone the repository:
git clone https://github.com/yourusername/text-similarity-network-app.git cd text-similarity-network-app - Install the required packages: pip install -r requirements.txt
- Run the Streamlit app: streamlit run app.py
- Open your web browser and go to http://localhost:8501.
- Upload your CSV or Excel file. Ensure that the file has a column with text data.
- Provide inputs such as the name of the text column and the similarity threshold.
- Optional - filter your data by a list of keywords.
- Click the "Generate Graph" button to process the data and generate the network graph.
- Explore the network graph to identify similar texts and their relationships.
- Supports both CSV and Excel file formats.
- Calculates text similarity using various language models.
- Currently supports English, Hebrew and Arabic.
- Interactive network graph visualization.
- User-friendly interface built with Streamlit.
© 2024 Ori Kopolovich. All rights reserved. License: Use for personal or internal business purposes only. No modifications or redistribution allowed.
Email: [email protected] GitHub: orikopel