CHAT SUMMARY ANALYSIS
Chat Summarizer is a comprehensive web application designed to analyze and summarize chat data from messaging platforms. The tool provides insights into user communication patterns, keywords, and topics of conversation, along with graphical visualizations and a downloadable PDF report.
- Chat Parsing: Extracts structured data (date, time, sender, and messages) from raw chat text files.
- Preprocessing: Cleans and preprocesses messages to remove stopwords, punctuation, and irrelevant information.
- Filtering: Applies date, time, and keyword-based filters to refine the chat data for analysis.
- Keyword Extraction: Identifies and ranks the most significant keywords in the chat.
- Topic Modeling: Uses Latent Dirichlet Allocation (LDA) to identify key topics of conversation.
- Visualizations: Interactive charts for message trends, topic distributions, and keyword importance.
- PDF Report Generation: Exports the analysis results, including charts and message summaries, into a structured PDF report.
CHAT SUMMARIZER/
├── app.py                           # Main application file
├── summarizer.ipynb                 # Jupyter notebook for summarization
├── requirements.txt                 # Required dependencies
├── Data/                            # Data directory
│   └── Chats.txt                    # Raw dummy chat data file
└── Report/                          # Analysis reports and visualizations
    ├── chat_analysis_report.pdf     # PDF report from Jupyter notebook
    ├── keywords_plot.png            # Keywords analysis plot
    ├── message_trends_plot.png      # Message trends visualization
    ├── report.pdf                   # PDF report from web application
    ├── top_keywords.png             # Top keywords bar plot
    └── topics_plot.png              # Topics distribution pie chart
- Python 3.8 or later.
- pip (Python package manager).
- Windows 10/11.
- Web Browser.
- Download the setup executable from the Releases section of this repository.
- Run the setup to install the app.
- After installation, run the app's executable.
- Open the application in your browser at http://localhost:8501.
- If the PC running the web app is connected to a router, other devices on the same network can use the app too: open a browser on your phone or another PC and go to the "Network URL" shown in the terminal. The PC running the app acts as the server and each connected device as a client. You can connect as many devices as the PC can support.
- Clone this repository:
git clone https://github.com/N-Elmer/CHAT-SUMMARIZER.git
cd CHAT-SUMMARIZER
- Install required dependencies:
pip install -r requirements.txt
- Run the application:
streamlit run app.py
- Open the application in your browser at http://localhost:8501.
- If the PC running the web app is connected to a router, other devices on the same network can use the app too: open a browser on your phone or another PC and go to the "Network URL" shown in the terminal. The PC running the app acts as the server and each connected device as a client. You can connect as many devices as the PC can support.
- Open WhatsApp.
- Do not include media in the export.
- Export the chat to a text (.txt) file.
- Open Telegram.
- Do not include media in the export.
- Export the chat to a JSON file.
- Upload your exported chat file from the sidebar.
- Use the filters on the sidebar to narrow the analysis:
  - Date and Time Filters: Set start and end dates/times.
  - Keywords: Enter comma-separated keywords to keep only messages containing those terms.
- Select the number of top keywords, topics, and messages to include in the analysis.
- View visualizations, including:
  - Message Trends: A time-based line chart of message counts.
  - Keyword Importance: A bar chart of extracted keywords.
  - Topic Distribution: A pie chart of conversation topics.
- Click "Generate Report" to create a PDF file summarizing the analysis, including:
  - Top keywords chart.
  - Message trends chart.
  - Summary of top messages.
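The date, time, and keyword filters from the steps above can be pictured with the following minimal sketch. It is not the code in app.py: the function name `filter_messages`, the record layout (dicts with `timestamp`, `sender`, and `message` keys), and the case-insensitive substring matching are all assumptions made for illustration.

```python
from datetime import datetime

def filter_messages(records, start=None, end=None, keywords=""):
    """Keep records inside [start, end] that contain at least one keyword."""
    # Split the comma-separated keyword string the way a sidebar text box might supply it
    terms = [k.strip().lower() for k in keywords.split(",") if k.strip()]
    kept = []
    for rec in records:
        if start is not None and rec["timestamp"] < start:
            continue
        if end is not None and rec["timestamp"] > end:
            continue
        # With no keywords given, every message in the time window passes
        if terms and not any(term in rec["message"].lower() for term in terms):
            continue
        kept.append(rec)
    return kept

# Hypothetical sample data, not taken from Data/Chats.txt
sample_records = [
    {"timestamp": datetime(2023, 5, 12, 9, 0), "sender": "Alice", "message": "Budget review today"},
    {"timestamp": datetime(2023, 5, 12, 18, 30), "sender": "Bob", "message": "Dinner plans?"},
    {"timestamp": datetime(2023, 5, 14, 10, 0), "sender": "Alice", "message": "Final budget sent"},
]

filtered = filter_messages(
    sample_records,
    start=datetime(2023, 5, 12, 0, 0),
    end=datetime(2023, 5, 13, 0, 0),
    keywords="budget, invoice",
)
print([rec["message"] for rec in filtered])
```

Here only the first message survives: the second has no matching keyword and the third falls outside the date range.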
The app processes chat text files to extract structured data into a DataFrame. It identifies timestamps, senders, and message contents while handling multiline messages and system notifications.
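As a rough illustration of this parsing step, the sketch below assumes each message starts with a line shaped like `DD/MM/YYYY, HH:MM AM - Sender: message`. That layout, the regular expression, and the function name `parse_chat` are assumptions, and the real export format and the logic in app.py may differ. Lines that do not start with a timestamp are appended to the previous message, and timestamped lines without a `Sender: ` prefix are treated as system notifications.

```python
import re
from datetime import datetime

# Assumed line layout: "DD/MM/YYYY, HH:MM AM - Sender: message"
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{4}), (\d{1,2}:\d{2}\s?[APap][Mm]) - (.*)$")

def parse_chat(raw_text):
    """Return a list of dicts with timestamp, sender, and message keys."""
    records = []
    for line in raw_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            date_part, time_part, rest = match.groups()
            # Drop any whitespace between the time and the AM/PM marker
            compact_time = "".join(time_part.split()).upper()
            timestamp = datetime.strptime(f"{date_part} {compact_time}", "%d/%m/%Y %I:%M%p")
            sender, sep, message = rest.partition(": ")
            if not sep:
                # No "Sender: " prefix, so treat the line as a system notification
                sender, message = None, rest
            records.append({"timestamp": timestamp, "sender": sender, "message": message})
        elif records:
            # No timestamp: continuation of the previous (multiline) message
            records[-1]["message"] += "\n" + line
    return records

# Hypothetical sample text, not taken from Data/Chats.txt
sample = (
    "12/05/2023, 09:15 PM - Alice: Hello Bob\n"
    "how are you?\n"
    "12/05/2023, 09:16 PM - Bob: Fine, thanks\n"
    "12/05/2023, 09:17 PM - Alice left\n"
)
parsed = parse_chat(sample)
for record in parsed:
    print(record["timestamp"], record["sender"], repr(record["message"]))
```

The resulting list of dicts can be passed to `pandas.DataFrame(parsed)` to obtain a DataFrame with one row per message.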
The app cleans chat messages by:
- Removing stopwords (e.g., "the", "and").
- Tokenizing text.
- Eliminating punctuation.
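The three cleaning steps above can be sketched with the standard library alone. The small `STOPWORDS` set below is a stand-in for a full stopword list such as the one NLTK provides, and the function name `preprocess` and the whitespace tokenizer are illustrative assumptions rather than the app's actual code.

```python
import string

# Tiny illustrative stopword list; an assumption standing in for a full list such as NLTK's
STOPWORDS = {"the", "and", "a", "an", "is", "are", "to", "of", "in", "on", "for", "it"}

def preprocess(message):
    """Lowercase, strip punctuation, tokenize on whitespace, and drop stopwords."""
    no_punct = message.lower().translate(str.maketrans("", "", string.punctuation))
    return [token for token in no_punct.split() if token not in STOPWORDS]

tokens = preprocess("The meeting is on Friday, and the report is due!")
print(tokens)  # ['meeting', 'friday', 'report', 'due']
```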
The app uses Latent Dirichlet Allocation (LDA) to group conversations into topics. Each topic is represented by a set of key terms.
Interactive charts of message trends, keyword importance, and topic distribution make it easier to understand how the conversation developed and what it was about.
The following Python libraries are required:
- streamlit: Web app framework.
- pandas: Data manipulation.
- plotly: Interactive visualizations.
- nltk: Natural Language Toolkit for text preprocessing.
- gensim: Topic modeling.
- reportlab: PDF report generation.
- seaborn: Statistical data visualization.
Install all dependencies using:
pip install -r requirements.txt
- File Upload Issues: Ensure the chat file is in `.txt` format and properly structured.
- Date/Time Errors: Verify that the date and time formats in the file match `DD/MM/YYYY` and `HH:MM AM/PM`.
- PDF Generation Problems: Ensure that the `Report/` directory exists and is writable.
- Missing Dependencies: Reinstall required packages using `pip install -r requirements.txt`.
Contributions are welcome! If you find a bug or have a feature request, please open an issue in the GitHub repository.
This README file provides an overview of the CHAT-SUMMARIZER web application, its folder structure, usage instructions, code explanation, and troubleshooting tips. Use it as a guide to understand and utilize the CHAT-SUMMARIZER app.