ClassiFly is a powerful, next-generation tool that automates the creation of classifiers directly from your data. With just a few clicks, transform raw datasets into highly effective classification models, saving you time, effort, and resources. This project aims to revolutionize the way users create classification prompts, streamlining workflows, and offering a fresh approach to building data-driven solutions.
- Effortless Data Upload: Easily upload CSV or Excel datasets to get started.
- Simple Column Selection: Choose the category and input text columns to guide the classifier.
- Contextual Enrichment: Provide additional context for categories, enriching the classification.
- Automatic Category Detection: Let the system intelligently identify unique categories, reducing manual effort.
- LLM-Powered Category Descriptions 📝: Automatically generate rich category descriptions using leading-edge language models.
- Interactive Classification Interface 🖥️: Seamlessly test and refine your classification pipeline with real-time feedback.
- Built-in Testbench: Validate the classifier instantly with custom inputs to evaluate accuracy and robustness.
-
Data Upload: Users simply upload a dataset (CSV or Excel) and specify key columns, providing critical context for more meaningful categorization.
-
Automatic Category Detection: ClassiFly takes the guesswork out of identifying unique categories by automatically detecting them for you.
-
LLM-Driven Description Generation 📝: Using advanced language models, the tool generates detailed descriptions for each category, ensuring clarity and consistency.
-
Prompt Creation: The collated descriptions are converted into a refined, highly effective classification prompt, ready for deployment.
-
Interactive Classification Experience: The interactive interface allows users to input text 📝 data and observe the generated classifier at work, enabling a hands-on understanding of its predictive capabilities.
-
Testbench for Validation: Users can stress-test the classifier on unseen data to understand performance and reliability before deploying it in production.
ClassiFly is built to disrupt how we think about classification pipelines. By leveraging advanced language models and an intuitive interface, ClassiFly empowers users to create classification solutions that would traditionally require extensive coding, significant resources, and domain expertise.
- Accelerate Data Analysis: Cut down the time required to set up classification models from weeks to mere minutes.
- Reduce Reliance on Manual Efforts: No need for domain-specific labor-intensive tasks—simply provide your data and let the tool do the rest.
- Empower Non-Technical Users: With its easy-to-use interface, even non-experts can now create sophisticated classifiers, enabling organizations to unlock insights from their data faster.
- Scalable, Reproducible Classifications: Take your classification projects from proof of concept to production seamlessly with a pipeline that's scalable and repeatable across various domains.
- Reimagine Customizability: Every dataset is unique, and ClassiFly allows for deep customizability, ensuring that generated classifiers fit your needs with remarkable precision.
The downstream impact of ClassiFly extends beyond mere categorization. By bringing automation and ease to classification processes, this tool is set to change how individuals and businesses extract insights, automate workflows, and make data-driven decisions.
-
Clone the repository:
git clone https://github.com/yourusername/classifly.git cd classifly
-
Install the required dependencies:
pip install -r requirements.txt
-
Run the Streamlit app 🖥️:
streamlit run app.py
-
Open your web browser and navigate to the provided local URL (usually http://localhost:8501) or to our hosted URL!
- Upload your dataset using the file uploader in the app.
- Select the predictor columns (input text) and the target column (categories).
- Adjust the analysis settings ⚙️ in the sidebar, including samples per target and additional context.
- Click "Analyze Columns" to initiate the classification process.
- Review the results and auto-generated category descriptions 📝.
- Use the interactive testbench to input custom data and see the classifier in action.
The testbench feature allows you to:
- Input custom values for each predictor column and see immediate predictions.
- Understand the classifier's reasoning behind each prediction, promoting transparency.
- Experiment with new data to ensure your classifier's robustness in real-world scenarios.
- Python 3.7+
- Streamlit 🖥️
- Pandas
- OpenAI API key (for GPT-4 access)
- Anthropic API key (for Claude access)
We welcome contributions that expand the capabilities of ClassiFly! Whether it's new features, bug fixes, or performance improvements, please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
ClassiFly is more than just a tool—it's an enabler of next-generation workflows that can transform how individuals and businesses approach data classification. Imagine being able to focus on what matters most—deriving insights and making impactful decisions—while leaving the tedious setup and maintenance of classification pipelines to automation. With ClassiFly, the future of classification is just a click away.