🚀 Revolutionizing Data Visualization AI: The First Comprehensive D3.js Training Dataset for Large Language Models
Welcome to the future of AI-powered data visualization. This groundbreaking project bridges the gap between natural language and interactive data visualization by creating the world's first comprehensive training dataset specifically designed for teaching LLMs to generate production-ready D3.js visualizations.
- 🎯 Unprecedented Scope: From basic charts to complex network visualizations, our dataset covers the entire spectrum of D3.js capabilities
- 🧠 AI-Ready Architecture: Meticulously structured data pairs natural language queries with production-grade TypeScript/React implementations
- ⚡ Production Quality: Every visualization is built with real-world applications in mind, featuring responsive design, accessibility, and performance optimization
- 🔄 Complete Learning Pipeline: Includes sophisticated tools for data generation, analysis, and refinement, enabling continuous dataset improvement
- 🎨 Beyond Basic Charts: Teaches LLMs to understand and generate complex interactive visualizations, animations, and algorithmic demonstrations
Imagine asking an AI to "create an interactive force-directed graph showing user relationships with hover effects and zoom capabilities" and receiving production-ready, TypeScript-compatible D3.js code that just works. This project makes that future possible by providing the comprehensive training data needed to teach LLMs the intricate patterns and best practices of D3.js visualization development.
- 📊 Access to a diverse collection of visualization implementations
- 🔧 Tools for generating and analyzing visualization training data
- 🧪 Test cases and validation scenarios for each visualization type
- 📘 Detailed documentation and natural language descriptions
- 🔄 Continuous integration with OpenAI's latest models
This project aims to create a robust dataset of D3.js visualizations that can be used to train LLMs, featuring:
- Production-ready, fully implemented visualizations of varying complexity
- Clean, type-safe TypeScript code with React integration
- Comprehensive documentation of visualization parameters and usage
- Wide range of visualization types and complexity levels
- Structured data format suitable for LLM training
- Natural language descriptions paired with functional code
The ultimate goal is to enable LLMs to understand and generate accurate, functional D3.js visualizations based on natural language descriptions and requirements.
- 📊 Interactive D3.js visualizations
- 💪 Full TypeScript support
- 🔄 React component architecture
- 🎨 Modern dark theme with gold accents
- 🛠️ Easy-to-use visualization selector
- 📱 Responsive design
- 🐛 Comprehensive error handling
Our dataset is organized into progressive complexity levels to facilitate structured learning for LLMs:
- Bar Charts & Histograms
  - Simple data distribution visualization
  - Axis handling and scale transformations
  - Data binning and aggregation techniques
- Line & Area Charts
  - Time series data representation
  - Multiple series handling
  - Interpolation methods
- Scatter Plots
  - 2D data point visualization
  - Color and size encoding
  - Zoom and pan interactions
- Force-Directed Graphs
  - Node-link relationship visualization
  - Force simulation parameters
  - Interactive node dragging
- Arc Diagrams
  - Linear network layout
  - Edge bundling techniques
  - Node clustering strategies
- Tree Layouts
  - Hierarchical data structures
  - Collapsible tree interactions
  - Parent-child relationships
- Complex Network Visualizations
  - Multi-level force layouts
  - Dynamic graph updates
  - Large-scale data handling
- Sankey Diagrams
  - Flow visualization
  - Node ranking
  - Interactive flow tracing
- Pathfinding Algorithms
  - Step-by-step execution
  - State management
  - Algorithm comparison views
- Sorting Visualizations
  - Array manipulation
  - Transition animations
  - Performance metrics
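As an illustration of the "step-by-step execution" and "performance metrics" items for the algorithm categories, here is a minimal sketch of how a sorting visualization might record its steps before rendering them. The names `SortStep`, `SortTrace`, and `traceBubbleSort` are hypothetical and do not come from this repository; bubble sort is used only because it is short.

```typescript
// One recorded step; a renderer could animate between consecutive snapshots.
interface SortStep {
  array: number[];            // snapshot of the array after this step
  compared: [number, number]; // indices compared in this step
  swapped: boolean;           // whether the two elements were exchanged
}

interface SortTrace {
  steps: SortStep[];
  comparisons: number; // simple performance metrics for display
  swaps: number;
}

// Hypothetical helper: runs bubble sort on a copy and records every comparison.
function traceBubbleSort(input: readonly number[]): SortTrace {
  const array = [...input]; // never mutate the caller's data
  const steps: SortStep[] = [];
  let comparisons = 0;
  let swaps = 0;

  for (let end = array.length - 1; end > 0; end--) {
    let swappedThisPass = false;
    for (let i = 0; i < end; i++) {
      comparisons++;
      const swapped = array[i] > array[i + 1];
      if (swapped) {
        [array[i], array[i + 1]] = [array[i + 1], array[i]];
        swaps++;
        swappedThisPass = true;
      }
      steps.push({ array: [...array], compared: [i, i + 1], swapped });
    }
    if (!swappedThisPass) break; // already sorted, stop early
  }
  return { steps, comparisons, swaps };
}
```

Keeping the trace separate from the drawing code means the D3.js layer only has to bind each snapshot to elements and transition between them.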
Each visualization category includes:
- 📝 Detailed implementation documentation
- 🎨 Customizable styling options
- 🔧 Configuration parameters
- 📊 Sample datasets
- 🧪 Test cases
- 💡 Natural language descriptions
- 🔄 Interactive features
This structured approach enables LLMs to learn:
- Progressive complexity in visualization implementation
- Common patterns and best practices
- Data structure relationships
- Interactive feature implementation
- Performance optimization techniques
src/
├── components/
│ ├── Visualizations/ # Core visualization components
│ │ ├── Basic/ # Simple, foundational visualizations
│ │ └── DeepDive/ # Complex, advanced visualizations
│ ├── DataStructures/ # Reusable data structure implementations
│ │ ├── Graph/ # Graph-based visualization components
│ │ └── types.ts # TypeScript type definitions
│ ├── shared/ # Shared visualization components
│ │ ├── AlgorithmExploration/ # Interactive algorithm demonstrations
│ │ ├── DatasetExploration/ # Dataset visualization tools
│ │ ├── GraphVisualization/ # Graph rendering components
│ │ └── VisualizationContainer/ # Container components
│ └── VisualizationGallery/ # Gallery interface components
├── constants/ # Configuration and theme settings
│ ├── dataStructureConfig.ts # Data structure configurations
│ ├── visualizationConfig.ts # Visualization parameters
│ └── visualizationTheme.ts # Theming and styling constants
├── hooks/ # Custom React hooks
├── services/ # Data processing and API services
├── types/ # TypeScript type definitions
└── utils/ # Utility functions and helpers
Each visualization in the dataset includes:
- Implementation Files: TypeScript/React components with D3.js integration
- Type Definitions: Comprehensive TypeScript interfaces and types
- Natural Language Descriptions: Detailed descriptions of visualization purpose and behavior
- Configuration Options: Customizable parameters and their effects
- Usage Examples: Sample implementations with varying complexity levels
- Test Cases: Validation scenarios and edge cases
This structure is designed to provide LLMs with:
- Clear relationships between natural language requirements and implementation code
- Progressive complexity levels for learning visualization patterns
- Consistent patterns in component organization and implementation
- Rich context for understanding visualization architecture and best practices
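To make the pairing of natural language and code concrete, the sketch below shows one way a dataset entry could be typed and serialized as JSON Lines. The `TrainingExample` shape, its field names, and `toJsonl` are assumptions made for illustration; the project's actual schema may differ.

```typescript
// Hypothetical record shape; field names are illustrative, not the project's schema.
interface TrainingExample {
  id: string;
  query: string;        // natural language request
  description: string;  // what the visualization shows and how it behaves
  code: string;         // TypeScript/React + D3.js implementation source
  complexity: "basic" | "deepDive"; // mirrors the Basic/ and DeepDive/ folders
}

// Serialize one example per line (JSON Lines), a common format for training data.
function toJsonl(examples: readonly TrainingExample[]): string {
  // JSON.stringify escapes newlines inside strings, so each record stays on one line.
  return examples.map((example) => JSON.stringify(example)).join("\n");
}

const sample: TrainingExample[] = [
  {
    id: "bar-chart-001",
    query: "Create a bar chart of monthly sales",
    description: "Vertical bars with a band scale on x and a linear scale on y.",
    code: "const svg = d3.select(ref.current);\n// ...",
    complexity: "basic",
  },
];

const jsonl = toJsonl(sample);
```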
- Node.js (v14 or higher)
- npm or yarn
# Clone the repository
git clone https://github.com/yourusername/d3-visualization-gallery.git
# Navigate to project directory
cd d3-visualization-gallery
# Install dependencies
npm install
# Start development server
npm run dev
- Create a new directory under src/components/ for your visualization category
- Add your visualization component and related files
- Update visualizationConfig.ts with your new visualization details
- Add the visualization to the gallery renderer
- Use the provided color variables for consistency
- Follow the dark theme pattern
- Ensure responsive design
- Add smooth transitions for interactions
To add a new visualization to the gallery, follow these steps:
Create a new directory under src/components/Visualizations/Basic/ with your visualization name:
src/components/Visualizations/Basic/
└── YourVisualization/
    ├── YourVisualization.tsx # Main component
    ├── YourVisualization.css # Styles
    └── hooks/
        └── useYourVisualization.ts # D3.js logic
Add your visualization to src/constants/visualizationConfig.ts:
export const VISUALIZATIONS: Visualization[] = [
  // ... existing visualizations
  {
    id: 'your-visualization-id',
    name: 'Your Visualization Name',
    description: 'Brief description of your visualization',
    dataUrl: 'URL to your data source if applicable',
  },
];

// Add configuration if needed
export const YOUR_VIZ_CONFIG: YourVizConfig = {
  dimensions: {
    margin: { top: 20, right: 20, bottom: 20, left: 20 },
    width: 800,
    height: 600,
  },
  styles: {
    // Your visualization-specific styles
  },
};
Define your visualization's types in src/types/visualization.ts:
export interface YourVizConfig {
  dimensions: {
    margin: { top: number; right: number; bottom: number; left: number };
    width: number;
    height: number;
  };
  styles: {
    // Your visualization-specific style types
  };
}
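A config of this shape is typically consumed by subtracting the margins from the outer size to get the drawing area, then mapping data values into that area. The sketch below assumes that usage; `innerSize` and `linearScale` are hypothetical helpers written by hand so the example has no dependencies. In a real component, `d3.scaleLinear` would normally take the place of `linearScale`.

```typescript
interface Margin { top: number; right: number; bottom: number; left: number }
interface Dimensions { margin: Margin; width: number; height: number }

// Drawing area left over after the margins are removed (never negative).
function innerSize(d: Dimensions): { innerWidth: number; innerHeight: number } {
  const innerWidth = Math.max(0, d.width - d.margin.left - d.margin.right);
  const innerHeight = Math.max(0, d.height - d.margin.top - d.margin.bottom);
  return { innerWidth, innerHeight };
}

// Hand-rolled stand-in for a linear scale: maps a domain interval onto a range interval.
function linearScale(domain: [number, number], range: [number, number]) {
  const [d0, d1] = domain;
  const [r0, r1] = range;
  const span = d1 - d0;
  return (value: number): number =>
    span === 0 ? r0 : r0 + ((value - d0) / span) * (r1 - r0);
}

// Same values as the YOUR_VIZ_CONFIG example: 800x600 with 20px margins.
const dims: Dimensions = {
  margin: { top: 20, right: 20, bottom: 20, left: 20 },
  width: 800,
  height: 600,
};
const { innerWidth, innerHeight } = innerSize(dims);

// SVG y grows downward, so the range is flipped: data 0 sits at the bottom.
const y = linearScale([0, 100], [innerHeight, 0]);
```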
Add your visualization to the switch statement in src/components/VisualizationGallery/VisualizationGallery.tsx:
import YourVisualization from '../Visualizations/Basic/YourVisualization/YourVisualization';

// In the renderVisualization function:
case 'your-visualization-id':
  return <YourVisualization />;
- Use TypeScript for type safety
- Implement error handling for data loading
- Add loading states and error messages
- Make the visualization responsive
- Follow the existing component structure:
  - Use a custom hook for D3.js logic
  - Keep the React component clean
  - Separate styles into a CSS file
- Add proper cleanup in useEffect hooks
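One way to cover the loading-state and error-handling items is to model the data lifecycle as a small state machine that a custom hook can drive. This is a sketch under that assumption; `LoadState`, `LoadEvent`, and `reduceLoadState` are hypothetical names, not part of the existing codebase.

```typescript
// Every state the component may need to render.
type LoadState<T> =
  | { status: "idle" }
  | { status: "loading" }
  | { status: "success"; data: T }
  | { status: "error"; message: string };

type LoadEvent<T> =
  | { type: "start" }
  | { type: "resolve"; data: T }
  | { type: "reject"; error: unknown };

// Pure transition function; usable with React's useReducer or on its own.
function reduceLoadState<T>(state: LoadState<T>, event: LoadEvent<T>): LoadState<T> {
  if (event.type === "start") return { status: "loading" };
  // Ignore results that arrive when no load is in flight (for example after a reset).
  if (state.status !== "loading") return state;
  if (event.type === "resolve") return { status: "success", data: event.data };
  const message = event.error instanceof Error ? event.error.message : String(event.error);
  return { status: "error", message };
}
```

Because the reducer is pure, it can be unit-tested without rendering anything, and the component only has to switch on `status` to show a spinner, an error message, or the chart.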
- Add your visualization to the test suite
- Test with different screen sizes
- Verify error handling
- Check for memory leaks, for example with the browser's memory profiler alongside React DevTools
- Test data loading states
The /utils directory contains a suite of Python and JavaScript tools designed to generate, analyze, and process D3.js visualization data for LLM training:
Analyzes D3.js visualization files and extracts training data (utils/analyze_d3_data.py):
- Extracts data URLs and inline data from JavaScript files
- Downloads and processes external data sources
- Generates detailed data reports for each visualization
- Supports both static and dynamic data analysis
The generated reports contain:
- Data structure analysis
- Visualization parameters
- Data relationships
- Usage patterns
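The URL-extraction step can be pictured as scanning the source text for D3 loader calls. The sketch below is a TypeScript illustration of that idea only; it is not the analyzer's actual implementation, and `extractDataUrls` is a hypothetical name. It assumes the data source is passed as a string literal to `d3.csv`, `d3.tsv`, `d3.json`, `d3.text`, or `d3.xml`.

```typescript
// Hypothetical helper: collect literal URLs/paths passed to common D3 loaders.
function extractDataUrls(source: string): string[] {
  // Group 1 is the opening quote, group 2 the literal; \1 requires the same closing quote.
  const loaderCall = /\bd3\.(?:csv|tsv|json|text|xml)\(\s*(["'`])([^"'`]+)\1/g;
  const urls = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = loaderCall.exec(source)) !== null) {
    const literal = match[2];
    // Skip template literals with interpolation; they cannot be resolved statically.
    if (literal.indexOf("${") === -1) {
      urls.add(literal);
    }
  }
  return Array.from(urls); // insertion order, duplicates removed
}
```

A regex pass like this misses URLs stored in variables or built at runtime, which is one reason a tool might also need the dynamic analysis mentioned above.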
Creates structured training datasets:
python utils/generate_training_data.py --input-dir /path/to/visualizations --output-dir /path/to/output
- Processes D3.js visualization code
- Extracts implementation patterns
- Creates structured training examples
Generates natural language queries for training:
python utils/generate_training_queries.py --viz-dir /path/to/viz --num-queries 10
- Uses OpenAI API to generate diverse queries
- Creates pairs of queries and implementations
- Supports multiple visualization types
- Requires an OpenAI API key in the OPENAI_API_KEY environment variable
Refines and processes generated training data (utils/refine_training_data.py):
- Cleans and normalizes data
- Validates training examples
- Ensures consistent formatting
- Optimizes for LLM training
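As a rough picture of what cleaning, validating, and normalizing can involve, the sketch below trims and normalizes each query/code pair, drops empty entries, and removes duplicates. It is an illustrative TypeScript sketch, not the refinement script itself; `RawExample` and `refineExamples` are hypothetical names, and the specific rules are assumptions.

```typescript
interface RawExample {
  query: string; // natural language request
  code: string;  // paired implementation
}

// Hypothetical refinement pass: normalize, validate, and de-duplicate examples.
function refineExamples(examples: readonly RawExample[]): RawExample[] {
  const seen = new Set<string>();
  const refined: RawExample[] = [];

  for (const example of examples) {
    // Collapse runs of whitespace in the query; normalize line endings in the code.
    const query = example.query.trim().replace(/\s+/g, " ");
    const code = example.code.replace(/\r\n/g, "\n").trim();

    // Validation: an example needs both halves of the pair.
    if (query === "" || code === "") continue;

    // Treat queries that differ only by letter case as duplicates.
    const key = query.toLowerCase() + "\u0000" + code;
    if (seen.has(key)) continue;
    seen.add(key);

    refined.push({ query, code });
  }
  return refined;
}
```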
Handles OpenAI API interactions:
- Infers visualization properties
- Generates natural language descriptions
- Validates implementation patterns
Translates between natural language and D3.js code:
- Converts natural language queries to D3.js implementations
- Generates code explanations
- Handles complex visualization requirements
- Set up OpenAI API access:
export OPENAI_API_KEY='your-api-key-here'
- Install required Python packages (argparse and pathlib are part of the Python 3 standard library and do not need to be installed):
pip install openai requests
- Directory structure for visualization analysis:
visualization_dir/
├── viz.js # D3.js visualization code
├── data_report.txt # Generated data analysis
├── explanation.txt # Visualization explanation
└── inferred_data_report.txt # AI-inferred data properties
- Analyze Visualizations
python utils/analyze_d3_data.py --viz-dir /path/to/visualizations
- Generate Training Queries
python utils/generate_training_queries.py --viz-dir /path/to/visualizations
- Create Training Data
python utils/generate_training_data.py --input-dir /path/to/processed
- Refine Dataset
python utils/refine_training_data.py --input-dir /path/to/training-data
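If the four steps are run together, it may help to generate them from one set of paths. The sketch below only assembles the commands shown above, using the same script names and flags; `PipelinePaths` and `pipelineCommands` are hypothetical, and the split into three directories is an assumption based on the example paths.

```typescript
interface PipelinePaths {
  vizDir: string;          // directory of D3.js visualizations to analyze
  processedDir: string;    // input for training-data generation
  trainingDataDir: string; // input for the refinement step
}

// Returns the pipeline as argv arrays, in execution order.
// Arrays avoid shell-quoting problems when paths contain spaces.
function pipelineCommands(paths: PipelinePaths): string[][] {
  return [
    ["python", "utils/analyze_d3_data.py", "--viz-dir", paths.vizDir],
    ["python", "utils/generate_training_queries.py", "--viz-dir", paths.vizDir],
    ["python", "utils/generate_training_data.py", "--input-dir", paths.processedDir],
    ["python", "utils/refine_training_data.py", "--input-dir", paths.trainingDataDir],
  ];
}

const commands = pipelineCommands({
  vizDir: "/path/to/visualizations",
  processedDir: "/path/to/processed",
  trainingDataDir: "/path/to/training-data",
});
```

Each entry can be handed to a process runner such as Node's `child_process.spawnSync`, with the first element as the command and the rest as arguments.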
This pipeline creates a comprehensive dataset of D3.js visualizations paired with natural language descriptions, suitable for training LLMs to generate and modify data visualizations.
--color-background: #1D1F21;
--color-surface: #16181A;
--color-primary: #C1A15A;
--color-text: #FFFFFF;
--color-text-secondary: #9CA3AF;
--color-border: #2D3748;
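For D3.js code that sets colors from TypeScript rather than CSS, the same palette can be mirrored as a typed constant. This is a sketch of one possible arrangement; `THEME_COLORS` and `toCssDeclarations` are hypothetical names, and the contents of the repository's own visualizationTheme.ts may be organized differently.

```typescript
// Same values as the CSS custom properties above, keyed by variable name.
const THEME_COLORS = {
  "--color-background": "#1D1F21",
  "--color-surface": "#16181A",
  "--color-primary": "#C1A15A",
  "--color-text": "#FFFFFF",
  "--color-text-secondary": "#9CA3AF",
  "--color-border": "#2D3748",
} as const;

// Render a variable map as CSS declarations, one per line.
function toCssDeclarations(vars: Readonly<Record<string, string>>): string {
  return Object.keys(vars)
    .map((name) => `${name}: ${vars[name]};`)
    .join("\n");
}

const declarations = toCssDeclarations(THEME_COLORS);
```

Keeping one source of truth for the palette avoids hard-coded hex values drifting apart between stylesheets and D3.js attribute calls.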
This project is licensed under the MIT License - see the LICENSE file for details.
Contributions are welcome! Please feel free to submit a Pull Request.