A comprehensive, production-ready machine learning package for classifying iris flowers using multiple algorithms with detailed analysis, visualization, and enterprise-grade deployment capabilities.
git clone https://github.com/pyenthusiasts/Iris-Flower-Classification.git
cd Iris-Flower-Classification
pip install -r requirements.txt
pip install -e .
python main.py
This will run a complete analysis including:
# Start the API server
make api
# or
uvicorn iris_classifier.api:app --reload
# API will be available at http://localhost:8000
# Interactive docs at http://localhost:8000/docs
Make predictions via HTTP:
curl -X POST "http://localhost:8000/predict" \
  -H "Content-Type: application/json" \
  -d '{
    "sample": {
      "sepal_length": 5.1,
      "sepal_width": 3.5,
      "petal_length": 1.4,
      "petal_width": 0.2
    },
    "model_name": "random_forest",
    "include_probabilities": true
  }'
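The same request can be issued from Python. Here is a minimal sketch using only the standard library; the payload mirrors the curl example above, and the base URL assumes a locally running server:

```python
import json
from urllib import request

payload = {
    "sample": {
        "sepal_length": 5.1,
        "sepal_width": 3.5,
        "petal_length": 1.4,
        "petal_width": 0.2,
    },
    "model_name": "random_forest",
    "include_probabilities": True,
}

def predict(base_url="http://localhost:8000"):
    """POST the payload to /predict and return the decoded JSON response."""
    req = request.Request(
        f"{base_url}/predict",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())
```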
# Train a specific model
iris-classifier train --model random_forest --save
# Compare all models
iris-classifier compare --plot
# Make a prediction
iris-classifier predict 5.0 3.6 1.4 0.2 --model random_forest
# Display dataset information
iris-classifier info --stats
from iris_classifier import IrisDataLoader, ModelFactory, ModelEvaluator
# Load data
loader = IrisDataLoader()
X_train, X_test, y_train, y_test = loader.get_train_test_split()
# Train a model
model = ModelFactory.create_model('random_forest')
model.fit(X_train, y_train)
# Evaluate
evaluator = ModelEvaluator()
results = evaluator.evaluate_model(model, X_test, y_test)
evaluator.print_evaluation_report(results)
The package includes a comprehensive CLI with the following commands:
iris-classifier train [OPTIONS]
Options:
--model TEXT Model to train (default: decision_tree)
--test-size FLOAT Test set size (default: 0.3)
--scale Scale features using StandardScaler
--save Save trained model
iris-classifier compare [OPTIONS]
Options:
--test-size FLOAT Test set size (default: 0.3)
--scale Scale features
--cv INTEGER Number of CV folds (default: 5)
--plot Show comparison plot
iris-classifier predict SEPAL_LENGTH SEPAL_WIDTH PETAL_LENGTH PETAL_WIDTH [OPTIONS]
Options:
--model TEXT Model to use (default: decision_tree)
--model-file TEXT Load model from file
--scale Scale features
iris-classifier visualize [OPTIONS]
Options:
--type TEXT Visualization type: all, distribution, pairplot, correlation, pca, classes
iris-classifier info [OPTIONS]
Options:
--stats Show detailed feature statistics
from iris_classifier import IrisDataLoader
# Basic usage
loader = IrisDataLoader()
X, y = loader.get_full_dataset()
# With feature scaling
loader = IrisDataLoader(scale=True)
X_train, X_test, y_train, y_test = loader.get_train_test_split(test_size=0.3)
# Get dataset information
info = loader.get_dataset_info()
stats = loader.get_feature_statistics()
from iris_classifier.models import ModelFactory, ModelTrainer
# Create a model
model = ModelFactory.create_model('random_forest')
# With custom parameters
model = ModelFactory.create_model('decision_tree', {'max_depth': 5})
# Train with ModelTrainer
trainer = ModelTrainer(model, 'random_forest')
trainer.train(X_train, y_train)
# Make predictions
predictions = trainer.predict(X_test)
probabilities = trainer.predict_proba(X_test)
# Save model
trainer.save('my_model.pkl')
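A saved model can be reloaded in a later session. The exact serialization format is an assumption here; if `ModelTrainer.save()` writes a standard pickle, as the `.pkl` extension suggests, a plain `pickle.load` is enough:

```python
import pickle

def load_model(path):
    """Reload an object saved as a standard pickle file."""
    with open(path, "rb") as fh:
        return pickle.load(fh)

# Hypothetical usage, assuming 'my_model.pkl' was written by trainer.save():
# model = load_model('my_model.pkl')
# predictions = model.predict(X_test)
```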
from iris_classifier import ModelEvaluator
evaluator = ModelEvaluator()
# Evaluate a single model
results = evaluator.evaluate_model(model, X_test, y_test, 'random_forest')
evaluator.print_evaluation_report(results)
# Cross-validation
cv_results = evaluator.cross_validate_model(model, X_train, y_train, cv=5)
# Compare multiple models
models = ModelFactory.get_all_models()
comparison_df = evaluator.compare_models(models, X_train, y_train, X_test, y_test)
# Get best model
best_model_name = evaluator.get_best_model(comparison_df)
from iris_classifier import IrisVisualizer
visualizer = IrisVisualizer()
# Data visualizations
visualizer.plot_feature_distributions(X, y)
visualizer.plot_pairplot(X, y)
visualizer.plot_correlation_matrix(X)
visualizer.plot_pca_visualization(X, y)
visualizer.plot_class_distribution(y)
# Model visualizations
visualizer.plot_confusion_matrix(y_test, predictions, model_name='Random Forest')
visualizer.plot_model_comparison(comparison_df)
visualizer.plot_feature_importance(model)
The notebooks/ directory contains interactive notebooks:
To run notebooks:
jupyter notebook notebooks/
Build and run with Docker:
# Build image
make docker-build
# Run container
make docker-run
Or manually:
docker build -t iris-classifier:latest .
docker run -d -p 8000:8000 --name iris-api iris-classifier:latest
Access the API at http://localhost:8000
Run the complete stack with monitoring:
make docker-compose-up
This starts:
Deploy to Kubernetes cluster:
# Apply all configurations
kubectl apply -f k8s/ -n iris-classifier
# Check status
kubectl get pods -n iris-classifier
kubectl get svc -n iris-classifier
Features:
Detailed deployment guide: See DEPLOYMENT.md
API documentation: See API.md
Iris-Flower-Classification/
├── src/
│ └── iris_classifier/
│ ├── __init__.py # Package initialization
│ ├── config.py # Configuration settings
│ ├── data_loader.py # Data loading and preprocessing
│ ├── models.py # ML model factory and trainer
│ ├── evaluator.py # Model evaluation and comparison
│ ├── visualizer.py # Visualization tools
│ ├── utils.py # Utility functions
│ └── cli.py # Command-line interface
├── tests/
│ ├── __init__.py
│ ├── conftest.py # Pytest configuration
│ ├── test_data_loader.py # Data loader tests
│ ├── test_models.py # Model tests
│ ├── test_evaluator.py # Evaluator tests
│ └── test_utils.py # Utility tests
├── notebooks/
│ ├── 01_exploratory_data_analysis.ipynb
│ └── 02_model_training_and_comparison.ipynb
├── data/ # Data directory
│ └── README.md
├── models/ # Saved models directory
│ └── README.md
├── docs/ # Documentation
├── .github/
│ └── workflows/
│ └── ci.yml # CI/CD pipeline
├── main.py # Main script
├── requirements.txt # Dependencies
├── setup.py # Package setup
├── pyproject.toml # Build configuration
├── README.md # This file
├── CONTRIBUTING.md # Contribution guidelines
├── CODE_OF_CONDUCT.md # Code of conduct
├── LICENSE # MIT license
└── .gitignore # Git ignore rules
The package supports 8 different classification algorithms:
| Model | Description | Key Parameters |
|---|---|---|
| `decision_tree` | Decision Tree Classifier | max_depth, min_samples_split |
| `random_forest` | Random Forest Classifier | n_estimators, max_depth |
| `svm` | Support Vector Machine | kernel, C, gamma |
| `knn` | K-Nearest Neighbors | n_neighbors, weights |
| `logistic_regression` | Logistic Regression | C, solver |
| `naive_bayes` | Gaussian Naive Bayes | - |
| `gradient_boosting` | Gradient Boosting | n_estimators, learning_rate |
| `mlp` | Multi-Layer Perceptron | hidden_layer_sizes, max_iter |
from iris_classifier import IrisDataLoader, ModelFactory, ModelEvaluator
# Load and split data
loader = IrisDataLoader()
X_train, X_test, y_train, y_test = loader.get_train_test_split()
# Train model
model = ModelFactory.create_model('random_forest')
model.fit(X_train, y_train)
# Evaluate
evaluator = ModelEvaluator()
results = evaluator.evaluate_model(model, X_test, y_test)
print(f"Accuracy: {results['accuracy']:.4f}")
from iris_classifier import IrisDataLoader, ModelFactory, ModelEvaluator, IrisVisualizer
# Setup
loader = IrisDataLoader()
X_train, X_test, y_train, y_test = loader.get_train_test_split()
# Compare all models
models = ModelFactory.get_all_models()
evaluator = ModelEvaluator()
comparison = evaluator.compare_models(models, X_train, y_train, X_test, y_test)
# Visualize results
visualizer = IrisVisualizer()
visualizer.plot_model_comparison(comparison)
# Get best model
best = evaluator.get_best_model(comparison)
print(f"Best model: {best}")
from iris_classifier import IrisDataLoader, ModelFactory
# Load data and train model
loader = IrisDataLoader()
X_train, X_test, y_train, y_test = loader.get_train_test_split()
model = ModelFactory.create_model('random_forest')
model.fit(X_train, y_train)
# Prepare new sample
sample = loader.predict_sample(5.0, 3.6, 1.4, 0.2)
# Predict
prediction = model.predict(sample)[0]
probabilities = model.predict_proba(sample)[0]
print(f"Predicted species: {loader.target_names[prediction]}")
for name, prob in zip(loader.target_names, probabilities):
print(f" {name}: {prob:.2%}")
The API exposes metrics at /metrics:
# View metrics
curl http://localhost:8000/metrics
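The endpoint serves Prometheus text exposition format. A small, self-contained sketch for pulling values out of that output (the sample string below is illustrative, not captured from a live server):

```python
def parse_metrics(text):
    """Parse Prometheus text-format metrics into a {name: value} dict."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and HELP/TYPE comment lines.
        if not line or line.startswith("#"):
            continue
        # The value is the last whitespace-separated token; the name
        # (including any {label="..."} part) is everything before it.
        name, _, value = line.rpartition(" ")
        try:
            metrics[name] = float(value)
        except ValueError:
            pass
    return metrics

sample = """\
# HELP iris_predictions_total Total number of predictions
# TYPE iris_predictions_total counter
iris_predictions_total 42.0
iris_errors_total{type="validation"} 3.0
"""
```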
Key metrics:
- iris_predictions_total: Total number of predictions
- iris_prediction_duration_seconds: Prediction latency histogram
- iris_errors_total: Error count by type

Access Grafana at http://localhost:3000 (when using Docker Compose). Dashboards are provisioned from monitoring/grafana/dashboards/.
# Check API health
curl http://localhost:8000/health
# Response
{
"status": "healthy",
"version": "2.0.0",
"models_loaded": 3,
"uptime_seconds": 3600.5
}
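A readiness probe can be built on top of this response. Below is a minimal standard-library sketch; the field names follow the sample response above, and the requirement that at least one model be loaded is an assumption:

```python
import json
from urllib import error, request

def healthy(body):
    """Interpret a decoded /health payload."""
    return body.get("status") == "healthy" and body.get("models_loaded", 0) > 0

def probe(base_url="http://localhost:8000"):
    """Return True if the API at base_url reports itself healthy."""
    try:
        with request.urlopen(f"{base_url}/health", timeout=5) as resp:
            return healthy(json.loads(resp.read()))
    except (error.URLError, ValueError):
        return False
```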
Run performance benchmarks:
make benchmark
# or
python scripts/benchmark.py
Results include:
Run load tests with Locust:
make load-test
# or
locust -f tests/load_test.py --host=http://localhost:8000
Access Locust UI at http://localhost:8089
With default configuration (4 workers):
# Clone the repository
git clone https://github.com/pyenthusiasts/Iris-Flower-Classification.git
cd Iris-Flower-Classification
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
pip install -e ".[dev]"
# Format code
black src/iris_classifier
# Lint code
flake8 src/iris_classifier --max-line-length=100
# Type checking
mypy src/iris_classifier
# Run all tests
pytest tests/
# Run with coverage
pytest tests/ --cov=iris_classifier --cov-report=html
# Run specific test file
pytest tests/test_models.py
# Run specific test
pytest tests/test_models.py::TestModelFactory::test_create_decision_tree
The project maintains >80% test coverage across all modules.
We welcome contributions! Please see CONTRIBUTING.md for details on:
This project is licensed under the MIT License - see the LICENSE file for details.
If you use this package in your research, please cite:
@software{iris_classifier,
author = {Your Name},
title = {Iris Flower Classification},
year = {2024},
url = {https://github.com/pyenthusiasts/Iris-Flower-Classification}
}
Happy Classifying! 🌸