Compare multiple ML algorithms, tune their hyperparameters with cross-validation, and pick the best model with statistical confidence.
A model comparison report showing accuracy, precision, recall, and training time for 5 algorithms, plus a tuned best model found via GridSearchCV — with a fully reproducible experiment script.
Never just try one model. The best algorithm depends on your data. Spend 10 minutes comparing 5 and you'll usually find one that's clearly better.
```python
import time

import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

models = {
    'Logistic Regression': LogisticRegression(max_iter=1000),
    'Decision Tree': DecisionTreeClassifier(random_state=42),
    'Random Forest': RandomForestClassifier(n_estimators=100, random_state=42),
    'Gradient Boosting': GradientBoostingClassifier(random_state=42),
    'SVM': SVC(probability=True, random_state=42),  # random_state for reproducibility
}

results = []
for name, model in models.items():
    start = time.time()
    scores = cross_val_score(model, X_train, y_train, cv=5, scoring='f1')
    elapsed = time.time() - start
    results.append({
        'Model': name,
        'CV F1 Mean': scores.mean(),
        'CV F1 Std': scores.std(),
        'Train Time (s)': round(elapsed, 2),
    })

results_df = pd.DataFrame(results).sort_values('CV F1 Mean', ascending=False)
print(results_df.to_string())
```
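The opening line promises picking a winner "with statistical confidence." One common way to sketch that is a paired t-test on the fold-by-fold CV scores of the top two models: because each model is scored on the same folds, the scores are paired. The snippet below is a minimal sketch of that idea, using a synthetic dataset from `make_classification` as a stand-in for the tutorial's `X_train`/`y_train` (note: CV folds overlap in training data, so this test leans optimistic).

```python
# Sketch: paired t-test on per-fold CV scores of two candidate models.
# Synthetic data stands in for X_train/y_train.
from scipy.stats import ttest_rel
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, random_state=42)

# Same cv and scoring for both models, so scores are paired per fold
rf_scores = cross_val_score(
    RandomForestClassifier(n_estimators=100, random_state=42),
    X, y, cv=5, scoring='f1')
lr_scores = cross_val_score(
    LogisticRegression(max_iter=1000),
    X, y, cv=5, scoring='f1')

t_stat, p_value = ttest_rel(rf_scores, lr_scores)
print(f"RF mean F1: {rf_scores.mean():.3f}, LR mean F1: {lr_scores.mean():.3f}")
print(f"Paired t-test p-value: {p_value:.3f}")
# A small p-value (e.g. < 0.05) suggests the gap is unlikely to be fold noise
```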
Hyperparameters are settings you configure before training (like n_estimators in Random Forest). GridSearchCV tries every combination exhaustively. RandomizedSearchCV samples a fixed number of candidates from distributions — much faster for large search spaces.
```python
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

# Grid search (exhaustive — try every combination)
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [None, 5, 10],
    'min_samples_split': [2, 5],
}
# 3 x 3 x 2 = 18 combinations x 5 folds = 90 fits

gs = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid,
    cv=5,
    scoring='f1',
    n_jobs=-1,   # use all CPU cores
    verbose=1,
)
gs.fit(X_train, y_train)

print(f"Best params: {gs.best_params_}")
print(f"Best CV F1: {gs.best_score_:.3f}")

# Best model is already refit on the full training set
best_model = gs.best_estimator_

# Note: a classifier's .score() reports accuracy, so compute F1 explicitly
print(f"Test F1: {f1_score(y_test, best_model.predict(X_test)):.3f}")
```
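For comparison, the same search can be run with RandomizedSearchCV, which samples candidates from distributions instead of enumerating a grid. Below is a minimal sketch of that variant; the parameter ranges are illustrative choices, and a synthetic dataset stands in for the tutorial's `X_train`/`y_train`.

```python
# Sketch: randomized search over the same Random Forest hyperparameters,
# sampling 10 candidates instead of trying every grid combination.
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X_train, y_train = make_classification(n_samples=500, n_features=20, random_state=42)

param_distributions = {
    'n_estimators': randint(50, 300),       # sampled from a range, not enumerated
    'max_depth': [None, 5, 10, 20],
    'min_samples_split': randint(2, 11),
}

rs = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions,
    n_iter=10,            # 10 sampled combos x 5 folds = 50 fits
    cv=5,
    scoring='f1',
    n_jobs=-1,
    random_state=42,      # makes the sampling reproducible
)
rs.fit(X_train, y_train)

print(f"Best params: {rs.best_params_}")
print(f"Best CV F1: {rs.best_score_:.3f}")
```

The trade-off: randomized search may miss the exact grid optimum, but for large spaces it usually finds a near-best configuration in a fraction of the fits.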
Cross-validation gives you a much more reliable accuracy estimate than a single train/test split. K-fold splits data into K parts, trains on K-1, tests on 1, and rotates.
```python
from sklearn.model_selection import StratifiedKFold, cross_validate

# Stratified: preserves class ratio in each fold
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# Multiple metrics at once
results = cross_validate(
    best_model, X_train, y_train,
    cv=cv,
    scoring=['accuracy', 'precision', 'recall', 'f1'],
    return_train_score=True,
)

for metric in ['test_accuracy', 'test_precision', 'test_recall', 'test_f1']:
    scores = results[metric]
    print(f"{metric}: {scores.mean():.3f} +/- {scores.std():.3f}")

# High train score + low test score = overfitting
print(f"Train acc: {results['train_accuracy'].mean():.3f}")
```
Overfit check: If train accuracy is 0.99 and test accuracy is 0.80, your model memorized the training data instead of learning patterns. Reduce model complexity or add more data.
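One way to see this memorization effect and pick a sensible complexity level is `validation_curve`, which sweeps a single hyperparameter and reports train vs. validation scores at each value. The sketch below uses a decision tree's `max_depth` as the complexity knob, with synthetic data standing in for the tutorial's training set.

```python
# Sketch: diagnose overfitting by sweeping max_depth and watching
# the train/validation gap grow.
from sklearn.datasets import make_classification
from sklearn.model_selection import validation_curve
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=42)

depths = [2, 4, 8, 16, None]   # None = grow until pure leaves
train_scores, val_scores = validation_curve(
    DecisionTreeClassifier(random_state=42), X, y,
    param_name='max_depth', param_range=depths,
    cv=5, scoring='accuracy',
)

for d, tr, va in zip(depths, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    print(f"max_depth={d}: train={tr:.3f} val={va:.3f} gap={tr - va:.3f}")
# A widening gap as depth increases is the memorization signal described above:
# pick the depth where validation score peaks, not where train score does.
```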
Choosing the right algorithm saves hours. Here's a quick reference.
Before moving on, make sure you can answer these without looking: