This is a simple example of AdaBoost using Python and scikit-learn.
AdaBoost (Adaptive Boosting) is an ensemble learning method that combines the predictions of multiple weak learners into a strong learner. A weak learner is a model that performs only slightly better than random chance. AdaBoost trains the weak learners one after another and keeps a weight for each instance in the training set. After each round it increases the weights of the misclassified instances, so the next weak learner focuses on correcting those mistakes. The final prediction is a weighted vote over the weak learners' predictions, in which more accurate learners receive larger weights.
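The reweighting loop described above can be sketched from scratch. This is a minimal illustration of the classic discrete AdaBoost update for labels in {-1, +1}, not the code scikit-learn runs internally. The names fit_adaboost_sketch and predict_adaboost_sketch, the small demo dataset, and the choice of 20 rounds are assumptions made for this example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier


def fit_adaboost_sketch(X, y, n_rounds=20):
    """Fit discrete AdaBoost on labels in {-1, +1}; return (stumps, alphas)."""
    n_samples = len(y)
    weights = np.full(n_samples, 1.0 / n_samples)  # start with uniform instance weights
    stumps, alphas = [], []
    for _ in range(n_rounds):
        # Weak learner: a depth-1 tree trained on the weighted data.
        stump = DecisionTreeClassifier(max_depth=1, random_state=0)
        stump.fit(X, y, sample_weight=weights)
        pred = stump.predict(X)

        # Weighted error, clipped so the logarithm below stays finite.
        err = np.clip(weights[pred != y].sum(), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)  # vote weight of this learner

        # Misclassified instances (y * pred == -1) gain weight, the rest lose it.
        weights = weights * np.exp(-alpha * y * pred)
        weights = weights / weights.sum()

        stumps.append(stump)
        alphas.append(alpha)
    return stumps, np.array(alphas)


def predict_adaboost_sketch(stumps, alphas, X):
    """Weighted vote: sign of the alpha-weighted sum of stump predictions."""
    votes = sum(alpha * stump.predict(X) for stump, alpha in zip(stumps, alphas))
    return np.where(votes >= 0, 1, -1)


# Hypothetical demo data; labels are mapped from {0, 1} to {-1, +1}.
X_demo, y01 = make_classification(n_samples=300, n_features=10,
                                  n_informative=5, random_state=0)
y_demo = 2 * y01 - 1
stumps, alphas = fit_adaboost_sketch(X_demo, y_demo)
demo_pred = predict_adaboost_sketch(stumps, alphas, X_demo)
train_acc = np.mean(demo_pred == y_demo)
print(f"Training accuracy of the sketch: {train_acc:.3f}")
```

Each alpha grows as the weighted error of its stump shrinks, which is why more accurate learners carry more weight in the final vote.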
Key concepts of AdaBoost:
AdaBoost is known for its simplicity and effectiveness, and it is often used with decision trees as weak learners.
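To see why shallow decision trees are a common choice of weak learner, one can compare a single depth-1 tree (a decision stump) with an AdaBoost ensemble of 50 such stumps. The sketch below is only an illustration on synthetic data, and the exact numbers depend on the dataset and the scikit-learn version. The variable names stump, boosted, stump_acc and boosted_acc are chosen for this example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data (illustrative settings).
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_clusters_per_class=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# A single decision stump: the weak learner on its own.
stump = DecisionTreeClassifier(max_depth=1, random_state=42)
stump.fit(X_train, y_train)

# AdaBoost over 50 stumps. The estimator is passed positionally so the call
# does not depend on the keyword name used by a given scikit-learn release.
boosted = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),
                             n_estimators=50, random_state=42)
boosted.fit(X_train, y_train)

stump_acc = stump.score(X_test, y_test)
boosted_acc = boosted.score(X_test, y_test)
print(f"Single stump accuracy: {stump_acc:.3f}")
print(f"AdaBoost accuracy:     {boosted_acc:.3f}")
```

On data like this the boosted ensemble typically scores noticeably higher than the single stump, though that is not guaranteed for every dataset.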
Python Source Code:
# Import necessary libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
# Generate synthetic classification data
np.random.seed(42)
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10, n_clusters_per_class=2, random_state=42)
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
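# Build an AdaBoost model with a decision tree stump as the base estimator.
# The estimator is passed positionally because its keyword is `estimator` in
# scikit-learn 1.2+ and `base_estimator` in older releases.
base_estimator = DecisionTreeClassifier(max_depth=1)
adaboost = AdaBoostClassifier(base_estimator, n_estimators=50, random_state=42)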
adaboost.fit(X_train, y_train)
# Make predictions on the test set
y_pred = adaboost.predict(X_test)
# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
print(f'Accuracy: {accuracy}')
print(f'Confusion Matrix:\n{conf_matrix}')
# Plot the results
plt.scatter(X_test[:, 0], X_test[:, 1], c=y_test, cmap='viridis', marker='o', edgecolors='black', label='Actual Data')
plt.title('AdaBoost Example')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.legend()
plt.show()
Explanation:
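We generate synthetic classification data using the make_classification function from scikit-learn.
We split the data into training and testing sets using the train_test_split function.
We build an AdaBoost model using AdaBoostClassifier with a decision tree as the base estimator.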
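As a follow-up to the steps above, it can be useful to check how test accuracy changes as weak learners are added. The sketch below uses the staged_score method of AdaBoostClassifier, which yields one score per boosting round. The data settings mirror the example, and the names model and staged_acc are chosen for this illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_clusters_per_class=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),
                           n_estimators=50, random_state=42)
model.fit(X_train, y_train)

# staged_score yields the test accuracy after each boosting round.
staged_acc = list(model.staged_score(X_test, y_test))

# Report a few checkpoints; the set guards against an early-stopped ensemble.
for n_learners in sorted({1, min(10, len(staged_acc)), len(staged_acc)}):
    print(f"{n_learners:>2} weak learners: accuracy {staged_acc[n_learners - 1]:.3f}")
```

If the accuracy levels off early, a smaller n_estimators may be enough. If it is still rising at the last round, trying a larger value is a reasonable next step.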