Random Forest Example

This is a simple example of a Random Forest regressor built with Python and scikit-learn.

Random Forest Overview

Random Forest is an ensemble learning method that combines the predictions of multiple decision trees to improve the overall accuracy and robustness of the model. It is effective for both classification and regression tasks. Random Forest introduces randomness during training in two ways: each tree is trained on a bootstrap sample of the data (bootstrap aggregation, or bagging), and only a random subset of the features is considered at each split.
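Both sources of randomness mentioned above map directly onto scikit-learn parameters, and the ensemble's prediction is simply the average of its trees' predictions. The sketch below is a minimal, illustrative addition (not part of the main example further down): the synthetic dataset and parameter values are assumptions chosen only to demonstrate this behavior.

# Minimal sketch of bagging and per-split feature sampling in scikit-learn.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Illustrative synthetic data with 5 features.
X, y = make_regression(n_samples=200, n_features=5, noise=0.5, random_state=0)

forest = RandomForestRegressor(
    n_estimators=50,      # number of trees in the ensemble
    bootstrap=True,       # each tree trains on a bootstrap sample of the rows
    max_features='sqrt',  # random subset of features considered at each split
    random_state=0,
)
forest.fit(X, y)

# The forest's prediction is the average of the individual trees' predictions.
per_tree = np.stack([tree.predict(X[:3]) for tree in forest.estimators_])
print(per_tree.mean(axis=0))   # average over the 50 trees
print(forest.predict(X[:3]))   # same values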

Key characteristics of Random Forest:

- High performance and versatility across classification and regression tasks
- Handles large and complex datasets
- Provides feature importance scores, which help with feature selection (see the sketch below)
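A minimal sketch of reading those importance scores from a fitted forest follows; the dataset, feature names, and parameter values here are illustrative assumptions and not part of the main example.

# Minimal sketch of inspecting feature importance scores.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Illustrative data with 4 features, only 2 of which are informative.
X, y = make_classification(n_samples=300, n_features=4, n_informative=2,
                           random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# feature_importances_ holds one impurity-based score per feature; they sum to 1.
for name, score in zip(['f0', 'f1', 'f2', 'f3'], clf.feature_importances_):
    print(f'{name}: {score:.3f}')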

Python Source Code:

# Import necessary libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

# Generate synthetic data
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the Random Forest model
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train.ravel())

# Make predictions on the test set
y_pred = model.predict(X_test)

# Evaluate the model
mse = mean_squared_error(y_test, y_pred)
print(f'Mean Squared Error: {mse}')

# Plot the results
plt.scatter(X_test, y_test, color='black', label='Actual')                # actual test targets
plt.scatter(X_test, y_pred, color='red', marker='x', label='Predicted')   # model predictions
plt.legend()
plt.title('Random Forest Example')
plt.xlabel('X')
plt.ylabel('y')
plt.show()

Explanation:

The script generates 100 noisy samples from the linear relationship y = 4 + 3x, splits them into an 80% training set and a 20% test set, and fits a RandomForestRegressor with 100 trees on the training data. The trained model then predicts the targets for the test set, and the quality of these predictions is summarized with the mean squared error (MSE). Finally, the scatter plot compares the actual test targets (black dots) with the model's predictions (red crosses).
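To make the benefit of averaging many trees concrete, the sketch below compares a single DecisionTreeRegressor with the Random Forest on the same synthetic data. This is an illustrative addition rather than part of the original example; the data is regenerated so the snippet runs on its own, and the exact error values depend on the random data, so no particular outcome is guaranteed.

# Illustrative comparison: single decision tree vs. Random Forest.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

# Regenerate the same synthetic data as in the example above.
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = (4 + 3 * X + np.random.randn(100, 1)).ravel()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit one unconstrained tree and a 100-tree forest on the same training split.
tree = DecisionTreeRegressor(random_state=42).fit(X_train, y_train)
forest = RandomForestRegressor(n_estimators=100, random_state=42).fit(X_train, y_train)

print('Single tree MSE  :', mean_squared_error(y_test, tree.predict(X_test)))
print('Random forest MSE:', mean_squared_error(y_test, forest.predict(X_test)))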