k-Nearest Neighbors (KNN) Example

This is a simple example of k-Nearest Neighbors (KNN) regression using Python and scikit-learn.

k-Nearest Neighbors Overview

k-Nearest Neighbors (KNN) is a simple and intuitive machine learning algorithm used for both classification and regression tasks. A new data point is assigned the majority class of its k nearest neighbors in the feature space (classification), or the average of their target values (regression). The value of k controls how many neighbors are consulted for each prediction.
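To make the averaging step concrete, here is a minimal from-scratch sketch of the regression case. The knn_predict function and the toy data are illustrative assumptions, not part of the scikit-learn example below:

# Minimal KNN regression by hand, assuming Euclidean distance
# (illustrative sketch; not part of the main example)
import numpy as np

def knn_predict(X_train, y_train, x_query, k=3):
    # Distance from the query point to every training point
    distances = np.linalg.norm(X_train - x_query, axis=1)
    # Indices of the k closest training points
    nearest = np.argsort(distances)[:k]
    # Regression: average the targets of those k neighbors
    return y_train[nearest].mean()

X_demo = np.array([[1.0], [2.0], [3.0], [4.0]])
y_demo = np.array([1.5, 2.5, 3.5, 4.5])
print(knn_predict(X_demo, y_demo, np.array([2.2]), k=2))  # 3.0, the mean of 2.5 and 3.5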

Key concepts of k-Nearest Neighbors:

- KNN is non-parametric: it makes no strong assumptions about the underlying data distribution.
- Results are sensitive to the choice of distance metric and the value of k (see the sketch after this list).
- KNN works well on small to medium-sized datasets, but prediction can be computationally expensive on large datasets because each query must be compared against all stored training points.
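As a small illustration of the sensitivity to k, the following sketch cross-validates KNeighborsRegressor for several values of k. The synthetic data and the chosen k values here are assumptions for demonstration, separate from the example below:

# Illustrative sketch: how the choice of k affects cross-validated fit
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.RandomState(0)
X_demo = rng.rand(100, 1)
y_demo = np.sin(4 * X_demo).ravel() + 0.1 * rng.randn(100)

for k in (1, 5, 25):
    model = KNeighborsRegressor(n_neighbors=k)
    # Mean 5-fold cross-validated R^2: very small k tends to chase noise,
    # very large k tends to oversmooth the signal
    score = cross_val_score(model, X_demo, y_demo, cv=5).mean()
    print(f'k={k:2d}  mean R^2 = {score:.3f}')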

Python Source Code:

# Import necessary libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error

# Generate synthetic data
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the k-Nearest Neighbors (KNN) model
model = KNeighborsRegressor(n_neighbors=5)
model.fit(X_train, y_train)

# Make predictions on the test set
y_pred = model.predict(X_test)

# Evaluate the model
mse = mean_squared_error(y_test, y_pred)
print(f'Mean Squared Error: {mse:.3f}')

# Plot the results
plt.scatter(X_test, y_test, color='black', label='Actual')
plt.scatter(X_test, y_pred, color='red', marker='x', label='Predicted')
plt.title('k-Nearest Neighbors (KNN) Example')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.show()
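The example fixes n_neighbors=5; in practice the best k is usually selected by cross-validation. A sketch of that tuning step (not part of the original example), reusing the X_train and y_train variables defined above:

# Tune k with a cross-validated grid search
from sklearn.model_selection import GridSearchCV

grid = GridSearchCV(KNeighborsRegressor(),
                    param_grid={'n_neighbors': list(range(1, 21))},
                    scoring='neg_mean_squared_error',
                    cv=5)
grid.fit(X_train, y_train.ravel())  # ravel() flattens y to the 1-D shape sklearn prefers
print(f"Best k: {grid.best_params_['n_neighbors']}")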

Explanation: