This is going to be a very short blog about KNN, a.k.a. K-Nearest Neighbors. Before we go in depth, let me give you a brief overview that sums up KNN: it literally means "you are whom you hang out with". Unlike other ML algorithms, which train by updating their weights, KNN only memorises the data. Let's say we want to predict the gender of a person based on two features, height and weight (a classic binary classification problem).
In logistic regression we can label females as 0 and males as 1, then compute a linear combination of the features:
$$ z = w_1 \cdot \text{height} + w_2 \cdot \text{weight} + b $$
Then pass it through the sigmoid:
$$ \hat{y} = \sigma(z) = \frac{1}{1 + e^{-z}} $$
Where $w_1$ and $w_2$ are the learned weights, $b$ is the bias, and $\hat{y}$ is the predicted probability that the person is male (class 1).
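As a quick illustration, here's what that forward pass looks like in code. The weights and inputs below are made up for the example (real ones would come from training), but the mechanics are exactly the two formulas above:

```python
import math

def predict_gender_logistic(height_cm, weight_kg, w1, w2, b):
    """Logistic-regression forward pass: linear combination, then sigmoid."""
    z = w1 * height_cm + w2 * weight_kg + b
    y_hat = 1 / (1 + math.exp(-z))  # probability of class 1 (male)
    return y_hat

# Hypothetical weights, purely to show the mechanics.
p_male = predict_gender_logistic(180, 80, w1=0.05, w2=0.04, b=-11)
print(p_male)  # a probability between 0 and 1
```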
Now let's say we have a new data point we want to classify, and we know its features (height and weight). In KNN, we simply:

1. Compute the distance from the new point to every point in the training data.
2. Take the K closest points, the "nearest neighbours".
3. Let those neighbours vote: the majority label wins.
For instance, if we set K=5 and 3 of the 5 nearest neighbours are female (0) while 2 are male (1):
$$ k \text{ closest neighbours} = [0, 0, 1, 0, 1] $$
We'd classify our new data point as female.
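The whole classifier fits in a few lines. Here's a minimal sketch with made-up training data (the heights and weights are illustrative, not real measurements), using Euclidean distance and a majority vote with K=5:

```python
import math
from collections import Counter

def knn_predict(train, new_point, k=5):
    """train: list of ((height, weight), label) pairs; label 0 = female, 1 = male."""
    # 1. Distance from the new point to every training point.
    dists = [(math.dist(features, new_point), label) for features, label in train]
    # 2. Keep the labels of the k closest points.
    neighbours = [label for _, label in sorted(dists)[:k]]
    # 3. Majority vote over those labels.
    return Counter(neighbours).most_common(1)[0][0], neighbours

# Hypothetical (height cm, weight kg) training data.
train = [
    ((160, 55), 0), ((165, 58), 0), ((158, 50), 0), ((170, 62), 0),
    ((178, 80), 1), ((183, 85), 1), ((175, 74), 1), ((168, 70), 1),
]
label, votes = knn_predict(train, (166, 63), k=5)
print(label, votes)  # → 0 [0, 0, 1, 0, 1]
```

Note that there is no training step at all: `train` is just stored, and all the work happens at prediction time, which is exactly the "memorise the data" behaviour described above.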