K Nearest Neighbor Algorithm: Fundamentals and Applications
Ebook · 124 pages · 1 hour


About this ebook

What Is K Nearest Neighbor Algorithm


The k-nearest neighbors technique, also known as k-NN, is a non-parametric supervised learning method initially created in 1951 by Evelyn Fix and Joseph Hodges in the field of statistics and later expanded by Thomas Cover. It has applications in both regression and classification. In both cases, the input consists of the k training examples in a data set that are closest to the query point. The output depends on whether k-NN is used for classification or regression:

The output of k-NN classification is a class membership. An object is classified by a plurality vote of its neighbors and assigned to the class most common among its k nearest neighbors (where k is a positive integer, typically small). If k equals one, the object is simply assigned to the class of its single nearest neighbor.

The output of k-NN regression is the value of a property of the object: the average of the values of its k nearest neighbors. If k equals one, the output is simply the value of the single nearest neighbor.


How You Will Benefit


(I) Insights and validations about the following topics:


Chapter 1: K-nearest neighbors algorithm


Chapter 2: Supervised learning


Chapter 3: Pattern recognition


Chapter 4: Curse of dimensionality


Chapter 5: Nearest neighbor search


Chapter 6: Cluster analysis


Chapter 7: Kernel method


Chapter 8: Large margin nearest neighbor


Chapter 9: Structured kNN


Chapter 10: Weak supervision


(II) Answers to the public's top questions about the k nearest neighbor algorithm.


(III) Real-world examples of the use of the k nearest neighbor algorithm in many fields.


(IV) 17 appendices that briefly explain 266 emerging technologies in each industry, giving a 360-degree understanding of the k nearest neighbor algorithm's technologies.


Who This Book Is For


Professionals, undergraduate and graduate students, enthusiasts, hobbyists, and anyone who wants to go beyond basic knowledge of the k nearest neighbor algorithm.

Language: English
Release date: Jun 23, 2023


    Book preview

    K Nearest Neighbor Algorithm - Fouad Sabry

    Chapter 1: k-nearest neighbors algorithm

    In statistics, the k-nearest neighbors algorithm (k-NN) is a non-parametric supervised learning technique created by Evelyn Fix and Joseph Hodges in 1951. It is used for both regression and classification. In both cases, the input consists of the k closest training examples in a data set. The output depends on whether k-NN is applied for classification or regression:

    The result of k-NN classification is a class membership. An object is classified by a plurality vote of its neighbors and assigned to the class most common among its k nearest neighbors (k is a positive integer, typically small). If k = 1, the object is simply assigned to the class of its single nearest neighbor.

    The output of k-NN regression is the property value of the object: the average of the values of its k nearest neighbors. If k = 1, the output is simply the value of the single nearest neighbor.
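
    As a minimal illustration of the voting and averaging just described, here is a from-scratch sketch in Python (the array names and the use of Euclidean distance are our assumptions, not code from the book):

        import numpy as np
        from collections import Counter

        def knn_classify(X_train, y_train, x_query, k=3):
            """Assign x_query to the class most common among its k nearest neighbors."""
            dists = np.linalg.norm(X_train - x_query, axis=1)   # distance to every training point
            nearest = np.argsort(dists)[:k]                     # indices of the k closest points
            return Counter(y_train[nearest]).most_common(1)[0][0]

        def knn_regress(X_train, y_train, x_query, k=3):
            """Predict the average property value of the k nearest neighbors."""
            dists = np.linalg.norm(X_train - x_query, axis=1)
            nearest = np.argsort(dists)[:k]
            return y_train[nearest].mean()

        # Tiny example: with k = 1 the query simply inherits its single neighbor's label.
        X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [1.1, 0.9]])
        y_class = np.array([0, 0, 1, 1])
        print(knn_classify(X_train, y_class, np.array([0.9, 1.0]), k=3))  # -> 1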

    With k-NN, the function is only approximated locally and all computation is deferred until the function is evaluated. Since the algorithm relies on distance for classification, normalizing the training data can significantly improve its accuracy if the features represent different physical units or differ widely in scale.
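
    As a sketch of that precaution, one might z-score each feature using the training statistics only (one reasonable normalization among several; the function name is ours):

        import numpy as np

        def standardize(X_train, X_query):
            """Scale each feature to zero mean and unit variance so that no
            feature with large units dominates the distance computation."""
            mu = X_train.mean(axis=0)
            sigma = X_train.std(axis=0)
            sigma[sigma == 0] = 1.0                    # guard against constant features
            return (X_train - mu) / sigma, (X_query - mu) / sigma

        # Example: height in metres next to income in dollars; without scaling,
        # income differences would swamp the Euclidean distance entirely.
        X_train = np.array([[1.60, 30000.0], [1.80, 32000.0], [1.75, 90000.0]])
        X_query = np.array([[1.70, 31000.0]])
        X_train_s, X_query_s = standardize(X_train, X_query)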

    In both k-NN classification and regression, the neighbors are taken from a set of objects for which the class or object property value is known. This set can be thought of as the algorithm's training set, although no explicit training step is required.

    The k-NN algorithm has the feature of being sensitive to the local structure of the data.

    Suppose we have pairs $(X_1, Y_1), (X_2, Y_2), \dots, (X_n, Y_n)$ taking values in $\mathbb{R}^d \times \{1, 2\}$, where $Y$ is the class label of $X$, so that $X \mid Y = r \sim P_r$ for $r = 1, 2$ (and probability distributions $P_r$).

    Given some norm $\|\cdot\|$ on $\mathbb{R}^d$ and a point $x \in \mathbb{R}^d$, let $(X_{(1)}, Y_{(1)}), \dots, (X_{(n)}, Y_{(n)})$ be a reordering of the training data such that $\|X_{(1)} - x\| \leq \dots \leq \|X_{(n)} - x\|$.
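
    In code, this reordering is simply a sort of the training pairs by distance to $x$; a brief sketch assuming NumPy arrays and the Euclidean norm:

        import numpy as np

        def reorder_by_distance(X, Y, x):
            """Return (X_(1), Y_(1)), ..., (X_(n), Y_(n)) ordered so that
            ||X_(1) - x|| <= ... <= ||X_(n) - x||."""
            order = np.argsort(np.linalg.norm(X - x, axis=1))
            return X[order], Y[order]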

    The training examples are vectors with class labels in a multidimensional feature space. The algorithm's training phase consists solely of storing the training samples' feature vectors and class labels.

    In the classification phase, k is a user-defined constant, and an unlabeled vector (a query or test point) is assigned the label that is most frequent among the k training samples nearest to it.

    A commonly used distance metric for continuous variables is Euclidean distance. For discrete variables, such as in text classification, another metric can be used, such as the overlap metric (or Hamming distance). In the context of gene expression microarray data, for example, k-NN has been employed with correlation coefficients, such as Pearson and Spearman, as the metric. Often, the classification accuracy of k-NN can be improved significantly if the distance metric is learned with specialized algorithms such as Large Margin Nearest Neighbor or Neighbourhood Components Analysis.
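
    To illustrate that the metric is a pluggable choice rather than part of the algorithm itself, here are sketches of the three kinds of distance mentioned above (the function names are ours):

        import numpy as np

        def euclidean(u, v):
            """Typical choice for continuous features."""
            return np.linalg.norm(u - v)

        def hamming(u, v):
            """Overlap metric for discrete features: number of positions that differ."""
            return int(np.sum(u != v))

        def pearson_distance(u, v):
            """1 minus the Pearson correlation, as sometimes used for expression profiles."""
            return 1.0 - np.corrcoef(u, v)[0, 1]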

    A drawback of the basic majority voting classification occurs when the class distribution is skewed: examples of a more frequent class tend to dominate the prediction of the new example, because they tend to be common among the k nearest neighbors simply due to their large number. One way to overcome this problem is to weight the classification, taking into account the distance from the test point to each of its k nearest neighbors: the class (or value, in regression problems) of each of the k nearest points is multiplied by a weight proportional to the inverse of its distance from the test point. Another way to overcome skew is by abstraction in the data representation. For example, in a self-organizing map (SOM), each node is a representative (a center) of a cluster of similar points, regardless of their density in the original training data; k-NN can then be applied to the SOM.
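
    A sketch of the inverse-distance weighting described above (the small epsilon that guards against a zero distance is our addition):

        import numpy as np

        def knn_weighted_classify(X_train, y_train, x_query, k=5, eps=1e-12):
            """Vote with weights proportional to 1/distance, so that closer
            neighbors count for more than merely numerous ones."""
            dists = np.linalg.norm(X_train - x_query, axis=1)
            nearest = np.argsort(dists)[:k]
            weights = 1.0 / (dists[nearest] + eps)
            scores = {}
            for idx, w in zip(nearest, weights):
                scores[y_train[idx]] = scores.get(y_train[idx], 0.0) + w
            return max(scores, key=scores.get)     # class with the largest total weight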

    The best choice of k depends on the data; generally, larger values of k reduce the effect of noise on the classification but make the boundaries between classes less distinct. A good k can be selected by various heuristic techniques (see hyperparameter optimization). The special case where the class is predicted to be the class of the closest training sample (i.e. when k = 1) is called the nearest neighbor algorithm.
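
    One common heuristic is to choose k by cross-validation; a sketch using scikit-learn, assuming labelled arrays X and y are already defined:

        from sklearn.model_selection import GridSearchCV
        from sklearn.neighbors import KNeighborsClassifier

        # Try a range of small k values and keep the one with the best
        # 5-fold cross-validated accuracy.
        param_grid = {"n_neighbors": [1, 3, 5, 7, 9, 11, 15, 21]}
        search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
        search.fit(X, y)
        print(search.best_params_)   # e.g. {'n_neighbors': 7}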

    The accuracy of the k-NN algorithm can be severely degraded by the presence of noisy or irrelevant features, or if the feature scales are not consistent with their importance. Much research effort has gone into selecting or scaling features to improve classification. A particularly popular approach is the use of evolutionary algorithms to optimize feature scaling.

    The most intuitive nearest neighbour classifier is the one nearest neighbour classifier, which assigns a point $x$ to the class of its closest neighbour in the feature space, that is $C_n^{1nn}(x) = Y_{(1)}$.

    As the size of the training data set approaches infinity, the one nearest neighbour classifier guarantees an error rate of no worse than twice the Bayes error rate (the minimum achievable error rate given the distribution of the data).

    The k-nearest neighbour classifier can be viewed as assigning the k nearest neighbours a weight 1/k and all others 0 weight.

    This can be generalised to weighted nearest neighbour classifiers, where the $i$th nearest neighbour is assigned a weight $w_{ni}$ with $\sum_{i=1}^{n} w_{ni} = 1$.

    An analogous result on the strong consistency of weighted nearest neighbour classifiers also holds.

    Let $C_n^{wnn}$ denote the weighted nearest neighbour classifier with weights $\{w_{ni}\}_{i=1}^{n}$.

    Subject to regularity conditions (which in asymptotic theory are assumptions needed to distinguish parameters by certain criteria), the excess risk has the following asymptotic expansion on the class distributions:

    $$\mathcal{R}_{\mathcal{R}}(C_n^{wnn}) - \mathcal{R}_{\mathcal{R}}(C^{\text{Bayes}}) = \left(B_1 s_n^2 + B_2 t_n^2\right)\{1 + o(1)\},$$

    for constants $B_1$ and $B_2$, where $s_n^2 = \sum_{i=1}^{n} w_{ni}^2$ and $t_n = n^{-2/d} \sum_{i=1}^{n} w_{ni}\left\{i^{1+2/d} - (i-1)^{1+2/d}\right\}$.

    The optimal weighting scheme $\{w_{ni}^{*}\}_{i=1}^{n}$, which balances the two terms in the display above, is given as follows: set $k^{*} = \lfloor B n^{\frac{4}{d+4}} \rfloor$,

    $$w_{ni}^{*} = \frac{1}{k^{*}}\left[1 + \frac{d}{2} - \frac{d}{2{k^{*}}^{2/d}}\left\{i^{1+2/d} - (i-1)^{1+2/d}\right\}\right]$$

    for $i = 1, 2, \dots, k^{*}$ and $w_{ni}^{*} = 0$ for $i = k^{*}+1, \dots, n$.

    With optimal weights the dominant term in the asymptotic expansion of the excess risk is $\mathcal{O}\left(n^{-\frac{4}{d+4}}\right)$.
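
    To make the scheme concrete, the following sketch evaluates these optimal weights numerically; the constant B is left unspecified above, so it is set to 1 here purely for illustration:

        import numpy as np

        def optimal_weights(n, d, B=1.0):
            """Evaluate w*_ni for i = 1..n; weights beyond k* are zero and the
            whole vector sums to 1 (up to floating-point error)."""
            k_star = int(np.floor(B * n ** (4.0 / (d + 4))))
            i = np.arange(1, k_star + 1, dtype=float)
            w = np.zeros(n)
            w[:k_star] = (1.0 / k_star) * (
                1 + d / 2.0
                - (d / (2.0 * k_star ** (2.0 / d))) * (i ** (1 + 2.0 / d) - (i - 1) ** (1 + 2.0 / d))
            )
            return w

        w = optimal_weights(n=1000, d=2)   # k* works out to roughly 100
        print(w[:3], w.sum())              # decreasing weights, summing to 1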

    Similar results hold when using a bagged nearest neighbour classifier.

    k-NN is a special case of a variable-bandwidth, kernel density balloon estimator with a uniform kernel.

    The naive version of the algorithm is easy to implement by computing the distances from the test example to all stored examples, but it is computationally intensive for large training sets. Using an approximate nearest neighbor search algorithm makes k-NN computationally tractable even for large data sets. Many nearest neighbor search algorithms have been proposed over the years; these generally seek to reduce the number of distance evaluations actually performed.
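
    A brief sketch contrasting the naive scan with a tree-based index, using scikit-learn's exact neighbor search (approximate methods such as locality-sensitive hashing go further still, but are not shown here):

        import numpy as np
        from sklearn.neighbors import NearestNeighbors

        rng = np.random.default_rng(0)
        X = rng.normal(size=(100_000, 8))      # a fairly large training set
        query = rng.normal(size=(1, 8))

        # Naive approach: compare the query against every stored example.
        brute = NearestNeighbors(n_neighbors=5, algorithm="brute").fit(X)

        # k-d tree index: prunes most of the explicit distance evaluations.
        tree = NearestNeighbors(n_neighbors=5, algorithm="kd_tree").fit(X)

        dist_b, idx_b = brute.kneighbors(query)
        dist_t, idx_t = tree.kneighbors(query)
        assert np.array_equal(idx_b, idx_t)    # both are exact; they differ only in cost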

    k-NN has strong consistency results. As the amount of data approaches infinity, the two-class k-NN algorithm is guaranteed to yield an error rate no worse than twice the Bayes error rate (the minimum achievable error rate given the distribution of the data).

    For multi-class k-NN classification, Cover and Hart (1967) prove the following upper bound on the error rate:

    $$R^{*} \ \leq \ R_{k\mathrm{NN}} \ \leq \ R^{*}\left(2 - \frac{M R^{*}}{M-1}\right),$$

    where $R^{*}$ is the Bayes error rate (the minimal error rate possible), $R_{k\mathrm{NN}}$ is the k-NN error rate, and $M$ is the number of classes in the problem.

    For $M = 2$, as the Bayes error rate $R^{*}$ approaches zero, this upper bound reduces to not more than twice the Bayes error rate.
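
    As a worked illustration with numbers of our own choosing: for $M = 2$ classes and a Bayes error rate of $R^{*} = 0.1$, the bound gives

    $$0.1 \ \leq \ R_{k\mathrm{NN}} \ \leq \ 0.1\left(2 - \frac{2 \cdot 0.1}{2 - 1}\right) = 0.18,$$

    so the factor relative to $R^{*}$ is 1.8 here and tends to 2 as $R^{*}$ shrinks toward zero.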

    Various results on the error rate of k nearest neighbour classifiers have been obtained.
