Kernel Regression using Python

Brian H. Russell

In Lab 23, which is a continuation of Lab 22, I will discuss kernel regression and implement the procedure using Python.

Lab 22 discussed polynomial curve fitting using a simple 3-point problem and a more interesting 10-point noise-corrupted sinusoid. I showed that, for N points and a polynomial of order p, if N = p+1 we get an exact fit, if N is larger than p+1 we underfit the points, and if N is smaller than p+1 we overfit the points. I then introduced two regression solutions, primal and dual, where the primal solution involved inverting a (p+1) x (p+1) matrix and the dual solution involved inverting an N x N matrix with appropriate pre-whitening.
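To make that distinction concrete, the short Python (NumPy) sketch below compares the two solutions on a small noisy-sinusoid example; the data, polynomial order, and pre-whitening value shown here are illustrative assumptions rather than the values used in the labs.

import numpy as np

# A minimal sketch of the primal and dual polynomial-regression solutions.
# The data, polynomial order, and pre-whitening value are assumed for illustration.
np.random.seed(0)
N, p, lam = 10, 3, 1.0e-3                    # points, polynomial order, pre-whitening
x = np.linspace(0.0, 1.0, N)
y = np.sin(2.0 * np.pi * x) + 0.1 * np.random.randn(N)

A = np.vander(x, p + 1, increasing=True)     # N x (p+1) design matrix

# Primal solution: invert a (p+1) x (p+1) matrix.
w_primal = np.linalg.solve(A.T @ A + lam * np.eye(p + 1), A.T @ y)

# Dual solution: invert an N x N matrix, with the same pre-whitening.
alpha = np.linalg.solve(A @ A.T + lam * np.eye(N), y)
w_dual = A.T @ alpha

print(np.allclose(w_primal, w_dual))         # True: the two solutions agree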

In Lab 23, I will first expand the linear solution to include nonlinear basis functions. This will allow us to fit our data using M basis functions, where M is smaller than N, the number of data points. Then, I will show how the dual solution leads us to nonlinear kernel regression methods, which extend the basis function approach, and that when M = N the kernel solution is identical to the basis function solution. Finally, I will use the 'kernel trick' to create kernel regression methods based on polynomial, Gaussian, and sigmoidal functions. Using the Gaussian function results in what is referred to as the Radial Basis Function Neural Network, or RBFN.
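As a sketch of how the kernel trick replaces the basis functions, the snippet below builds the N x N kernel matrix directly from the inputs and solves for the dual weights; the kernel definitions and their parameters (degree, width, slope) are assumptions chosen for illustration.

import numpy as np

# Kernel regression via the 'kernel trick': the N x N kernel matrix replaces
# the inner products of the basis functions. Kernel parameters are assumed values.
def poly_kernel(x1, x2, d=3):
    return (1.0 + np.outer(x1, x2)) ** d

def gauss_kernel(x1, x2, sigma=0.2):          # Gaussian kernel -> RBF neural network
    return np.exp(-(x1[:, None] - x2[None, :]) ** 2 / (2.0 * sigma ** 2))

def sigmoid_kernel(x1, x2, a=1.0, b=0.0):
    return np.tanh(a * np.outer(x1, x2) + b)

def kernel_regression(kernel, x, y, x_new, lam=1.0e-3):
    K = kernel(x, x)                                        # N x N kernel matrix
    alpha = np.linalg.solve(K + lam * np.eye(len(x)), y)    # dual weights
    return kernel(x_new, x) @ alpha                         # prediction at new points

np.random.seed(0)
x = np.linspace(0.0, 1.0, 10)
y = np.sin(2.0 * np.pi * x) + 0.1 * np.random.randn(10)
x_new = np.linspace(0.0, 1.0, 101)
y_rbf = kernel_regression(gauss_kernel, x, y, x_new)        # RBF-network prediction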

I will finish with a discussion of kernel regression using the Generalized Regression Neural Network, or GRNN, a similar method that does not involve inverting the kernel matrix. Instead, the output is created by applying the kernel function directly to the input and desired output values.
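As a sketch of the GRNN idea, and assuming the usual Gaussian kernel with an illustrative smoothing width, the prediction below is simply a kernel-weighted average of the desired outputs, with no matrix inversion required.

import numpy as np

# GRNN-style prediction: a kernel-weighted average of the training outputs.
# The smoothing width sigma is an assumed value for illustration.
def grnn_predict(x_train, y_train, x_new, sigma=0.1):
    d2 = (x_new[:, None] - x_train[None, :]) ** 2            # squared distances
    w = np.exp(-d2 / (2.0 * sigma ** 2))                      # Gaussian kernel weights
    return (w @ y_train) / w.sum(axis=1)                      # weighted average of outputs

np.random.seed(0)
x = np.linspace(0.0, 1.0, 10)
y = np.sin(2.0 * np.pi * x) + 0.1 * np.random.randn(10)
x_new = np.linspace(0.0, 1.0, 101)
y_grnn = grnn_predict(x, y, x_new)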

As in Lab 22, these methods will all be illustrated with Python examples applied to both a simple 3-point problem and a more complex 10-point sine wave with additive noise.