First of all, what is PCA?

Principal Component Analysis (PCA) is a statistical procedure that summarizes a dataset by extracting the few directions, called principal components, that explain most of its variance.

Today, principal component analysis is one of the most popular multivariate statistical techniques, and it is often described as the mother method of multivariate data analysis (MVDA).

It is widely used in pattern recognition, signal processing, and statistical analysis to reduce dimensionality: in simple words, to identify and keep only the important factors that explain most of the data. This avoids processing unnecessary data.
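To make the idea of dimensionality reduction concrete, here is a minimal sketch using scikit-learn's PCA class. The synthetic data, the variable names, and the choice of two components are assumptions made for illustration only.

```python
import numpy as np
from sklearn.decomposition import PCA

# Assumed toy data: 200 samples with 5 features that are really driven
# by only 2 hidden factors, plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 5))

# Keep only the 2 directions that explain the most variance.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                      # 5 features reduced to 2
print(pca.explained_variance_ratio_.sum())  # share of variance retained
```

Because the five features here are built from two factors, two components should retain almost all of the variance. On real data the retained share depends on how correlated the features are.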

Now that we have a basic idea of what PCA is, let's understand what Kernel PCA is.

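Kernel PCA uses a kernel function, commonly the RBF (radial basis function) kernel, to implicitly map data that is not linearly separable into a higher-dimensional space where it becomes linearly separable, and then performs PCA in that space. So it performs better than standard PCA on non-linear data.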
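A quick way to see this is to compare standard PCA with Kernel PCA on two concentric circles, which no straight line can separate. The sketch below is only an illustration: the make_circles toy data, gamma=10, and the logistic regression used to measure separability are all assumptions, not values taken from a real project.

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import PCA, KernelPCA
from sklearn.linear_model import LogisticRegression

# Assumed toy data: an inner and an outer circle (two classes).
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)

# Standard PCA only rotates the data, so the circles stay concentric.
X_pca = PCA(n_components=2).fit_transform(X)

# Kernel PCA with an RBF kernel; gamma=10 is an assumed value that
# would normally be tuned.
X_kpca = KernelPCA(n_components=2, kernel="rbf", gamma=10).fit_transform(X)

# A linear classifier tells us how linearly separable each projection is.
lin_acc = LogisticRegression().fit(X_pca, y).score(X_pca, y)
kpca_acc = LogisticRegression().fit(X_kpca, y).score(X_kpca, y)

print("Linear PCA accuracy:", lin_acc)
print("Kernel PCA accuracy:", kpca_acc)
```

The linear classifier should do much better on the Kernel PCA projection than on the plain PCA projection, which is what "making the data separable" means in practice.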

Let's load our data, define X and y, split the dataset into training and test sets, and scale the features so that they are on comparable ranges. You can save this data pre-processing template, since we often need it before applying any model.
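The pre-processing steps just described can be sketched as follows. The dataset is not specified here, so scikit-learn's built-in wine dataset is used purely as a stand-in, and the 80/20 split and random_state=0 are likewise assumptions.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Stand-in dataset (assumption): replace with your own data loading step.
X, y = load_wine(return_X_y=True)

# Hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit the scaler on the training set only, then apply the same
# transformation to the test set to avoid leaking test information.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

print(X_train.shape, X_test.shape)
```

Scaling matters for PCA and Kernel PCA in particular, because both are driven by distances and variances, so a feature with a large numeric range would otherwise dominate the components.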