[Supervised Modeling] SVM linear vs. rbf kernels
In Q5 of the supervised modeling exercise in Hive we are using:
clf_svc_ker = SVC(kernel='rbf')     # rbf kernel
clf_svc_lin = SVC(kernel='linear')  # linear kernel
Is anybody able to help me understand what the main differences are between those two kernels? How does choosing one or the other affect the performance of the model?
I will try and answer it based on what I think your problem is.
An SVM tries to find a hyperplane that separates the classes, so that each class ends up on its own side of it. A lot of the examples we see in lectures are simple data sets in R^2 where you can just draw a straight line between the two classes. However, not all data is this clean. Is there a function we can apply to the data to make it linearly separable? That is the job of the kernel.
A linear kernel is just the ordinary dot product between two points, so the data stays in its original space and the decision boundary is a straight line (or a flat hyperplane in higher dimensions). It won't help with data that isn't already roughly linearly separable. A polynomial or RBF kernel behaves as if the data had been mapped to a higher-dimensional space, where a separating hyperplane is more likely to exist; seen from the original space, that hyperplane looks like a curved boundary. The full maths of these functions is beyond the scope of my knowledge, but that's the idea. If you ever see an SVM decision boundary that is curved, you know a non-linear kernel was used.
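To make this concrete, here is a minimal sketch that fits both kernels on a toy data set where one class sits in a ring around the other, so no straight line can separate them. The data set (scikit-learn's make_circles), the sample size, the noise level and the helper name compare_kernels are my own choices for illustration; none of them come from the exercise.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def compare_kernels(random_state=0):
    """Fit a linear and an RBF SVC on ring-shaped data and return test accuracies."""
    # Toy data (an assumption for this sketch): an inner blob surrounded by a ring.
    X, y = make_circles(n_samples=400, noise=0.05, factor=0.3,
                        random_state=random_state)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=random_state)

    scores = {}
    for kernel in ("linear", "rbf"):
        clf = SVC(kernel=kernel)  # all other hyperparameters left at their defaults
        clf.fit(X_train, y_train)
        scores[kernel] = clf.score(X_test, y_test)  # accuracy on held-out data
    return scores


scores = compare_kernels()
print(scores)
```

On data like this the linear kernel should do little better than guessing, while the RBF kernel should separate the two classes almost perfectly. On data that is already close to linearly separable, the two would be expected to score about the same.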
Check out this really short YouTube video, which consolidated the idea for me: https://www.youtube.com/watch?v=3liCbRZPrZA
In terms of the actual performance of the model:
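- Linear kernels are 'less dangerous': they don't transform the data, so the model has less freedom to overfit, and they are usually faster to train and easier to interpret. The flip side is that they will underfit if the true boundary between the classes is curved.
- Polynomial/RBF kernels are useful because they can separate data that a straight line can't, but the danger is that the extra flexibility lets the model fit noise in the training data. How much this happens depends on the hyperparameters (C and gamma for RBF, plus degree for polynomial), so they need tuning, e.g. with cross-validation.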
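One way to see the 'danger' is to compare training and test accuracy. The sketch below is only an illustration under my own assumptions: the noisy make_moons data, the deliberately extreme gamma=1000 and the helper name train_test_scores are not from the exercise.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def train_test_scores(random_state=0):
    """Return train/test accuracy for a linear SVC and two RBF SVCs."""
    # Noisy two-class toy data (an assumption for this sketch).
    X, y = make_moons(n_samples=400, noise=0.3, random_state=random_state)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=random_state)

    models = {
        "linear": SVC(kernel="linear"),
        "rbf, gamma=scale": SVC(kernel="rbf", gamma="scale"),  # sklearn default
        "rbf, gamma=1000": SVC(kernel="rbf", gamma=1000),      # deliberately extreme
    }
    results = {}
    for name, clf in models.items():
        clf.fit(X_train, y_train)
        results[name] = {
            "train": clf.score(X_train, y_train),
            "test": clf.score(X_test, y_test),
        }
    return results


results = train_test_scores()
for name, acc in results.items():
    print(f"{name:18s} train={acc['train']:.2f} test={acc['test']:.2f}")
```

With a very large gamma each training point only influences its immediate neighbourhood, so the model can memorise the training set while doing noticeably worse on unseen data. A big gap between the train and test scores is the sign to look for.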