[Supervised Modeling] Q3.d: Can't create decision boundary
Having a difficult time plotting the decision boundary on this question.
d) Now use all features. How does this affect the confusion matrix? Draw the decision boundary (Since we cannot really visualize a 4-dimensional plot, use only 'Sepal.Length' and 'Sepal.Width' as X for the boundary function but use the predictor trained on all features)
I think this is the problematic part. I consistently receive this error:
ValueError: operands could not be broadcast together with shapes (10000,2) (4,)
I believe this is because I am using a model that is trained on more than 2 dimensions (which is what the question specifies). This is what my code looks like:
    iris_no_setosa = iris[iris.Species != 'setosa']
    iris_train, iris_test = train_test_split(iris_no_setosa, test_size=0.3)
    all_feat_nb = GaussianNB()
    predictions = all_feat_nb.fit(iris_train.drop('Species', axis=1), iris_train.Species).predict(iris_test.drop('Species', axis=1))

    # Confusion Matrix
    cm = confusion_matrix(y_pred=predictions, y_true=iris_test.Species)
    helper_functions.plot_confusion_matrix(cm, iris_test.Species.unique())

    # Drop extra features so decision boundary can work
    iris_train = iris_train[['Sepal.Width', 'Sepal.Length', 'Species']]
    iris_test = iris_test[['Sepal.Width', 'Sepal.Length', 'Species']]

    # Decision Boundary -- This is where the problem arises
    helper_functions.decision_boundary(X=iris_test.drop('Species', axis=1), Y=iris_test.Species, model=all_feat_nb)
The problem occurs specifically when I plot the decision boundary; the confusion matrix works just fine.
Anyone else understand this?
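For reference, the shape mismatch in the error message can be reproduced with NumPy alone. This is only a minimal sketch: the array names are illustrative, and it assumes the fitted model keeps one mean per training feature (four here) and subtracts them from the two-column grid that the boundary function builds. Newer scikit-learn versions may reject the input earlier with a different message.

```python
import numpy as np

# Stand-in for the plotting grid: 10000 points, 2 features each
grid = np.zeros((10000, 2))

# Stand-in for the per-feature means of a model trained on 4 features
feature_means = np.zeros(4)

try:
    # A trailing dimension of 2 cannot be broadcast against 4
    grid - feature_means
    message = None
except ValueError as err:
    message = str(err)

print(message)
```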
I got the same error, and you're right: it happens because we are using a model trained on more than 2 dimensions. To solve it, I think we should train the model on 2 dimensions, but that is not what we were asked to do ...
    # Training data
    X_train = train.drop('Species', axis=1)
    # Target value
    y_train = train.Species

    # Fit the training data
    gnb = GaussianNB()
    gnb.fit(X_train, y_train)

    # Make a prediction on the test set
    X_test = test.drop('Species', axis=1)
    y_test = test.Species
    p_test = gnb.predict(X_test)

    # Confusion matrix
    cm = confusion_matrix(y_test, p_test)

    # Plot the confusion matrix
    classes = test.Species.unique()
    plot_confusion_matrix(cm, classes)

    # Decision boundary
    y_test = y_test.reshape(y_test.size, 1)
    #decision_boundary(X_test[['Sepal.Length','Sepal.Width']],y_test,gnb)
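One possible way to keep the model trained on all four features and still get a 2-D picture is to build the grid over the two plotted features and fill the remaining columns with their training means. The sketch below is an assumption about how this could be done, not the course's intended solution: `grid_predictions` is a hypothetical helper, it does not use `helper_functions.decision_boundary` (whose signature is not shown in this thread), and it loads iris from scikit-learn rather than from the course data frame.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB


def grid_predictions(model, X_train, plot_cols=(0, 1), steps=100):
    """Predict over a 2-D grid with a model trained on all columns of X_train.

    Hypothetical helper: features outside plot_cols are held at their
    training mean, so the result is one slice through the feature space.
    """
    X_train = np.asarray(X_train, dtype=float)
    i, j = plot_cols
    xs = np.linspace(X_train[:, i].min() - 0.5, X_train[:, i].max() + 0.5, steps)
    ys = np.linspace(X_train[:, j].min() - 0.5, X_train[:, j].max() + 0.5, steps)
    xx, yy = np.meshgrid(xs, ys)

    # Start every grid point at the training mean of each feature ...
    grid = np.tile(X_train.mean(axis=0), (xx.size, 1))
    # ... then overwrite only the two plotted features
    grid[:, i] = xx.ravel()
    grid[:, j] = yy.ravel()

    return xx, yy, model.predict(grid).reshape(xx.shape)


iris = load_iris()
mask = iris.target != 0            # drop setosa (class 0), as in the thread
X, y = iris.data[mask], iris.target[mask]

model = GaussianNB().fit(X, y)     # trained on all four features
# Columns 0 and 1 are sepal length and sepal width
xx, yy, zz = grid_predictions(model, X, plot_cols=(0, 1))
print(xx.shape, zz.shape)
```

The returned arrays can be passed to something like matplotlib's `contourf(xx, yy, zz)` and overlaid with a scatter of the test points. Note that the picture shows the boundary only for average petal measurements, so test points with unusual petal values can appear on the "wrong" side even when the model classifies them correctly.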