### [Supervised Modeling] Q3.d: Can't create decision boundary

19 months ago by

I'm having a difficult time plotting the decision boundary for this question.

d) Now use all features. How does this affect the confusion matrix? Draw the decision boundary (Since we cannot really visualize a 4-dimensional plot, use only 'Sepal.Length' and 'Sepal.Width' as X for the boundary function but use the predictor trained on all features)

I think this is the problematic part. I consistently receive this error:

ValueError: operands could not be broadcast together with shapes (10000,2) (4,)


I believe this is because the model is trained on all four features (which is what the question specifies), while the boundary function passes it only two. This is what my code looks like:

    from sklearn.metrics import confusion_matrix
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB

    # Keep only the two non-setosa species
    iris_no_setosa = iris[iris.Species != 'setosa']
    iris_train, iris_test = train_test_split(iris_no_setosa, test_size=0.3)

    # Fit on all four features, then predict on the test set
    all_feat_nb = GaussianNB()
    predictions = all_feat_nb.fit(iris_train.drop('Species', axis=1), iris_train.Species).predict(iris_test.drop('Species', axis=1))

    # Confusion matrix
    cm = confusion_matrix(y_pred=predictions, y_true=iris_test.Species)
    helper_functions.plot_confusion_matrix(cm, iris_test.Species.unique())

    # Drop extra features so decision boundary can work
    iris_train = iris_train[['Sepal.Width', 'Sepal.Length', 'Species']]
    iris_test = iris_test[['Sepal.Width', 'Sepal.Length', 'Species']]

    # Decision boundary -- this is where the problem arises
    helper_functions.decision_boundary(X=iris_test.drop('Species', axis=1), Y=iris_test.Species, model=all_feat_nb)


The problem occurs only when I plot the decision boundary; the confusion matrix works just fine.
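The shape mismatch in the error message can be reproduced in isolation. This is only a sketch with plain NumPy arrays: `grid` stands in for the 10000 two-feature points the boundary function presumably generates, and `theta` is a hypothetical stand-in for the four per-feature means a model fitted on all features would hold. Neither name comes from `helper_functions`.

```python
import numpy as np

# Hypothetical stand-ins: 10000 grid points with 2 features each,
# and one mean per feature for a model fitted on 4 features.
grid = np.zeros((10000, 2))
theta = np.zeros(4)

try:
    # Subtracting a length-4 vector from 2-column rows cannot be broadcast.
    grid - theta
    message = ""
except ValueError as err:
    message = str(err)

print(message)
```

The trailing dimensions (2 and 4) differ and neither is 1, so NumPy refuses to broadcast them, which matches the shapes reported in the traceback.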

Does anyone else understand this?

Community: ITC Fellows 16-17

18 months ago by

Hi,

I got the same error, and you're right: it happens because we are using a model that is trained on more than 2 dimensions. To solve it, I think we should train the model on only 2 dimensions, but that is not what we were asked to do.

    from sklearn.metrics import confusion_matrix
    from sklearn.naive_bayes import GaussianNB

    # Training data
    X_train = train.drop('Species', axis=1)

    # Target value
    y_train = train.Species

    # Fit the training data
    gnb = GaussianNB()
    gnb.fit(X_train, y_train)

    # Make a prediction on the test set
    X_test = test.drop('Species', axis=1)
    y_test = test.Species
    p_test = gnb.predict(X_test)

    # Confusion matrix
    cm = confusion_matrix(y_test, p_test)

    # Plot the confusion matrix
    classes = test.Species.unique()
    plot_confusion_matrix(cm, classes)

    # Decision boundary (call left commented out: gnb was fitted on all 4
    # features, so passing only 2 columns raises the ValueError above)
    y_test = y_test.values.reshape(y_test.size, 1)
    #decision_boundary(X_test[['Sepal.Length', 'Sepal.Width']], y_test, gnb)

I spoke with some others about this, and they seem to agree. There must be a mistake with the question.
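For reference, the two-feature workaround mentioned above could look roughly like the sketch below. It is not what the question asks for, and it makes several assumptions: scikit-learn's bundled iris data stands in for the course's `iris` DataFrame, the names `two_feat_nb`, `grid`, and `zz` are made up for illustration, and the 100 x 100 grid is only a guess at what `decision_boundary` does internally.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

# Assumption: the bundled iris data replaces the course's `iris` DataFrame.
data = load_iris()
X, y = data.data, data.target

# Drop setosa (class 0); keep sepal length and sepal width (columns 0 and 1).
mask = y != 0
X2, y2 = X[mask][:, :2], y[mask]

# Train on the same two features that will be plotted.
two_feat_nb = GaussianNB().fit(X2, y2)

# Build a 100 x 100 grid over the two features and classify every point.
xs = np.linspace(X2[:, 0].min() - 0.5, X2[:, 0].max() + 0.5, 100)
ys = np.linspace(X2[:, 1].min() - 0.5, X2[:, 1].max() + 0.5, 100)
xx, yy = np.meshgrid(xs, ys)
grid = np.c_[xx.ravel(), yy.ravel()]  # shape (10000, 2), matching the model
zz = two_feat_nb.predict(grid).reshape(xx.shape)
```

Because the grid and the model now agree on two features, the prediction goes through, and `zz` could be handed to a contour plot such as matplotlib's `contourf(xx, yy, zz)` to draw the boundary.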

written 18 months ago by Alex Reibman