[Supervised Modeling] Q3.d: Can't create decision boundary



I'm having a difficult time plotting the decision boundary for this question.

d) Now use all features. How does this affect the confusion matrix? Draw the decision boundary (Since we cannot really visualize a 4-dimensional plot, use only 'Sepal.Length' and 'Sepal.Width' as X for the boundary function but use the predictor trained on all features)

I think this is the problematic part. I consistently receive this error:

ValueError: operands could not be broadcast together with shapes (10000,2) (4,) 

I believe this is because I am using a model that is trained on more than 2 dimensions (which is what the question specifies). This is what my code looks like:

    from sklearn.metrics import confusion_matrix
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB

    iris_no_setosa = iris[iris.Species != 'setosa']
    iris_train, iris_test = train_test_split(iris_no_setosa, test_size=0.3)

    # Train on all four features, then predict on the test set
    all_feat_nb = GaussianNB()
    predictions = all_feat_nb.fit(iris_train.drop(columns='Species'), iris_train.Species).predict(iris_test.drop(columns='Species'))

    # Confusion matrix
    cm = confusion_matrix(y_pred=predictions, y_true=iris_test.Species)
    helper_functions.plot_confusion_matrix(cm, iris_test.Species.unique())

    # Drop extra features so decision boundary can work
    iris_train = iris_train[['Sepal.Width', 'Sepal.Length', 'Species']]
    iris_test = iris_test[['Sepal.Width', 'Sepal.Length', 'Species']]

    # Decision boundary -- this is where the problem arises:
    # X now has 2 columns, but all_feat_nb was fitted on 4
    helper_functions.decision_boundary(X=iris_test.drop(columns='Species'), Y=iris_test.Species, model=all_feat_nb)

The problem occurs only when I plot the decision boundary; the confusion matrix works fine.
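For anyone who wants to reproduce the mismatch without the course helpers, here is a minimal sketch. It assumes that `decision_boundary` evaluates the model on a 100 x 100 grid over the two plotted features (the 10000 in the error message suggests this, but I have not seen the helper's source), and it uses scikit-learn's bundled iris data rather than the course data frame.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

# Bundled iris data; class 0 is setosa, which the exercise drops
X, y = load_iris(return_X_y=True)
mask = y != 0
X, y = X[mask], y[mask]

# Model trained on all 4 features
model = GaussianNB().fit(X, y)

# Assumed stand-in for what the plotting helper does internally:
# a 100 x 100 grid over two features gives 10000 rows and 2 columns
xx, yy = np.meshgrid(np.linspace(4, 8, 100), np.linspace(2, 4.5, 100))
grid = np.c_[xx.ravel(), yy.ravel()]

try:
    model.predict(grid)  # 2 columns in, but the model expects 4
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(grid.shape, model.theta_.shape)
print(error_message)
```

The exact wording of the message depends on the scikit-learn version, but in either case it is a ValueError caused by passing 2 columns to a model fitted on 4.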

Does anyone else understand this?

 

modified 6 months ago by Kim B • written 6 months ago by Alex Reibman

1 Answer


6 months ago by Kim B

Hi,

I got the same error, and you're right: it happens because we are using a model that was trained on more than 2 dimensions. To avoid it, I think we would have to train the model on only 2 dimensions, but that is not what we were asked to do. Here is my code; the decision boundary call is commented out because it fails:

    from sklearn.metrics import confusion_matrix
    from sklearn.naive_bayes import GaussianNB

    # Training data (all four features)
    X_train = train.drop(columns='Species')

    # Target value
    y_train = train.Species

    # Fit the training data
    gnb = GaussianNB()
    gnb.fit(X_train, y_train)

    # Make a prediction on the test set
    X_test = test.drop(columns='Species')
    y_test = test.Species
    p_test = gnb.predict(X_test)

    # Confusion matrix
    cm = confusion_matrix(y_test, p_test)

    # Plot the confusion matrix
    classes = test.Species.unique()
    plot_confusion_matrix(cm, classes)

    # Decision boundary (Series has no reshape, so go through the NumPy array)
    y_test = y_test.values.reshape(-1, 1)
    # Fails: gnb was fitted on 4 features but only 2 are passed here
    #decision_boundary(X_test[['Sepal.Length','Sepal.Width']],y_test,gnb)
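One possible workaround, which is only a guess at what the question intends, is to keep the model trained on all four features and wrap it so that it accepts two columns: the two petal features are filled in with fixed values (here their training means) before calling `predict`. The `PaddedModel` class and its parameter names are hypothetical, the sketch uses scikit-learn's bundled iris data instead of the course data frame, and it assumes the plotting helper only needs an object with a `predict` method.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB


class PaddedModel:
    """Hypothetical wrapper: predict from a feature subset with a full-feature model."""

    def __init__(self, model, fill_values, plot_idx):
        self.model = model
        # One fill value per original feature (e.g. the training means)
        self.fill_values = np.asarray(fill_values, dtype=float)
        # Column positions of the features that are actually plotted
        self.plot_idx = list(plot_idx)

    def predict(self, X_2d):
        X_2d = np.asarray(X_2d, dtype=float)
        # Start from rows of fill values, then overwrite the plotted columns
        X_full = np.tile(self.fill_values, (X_2d.shape[0], 1))
        X_full[:, self.plot_idx] = X_2d
        return self.model.predict(X_full)


# Bundled iris data; class 0 is setosa, which the exercise drops
X, y = load_iris(return_X_y=True)
mask = y != 0
X, y = X[mask], y[mask]

gnb = GaussianNB().fit(X, y)  # still trained on all 4 features

# In load_iris, columns 0 and 1 are sepal length and sepal width
wrapped = PaddedModel(gnb, fill_values=X.mean(axis=0), plot_idx=[0, 1])

# A 100 x 100 grid over the two sepal features, as a plotting helper might build
xx, yy = np.meshgrid(np.linspace(4, 8, 100), np.linspace(2, 4.5, 100))
grid = np.c_[xx.ravel(), yy.ravel()]
grid_pred = wrapped.predict(grid)
print(grid_pred.shape)
```

If the helper accepts it, `wrapped` would be passed in place of the original model. Keep in mind that the resulting boundary is only a 2-D slice of the 4-D decision surface with the petal features held at their means, so some test points can appear on the "wrong" side even when the full model classifies them correctly.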
written 6 months ago by Kim B

I spoke with some others about this, and they seem to agree. There must be a mistake with the question.

written 6 months ago by Alex Reibman  