
29.2: Calculating Accuracy in Python


    Calculating your classifier’s accuracy is actually a snap. Once your classifier’s code is in a function, you just need a loop.

    Return to the videogame example from last chapter, and the decision tree classifier we wrote on p. 287. We’ll use a counter variable, initialized to zero, that will keep track of our number of correct predictions. We’ll then loop through each row of the test set, feeding that row’s features to the classifier function. If the return value from the classifier matches the value of that row’s target, ka-ching! We increment our counter to increase our score. If it doesn’t, we don’t. At the end, we divide by the number of test points to get our percentage. Simple!
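
    The snippets below call the predict() function from p. 287. For readers following along without the book open, here is a minimal stand-in sketch; the branch structure and category values are hypothetical (the real tree may differ), but like the original it takes the three features and returns "Yes" or "No":

    # Hypothetical stand-in for the decision tree classifier from p. 287.
    # The branches and category values here are made up for illustration;
    # the actual tree from the book may differ.
    def predict(major, age, gender):
        if major == "CPSC":
            if age == "young":
                return "Yes"
            else:
                return "Yes" if gender == "M" else "No"
        else:
            return "No"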

    Code \(\PageIndex{1}\) (Python):

    count = 0

    for row in students_test.itertuples():
        if predict(row.Major, row.Age, row.Gender) == row.VG:
            count += 1

    accuracy = count / len(students_test) * 100
    print("Our accuracy on the test set was {}%.".format(accuracy))

    | Our accuracy on the test set was 87.5%.
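
    As an aside, pandas can do the same bookkeeping without an explicit loop: apply the classifier to every row, compare the resulting Series against the true labels, and take the mean of the booleans (True counts as 1, False as 0). A quick sketch of this alternative, assuming the same students_test DataFrame:

    # Apply the classifier to each row, compare against the true labels,
    # and average the resulting booleans to get the fraction correct.
    predictions = students_test.apply(
        lambda row: predict(row.Major, row.Age, row.Gender), axis=1)
    accuracy = (predictions == students_test.VG).mean() * 100
    print("Our accuracy on the test set was {}%.".format(accuracy))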

    If we want more detail, we could print a message for each prediction, and flag the incorrect ones for easy identification:

    Code \(\PageIndex{2}\) (Python):

    count = 0

    for row in students_test.itertuples():
        if predict(row.Major, row.Age, row.Gender) == row.VG:
            print(" Predicted {}/{}/{} right!".format(row.Major, row.Age, row.Gender))
            count += 1
        else:
            print("X Predicted {}/{}/{} wrong. :(".format(row.Major, row.Age, row.Gender))

    accuracy = count / len(students_test) * 100
    print("Our accuracy on the test set was {}% ({}/{}).".format(accuracy, count, len(students_test)))

    Not too shabby. As you can see, the only test point we missed was the male middle-aged CPSC major, which our classifier figured would be a videogamer. Live and learn.

    The data size here is laughably small so that I can fit everything on the page. But it’s worth considering these three quantities anyway:

    Classifier’s performance on training set:   94.1% (16/17)
    Classifier’s performance on test set:       87.5% (7/8)
    Just using the prior on test set:           62.5% (5/8)

    These three quantities will nearly always be in this order from top to bottom. When we test our classifier on the very data it was trained on, we get an inflated view of its accuracy: recall that for decision trees, it will always be 100% minus any contradictions. Testing it on data it has not yet seen gives the truer (more realistic) picture. Finally, your classifier had better outperform just using the prior (here, choosing “No” because the majority of training points were “No”), or this whole thing is a pretty useless enterprise!
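
    Computing all three quantities is easiest if the loop lives in a reusable function. Here is a sketch, under the assumption that the training set sits in a DataFrame called students_train with the same columns as students_test (that name is an assumption, not from the original code):

    def accuracy_of(classifier, data):
        # Percentage of rows in data whose prediction matches the VG label.
        count = 0
        for row in data.itertuples():
            if classifier(row.Major, row.Age, row.Gender) == row.VG:
                count += 1
        return count / len(data) * 100

    # The prior strategy ignores the features entirely and always predicts
    # the majority class of the training set (here, "No").
    prior = students_train.VG.mode().iloc[0]

    def predict_prior(major, age, gender):
        return prior

    print("Training set: {}%".format(accuracy_of(predict, students_train)))
    print("Test set: {}%".format(accuracy_of(predict, students_test)))
    print("Prior only: {}%".format(accuracy_of(predict_prior, students_test)))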

