FUNDAMENTALS
By Surender Singh
7 min read
When we evaluate machine learning models, accuracy is not always enough. Two other metrics, precision and recall, often matter more, especially when the cost of mistakes is high. Let’s start with the basics.
Precision
Precision tells us: Of all the things the model flagged as positive, how many were actually correct?
Precision = True Positives (TP) / (True Positives (TP) + False Positives (FP))
Recall
Recall tells us: Of all the actual positives, how many did the model manage to catch?
Recall = True Positives (TP) / (True Positives (TP) + False Negatives (FN))
F1 Score
The F1 score balances both precision and recall by taking their harmonic mean.
F1 = 2 * (Precision * Recall) / (Precision + Recall)
This is useful when you need a single number that accounts for both catching all positives and minimizing false alarms.
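The three formulas above translate directly into a few small functions. A minimal sketch in plain Python (the function names are ours, not from any particular library):

```python
def precision(tp: int, fp: int) -> float:
    # Of everything flagged positive, the fraction that was truly positive.
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    # Of all actual positives, the fraction the model caught.
    return tp / (tp + fn)

def f1(p: float, r: float) -> float:
    # Harmonic mean of precision and recall.
    return 2 * (p * r) / (p + r)

p = precision(90, 10)   # 0.9
r = recall(90, 30)      # 0.75
print(round(f1(p, r), 2))
```

Note that true negatives appear nowhere: neither precision nor recall rewards the model for the easy "nothing here" cases.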
Imagine you’re a drone operator tasked with wildfire surveillance. You receive satellite pings about possible fires in remote forests. Your mission: fly the drone, scan the target, and decide whether it’s an actual wildfire.
Your role is critical:
If you miss a real wildfire (False Negative), the fire spreads, causing devastating damage.
If you raise too many false alarms (False Positive), firefighters rush to unnecessary sites, stretching resources thin.
This is exactly the tradeoff machine learning models face when balancing precision and recall.
Suppose your drone flagged 10 hotspots as wildfires.
7 turned out to be actual wildfires (True Positives).
3 were just campfires mistaken for wildfires (False Positives).
Your precision is: Precision = 7 / (7 + 3) = 0.7
In other words, when you raise an alarm, you’re right 70% of the time.
High precision means you’re rarely crying wolf. But it doesn’t tell us whether you’re catching all the fires out there.
Now imagine there were 12 real wildfires in total that day.
You correctly caught 7 (True Positives).
You missed 5 (False Negatives).
Your recall is: Recall = 7 / (7 + 5) ≈ 0.58
So, you managed to detect only 58% of the actual wildfires. High recall would mean catching almost every real fire, even if that means sometimes mistaking campfires for wildfires.
If we combine the numbers:
Precision = 0.7
Recall = 0.58
Then your F1 score is: F1 = 2 * (0.7 * 0.58) / (0.7 + 0.58) ≈ 0.63
This gives you a more holistic view: you’re doing okay, but both missing fires and raising false alarms need improvement.
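The whole drone day can be replayed end to end from raw labels. A sketch in plain Python (the five extra "quiet" sites are an assumption for illustration; true negatives don't affect precision, recall, or F1 anyway):

```python
# 1 = wildfire, 0 = not a wildfire; one entry per site the drone scanned.
# 7 correct alarms, 3 false alarms, 5 missed fires, plus 5 quiet sites.
y_true = [1] * 7 + [0] * 3 + [1] * 5 + [0] * 5
y_pred = [1] * 7 + [1] * 3 + [0] * 5 + [0] * 5

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

precision = tp / (tp + fp)                          # 7 / 10 = 0.70
recall = tp / (tp + fn)                             # 7 / 12 ≈ 0.58
f1 = 2 * precision * recall / (precision + recall)  # ≈ 0.64 (0.63 with rounded inputs)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

The small gap between 0.63 and 0.64 is just rounding: the article's hand calculation uses the rounded recall of 0.58, while the code carries full precision through.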
High Precision, Low Recall: You almost never cry wolf, but you miss a lot of real fires.
High Recall, Low Precision: You catch nearly every fire, but also mistake every campfire, barbecue, or sunset glow for wildfires.
High F1: You strike a balance, ensuring both efficiency and safety.
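These three regimes are usually reachable from one and the same model simply by moving its decision threshold. A sketch with invented confidence scores (both the scores and the labels here are made up for illustration):

```python
# Hypothetical detector confidences, paired with ground truth (1 = real fire).
scores = [0.95, 0.90, 0.85, 0.70, 0.65, 0.60, 0.50, 0.45, 0.40, 0.30]
labels = [1,    1,    1,    0,    1,    0,    1,    0,    1,    0]

def metrics_at(threshold):
    # Flag a site as a wildfire when confidence meets the threshold.
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(p == 1 and t == 1 for p, t in zip(preds, labels))
    fp = sum(p == 1 and t == 0 for p, t in zip(preds, labels))
    fn = sum(p == 0 and t == 1 for p, t in zip(preds, labels))
    return tp / (tp + fp), tp / (tp + fn)

# A strict threshold rarely cries wolf; a lenient one rarely misses a fire.
for th in (0.80, 0.40):
    p, r = metrics_at(th)
    print(f"threshold={th}: precision={p:.2f}, recall={r:.2f}")
```

On this toy data, raising the threshold to 0.80 yields perfect precision but only half the fires caught, while lowering it to 0.40 catches every fire at the cost of more false alarms.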
Precision and recall aren’t abstract math – they’re real tradeoffs with life-or-death consequences. As the wildfire drone operator, your decisions affect forests, communities, and emergency responders.
Similarly, in AI systems, whether detecting fraud, diagnosing disease, or scanning for wildfires, choosing between precision and recall depends on which cost you can better afford:
Missing positives, or
Triggering false alarms.
The goal isn’t always perfect precision or perfect recall, but the right balance for the problem at hand.
Next time you see a model boasting “95% accuracy,” ask yourself: But what about precision and recall?