FUNDAMENTALS
By Surender Singh
7 min read
When we evaluate machine learning models, accuracy is not always enough. Two other metrics, precision and recall, often matter more, especially when the cost of mistakes is high. Let’s start with the basics.
Precision
Precision tells us: Of all the things the model flagged as positive, how many were actually correct?
Precision = True Positives (TP) / (True Positives (TP) + False Positives (FP))
Recall
Recall tells us: Of all the actual positives, how many did the model manage to catch?
Recall = True Positives (TP) / (True Positives (TP) + False Negatives (FN))
F1 Score
The F1 score balances both precision and recall by taking their harmonic mean.
F1 = 2 * (Precision * Recall) / (Precision + Recall)
This is useful when you need a single number that accounts for both catching all positives and minimizing false alarms.
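The three formulas above translate directly into a few small functions. A minimal sketch in plain Python (the function names are ours, not from any particular library):

```python
def precision(tp: int, fp: int) -> float:
    # Of everything flagged positive, the fraction that was truly positive.
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    # Of all actual positives, the fraction the model caught.
    return tp / (tp + fn)

def f1(p: float, r: float) -> float:
    # Harmonic mean of precision and recall.
    return 2 * (p * r) / (p + r)

p = precision(90, 10)   # 0.9
r = recall(90, 30)      # 0.75
print(round(f1(p, r), 2))
```

Note that true negatives appear nowhere: neither precision nor recall rewards the model for the easy "nothing here" cases.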
Imagine you’re a drone operator tasked with wildfire surveillance. You receive satellite pings about possible fires in remote forests. Your mission: fly the drone, scan the target, and decide whether it’s an actual wildfire.
Your role is critical:
If you miss a real wildfire (False Negative), the fire spreads, causing devastating damage.
If you raise too many false alarms (False Positive), firefighters rush to unnecessary sites, stretching resources thin.
This is exactly the tradeoff machine learning models face when balancing precision and recall.
Suppose your drone flagged 10 hotspots as wildfires.
7 turned out to be actual wildfires (True Positives).
3 were just campfires mistaken for wildfires (False Positives).
Your precision is: Precision = 7 / (7 + 3) = 0.7
In other words, when you raise an alarm, you’re right 70% of the time.
High precision means you’re rarely crying wolf. But it doesn’t tell us whether you’re catching all the fires out there.
Now imagine there were 12 real wildfires in total that day.
You correctly caught 7 (True Positives).
You missed 5 (False Negatives).
Your recall is: Recall = 7 / (7 + 5) ≈ 0.58
So, you managed to detect only 58% of the actual wildfires. High recall would mean catching almost every real fire, even if that means sometimes mistaking campfires for wildfires.
If we combine the numbers:
Precision = 0.7
Recall = 0.58
Then your F1 score is: F1 = 2 * (0.7 * 0.58) / (0.7 + 0.58) ≈ 0.63
This gives you a more holistic view: you’re doing okay, but both missing fires and raising false alarms need improvement.
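The whole drone day can be replayed end to end from raw labels. A sketch in plain Python (the five extra "quiet" sites are an assumption for illustration; true negatives don't affect precision, recall, or F1 anyway):

```python
# 1 = wildfire, 0 = not a wildfire; one entry per site the drone scanned.
# 7 correct alarms, 3 false alarms, 5 missed fires, plus 5 quiet sites.
y_true = [1] * 7 + [0] * 3 + [1] * 5 + [0] * 5
y_pred = [1] * 7 + [1] * 3 + [0] * 5 + [0] * 5

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

precision = tp / (tp + fp)                          # 7 / 10 = 0.70
recall = tp / (tp + fn)                             # 7 / 12 ≈ 0.58
f1 = 2 * precision * recall / (precision + recall)  # ≈ 0.64 (0.63 with rounded inputs)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

The small gap between 0.63 and 0.64 is just rounding: the article's hand calculation uses the rounded recall of 0.58, while the code carries full precision through.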
High Precision, Low Recall: You almost never cry wolf, but you miss a lot of real fires.
High Recall, Low Precision: You catch nearly every fire, but also mistake every campfire, barbecue, or sunset glow for wildfires.
High F1: You strike a balance, ensuring both efficiency and safety.
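These three regimes are usually reachable from one and the same model simply by moving its decision threshold. A sketch with invented confidence scores (both the scores and the labels here are made up for illustration):

```python
# Hypothetical detector confidences, paired with ground truth (1 = real fire).
scores = [0.95, 0.90, 0.85, 0.70, 0.65, 0.60, 0.50, 0.45, 0.40, 0.30]
labels = [1,    1,    1,    0,    1,    0,    1,    0,    1,    0]

def metrics_at(threshold):
    # Flag a site as a wildfire when confidence meets the threshold.
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(p == 1 and t == 1 for p, t in zip(preds, labels))
    fp = sum(p == 1 and t == 0 for p, t in zip(preds, labels))
    fn = sum(p == 0 and t == 1 for p, t in zip(preds, labels))
    return tp / (tp + fp), tp / (tp + fn)

# A strict threshold rarely cries wolf; a lenient one rarely misses a fire.
for th in (0.80, 0.40):
    p, r = metrics_at(th)
    print(f"threshold={th}: precision={p:.2f}, recall={r:.2f}")
```

On this toy data, raising the threshold to 0.80 yields perfect precision but only half the fires caught, while lowering it to 0.40 catches every fire at the cost of more false alarms.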
Precision and recall aren’t abstract math – they’re real tradeoffs with life-or-death consequences. As the wildfire drone operator, your decisions affect forests, communities, and emergency responders.
Similarly, in AI systems, whether detecting fraud, diagnosing disease, or scanning for wildfires, choosing between precision and recall depends on which cost you can better afford:
Missing positives, or
Triggering false alarms.
The goal isn’t always perfect precision or perfect recall, but the right balance for the problem at hand.
Next time you see a model boasting “95% accuracy,” ask yourself: But what about precision and recall?