Product Updates
December 9, 2024

Understanding Machine Learning Model Accuracy: A Practical Guide

Groundlight Staff

Groundlight’s computer vision dashboard shows “Projected ML Accuracy”, which estimates how well your AI vision model will perform next.

When you set up your computer vision model in Groundlight, you’ll see a number next to “Projected ML Accuracy.” This number estimates how your model will perform on new, unseen data in the near future.

A view from Groundlight's computer vision platform, where the model is detecting whether the cat is on the couch. Here it shows a Projected ML Accuracy of 95%.

In this guide, you’ll learn:

  • What is ML accuracy?
  • Why is ML accuracy important?
  • How Groundlight calculates ML accuracy and how to interpret it
  • How to improve your model's ML accuracy
  • How Groundlight makes understanding ML accuracy easy

What is ML accuracy?

ML accuracy is like a test score for your model. If your model makes correct predictions 90 out of 100 times, it has a 90% accuracy rate. But accuracy can be misleading, especially when dealing with imbalanced data where some outcomes occur more frequently than others. For example, if 90% of the time there is no cat on the couch, a model that always predicts "no cat" would have 90% accuracy.

What is Balanced ML accuracy?

Balanced ML accuracy provides a more realistic view by considering all possible outcomes equally, giving you a more accurate measure of your model's real-world performance. For your Groundlight AI model (or “detector”), it is the average of the accuracies calculated separately for the “YES” and “NO” classes.
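
To make the difference concrete, here is a small illustrative sketch (not Groundlight code) using scikit-learn's metrics. On a set where 90% of the images have no cat, a model that always answers "NO" scores 90% plain accuracy but only 50% balanced accuracy:

```python
# Illustrative only: why plain accuracy can be misleading on imbalanced data.
from sklearn.metrics import accuracy_score, balanced_accuracy_score

y_true = ["YES"] * 10 + ["NO"] * 90   # 10 images with the cat, 90 without
y_pred = ["NO"] * 100                 # a lazy model that always predicts "NO"

print(accuracy_score(y_true, y_pred))           # 0.90 -- looks great
print(balanced_accuracy_score(y_true, y_pred))  # 0.50 -- no better than guessing
```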

If you click on "Accuracy Details" on the Groundlight dashboard, you get this view. Groundlight's "Projected ML Accuracy" is actually "Balanced ML Accuracy": the accuracies calculated separately for the "YES" and "NO" classes are averaged.

What is Projected ML accuracy?

In Groundlight’s dashboard you’ll see “Projected ML Accuracy.” This is actually balanced ML accuracy, forecast into the near future — an estimate of how well the ML model behind this detector will perform next.

Why is Balanced ML accuracy important?

It provides a balanced evaluation

Balanced ML accuracy accounts for all outcomes equally. This means it evaluates how well your model performs in each scenario and then combines them for an overall score.

It helps with real-world applications

By understanding balanced ML accuracy, you can trust that your model will perform reliably in real-world situations, leading to better decisions and more effective AI solutions.

How Groundlight calculates ML Accuracy and how to interpret it

Groundlight doesn't just tally the number of times the model has been right or wrong in the past. Instead, we use k-fold cross-validation for evaluation, where the evaluation sets consist only of examples labeled by customers or Groundlight scientists. We train 4 different models, each on 75% of the labeled data, hold out the remaining 25% as unseen data, and then measure how accurate each model is on the data it wasn't trained on.

As mentioned earlier in this post, because many tasks are highly imbalanced (e.g., lots of "NO" answers and few "YES" answers), Groundlight reports balanced accuracy. Balanced accuracy is defined as the average of the proportion of correct predictions in each class, providing a fair assessment across all outcomes.
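
As a rough sketch of the idea (not Groundlight's actual training pipeline), here is what 4-fold cross-validation scored with balanced accuracy looks like in scikit-learn; the classifier and the random "features" below are placeholders for illustration:

```python
# Simplified sketch: 4-fold cross-validation scored with balanced accuracy.
# Each of 4 models trains on 75% of the labeled data and is scored on the held-out 25%.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

X = np.random.rand(40, 16)         # placeholder image features (e.g., embeddings)
y = np.array([1] * 20 + [0] * 20)  # 1 = "YES" (cat on couch), 0 = "NO"

scores = cross_val_score(
    LogisticRegression(max_iter=1000),
    X, y,
    cv=StratifiedKFold(n_splits=4, shuffle=True, random_state=0),
    scoring="balanced_accuracy",   # average of per-class accuracies
)
print(scores.mean())               # estimate of accuracy on unseen data
```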

For example, let’s say you’re setting up a Groundlight computer vision model to see whether your cat is on the couch. Here is how Groundlight calculates ML accuracy:

1. You send pictures to Groundlight for both outcomes:

  • Images with the cat on the couch ("YES" class)
  • Images without the cat on the couch ("NO" class)
Examples of images that show "Yes" (the cat is on the couch) and "No" (the cat is not on the couch).

2. You provide at least 8 Ground Truth labels (4 YES and 4 NO): Groundlight requires at least 4 customer-labeled examples in each class before it can begin reporting projected ML accuracy.

When you first set up your "detector" (or model), you won't see a value for Projected ML Accuracy. You will need to click "Label Image Queries" to provide 8 total Ground Truth labels for "YES" and "NO".
A view of what providing Ground Truth labels looks like, where the user determines what "YES" and "NO" means for whether the cat is on the couch.

3. Groundlight calculates projected ML accuracy for each class using k-fold cross-validation:

  1. "YES" class accuracy: If the model correctly identifies the cat on the couch in 9 out of 10 images, the accuracy is 90%.
  2. "NO" class accuracy: If the model correctly identifies that the cat is not on the couch in 8 out of 10 images, the accuracy is 80%.
  3. Groundlight computes balanced ML accuracy:
    • Balanced ML accuracy = (YES accuracy + NO accuracy) ÷ 2
    • Balanced ML accuracy = (90% + 80%) ÷ 2 = 85%

An 85% projected ML accuracy means your model is expected to perform correctly 85% of the time across all scenarios when exposed to new data.
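
The same arithmetic, written out in a few lines of illustrative Python:

```python
# Balanced ML accuracy from the cat-on-couch example above (illustrative only).
yes_correct, yes_total = 9, 10  # "YES" class: cat correctly found in 9 of 10 images
no_correct, no_total = 8, 10    # "NO" class: empty couch correctly found in 8 of 10

yes_accuracy = yes_correct / yes_total               # 0.90
no_accuracy = no_correct / no_total                  # 0.80
balanced_accuracy = (yes_accuracy + no_accuracy) / 2
print(f"{balanced_accuracy:.0%}")                    # 85%
```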

NOTE: The ML model can make correct predictions even at low confidence levels. The projected ML accuracy metric doesn't consider the model's confidence scores or the detector's confidence threshold. A value of 100% means the ML model made all correct predictions, while 50% means it is doing no better than random guessing. We require at least four customer-labeled examples in each class before we begin reporting projected ML accuracy. The term "projected" means this evaluation is performed using k-fold cross-validation on the latest model.

How to improve your model's ML accuracy

Here are some suggestions to improve your model’s ML accuracy:

1. Send more images to Groundlight for both "Yes" and "No" outcomes.

2. Use clear, high-resolution images.

3. Include various scenarios (different lighting and edge cases). 

A few examples of “edge cases” for the “Is the cat on the couch?” query.

4. Correctly label each image with Ground Truth labels such as "cat on couch" or "no cat on couch."

NOTE: Ground Truth labels help the model learn and improve. They serve as examples that help cloud labelers understand your instructions, showing exactly what you want your detector to do, especially in tricky cases. Ground Truth is a select set of high-quality images with correct answers, and the more Ground Truth images you provide, the better we can measure detector performance. You need to provide these answers because you know what the detector is trying to do and what your question means; your inputs define the truth against which accuracy is measured and how the detector should behave. If you don't provide many answers, a Groundlight expert can act as a proxy for you.

5. Understand that the confidence intervals on the accuracy measure describe statistical uncertainty about our measurement, not the model’s confidence in any particular prediction. See our blog post for an in-depth discussion of confidence intervals.
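
Groundlight's exact method for computing these confidence bounds isn't described in this post, but as a rough illustration of the general idea, a binomial (Wilson score) interval around a measured accuracy narrows as more labeled examples are evaluated:

```python
# Rough illustration (an assumption, not Groundlight's documented method):
# a Wilson score interval expresses uncertainty about the measured accuracy itself,
# and it tightens as more labeled examples are included in the evaluation.
from math import sqrt

def wilson_interval(correct: int, total: int, z: float = 1.96) -> tuple[float, float]:
    p = correct / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = (z / denom) * sqrt(p * (1 - p) / total + z**2 / (4 * total**2))
    return center - half_width, center + half_width

print(wilson_interval(17, 20))    # 85% accuracy measured on 20 images: wide interval
print(wilson_interval(170, 200))  # 85% measured on 200 images: much tighter interval
```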

How Groundlight makes understanding ML Accuracy easy

User-friendly dashboard

  • Visual metrics: See projected ML accuracy and confidence bounds at a glance.
  • Interactive tools: Easily provide Ground Truth labels through the computer vision platform.

Simplifies complex concepts

  • No technical expertise needed: Groundlight takes care of the ML science for you, so you can focus on your application.

Offers support for improvement

  • Guidance: Provides tips on how to enhance your model's performance.
  • Assistance: Access to human-in-the-loop support when needed.

Final thoughts

Understanding the performance metrics of your ML models is important and often tricky. Groundlight does the legwork for you, summarizing these metrics and giving you clear visibility into how your computer vision model is performing.

Ready to improve your AI models? Visit dashboard.groundlight.ai, sign up for a free account, and try Groundlight yourself using the "Explore" tab.
