Groundlight’s computer vision dashboard shows a number labeled “Projected ML Accuracy.” This number estimates how well the AI vision model behind your detector will perform on new, unseen data in the near future.
In this guide, you’ll learn what balanced ML accuracy is, how Groundlight calculates it, and how you can improve it.
ML accuracy is like a test score for your model. If your model makes correct predictions 90 out of 100 times, it has a 90% accuracy rate. But accuracy can be misleading, especially when dealing with imbalanced data where some outcomes occur more frequently than others. For example, if 90% of the time there is no cat on the couch, a model that always predicts "no cat" would have 90% accuracy.
Balanced ML accuracy provides a more realistic view by weighting all possible outcomes equally, giving you a more honest measure of your model's real-world performance. For your Groundlight AI model (or “detector”), it is the average of the accuracies calculated separately for the “YES” and “NO” classes.
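To make the difference concrete, here is a minimal sketch in plain Python comparing plain accuracy with balanced accuracy on the imbalanced cat-on-couch example above (the labels and predictions are made up for illustration):

```python
# Hypothetical labels: 90 images with no cat ("NO"), 10 with a cat ("YES").
truth = ["NO"] * 90 + ["YES"] * 10

# A lazy model that always predicts "NO".
predictions = ["NO"] * 100

correct = sum(t == p for t, p in zip(truth, predictions))
plain_accuracy = correct / len(truth)  # 0.90, which looks great but is misleading

def per_class_accuracy(label):
    """Fraction of images of a given true class that were predicted correctly."""
    pairs = [(t, p) for t, p in zip(truth, predictions) if t == label]
    return sum(t == p for t, p in pairs) / len(pairs)

# Balanced accuracy: average the per-class accuracies so each class counts equally.
balanced_accuracy = (per_class_accuracy("YES") + per_class_accuracy("NO")) / 2
print(plain_accuracy)     # 0.90
print(balanced_accuracy)  # 0.50, no better than random guessing
```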
In Groundlight’s dashboard, “Projected ML Accuracy” is this balanced ML accuracy, forecast into the near future: an estimate of how well the ML model behind the detector will perform on the data it sees next.
Balanced ML accuracy accounts for all outcomes equally: it evaluates how well your model performs on each class and then averages those results into an overall score.
By understanding balanced ML accuracy, you can trust that your model will perform reliably in real-world situations, leading to better decisions and more effective AI solutions.
Groundlight doesn't just tally the number of times the model has been right or wrong in the past. Instead, we use k-fold cross-validation for evaluation, where the evaluation sets consist only of examples labeled by customers or Groundlight scientists. We train four different models, each on a different 75% of the data, hold out the remaining 25% as unseen data, and then measure how accurate each model is on the data it wasn't trained on.
As mentioned earlier in this post, because many tasks are highly imbalanced (e.g., lots of "NO" answers and few "YES" answers), Groundlight reports balanced accuracy. Balanced accuracy is defined as the average of the proportion of correct predictions in each class, providing a fair assessment across all outcomes.
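Here is a rough sketch of that evaluation scheme using scikit-learn; the model and data below are stand-ins for illustration, not Groundlight's actual training pipeline:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Stand-in data: 200 feature vectors with an imbalanced 90/10 label split.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))
y = np.array([0] * 180 + [1] * 20)  # 0 = "NO", 1 = "YES"

# 4 folds: each model trains on 75% of the data and is scored on the
# held-out 25%. Stratification keeps the YES/NO ratio similar in every fold.
cv = StratifiedKFold(n_splits=4, shuffle=True, random_state=0)
scores = cross_val_score(
    LogisticRegression(), X, y, cv=cv, scoring="balanced_accuracy"
)
print(scores.mean())  # average balanced accuracy across the 4 held-out folds
```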
For example, let’s say you're setting up a Groundlight computer vision model to check whether your cat is on the couch. Here is how Groundlight calculates projected ML accuracy:
1. You send pictures to Groundlight for both outcomes: images where the cat is on the couch and images where it isn't (see the SDK sketch after this list).
2. You provide at least 8 Ground Truth labels (4 YES and 4 NO). Groundlight requires at least 4 customer-labeled examples in each class before it begins reporting projected ML accuracy.
3. Groundlight calculates projected ML accuracy for each class using k-fold cross-validation, then averages the per-class accuracies into a single balanced score.
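A minimal sketch of steps 1 and 2 with the Groundlight Python SDK might look like the following; the detector name, query, and file paths are hypothetical, and exact calls may vary by SDK version:

```python
from groundlight import Groundlight

gl = Groundlight()  # reads your API token from the GROUNDLIGHT_API_TOKEN env var

# Hypothetical detector for the cat-on-couch example.
detector = gl.get_or_create_detector(
    name="cat-on-couch",
    query="Is the cat on the couch?",
)

# Step 1: send pictures covering both outcomes (hypothetical file paths).
iq_yes = gl.submit_image_query(detector=detector, image="cat_on_couch.jpg")
iq_no = gl.submit_image_query(detector=detector, image="empty_couch.jpg")

# Step 2: provide Ground Truth labels (at least 4 per class) so that
# Groundlight can begin reporting projected ML accuracy.
gl.add_label(iq_yes, "YES")
gl.add_label(iq_no, "NO")
```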
An 85% projected ML accuracy means your model is expected to be correct 85% of the time, averaged across both classes, when exposed to new data.
NOTE: The ML model can make correct predictions even at low confidence levels; the projected ML accuracy metric doesn't consider the model's confidence scores or the detector's confidence threshold. A value of 100% means the ML model made all correct predictions, while 50% is no better than random guessing. We require at least four customer-labeled examples in each class before we begin reporting projected ML accuracy. The term "projected" means the evaluation is performed with k-fold cross-validation on the latest model.
Here are some suggestions to improve your model’s ML accuracy:
1. Send more images to Groundlight for both "YES" and "NO" outcomes.
2. Use clear, high-resolution images.
3. Include various scenarios (different lighting and edge cases).
For example, edge cases for the “Is the cat on the couch?” query might include unusual lighting or a cat that is only partially visible.
4. Correctly label each image with Ground Truth labels such as "cat on couch" or "no cat on couch."
NOTE: Ground Truth labels are a select set of high-quality images with correct answers. They help the model learn and improve, and they serve as examples that show cloud labelers exactly what you want your detector to do, especially in tricky cases. Because you know what the detector is trying to do and what your question means, your answers define the truth used to measure accuracy and to guide how the detector should behave. The more Ground Truth images you provide, the better Groundlight can measure detector performance. If you don't provide many answers, a Groundlight expert can act as a proxy for you.
5. Understand that confidence intervals on the accuracy measure quantify statistical uncertainty about the measurement itself, not the model's confidence in any particular prediction (see the sketch after this list). See our blog post for an in-depth discussion of confidence intervals.
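To illustrate that distinction, here is a rough sketch of how such an interval can be computed, using a generic normal-approximation interval on a measured accuracy. This is standard statistics for illustration, not necessarily the exact method Groundlight uses:

```python
import math

def accuracy_confidence_interval(correct, total, z=1.96):
    """95% normal-approximation interval for an accuracy measured on `total` examples.

    This quantifies uncertainty about the *measurement*: with few examples,
    the interval is wide even if the measured accuracy looks high.
    """
    p = correct / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return max(0.0, p - half_width), min(1.0, p + half_width)

# The same 85% measured accuracy is far more trustworthy with more data:
print(accuracy_confidence_interval(17, 20))     # roughly (0.69, 1.00)
print(accuracy_confidence_interval(850, 1000))  # roughly (0.83, 0.87)
```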
Understanding the performance metrics of your ML models is important and often tricky. Groundlight does the legwork for you, summarizing these metrics and giving you clear visibility into how your computer vision model is performing.
Ready to improve your AI models? Visit dashboard.groundlight.ai, sign up for a free account, and try Groundlight yourself using the "Explore" tab.
Can ChatGPT analyze images?

Yes, ChatGPT can analyze images in the sense of producing reasonable text about what an image depicts. However, there are important caveats when it comes to getting repeatable, trustworthy, actionable answers to visual questions, especially in specialized domains.
Groundlight AI trains your specialist models behind the scenes with a human-in-the-loop system that is easy to integrate into business applications.