Search Anything Model: Combining Vision and Natural Language in Search
Contents
What is Natural Language Search?
What Can You Use the Search Anything Model for?
How to Use Search Anything Model with Encord?
Conclusion
Written by
Frederik Hvilshøj
In the current AI boom, one thing is certain: data is king.
Data is at the heart of the production and development of new models; and yet, the processing and structuring required to get data to a form that is consumable by modern AI are often overlooked.
One of the most fundamental elements of intelligence that can be leveraged to facilitate this is search. Search is crucial to understanding data: the more ways to search and group data, the more insights you can extract. The greater the insights, the more structured the data becomes.
Historically, search capabilities have been limited to uni-modal approaches: models used for images or videos in vision use cases have been distinct from those used for textual data in natural language processing. With GPT-4’s ability to process both images and text, we are only now starting to see the potential impacts of performant multi-modal models that span various forms of data.
Embracing the future of multi-modal data, we propose the Search Anything Model: a unified framework that combines natural language, visual property, similarity, and metadata search in a single package. Leveraging computer vision processing, multi-modal embeddings, LLMs, and traditional search techniques, Search Anything allows for multiple forms of structured data querying using natural language.
If you want to find all bright images with multiple cats that look similar to a particular reference image, Search Anything will match over multiple index types to retrieve data of the requisite form and conditions.
What is Natural Language Search?
Natural Language Search (NLS) uses human-like language to query and retrieve information from databases, datasets, or documents. Unlike traditional keyword-based searches, NLS algorithms employ Natural Language Processing (NLP) techniques to understand the context, semantics, and intent behind user queries.
By interpreting the query’s meaning, NLS systems provide more accurate and relevant search results, mimicking how humans communicate. Computer vision calls for a similar general understanding of visual content, without relying on pre-existing metadata.
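Under the hood, most NLS systems over visual data rely on a shared embedding space: a multi-modal encoder (such as CLIP) maps queries and images into the same vector space, and retrieval becomes nearest-neighbor search. Here is a minimal sketch of that idea, with toy 3-d vectors standing in for real encoder outputs; the filenames and dimensions are illustrative only:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def semantic_search(query_emb, item_embs, top_k=2):
    """Rank items by embedding similarity to the query, highest first."""
    scored = [(name, cosine_similarity(query_emb, emb))
              for name, emb in item_embs.items()]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:top_k]

# Toy 3-d embeddings standing in for a real multi-modal encoder's output.
items = {
    "cat_photo.jpg": np.array([0.9, 0.1, 0.0]),
    "dog_photo.jpg": np.array([0.1, 0.9, 0.0]),
    "car_photo.jpg": np.array([0.0, 0.1, 0.9]),
}
query = np.array([0.8, 0.2, 0.0])  # pretend embedding of the text "a cat"
results = semantic_search(query, items)
```

Because the matching happens in embedding space rather than on keywords, the query text never needs to appear in any filename or metadata field.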
What Can You Use the Search Anything Model for?
Let’s dive into some examples of computer vision uses for the Search Anything Model.
Data Exploration
Search Anything simplifies data exploration by allowing users to ask questions in plain language and receive valuable insights.
Instead of manually formulating complex queries and algorithms that may require pre-existing metadata, you can pose questions such as:
“Which images are blurry?”
Or
“How is my model performing on images with multiple labels?”
Search Anything interprets these queries to provide visualizations or summaries of the data quickly and effectively to gain valuable insights.
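For instance, a query like “Which images are blurry?” can be routed to a sharpness metric. The sketch below uses the variance of the image's discrete Laplacian, a standard blur proxy (blurry images have few edges, so the Laplacian response is small); the threshold and toy images are illustrative assumptions, not Encord's internal implementation:

```python
import numpy as np

def laplacian_variance(img):
    """Variance of the discrete Laplacian over interior pixels,
    a common sharpness proxy."""
    lap = (
        -4 * img[1:-1, 1:-1]
        + img[:-2, 1:-1] + img[2:, 1:-1]
        + img[1:-1, :-2] + img[1:-1, 2:]
    )
    return float(lap.var())

def find_blurry(images, threshold=100.0):  # illustrative threshold
    """Return names of images whose sharpness falls below the threshold."""
    return [name for name, img in images.items()
            if laplacian_variance(img) < threshold]

sharp = np.indices((32, 32)).sum(axis=0) % 2 * 255.0  # checkerboard: many edges
blurry = np.full((32, 32), 128.0)                     # flat: no edges at all
result = find_blurry({"sharp.png": sharp, "blurry.png": blurry})
```

A natural-language front end then only has to map the phrase "blurry" onto this metric and return the matching subset.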
Data Curation
Search Anything streamlines data curation, making the process highly efficient and user-friendly. Filter, sort, or aggregate data using only natural language commands.
For example, you can request the following:
“Remove all the very bright images from my dataset”
Or
“Add an ‘unannotated’ tag to all the data that has not been annotated yet.”
Search Anything processes these commands, automatically performs the requested actions, and presents the curated data, all without complex coding or SQL queries.
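As a rough mental model (not Encord's actual pipeline), those two commands reduce to a brightness metric, a filter, and a bulk tag; the threshold and tag name below are taken from the example queries, everything else is an assumption:

```python
import numpy as np

def brightness(img):
    """Mean pixel intensity, a simple brightness metric (0-255)."""
    return float(img.mean())

def curate(images, bright_threshold=200.0):  # illustrative threshold
    """Drop very bright images and tag everything else 'unannotated'."""
    kept, tags = {}, {}
    for name, img in images.items():
        if brightness(img) >= bright_threshold:
            continue  # "Remove all the very bright images from my dataset"
        kept[name] = img
        tags[name] = ["unannotated"]  # bulk tagging of the remaining data
    return kept, tags

images = {
    "overexposed.png": np.full((8, 8), 240.0),
    "normal.png": np.full((8, 8), 120.0),
}
kept, tags = curate(images)
```

The value of the natural-language layer is that the user never writes this code; the system compiles the command into such operations.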
Using Encord Active to filter out bright images in the COCO dataset, and the bulk tagging feature to tag all the data.
Data Debugging
Search Anything expedites the process of identifying and resolving data issues.
To investigate anomalies or inconsistencies, ask questions or issue commands such as:
“Are there any missing values for the image difficulty quality metric?”
Or
“Find records that are labeled ‘cat’ but don’t look like a typical cat.”
Once again, Search Anything analyzes the data, detects discrepancies, and provides actionable insights to assist you in identifying and rectifying data problems efficiently.
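A query like “records labeled ‘cat’ that don’t look like a typical cat” can be served by flagging embeddings that sit far from their class centroid. Here is a sketch under that assumption, with toy 2-d embeddings and a mean-plus-two-standard-deviations cutoff (both choices are illustrative):

```python
import numpy as np

def atypical_members(label_embs, n_std=2.0):
    """Flag items whose embedding lies unusually far from the centroid
    of everything sharing the same label."""
    embs = np.stack(list(label_embs.values()))
    centroid = embs.mean(axis=0)
    dists = {name: float(np.linalg.norm(e - centroid))
             for name, e in label_embs.items()}
    vals = np.array(list(dists.values()))
    cutoff = vals.mean() + n_std * vals.std()  # illustrative cutoff rule
    return [name for name, d in dists.items() if d > cutoff]

# Toy 2-d embeddings: five cat-like points and one that is not cat-like.
cat_images = {
    "cat_001.jpg": np.array([1.0, 0.0]),
    "cat_002.jpg": np.array([0.9, 0.1]),
    "cat_003.jpg": np.array([1.1, -0.1]),
    "cat_004.jpg": np.array([1.0, 0.05]),
    "cat_005.jpg": np.array([0.95, 0.0]),
    "cat_006.jpg": np.array([5.0, 5.0]),  # labeled "cat" but far from the rest
}
suspects = atypical_members(cat_images)
```

The flagged records are exactly the ones worth sending for human review, which is why this pattern is a workhorse of label debugging.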
Cataloging Data for E-commerce
Search Anything can also enhance the cataloging process for e-commerce platforms. By understanding product photos and descriptions, it enables users to search and categorize products efficiently. For example, a user can ask:
“Locate the green and sparkly shoes.”
Search Anything interprets this query, matches the desired criteria with the product images and descriptions, and displays the relevant products, facilitating improved product discovery and customer experience.
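A real system would compare query embeddings against product photo and description embeddings; the toy version below simply intersects query terms with description words, but it exposes the same interface. The SKUs and descriptions are made up for illustration:

```python
def catalog_search(query_terms, products):
    """Return SKUs whose description contains every query term
    (a crude stand-in for matching query and product embeddings)."""
    hits = []
    for sku, desc in products.items():
        desc_words = set(desc.lower().split())
        if all(term in desc_words for term in query_terms):
            hits.append(sku)
    return hits

products = {
    "SKU-1": "green sparkly high-heel shoes",
    "SKU-2": "red leather shoes",
    "SKU-3": "green cotton shirt",
}
matches = catalog_search(["green", "sparkly", "shoes"], products)
```

Swapping the keyword intersection for embedding similarity is what lets the query “green and sparkly shoes” also match products whose descriptions use different wording.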
How to Use Search Anything Model with Encord?
At Encord, we are building an end-to-end visual data engine for computer vision. Our latest release, Encord Active, empowers users to interact with visual data using only natural language.
Let’s dive into a few use cases:
Use Case 1: Data Exploration
User Query: “red dress,” “denim jeans,” and “black shirts”
Encord Active identifies the images in the dataset that most accurately correspond to each query.
Use Case 2: Data Curation
User query: “Display the very bright images”
Encord Active displays filtered results from the dataset based on the specified criterion.
Use Case 3: Data Debugging
User Query: “Find all the non-singular images”
Encord Active detects duplicated images and displays those that are not unique within the dataset.
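One classic way to find such near-duplicates (illustrative, and not necessarily how Encord Active does it) is perceptual hashing: shrink each image to a tiny grid, threshold against the mean, and compare the resulting bit patterns:

```python
import numpy as np

def average_hash(img, size=8):
    """Tiny perceptual hash: block-average the image down to a size x size
    grid, then threshold each cell against the grid's mean."""
    h, w = img.shape
    img = img[: h - h % size, : w - w % size]  # crop to a multiple of size
    grid = img.reshape(size, img.shape[0] // size,
                       size, img.shape[1] // size).mean(axis=(1, 3))
    return (grid > grid.mean()).flatten()

def find_duplicates(images, max_bit_diff=5):  # illustrative tolerance
    """Pair up images whose hashes differ in at most max_bit_diff bits."""
    names = list(images)
    hashes = {n: average_hash(images[n]) for n in names}
    dupes = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if np.count_nonzero(hashes[a] != hashes[b]) <= max_bit_diff:
                dupes.append((a, b))
    return dupes

base = np.tile(np.arange(64, dtype=float), (64, 1))  # horizontal gradient
images = {
    "photo.png": base,
    "photo_copy.png": base + 1.0,  # slightly brightened re-upload
    "other.png": base.T.copy(),    # genuinely different image
}
dupes = find_duplicates(images)
```

Because the hash thresholds against the image's own mean, uniform brightness shifts leave the bits unchanged, which is exactly what makes it robust to re-encoded copies.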
Can I Use My Own Model?
Yes, Encord Active allows you to leverage your own models. Through fine-tuning or integrating custom embedding models, you can tailor the search capabilities to your specific needs, ensuring optimal performance and relevance.
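To illustrate the general shape such an integration can take (this is a generic sketch, not Encord Active's actual API), a search index typically needs just one thing from a custom model: a function mapping raw data to a vector. Here a trivial character-histogram "model" plays that role:

```python
import numpy as np

class EmbeddingIndex:
    """A minimal search index parameterized by a user-supplied
    embedding function."""
    def __init__(self, embed_fn):
        self.embed_fn = embed_fn
        self.items = {}

    def add(self, name, raw):
        self.items[name] = self.embed_fn(raw)

    def query(self, raw, top_k=1):
        q = self.embed_fn(raw)
        scored = sorted(self.items.items(),
                        key=lambda kv: float(np.dot(q, kv[1])),
                        reverse=True)
        return [name for name, _ in scored[:top_k]]

def char_histogram(text):
    """Toy 'model': a normalized letter-frequency vector."""
    v = np.zeros(26)
    for ch in text.lower():
        if ch.isalpha():
            v[ord(ch) - ord("a")] += 1
    n = np.linalg.norm(v)
    return v / n if n else v

index = EmbeddingIndex(char_histogram)
index.add("red_dress.jpg", "red dress")    # caption as a proxy for the image
index.add("denim_jeans.jpg", "denim jeans")
result = index.query("dress", top_k=1)
```

Swapping `char_histogram` for a fine-tuned vision-language encoder changes the quality of the results, not the structure of the code.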
Conclusion
Natural Language Search is revolutionizing the way we interact with data, enabling intuitive and efficient exploration, curation, and debugging.
By harnessing the power of NLP and computer vision models, our Search Anything Model allows you to pose queries, issue commands, and obtain actionable insights using human-like language. Whether you are an ML engineer, a data scientist, or an e-commerce professional, incorporating NLS into your workflow can significantly enhance productivity and unlock the full potential of your data.
Build better ML models with Encord
Get started todayWritten by
Frederik Hvilshøj
View more postsRelated blogs
How To Use Encord’s Bitmask Brush Tool
In machine learning, precise image annotation is crucial for training accurate and reliable models. Encord's Bitmask brush tool revolutionizes the annotation process by allowing interactive and fine-grained selection of regions of interest within images. Designed to cater to the needs of machine learning practitioners, this comprehensive guide will walk you through the ins and outs of utilizing Encord's Bitmask brush tool, empowering you to create precise and highly accurate annotations within the Encord platform. What is the bit mask brush? A bit mask brush allows you to interactively define regions or areas of interest within an image by "brushing" over them. As you paint or brush over the image, the bit mask brush assigns specific ‘bits’ or values to the corresponding pixels or regions you select. These bits represent the labels or categories associated with the selected areas. Accessing brush tool: Click on 🖌️ or press ‘f’ For example, if you are labeling outlines of blood vessels in an image, you can use a bit of mask brush to brush over the pixels corresponding to the vessel’s boundaries. The bit mask brush would assign a specific value or bit pattern to those pixels, indicating that they belong to the vessel class or category. Similarly, if you are labeling topologically separate regions belonging to the same frame classification, you can use a bitmask brush to assign different bit patterns or values to the regions you select. This allows you to differentiate between regions or segments within the same frame category. Using the Bitmask Brush The Bitmask brush is a powerful tool for creating annotations or labels by selecting specific regions within an image, providing flexibility and control over the labeling process. Let’s explore its key functionalities: Selection and Size Adjustment When the Bitmask annotation type is selected, the brush tool is automatically chosen by default. 
You can access it by clicking the brush icon or pressing the 'f' key, and you are able to adjust the brush size using a convenient slider. This enables you to tailor the brush size to the level of detail needed for your annotations. Annotation Creation Once you have adjusted the brush size, you can begin annotating your image by selecting the desired areas. As you brush over the regions, the Bitmask brush assigns specific bit patterns or values to the corresponding pixels, indicating their association with the selected labels or categories. Apply Label Once your annotation is complete, you can apply the label by clicking the "Apply label" button or pressing the Enter key, finalizing the annotation and incorporating it into the labeling or annotation process. 💡To use the bitmap masks, the ontology should contain the Bitmask annotation type. Eraser The Eraser tool provides the ability to erase parts or the entirety of your bitmask selection. This can be useful if you need to refine or modify your annotations before applying the final label. You can access the Eraser tool by clicking the eraser icon or pressing the 'h' key on your keyboard while the popup window is open. Accessing eraser tool: Click on eraser icon or press ‘h’ Threshold Brush The Threshold brush, specific to DICOM images, offers additional functionality by enabling you to set an intensity value threshold for your labels. The preview toggle allows you to visualize which parts of the image correspond to your set threshold, helping you determine the areas that will be labeled when covered by the Threshold brush. To access the Threshold brush, click the corresponding icon or press the 'g' key while the popup window is open. Adjust the brush size and the range of intensity values using the sliders in the popup. 
Accessing threshold tool: Click on the corresponding icon or press ‘g’ With the Encord Bitmask SDK The Encord Bitmask SDK empowers you to effortlessly generate, modify, and analyze annotations within the Encord platform, leveraging the vast capabilities of Python's comprehensive libraries and tools to their fullest extent. Find more details in the bitmask documentation. To conclude, Encord’s Bitmask brush tool, equipped with its diverse range of features, offers an intuitive and flexible solution for creating annotations within the Encord platform. Harnessing the power of the Bitmask brush and the Encord Bitmask SDK, you can elevate your annotation workflow to achieve precise and reliable results. Recommended Articles Medical Image Segmentation: A Complete Guide 6 Best Open Source Annotation Tools for Medical Imaging Guide to Experiments for Medical Imaging in Machine Learning 7 Ways to Improve Medical Imaging Dataset Future for Computer Vision in Healthcare
Jun 23 2023
5 M
How to Automate Data Labeling [Examples + Tutorial]
If you feed an AI model with junk, it’s bound to return the favor. The quality of the data being consumed by an AI algorithm has a direct correlation with its success when it comes to generalizing to new instances; this is the reason data professionals spend 80% of their time during model development, ensuring the data is appropriately prepared, and is representative of the real world. Data labeling is an essential task in supervised learning, as it enables AI algorithms to create accurate input-to-output mappings and build a comprehensive understanding of their environment. Data labeling can consume up to 80% of data preparation time, and at least 25% of an entire ML project is spent labeling. Therefore, efficient data labeling strategies are critical for improving the speed and quality of machine learning model development. 💡Read the blog to learn how to automate your data labeling process. Manual data labeling can be a challenging and error-prone process, as it relies on human judgment and subjective interpretation. Labelers may have different levels of expertise, leading to consistency in the labeling process and reduced accuracy. Moreover, manual data labeling can be time-consuming and expensive, especially for large datasets. This can hinder the scalability and efficiency of AI model development. Integrating automated data labeling into your machine learning projects can be an effective strategy for mitigating the challenges of manual data labeling. By leveraging AI technology to perform data labeling tasks, businesses can reduce the risk of human error, increase the speed and efficiency of model development, and minimize costs associated with manual labeling. Additionally, automated data labeling can help improve the accuracy and consistency of labeled data, resulting in more reliable and robust AI models. 
Let's take a closer look at automated data labeling, including its workings, advantages, and how Encord can assist you in automating your data labeling process. Using Annotation Tools for Automated Data Labeling Automated data labeling is using software tools and algorithms to automatically annotate or tag data with labels or tags that help identify and classify the data. This process is used in machine learning and data science to create training datasets for machine learning models. “Automated data annotation is a way to harness the power of AI-assisted tools and software to accelerate and improve the quality of creating and applying labels to images and videos for computer vision models.” – Frederik H. The Full Guide to Automated Data Annotation. Annotation tools can be used for automated data labeling by providing a user interface for creating and managing annotations or labels for a dataset. These tools can help to automate the process of labeling data by providing features such as: Auto-labeling: Annotation tools can use pre-built machine learning models or algorithms to generate labels for data automatically. Active learning: Annotation tools can use machine learning algorithms to suggest labels for data based on patterns and correlations in the existing labeled data. Human-in-the-loop: Annotation tools can provide a user interface for human annotators to review and correct the labels generated by the automation process. Quality control: Annotation tools can help to ensure the quality of the labels generated by the automation process by providing tools for validation and verification. Data management: Annotation tools can provide tools for managing and organizing large datasets, including tools for filtering, searching, and exporting data. Organizations can reduce the time and cost required to create high-quality training datasets for machine learning models by using annotation tools for automated data labeling. 
However, it is important to ensure that the tools used are appropriate for the specific task and that the labeled data is carefully validated and verified to ensure its quality. AI Annotation Tools 💡Check out our curated list of the 9 Best Image Annotation Tools for Computer Vision to discover what other options are on the market. Encord Annotate Encord Annotate is an automated annotation platform that performs AI-assisted image annotation, video annotation, and dataset management; part of the Encord product, alongside Encord Active. The key features of Encord Annotate include: Support for all annotation types such as bounding boxes, polygons, polylines, image segmentation, and more. It incorporates auto-annotation tools such as Meta’s Segment Anything Model and other AI-assisted labeling techniques. It has integrated MLOps workflow for computer vision and machine learning teams Use-case-centric annotations — from native DICOM & NIfTI annotations for medical imaging to SAR-specific features for geospatial data. Easy collaboration, annotator management, and QA workflows — to track annotator performance and increase label quality. Robust security functionality — label audit trails, encryption, FDA, CE Compliance, and HIPAA compliance. Benefits of Automated Data Labeling with AI Annotation Tools The most straightforward way to label data is to implement it manually, where a human user is presented with raw unlabeled data and applies a set of rules to label it. However, this approach has certain drawbacks such as being time-consuming and costly and having a higher probability of natural human error. An alternative approach is to use AI annotation tools to automate the labeling process, which can help address the issues associated with manual labeling by: Increasing accuracy and efficiency: Speed is just as important as being accurate. 
Yes, an automatic AI annotation tool can process large amounts of images much faster than a human can, but what makes it so effective is its ability to remain accurate, which ensures labels are precise and reliable. Improving productivity and workflow: It’s normal for humans to make mistakes – especially when they are performing the same task for 8 or more hours straight. When you use an AI-assisted labeling tool, the workload is significantly reduced, which means annotating teams can put more focus on ensuring things are labeled correctly the first time around. Reduction in labeling costs and resources: Deciding to manually annotate data means paying someone or a group of people to carry out the task; this means each hour that goes by has a cost, which can quickly become extremely high. An AI-assisted labeling tool may take off some of that load by allowing a human annotation team can manually label a percentage of the data and then have an AI tool do the rest. How to Automate Data Labeling with Encord A step-by-step guide to automating data labeling with Encord: Micro models Micro-models are models that are designed to be overtrained for a specific task or piece of data, making them effective in automating one aspect of data annotation workflow. They are not meant to be good at solving general problems and are typically used for a specific purpose. 💡Read the blog to find out more about micro-models. The main difference between a traditional model and a micro-model is not in their architecture or parameters but in their application domain, the data science practices used to create them, and their ultimate end-use. Step 1: Step 2: Auto-segmentation Auto-segmentation is a technique that involves using algorithms or annotation tools to automatically segment an image or video into different regions or objects of interest. This technique is used in various industries, including medical imaging, object detection, and scene segmentation. 
For example, in medical imaging, auto-segmentation can be used to identify and segment different anatomical structures in images, such as tumors, organs, and blood vessels. This can help medical professionals to make more accurate diagnoses and treatment plans Auto-segmentation can potentially speed up the image analysis process and reduce the likelihood of human error. However, it is important to note that the accuracy of auto-segmentation algorithms depends on the input data quality and the segmentation task's complexity. In some cases, manual review and correction may still be necessary to ensure the accuracy of the results. 💡Read the explainer blog on Segment Anything Model to understand how foundation models are used for auto-segmentation. Interpolation Interpolation is typically used to fill in missing values or smooth the noise in a dataset. It encompasses the process of estimating the value of a function at points that lie between known data points. Several methods can be used for interpolation in ML such as linear interpolation, polynomial interpolation, and spline interpolation. The choice of interpolation method will depend on the data's characteristics and the project's goals. Step 1: Step 2: Object Tracking Object tracking plays a vital role in various applications like security and surveillance, autonomous vehicles, video analysis, and many more. It’s a crucial component of computer vision that enables machines to track and follow objects in motion Using object tracking, you will be able to predict the position and other relevant information of moving objects in a video or image sequence. Step 1: Step 2: 💡Check out the Complete Guide to Object Tracking Tutorial to for more insight.. Conclusion Supervised machine learning algorithms depend on labeled data to learn how to generalize to unseen instances. 
The quality of data provided to the model has a significant impact on its final performance, hence it’s vital the data is accurately labeled and representative of the data available in a real-world scenario; this means AI teams often spend a large portion of their time preparing and labeling their data before it reaches the model training phase. Manually labeling data is slow, tedious, expensive, and prone to human error. One way to mitigate this issue is with automated data labeling and annotation solutions. Such tools can serve as a cost-effective way to accurately speed up the process, which in turn improves the team’s productivity and workflow. Ready to accelerate the automation of your data annotation and labeling? Sign-up for an Encord Free Trial: The Active Learning Platform for Computer Vision, used by the world’s leading computer vision teams. AI-assisted labeling, model training & diagnostics, find & fix dataset errors and biases, all in one collaborative active learning platform, to get to production AI faster. Try Encord for Free Today. Want to stay updated? Follow us on Twitter and LinkedIn for more content on computer vision, training data, and active learning. Join our Discord channel to chat and connect. Automated Data Labeling FAQs What are the benefits of automated data labeling? Automated data labeling helps to increase the accuracy and efficiency of the labeling process in contrast to when it’s performed by humans. It also reduces labeling costs and resources as you are not required to pay labelers to perform the tasks. How is automated data labeling different than manual labeling? Manual data labeling is the process of using individual annotators to assign labels to raw data. Opposingly, automated labeling is the same thing but the responsibility is passed on to machines instead of humans to speed up the process and reduce costs. What is AI data labeling? 
AI data labeling refers to a technique that leverages machine learning to provide one or more meaningful labels to raw data (e.g., images, videos, etc.). This is done with the intent of offering a machine learning model with context to learn input-output mappings from the data and make inferences on new, unseen data.
May 19 2023
4 M
Object Classification with Caltech 101
Object classification is a computer vision technique that identifies and categorizes objects within an image or video. In this article, we give you all the information you need to apply object detection to the Caltech 101 Dataset. Object classification involves using machine learning algorithms, such as deep neural networks, to analyze the visual features of an image and then make predictions about the class or type of objects present in the image. What is Object classification? Object classification is often used in applications such as self-driving cars, where the vehicle must be able to recognize and classify different types of objects on the road, such as pedestrians, traffic signs, and other vehicles. Object classification for self-driving cars Source It’s also used in image recognition tasks, such as identifying specific objects within an image or detecting anomalies or defects in manufacturing processes. Object classification algorithms typically involve several steps, including feature extraction and classification. In the feature extraction step, the algorithm identifies visual features such as edges, shapes, and patterns that are characteristic of the objects in the image. These features are then used to classify the objects into predefined classes or categories, such as "car", "dog", "person", etc. The classification step involves using machine learning algorithms, such as deep neural networks, to analyze the visual features of an image and predict the class or type of object present in the image. The model is trained on a large dataset of labeled images, where the algorithm weights are adjusted iteratively to minimize the error between the predicted and actual labels. Once trained, a computer vision (CV) or machine learning (ML) model can be used to classify objects in new images by analyzing their visual features and predicting the class of the object. 
Object classification is a challenging task due to the variability in object appearance caused by factors such as lighting, occlusion, and pose. However, advances in machine learning and computer vision techniques have significantly improved object classification accuracy in recent years, making it an increasingly important technology in many fields. Importance of Object Classification in Computer Vision Object classification is a fundamental component of many computer vision applications such as autonomous vehicles, facial recognition, surveillance systems, and medical imaging. Here are some reasons why object classification is important in computer vision: Object classification enables algorithmic models to interpret and understand the visual world around them. By identifying objects within an image or video, ML models can extract meaningful information, such as object location, size, and orientation, and use this information to make informed decisions. Object classification is critical for tasks such as object tracking, object detection, and object recognition. These tasks are essential in applications such as autonomous vehicles, where machines must be able to detect and track objects such as pedestrians, other vehicles, and obstacles in real-time. Object classification is a key component of image and video search. By classifying objects within images and videos, machines can accurately categorize and index visual data, making searching and retrieving relevant content easier. Object classification is important for medical imaging, where it can be used to detect and diagnose diseases and abnormalities. For example, object classification can be used to identify cancerous cells within a medical image, enabling early diagnosis and treatment. Overall, object classification is an important task in computer vision, which enables machines to understand and interpret the visual world around them, making it a crucial technology in a wide range of applications. 
Caltech 101 Dataset Caltech 101 is a very popular dataset for object recognition in computer vision. It contains images from 101 object categories like “helicopter”, “elephant”, and “chair”, etc, and background categories that contain the images not from the 101 object categories. There are about 40 to 400 images for each object category, while most classes have about 50 images. Images in the Caltech101 dataset Object recognition algorithms can be divided into two groups: recognition of individual objects, and categories. The training of a machine learning model for individual object recognition is easier. But to build a lightweight, light-invariant, and viewpoint variant model, you need a diverse dataset. Categories are more general, require more complex representations, and are more difficult to learn. The appearance of objects within a given category may be highly variable; therefore the model should be flexible enough to handle this. Many machine learning practitioners or researchers use Caltech 101 dataset to benchmark the state-of-the-art object recognition models. The images in all 101 object categories are captured under varying lighting conditions, backgrounds, and viewpoints, making the dataset a good candidate for training a robust computer vision model. Apart from being used for training object recognition algorithms, Caltech 101 is also used for various other tasks like fine-grained image classification, density estimation, semantic correspondence, unsupervised anomaly detection, and semi-supervised image classification. More examples of images from Caltech101 For example, Caltech 101 was used in the paper AutoAugment:Learning augmentation policies from data. This paper proposes a procedure called AutoAugment to automatically search for improved data augmentation policies. Here, they test the transferable property of the augmentation policies by transferring the policies learned on ImageNet to Caltech 101. 
About the Dataset Research Paper: Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories Authors: Fei-Fei Li, Marco Andreetto, Marc ‘Aurelio Ranzato and Pietro Perona Dataset Size: 9146 Categories: 101 Resolution: ~300200 pixels Dataset Size: 1.2 GB License: Creative Common Attribution 4.0 International Release: September 2003 Webpage: Caltech webpage, TensorFlow webpage, Torchvision webpage Advantages of using Caltech 101 There are several advantages of using Caltech 101 over similar object recognition datasets, such as: Uniform size and presentation: Most of the images within each category are uniform in image size and in the relative position of objects of interest. Low level of clutter/occlusion: Object recognition algorithms usually store features unique to the object. With a low level of clutter and occlusion, the features would be unique and transferable as well. High-quality annotations: The dataset comes with high-quality annotations which are collected from real-world scenarios, making it a more realistic dataset. High-quality annotations are crucial for classification tasks as they provide the ground truth labels necessary to train and evaluate machine learning algorithms, ensuring the accuracy and reliability of the results. Disadvantages of the Caltech 101 There are a few trade-offs with the Caltech 101 dataset which include: Uniform dataset: The images in the dataset are very uniform and usually not occluded. Hence, object recognition models solely trained on this dataset might not perform well in real-world applications. Limited object classes: Although the dataset contains 101 object categories, it may not be representative of all possible object categories which can limit the dataset’s applicability to real-world scenarios. 
For example, medical images, industrial objects like machinery, tools, or equipment, or cultural artifacts like artworks, historical objects, or cultural heritage sites. Aliasing and artifacts due to manipulation: Some images have been rotated and scaled from their original orientation, and suffer from some amount of aliasing. However, analyzing a dataset for a computer vision model requires more detailed information about the dataset. It helps in determining if the dataset is fit for your project. With Encord Active you can easily explore the datasets/labels and its distribution. We get information about the quality of the data and labels. Understanding the data and label distribution, quality, and other related information can help computer vision practitioners determine if the dataset is suitable for their project and avoid potential biases or errors in their models. How to download the Caltech 101 Dataset? Since it’s a popular dataset, there are few dataset loaders available. PyTorch If you want to use Pytorch for downloading the dataset, please follow the documentation. torchvison.datasets.Caltech101() TensorFlow If you are using TensorFlow for building your computer vision model and want to download the dataset, please follow the instructions in the link. To load the dataset as a single Tensor, (img_train, label_train), (img_test, label_test) = tfds.as_numpy(tfds.load('caltech101', split=['train', 'test'], batch_size = -1, as_supervised=True)) And its source code is tfds.datasets.caltech101.Builder You can also find this here. Encord Active We will be downloading the dataset using Encord Active here in this blog. It is an open-source active learning toolkit that helps in visualizing the data, evaluating computer vision models, finding model failure modes, and much more. 
Run the following commands in your favorite Python environment:

python3.9 -m venv ea-venv
source ea-venv/bin/activate
# within venv
pip install encord-active

Or you can install Encord Active from GitHub with the following command:

pip install git+https://github.com/encord-team/encord-active

To check that Encord Active has been installed, run:

encord-active --help

Encord Active ships with many sandbox datasets like MNIST, BDD100K, TACO, and more; the Caltech 101 dataset is one of them. These sandbox datasets are commonly used in computer vision applications for building benchmark models. Now that you have Encord Active installed, download the Caltech 101 dataset by running:

encord-active download

The script asks you to choose a project; navigate the options with ↓ and ↑, select the Caltech-101 train or test dataset, and hit enter. For convenience of analysis, the dataset has been pre-divided into a training set comprising 60% of the data and a testing set comprising the remaining 40%. Easy! Now you have your data. In order to visualize the data in the browser, run:

cd /path/to/downloaded/project
encord-active visualize

The image below shows the webpage that opens in your browser, showing the data and its properties. Let's analyze the properties we can visualize here.

Visualize the data in your browser (data = Caltech-101 training data, 60% of the Caltech-101 dataset)

Data Quality of the Caltech 101 Dataset

We navigate to the Data Quality → Summary page to assess data quality. The summary page contains information like the total number of images, image size distribution, etc. It also gives an overview of how many issues Encord Active has detected in the dataset and tells you which metrics to focus on.

The summary tab of Data Quality

The Data Quality → Explorer page contains detailed information about different metrics and the data distribution under each metric.
A few of the metrics are discussed below:

2D Embeddings

In machine learning, a 2D embedding for an image is a technique to transform high-dimensional image data into a 2D space while preserving the most important features and characteristics of the image. The 2D embedding plot here is a scatter plot, with each point representing a data point in the dataset. The position of each point on the plot reflects the relative similarity or dissimilarity of the data points with respect to each other. For example, select the Box or Lasso Select tool in the upper right corner of the plot. Once you select a region, you can visualize only the images in the selected region.

2D embedding plot in Encord Active using the Caltech101 dataset

The purpose of a 2D embedding plot is to provide an intuitive way of visualizing complex high-dimensional data. It enables the user to observe patterns and relationships that may be difficult to discern in the original high-dimensional space. By projecting the data into two dimensions, the user can see clusters of similar data points, outliers, and other patterns that may be useful for data analysis. The 2D embedding plot can also be used as a tool for exploratory data analysis, as it allows the user to interactively explore the data and identify interesting subsets of data points. Additionally, the 2D embedding plot can serve as a pre-processing step for machine learning tasks such as classification or clustering, as it provides a compact representation of the data that can easily be fed into machine learning algorithms.

Area

The area metric is calculated as the product of image width and image height. Here, the plot shows a wide range of image areas, which indicates that the dataset has diverse image sources. It may need pre-processing to normalize or standardize the image sizes to make them more comparable across different images. Beyond 100,000 pixels, there are only a few images.
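Since the area metric is just width × height, flagging the large outliers discussed above takes only a few lines. A minimal sketch in plain Python (the 100,000-pixel cutoff is the value read off the plot, not a fixed Encord Active parameter):

```python
def image_area(width, height):
    """Area metric: the product of image width and image height, in pixels."""
    return width * height

def flag_large_images(sizes, max_area=100_000):
    """Return indices of images whose area exceeds the cutoff."""
    return [i for i, (w, h) in enumerate(sizes) if image_area(w, h) > max_area]

sizes = [(300, 200), (640, 480), (1024, 768)]  # (width, height) pairs
print(flag_large_images(sizes))  # [1, 2] -- 60,000 px is fine; the rest exceed
```

The flagged indices can then be reviewed, resized, or excluded before training.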
These images are very large compared to the rest of the dataset, and such outliers need to be removed or pre-processed before using the data for training.

An example of images of people's faces in the dataset, using Encord Active to tag them to assess data quality

Aspect Ratios

Aspect ratio is calculated as the ratio of image width to image height. Here, the data distribution shows that the aspect ratio varies from 0.27 to 3.88, which indicates that the dataset is very diverse.

Aspect ratios

The distribution of images with an aspect ratio from 1.34 to 1.98 has the largest density; the rest are outliers that need to be processed or removed. Normalizing the aspect ratios ensures that all images have the same size and shape. This helps in creating a consistent representation of the data, which is easier to process and work with. When the aspect ratios are not normalized, the model has to adjust to the varying aspect ratios of the images in the dataset, leading to longer training times. Normalizing the aspect ratios ensures that the model learns from a consistent set of images, leading to faster training times.

Image Singularity

The image singularity metric gives each image a score that shows the image's uniqueness in the dataset. A score of zero indicates that the image is a duplicate of another image in the dataset, while a score close to one indicates that the image is more unique, i.e., there are no similar images to it. This metric can be useful in identifying duplicate and near-duplicate images in a dataset. Near-duplicate images are images that are not exactly the same but contain the same object shifted, rotated, or blurred. Overall, it helps in ensuring that each object is represented from different viewpoints.
Image singularity is important for small datasets like Caltech 101 because, in such datasets, each image carries more weight in terms of the information it provides for training the machine learning model, so duplicate images may create a bias. When the dataset is small, there are fewer images available for the model to learn from, and it is important to ensure that each image is unique and provides valuable information for the model. Also, small datasets are more prone to overfitting, which occurs when the model learns to memorize the training data rather than generalize to new data. This can happen if there are too many duplicate or highly similar images in the dataset, as the model may learn to rely on these images rather than learning generalizable features. By using the image singularity metric to identify and remove duplicate or highly similar images, we can ensure that the small dataset is diverse and representative of the objects or scenes we want the model to recognize. This can help to prevent overfitting and improve the generalizability of the model to new data. In order to find the exact duplicates, select the image singularity filter and set the score to 0. We observe that the Caltech 101 dataset contains 46 exact duplicates.

Image singularity metric showing the exact-duplicate images.

Setting the score range to 0–0.1, we get the exact duplicates and the near-duplicates. The near-duplicates are also visualized side-by-side so that it is easier to filter them out. Depending on the size of the dataset, it is important to select different thresholds to filter out the near-duplicates. For visualization, we have selected a score range of 0–0.1.

Selecting different thresholds to filter out the near duplicates

The range consists of 3,281 images, which accounts for nearly 60% of the training dataset.
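Encord Active's singularity metric is embedding-based, but the core idea of scoring duplicates can be illustrated with a simple average hash, where a Hamming distance of zero marks an exact or near-exact duplicate. This is a sketch of the concept, not the metric's actual implementation:

```python
def average_hash(pixels):
    """pixels: a 2D list of grayscale values.
    Bit = 1 where the pixel is brighter than the image mean."""
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if v > mean else 0 for v in flat)

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return sum(x != y for x, y in zip(a, b))

img = [[10, 200], [220, 30]]
near_dup = [[12, 198], [221, 29]]   # slightly perturbed pixel values
distinct = [[200, 10], [30, 220]]

print(hamming(average_hash(img), average_hash(near_dup)))  # 0 -> duplicate
print(hamming(average_hash(img), average_hash(distinct)))  # 4 -> unrelated
```

In practice the hash would be computed on a downscaled (e.g. 8×8) grayscale version of each image so that small shifts and compression artifacts do not change the bits.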
It's crucial to have these images verified by an annotator or machine learning expert to determine whether or not they should be retained in the dataset.

Blur

Blurring an image refers to intentionally or unintentionally reducing the sharpness or clarity of the image, typically by averaging or smoothing nearby pixel values. It is often added for noise reduction or privacy protection. However, blurred images have a negative effect on the performance of object recognition models trained on them, because blur removes or obscures important visual features of objects, making it more difficult for the model to recognize them. For example, blurring can remove edges, texture, and other fine-grained details that are important for distinguishing one object from another. Hence, for object recognition models, blur is an important data quality metric.

Assessing blur for the data distribution

This data distribution shows blurred images. The blurriness here is computed by applying a Laplacian filter to each image and computing the variance of the output. The distribution above shows that there are a few outliers between -1400 and -600. By detecting and removing these blurred images from the dataset, we can improve the overall quality of the data used to train the model, which can lead to better model performance and generalization.

Example of blurred images.

Label Quality of the Caltech 101 Dataset

To assess label quality, we navigate to the Label Quality → Summary tab and check the overall information about the label quality of the dataset. Information such as object annotations, classification annotations, and the metrics for finding issues in the labels can be found here.

The summary page of label quality.

The Explorer page has information about the labeled dataset at the metric level, so the dataset can be analyzed based on each metric.
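As an aside, the Laplacian-variance blur score described in the Blur section above can be reproduced without any imaging library. A bare-bones sketch on small grayscale grids (Encord Active's reported score may be scaled or signed differently; this shows only the core computation):

```python
def laplacian_variance(img):
    """Variance of the 4-neighbour Laplacian response.
    img: 2D list of grayscale values. Lower variance -> blurrier image."""
    h, w = len(img), len(img[0])
    responses = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (img[y-1][x] + img[y+1][x] + img[y][x-1] + img[y][x+1]
                   - 4 * img[y][x])
            responses.append(lap)
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)

checker = [[0, 255] * 2, [255, 0] * 2] * 2   # sharp, high-contrast edges
flat = [[128] * 4 for _ in range(4)]          # uniform image, no detail
print(laplacian_variance(checker) > laplacian_variance(flat))  # True
print(laplacian_variance(flat))  # 0.0
```

Thresholding such a score over the whole dataset yields exactly the kind of distribution shown in the plot above.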
The metrics flagged in red on the summary page are a good starting point for the label quality analysis. Navigating to Label Quality → Explorer gets you to the explorer page.

Image-level Annotation Quality

The image-level annotation quality metric compares the image classifications against similar objects. It is a ratio where a value of 1 shows that the annotation is correct, whereas a value between 0 and 1 indicates a discrepancy between the annotation and the image classification.

Using the data distribution plot to assess image-level annotation quality

The data distribution plot clearly shows that there are outliers whose image-level annotation quality is questionable. Setting the filter score between 0 and 0.01, we get 92 images that have label errors. This is a significant level of label errors. These errors occur for a variety of reasons, such as human error, inconsistencies in labeling criteria, etc. Label errors can affect the quality and accuracy of computer vision models that use the dataset: incorrect labels can cause the model to learn inaccurate patterns and make inaccurate predictions. With a small dataset like Caltech 101, it is important to fix these label errors. This shows the class-level distribution of the label errors: the classes butterfly and yin_yang have the most label errors. In the above image, we can see the discrepancy between the annotation and what similar objects are tagged as in the class butterfly. Having an annotator review these annotations will improve the label quality of the dataset for building a robust object recognition model.

Object Class Distribution

The distribution-of-classes plot below shows that classes like airplanes, faces, watches, and barrels are over-represented in the Caltech 101 dataset, while the classes below the median are undersampled. Hence, this dataset is imbalanced.
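Checking for this kind of over- and under-representation takes only a label list and a couple of counts. A small sketch (the 2× / 0.5× median cutoffs and the class counts are arbitrary illustrative choices, not Encord Active's definition):

```python
from collections import Counter
from statistics import median

def imbalance_report(labels):
    """Flag classes far above or below the median class count."""
    counts = Counter(labels)
    med = median(counts.values())
    over = [c for c, n in counts.items() if n > 2 * med]
    under = [c for c, n in counts.items() if n < med / 2]
    return med, over, under

labels = (["airplane"] * 100 + ["watch"] * 60
          + ["barrel"] * 20 + ["yin_yang"] * 5)
print(imbalance_report(labels))  # (40.0, ['airplane'], ['yin_yang'])
```

Classes in the `over` list are candidates for undersampling; classes in `under` are candidates for augmentation or additional collection.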
Class imbalance can cause problems in any computer vision model, because such models tend to be biased towards the majority class, resulting in poor performance on the minority class. The model may have high accuracy overall, but it may miss important instances of the minority class, leading to false negatives. If you want to know how to balance your dataset, please read the blog 9 Ways to Balance Your Computer Vision Dataset.

Balancing your computer vision dataset

2D Embeddings

The 2D embedding plot in label quality shows the data points of each image, and each color represents the class the object belongs to. This helps in finding outliers by spotting unexpected relationships or possible areas of model bias for object labels. This plot also shows the separability of the dataset. A separable dataset is useful for object recognition because it allows for the use of simpler and more efficient computer vision models that can achieve high accuracy with relatively few parameters.

2D embedding plot in label quality

Here, we can see that the different classes of objects are well-defined and can be easily distinguished from each other using a simple decision boundary. Hence, the Caltech 101 dataset is separable. A separable dataset is a useful starting point for object recognition, as it allows us to quickly develop and evaluate simple machine learning models before exploring more complex models if needed. It also helps us better understand the data and the features that distinguish the different classes of objects, which can be useful for developing more sophisticated models in the future.

Model Quality Analysis

We have trained a benchmark model on the training set of Caltech 101 and will evaluate the model's performance on the testing set. We need to import the predictions into Encord Active to evaluate the model. To find out how to import your predictions into Encord Active, click here.
After training the benchmark model, it is natural to be eager to assess its performance. By importing your predictions into Encord Active, the platform will automatically compare your ground truth labels with the predictions and provide you with useful insights about the model's performance. Some of the information you can obtain includes:

Class-specific performance results
Precision-Recall curves for each class, as well as class-specific AP/AR results
Identification of the 25+ metrics that have the greatest impact on the model's performance
Detection of true positive and false positive predictions, as well as false negative ground truth objects

The quality analysis of the model here is done on the test data, i.e., the 40% of the Caltech 101 dataset not used in training.

Assessing Model Performance Using Caltech 101

The model accuracy comes to 83% on the test dataset, with a mean precision of 0.75 and a mean recall of 0.71. This indicates that the object recognition model is performing fairly well, but there is still room for improvement. Precision refers to the proportion of true positives out of all the predicted positives, while recall refers to the proportion of true positives out of all the actual positives. A precision of 0.75 means that, of all the objects the model predicted as positive, 75% were actually correct. A recall of 0.71 means that, of all the actual positive objects, the model correctly identified 71%. While an accuracy of 83% may seem good, it's important to consider the precision and recall values as well. Depending on the context and task requirements, precision and recall may be more important metrics to focus on than overall accuracy.

Assessing Model Performance Using Caltech 101

Performance metrics

Our initial aim is to examine which quality metrics have an impact on the performance of the model.
If a metric holds significant importance, it suggests that changes in that metric would considerably influence the model's performance. Here we see that image singularity, area, image-level annotation quality, and aspect ratio are the important metrics. Each of these metrics affects the model positively or negatively. For example:

Image Singularity

Image singularity affects the benchmark model negatively: the more unique an image is, the harder its patterns are for the model to learn, since the model learns object classes from groups of similar (though not exactly duplicate) images.

Area

Image area affects the benchmark model positively. High area values mean that images in the dataset have high resolution, and high-resolution images provide more information for the classification model to learn from. On the other hand, if you are dealing with limited computational resources, high-resolution images can affect the benchmark model negatively: they require large amounts of memory and processing power, making them computationally expensive.

Image-level Annotation

This metric is useful for filtering out the images which are hard for the model to learn. A low score represents hard images, whereas a score close to 1 represents high-quality annotated images that are easy for the model to learn. The high-quality image-level annotations also have no label errors.

Precision-Recall

Precision and recall are two common metrics used to evaluate the performance of a computer vision model. Precision measures the proportion of true positives out of all the predicted positives. Recall, on the other hand, measures the proportion of true positives out of all the actual positives. Precision and recall are often used together to evaluate the performance of a model.
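To make these definitions concrete, here is how precision, recall, and accuracy fall out of raw predictions for a single class (a toy sketch with made-up labels, not how Encord Active computes its aggregates):

```python
def classification_metrics(y_true, y_pred, positive):
    """Precision, recall, and accuracy, treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return precision, recall, accuracy

y_true = ["cat", "cat", "dog", "cat"]
y_pred = ["cat", "dog", "cat", "cat"]
print(classification_metrics(y_true, y_pred, "cat"))
# (0.6666666666666666, 0.6666666666666666, 0.5)
```

Here two of three "cat" predictions are correct (precision 2/3) and two of three actual cats are found (recall 2/3), while only two of four labels match overall (accuracy 0.5).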
In some cases, a high precision score is more important than recall (e.g., in medical diagnoses where a false positive can be dangerous), while in other cases a high recall score is more important than precision (e.g., in spam detection where missing an important message is worse than having some false positives). It's worth noting that precision and recall are trade-offs; as one increases, the other may decrease. For example, increasing the threshold for positive predictions may increase precision but decrease recall, while decreasing the threshold may increase recall but decrease precision. It's important to consider both metrics together and choose a threshold that balances precision and recall based on the specific needs of the problem being solved. We can refer to the average F1 score on the model performance page to consider both metrics. The F1 score is a measure combining the model's precision and recall. It is computed for each class in the dataset and averaged across all classes, which provides insight into the performance of the model across all classes. The precision-recall plot gives an overview of the precision and recall for all object classes in our dataset. We can see that some of the classes, like "Garfield" and "binocular", have huge differences, and hence the threshold needs to be balanced for a high-performing model.

Precision-recall plot

Confusion matrix

A confusion matrix is a table that is often used to evaluate the performance of a machine learning model on a classification task. It summarizes the predicted labels and the actual labels for a set of test data, allowing us to visualize how well the model is performing.

The confusion matrix of the Caltech101 dataset

The confusion matrix of the Caltech 101 dataset shows the object classes which are often confused. For example, the image below shows that the classes Snoopy and Garfield are often confused with each other.
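Both the confusion matrix and the averaged F1 score are straightforward to derive from (actual, predicted) label pairs. A minimal sketch using the confused Snoopy/Garfield classes as toy data (the labels are made up for illustration):

```python
from collections import Counter

def confusion_matrix(y_true, y_pred):
    """Map (actual, predicted) label pairs to their counts."""
    return Counter(zip(y_true, y_pred))

def macro_f1(y_true, y_pred):
    """F1 computed per class, then averaged across all classes."""
    cm = confusion_matrix(y_true, y_pred)
    classes = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in classes:
        tp = cm[(c, c)]
        fp = sum(n for (t, p), n in cm.items() if p == c and t != c)
        fn = sum(n for (t, p), n in cm.items() if t == c and p != c)
        f1s.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return sum(f1s) / len(f1s)

y_true = ["snoopy", "snoopy", "garfield", "garfield"]
y_pred = ["snoopy", "garfield", "garfield", "snoopy"]
print(confusion_matrix(y_true, y_pred)[("snoopy", "garfield")])  # 1
print(macro_f1(y_true, y_pred))  # 0.5 -- each class has tp=1, fp=1, fn=1
```

The off-diagonal counts of the confusion matrix are exactly the confusions the plot below highlights.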
Object class snoopy is confused with Garfield

Performance By Metric

In the Performance by Metric tab, as the name suggests, we can find the true positive rate of the predictions across different metrics. It indicates the proportion of actual positive cases that are correctly identified as positive by the model. We want the true positive rate to be high for all the important metrics (found under model performance). Earlier, we saw that area, image singularity, etc., are some of the important metrics. The plot below shows the predictions' true positive rate with image singularity as the metric.

An example of an average true positive rate

The average true positive rate is 0.8, which is a good result and indicates that the trained baseline model is robust.

Misclassifications

Going to the Model Quality → Explorer tab, we find the filter to visualize the objects which have been wrongly predicted. We can also find the misclassifications in each class.

Conclusion

In this blog, we explored the topic of object classification with a focus on the Caltech 101 dataset. We began by discussing the importance of object classification in computer vision and its various applications. We then introduced the Caltech 101 dataset. We also discussed the importance of data quality in object recognition and evaluated the data quality of the Caltech 101 dataset using the Encord Active tool. We looked at various aspects of data quality, including aspect ratios, image area, blurred images, and image singularity. Furthermore, we evaluated the label quality of the dataset using 2D embeddings and image annotation quality. Next, we trained a benchmark object classification model using the Caltech 101 dataset and analyzed its performance in Encord Active. We also evaluated the quality metrics of the model and identified the misclassified objects in each class. Ready to improve the performance and scale of your object classification models?
Sign up for an Encord Free Trial: The Active Learning Platform for Computer Vision, used by the world's leading computer vision teams. AI-assisted labeling, model training & diagnostics, find & fix dataset errors and biases, all in one collaborative active learning platform, to get to production AI faster. Try Encord for Free Today. Want to stay updated? Follow us on Twitter and LinkedIn for more content on computer vision, training data, and active learning. Join the Slack community to chat and connect.

Further Steps

To further improve the performance of object classification models, there are several possible next steps. Firstly, we can explore more sophisticated models that are better suited to the specific characteristics of the Caltech 101 dataset. Additionally, we can try incorporating other datasets, or even synthetic data, to improve the model's performance. Another possible direction is to investigate the impact of various hyperparameters, such as learning rate, batch size, and regularization, on the model's performance. We can also experiment with different optimization techniques, such as stochastic gradient descent, Adam, or RMSprop, to see how they affect the model's performance. Finally, we can explore the use of more advanced evaluation techniques, such as cross-validation or ROC curves, to better understand the model's performance and identify areas for improvement. By pursuing these further steps, we can continue improving the accuracy and robustness of object classification models, which will significantly impact various fields such as medical imaging, autonomous vehicles, and security systems.
May 05 2023
7 min read
Grounding-DINO + Segment Anything Model (SAM) vs Mask-RCNN: A comparison
Are you looking to improve your object segmentation pipeline with cutting-edge techniques? Look no further! In this tutorial, we will explore zero-shot object segmentation using Grounding-DINO and Segment Anything Model (SAM) and compare its performance to a standard Mask-RCNN model. We will delve into what Grounding-DINO and SAM are and how they work together to achieve great segmentation results. Plus, stay tuned for a bonus on DINO-v2, a groundbreaking self-supervised computer vision model that excels in various tasks, including segmentation. What is Zero-Shot Object Segmentation? Zero-shot object segmentation offers numerous benefits in computer vision applications. It enables models to identify and segment objects within images, even if they have never encountered examples of these objects during training. This capability is particularly valuable in real-world scenarios where the variety of objects is vast, and it is impractical to collect labeled data for every possible object class. By leveraging zero-shot object segmentation, researchers and developers can create more efficient and versatile models that can adapt to new, unseen objects without the need for retraining or obtaining additional labeled data. Furthermore, zero-shot approaches can significantly reduce the time and resources required for data annotation, which is often a major bottleneck in developing effective computer vision systems. In this tutorial, we will show you how you can implement a zero-shot object segmentation pipeline using Grounding-DINO and Segment Anything Model (SAM), a SOTA visual foundation model from Meta. In the end, we will compare its segmentation performance to a standard Mask-RCNN model. First, let’s investigate separately what these foundational models are about. What is Grounding DINO? In the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection" the authors propose a method for improving open-set object detection. 
So, what is open-set object detection? Open-set object detection is a subtask of object detection in computer vision, where the goal is to identify and localize objects within images. However, unlike traditional object detection, open-set object detection recognizes that the model may encounter objects from classes it has not seen during training. In other words, it acknowledges that there might be "unknown" or "unseen" object classes present in real-world scenarios. The challenge in open-set object detection is to develop models that can not only detect known objects (those seen during training) but also differentiate between known and unknown objects and potentially detect and localize the unknown objects as well. This is particularly important in real-world applications where the variety of object classes is immense, and it is infeasible to collect labeled training data for every possible object type. To address this challenge, researchers often employ various techniques, such as zero-shot learning, few-shot learning, or self-supervised learning, to build models capable of adapting to novel object classes without requiring exhaustive labeled data. So, what is Grounding-DINO? In the Grounding-DINO paper, the authors combine DINO, a self-supervised learning algorithm, with grounded pre-training, which leverages both visual and textual information. This hybrid approach enhances the model's ability to detect and recognize previously unseen objects in real-world scenarios. By integrating these techniques, the authors demonstrate improved performance on open-set object detection tasks compared to existing methods. In essence, Grounding-DINO accepts an image and a text prompt as input and outputs 900 object boxes along with their objectness confidence. Each box has similarity scores for each word in the input text. At this point, there are two selection steps: first, the algorithm keeps the boxes whose objectness confidence is above the box_threshold.
For each box, the model extracts the words whose similarity scores are higher than the defined text_threshold.

Source

So, with Grounding-DINO we can get the bounding boxes; however, this only solves half of the puzzle. We also want the objects segmented, so we will employ another foundation model, the Segment Anything Model (SAM), to use the bounding box information to segment the objects.

What is Segment Anything Model (SAM)?

Last week Meta released the Segment Anything Model (SAM), a state-of-the-art image segmentation model that will change the field of computer vision.

Source

SAM is a foundation model. It focuses on promptable segmentation tasks, using prompt engineering to adapt to diverse downstream segmentation problems. SAM's design hinges on three main components:

The promptable segmentation task to enable zero-shot generalization.
The model architecture.
The dataset that powers the task and model.

Source

Learn about all the details of SAM in the full explainer.

Merging Grounding DINO and SAM

Now we know that, given a text prompt and an image, Grounding-DINO can return the most relevant bounding boxes for the prompt. We also know that, given a bounding box and an image, SAM can return the segmented mask inside the bounding box. Now we will stack these two pieces together: given an image and a text prompt (which will include our classes), we will get the segmentation results!

Baseline Results

In Encord Active, we already have predictions for some well-known datasets (COCO, Berkeley Deep Drive, Caltech 101, COVID-19, etc.), so for the baseline we know the result in advance. These model performances were obtained by training a Mask-RCNN model on these datasets and running inference on the test sets. Then we imported the model predictions into Encord Active to visualize the model performance.
In this post, we will only cover two datasets: EA-quickstart (a 200-image subset of COCO) and Berkeley DeepDrive (BDD):

Exploring the quickstart project in Encord Active
Exploring the BDD project in Encord Active

Baseline (inference using trained Mask-RCNN)

The top 5 best-performing classes ranked according to Average Recall (quickstart)
The top 5 best-performing classes ranked according to Average Recall (BDD)

Grounding DINO + SAM result

We first convert the Encord ontology to a text prompt by taking the names of the classes and concatenating them with a period, so our text prompt for Grounding-DINO follows this pattern:

"Class-1 . Class-2 . Class-3 . … . Class-N"

When Grounding-DINO is prompted with this text, it assigns the given classes to bounding boxes if they are above a certain threshold. Then we feed the image and the predicted bounding box to SAM to get the segmentation result. Once we do this for all images in the dataset and get the prediction results, we import these predictions into Encord Active to visualize the model performance. Here are the performance results for Grounding-DINO + SAM.

Grounding DINO + SAM

The top 5 best-performing classes ranked according to Average Recall (EA-quickstart)
The top 5 best-performing classes ranked according to Average Recall (BDD)

Discussion

Although Grounding DINO + SAM is behind Mask RCNN in terms of performance, remember that it was not trained on this dataset at all, so its zero-shot capabilities are very strong in that sense. Moreover, we have not tuned any of its parameters, so there is probably room for improvement. Now, let's investigate the performance of Grounding-DINO + SAM in Encord Active and get insights into the prediction results. First, let's check the most important metrics for the model performance on BDD: Encord Active demonstrates that the most important metrics for the model performance are related to the predicted bounding box size.
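The prompt construction and two-stage thresholding described above can be sketched in plain Python. The class names, scores, and the 0.35/0.25 defaults below are illustrative placeholders, not Grounding-DINO's actual API:

```python
def ontology_to_prompt(class_names):
    """Concatenate class names with ' . ' to build the text prompt."""
    return " . ".join(class_names)

def select_detections(boxes, objectness, word_scores, words,
                      box_threshold=0.35, text_threshold=0.25):
    """Two-stage selection: keep confident boxes, then label each box
    with the words whose similarity score clears text_threshold."""
    results = []
    for box, conf, scores in zip(boxes, objectness, word_scores):
        if conf < box_threshold:
            continue  # stage 1: discard low-objectness boxes
        label = " ".join(w for w, s in zip(words, scores) if s > text_threshold)
        results.append((box, conf, label))
    return results

print(ontology_to_prompt(["car", "person", "traffic light"]))
# car . person . traffic light

words = ["car", "person"]
boxes = [(0, 0, 10, 10), (5, 5, 20, 20)]
objectness = [0.9, 0.2]                 # second box fails box_threshold
word_scores = [[0.8, 0.1], [0.7, 0.6]]
print(select_detections(boxes, objectness, word_scores, words))
# [((0, 0, 10, 10), 0.9, 'car')]
```

Each surviving (box, label) pair is then handed to SAM as a box prompt for mask prediction.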
As the graph on the right-hand side shows, size has a positive correlation with performance. Let's investigate this metric further by examining the Object Area - Relative plot. To do that in Encord Active, go to the Performance by Metric tab under Model Quality and choose the Object Area - Relative (P) metric under the Filter tab. As the above graph demonstrates, there are certainly some problems with small objects. The upper plot shows the precision (TP/(TP+FP)) with respect to the predicted object size. When the predictions are small, the ratio of false positives to true positives is higher, which leads to low precision. In other words, the model generates many small bounding boxes that are wrong. The conclusion to draw from this graph is that we need to check our small predictions and investigate the patterns among them. The lower plot shows the false negative rate (FN/(FN+TP)) with respect to the object size. It is clear that the model cannot detect small objects, so we need to review the small objects and try to understand the possible reason. To review the small objects, go to the Explorer page under the Model Quality tab and filter the False Positive and False Negative samples according to the Object Area - Relative (P) metric.

An example of a very small segmented region, which is a false positive
An example of a (very problematic) false negative

As seen from the examples above, Encord Active can clearly outline the issues in the Grounding-DINO + SAM method. As a next step, our zero-shot object segmentation pipeline can be fine-tuned to improve its performance. Possible actions to improve performance based on the above insights are:

Very small bounding box predictions obtained from Grounding-DINO can be eliminated by a size threshold.
The confidence value for the box_threshold can be reduced to obtain more candidate bounding boxes, increasing the probability of catching the ground truths.
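The first of those actions takes only a few lines; the min_relative_area value below is an illustrative choice, not a tuned or recommended threshold:

```python
def drop_small_boxes(boxes, image_w, image_h, min_relative_area=0.001):
    """Discard predicted (x1, y1, x2, y2) boxes whose area is a tiny
    fraction of the image area."""
    kept = []
    for x1, y1, x2, y2 in boxes:
        relative_area = ((x2 - x1) * (y2 - y1)) / (image_w * image_h)
        if relative_area >= min_relative_area:
            kept.append((x1, y1, x2, y2))
    return kept

boxes = [(0, 0, 4, 4), (100, 100, 400, 300)]
print(drop_small_boxes(boxes, 640, 480))
# [(100, 100, 400, 300)] -- the 4x4 box covers ~0.005% of the image
```

Applying such a filter before the SAM stage would remove exactly the tiny false-positive regions surfaced in the explorer above.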
Bonus: DINO-v2

Meta AI introduced DINO-v2, a groundbreaking self-supervised computer vision model, delivering unmatched performance without the need for fine-tuning. This versatile backbone is suitable for various computer vision tasks, excelling in classification, segmentation, image retrieval, and depth estimation. Built upon a vast and diverse dataset of 142 million images, this cutting-edge model overcomes the limitations of traditional image-text pretraining methods. With efficient implementation techniques, DINO-v2 delivers twice the speed and uses only a third of the memory compared to its predecessor, enabling seamless scalability and stability. The next blog post will incorporate DINO-v2 into Grounding-SAM for better object detection.

Conclusion

We have explored the powerful zero-shot object segmentation pipeline using Grounding DINO and the Segment Anything Model (SAM) and compared its performance to a standard Mask-RCNN model. Grounding DINO leverages grounded pre-training and self-supervised learning to improve open-set object detection, while SAM focuses on promptable segmentation tasks using prompt engineering. Although Grounding DINO + SAM is currently behind Mask-RCNN in terms of performance, its zero-shot capabilities are impressive, and there is potential for improvement with parameter tuning. Additionally, with DINO-v2 just out, it looks promising. We will run that experiment next and post the results here when finished.
Apr 21 2023
5 M
How To Fine-Tune Segment Anything
Computer vision is having its ChatGPT moment with the release of the Segment Anything Model (SAM) by Meta last week. Trained on over 1 billion segmentation masks, SAM is a foundation model for predictive AI use cases rather than generative AI. While it has shown an incredible amount of flexibility in its ability to segment over wide-ranging image modalities and problem spaces, it was released without “fine-tuning” functionality. This tutorial will outline some of the key steps to fine-tune SAM using the mask decoder, in particular describing which functions from SAM to use to pre/post-process the data so that it's in good shape for fine-tuning.

What is the Segment Anything Model (SAM)?

The Segment Anything Model (SAM) is a segmentation model developed by Meta AI. It is considered the first foundation model for computer vision. SAM was trained on a huge corpus of data containing millions of images and over a billion masks, making it extremely powerful. As its name suggests, SAM is able to produce accurate segmentation masks for a wide variety of images. SAM’s design allows it to take human prompts into account, making it particularly powerful for human-in-the-loop annotation. These prompts can be multi-modal: they can be points on the area to be segmented, a bounding box around the object to be segmented, or a text prompt about what should be segmented.

The model is structured into three components: an image encoder, a prompt encoder, and a mask decoder. The image encoder generates an embedding for the image being segmented, whilst the prompt encoder generates embeddings for the prompts. The image encoder is a particularly large component of the model. This is in contrast to the lightweight mask decoder, which predicts segmentation masks based on the embeddings. Meta AI has made the weights and biases of the model trained on the Segment Anything 1 Billion Mask (SA-1B) dataset available as a model checkpoint.
Learn more about how Segment Anything works in our explainer blog post Segment Anything Model (SAM) Explained.

What is Model Fine-Tuning?

Publicly available state-of-the-art models have a custom architecture and are typically supplied with pre-trained model weights. If these architectures were supplied without weights, the models would need to be trained from scratch by the users, who would need massive datasets to obtain state-of-the-art performance.

Model fine-tuning is the process of taking a pre-trained model (architecture + weights) and showing it data for a particular use case. This will typically be data that the model hasn’t seen before, or that is underrepresented in its original training dataset. The difference between fine-tuning a model and starting from scratch is the starting value of the weights and biases. If we were training from scratch, these would be randomly initialized according to some strategy; in such a starting configuration, the model would ‘know nothing’ of the task at hand and perform poorly. By using pre-existing weights and biases as a starting point, we can ‘fine-tune’ them so that the model works better on our custom dataset. For example, the information learned to recognize cats (edge detection, counting paws) will be useful for recognizing dogs.

Why Would I Fine-Tune a Model?

The purpose of fine-tuning a model is to obtain higher performance on data that the pre-trained model has not seen before. For example, an image segmentation model trained on a broad corpus of data gathered from phone cameras will have mostly seen images from a horizontal perspective. If we tried to use this model for satellite imagery taken from a vertical perspective, it may not perform as well. If we were trying to segment rooftops, the model may not yield the best results.
The pre-training is useful because the model will have learned how to segment objects in general, so we want to take advantage of this starting point to build a model that can accurately segment rooftops. Furthermore, it is likely that our custom dataset would not have millions of examples, so we want to fine-tune instead of training the model from scratch. Fine-tuning is desirable so that we can obtain better performance on our specific use case without having to incur the computational cost of training a model from scratch.

How to Fine-Tune Segment Anything Model [With Code]

Background & Architecture

We gave an overview of the SAM architecture in the introduction section. The image encoder has a complex architecture with many parameters. In order to fine-tune the model, it makes sense for us to focus on the mask decoder, which is lightweight and therefore easier, faster, and more memory-efficient to fine-tune.

In order to fine-tune SAM, we need to extract the underlying pieces of its architecture (image and prompt encoders, mask decoder). We cannot use SamPredictor.predict (link) for two reasons:

We want to fine-tune only the mask decoder.

This function calls SamPredictor.predict_torch, which has the @torch.no_grad() decorator (link), which prevents us from computing gradients.

Thus, we need to examine the SamPredictor.predict function and call the appropriate functions with gradient calculation enabled on the part we want to fine-tune (the mask decoder). Doing this is also a good way to learn more about how SAM works.

Creating a Custom Dataset

We need three things to fine-tune our model:

Images on which to draw segmentations

Segmentation ground truth masks

Prompts to feed into the model

We chose the stamp verification dataset (link) since it has data that SAM may not have seen in its training (i.e., stamps on documents). We can verify that it performs well, but not perfectly, on this dataset by running inference with the pre-trained weights.
The ground truth masks are also extremely precise, which will allow us to calculate accurate losses. Finally, this dataset contains bounding boxes around the segmentation masks, which we can use as prompts to SAM. An example image is shown below. These bounding boxes align well with the workflow that a human annotator would go through when looking to generate segmentations.

Input Data Preprocessing

We need to preprocess the scans from numpy arrays to pytorch tensors. To do this, we can follow what happens inside SamPredictor.set_image (link) and SamPredictor.set_torch_image (link), which preprocess the image. First, we can use utils.transform.ResizeLongestSide to resize the image, as this is the transformer used inside the predictor (link). We can then convert the image to a pytorch tensor and use the SAM preprocess method (link) to finish preprocessing.

Training Setup

We download the model checkpoint for the vit_b model and load it in:

sam_model = sam_model_registry['vit_b'](checkpoint='sam_vit_b_01ec64.pth')

We can set up an Adam optimizer with defaults and specify that the parameters to tune are those of the mask decoder:

optimizer = torch.optim.Adam(sam_model.mask_decoder.parameters())

At the same time, we can set up our loss function, for example Mean Squared Error:

loss_fn = torch.nn.MSELoss()

Training Loop

In the main training loop, we will be iterating through our data items, generating masks, and comparing them to our ground truth masks so that we can optimize the model parameters based on the loss function.

In this example, we used a GPU for training since it is much faster than using a CPU. It is important to use .to(device) on the appropriate tensors to make sure that we don’t have certain tensors on the CPU and others on the GPU.

We want to embed images by wrapping the encoder in the torch.no_grad() context manager, since otherwise we will have memory issues, along with the fact that we are not looking to fine-tune the image encoder.
with torch.no_grad():
    image_embedding = sam_model.image_encoder(input_image)

We can also generate the prompt embeddings within the no_grad context manager. We use our bounding box coordinates, converted to pytorch tensors.

with torch.no_grad():
    sparse_embeddings, dense_embeddings = sam_model.prompt_encoder(
        points=None,
        boxes=box_torch,
        masks=None,
    )

Finally, we can generate the masks. Note that here we are in single-mask generation mode (in contrast to the 3 masks that are normally output).

low_res_masks, iou_predictions = sam_model.mask_decoder(
    image_embeddings=image_embedding,
    image_pe=sam_model.prompt_encoder.get_dense_pe(),
    sparse_prompt_embeddings=sparse_embeddings,
    dense_prompt_embeddings=dense_embeddings,
    multimask_output=False,
)

The final step here is to upscale the masks back to the original image size, since they are low resolution. We can use Sam.postprocess_masks to achieve this. We will also want to generate binary masks from the predicted masks so that we can compare these to our ground truths. It is important to use torch functionals in order not to break backpropagation.

upscaled_masks = sam_model.postprocess_masks(low_res_masks, input_size, original_image_size).to(device)

from torch.nn.functional import threshold, normalize

binary_mask = normalize(threshold(upscaled_masks, 0.0, 0)).to(device)

Finally, we can calculate the loss and run an optimization step:

loss = loss_fn(binary_mask, gt_binary_mask)
optimizer.zero_grad()
loss.backward()
optimizer.step()

By repeating this over a number of epochs and batches, we can fine-tune the SAM decoder.

Saving Checkpoints and Starting a Model from It

Once we are done with training and satisfied with the performance uplift, we can save the state dict of the tuned model using:

torch.save(model.state_dict(), PATH)

We can then load this state dict when we want to perform inference on data that is similar to the data we used to fine-tune the model.
You can find the Colab Notebook with all the code you need to fine-tune SAM here. Keep reading if you want a fully working solution out of the box!

Fine-Tuning for Downstream Applications

While SAM does not currently offer fine-tuning out of the box, we are building a custom fine-tuner integrated with the Encord platform. As shown in this post, we fine-tune the decoder in order to achieve this. This is available as an out-of-the-box one-click procedure in the web app, where the hyperparameters are automatically set.

Original vanilla SAM mask:

Mask generated by the fine-tuned version of the model:

We can see that this mask is tighter than the original mask. This was the result of fine-tuning on a small subset of images from the stamp verification dataset and then running the tuned model on a previously unseen example. With further training and more examples, we could obtain even better results.

🔥 NEW RELEASE: We released TTI-Eval (text-to-image evaluation), an open-source library for evaluating zero-shot classification models like CLIP and domain-specific ones like BioCLIP against your (or HF) datasets to estimate how well the model will perform. Get started with it on GitHub, and do ⭐️ the repo if it's awesome. 🔥

Conclusion

That's all, folks! You have now learned how to fine-tune the Segment Anything Model (SAM). If you're looking to fine-tune SAM out of the box, you might also be interested to learn that we have recently released the Segment Anything Model in Encord, allowing you to fine-tune the model without writing any code.
Apr 13 2023
10 M
Exploring the RarePlanes Dataset
RarePlanes is an open-source machine learning dataset that incorporates both real and synthetically generated satellite imagery. The RarePlanes dataset is built using high-resolution data to test the value of synthetic data from an overhead perspective. This dataset showcases the value of synthetic data and how it aids computer vision algorithms in their ability to automatically detect aircraft and their attributes in satellite imagery. Before we discuss the RarePlanes dataset and build the aircraft detection model, let’s have a look at the importance of synthetic data in computer vision.

Importance of Synthetic Data

Building a computer vision algorithm requires a large amount of annotated data. However, developing such datasets is often labor-intensive, time-consuming, and costly. The alternative to manually annotating training data is to either use platforms that automatically annotate the data or to create computer-generated images and annotations. These computer-generated images and annotations are known as synthetic data. Synthetic data has become increasingly important in the field of computer vision for numerous reasons:

Data Scarcity: One of the biggest challenges in computer vision is the availability of sufficient training data. Synthetic data can be generated quickly and in large quantities, providing researchers and developers with the data they need to train and test their models.

Data Variety: Real-world data can be limited in terms of its variety and diversity, making it difficult to train models that are robust to different scenarios. Synthetic data can be generated to include a wide variety of conditions and scenarios, allowing models to be trained on a more diverse range of data.

Data Quality: Real-world data can be noisy and contain errors, which can negatively impact model performance. Synthetic data can be generated with a high level of precision and control, resulting in cleaner and more accurate data.
Cost-effectiveness: Collecting real-world data can be time-consuming and expensive, particularly for complex tasks such as object detection or semantic segmentation. Synthetic data can be generated at a fraction of the cost of collecting real-world data, making it an attractive option for researchers and developers.

So, synthetic data has become an important tool in computer vision, enabling researchers and developers to train and test models more effectively and efficiently, and helping to overcome some of the challenges associated with real-world data. Pairing synthetic data with real datasets helps to improve the accuracy and diversity of the training data, which in turn can improve the performance and robustness of machine learning models. Additionally, using synthetic data can help to overcome limitations and biases in real-world data, and can be particularly useful in cases where real data is scarce or expensive to collect. One such example is the RarePlanes dataset, which is made up of both synthetic and real-world data.

Let’s analyze the RarePlanes dataset and learn more about it! We will use Encord Active to analyze the dataset, the labels, and the model predictions. By the end, you will know how to analyze a dataset and how to build a robust computer vision model.

RarePlanes Dataset

RarePlanes is the largest open-source very-high-resolution dataset from CosmiQ Works and AI.Reverie. It incorporates both real and synthetically generated satellite imagery. A large portion of the dataset consists of 253 Maxar WorldView-3 satellite scenes spanning 112 locations and 2,142 km² with 14,700 hand-annotated aircraft. The accompanying synthetic dataset contains 50,000 synthetic images with ~630,000 aircraft annotations.
The aircraft attributes that both the synthetic and real datasets contain are:

Aircraft length
Wingspan
Wing shape
Wing position
Wingspan class
Propulsion
Number of engines
Number of vertical stabilizers
Presence of canards
Aircraft role

One of the unique features of the RarePlanes dataset is that it includes images of rare and unusual aircraft that are not commonly found in other aerial datasets, such as military surveillance drones and experimental aircraft. This makes it a valuable resource for researchers and developers working on object detection in aerial imagery, particularly for applications such as border surveillance, disaster response, and military reconnaissance.

About the Dataset

Research Paper: RarePlanes: Synthetic Data Takes Flight
Authors: Jacob Shermeyer, Thomas Hossler, Adam Van Etten, Daniel Hogan, Ryan Lewis, Daeil Kim
Dataset Size: 14,707 real and 629,551 synthetic annotations of aircraft
Categories: 10
License: CC-4.0-BY-SA license
Release: 4 June 2020
Github: Rareplanes
Webpage: RarePlanes webpage link, RarePlanes public user guide

Downloading the Dataset

So, let’s download the dataset using the instructions from the RarePlanes webpage. It mentions that the dataset is available for free download through Amazon Web Services’ Open Data Program. Here, we will download the RarePlanes dataset using Encord Active!

Installing Encord Active

You can find the installation guide for Encord Active on the documentation page. In order to install Encord Active, you first have to make sure you have Python 3.9 installed on your system.
Now, run the following commands to install Encord Active:

$ python3.9 -m venv ea-venv
$ # On Linux/MacOS
$ source ea-venv/bin/activate
$ # On Windows
$ ea-venv\Scripts\activate
(ea-venv)$ python -m pip install encord-active

To check that Encord Active has been installed, run:

$ encord-active --help

Now that you have Encord Active installed, let’s download the dataset by running:

$ encord-active download

The script will ask you to choose a project; navigate the options with ↑ and ↓ to select the RarePlanes dataset and hit enter. Easy! Now you have your data. To visualize your data in the browser, run:

cd /path/to/downloaded/project
encord-active visualise

The image below shows the webpage that opens in your browser, showing the data and its properties. Let’s analyze the properties we can visualize.

Fig: Visualize the data in your browser

Preliminary Analysis of the Dataset

The Encord Active platform has features for you to analyze the data quality and the label quality. Let’s analyze the dataset.

Understanding Data Quality

First, we navigate to the Data Quality → Explorer tab and check out the distribution of samples with respect to different image-level metrics.

Fig: Explorer tab of Data Quality

Area

The distribution of the area of the images in the dataset is even and constant, since a large percentage of the dataset is generated synthetically. Hence, this does not require much attention during data pre-processing.

Aspect Ratio

The aspect ratio of the images varies from 1 to 1.01, which means there isn’t much variation in image size throughout the dataset. The aspect ratio is computed as the ratio of image width to image height. Since it is very close to 1, the images are square in shape. Since the aspect ratios are constant, it is very likely that all the images in the dataset were taken from a single source, so the images don’t need to be resized during the data preparation process.
Brightness

The brightness of the images is evenly distributed, with the median at 0.3. In the data distribution plot below, it can be seen that there are a few outliers from 0.66 to 0.69. These should be investigated to check their quality before deciding whether they are needed during data preparation. An even distribution ensures that the detection model will learn to recognize objects across a wide range of brightness levels. This is necessary when dealing with real-world datasets, where the brightness of the surroundings is not controlled. Since the distribution is even, the model will be able to detect the airplanes even when it is very sunny or the scene is shadowed by clouds.

Understanding Label Quality

Understanding the label quality of an annotated dataset is crucial when building a machine learning model. Investigating label quality refers to checking the accuracy and consistency of the labels. The accuracy of the computer vision model is highly dependent on label quality. Having a clear understanding of label quality and identifying issues early allows the team to avoid bias and save time and cost in the long run. Let’s have a look at a few of the quality metrics which can be used to understand label quality.

To analyze label quality with Encord Active, go to the Label Quality → Explorer tab and analyze the object-level metrics:

Class Distribution

In total, there are 6,812 annotations for 7 classes. It is clear that there is a high class imbalance, with 2 classes having only 10 annotations while two others have more than 2,500 annotations each. High class imbalance should be resolved because it can lead to a biased and inaccurate detection model: the model will learn to overfit to the majority classes and will not be able to detect the minority classes.
Additionally, high class imbalance can result in reduced model interpretability and transparency, making it difficult to identify and mitigate potential errors or biases.

Object Annotation Quality

This quality metric assigns a score to each item based on its closest neighbors in embedding space. An annotation's quality will be poor if the nearby neighbors have different labels. We note that half of the dataset has high-quality annotations, while the quality of the rest degrades towards 0. This could indicate a number of things, such as annotation mistakes, or that many object classes are semantically highly similar to one another, bringing them closer together in the embedding space. Prior to model training, these should be examined, corrected, and cleaned.

Object Aspect Ratio

The object aspect ratio metric computes the aspect ratio (width/height) of each object. In this case, the object aspect ratio varies greatly, from 0 to 9. Even if we eliminate the outliers above 3, there is still great variability. This can cause scale imbalance (box-level scale imbalance) during training. Scale imbalance occurs because a certain range of object sizes is over- or under-represented. For example, here the objects with an aspect ratio close to 1 (0.5–1.3) are over-represented, so the trained model's regions of interest will be biased towards these objects. This can be solved by including a feature pyramid network in the architecture, which helps take into account the diversity of bounding box scales.

Preparing Data for Model Training

The first step in preparing data for model training is selecting an appropriate machine learning model; based on the model we choose, the data needs to be pre-processed and cleaned accordingly. Here we will use a Mask-RCNN architecture to train an airplane detection model. For this, we need our data and annotations in the COCO annotation format. In Encord Active, we can filter and export the data in COCO format.
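The Object Aspect Ratio metric discussed above is straightforward to reproduce: compute width/height per object and bucket the values to spot over-represented ranges. The boxes below are hypothetical COCO-style (x, y, w, h) tuples, and the bin width is an arbitrary assumption:

```python
from collections import Counter

def aspect_ratio(box):
    """Aspect ratio (width / height) of a box given as (x, y, w, h)."""
    x, y, w, h = box
    return w / h

def ratio_histogram(boxes, bin_width=0.5):
    """Bucket object aspect ratios to spot over-represented ranges."""
    bins = Counter()
    for box in boxes:
        r = aspect_ratio(box)
        bins[round(r / bin_width) * bin_width] += 1
    return dict(bins)

# Hypothetical annotations: two near-square boxes, one wide, one tall
boxes = [(0, 0, 100, 100), (0, 0, 110, 100), (0, 0, 300, 100), (0, 0, 50, 100)]
print(ratio_histogram(boxes))
```

A histogram like this makes the kind of box-level scale imbalance described above easy to see at a glance before training.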
We want to generate COCO annotations from Encord Active so that the attributes are compatible with Encord Active; that way, when we train our model and import the predictions back, images can be matched. To download the annotations, follow these steps:

Go to Actions → Filter & Export.
Do not filter the data.
Click Generate COCO file; when the COCO file is generated, click Download to download the COCO annotations.

Now you have the annotations for your dataset. The next step before training is to split the dataset into train, validation, and test sets. Splitting the dataset ensures that the model can generalize well to new, unseen data and prevents overfitting. The train set is used to train the model, the validation set is used to tune hyperparameters, and the test set is used to evaluate the final performance of the model. Now that our data is ready, let’s train our model using Encord Active!

Training a Model

We will use the instructions provided by Encord Active for end-to-end training and evaluation of Mask-RCNN in Pytorch.

Pre-requisites

Anaconda: If you don’t have Anaconda installed, please follow the instructions in the Anaconda Documentation.
Download the environment.yml and save it in the folder where you want to build your project.

Installation

Create a new conda virtual environment using the following command:

# inside the project directory
conda env create -f environment.yml

Verify that the new environment is installed correctly:

conda env list

You should see encord-maskrcnn in the environment list. Now activate it with:

conda activate encord-maskrcnn

Training

Create config.ini by looking at the example_config.ini. For training, only the [DATA], [LOGGING] and [TRAIN] sections should be filled. All set! To start the training, simply run the following command in your conda environment:

python train.py

You can check the training progress on the wandb platform.
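As a side note, the train/validation/test split described above can be sketched as a simple shuffled partition of image ids. The 80/10/10 fractions and the seed are arbitrary assumptions:

```python
import random

def split_dataset(ids, val_frac=0.1, test_frac=0.1, seed=42):
    """Shuffle image ids deterministically and split into train/val/test subsets."""
    ids = list(ids)
    random.Random(seed).shuffle(ids)
    n = len(ids)
    n_val = int(n * val_frac)
    n_test = int(n * test_frac)
    val = ids[:n_val]
    test = ids[n_val:n_val + n_test]
    train = ids[n_val + n_test:]
    return train, val, test

train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))  # 80 10 10
```

Fixing the seed keeps the split reproducible across runs, which matters when comparing models trained on the same data.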
Model checkpoints are saved to the local wandb log folder so that they can be used later for inference. There you go! We have trained an airplane detection model in a few simple steps with Encord Active. You can run inference with your model and check its quality as follows:

Inference

Get the wandb ID of the experiment that you want to use for inference. In your config.ini file, fill out the [INFERENCE] section similar to the example_config.ini. Run the following command to generate the pickle file with Encord Active attributes:

python generate_ea_predictions.py

Then run the following command to import the predictions into Encord Active (if you want more details, please check here):

encord-active import predictions /path/to/predictions.pkl -t /path/to/project

Now you can see the model performance on the Model Quality tab!

Understanding Model Quality

With a newly trained model, we definitely want to jump into checking its performance! After uploading your predictions, Encord Active automatically matches your ground truth labels and predictions and presents valuable information about the performance of the model. For example:

mAP performance for different IoU thresholds
Precision-recall curve for each class and class-specific AP/AR results
Which of the 25+ metrics have the highest impact on model performance
True positive and false positive predictions, as well as false negative ground truth objects

Quality Metrics

We can go to Model Quality → Metrics to get insight into the quality metrics and their impact on model performance. The metric importance graph shows that features like object area, frame object density, and object count are the most important metrics for model performance. High importance for a metric implies that a change in that quantity would strongly affect model performance.
Stay tuned for part two of this analysis in the coming weeks…

Conclusion

Understanding a new dataset can be a daunting and time-consuming task. In this blog, we explored the RarePlanes dataset. Using the Encord Active platform, we analyzed the data and label quality. With a clear understanding of the dataset, we trained a Mask-RCNN model in a few simple steps. We also ran inference and analyzed the model quality and performance. This model can serve as a baseline. If you want to know how to improve your computer vision model, please read this blog. The steps followed throughout this article for dataset analysis and model training can be adapted to other datasets as well.

Want to test your own models?

"I want to get started right away" - You can find Encord Active on GitHub here.
"Can you show me an example first?" - Check out this Colab Notebook.
"I am new, and want a step-by-step guide" - Try out the getting started tutorial.

If you want to support the project, you can help us out by giving a Star on GitHub :)

Want to stay updated?

Follow us on Twitter and LinkedIn for more content on computer vision, training data, MLOps, and active learning. Join the Slack community to chat and connect.
Mar 23 2023
5 M
How to Use the Annotator Training Module
TLDR; The purpose of this post is to introduce the Annotator Training Module we use at Encord to help leading AI companies quickly bring their annotator teams up to speed and improve the quality of the annotations created. We have created the tool to be flexible for all computer vision labeling tasks across various domains, including medical imaging, agriculture, autonomous vehicles, and satellite imaging. It can be used for all annotation types - from bounding boxes and polygons to segmentation, polylines, and classification.

Correct annotations and labels are key to training high-quality machine learning models. Annotated objects can range from simple bounding boxes to complex segmentations. We may require annotators to capture additional data describing the objects they are annotating. Encord’s powerful ontology editor allows us to define nested attributes to capture as much data as needed. Even for seemingly simple object primitives, such as bounding boxes, there may be nuances in the data which annotators need to account for. These dataset-specific idiosyncrasies can be wide-ranging, such as object occlusion or ambiguities in deciding the class of an object. It's critical to ensure consistency and accuracy in the annotation process.

Existing Practices

Data operations teams today follow old and outdated practices, including having teams view the data using simple tools such as video players and then answer questions before starting annotations. This does not address the true complexity of accurately annotating many datasets at the quality level required for machine learning algorithms. Teaching annotators how to work with data, understand labeling protocols, and learn annotation tools can take weeks or even months.
By combining all three into one, and automating the evaluation process, our new module enables a data operations team to scale its efforts across hundreds of annotators in a fraction of the time – allowing for large gains in cost savings and efficiency, and helping teams focus their educational efforts on the most difficult assets to annotate. To this end, Encord Annotate now comes with a powerful Annotator Training Module out of the box, so that annotators can learn what is expected of them during the annotation process.

At a high level, this consists of first adding ground truth annotations to the platform, against which annotators will be evaluated. During the training process, annotators are fed unlabelled items from the ground truth dataset, which they must label. A customizable scoring function converts their annotation performance into numerical scores. These scores can be used to evaluate performance and decide when annotators are ready to progress to a live project.

Using the Annotator Training Module

The guide contains the following steps:

Step 1: Upload Data
Step 2: Set up Benchmark Project
Step 3: Create Ground Truth Labels
Step 4: Set up and Assign Training Projects
Step 5: Annotator Training
Step 6: Evaluation

This walkthrough will show you how to use the Annotator Training Module in the Encord Annotate web app. This entire workflow can also be run programmatically using the SDK.

Step 1: Upload Data

First, you create a new dataset that will contain the data on which your ground truth labels are drawn. For this walkthrough, we have chosen to annotate images of flowers from an open-source online dataset.

Step 2: Set up Benchmark Project

Next, you create a new standard project from the Projects tab in the Encord Annotate app. You name the dataset and add an optional description (we recommend tagging it as a Training Ground Truth dataset). We then attach the dataset created in Step 1 containing the unlabelled flower images.
Now we create an ontology appropriate to the flower labeling use case; we could also attach an existing ontology if we wanted. Here you can see that we are specifying both scene-level classifications and geometric objects (both bounding boxes and polygons). Within the objects being defined, we make use of Encord's flexible ontology editor to define nested classifications. This helps capture all the data describing the annotated objects in one place. Lastly, we create the project.

Step 3: Create Ground Truth Labels

Now that you have created your first benchmark project, you need to create ground truth labels. This can be achieved in two ways. The first option is to have subject matter experts use Encord to manually annotate data units, as shown here with the bounding boxes drawn around the flowers. The second option is to use the SDK to programmatically upload labels that were generated outside Encord. With the ground truth labels created, you can proceed to set up the training projects.

Step 4: Set up and Assign Training Projects

Let us create a training project using the training tab in the project section. Create your training project and add an optional description. It is important to select the same ontology as the benchmark project, because the scoring functions will compare the trainee annotations to the ground truth annotations.

Next, we can set up the scoring, which assigns scores to the annotator submissions. Two key numbers are calculated:

- Intersection over Union (IoU): IoU is calculated for objects such as bounding boxes or polygons. It is the fraction of overlap between the benchmark and trainee annotations.
- Comparison: The comparison checks whether two values are the same, for example the flower species.

You can then use weights to express the relative importance of different components of the annotations.
A higher weight means that a component will be more important in calculating the overall score for an annotator; you can think of it as the number of points available for getting that part correct. Here, we have given the flower species a weight of 100 and the flower color a weight of 10, since the color is less important to our use case: if an annotator misses it or gets it wrong, they miss out on fewer points. Finally, we assign annotators to the training module.

Step 5: Annotator Training

Each annotator will now see the labeling tasks assigned to them.

Step 6: Evaluation

As the creator of the training module, you can see the performance of annotators as they progress through the training. Here you can see that our two trainees are progressing through the training module, having both completed around 20% of the assigned tasks. You can also see their overall score as a percentage, calculated by the scoring function we set up during project setup.

You can dive deeper into individual annotator performance by looking at the Submissions tab, which gives you a preview of annotator submissions. For very large projects, you can use the CSV export function to get all submissions.

Diving deeper into annotator submissions, consider this example where we notice some mistakes our trainee has made. You can see three things:

- The trainee mislabeled the flower species.
- The IoU score for the flower is low (143/200), indicating that the bounding box annotation is not precise.
- The trainee forgot to describe the scene.

By clicking 'View', we can see the annotations and confirm that this is a poor-quality annotation. The ground truth annotation is shown on the left and the trainee's annotation on the right. You can also change the scoring function later, if you decide that certain attributes are more important than others, by navigating to the Settings tab.
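To make the scoring idea concrete, here is a minimal sketch of how such a weighted scoring function can work. The field names, box format, and weights below are illustrative assumptions, not Encord's actual implementation.

```python
def iou(box_a, box_b):
    """Intersection over Union for (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def annotator_score(truth, trainee, weights):
    """Weighted score: the IoU earns a share of the box weight,
    and each exact attribute match earns that attribute's weight."""
    earned = weights["box"] * iou(truth["box"], trainee["box"])
    for attr in ("species", "color"):
        earned += weights[attr] * (truth[attr] == trainee[attr])
    return earned / sum(weights.values())

truth   = {"box": (0, 0, 10, 10), "species": "rose", "color": "red"}
trainee = {"box": (0, 0, 10, 10), "species": "rose", "color": "pink"}
score = annotator_score(truth, trainee, {"box": 200, "species": 100, "color": 10})
```

With these weights, a perfect box and species but a wrong color still earns 300 of the 310 available points, mirroring how the low-weight color attribute costs the trainee only a little.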
Once you have modified the scoring function, hit 'Recalculate Scores' on the Summary tab to get the new scores. As mentioned, you can also download a CSV to perform further programmatic analysis of trainee performance.

Best Practices for Using the Module

To ensure the success of your training project, it is important to follow some best practices when using Encord's Annotator Training Module:

- Define the annotation task clearly: Provide a clear and concise description of the annotation task so that annotators understand the requirements.
- Use reviewed ground truth labels: Reviewed ground truth labels ensure that annotators have a clear understanding of what is required and help measure the accuracy of their annotations.
- Evaluate annotator performance regularly: Regular evaluation ensures that the annotations are of high quality and identifies any areas where additional training may be required.
- Continuously improve the annotation training task: As you progress, review and improve the training tasks you have set up to ensure they meet the project requirements.

Conclusion

Designed to help machine learning and data operations teams streamline their data labeling onboarding by using existing training data to rapidly upskill new annotators, the new Annotator Training Module enables annotators to get up to speed quickly. This rapid onboarding ensures that businesses can derive insights and make better decisions from their data in a timely manner.

Want to stay updated? Follow us on Twitter and LinkedIn for more content on computer vision, training data, and active learning. Join the Slack community to chat and connect.
Mar 08 2023
Step-by-Step Guide: 4 Ways to Debug Computer Vision Models
What is Computer Vision Model Debugging?

We get it: debugging deep learning models can be a complex and challenging task. Whereas software debugging follows a set of predefined rules to find the root of a problem, deep learning models can be very powerful and complicated, which makes it hard to find bugs in them. The more advanced the neural network selected for the model, the more complex the issues it can have, and the more it behaves as a black box. This is especially true for computer vision models, which are known for their ability to learn relevant features from image data. For example, a deep learning model trained on incorrectly preprocessed data can still get good results, but it will fail when tested on a new dataset that has been properly preprocessed.

There are many ways to address these problems in machine learning algorithms and make them work better. First, it is crucial to identify flaws in the input data and labels using tools like Encord Active and to ensure the data is properly preprocessed. This kind of data assessment helps to find potential issues such as missing values, incorrect data, outliers, or a skewed data distribution. Second, it is possible to find the optimal model hyperparameters by monitoring training progress with tools like Weights & Biases or TensorBoard. Monitoring training progress can also help detect model problems early on.

In this post we will cover 4 practical ways to efficiently debug your computer vision models:

- using Encord Active to debug your computer vision dataset
- using Jupyter to debug your computer vision model
- using Weights & Biases to monitor and debug your computer vision model
- using TensorBoard to track the performance of your computer vision model

Before we dive into the tutorial and cover the four ways, let's go over why we need to debug models in the first place.

Why do you need to debug computer vision models?
Debugging is a necessary step in the development of computer vision models and algorithms, which are central to a wide range of applications and use cases, such as image classification, object detection, and segmentation. These models require high precision and accuracy, as even small errors can have significant consequences, so data scientists and ML engineers must be prudent in their debugging process. For example, in the case of object detection, a false positive or false negative can lead to incorrect decision-making and actions: a false medical diagnosis made by an image classifier, or a car accident caused by an autopilot.

Example of a confused computer vision model (source)

Debugging is especially important because these models are trained on large amounts of data, and even small errors can cause significant inaccuracies. ML models are often not as robust as one would think. Debugging models involves analyzing data, testing the model, and identifying potential problems. This includes finding and fixing bugs, optimizing performance, and enhancing accuracy. It is an iterative process that requires a thorough understanding of neural networks (especially convolutional neural networks, CNNs) and of the model, its structure, and the data used for training. It helps to ensure that the models are working correctly and accurately and that they are less susceptible to overfitting.

How to debug machine learning models for computer vision?

Debugging involves several steps that help identify and resolve issues in models:

- data analysis
- model testing
- error analysis
- visualizing predictions
- performing ablation studies

Data analysis

The first step in debugging a computer vision model is to examine the data used for training. The data must be of high quality and free of errors, because the quality of the data directly affects the accuracy of the model.
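A quick first check in this data-analysis step is class balance. The sketch below is a toy illustration: the label list and the "half of a uniform share" threshold are made-up assumptions, not a standard rule.

```python
from collections import Counter

labels = ["cat", "cat", "dog", "cat", "bird", "cat", "cat", "cat"]
counts = Counter(labels)
total = sum(counts.values())
uniform_share = 1 / len(counts)  # share each class would have if balanced

for cls, n in counts.most_common():
    share = n / total
    # Flag classes holding less than half of their "fair" share.
    flag = " <- under-represented" if share < 0.5 * uniform_share else ""
    print(f"{cls}: {n} ({share:.0%}){flag}")
```

Running this on real annotation exports (one label string per object) immediately surfaces the kind of class imbalance that biases a model towards the majority class.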
Analyzing the data involves checking the distribution of the data and ensuring that it is balanced and diverse. A balanced dataset ensures that the model is not biased towards a particular class, leading to better accuracy. Another way to improve data quality is to visualize the model's predictions and compare them to the ground truth. Visualizing predictions helps to identify discrepancies and understand the reasons for errors in the model. Check out this GitHub repo for help with visualizing model predictions. A large-scale, high-quality dataset is a crucial ingredient for state-of-the-art performance of convolutional and transformer models.

Model testing

Model testing involves running trained models on a test dataset and evaluating their performance. This is an essential step in debugging deep learning models, as it helps identify any issues in the model. Loss and accuracy metrics are key indicators of the performance of a computer vision model; by examining them, it is possible to determine whether the model is overfitting or underfitting and to make the necessary adjustments.

Error analysis

After testing the model, the next step is to perform an error analysis: examining the results of model testing to identify any errors or issues that need to be addressed. The objective is to determine the root cause of any errors in the model and to find a solution to fix them.

Error analysis in Encord Active

Perform ablation studies

Ablation studies are a powerful tool for debugging computer vision models. This technique involves removing or changing individual components of the model to determine their impact on its performance and benchmarking the variants against each other.
Ablation studies help to identify the most important components of the model and determine the reasons for any errors.

Example of tracking ablation studies in YOLOX (Source: https://arxiv.org/pdf/2107.08430.pdf)

The next sections will review popular tools for debugging machine learning models. The table below summarizes the debugging methods supported by these tools.

What's the best way to debug a computer vision model?

The best way to debug a computer vision model will vary depending on the specific model and the problem it is being used to solve, but several best practices can make the process more effective:

- Start with a clear understanding of the problem and data: Before diving into the debugging process, it's essential to understand the problem being solved and the data being used. This will guide the process and ensure that the proper steps are taken to fix the model.
- Use a systematic approach: A systematic approach, such as data analysis, model testing, error analysis, and ablation studies (outlined above), helps ensure that all aspects of the model are thoroughly examined and that any issues are identified and addressed.
- Keep detailed records: A detailed record of the debugging process, including any changes made to the model, helps you understand the causes of any issues and track progress in resolving them.
- Work with a team: Collaborating with a team provides multiple perspectives and expertise that can be invaluable in identifying and solving problems more efficiently.
- Utilize visualization tools: Visualization tools help you better understand the model's behavior and make informed decisions about changes that need to be made.

By following these best practices, the debugging process for computer vision models can be optimized, resulting in improved accuracy and state-of-the-art performance.
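The "keep detailed records" practice can be as lightweight as a structured log of each debugging iteration. This is only a sketch; the field names and metric values are invented for illustration.

```python
from datetime import date

debug_log = []

def record(change, metrics, note=""):
    """Append one debugging iteration: what changed and what it scored."""
    debug_log.append({
        "date": date.today().isoformat(),
        "change": change,
        "metrics": metrics,
        "note": note,
    })

record("baseline", {"mAP": 0.31})
record("fixed mislabeled boxes", {"mAP": 0.33}, note="re-labeled 50 boxes")
record("lowered learning rate", {"mAP": 0.36})

# The best iteration so far, by mAP.
best = max(debug_log, key=lambda entry: entry["metrics"]["mAP"])
print(best["change"])
```

Even this simple structure answers the two questions that matter during debugging: what have we tried, and which change actually moved the metric.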
Four Ways To Debug Your Computer Vision Models

Debugging a computer vision model can be a complex and time-consuming process, but it is a critical step in ensuring that your model is accurate, reliable, and performs well. This part explores four different methods for debugging computer vision models. Whether you prefer your favorite Python IDE, Encord Active, Jupyter notebooks, TensorBoard, or Weights & Biases, you will find a solution that fits your needs and provides the features and functionality you require.

Using Encord Active to debug a computer vision model

A data-centric debugging approach focuses on examining the data used for training and testing the model to identify issues that may impact its performance. This approach involves:

- Data cleaning and preprocessing: Ensure the data used for training and testing is clean, complete, and formatted correctly.
- Data visualization: Use visualization techniques like scatter plots, histograms, and box plots to understand the data distribution and identify any outliers or patterns that may affect model performance.
- Data distribution analysis: Analyze the data split (training set, validation set, and test set) to ensure it is representative of the overall data distribution and to avoid overfitting or underfitting the model.
- Feature engineering: Examine the features used for training the model to ensure they are relevant, informative, and not redundant.
- Model evaluation: Use appropriate metrics to evaluate the model's performance on the training and testing data.

By systematically examining the data used for training and testing, a data-centric approach can help identify and resolve issues that significantly impact model performance. Implementing a data-centric pipeline by yourself, however, is time-consuming. Encord Active is an open-source active learning toolkit designed with a data-centric approach in mind.
It detects label errors efficiently, enabling correction with a few simple clicks. Its user-friendly interface and diverse data visualizations make it easy to understand and debug computer vision models. Encord Active groups data debugging into three types:

- Data debugging: selecting the right data to use for labeling, model training, and validation.
- Label debugging: reducing label error rates and label inconsistencies.
- Data-centric model debugging: evaluating model performance on different subsets of the dataset to find the classes on which the model performs poorly.

All these quality metrics are automatically computed across the dataset once the data, labels, and/or model predictions are uploaded to the tool. For example:

- You can easily find labeling errors in your dataset (mislabeled objects, missing labels, inaccurate labels) by providing Encord Active with your dataset and a pretrained model.
- The visual similarity search function allows users to locate similar images and duplicates and monitor label quality.
- Outliers can be identified using precomputed interquartile ranges displayed on the summary page, based on frame-level metrics.

If you'd like to try out Encord Active today, you can check out the GitHub repo here.

Further reading:

- Data-centric Case Study: Improving Analysis and Model Training
- A Practical Guide to Active Learning for Computer Vision

Using Jupyter to debug a computer vision model

Jupyter is a popular open-source tool for scientific computing, data analysis, and visualization. It is a web-based interactive computational environment that allows you to create and share documents containing live code, equations, visualizations, and narrative text. Jupyter notebooks are often used in computer vision research and development to quickly prototype, test, and debug computer vision models.
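To make the notebook workflow concrete, here is the kind of minimal evaluation loop you might step through interactively in a cell. The threshold "model" and sample data are purely illustrative stand-ins.

```python
import pdb  # imported so you can drop into the debugger below

def evaluate(model_fn, test_set):
    """Return accuracy, collecting failures for later inspection."""
    failures = []
    for sample, expected in test_set:
        predicted = model_fn(sample)
        if predicted != expected:
            failures.append((sample, predicted, expected))
            # pdb.set_trace()  # uncomment to inspect this failure live;
            # in Jupyter you can also run the %debug magic after an exception
    accuracy = 1 - len(failures) / len(test_set)
    return accuracy, failures

# Toy "model": call an image bright if its mean pixel value exceeds 0.5.
test_set = [(0.9, "bright"), (0.2, "dark"), (0.6, "bright"), (0.4, "bright")]
acc, failures = evaluate(lambda x: "bright" if x > 0.5 else "dark", test_set)
print(acc, failures)  # 0.75 plus the one misclassified sample
```

Collecting the failures in a list, rather than just counting them, is what makes notebook debugging productive: you can re-render the offending images in the next cell and reason about why the model got them wrong.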
To debug a computer vision model using Jupyter, you first need to install the necessary libraries, such as NumPy, Matplotlib, OpenCV, and TensorFlow. Once these libraries are installed, you can start writing code in Jupyter to load your computer vision model and test it. Jupyter lets you debug the model's code through IPython, an enhanced interactive Python shell that provides several tools to make debugging easier. For example, you can use pdb (the Python debugger) to step through the code and inspect variables, or use the %debug magic command to enter an interactive debugger in Jupyter.

When debugging a computer vision model, it is essential to check its accuracy by evaluating it on a test dataset. To visualize evaluation results, you can use Matplotlib to plot the training loss history and the accuracy of the model, or compare the model's results with the ground truth data.

Another aspect of debugging a computer vision model is inspecting its internal state, such as the weights, biases, and activation functions. In Jupyter, you can do this with TensorFlow's TensorBoard, a tool that provides a visual interface for inspecting the internal state of a TensorFlow model: you can examine the weights and biases, visualize the computation graph, and monitor the performance of the model during training.

Finally, Jupyter provides a way to share and collaborate on computer vision models. By sharing Jupyter notebooks, you can share your code, visualizations, and results with others, making it easier to collaborate with other researchers and engineers on developing and debugging computer vision models.

Using Weights & Biases to debug a computer vision model

Weights & Biases is an AI platform that provides an interactive dashboard to monitor and debug machine learning models.
The platform offers a range of features to analyze the performance of deep neural network models and understand how they make predictions. One key feature of Weights & Biases is the ability to visualize the training process of a machine learning model. The platform provides interactive visualizations of the model's performance over time, including loss, accuracy, and other metrics. These visualizations help identify trends and patterns in the model's performance, making it easier to understand the reasons for poor performance or to improve the model.

Another feature is the ability to analyze the weights and biases of the model. The platform provides detailed information about the model's parameters, including the distribution of weights, the magnitude of biases, and the correlation between parameters. This information helps you understand how the model makes predictions and identify potential issues or areas for improvement.

Weights & Biases also provides an interactive interface for comparing different models and tracking experiment results. The platform lets you compare the performance of different models on a variety of metrics, such as accuracy, precision, and recall, making it easier to compare results and choose the best model.

In addition to the visualization and analysis features, Weights & Biases provides collaboration and sharing features. The platform allows you to share your results with others, which is especially useful for teams working on large, complex machine learning projects. With Weights & Biases, you can also upload your TensorBoard logs to the cloud and keep all your analysis in one location; the metrics tracked by TensorBoard will be logged in native Weights & Biases charts.

To use Weights & Biases, you need to integrate it into your machine learning workflow.
The platform provides an API that allows you to log information about your model. Once the information has been logged, it is available in the Weights & Biases dashboard, where you can view and analyze the results.

Using TensorBoard to debug a computer vision model

TensorBoard is a web-based tool for visualizing and analyzing the performance of machine learning models. To start using it, you need to perform two steps: log data for TensorBoard, and then start TensorBoard. TensorBoard needs data in a specific format to display it. To log data, you can use the SummaryWriter class in PyTorch or the TensorFlow equivalent, which lets you write scalars, images, histograms, and other types of data to a file that TensorBoard can display. After logging the data, start TensorBoard by running the tensorboard command in a terminal or command prompt; this launches a web server that you can access in your browser to view the data.

TensorBoard provides many debugging features, such as:

- view the model's architecture
- track the model's training history
- view histograms and distributions of the model's weights
- compare experiment results

Viewing the model's architecture can help you identify potential issues: you can see the model's graph and the connections between its layers. Viewing the model's performance over time can help you spot patterns and trends, which is especially useful for long training runs, where it is otherwise difficult to keep track of the model's behavior. Histograms and distributions of the model's weights can reveal issues with the weights themselves; for example, if the weights are all the same, this could indicate that the model is not learning from the training data.
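The overfitting pattern you would normally eyeball in TensorBoard's scalar charts can also be checked crudely in code. This is only a toy heuristic over per-epoch losses, with made-up numbers; it stands in for visual inspection, not for any TensorBoard API.

```python
def diagnose(train_loss, val_loss, patience=3):
    """Training loss still falling while validation loss rises over the
    last `patience` epochs is the classic overfitting signature."""
    recent_train = train_loss[-patience:]
    recent_val = val_loss[-patience:]
    train_falling = all(a > b for a, b in zip(recent_train, recent_train[1:]))
    val_rising = all(a < b for a, b in zip(recent_val, recent_val[1:]))
    return "overfitting" if train_falling and val_rising else "ok"

train_loss = [1.0, 0.7, 0.5, 0.35, 0.25, 0.18]
val_loss   = [1.1, 0.8, 0.6, 0.55, 0.62, 0.70]
print(diagnose(train_loss, val_loss))  # overfitting
```

The same per-epoch values would be what you log to TensorBoard as scalars; the point of the sketch is that diverging train and validation curves are a signal worth automating an alert on, not just plotting.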
With TensorBoard, you can visually assess the performance of different experimental outcomes by displaying them side by side. This feature is especially valuable when you are experimenting with varying model structures or hyperparameter values, as it helps you identify the optimal configuration. By using TensorBoard to debug your computer vision model, you can better understand the model's performance and identify any issues. Remember, debugging models can be an iterative process, so it may take several rounds of experimentation and analysis to get the model to perform optimally.

What do I do once I've debugged my computer vision model?

Once a computer vision model has been successfully debugged and its performance has improved, the next step is to deploy it for production use. Deployment involves integrating the model into a larger system, such as a mobile app or a web service, and conducting additional testing to ensure that the model functions as intended in a real-world setting. It is crucial to continuously monitor the performance of the deployed model and make updates as needed to maintain its accuracy and performance over time. This can involve adjusting the model's hyperparameters, retraining the model with new data, or fine-tuning it to address specific issues that arise.

Additionally, it is recommended to keep a record of the debugging process and any changes made to the model, as this information can be valuable for future improvement or debugging efforts. It can also help other team members understand the reasoning behind changes made to the model, which is especially important in larger organizations. Finally, it is essential to consider the overall sustainability and scalability of the model. This may involve designing the model and the deployment system to handle increasing amounts of data and planning for future updates and maintenance.
In conclusion, once a computer vision model has been debugged, it is crucial to deploy it for production use. Once in production, you need to continuously monitor its performance, keep records of changes and the debugging process, and consider the model's sustainability and scalability. These steps will help ensure that the model continues to perform effectively and remains up to date and relevant.

Want to get started with debugging your model in Encord Active?

- "I want to get started right away" - You can find Encord Active on GitHub here.
- "Can you show me an example first?" - Check out this Colab Notebook.
- "I am new, and want a step-by-step guide" - Try out the getting started tutorial.

If you want to support the project, you can help us out by giving it a star on GitHub ⭐

Want to stay updated? Follow us on Twitter and LinkedIn for more content on computer vision, training data, and active learning. Join the Slack community to chat and connect.
Feb 09 2023
Exploring the TACO Dataset [Model Training]
Introduction

Imagine for a moment you are building an autonomous robot that can collect litter on the ground. As a machine learning engineer, you'll be responsible for giving it the power to detect trash in any environment. It's a challenge, but with the right detection system, you'll be one step closer to a cleaner world, and you'll have built a cool artificial intelligence robot. After downloading an open-source trash dataset, you realize that there are multiple challenges for your model, and its performance is very poor even though you applied a state-of-the-art model.

That was where we left off in the first part of this data-centric AI case study series. In the previous post, we analyzed the TACO dataset and its labels and found many issues with the dataset. In this second part of our data-centric AI case study, we will start debugging our dataset and improving the performance of our model step by step.

TLDR; Using a quality metric to calculate the Object Annotation Quality of polygon labels in the popular open-source TACO dataset, we found label errors in ~5% of images. By fixing the label errors, we improved the mAP of a state-of-the-art computer vision model by nearly 50% over the baseline for the class Clear plastic bottle.

Why Data-Centric AI?

As machine learning engineers, it's important to understand the different approaches to building AI models and the pros and cons of each. The data-centric approach emphasizes a static machine learning model and a focus on improving the underlying training data: it suggests a continuous, iterative process of adding high-value training data to improve overall model accuracy and performance. A model-centric approach, on the other hand, is based on a static dataset with a focus on developing and improving machine learning models and algorithms. This is often the approach taught in classrooms and used in cutting-edge research.
In industry, however, the data-centric approach is more prevalent, and it may well drive the future of AI development and the maturation of fields such as MLOps and active learning. As machine learning engineers and data scientists, we must consider the specific requirements and constraints of a problem and choose the approach that best suits the task at hand. In some cases, a combination of both approaches can be used to achieve the best performance. Andrew Ng, whom you all know, is a pioneer in the field of machine learning, and we would go as far as to call him the modern father of data-centric AI (at least he's its most vocal fan). It's worth studying his work to gain a deeper understanding of these approaches and how to apply them in practice. If you are interested, check out DeepLearning.AI's resources.

What did we learn last time?

Let's first recap the challenges discovered in the TACO dataset in our first blog post:

- Class imbalance: The dataset contains a lot of classes (60), and we discovered a high class imbalance. A few classes had close to 0 labels (Aluminium blister pack: 6, Battery: 2, Carded blister pack: 1).
- Similar object classes: Many of the object classes are semantically very similar.
- Small objects: The majority of the objects are very small, e.g. cigarettes, cans, pop tabs.
- Low labeling quality: The label quality of the crowdsourced dataset is much worse than that of the official dataset, so the labels should be reviewed.

We know many of you face these common challenges when improving the models in your own projects. Let us formulate a problem statement and try to improve our model performance step by step.

Problem

We set out to improve the performance of a Mask R-CNN model on the TACO dataset following a data-centric approach. Since the TACO dataset contains 60 classes, we narrow our focus to one class for this tutorial (and to save some computing power).
We ended up picking Clear plastic bottle: water and soft drink bottles made of PET. Why? It is well represented in the dataset (626 annotations in 476 images), and we found that it is often confused with other objects such as Other plastic bottle and Drinking can. Furthermore, it is an interesting and useful use case that could be valuable in countries with deposit return systems (such as the Danish or Swedish systems). For simplicity, we will measure performance with three metrics: mAP, mAR, and our own Object Annotation Quality metric.

About the dataset

- Research paper: TACO: Trash Annotations in Context for Litter Detection
- Authors: Pedro F. Proença, Pedro Simões
- Dataset size: Official: 1500 images, 4784 annotations. Unofficial: 3736 images, 8419 annotations.
- Categories: 60 litter categories
- License: CC BY 4.0
- Release: 17 March 2020
- Read more: GitHub & webpage

Methodology

To improve the machine learning model, we set out to use four strategies: (1) re-labeling bad samples, (2) fixing mislabeled classes, (3) labeling new samples, and (4) data augmentation. In this tutorial we will attempt strategies (1) and (2); in the next tutorial we will try (3) and (4). After each iteration, we will re-train the model with the same neural network architecture on the improved unofficial dataset and compare model performance on the fixed official dataset. Before we jump in, let's quickly cover the Object Annotation Quality metric.

What is the "Object Annotation Quality" Metric?

The Object Annotation Quality metric calculates the deviation of a label's class from those of its nearest neighbors in an embedding space to identify labels that potentially contain errors. Technically, the metric transforms polygons into bounding boxes and extracts an embedding for each bounding box. These embeddings are then compared with their neighbors: if the neighbors are annotated differently, the label is given a low score.
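A stripped-down sketch of that idea follows. It is not Encord Active's actual implementation: the 2-D embeddings and labels are hypothetical, and real embeddings would come from a vision model rather than be hand-written.

```python
import math

def annotation_quality(embeddings, labels, index, k=3):
    """Score one label by agreement among its k nearest neighbours
    in embedding space: 1.0 means full agreement, 0.0 means none."""
    target = embeddings[index]
    dists = sorted(
        (math.dist(target, e), i)
        for i, e in enumerate(embeddings) if i != index
    )
    neighbours = [labels[i] for _, i in dists[:k]]
    return sum(n == labels[index] for n in neighbours) / k

# Hypothetical embeddings: index 3 sits inside the "bottle" cluster
# but is labelled "can", so it gets the lowest possible score.
emb = [(0.1, 0.1), (0.15, 0.12), (0.12, 0.09), (0.11, 0.11), (0.9, 0.9)]
lab = ["bottle", "bottle", "bottle", "can", "can"]
print(annotation_quality(emb, lab, index=3))  # 0.0
```

Sorting a whole class by this score, as we do in the walkthrough below, surfaces exactly these disagreeing labels first.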
You can find the technical implementation in the Encord Active GitHub repository. Great, let us begin.

1st Iteration: Fixing label errors

Analysis

Using the Object Annotation Quality metric, we can see that the Clear plastic bottle class is confused with Other plastic bottle and sometimes even with the Drinking can class.

Tip: You can sort samples of an object class using the Label Quality → Explorer tab and select any quality metric. Simply choose the Object Annotation Quality metric and sort samples in descending order. This will show you the correctly labeled samples for that class first.

Let us visualize the three types of objects to get a sense of what they really represent:

Fig: From left to right: 1) Clear plastic bottle: water and soft drink bottles made of PET. 2) Other plastic bottle: opaque or translucent, generally made of HDPE; includes detergent bottles. 3) Drinking can: aluminum soda can.

The similarity between them may not seem obvious; however, since these annotations are crowdsourced, the annotators may have different levels of context for this use case, and the description of each class may cause confusion (given that there are 60 label classes).

Let's dive into the Encord Active app and open the unofficial dataset. We navigate to the Label Quality → Explorer tab, choose Object Annotation Quality as the metric, and choose Clear plastic bottle as the target class. Sort samples in ascending order to see objects with low annotation quality scores. Here are some examples:

Fig: From left to right: 1) Completely off annotation. 2) Bad annotation. 3) Both a bad annotation and the wrong class (the class should be Drinking can). 4) Wrong class (the class should be Other plastic bottle).

Using Encord Active, we can directly spot these label errors.
Tip: If you have imported your project from your Encord Annotate account, you can click the Editor button under each image to open it, along with its annotations, in the Encord Annotate platform, where you can easily fix the annotations.

Re-labeling

In this first iteration we will fix the labels of all the objects annotated as Clear plastic bottle. By going from low annotation quality to high annotation quality, we address the lowest-quality labels first. Following this approach methodically, we fixed 117 labels in 81 images in a little over an hour. With the data labeling done, let's return to model training.

Model re-training

Let's re-train our machine learning model on the updated dataset to see if the performance for the Clear plastic bottle class changes. Fill in the details of your config.ini file in the project and re-train the model (if you have not done this before, please read the first blog here). After training, we import the predictions into Encord Active as we did in the previous blog post and inspect the performance. We can see that the performance is now at 0.416. The results compared to the baseline are in the table below: the new model's detection performance is 7.7% higher than the baseline on mAP and slightly lower (-1.5%) on mAR.

2nd Iteration: Fixing wrongly labeled objects

We are off to a good start; however, we want to investigate other ways of improving model performance for the Clear plastic bottle class. Next, we will look at fixing class label mistakes. We start by looking at the label quality of the Other plastic bottle objects to identify whether some of them are in fact mislabeled Clear plastic bottles.

Analysis

First, we investigate some of the badly annotated Other plastic bottle labels. We see that many labels in fact belong to the class Clear plastic bottle but are labeled as Other plastic bottle.
So, let's re-label the misclassified images of Other plastic bottle, as we did before.

Re-labeling

1.5 hours later, we have re-labeled 150 labels in nearly 100 images (98, to be exact). Data labeling is time-consuming and tedious, but let's see how powerful it can be as a data-centric approach to improving model performance.

Model re-training

Let's name this dataset Unofficial-v3 and train a new machine learning model. When training is finished, we import the predictions for the official dataset, open the app, and check the performance: it is now at 0.466. What an improvement! By fixing wrongly labeled Other plastic bottle objects, we improved the performance of our Mask R-CNN model by 40% in the mAP score. Had we chosen a model-centric approach and simply tried to fine-tune the model, such a performance increase would have been nearly impossible. Compared to the baseline, we have improved the performance by almost 50%! In this work, we focused on improving the performance of a single class, but similar results could be achieved by doing the same work for other classes.

Conclusion

In the 2nd installment of the data-centric AI case study series, our goal was to improve the performance of our litter detection model for a specific class. To do that, we used a simple data-driven workflow with the open-source active learning tool Encord Active. With the tool we:

- Found and fixed bad labels, leading to higher accuracy on the Clear plastic bottle class.
- Found objects that had been wrongly classified and fixed their labels.

At the end of the two iteration cycles, we improved the mAP of the target class by 47% from the baseline. In the last post of the data-centric AI case study series, we will showcase how to improve class-based model performance by targeting labeling efforts and augmenting current images. Want to test your own models?
- "I want to get started right away" - You can find Encord Active on GitHub here.
- "Can you show me an example first?" - Check out this Colab Notebook.
- "I am new, and want a step-by-step guide" - Try out the getting started tutorial.

If you want to support the project, you can help us out by giving it a star on GitHub :)

Want to stay updated? Follow us on Twitter and LinkedIn for more content on computer vision, training data, MLOps, and active learning. Join the Slack community to chat and connect.
Jan 25 2023
5 M
Exploring the TACO Dataset [Data Analysis]
Introduction

Imagine for a moment that you are building an autonomous robot that can collect litter on the ground. The robot needs to detect the litter in order to collect it, and you, as a machine learning engineer, must build a litter detection system. This project, like most real-life computer vision problems, is challenging due to the variety of objects and backgrounds. In this technical tutorial, we will show you how Encord Active can help you in this process. We will use Encord Active to analyze the dataset, the labels, and the model predictions. By the end, you will know how to build better computer vision models with a data-centric approach.

TL;DR

We set out to analyze the data and label quality of the TACO dataset and prepare it for model training, after which we trained a Mask R-CNN model and analyzed its performance.

- The dataset required significant preprocessing prior to training. It contains various outliers and the images are very large. Furthermore, when inspecting the class distribution we found a high class imbalance.
- We found that most of the objects are very small. In addition, the annotation quality of the crowd-sourced dataset is much worse than that of the official dataset; therefore, annotations should be reviewed.
- Our object detection model performs well on larger objects that are distinct from the background, and worse on small, ill-defined objects.
- Lastly, we inspected overall and class-based performance metrics. We learned that object area, frame object density, and object count have the highest impact on performance, and thus we should pay attention to those when retraining the model.

Okay! Let's go.

The Trash Annotations in Context (TACO) Dataset 🌮

For this tutorial we will use the open-source TACO dataset. The dataset repository contains an official dataset with 1500 images and 4784 annotations and an unofficial dataset with 3736 images and 8419 annotations.
Both datasets contain various large and small litter objects on different backgrounds such as streets, parks, and beaches. The dataset contains 60 categories of litter. Example images from the official dataset:

About the dataset

- Research Paper: TACO: Trash Annotations in Context for Litter Detection
- Authors: Pedro F Proença, Pedro Simões
- Dataset Size: Official: 1500 images, 4784 annotations & Unofficial: 3736 images, 8419 annotations
- Categories: 60 litter categories
- License: CC BY 4.0
- Release: 17 March 2020
- Read more: Github & Webpage

Downloading the Dataset

Let's download the dataset from the official website by following the instructions. Currently, there are two datasets: TACO-Official and TACO-Unofficial. The annotations of the official dataset are collected, annotated, and reviewed by the creators of the TACO project. The annotations of the unofficial dataset are provided by the community and are not reviewed. In this project, we will use the official dataset as the validation set and the unofficial dataset as the training set.

Installing Encord Active

To create a conda virtual environment including all required packages, check the readme of the project.

Importing the TACO Dataset to Encord Active

You can import a COCO project into Encord Active with a single command:

encord-active import project --coco -i ./images_folder -a ./annotations.json

We run the above command twice (once for each dataset). The import might take a while depending on the performance of your machine, so sit back and enjoy some sweet tunes. This command creates a local Encord Active project and pre-computes all your quality metrics. Quality metrics are additional parameterizations added onto your data, labels, and models; they are ways of indexing your data, labels, and models in semantically interesting and relevant ways. Encord Active ships with a range of pre-computed metrics that you can use in your projects, and you can also write your own custom metrics.
Once the project is imported, you will have a local Encord Active project and can start to understand and analyze the dataset.

Understanding Data and Label Quality

First, let's analyze the official dataset. To start the Encord Active application, we run one of the following commands:

# from the project's root
encord-active visualise

# from anywhere
encord-active visualise --target /path/to/project/root

Your browser will open a new window with Encord Active.

Data Quality

Now let's analyze the dataset. First, we navigate to the Data Quality → Explorer tab and check the distribution of samples with respect to different image-level metrics.

Area: The images in the dataset are fairly large; the median image size is around 8 megapixels (roughly 4000x2000). They need to be processed properly before training, otherwise reading them from disk will create a serious bottleneck in the data loader. In addition, we can see in the plot below that there are a few outliers: very large and very small images. Those can be excluded from the dataset.

Aspect Ratio: Most images have an aspect ratio of around 0.8, which suggests that they were mostly taken with mobile phone cameras held vertically. This may create a potential bias when going to production.

Brightness: Most images were taken during the daytime, with a few taken in the evening when it is darker. This will potentially affect the performance of the model when it runs in darker environments.

Label Quality

Next, we analyze the label quality. Go to the Label Quality → Explorer tab and inspect the object-level metrics:

Class distribution: In total, there are 5038 annotations across 59 classes. There is a clear, high class imbalance, with some classes having fewer than 10 annotations and others more than 500.
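The image-level checks above (area, aspect ratio, brightness) are easy to reproduce outside the app; a minimal sketch with Pillow and NumPy (illustrative only, not Encord Active's metric implementation):

```python
from PIL import Image
import numpy as np

def image_stats(path: str) -> dict:
    """Compute the image-level quantities discussed above:
    area in megapixels, aspect ratio (width / height), and mean brightness."""
    img = Image.open(path)
    w, h = img.size
    gray = np.asarray(img.convert("L"), dtype=np.float32) / 255.0
    return {
        "megapixels": (w * h) / 1e6,
        "aspect_ratio": w / h,        # < 1 suggests a vertical (portrait) photo
        "brightness": float(gray.mean()),
    }
```

Running this over a folder of images and plotting the three distributions gives histograms comparable to the ones shown in the Explorer tab.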
Frame Object Density: In most images, objects only cover a small part of the frame, meaning annotation sizes are small compared to the image size. It is therefore important to choose a model accordingly (i.e., avoid models that struggle with small objects).

Object Annotation Quality: This metric scores each object according to its closest neighbors in the embedding space: if the closest neighbors have a different label, the annotation receives a low score. We see that there are many samples with low scores. This may mean several things: different object classes may be semantically very close to each other, which places them close together in the embedding space, or there might be annotation errors. These should be analyzed, cleaned, and fixed before model training. Another interesting point: when the average object annotation quality scores of the unofficial and official datasets are compared (mean score under Annotator statistics), there is a huge difference (0.195 vs. 0.336). The averages show that the unofficial dataset has lower annotation quality, so a review of the annotations should be performed before training.

Fig: Distribution of the object annotation quality for the official dataset.
Fig: Distribution of the object annotation quality for the unofficial dataset.

Object Area - Relative: Most of the objects are very small compared to the image. For this problem, images can be tiled, or frameworks like SAHI can be used.

Takeaways and Insights

Based on the preliminary analysis we can draw the following conclusions:

- The images are very large, with a few outliers; we should downsize the images prior to model training and remove the outliers.
- The aspect ratio of most images shows that they were probably taken on phone cameras held vertically. This could potentially bias our inference if the edge devices are different.
- An underrepresentation of dark (nighttime) images can affect our litter detection system at night in a production environment.
- The dataset has a few overrepresented classes (cigarettes, plastic film, unlabeled litter) and many categories with fewer than 20 annotations.
- The annotation quality is significantly worse in the unofficial dataset than in the official dataset. Whether this is due to label errors or something else needs to be investigated.

Next, let's start to prepare the data for training :)

Preparing Data for Model Training

First, let's filter and export the data and annotations to create COCO annotations. Although the datasets already contain COCO annotations, we want to generate a new COCO file whose attributes are compatible with Encord Active, so that images can be matched when we import our predictions. Go to Actions → Filter & Export. Do not filter the data. Click Generate COCO file; when the COCO file is generated, click Download filtered data to download the COCO annotations. We perform this operation for both the unofficial and official datasets.

In every batch, images are loaded from disk; however, as we concluded above, they are too large to be loaded efficiently. Therefore, to avoid creating bottlenecks in our pipeline, we downscale them before starting the training. An example of this can be seen in the project folder (utils/downscale_dataset.py). We have scaled all the images to 1024x1024. Loaded images will also be scaled internally by our model before being fed through it.

Training a Model

For training, we chose the Mask R-CNN architecture from the Torchvision package. We used PyTorch to train it and Wandb to log the performance metrics.

Create a dataset class

We have implemented the EncordMaskRCNNDataset class, which works with any COCO file generated from Encord Active, so it can be reused beyond this project.
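A minimal sketch of such a downscaling step, assuming Pillow and a flat folder of JPEGs (the project's actual utils/downscale_dataset.py may differ):

```python
from pathlib import Path
from PIL import Image

def downscale_dataset(src: str, dst: str, max_side: int = 1024) -> None:
    """Resize every JPEG so its longest side is at most max_side,
    preserving the aspect ratio. A sketch of the idea behind the
    project's downscaling script, not the script itself."""
    out = Path(dst)
    out.mkdir(parents=True, exist_ok=True)
    for p in Path(src).glob("*.jpg"):
        img = Image.open(p)
        img.thumbnail((max_side, max_side))   # resizes in place, keeps aspect ratio
        img.save(out / p.name, quality=90)
```

Note that if annotations are stored in absolute pixel coordinates, they must be rescaled by the same factor as their image.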
It transforms the input and output into the format that the Mask R-CNN model expects:

```python
class EncordMaskRCNNDataset(torchvision.datasets.CocoDetection):
    def __init__(self, img_folder, ann_file, transforms=None):
        super().__init__(img_folder, ann_file)
        self._transforms = transforms

    def __getitem__(self, idx):
        img, target = super().__getitem__(idx)
        img_metadata = self.coco.loadImgs(self.ids[idx])
        image_id = self.ids[idx]
        img_width, img_height = img.size

        boxes, labels, area, iscrowd = [], [], [], []
        for target_item in target:
            boxes.append(
                [
                    target_item["bbox"][0],
                    target_item["bbox"][1],
                    target_item["bbox"][0] + target_item["bbox"][2],
                    target_item["bbox"][1] + target_item["bbox"][3],
                ]
            )
            labels.append(target_item["category_id"])
            area.append(target_item["bbox"][2] * target_item["bbox"][3])
            iscrowd.append(target_item["iscrowd"])

        segmentations = [obj["segmentation"] for obj in target]
        masks = convert_coco_poly_to_mask(segmentations, img_height, img_width)

        processed_target = {}
        processed_target["boxes"] = torch.as_tensor(boxes, dtype=torch.float32)
        processed_target["labels"] = torch.as_tensor(labels, dtype=torch.int64)
        processed_target["masks"] = masks
        processed_target["image_id"] = torch.tensor([image_id])
        processed_target["area"] = torch.tensor(area)
        processed_target["iscrowd"] = torch.as_tensor(iscrowd, dtype=torch.int64)

        if self._transforms is not None:
            img, processed_target = self._transforms(img, processed_target)

        return img, processed_target, img_metadata

    def __len__(self):
        return len(self.ids)
```

Create data loader

After defining the Torchvision dataset, we create a data loader from it:

```python
dataset = EncordMaskRCNNDataset(
    img_folder=params.data.train_data_folder,
    ann_file=params.data.train_ann,
    transforms=get_transform(train=True),
)

data_loader = torch.utils.data.DataLoader(
    dataset,
    batch_size=params.train.batch_size,
    shuffle=True,
    num_workers=params.train.num_worker,
    collate_fn=collate_fn,
)
```

Create a training loop

The training loop is very straightforward.
For each epoch, we train the model on the training set. Once the training is finished for that epoch, we measure the performance on the training and validation sets and log the results. To make this training more elegant we have:

- Used a learning rate scheduler to decrease the learning rate when the performance saturates.
- Used the Adam optimizer to converge quickly.
- Only saved the best-performing model checkpoint.
- Used the TorchMetrics library to calculate mAP.

```python
for epoch in range(params.train.max_epoch):
    print(f"epoch: {epoch}")
    train_one_epoch(model, device, data_loader, optimizer, log_freq=10)

    if epoch % params.logging.performance_tracking_interval == 0:
        train_map = evaluate(model, device, data_loader, train_map_metric)
        val_map = evaluate(model, device, data_loader_validation, val_map_metric)

        scheduler.step(val_map["map"])

        if params.logging.wandb_enabled:
            train_map_logs = {f"train/{k}": v.item() for k, v in train_map.items()}
            val_map_logs = {f"val/{k}": v.item() for k, v in val_map.items()}
            wandb.log(
                {
                    "epoch": epoch + 1,
                    "lr": optimizer.param_groups[0]["lr"],
                    **train_map_logs,
                    **val_map_logs,
                }
            )

        val_map_average = val_map["map"].cpu().item()
        if val_map_average > best_map * (1 + 0.0001):
            early_stop_counter = 0
            best_map = val_map_average
            print("overwriting the best model!")
            if params.logging.wandb_enabled:
                wandb.run.summary["best map"] = best_map
                torch.save(
                    model.state_dict(),
                    os.path.join(wandb.run.dir, "best_maskrcnn.ckpt"),
                )
            else:
                torch.save(model.state_dict(), "weights/best_maskrcnn.ckpt")
        else:
            early_stop_counter += 1
            if early_stop_counter >= params.train.early_stopping_thresh:
                print("Early stopping at: " + str(epoch))
                break

print("Training finished")
```

We have provided the supporting and utility functions and modules inside the project folder. Before diving into full model training, let's check that the training pipeline works as expected. To do so, we run the training on a few samples and expect the model to overfit to those samples.
This is a common trick to check that the training pipeline works end to end. We have prepared a configuration file (config.ini) to configure the training scripts. Once all the paths and training parameters are set correctly, we can start training:

(encord-maskrcnn) > python train.py

Model checkpoints are saved to the local wandb log folder so that we can use them later for inference. Here is a sample log of mAP and mAP@50 for both the training and validation sets.

Parameters used for training:

- Learning rate = 0.0001
- Batch size = 10
- Epochs = 30
- Num workers = 4

Importing Predictions to Encord Active

There are a few ways to import predictions into Encord Active; for a more detailed explanation, please check here. In this tutorial we will prepare a pickle file consisting of Encord Active Prediction objects so they can be understood by Encord Active. First, we need to locate the wandb folder to get the path of the model checkpoint. Every wandb experiment has a unique ID, which can be found in the overview tab of the experiment on the wandb platform. Once you know the wandb experiment ID, the local checkpoint should be somewhere like this:

/path/to/project/wandb/run-[date_time]_[wandb_id]/files/best_maskrcnn.ckpt

Once you have set up the inference section of the config.ini file, you can generate Encord Active predictions.
Here is the main loop to generate the pickle file:

```python
model.eval()
with torch.no_grad():
    for img, _, img_metadata in tqdm(
        dataset_validation, desc="Generating Encord Predictions"
    ):
        prediction = model([img.to(device)])

        scores_filter = prediction[0]["scores"] > confidence_threshold
        masks = prediction[0]["masks"][scores_filter].detach().cpu().numpy()
        labels = prediction[0]["labels"][scores_filter].detach().cpu().numpy()
        scores = prediction[0]["scores"][scores_filter].detach().cpu().numpy()

        for ma, la, sc in zip(masks, labels, scores):
            contours, hierarchy = cv2.findContours(
                (ma[0] > 0.5).astype(np.uint8), cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE
            )

            for contour in contours:
                contour = contour.reshape(contour.shape[0], 2) / np.array(
                    [[ma.shape[2], ma.shape[1]]]
                )
                prediction = Prediction(
                    data_hash=img_metadata[0]["data_hash"],
                    class_id=encord_ontology["objects"][la.item() - 1][
                        "featureNodeHash"
                    ],
                    confidence=sc.item(),
                    format=Format.POLYGON,
                    data=contour,
                )
                predictions_to_store.append(prediction)

with open(
    os.path.join(validation_data_folder, f"predictions_{wandb_id}.pkl"), "wb"
) as f:
    pickle.dump(predictions_to_store, f)
```

Once we have the pickle file, we run the following command to import the predictions:

encord-active import predictions /path/to/predictions.pkl

Lastly, we refresh the Encord Active browser.

Understanding Model Quality

With a freshly trained model, we're ready to gather some insights. Encord Active automatically matches your ground-truth labels and predictions and presents valuable information about the performance of your model. For example:

- mAP performance for different IoU thresholds
- Precision-recall curve for each class and class-specific AP/AR results
- Which of the 25+ metrics have the highest impact on the model performance
- True positive and false positive predictions, as well as false negative ground-truth objects

First, we evaluate the model at a high level.
Model Performance

As explained previously, we used the unofficial dataset for training and the official dataset for validation. Over the entire dataset we achieved an mAP @ 0.5 IoU of 0.041. The mAP is very low, which means the model has difficulty detecting litter. We also trained a model where both the training and validation sets come from the official dataset; the result was around 0.11, which is comparable with the results reported in the literature. This also shows how difficult our task is.

Important Quality Metrics

First, we want to investigate which quality metrics impact model performance. High importance for a metric implies that a change in that quantity would strongly affect model performance. Hmm... object area, frame object density, and object count are the most important metrics for the model performance. We should investigate that.

Best performing objects

Not surprisingly, among the best performing objects are larger, distinct objects such as cans and bottles.

Worst performing objects

The worst performing objects are the ones without a clear definition: other plastic, other carton, unlabeled litter.

Other model insights

Next, we dive into the model performance by metric and examine performance with respect to different metrics. Here are some insights:

- False negative rate and object count (number of labels per image) are directly proportional.
- The model tends to miss ground-truth objects when the sharpness value of the image is low (i.e., the image is blurry).
- The smaller an object is relative to the image, the worse the performance. This is especially true for objects whose area is less than 0.01% of the image.

True positives

Let's examine a few true positives. The model segments some objects very well. In this tab, we can rank all true positives according to any metric we want; for example, to visualize correctly segmented objects in small images, we select the area metric in the top bar.
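The notion of metric "importance" above can be illustrated with a toy computation. One simple proxy (a sketch; not necessarily what Encord Active computes internally) is the absolute correlation between a metric's per-sample values and the per-sample model score:

```python
import numpy as np

def metric_importance(metric_values: np.ndarray, sample_scores: np.ndarray) -> float:
    """Rank a quality metric by the absolute correlation between its
    per-sample values and the per-sample model score (a toy proxy)."""
    return abs(np.corrcoef(metric_values, sample_scores)[0, 1])

# Toy example: object area strongly drives the score; sharpness barely does.
rng = np.random.default_rng(0)
area = rng.uniform(0, 1, 200)
sharpness = rng.uniform(0, 1, 200)
score = 0.8 * area + 0.05 * rng.normal(size=200)

print(metric_importance(area, score) > metric_importance(sharpness, score))  # True
```

A metric that comes out on top of such a ranking (here, object area) is the one where data curation or model changes are most likely to pay off.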
False positives

Let's examine a few false positives. We see some interesting patterns here. The model's segmentation of the objects is quite good; however, they are counted as false positives. This can mean two things: 1) the model is confused about the object class, or 2) there is a missing annotation. After inspecting the ground-truth labels, we found that some classes are confused with each other (e.g., garbage bag vs. single-use carrier bag, glass bottle vs. other plastic bottle).

Examples from the picture below:

- The glass bottle is detected as other plastic bottle.
- The paper cup has no ground-truth annotation.
- The single-use carrier bag is perfectly detected as a garbage bag.

If we detect frequently confused objects, we can provide more data for these object classes or add specific post-processing steps to our inference pipeline, which will eventually increase overall performance. These points are all very valuable when building our litter detection system, since they show us the areas of our computer vision pipeline that need improvement.

Conclusion

With the help of Encord Active, we now have a better understanding of what we have in our hands. We:

- Obtained useful information about the images, such as their area, aspect ratio, and brightness levels, which is useful when building the training pipeline.
- Inspected the class distribution and learned that there is a high class imbalance.
- Found that most of the objects are very small. In addition, the annotation quality of the crowd-sourced dataset is much worse than that of the official dataset; therefore, annotations should be reviewed.
- Inspected overall and class-based performance metrics. We learned that object area, frame object density, and object count have the highest impact on performance, and thus we should pay attention to those when retraining the model.
- Visualized true positive and false positive samples and found a class mismatch problem.
If this is solved, it will significantly reduce the number of false negatives.

Further Steps

With the insights we got from Encord Active, we now know what to prioritize. In the second part of this tutorial, we will use the information we obtained here to improve our baseline result with a new and improved dataset.

Want to test your own models?

- "I want to get started right away" - You can find Encord Active on GitHub here.
- "Can you show me an example first?" - Check out this Colab Notebook.
- "I am new, and want a step-by-step guide" - Try out the getting started tutorial.

If you want to support the project, you can help us out by giving it a star on GitHub :)

Want to stay updated? Follow us on Twitter and LinkedIn for more content on computer vision, training data, and active learning. Join the Slack community to chat and connect.
Jan 11 2023
5 M
How to Find and Fix Label Errors with Encord Active
Introduction

Are you trying to improve your model performance by finding label errors and correcting them? You're probably spending countless hours manually debugging your datasets to find data and label errors with various scripts in Jupyter notebooks. Encord Active, a new open-source active learning framework, makes it easy to find and fix label errors in your computer vision datasets. With Encord Active, you can quickly identify label errors in your datasets and fix them with just a few clicks. Plus, with a user-friendly UI and a range of visualizations to slice your data, Encord Active makes it easier than ever to investigate and understand the failure modes of your computer vision models. In this guide, we will show you how to use Encord Active to find and fix label errors in the COCO validation dataset. Before we begin, let us quickly recap the three types of label errors in computer vision.

Label errors in computer vision

Incorrect labels in your training data can significantly impact the performance of your computer vision models. While it's possible to manually identify label errors in small datasets, it quickly becomes impractical when working with large datasets containing hundreds of thousands or millions of images. It's basically like finding a needle in a haystack. The three types of labeling errors in computer vision are:

- Mislabeled objects: a sample that has the wrong object class attached to it.
- Missing labels: a sample that does not contain a label.
- Inaccurate labels: a label that is either too tight, too loose, or overlaps with other objects.

Below you see examples of the three types of errors on a Bengal tiger.

Tip! If you'd like to read more about label errors, we recommend checking out Data Errors in Computer Vision.

How to find label errors with a pre-trained model

As your computer vision activities mature, you can use a trained model to spot label errors in your data annotation pipelines.
You need to follow a simple four-step approach:

1. Run a pre-trained model on your newly annotated samples to obtain model predictions.
2. Visualize your model predictions and ground-truth labels on top of each other.
3. Sort for high-confidence false positive predictions and compare them with the ground-truth labels.
4. Flag missing or wrong labels and send them for re-labeling.

Tip! It is important that the computer vision model you use to get predictions has not been trained on the newly annotated samples you are investigating.

How to fix label errors with Encord Active

Getting started

The sandbox dataset used in this example is the COCO validation dataset combined with model predictions from a pre-trained Mask R-CNN ResNet50 FPN v2 model. The sandbox dataset with labels and predictions can be downloaded directly from Encord Active.

Tip! The quality of your model greatly impacts how effective it is at identifying label errors: the better the model, the more accurate the predictions. So be sure to select your model carefully to get the best results.

First, we install Encord Active using pip:

$ pip install encord-active

Then, we download a sandbox dataset:

$ encord-active download
Loading prebuilt projects ...
[?] Choose a project: [open-source][validation]-coco-2017-dataset (1145.2 mb)
 > [open-source][validation]-coco-2017-dataset (1145.2 mb)
   [open-source]-covid-19-segmentations (55.6 mb)
   [open-source][validation]-bdd-dataset (229.8 mb)
   quickstart (48.2 mb)
Downloading sandbox project: 100%|############################| 1.15G/1.15G [00:22<00:00, 50.0MB/s]
Unpacking zip file. May take a bit.
🌟 Success 🌟
Successfully downloaded sandbox dataset.
To view the data, run:

cd "C:/path/to/[open-source][validation]-coco-2017-dataset"
encord-active visualise

Lastly, we visualize Encord Active:

$ cd "[open-source][validation]-coco-2017-dataset"
$ encord-active visualise

In the UI, we navigate to the false positive page. A false positive prediction is when a model incorrectly identifies an object and gives it a wrong class, or when the IoU is lower than the chosen threshold. For example, if a model trained to recognize tigers mistakenly identifies a cat as a tiger, that is a false positive prediction. Next, we select the metric "Model confidence" and filter for predictions with >75% confidence. Using the UI, we can then sort for the highest-confidence false positives to find images with possible label errors. In the example below, we can see that the model has predicted four missing labels in the selected image. The missing objects are a backpack, a handbag, and two people. The predictions are marked in purple with a box around them. As all four predictions are correct, the label errors can be sent straight back to the label editor to be corrected. Similarly, we can use the false positive predictions to find mislabeled objects and send them for re-labeling in your label editor. The vehicle below is predicted with 99.4% confidence to be a bus but is currently mislabeled as a truck. Using Encord's label editor, we can quickly correct the label. To find and fix the remaining incorrect labels in the dataset, we repeat this process until we are satisfied. If you're curious about identifying label errors in your own training data, you can try Encord Active, the open-source active learning framework: simply upload your data, labels, and model predictions to get started.

Conclusion

Finding and fixing label errors is a tedious manual process that can take countless hours.
It is often done by sifting through one image at a time or writing one-off scripts in Jupyter notebooks. The three label error types are 1) missing labels, 2) wrong labels, and 3) inaccurate labels. The easiest way to find and fix label errors and missing labels is to run a trained model on your training dataset and inspect its high-confidence false positives.

Want to test your own models?

- "I want to get started right away" - You can find Encord Active on GitHub here.
- "Can you show me an example first?" - Check out this Colab Notebook.
- "I am new, and want a step-by-step guide" - Try out the getting started tutorial.

If you want to support the project, you can help us out by giving it a star on GitHub :)

Want to stay updated? Follow us on Twitter and LinkedIn for more content on computer vision, training data, and active learning. Join our Discord channel to chat and connect.
Dec 19 2022
8 M