Image Categorization - Ximilar: Visual AI for Business

How To Scan And Identify Your Trading Cards With Ximilar AI
Mon, 05 Aug 2024
A guide for collectors and businesses looking to streamline their card-processing workflow with AI.

In the world of trading card scanning and seller tools, efficiency is crucial. Applications like CollX, VGPC, or Collectr handle millions of daily card-identification requests from images, coming from hobby users as well as those who earn cash selling trading cards. Ximilar offers similar services, providing powerful API solutions for businesses looking to effortlessly integrate visual search and image recognition functionalities into their apps or websites, with the possibility of customization.

Today, I’d like to introduce a solution specifically designed for physical stores and warehouses to process their physical card collections quickly and efficiently using card scanners like those from Fujitsu. This tutorial is tailored for shop owners who need to handle large volumes of card images rapidly. We’ve developed a simple yet powerful script in Python 3 for card identification, condition assessment or grading. It also identifies comic books and reads slab labels from companies like PSA or Beckett. The script outputs a CSV file that can be easily imported into Google Sheets or Microsoft Excel. With a few modifications, it can also be adapted for use with your Shopify store or other seller tools, such as for eBay submissions. Let’s dive in and see how this tool can streamline your card-processing workflow!

Capabilities of our AI Solution for Sports Cards and TCGs

Trading Card Games

In the previous blog post, I wrote about our REST API for identifying TCGs, sports cards, and comic book covers. The TCG identification service supports multiple trading card games, including the most popular ones like Pokémon, Yu-Gi-Oh!, Magic: The Gathering, One Piece, and Lorcana. For some games, it can also identify the correct language version of the card or determine whether it is a foil/holographic card. Additionally, for certain games, the system provides links or identification numbers to TCGplayer. You can try how it works here.

Sports Cards

For sports cards, we can identify more than 5 million trading cards across six main sports categories: baseball, hockey, football, soccer, MMA, and basketball cards. Our system also supports the identification of parallel and reprint versions, with continuous improvements. Not only does it provide the best match, but it also offers alternative options to choose from.

If the trading cards are in slabs from major grading companies like PSA, Beckett, CGC, TAG, SGC, or ACE, the system can instantly identify graded cards and provide the slab company, grade, and certificate number.

All Under One API

As you can see, the functionality is complex, offering features such as bulk trading card scanning and language support, resulting in highly accurate identification. I believe that Ximilar Collectibles Recognition services are the most accurate solutions available on the market today. It is a true game-changer for card dealers, collectors, and companies looking to be independent of third parties like CollX, Kronozio, or Card Dealer Pro, which automatically submit your cards to their marketplaces.

With Ximilar, you can handle your trading card scanning independently using our visual search technology and deep learning models. Our solutions are also designed to suit your specific needs through continuous improvements and customization. Whether you purchase, scan, analyze, search, or sell cards in bulk, our API empowers you to manage your collection without the constraints of third-party services.

How to Analyze TCG and Sports Card Scans With AI

Step 1 – Run The Cards Through The Scanner

Enough talk! Let’s analyze your cards in bulk. First, you’ll need a folder with images of your cards. For testing, I’ve selected a small subset of MTG and Pokémon cards. You can feed them into your scanner in a top loader or individually. Most card collectors use the Fujitsu Ricoh fi-8170 scanner, which is one of the best scanners available. It can capture both the front and back sides of the cards.

For our purposes, we will only need the front side of the cards. To avoid unnecessary costs, remove the back side images from the folder or configure your scanner to store only the front side of the cards. Some scanners, like Fujitsu, can produce scan files with names such as 19032024-0001.jpg or 19032024-FRONT-0001.jpg. You can specify the naming format for the scan files. See the following video tutorial on how to set up a Fujitsu scanner via PaperStream Capture by MaxWaxPax:

My recommendation is to use settings similar to those in the MaxWaxPax video for your Fujitsu scanner and to create multiple profiles for sideways and top-bottom trading card scanning. Ideally, set up the scanner to produce only images of the card fronts, or distinguish the images with a “front” or “back” suffix in the filename. However, if you already have an unstructured collection of card images, you can fully automate the selection of images showing the front sides using our AI Recognition of Collectibles.
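If your scanner does write a side marker into the filename, picking out the front-side scans takes only a few lines of Python. This is a minimal sketch assuming the `19032024-FRONT-0001.jpg` naming scheme mentioned above; `select_front_scans` is a hypothetical helper, not part of our script:

```python
import os

def select_front_scans(folder, marker="FRONT"):
    """Return image filenames in `folder` whose names contain the side
    marker. Assumes a naming scheme like 19032024-FRONT-0001.jpg, as set
    up in the PaperStream Capture profile described above."""
    return sorted(
        f for f in os.listdir(folder)
        if f.lower().endswith((".jpg", ".jpeg", ".png")) and marker in f
    )
```

You could run this once over your scan folder and move or delete everything it does not return, so only front sides are sent to the API.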

Step 2 – Sign Up To Ximilar Platform

Now, you’ll need an account in our App. Simply sign up with your personal or company email to get your unique API token for service authorization. Once you are in the App, copy your API key to the clipboard and save it into some file. To access the service via API, you’ll need to purchase at least a Business plan. Both tasks – getting the API key and purchasing a Business plan – can be completed in the platform’s settings in a matter of minutes.

Sign in to the Ximilar App to see and copy your authorization token.

Step 3 – Installing Python 3

Before running the script, ensure you have Python 3 installed. Some operating systems already include a version of Python, but we require at least Python 3.6. If you’re unsure, follow this tutorial on RealPython, which contains installation steps for Windows, macOS, and Linux:

Installation on Windows and macOS takes only a few clicks.

You should be able to run a command like the following in your command line, shell, or terminal. Here’s mine on a Mac:

michallukac@Michals-MacBook ~ % python --version && pip --version

If you don’t know how to run commands, read a short tutorial on using the terminal/shell/command line. I recommend this tutorial by DjangoGirls or watching some YouTube videos (here’s one for Windows and one for macOS). The output from the command should look similar to my example:

Python 3.9.18

pip 23.1 from /Users/michallukac/env/devel/lib/python3.9/site-packages/pip (python 3.9)

Next, install the Python libraries argparse and requests via pip (argparse already ships with Python 3, so this mainly ensures requests is present):

pip install --upgrade argparse

pip install --upgrade requests

If everything passes, you’re now ready to use the script we’ve prepared to process your folder of card images!

Step 4 – Running The Script On Trading Card Games

Running the script is simple. You’ll need to use a terminal (macOS), shell (Linux), or command line (Windows), which is why we installed Python 3. Download the following file from one of these addresses:

Put this script next to the folder (tcgscans) containing your trading card images or scans, and run the following command in the terminal:

python process_card_scans.py --folder tcgscans --api_key YOURAPIKEY --collectible tcg --output results.csv --select_images all

Hitting Enter will execute the script on the tcgscans folder, and a progress bar will be shown. The script will analyze all the images in the folder (--select_images all). You can interrupt the script at any time; it automatically stores the results to your specified output CSV file every 10 images:
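The interrupt-safe behavior, storing results every 10 images, can be sketched as follows. This is not the script’s actual code; `analyze_card` is a hypothetical placeholder for the real Ximilar API call, and the CSV columns are reduced to three for brevity:

```python
import csv

def process_scans(filenames, analyze_card, out_path, checkpoint_every=10):
    """Analyze scans one by one, writing each result to a CSV and flushing
    every `checkpoint_every` images so an interrupted run loses little work.
    `analyze_card` stands in for the real Ximilar API call."""
    fieldnames = ["filename", "status", "full_name"]
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        for i, name in enumerate(filenames, 1):
            try:
                record = dict(analyze_card(name), filename=name, status="ok")
            except Exception:
                # A failed identification still gets a row, marked "error"
                record = {"filename": name, "status": "error", "full_name": ""}
            writer.writerow(record)
            if i % checkpoint_every == 0:
                f.flush()  # checkpoint: results so far are safely on disk
```

Because every image becomes a row whether the call succeeds or not, you can re-run the script later on just the rows marked "error".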

Executing the script on trading card scan recognition.

Each analysis of a scan (sports card) will consume 10 credits from your credit supply in your Ximilar account. Our App lets you watch your credit consumption closely under Reports. The Business 100k Plan allows you to analyze 10,000 raw cards. If you need to analyze millions of cards per month or your entire collection at once, reach out to us, and we can offer you a bulk discount.

Visualization of API credit consumption per image processing operation in Ximilar App.

Step 5 – Analyzing the CSV file

Now we have our CSV file named results.csv. The CSV file contains the following fields: filename (name of the photo in the folder), status (ok or error), side (front or back), subcategory, full_name, name, year, card_number, series, set, set_code, and other additional fields.

The output format of the CSV depends on whether you analyze sports cards, TCG cards, comics, or slabs. Here is a visualization of the CSV file in Visual Studio Code:

My CSV file in Visual Studio Code.

We can import the file into Google Sheets or Microsoft Excel spreadsheet, edit it as needed, or generate printable checklists. The columns and data from the CSV can also be easily added to your Shopify product files or used for eBay submissions.
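Before importing into a spreadsheet or Shopify, you may want to filter the CSV programmatically, for example keeping only successfully identified cards. A minimal sketch using the `filename` and `status` columns described above (`load_identified_cards` is a hypothetical helper):

```python
import csv

def load_identified_cards(csv_path):
    """Read the results CSV produced by the script and keep only rows
    that were successfully identified (status == "ok")."""
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    return [r for r in rows if r["status"] == "ok"]
```

The returned dictionaries keep whatever extra columns your run produced (set, set_code, year, ...), so they map cleanly onto spreadsheet rows or product fields.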

Additional information on card condition (or grading) can be added via the --condition (or --grading) parameter. For example, if your sports card scanner produces images with filenames such as 0001.jpg, 0002.jpg, 0003.jpg, etc., the following command will process images with odd numbering (e.g., 0001.jpg, 0003.jpg, …), identify the cards (name, card number, etc.), and also compute their condition (very good, excellent, etc.):

python process_card_scans.py --folder sportsfolder --api_key YOUR_API_KEY --collectible sport --output sport.csv --select_images odd --alternative --condition
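The odd-number selection that `--select_images odd` performs might look like the sketch below. This mirrors the behavior, not the script’s actual implementation, and assumes filenames end in a numeric counter as in the example above:

```python
import re

def select_odd_scans(filenames):
    """Keep files whose trailing counter is odd (0001.jpg, 0003.jpg, ...).
    Assumes filenames end in a numeric counter before the extension."""
    kept = []
    for name in filenames:
        m = re.search(r"(\d+)\.\w+$", name)
        if m and int(m.group(1)) % 2 == 1:
            kept.append(name)
    return kept
```

Odd/even selection is handy when a duplex scanner interleaves fronts and backs into a single numbered sequence.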

Conclusion

With Ximilar’s AI-powered solutions, identifying and documenting your trading cards has never been easier. From trading card scanning, analyzing and organizing, to finding the current average market price, every step is streamlined to save you time and effort. I hope this guide helps you optimize your trading card workflow, making it easier to manage and showcase your collection. Happy collecting, whether it’s baseball or Pokémon cards!

Predict Values From Images With Image Regression
Wed, 22 Mar 2023
With image regression, you can assess the quality of samples, grade collectible items or rate & rank real estate photos.

We are excited to introduce the latest addition to Ximilar’s Computer Vision Platform. Our platform is a great tool for building image classification systems, and now it also includes image regression models. They enable you to extract values from images with accuracy and efficiency and save your labor costs.

Let’s take a look at what image regression is and how it works, including examples of the most common applications. More importantly, I will tell you how you can train your own regression system on a no-code computer vision platform. As more and more customers seek to extract information from pictures, this new feature is sure to provide Ximilar’s customers with the tools they need to stay ahead of the curve in today’s highly competitive AI-driven market.

What is the Difference Between Image Categorization and Regression?

Image recognition models are ideal for the recognition of images or objects in them, their categorization and tagging (labelling). Let’s say you want to recognize different types of car tyres or their patterns. In this case, categorization and tagging models would be suitable for assigning discrete features to images. However, if you want to predict any continuous value from a certain range, such as the level of tyre wear, image regression is the preferred approach.

Image regression is an advanced machine-learning technique that can predict continuous values within a specific range. Whenever you need to rate or evaluate a collection of images, an image regression system can be incredibly useful.

For instance, you can define a range of values, such as 0 to 5, where 0 is the worst and 5 is the best, and train an image regression task to predict the appropriate rating for given products. Such predictive systems are ideal for assigning values to several specific features within images. In this case, the system would provide you with highly accurate insights into the wear and tear of a particular tyre.

Predicting the level of tyre wear from an image is a use case for an image regression task, while a categorization task can recognize the pattern of the tyre.

How to Train Image Regression With a Computer Vision Platform?

Simply log in to Ximilar App and go to Categorization & Tagging. Upload your training pictures and under Tasks, click on Create a new task and create a Regression task.

Creating an image regression task in Ximilar App.

You can train regression tasks and test them via the same front end or with API. You can develop an AI prediction task for your photos with just a few clicks, without any coding or any knowledge of machine learning.

This way, you can create an automatic grading system able to analyze an image and provide a numerical output in the defined range.
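Once trained, such a task is called like any other classification task on the platform. The sketch below only assembles a request body; the field names (`task_id`, `records`, `_url`) are illustrative of a typical Ximilar-style request, so check the official API documentation for the exact schema before relying on them:

```python
import json

def build_regression_request(task_id, image_urls):
    """Assemble a JSON body for scoring images with a trained regression
    task. Field names here are illustrative, not the authoritative
    Ximilar API schema."""
    return json.dumps({
        "task_id": task_id,
        "records": [{"_url": url} for url in image_urls],
    })
```

The response for each record would then carry the predicted continuous value, ready to store next to your product data.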

Use the Same Training Data For All Your Image Classification Tasks

Both image recognition and image regression methods fall under the image classification techniques. That is why the whole process of working with regression is very similar to categorization & tagging models.

Working with image regression model on Ximilar computer vision platform.

Both technologies can work with the same datasets (training images), and inputs of various image sizes and types. In both cases, you can simply upload your data set to the platform, and after creating a task, label the pictures with appropriate continuous values, and then click on the Train button.

Apart from a machine learning platform, we offer a number of AI solutions that are field-tested and ready to use. Check out our public demos to see them in action.

If you would like to build your first image classification system on a no-code machine learning platform, I recommend checking out the article How to Build Your Own Image Recognition API. We defined the basic terms in the article How to Train Custom Image Classifier in 5 Minutes. We also made a basic video tutorial:

Tutorial: train your own image recognition model with Ximilar platform.

Neural Network: The Technology Behind Predicting Range Values on Images

The simplest technique for predicting float values is linear regression, which can be further extended to polynomial regression. These two statistical techniques work great on tabular input data. However, when it comes to predicting numbers from images, a more advanced approach is required. That’s where neural networks come in. Mathematically speaking, a neural network “f” can be trained to predict value “y” from picture “x”, i.e. “y = f(x)”.

Neural networks can be thought of as approximations of the functions we aim to identify through optimization on training data. The most commonly used NNs for image-based predictions are Convolutional Neural Networks (CNNs), Vision Transformers (ViTs), or a combination of both. These powerful tools analyze pictures pixel by pixel and learn relevant features and patterns that are essential for solving the problem at hand.

CNNs are particularly effective in picture analysis tasks, as they are able to detect features at different spatial scales and orientations. Meanwhile, ViTs have been gaining popularity due to their ability to learn visual features without being constrained by spatial invariance. Used together, these techniques provide a comprehensive approach to image-based predictions, extracting the most relevant information from images.
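To make the simplest case concrete, here is ordinary least squares for y = w*x + b on toy tabular data, written in plain Python. The point is only the shape of the problem: for images, a deep network plays the role of f, but the goal is the same, fit f so that f(x) is close to y:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = w*x + b -- the simplest regressor
    discussed above. Returns the fitted slope w and intercept b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Closed-form solution: w = cov(x, y) / var(x), b = mean residual
    w = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return w, my - w * mx
```

Feeding it points generated by y = 2x + 1 recovers w = 2 and b = 1; an image regression model does the analogous fit, except x is a picture and f has millions of learned parameters.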

What Are the Most Common Applications of Value Regression From Images?

Estimating Age From Photos

Probably the most widely known use case of image regression among the public is age prediction. You can come across it on social media platforms and mobile apps, such as Facebook, Instagram, Snapchat, or FaceApp. These apply deep learning algorithms to predict a user’s age based on their facial features and other details.

While image recognition provides information on the object or person in the image, the regression system tells us a specific value – in this case, the person’s age.

Needless to say, these plugins are not always correct and can sometimes produce biased results. Despite this limitation, various image regression models are gaining popularity on various social sites and in apps.

Ximilar already provides a face-detection solution. Models such as age prediction can be easily trained and deployed on our platform and integrated into your system.

Value Prediction and Rating of Real Estate Photos

Pictures play an essential part on real estate sites. When people are looking for a new home or investment, they are navigating through the feed mainly by visual features. With image regression, you are able to predict the state, quality, price, and overall rating of real estate from photos. This can help with both searching and evaluating real estate.

Predicting the rating and price of real estate from photos with image regression.

Custom recognition models are also great for the recognition & categorization of the features present in real estate photos. For example, you can determine whether a room is furnished, what type of room it is, and categorize the windows and floors based on their design.

Additionally, a regression can determine the quality or state of floors or walls, as well as rank the overall visual aesthetics of households. You can store all of this information in your database. Your users can then use such data to search for real estate that meets specific criteria.

Image classification systems such as image recognition and value regression are ideal for real estate ranking. Your visitors can search the database with the extracted data.

Determining the Degree of Wear and Tear With AI

Visual AI is increasingly being used to estimate the condition of products in photos. While recognition systems can detect individual tears and surface defects, regression systems can estimate the overall degree of wear and tear of things.

A good example of an industry that has seen significant adoption of such technology is insurance. For example, startups like Lemonade or Root use AI when processing insurance claims.

With custom image recognition and regression methods, it is now possible to automate the process of insurance claims. For instance, a visual AI system can indicate the seriousness of damage to cars after accidents or assess the wear and tear of various parts such as suspension, tires, or gearboxes. The same goes with other types of insurance, including households, appliances, or even collectible & antique items.

Our platform is commonly utilized to develop recognition and detection systems for visual quality control & defect detection. Read more in the article Visual AI Takes Quality Control to a New Level.

Automatic Grading of Antique & Collectible Items Such as Sports Cards

Apart from car insurance and damage inspection, recognition and regression are great for all types of grading and sorting systems, for instance on price comparators and marketplaces of collectible and antique items. Deep learning is ideal for the automatic visual grading of collector items such as comic books and trading cards.

By leveraging visual AI technology, companies can streamline their processes, reduce manual labor significantly, cut costs, and enhance the accuracy and reliability of their assessments, leading to greater customer satisfaction.

Automatic Recognition of Collectibles

Ximilar built an AI system for the detection, recognition and grading of collectibles. Check it out!

Food Quality Estimation With AI

Biotech, Med Tech, and Industry 4.0 also have a lot of applications for regression models. For example, they can estimate the approximate level of fruit & vegetable ripeness or freshness from a simple camera image.

The grading of vegetables by an image regression model.

For instance, this Japanese farmer is using deep learning for cucumber quality checks. Looking for quality control or estimation of size and other parameters of olives, fruits, or meat? You can easily create a system tailored to these use cases without coding on the Ximilar platform.

Build Custom Evaluation & Grading Systems With Ximilar

Ximilar provides a no-code visual AI platform accessible via App & API. You can log in and train your own visual AI without the need to know how to code or have expertise in deep learning techniques. It will take you just a few minutes to build a powerful AI model. Don’t hesitate to test it for free and let us know what you think!

Our developers and annotators are also able to build custom recognition and regression systems from scratch. We can help you with the training of the custom task and then with the deployment in production. Both custom and ready-to-use solutions can be used via API or even deployed offline.

Image Annotation Tool for Teams
Thu, 06 May 2021
Annotate is an advanced image annotation tool supporting complex taxonomies and teamwork on computer vision projects.

Through the years, we have worked with many annotation tools. The problem is that most desktop annotation apps are offline and intended for single-person use, not for team cooperation. Web-based apps, on the other hand, mostly focus on data management with photo annotation, not on the whole ecosystem with API and inference systems. In this article, I review what a good image annotation tool should do, and explain the basic features of our own tool – Annotate.

Every big machine learning project requires the active cooperation of multiple team members – engineers, researchers, annotators, product managers, or owners. For example, supervised deep learning for object detection, as well as segmentation, outperforms unsupervised solutions. However, it requires a lot of data with correct annotations. Annotation of images is one of the most time-consuming parts of every deep learning project. Therefore, picking the right annotator tool is critical. When your team is growing and your projects require higher complexity over time, you may encounter new challenges, such as:

  • Adding labels to the taxonomy would require re-checking a lot of your work
  • Increasing the performance of your models would require more data
  • You will need to monitor the progress of your projects

Building solid annotation software for computer vision is not an easy task. And yes, it requires a lot of failures and taking many wrong turns before finding the best solution. So let’s look at what should be the basic features of an advanced data annotation tool.

What Should an Advanced Image Annotation Tool Do?

Many customers are using our cloud platform Ximilar App in very specific areas, such as Fashion, Healthcare, Security, or Industry 4.0. The environment of a proper AI helper or tool should be complex enough to cover requirements like:

  • Features for team collaboration – you need to assign tasks, and then check the quality and consistency of data
  • Great user experience for dataset curation – everything should be as simple as possible, but no simpler
  • Fast production of high-quality datasets for your machine-learning models
  • Work with complex taxonomies & many models chained with Flows
  • Fast development and prototyping of new features
  • Connection to Rest API with Python SDK & querying annotated data

With these needs in mind, we created our own image annotation tool. We use it in our internal projects and provide it to our customers as well. Our technologies for machine learning accelerate the entire pipeline of building good datasets. Whether you are a freelancer tagging pictures or a team managing product collections in e-commerce, Annotate can help.

Our Visual AI tools enable you to work with your own custom taxonomy of objects, such as fashion apparel or things captured by the camera. You can read the basics on the categories & tags and machine learning model training, watch the tutorials, or check our demo and see for yourself how it works.

Annotate

Annotate is an advanced image annotation tool, which enables you to annotate images precisely and fast. It works as an end-to-end platform for visual data management. You can query the same images, change labels, create objects, draw bounding boxes and even polygons here.

It is a web-based online annotation tool that works fully in the cloud. Since it is connected to the same back end & database as Ximilar App, all changes you make in Annotate manifest in your workspace in the App, and vice versa. You can create labels, tasks & models, or upload images through the App, and use them in Annotate.

Ximilar Application and Annotate are connected to the same backend (api.ximilar.com) and the same database.

Annotate extends the functionalities of the Ximilar App. The App is great for training, creating entities, uploading data, and batch management of images (bulk actions for labelling and filtering). Annotate, on the other hand, was created for the detail-oriented management of images. The default single-zoomed image view brings advantages, such as:

  • Identifying separate objects, drawing polygons and adding metadata to a single image
  • Suggestions based on AI image recognition help you choose from very complex taxonomies
  • The annotators focus on one image at a time to minimize the risk of mistakes

Interested in getting to know Annotate better? Let’s have a look at its basic functions.

Deep Focus on a Single Image

If you enter the Images (left menu), you can open any image in the single image view. To the right of the image, you can see all the items located in it. This is where most of the labelling is done. There is also a toolbar for drawing objects and polygons, labelling images, and inspecting metadata.

In addition, you can zoom in/out and drag the image. This is especially helpful when working with smaller objects or big-resolution images. For example, teams annotating medical microscope samples or satellite pictures can benefit from this robust tool.

The main view of the image in our Fashion Tagging workspace

Create Multiple Workspaces

Some of you already know this from other SaaS platforms. The idea is to divide your data into several independent storages. Imagine your company is working on multiple projects at the same time and each of them requires you to label your data with an image annotation tool. Your company account can have many workspaces, each for one project.

Here is our active workspace for Fashion Tagging

Within the workspaces, you don’t mix your images, labels, and tasks. For example, one workspace contains only images for fruit recognition projects (apples, oranges, and bananas) and another contains data on animals (cats and dogs).

Your team members can get access to different workspaces. Everyone can switch between the workspaces in the App as well as in Annotate (top right, next to the user icon). Did you know that the workspaces are also accessible via API? Check out our documentation and learn how to connect to the API.
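A minimal sketch of what an API call targeting a workspace might look like. The Token authorization scheme is the one shown in the App, but the `workspace` header below is an assumption for illustration; consult the API documentation for the exact way to address a workspace:

```python
def ximilar_headers(api_token, workspace=None):
    """HTTP headers for calls to the Ximilar REST API. The `workspace`
    entry is illustrative -- check the official docs for the exact
    mechanism used to target a specific workspace."""
    headers = {
        "Authorization": "Token " + api_token,
        "Content-Type": "application/json",
    }
    if workspace:
        headers["workspace"] = workspace
    return headers
```

You would pass these headers to your HTTP client of choice (e.g. requests) along with the endpoint and JSON body.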

Train Precise AI Models with Verification

Building good computer vision models requires a lot of data, high-quality annotations, and a team of people who understand the process of building such a dataset. In short, to create high-quality models, you need to understand your data and have a perfectly annotated dataset. In the words of the Director of AI at Tesla, Andrej Karpathy:

“Labeling is a job for highly trained professionals.” – Andrej Karpathy, Director of AI at Tesla

Annotate helps you build high-quality AI training datasets by verification. Every image can be verified by different users in the workspace. You can increase the precision by training your models only on verified images.

A list of users who verified the image with the exact dates

Verifying your data is a necessary requirement for the creation of good deep-learning models. To verify the image, simply click the button verify or verify and next (if you are working on a job). You will be able to see who verified any particular image and when.

Create and Track Image Annotating Jobs

When you need to process the newly uploaded images, you can assign them to a Job and a team of people can process them one by one in a job queue. You can also set up exactly how many times each image should be seen by the people processing this queue.

Moreover, you can specify which photo recognition model or flow of models should be displayed when doing the job. For example, here is the view of the jobs we are using in one of our tagging services.

Two jobs are waiting to be completed by annotators; you can start working by hitting the play button on the right.

When working on a job, every time an annotator hits the Verify & Next button, it will redirect them to a new image within a job. You can track the progress of each job in the Jobs. Once the image annotation job is complete, the progress bar turns green, and you can proceed to the next steps: retraining the models, uploading new images, or creating another job.

Draw Objects and Polygons

Sometimes, recognizing the most probable category or tags for an image is not enough. That is why Annotate provides a possibility to identify the location of specific things by drawing objects and polygons. The great thing is that you are not paying any credits for drawing objects or labelling. This makes Annotate one of the most cost-effective online apps for image annotation.

Drawing tool for image annotation: creating a bounding box for an object detection model.
Simply click and drag with the rectangle tool on the canvas to create a detection object.

So what exactly do you pay for when annotating data? API credits are charged only for data uploads, with volume-based discounts. This makes Annotate an affordable yet powerful tool for data annotation. If you want to know more, read our latest article on API Credit Packs, check our Pricing Plans, or see the Documentation.

Annotate With Complex Taxonomies Elegantly

The greatest advantage of Annotate is how it handles very complex taxonomies and attribute hierarchies. That is why it is usually used by companies in E-commerce, Fashion, Real Estate, Healthcare, and other areas with rich databases. For example, our Fashion Tagging service contains more than 600 labels belonging to more than 100 custom image recognition models. The taxonomy tree for some biotech projects can be even broader.

Navigating the taxonomy of labels is very elegant in Annotate – via Flows. Once your Flow is defined (our team can help you with it), you simply add labels to the images. The branches expand automatically as you add labels. In other words, you always see only the labels essential for your images.

Adding labels from complex taxonomy to fashion image.
Simply navigate through your taxonomy tree, expanding branches when clicking on specific labels.

For example, this image contains a fashion object “Clothing”, to which we need to assign more labels. Adding the Clothing/Dresses label will expand the tags in the Length Dresses and Style Dresses tasks. If you select the label Elegant from Style Dresses, only the features & attributes you need will be suggested for annotation.

Automate Repetitive Tasks With AI

Annotate was initially designed to speed up the work of building computer vision solutions. When annotating data, manual drawing & clicking is a time-consuming process. That is why we created AI helper tools that automate the annotation process in just a few clicks. Here are a few things you can do to speed up the entire annotation pipeline:

  • Use the API to upload your previously annotated data to train or re-train your machine learning models and use them to annotate or label more data via API
  • Create bounding boxes and polygons for object detection & instance object segmentation with one click
  • Create jobs, share the data, and distribute the tasks to your team members
Automatically predicting objects with one click speeds up data annotation.
Predicting bounding boxes with one click automates the entire annotation process.
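The first point in the list above, uploading previously annotated data via the API, might look like the sketch below. Note that the endpoint path and field names here are illustrative assumptions, not the documented Ximilar API; check the API documentation for the real upload interface. No request is actually sent in this snippet.

```python
import json

# Assumed endpoint -- a placeholder, not the documented Ximilar API path.
UPLOAD_URL = "https://api.ximilar.com/recognition/v2/training-image"

def build_upload_payload(image_url, label_ids):
    """Build a JSON body for uploading one pre-annotated training image
    together with the labels it should carry."""
    return {"img_url": image_url, "labels": list(label_ids)}

payload = build_upload_payload(
    "https://example.com/products/dress-001.jpg",
    ["label-dress", "label-elegant"],
)
# In a real script you would POST this body with your auth token, e.g.:
# requests.post(UPLOAD_URL, json=payload, headers={"Authorization": "Token ..."})
print(json.dumps(payload))
```

Batching such uploads over your whole dataset is what lets the models retrain on new data within a day.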

Image Annotation Tool for Advanced Visual AI Training

As the main focus of Ximilar is AI for sorting, comparing, and searching multimedia, we integrate image annotation into the building of AI search models, a feature we find missing in other data annotation applications. To build such models, you need to group multiple items (images or objects, typically product pictures) into Similarity Groups. Annotate helps us create datasets for building strong image similarity search models.

Grouping same or similar images with Image Annotation Tool.
Grouping the same or similar images with the Image Annotation Tool. You can tell which item is a smartphone photo or which photos should be located on an e-commerce platform.

Annotate is Always Growing

Annotate was originally developed as our internal image annotation software, and we have already delivered many successful solutions to our clients with it. It is a unique product that any team can benefit from to improve their computer vision models remarkably fast.

We plan to introduce more data formats like videos, satellite imagery (sentinel maps), 3D models, and more in the future to level up the Visual AI in fields such as visual quality control or AI-assisted healthcare. We are also constantly working on adding new features and improving the overall experience of Ximilar services.

Annotate is available for all users with Business & Professional pricing plans. Would you like to discuss your custom solution or ask anything? Let’s talk! Or read how the cooperation with us works first.

The post Image Annotation Tool for Teams appeared first on Ximilar: Visual AI for Business.

]]>
Visual AI Takes Quality Control to a New Level https://www.ximilar.com/blog/visual-ai-takes-quality-control-to-a-new-level/ Wed, 24 Feb 2021 16:08:27 +0000 https://www.ximilar.com/?p=2424 Comprehensive guide for automated visual industrial quality control with AI and Machine Learning. From image recognition to anomaly detection.

The post Visual AI Takes Quality Control to a New Level appeared first on Ximilar: Visual AI for Business.

]]>
Have you heard about The Big Hack? The Big Hack story was about a tiny probe (a small chip) inserted on computer motherboards by Chinese manufacturing companies. Attackers could then infiltrate any server workstation containing these motherboards, many of which were installed in large US-based companies and government agencies. The thing is, the probes were so small, and the motherboards so complex, that they were almost impossible to spot with the human eye. You can take this post as a guide to help you navigate the latest trends of AI in industry, with a primary focus on AI-based visual inspection systems.

AI Adoption by Companies Worldwide

Let’s start with some interesting stats and news. The expansion of AI and Machine Learning is becoming common across numerous industries. According to this report by Stanford University, AI adoption is increasing globally. More than 50% of respondents said their companies were using AI, and adoption growth was greatest in the Asia-Pacific region. Some people refer to the automation of factory processes, including digitalization and the use of AI, as the Fourth Industrial Revolution (so-called Industry 4.0).

Photo by AI Index 2019 Report
AI adoption by industry and function [Source]

The data show that the Automotive industry is the largest adopter of AI in manufacturing, making heavy use of machine learning, computer vision, and robotics.
Other industries, such as Pharma or Infrastructure, use computer vision in their production lines as well. Financial services, on the other hand, use AI mostly in operations, marketing & sales (with a focus on Natural Language Processing, NLP).

AI technologies per industry [Source]

MIT Technology Review quoted leading artificial intelligence expert Andrew Ng, who has been helping tech giants like Google implement AI solutions, saying that factories are AI’s next frontier. For example, while it would be difficult to inspect parts of electronic devices with our eyes, the cheap camera of the latest Android phone or iPhone can provide high-resolution images that can be connected to any industrial system.

Adopting AI brings major advantages, but also potential risks that need to be mitigated. It is no surprise that companies are mainly concerned about the cybersecurity of such systems. Imagine you could lose a billion dollars if your factory stopped working (as happened to Honda in this case). Other obstacles are potential errors in machine learning models. There are techniques to discover such errors, such as the explainability of AI systems. For now, the explainability of AI is a concern for only 19% of companies, so there is room for improvement. Getting insight into the algorithms can improve the processes and the quality of the products. Besides security, there are also political & ethical questions (e.g., job replacement or privacy) that companies worry about.

This survey by McKinsey & Company brings interesting insights into Germany’s industrial sector. It demonstrates the potential of AI for German companies in eight use cases, one of which is automated quality testing. The expected benefit is a 50% productivity increase due to AI-based automation. Needless to say, Germany is a bit ahead with the AI implementation strategy – there are already several plans made by German institutions to create standardised AI systems that will have better interoperability, certain security standards, quality criteria, and test procedures.

Highly developed economies like Germany, with a high GDP per capita and challenges such as a quickly ageing population, will increasingly need to rely on automation based on AI to achieve GDP targets.

McKinsey & Company

Another study by PwC predicts that the total expected economic impact of AI in the period until 2030 will be about $15.7 trillion. The greatest economic gains from AI are expected in China (26% higher GDP in 2030) and North America.

What is Visual Quality Control?

The human visual system is naturally very selective in what it perceives, focusing on one thing at a time and not actually seeing the whole image (direct vs. peripheral view). Cameras, on the other hand, see all the details, and at the highest resolution possible. Stories like The Big Hack therefore show us the importance of visual control, not only for quality but also for safety. That is why several companies and universities have developed optical inspection systems employing machine learning methods able to detect the tiniest difference from a reference board.

Motherboards by Super Micro [Source: Scott Gelber]

In general, visual quality control is a method or process to inspect equipment or structures to discover defects, damages, missing parts, or other irregularities in production or manufacturing. It is an important method of confirming the quality and safety of manufactured products. Optical inspection systems are mostly used for visual quality control in factories and assembly lines, where the control would be hard or ineffective with human workers.

What Are the Main Benefits of Automatic Visual Inspection?

Here are some of the essential reasons why automatic visual inspection brings major advantages to businesses:

  • The human eye is imprecise – Even though our visual system is a magnificent thing, it needs a lot of “optimization” to be effective, making it prone to optical illusions. The focused view can miss many details, and our visible spectrum is limited (380–750 nm), making us unable to capture near-infrared (NIR) wavelengths (source). Cameras and computer systems, on the other hand, can be calibrated to different conditions and are more suitable for highly precise analyses.
  • Manual checking – Checking items one by one is a time-consuming process. Smart automation allows more items to be processed and checked, faster. It also reduces the number of defective items released to customers.
  • The complexity – Some assembly lines can produce thousands of various products of different shapes, colours, and materials. For humans, it can be very difficult to keep track of all possible variations.
  • Quality – Providing better and higher quality products by reducing defective items and getting insights into the critical parts of the assembly line.
  • Risk of damage – Machine vision can reduce the risk of item damage and contamination by a person.
  • Workplace safety – Making the work environment safer by inspecting it for potentially dangerous actions (e.g. detection of protection wearables as safety helmets in construction sites), inspection in radioactive or biohazard environments, detection of fire, covid face masks, and many more.
  • Saving costs – Labour can be expensive in the Western world; for example, the average quality control inspector salary in the US is about 40k USD. Companies consider numerous options to cut costs, such as moving factories to other countries, streamlining operations, or replacing workers with robots. As mentioned before, this goes hand in hand with political & ethical questions. I think the most reasonable long-term solution is the cooperation of workers with robotic systems, which will make the process more robust, reliable, and effective.
  • Costs of AI systems – Sooner or later, modern technology and automation will be common in all companies (Startups as well as enterprise companies). The adoption of automatic solutions based on AI will make the transition more affordable.

Where is Visual Quality Control Used?

Let’s take a look at some of the fields where the AI visual control helps:

  • Cosmetics – Inspection of beauty products for defects and contaminations, colour & shape checks, controlling glass or plastic tubes for cleanliness and rejecting scratched pieces.
  • Pharma & Medical – Visual inspection for pharmaceuticals: rejecting defective and unfilled capsules or tablets or the filling level of bottles, checking the integrity of items; or surface imperfections of medical devices. High-resolution recognition of materials.
  • Food Industry and Agriculture – Food and beverage inspection for freshness. Label print/barcode/QR code control of presence or position.

A great example of industrial IoT is this story about a Japanese cucumber farmer who developed a quality-checking monitoring system with deep learning and TensorFlow.

  • Automotive – Examination of forged metallic parts, plastic parts, cracks, stains or scratches in the paint coating, and other surface and material imperfections. Monitoring quality of automotive parts (tires, car seats, panels, gears) over time. Engine monitoring and predictive autonomous maintenance.
  • Aerospace – Checking for the presence and quality of critical components and material, spotting the defective parts, discarding them, and therefore making the products more reliable.
  • Transportation – Rail surface defects control (example), aircraft maintenance check, or baggage screening in airports – all of them require some kind of visual inspection.
  • Retail/Consumer Goods & Fashion – Checking assembly line items made of plastics, polymers, wood, and textile, and packaging. Visual quality control can be deployed for the manufacturing process of the goods. Sorting imprecise products.
  • Energy, Mining & Heavy Industries – Detecting cracks and damage in wind blades or solar panels, visual control in nuclear power plants, and many more.

It’s interesting to see more and more companies choosing collaborative platforms such as Kaggle to solve specific problems. In 2019, a Kaggle contest by the Russian company Severstal led to dozens of solutions for the steel defect detection problem.

Image of flat steel defects from the Severstal competition. [Source: Kaggle]

  • Other, e.g. safety checks – checking whether people are present in specific zones of the factory or whether they wear helmets, or stopping a robotic arm if a worker is located nearby.

The Technology Behind AI Quality Control

There are several approaches and technologies that can be used for visual inspection on production lines. The most common nowadays use some kind of neural network model.

Neural Networks – Deep Learning

Neural Networks (NN) are computational models that accept input data and output relevant information. To make a neural network useful (i.e., to find the weights of the connections between neurons and layers), we need to feed the network with training data.

The advantage of neural networks is their ability to build internal representations of the training data, which gives them the best performance among machine learning models in computer vision. However, they bring challenges of their own, such as computational demands and overfitting.
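The convolution operation at the heart of the networks used in computer vision can be illustrated with a toy numpy sketch (this is not Ximilar's implementation): a single hand-made filter already separates a flat, defect-free surface from one with a scratch-like anomaly. A trained network learns many such filters from data instead of hard-coding them.

```python
import numpy as np

def conv2d(image, kernel):
    """Naive valid-mode 2D convolution (cross-correlation) -- the core
    operation inside convolutional layers."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A Laplacian-like kernel responds strongly to sharp local changes
# (e.g. a scratch) and stays at zero on perfectly flat areas.
kernel = np.array([[0, -1, 0],
                   [-1, 4, -1],
                   [0, -1, 0]], dtype=float)

flat = np.ones((5, 5))        # defect-free surface
scratched = flat.copy()
scratched[2, 2] = 5.0         # a single bright "scratch" pixel

print(np.abs(conv2d(flat, kernel)).max())       # 0.0
print(np.abs(conv2d(scratched, kernel)).max())  # 16.0 -> strong response
```

Thresholding such filter responses is essentially what the earliest layers of a defect-detection CNN learn to do automatically.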

[Un|Semi|Self] Supervised Learning

If a machine-learning algorithm (such as a NN) requires ground-truth labels, i.e. annotations, we talk about supervised learning. If not, it is an unsupervised method, or something in between: a semi- or self-supervised method. However, building an annotated dataset is much more expensive than simply obtaining unlabelled data. The good news is that the latest research in neural networks tackles these problems with unsupervised learning.

On the left is the original item without any defects; on the right, a slightly damaged one. If we know the labels (OK/DEFECT), we can train a supervised machine-learning algorithm. [Source: Kaggle]

Here is the list of common services and techniques for visual inspection:

  • Image Recognition – A simple neural network that can be trained for categorization or error detection on products from images. The most common architectures are based on convolutions (CNN).
  • Object Detection – A model able to predict the exact position (bounding box) of specific parts. Suitable for defect localization and counting.
  • Segmentation – More complex than object detection, image segmentation gives you pixel-level predictions.
  • Image Regression – Predict a single continuous value from an image, for example, the level of wear of an item.
  • Anomaly Detection – Shows which image contains an anomaly and why. Mostly done with GANs or Grad-CAM.
  • OCR – Optical Character Recognition, used for extracting and reading text from images.
  • Image Matching – Matching the picture of the product to a reference image and displaying the difference.
  • Other – There are also solutions that do not require training data at all, most often using a simple yet powerful computer vision technique.
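The image-matching idea from the list above can be sketched as a plain pixel difference against the reference image; a real system would first align the images and compensate for lighting. A minimal numpy sketch:

```python
import numpy as np

def diff_mask(image, reference, threshold=0.1):
    """Flag pixels where the product photo deviates noticeably from the
    reference image -- a toy version of image matching; real systems
    add image alignment and lighting compensation first."""
    diff = np.abs(image.astype(float) - reference.astype(float))
    return diff > threshold * 255

reference = np.full((4, 4), 200, dtype=np.uint8)   # the "golden" product
produced = reference.copy()
produced[1, 2] = 90                                # one defective pixel

mask = diff_mask(produced, reference)
print(int(mask.sum()))   # number of deviating pixels flagged
```

The resulting boolean mask can then be visualized as an overlay showing the operator exactly where the produced item differs from the reference.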

If you would like to dive a bit deeper into the process of building a model, you can check my posts on Medium, such as How to detect defects on images.

Typical Types and Sources of Data for Visual Inspection

Common Data Sources

Thermal imaging example [Source: Quality Magazine]

RGB images – The most common data type and the easiest to get. A simple 1080p camera that you can connect to a Raspberry Pi costs about $25.

Thermography – Thermal quality control via infrared cameras, mostly used to detect flaws not visible by simple RGB cameras under the surface, gas imaging, fire prevention, and electronics behaviour under different conditions. If you want to know more, I recommend reading the articles in Quality Magazine.

3D scanning, Lasers, X-ray, and CT scans – Creating 3D models from special depth scanners gives you a better insight into material composition, surface, shape, and depth.

Microscopy – Due to the rapid development and miniaturization of technologies, sometimes we need a more detailed and precise view. Microscopes can be used in an industrial setting to ensure the best quality and safety of products. Microscopy is used for visual inspection in many fields, including material sciences and industry (stress fractures), nanotechnology (nanomaterial structure), or biology & medicine. There are many microscopy methods to choose from, such as stereomicroscopy, electron microscopy, opto-digital or purely digital microscopes, and others.

Common Inspection Errors

  • scratches
  • patches
  • knots, shakes, checks, and splits in the wood
  • crazing
  • pitted surface
  • missing parts
  • label/print damage
  • corrosion
  • coating nonuniformity
Surface crazing and cracking on brake discs [source], crazing in polymer-grafted nanoparticle film [source], and wood shakes [source].

Examples of Datasets for Visual Inspection

  • Severstal Kaggle Dataset – A competition for the detection of defects on flat sheet steel.
  • MVTec AD – 5000 high-resolution annotated images of 15 items (divided into defective and defect-free categories).
  • Casting Dataset – Casting is a manufacturing process in which a liquid material is poured into a form/mould. About 7,000 images of submersible pump defects.
  • Kolektor Surface-Defect Dataset – Dataset of microscopic fractures and cracks on electrical commutators.
  • PCB Dataset – Annotated images of printed circuit boards.

AI Quality Control Use Cases

We talked about a wide range of applications for visual control with AI and machine learning. Here are three of the use cases for industrial image recognition we worked on in 2020. All of them required automatic optical inspection (AOI) and partial customization when building the model, working with different types of data and deployment (cloud, on-premise instance, or smartphone). We are glad that during the COVID-19 pandemic, our technologies helped customers keep their factories open.

Our typical workflow for a customized solution is the following:

  1. Setup, Research & Plan: If we don’t know how to solve the problem from the initial call, our Machine Learning team does the research and finds the optimal solution for you.
  2. Gathering Data: We sit with your team and discuss what kind of data samples we need. If you can’t acquire and annotate data yourself, our team of annotators will work on obtaining a training dataset.
  3. First prototype: Within 2–4 weeks we prepare the first prototype or proof of concept. The proof of concept is a lightweight solution for your problem. You can test it and evaluate it by yourself.
  4. Development: Once you are satisfied with the prototype results, our team focuses on the development of the full solution. We work mostly in an iterative way, improving the model and obtaining more data if needed.
  5. Evaluation & Deployment: If the system performs well and meets the criteria set up in the first calls (mostly some evaluation on the test dataset and speed performance), we work on the deployment. It can be used in our cloud, on-premise, or embedded hardware in the factory. It’s up to you. We can even provide a source code so your team can edit it in the future.

Use case: Image recognition & OCR for wood products

One of our customers contacted us with a request to build a system for the categorization and quality control of wooden products. With the Ximilar Platform, we were able to easily develop and deploy a camera system over the assembly line that sorts the products into bins. The system identifies defective print on the products with optical character recognition (OCR) technology, and the surface control of the wood texture is handled by a separate model.

Printed text on wood [Source: Ximilar]

The technology is connected to a simple smartphone/tablet camera in the factory and can handle tens of products per second. This way, our customer was able to reduce rework and manual inspections, saving thousands of USD per year. This system was built with the Ximilar Flows service.

Use case: Spectrogram analysis from car engines

Another project we successfully deployed was the detection of malfunctioning engines. We did it by transforming the sound input from the car into an image spectrogram. We then trained a deep neural network that recognises problematic car engines and can tell you the specific problem.

The good news is that this system can also detect anomalies in an unsupervised way (no need for data labelling) using GAN technology.

Spectrogram from Engine [Source: Ximilar]
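The sound-to-spectrogram step can be sketched in a few lines of numpy. The frame length, hop size, and synthetic signals below are illustrative assumptions (a production system would use a dedicated STFT library), but they show how a 1-D audio recording becomes a 2-D image a CNN can classify:

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Turn a 1-D audio signal into a 2-D time-frequency array by
    windowing overlapping frames and taking the FFT magnitude."""
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * np.hanning(frame_len)
        frames.append(np.abs(np.fft.rfft(frame)))
    return np.array(frames).T   # shape: (frequency bins, time frames)

# A healthy engine hum at 50 Hz vs. one with an extra 900 Hz rattle.
sr = 8000
t = np.arange(sr) / sr
healthy = np.sin(2 * np.pi * 50 * t)
faulty = healthy + 0.5 * np.sin(2 * np.pi * 900 * t)

spec_h = spectrogram(healthy)
spec_f = spectrogram(faulty)
print(spec_h.shape)   # (frequency bins, time frames)
```

In the faulty spectrogram, the 900 Hz rattle shows up as a bright horizontal band that is absent in the healthy one; that visual difference is exactly what the deep network learns to pick up.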

Use case: Wind turbine blade damage from drone footage

[Source: Pexels]

According to Bloomberg, there is no simple way to recycle a wind turbine, so it is crucial to prolong the lifespan of wind power plants. Turbines can be hit by lightning and affected by extreme weather and other natural forces.

That’s why we developed a system for our customers that checks rotor blade integrity and damage from drone video footage. The videos are uploaded to the system, and the inspection is done with an object detection model identifying potential problems. Thousands of videos are analyzed in one batch, so we built a workstation (with NVIDIA RTX GPU cards) able to handle such a load.

Ximilar Advantages in Visual AI Quality Control

  • An end-to-end and easy-to-use platform for Computer Vision and Machine Learning, with enterprise-ready features.
  • Processing hundreds of images per second on an average computer.
  • Train your model in the cloud and use it offline in your factory without an internet connection. Thanks to TensorFlow, you can use the model on any computer, edge device, GPU card, or embedded hardware (Raspberry Pi or NVIDIA Jetson connected to a camera). We also provide optimized CPU models on Intel devices through OpenVINO technology.
  • Easily gather more data and teach models on new defects within a day.
  • Evaluation of the independent dataset, and model versioning.
  • A customized yet affordable solution providing the best outcome with pixel-accurate recognition.
  • Advanced image management and annotation platform suitable for creating intelligent vision systems.
  • Image augmentation settings that can be tuned for your problem.
  • Fast machine learning models that can be connected to your industrial camera or smartphone for industrial image processing robust to lighting conditions, object motion, or vibrations.
  • Great team of experts, available to communicate and help.

To sum up, it is clear that artificial intelligence and machine learning are becoming common in the majority of industries working with automation, digital data, and quality or safety control. Machine learning definitely has a lot to offer to the factories with both manual and robotic assembly lines, or even fully automated production, but also to various specialized fields, such as material sciences, pharmaceutical, and medical industry.

Are you interested in creating your own visual control system?

The post Visual AI Takes Quality Control to a New Level appeared first on Ximilar: Visual AI for Business.

]]>
Image Recognition as an Answer to New Energy Labelling https://www.ximilar.com/blog/image-recognition-as-an-answer-to-new-energy-labelling/ Wed, 27 Jan 2021 08:45:30 +0000 https://www.ximilar.com/?p=2736 Discover how image recognition can help e-commerce businesses comply with new EU energy labeling regulations, ensuring a smooth transition.

The post Image Recognition as an Answer to New Energy Labelling appeared first on Ximilar: Visual AI for Business.

]]>
The year 2021 will bring a fundamental change to the energy labelling of household appliances. The updated labelling should be more efficient and intuitive, enabling consumers to make better-informed purchasing decisions. The first large group of goods should be re-labelled by the beginning of March, not only in retail but also in e-shops. Even though this modification benefits buyers, it poses a great challenge to online sellers, to which we at Ximilar have a clever solution.

Upcoming Changes in the EU Energy Labelling

The energy labels indicate the energy efficiency category an appliance falls into. In 2019, the European Union approved a new regulation setting a framework for updated energy labelling, which comes into force in 2021 and will gradually replace the old system of labels. According to European lawmakers, the new system could save up to 200 billion kWh of energy, which is approximately the amount of energy all the Baltic countries together consume in a year. The first new labels are already in circulation.

Effective March 2021, sellers and manufacturers will be required to update the energy labels on fridges, washing machines, dishwashers, TVs, electronic displays, and refrigerating appliances for display purposes, followed by tyres in May, and lamps in September.

So far, the products have fallen into categories A+++ to G, which will be simplified back to A to G and the energy class of a product will be determined by higher standards. This means the appliance that was A+ in 2020 could be B or C from now on.

Re-scaling is not the only new feature, as the new labels are provided with a QR code leading consumers to the EPREL (European Product Registry for Energy Labelling) database, providing them with detailed energy and environmental information on the goods.

A Challenge for E-commerce Industry

The new regulation applies not only to retail but also to e-commerce, meaning all e-shops will be required to re-label the household appliances as well. They will be required to do so between March 1st and 18th.

E-shops need to identify thousands of energy labels in the product galleries and replace them with the new ones.

E-shops generally upload the energy labels as pictures into the galleries on the item pages. Due to the large amounts of images they upload every day, it is not uncommon for these images to be untagged.

To ensure a smooth transition from the old label system to the new one, physical stores will focus on re-labelling the displayed goods. The e-shops, on the other hand, will need to identify and replace considerable amounts of pictures in their databases at once. For instance, Alza.cz, the largest e-shop selling household appliances in the Czech Republic, currently offers approximately 1 200 products in the category of fridges, 500 washing machines, 350 dishwashers, 600 TVs, and 1 200 monitors, meaning they will need to update at least 3 850 energy labels in the first wave.

Many large e-shops also cooperate with price comparison websites, such as Heureka, that have their own item galleries. For such services, the problem is a bit more complex: as a price analysis tool, the comparison website acquires its data from various sellers, meaning its picture tagging or sorting is not standardised, and it has to deal with a wide range of file types and names.

The new EU energy label from 2021.
Example of an old EU energy label in a product gallery at Heureka.cz

Such a task poses a question: what is the most efficient way to identify the old energy labels amongst other images in the product galleries in order to delete and replace them? The solution lies in image recognition software.

Smart Solution: Image Recognition

E-shops with electronics typically upload the energy labels as images into the product galleries on their item pages and provide them to the price comparison websites. Therefore, they need software able to sort the product images, reliably recognize the old energy labels and set them aside.

Image Recognition is one of the core services of Ximilar. In principle, once you upload your images to this service, it equips them with tags and sorts them into categories. This service uses computer vision and deep learning to detect a wide range of features in the pictures. It is designed to process extensive databases of pictures in a fraction of a second.

With Ximilar App, you can develop an AI service directly for energy label recognition.

How to Use the Image Recognition on Energy Labels

If you need to identify and replace the old energy labels in your e-shop, there are two ways to use the Ximilar Energy Label Recognition service:

  1. You can train your own recognition model for energy-label images and then use it as an API endpoint, meaning you send images from the product gallery and get immediate feedback on whether or not they are energy labels.
  2. You can provide us with an export from your product image database (as image URLs or the actual files) and we will take care of the rest for you. You will get the output back in a standard CSV format.
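The first option might look like the sketch below. The endpoint path and field names are assumptions for illustration (consult the Ximilar API documentation for the actual request format), and no request is actually sent here:

```python
import json

# Assumed endpoint and task id -- placeholders, not documented values;
# the real ones come from your Ximilar App account and the API docs.
CLASSIFY_URL = "https://api.ximilar.com/recognition/v2/classify"

def build_classify_request(task_id, image_urls):
    """Build a JSON body asking a trained recognition task whether each
    product-gallery image is an (old) energy label."""
    return {
        "task_id": task_id,
        "records": [{"_url": url} for url in image_urls],
    }

body = build_classify_request(
    "your-task-id",
    [
        "https://example.com/gallery/fridge-1.jpg",
        "https://example.com/gallery/fridge-1-energy-label.png",
    ],
)
# A real script would POST this with your auth token, e.g.:
# requests.post(CLASSIFY_URL, json=body, headers={"Authorization": "Token ..."})
print(json.dumps(body, indent=2))
```

Looping such requests over an export of your product image database gives you, for every gallery image, a label/no-label verdict that can be written straight into a CSV.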

Since image recognition is a CPU/GPU-intensive process, one of the greatest advantages of this service lies in the image database processing on our servers, whether you use the API or leave it to us. Of course, you will have a chance to test the service in the Ximilar App before you run it on your image database.

The energy label recognition with the Ximilar service is an efficient, quick, and above all, reliable way to identify the images that need to be replaced.

With Ximilar, you can develop further models for energy label recognition:

  1. Reliably distinguishing the old energy labels from the new ones. This might be handy in the transition period, when some labels will have already been replaced but others will not.
  2. Reading the actual energy class, especially from the new energy labels. The label change is a great opportunity to enrich your product data with this piece of information.

If you are interested, please just fill out our contact form. We are here to help!

The Image Recognition Service Makes E-commerce Easier

Whether you need to sort your catalogue into fine-grained categories, recognize pictures in product galleries, or offer similar products to your customers, Ximilar has a solution for you.

Read more in this detailed article on Image Recognition uses in e-commerce, or contact us, and we can discuss other solutions tailored to your needs.

The post Image Recognition as an Answer to New Energy Labelling appeared first on Ximilar: Visual AI for Business.

]]>
Introducing Tags, Categories & Image Management https://www.ximilar.com/blog/introducing-tags-categories-image-management/ Tue, 26 Mar 2019 13:02:14 +0000 https://www.ximilar.com/?p=909 With the new tagging tasks, you are able to create even more powerful custom deep learning models and deploy them as API.

The post Introducing Tags, Categories & Image Management appeared first on Ximilar: Visual AI for Business.

]]>
Ximilar not only grows by its customer base, but we constantly learn and add new features. We aim to give you as much comfort as possible — by delivering great user experience and even features that might not have been invented yet. We learn from the AI universe, and we contribute to it in return. Let’s see the feature set added in the early spring of 2019.

New Label Types: Categories & Tags

This one is a major, long-awaited upgrade to our custom recognition system.
 
Until this point, we offered only image categorization, formally: multi-class classification, where every image belongs to exactly one category. That was great for many use cases, but some elaborate ones needed more. So now we introduce Tagging tasks, formally: multi-label classification, where each image can be tagged with multiple labels. Labels correspond to the various features or objects contained in a single picture. Therefore, from this point on, we use strictly categorization or tagging, and not classification anymore.
 
With this change, the Ximilar App starts to differentiate two kinds of labels — Categories and Tags — where each image can be assigned to one Category and/or multiple Tags.
 
 
 
For every Tagging Task that you create, the Ximilar App automatically creates a special tag “<name of the task> – no tags”, where you can put images that contain none of the tags connected to the task. You need to choose the type of task carefully when creating it, as the type cannot be changed later. Other than that, you can work with both types of tasks in the same way.
 
When you want to categorize your images in production, you simply take the category with the highest probability. In the case of tagging, you must set a threshold and take the tags with a probability over this threshold. A general rule of thumb is to take all tags with a probability over 50%, but you can tune this number to fit your use case and data.
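The thresholding step above is a one-liner. This minimal sketch assumes tag predictions arrive as a list of dicts with "name" and "prob" keys, which is a common shape for recognition results but should be checked against the actual API response:

```python
def select_tags(predictions, threshold=0.5):
    """Keep only the tags whose probability exceeds the threshold.

    The 0.5 default reflects the rule of thumb mentioned above; tune it
    per use case, e.g. raise it to reduce false positives.
    """
    return [p["name"] for p in predictions if p["prob"] > threshold]

# Hypothetical real-estate tagging output:
preds = [{"name": "wooden floor", "prob": 0.91},
         {"name": "lamp", "prob": 0.64},
         {"name": "bed", "prob": 0.22}]

select_tags(preds)                 # -> ["wooden floor", "lamp"]
select_tags(preds, threshold=0.9)  # -> ["wooden floor"]
```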
 
With these new features come a few minor API improvements. To keep everything backwards compatible, when you create a Task or Label and do not specify the type, you create a Categorization task with Categories. If you want to learn more about our REST API, which lets you manage almost everything, including training of the models, please check out docs.ximilar.com.
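The backwards-compatibility rule can be illustrated with a small payload builder. The field names ("type") and type values ("multi_class", "multi_label") are assumptions modelled on docs.ximilar.com, so confirm them against the current API reference before use:

```python
import json

def task_payload(name, task_type=None):
    """Build the JSON body for creating a recognition task.

    Omitting the type falls back to a Categorization task, mirroring the
    backwards-compatible default described above. Pass an explicit type
    (e.g. the assumed "multi_label") to create a Tagging task instead.
    """
    payload = {"name": name}
    if task_type is not None:
        payload["type"] = task_type
    return json.dumps(payload)

task_payload("Energy labels")               # no type -> categorization default
task_payload("Room features", "multi_label")  # explicit type -> tagging task
```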

Benefit: Linking Tags with Categories

So hey, we have two types of labels in place. Let’s see what that brings in real use. A typical use case of our customers is that they have two or more tasks defined in the same field/area. For instance, they want to enhance real-estate listings, so they need to:
  1. Automatically categorize photos by room type: living room, bedroom, kitchen, outdoor house. At the same time, also:
  2. Recognize different features/objects in the images: bed, cabinet, wooden floor, lamp, etc.

Until now, customers had to upload training images (often the same ones) separately into each label.

This upgrade makes this much easier. The new Ximilar App section Images allows you to upload images once and assign them to several Categories and Tags. You can easily modify the categories and tags of each image there, either one by one or in bulk. There can be thousands of images in your workspace, so you can also filter images by their tags/categories and batch-process the selected images. We believe that this will speed up the workflow of building reliable data for your tasks.

Improved Search

Some of our customers have hundreds of Labels. With a growing number of projects, it started to be hard to navigate all the Labels, Tags, and Tasks. That is why there is now a search bar at the top of the screen, which helps you find the desired items faster.

Updated Insights

As we mentioned in our last update notes, we offer a set of insights that help you increase the quality of results over time by looking into what works and what does not in your case. To improve the accuracy of your models, you can inspect their details. Please see the article on Confusion Matrix and Failed Images insights, and another one about the Precision/Recall table. We have recently updated the list of Failed Images so that you can modify the categories/tags of these failed images, or delete them, directly.

Upcoming Features

  • Workspaces — to clearly split work in different areas
  • Rich statistics — number of API calls and amount of credits, per task, long-term, monthly, weekly, hourly, and more.
We at Ximilar are constantly working on new features, refactoring the older ones, and listening to your requests and ideas, as we aim to deliver a great service not just out of the box and not only with pre-defined packages, but one that actually meets your needs in real-world applications. You can always write to us and request new API features that will benefit everyone who uses this platform. We will be glad if you share with us how you use Ximilar Recognition in your use cases. Not only will this help us grow as a company, it will also inspire others.
 
We create the Ximilar App as a solid entry point to learn a lot about AI, but our skills mostly benefit custom use cases, where we deliver solutions to narrow-field AI challenges. These are needed far more than the somewhat over-hyped generic tools that just tell you this is a banana and that is an apple.

The post Introducing Tags, Categories & Image Management appeared first on Ximilar: Visual AI for Business.

]]>