Explainable AI: What is My Image Recognition Model Looking At?
https://www.ximilar.com/blog/what-is-your-image-recognition-looking-at/ | 7 December 2021

With the AI Explainability in Ximilar App, you can see which parts of your images are the most important to your image recognition models.

There are many challenges in machine learning, and developing a good model is one of them. Even though neural networks are very powerful, they have a great weakness. Their complexity makes it hard to understand how they reach their decisions. This might be a problem when you want to move from development to production, and it might eventually cause your whole project to fail. But how can you measure the success of a machine learning model? The answer is not easy. In our opinion, the model must excel in a production environment and should work reliably in both common and uncommon situations.

However, even when the results in production are good, there are areas where we can’t simply accept black-box decisions without being sure how the AI made them. These areas are typically medicine, biotech, or any other field where there is no place for errors. We need to make sure that both the output and the way our model reached its decision make sense – we need explainable AI. For these reasons, we introduced a new feature to our Image Recognition service called Explain.

Training Image Recognition

Image Recognition is a Visual AI service enabling you to train custom models to recognize images or objects in them. In Ximilar App, you can use Categorization & Tagging and Object Detection, which can be combined with Flows. For example, the first task will detect all the human models in the image and the categorization & tagging tasks will categorize and tag their clothes and accessories.

Image recognition is a very powerful technology, bringing automation to many industries. It requires well-trained models, and, in the case of object detection, precise data annotation. If you are not familiar with using image recognition on our platform, please try to set up your own classifier first.
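Once a classifier is trained, it can also be called over the REST API. The following Python sketch is illustrative only: the endpoint, header format, and payload fields shown here are assumptions based on typical recognition APIs, so check the official Ximilar API documentation for the exact contract.

```python
import requests

# Hypothetical sketch of calling a trained recognition task over REST.
# The endpoint and payload shape are assumptions for illustration only.
API_URL = "https://api.ximilar.com/recognition/v2/classify"  # assumed endpoint
headers = {
    "Authorization": "Token YOUR_API_TOKEN",  # placeholder credential
    "Content-Type": "application/json",
}
payload = {
    "task_id": "YOUR_TASK_ID",  # placeholder task identifier
    "records": [{"_url": "https://example.com/photo-of-tshirt.jpg"}],
}

response = requests.post(API_URL, headers=headers, json=payload)
response.raise_for_status()
for record in response.json().get("records", []):
    # Each record is expected to carry the predicted labels with probabilities.
    print(record.get("best_label"), record.get("labels"))
```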

These resources should be helpful in the beginning:

From model-centric to data-centric with explainable AI

Explaining which areas are important for the leaf disease recognition model when predicting a label called “canker”.

When you want a model that performs great in a production setting and has high accuracy, you need to focus on your training data first. Consistency of labelling, cleaning datasets of unnecessary samples/labels, and adding missing feature-rich samples are much more important than the newest neural network architecture. Andrew Ng, an entrepreneur and professor at Stanford, also promotes this approach to building machine learning models.

The Explain feature in our App tells you:

  • which parts of images (features and pixels) are important for predicting specific labels
  • for which images the model will probably predict the wrong results
  • which samples should be added to your training dataset to improve performance

Simple Example: T-shirt or Not?

Let’s look at this simple example of how explainable AI can be useful. Let’s say we have a task containing two categories – t-shirts and shoes. For a start, we have 20 images in each category. It is definitely not enough for production, but it is enough if you want to experiment and learn.

Our neural network, trained with the Ximilar SaaS platform, has two labels: shoes and t-shirt.

After playing with the advanced options and short training, the result seems really promising:

Using Explain on a Training Image

But did the model actually learn what we wanted? To check what the neural network finds important when categorizing our images, we will apply two different methods with the Explain tool:

  • Grad-CAM (first published in 2016) – this method is very fast, but the results are not very precise
  • Blur Integrated Gradients (published in 2020) smoothed with SmoothGrad – this method provides much more details, but at the cost of computational time
Grad-CAM result of the Explain feature. As you can see, the model is looking mostly at the head/face.

Blur Integrated Gradients result: the most important features are the head/face, similar to what Grad-CAM is telling us.
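For readers curious about the mechanics, here is a minimal Python sketch of the Grad-CAM method, written against TensorFlow/Keras. This is a generic textbook implementation, not Ximilar’s internal code; the layer name and class index are parameters you would supply for your own model.

```python
import tensorflow as tf

def grad_cam(model, image, last_conv_layer_name, class_index):
    """Return a coarse heatmap of the regions that drove one class score.

    A minimal Grad-CAM sketch: `model` is any Keras CNN classifier and
    `image` a preprocessed batch of shape (1, H, W, 3).
    """
    # Build a model that outputs the last conv activations and the
    # final predictions at the same time.
    grad_model = tf.keras.Model(
        model.inputs,
        [model.get_layer(last_conv_layer_name).output, model.output],
    )
    with tf.GradientTape() as tape:
        conv_output, predictions = grad_model(image)
        class_score = predictions[:, class_index]
    # Gradient of the class score w.r.t. the conv feature map.
    grads = tape.gradient(class_score, conv_output)
    # Average gradients over the spatial dimensions -> per-channel weights.
    weights = tf.reduce_mean(grads, axis=(0, 1, 2))
    # Weighted sum of the feature-map channels, kept positive and normalized.
    heatmap = tf.reduce_sum(conv_output[0] * weights, axis=-1)
    heatmap = tf.maximum(heatmap, 0) / (tf.reduce_max(heatmap) + 1e-8)
    return heatmap.numpy()
```

The heatmap is computed at the resolution of the last convolutional layer, which is why Grad-CAM is fast but coarse, exactly the trade-off described in the list above.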

In this case, both methods clearly demonstrate the problem of our model. The focus is not on the t-shirt itself, but on the head of the person wearing it. In the end, it was easier for the learning algorithm to distinguish between the two categories using this feature instead of focusing on the t-shirt. If we look at the training data for the t-shirt label, we can see that all the pictures include a person with a visible face.

Training data for the T-shirt label of the image recognition task. This small dataset contains only photos with visible faces, which can be a problem.

Explainability After Adding New Data

The solution might be adding more varied training data and introducing images without a person. Generally, it’s a good approach to start with a small dataset and increase it over time. Adding visually varied images helps prevent the model from overfitting on the wrong features. So we added more photos to the label and trained the model again. Let’s see what the results look like with the new version of our model:

After retraining the model on new data, we can see an improvement in which features the neural network is looking for.

The Grad-CAM result on the left is not very convincing in this case. The image on the right shows the result of Blur Integrated Gradients. Here you can see how the focus moved from the head to the t-shirt. It seems like the head still plays some part, but there is much less focus on it.

Both methods for explainable AI have their drawbacks, and sometimes we have to try more pictures to get a better understanding of model behaviour. We also need to mention one important point. Due to the way the algorithm works, it tends to prefer edges, which is clearly visible in the examples.

Summary

The explainability and interpretability of neural networks is a big research topic, and we are looking forward to adopting and integrating more techniques into our SaaS AI solution. The AI Explainability feature we showed you is only one of many tools on the road towards data-centric AI.

If you run into any trouble, do not hesitate to contact us. The machine learning specialists at Ximilar have vast experience with many kinds of problems and are always happy to help you with yours.

Visual AI Takes Quality Control to a New Level
https://www.ximilar.com/blog/visual-ai-takes-quality-control-to-a-new-level/ | 24 February 2021

A comprehensive guide to automated visual industrial quality control with AI and machine learning, from image recognition to anomaly detection.

Have you heard about The Big Hack? The Big Hack story was about a tiny probe (a small chip) inserted on computer motherboards by Chinese manufacturing companies. Attackers could then infiltrate any server workstation containing these motherboards, many of which were installed in large US-based companies and government agencies. The thing is, the probes were so small, and the motherboards so complex, that they were almost impossible to spot with the human eye. You can take this post as a guide to the latest trends of AI in industry, with a primary focus on AI-based visual inspection systems.

AI Adoption by Companies Worldwide

Let’s start with some interesting stats and news. The expansion of AI and machine learning is becoming common across numerous industries. According to this report by Stanford University, AI adoption is increasing globally. More than 50 % of respondents said their companies were using AI, and adoption growth was greatest in the Asia-Pacific region. Some people refer to the automation of factory processes, including digitalization and the use of AI, as the Fourth Industrial Revolution (the so-called Industry 4.0).

AI adoption by industry and function [Source: AI Index 2019 Report]

The data show that the automotive industry is the largest adopter of AI in manufacturing, making heavy use of machine learning, computer vision, and robotics. Other industries, such as pharma or infrastructure, use computer vision in their production lines as well. Financial services, on the other hand, use AI mostly in operations, marketing & sales (with a focus on Natural Language Processing – NLP).

AI technologies per industry [Source]

The MIT Technology Review cited leading artificial intelligence expert Andrew Ng, who has helped tech giants like Google implement AI solutions, saying that factories are AI’s next frontier. For example, while it would be difficult to inspect parts of electronic devices with our eyes, the cheap camera of the latest Android phone or iPhone can provide high-resolution images that can be connected to any industrial system.

Adopting AI brings major advantages, but also potential risks that need to be mitigated. It is no surprise that companies are mainly concerned about the cybersecurity of such systems. Imagine you could lose a billion dollars if your factory stopped working (as Honda did in this case). Other obstacles are potential errors in machine learning models. There are techniques for discovering such errors, such as the explainability of AI systems. As of now, the explainability of AI is a concern for only 19 % of companies, so there is room for improvement. Getting insight from the algorithms can improve the processes and the quality of the products. Besides security, there are also political & ethical questions (e.g., job replacement or privacy) that companies are worried about.

This survey by McKinsey & Company brings interesting insights into Germany’s industrial sector. It demonstrates the potential of AI for German companies in eight use cases, one of which is automated quality testing. The expected benefit is a 50% productivity increase due to AI-based automation. Needless to say, Germany is a bit ahead with its AI implementation strategy – German institutions have already made several plans to create standardised AI systems with better interoperability, defined security standards, quality criteria, and test procedures.

Highly developed economies like Germany, with a high GDP per capita and challenges such as a quickly ageing population, will increasingly need to rely on automation based on AI to achieve GDP targets.

McKinsey & Company

Another study by PwC predicts that the total expected economic impact of AI in the period until 2030 will be about $15.7 trillion. The greatest economic gains from AI are expected in China (26% higher GDP in 2030) and North America.

What is Visual Quality Control?

The human visual system is naturally very selective in what it perceives, focusing on one thing at a time and not actually seeing the whole image (direct vs. peripheral view). Cameras, on the other hand, see all the details, at the highest resolution possible. Stories like The Big Hack therefore show us the importance of visual control, not only to ensure quality but also safety. That is why several companies and universities have decided to develop optical inspection systems employing machine learning methods able to detect the tiniest difference from a reference board.

Motherboards by Super Micro [Source: Scott Gelber]

In general, visual quality control is a method or process to inspect equipment or structures to discover defects, damages, missing parts, or other irregularities in production or manufacturing. It is an important method of confirming the quality and safety of manufactured products. Optical inspection systems are mostly used for visual quality control in factories and assembly lines, where the control would be hard or ineffective with human workers.

What Are the Main Benefits of Automatic Visual Inspection?

Here are some of the essential aspects and reasons, why automatic visual inspection brings a major advantage to businesses:

  • The human eye is imprecise – Even though our visual system is a magnificent thing, it needs a lot of “optimization” to be effective, making it prone to optical illusions. The focused view can miss many details, and our visible spectrum is limited (380–750 nm), therefore unable to capture near-infrared (NIR) wavelengths (source). Cameras and computer systems, on the other hand, can be calibrated to different conditions, making them more suitable for highly precise analyses.
  • Manual checking – Checking items one by one is a time-consuming process. Smart automation allows more items to be processed and checked, faster. It also reduces the number of defective items that are released to customers.
  • The complexity – Some assembly lines can produce thousands of various products of different shapes, colours, and materials. For humans, it can be very difficult to keep track of all possible variations.
  • Quality – Providing better and higher quality products by reducing defective items and getting insights into the critical parts of the assembly line.
  • Risk of damage – Machine vision can reduce the risk of item damage and contamination by a person.
  • Workplace safety – Making the work environment safer by inspecting it for potentially dangerous situations (e.g. detecting protective wearables such as safety helmets on construction sites), inspection in radioactive or biohazard environments, detection of fire, COVID face masks, and many more.
  • Saving costs – Labour can be pretty expensive in the Western world.
    For example, the average quality control inspector salary in the US is about 40k USD. Companies consider numerous options when cutting costs, such as moving factories to other countries, streamlining operations, or replacing workers with robots. And as I said before, this goes hand in hand with some political & ethical questions. I think the most reasonable solution in the long term is the cooperation of workers with robotic systems. This will make the process more robust, reliable, and effective.
  • Costs of AI systems – Sooner or later, modern technology and automation will be common in all companies (startups as well as enterprises). The adoption of AI-based automation solutions will make the transition more affordable.

Where is Visual Quality Control Used?

Let’s take a look at some of the fields where the AI visual control helps:

  • Cosmetics – Inspection of beauty products for defects and contaminations, colour & shape checks, controlling glass or plastic tubes for cleanliness and rejecting scratched pieces.
  • Pharma & Medical – Visual inspection for pharmaceuticals: rejecting defective and unfilled capsules or tablets or the filling level of bottles, checking the integrity of items; or surface imperfections of medical devices. High-resolution recognition of materials.
  • Food Industry and Agriculture – Food and beverage inspection for freshness. Label print/barcode/QR code control of presence or position.

A great example of industrial IoT is this story about a Japanese cucumber farmer who developed a monitoring system for quality check with deep learning and TensorFlow.

  • Automotive – Examination of forged metallic parts, plastic parts, cracks, stains or scratches in the paint coating, and other surface and material imperfections. Monitoring quality of automotive parts (tires, car seats, panels, gears) over time. Engine monitoring and predictive autonomous maintenance.
  • Aerospace – Checking for the presence and quality of critical components and material, spotting the defective parts, discarding them, and therefore making the products more reliable.
  • Transportation – Rail surface defects control (example), aircraft maintenance check, or baggage screening in airports – all of them require some kind of visual inspection.
  • Retail/Consumer Goods & Fashion – Checking assembly-line items made of plastics, polymers, wood, and textiles, as well as packaging. Visual quality control can be deployed throughout the manufacturing process of the goods, sorting out imprecise products.
  • Energy, Mining & Heavy Industries – Detecting cracks and damage in wind blades or solar panels, visual control in nuclear power plants, and many more.

It’s interesting to see that more and more companies choose collaborative platforms such as Kaggle to solve specific problems. In 2019, a contest on Kaggle organized by the Russian company Severstal produced dozens of solutions for the steel defect detection problem.

Image of flat steel defects from the Severstal competition. [Source: Kaggle]
  • Other, e.g. safety checks – checking whether people are present in specific zones of the factory and whether they wear helmets, or stopping the robotic arm if a worker is located nearby.

The Technology Behind AI Quality Control

There are several different approaches and technologies that can be used for visual inspection on production lines. The most common approaches nowadays use some kind of neural network model.

Neural Networks – Deep Learning

Neural networks (NN) are computational models that accept input data and output relevant information. To make a neural network useful (i.e., to find the weights of the connections between neurons and layers), we need to feed the network with some initial training data.

The advantage of using neural networks is their power to build internal representations of the training data, which leads to the best performance in computer vision compared to other machine learning models. However, this power brings challenges, such as computational demands, overfitting, and others.
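To make this concrete, below is a minimal illustrative Keras sketch of such a network for a binary OK/DEFECT decision. The architecture and input size are arbitrary choices for the example, not the production models used on our platform.

```python
import tensorflow as tf
from tensorflow.keras import layers

# A minimal illustrative CNN for binary OK/DEFECT classification --
# just the general shape of such a model, not a production architecture.
model = tf.keras.Sequential([
    layers.Input(shape=(224, 224, 3)),        # assumed input resolution
    layers.Conv2D(32, 3, activation="relu"),  # learn low-level features
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),  # learn higher-level features
    layers.MaxPooling2D(),
    layers.GlobalAveragePooling2D(),
    layers.Dense(1, activation="sigmoid"),    # probability of DEFECT
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```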

[Un|Semi|Self] Supervised Learning

If a machine-learning algorithm (NN) requires ground-truth labels, i.e. annotations, then we are talking about supervised learning. If not, then it is an unsupervised method, or something in between – a semi- or self-supervised method. However, building an annotated dataset is much more expensive than simply obtaining data with no labels. The good news is that the latest research in neural networks tackles these problems with unsupervised learning.

On the left is the original item without any defects; on the right, a slightly damaged one. If we know the labels (OK/DEFECT), we can train a supervised machine-learning algorithm. [Source: Kaggle]

Here is a list of common services and techniques for visual inspection (a minimal image-matching sketch follows the list):

  • Image Recognition – Simple neural network that can be trained for categorization or error detection on products from images. The most common architectures are based on convolution (CNN).
  • Object Detection – Model able to predict the exact position (bounding box) of specific parts. Suitable for defect localization and counting.
  • Segmentation – More complex than object detection, image segmentation gives you a pixel-level prediction.
  • Image Regression – Predict a single continuous value from the image, for example, the level of wear of an item.
  • Anomaly Detection – Shows which image contains an anomaly and why. Mostly done with GANs or Grad-CAM.
  • OCR – Optical Character Recognition is used for getting and reading text from images.
  • Image matching – Matching the picture of the product to the reference image and displaying the difference.
  • Other – There are also other solutions that do not require data at all, most of the time using some simple, yet powerful computer vision technique.
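As a concrete illustration of the image matching item above, here is a minimal OpenCV sketch that compares a product photo against a reference image and flags the part when too many pixels differ. The file names and thresholds are placeholders; a real pipeline would also align the images first.

```python
import cv2
import numpy as np

# Align-free pixel comparison of a product photo against a reference image.
reference = cv2.imread("reference_board.png", cv2.IMREAD_GRAYSCALE)
inspected = cv2.imread("inspected_board.png", cv2.IMREAD_GRAYSCALE)
inspected = cv2.resize(inspected, (reference.shape[1], reference.shape[0]))

# Absolute per-pixel difference, blurred to suppress sensor noise.
diff = cv2.absdiff(reference, inspected)
diff = cv2.GaussianBlur(diff, (5, 5), 0)

# Threshold the difference map and flag the part if too many pixels differ.
_, mask = cv2.threshold(diff, 40, 255, cv2.THRESH_BINARY)
defect_ratio = np.count_nonzero(mask) / mask.size
print("DEFECT" if defect_ratio > 0.01 else "OK",
      f"({defect_ratio:.2%} pixels differ)")
```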

If you would like to dive a bit deeper into the process of building a model, you can check my posts on Medium, such as How to detect defects on images.

Typical Types and Sources of Data for Visual Inspection

Common Data Sources

Thermal imaging example [Source: Quality Magazine]

RGB images – The most common data type and the easiest to get. A simple 1080p camera that you can connect to a Raspberry Pi costs about $25.
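Getting frames from such a camera into an inspection pipeline can be as simple as the following OpenCV sketch (the device index and file name are placeholders):

```python
import cv2

# Minimal sketch: grab a frame from a cheap USB camera (e.g. on a Raspberry Pi)
# so it can be handed to an inspection model.
camera = cv2.VideoCapture(0)  # device index 0 is an assumption
try:
    ok, frame = camera.read()
    if ok:
        cv2.imwrite("line_snapshot.jpg", frame)  # pass this file to the model
finally:
    camera.release()
```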

Thermography – Thermal quality control via infrared cameras, mostly used to detect flaws under the surface that are not visible to simple RGB cameras, as well as for gas imaging, fire prevention, and analyzing how electronics behave under different conditions. If you want to know more, I recommend reading the articles in Quality Magazine.

3D scanning, Lasers, X-ray, and CT scans – Creating 3D models from special depth scanners gives you a better insight into material composition, surface, shape, and depth.

Microscopy – Due to the rapid development and miniaturization of technologies, sometimes we need a more detailed and precise view. Microscopes can be used in an industrial setting to ensure the best quality and safety of products. Microscopy is used for visual inspection in many fields, including material sciences and industry (stress fractures), nanotechnology (nanomaterial structure), or biology & medicine. There are many microscopy methods to choose from, such as stereomicroscopy, electron microscopy, opto-digital or purely digital microscopes, and others.

Common Inspection Errors

  • scratches
  • patches
  • knots, shakes, checks, and splits in the wood
  • crazing
  • pitted surface
  • missing parts
  • label/print damage
  • corrosion
  • coating nonuniformity
Surface crazing and cracking on brake discs [source], crazing in polymer-grafted nanoparticle film [source], and wood shakes [source].

Examples of Datasets for Visual Inspection

  • Severstal Kaggle Dataset – A competition for the detection of defects on flat sheet steel.
  • MVTec AD – 5000 high-resolution annotated images of 15 items (divided into defective and defect-free categories).
  • Casting Dataset – Casting is a manufacturing process in which a liquid material is usually poured into a form/mould. About 7 thousand images of submersible pump defects.
  • Kolektor Surface-Defect Dataset – Dataset of microscopic fractions or cracks in electrical accumulators.
  • PCB Dataset – Annotated images of printed circuit boards.

AI Quality Control Use Cases

We talked about a wide range of applications for visual control with AI and machine learning. Here are three of the use cases for industrial image recognition we worked on in 2020. All these cases required automatic optical inspection (AOI) and partial customization when building the model, working with different types of data and deployment (cloud / on-premise instance / smartphone). We are glad that during the COVID-19 pandemic, our technologies helped customers keep their factories open.

Our typical workflow for a customized solution is the following:

  1. Setup, Research & Plan: If we don’t know how to solve the problem from the initial call, our Machine Learning team does the research and finds the optimal solution for you.
  2. Gathering Data: We sit with your team and discuss what kind of data samples we need. If you can’t acquire and annotate data yourself, our team of annotators will work on obtaining a training dataset.
  3. First prototype: Within 2–4 weeks we prepare the first prototype or proof of concept. The proof of concept is a lightweight solution for your problem. You can test it and evaluate it by yourself.
  4. Development: Once you are satisfied with the prototype results, our team can focus on the development of the full solution. We work mostly in an iterative way improving the model and obtaining more data if needed.
  5. Evaluation & Deployment: If the system performs well and meets the criteria set up in the first calls (mostly some evaluation on the test dataset and speed performance), we work on the deployment. It can be used in our cloud, on-premise, or embedded hardware in the factory. It’s up to you. We can even provide a source code so your team can edit it in the future.

Use case: Image recognition & OCR for wood products

One of our customers contacted us with a request to build a system for the categorization and quality control of wooden products. With the Ximilar Platform, we were able to easily develop and deploy a camera system over the assembly line that sorts the products into bins. The system can identify defective print on the products with optical character recognition (OCR) technology, and the surface control of the wood texture is handled by a separate model.

Printed text on wood [Source: Ximilar]

The technology is connected to a simple smartphone/tablet camera in the factory and can handle tens of products per second. This way, our customer was able to reduce rework and manual inspections, which led to savings of thousands of USD per year. This system was built with the Ximilar Flows service.
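The post does not say which OCR engine powers this system. Purely as an illustration, the open-source Tesseract engine (via pytesseract) can read printed text from a product photo in a few lines; the file name and expected print are placeholders:

```python
import pytesseract
from PIL import Image

# Illustrative OCR check: read the print on a wooden product and compare it
# against the expected text. This is a generic Tesseract example, not the
# engine actually used in the deployed system.
image = Image.open("wood_product_print.jpg")  # placeholder file name
text = pytesseract.image_to_string(image)
print("Print matches expected label:",
      text.strip() == "EXPECTED-PRINT-1234")  # placeholder expected print
```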

Use case: Spectrogram analysis from car engines

Another project we successfully deployed was the detection of malfunctioning engines. We did it by transforming the sound input from the car into an image spectrogram. After that, we trained a deep neural network that recognises problematic car engines and can tell you the engine’s specific problem.

The good news is that this system can also detect anomalies in an unsupervised way (no need for data labelling) with the GAN technology.

Spectrogram from Engine [Source: Ximilar]
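To illustrate the sound-to-image step described above, here is a minimal sketch using the open-source librosa library to turn an engine recording into a log-mel spectrogram image. The library choice, file name, and parameters are illustrative assumptions, not our exact pipeline:

```python
import librosa
import librosa.display
import matplotlib.pyplot as plt
import numpy as np

# Turn an engine recording into a log-mel spectrogram a CNN can classify.
audio, sample_rate = librosa.load("engine_recording.wav", sr=None)  # placeholder file
mel = librosa.feature.melspectrogram(y=audio, sr=sample_rate, n_mels=128)
mel_db = librosa.power_to_db(mel, ref=np.max)  # decibel scale, as in the figure

librosa.display.specshow(mel_db, sr=sample_rate, x_axis="time", y_axis="mel")
plt.savefig("engine_spectrogram.png")  # this image becomes the model's input
```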

Use case: Wind turbine blade damage from drone footage

[Source: Pexels]

According to Bloomberg, there is no simple way to recycle a wind turbine, so it is crucial to prolong the lifespan of wind power plants. The blades can be hit by lightning and affected by extreme weather and other natural forces.

That’s why we developed a system for our customers that checks rotor blade integrity and damage using drone video footage. The videos are uploaded to the system, and the inspection is done with an object detection model identifying potential problems. Thousands of videos are analyzed in one batch, so we built a workstation (with NVIDIA RTX GPU cards) able to handle such a load.
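A batch pipeline like this typically walks through the footage frame by frame and hands a subsample of frames to the detector. Below is a minimal sketch of that loop; `detect_damage` is a hypothetical stand-in for the trained object detection model, and the file name and sampling rate are placeholders:

```python
import cv2

def detect_damage(frame):
    """Hypothetical stand-in for the trained blade-damage detector."""
    return []  # the real model would return bounding boxes of damaged areas

video = cv2.VideoCapture("blade_inspection_flight.mp4")  # placeholder file
frame_index = 0
while True:
    ok, frame = video.read()
    if not ok:
        break  # end of footage
    if frame_index % 30 == 0:  # analyze roughly one frame per second at 30 fps
        detections = detect_damage(frame)
        if detections:
            print(f"Potential damage at frame {frame_index}: {detections}")
    frame_index += 1
video.release()
```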

Ximilar Advantages in Visual AI Quality Control

  • An end-to-end and easy-to-use platform for Computer Vision and Machine Learning, with enterprise-ready features.
  • Processing hundreds of images per second on an average computer.
  • Train your model in the cloud and use it offline in your factory without an internet connection. Thanks to TensorFlow, you can use the model on any computer, edge device, GPU card, or embedded hardware (Raspberry Pi or NVIDIA Jetson connected to a camera). We also provide CPU-optimized models for Intel devices through OpenVINO technology (a minimal offline-inference sketch follows this list).
  • Easily gather more data and teach models on new defects within a day.
  • Evaluation on an independent dataset, and model versioning.
  • A customized yet affordable solution providing the best outcome with pixel-accurate recognition.
  • Advanced image management and annotation platform suitable for creating intelligent vision systems.
  • Image augmentation settings that can be tuned for your problem.
  • Fast machine learning models that can be connected to your industrial camera or smartphone for industrial image processing robust to lighting conditions, object motion, or vibrations.
  • Great team of experts, available to communicate and help.
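As an illustration of the offline deployment mentioned in the list, here is a minimal sketch of loading an exported TensorFlow model and classifying a single image on a factory machine. The directory name, input size, and scaling are assumptions for the example, not the exact export format:

```python
import numpy as np
import tensorflow as tf

# Minimal sketch of offline, in-factory inference with a model exported from
# the cloud as a TensorFlow SavedModel.
model = tf.keras.models.load_model("exported_inspection_model")  # placeholder dir

image = tf.io.read_file("part_under_inspection.jpg")  # placeholder file
image = tf.image.decode_jpeg(image, channels=3)
image = tf.image.resize(image, (224, 224)) / 255.0  # assumed input size/scale

prediction = model.predict(np.expand_dims(image.numpy(), axis=0))
print("Defect probability:", float(prediction[0][0]))
```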

To sum up, it is clear that artificial intelligence and machine learning are becoming common in the majority of industries working with automation, digital data, and quality or safety control. Machine learning definitely has a lot to offer to the factories with both manual and robotic assembly lines, or even fully automated production, but also to various specialized fields, such as material sciences, pharmaceutical, and medical industry.

Are you interested in creating your own visual control system?

Insights to Help You Understand Your Visual Content
https://www.ximilar.com/blog/vize-ai-new-insights-to-help-you-understand-your-visual-content/ | 22 August 2018

New features for evaluating image recognition models in the Ximilar App platform. Confusion matrix, failed images, and fancy charts included!

This update is huge! Even though this summer in Europe is really hot, we are working hard to make Vize a tool that helps you understand and improve your results more than any other similar platform.

As we promised in the previous update of Ximilar App, here are new features that will save you time and significantly lower your stress levels.

More Tools for Developers

Vize users keep asking for tools to help them inspect and debug their classifiers. As machine learning experts, we know how hard it can sometimes be to build a reliable model – it is a tough challenge even for professionals. Our goal is to make Vize as simple as possible, while staying focused on those of you who use our API and develop tools on your side.

Now you can examine your Tasks and Models even more deeply. We have added a Model Screen, where you can find four tools to help you improve your classifiers. We have used these tools ourselves in the custom machine-learning solutions we build for our customers, and we believe you will love them as much as we do. To see the features, click the Detail button in the list of models on the Task Screen.

Insights can be accessed via the Detail button.

Confusion Matrix

The Confusion Matrix is a well-known visualisation technique to help you understand which labels are commonly mixed up (confused). Imagine that we want to build an animal classifier with four labels: cat, dog, parrot, and bird (various other kinds). Cats will likely be confused with dogs, and parrots will more likely be mixed up with birds. This is a very simple example, but in a more complex scenario, this chart will help you pinpoint exactly which labels interfere the most.

Confusion Matrix with 4 images

The value of each square represents the percentage of pictures that belong to the ground-truth label (rows) and were classified as the predicted label (columns). The higher the percentage, the darker the colour.

An ideal confusion matrix has all diagonal squares, from top left to bottom right, dark (high percentage) and all other squares light (low percentage). Vize computes the confusion matrix on a testing set, which is approximately 20 % of all images in the Task.
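For the curious, the computation described above can be reproduced in a few lines with scikit-learn; the labels and predictions below are made-up examples:

```python
from sklearn.metrics import confusion_matrix

# Rows are ground-truth labels, columns are predictions, normalized so each
# row sums to 1 -- i.e. the percentages shown in the chart.
y_true = ["cat", "cat", "dog", "parrot", "bird", "bird", "parrot", "dog"]
y_pred = ["cat", "dog", "dog", "bird", "bird", "parrot", "parrot", "dog"]
labels = ["cat", "dog", "parrot", "bird"]

matrix = confusion_matrix(y_true, y_pred, labels=labels, normalize="true")
print(matrix)  # matrix[i][j]: fraction of true labels[i] predicted as labels[j]
```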

Failed Images

Another feature to help you understand what is happening inside your Task is an overview of failed images. Vize will show you some of the images that your classifier misclassified after training. With such an overview, you can clearly see that some images are quite hard, and that maybe your Task needs more similar images added to some labels.


Vize computes misclassified images after the final training (on the training & test sets) over all of your data. That is also why the Confusion Matrix and Failed Images can differ. Thanks to Failed Images, your classifier becomes more transparent, and you will know better how to tweak the Task to get better results.

A dialogue with information for each failed image

In this example, our animal classifier has failed on some images. The first picture should be classified as a bird, but our model predicts that it is a parrot. We can see that the first picture is indeed a bit more colourful; that is probably why our model made an error and classified the image as a parrot. Another possible explanation could be that some birds have many features in common with parrots. We have overfitted our small task, and we need to add more similar images to our dataset.

More Charts

Quite commonly, user tasks have several labels, and it can be difficult to see that the data is imbalanced. That means the number of images for some labels can be much higher than for all the others, or one of the labels has only a small percentage of images while all the other labels are quite well covered.

“Anything” has a few more images than “Trailer”

Our optimisation algorithm can handle this quite well, but we still recommend you balance your data so you will get the best results.

Uploading more images to only some labels can decrease the overall accuracy of the classifier; however, with more images overall, your classifier will produce more stable results across all labels. We recommend always keeping some data aside that is never uploaded to Vize, so you can test your models on it. Check out the Images per Label section with its pie chart, where you can see how well balanced your dataset is.
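If you keep your dataset outside the App as well, the same balance check is easy to reproduce; here is a minimal sketch with made-up counts:

```python
from collections import Counter
import matplotlib.pyplot as plt

# Count the images per label and plot their shares, like the Images per Label
# pie chart. The label names and counts here are made up for illustration.
images_per_label = Counter({"Anything": 420, "Trailer": 310})

plt.pie(
    list(images_per_label.values()),
    labels=list(images_per_label.keys()),
    autopct="%1.1f%%",  # show each label's share of the dataset
)
plt.title("Images per Label")
plt.savefig("images_per_label.png")
```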

Follow us on Facebook & LinkedIn to get more insights.
