Custom Object Detection - Ximilar: Visual AI for Business
https://www.ximilar.com/blog/tag/custom-object-detection/

How to deploy object detection on Nvidia Jetson Nano
https://www.ximilar.com/blog/how-to-deploy-object-detection-on-nvidia-jetson-nano/
Mon, 18 Oct 2021

We developed a computer vision system for object detection, counting, and tracking on Nvidia Jetson Nano.

The post How to deploy object detection on Nvidia Jetson Nano appeared first on Ximilar: Visual AI for Business.

At the beginning of summer, we received a request for a custom project: a camera system in a factory located in Africa. The project was about detecting, counting, and visually checking the quality of items on conveyor belts with the help of visual AI. So we developed a complex system with neural networks running on a small computer called Jetson Nano. If you are curious about how we did it, this article is for you. And if you need help with building a similar solution for your factory, our team and tools are here for you.

What is NVIDIA Jetson Nano?

There were two reasons why using our API was not an option. First, the factory has unstable internet connectivity. Second, the entire solution needs to run in real time. So we chose to experiment with embedded hardware that can be deployed in such an environment, and we are very glad that we found the Nvidia Jetson Nano.


Jetson Nano is an amazing small computer (embedded or edge device) built for AI. It allows you to do machine learning very efficiently with low power consumption (about 5 watts). It can be a part of IoT (Internet of Things) systems, runs Ubuntu Linux, and is suitable for simple robotics or computer vision projects in factories. However, if you know that you will need to detect, recognize, and track dozens of different labels, choose a higher-end Jetson device, such as the Xavier. It is a much faster device than the Nano and can solve more complex problems.

What is Jetson Nano good for?

Jetson is great if:

  • You need real-time analysis
  • Your problem can be solved with one or two simple models
  • You need a budget solution that is cost-effective to run
  • You want to connect it to a static camera – for example, monitoring an assembly line
  • The system cannot be connected to the internet – for example, because your factory is in a remote place or for security reasons

The biggest challenges in Africa and South Africa remain connectivity and accessibility. AI systems that can run in-house and offline have great potential in such environments.

Deloitte: Industry 4.0 – Is Africa ready for digital transformation?

Object Detection with Jetson Nano

If you need real-time object detection processing, use the Yolo-V4-Tiny model proposed in the AlexeyAB/darknet repository. Other, more powerful architectures are available as well. Here is a table of the FPS you can expect when running Yolo-V4-Tiny on a Jetson:

Architecture       mAP @ 0.5   FPS
yolov4-tiny-288    0.344       36.6
yolov4-tiny-416    0.387       25.5
yolov4-288         0.591       7.93

Source: GitHub
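In the table above, mAP @ 0.5 is the mean average precision where a predicted box counts as correct only if its intersection over union (IoU) with a ground-truth box is at least 0.5. As a quick illustration of that metric (a minimal sketch, not how darknet computes it internally), with boxes given as (x1, y1, x2, y2):

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Two 100x100 boxes overlapping by half of each: IoU = 5000 / 15000
print(iou((0, 0, 100, 100), (50, 0, 150, 100)))  # → 0.3333333333333333
```

A detection with IoU below the 0.5 threshold is counted as a miss, which is why a model with sloppy boxes scores a lower mAP even when it finds every object.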

After the model's training is completed, the next step is converting the weights to the TensorRT runtime. TensorRT makes a substantial difference in speed on Jetson Nano. So train the model with AlexeyAB/darknet and then convert it with the tensorrt_demos repository. The conversion has multiple steps: you first convert the darknet Yolo weights to ONNX, and then convert the ONNX model to TensorRT.

There is always a trade-off between accuracy and speed. If you do not require a fast model, we also have good experience with CenterNet, which can achieve a really nice mAP with precise boxes. However, when run with TensorFlow or PyTorch backends, it is slower than Yolo models in our experience. Luckily, we can train both architectures and export them in a format suitable for the Nvidia Jetson Nano.

Image Recognition on Jetson Nano

For any image categorization problem, I would recommend using a simple architecture such as MobileNetV2. You can, for example, select a depth multiplier of 0.35 and an image resolution of 128×128 pixels. This way, you can achieve great performance in both speed and precision.
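As a rough sanity check of why those settings are fast (a back-of-the-envelope estimate, not a measured benchmark), the compute cost of MobileNet-style models scales roughly with the square of the depth multiplier and with the number of input pixels:

```python
def relative_cost(alpha, resolution, base_alpha=1.0, base_resolution=224):
    """Approximate relative multiply-add count of a MobileNet-style model:
    channel widths scale with alpha, feature-map sizes with resolution^2."""
    return (alpha / base_alpha) ** 2 * (resolution / base_resolution) ** 2

# Depth multiplier 0.35 at 128x128 vs. the default 1.0 at 224x224:
print(round(relative_cost(0.35, 128), 3))  # → 0.04, i.e. roughly 25x less compute
```

That 25x reduction is what makes the difference between a sluggish and a real-time classifier on a 5-watt device.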

We recommend using the TFLite backend when deploying the recognition model on Jetson Nano: train the model with the TensorFlow framework and then convert it to TFLite. You can train recognition models on our platform without any coding, for free. Just visit the Ximilar App, where you can develop powerful image recognition models and download them for offline usage on Jetson Nano.

A simple Object Detection camera system with the counting of products can be deployed offline in your factory with Jetson Nano.

Jetson Nano is simple but powerful hardware. However, it is not as powerful as your laptop or desktop computer, so analyzing 4K images on Jetson will be very slow. I would recommend using at most a 1080p camera resolution. We used a Raspberry Pi camera, which works very well with Jetson, and the installation is easy!
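The reasoning behind the 1080p recommendation is simple arithmetic: a 4K frame carries four times the pixels of a 1080p frame, so every per-pixel step (decoding, resizing, preprocessing) costs roughly four times as much:

```python
def pixel_count(width, height):
    """Total pixels in one video frame."""
    return width * height

uhd_4k = pixel_count(3840, 2160)   # 8,294,400 pixels
full_hd = pixel_count(1920, 1080)  # 2,073,600 pixels
print(uhd_4k / full_hd)  # → 4.0
```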

I should mention that you can run into temperature issues with Jetson Nano. Jetson normally ships with a passive cooling system. However, if this small piece of hardware is going to sit in a factory and run stably 24 hours a day, we recommend adding an active cooling system like this one. Don't forget to run the following command so the fan on your Jetson starts working:

sudo jetson_clocks --fan

Installation steps & tips for development

When working with Jetson Nano, I recommend following Nvidia's guidelines; for example, here is how to install the latest TensorFlow version. There is a great tool called jtop, which visualizes hardware stats such as GPU frequency, temperature, memory usage, and much more:

jtop tool can help you monitor statistics on Nvidia Jetson Nano.

Remember, the Jetson shares its memory with the GPU. You can easily run out of the 4 GB when running a model and some programs alongside it. If you want to save more than 0.5 GB of memory on Jetson, run Ubuntu with the LXDE desktop environment, which is more lightweight than the default Ubuntu environment. To increase available memory, you can also create a swap file. But be aware that if your project requires a lot of memory, heavy swapping can eventually wear out your microSD card. More great tips and hacks can be found on the JetsonHacks page.

To improve the speed of the Jetson, you can also try these two commands, which set the maximum power mode and clock frequencies:

sudo nvpmodel -m0
sudo jetson_clocks

When using the latest image for Jetson, make sure you are working with the right version of the OpenCV library. For example, some older tracking algorithms like MOSSE or KCF from OpenCV require a specific version. For tracking solutions, I recommend looking at the PyImageSearch website.

Developing on Jetson Nano

The experience of programming challenging projects, exploring new gadgets, and helping our customers is something that deeply satisfies us. We are looking forward to trying other hardware for machine learning such as Coral from Google, Raspberry Pi, or Intel Movidius for Industry 4.0 projects.

Most of the time, we are developing a machine learning API for large e-commerce sites. We are really glad that our platform can also help us build machine learning models on devices running in distant parts of the world with no internet connectivity. I think that there are many more opportunities for similar projects in the future.

Image Recognition as an Answer to New Energy Labelling
https://www.ximilar.com/blog/image-recognition-as-an-answer-to-new-energy-labelling/
Wed, 27 Jan 2021

Discover how image recognition can help e-commerce businesses comply with new EU energy labelling regulations, ensuring a smooth transition.

The post Image Recognition as an Answer to New Energy Labelling appeared first on Ximilar: Visual AI for Business.

The year 2021 will bring a fundamental change in the energy labelling of household appliances. The updated labelling should be more efficient and intuitive, and enable consumers to make better and more informed purchasing decisions. The first large group of goods should be re-labelled by the beginning of March, not only in retail but also in e-shops. Even though this modification brings benefits to buyers, it poses a great challenge to online sellers, for which we at Ximilar have a clever solution.

Upcoming Changes in the EU Energy Labelling

The energy labels indicate which energy efficiency category an appliance falls into. In 2019, the European Union approved a new regulation setting a framework for updated energy labelling, which comes into force in 2021 and will gradually replace the old system of labels. According to European lawmakers, the new system could save up to 200 billion kWh of energy, approximately the amount all the Baltic countries together consume in a year. The first new labels are already in circulation.

Effective March 2021, sellers and manufacturers will be required to update the energy labels on fridges, washing machines, dishwashers, TVs, electronic displays, and refrigerating appliances for display purposes, followed by tyres in May, and lamps in September.

So far, products have fallen into categories A+++ to G. These will be simplified back to A to G, and the energy class of a product will be determined by stricter standards. This means an appliance that was A+ in 2020 could be B or C from now on.

Re-scaling is not the only new feature, as the new labels are provided with a QR code leading consumers to the EPREL (European Product Registry for Energy Labelling) database, providing them with detailed energy and environmental information on the goods.

A Challenge for E-commerce Industry

The new regulation applies not only to retail but also to e-commerce, meaning all e-shops will be required to re-label the household appliances as well. They will be required to do so between March 1st and 18th.

E-shops need to identify thousands of energy labels in the product galleries and replace them with the new ones.

E-shops generally upload the energy labels as pictures into the galleries on the item pages. Due to the large number of images they upload every day, it is not uncommon for these pictures to be untagged.

To ensure a smooth transition from the old label system to the new one, physical stores will focus on re-labelling the displayed goods. E-shops, on the other hand, will need to identify and replace considerable numbers of pictures in their databases at once. For instance, the largest e-shop selling household appliances in the Czech Republic, Alza.cz, currently offers approximately 1 200 products in the category of fridges, 500 in washing machines, 350 in dishwashers, 600 in TVs, and 1 200 in monitors, meaning they will need to update at least 3 850 energy labels in the first wave.

Many large e-shops also cooperate with price comparison websites, such as Heureka, which have their own item galleries. For such services, the problem is a bit more complex: as a price analysis tool, a comparison website acquires its data from various sellers, meaning the picture tagging and sorting are not standardised, and it has to deal with a wide range of file types and names.

Example of an old EU energy label in a product gallery at Heureka.cz

Such a task poses a question: what is the most efficient way to identify the old energy labels among the other images in the product galleries in order to delete and replace them? The solution lies in image recognition software.

Smart Solution: Image Recognition

E-shops with electronics typically upload the energy labels as images into the product galleries on their item pages and provide them to the price comparison websites. Therefore, they need software able to sort the product images, reliably recognize the old energy labels and set them aside.

Image Recognition is one of the core services of Ximilar. In principle, once you upload your images to this service, it equips them with tags and sorts them into categories. This service uses computer vision and deep learning to detect a wide range of features in the pictures. It is designed to process extensive databases of pictures in a fraction of a second.

With Ximilar App, you can develop an AI service directly for energy label recognition.

How to Use the Image Recognition on Energy Labels

If you need to identify and replace the old energy labels in your e-shop, there are two ways to use the Ximilar Energy Label Recognition service:

  1. You can train your own recognition model for energy-label images and then use the model as an API endpoint. You send images from the product gallery and get immediate feedback on whether or not they are energy labels.
  2. You can provide us with an export from your product image database (as image URLs or the actual files) and we will take care of the rest for you. You will get the output back in a standard CSV format.
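To give an idea of what the output of option 2 could look like, here is a sketch that writes classification results to CSV. The column names are purely illustrative; the actual export schema may differ:

```python
import csv
import io

def results_to_csv(results):
    """Write (image_url, is_energy_label) pairs as CSV text.
    Column names are illustrative, not an actual Ximilar export schema."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["image_url", "is_old_energy_label"])
    for url, is_label in results:
        writer.writerow([url, "yes" if is_label else "no"])
    return buf.getvalue()

print(results_to_csv([
    ("https://example.com/fridge-1.jpg", True),
    ("https://example.com/fridge-2.jpg", False),
]))
```

A flat file like this is easy to join back against your product database to decide which gallery images to delete and replace.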

Since image recognition is a CPU/GPU-intensive process, one of the greatest advantages of this service lies in the image database processing on our servers, whether you use the API or leave it to us. Of course, you will have a chance to test the service in the Ximilar App before you run it on your image database.

The energy label recognition with the Ximilar service is an efficient, quick, and above all, reliable way to identify the images that need to be replaced.

With Ximilar, you can develop further models around energy labels:

  1. Reliably distinguishing the old energy labels from the new ones. This might be handy in the transition period, when some labels will already have been replaced but others will not.
  2. Reading the actual energy class, especially from the new energy labels. The label change is a great opportunity to enrich your product data with this piece of information.

If you are interested, please just fill out our contact form. We are here to help!

The Image Recognition Service Makes E-commerce Easier

Whether you need to sort your catalogue into fine-grained categories, recognize pictures in product galleries, or offer similar products to your customers, Ximilar has a solution for you.

Read more in this detailed article on Image Recognition uses in e-commerce, or contact us, and we can discuss other solutions tailored to your needs.

Advanced Options for Machine Learning Model Training
https://www.ximilar.com/blog/training-advanced-options/
Tue, 03 Nov 2020

Improve your custom recognition & detection models from Ximilar App with advanced training options, including augmentations of the images.

The post Advanced Options for Machine Learning Model Training appeared first on Ximilar: Visual AI for Business.

The results of machine learning models depend strongly on both the quality and quantity of your training data. Unfortunately, getting more reliable data is often quite expensive. There are, however, a couple of techniques designed to deal specifically with this problem. Using the Ximilar App, anyone can train their own recognition and detection models, and Ximilar has recently introduced new options for this process.

What We Do for You

From our end, to reduce your work, we use models pre-trained on huge amounts of data, which can already recognize the basic elements of your images. Sometimes those are quite high-level concepts, like a person; other times, they are lines, edges, and so forth. The training then just refines the model for your specific data.

How You Can Help Us

Then, there are some options which depend much more on the given task you are trying to solve. You can artificially change your training images and generate additional data "for free". Some operations are typically "safe", and we turn them on by default: image quality changes, mild colour changes, left-right flips, or small crops of an image.

Others are more disruptive and can potentially destroy important information in your image. Therefore, it is up to you to turn them on, and we strongly encourage you to enable as many of them as possible within your task.

Image augmentation settings in app.ximilar.com

Choosing Correct Options

Do not be afraid of the growing number of options. The main rule for deciding which one to enable is very simple: could the particular operation change the image in such a way that you would no longer be able to recognize it? If not, you can allow it. Sometimes, you might ask a second question: could an image modified by this operation be sent to my service, and would I want to recognize it?

In one of our earlier articles, we described how to train a custom image classifier. We will take an example from the same domain (cats vs. dogs) to walk you through the different options.

Flip And Rotate

The most basic operations are rotation and flip. Below, you will find an original image on the left side and then pictures with the following operations: flip vertically, flip horizontally, rotate 90, and rotate max (20).

Image augmentation of flip and rotation on images for training machine learning models

As you can see, the dog is recognizable in all the pictures. Horizontal flip can be turned on without any hesitation. Whether vertical flip and rotate 90 help depends on the data you expect to recognize. Will those all be professional photos? Then these options will probably not help you. Will users all around the world be using your service to upload pictures from their phones? Well, sometimes an image might be rotated the wrong way.

In addition, a small arbitrary rotation (rotate max) might be useful to introduce some deformations which will make the model even more robust.
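For intuition, a left-right (horizontal) flip of an image stored as a list of pixel rows is just reversing each row. This sketch uses plain Python lists instead of a real image library:

```python
def flip_horizontal(image):
    """Mirror a 2D image (list of pixel rows) left to right."""
    return [row[::-1] for row in image]

img = [[1, 2, 3],
       [4, 5, 6]]
print(flip_horizontal(img))  # → [[3, 2, 1], [6, 5, 4]]
```

Flipping twice returns the original image, which is why this augmentation gives you extra training data with no loss of information.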

Let’s Do Some More

Now, we can continue to more advanced image augmentations. Again, we have an original image on the left. And then we apply the following: colour (light, medium and aggressive), quality, crop and erase augmentations.

Image augmentation of color adjustment, cropping and erase on images for training machine learning models.

Changes in the colours are a very natural operation. We provide you with four options:

  • Keep original colours
  • Light changes – slightly modify brightness and contrast
  • Medium changes – modify brightness and contrast a bit more, modify hue and saturation
  • Aggressive changes – drop one of the colour channels, swap channels, convert the image to black and white

The other operations are more straightforward; they can be either on or off. Quality simulates various JPEG compressions, noise, etc. Crop will cut off a small part of the image on each side. And finally, erase will remove small rectangular patches from the image.

In our task, all operations left the dog recognizable, with the possible exception of the last one, erase. If the object, or its defining part, is too small in the image, it might be removed entirely by this operation. Therefore, use erase carefully.
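To see why erase deserves caution, here is a minimal sketch of the operation on a tiny grayscale image stored as a list of rows (the platform's real implementation will certainly differ): a random rectangle of pixels is simply blanked out, and if your object is small enough, that rectangle can cover it completely.

```python
import random

def random_erase(image, patch_h, patch_w, fill=0, rng=None):
    """Blank out one random patch_h x patch_w rectangle in a 2D image."""
    rng = rng or random.Random()
    h, w = len(image), len(image[0])
    top = rng.randrange(h - patch_h + 1)
    left = rng.randrange(w - patch_w + 1)
    out = [row[:] for row in image]  # copy, keep the original intact
    for y in range(top, top + patch_h):
        for x in range(left, left + patch_w):
            out[y][x] = fill
    return out

img = [[1] * 8 for _ in range(8)]
erased = random_erase(img, 3, 3, rng=random.Random(0))
# 9 of the 64 pixels are gone; a tiny object could vanish entirely.
print(sum(v == 0 for row in erased for v in row))  # → 9
```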

Final Tips

Do not be afraid to experiment. You can train multiple models with different settings and compare the results. However, always be careful about how you do the evaluation. The best way is to set up your own independent test dataset. Read more in this blog post.

For every trained model, your settings are saved, and you can inspect them at will.

View on trained image recognition model in Ximilar App Platform.

If you have any questions about this or any other functionality, please do not hesitate to contact us. We would be glad to discuss your problem and help you.

How to Train an Object Detection Model With One Click
https://www.ximilar.com/blog/how-to-train-an-object-detection-model-with-one-click/
Fri, 04 Sep 2020

Define, optimize, and deploy your custom object detection model to an API without coding.

The post How to Train an Object Detection Model With One Click appeared first on Ximilar: Visual AI for Business.

Introducing Custom Object Detection with One Click!

With our newly released object detection, you are able to train models for finding objects in your images. The Ximilar solution allows you to combine recognition and detection models in one workflow through the Flows service. With one click, without a single line of code!

We are glad that you love our Custom Image Recognition service, which helps you effectively build classification and tagging models. Over time, we have received a lot of messages saying that you were missing a service for training object detection models. We have spent a lot of time on it, and for a good reason: training detection models of good quality can be quite challenging, and we wanted to be sure to deliver the best solution possible for making your life easier when building such models.

What Is Object Detection

The difference between recognition and detection is the following: in recognition, we are interested in whether a feature or item is present in an image. In reality, there could be many of these items in the image, and one would like to know their count and positions. That is exactly the task for object detection. Object detection models predict the exact locations of items in the form of bounding boxes: rectangles around the objects.
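For intuition, a detection model's raw output can be thought of as a list of (label, box, probability) records. The schema below is purely illustrative, not Ximilar's actual API format; counting items is then just filtering by a confidence threshold:

```python
from collections import Counter

# Hypothetical raw detections for one image: label, bounding box, confidence.
detections = [
    {"label": "screw", "box": (12, 40, 52, 80), "prob": 0.97},
    {"label": "screw", "box": (60, 38, 98, 79), "prob": 0.93},
    {"label": "nut",   "box": (110, 45, 140, 72), "prob": 0.88},
]

def count_objects(dets, threshold=0.9):
    """Count detected objects per label, ignoring low-confidence boxes."""
    return Counter(d["label"] for d in dets if d["prob"] >= threshold)

print(count_objects(detections))  # → Counter({'screw': 2})
```

A recognition model would only tell you "there is a screw in this image"; the boxes are what let you count and locate them.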

If you want to know more about the technology behind it, read the blog post from our ML specialist Libor Vaněk.

Creating Your First Model Step-by-Step

Define Your Task (Model)

Just log in to app.ximilar.com and click the Object Detection tile on the dashboard. Click Create New Task and set the name and description (optional). After that, you need to create detection labels and connect them to the task. Click the Create New Label tile to create your first detection label. After doing this, your task definition is complete. Your task now contains one label, but you can create and connect more.

Upload Your Data

Now you need to upload your dataset and create bounding boxes on your images. Go to the Images page and start uploading. Then go through each of the images and create objects (bounding boxes) on them.

As with the Image Recognition service, we recommend starting with a small dataset of about 50 images per label and then increasing the counts. If you already have a dataset with bounding boxes on your local computer, you can use the Ximilar Client to upload it.

Train the Model and See the Results

Once your training collection is ready, click the TRAIN button on the TASK page. Training will take some time (up to several hours), so make a coffee and relax.

After the model is successfully optimized, you can use the detect endpoint to test it in production, or connect to the API with the Ximilar Client.

Upload More Data

There is a good chance that after the first round, your model will require more images and objects. However, you already have a semi-perfect model trained, and you can use it to help you create bounding boxes on your new training images: just use the Predict button below the training image. If you want to create an independent TEST dataset, you can do so by using the test flag. See the video below.

Flows With Object Detection

This is our most powerful feature right now. You can build a really complex computer vision system by connecting detection and recognition models into a single API endpoint. Imagine first detecting the individual items in an image and then recognizing their attributes. This is possible with the new Flows action "Object Selector". What are some example use cases?

  • detect all the items on a production line and identify whether they have a defect
  • detect fashion products on a person and recognize all their attributes
  • find the exact positions of tooth decay and classify it
  • count and classify all the cars seen by a parking camera
  • recognize objects for insurance damage assessment and cost prediction
  • and many more
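Conceptually, the "Object Selector" action chains the two model types: the detector's boxes select crops, and a recognition model runs on each crop. The sketch below uses stub functions with made-up labels and outputs in place of real trained models, just to show the shape of the pipeline:

```python
def detect_objects(image):
    """Stub detector: returns bounding boxes of fashion items (hypothetical output)."""
    return [{"box": (10, 10, 120, 200), "label": "dress"},
            {"box": (130, 20, 200, 90), "label": "handbag"}]

def recognize_attributes(label):
    """Stub recognizer: returns attributes for one detected object (hypothetical)."""
    return {"dress": {"color": "red", "pattern": "floral"},
            "handbag": {"color": "black", "material": "leather"}}[label]

def flow(image):
    # Step 1: detect the individual items; step 2: recognize each one's attributes.
    return [{"label": d["label"], **recognize_attributes(d["label"])}
            for d in detect_objects(image)]

print(flow("person.jpg"))
```

In the real Flows service this composition is configured in the App rather than coded, and both steps sit behind a single API endpoint.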

We will go through one of these examples in an upcoming blog post. Follow us on social media [LN | FB | TW | IN] so you will not miss anything important.

Tell Us About Your Ideas

This is one of the best solutions for detecting bounding boxes available on the market. Why choose our solution?

  • The UX is great, and we made it really straightforward to use.
  • It offers great performance, with SOTA architectures behind it.
  • The price is affordable.
  • You can download models for offline usage on our higher pricing plans.
  • You can detect items in your images and then recognize their features with image recognition through the Flows service.
  • You can configure your image augmentation settings for training and get better performance.
  • You can A/B test model versions and evaluate the accuracy on an independent dataset.
  • We use it in our own custom services, and we keep it updated with new techniques and architectures 🙂

If you love this new feature, would like to discuss anything with us, or have a custom computer vision project, contact us and we can schedule a call with you.
