Object Detection API - Ximilar: Visual AI for Business

New AI Solutions for Card & Comic Book Collectors

Zuzana Raidová — Wed, 18 Sep 2024 12:35:34 +0000

Recognize and Identify Comic Books in Detail With AI

The newest addition to our portfolio of solutions is the Comics Identification (/v2/comics_id). This service is designed to identify comics from images. While it’s still in the early stages, we are actively refining and enhancing its capabilities.

The API detects the largest comic book in an image, and provides key information such as the title, issue number, release date, publisher, origin date, and creator’s name, making it ideal for identifying comic books, magazines, as well as manga.

Comics Identification by Ximilar provides the title, issue number, release date, publisher, origin date, and creator’s name.

This tool is perfect for organizing and cataloging large comic collections, offering accurate identification and automation of metadata extraction. Whether you’re managing a digital archive or cataloging physical collections, the Comics Identification API streamlines the process by quickly delivering essential details. We’re committed to continuously improving this service to meet the evolving needs of comic identification.

Try how it works

Learn more

Star Wars Unlimited, Digimon, Dragon Ball, and More Can Now Be Recognized by Our System

Our trading card identification system has already been widely used to accurately recognize and provide detailed information on cards from games like Pokémon, Yu-Gi-Oh!, Magic: The Gathering, One Piece, Flesh and Blood, MetaZoo, and Lorcana.

Recently, we’ve expanded the system to include cards from Garbage Pail Kids, Star Wars Unlimited, Digimon, Dragon Ball Super, Weiss Schwarz, and Union Arena. And we’re continually adding new games based on demand. For the full and up-to-date list of recognized games, check out our API documentation.

Ximilar keeps adding new games to the trading card game recognition system. It can easily be deployed via API and controlled in our App.

Try how it works

See the full taxonomy

Detect and Identify Both Trading Cards and Their Slab Labels

The new endpoint slab_grade processes your list of image records to detect and identify cards and slab labels. It utilizes advanced image recognition to return detailed results, including the location of detected items and analyzed features.

Graded slab reading by Ximilar AI.

The Slab Label object provides essential information, such as the company or category (e.g., BECKETT, CGC, PSA, SGC, MANA, ACE, TAG, Other), the card’s grade, and the side of the slab. This endpoint enhances our capability to categorize and assess trading cards with greater precision. In our App, you will find it under Collectibles Recognition: Slab Reading & Identification.

Try how it works

Documentation

Automatic Recognition of Collectibles

Ximilar built an AI system for the detection, recognition and grading of collectibles. Check it out!

New Endpoint for Card Centering Analysis With Interactive Demo

Given a single image record, the centering endpoint returns the position of a card and performs centering analysis. You can also get a visualization of grading through the _clean_url_card and _exact_url_card fields.

The _tags field indicates if the card is autographed, its side, and type. Centering information is included in the card field of the record.

The card centering API by Ximilar returns the position of a card and performs centering analysis.

Try how it works

Documentation

Learn How to Scan and Identify Trading Card Games in Bulk With Ximilar

Our new guide How To Scan And Identify Your Trading Cards With Ximilar AI explains how to use AI to streamline card processing with card scanners. It covers everything from setting up your scanner and running a Python script to analyzing results and integrating them into your website.

Read the guide

Let Us Know What You Think!

And that’s a wrap on our latest updates to the platform! We hope these new features might help your shop, website, or app grow traffic and gain an edge over the competition.

If you have any questions, feedback, or ideas on how you’d like to see the services evolve, we’d love to hear from you. We’re always open to suggestions because your input shapes the future of our platform. Your voice matters!

The post New AI Solutions for Card & Comic Book Collectors appeared first on Ximilar: Visual AI for Business.

Image Annotation Tool for Teams

Michal Lukáč — Thu, 06 May 2021 11:55:57 +0000

Through the years, we worked with many annotation tools. The problem is most of the desktop annotating apps are offline and intended for single-person use, not for team cooperation. The web-based apps, on the other hand, mostly focus on data management with photo annotation, and not on the whole ecosystem with API and inference systems. In this article, I review, what should a good image annotation tool do, and explain the basic features of our own tool – Annotate.

Every big machine learning project requires the active cooperation of multiple team members – engineers, researchers, annotators, product managers, or owners. For example, supervised deep learning for object detection, as well as segmentation, outperforms unsupervised solutions. However, it requires a lot of data with correct annotations. Annotation of images is one of the most time-consuming parts of every deep learning project. Therefore, picking the right annotator tool is critical. When your team is growing and your projects require higher complexity over time, you may encounter new challenges, such as:

Adding labels to the taxonomy would require re-checking a lot of your work
Increasing the performance of your models would require more data
You will need to monitor the progress of your projects

Building solid annotation software for computer vision is not an easy task. And yes, it requires a lot of failures and taking many wrong turns before finding the best solution. So let’s look at what should be the basic features of an advanced data annotation tool.

What Should an Advanced Image Annotation Tool Do?

Many customers are using our cloud platform Ximilar App in very specific areas, such as Fashion, Healthcare, Security, or Industry 4.0. The environment of a proper AI helper or tool should be complex enough to cover requirements like:

Features for team collaboration – you need to assign tasks, and then check the quality and consistency of data
Great user experience for dataset curation – everything should be as simple as possible, but no simpler
Fast production of high-quality datasets for your machine-learning models
Work with complex taxonomies & many models chained with Flows
Fast development and prototyping of new features
Connection to Rest API with Python SDK & querying annotated data

With these needs in mind, we created our own image annotation tool. We use it in our internal projects and provide it to our customers as well. Our technologies for machine learning accelerate the entire pipeline of building good datasets. Whether you are a freelancer tagging pictures or a team managing product collections in e-commerce, Annotate can help.

Our Visual AI tools enable you to work with your own custom taxonomy of objects, such as fashion apparel or things captured by the camera. You can read the basics on the categories & tags and machine learning model training, watch the tutorials, or check our demo and see for yourself how it works.

The Annotate

Annotate is an advanced image annotation tool, which enables you to annotate images precisely and fast. It works as an end-to-end platform for visual data management. You can query the same images, change labels, create objects, draw bounding boxes and even polygons here.

It is a web-based online annotation tool, that works fully on the cloud. Since it is connected to the same back-end & database as Ximilar App, all changes you do in Annotate, manifest in your workspace in App, and vice versa. You can create labels, tasks & models, or upload images through the App, and use them in Annotate.

Ximilar Application and Annotate are connected to the same backend (api.ximilar.com) and the same database.

Annotate extends the functionalities of the Ximilar App. The App is great for training, creating entities, uploading data, and batch management of images (bulk actions for labelling and filtering). Annotate, on the other hand, was created for the detail-oriented management of images. The default single-zoomed image view brings advantages, such as:

Identifying separate objects, drawing polygons and adding metadata to a single image
Suggestions based on AI image recognition help you choose from very complex taxonomies
The annotators focus on one image at a time to minimize the risk of mistakes

Interested in getting to know Annotate better? Let’s have a look at its basic functions.

Deep Focus on a Single Image

If you enter the Images (left menu), you can open any image in the single image view. To the right of the image, you can see all the items located in it. This is where most of the labelling is done. There is also a toolbar for drawing objects and polygons, labelling images, and inspecting metadata.

In addition, you can zoom in/out and drag the image. This is especially helpful when working with smaller objects or big-resolution images. For example, teams annotating medical microscope samples or satellite pictures can benefit from this robust tool.

The main view of the image in our Fashion Tagging workspace

Create Multiple Workspaces

Some of you already know this from other SaaS platforms. The idea is to divide your data into several independent storages. Imagine your company is working on multiple projects at the same time and each of them requires you to label your data with an image annotation tool. Your company account can have many workspaces, each for one project.

Here is our active workspace for Fashion Tagging

Within the workspaces, you don’t mix your images, labels, and tasks. For example, one workspace contains only images for fruit recognition projects (apples, oranges, and bananas) and another contains data on animals (cats and dogs).

Your team members can get access to different workspaces. Also, everyone can switch between the workspaces in the App as well as in Annotate (top right, next to the user icon). Did you know, that the workspaces are also accessible via API? Check out our documentation and learn how to connect to API.

See API Documentation

Train Precise AI Models with Verification

Building good computer vision models requires a lot of data, high-quality annotations, and a team of people who understand the process of building such a dataset. In short, to create high-quality models, you need to understand your data and have a perfectly annotated dataset. In the words of the Director of AI at Tesla, Andrej Karpathy:

Annotate helps you build high-quality AI training datasets by verification. Every image can be verified by different users in the workspace. You can increase the precision by training your models only on verified images.

A list of users who verified the image with the exact dates

Verifying your data is a necessary requirement for the creation of good deep-learning models. To verify the image, simply click the button verify or verify and next (if you are working on a job). You will be able to see who verified any particular image and when.

Create and Track Image Annotating Jobs

When you need to process the newly uploaded images, you can assign them to a Job and a team of people can process them one by one in a job queue. You can also set up exactly how many times each image should be seen by the people processing this queue.

Moreover, you can specify, which photo recognition model or flow of models should be displayed when doing the job. For example, here is the view of the jobs that we are using in one of our tagging services.

Two jobs are waiting to be completed by annotators,
you can start working by hitting the play button on the right

When working on a job, every time an annotator hits the Verify & Next button, it will redirect them to a new image within a job. You can track the progress of each job in the Jobs. Once the image annotation job is complete, the progress bar turns green, and you can proceed to the next steps: retraining the models, uploading new images, or creating another job.

Draw Objects and Polygons

Sometimes, recognizing the most probable category or tags for an image is not enough. That is why Annotate provides a possibility to identify the location of specific things by drawing objects and polygons. The great thing is that you are not paying any credits for drawing objects or labelling. This makes Annotate one of the most cost-effective online apps for image annotation.

Simply click and drag the rectangle with the rectangle tool on canvas to create the detection object.

What exactly do you pay for, when annotating data? The only API credits are counted for data uploads, with volume-based discounts. This makes Annotate an affordable, yet powerful tool for data annotation. If you want to know more, read our newest Article on API Credit Packs, check our Pricing Plans or Documentation.

Annotate With Complex Taxonomies Elegantly

The greatest advantage of Annotate is working with very complex taxonomies and attribute hierarchies. That is why it is usually used by companies in E-commerce, Fashion, Real Estate, Healthcare, and other areas with rich databases. For example, our Fashion tagging service contains more than 600 labels that belong to more than 100 custom image recognition models. The taxonomy tree for some of the biotech projects can be even broader.

Navigating through the taxonomy of labels is very elegant in Annotate – via Flows. Once your Flow is defined (our team can help you with it), you simply add labels to the images. The branches expand automatically when you add labels. In other words, you always see only essential labels for your images.

Simply navigate through your taxonomy tree, expanding branches when clicking on specific labels.

For example, in this image is a fashion object “Clothing”, to which we need to assign more labels. Adding the Clothing/Dresses label will expand the tags that are in the Length Dresses and Style Dresses tasks. If you select the label Elegant from Style Dresses, only features & attributes you need will be suggested for annotation.

Automate Repetitive Tasks With AI

Annotate was initially designed to speed up the work when building computer vision solutions. When annotating data, manual drawing & clicking is a time-consuming process. That is why we created the AI helper tools to automate the entire annotating process in just a few clicks. Here are a few things that you can do to speed up the entire annotation pipeline:

Use the API to upload your previously annotated data to train or re-train your machine learning models and use them to annotate or label more data via API
Create bounding boxes and polygons for object detection & instance object segmentation with one click
Create jobs, share the data, and distribute the tasks to your team members

Predicting bounding boxes with one click automates the entire process of annotation.

Image Annotation Tool for Advanced Visual AI Training

As the main focus of Ximilar is AI for sorting, comparing, and searching multimedia, we integrate the annotation of images into the building of AI search models. This is something that we miss in all other data annotation applications. For the building of such models, you need to group multiple items (images or objects, typically product pictures) into the Similarity Groups. Annotate helps us create datasets for building strong image similarity search models.

Grouping the same or similar images with the Image Annotation Tool. You can tell which item is a smartphone photo or which photos should be located on an e-commerce platform.

Annotate is Always Growing

Annotate was originally developed as our internal image annotation software, and we have already delivered a lot of successful solutions to our clients with it. It is a unique product that any team can benefit from and improve the computer vision models unbelievably fast.

We plan to introduce more data formats like videos, satellite imagery (sentinel maps), 3D models, and more in the future to level up the Visual AI in fields such as visual quality control or AI-assisted healthcare. We are also constantly working on adding new features and improving the overall experience of Ximilar services.

Annotate is available for all users with Business & Professional pricing plans. Would you like to discuss your custom solution or ask anything? Let’s talk! Or read how the cooperation with us works first.

How do custom projects work?

The post Image Annotation Tool for Teams appeared first on Ximilar: Visual AI for Business.

How to Train an Object Detection Model With One Click

Michal Lukáč — Fri, 04 Sep 2020 12:47:05 +0000

Introducing Custom Object Detection on Click!

With our newly released object detection, you are able to train models for finding objects on your images. Ximilar solution allows you to combine Recognition and Detection models in one workflow through the Flows service. On click, without a single line of code!

We are glad that you love our Custom Image Recognition service, which helps you effectively build classification and tagging models. Over time, we have received a lot of messages that you are missing a service for training object detection models. We have spent a lot of time on it, and we know why – making your life easier when building such models. Training detection models of good quality can be quite challenging, and we wanted to be sure to deliver the best solution possible.

What Is Object Detection

The difference between recognition and detection is the following: in recognition, we are interested if a feature/item is present on our image. In reality, there could be many of these items in the image and one would like to know their count and positions. This is exactly the task for object detection. Object detection models can predict the exact locations of items in the form of bounding boxes – rectangles around the objects.

If you want to know more about the technology behind it, read the blog post from our ML specialist Libor Vaněk.

Creating Your First Model Step-by-Step

Define Your Task (Model)

Just log in to app.ximilar.com and click on the Object Detection tile on the dashboard. Click on Create New Task and set the name and description (optional). After that, you need to create detection labels and connect them to the Task. Click on Create New Label tile for your first detection label. After doing this, your task definition is complete. Your task now contains one label, but you can create and connect more.

Upload Your Data

Now we need to upload our dataset and create bounding boxes on your images. Go to the Images page and start uploading. Then go through each of the images and create objects/bounding boxes on them.

As with the Image Recognition service, we recommend starting with a small dataset of about 50 images per label and then increasing the counts. If you already have your dataset with bounding boxes on your local computer, you can use Ximilar Client to upload them.

Train the Model and See the Results

Once your training collection is ready, click the TRAIN button on the TASK page. Training will take some time (up to several hours), so make a coffee and relax.

After the model is successfully optimized, you can use the detect endpoint and test it in production or even connect to the API with Ximilar Client.

Upload More Data

There is a good chance that after the first round, your model will require more images and objects. However, you already have some semi-perfect models trained, and you can use them to help you with creating Bounding Boxes on your new training images – just use the Predict button below the training image. If you want, you can create your independent TEST dataset, you can do it by using the test flag. See the video below.

Flows With Object Detection

This is our most powerful feature right now. You can build a really complex computer vision system by connecting detection and recognition models into a single API endpoint. Imagine first detecting individual items on the image and then recognizing their attributes. This is possible with the new Flows action “Object Selector”. What are the example use cases?

detect all the items on a production line and identify if they have a defect or not
detect fashion products on the person and recognize all their attributes
find the exact position and recognize tooth decays
count and classify all the cars from the parking camera
object recognition for insurance damage and cost prediction
and many more

We will go through one of these examples in an upcoming blog post. Follow us on social media [LN | FB | TW | IN] so you will not miss anything important.

Tell Us About Your Ideas

This is one of the best solutions for detecting bounding boxes, which is available in the market. Why choose our solution?

The UX is great, and we made it really straightforward to use it.
Great performance with SOTA architectures behind it.
The price is affordable.
Download models for offline usage on our higher pricing plans.
Detect items on your images and then recognize features with image recognition through the Flows service.
Configure your image augmentation settings for training and get better performance.
You can A/B test model versions and evaluate the accuracy on an independent dataset.
We are using it in our own custom services, and we keep it updated with new techniques and architectures

If you love this new feature, you would like to discuss anything with us, or you have some custom project from computer vision, then contact us, and we can schedule a call with you.

Try our public demos

The post How to Train an Object Detection Model With One Click appeared first on Ximilar: Visual AI for Business.

Is Ximilar Better Than AI Giants?

Víťa Válka — Tue, 18 Jun 2019 07:43:37 +0000

We get this question occasionally from users of other Visual AI analysis tools, and the simple answer could be yes, it’s better. Nothing is as simple as black and white, so let us compare services from Goliaths like Google, IBM, Amazon and Microsoft with our David-like solution from Ximilar.

To say it simply, artificial intelligence vision got to a point, where it is easy not only to recognize objects in a photo, but also detect features of each thing. That creates a new universe of opportunities for real-world application in e-commerce & traditional industries alike. And Ximilar is a computer vision platform that digs deep into some pretty narrow use cases. So while the big solutions might be great in many ways, Ximilar might very well be the agile alternative.

Ximilar offers you a great cloud AI platform for training your custom image recognition models and advanced visual search services.

Ximilar is Not a Big Corporation

And that is a good thing. Because we keep things simple, streamlined, and we have time to listen to each customer’s needs. We also have the ability to implement new custom features in a timely manner. And we do it as fast as we can, widely benefiting both customers and us, freeing our manpower from manual work.

We at Ximilar create, and continuously improve, advanced visual search, image recognition services & image tools for businesses around the World. That happens in few areas:

Ximilar recognition app – AI cloud platform for training custom image recognition models
Ximilar annotate – image annotation tool for creating great datasets
Visual Search – fashion, stock photos, real estate, home decor, cards, … search engines
Ready-to-use services – fashion tagging, home-decor and collectibles recognition API
Image tools – upscale your images and remove the background with API
and in many cases the most challenging custom AI solutions

We are also not an enterprise that requires millions of users of its services to just stay afloat. See for example how many services were killed by Google. No. Rather than growth in quantity, our center of the universe is how precise we get, and how reliable & sustainable results we deliver. And how we can grow strong together with our customers, or we should rather say our partners.

Here is why Ximilar could be a solid alternative for you if you need to iterate quickly and reach reliable results in narrow fields. Or if you simply need someone who takes your idea further and finds an AI solution to deliver value to your business.

1 – We are focused AI Team

We craft our features to perfection, and we test & use them ourselves. We continuously improve our application for everybody to benefit from new findings in AI vision industry. And we also do things that customers ask for, we don’t just sell access to a platform.

2 – We are an independent company

These days, many companies are created to be acquired. They are created to grow no matter the sustainability of such growth. We are different. Our customers like that we would not disappear tomorrow — getting acquired by a giant and then dissolved into some unreachable feature of some huge app suite is not our target.

3 – We innovate faster

We don’t have a large team and therefore decisions are quick. We are a team of remote professionals working in a field that we truly love and would like to explore to the edge of possibilities. It’s a lot of fun to work on our customers’ challenging tasks. And we are happy to customize any feature. The customer’s budget is the only limit.

4 – Save expenses on AI

Our AI solutions are significantly cheaper than the solutions of big AI players. We are able to save you a lot of money on training and deploying your custom models. For example, training and deploying a model on Google Vertex AI can cost you thousands of dollars, without even calling the API. For Vertex AI AutoML models you are paying for training, deploying and calling a model. Similar pricing for features can be applied to Amazon Rekognition and Azure Custom Vision services. With Amazon Rekognition you are also paying for each hour your model is deployed! On the other hand, AI models built via our platform are trained and deployed for free! You are paying just for calling the API. No more hidden costs.

Head-to-Head Comparison

	Focus	Models	3,000 requests, free model training and deployment	Request Price per 1,000 images	Free plan per month	Visual Search	Expert assistance
Ximilar	Custom Image Recognition, Visual & Similarity Search, Tagging	Fashion, Home-Decor, Collectibles, Custom (classification, tagging, detection)	Optional	$1.0	3,000 requests, free model trainings and deployment	Yes	Yes
Microsoft	Image Recognition	Generic, Custom (classification, tagging, detection)	No	$2	10,000 requests, 1 hour of training	No	No
Amazon	Image & Video Recognition	Generic, Face, Sensitive Content, Text, Celebrity, …	No	$1	5,000 requests	Face only	No
Google	Image Recognition	Generic, Faces, Text, Logos, Landmarks	No	$1.5	1,000 requests	No	No
IBM Watson	Image Recognition	Generic, Faces, Food, Explicit, Custom (classification, tagging)	No	$2	1,000 predictions, 2 trainings of models	No	No
Clarifai	Image & Video Recognition, Similarity Search	Generic, Faces, Nudity, (Fashion) Custom (classification, tagging)…	Optional	$1.2 – 3.2	1,000 operations	Yes	Yes

Narrow Field vs. Generic AI

This one is personal. You would see a lot of simple AI applications, like detecting a cat and a dog in a given — well lit & well shot — picture. But in reality, the bread and butter of applied visual AI is narrow field recognition and analysis of large volumes of images, where the customer needs pretty high accuracy on a specific subject. For example, detect a type of screw on a blurry cellphone photo, shot in bad lighting conditions.

Unlike the giants who mostly sell you ready-made solutions that you can hardly bend to meet your needs, Ximilar is in the other end of the spectrum, brainstorming with customers about how to solve the use case that they have. Being their partner in the path to success.

Examples of such narrow use cases are

Detecting coffee grounds in a cup – for a customer who receives millions of images to their mobile app used to foretell the future for its users. You wouldn’t believe how many users in coffee-drinking countries use such an app.

Recognition of trading cards from photo – A cool use-case that was a dream of every geek. Not anymore. Simply snap a photo of a sports card or a game like Pokémon, and the app will identify a card and return a price listed on eBay. You can build your own portfolio tracker and much more with Ximilar.
Give me a quality rating of a photo – this one was brought up by a hotel reservation site and real estate company. They need to detect the best photos of a property, while the photos are often delivered by a re-seller, or a hotel owner and might not be well shot. And we all know that good photos sell better. Ximilar can help even there with upscaling images and improving their quality.

Lower Price for Higher Accuracy

While the examples above might be fun to read, let’s get to real facts, hardcore numbers and actual user feedback. Because that is a requirement for any business to base its thoughts on. Here are some real-life examples of our customer experiences.

Ximilar Recognition is cheaper and has comparable accuracy as Microsoft Custom Vision, Amazon Rekognition, Google Vertex AI and IBM Watson. At least several of our customers, and users of the Ximilar App, achieve even better accuracy than with the big cloud solutions. Ximilar allows users to control various parameters of training from a simple GUI.

Model versioning in Ximilar App.

UX of Ximilar App is extremely easy to use, also reported by our customers, saying: “Ximilar has a shallow learning curve in comparison to others”. Connection to the API and integration to your systems and apps is easy.
Ximilar has advanced features for tuning of your recognition tasks which no other services provide — flips, rotations, etc.

Advanced settings of image augmentations in Ximilar App.

Ximilar Product Similarity and Custom Similarity are unique services for finding visually similar alternatives in fashion, home decor and other image collections
Ximilar is much more flexible as we are willing to improve our service for your needs – e.g. add more tags to our models — according to your requirements and keep it attached to your data exclusively
We are cheaper — Google AutoML Vision/Vertex AI is significantly much more expensive than our solution
Ximilar Fashion Tagging is at the top of abilities in fashion object recognition
Elaborate management of tags & categories for more projects of higher complexity — we are the only system we know of, that enables users to share training data between categorisation and tagging tasks, chaining recognition models into one API…
Ximilar, unlike the big competition, is able to install the system on-premise, giving you better control over the system, do a lot of flexible customizations

This is just a brief summary of what we see as benefits for you if you use Ximilar as your partner for pioneering the AI world. We see it now as really just the beginning of all the possibilities that might come in the future of automation and machine learning abilities. We have been around for many years now and Ximilar would surely be around for the years to come. Backing you on the way. Enjoying the exploration.

How do custom projects work?

The post Is Ximilar Better Than AI Giants? appeared first on Ximilar: Visual AI for Business.

Major Updates for a Headstart in 2019

Víťa Válka — Tue, 08 Jan 2019 08:00:08 +0000

We have completely rebuilt the Vize system for image recognition and rebranded it as Ximilar App, expanded the team to cover more disciplines and most importantly we have grown our customer base. Besides that, we have worked hard on the product side of things to deliver even better service to all our existing customers, as there is fast development in the Computer Vision field every day.

It is visible that we are becoming experts in Fashion E-commerce and overall online commerce once the subject is image automation and data flow optimization. We either save significant expenses for our clients or improve their services for better conversion rates. Nevertheless, we are the backend guys, and we are not tangible that much to the end customer.

Redesigned Company Website

The redesign of the Ximilar website took a serious pile of man-days to create. The whole team got involved. And it is already paying off well. There is way more information about all the tools & features Ximilar offers. Spiced up with real-life use cases from many fields, including Fashion AI and E-commerce AI applications. There are examples and sample bits of code. And there is this blog to inform you about what is happening inside Ximilar.

We have also become an IBM Business Partner and expanded sales reach to Atlanta (USA), the United Kingdom and Asia. All that to be closer to you when you need a partner to help your business with the initial workings when embedding a robust Ximilar system inside your workflow.

Complex Documentation

This might seem like just a bunch of text at first look. But in reality, the documentation uncovers all the magic, all the possibilities that you get by using our universe of tools. Developers constantly update the docs, so you always have the most recent information at your fingertips. Making your life easier and supporting you in your busy day is our target.

Nevertheless, we are on email & live chat to help you anytime you require a helping hand.

New Feature: Precision/Recall table for Model

This is another critical feature to inspect the quality of your models, which should not be missed when developing a machine learning solution. We introduced a page with insight into your model in the Summer of 2018, and we are adding another advisory feature — Precision/Recall for each label. With that feature, you can verify exactly the level of reliability prediction for individual labels. The higher the value, the better your model succeeds in the prediction of these particular labels. The values have been detected from your training data, which is a random 20 % from all uploaded images in given labels.

Example: the precision of label Cats means that 85% of images that were predicted by the model to be Cats are actually cats. Low precision numbers mean that the label is too broad – many images falsely get this label, and you should probably add more training images that are NOT Cats. On the other hand, the recall of 50 % of the label Parrots means that only 5 out of 10 images that actually are parrots were recognized by the model to be Parrots. Low recall numbers mean that the training data define this label as too narrow — this label is not recognized as often as it should be, and you should add more training images that ARE parrots.

New Feature: Advanced Settings for each Task

Many of our customers tell us that models from Ximilar Recognition provide better and faster results than models of our competitors (including the big players). Knowing your data, you can now further improve the reliability of your model by selecting the right checkboxes (horizontal flip, vertical flip, rotate 90). These settings are applied randomly to your images during the training (together with other modifications that are standard in machine learning). As a result, the trained model should then be invariant to the corresponding transformation (e.g., the recognition should be independent of the vertical flip of the image).

For example, many classifiers for microscope/medical data will benefit all three to be checked, as the important knowledge on the images can be rotated in all possible ways. The common practice for basic tasks, let’s say classifying houses, is to have just horizontal flip checked (default behaviour) as you probably do not want to classify a house upside down. You can experiment with the settings as you want and see what works best for your task.

Improved & Updated Python Library

All the Ximilar Services are now behind the https://api.ximilar.com endpoint. That is why we made huge improvements to our Python library, which allows you to work with Ximilar Recognition (formerly Vize.ai), Dominant Colours, Generic Tagging & Fashion Tagging. The documentation, mentioned above, was changed to cover more knowledge, so the entire workflow of using the library is very straightforward. We still have further plans to expand this client by including more features and working with all possible endpoints.

More at https://gitlab.com/ximilar-public/ximilar-vize-api

Upcoming Features

And that is just the beginning of the year 2019. We already prepared many further features that are either requested by our customers or improve existing features to allow you to reach new horizons. These are just a few to give you a glimpse of what is coming:

Image Tagging — or technically multi-label classification, where both the training images and the real data get more than one label/tag. A technique is often seen in stock photo agencies as photography keywords.
Workspaces for Images and Tasks — To allow you to sort out your projects, should you have more than one.
Improved User Interface — We are constantly iterating on the most common features.

Feel free to contact us and let us know what you are missing, or what would improve your system performance, speed or reliability. We are always on your side when it comes to reaching business targets or optimizing your expenses.

The post Major Updates for a Headstart in 2019 appeared first on Ximilar: Visual AI for Business.

Custom vs. General Vision AI Services

David Novák — Thu, 15 Jun 2017 06:00:49 +0000

Understand the Difference in Image Recognition Platforms

AI is on fire, and so are services delivering different forms of artificial intelligence. In this post, I would like to focus on the Visual segment and compare two different approaches: general vision and custom vision.

What is the difference, and what works better for different visual tasks?

General Vision Platforms

Here are some examples of general vision platforms:

Google Cloud Vision API
Amazon Rekognition
Microsoft Computer Vision
Clarifai
IBM Visual Recognition Watson

These platforms are built for understanding everyday objects (dogs, aeroplanes, faces, tables…). They mostly provide photo tagging. This means understanding as many objects and abstracts in the image as possible.

The main goal of general platforms is a human-level understanding of images. They are reaching for more than object understanding. The idea is to understand the abstract interaction between objects, moods, and contexts. In the video, they are trying to understand the action, its impact and time continuity. These are a very complex task and need a lot of labelled data to learn.

General vision providers gain millions of images from different sources. Users upload images into Google Photos, OneDrive and Google index every image on the web for Google images.

How do users then benefit from these never-ending data sources?

Machine learning requires a lot of data for training. In this case, users don’t have to provide any data. General models learned thousands of everyday objects, face emotions, landmarks, car types… We can start using this treasure right away with no pain of gathering relevant training data.

This is an amazing benefit, and general models learn more and more categories. We can set up many cool apps by implementing general models because our apps often look at everyday objects.

Another benefit of general solutions is other functionalities they provide out of the box. Generate a thumbnail, read a text, or find a celebrity name. All with no training data and for a reasonable price.

How to choose?

The most important factor we are trying to maximise is accuracy on our specific task. Each provider has a different number of training images, and different deep learning architecture and provides different tasks. These are company secret data we will never reach.

Generally, in AI, we want to find all the providers who offer the functionality we need and test them out to find the best-performing solution. For complex tasks, it is common to mix a few providers with the best results.

Who is this for?

General vision suits the best to the application that needs to recognise everyday objects. Robots reading human faces, e-shop image captioning for better SEO performance and helping blind people to understand new environments. These are the great examples of general vision. When reaching for vision solutions, the first question should be: Is this something I could find online? If yes, then it is worth trying general models.

Services like Google Vision provide the power of millions of images to everyone.

But what happens when we come out of every day’s space? What if we have scientific data available only to a few universities? Here comes the Custom vision.

Custom Vision Solutions

One example is Ximilar solutions. These custom vision solutions are continually evolving to meet the dynamic needs of the rapidly growing sectors such as e-commerce, manufacturing, healthcare, and more. Most of these solutions are ready for deployment via API with just a click, requiring no knowledge of coding or machine learning techniques. They are highly customizable and can be easily combined in a modular fashion to suit various applications. I call this a win.

And some more:

Microsoft custom vision AI
Clarifai custom image recognition
Imaga computer vision

In custom computer vision, users create their own rules to sort images.

Rather than asking the type of the flower, you may want to know if there is a sun shining on the flower. Sometimes you want to be alert when your security camera spots humans, but your neighbour on a mower tractor is all right.

You could make sure all the product thumbnails you display show the unboxed products on a white background. Someone wants to make sure that the product on the end of the line is not damaged. This is something that off-shelf solutions are not built for.

About a year ago, there was only one solution. Hire an AI team to deliver an expensive on-premiss solution. Custom vision solution opens up whole new possibilities in visual AI. Compared to general platforms, there is an infinite number of tasks we can solve by defining custom objects. We can also detect different states of one object or environment. Custom vision is a little machine learning lab where everyone can test his idea. As a result, we can automate boring human tasks and save some time.

Custom vision goal is not general image understanding but a 100% accurate understanding of the specific task. This is very close to the market, but it comes with one disadvantage. User have to gather their own training data. This can be a pain and time-consuming, but it is a competitive advantage too.

At this moment, all the custom services offer image classification tasks. This means sorting the images into classes while looking at the whole image. The task can be as simple as deciding “ok” or “broken” or it can consist of many classes (e.g. several terrain appearances).

Custom vision can also come in handy when we need high accuracy on a smaller set of categories. We don’t always need to recognise thousands of categories. We would like to find 10 that are interesting for us. Custom vision can often supply better accuracy for general tasks.

The key for the user is the number of images they need for training. This is very hard to estimate in general but can be as little as 20 images per class. Read more about custom datasets in this post. The smaller the visual difference is, the higher the number of images we need.

How to choose?

We are looking for a solution that is easy to use, provides a simple user interface and the best accuracy for our task. We should test all the available solutions before we start using one. Before we start testing our idea I would also recommend saving some time discussing the project with the support team.

Who is this for?

Custom vision suits the application that needs to recognise very specific images or object states. It also fits images that are not available online or are not mass-produced by web users. It can solve many scientific, industrial, medical, and laboratory tasks. General models are often made for the needs of the online business. Custom vision can help in a variety of industries. Agriculture, production lines, security, and many others.

Custom vision opens new possibilities in visual automation. All made simple for users with no technical background.

Summary

Machine learning is a technology that saves people a lot of time. Vision is one of the human abilities that is now possible to automate. There is no universal approach for vision tasks, so we have to decide what type of task we are facing.

General vision is here to organize and structure images that are available on the web. It is very simple to use and needs no training data.

Custom vision makes sense in images that are very specific to the task or not available online. It needs some effort to gather training data, but it provides very accurate results for vision tasks.

The post Custom vs. General Vision AI Services appeared first on Ximilar: Visual AI for Business.