Maciej Szymczuk
Regional Director
USA

In today’s world, computers are our new eyes. We rely on these machines to react to what they see in a specific space.

From animals to people to inanimate objects, image analytics can now detect far more than simple items: it can also pick out variations of the same object. In other words, it is a small step closer to human vision.

In this blog post, I will take you through the terminology, methodology, and applications of image recognition in business. Along the way, we will also gain some valuable insight from Marcin Skoczylas, PhD, regarding the state of the technology.

Be sure to see his articles as well!

What is Image Recognition All About?

I can’t in good conscience let you walk away into the world without teaching you the differences between image detection and image recognition.

That’s why I’d like to start with some definitions, to get that ‘image of the ball’ rolling.

Image Detection

This is the first step of the process. It only establishes the existence and position of predefined objects in images.

  • Example 1: Face detection tells us whether any faces are visible in an image.
  • Example 2: Factory inspection, such as checking that the electronic traces on motherboards are uncut.

In practice, image detection is enough for many such operations, according to Marcin Skoczylas.
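
To make the idea concrete, here is a minimal sketch of detection as described above: it reports whether a predefined pattern exists in an image, and where. The tiny binary grids and the `detect` function name are assumptions made purely for illustration. Real detectors tolerate noise, scale, and rotation, which this exact-match toy does not.

```python
from typing import List, Tuple

Grid = List[List[int]]

def detect(image: Grid, template: Grid) -> List[Tuple[int, int]]:
    """Return the (row, col) of every exact occurrence of template in image."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    hits = []
    # Slide a template-sized window over every valid position in the image.
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            window = [row[c:c + tw] for row in image[r:r + th]]
            if window == template:
                hits.append((r, c))
    return hits

# A toy 4x5 "image" containing one 2x2 block of bright pixels (illustrative data).
image = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
template = [[1, 1], [1, 1]]

positions = detect(image, template)
print("object present:", bool(positions), "at", positions)
```

Note that the output answers only the two detection questions, "is it there?" and "where?". It says nothing about what kind of object it is.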

Image Recognition

This is the second step. Here, objects in images are identified and assigned to a predefined class.

  • Example: Facial recognition estimates age and gender, or matches a face to a name or ID.

How is it possible? Well, with algorithms, naturally. Machine learning plays a starring role as well. In all, image recognition is a label for the process of training computers to ‘see like humans’ and process images.
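
As a rough illustration of "assigning an object to a predefined class", the sketch below uses a nearest-centroid classifier, one of the simplest machine learning techniques. The two-number feature vectors, the class labels, and the function names are all invented for this example. A real system would extract far richer features from the pixels.

```python
import math
from typing import Dict, List

def centroid(samples: List[List[float]]) -> List[float]:
    """Average the training samples of one class, feature by feature."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]

def recognise(features: List[float], centroids: Dict[str, List[float]]) -> str:
    """Assign the features to the class whose centroid is closest."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))

# Hypothetical training data: each object is described by two made-up features.
training = {
    "cat": [[0.9, 0.2], [0.8, 0.3]],
    "dog": [[0.2, 0.9], [0.3, 0.7]],
}
centroids = {label: centroid(samples) for label, samples in training.items()}

print(recognise([0.85, 0.25], centroids))
```

"Training" here is just averaging labelled examples. The classes are fixed in advance, which is exactly what "predefined" means in the definition above.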

Image Recognition Approaches

The following is an overview of the main approaches, listed in ascending order of sophistication.

  • Simple Object Detection – “That cat is back again”. This variant recognises an image based on general characteristics.
  • Image Matching – This variant analyses 2-dimensional imagery based on simple parameters. It is used, for example, in matching product labels and packaging, and works effectively with small databases.
  • Image Recognition using Machine Learning – This refers to machine learning techniques such as texture classification, Convolutional Neural Networks (CNN), and Fast Region-based Convolutional Neural Networks (Fast R-CNN).
  • Advanced Image Recognition – This is a larger-scale version of the previous option. Here, we apply Deep Learning, which requires an extensive database with millions of objects. Google Images is a good example.
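
Since CNNs appear in the list above, it may help to see the one operation they are named after. The sketch below slides a small kernel over an image and sums the element-wise products. It is a bare-bones illustration with a hand-picked edge kernel and toy data of my own. In a real CNN the kernel values are learned during training, and the layers are stacked many times over.

```python
from typing import List

Matrix = List[List[float]]

def convolve2d(image: Matrix, kernel: Matrix) -> Matrix:
    """'Valid' sliding-window product sum (no padding), as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Toy image: dark on the left, bright on the right (illustrative data).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked kernel that responds to dark-to-bright vertical edges.
vertical_edge = [[-1, 1], [-1, 1]]

print(convolve2d(image, vertical_edge))  # → [[0, 2, 0], [0, 2, 0]]
```

The non-zero column in the output marks where the edge sits. Flat regions, whether dark or bright, produce zero.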

Current State of the Technology

We see it all over the news. Image recognition features in commercial, and even military, contexts.

Google Search is a primary example of its civilian application: the search engine allows you to use an image as a search term. Elsewhere, law enforcement in China applies the technology to detect perpetrators of minor infractions, such as jaywalking.

So, where are we in terms of the progress of image recognition?

A major study carried out by Perficient Digital analysed the capability of significant market players: Microsoft Azure, IBM Watson, Amazon Web Services’ Rekognition, and Google Vision. A total of 2000 images were analysed across four categories: charts, landscapes, people, and products.

The study measured the accuracy of the tags each engine allocated to the images. Here, 7% of the 2000 total images were labelled precisely. A human control group, on the other hand, tagged the photos with 87.7% accuracy.

In terms of confidence level, or how sure the program is that it has assigned a tag correctly, Microsoft Azure scored 90.9%, with Google leading at 92.4%. Across the engines tested, 55% of the tags submitted carried a confidence level of 70% or higher. The humans, on the other hand, returned tags with a total confidence of 90%.

According to Perficient Digital, precision is improving, as are the machines’ confidence levels. Compared to humans, though, the technology remains inferior.
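
The two measures discussed above, accuracy and the share of tags at or above a confidence threshold, are simple to compute once each tag has been judged by a human. The sketch below shows the arithmetic. The four sample tags are invented for illustration and are not taken from the Perficient Digital study, and the field names are my own.

```python
from typing import Dict, List

def accuracy(tags: List[Dict]) -> float:
    """Fraction of tags that a human reviewer judged correct."""
    return sum(1 for tag in tags if tag["correct"]) / len(tags)

def share_at_or_above(tags: List[Dict], threshold: float) -> float:
    """Fraction of tags whose reported confidence meets the threshold."""
    return sum(1 for tag in tags if tag["confidence"] >= threshold) / len(tags)

# Invented sample: one tag per category mentioned in the study.
tags = [
    {"label": "chart", "confidence": 0.95, "correct": True},
    {"label": "landscape", "confidence": 0.72, "correct": True},
    {"label": "person", "confidence": 0.64, "correct": False},
    {"label": "product", "confidence": 0.81, "correct": True},
]

print(f"accuracy: {accuracy(tags):.1%}")
print(f"tags at >= 70% confidence: {share_at_or_above(tags, 0.70):.1%}")
```

Keeping the two numbers separate matters: a program can be highly confident and still wrong, which is why the study reports both.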

Applications & Implications

Here’s a snapshot of where these technologies are applied now.

AI Cameras

Machine Learning improves the effectiveness of access controls, particularly for consumer applications, such as that of home security.

Object Detection

Retail assistance and automation are on the front line of potential improvements to the industry. Points of sale and loss prevention are two areas slated for advancement.

Medical Imaging

Ideally, such tech applies to pathology (cancer detection) and MRI scans (lesion detection). In fact, Google boasts an algorithm that can find signs of cancer in medical images. In the United Kingdom, however, these algorithms cannot be put into service, because manual inspection is a legal requirement.

Because the technology works on neural networks, which are complex mathematical structures, it is not possible to prove their effectiveness with 100% certainty, according to Marcin Skoczylas. Therefore, at this point, even our most advanced technology cannot fulfil NHS requirements.

Industrial Robots

Image recognition assists throughout manufacturing and assembly, as well as in quality control and assurance, where it detects faulty products. The method uses deep learning to recognise defective products and to optimise processes, reducing waste by 20 per cent and increasing production by 50 per cent.

Image Analytics Examples

Facial recognition in particular is very complex and is progressing beyond the 2-dimensional scope of recognition. Using infrared (IR) cameras, devices can now recognise the shape of a face and distinguish a real face from a fake.

Cloud APIs now offer developers a multitude of options for building image recognition functions to aid retail operations. According to Marcin Skoczylas, the Google Vision API is powerful for rapid analysis of objects visible within images, and returns its results as simple-to-use JSON.
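
To give a feel for what working with such JSON looks like, here is a small sketch that filters labels by score. The sample payload is invented, and its field names (`responses`, `labelAnnotations`, `description`, `score`) are modelled on the shape of a label detection response. Treat that shape as an assumption and check it against the current API documentation before relying on it. No network call is made here.

```python
import json
from typing import List, Tuple

# Invented sample payload, shaped like a label detection response (assumption).
SAMPLE_RESPONSE = """
{
  "responses": [
    {
      "labelAnnotations": [
        {"description": "Shoe", "score": 0.97},
        {"description": "Footwear", "score": 0.93},
        {"description": "Leather", "score": 0.58}
      ]
    }
  ]
}
"""

def confident_labels(raw_json: str, min_score: float = 0.7) -> List[Tuple[str, float]]:
    """Return (description, score) pairs whose score meets the minimum."""
    data = json.loads(raw_json)
    labels = []
    for response in data.get("responses", []):
        for annotation in response.get("labelAnnotations", []):
            if annotation["score"] >= min_score:
                labels.append((annotation["description"], annotation["score"]))
    return labels

print(confident_labels(SAMPLE_RESPONSE))
```

A retail application might use a filter like this to decide which labels are trustworthy enough to show to a shopper or attach to a product record.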

Smartphone Face Unlock

The smartphones now available on the market boast a plethora of technologies in image recognition. The tech was first used to suggest image categories in in-built photo album programmes before becoming an intuitive tool used for unlocking and navigating the device itself.

Facial Recognition in Retail POS

From December, the Wedome Bakery Chain in China piloted the ‘Smile to Pay’ function, enabling users at points-of-sale to identify themselves via facial recognition, and pay using an associated digital wallet, which is protected by SMS Authentication, and ‘risky environment’ detection.

Search-by-Image E-Commerce

Using a method similar to Google Image search, retailers now offer ‘reverse’ searches for fashion items, helping shoppers find a design from a picture. This is particularly useful for furniture and clothing items with unpronounceable or linguistically challenging names.
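
One simple way to approximate search-by-image is to reduce each picture to a short fingerprint and rank catalogue items by how close their fingerprints are to the query. The sketch below uses an "average hash" compared by Hamming distance. This is a stand-in I have chosen for illustration, not a description of how any particular retailer does it. Production systems more likely compare learned feature vectors. The catalogue names and pixel values are invented.

```python
from typing import Dict, List

Pixels = List[List[int]]

def average_hash(pixels: Pixels) -> List[int]:
    """One bit per pixel: 1 if brighter than the image's mean, else 0."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]

def hamming(a: List[int], b: List[int]) -> int:
    """Number of positions at which two fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))

def search_by_image(query: Pixels, catalogue: Dict[str, Pixels]) -> List[str]:
    """Rank catalogue items from most to least similar to the query image."""
    q = average_hash(query)
    return sorted(catalogue, key=lambda name: hamming(q, average_hash(catalogue[name])))

# Invented 2x4 greyscale thumbnails standing in for product photos.
catalogue = {
    "striped scarf": [[10, 200, 10, 200], [10, 200, 10, 200]],
    "plain scarf": [[120, 125, 118, 122], [121, 119, 124, 120]],
    "two-tone bag": [[200, 200, 10, 10], [200, 200, 10, 10]],
}
# A slightly different photo of a striped item.
query = [[20, 190, 25, 210], [15, 205, 5, 195]]

print(search_by_image(query, catalogue))
```

The shopper never has to name the item: the picture itself is the search term, which is the whole appeal for hard-to-describe products.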

Challenges

There are reasons for concern about how far image recognition has really progressed. Opinions among scientists, legislators, and the public are mixed, with some studies suggesting room for improvement.

In September 2019, for Excavating AI: Politics of Images in Machine Learning Training Sets, AI researcher Prof. Kate Crawford and artist and researcher Trevor Paglen ran images through a classifier trained on ImageNet, a dataset of more than 14,000,000 images.

The images processed were assigned tags that yielded peculiar, and sometimes startling assumptions and generalisations, as demonstrated in the cited descriptions.

As a case in point, I gave the program a try, using the following images:

Firefighter

Airline Pilot

For the more peculiar tags, I have included the definitions of those terms provided on the site.

Microsoft Founder, Bill Gates (Front Profile)

Professional Tennis Player, Serena Williams

Verdict

In conclusion, the results demonstrate an aptitude for recognising some essential characteristics of people. However, as this example shows, the program labels features according to its preconceived notions of human traits: the last two photographs were given more verbose, and potentially offensive, tags.

Regulatory Environment

It’s no secret that technology is moving at an exceptional pace. In many instances, it progresses at a rate that people cannot understand or reliably control. As such tech routinely processes thousands, if not millions, of images bearing the likeness of real people, the concerns range from safeguarding data to the popularly held fear that the tools could be turned to authoritarian ends.

In the United States, for instance, the city government of San Francisco has banned the use of facial recognition in law enforcement and transport, citing privacy concerns regarding the security of facial databases.

According to Marcin Skoczylas, there is a potential remedy to privacy concerns. A trained model, as opposed to the training database itself, would be a programmed system immune to attempts to reverse-read the images it was trained on.

Deepfake Subterfuge

As facial recognition becomes commonplace, so do attempts to defraud such systems. So-called ‘deepfake’ software uses AI to analyse a face and map it onto a video of another subject. As this tech develops beyond its current famous incarnation, Zao, by Chinese company Momo, there are fears that ‘digital masking’ could outpace the security systems that rely on facial recognition.

Conclusion

As I have shown above, the implications of such tech are, simply put, substantial. As commerce and government attempt to grapple with the scale of this technology’s potential, ambitious concepts and programmes are taking shape.

Businesses with the resources to pilot these schemes are fighting to become first movers in this sensitive, data-driven industry. Enterprises and investors alike should consider waiting for the outcome of legal and legislative deliberations, as these bodies define the rules of engagement and the necessary safeguards for image recognition.

Image recognition’s emergence into the mainstream faces challenges due to privacy and security concerns, as well as public sentiment regarding the protection of civil liberties and the ‘right to be forgotten’. According to Marcin Skoczylas, the technology’s performance suffers because of the time-consuming nature of gathering databases, training, and tuning algorithms. This is the primary reason it cannot yet be applied properly in embedded, real-time situations.

That said, there are already several promising use cases, and image recognition is well suited to pilot projects. As computer speeds improve, this technology will become more available for everyday use and applications.

The potential of these applications implies significant convenience, and could optimise business processes across the entire value chain. The future is digital and has its eyes wide open.

Regional Director - USA

Entrepreneur, front-end developer, technology enthusiast. A leader who is always pushing himself beyond limits. Firmly believes in the strength of teamwork, but has the ability to work independently. Ambitious with a creative and analytical mind that always guides him to find the most optimal solution for finishing tasks on time.