20 August 2020

Text Recognition Using Firebase Machine Learning


Machine Learning has been in use since around 1950, but from the looks of things, the brightest times for it are yet to come.

Thanks to advances in technology and increasing computing power, more and more exciting applications of Machine Learning are beginning to appear. Imagine, for example, using your phone to recognize the text on your medical prescription.

The purpose of this article is to discuss an app that works in a similar way. The app uses the Flutter UI toolkit and Firebase Machine Learning to get the job done.

In this article, I’d like to walk you through some terminology and show you how to add text recognition using Firebase Machine Learning.

What’s Flutter?

As I mentioned, the app uses the Flutter UI toolkit. But what is it exactly?

Flutter is a software development kit created by Google and based on the Dart programming language. It’s a tool for creating multiplatform apps from a single codebase. Such applications can run on Android, iOS, Linux, macOS, Windows, Google Fuchsia, or in a web browser. What’s more, Flutter is compiled to native code, so its execution is efficient.

What’s Firebase Machine Learning?

Firebase Machine Learning, also called ‘ML Kit for Firebase’, is an SDK that makes it easy to use Google’s ML (Machine Learning) technologies on mobile devices running Android or iOS. The kit supports both on-device and cloud-based operations. What’s more, there are quite a few ready-made ML models available, and if you would like to use your own, you can, because the kit supports TensorFlow Lite models.

But that’s enough terminology. Let’s dive into the coding part.

ML Kit for Firebase configuration

To use everything that Firebase Machine Learning has to offer, you need to start by creating a Firebase project in the Firebase console. Google has done brilliant work here by creating a codelab for it, so be sure to follow their instructions. You’ll find them in steps 5 and 6 of this codelab.

Once the Firebase project is created and connected with the Flutter app, you need to add a dependency for Flutter ML Kit Vision for Firebase. To do so, follow these instructions:

1. In the pubspec.yaml file, insert:

    dependencies:
      firebase_ml_vision: ^0.9.4

2. Then run flutter pub get

3. [Android only] In the file {path_to_your_project}/android/app/src/main/AndroidManifest.xml, inside the <application> element, add the following metadata:

    <meta-data
        android:name=""
        android:value="ocr" />

4. [iOS only] To the Podfile, add the pod which you wish to use, and run pod update. Available pods:

    pod 'Firebase/MLVisionBarcodeModel'
    pod 'Firebase/MLVisionFaceModel'
    pod 'Firebase/MLVisionLabelModel'
    pod 'Firebase/MLVisionTextModel'

Using ML Kit for Firebase to recognize text

With the configuration done, we can look into recognizing some text. To do this, we need an object of class TextRecognizer, which we can ask FirebaseVision to provide. Then we simply ask the recognizer to process an image.

The code for doing so is presented in code snippet #1:

    _textRecognizer = FirebaseVision.instance.textRecognizer();
    _textRecognizer.processImage(visionImage);

processImage works asynchronously: it returns a Future that completes with an object of type VisionText, which represents the recognized texts and the points describing their placement on the image.
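Because the result arrives asynchronously, it has to be awaited before it can be read. The following is a minimal sketch rather than code from the sample project: the function name recognizeLines is hypothetical, and it assumes the firebase_ml_vision plugin configured above, whose VisionText exposes blocks, each block exposes lines, and each line exposes its text.

```dart
import 'package:firebase_ml_vision/firebase_ml_vision.dart';

/// Hypothetical helper: runs on-device text recognition on [visionImage]
/// and returns every recognized line of text, in reading order.
Future<List<String>> recognizeLines(FirebaseVisionImage visionImage) async {
  final TextRecognizer recognizer = FirebaseVision.instance.textRecognizer();
  try {
    // processImage returns a Future, so wait for the VisionText result.
    final VisionText result = await recognizer.processImage(visionImage);
    return [
      for (final TextBlock block in result.blocks)
        for (final TextLine line in block.lines) line.text,
    ];
  } finally {
    // Release the native resources held by the recognizer.
    await recognizer.close();
  }
}
```

When the recognizer is reused for a stream of camera frames, as in the app described here, it would more likely be created once and closed when the screen is disposed, rather than once per image.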

You may wonder what visionImage, the argument passed to processImage, actually is. It’s an object of type FirebaseVisionImage, a wrapper class for the image on which we’d like to recognize text. We can create such a visionImage by passing one of three things: a path to an image file, the image file itself, or a byte list representing an image.

The app created for this article uses the last variant, i.e. passing a byte list representing an image retrieved from the device camera. When using this constructor for FirebaseVisionImage, additional metadata needs to be provided. This metadata contains information about the image size, raw format, and rotation.

See code snippet #2:

    final bytes = _concatenatePlanes(image.planes);
    final metadata = _prepareMetadata(image, sensorOrientation, deviceOrientation);
    final visionImage = FirebaseVisionImage.fromBytes(bytes, metadata);
    _textRecognizer.processImage(visionImage);

    FirebaseVisionImageMetadata _prepareMetadata(CameraImage image,
        int sensorOrientation, NativeDeviceOrientation deviceOrientation) {
      return FirebaseVisionImageMetadata(
        size: Size(image.width.toDouble(), image.height.toDouble()),
        rawFormat: image.format.raw,
        // One metadata entry per camera plane.
        planeData: image.planes
            .map((currentPlane) => FirebaseVisionImagePlaneMetadata(
                  bytesPerRow: currentPlane.bytesPerRow,
                  height: currentPlane.height,
                  width: currentPlane.width,
                ))
            .toList(),
        rotation: _pickImageRotation(sensorOrientation, deviceOrientation),
      );
    }
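Snippet #2 calls _pickImageRotation without showing its body. As an illustration only, here is a sketch of the arithmetic such a helper might perform. It is not taken from the sample project: the function name rotationCompensation and its parameters are hypothetical, and the device orientation is assumed to have already been converted to degrees (0, 90, 180 or 270).

```dart
/// Hypothetical helper: returns how many degrees (0, 90, 180 or 270) the
/// camera image must be rotated to appear upright.
///
/// [sensorOrientation] is the camera sensor's mounting angle in degrees.
/// [deviceRotationDegrees] is the current device rotation in degrees.
/// [frontFacing] selects the formula for a front-facing (mirrored) camera.
int rotationCompensation(
  int sensorOrientation,
  int deviceRotationDegrees, {
  bool frontFacing = false,
}) {
  if (frontFacing) {
    return (sensorOrientation + deviceRotationDegrees) % 360;
  }
  // Adding 360 keeps the left operand non-negative before the modulo.
  return (sensorOrientation - deviceRotationDegrees + 360) % 360;
}
```

The resulting angle would then be mapped to one of the plugin’s ImageRotation values (rotation0, rotation90, rotation180 or rotation270) before being passed as the rotation in the metadata.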

It’s worth mentioning that the Android implementation of ML Kit requires an image in the NV21 format. However, the camera only provides the YUV_420_888 format.

But worry not! To convert YUV_420_888 into NV21, you simply need to concatenate the byte lists of all planes into a unified byte list.

Code snippet #3 shows how (WriteBuffer comes from package:flutter/foundation.dart, and Plane from the camera plugin):

    Uint8List _concatenatePlanes(List<Plane> planes) {
      final WriteBuffer allBytes = WriteBuffer();
      planes.forEach((Plane plane) => allBytes.putUint8List(plane.bytes));
      return allBytes.done().buffer.asUint8List();
    }
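Plain concatenation relies on the camera delivering its planes in a layout that already matches NV21. For comparison, here is a sketch of an explicit conversion that reads each sample through its row and pixel strides and writes the full-resolution Y plane followed by interleaved V and U samples, which is the NV21 layout. It is not the sample project’s code: the PlaneData class and the yuv420ToNv21 function are hypothetical stand-ins that work on plain byte lists, and the image is assumed to have even width and height.

```dart
import 'dart:typed_data';

/// Hypothetical plain-data stand-in for one camera plane.
class PlaneData {
  PlaneData(this.bytes, this.bytesPerRow, this.bytesPerPixel);

  final Uint8List bytes;
  final int bytesPerRow; // row stride, may include padding
  final int bytesPerPixel; // pixel stride within a row
}

/// Builds an NV21 buffer: all Y samples first, then V and U interleaved
/// at half resolution in both directions.
Uint8List yuv420ToNv21(
    int width, int height, PlaneData y, PlaneData u, PlaneData v) {
  final chromaWidth = width ~/ 2;
  final chromaHeight = height ~/ 2;
  final out = Uint8List(width * height + 2 * chromaWidth * chromaHeight);
  var offset = 0;

  // Copy the luma plane, skipping any row padding.
  for (var row = 0; row < height; row++) {
    for (var col = 0; col < width; col++) {
      out[offset++] = y.bytes[row * y.bytesPerRow + col * y.bytesPerPixel];
    }
  }

  // Interleave chroma as V, U, V, U, ... (NV21 order).
  for (var row = 0; row < chromaHeight; row++) {
    for (var col = 0; col < chromaWidth; col++) {
      out[offset++] = v.bytes[row * v.bytesPerRow + col * v.bytesPerPixel];
      out[offset++] = u.bytes[row * u.bytesPerRow + col * u.bytesPerPixel];
    }
  }
  return out;
}
```

With a real CameraImage, the three PlaneData values would be filled from the bytes and strides of image.planes. The per-pixel loop is slower than concatenation, so it is probably worth considering only if concatenated frames come out with wrong colors or are rejected.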

Add text recognition using Firebase Machine Learning

To sum up, Firebase Machine Learning gives mobile developers a convenient and easy way to use ML without needing to be experts in the field.

As shown above, adding it to a project is easy and straightforward. You now also know how to integrate ML Kit for Firebase text recognition into your Flutter app to further explore the AI & ML world.

For the curious ones, see the sample project created for this article.


Daniel Łojewski
Android / Flutter Technical Lead