Seeing AI: Artificial Intelligence for Blind and Visually Impaired Users
By Steven Kelley, CVRT
Seeing AI: An Exciting New Release
Microsoft has released a new free app for Apple’s iPhone, called Seeing AI, and it is generating a lot of interest in a short period of time. In less than a week, “techies” were giving glowing reviews to the app and podcasters were creating tutorials, which is all great news for consumers looking for an introduction to using the app.
I too have downloaded the app and tried it out with considerable interest, because it is a free app, and because it includes what many technology users with a vision impairment are seeking most in productivity tools: reading printed text; object or obstacle recognition; and labeling utilities, all packaged together in one app!
Writing a comprehensive review of Seeing AI is beyond the scope of this article and at the rate techies are reviewing it, there will be plenty of tutorials and guidance available elsewhere. That said, the structure and components of the app itself provide an excellent outline for a review of some of the apps it is competing with, and the concepts that may be helpful to users as they decide which of these software tools is right for them.
The Channels of Seeing AI
The Seeing AI app has several navigational “channels” that correspond nicely with other technology that is useful to users with visual impairments: Short Text; Documents; Product; Currency; Person; and Scene. (Both Person and Scene are beta works in progress.) Each channel corresponds to a specific type of app or software utility in the following broader categories:
- Short Text and Documents are types of Optical Character Recognition or OCR.
- Product is a barcode scanner. It scans the barcode that appears on most products to identify the product and retrieve other details about it, such as directions and ingredients.
- Person, Currency, and Scene are all forms of object recognition. They have the ability to take a picture of something and have either the software or a remote human being identify the object and communicate that information to the user.
One reason Seeing AI is getting a great deal of press in a short period of time, besides being free, is that it combines all of the above features into a single app. Traditionally, users would get one or more apps for each feature listed above and pay for several of them.
However, a critical comparison of Seeing AI with some of its stand-alone competitors can reveal significant features that make those apps a good value, even next to the free Seeing AI app.
Optical Character Recognition
One of the most powerful categories of productivity apps available for visually impaired users on both smartphones and tablets is Optical Character Recognition, or OCR: software that enables users to take a picture of text, and have the app read it back with the device’s screen reader. For example, you can take a picture of a magazine article or printed recipe and have it read aloud by the smartphone!
OCR is certainly nothing new and has been around for decades. Years ago, when Ray Kurzweil developed one of the first commercially available OCR devices, it originally cost tens of thousands of dollars and was the size of a filing cabinet—neither cheap nor portable.
Over time, these devices have morphed into software applications that run on a smartphone and cost far less, or nothing at all, as in the case of Seeing AI.
Up until now, the go-to OCR app for both Android and iOS has been the KNFB Reader (pictured at left). The “K” in the product name stands for Ray Kurzweil, who is still developing OCR products, and “NFB” stands for the National Federation of the Blind.
Although the app works in much the same way on iOS and Android, the KNFB Reader is significantly less expensive on Android: about $20 versus the $100 price tag on the iOS side.
Another OCR app from Microsoft is called Office Lens (pictured at left), and it too is free. Unlike Seeing AI, it was designed as a simple OCR app for both the computer and iOS, not specifically for visually impaired users. However, it works well with a screen reader and has some support for focusing the camera.
The software can identify the edges of the paper to be photographed and prompt users when all edges of a document are within the viewfinder. The user is then prompted to take the picture. The photo is processed, and the text saved into a digital text document that can be read by a screen reader.
Another alternative OCR app for iOS is Prizmo. Like the others, Prizmo provides support for VoiceOver users to position a document below the camera viewfinder and prompts when to take the picture. After the photo is processed, text can be exported to a document and read with the screen reader. Prizmo is just under $10 and appears to be available only on iOS devices.
This is by no means a comprehensive list of OCR apps—there are many others with a variety of features, and many will save text so that a screen reader can read it back out loud. Several recent reviewers have predicted that the free apps will be the demise of the more expensive apps, like KNFB Reader.
As a low vision user myself, I routinely use several features of the KNFB Reader on an Android phone that may not be available in the free apps. The font size is adjustable, the words being read are highlighted, and the text is wrapped to fit the screen (pictured at left). Depending on a user’s needs, additional features such as these may be well worth the extra money. Don’t write off one of the alternative apps just because it costs a little extra. You may really love one of the features, or it may ultimately increase your efficiency.
“What’s in this unlabeled can?” If you can’t read the printed label on a can, and it’s of a standard size and shape, it’s virtually impossible to tell what’s in it without some sort of labeling system in place.
The Product channel on Seeing AI provides a barcode scanner that uses the camera to help locate the barcode printed on most product labels, and then read out the available product information. This may be as simple as the name of the product, size, and weight, or it may also contain more comprehensive information, like ingredients.
There are many barcode scanner apps available for both iOS and Android devices, but the standard for iOS and VoiceOver is Digit-Eyes, which costs $9.99 on the App Store. With most barcode scanner apps, the product is held up to the camera and moved about until the software detects a barcode; the app then uses that information to locate the item in a database. The database is usually not stored on the tablet or phone, so to use these apps successfully, you must have an Internet connection.
One significant advantage of Seeing AI over Digit-Eyes is location guidance: a sequence of tones that grows more rapid as the user gets closer to the barcode. Locating a barcode on a product with limited or no vision can be a real challenge, and this is a great feature.
On the other hand, Digit-Eyes lets users print their own barcode labels for household items like CDs or even clothing, then record information about each item for later identification. Consumers wanting a more complete labeling system will find Digit-Eyes a more comprehensive solution overall.
i.d. mate Galaxy
It is worth noting again that these apps require an Internet connection to identify items. It has been my experience that you can’t always count on a good Internet connection in a store, which means that it may take some time to get a response, or it may not be usable at all. If barcode scanning is something you want to do in areas with unknown Internet availability, a dedicated product like the i.d. mate Galaxy from En-Vision America may be a better solution.
While significantly more expensive at $1,299, it maintains a huge database of products on its own hand-held computer, so regardless of the Internet connection, most products can be identified in seconds. Another significant feature is that the i.d. mate uses a laser to scan barcodes, which in my trials seemed much quicker and more sensitive than the camera used on smart devices. Like Digit-Eyes, this product also allows users to create their own labels and add products to the database.
Seeing AI offers a currency identifier that is every bit as accurate as EyeNote, the app from the United States Bureau of Engraving and Printing, or the NantMobile Money Reader. Like Seeing AI, both apps are free, and each uses the camera on a phone or tablet to identify a bill’s denomination.
The Person and Scene channels both seem to be in beta mode, and users can expect varying results. In both channels, the goal is for the Artificial Intelligence (the “AI” in Seeing AI) to identify the people and objects in the scene. In the Person channel, users may take a picture of someone they know and label this person, so that any time that person is in a scene on the camera or viewfinder they will be identified to the user.
In my brief trial with the software, Seeing AI identified the number of people in a scene and, if their faces were visible, attempted to guess each person’s age and facial expression. When a student took a picture of me, for example, Seeing AI described me as two years older than I actually am, identified my gender, and described my expression as smiling. Not bad!
TapTapSee and Be My Eyes
For several years, a number of object recognition apps have been available, notably TapTapSee and Be My Eyes. Both apps rely heavily on volunteers to assist with the identification process. A user takes a picture of an object or scene, and the image is forwarded to a volunteer who sends a brief description back to the user within a short period of time. This is quite different from Artificial Intelligence that relies on the software to describe objects and scenes.
Both of these services are currently free; however, the user may not always get immediate results. My few trials with TapTapSee returned useful results within approximately 15 seconds, so it was quite responsive.
Object recognition has taken an exciting leap forward with a relatively new app and service called AIRA. This service uses a small video camera and speaker mounted on a pair of glasses and connected to a smartphone, which enables the user to contact a trained, sighted assistant for additional information or sighted support.
In real time, a user connects with a person who can look remotely through the video camera to describe a scene, look up directions, read a sign, and perform many other functions. AIRA relies on an Internet connection for this service, so in areas where cellular Internet reception is poor it will not be readily available. AIRA charges a monthly subscription fee, ranging from a base rate of $89/month for 100 minutes to $329/month for unlimited usage. AIRA certainly gives you an idea of where object recognition apps and services are going, and how useful they can be.
Wrapping It Up
Seeing AI is a great app because it combines several utilities – OCR, product identification, and object recognition – into one free app. For users new to these utilities who use an iOS device, this is a great introduction to these features on a portable smart device. Think of the app as that utilitarian Swiss Army Knife for the iPhone!
If these apps and utilities are all new to you, shop around. The hype about Microsoft’s Seeing AI is warranted, but don’t let that prevent you from exploring other great apps that may have additional features that can work well for your needs.