VisionAware is pleased to speak with Dr. Yonatan Wexler, the head of Research and Development at OrCam, an Israeli start-up company founded in 2010 by Amnon Shashua, Sachs Professor of Computer Science at The Hebrew University of Jerusalem.
The mission of OrCam is to develop a “portable, wearable visual system with ‘human-like’ capabilities” for blind and visually impaired persons, via the use of artificial computer intelligence and augmented reality. The OrCam is based on a computer algorithm that Professor Shashua has pioneered with Dr. Wexler and Shai Shalev-Shwartz, a Hebrew University colleague. You can learn more at The OrCam: A Portable, Wearable Visual System for Blind and Visually Impaired Persons on the VisionAware blog.
Please note: This interview does not imply endorsement of the OrCam by VisionAware. It is intended only to provide information for our readers.
Maureen Duffy: Hello, Dr. Wexler. I appreciate your taking the time to talk with our readers. First, the basics: How did you become interested in and/or involved with the OrCam project?
Yonatan Wexler: Visual arts always fascinated me. When computers came along, I found it surprisingly hard to explain precisely how we perceive the world in visual terms. We do it so innately that we can’t explain how we do it. This question led me to a career of researching in the field called “Computer Vision.” Three years ago, Amnon Shashua, my Ph.D. advisor, contacted me about creating a device for blind people. For this we had to develop the most sophisticated artificial visual system.
MD: Had you had experience with blind people prior to this?
YW: Not very much. This was interesting, since it took a while to learn what blind and visually impaired people actually need. It is different from what we expected at the beginning. For example, recognizing obstacles is not a key difficulty. This is because a simple [white] cane gives all the needed information. The ability to read, on the other hand, has a far greater added value to a person who is blind.
MD: I’m interested in the computer algorithm your team developed. Can you explain it to a non-technical audience?
YW: The biggest challenge any visual system faces is the sheer amount of data. Every image has millions of measurements (called pixels). Any attempt to make sense of these has to be very lean, so it isn’t bogged down with the sheer volume of the input. For this we have published a new algorithm [i.e., a step-by-step problem-solving procedure] called ShareBoost. It finds the most informative parts of the image that can be combined into a realization of what is in the image.
MD: How long did it take to develop the algorithm?
YW: We spent about a year devising the core algorithm and then two more years applying it to the various aspects of the device. Creating something of value, from a scientific perspective and by solving a fundamental human need, is very rewarding. When we started working on this, it was an open problem from a scientific perspective. Such cases, when it is not clear that there is a solution, require real determination. We have a fantastic team that is dedicated and focused; otherwise we would not have succeeded – and definitely not in such a short time.
MD: What types of information are pre-stored in the OrCam system at the time of purchase?
YW: The system comes programmed to recognize bank notes. We know that this is something of value for all of our users. It is very easy to teach the system more products; doing so takes only a few seconds. Each user, therefore, can teach the system the products they care about.
MD: Can you explain the gestures that are required to add an item and add a face or place?
YW: Adding an item is very simple. It starts by pressing a button. The system will prompt the user to wave the product so the camera can see it from various angles. The user then records the name of the product and concludes by pressing the button again. All this takes up to twenty seconds.
MD: You say that the OrCam is intuitive, but is there a learning curve?
YW: My experience is that a five-minute demonstration suffices to teach most people how to use the device. The interaction is gesture-based, and therefore intuitive. You point at what you are interested in, and the device talks back to you in plain English. It is part of our natural body language that the device responds to. That is why it is so intuitive.
MD: I’m also following the OpenGlass Project with great interest. Can you explain the difference between Google Glass and the OrCam? It seems that readers find it difficult to differentiate the two systems.
YW: Google Glass is essentially a wearable screen. OrCam (pictured at left) is a smart camera that speaks. It has no screen, since this has no value for most visually impaired persons. As such, they are very different. OrCam is also designed to be a robust device that is dependable and can sustain daily use.
MD: How much does the OrCam cost and is there insurance coverage on the horizon?
YW: The device is priced at $2,500 for our first users, which corresponds to the price of a hearing aid. The demand has been overwhelming. We started working towards getting insurance coverage in order to help our customers and will update the information on our website once progress has been made.
MD: Is there anything else you’d like to add?
YW: It is a great privilege being able to apply this technology to something that will improve the lives of so many people.
We thank Dr. Wexler for his support of VisionAware and for his research on behalf of blind and visually impaired persons worldwide. You can read about additional OrCam-related projects in development at the German Research Center for Artificial Intelligence website.
Where you can find Dr. Yonatan Wexler and OrCam online:
- At the OrCam website
- On Facebook
- On Twitter @OrCam
- On Twitter @yoniwexler