The Teachable Object Recognizer mobile application uses AI and machine learning to help access the visual world.
Spending more time at home due to the pandemic, many of us are occupied with more indoor activities than before. From cooking and crafts to household projects, we interact with everyday objects, new and old, from packages of food to items of clothing. Individuals who are blind have quickly adopted technologies and techniques that allow them to identify many everyday objects to accomplish the same tasks. For example, they may use bar code readers to distinguish products such as pantry items. However, not all items in our surroundings have bar codes. Researchers at the University of Maryland College of Information Studies (UMD iSchool) are developing a mobile app that can help those with visual impairments more confidently interact with the visual world.
The Teachable Object Recognizer mobile application, developed by researchers at the UMD iSchool in collaboration with community members who are blind, uses AI and machine learning to help access the visual world. The app uses the camera on a smartphone to capture images of surrounding objects or faces, prompts the user to create a description or label for the images, and then remembers these labels. Once images and descriptions are archived, the user can point their camera at a child playing with a dog and the app will respond with, say, “George is playing with a dog.” The same concept is being applied in a second project that uses wearable cameras, which offer a more natural way to interact with the world and people around us than pointing a phone camera.
“You can easily imagine needing this very detailed level of description of things you have in your house or office that you don’t always want to have to memorize by touch or smell,” said Dr. Hernisa Kacorri, assistant professor and researcher at the University of Maryland leading the Teachable Object Recognizer project. “[Within] this idea of personalization in the context of smart applications, there’s a term that we call ‘teachable interfaces’ or ‘teachable machines.’ So you don’t just have an expert collecting a bunch of data and then deploying it, but the user changes the behavior of the model or the smart technology.”
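To make the idea of a teachable interface concrete, here is a minimal sketch in Python of a recognizer whose behavior is shaped entirely by examples the user supplies. It is an illustration only, not the app's actual implementation: the `TeachableRecognizer` class, its `teach` and `recognize` methods, and the three-number "feature vectors" standing in for camera images are all hypothetical, and the nearest-centroid matching is a simple stand-in for whatever model the researchers use.

```python
import math
from collections import defaultdict


class TeachableRecognizer:
    """Toy nearest-centroid recognizer taught by the user (hypothetical)."""

    def __init__(self):
        # Maps each user-chosen label to the feature vectors taught for it.
        self._examples = defaultdict(list)

    def teach(self, label, features):
        """Store one user-labeled example (features stand in for an image)."""
        self._examples[label].append(list(features))

    @staticmethod
    def _centroid(vectors):
        # Average the stored vectors for one label, dimension by dimension.
        count = len(vectors)
        return [sum(column) / count for column in zip(*vectors)]

    def recognize(self, features):
        """Return the label with the closest centroid, or None if untaught."""
        best_label, best_distance = None, math.inf
        for label, vectors in self._examples.items():
            distance = math.dist(features, self._centroid(vectors))
            if distance < best_distance:
                best_label, best_distance = label, distance
        return best_label


# The user "teaches" two objects with made-up feature vectors.
recognizer = TeachableRecognizer()
recognizer.teach("cereal box", [0.9, 0.1, 0.2])
recognizer.teach("cereal box", [0.8, 0.2, 0.1])
recognizer.teach("soup can", [0.1, 0.9, 0.7])

# A new, unlabeled observation is matched against what was taught.
result = recognizer.recognize([0.85, 0.15, 0.2])
print(result)
```

In this sketch, nothing is recognized until the user has taught at least one label, and adding more examples for a label shifts how later observations are matched. That is the sense in which the user, rather than an outside expert, changes the model's behavior.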
With progress in AI and machine learning technology, there is also a broader concern about dependence on data and its availability. Researchers share data resources within the community and use methods like crowdsourcing to collect large data sets from a broader population. However, when it comes to users with disabilities, this data is still very sparse. The UMD iSchool team has shared the data collected during the development of the Teachable Object Recognizer with researchers and industry. The team is also deploying a central repository on the web with information about any data generated by disabled users. This will allow widespread use of such data by researchers and anyone around the world who cares to make AI inclusive.
“It’s hard to find data from people with disabilities that can be used to train machine learning models. We’re talking about a smaller population, so we expect less data. But even within a specific disability group, there will be highly varying characteristics between members of that group,” said Kacorri. “This is where personalization through teachable interfaces can shine. Rather than assuming one size fits all, we can create applications that enable individuals to fine-tune them to their behaviors. And we are working to make this happen for anyone even if they don’t know exactly how AI works.”
The Teachable Object Recognizer has already drawn attention from prominent tech companies, including Microsoft, so we may start to see other related products developed in the near future.
Check out the links below to learn about more digital access/inclusion research initiatives from UMD faculty and students.
- Reviewing Speech Input with Audio: Differences between Blind and Sighted Users
- Pedestrian Detection with Wearable Cameras for the Blind: A Two-way Perspective
- Revisiting Blind Photography in the Context of Teachable Object Recognizers
- Sign Language Recognition, Generation, and Translation: An Interdisciplinary Perspective
- #HandsOffMyADA: A Twitter Response to the ADA Education and Reform Act