Dissertation Defense: Jonggi Hong

Event Start Date:
Wednesday, September 15, 2021 - 09:00 AM
Event End Date:
Wednesday, September 15, 2021 - 11:00 AM
Location
5105 Iribe Center

Title: Exploring Blind and Sighted Users' Interactions with Error-Prone Speech and Image Recognition

Author: Jonggi Hong

Abstract:
Speech and image recognition, already employed in many mainstream and assistive applications, hold great promise for increasing independence and improving the quality of life for people with visual impairments. However, their error-prone nature, combined with the challenge of inspecting errors non-visually, can hold back their use for more independent living. This thesis explores blind users' challenges and strategies in handling speech and image recognition errors through non-visual interactions, looking at both perspectives: that of an end-user interacting with already trained and deployed models, such as automatic speech recognizers and image recognizers, and that of an end-user who is empowered to attune a model to their idiosyncratic characteristics, as with teachable image recognizers. To better contextualize the findings and account for human factors beyond visual impairments, the user studies also involve sighted participants in a parallel thread.

More specifically, Part I of this thesis explores blind and sighted participants' experience with speech recognition errors through audio-only interactions. Here, the recognition result from a pre-trained model is not displayed; instead, it is played back through text-to-speech. Through carefully engineered speech dictation tasks in both crowdsourcing and controlled-lab settings, this part investigates the percentage and types of errors that users miss, their strategies for identifying errors, and potential manipulations of the synthesized speech that may help users better identify the errors.

Part II investigates blind and sighted participants' experience with image recognition errors. Here, we consider both pre-trained image recognition models and those fine-tuned by the users. Through carefully engineered questions and tasks in both crowdsourcing and simulated remote lab settings, this part investigates the percentage and types of errors that users miss, their strategies for identifying errors, and the potential for avoiding such errors through iterative training for personalization.

Examining Committee:

  • Chair: Dr. Hernisa Kacorri
  • Dean's Representative: Dr. Marine Carpuat
  • Members: Dr. Huaishu Peng, Dr. Leo Zhicheng Liu, Dr. Leah Findlater

Bio:
Jonggi Hong is a Ph.D. student in the Department of Computer Science, working under the supervision of Prof. Hernisa Kacorri. His research interests include human-computer interaction, machine learning, and accessibility.