VR Enhance – aiding human speech and sensorimotor skills using virtual reality

Stroke remains one of the major causes of most linguistic and functional disabilities, but it is not the only cause of such deficits. Current rehabilitation programmes struggle to keep up with the increasing daily demand for therapy. The psychological impact of therapy is also not to be underestimated.

This study investigates a rehabilitation game built on a virtual reality (VR) system, which uses multimodality to identify both speech and dynamic gestures within a single application. The solution aims to provide an alternative means of therapy that allows patients to independently improve their speech and physical abilities, specifically those related to the upper extremities, with minimal to no guidance from therapists. For user engagement, the system applies the themes of magic and spells to instantiate intra-diegetic features after speech or gesture classification, which are amplified according to the user’s score. A sensor-based deep neural network recognises both one-handed and two-handed gestures, which is essential for targeting bimanual activities. For speech, the IBM Watson cloud-based speech-to-text service was used in streaming mode, allowing continuous speech recognition until a pause is detected.

The performance of both models was assessed through a user evaluation to validate the efficacy of the proposed system. When applied to 18 participants, the gesture model achieved a global accuracy of 93.3% and a Cohen’s kappa coefficient of 89.9%. These results indicate the model’s ability to generalise to different users whilst maintaining high accuracy. The speech model achieved an overall word error rate of 28.8%, which suggests that further improvements are required to recognise speech with low intelligibility. Nonetheless, a gradual improvement in user scores was observed over the 10 repetitions performed for each gesture-and-speech sequence. The system was also very well accepted by users, indicating that VR could be effectively applied to rehabilitation programmes in the future.
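
For readers unfamiliar with the two headline metrics, the following is a minimal Python sketch of how Cohen's kappa and word error rate are commonly computed. It is not the project's evaluation code: the function names, the gesture labels and the example phrases are hypothetical, and the word error rate here uses a plain word-level edit distance with lowercasing as the only normalisation.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Agreement between true and predicted labels, corrected for chance."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    # Observed agreement: fraction of samples where both labels match.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Expected agreement if labels were assigned independently
    # according to each list's own class frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        return 1.0  # only one class present in both lists
    return (p_o - p_e) / (1.0 - p_e)


def word_error_rate(reference, hypothesis):
    """(substitutions + deletions + insertions) / number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Word-level Levenshtein distance, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical gesture labels and spoken phrases, for illustration only.
true_gestures = ["circle", "circle", "push", "push"]
pred_gestures = ["circle", "circle", "push", "circle"]
kappa = cohen_kappa(true_gestures, pred_gestures)
wer = word_error_rate("cast the fire spell", "cast a fire spell")
```

Because kappa discounts the agreement expected by chance, it is typically lower than raw accuracy on the same predictions, which is consistent with the 89.9% kappa reported alongside the 93.3% accuracy.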

Figure 1. High-level diagram of the system
Figure 2. Intra-diegetic interface showing user analytics
Student: Ryan Camilleri
Course: B.Sc. IT (Hons.) Artificial Intelligence
Supervisor: Dr Vanessa Camilleri
Co-supervisor: Dr Andrea DeMarco