MorphoTest

Learning a language is a challenging task that requires a substantial amount of time and dedication before any progress can be seen. This holds true for the Maltese language, and for Maltese grammar in particular. Maltese has a ‘mixed’ grammar, which reflects its origins. For instance, words of Semitic origin follow a root-and-pattern conjugation system, whilst words of Romance origin follow a stem-and-affixation pattern. Both children and adults learning the language often find that they need to memorise which system applies to which set of words.

Compared to other languages, Maltese is considered a low-resource language, meaning that few resources are available to process Maltese computationally. The same is true of educational resources that could help Maltese-language learners make progress. The main aim of this project is to investigate how existing natural language processing (NLP) tools can be used to facilitate the creation of a language-learning tool for Maltese. Due to the richness of Maltese morphology (i.e., the structure of its words and the way in which they interact), the research seeks to create an application that could help language learners practise this grammatical aspect of the language.

The language-learning sector is vast, and there are now many smartphone applications that seek to aid language learning. However, many of these applications do not necessarily use NLP tools to best advantage.

One such application is WordBrick [1], which seeks to tackle the difficulty of independent language learning by displaying a jumble of words, presented in different shapes and colours, that the user must rearrange to form a proper sentence. Echoing jigsaw puzzles, each word has connectors of a specific shape, so that only a word with a matching shape can be joined to it.

Inspired by WordBrick, this project allows learners to build words from their different components (morphemes) and to study the meaning of each component. In order to achieve this, we take advantage of Ġabra [2], an open-source lexicon for Maltese. The first step was to automatically segment words into their components and to associate a label with each component. This task is referred to as morphological analysis. The components would then be presented to the user, jumbled up, and the user would have to join the pieces together again in the right order to produce the word. The language-learning component would then determine which words should be presented to which learners, according to their level. The type of exercise offered could also be varied by reversing the process, asking the learner to segment a word and to attach a meaning to each of its parts.
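The jumble-and-reassemble exercise described above can be sketched roughly as follows. This is a minimal illustration and not the project's implementation: the `Morpheme` class, the `make_exercise` and `check_answer` functions, and the hand-written segmentation of the verb form "niktbu" (with its labels) are all assumptions made for the example. Nothing here is drawn from Ġabra or from the project's morphological analyser.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Morpheme:
    """One labelled component of a word (hypothetical structure)."""
    form: str
    label: str


# Hand-written, illustrative segmentation (an assumption, not analyser output).
NIKTBU = (
    Morpheme("n", "first person"),
    Morpheme("iktb", "stem: write"),
    Morpheme("u", "plural"),
)


def make_exercise(segments, rng):
    """Return the morphemes in a jumbled order that differs from the answer."""
    original = list(segments)
    pieces = list(original)
    # Retry a bounded number of times so the puzzle is not already solved.
    for _ in range(10):
        rng.shuffle(pieces)
        if pieces != original:
            return pieces
    # Deterministic fallback: rotate by one position.
    return original[1:] + original[:1]


def check_answer(segments, attempt):
    """True if the learner's ordering of pieces spells the original word."""
    return [m.form for m in attempt] == [m.form for m in segments]


if __name__ == "__main__":
    jumbled = make_exercise(NIKTBU, random.Random())
    print("Arrange these pieces:", [m.form for m in jumbled])
    print("Correct order accepted:", check_answer(NIKTBU, NIKTBU))
```

The reversed exercise mentioned above could reuse the same data: the learner is shown the whole word and asked to supply the sequence of forms and labels, which is then compared against the stored segmentation.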

The developed application demonstrates how NLP techniques could assist Maltese-language learners. The main aim of the application is to provide a basis for the development of further exercises that use NLP as their backbone, allowing teachers to create exercise content more easily and with greater variety.

Figure 1. A screenshot of WordBrick displaying how words are presented initially and then rearranged

References/Bibliography

[1] M. Purgina, M. Mozgovoy, and J. Blake, “WordBricks: Mobile Technology and Visual Grammar Formalism for Gamification of Natural Language Grammar Acquisition”, in Journal of Educational Computing Research, vol. 58, pp. 126–159, Mar. 2020. Publisher: SAGE Publications Inc.

[2] J. J. Camilleri, “A Computational Grammar and Lexicon for Maltese”, M.Sc. thesis, Chalmers University of Technology, Gothenburg, Sweden, Sep. 2013.

Student: David Vassallo
Course: B.Sc. IT (Hons.) Artificial Intelligence
Supervisor: Dr Claudia Borg