Investigating the use of Machine Learning for Automated Element Location in Test Automation

Modern web application development is constantly evolving. Growing demand for online businesses and the increasing complexity of web applications have made automated testing more important than ever. Automation testing tools such as Selenium allow testers to automate testing processes across multiple browsers. To test the functionality of web elements, Selenium relies on web element locators, such as id and name attributes and XPaths, which select and retrieve elements from the Document Object Model (DOM) using a given query. Despite the numerous benefits that automated testing brings to organizations, the challenges of implementing and maintaining automated test suites must be considered [1]: changes to the application under test (AUT) require the automated tests to be updated accordingly so that they continue to provide accurate results, which can be time-consuming and costly.

This study investigated the use of machine learning (ML) techniques, trained on a pre-processed dataset of HTML elements, to identify particular web elements on e-commerce websites. The approach locates an element using multiple attributes, so that the element can still be identified if some of its attributes change. Previous work has successfully developed a framework that aids in automating web application testing using machine learning [2]. The adopted methodology comprised four stages: data collection and dataset construction; data preprocessing and feature extraction; classification; and integration of the developed model with Selenium.

In the first part of the implementation, a list of e-commerce websites was retrieved from analytical tools using web-scraping tools, and a dataset of the following HTML elements was generated from those websites: search fields, add-to-cart buttons and checkout buttons. The generated dataset was then pre-processed, that is, cleaned by tokenizing, lemmatizing and removing stop words. A support vector machine (SVM) model was trained using bag-of-words (BoW) vectors as features. Tests were performed to determine whether the trained model could predict a web element, and the classifier was evaluated using performance metrics, with satisfactory results.
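The preprocessing and feature-extraction steps described above can be illustrated with a minimal sketch. It is not the study's implementation: the stop-word list, the attribute names and the two example elements are assumptions made for illustration. Lemmatization and the SVM itself are left out because they would normally rely on third-party libraries.

```python
import re
from collections import Counter

# Illustrative stop-word list (an assumption; the study's list is not given).
STOP_WORDS = {"a", "an", "the", "to", "of", "in", "for", "and", "btn", "js"}


def tokenize(text):
    """Split attribute text into lowercase word tokens."""
    # Break camelCase identifiers, e.g. "addToCart" -> "add To Cart".
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    # Keep alphabetic runs only, so "btn-primary" yields "btn" and "primary".
    return [token.lower() for token in re.findall(r"[A-Za-z]+", spaced)]


def preprocess(element):
    """Turn a dict of element attributes into a cleaned token list."""
    tokens = []
    for value in element.values():
        tokens.extend(tokenize(value))
    return [token for token in tokens if token not in STOP_WORDS]


def build_vocabulary(token_lists):
    """Map every distinct token to a column index, in sorted order."""
    vocab = sorted({token for tokens in token_lists for token in tokens})
    return {token: index for index, token in enumerate(vocab)}


def bag_of_words(tokens, vocabulary):
    """Count token occurrences into a fixed-length vector."""
    counts = Counter(tokens)
    return [counts.get(token, 0) for token in vocabulary]


# Hypothetical scraped elements, represented as attribute dictionaries.
elements = [
    {"tag": "button", "id": "addToCart", "class": "btn btn-primary", "text": "Add to cart"},
    {"tag": "input", "name": "q", "placeholder": "Search for products"},
]

token_lists = [preprocess(element) for element in elements]
vocabulary = build_vocabulary(token_lists)
vectors = [bag_of_words(tokens, vocabulary) for tokens in token_lists]
```

Each vector in `vectors` has one column per vocabulary token, which is the form in which a classifier such as an SVM would receive the elements.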

An API was developed to demonstrate how the trained model can be used with Selenium to generate test cases that test the functionality of a web element. This enhances web automation testing, as it enables test engineers to use higher-level domain-specific abstractions instead of identifying web elements using low-level locators, thus simplifying test creation and leaving the automated tests unaffected by changes to the locators in the AUT.
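The idea of locating an element by a domain-level label rather than a low-level locator might look roughly like the sketch below. The `locate` function, the label names and the keyword-based scorer are hypothetical; the scorer merely stands in for the trained classifier so that the example runs without a browser or model. In a real Selenium test the candidates would presumably be gathered from the page, for instance with `driver.find_elements` and `get_attribute`.

```python
def locate(candidates, label, score):
    """Return the candidate that `score` rates highest for `label`."""
    best = max(candidates, key=lambda element: score(element, label), default=None)
    if best is None or score(best, label) <= 0:
        raise LookupError(f"no element classified as {label!r}")
    return best


# Hypothetical stand-in for the trained model: counts label keywords
# that appear anywhere in the element's attribute values.
KEYWORDS = {
    "add_to_cart": {"add", "cart", "basket"},
    "search_field": {"search", "query"},
    "checkout": {"checkout"},
}


def keyword_score(element, label):
    text = " ".join(element.values()).lower()
    return sum(1 for word in KEYWORDS[label] if word in text)


# The same page before and after a redesign that changes ids and wording.
page_v1 = [
    {"tag": "input", "id": "search-box", "placeholder": "Search"},
    {"tag": "button", "id": "add-to-cart", "text": "Add to cart"},
]
page_v2 = [
    {"tag": "input", "id": "q", "placeholder": "Search products"},
    {"tag": "button", "id": "btn-42", "text": "Add to basket"},
]

# The test refers only to the label, so it needs no update after the redesign.
button_before = locate(page_v1, "add_to_cart", keyword_score)
button_after = locate(page_v2, "add_to_cart", keyword_score)
```

A locator such as `id="add-to-cart"` would fail on `page_v2`, whereas the label-based lookup still resolves to the button because several attributes are considered together.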

Figure 1. Architecture of Web Element Classifier

References/Bibliography:

[1] Berner, S., Weber, R. & Keller, R. K. Observations and lessons learned from automated testing.
[2] Nguyen, D. P. & Maag, S. Codeless web testing using Selenium and machine learning. ICSOFT 2020: 15th International Conference on Software Technologies, Jul 2020, Online, France, pp. 51-60. doi:10.5220/0009885400510060. Available at: https://hal.archives-ouvertes.fr/hal-02909787/document

Student: Kelsey Debono
Course: B.Sc. (Hons.) Computing Science
Supervisor: Dr Mark Micallef