Software project baseline prediction

Software-project estimation is a complex process that involves predicting the time and resources required to complete a project. Currently, various methods are used to estimate software projects, including predictions by experts and statistical methods such as autoregressive integrated moving average (ARIMA). However, accurately predicting the time and resources needed for a project is challenging due to numerous variables, such as the size and complexity of the project, all of which could affect the estimation. Therefore, it would be essential to anticipate all possible factors that might influence the outcome of a project in order to ensure an accurate prediction.

One promising approach to improving software-project estimation is through the use of artificial intelligence (AI) models. Such models can analyse vast amounts of data to identify patterns and make predictions based on historical performance. Unlike expert predictions or statistical methods, AI models are not influenced by human biases or limitations, which means that they could potentially provide more objective and accurate estimates. An AI model trained on historical data could take into account variables, such as project complexity, team size, and code quality, to provide a more accurate estimate of project duration or cost.

The primary aim of this study was to investigate the feasibility of an approach by replicating a previously conducted study [1]. To achieve this, the study followed two proposed methods ‒ Deep-SE and TF/IDF-SVM ‒ and tested them on both the datasets mentioned in the original paper and a custom dataset. The primary objective was to verify whether the metrics would hold true for the proposed methods and if they could provide a reliable solution to software-project estimation. By testing on various datasets, the experiment was expected to obtain generalisable results, through which to determine the reliability of the proposed approach. The overall goal was to present an accurate and unbiased solution to the problem of software-project estimation.

The future work outlined in the paper by Tawosi et al [1] involved measuring the metrics of the models with filtered dataset descriptions, as opposed to unfiltered descriptions. The filtered descriptions contained chunks of code and https links. To verify the feasibility of the proposed approach and provide further insight into the effectiveness of the models under different conditions, in our research we  compared the performance of the proposed methods, namely Deep-SE and TF/IDF-SVM, with the filtered and unfiltered descriptions present in both the datasets of the original paper and a custom dataset.

The above test confirmed that AI models have the potential to improve the accuracy of software-project estimation by providing more objective and accurate estimates, which may lead to better planning, resource allocation, and overall project success. Despite the challenges, accurate software-project estimation would be crucial for managing resources effectively and delivering projects on time. 

As software projects become increasingly complex and diverse, the use of AI in project estimation is likely to become more prevalent, making it imperative to continue exploring (and validating) its potential benefits. The findings of this study offer useful insights into the feasibility and effectiveness of AI-based approaches to software-project estimation, thus contributing to the ongoing effort to improve project management practices.

Figure 1. Use case diagram outlining software-project estimation

Student: Kieron Sultana
Supervisor: Dr Charlie Abela