Spam emails are a significant threat to internet users, and it is crucial to have effective systems in place to filter them out. Traditional spam filters are time-consuming and easy to bypass, so machine learning (ML) has emerged as a promising solution for spam detection. This project sought to develop a system that could automatically distinguish spam emails from genuine emails using ML techniques.
The project explored different ML techniques, including supervised learning, unsupervised learning, natural language processing (NLP), and network analysis. A diverse dataset of emails was collected, comprising both spam and legitimate emails for training and testing. The model was trained to identify patterns and learn the characteristics that could separate spam emails from genuine emails.
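As an illustration of the supervised approach described above, the following minimal sketch trains a multinomial Naive Bayes classifier on a handful of labelled messages. It is not the project's actual model: the choice of Naive Bayes, the whitespace tokenizer, the function names `tokenize`, `train` and `classify`, and the toy emails are all assumptions made for the example.

```python
import math
from collections import Counter

def tokenize(text):
    # Lowercase and split on whitespace (an assumption; a real system
    # would likely use a proper NLP tokenizer).
    return text.lower().split()

def train(examples):
    """Count words per class from (text, label) pairs; labels are 'spam' or 'ham'."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab

def classify(model, text):
    """Return the label with the higher log-probability under Naive Bayes."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        # Log prior: the fraction of training emails carrying this label.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen during training
            # Laplace (add-one) smoothing avoids zero probabilities.
            score += math.log((word_counts[label][word] + 1)
                              / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical toy data, invented purely for illustration.
TOY_EMAILS = [
    ("win a free prize now", "spam"),
    ("claim your free money today", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
]

model = train(TOY_EMAILS)
print(classify(model, "free prize money"))
print(classify(model, "project meeting monday"))
```

With so little training data the sketch only shows the mechanics of learning word patterns per class; a realistic system would need a far larger and more varied dataset, as the abstract notes.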
To evaluate the effectiveness of these techniques, a comparative analysis was conducted to determine the best approach for detecting spam emails. Explainable artificial intelligence (AI) techniques were integrated into the model and used to present results that could be understood by non-technical users. This would increase accountability, reduce bias, and align the system’s decision-making with ethical and legal standards.
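The comparative analysis mentioned above could, for example, score each technique on the same labelled test set. The sketch below computes precision, recall and F1 for the spam class. The choice of these metrics, the function name `precision_recall_f1`, and the two sets of predictions are assumptions for illustration; the abstract does not state which measures the project used.

```python
def precision_recall_f1(y_true, y_pred, positive="spam"):
    """Compute precision, recall and F1 for the class named by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1

# Hypothetical ground truth and predictions from two candidate techniques.
y_true = ["spam", "spam", "ham", "ham", "spam"]
predictions = {
    "technique_a": ["spam", "ham", "ham", "ham", "spam"],   # misses one spam
    "technique_b": ["spam", "spam", "spam", "ham", "spam"],  # flags one genuine email
}

results = {name: precision_recall_f1(y_true, y_pred)
           for name, y_pred in predictions.items()}
for name, (p, r, f1) in results.items():
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Reporting precision and recall separately matters for spam filtering because the two kinds of error have different costs: a false positive hides a genuine email, while a false negative lets spam through.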
To address the challenges in building an ML-based spam detection system, diverse and high-quality training datasets were used, and the system was regularly updated with new data. Moreover, advanced techniques, such as deep learning, were explored. Human expertise was integrated into the system for handling edge cases and verifying false positives and negatives. Integration with existing spam filters and email systems was also explored, with a view to enhancing spam detection.
Ultimately, by comparing the different techniques, it was possible to draw a conclusion as to the effectiveness of an AI-based approach and to identify the best solution. On the basis of these findings, this project promises to provide a valuable and effective tool for individuals and organisations seeking to improve their email security by reducing the impact of spam and protecting users from this threat.
Figure 1. Overview of the email filtering system
Figure 2. The process for developing an effective spam-detection system
Student: Jessica Silvio
Supervisor: Dr Clyde Meli