E-commerce Based Information Retrieval and Personalised Search

The growth of E-commerce websites and of their user base over the years has meant that retailers must deal with information overload while keeping customers satisfied [4]. Such specialised systems therefore need to retrieve unstructured information and descriptions about their product inventory in order to personalise search results. The main challenge in Information Retrieval is retrieving products that meet a user’s needs using a matching algorithm. The difficulty stems mainly from polysemy (one term with several meanings) and synonymy (several terms with the same meaning) [5], and from searchers not knowing how to express their information needs in a short search query [3]. These problems are more prevalent in smaller, specialised systems such as those in the E-commerce domain. On the World Wide Web, many relevant documents are likely to match a query. In a specialised collection, by contrast, the terms used to represent documents are less likely to be common [1].
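The synonymy problem described above can be illustrated with a minimal sketch of plain term matching. The product descriptions, the query, and the helper names `bag` and `cosine` are hypothetical and chosen only for illustration; they are not taken from the dissertation's collection or implementation.

```python
import math
from collections import Counter


def bag(text: str) -> Counter:
    """Turn a text into a bag-of-words term-count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical product descriptions (illustrative only).
products = {
    "p1": "grey fabric couch three seater",  # relevant, but uses a synonym
    "p2": "waterproof sofa cover",           # shares the query term
}

query = bag("sofa")
scores = {pid: cosine(query, bag(desc)) for pid, desc in products.items()}
```

Under pure term matching, the couch in `p1` receives a score of zero for the query "sofa" because it shares no terms with it, while the cover in `p2` matches. This is the kind of vocabulary mismatch that makes short queries over small collections difficult.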

Despite these problems, techniques including Latent Semantic Analysis and Automatic Relevance Feedback are known to improve Information Retrieval results [4]. The ultimate goal is a user-adaptive system that automatically provides users with information according to their needs [2]. In personalised search in an E-commerce setting, the biggest challenge is obtaining information about a user in a non-intrusive way. This information can then be used to build the user’s profile, which captures the user’s preferences and needs over time [2]. Another challenge is choosing a recommendation algorithm that uses the information gathered about users to personalise their search results according to their needs [2].

This dissertation describes an information retrieval and personalised search system for which different techniques were researched and developed to achieve the best outcome for the problem outlined. The proposed system is split into two components: Information Retrieval and Personalised Search. The information retrieval component retrieves textual information about products from an E-commerce collection and processes it to extract the most useful features. In the personalised search component, a personalisation algorithm converts user information into user models. The system then uses the extracted product features and the user models to re-rank the search results returned for user queries.

Figure 1. System Overview

When evaluating our system, we found that using information retrieval techniques such as Latent Semantic Analysis greatly improves the relevance scores of the search results. When personalising the search results according to the user’s preferences, incorporating the popularity of the product, as well as the similarity between the query terms and the product description terms, helped to obtain the best result over all the queries. The user-product relevance had the greatest impact on the re-ranking of search results, indicating that personalisation according to the user’s preferences is desired.
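One way to picture how these three signals could be combined during re-ranking is a weighted sum, sketched below. This is an assumption made for illustration, not the dissertation's actual scoring function: the `Candidate` class, the `rerank` function, the weights, and the example scores are all hypothetical, and each signal is assumed to be normalised to the range [0, 1].

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    product_id: str
    query_similarity: float  # query terms vs. product description terms
    popularity: float        # normalised product popularity
    user_relevance: float    # user-product relevance from the user model


def rerank(candidates, w_query=0.3, w_pop=0.2, w_user=0.5):
    """Sort candidates by a weighted sum of the three signals (highest first).

    The weights are illustrative assumptions, not values from the dissertation.
    """
    def score(c: Candidate) -> float:
        return (w_query * c.query_similarity
                + w_pop * c.popularity
                + w_user * c.user_relevance)

    return sorted(candidates, key=score, reverse=True)


# Hypothetical search results for one query and one user.
results = [
    Candidate("A", query_similarity=0.9, popularity=0.2, user_relevance=0.1),
    Candidate("B", query_similarity=0.6, popularity=0.5, user_relevance=0.9),
    Candidate("C", query_similarity=0.7, popularity=0.9, user_relevance=0.3),
]

ranked_ids = [c.product_id for c in rerank(results)]
```

With these assumed weights, product B moves ahead of A even though A matches the query text more closely, because B's user-product relevance is much higher. Giving that signal the largest weight mirrors the observation that it had the greatest impact on re-ranking.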


[1]         C. Layfield, J. Azzopardi and C. Staff, “Experiments with Document Retrieval from Small Text Collections Using Latent Semantic Analysis or Term Similarity with Query Coordination and Automatic Relevance Feedback,” in Semantic Keyword-Based Search on Structured Data Sources, 2017.

[2]         B. Mobasher, “Data Mining for Web Personalization,” in The Adaptive Web: Methods and Strategies of Web Personalization, Springer Berlin Heidelberg, 2007, pp. 90–135.

[3]         I. Ruthven and M. Lalmas, “A survey on the use of relevance feedback for information access systems,” The Knowledge Engineering Review, vol. 18, no. 2, pp. 95–145, 2003.

[4]         F. Isinkaye, Y. Folajimi and B. Ojokoh, “Recommendation systems: Principles, methods and evaluation,” Egyptian Informatics Journal, vol. 16, 2015.

[5]         S. Deerwester, S. Dumais, T. Landauer, G. Furnas and R. Harshman, “Indexing by Latent Semantic Analysis,” Journal of the American Society for Information Science, vol. 41, pp. 391–407, 1990.

Student: Anne-Marie Camilleri
Supervisor: Dr Joel Azzopardi
Course: B.Sc. IT (Hons.) Artificial Intelligence