ABSTRACT Conversational recommender systems are e-commerce applications that interactively assist online users in achieving their interaction goals during their sessions. In our previous work, we proposed and validated a methodology for conversational systems that autonomously learns which web page to display to the user at each step of the session. We employed reinforcement learning to learn an optimal strategy, i.e., one that is personalized for a real user population. In this paper, we extend our methodology so that it autonomously learns and updates the optimal strategy dynamically (at run-time) and individually for each user. This learning occurs perpetually after every session, as long as the user continues interacting with the system. We evaluate our approach in an off-line simulation with four simulated users, as well as in an online evaluation with thirteen real users. The results show that an optimal strategy is learnt and updated for each real and simulated user. For each simulated user, the optimal behavior is reasonably adapted to that user's characteristics, but converges only after several hundred sessions. For each real user, the optimal behavior converges within only a few sessions. The learnt behavior provides assistance only in certain situations, allowing many users to buy several products together in less time, with more page views and fewer query executions. We show that our approach is novel and discuss how its current limitations can be addressed.
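To make the per-user, after-session learning described above concrete, the sketch below shows one way such an update could be realized with tabular Q-learning over candidate pages. The states, actions, rewards, and hyper-parameters here are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch (not the paper's exact algorithm): tabular Q-learning
# that updates one user's page-selection policy after each finished session.
# State names, candidate pages, rewards, and hyper-parameters are assumptions.

import random
from collections import defaultdict

ACTIONS = ["show_search_page", "show_recommendation_page", "show_comparison_page"]
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # learning rate, discount, exploration rate


class PerUserPolicy:
    """One Q-table per user; updated after every completed session."""

    def __init__(self):
        self.q = defaultdict(lambda: {a: 0.0 for a in ACTIONS})

    def choose_page(self, state):
        # Epsilon-greedy selection of the next page to display.
        if random.random() < EPSILON:
            return random.choice(ACTIONS)
        values = self.q[state]
        return max(values, key=values.get)

    def update_from_session(self, episode):
        """episode: list of (state, action, reward, next_state) logged in one session."""
        for state, action, reward, next_state in episode:
            best_next = max(self.q[next_state].values())
            td_target = reward + GAMMA * best_next
            self.q[state][action] += ALPHA * (td_target - self.q[state][action])


# Hypothetical usage: after each session of a given user, feed the logged
# transitions back into that user's policy so the strategy keeps adapting.
policy = PerUserPolicy()
logged_session = [
    ("query_issued", "show_recommendation_page", 0.0, "product_viewed"),
    ("product_viewed", "show_comparison_page", 1.0, "products_bought"),
]
policy.update_from_session(logged_session)
print(policy.choose_page("query_issued"))
```

Keeping a separate Q-table per user, as assumed here, mirrors the idea of an individually adapted strategy that keeps being refined for as long as that user returns to the system.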