Papers by Magdalini Eirinaki
2023 11th IEEE International Conference on Mobile Cloud Computing, Services, and Engineering (MobileCloud)

2020 IEEE International Conference On Artificial Intelligence Testing (AITest)
Autonomous vehicles have the potential to completely upend the way we transport today, but deploying them safely at scale is not an easy task. Any autonomous driving system relies on multiple layers of software to function safely. Among these layers, the Perception layer is the most data-intensive and also the most complex to get right. Companies need to collect and annotate large amounts of data to properly train deep learning perception models. Simulation systems have emerged as an alternative to the expensive task of data collection and annotation. However, whether simulated data can be used as a proxy for real-world data is an ongoing debate. In this work, we attempt to address the question of whether models trained on simulated data can generalize well to the real world. We collect datasets based on two different simulators with varying levels of graphics fidelity and use the KITTI dataset as an example of real-world data. We train three separate deep learning based object detection models on each of these datasets, and compare their performance on test sets collected from the same sources. We also add the recently released Waymo Open Dataset as a challenging test set. Performance is evaluated based on the mean average precision (mAP) metric for object detection. We find that training on simulation in general does not translate to generalizability on real-world data, and that diversity in the training set is much more important than graphics fidelity.
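The mAP metric used above reduces, per class, to an average of precision values taken at each recall step, then averaged over classes. A minimal sketch, assuming detections are pre-sorted by confidence and already matched to ground truth (e.g. at IoU >= 0.5); benchmarks such as KITTI use interpolated variants, but the ranking logic is the same:

```python
def average_precision(matches, num_gt):
    """Non-interpolated AP for one class.

    matches: booleans for detections sorted by descending confidence,
    True if the detection matched an unused ground-truth box.
    num_gt: total number of ground-truth boxes for this class.
    """
    tp, precisions = 0, []
    for rank, matched in enumerate(matches, start=1):
        if matched:
            tp += 1
            precisions.append(tp / rank)  # precision at this recall step
    return sum(precisions) / num_gt if num_gt else 0.0

def mean_average_precision(per_class):
    """per_class: {class_name: (matches, num_gt)} -> mean of per-class APs."""
    aps = [average_precision(m, n) for m, n in per_class.values()]
    return sum(aps) / len(aps)
```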

Proceedings of the 23rd International Database Applications & Engineering Symposium (IDEAS '19), 2019
Crime has been prevalent in our society for a very long time, and it continues to be so even today. Currently, many cities have released crime-related data as part of an open data initiative. Using this as input, we can apply analytics to predict and hopefully prevent crime in the future. In this work, we applied big data analytics to the San Francisco crime dataset, as collected by the San Francisco Police Department and made available through the Open Data initiative. The main focus is to perform an in-depth analysis of the major types of crimes that occurred in the city, observe the trend over the years, and determine how various attributes contribute to specific crimes. Furthermore, we leverage the results of the exploratory data analysis to inform the data preprocessing process, prior to training various machine learning models for crime type prediction. More specifically, the model predicts the type of crime that will occur in each district of the city. We observe that the provided dataset is highly imbalanced, so the metrics used in previous research focus mainly on the majority class and disregard the performance of the classifiers on minority classes; we propose a methodology to address this issue. The proposed model finds applications in the resource allocation of law enforcement in a Smart City.
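The imbalance problem described above is concrete: plain accuracy rewards a classifier that only ever predicts the majority crime type, while macro-averaged metrics expose it. A small illustration (the paper's exact metric choices may differ; the crime labels are made up):

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores: every class counts equally,
    so poor minority-class performance drags the score down."""
    f1s = []
    for c in set(y_true) | set(y_pred):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# A degenerate classifier that always predicts the majority class still
# scores 80% accuracy here, but its macro F1 collapses:
y_true = ["theft"] * 8 + ["assault"] * 2
y_pred = ["theft"] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)  # 0.8
```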

2021 IEEE International Conference on Big Data (Big Data), 2021
Through lyrics, pitch, and rhythm, music is a natural way of expressing one’s thoughts. As one of the essential elements of music composition, lyric writing is complicated, as it requires creativity and must follow a particular rhythm pattern. In this work we design and train two neural network models for composing lyrics in three genres and propose scoring functions to select and evaluate the generated songs. We treat this problem as a text generation task, and optimize for features particular to song lyrics, including lyric quality, rhyme density, and sentiment ratio, for lyrics in different genres. The neural networks we experimented with are a generative adversarial network (GAN)-based transfer learning model and a long short-term memory (LSTM)-based deep learning model. In addition to quantitative evaluation, we also conducted user studies, inviting 25 people to rate the generated songs selected by the scoring functions. Our findings show that the GAN-based models perform better than the LSTM-based models and that the scoring functions are useful in selecting good songs.
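Of the features the scoring functions optimize, rhyme density is the most mechanical: roughly, the fraction of line endings that rhyme. A crude sketch using a shared two-letter suffix as a stand-in for a phonetic rhyme test (the paper's actual scoring functions are not specified here, and real systems use phoneme dictionaries):

```python
def rhyme_density(lyrics):
    """Fraction of consecutive line pairs whose final words end in the
    same two letters -- a rough orthographic proxy for rhyme."""
    endings = [line.split()[-1].lower() for line in lyrics if line.split()]
    if len(endings) < 2:
        return 0.0
    rhymed = sum(a[-2:] == b[-2:] for a, b in zip(endings, endings[1:]))
    return rhymed / (len(endings) - 1)
```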

Fourteenth ACM Conference on Recommender Systems, 2020
The energy consumption of households has steadily increased over the last couple of decades. Research suggests that user behavior is the most influential factor in the energy waste of a household. Thus, there is a need to help consumers change their behavior to make it more energy efficient and environmentally friendly. In this work we propose a real-time recommender system that assists consumers in improving their household’s energy usage. By monitoring the power demand of each appliance in the household, the system detects the device status (on/off) at any moment and, using pattern mining, creates a household profile comprising energy consumption patterns for different periods of the day. An intuitive UI allows users to set energy consumption goals and preferences on the appliances they would like to save energy from. Based on the household profile, the user’s preferences, and the actual power demand, the system generates personalized real-time recommendations on which appliances should be turned off at any given moment. We employ the UK-DALE (UK Domestic Appliance-Level Electricity) dataset to model and evaluate the entire process, from data preprocessing and transformation of the appliance power demand input to the various pattern mining algorithms used to generate appliance usage profiles and recommendations, showing that even small changes in appliance usage behavior can lead to energy savings between 2% and 17%.
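The first two steps of the pipeline above, detecting on/off status from per-appliance power demand and mining a time-of-day usage profile, can be sketched as follows. The 5 W standby cutoff and the data shape are illustrative assumptions, not values from UK-DALE:

```python
from collections import defaultdict

ON_THRESHOLD_W = 5.0  # assumed standby cutoff; real appliances need tuning

def usage_profile(samples):
    """samples: iterable of (hour_of_day, watts) readings for one appliance.
    Returns {hour: fraction of readings in which the appliance was on},
    a simple stand-in for the mined consumption patterns."""
    on, total = defaultdict(int), defaultdict(int)
    for hour, watts in samples:
        total[hour] += 1
        on[hour] += watts >= ON_THRESHOLD_W
    return {h: on[h] / total[h] for h in total}
```

A recommender can then flag an appliance that is currently drawing power during an hour where its profile value is low, as a candidate to switch off.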

2017 IEEE Third International Conference on Big Data Computing Service and Applications (BigDataService), 2017
Conventional street cleaning methods involve street sweepers going to various spots in the city, manually verifying whether a street needs cleaning, and taking action if required. However, this method is not optimized and demands a huge investment in terms of time and money. This paper introduces an automated framework that addresses the street cleaning problem in a better way by making use of modern equipment with cameras and computational techniques to analyze, find, and efficiently schedule clean-up crews for the areas requiring more attention. Deep learning based neural network techniques can achieve better accuracy and performance than conventional machine learning algorithms for object detection and classification over large volumes of images. The proposed framework for street cleaning leverages a deep learning algorithm pipeline to analyze street photographs and determine whether the streets are dirty by detecting litter objects. The pipeline further determines the degree to which the streets are littered by classifying the litter objects detected in earlier stages. The framework also provides information on the cleanliness status of the streets on a dashboard updated in real time. Such a framework can prove effective in reducing the resource consumption and overall operational cost involved in street cleaning.
The World Wide Web Conference (WWW '19), 2019
Social recommendations have been a very intriguing domain for researchers in the past decade. The main premise is that the social network of a user can be leveraged to enhance the rating-based recommendation process. This has been achieved in various ways, and under different assumptions about the network characteristics, structure, and availability of other information (such as trust, content, etc.). In this work, we create neighborhoods of influence leveraging only the social graph structure. These are in turn introduced into the recommendation process both as a pre-processing step and as a social regularization factor of the matrix factorization algorithm. Our experimental evaluation using real-life datasets demonstrates the effectiveness of the proposed technique.
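The social regularization idea mentioned above can be pictured as an extra penalty in the matrix factorization objective that pulls each user's latent vector toward those of their neighborhood. A toy SGD version with illustrative hyperparameters; the paper's exact formulation may differ:

```python
import random

def train_social_mf(ratings, friends, n_users, n_items,
                    k=2, lr=0.05, reg=0.01, beta=0.05, epochs=300, seed=0):
    """SGD matrix factorization with a social regularization term that
    penalizes the distance between a user's latent vector and those of
    the user's neighbors. ratings: [(user, item, rating)];
    friends: {user: [neighbor, ...]}. All parameter values are illustrative."""
    rng = random.Random(seed)
    U = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(U[u][f] * V[i][f] for f in range(k))
            for f in range(k):
                uf, vf = U[u][f], V[i][f]
                # gradient of (beta/2) * sum_g ||U_u - U_g||^2 over neighbors g
                social = sum(uf - U[g][f] for g in friends.get(u, ()))
                U[u][f] += lr * (err * vf - reg * uf - beta * social)
                V[i][f] += lr * (err * uf - reg * vf)
    return U, V
```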

2016 IEEE International Conferences on Big Data and Cloud Computing (BDCloud), Social Computing and Networking (SocialCom), Sustainable Computing and Communications (SustainCom) (BDCloud-SocialCom-SustainCom), 2016
The process of decision making in humans involves a combination of the genuine information held by the individual and the external influence of their social network connections. This helps individuals make decisions or adopt behaviors, opinions, or products. In this work, we investigate under which conditions, and at what cost, we can form neighborhoods of influence within a social network in order to assist individuals with little or no prior genuine information through a two-phase recommendation process. Most existing approaches regard the problem of identifying influentials as a long-term network diffusion process, where information cascading occurs over several rounds with a fixed number of influentials. In our approach we consider only one round of influence, which finds applications in settings where timely influence is vital. We tackle the problem by proposing a two-phase framework that identifies influentials in the first phase and forms influential neighborhoods to generate recommendations for users with no prior knowledge in the second phase. The difference between the proposed framework and most social recommender systems is that we need to generate recommendations comprising more than one item, in the absence of explicit ratings, relying solely on the social network's graph.

Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015, 2015
Online advertisements are a major source of profit and customer attraction for web-based businesses. In a successful advertisement campaign, both users and businesses can benefit, as users are expected to respond positively to special offers and recommendations of their liking, and businesses are able to reach the most promising potential customers. The extraction of user preferences from content provided in social media, and especially in review sites, can be a valuable tool for both users and businesses. In this paper, we propose a model for the analysis of content from product review sites, which considers in tandem the aspects discussed by users and the opinions associated with each aspect. The model provides two different visualizations: one for businesses, which uncovers their weak and strong points against their competitors, and one for end users, who receive suggestions about products of potential interest. The former is an aggregation of aspect-based opinions provided by all users; the latter is a collaborative filtering approach, which calculates user similarity over a projection of the original bipartite graph (the user-item rating graph) onto a content-based clustering of users and items. The model takes advantage of the feedback users give to businesses on review sites, and employs opinion mining techniques to identify the opinions of users on specific aspects of a business. Such aspects and their polarity can be used to create user and business profiles, which can subsequently be fed into a clustering and recommendation process. We envision this model as a powerful tool for planning and executing a successful marketing campaign via online media. Finally, we demonstrate how our prototype can be used in different scenarios to assist users or business owners, using the Yelp challenge dataset.
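The user and business profiles described above can be pictured as per-aspect polarity vectors compared by a similarity measure. A minimal sketch; the input format and the choice of cosine similarity are assumptions for illustration, not taken from the paper:

```python
def aspect_profile(opinions):
    """opinions: [(aspect, polarity)] pairs from an opinion miner, with
    polarity in {-1, +1}. Returns {aspect: mean polarity}."""
    sums, counts = {}, {}
    for aspect, polarity in opinions:
        sums[aspect] = sums.get(aspect, 0) + polarity
        counts[aspect] = counts.get(aspect, 0) + 1
    return {a: sums[a] / counts[a] for a in sums}

def profile_similarity(p, q):
    """Cosine similarity between two aspect-polarity profiles."""
    dot = sum(p[a] * q[a] for a in set(p) & set(q))
    norm_p = sum(x * x for x in p.values()) ** 0.5
    norm_q = sum(x * x for x in q.values()) ** 0.5
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0
```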

The participatory Web has enabled ubiquitous and pervasive access to information, accompanied by an increase in the speed and reach of information sharing. Data dissemination services such as news aggregators are expected to provide up-to-date, real-time information to end users. News aggregators are in essence recommender systems that filter and rank news stories in order to select the few that will appear on the user's front screen at any time. One of the main challenges in such systems is to address the recency and latency problems, that is, to identify as soon as possible how important a news story is. In this work we propose an integrated framework that aims at predicting the importance of news items upon their publication, with a focus on recent and highly popular news, employing resampling strategies, and at translating the result into concrete news rankings. We perform an extensive experimental evaluation of the proposed framework using real-life datasets, both as a stand-alone system and when applied to news recommendations from Google News. Additionally, we propose and evaluate a combinatorial solution to the augmentation of official media recommendations with social information. Results show that the proposed approach complements and enhances the news rankings generated by state-of-the-art systems.
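The resampling strategies mentioned above counter the fact that highly popular news items are rare in the training data. The simplest such strategy, random oversampling of minority classes, looks like this; it is one of several options, not necessarily the one the framework uses:

```python
import random

def oversample_minority(samples, label_of, seed=0):
    """Duplicate minority-class examples at random until every class
    reaches the size of the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for s in samples:
        by_class.setdefault(label_of(s), []).append(s)
    target = max(len(items) for items in by_class.values())
    out = []
    for items in by_class.values():
        out.extend(items)
        out.extend(rng.choice(items) for _ in range(target - len(items)))
    return out
```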
Proc. of the 21st International Conference …, 2009

Computer Science Review, 2022
Recommender systems have been widely used in different application domains, including energy preservation, e-commerce, healthcare, social media, etc. Such applications require the analysis and mining of massive amounts of various types of user data, including demographics, preferences, social interactions, etc., in order to develop accurate and precise recommender systems. Such datasets often include sensitive information, yet most recommender systems focus on the models' accuracy and ignore issues related to security and the users' privacy. Despite the efforts to overcome these problems using different risk reduction techniques, none of them has been completely successful in ensuring cryptographic security and protection of the users' private information. To bridge this gap, blockchain technology is presented as a promising strategy to promote security and privacy preservation in recommender systems, not only because of its salient security and privacy features, but also due to its resilience, adaptability, fault tolerance, and trust characteristics. This paper presents a holistic review of blockchain-based recommender systems covering challenges, open issues, and solutions. Accordingly, a well-designed taxonomy is introduced to describe the security and privacy challenges, overview existing frameworks, and discuss their applications and benefits when using blockchain, before indicating opportunities for future research.
Table 1. Summary of the types of information used by recommender systems:
- Item attributes: Descriptive information about the items (i.e., their features). Examples include brand, color, model, category, place of origin, etc.
- User attributes: Descriptive information about the users (i.e., their features). Examples include age, marital status, education, demographics, etc.
- User ratings for the items: Explicit user feedback, in the form of ratings; can be scalar or binary.
- Implicit user preferences: Information that is implicitly derived and relates to the user's choices. Examples are clicks, tags, and comments.
- Recommendation feedback: The user response to the recommendations, expressed as accept/reject values, positive or negative labels, etc. Can be used to define (implicitly and explicitly) the user preferences.
- User behavioural information: Implicit data recorded during the interaction of the user with the broader system.
- Contextual information: Information on the context of recommendations. Examples are time, date, location, user status, etc.
- Social information: Data related to the user's social graph, including connections and interactions with other users, friendship (or similar) relations with other users, community membership, or both.
- Domain knowledge: Background or prior information, empirical knowledge, and rules that define the relation between content items and the user stereotype. This type of knowledge is usually static, but can also vary over time.
- User purchase or consumption history: List of content items that have previously been purchased or consumed by the user.

Recommendation algorithms aim at proposing “next” pages to a user based on her current visit and past users’ navigational patterns. In the vast majority of related algorithms, only the usage data are used to produce recommendations, whereas the structural properties of the Web graph are ignored. We claim that also taking into account the web structure and using link analysis algorithms improves the quality of recommendations. In this paper we present UPR, a novel personalization algorithm which combines usage data and link analysis techniques for ranking and recommending web pages to the end user. Using the web site’s structure and its usage data we produce personalized navigational graph synopses (prNG), on which UPR is applied to produce personalized recommendations. Experimental results show that the accuracy of the recommendations is superior to that of pure usage-based approaches.

The continuous growth in the size and use of the World Wide Web imposes new methods of design and development of on-line information services. The need to predict users’ needs in order to improve the usability and user retention of a web site is more than evident, and can be addressed by personalizing it. Recommendation algorithms aim at proposing “next” pages to users based on their current visit and past users’ navigational patterns. In the vast majority of related algorithms, however, only the usage data are used to produce recommendations, disregarding the structural properties of the web graph. Thus important (in terms of PageRank authority score) pages may be underrated. In this work we present UPR, a PageRank-style algorithm which combines usage data and link analysis techniques for assigning probabilities to web pages based on their importance in the web site’s navigational graph. We propose the application of a localized version of UPR (l-UPR) to personal...

Markov models have been widely used for modelling users' navigational behaviour in the Web graph, using the transition probabilities between web pages as recorded in the web logs. The recorded users' navigation is used to extract popular web paths and predict current users' next steps. Such purely usage-based probabilistic models, however, present certain shortcomings. Since the prediction of users' navigational behaviour is based solely on the usage data, structural properties of the Web graph are ignored. Thus important (in terms of PageRank authority score) paths may be underrated. In this paper we present a hybrid probabilistic predictive model extending the properties of Markov models by incorporating link analysis methods. More specifically, we propose the use of a PageRank-style algorithm for assigning prior probabilities to web pages based on their importance in the web site's graph. We prove, through experimentation, that this approach results ...
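The hybrid idea running through the three abstracts above, biasing link analysis by what the web logs record, can be sketched as a PageRank iteration whose teleport distribution is the empirical page-visit frequency instead of the uniform vector. This is a sketch in the spirit of UPR, not the published algorithm; it assumes at least one recorded visit:

```python
def usage_pagerank(links, visits, d=0.85, iters=50):
    """links: {page: [outlinked pages]}; visits: {page: visit count from logs}.
    Returns PageRank-style scores whose teleport prior is the normalized
    visit distribution, so frequently visited pages get a head start."""
    pages = list(links)
    total = sum(visits.get(p, 0) for p in pages)  # assumed > 0
    prior = {p: visits.get(p, 0) / total for p in pages}
    rank = dict(prior)
    for _ in range(iters):
        new = {p: (1 - d) * prior[p] for p in pages}
        for p in pages:
            out = links[p]
            if out:
                share = d * rank[p] / len(out)
                for q in out:
                    new[q] += share
            else:  # dangling page: spread its mass by the usage prior
                for q in pages:
                    new[q] += d * rank[p] * prior[q]
        rank = new
    return rank
```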
IEEE Data Eng. Bull., 2011
Interactive database exploration is a key task in information mining. However, users who lack SQL expertise or familiarity with the database schema face great difficulties in performing this task. To aid these users, we developed the QueRIE system for personalized query recommendations. QueRIE continuously monitors the user’s querying behavior and finds matching patterns in the system’s query log, in an attempt to identify previous users with similar information needs. Subsequently, QueRIE uses these “similar” users and their queries to recommend queries that the current user may find interesting. We discuss the key components of QueRIE and describe empirical results based on actual user traces with the Sky Server database.
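The matching-pattern step described above can be caricatured in a few lines: represent each session by the set of query fragments it touches (e.g. table names), find similar past users, and score the fragments they used that the current user has not. The set representation, Jaccard similarity, and all names here are illustrative assumptions, not QueRIE's published formulation:

```python
def recommend_fragments(current, logs, top_n=3):
    """current: set of query fragments in the active session.
    logs: {past_user: set of fragments}. Scores each fragment unseen by
    the current user by the summed similarity of the users who used it."""
    def jaccard(a, b):
        return len(a & b) / len(a | b) if a | b else 0.0
    scores = {}
    for user, frags in logs.items():
        sim = jaccard(current, frags)
        if sim == 0.0:
            continue  # dissimilar past users contribute nothing
        for f in frags - current:
            scores[f] = scores.get(f, 0.0) + sim
    return sorted(scores, key=lambda f: (-scores[f], f))[:top_n]
```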