Papers by Francesco Lelli
Scalable Computing: Practice and Experience, Sep 1, 2012
The dataset reports a collection of earnings call transcripts, the related stock prices, and the sector index. In terms of volume, there is a total of 188 transcripts, 11970 stock prices, and 1196 sector index values. All of these data originated in the period 2016-2020 and are related to the NASDAQ stock market. The data collection was made possible by Yahoo Finance and Thomson Reuters Eikon: Yahoo Finance enabled the search for stock values, and Thomson Reuters Eikon provided the earnings call transcripts. The dataset can be used as a benchmark for the evaluation of several NLP techniques to understand their potential for financial applications. It is also possible to expand the dataset by extending the period in which the data originated, following a similar procedure. Contact at Tilburg University: Francesco Lelli

In this paper we present a dataset that contains a collection of tweets generated as reactions to the release of 50 different movies. The dataset can be used for gaining insights into the conversation generated around a particular movie. It is particularly suitable for sentiment analysis and other NLP techniques. The dataset contains approximately 2.5 million tweets with their related metadata and covers 50 movies. For each movie, its IMDb rating is included. The movies are the 25 releases with the highest number of votes during 2020 and 2021. The collected tweets represent the reactions of the Twitter community during the first week after the US release date of each movie. The number of tweets per movie ranges from 1,000 to approximately 200,000, with an average of 50,000 per release. We used the Internet Archive Wayback Machine to retrieve the IMDb movie rating one week after the US release date. The tweets and related met...

This paper aims to raise awareness of certain interoperability issues as we shape Industry 5.0 in order to enable a human-centric, resilient society. We argue that the need to share small and specific data will intensify as AI-based solutions become more pervasive. Consequently, dataspaces should be carefully designed to address this need. We advance the conversation by presenting a case study from HR demonstrating how to predict the possibility of an employee experiencing attrition. Our experimental results show that more than 500 samples are needed to develop a machine learning model capable of generalizing sufficiently. Consequently, our experimental results show the feasibility of the idea. However, small and medium-sized companies cannot implement this approach because of their limited number of samples. At the same time, we argue that this obstacle may be overcome if multiple companies join a shared dataspace, thus rais...
We describe a dataset that contains job descriptions published on a popular online website in the information and technology sector. As the website focuses mainly on jobs based in the United Kingdom, the data have a specific focus on this country. It contains 11,501 job vacancies and 13 related metadata fields. The dataset is suitable for HR analysis using machine learning techniques such as natural language processing and neural networks.
High-Energy Physics experiments require increasingly fast and reliable transport technologies for Data Acquisition. Industry offers possible answers: a large number of technologies based on high-level languages have been developed. We have investigated the Java Message Service (JMS) [1] approach and, in particular, have taken into consideration a specific implementation of this interface called Reliable Multicast Messaging (RMM) [2].

In this paper, we investigate the relationship people have with their smart devices. We use the concept of agency to capture aspects of users’ sense of mastery as they relate to their device. This study gives preliminary evidence of the existence of two independent dimensions of agency for modeling the interaction between humans and smart devices: (i) user agency and (ii) device agency. These constructs emerged from an exploratory factor analysis conducted on survey data collected from 587 participants. In addition, we investigate the correlation of user agency and device agency with background variables of the respondents. Finally, we argue that mapping the users’ dynamics with their device into user agency and device agency fosters a better understanding of the needs of the users and helps in designing interfaces tailored to the specific capabilities and expectations of the users.

Purpose: The use of smart devices has increased greatly in the last ten years, with users seeking to do more with them, especially on the networking front. In this context there is a need to understand the connection between users’ sociodemographic factors and their way of relating to their smart devices. Objective: This study was designed to evaluate the sense of belonging to a community in order to evaluate intangible benefits that employees may gain from a more immersive relationship with their devices. Method: We used a dataset of 586 anonymous respondents to an existing survey designed for capturing the relationships that humans develop with their smart devices. In particular, we investigate the relationship between smart device use and particular background variables of the respondents using a chi-square test. Results: The study showed that there is a significant relationship between users’ sex and smart device type and their dependency on smart devices. Male tend...
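As a rough illustration of the chi-square test of independence named in the Method above, the statistic can be computed directly from a contingency table. This is a minimal sketch, not the authors' analysis: the function name `chi_square_statistic` and the counts in `hypothetical_table` are assumptions made up for the example and are not taken from the survey.

```python
def chi_square_statistic(table):
    """Return (chi2, degrees_of_freedom) for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


# Hypothetical counts, NOT taken from the survey:
# rows = two respondent groups, columns = low / high reported dependency.
hypothetical_table = [[30, 20], [20, 30]]
chi2, dof = chi_square_statistic(hypothetical_table)

# Standard critical value of the chi-square distribution for
# 1 degree of freedom at the 0.05 significance level.
CRITICAL_VALUE_DOF1_ALPHA_005 = 3.841
significant = chi2 > CRITICAL_VALUE_DOF1_ALPHA_005
print(chi2, dof, significant)
```

A statistic above the critical value would lead to rejecting independence between the two variables at the 5% level; a real analysis would normally report the p-value as well.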

This dataset reports responses to a survey designed for investigating the relationship that humans have with their smart devices. The dataset was collected in May-July 2020 and is a sample of over 500 respondents of various ethnicities and backgrounds. These data have been used for modelling the ways people relate to their devices using the notion of agency. However, the data can be used for complementing any study that intends to investigate tool-mediated communication from the perspective of the users and via a variety of attitudes and expectations the users invest in their devices and in themselves as users. This article presents the survey items as well as some raw data insights. The data were collected in English and answers have been anonymized in order to ensure GDPR compliance. They are stored in a .csv file containing the respondents' answers to the questions. The reference contact for this data at Tilburg University is Francesco Lelli. The paper "...
In this preprint, we introduce a dataset containing student enrolment applications combined with the related result of their filing procedure. The dataset contains 73 variables. Student candidates, at the time of applying for study, fill in a web form to file the procedure. A committee at Tilburg University reviews each application and decides whether the student is admissible. This dataset is suitable for algorithmic studies and has been used in a comparison between the Naïve Bayes and the C5.0 Decision Tree algorithms. They have been used for predicting the decision of the committee in admitting candidates to various bachelor programs. Our analysis shows that, in this particular case, a combination of the approaches outperforms both of them in terms of precision and recall.
The dataset reports a collection of earnings call transcripts, the related stock prices, and the related sector index. It contains a total of 188 transcripts, 11970 stock prices, and 1196 sector index values. All of these data originated in the period 2016-2020 and are related to the NASDAQ stock market. The data were collected using Yahoo Finance and Thomson Reuters Eikon. Specifically, Yahoo Finance offered daily stock prices and traded volume, while Thomson Reuters Eikon was used as the source for the earnings call transcripts. The dataset can be used as a benchmark for the evaluation of several NLP techniques as well as machine learning algorithms for understanding their potential for financial applications. It is also possible to expand the dataset by extending the period in which the data originated, following a similar procedure.
Journal of Pediatric Rehabilitation Medicine, 2012

ac.uk Francesco.Lelli@lnl.infn.it Abstract - This paper considers the issues involved in developing a generic problem solver to be used within a grid environment for the monitoring and control of instrumentation. The specific feature of such an environment is that the type of data to be processed, as well as the problem, is not always known in advance. Therefore, it is necessary to develop a problem solver architecture that will address this issue. We propose to analyze the performance of the problem solving algorithms available within the WEKA toolkit and determine a decision tree of the best performing ... case, it should be noted that different kinds of instrumentation will have completely different ways of collecting information and that, in contrast to a particular implementation of a classical grid, this information will be markedly heterogeneous. This makes it necessary to develop a problem solver with a generic structure - in other words, a problem solver that is able to process data efficiently irrespective of the size of the processed dataset and i...
We present a new XML parser, the Cache Parser, which exploits a cache to reduce the parsing time on the receiver side by reusing information related to previously parsed documents/messages similar to the one under examination. We show how our fast parser can improve the global throughput of any application based on Web or Grid Services, or on JAXP-RPC. Experimental results demonstrate that our algorithm is 25 times faster than the fastest algorithm on the market and, if used in a WS scenario, can dramatically increase the number of requests per second handled by a server (up to a 150% improvement), bringing it close to a system that does not use XML at all.
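The general idea of avoiding repeated parsing work through a cache can be sketched as follows. This is a deliberately simplified stand-in, not the Cache Parser described above: it only reuses results for messages that are byte-for-byte identical, whereas the abstract describes reuse across similar messages. The class name `ExactMatchCachingParser` and its counters are hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET


class ExactMatchCachingParser:
    """Parse XML messages, reusing the parsed tree for identical messages."""

    def __init__(self):
        self._cache = {}   # message digest -> parsed root element
        self.hits = 0
        self.misses = 0

    def parse(self, message: str) -> ET.Element:
        key = hashlib.sha256(message.encode("utf-8")).hexdigest()
        cached = self._cache.get(key)
        if cached is not None:
            # Identical message seen before: skip parsing entirely.
            self.hits += 1
            return cached
        self.misses += 1
        root = ET.fromstring(message)
        # The cached tree is shared between callers, so it should be
        # treated as read-only.
        self._cache[key] = root
        return root


parser = ExactMatchCachingParser()
first = parser.parse("<request><id>1</id></request>")
second = parser.parse("<request><id>1</id></request>")
print(parser.hits, parser.misses)
```

An exact-match cache only pays off when the same message recurs; exploiting similarity between messages, as the abstract describes, requires comparing a new message against previously parsed ones rather than hashing it.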

The CMS detector is under construction for imminent operation at the LHC machine at CERN. About 80 of a total of 250 muon drift tube chambers were built in a large hall of the Legnaro National Laboratories and shipped to CERN. Chambers, front-end electronics, and data acquisition software require a flexible and reliable control and monitoring system for operating and testing the functionalities of the hardware and software components. Functionality tests are continuously performed both in Legnaro and at CERN. During summer and autumn 2006 the CMS collaboration took the crucial opportunity to operate pieces of all the sub-detectors together with cosmic rays and magnetic field. The Magnet Test and Cosmic Challenge (MTCC) included a 60° sector of the Muon System comprising drift tube chambers. This report briefly describes the Run Control and Monitoring System of the Drift Tube used both during the MTCC for global CMS operation and in local DAQ systems in Legnaro.
From a generic point of view, a remote method invocation can be split into 7 crucial parts, as explained in figure 1 [1]. During the interval t1-t0 the client serializes the invocation input in SOAP format, and it sends it during the interval t2-t1. The remote peer receives the serialized message at time t2 and starts the deserialization process, which finishes at time t3. During the interval t4-t3 the remote method is executed and the output is produced. As its last operation, the remote service serializes the output in SOAP format during the interval t5-t4 and starts sending it. The invoker receives the serialized message at time t6 and, during the interval t7-t6, deserializes the output message into the proper data structure.
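The seven intervals described above can be turned into a small timing breakdown. The sketch below is illustrative only: the phase labels, the function name `phase_durations`, and the example timestamps are assumptions introduced here, not measurements or terminology from the paper.

```python
# Phase labels are assumed names for the seven intervals t(k+1) - t(k).
PHASES = [
    "client serialization",    # t1 - t0
    "request transfer",        # t2 - t1
    "server deserialization",  # t3 - t2
    "method execution",        # t4 - t3
    "server serialization",    # t5 - t4
    "response transfer",       # t6 - t5
    "client deserialization",  # t7 - t6
]


def phase_durations(timestamps):
    """Map each phase to its duration, given the eight timestamps t0..t7."""
    if len(timestamps) != 8:
        raise ValueError("expected eight timestamps, t0 through t7")
    durations = {}
    for name, start, end in zip(PHASES, timestamps, timestamps[1:]):
        if end < start:
            raise ValueError("timestamps must be non-decreasing")
        durations[name] = end - start
    return durations


# Hypothetical timestamps in milliseconds, for illustration only.
example = [0, 2, 7, 9, 14, 16, 21, 23]
durations = phase_durations(example)
total = example[-1] - example[0]
overhead = total - durations["method execution"]
print(durations)
print(total, overhead)
```

Separating the method execution time from the rest makes the serialization and transfer overhead of an invocation explicit, which is the quantity such a decomposition is typically used to study.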

In this article, we report how we constructed a dataset that contains the responses to a survey designed for investigating the relationship that humans have with their smart devices. The dataset was collected in May-July 2020 and is a sample of over 500 respondents of various ethnicities and backgrounds. These data have been used for modelling the ways people relate to their devices using the notion of agency. However, the data can be used for complementing any study that intends to investigate tool-mediated communication from the perspective of the users and via a variety of attitudes and expectations the users invest in their devices and in themselves as users. This article presents the survey items as well as some preliminary data insights. The data were collected in English and answers have been anonymized in order to ensure GDPR compliance. They are stored in a .csv file containing the respondents’ answers to the questions.
Future Internet, 2019
Industry 4.0 demands a dynamic optimization of production lines. They are formed by sets of heterogeneous devices that cooperate towards a shared goal. The Internet of Things can serve as a technology enabler for implementing such a vision. Nevertheless, the domain is struggling to find a shared understanding of the concepts for describing a device. This aspect plays a fundamental role in enabling an “intelligent interoperability” among the sensors and actuators that will constitute a dynamic Industry 4.0 production line. In this paper, we summarize the efforts of academics and practitioners toward describing devices in order to enable dynamic reconfiguration by machines or humans. We also propose a set of concepts for describing devices, and we analyze how present initiatives are covering these aspects.
Advances in Grid and Pervasive Computing, 2009
In the past few years, the idea of extending the Grid to also cover the remote access, control, and management of instrument devices has been explored in a few initiatives. Existing tools lack generality and require advanced, specialized computer science knowledge, which makes them difficult to adopt broadly in the scientific community. In this paper we present a new open source initiative that is designed to overcome these problems. The Tiny Instrument Element project defines a high-level architecture for plugging instruments into the Grid and provides the corresponding skeleton implementation. This lightweight approach, as opposed to existing middleware-based solutions, reduces the effort required to Gridify existing instruments. The paper evaluates the proposed abstraction with a case study from a pervasive computing scenario.