
A business intelligence system for fisheries surveillance

https://doi.org/10.23919/CISTI.2019.8760814


A Business Intelligence system for fisheries surveillance

Anacleto Correia
CINAV – Escola Naval
Base Naval de Lisboa – Alfeite
2810-001 Almada, Portugal
[email protected]

Ricardo Moura
CINAV – Escola Naval
Base Naval de Lisboa – Alfeite
2810-001 Almada, Portugal
[email protected]

Abstract — The volume of data related to operational activity, produced and exchanged among the several naval units of the Portuguese Navy, is high and diversified. The way each naval unit currently collects and processes data makes it difficult to consolidate information for analysis and decision making. This work describes the design of a Business Intelligence (BI) solution aimed at obtaining a unified view of the Navy's operational information. For this purpose, it was necessary to conceive, design and implement a system for the collection, treatment, integration, consolidation, analysis and visualization of the operational data shared by naval units, in order to generate information relevant to more effective and efficient decision making. The Business Intelligence system presented is based on an analytical processing system whose data is transferred to a data warehouse and then to a data mart. The solution was validated through a survey applied to target users. The survey responses provided important information for improving the practicability and usefulness of the BI solution in future iterations.

Keywords - Business Intelligence, Data Warehouse, Data Mart, Decision Support Systems, Fisheries Surveillance.

I. INTRODUCTION

The naval units of the Portuguese Navy accomplish a set of missions assigned to the institution. These missions include military operations (e.g. naval exercises, anti-piracy actions, monitoring of shipping lanes, search and rescue), security (e.g. combating drug traffic, illegal immigration, maritime accidents), cooperation with civilian authorities (e.g. fisheries surveillance and protection of marine resources, patrolling of protected areas), as well as scientific support for research, development and innovation projects in partnership with academia and industry.

The large set of operational activities and the multiple available data sources generate large amounts of data that need to be collected and processed in order to support the decision making of the organization, namely planning, logistical and financial support, as well as the operational command of naval resources. Data collection is done in the relevant management areas of the organization, designated domains of interest. Conclusions drawn from preliminary surveys of the information requirements showed the importance of collecting data on domains related to fuel consumption; ammunition consumption; operational limitations; search and rescue operations; navigation; and fisheries surveillance.

In the case of the fisheries surveillance domain of interest, which is analyzed in more detail in this work, the collected data resulted from the surveillance and monitoring of fishing activities and marine cultures held in spaces under national sovereignty and jurisdiction. The areas where vessel inspection activity can take place are regulated by the United Nations Convention on the Law of the Sea (UNCLOS).
According to UNCLOS, the areas of sovereignty and national jurisdiction are the internal waters, the territorial sea (up to 12 miles), the contiguous zone (up to 24 miles), and the Exclusive Economic Zone (EEZ) (up to 200 miles). The practice of sea fishing and the cultivation of marine species should be carried out in a balanced manner, to ensure that the management and use of resources in waters under national sovereignty and jurisdiction are sustainable. Thus, fisheries inspection is aimed at ascertaining whether the legal norms established for fishing and marine culture activities are complied with. After the inspection actions and the detection of alleged violations of the law, procedural decisions are taken by the judicial system, namely the imposition of coercive measures or the acquittal of alleged offenders.

The inspection of a fishing vessel comprises the simultaneous collection of a large data set, with the objective of maximizing the verification and survey activities. These include, inter alia, items specifically related to fishing (e.g. quantity of fish on board, licenses, type of gear, operating areas) and items of another nature, such as safety requirements and means (e.g. life-saving appliances, lifejackets, fire extinguishers, pyrotechnics), seaworthiness and crew conditions (e.g. safe manning, seafarers' qualifications, required documents for seafarers), and licenses relating to the radioelectric equipment of vessels. The information gathering process is generally carried out by ensuring that the enforcement action does not last longer than 4 hours, unless an infraction is detected or the agents manifestly need additional information [1]. Whenever contraventions are detected by a naval unit, it is the responsibility of the ship to make the report and safeguard evidence through appropriate measures. For this purpose, during the inspection, certification of compliance with technical standards and procedures is carried out by completing the inspection report [1]. Any non-compliance with the established parameters is considered a presumed infraction, so the supporting data are recorded in the inspection report, to serve as evidence in a procedural decision.

Although the scenario previously described relates to the fisheries surveillance domain of interest, other domains also need an adequate collection of data and its conversion into objective and summarized information that enables decision making. With this in mind, it was necessary to use data tools to facilitate the adequate merging, aggregation and visualization of the information. In this paper, we describe the stages of the Business Intelligence (BI) system life cycle built to provide decision makers, at the different operational areas and hierarchical levels of the organization, a unified view of the information relevant to the decision-making process. The domain of interest used in this article to illustrate the referred BI life cycle is fisheries surveillance.
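Purely for illustration, the UNCLOS limits cited at the beginning of this section can be written as a small classification rule. The helper below is hypothetical and not part of the system described in this paper:

```python
# Hypothetical helper: map the distance of an inspection from the baseline
# (in nautical miles) to the UNCLOS zone it falls in, per the limits above.
def maritime_zone(distance_nm: float) -> str:
    if distance_nm <= 12:
        return "territorial sea"
    if distance_nm <= 24:
        return "contiguous zone"
    if distance_nm <= 200:
        return "Exclusive Economic Zone"
    return "high seas"

print(maritime_zone(150))  # -> Exclusive Economic Zone
```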
This paper has five sections. In this section, the overall domains of interest relevant to the organization were identified, as well as the domain (fisheries surveillance) tackled in this work. In the second section, we analyze the state of the art of Business Intelligence, referring to its most relevant concepts and the process of designing and developing BI solutions. The third section describes the design of the data repository and the data visualization application of the BI solution, as well as the steps taken for extracting, transforming and loading the data from its sources. In the fourth section, the results obtained from the artifact developed are analyzed and validated by users, through a survey subject to statistical treatment. Finally, we summarize the results obtained and the recommendations for future work.

II. LITERATURE REVIEW

Decision making requires the analysis of the degree of achievement of the objectives defined by the organization, as well as of the efficiency of the established strategy to achieve them. The way to achieve the goals may have to be changed in the face of changes in the organization's surroundings. The decision on which changes to initiate faces the difficult task of gathering useful information [2]. Business Intelligence aims to contribute where relevant information is scarce for decision making in organizations. From the multiple definitions of BI, we have selected the one that refers to framing the processes, technologies and applications necessary to transform data into information, information into knowledge, and knowledge into action plans [3]. A BI system usually includes one or more data repositories, as well as analytical and knowledge management tools [3].

The current operations of an organization result in the recording of transactions (e.g., registering orders, processing invoices, inventory updating). The storage, control and processing of transactions are performed in Operational Data Stores (ODS) [4] or Online Transaction Processing (OLTP) systems (Larson et al., 2012). OLTPs are one of the operational data repositories of an organization. These systems enable the insertion, modification and querying of the daily data processed by the organization. The data entered in an OLTP are called primitive data [5]. Because this type of system is not the most suitable for the manipulation and summarization of the large volumes of unit transactions required by BI, the Executive Information System (EIS) emerged in the 1970s. The EIS, having evolved from Decision Support Systems (DSS), treats internal and external information relevant to the organization, in the form of multidimensional dynamic reports, with forecasts and trend analysis, among other functionalities [2].

One of the key BI infrastructures is the Data Warehouse (DW), an integrated data repository that considers the time factor and records relevant organizational topics to support decision making [6]. A DW incorporates data from multiple sources, after being processed, formatted, and consolidated into a single data structure. Data transferred to a DW usually comes from OLTP systems. The DW data sources can also be Enterprise Resource Planning (ERP), Customer Relationship Management (CRM), document management systems, business process management systems, spreadsheets, etc. [4].

In order to transfer data from the source systems to the DW, it is necessary to use the Extract-Transform-Load (ETL) process [7]. This process aims at detecting and eliminating errors in the data, as well as transforming the data in order to ensure the quality of the data stored in the DW [8]. In detail, the activities performed within an ETL process aim at: (1) data extraction from the data source systems, which includes the phases of (i) initial extraction, occurring only once, when the data is entered in the DW for the first time, and (ii) changed data extraction/changed data capture (CDC), occurring with the periodicity of the updates defined by the organization, in which the data modified and added to the source systems since the last extraction is sent again to the DW; (2) data transformation, in which the data is moved to a temporary area, the Staging Area (SA), where data cleansing occurs, i.e. error detection, data correction and consolidation, with inconsistent data deleted. The data is changed only on the target system (DW) and not on the source system (OLTP). This phase is critical because it involves processing large volumes of data, extracted from various sources, with different encodings, structures and storage; at the end of this phase, the DW should contain unambiguous, correct and consistent data; (3) the loading of the data into the stores of a destination system, an Online Analytical Processing (OLAP) system, whose repository, when supported by a dimensional structure with greater efficiency in data query operations, is called a Dimensional Data Store (DDS) [9].
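To make the three ETL activities concrete, the following minimal Python sketch mimics a CDC extraction, a staging-area cleansing step, and a load into the DW target. All table and column names are invented for the example; the system described later in this paper uses SSIS rather than hand-written code.

```python
# Minimal ETL skeleton: CDC extraction, staging-area cleansing, DW load.
# Table and column names are invented for illustration only.
import sqlite3

def extract_changed(conn, last_run):
    """CDC extraction: fetch only rows modified since the last load."""
    return conn.execute(
        "SELECT id, vessel, fish_kg FROM oltp_inspections WHERE updated_at > ?",
        (last_run,),
    ).fetchall()

def transform(rows):
    """Staging-area cleansing: drop inconsistent rows, normalize the rest."""
    staged = []
    for id_, vessel, fish_kg in rows:
        if vessel is None or fish_kg < 0:   # inconsistent data is deleted
            continue
        staged.append((id_, vessel.strip().upper(), float(fish_kg)))
    return staged

def load(conn, rows):
    """Load into the DW target; the OLTP source is never modified."""
    conn.executemany(
        "INSERT OR REPLACE INTO dw_inspections (id, vessel, fish_kg) "
        "VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
```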
OLAP systems, also called Multidimensional Database Management Systems (MDBMS) [5], allow the interactive processing and analysis of data stored in the DW [9]. In addition to the original transactional data, OLAP contains analytical or derived data, which is the result of data transformation for analysis and decision support purposes. The database servers that support this type of data processing are referred to as: (1) Relational Online Analytical Processing (ROLAP), which uses a relational database as support, allowing flexibility in querying and changing data; (2) Multidimensional Online Analytical Processing (MOLAP), with data processing performed in a multidimensional database represented by dimensions and facts; (3) Hybrid Online Analytical Processing (HOLAP), a combination of ROLAP and MOLAP technologies that takes advantage of the best of each, with the data storage of a HOLAP server performed in a relational database and the analytical operations executed by a MOLAP server. OLAP servers can also be classified, depending on the location of the server, as: (1) Desktop OLAP (DOLAP), when the OLAP server is transferred to the desktop from where users access all or part of the multidimensional database; (2) Web OLAP (WOLAP), when access to the OLAP server is done with internet technology.

If the analytical data repository is small and includes only information related to one department of the organization, it is called a Data Mart (DM) [7]. This approach allows for greater speed in query and data analysis operations. The DM may be dependent on or independent of a DW, based on one of the following configurations chosen for the DW architecture [10]: (1) Enterprise Data Warehouse (EDW), with a logically centralized data repository, referred to as Hub and Spoke; in this architecture, the DMs are not inserted in the DW [8]; (2) Independent Data Mart, when analytical data is organized to meet the specific needs of a department.
In this type of architecture, there is no relationship between the DMs, which can lead to inconsistencies [10-11]; (3) Dependent Data Mart, when the DMs are implemented in several departments but there is integration, which allows a unified view of the data [10].

The DW can be implemented using three alternative strategies: (1) top-down, with the data of the DMs extracted from an integrated DW built first [6]; although this is the standard development for the DW, it leads to a delayed implementation [12]; (2) bottom-up, with data originating from the DMs sent to the DW to be consolidated [11]; it allows a faster implementation than the previous strategy, but has the drawback of not providing an overall DW model, making integration difficult since the DMs are independent [12]; (3) a mixed strategy, based on the first one, but in which each DM is implemented only when it is needed by a specific domain of interest [12].

The construction of the Data Warehouse is based on dimensional modeling [11], with the data organized in order to facilitate the processing of queries from different perspectives. The implementation of the model can be achieved through the following types of modeling: multidimensional cubes, star schemas, snowflake schemas or constellation schemas. Underlying these types of modeling are the following concepts: (1) the fact table, containing numeric columns (facts) and foreign key fields that relate to dimension tables [11]; (a) a fact refers to a topic on which there are events that are intended to be recorded and analyzed; events, recorded as numerical values, are the object of analysis for a set of measures [13]; (b) a measurement corresponds to a row in the fact table and should have the level of granularity appropriate to the desired analysis [11]; (2) each dimension table provides a specific perspective of analysis (who, when, where, why, how), consisting preferably of descriptive attributes rather than codes [11]; (a) each dimension is defined by a primary key [11], with unique and non-null field values for each row of the table [11]; (b) the attributes correspond to the columns of the dimension table, organized hierarchically to allow greater detail in the analysis, and are used as constraints in queries and reports. The potential of the analysis provided by the Data Warehouse depends on the expressiveness of the attributes created for the dimensions [11].

Facts are generally classified as: (1) additive facts, which can be aggregated (i.e., subject to sum, counting, etc.) along all dimensions related to the fact table (e.g. quantity, value); (2) semi-additive facts, which may be aggregated along one or more of the existing dimensions, but not all of them (typically not along the date dimension); and (3) non-additive facts, which cannot be aggregated along any of the dimensions present in the structure (e.g. prices, percentages, unit values). From the moment the data enters the DW, the only possible operation on the data is reading (except in the DW loading and updating phases through the ETL process).
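These concepts can be illustrated with a toy star schema. In the sketch below (all names and values are invented for the example), an additive measure is aggregated along a dimension attribute, while a non-additive one, such as a unit price, could not be meaningfully summed:

```python
# Toy star schema in pandas: a fact table with additive and non-additive
# measures, joined to a dimension table and aggregated along one of its
# attributes. All names and values are invented for illustration.
import pandas as pd

dim_vessel = pd.DataFrame({
    "vessel_key": [1, 2, 3],                   # surrogate primary key
    "vessel_type": ["trawler", "longliner", "trawler"],
    "flag": ["PT", "ES", "PT"],
})

fact_inspection = pd.DataFrame({
    "vessel_key": [1, 2, 1, 3],                # foreign key to the dimension
    "fish_kg": [120.0, 45.5, 80.0, 200.0],     # additive fact: can be summed
    "unit_price": [3.2, 5.0, 3.1, 2.8],        # non-additive: sums are meaningless
})

# A star-schema query: join facts to the dimension, then aggregate the
# additive measure by a dimension attribute (the "who" perspective).
report = (fact_inspection
          .merge(dim_vessel, on="vessel_key")
          .groupby("vessel_type")["fish_kg"]
          .sum())
print(report)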
The attributes that make up the dimension tables are often used in the analysis. Although it is not common, they can be altered over time, using the Slowly Changing Dimensions (SCD) technique to track the occurrence of changes. The common types of SCD are [11]: (1) Type 1 - overwrite the value: consists of an update that replaces one or more attributes. This technique is used to correct data when there is no interest in storing historical values [8]. It has a simple implementation, since the value of the previous attribute is not saved [11]; (2) Type 2 - add a dimension row (creating another dimension record): the change is made by inserting a new row in the dimension, so that both the previous value and its update can be checked. To distinguish the two records, different values of the surrogate primary key are used. The implementation is more complex, but it allows us to save the complete history; (3) Type 3 - add a dimension column: a defined number of columns is added and used when changes occur, that is, there is the column with the inserted value and another column with the value relative to a previous moment. This technique makes it possible to observe the new attribute and the previous one; however, it does not save further history if other changes occur, and it does not allow an analysis of the impact of the changes, if they exist [11].

More complex and costly techniques than the previous ones are called hybrid techniques [11]: (1) Predictable Changes with Multiple Version Overlays, consisting of a succession of changes to be made in an attribute; Type 2 would be the alternative, but it does not apply when the requirements of the changes are very elaborate and require a combination of Types 2 and 3 [11]; (2) Unpredictable Changes with Single-Version Overlay, used when there is a need to preserve historical data to understand the evolution of the current values and allow comparisons. In this case, two columns are needed, one for the current values and another for the historical values. Thus, the required operations are: Type 2, to create the new row, which contains the previous value and the current value with different keys; Type 3, to add the column containing the new update; and Type 1, which writes over the previous value, since the historical value already exists in another column [11].
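A minimal sketch of the Type 2 technique, assuming a hypothetical vessel dimension (none of the names come from this paper's model): the current row is closed and a new row with a new surrogate key carries the updated value, so the full history is preserved.

```python
# SCD Type 2 sketch: close the current dimension row and append a new one
# with a new surrogate key. All names and values are invented.
import pandas as pd

dim = pd.DataFrame([
    {"vessel_key": 1, "vessel_id": "PT-001", "home_port": "Lisboa",
     "valid_from": "2017-01-01", "valid_to": None, "current": True},
])

def scd_type2_update(dim, vessel_id, attr, new_value, change_date):
    """Record an attribute change without losing the previous value."""
    mask = (dim["vessel_id"] == vessel_id) & dim["current"]
    old = dim[mask].iloc[0].to_dict()            # snapshot before closing
    dim.loc[mask, ["valid_to", "current"]] = [change_date, False]
    new = {**old, attr: new_value,
           "vessel_key": dim["vessel_key"].max() + 1,  # new surrogate key
           "valid_from": change_date, "valid_to": None, "current": True}
    return pd.concat([dim, pd.DataFrame([new])], ignore_index=True)

dim = scd_type2_update(dim, "PT-001", "home_port", "Porto", "2018-06-01")
print(dim)  # two rows for PT-001: the historical one and the current one
```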
The multidimensional modeling of the DW can be classified as: (1) the star schema, the most common, especially when large data queries are expected; it is the schema from which the others derive. It is based on a single fact table (the center of the star) that establishes relations with the non-normalized dimension tables [10]. It can be viewed as a cube, where each of the schema's dimension tables corresponds to one face of the cube; (2) the snowflake schema, equivalent in data content to the previous one, except that its dimensions are normalized; it has the advantage of hierarchically relating the dimensions, but the drawback of being more complex and difficult to understand, by exposing the structure of each dimension [14]; (3) the constellation schema, a set of star schemas with multiple fact tables linked through common dimension tables [10].

DDS data are stored in the multidimensional cubes of Multidimensional Databases (MDB) [9]. These structures allow flexible access to DW data for information visualization and reporting, using tools such as spreadsheets, reporting applications, and data mining tools. The cubes also enable a set of operations on the data: (1) drill-up/roll-up and drill-down/roll-down aggregate or detail data according to hierarchical levels [10]; in drill-up/roll-up, the visualization starts from the most detailed level of data and moves to the highest level of aggregation, while in drill-down/roll-down it begins with the aggregated values and evolves to the detailed values; (2) slice and dice restricts the information to be visualized, using data cutting and fragmentation [13]: a cut (slice) consists of the selection of a restricted number of attributes in one dimension in conjunction with other unrestricted dimensions [10], and a fragment (dice) relates different dimensions, placing the dimension attributes on the axes of a partial cube [10]; (3) with pivot and rotate, it is possible to rotate the axes and transpose rows and columns, allowing different visualizations of the same data [10].
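These operations can be approximated on a flat table for illustration. The pandas sketch below (invented names and values) is only an analogy, since a real OLAP server executes these operations on cubes:

```python
# Cube operations approximated on a flat pandas table, for illustration only.
import pandas as pd

data = pd.DataFrame({
    "year":   [2017, 2017, 2018, 2018],
    "area":   ["EEZ", "territorial", "EEZ", "territorial"],
    "vessel": ["trawler", "trawler", "longliner", "trawler"],
    "count":  [10, 4, 7, 5],
})

# Roll-up (drill-up): aggregate away the vessel level, keeping year x area.
rollup = data.groupby(["year", "area"])["count"].sum()

# Slice: fix one attribute of one dimension, leave the others unrestricted.
slice_eez = data[data["area"] == "EEZ"]

# Pivot/rotate: put two dimensions on the axes for another view of the same
# data; rows and columns can be transposed with .T.
pivot = data.pivot_table(index="year", columns="area",
                         values="count", aggfunc="sum")
print(rollup, slice_eez, pivot, sep="\n\n")
```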
The volume of data being processed is increasing, with the integration of the Internet of Things (IoT) and Big Data. Other technological trends that will certainly shape the BI architecture are Cloud Computing, Machine Learning, Business Analytics, Data Mining, Competitive Intelligence and Knowledge Management [2]. Storing, processing and analyzing large volumes of BI data with high performance based on In-Memory Databases (IMDB) is also gaining expression.

For the effectiveness of the information visualization in a dashboard, there are principles that must be followed in its elaboration, namely: focusing on the needs of the target audience; displaying important information in the upper corners; keeping charts simple, with titles and subtitles; and using a color palette that is easy to interpret. One should avoid charts that distort proportions, are in 3D, are complex to interpret, or have unnecessary details. The most common mechanisms for visualizing information in Business Intelligence are dashboards (static or dynamic) and scorecards [10]. A dashboard consolidates, on a single screen, relevant information about the achievement of strategic objectives and the measurement of the overall efficiency of the organization. Scorecards, on the other hand, provide an expedited and concise way of measuring Key Performance Indicators (KPIs) and depicting in charts the progress of the organization towards defined objectives [10]. Dashboards can be classified as: (1) operational, when monitoring various business processes; (2) tactical, when monitoring departmental processes; and (3) strategic (or scorecards), designed to monitor the implementation of strategic objectives [3]. In addition to dashboards, BI applications can be grouped into five categories: reporting applications, analytic applications, data mining applications, alerts, and portals. Usually, one can also perform ad-hoc queries, which are based on access to information using parameters as filters, according to the information needs [5].

The architecture of a BI solution has two main components: (1) the back-end, where the server with the largest data repository resides; and (2) the front-end, which forms the user interface, where data are presented and analyzed [2]. Traditional BI systems are based on a three-tier architecture [2]: the Database Layer, the Application Layer and the Presentation Layer. This architecture includes the previously mentioned components, with tools to support the BI processes, namely to build the DW (OLTP systems, ETL and data integration tools), OLAP (for cube manipulation and data analysis), and data visualization, i.e. a suitable user interface for presenting data for decision making.

One of the current challenges of BI systems is the inclusion of unstructured data sources, such as e-mail messages, personal profiles obtained via the Web and social networks, and data from mobile devices and other sensors. Although data growth contains the potential for access to more relevant information, the fact that it comes from unstructured data sources introduces additional complexity in processing and obtaining knowledge [2].

III. PROPOSED SOLUTION

In order to create the BI solution for the fisheries surveillance domain of interest, data was gathered in an OLTP database. After the data collection, the next step was to design the DW, using a star schema, by defining the fact and dimension tables, as well as their attributes, using dimensional modeling. The schema was designed with the quantitative data of fisheries surveillance at the center, as the fact table, linked to several dimensions with qualitative data (Figure 1). The fact table in the model is semi-additive, because operations with the attributes of the dimensions can be performed on only some of its measures.

Figure 1. Star schema of the fisheries surveillance solution.

After the creation of the star model, the schema was populated with data. In order to load the data into the various dimensions, the ETL process was executed. During this process, the inconsistencies detected in the data (e.g. decimal and/or negative values for non-negative integer fields, date and time data represented in fractional number format, months codified in different ways, strings of characters with untrimmed spaces, etc.) were resolved; the kind of rules applied is illustrated in the sketch at the end of this section. For the instantiation of the ETL process, MSSQL Server Integration Services (SSIS) was used. This tool enables the import of data from various sources, as well as its manipulation and loading into the DW. The ETL process was executed for the dimension and fact tables, which resulted in the star schema filled with the data, ready for multidimensional analysis.

The relevant information to be made available through OLAP analyses was the following: number of fisheries surveillance actions per year; entities that carried out the surveys; type and nationality of the vessels surveyed; fishing apparatus; presumed offenders by fishing area; and type of fish found on boats suspected of being offenders.

The reports made available were built using the MSSQL Report Builder tool. Partial reports were initially produced, until the required information was available to create a dashboard from a consolidated set of reports. For the generation of the reports, connections were established to a cube, the source of the data sets resulting from joining the fact table's measures with the attributes of the dimensions. Summarized information considered relevant was eventually presented in a dashboard. Complementary and more detailed information was made available through hyperlinks, namely: the subtype of vessels; the naval units that participated in surveillance operations; the regional naval department involved; the results of boats' inspections; and the type and quantity of fish found in inspections.
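The cleansing rules mentioned above can be illustrated with a short, hypothetical routine; the field names, the spreadsheet-style date epoch, and the month map are assumptions for the example, not the rules actually implemented in SSIS:

```python
# Hypothetical cleansing rules of the kind described above: clamp negative
# or decimal counts, convert fractional day numbers to dates, unify month
# encodings, and trim stray spaces. Field names are invented.
from datetime import datetime, timedelta

MONTHS = {"JAN": 1, "JANEIRO": 1, "1": 1, "01": 1}   # unify month encodings

def clean_record(rec: dict) -> dict:
    out = dict(rec)
    # Non-negative integer fields must not carry decimals or negatives.
    out["fish_count"] = max(0, int(round(rec["fish_count"])))
    # Dates represented in fractional number (spreadsheet) format.
    if isinstance(rec["date"], float):
        out["date"] = datetime(1899, 12, 30) + timedelta(days=rec["date"])
    # Months codified in different ways are mapped to a single code.
    out["month"] = MONTHS[str(rec["month"]).strip().upper()]
    # Strings of characters with untrimmed spaces.
    out["vessel_name"] = rec["vessel_name"].strip()
    return out

print(clean_record({"fish_count": -2.0, "date": 43466.5,
                    "month": " jan ", "vessel_name": " Santa Maria "}))
```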
IV. RESULTS ANALYSIS

The evaluation of the BI system [15] was made by validating whether the solution built fulfilled the objective of supporting the decision-making process, according to users' requirements. For this purpose, a survey was launched to collect the opinions of end users about the quality and quantity of the information made available in charts and tables, as well as the limitations detected. To reduce the ambiguity of the responses, the survey consisted of closed questions (multiple choice). The selected sample was composed of a total of 45 respondents with some knowledge of fisheries surveillance. Thus, the responses to the questionnaire were requested from a non-random sample that included seasoned students of the naval academy, naval officers, and officers responsible for fisheries surveillance data analysis.

The survey was composed of four sections, with the aim of collecting the following data: (1) demographic data of the survey participant (e.g. gender, military status, and category); (2) characterization of the participant, regarding the kind of knowledge related to fisheries surveillance; (3) evaluation of the participant's ability to adequately analyze information, in decision making, through the built solution; (4) global assessment of the built BI solution. The analysis of the data collected from the sample allowed us to conclude that more than half of the participants were officers, with experience and knowledge (e.g. fishing legislation, types of fishing apparatus, types of fishing vessels) attained in fisheries surveillance actions.

To evaluate the suitability of generalizing the results obtained from the sample to the universe of future end users, a statistical validation was carried out. Participants' responses - intended to evaluate the degree of compliance of the solution with the requirements, and the degree of satisfaction of the respondent with the solution presented - were represented by a random variable X with ordinal values on a Likert scale between 1 (low) and 4 (high). Also, three research questions (RQ) were formulated, for which null hypotheses (H0) were defined, stating what was intended to be rejected. Each of the research questions (RQn) was mapped to a set of questions posed in the survey (SQn), with the answers to these questions serving as a basis for concluding on the validity of each hypothesis test.

RQ1 - Does the designed and implemented BI system provide the information needed for decision making?
• SQ6 - Does the system provide information on the relevant dimensions of decision making? (Scale of answers from 4-Yes to 1-No)
• SQ7 - How do you rate the solution presented in terms of support for decision-making on fisheries surveillance? (Scale of answers from 4-Excellent to 1-Insufficient)

RQ2 - Is the presentation of the information in the dashboard adequate for the decision-making process?
• SQ2 - How do you classify the solution in terms of ease of information visualization and interpretation? (Scale of answers from 4-Excellent to 1-Bad)
• SQ4 - Does the graphical interface of the system allow adequate access to information during the decision-making process? (Scale of answers from 4-Yes to 1-No)

RQ3 - Does the BI system have limitations that affect its potential to support decision-making in fisheries surveillance?
• SQ8 - Do you consider that the BI system has limitations? (Scale of answers from 4-No to 1-Very significant)

The hypothesis tests performed were unidirectional, with an α-significance level of 0.05 (5% error probability).
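The test procedure can be reproduced with a few lines of code. In the sketch below, the response vector is fabricated for illustration and does not correspond to the survey data reported in this paper:

```python
# One-sided, one-sample t-test of H0: mu <= 3 against H1: mu > 3 on answers
# given on a 1-4 Likert scale, mirroring the validation procedure described
# above. The response vector is fabricated, not the paper's survey data.
import numpy as np
from scipy import stats

answers = np.array([4, 3, 4, 4, 2, 3, 4, 4, 3, 4])  # hypothetical SQ answers
alpha = 0.05

t_stat, p_value = stats.ttest_1samp(answers, popmean=3, alternative="greater")
t_crit = stats.t.ppf(1 - alpha, df=len(answers) - 1)  # one-sided critical t

print(f"t = {t_stat:.3f}, critical t = {t_crit:.3f}, p-value = {p_value:.5f}")
if t_stat >= t_crit:                 # equivalently, p_value < alpha
    print("Reject H0: the mean answer is significantly above 3.")
else:
    print("Cannot reject H0.")
```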
For each of the possible answers to the questions, the number and percentage of participants' responses were computed, as well as the critical t value and the p-value that allow rejecting H0 or not. The referred values were obtained through the t-test with unknown variance (since the population variance was unknown). The null hypothesis must be rejected when the t value is greater than or equal to the critical t, or, alternatively, when the p-value is lower than the α-significance level (0.05 in the performed tests). The hypothesis tests carried out for each of the research questions (RQ1, RQ2 and RQ3) are described next.

RQ1: H0: The management information system designed and implemented does not provide the information needed for decision making. The answer to RQ1 was given by the tests for SQ6 and SQ7, considering H0: X ≤ 3. With respect to the values of SQ6, since the t value is higher than the critical t (4.509 > 1.680) and the p-value is lower than the α-significance level (0.000023 < 0.05), the null hypothesis must be rejected. However, the values for the hypothesis test of SQ7 claim the opposite. The contradiction between the hypothesis test values obtained for SQ6 and SQ7 does not allow confirming that the designed and implemented BI system provides the information needed for decision making.

RQ2: H0: The presentation of the information in the dashboard is not adequate for the decision-making process. The answer to RQ2 was given by the tests for SQ2 and SQ4, considering H0: X ≤ 3. With respect to the values of SQ2, since the t value is less than the critical t (1.478 < 1.680), one cannot reject H0. On the other hand, for SQ4, the t value is higher than the critical t (4.780 > 1.680) and the p-value is lower than the α-significance level, so H0 is rejected. In view of the contradiction between the two questions, it is not possible to conclude that the presentation of the information in the dashboard is adequate for the decision-making process.

RQ3: H0: The BI system has limitations that affect its potential to support decision-making in fisheries surveillance. The answer to RQ3 was given by the test for SQ8, considering H0: X ≤ 3. The value obtained for this hypothesis test reveals that it is not possible to reject the null hypothesis, since the t value is lower than the critical t (-3.100 < 1.680). So, one cannot reject the hypothesis that the BI system has limitations that affect its potential to support decision-making in fisheries surveillance.

To conclude, the analysis of the hypothesis tests for the research questions does not allow conclusions about the qualities of the built BI system. Therefore, the BI solution, despite presenting useful and relevant information, still needs improvements in order to fulfill the requirements for a more complete support of decision-making in fisheries surveillance.
V. CONCLUSIONS

In this article, the state of the art of the process of implementing a BI solution was summarized. To demonstrate the instantiation of the process, a BI solution was built for a relevant domain of interest: fisheries surveillance. For this purpose, it was necessary to carry out a survey of the existing legal framework on fisheries surveillance, the types of possible infractions, and the data currently available for collection. Messages are one of the most widely used means of disseminating information regarding fisheries surveillance, and the fields that compose the different types of messages were elicited. This survey was later used in the ETL process to clean the data to be migrated to the BI solution.

The design and implementation of the BI system were carried out with the aim of contributing to a more unified view of the decision process related to fisheries surveillance. The evaluation of the final solution allowed us to conclude that the system still needs improvements so that it may incorporate all users' requirements. For future work, in addition to the further iterations needed to improve the solution obtained, the objective is to extend the study to other domains of interest relevant to the organization. Furthermore, an important effort to be made is improving the quality of the data sources, i.e. the data contained in the surveillance messages originally sent by naval units.

ACKNOWLEDGMENT

This work was funded by the Portuguese Ministry of Defense and the Portuguese Navy/CINAV/Escola Naval.

REFERENCES

[1] EU: 'EU Regulation 404/2011', Official Journal of the European Union, Series L-112/1, April 8, 2011.
[2] Dalfovo, O., and Tamborlin, N.: 'Business Intelligence – Estudos e casos: Gestão de Tecnologia da Informação como Inteligência nos Negócios', 1st ed., Blumenau, Clube de Autores, 2017.
[3] Eckerson, W.W.: 'The Rise of Analytical Applications: Build or Buy?', https://bit.ly/2NZnxyk, accessed December 3, 2017.
[4] Cody, W.F., Kreulen, J.T., Krishna, V., and Spangler, W.S.: 'The integration of business intelligence and knowledge management', IBM Systems Journal, 41(4), 697-713, 2002.
[5] Inmon, W.H.: 'Building the Data Warehouse', New York, USA, John Wiley & Sons, 2002.
[6] Inmon, W.H., and Krishnan, K.: 'Building the Unstructured Data Warehouse: Architecture, Analysis, and Design', 1st ed., New Jersey, Technics Publications, 2011.
[7] Kimball, R., and Ross, M.: 'The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling', 2nd ed., New York, John Wiley & Sons, 2002.
[8] Kimball, R., and Caserta, J.: 'The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data', Indianapolis, Wiley Publishing, 2004.
[9] Rainardi, V.: 'Building a Data Warehouse: With Examples in SQL Server', 1st ed., USA, Apress, 2008.
[10] Ballard, C., Farrell, D.M., Gupta, A., Azuela, C., and Vohnik, S.: 'Dimensional Modeling: In a Business Intelligence Environment', IBM Redbooks, 2006.
[11] Kimball, R., and Ross, M.: 'The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling', 3rd ed., Indianapolis, John Wiley & Sons, 2013.
[12] Vercellis, C.: 'Business Intelligence: Data Mining and Optimization for Decision Making', 1st ed., UK, John Wiley & Sons, 2009.
[13] Valacich, J., and Schneider, C.: 'Information Systems Today: Managing in the Digital World', 8th ed., UK, Pearson Education Limited, 2017.
[14] Moody, D.L., and Kortink, M.A.R.: 'From Enterprise Models to Dimensional Models: A Methodology for Data Warehouse and Data Mart Design', Proceedings of the International Workshop on Design and Management of Data Warehouses (DMDW'2000), vol. 28, pp. 5-1-12, https://bit.ly/2KmPhLf, accessed December 25, 2017.
[15] Machado, J.: 'Business Intelligence da Atividade Operacional da Marinha Portuguesa', MSc Thesis, Escola Naval, 2018.