A Business Intelligence system for fisheries surveillance
Anacleto Correia
CINAV – Escola Naval
Base Naval de Lisboa – Alfeite, 2810-001 Almada, Portugal
[email protected]

Ricardo Moura
CINAV – Escola Naval
Base Naval de Lisboa – Alfeite, 2810-001 Almada, Portugal
[email protected]
Abstract — The volume of data related to operational activity that is produced and exchanged among the Portuguese Navy's naval units is large and diverse. The way each naval unit currently collects and processes data makes it difficult to consolidate information for analysis and decision making. This work describes the design of a Business Intelligence (BI) solution aimed at obtaining a unified view of the Navy's operational information. For this purpose, it was necessary to conceive, design and implement a system for the collection, treatment, integration, consolidation, analysis and visualization of the operational data shared by naval units, in order to generate information relevant to more effective and efficient decision making. The Business Intelligence system presented is based on an analytical processing system whose data is transferred to a data warehouse and then to a data mart. The solution was validated through a survey applied to target users. The survey responses allowed us to gather important information for improving the practicability and usefulness of the BI solution in future iterations.
Keywords - Business Intelligence, Data Warehouse, Data Mart,
Decision Support Systems, Fisheries Surveillance.
I. INTRODUCTION
The naval units of the Portuguese Navy accomplish a set of missions assigned to the institution. These missions include: military operations (e.g. naval exercises, anti-piracy actions, monitoring of shipping lanes, search and rescue); security operations (e.g. combating drug trafficking, illegal immigration, maritime accidents); cooperation with civilian authorities (e.g. fisheries surveillance and protection of marine resources, patrolling of protected areas); as well as scientific support for research, development and innovation projects in partnership with academia and industry.
The large set of operational activities and the multiple available data sources generate large amounts of data that need to be collected and processed in order to support the decision-making
of the organization, namely, planning, logistical and financial
support, as well as the operational command of the naval
resources.
Data collection takes place in relevant management areas of the organization, designated as domains of interest. Conclusions drawn from preliminary surveys of the information requirements showed the importance of collecting data on domains related to fuel consumption; ammunition consumption; operational limitations; search and rescue operations; navigation; and fisheries surveillance. In the case of the fisheries surveillance domain of interest, which is analyzed in more detail in this work, the collected data resulted from the surveillance and monitoring of fishing activities and marine cultures carried out in spaces under national sovereignty and jurisdiction. The areas where vessel inspection activities may take place are regulated by the United Nations Convention on the Law of the Sea (UNCLOS). According to UNCLOS, the areas of sovereignty and national jurisdiction are the internal waters, the territorial sea (up to 12 nautical miles), the contiguous zone (up to 24 nautical miles), and the Exclusive Economic Zone (EEZ) (up to 200 nautical miles).
The practice of sea fishing and the cultivation of marine
species should be carried out in a balanced manner, to ensure
that the management and use of resources, in waters under
national sovereignty and jurisdiction, are sustainable. Thus, the
inspection of fishing is aimed at ascertaining whether the legal
norms established for fishing and marine culture activities are
complied with. After the inspection actions and the detection of alleged violations of the law, procedural decisions are taken by the judicial system, namely the imposition of coercive measures or the acquittal of alleged offenders.
The inspection of a fishing vessel comprises the simultaneous collection of a large data set, with the objective of maximizing the verification and survey activities. These include, inter alia, data specifically related to fishing (e.g. quantity of fish on board, licenses, type of gear, operating areas), as well as data of another nature, such as that related to safety requirements and means (e.g. life-saving appliances, lifejackets, fire extinguishers, pyrotechnics), seaworthiness and crew conditions (e.g. safe manning, seafarers' qualifications, required documents for seafarers), and licenses for the vessels' radioelectric equipment. The information gathering process is generally carried out by ensuring that the duration of the enforcement action does not exceed 4 hours, unless an infraction is detected or the agents manifestly need additional information [1].
Whenever infractions are detected by a naval unit, it is the ship's responsibility to report them and to safeguard evidence through appropriate measures. For this
purpose, during the inspection, certification of compliance with
technical standards and procedures is carried out by completing
the inspection report [1]. Any non-compliance with the
established parameters is considered a presumed infraction, so
the supporting data are recorded in the inspection report, to serve
as evidence in a procedural decision.
Although the scenario previously described relates to the fisheries surveillance domain of interest, other domains also require adequate data collection and the conversion of that data into objective, summarized information that enables decision making. With this in mind, it was necessary to use data tools to facilitate the adequate merging, aggregation and visualization of the information.
In this paper, we describe the stages of the Business Intelligence (BI) system life cycle, built to provide decision-makers in the different operational areas and hierarchical levels of the organization with a unified view of the information relevant to the decision-making process. The domain of interest used in this article to illustrate the referred BI life cycle is fisheries surveillance.
This paper has five sections. In this section, the overall domains of interest relevant to the organization were identified, as well as the domain (fisheries surveillance) tackled in this work. In the second section, we analyze the state of the art of Business Intelligence, referring to its most relevant concepts and the process of designing and developing BI solutions. The third section describes the design of the data repository and the data visualization application of the BI solution, as well as the steps taken to extract, transform and load the data from its sources. In the fourth section, the results obtained from the developed artifact are analyzed and validated by users, through a survey subjected to statistical treatment. Finally, we summarize the results obtained and the recommendations for future work.
II. LITERATURE REVIEW
Decision making requires the analysis of the degree of
achievement of the objectives defined by the organization, as
well as of the efficiency of the established strategy to achieve
them. The way to achieve the goals may have to be changed in
the face of changes in the organization's surroundings. The
decision on which changes to initiate faces the difficult task of
gathering useful information [2]. Business Intelligence aims to contribute where relevant information is scarce for decision making in organizations.
From the multiple definitions of BI, we have selected the one
that refers to how to frame the processes, technologies and
applications necessary to transform data into information,
information into knowledge, and knowledge into action plans
[3]. A BI system usually includes one or more data repositories, as well as analytical and knowledge management tools [3].
The current operations of an organization result in the
recording of transactions (e.g., registering orders, processing
invoices, inventory updating). The storage, control and
processing of transactions are performed in Operational Data
Stores (ODS) [4] or Online Transaction Processing (OLTP)
(Larson et al., 2012). OLTPs are one of the operational data
repositories of an organization. These systems enable the
insertion, modification and query of daily data processed by the
organization. The data entered in an OLTP are called primitive
data [5]. Because this type of system is not the most suitable for the manipulation and summarization of the large volumes of unit transactions required by BI, the Executive Information System (EIS) emerged in the 1970s. The EIS, which evolved from Decision Support Systems (DSS), treats internal and external information relevant to the organization in the form of multidimensional dynamic reports, with forecasts and trend analysis, among other functionalities [2].
One of the key BI infrastructures is the Data Warehouse
(DW), an integrated data repository that considers the time
factor, and records relevant organizational topics for support of
decision making [6]. A DW incorporates data from multiple
sources, after being processed, formatted, and consolidated into
a single data structure. Data transferred to a DW is usually from
OLTP systems. The DW data sources can also be from
Enterprise Resource Planning (ERP), Customer Relationship
Management (CRM), document management systems, business
processes management systems, spreadsheets, etc. [4].
In order to transfer data from the source systems to the DW, it is necessary to use the Extract-Transform-Load (ETL) process [7]. This process aims at detecting and eliminating errors in the data, as well as transforming the data in order to ensure the quality of the data stored in the DW [8]. In detail, the activities performed within an ETL process are: (1) data extraction from the source systems, which includes the phases of (i) initial extraction, occurring only once, when the data is entered in the DW for the first time, and (ii) changed data extraction/changed data capture (CDC), occurring according to the update periodicity defined by the organization, in which the data modified and added to the source systems since the last extraction is sent again to the DW; (2) data transformation, in which the data is moved to a temporary area, the Staging Area (SA), where data cleansing occurs, i.e. error detection, data correction and consolidation, with inconsistent data deleted. The data is changed only in the target system (DW) and not in the source system (OLTP). This phase is critical because it involves processing large volumes of data, extracted from various sources, with different encodings, structures and storage. At the end of this phase, the DW should contain unambiguous, correct and consistent data; (3) loading of the data into a destination system, an Online Analytical Processing (OLAP) system whose repository, when supported by a dimensional structure with greater efficiency in data query operations, is called a Dimensional Data Store (DDS) [9].
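To make these three activities concrete, the following is a minimal, hedged sketch in Python; the table names, column names and values are hypothetical and not taken from the system described here. It extracts changed rows from an operational source, cleanses them in a staging step, and loads them into a fact table.

```python
import sqlite3

# Hypothetical OLTP source and DW target (in-memory, for illustration only).
src = sqlite3.connect(":memory:")
dw = sqlite3.connect(":memory:")
src.execute("CREATE TABLE inspections (id INTEGER, vessel TEXT, qty REAL, updated_at TEXT)")
src.executemany("INSERT INTO inspections VALUES (?, ?, ?, ?)",
                [(1, " Mar Alto ", 120.0, "2019-02-01"), (2, "Boa Pesca", -5.0, "2019-02-02")])
dw.execute("CREATE TABLE fact_inspection (source_id INTEGER, vessel TEXT, qty REAL)")

def extract(since):
    """(1) Changed data capture: only rows modified since the last extraction."""
    return src.execute("SELECT id, vessel, qty FROM inspections WHERE updated_at > ?", (since,)).fetchall()

def transform(rows):
    """(2) Staging-area cleansing: trim strings, discard inconsistent (negative) quantities."""
    return [(i, v.strip(), q) for i, v, q in rows if q is not None and q >= 0]

def load(staged):
    """(3) Load the cleansed rows into the dimensional store."""
    dw.executemany("INSERT INTO fact_inspection VALUES (?, ?, ?)", staged)
    dw.commit()

load(transform(extract("2019-01-01")))
print(dw.execute("SELECT * FROM fact_inspection").fetchall())  # [(1, 'Mar Alto', 120.0)]
```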
OLAP systems, also called Multidimensional Database
Management System (MDBMS) [5], allow the interactive
processing and analysis of data stored in the DW [9]. In addition
to the original transactional data, OLAP contains analytical or derived data, which are the result of data transformation for analysis and decision support purposes. The database servers that support this type of data processing are referred to as: (1) Relational Online Analytical Processing (ROLAP), which uses a relational database as support, allowing flexibility in querying and changing data; (2) Multidimensional Online
querying and changing data; (2) Multidimensional Online
Analytical Processing (MOLAP) with data processing
performed in a multidimensional database represented by
dimensions and facts; (3) Hybrid Online Analytical Processing
(HOLAP), a combination of ROLAP and MOLAP technologies
taking advantage of the best of each: in a HOLAP server, data storage is performed in a relational database, while the analytical operations are executed by a MOLAP server.
OLAP servers can also be classified, depending on the
location of the server, as: (1) Desktop OLAP (DOLAP), when the OLAP server is transferred to the desktop, from where users access all or part of the multidimensional database; (2) Web OLAP (WOLAP), when access to the OLAP server is done using Internet technology.
If the analytical data repository is small and includes only information related to one department of the organization, it is called a Data Mart (DM) [7]. This approach allows for greater speed in query and data analysis operations. A DM may be dependent on or independent of a DW, based on one of the
following configurations chosen for the DW architecture
(Ballard et al., 2006): (1) Enterprise Data Warehouse (EDW), with a logically centralized data repository referred to as Hub and Spoke. In this architecture, the DMs are not inserted in the DW
[8]; (2) Independent Data Mart when analytical data is
organized to meet the specific needs of a department. In this type
of architecture, there is no relationship between DMs, which can
lead to inconsistencies [10-11]; (3) Dependent Data Mart, when
the DMs are implemented in several departments, but there is
integration, which allows a unified view of the data [10].
The DW can be implemented using three alternative
strategies: (1) Top-down, with data from the DMs extracted from
an integrated DW initially built [6]. Although this is the standard
development for the DW, it leads to a delayed implementation
[12]; (2) Bottom-up, with data originating from the DMs sent to the DW to be consolidated [11]. It allows a faster implementation than the previous strategy but has the drawback of not providing
an overall DW model, making the integration difficult since the
DMs are independent [12]; (3) Mixed strategy is based on the
first one, but the implementation of each DM is made only when
it is needed by a specific domain of interest [12].
The construction of the Data Warehouse is based on
dimensional modeling [11] with the data organized in order to
facilitate the processing of the queries in different perspectives.
The implementation of the model can be achieved through the following types of modeling: multidimensional cubes, star schemas, snowflake schemas or constellation schemas. Underlying these types of modeling are the concepts of: (1) fact table, containing numeric columns (facts) and foreign key fields that relate to dimension tables [11]; (a) a fact refers to a topic on which there are events that are intended to be recorded and analyzed. Events, recorded as numerical values, are the object of analysis for a set of measures [13]; (b) a measurement corresponds to a row in the fact table and should have the level of granularity appropriate to the desired analysis [11]; (2) dimension tables, each providing a specific perspective of analysis (who, when, where, why, how) and consisting preferably of descriptive attributes rather than codes [11]; (a) each dimension is defined by a primary key, with unique and non-null values for each row of the table [11]; (b) the attributes correspond to the columns of the dimension table, organized hierarchically to allow greater detail in the analysis and used as constraints in queries and reports. The analytical potential of the Data Warehouse depends on the expressiveness of the attributes created for the dimensions [11].
Facts are generally classified as: (1) additive facts, which can be aggregated (i.e., subject to sum, count, and similar operations) across all dimensions related to the fact table (e.g. quantity, value); (2) semi-additive facts, which may be aggregated across one or more of the existing dimensions, but not all of them (e.g. across date); and (3) non-additive facts, which cannot be aggregated across any of the dimensions present in the structure (e.g. prices, percentages, unit values).
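A hedged illustration of this classification, using a small pandas DataFrame with hypothetical columns: an additive fact (quantity) can be summed across any dimension, a semi-additive fact (stock level) can be summed across products but not across dates, and a non-additive fact (unit price) is better averaged than summed.

```python
import pandas as pd

# Hypothetical fact rows: date and product are dimensions; the rest are measures.
facts = pd.DataFrame({
    "date":       ["2019-01-01", "2019-01-01", "2019-01-02", "2019-01-02"],
    "product":    ["A", "B", "A", "B"],
    "quantity":   [10, 5, 7, 3],        # additive
    "stock":      [100, 40, 93, 37],    # semi-additive (snapshot)
    "unit_price": [2.5, 4.0, 2.5, 4.0], # non-additive
})

print(facts.groupby("date")["quantity"].sum())        # additive: valid across any dimension
print(facts.groupby("date")["stock"].sum())           # semi-additive: valid (sums over products at one date)
print(facts.groupby("product")["stock"].sum())        # semi-additive: misleading (sums snapshots over time)
print(facts.groupby("product")["unit_price"].mean())  # non-additive: summarize with a mean, not a sum
```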
From the moment data enters the DW, the only possible operation on it is reading (except in the DW loading and updating phases, through the ETL process). The attributes that make up the dimension tables are often used in the analysis. Although changes are not common, attribute values can be altered over time, and the Slowly Changing Dimensions (SCD) techniques are used to track the occurrence of such changes. The common types of SCD are [11]: (1) Type 1 - Overwrite the Value: consists of an update that replaces one or more attributes. This technique is used to correct data when there is no interest in storing historical values [8]. It has a simple implementation, since the value of the previous attribute is not saved [11]; (2) Type 2 - Add a Dimension Row: the change is made by inserting a new row in the dimension, so that both the previous value and its update can be checked. To distinguish the two records, different primary key values are used. The implementation is more complex but allows the complete history to be saved; (3) Type 3 - Add a Dimension Column: a defined number of columns is added to hold changes, that is, there is a column with the current value and another column with the value relative to a previous moment. This technique makes it possible to observe the new attribute value and the previous one, but it does not save further history if other changes occur, and it does not allow an analysis of the impact of the changes, if they exist [11].
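A minimal sketch of Type 1 and Type 2 handling on a hypothetical vessel dimension; the surrogate keys, column names and values are illustrative and not taken from the paper.

```python
# Hypothetical dimension rows: surrogate key (sk), natural key (vessel_id), attribute, current flag.
dim_vessel = [
    {"sk": 1, "vessel_id": "V-100", "home_port": "Lisboa", "current": True},
]

def scd_type1(rows, vessel_id, new_port):
    """Type 1: overwrite the attribute in place; no history is kept."""
    for r in rows:
        if r["vessel_id"] == vessel_id and r["current"]:
            r["home_port"] = new_port

def scd_type2(rows, vessel_id, new_port):
    """Type 2: expire the current row and insert a new one with a new surrogate key."""
    new_sk = max(r["sk"] for r in rows) + 1
    for r in rows:
        if r["vessel_id"] == vessel_id and r["current"]:
            r["current"] = False
    rows.append({"sk": new_sk, "vessel_id": vessel_id, "home_port": new_port, "current": True})

scd_type2(dim_vessel, "V-100", "Setúbal")
print(dim_vessel)  # both the old and the new home_port rows are preserved
```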
More complex and costly techniques than the previous ones are the hybrid techniques [11]: (1) Predictable Changes with Multiple Version Overlays, used for a succession of changes to be made in an attribute. Type 2 would be the alternative, but it does not apply when the requirements of the changes are very elaborate and require a combination of Types 2 and 3 [11]; (2) Unpredictable Changes with Single-Version Overlay, used when there is a need to preserve historical data in order to understand the evolution of the current values and allow comparisons. In this case, two columns are needed, one for the current values and the other for the historical values. The required operations are thus: Type 2, to create the new row, which contains the previous value and the current value with different keys; Type 3, to add the column containing the new update; and Type 1, which writes over the previous value, since the historical value already exists in another column [11].
The multidimensional modeling of the DW can be classified as: (1) star schema, the most common, especially when large data queries are expected. It is the schema from which the others derive. It is based on a single fact table (the center of the star) that establishes relations with the non-normalized dimension tables [10]. It can be viewed as a cube, where each of the schema's dimension tables corresponds to one face of the cube; (2) the snowflake schema is equivalent, in data content, to the previous one, except that its dimensions are normalized. It has the advantage of
hierarchically relating the dimensions, but the drawback of being more complex and harder to understand, since it exposes the structure of each dimension [14]; (3) the constellation schema is a set of star schemas, with multiple fact tables linked through common dimension tables [10].
DDS data are stored in the multidimensional cubes of Multidimensional Databases (MDB) [9]. These structures allow flexible access to DW data for information visualization and reporting, using tools such as spreadsheets, reporting applications, and data mining tools. The cubes also enable a set of operations on the data: (1) Drill-up/Roll-up and Drill-down/Roll-down: aggregate or detail data according to hierarchical levels [10]. In drill-up/roll-up, the visualization starts from the most detailed level of data and moves to the highest level of aggregation; in drill-down/roll-down, the visualization starts from the aggregated values and evolves to the detailed values; (2) Slice and Dice restricts the information to be visualized, using data cutting and fragmentation [13]; a cut (slice) consists of the selection of a restricted number of attributes in one dimension in conjunction with other unrestricted dimensions [10]; a fragment (dice) relates different dimensions, placing the dimension attributes on the axes of a partial cube [10]; (3) Pivot (rotate) makes it possible to rotate the axes and transpose rows and columns, allowing different visualizations of the same data [10].
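A brief sketch of these operations with pandas, on a hypothetical inspections data set; the dimension and measure names are illustrative only.

```python
import pandas as pd

# Hypothetical detailed fact data: year, area and vessel type are dimensions, inspections is the measure.
cube = pd.DataFrame({
    "year":        [2017, 2017, 2018, 2018, 2018],
    "area":        ["Area 1", "Area 2", "Area 1", "Area 2", "Area 3"],
    "vessel_type": ["trawler", "longliner", "trawler", "trawler", "longliner"],
    "inspections": [12, 4, 15, 6, 3],
})

# Roll-up: aggregate the detailed rows to the year level.
rollup = cube.groupby("year")["inspections"].sum()

# Slice: restrict one dimension to a single value.
slice_2018 = cube[cube["year"] == 2018]

# Dice / pivot: cross two dimensions on the axes of a partial cube.
pivot = cube.pivot_table(values="inspections", index="area", columns="vessel_type",
                         aggfunc="sum", fill_value=0)
print(rollup, slice_2018, pivot, sep="\n\n")
```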
The volume of data being processed is now increasing, with the integration of the Internet of Things (IoT) and Big Data. Other technological trends that will certainly shape the BI architecture are Cloud Computing, Machine Learning, Business Analytics, Data Mining, Competitive Intelligence and Knowledge Management [2]. Approaches to storing, processing and analyzing large volumes of data for high-performance BI, based on In-Memory Databases (IMDB), are also gaining expression.
For the effectiveness of information visualization in a dashboard, there are principles that must be followed in its elaboration, namely: focusing on the needs of the target audience; displaying the most important information in the upper corners; keeping charts simple, with titles and subtitles; and using a color palette that is easy to interpret. One should avoid charts that distort proportions, are in 3D, are complex to interpret, or have unnecessary details. The most common mechanisms for visualizing information in Business Intelligence are dashboards (static or dynamic) and scorecards [10]. A dashboard consolidates, on a single screen, relevant information about the achievement of strategic objectives and the measurement of the overall efficiency of the organization. Scorecards, on the other hand, provide an expedited and concise way of measuring Key Performance Indicators (KPIs) and depicting in charts the progress of the organization towards defined objectives [10]. Dashboards can be classified as: (1) operational, when they monitor various business processes; (2) tactical, when they monitor departmental processes; and (3) strategic (or scorecards), designed to monitor the implementation of strategic objectives [3].
In addition to dashboards, BI applications can be grouped
into five categories: reporting applications, analytic
applications, data mining applications, alerts, and portals.
Usually, one can also perform ad-hoc queries, which are based
on access to information using parameters as filters, according
to the information needs [5].
The architecture of a BI solution has two main components: (1) the back-end, where the server with the largest data repository resides; and (2) the front-end, which forms the user interface, where data are presented and analyzed [2]. Traditional BI systems are based on a three-tier architecture [2]: Database Layer, Application Layer and Presentation Layer. This architecture includes the previously mentioned components, with tools to support the BI processes, namely, to build the DW (OLTP system, ETL tools and data integration), OLAP (for cube manipulation and data analysis), as well as for data visualization, i.e. a suitable user interface for presenting data for decision making.
One of the current challenges of BI systems is the inclusion
of unstructured data sources, such as e-mail messages, personal
profiles obtained via the Web and social networks, data from
mobile devices and other sensors. Although this data growth has the potential to give access to more relevant information, the fact that it comes from unstructured sources introduces additional complexity in processing the data and obtaining knowledge from it [2].
III. PROPOSED SOLUTION
In order to create the BI solution for the fisheries surveillance domain of interest, data was gathered in an OLTP database. After the data collection, the next step was to design the DW, using a star schema, by defining the fact and dimension tables, as well as their attributes, through dimensional modeling. The schema was designed with the quantitative data of fisheries surveillance at the center, as the fact table, linked to several dimensions with qualitative data. The fact table in the model is semi-additive, because only some of its measures can be aggregated across the attributes of the dimensions.
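The actual schema is shown in Figure 1; the sketch below is only a hypothetical simplification of such a star schema, with table and column names chosen for illustration and not taken from the paper.

```python
import sqlite3

dw = sqlite3.connect(":memory:")
dw.executescript("""
-- Hypothetical dimension tables (qualitative data).
CREATE TABLE dim_date   (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER, day INTEGER);
CREATE TABLE dim_vessel (vessel_key INTEGER PRIMARY KEY, name TEXT, type TEXT, nationality TEXT);
CREATE TABLE dim_gear   (gear_key INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE dim_area   (area_key INTEGER PRIMARY KEY, zone TEXT);

-- Hypothetical fact table (quantitative data) at the center of the star,
-- linked to the dimensions through foreign keys.
CREATE TABLE fact_surveillance (
    date_key    INTEGER REFERENCES dim_date(date_key),
    vessel_key  INTEGER REFERENCES dim_vessel(vessel_key),
    gear_key    INTEGER REFERENCES dim_gear(gear_key),
    area_key    INTEGER REFERENCES dim_area(area_key),
    inspections INTEGER,          -- additive measure
    fish_qty_kg REAL,             -- additive measure
    presumed_infractions INTEGER  -- additive measure
);
""")
```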
After the creation of the star model, it was populated with data. In order to load the data into the various dimensions, the ETL process was executed. During this process, the inconsistencies detected in the data (e.g. decimal and/or negative values in non-negative integer fields, date and time data represented in fractional number format, months codified in different ways, character strings with untrimmed spaces, etc.) were resolved.
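A minimal sketch of the kind of cleansing rules described above; the field names and the fractional-day time convention are assumptions made for illustration only.

```python
from datetime import datetime, timedelta

MONTHS = {"JAN": 1, "janeiro": 1, "1": 1, "FEV": 2, "fevereiro": 2, "2": 2}  # months codified in different ways

def clean_record(rec: dict) -> dict:
    """Apply staging-area cleansing rules to one raw record."""
    out = dict(rec)
    out["vessel_name"] = rec["vessel_name"].strip()                   # untrimmed spaces
    out["crew_count"] = max(0, int(round(float(rec["crew_count"]))))  # decimal/negative values in integer fields
    out["month"] = MONTHS.get(str(rec["month"]).strip(), None)        # normalize month codes
    # Assumed convention: time stored as a fraction of a day (e.g. 0.75 -> 18:00).
    out["time"] = (datetime(1900, 1, 1) + timedelta(days=float(rec["time_fraction"]))).time()
    return out

print(clean_record({"vessel_name": " Mar Alto ", "crew_count": "-2.0",
                    "month": "FEV", "time_fraction": 0.75}))
```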
For the instantiation of the ETL process, MSSQL Server Integration Services (SSIS) was used. This tool enables the import of data from various sources, as well as its manipulation and loading into the DW. The ETL process was executed for the dimension and fact tables, which resulted in the star schema being filled with data ready for multidimensional analysis. The relevant information to be made available through OLAP analyses was the following: the number of fisheries surveillance actions per year; the entities that carried out the surveys; the type and nationality of the vessels surveyed; the fishing apparatus; presumed offenders by fishing area; and the type of fish found on boats suspected of being offenders.
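As an illustration of the first of these analyses, a hedged pandas sketch (with hypothetical column names and made-up values) that rolls the loaded fact data up to surveillance actions per year and per surveying entity:

```python
import pandas as pd

# Hypothetical rows already extracted from the star schema (fact joined with the date and entity dimensions).
surveillance = pd.DataFrame({
    "year":    [2016, 2016, 2017, 2017, 2018],
    "entity":  ["Entity A", "Entity B", "Entity A", "Entity A", "Entity B"],
    "actions": [1, 1, 1, 1, 1],
})

per_year = surveillance.groupby("year")["actions"].sum()               # surveillance actions per year
per_entity = surveillance.groupby(["year", "entity"])["actions"].sum() # drill-down by surveying entity
print(per_year, per_entity, sep="\n\n")
```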
Figure 1. Star schema of the fisheries surveillance solution.

The reports made available were built using the MSSQL Report Builder tool. Partial reports were initially produced until the required information was available to create a dashboard from a consolidated set of reports. For the generation of the reports, connections were established to a cube, which served as the source of the data sets resulting from joining the fact table's measures with the dimension attributes. Summarized information considered relevant was presented in a dashboard. Complementary and more detailed information was made available through hyperlinks, namely: the subtype of vessels; the naval units that participated in the surveillance operations; the regional naval department involved; the results of the boats' inspection; and the type and quantity of fish found in the inspections.
IV. RESULTS ANALYSIS
The evaluation of the BI system [15] was made by validating whether the solution built fulfilled the objective of supporting the decision-making process, according to users' requirements. For this purpose, a survey was launched to collect the opinions of end users about the quality and quantity of the information made available in charts and tables, as well as the limitations detected. To reduce the ambiguity of the responses, the survey consisted of closed (multiple choice) questions.
The selected sample was composed of a total of 45 respondents with some knowledge of fisheries surveillance. Thus, the responses to the questionnaire were requested from a non-random sample that included seasoned students of the naval academy, naval officers, and officers
responsible for fisheries surveillance data analysis. The survey
was composed of four sections, with the aim of collecting the
following data: (1) demographic data of the survey’s participant
(e.g. gender, military status, and category); (2) characterization
of the participant, regarding the kind of knowledge related to fisheries surveillance; (3) evaluation of the participant's ability to analyze information adequately, for decision-making, through the built solution; and (4) global assessment of the built BI solution.

The analysis of the data collected from the sample allowed us to conclude that more than half of the participants were officers, with experience and knowledge (e.g. fishing legislation, types of fishing apparatus, types of fishing vessels) acquired in fisheries surveillance actions.

To evaluate whether the results obtained from the sample could be generalized to the universe of future end-users, a statistical validation was carried out. Participants' responses (intended to evaluate the degree of compliance of the solution with the requirements and the degree of satisfaction of the respondent with the solution presented) were represented by a random variable X with ordinal values on a Likert scale between 1 (low) and 4 (high). Three research questions (RQ) were also formulated, for which null hypotheses (H0) were defined, stating what was intended to be rejected. Each research question (RQn) was mapped to a set of questions posed in the survey (SQn), and the answers to those questions served as the basis for concluding about the validity of each hypothesis test.

RQ1 - Does the designed and implemented BI system provide the information needed for decision making?
• SQ6 - Does the system provide information on the relevant dimensions of decision making? (Scale of answers from 4 - Yes to 1 - No)
• SQ7 - How do you rate the solution presented in terms of support for decision-making on fisheries surveillance? (Scale of answers from 4 - Excellent to 1 - Insufficient)

RQ2 - Is the presentation of the information in the dashboard adequate for the decision-making process?
• SQ2 - How do you classify the solution in terms of ease of information visualization and interpretation? (Scale of answers from 4 - Excellent to 1 - Bad)
• SQ4 - Does the graphical interface of the system allow adequate access to information during the decision-making process? (Scale of answers from 4 - Yes to 1 - No)

RQ3 - Does the BI system have limitations that affect its potential to support decision-making in fisheries surveillance?
• SQ8 - Do you consider that the BI system has limitations? (Scale of answers from 4 - No to 1 - Very significant)
The hypothesis tests performed were one-sided, with a significance level α of 0.05 (5% error probability). For each of the possible answers to the questions, the number and percentage of the participants' responses were computed, as well as the critical value t and the p-value that allow H0 to be rejected or not. The referred values were obtained through the t-test with unknown variance (since the population variance was unknown). The null hypothesis must be rejected when the value of t is greater than or equal to the critical t or, alternatively, when the p-value is lower than the significance level α (0.05 in the performed tests). The hypothesis tests carried out for each of the research questions (RQ1, RQ2 and RQ3) are described next.
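As a hedged sketch of this kind of test with SciPy, on made-up Likert responses (the actual survey data is not reproduced in the paper), testing H0: μ ≤ 3 one-sided at α = 0.05:

```python
import numpy as np
from scipy import stats

# Made-up Likert responses (1-4) for one survey question; the real data is not in the paper.
responses = np.array([4, 3, 4, 4, 3, 4, 2, 4, 3, 4, 4, 3])

alpha, mu0 = 0.05, 3  # significance level and the value under H0 (X <= 3)

# One-sample t-test, one-sided alternative "mean greater than 3".
t_stat, p_value = stats.ttest_1samp(responses, popmean=mu0, alternative="greater")
t_critical = stats.t.ppf(1 - alpha, df=len(responses) - 1)

print(f"t = {t_stat:.3f}, critical t = {t_critical:.3f}, p-value = {p_value:.5f}")
print("Reject H0" if p_value < alpha else "Cannot reject H0")
```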
RQ1: H0: The management information system designed and implemented does not provide the information needed for decision making.

The answer to RQ1 was given by the tests for SQ6 and SQ7, considering H0: X ≤ 3. With respect to the values of SQ6, since the value of t is higher than the critical t (4.509 > 1.680) and the p-value is lower than the significance level α (0.000023 < 0.05), the null hypothesis must be rejected. However, the values of the hypothesis test for SQ7 indicate the opposite. The contradiction between the hypothesis test values obtained for SQ6 and SQ7 does not confirm that the designed and implemented BI system provides the information needed for decision making.
RQ2: H0: The presentation of the information in the dashboard is not adequate for the decision-making process.

The answer to RQ2 was given by the tests for SQ2 and SQ4, considering H0: X ≤ 3. With respect to the values of SQ2, since the value of t is less than the critical t (1.478 < 1.680), one cannot reject H0. On the other hand, for SQ4, the value of t is higher than the critical t (4.780 > 1.680) and the p-value is lower than the significance level α, so H0 is rejected. However, in view of the contradiction between the two questions, it is not possible to conclude that the presentation of the information in the dashboard is adequate for the decision-making process.
RQ3: H0: The BI system has limitations that affect its potential to support decision-making in fisheries surveillance.

The answer to RQ3 was given by the test for SQ8, considering H0: X ≤ 3. The value obtained for this hypothesis test reveals that it is not possible to reject the null hypothesis, since the value of t is lower than the critical t (-3.100 < 1.680). Thus, one cannot reject the hypothesis that the BI system has limitations that affect its potential to support decision-making in fisheries surveillance.
In summary, the analysis of the hypothesis tests for the research questions does not allow conclusions to be drawn about the qualities of the built BI system. Therefore, the BI solution, despite presenting useful and relevant information, still requires improvements in order to fulfill the requirements necessary for a more complete support of decision-making in fisheries surveillance.
V. CONCLUSIONS
In this article, the state of the art of the process of implementing a BI solution was summarized. To demonstrate the instantiation of the process, a BI solution was built for a relevant domain of interest: fisheries surveillance. For this purpose, it was necessary to survey the existing legal framework on fisheries surveillance, the types of possible infractions, and the data currently available for collection.

Messages are one of the most widely used means of disseminating information regarding fisheries surveillance. The fields that compose the different types of messages were elicited. This survey was used later in the ETL process to clean the data to be migrated to the BI solution.
The design and implementation of the BI system were carried out with the purpose of contributing to a more unified vision of the decision process related to fisheries surveillance. The evaluation of the final solution allowed us to conclude that the system still needs improvements so that it may incorporate all users' requirements.
For future work, in addition to the further iterations needed to improve the solution obtained, the objective is to extend the study to other domains of interest relevant to the organization. Furthermore, an important effort to be made is the improvement of the quality of the data sources, i.e. the data contained in the surveillance messages originally sent by the naval units.
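ACKNOWLEDGMENT
This work was funded by the Portuguese Ministry of Defense and the Portuguese Navy/CINAV/Escola Naval.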
REFERENCES
[1] EU: 'EU Regulation 404/2011', Official Journal of the European Union, Series L-112/1, April 8, 2011.
[2] Dalfovo, O., and Tamborlin, N.: 'Business Intelligence – Estudos e casos: Gestão de Tecnologia da Informação como Inteligência nos Negócios', 1st ed., Blumenau, Clube de Autores, 2017.
[3] Eckerson, W. W.: 'The Rise of Analytical Applications: Build or Buy?', https://bit.ly/2NZnxyk, accessed December 3, 2017.
[4] Cody, W.F., Kreulen, J.T., Krishna, V., and Spangler, W.S.: 'The integration of business intelligence and knowledge management', IBM Systems Journal, 41(4), 697-713, 2002.
[5] Inmon, W.H.: 'Building the Data Warehouse', New York, USA, John Wiley & Sons, 2002.
[6] Inmon, W.H., and Krishnan, K.: 'Building the Unstructured Data Warehouse: Architecture, Analysis, and Design', 1st ed., New Jersey, Technics Publications, 2011.
[7] Kimball, R., and Ross, M.: 'The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling', 2nd ed., New York, John Wiley & Sons, 2002.
[8] Kimball, R., and Caserta, J.: 'The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data', Indianapolis, Wiley Publishing, 2004.
[9] Rainardi, V.: 'Building a Data Warehouse: With Examples in SQL Server', 1st ed., USA, Apress, 2008.
[10] Ballard, C., Farrell, D.M., Gupta, A., Azuela, C., and Vohnik, S.: 'Dimensional Modeling: In a Business Intelligence Environment', IBM Redbooks, 2006.
[11] Kimball, R., and Ross, M.: 'The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling', 3rd ed., Indianapolis, John Wiley & Sons, 2013.
[12] Vercellis, C.: 'Business Intelligence: Data Mining and Optimization for Decision Making', 1st ed., UK, John Wiley & Sons, 2009.
[13] Valacich, J., and Schneider, C.: 'Information Systems Today: Managing in the Digital World', 8th ed., UK, Pearson Education Limited, 2017.
[14] Moody, D.L., and Kortink, M.A.R.: 'From Enterprise Models to Dimensional Models: A Methodology for Data Warehouse and Data Mart Design', Proceedings of the International Workshop on Design and Management of Data Warehouses (DMDW'2000), vol. 28, pp. 5-1-12, https://bit.ly/2KmPhLf, accessed December 25, 2017, 2000.
[15] Machado, J.: 'Business Intelligence da Atividade Operacional da Marinha Portuguesa', MSc thesis, Escola Naval, 2018.