Big Data
Review article
Keywords: Construction; Big data; Data analytics; Data engineering

Abstract: The construction industry is currently going through an intelligent revolution. The profound transformation of the Industry 4.0 era is made possible by contemporary technologies such as Internet of Things (IoT), cloud computing, and robotics. Essentially, the vast amount of diverse big data from many sources should be properly utilized to enhance the entire life-cycle construction process. Construction efficiency can be enhanced while material waste and construction expenses are reduced, planning and decision-making processes can be improved while errors are lowered, and applications of big data in construction analytics will make construction sites safer. This article not only offers a comprehensive review of the advantages of associated big data approaches, but it also assesses the current state of the art in the construction industry. Several unresolved difficulties are also discussed. In the end, we express our thoughts on the potential future of big data in the construction industry.
∗ Corresponding author at: Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China.
E-mail address: [email protected] (F. Li).
https://doi.org/10.1016/j.jii.2023.100483
Received 8 December 2022; Received in revised form 23 April 2023; Accepted 14 June 2023
Available online 28 June 2023
2452-414X/© 2023 Elsevier Inc. All rights reserved.
F. Li et al. Journal of Industrial Information Integration 35 (2023) 100483
Fig. 1. Big data are produced throughout the duration of an entire construction project and can, in turn, benefit the industry.
Besides common-sense construction processes, including prefabrication manufacturing and construction sites, construction planning and management software such as BIM (building information modeling), EIS (executive information system), and DSS (decision support system) also produces big data. The sources can be summarized as simulation datasets, experimental datasets, organization datasets, company datasets, government datasets, project field surveys, project monitoring datasets, and unstructured sources such as social networks, statistical yearbooks, and crawler data from websites. The main benefits brought by big data include, but are not limited to, improved building efficiency, reduced environmental impact, better collaboration, and improved sustainability of buildings and infrastructure.

Many industrial applications, such as large-scale distributed devices, long-lasting production, company operation, production supply chains, and external cooperation sources, can provide industrial big data [8–11]. For example, the energy Internet [12], distributed modeling [13,14], sensor network monitoring [15], high-dimensional process monitoring [16], and so on, generate big data. For modern industries in the Industry 4.0 era, data are being generated by all kinds of isolated and networked machines and devices, cloud-based solutions, business planning and management, etc. Big data is a distinguishing feature of the current generation of intelligent construction because IoT (Internet of Things) uses widespread sensors and microprocessors throughout the whole construction site, generating a volume of data well above traditional sizes. The data volume is expected to keep increasing in the coming decades. Due to the streaming nature of the data sources, construction data is also dynamic. However, the term "big data" does not simply mean "a big amount" of data containing abundant information, which would be called "very large data" or "massive data". Essentially, "big data" stands for data that are generally unstructured, heterogeneous, and therefore extremely complex to deal with. Fortunately, information sharing, processing, and application are becoming more organized as a result of the digitization and standardization of many sorts of data formats and types. IoT solutions, including mobile, pervasive computing, and smart devices, have also seen huge increases in popularity along with the big data industry's explosive growth.

Real-time or nearly real-time analysis is frequently needed for industrial big data. Therefore, it is challenging even for commercial database software and data analytics tools to efficiently and effectively capture, store, manage, and analyze the data sets [17]. The construction industry, particularly intelligent construction, generates enormous volumes of data every day, which keep growing with information on everything from building models and designs to communications, cost prediction, and management. Because the data are often unstructured and can be difficult to access without the appropriate tools, harnessing big data in construction is important. Furthermore, AI paradigms are also driving the digital revolution of traditional construction towards intelligent construction [18]. Therefore, it is essential to research and create big data approaches for the construction industry, including engineering and analytics methods and strategies [19,20]. For example, big data algorithms can improve construction project quality and reduce the incidence of quality problems [21]; multi-party construction projects can be improved by a big data service platform [22]; and the capacity of construction companies can be assessed using a predictive and prescriptive big data platform [23].

Data analytics in the construction industry contributes to many different aspects, including building design, construction cost management, energy consumption prediction and pattern identification, material performance prediction, safety management, cloud framework establishment, decision making and control systems, and other areas [2]. Conversely, the vast data gathered on the progress of construction projects during their entire life cycles, machinery conditions, worker activities, material positions, vehicle trajectories, energy consumption, weather conditions, and so on can improve the AI models [19], providing wiser and broader insights into the construction industry. Utilizing the data from construction projects to optimize how projects are designed, built, and run is the next wave of industrial innovation. It becomes possible to develop better scheduling and planning strategies, and efficient construction rules and techniques can be derived to solve potential issues and improve efficiency. Besides, thanks to more streamlined data and information flows, decision-making and strategy development for the management of construction projects also benefit. To sum up, data analytics and AI contribute to many parts of the construction process.

Construction is a fairly fragmented procedure with an ad hoc organizational structure and non-linear workflow, in contrast to the assembly line approach used in manufacturing. Tasks do not typically link in a straight line. Instead, shared resources create connections between activities, both within and across tasks. Because subcontractors usually receive a variety of projects and frequently have variable degrees of information literacy, getting precise information from them is challenging for general contractors and project owners. Conflicting information flows cause participants to perceive the project differently, and their coordination consumes a significant amount of labor and resources, which is detrimental to project management and team organization [24]. As a result, unstructured data, which are data
that include information but lack a clear structure, must be handled by data analytics throughout the construction process, including images, emails, plans, websites and reports [25]. Fig. 2 shows the typical big data generation scenarios, including the pervasive sensors installed on the machinery used in the construction sites or terminal devices used in the whole construction process, communication techniques such as satellite, WiFi, LoRa, etc., edge or cloud servers, and all kinds of construction services, including management, design, monitoring, planning, scheduling and so on.

Fig. 2. Big data generation scenarios in the construction process, covering terminal devices, communication techniques, edge/cloud servers, and associated services.

As far as we are aware, this is the first comprehensive, cutting-edge study of how the big data paradigm has affected every stage of the construction industry's life cycle in the new era of digitization. Although there have been works describing big data analytics [10,17,19], existing surveys have mainly focused on certain isolated aspects of the construction process. From different construction perspectives, researchers have provided overviews of BIM storage [26], IoT intelligent management [27], construction site management [5], prefabricated building [28], CPS (cyber-physical system) based construction organization [24], data mining in construction [2], bibliometric analysis [29], and so on. This article aims to give readers a better understanding of the role of big data engineering and analytics in modern construction by examining the sources, analytics tools, and benefits.

The remaining sections of this survey are organized as follows. In Section 2, we first introduce the data generation possibilities for the different stages of the construction project life cycle. We discuss the associated big data engineering and analytics approaches and procedures in Section 3. The cutting-edge advantages of big data in the construction sector are illustrated in Section 4. Section 5 discusses the current problems and difficulties with big data in construction applications. Section 6 concludes the survey.

2. Big data generated from construction process

Big data is a broad term for a complex and substantial collection of data, which needs advanced engineering strategies and analytics systems to process, store and manage. Big data also includes a data transformation flow and a data security architecture in addition to the 5Vs (volume, velocity, variety, value, and veracity) [30]. In the construction industry, big data are produced not just on construction sites but also in other related procedures within the entire construction process, including modeling, designing, planning, scheduling, and management. Therefore, various sources in the construction industry provide data in a variety of structured and unstructured formats, including cameras, sensors, wearables, mobile devices, designs, log and management files, and so forth. We can only significantly improve the construction process and reduce loss and waste if information is shared throughout the design, manufacture, transportation, assembly, construction, and maintenance phases.

2.1. Construction modeling and planning

BIM (Building Information Modeling) was developed and derived from computer-aided design (CAD) [31]. However, BIM is now a more comprehensive concept, which assists a construction project throughout its entire life cycle. Generally speaking, BIM offers a virtual model and the necessary data about buildings. Planning, design, building, and operation are the four steps that BIM typically supports. Phase planning, site study, cost budgeting, and status quo modeling are all involved in the planning step; the design process includes assessment of design modeling, cost accounting, structural analysis, performance analysis, resource analysis, energy analysis, and pipeline and data integration. In the construction stage, model building, site simulation, construction simulation, dynamic collision detection, construction optimization, design coordination, and monitoring and adjustment are implemented by BIM; in the operation stage, BIM includes the study of building systems, maintenance scheduling, equipment management, record modeling, space management, disaster warning, and document production. Accordingly, the US National Institute of Building Sciences defines BIM as "sharing of knowledge resources for information about a facility, providing a trustworthy foundation for decisions throughout its life-cycle; BIM exists from the earliest conception to the demolition of a construction project" [32].

BIM has become more than a modeling and simulation tool. It offers a digital replica of the building under construction, which is similar to a "digital twin" [33]. Through the interactions between the physical and cyber worlds, engineering project monitoring, modeling, and decision-making are now possible with the help of BIM. At the component level, BIM may provide a highly accurate representation of a project by including geometric, topological, and metadata features [34]. Specifically, BIM models make it possible for building blueprints to be quickly accessed via digital devices and offer a way to track the current status of construction projects in real time [35]. As geographical data acquisition and retrieval advance, the integration of geographical and semantic data is related to BIM at many stages of the construction process, including land planning, cadastral survey, and other applications related to geographic information systems (GIS). On the construction site, BIM can act as a real-time safety planning tool and automated hazard checker that identifies and prevents fall hazards for construction workers [36]. The amount of available BIM data keeps growing. BIM models are
designed to contain a significant amount of detailed data from multiple sources and disciplines such as architecture, engineering, and construction. BIM models of large buildings, complex infrastructures, and multi-disciplinary projects, which contain a high level of detail, occupy large storage spaces. In addition, BIM models are often used for collaboration and coordination among project stakeholders, such as architects, engineers, contractors, and building owners, which requires sharing and updating large amounts of data between multiple parties. The BIM files of a building model can easily exceed 50 GB [37] or 100 GB [38] in size. Big data management and analytics are therefore crucial.

A full collection of BIM data is produced that includes physical model and attribute details for the whole life cycle, from design, survey, and construction through to operation and maintenance. How to store, integrate, and use BIM models has become a key issue because of the increased volume of data and the longer time needed to prepare the visualization of the BIM model. Ground, underground, indoor, outdoor, building, street, real-time, historical, and prediction data are all included in BIM big data. Input, acquisition, modification, and integration of information are done by project participants throughout different phases of the construction project life cycle, leading to information exchange as well as integration. The arrangement, storage, and maintenance of BIM data should all be done with high efficiency [26]. It is necessary and urgent to handle large BIM files more efficiently, allowing for greater collaboration and faster decision-making in the design and construction process.

2.2. Construction site sensing

Intuitively, the construction site is what most people associate with construction. Multidimensional information about a construction site can be collected by IoT sensors. With the use of Industry 4.0 technology, it is possible to create a "connected" construction site that integrates off-site, on-site, and post-construction activities to increase productivity, uphold sustainability, and enhance worker safety [39]. An on-site IoT network brings all kinds of possibilities for construction site monitoring [5] and makes it possible to build an IoT-empowered intelligent construction monitoring system for prefabricated buildings [28]. Distributed and smart sensors have been deployed on construction sites. For example, RFID (radio-frequency identification) tags can be embedded in prefabricated construction materials, so that construction quality control can be achieved and the data can be integrated with BIM for more convenient search and positioning [40]. Project managers employ a variety of advanced instruments, including video equipment, facial recognition technology, RFID technology, wireless sensor technology, and terminal location devices, to achieve dynamic site supervision. For example, they can dynamically track the surroundings of the construction site, the state of the personnel, equipment, and vehicles that have access to it, and the attendance status of the labor force. Project employees can quickly spot issues, correct deviations, and make sure the project is running smoothly by examining critical data in the BIM model [41].

In addition, risk perception is critical for workers' safety. Data collected from wearable sensors are used to improve worker safety. For instance, wearable sensors have made it possible to evaluate the health and well-being of employees in a personalized, objective manner [42]. Physiological data can also be acquired from construction workers while they are actively working [43]. In essence, an enormous number of significant use cases make the collection, processing, and analysis of big data on construction sites necessary.

3. Big data approaches and methods

The construction industry, which has been known for being slow to accept new technology, is being transformed by big data. This change will boost productivity, collaboration, worker safety, and material waste reduction while lowering risks. The big data paradigm efficiently processes multi-sourced and diverse data and extracts valuable information to analyze and improve the existing workflow. Big data makes it possible for engineering and construction companies to gather and analyze data on costs, site-based transactions, photos, conversations, changes to plans, and more. Literally thousands of pieces of data are created for each project in the construction business. Many processes are either not tracked at all, or tracked only when reporting is done using paper-based document sheets. Without digital technology, it is practically impossible to locate crucial data items that would allow for an immediate response to anticipated issues or the application of successful outcomes to future projects, such as the construction industry's use of data mining [2]. Nowadays, information, knowledge and models of the real world are usually represented in a virtual system. Other elements of the virtual world can be software, data-driven models, data mining, AI models, and various simulation and processing techniques [44]. In this context, AI approaches are increasingly preferred by practitioners for data processing and analytics to support decision-making and provide feedback to the physical construction system and projects [45,46]. The virtual twin can imitate and manage the real system, improve a procedure, and foresee problems that have not yet emerged in the physical system [47,48].

Big data engineering and analytics are the two primary functions of big data, as shown in Fig. 3. Big data engineering includes critical steps to acquire, regularize and manage the semi-structured and unstructured data with or without interference. Acquisition, processing, storage, database and pipeline constitute the main components of big data engineering. Big data analytics are used to categorize, characterize, consolidate, predict, infer, and classify data in order to produce useful information. Prediction, classification, clustering, inference, and optimization can be summarized as the core operations, whereas statistics, data mining, expert knowledge, machine learning, and deep learning are frequently utilized techniques [49,50].

3.1. Big data engineering

Big data are advantageous to the construction industry, highlighting the importance of data engineering. Streaming and heterogeneity are characteristics of big data in the construction process. Continuous, large streaming data flows are frequently used and processed over time in the construction industry. Because big data is produced in a streaming manner, it needs to be properly structured and stored to facilitate data analytics. Data engineering, including acquisition, processing, storage, database and pipeline, serves as the foundation for effective data analysis by arranging the raw data into a form that is both analysis- and storage-friendly. The tools for data engineering have recently improved, making them more broadly accessible. From the perspective of data analysis, the engineering of big data should meet the following criteria: (1) accessibility needs to be implemented for different types of data; (2) massive data needs to be stored efficiently; (3) the engineering system needs to be scalable; (4) the data needs to be controlled globally. From data acquisition to the whole pipeline, big data engineering is a systematic task, as shown in Fig. 4.

From the data acquisition perspective, the data used in the construction process are collected from multi-modal sources and transmitted using a variety of methods and protocols. Besides the different types of sensor data, global positioning system (GPS) [40], RFID [51], location-based service (LBS) [52], etc. are often used. After being acquired from many sources, the heterogeneous data is brought into a unified process. Both wired and wireless technologies, including WiFi, ultra-wideband (UWB), etc., would be used to transfer the local data. The Open Platform Communications Unified Architecture (OPC-UA), a data exchange standard for industrial communication (M2M or PC-to-machine connection), is used to manage the data acquired by programmable logic controllers (PLC), remote terminal units (RTU), and other devices [53]. All forms of data will eventually be managed by the supervisory control and data acquisition (SCADA) server. The
SCADA system is essential to the data collection process of data engineering (as shown in the bottom right corner of Fig. 4). This system, which manages many kinds of big data in the construction industry, includes sensor readings like velocity, temperature, frequency data, and so on. It is a centralized system that monitors and regulates the entire data collection process. Data are continuously recorded and then transformed into live and historical data by connecting and monitoring the dispersed devices throughout the entire site and throughout the entire construction process. Separate databases are used to hold these two types of data. While live real-time data are utilized to fine-tune the pre-trained system model to better synchronize the instantaneous system responses or detect abnormal events, historical data are used to construct empirical models for system analysis. The resilience and sensitivity of the system model in terms of maintenance and responses are improved by the merging of historical and real-time data.

Fig. 3. Big data paradigm components in the upcoming digital construction industry.

Fig. 4. Big data engineering diagram covering from data acquisition to cloud computing.

Message-oriented middleware (MOM), a crucial element of the Industrial Internet of Things (Industrial IoT), is specifically created to flexibly integrate distributed applications and services that are running on heterogeneous computing and communication devices [54,55], which also improves the interoperability among peer-to-peer communication parties. (Message Queuing Telemetry Transport (MQTT), Data Distribution Service (DDS), Advanced Message Queuing Protocol (AMQP), Java Message Service (JMS), and other open message protocols are frequently used by MOM for M2M communications.) The MOM connects with and regulates the systems or sub-systems that make up the entire construction process, providing services like identification, authentication, permission, and security.

Data representation, fusion at distinct levels, and data alignment are used to realize data fusion across diverse modalities. Data representations extract and create the standard representations for the integrated data analysis using the multi-modal data features. In general, data representation includes coordinate representation based on spatial and distribution information [56,57], sparse representations based on dictionary learning [58,59], joint representation based on graphs and cross-media data [60,61], and so on. Data fusion focuses on a dynamic multi-level and multi-faceted analysis of big data originating from heterogeneous sources within a unified architecture, in order to transform various types of information into a format that is simpler to handle [62]. Data fusion is a technique for detecting, correlating, estimating, and combining data and information from several sources [63]. Careful data fusion can reduce the inconsistent and redundant data that can come from several database descriptions of the same attribute. Observation-level fusion merges data straight from the source; for feature-level fusion, representative features first need to be extracted from the raw sensor data; decision-level fusion is used only after each terminal device has independently made a first-round determination. Data alignment, which includes fundamental format alignment [64] as well as advanced space alignment [65] and temporal alignment [66], is another method for bridging the varied semantic gaps between various data sources. It is possible to efficiently extract and evaluate the information and features after the multi-modal data are uniformly aligned.

Data processing is required on the formatted data because real-world data are noisy, incomplete, and inconsistent. Processing data to increase quality is crucial because low-quality data yield low-quality information. The primary methods for preparing data include cleaning, transformation, and discretization [67]. Data cleaning purifies the original data by removing noise and redundant information, filling gaps, and identifying and treating outliers. Data reduction produces a smaller-volume representation of the original data set that yields the same analytical results. Examples of data reduction methods include data compression and dimensionality reduction [68]. Discretization and data transformation consolidate or modify the original data to enhance the efficiency of the mining process and the ability to understand the patterns that are extracted. Examples of data transformation techniques include normalization, aggregation, smoothing, and attribute construction [69].

With cloud computing, users leverage the Internet to instantly access a shared pool of flexible resources. An appropriate cloud-based construction management platform can handle and process tasks related to building design, costs, safety compliance, and accounting more swiftly. Development of an information management platform that manages all facets of a project's operation for big data is thus required for project management in the construction process. The cloud platform is a great technical and data resource for the intelligent management of engineering projects due to its potential to be extended and combined. For instance, cloud computing can increase the computing capacity available for big data without the need to set up or maintain a system. Edge computing has also had an impact in addition to the cloud, particularly when processing requires only a portion of the data to be transmitted to the cloud [70]. In today's intelligent integration of engineering projects, such as construction projects, the cloud and networks are commonly used. The development of cutting-edge information technology and the ubiquitous availability of high-speed networks theoretically make intelligent building project management via cloud platforms feasible. Therefore, as we progress to Industry 4.0's cloud-based, distributed, and real-time big data processing, it is no longer appropriate to employ traditional data analytics methodologies for intelligent construction. The bulk of AI algorithms and open source tools struggle with scalability, usability, extensibility, and generalization capabilities when facing big data problems [71].

Therefore, building a pipeline is essential for a complete and fluent workflow. For distributed computing, a popular and highly fault-tolerant programming model is MapReduce [72]. Big data algorithms and libraries can be created using platforms like Hadoop [73] or Spark [74] in order to extract useful information from massive data collections. It becomes essential to develop methods for predicting big data parameters, specify strategies for their structure, aggregation, testing, and storage, as well as understand how formats relate to their streaming. The strategic advantage is the capacity to use the results of big data analysis to identify hidden trends, comprehend them, and take appropriate action. Among the various stream processing platforms, such as Apache Storm, Flink, Kafka Streams, or Samza [75], we may also choose platforms that allow processes to run continuously, so that new data are processed as soon as they arrive. The technology can be used on both public clouds and private servers. Amazon Lambda (also known as AWS Lambda), one of the most well-known cloud services, enables communication with other AWS services and the execution of code written in a variety of different languages, including Node.js, Python, C#, or Java. It integrates with Amazon CloudWatch [76], which enables system monitoring and reactions to changes. With Azure Data Factory, Microsoft offers a service that is comparable to this. Google Cloud Platform [77] and Oracle Cloud [78] both provide access to their own big data and cloud storage services, and other major corporations are starting to follow the trend.

3.2. Big data analytics

The construction industry under the big data paradigm can profit a lot from data analysis tools. In the construction industry, as shown in Fig. 3, prediction, classification, clustering and inference are often utilized data analytics techniques. Descriptive, diagnostic, predictive, and prescriptive analytics are the most used types of data analysis, but outlier identification is rarely utilized [2]. By using several data analytics methodologies, it is feasible to obtain overall, and occasionally real-time, prediction and monitoring during the life cycle of the building sector. Fig. 5 shows that various analytics methods could be used in the construction business to uncover latent information in big data.
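
To make the distinction between descriptive analytics and the less frequently used outlier identification more concrete, the following is a minimal Python sketch rather than a method taken from the surveyed works. It assumes a small batch of hypothetical site sensor readings (the `temperatures` list, the function names `describe` and `find_outliers`, and the threshold of 3.5 are all illustrative assumptions). Outliers are flagged with a modified z-score based on the median absolute deviation, which is one common choice among many.

```python
from statistics import mean, median


def describe(readings):
    """Descriptive analytics: summarize a batch of numeric readings."""
    return {
        "count": len(readings),
        "mean": mean(readings),
        "median": median(readings),
        "min": min(readings),
        "max": max(readings),
    }


def find_outliers(readings, threshold=3.5):
    """Outlier identification with a modified z-score.

    The score uses the median absolute deviation (MAD), which is less
    sensitive to extreme values than the standard deviation. The threshold
    of 3.5 is an illustrative assumption, not a value from the article.
    """
    med = median(readings)
    mad = median(abs(x - med) for x in readings)
    if mad == 0:
        # All readings (or at least half of them) are identical: no spread
        # to measure against, so nothing is flagged.
        return []
    return [x for x in readings if 0.6745 * abs(x - med) / mad > threshold]


# Hypothetical temperature readings (degrees Celsius) from one site sensor.
temperatures = [21.0, 21.4, 20.9, 21.2, 35.7, 21.1, 20.8]

summary = describe(temperatures)
outliers = find_outliers(temperatures)

print(summary)
print(outliers)
```

In this sketch the summary plays the role of descriptive analytics, while the flagged reading could be passed on to a diagnostic step, for example to check whether a sensor is faulty or an abnormal event occurred on site.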
Fig. 5. Big data analytics techniques in the model, design, plan, schedule and management stages of a construction project.
The whole construction process can take advantage of big data analytics. For instance, whether a project succeeds or fails depends greatly on how well the plan is executed and how well the project information is understood. Most construction projects eventually cost more than anticipated because of schedule issues, poor planning, and a lack of knowledge in the early stages of the project. Each employee's output inevitably decreases as a result, which also lowers project productivity as a whole. Because the main structural systems, major building techniques, and the majority of construction materials are chosen during the design phase, cost planning at that stage is essential for the successful completion of a construction project [79]. Mastery of project information regarding the operational status of the construction process is also crucial for risk management, schedule modification, staff restructuring, etc. Furthermore, big data analytics methods are needed to create a complete picture of the operating scenario. In terms of methodology, the construction industry has made extensive use of statistics, data mining, machine learning, and deep learning. We therefore discuss these methods in the following paragraphs.

Statistics is the most conventional way of solving problems, since it exploits the information inherent in the data in a rigorous and efficient manner. Depending on the statistical nature of the data, trends and patterns at the data level can be analyzed and used to guide decisions regarding the building process. Statistics also offers interpretable models. In the vast majority of instances, statistical methodologies are applied to construction management, although statistical methods for construction design also exist [80,81]. For instance, a statistical approach was utilized to support legal decision-making in construction lawsuits [82]. Li and Jin [83] proposed a fault detection system based on pattern matching and principal component analysis (PCA) methods. Additionally, statistical techniques were used to schedule the construction delay as well as to track the movements of heavy equipment and workers [84].

Data mining combines vast amounts of data from various sources and finds hidden patterns or laws in them. For the sake of analytics, data mining untangles the inherent linkages in a large batch of data. For planning and scheduling in the construction industry, mining the information in big data frequently yields more accurate results. Data mining helps the construction industry prepare by allowing companies to uncover knowledge hidden in databases, improving estimates of building costs [85] and delays [86]. Buchheit et al. [87] illustrated the knowledge discovery procedure using data mining methods for "intelligent" construction. Dagbui et al. [88] used data mining to extract the information contained in construction data and predict building costs. Kim et al. [86] selected the most important causes of construction delay and analyzed them using data mining. Data mining tools increasingly emphasize construction productivity during scheduling. Using data mining, Pradhan et al. [89] made it possible to track the productivity of the construction industry and identify areas for improvement. Bai et al. [90] applied intelligent data mining methods to assess the effective productivity of construction equipment. Data mining was also employed to increase construction productivity: Rujirayanyong and Shi [91] integrated construction data for this purpose. Meanwhile, data mining could also make a difference in other areas, such as injury detection [92].

Machine learning is a type of technology that can learn from experience. It is receiving more attention than ever before because of its capacity to learn, as industry data become ever easier to evaluate. Machine learning is extensively employed throughout the whole construction life cycle because of its great generalization capabilities. BIM modeling is essential for the success of the entire construction process. Combined with BIM capabilities, machine learning makes the digital construction process easier. For instance, each object in an Industry Foundation Classes (IFC) model can be assigned to the appropriate category using data-driven classification [93].

Romero-Jarén [94] put forth a method to automatically segment and categorize BIM objects from point cloud data. Machine learning thus contributes to more thorough BIM modeling. Throughout design and construction, machine learning algorithms can identify phenomena that we can only evaluate from a statistical perspective rather than an intuitive one. For instance, structural design is one of the elements that could benefit from data-driven analytics. By analyzing several data sets from various fields, the researcher can design the building structure while taking into account its benefits and drawbacks [95,96]. To find a more compact structure for tensegrity structures, Yamamoto et al. [95] recommended using a genetic algorithm. Machine learning is a useful tool for planning building projects as well. A model-based planning system in BIM is
presented by Chen et al. [97], using genetic algorithms to determine the best crew assignment configuration. Similarly, Shin et al. [98] obtained the best layout for the hoist. To forecast building costs, Hwang et al. [99] created a dynamic regression model. In construction planning, machine learning is used to help find the best solution under the given resource constraints. Natural frequencies, damping ratios, and mode shapes are important modal properties for monitoring the development of complex structures and the structural health of those systems. These data were used to characterize and identify dynamic structural features using very efficient Bayesian algorithms [100]. Additionally, machine learning can be used in daily tasks such as construction scheduling to improve efficiency. Kang and Ryu [101] used a random forest model to forecast workplace accidents and thereby reduce risks. Kale and Baradam [102] used logistic regression to model construction injuries. For identifying bridge deterioration, Liu and Jiao [103] devised a genetic algorithm-support vector machine technique. Bridge cracks can also be detected more precisely using machine learning algorithms [104]. On-site construction projects can be supported by data-driven solutions based on automated earthwork planners and cost prediction tools, which ensure effective vehicle use and route planning during on-site building and result in considerable time and cost savings [105]. Similarly, Ardeshir et al. [106] developed a fuzzy inference system for the risk assessment of health, safety, and the environment (HSE). In the management phase of the construction industry, machine learning algorithms also play a unique role in tasks such as damage assessment [80,107] and fault detection [108,109]. Machine learning is well suited to problems like these, which require extracting information from data.

Deep learning is a cutting-edge framework that incorporates many aspects of the human brain. It extracts the inherent patterns and hierarchical representations of data. Deep learning is now frequently used as a powerful monitoring and prediction tool. It can handle tasks that are significantly more non-linear and non-anthropogenic than ever before. Deep learning's distinctive capabilities, including object recognition and detection [110,111], cost calculation, activity recognition [112], productivity estimation [113,114], safety performance, etc., come into play during the scheduling step.

Fang et al. [115] introduced a unique framework that recognizes worker behaviors and determines whether they fall within the parameters of the workers' certification for safety control. A deep neural network approach by Juszczyk et al. [116] enables cost forecasts for a particular kind of construction object. Slaton et al. [117] used a model made up of convolutional neural network (CNN) and long short-term memory (LSTM) layers to identify the activities of heavy machinery. This kind of AI technology might be used to manage swarms of robots and self-driving vehicles to increase productivity on construction sites. All of the aforementioned tasks can only be completed when sensory data streams are networked to data warehouses, analytics platforms, and platforms for developing insights that are subsequently used by humans, robots, and automation devices. Many studies have also used big data analytics technologies to pinpoint the contributing factors or causes of accidents. For instance, Zhang et al. [118] analyzed construction accident reports using natural language processing (NLP). Tixier et al. [119] developed a conceptual framework employing NLP and graph mining to uncover factors related to injuries. Meanwhile, data analytics techniques can also be used to implement or enhance risk monitoring [120], event prediction [121], and hazard identification [122].

Furthermore, by integrating data from earlier projects, integrated analytics will provide more precise cost estimation. For instance, Lowe et al. [123] found that cost estimation using a neural network was superior to cost estimation using more conventional techniques. Rafiei et al. [124] estimated construction costs using deep models (DBM-SoftMax) while taking into account the effects of external economic factors. Bayram et al. [125] developed models utilizing artificial neural networks and radial basis functions for the initial estimation of construction expenses.

For better decision-making, the construction industry also needs to estimate risk in addition to costs. Asadi et al. [105], for instance, investigated the best approach to creating a prediction model that might pinpoint the reasons for building project delays. To forecast project success, Cheng et al. [126] introduced the evolutionary support vector machine inference model (ESIM). These researchers frequently use machine learning and statistical techniques to anticipate the course of construction using integrated data. These data are obtained either through nonintrusive approaches or by installing sensors on buildings under construction in an invasive manner. AI has been used in the construction sector to evaluate historical data for estimating and forecasting the compressive strength of concrete in order to accomplish the aforementioned objectives. It has been useful for understanding how the five fundamental components of concrete – water, cement, metakaolin, fine aggregate, and coarse aggregate – affect the construction's quality [127].

Information modeling can make use of predictability. In the absence of highly skilled experts, inexperienced builders could be guided and helped by an AI-powered system. Alternatively, because of the data-driven paradigm, big data analytics technologies can bring expert knowledge to construction practice.

4. Big data benefits

Across industries, a survey pointed out that companies using big data had an 8% increase in profit and a 10% reduction in cost [128]. Specifically, McKinsey & Company claimed that using big data analytics in construction can reduce project costs by 5%–10% and shorten project schedules by 10%–20% [129]. With BIM-like digital tools, the time to generate reports has been reduced by 75%, and document transmittal is sped up by 90%. In addition, big data analytics can also improve construction safety [130].

With the decline in the cost of data acquisition, real-time collection and processing of massive data is becoming more and more prevalent. A dynamic relationship exists between information and decision-making. Data technology is commonly used for optimization. Data collection, update, pattern recognition, and correlation will all be automatic. AI approaches use big data systems to provide crucial information and insight even before a project has started. This allows for the early management of potential concerns including coordination issues on construction sites, conflicts between different disciplines and trades, and even the impact of the weather. The ability to pivot in reaction to data insights may have a major influence on reducing costs and time overruns. For instance, big data and BIM alter the construction industry during the design phase. Data can be gathered and used to help in the design process. When a suitable data analytics platform is in place, large amounts of data may be quickly evaluated and used to uncover probabilities and trends that can help anticipate potential issues that may influence building projects throughout the construction process. Data analytics technology reduces construction delays and material-related costs by providing comprehensible data and early detection of potential structural issues. As a result, there are fewer human errors, and project managers are able to decide more swiftly and wisely. The advantages of big data in construction projects include better collaboration, higher output, faster construction, lower risk, less waste, improved worker safety, and so on.

4.1. Efficiency improvement

Nowadays, simplified activities are frequently used in the construction process. As soon as the operating plan is established, initiatives become simpler to evaluate. Team members are equipped with the knowledge and resources necessary to make decisions, reduce risk,
and increase efficiency. One of the major problems that construction industry workers typically encounter is a lack of communication. Fortunately, big data solutions make it simple for crew members to exchange information. IoT's visual remote monitoring feature boosts management effectiveness while cutting costs. Moreover, this reduces the likelihood of misunderstanding-related mistakes, improves working relationships, and keeps everyone informed in the event of an unexpected change or interruption.

Big data can also help increase the productivity of construction sites and machinery. To increase efficiency, sensors are used on modern construction sites to collect data about the site and the equipment. When these devices are connected to on-site operational equipment, the information they produce allows the performance and use of instrumented machines to be studied in great detail. Sensor data can reveal when construction equipment is idle and when it is in use, allowing contractors to increase fuel efficiency and productivity and to determine whether it is more cost-effective to buy, lease, or rent such equipment.

4.4. Working condition improvement

Compared with workers in other industries, construction workers are more likely to sustain an occupational injury [135]. Wearable sensors are having a significant impact on working conditions for site staff, in addition to offering vital insights into the productivity and efficiency of machinery and equipment. A wearable's biometric sensors allow worker health to be monitored, along with environmental variables that may affect workplace safety. A satisfied workforce produces more. Fortunately, the sector is quickly adopting technology such as smart construction devices and safety management software. By collecting health and activity data, detecting safety risks, and notifying construction teams of safety protocol violations, these devices take advantage of the potential of big data. With this knowledge, project managers can prepare for safety concerns on upcoming projects in addition to protecting the people who use safety technologies.

5. Open issues and challenges
latency are main demands for the smart construction. A new trend is enabling computation at the network edge. For example, the IoT sensors on construction sites can not only perform data acquisition but also carry out computation tasks for data engineering and analytics. Distributed sensors continue to produce valuable data, and AI makes inferences from it. In essence, edges with strong computing capabilities bring AI closer to the end consumers and application scenarios. By utilizing edge AI, local edge servers can improve privacy and security while reducing bandwidth use and latency at the edge of IoT networks. Distributed BIM software can use local edge AI engines to improve modeling performance without exposing sensitive information. When coordinating heterogeneous resources across several domains, the edge intelligence paradigm that is produced by organically integrating global and local intelligence can always apply AI techniques. Federated learning and blockchain techniques provide secure communication protocols to leverage both local and global sources. The extracted local information or knowledge, and even edge intelligence, overcomes the traditional limits of model training. In other words, although global intelligence negatively impacts the accuracy and speed of local intelligence, local intelligence really enhances global intelligence. Edge intelligence consequently becomes a crucial concern, necessitating further edge structure design tailored to the whole construction process.

5.3. Heterogeneous devices coexisting

In the course of building, a variety of CPS and IoT systems coexist and function, frequently utilizing heterogeneous hardware and software. Critical issues arise as a result of this heterogeneity.

Non-linear interactions between the many subsystems. The interactions, particularly the data and information flows, between the diverse subsystems are crucial. Often, the subsystems must work together to maintain a working flow. Workers and managers need a comprehensive understanding of the entire building project, which requires understanding the subsystems' workflow.

Negative interference brought on by diverse subsystems. The restricted resources available to subsystems, such as the machinery at a building site, make it possible for such subsystems to interfere severely with one another. Therefore, additional work is needed. Utilizing cross technology communication (CTC), which permits communication between many communication systems, is one potential remedy [146].

Lack of overall system scalability and flexibility. Because the system is dynamic, network nodes should be able to join and leave it at any moment, for instance, a truck leaving a construction site or a new worker joining the job. Thus, ensuring flexibility and scalability is crucial. Potential solutions include developing adaptive network growing/pruning strategies and incremental service orchestration, which reuses the existing network resources.

6. Conclusion

The modern construction industry and the next wave of digitization require big data processing and analytics in real-world industrial applications throughout a construction project's lifespan. Big data techniques aid the construction industry by enhancing efficiency and reducing waste through cutting-edge information technology and data management systems. Modern technologies, such as AI, sophisticated statistical and optimization models, and big data analytics, offer further opportunities for process improvement. The advancement of big data analytics enhances the capacity to track, record, and analyze data to forecast and recommend the best course of action for the management of building projects. Big data applications in construction still face a number of unresolved problems and difficulties at various levels. We provided a broad overview of big data in construction, focusing on the entire life cycle. We started by outlining how big data are produced in the construction process. Then, topics of data engineering and data analytics were discussed. Additionally, the advantages of big data were illustrated. Finally, in order to shed light on contemporary intelligent construction, key issues were emphasized along with future research and application areas. We hope our paper can help researchers and practitioners to better understand the state of the art in the modern construction industry, identify big data related areas for further investigation, and ultimately drive innovation and progress in the construction industry.

Declaration of competing interest

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Honggui Han reports financial support was provided by National Key Research and Development Project.

Data availability

No data was used for the research described in the article.

Acknowledgments

The research is supported by National Key Research and Development Project under Grant 2022YFB3305800-5, National Science Foundation of China under Grants 92267107, 62125301, 61890930-5, 61903010, and 62021003, Beijing Outstanding Young Scientist Program under Grant BJJWZYJH01201910005020, Beijing Natural Science Foundation under Grant KZ202110005009, and Beijing Youth Scholar under Grant No. 037.

References

[1] C.P. Chen, C.-Y. Zhang, Data-intensive applications, challenges, techniques and technologies: A survey on Big Data, Inform. Sci. 275 (2014) 314–347.
[2] H. Yan, N. Yang, Y. Peng, Y. Ren, Data mining in the construction industry: Present status, opportunities, and future trends, Autom. Constr. 119 (2020) 103331.
[3] J. Snyder, A. Menard, N. Spare, Big data: Big questions for the engineering and construction industry, 2018.
[4] Y. Lu, Industry 4.0: A survey on technologies, applications and open research issues, J. Ind. Inf. Integr. 6 (2017) 1–10.
[5] A. Bucchiarone, M. De Sanctis, P. Hevesi, M. Hirsch, F.J.R. Abancens, P.F. Vivanco, O. Amiraslanov, P. Lukowicz, Smart construction: remote and adaptable management of construction sites through IoT, IEEE Internet Things Mag. 2 (3) (2019) 38–45.
[6] P.K.R. Maddikunta, Q.-V. Pham, B. Prabadevi, N. Deepa, K. Dev, T.R. Gadekallu, R. Ruby, M. Liyanage, Industry 5.0: A survey on enabling technologies and potential applications, J. Ind. Inf. Integr. 26 (2022) 100257.
[7] E. Hämäläinen, T. Inkinen, Industrial applications of big data in disruptive innovations supporting environmental reporting, J. Ind. Inf. Integr. 16 (2019) 100105.
[8] Y. Cheng, K. Chen, H. Sun, Y. Zhang, F. Tao, Data and knowledge mining with big data towards smart production, J. Ind. Inf. Integr. 9 (2018) 1–13.
[9] X. Zhou, Y. Hu, W. Liang, J. Ma, Q. Jin, Variational LSTM enhanced anomaly detection for industrial big data, IEEE Trans. Ind. Inform. 17 (5) (2020) 3469–3477.
[10] Z. Lv, H. Song, P. Basanta-Val, A. Steed, M. Jo, Next-generation big data analytics: State of the art, challenges, and future research topics, IEEE Trans. Ind. Inform. 13 (4) (2017) 1891–1899.
[11] A. Al-Abassi, H. Karimipour, H. HaddadPajouh, A. Dehghantanha, R.M. Parizi, Industrial big data analytics: challenges and opportunities, in: Handbook of Big Data Privacy, Springer, 2020, pp. 37–61.
[12] K. Wang, H. Li, Y. Feng, G. Tian, Big data analytics for system stability evaluation strategy in the energy Internet, IEEE Trans. Ind. Inform. 13 (4) (2017) 1969–1978.
[13] D. Wang, F. Li, K. Liu, Modeling and monitoring of a multivariate spatio-temporal network system, IISE Trans. (2021) 1–17.
[14] F. Li, R. Xie, Z. Wang, L. Guo, J. Ye, P. Ma, W. Song, Online distributed IoT security monitoring with multidimensional streaming big data, IEEE Internet Things J. 7 (5) (2020) 4387–4394.
[15] S. Rani, S.H. Ahmed, R. Talwar, J. Malhotra, Can sensors collect big data? An energy-efficient big data gathering algorithm for a WSN, IEEE Trans. Ind. Inform. 13 (4) (2017) 1961–1968.
[16] K. Huang, Y. Wu, C. Wang, Y. Xie, C. Yang, W. Gui, A projective and discriminative dictionary learning for high-dimensional process monitoring with industrial applications, IEEE Trans. Ind. Inform. 17 (1) (2021) 558–568.
[17] S. Yin, O. Kaynak, Big data for modern industry: challenges and trends [point of view], Proc. IEEE 103 (2) (2015) 143–146.
[18] W. Yu, T. Dillon, F. Mostafa, W. Rahayu, Y. Liu, A global manufacturing big data ecosystem for fault detection in predictive maintenance, IEEE Trans. Ind. Inform. 16 (1) (2020) 183–192.
[19] M. Bilal, L.O. Oyedele, J. Qadir, K. Munir, S.O. Ajayi, O.O. Akinade, H.A. Owolabi, H.A. Alaka, M. Pasha, Big Data in the construction industry: A review of present status, opportunities, and future trends, Adv. Eng. Inform. 30 (3) (2016) 500–521.
[20] S.A. Ismail, S. Bandi, Z.N. Maaz, An appraisal into the potential application of big data in the construction industry, Int. J. Built Environ. Sustain. 5 (2) (2018).
[21] D. Wang, J. Fan, H. Fu, B. Zhang, Research on optimization of big data construction engineering quality management based on RNN-LSTM, Complexity 2018 (2018).
[22] X. Zhang, X. Ming, D. Yin, Reference architecture of common service platform for Industrial Big Data (I-BD) based on multi-party co-construction, Int. J. Adv. Manuf. Technol. 105 (5) (2019) 1949–1965.
[23] J. Ngo, B.-G. Hwang, C. Zhang, Factor-based big data and predictive analytics capability assessment tool for the construction industry, Autom. Constr. 110 (2020) 103042.
[24] Z. You, L. Feng, Integration of industry 4.0 related technologies in construction industry: A framework of cyber-physical system, IEEE Access 8 (2020) 122908–122922.
[25] H. Baars, H.-G. Kemper, Management support with structured and unstructured data—an integrated business intelligence framework, Inf. Syst. Manage. 25 (2) (2008) 132–148.
[26] Z. Lv, X. Li, H. Lv, W. Xiu, BIM big data storage in WebVRGIS, IEEE Trans. Ind. Inform. 16 (4) (2020) 2566–2573.
[27] Y. Zhao, Q. Wang, X. Wang, Refined and intelligent management mode of construction project based on BIM and IOT technology, in: The Sixth International Conference on Information Management and Technology, 2021, pp. 1–5.
[28] X. Wang, S. Wang, X. Song, Y. Han, IoT-based intelligent construction system for prefabricated buildings: Study of operating mechanism and implementation in China, Appl. Sci. 10 (18) (2020) 6311.
[29] Y. Lu, J. Zhang, Bibliometric analysis and critical review of the research on big data in the construction industry, Eng. Constr. Archit. Manag. (2021).
[30] E. Sisinni, A. Saifullah, S. Han, U. Jennehag, M. Gidlund, Industrial internet of things: Challenges, opportunities, and directions, IEEE Trans. Ind. Inform. 14 (11) (2018) 4724–4734.
[31] S. Azhar, Building information modeling (BIM): Trends, benefits, risks, and challenges for the AEC industry, Leadersh. Manag. Eng. 11 (3) (2011) 241–252.
[32] H.-M. Chen, K.-C. Chang, T.-H. Lin, A cloud-based system framework for performing online viewing, storage, and analysis on big data of massive BIMs, Autom. Constr. 71 (2016) 34–48.
[33] F. Tao, Q. Qi, Make more digital twins, Nature 573 (7775) (2019) 490–491.
[34] T.D. Oesterreich, F. Teuteberg, Understanding the implications of digitisation and automation in the context of Industry 4.0: A triangulation approach and elements of a research agenda for the construction industry, Comput. Ind. 83 (2016) 121–139.
[35] J.K. Whyte, T. Hartmann, How digitizing building information transforms the built environment, Build. Res. Inf. 45 (6) (2017) 591–595.
[36] S. Zhang, K. Sulankivi, M. Kiviniemi, I. Romo, C.M. Eastman, J. Teizer, BIM-based fall hazard identification and prevention in construction safety planning, Saf. Sci. 72 (2015) 31–45.
[37] M. Bilal, L.O. Oyedele, J. Qadir, K. Munir, O.O. Akinade, S.O. Ajayi, H.A. Alaka, H.A. Owolabi, Analysis of critical features and evaluation of BIM software: towards a plug-in for construction waste minimization using big data, Int. J. Sustain. Build. Technol. Urban Dev. 6 (4) (2015) 211–228.
[38] N. Garyaev, V. Garyaeva, Big data technology in construction, in: E3S Web of Conferences, Vol. 97, EDP Sciences, 2019, p. 01032.
[39] C.J. Turner, J. Oyekan, L. Stergioulas, D. Griffin, Utilizing industry 4.0 on the construction site: Challenges and opportunities, IEEE Trans. Ind. Inform. 17 (2) (2020) 746–756.
[40] N. Pradhananga, J. Teizer, Automatic spatio-temporal analysis of construction site equipment operations using GPS data, Autom. Constr. 29 (2013) 107–122.
[41] S. Gong, X. Gao, Z. Li, L. Chen, Developing a dynamic supervision mechanism to improve construction safety investment supervision efficiency in China: Theoretical simulation of evolutionary game process, Int. J. Environ. Res. Public Health 18 (7) (2021) 3594.
[42] W. Lee, K.-Y. Lin, E. Seto, G.C. Migliaccio, Wearable sensors for monitoring on-duty and off-duty worker physiological status and activities in construction, Autom. Constr. 83 (2017) 341–353.
[44] E. Winsberg, Simulated experiments: Methodology for a virtual world, Philos. Sci. 70 (1) (2003) 105–125.
[45] S. Dick, Artificial intelligence, 2019.
[46] A.I. AI, The fourth industrial revolution, in: Presented at the American Psychological Association in Minneapolis, Vol. 4, 2022, p. 6.
[47] F. Tao, H. Zhang, A. Liu, A.Y. Nee, Digital twin in industry: State-of-the-art, IEEE Trans. Ind. Inform. 15 (4) (2018) 2405–2415.
[48] S. Haag, R. Anderl, Digital twin–Proof of concept, Manuf. Lett. 15 (2018) 64–66.
[49] J. Evans, C. Lindner, Business Analytics: The Next Frontier for Decision Sciences, Vol. 21, College of Business, University of Cincinnati, Decision Science Institute, 2012, p. 2018, http://www.cbpp.uaa.alaska.edu/afef/business_analytics.htm. Accessed on (12).
[50] G. Wang, A. Gunasekaran, E.W. Ngai, T. Papadopoulos, Big data analytics in logistics and supply chain management: Certain investigations for research and applications, Int. J. Prod. Econ. 176 (2016) 98–110.
[51] L.-C. Wang, Enhancing construction quality inspection and management using RFID technology, Autom. Constr. 17 (4) (2008) 467–479.
[52] G. Malacarne, G. Toller, C. Marcher, M. Riedl, D. Matt, Investigating benefits and criticisms of bim for construction scheduling in SMEs: An Italian case study, Int. J. Sustain. Dev. Plan. 13 (1) (2018) 139–150.
[53] S.-H. Leitner, W. Mahnke, OPC UA–Service-Oriented Architecture for Industrial Applications, Vol. 48, ABB Corporate Research Center, 2006, p. 22, (61–66).
[54] E. Curry, Message-oriented middleware, in: Middleware for Communications, Wiley Online Library, 2004, pp. 1–28.
[55] J. Yongguo, L. Qiang, Q. Changshuai, S. Jian, L. Qianqian, Message-oriented middleware: A review, in: 2019 5th International Conference on Big Data Computing and Communications, BIGCOM, IEEE, 2019, pp. 88–97.
[56] R.A. Andersen, L.H. Snyder, C.-S. Li, B. Stricanne, Coordinate transformations in the representation of spatial information, Curr. Opin. Neurobiol. 3 (2) (1993) 171–176.
[57] F. Zhang, X. Zhu, H. Dai, M. Ye, C. Zhu, Distribution-aware coordinate representation for human pose estimation, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 7093–7102.
[58] R. Gribonval, M. Nielsen, Sparse representations in unions of bases, IEEE Trans. Inform. Theory 49 (12) (2003) 3320–3325.
[59] X.X. Zhu, R. Bamler, A sparse image fusion algorithm with application to pan-sharpening, IEEE Trans. Geosci. Remote Sens. 51 (5) (2012) 2827–2836.
[60] H. Liu, T. Li, R. Hu, Y. Fu, J. Gu, H. Xiong, Joint representation learning for multi-modal transportation recommendation, in: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33, 2019, pp. 1036–1043.
[61] X. Zhai, Y. Peng, J. Xiao, Learning cross-media joint representation with sparse and semisupervised regularization, IEEE Trans. Circuits Syst. Video Technol. 24 (6) (2013) 965–978.
[62] M. Schmitt, X.X. Zhu, Data fusion and remote sensing: An ever-growing relationship, IEEE Geosci. Remote Sens. Mag. 4 (4) (2016) 6–23.
[63] U.S. Department of Defense, Data fusion lexicon: data fusion subpanel of the joint directors of laboratories technical panel for C3, 1991.
[64] G.J. Barton, et al., ALSCRIPT: a tool to format multiple sequence alignments, Protein Eng. Des. Sel. 6 (1) (1993) 37–40.
[65] H. Pang, S. Wei, G. Zhang, S. Zhang, S. Qiu, Y. Zhao, Heterogeneous feature alignment and fusion in cross-modal augmented space for composed image retrieval, IEEE Trans. Multimed. (2022).
[66] T.D. Wang, C. Plaisant, A.J. Quinn, R. Stanchak, S. Murphy, B. Shneiderman, Aligning temporal data by sentinel events: discovering patterns in electronic health records, in: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 2008, pp. 457–466.
[67] S. García, S. Ramírez-Gallego, J. Luengo, J.M. Benítez, F. Herrera, Big data preprocessing: methods and prospects, Big Data Anal. 1 (1) (2016) 1–22.
[68] K. Sayood, Introduction to Data Compression, Morgan Kaufmann, 2017.
[69] A.A. Bakar, Z.A. Othman, N.L.M. Shuib, Building a new taxonomy for data discretization techniques, in: 2009 2nd Conference on Data Mining and Optimization, IEEE, 2009, pp. 132–140.
[70] T. Taleb, K. Samdanis, B. Mada, H. Flinck, S. Dutta, D. Sabella, On multi-access edge computing: A survey of the emerging 5G network edge cloud architecture and orchestration, IEEE Commun. Surv. Tutor. 19 (3) (2017) 1657–1681.
[71] S. Sharma, Expanded cloud plumes hiding Big Data ecosystem, Future Gener. Comput. Syst. 59 (2016) 63–92.
[72] J. Dean, S. Ghemawat, MapReduce: Simplified data processing on large clusters, 2004.
[73] T. White, Hadoop: The Definitive Guide, O’Reilly Media, Inc., 2012.
[74] H. Karau, A. Konwinski, P. Wendell, M. Zaharia, Learning Spark: Lightning-Fast Big Data Analysis, O’Reilly Media, Inc., 2015.
[75] C. Prakash, Spark streaming vs flink vs storm vs kafka streams vs samza: Choose your stream processing framework, 2018, Why Not Learn Something Blogspot.
[76] J. Varia, S. Mathew, et al., Overview of Amazon Web Services, Vol. 105, Amazon Web Services, 2014.
[77] E. Bisong, An overview of google cloud platform services, in: Building Machine
[43] B. Choi, H. Jebelli, S. Lee, Feasibility analysis of electrodermal activity (EDA) Learning and Deep Learning Models on Google Cloud Platform, Springer, 2019,
acquired from wearable sensors to assess construction workers’ perceived risk, pp. 7–10.
Saf. Sci. 115 (2019) 110–120. [78] M.T. Jakóbczyk, Practical Oracle Cloud Infrastructure, Springer, 2020.
[79] J.R. Meredith, S.M. Shafer, S.J. Mantel Jr., Project Management: A Strategic Managerial Approach, John Wiley & Sons, 2017.
[80] X. Jiang, S. Mahadevan, Bayesian probabilistic inference for nonparametric damage detection of structures, J. Eng. Mech. 134 (10) (2008) 820–831.
[81] R. Fernando, R. Drogemuller, F. Salim, J. Burry, Patterns, heuristics for architectural design support: Making use of evolutionary modelling in design, in: A.I. Li, N. Gu, B. Dave, H.J. Park (Eds.), New Frontiers: Proceedings of the 15th International Conference on Computer-Aided Architectural Design Research in Asia, The Association for Computer-Aided Architectural Design Research in Asia, Hong Kong, 2010, pp. 283–292.
[82] T.S. Mahfouz, Construction Legal Support for Differing Site Conditions (DSC) through Statistical Modeling and Machine Learning (ML), Iowa State University, 2009.
[83] S. Li, J. Wen, Application of pattern matching method for detecting faults in air handling unit system, Autom. Constr. 43 (2014) 49–58.
[84] J. Gong, C.H. Caldas, C. Gordon, Learning and classifying actions of construction workers and equipment using Bag-of-Video-Feature-Words and Bayesian network models, Adv. Eng. Inform. 25 (4) (2011) 771–782.
[85] L. Soibelman, H. Kim, Data preparation process for construction knowledge generation through knowledge discovery in databases, J. Comput. Civ. Eng. 16 (1) (2002) 39–48.
[86] H. Kim, L. Soibelman, F. Grobler, Factor selection for delay analysis using Knowledge Discovery in Databases, Autom. Constr. 17 (5) (2008) 550–560.
[87] R.B. Buchheit, J. Garrett, S.R. Lee, R. Brahme, A knowledge discovery case study for the intelligent workplace, 2012, pp. 914–921.
[88] D.D. Ahiaga-Dagbui, S.D. Smith, Dealing with construction cost overruns using data mining, Constr. Manag. Econ. 32 (7–8) (2014) 682–694.
[89] A. Pradhan, B. Akinci, C.T. Haas, Formalisms for query capture and data source identification to support data fusion for construction productivity monitoring, Autom. Constr. 20 (4) (2011) 389–398.
[90] S. Bai, M. Li, R. Kong, S. Han, H. Li, L. Qin, Data mining approach to construction productivity prediction for cutter suction dredgers, Autom. Constr. 105 (2019) 102833.
[91] T. Rujirayanyong, J.J. Shi, A project-oriented data warehouse for construction, Autom. Constr. 15 (6) (2006) 800–807.
[92] C.-W. Liao, Y.-H. Perng, Data mining for occupational injuries in the Taiwan construction industry, Saf. Sci. 46 (7) (2008) 1091–1102.
[93] J. Wu, J. Zhang, New automated BIM object classification method to support BIM interoperability, J. Comput. Civ. Eng. 33 (5) (2019).
[94] R. Romero-Jarén, J.J. Arranz, Automatic segmentation and classification of BIM elements from point clouds, Autom. Constr. 124 (2021) 103576.
[95] M. Yamamoto, B.S. Gan, K. Fujita, J. Kurokawa, A genetic algorithm based form-finding for tensegrity structure, Procedia Eng. 14 (2011) 2949–2956.
[96] R. Mehanna, Resilient structures through machine learning and evolution, in: ACADIA 13: Adaptive Architecture [Proceedings of the 33rd Annual Conference of the Association for Computer Aided Design in Architecture (ACADIA) ISBN 978-1-926724-22-5], Cambridge, 24–26 October 2013, CUMINCAD, 2013, pp. 319–326.
[97] Y.-J. Chen, C.-W. Feng, Y.-R. Wang, H.-M. Wu, Using BIM model and genetic algorithms to optimize the crew assignment for construction project planning, Int. J. Technol. (3) (2011) 179–187.
[98] Y. Shin, H. Cho, K.-I. Kang, Simulation model incorporating genetic algorithms for optimal temporary hoist planning in high-rise building construction, Autom. Constr. 20 (5) (2011) 550–558.
[99] S. Hwang, Dynamic regression models for prediction of construction costs, J. Constr. Eng. Manag. 135 (5) (2009) 360–367.
[100] F. Zhang, H. Xiong, W. Shi, X. Ou, Structural health monitoring of Shanghai Tower during different stages using a Bayesian approach, Struct. Control Health Monit. 23 (11) (2016) 1366–1384.
[101] K. Kang, H. Ryu, Predicting types of occupational accidents at construction sites in Korea using random forest model, Saf. Sci. 120 (2019) 226–236.
[102] Ö.A. Kale, S. Baradan, Identifying factors that contribute to severity of construction injuries using logistic regression model, Tek. Dergi 31 (2) (2020) 9919–9940.
[103] H.-B. Liu, Y.-B. Jiao, Application of genetic algorithm-support vector machine (GA-SVM) for damage identification of bridge, Int. J. Comput. Intell. Appl. 10 (4) (2011) 383–397.
[104] M.R. Jahanshahi, S.F. Masri, Adaptive vision-based crack detection using 3D scene reconstruction for condition assessment of structures, Autom. Constr. 22 (2012) 567–576.
[105] A. Asadi, M. Alsubaey, C. Makatsoris, A machine learning approach for predicting delays in construction logistics, Int. J. Adv. Logist. 4 (2) (2015) 115–130.
[106] A. Ardeshir, P. Farnood Ahmadi, H. Bayat, A prioritization model for HSE risk assessment using combined failure mode, effect analysis, and fuzzy inference system: A case study in Iranian construction industry, Int. J. Eng. 31 (9) (2018) 1487–1497.
[107] K. Pietrzyk, A systemic approach to moisture problems in buildings for mould safety modelling, Build. Environ. 86 (2015) 50–60.
[108] Y.-J. Cha, K. You, W. Choi, Vision-based detection of loosened bolts using the Hough transform and support vector machines, Autom. Constr. 71 (2016) 181–188.
[109] D. Dehestani, F. Eftekhari, Y. Guo, S.S. Ling, S. Su, H.T. Nguyen, Online support vector machine application for model based fault detection and isolation of HVAC system, 2011.
[110] D. Kim, S. Lee, V.R. Kamat, Proximity prediction of mobile objects to prevent contact-driven accidents in co-robotic construction, J. Comput. Civ. Eng. 34 (4) (2020).
[111] M. Kamari, Y. Ham, Vision-based volumetric measurements via deep learning-based point cloud segmentation for material management in jobsites, Autom. Constr. 121 (2021) 103430.
[112] K.M. Rashid, J. Louis, Times-series data augmentation and deep learning for construction equipment activity recognition, Adv. Eng. Inform. 42 (2019) 100944.
[113] K.M. El-Gohary, R.F. Aziz, H.A. Abdel-Khalek, Engineering approach using ANN to improve and predict construction labor productivity under different influences, J. Constr. Eng. Manag. 143 (8) (2017) 04017045.
[114] Measuring and benchmarking the productivity of excavators in infrastructure projects: A deep neural network approach, Autom. Constr. 124 (2021) 103532.
[115] Q. Fang, H. Li, X. Luo, L. Ding, T.M. Rose, W. An, Y. Yu, A deep learning-based method for detecting non-certified work on construction sites, Adv. Eng. Inform. 35 (2018) 56–68.
[116] M. Juszczyk, K. Zima, W. Lelek, Forecasting of sports fields construction costs aided by ensembles of neural networks, J. Civ. Eng. Manag. 25 (7) (2019) 715–729.
[117] T. Slaton, C. Hernandez, R. Akhavian, Construction activity recognition with convolutional recurrent networks, Autom. Constr. 113 (2020) 103138.
[118] F. Zhang, H. Fleyeh, X. Wang, M. Lu, Construction site accident analysis using text mining and natural language processing techniques, Autom. Constr. 99 (2019) 238–248.
[119] A.J.-P. Tixier, M.R. Hallowell, B. Rajagopalan, D. Bowman, Construction safety clash detection: identifying safety incompatibilities among fundamental attributes using data mining, Autom. Constr. 74 (2017) 39–54.
[120] J. Wu, N. Cai, W. Chen, H. Wang, G. Wang, Automatic detection of hardhats worn by construction personnel: A deep learning approach and benchmark dataset, Autom. Constr. 106 (2019) 102894.
[121] B.U. Ayhan, O.B. Tokdemir, Predicting the outcome of construction incidents, Saf. Sci. 113 (2019) 91–104.
[122] Y. Zhao, Q. Chen, W. Cao, J. Yang, J. Xiong, G. Gui, Deep learning for risk detection and trajectory tracking at construction sites, IEEE Access 7 (2019) 30905–30912.
[123] D.J. Lowe, M.W. Emsley, A. Harding, Predicting construction cost using multiple regression techniques, J. Constr. Eng. Manag. 132 (7) (2006) 750–758.
[124] M.H. Rafiei, H. Adeli, Novel machine-learning model for estimating construction costs considering economic variables and indexes, J. Constr. Eng. Manag. 144 (12) (2018) 04018106.
[125] S. Bayram, M.E. Ocal, E. Laptali Oral, C.D. Atis, Comparison of multi layer perceptron (MLP) and radial basis function (RBF) for construction cost estimation: the case of Turkey, J. Civ. Eng. Manag. 22 (4) (2016) 480–490.
[126] M.-Y. Cheng, N.-D. Hoang, A.F. Roy, Y.-W. Wu, A novel time-depended evolutionary fuzzy SVM inference model for estimating construction project at completion, Eng. Appl. Artif. Intell. 25 (4) (2012) 744–752.
[127] J.-B. Yu, Y. Yu, L.-N. Wang, Z. Yuan, X. Ji, The knowledge modeling system of ready-mixed concrete enterprise and artificial intelligence with ANN-GA for manufacturing production, J. Intell. Manuf. 27 (4) (2016) 905–914.
[128] Business Application Research Center (BARC), Big Data Use Cases 2015 – Getting Real On Data Monetization, 2016.
[129] McKinsey & Company, Imagining Construction’s Digital Future, 2016.
[130] Dodge Data & Analytics, Improving Performance with Project Data - How Improved Collection and Analysis is Leading the Digital Transformation of the Construction Industry, 2019.
[131] J. Barata, P.R.d. Cunha, Safety is the new black: the increasing role of wearables in occupational health and safety in construction, in: International Conference on Business Information Systems, Springer, 2019, pp. 526–537.
[132] M.S. Aslam, B. Huang, L. Cui, Review of construction and demolition waste management in China and USA, J. Environ. Manag. 264 (2020) 110445.
[133] X. Deng, G. Liu, J. Hao, A study of construction and demolition waste management in Hong Kong, in: 2008 4th International Conference on Wireless Communications, Networking and Mobile Computing, IEEE, 2008, pp. 1–4.
[134] O. Moselhi, A. Alshibani, Optimization of earthmoving operations in heavy civil engineering projects, J. Constr. Eng. Manag. 135 (10) (2009) 948–954.
[135] N. Van Tam, N.L. Huong, N.B. Ngoc, Factors affecting labour productivity of construction worker on construction site: A case of Hanoi, J. Sci. Technol. Civ. Eng. (STCE)-HUCE 12 (5) (2018) 127–138.
[136] F. Li, R. Xie, B. Yang, L. Guo, P. Ma, J. Shi, J. Ye, W. Song, Detection and identification of cyber and physical attacks on distribution power grids with PVs: An online high-dimensional data-driven approach, IEEE J. Emerg. Sel. Top. Power Electron. 10 (1) (2022) 1282–1291.
[137] D. Lazer, R. Kennedy, G. King, A. Vespignani, The parable of Google Flu: traps in big data analysis, Science 343 (6176) (2014) 1203–1205.
[138] J. Fan, F. Han, H. Liu, Challenges of big data analysis, Natl. Sci. Rev. 1 (2) (2014) 293–314.
[139] W. Wang, H. Xu, M. Alazab, T.R. Gadekallu, Z. Han, C. Su, Blockchain-based reliable and efficient certificateless signature for IIoT devices, IEEE Trans. Ind. Inform. (2021).
[140] C.-C. Sun, A. Hahn, C.-C. Liu, Cyber security of a power grid: State-of-the-art, Int. J. Electr. Power Energy Syst. 99 (2018) 45–56.
[141] F. Li, Y. Shi, A. Shinde, J. Ye, W.-Z. Song, Enhanced cyber-physical security in internet of things through energy auditing, IEEE Internet Things J. 6 (3) (2019) 5224–5231.
[142] L. Zhao, J. Li, Q. Li, F. Li, A federated learning framework for detecting false data injection attacks in solar farms, IEEE Trans. Power Electron. 37 (3) (2021) 2496–2501.
[143] Y. Lu, X. Huang, Y. Dai, S. Maharjan, Y. Zhang, Blockchain and federated learning for privacy-preserved data sharing in industrial IoT, IEEE Trans. Ind. Inform. 16 (6) (2019) 4177–4186.
[144] S. Li, L. Da Xu, S. Zhao, 5G Internet of Things: A survey, J. Ind. Inf. Integr. 10 (2018) 1–9.
[145] Y. Lu, X. Zheng, 6G: A survey on technologies, scenarios, challenges, and the related issues, J. Ind. Inf. Integr. 19 (2020) 100158.
[146] Y. Chen, M. Li, P. Chen, S. Xia, Survey of cross-technology communication for IoT heterogeneous devices, IET Commun. 13 (12) (2019) 1709–1720.