Implementing Pointnet For Point Cloud Segmentation in The Heritage Context
Abstract
Automated Heritage Building Information Modelling (HBIM) from the point cloud data has been researched in the
last decade as HBIM can be the integrated data model to bring together diverse sources of complex cultural content
relating to heritage buildings. However, HBIM modelling from the scan data of heritage buildings is mainly manual
and image processing techniques are insufficient for the segmentation of point cloud data to speed up and enhance
the current workflow for HBIM modelling. Artificial Intelligence (AI) based deep learning methods such as PointNet
are introduced in the literature for point cloud segmentation. Yet, their use is mainly for manufactured and clear
geometric shapes and components. To what extent PointNet based segmentation is applicable for heritage build-
ings and how PointNet can be used for point cloud segmentation with the best possible accuracy (ACC) are tested
and analysed in this paper. In this study, classification and segmentation processes are performed on the 3D point
cloud data of heritage buildings in Gaziantep, Turkey. Accordingly, it proposes a novel activity workflow for point
cloud segmentation with deep learning using PointNet for heritage buildings. Twenty-eight case study heritage
buildings are used, and AI training is performed using five feature labels for segmentation, namely walls, roofs,
floors, doors, and windows, for each of these 28 heritage buildings. The dataset is divided into an 80% training set
and a 20% prediction test set. The PointNet algorithm was unable to provide sufficient accuracy in segmenting the
point clouds due to deformation and deterioration in the existing conditions of the heritage case study buildings.
However, if the PointNet algorithm is trained with restitution-based heritage data, which is called synthetic data
in this research, it provides high accuracy. Thus, the proposed approach can build the baseline
for the accurate classification and segmentation of the heritage buildings.
Keywords: Deep learning, Artificial intelligence, Cultural heritage, Segmentation, 3D point cloud
Haznedar et al. Heritage Science (2023) 11:2 Page 2 of 18
5], multi-view-based [6, 7], graph-based [8–10] and set-based [11, 12]. OctNet [13] and Kd-Net [14], created by using the advantages of the voxelization-based method, are two different methods that reduce the computational cost. In these methods, the voxel that is expressed as empty in the data allocated to the voxels is not included in the calculation, thus saving both time and memory. The multi-view-based method [6, 7] defines the 3D point cloud as a series of images taken from different angles. The number of images taken from different angles, the image distribution, and the radial distances between images are not at regular intervals. Therefore, different parameters are required for each study, and it is often described as an indefinite method. The graph-based method [8–10] is a Convolutional Neural Network (CNN)-based method that processes the neighbourhoods of each point in the point cloud in planar space and then creates the final planar space graph.

Methods that require obtaining 2D images or scanning the entire point cloud in order to segment 3D data are not cost and time effective. Therefore, there is a need for solutions that can work directly on the point cloud without pre-processing. In the part segmentation study by Yi et al. [15], a method for object segmentation was proposed over point cloud data belonging to 16 different categories containing different numbers of data. According to this method, different regions of the object were determined in each object category and the system was trained in this direction. Deep learning methods using a total of 95,000 data were supported by different framework methods and a structure called the Scalable Active Framework was created. With this part segmentation method, an F1 score varying between 85% and 95% was obtained in 16 different categories.

PointNet [16], which is an end-to-end deep neural network architecture that allows working directly on the point cloud and can be used for classification, part segmentation and semantic segmentation, is one of the pioneering studies in this field. Using the PointNet architecture, the semantic segmentation performance obtained was 83.7%. The authors, who stated that PointNet could not capture local geometries over time, presented the PointNet++ [17] architecture as a new study. In this study, a hierarchical grouping was made to identify local features. More details on the point cloud can be captured using point-to-point metric calculations.

This paper aims to propose an approach for segmentation of point cloud data for heritage structures using the PointNet deep learning algorithm. There is currently a significant gap in research and practice on the automated segmentation of point cloud data for heritage buildings towards automated HBIM modelling. Previous research and the literature review show that it is necessary to future-proof digital records of historical buildings to ensure that their components can be reliably located through tagging, such as semantically recognizable doors, windows, and walls. However, within the field of document analysis and pattern recognition in cultural heritage, it is widely recognized that current pattern recognition and deep learning methods are inadequate for the analysis/recognition of degraded, information-rich historical buildings, since most work in the literature has concentrated on relatively narrow-scope objects, such as textual documents or small 3D artefacts, rather than buildings.

Hence, this paper examines and proposes a segmentation approach using PointNet for heritage buildings' point cloud data. In this study, classification and segmentation processes are performed on 3D point cloud data of the heritage buildings in Gaziantep in Turkey. In this process, the segmentation of the historical structure, which is the most comprehensive step to create a BIM model, is achieved using artificial intelligence and deep learning methods, and the results are examined.

Related works
In this section, the studies that focused on methods similar to those in this study related to the segmentation of point cloud data have been critically reviewed. In a study by Shen et al. [18], 3D point clusters are defined as 3D data stacks whose correlation can be calculated, which can respond jointly to neighbouring points and can learn. The two methods named Edge-Conditioned Convolution (ECC) [19] and Superpoint Graph (SPG) [20] are based on the graph-based method that proposes to create convolution filters using graph weights. Since these methods can only operate on predefined weights, they have been effective only on certain data structures. Therefore, they are not recommended methods in the literature.

According to Wang et al. [21], the set-based method can be applied directly to point-level data. However, it is a method that is not preferred in semantic segmentation studies since it ignores the neighbourhood relations that contain structural information between the points.

In a CNN-based study by Su et al. [22] for object identification, a network model trained with 2D images was created to describe 3D images. The dataset known as ModelNet40 was used to train the created model and 90.1% ACC was obtained. Different from the ModelNet40 dataset, which is widely used in part segmentation and classification studies, the results obtained using the Stanford Large-Scale 3D Indoor Spaces (S3DIS) [23] dataset used in the studies [24–26] for structure segmentation are detailed in the "Comparison with literature findings" section.

In a study by Hackel et al. [27], a trained network was created using different datasets for the classification and
segmentation of 3D point cloud data. In this study, unlike other studies cited as a reference, a confusion matrix was also included in the evaluation. Ma et al. [28] conducted a study in which PointNet and Dynamic Graph Convolutional Neural Network (DGCNN) architectures were used together for the semantic segmentation of BIM models and point cloud data in 2020. In their study, the S3DIS dataset, which consists of undeformed data, was taken as reference. For the creation of the synthetic data from restitution information, one field out of six fields was selected in this dataset, and the synthetic data was produced using 44 rooms from the chosen area. The DGCNN algorithm outstripped the PointNet algorithm in both synthetic and real point cloud data for 12 classes: ceiling, floor, wall, beam, column, window, door, chair, table, bookcase, sofa, and board.

Stasinakis et al. [29] applied a method called Generative Adversarial Networks (GAN)-based cascaded refinement network on fragmented archaeological objects. This method was performed for self-supervised data augmentation using high-level geometry techniques and achieved successful results.

Perez-Perez et al. [30] presented an approach called Scan2BIM-NET, which is a deep learning network model used in mechanical, structural, architectural, and component segmentation. In this approach, which can be processed with point cloud data, two CNNs and one Recurrent Neural Network (RNN) were used. Operations were performed on six different classes, namely beam, ceiling, column, floor, pipe, and wall. In the dataset used, the average accuracy value was obtained as 86.13%.

Pierdicca et al. [31] used a deep learning network that was trained using the Architectural Cultural Heritage (ArCH) dataset to achieve semantic segmentation. In this dataset, in addition to XYZ values, Hue-Saturation-Value (HSV) and Red-Green-Blue (RGB) values were used for training of the proposed model called DGCNN. In this respect, it differs from the point cloud features used in the literature. This method surpassed the PointNet architecture, which has become a reference for point cloud segmentation, with 74.8% precision, 74.2% recall and 72.2% f1 score.

Matrone et al. [32] proposed a hybrid method combining the DGCNN, DGCNN-Mod and DGCNN-3Dfeat methods used in the literature. When the results of these three methods are examined, DGCNN alone has 0.37 IoU and 0.79 f1-score, while DGCNN-Mod and 3Dfeat have 0.59 IoU and 0.91 f1 score. The results were obtained using the publicly available ArCH dataset.

Model definition, analysis and conservation steps, which are important factors affecting the success of the model in deep learning studies, must be completed correctly. Teruggi et al. [33] presented a study recommending the use of machine learning methods with the multi-level and multi-resolution (MLMR) approach. In their study, two large-scale and complex datasets were used. According to the three-level classification results made with these datasets, an f1 score of over 90% was obtained at each level.

Croce et al. [34] used heritage-building information models based on semi-automatic methods for 3D reconstruction. In these methods, the correct conversion of semantic information, the correct application of feature selection methods, data marking and conversion to the HBIM model were considered. This is one of the examples of a hybrid method that combines ML and DL methods to generate geometry in Revit BIM software successfully and ultimately outputs HBIM in IFC format.

In a study by Rodrigues et al. [35], besides the segmentation methods used in the literature, anomaly detection studies were conducted on point cloud data using known architectures such as ResNet. After various augmentations applied to the data collected as images, conversions from image data to point cloud data were made and integrated into the BIM model. This study can be considered a reference, but it lags behind in the literature with its 0.60 f1 score.

In cases where CNN networks are not effective in terms of both time and cost on large datasets, structures called transformers can be included in the network. Liu et al. [36] proposed an architecture called TR-Net in which classification and segmentation units are defined in a transformer consisting of encoder and decoder blocks. Global features obtained from the encoder are given as input to both classification and segmentation units. According to the studies on the benchmark data, TR-Net outperformed PointNet (83.77%) and PointNet++ (85.1%) in part segmentation with a mIoU value of 85.3%.

By taking into consideration the latest developments in the literature on point cloud segmentation with AI, this paper proposes a novel approach for increasing the accuracy of segmentation with PointNet for point cloud data of heritage buildings. The next section provides the methodology and research design for the formulation of the proposed novel approach for point cloud segmentation with PointNet at higher accuracy.

Materials and methods
Research methodology: case study
Heritage buildings at risk in Gaziantep, Turkey, are selected as case studies, provided by the Heritage Conservation Department of the Gaziantep Metropolitan Municipality (KUDEB), which is an active partner in the project as the end user. Thus, experts from KUDEB also validate the research outcomes and the related test
Fig. 3 HBIM laser scanner data a 'Building_1' RGB data, b 'Building_1_room_1' RGB data, c 'Building_1_room_2' RGB data
with the aim of increasing data, is separated from each other in terms of width, height and amount of deformation. For this reason, working on separate rooms didn't affect the model performance in terms of overfitting or underfitting. In addition, the other reason why the buildings are divided into rooms is that the existing cultural and historical building data [23, 32, 33] do not match the deformed data discussed in this study and sufficient data cannot be obtained.

Using a large amount of data and data diversity is important to achieve accurate results in the training of deep learning models. However, the HBIM dataset used in this study contains too many deformed building elements and the number of point cloud data is limited. For this reason, data generation from the restitution information of the heritage case studies was carried out with the feedback method in a reverse engineering strategy. This reverse engineering process included 3D BIM modelling from the restitution information, then conversion of the 3D BIM model to 3D point cloud data for the training of the PointNet algorithm. BIM models were imported into the CloudCompare platform in FBX file format. The amount of data for the deep learning network was increased with the 11 restitution point cloud data structures (converted from the 3D HBIM model to point cloud) that were created and included in the system. The point cloud representations of the labels of the restitution data are given in Fig. 4.

A frequently mentioned concept to describe the information richness of BIM objects is 'Level of Detail' or also
Fig. 4 HBIM Restitution data a 'Wall' label, b 'Ceiling' label, c 'Door' label, d 'Window' label, e 'Floor' label
referred to as 'Level of Development' (LoD). LoDs allow specifying the amount of detail and generalization present in the 3D model. In this use case, the LoD of the synthetic data is an important factor as it contributes to the accuracy of the deep learning network. There are different levels of development in the literature whose definitions differ in geometric accuracy, quality or completeness of semantic information. One of them, LoD200, is a design development of a product which contains geometry information [37–39]. Point cloud data contains precise geometric information such as width, length, height, and detail sizes in itself, but not semantic information, and therefore the synthetic data was generated at the LoD200 level, like the scan data. Some building examples obtained from the synthetic data generation process at the LoD200 level used in DNN training are given in Table 1.

The synthetic data we call restitution data are produced by the feedback method, also known as reverse engineering. While performing the reverse engineering application, the point cloud was produced in 3 steps. These steps create 2D restitution information, create 3D HBIM models from the 2D restitution information, and convert these models to point clouds. This process uses survey and restitution data to train the deep learning network. Five labels for each room of the building were determined as door, window, wall, floor, and ceiling, defined as unique building elements. The labelling process of the 140 rooms and 11 restitution data used is shown in Fig. 5.

During the labelling process, the unique architectural features of these historical buildings are considered. Point cloud datasets are labelled with point cloud processing software. First, a model was produced by giving coordinates to the corners of the labelled building elements, as in Fig. 5. However, it was determined that the model cannot be created for some building elements by only giving coordinates to the corner points. As a result, a second method was developed to perform the segmentation of building elements.

The second method is the process of location-based separation of the structural element to be segmented from the entire structure that has been laser-scanned. This process is performed by leaving the individual building elements in isolation from the whole building data and saving the isolated element as a separate file without changing or distorting its location and coordinates. The building elements were recorded by naming them according to the room and type. This way, a more accurate model was obtained by giving coordinates to each point of the defined elements.

The PointNet model trained with the classified data was implemented in the segmentation of the other point cloud data. The intersection over union (IoU) value computation method, known as the Jaccard index [40], was used to measure and verify the performance of the segmentation process. The IoU value is a frequently used verification and measurement method for object detection [41], object segmentation [42], and the definition of workspaces. This value measures the similarity between the ground truth and the model prediction.

The IoU calculation method is the intersection of the ground truth and the predicted area divided by the combination of these two areas, as shown in Eq. (1). Ground truth is the volume calculated using point cloud data of historical buildings.

Score(IoU) = Area_of_overlap / Area_of_union    (1)

PointNet algorithm architecture
In the PointNet architecture given in Fig. 6, the input layer consists of a set of Multi-Layer Perceptrons (MLPs) that use the properties of point clouds. In the layer known as the Max Pooling layer, the symmetric properties of the input data are used, the input permutation calculations are made, and the global values of the data are calculated. Fully Connected Layers, known as the last layer, perform label prediction and classification.

In the PointNet network, 3D data consisting of n points is taken as input. To transform the input data, the input transform and feature transform operations are performed, which enable the independent transformation of each point. The schema showing this transformation is given in Fig. 7. In the most general terms, PointNet takes a series of (x, y, z) coordinate values, and each point in this coordinate array is in the form of labelled data. It is an integrated system that can classify and segment through calculations on coordinate values and determination of the surface normal values. Three basic modules make up this integrated system. These modules are explained by Qi et al. [16] as follows.

The Symmetry Function for Unordered Input module is described as ordering a set of irregular data in an understandable order, training the ordered data using the RNN network, and generating a new set of vectors using a symmetric function.

PointNet processes the n input data in an artificial neural network known as an MLP to obtain regular data. After the input is transformed (64, 64), it is passed through the MLP network again for the feature transformation (64, 128, 1024) and the input data is converted into regular information of n×1024 dimension. It is proven in the literature that high performance is achieved with the use of RNN networks on 3D point cloud data. To create a suitable RNN network in the PointNet network, our input data must be based on a universal function. This function is shown in Eq. (2).
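The IoU measure of Eq. (1) can be made concrete for per-point class labels. The following Python sketch (the function names are ours for illustration, not from the paper's code) computes the per-class and mean IoU used to verify a segmentation:

```python
import numpy as np

def label_iou(pred, truth, label):
    """IoU (Jaccard index) of one class, following Eq. (1):
    overlap / union over per-point label arrays."""
    p = (pred == label)
    t = (truth == label)
    union = np.logical_or(p, t).sum()
    if union == 0:
        return float("nan")  # class absent in both prediction and ground truth
    return np.logical_and(p, t).sum() / union

def mean_iou(pred, truth, labels):
    """Average IoU over the given class ids, skipping absent classes."""
    scores = [label_iou(pred, truth, lab) for lab in labels]
    scores = [s for s in scores if not np.isnan(s)]
    return sum(scores) / len(scores)
```

A prediction that labels one wall point as ceiling, for instance, lowers the IoU of both classes, since the mislabelled point enlarges each class's union without enlarging its overlap.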
Table 1 Some building examples obtained from the synthetic data generation process (Buildings 10, 11 and 12, each shown as HBIM Model (.rvt), Solid Model (.fbx) and Point Cloud (.txt))
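The point cloud .txt files at the end of this pipeline can be consumed directly for training. As a sketch only (the 'x y z label' column layout and the label ids below are our assumptions, not the authors' documented CloudCompare export settings), loading and normalizing such a file might look like:

```python
import numpy as np

# Hypothetical integer ids for the five building-element labels used in this study.
LABELS = {"wall": 0, "ceiling": 1, "door": 2, "window": 3, "floor": 4}

def load_labelled_cloud(path):
    """Load a point cloud exported as plain text, one point per line.

    Assumes a simple 'x y z label_id' layout; the actual export format
    depends on the options chosen in the conversion tool.
    """
    data = np.loadtxt(path)
    points = data[:, :3].astype(np.float32)  # XYZ coordinates
    labels = data[:, 3].astype(np.int64)     # per-point class id
    return points, labels

def normalize_unit_sphere(points):
    """Centre the cloud and scale it into the unit sphere, a common
    preprocessing step before PointNet training."""
    centred = points - points.mean(axis=0)
    scale = np.linalg.norm(centred, axis=1).max()
    return centred / scale
```

Note that normalization is applied per room or per building, so the original coordinates, which the location-based labelling method depends on, should be kept alongside the normalized copy.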
Fig. 8 Research process plan a classification and segmentation, b transform invariance, c permutation (order) invariance.
dimensions to 64 dimensions and then from 64 dimensions to 1024 dimensions. These processes, detailed as input transforms and feature transforms in the previous sections, constitute the first stage of the classification process.

Deep learning architectures are used to directly consume point clouds and respect the permutation invariance of points in the input, capable of reasoning about 3D geometric data such as point clouds or meshes. In the step called permutation invariance, presented in Fig. 8, an MLP network was used to obtain global features and local point features. For an array containing N points, N! orderings arise, and all N! cases must mean the same thing for a single point. Therefore, all probabilities must be based and fixed on a single function.

Global and local features are obtained as the output of the MLP network after fixation using a symmetric function. While global feature vectors are used in the classification, segmentation can be performed when they are used together with local point features. The vector defined as R^1088 for each point in the MLP network used in the segmentation process is converted into an array of n×m dimensions. Here, n is the number of points and m is the number of classes.

A point cloud dataset collected with 3D laser scanners was created. The objects of the dataset were labelled as doors, windows, walls, etc. This process was labour-intensive and manual. The dataset was divided into 3 groups for training, verification, and testing. This separation was done at 70%, 10% and 20%, respectively. Weights were created by training the training and validation datasets with the PointNet model. The test dataset with a 20% rate was used to measure the test success of the trained model.

Point cloud segmentation approach on heritage buildings with PointNet
The point cloud dataset of Gaziantep historical buildings shown in Fig. 1 and the BIM object catalogue produced from the restitution information of historical buildings are used as input data for the training of the learning network. Process diagrams for processing a point cloud and performing its semantic segmentation are shown in Fig. 9. Also, Fig. 10 contains detailed information on how this process works in the HBIM integrated system.

Input data and data preparation: Heritage buildings scanned by 3D laser scanners were converted into point cloud data and a dataset was created. The collected 3D point cloud data was tagged and made ready for AI training. In addition, BIM models produced using the Revit program were converted into point cloud data. It was automatically tagged during the conversion process, making it ready for AI training.
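The tensor shapes described above (an n×3 input, shared per-point MLPs to 64 and 1024 dimensions, a max-pooled 1024-d global feature, and the n×1088 concatenation feeding an n×m segmentation output) can be illustrated with a toy numpy sketch. The weights here are random and untrained, so this only demonstrates the shape flow and the permutation invariance of the max-pooled global feature, not the actual trained PointNet:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 128, 5  # n points; m = 5 classes (wall, ceiling, door, window, floor)

# Shared per-point MLPs are sketched as plain matrix multiplies with random
# weights; in the real PointNet these are trained 1x1 convolution layers.
W1 = rng.normal(size=(3, 64))     # point features:  n x 3    -> n x 64
W2 = rng.normal(size=(64, 1024))  # high-dim:        n x 64   -> n x 1024
W3 = rng.normal(size=(1088, m))   # seg head:        n x 1088 -> n x m

def segment(points):
    local = np.maximum(points @ W1, 0)   # n x 64 local point features
    high = np.maximum(local @ W2, 0)     # n x 1024
    global_feat = high.max(axis=0)       # max pool: 1024-d, order-invariant
    tiled = np.tile(global_feat, (len(points), 1))
    concat = np.concatenate([local, tiled], axis=1)  # n x 1088 per point
    return concat @ W3                   # n x m per-point class scores

pts = rng.normal(size=(n, 3))
scores = segment(pts)
```

Because the global feature is a maximum over points, shuffling the input order leaves it unchanged, and the per-point scores simply follow the shuffle; this is the symmetric-function property the text describes.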
Fig. 11 HBIM segmented data a RGB 'Data_1', b Segmented RGB 'Data_1', c RGB 'Data_2', d Segmented RGB 'Data_2'
used as test data and the same level of performance was obtained. The accuracy and loss graphs obtained using laser scanning and restitution data in the deep learning network are shown in Fig. 13.

A few results of the segmentation made with restitution data created to support the original Gaziantep Cultural Heritage data are shown in Fig. 14. According to these results, 84.22% ACC was obtained from the data we named structure_29, 85.89% test ACC from structure_30, and 77.23% test ACC from structure_25. When only the given 3 structures were evaluated, the average test accuracy was 82.98%.

Segmentation results using the original Gaziantep Cultural Heritage data are shown in Fig. 15. According to the segmentation result, 91.13% test ACC and a 0.20 loss value were obtained using building_1_room_1, and 90.70% test ACC and a 41.06 loss value were obtained for building_2_room_7.
Training dataset | Number of buildings / rooms | Training data | Test data | Train ACC | Test ACC | Train loss | Test loss
HBIM Gaziantep Cultural Heritage | 18 HBIM building / 100 room | Area_1, Area_2, Area_3, Area_4, Area_5 | Area_6 (Laser Scanner data) | 0.7147 | 0.5783 | 1.1179 | 1.8003
PointNet-Stanford | 6 PointNet building / 271 room | Area_1, Area_2, Area_3, Area_4, Area_5 | Area_6 | 0.8102 | 0.8200 | 0.7072 | 0.7526
HBIM Gaziantep Cultural Heritage + 1 PointNet building / 40 room | 18 HBIM building / 100 room | Area_2, Area_3, Area_4, Area_5, Area_6 | Area_1 (Laser Scanner data) | 0.9230 | 0.8793 | 0.1818 | 0.4410
HBIM Gaziantep Cultural Heritage + 1 PointNet building / 40 room + 11 restitution building | 18 HBIM building / 100 room | Building_6, Building_7, …, Building_21, Building_22 | Building_30 (Restitution data) | 0.9514 | 0.8652 | 0.3320 | 0.4150
(same training set) | | | Building_29 (Restitution data) | 0.9514 | 0.8159 | 0.3320 | 0.7734
(same training set) | | | Building_25 (Restitution data) | 0.9514 | 0.9120 | 0.3320 | 0.3250
Fig. 13 HBIM dataset a ‘Restitution Data’ test results, b ‘Laser Scanner Data’ test results
Fig. 14 Segmented restitution data a restitution ‘Building_29’, b restitution ‘Building_30’, c restitution ‘Building_25’
The HBIM dataset is a dataset created with the deformed data given in Figs. 11 and 15. In this respect, it should be evaluated differently from the segmentation studies, examples of which we have seen in the literature. When working with these data, the expected result is lower performance than the examples in the literature.

In this study, we proposed and implemented new methods to improve the accuracy of the segmentation results with PointNet deep learning. The most appropriate segmented data for the case study buildings were used to increase the training and test performance and to obtain the closest results to the truth. With the use of restitution data produced via reverse engineering
Fig. 15 Segmented data a Gaziantep 'Cultural Heritage_2_room_7', b Gaziantep 'Cultural Heritage_1_room_1'
approach from the restitution data, the learning network was transformed into an integrated system consisting of both laser scan data of existing conditions and the restitution data produced by using the characteristics of historical buildings in Gaziantep. The results of the test obtained using the new dataset consisting of laser scanning and restitution data as training data are detailed in Table 4.

The IoU value for each label of the laser scanning and restitution data is given in Table 4. When these values are examined, it is seen that the IoU value of the window obtained from the laser scanning data alone is very low. However, significant increases were recorded in the IoU values obtained using laser scanning and restitution data together. The effect of restitution data on the IoU value of each label is shown in Fig. 16. As mentioned in the previous sections, because the windows and doors are very similar both visually and in size in the deep learning network created, the desired results in these two labels could not be obtained. As seen in Fig. 16, when restitution data is used for AI training, segmentation accuracy for windows and doors is relatively high and satisfactory.

Comparison with literature findings
The common feature of these studies, which are referenced in the segmentation area and compared in Table 5, is that the data used are clear and clean. The results that can be obtained using the 3D point cloud datasets cited in the references are predictable.

The dataset used in this study is 3D laser scanning data obtained from damaged historical buildings that were not used in the literature before. In addition, restitution models of damaged buildings were used, and data augmentation was performed. The HBIM model will have a unique place in the literature.

The success rates of the studies in the literature that perform segmentation using 3D point cloud datasets are listed in Table 5. The created list is aimed at comparing accuracy values across studies using different networks with point cloud datasets. The studies in the list generally used the point cloud dataset, which
Table 4 Comparison of the restitution data and Laser scanner data IoU results
Number of building Types of data Window (IoU) Wall (IoU) Ceiling (IoU) Door (IoU) Floor (IoU) Average (IoU)
Year | Study | Method | Dataset | ACC (%)
2016 | Generative and discriminative voxel modeling with convolutional neural networks | MVCNN [6] | ModelNet40 | 90.1
2016 | Fast semantic segmentation of 3D point clouds with strongly varying density | TMLC-MSR [27] | TerraMobilita | 90.28
2017 | A scalable active framework for region annotation in 3D shape collections | Yi [15] | ModelNet40 | 81.4
2017 | OctNet: learning deep 3D representations at high resolutions | Oct-Net [13] | ModelNet10 | 81.5
2017 | Escape from cells: Deep Kd-networks for the recognition of 3D point cloud models | Kd-Net [14] | ModelNet40 | 82.3
2017 | PointNet: deep learning on point sets for 3D classification and segmentation | PointNet [16] | ModelNet40 | 83.7
2017 | PointNet++: deep hierarchical feature learning on point sets in a metric space | PointNet++ [17] | ModelNet40 | 90.7
2017 | SEGCloud: semantic segmentation of 3D point clouds | SegCloud [24] | S3DIS | 88.1
2017 | Dynamic edge-conditioned filters in convolutional neural networks on graphs | ECC [19] | ModelNet40 | 82.4
2017 | Unstructured point cloud semantic labeling using deep segmentation networks | SnapNet [25] | S3DIS | 88.6
2017 | Deep projective 3D semantic segmentation | DeePr3SS [26] | S3DIS | 88.9
2018 | Mining point cloud local structures by kernel correlation and graph pooling | KCNet [18] | ModelNet40 | 91.0
2018 | Large-scale point cloud semantic segmentation with superpoint graphs | SPG [20] | S3DIS | 85.5
2019 | Graph attention convolution for point cloud semantic segmentation | GACNet [21] | S3DIS | 87.79
is the output of the laser scanner device in the machine and which gap it will fill. Thus, the most similar literature
learning process. In our study, laser scanner data and information was compared with our study.
synthetic point cloud data from the restitution HBIM As seen in Table 5, the literature used the point cloud
models were used simultaneously in the machine learn- data type and accuracy values ranged between 81.4% and
ing process. In addition, the segmentation of the 3D 91.7%. Our study presents 95.14% training accuracy and
point cloud dataset of historical heritage buildings that 83.3% test accuracy. While the success achieved with the
are not in good structural condition is the challenging dataset consisting of 3D point cloud data type of struc-
part of the study. Table 5 has been created for compari- turally damaged buildings, an example of which is shown
son to determine the place of our study in the literature in Fig. 3, is 57.83%, this success has been increased to
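The accuracy and per-class IoU values compared here can be computed directly from per-point labels. The following is a minimal sketch, not the authors' code, assuming NumPy arrays of ground-truth and predicted labels for the five component classes used in this study (the function names and toy data are illustrative):

```python
import numpy as np

# The five component classes segmented in this study
# (0=window, 1=wall, 2=ceiling, 3=door, 4=floor).
CLASSES = ["window", "wall", "ceiling", "door", "floor"]

def overall_accuracy(y_true, y_pred):
    """Fraction of points whose predicted label matches the ground truth."""
    return float(np.mean(y_true == y_pred))

def per_class_iou(y_true, y_pred, n_classes=5):
    """Intersection over union (Jaccard index) for each class label."""
    ious = []
    for c in range(n_classes):
        inter = np.sum((y_true == c) & (y_pred == c))
        union = np.sum((y_true == c) | (y_pred == c))
        ious.append(float(inter / union) if union > 0 else float("nan"))
    return dict(zip(CLASSES, ious))

# Toy example with 8 points
y_true = np.array([0, 0, 1, 1, 2, 3, 4, 4])
y_pred = np.array([0, 1, 1, 1, 2, 3, 4, 0])
print(overall_accuracy(y_true, y_pred))  # 0.75
print(per_class_iou(y_true, y_pred))
```

Class-wise IoU (the Jaccard index [40]) penalizes both false positives and false negatives, which is why it is reported per component in Table 4 alongside the overall accuracy.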
Haznedar et al. Heritage Science (2023) 11:2 Page 17 of 18

This success has been increased to 83.3% with the restitution dataset. With this increase, the success of automating the pre-restoration processes by scanning historical heritage buildings with 3D laser scanners has also been increased. For this reason, our study can be compared with the successful studies that have contributed to the literature.

Conclusion
In the research reported in this paper, the scanned data from existing historical buildings, which are deteriorated and deformed, were used in AI-based segmentation using PointNet. The results showed that 83.30% prediction and 95.14% training accuracy were achieved even though the scanned data did not contain sufficient information about the structure due to the deformations in the buildings. Segmentation of point cloud data for historic buildings can be challenging, and AI-based algorithms can be insufficient due to these historic buildings' unique and deformed conditions. However, preparing the training dataset from the restitution information of the historic building, which is called restitution data in this research, helps significantly to achieve highly accurate segmentation. This restitution data and the laser scanning data were used together for the segmentation of five components (windows, doors, walls, ceilings and floors). Only five components were used because the case study heritage buildings are deteriorated and deformed; even so, the segmented results were still satisfactory.

The results show that the combined use of restitution data and existing-conditions data would be the way forward for point cloud segmentation with AI for heritage structures belonging to the same period. Therefore, the research will be expanded further by identifying other minor components in the case study buildings and preparing a training dataset for the algorithm towards enhanced and more detailed segmentation with higher accuracy. In addition, PointNet++ [17], an improved version of PointNet [16], can provide better segmentation performance with the proposed approach. Therefore, as an expansion of the current research, PointNet++ will also be considered to improve the segmentation, as part of the research plan on R-CNN and Fast R-CNN networks to incorporate the unlabelled data into the HBIM network.

Abbreviations
HBIM: Heritage Building Information Modelling; S3DIS: Stanford Large-Scale 3D Indoor Spaces Dataset; DL: Deep learning; IoU: Intersection over union; MLP: Multi-layer perceptron; SVM: Support Vector Machine; JAN: Joint Alignment Network.

Acknowledgements
This paper is written within the TUBITAK 1001 project (Grant number: 119Y038), and the authors would like to acknowledge The Scientific and Technological Research Council of Turkey (TUBITAK) for their support.

Author contribution
BH: supervision, methodology, writing, original draft preparation. RB: conceptualization, methodology, validation, software. AEO: supervision, conceptualization, methodology, validation, software. YA: writing, reviewing and editing. All authors read and approved the final manuscript.

Funding
The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.

Availability of data and materials
The data will be available upon reasonable request.

Declarations

Competing interests
The authors declare no conflict of interests.

Author details
1 Department of Computer Engineering, Gaziantep University, Gaziantep, Turkey. 2 AI Enablement, Huawei Turkey R&D Center, Ümraniye, İstanbul, Turkey. 3 Department of Electric-Electronic Engineering, Hasan Kalyoncu University, Gaziantep, Turkey. 4 Department of Architecture and Built Environment, Northumbria University, Newcastle, UK.

Received: 22 August 2022 Accepted: 11 December 2022

References
1. Gonzalez RC, Woods RE. Digital image processing. Addison-Wesley Publishing Company; 1993.
2. Dubb D, Zell A. Real-time plane extraction from depth images with the randomised hough transform. In: Proceedings of the IEEE international conference on computer vision workshops (ICCV Workshops). 2011. p. 1084–1091.
3. Zhu H, Meng F, Cai J, Lu S. Beyond pixels: a comprehensive survey from bottom-up to semantic image segmentation and cosegmentation. J Vis Commun Image Represent. 2016;34(5):12–27.
4. Truc L, Duan Y. PointGrid: a deep network for 3D shape understandings. In: 2018 IEEE/CVF conference on computer vision and pattern recognition. 2018. p. 9204–9214.
5. Wang P, Liu Y, Guo Y, Sun C, Tong X. O-CNN: octree-based convolutional neural networks for 3D shape analysis. ACM Trans Graphics. 2017;36(4):1–11.
6. Qi CR, Su H, Nießner M, Dai A, Yan M, Guibas LJ. Volumetric and multi-view CNNs for object classification on 3D data. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). 2016. p. 5648–5656.
7. Le T, Giang B, Duan Y. A multi-view recurrent neural network for 3D mesh segmentation. Comput Graph. 2017;66:103–12.
8. Bronstein MM, Bruna J, LeCun Y, Szlam A, Vandergheynst P. Geometric deep learning: going beyond euclidean data. IEEE Signal Process Mag. 2017;34(4):18–42.
9. Yi L, Su H, Guo X, Guibas L. SyncSpecCNN: synchronised spectral CNN for 3D shape segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2017. p. 6584–6592.
10. Niepert M, Ahmed M, Kutzkov K. Learning convolutional neural networks for graphs. In: Proceedings of the 33rd international conference on machine learning. 2016. p. 2014–2023.
11. Xie Y, Tian J, Zhu XX. Linking points with labels in 3D: a review of point cloud semantic segmentation. IEEE Geosci Remote Sens Mag. 2020;8(4):38–59.
12. Wang Z, Liu H, Yueliang Q, Xu T. Real-time plane segmentation and obstacle detection of 3D point clouds for indoor scenes. In: Fusiello A, Murino V, Cucchiara R, editors. European conference on computer vision (ECCV). 2012. p. 22–31.
13. Riegler G, Ulusoy AO, Geiger A. OctNet: learning deep 3D representations at high resolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). 2017. p. 6620–6629.
14. Klokov R, Lempitsky V. Escape from cells: deep Kd-networks for the recognition of 3D point cloud models. In: Proceedings of the IEEE international conference on computer vision (ICCV). 2017. p. 863–872.
15. Yi L, et al. A scalable active framework for region annotation in 3D shape collections. ACM Trans Graph. 2016;35(6):1–12.
16. Charles RQ, Su H, Kaichun M, Guibas LJ. PointNet: deep learning on point sets for 3D classification and segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2017. p. 77–85.
17. Charles RQ, Yi L, Su H, Guibas LJ. PointNet++: deep hierarchical feature learning on point sets in a metric space. In: Proceedings of the 31st international conference on neural information processing systems. 2017. p. 5105–5114.
18. Shen Y, Feng C, Yang Y, Tian D. Mining point cloud local structures by kernel correlation and graph pooling. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2018. p. 4548–4557.
19. Simonovsky M, Komodakis N. Dynamic edge-conditioned filters in convolutional neural networks on graphs. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). 2017. p. 29–38.
20. Landrieu L, Simonovsky M. Large-scale point cloud semantic segmentation with superpoint graphs. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2018. p. 4558–4567.
21. Wang L, Huang Y, Hou Y, Zhang S, Shan J. Graph attention convolution for point cloud semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). 2019. p. 10288–10297.
22. Su H, Maji S, Kalogerakis E, Learned-Miller E. Multi-view convolutional neural networks for 3D shape recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2015. p. 945–953.
23. S3DIS dataset. Papers with Code. https://paperswithcode.com/dataset/s3dis. Accessed 2022.
24. Tchapmi L, Choy C, Armeni I, Gwak J, Savarese S. SEGCloud: semantic segmentation of 3D point clouds. In: 2017 international conference on 3D vision (3DV). 2017. p. 537–547.
25. Boulch A, Saux BL, Audebert N. Unstructured point cloud semantic labeling using deep segmentation networks. In: Eurographics workshop on 3D object retrieval. 2017. p. 17–24.
26. Lawin FJ, Danelljan M, Tosteberg P, Bhat G, Khan FS, Felsberg M. Deep projective 3D semantic segmentation. In: Felsberg M, Heyden A, Krüger N, editors. Computer analysis of images and patterns. 2017. p. 95–107.
27. Hackel T, Wegner JD, Schindler K. Fast semantic segmentation of 3D point clouds with strongly varying density. ISPRS Ann Photogramm Remote Sens Spat Inf Sci. 2016;III–3:177–84.
28. Ma JW, Czerniawski T, Leite F. Semantic segmentation of point clouds of building interiors with deep learning: augmenting training datasets with synthetic BIM-based point clouds. Autom Constr. 2020;113:103144.
29. Stasinakis A, Chatzilari E, Nikolopoulos S, Kompatsiaris I, Karolidis D, Touloumtzidou A, Tzetzis D. A hybrid 3D object auto-completion approach with self-supervised data augmentation for fragments of archaeological objects. J Cult Herit. 2022;56:138–48.
30. Perez-Perez Y, Golparvar-Fard M, El-Rayes K. Scan2BIM-NET: deep learning method for segmentation of point clouds for scan-to-BIM. J Constr Eng Manag. 2021;147(9):04021107.
31. Pierdicca R, Paolanti M, Matrone F, Martini M, Morbidoni C, Malinverni ES, Lingua AM. Point cloud semantic segmentation using a deep learning framework for cultural heritage. Remote Sens. 2020;12(6):1005.
32. Matrone F, Grilli E, Martini M, Paolanti M, Pierdicca R, Remondino F. Comparing machine and deep learning methods for large 3D heritage semantic segmentation. ISPRS Int J Geo-Inf. 2020;9(9):535.
33. Teruggi S, Grilli E, Russo M, Fassi F, Remondino F. A hierarchical machine learning approach for multi-level and multi-resolution 3D point cloud classification. Remote Sens. 2020;12(16):2598.
34. Croce V, Caroti G, De Luca L, Jacquot K, Piemonte A, Véron P. From the semantic point cloud to heritage-building information modeling: a semiautomatic approach exploiting machine learning. Remote Sens. 2021;13(3):461.
35. Rodrigues F, Cotella V, Rodrigues H, Rocha E, Freitas F, Matos R. Application of deep learning approach for the classification of buildings' degradation state in a BIM methodology. Appl Sci. 2022;12(15):7403.
36. Liu L, Chen E, Ding Y. TR-Net: a transformer-based neural network for point cloud processing. Machines. 2022;10(7):517.
37. Morbidoni C, Pierdicca R, Paolanti M, Quattrini R, Mammoli R. Learning from synthetic point cloud data for historical buildings semantic segmentation. J Comput Cult Herit. 2020;13(4):1–16.
38. Mengqi Z, Yan T. Exploring spatiotemporal changes in cities and villages through remote sensing using multibranch networks. Herit Sci. 2021;9(1):1–15.
39. Dong Y, Li Y, Hou M. The point cloud semantic segmentation method for the Ming and Qing Dynasties' official-style architecture roof considering the construction regulations. Int J Geo-Inf. 2022;11(4):214.
40. Jaccard P. The distribution of the flora in the alpine zone. New Phytol. 1912;11(2):37–50.
41. Xu J, Ma Y, He S, Zhu J. 3D-GIoU: 3D generalised intersection over union for object detection in point cloud. Sensors. 2019;19(19):4093.
42. Hou F, Lei W, Li S, Xi J, Xu M, Luo J. Improved mask R-CNN with distance guided intersection over union for GPR signature detection and segmentation. Autom Constr. 2021;121(1):103414.

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.