Peerj Cs 2527
ABSTRACT
Traffic prediction is of vital importance in intelligent transportation systems. It
enables efficient route planning, congestion avoidance, and reduction of travel time.
However, accurate road traffic prediction is challenging due to the complex
spatio-temporal dependencies within the traffic network. Establishing and learning
spatial dependencies are pivotal for accurate traffic prediction. Unfortunately, many
existing methods for capturing spatial dependencies consider only single relationships,
disregarding potential temporal and spatial correlations within the traffic network.
Moreover, the end-to-end training methods often lack control over the training
direction during graph learning. Additionally, existing traffic forecasting methods
often fail to integrate multiple traffic data sources effectively, which affects prediction
accuracy adversely. In order to capture the spatiotemporal dependencies of the traffic
network accurately, a novel traffic prediction framework, Adaptive Spatio-Temporal
Graph Neural Network based on Multi-graph Fusion (DTS-AdapSTNet), is proposed.
Firstly, in order to better extract the hidden spatial dependencies, a method for
fusing multiple factors is designed, which includes the distance relationship, transfer
relationship and same-road segment relationship of traffic data. Secondly, an adaptive
learning method is proposed, which can control the learning direction of parameters
better by the adaptive matrix generation module and traffic prediction module. Thirdly,
an improved loss function is designed for training processes and a multi-matrix fusion
module is designed to perform weighted fusion of the learned matrices, updating the
spatial adjacency matrix continuously, which fuses as much traffic information as
possible for more accurate traffic prediction. Finally, experimental results using two
large real-world datasets demonstrate that the DTS-AdapSTNet model outperforms
other baseline models in terms of mean absolute error (MAE), root mean square
error (RMSE), and mean absolute percentage error (MAPE) when forecasting traffic
speed one hour ahead. On average, it achieves reductions of 12.4%, 9.8% and 16.1%,
respectively. Moreover, the ablation study validates the effectiveness of the individual
modules of DTS-AdapSTNet.
Subjects Artificial Intelligence, Data Mining and Machine Learning, Neural Networks
Keywords Traffic prediction, Spatial-temporal dependencies, Graph convolutional network,
Adaptive graph learning, Multi-graph fusion mechanism
Submitted 28 June 2024
Accepted 28 October 2024
Published 29 November 2024
Corresponding author Jing Zhang, [email protected]
Academic editor Bilal Alatas
Additional Information and Declarations can be found on page 33
DOI 10.7717/peerj-cs.2527
Copyright 2024 Shi et al.
Distributed under Creative Commons CC-BY 4.0
OPEN ACCESS
How to cite this article Shi W, Zhang J, Zhong X, Chen X, Ye X. 2024. DTS-AdapSTNet: an adaptive spatiotemporal neural networks for
traffic prediction with multi-graph fusion. PeerJ Comput. Sci. 10:e2527 http://doi.org/10.7717/peerj-cs.2527
INTRODUCTION
Traffic prediction plays a vital role in intelligent transportation systems. Accurate road
traffic forecasting facilitates dynamic route planning, congestion avoidance, travel time
reduction, and efficient allocation of traffic resources (Rabbouch, Saâdaoui & Mraihi,
2018; Lana et al., 2018; Wang et al., 2022b). Traffic prediction aims to estimate the future
traffic conditions (e.g., traffic flow and speed) for each road segment using historical
traffic data. Prediction methods can be roughly categorized into two groups: temporal
dependencies-based methods and spatiotemporal dependencies-based methods (Ren, Li &
Liu, 2023; Ermagun & Levinson, 2018).
Prediction methods that consider only temporal dependencies, such as
the Autoregressive Integrated Moving Average (ARIMA) model (Ahmed & Cook, 1979)
and the Bayesian model (Castillo, Menéndez & Sánchez-Cambronero, 2008), focus mainly on
modeling the temporal dependencies of time series, without considering potential spatial
dependencies among predicted road segments or nodes. However, with the development
of deep learning methods, attention has shifted towards considering potential spatial
dependencies within traffic networks. In early work, the research area was divided
into regular grids, and convolutional neural networks (CNNs) and recurrent neural
networks (RNNs) were employed to learn spatial relationships and extract valuable spatiotemporal
information among these grids (Geng et al., 2019; Yu et al., 2017). Subsequently,
with the successful application of graph neural networks (GNNs) in processing graph
topology (Jiang et al., 2023), Spatio-Temporal Graph Neural Networks (ST-GNNs) have
been developed and have demonstrated superior performance compared to grid-based
methods (Yu, Yin & Zhu, 2019; Diao et al., 2019; Wu et al., 2020). Compared with previous
methods, ST-GNN utilizes predefined graphs, which facilitate more effective learning of
latent spatial features. In recent years, the training frameworks of ST-GNN are divided into
two parts: the graph learning module and the prediction network module (Li & Zhu, 2021;
Lee & Rhee, 2022). With the advancement of technologies such as public transportation
systems and sensors, large amounts of spatio-temporal data can be obtained easily,
providing a robust data foundation for traffic prediction (Liu et al., 2020; Jiang & Luo,
2022). For example, Wang et al. (2019) use the historical trajectory data of taxis to extract
valuable spatial information through deep neural networks for road traffic prediction.
Ta et al. (2022) learn the potential spatial relationships among sensors first. They use the
historical traffic data provided by the sensors to predict the traffic conditions of each sensor
through a spatial–temporal convolutional network.
Unfortunately, road traffic prediction still faces the following three challenges: (1)
Graphs based solely on a single spatial relationship may overlook crucial factors such as road
characteristics and vehicle flow. This oversight can result in an inaccurate representation
of spatial relationships, hindering the extraction of comprehensive spatial dependencies
from traffic data. (2) The end-to-end training method leads to interdependence, making it
challenging to determine the training direction of the learnable parameters in each module.
(3) Existing methods make limited use of the available spatio-temporal data, so
the data cannot be extracted and fused effectively. Additionally, as traffic networks
expand, the scalability of prediction models becomes essential. The ability to implement
these models efficiently across larger, dynamic environments is vital for their practical
application.
An example of these problems is shown in Fig. 1. On the one hand, a long trajectory is
depicted in Fig. 1A, which roughly represents the traffic conditions of multiple road
segments. However, this representation fails to take into account other important traffic
details, making it less favorable for road traffic prediction. On the other hand, only the distance
and the environment are considered in Fig. 1B, neglecting the flow direction
of the actual road and disregarding the relationships among sensors on the same road.
Consequently, this simplistic approach also hinders accurate road traffic prediction.
Therefore, both representations require improvement to enhance prediction accuracy.
In order to solve the above problems, DTS-AdapSTNet is proposed in this article.
Firstly, a novel DTS relationship matrix generation module is designed to address the issue
of an inaccurate graph predefined by a single spatial relationship. Instead of relying solely on the
Euclidean distance matrix, multiple spatial relationship matrices are provided, which
are fused to obtain the initial predefined graph. Secondly, a two-stage alternating training
structure is proposed to overcome the limitations of traditional end-to-end training. This
structure includes alternate training between the adaptive matrix generation module and
the prediction module, thereby enhancing control over the training direction of learnable
parameters. Finally, sensor data is utilized in experiments to predict the traffic of road
RELATED WORK
In this section, the relevant methods of existing research are reviewed from three aspects.
PRELIMINARIES
In this section, the motivation, some definitions and the formalization of the problem will
be introduced.
Motivation
The traffic network exhibits not only complex spatial patterns but also constantly changing
spatial states. These variations in traffic conditions occur continuously, with each road
segment having its own unique driving direction and being equipped with multiple sensors
that record traffic data at various time intervals. To achieve accurate traffic prediction, it
is essential to analyze large amounts of historical traffic data collected by these sensors.
Additionally, precisely grasping the dynamics of traffic network changes is crucial for
learning the spatial dependencies more accurately. This grasp is the key to predicting traffic
conditions for each road segment at specific future moments.
The proposed DTS-AdapSTNet framework leverages traffic status data collected from
sensors to adaptively learn these spatial dependencies. By integrating GCNs, the model
enables accurate predictions for individual road segments. The primary goal is to improve
road traffic prediction accuracy by dynamically learning the spatial dependencies among
sensors in combination with historical traffic data.
Related definition
Definition 1 (Crossing sensors). Crossing sensors can be defined as sensors located at both
ends of a directed road segment. As shown in Fig. 2A, the sensor at the starting position of
a road segment is denoted as $S_i^{in}$, while the sensor at the end position of a road segment is
denoted as $S_i^{out}$, where i represents the ith road.
Definition 2 (Sensors on the road segments). Sensors on a road can be defined as all other
sensors on a road except crossing sensors, represented as $S_{i,j}^{on}$, where i represents the ith
road, and j denotes the jth sensor on the ith road.
Definition 3 (Road segments composition). A road segment $S_i$ in the road network can be
defined as composed of an inflow crossing sensor $S_i^{in}$, an outflow crossing sensor $S_i^{out}$, and
several sensors $S_{i,j}^{on}$ on the segment.
As shown in Fig. 2B, $S_i = \{S_i^{in}, S_{i,1}^{on}, \ldots, S_{i,j}^{on}, S_i^{out}\}$. $S_i$ belongs to the segment set $S$, $S_i^{in}$
and $S_i^{out}$ belong to the crossing sensor set $S^{in-out}$, and $\{S_{i,1}^{on}, \ldots, S_{i,j}^{on}\}$ belong to the non-crossing
sensor set $S^{on}$.
Definition 4 (Traffic network graph). The traffic network graph can be represented as
a weighted directed graph $g = \{S, V, W\}$ of the road network, where $S$ represents the
road segment set in the road network and $|S| = N_S$. $V$ represents the set of all sensors and
$|V| = N$. $W \in \mathbb{R}^{N \times N}$ is a weighted adjacency matrix representing the spatial correlations
of sensors. In general, when $W(i,j) = 0$, it indicates no correlation between sensors i and
j. However, in this article, some new weighted adjacency matrices among sensors will be
defined where this property does not necessarily hold.
Problem formalization
If the graph signal $X^{(t)}$ represents the historical traffic data observed by each sensor at
the t-th moment, then $X = [X^{(t-T'+1)}, \ldots, X^{(t)}]$, $X \in \mathbb{R}^{T' \times N \times M}$, is used to represent $T'$
$$Y = g(X, A^*) \qquad (1)$$
Table 1: Summary of key notations.
Notation                                              Description
$V$                                                   Sensor set
$S$                                                   Road segment set
$S^{in-out}$                                          Crossing sensor set
$S^{on}$                                              On-road sensor set
$S_i^{in} \in S^{in-out}$                             Starting sensor of road i
$S_i^{out} \in S^{in-out}$                            End sensor of road i
$S_{i,j}^{on} \in S^{on}$                             The jth sensor located on road i
$S_i \in S$                                           The i-th road
$N$                                                   Number of sensors
$N_S$                                                 Number of roads
$W \in \mathbb{R}^{N \times N}$                       Adjacency matrix
$g = \{S, V, W\}$                                     Traffic network graph
$M$                                                   Dimension of node attributes
$T'$, $T$                                             Window size of measurements
$X \in \mathbb{R}^{T' \times N \times M}$             Sensor attributes of historical conditions
$Y \in \mathbb{R}^{T \times N \times M}$              Sensor attributes of predicted conditions
$\tau \in \mathbb{R}^{T \times N_S \times M}$         Road attributes of predicted conditions
$A^*$                                                 Optimal adjacency matrix
$g(\cdot)$                                            Sensor traffic condition prediction function
$h(\cdot)$                                            Road traffic condition prediction function
$$\tau = h(Y). \qquad (2)$$
In general, the size of $X$ can be $\mathbb{R}^{N \times M}$, where $M$ is the number of features observed
by each sensor. Similarly, the size of $P$ can be $\mathbb{R}^{N_S \times M}$, where $N_S$ is the number of road
segments on the road network, $N_S \le N$, and $M$ is the number of features observed for each
road segment. The datasets used in the experiments only include the speed feature, i.e., $M = 1$.
However, all results are directly applicable to problems with $M > 1$. A summary of key
notations used in our model is shown in Table 1.
Step 3: Prediction module based on improved loss function. The relationship matrix $A^*$ is
used as the input of the prediction module, and the prediction network is trained on the historical
traffic data at the same time. A well-designed loss function
is employed to facilitate optimization. The parameter $\theta^*$ that maximizes the likelihood
of the prediction model is determined under the condition of the
current relationship matrix $A^*$.
Step 4: Adaptive Matrix Generation module (AMG-module). The parameter value of the
prediction network in the AMG-module is set to $\theta^*$, which is obtained in Step 3 and
remains fixed. The same prediction network is utilized for training, thereby allowing the
AMG-module to generate the relationship matrix repeatedly. As a result, a new relationship
matrix $M_{new}$ that improves the prediction outcome under the current parameter $\theta^*$ can
be obtained. The spatial dependencies among sensors are thus reweighted by this matrix,
ultimately leading to more accurate predictions.
Step 5: Multi-matrix Fusion module (MF-module). The $M_{new}$ generated in Step 4 is
incorporated into the matrix set $A$ of the MF-module, which utilizes the matrices generated
in Step 1 as its elements. Through the MF-module, weight calculation and distribution are
performed on all matrices in the matrix set. Subsequently, the optimal adjacency matrix is
obtained by fusing with the new weights, and it replaces the original $A^*$.
By repeating steps 3-5, a relatively better spatial adjacency matrix and a relatively better
road segment traffic prediction model can be obtained as the preset maximum number
of training iterations is reached. The pseudocode for DTS-AdapSTNet is described in
Algorithm 1.
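The alternation of Steps 3-5 can be sketched as a plain loop. This is only a schematic under stated assumptions: the callables `train_predictor`, `generate_matrix` and `fuse` are hypothetical stand-ins for the prediction module, the AMG-module and the MF-module, and the way the matrix set is assembled (Step 1 matrices plus the newest learned matrix) is an assumption rather than the exact procedure of the article.

```python
from typing import Callable, List, Sequence, Tuple

Matrix = List[List[float]]


def alternating_training(
    step1_matrices: Sequence[Matrix],
    a_star: Matrix,
    train_predictor: Callable[[Matrix], object],
    generate_matrix: Callable[[object, Matrix], Matrix],
    fuse: Callable[[Sequence[Matrix]], Matrix],
    max_iters: int,
) -> Tuple[Matrix, object]:
    """Repeat Steps 3-5 until the preset number of iterations is reached."""
    theta = None
    for _ in range(max_iters):
        # Step 3: train the prediction network with the adjacency matrix fixed.
        theta = train_predictor(a_star)
        # Step 4: keep theta fixed and learn a new relationship matrix M_new.
        m_new = generate_matrix(theta, a_star)
        # Step 5: fuse the matrix set (assumed: Step 1 matrices + M_new) into a new A*.
        a_star = fuse(list(step1_matrices) + [m_new])
    return a_star, theta
```

Passing the three modules in as callables keeps the sketch independent of any particular prediction network.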
Distance relationship
A certain correlation is observed among sensors, and it is
stronger among sensors that are relatively close to each other. To describe this relationship,
the distance among sensors is used. The distance between two sensors i and
j in the road network, denoted as $dist(i,j)$, is taken as the shortest distance when one
or more paths exist between them. This is illustrated in Fig. 4A.
where $\theta_1$ is a fixed parameter and $k$ is a threshold. Although the Gaussian kernel function
has become the standard for most distance modeling, in theory, a deep neural network can
model any function according to the universal approximation theorem. Additionally, since
the provided experimental data includes the distance among sensors, the Laplacian kernel
function is used to calculate $W_D$. Compared with the result calculated by the Gaussian kernel
function, the Laplacian kernel yields greater accuracy in the experimental
outcome.
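A thresholded Laplacian kernel over pairwise distances can be sketched as follows. The exact form of the kernel is not reproduced in this text, so the expression used here, exp(-dist/θ1) with weights below the threshold k set to zero, is an assumption; the function name and the use of infinity for unreachable pairs are illustrative choices as well.

```python
import math


def laplacian_kernel_adjacency(dist, theta1, k):
    """Build a distance-based weight matrix from shortest-path distances.

    dist[i][j] is the shortest road distance from sensor i to sensor j,
    or math.inf when no path exists. Assumed kernel: exp(-dist / theta1),
    with weights below the threshold k set to 0.
    """
    n = len(dist)
    w_d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if math.isinf(dist[i][j]):
                continue  # unreachable pair keeps weight 0
            weight = math.exp(-dist[i][j] / theta1)
            if weight >= k:
                w_d[i][j] = weight
    return w_d
```

A Gaussian variant would replace the exponent with the negative squared distance divided by the squared bandwidth, which makes the two kernels easy to compare on the same distance data.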
Transfer relationship
The transfer relationship can be used to describe the flow relationship among various
road segments. As shown in Fig. 4B, multiple possibilities exist for the transfer situation
of each crossing in the network. Accurately simulating the spatial transfer relationship
among crossing sensors is of great importance for predicting purposes. To address this, the
transfer relationship matrix WT is defined, which is utilized to simulate the spatial transfer
relationship of the road network.
Definition 7 (Transfer relationship matrix $W_T$). The spatial dependencies between the
crossing sensors $S_i^{in}$ and $S_i^{out}$ are captured through the transfer relationship. Consequently,
a relationship matrix is obtained which reflects the similarity among crossing sensors
effectively.
Firstly, the similarity matrix among crossing sensors is calculated using the Node2Vec
algorithm (Grover & Leskovec, 2016). Given that the traffic network is a directed and
unweighted graph, the crossing sensors are sampled through a biased random walk, which
is shown in Eq. (4).
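$$P(c_i = x \mid c_{i-1} = v) = \begin{cases} \dfrac{\pi_{vx}}{Z}, & \text{if } (v,x) \in E \\ 0, & \text{otherwise} \end{cases} \qquad (4)$$
$$\alpha_{pq}(t,x) = \begin{cases} \dfrac{1}{p}, & \text{if } d_{tx} = 0 \\ 1, & \text{if } d_{tx} = 1 \\ \dfrac{1}{q}, & \text{if } d_{tx} = 2 \end{cases} \qquad (6)$$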
where $t$ represents the previous node and $W_{vx}$ represents the weight of the edges in the weighted
graph. Since the traffic network is a directed and unweighted graph, the value of $W_{vx}$ can
be regarded as 1. Additionally, $\alpha_{pq}(t,x)$ is defined as the meta-transition probability. Its
calculation formula is shown in Eq. (6), where $p$ and $q$ are the parameters that control the
walking strategy of the model. $d_{tx} = 0$ indicates that the walk flows back from the current crossing node to the previous one,
i.e., $t = x$; $d_{tx} = 1$ indicates that $t$ and $x$ are connected directly, which is the so-called
breadth-first walk; and $d_{tx} = 2$ means that $t$ and $x$ are not connected, which is the so-called
depth-first walk.
The retrograde situation is not feasible in the real road segments. In addition, the
breadth-first walking strategy can better capture the dependencies between each crossing
node and its directly adjacent nodes. Therefore, when setting the model parameters, the
value of p is set to infinity, while the value of q is set as a number greater than 1, making
the model more inclined towards the breadth-first walk strategy.
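The biased walk described above can be illustrated with a small pure-Python sketch. It follows the standard Node2Vec rule in which the unnormalized score of moving to x is α_pq(t, x) multiplied by the edge weight (taken as 1 here). The `successors` dictionary, the function names, and the reading of "connected directly" as an edge from t to x are assumptions made for illustration.

```python
import math


def alpha_pq(d_tx, p, q):
    """Meta-transition probability of Eq. (6)."""
    if d_tx == 0:
        return 1.0 / p
    if d_tx == 1:
        return 1.0
    if d_tx == 2:
        return 1.0 / q
    raise ValueError("d_tx must be 0, 1 or 2")


def transition_probabilities(prev, cur, successors, p, q):
    """Normalized probabilities of stepping from `cur` to each successor x,
    given that the walk arrived at `cur` from `prev` (edge weight W_vx = 1)."""
    scores = {}
    for x in successors.get(cur, ()):
        if x == prev:
            d_tx = 0  # walking back to the previous node
        elif x in successors.get(prev, ()):
            d_tx = 1  # x is directly reachable from the previous node
        else:
            d_tx = 2  # x is not directly reachable from the previous node
        scores[x] = alpha_pq(d_tx, p, q) * 1.0
    z = sum(scores.values())  # normalizing constant Z of Eq. (4)
    if z == 0:
        return {}
    return {x: s / z for x, s in scores.items()}
```

With p set to `math.inf`, the score of returning to the previous node becomes exactly 0, which matches the statement that retrograde movement is not feasible; choosing q > 1 shifts the remaining probability mass towards nodes close to the previous one.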
Finally, the vector representation of all nodes $E_{node} \in \mathbb{R}^{N \times C_{node}}$ is obtained through the
aforementioned random walk strategy, where $C_{node}$ represents the information dimension
recorded by each node. The similarity matrix, denoted as $W_T$, is then calculated based on
the customized threshold $T_{sim}$ for transfer relationships. This matrix represents the transfer
relationships and is computed by Eqs. (7)–(8).
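$$W_T(i,j) = \frac{E_{node}(i) \cdot E_{node}(j)}{\lVert E_{node}(i) \rVert \, \lVert E_{node}(j) \rVert} \qquad (7)$$
$$W_T(i,j) = \begin{cases} 0, & W_T(i,j) < T_{sim} \\ 1, & W_T(i,j) \ge T_{sim} \end{cases} \qquad (8)$$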
where $W_T(i,j)$ represents the similarity between the vectors of any two crossing sensors. To
make the transfer relationship matrix $W_T$ sparser, it is reassigned according to
the threshold to obtain the final $W_T$.
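Eqs. (7)–(8) amount to a cosine similarity followed by binarization. A minimal sketch is given below, assuming the node embeddings are available as plain lists of floats; the function name and the handling of zero-norm vectors are illustrative choices, not part of the article.

```python
import math


def transfer_matrix(embeddings, t_sim):
    """Cosine similarity between node embeddings (Eq. 7), binarized by T_sim (Eq. 8)."""
    n = len(embeddings)
    norms = [math.sqrt(sum(x * x for x in e)) for e in embeddings]
    w_t = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if norms[i] == 0.0 or norms[j] == 0.0:
                continue  # assumed: a zero vector is treated as unrelated to every node
            dot = sum(a * b for a, b in zip(embeddings[i], embeddings[j]))
            cosine = dot / (norms[i] * norms[j])
            w_t[i][j] = 1 if cosine >= t_sim else 0
    return w_t
```

For example, with embeddings `[[1, 0], [1, 1], [0, 1]]` and a threshold of 0.7, the first and third nodes are orthogonal and stay disconnected, while each of them is linked to the second node.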
function, with a total of $M$ such functions available. This implies that $W_S$ is divided
into $M$ fine-grained divisions, and the resulting partition matrices satisfy the condition
$\sum_{m=1}^{M} f_m(W_S) = 1$. This condition ensures that $W_S$ is distributed across $M$ fine-grained
partition matrices, while preserving the distribution characteristics of $W_S$.
Finally, the m-th fine-grained partition matrix represents the probability that the
relationship between any two sensors in the network belongs to the m-th relationship.
To facilitate this, a weight set W = {w1 ,...,wm } is designed. By combining the probability
matrices and weight set W , a relationship matrix can be constructed to quantify the level of
mutual influence between sensors within each segment. The value of M can be determined
based on cluster analysis. To ensure smooth treatment of boundary values, a Gaussian
kernel filter is selected. The degree of mutual influence among sensors can be defined by
the matrix WS , which is shown in Eqs. (9)–(11).
$$G_m(W_S) = e^{-\frac{(W_S(i,j) - r_m)^2}{2\theta_2^2}}, \quad m \in M \qquad (9)$$
$$f_m(W_S) = \frac{G_m(W_S)}{\sum_{m=1}^{M} G_m(W_S)}, \quad m \in M \qquad (10)$$
$$W_S = \sum_{m=1}^{M} f_m(W_S) \cdot w_m, \quad m \in M,\ w_m \in W \qquad (11)$$
where $G_m(W_S)$ represents the result obtained through the Gaussian kernel filter, $r_m$ is the
cluster center point obtained through cluster analysis, and $\theta_2$ is a hyperparameter.
It is ensured that $w_{near} > w_{mid} > w_{far}$. The specific process of generating the DTS
relationship matrix is shown in Algorithm 1.
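For a single entry of the same-road matrix, Eqs. (9)–(11) reduce to a soft assignment over the cluster centers followed by a weighted sum. The sketch below applies them to one scalar value; the function name and the example centers and weights are hypothetical, and the case where every kernel value underflows to zero is not handled.

```python
import math


def soft_partition_weight(value, centers, weights, theta2):
    """Apply Eqs. (9)-(11) to one entry W_S(i, j).

    centers: cluster centers r_m; weights: w_m for each partition.
    Returns the membership probabilities f_m and the fused weight.
    """
    # Eq. (9): Gaussian kernel response for each cluster center.
    g = [math.exp(-((value - r) ** 2) / (2.0 * theta2 ** 2)) for r in centers]
    # Eq. (10): normalize so that the memberships sum to 1.
    total = sum(g)
    f = [g_m / total for g_m in g]
    # Eq. (11): weighted combination of the memberships.
    fused = sum(f_m * w_m for f_m, w_m in zip(f, weights))
    return f, fused
```

With three centers for near, mid and far pairs and weights ordered so that the near weight is largest, an entry lying exactly on the middle center receives most of its membership from the middle partition and equal shares from its two neighbours.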
For all the models used in this article, each layer of the GCN employs spectral graph
convolution based on the Chebyshev polynomial approximation (ChebConv). As shown
in Fig. 5B, it is able to capture the spatial dependencies, which can be expressed by Eq. (15).
$$H^{(l)} = \mathrm{ChebConv}(A^*, H^{(l-1)}, \theta^{(l)}) = \sigma\left( \sum_{k=0}^{K} T_k(A^*) H^{(l-1)} \theta^{(l)} \right) \qquad (15)$$
where $\sigma$ is the activation function, and $T_k(x)$ represents the recursively defined Chebyshev
polynomial given by $T_k(x) = 2xT_{k-1}(x) - T_{k-2}(x)$, where $T_0(x) = 1$ and $T_1(x) = x$.
$A^* = \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}$ is the normalized adjacency matrix, where $\tilde{A} = A + I$. $\tilde{D}$ is a diagonal
matrix with $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$. $H^{(l)}$ represents the hidden features of layer $l$ obtained after the
convolution operation, and $\theta^{(l)}$ represents the learnable parameter corresponding to the
l-th layer structure.
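The Chebyshev recurrence used in Eq. (15) extends from scalars to matrices by replacing x with the normalized adjacency matrix and 1 with the identity. The following sketch computes the terms $T_0, \ldots, T_K$ for a small matrix with plain Python lists; the helper names are illustrative, and a practical implementation would rely on a tensor library instead.

```python
def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def chebyshev_terms(a, K):
    """Return [T_0(a), ..., T_K(a)] using T_k = 2 a T_{k-1} - T_{k-2}."""
    n = len(a)
    terms = [identity(n)]                     # T_0 = I
    if K >= 1:
        terms.append([row[:] for row in a])   # T_1 = a
    for _ in range(2, K + 1):
        prod = matmul(a, terms[-1])
        terms.append([[2.0 * prod[i][j] - terms[-2][i][j] for j in range(n)]
                      for i in range(n)])
    return terms
```

For the 2 x 2 swap matrix, whose square is the identity, the terms alternate between the identity and the matrix itself, which gives a quick sanity check of the recurrence.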
In previous work, the training of prediction modules usually focuses on designing a
loss function that improves the accuracy of the predicted values of individual sensors, rather
than the accuracy of the predicted traffic conditions on the road segment where
the sensor is located. This approach may result in less accurate road-level predictions. This article
aims to predict the future traffic conditions of each road segment. Instead of directly calculating
the loss between the real value $Y$ and the predicted value $Pred$ of a single sensor,
the loss function is designed so that it
optimizes the model's performance in predicting road traffic conditions.
where $V_i \in N$ represents the ith sensor, and $S_j \in S$ represents the jth road. If the ith sensor
is located on the jth road, the corresponding matrix value is set to 1; otherwise it is 0. This
grouping allows sensors to be categorized according to the road they belong to.
Secondly, the ground truth $Y$ and predicted value $Pred$ are processed. As shown in Eqs.
(17)–(18), the actual value $Y_{road}$ of each road segment and the predicted value $Pred_{road}$ are
$$Y_{road} = Y \cdot W_r \qquad (17)$$
An array $C$ of length $S$ is calculated to record the number of sensors for each road
segment. Additional processing is conducted on $Y_{road}$ and $Pred_{road}$. The average real value
$\overline{Y}_{road}$ and the average predicted value $\overline{Pred}_{road}$ are obtained by Eqs. (19)–(20).
Finally, the minimization of the L1 loss between the predicted value of the road segments
and the real value is selected as the training goal of the prediction module, which is shown
in Eq. (21).
Through such a loss function, after several iterations, correlations that are favorable
for road segment traffic prediction are highlighted, while weaker correlations are erased
gradually. The prediction modules of the experiments are all implemented based on several
GCNs models with good prediction effects. The specific process is shown in Algorithm 3.
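The road-level loss described above can be sketched for a single time step as follows. Several of the equations are not reproduced in this text, so the sketch assumes that the predictions are aggregated with the same membership matrix $W_r$ as the ground truth, that road averages divide by the sensor counts in $C$, and that the L1 loss is averaged over roads; the function name and the skipping of roads without sensors are illustrative choices.

```python
def road_level_l1_loss(y, pred, w_r):
    """Mean absolute error between road-averaged truth and prediction.

    y, pred: per-sensor values (length N) for one time step.
    w_r: N x N_S membership matrix, w_r[i][j] = 1 if sensor i is on road j.
    """
    n_sensors = len(w_r)
    n_roads = len(w_r[0])
    total = 0.0
    used = 0
    for j in range(n_roads):
        count = sum(w_r[i][j] for i in range(n_sensors))  # entry of array C
        if count == 0:
            continue  # assumed: roads without sensors do not contribute
        y_road = sum(y[i] * w_r[i][j] for i in range(n_sensors)) / count
        pred_road = sum(pred[i] * w_r[i][j] for i in range(n_sensors)) / count
        total += abs(pred_road - y_road)
        used += 1
    return total / used
```

Because errors are compared after averaging, opposite-signed errors of sensors on the same road partly cancel, which is the intended shift of emphasis from individual sensors to road segments.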
$$A_{ij} = \begin{cases} 1, & \text{if } sim(i,j) > 0 \\ 0, & \text{otherwise} \end{cases} \qquad (23)$$
where $v_i$ and $v_j$ are the feature vectors of sensors i and j, respectively, and $\lVert \cdot \rVert$ represents the modulus
of a vector. $A_{ij}$ represents the degree of correlation between sensors i and j. Then a
learnable matrix $A_1$ is introduced and combined with the initial relationship matrix $A$ by
Eq. (24) to obtain the initial learnable matrix $M_{init}$.
$A_1 \in \mathbb{R}^{N \times N}$ are learnable parameters. In order to enhance the sparsity of $M_{init}$, the activation
function ReLU can set the diagonal position of the matrix and the other half positions to
0.
Secondly, an attention mechanism is used to fuse the old matrices with the newly
generated spatial dependency matrix. Then, an attention weight $\alpha_{ij}$ between each pair of
nodes can be obtained by Eq. (25).
where $W$ is a learnable weight matrix, $v_i$ and $v_j$ are the feature vectors of nodes i and j respectively,
$\|$ represents the vector concatenation operation, LeakyReLU is the leaky rectified linear unit,
and the softmax function is used to normalize the attention weights. The fusion
matrix $M_{fs}$ is obtained according to the attention weights $\alpha_{ij}$ by Eq. (26).
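The ingredients named above (concatenation, a learnable weight, LeakyReLU and softmax) match the familiar graph-attention pattern, which can be sketched as below. Since the attention equation itself is not reproduced in this text, the concrete form is an assumption: the learnable weight is simplified to a single vector `w` of length 2C that maps each concatenated pair to a scalar score, and the negative slope of 0.2 is an arbitrary illustrative value.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def attention_weights(features, w, slope=0.2):
    """Row-normalized attention weights alpha[i][j] for every pair of nodes.

    features: N feature vectors of length C.
    w: weight vector of length 2C (assumed shape of the learnable weight).
    """
    n = len(features)
    alpha = []
    for i in range(n):
        scores = []
        for j in range(n):
            concat = features[i] + features[j]  # v_i || v_j
            raw = sum(w_k * c_k for w_k, c_k in zip(w, concat))
            scores.append(leaky_relu(raw, slope))
        # Softmax over j, shifted by the maximum for numerical stability.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        alpha.append([e / total for e in exps])
    return alpha
```

Each row of the result sums to 1, so the weights can be read as the relative influence of every node j on node i.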
Table 2: Detailed statistics of the two datasets.
Datasets   Nodes  Edges  Time windows  Statistical characteristics  Time period covered
METR-LA    207    1515   17568         traffic speed                2012.3.1-2012.4.30
PEMS-BAY   325    2369   52116         traffic speed                2017.1.1-2017.6.30
EXPERIMENTS
In this section, the effectiveness of the DTS-AdapSTNet is evaluated and compared with
other baseline models using two real-world datasets. In addition, ablation experiments are
also conducted on the relevant modules of the proposed model.
Datasets
In order to verify the performance of the model, experiments are conducted on two public
datasets, METR-LA and PEMS-BAY. The detailed statistics of the two datasets are shown
in Table 2.
METR-LA: This traffic dataset contains traffic information collected from loop detectors
on the highways of Los Angeles County (Gehrke et al., 2014). Because this dataset contains a large number
of missing values, traffic speed statistics
from March 1, 2012, to April 30, 2012, covering 207 sensors along Los Angeles
County highways, are used to mitigate the impact of missing values on the experimental
results. Moreover, missing values are handled.
PEMS-BAY: This traffic dataset is collected by California Transportation Agencies
(CalTrans) Performance Measurement System (PeMS). We selected six months of traffic
speed statistics from January 1, 2017 to June 30, 2017, covering 325 sensors in the Bay Area.
The same data pre-processing procedures are adopted as Li et al. (2018). Observations
from the sensors are aggregated into 5-minute windows. The original spatial dependencies
such as road distance and direction are used as the input of the DTSRMG-module
to generate corresponding relationship matrices. The input data are normalized using
Z-score. Both datasets are split by time sequence with 70%, 15% and 15% for training,
validation and testing, respectively.
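The chronological 70%/15%/15% split and the Z-score normalization can be sketched as follows. The article does not state which portion of the data the normalization statistics are computed from; fitting them on the training portion only is an assumption made here to avoid leaking information from later time steps, and the function names are illustrative.

```python
import math


def chronological_split(n_steps):
    """Index ranges (start, end) for a 70/15/15 split in time order."""
    n_train = n_steps * 70 // 100
    n_val = n_steps * 15 // 100
    return (0, n_train), (n_train, n_train + n_val), (n_train + n_val, n_steps)


def zscore_params(values):
    """Mean and (population) standard deviation of a list of values."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var)


def zscore(values, mean, std):
    """Normalize values with statistics fitted elsewhere (assumed: training data)."""
    return [(v - mean) / std for v in values]
```

For the 17568 time windows of METR-LA, `chronological_split(17568)` gives 12297 training steps, 2635 validation steps and 2636 test steps.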
Baseline
Three predictive neural networks based on GCN with good performance are employed as
prediction models for experiments. The structures of these three GCN-based frameworks
are shown in Fig. 6.
(1) ASTGCN (Guo et al., 2019): Attention Based Spatio-Temporal Graph Convolutional
Network, which combines graph convolutions with spatial attention to capture spatial
patterns, and leverages standard convolutions and temporal attention to extract temporal
features.
(2) TGCN (Zhao et al., 2020): Temporal Graph Convolutional Network model, which
utilizes recurrent models to extract temporal features and graph convolutions to capture
spatial dependencies, respectively.
(3) DCRNN (Li et al., 2018): Diffusion Convolutional Recurrent Neural Network, which
replaces matrix multiplication in recurrent models (i.e., GRU and LSTM) with graph
convolutions, and extracts spatial and temporal features simultaneously in an
encoder–decoder manner.
The above three models also serve as baseline models, along with the following ones,
including both classic methods and state-of-the-art approaches:
(1) HA: Historical average, which uses the average of historical traffic flow data to
complete the task.
(2) STGCN (Yu, Yin & Zhu, 2017): Spatio-Temporal Graph Convolutional Network,
which combines 1D convolution and graph convolution.
(3) Graph WaveNet (Wu et al., 2019): A convolutional network architecture, which
introduces adaptive graphs to capture hidden spatial dependencies and uses dilated
convolutions to capture temporal dependencies.
(4) AdapGL+DCRNN (Zhang et al., 2022): An Adaptive Graph Learning Algorithm for
Traffic Prediction Based on Spatiotemporal Neural Networks, which combines the
proposed adaptive graph learning module with a DCRNN to find the adjacency matrix
relations that make traffic prediction work well.
(3) MAPE: Mean absolute percentage error provides a normalized error measurement,
making it useful when comparing prediction performance across datasets with different
scales (e.g., high-traffic vs. low-traffic areas).
$$\mathrm{MAPE}(Y, \hat{Y}) = \frac{1}{|\Omega|} \sum_{i \in \Omega} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100\%. \qquad (36)$$
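The three evaluation metrics can be computed with a few lines of Python. MAE and RMSE follow their standard definitions, and MAPE follows Eq. (36); skipping samples whose true value is zero in MAPE is an assumption made here to avoid division by zero, and the function names are illustrative.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent, ignoring zero true values."""
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t != 0]
    return sum(abs((t - p) / t) for t, p in pairs) / len(pairs) * 100.0
```

For true speeds of 10 and 20 and predictions of 9 and 22, the MAE is 1.5 and the MAPE is 10%, since both predictions are off by one tenth of the true value.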
Experiment settings
In all experiments, the traffic speed is predicted over the next hour using the traffic speed
from the previous hour, hence $T = 12$. The parameters $\theta$, $k$, and $T_{sim}$ used in generating
the three relationship matrices $W_D$, $W_T$, and $W_S$ are adjusted according to the scale of the data.
For the prediction module, the number of training epochs $N_{epoch}$ is set to approximately
10 based on the convergence rate of the prediction module. In the AMG-module, the
dimensions of $A_1$ and $A_2$ are configured as 64, and $\sigma$ is selected from the range [0,1].
Regarding the MF-module, the size of the matrix set $A$ is set to 3.
All experiments are executed on a platform with an NVIDIA GeForce RTX3080-10GB
graphics card. For all deep learning based models, the training process is implemented in
Python with PyTorch 1.8.0. Adam (Wang, Xiao & Cao, 2022) is utilized as the optimization
method with a learning rate of 0.001. Additionally, an early stopping strategy is employed
to determine whether the stopping criterion is met. Furthermore, the optimal parameters
are determined based on the performance on the validation dataset.
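With observations aggregated into 5-minute windows, one hour corresponds to 60 / 5 = 12 steps, which is where the window length of 12 comes from. A minimal sketch of how input and target windows could be cut from one sensor's series is shown below; the function name and the stride of one step are assumptions.

```python
def make_windows(series, t_in=12, t_out=12):
    """Slice a time series into (history, future) pairs.

    Each sample uses t_in past steps as input and the next t_out steps as target.
    """
    samples = []
    for start in range(len(series) - t_in - t_out + 1):
        history = series[start:start + t_in]
        future = series[start + t_in:start + t_in + t_out]
        samples.append((history, future))
    return samples
```

A series of 30 steps yields 30 - 24 + 1 = 7 samples, the first of which pairs steps 0 to 11 with steps 12 to 23.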
Performance analysis
Table 3 presents the performance comparison of different baselines and DTS-AdapSTNet
on the two datasets for road speed prediction in the next 15 min, 30 min and 60 min,
respectively. The prediction networks, which serve as the foundations for DTS-AdapSTNet,
are denoted within brackets (e.g., DTS-AdapSTNet (ASTGCN) refers to using ASTGCN
as a prediction network). The results demonstrate that DTS-AdapSTNet achieves excellent
results in all metrics across the prediction range. Superior performance is observed for
DTS-AdapSTNet based on three different GCN prediction networks compared to the other
baselines, for both the METR-LA and PEMS-BAY datasets. Particularly, DTS-AdapSTNet
based on DCRNN exhibits the best performance. When compared with the traditional
model HA for a 15-minute prediction on both datasets, it is found that MAE, RMSE, and
MAPE are reduced by 43%, 39%, and 54%, respectively. This demonstrates the significance
of considering spatial dependencies among sensors. Other spatio-temporal GCN models
are also compared, revealing an average reduction in prediction error for each metric by
10%, 6% and 13%, respectively. This reduction can be attributed to the accuracy of the
learned adjacency matrix, which has a relatively large impact on prediction results. It is
evident from the results presented in Table 3 that as the prediction window increases, the
accuracy of prediction decreases for each model. However, DTS-AdapSTNet outperforms
other models consistently, with a more noticeable improvement.
For the MAE, when the prediction window is 15 min, the DTS-AdapSTNet (DCRNN)
model exhibits a 24% lower error compared to the STGCN model. With a prediction
window of 30 min, the error is reduced by 26%. Furthermore, with a prediction window of
60 min, the error is reduced by 32%. This reflects the increasing complexity of the traffic
network and the growing difficulty of prediction as the prediction range expands. However,
it is noteworthy that DTS-AdapSTNet also outperforms the baseline in long-term forecasting,
highlighting the stability of the method employed in this study.
When compared with the AdapGL+DCRNN model, the method in this article has
different prediction focuses and specific implementation methods. In different GCN
prediction networks, the performance of the DTS-AdapSTNet model has been improved.
The most obvious improvement is observed in DTS-AdapSTNet (DCRNN). Particularly
noteworthy is the significant decline observed in all three metrics for the METR-LA dataset.
Similarly, for the PEMS-BAY dataset, DTS-AdapSTNet continues to remain competitive
or even show further enhancements compared to the AdapGL+DCRNN model.
To compare DTS-AdapSTNet more intuitively with the other well-performing models,
Fig. 7 plots each evaluation metric for the different models under the various prediction
windows. DTS-AdapSTNet based on different GCNs performs relatively well in each
prediction range on both datasets. Specifically, on the METR-LA dataset, as the
prediction window increases, the MAE, RMSE and MAPE of the three models proposed
in this article increase by 0.32−2.15, 1.00−5.26 and 1.07−7.60, respectively. The
corresponding increases for the three baseline models are 0.32−2.21, 0.98−5.45 and
1.07−7.63, respectively. The error of the DTS-AdapSTNet models therefore rises less
as the prediction window increases, which further illustrates the effectiveness of the
proposed model in long-term prediction.
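For reference, the three metrics and the error increase across prediction windows can be sketched as below. This assumes the standard definitions of MAE, RMSE and MAPE and omits any handling of missing sensor readings. The helper `error_growth` and all numbers are hypothetical illustrations, not values from Fig. 7 or Table 3.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; assumes no zero ground-truth values."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def error_growth(errors_by_window):
    """Increase in error from the shortest to the longest prediction window."""
    windows = sorted(errors_by_window)
    return errors_by_window[windows[-1]] - errors_by_window[windows[0]]


# Hypothetical speeds (mph) and predictions for three time steps.
y_true = [60.0, 50.0, 40.0]
y_pred = [57.0, 52.0, 44.0]
print(mae(y_true, y_pred), rmse(y_true, y_pred), mape(y_true, y_pred))

# Hypothetical MAE per prediction window (minutes) for one model.
growth = error_growth({15: 2.5, 30: 3.0, 60: 3.5})
print(f"MAE grows by {growth:.2f} from the 15 min to the 60 min window")
```

A smaller `error_growth` value for one model than for another corresponds to the flatter curve discussed above.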
Ablation experiment
To verify the effectiveness of the main modules proposed in this article, ablation studies
are conducted on METR-LA and PEMS-BAY datasets.
The positions of sensors 160, 164 and 161 are shown in Fig. 10E. Firstly, the distance
relationship is shown as the small squares 1 and 2 in Fig. 10A, where square 1
represents the relationship between sensors 160 and 164, and square 2 represents the
relationship between sensors 164 and 161. It is evident that the sensors in each pair are
close to each other in distance. Secondly, after the initialization matrix is generated by the
AMP-module, the spatial relationships among them are shown in Fig. 10B. The color of
squares 1 and 2 becomes lighter, indicating that the relationships among sensors at this
stage take into account not only the distance factor but also other factors. Finally, after the
AMG-module, the spatial relationships among the three sensors are shown in Fig. 10C.
The color of square 1 is darker, indicating that, after training, the model regards the
relationship between sensors 160 and 164 as closer.
Several observations can be made from Fig. 10E. Firstly, sensor 164 is situated
downstream of sensor 160, which accounts for the strong correlation between the two
sensors after model training. Secondly, there is a junction in the vicinity of sensor 164
where traffic can merge, and another junction near sensor 161 where traffic can flow out.
Therefore, sensors 164 and 161 cannot simply be classified as part of the same segment.
After model training, the color of square 2 becomes lighter, indicating a weakening of the
spatial relationship between sensors 164 and 161. Finally, although all three sensors are
located on the same road and are relatively close in distance, capturing the potential
spatial relationships among them through training strengthens the beneficial dependencies
and weakens the unfavorable ones. Consequently, new spatial relationships conducive to
road speed prediction are obtained. The proposed model is thus effective in learning the
spatial relationships among sensors on the same road, and it aims to represent real road
conditions accurately.
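The darker and lighter squares described above correspond to edge weights that rise or fall between stages. A minimal sketch of that comparison follows. The weights, the stage names and the helpers `describe_change` and `compare_stages` are all hypothetical and chosen only to mirror the qualitative pattern in the text; they are not values read from Fig. 10.

```python
# Hypothetical edge weights in [0, 1]; a larger weight stands for a darker square.
# Keys are (sensor, sensor) pairs. These are NOT values read from Fig. 10.
STAGES = {
    "distance": {(160, 164): 0.80, (164, 161): 0.80},
    "initial": {(160, 164): 0.55, (164, 161): 0.55},
    "learned": {(160, 164): 0.90, (164, 161): 0.30},
}


def describe_change(before: float, after: float, tol: float = 1e-6) -> str:
    """Label how an edge weight moved between two stages."""
    if after > before + tol:
        return "strengthened"
    if after < before - tol:
        return "weakened"
    return "unchanged"


def compare_stages(stages, first, second):
    """Return {pair: label} for every sensor pair present in both stages."""
    shared = stages[first].keys() & stages[second].keys()
    return {
        pair: describe_change(stages[first][pair], stages[second][pair])
        for pair in sorted(shared)
    }


changes = compare_stages(STAGES, "initial", "learned")
for pair, label in changes.items():
    print(pair, label)
```

With these made-up weights, the pair (160, 164) is reported as strengthened and the pair (164, 161) as weakened, matching the behavior of squares 1 and 2 described in the text.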
The proposed model also describes the spatial relationships among different roads,
through the spatial relationships of sensors located on those roads. Firstly, the two
sensors located on road 62 and road 85 show a weak correlation in both the distance
relationship matrix and the initialization relationship matrix, as shown by square 3 in
Figs. 10A and 10B. Secondly, after learning by the proposed model, the spatial
relationship between the two sensors is enhanced in the generated graph, as shown by
square 3 in Fig. 10C. This indicates that the two roads where the two sensors are located
have a high similarity in traffic changes. Finally, to verify this, the real speed variation of
the two roads over one day is plotted. As shown in Fig. 10D, road 62 and road 85 have
almost identical speed curves. This further demonstrates the importance of the spatial
relationships among the sensors represented by the generated graph in road network
speed prediction.
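One way to quantify how similar two daily speed curves are is the Pearson correlation coefficient. The sketch below is an illustrative check, not the procedure used in the article, and the speed samples for the two roads are hypothetical rather than taken from Fig. 10D.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have the same length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical speed samples (mph) over one day for two roads; NOT data from Fig. 10D.
road_62 = [65.0, 64.0, 40.0, 35.0, 60.0, 66.0]
road_85 = [63.0, 62.0, 42.0, 36.0, 58.0, 64.0]

r = pearson(road_62, road_85)
print(f"Pearson correlation between the two roads: {r:.3f}")
```

A coefficient close to 1 indicates that the two roads slow down and recover at the same times, which is the kind of similarity a learned edge between their sensors would be expected to capture.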
In summary, the AMG-module is capable of effectively learning spatial relationships
among sensors located on the same road as well as among sensors located on different
roads. These spatial relationships contribute to making predictions more accurate.
Figure 11 Comparison of prediction curves on a road of the test data of METR-LA trained with
different loss functions.
Full-size DOI: 10.7717/peerjcs.2527/fig-11
utilizing the proposed loss function, the error is increased, illustrating the effectiveness and
improvement introduced by the proposed loss function for road speed prediction.
CONCLUSION
Capturing the potential spatial dependencies of roads in a traffic network to achieve
accurate prediction poses a challenging problem. To address this challenge, an Adaptive
Spatio-Temporal Graph Neural Network based on Multi-graph Fusion (DTS-AdapSTNet)
is proposed in this article. Firstly, to make more effective use of historical traffic data,
DTS-AdapSTNet carefully partitions the roads in the road network and the sensors they
contain. The DTSRMG-module is used to capture the initial spatial dependencies among
sensors, which are fused to generate an initial predefined matrix. Secondly, a novel
AMG-module is proposed to learn the potential spatial dependencies adaptively.
Specifically, the AMG-module and the prediction module are trained alternately in cycles,
enabling the model to self-adjust. In addition, an improved loss function is designed for
model training. Furthermore, a fusion mechanism is used to fuse the learned matrices
and produce the optimal adjacency matrix, thereby enhancing the accuracy of road traffic
prediction. Finally, extensive experiments on two real-world datasets demonstrate that
the proposed DTS-AdapSTNet outperforms other existing methods. Ablation experiments
further confirm the effectiveness and contribution of each
module in this model. Accurate prediction of roads in the traffic network is crucial for
Funding
This work was supported by the National Natural Science Foundation of China (grant
number 61902069), the Natural Science Foundation of Fujian Province of China (grant
number 2021J011068), the Research Initiation Fund Program of Fujian University of
Technology (GY-S24002), and the Fujian Provincial Department of Science and Technology
Industrial Guidance Project (grant number 2022H0025). The funders had no role in study
design, data collection and analysis, decision to publish, or preparation of the manuscript.
Grant Disclosures
The following grant information was disclosed by the authors:
The National Natural Science Foundation of China: 61902069.
Natural Science Foundation of Fujian Province of China: 2021J011068.
Research Initiation Fund Program of Fujian University of Technology: GY-S24002.
Fujian Provincial Department of Science and Technology Industrial Guidance Project:
2022H0025.
Competing Interests
The authors declare there are no competing interests.
Author Contributions
• Wenlong Shi conceived and designed the experiments, performed the experiments,
performed the computation work, prepared figures and/or tables, authored or reviewed
drafts of the article, and approved the final draft.
• Jing Zhang conceived and designed the experiments, performed the experiments,
performed the computation work, authored or reviewed drafts of the article, and
approved the final draft.
Data Availability
The following information was supplied regarding data availability:
The two processed datasets, Metry_LA and PEMS_BAY, are available at Zenodo: Shi,
W. (2024). Data for DTS-adapSTNet [Data set]. Zenodo.
https://doi.org/10.5281/zenodo.12520380. They both include training, validation, and testing sets.
The raw measurements are available in the Supplementary File.
Supplemental Information
Supplemental information for this article can be found online at
http://dx.doi.org/10.7717/peerj-cs.2527#supplemental-information.
REFERENCES
Ahmed MS, Cook AR. 1979. Analysis of freeway traffic time-series data by using Box-
Jenkins techniques. Transportation Research Record Vol. 722. Transportation
Research Board. Washington, D.C., USA 1–9.
Alam I, Farid DM, Rossetti RJ. 2019. The prediction of traffic flow with regression anal-
ysis. In: Emerging technologies in data mining and information security: proceedings of
IEMIS 2018, vol. 2. Cham: Springer, 661–671.
Alghamdi T, Elgazzar K, Bayoumi M, Sharaf T, Shah S. 2019. Forecasting traffic conges-
tion using ARIMA modeling. In: 2019 15th international wireless communications &
mobile computing conference (IWCMC). Piscataway: IEEE, 1227–1232.
AlKheder S, Alkhamees W, Almutairi R, Alkhedher M. 2021. Bayesian combined neural
network for traffic volume short-term forecasting at adjacent intersections. Neural
Computing and Applications 33:1785–1836 DOI 10.1007/s00521-020-05115-y.
Almeida A, Brás S, Oliveira I, Sargento S. 2022. Vehicular traffic flow prediction using
deployed traffic counters in a city. Future Generation Computer Systems 128:429–442
DOI 10.1016/j.future.2021.10.022.
Altché F, De La Fortelle A. 2017. An LSTM network for highway trajectory prediction.
In: 2017 IEEE 20th international conference on intelligent transportation systems
(ITSC). Piscataway: IEEE, 353–359.
Castillo E, Menéndez JM, Sánchez-Cambronero S. 2008. Predicting traffic flow using
Bayesian networks. Transportation Research Part B: Methodological 42(5):482–509
DOI 10.1016/j.trb.2007.10.003.
Chen C, Hu J, Meng Q, Zhang Y. 2011. Short-time traffic flow prediction with ARIMA-
GARCH model. In: 2011 IEEE intelligent vehicles symposium (IV). Piscataway: IEEE,
607–612.