On The Combination of Graph Data For Assessing Thin-File Borrowers' Creditworthiness
Ricardo Muñoz-Cancino1, Cristián Bravo2, Sebastián A. Ríos3, and Manuel Graña4
1,3 Business Intelligence Research Center (CEINE), Industrial Engineering Department, University of
Chile, Beauchef 851, Santiago 8370456, Chile
2 Department of Statistical and Actuarial Sciences, The University of Western Ontario, 1151 Richmond
arXiv:2111.13666v2 [cs.SI] 16 Sep 2022
Abstract
Thin-file borrowers are customers for whom a creditworthiness assessment is uncertain due
to their lack of credit history. To address missing credit information, many researchers have
used borrowers’ social interactions as an alternative data source. Exploiting social networking
data has traditionally been achieved by hand-crafted feature engineering, but lately, graph
neural networks have emerged as a promising alternative. Here we introduce an information-
processing framework to improve credit scoring models by blending several methods of graph
representation learning: feature engineering, graph embeddings, and graph neural networks.
In this approach, we aggregate the methods’ outputs to be fed to a gradient boosting classifier
to produce a final creditworthiness score. We have validated this framework over a unique
multi-source dataset that characterizes the relationships, interactions, and credit history for
the entire population of a Latin American country, applying it to both application and
behavioral credit risk models. It also allows us to study both individuals and companies. Our results
show that the methods of graph representation learning should be used as complements;
they should not be seen as self-sufficient methods, as is currently done. We improve the
creditworthiness assessment performance in terms of the measures of Area Under the ROC
Curve (AUC) and Kolmogorov-Smirnov (KS), outperforming traditional methods of exploit-
ing social interaction data. In the area of corporate lending, where the potential gain is much
higher, our results confirm that the evaluation of a thin-file company cannot solely consider
the company’s own characteristics. The business ecosystem in which these companies inter-
act with their owners, suppliers, customers, and other companies provides novel knowledge
that enables financial institutions to enhance their creditworthiness assessment. Our results
let us know when and on which population to use graph data and the expected effects on
performance. They also show the enormous value of graph data on the credit scoring problem
for thin-file borrowers, mainly to help companies with thin or no credit history to enter the
financial system.
Keywords: credit scoring; machine learning; social network analysis; network data; graph
neural networks
∗ NOTICE: This is a preprint of a published work. Changes resulting from the publishing process, such as editing, corrections,
structural formatting, and other quality control mechanisms may not be reflected in this version of the document. Please cite this work
as follows: Muñoz-Cancino, R., Bravo, C., Ríos, S. A., & Graña, M. (2022). On the combination of graph data for assessing thin-file
borrowers’ creditworthiness. Expert Systems with Applications, 118809. DOI: https://doi.org/10.1016/j.eswa.2022.11880
1 Introduction
A large part of the population requires access to credit to achieve their life goals: social mobility,
owning a home, and financial success. Moreover, access to financial services and a proper credit
evaluation can facilitate, and are often necessary for, obtaining a job, renting a home, buying a car, starting a
new business, or pursuing a college education (Aziz & Dowling, 2019; Hurley & Adebayo, 2017). At
the macroeconomic level, access to credit is a major driver for local economic growth, especially
in developing economies (Diallo & Al-Titi, 2017). Financial institutions play a significant social
role in facilitating access to credit and facing the entailed risks of lending money. To manage this
credit risk, financial institutions have applied credit scoring models to assess the creditworthiness
of their borrowers, that is, to distinguish between good and bad payers and deliver loans to
those who are most likely to repay. To build a credit scoring model, financial institutions often
use personal information, banking data, and payment history to estimate creditworthiness and the
probability of default. Despite being the standard mechanism in the industry for credit-granting
decisions and the management of the loan’s life cycle (Thomas, Crook, & Edelman, 2017), this
ubiquitous tool still does not ensure adequate access to credit and to the financial system.
The World Bank estimates that more than 1.4 billion adults remain unbanked, without access
to the financial system (The Global Financial Index, 2022). This number only considers those who
do not have a bank account through either a financial institution or mobile banking. If we included
underbanked people, that is, those who have an account but cannot apply for a loan, this number
would be much larger. Being unbanked or underbanked raises the issue of those who lack a credit
history, also known as thin-file borrowers: people who have no access to a loan not because they
are bad payers but because they lack the attributes evaluated by traditional credit scoring models
(Baidoo, 2020; Cusmano, 2018; Djeundje, Crook, Calabrese, & Hamid, 2021; Hurley & Adebayo,
2017).
In this scenario, lenders have tried different ways to reach this population; we highlight two
business models here. In the traditional business model, the higher risk assumed due to the
lack of information is compensated by applying higher interest rates. Alternatively, granting
microcredits has been used as a strategy to assess the client’s payment behavior under limited
exposure. However, neither of these solutions has proven to be cost-effective in addressing the
credit needs of this population (Baidoo, 2020; Hurley & Adebayo, 2017).
For this reason, financial institutions, fintech companies, and researchers have looked in recent years for
business-model innovations and better decision-making with the available information. This search
involves developing better scoring algorithms and using alternative data sources to improve
credit scoring models. Regarding the use of alternative information, graph data has gained high
visibility because it allows an improvement of credit scoring models’ performance (Óskarsdóttir,
Bravo, Sarraute, Vanthienen, & Baesens, 2019; Roa, Rodríguez-Rey, Correa-Bahnsen, & Valencia,
2021).
We have identified two main gaps, which are addressed in this work. The first gap is the
data sources employed. Most of the studies are carried out with partial social networks that fail
to capture the overall picture of the client’s social interactions. These networks are limited by
the data provider. Our study uses social networking data to characterize the interactions of the
country’s entire population, encompassing the complete financial system. The second gap is how network
knowledge is extracted: mainly through hand-crafted feature engineering (Freedman &
Jin, 2017; Niu, Ren, & Li, 2019; Óskarsdóttir et al., 2019; Ruiz, Gomes, Rodrigues, & Gama, 2017)
and, in recent years, through graph neural networks (Roa, Rodríguez-Rey, et al., 2021), which have shown
no improvement over the traditional feature-engineering approach.
Our work investigates the combination of different representation learning techniques with
complex graph structures instead of observing them in isolation. Hence, we formulate the following
research questions:
1. When combining different graph representation learning (GRL) techniques over complex
graph structures, is there a performance improvement compared to merely applying hand-
crafted feature engineering or graph neural networks?
2. What insights are obtained into the combined network features, and what value do these
insights add to credit risk assessment?
3. Where does social information help the most? Is the most significant performance enhance-
ment obtained in personal credit scoring or business credit scoring? What can we gather
from this information? Does it influence which network and which features are the most
relevant?
This study challenges traditional hand-crafted feature engineering and the novel approach of
graph neural networks (GNNs) by combining multiple GRL methods. In particular, our work
makes the following contributions.
• Our results are the first to validate and test graph data for both corporate and consumer
lending, showing that the information from graphs has a different effect depending on
whether the analyzed borrower is a person or a company. These effects are reflected both in the
predictive power enhancement and in the features relevant to each problem, letting us know not
only when and on which population to use social-interaction data but also which effects on
creditworthiness prediction performance to expect.
• To the best of our knowledge, this is the first study that considers the credit behavior of
an entire country, together with social networks that characterize its entire
population and consolidate multiple types of social and economic relationships, for example,
parents and children, spouses, business owners, employers and employees, or transactional services.
• This paper also contributes to the growing literature in credit scoring and network data,
proposing a mechanism to achieve better results than the popular hand-crafted feature en-
gineering and the novel GNN approach.
This paper is structured as follows. Section 2 presents a review of credit risk management,
credit scoring and social networks. The GRL methods are presented in Section 3. Section 4
describes the data sources and features extracted for classification. Section 5 shows the proposed
information-processing methodology and the adopted experimental design. Section 6 presents the
results obtained. The conclusions and future work that originated from this research are presented
in Section 7.
2 Background and Related Work
2.1 Credit Risk Management
Banks’ core business is granting loans to individuals and companies. Granting a loan is not
risk-free; in fact, banks are heavily exposed to credit risk (Anderson, 2022), originating from the
potential loss due to the debtors’ default or their inability to comply with the agreed conditions
(The Basel Committee on Banking Supervision, 2000). Banking risk management focuses on
detecting, measuring, reporting, and managing all sources of risk. Banks define strategies, policies,
and procedures to limit the assumed risk. These strategies encourage and integrate the use of
mathematical models for the early detection of potential risks. Credit scoring is widely used for
managing credit risk, handling large volumes of data, and capturing complex patterns that are
difficult to express as simple business rules. This instrument became popular and ubiquitous in the
1980s, mainly due to advances in computing power and to the growth of financial markets, which
made it almost impossible to manage large credit portfolios without this kind of tool (Thomas et
al., 2017).
The regulatory framework also endorses the use of credit scoring models; in fact, the Basel Ac-
cords allow banks to manage credit risk with internal ratings. Specifically, banks develop internal
models for assessing the expected loss. This assessment can be divided into three components: the
probability of default (PD), the loss given default (LGD), and the exposure at default (EAD). The
PD is a key component, because it is used to define the credit granting policies and for portfolio
management. The general approach to estimating the PD and assessing the borrower’s credit-
worthiness is through classification techniques using demographic features and payment history as
explanatory variables.
Over the years, lenders have explored multiple ways to improve creditworthiness assessment,
including novel machine learning techniques (Moscato, Picariello, & Sperlí, 2021) and non-traditional data
sources (Aziz & Dowling, 2019). Multiple lines of research have been established; some of them
attempt to understand the characteristics of defaulters (Bravo, Thomas, & Weber, 2015), the feature
selection process (Kozodoi, Lessmann, Papakonstantinou, Gatsoulis, & Baesens, 2019; Maldonado,
Pérez, & Bravo, 2017), or the transformation of the feature space (Carta, Ferreira, Reforgiato Re-
cupero, & Saia, 2021). However, the most significant improvements have been obtained by the
exploitation of alternative data sources such as telephone call data (Óskarsdóttir, Bravo, Vanthienen,
& Baesens, 2018a; Óskarsdóttir et al., 2019, 2017), written risk assessments (Stevenson, Mues, &
Bravo, 2021), data generated by an app-based marketplace (Roa, Correa-Bahnsen, et al., 2021;
Roa, Rodríguez-Rey, et al., 2021), social media data (Cnudde et al., 2019; Putra, Joshi, Redi, &
Bozzon, 2020; Tan & Phan, 2018), network information (Ruiz et al., 2017), behavioral and psycho-
logical surveys (Goel & Rastogi, 2021), fund transfers datasets (Shumovskaia, Fedyanin, Sukharev,
Berestnev, & Panov, 2020; Sukharev, Shumovskaia, Fedyanin, Panov, & Berestnev, 2020), and psy-
chometric data (Djeundje et al., 2021; Rabecca, Atmaja, & Safitri, 2018; Rathi, Verma, Jain, Nay-
yar, & Thakur, 2022). All these studies have in common the use of social-interaction information,
the graph formed of the interactions among individuals recorded in alternative data sources.
There are multiple taxonomies of credit scoring problems. One that has been widely adopted
by academics and practitioners distinguishes between application scoring and behavioral scoring.
On the one hand, application scoring corresponds to a credit scoring system for new customers,
where the available information is often scarce and limited. On the other hand, behavioral scoring
is a credit scoring system for borrowers with available credit and repayment history. In the current
study, both of these credit scoring types and their differences between personal and business clients
are explored.
To work with both datasets, they propose a methodology based on GCN and recurrent neural
networks to handle network data and transactional data, respectively. As baseline models, they
train a model with 7000 features; however, they achieve an increase of 0.4% AUC when comparing
the proposed model with the best baseline model. Finally, Roa, Correa-Bahnsen, et al. (2021)
present a methodology for using alternative information in a credit scoring model. Models are
estimated using data generated by an app-based marketplace. This information is especially valuable for
low-income segments and young individuals, who are often not assessed well by traditional credit
scoring models. The authors compare a model with hand-crafted features versus models from
GCNs. However, GCNs do not achieve better results than do hand-crafted features in terms of
predictive power.
preserving the neighborhood of the nodes in the embedding subspace. The algorithm optimizes
a network-based objective function using stochastic gradient descent and produces samples of
node neighborhoods through second-order random walks. The key feature of Node2vec is the
use of biased random walks, providing a trade-off between two network search methods: breadth-first
search (BFS) and depth-first search (DFS). This trade-off creates more informative network
embeddings than other competing methods.
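The biased second-order walk described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the return parameter `p` and in-out parameter `q` follow the Node2vec paper (Grover & Leskovec, 2016) and are not named in this text, and `toy_graph` is a hypothetical undirected graph.

```python
import random


def biased_walk(adj, start, length, p=1.0, q=1.0, rng=None):
    """One second-order random walk; adj maps node -> set of neighbours."""
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = sorted(adj[cur])
        if not nbrs:
            break  # dead end: stop the walk early
        if len(walk) == 1:
            # First step has no previous node, so it is uniform.
            walk.append(rng.choice(nbrs))
            continue
        prev = walk[-2]
        weights = []
        for nxt in nbrs:
            if nxt == prev:
                weights.append(1.0 / p)  # go back to the previous node
            elif nxt in adj[prev]:
                weights.append(1.0)      # stay near the previous node (BFS-like)
            else:
                weights.append(1.0 / q)  # move away from it (DFS-like)
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk


# Hypothetical toy graph (undirected, stored as symmetric neighbour sets).
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b", "e"},
    "e": {"d"},
}
walk = biased_walk(toy_graph, "a", length=6, p=1.0, q=0.5, rng=random.Random(0))
```

With `q < 1` the walk favors outward, DFS-like moves; with `q > 1` it stays closer to the starting neighborhood, which is the trade-off the paragraph refers to.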
The next step is to define the graph Fourier transform and its inverse. The graph Fourier
transform $\mathcal{F}$ of the feature vector $X_i \in X$ is defined as follows:
$$\mathcal{F}(X_i) = U^{T} X_i, \tag{3}$$
where $X$ is the matrix of node attributes, and the inverse Fourier transform of a graph is defined
as follows:
$$\mathcal{F}^{-1}(\hat{X}_i) = U \hat{X}_i, \tag{4}$$
where $\hat{X}_i$ are the coordinates of the nodes in the new space. Therefore, the feature vector can
be written as $X_i = \sum_{j \in V} \hat{X}_i u_j$. Finally, the graph convolution of feature vector $X_i$ with filter
$g \in \mathbb{R}^N$, using the element-wise product $\odot$, is defined as follows:
4 Data Description
The data used in this paper encompasses several datasets provided by a large Latin American
bank. Some datasets contain information from their customers, while others concern the entire
population of the country.
• [WeddNet] Network of marriages: This network is built from the information of mar-
riages recorded by the bureau of vital statistics from 1938 to December 2015. It includes the
anonymized identifiers of the husband and the wife and the wedding date.
• [TrxSNet] Transactional services network: This network is built mainly
from transactional services data, primarily payroll services and transfers of funds between
two entities. We have access to monthly data from January 2017 to December 2019.
• [EnOwNet] Enterprise’s ownership network: This network is built from the informa-
tion on companies’ ownership structure. For each firm, we have information concerning their
owners, be they individuals or other firms. We have quarterly information from January
2017 to November 2019.
• [PChNet] Parents and children network: This network corresponds to parental relation-
ships. For every person born between January 1930 and June 2018, we have the anonymized
identifiers of their parents.
• [EmpNet] Employment network: This network is built from multiple sources and con-
nects people with their employers. We have monthly data from January 2017 to December
2019.
This probability was assessed and provided by the financial institution, and it is our benchmark
to contrast the performance of our models.
5.2 Target
The target event was “becoming a defaulter during the period of observation”. Therefore, we only
took into account individuals or businesses that were non-defaulters at the start of the period of
observation; we dismissed entities that were defaulters at the very beginning of the observation.
In the current study, a person or company was considered a defaulter when they had payments
past the due date for 90 or more days within 12 months starting from the observation point.
Otherwise, they were considered non-defaulters. The target vector, denoted by $y_{def}$, contained the
actual information about the target event.
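As a minimal sketch of this target definition, the snippet below labels each entity from a list of monthly days-past-due values. The data layout (one value per month after the observation point, plus a flag for defaulters at the start) and the names `default_label` and `build_target` are assumptions made for illustration; the source does not describe how the records are stored.

```python
def default_label(days_past_due_by_month, horizon=12, threshold=90):
    """Return 1 if any month within the horizon reaches the threshold, else 0."""
    window = days_past_due_by_month[:horizon]
    return int(any(d >= threshold for d in window))


def build_target(borrowers):
    """borrowers: id -> (is_defaulter_at_start, monthly days past due)."""
    y_def = {}
    for borrower_id, (defaulter_at_start, dpd) in borrowers.items():
        if defaulter_at_start:
            continue  # entities already in default are dismissed
        y_def[borrower_id] = default_label(dpd)
    return y_def


# Hypothetical records for four entities.
borrowers = {
    "b1": (False, [0, 0, 30, 60, 95, 0, 0, 0, 0, 0, 0, 0]),
    "b2": (False, [0] * 12),
    "b3": (True, [120] * 12),
    "b4": (False, [0] * 12 + [120]),  # 90+ days only after the 12-month window
}
y_def = build_target(borrowers)
```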
• [NodeStats] Node Statistics: This dataset collects node centrality statistics, namely, its
degree, degree centrality, number of triads, PageRank score, authority and hub score given by
the HITS algorithm (Kleinberg, 1999), and an indicator of whether the node is an articulation
point.
• [N2V] Node2Vec Features: Node2Vec is an unsupervised method that only uses the
network structure to generate the graph embedding. For the static network FamilyNet,
Node2Vec is applied only once. A node is characterized by this embedding regardless of the
moment it was sampled in the dataset. For temporal networks (EOWNet), Node2Vec has
to be recomputed every period because the network changes. Each node is characterized
by the embedding corresponding to the month in which it was sampled in the dataset.
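Some of the node statistics listed in the NodeStats item can be computed with a few lines of plain Python. The sketch below covers degree, degree centrality, triangle count, and PageRank by power iteration; it omits the HITS scores and the articulation-point indicator for brevity. It assumes an undirected graph with no isolated nodes, and the damping factor 0.85 is a conventional choice, not a value taken from the source.

```python
def node_stats(adj, damping=0.85, iters=100):
    """adj maps node -> set of neighbours (undirected, no isolated nodes)."""
    n = len(adj)
    rank = {v: 1.0 / n for v in adj}
    for _ in range(iters):
        new_rank = {}
        for v in adj:
            # Each neighbour u spreads its rank evenly over its own edges.
            incoming = sum(rank[u] / len(adj[u]) for u in adj[v])
            new_rank[v] = (1.0 - damping) / n + damping * incoming
        rank = new_rank

    stats = {}
    for v, nbrs in adj.items():
        # Every triangle through v is seen once from each of its two other nodes.
        triangles = sum(len(nbrs & adj[u]) for u in nbrs) // 2
        stats[v] = {
            "degree": len(nbrs),
            "degree_centrality": len(nbrs) / (n - 1),
            "triangles": triangles,
            "pagerank": rank[v],
        }
    return stats


# Hypothetical toy graph: a triangle a-b-c with a tail b-d-e.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b", "e"},
    "e": {"d"},
}
stats = node_stats(toy_graph)
```

For networks of the size described in this paper, a dedicated graph library would be the practical choice; the loop above is only meant to make the definitions concrete.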
Figure 2: Node2Vec to Features (time axis from t0 to tn; output: dataset)
Figure 3: Graph Convolutional Networks and Graph Autoencoders to Features (time axis from t0 to tn; output: dataset)
Figure 2 shows how the embedding features are obtained by applying
Node2Vec. In each period, a new model is trained, and the resulting embedding is consolidated
to characterize the sample dataset used to train the final credit scoring models.
• [GNN] Graph Neural Network Features: We used either GCN or GAE for the ex-
traction of GNN features. These methods carry out the graph convolution of the network
structure through a node feature vector in order to generate as output the feature vector
for ensuing classification by machine learning approaches. In this study, GNN input feature
vectors are in the Node feature dataset. Regardless of whether the network is static or
temporal, the feature vector is dynamic and changes during each observation period, that is,
each month. To avoid target leakage (Kaufman et al., 2012), we train the GCNs with the first
available data period; these models are applied in the subsequent months while the network
or the feature vectors change. This approach does not handle new entrants to the social
interaction networks; however, due to the sources of social interaction information employed
in this study, most thin-file borrowers are taken into account in the training dataset, despite
having temporal networks. Therefore, new entrants do not affect our results. Under other
circumstances, for instance when working with a partial network, our recommendation is to
train a new model for every period, taking care to avoid target leakage in the
population of the credit scoring model. Depending on the dynamism of the network, one
solution is to calculate local connection updates as suggested in Vlasselaer et al. (2015) in the
interim between model training phases. The GCNs are trained on the entire network (either
FamilyNet or EOWNet) regardless of whether the nodes belong to our training dataset or
not. The output of the GCN is the label of the nodes in the network, which can be
defaulter, non-defaulter, or unbanked; therefore, each GCN solves a multi-class classification
problem by providing the a posteriori probabilities of each label for each node. Finally, each
node is characterized by the GCN Features resulting from the application of the GCN on
the network and on the feature vector of the month in which the node was sampled, that is,
in which it entered the banking system. Figure 3 illustrates the feature-engineering process
to extract the GCN features. Regarding GAE, the feature engineering process is similar. In
this case, the embedding corresponds to the bottleneck hidden layer representing the encoder
section’s output. For consistency, we apply the same data selection applied for the GCN to
the training of the GAE models, although their training is unsupervised.
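To make the GCN output concrete, the following sketch runs the forward pass of a two-layer GCN that returns three class probabilities per node (defaulter, non-defaulter, unbanked). It assumes the renormalized propagation rule of Kipf & Welling (2016a); the architecture actually used in the study is not given in this section, and the graph, features, and weights below are arbitrary, untrained, and purely hypothetical.

```python
from math import exp, sqrt


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalized_adjacency(edges, n):
    """D^{-1/2} (A + I) D^{-1/2} for an undirected graph on nodes 0..n-1."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def softmax(row):
    m = max(row)
    e = [exp(v - m) for v in row]
    s = sum(e)
    return [v / s for v in e]


def gcn_forward(a_hat, x, w0, w1):
    """Two graph convolutions: ReLU hidden layer, softmax output layer."""
    hidden = [[max(0.0, v) for v in row] for row in matmul(matmul(a_hat, x), w0)]
    logits = matmul(matmul(a_hat, hidden), w1)
    return [softmax(row) for row in logits]


# Hypothetical 4-node path graph with 2 input features per node.
a_hat = normalized_adjacency([(0, 1), (1, 2), (2, 3)], n=4)
x = [[1.0, 0.0], [0.5, 0.2], [0.0, 1.0], [0.3, 0.3]]
w0 = [[0.2, -0.1, 0.4], [0.3, 0.5, -0.2]]                      # 2 inputs -> 3 hidden
w1 = [[0.1, 0.0, -0.3], [0.2, -0.4, 0.1], [-0.5, 0.3, 0.2]]    # 3 hidden -> 3 labels
probs = gcn_forward(a_hat, x, w0, w1)
```

Each row of `probs` plays the role of the per-node a posteriori probabilities that the text describes as GCN Features.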
• Subset B: $X_{BenchScore}$ is the benchmark score; this attribute is also used as a benchmark to
quantify the performance of our proposed approach.
• Subset C: $X_{NodeStats}$ is composed of the statistics obtained from the position of the node
within the network.
• Subset D: $X_{EgoNet}$ includes the egoNet aggregation and the egoNet weighted aggregation
features that are calculated in three scenarios: considering the entire network, considering
only those edges that are bridges, and considering those edges that are not bridges∗.
• Subset E: $X_{GNN,N2V}$ corresponds to the features created by applying GNNs and Node2Vec.
People are characterized by features from both networks (EOWNet and FamilyNet), while
companies are characterized only by EOWNet.
For this study, we aggregated these feature subsets into eight increasingly larger datasets for
training and validation, each defining a different experimental setting. The details of the feature
sets are presented below in Table 2.
Table 2: Experiments Setup
Experiment Id    Feature Group
A                $X = \{X_{Node}\}$
A+B              $X = \{X_{Node} + X_{BenchScore}\}$
A+B+C            $X = \{X_{Node} + X_{BenchScore} + X_{NodeStats}\}$
A+B+D            $X = \{X_{Node} + X_{BenchScore} + X_{EgoNet}\}$
A+B+E            $X = \{X_{Node} + X_{BenchScore} + X_{GNN,N2V}\}$
A+B+C+D          $X = \{X_{Node} + X_{BenchScore} + X_{NodeStats} + X_{EgoNet}\}$
A+B+C+E          $X = \{X_{Node} + X_{BenchScore} + X_{NodeStats} + X_{GNN,N2V}\}$
A+B+C+D+E        $X = \{X_{Node} + X_{BenchScore} + X_{NodeStats} + X_{EgoNet} + X_{GNN,N2V}\}$
and KS ranges between 0 and 1, and a higher KS indicates a higher prediction performance.
5.6 Methodology
First, we carry out a feature engineering process that seeks to create attributes to characterize
the nodes from the network. Then, in the train-test split step, the available dataset is divided
into two parts: 30% of the samples are used to estimate the model’s hyper-parameters,
and the remaining 70% are used to train and validate models
according to an N-Fold Cross-Validation scheme.
Before estimating the best hyper-parameters, we apply a feature selection process. In this step,
the intention is to choose a low-correlated subset of features with high predictive power. Three
selection levels are formulated. The first is a bivariate selection that only considers one feature
at a time and the target vector to build a prediction model, evaluating its performance. The
selection process applies a threshold to this feature’s predictive power. Only those variables
with $KS > KS_{min}$ and $AUC > AUC_{min}$ are selected, where $KS_{min}$ and $AUC_{min}$ are threshold
parameters.
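The bivariate filter can be illustrated as follows. This is a simplified sketch: instead of fitting a one-feature prediction model as the text describes, it scores each raw feature directly, takes the larger of AUC and 1 − AUC so that the direction of the relationship does not matter, and uses hypothetical threshold values for `ks_min` and `auc_min`.

```python
def auc(scores, labels):
    """Probability that a positive outranks a negative (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ks(scores, labels):
    """Maximum gap between the score CDFs of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best = 0.0
    for t in sorted(set(scores)):
        cdf_pos = sum(s <= t for s in pos) / len(pos)
        cdf_neg = sum(s <= t for s in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


def bivariate_select(features, labels, ks_min=0.05, auc_min=0.55):
    """Keep features whose individual KS and AUC both exceed the thresholds."""
    kept = {}
    for name, values in features.items():
        a = auc(values, labels)
        a = max(a, 1.0 - a)  # simplification: ignore the sign of the relationship
        k = ks(values, labels)
        if k > ks_min and a > auc_min:
            kept[name] = (a, k)
    return kept


# Hypothetical data: one informative feature and one constant feature.
labels = [0, 0, 0, 1, 1, 1]
features = {
    "strong": [0.1, 0.2, 0.3, 0.7, 0.8, 0.9],
    "noise": [0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
}
kept = bivariate_select(features, labels)
```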
Next, a multivariate selection is applied, following a simple but effective method that we have
developed to drop correlated features. This algorithm starts with an empty list S. We iterate
over a set of features P in decreasing order of predictive power and append to S those features
whose absolute correlation with every feature already in S is less than a threshold
ρ, set to avoid highly correlated features (Akoglu, 2018). The first feature in P is added to S
without correlation comparison. The algorithm stops when all the features have been visited.
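The greedy correlation filter described above can be sketched as follows. Pearson correlation is an assumption here (the text only says "correlation"), and the threshold value `rho=0.7` and the toy features are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def drop_correlated(features, power, rho=0.7):
    """Visit features by decreasing predictive power and keep those whose
    absolute correlation with every already-selected feature is below rho."""
    ordered = sorted(features, key=lambda name: power[name], reverse=True)
    selected = []  # the list S, initially empty
    for name in ordered:
        # all() over an empty S is True, so the first feature is always kept.
        if all(abs(pearson(features[name], features[s])) < rho for s in selected):
            selected.append(name)
    return selected


# Hypothetical features and their individual predictive power (e.g. AUC).
features = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [2.0, 4.0, 6.0, 8.0, 10.5],   # almost a copy of f1
    "f3": [1.0, -1.0, 1.0, -1.0, 1.0],  # uncorrelated with f1
}
power = {"f1": 0.80, "f2": 0.75, "f3": 0.60}
selected = drop_correlated(features, power)
```

Because features are visited in decreasing order of predictive power, whenever two features are highly correlated the stronger one is the one that survives.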
For this study, the multivariate selection process is applied twice. In the first application,
the process selects low-correlated features for each group of attributes $P \in \{X_{Node} \cup X_{BenchScore},
X_{NodeStats}, X_{EgoNet}, X_{GNN,N2V}\}$. In the second, it is applied globally to all the remaining features $P =
\{X_{Node} \cup X_{BenchScore} \cup X_{NodeStats} \cup X_{EgoNet} \cup X_{GNN,N2V}\}$. In both cases, a threshold ρ is used, and
the features are ordered by AUC, from highest to lowest.
Finally, at the N-Fold Cross-Validation stage, the dataset is partitioned into N subsets of equal
size. Each subset is used in turn as the test dataset, while the remaining folds are used to
train the classification model. The hyper-parameters used in each iteration are those estimated
in the previous stage. Additionally, in each of these iterations, multiple models are trained with
different feature sets and stored to be used later to compare the models.
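The data partitioning in this methodology (a 30% share for hyper-parameter estimation, then N-Fold Cross-Validation on the remaining 70%) can be sketched with the standard library alone. The number of folds, the random seed, and the function name `split_and_folds` are assumptions for illustration; the source does not state the value of N in this section.

```python
import random


def split_and_folds(ids, n_folds=5, tuning_share=0.30, seed=0):
    """Return the tuning sample and a list of (train, test) fold splits."""
    ids = list(ids)
    random.Random(seed).shuffle(ids)
    cut = int(round(len(ids) * tuning_share))
    tuning, rest = ids[:cut], ids[cut:]

    # Deal the remaining samples into n_folds interleaved subsets.
    folds = [rest[i::n_folds] for i in range(n_folds)]
    splits = []
    for k in range(n_folds):
        test = folds[k]
        train = [s for j, fold in enumerate(folds) if j != k for s in fold]
        splits.append((train, test))
    return tuning, splits


# Hypothetical sample of 100 borrower identifiers.
tuning, splits = split_and_folds(range(100))
```

The folds have equal size only when the number of remaining samples is divisible by `n_folds`; otherwise they differ by at most one sample.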
• [NodeStats] Node Statistics: This process corresponds to the computation of the metrics
defined in Section 5.3. This process is carried out only once for the static network FamilyNet,
and it is calculated for all the available periods (24) of the EOWNet. The computation of
all the metrics for a network takes, on average, 25 minutes. The total execution time of this
stage was 625 minutes.
• [EgoNet] EgoNetwork Aggregation: This process is calculated once per network type
(FamilyNet and EOWNet). The total execution time of this stage was 300 minutes.
• [N2V] Node2Vec Features: This process is carried out only once for the static network
FamilyNet, and it is calculated for all the available periods (24) of the EOWNet. The
Node2Vec training for a network takes, on average, 300 minutes. The total execution time
of this stage was 7,500 minutes.
• [GNN] Graph Neural Network Features: In this stage, eight models are trained using
each network and eight different feature vectors. The training of each model is carried out
over the first available period data; afterward, the trained model is applied to the data of
the remaining periods. Table 3 shows the execution time by GNN type and by network type.
The total execution time of this stage was 6,920 minutes.
• Gradient boosting training: Four models were trained using the methodology described
in Section 5.6 for the scenarios defined in Section 4, predicting application and behavioral
scoring for individuals and companies. The complete training for each scenario takes, on
average, 40 minutes. The total execution time of this step was 160 minutes.
The total execution time of the application of the proposed methodology to the datasets is
15,500 minutes. Although a large part of these executions was parallelized using the server de-
scribed in Section 6.1, the high computational cost is mainly due to two factors, the volume of
data and the complexity of the algorithms. Regarding the large volume of data, the FamilyNet
has 20 million nodes and 30 million edges, and the EOWNet has 8.6 million nodes and 26 million
edges. These massive dataset sizes directly impact the algorithms used, because the complexity
depends on the number of nodes $|V|$, the number of edges $|E|$, and the embedding dimension $d$; the algorithmic complexity
for this particular case is the following: Node2Vec: $O(|V|d)$ (Grover & Leskovec, 2016), Graph
Convolutional Networks: $O(|E|d)$ (Kipf & Welling, 2016a), and Graph Autoencoder: $O(|V|^{2}d)$
(Kipf & Welling, 2016b).
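As a quick check of the figures reported for each stage, the per-stage times can be recomputed from the stated averages. The sketch assumes that each per-network stage runs once for the static FamilyNet and once for each of the 24 EOWNet periods, which is how the stage descriptions read.

```python
# One static FamilyNet run plus 24 EOWNet periods.
periods_eownet = 24
runs_per_stage = 1 + periods_eownet

stage_minutes = {
    "NodeStats": runs_per_stage * 25,    # 25 minutes per network on average
    "EgoNet": 300,                       # reported total
    "N2V": runs_per_stage * 300,         # 300 minutes per network on average
    "GNN": 6920,                         # reported total
    "GradientBoosting": 4 * 40,          # four scenarios, 40 minutes each
}
total_minutes = sum(stage_minutes.values())
```

The recomputed stage totals match the reported 625, 7,500, and 160 minutes, and the sum of the five stages is 15,505 minutes, which is consistent with the roughly 15,500 minutes stated in the text.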
Table 4: Improvement in AUC relative to the benchmark model (mean and std). We only report results
when the equal performance hypothesis is rejected, with a confidence level of 95%; otherwise, we display
*. The best performance in each column is shown in bold; more than one bold value indicates that the
hypothesis of equal performance between those models cannot be rejected.
               Business Credit Score               Personal Credit Score
Feature Set    Application       Behavioral        Application       Behavioral
A              -3.52% ± 2.87%    -0.90% ± 0.21%    -0.74% ± 0.63%    -0.63% ± 0.09%
A+B            *                 0.58% ± 0.06%     1.45% ± 0.39%     0.95% ± 0.06%
A+B+C          *                 1.13% ± 0.12%     2.02% ± 0.49%     1.07% ± 0.06%
A+B+D          8.96% ± 3.37%     2.33% ± 0.15%     2.31% ± 0.64%     1.25% ± 0.08%
A+B+E          3.92% ± 2.03%     1.77% ± 0.13%     3.17% ± 0.55%     1.96% ± 0.04%
A+B+C+D        9.00% ± 3.47%     2.37% ± 0.16%     2.39% ± 0.60%     1.32% ± 0.08%
A+B+C+E        4.25% ± 1.84%     1.94% ± 0.16%     3.26% ± 0.48%     2.03% ± 0.05%
A+B+C+D+E      8.43% ± 2.83%     2.80% ± 0.16%     3.58% ± 0.61%     2.18% ± 0.04%
Table 5: Improvement in KS relative to the benchmark model (mean and std). We only report results
when the equal performance hypothesis is rejected, with a confidence level of 95%; otherwise, we display
*. The best performance in each column is shown in bold; more than one bold value indicates that the
hypothesis of equal performance between those models cannot be rejected.
               Business Credit Score               Personal Credit Score
Feature Set    Application        Behavioral       Application       Behavioral
A              *                  -4.15% ± 0.94%   -5.25% ± 2.40%    -2.39% ± 0.46%
A+B            *                  1.56% ± 0.40%    4.38% ± 1.19%     1.95% ± 0.35%
A+B+C          *                  3.21% ± 0.71%    6.27% ± 1.02%     2.23% ± 0.39%
A+B+D          20.69% ± 16.73%    7.69% ± 0.92%    6.79% ± 1.36%     2.69% ± 0.47%
A+B+E          12.22% ± 10.89%    5.83% ± 0.74%    8.64% ± 2.13%     4.68% ± 0.28%
A+B+C+D        21.28% ± 17.10%    8.09% ± 0.95%    7.12% ± 1.52%     2.83% ± 0.52%
A+B+C+E        12.88% ± 10.11%    6.33% ± 0.70%    8.93% ± 1.98%     4.93% ± 0.26%
A+B+C+D+E      19.32% ± 14.77%    9.45% ± 0.85%    10.83% ± 1.98%    5.15% ± 0.42%
The performance comparison between all feature sets is shown in Table 6; we marked as * those
comparisons with no statistically significant differences.
When performance is measured in terms of KS, the maximum is obtained with five feature sets,
all of which include GRL features, but no method for feature extraction predominates. Although differences are
observed in the values presented, these are not statistically significant. From these results, it is
worth highlighting that every one of the best feature sets includes at least one GRL method. The complete comparison
is presented in Table 7.
In the other scenarios, the best performance is observed when combining traditional features and
all the GRL features, which corresponds to the Feature Set A+B+C+D+E. This holds
for both AUC (see Table 4) and KS (see Table 5).
These results are of great importance because they indicate that the methods combined by our
methodology are complementary, and none is significantly better than the others. Both methods,
namely hand-crafted feature engineering and GNNs, have until now been treated in the literature
as independent in addressing the credit scoring problem.
When comparing the results of Application and Behavioral Credit Scoring, the largest increase in
performance, regardless of the metric, is achieved in Application Credit Scoring. Network-related
features compensate for the limited availability of information, so the relationships that a person
or company has are relevant when predicting their creditworthiness. These results are of high
interest for lenders and for their strategies toward the unbanked. The improvement in predictive
performance implies that more borrowers can be served without increasing the portfolio default
rate.
Regarding Behavioral Credit Scoring, traditional attributes are already good predictors of
creditworthiness; the borrower’s credit behavior is a good indicator of default. For this reason, the
increase in predictive performance is more limited, although still significant.
6.3.3 The Advantages of Blending Graph Representation Learning
The previous sections have shown that our approach allows us to enhance the discrimination
power of our benchmark in terms of AUC and KS. Through the incorporation of the graph data
by means of the GRL methods, this increase is even more significant. Now, we are interested in
discovering the contribution of each of these methods. The performance comparison between the
A+B+C+D+E feature set and each method by itself is shown in Tables 8 and 9, for AUC and
KS respectively; we marked as * those comparisons with no statistically significant differences. In
each table, the results are presented for each credit scoring scenario and the comparison using the
$X_{EgoNet}$ (A+B+D) and $X_{GNN,N2V}$ (A+B+E) features; in both cases, the models trained
with the $X_{NodeStats}$ features are also included.
Table 8: Blended Graph Representation Learning performance. The performance enhancement of training
a model using all GRL methods (A+B+C+D+E) is measured as the relative increase in AUC given by
$(AUC_{[A+B+C+D+E]} - AUC_{column}) / AUC_{column}$.

Scoring       Model                     Feature Set
                                        A+B+D     A+B+C+D   A+B+E     A+B+C+E
Application   Business Credit Score     *         *         4.33%     4.00%
Scoring       Personal Credit Score     1.23%     1.16%     0.39%     0.31%
Behavioral    Business Credit Score     0.47%     0.43%     1.02%     0.85%
Scoring       Personal Credit Score     0.92%     0.84%     0.22%     0.15%
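
The relative increase defined in the caption of Table 8 can be sketched in a few lines of Python.
The AUC values below are hypothetical placeholders chosen only to show the arithmetic; they are
not results from this study, and the function name relative_increase is an assumption of the
example.

```python
from typing import Dict


def relative_increase(auc_blended: float, auc_column: float) -> float:
    """Relative AUC increase of the blended model over a comparison model."""
    if auc_column <= 0:
        raise ValueError("the comparison AUC must be positive")
    return (auc_blended - auc_column) / auc_column


# Hypothetical AUC values, for illustration only.
auc_blended = 0.84
auc_by_feature_set: Dict[str, float] = {"A+B+D": 0.83, "A+B+E": 0.80}

for feature_set, auc_column in auc_by_feature_set.items():
    gain = relative_increase(auc_blended, auc_column)
    print(f"{feature_set}: {gain:.2%}")
```

Because the measure is relative to the comparison model, the same absolute AUC gain yields a
larger percentage when the comparison model is weaker.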
The results show that combining the GRL methods always generates better or similar results
than using each method independently. Equal performance is only obtained for the Business
Application Credit Score, where the blend yields a statistically significant increase in AUC only
over the models using the $X_{GNN,N2V}$ features; it does not produce an increment over the
models using only the $X_{EgoNet}$ features. In all other scenarios, the GRL combination
generates statistically significant increments, independent of the method used and
whether or not the $X_{NodeStats}$ features are incorporated. In this way, our approach allows us to
increase discriminatory power in assessing creditworthiness, generate more accurate models, and
make better use of graph data through a framework that combines multiple GRL methods.
of Application or Behavioral Scoring for individuals or for businesses. All analyses are conducted
with the feature set A+B+C+D+E, which incorporates all the features and is the one that
reports the best results.
[Figure panels: (a) Application: average impact on model output; (b) Application: impact on model
output; (c) Behavioral: average impact on model output; (d) Behavioral: impact on model output.]
As expected, the most significant contribution to the model is the BenchScore, which already
summarizes valuable information about each company that allows the estimation of its
creditworthiness. This influence occurs in both scenarios, Application and Behavioral. However, its
importance is greater in Behavioral Scoring.
Among these 15 relevant attributes in both scenarios, only the BenchScore, commercial debt
amount (NODEATT 08), and unused revolving credit amount (NODEATT 05) correspond to
the business-related characteristics. The remaining top features correspond to Network-related
features.
An additional relevant feature is the average BenchScore of the company’s ego network, in-
cluding only the non-bridge edges. This result indicates the creditworthiness of the company’s
neighborhood is also highly predictive of the company’s creditworthiness. In Application Scoring,
this feature is practically as relevant as the BenchScore. See Table A1 for more detail on the
description of the most relevant variables. This table presents the taxonomy of the features used
in the current study, giving the necessary specifications for the correct interpretation of the feature
attributes and the nomenclature used for the management of the datasets.
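To make the ego-network attribute discussed above more concrete, the following minimal sketch
averages a score over a node's neighbours, keeping only the neighbours reached through non-bridge
edges. It is an illustration under stated assumptions and not the feature pipeline of this study:
the ego network is taken to be the node's direct neighbours, bridges are detected with a simple
brute-force reachability test, and all function names and the toy graph are hypothetical.

```python
from collections import defaultdict, deque
from typing import Dict, Hashable, Iterable, Optional, Set, Tuple

Adjacency = Dict[Hashable, Set[Hashable]]


def build_adjacency(edges: Iterable[Tuple[Hashable, Hashable]]) -> Adjacency:
    """Build an undirected adjacency map from an edge list."""
    adj: Adjacency = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def is_bridge(adj: Adjacency, u: Hashable, v: Hashable) -> bool:
    """The edge (u, v) is a bridge if v is unreachable from u without using it."""
    seen = {u}
    queue = deque([u])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if node == u and nxt == v:
                continue  # ignore the edge under test
            if nxt == v:
                return False  # an alternative path exists
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True


def ego_mean_non_bridge(
    adj: Adjacency, attribute: Dict[Hashable, float], node: Hashable
) -> Optional[float]:
    """Mean attribute value over neighbours linked to `node` by non-bridge edges."""
    values = [
        attribute[nbr]
        for nbr in adj[node]
        if nbr in attribute and not is_bridge(adj, node, nbr)
    ]
    return sum(values) / len(values) if values else None


# Toy graph: a triangle A-B-C plus a pendant node D attached to A.
toy_adj = build_adjacency([("A", "B"), ("B", "C"), ("C", "A"), ("A", "D")])
toy_score = {"A": 0.5, "B": 0.2, "C": 0.4, "D": 0.9}
print(ego_mean_non_bridge(toy_adj, toy_score, "A"))
```

In the toy graph, the edge between A and D is a bridge, so only B and C contribute to the
aggregate for A. For large networks, a linear-time bridge-finding algorithm would be preferable
to the brute-force test used here.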
Further, we find attributes that capture the influence of people related to the company,
including its owners; one example is the attribute generated from a Graph Autoencoder trained with
the consumer debt of the EOWNet network. The consumer-debt effect of the ego network is
also observed in the attribute corresponding to the consumer debt weighted by the PageRank of
the node's neighborhood. The presence of consumer debt as a relevant network-related feature
in Business Scoring is highly significant, especially for SMEs. The EgoNet's short-term personal
debt, mainly on the part of the owners, reflects the often blurred separation between personal
finances and company finances: the owner's default can affect the company and vice versa. This
hypothesis requires a more detailed investigation and will be addressed in future work.
To quantify the usefulness, impact, and importance of the different feature sets on the model
output, Figure 5 presents a treemap based on the mean absolute SHAP values; the complete list of
model features is displayed, and the colors indicate the feature set to which each feature
belongs.
Figure 5: Business Credit Scoring: Treemap of Feature Importance, the Average Impact on Model Output
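The aggregation behind such a treemap can be sketched as follows: take the mean absolute SHAP
value of each feature, sum these by feature set, and normalize so that the shares add up to one.
This is a minimal sketch, not the code used to produce the figure; the SHAP matrix, the feature
names, and the function name feature_set_shares are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Sequence


def feature_set_shares(
    shap_values: Sequence[Sequence[float]],
    feature_names: Sequence[str],
    feature_to_set: Dict[str, str],
) -> Dict[str, float]:
    """Share of the total mean |SHAP| attributable to each feature set.

    shap_values holds one row per observation and one column per feature.
    """
    n_rows = len(shap_values)
    if n_rows == 0:
        raise ValueError("shap_values must contain at least one row")

    # Mean absolute SHAP value per feature (column-wise).
    mean_abs = [
        sum(abs(row[j]) for row in shap_values) / n_rows
        for j in range(len(feature_names))
    ]

    # Accumulate the per-feature importance into its feature set.
    totals: Dict[str, float] = defaultdict(float)
    for name, value in zip(feature_names, mean_abs):
        totals[feature_to_set[name]] += value

    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("all SHAP values are zero")
    return {set_name: value / grand_total for set_name, value in totals.items()}


# Hypothetical SHAP values for two observations and three features.
toy_shap = [[0.4, -0.2, 0.1], [-0.6, 0.2, -0.1]]
toy_names = ["bench_score", "ego_mean_score", "gnn_emb_1"]
toy_sets = {"bench_score": "A", "ego_mean_score": "D", "gnn_emb_1": "E"}

for set_name, share in feature_set_shares(toy_shap, toy_names, toy_sets).items():
    print(f"{set_name}: {share:.1%}")
```

Taking absolute values before averaging prevents positive and negative contributions of the same
feature from cancelling out.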
Figure 5(a) shows that, in Application Scoring, the feature set $X_{EgoNet}$ (D) contributes
60% of the model's overall impact. The feature set $X_{GNN,N2V}$ (E) contributes 21%, of which
19 percentage points correspond to GNN features and 2 to Node2Vec. The low importance of Node2Vec
features is likely the reason for the limited research on Node2Vec to enhance the prediction of
creditworthiness.
However, in Business Behavioral Scoring, the traditional characteristics represent 48% of
the total impact of the model (see Figure 5(b)), whereas in Business Application Scoring they
represent only 16%. Indeed, the BenchScore attribute alone represents 29% of the total impact.
The feature set $X_{EgoNet}$ (D) represents 30% of the total impact, the average BenchScore of the
ego network being the most relevant attribute.
[Figure panels: (a) Application: average impact on model output; (b) Application: impact on model
output; (c) Behavioral: average impact on model output; (d) Behavioral: impact on model output.]
The combined network features also play an essential role in the final score: the feature sets
$X_{EgoNet}$ (D) and $X_{GNN,N2V}$ (E) represent 25% and 33.4% of the impact in Application
Scoring (see Figure 7(a)), and 18% and 28.3%, respectively, in Behavioral Scoring (see
Figure 7(b)). In both cases, the contribution of Node2Vec features is negligible. The network
feature with the highest impact is the network's amount of unused revolving credit, taken as the
neighborhood average.
Personal Credit Scoring includes attributes generated with both FamilyNet and EOWNet net-
works. When analyzing the network-related features, almost all of them, in both scenarios, are
FamilyNet features. These results show us both the suitability of the network used to characterize
borrowers and the importance of the type of relationship used to build the network. In this study,
family ties are the most appropriate to characterize borrowers as regards the problem of individual
credit scoring.
Figure 7: Personal Credit Scoring: Treemap of Feature Importance, the average impact on model output
7 Conclusions
This study presents an information processing methodology that allows us to assess the additional
value of social-interaction data to approach the credit scoring problem for thin-file clients. This
framework is applied in four scenarios arising from the consideration of all combinations of Appli-
cation and Behavioral Scoring of individual and business lending. Additionally, this methodology
allows the evaluation of different GRL approaches to feature extraction from social networks: hand-
crafted feature engineering, Node2Vec, and Graph Neural Networks. The results show an improve-
ment in creditworthiness assessment performance when different GRL approaches are combined.
Specifically, two of the three GRL methods significantly enhance credit scoring models, namely the
Hand-crafted Feature Engineering and Graph Neural Networks, which have the greatest impact
when used together. We believe this to be very relevant to the community because, until now,
these two methods have been used independently. On the other hand, we have found that the
contribution of Node2Vec is negligible. This result seems to justify the limited research conducted
on Node2Vec as a feature-engineering method for credit scoring.
As a baseline, we use a credit scoring model developed by a financial institution. This model,
the BenchScore, already outperforms the model that the institution obtains from the credit
bureaus, and our methodology obtains better results in each of the four scenarios.
The highest value of the proposed approach is found in Unbanked Application Scoring. Unbanked
applicants, both individuals and companies, lack behavioral information, which, as it turns
out, is one of the best predictors of creditworthiness. Our approach overcomes the lack of behavioral
information and delivers a proper credit risk assessment using graph data. In this way, applicants
gain greater access to the financial system. In the case of the Behavioral Scoring models, our
methodology also improves performance. In both cases, the maximum improvement in predictive
performance is achieved when these GRL methods are used together.
Explanatory measures, such as SHAP values, allow us to understand each attribute's contribution.
When the impact on the model output is measured in this way, the baseline model (BenchScore),
although it continues to be an essential attribute, has a diminished effect because it is in the
presence of other good predictors. This feature importance analysis shows that we cannot examine
a company's characteristics alone to evaluate it, especially if it is unbanked. We also have to
understand that the company is part of an ecosystem in which the owners, suppliers, clients, and
related companies are essential. The business ecosystem information allows us to improve the
creditworthiness assessment performance. A similar situation occurs in Personal Credit Scoring,
although with less intensity. The network data allows us to address the scarcity of information
and achieve a better credit risk assessment.
Our research shows that there is still room for improvement in incorporating network information
into the credit scoring problem. This methodology goes in the right direction, improving the
performance of creditworthiness assessment, and it has great value for unbanked and under-banked
people and even for the management of portfolio credit risk.
Acknowledgments
This work would not have been accomplished without the financial support of CONICYT-PFCHA
/ DOCTORADO BECAS CHILE / 2019-21190345. The second author acknowledges the support
of the Natural Sciences and Engineering Research Council of Canada (NSERC) [Discovery Grant
RGPIN-2020-07114]. This research was undertaken, in part, thanks to funding from the Canada
Research Chairs program. The last author thanks the partial support of FEDER funds through
the MINECO project TIN2017-85827-P and the European Union’s Horizon 2020 research and
innovation program under the Marie Sklodowska-Curie grant agreement No 777720.
References
Akoglu, H. (2018, 08). User’s guide to correlation coefficients. Turkish Journal of Emergency
Medicine, 18 (3), 91–93. doi: 10.1016/j.tjem.2018.08.001
Anderson, R. A. (2022). Credit intelligence and modelling: Many paths through the forest of credit
rating and scoring. Oxford University Press.
Arsov, N., & Mirceva, G. (2019). Network embedding: An overview. arXiv preprint
arXiv:1911.11726.
Aziz, S., & Dowling, M. (2019). Machine learning and ai for risk management. In Disrupting
finance: Fintech and strategy in the 21st century (pp. 33–50). Cham: Springer International
Publishing. doi: 10.1007/978-3-030-02330-0\ 3
Baidoo, E. (2020). A credit analysis of the unbanked and underbanked: an argument for alternative
data (PhD dissertation). Analytics and Data Science Institute, Kennesaw State University.
Bravo, C., Thomas, L. C., & Weber, R. (2015). Improving credit scoring by differentiating defaulter
behaviour. The Journal of the Operational Research Society, 66 (5), 771–781.
Bravo, C., & Óskarsdóttir, M. (2020). Evolution of credit risk using a personalized pagerank
algorithm for multilayer networks. arXiv preprint arXiv:2005.12418.
Breiman, L. (2001). Random forests. Machine learning, 45 (1), 5–32.
Carta, S., Ferreira, A., Reforgiato Recupero, D., & Saia, R. (2021). Credit scoring by leverag-
ing an ensemble stochastic criterion in a transformed feature space. Progress in Artificial
Intelligence, 10 (4), 417–432.
24
Cnudde, S. D., Moeyersoms, J., Stankova, M., Tobback, E., Javaly, V., & Martens, D. (2019).
What does your facebook profile reveal about your creditworthiness? using alternative data
for microfinance. Journal of the Operational Research Society, 70 (3), 353-363. doi: 10.1080/
01605682.2018.1434402
Cusmano, L. (2018). SME and entrepreneurship financing: The role of credit guarantee schemes
and mutual guarantee societies in supporting finance for small and medium-sized enterprises.
OECD SME and Entrepreneurship Papers, No. 1 . doi: 10.1787/35b8fece-en
Defferrard, M., Bresson, X., & Vandergheynst, P. (2016). Convolutional neural networks on graphs
with fast localized spectral filtering. In Advances in neural information processing systems
(pp. 3844–3852).
Diallo, B., & Al-Titi, O. (2017). Local growth and access to credit: Theory and evidence. Journal
of Macroeconomics, 54 , 410-423. (Banking in Macroeconomic Theory and Policy) doi:
https://doi.org/10.1016/j.jmacro.2017.07.005
Djeundje, V. B., Crook, J., Calabrese, R., & Hamid, M. (2021). Enhancing credit scoring with
alternative data. Expert Systems with Applications, 163 , 113766. doi: https://doi.org/
10.1016/j.eswa.2020.113766
Fang, F., & Chen, Y. (2019). A new approach for credit scoring by directly maximizing the
kolmogorov–smirnov statistic. Computational Statistics & Data Analysis, 133 , 180–194.
Fey, M., & Lenssen, J. E. (2019). Fast graph representation learning with pytorch geometric.
arXiv preprint arXiv:1903.02428.
Flach, P. A. (2012). Machine learning - the art and science of algorithms that make sense of data.
Cambridge University Press.
Freedman, S., & Jin, G. Z. (2017). The information value of online social networks: Lessons from
peer-to-peer lending. International Journal of Industrial Organization, 51 , 185 - 222. doi:
https://doi.org/10.1016/j.ijindorg.2016.09.002
Friedman, J. H. (2001). Greedy function approximation: a gradient boosting machine. Annals of
statistics, 1189–1232.
Goel, A., & Rastogi, S. (2021). Credit scoring of small and medium enterprises: a behavioural
approach. Journal of Entrepreneurship in Emerging Economies.
Grover, A., & Leskovec, J. (2016). Node2vec: Scalable feature learning for networks. In Proceedings
of the 22nd acm sigkdd international conference on knowledge discovery and data mining
(p. 855–864). New York, NY, USA: Association for Computing Machinery. doi: 10.1145/
2939672.2939754
Hagberg, A., Swart, P., & SChult, D. (2008). Exploring network structure, dynamics, and function
using networkx. In In proceedings of the 7th python in science conference (scipy (pp. 11–15).
Hamilton, W. L., Ying, R., & Leskovec, J. (2017). Representation learning on graphs: Methods
and applications. arXiv preprint arXiv:1709.05584.
Hurley, M., & Adebayo, J. (2017). Credit scoring in the era of big data. Yale Journal of Law and
Technology, 18 (1), 5.
Kaufman, S., Rosset, S., Perlich, C., & Stitelman, O. (2012, December). Leakage in data mining:
Formulation, detection, and avoidance. ACM Trans. Knowl. Discov. Data, 6 (4). doi: 10
.1145/2382577.2382579
Kipf, T. N., & Welling, M. (2016a). Semi-supervised classification with graph convolutional
networks. arXiv preprint arXiv:1609.02907.
Kipf, T. N., & Welling, M. (2016b). Variational graph auto-encoders. arXiv preprint
arXiv:1611.07308.
Kleinberg, J. M. (1999, 9). Authoritative sources in a hyperlinked environment. J. ACM , 46 (5),
25
604–632. doi: 10.1145/324133.324140
Kozodoi, N., Lessmann, S., Papakonstantinou, K., Gatsoulis, Y., & Baesens, B. (2019). A multi-
objective approach for profit-driven feature selection in credit scoring. Decision Support
Systems, 120 , 106-117. doi: https://doi.org/10.1016/j.dss.2019.03.011
Leskovec, J., & Sosič, R. (2016). Snap: A general-purpose network analysis and graph-mining
library. ACM Transactions on Intelligent Systems and Technology (TIST), 8 (1), 1.
Lundberg, S. M., & Lee, S.-I. (2017). A unified approach to interpreting model predictions. In
Proceedings of the 31st international conference on neural information processing systems
(pp. 4765–4774). Red Hook, NY, USA: Curran Associates Inc.
Maldonado, S., Pérez, J., & Bravo, C. (2017). Cost-based feature selection for support vector
machines: An application in credit scoring. European Journal of Operational Research,
261 (2), 656 - 665. doi: https://doi.org/10.1016/j.ejor.2017.02.037
Moscato, V., Picariello, A., & Sperlı́, G. (2021). A benchmark of machine learning approaches
for credit score prediction. Expert Systems with Applications, 165 , 113986. doi: https://
doi.org/10.1016/j.eswa.2020.113986
Nargesian, F., Samulowitz, H., Khurana, U., Khalil, E. B., & Turaga, D. (2017). Learning feature
engineering for classification. In Proceedings of the twenty-sixth international joint conference
on artificial intelligence, IJCAI-17 (pp. 2529–2535). doi: 10.24963/ijcai.2017/352
Niu, B., Ren, J., & Li, X. (2019, 12). Credit scoring using machine learning by combing social
network information: Evidence from peer-to-peer lending. Information, 10 (12), 397. doi:
10.3390/info10120397
Óskarsdóttir, M., Bravo, C., Vanathien, J., & Baesens, B. (2018a, 11). Credit scoring for good:
enhancing financial inclusion with smartphone-based microlending. In Proceedings of the
thirty ninth international conference on information systems. San Francisco, California,
USA.
Óskarsdóttir, M., Bravo, C., Vanathien, J., & Baesens, B. (2018b, 7). Social network analytics in
micro-lending. In 29th european conference on operational research (08/07/18 - 11/07/18).
Valencia, Spain.
Óskarsdóttir, M., & Bravo, C. (2021). Multilayer network analysis for improved credit risk pre-
diction. Omega, 105 , 102520. doi: https://doi.org/10.1016/j.omega.2021.102520
Óskarsdóttir, M., Bravo, C., Sarraute, C., Vanthienen, J., & Baesens, B. (2019). The value
of big data for credit scoring: Enhancing financial inclusion using mobile phone data and
social network analytics. Applied Soft Computing, 74 , 26 - 39. doi: https://doi.org/10.1016/
j.asoc.2018.10.004
Óskarsdóttir, M., Bravo, C., Verbeke, W., Baesens, B., & Vanthienen, J. (2018). Effects of network
architecture on model performance when predicting churn in telco.
Óskarsdóttir, M., Bravo, C., Verbeke, W., Sarraute, C., Baesens, B., & Vanthienen, J. (2017).
Social network analytics for churn prediction in telco: Model building, evaluation and network
architecture. Expert Systems with Applications, 85 , 204 - 220. doi: https://doi.org/10.1016/
j.eswa.2017.05.028
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., . . . others (2019). Pytorch:
An imperative style, high-performance deep learning library. Advances in neural information
processing systems, 32 , 8026–8037.
Putra, S. G. P., Joshi, B., Redi, J., & Bozzon, A. (2020). A credit scoring model for smes based
on social media data. In M. Bielikova, T. Mikkonen, & C. Pautasso (Eds.), Web engineering
(pp. 113–129). Cham: Springer International Publishing.
Rabecca, H., Atmaja, N. D., & Safitri, S. (2018). Psychometric credit scoring in indonesia microfi-
26
nance industry: A case study in pt amartha mikro fintek. In The 3rd international conference
on management in emerging markets (icmem 2018) (pp. 620–631). Bali, Indonesia.
Rathi, S., Verma, J. P., Jain, R., Nayyar, A., & Thakur, N. (2022). Psychometric profiling of indi-
viduals using twitter profiles: A psychological natural language processing based approach.
Concurrency and Computation: Practice and Experience, e7029. doi: 10.1002/cpe.7029
Roa, L., Correa-Bahnsen, A., Suarez, G., Cortés-Tejada, F., Luque, M. A., & Bravo, C. (2021).
Super-app behavioral patterns in credit risk models: Financial, statistical and regulatory
implications. Expert Systems with Applications, 169 , 114486. doi: https://doi.org/10.1016/
j.eswa.2020.114486
Roa, L., Rodrı́guez-Rey, A., Correa-Bahnsen, A., & Valencia, C. (2021). Supporting financial
inclusion with graph machine learning and super-app alternative data.
Romero, D. M., Uzzi, B., & Kleinberg, J. (2019). Social networks under stress: Specialized team
roles and their communication structure. ACM Transactions on the Web (TWEB), 13 (1),
1–24.
Ruiz, S., Gomes, P., Rodrigues, L., & Gama, J. (2017). Credit scoring in microfinance using
non-traditional data. In E. Oliveira, J. Gama, Z. Vale, & H. Lopes Cardoso (Eds.), Progress
in artificial intelligence (pp. 447–458). Cham: Springer International Publishing.
Shumovskaia, V., Fedyanin, K., Sukharev, I., Berestnev, D., & Panov, M. (2020). Linking bank
clients using graph neural networks powered by rich transactional data.
Stevenson, M., Mues, C., & Bravo, C. (2021). The value of text for small business default
prediction: A deep learning approach. European Journal of Operational Research, 295 (2),
758-771. doi: https://doi.org/10.1016/j.ejor.2021.03.008
Sukharev, I., Shumovskaia, V., Fedyanin, K., Panov, M., & Berestnev, D. (2020). Ews-gcn: Edge
weight-shared graph convolutional network for transactional banking data.
Tan, T., & Phan, T. Q. (2018). Social media-driven credit scoring: The predictive value of social
structures. Available at SSRN 3217885 .
The Basel Committee on Banking Supervision. (2000, 09). Principles for the management of credit
risk. Basel Committee Publications, 75 .
The Global Financial Index. (2022). The global findex database 2021: Financial inclusion, digital
payments, and resilience in the age of covid-19. (Retrieved from https://openknowledge
.worldbank.org/bitstream/handle/10986/37578/9781464818974.pdf. Accessed July 3,
2022)
Thomas, L., Crook, J., & Edelman, D. (2017). Credit scoring and its applications. SIAM.
Vlasselaer, V. V., Bravo, C., Caelen, O., Eliassi-Rad, T., Akoglu, L., Snoeck, M., & Baesens,
B. (2015). Apate: A novel approach for automated credit card transaction fraud detection
using network-based extensions. Decision Support Systems, 75 , 38 - 48. doi: https://doi.org/
10.1016/j.dss.2015.04.013
Wu, Z., Pan, S., Chen, F., Long, G., Zhang, C., & Yu, P. S. (2020). A comprehensive survey on
graph neural networks. IEEE Transactions on Neural Networks and Learning Systems, 1-21.
doi: 10.1109/TNNLS.2020.2978386
Zeng, G., & Zeng, E. (2019). On the three-way equivalence of auc in credit scoring with tied
scores. Communications in Statistics-Theory and Methods, 48 (7), 1635–1650.
Zhang, M., & Chen, Y. (2018). Link prediction based on graph neural networks. In Advances in
neural information processing systems (pp. 5165–5175).
27
A Feature Description
Table A1: Taxonomy of the features used in the experiments and the nomenclature used for the
management of the data.

Feature Set                  Nomenclature                  Prefix/suffix          Description
Node Features                Identifier                    NODEATT                Feature subset identifier
($X_{Node}$)                 Borrower feature identifier   ATT 01, ..., ATT 04    The debt situation characterized by the delinquency level
                                                           ATT 05, ..., ATT 08    The debt type: revolving, consumer, commercial, or mortgage
                                                           ATT 09, ..., ATT 13    Other aspects of the customer's debt, payments in arrears, and the time in the financial system
                                                           Bench Score            The benchmark score

Node Statistics              Identifier                    NodeStats              Feature subset identifier
($X_{NodeStats}$)            Statistic identifier          DegreeCentr            Degree centrality
                                                           Triads                 Number of triads
                                                           PageRank               PageRank Algorithm
                                                           ArtPoint               Articulation point
                                                           Hits Auth              Hits algorithm Authority score
                                                           Hits Hub               Hits algorithm Hub score
                             Network identifier            EOWNET                 EOWNet Network
                                                           FamilyNet              Family Network

EgoNetwork Aggregation       Borrower feature identifier   ATT 01, ..., ATT 13    Borrower Feature
Features ($X_{EgoNet}$)      Network identifier            EOWNET                 EOWNet Network
                                                           FamilyNet              Family Network
                             Aggregation Function          MEAN                   Mean
                                                           STD                    Standard Deviation
                             Edges                         Full                   All edges
                                                           NotBridge              Edges that are not bridges
                                                           IsBridge               Edges that are bridges
                             Weighted Aggregations         W by + Feature         Suffix for weighted aggregations

Node2Vec Features            Identifier                    N2V                    Feature subset identifier
($X_{EgoNet}$)               Embedding Identifier          EMB 01, ..., EMB 08    The embedding number
                             Network identifier            EOWNET                 EOWNet Network
                                                           FamilyNet              Family Network

Graph Neural Network         GNN Identifier                CHEB                   Graph Convolutional Network (GCN)
Features ($X_{GNN}$)                                       GAE                    Graph Autoencoder
                             Borrower feature identifier   01, ..., 13            Borrower Feature
                             Embedding Identifier          EMB 01, ..., EMB n     The embedding number, where n = 3 for CHEB and n = 8 for GAE
                             Network identifier            EOWNET                 EOWNet Network
                                                           FAMNET                 Family Network