over time in the National Vulnerability Database. A brief analysis of this figure reveals, among other trends, the drastic decline of the susceptibility of Internet Explorer in recent years, but a sudden rise in susceptibility for Windows 10 and Edge about three years ago. The team of experts will be able to recommend an area of focus based on a figure similar to Figure 1.

Now, consider the scenario where the team is performing a detailed analysis regarding a Windows 10 vulnerability — e.g., the operating system mishandles library handling, allowing users to gain privileges to execute arbitrary code. Understanding how such a vulnerability evolved despite monitoring multiple earlier versions of the product would allow the team to resolve the issue in the current version as well as in any other related versions or related products. Figure 2 shows how the vulnerability described evolved over time from another issue involving kernel mode drivers that affected an earlier version of Windows. We explain Figure 2 later in Experimental Results as part of Section V-B as a case study.

In this paper, we use two analytical approaches to aid decision making using vulnerability corpora. The first approach utilizes a temporal topical model that builds on top of our previous effort called Supervised Topical Evolution Model (STEM) [10]. Outcomes such as Figure 1 can be generated using STEM. STEM provides a holistic idea about the vulnerability topics in the dataset. The second approach leverages another of our efforts called Diffusion-based storytelling [1], by providing the ability to describe how a specific threat evolved historically. Overall, STEM is the first stage of the analytic process to better understand the trends in vulnerabilities. Diffusion-based storytelling is the second stage, which provides finer details of the evolution of a specific threat.

In summary, the contributions of this paper are:
1) We provide a high-level holistic view of cyber-security vulnerabilities using a probabilistic graphical model, STEM, that integrates timestamps, annotations, latent themes, and textual data.
2) Through a variety of experiments on a cyber-security corpus, we demonstrate that diffusion-based storytelling can establish evolutionary narratives describing the propagation of vulnerabilities over time from one product to another and between different versions of the same product.
3) We determine the overall susceptibility of major software products to vulnerabilities within a specified time frame, and explore the distribution of susceptibility within distinct product domains.

II. RELATED WORK

There are existing efforts to implement broader data mining algorithms on cyber-security corpora. In fact, data mining algorithms have been increasingly utilized to facilitate knowledge gathering in the cyber-security domain in recent years. Vulnerability prediction has been addressed in many efforts. For instance, text mining techniques are applied by Scandariato et al. to predict software component vulnerabilities in [15]. Han et al. [5] develop a method to predict the severity of a vulnerability by using a one-layer convolutional neural network. Zhang et al. [18] use data mining techniques to try to predict day zero software vulnerabilities using the NVD.

Researchers have also targeted analysis of vulnerabilities through detection of relationships between vulnerabilities [14], [20]. Lin et al. [9] present an association rule discovery algorithm to find correlations between certain aspects of a cyber-security corpus, the Common Weakness Enumeration database2. Shin et al. [16] discover correlations between vulnerabilities by analyzing code complexity, code churn, and developer activity metrics. Such correlations are extended beyond code content to a vulnerability life cycle in [8]. Topic models are used extensively in [11] to discover frequent vulnerability types and new patterns in cyber-security corpora.

One significant limitation of these works is that they do not take into account the temporal aspect of vulnerabilities. This makes it difficult to determine if the relationship discovered by the work is relevant in the present. Our proposed framework is distinct in this manner because we discover trends in vulnerabilities based on the latent structure of the corpus and by incorporating the temporal feature of vulnerability reports. We then go one step further and study the evolution of a vulnerability as a series of connected vulnerability reports.

2 Common Weakness Enumeration (CWE), https://cwe.mitre.org
III. PROBLEM AND SOLUTION STATEMENT

We have a set of vulnerability reports D = {d_1, d_2, ..., d_|D|}, which contain entities or words from E = {e_1, e_2, ..., e_E}. Each document d has a publication date t_d, and a binary distribution of vulnerability labels specific to the document d from Λ_d = [l_1, l_2, ..., l_K], where K is the number of vulnerability labels. Additionally, each document d has a list of affected products distinct to the document d that is a subset of [a_1, a_2, ..., a_P], where P is the number of products referenced in the NVD dataset.

The proposed framework consists of three stages:
1) Given the vulnerability reports D, compute the appearance probability (i.e., topic probability) of each of the K vulnerability labels in each timestamp t.
2) Given the vulnerability reports D, compute the product recurrence (i.e., product susceptibility) of each of the P product labels in each timestamp t.
3) Given a specific vulnerability report d, create a chain of relevant vulnerability reports from the past such that the chain reflects the evolution of the vulnerability reported in d.
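For illustration, the sketch below shows one direct way to realize the outputs of stages 1 and 2 from labeled, timestamped reports. The function names and the dictionary fields are hypothetical, and STEM itself estimates these quantities through a latent topic model rather than by raw counting.

    from collections import Counter, defaultdict

    def label_probability_by_year(reports):
        """Stage 1 (illustrative): fraction of reports carrying each
        vulnerability label in every publication year."""
        label_counts = defaultdict(Counter)   # year -> label -> count
        totals = Counter()                    # year -> number of reports
        for r in reports:                     # r = {"year": 2016, "labels": [...], "products": [...]}
            totals[r["year"]] += 1
            for label in r["labels"]:
                label_counts[r["year"]][label] += 1
        return {y: {lbl: c / totals[y] for lbl, c in counts.items()}
                for y, counts in label_counts.items()}

    def product_susceptibility_by_year(reports):
        """Stage 2 (illustrative): recurrence of each affected product per year."""
        product_counts = defaultdict(Counter)
        for r in reports:
            for product in r["products"]:
                product_counts[r["year"]][product] += 1
        return product_counts

Stage 3, the evolutionary chain, is produced by the diffusion-based storytelling framework described in Section IV-C.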
IV. METHODOLOGY

The National Vulnerability Database (NVD) that we extensively use in this paper contains 111,060 unique reports. The reports describe vulnerabilities and exposures in software products from as early as 1996. The database contains links to the report-documents published by software development companies such as Apple and Microsoft.

Each document in the NVD consists of a Common Vulnerabilities and Exposures3 (CVE) entry, which is a high-level vulnerability description. Each CVE entry is associated with products that are affected by the vulnerability described in the entry. CVE entries in the NVD vary in length but they are usually short paragraphs. We leverage the links provided with each entry to enrich the data by augmenting NVD reports with descriptions from the original source.

3 Common Vulnerabilities and Exposures (CVE), https://cve.mitre.org
A. Dataset Preparation

To prepare the dataset for our experiments, we used distinct attributes present in each NVD document (publication date, CVE entry description, etc.) and labeled documents as a collection of these attributes. We then extracted entities from each document description in the dataset by using standard entity detection approaches [1], [7]. We ignored documents with fewer than four entities and documents that were published before 2000 because they do not contain enough information to be discriminative.
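A minimal sketch of this filtering step, assuming each parsed report has already been reduced to a hypothetical dictionary with an "entities" list and a "year" field (the names are illustrative, not part of the released code):

    def keep_report(report, min_entities=4, min_year=2000):
        """Discard reports that are too short or too old to be discriminative."""
        return len(report["entities"]) >= min_entities and report["year"] >= min_year

    filtered_reports = [r for r in nvd_reports if keep_report(r)]  # nvd_reports: parsed NVD documents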
Each NVD document is also associated with a label that serves as a classifier for the vulnerability described in the document and a set of products that are affected by that vulnerability. We have selected the 50 most frequent vulnerability labels and the 50 most affected products in our experiments with STEM. All vulnerability labels and affected products are utilized in the diffusion-based storytelling framework.

B. STEM

Supervised Topical Evolution Model (STEM) is a probabilistic graphical approach for modeling dynamic document collections with explicit labels. Similar to traditional topic models like LDA [2], STEM views each document as a mixture of underlying topics, and each topic as a distribution over the words. Unlike traditional topic models, STEM learns the topics and their evolution over time by guiding its inference procedure through the incorporation of timestamps and label information of the documents. The generative process for the model is as follows.

1) For k = 1 to K:
   a) φ_k ∼ Dirichlet(β)
2) For each document d ∈ D:
   a) For k = 1 to K:
      i) Λ_dk ∼ Bernoulli(γ_k)
   b) Compute L_d from Λ_d
   c) α_d = L_d × α
   d) θ_d ∼ Dirichlet(α_d)
   e) For each entity e_di ∈ d:
      i) z_di ∼ Multinomial(θ_d)
      ii) e_di ∼ Multinomial(φ_{z_di})
      iii) t_di ∼ Beta(ψ_{z_di})

A list of all the symbols used in the generative process is provided in Table I.

Table I: List of symbols

  D        Set of all the documents
  V        Number of unique entities
  K        Number of unique labels
  N_d      Number of entities in document d
  α, β     Dirichlet priors for the document-topic and topic-word distributions, respectively
  γ        Prior probabilities for the Bernoulli distribution of labels
  θ_d      Multinomial distribution of topics specific to the document d
  Λ_d      Binary distribution of labels specific to the document d
  L_d      Projection matrix for labels in document d
  φ_z      Multinomial distribution of words specific to topic z
  ψ_z      Beta distribution of time specific to topic z
  z_di     The topic associated with the i-th token in the document d
  e_di     The i-th entity in the document d
  t_di     The timestamp associated with the i-th token in the document d
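The generative process can be simulated directly by ancestral sampling. The NumPy sketch below follows the listed steps; it treats "Compute L_d from Λ_d" as restricting the Dirichlet to the labels that are switched on (an LLDA-style reading that we assume here, not necessarily the exact projection used by STEM), and all hyperparameter values are arbitrary.

    import numpy as np

    rng = np.random.default_rng(0)
    K, V, D, N_d = 10, 500, 100, 20                 # labels/topics, vocabulary, documents, tokens per document
    alpha, beta = 0.1, 0.01                         # Dirichlet priors
    gamma = np.full(K, 0.3)                         # Bernoulli priors for the labels
    phi = rng.dirichlet(np.full(V, beta), size=K)   # 1a) phi_k ~ Dirichlet(beta)
    psi = np.full((K, 2), 2.0)                      # Beta parameters of the per-topic timestamp distributions

    corpus = []
    for d in range(D):                              # 2) for each document d
        Lam = rng.binomial(1, gamma)                # 2a-i) Lambda_dk ~ Bernoulli(gamma_k)
        if Lam.sum() == 0:
            Lam[rng.integers(K)] = 1                # guard: keep at least one label active
        active = np.flatnonzero(Lam)
        theta = np.zeros(K)                         # 2b-d) alpha_d = L_d x alpha, theta_d ~ Dirichlet(alpha_d)
        theta[active] = rng.dirichlet(np.full(active.size, alpha))
        z = rng.choice(K, size=N_d, p=theta)        # 2e-i) z_di ~ Multinomial(theta_d)
        e = [rng.choice(V, p=phi[k]) for k in z]    # 2e-ii) e_di ~ Multinomial(phi_{z_di})
        t = [rng.beta(psi[k, 0], psi[k, 1]) for k in z]   # 2e-iii) t_di ~ Beta(psi_{z_di})
        corpus.append({"labels": Lam, "entities": e, "timestamps": t})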
STEM approximates the posterior probabilities of the hidden variables using the Gibbs Sampling method [4] as follows:

$$p(z_{di} \mid \mathbf{z}_{-di}, \mathbf{e}, \mathbf{t}, \alpha, \beta, \psi) \;\propto\; (n_{d,z_{di}} + \alpha - 1)\,\frac{n_{z_{di},e_{di}} + \beta - 1}{\sum_{v=1}^{V}(n_{z_{di},v} + \beta) - 1}\times\frac{t_{di}^{\,\psi_{z_{di},1}-1}\,(1-t_{di})^{\,\psi_{z_{di},2}-1}}{B(\psi_{z_{di},1},\psi_{z_{di},2})} \qquad (1)$$

For the sake of speed and simplicity, ψ is updated after each Gibbs sample using the method of moments as follows:

$$\hat{\psi}_{z,1} = \bar{t}_z\left(\frac{\bar{t}_z(1-\bar{t}_z)}{s_z^2}-1\right), \qquad \hat{\psi}_{z,2} = (1-\bar{t}_z)\left(\frac{\bar{t}_z(1-\bar{t}_z)}{s_z^2}-1\right) \qquad (2)$$

where t̄_z and s_z indicate the mean and standard deviation of all the timestamps belonging to topic z, respectively. Refer to [10] for a more detailed derivation.
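The method-of-moments update in Eq. (2) is straightforward to implement; the helper below is an illustrative sketch (not the authors' released code), assuming the timestamps have already been rescaled into (0, 1) for the Beta distribution.

    import numpy as np

    def update_psi(topic_timestamps):
        """Moment-matching estimate of the Beta parameters (psi_z1, psi_z2)
        from the timestamps currently assigned to topic z."""
        t = np.asarray(topic_timestamps, dtype=float)
        t_bar = t.mean()
        s2 = t.var()                                   # s_z^2; the exact variance estimator is our choice
        common = t_bar * (1.0 - t_bar) / s2 - 1.0
        return t_bar * common, (1.0 - t_bar) * common  # (psi_hat_z1, psi_hat_z2)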
C. Diffusion-based story generation

In [1], we presented a diffusion-based framework that mines a large corpus of documents to generate smooth evolving stories given a seed document published recently. The framework divides the timeline of a candidate story into segments and scores it with an objective function F(T, W) defined over the segment turning points T and the document weights W. The incoherence of a segment s is defined as

$$\text{incoherence}(s) = \frac{\sum_{i,j}^{|D|\times|D|} w_i \ast w_j \ast \text{soergel}(d_i, d_j) \ast |t_i - t_j|}{\sum_{i,j}^{|D|\times|D|} w_i \ast w_j} \qquad (3)$$

where $w_i \in W$ is the weight for document $d_i$, and the pairwise time-membership terms of a segment are

$$\Phi(t_i, ts_L, ts_H) \ast \Phi(t_j, ts_L, ts_H) \qquad (5)$$

$$\Phi(t_i, ts_L, ts_H) \ast (1 - \Phi(t_j, ts_L, ts_H)) \qquad (6)$$

where ts_{L,H} are the lower/upper turning points for a particular segment and Φ is defined as:

$$\Phi(t, ts_L, ts_H) = \begin{cases} \frac{1}{\sqrt{2\pi\hat{\sigma}^2}}\, e^{-\frac{(t - ts_L - \hat{\sigma}^2)^2}{2\hat{\sigma}^2}} & \text{if } t \le ts_L \\ \frac{1}{\sqrt{2\pi\hat{\sigma}^2}} & \text{if } ts_L < t < ts_H \\ \frac{1}{\sqrt{2\pi\hat{\sigma}^2}}\, e^{-\frac{(t - ts_H + \hat{\sigma}^2)^2}{2\hat{\sigma}^2}} & \text{if } ts_H \le t \end{cases} \qquad (7)$$

where σ̂ is the standard deviation of a Gaussian distribution that indicates the degree of membership of a timestamp in a segment.

The third term in the objective function is the overlap penalty, which prevents having two or more turning points very close to each other; it is based on the squared differences $(t_i - t_j)^2$ between the $(|S|-1) \times (|S|-1)$ pairs of turning points. The overall objective combines these terms over all $|S|$ segments:

$$F(T, W) = \sum_{s=1}^{|S|}\big(\text{incoherence}(s) \ast \text{similarity}(s)\big) \ast \text{overlap} \ast \text{uniformity} \qquad (10)$$
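The segment-membership function of Eq. (7) is a flat plateau between the two turning points with Gaussian tails outside them. A small sketch follows; the symbol Φ and the exact placement of the σ̂² shift inside the exponents follow our reconstruction of the formula and should be treated as assumptions.

    import math

    def phi(t, ts_low, ts_high, sigma2):
        """Degree of membership of timestamp t in a segment whose lower and
        upper turning points are ts_low and ts_high; sigma2 is the variance."""
        norm = 1.0 / math.sqrt(2.0 * math.pi * sigma2)
        if t <= ts_low:                     # left Gaussian tail
            return norm * math.exp(-((t - ts_low - sigma2) ** 2) / (2.0 * sigma2))
        if t >= ts_high:                    # right Gaussian tail
            return norm * math.exp(-((t - ts_high + sigma2) ** 2) / (2.0 * sigma2))
        return norm                         # flat plateau inside the segment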
V. EXPERIMENTAL RESULTS

In this section we seek to answer the following questions:

1) How does Supervised Topical Evolution Model (STEM) perform compared to other topic modeling approaches in explaining trends in vulnerabilities? (Section V-A)
2) How can an expert analyze the evolution of a specific vulnerability of interest? (Section V-B)
3) How can shifts in product susceptibility to vulnerabilities be detected and explained? (Section V-C)
4) How well does STEM perform on the NVD dataset in terms of quality and coherence of topics? (Section V-D)
5) How well does STEM scale in its parallel execution? (Section V-E)

We implemented two versions of STEM — a sequential version and a parallel one. The parallel version of the STEM model uses 12 processors and is denoted as STEM-P for the rest of this section. The L-BFGS-B optimization algorithm [19] minimizes a GPU-oriented parallel version of the objective function for the diffusion-based storytelling framework.
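To make this optimization step concrete, the sketch below hands a stand-in objective to SciPy's L-BFGS-B routine; the placeholder objective and the toy data are ours and do not reproduce the actual F(T, W) of the framework.

    import numpy as np
    from scipy.optimize import minimize

    def objective(turning_points, doc_times, doc_weights):
        """Placeholder for F(T, W): penalize turning points that sit far
        from the weighted document timestamps they are meant to segment."""
        T = np.sort(turning_points)
        nearest = np.min(np.abs(doc_times[:, None] - T[None, :]), axis=1)
        return float(np.sum(doc_weights * nearest))

    doc_times = np.linspace(2000.0, 2018.0, 200)      # toy publication years
    doc_weights = np.ones_like(doc_times)             # toy document weights W
    T0 = np.array([2004.0, 2009.0, 2014.0])           # initial turning points T
    result = minimize(objective, T0, args=(doc_times, doc_weights),
                      method="L-BFGS-B",
                      bounds=[(doc_times.min(), doc_times.max())] * len(T0))
    print(result.x)                                   # optimized turning points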
We conducted a series of quantitative and qualitative experiments using these algorithms to answer the preceding questions. We compared our model with two contemporary graphical models — Topics Over Time (TOT) [17], which models topical evolutions over continuous time, and Labeled LDA (LLDA) [13], which uses labels to guide the model inference process — to determine the accuracy of our methods. We utilized a real-world dataset — a cyber-security corpus known as the National Vulnerability Database (NVD) — for our experiments.

It is important to note that in our experiments with STEM using products as topics, we refer to topic probability as product susceptibility. Topic probability and product susceptibility are interchangeable in the context of this paper because they reflect the same concept. For example, if the topic Microsoft Edge had high topic probability in 2011, then the topic was highly susceptible in that year because it appeared frequently in the data at that time.

One of the goals of this paper is to study how we can discover connections and trends in vulnerabilities that will help create more effective mitigation strategies and cyber-defense implementations. Usually, this discovery is achieved by analyzing individual aspects of a vulnerability with limited consideration of temporal factors. In the following experiments, we utilize STEM to discover novel correlations in vulnerabilities, with a particular emphasis on both temporal factors and the interplay between vulnerability dependencies. We also compare STEM results with results obtained by LLDA and TOT. Capability-wise, STEM enables both aspects of LLDA and TOT. LLDA is capable of handling labels in topic modeling but does not include time. TOT includes topics over time but does not take labels into consideration. STEM provides a combination of the advantages of both algorithms — labeled topics over time for a massive text corpus.

In Figure 3, we display the top ten highly probable terms discovered by STEM, LLDA, and TOT for vulnerability labels in the NVD dataset. Although we performed this experiment on the ten most prominent vulnerability labels in the dataset, we have displayed only three in the figure due to space considerations. The complete results for this experiment and the source code for this paper can be accessed on a website4 we created. The three listed vulnerability labels are: Permissions and Privileges, Path Traversal, and Resources Errors. Overall, the terms inferred by STEM and LLDA are moderately similar, reflecting the fact that STEM subsumes the capability of LLDA.

However, the terms discovered by TOT differ significantly from the other models. The reason for this difference is that TOT does not consider vulnerability labels as its topics while the other models do. The three topics — Topic 0, Topic 14, and Topic 13 — listed in Figure 3 (right) are based on mere

4 https://analyzing-evolving-trends-of-vulnerabilities-in-nvd-database.weebly.com/
Figure 3: Terms discovered by STEM, LLDA, and TOT for three vulnerability labels in NVD.
Figure 4: Terms discovered by STEM, LLDA, and TOT for three products in NVD.

  STEM / Linux Kernel: denial, service, memory, crash, kernel, function, system, allows, users, linux
  STEM / iPhone OS: component, corruption, denial, service, code, apple, issue, application, applets, vectors
  STEM / Android: kernel, bug, android, application, id, product, qualcomm, driver, linux, versions
  LLDA / Linux Kernel: kernel, linux, users, service, denial, function, system, allows, crash, memory
  LLDA / iPhone OS: apple, ios, issue, service, denial, code, corruption, component, application, products
  LLDA / Android: android, bug, application, kernel, id, product, versions, qualcomm, driver, code
  TOT / Topic 13: service, denial, users, linux, kernel, function, allows, crash, code, memory
  TOT / Topic 9: os, mac, users, allows, x, service, denial, code, linux, apple
  TOT / Topic 10: android, versions, code, issue, kernel, id, application, product, data, access
Figure 7: Evolutionary narrative of CVE-2016-3346, which describes a Windows Permissions Enforcement Elevation of Privilege Vulnerability. The vulnerability reports in the chain read, in order:
  1. "The kernel-mode drivers in Microsoft Windows allow local users to bypass the ASLR protection mechanism via a crafted function call, causing memory disclosure."
  2. "Buffer overflow in the kernel-mode drivers in Microsoft Windows allows local users to gain privileges via a crafted application."
  3. "Buffer overflow in the Network Driver Interface Standard (NDIS) implementation in Microsoft Windows allows users to gain privileges via a crafted application."
  4. "The SMB server component in Microsoft Windows allows local users to gain privileges via a crafted application that forwards a request to an unintended service."
  5. "Microsoft Windows does not properly enforce permissions, which allows local users to obtain administrator access via a crafted DLL (dynamic link library)."

Below are the results from CVE-2012-0175, which has a vulnerability score of 9.3 on a scale of 1-10, making it one of the most critical security vulnerabilities. CVE-2012-0175 describes a remote code execution vulnerability ("Command Injection Vulnerability") caused in the shell of Microsoft Windows when a user opens a specially crafted file or directory. The first CVE in this chain describes how an integer overflow vulnerability caused by a crafted AVI (Audio Video Interleaved) file leads to arbitrary code execution. The subsequent CVE bulletins
dispersion coefficient of 1.0 to ensure that there is a smooth transition of topics.

In Figure 6, we display the evolutionary narrative of CVE-2017-0050, a document in the NVD dataset that describes an elevation of privilege vulnerability caused by faulty input validation in Microsoft Windows. The first document in the narrative describes the likely origin of this vulnerability as being a scaling error in Windows kernel-mode drivers that occurred in 2015. The following documents in the narrative describe similar flaws in the Microsoft Windows kernel or its components. The relations between previous CVEs and the seed CVE, CVE-2017-0050, can be clearly seen. For instance, the document labeled CVE-2016-7216 (4th from left in Figure 6) describes how the kernel API mishandles permissions, which in turn allows for escalation of privilege errors. In the seed document, the kernel does not properly enforce permissions, so although the mishandling of permissions was patched in the kernel, the kernel itself did not enforce these new permissions, leading to a denial of service. Note that the documents in the narrative are all related to kernel vulnerabilities, suggesting that vulnerabilities affecting Microsoft Windows components are heavily dependent on previous vulnerabilities that affect the same components.

Figure 7 illustrates the evolutionary narrative of CVE-2016-3346. This particular CVE describes a bug in the loading of a dynamic link library (DLL). The bug leads to elevation of privilege, which causes information disclosure or remote code execution. A possible origin of the vulnerability can be pinpointed to security feature bypass and buffer overflow vulnerabilities in the kernel mode drivers of Microsoft Windows. This implies that the vulnerability documented in CVE-2016-3346 is kernel-based, rather than an inherent DLL issue. Additional documents in the narrative provide further background on the adaptation of the vulnerability, offering a fascinating example that displays the progression of one vulnerability to another. As a whole, the narrative reinforces the concept that vulnerabilities are multifaceted, and they often have dependencies that are not easily visible to experts.

In Figure 2, we present the evolutionary narrative for the document labeled CVE-2015-6218, which describes a high impact vulnerability in Microsoft Windows 10 that causes elevation of privilege and possibly remote code execution because of improper input validation before loading libraries. The first document found by the storytelling algorithm describes a race condition vulnerability in the kernel mode drivers of an earlier version of Microsoft Windows, as Microsoft Windows 10 was released in 2015. The next documents in the narrative describe similar vulnerabilities in the kernel mode drivers of an earlier version of Microsoft Windows. Note that all vulnerabilities in the chain can be exploited via a crafted application. In the final two documents of the narrative, a shift from driver exploitation to library exploitation is seen, but the vulnerability still has the same impact on product security. In summary, this narrative explores the impact of a single vulnerability on the security of multiple products, with a focus on how vulnerabilities in one product contribute to vulnerabilities in others.

Overall, our case study into the susceptibility of Microsoft Windows 10 suggests that vulnerabilities do evolve over time. It also suggests that vulnerability development is based on several factors, such as the dependence of vulnerabilities on previous issues and the cross-product nature of vulnerability adaptation.

C. Shifts in Product Susceptibility

Figure 8: (left) Topic probability of ten vulnerability labels since 2000; (middle) Product susceptibility of ten products since 2000; (right) Susceptibility of five Apple products over the last ten years.

In order to explain shifts in product susceptibility, we perform an empirical comparison test. Figure 8 (left) shows the topic probability of ten vulnerability labels since 2000 and Figure 8 (middle) shows the product susceptibility of ten products since 2000. Comparing these two figures explains some of the shifts in susceptibility. However, other factors such as product use and product deprecation, which have an undeniable impact on susceptibility, are not considered here. For instance, Figure 8 (middle) shows that since 2000, certain products, such as Edge and Windows Server, have appeared that completely changed the susceptibility dynamic. However, Safari remains an exception. It becomes susceptible around 2003, when it was released to the public, and it has a high susceptibility until about 2014, when its susceptibility abruptly falls. Similarly, the topic Numeric Errors in Figure 8 (left) grows especially probable in the time range 2003-2014, before abruptly declining as well.
Of course, there are multiple reasons for the shift in susceptibility for Safari, but Numeric Errors vulnerabilities undoubtedly play a part as they are the
only vulnerabilities with a similar probability tendency out of
all the given topics.
Another shift that can be partially explained using this
pragmatic method is the sudden susceptibility of both Mi-
crosoft Edge and Windows Server 2012. Although the shifts
in susceptibility for these products can be partly credited to
the time of their introduction into the product ecosystem,
the shift in susceptibility is especially drastic and unlike the
natural growth of susceptibility observed thus far. However,
an analysis of overall vulnerability distribution offers some
explanation for this phenomenon. Both the topics NULL Pointer
and Data Handling in Figure 8 (left) show similar growth
patterns suggesting that a proliferation of these vulnerabilities
contributes to an equal if not greater increase in susceptibility
for the products Edge and Windows Server 2012. Once again,
this assertion stands for these particular vulnerabilities because they are the only ones that have a comparable probability variation during this time period.

Figure 9: Coherence of topics generated by STEM, LDA, LLDA, and TOT on the NVD dataset. STEM outperforms all.
Our final experiment using STEM in the realm of product
susceptibility is an analysis of shifts that occur when a new
software product is introduced to the public. In Figure 8
(right), we assess product susceptibility over ten years since
the introduction of iPhone OS. For this experiment, we chose
to only study products that were created by Apple, the com-
pany that developed iPhone OS, to display the impact of
product introduction on a company’s overall product suscep-
tibility. An analysis of this experiment reveals that since the
introduction of iPhone OS there has been a steady decline
in susceptibility for iTunes, Apple TV, and Safari, but there
has been an aggressive growth in susceptibility for iPhone
OS. Now, iPhone OS is the most susceptible product created
by Apple, but the susceptibility of Mac OS, which remained
steady for most of the period, is on the rise as well. This
relationship suggests that substantial shifts occur in the product
susceptibility domain once a new product enters the market,
but it also demonstrates that product susceptibility is volatile even if a product introduction has not occurred, such as in the case of Mac OS.

Figure 10: Perplexity of STEM, LLDA, STEM-P and TOT models on the NVD dataset.

D. Timestamp Prediction, Coherence, and Perplexity

STEM models the temporal evolution of topics along with word co-occurrence probabilities. To illustrate that the topics generated by STEM can capture time more accurately than other models, we approximated the timestamp of each NVD document from its distribution of topics according to four topic models (STEM, LDA, LLDA, and TOT) as well as a baseline model that predicted timestamps randomly. Support Vector Machine regression was used to predict timestamps for the four topic models, where the feature vectors for regression were the topic distributions in NVD documents. For more information on Support Vector regression refer to [3]. Root Mean Squared Error (RMSE) was used for error measurement and the models were evaluated at various folds (training/test splits). We found that STEM performed as well as TOT, resulting in lower error (RMSE). STEM also had a lower error than LDA and LLDA. The resulting figure is omitted here because of space considerations, but it can be found at the website mentioned in Section V-A.

We use Topic Coherence [12] as a second method to evaluate the quality of the topics generated by STEM. Each generated topic consists of words, and topic coherence is applied to the top N words from each topic. Higher topic coherence is better. Figure 9 shows the coherence of topics generated by STEM, LLDA, TOT, and LDA for different numbers of topics in the NVD dataset. The figure shows that STEM outperforms all models in terms of topic coherence.
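For reference, one widely used formulation of topic coherence (the UMass measure is of this form; we do not claim it is the exact variant used in [12]) scores a topic's top-N words by how often they co-occur in documents:

    import math
    from itertools import combinations

    def umass_coherence(top_words, doc_sets):
        """Coherence of one topic: sum over ordered pairs of top words of
        log((D(w_i, w_j) + 1) / D(w_j)), where D counts containing documents
        and w_j is ranked above w_i; doc_sets is a list of per-document word sets."""
        score = 0.0
        for w_j, w_i in combinations(top_words, 2):      # w_j ranked higher than w_i
            d_wj = sum(1 for s in doc_sets if w_j in s)
            d_both = sum(1 for s in doc_sets if w_j in s and w_i in s)
            if d_wj:
                score += math.log((d_both + 1.0) / d_wj)
        return score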
Perplexity is another widely used metric of convergence in topic modeling. It is measured as the inverse of the geometric mean per-word likelihood. A lower perplexity is an indicator of a better fit to the data. Figure 10 shows the perplexity for STEM, STEM-P, LLDA and TOT on the NVD dataset. STEM and the parallel version of STEM (called STEM-P) converge in fewer iterations than TOT and compete well with LLDA by being very close to it.
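Concretely, perplexity can be computed from the per-word likelihood of held-out documents under the fitted model; a minimal sketch, where theta and phi stand for the learned document-topic and topic-word distributions (our notation, following Table I):

    import numpy as np

    def perplexity(docs, theta, phi):
        """exp(-log-likelihood / number of tokens), with
        p(w | d) = sum_k theta[d, k] * phi[k, w]."""
        log_lik, n_tokens = 0.0, 0
        for d, word_ids in enumerate(docs):        # docs[d] is a list of word ids
            p_w = theta[d] @ phi                   # mixture distribution over the vocabulary
            log_lik += float(np.sum(np.log(p_w[word_ids])))
            n_tokens += len(word_ids)
        return float(np.exp(-log_lik / n_tokens))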
E. Parallel Inference

In this paper, we draw conclusions on the nature of vulnerabilities that affect software products based on only one cyber-security corpus. In order to discover more intricate relationships in this domain, other corpora may be necessary. Considering this, the scalability of STEM becomes of crucial importance as the model needs to be trained for each new dataset, a process that can be computationally intensive and time-consuming. In this subsection, we perform an experiment on the scalability of STEM, focusing on how the model converges during the inference process. In Figure 11, we compare the convergence times of STEM with its parallel version, STEM-P, over ten iterations for datasets with 25, 50, 75, and 100 topics. Our analysis reveals that STEM-P is able to converge 30 to 40 percent faster than the regular STEM model, suggesting that the STEM model has high scalability when implemented using a parallel approach.

Figure 11: Convergence time (in minutes) of STEM and STEM-P.

VII. ACKNOWLEDGMENTS

This material is based upon work supported by the National Science Foundation under Grant No. HRD-1242122.

REFERENCES

[1] R. C. Barranco, A. P. Boedihardjo, and M. S. Hossain, "Analyzing evolving stories in news articles," International Journal of Data Science and Analytics, 2017.
[2] D. M. Blei, A. Y. Ng, and M. I. Jordan, "Latent Dirichlet allocation," Journal of Machine Learning Research, pp. 993–1022, 2003.
[3] C. Cortes and V. N. Vapnik, "Support-vector networks," Machine Learning, pp. 273–297, 1995.
[4] S. Geman and D. Geman, "Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 721–741, 1984.
[5] Z. Han, X. Li, Z. Xing, H. Liu, and Z. Feng, "Learning to predict severity of software vulnerability using only vulnerability description," 2017 IEEE International Conference on Software Maintenance and Evolution (ICSME), 2017.
[6] M. Hossain, P. Butler, S. Boedihardjo, and N. Ramakrishnan, "Storytelling in entity networks to support intelligence analysts," KDD '12, pp. 1375–1383, 2012.
[7] M. A. Kader, A. P. Boedihardjo, S. M. Naim, and M. S. Hossain, "Contextual embedding for distributed representations of entities in a text corpus," in Proceedings of the 5th International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications at KDD 2016, ser. Proceedings of Machine Learning Research, W. Fan, A. Bifet, J. Read, Q. Yang, and P. S. Yu, Eds., vol. 53, 14 Aug 2016, pp. 35–50.
[8] S. Kamara, S. Fahmy, E. Schultz, F. Kerschbaum, and M. Frantzen, "Analysis of vulnerabilities in internet firewalls," Computers and Security