The purpose of this study is to propose a fuzzy clustering model of information appliances (IA). The model has two sub-models: the information appliance cluster engine (IACE) and the user interactive model (UIM). The IACE processes users' recognition of IA devices, and the UIM is the interface between the IACE and the IA intelligent agents (IAIA). With the proposed model, the IAIA can be more humanistic and more convenient for users. Also, via the implementation of this model, we can have the analysis of the optimal effectiveness
As the worldwide Internet continues to develop, making a profit by lending registered domain names has emerged as a new business in recent years. Unfortunately, the larger the domain lending market becomes, the greater the risk that malicious behavior or malware hides behind parked domains. Moreover, previous work on identifying parked domains suffers from two main defects: 1) too much data-collection effort and CPU latency for feature engineering, and 2) ineffectiveness at detecting parked domains containing external links, which are often abused by attackers, e.g., for drive-by download attacks. To alleviate these defects without sacrificing practical usability, this paper proposes ParkedGuard, an efficient and accurate parked domain detector. Several scripting behavioral features were analyzed, and those with special statistical significance are adopted in ParkedGuard to make feature engineering much more cost-efficient. On the other hand, fi...
2017 10th International Conference on Ubi-media Computing and Workshops (Ubi-Media), 2017
Hacking competitions play an important role in cybersecurity education, especially in increasing students' learning motivation and practical experience. Proper feedback can promote students' reflective thinking and improve learning performance, so collecting logs is essential for effective portfolio assessment. However, a conventional hacking competition setup is a heterogeneous environment that includes different applications, operating systems, and virtual machines, and these heterogeneous logs must be unified to form high-quality portfolios. In this paper, we first propose a hacking competition ontology to represent concepts within competitions and their relationships. The descriptive power of the ontology can link heterogeneous logs to participants' intentions. Based on the proposed ontology, we develop an automatic hacking competition assessment framework that includes log collection, aggregation, and analysis. Experimental results show that the proposed assessment framework can reveal participants' misconceptions within a competition.
2018 IEEE Conference on Dependable and Secure Computing (DSC), 2018
Malware and backdoors usually hide their malicious activities by communicating with C&C servers over HTTP. Various stealth techniques have been developed, which directly cause security systems to fail to detect attackers' activities. This paper focuses on profiling the fingerprints of browsers running on different client-side hosts to detect anomalies in outbound HTTP traffic at the behavioral and semantic levels. Patterns describing fake headers are also designed using a graph structure and become significant features for the subsequent detection in the proposed method. The performance of the proposed approach is evaluated with data from a realistic environment and compared with the state of the art. Results show that the proposed method delivers accuracy of up to 99%, and for counterfeit fingerprint detection it even achieves 100% recall, while the alternative approach failed completely under this scenario.
2017 IEEE Conference on Dependable and Secure Computing, 2017
Cyber-criminals use various malware technologies to bypass antivirus software. For example, drive-by downloads happen without a person's knowledge when visiting a website, viewing an email message, or clicking on a deceptive pop-up window. One way to understand drive-by download attacks is to study the connections between different drive-by download behaviors during the installation phase. However, current solutions need a large number of browsing records from ISPs to build up a model. Insufficient historical browsing data may prevent this approach from working. In this study, we propose Ziffersystem, a system that identifies the suspicious connections in a targeted enterprise. We develop a graph-based model of malicious orchestrated behaviors. Ziffersystem does not need large-scale network data (e.g., IPS traffic) to model malicious activity, and therefore the system is useful for an enterprise with few in-house blacklists and highly sensitive data. We apply the proposed system to the analysis of blacklists from public and private sources, and we show its effectiveness for visualizing malicious download behavior that cannot be identified through piecewise event logs.
Several common file synchronization services (such as Google Drive and Dropbox) are employed as infrastructure for command and control (C&C) and data exfiltration in so-called Man-in-the-Cloud (MITC) attacks. MITC attacks are not easily detected by common security measures because they do not use any exploits, and re-configuration of these services can easily turn them into an attack tool. In this study, we propose the Interactive Visualization Threats Explorer, which helps analysts intuitively become aware of potential cloud threats hidden in data and thereby significantly improves analysis effectiveness. Drill-down and quick-response visualization analytics provide cloud administrators with full and deep views of the relations between cloud resources and user behavior. In addition, a Collaborative Risk Estimator, which considers users' social and business workflow behavior, enhances analysis performance. By learning from the past behavior of individual users and their social network relations, it rolls up behavior models that continue to adapt to changes in the enterprise environment. Analysts can quickly locate high-risk access behavior from abnormal cloud resource access and drill down into the unusual patterns and access behavior. To illustrate the effectiveness of this approach, we present example explorations on two real-world data sets for the detection and understanding of potential Advanced Persistent Threats in progress.
2015 International Carnahan Conference on Security Technology (ICCST), 2015
In cyber security monitoring, it is not rare that what you see cannot be fully trusted. Because of various camouflage tricks, such as packing or virtual private networks (VPN), detecting an "advanced persistent threat" (APT) with only a signature-based malware detection system becomes more and more intractable. On the other hand, by carefully modeling the sequences of users' daily routine behaviors, the probability that an account generates certain operations can be estimated and used in anomaly detection. To the best of our knowledge, this project is the first to propose a behavioral analytic framework dedicated to analyzing Active Directory domain service logs and monitoring potential insider threats. Experiments on a real dataset not only show that the proposed idea explores a new feasible direction for cyber security monitoring, but also give a guideline on how to deploy this framework in various environments.
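The idea of estimating how likely an account is to generate a given operation can be illustrated with a minimal sketch. It assumes a first-order Markov chain with additive smoothing, which is a stand-in chosen for illustration and not necessarily the model used in the paper; the operation names and the helper names `fit_transitions` and `avg_neg_log_likelihood` are hypothetical.

```python
from collections import Counter, defaultdict
import math


def fit_transitions(sequences, alpha=1.0):
    """Estimate smoothed first-order transition probabilities P(next | current)."""
    counts = defaultdict(Counter)
    vocab = set()
    for seq in sequences:
        vocab.update(seq)
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    size = len(vocab)

    def prob(cur, nxt):
        # Additive smoothing; one extra slot is reserved for unseen operations.
        total = sum(counts[cur].values())
        return (counts[cur][nxt] + alpha) / (total + alpha * (size + 1))

    return prob


def avg_neg_log_likelihood(prob, seq):
    """Higher values mean the sequence is less typical of the training data."""
    pairs = list(zip(seq, seq[1:]))
    if not pairs:
        return 0.0
    return -sum(math.log(prob(c, n)) for c, n in pairs) / len(pairs)


# Hypothetical operation names, not taken from real Active Directory logs.
history = [
    ["logon", "read_mail", "open_share", "logoff"],
    ["logon", "read_mail", "logoff"],
    ["logon", "open_share", "read_mail", "logoff"],
]
prob = fit_transitions(history)
usual = avg_neg_log_likelihood(prob, ["logon", "read_mail", "logoff"])
unusual = avg_neg_log_likelihood(prob, ["logon", "add_admin", "dump_hashes", "logoff"])
```

In this sketch a sequence containing operations the account has never produced scores higher than a routine one, so a threshold on the score could flag it for review. Choosing that threshold is left open here.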
2009 2nd International Conference on Computer Science and its Applications, 2009
An adaptation mechanism is important for false alarm reduction in an Intrusion Detection System (IDS), because it addresses environment changes and alerts wrongly triggered by irrelevant signatures. In this study, we propose a Weighted Score-based Rule Adaptation (WSRA) mechanism that uses experts' feedback to reduce the massive number of false alarms produced by an IDS. The rule set is generated by a rule learner (e.g., RIPPER) to identify false alerts, and each rule carries a score that represents its availability. The weighted score-based rule adaptation adjusts the score according to the incoming labeled information from experts. We also introduce concept-level features for false alarm reduction, which make it easier to retrieve feedback from experts. WSRA makes the following contributions: (a) it automatically adapts to network environment changes to identify false alarms; (b) it proposes a new weighted score-based rule adaptation mechanism; and (c) thanks to concept-level features, it is easier to present the rules to experts when retrieving their feedback. Evaluations on one benchmark dataset (DARPA 99) support our approach. The proposed mechanism performs better in false alarm reduction than mechanisms without adaptation.
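The score adjustment described in this abstract can be pictured with a small sketch. The abstract does not give the update formula, so the exponentially weighted update below, the `weight` and `threshold` parameters, and the rule name are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    score: float = 0.5  # hypothetical starting score in [0, 1]


def adapt(rule, expert_agrees, weight=0.3):
    """Move the rule's score toward 1 when the expert confirms it, toward 0 otherwise."""
    target = 1.0 if expert_agrees else 0.0
    rule.score = (1.0 - weight) * rule.score + weight * target
    return rule.score


def is_trusted(rule, threshold=0.5):
    """Use the rule's 'false alarm' verdict only while its score stays at or above the threshold."""
    return rule.score >= threshold


# Hypothetical rule produced by a rule learner, followed by three expert labels.
rule = Rule("icmp_ping_from_monitoring_host")
for feedback in (True, True, False):
    adapt(rule, feedback)
```

With this kind of update, recent expert labels count more than old ones, which is one simple way a rule set could follow changes in the network environment.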
Electronic Proceedings in Theoretical Computer Science, 2010
Web applications suffer from cross-site scripting (XSS) attacks that result from incomplete or incorrect input sanitization. Learning the structure of attack vectors could enrich the variety of manifestations in generated XSS attacks. In this study, we focus on generating more threatening XSS attacks for state-of-the-art detection approaches that find potential XSS vulnerabilities in Web applications, and we propose a mechanism for structural learning of attack vectors with the aim of generating mutated XSS attacks in a fully automatic way. Mutated XSS attack generation depends on the analysis of attack vectors and the structural learning mechanism. As the kernel of the learning mechanism, we use a Hidden Markov model (HMM) as the structure of the attack vector model to capture the implicit manner of the attack vector; this manner benefits from the syntactic meanings labeled by the proposed tokenizing mechanism. Bayes' theorem is used to determine the number of hidden states in the model for generalizing the structure model. The contributions of the paper are as follows: (1) it automatically learns the structure of attack vectors from practical data analysis to build a structure model of attack vectors; (2) it mimics the manners and elements of attack vectors to extend the ability of testing tools to identify XSS vulnerabilities; and (3) it helps verify the flaws of blacklist sanitization procedures in Web applications. We evaluated the proposed mechanism with Burp Intruder on a dataset collected from public XSS archives. The results show that mutated XSS attack generation can identify potential vulnerabilities.
2006 IEEE International Conference on e-Business Engineering (ICEBE'06), 2006
Chinese financial news titles contain only a few words, so it is hard to measure the similarity between titles by comparing their keywords alone. In this study, we propose a semantic similarity measurement method for Chinese financial news titles based on constructing an Event Frame structure as the template of a title. It considers the relation between the basic meanings of two news titles for similarity measurement. In addition, a semantic similarity function integrates both the relation between the Event Frames of the titles and the relation between their keywords. In this manner, the proposed method can use the Event Frame to differentiate Chinese financial news that mentions the same event from all other Chinese financial news, since it considers the relation between the basic meanings of two news titles and reduces comparison time. The results show that Event Frame extraction has high precision and that the semantic similarity measurement can emphasize the relation between the connotations of two news titles.
Proceedings of the 2007 workshop on Large scale attack defense, 2007
With the rapid evolution of spam, no fully successful solution for filtering spam has been found. However, spammers still spread spam with the same intentions, such as advertising and phishing. In this investigation, we propose an Email Words Social Network (EWSN) mechanism for profiling users' intentions related to interesting and uninteresting e-mails. An EWSN is constructed from the information in an individual user's mailbox and expands e-mail information from the World Wide Web (WWW) via a search engine. Based on the web information and association rules among the words, words and relations are expanded into a social network of words. Via the EWSN, both interested and uninterested EWSNs can be constructed to analyze user intentions. Additionally, an efficient detection mechanism based on the EWSN is proposed to classify e-mails. Finally, the adaptation algorithm of an artificial immune system is applied to the EWSN, which is thus adapted to follow the user's confirmed classification results. The experimental results indicate that the proposed system is very helpful for classifying spam e-mails by analyzing senders' intentions. Some ideas for analyzing people's interests and profiling their backgrounds are also presented.
Given a stream of time-stamped events, like alerts in a network monitoring setting, how can we isolate a sequence of alerts that form a network attack? We propose a Sequence Based Attack Detection (SBAD) method, which makes the following contributions: (a) it automatically identifies groups of alerts that are frequent; (b) it summarizes them into a suspicious sequence of activity, representing them with graph structures; and (c) it suggests a novel graph-based dissimilarity measure. As a whole, SBAD is able to group suspicious alerts, visualize them, and spot anomalies at the sequence level. Evaluations on three datasets support our approach: two benchmark datasets (DARPA 1999, PKDD 2007) and a private dataset, Acer 2007, gathered from a Security Operation Center in Taiwan. The method performs well even without the help of IP and payload information. Because it needs no privacy-sensitive information as input, the method is easy to plug into existing systems such as an intrusion detector. In terms of efficiency, the proposed method can deal with large-scale problems, such as processing 300K alerts within 20 minutes on a regular PC.
The purpose of this study is to propose an intelligent agent for extracting web content on the Internet. This agent can automatically collect web pages, generate several kinds of web page templates, extract and simplify the appropriate web pages, and provide headlines to remote users via remote devices. Via the proposed agent, the remote users can easily get the
Thousands of new web pages appear every day, and the diversity of web templates makes it difficult to extract the contents of web pages. In this study, we propose a web page template classification model based on the fuzzy k-means clustering method. This model can automatically collect the web pages, generate several kinds of web pages templates, provide the different kinds
2012 Seventh Asia Joint Conference on Information Security, 2012
Recently, the threat of Android malware has been spreading rapidly, especially repackaged Android malware. Although dynamic analysis can provide a comprehensive view of Android malware, it still incurs a high cost in environment deployment and manual investigation effort. In this study, we propose a static feature-based mechanism that provides a static analysis paradigm for detecting Android malware. The mechanism considers static information, including permissions, deployment of components, Intent message passing, and API calls, to characterize the behavior of Android applications. To recognize the different intentions of Android malware, different kinds of clustering algorithms can be applied to enhance the malware modeling capability. We leverage the proposed mechanism to develop a system called DroidMat. First, DroidMat extracts information (e.g., requested permissions and Intent message passing) from each application's manifest file, and treats components (Activity, Service, Receiver) as entry points from which it drills down to trace API calls related to permissions. Next, it applies the K-means algorithm to enhance the malware modeling capability. The number of clusters is decided by the Singular Value Decomposition (SVD) method on the low-rank approximation. Finally, it uses the kNN algorithm to classify the application as benign or malicious. The experimental results show that the recall rate of our approach is better than that of a well-known tool, Androguard, published at Black Hat 2011, which focuses on Android malware analysis. In addition, DroidMat is efficient: it takes only half the time Androguard needs to classify 1738 apps as benign or malicious.
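The final classification step named in this abstract, kNN over static features, can be sketched as follows. This is a minimal illustration under stated assumptions: the four binary features, the Hamming distance, the toy training set, and the function names are hypothetical, and the K-means and SVD stages described in the abstract are omitted.

```python
from collections import Counter


def hamming(a, b):
    """Number of positions where two equal-length binary feature vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train, query, k=3):
    """Majority label among the k training samples closest to the query."""
    nearest = sorted(train, key=lambda item: hamming(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical features: [SEND_SMS, READ_CONTACTS, INTERNET, calls sendTextMessage]
train = [
    ([1, 1, 1, 1], "malicious"),
    ([1, 0, 1, 1], "malicious"),
    ([1, 1, 0, 1], "malicious"),
    ([0, 0, 1, 0], "benign"),
    ([0, 1, 1, 0], "benign"),
    ([0, 0, 0, 0], "benign"),
]
label = knn_predict(train, [1, 0, 0, 1])
```

An odd `k` avoids tied votes between the two classes. A real feature vector would cover many more permissions, components, Intents, and API calls than this toy example.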
2010 International Conference on Technologies and Applications of Artificial Intelligence, 2010
We propose a graphical signature for intrusion detection given alert sequences. By correlating alerts with their temporal proximity, we build a probabilistic graph-based model to describe a group of alerts that form an attack or normal behavior. Using the models, we design a pairwise measure based on manifold learning to measure the dissimilarities between different groups of alerts. A large dissimilarity implies different behaviors between the two groups of alerts. Such a measure can therefore be combined with regular classification methods for intrusion detection. The proposed method makes the following contributions: (a) it automatically identifies groups of alerts that are frequent; (b) it summarizes them into a suspicious sequence of activity, representing them with graph structures; and (c) it suggests a novel graph-based dissimilarity measure. We evaluate our framework mainly on Acer 2007, a private dataset gathered from a well-known Security Operation Center in Taiwan. The performance on the real data suggests that the proposed method can achieve high detection performance in attack coverage and tolerate attack variations. Because it needs no privacy-sensitive information as input, the method is easy to plug into existing systems such as an intrusion detector. Moreover, the graphical structures and the representation from manifold learning naturally provide visualized results suitable for further analysis by domain experts.
2006 IEEE International Conference on Systems, Man and Cybernetics, 2006
Network Motif Model: An Efficient Approach for Extracting Features from Relational Data. Chiung-Wei Huang ... 2006 IEEE International Conference on Systems, Man, and Cybernetics, October 8-11, 2006, Taipei, Taiwan.
2012 IEEE 11th International Conference on Trust, Security and Privacy in Computing and Communications, 2012
With the emergence of cloud computing, an increasingly greater number of innovative applications are being built on cloud computing platforms (e.g., Elastic Compute Cloud by Amazon, Windows Azure by Microsoft, and Cloud Foundry by VMware). However, these cloud applications are also prone to risks and potential vulnerabilities in their system life cycle. In this study, a cloud security self-governance deployment framework is proposed from the system development life cycle perspective (Cloud SSDLC), and especially from government and industry perspectives. The cloud SSDLC incorporates the secure system development life cycle (SSDLC), cloud security critical domain guidelines, and risk considerations. According to the SSDLC, there are five main phases in Cloud SSDLC: initiation, development, implementation, operation, and destruction. Furthermore, critical cloud security domains and corresponding risks are integrated into each phase. From the industry and government perspective, different cases are used to demonstrate practical usage and legal issues in the proposed Cloud SSDLC. The main contribution is the provision of a framework to connect the SSDLC and cloud computing paradigm for enhancing cloud applications.
The purpose of this study is to propose a fuzzy clustering model of information appliances (IA). ... more The purpose of this study is to propose a fuzzy clustering model of information appliances (IA). There are two sub-models, saying information appliance cluster engine (IACE) and user interactive model (UIM), in this model. The function of IACE is to process the users’ recognitions of IA devices. The UIM is the interface of IACE with the IA intelligent agents (IAIA). Via the proposed model, the IAIA can be more humanistic, and convenient for user. Also, via the implementation of this model, we can have the analysis of the optimal effectiveness
As world wild internet has non-stop developments, making profit by lending registered domain name... more As world wild internet has non-stop developments, making profit by lending registered domain names emerges as a new business in recent years. Unfortunately, the larger the market scale of domain lending service becomes, the riskier that there exist malicious behaviors or malwares hiding behind parked domains will be. Also, previous work for differentiating parked domain suffers two main defects: 1) too much data-collecting effort and CPU latency needed for features engineering and 2) ineffectiveness when detecting parked domains containing external links that are usually abused by hackers, e.g., drive-by download attack. Aiming for alleviating above defects without sacrificing practical usability, this paper proposes ParkedGuard as an efficient and accurate parked domain detector. Several scripting behavioral features were analyzed, while those with special statistical significance are adopted in ParkedGuard to make feature engineering much more cost-efficient. On the other hand, fi...
2017 10th International Conference on Ubi-media Computing and Workshops (Ubi-Media), 2017
Hacking competitions serve an important role in cybersecurity education, especially in increasing... more Hacking competitions serve an important role in cybersecurity education, especially in increasing students' learning motivations and practical experiences. A proper feedback can promote students reflective thinking and improve learning performances. It's essential to collect logs for effective portfolio assessment. However, a conventional hacking competition setup is a heterogeneous environment, including different applications, operating systems and virtual machines. It's important to unify these heterogeneous logs to form high quality portfolios. In this paper, we first propose hacking competition ontology to represent concepts within competitions and their relationships. The describing ability of ontology can link heterogeneous logs to participants' intentions. Based on the proposed ontology, we develop an automatic hacking competition assessment framework including log collection, aggregation and analysis. Experimental results show that our proposed assessment framework can reveal participants' misconceptions within competition.
2018 IEEE Conference on Dependable and Secure Computing (DSC), 2018
Malware and backdoor usually hide their malicious activities to communicate with C&C server throu... more Malware and backdoor usually hide their malicious activities to communicate with C&C server through HTTP protocol. Various techniques for stealth are developed which directly leads security system's failure to detect hacker's activities. This paper focuses on profiling fingerprints of browsers running on different client-side host to detect anomaly among outbound HTTP traffics at behavioral and semantic level. Patterns describing fake header are also elaborately designed using graph structure and become significant features in the proposed method for the subsequent detection. Performance of proposed approach are evaluated with data from realistic environment and compare to state-of-the-art. Results show that the proposed method delivers accuracy up to 99%, also for the counterfeit fingerprint's detection it even achieve 100% recall, while the alternative approach totally failed under this scenario.
2017 IEEE Conference on Dependable and Secure Computing, 2017
Cyber-criminals use various malware technologies to bypass antivirus software. For example, drive... more Cyber-criminals use various malware technologies to bypass antivirus software. For example, drive-by downloads happen without a person's knowledge when visiting a website, viewing an email message, or clicking on a deceptive pop-up window. One way to understand drive-by download attacks is to study the connections between different drive-by download behaviors during the installation phase. However, current solutions need a large number of browsing records from ISPs to build up a model. Insufficient historical browsing data may prevent this approach from working. In this study, we propose Ziffersystem, a system that identifies the suspicious connections in a targeted enterprise. We develop a graph-based model of malicious orchestrated behaviors. Ziffersystem does not need large-scale network data (e.g., IPS traffic) to model malicious activity, and therefore the system is useful for an enterprise with few in-house blacklists and highly sensitive data. We apply the proposed system to the analysis of blacklists from public and private sources, and we show its effectiveness for visualizing malicious download behavior that cannot be identified through piecewise event logs.
Several common file synchronization services (such as GoogleDrive, Dropbox and so on) are employe... more Several common file synchronization services (such as GoogleDrive, Dropbox and so on) are employed as infrastructure for being used by command and control(C&C) and data exfiltration, saying Man-in-the-Cloud (MITC) attacks. MITC is not easily detected by common security measures result in without using any exploits, and re-configuration of these services can easily turn them into an attack tool. In this study, we propose Interactive Visualization Threats Explorer that can be with intuition to aware the potential cloud threats hiding in data and eventually improve the analyzing effectiveness significantly. Drill-down and quick response visualization analytics provides cloud administrators full and deep views between cloud resources and users behavior. In addition, Collaborative Risk Estimator which considers users social and business workflow behavior enhance analysis performance. By learning from past behavior of an individual user and social network relations, rolling up behavior models to continue adapt enterprise environment changes. Analyst can quickly aware high risk access behavior locality from abnormal cloud resource access and drill-down the unusual patterns and access behavior. To illustrate the effectiveness of this approach, we present example explorations on two real-world data sets for the detection and understanding of potential Advanced Persistent Threats in progress.
2015 International Carnahan Conference on Security Technology (ICCST), 2015
What you see is not definitely believable is not a rare case in the cyber security monitoring. Ho... more What you see is not definitely believable is not a rare case in the cyber security monitoring. However, due to various tricks of camouflages, such as packing or virutal private network (VPN), detecting "advanced persistent threat"(APT) by only signature based malware detection system becomes more and more intractable. On the other hand, by carefully modeling users' subsequent behaviors of daily routines, probability for one account to generate certain operations can be estimated and used in anomaly detection. To the best of our knowledge so far, a novel behavioral analytic framework, which is dedicated to analyze Active Directory domain service logs and to monitor potential inside threat, is now first proposed in this project. Experiments on real dataset not only show that the proposed idea indeed explores a new feasible direction for cyber security monitoring, but also gives a guideline on how to deploy this framework to various environments.
2009 2nd International Conference on Computer Science and its Applications, 2009
An adaptation mechanism is quite important for false alarm reduction in Intrusion Detection Syste... more An adaptation mechanism is quite important for false alarm reduction in Intrusion Detection System (IDS) for solving the problem of environment change and wrongly trigger from irrelevant signatures. In this study, we proposed a Weighted Score-based Rule Adaptation (WSRA) mechanism from expert’s feedback in order to reduce the massive false alarm produced by IDS. The rule set is generated by rule learner (e.g.: RIPPER) to identify the false alert in addition to a score which represents its availability. The weighted score-based rule adaptation adjusts the score according to the incoming labeled information from expert. Besides, we also propose the concept level features to the false alarm reduction issues for easily retrieving the feedback from experts. We propose WSRA, which makes following contributions: (a) it automatically adapts with the network environment changes to identify false alarms, (b) it proposes a new weighted score-based rule adaptation mechanism, (c) it is easier to demonstrate the rules for retrieving experts feedback benefits from concept level features. The evaluations from one benchmark dataset (DARPA 99) support our approach. The proposed mechanism performs well in false alarm reduction than that other down by mechanism without adaptation consideration.
Electronic Proceedings in Theoretical Computer Science, 2010
Web applications suffer from cross-site scripting (XSS) attacks that resulting from incomplete or... more Web applications suffer from cross-site scripting (XSS) attacks that resulting from incomplete or incorrect input sanitization. Learning the structure of attack vectors could enrich the variety of manifestations in generated XSS attacks. In this study, we focus on generating more threatening XSS attacks for the state-of-the-art detection approaches that can find potential XSS vulnerabilities in Web applications, and propose a mechanism for structural learning of attack vectors with the aim of generating mutated XSS attacks in a fully automatic way. Mutated XSS attack generation depends on the analysis of attack vectors and the structural learning mechanism. For the kernel of the learning mechanism, we use a Hidden Markov model (HMM) as the structure of the attack vector model to capture the implicit manner of the attack vector, and this manner is benefited from the syntax meanings that are labeled by the proposed tokenizing mechanism. Bayes theorem is used to determine the number of hidden states in the model for generalizing the structure model. The paper has the contributions are as following: (1) automatically learn the structure of attack vectors from practical data analysis to modeling a structure model of attack vectors, (2) mimic the manners and the elements of attack vectors to extend the ability of testing tool for identifying XSS vulnerabilities, (3) be helpful to verify the flaws of blacklist sanitization procedures of Web applications. We evaluated the proposed mechanism by Burp Intruder with a dataset collected from public XSS archives. The results shows that mutated XSS attack generation can identify potential vulnerabilities.
2006 IEEE International Conference on e-Business Engineering (ICEBE'06), 2006
Chinese financial news titles contain only a few words, so it is hard to measure the similarity between titles by comparing their keywords alone. In this study, we propose a semantic similarity measure for Chinese financial news titles that is based on constructing an Event Frame structure as the template of a title. The measure considers the relation between the basic meanings of two news titles. In addition, a semantic similarity function integrates the relation between the Event Frames of the titles with the relation between their keywords. In this way, the proposed method uses the Event Frame to differentiate Chinese financial news that mentions the same event from all other Chinese financial news, and it reduces the comparison time. The results show that Event Frame extraction has high precision and that the proposed semantic similarity measure emphasizes the relation between the connotations of two news titles.
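As a rough illustration of a similarity function that integrates an Event Frame relation with a keyword relation, the sketch below combines the two with a weighted sum. The frame slots (`subject`, `action`, `object`), the weight `alpha`, the use of Jaccard overlap, and the English placeholder tokens are all assumptions; the abstract does not specify the frame layout or the actual function.

```python
def jaccard(a, b):
    """Jaccard overlap between two keyword collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def frame_match(f1, f2):
    """Fraction of frame slots on which both titles agree (hypothetical slots)."""
    slots = set(f1) | set(f2)
    if not slots:
        return 0.0
    same = sum(1 for s in slots if f1.get(s) is not None and f1.get(s) == f2.get(s))
    return same / len(slots)


def title_similarity(t1, t2, alpha=0.7):
    """Weighted sum of frame agreement and keyword overlap (alpha is assumed)."""
    frame_part = frame_match(t1["frame"], t2["frame"])
    keyword_part = jaccard(t1["keywords"], t2["keywords"])
    return alpha * frame_part + (1 - alpha) * keyword_part


# Hypothetical titles, with English tokens standing in for segmented Chinese words.
a = {"frame": {"subject": "TSMC", "action": "raise", "object": "capex"},
     "keywords": {"TSMC", "capex", "2006"}}
b = {"frame": {"subject": "TSMC", "action": "raise", "object": "capex"},
     "keywords": {"TSMC", "spending"}}
c = {"frame": {"subject": "TSMC", "action": "cut", "object": "dividend"},
     "keywords": {"TSMC", "capex", "2006"}}

print(title_similarity(a, b), title_similarity(a, c))
```

In this toy example, titles `a` and `b` share few keywords but describe the same event, so they score higher than `a` and `c`, which share every keyword but describe different events.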
Proceedings of the 2007 workshop on Large scale attack defense, 2007
Because spam evolves rapidly, no fully successful solution for filtering it has been found. However, spammers still spread spam with the same intentions, such as advertising and phishing. In this investigation, we propose an Email Words Social Network (EWSN) mechanism for profiling users' intentions with respect to interesting and uninteresting e-mails. An EWSN is constructed from the information in an individual user's mailbox and is expanded with e-mail-related information from the World Wide Web (WWW) via a search engine. Based on the web information and association rules among the words, words and relations are expanded into a social network of words. Both interested and uninterested EWSNs can be constructed to analyze user intentions. Additionally, an efficient detection mechanism based on the EWSN is proposed to classify e-mails. Finally, the adaptation algorithm of an artificial immune system is applied to the EWSN, which is thus adapted to follow the user's confirmed classification results. The experimental results indicate that the proposed system is very helpful for classifying spam e-mails by analyzing senders' intentions. Some ideas for analyzing people's interests and profiling their backgrounds are also presented.
Given a stream of time-stamped events, like alerts in a network monitoring setting, how can we isolate a sequence of alerts that form a network attack? We propose a Sequence Based Attack Detection (SBAD) method, which makes the following contributions: (a) it automatically identifies groups of alerts that are frequent; (b) it summarizes them into a suspicious sequence of activity, representing them with graph structures; and (c) it suggests a novel graph-based dissimilarity measure. As a whole, SBAD is able to group suspicious alerts, visualize them, and spot anomalies at the sequence level. Evaluations on three datasets (two benchmark datasets, DARPA 1999 and PKDD 2007, and a private dataset, Acer 2007, gathered from a Security Operation Center in Taiwan) support our approach. The method performs well even without the help of IP and payload information. Because it needs no private information as input, the method is easy to plug into existing systems such as an intrusion detector. In terms of efficiency, the proposed method can deal with large-scale problems, such as processing 300K alerts within 20 minutes on a regular PC.
The purpose of this study is to propose an intelligent agent for extracting web content on the internet. This agent can automatically collect web pages, generate several kinds of web page templates, extract and simplify the appropriate web pages, and provide the headlines to remote users via remote devices. Via the proposed agent, the remote users can easily get the
Thousands of web pages are added every day, and the diversity of web templates makes it difficult to extract the contents of web pages. In this study, we propose a model for classifying web page templates based on the fuzzy k-means clustering method. This model can automatically collect web pages, generate several kinds of web page templates, provide the different kinds
2012 Seventh Asia Joint Conference on Information Security, 2012
Recently, the threat of Android malware has been spreading rapidly, especially that of repackaged Android malware. Although understanding Android malware through dynamic analysis can provide a comprehensive view, it is still subject to the high cost of environment deployment and manual investigation effort. In this study, we propose a static feature-based mechanism that provides a static analysis paradigm for detecting Android malware. The mechanism considers static information, including permissions, deployment of components, Intent message passing, and API calls, to characterize the behavior of Android applications. To recognize the different intentions of Android malware, different kinds of clustering algorithms can be applied to enhance the malware modeling capability. We also use the proposed mechanism to develop a system called Droid Mat. First, Droid Mat extracts information (e.g., requested permissions and Intent message passing) from each application's manifest file and treats components (Activity, Service, Receiver) as entry points from which it drills down to trace API calls related to permissions. Next, it applies the K-means algorithm, which enhances the malware modeling capability. The number of clusters is decided by the Singular Value Decomposition (SVD) method on the low-rank approximation. Finally, it uses the kNN algorithm to classify the application as benign or malicious. The experimental results show that the recall rate of our approach is better than that of a well-known tool, Androguard, presented at Black Hat 2011, which focuses on Android malware analysis. In addition, Droid Mat is efficient, since it takes only half the time Androguard needs to classify 1738 apps as benign apps or Android malware.
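The final classification step can be sketched as a plain k-nearest-neighbour vote over binary static features. This is a minimal illustration only: the feature list, the Hamming distance, the value of `k`, and the training samples are hypothetical, and the K-means and SVD steps described above are omitted.

```python
from collections import Counter

# Hypothetical static features; a real feature set would also cover
# components, Intent messages, and API calls.
FEATURES = ["SEND_SMS", "READ_CONTACTS", "INTERNET", "RECEIVE_BOOT_COMPLETED"]


def to_vector(features_present):
    """Encode the features found in an app as a binary vector."""
    present = set(features_present)
    return [1 if f in present else 0 for f in FEATURES]


def hamming(a, b):
    """Number of positions where two binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train, query, k=3):
    """Majority label among the k training vectors closest to the query."""
    neighbours = sorted(train, key=lambda item: hamming(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Made-up labelled samples for illustration only.
train = [
    (to_vector(["SEND_SMS", "READ_CONTACTS", "INTERNET"]), "malicious"),
    (to_vector(["SEND_SMS", "RECEIVE_BOOT_COMPLETED"]), "malicious"),
    (to_vector(["INTERNET"]), "benign"),
    (to_vector([]), "benign"),
    (to_vector(["INTERNET", "READ_CONTACTS"]), "benign"),
]

query = to_vector(["SEND_SMS", "INTERNET", "RECEIVE_BOOT_COMPLETED"])
print(knn_predict(train, query))
```

In a pipeline like the one in the abstract, the training vectors would presumably be grouped by clustering first, so that neighbours are drawn from modeled malware behaviours instead of raw samples.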
2010 International Conference on Technologies and Applications of Artificial Intelligence, 2010
We propose a graphical signature for intrusion detection given alert sequences. By correlating alerts according to their temporal proximity, we build a probabilistic graph-based model to describe a group of alerts that form an attack or normal behavior. Using the models, we design a pairwise measure based on manifold learning to measure the dissimilarities between different groups of alerts. A large dissimilarity implies different behaviors between the two groups of alerts. Such a measure can therefore be combined with regular classification methods for intrusion detection. The proposed method makes the following contributions: (a) it automatically identifies groups of alerts that are frequent; (b) it summarizes them into a suspicious sequence of activity, representing them with graph structures; and (c) it suggests a novel graph-based dissimilarity measure. We evaluate our framework mainly on Acer 2007, a private dataset gathered from a well-known Security Operation Center in Taiwan. The performance on the real data suggests that the proposed method can achieve high detection performance in attack coverage and tolerate attack variations. Because it needs no private information as input, the method is easy to plug into existing systems such as an intrusion detector. Moreover, the graphical structures and the representation from manifold learning naturally provide a visualized result suitable for further analysis by domain experts.
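The idea of correlating alerts by temporal proximity into a probabilistic graph can be sketched as follows. The time window, the alert type names, and the L1 distance between edge probabilities are assumptions chosen for illustration; in particular, the L1 distance is a simple stand-in and not the manifold-learning-based measure the abstract refers to.

```python
from collections import defaultdict


def transition_graph(alerts, window=60):
    """Build edge probabilities from consecutive alerts close in time.

    alerts: list of (timestamp_seconds, alert_type), sorted by timestamp.
    window: assumed maximum gap, in seconds, for two alerts to be linked.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for (t1, a), (t2, b) in zip(alerts, alerts[1:]):
        if t2 - t1 <= window:
            counts[a][b] += 1
    graph = {}
    for src, dsts in counts.items():
        total = sum(dsts.values())
        for dst, n in dsts.items():
            graph[(src, dst)] = n / total  # probability of src -> dst
    return graph


def dissimilarity(g1, g2):
    """L1 distance between edge probabilities (stand-in measure)."""
    edges = set(g1) | set(g2)
    return sum(abs(g1.get(e, 0.0) - g2.get(e, 0.0)) for e in edges)


# Hypothetical alert groups.
attack = [(0, "scan"), (10, "login_fail"), (20, "login_fail"), (30, "shell")]
normal = [(0, "login_fail"), (500, "login_fail")]

g_attack = transition_graph(attack)
g_normal = transition_graph(normal)
print(g_attack)
print(dissimilarity(g_attack, g_normal))
```

Pairwise values of this kind could be fed to a distance-based classifier, in the spirit of combining the measure with regular classification methods.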
2006 IEEE International Conference on Systems, Man and Cybernetics, 2006
2006 IEEE International Conference on Systems, Man, and Cybernetics, October 8-11, 2006, Taipei, Taiwan. Network Motif Model: An Efficient Approach for Extracting Features from Relational Data. Chiung-Wei Huang ...
2012 IEEE 11th International Conference on Trust, Security and Privacy in Computing and Communications, 2012
With the emergence of cloud computing, an increasing number of innovative applications are being built on cloud computing platforms (e.g., Elastic Compute Cloud by Amazon, Windows Azure by Microsoft, and Cloud Foundry by VMware). However, these cloud applications are also prone to risks and potential vulnerabilities in their system life cycle. In this study, a cloud security self-governance deployment framework (Cloud SSDLC) is proposed from the system development life cycle perspective, and especially from government and industry perspectives. The Cloud SSDLC incorporates the secure system development life cycle (SSDLC), cloud security critical domain guidelines, and risk considerations. Following the SSDLC, there are five main phases in the Cloud SSDLC: initiation, development, implementation, operation, and destruction. Furthermore, critical cloud security domains and corresponding risks are integrated into each phase. From the industry and government perspectives, different cases are used to demonstrate practical usage and legal issues in the proposed Cloud SSDLC. The main contribution is the provision of a framework that connects the SSDLC and the cloud computing paradigm to enhance cloud applications.
Papers by Ching-Hao Mao