Metric Access Methods (MAMs) are employed to accelerate the processing of similarity queries, such as range and k-nearest neighbor queries. Current methods improve query performance by minimizing the number of disk accesses while keeping the structures stored on disk height-balanced. The Slim-tree and the M-tree are the most efficient dynamic MAMs proposed so far, but the overlap between their nodes strongly degrades their performance. This paper presents a new dynamic MAM called the DBM-tree (Density-Based Metric tree), which minimizes the overlap between high-density nodes by relaxing the height-balancing of the structure. The tree is therefore deeper in denser regions, maintaining a tradeoff between breadth-searching and depth-searching. Moreover, an optimization algorithm called Shrink is also presented, which improves the performance of an already built DBM-tree by reorganizing the elements among its nodes. Experiments performed over both synthetic and real datasets showed that the DBM-tree is, on average, 50% faster than traditional MAMs, reducing the number of distance calculations by up to 72% and disk accesses by up to 54%. After running the Shrink algorithm, performance improves by up to 30% in the number of disk accesses for range and k-nearest neighbor queries. In addition, the DBM-tree scales up well, exhibiting sub-linear query cost as the number of elements in the database grows.
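The pruning principle shared by these metric trees is the triangle inequality: a subtree rooted at a representative p with covering radius r cannot contain an answer to a range query (q, eps) when d(q, p) > r + eps. A minimal sketch of this rule, assuming a simplified node layout (not the DBM-tree's actual page format):

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    representative: tuple        # routing object of this subtree
    covering_radius: float       # max distance from representative to any object below
    children: list = field(default_factory=list)
    objects: list = field(default_factory=list)   # leaf entries

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def range_query(node, query, eps, dist=euclidean):
    """Return all objects o in the subtree with dist(query, o) <= eps."""
    hits = [o for o in node.objects if dist(query, o) <= eps]
    for child in node.children:
        # Triangle inequality: skip subtrees whose covering ball
        # cannot intersect the query ball.
        if dist(query, child.representative) <= child.covering_radius + eps:
            hits.extend(range_query(child, query, eps, dist))
    return hits

# Tiny illustrative tree: one leaf under the root.
leaf = Node(representative=(0.0, 0.0), covering_radius=1.5,
            objects=[(0.0, 0.0), (1.0, 1.0)])
root = Node(representative=(0.0, 0.0), covering_radius=1.5, children=[leaf])
print(range_query(root, query=(0.9, 0.9), eps=0.5))   # [(1.0, 1.0)]
```

Every pruned subtree saves both distance calculations and the disk accesses needed to read its nodes, which is why reducing node overlap translates directly into the query-cost reductions reported above.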
The paper presents and evaluates a methodology for the emulation of software faults in COTS components using software-implemented fault injection (SWIFI) technology. ESFFI (Emulation of Software Faults by Fault Injection) builds on mature fault injection techniques, which have so far been used to emulate hardware faults, and adds new features that make it possible to insert errors mimicking those caused by real software faults. The major advantage of ESFFI over other techniques that also emulate software faults (mutations, for instance) is that fault locations become ubiquitous: every software module can be targeted, whether it is a device driver running in operating system kernel mode or a third-party component whose source code is not available. Experimental results show that for specific fault classes, e.g., assignment and checking faults, the accuracy obtained by this technique is quite good.
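To make the idea concrete, here is a minimal sketch of injecting an error that mimics the effect of an assignment fault (a wrong value stored by an assignment). The wrapper-based mechanism and names are purely illustrative; ESFFI itself operates on binary code at runtime, not on Python functions:

```python
import functools
import random

def inject_assignment_fault(fn, bit=0, probability=1.0):
    """Wrap fn so its integer result is corrupted by a single bit flip,
    emulating the error a wrong-value assignment fault would produce.
    Illustrative sketch only; not ESFFI's actual mechanism."""
    @functools.wraps(fn)
    def faulty(*args, **kwargs):
        value = fn(*args, **kwargs)
        if random.random() < probability:
            value ^= (1 << bit)      # corrupt the assigned value
        return value
    return faulty

@inject_assignment_fault
def compute_checksum(data: bytes) -> int:
    return sum(data) & 0xFF

print(compute_checksum(b"abc"))   # 39 instead of the correct 38 (bit 0 flipped)
```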
Web services are supported by a complex software infrastructure that must provide a robust service to client applications. This experience report presents a practical approach for evaluating the robustness of web-services infrastructures. A set of robustness tests (i.e., web-service calls with invalid parameters) is applied during execution in order to reveal possible robustness problems in the web-services code and in the application server infrastructure. The approach is illustrated using two different implementations of the web services specified by the TPC-App performance benchmark, running on top of the JBoss application server. The proposed approach is generic and can be used to evaluate the robustness of web-service implementations (relevant for programmers) and of application server infrastructures (relevant for administrators and system integrators).
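A minimal sketch of how such robustness tests can be generated: each test replaces exactly one valid call parameter with an invalid value (null, empty, or boundary). The invalid-value pools and the example operation name are assumptions for illustration, not the paper's actual test suite:

```python
# Hypothetical invalid-value pools per parameter type, in the spirit of
# robustness testing (null, empty, and boundary values).
INVALID = {
    str: [None, "", "a" * 65536, "\x00"],
    int: [None, -1, 0, 2**31 - 1, -2**31],
}

def robustness_tests(valid_args: dict):
    """Yield argument dicts where exactly one parameter is replaced
    by an invalid value (one mutation per test call)."""
    for name, value in valid_args.items():
        for bad in INVALID.get(type(value), [None]):
            mutated = dict(valid_args)
            mutated[name] = bad
            yield name, bad, mutated

# Example: tests for a hypothetical getProducts(subject, count) operation.
for param, bad, args in robustness_tests({"subject": "books", "count": 10}):
    print(f"call getProducts with {param}={bad!r}: {args}")
```

Each mutated call is then issued against the running service, and any crash, hang, or unexpected error response is recorded as a robustness problem.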
The assessment of the dependability properties of a system (dependability benchmarking) is a critical step when choosing among similar components or products. This paper presents a proposal for benchmarking the dependability properties of web servers. Our benchmark is composed of three key components: measures, workload, and faultload. We use the SPECweb99 benchmark as a starting point, adopting its workload and performance measures, and add a faultload and new measures related to dependability. We illustrate the use of the proposed benchmark through a case study involving two widely used web servers (Apache and Abyss) running on top of three different operating systems. The faultloads used encompass software faults, hardware faults, and network faults. We show that by using the proposed dependability benchmark it is possible to observe clear differences in the dependability properties of the web servers.
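A faultload of this kind is naturally expressed as data: which fault to emulate, when to inject it relative to the workload, and for how long. The sketch below shows one possible encoding; the concrete fault types are illustrative examples of the three classes named above, not the benchmark's exact faultload:

```python
from dataclasses import dataclass

@dataclass
class Fault:
    fault_class: str     # "software" | "hardware" | "network"
    fault_type: str      # concrete fault to emulate
    injection_time: int  # seconds after the workload reaches steady state
    duration: int        # seconds the fault stays active (0 = permanent)

# Hypothetical faultload covering the three fault classes.
FAULTLOAD = [
    Fault("software", "kill-server-process", injection_time=300, duration=0),
    Fault("hardware", "disk-unavailable",    injection_time=600, duration=60),
    Fault("network",  "interface-down",      injection_time=900, duration=30),
]

for f in FAULTLOAD:
    print(f"t+{f.injection_time}s: inject {f.fault_type} ({f.fault_class})")
```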
The characterization of database management system (DBMS) recovery mechanisms and the comparison of recovery features of different DBMS require a practical approach to benchmark the effectiveness of recovery in the presence of faults. Existing ...
Web applications are typically developed under hard time constraints and are often deployed with critical software bugs, making them vulnerable to attacks. Classifying and understanding the typical software bugs that lead to security vulnerabilities is therefore of utmost importance. This paper presents a field study analyzing 655 security patches of six widely used web applications. The results are compared against other field studies on general software faults (i.e., faults not specifically related to security), showing that only a small subset of software fault types is related to security. Furthermore, a detailed analysis of the code of the patches shows that web application vulnerabilities result from software bugs affecting a restricted collection of statements.
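A typical instance of such a bug is a query statement built without input sanitization, fixed by a one-statement patch. The sketch below illustrates the pattern in Python with sqlite3 (the studied applications are web applications in other languages; this is an analogy, not code from the study):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT, pwd TEXT)")
db.execute("INSERT INTO users VALUES ('alice', 's3cret')")

def login_vulnerable(name, pwd):
    # Bug: user input concatenated into the query string; the input
    # "' OR '1'='1" bypasses authentication (SQL injection).
    q = f"SELECT * FROM users WHERE name = '{name}' AND pwd = '{pwd}'"
    return db.execute(q).fetchone() is not None

def login_patched(name, pwd):
    # Patch: a single changed statement, binding input as parameters.
    q = "SELECT * FROM users WHERE name = ? AND pwd = ?"
    return db.execute(q, (name, pwd)).fetchone() is not None

print(login_vulnerable("alice", "' OR '1'='1"))  # True: login bypassed
print(login_patched("alice", "' OR '1'='1"))     # False: attack blocked
```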
Web applications are typically developed under hard time constraints and are often deployed with security vulnerabilities. Automatic web vulnerability scanners can help locate these vulnerabilities and are popular tools among web application developers. Their purpose is to stress the application from the attacker's point of view by issuing a large number of interactions with it. Two of the most widespread and dangerous vulnerabilities in web applications are SQL injection and cross-site scripting (XSS), because of the damage they may cause to the victim's business. Trusting the results of web vulnerability scanning tools is of utmost importance: without a clear idea of the coverage and false-positive rate of these tools, it is difficult to judge the relevance of the results they provide, and difficult, if not impossible, to compare key figures of merit of different scanners. In this paper we propose a method to evaluate and benchmark automatic web vulnerability scanners using software fault injection techniques. The most common types of software faults are injected into the web application code, which is then checked by the scanners. The results are compared by analyzing the coverage of vulnerability detection and the false positives. Three leading commercial scanning tools are evaluated, and the results show that in general the coverage is low and the percentage of false positives is very high.
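Because the injected faults give a known ground truth, the two figures of merit follow directly from set arithmetic. A minimal sketch, assuming vulnerability identifiers for the injected faults and for the scanner's reports (the identifiers and counts below are made up):

```python
def scanner_metrics(injected: set, reported: set):
    """Coverage and false-positive rate for one scanner run.

    injected: identifiers of vulnerabilities known to exist
              (injected faults that actually created a vulnerability).
    reported: identifiers the scanner flagged.
    """
    true_positives = reported & injected
    false_positives = reported - injected
    coverage = len(true_positives) / len(injected) if injected else 0.0
    fp_rate = len(false_positives) / len(reported) if reported else 0.0
    return coverage, fp_rate

# Hypothetical run: 10 injected vulnerabilities, scanner flags 8 locations.
cov, fp = scanner_metrics(
    injected={f"v{i}" for i in range(10)},
    reported={"v1", "v2", "v3", "x1", "x2", "x3", "x4", "x5"})
print(f"coverage={cov:.0%}, false positives={fp:.0%}")  # coverage=30%, false positives=62%
```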
This paper presents an experimental study on the emulation of software faults by fault injection. In a first experiment, a set of real software faults was compared with faults injected by a SWIFI tool (Xception) to evaluate the accuracy of the injected faults. The results revealed the limitations of Xception (and other SWIFI tools) in emulating different classes of software faults (about 44% of the software faults cannot be emulated). The use of field data about real faults is discussed, and software metrics are suggested as an alternative to guide the injection process when field data is not available. In a second experiment, a set of rules for injecting errors meant to emulate classes of software faults was evaluated. The fault triggers used appear to explain the observed strong impact of the faults on the target system and on the program results. The results also show that aspects such as code size, complexity of data structures, and recursive versus sequential execution influence the fault emulation.
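One way software metrics can guide injection when field data is unavailable is to distribute the fault budget across modules in proportion to a complexity metric. A minimal sketch of that idea; the module names and metric values are purely illustrative:

```python
# Distribute an injection budget across modules proportionally to a
# complexity metric, as an alternative to field data on real faults.
modules = {"parser.c": 42, "driver.c": 118, "util.c": 15}   # e.g., cyclomatic complexity
budget = 100                                                 # total faults to inject

total = sum(modules.values())
plan = {m: round(budget * c / total) for m, c in modules.items()}
print(plan)   # {'parser.c': 24, 'driver.c': 67, 'util.c': 9}
```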
The ascendance of networked information in our economy and daily lives has increased awareness of the importance of dependability features. OLTP (On-Line Transaction Processing) systems constitute the kernel of the information systems used today to support the daily operations of most businesses. Although these systems are among the best examples of complex business-critical systems, no practical way has been proposed so far to characterize the impact of faults in such systems or to compare alternative solutions concerning dependability features. This paper proposes a dependability benchmark for OLTP systems. The benchmark uses the workload of the TPC-C performance benchmark and specifies the measures and all the steps required to evaluate both the performance and key dependability features of OLTP systems, with emphasis on availability. The benchmark is presented through a concrete example of benchmarking the performance and dependability of several different transactional system configurations. The effort required to run the dependability benchmark is also discussed in detail.
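The central availability measure can be understood as the fraction of the measurement interval during which the system is actually able to process transactions while faults are injected and recovered. A minimal sketch under that reading; the observation format below is a hypothetical record layout, not the benchmark's exact output:

```python
# Availability as the fraction of the measurement interval in which the
# system processed transactions. Slot records are illustrative only.
slots = [
    {"duration_s": 600, "up": True},    # steady state under TPC-C workload
    {"duration_s": 120, "up": False},   # fault active, service down
    {"duration_s": 180, "up": False},   # recovery in progress
    {"duration_s": 600, "up": True},    # service restored
]

uptime = sum(s["duration_s"] for s in slots if s["up"])
total = sum(s["duration_s"] for s in slots)
print(f"availability = {uptime / total:.1%}")   # availability = 80.0%
```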
Although web services are becoming business-critical components, they are often deployed with critical software bugs that can be maliciously exploited. Web vulnerability scanners can detect security vulnerabilities in web services by stressing the service from the point of view of an attacker. However, research and practice show that different scanners perform differently in vulnerability detection. In this paper we present an experimental evaluation of security vulnerabilities in 300 publicly available web services. Four well-known vulnerability scanners were used to identify security flaws in the web-service implementations. A large number of vulnerabilities (177) was observed, confirming that many services are deployed without proper security testing. Additionally, the differences in the vulnerabilities detected, the high number of false positives (35% and 40% in two cases), and the low coverage (less than 20% for two of the scanners) highlight the limitations of web vulnerability scanners in detecting security vulnerabilities in web services.
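Since public services offer no injected ground truth, one practical way to analyze such results is to cross-check the scanners against each other: flaws reported by every scanner are strong candidates, while flaws reported by a single scanner deserve manual confirmation. A sketch of that comparison with made-up report identifiers (not the study's data):

```python
# Hypothetical per-scanner reports: identifiers of flagged vulnerabilities.
reports = {
    "scanner_A": {"ws1:sqli", "ws2:sqli", "ws3:xpath"},
    "scanner_B": {"ws1:sqli", "ws4:sqli"},
    "scanner_C": {"ws1:sqli", "ws2:sqli", "ws5:sqli"},
}

consensus = set.intersection(*reports.values())
unique = {name: flags - set().union(*(f for n, f in reports.items() if n != name))
          for name, flags in reports.items()}

print("flagged by every scanner:", consensus)   # strong candidates
print("flagged by one scanner only:", unique)   # verify manually
```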
A major cause of failures in large database management systems (DBMS) is operator faults. Although most complex DBMS have comprehensive recovery mechanisms, the effectiveness of these mechanisms is difficult to characterize. At the same time, tuning a large database is very complex, and database administrators tend to concentrate on performance tuning while disregarding the recovery mechanisms. Above all, database administrators seldom have feedback on how good a given configuration is concerning recovery. This paper proposes an experimental approach to characterize both performance and recoverability in DBMS. Our approach is presented through a concrete example of benchmarking the performance and recovery of an Oracle DBMS running the standard TPC-C benchmark, extended with two new elements: a faultload based on operator faults and measures related to recoverability. A classification of operator faults in DBMS is proposed. The paper ends with a discussion of the results and guidelines to help database administrators find the balance between performance and recovery tuning.
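An operator faultload emulates mistakes a database administrator might make during normal operation. The sketch below shows one possible encoding; the concrete fault types, commands, and timings are illustrative guesses, not the paper's actual classification or faultload:

```python
# Hypothetical operator-fault entries: what the "operator" does and when,
# relative to steady-state TPC-C execution. Commands are illustrative only.
OPERATOR_FAULTLOAD = [
    {"fault": "drop-table",     "action": "DROP TABLE warehouse",           "t": 300},
    {"fault": "delete-file",    "action": "rm /u01/oradata/users01.dbf",    "t": 600},
    {"fault": "kill-process",   "action": "kill -9 <db-writer-pid>",        "t": 900},
    {"fault": "shutdown-abort", "action": "SHUTDOWN ABORT",                 "t": 1200},
]

for f in OPERATOR_FAULTLOAD:
    print(f"t+{f['t']}s: emulate operator fault '{f['fault']}' -> {f['action']}")
```

After each injected fault, the recovery mechanisms are exercised and the recoverability measures (e.g., recovery time and data integrity after recovery) are collected alongside the TPC-C performance figures.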
Two important questions in experimental dependability evaluation remain largely unanswered: 1) how to analyze the usually large amount of raw data produced in dependability evaluation experiments, and 2) how to compare results from different experiments, or from similar experiments across different systems. These problems are also common to other dependability evaluation techniques, such as those based on simulation, and even to the analysis of field data on computer faults. We propose the use of data warehousing technologies to store raw results from different experiments and setups in a common multidimensional structure, where the raw data can be analyzed and shared worldwide by means of web-enabled OLAP (On-Line Analytical Processing) tools. This paper describes how to use the proposed approach in a concrete example of a dependability evaluation experiment.
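The payoff of the multidimensional structure is that raw results can be rolled up and sliced along dimensions such as system or fault class. A minimal sketch of such an OLAP-style roll-up, using pandas as a stand-in for the web-enabled OLAP tools; the dimensions, measures, and rows are illustrative, not the paper's actual schema:

```python
import pandas as pd

# Hypothetical raw results: one row per injected fault.
raw = pd.DataFrame([
    {"system": "sysA", "fault_class": "hardware", "recovered": 1},
    {"system": "sysA", "fault_class": "software", "recovered": 0},
    {"system": "sysB", "fault_class": "hardware", "recovered": 1},
    {"system": "sysB", "fault_class": "software", "recovered": 1},
])

# Roll-up: recovery rate per system and fault class, comparable
# across experiments stored in the same multidimensional structure.
cube = raw.pivot_table(index="system", columns="fault_class",
                       values="recovered", aggfunc="mean")
print(cube)
```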