Automatic identification of anomalies in network data is a problem of fundamental interest to ISPs, which need to diagnose incipient problems in their networks. ISPs gather diverse data sources from the network for monitoring, diagnostics, and provisioning tasks. Finding anomalies in this data is a major challenge due to the volume of data collected, the number and diversity of data sources, and the diversity of anomalies to be detected. In this paper we introduce a framework for anomaly detection that allows the construction of a black-box anomaly detector, which can be used to find anomalies automatically with minimal human intervention. Our framework also allows us to deal with the different types of data sources collected from the network. We have developed a prototype of this framework, TrafficComber, and we are in the process of evaluating it using the data in the warehouse of a tier-1 ISP.
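The abstract does not describe TrafficComber's internals, so the following is only a minimal sketch of the black-box idea it outlines: a detector that flags deviations in a network time series without domain-specific tuning. The rolling-window z-score approach, window size, and threshold are illustrative assumptions, not the paper's method.

```python
# Toy black-box anomaly detector: flags points whose rolling z-score
# exceeds a threshold. Window size and threshold are illustrative only.
from collections import deque
from math import sqrt

def detect_anomalies(series, window=60, threshold=3.0):
    """Return indices of points deviating strongly from the recent mean."""
    recent = deque(maxlen=window)
    anomalies = []
    for i, x in enumerate(series):
        if len(recent) == window:
            mean = sum(recent) / window
            var = sum((v - mean) ** 2 for v in recent) / window
            std = sqrt(var) or 1.0
            if abs(x - mean) / std > threshold:
                anomalies.append(i)
        recent.append(x)
    return anomalies

# Example: a traffic counter with a sudden spike at index 200.
traffic = [100.0] * 200 + [500.0] + [100.0] * 50
print(detect_anomalies(traffic))  # -> [200]
```

In a framework like the one described, such a detector would be one interchangeable component applied uniformly across heterogeneous data sources.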
Proceedings 2019 Network and Distributed System Security Symposium, 2019
Enterprises own a significant fraction of the hosts connected to the Internet and possess valuable assets, such as financial data and intellectual property, which may be targeted by attackers. They suffer attacks that exploit unpatched hosts and install malware, resulting in breaches that may cost millions in damages. Despite the scale of this phenomenon, the threat and vulnerability landscape of enterprises remains under-studied. The security posture of enterprises remains unclear, and it is unknown whether enterprises are indeed more secure than consumer hosts. To address these questions, we perform the largest and longest enterprise security study to date. Our data covers nearly 3 years and is collected from 28K enterprises, belonging to 67 industries, which own 82M client hosts and 73M public-facing servers. Our measurements comprise two parts: an analysis of the threat landscape and an analysis of enterprise vulnerability patching behavior. The threat landscape analysis studies the encounter rate of malware and PUP on enterprise client hosts. It measures, among other things, that 91%-97% of the enterprises and 13%-41% of the hosts encountered at least one malware or PUP file over the length of our study; that enterprises encounter malware much more often than PUP; and that some industries, such as banks and consumer finance, achieve significantly lower malware and PUP encounter rates than the most-affected industries. The vulnerability analysis examines the patching of 12 client-side and 112 server-side applications on enterprise client hosts and servers. It measures, among other things, that it takes over 6 months on average to patch 90% of the population across all vulnerabilities in the 12 client-side applications; that enterprise hosts are faster to patch vulnerabilities than consumer hosts; and that the patching of server applications is much worse than the patching of client-side applications.
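As a rough illustration of one metric the abstract reports (time until 90% of the vulnerable population is patched), the sketch below computes it from per-host patch dates. The input format is a hypothetical simplification, not the study's actual dataset schema, and the dates are made up.

```python
# Sketch of one patching metric from the study: the number of days from
# vulnerability disclosure until a given fraction (e.g., 90%) of the
# vulnerable host population has installed a fix.
import math
from datetime import date

def days_to_patch_fraction(disclosure, patch_dates, fraction=0.9):
    """Days from disclosure until `fraction` of hosts are patched."""
    delays = sorted((d - disclosure).days for d in patch_dates)
    needed = math.ceil(fraction * len(delays))   # hosts that must be patched
    return delays[needed - 1]

host_patch_dates = [date(2018, 1, 10), date(2018, 2, 1), date(2018, 3, 15),
                    date(2018, 5, 2), date(2018, 8, 20)]
print(days_to_patch_fraction(date(2018, 1, 1), host_patch_dates), "days")  # 231 days
```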
Binary code reuse is the process of automatically identifying the interface and extracting the instructions and data dependencies of a code fragment from an executable program, so that it is self-contained and can be reused by external code. Binary code reuse is useful for a number of security applications, including reusing proprietary cryptographic or unpacking functions from a malware sample and rewriting a network dialog. In this paper we conduct the first systematic study of automated binary code reuse and its security applications. The main challenge in binary code reuse is understanding the code fragment's interface. We propose a novel technique to identify the prototype of an undocumented code fragment directly from the program's binary, without access to source code or symbol information. Further, we must also extract the code itself from the binary so that it is self-contained and can be easily reused in another program. We design and implement a tool that uses a combination of dynamic and static analysis to automatically identify the prototype and extract the instructions of an assembly function into a form that can be reused by other C code. The extracted function can be run independently of the rest of the program's functionality and shared with other users. We apply our approach to scenarios that include extracting the encryption and decryption routines from malware samples, and show that these routines can be reused by a network proxy to decrypt encrypted traffic on the network. This allows the network proxy to rewrite the malware's encrypted traffic by combining the extracted encryption and decryption functions with the session keys and the protocol grammar. We also show that we can reuse a code fragment from the unpacking function of one malware sample to unpack a different sample of the same family, even if the code fragment is not a complete function.
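To make the "interface identification" challenge concrete, here is a toy sketch of one small piece of it: guessing how many stack arguments a 32-bit function takes from a dynamic trace of its memory reads. The trace format, 4-byte argument slots, and the helper name are illustrative assumptions and do not reproduce the paper's actual analysis.

```python
# Toy illustration of prototype (interface) identification: reads above the
# entry stack pointer at [esp+4], [esp+8], ... suggest stack arguments.
def guess_argument_count(entry_esp, read_addresses, slot_size=4):
    """Count distinct 4-byte argument slots read above the entry esp."""
    slots = set()
    for addr in read_addresses:
        offset = addr - entry_esp
        # offset 0 is the return address; only aligned slots above it count
        if offset >= slot_size and offset % slot_size == 0:
            slots.add(offset // slot_size)
    return len(slots)

# Reads at esp+4 and esp+8 during one execution suggest two arguments.
print(guess_argument_count(0xBFFF0000, [0xBFFF0004, 0xBFFF0008, 0xBFFF0008]))  # 2
```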
The effects of an extended photoperiod (18 h light versus 12 h light), using a combination of natural and artificial light sources, on the differential gene expression and growth rate of Nile tilapia Oreochromis niloticus L. were evaluated. Four groups of all-male tilapia (n=10) with an initial mean body weight of 102.25 g were reared in aquaria, with two replications for each treatment. The experiment was conducted over a period of 35 days; growth rate and water quality (dissolved oxygen, pH, nitrite, nitrate) were measured weekly, and temperature and solar radiation were recorded daily. At the end of the experiment we used a zebrafish genome array to study differential expression over 14,900 transcripts and to test the hypotheses that an extended photoperiod stimulates growth rate and causes differential gene expression in tilapia. Fish in the extended photoperiod group (Ep) tended to show a higher growth rate (GR) than the standard photoperiod group (Sp; P = 0.001). ...
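A minimal sketch of the kind of group comparison behind the reported P value is shown below, using an independent two-sample t-test. The weight gains are illustrative placeholders, not the study's measurements, and the study may have used a different statistical test.

```python
# Two-sample t-test comparing growth between photoperiod groups (toy data).
from scipy import stats

extended = [48.2, 51.0, 46.7, 53.4, 49.9, 50.5, 47.8, 52.1, 49.0, 51.6]  # Ep gains (g)
standard = [40.1, 42.5, 39.8, 41.0, 43.2, 38.9, 40.7, 42.0, 39.5, 41.8]  # Sp gains (g)

t_stat, p_value = stats.ttest_ind(extended, standard)
print(f"t = {t_stat:.2f}, P = {p_value:.4f}")  # a small P supports the growth hypothesis
```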
The Fundacion Ciudad de la Energia (CIUDEN) is developing an integral demonstration project on CO2 capture and geological storage. For the storage component, CO2 will be injected near the town of Hontomin (Burgos, Spain) through a technology development plant. Approximately 100,000 tonnes of CO2 will be injected into a saline aquifer located at a depth of 1,500 metres. Among the project's objectives is to demonstrate that CO2 storage is safe and that the behaviour of the gas injected into the aquifer, together with the associated potential environmental risks, is fully understood. For this reason, monitoring CO2 fluxes at the soil-atmosphere interface at the site is essential and must be carried out before, during, and after injection. Specifically, this project studies the diffuse emanation of CO2 from the soil to the atmosphere using the "accumulation chamber" technique. Since 2009, the work has focused on obtaining the baseline CO2 flux and its seasonal variation prior to injection, since this baseline is essential for the future monitoring and detection of potential leakage during the injection and post-injection periods. Keywords: Geological storage, Accumulation chamber, CO2, Hontomin, Monitoring. Abstract: The Fundacion Ciudad de la Energia (CIUDEN) is carrying out a project of capture and geological storage of CO2. The area selected for the CO2 injection and storage is located at Hontomin (Burgos, Spain). It is planned to inject approximately 100,000 tons of CO2 into a saline aquifer at a depth of 1,500 metres. One of the aims of the project is to demonstrate that CO2 storage is safe, and that there is control over the evolution and fate of the injected CO2 and over the potential environmental effects. Such control requires a detailed monitoring study of the CO2 fluxes at the soil-atmosphere interface before, during and after the injection operations. In particular, in this project we study the diffuse flux of CO2 from the soil to the atmosphere by using an "accumulation chamber". Since 2009, the work has focused on the determination of a baseline flux of CO2 and its seasonal variation prior to injection, as the baseline flux of CO2 is essential in order to detect potential leakage during injection and post-injection.
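For readers unfamiliar with the accumulation-chamber technique, the sketch below shows the standard way a soil CO2 flux is derived from the rate of concentration increase inside the chamber and the chamber geometry, using the ideal gas law. The chamber volume, area, and readings are illustrative assumptions, not CIUDEN instrument values.

```python
# Soil CO2 flux from an accumulation chamber: the concentration slope inside
# the chamber, scaled by the moles of enclosed air per unit soil area.
R = 8.314          # J mol-1 K-1
M_CO2 = 44.01      # g mol-1

def co2_flux(slope_ppm_per_s, volume_m3, area_m2, pressure_pa=101325.0, temp_k=293.15):
    """Soil CO2 flux in g m-2 day-1 from the chamber concentration slope."""
    mol_air = pressure_pa * volume_m3 / (R * temp_k)          # moles of air enclosed
    mol_flux = slope_ppm_per_s * 1e-6 * mol_air / area_m2     # mol CO2 m-2 s-1
    return mol_flux * M_CO2 * 86400                           # g CO2 m-2 day-1

# Concentration rising ~1 ppm per second in a 3 L chamber covering 0.03 m2.
print(round(co2_flux(1.0, 0.003, 0.03), 1), "g m-2 day-1")  # ~15.8
```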
CO2 emission from two old mine drillings (Mt. Amiata, Central Italy) as a possible example of storage and leakage of deep-seated CO2. Barbara Nisi (1), Orlando Vaselli (2),(3), Javier de Elío (4),(5), Marcelo Ortega (5), Juan Caballero (5), Franco Tassi (2),(3), Daniele Rappuoli (6), Luis Felipe Mazadiego (4). [Fig. 1: NE-SW geological cross-section through the Monte Amiata complex, showing the two reservoirs of the geothermal system.]
In this paper, we give an overview of the BitBlaze project, a new approach to computer security via binary analysis. In particular, BitBlaze focuses on building a unified binary analysis platform and using it to provide novel solutions to a broad spectrum of different security problems. The binary analysis platform is designed to enable accurate analysis, provide an extensible architecture, and combine static and dynamic analysis as well as program verification techniques to satisfy the common needs of security applications. By extracting security-related properties from binary programs directly, BitBlaze enables a principled, root-cause-based approach to computer security, offering novel and effective solutions, as demonstrated with over a dozen different security applications.
Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security - CCS '13, 2013
In this ongoing work we perform the first systematic investigation of cross-platform (X-platform) malware. As a first step, this paper presents an exploration of existing X-platform malware families and the X-platform vulnerabilities used to distribute them. Our exploration shows that X-platform malware uses a wealth of methods to achieve portability. It also shows that exploits for X-platform vulnerabilities are indeed cross-platform and readily available in commercial exploit kits, making them an inexpensive distribution vector for X-platform malware.
Research in Attacks, Intrusions, and Defenses, 2013
The ever-increasing number of malware families and polymorphic variants creates a pressing need for automatic tools to cluster the collected malware into families and generate behavioral signatures for their detection. Among behavioral signatures, network traffic is particularly powerful, and network signatures are widely used by network administrators. In this paper we present FIRMA, a tool that, given a large pool of network traffic obtained by executing unlabeled malware binaries, generates a clustering of the malware binaries into families and a set of network signatures for each family. Compared with prior tools, FIRMA produces network signatures for each of the network behaviors of a family, regardless of the type of traffic the malware uses (e.g., HTTP, IRC, SMTP, TCP, UDP). We have implemented FIRMA and evaluated it on two recent datasets comprising nearly 16,000 unique malware binaries. Our results show that FIRMA's clustering has very high precision (100% on a labeled dataset) and recall (97.7%). We compare FIRMA's signatures with manually generated ones, showing that they are as good (and often better), while being generated in a fraction of the time.
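To give a flavor of what a token-based network signature looks like, the toy below extracts byte tokens shared by every request of a (hypothetical) family cluster. This is only a simplified illustration, not FIRMA's actual signature-generation algorithm, and the sample requests are fabricated.

```python
# Toy signature generation: keep maximal byte tokens present in every
# request of a malware family's traffic cluster.
def common_tokens(requests, min_len=4):
    """Return maximal substrings of the first request shared by all requests."""
    base = requests[0]
    tokens = set()
    for i in range(len(base)):
        for j in range(i + min_len, len(base) + 1):
            candidate = base[i:j]
            if all(candidate in r for r in requests[1:]):
                tokens.add(candidate)
    # Drop tokens that are substrings of longer shared tokens.
    return sorted(t for t in tokens
                  if not any(t != u and t in u for u in tokens))

family_requests = [
    b"GET /gate.php?id=1337&cmd=ping HTTP/1.1",
    b"GET /gate.php?id=9001&cmd=update HTTP/1.1",
]
print(common_tokens(family_requests))  # e.g. b"GET /gate.php?id=", b"&cmd=", b" HTTP/1.1"
```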
This chapter contains sections titled: Online Advertising: With Secret Security; Web Security Remediation Efforts; Content-Sniffing XSS Attacks: XSS with Non-HTML Content; Our Internet Infrastructure at Risk; Social Spam; Understanding CAPTCHAs and Their Weaknesses; Security Questions; Folk Models of Home Computer Security; Detecting and Defeating Interception Attacks Against SSL.
International Journal of Information Security, 2014
Drive-by downloads are the preferred distribution vector for many malware families. In the drive-by ecosystem, many exploit servers run the same exploit kit, and it is challenging to understand whether a given exploit server is part of a larger operation. In this paper, we propose a technique to identify exploit servers managed by the same organization. We collect over time how exploit servers are configured, which exploits they use, and what malware they distribute, grouping servers with similar configurations into operations. Our operational analysis reveals that although individual exploit servers have a median lifetime of 16 hours, long-lived operations exist that operate for several months. To sustain long-lived operations, miscreants are turning to the cloud, with 60% of the exploit servers hosted by specialized cloud hosting services. We also observe operations that distribute multiple malware families, and that pay-per-install affiliate programs are managing exploit servers for their affiliates to convert traffic into installations. Furthermore, we analyze the exploit polymorphism problem, measuring the repacking rate for different exploit types. To understand how difficult it is to take down exploit servers, we analyze the abuse reporting process and issue abuse reports for 19 long-lived servers. We describe the interaction with ISPs and hosting providers and monitor the result of the reports. We find that
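The grouping step described above (servers with similar configurations collapsed into operations) can be pictured as linking servers that share an indicator and taking connected components. The sketch below does this with a union-find structure; the feature names and linkage rule are illustrative assumptions, not the paper's actual feature set or similarity metric.

```python
# Group exploit servers into "operations" by shared configuration indicators.
from collections import defaultdict

def group_servers(server_features):
    """server_features: dict server -> set of indicators. Returns operations."""
    parent = {s: s for s in server_features}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    by_indicator = defaultdict(list)
    for server, feats in server_features.items():
        for f in feats:
            by_indicator[f].append(server)
    for servers in by_indicator.values():
        for other in servers[1:]:
            union(servers[0], other)

    operations = defaultdict(set)
    for s in server_features:
        operations[find(s)].add(s)
    return list(operations.values())

servers = {
    "203.0.113.5": {"malware:abc123", "kit:blackhole"},
    "198.51.100.7": {"malware:abc123"},
    "192.0.2.9": {"kit:phoenix"},
}
print(group_servers(servers))  # two operations: the first two servers, and the third
```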
There is tremendous potential for genome sequencing to improve clinical diagnosis and care once it becomes routinely accessible, but this will require formalizing research methods into clinical best practices in the areas of sequence data generation, analysis, interpretation and reporting. The CLARITY Challenge was designed to spur convergence in methods for diagnosing genetic disease starting from clinical case history and genome sequencing data. DNA samples were obtained from three families with heritable genetic disorders and genomic sequence data were donated by sequencing platform vendors. The challenge was to analyze and interpret these data with the goals of identifying disease-causing variants and reporting the findings in a clinically useful format. Participating contestant groups were solicited broadly, and an independent panel of judges evaluated their performance. A total of 30 international groups were engaged. The entries reveal a general convergence of practices on mo...
Proceedings of the 14th ACM conference on Computer and communications security, 2007
Protocol reverse engineering, the process of extracting the application-level protocol used by an implementation without access to the protocol specification, is important for many network security applications. Recent work [17] has proposed protocol reverse engineering by using clustering on network traces. That kind of approach is limited by the lack of semantic information in network traces. In this paper we propose a new approach that uses program binaries. Our approach, shadowing, uses dynamic binary analysis and is based on a unique intuition: the way that an implementation of the protocol processes the received application data reveals a wealth of information about the protocol message format. We have implemented our approach in a system called Polyglot and evaluated it extensively using real-world implementations of five different protocols: DNS, HTTP, IRC, Samba, and ICQ. We compare our results with the manually crafted message formats included in Wireshark, one of the state-of-the-art protocol analyzers. The differences we find are small and usually due to different implementations handling fields in different ways. Finding such differences between implementations is an added benefit, as they are important for problems such as fingerprint generation, fuzzing, and error detection.
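The intuition above (how the receiver processes the bytes reveals the message format) can be illustrated with a toy: if each received byte is labeled with the parsing routine that consumed it, field boundaries fall where the label changes. The trace below is fabricated for illustration and is not Polyglot's real output format.

```python
# Toy field-boundary recovery from a (fabricated) per-byte processing trace.
from itertools import groupby

def fields_from_trace(message, byte_labels):
    """Split `message` into fields where the consuming-code label changes."""
    fields, pos = [], 0
    for label, run in groupby(byte_labels):
        length = len(list(run))
        fields.append((label, message[pos:pos + length]))
        pos += length
    return fields

msg = b"GET /index.html HTTP/1.1\r\n"
labels = (["method"] * 3 + ["sep"] + ["uri"] * 11 + ["sep"] +
          ["version"] * 8 + ["eol"] * 2)
print(fields_from_trace(msg, labels))
```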
2009 IEEE International Workshop on Genomic Signal Processing and Statistics, 2009
Coding and non-coding gene prediction is still a challenge. Diverse computer-based tools have been created to screen sequences using elaborate strategies for gene prediction. Many of these implement various statistical tests to measure the plausibility of the prediction, but until now a comprehensive negative control did not exist. We developed an algorithm that generates sequences with the characteristics of the intergenic regions of a genome, including nucleotide composition and typical inserted elements such as interspersed repeats, low-complexity sequences, and pseudogenes. We also challenged several gene prediction programs to compare the artificial sequences with real intergenic regions.
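A small sketch of the core idea of such a negative control is shown below: sample a random sequence that reproduces the nucleotide composition of real intergenic regions. The published algorithm also inserts repeats, low-complexity stretches, and pseudogenes, which this toy omits; the template fragment is illustrative.

```python
# Generate an artificial "intergenic" sequence matching a template's
# nucleotide composition (negative-control material for gene predictors).
import random

def artificial_intergenic(template, length, seed=None):
    """Sample `length` bases with the base frequencies of `template`."""
    rng = random.Random(seed)
    bases = "ACGT"
    weights = [template.count(b) for b in bases]
    return "".join(rng.choices(bases, weights=weights, k=length))

real_intergenic = "ATATTTGCGATATAAATTTCGATATTTAAGC"  # illustrative fragment
print(artificial_intergenic(real_intergenic, 60, seed=1))
```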
To provide context for the diversification of archosaurs--the group that includes crocodilians, dinosaurs, and birds--we generated draft genomes of three crocodilians: Alligator mississippiensis (the American alligator), Crocodylus porosus (the saltwater crocodile), and Gavialis gangeticus (the Indian gharial). We observed an exceptionally slow rate of genome evolution within crocodilians at all levels, including nucleotide substitutions, indels, transposable element content and movement, gene family evolution, and chromosomal synteny. When placed within the context of related taxa including birds and turtles, this suggests that the common ancestor of all of these taxa also exhibited slow genome evolution and that the comparatively rapid evolution is derived in birds. The data also provided the opportunity to analyze heterozygosity in crocodilians, which indicates a likely reduction in population size for all three taxa through the Pleistocene. Finally, these data combined with newl...
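For readers unfamiliar with the heterozygosity measure mentioned above, the sketch below computes the fraction of called sites at which an individual carries two different alleles. The genotype encoding is a hypothetical simplification of what a variant caller would report, not the study's pipeline.

```python
# Per-individual heterozygosity: heterozygous calls over all called sites.
def heterozygosity(genotypes):
    """genotypes: list of (allele1, allele2) tuples; None marks a missing call."""
    called = [g for g in genotypes if None not in g]
    hets = sum(1 for a, b in called if a != b)
    return hets / len(called) if called else 0.0

sites = [("A", "A"), ("A", "G"), ("C", "C"), ("T", "C"), (None, None), ("G", "G")]
print(f"heterozygosity = {heterozygosity(sites):.3f}")  # 2 of 5 called sites
```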
The transport of Glyphosate ([N-phosphonomethyl] glycine), AMPA (aminomethylphosphonic acid, CH(6)NO(3)P), and Bromide (Br(-)) has been studied in the Mediterranean Maresme area of Spain, north of Barcelona, where groundwater is located at a depth of 5.5 m. The unsaturated zone of weathered-granite soils was characterized in adjacent irrigated and non-irrigated experimental plots, where 11 and 10 boreholes were drilled, respectively. At the non-irrigated plot, the first half of the period was affected by persistent and intense rainfall. After 69 days of application, residues of Glyphosate up to 73.6 microg g(-1) were detected down to a depth of 0.5 m under irrigated conditions; AMPA, analyzed only in the irrigated plot, was detected down to a depth of 0.5 m. According to the retardation coefficient of Glyphosate relative to that of Br(-) for the topsoil and subsoil (80 and 83, respectively) and the maximum observed migration depth of Br(-) (2.9 m), Glyphosate and AMPA should have been detected only down to a depth of 0.05 m. Such migration could be related to the low content of organic matter and clays in the soils, recharge generated by irrigation and heavy rain, and possible preferential solute transport and/or colloid-mediated transport.
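As a back-of-the-envelope check of the retardation argument: a solute with retardation coefficient R moves roughly 1/R as deep as a conservative tracer such as bromide, so the expected Glyphosate depth follows from the observed Br(-) depth. The small sketch below reproduces that estimate; the rounding to ~0.05 m in the abstract may reflect details not captured by this first-order approximation.

```python
# Expected migration depth of a retarded solute from the tracer depth.
def expected_migration_depth(tracer_depth_m, retardation):
    """Approximate depth reached by a solute with retardation factor R."""
    return tracer_depth_m / retardation

br_depth = 2.9          # maximum observed bromide migration depth (m)
for r in (80, 83):      # topsoil and subsoil retardation coefficients
    print(f"R = {r}: expected depth ~ {expected_migration_depth(br_depth, r):.3f} m")
```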