Unit 3


MALWARE ANALYSIS

Malware is code that performs malicious actions; it can take the form of an executable, a script, or
any other software. Attackers use malware to steal sensitive information, spy on the infected system, or
take control of the system. It typically gets into your system without your consent and can be delivered
via various communication channels such as email, web, or USB drives.

The following are some of the malicious actions performed by malware:

* Disrupting computer operations
* Stealing sensitive information, including personal, business, and financial data
* Gaining unauthorized access to the victim's system
* Spying on victims
* Sending spam emails
* Engaging in distributed denial-of-service (DDoS) attacks
* Locking up the files on the computer and holding them for ransom

Malware can be of different types based on functionality. A few of these are:

Types of Malware Analysis

FEATURE EXTRACTION
Feature extraction is a crucial step in malware analysis, where relevant characteristics or features of the malware are identified and extracted
for further analysis. These features can be used to build machine learning models, train classifiers, or develop signatures for detection. Some of
the features that can be extracted during malware analysis are:

(i) Static Features:

(a) File size: The size of a malware file can give an idea of its complexity.
(b) File type: The nature of the malware can be identified by looking at the file type (e.g., executable, DLL,
etc.).
(c) File metadata: Extracting information such as creation time, modification time, and file permissions.
(d) Entropy: The entropy of specific sections within the binary, indicating randomness and potential obfuscation (high values often suggest packed or encrypted content).
(e) Header information: Features extracted from the Portable Executable (PE) header, such as the entry point, image base, and
section information.
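
The entropy feature can be illustrated with a short Python sketch. It assumes the usual Shannon entropy measured in bits per byte; the function name `shannon_entropy` is chosen here for illustration and does not come from any particular tool.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how often each byte value occurs
    # H = sum over byte values of p * log2(1 / p)
    return sum((n / total) * math.log2(total / n) for n in counts.values())

# A run of one repeated byte carries no randomness, while data in which
# every byte value is equally likely reaches the maximum of 8 bits per byte.
low = shannon_entropy(b"A" * 1024)
high = shannon_entropy(bytes(range(256)))
print(low, high)
```

In practice the function would be applied to each section of a binary separately, since a single packed section can be hidden by averaging over the whole file.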

(ii) Strings and keywords:

(a) API function calls: Extracting the list of API functions called by the malware.
(b) Unique strings: Identifying specific strings or keywords within the binary, which may include hardcoded
URLs, registry keys, or other indicators.
(c) Cryptographic constants: Identifying constants used in encryption or hashing algorithms.
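
As a rough illustration of string extraction, the sketch below pulls printable ASCII runs out of a byte buffer and then looks for URLs and registry keys. The regular expressions and the sample bytes are simplifying assumptions made for this example (for instance, the registry pattern stops at the first space); they are not a complete indicator grammar.

```python
import re

# Illustrative patterns only; real indicator matching is usually broader.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")
REGISTRY_PATTERN = re.compile(r"(?:HKEY_[A-Z_]+|HKLM|HKCU)\\[^\s\"']+")

def extract_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of at least min_len printable ASCII characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]

def find_indicators(strings) -> dict:
    """Collect URLs and registry keys that appear inside the strings."""
    urls, keys = [], []
    for text in strings:
        urls.extend(URL_PATTERN.findall(text))
        keys.extend(REGISTRY_PATTERN.findall(text))
    return {"urls": urls, "registry_keys": keys}

# Made-up sample bytes standing in for a binary.
sample = (b"\x00\x01http://example.com/payload.bin\x00\xff"
          b"HKCU\\Software\\Microsoft\\Windows\\CurrentVersion\\Run\x00ab\x00")
print(find_indicators(extract_strings(sample)))
```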

(iii) Code analysis:


(a) Control Flow Graph (CFG): Analysing the control flow of the code to understand its structure.
(b) Function Calls: Extracting information about function calls, arguments and returns.
(c) Opcode sequences: Extracting sequences of opcodes from the disassembled code to capture the malware’s behaviour.
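
Opcode sequences are commonly turned into features by counting n-grams. The sketch below assumes the opcodes have already been produced by a disassembler; the mnemonic list is invented for illustration.

```python
from collections import Counter

def opcode_ngrams(opcodes, n=2):
    """Count every run of n consecutive opcodes in the sequence."""
    return Counter(
        tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)
    )

# Hypothetical disassembly output, reduced to mnemonics.
trace = ["push", "mov", "call", "push", "mov", "call", "ret"]
bigrams = opcode_ngrams(trace, 2)
for gram, count in bigrams.most_common():
    print(gram, count)
```

The resulting counts can be used directly as a sparse feature vector for the classifiers discussed later.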

(iv) Resource Information:
(a) Embedded resources: Identifying any files or resources within the malware.
(b) Resource Names: Extracting names and types of resources, which can provide clues about the malware’s
functionality.

(v) Network Features:


(a) Domain and IP addresses: Extracting the domains and IP addresses that the malware communicates
with.
(b) Communication protocols: Identifying the protocols used for communication, such as HTTP, HTTPS or custom
protocols.
(c) Packet sizes and timing: Analysing network traffic patterns, including packet sizes and timing, for anomaly
detection.
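
Packet sizes and timing can be summarised into a few numeric features. The sketch below assumes the capture has already been reduced to `(timestamp_seconds, size_bytes)` pairs; that record format and the function name are assumptions made for this example.

```python
from statistics import mean, pstdev

def traffic_features(packets):
    """Summarise packet sizes and inter-arrival gaps.

    packets: iterable of (timestamp_seconds, size_bytes) pairs.
    """
    ordered = sorted(packets)  # order by timestamp
    sizes = [size for _, size in ordered]
    gaps = [later[0] - earlier[0] for earlier, later in zip(ordered, ordered[1:])]
    return {
        "packet_count": len(ordered),
        "mean_size": mean(sizes) if sizes else 0.0,
        "mean_gap": mean(gaps) if gaps else 0.0,
        "gap_stdev": pstdev(gaps) if gaps else 0.0,
    }

# Made-up capture: one 100-byte packet every 10 seconds.
capture = [(0.0, 100), (10.0, 100), (20.0, 100), (30.0, 100)]
print(traffic_features(capture))
```

Very regular gaps with near-identical sizes, as in this toy capture, are one pattern an analyst might treat as a hint of periodic check-ins, although regular traffic also has many benign causes.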

(vi) Behavioural Features:


(a) System Calls: Extracting information about system calls made during execution.
(b) Registry Operations: Capturing changes made to the Windows Registry by the malware.
(c) File system operations: Monitoring file creations, deletions and modifications caused by the malware.

(vii) Temporal and Contextual Features:


(a) Timestamps: Analysing timestamps of file creation, modification and access.
(b) Execution Order: Extracting the order of executed operations to understand the malware’s behaviour over time.

After these features are extracted from a dataset of malware samples, they can be used as input for machine learning algorithms such as decision
trees, random forests, support vector machines, or neural networks. The transition from features to classification involves training a model on
labelled data (malware and benign samples) so that it learns to distinguish between the two classes. The model can then be evaluated and applied
to new, unseen samples to classify whether a given sample is malicious or benign.

The success of the classification model relies on the quality of the features, the representativeness of the dataset, and the choice of an
appropriate machine learning algorithm. Regular validation, testing, and updates are essential to keep up with the evolving nature of malware.
Additionally, ensemble methods, which combine multiple models, may enhance overall detection performance. Here is a step-by-step process:

1. Feature Extraction:
Extract relevant features from the malware samples using static or dynamic analysis techniques, as discussed earlier. These features capture
different aspects of the malware's behaviour, structure, and characteristics.
2. Dataset Preparation:
Create a labelled dataset where each sample is annotated as either malicious or benign. This dataset is used for training and evaluating the
machine learning model.
3. Feature Scaling and Normalization:
Scale and normalize the features to ensure that they are on a consistent scale. Common techniques include min-max scaling or standardization.
4. Feature Selection:
If the feature set is large, consider performing feature selection to identify the most relevant features. This can help reduce dimensionality and
improve model efficiency.
5. Splitting the Dataset:
Split the dataset into training and testing sets. The training set is used to train the machine learning model, while the testing set is used to
evaluate its performance.
6. Model Selection:
Choose an appropriate machine learning algorithm for classification. Common algorithms used in malware analysis include decision trees,
random forests, support vector machines, and deep learning models.
7. Training the Model:
Train the selected machine learning model using the training dataset. The model learns the patterns and relationships between features and
their corresponding labels (malicious or benign).
8. Model Evaluation:
Evaluate the trained model on the testing dataset to assess its performance. Common evaluation metrics include accuracy, precision, recall,
F1-score, and area under the ROC curve (AUC-ROC).
9. Hyperparameter Tuning:
Fine-tune the model's hyperparameters to optimize its performance. This may involve adjusting parameters such as learning rate,
regularization, or tree depth, depending on the chosen algorithm.
10. Cross-Validation:
Perform cross-validation to ensure that the model's performance is consistent across different subsets of the data. This helps assess its
generalization ability.
11. Deployment:
Once the model achieves satisfactory performance, it can be deployed in a real-world environment for malware detection. This may involve
integrating the model into security systems, antivirus software, or other cybersecurity tools.
12. Monitoring and Updating:
Continuously monitor the model's performance in the production environment. Periodically update the model with new data to adapt to
emerging threats and maintain effectiveness over time.
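
Steps 3, 5, 7, and 8 above can be sketched in plain Python. This is a minimal illustration, not a recommended detector: a nearest-centroid classifier stands in for the algorithms named in step 6, and the feature values are invented. The sketch follows the step order given above; in practice the scaling ranges are often computed on the training set only, so that the test set does not influence them.

```python
import random

def min_max_scale(rows):
    """Step 3: rescale every feature column to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def split_dataset(rows, labels, test_ratio=0.25, seed=0):
    """Step 5: shuffle the samples and split them into train and test sets."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    cut = int(len(order) * (1 - test_ratio))
    train, test = order[:cut], order[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])

def train_centroids(rows, labels):
    """Step 7: 'train' by averaging the feature vectors of each class."""
    centroids = {}
    for label in set(labels):
        members = [r for r, l in zip(rows, labels) if l == label]
        centroids[label] = [sum(c) / len(members) for c in zip(*members)]
    return centroids

def predict(centroids, row):
    """Return the label whose centroid is closest to the row."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(row, centroids[label]))
    return min(centroids, key=distance)

def evaluate(y_true, y_pred, positive=1):
    """Step 8: accuracy, precision, recall, and F1-score."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = correct / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Invented feature vectors: [section entropy, number of imported functions].
features = [[5.1, 120], [4.8, 150], [5.5, 90], [5.0, 110],   # benign (0)
            [7.6, 8], [7.9, 5], [7.4, 12], [7.8, 3]]         # malicious (1)
labels = [0, 0, 0, 0, 1, 1, 1, 1]

scaled = min_max_scale(features)
x_train, y_train, x_test, y_test = split_dataset(scaled, labels)
model = train_centroids(x_train, y_train)
predictions = [predict(model, row) for row in x_test]
print(evaluate(y_test, predictions))
```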

LIVE MALWARE ANALYSIS

Live malware analysis involves studying and understanding the behaviour of malware in a real-time or live environment.
Unlike static analysis, which focuses on examining the static properties of a malware sample (such as file structure, strings,
and code), live malware analysis involves observing the dynamic behaviour of the malware as it runs in a controlled
environment. Here are the key aspects and techniques involved in live malware analysis:

1. Sandboxing

* Isolation Environment: Running the malware in an isolated environment or sandbox that emulates a real operating
system without affecting the actual production environment.
* Dynamic Analysis: Observing the behaviour of the malware as it executes, interacts with the system, and attempts to
carry out malicious activities.

2. Traffic Analysis

* Packet Capture: Capturing and analysing network traffic generated by the malware to understand communication
patterns and detect malicious activities.
* Protocol Analysis: Analysing the protocols used by the malware for communication, which can help identify the nature of
the malicious network traffic.

3. Dynamic Behaviour Analysis

* API Calls Monitoring: Tracking the Windows API calls made by the malware during execution. This helps in
understanding the functionality and potential malicious actions.
* Network Activity: Monitoring network communications initiated by the malware, including connections to command
and control servers, data exfiltration, or communication with other malicious entities.
* File System and Registry Changes: Observing modifications made to the file system and Windows Registry, such as file
creations, deletions, and registry key modifications.
* Memory Analysis: Examining changes in the system's memory, such as injected code, modified processes, or altered
system states.
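
File system changes can be observed by comparing a snapshot taken before execution with one taken afterwards. The sketch below is an assumption-laden illustration: it hashes every file under a directory and diffs the two snapshots, and it simulates the "malware" with ordinary file writes inside a temporary directory.

```python
import hashlib
import tempfile
from pathlib import Path

def snapshot(root):
    """Map each file under root (by relative path) to its SHA-256 digest."""
    result = {}
    for path in Path(root).rglob("*"):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            result[str(path.relative_to(root))] = digest
    return result

def diff_snapshots(before, after):
    """Report files created, deleted, or modified between two snapshots."""
    return {
        "created": sorted(set(after) - set(before)),
        "deleted": sorted(set(before) - set(after)),
        "modified": sorted(p for p in before.keys() & after.keys()
                           if before[p] != after[p]),
    }

# Simulated run: the writes below stand in for actions taken by a sample.
with tempfile.TemporaryDirectory() as sandbox_dir:
    Path(sandbox_dir, "notes.txt").write_text("original")
    before = snapshot(sandbox_dir)
    Path(sandbox_dir, "notes.txt").write_text("changed")        # modification
    Path(sandbox_dir, "dropped.bin").write_bytes(b"\x00\x01")   # new file
    after = snapshot(sandbox_dir)

changes = diff_snapshots(before, after)
print(changes)
```

Real sandboxes typically hook file APIs instead of diffing snapshots, which also captures files that are created and deleted again during the run.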

4. Capture and Analysis Tools

* Capture Screenshots: Taking screenshots during the execution of the malware to capture visual changes on the system.
* Dynamic Analysis Tools: Leveraging tools designed for dynamic malware analysis, such as Cuckoo Sandbox, Joe
Sandbox, or other custom sandboxing solutions.

5. Payload Analysis

* Decryption and Decoding: Decrypting or decoding any encrypted or encoded payloads to understand the actual
content and purpose of the payload.
* Payload Delivery: Observing how the malware delivers and executes its payload on the infected system.
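
As one example of payload decoding, the sketch below brute-forces a single-byte XOR key. Single-byte XOR is assumed here only because it is a simple, frequently cited obfuscation; real samples may use entirely different schemes, and the payload and marker are invented.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)

def brute_force_xor(data: bytes, marker: bytes):
    """Try all 256 keys and return (key, decoded) for the first key whose
    output contains the expected marker, or None if no key matches."""
    for key in range(256):
        decoded = xor_bytes(data, key)
        if marker in decoded:
            return key, decoded
    return None

# Invented payload: a URL hidden with the key 0x5A.
encoded = xor_bytes(b"http://example.com/stage2", 0x5A)
result = brute_force_xor(encoded, b"http")
if result is not None:
    key, decoded = result
    print(hex(key), decoded)
```

The marker plays the role of a known-plaintext guess, such as `http` for a URL or `MZ` for an embedded executable.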

6. Interaction with External Entities

* Command and Control (C2) Server Interaction: Observing communication between the malware and its command
and control server to understand the instructions received by the malware.
* Data Exfiltration: Identifying attempts by the malware to steal or send sensitive data from the infected system.

7. Runtime Analysis Tools

* Debuggers and Tracers: Using tools like debuggers to dynamically analyse the malware's code execution, inspect
variables, and step through the code.
* Memory Forensics: Analysing the memory space of the infected process to identify artifacts left by the malware.

DEAD MALWARE ANALYSIS

Dead malware analysis, also known as static malware analysis, examines a malware sample without running it in a live or
dynamic environment. It focuses on the structure, content, and characteristics of the sample rather than on its runtime
behaviour. The information it yields is crucial for developing signatures, understanding the malware's capabilities, and
enhancing overall cybersecurity defenses. Here are the key aspects and techniques involved in dead malware analysis:

1. File Analysis

* File Hashing: Calculating hash values (MD5, SHA-1, SHA-256) to identify the malware sample. SHA-256 is the safer
identifier, because practical collisions are known for MD5 and SHA-1.
* File Metadata: Examining file properties, such as file size, creation date, modification date, and file type.
* File Type Identification: Determining the type of file (e.g., executable, document) and analyzing its structure.
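
File hashing and file type identification can be sketched with the Python standard library. The signature table below is a small illustrative subset of well-known magic bytes, not a complete file type database.

```python
import hashlib

# A few well-known leading byte signatures (illustrative subset).
MAGIC_SIGNATURES = {
    b"MZ": "Windows executable (DOS/PE)",
    b"\x7fELF": "ELF executable",
    b"PK\x03\x04": "ZIP archive (also used by APK, JAR, DOCX)",
    b"%PDF": "PDF document",
}

def file_hashes(data: bytes) -> dict:
    """Compute the MD5, SHA-1, and SHA-256 digests of the data."""
    return {name: hashlib.new(name, data).hexdigest()
            for name in ("md5", "sha1", "sha256")}

def identify_type(data: bytes) -> str:
    """Guess the file type from its leading magic bytes."""
    for magic, description in MAGIC_SIGNATURES.items():
        if data.startswith(magic):
            return description
    return "unknown"

sample = b"MZ\x90\x00" + b"\x00" * 60   # made-up bytes, not a real binary
print(identify_type(sample))
print(file_hashes(sample))
```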

2. Strings Analysis

* Static Strings: Identifying hardcoded strings within the binary, which may include URLs, registry keys, or other
indicators.
* Unicode and ASCII Strings: Analysing both Unicode and ASCII strings present in the binary.

3. Code Analysis

* Disassembly and Decompilation: Reverse engineering the malware's code to understand its assembly code or
higher-level language representation.
* Control Flow Analysis: Studying the control flow of the code to comprehend its logic without executing it.

4. Header and Section Analysis

* PE Header Analysis: Examining the Portable Executable (PE) header fields to understand the binary's structure.
* Section Headers: Analysing the sections within the binary to identify different parts of the executable.
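
The header and section fields mentioned above can be read with the standard `struct` module. The sketch below is a minimal parser that assumes a well-formed file and follows the published PE layout (DOS header, PE signature, COFF header, optional header, section table); it performs no bounds checking beyond the two signatures, and the function name is chosen for this example.

```python
import struct

def parse_pe_header(data: bytes) -> dict:
    """Extract the entry point, image base, and section table of a PE file."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)   # e_lfanew
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")

    coff = pe_offset + 4                                   # 20-byte COFF header
    (machine, num_sections, timestamp, _symtab, _nsyms,
     opt_size, _characteristics) = struct.unpack_from("<HHIIIHH", data, coff)

    opt = coff + 20                                        # optional header
    (magic,) = struct.unpack_from("<H", data, opt)
    (entry_point,) = struct.unpack_from("<I", data, opt + 16)
    if magic == 0x20B:                                     # PE32+ (64-bit)
        (image_base,) = struct.unpack_from("<Q", data, opt + 24)
    else:                                                  # PE32 (32-bit)
        (image_base,) = struct.unpack_from("<I", data, opt + 28)

    sections = []
    table = opt + opt_size                                 # 40 bytes per entry
    for index in range(num_sections):
        name, vsize, vaddr, raw_size, raw_ptr = struct.unpack_from(
            "<8sIIII", data, table + 40 * index)
        sections.append({
            "name": name.rstrip(b"\x00").decode("ascii", "replace"),
            "virtual_size": vsize,
            "virtual_address": vaddr,
            "raw_size": raw_size,
            "raw_pointer": raw_ptr,
        })
    return {"machine": machine, "timestamp": timestamp,
            "entry_point": entry_point, "image_base": image_base,
            "sections": sections}
```

A typical call would be `parse_pe_header(open(path, "rb").read())`, where `path` points to a sample stored in an isolated analysis environment.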

5. Resource Analysis

* Embedded Resources: Identifying and extracting any embedded files, resources, or payloads within the malware.
* Resource Names and Types: Extracting information about the names and types of resources embedded in the binary.

6. Cryptographic Analysis

* Encryption Identification: Determining whether the malware uses encryption or obfuscation techniques.
* Cryptographic Constants: Identifying constants related to cryptographic algorithms within the code.
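
Cryptographic constants can be located by searching the binary for well-known values. The sketch below checks a handful of constants in both byte orders; the table is a short illustrative selection, and a match is only a hint, since the same bytes can occur by chance.

```python
import struct

# 32-bit constants from well-known algorithms (illustrative selection).
WORD_CONSTANTS = {
    0x67452301: "MD5/SHA-1 initial state",
    0x6A09E667: "SHA-256 initial hash value",
    0x428A2F98: "SHA-256 round constant K[0]",
    0x9E3779B9: "TEA/XTEA delta",
}
# Longer byte patterns, matched exactly as written.
BYTE_PATTERNS = {
    bytes.fromhex("637c777bf26b6fc5"): "AES S-box (first 8 bytes)",
}

def find_crypto_constants(data: bytes) -> list:
    """Return sorted (offset, label) pairs for the first hit of each constant."""
    hits = []
    for value, label in WORD_CONSTANTS.items():
        for fmt in ("<I", ">I"):   # little-endian, then big-endian
            offset = data.find(struct.pack(fmt, value))
            if offset != -1:
                hits.append((offset, label))
    for pattern, label in BYTE_PATTERNS.items():
        offset = data.find(pattern)
        if offset != -1:
            hits.append((offset, label))
    return sorted(hits)

# Made-up buffer containing two of the constants.
blob = (b"\x00" * 4 + struct.pack("<I", 0x67452301)
        + bytes.fromhex("637c777bf26b6fc5"))
print(find_crypto_constants(blob))
```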

ANDROID MALWARE ANALYSIS

Analysing Android malware involves studying malicious software designed to target the Android operating system. The goal is
to understand its behaviour, functionality, and potential impact. By combining static and dynamic analysis techniques, security
analysts can gain a comprehensive understanding of Android malware, enabling them to develop effective detection
mechanisms and mitigation strategies. Here are the key steps and techniques involved in Android malware analysis:

1. Static Analysis

* APK File Inspection: Examine the APK (Android Package) file, which is the package format used by Android
applications.
* File Hashing: Calculate hash values (MD5, SHA-1, SHA-256) to uniquely identify the APK.
* Manifest File Analysis: Inspect the AndroidManifest.xml file for information about the app's components, permissions,
and activities.
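
Manifest analysis can be sketched as follows. The manifest inside an APK is stored in a binary XML format, so this example assumes it has already been decoded to plain text (for instance with a tool such as apktool). The manifest content and the short "sensitive" list are invented for illustration.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

# Illustrative subset of permissions an analyst might look at first.
SENSITIVE_PERMISSIONS = {
    "android.permission.SEND_SMS",
    "android.permission.READ_SMS",
    "android.permission.READ_CONTACTS",
    "android.permission.RECORD_AUDIO",
}

def requested_permissions(manifest_xml: str) -> list:
    """List the permissions declared in a decoded AndroidManifest.xml."""
    root = ET.fromstring(manifest_xml)
    names = (element.get(ANDROID_NS + "name")
             for element in root.iter("uses-permission"))
    return [name for name in names if name]

# Made-up decoded manifest for a hypothetical app.
manifest = """
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.example.flashlight">
    <uses-permission android:name="android.permission.INTERNET"/>
    <uses-permission android:name="android.permission.SEND_SMS"/>
    <uses-permission android:name="android.permission.READ_CONTACTS"/>
</manifest>
"""

permissions = requested_permissions(manifest)
flagged = sorted(set(permissions) & SENSITIVE_PERMISSIONS)
print(permissions)
print(flagged)
```

A flashlight app that requests SMS and contacts access, as in this toy manifest, is the kind of mismatch between stated purpose and permissions that static analysis aims to surface.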

2. Dynamic Analysis

* Emulator or Device Execution: Run the APK in a controlled environment using an emulator or a dedicated
Android device.
* Behavioural Analysis: Observe the app's behaviour during execution, such as network communication, file system
changes, and interactions with the device.

3. Timeline Analysis

* Execution Timeline: Create a timeline of the app's execution to understand the sequence of events and potential
malicious activities over time.

4. Network Analysis

* Packet Capture: Use tools like Wireshark to capture and analyse network traffic generated by the app. This helps identify
communication with command and control servers, data exfiltration, or other malicious activities.
* HTTP/HTTPS Analysis: Examine HTTP/HTTPS traffic for potential malicious content or communication patterns.

5. File System Analysis

* File Operations: Monitor changes made to the file system, including file creations, modifications, and deletions.
* Sensitive File Access: Identify whether the app accesses sensitive files or directories on the device.

6. Malicious Payload Detection

* URL Analysis: Investigate URLs hardcoded in the app for signs of malicious content or phishing.
* Dynamic Payload Extraction: Analyse the app for the presence of dynamic payloads or malicious scripts.

7. Anti-Analysis Techniques

* Rooting Detection: Check for mechanisms employed by the app to detect rooted devices.
* Emulator Detection: Identify anti-emulator techniques used to thwart dynamic analysis.
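
One simple way to look for such checks is to scan the app's code or extracted strings for values that root and emulator detection routines commonly reference. The indicator lists below are a small illustrative selection, and the scanned bytes are invented; a hit only suggests that the app may inspect its environment.

```python
# Strings often referenced by root checks (illustrative selection).
ROOT_INDICATORS = (
    b"/system/xbin/su",
    b"/system/bin/su",
    b"Superuser.apk",
    b"test-keys",
)
# Strings often referenced by emulator checks (illustrative selection).
EMULATOR_INDICATORS = (
    b"ro.kernel.qemu",
    b"goldfish",
)

def find_anti_analysis_strings(data: bytes) -> dict:
    """Report which known indicator strings occur in the given bytes."""
    return {
        "root_checks": [s.decode() for s in ROOT_INDICATORS if s in data],
        "emulator_checks": [s.decode() for s in EMULATOR_INDICATORS if s in data],
    }

# Made-up bytes standing in for the contents of a classes.dex file.
dex_like = b"\x00getprop\x00ro.kernel.qemu\x00\x01/system/xbin/su\x00exists\x00"
report = find_anti_analysis_strings(dex_like)
print(report)
```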
