Unit 3
Malware is code that performs malicious actions; it can take the form of an executable, a script, or
any other software. Attackers use malware to steal sensitive information, spy on the infected system, or
take control of the system. It typically gets into your system without your consent and can be delivered
via various communication channels such as email, web, or USB drives.
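Machine learning can be used to classify samples as malicious or benign. A typical workflow involves the following steps:
1. Feature Extraction: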
Extract relevant features from the malware samples using static or dynamic analysis techniques, as discussed earlier. These features capture
different aspects of the malware's behaviour, structure, and characteristics.
2. Dataset Preparation:
Create a labelled dataset where each sample is annotated as either malicious or benign. This dataset is used for training and evaluating the
machine learning model.
3. Feature Scaling and Normalization:
Scale and normalize the features to ensure that they are on a consistent scale. Common techniques include min-max scaling or standardization.
4. Feature Selection:
If the feature set is large, consider performing feature selection to identify the most relevant features. This can help reduce dimensionality and
improve model efficiency.
5. Splitting the Dataset:
Split the dataset into training and testing sets. The training set is used to train the machine learning model, while the testing set is used to
evaluate its performance.
6. Model Selection:
Choose an appropriate machine learning algorithm for classification. Common algorithms used in malware analysis include decision trees,
random forests, support vector machines, and deep learning models.
7. Training the Model:
Train the selected machine learning model using the training dataset. The model learns the patterns and relationships between features and
their corresponding labels (malicious or benign).
8. Model Evaluation:
Evaluate the trained model on the testing dataset to assess its performance. Common evaluation metrics include accuracy, precision, recall,
F1-score, and area under the ROC curve (AUC-ROC).
9. Hyperparameter Tuning:
Fine-tune the model's hyperparameters to optimize its performance. This may involve adjusting parameters such as learning rate,
regularization, or tree depth, depending on the chosen algorithm.
10. Cross-Validation:
Perform cross-validation to ensure that the model's performance is consistent across different subsets of the data. This helps assess its
generalization ability.
11. Deployment:
Once the model achieves satisfactory performance, it can be deployed in a real-world environment for malware detection. This may involve
integrating the model into security systems, antivirus software, or other cybersecurity tools.
12. Monitoring and Updating:
Continuously monitor the model's performance in the production environment. Periodically update the model with new data to adapt to
emerging threats and maintain effectiveness over time.
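Steps 3, 5, 7 and 8 above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a production pipeline: the helper names are invented for this example, and a simple nearest-centroid classifier stands in for the decision trees, random forests, or support vector machines named in step 6. The scaling parameters are fitted on the training set only, so that no information from the testing set leaks into training.

```python
import random

def train_test_split(X, y, test_ratio=0.25, seed=0):
    """Shuffle the sample indices and split them into training and testing parts."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * (1 - test_ratio))
    train, test = idx[:cut], idx[cut:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])

def fit_min_max(rows):
    """Learn the per-feature minimum and maximum from the training rows."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [max(c) for c in cols]

def apply_min_max(rows, lows, highs):
    """Rescale each feature with the learned bounds (constant features map to 0)."""
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(row, lows, highs)] for row in rows]

def fit_centroids(X, y):
    """Nearest-centroid 'model': the mean feature vector of each class."""
    centroids = {}
    for label in set(y):
        members = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def predict(centroids, x):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist2(centroids[label]))

def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1-score for the positive (malicious) class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

A typical call sequence would be: split the labelled data, fit the scaler on the training rows, scale both parts, fit the centroids, predict each testing row, and pass the true and predicted labels to precision_recall_f1.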
LIVE MALWARE ANALYSIS
Live malware analysis involves studying and understanding the behaviour of malware in a real-time or live environment.
Unlike static analysis, which focuses on examining the static properties of a malware sample (such as file structure, strings,
and code), live malware analysis involves observing the dynamic behaviour of the malware as it runs in a controlled
environment. Here are the key aspects and techniques involved in live malware analysis:
1. Sandboxing
Isolation Environment: Running the malware in an isolated environment or sandbox that emulates a real operating
system without affecting the actual production environment.
Dynamic Analysis: Observing the behaviour of the malware as it executes, interacts with the system, and attempts to
carry out malicious activities.
2. Traffic Analysis
Packet Capture: Capturing and analysing network traffic generated by the malware to understand communication
patterns and detect malicious activities.
Protocol Analysis: Analysing the protocols used by the malware for communication, which can help identify the nature of
the malicious network traffic.
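As an illustration of protocol analysis, the sketch below decodes the fixed part of an IPv4 header and, for TCP or UDP, the source and destination ports. It assumes the packet bytes start at the IP header (the link-layer header has already been stripped by the capture tool); the function name parse_ipv4_packet is made up for this example.

```python
import ipaddress
import struct

PROTOCOLS = {1: "ICMP", 6: "TCP", 17: "UDP"}

def parse_ipv4_packet(packet):
    """Decode the 20-byte fixed IPv4 header and the transport-layer ports."""
    (ver_ihl, _tos, total_len, _ident, _frag, ttl, proto, _checksum,
     src, dst) = struct.unpack_from("!BBHHHBBH4s4s", packet, 0)
    if ver_ihl >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    header_len = (ver_ihl & 0x0F) * 4  # IHL is counted in 32-bit words
    info = {
        "src": str(ipaddress.IPv4Address(src)),
        "dst": str(ipaddress.IPv4Address(dst)),
        "protocol": PROTOCOLS.get(proto, str(proto)),
        "ttl": ttl,
        "length": total_len,
    }
    if proto in (6, 17):
        # TCP and UDP both begin with source port, then destination port.
        info["src_port"], info["dst_port"] = struct.unpack_from(
            "!HH", packet, header_len)
    return info
```

Grouping the decoded packets by destination address and port gives a quick picture of which hosts the sample talks to and over which protocols.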
3. Dynamic Behaviour Analysis
API Calls Monitoring: Tracking the Windows API calls made by the malware during execution. This helps in
understanding the functionality and potential malicious actions.
Network Activity: Monitoring network communications initiated by the malware, including connections to command
and control servers, data exfiltration, or communication with other malicious entities.
File System and Registry Changes: Observing modifications made to the file system and Windows Registry, such as file
creations, deletions, and registry key modifications.
Memory Analysis: Examining changes in the system's memory, such as injected code, modified processes, or altered
system states.
Screenshot Capture: Taking screenshots during the execution of the malware to capture visual changes on the system.
Dynamic Analysis Tools: Leveraging tools designed for dynamic malware analysis, such as Cuckoo Sandbox, Joe
Sandbox, or other custom sandboxing solutions.
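The file system monitoring described above can be approximated by taking a snapshot of a directory tree before and after the sample runs and comparing the two. A minimal sketch, assuming the sandbox exposes the guest file system as an ordinary directory; the names snapshot and diff_snapshots are illustrative and do not come from any sandbox product.

```python
import hashlib
import os

def snapshot(root):
    """Map every file path under root to the SHA-256 digest of its contents."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    state[path] = hashlib.sha256(fh.read()).hexdigest()
            except OSError:
                continue  # unreadable or vanished file: skip it
    return state

def diff_snapshots(before, after):
    """Report files created, deleted, or modified between two snapshots."""
    created = sorted(set(after) - set(before))
    deleted = sorted(set(before) - set(after))
    modified = sorted(p for p in set(before) & set(after)
                      if before[p] != after[p])
    return {"created": created, "deleted": deleted, "modified": modified}
```

Dedicated sandboxes record such changes as they happen; a before-and-after comparison only sees the final state, so a file that is created and then deleted during the run would be missed.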
5. Payload Analysis
Decryption and Decoding: Decrypting or decoding any encrypted or encoded payloads to understand the actual
content and purpose of the payload.
Payload Delivery: Observing how the malware delivers and executes its payload on the infected system.
Command and Control (C2) Server Interaction: Observing communication between the malware and its command
and control server to understand the instructions received by the malware.
Data Exfiltration: Identifying attempts by the malware to steal or send sensitive data from the infected system.
Debuggers and Tracers: Using tools like debuggers to dynamically analyse the malware's code execution, inspect
variables, and step through the code.
Memory Forensics: Analysing the memory space of the infected process to identify artifacts left by the malware.
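To illustrate the decryption and decoding step, the sketch below handles one simple and commonly seen scheme: a payload that is Base64-encoded and then XOR-ed with a single-byte key. This is an assumed scenario for demonstration only; real samples may use entirely different encodings or ciphers. The analyst supplies a marker (for example b"http://" or b"MZ") that is expected to appear in the decoded payload.

```python
import base64

def xor_bytes(data, key):
    """XOR data with a repeating key (both given as bytes)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def brute_force_single_byte_xor(data, marker):
    """Try all 256 one-byte keys; keep results that contain the marker."""
    hits = []
    for k in range(256):
        plain = xor_bytes(data, bytes([k]))
        if marker in plain:
            hits.append((k, plain))
    return hits

def decode_base64_then_xor(blob, marker):
    """Undo the assumed two layers: Base64 first, then single-byte XOR."""
    raw = base64.b64decode(blob)
    return brute_force_single_byte_xor(raw, marker)
```

Each hit is a (key, plaintext) pair; more than one key can match a short marker, so the candidates still need to be inspected by hand.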
DEAD MALWARE ANALYSIS
Dead malware analysis, also known as static malware analysis, examines a malware sample without running it in a live or
dynamic environment. It focuses on the structure, content, and characteristics of the sample rather than its runtime behaviour.
The information gained is crucial for developing signatures, understanding the malware's capabilities, and enhancing overall
cybersecurity defenses. Here are the key aspects and techniques involved in dead malware analysis:
1. File Analysis
File Hashing: Calculating hash values (MD5, SHA-1, SHA-256) to fingerprint the malware sample. SHA-256 is preferred for unique identification because MD5 and SHA-1 are vulnerable to collisions.
File Metadata: Examining file properties, such as file size, creation date, modification date, and file type.
File Type Identification: Determining the type of file (e.g., executable, document) and analysing its structure.
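The three file analysis techniques above can be combined in one small report function. A minimal sketch using only the Python standard library; the MAGIC table is a short illustrative list of well-known file signatures, not a complete database, and the function names are chosen for this example.

```python
import hashlib
import os

# A few well-known leading byte signatures ("magic numbers").
MAGIC = [
    (b"MZ", "Windows PE/DOS executable"),
    (b"\x7fELF", "ELF executable"),
    (b"PK\x03\x04", "ZIP archive (also APK, JAR, OOXML documents)"),
    (b"%PDF", "PDF document"),
]

def hash_file(path, chunk_size=65536):
    """Compute MD5, SHA-1 and SHA-256 in a single pass over the file."""
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            for d in digests.values():
                d.update(chunk)
    return {name: d.hexdigest() for name, d in digests.items()}

def identify_type(path):
    """Guess the file type from its first bytes, ignoring the extension."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    for magic, label in MAGIC:
        if head.startswith(magic):
            return label
    return "unknown"

def file_report(path):
    """Collect size, modification time, detected type and hashes."""
    st = os.stat(path)
    return {"size": st.st_size, "modified": st.st_mtime,
            "type": identify_type(path), **hash_file(path)}
```

None of these functions executes the sample; they only read its bytes, which is what makes them safe to use in dead analysis.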
2. Strings Analysis
Static Strings: Identifying hardcoded strings within the binary, which may include URLs, registry keys, or other
indicators.
Unicode and ASCII Strings: Analysing both Unicode and ASCII strings present in the binary.
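String extraction can be sketched with two regular expressions, one for runs of printable ASCII bytes and one for the same characters stored as UTF-16LE (the usual "Unicode" string encoding in Windows binaries). The minimum length of 4 is an arbitrary default chosen for this example, similar in spirit to the behaviour of common strings utilities.

```python
import re

def ascii_strings(data, min_len=4):
    """Return runs of at least min_len printable ASCII bytes."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.group().decode("ascii") for m in re.finditer(pattern, data)]

def utf16le_strings(data, min_len=4):
    """Return runs of printable characters stored as UTF-16LE (char + NUL)."""
    pattern = rb"(?:[\x20-\x7e]\x00){%d,}" % min_len
    return [m.group().decode("utf-16-le") for m in re.finditer(pattern, data)]
```

The results can then be searched for URLs, registry keys, file paths, or other indicators.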
3. Code Analysis
Disassembly and Decompilation: Translating the binary into assembly code (disassembly) or a
higher-level language representation (decompilation) in order to understand its logic.
Control Flow Analysis: Studying the control flow of the code to comprehend its logic without executing it.
PE Header Analysis: Examining the Portable Executable (PE) header fields to understand the binary's structure.
Section Headers: Analysing the sections within the binary to identify different parts of the executable.
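The PE header and section header fields mentioned above can be read directly with the struct module. The sketch below follows the documented PE layout (DOS header, PE signature, COFF file header, section table) but decodes only a handful of fields and performs minimal validation; dedicated PE parsing libraries handle far more cases. The function name and the returned dictionary keys are chosen for this example.

```python
import struct

IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_WRITE = 0x80000000

def parse_pe_headers(data):
    """Extract basic COFF header fields and the section table from PE bytes."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    (machine, n_sections, timestamp, _symtab, _nsyms,
     opt_size, characteristics) = struct.unpack_from(
        "<HHIIIHH", data, e_lfanew + 4)
    # The section table follows the 20-byte COFF header and optional header.
    table = e_lfanew + 4 + 20 + opt_size
    sections = []
    for i in range(n_sections):
        (name, vsize, vaddr, raw_size, raw_ptr,
         _relocs, _lines, _nrelocs, _nlines, flags) = struct.unpack_from(
            "<8sIIIIIIHHI", data, table + 40 * i)
        sections.append({
            "name": name.rstrip(b"\x00").decode("ascii", errors="replace"),
            "virtual_size": vsize,
            "virtual_address": vaddr,
            "raw_size": raw_size,
            "raw_offset": raw_ptr,
            "executable": bool(flags & IMAGE_SCN_MEM_EXECUTE),
            "writable": bool(flags & IMAGE_SCN_MEM_WRITE),
        })
    return {"machine": machine, "timestamp": timestamp,
            "characteristics": characteristics, "sections": sections}
```

A section that is both writable and executable, or whose name is unusual, is often worth a closer look, although neither property proves that a file is malicious.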
5. Resource Analysis
Embedded Resources: Identifying and extracting any embedded files, resources, or payloads within the malware.
Resource Names and Types: Extracting information about the names and types of resources embedded in the binary.
6. Cryptographic Analysis
Encryption Identification: Determining whether the malware uses encryption or obfuscation techniques.
Cryptographic Constants: Identifying constants related to cryptographic algorithms within the code.
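Two simple heuristics illustrate the techniques above. Shannon entropy, measured in bits per byte, tends to be close to 8 for encrypted or compressed data, so a high value suggests (but does not prove) encryption or packing. Searching for well-known constants can hint at which algorithm is present. The constant tables below are a small illustrative selection, and the function names are chosen for this example.

```python
import math
import struct
from collections import Counter

# Start of the AES substitution box as it appears in table-based code.
BYTE_PATTERNS = {"AES S-box prefix": bytes.fromhex("637c777bf26b6fc5")}
# 32-bit initial hash values; searched in both byte orders.
WORD_CONSTANTS = {
    "MD5/SHA-1 initial value": 0x67452301,
    "SHA-256 initial value": 0x6A09E667,
}

def shannon_entropy(data):
    """Entropy in bits per byte: 0.0 for constant data, 8.0 for uniform bytes."""
    if not data:
        return 0.0
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())

def find_crypto_constants(data):
    """Return the names of known constants that occur in the data."""
    found = [name for name, pat in BYTE_PATTERNS.items() if pat in data]
    for name, word in WORD_CONSTANTS.items():
        if struct.pack("<I", word) in data or struct.pack(">I", word) in data:
            found.append(name)
    return found
```

Entropy is most informative when computed per section rather than for the whole file, since a packed payload may occupy only part of the binary.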
ANDROID MALWARE ANALYSIS
Analysing Android malware involves studying malicious software designed to target the Android operating system. The goal is
to understand its behaviour, functionality, and potential impact. By combining static and dynamic analysis techniques, security
analysts can gain a comprehensive understanding of Android malware, enabling them to develop effective detection
mechanisms and mitigation strategies. Here are the key steps and techniques involved in Android malware analysis:
1. Static Analysis
APK File Inspection: Examine the APK (Android Package) file, which is the package format used by Android
applications.
File Hashing: Calculate hash values (MD5, SHA-1, SHA-256) to uniquely identify the APK.
Manifest File Analysis: Inspect the AndroidManifest.xml file for information about the app's components (activities,
services, broadcast receivers, content providers) and the permissions it requests.
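A minimal sketch of APK inspection and manifest analysis follows. An APK is a ZIP archive, so its entries can be listed with the standard zipfile module. The manifest inside an APK is stored in a binary XML format, so this example assumes it has already been decoded to plain-text XML by a tool such as apktool. The SENSITIVE set is a hypothetical watch-list chosen for illustration, not an official classification.

```python
import xml.etree.ElementTree as ET
import zipfile

ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

# Hypothetical watch-list of permissions an analyst may want to flag.
SENSITIVE = {
    "android.permission.READ_SMS",
    "android.permission.SEND_SMS",
    "android.permission.RECORD_AUDIO",
    "android.permission.READ_CONTACTS",
}

def list_apk_entries(apk_path):
    """List the files packed inside the APK (a ZIP archive)."""
    with zipfile.ZipFile(apk_path) as apk:
        return apk.namelist()

def manifest_summary(decoded_manifest_xml):
    """Summarise package name, permissions and components of a decoded manifest."""
    root = ET.fromstring(decoded_manifest_xml)
    perms = [e.get(ANDROID_NS + "name") for e in root.iter("uses-permission")]
    components = {
        tag: [e.get(ANDROID_NS + "name") for e in root.iter(tag)]
        for tag in ("activity", "service", "receiver", "provider")
    }
    return {
        "package": root.get("package"),
        "permissions": perms,
        "sensitive": sorted(set(perms) & SENSITIVE),
        "components": components,
    }
```

A requested permission is not evidence of malice on its own; what matters is whether the permissions fit the app's stated purpose.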
2. Dynamic Analysis
Emulator or Device Execution: Run the APK in a controlled environment using an emulator or a dedicated
Android device.
Behavioural Analysis: Observe the app's behaviour during execution, such as network communication, file system
changes, and interactions with the device.
3. Timeline Analysis
Execution Timeline: Create a timeline of the app's execution to understand the sequence of events and potential
malicious activities over time.
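An execution timeline can be built by merging the events recorded by different monitors and sorting them by time. The sketch below assumes a hypothetical event format: each event is a dictionary with an ISO 8601 "time" string, a "source" name, and a free-text "detail". Real tools will use their own formats.

```python
from datetime import datetime

def build_timeline(*event_sources):
    """Merge event lists from several monitors into one list ordered by time."""
    merged = [event for source in event_sources for event in source]
    return sorted(merged,
                  key=lambda event: datetime.fromisoformat(event["time"]))

def format_timeline(timeline):
    """Render each event as one line: time, source in brackets, detail."""
    return [f'{e["time"]}  [{e["source"]}]  {e["detail"]}' for e in timeline]
```

Reading the merged list from top to bottom shows, for example, whether a file was written before or after a network connection was opened.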
4. Network Analysis
Packet Capture: Use tools like Wireshark to capture and analyse network traffic generated by the app. This helps identify
communication with command and control servers, data exfiltration, or other malicious activities.
HTTP/HTTPS Analysis: Examine HTTP/HTTPS traffic for potential malicious content or communication patterns.
File Operations: Monitor changes made to the file system, including file creations, modifications, and deletions.
Sensitive File Access: Identify whether the app accesses sensitive files or directories on the device.
URL Analysis: Investigate URLs hardcoded in the app for signs of malicious content or phishing.
Dynamic Payload Extraction: Analyse the app for the presence of dynamic payloads or malicious scripts.
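The URL analysis step can be sketched as follows: pull URLs out of the strings recovered from the app and flag a few properties that often deserve a second look. The three checks (plain HTTP, a raw IP address as host, a non-standard port) are illustrative heuristics chosen for this example; none of them proves that a URL is malicious.

```python
import ipaddress
import re
from urllib.parse import urlsplit

URL_RE = re.compile(r"https?://[^\s\"'<>]+")

def extract_urls(text):
    """Find http and https URLs in a block of text."""
    return URL_RE.findall(text)

def url_flags(url):
    """Return a list of heuristic warning flags for one URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    flags = []
    if parts.scheme == "http":
        flags.append("unencrypted-http")
    try:
        ipaddress.ip_address(host)
        flags.append("ip-literal-host")
    except ValueError:
        pass  # host is a name, not an IP address
    try:
        port = parts.port
    except ValueError:
        port = None
        flags.append("malformed-port")
    if port not in (None, 80, 443):
        flags.append("non-standard-port")
    return flags
```

Flagged URLs can then be checked against threat intelligence sources or visited from an isolated analysis machine.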
7. Anti-Analysis Techniques
Rooting Detection: Check for mechanisms employed by the app to detect rooted devices.
Emulator Detection: Identify anti-emulator techniques used to thwart dynamic analysis.
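One simple way to spot these checks during static analysis is to search the strings extracted from the app for values that root and emulator checks commonly reference. The indicator lists below are a small, non-exhaustive selection assumed for illustration; a match only suggests where to look in the decompiled code.

```python
# Strings often referenced by root checks (illustrative, not exhaustive).
ROOT_INDICATORS = ("/system/xbin/su", "/system/bin/su",
                   "Superuser.apk", "test-keys")
# Strings often referenced by emulator checks (illustrative, not exhaustive).
EMULATOR_INDICATORS = ("ro.kernel.qemu", "goldfish", "ranchu", "google_sdk")

def find_anti_analysis_indicators(strings):
    """Group the input strings by the kind of check they hint at."""
    hits = {"root_detection": [], "emulator_detection": []}
    for s in strings:
        if any(needle in s for needle in ROOT_INDICATORS):
            hits["root_detection"].append(s)
        if any(needle in s for needle in EMULATOR_INDICATORS):
            hits["emulator_detection"].append(s)
    return hits
```

Apps that obfuscate or encrypt their strings will evade this kind of search, in which case the checks have to be found through dynamic analysis or manual code review.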