Writing a simple deobfuscator for a simple C# malware variant

18 Nov 2019

Introduction

When a threat actor wants to hinder reverse engineering, a common technique is to obfuscate the malicious code. This can be done in several ways: control flow obfuscation (making the flow of the program confusing, for example with random jumps), string obfuscation (keeping text out of plain sight by encrypting or encoding it), junk code (pointless code that does nothing and exists only to confuse analysts), and object renaming (renaming objects from their original names, e.g. MainService may become OQuXiqmXq throughout the code). To get around this, you can create your own tooling to deobfuscate binaries and strip these protections automatically. This is easier with managed code, which is typically much more comfortable to manipulate: for .NET we’ve got dnlib, and for Java we have ASM.

The abovementioned libraries make it much easier to manipulate and change executables. In this post we’ll write a simple .NET deobfuscator using dnlib against a prolific RAT (“Remote Administration Tool”). It can easily be extended and changed to suit the challenges you face with a specific managed executable. I have mimicked the protections which we’d see in this RAT for the purpose of this writeup.

By the end of this, we’ll have a working deobfuscator which will strip some of the protections that the attacker has applied to the .NET assembly.

The steps we’ll take

Everyone loves diagrams, right? This is a basic breakdown of the process we’ll take to achieve deobfuscation.

Initial static code analysis

Taking a look at the executable, it employs an extremely basic class and method obfuscation technique: a randomly generated string (AfmAcgnNGYtN9H in this sample) is prepended to all of the declarations. The strings are obfuscated strangely: each literal has a ‘|’ character followed by that same string appended to it, and is split on ‘|’ at runtime to recover the real value. We’ll have to do some instruction patching to get rid of this and split the strings ourselves. You can download the executable from here that we’re going to use as an example - please note, it’s just full of useless code.

Using dnSpy, we can easily open the binary and recover the source code from the .NET assembly. If you’ve not already got dnSpy, you can get it from here. Opening it in dnSpy, we see the following structure:

As previously mentioned, I’ve created a scenario where the module name is prepended to declarations as a form of obfuscation. Looking deeper at one of these classes, we can see the other obfuscation techniques that were mentioned:

We can see that all of the methods within the above class have the string prepended to them, and that the strings within the binary carry the | separator followed by the same string.

Writing the deobfuscator

You’ll need to install dnlib. The easiest way to achieve this is using NuGet, which is built into Visual Studio (Projects -> Add NuGet Package -> Browse). We’ll create a Console Application, using C# as our language of choice, and build a basic skeleton in which a user can pass in an input assembly and an output path.

Our basic process is going to be:

Let’s start by creating a new C# console project in Visual Studio, importing the namespaces we need, and making a basic boilerplate for passing in two arguments. The first argument is the executable we want to deobfuscate, and the second is the path we want to output to.

using System;
using dnlib.DotNet;
using dnlib.DotNet.Emit;
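
The two-argument boilerplate described above could be sketched roughly as follows. The class name Program, the helper TryGetPaths and the usage text are placeholders of mine rather than the original tool's code, and the dnlib usings are left out so the snippet compiles on its own.

```csharp
using System;
using System.IO;

internal static class Program
{
    // Hypothetical helper: returns true when exactly two arguments were
    // supplied and the input file exists on disk.
    internal static bool TryGetPaths(string[] args, out string fullPath, out string outPath)
    {
        fullPath = null;
        outPath = null;

        if (args == null || args.Length != 2)
        {
            return false;
        }

        fullPath = Path.GetFullPath(args[0]); // assembly to deobfuscate
        outPath = Path.GetFullPath(args[1]);  // where the cleaned copy is written
        return File.Exists(fullPath);
    }

    private static int Main(string[] args)
    {
        if (!TryGetPaths(args, out string fullPath, out string outPath))
        {
            Console.Error.WriteLine("Usage: Deobfuscator <input assembly> <output path>");
            return 1;
        }

        Console.WriteLine("Reading " + fullPath + ", writing " + outPath);
        // The loading, renaming and patching steps from the rest of the post go here.
        return 0;
    }
}
```

Returning a non-zero exit code on bad input is a design choice that makes the tool easier to script against a folder of samples.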

Next, let’s get to using dnlib. We’ll pass the path of the executable to dnlib and attempt to load it as a module, where fullPath is the variable holding that path. As seen in the initial static analysis stage, the string AfmAcgnNGYtN9H is the module name, so we’ll grab the assembly name at the same time.

var module = ModuleDefMD.Load(fullPath);
if (module == null)
{
    return;
}

string moduleName = module.Assembly.Name;

We can then call GetTypes to get a collection of all of the types that exist in the module (classes, including nested ones). From each type we can reach its methods, and from each method its body. As we saw in the static analysis of the executable, all of the classes and methods had the unique string prepended to them, and the string obfuscation was present within some of these methods. We’ll start by simply renaming all of the types and their methods back to their original, human-readable names.

What’s beautiful about dnlib is that it’ll automatically rename the method across the entire assembly, so we don’t need to cross-reference calls to it and manually change everything.

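foreach (var type in module.GetTypes())
{
    if (type.Name.String.Contains(moduleName)) // does our type name contain the bad string?
    {
        type.Name = type.Name.String.Replace(moduleName, string.Empty); // replace with nothing
    }
    foreach (var method in type.Methods)
    {
        if (method.Name.String.Contains(moduleName)) // same check for each method name
        {
            method.Name = method.Name.String.Replace(moduleName, string.Empty);
        }
    }
    ...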

After this, all of the methods will now be renamed like so:

AfmAcgnNGYtN9HGetCount -> GetCount
AfmAcgnNGYtN9HBadServer -> BadServer
..and so on
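
The renaming rule itself is plain string replacement, so it can be checked in isolation before touching an assembly. A minimal sketch follows, where NameCleaner and StripMarker are illustrative names of mine and not dnlib APIs:

```csharp
using System;

internal static class NameCleaner
{
    // Removes every occurrence of the marker from a name.
    // Names that do not contain the marker are returned unchanged.
    internal static string StripMarker(string name, string marker)
    {
        if (string.IsNullOrEmpty(name) || string.IsNullOrEmpty(marker))
        {
            return name;
        }

        string cleaned = name.Replace(marker, string.Empty);

        // Never hand back an empty identifier; keep the original instead.
        return cleaned.Length == 0 ? name : cleaned;
    }
}
```

With the marker from this sample, NameCleaner.StripMarker("AfmAcgnNGYtN9HBadServer", "AfmAcgnNGYtN9H") yields BadServer, while a name such as .ctor passes through untouched.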

That part was relatively easy, right? Let’s move on to removing the string obfuscation contained within the binary. As mentioned above, every obfuscated string looks like this:

string realString = "Hello world!|AfmAcgnNGYtN9H".Split('|')[0];

We’ll need to edit the assembly manually, finding the Common Intermediate Language (“CIL”) instructions that are responsible and removing them. When a string is loaded, the ldstr opcode is used, with its first and only operand being the string we want to load. We can look at this further in dnSpy by selecting Edit IL Instructions within the method body. For example, this function here:

private static string AfmAcgnNGYtN9HMalicious()
{
	return "This is a bad string|AfmAcgnNGYtN9H".Split(new char[]
	{
		'|'
	})[0];
}

Will look like this in CIL operations when we disassemble it:

We’ll want to keep the original string, which is This is a bad string. We’ll achieve this by splitting the string ourselves and NOP’ing out the string-splitting instructions that follow. When we NOP something (“no operation”), we’re effectively telling the Common Language Runtime (“CLR”) to do nothing for that instruction. For a complete list of the implemented instructions, you can find them here.
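
Splitting the string ourselves can be tried out on plain strings first. The sketch below is a variation on the approach used later in this post: it checks for the "|" + marker suffix rather than splitting on it, then keeps the text before the first '|', which is what Split('|')[0] returns at runtime. StringCleaner and TryRecover are hypothetical names.

```csharp
using System;

internal static class StringCleaner
{
    // Returns true and sets 'original' when 'literal' ends with "|" + marker.
    internal static bool TryRecover(string literal, string marker, out string original)
    {
        original = null;
        string suffix = "|" + marker;

        if (string.IsNullOrEmpty(literal) || !literal.EndsWith(suffix, StringComparison.Ordinal))
        {
            return false; // not one of the obfuscated literals
        }

        // Mirror Split('|')[0]: keep everything before the first '|'.
        original = literal.Substring(0, literal.IndexOf('|'));
        return true;
    }
}
```

For the method shown earlier, TryRecover("This is a bad string|AfmAcgnNGYtN9H", "AfmAcgnNGYtN9H", out var s) succeeds and leaves This is a bad string in s. Literals without the suffix are reported as not obfuscated and left alone.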

We’ll want to NOP the 9 instructions that follow the ldstr instruction when we find it in the body of a method, to produce something like this:

Which, in actual code terms, produces this:

private static string AfmAcgnNGYtN9HMalicious()
{
	return "This is a bad string";
}

So, going back to writing the deobfuscator - we’ll keep the loop where we’re going through each type, but add a new loop within it. In it, we’ll go through each method and check if it has a body (code), then iterate over each of the instructions within it until we find an ldstr instruction. To verify the string has been obfuscated, we’ll then check whether it ends with the separator and the predefined string (AfmAcgnNGYtN9H), and if so remove the splitting logic.

...
foreach (var method in type.Methods)
{
    if (!method.HasBody) // does the method contain code?
    {
        continue;
    }

    var instructions = method.Body.Instructions;
    for (int i = 0; i < instructions.Count; i++)
    {
        if (instructions[i].OpCode != OpCodes.Ldstr)
        {
            continue;
        }

        string originalStr = instructions[i].Operand as string;
        if (string.IsNullOrEmpty(originalStr))
        {
            continue; // for some reason, we can't recover the string. Let's keep going.
        }

        // split by the garbage that's added
        string[] parts = originalStr.Split(new[] { "|AfmAcgnNGYtN9H" }, StringSplitOptions.None);

        // check the garbage is there, and that there are 9 more instructions to patch
        if (parts.Length != 2 || i + 9 >= instructions.Count)
        {
            continue;
        }

        // ok, we've found a bad string, lets swap the operand for our recovered string
        instructions[i].Operand = parts[0];

        // now, lets turn the next 9 instructions (the splitting logic) into NOPs
        for (int j = 1; j <= 9; j++)
        {
            instructions[i + j].OpCode = OpCodes.Nop;
            instructions[i + j].Operand = null;
        }

        // skip over the instructions we just patched
        i += 9;
    }
}
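
The loop above trusts that the nine instructions after a matching ldstr are the splitting logic. A more defensive variant could confirm that shape before patching anything. The sketch below assumes the compiler emitted the usual sequence for Split(new char[] { '|' })[0] (array creation, element store, the Split call, then indexing). That sequence is my assumption about this sample, and SplitPattern and FollowsLdstr are hypothetical names.

```csharp
using System.Collections.Generic;
using dnlib.DotNet;
using dnlib.DotNet.Emit;

internal static class SplitPattern
{
    // True when ins[i] is ldstr and the 9 instructions after it look like:
    //   new char[] { '|' }  ->  Split(...)  ->  [0]
    internal static bool FollowsLdstr(IList<Instruction> ins, int i)
    {
        if (i < 0 || i + 9 >= ins.Count || ins[i].OpCode != OpCodes.Ldstr)
        {
            return false;
        }

        return ins[i + 1].IsLdcI4()                    // array length
            && ins[i + 2].OpCode == OpCodes.Newarr     // new char[...]
            && ins[i + 3].OpCode == OpCodes.Dup
            && ins[i + 4].IsLdcI4()                    // element index
            && ins[i + 5].IsLdcI4()
            && ins[i + 5].GetLdcI4Value() == '|'       // the separator character
            && ins[i + 6].OpCode == OpCodes.Stelem_I2  // store the char into the array
            && IsSplitCall(ins[i + 7])
            && ins[i + 8].IsLdcI4()                    // index into the result
            && ins[i + 9].OpCode == OpCodes.Ldelem_Ref;
    }

    private static bool IsSplitCall(Instruction instr)
    {
        if (instr.OpCode != OpCodes.Call && instr.OpCode != OpCodes.Callvirt)
        {
            return false;
        }

        var target = instr.Operand as IMethod;
        if (target == null)
        {
            return false;
        }

        string name = target.Name; // UTF8String converts implicitly to string
        return name == "Split";
    }
}
```

Calling SplitPattern.FollowsLdstr(method.Body.Instructions, i) before the NOP loop would skip any ldstr whose trailing instructions do not match, at the cost of missing variants that a different compiler emits differently.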

Finally, we’ll want to write the modified assembly back to an executable (where outPath is our path we want to write to). This will save all of our changes we’ve made.

module.Write(outPath);

We now have a deobfuscated binary. You can easily do much, much more with dnlib - I thought I’d create this for those wondering where to start.

Conclusion

The library we’re using, dnlib, is extremely powerful and easy to use. Although de4dot is extensive in its deobfuscation efforts, you may wish to take deobfuscation further with your own tooling. Anyway, I hope you learnt something by reading this. For now, ciao!

If you’ve got any questions, feel free to email me: [email protected]

obfuscation malware re dnlib analysis