KringleCon 2: Recovering the clear text document
KringleCon 2: Recovering the clear text document
Overview
The elfscrow program used a predictable seed(time) along with a predictable PRNG. Given that time was the seed and that we were supplied with a time window, it was determined that there were 7200 possible keys. After bruteforcing the key the flag was found to be “Machine Learning Sleigh Route Finder”
Challenge text
“The Elfscrow Crypto tool is a vital asset used at Elf University for encrypting SUPER SECRET documents. We can’t send you the source, but we do have debug symbols that you can use.
Recover the plaintext content for this encrypted document. We know that it was encrypted on December 6, 2019, between 7pm and 9pm UTC.
What is the middle line on the cover page? (Hint: it’s five words)”
Initial thoughts
The challenge gives us a binary, a pdb file, an encrypted document, and a ruff time estimate of when the document was encrypted.
My first impression was a glimmer of hope that the uuid would be time based, but it was uuid 4 so it was completely random. With that path exhausted fairly quickly I moved on to gathering more information.
I ran the program with no arguments and saw that there was a flag called --insecure
. The description for that flag was that it would send the traffic over http instead of https. I opened up wireshark and captured some packets while
running the program.
Networking in the binary
While reverse engineering the application I first looked for xrefs to HttpSendRequestA
because while running the program I saw it print out url endpoints.
There were two functions that called HttpSendRequestA
.
This helped me narrow down what functions were dealing with the actual encryption keys (Note I could have just as easily looked for xrefs to the encryption functions,
but I really wanted to understand how the binary works).
The two functions each had different api endpoints that they called out to. I decided to look at the one calling out to /api/register
(I will refer to the function as storeKey from here on out). Inside of storeKey a few notable
things can be seen
- The user agent ““ElfScrow V1.01 (SantaBrowse Compatible)”
- The server sends back the uuid key after making a POST request with the key. It then prints a message telling the user to hold onto it for decryption later.
- This is the most important part, the key is passed as a parameter to
storeKey
, which means that whatever function that calls it will have access to the key :)
For the rest of this section I am going to cover the other function, note it does not relate directly to the solution, but I think its fun to get a better understanding of how the binary works.
The second function that deals with networking I am going to call checkKeyServer
. checkKeyServer
calls out to the api endpoint /api/retrieve
. This function is basically the inverse of
storeKey
.checkKeyServer
takes the uuid that storeKey
gets back from the server and queries the server for the associated encryption key. It then reads that key into a buffer which is later used for decryption.
Cryptography
Identifying the algorithm and the MS crept functions
But how does that networking information lead us to the crypto part? The secret lies into the xrefs from storeKey
. Store key is called by a high level function I am going to call encrypt
. When first looking at encrypt
I noticed the following interesting function calls along with the Microsoft documentation links for them
CryptAcquireContext
- https://docs.microsoft.com/en-us/windows/win32/api/wincrypt/nf-wincrypt-cryptacquirecontextaCryptImportKey
- https://docs.microsoft.com/en-us/windows/win32/api/wincrypt/nf-wincrypt-cryptimportkeyCryptEncrypt
- https://docs.microsoft.com/en-us/windows/win32/api/wincrypt/nf-wincrypt-cryptencrypt
These are all functions from advapi.dll
and more specifically they are functions from the old windows cryptography interface.
For those who ignored the msdn docs here is some pseudo code of how to use the functions
context Context
HCRYPTKEY impKey
CryptAcquireContext(&Context)
key = 'HopeFully-A-Properly-Generated-Key'
CryptImportKey(Context, key, impKey)
CryptEncrypt(impKey, stuffToEncrypt)
This pseudo code oversimplifies how the functions actually work to make the process of encryption easier to understand. First a context is created, then a key is ‘securely generated’, next the aforementioned key is imported into the context and loads a HCRYPTKEY struct and finally we encrypt the data. I RECOMMEND READING THE MICROSOFT DOCS FOR A BETTER UNDERSTANDING OF THESE FUNCTIONS!!!!, however that is the simplified version of what is occurring.
With this knowledge we need to find two thing still
- The encryption algorithm and the mode of operation
- The keygen
Luckily for us Santa’s elves in the software development branch on the north pole left a lot of handy debug messages in the code for example the helpful error message
CryptImportKey failed for DES-CBC key
Now we now the encryption algorithm and the mode in which it operates. It can also be seen that no initialization vector (IV) is set, which is bad practice.
Now on to reversing the key gen algorithm.
Keygen
The next step is figuring out how the keygen algorithm is implemented. From here on out I am going to call the top level keygen function keyGen
.
Looking at the ghidra decompile three (non standard) functions, which I renamed genSeed
, genBytes
and setSeed
, are called.
The two functions of particular interest are genSeed
and genBytes
.
Looking at the looking at the rough decompile of genSeed
it can be seen that the function _time64
is called.
According to the Microsoft documentation for _time64
“(time64) Returns the time as seconds elapsed since midnight, January 1, 1970”
This means the seed is the number of second since epoch. Keep this in mind because this is the first crucial mistake that the elves made while developing this application.
Next comes looking into genBytes
. The most effective way to reverse this function is to look at the disassembly instead of the decompile.
Here is a translation of the decompile to pseudo code
seed = seed * 0x343fd + 0x269ec3
seed = SHIFT_RIGHT_BY_0x10(seed)
seed = AND_BY_0x7FFF(seed)
return seed
This is a predictable “random” number generator (Also called an PRNG which is short for pseudo random number generator). Before we implement it, there is one more detail that is needed from the disassembly of keyGen
.
Looking at the disassembly of keyGen
the line that matters the most is AND ECX, 0xff
put simply this assembly takes the return of genBytes
and converts it into a single byte.
That byte is then append to the key. This is the second issue that was made that allows for this challenge to be solved, a predictable RNG algorithm was used.
With these two bits of information we can move onto decryption the document.
Exploiting the flaw
Okay after reversing the crypto we now know that the program uses a predictable PRNG and a non random seed, but you may be wondering how this could lead to use being able to decrypt the document. The answer to that is that if you recall we are given a time frame in which the document was encrypted, specifically the time frame was December 6, 2019, between 7pm and 9pm UTC, these values can be converted to seconds since the epoch and since we know that it is a two hour time frame that means that there are 7200 possible keys to allow for the document to be decrypted. Cryptographically speaking that is a very low number of possibilities for the key. Okay so we know the flaw now lets look at the solution code.
This is a simple python script I wrote to go through and test every possible key (bruteforce), it then stores the decryption results into
an output directory outdir/
. Run this script and then go brew yourself some tea depending on how fast your computer is.
You will be left with a directory of 7202 files which seems useless at first, but with the help of a simple bash oneliner we can find the intended output.
file output/* | grep 'PDF'
That oneliner will return a single pdf. After opening the pdf file in a viewer I confirmed that it was the correct result.
The flag was “Machine Learning Sleigh Route Finder”
How to prevent something like this
The easiest way to prevent an attack like this would be to use a “secure” PRNG with a non predictable seed.
For win32 use a function like BCryptGenRandom
, if you are on a different platform make sure to research a safe PRNG.
The overkill answer is to use a hardware based RNG, but these can become expensive and most consumers don’t have one.
Final notes
Summary
Elfscrow used contained a poorly implemented keygen algorithm that allowed for a possible bruteforce. To avoid this remember that it is not safe to role you own keygen and that time should
My experience
This was a fun challenge. I love reverse engineering things and this challenge gave me the change to play around with the win32 crypto stack. SANS did an excellent job with this challenge!