Stripping symbols from a Mach-O

A common mistake many developers do is to leave names of local symbols inside applications built on OS X. Using the strip utility combined with the compiler visibility flags is, unfortunately, not enough.

So I wrote a small script for Profiler to be run from the command line and I integrated it in the build process.

The syntax to execute the script is:

Here’s the whole code:

The code zeroes names of symbols which are not associated to any section in the binary.

Easy. 🙂

Profiler 2.1 (Mac OS X support)

The new version of Profiler is out and it includes support for Mac OS X!

Here’s the change-list for the current version:

– ported Cerbero Profiler to MacOSX
– added command-line scripting support
– updated Header Manager to Clang 3.5
– improved parsing of PE Delay Import Descriptors
– fixed some small issues

Enjoy!

Command-line scripting

The upcoming 2.1 version of Profiler adds support for command-line scripting. This is extremely useful as it enables users to create small (or big) utilities using the SDK and also to integrate those utilities in their existing tool-chain.

The syntax to run a script from command-line is the following:

If we want to run a specific function inside a script, the syntax is:

This calls the function ‘bar’ inside the script ‘foo.py’.

Everything following the script/function is passed on as argument to the script/function itself.

When no function is specified, the arguments can be retrieved from sys.argv.

For the command-line above the output would be:

When a function is specified, the number of arguments are passed on to the function directly:

If you actually try one of these examples, you’ll notice that Profiler will open its main window, focus the output view and you’ll see the output of the Python print functions in there. The reason for this behaviour is that the command-line support also allows us to instrument the UI from command-line. If we want to have a real console behaviour, we must also specify the ‘-c’ parameter. Remember we must specify it before the ‘-r’ one as otherwise it will be consumed as an argument for the script.

However, on Windows you won’t get any output in the console. The reason for that is that on Windows applications can be either console or graphical ones. The type is specified statically in the PE format and the system acts accordingly. There are some dirty tricks which allow to attach to the parent’s console or to allocate a new one, but neither of those are free of glitches. We might come up with a solution for this in the future, but as for now if you need output in console mode on Windows, you’ll have to use another way (e.g. write to file). Of course, you can also use a message box.

But a message box in a console application usually defies the purpose of a console application. Again, this issue affects only Windows.

So, let’s write a small sample utility. Let’s say we want to print out all the import descriptor module names in a PE. Here’s the code:

Running the code above with the following command line:

Produces the following output:

Of course, this is not a very useful utility. But it’s just an example. It’s also possible to create utilities which modify files. There are countless utilities which can be easily written.

Another important part of the command-line support is the capability to register logic providers on the fly. Which means we can force a custom scan logic from the command-line.

This script scans a single file given to it as argument. All callbacks, aside from init, are optional (they default to None).

That’s all! Expect the new release soon along with an additional supported platform. 🙂

Profiler 2.0

The new version 2.0 is out! The most important news is that we have a new online store, which allows orders from individuals and not only from organizations. If you’re not yet one of our customers, make sure to test out our trial. 🙂

We now offer 2 type of licenses Home/Academic and Commercial. Also, the price of the commercial license has been reduced. The reason for this is that we stripped active support from the license cost (we now offer support only in the advanced version). After 2 years on the road we had very few support requests and so it made sense to make licenses cheaper by removing the costs of support.

We’re currently finishing to port Profiler to Linux and OSX, so these platforms will be available soon. The current change-list reflects the changes in licensing and our cross-platform effort:

– switched to Visual Studio 2013
– updated Qt to 5.2.1
– updated Python to 3.4
– updated SQLite3
– updated OpenSSL
– switched from the XED2 to the Capstone disasm engine
– added disasm filters for ARM, Thumb, ARM64, MIPS, PowerPC and 8086
– implemented some custom view notifications in Python
– added UI controls to custom views
– made layouts available in the context of the main window
– improved Python SDK
– fixed many small issues

We switched to Qt 5, so PySide is unfortunately no longer supported. On the other hand the SDK now allows to build complex UIs. In fact, we fixed lots of minor issues in the SDK. The reason for that is that in the last months we had to offer a lot of SDK support and so we had to postpone many new features. The upside is that our SDK has become much more robust.

We also changed our licensing schema, which is no longer year-based, but version based. To compensate current customers for the lack of updates in the last months, we have renewed for free their license for the 2.x series. If you’re an active customer and you haven’t received your new license, please contact us!

Analysis of CVE-2013-3906 (TIFF)

This is just a demonstration of malware analysis with Profiler, I haven’t looked into previous literature on the topic. So, perhaps there’s nothing new here, but I hope it will be of help for our users.

We open the main DOCX file. The first embedded file we analyze is the TIFF image, which stands out because it’s the only image.

TIFF directories

Among the directories we have two which specify an embedded JPEG. So we just select the data area according to the offset and length value and load it as an embedded JPEG (it’s a good idea to do this automatically in the future: bear with us, TIFF support has been just introduced).

JPEG data

JPEG embedded

Now we can inspect the embedded JPEG.

JPEG meta

The JPEG looks strange. Even by looking at the format fields, it looks malformed. Given certain anomalies we can suppose that the TIFF might be used as some sort of vector for something. Let’s keep that in mind and let’s take a look at other files. Two of them stand out: they have a .bin extension and are CFBFs (same format as DOC files).

CFBF Foreign

We immediately notice various problems. Foreign data is abundant and the file looks malformed. More alarmingly lots of shellcode warnings are reported. Let’s look at a random one (they are all the same basically).

CFBF Shellcode

Let’s start with the analysis of this shellcode.

The start of the shellcode is just a decryption loop which xors every byte following the last call with 0xEE. So, that’s exactly what we’re going to do. We select 0x27E bytes after the last call, then we press Ctrl+T to open the filters view.

Decrypt shellcode

We use two filters: ‘misc/basic‘ to xor the bytes and ‘disasm/x86‘ to disassemble them.

At this point it’s useful to load the shellcode into a debugger. So let’s select the decrypted shellcode and press Ctrl+R and activate the ‘Shellcode to executable‘ action:

Shellcode to executable

The options dialog will pop up.

Shellcode to executable options

After pressing OK, the debugger will be executed. Depending on your debugger, put a break-point on the beginning of the shellcode and run the code.

The start of the shellcode resolves API addresses:

This code basically retrieves the base of ‘kernel32.dll‘ and then resolves API address by calling 0x120d for 0x12 times (which is the amount of APIs to be resolved).

This is the function which resolves API names hashes to addresses:

The API hashes are stored at the end of the shellcode. Here are the hashes and the resolved APIs:

We can actually close the debugger now. Once the API names are known, it’s easy to continue the analysis statically.

The next step in the shellcode is to suspend all other threads in the current process:

It then looks for the handle in the current process for the main DOCX file:

It reads the entire file (minus the first 0x24 bytes) into memory and looks for a certain signature:

It creates a new file in the temp directory:

Now comes the juicy part. It reads few parameters after the matched signature, including a size parameter, and uses these parameters to decrypt a region of data:

Let’s select in the hex view the same region (we find the start address by looking for the signature just as the shellcode does).

Encrypted payload

Then we press Ctrl+E to add an embedded file and we click on filters. Since the decryption routine is too complex to be expressed through a simple filter, we have finally a good reason to use a Lua filter, in particular the ‘lua/custom‘ one.

Lua custom filter

As you can see, a sample stub is already provided when clicking on the script value in the options. We have to modify it only slightly for our purposes.

By clicking on preview, we can see that the file start with a typical MZ header. Thus, we can specify a PE file when loading the payload.

PE payload

By inspecing the PE file we can see that it’s a DLL among other things. Now, before analyzing the DLL, let’s finish the shellcode analysis.

The payload is now decrypted. So the shellcode just writes it to the temporary file, loads the DLL, unloads it and then terminates the current thread.

That’s it. Now we can go back to the payload dll.

One of the resources embedded in the PE file is a DOCX.

Payload DOCX

Probably a sane document to re-open in a reader to avoid making the user suspicious of a document which didn’t open.

There’s another suspicious embedded resource, but before making hypotheisis about it, let’s do a complete analysis of the DLL, (don’t worry: it’s very small). For this purpose I used IDA Pro and the decompiler.

It starts with the DllMain:

The called function:

It basically retrieves both embedded resources and then performs some stuff with both of them. First we take a look at what it does with the DOCX resource:

As hypothized previously, it just launches a new instance of the current process with the sane DOCX this time.

Now let’s take a look at what it does with the resource we haven’t yet identified:

It dumps it to file (with a vbe extension) and then executes it with ‘cscript.exe‘. I actually didn’t know about VBE files: they’re just encoded VBS files. To decode it I used an online tool by GreyMagic.

VBE resource

The VBS code is quite easy to read and quite boring. The only interesting part in my opinion is the update mechanism.

It basically looks for certain comments in two YouTube pages with a regex. The url captured by the regex is then used to perform the update.

It sends back to the URL various information about the current machine:

As a final anecdote: the main part of the script just contains an enormous byte array which is dumped to file. The file is a PNG with a RAR file appended to it which in turn contains a VBE encoding executable. The script itself doesn’t seem to make use of this, so no idea why it’s there.

Appended RAR

That’s all. While it may seem a lot of work, writing the article took much longer than performing the actual analysis (~30 minutes, including screenshots).

We hope it may be of help or interest to someone.

News for version 1.1

And here it is.

added libmagic to the SDK
added preliminary ELF support
added TIFF support
– added GZ, BZ2 and LZMA file support
– exposed internal API for files and paths to Python
– hooks are now triggered even when loading embedded objects in the workspace
added magic info extension script
– exposed more DEX methods to Python
– remember manually enabled extensions
– capability to add individual files in the scan page
– some bug fixes

Enjoy.

TIFF Support

The upcoming 1.1 version of Profiler includes support for TIFF image files. This addition completes the list of most popular supported image formats (JPEG, PNG/APNG, GIF, BMP/DIB, TIFF). The support was actually already partially there, because the Exif format (which is just an embedded TIFF) inside of JPEG files was already supported since the first public version of Profiler, but somehow the actual TIFF support hadn’t yet been added.

Multi-page TIFF

In a few days the new version will be released.

libmagic Support

While Profiler offers an API to identify file formats, it does so only for those which are supported. The list of supported files is vast, but there will be always unrecognized formats.

It’s certainly a good idea to introduce a file signature identification API. This API might be useful for several purposes, not all foreseeable right away. That’s why the upcoming version introduces support for libmagic (it comes with the latest 5.11 version). The library is exposed to Python in the ‘Pro.magic’ module. Here are the functions:

Just as a note: magic_file just calls magic_buffer internally.

Let’s create a small hook to demonstrate the use of the library, although it’s quite intuitive. Here’s the cfg entry:

The Python code:

The addMetaDataString in ScanProvider adds a string in the individual file report, which is visible from the file stats page in the workspace.

So if we open a file in the workspace, we’ll get the following extra information:

libmagic

The script above will be included in the update.