News for version 0.9.6

The new 0.9.6 version of the Profiler is out. The main new feature is support for Mach-O files. Since this feature stands on its own, it made sense to postpone other features to the next version and in the meantime let our users benefit from this addition.

Here’s the changelist:

– added support for Mach-O files
– added support for fat/universal binaries
– added support for Apple code signatures
– exposed DemangleSymbolName to Python

The DemangleSymbolName function demangles both VC++ and GCC symbols. Its use is straightforward:

from Pro.Core import DemangleSymbolName
demangled = DemangleSymbolName("__ZNK8OSObject14getRetainCountEv")
print(demangled)
# outputs: OSObject::getRetainCount() const

Mach-O support (including Universal Binaries and Apple Code Signatures)

This addition came about because, before undertaking the next big step in the Profiler's road map, there was some spare time to dedicate to extra features for the upcoming 0.9.6 version. Some customers had also requested Mach-O support, so we hope this satisfies their request. A few useful things could still be added to the Mach-O support, but not many.

Layout

As you can see, the first screenshot shows the Mach-O layout.

The logic of Mach-Os starts with their load commands which describe everything else:

Load commands

Segments and sections:

Segments

Entry points (LC_MAIN, LC_UNIXTHREAD):

Entry points

Symbols:

Symbols

The LC_DYLD_INFO load command can then describe some VM operations for rebasing and binding:

Rebase

Binding:

Bind

The DyldInfo export section is also represented as a tree, just as it is in the file:

Export

Function starts:

Function starts
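For readers curious about what sits behind the function starts view, here is a minimal pure-Python sketch. It does not use the Profiler SDK: `read_uleb128` and `function_starts` are hypothetical helper names, and the sample bytes and base address are made up. It relies on the LC_FUNCTION_STARTS payload being a zero-terminated list of ULEB128-encoded deltas, the first one relative to the start of the __TEXT segment.

```python
def read_uleb128(data, pos):
    """Decode one unsigned LEB128 value; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift   # low 7 bits carry the payload
        if not (byte & 0x80):              # high bit clear: last byte
            return result, pos
        shift += 7


def function_starts(data, text_base):
    """Turn a list of ULEB128 deltas into absolute function addresses."""
    addresses = []
    address = text_base
    pos = 0
    while pos < len(data):
        delta, pos = read_uleb128(data, pos)
        if delta == 0:                     # a zero delta terminates the list
            break
        address += delta
        addresses.append(address)
    return addresses


# Made-up payload: deltas 0x1000, 0x10 and 624485, then the terminator.
sample = bytes([0x80, 0x20, 0x10, 0xE5, 0x8E, 0x26, 0x00])
starts = function_starts(sample, 0x100000000)
print([hex(a) for a in starts])
```

In the SDK itself, ReadULEB128 and FunctionStartsOffsetsAndValues from the class listing further down are the methods to look at for this job.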

Of course, Mach-O support makes little sense without Fat/Universal Binary support:

Fat/Universal Binary
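The fat container itself is very simple, which the following sketch tries to illustrate without the Profiler SDK. It assumes the classic 32-bit fat header: a big-endian magic of 0xCAFEBABE, an architecture count, and one 20-byte entry per architecture. `parse_fat_header` is a hypothetical helper and the sample header is synthetic.

```python
import struct

FAT_MAGIC = 0xCAFEBABE  # also the Java class file magic, so real code needs extra checks


def parse_fat_header(data):
    """Return the list of architecture entries of a 32-bit fat header."""
    magic, count = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a fat/universal binary")
    archs = []
    for i in range(count):
        # Each fat_arch entry: cputype, cpusubtype, offset, size, align
        cputype, cpusubtype, offset, size, align = struct.unpack_from(
            ">iiIII", data, 8 + i * 20)
        archs.append({
            "cputype": cputype,
            "cpusubtype": cpusubtype,
            "offset": offset,       # file offset of the embedded Mach-O
            "size": size,           # size of the embedded Mach-O
            "align": 1 << align,    # alignment is stored as a power of two
        })
    return archs


# Synthetic header describing two slices (x86_64 and arm64 CPU types).
sample = struct.pack(">II", FAT_MAGIC, 2)
sample += struct.pack(">iiIII", 0x01000007, 3, 0x1000, 0x2000, 12)
sample += struct.pack(">iiIII", 0x0100000C, 0, 0x4000, 0x3000, 14)

archs = parse_fat_header(sample)
for arch in archs:
    print(hex(arch["cputype"]), hex(arch["offset"]), hex(arch["size"]))
```

Each entry simply points to a complete Mach-O at the given offset, which is why the embedded files can be inspected like any stand-alone Mach-O.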

While the upcoming version won’t yet support validation of Apple Code Signatures embedded in Mach-Os, it’s already possible to inspect their format and the embedded certificates.

Apple Code Signature

As usual, all the added formats have been exposed to Python as well. Here is some of the SDK class documentation, excluding the constants, which are too many to list.

class MachObject
    : CFFObject

    AddressToOffset(MaxUInt address) -> MaxUInt
    AddressToSection(MaxUInt address) -> CFFStruct
    AddressToSegment(MaxUInt address) -> CFFStruct
    BuildSymbolsValueHash(CFFStruct symtablc) -> NTHash< MaxUInt,UInt32 >
    CertificateLCs() -> NTUIntVector
    DyLibModules(CFFStruct dysymtablc) -> CFFStruct
    DySymTableLC() -> CFFStruct
    DyTableOfContents(CFFStruct dysymtablc) -> CFFStruct
    DyldDisassembleBind(NTTextStream out, MaxUInt offset, UInt32 size)
    DyldDisassembleBind(NTTextStream out, CFFStruct dyldinfo)
    DyldDisassembleLazyBind(NTTextStream out, CFFStruct dyldinfo)
    DyldDisassembleRebase(NTTextStream out, MaxUInt offset, UInt32 size)
    DyldDisassembleRebase(NTTextStream out, CFFStruct dyldinfo)
    DyldDisassembleWeakBind(NTTextStream out, CFFStruct dyldinfo)
    DyldFindExportedSymbol(CFFStruct dyldinfo, char const * symbol) -> MaxUInt
    DyldInfoLC() -> CFFStruct
    EntryPointAddress(CFFStruct lc) -> MaxUInt
    EntryPointLCs() -> NTUIntVector
    ExternalSymbolReferences(CFFStruct dysymtablc) -> CFFStruct
    FunctionStartsLC() -> CFFStruct
    FunctionStartsOffsetsAndValues(CFFStruct funcstartslc, NTVector< MaxUInt > & values) -> NTUIntVector
    GetLC(LoadCmdInfo info) -> CFFStruct
    GetLC(UInt32 index) -> CFFStruct
    GetLCCount() -> UInt32
    GetLCDescription(CFFStruct s) -> NTString
    GetLCDescription(UInt32 index) -> NTString
    GetLCInfo(UInt32 index) -> LoadCmdInfo
    GetLCInfoFromOffset(MaxUInt offset) -> LoadCmdInfo
    static GetLCName(UInt32 cmd) -> NTString
    IndirectSymbolTable(CFFStruct dysymtablc) -> CFFStruct
    IsMachO64() -> bool
    MachHeader() -> CFFStruct
    OffsetToAddress(MaxUInt offset) -> MaxUInt
    OffsetToSection(MaxUInt offset) -> CFFStruct
    OffsetToSegment(MaxUInt offset) -> CFFStruct
    ProcessLoadCommands() -> bool
    ReadSLEB128(NTBuffer b) -> Int64
    ReadSLEB128(MaxUInt offset, UInt32 & size) -> Int64
    ReadULEB128(NTBuffer b) -> UInt64
    ReadULEB128(MaxUInt offset, UInt32 & size) -> UInt64
    SectionFromOffset(UInt32 cmd, MaxUInt offset) -> CFFStruct
    SegmentSections(CFFStruct seg) -> CFFStruct
    SymTableLC() -> CFFStruct
    SymbolNList(CFFStruct symtablc) -> CFFStruct

class FatObject
    : CFFObject

    Architectures() -> CFFStruct

class AppleCodeSignatureObject
    : CFFObject

    BlobFromOffset(UInt32 offset) -> CFFStruct
    BlobIndexes(CFFStruct supblob) -> CFFStruct
    BlobName(UInt32 magic) -> NTString
    BlobName(CFFStruct blob) -> NTString
    IsSuperBlob(UInt32 magic) -> bool
    IsSuperBlob(CFFStruct blob) -> bool
    TopBlob() -> CFFStruct

Given the SDK capabilities, it’s easy to perform custom scans on Mach-Os or to create plugins.

That’s all. We hope you enjoyed it, and don’t be shy if you have feature requests or suggestions. 😉

News for version 0.9.5

We’re happy to present to you the new version of the Profiler with the following news:

– introduced Lua filters: lua/custom and lua/loop
– added optional condition to misc/basic
– added JavaScript execute action
– added JavaScript debugger
– simplified save report/project logic
– included actions among the extensions views
– improved detection of shellcodes
– introduced max file size option for shellcode detection
– improved OLE Streams parsing and extraction from RTFs
– exposed getHash method in ScanProvider to Python
– added text replace functionality to text controls

While most of the items in the list have been discussed in previous posts, some of them need a brief introduction.

Max file size for shellcode detection

While shellcode detection applies by default to files of any size, you might want to specify a threshold.

Shellcodes scan options

This is useful if you want to speed up the analysis of large files.

The ‘getHash’ method

This method should be used by hooks to retrieve a hash for the currently scanned file. The syntax is very simple:

sp.getHash("md5")

Of course one could use a filter to hash the file, but the advantage of this method is that once a particular hash type has been computed it won’t be computed again if requested by another hook.
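The caching behaviour described above can be pictured with a small stand-alone sketch. This is not the Profiler's implementation: `HashCache` and `get_hash` are hypothetical names, the data is held in memory, and returning raw digest bytes is an assumption. It only shows why a second hook asking for the same hash type costs nothing.

```python
import hashlib


class HashCache:
    """Compute each hash type at most once for a given blob of data."""

    def __init__(self, data):
        self._data = data
        self._cache = {}
        self.computations = 0   # counts the real hash computations

    def get_hash(self, name):
        if name not in self._cache:
            self._cache[name] = hashlib.new(name, self._data).digest()
            self.computations += 1
        return self._cache[name]


cache = HashCache(b"hello")
first = cache.get_hash("md5")    # computed here
second = cache.get_hash("md5")   # served from the cache
print(first.hex(), cache.computations)
```

A different hash type, such as "sha256", would be computed on its first request and cached in the same way.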

Improved OLE Streams parsing and extraction from RTFs

In one of the previous use cases we’ve analyzed a huge set of malicious RTF documents. Some of them were not recognized correctly and some of them showed problems in the automatic extraction of OLE streams. This release fixes these issues.

RTF set

As you can see all RTFs are now correctly parsed and their OLE stream has been extracted. Some of the OLE objects though are not extracted correctly. After looking into it, it seems to be a problem with the malicious files themselves. OLE streams are encoded as hex strings into the RTF and in some of these files there’s an extra byte which invalidates the sequence.

01 05 00 00 02 00 00 00 1B 00 00 00 A 4D

That ‘A’ character between 00 and 4D makes the sequence read 00 A4 D, which is incorrect. Our guess is that the malware generator which produced these RTFs output some invalid ones by inserting an ‘A’ character instead of a 0x0A newline.

While RTF readers are not able to parse these objects either, it’s still interesting for our analysis to be able to inspect them. So we just load the RTF files, patching the ‘A’ character with a filter as in the screenshot below.

Fixing a broken OLE stream
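The effect of the stray character, and of patching it, can be reproduced in a few lines of plain Python. This is only an illustration on the sequence quoted above, not the filter used in the Profiler: `hex_pairs` is a hypothetical helper that pairs hex digits while ignoring whitespace, roughly as an RTF reader would.

```python
import re


def hex_pairs(text):
    """Group hex digits two by two, ignoring any whitespace."""
    digits = re.sub(r"\s+", "", text)
    return [digits[i:i + 2] for i in range(0, len(digits), 2)]


broken = "01 05 00 00 02 00 00 00 1B 00 00 00 A 4D"

# The extra 'A' shifts the pairing of everything that follows it.
pairs = hex_pairs(broken)
print(pairs[-3:])  # ['00', 'A4', 'D']

# Patch the stray 'A' into the newline it was probably meant to be.
patched = broken.replace(" A ", "\n")
data = bytes.fromhex("".join(hex_pairs(patched)))
print(data.hex(" "))
```

After the patch the digits pair up again and the sequence decodes to 13 bytes ending in 4D.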

That fixes it, and we are now able to inspect the embedded OLE object and its threats. As you can see, we get the shellcode disassembly directly from the automatic analysis.

Fixed OLE stream

Enjoy!

JavaScript Analysis

The upcoming 0.9.5 version of the Profiler introduces tools to interactively analyze JavaScript code. In a few words it adds the capability to execute snippets of code or to debug them. The JavaScript engine used is the one in WebKit.

Let’s take a look at the newly introduced actions:

JavaScript actions

The ‘Execute JavaScript‘ action executes a script and lets the user decide whether to process ‘eval‘ calls or not.

Execute JavaScript

Even when ‘eval‘ calls are not being processed, the argument is still printed out for the user to inspect. When ‘eval‘s are performed, the result (if any) is printed out as well.

js_eval: print('hello world'); 1 + 1
js_print: hello world
js_eval_result: 2

Let’s take a look at the same code under the JavaScript debugger. Given the JavaScript debug capabilities already in Qt, it was easy to integrate a full-fledged debugger:

JavaScript Debugger

The debugger can be executed as a stand-alone utility (jsdbg.exe) as well.

It shouldn’t take long before the new version is ready and then we’ll see these features in action against some real world samples. Stay tuned!