Dissecting an ELF with C++ Types

There are more interesting targets for manual analysis with the new features of the Profiler, but I decided to write a short post about ELF, partly because official support for ELF will be added sooner or later.

Let’s start by importing the types contained in ‘elf.h’. You’ll probably find this header in ‘/usr/include’. Everything we’re interested in is in this file, so there’s no need to import anything else. I added some predefines so that no other includes are required:

#define int8_t char
#define uint8_t unsigned char
#define int16_t short
#define uint16_t unsigned short
#define int32_t int
#define uint32_t unsigned int
#define int64_t long long
#define uint64_t unsigned long long

Then I pasted ‘elf.h’ into the Header Manager after the HEADER_START directive and clicked on ‘Import’.

[Screenshot: ELF types import]

We now have a header (elf) with all the types we need to start the manual analysis.

Since this is just a demonstration I didn’t do a full analysis of the ELF format. I limited the scope to finding the imported symbols and their strings.

[Screenshot: ELF analysis]

Every ELF starts with a _Elf64_Ehdr header (Elf32_Ehdr for 32-bit files; in this case it’s a 64-bit ELF). The header specifies the offset, number and entry size of the section headers (we’ll just assume the standard 0x40 entry size here). The ‘sh_name’ field of a section header is just an index into a ‘SHT_STRTAB’ section whose index is specified by the ‘e_shstrndx’ field of the header. The type of a section tells us what it contains, so finding the symbol table is pretty straightforward. In this ELF we have a SHT_DYNSYM section. This section is just an array of _Elf64_Sym structures. Again, their ‘st_name’ field is just an index into another SHT_STRTAB section (the interval named ‘.dynstr’ in the screenshot).
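To make the fields mentioned above more concrete, here is a minimal sketch in plain Python (standard ‘struct’ module, not the Profiler SDK) of how the section headers and their names could be read. It assumes a little-endian 64-bit ELF held entirely in memory; the names EHDR, SHDR, read_cstr and read_sections are made up for this illustration.

```python
import struct

SHT_STRTAB = 3
SHT_DYNSYM = 11

# Elf64_Ehdr (little-endian): e_ident[16], e_type, e_machine, e_version,
# e_entry, e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum,
# e_shentsize, e_shnum, e_shstrndx
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")
# Elf64_Shdr: sh_name, sh_type, sh_flags, sh_addr, sh_offset, sh_size,
# sh_link, sh_info, sh_addralign, sh_entsize
SHDR = struct.Struct("<IIQQQQIIQQ")


def read_cstr(data, offset):
    """Return the NUL-terminated string that starts at offset."""
    end = data.index(b"\x00", offset)
    return data[offset:end].decode("utf-8", errors="replace")


def read_sections(data):
    """Return one dict per section header of a little-endian ELF64 image."""
    ehdr = EHDR.unpack_from(data, 0)
    e_ident = ehdr[0]
    if e_ident[:4] != b"\x7fELF" or e_ident[4] != 2 or e_ident[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    e_shoff, e_shentsize, e_shnum, e_shstrndx = ehdr[6], ehdr[11], ehdr[12], ehdr[13]
    raw = [SHDR.unpack_from(data, e_shoff + i * e_shentsize) for i in range(e_shnum)]
    if not raw:
        return []
    # sh_offset of the string table that holds the section names
    names_off = raw[e_shstrndx][4]
    return [
        {
            "name": read_cstr(data, names_off + sh[0]),
            "type": sh[1],
            "offset": sh[4],
            "size": sh[5],
            "link": sh[6],
            "entsize": sh[9],
        }
        for sh in raw
    ]
```

With the file contents loaded into a bytes object, the symbol table is then simply the entry whose "type" equals SHT_DYNSYM. The sketch does not handle big-endian or 32-bit files.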

As already mentioned in the previous post, we can create a layout programmatically as well:

from Pro.Core import *
from Pro.UI import *

def buildElfLayout(obj, l):
    hname = "elf"
    hdr = CFFHeader()
    if not hdr.LoadFromFile(hname):
        return
    sopts = CFFSO_GCC | CFFSO_Pack1
    d = LayoutData()
    d.setTypeOptions(sopts)
    
    # add header
    ehdr = obj.MakeStruct(hdr, "_Elf64_Ehdr", 0, sopts)
    d.setColor(ntRgba(255, 0, 0, 70))
    d.setStruct(hname, "_Elf64_Ehdr")
    l.add(0, ehdr.Size(), d)

    # add sections (we assume that e_shentsize is 0x40)
    e_shoff = ehdr.Num("e_shoff")
    e_shnum = ehdr.Num("e_shnum")
    esects = obj.MakeStructArray(hdr, "_Elf64_Shdr", e_shoff, e_shnum, sopts)
    d.setStruct(hname, "_Elf64_Shdr")
    d.setArraySize(e_shnum)
    l.add(e_shoff, esects.TotalSize(), d)

hv = proContext().getCurrentView()
if hv.isValid() and hv.type() == ProView.Type_Hex:
    c = hv.getData()
    obj = CFFObject()
    obj.Load(c)
    lname = "ELF_ANALYSIS" # we could make the name unique
    l = proContext().getLayout(lname) 
    buildElfLayout(obj, l)
    # apply the layout to the current hex view
    hv.setLayoutName(lname)

Moreover, the imported types can be used for other operations that are not related to layouts. For instance, let’s write a few lines of code to print out the symbol names for this ELF:

from Pro.Core import *

obj = proCoreContext().currentScanProvider().getObject()

hdr = CFFHeader()
if hdr.LoadFromFile("elf"):
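    # 0x39A0 and 2179 are the .dynsym offset and entry count of this particular file
    syms = obj.MakeStructArray(hdr, "_Elf64_Sym", 0x39A0, 2179, CFFSO_GCC | CFFSO_Pack1)
    it = syms.iterator()
    while it.hasNext():
        s = it.next()
        # member 0 is st_name; 0x105E8 is the .dynstr offset of this particular file
        name_offs = s.Num(0) + 0x105E8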
        name = obj.ReadUInt8String(name_offs, 0x1000)[0].decode("utf-8")
        print(name)

The output will be:

endgrent
__ctype_toupper_loc
iswlower
sigprocmask
__snprintf_chk
getservent
wcscmp
putchar
strcasecmp
localtime
mblen
__vfprintf_chk
; etc.
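The constants 0x39A0, 2179 and 0x105E8 used above are specific to the file I analyzed. As a rough sketch of how they could be derived instead of hardcoded, the following standalone Python (standard ‘struct’ module only, not the Profiler SDK) finds the SHT_DYNSYM section, takes the entry count from sh_size / sh_entsize and follows sh_link to the associated string table. It assumes a little-endian 64-bit ELF; the function name dynsym_names is made up for this example.

```python
import struct

SHT_DYNSYM = 11
# Elf64_Shdr: sh_name, sh_type, sh_flags, sh_addr, sh_offset, sh_size,
# sh_link, sh_info, sh_addralign, sh_entsize
SHDR = struct.Struct("<IIQQQQIIQQ")
# Elf64_Sym: st_name, st_info, st_other, st_shndx, st_value, st_size
SYM = struct.Struct("<IBBHQQ")


def dynsym_names(data):
    """Return the names of all SHT_DYNSYM symbols in a little-endian ELF64 image."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    # Offsets of e_shoff, e_shentsize and e_shnum inside Elf64_Ehdr
    (e_shoff,) = struct.unpack_from("<Q", data, 0x28)
    e_shentsize, e_shnum = struct.unpack_from("<HH", data, 0x3A)
    shdrs = [SHDR.unpack_from(data, e_shoff + i * e_shentsize) for i in range(e_shnum)]

    names = []
    for sh in shdrs:
        if sh[1] != SHT_DYNSYM:
            continue
        sym_off, sym_size, link = sh[4], sh[5], sh[6]
        entsize = sh[9] or SYM.size
        # sh_link of a symbol table is the index of its string table section
        str_off = shdrs[link][4]
        for i in range(sym_size // entsize):
            st_name = SYM.unpack_from(data, sym_off + i * entsize)[0]
            start = str_off + st_name
            end = data.index(b"\x00", start)
            names.append(data[start:end].decode("utf-8", errors="replace"))
    return names
```

The first entry of a symbol table is the reserved null symbol, so the first name in the returned list is normally an empty string.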

Remember that the advantages of using CFFStructs lie not only in their dynamism or in the ease of displaying them graphically, but also in safety. Unlike a structure pointer in C, accessing the members of a CFFStruct carries no risk of a crash.
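To illustrate the general idea (this is not how CFFStruct is implemented, just a toy sketch in plain Python), a member accessor can validate the requested range against the size of the underlying data before reading and fall back to a default value when the range is invalid. The class name CheckedView and its num method are hypothetical.

```python
import struct


class CheckedView:
    """Reads fixed-size fields from a buffer, checking bounds before each read."""

    def __init__(self, data, base):
        self.data = data
        self.base = base

    def num(self, rel_offset, fmt, default=0):
        """Return the value at base + rel_offset, or default if it is out of range."""
        size = struct.calcsize(fmt)
        start = self.base + rel_offset
        if start < 0 or start + size > len(self.data):
            return default
        return struct.unpack_from(fmt, self.data, start)[0]
```

A view placed at an offset past the end of the data simply yields the default for every member, whereas dereferencing the equivalent pointer in C would be undefined behavior.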

Today some final tests will be performed on the new version and if everything goes well, it will be released tomorrow or the day after. So stay tuned!