Thursday, January 27, 2011

Quick checks on VxWorks images - part 2

Following the first post on VxWorks, let's check some more internals of binary system images. I will focus on VxWorks version 5, which is based on a proprietary binary format, whereas VxWorks version 6 uses ELF (the Executable and Linkable Format). VxWorks 5 images are monolithic: everything (system and applications) is often built and linked into a single executable file. And luckily (or by mistake?) most of the VxWorks 5 images I have seen include debugging symbols! Manufacturers seem to forget to strip their firmware...

My goal for now is to retrieve the symbol table, which holds each function's code start address along with the corresponding symbol name and type. First, I will look for suspicious areas where function or variable names are packed together, separated by padding bytes.
For example:

[...]
00137E80   74 65 6C 6E  65 74 64 54  61 73 6B 44  65 6C 65 74  telnetdTaskDelet
00137E90   65 00 74 65  6C 6E 65 74  64 53 65 73  73 69 6F 6E  e.telnetdSession
00137EA0   44 69 73 63  6F 6E 6E 65  63 74 46 72  6F 6D 53 68  DisconnectFromSh
00137EB0   65 6C 6C 4A  6F 62 00 74  65 6C 6E 65  74 64 53 65  ellJob.telnetdSe
00137EC0   73 73 69 6F  6E 44 69 73  63 6F 6E 6E  65 63 74 46  ssionDisconnectF
00137ED0   72 6F 6D 53  68 65 6C 6C  00 74 65 6C  6E 65 74 64  romShell.telnetd
00137EE0   53 65 73 73  69 6F 6E 44  69 73 63 6F  6E 6E 65 63  SessionDisconnec
00137EF0   74 46 72 6F  6D 52 65 6D  6F 74 65 00  74 65 6C 6E  tFromRemote.teln
00137F00   65 74 64 50  61 72 73 65  72 43 6F 6E  74 72 6F 6C  etdParserControl
[...]

After some more checks, I have a good view of how those names are laid out: runs of printable characters separated by one or a few null bytes. That makes the search easy to automate. Furthermore, it can help in recovering the OS loading address used by the bootloader: the addresses of those symbols in the static file are not the ones used at execution time, but they are all shifted by a fixed offset... the loading address.
An idea is to use the differences between the addresses of consecutive symbol strings instead of their static addresses in the file. Can these relative offsets help in locating the symbol table?
Let's open Python again, find sequences of symbols and return the list of offsets between them:

def scan_for_symbols(img):
    # printable chars accepted below: 0x22 to 0x7D (the bounds are exclusive)
    # scan the file for printable chars,
    # spaced with \x00 repeated 1 to maxp times
    
    # constants:
    # char for padding between symbols
    pad_char = '\x00'
    # maximum padding bytes admitted between symbols strings
    maxp = 8
    # limit to determine possible sequence of symbols names
    min_pattern = 0x100 
    
    # initialized:
    num_pattern = 0 # count possible symbols during scan
    start_addr = 0 # store the possible sequence start address
    addr_cur_sym, addr_prev_sym = 0, 0 # start address of possible symbols
    p, pad, acc = 0, 0, 0 # address pointer, and pad and char counters
    addr_diff = [] # list with address offsets between symbols
    ret = [] # list with addr_diff lists found
    
    while p < len(img):
        if 0x21 < ord(img[p]) < 0x7E:
            if start_addr == 0:
                start_addr = p
            if pad > 0:
                addr_cur_sym = p
                if addr_prev_sym > 0:
                    addr_diff.append(addr_cur_sym - \
                     addr_prev_sym)
                addr_prev_sym = addr_cur_sym
            pad = 0
            acc += 1
        elif img[p] == pad_char and pad <= maxp:
            if acc > 0:
                num_pattern += 1
            acc = 0
            pad += 1
        else:
            if num_pattern > min_pattern:
                print '[+] possible symbols starting' \
                      ' at address: 0x%x' \
                      % start_addr
                ret.append(addr_diff)
            addr_diff = []
            addr_cur_sym, addr_prev_sym = 0, 0
            pad, acc = 0, 0
            num_pattern = 0
            start_addr = 0
        p += 1
    return ret

And here is the result in the Python interpreter (with the firmware image of the IP-enabled fridge):

>>> s = scan_for_symbols(img)
[+] possible symbols starting at address: 0x10f90e0
[+] possible symbols starting at address: 0x138fdbc
[+] possible symbols starting at address: 0x14de5f8
>>> [(len(l), l[:15]) for l in s]
[(481, [24, 24, 36, 32, 24, 24, 28, 28, 24, 28, 24, 20, 24, 24, 24]), 
(519, [8, 8, 8, 8, 8, 4, 8, 4, 4, 4, 4, 4, 8, 32, 20]), 
(45700, [12, 12, 12, 8, 8, 8, 8, 8, 8, 4, 8, 8, 8, 4, 8])]

It seems we have a winner at 0x14de5f8 with 45700 debugging symbols! I can confirm it with my hex editor.
The next step is to use those offsets to locate the symbol table containing the function pointers: scan the file, apply a guessed size for one entry (pattern) of the table, and compare the relative offsets between successive supposed symbol addresses with what I got from scan_for_symbols().
Let's do a simple search with Python:

import struct

def search_symbol_table(img, offset_list):
    endian = '>' # handle endianness for DWORD
    pat_size = 16 # guessed size of a pattern of the symbol table
    
    # truncate the list to avoid possible dummy symbols
    # at the beginning and end 
    offset_list = offset_list[50:-50]
    
    i, start_addr = 0, 0
    # scan with byte alignment at 0, 1, 2, ... pat_size-1
    for o in range(0, pat_size):
        # stop early enough that both 4-byte reads stay inside img
        for p in range(0, len(img)-(2*pat_size)-4, pat_size):
            sym_addr_1 = struct.unpack(endian+'I', \
                          img[p+o:p+o+4])[0]
            sym_addr_2 = struct.unpack(endian+'I', \
                          img[p+o+pat_size:p+o+pat_size+4])[0]
            # check against the symbols offset list
            if abs(sym_addr_2 - sym_addr_1) == \
             offset_list[i]:
                i += 1
                if i == 1:
                    start_addr = p+o
                elif i == len(offset_list):
                    print '[+] found symbol table starting' \
                          ' just before address: 0x%x' \
                          % start_addr
                    i = 0
            else:
                i = 0

And let's try it on our fridge's firmware image with the 45700 debugging symbols:

>>> s = scan_for_symbols(img)
[+] possible symbols starting at address: 0x10f90e0
[+] possible symbols starting at address: 0x138fdbc
[+] possible symbols starting at address: 0x14de5f8
>>> search_symbol_table(img, s[2])
[+] found symbol table starting just before address: 0x158d958

After checking with the hex editor, the match is confirmed: Bingo! The table actually starts at address 0x158d6b8; the reported address is a little further in because we truncated the beginning of the offset list.
The format that seems to be used by VxWorks 5 is:
struct symtable_pattern {
    dword symbol_addr;
    dword code_addr;
    dword symbol_type; // 0x500: function, 0x700: data, 0x900: ?
    dword null;
};
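To make that layout concrete, here is a minimal Python 3 sketch (the snippets in this post are Python 2) that decodes entries using the guessed 16-byte big-endian layout and tallies the symbol_type values, which may help in figuring out what 0x900 means. SymEntry, parse_entries and type_histogram are names made up for the illustration, and the demo table is synthetic.

```python
import struct
from collections import Counter, namedtuple

# Guessed layout from the struct above: four big-endian 32-bit words.
ENTRY = struct.Struct(">IIII")
SymEntry = namedtuple("SymEntry", "symbol_addr code_addr symbol_type null")

def parse_entries(img, table_start, count):
    """Decode `count` consecutive 16-byte entries starting at table_start."""
    return [SymEntry(*ENTRY.unpack_from(img, table_start + n * ENTRY.size))
            for n in range(count)]

def type_histogram(entries):
    """Count how often each symbol_type value appears."""
    return Counter(e.symbol_type for e in entries)

# Synthetic three-entry table (made-up values, illustration only).
demo = b"".join(ENTRY.pack(*fields) for fields in [
    (0x14EE5F8, 0x20000, 0x500, 0),
    (0x14EE604, 0x20040, 0x500, 0),
    (0x14EE610, 0x90000, 0x700, 0),
])
entries = parse_entries(demo, 0, 3)
for value, hits in sorted(type_histogram(entries).items()):
    print(hex(value), hits)
```

On a real image, table_start and count would come from the searches in this post, and the histogram would show whether values other than 0x500, 0x700 and 0x900 turn up.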
I was lucky to find a match directly. The offset list deduced from the symbol strings might have needed to be reversed. It could also happen (who knows how obscure debuggers work...) that the symbols in the string list are not sorted in the same order as the entries in the symbol table. For those reasons, we can try another, more statistical, approach to find the symbol table.
One can scan the firmware image, extract 2 consecutive supposed symbol addresses (guessing again the size of a table entry), and check whether the difference between those 2 addresses is less than the total memory space taken by the symbol strings. If that holds as many times in a row as there are symbols in the string area, then we have almost certainly found the symbol table... or a large area of padding :(
Anyway, let's test it:

def search_symbol_table_stat(img, sym_number, max_offset):
    endian='>'
    pat_size = 16
    i, start_addr = 0, 0
    # scan with word alignment at 0, 1, ..., pat_size bytes
    for o in range(0, pat_size):
        for p in range(0, len(img)-(2*pat_size), pat_size):
            sym_addr_1 = struct.unpack(endian+'I', \
                          img[p+o:p+o+4])[0]
            sym_addr_2 = struct.unpack(endian+'I', \
                          img[p+o+pat_size:p+o+pat_size+4])[0]
            if abs(sym_addr_2 - sym_addr_1) < \
             max_offset:
                i += 1
                if i == 1:
                    start_addr = p+o
                elif i == sym_number:
                    print '[+] possible symbol table'\
                          ' starting at address: 0x%x' \
                          % start_addr
            else:
                i = 0

And let's try it again on the fridge's firmware:

>>> search_symbol_table_stat(img, len(s[2]), sum(s[2]))
[+] possible symbol table starting at address: 0x158d580
[+] possible symbol table starting at address: 0x1652d80
[+] possible symbol table starting at address: 0x158d591
[...]
[+] possible symbol table starting at address: 0x1652d8f

Surprisingly, the result is not so bad... The statistical search finds the area starting at 0x158d580, which is close to the exact start address 0x158d6b8. The area starting at 0x1652d80 is actually padding bytes.

Knowing the symbol table and the list of symbol strings, I now have to retrieve the loading address of the firmware. The idea is to take one of the extreme addresses (lowest or highest) among the symbol strings, and its counterpart in the symbol table. Based on the last example with 45700 symbols, I get the loading address: 0x10000. Looks good!
Let's try it now on the firmware of the helium balloon:

>>> len(img)
9128872
>>> s = scan_for_symbols(img)
[+] possible symbols starting at address: 0x67e358
>>> len(s[0])
15874
>>> search_symbol_table(img, s[0])
>>> search_symbol_table_stat(img, len(s[0]), sum(s[0]))
[+] possible symbol table starting at address: 0x71cd40
[+] possible symbol table starting at address: 0x75ad70
[+] possible symbol table starting at address: 0x71cd41
[+] possible symbol table starting at address: 0x75ad71
[...]
[+] possible symbol table starting at address: 0x75ad7f
>>> get_lowest_addr_from_symtable(0x71cd40, 0x75ab8c)
1617425240
>>> hex(_ - 0x67e358)
'0x60001000'

So, this image has almost 16000 symbols. Checking with the hex editor, I can confirm the starting address of the symbol table: 0x71cd40. I note at the same time the end of the table: 0x75ab8c. And comparing the lowest addresses between the symbol strings and the symbol table, I deduce the loading address: 0x60001000. So nice...
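The helper get_lowest_addr_from_symtable() used in that session is not listed in this post, so here is one possible Python 3 sketch of it, not the original code. It assumes the 16-byte entry layout guessed earlier, with the pointer to the symbol name in the first word; unlike the call above, it takes the image as an explicit argument, and skipping null pointers is my own assumption. load_address() is a made-up name for the final subtraction.

```python
import struct

def get_lowest_addr_from_symtable(img, start, end, pat_size=16, endian=">"):
    """Return the smallest non-null symbol_addr field found in img[start:end]."""
    lowest = None
    # end - 3 keeps every 4-byte read inside the [start, end) window
    for off in range(start, end - 3, pat_size):
        (addr,) = struct.unpack_from(endian + "I", img, off)
        # assumption: a null pointer is padding, not a real symbol
        if addr != 0 and (lowest is None or addr < lowest):
            lowest = addr
    return lowest

def load_address(lowest_runtime_addr, lowest_file_offset):
    """Run-time address of a string minus its offset in the image file."""
    return lowest_runtime_addr - lowest_file_offset

# Numbers quoted in the balloon session above.
print(hex(load_address(1617425240, 0x67e358)))
```

This only works if the first string found by scan_for_symbols() is also the one with the lowest address in the table, which is the assumption made in the session above.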
From this point, it is possible to extract the list of symbols with their corresponding code start address and type. This is left to the reader, and it concludes this 2nd session on VxWorks image analysis.
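For readers who want a starting point, here is a rough Python 3 sketch of that extraction. It assumes the guessed 16-byte big-endian entry layout and a known loading address; read_cstring and dump_symbols are hypothetical names, and the tiny image at the end is synthetic.

```python
import struct

def read_cstring(img, off):
    """Return the NUL-terminated string stored at file offset `off`."""
    end = img.find(b"\x00", off)
    if end == -1:               # no terminator: read up to the end of the image
        end = len(img)
    return img[off:end].decode("ascii", errors="replace")

def dump_symbols(img, table_start, table_end, load_addr, pat_size=16):
    """Yield (name, code_addr, symbol_type) for each entry of the table."""
    for off in range(table_start, table_end - pat_size + 1, pat_size):
        name_ptr, code_addr, sym_type, _ = struct.unpack_from(">IIII", img, off)
        name_off = name_ptr - load_addr     # run-time address -> file offset
        if 0 <= name_off < len(img):        # ignore pointers outside the image
            yield read_cstring(img, name_off), code_addr, sym_type

# Synthetic image: two names followed by a two-entry table, loaded at 0x1000.
strings = b"foo\x00bar\x00"
table = struct.pack(">IIII", 0x1000, 0x2000, 0x500, 0) + \
        struct.pack(">IIII", 0x1004, 0x3000, 0x700, 0)
img = strings + table
for name, code_addr, sym_type in dump_symbols(img, len(strings), len(img), 0x1000):
    print("%-8s 0x%08x 0x%x" % (name, code_addr, sym_type))
```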
Next session, blind_key will use the loading address and debugging symbols retrieved here to resolve the cross-references of the executable image in IDA. This will give us a logical view of the binary, instead of the austere hex view we have had up to now.

Monday, January 24, 2011

The Local Privilege Storage Impossibility

When you fire up your mailer, it (usually) does not prompt you for your password to retrieve your mail. Instead, it stores it locally. Wait, whaaat? My password is stored on my hard drive, so anybody can boot on a live medium and 0wn me? Yes, my dear. Fortunately, passwords are usually ciphered before being stored. Not always. Pidgin users can try the following ultr4-l33t h4xx:
$ grep -rn "<password" ~/.purple/*
OK, this was funny. Some other programs do a slightly better job at hiding your credentials. Here is today's "source-code-reverse-engineering" case study! The subject will be Claws-Mail. Get the source, grep magic words, and you'll quickly find the interesting source file.
$ wget http://downloads.sourceforge.net/\
sourceforge/sylpheed-claws/claws-mail-3.7.8.tar.bz2
$ tar xjvf claws-mail-3.7.8.tar.bz2
$ cd claws-mail-3.7.8/src
$ grep -rn password * # lots of stuff, let's get a simple file list
$ grep -rn password * | cut -d':' -f1 | sort | uniq
[...]
common/passcrypt.c
common/passcrypt.h
[...]
$ # here it is !
$ vim common/passcrypt.c common/passcrypt.h
The header file shows us a nice PASSCRYPT_KEY set to "passkey0". And in the C file, we find the ciphering and deciphering procedures. A lot of obscure crypto (at least obscure to me), but you don't need to be a genius to understand the problem: the secret is not secret (it's in the header file), so no crypto-system will be able to hide the password we want to protect. Moreover, there is no need to reverse the algorithm, since the source file provides both ciphering and deciphering functions.

One note about this problem: it is impossible to solve. You cannot hide something from someone who knows the secret you use. Unlike the hashing algorithms used to store your UNIX credentials in /etc/shadow for example, Claws-Mail HAS to store the password in a reversible way. Here is why.
When you type your password in the UNIX login prompt, the system checks that the password you gave produces the same hash as the one stored in /etc/shadow; it does not check that you entered YOUR password (because it does not know it, it simply knows the hash). Thanks to the properties of cryptographic hash functions, there is a very high probability that the password you entered was indeed yours. This property is called second-preimage resistance (a close cousin of collision resistance).
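Here is a small Python 3 sketch of that kind of check. It uses PBKDF2 from hashlib as a stand-in; the schemes actually used in /etc/shadow go through crypt(3) and differ in their details, and make_record and check_password are names invented for the example.

```python
import hashlib
import hmac
import os

def make_record(password, iterations=100_000):
    """Return (salt, iterations, digest): all a login database needs to keep."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, iterations, digest

def check_password(candidate, record):
    """Re-hash the candidate and compare digests; the password is never stored."""
    salt, iterations, digest = record
    attempt = hashlib.pbkdf2_hmac("sha256", candidate.encode(), salt, iterations)
    return hmac.compare_digest(attempt, digest)

record = make_record("zessuperPAssw0rd")
print(check_password("zessuperPAssw0rd", record))
print(check_password("guess", record))
```

Nothing in record lets you compute the password back directly, which is exactly what a mail client cannot afford.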
On the contrary, Claws-Mail HAS to know your password because the mail protocols (e.g. IMAP, SMTP) require you to provide a password. So either you type the password each time you launch Claws-Mail, or it stores it in a reversible way. Anyone who has the same knowledge as you (i.e. who knows the value of PASSCRYPT_KEY) can retrieve the password.
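To illustrate the point, here is a toy Python 3 example of reversible storage with a fixed key. The XOR "cipher" is a deliberately simple stand-in and NOT the algorithm Claws-Mail uses; only the key value is taken from the header file, and toy_cipher is a made-up name.

```python
from itertools import cycle

KEY = b"passkey0"  # value of PASSCRYPT_KEY quoted above

def toy_cipher(data, key=KEY):
    """XOR each byte with the repeating key (toy stand-in, applies both ways)."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

stored = toy_cipher(b"zessuperPAssw0rd")   # what would sit on disk
print(stored != b"zessuperPAssw0rd")       # it no longer looks like the password...
print(toy_cipher(stored))                  # ...but anyone holding KEY gets it back
```

Swapping the XOR for a serious cipher changes nothing here: as long as the key ships with the program, the stored value can be reversed.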

Claws-Mail stores the ciphered password in ~/.claws-mail/accountrc. All the code you need is already there: the deciphering procedure in common/passcrypt.c, plus some base64 stuff in common/base64.c.
$ sudo grep password /home/dummy/.claws-mail/accountrc
password=!U5unwpFJ3+dsOVd+IwfdyQ==
$ grep -rn passcrypt_decrypt *
[...]
prefs_gtk.c:226:    passcrypt_decrypt(tmp, len);
[...]
$ vim prefs_gtk.c
The deciphering is done in this file.
case P_PASSWORD:
   g_free(*((gchar **)param[i].data));
   if (value[0] == '!') {
    gchar tmp[1024];
    gint len;

    len = base64_decode(tmp, &value[1], strlen(value) - 1);
    passcrypt_decrypt(tmp, len);
    tmp[len] = '\0';
    *((gchar **)param[i].data) =
     *tmp ? g_strdup(tmp) : NULL;
   } else {
    *((gchar **)param[i].data) =
     *value ? g_strdup(value) : NULL;
   }
   break;
The first character has to be a "!", then the rest is base64-decoded, then passcrypt_decrypt is called on it, and a "\0" is appended at the end. All you need is to put the pieces together.
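Those steps can be sketched in Python 3 as follows. This is not the passrec.py script mentioned in this post: the decrypt argument is a hypothetical callable standing in for passcrypt_decrypt() (which lives in the C code), and parse_accountrc_password and decode_stored_password are names made up for the example.

```python
import base64

def parse_accountrc_password(line):
    """Return the raw value of a 'password=...' line, or None for other lines."""
    key, _, value = line.strip().partition("=")
    return value if key == "password" else None

def decode_stored_password(value, decrypt):
    """Mirror the P_PASSWORD branch: '!' prefix, then base64, then decrypt.

    `decrypt` is a placeholder for passcrypt_decrypt(): bytes in, bytes out.
    """
    if not value.startswith("!"):
        return value or None                # stored in clear text
    raw = base64.b64decode(value[1:])       # skip the '!' marker
    plain = decrypt(raw)
    return plain.decode("latin-1") or None

value = parse_accountrc_password("password=!U5unwpFJ3+dsOVd+IwfdyQ==")
raw = base64.b64decode(value[1:])
print(len(raw))     # number of ciphered bytes handed to the decrypt step
```

With the real deciphering routine plugged in as decrypt, decode_stored_password(value, decrypt) would return the clear-text password.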

Here is a package with everything: the C code put together (just ripped, no need to add anything), plus a Python script that extracts all the information and deciphers the password.
$ sudo python2 passrec.py dummy
[+] found claws-mail config files for user dummy
[ ] account address: dummy@gmail.com
[ ] receive server: imap.gmail.com
[ ] login: dummy
[ ] ciphered password: U5unwpFJ3+dsOVd+IwfdyQ==
[+] deciphered password: zessuperPAssw0rd
Warning: this script is so ugly that it will give eye-cancer to any person able to write a "Hello World" in python. Add -fPIC to the lib target in the Makefile if it does not compile.

Fortunately you can modify the PASSCRYPT_KEY before Claws-Mail compilation:
$ ./configure --with-passcrypt-key=KEY
Unfortunately, the key will still be stored in the binary. So a little bit of reverse-engineering will eventually bring you the secret. Fire up gdb, break on crypt_cfb_buf, and find the key.

Friday, January 7, 2011

Keyboard fun



I bought a really mean keyboard. Yes, the keys are blank. I wanted to be able to switch keyboard layouts, and I find it weird to type on an AZERTY keyboard with a QWERTY layout (and vice versa). So, blank keys, no assumptions on the layout, no schizophrenia problem for me.

Now, about the layout switching. You can use a GUI application (e.g. fbxkb), but if you have a really bad ass keyboard, you don't want this. So here is a more geeky solution using Xorg configuration. I'm using version 1.9.2.

Open /etc/X11/xorg.conf.d/10-evdev.conf and edit the following lines in the keyboard section:
        Option "XkbLayout" "fr,us"
        Option "XkbOptions" "grp:caps_toggle,grp_led:caps"

What it basically does is:
- load two layouts, fr and us
- use the Caps Lock key (seriously, who uses it?) to change the layout
- toggle the Caps Lock LED when the keyboard layout switches (same behaviour as with the normal Caps Lock key).

You can find more on these options here if you want to use different keys and LEDs. And by the way, if you find yourself "caps-locked", and unable to switch back (remember, the Caps Lock key changes the layout), try Shift-Caps_Lock...

Another way to make the keyboard even more bad ass is to add media keys, since it does not have any. I use the three keys you might have never used: Pause, Print Screen and Scroll Lock. Here is an excerpt from my Openbox config (in ~/.config/openbox/rc.xml):
    <keybind key="S-Print">
      <action name="Execute">
        <execute>scrot -s -e 'mv $f ~/sshots'</execute>
      </action>
    </keybind>
    <keybind key="S-Pause">
      <action name="Execute">
        <execute>ncmpcpp toggle</execute>
      </action>
    </keybind>

So Shift-Print allows me to take a screenshot, and Shift-Pause acts like a Play-Pause key.