1. Introduction

A shellcode is a sequence of machine language instructions that an already-running program can be forced to execute by diverting its execution flow through a software vulnerability (e.g. a stack overflow, a heap overflow or a format string bug). In other words, it is the notorious "arbitrary code" that can be run on systems affected by specific vulnerabilities. Typically, a shellcode looks like this:

char shellcode[] = "\xeb\x18\x5e\x31\xc0\x88\x46\x07\x89\x76\x08\x89\x46"
                   "\x0c\xb0\x0b\x8d\x1e\x8d\x4e\x08\x8d\x56\x0c\xcd\x80"
                   "\xe8\xe3\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68";

that is, a sequence of raw bytes (machine code), written here as a C string of hexadecimal escapes.

The purpose of this document is to introduce some of the most widespread techniques for writing shellcode for Linux and *BSD systems running on the IA-32 (x86) architecture.

You may wonder why you should learn anything about writing shellcode, since plenty of ready-to-use shellcodes can be found on the internet (after all, that's what "copy and paste" is for). Still, I think there are at least two good reasons:

  1. first of all, it's always a good idea to analyse someone else's shellcode before executing it, just to know what it is going to do and to avoid nasty surprises (we will discuss this later in detail);
  2. besides this, keep in mind that the shellcode may have to work under very different constraints (input filtering, string manipulation, IDS...) and, therefore, you should be able to modify it accordingly.

A good knowledge of IA-32 assembly programming is assumed, since we won't dwell on general programming topics such as the use of registers, memory addressing or calling conventions.

In any case, the appendix provides a short bibliography for anyone who wants to learn the basics of assembly programming or simply refresh their memory. Lastly, some knowledge of Linux, *BSD and C will be helpful.