What is Reverse Engineering?

Simply put, reverse engineering is the process of taking something apart to find out how it works. With regard to software, there are three general categories of data that are reverse engineered:

File formats
Protocol formats
Executable programs

File Formats

With file format reverse engineering, you typically have one or more files of an unknown structured format, and the goal is to determine that format. As a simple example, consider a file containing the four-byte sequence:

0x41 0x42 0x43 0x44

This could be interpreted in at least the following ways:

four ASCII characters A B C D
four 8-bit signed integers 65, 66, 67 and 68
two 16-bit little-endian signed integers 16961 and 17475
two 16-bit big-endian signed integers 16706 and 17220
one 32-bit little-endian signed integer 1145258561
one 32-bit big-endian signed integer 1094861636
one 32-bit little-endian floating point number 781.0352173
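These interpretations can be reproduced with a short Python sketch using the standard struct module. The dictionary keys and variable names are illustrative only, and the floating point case assumes the IEEE 754 single-precision encoding.

```python
import struct

# The four-byte sequence from the example above.
data = bytes([0x41, 0x42, 0x43, 0x44])

# "<" selects little endian, ">" selects big endian.
# b = signed 8-bit, h = signed 16-bit, i = signed 32-bit, f = 32-bit float.
interpretations = {
    "ascii": data.decode("ascii"),
    "int8": struct.unpack("4b", data),
    "int16_le": struct.unpack("<2h", data),
    "int16_be": struct.unpack(">2h", data),
    "int32_le": struct.unpack("<i", data)[0],
    "int32_be": struct.unpack(">i", data)[0],
    "float32_le": struct.unpack("<f", data)[0],
}

for name, value in interpretations.items():
    print(f"{name:>10}: {value}")
```

The same four bytes yield a different value for each format string, which is exactly the ambiguity a reverse engineer has to resolve.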

As you can see, even four bytes of unknown data can be interpreted in many different ways. Now imagine that there are many of these four-byte data files. Examining each file, we see that the value of every byte is always between 0x41 and 0x5a. Interpreted as ASCII characters, this range is 'A' .. 'Z'. From this we infer that the likely structure of the file format is an ASCII string of length four.
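This kind of inference can be sketched as a small check over a set of samples. The sample contents below are invented for illustration; in practice they would be read from the files under examination, and the function name is hypothetical.

```python
def all_bytes_in_range(samples, low=0x41, high=0x5A):
    """Return True if every byte of every sample lies in [low, high]."""
    return all(low <= b <= high for sample in samples for b in sample)

# Hypothetical file contents, made up for this example.
samples = [b"ABCD", b"WXYZ", b"TEST", b"DATA"]

if all_bytes_in_range(samples) and all(len(s) == 4 for s in samples):
    print("Consistent with a four-character uppercase ASCII string")
```

A check like this only shows that the samples are consistent with the hypothesis. More samples can strengthen or refute it, but cannot prove it.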

Real-life examples are much more complex. File formats typically contain strings, signed and unsigned integers of varying sizes, floating point numbers, padding bytes and arrays, as well as structures built up of these primitives. Although it may sound complex, file format reverse engineering is the simplest of the three categories.
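To give a feel for how these primitives combine into a structure, here is a sketch that parses a made-up record layout with Python's struct module. The layout (a four-character tag, an unsigned 16-bit count, two padding bytes and a 32-bit float) is entirely hypothetical and does not correspond to any real file format.

```python
import struct

# Hypothetical little-endian record layout:
#   4s  four-byte ASCII tag
#   H   unsigned 16-bit integer
#   2x  two padding bytes (skipped when unpacking)
#   f   32-bit floating point number
RECORD = struct.Struct("<4sH2xf")

def parse_record(raw):
    """Decode one record into a dictionary of named fields."""
    tag, count, value = RECORD.unpack(raw)
    return {"tag": tag.decode("ascii"), "count": count, "value": value}

# Build a sample record so the example is self-contained.
raw = RECORD.pack(b"ABCD", 3, 1.5)
print(parse_record(raw))
```

Once a layout has been worked out, writing it down as a format string like this is a convenient way to test the hypothesis against further sample files.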

Protocol Formats

Reverse engineering communication protocols is a step up from reverse engineering file formats. The added complexity is that protocols are multidirectional, while file formats are unidirectional. A protocol is a conversation that follows strict rules. When reverse engineering a protocol, the data being transmitted must be determined (as with file formats). But perhaps more importantly, the rules as to when that data can be transmitted must be determined as well.
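One way to picture the "rules" part is as a state machine. The sketch below uses an invented toy protocol in which HELLO must come first, DATA may follow any number of times, and BYE ends the conversation. The message names, states and function name are assumptions made for illustration and do not describe any real protocol.

```python
# Hypothetical protocol rules: (current state, message) -> next state.
TRANSITIONS = {
    ("start", "HELLO"): "ready",
    ("ready", "DATA"): "ready",
    ("ready", "BYE"): "closed",
}

def follows_rules(messages):
    """Return True if the message sequence is allowed by TRANSITIONS."""
    state = "start"
    for message in messages:
        state = TRANSITIONS.get((state, message))
        if state is None:
            # No rule permits this message in the current state.
            return False
    return True

print(follows_rules(["HELLO", "DATA", "DATA", "BYE"]))
print(follows_rules(["DATA", "HELLO"]))
```

When reverse engineering a protocol, a table like TRANSITIONS is what has to be recovered from observed conversations, in addition to the layout of each message.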

Documenting an unknown protocol is not the only reason to reverse engineer one. The same techniques can often be used to test the robustness of a particular implementation of a known protocol.

Executable Programs

Reverse engineering an executable program means analysing a program for which you do not have the source code. There are multiple reasons for doing so. These range from gaining a general overview of the capabilities of a program, through analysing a specific algorithm inside a program, to obtaining a complete decompilation of the program to source code.

Often when performing file format or protocol reverse engineering, you have a particular program that uses that file format or protocol. Reverse engineering the executable may assist in reverse engineering the unknown file format or protocol.

This site mainly deals with reverse engineering executable programs.

Why Reverse Engineer?

"I mean, I heard from my (lecturer/colleague/vendor/political representative) that reverse engineering is evil."

Like all software practices, reverse engineering can be used for many different purposes.

Imagine that you are a software company. Another software company has come up with a revolutionary new product that happens to compete with yours. Your response could be to:

Improve your product.
Buy the rights to the product from the other company.
Obtain a copy of the product and reverse engineer it to uncover its secrets.

You suspect that a software product you recently purchased and installed is really spyware and is transmitting your private details back to the vendor. You can either:

Do nothing, because the company assures you that the data being transmitted is for debugging/QA/security purposes and in no way compromises your privacy.
Uninstall and discard the product, wasting your initial investment, just in case your privacy might be compromised.
Reverse engineer the data being transmitted (and void your click-through EULA) to see if the company's claims are true.

You are a system administrator / security researcher. The machines you administer are being attacked and penetrated by a worm. You have obtained a copy of the worm program. You can:

Reverse engineer the program to find out what it does and work out a way to stop it.
Start reading all the security mailing lists to read about the worm from the person who did reverse engineer it.
Do nothing until the next Friday the 13th, when the worm deletes your entire collection of painstakingly collected JPEGs.

This website discusses the tools and processes used in reverse engineering. When compared to technical discussions, morality arguments are boring. The morality of the practice of reverse engineering is left as an exercise for the interested reader.