Re: Making C compiler generate obfuscated code

Hans-Peter Diettrich <DrDiettrich1@aol.com>
Wed, 22 Dec 2010 17:12:06 +0100

          From comp.compilers


From: Hans-Peter Diettrich <DrDiettrich1@aol.com>
Newsgroups: comp.compilers
Date: Wed, 22 Dec 2010 17:12:06 +0100
Organization: Compilers Central
References: 10-12-017 10-12-019 10-12-023 10-12-030 10-12-033
Keywords: code, design
Posted-Date: 23 Dec 2010 11:13:15 EST

Torben Ægidius Mogensen wrote:


>> In practice such interruptions of the control flow make automatic
>> disassembling almost impossible. Instead a good *interactive*
>> disassembler is required (as I was writing when I came across above
>> tricks), and time consuming manual intervention and analysis is
>> required with almost every break in the control flow. The mix of data
>> and instructions not only makes it impossible to generate an assembler
>> listing, but also hides the use of memory locations (variables or
>> constants), with pointers embedded in the inlined parameter
>> blocks. Now tell me how a decompiler or other analysis tool should
>> deal with such constructs, when already the automatic separation of
>> code and data is impossible.
>
> Using jump tables and the like is, indeed, going to make unobfuscation
> hard. Especially if the tables change dynamically.


In the cases I observed, the presence of jump tables was not known in
advance, and neither was the structure or size of the data block that
follows the call instruction :-(
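
To illustrate the problem, here is a minimal sketch in Python. The
instruction set, the opcode values and the memory image are all invented
for this example and do not correspond to any real machine. It assumes a
callee at 0x08 that fetches three parameter bytes through its return
address and resumes after them. A decoder that does not know this reads
the parameter bytes as instructions and swallows the real continuation.

```python
# Hypothetical toy instruction set, invented for illustration only.
NOP, CALL, RET, PRINT = 0x00, 0x01, 0x02, 0x03
NAMES = {NOP: "nop", RET: "ret", PRINT: "print"}  # one-byte instructions

IMAGE = bytes([
    CALL, 0x08,        # 0x00: call the routine at 0x08
    0x07, 0x09, 0x01,  # 0x02: inline parameter block (data, not code)
    PRINT,             # 0x05: real continuation after the call
    RET,               # 0x06
    NOP,               # 0x07: padding
    RET,               # 0x08: callee (body elided in this sketch)
])


def decode(image, inline_params=None):
    """Decode in address order.

    inline_params maps a callee address to the number of data bytes that
    follow every call to it. Without it this is a naive linear sweep.
    """
    inline_params = inline_params or {}
    out, pc = [], 0
    while pc < len(image):
        op = image[pc]
        if op == CALL and pc + 1 < len(image):
            target = image[pc + 1]
            out.append((pc, f"call 0x{target:02x}"))
            pc += 2
            n = inline_params.get(target, 0)
            if n:  # skip the data block the callee is assumed to consume
                data = image[pc:pc + n]
                out.append((pc, "db " + ", ".join(f"0x{b:02x}" for b in data)))
                pc += n
        elif op in NAMES:
            out.append((pc, NAMES[op]))
            pc += 1
        else:  # unknown opcode: emit it as a data byte
            out.append((pc, f"db 0x{op:02x}"))
            pc += 1
    return out


naive = decode(IMAGE)
informed = decode(IMAGE, {0x08: 3})

for title, listing in (("naive", naive), ("informed", informed)):
    print(title)
    for addr, text in listing:
        print(f"  {addr:02x}: {text}")
```

The naive listing shows a bogus "call 0x03" at address 0x04 and never
shows the "print" at 0x05. The informed listing is only possible because
the size of the parameter block was supplied by hand, which is exactly
the knowledge that was missing in practice.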


> You might be able to get around this by symbolic execution: You start
> with a state description which allows arbitrary values of variables.


Then you'll end up with a tree of states in the best case, and with a
graph (with loops and knots) in the worst case.
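
A rough sketch of that difference, in Python. The program encoding
("branch", "jump", "halt") and the explore() helper are hypothetical and
heavily simplified: a state is just a program counter plus the branch
conditions assumed so far, and each condition is treated as a fixed
unknown input. A real symbolic executor tracks far more than this.

```python
from collections import deque


def explore(program, entry=0):
    """Fork at every branch whose condition is still undecided.

    Returns (states, edges, merged); merged is True when some state is
    reached by more than one edge, i.e. the result is no longer a tree.
    """
    start = (entry, frozenset())
    seen = {start}
    edges = []
    merged = False
    work = deque([start])
    while work:
        state = work.popleft()
        pc, facts = state
        ins = program[pc]
        if ins[0] == "halt":
            continue
        if ins[0] == "jump":
            succs = [(ins[1], facts)]
        else:  # ("branch", condition, target_if_true, target_if_false)
            _, cond, if_true, if_false = ins
            known = dict(facts)
            if cond in known:  # already decided on this path
                succs = [(if_true if known[cond] else if_false, facts)]
            else:              # undecided: follow both outcomes
                succs = [(if_true, facts | {(cond, True)}),
                         (if_false, facts | {(cond, False)})]
        for nxt in succs:
            edges.append((state, nxt))
            if nxt in seen:
                merged = True
            else:
                seen.add(nxt)
                work.append(nxt)
    return seen, edges, merged


# Best case: straight-line branching, the states form a tree.
TREE = {0: ("branch", "a", 1, 2), 1: ("halt",),
        2: ("branch", "b", 3, 4), 3: ("halt",), 4: ("halt",)}

# Worst case: a backward jump, the states form a graph with a loop.
LOOP = {0: ("branch", "a", 1, 2), 1: ("jump", 0), 2: ("halt",)}

tree_states, tree_edges, tree_merged = explore(TREE)
loop_states, loop_edges, loop_merged = explore(LOOP)
print(len(tree_states), len(tree_edges), tree_merged)
print(len(loop_states), len(loop_edges), loop_merged)
```

In the tree case there is exactly one edge fewer than there are states.
In the loop case an edge leads back to a state that was already visited,
and with real variables instead of fixed conditions the number of
distinct states on such a loop need not even be finite.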




> But what if you know the obfuscation method? Assuming that the
> obfuscation method is polynomic, deobfuscation is at worst NP-hard, so
> it is decidable. But it can be so intractable that it doesn't matter.


It may be possible to crack algorithmic obfuscation, but the odds are
bad with the "handmade" obfuscation I encountered. The intended (and
achieved) effect was optimization (mostly for smaller size), and the
resulting obfuscation was only a side effect.


Even if one can produce equivalent assembler code, with some tricks
(macros...) for data structures with multiple meanings (instructions in
instruction arguments...), that code will remain hard to understand -
and that's the primary goal of every obfuscation. Who will be able to
tell the *purpose* of a state machine or other automaton, given only its
implementation?
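
A small made-up example of what I mean, in Python. The table below is
roughly what a perfect disassembly could hand you: a transition table,
a start state and an accepting state, with no names and no comments.
TABLE, ACCEPT and run() are names chosen for this sketch only.

```python
# Transition table as it might be recovered from a binary:
# TABLE[state][input_bit] gives the next state.
TABLE = [
    [0, 1],
    [2, 0],
    [1, 2],
]
ACCEPT = {0}


def run(bits):
    """Feed a sequence of 0/1 inputs to the automaton, starting in state 0."""
    state = 0
    for b in bits:
        state = TABLE[state][b]
    return state in ACCEPT


def to_bits(n):
    """Binary digits of n, most significant bit first."""
    return [int(c) for c in format(n, "b")]


for n in (5, 6, 7, 9):
    print(n, run(to_bits(n)))
```

The implementation is fully known here, yet nothing in it states the
purpose. This particular table accepts exactly the binary numbers
(most significant bit first) that are divisible by 3, because each state
is the remainder so far and each step computes (2 * state + bit) mod 3.
That has to be worked out by the reader; the code does not say it.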


More unobfuscation problems come to mind, like the use of modified
external code, or merely of different versions of (shared) standard
libraries. When some code relies on the values returned from such
external subroutines, and on their precise implementation in a specific
library version, the entire environment (the version of the OS and of
all installed libraries) has to be taken into account.


DoDi

