Obfuscation - Making Reverse Engineering Harder
In a previous article I demonstrated how easy it is to decompile and reverse engineer .NET assemblies using Reflector and Reflexil. I also showed that applying a strong name to your assembly does not protect your code from reverse engineering. So, what else is left?
A technique called obfuscation goes a long way in keeping your source code safe. An obfuscator mangles your code, without changing its actual behavior, to make it considerably harder for someone to decompile your code and actually understand it.
Keep in mind, however, that this is 'security through obscurity', and as such it does not provide solid security; it's just another layer. The more layers, the more walls there are around your assembly to discourage others from breaking it, but never enough to discourage 100% of the people. A similar tactic can be found in CD protection schemes, which stop a percentage of users from copying a disc, but not everyone.
There are several possibilities to obfuscate an assembly:
Renaming
Classes, methods, parameters, namespaces, fields, and so on are renamed to short names, making it much harder to deduce how a program works from descriptive naming. Unicode characters can be used, making it even harder to understand.
When there is a lot of code, an obfuscator can also give different methods the same name by taking advantage of overloading.
For example, if you have a method GetCustomers without any parameters and a method GetCustomer(int customerNr), an obfuscator can rename them both to 'a'. Calls still reach the correct method because the two overloads differ in their parameter lists.
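To make the overloading trick concrete, here is a minimal sketch of a class before and after renaming. The class names, the method bodies and the names 'a', 'b' and 'A_0' are made up for illustration; a real obfuscator rewrites the compiled IL rather than the C# source, and its naming scheme may differ.

```csharp
using System;

// Original, readable version (hypothetical example class).
public class CustomerRepository
{
    public string[] GetCustomers()
    {
        return new[] { "Alice", "Bob" };
    }

    public string GetCustomer(int customerNr)
    {
        return GetCustomers()[customerNr];
    }
}

// What the same class could look like after renaming: both methods are
// now called 'a' and are told apart only by their parameter lists.
public class b
{
    public string[] a()
    {
        return new[] { "Alice", "Bob" };
    }

    public string a(int A_0)
    {
        return a()[A_0];
    }
}

public static class Program
{
    public static void Main()
    {
        var original = new CustomerRepository();
        var renamed = new b();

        // Both versions behave identically; only the names differ.
        Console.WriteLine(original.GetCustomer(1) == renamed.a(1)); // prints True
    }
}
```

Someone reading the renamed version has to track which 'a' is meant at every call site, and that gets tedious quickly in a large code base.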
Control Flow Alteration
An obfuscator can turn nicely structured code using if, while, for statements into a big spaghetti mess, by utilizing lots of goto statements. The more code you have, the more difficult it will be to understand the generated spaghetti end result.
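As a rough sketch of what this looks like, the example below shows a simple loop next to a flattened version that is driven by a state variable, a switch and goto statements. This is hand-written C# under my own assumptions (the method names and state numbers are invented); an obfuscator performs this kind of transformation on the IL, and real tools produce far messier results.

```csharp
using System;

public static class Program
{
    // Original, structured version: sum the numbers 0 .. n-1.
    public static int SumStructured(int n)
    {
        int total = 0;
        for (int i = 0; i < n; i++)
        {
            total += i;
        }
        return total;
    }

    // Same logic, flattened: each piece of the loop becomes a numbered
    // state, and a central switch jumps between the states.
    public static int SumFlattened(int n)
    {
        int total = 0;
        int i = 0;
        int state = 2;
    dispatch:
        switch (state)
        {
            case 0:                 // loop body
                total += i;
                state = 3;
                goto dispatch;
            case 1:                 // loop exit
                return total;
            case 2:                 // loop condition
                state = (i < n) ? 0 : 1;
                goto dispatch;
            case 3:                 // loop increment
                i++;
                state = 2;
                goto dispatch;
        }
        return total;               // not reached for the states used above
    }

    public static void Main()
    {
        // Both methods compute the same value.
        Console.WriteLine(SumStructured(5) == SumFlattened(5)); // prints True
    }
}
```

The order of the cases no longer matches the order of execution, so the reader has to reconstruct the loop by following the state variable by hand.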
String Encryption
When code has been turned into a complex obfuscated mess, it is still possible to find the interesting parts thanks to the readable strings it uses, which narrows the attack to a much smaller piece of obfuscated code that has to be understood.
Obfuscators provide the ability to encrypt strings and decrypt them at run time, making them unreadable in the compiled assembly and in any decompiled output.
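The idea can be sketched with a toy scheme. The Transform method, the key and the XOR-based encoding below are assumptions of mine purely for illustration; they are not the algorithm of any particular obfuscator, and a simple XOR like this is trivial to reverse once the decoding routine is found.

```csharp
using System;
using System.Text;

public static class Program
{
    // Hypothetical scheme: XOR every character with a value derived from
    // the key and the character's position. XOR is its own inverse, so the
    // same routine both encodes and decodes.
    public static string Transform(string text, int key)
    {
        var result = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            result.Append((char)(text[i] ^ ((key + i) & 0xFFFF)));
        }
        return result.ToString();
    }

    public static void Main()
    {
        const int key = 42;

        // An obfuscator would do this step at build time and store only
        // the encoded string in the assembly.
        string stored = Transform("Enter password: ", key);

        // The obfuscated program decodes the string just before using it.
        string shown = Transform(stored, key);

        Console.WriteLine(stored == "Enter password: "); // prints False
        Console.WriteLine(shown == "Enter password: ");  // prints True
    }
}
```

Searching the assembly for the text "password" no longer finds anything, but anyone who locates the decoding routine can call it on the stored strings, which is why this only slows an attacker down.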
Assembly Linking
Calls made to .NET Framework methods will still be readable. Because of this, some obfuscators allow you to link in the .NET Framework assemblies being used and obfuscate these as well, so that hardly anything 'normal' remains to be found in the IL.
Besides protection features, some obfuscator packages also provide diagram generators to analyze your assembly.
After applying an obfuscator to my previous CrackMe program, this is how it looks when opened with Reflector:
As you can see, Reflector is already unable to generate valid C# code for it. With some obfuscators it is even possible to make Reflector unable to output anything besides IL. Trying to view the code as C#, for example, made Reflector throw an exception.
Testing this on my SuperSecretApplication application, the obfuscator applied several techniques to the IL of the assembly:
- Method renaming, using Unicode characters which could not be displayed.
- Parameter renaming, using a simple naming scheme.
- Altered control flow, using several jump conditions, as displayed by the arrows in Reflexil.
- String encryption, making it harder to find interesting sections.
.method private hidebysig static void á(string[] A_0) cil managed
{
    .entrypoint
    .maxstack 3
    .locals init (
        [0] string str,
        [1] int32 num,
        [2] int32 num2)
    L_0000: ldc.i4 14
    L_0005: stloc num2
    L_0009: br.s L_0024
    L_000b: ldloc num
    L_000f: switch (L_00b7, L_0050, L_0071, L_0094)
    L_0024: ldstr "\ud382\ue484\uf486\ufa88\ufc8a\ue28c\ufd8e\uf590\ua992\ub594"
    L_0029: ldloc num2
    L_002d: call string <module>::a(string, int32)
    L_0032: call void [mscorlib]System.Console::Write(string)
    L_0037: call string [mscorlib]System.Console::ReadLine()
    L_003c: stloc.0
    L_003d: ldc.i4.1
    L_003e: br.s L_0043
    L_0040: ldc.i4.0
    L_0041: br.s L_0043
    L_0043: brfalse.s L_0045
    L_0045: ldc.i4 1
    L_004a: stloc num
    L_004e: br.s L_000b
    L_0050: ldloc.0
    L_0051: ldstr "\ue182\ue984\ue886\uee88\ua58a\uee8c\ufa8e\ufc90\ue392\ue694\ub996\ufb98\ufe9a"
    L_0056: ldloc num2
    L_005a: call string <module>::a(string, int32)
    L_005f: call bool [mscorlib]System.String::op_Equality(string, string)
    L_0064: brfalse.s L_0073
    L_0066: ldc.i4 2
    L_006b: stloc num
    L_006f: br.s L_000b
    L_0071: br.s L_0096
    L_0073: ldstr "\ucd82\uea84\ua686\ua988\uc58a\ue28c\uae8e"
    L_0078: ldloc num2
    L_007c: call string <module>::a(string, int32)
    L_0081: call void [mscorlib]System.Console::WriteLine(string)
    L_0086: ldc.i4 3
    L_008b: stloc num
    L_008f: br L_000b
    L_0094: br.s L_00b9
    L_0096: ldstr "\uda82\uea84\uf286\ua988\ue68a\uec8c\ue18e\uf090\uf492\uf094\uf396\ub998\uef9a\uf29c\ubf9e\uc4a0\ucda2\ud1a4\uc2a6\udba8\u8baa\ud9ac\uc7ae\ud4b0\u93b2\uc6b4\ud2b6\udab8\uc9ba\ud8bc\ucbbe\ue1c0\ua2c2\ub5c4\ub7c6\ua5c8\ua2ca\uaecc\uaece\ua5d0\ubad2\ubad4\ub9d6\uf8d8"
    L_009b: ldloc num2
    L_009f: call string <module>::a(string, int32)
    L_00a4: call void [mscorlib]System.Console::WriteLine(string)
    L_00a9: ldc.i4 0
    L_00ae: stloc num
    L_00b2: br L_000b
    L_00b7: br.s L_00b9
    L_00b9: ldstr "\ud382\uf784\ue286\ufa88\uf88a\uad8c\uee8e\uff90\uea92\ub594\ufc96\ufc98\ue29a\ubd9c\ueb9e\ucea0\u83a2\uc6a4\uc8a6\uc7a8\udfaa\uc4ac\uc1ae\uc4b0\ud6b2\u95b4\u99b6\u99b8\u95ba\u9dbc\u91be"
    L_00be: ldloc num2
    L_00c2: call string <module>::a(string, int32)
    L_00c7: call void [mscorlib]System.Console::Write(string)
    L_00cc: ldc.i4.1
    L_00cd: call valuetype [mscorlib]System.ConsoleKeyInfo [mscorlib]System.Console::ReadKey(bool)
    L_00d2: pop
    L_00d3: ret
}
As you can see, given enough time, it is still possible to figure out the above code and break it. The more code you have, however, the harder this becomes, because the complexity of the obfuscated code increases and the obfuscator gets more opportunities to apply its more advanced techniques.
Obfuscators also have some disadvantages though. When writing advanced code involving reflection, obfuscating an assembly might break your code, for example because classes and namespaces have been renamed. Debugging also becomes harder because the stack trace will contain renamed method signatures.
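The reflection problem is easy to reproduce in a small sketch. The namespace MyApp and the class Customer are hypothetical names chosen for this example.

```csharp
using System;

namespace MyApp
{
    public class Customer { }

    public static class Program
    {
        public static void Main()
        {
            // Fragile: the type is looked up by a name stored in a string.
            // If an obfuscator renames MyApp.Customer, the string no longer
            // matches any type and GetType returns null.
            Type byName = Type.GetType("MyApp.Customer");

            // Robust: typeof refers to the type itself, so the reference
            // is updated together with the type when it is renamed.
            Type byReference = typeof(Customer);

            // True for the unobfuscated build; after renaming, byName
            // would be null and the comparison would be False.
            Console.WriteLine(byName == byReference);
        }
    }
}
```

The same applies to anything else that depends on names at run time, such as configuration files that list type names or serializers that match on property names.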
A good obfuscator will offer features to work around these disadvantages though, by allowing you to mark sections which should be skipped during obfuscation and by providing tools that translate an obfuscated stack trace back to the original methods using a mapping file.
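Both workarounds can be sketched briefly. The .NET Framework ships a System.Reflection.ObfuscationAttribute for marking code that should be left alone, although whether and how a given obfuscator honours it depends on the tool. The mapping table and the Translate helper below are purely hypothetical stand-ins for a real mapping file, whose format differs per product.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Ask obfuscators that honour this attribute to leave the type and its
// members untouched, for example because it is looked up by name.
[Obfuscation(Exclude = true, ApplyToMembers = true)]
public class Customer
{
    public string Name { get; set; }
}

public static class Program
{
    // Hypothetical mapping from obfuscated to original method names,
    // standing in for the mapping file written by the obfuscator.
    static readonly Dictionary<string, string> Map = new Dictionary<string, string>
    {
        { "b.a(Int32)", "CustomerRepository.GetCustomer(Int32)" },
        { "b.a()", "CustomerRepository.GetCustomers()" },
    };

    // Translate one stack frame; frames without a mapping entry are
    // returned unchanged.
    public static string Translate(string frame)
    {
        string original;
        return Map.TryGetValue(frame, out original) ? original : frame;
    }

    public static void Main()
    {
        Console.WriteLine(Translate("b.a(Int32)"));     // prints CustomerRepository.GetCustomer(Int32)
        Console.WriteLine(Translate("Program.Main()")); // prints Program.Main()
    }
}
```

Keep the mapping file for every released build somewhere safe, since a stack trace from an obfuscated build cannot be translated without it.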
You can find a comparison between several obfuscators on How To Select Guides, including links to individual websites and details about which features they support.
This post is the third in a series on protecting intellectual property.