Context-sensitive analysis without calling-context

39
Higher-Order Symb Comput (2010) 23:275–313 DOI 10.1007/s10990-011-9080-1 Context-sensitive analysis without calling-context Arun Lakhotia · Davidson R. Boccardo · Anshuman Singh · Aleardo Manacero Jr. Published online: 25 November 2011 © Springer Science+Business Media, LLC 2011 Abstract Since Sharir and Pnueli, algorithms for context-sensitivity have been defined in terms of ‘valid’ paths in an interprocedural flow graph. The definition of valid paths requires atomic call and ret statements, and encapsulated procedures. Thus, the resulting algorithms are not directly applicable when behavior similar to call and ret instructions may be realized using non-atomic statements, or when procedures do not have rigid boundaries, such as with programs in low level languages like assembly or RTL. We present a framework for context-sensitive analysis that requires neither atomic call and ret instructions, nor encapsulated procedures. The framework presented decouples the transfer of control semantics and the context manipulation semantics of statements. A new definition of context-sensitivity, called stack contexts, is developed. A stack context, which is defined using trace semantics, is more general than Sharir and Pnueli’s interprocedural path based calling-context. An abstract interpretation based framework is developed to rea- son about stack-contexts and to derive analogues of calling-context based algorithms using stack-context. The framework presented is suitable for deriving algorithms for analyzing binary pro- grams, such as malware, that employ obfuscations with the deliberate intent of defeating automated analysis. The framework is used to create a context-sensitive version of Venable et al.’s algorithm for analyzing x86 binaries without requiring that a binary conforms to a standard compilation model for maintaining procedures, calls, and returns. Experimental A. Lakhotia ( ) · A. Singh University of Louisiana at Lafayette, P.O. Box 44330 Lafayette, LA 70504, USA e-mail: [email protected] A. Singh e-mail: [email protected] D.R. Boccardo Inmetro—National Institute of Metrology, Quality and Technology, Rio de Janeiro, Brazil e-mail: [email protected] A. Manacero Jr. Paulista State University (UNESP), São Paulo, Brazil e-mail: [email protected]

Transcript of Context-sensitive analysis without calling-context

Page 1: Context-sensitive analysis without calling-context

Higher-Order Symb Comput (2010) 23:275–313DOI 10.1007/s10990-011-9080-1

Context-sensitive analysis without calling-context

Arun Lakhotia · Davidson R. Boccardo ·Anshuman Singh · Aleardo Manacero Jr.

Published online: 25 November 2011© Springer Science+Business Media, LLC 2011

Abstract Since Sharir and Pnueli, algorithms for context-sensitivity have been defined interms of ‘valid’ paths in an interprocedural flow graph. The definition of valid paths requiresatomic call and ret statements, and encapsulated procedures. Thus, the resulting algorithmsare not directly applicable when behavior similar to call and ret instructions may be realizedusing non-atomic statements, or when procedures do not have rigid boundaries, such as withprograms in low level languages like assembly or RTL.

We present a framework for context-sensitive analysis that requires neither atomic calland ret instructions, nor encapsulated procedures. The framework presented decouples thetransfer of control semantics and the context manipulation semantics of statements. A newdefinition of context-sensitivity, called stack contexts, is developed. A stack context, whichis defined using trace semantics, is more general than Sharir and Pnueli’s interproceduralpath based calling-context. An abstract interpretation based framework is developed to rea-son about stack-contexts and to derive analogues of calling-context based algorithms usingstack-context.

The framework presented is suitable for deriving algorithms for analyzing binary pro-grams, such as malware, that employ obfuscations with the deliberate intent of defeatingautomated analysis. The framework is used to create a context-sensitive version of Venableet al.’s algorithm for analyzing x86 binaries without requiring that a binary conforms to astandard compilation model for maintaining procedures, calls, and returns. Experimental

A. Lakhotia (�) · A. SinghUniversity of Louisiana at Lafayette, P.O. Box 44330 Lafayette, LA 70504, USAe-mail: [email protected]

A. Singhe-mail: [email protected]

D.R. BoccardoInmetro—National Institute of Metrology, Quality and Technology, Rio de Janeiro, Brazile-mail: [email protected]

A. Manacero Jr.Paulista State University (UNESP), São Paulo, Brazile-mail: [email protected]

Page 2: Context-sensitive analysis without calling-context

276 Higher-Order Symb Comput (2010) 23:275–313

results show that a context-sensitive analysis using stack-context performs just as well forprograms where the use of Sharir and Pnueli’s calling-context produces correct approxi-mations. However, if those programs are transformed to use call obfuscations, a context-sensitive analysis using stack-context still provides the same, correct results and withoutany additional overhead.

Keywords Analysis of binaries · Context-sensitive analysis · Obfuscation · Deobfuscation

1 Introduction

Recent years have seen an increase in research activity in the area of binary analysis. Whilea majority of the work are in analysis of binaries that are considered benign [1–3, 7–9, 21,25, 26, 29, 34, 39–41, 45, 49], there is an increasing body of work in analyzing maliciousprograms as well [4–6, 12, 19, 31, 51].

Current methods of analyzing binaries are modeled on methods for analysis of sourcecode. A binary is analyzed by decomposing it into a collection of procedures, and perform-ing intraprocedural and interprocedural analysis. The interprocedural analysis is classicallymodeled using Sharir and Pnueli’s interprocedural analysis framework [46]. Since a binary,albeit disassembled, is not syntactically rich, the identification of procedure boundaries,parameters, procedure calls, and returns is done by making assumptions, such as the se-quence of instructions used at a procedure entry (prologue), at a procedure exit (epilogue),the parameter passing convention, and the conventions to make a procedure call. These as-sumptions are often referred to by researchers as a ‘standard compilation model.’

Tools that are designed to aid the analysis of benign programs operate under the assump-tion that the goals of the programmer who wrote the program—referred to as the author—and the goals of the programmer who is analyzing the program—referred to as the analyst—are aligned. The two programmers, when different, are assumed to be on the same team. Itis assumed that the author will not go out of her way to make it difficult for the analystto reach her objective. However, there are many situations when the author and the analysthave conflicting interests. For instance, the interests of a malware author and that of a mal-ware analyst are clearly in conflict. Similar conflict exists between a software developer anda software pirate. It is not in the best interest of the (malware) author to make it easy for the(security) analyst to reach her objective.

The work presented is situated in this context: programs that are written with the ex-plicit intent to make them difficult to analyze. Such programs are often called obfuscatedprograms, since they employ techniques that obfuscate their intent [10]. Though most mal-ware today employ some form of obfuscation [6, 32], the use of obfuscation is not limitedto malware. It is also becoming increasingly common to obfuscate code to protect intel-lectual property [10, 36]. When analyzing third party binaries that may be written with anadversarial intent, even when the source code is available, analyzing the binary is the onlytrue way to detect hidden capabilities, as demonstrated by Thompson in his Turing AwardLecture [50]. Lest Thompson’s paper be considered theoretical, his ideas have been put intopractice by the malware W32.Induc.A [37].

In general obfuscations are created by violating the assumptions of the analyst. The mostcommon method of obfuscating a binary is to violate standards, such as the structure of theexecutable file, or conventions, such as the standard compilation model. The art of deobfus-cation may then be considered to be the creation of analysis with weaker assumptions [42].More complex obfuscations are created by violating the assumptions of the analysis algo-rithms, such as forcing a disassembler to create incorrect disassembly.

Page 3: Context-sensitive analysis without calling-context

Higher-Order Symb Comput (2010) 23:275–313 277

This work focuses on a specific type of obfuscation, namely, call obfuscation [31]. Weconsider a binary to use call obfuscation if it does not follow any particular convention forthe layout of the procedure code in memory or if it does not use call or ret instructions tomake procedure calls. This class of obfuscation is targeted at making it difficult to decom-pose a binary into procedures, and thus making it harder to perform interprocedural analysis.In the absence of procedures, the entire program would be treated as a single, monolithicentity. Such obfuscations directly target methods that use similarity between programs’ callgraphs to identify malware variants. It also targets methods that use the pattern of systemcalls to detect malware [5, 48]. When a malware obfuscates its system calls, such malwaredetection methods fail.

Our work improves upon Sharir and Pnueli’s framework of context-sensitive interproce-dural analysis. In Sharir and Pnueli’s framework, a program is represented as an interpro-cedural control flow graph (ICFG). An ICFG is a collection of CFGs, one per procedure inthe program. A CFG represents the intraprocedural computation of a procedure. Each CFGhas two special nodes, an entry node and an exit node. The entry node represents the pointwhere control is transferred to the procedure, and the exit node represents the point wherethe procedure terminates and returns control to its caller. Every procedure call in an ICFGis represented by two nodes, a call node and a return node. The ICFG has interprocedu-ral edges connecting the CFGs. There is a call edge from a call node to the entry node ofthe procedure called. There is a return edge from an exit node of a procedure to the returnnode paired with the call-site from which the procedure is called. An interprocedural analy-sis is context-sensitive if it propagates information only along interprocedurally valid paths(IVPs). Sharir and Pnueli presented two methods of intraprocedural analysis: the functionalapproach and the call-string approach. In the functional approach, procedure summaries—maps from input to the output values—are computed for every procedure. The calculation ofthe procedure summary requires the analysis of each procedure only a few times in the caseof recursive procedures. Sharir and Pnueli’s framework for computing procedure summarieswas limited to value domains of finite lattices. Sagiv, Reps, and Horwitz [44] generalized itto infinite domains using context-free graph reachability, a concept derived from IVPs. Inthe call string approach, a procedure is analyzed separately for each different invocation (itscalling contexts). This improves the analysis’ precision for pieces of code that are executedmore than once in different contexts. The analysis of different call sequences is made bysimulating the call stack of an abstract machine which contains unclosed calling sequences.

Call obfuscations challenge Sharir and Pnueli’s assumption of statically identifiableprocedures, rigid and well-defined procedure boundaries, and statically identifiable (andpairable) call and return statements. Since a binary does not contain syntactic constructsthat encapsulate procedures, most tools use the knowledge of the compilation models ofvarious compilers to identify and delineate procedures. An important element of the compi-lation model is the use of CALL instructions to make procedure calls. It is common for toolsto find the entry points of procedures by locating the target of all the CALL instructions of aprogram. Call obfuscations are performed by violating these assumptions in two ways. First,a different set of instructions is used to achieve the semantics of CALL and RET instructions.Second, the CALL and RET instructions are used in non-traditional ways.

The issues are enumerated by the x86 programs in Table 1. The two programs are equiv-alent, differing only on how control is transferred to a procedure. The original, unobfuscatedprogram is on the left. The program has labels A1, A2, etc., which are non-informative, asone would expect from labels of disassembled programs. By reviewing the code, it appearsthat A9 is the entry point of a procedure. This is evident from the two ‘CALL A9’ instruc-tions. Thus, the program has two procedures: Main and A9. Procedure Main makes two callsto the procedure A9.

Page 4: Context-sensitive analysis without calling-context

278 Higher-Order Symb Comput (2010) 23:275–313

Table 1 (a) Originalprogram—left; (b) Obfuscatedprogram—right

Label Instruction Label Instruction

A1: PUSH 0xA B1: PUSH 0xA

A2: PUSH 0xB B2: PUSH 0xB

A3: CALL A9 B3: PUSH B6

A4: PUSH 0xC B4: PUSH B13

A5: PUSH 0xD B5: RET

A6: CALL A9 B6: PUSH 0xC

A7: NOP B7: PUSH 0xD

A8: RET B8: PUSH B11

A9: MOV eax, [esp+4] B9: PUSH B13

A10: MOV ebx, [esp+8] B10: RET

A11: ADD eax,ebx B11: NOP

A12: RET 8 B12: RET

B13: MOV eax, [esp+4]

B14: MOV ebx, [esp+8]

B15: ADD eax,ebx

B16: RET 8

The program on the right in Table 1 was created by replacing each CALL A9 instructionin the program on the left by instructions performing equivalent behavior (and renaming thelabels). Consider the following instructions that replace the first occurrence of CALL A9:

B3: PUSH B6B4: PUSH B13B5: RET

The first PUSH instruction pushes the address B6 onto the stack, where B6 is the addressof the instruction after the RET instruction. The second PUSH instruction pushes onto thestack the address B13, the entry point of procedure B13. The RET instruction then, fol-lowing semantics of such instructions, pops B13 from the stack and transfers control to it.Finally, when the execution reaches the RET 8 instruction at address B18, the address B6 ispopped from the stack and control is transferred to it. Thus, the three instructions that re-place CALL A9 preserve its semantics. The instructions replacing the second call instructionproduce analogous behavior.

Figures 1 and 2 present the ICFGs of the original and obfuscated programs, respectively,of Table 1. Comparison of the two ICFGs shows how call obfuscations impact context-sensitive analysis based on Sharir and Pnueli’s framework. The ICFG of the original pro-gram, Fig. 1, contains two CFGs, one for each procedure. Since a binary does not haveencapsulated procedures, the procedure boundaries are identified using some known com-pilation model. The two call statements are represented by four nodes, two call nodes (C1and C2) and two return nodes (R1 and R2). The graph has two call edges, one from each callnode to the procedure entry node A9. It also has two return edges from A12, the exit node ofprocedure A9, to the two return nodes. Other edges in the graph are intraprocedural edges.

The path (A1, A2, C1, A9, A10, A11, A12, R1) represents the execution where procedureA9 returns after the call from the first site. Similarly, if this path is appended with the path(A4, A5, C2, A9, A10, A11, A12, R2) it would represent the execution where A9 returnsafter the call from the second site. Using Sharir and Pnueli’s terms, these two are IVPs.

Page 5: Context-sensitive analysis without calling-context

Higher-Order Symb Comput (2010) 23:275–313 279

Fig. 1 ICFG of program in Table 1(a)

However, the graph also contains the following path: (C1, A9, A10, A11, A12, R2), showinga flow of control from the call site C1 to the return site R2 without going through R1 or C2.Such a path is not an IVP.

The ICFG in Fig. 2 is created using classic rules of constructing such a graph. Whilethe graph is presented to show its correspondence with that in Fig. 1, it should be observedthat the graph does not have any interprocedural edges. In other words, the graph is reallya CFG of a single procedure. The reason is that the obfuscated program does not haveany CALL instructions, and thus does not provide any cues to identify entry points of theprocedures. Therefore, there is no way to determine that the program has two procedures.Because procedure boundaries cannot be determined, the example raises the issue of howthe RET instructions should be treated. In our previous example, a RET instruction wasassociated to its enclosing procedure, whose boundaries were determined using procedure

Page 6: Context-sensitive analysis without calling-context

280 Higher-Order Symb Comput (2010) 23:275–313

Fig. 2 ICFG of program in Table 1(b)

entry point and other rules from the compilation model used. Using the same rule the RETinstructions at addresses B5 and B10 are treated as return instructions of Main. Hence, thegraph shows them as transferring control to the exit node of Main.

The ICFG of Fig. 2 illustrates the limitations of extending Sharir and Pnueli’s frameworkto binary analysis. Paths in the graph do not capture the correct semantics of the program,and its use will produce incorrect results for any context-sensitive, interprocedural analysis.

The key to call obfuscation is that the semantics of call and ret operations can be re-alized using a collection of instructions, thus making the operations non-atomic. It is thenon-atomic nature of the call and ret operations that contributes to making the programobfuscated. Once an operation is decomposed there is no requirement that the instructionsimplementing it appear in consecutive order. Thus, more complex obfuscations that cannotbe detected by simple pattern matches may be realized by shuffling the instructions among

Page 7: Context-sensitive analysis without calling-context

Higher-Order Symb Comput (2010) 23:275–313 281

other code. More complex obfuscations may be achieved by not using PUSH and RET in-structions; instead one may use move, increment, and decrement operations directly on thestack pointer to perform equivalent behavior [31].

This paper presents a framework for developing context-sensitive analyzers of programsin which call and ret operations may not be performed by atomic instructions. Unlike Sharirand Pnueli’s framework, our context-sensitive analysis neither requires that procedures beencapsulated nor that the call and ret operations be atomic. In contrast with Sharir andPnueli’s definition of a context in terms of procedure calls, our framework defines a contextin terms of the state of the stack. A stack context depends only on the knowledge of howthe stack pointer and the instruction pointer are represented, the direction of stack growth,and the static identification of operators manipulating the stack pointer. Thus, to develop acontext-sensitive analysis using our framework, one only needs to know how the stack isimplemented.

The main contributions of this paper are summarized below.

– It introduces the problem of context-sensitive analysis of programs with non-atomic calland ret instructions, and without procedure encapsulation.

– It introduces a new concept of context, called stack-context, and defines context-sensitivity in terms of stack-context. Departing from Sharir and Pnueli’s use of IVP, weuse trace semantics to define these terms. The use of trace semantics, in lieu of IVP, ismotivated by the fact that we wish to model non-atomic call and return operations.

– It presents a systematic development of generic context-sensitive analysis using Ga-lois connection based abstractions of traditional trace-based semantics. The context-abstractions it derives are generic in that they depend only on the LIFO nature of creationand deletion of contexts. These abstractions enable derivation of stack-based context-sensitive analysis, which, unlike calling-context based analysis of prior work, do not de-pend upon transfer of control semantics.

– It systematically derives generic versions of Sharir and Pnueli’s k-suffix call-strings ab-stractions [46] and Emami et al.’s strategy of abstracting calling-contexts [22], with thelatter referred to in this paper as �-contexts. It should be emphasized that prior workson these abstractions have been dependent upon the control flow semantics of CALL andRET instructions. Our generalization of these previous definitions is novel since it doesnot depend on atomicity of call and return operations.

– It proposes a general method for deriving a sound context-sensitive analysis from acontext-insensitive analysis. As an application, a context-sensitive version of Venable etal.’s call obfuscation detection algorithm [51] is derived. The resulting analysis is shownto be sound.

– It presents empirical results comparing the context-insensitive, Sharir and Pnueli’scalling-context-sensitive, and stack-context-sensitive versions of the algorithm. The re-sults show that for unobfuscated programs context-sensitive analysis using stack-contextperforms just as well as the analysis using calling-context. However, if the programsemploy call or return obfuscation, the use of stack-context still yields correct context-sensitive analysis and at the same cost, whereas the use of calling-context produces incor-rect results. The context-sensitive analysis also outperforms its context-insensitive coun-terpart in terms of the computational cost as well as the precision of the answer computed.

Section 2 discusses the related works. Section 3 provides background in domain theoryand abstract interpretation. Section 4 introduces context-trace semantics, a trace semanticsin which context is made explicit. Section 5 presents a generalization of Sharir and Pnueli’sand Emami et al.’s context abstractions. Section 6 derives context-sensitive version of Ven-able et al.’s analysis. Section 7 demonstrates the resulting algorithm by applying it on the

Page 8: Context-sensitive analysis without calling-context

282 Higher-Order Symb Comput (2010) 23:275–313

example programs of Table 1. Section 8 presents empirical evaluation of the method pre-sented. Section 9 outlines some of the limitations of our analysis and is followed by ourconcluding remarks.

2 Related works

Prior research related to this paper may be broadly grouped into the following categories:interprocedural analysis (in general), interprocedural analysis of binary programs, and anal-ysis of malicious/obfuscated programs.

Precise and efficient context-sensitive interprocedural data-flow analysis of high-levellanguages has been an active area of research. The general strategies fall within the twoapproaches proposed by Sharir and Pnueli [46], namely the call-string approach or the pro-cedure summaries approach. A good summary of these works may be found in [28].

The classic interprocedural control flow graph (ICFG) based algorithms for comput-ing function summaries require a priori identification of procedure entries and exits. Thesemethods cannot directly be adapted for our needs because call obfuscations prevent deter-mination of the procedures and their boundaries, thus violating a pre-requisite. Reps et al.’sweighted pushdown system based interprocedural analysis, which also computes functionsummaries [43], does not use ICFGs. Indeed, our representation of context using the state ofthe stack is analogous to Reps et al.’s use of pushdown automata [33, 43]. However, a push-down system implicitly depends upon the transfer of control semantics of call and returninstructions, and thus may not be generalizable to programs with arbitrary stack operations.

Might and Shivers’s [38] framestrings for λ-based languages is similar to our stack-string abstraction in that the context is defined in terms of push and pop stack operations.Their work also includes modeling environments, with the intent of enabling certain inliningoptimizations. Use of their environment theory in our context would be a valuable directionfor future work.

The call-string approach follows the execution of a program. Algorithms based on thisapproach have classically been modeled to determine a change of context based on thesemantics of procedure call/return and are described using ICFGs. We generalize contextabstraction such that it does not depend on the semantics of procedure invocation. As isdone for context-sensitive points-to analysis, the call-graph used for the call-string approachmay be computed on the fly [22, 54–56]. Determining transfer of control based on contentsof memory or registers is analogous to computing the points-to relation for higher-levellanguages. However, since memory addresses are linearly ordered, the resulting “points-to”sets in our problem context can be abstracted using a linear function. Thus, our method isanalogous in spirit, though not in letter, to context-sensitive points-to analysis.

Interprocedural analysis of binaries has also received attention for post-compile timeoptimization [47], for analyzing binaries to detect vulnerabilities not visible in the sourcecode, such as those due to memory mapping of variables [3], and for checking the binary forsome safety property [49]. Goodwin uses the procedure summary approach to interprocedu-ral analysis to aid link-time optimization [25]. Balakrishnan and Reps [3] use the call-stringapproach. Thakur et al. use weighted pushdown systems for interprocedural analysis [43].As mentioned earlier, these methods assume a certain compilation model to identify codesegments related to performing procedure calls, such as that supported by IDA Pro [27].In contrast, we split the semantics of call and ret instructions. We model their effect on the“context” separately from their effect on the “transfer of control.” The context is representedby the state of the stack and is modeled by an instruction’s effect on the stack pointer. The

Page 9: Context-sensitive analysis without calling-context

Higher-Order Symb Comput (2010) 23:275–313 283

transfer of control is analyzed using Balakrishnan and Reps’ value-sets [3] to abstract theset of values maintained in a register or a memory location.

Like us, Kinder et al. have developed an abstract interpretation framework for construct-ing control flow of binaries containing indirect transfer of control [29]. Their analysis, un-like ours, is context-insensitive since it does not model contexts of any kind. It uses constantpropagation to determine the values in memory locations and registers, and thus is likely togenerate greater over-approximations than those resulting from our use of value-sets [3].

There has been significant work in obfuscation of programs with the intent to thwart staticanalysis [11, 36]. The obfuscation techniques are targeted at defeating specific phases in theanalysis of a binary [32]. A metamorphic malware, that is, a malware that transforms itsown code as it propagates, may use procedure call obfuscations as part of its transformationalgorithm, as is the case with the Win32.Evol virus [53]. It has the side-effect of defeatingany interprocedural analysis that depends on a traditional compilation model [32].

Giacobazzi and Dalla Preda have developed abstract interpretation based models of ob-fuscation and deobfuscation of programs [16, 18, 24]. In their models, both program obfus-cators and deobfuscators are abstract interpreters. Since most interesting program analysisproblems are undecidable, program analyzers approximate the answers. When the answerresulting from abstract computation is the same as that resulting from abstracting the answerof concrete computation the analysis is said to be complete. To obfuscate a program is tocreate an equivalent version on which a given analysis is incomplete. On the other hand, todeobfuscate a program is to either transform the program or transform the analysis such thatthe analysis is complete. In other words, an obfuscator, such as a call obfuscator, violatesthe assumptions under which an analysis, such as interprocedural analysis, is complete. Thecontext-sensitive analysis we present reduces the assumptions under which such an analysisis complete.

There has also been efforts in the use of other semantics based methods for detecting mal-ware [6, 17, 19]. Term-rewriting has been proposed to normalize variants of a metamorphicmalware [52]. Dalla Preda et al. have used abstract interpretation to derive signatures for de-tecting self-modifying programs [19] and for detecting opaque predicates [20]. While theseworks have investigated analysis of various types of obfuscated programs, none of themhave addressed the analysis of obfuscations that violate the standard compilation model.

The foundations of the approach presented in this paper come from previous work ofour research group in analyzing programs with obfuscated calls [30, 31, 51]. Lakhotia etal. introduced the problem of call obfuscation [30, 31]. They introduced the Abstract StackGraph (ASG) as an abstraction of the set of all possible stacks of a program. They usedASGs to detect call and return obfuscations. The obfuscations were detected by checkingthe consistency of the stack reaching a statement. For instance, if the top of stack reaching aRET instruction is not created by a CALL instruction, it indicates a non-traditional use of theRET instruction. Though such rules detected obfuscated usage of such instructions, the ASGdid not contain enough information to continue the analysis after an obfuscation was found.For instance, if the program added one to the return address on the stack, the analysis wouldidentify that the address was tampered with, but could not provide the new return address.Venable et al. overcame this limitation by also maintaining an abstraction of the values in thestack locations and in the registers [51]. The values were abstracted using Balakrishnan andReps’ value-sets. Though Venable et al.’s analysis improved the state-of-the-art in analyzingobfuscated programs, being context-insensitive, it did not fare as well for analyzing non-obfuscated programs. The method presented is a further improvement, in that, the analysispresented can uniformly be applied to both obfuscated and non-obfuscated programs. Itfollows that Lakhotia et al.’s rules for detecting call and return obfuscations by performing

Page 10: Context-sensitive analysis without calling-context

284 Higher-Order Symb Comput (2010) 23:275–313

consistency checks using ASGs [31] can also be applied on the ASGs constructed using ourmethod. The application is straightforward and is not elaborated further.

3 Preliminaries

3.1 Domain theory

A binary relation �C : C × C is a partial order upon a set C iff �C is reflexive, transitive,and antisymmetric. For a set C partially ordered by �C and a subset X of C,

⊔X denotes

the element of C (if it exists) that satisfies the following conditions: (i) ∀x ∈ X,x �C

⊔X;

and (ii) ∀y ∈ C,∀x ∈ X,x �C y ⇒ ⊔X �C y. The element

⊔X is called the least upper

bound (lub) of X. Its dual, the greatest lower bound (glb) of X, is denoted by�

X. Whenoperating on a set of two elements, the operations are represented by the binary operators� and � respectively. A partial order due to the binary relation = (equality) is called a flatorder. A partially ordered set (poset) 〈C,�C〉 is a lattice iff ∀x, y ∈ C, both x � y and x � y

exist. It is a complete lattice iff for all subsets X of C, both⊔

X and�

X exist. The lub andglb of a complete lattice 〈C,�C〉 are represented by C and ⊥C , respectively. For any set X,〈℘(X), ⊆〉 is a lattice under the usual subset ordering ⊆. A complete lattice 〈C,⊆C〉 is flatif ∀c, c1, c2 ∈ C : ⊥C �C c �C C , and if c1 = c2 then c1 � c2 = C and c1 � c2 = ⊥C . A flatposet 〈C,=〉 may be extended to a flat lattice by including a lub and a glb in the domain.

Let X∗ denote the Kleene closure of the set X, i.e., the set of finite sequences over X.Let ε ∈ X∗ denote the sequence of length 0. Let (x i) denote the ith element of the sequencex ∈ X∗. Let ‘.’: X × X∗ → X∗ be the cons operator, that inserts an element at the head of asequence, defined formally as: a.x = y ⇔ (y 0) = a ∧ ∀i ≥ 0 : (y i + 1) = (x i). If 〈X, �X〉is a lattice, then 〈X∗, �X∗ 〉 is a lattice, where �X∗ is defined as follows:

∀x1, x2 ∈ X; s, s1, s2 ∈ X∗

ε �X∗ s

s �X∗ $

x1.s1 �X∗ x2.s2 ⇔ x1 �X x2 ∧ s1 �X∗ s2

The order resulting from �X∗ is called a strong ordering, for it defines a sequence tobe smaller than another sequence iff all of its elements are smaller than the correspondingelements of the other sequence. We introduce some operators on sequences for syntacticconvenience. We assume two polymorphic extensions of the cons operator “.”, one to in-sert an element at the end of a sequence: X∗ × X → X∗, and the other to concatenate twosequences: X∗ × X∗ → X∗. We also define the function rest, which operates on X∗ as fol-lows: (rest a.x) = x. When convenient, we also use the notation “Y ↓ X” to denote the Xthelement of the pair Y .

Two complete lattices, C and A, form a Galois connection iff there exists an adjunctionbetween C and A, i.e., ∃α : C → A and γ : A → C such that ∀a ∈ A,c ∈ C : α c �A a ⇔c �C γ a. A Galois connection is denoted by (C,α, γ,A) where α is the left adjoint of γ

(or γ is the right adjoint of α). It is enough to specify either α or γ because in any Galoisconnection the left adjoint map α uniquely determines the right adjoint map γ and viceversa. Given the left adjoint α, the right adjoint is determined as γ a = ⊔

C{c ∈ C | α c �A

a}; or given the right adjoint γ , the left adjoint is determined as α c = �A{a ∈ A | c �C γ a}.

Two complete lattices, C and A, form a Galois connection (C,α, γ,A) iff α is additive (or

Page 11: Context-sensitive analysis without calling-context

Higher-Order Symb Comput (2010) 23:275–313 285

equivalently, iff γ is coadditive). A Galois connection is called Galois insertion when α issurjective (or, equivalently, γ is injective).

Given a Galois connection (C,α, γ,A), a function f # : A → A is a sound approximationof a function f : C → C when α ◦f �A f # ◦α, or equivalently, f ◦ γ �C γ ◦f #. When theabstraction and concretization maps are obvious from context, we denote C � A to meanthat ∃α,γ such that (C,α, γ,A) is a Galois connection. We call A1 � A2 � · · · � An a chainof Galois connections.

3.2 Abstract interpretation

Abstract interpretation is a unified framework for defining approximate semantics of pro-grams [13, 14]. It allows the systematic derivation of data flow analyses and provides meth-ods to prove their correctness and termination.

An analysis may be derived in stages, starting from the concrete semantics and terminat-ing on the desired abstracted semantics. Soundness of the resulting analysis is demonstratedby creating Galois connections between the domains of the successive stages. Galois con-nections may also be used to order two or more analyses by their precision.

Following [15], a program may be formalized as a graph or a transition system τ =〈Σ,Σi, t〉, where Σ is a set of states, Σi ⊆ Σ denotes the set of initial states and t ⊆ Σ ×Σ

defines the transition relation between states. A finite partial trace σ ∈ Σ∗ is a sequence ofprogram states s0 . . . sn satisfying the following: s0 ∈ Σi ∧∀i ∈ [0, n) : (si, si+1) ∈ t . The setof all such finite partial traces is called the trace semantics of the program and is given bythe least fixpoint of the semantic transformer F :

F T = Σi ∪ {σ.s.s ′ | σ.s ∈ T ∧ 〈s, s ′〉 ∈ t}

where T is a set of finite partial traces. The domain of this trace semantics is ℘(Σ∗). Hence,the least fixpoint (lfp) of F is as follows:

lfp�⊥ F =

n≥0

F n ⊥

Let (℘ (Σ∗), α, γ,Abs) be a Galois connection, then F # : Abs → Abs is sound w.r.t F whenF # satisfies the above mentioned requirements (in Sect. 3.1). It can be shown that F # reachesa fixpoint by the well known fixpoint transfer theorem [14]. The precision and cost of theapproximated fixpoint is related with the choice of the widening operator [13].

The derivation of a static analyzer using abstract interpretation may be summarized asfollows. The program state is classically represented by the domain Σ = I × Store, where I

is the domain of instructions and Store is the domain of stores. The analysis is derived froma chain of Galois connections linking the semantic domain ℘((I × Store)∗) to the analysisdomain I → Abstore, where Abstore is an abstraction of stores. The derivation may havethe following stages:

1. The set ℘((I × Store)∗), called a set of traces, is approximated to a trace of sets, repre-sented by (℘ (I × Store))∗.

2. The trace of sets is equivalent to (I → ℘(Store))∗. This sequence of a mapping of in-structions to a set of stores can be approximated by I → ℘(Store).

3. Finally, a Galois connection between ℘(Store) and Abstore completes the analysis.

Page 12: Context-sensitive analysis without calling-context

286 Higher-Order Symb Comput (2010) 23:275–313

4 Context-trace semantics

Context-sensitivity is classically presented in the literature in terms of paths of an ICFG.A path, starting from the entry node in an individual CFG, represents a valid sequence offlow of control. A flow-sensitive analysis propagates data over paths of a CFG. However, apath that starts from the entry of the program and traverses nodes in multiple CFGs may notalways represent a valid flow of control. For such a path to be valid the call and the ret edgesin the path should be paired, and they should meet certain constraints, the details of whichmay be found elsewhere in the literature [28].

Since the prior definition of context-sensitivity is tied to semantics of procedure calland return statements of high-level languages, and therefore, call and ret instructions ofassembly language, it is not directly applicable for context-sensitive analysis of binaries thatare obfuscated.

In this section we use the machinery of abstract interpretation to develop a generalizednotion of context-sensitive analysis, where contexts are maintained in LIFO order. The con-cept is general in that it only requires the knowledge of the set of instructions that createcontexts and those that delete contexts. This generalized concept of context-sensitivity doesnot depend on whether an instruction transfers control. The primary constraint required isthat the most recently created context be destroyed first.

Let � ⊆ I denote the set of instructions (of a language) that open contexts, and � ⊆ I

denote the set of instructions that close contexts. A context string is a sequence of contextopening instructions belonging to �∗ ⊆ I ∗. The binary relation ��∗ : �∗×�∗ is a strong or-dering, as defined in Sect. 3.1, after extending � to a flat lattice. The function π representsthe effect of an individual state, an element of Σ , on the accumulated context string.

π : Σ → �∗→ �∗

πsν �

⎧⎪⎨

⎪⎩

i.ν if i ∈ �

(rest ν) if i ∈ �

ν otherwise

where i = s ↓ 1. If the instruction in the state given by s ↓ 1 belongs to the set �, it is pushedon the current context string. If the instruction belongs to � it pops the topmost context fromthe context string. Otherwise, the context string is left unchanged.

Now given a trace σ we can map it to its current context ν = Πσ , where Π is defined asfollows:

Π : Σ∗ → �∗

Πσ � (Π ′σε)

Π ′: Σ∗ → �∗→ �∗

Π ′εν � ν

Π ′s.σν � (Π ′σ(πsν))

The function Π maps a trace to its context string—the list of contexts that are open—by applying π repeatedly on successive elements of σ . Let νi represent the context stringfrom the ith application of π . The function Π (using Π ′) establishes the following relationνi = (π(σ i)νi−1), where ν0 = ε, for 1 ≤ i ≤ |σ |.

Page 13: Context-sensitive analysis without calling-context

Higher-Order Symb Comput (2010) 23:275–313 287

Consider, for example, the sequence of instructions a x b c y a b a resulting from pro-jecting out only the instructions from a trace. Let a, b, c ∈ � and x, y ∈ �. The context stringassociated with each prefix of the projected trace sequence is given as follows:

{(a) �→ (a), (a x) �→ ε, (a x b) �→ (b), (a x b c) �→ (c b), (a x b c y) �→ (b),

(a x b c y a) �→ (a b), (a x b c y a b) �→ (b a b), (a x b c y a b a) �→ (a b a b)}.A context trace is a pair consisting of a context string and a trace (ν, σ ) ∈ (�∗×Σ∗).

Not all elements of the set (�∗×Σ∗) are meaningful. We define a context-trace in which thecontext string represents the context associated with the trace as a Π -valid context trace.

Definition 1 A context-trace (ν, σ ) ∈ (�∗×Σ∗) is Π -valid iff ν = Πσ .

A Π -valid context trace is equivalent to a valid interprocedural path in the ICFG of aprogram when the sets � and � represent the set of call and return instructions, respectively,of that program.

We denote the set of all finite partial Π -valid context traces as (�∗×Σ∗)Π ≡ �∗ Π−→℘(Σ∗). This forms the semantic domain for the context-trace semantics. The followinglemma shows that this semantic domain is equivalent to ℘(Σ∗), the semantic domain forthe trace semantics.

Lemma 1 �∗ Π−→ ℘(Σ∗) ≡ ℘(Σ∗).

Proof Let f : (�∗ Π−→ ℘(Σ∗)) −→ ℘(Σ∗) and g : ℘(Σ∗) −→ (�∗ Π−→ ℘(Σ∗)) be functionsdefined as follows:

f : (�∗ Π−→ ℘(Σ∗)) −→ ℘(Σ∗)

f Y =⋃

ν∈domY

g : ℘(Σ∗) −→ (�∗ Π−→ ℘(Σ∗))

gX = λν � {σ | σ ∈ X ∧ ν = Πσ }

where Y ∈ �∗ Π−→ ℘(Σ∗) and X ∈ ℘(Σ∗).We can say that �∗ Π−→ ℘(Σ∗) ≡ ℘(Σ∗) if ∀X ∈ ℘(Σ∗) : f (gX) = X and ∀Y ∈ �∗ Π−→

℘(Σ∗) : g(f Y ) = Y .(i) ∀X ∈ ℘(Σ∗) : f (gX) = X.We have to show that f (gX) ⊂ X and X ⊂ f (gX). The proof is as follows.

f (gX) =⋃

ν∈dom(gX)

(gX)ν =⋃

ν∈dom(g X)

{σ | σ ∈ X ∧ ν = Πσ } ⊂ X

That X ⊂ f (g X) follows from contradiction. Assume ∃σ ∈ X : σ /∈ f (gX). That im-plies ∀ν ∈ dom(gX) : ν = Πσ , which is a contradiction if Π is a total function.

(ii) ∀Y ∈ �∗ Π−→ ℘(Σ∗) : g(f Y ) = Y .We have to show that g(f Y ) ⊂ Y and that Y ⊂ g(f Y ). The proof is as follows.

g(f Y ) = {σ | σ ∈ (f Y ), ν = Πσ } ={

σ | σ ∈⋃

ν∈domY

(Y ν), ν = Πσ

}

⊂ Y

Page 14: Context-sensitive analysis without calling-context

288 Higher-Order Symb Comput (2010) 23:275–313

Now, for Y ⊂ g(f Y ), we assume ∃ν ∈ domY such that ν /∈ {σ | σ ∈ ⋃ν∈domY (Y ν), ν =

Πσ }, i.e., ν = Πσ which contradicts that Y is Π -valid according to Definition 1. �

This then gives us the framework needed to develop context-sensitive analyses, wherecontext is made explicit. In particular, it gives the framework to derive a context-sensitivecounterpart to a context-insensitive analysis.

Assume that an analysis I → Abstore derived from the trace semantics ℘((I × Store))∗

is context-insensitive. Its context-sensitive counterpart may be derived using the followingchain of Galois connections:

�∗ Π−→ ℘((I × Store)∗)

� �∗ Π−→ (℘ (I × Store))∗

≡ �∗ Π−→ (I → ℘(Store))∗

� �Abs Π−→ I → Abstore

where �Abs is an abstraction of the concrete context �∗. In the following section we describetwo context abstractions, generalized from analogous abstractions used for calling-contexts.

5 Context abstractions

Due to recursion, the set of all finite length calling-contexts in a program may be infinite.Even when a program does not have recursion, the number of calling-contexts it has can beexponentially large [35]. So while a full call-string analysis may yield the most precise re-sults, it may not be practical to compute it. To make an analysis scalable for large programs,it is common to reduce the space of calling-contexts by using certain abstractions.

The literature contains two significant classes of abstractions for calling-contexts. Thefirst one, introduced by Sharir and Pnueli [46], abstracts a call string by mapping it to its k-length suffix. The second abstraction, introduced by Emami et al. [22], effectively abstractsa call string by reducing recursive paths in it to a single node. We say ‘effectively’ becausethe method is not stated as an abstraction over call-strings but can be mapped to such anabstraction. There are a few later works whose calling-context abstractions may also bemapped to this second abstraction [54, 56].

What is true of calling-contexts will also be true for any other instantiation of our gener-alized notion of context. Hence, it is fruitful to develop generalized context abstractions foruse in any context-sensitive analysis. Since the abstractions for calling contexts have beendefined in terms of paths over an ICFG, the original definitions cannot be directly mappedto generalized contexts that are defined independently of the control flow.

In the following subsections we derive the two abstractions using the machinery of ab-stract interpretation. We call the generalization of Sharir and Pnueli’s k-suffix approach ak-context abstraction and Emami et al.’s reduction of recursive loops an �-context abstrac-tion. While Sharir and Pnueli used k length suffixes, our abstraction uses k length prefixesbecause in our stack the most recent element is inserted at the head of the sequence. Mappingfrom our method to Emami et al.’s is not that straightforward. Emami et al. define a contextas a node in an “invocation graph.” Our �-context strings correspond to paths in Emami etal.’s invocation graph.

Page 15: Context-sensitive analysis without calling-context

Higher-Order Symb Comput (2010) 23:275–313 289

It is apparent that there is no significant algorithmic challenge in generalizing the abstrac-tions from calling-contexts to generalized contexts. However, the real issue in developing theabstraction is in how one would prove that an analysis using that abstraction will be sound.When used for abstracting calling-context, such arguments are made by reasoning over pathsof an ICFG. Since the generalized context does not have the benefit of an ICFG, albeit bydesign, the arguments about soundness must be developed.

Thus, the most significant component of the generalization we perform is the derivationof Galois connections, for these are necessary to prove the soundness of any analysis derivedfrom these context abstractions.

5.1 k-context

Let �k represent the set of sequences of opening contexts of length ≤ k and k + 1 lengthsequences created by appending = ⊔

� to k-length sequences of opening contexts. Thebinary relation ��k : �k×�k is a strong ordering, as defined in Sect. 3.1, after extending � toa flat lattice. An element of �k is called a k-context. We can establish a map αk : �∗→ �k as:

αkν �{

ν if |ν| ≤ k

νk. otherwise, where ∃ν ′ : ν = νk.ν′ ∧ |νk| = k.

In other words, when ν is longer than k, αk maps it to νk., where νk is the k-lengthprefix of ν. A sequence of length ≤ k is mapped to itself. It follows from the lemma belowthat �k is an abstraction of �∗.

Lemma 2 αk is surjective and additive.

Proof The surjectivity property follows from the application of αk , since ∀νk ∈ �k is formedfrom a k-length prefix of a ν ∈ �∗. The additivity property is true if: f (a �b) = f (a)�f (b),where f (x) is the application of αk over a context string ν, and

(i) a � b results from the consecutive application of the strong ordering operator overevery element in the context strings a and b.

(ii) f (a �b) produces a k-context string with the prefix k elements from a �b, preservingtheir order.

(iii) f (a) and f (b) produces two k-context strings with the prefix k elements from a andb, respectively, preserving their order.

(iv) f (a) � f (b) produces a context string resulting from the consecutive application ofthe strong ordering operator over every element in the k-contexts f (a) and f (b).

That αk is additive follows from (ii), (iii), that the strong ordering operator imposes a flatorder, and that αk preserves the order of the context. �

Thus, �∗ and �k form a Galois insertion with the abstraction map αk . Context-sensitiveanalyses may be derived by defining appropriate context abstraction �k� �Abs.

In Table 2, the “Context” column provides some examples of contexts. Their correspond-ing k-context abstractions, with k = 3, are shown in the “3-Context” column.

5.2 �-context

Let B represent the set {1,+}, where 1 � +. The set ��⊆ (�×B)∗ is defined as follows.

Page 16: Context-sensitive analysis without calling-context

290 Higher-Order Symb Comput (2010) 23:275–313

Table 2 Examples of contextsand abstract contexts Context 3-context �-context

abc abc abc

aaaaa aaa a+abca abc a+abcaaaaaabc abc abc+abccba abc a+aaabbbaaabbbcbbb aaa a+b+

Definition 2 �� is the smallest set contained in (�×B)∗ satisfying:

1. ε ∈ ��

2. ∀ν� ∈ ��; c ∈ �; ∀x ∈ B:(c, x) /∈ ν� ⇒ (c,1).ν� ∈ �� ∧ (c,+).ν� ∈ ��

Assume � = {a, b, c}, the notation x denotes (x,1), and x+ denotes (x,+). The followingstrings are some examples of sequences in ��: ε, a, ab, a+, ab+, a+b+c+. Some examplesof sequences in (�×B)∗, but not in ��, are: aa, abba, a+a+, aba+b+. The following lemmagives the bound on the size of strings in ��.

Lemma 3 ∀ν� ∈ �� : |ν�| ≤ |�|.

Proof For any element c ∈ �, either c or c+ may be in ν�, and each element can occur atmost once. �

The element a+ represents the set of all contexts that start and end with the openingcontext a, and may have any context in between. Table 2 provides examples of contextsand their corresponding �-contexts. Consider the context “abcaaaaaabc,” which when readright-to-left gives the order in which the contexts were pushed. It may be abstracted to abc+,where the term c+ represents the set of all non-zero length sequences starting with c andending with c. Thus, the term abc+ thus represents the set of contexts consisting of theopening context a, pushed on the opening context b, pushed on a sequence of openingscontexts starting with c and ending with c. Though, the context “abcaaaaaabc” may alsobe abstracted as a+bc, the construction below chooses abc+ because it abstracts the contextsdeeper in the stack.

To develop the abstraction function from �∗ to �� we first develop the abstract syntax tree(AST) domain �T that is isomorphic to �∗. The abstraction map is then defined on �T . Thefollowing rule defines the syntactic structure of �T in terms of �∗

T .

�T = ⊥ ∪ �T ×�∗T ×�

An element of �T may either be ⊥ or a 3-tuple consisting of (νT , σT , c) where νT ∈ �T ,σT ∈ �∗

T , and c ∈ �. In addition, we also require that the elements of �T further satisfy thesemantic constraint that (t, σT , c) is in �T iff c does not occur again in the subtrees t and σT ,which is formally defined as follows:

∀t ∈ �T ;σT ∈ �∗T ; c ∈ �: (t, σT , c) ∈ �T ⇔ c /∈T t ∧ c /∈T ∗ σT

Page 17: Context-sensitive analysis without calling-context

Higher-Order Symb Comput (2010) 23:275–313 291

Table 3 Examples of mappingcontexts to T-contexts Context T-context

a (⊥, ε, a)

abc (((⊥, ε, a), ε, b), ε, c)

aaaaa (⊥, [⊥,⊥,⊥,⊥], a)

abca (⊥, [((⊥, ε, b), ε, c)], a)

abcabc (((⊥, ε, a), ε, b), [((⊥, ε, a), ε, b)], c)cabcaaabab (((⊥, ε, c), ε, a), [((⊥, ε, c), [⊥,⊥], a), (⊥, ε, a)], b)

where the two relations ∈T ⊆ �×�T and ∈T ∗ ⊆ �×�∗T are defined as follows:

∀c, d ∈ �;σT ∈ �∗T ; t ∈ �T

c ∈T (t, σT , d) ⇔ c ∈T t ∨ c ∈T ∗ σT ∨ d = c

c ∈T ∗ t.σT ⇔ c ∈T t ∨ c ∈T ∗ σT

The function φ maps elements from �∗ to �T . This map amounts to parsing.

φ: �∗→ �T

φ ε � ⊥φ σ.c � ((φ s1), (map φ [s2, s3, . . . , sn]), c)

(1)

where σ = s1.c.s2.c. . . . c.sn for some s1, s2, . . . , sn ∈ �∗ such that ∀1 ≤ i ≤ n : c /∈ si . Thefunction splits a context string, using its first context c, into a sequence of maximal sub-strings s1, . . . , sn such that each of the si does not contain c. The triple (s1, [s2 . . . sn], c) isused to create the recursive structure, with the function map lifting φ to apply it point-wiseon all elements of a sequence. This construction ensures that the semantic constraint for �T

is preserved. The map from a sequence σ.c ∈ �∗ to the triple (s1, [s2 . . . sn], c) is bijective.Thus, the domains �∗ and �T are isomorphic. Table 3 provides examples of contexts andtheir corresponding “T-contexts”, i.e., the corresponding terms in �T .

We now define an abstraction map α� : �T → �� as follows:

∀t ∈ �T ; s ∈ �∗T ; c ∈ �

(α� ⊥) � ε (2)

(α� (t, s, c)) � (α� t).c(αB |s|)

where αB is defined as:

(αB n) �{

1 n = 0

+ n > 0

It follows from the definition that αB is surjective and additive. Hence, αB is an abstrac-tion from N to B, and (N,≤) and (B, {(+,+), (1,1), (1,+)} form a Galois insertion. Todemonstrate that α� is additive we introduce the relation �T on �T as follows:

∀t, t1, t2 ∈ �T ; ∀s1, s2 ∈ �∗T ; ∀c1, c2 ∈ �

Page 18: Context-sensitive analysis without calling-context

292 Higher-Order Symb Comput (2010) 23:275–313

⊥ �T t,

(t1, s1, c1) �T (t2, s2, c2) ⇔ t1 �T t2 ∧ s1 �∗T s2 ∧ c1 �� c2

where � is extended to flat-lattice.It can be shown that �T is reflexive, anti-symmetric, and transitive, and thus defines a

partial order on �T .

Lemma 4 α� is surjective and additive.

Proof The surjectivity property follows from the application of the definition of α�. Theadditivity property follows from structural induction using the recursive functions definedfor mappings φ (1) and α� (2). �

Once again, a context-sensitive analysis may be derived by defining � and �, and theappropriate context abstraction ��� �Abs.

6 Analysis of obfuscated assembly programs

We now turn our attention to context-sensitive analysis of assembly programs in which thecall and ret operations may be performed without using CALL and RET instructions. Thesemantics of the classic call and ret operations consists of two parts: manipulation of thereturn address on the stack and transfer of program control. To obfuscate a procedure call(or return from a call) the two parts of the semantics of the instructions may be separatedand performed using other instructions. Further, all instructions participating in simulatinga call or a return may not be contiguous in the code; they may be intermixed with otherinstructions. On the other hand, CALL (RET) instructions may be employed for purposesother than making (returning from) a procedure call. For instance, a CALL instruction maybe used to transfer control, but the return address may be discarded [30].

Procedure call and return obfuscations thwart analysis of assembly programs by attack-ing an important step needed for interprocedural analysis: identification of procedures [32].Most assembly languages do not provide any mechanism to encapsulate procedures. Thus,disassemblers use call and ret instructions to determine procedure boundaries and to createthe call graph [27]. When these instructions are obfuscated, the procedures identified and thecall graph created may be questionable and any subsequent interprocedural analyses mustbe circumspect.

6.1 Programming language

To present our analysis of assembly programs where call and ret instructions are obfuscated,we first introduce a simple assembly language that does not contain these instructions. In-stead, the language provides primitives to manipulate the stack pointer and the instructionpointer, both of which are registers in the IA32 architecture. Thus, our language capturesthe essential properties needed to present our algorithm for performing context-sensitiveanalysis of obfuscated assembly programs.

Figure 3 presents the syntax of the language we use to model our analysis. A pro-gram p in this language consists of a sequence of instructions (seq(i)). Instructions canbe either conditional or unconditional. A conditional instruction at a label l has the form“l : if (b) eip = e; eip = l′”, where b is a boolean expression, e is an integer expression

Page 19: Context-sensitive analysis without calling-context

Higher-Order Symb Comput (2010) 23:275–313 293

Fig. 3 A model assemblylanguage

Syntactic Categories:b ∈ B (boolean expressions)e, e′ ∈ E (integer expressions)i ∈ I (instructions)l, l′ ∈ L ⊆ Z (labels)z ∈ Z (integers)p ∈ P (programs)r ∈ R (references)

Syntax:e ::= l | z | r | ∗r | e1 op e2 (op ∈ {+, −, ∗, /, . . .})b ::= true | false | e1 < e2 |¬b | b1 && b2i ::= l : esp = esp + e � eip = e′ |

l : esp = e � eip = e′ |l : ∗esp = e � eip = e′ |l : r = e � eip = e′ |l : ∗r = e � eip = e′ |l : if (b) eip = e; eip = l′

p ::= seq(i)

which evaluates to the label of the instruction to execute if b evaluates to true, and l′ is thelabel of the instruction to execute if b evaluates to false. An unconditional instruction at alabel l has the form “l : lhs = rhs � eip = e”. The instruction binds the result of evaluatingthe expression rhs to a reference resulting from evaluating lhs. If the lhs is of the form r ,i.e., is the register r , the value is bound to the register r . On the other hand, if the lhs is ofthe form ∗r , the value is bound to the memory address indexed by the value of register r .The component “eip = e” of an unconditional instruction assigns to the instruction pointereip the label of the instruction to be executed next.

Our language assumes a unique symbol esp representing the stack pointer, which maybe a register or a memory location. As noted by the rules in Fig. 3 for an instruction i,the operations on (or through) the stack pointer are distinguishable from other operations.The interpretation of the unconditional instructions involving esp is analogous to the uncon-ditional instruction discussed above, with esp treated as a register. The analysis presentedassumes that the stack grows towards lower memory addresses, but it can be changed triv-ially to accommodate the opposite convention.

Though our language does not explicitly model call, ret, push, or pop instructions, equiv-alent behavior may be performed using primitives of our language. For example, a “call l”instruction may be mapped to the following sequence of instructions in our language:

l0 : esp = esp − 1 � eip = l1

l1 : ∗esp = l2 � eip = l

where l2 is the address of the instruction after the call instruction. It is not necessary thatthese two instructions appear contiguously in code.

Figures 4 and 5 present the semantics of our language. Figure 4 shows the semantic do-mains and transition relation while Fig. 5 shows the semantic functions. A program state isrepresented by a pair (i, δ) ∈ I × Δ, where i is the next instruction to be executed in thestore environment δ. Thus, Σ = I × Δ denotes the set of all possible program states. Thetransition relation between program states is defined as I : Σ → ℘(Σ), i.e., the transitionrelation represents the behavior of an instruction i when executed in a certain store environ-ment δ. Given a program state s ∈ Σ , the semantic function (I s) gives the set of possiblesuccessor states of s.

Page 20: Context-sensitive analysis without calling-context

294 Higher-Order Symb Comput (2010) 23:275–313

Fig. 4 Semantic domains andtransition relation for thelanguage of Fig. 3

Semantic domains:δ ∈ Δ = R + L → Z (store environment)s ∈ Σ = I × Δ (program states)z ∈ Z (integers)B = {true, false} (truth values)

Transition relation:I : Σ → ℘(Σ)

I(�l : esp = esp + e � eip = e′�, δ)

= {((Fexpr e′ δ),Fesp (Fexpr e δ) δ)}I(�l : esp = e � eip = e′�, δ)

= {((Fexpr e′ δ),Freset (Fexpr e δ) δ)}I(�l : ∗esp = e � eip = e′�, δ)

= {((Fexpr e′ δ),F∗esp (Fexpr e δ) δ)}I(�l : r = e � eip = e′�, δ)

= {((Fexpr e′ δ),Fassign r (Fexpr e δ) δ)}I(�l : ∗r = e � eip = e′�, δ)

= {((Fexpr e′ δ), (F∗assign r (Fexpr e δ) δ)}I(�l : if (b) eip = e; eip = l′�, δ)

={

{((Fexpr e δ), δ)} if true = (Fbool b δ)

{(l′, δ)} if false = (Fbool b δ)

Fig. 5 Semantic functions forthe language of Fig. 3

Semantic functions:Fesp : Z → Δ → Δ

Fesp z δ = [esp �→((δ esp) + z)]δFreset : Z → Δ → Δ

Freset z δ = [esp �→ z]δF∗esp : Z → Δ → Δ

F∗esp z δ = [l′ �→z]δ, where l′ = δ espFassign : R → Z → Δ → Δ

Fassign r z δ = [r �→z]δF∗assign : R → Z → Δ → Δ

F∗assign r z δ = [l′ �→ z]δ, where l′ = δ r

Fexpr : E → Δ → Z

Fexpr �l�δ = l

Fexpr �z�δ = z

Fexpr �r�δ = δ r

Fexpr �∗r�δ = δ l, where l = δ r

Fexpr �e1 op e2 �δ = Fexpr �e1 �δ op Fexpr �e2 �δ

Fbool : B → Δ → BFbool �true�δ = trueFbool �false�δ = falseFbool �e1 < e2 �δ = Fexpr �e1 �δ < Fexpr �e2 �δ

Fbool �¬b�δ = ¬Fbool �b�δ

Fbool �b1 && b2 �δ = Fbool �b1 �δ ∧ Fbool �b2 �δ

The transition relation, written for the set of all possible states, may be specialized for thestates of a specific program as follows. Let Σp = p × Δp be the set of states of a programp, then the transition relation I p : Σp → ℘(Σp) on program p is: (I p i δ) = {(i ′, δ′) |(i ′, δ′) ∈ (I i δ), i ′ ∈ p, and δ, δ′ ∈ Δp}.

The concrete trace semantics for a program p is given by the least fixpoint of the follow-ing function:

F p T = Σp

i ∪ {σ.s.s ′ | σ.s ∈ T ∧ s ′ ∈ I p s}where T is a set of finite partial traces, σ is a sequence of program states s0 . . . sn of length|σ | > 0 such that ∀i ∈ [1, n) : si ∈ (I p si−1), and s0 ∈ Σ

p

i is the set of initial states. Following

Page 21: Context-sensitive analysis without calling-context

Higher-Order Symb Comput (2010) 23:275–313 295

Sect. 4, the concrete context-trace semantics can be obtained by the least fixpoint of Fc :�∗ Π−→ ℘(Σ∗) −→ �∗ Π−→ ℘(Σ∗) which is mapped from F : ℘(Σ∗) −→ ℘(Σ∗).

6.2 Stack context

We now define the sets �′asm and �′

asm, which are the sets of instructions that open and closecontexts, respectively, based on operations on the stack pointer. An instruction opens a con-text, i.e., belongs in �′

asm, if it decrements the stack pointer. Analogously, an instructioncloses a context, i.e., belongs in �′

asm, if it increments the stack pointer.

�′asm � {i | ∃n ∈ N,∃δ, δ′ : δ′ ∈ (I i δ) ∧ (δ′ esp) = (δ esp) − n}

�′asm � {i | ∃n ∈ N,∃δ, δ′ : δ′ ∈ (I i δ) ∧ (δ′ esp) = (δ esp) + n}

Consider the class of programs in which (a) an instruction that modifies the stack pointeralways increments or decrements it by a statically known constant and (b) that constant isthe same for all instructions. This class includes programs that use only call, ret, push, andpop instructions to modify the stack pointer. Programs generated by conventional compilerstypically fall in this class. For this class of programs, the analysis domains �′�

asm→ Abstateor �′k

asm→ Abstate may be used to derive a context-sensitive analysis.Now consider the programs that meet constraint (a), but not (b). That is, programs in

which the increment/decrement applied to a stack pointer can be statically determined, butnot all instructions use the same constant. Since the size of space allocated/deallocated on thestack is not the same, a closing context statement may not remove the entire context fromthe top of the stack. The analysis can be trivially extended to accommodate this class ofprograms by statically introducing pseudo-instructions such that all stack pointer operationsuse the same constant.

Obfuscated programs, however, may not meet either of the constraints. They may containinstructions that modify the stack pointer by direct assignment of values, such as using theinstruction l : esp = e � eip = e, or they may contain instructions that increment or decrementthe stack pointer by an expression whose value cannot be determined statically. In the ab-sence of any further information about the possible values of the expression, say from usingthe Abstate, to derive a safe analysis a worst case assumption must be made.

To analyze the most general class of assembly programs, we need to develop a concretecontext-trace semantics that allows for non-fixed size contexts. The set of opening contextsfor such semantics may be represented by the domain: �asm ⊆ I × N, meaning that a contextis a pair consisting of a statement and the stack units it affects. The set of closing contextsis represented by the domain �asm ⊆ I × N. The domains are described as follows:

�asm � {(i, n) | ∃δ, δ′ : δ′ ∈ (I i δ) ∧ (δ′ esp) = (δ esp) − n ∧ n > 0}�asm � {(i, n) | ∃δ, δ′ : δ′ ∈ (I i δ) ∧ (δ′ esp) = (δ esp) + n ∧ n > 0}

A context string is a sequence belonging to �∗asm. A function Πasm : Σ∗ → �∗

asm may nowbe defined that maps a trace to its context string. This function accounts for the creation and

Page 22: Context-sensitive analysis without calling-context

296 Higher-Order Symb Comput (2010) 23:275–313

destruction of varying size contexts.

Πasm: Σ∗asm → �∗

asm

Πasmσ � (Π ′asm σ ε)

Π ′asm: Σ∗

asm → �∗asm→ �∗

asm

Π ′asmε ν � ν

Π ′asms1.ε ν � ν

Π ′asms1.s2.σ ν � (Π ′

asm s2.σ (πasm (i1, n) ν))

where n = (δ2 esp) − (δ1 esp), s1 = (i1, δ1), δ2 = s2 ↓ 2πasm: (I × N) → �∗

asm→ �∗asm

πasm (i,0) ν � ν

πasm (i, n) ν � (i,−n).ν, if n < 0

πasm (i, n) (j,m).ν �{

(j,m − n).ν if m > n

πasm (i, n − m) ν otherwiseif n > 0

The function πasm above is a counter-part of the function π described in Sect. 4. It performsthe push and pop operations on a context-string depending on the value of n. A negativevalue of n implies a push operation when the stack grows towards lower memory addresses.Correspondingly, a‘positive value of n implies a pop operation.

The concrete domains �asm and �asm form infinite lattices because N is infinite. Further-more, the value of n, representing the size of a context, may not be statically computable.Hence, we need an abstraction of these domains. We use the lattice of constant-propagationto abstract N. Let N represent the flat lattice consisting of the set of numbers in N and thespecial values and ⊥, the lub and glb of the lattice. We can now define the abstract context

domains �asm = I ×N and �asm = I ×N . The �-context abstraction of �∗asm, denoted by �

asm,will be used in the context-sensitive analysis of assembly programs.

Let us compare the stack-context and the calling-context of a non-obfuscated program.It is apparent that the set �asm specialized for instructions in a program P may be largerwhen using stack-context than for calling-context. This is because when using stack-context,a context is created not just for call instructions, but also for push instructions. What is theimplication of these extra nodes on the computational complexity of the analysis? The com-plexity depends on two factors. The total number of context strings created for a programand the number of context strings reaching a statement. The total number of context strings(which includes all partial strings) will increase, as will the length of the strings. Empiricalresults presented in the following section show that this does not impact the computationalperformance of the algorithms since the number of context-strings reaching an instructionremains unchanged. The number of context-strings can grow exponentially. Further im-provement in computational cost may be gained by encoding them using binary-decisiondiagrams (BDD), as previously reported for context-sensitive pointer analysis [54, 56].

6.3 Context-sensitive construction of context

The functions π �asm and π k

asm—approximations of πasm over the domains ��

asm and �k

asm,respectively—are too imprecise to yield useful context-sensitive analysis. The key issueis how these abstract functions map the push and pop operations. While the abstraction of apush operation produces the desired k- or �-context strings, that is not the case for a pop op-eration. The problem is that a k- or an �-context string does not contain enough information

Page 23: Context-sensitive analysis without calling-context

Higher-Order Symb Comput (2010) 23:275–313 297

to create the abstract contexts resulting from popping the top context. For example, considera 3-context string abc. What should be the set of contexts resulting from popping a fromthis string?

When performing a classical interprocedural analysis, abstractions of call-strings arecomputed using the call-graph of the program. Every calling context is a path in a call-graph,and vice versa. Thus, in the case of abc, the set of k-contexts resulting from popping thetop context may be determined from the call graph. The same holds for �-contexts.

As with interprocedural analysis, any context-sensitive analysis must maintain a contextgraph that concisely represents all the contexts. Lakhotia and Kumar [30] introduced Ab-stract Stack Graph, ASG = R + �asm → ℘(�asm), as a finite representation of all possible

stacks and showed that ℘(Σ∗) = ℘(I × (R + L) → Z) � I → R + �asm → ℘(�asm). Theresulting AI computes the ASG. Given an ASG s, (s esp) gives the topmost contexts on thestack. Successive contexts may be enumerated by reapplying s.

While Lakhotia and Kumar’s analysis abstracted the shape of the stack, it did not abstractthe values in the stack. Thus, the resulting AI could not analyze a program that modified thecontents of the stack and used that to transfer control. Venable et al. addressed this limitationby using Balakrishnan and Reps’ [3] value-set (VS) domain, N × Z × Z, to abstract ℘(Z),the set of values in stack locations and registers. A value s[lb, ub] ∈ VS, where s ∈ N andlb, ub ∈ Z are mapped to ℘(Z) by the following concretization map:

γ (s[lb, ub]) = {z | lb ≤ z ≤ ub, z ≡ lb (mod s)}.Thus, for example, γ (2[1,9]) = {1,3,5,7,9}. The VS domain is partially ordered using thesubset (⊆) order of the set of values represented by its elements.

Since memory addresses are numerical values, the domain VS provides a safe approxi-mation of the set of numerical values as well as addresses held by a register or a memorylocation. Whether the values represented by an element s[lb, ub] are memory addresses ornumerical values follows from how the information is used in an instruction. When the valueis assigned to eip it is treated as a memory address, in particular, a label of an instruction.Similarly, the value represents a memory address when used in an indirect memory operand,such as when computing the expression ∗r .

Venable et al. combined the ASG and VS domains to derive the context-insensitive anal-ysis I → R + �asm → ℘(�asm) × VS. Venable et al.’s algorithm tracks stack manipulationswhere the stack pointer may be saved and restored in memory or registers. The memoryoperations are limited to those on the stack. Given s ∈ R + �asm → ℘(�asm) × VS, the firstelement of (s esp) gives the set of topmost contexts and the second element of (s eip) yieldsthe set of addresses of the next instructions.

Venable et al.’s analysis is different from the VSA algorithm of Balakrishnan and Reps,which computes I → Region → VS, where the domain Region is an abstraction of the mem-ory layout. There are three types of regions, differing in the scope and lifetime of the alloca-tion. There is a region each for memory allocated in the global area, on the heap, and on thestack. In contrast, the domain �asm represents only memory allocated on the stack, and thusis a weaker version of Balakrishnan and Reps. On the other hand, Balakrishnan and Repsused a pre-computed call-graph to model the semantics of call and ret instructions, whereasVenable et al. used the ASG that was simultaneously constructed with the analysis.

Even though Venable et al.’s analysis modeled the ASG, its computation was context-insensitive. Their AI did not use the ASG to maintain contexts. We derive a context-sensitive equivalent of Venable et al.’s analysis using the domain �

asm → I → R + �asm →℘(�asm) × VS. The analyzer may be derived using a chain of Galois connections. To ensure

Page 24: Context-sensitive analysis without calling-context

298 Higher-Order Symb Comput (2010) 23:275–313

Fig. 6 Toplevel flow ofalgorithm for context-sensitiveanalysis of binaries

analyze(P , E)begin

for all i in I [P ] domake node i in CFG;i.visited[.] = false;i.state[.] = δinit ;

odδ0 = {esp : (⊥, {⊥})} · δinit } ;worklist = {〈E,⊥, δ0〉};

while worklist is not empty doremove 〈i, c↑, δ↑〉 from worklist;if not i.visited[c↑] then

〈Succi ,Ci , δi 〉 = interpret(i, c↑, δ↑);else if δ↑ � i.state[c↑] then

continue;else if i is a loop junction node then

δ↑ = widen(i.state[c↑], δ↑);〈Succi ,Ci , δi 〉 = interpret(i, c↑, δ↑);

else if i is a non-loop junction node thenδ↑ = i.state[c↑] � δ↑;〈Succi ,Ci , δi 〉 = interpret(i, c↑, δ↑);

else〈Succi ,Ci , δi 〉 = interpret(i, c↑, δ↑);

fii.visited[c↑] = true;i.state[c↑] = δ↑;worklist = worklist ∪ (Succi × Ci × {δi });Add edges {i} × Succi to CFG;

odend

termination of the analyzer, we use the widening operator for the VS domain, as given by [3],to accelerate fixpoint computation.

6.4 Soundness

The concrete context-trace semantics is given by the least fixpoint of the function Fc :�∗

asmΠasm−−→ ℘(Σ∗) −→ �∗

asmΠasm−−→ ℘(Σ∗), where Σ = I × (R + L → Z) = I → ℘(R + L →

Z). The context-trace semantics of the context-sensitive analyzer is given by the least fix-

point of the function F # : (��

asm → I → R + �asm → ℘(�asm) × VS) −→ (��

asm → I →R + �asm → ℘(�asm) × VS). Soundness of F # follows from the following lemma.

Lemma 5 �∗asm

Πasm−−→ ℘(Σ∗) � ��

asm → I → R + �asm → ℘(�asm) × VS.

Proof It follows from �∗asm� �

asm (Lemma 4) and ℘(Σ∗) � I → R + �asm → ℘(�asm) ×VS [51]. �

It follows from Lemma 5 and the fixpoint transfer theorem that F # is a sound, albeitincomplete, approximation of Fc .

Page 25: Context-sensitive analysis without calling-context

Higher-Order Symb Comput (2010) 23:275–313 299

7 Example

We now demonstrate our context-sensitive counterpart of Venable et al.’s algorithm by ap-plying it on the examples provided in Sect. 1. The demonstration is done in two steps. Wefirst enumerate the intermediate results from applying Venable et al.’s algorithm on the non-obfuscated program of Table 1(a). We then conduct a similar enumeration of intermediateresults for the obfuscated program of Table 1(b), but this time using our context-sensitivealgorithm. Henceforth, the two programs are referred to as Programs A and B, respectively.

7.1 Toplevel algorithm

Figure 6 presents the toplevel control flow of our program analysis algorithm. The controlflow is essentially the same as classic data flow analysis algorithms. It maintains a worklistof nodes to process. The elements in the worklist consist of 3-tuples 〈i, c↑, δ↑〉, where

i ∈ I [P ] is an instruction of the program P being analyzed, c↑ ∈ ��

asm[P ] is the context for

the analysis (specialized for program P), and δ↑ ∈ I [P ] + �asm[P ] → VS × ℘(�asm[P ]) isthe state to be propagated to instruction i. The algorithm picks a node from the worklistand interprets the instruction in the given context and state. The interpretation yields a triple〈Succi ,Ci, δi〉, where Succi is the set of next instructions to analyze, Ci is the set of contextsin which to analyze the instructions, and δi is the state to be propagated. That instruction i

was interpreted in context c↑ is remembered by marking ‘i.visited[c↑]’ as true. Similarly,the state δ↑ at the time of the visit is also remembered in ‘i.state[c↑]’. If the same instructionis visited again in the same context, but with a state that is smaller than the previous state, itis not interpreted again. A revisit to an instruction is analyzed to determine whether it is dueto a loop, and if so, it is reinterpreted after widening the previous state using the new state.Otherwise, it is interpreted after merging the two states.

The algorithm does deviate from classic flow analysis algorithms at one point, though.While classic algorithms assume that the CFG (or ICFG) of the program has been con-structed before the analysis, our algorithm constructs the CFG as it proceeds. The nodes ofthe CFG consist of the instructions of the program, and are initialized at the beginning. Theedges of the CFG are added as they are discovered as a result of the interpretation.

7.2 Context-insensitive analysis

We now perform Venable et al.’s context-insensitive analysis of Program A, i.e., the original,unobfuscated program. The program has the following sets pertinent for the analysis:

• Instructions: I [A] = {A1 . . .A12}• Opening contexts: �asm[A] = {A1/4,A2/4,A3/4,A4/4,A5/4,A6/4,⊥}• Closing contexts: �asm[A] = {A8/4,A12/12}

The instructions are represented by their labels or addresses. An opening context is a pairof an instruction and the number of bytes of contexts it adds. We assume a 32-bit machine,and that the PUSH and CALL instructions push 4-byte values onto the stack. Henceforth,for brevity, a context is represented only by its instruction and not its size. Stack contextsof other sizes may be created by direct manipulation of the stack pointer by using a SUBinstruction. To keep the discussion focused on core issues, such instructions are not used inour examples. The program has six instructions that open contexts, all of the same size. Inaddition, a context ⊥ is added to represent the context at the start of the program, typically

Page 26: Context-sensitive analysis without calling-context

300 Higher-Order Symb Comput (2010) 23:275–313

referred to as the bottom of the stack. The program has two instructions that close contexts,both of which are RET instructions. They vary in the size of contexts they remove. The RETinstruction at address A8 removes 4 bytes of context corresponding to the return address,whereas the RET instruction at address A12 removes 12 bytes (hexadecimal 0xC), fourcorresponding to the return address and four each for the two parameters.

Since we are performing context-insensitive analysis, we will restrict the set of openingcontext to �asm[A] = φ, which then limits the context to a single value ⊥. The analysis startsat instruction A1, the entry point of A, with the initial state δinit as defined below:

δinit = {rl : (⊥, φ) | rl ∈ {eax, ebx, esp} ∪ �asm[A]}We use the following notations. In a function construction, the notation x : y implies thatthe resulting function x maps to y. We represent the function update operation using f ′ ={x : y} · f . The updated function f ′ is point-wise similar to f , except for x where f ′x = y.

Table 4 presents the ‘trace’ of analyzing the program. It presents the ‘output’ of the‘interpret’ step from the algorithm of Fig. 6. Column # gives the order in which an ele-ment is picked from the worklist. Column �#� gives the instruction chosen for interpreta-tion. Columns Succ# and δ# give the tuples resulting from “interpret(�#�,⊥, δx )”, where theδx is the state used for interpretation. Context-insensitive analysis is performed by using

��

asm[P ] = {⊥}, thus C# is always {⊥}, and is not represented.Row 1 of Table 4 shows the result of interpreting A1: PUSH 0xA in state δ0. Per the

semantics of the instruction, there is only one successor instruction, A2, and the next stateδ1 = {esp : (, {A1}), A1 : (0xA, {⊥})} · δ0. The interpretation modifies bindings for theregister esp and the abstract location represented by A1. Both are bound to a pair of val-ues from the domain VS × ℘(�asm[A]). The ℘(�asm[A]) component represents the edges ofLakhotia and Kumar’s ASG [30]. The VS component of esp is bound to representing anypossible value, but the ASG component is bound to {A1}, implying that upon terminationof the instruction the top of stack will be the abstract location A1. Further, the values boundto the abstract location A1 complete the semantics of the PUSH instruction. Its VS com-ponent is 0xA, the parameter of the PUSH instruction and its ASG component is {⊥}. TheASG component represents the set of top of stacks in that state. The initial stack state isrepresented as ⊥.

Interpretation of the subsequent PUSH instructions has the same logic as that in row 1.The interpretation of A3: CALL A9 instruction in row 3 is analogous to the interpretation ofPUSH instructions, but with two differences. The value A4, the address of the instructionfollowing the CALL instruction is pushed on the stack as the VS component, and the set ofsuccessor instructions consists of A9, the target of the call.

Before continuing to the next instruction, it is instructive to study the layout of the stackat this point, as given by the ASG component of the following chain: esp:(, {A3}), A3:(A4,{A2}), A2:(0xB, {A1}), A1:(0xA, {⊥}). The abstract location A3 is the top of the stack, whoseASG component points to A2, which points to A1, which points to the bottom of the stack ⊥.

The concrete semantics of the instruction A9: MOV eax, [esp+4] is as follows. It com-putes the expression esp+4, uses the computed value as the address of a memory location,and copies the value at that memory location to the register eax. However, in the context ofthe convention that the stack grows toward lower memory addresses, the location esp+4 ison the top of the stack. In the abstract stack, with each unit modeling 4 bytes, the expression[esp+4] in state δ4 evaluates to the value of the abstract location A2. Thus, as shown in row4 of Table 4, the interpretation of the instruction at address A9 binds the values (0xB, {A1})to the register eax.

Page 27: Context-sensitive analysis without calling-context

Higher-Order Symb Comput (2010) 23:275–313 301

Table 4 Context-insensitive analysis of program in Table 1(a)

# �#� Succ# δ#

1. A1: PUSH 0xA {A2} {esp:(, {A1}), A1:(0xA, {⊥})}·δ0

2. A2: PUSH 0xB {A3} {esp:(, {A2}), A2:(0xB, {A1})} ·δ1

3. A3: CALL A9 {A9} {esp:(, {A3}), A3:(A4, {A2})} ·δ2

4. A9: MOV eax, [esp+4] {A10} {eax:(0xB, {A1})}·δ3

5. A10: MOV ebx, [esp+8] {A11} {ebx:(0xA, {⊥})}·δ4

6. A11: ADD eax, ebx {A12} {eax:(0x15, )}·δ5

7. A12: RET 8 {A4} {esp:(, {⊥})}·δ6

8. A4: PUSH 0xC {A5} {esp:(, {A4}), A4:(0xC, {⊥})}·δ7

9. A5: PUSH 0xD {A6} {esp:(, {A5}), A5:(0xD, {A4})} ·δ8

10. A6: CALL A9 {A9} {esp:(, {A6}), A6:(A7, {A5})} ·δ9

11. widen(δ3, δ10) {esp:(, {A3, A6} )}·δ10

12. A9: MOV eax, [esp+4] {A10} {eax:(2[0xB, 0xD], {A1,A4})} ·δ11

13. A10: MOV ebx, [esp+8] {A11} {ebx:(2[0xA, 0xC], {⊥})} ·δ12

14. A11: ADD eax, ebx {A12} {eax:(2[0x15, 0x19], )} ·δ13

15. A12: RET 8 {A4, A7} {esp:(, {⊥})}·δ14

16. A4: PUSH 0xC {A5} {esp:(, {A4})}·δ15

17. A5: PUSH 0xD {A6} {esp:(, {A5})}·δ16

18. A6: CALL A9 {A9} {esp:(, {A6})}·δ17

19. widen(δ11, δ18) {eax:(2[0x15, ], ),

ebx:(2[0xA, ], {⊥}),esp:(, {A3, A6})}·δ18

20. A9: MOV eax, [esp+4] {A10} {eax:(2[0xB, 0xD], )} ·δ19

21. A10: MOV ebx, [esp+8] {A11} {ebx:(2[0xA, 0xC], {⊥})} ·δ20

22. A11: ADD eax, ebx {A12} {eax:(2[0x15, 0x19], )} ·δ21

23. A12: RET 8 – –

24. A7: NOP {A8} δ15

25. A8: RET φ {esp:(⊥, {})}·δ23

The next important step is the interpretation of A12: RET 8 in row 7. The concrete seman-tics of the instruction is to remove eight bytes from the top of the stack, and then pop fourmore bytes, before transfering control to the corresponding address. In the abstract domain,this is translated to traversing the stack graph to change the esp and to find the addresses ofthe successor instructions. When interpreted in state δ6, it results in the successor being setas {A4} and esp bound to (, {}).

Interpretation in rows 8, 9, and 10 follow a similar theme. However, in row 12, instruc-tion A9 is interpreted. It was interpreted earlier in row 4 with state δ3. Since state δ3 doesnot over-approximate state δ11, the new state is interpreted again. However, in the CFG con-structed there is a cycle, albeit over an interprocedurally invalid path. Hence, the new stateδ11 is first widened, which results in widening the ASG component of esp to include A3 inaddition to A6.

Page 28: Context-sensitive analysis without calling-context

302 Higher-Order Symb Comput (2010) 23:275–313

The widening has an impact on the subsequent interpretation of the instructions A9 andA10, in rows 12 and 13. Unlike during their interpretations in rows 4 and 5, esp points totwo possible tops of stacks: A3 and A6. Thus, the evaluation of [esp+4] yields two valuesA2:(0xB, {A1}) and A5:(0xD, {A4}), whose join yields (2[0xB, 0xD], {A1, A4}). Simi-larly, the evaluation of [esp+4] produces (2[0xA, 0xC], {⊥}).

When the instruction A12 is re-interpreted on row 15, it yields two return addresses A4and A7. This property is, in essence, what makes the analysis context-insensitive. Interpre-tation of instruction A4 in row 16 follows the familiar course through the interpretation ofinstruction A6, which then brings us back to instruction A9.

Once again, the new state (δ18) used for interpreting A9 is larger than the previous state(δ11) in which it was interpreted. The significant difference is in the bindings of eax and ebx,as shown below.

(δ11 eax) = 0x15 � (δ18 eax) = 2[0x15, 0x19](δ11 ebx) = 0xA � (δ18 ebx) = 2[0xA, 0xC]

Row 19 presents the result of widening δ11. Interpretation at rows 20, 21, and 22 is similar asbefore. At row 23, the instruction A12 is selected for interpretation in state δ22. Since state δ22

is not larger than state δ14, the state in which the instruction A12 was previously interpreted(row 15), its interpretation is terminated. The rest of the interpretation is straightforward.

7.3 Context-sensitive analysis

We now present context-sensitive analysis of Program B; the example using call obfuscation.The example thus demonstrates how stack-based context may be used to perform context-sensitive analysis.

The crux of the analysis is the use of the ASG component for computing expressionsinvolving esp, as was discussed in the previous example. Thus, what remains to be discussedis the maintenance and update of contexts.

The following sets are pertinent for the analysis of Program B.

Instructions: I [B] = {B1. . . B16}Opening contexts: �asm[B] = {B1/4,B2/4,B3/4,B4/4,B6/4,B7/4,B8/4,B9/4,⊥}Closing contexts: �asm[B] = {B5/4,B10/4,B12/4,B16/12}

Table 5 presents the trace of analyzing Program B. Its column has the same meaning asthose of Table 4. Column C# gives the set of next contexts resulting from interpreting theinstruction in row #. Since in this example interpretation always yields a single context, asingleton value is presented.

The analysis starts at instruction B1 in context ⊥ and state δ0. As shown in row 1 of thetable, the interpretation of B1: PUSH 0xA creates a new context B1.⊥, which is mapped tothe state δ1 created by updating δ0 with the effects of the PUSH instruction.

The subsequent two PUSH instructions increase context to B3.B2.B1.⊥, until we arriveat the RET instruction in row 5. Using the ASG component of the top of the stack, theinterpretation of this instruction computes its successor to be B13. Interpretation proceedsthere, but only after removing the top context B3.

The interpretation of the MOV and ADD instructions in rows 6–8 does not change thecontext. Then in row 9 the interpretation of instruction B16: RET 8 removes three contextsand propagates the new state δ9 in the context ⊥ to the instruction at address B6.

Page 29: Context-sensitive analysis without calling-context

Higher-Order Symb Comput (2010) 23:275–313 303

Table 5 Trace of stack based context-sensitive analysis of program in Table 1(b)

# �#� Succ# C# δ#

1. B1: PUSH 0xA {B2} B1.⊥ {esp:(, {B1}),

B1:(0xA, {⊥})}·δ0

2. B2: PUSH 0xB {B3} B2.B1.⊥ {esp:(, {B2}),

B2:(0xB, {B1})}·δ1

3. B3: PUSH B6 {B4} B3.B2.B1.⊥ {esp:(, {B3}),

B3:(B6, {B2})}·δ2

4. B4: PUSH B13 {B5} B4.B3.B2.B1.⊥ {esp:(, {B4}),

B4:(B13, {B3})}·δ3

5. B5: RET {B13} B3.B2.B1.⊥ {esp:(, {B3})}·δ4

6. B13: MOV eax, [esp+4] {B14} B3.B2.B1.⊥ {eax:(0xB, {B1})}·δ5

7. B14: MOV ebx, [esp+8] {B15} B3.B2.B1.⊥ {ebx:(0xA, {⊥})}·δ6

8. B15: ADD eax, ebx {B16} B3.B2.B1.⊥ {eax:(0x15, )}·δ7

9. B16: RET 8 {B6} ⊥ {esp:(, {⊥})}·δ8

10. B6: PUSH 0xC {B7} B6.⊥ {esp:(, {B6}),

B6:(0xC, {⊥})}·δ9

11. B7: PUSH 0xD {B8} B7.B6.⊥ {esp:(, {B7}),

B7:(0xD, {B6})}·δ10

12. B8: PUSH B11 {B9} B8.B7.B6.⊥ {esp:(, {B8}),

B8:(B11, {B7})}·δ11

13. B9: PUSH B13 {B10} B9.B8.B7.B6.⊥ {esp:(, {B9}),

B9:(B13, {B8})}·δ12

14. B10: RET {B13} B8.B7.B6.⊥ {esp:(, {B8})}·δ13

15. B13: MOV eax, [esp+4] {B14} B8.B7.B6.⊥ {eax:(0xD, {B6})}·δ14

16. B14: MOV ebx, [esp+8] {B15} B8.B7.B6.⊥ {ebx:(0xC, {⊥})}·δ15

17. B15 ADD eax, ebx {B16} B8.B7.B6.⊥ {eax:(0x19, )}·δ16

18. B16: RET 8 {B11} ⊥ {esp:(, {⊥})}·δ17

19. B11: NOP {B12} ⊥ δ18

20. B12: RET φ ⊥ δ19

In row 15, control arrives back to the MOV instruction at address B13, but this time withcontext B8, B7.B6.⊥. Since the context is not the same as that from the previous visit, thecorresponding states are not compared. Hence, there is no need to widen the state. Finally,in row 18, the interpretation of the RET instruction yields only one successor, B11.

Context-sensitive analysis of Program A produces results similar to that of context-sensitive analysis of Program B. The analysis yields similar values at the correspondingstatements, with appropriate mappings for context strings.

8 Empirical evaluation

We now present the results of an empirical evaluation of context-sensitive analysis of obfus-cated programs. We study the improvements in analysis of obfuscated code resulting from

Page 30: Context-sensitive analysis without calling-context

304 Higher-Order Symb Comput (2010) 23:275–313

the use of (1) our context-sensitive version of Venable et al.’s analysis using stack context(Asen-stack) against (2) the original context-insensitive (Ainsen) version of Venable et al.’sanalysis [51], and (3) its context-sensitive version using Sharir and Pnueli’s calling con-text (Asen-call). The three versions were implemented on the Java Eclipse workbench as anEclipse plugin, and the evaluations performed on an Intel Core 2 Duo 2 Ghz/3 GB DellWorkstation.

While our system implements both, the �- and the k-context, to perform a fair compar-ative study �-context was chosen. Use of k-context requires choosing a parameter k. Sincecontext-strings based on stack-context are expected to be longer, using the same k for bothmay not yield a fair comparison. And if different k’s are to be selected, we would need toestablish a mechanism to choose the right k. Since �-context does not require selection ofany parameter, it was our preferred choice.

In the absence of any accepted gold standard or benchmark for evaluating obfuscatedprograms, we crafted our own procedure. We performed the analysis using two sets of pro-grams: (1) a set of hand-crafted programs and (2) W32.Evol.a, a metamorphic virus. Byusing hand-crafted programs we were able to perform a controlled study of the performanceof each analysis using programs with a specific call-structure and specific obfuscations. Onthe other hand, performance of the analyses on the W32.Evol.a virus provided insights intotheir application on real-world programs.

To enable quantitative comparison of various analyses we use several metrics. We quan-tify performance using two metrics: Time, measured as the CPU time (in milliseconds), andnumber of interpreted instructions (#II), measured by counting the number of instructionsinterpreted in order to complete the whole analysis. To gain an insight into the quality ofcontext-sensitive analyses, we use three context metrics: the total number of context stringscreated (#CS), the maximum number of context strings reaching any node (#MC) and themaximum length of all observed context strings (MaxL). Due to limitation of our experimen-tal apparatus, the Time metric measures the elapsed clock time and may include overheaddue to other processes on a machine. Hence, the Time measurement should be viewed alongwith #II to gain an insight into the computational costs.

8.1 Comparing call and stack context analysis

Our first experiment performs a comparative study of the three types of analyses: Ainsen,Asen-stack , and Asen-call . The study is performed using one set of unobfuscated programsand three sets of obfuscated programs. Each set contains ten programs, where each programconsists of a procedure main and a procedure sum. The procedure sum takes two integerparameters and returns their sum. The programs differ in the number of calls to sum frommain. The ith program in each set has i calls to the procedure sum. The number of calls(#C) made in a procedure serves as the control variable in this study. The two programs inTable 1 are instances of such programs, with i = 2.

The programs in the unobfuscated set use CALL and RET instructions as intended, thatis, to make a procedure call and to return from a called procedure, respectively. Of the threesets of obfuscated programs, two use call obfuscations and one uses a ret obfuscation. Thetwo types of call obfuscations consist of replacing the CALL instruction of the unobfuscatedprogram (a) by a combination of two PUSH instructions and a RET instruction and (b) bya combination of a PUSH instruction and a JMP instruction. The ret obfuscation consistsof replacing the RET instruction of the unobfuscated program by a combination of a POPinstruction and a JMP instruction. The program in Table 1(b) uses the first type of callobfuscation.

Page 31: Context-sensitive analysis without calling-context

Higher-Order Symb Comput (2010) 23:275–313 305

Fig. 7 Performance measures for analyzing non-obfuscated programs

Table 6 Quality measures for analyzing non-obfuscated programs

Asen-call Asen-stack

#C #CS #MC MaxL #CS #MC MaxL

10 12 10 1 33 10 3

20 22 20 1 63 20 3

30 32 30 1 93 30 3

40 42 40 1 123 40 3

50 52 50 1 153 50 3

60 62 60 1 183 60 3

70 72 70 1 213 70 3

80 82 80 1 243 80 3

90 92 90 1 273 90 3

100 102 100 1 303 100 3

The comparative data for the three analyses for each of the four data sets is presentedin Figs. 7 through 10 and Tables 6 through 9. The results for each data set are presentedin two figures and one table. For the three analyses, the first figure plots the time requiredfor analyzing each of the ten programs and the second plots the number of interpreted in-structions. The table presents the measurements for the three context measures for the twocontext-sensitive analyses.

Results from analyzing the non-obfuscated programs provide an insight into howAsen-stack would fair if it was used in place of Asen-call for programs where the latter per-forms correctly. Figure 7(a) plots the time taken for analyzing the ten non-obfuscated pro-grams and Fig. 7(b) plots the number of interpreted instructions. The results show that thecomputational costs of both of the context-sensitive analyses grow linearly with the numberof contexts. In contrast, the cost of context-insensitive analysis grows quadratically match-ing the growth in the number of contex-insensitive paths. Table 6 shows that, as expected,the maximum number of context strings reaching any node (#MC) for the programs are thesame in both context-sensitive analyses, since, in the absence of recursion the �-context ab-straction is equivalent to the identity function. The maximum context string length (MaxL)and the total number of context strings (#CS) of Asen-stack is larger than that of Asen-call .

Page 32: Context-sensitive analysis without calling-context

306 Higher-Order Symb Comput (2010) 23:275–313

Fig. 8 Performance measures for analyzing programs using push/ret call-obfuscation

Table 7 Quality measures for analyzing programs using push/ret call-obfuscation

Asen-call Asen-stack

#C #CS #MC MaxL #CS #MC MaxL

10 2 1 1 43 10 4

20 2 1 1 83 20 4

30 2 1 1 123 30 4

40 2 1 1 163 40 4

50 2 1 1 203 50 4

60 2 1 1 243 60 4

70 2 1 1 283 70 4

80 2 1 1 323 80 4

90 2 1 1 363 90 4

100 2 1 1 403 100 4

That is because each call to sum is preceded by two PUSH instructions that push its pa-rameters onto the stack, and, for Asen-stack , each PUSH instruction creates a new contextstring. The data shows that even though Asen-stack creates more and longer context strings,its performance is comparable with Asen-call because the number of context strings reachinga call site remains unchanged.

The data from analyzing the three sets of obfuscated programs are primarily useful forassessing the impact of each obfuscation on the performance and quality of Asen-stack . Fig-ures 8(a), 9(a), and 10 show that while there is some variation in the time required foranalysis of the three obfuscations, the variation is unremarkable and may be attributed tomeasurement errors. This is further supported by the data in Figs. 8(b), 9(b), and 10(b)which show that the number of instructions interpreted for stack-sensitive-analysis does notchange significantly with the type of obfuscations applied. The most important measure is#MC, i.e., the maximum number of context strings reaching a node, as presented in Tables 7,8, and 9. This measure becomes significant in obfuscated programs since there are no pro-cedure boundaries, call sites, or return statements. It is instructive to compare the #MCs ofnon-obfuscated programs in Table 6 with the #MC of the corresponding obfuscated pro-grams in Tables 7, 8, and 9. The data shows that the #MC for Asen-stack for corresponding

Page 33: Context-sensitive analysis without calling-context

Higher-Order Symb Comput (2010) 23:275–313 307

Fig. 9 Performance measures for analyzing programs using push/jmp call-obfuscation

Table 8 Quality measures for analyzing programs using push/jmp call-obfuscation

Asen-call Asen-stack

#C #CS #MC MaxL #CS #MC MaxL

10 2 1 1 33 10 3

20 2 1 1 63 20 3

30 2 1 1 93 30 3

40 2 1 1 123 40 3

50 2 1 1 153 50 3

60 2 1 1 183 60 3

70 2 1 1 213 70 3

80 2 1 1 243 80 3

90 2 1 1 273 90 3

100 2 1 1 303 100 3

programs remains the same, indicating that the maximum number of contexts reaching astatement remains the same even when a program is obfuscated. The other measures pre-sented in Tables 7, 8, and 9 are also indicators of the quality of an analysis. The total numberof context strings, #CS, varies across the programs; a change that can be attributed to thedifferent amount of instructions that affect the stack. The difference in the maximum lengthof a context-string, MaxL, can also similarly be attributed to the PUSH, POP, and RETinstructions in an obfuscation.

For the Asen-call analysis the data, and in particular the context measures, highlight the er-ror in the analysis. For the two sets of programs with CALL instruction obfuscations, Tables 7and 8 show that every instruction of each program is considered to be in only one context(since #MC is 1 for all programs). This is because these programs do not have any CALLinstructions, the instruction necessary to create a calling context. For the set of programsgenerated by the RET obfuscation, the #MC values match those for the stack-sensitive anal-ysis. However, the values of MaxL indicate that the maximum length of contexts increaseswith the number of calls. Since the programs do not have recursion this can only happen ifthe calls are treated as nested, which is a violation of the semantics. This error follows fromthe fact that a calling context is never closed because these programs do not have any RET

Page 34: Context-sensitive analysis without calling-context

308 Higher-Order Symb Comput (2010) 23:275–313

Fig. 10 Performance measures for analyzing programs using pop/jmp ret-obfuscation

Table 9 Quality measures for analyzing programs using pop/jmp ret-obfuscation

Asen-call Asen-stack

#C #CS #MC MaxL #CS #MC MaxL

10 12 10 11 33 10 3

20 22 20 21 63 20 3

30 32 30 31 93 30 3

40 42 40 41 123 40 3

50 52 50 51 153 50 3

60 62 60 61 183 60 3

70 72 70 71 213 70 3

80 82 80 81 243 80 3

90 92 90 91 273 90 3

100 102 100 101 303 100 3

instruction. Hence, when using Asen-call once a context is put on the stack, it is never againremoved.

8.2 Evaluation with metamorphic virus

The W32.Evol.a virus has been thoroughly analyzed in our lab, and hence we are in a po-sition to evaluate the results of our analysis [30]. To quantify the improvement resultingfrom its analysis using Asen-stack over Ainsen, we compute the difference in the size of thevalue sets resulting from the two analyses for each instruction. The size of the value setat an instruction i for Ainsen is denoted by Sinsen(i), and that for Asen-stack is denoted bySsen-stack(i). Since the sizes resulting from context-insensitive analysis are always higher,we compute the difference as Sinsen(i) − Ssen-stack(i), for each instruction i.

Table 10 presents time and size measures for analyzing W32.Evol.a. Besides the mea-sures for Ainsen and Asen-stack , it also presents the measure for Asen-call for reference.Like our handcrafted programs, W32.Evol.a virus is also not recursive, hence the context-sensitive analysis using �-context abstraction separates each calling context, without any ab-straction. The table shows that Asen-stack yielded 21 contexts, with maximum length of 19.In contrast, Asen-call yielded only 7 contexts, with maximum length of 3.

Page 35: Context-sensitive analysis without calling-context

Higher-Order Symb Comput (2010) 23:275–313 309

Table 10 Measures foranalyzing Win32.Evol.a virus Ainsen Asen-call Asen-stack

Time 157 120 110

# II 392 380 348

#CS 0 7 21

#MC 0 2 2

MaxL 0 3 19

Fig. 11 Histogram ofapproximations for Win32.Evol.a

Figure 11 presents a histogram that shows the number of instructions where the context-insensitive analysis gives larger sets (for various intervals of differences). The data showsan improvement in precision in 25 of 98 interpreted instructions of the W32.Evol.a virus.Thus, Asen-stack is more efficient and more precise than Ainsen.

8.3 Discussion

Our empirical evaluation shows that, as expected, for unobfuscated programs both thecontext-sensitive analyses produce more precise results than their context-insensitive coun-terpart. Quite unexpectedly, our evaluation also shows that the cost of the context-insensitiveanalysis grows quadratically as the number of calls to the same procedure increases, whereasthe costs of context-sensitive analyses grow linearly. The data shows that, while the numberof contexts created by use of �-stack-context in a context-sensitive analysis is larger thanthose created by the use of �-calling-context, the number of contexts reaching an instruc-tions remains unchanged. This implies that the use of �-stack-context in a context-sensitiveanalysis does not introduce any more overhead than does the use of �-calling-context.

Our empirical evaluation also shows that, in the presence of call/ret obfuscations, the useof stack-context aids in correctly separating contexts without incurring any extra cost. Asexpected, the use of calling-context produces incorrect analysis when applied on obfuscatedprograms.

Our case study on a real-world program that uses a call obfuscation shows that an �-stack-context based analysis produces more precise results than its context-insensitive counterpartand is also more efficient.

9 Limitations and future works

As stated earlier, obfuscations are crafted by violating the assumptions of an analyzer. Forinstance, call obfuscations defeat classical interprocedural analysis by violating the assump-

Page 36: Context-sensitive analysis without calling-context

310 Higher-Order Symb Comput (2010) 23:275–313

tion that the underlying program uses a standard compilation model. Thus, the assumptionsof an analysis also define its limitations.

The stack-context analysis, as presented, is based on the following assumptions.

– All the instructions of the program can be statically identified.– A program has at most one run-time stack.– The stack pointer is represented by a unique, statically identifiable symbol (whether a

register or a memory location).– The direction in which the stack grows is known a priori.– The instructions that may modify the stack pointer can be statically identified.

While these set of assumptions are weaker than those implicit in the use of a standard com-pilation model, it goes without saying that programs can be crafted for which our analysiswill not produce useful results.

In the von Neumann architecture differentiating data and instruction is an undecidableproblem, hence it is easy to craft programs in which instructions cannot be statically identi-fied. Self-modifying programs clearly fall in this category, as do other programs that violatethe assumptions on any underlying disassembler. Dalla Preda et al.’s phase-semantics [19]may be utilized to develop an analysis for self-modifying programs.

While it is common practice to limit a program to have one stack, the number of stacksis not limited by any theoretical or practical bounds. One can craft a program where eachprocedure maintains its own stack and the corresponding stack pointer. This may not be anefficient structure, however, it will defeat our analysis. Even for the same stack the stackpointer may be aliased, especially if it is in memory. The stack pointer may even be selecteddynamically. The direction of growth of the stack may also be determined at run-time, andif using multiple stacks they need not grow in the same direction. Further, when a stackpointer is maintained in memory, it may not always be possible to identify the instructionsthat modify the stack pointer.

Methods to dynamically discover stack-like data structures may aid in analyzing pro-grams without knowing the stack or its implementation details a priori. Balakrishnan andReps’s value-set analysis is aimed at discovering the variables and their types in a binaryprogram [3]. Their work may provide a direction for discovering stack, or other richer struc-tures.

It is our belief that, notwithstanding the undecidable nature of the problem, the chal-lenges in analyzing obfuscated programs stems from the pipeline architecture of most ana-lyzers [32]. Most analyzers are constructed as a sequence of stages, where the abstractionscreated by one stage flow into the next. For instance, one common architecture is a data-flowpipe consisting of stages to disassemble, construct CFG, construct call graph, and performanalysis. In the pipeline architecture the abstractions created in an early stage impact allsubsequent stages. The soundness or completeness of the later stages is typically provenassuming completeness of the earlier stages [17]. Such analysis are easily defeated by ob-fuscation because an early stage, such as the disassembly, is usually incomplete, and quiteoften unsound for the general class of programs. This is so because the abstractions at theearlier stages are created using a very minimal semantics.

A different architecture for composing various stages may make some analyses moreresistent to obfuscations. For instance, a blackboard architecture [23], commonly used inrobotics, may enable information to flow from any stage to any stage. Thus, the disassemblermay utilize the results of the CFG construction to improve the disassembly. Developing suchan architecture may also require creating new theoretical frameworks, since the existingframeworks are designed to reason about the pipeline structures.

Page 37: Context-sensitive analysis without calling-context

Higher-Order Symb Comput (2010) 23:275–313 311

10 Conclusions

We have presented a method for performing context-sensitive analysis of binaries in whichcalling-contexts cannot be discerned. Such binaries are often crafted to break existing meth-ods of analysis. For instance, the IDA Pro disassembler identifies the procedures in a binaryby analyzing its call instructions. Any analysis based on such a disassembler would failif the binary does not use the call instruction to make a procedure call. Obfuscations thatdefeat analyzers are commonly used by authors of malware. They are also used to protectintellectual property.

Our method of context-sensitive analysis does not rely on finding procedure boundariesor determining procedure calls. Instead, it defines a context based on the state of the stack.Thus, any operation that pushes data on the stack creates a context. Conversely, any op-eration that removes data from the stack removes a context. The problem of determiningtransfer of control, which is also an important problem for obfuscated binaries, is solvedseparately by using Balakrishnan and Reps’ value-sets to model values in memory and reg-isters [3].

We adapt prior work on context-sensitive analysis based on call-strings to work withthe stack-context. The notion of call-strings has been described in terms of valid paths ofan ICFG [46]. We generalize the concept using abstract interpretation and define contextsusing trace semantics. We implement a context-sensitive variant of Venable et al.’s analysisthat combines the value-sets and the abstract stack graph domains [51]. Empirical evaluationshows that context-sensitive analysis using �-context leads to several orders of magnitudeimprovement in the running time and improvement in the precision.

While the method improves upon the state-of-the-art by enabling context-sensitive anal-ysis of obfuscated programs, its performance on non-obfuscated programs is equally impor-tant. Treating any push onto the stack as the creation of a new context increases the numberof contexts and the length of the contexts. However, this does not have an impact on theperformance of the algorithm since it does not increase the number of contexts for eachstatement.

Acknowledgements This work has benefited significantly from the comments of the anonymous referees,and from the comments of Uday Khedkar. This research was supported in part by funds from the LouisianaGovernor’s Information Technology Initiative, Air Force Office of Scientific Research grant FA9550-09-1-0715, Air Force Research Lab and DARPA (FA8750-10-C-0171), and Coordination for the Improvement ofHigher Education grant 1137/07-7.

References

1. Amme, W., Braun, P., Zehendner, E., Thomasset, F.: Data dependence analysis of assembly code. In: Int.J. Parallel Proc. (2000)

2. Backes, W.: Programmanalyse des XRTL Zwischencodes (in German.). Ph.D. thesis, Universität desSaarlandes (2004)

3. Balakrishnan, G., Reps, T.: WYSINWYX: What you see is not what you eXecute. ACM Trans. Program.Lang. Syst. 32, 23:1–23:84 (2010)

4. Bergeron, J., Debbabi, M., Desharnais, J., Erhioui, M., Lavoie, Y., Tawbi, N.: Static detection of mali-cious code in executable programs. In: Proceedings of the International Symposium on RequirementsEngineering for Information Security SREIS’01, pp. 1–8 (2001)

5. Bergeron, J., Debbabi, M., Erhioui, M., Ktari, B.: Static analysis of binary code to isolate maliciousbehaviors. In: Proceedings of the IEEE 8th International Workshops on Enabling Technologies: Infras-tructure for Collaborative Enterprises, pp. 184–189 (1999)

6. Christodorescu, M., Jha S.: Static analysis of executables to detect malicious patterns. In: Proc. of the12th USENIX Security Symposium (2003)

Page 38: Context-sensitive analysis without calling-context

312 Higher-Order Symb Comput (2010) 23:275–313

7. Cifuentes, C., Fraboulet, A.: Interprocedural data flow recovery of high-level language code from assem-bly. Technical Report 421, Univ. Queensland (1997)

8. Cifuentes, C., Fraboulet, A.: Intraprocedural static slicing of binary executables. In: Proc. Int. Conf. onSoftware Maintenance (ICSM), pp. 188–195 (1997)

9. Cifuentes, C., Simon, D., Fraboulet, A.: Assembly to high-level language translation. In: Proc. Int. Conf.on Software Maintenance (ICSM), pp. 228–237 (1998)

10. Collberg, C., Nagra, J.: Surreptitious Software: Obfuscation, Watermarking, and Tamperproofing forSoftware Protection. Addison Wesley, Reading (2010)

11. Collberg, C.S., Thomborson, C.: Watermarking, tamper-proofing, and obfuscation tools for softwareprotection. IEEE Trans. Softw. Eng. 28(8), 735–746 (2002)

12. Coogan, K., Debray, S., Kaochar, T., Townsend, G.: Automatic static unpacking of malware binaries. In:2009, WCRE’09, 16th Working Conference on Reverse Engineering, pp. 167–176 (2009)

13. Cousot, P., Cousot, R.: Abstract interpretation: A unified lattice model for static analysis of programs byconstruction or approximation of fixed points. In: Proc. Principles of Programming Languages (POPL),Los Angeles, CA, USA, pp. 238–252 (1977)

14. Cousot, P., Cousot, R.: Systematic design of program analysis frameworks by abstract interpretation. In:Proc. Principles of Programming Languages (POPL), San Antonio, TX, USA, pp. 269–282 (1979)

15. Cousot, P., Cousot, R.: Basic concepts of abstract interpretation. In: IFIP World Computer Congress,Toulouse France, pp. 359–366 (2004)

16. Dalla Preda, M.: Code obfuscation and malware detection by abstract interpretation. Ph.D. thesis, Uni-versità di Verona (2007)

17. DallaPreda, M., Christodorescu, M., Jha S., Debray S.: A semantics-based approach to malware detec-tion. In: Proc. Principles of Programming Languages (POPL), New York, NY, USA, pp. 377–388 (2007)

18. DallaPreda, M., Giacobazzi, R.: Semantics-based code obfuscation by abstract interpretation. J. Comput.Secur. 17(6), 855–908 (2009)

19. DallaPreda, M., Giacobazzi, R., Debray, S., Coogan, K., Townsend, G.M.: Modelling metamorphism byabstract interpretation. In: Proceedings of the 17th International Conference on Static Analysis, Berlin,Heidelberg, pp. 218–235 (2010)

20. DallaPreda, M., Madou, M., De Bosschere, K., Giacobazzi, R.: Opaque predicates detection by abstractinterpretation. Algebraic Methodology and Software Technology, pp. 81–95 (2006)

21. Debray, S.K., Muth, R., Weippert, M.: Alias analysis of executable code. In: Proc. Principles of Pro-gramming Languages (POPL), pp. 12–24 (1998)

22. Emami, M., Ghiya, R., Hendren, L.J.: Context-sensitive interprocedural points-to analysis in the presenceof function pointers. SIGPLAN Not. 29(6), 242–256 (1994)

23. Engelmore, R., Morgan, T.: Blackboard Systems, vol. 141. Addison-Wesley, Reading (1988)24. Giacobazzi, R.: Hiding information in completeness holes: new perspectives in code obfuscation and wa-

termarking. In: 2008 Sixth IEEE International Conference on Software Engineering and Formal Meth-ods, pp. 7–18 (2008)

25. Goodwin, D.W.: Interprocedural dataflow analysis in an executable optimizer. In: Conf. on Prog. Lang.Design and Implementation (PLDI), New York, NY, USA, pp. 122–133 (1997)

26. Guo, B., Bridges, M.J., Triantafyllis, S., Ottoni, G., Raman, E., August, D.: Practical and accurate low-level pointer analysis. In: 3nd Int. Symp. on Code Gen. and Opt. (2005)

27. IdaPro: Ida Pro—Disassembler. http://www.hex-rays.com/idapro/ (Last accessed October 2010)28. Khedkar, U.P., Sanyal, A., Karkare, B.: Data Flow Analysis: Theory and Practice. CRC Press, Boca

Raton (2009)29. Kinder, J., Veith, H., Zuleger, F.: An abstract interpretation-based framework for control flow recon-

struction from binaries. In: 10th International Conference on Verification, Model Checking, and AbstractInterpretation (VMCAI 2009), pp. 214–228 (2009)

30. Lakhotia, A., Kumar, E.U.: Abstracting stack to detect obfuscated calls in binaries. In: 4th IEEE Int.Workshop on Source Code Analysis and Manipulation (SCAM), Chicago, Illinois (2004)

31. Lakhotia, A., Kumar, E.U., Venable, M.: A method for detecting obfuscated calls in malicious binaries.IEEE Trans. Softw. Eng. 31(11), 955–968 (2005)

32. Lakhotia, A., Singh, P.K.: Challenges in getting ‘formal’ with viruses. Virus Bull., pp. 15–19 (2003)33. Lal, A., Reps, T.: Improving pushdown system model checking. In: Proc. Computer-Aided Verification

(2006)34. Larus, J.R., Schnarr, E.: EEL: Machine-independent executable editing. In: Conf. on Prog. Lang. Design

and Implementation (PLDI), pp. 291–300 (1995)35. Lhoták, O., Hendren, L.: Context-sensitive points-to analysis: Is it worth it. In: Compiler Construction,

15th International Conference. LNCS, vol. 3923, pp. 47–64 (2006)36. Linn, C., Debray, S.: Obfuscation of executable code to improve resistance to static disassembly. In: 10th

ACM Conf. on Computer and Communications Security (CCS) (2003)

Page 39: Context-sensitive analysis without calling-context

Higher-Order Symb Comput (2010) 23:275–313 313

37. McDonald, J.: Delphi falls prey. http://www.symantec.com/connect/blogs/delphi-falls-prey (Last ac-cessed October 2010)

38. Might, M., Shivers, O.: Environment analysis via ΔCFA. SIGPLAN Not. 41(1), 127–140 (2006)39. Mycroft, A.: Type-based decompilation. In: European Symp. on Programming (ESOP) (1999)40. Reps, T., Balakrishnan, G.: Improved memory-access analysis for x86 executables. In: Proc. Int. Conf.

on Compiler Construction (2008)41. Reps, T., Balakrishnan, G., Lim, J.: Intermediate-representation recovery from low-level code. In: Proc.

Workshop on Partial Evaluation and Program Manipulation (PEPM), Charleston, SC (2006)42. Reps, T., Lim, J., Thakur, A., Balakrishnan, G., Lal, A.: There’s plenty of room at the bottom: analyzing

and verifying machine code. In: Computer Aided Verification, pp. 41–56 (2010)43. Reps, T., Schwoon, S., Jha, S., Melski, D.: Weighted pushdown systems and their application to inter-

procedural dataflow analysis. Sci. Comput. Program. 58(1–2), 206–263 (2005)44. Sagiv, M., Reps, T., Horwitz, S.: Precise interprocedural dataflow analysis with applications to constant

propagation. Theor. Comput. Sci. 167(1–2), 131–170 (1996)45. Schwarz, B., Debray, S.K., Andrews, G.R.: PLTO: A link-time optimizer for the Intel IA-32 architecture.

In: Proc. Workshop on Binary Translation (WBT) (2001)46. Sharir, M., Pnueli, A.: Two approaches to interprocedural data flow analysis. In: Muchnick, S.S., Jones,

N.D. (eds.) Program Flow Analysis: Theory and Applications, Chap. 7, pp. 189–234. Prentice-Hall,Englewood Cliffs (1981)

47. Srivastava, A., Wall, D.: A practical system for intermodule code optimization at linktime. J. Program.Lang. 1(I), 1–18 (1993)

48. Symantec, Inc., Understanding heuristics. http://www.symantec.com/avcenter/reference/heuristc.pdf(1997)

49. Thakur, A., Lim, J., Lal, A., Burton, A., Driscoll, E., Elder, M., Andersen, T., Reps, T.: Directed proofgeneration for machine code. In: Computer Aided Verification, pp. 288–305 (2010)

50. Thompson, K.: Reflections on trusting trust. Commun. ACM 27(8), 761–763 (1984)51. Venable, M., Chouchane, M.R., Karim, M.E., Lakhotia, A.: Analyzing memory accesses in obfuscated

x86 executables. In: DIMVA’05, pp. 1–18 (2005)52. Walenstein, A., Mathur, R., Chouchane, M.R., Lakhotia, A.: Normalizing metamorphic malware using

term-rewriting. In: 6th IEEE Int. Workshop on Source Code Analysis and Manipulation (SCAM) (2006)53. Walenstein, A., Mathur, R., Chouchane, M.R., Lakhotia, A.: The design space of metamorphic malware.

In: 2nd International Conference on i-Warfare and Security (ICIW) (2007)54. Whaley, J., Lam, M.S.: Cloning-based context-sensitive pointer alias analysis using binary decision di-

agrams. In: PLDI’04: Proceedings of the ACM SIGPLAN 2004 conference on Programming languagedesign and implementation, New York, NY, USA, pp. 131–144 (2004)

55. Zhu, J.: Towards scalable flow and context sensitive pointer analysis. In: DAC’05: Proceedings of the42nd Annual Design Automation Conference, New York, NY, USA, pp. 831–836 (2005)

56. Zhu, J., Calman, S.: Symbolic pointer analysis revisited. In: PLDI’04: Proceedings of the ACM SIG-PLAN 2004 Conference on Programming Language Design and Implementation, New York, NY, USA,pp. 145–157 (2004)