forked from uva-cs/pdr
-
Notifications
You must be signed in to change notification settings - Fork 0
/
x86-32bit-ccc.tex
281 lines (233 loc) · 14 KB
/
x86-32bit-ccc.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
\begin{quotation}
\ldots
\end{quotation}
\noindent {\em This chapter was derived from a document written by Adam Ferrari and later updated by Alan Batson, Mike Lack and Anita
Jones}
\section{What is a Calling Convention?}
At the end of the previous chapter, we saw a simple
example of a subroutine defined in x86 assembly language. In fact,
this subroutine was quite simple~-- it did not modify any registers
except EAX (which was needed to return the result), and it did not
call any other subroutines. In practice, such simple function
definitions are rarely useful. When more complex subroutines are
combined in a single program, a number of complicating issues arise.
For example, how are parameters passed to a subroutine? Can
subroutines overwrite the values in a register, or does the caller
expect the register contents to be preserved? Where should local
variables in a subroutine be stored? How should results be returned
from functions?
To allow separate programmers to share code and develop libraries for
use by many programs, and to simplify the use of subroutines in
general, programmers typically adopt a common {\em calling
convention}. The calling convention is simply a set of rules that
answers the above questions without ambiguity to simplify the
definition and use of subroutines. For example, given a set of calling
convention rules, a programmer need not examine the definition of a
subroutine to determine how parameters should be passed to that
subroutine. Furthermore, given a set of calling convention rules,
high-level language compilers can be made to follow the rules, thus
allowing hand-coded assembly language routines and high-level language
routines to call one another.
In practice, even for a single processor instruction set, many calling
conventions are possible. In this class we will examine and use one
of the most important conventions: the C language calling convention.
Understanding this convention will allow you to write assembly
language subroutines that are safely callable from C and C++ code, and
will also enable you to call C library functions from your assembly
language code.
\section{The C Calling Convention}
The C calling convention is based heavily on the use of the
hardware-supported stack. To understand the C calling convention, you
should first make sure that you fully understand the push, pop, call,
and ret instructions~-- these will be the basis for most of the rules.
In this calling convention, subroutine parameters are passed on the
stack. Registers are saved on the stack, and local variables used by
subroutines are placed in memory on the stack. In fact, this
stack-centric implementation of subroutines is not unique to the C
language or the x86 architecture. The vast majority of high-level
procedural languages implemented on most processors have used similar
calling convention.
The calling convention is broken into two sets of rules. The first set
of rules is employed by the caller of the subroutine, and the second
set of rules is observed by the writer of the subroutine (the
``callee''). It should be emphasized that mistakes in the observance
of these rules quickly result in fatal program errors; thus meticulous
care should be used when implementing the call convention in your own
subroutines.
\section{The Caller's Rules}
The caller should adhere to the following rules when invoking a
subroutine:
\begin{numlist}
\item Before calling a subroutine, the caller should save the contents
of certain registers that are designated caller-saved. The
caller-saved registers are EAX, ECX, EDX. If you want the contents
of these registers to be preserved across the subroutine call, push
them onto the stack.
\item To pass parameters to the subroutine, push them onto the stack
before the call. The parameters should be pushed in inverted order
(i.e. last parameter first)~-- since the stack grows down, the first
parameter will be stored at the lowest address (this inversion of
parameters was historically used to allow functions to be passed a
variable number of parameters).
\item To call the subroutine, use the call instruction. This
instruction places the return address on top of the parameters on
the stack, and branches to the subroutine code.
\item After the subroutine returns, (i.e. immediately following the
call instruction) the caller must remove the parameters from stack.
This restores the stack to its state before the call was performed.
\item The caller can expect to find the return value of the subroutine
in the register EAX.
\item The caller restores the contents of caller-saved registers (EAX,
ECX, EDX) by popping them off of the stack. The caller can assume
that no other registers were modified by the subroutine.
\end{numlist}
\section{The Callee's Rules}
The definition of the subroutine should adhere to the following rules:
\begin{numlist}
\item At the beginning of the subroutine, the function should push the
value of EBP onto the stack, and then copy the value of ESP into EBP
using the following instructions:
\begin{lstlisting}[backgroundcolor=\color{white},frame=trBL,linewidth=3.75in,xleftmargin=2.25in,label={x86-callee-code-1.lst},language={[x86masm]Assembler},caption={x86 callee code, part 1}]
push ebp
mov ebp, esp
\end{lstlisting}
The reason for this initial action is the maintenance of the base
pointer, EBP. The base pointer is used by convention as a point of
reference for finding parameters and local variables on the stack.
Essentially, when any subroutine is executing, the base pointer is a
``snapshot'' of the stack pointer value from when the subroutine
started executing. Parameters and local variables will always be
located at known, constant offsets away from the base pointer value.
We push the old base pointer value at the beginning of the subroutine
so that we can later restore the appropriate base pointer value for
the caller when the subroutine returns. Remember, the caller isn't
expecting the subroutine to change the value of the base pointer. We
then move the stack pointer into EBP to obtain our point of reference
for accessing parameters and local variables.
\item Next, allocate local variables by making space on the stack.
Recall, the stack grows down, so to make space on the top of the
stack, the stack pointer should be decremented. The amount by which
the stack pointer is decremented depends on the number of local
variables needed. For example, if 3 local integers (4 bytes each)
were required, the stack pointer would need to be decremented by 12
to make space for these local variables. I.e:
\begin{lstlisting}[backgroundcolor=\color{white},frame=trBL,linewidth=3.75in,xleftmargin=2.25in,label={x86-callee-code-2.lst},language={[x86masm]Assembler},caption={x86 callee code, part 2}]
sub esp, 12
\end{lstlisting}
As with parameters, local variables will be located at known offsets
from the base pointer.
\item Next, the values of any registers that are designated
callee-saved that will be used by the function must be saved. To
save registers, push them onto the stack. The callee-saved registers
are EBX, EDI and ESI (ESP and EBP will also be preserved by the call
convention, but need not be pushed on the stack during this step).
After these three actions are performed, the actual operation of the
subroutine may proceed. When the subroutine is ready to return, the
call convention rules continue:
\item When the function is done, the return value for the function
should be placed in EAX if it is not already there.
\item The function must restore the old values of any callee-saved
registers (EBX, EDI and ESI) that were modified. The register contents
are restored by popping them from the stack. Note, the registers
should be popped in the inverse order that they were pushed.
\item Next, we deallocate local variables. The obvious way to do this
might be to add the appropriate value to the stack pointer (since
the space was allocated by subtracting the needed amount from the
stack pointer). In practice, a less error-prone way to deallocate
the variables is to move the value in the base pointer into the
stack pointer, i.e.:
\begin{lstlisting}[backgroundcolor=\color{white},frame=trBL,linewidth=3.75in,xleftmargin=2.25in,label={x86-callee-code-3.lst},language={[x86masm]Assembler},caption={x86 callee code, part 3}]
mov esp, ebp
\end{lstlisting}
This trick works because the base pointer always contains the value
that the stack pointer contained immediately prior to the allocation
of the local variables.
\item Immediately before returning, we must restore the caller's base
pointer value by popping EBP off the stack. Remember, the first
thing we did on entry to the subroutine was to push the base pointer
to save its old value.
\item Finally, we return to the caller by executing a ret instruction. This instruction will find and
remove the appropriate return address from the stack.
\end{numlist}
It might be noted that the callee's rules fall cleanly into two halves
that are basically mirror images of one another. The first half of the
rules apply to the beginning of the function, and are therefor
commonly said to define the {\em prologue} to the function. The latter
half of the rules apply to the end of the function, and are thus
commonly said to define the {\em epilogue} of the function.
\section{Calling Convention Example}
The above rules may seem somewhat abstract on first examination. In practice, the rules
become simple to use when they are well understood and familiar. To start the process of better
understanding the call convention, we now examine a simple example of a subroutine call and a
subroutine definition.
\begin{figure}
\lstinputlisting[caption={Example function call, caller's rules obeyed},label={x86-caller-example-1.s.lst},backgroundcolor=\color{white},frame=trBL,linewidth=6in,xleftmargin=0.75in,language={[x86masm]Assembler}]{x86-32bit/code/caller-example-1.s}
\end{figure}
In Listing~\ref{x86-caller-example-1.s.lst} a sample function call is
depicted. Note how the caller pushes the parameters onto the stack in
inverted order before the call. The call instruction is used to jump
to the beginning of the subroutine in anticipation of the fact that
the subroutine will use the ret instruction to return when the
subroutine completes. When the subroutine returns, the parameters must
be removed from the stack. A simple way to do this is to add the
appropriate amount to the stack pointer (since the stack grows down).
Finally, the result is available in EAX.
Relative to the caller's rules, the callee's rules are somewhat more
complex. An example subroutine implementation that obeys the callee's
rules is depicted in Listing~\ref{x86-callee-example-1.s.lst}. The
subroutine prologue performs the standard actions of saving a snapshot
of the stack pointer in EBP (the base pointer), allocating local
variables by decrementing the stack pointer, and saving register
values on the stack.
\begin{figure}[h!]
\lstinputlisting[caption={Example function definition, callee's rules obeyed},label={x86-callee-example-1.s.lst},backgroundcolor=\color{white},frame=trBL,linewidth=6.9in,xleftmargin=0.15in,language={[x86masm]Assembler}]{x86-32bit/code/callee-example-1.s}
\end{figure}
In the body of the subroutine we can now more clearly see the use of
the base pointer illustrated. Both parameters and local variables are
located at constant offsets from the base pointer for the duration of
the subroutines execution. In particular, we notice that since
parameters were placed onto the stack before the subroutine was
called, they are always located below the base pointer (i.e. at higher
addresses) on the stack. The first parameter to the subroutine can
always be found at memory location [EBP+8], the second at [EBP+12],
the third at [EBP+16], and so on. Similarly, since local variables
are allocated after the base pointer is set, they always reside above
the base pointer (i.e. at lower addresses) on the stack. In
particular, the first local variable is always located at [EBP-4], the
second at [EBP-8], and so on. Understanding this conventional use of
the base pointer allows us to quickly identify the use of local
variables and parameters within a function body.
The function epilogue, as expected, is basically a mirror image of the
function prologue. The caller's register values are recovered from the
stack, the local variables are deallocated by resetting the stack
pointer, the caller's base pointer value is recovered, and the {\tt
ret} instruction is used to return to the appropriate code location
in the caller.
A good way to visualize the operation of the calling convention is to
draw the contents of the nearby region of the stack during subroutine
execution. Figure~\ref{x86-activation-record.fig} depicts the contents
of the stack during the execution of the body of myFunc (myFunc is
depicted in Listing~\ref{x86-callee-example-1.s.lst}). Notice, lower
addresses are depicted lower in the figure, and thus the ``top'' of
the stack is the bottom-most cell. This corresponds visually to the
intuitive statement that the x86 hardware stack ``grows down.'' The
cells depicted in the stack are 32-bit wide memory locations, thus the
memory addresses of the cells are 4 bytes apart. From this picture we
see clearly why the first parameter resides at an offset of 8 bytes
from the base pointer. Above the parameters on the stack (and below
the base pointer), the call instruction placed the return address,
thus leading to an extra 4 bytes of offset from the base pointer to
the first parameter.
\begin{figure}[h]
\centering
\includegraphics[width=4in]{x86-32bit/x86-activation-record.pdf}
\caption{A picture of the stack in memory during the execution of the body of myFunc}
\label{x86-activation-record.fig}
\end{figure}
The assembly code for {\tt myFunc()} was shown above in
Listing~\ref{x86-callee-example-1.s.lst}. The C++ code to call that
subroutine is shown in
Listing~\ref{x86-callee-example-main.cpp.lst}.
\begin{figure}[h!]
\lstinputlisting[caption={Example C++ code to invoke a 3-parameter x86 subroutine},label={x86-callee-example-main.cpp.lst},backgroundcolor=\color{white},frame=trBL,linewidth=6.9in,xleftmargin=0.15in,language=C++]{x86-32bit/code/callee-example-main.cpp}
\end{figure}