<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<HTML>
<HEAD>
<title>Itanium C++ ABI</title>
<link rel=stylesheet href=code.css type="text/css">
</HEAD>
<BODY>
<hr />
<h1>Itanium C++ ABI</h1>
<p> <hr> <p>
<h2> Contents </h2>
<ul>
<li><a href="#acknowledgements">Acknowledgements</a></li>
<li> <a href=#intro> Chapter 1: Introduction </a>
<ul>
<li> <a href=#definitions> 1.1 Definitions </a>
<li> <a href=#limits> 1.2 Limits </a>
<li> <a href=#namespace> 1.3 Namespace and Header </a>
<li> <a href=#scope> 1.4 Scope of This ABI </a>
<li> <a href=#docs> 1.5 Base Documents </a>
</ul>
<li> <a href=#layout> Chapter 2: Data Layout </a>
<ul>
<li> <a href=#general> 2.1 General </a>
<li> <a href=#pod> 2.2 POD Data Types </a>
<li> <a href=#member-pointers> 2.3 Member Pointers </a>
<li> <a href=#class-types> 2.4 Non-POD Class Types </a>
<li> <a href=#vtable> 2.5 Virtual Table Layout </a>
<li> <a href=#vtable-ctor> 2.6 Virtual Tables During Object Construction </a>
<li> <a href=#array-cookies> 2.7 Array Operator <code>new</code> Cookies </a>
<li> <a href=#guards> 2.8 Initialization Guard Variables </a>
<li> <a href=#rtti> 2.9 Run-Time Type Information (RTTI) </a>
</ul>
<li> <a href=#calls> Chapter 3: Code Emission and APIs </a>
<ul>
<li> <a href=#functions> 3.1 Functions</a>
<li> <a href=#vcall> 3.2 Virtual Calls</a>
<li> <a href=#obj-ctor> 3.3 Construction and Destruction APIs</a>
<li> <a href=#demangler> 3.4 Demangler API</a>
</ul>
<li> <a href=abi-eh.html> Chapter 4: Exception Handling </a>
<li> <a href=#linkage> Chapter 5: Linkage and Object Files </a>
<ul>
<li> <a href=#mangling> 5.1 External Names (a.k.a. Mangling)</a>
<li> <a href=#vague> 5.2 Vague Linkage </a>
<li> <a href=#unwind> 5.3 Unwind Table Location </a>
</ul>
<li> <a href=#revisions> Appendix R: Revision History</a>
</ul>
<p> <hr> <p>
<a name="acknowledgements">
<h2><a href="#acknowledgements"> Acknowledgements</a></h2>
<p> <hr> <p>
<p>This document was originally developed jointly by an informal
industry coalition consisting of (in alphabetical order) CodeSourcery,
Compaq, EDG, HP, IBM, Intel, Red Hat, and SGI. Additional contributions
were provided by a variety of individuals. It is now developed as an
open-source project with contributions from a variety of individuals
and companies.</p>
<p> <hr> <p>
<a name="intro">
<h2><a href="#intro"> Chapter 1: Introduction </a></h2>
<p> <hr> <p>
In this document, we specify the Application Binary Interface (ABI)
for C++ programs: that is, the object code interfaces between different
user-provided C++ program fragments and between those fragments and
the implementation-provided runtime and libraries. This includes the
memory layout for C++ data objects, including both predefined and
user-defined data types, as well as internal compiler generated
objects such as virtual tables. It also includes function calling
interfaces, exception handling interfaces, global naming, and various
object code conventions.
<p>
In general, this document is meant to serve as a generic specification
which can be used by C++ implementations on a variety of platforms.
It does this by layering on top of a platform's base C ABI. However,
it was originally written for the Itanium architecture, and some parts
still directly make Itanium-specific or 64-bit-specific assumptions.
There is an ongoing project to restate the entire C++ ABI specification
in terms of portable C concepts that are defined in the C ABI. In
the meantime, it is usually straightforward to recognize these
unportable assumptions and translate them appropriately, e.g. by
replacing a 64-bit pointer with a 32-bit pointer.
<p>
This document is not an authoritative definition of the C++ ABI for
any particular platform. Platform vendors retain the ultimate power
to define the C++ ABI for their platform. Platforms using this ABI
for C++ should declare that they do so, either unmodified or with a
certain set of changes.
<p>
While this ABI has generally stood up well, there are some parts of it
that are now seen as mistakes. This document includes several
recommendations for platforms adopting this ABI with no need to
interoperate with existing C++ object code. These recommendations
appear as follows:
<blockquote>
<span class="future-abi">Recommendation for new platforms: consider
forbidding the use of function templates on your platform so that
the ABI can remove these expression-mangling rules.</span>
</blockquote>
<p>
Platforms adopting any of these recommendations should describe the
exact changes they've made in their platform ABI documentation,
as the set of recommendations in this document may change over time.
<p> <hr> <p>
<a name="definitions">
<h3><a href="#definitions"> 1.1 Definitions </a></h3>
<p>
The descriptions below make use of the following definitions:
<dl>
<p>
<dt> <i>alignment</i> of a type T (or object X)</dt>
<dd>
A value A such that any object X of type T has an address satisfying
the constraint that &X modulo A == 0.
<p>
<dt> <i>base class</i> of a class T</dt>
<dd>
When this document refers to base classes of a class T,
unless otherwise specified,
it means T itself as well as all of the classes from which it is derived,
directly or indirectly, virtually or non-virtually.
We use the term <i>proper base class</i>
to exclude T itself from the list.
<p>
<dt> <i>base object destructor</i> of a class T</dt>
<dd>
A function that runs the destructors for non-static data members of T and
non-virtual direct base classes of T.
<p>
<dt> <i>basic ABI properties</i> of a type T</dt>
<dd>
The basic representational properties of a type decided by the base C ABI,
including its size, its alignment, its treatment by calling conventions,
and the representation of pointers to it.
<p>
<dt> <i>complete object destructor</i> of a class T</dt>
<dd>
A function that, in addition to the actions required of a base
object destructor, runs the destructors for the virtual base classes of T.
<p>
<dt> <i>deleting destructor</i> of a class T</dt>
<dd>
A function that, in addition to the actions required of a complete
object destructor, calls the appropriate deallocation function
(i.e., <code>operator delete</code>) for T.
<p>
<dt> <i>direct base class order</i> </dt>
<dd>
When the direct base classes of a class are viewed as an ordered set,
the order assumed is the order declared, left-to-right.
<p>
<dt> <i>diamond-shaped inheritance</i> </dt>
<dd>
A class has diamond-shaped inheritance iff it has a virtual base class
that can be reached by distinct inheritance graph paths through
more than one direct base.
<p>
<dt> <i>dynamic class</i> </dt>
<dd>
A class requiring a virtual table pointer
(because it or its bases have one or more virtual member functions or
virtual base classes).
<p>
<dt> <i>empty class</i> </dt>
<dd>
A class with no non-static data members other than empty data members,
no unnamed bit-fields other than zero-width bit-fields,
no virtual functions, no virtual base classes,
and no non-empty non-virtual proper base classes.
<p>
<dt> <i>empty data member</i> </dt>
<dd>
A potentially-overlapping non-static data member of empty class type.
<p>
<dt> <i>inheritance graph</i> </dt>
<dd>
A graph with nodes representing a class and all of its subobjects,
and arcs connecting each node with its direct bases.
<p>
<dt> <i>inheritance graph order</i> </dt>
<dd>
The ordering on a class object and all its subobjects obtained
by a depth-first traversal of its inheritance graph,
from the most-derived class object to base objects,
where:
<ul>
<p>
<li> No node is visited more than once.
(So, a virtual base subobject, and all of its base subobjects,
will be visited only once.)
<p>
<li>
The subobjects of a node are visited in the order in which they
were declared.
(So, given <code>class A : public B, public C</code>,
A is walked first,
then B and its subobjects,
and then C and its subobjects.)
</ul>
<p>
Note that the traversal may be preorder or postorder.
Unless otherwise specified,
preorder (derived classes before their bases) is intended.
<p>
<a name="instantiation-dependent"></a>
<dt> <i>instantiation-dependent</i> </dt>
<dd>
An expression is <i>instantiation-dependent</i> if it is type-dependent or value-dependent,
or it has a subexpression that is type-dependent or value-dependent. For example, if
<code>p</code> is a type-dependent identifier, the expression <code>sizeof(sizeof(p))</code>
is neither type-dependent, nor value-dependent, but it is instantiation-dependent (and could
turn out to be invalid if after substitution of template arguments <code>p</code> turns out to
have an incomplete type).
Similarly, a type expressed in source code is <i>instantiation-dependent</i> if the source
form includes an <i>instantiation-dependent</i> expression. For example, the type form
<code>double[sizeof(sizeof(p))]</code> (with <code>p</code> a type-dependent identifier)
is instantiation-dependent.
<p>
<dt> <i>morally virtual</i> </dt>
<dd>
A subobject X is a <i>morally virtual</i> base of Y if X is either a
virtual base of Y, or the direct or indirect base of a virtual base of
Y.
<p>
<dt> <i>nearly empty class</i> </dt>
<dd>
A class that contains a virtual pointer, but no other data except
(possibly) virtual bases. In particular, it:
<ul>
<li> has no non-static data members and no non-zero-width unnamed bit-fields,
<li> has no direct base classes that are not either empty, nearly empty,
or virtual,
<li> has at most one non-virtual, nearly empty direct base class, and
<li> has no proper base class that is empty, not morally virtual, and
at an offset other than zero.
</ul>
Such classes may be primary base classes even if virtual,
sharing a virtual pointer with the derived class.
<p>
<a name="non-trivial">
<dt> <i>non-trivial for the purposes of calls</i></dt>
<dd>
<p>A type is considered non-trivial for the purposes of calls if:
<ul>
<li>it has a non-trivial copy constructor, move constructor, or destructor, or
<li>all of its copy and move constructors are deleted.
</ul>
</p>
<p>This definition, as applied to class types, is intended to be the
complement of the definition in [class.temporary]p3 of types for which
an extra temporary is allowed when passing or returning a type. A type
which is trivial for the purposes of the ABI will be passed and returned
according to the rules of the base C ABI, e.g. in registers; often
this has the effect of performing a trivial copy of the type.
</p>
</dd>
<p>
<a name="POD" />
<dt> <i>POD for the purpose of layout</i></dt>
<dd>
<p>
In general, a type is considered a POD for the purposes of layout if
it is a POD type (in the sense of ISO C++ [basic.types]). However, a
type is not considered to be a POD for the purpose of layout if it is:
<ul>
<li>a POD-struct or POD-union (in the sense of ISO C++ [class]) with a
bit-field whose declared width is wider than the declared type
of the bit-field, or
<li>an array type whose element type is not a POD for the purpose of layout, or
<li>a POD-struct with one or more potentially-overlapping non-static
data members.
</ul>
Where references to the ISO C++ standard are made in this paragraph, the
Technical Corrigendum 1 version of the standard is intended.
</p>
<p>
<img src="warning.gif" alt="<b>NOTE</b>:">
There have been multiple published revisions to the ISO C++ standard,
and each one has included a different definition of POD. To ensure
interoperation of code compiled according to different revisions of
the standard, it is necessary to settle on a single definition for a
platform. A platform vendor may choose to follow a different revision
of the standard, but by default, the definition of POD under this ABI
is the definition from the 2003 revision (TC1).
</p>
<p>
Being tied to the TC1 definition of POD does not prevent compilers
from being fully compliant with later revisions. This ABI uses the
definition of POD only to decide whether to allocate objects in the
tail-padding of a base-class subobject. While the standards have
broadened the definition of POD over time, they have also forbidden
the programmer from directly reading or writing the underlying bytes
of a base-class subobject with, say, <tt>memcpy</tt>. Therefore,
even in the most conservative interpretation, implementations may
freely allocate objects in the tail padding of any class which would
not have been POD in C++98. This ABI is in compliance with that.
</p>
</dd>
<p>
<dt> <i>potentially-overlapping subobject</i> </dt>
<dd>
A base class subobject or a non-static data member declared with
the <tt>[[no_unique_address]]</tt> attribute.
<p>
<dt> <i>primary base class</i> </dt> <dd> For a dynamic class, the
unique base class (if any) with which it shares the virtual pointer at
offset 0.
<p>
<dt> <i>secondary virtual table</i> </dt>
<dd>
The instance of a virtual table for a base class
that is embedded in the virtual table of a class derived from it.
<p>
<dt> <i>thunk</i> </dt>
<dd>
A segment of code associated (in this ABI) with a target function,
which is called instead of the target function for the purpose of
modifying parameters (e.g. <code>this</code>)
or other parts of the environment
before transferring control to the target function,
and possibly making further modifications after its return.
A thunk may contain as little as an instruction to be executed prior to
falling through to an immediately following target function,
or it may be a full function with its own stack frame that does
a full call to the target function.
<p>
<dt> <i>vague linkage</i> </dt>
<dd>
The treatment of entities --
e.g. inline functions, templates, virtual tables --
with external linkage that can be
defined in multiple translation units,
while the ODR requires that the program
behave as if there were only a single definition.
<p>
<dt> <i>virtual table</i> (or <i>vtable</i>) </dt>
<dd>
A dynamic class has an associated table
(often several instances, but not one per object)
which contains information about its dynamic attributes,
e.g. virtual function pointers, virtual base class offsets, etc.
<p>
<dt> <i>virtual table group</i> </dt>
<dd>
The primary virtual table for a class along with all of the associated
secondary virtual tables for its proper base classes.
</dl>
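<p>
As an illustration of the <i>inheritance graph order</i> defined above, the
following is a minimal sketch of the preorder traversal. The
<code>ClassDesc</code> and <code>BaseSpec</code> types and the function names
are hypothetical modelling aids, not part of this ABI; the sketch only assumes
the two rules stated in the definition (no node is visited twice, and bases are
visited in declaration order).
</p>

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a class: a name plus its direct bases,
// listed in declaration order.
struct ClassDesc;
struct BaseSpec {
    const ClassDesc* cls;
    bool is_virtual;
};
struct ClassDesc {
    std::string name;
    std::vector<BaseSpec> bases;
};

// Preorder depth-first walk. A virtual base subobject is shared, so it (and
// everything below it) is visited only the first time it is reached.
// Non-virtual bases are distinct subobjects and are visited per occurrence.
void walk(const ClassDesc& c, std::set<const ClassDesc*>& seen_virtual,
          std::vector<std::string>& out) {
    out.push_back(c.name);
    for (const BaseSpec& b : c.bases) {
        if (b.is_virtual && !seen_virtual.insert(b.cls).second)
            continue;  // shared virtual base was already visited
        walk(*b.cls, seen_virtual, out);
    }
}

std::vector<std::string> inheritance_graph_order(const ClassDesc& most_derived) {
    std::set<const ClassDesc*> seen_virtual;
    std::vector<std::string> out;
    walk(most_derived, seen_virtual, out);
    return out;
}

// Models: struct A {}; struct B : virtual A {}; struct C : virtual A {};
//         struct D : B, C {};   (diamond-shaped inheritance)
std::vector<std::string> diamond_order() {
    ClassDesc a{"A", {}};
    ClassDesc b{"B", {{&a, true}}};
    ClassDesc c{"C", {{&a, true}}};
    ClassDesc d{"D", {{&b, false}, {&c, false}}};
    return inheritance_graph_order(d);
}
```

<p>
For the diamond modelled by <code>diamond_order</code>, the walk visits D, then
B and its virtual base A, then C; A is not visited again when reached through C.
</p>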
<p> <hr> <p>
<a name="limits">
<h3><a href="#limits"> 1.2 Limits </a></h3>
<p>
Various representations specified by this ABI impose limitations on
conforming user programs.
These include, for the 64-bit Itanium ABI:
<ul>
<p>
<li>
The offset of a non-virtual base subobject in the full object containing
it must be representable by a 56-bit signed integer
(due to RTTI implementation).
This implies a practical limit of 2**55 bytes on the size of a class.
</ul>
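<p>
The arithmetic behind this limit can be sketched as follows. The sketch assumes
that the offset shares a 64-bit word with 8 low-order flag bits, which is how a
56-bit signed field would arise; the constant and function names are
hypothetical and are not part of any header defined by this ABI.
</p>

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 8 low-order bits of a 64-bit word hold flags, leaving the
// upper 56 bits for a signed offset.
constexpr int kFlagBits = 8;
constexpr std::int64_t kScale = std::int64_t{1} << kFlagBits;
constexpr std::int64_t kMaxOffset = (std::int64_t{1} << 55) - 1;
constexpr std::int64_t kMinOffset = -kMaxOffset - 1;

// True if the offset is representable as a 56-bit signed integer.
constexpr bool fits_in_56_bits(std::int64_t offset) {
    return offset >= kMinOffset && offset <= kMaxOffset;
}

// Pack an offset and a flag byte into one word. Multiplication is used
// instead of a shift so that negative offsets are handled portably.
constexpr std::int64_t pack(std::int64_t offset, std::uint8_t flags) {
    return offset * kScale + flags;
}

// Recover the signed offset: remove the flag byte, then divide exactly.
constexpr std::int64_t unpack_offset(std::int64_t word) {
    return (word - (word & 0xFF)) / kScale;
}

static_assert(fits_in_56_bits(kMaxOffset), "2**55 - 1 is the largest offset");
static_assert(!fits_in_56_bits(kMaxOffset + 1), "2**55 itself does not fit");
static_assert(unpack_offset(pack(-16, 3)) == -16, "round trip preserves sign");
```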
<p> <hr> <p>
<a name="namespace">
<h3><a href="#namespace"> 1.3 Namespace and Header </a></h3>
<p>
This ABI specifies a number of type and function APIs supplemental
to those required by the ISO C++ Standard.
A header file named <code>cxxabi.h</code> will be provided by
implementations that declares these APIs.
The reference header file included with this ABI definition
shall be the authoritative definition of the APIs.
<p>
These APIs will be placed in a namespace <code>__cxxabiv1</code>.
The header file will also declare a namespace alias <code>abi</code>
for <code>__cxxabiv1</code>.
It is expected that users will use the alias,
and the remainder of the ABI specification will use it as well.
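<p>
As a brief illustration of using the <code>abi</code> alias, the sketch below
calls the demangler entry point <code>abi::__cxa_demangle</code> (see the
Demangler API section). It assumes an implementation that ships
<code>cxxabi.h</code>; the wrapper name <code>demangle</code> is hypothetical.
</p>

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <cxxabi.h>

// Demangle a name through the `abi` namespace alias.
// Returns the input unchanged if demangling fails.
std::string demangle(const char* mangled) {
    int status = 0;
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr)
        return mangled;
    std::string result(text);
    std::free(text);  // the returned buffer is allocated with malloc
    return result;
}
```

<p>
For example, <code>demangle("_Z1fv")</code> yields <code>f()</code>.
</p>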
<p>
In general,
API objects defined as part of this ABI are assumed to be extern "C++".
However, some are specified to be extern "C" if they:
<ul>
<li> are expected to be called by users from C code,
e.g. <code>longjmp_unwind</code>; or
<li> are expected to be called only implicitly by compiled code,
and are likely to be implemented in C.
</ul>
<p> <hr> <p>
<a name="scope">
<h3><a href="#scope"> 1.4 Scope of This ABI </a></h3>
<a name="scope-library">
<h3><a href="#scope-library"> 1.4.1 Runtime Libraries </a></h3>
<p>
The objective of a full ABI is to allow arbitrary mixing of object
files produced by conforming implementations,
by fully specifying the <b>binary interface</b> of application programs.
We do not fully achieve this objective.
<p>
There are two principal reasons for this:
<ol type=I>
<p>
<li>
We start from the Itanium processor-specific ABI as the standard for the
underlying C interfaces.
At this time, however,
the psABI does not attempt to specify the supported C library interfaces.
<p>
<li>
More fundamental is the definition of the Standard C++ Library.
As the standard interface makes heavy use of templates,
most user object files will end up with embedded template
instantiations.
Vendors are allowed to use helper functions and data in their
implementations of these templates,
and quite reasonably do so,
with the result that a typical user object file will contain references
to such helper objects specific to the implementation where compiled.
We have not attempted to constrain the interface at this level,
because we do not consider doing so feasible at this time.
</ol>
<p>
Notwithstanding these problems,
because this ABI does completely specify the data model
and certain library interfaces that inherently interact between objects
(e.g. construction, destruction, and exceptions),
it is our intent that interoperation of object files produced by
different compilers be possible in the following cases:
<ul>
<p>
<li>
A program which uses only the standalone standard library interfaces
(Chapter 18) does not depend on the problematic template features.
<p>
<li>
Since the standard library headers for an implementation presumably
match the interfaces of the standard library on that implementation,
a program compiled with the target system's headers,
even if a mixture of compilers is used,
should function properly on that system.
</ul>
<p>
Even these cases can fail if the compiler makes use of
implementation-defined library interfaces to implement runtime
functionality without explicit user reference,
e.g. a software divide function.
We can distinguish between:
<ul>
<li> the standard support library,
which provides interfaces required by the C++ Standard Library
specification and the vendor header files required for it,
as well as interfaces required by this ABI; and
<p>
<li> the implicit compiler support library,
which provides other interfaces implicitly assumed by the compiler
and used to implement either standard features or extensions.
</ul>
<p>
An implementation shall place its standard support library in a DSO
named <code>libcxa.so</code> on Itanium systems,
or in auxiliary DSOs automatically loaded by it.
It shall place implicit compiler support
in a library separate from the standard support library,
with any external names chosen to avoid conflicts between vendors
(e.g. by including a vendor identifier as part of the names).
This allows a program to function properly if linked with the
target's standard support library and the implicit compiler support
libraries from any implementations used to build components.
<a name="scope-templates">
<h3><a href="#scope-templates"> 1.4.2 Export Templates </a></h3>
<p>
This ABI does not specify the treatment of export templates,
as there are no working implementations to serve as models at this time.
We hope to address this weakness in the future when implementation
experience is available.
<p> <hr> <p>
<a name="docs">
<h3><a href="#docs"> 1.5 Base Documents </a></h3>
<p>
A number of other documents provide a basis on which this ABI is built,
and are occasionally referenced herein:
<ul>
<p>
<li> [gABI]
The <b>System V Application Binary Interface</b>,
otherwise known as the <i>Generic ABI</i>.
This document describes processor-independent object file formats
and binary software interfaces for C under Unix.
A somewhat out-of-date version is available from the SCO website,
<a href=http://www.caldera.com/developers/devspecs/>
http://www.caldera.com/developers/devspecs/</a>.
A newer version, produced in conjunction with the next document,
should be released in the future.
Included by reference in this ABI.
<p>
<li> [psABI]
The Intel <b>Unix System V Application Binary Interface,
Itanium Processor Supplement</b>.
This document describes Itanium processor-specific object file formats
and binary software interfaces, primarily for C, under Unix.
Available from the Intel Itanium software developer website,
<a href="http://www.intel.com/design/itanium/downloads/245370.htm">
http://www.intel.com/design/itanium/downloads/245370.htm</a>.
Included by reference in this ABI.
<p>
<li> [SWCONV]
The Intel <b>Itanium Software Conventions and Runtime Architecture Guide</b>.
This document describes Itanium processor-specific binary software interfaces,
notably including register usage, subprogram calling conventions, and
stack unwind facilities, under all systems.
Available from the Intel Itanium software developer website,
<a href="https://www.intel.com/content/dam/www/public/us/en/documents/guides/itanium-software-runtime-architecture-guide.pdf">
https://www.intel.com/content/dam/www/public/us/en/documents/guides/itanium-software-runtime-architecture-guide.pdf</a>.
Included by reference in this ABI.
<p>
<li> [ABI-EH]
The <a href=abi-eh.html><b>C++ ABI for Itanium: Exception Handling</b></a>.
Its Level II is considered an integral part of this document (Chapter 4).
It also contains the base specification of unwind support for [psABI].
<p>
<li> [C++FDIS]
The <b>Final Draft International Standard, Programming Language C++</b>,
ISO/IEC FDIS 14882:1998(E).
References herein to the "C++ Standard," or to just the "Standard,"
are to this document.
</ul>
<p> <hr> <p>
<a name="layout">
<h2><a href="#layout"> Chapter 2: Data Layout </a></h2>
<p> <hr> <p>
<a name="general">
<h3><a href="#general"> 2.1 General </a></h3>
<p>
In what follows, we define the memory layout for C++ data objects.
Specifically, for each type, we specify the following information about
an object O of that type:
<ul>
<li> the <i>size</i> of an object, <i>sizeof</i>(O);
<li> the <i>alignment</i> of an object, <i>align</i>(O); and
<li> the <i>offset</i> within O, <i>offset</i>(C),
of each data component C, i.e. base or member.
</ul>
<p> For purposes internal to the specification,
we also specify:
<ul>
<li> <i>dsize</i>(O):
the <i>data size</i> of an object, which is the size of O without tail
padding.
<p>
<li> <i>nvsize</i>(O):
the <i>non-virtual size</i> of an object, which is the size of O
without virtual bases.
<p>
<li> <i>nvalign</i>(O):
the <i>non-virtual alignment</i> of an object, which is the alignment of O
without virtual bases.
</ul>
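<p>
The three externally visible quantities correspond to the language operators
<code>sizeof</code>, <code>alignof</code>, and (for standard-layout types)
<code>offsetof</code>. The following sketch uses a hypothetical
<code>Sample</code> type to show the relationships between them; the internal
quantities <i>dsize</i>, <i>nvsize</i>, and <i>nvalign</i> have no
corresponding operator.
</p>

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical standard-layout type used only for illustration.
struct Sample {
    char tag;     // first member, at the address point of the object
    int value;    // placed at the next offset that satisfies alignof(int)
    short count;
};

static_assert(offsetof(Sample, tag) == 0,
              "the first member sits at offset 0");
static_assert(offsetof(Sample, value) % alignof(int) == 0,
              "each member offset honours the member's alignment");
static_assert(sizeof(Sample) % alignof(Sample) == 0,
              "the size is a multiple of the alignment");
```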
<p> <hr> <p>
<a name="pod">
<h3><a href="#pod"> 2.2 POD Data Types </a></h3>
<p>
The size and alignment of a type which is a <a href="#POD">POD for the
purpose of layout</a> is as specified by the base C ABI, with the
following provisos:
</p>
<ul>
<li>If the base ABI specifies rules for the C99 type <code>_Bool</code>,
then <code>bool</code> follows those rules. Otherwise, it has size
and alignment 1.</li>
<li>If the base ABI does not specify rules for empty classes, then an
empty class has size and alignment 1.</li>
<li>The types <code>T &</code> and <code>T &&</code> are treated
exactly like the pointer type <code>T *</code>.</li>
<li>A member pointer type is treated exactly as if it were the C type
<a href="#member-pointers">described below</a>.</li>
</ul>
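<p>
Some of these provisos can be checked directly. The sketch below assumes a
platform whose base ABI says nothing about empty classes, so the default of
size and alignment 1 applies; all type names are hypothetical.
</p>

```cpp
#include <cassert>

struct Empty {};

// Wrappers that hold one reference or one pointer, so that the storage
// cost of each can be compared.
struct RefHolder { int& r; };
struct RvalueRefHolder { int&& r; };
struct PtrHolder { int* p; };

// Assumption: the base ABI does not give its own rules for empty classes.
static_assert(sizeof(Empty) == 1 && alignof(Empty) == 1,
              "an empty class has size and alignment 1");

// References are laid out exactly like pointers.
static_assert(sizeof(RefHolder) == sizeof(PtrHolder), "T& is laid out like T*");
static_assert(alignof(RefHolder) == alignof(PtrHolder), "T& is aligned like T*");
static_assert(sizeof(RvalueRefHolder) == sizeof(PtrHolder), "T&& is laid out like T*");
```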
<p>
The <i>dsize</i>, <i>nvsize</i>, and <i>nvalign</i> of these types are
defined to be their ordinary size and alignment. These properties
only matter for non-empty class types that are used as base classes.
We ignore tail padding for PODs because an early version of the
standard did not allow us to use it for anything else and because it
sometimes permits faster copying of the type.
</p>
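<p>
The effect of ignoring tail padding for PODs can be observed by comparing a POD
base with an otherwise identical non-POD base. This is a sketch under the
assumption of a compiler that follows this ABI with a 4-byte <code>int</code>;
the type and function names are hypothetical.
</p>

```cpp
#include <cassert>
#include <cstddef>

// POD for the purpose of layout: dsize equals sizeof, so the tail padding
// after `c` is not reused by a derived class.
struct PodBase { int i; char c; };
struct PodDerived : PodBase { char d; };

// Not a POD (user-provided constructor): dsize excludes the tail padding,
// so a derived class may place its members there.
struct NonPodBase {
    NonPodBase() : i(0), c(0) {}
    int i;
    char c;
};
struct NonPodDerived : NonPodBase { char d; };

// Byte offset of member `d` from the start of the complete object.
template <class T>
std::ptrdiff_t offset_of_d(const T& obj) {
    return reinterpret_cast<const char*>(&obj.d) -
           reinterpret_cast<const char*>(&obj);
}
```

<p>
Under the stated assumptions, <code>d</code> in <code>PodDerived</code> lands
after the whole of <code>PodBase</code>, whereas <code>d</code> in
<code>NonPodDerived</code> lands inside the tail padding of
<code>NonPodBase</code>, so the derived class is no larger than its base.
</p>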
<p> <hr> <p>
<a name="member-pointers"></a>
<h3><a href="#member-pointers"> 2.3 Member Pointers </a></h3>
<a name="data-member-pointers"></a>
<h4><a href="#data-member-pointers"> 2.3.1 Data Member Pointers </a></h4>
<p>
The basic ABI properties of data member pointer types are those
of <code>ptrdiff_t</code>.
<p>
A data member pointer is represented as the data member's offset in bytes
from the address point of an object of the base type, as a
<code>ptrdiff_t</code>.
<p>
A null data member pointer is represented as an offset of <code>-1</code>.
Unfortunately, it is possible to generate a data member pointer with an
offset of <code>-1</code> using explicit derived-to-base conversions.
If this is done, implementations following this ABI may misbehave.
<span class="future-abi">Recommendation for new platforms: consider using
a different representation for data member pointers, such as left-shifting
the offset by one and using a non-zero low bit to indicate a non-null
value.</span>
<p>
Note that by <code>[dcl.init]</code>, "zero initialization" of a data
member pointer object stores a null pointer value into it. Under this
representation, that value has a non-zero bit representation. On most
modern platforms, data member pointers are the only type with this
property.
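<p>
The representation described above can be inspected by copying the bytes of a
data member pointer into a <code>ptrdiff_t</code>. This is a sketch that
assumes a compiler following this ABI; <code>Point</code> and
<code>raw_bits</code> are hypothetical names.
</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct Point { int x; int y; };

// Copy the object representation of a data member pointer into a ptrdiff_t.
// Assumes the representation described above, so both have the same size.
template <class C, class M>
std::ptrdiff_t raw_bits(M C::* mp) {
    static_assert(sizeof(M C::*) == sizeof(std::ptrdiff_t),
                  "data member pointers have the properties of ptrdiff_t");
    std::ptrdiff_t bits;
    std::memcpy(&bits, &mp, sizeof bits);
    return bits;
}
```

<p>
With this helper, <code>raw_bits(&Point::y)</code> equals
<code>offsetof(Point, y)</code>, while a null or zero-initialized
<code>int Point::*</code> yields <code>-1</code> rather than an all-zero bit
pattern.
</p>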
<p>
Base-to-derived and derived-to-base conversions of a non-null data member
pointer can be performed by adding or subtracting (respectively) the static
offset of the base within the derived class. The C++ standard does not
permit base-to-derived and derived-to-base conversions of member pointers
to cross a <code>virtual</code> base relationship, and so a static offset
is always known.
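<p>
The conversion rule can be illustrated with a class that has two non-virtual
bases. The sketch assumes a compiler following this ABI; <code>Left</code>,
<code>Right</code>, <code>Both</code>, and the helper functions are
hypothetical names.
</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct Left { int l; };
struct Right { int r; };
struct Both : Left, Right {};

// Copy the object representation of a data member pointer into a ptrdiff_t.
template <class C, class M>
std::ptrdiff_t raw_bits(M C::* mp) {
    static_assert(sizeof(M C::*) == sizeof(std::ptrdiff_t),
                  "data member pointers have the properties of ptrdiff_t");
    std::ptrdiff_t bits;
    std::memcpy(&bits, &mp, sizeof bits);
    return bits;
}

// Static offset of the Right base subobject within a Both object.
std::ptrdiff_t right_offset_in_both() {
    Both b{};
    Right* base = &b;
    return reinterpret_cast<char*>(base) - reinterpret_cast<char*>(&b);
}
```

<p>
Converting <code>&Right::r</code> to <code>int Both::*</code> adds
<code>right_offset_in_both()</code> to the stored offset, and converting back
with <code>static_cast</code> subtracts it again. A null member pointer keeps
the value <code>-1</code> across the conversion.
</p>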
<a name="member-function-pointers"></a>
<h4><a href="#member-function-pointers"> 2.3.2 Member Function Pointers </a></h4>
<p>
Several different representations of member function pointers are in use.
The standard representation relies on several assumptions about the
platform, such as that the low bit of a function pointer to a non-virtual
member function is always zero. For platforms where this is not reasonable
to guarantee, an alternate representation must be used. One such
representation, used on the 32-bit ARM architecture, is also described here.
<p>
In all representations, the basic ABI properties of member function
pointer types are those of the following class, where <code>fnptr_t</code>
is the appropriate function-pointer type for a member function of this type:
<pre>
struct {
fnptr_t ptr;
ptrdiff_t adj;
};
</pre>
<p>
A member function pointer for a non-virtual member function is represented
with <code>ptr</code> set to a pointer to the function, using the base
ABI's representation of function pointers.
<p>
In the standard representation, a member function pointer for a virtual
function is represented with <code>ptr</code> set to 1 plus the function's
v-table entry offset (in bytes), converted to a function pointer as if by
<code>reinterpret_cast<fnptr_t>(uintfnptr_t(1 + offset))</code>,
where <code>uintfnptr_t</code> is an unsigned integer of the same
size as <code>fnptr_t</code>.
<p>
In both of these cases, <code>adj</code> stores the offset (in bytes)
which must be added to the <code>this</code> pointer before the call.
<p>
In the standard representation, a null member function pointer is
represented with <code>ptr</code> set to a null pointer. The value
of <code>adj</code> is unspecified for null member function pointers.
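<p>
The two-field layout can be examined by copying a member function pointer into
a structure of the same shape. This is a sketch that assumes a compiler
following this ABI and a platform where function pointers have the size of
<code>uintptr_t</code>; <code>Widget</code>, <code>MemFnRep</code>, and
<code>inspect</code> are hypothetical names.
</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

struct Widget {
    virtual void first() {}
    virtual void second() {}
    void plain() {}
};

// Mirror of the two-field class shown above, with ptr held as an integer
// so that its value can be examined.
struct MemFnRep {
    std::uintptr_t ptr;
    std::ptrdiff_t adj;
};

// Copy the object representation of a member function pointer.
MemFnRep inspect(void (Widget::*mfp)()) {
    static_assert(sizeof(mfp) == sizeof(MemFnRep),
                  "member function pointer is a {ptr, adj} pair");
    MemFnRep rep;
    std::memcpy(&rep, &mfp, sizeof rep);
    return rep;
}
```

<p>
In the standard representation, <code>inspect(&Widget::first).ptr</code> would
be expected to have its low bit set (1 plus the v-table entry offset), whereas
<code>inspect(&Widget::plain).ptr</code> holds an ordinary function address and
<code>inspect(nullptr).ptr</code> is zero.
</p>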
<p>
The standard representation relies on some assumptions which are
true for most platforms:
<ul compact>
<li>The low bit of a function pointer to a non-static member function
is never set. On most platforms, this is either always true or
can be made true at little cost. For example, on platforms where
a function pointer is just the address of the first instruction in the
function, the implementation can ensure that this address is always
sufficiently aligned to make the low bit zero for non-static member
functions; often this is required by the underlying architecture.</li>
<li>A null function pointer can be distinguished from a virtual
offset value. On most platforms, this is always true because the
null function pointer is the zero value.</li>
<li>The offset to a v-table entry is never odd. On most platforms,
the size of a v-table entry is even because the architecture is
byte-addressed and pointers are even-sized.</li>
<li>A virtual call can be performed knowing only the address of a
v-table entry and the type of the virtual function. On most
platforms, a v-table entry is equivalent to a function pointer,
and the type of that function pointer can be determined from the
member pointer type.</li>
</ul>
<p>
However, there are exceptions. For example, on the 32-bit ARM
architecture, the low bit of a function pointer determines whether
the function begins in Thumb mode. Such platforms must use an
alternate representation.
<p>
In the 32-bit ARM representation, the <code>this</code>-adjustment
stored in <code>adj</code> is left-shifted by one, and the low bit
of <code>adj</code> indicates whether <code>ptr</code> is a function
pointer (including null) or the offset of a v-table entry. A virtual
member function pointer sets <code>ptr</code> to the v-table entry
offset as if by
<code>reinterpret_cast&lt;fnptr_t&gt;(uintfnptr_t(offset))</code>.
A null member function pointer sets <code>ptr</code> to a null
function pointer and must ensure that the low bit of <code>adj</code>
is clear; the upper bits of <code>adj</code> remain unspecified.
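<p>
The 32-bit ARM encoding can be sketched in the same non-normative way. The
names <code>ArmMemFnPtr</code>, <code>arm_make_virtual</code> and so on are
hypothetical. The sketch uses multiplication and division by two in place of
shifts so that negative adjustments are handled without relying on the
behavior of shifts on signed values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative model only; none of these names are defined by the ABI.
using fnptr_t = void (*)();
using uintfnptr_t = std::uintptr_t;  // assumption: same size as fnptr_t

struct ArmMemFnPtr {
    fnptr_t ptr;
    std::ptrdiff_t adj;  // (this-adjustment * 2) + (1 if virtual, else 0)
};

// Non-virtual: ptr is the function pointer; low bit of adj is clear.
inline ArmMemFnPtr arm_make_nonvirtual(fnptr_t fn, std::ptrdiff_t this_adj) {
    return ArmMemFnPtr{fn, this_adj * 2};
}

// Virtual: ptr is the v-table entry offset itself; low bit of adj is set.
inline ArmMemFnPtr arm_make_virtual(std::size_t offset,
                                    std::ptrdiff_t this_adj) {
    return ArmMemFnPtr{reinterpret_cast<fnptr_t>(uintfnptr_t(offset)),
                       this_adj * 2 + 1};
}

// Null: null ptr and a clear low bit; upper bits of adj are unspecified.
inline ArmMemFnPtr arm_make_null() { return ArmMemFnPtr{nullptr, 0}; }

inline bool arm_is_virtual(const ArmMemFnPtr& p) { return (p.adj & 1) != 0; }

// Strips the flag bit, then undoes the left shift.
inline std::ptrdiff_t arm_this_adjustment(const ArmMemFnPtr& p) {
    return (p.adj - (p.adj & 1)) / 2;
}

// Small self-check showing a negative adjustment surviving the encoding.
inline void arm_example() {
    ArmMemFnPtr v = arm_make_virtual(16, -8);
    assert(arm_is_virtual(v));
    assert(arm_this_adjustment(v) == -8);
}
```

Because the discriminator lives in <code>adj</code>, this encoding makes no
assumption about the low bit of a function address.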
<p>A member function pointer is null if <code>ptr</code> is equal
to a null function pointer and (only when using the 32-bit ARM
representation) the low bit of <code>adj</code> is clear.
<p>Two member function pointers are equal if they are both null or
if their corresponding values of <code>ptr</code> and <code>adj</code>
are equal. Note that the C++ standard does not require member pointers
to the same virtual member function to compare equal. Under this ABI
they compare equal only if they were built using the same v-table
offset, which may not be the case in the presence of multiple
inheritance or overrides with covariant return types.
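<p>
The null test and the equality rule can be written down directly. The
following is a minimal sketch; the <code>Repr</code> selector and the function
names are hypothetical and serve only to show where the two representations
differ.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative model only; none of these names are defined by the ABI.
using fnptr_t = void (*)();
struct MemFnPtr {
    fnptr_t ptr;
    std::ptrdiff_t adj;
};
enum class Repr { Standard, Arm32 };  // hypothetical selector

// Null: ptr is null and, for 32-bit ARM only, the low bit of adj is clear.
inline bool is_null(const MemFnPtr& p, Repr r) {
    if (p.ptr != nullptr) return false;
    return r == Repr::Standard || (p.adj & 1) == 0;
}

// Equal: both null, or both fields match exactly.
inline bool equal(const MemFnPtr& a, const MemFnPtr& b, Repr r) {
    if (is_null(a, r) && is_null(b, r)) return true;
    return a.ptr == b.ptr && a.adj == b.adj;
}

// Two null values with different (unspecified) adj still compare equal.
inline void equality_example() {
    MemFnPtr a{nullptr, 4};
    MemFnPtr b{nullptr, 6};
    assert(equal(a, b, Repr::Standard));
    assert(equal(a, b, Repr::Arm32));
}
```

In the 32-bit ARM representation, a value with a null <code>ptr</code> and the
low bit of <code>adj</code> set is not null: it designates the v-table entry
at offset zero.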
<p>
Base-to-derived and derived-to-base conversions of a member function
pointer can be performed by adding or subtracting (respectively) the
static offset of the base within the derived class to the stored
<code>this</code>-adjustment value. In the standard representation,
this simply means adding it to <code>adj</code>; in the 32-bit ARM
representation, the addend must be left-shifted by one. Because the
adjustment does not factor into whether a member function pointer is
null, this addition can be done unconditionally when performing a
conversion.
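<p>
A minimal sketch of the conversion arithmetic follows, together with a
language-level example that behaves the same way under any ABI. The
<code>Repr</code> selector, the helper names, and the classes <code>A</code>,
<code>B</code> and <code>D</code> are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative model only; none of these names are defined by the ABI.
using fnptr_t = void (*)();
struct MemFnPtr {
    fnptr_t ptr;
    std::ptrdiff_t adj;
};
enum class Repr { Standard, Arm32 };  // hypothetical selector

// 32-bit ARM keeps the adjustment shifted left by one, so the addend is
// doubled, which leaves the low (virtual) bit of adj untouched.
inline std::ptrdiff_t addend(std::ptrdiff_t base_offset, Repr r) {
    return r == Repr::Arm32 ? base_offset * 2 : base_offset;
}

// No null check is needed: adj does not affect whether the value is null.
inline MemFnPtr base_to_derived(MemFnPtr p, std::ptrdiff_t base_offset,
                                Repr r) {
    p.adj += addend(base_offset, r);
    return p;
}

inline MemFnPtr derived_to_base(MemFnPtr p, std::ptrdiff_t base_offset,
                                Repr r) {
    p.adj -= addend(base_offset, r);
    return p;
}

// Language-level view: B is a non-first base of D, so the converted member
// pointer must adjust a D* to the B subobject before the call.
struct A { int a; int get_a() { return a; } };
struct B { int b; int get_b() { return b; } };
struct D : A, B {};

inline void conversion_example() {
    MemFnPtr p{nullptr, 1};  // ARM model: virtual entry 0, adjustment 0
    MemFnPtr q = base_to_derived(p, 8, Repr::Arm32);
    assert(q.adj == 17);     // (8 * 2) + virtual bit
    assert(derived_to_base(q, 8, Repr::Arm32).adj == 1);

    int (D::*pm)() = &B::get_b;  // implicit base-to-derived conversion
    D d;
    d.a = 1;
    d.b = 2;
    assert((d.*pm)() == 2);
}
```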
<p>
A call is performed as follows:
<ol>
<li>Add the stored adjustment to the <code>this</code> address.</li>
<li>If the member pointer stores a v-table entry offset, load the
v-table from the adjusted <code>this</code> address and call
the v-table entry at the stored offset.</li>
<li>Otherwise, call the stored function pointer.</li>
</ol>
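<p>
The call sequence above can be simulated with a hand-built v-table. This is a
sketch under several assumptions, not a description of what a compiler emits:
<code>FakeObject</code>, <code>fake_vtable</code>, <code>call</code> and the
other names are hypothetical, the <code>this</code> pointer is passed as an
explicit <code>void*</code>, the v-table pointer is assumed to sit at offset
zero of the adjusted object, and each v-table entry is assumed to be a plain
function pointer. The sketch uses the 32-bit ARM encoding so that the
virtual/non-virtual test does not depend on how function addresses are
aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Illustrative model only; none of these names are defined by the ABI.
using fnptr_t = int (*)(void* self);  // hypothetical lowered signature

struct MemFnPtr {
    fnptr_t ptr;
    std::ptrdiff_t adj;  // ARM-style: (this-adjustment * 2) + virtual bit
};

struct FakeObject {
    const fnptr_t* vptr;  // stands in for the v-table pointer at offset 0
    int value;
};

inline int get_value(void* self) {
    return static_cast<FakeObject*>(self)->value;
}
inline int get_twice(void* self) {
    return 2 * static_cast<FakeObject*>(self)->value;
}
static const fnptr_t fake_vtable[] = {get_value, get_twice};

inline int call(void* self, const MemFnPtr& p) {
    const bool is_virtual = (p.adj & 1) != 0;
    const std::ptrdiff_t this_adj = (p.adj - (p.adj & 1)) / 2;

    // Step 1: add the stored adjustment to the this address.
    char* adjusted = static_cast<char*>(self) + this_adj;

    if (is_virtual) {
        // Step 2: load the v-table from the adjusted object and call the
        // entry at the stored byte offset.
        const char* vtable;
        std::memcpy(&vtable, adjusted, sizeof vtable);
        const std::uintptr_t offset = reinterpret_cast<std::uintptr_t>(p.ptr);
        fnptr_t target;
        std::memcpy(&target, vtable + offset, sizeof target);
        return target(adjusted);
    }
    // Step 3: otherwise call the stored function pointer.
    return p.ptr(adjusted);
}

inline void call_example() {
    FakeObject obj{fake_vtable, 21};
    MemFnPtr direct{get_value, 0};
    // Second v-table entry: byte offset sizeof(fnptr_t), virtual bit set.
    MemFnPtr virt{reinterpret_cast<fnptr_t>(std::uintptr_t(sizeof(fnptr_t))),
                  1};
    assert(call(&obj, direct) == 21);
    assert(call(&obj, virt) == 42);
}
```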
<p> <hr> <p>
<a name="class-types">
<h3><a href="#class-types"> 2.4 Non-POD Class Types </a></h3>
For a class type C which is not a <a href="#POD">POD for the purpose
of layout</a>, assume that all component types (i.e. proper base
classes and non-static data member types) have been laid out, defining
size, data size, non-virtual size, alignment, and non-virtual
alignment.
(See the description of these terms in
<a href=#general><b>General</b></a> above.)
Layout (of type C) is done using the following procedure.
<ol type=I>
<p>
<li> <h5> Initialization </h5>
<ol type=1>
<p>
<li> Initialize sizeof(C) to zero, align(C) to one, dsize(C) to zero.
<p>
<li> If C is a dynamic class type:
<ol type=a>
<p>
<li> Identify all virtual base classes, direct or indirect,
that are primary base classes for some other direct or indirect
base class.
Call these <i>indirect primary base classes</i>.
<p>
<li> If C has a dynamic base class,
attempt to choose a primary base class B.
It is the first (in direct base class order)
non-virtual dynamic base class, if one exists.
Otherwise, it is a nearly empty virtual base class,
the first one in (preorder) inheritance graph order which
is not an indirect primary base class if any exist,
or just the first one if they are all indirect primaries.
<p>
<li> If C has no primary base class, allocate the virtual table
pointer for C at offset zero, and set sizeof(C), align(C),
and dsize(C) to the appropriate values for a pointer (all 8
bytes for the Itanium 64-bit ABI).
</ol>
</ol>
<p>
<img src=warning.gif alt="<b>NOTE</b>:">
<i>
Case (2b) above is now considered to be an error in the design. The
use of the first indirect primary base class as the derived class's
primary base does not save any space in the object, and will cause
some duplication of virtual function pointers in the additional copy
of the base class's virtual table.
<p>
The benefit is that using the derived class virtual pointer as the base
class virtual pointer will often save a load,
and no adjustment to the <code>this</code> pointer will be required for
calls to its virtual functions.
<p>
It was thought that 2b would allow the compiler to avoid
adjusting <code>this</code> in some cases, but this was incorrect, as
the <a href=#vcall>virtual function call algorithm</a> requires that
the function be looked up through a pointer to a class that defines
the function, not one that just inherits it. Removing that
requirement would not be a good idea, as there would then no longer be
a way to emit all thunks with the functions they jump to. For
instance, consider this example:
<blockquote><code><pre>
struct A { virtual void f(); };
struct B : virtual public A { int i; };
struct C : virtual public A { int j; };
struct D : public B, public C {};
</pre></code></blockquote>
<p>
When B and C are declared, A is a primary base in each case, so although
vcall offsets are allocated in the A-in-B and A-in-C vtables, no
<code>this</code> adjustment is required and no thunk is generated.
However, inside D objects, A is no longer a primary base of C, so if we
allowed calls to <code>C::f()</code> to use the copy of A's vtable in the C
subobject, we would need to adjust <code>this</code> from <code>C*</code>
to <code>B::A*</code>, which would require a third-party thunk. Since we
require that a call to <code>C::f()</code> first convert to
<code>A*</code>, C-in-D's copy of A's vtable is never referenced, so this
is not necessary.
</i>
<p>
<li> <h5> Allocation of Members Other Than Virtual Bases </h5>
<p>
For each data component D (first the primary base of C, if any, then
the non-primary, non-virtual direct base classes in declaration order,
then the non-static data members and unnamed bit-fields in declaration
order), allocate as follows:
<ol type=1>
<p>
<li> If D is a (possibly unnamed) bit-field whose declared type is
<code>T</code> and whose declared width is <code>n</code> bits:
<p>
There are two cases depending on <code>sizeof(T)</code>
and <code>n</code>:
<ol type=a>
<p>
<li>
If <code>sizeof(T)*8 >= n</code>,
the bit-field is allocated as required by the base C ABI,
subject to the constraint that a bit-field is never placed in the
tail padding of a base class of C.
<p>
If dsize(C) > 0, and the byte at offset dsize(C) - 1 is
partially filled by a bit-field, and that bit-field is also a
data member declared in C (but not in one of C's proper base
classes), the next available bits are the unfilled bits at
offset dsize(C) - 1. Otherwise, the next available bits are at
offset dsize(C).
<p>
Update align(C) to max (align(C), align(T)).
<p>
<li>
If <code>sizeof(T)*8 < n</code>,
let T' be the largest integral POD type with
<code>sizeof(T')*8 <= n</code>.
The bit-field is allocated starting at the next offset aligned
appropriately for T', with length n bits.
The first <code>sizeof(T)*8</code> bits are used to hold the
value of the bit-field,
followed by <code>n - sizeof(T)*8</code> bits of padding.
<p>
Update align(C) to max (align(C), align(T')).
</ol>