Benjamin P. Wood, 2017
This repository contains the implementation of FIB as presented in the OOPSLA 2017 paper:
"Instrumentation Bias for Dynamic Data Race Detection"
Benjamin P. Wood, Man Cao, Michael D. Bond, Dan Grossman
Proceedings of the ACM on Programming Languages. Vol. 1, OOPSLA,
Article 69 (October 2017), 31 pages. https://doi.org/10.1145/3133893
Presented at OOPSLA 2017: ACM SIGPLAN Conference on Object-Oriented
Programming Systems, Languages, and Applications, October 2017
The remainder of this README gives an overview of the FIB code.
Direct questions or feedback to Ben Wood.
- Code Overview
- Configurations
- Build
- Run
- Epochs, Epoch Maps/Read Maps, and Vector Clocks
- Access History Metadata
- Synchronization Metadata
- GC Support
- Fib Communication
This code ("DR") implements FIB and other versions of the FastTrack algorithm in Jikes RVM version 3.1.3, all based on the Octet code base released with [Bond, et al., OOPSLA 2013]. Selected parts of Octet's infrastructure for configuration and for inserting instrumentation are reused or extended in this code, but the Octet mechanism itself is not (despite its similarity).
Main code for the modified Jikes RVM is in ~/fibRvmRoot/rvm/src
.
The best way to navigate the code is through Eclipse, which we have
included in the VM image. FIXME
Code for FIB and the other FastTrack implementations is centered in
the org.jikesrvm.dr
package. Highlights:
- class
Dr
holds top-level data race detector constants - class
DrRuntime
holds runtime hooks that are called by instrumentation inserted by the compiler - package
fasttrack
contains the read and write analysis barriers for all FastTrack variants except FIB- All FastTrack implementations extend
FastTrack
; their read/write barriers are called from top-level analysis barriers inDrRuntime
.
- All FastTrack implementations extend
- package
fib
contains all support for FIB- class
FibFastTrack
contains read/write analysis barriers - class
FibComm
contains cross-thread communication infrastructure
- class
- package
instrument
contains instrumentation rules This package interacts with support for instrumentation infrastructure mainly inorg.jikesrvm.octet.{InstrDecisions,Octet}
andorg.jikesrvm.compilers.opt.PlainFastTrackOpt*
. - package
metadata
defines layouts and manipulations on analysis metadata for per-field access histories and synchronization tracking
Additional support is spread throughout existing Jikes RVM
infrastructure and is most easily explored by searching for references
to code in the dr
package. (In Eclipse, highlight a name and use
Control-Shift-G to search for references.)
Key parts of the code outside the dr
package include:
org.jikesrvm.scheduler.RVMThread
- Code for GC scanning and statics in
org.jikesrvm.mm.mmtk
- Object representation extensions, instrumentation support, and
runtime hooks also touch
org.jikesrvm.objectmodel
,org.jikesrvm.compilers
,org.jikesrvm.runtime.*Entrypoints
. - Minor extensions elsewhere.
Modifications to existing Jikes RVM code are generally labeled with comments using the identifiers "DR" or "FIB".
We use the Octet org.jikesrvm.config
system. All data race detector
configurations extend org.jikesrvm.config.dr.Base
. Here, each
relevant configuration is listed with the FastTrack implementation it
uses (from org.jikesrvm.dr.{fasttrack,fib}
), as well as the matching
label used for this configuration in the paper.
UnsyncArray
(usesUnsyncFastTrack
, matches FTUnsync in the paper): FastTrack with no synchronization of analysis barriers.NaiveCasArray
(usesNaiveSpinLockFastTrack
, matches FTLock): FastTrack with naive spin-locking on every analysis barrier.CasSimpleArray
(usesCASFastTrack
, matches FTCasSimple): FastTrack with coarse-grained CAS-based synchronization of analysis barriers.CasArray
(usesCASFastTrack
, matches FTCas): FastTrack with fine-grained CAS-based synchronization of analysis barriers.FibArray
(usesFibFastTrack
, matches FTFib): FastTrack with FIB for synchronization of analysis barriers.FibPreemptiveReadShareArray
(usesFibFastTrack
, matches FTFib+Share): FastTrack with FIB for synchronization of analysis barriers, with predictive read sharing.FibCasArray
(usesFibFastTrack
, matches FTFib+Adapt): FastTrack with FIB for synchronization of analysis barriers, falling back on the CAS algorithm for highly contended locations.FibPreemptiveReadShareObjectCasStaticArray
(usesFibFastTrack
, matches FTFib+Adapt(SO)+Share): FastTrack with FIB for synchronization of analysis barriers, with predictive read sharing, falling back on the CAS algorithm for highly contended locations.
"Array" in these configuration names refers to the style of EpochMap ("read map") implementation used in these configurations. In this paper, we use only "Array".
The FibArrayStats
, FibPreemptiveReadShareArrayStats
, and
FibCasArrayStats
configurations enable profiling of several events.
The configuration BaseConfig
will configure Jikes RVM without any
data race detection support.
This section assumes a 64-bit x86 Linux host. See the full user guide
http://www.jikesrvm.org/UserGuide/ or in ./userguide
for build
dependencies and to adjust additional Jikes RVM parameters.
To build a configuration org.jikesrvm.config.C
:
ant -Dhost.name=x86_64-linux -Dconfig.name=FastAdaptiveGenImmix -Dconfig.config-class=org.jikesrvm.config.C
For example:
-
to build Jikes RVM with FIB but no predictive read sharing:
ant -Dhost.name=x86_64-linux -Dconfig.name=FastAdaptiveGenImmix -Dconfig.config-class=org.jikesrvm.config.dr.FibArray
-
to build Jikes RVM without any data race detection support:
ant -Dhost.name=x86_64-linux -Dconfig.name=FastAdaptiveGenImmix -Dconfig.config-class=org.jikesrvm.config.BaseConfig
To run a configuration org.jikesrvm.config.dr.FibArray
:
./dist/FastAdaptiveGenImmix_dr.FibArray_x86_64-linux/rvm [JVM args] [app args]
For example, with avrora
from DaCapo 9.12:
./dist/FastAdaptiveGenImmix_dr.FibArray_x86_64-linux/rvm -X:vm:errorsFatal=true -X:vm:measureCompilation=true -X:vm:measureCompilationPhases=true -Xmx1200M -cp dacapo-9.12-bach.jar Harness -s small -c MMTkCallback -n 1 avrora
An epoch is a pair of thread ID t and logical clock c, notated variously as "c@t" or "Tt:c". It is a Word where the LSB is 1, the next 5 bits (configurable in configs descending from dr.Base) are the thread ID, and the remaining (high) bits are the clock. Epochs are manipulated with methods in the Epoch class.
EpochMaps ("read maps" in the paper), represent sets of last reads to locations. They are distinct from Vector Clocks (org.jikesrvm.dr.metadata.VC) which are attached to threads and locks to track synchronization. The basic EpochMap implementation used in all configurations for this paper is just a VC, both represented as simple arrays.
Every non-final non-volatile field in application/library code and every Java array element has its own access history, a pair of contiguous Words. Given an object (or null, as the zero address) and an offset, methods in the AccessHistory class manipulate the parts of a history.
Instance field access histories are allocated inline in the same object as the field. The offset of a field's access history within the object is accessible with RVMField.getDrOffset().
Static field access histories are allocated in the Statics numeric section. The offset of a field's access history is accessible with RVMField.getDrOffset().
Array element access histories are stored in a lazily allocated shadow WordArray, referenced by a header word in the data array. Metadata for element i starts at the 2i word in the shadow array.
Access histories of fields (or array elements) are arranged as two contiguous words in memory: a read word and a write word. The read word may hold an epoch (LSB = 1), a reference to an epoch map (LSB = 0, some non-LSB not zero), or the zero word (meaning never read or written). The write word always holds an epoch (LSB always 1) or the zero word (meaning never written).
Synchronization is tracked by vector clocks attached to each thread, lock, volatile field, and class:
- A reference to the vector clock for a thread is stored in RVMThread.drThreadVC.
- A reference to the vector clock for a scalar object's monitor is stored in a word in the object's header.
- A reference to the vector clock for an array object's monitor is stored in a word in the header of the array object's shadow array.
- A reference to the vector clock for a volatile field's monitor is stored in a word in the same object (for instance fields) or in the Statics reference section (for static fields).
- A reference to the vector clock for a class's initialization time is stored in RVMClass.drClassInitVC.
Note that our volatile instrumentation scheme requires post-barriers.
To support compact metadata for synchronization and access history, we make the following extensions to GC.
Access Histories:
- Access histories for static fields are stored in the Statics non-reference section, but they contain one maybe-reference word (the read word) and one non-reference word (the write word). Scanning uses a list of all static access histories to reach these.
- Scanning calls org.jikesrvm.dr.metadata.ObjectShadow.scan(...) on objects during scans. This method checks the LSB of read words stored in objects (including array shadows) to determine whether they are references or epochs, and processes the edges accordingly. It also scans the header word of an object, which may point to an array shadow or vector clock.
Synchronization:
- Thread vector clock references are stored in normal fields of org.jikesrvm.runtime.RVMThread.
- Scanning calls org.jikesrvm.dr.metadata.ObjectShadow.scan(...) on on all objects, which scans the header word holding the vector clock reference for objects that have been used as locks. (It also scans access histories, as described above.)
- Static volatile vector clock references appear as normal references in the Statics reference section and are scanned without special support.
- Instance volatile vector clock references are added to the reference maps for object types, so they are scanned automatically.
Communication in the Fib protocol is implemented in the FibRequestManager -- an FRM object is attached to each RVMThread.