MapD Core is an in-memory, column store, SQL relational database that was designed from the ground up to run on GPUs.
This project is licensed under the Apache License, Version 2.0.
The repository includes a number of third party packages provided under separate licenses. Details about these packages and their respective licenses are at ThirdParty/licenses/index.md.
The standard build process for this project downloads the Community Edition of the MapD Immerse visual analytics client. This version of MapD Immerse is governed by a separate license agreement, included in the file EULA-CE.txt, and may only be used for non-commercial purposes.
In order to clarify the intellectual property license granted with Contributions from any person or entity, MapD must have a Contributor License Agreement ("CLA") on file that has been signed by each Contributor, indicating agreement to the Contributor License Agreement. After you make a pull request, a bot will notify you if a signed CLA is required and provide instructions for how to sign it. Please read the agreement carefully before signing and keep a copy for your records.
If this is your first time building MapD Core, install the dependencies mentioned in the Dependencies section below.
MapD uses CMake for its build system.
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=debug ..
make -j 4
The following cmake/ccmake options can enable/disable different features:
- -DCMAKE_BUILD_TYPE=release: build type and compiler options to use. Options: Debug, Release, RelWithDebInfo, MinSizeRel, and unset.
- -DENABLE_CUDA=off: disable CUDA. Default: on.
- -DMAPD_IMMERSE_DOWNLOAD=on: download the latest master build of Immerse / mapd2-frontend. Default: on.
- -DMAPD_DOCS_DOWNLOAD=on: download the latest master build of the documentation / docs.mapd.com. Default: off. Note: this is a >50MB download.
- -DPREFER_STATIC_LIBS=on: static link dependencies, if available. Default: off.
- -DENABLE_ARROW_CONVERTER=on: enable alpha support for the GPU Data Frame, based on a subset of the Apache Arrow specification. Default: off.
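As a hedged illustration of how these options combine, the sketch below assembles a configure command for a chosen build type and CUDA setting and prints it rather than running it. The helper name print_configure_cmd is hypothetical and not part of the repository; only the two flags it emits come from the list above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): print a cmake configure
# command for the given build type and CUDA setting without executing it.
print_configure_cmd() {
  build_type="${1:-Release}"   # Debug, Release, RelWithDebInfo, or MinSizeRel
  enable_cuda="${2:-on}"       # on (the documented default) or off
  printf 'cmake -DCMAKE_BUILD_TYPE=%s -DENABLE_CUDA=%s ..\n' "$build_type" "$enable_cuda"
}

# Example: a CPU-only release build, meant to be run from inside the build directory.
print_configure_cmd Release off
```

Printing the command first makes it easy to review the flag combination before pasting it into a fresh build directory.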
MapD Core uses Google Test as its main testing framework. Tests reside under the Tests directory.
The sanity_tests target runs the most common tests. If using Makefiles to build, the tests may be run using:
make sanity_tests
AddressSanitizer can be activated by setting the ENABLE_ASAN CMake flag in a fresh build directory. At this time CUDA must also be disabled, and Calcite must be run in standalone/server mode. In an empty build directory, run CMake and compile:
mkdir build && cd build
cmake -DENABLE_ASAN=on -DENABLE_CUDA=off ..
make -j 4
In a separate terminal start Calcite in standalone mode from the build directory:
java -jar bin/mapd-1.0-SNAPSHOT-jar-with-dependencies.jar --data=Tests/tmp
Finally run the tests:
export ASAN_OPTIONS=alloc_dealloc_mismatch=0:handle_segv=0
make sanity_tests
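ASAN_OPTIONS is a colon-separated list of key=value pairs, which becomes hard to read as it grows. The following is a minimal sketch, assuming a POSIX shell, that builds the same value from separate arguments. The helper name join_colon is hypothetical; the two options are the ones used above.

```shell
#!/bin/sh
# Hypothetical helper: join all arguments with ':' (the separator ASAN_OPTIONS uses).
# The function body is a subshell, so the IFS change does not leak to the caller.
join_colon() (
  IFS=:
  printf '%s\n' "$*"
)

ASAN_OPTIONS="$(join_colon alloc_dealloc_mismatch=0 handle_segv=0)"
export ASAN_OPTIONS
echo "$ASAN_OPTIONS"
```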
ThreadSanitizer can be activated by setting the ENABLE_TSAN CMake flag in a fresh build directory. At this time CUDA must also be disabled, and Calcite must be run in standalone/server mode. In an empty build directory, run CMake and compile:
mkdir build && cd build
cmake -DENABLE_TSAN=on -DENABLE_CUDA=off ..
make -j 4
In a separate terminal start Calcite in standalone mode from the build directory:
java -jar bin/mapd-1.0-SNAPSHOT-jar-with-dependencies.jar --data=Tests/tmp
Finally run the tests:
make sanity_tests
The startmapd wrapper script may be used to start MapD Core in a testing environment. This script performs the following tasks:
- initializes the data storage directory via initdb, if required
- starts the main MapD Core server, mapd_server
- starts the MapD Core web server, mapd_web_server, for serving MapD Immerse
- offers to download and import a sample dataset, using the insert_sample_data script
- attempts to open MapD Immerse in your web browser
Assuming you are in the build directory, and it is a subdirectory of the mapd-core repository, startmapd may be run by:
../startmapd
It is assumed that the following commands are run from inside the build directory.
Initialize the data storage directory. This command only needs to be run once.
mkdir data && ./bin/initdb data
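Because initialization only needs to happen once, a setup script may want to guard it. The sketch below is one possible guard, not what startmapd actually does: it treats an existing directory as already initialized, which is an assumption. The function name init_data_dir and the INITDB override variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical guard: run initdb only when the data directory does not exist yet.
# Assumption: an existing directory means initialization already happened.
# INITDB can override the initdb path; ./bin/initdb is the path used in this README.
init_data_dir() {
  data_dir="${1:-data}"
  initdb="${INITDB:-./bin/initdb}"
  if [ -d "$data_dir" ]; then
    echo "skip: $data_dir already exists"
    return 0
  fi
  mkdir -p "$data_dir" && "$initdb" "$data_dir"
}
```

From inside the build directory, calling init_data_dir data would then be safe to repeat.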
Start the MapD Core server:
./bin/mapd_server
In a new terminal, start the MapD Core web server:
./bin/mapd_web_server
If desired, insert a sample dataset by running the insert_sample_data script in a new terminal:
../insert_sample_data
You can now start using the database. The mapdql utility may be used to interact with the database from the command line:
./bin/mapdql -p HyperInteractive
where HyperInteractive is the default password. The default user mapd is assumed if not provided.
You can also interact with the database using the web-based MapD Immerse frontend by visiting the web server's default port of 9092:
http://localhost:9092
Note: usage of MapD Immerse is governed by a separate license agreement, provided under EULA-CE.txt. The version bundled with this project may only be used for non-commercial purposes.
A .clang-format style configuration, based on the Chromium style guide, is provided at the top level of the repository. Please format your code using a recent version (3.8+) of ClangFormat before submitting.
To use:
clang-format -i File.cpp
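When a change touches several files, it can help to select only the C++ sources before formatting them. The filter below is a small sketch; the name only_cpp_sources and the extension list are assumptions, not a project convention.

```shell
#!/bin/sh
# Hypothetical filter: read paths on stdin and keep only those that look like
# C++/CUDA sources. The extension list is an assumption; adjust as needed.
only_cpp_sources() {
  grep -E '\.(h|hpp|cc|cpp|cu)$' || true   # "|| true": no match is not an error
}

# Example with a fixed file list; prints Foo.cpp and Bar.h, one per line.
printf '%s\n' Foo.cpp README.md Bar.h | only_cpp_sources
```

In a checkout, the output of something like git diff --name-only could be piped through this filter, and each resulting path passed to clang-format -i.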
Contributed code should compile without generating warnings by recent compilers (gcc 4.9, gcc 5.3, clang 3.8) on most Linux distributions. Changes to the code should follow the C++ Core Guidelines.
MapD has the following dependencies:
Package | Min Version | Required |
---|---|---|
CMake | 3.3 | yes |
LLVM | 3.8 | yes |
GCC | 4.9 | no, if building with clang |
Boost | 1.57 | yes |
OpenJDK | 1.7 | yes |
CUDA | 7.5 | yes, if compiling with GPU support |
gperftools | | yes |
gdal | | yes |
Arrow | 0.4.1 | no |
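To compare an installed version against the minimums in the table, a small version comparison is enough. The sketch below assumes a sort that supports -V (version sort), such as the one in GNU coreutils; the function name version_ge is hypothetical, and the example uses literal version strings rather than querying any tool.

```shell
#!/bin/sh
# Hypothetical helper: succeed when version $1 is at least version $2.
# Assumes a sort that supports -V (version sort), e.g. GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example with literal version strings; the 3.3 minimum comes from the table above.
if version_ge 3.10.2 3.3; then
  echo "CMake 3.10.2 satisfies the 3.3 minimum"
fi
```

Version sort matters here because a plain string comparison would rank 3.10 below 3.3.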
Dependencies for mapd_web_server and other Go utils are in ThirdParty/go. See ThirdParty/go/src/mapd/vendor/README.md for instructions on how to add new deps.
MapD Core requires a number of dependencies which are not provided in the common CentOS/RHEL package repositories. The script scripts/mapd-deps-linux.sh is provided to automatically build and install these dependencies. A prebuilt package containing these dependencies is also provided for CentOS 7 (x86_64).
First install the basic build tools:
sudo yum groupinstall -y "Development Tools"
sudo yum install -y \
zlib-devel \
epel-release \
libssh \
openssl-devel \
ncurses-devel \
git \
maven \
java-1.8.0-openjdk-devel \
java-1.8.0-openjdk-headless \
gperftools \
gperftools-devel \
gperftools-libs \
environment-modules
Next download and install the prebuilt dependencies:
curl -OJ https://internal-dependencies.mapd.com/mapd-deps/deploy.sh
sudo bash deploy.sh
These dependencies will be installed to a directory under /usr/local/mapd-deps. The deploy.sh script also installs Environment Modules in order to simplify managing the required environment variables. Log out and log back in after running the deploy.sh script in order to activate the Environment Modules command, module.
The mapd-deps environment module is disabled by default. To activate it for your current session, run:
module load mapd-deps
To disable the mapd-deps module:
module unload mapd-deps
WARNING: The mapd-deps package contains newer versions of packages such as GCC and ncurses which might not be compatible with the rest of your environment. Make sure to disable the mapd-deps module before compiling other packages.
Instructions for installing CUDA are below.
It is preferred, but not necessary, to install CUDA and the NVIDIA drivers from the .rpm packages, following the instructions provided by NVIDIA. The rpm (network) method (preferred) will ensure you always have the latest stable drivers, while the rpm (local) method does not require Internet access during installation.
The .rpm method requires DKMS to be installed, which is available from the Extra Packages for Enterprise Linux repository:
sudo yum install epel-release
Be sure to reboot after installing in order to activate the NVIDIA drivers.
scripts/mapd-deps-linux.sh generates two files with the appropriate environment variables: mapd-deps-<date>.sh (for sourcing from your shell config) and mapd-deps-<date>.modulefile (for use with Environment Modules, yum package environment-modules). These files are placed in the mapd-deps install directory, usually /usr/local/mapd-deps/<date>. Either of these may be used to configure your environment: the .sh may be sourced in your shell config; the .modulefile needs to be moved to the modulespath.
The Java server lib directory containing libjvm.so must also be added to your LD_LIBRARY_PATH. Add one of the following to your shell config:
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/jvm/jre/lib/amd64/server
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/jvm/java-1.8.0/jre/lib/amd64/server
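Which of the two paths exists depends on the installed Java packages. As a hedged alternative to picking one by hand, the sketch below appends a directory only if it exists and is not already listed. The helper name append_ld_path is hypothetical; the two paths are the ones shown above.

```shell
#!/bin/sh
# Hypothetical helper: append a directory to LD_LIBRARY_PATH only when it
# exists and is not already present, then export the variable.
append_ld_path() {
  dir="$1"
  [ -d "$dir" ] || return 0            # silently skip missing directories
  case ":${LD_LIBRARY_PATH:-}:" in
    *":$dir:"*) ;;                     # already listed, nothing to do
    *) LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$dir" ;;
  esac
  export LD_LIBRARY_PATH
}

# Try both candidate locations; whichever exists is added once.
append_ld_path /usr/lib/jvm/jre/lib/amd64/server
append_ld_path /usr/lib/jvm/java-1.8.0/jre/lib/amd64/server
```

Skipping duplicates keeps the variable from growing each time the shell config is sourced.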
scripts/mapd-deps-osx.sh is provided to automatically install and/or update Homebrew and use it to install all dependencies. Please make sure macOS is completely up to date and Xcode is installed before running it. Xcode can be installed from the App Store.
mapd-deps-osx.sh will automatically install CUDA via Homebrew and add the correct environment variables to ~/.bash_profile.
mapd-deps-osx.sh will automatically install Java and Maven via Homebrew and add the correct environment variables to ~/.bash_profile.
Most build dependencies required by MapD Core are available via APT. Thrift, Blosc, and Folly must be built manually. The following will install all required dependencies and build the ones not available in the APT repositories.
sudo apt update
sudo apt install -y \
build-essential \
cmake \
cmake-curses-gui \
git \
clang \
clang-format \
llvm \
llvm-dev \
libboost-all-dev \
libgoogle-glog-dev \
golang \
libssl-dev \
libevent-dev \
default-jre \
default-jre-headless \
default-jdk \
default-jdk-headless \
maven \
libncurses5-dev \
binutils-dev \
google-perftools \
libdouble-conversion-dev \
libevent-dev \
libgdal-dev \
libgflags-dev \
libgoogle-perftools-dev \
libiberty-dev \
libjemalloc-dev \
liblz4-dev \
liblzma-dev \
libsnappy-dev \
zlib1g-dev \
autoconf \
autoconf-archive
sudo apt build-dep -y thrift-compiler
VERS=0.10.0
wget http://apache.claz.org/thrift/$VERS/thrift-$VERS.tar.gz
tar xvf thrift-$VERS.tar.gz
pushd thrift-$VERS
./configure \
--with-lua=no \
--with-python=no \
--with-php=no \
--with-ruby=no \
--prefix=/usr/local/mapd-deps
make -j $(nproc)
sudo make install
popd
VERS=1.11.3
wget --continue https://github.com/Blosc/c-blosc/archive/v$VERS.tar.gz
tar xvf v$VERS.tar.gz
BDIR="c-blosc-$VERS/build"
rm -rf "$BDIR"
mkdir -p "$BDIR"
pushd "$BDIR"
cmake \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_PREFIX=/usr/local/mapd-deps \
-DBUILD_BENCHMARKS=off \
-DBUILD_TESTS=off \
-DPREFER_EXTERNAL_SNAPPY=off \
-DPREFER_EXTERNAL_ZLIB=off \
-DPREFER_EXTERNAL_ZSTD=off \
..
make -j $(nproc)
sudo make install
popd
VERS=2017.04.10.00
wget --continue https://github.com/facebook/folly/archive/v$VERS.tar.gz
tar xvf v$VERS.tar.gz
pushd folly-$VERS/folly
/usr/bin/autoreconf -ivf
./configure --prefix=/usr/local/mapd-deps
make -j $(nproc)
sudo make install
popd
VERS=1.21-45
wget --continue https://github.com/jarro2783/bisonpp/archive/$VERS.tar.gz
tar xvf $VERS.tar.gz
pushd bisonpp-$VERS
./configure --prefix=/usr/local/mapd-deps
make -j $(nproc)
sudo make install
popd
VERS=0.3.0
wget --continue https://github.com/apache/arrow/archive/apache-arrow-$VERS.tar.gz
tar -xf apache-arrow-$VERS.tar.gz
mkdir -p arrow-apache-arrow-$VERS/cpp/build
pushd arrow-apache-arrow-$VERS/cpp/build
cmake \
-DCMAKE_BUILD_TYPE=Release \
-DARROW_BUILD_SHARED=off \
-DARROW_BUILD_STATIC=on \
-DCMAKE_INSTALL_PREFIX=/usr/local/mapd-deps \
-DARROW_BOOST_USE_SHARED=off \
-DARROW_JEMALLOC_USE_SHARED=off \
..
make -j $(nproc)
sudo make install
popd
## Ubuntu 17.04
Same as 16.10, with the following additions (we recommend using the proprietary NVIDIA drivers supplied by Canonical):
sudo apt install -y \
nvidia-cuda-toolkit \
ccache \
libglu1-mesa-dev \
libglewmx-dev \
gcc-5 \
g++-5 \
libldap2-dev \
flex-old
It is preferred, but not necessary, to install CUDA and the NVIDIA drivers from the .deb packages, following the instructions provided by NVIDIA. The deb (network) method (preferred) will ensure you always have the latest stable drivers, while the deb (local) method does not require Internet access during installation.
Be sure to reboot after installing in order to activate the NVIDIA drivers.
The CUDA, Java, and mapd-deps lib directories need to be added to LD_LIBRARY_PATH; the CUDA and mapd-deps bin directories need to be added to PATH. The easiest way to do so is by creating a new file named /etc/profile.d/mapd-deps.sh containing the following:
LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
LD_LIBRARY_PATH=/usr/lib/jvm/default-java/jre/lib/amd64/server:$LD_LIBRARY_PATH
LD_LIBRARY_PATH=/usr/local/mapd-deps/lib:$LD_LIBRARY_PATH
LD_LIBRARY_PATH=/usr/local/mapd-deps/lib64:$LD_LIBRARY_PATH
PATH=/usr/local/cuda/bin:$PATH
PATH=/usr/local/mapd-deps/bin:$PATH
export LD_LIBRARY_PATH PATH
The following uses yaourt to install packages from the Arch User Repository.
yaourt -S \
git \
cmake \
boost \
google-glog \
extra/jdk8-openjdk \
clang \
llvm \
thrift \
go
VERS=1.21-45
wget --continue https://github.com/jarro2783/bisonpp/archive/$VERS.tar.gz
tar xvf $VERS.tar.gz
pushd bisonpp-$VERS
./configure
make -j $(nproc)
sudo make install
popd
CUDA and the NVIDIA drivers may be installed using the following.
yaourt -S \
linux-headers \
cuda \
nvidia
Be sure to reboot after installing in order to activate the NVIDIA drivers.