[VL] Add a docker build job and reuse pre-built arrow libs #6826

Merged
merged 13 commits into from
Aug 15, 2024
Changes from 9 commits
48 changes: 48 additions & 0 deletions .github/workflows/docker_image.yml
@@ -0,0 +1,48 @@
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

name: Build and Push Docker Image
Contributor

Can you please add a doc on how to use this?

Contributor Author

@zhouyuan, I will do that. Thanks!
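
Until that doc exists, here is a rough sketch of how the published image might be used locally. The image tag and the script path come from this PR; the `/work` mount point and the helper name `gluten_build_cmd` are assumptions made for illustration, and the helper only prints the command instead of running it.

```shell
#!/usr/bin/env bash
# Sketch only: the image tag and script path are taken from this PR; the /work
# mount point and the helper name are hypothetical.
set -eu

IMAGE="apache/gluten:vcpkg-centos-7"

# Print the docker command that would run the static build inside the image.
# GITHUB_WORKSPACE is set because the CI script changes into that directory.
gluten_build_cmd() {
  src_dir="$1"
  printf 'docker run --rm -v "%s":/work -e GITHUB_WORKSPACE=/work %s bash /work/dev/ci-velox-buildstatic-centos-7.sh\n' \
    "$src_dir" "$IMAGE"
}

# Dry run: show the command for the current checkout.
gluten_build_cmd "$(pwd)"
```

Printing the command first makes it easy to review the mount and environment before anything is executed; the printed line can then be pasted into a shell on a machine with Docker installed.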


on:
  pull_request:
    paths:
      - '.github/workflows/docker_image.yml'
  schedule:
    - cron: '0 20 * * 0'

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout repository
        uses: actions/checkout@v2

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v1

      - name: Login to Docker Hub
        uses: docker/login-action@v2
        with:
          username: ${{ secrets.DOCKERHUB_USER }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}

      - name: Build and push Docker image
        uses: docker/build-push-action@v2
        with:
          context: .
          file: dev/vcpkg/docker/gha-centos-7.dockerfile
          push: true
          tags: apache/gluten:vcpkg-centos-7
3 changes: 1 addition & 2 deletions .github/workflows/velox_docker.yml
@@ -51,7 +51,7 @@ concurrency:
 jobs:
   build-native-lib-centos-7:
     runs-on: ubuntu-20.04
-    container: apache/gluten:gluten-vcpkg-builder_2024_08_05 # centos7 with dependencies installed
+    container: apache/gluten:vcpkg-centos-7
Member

Are we planning to do the same for the centos 8 build?

Contributor Author

@zhztheplayer, yes, it would be better to do this for the centos-8 build as well. At the least, the libs generated by the Arrow build (cpp/java) can be cached in the Docker image to speed up the CI job when the current folder-based cache misses.
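
To illustrate the idea, a build script could decide whether to rebuild Arrow based on what is already present in the image. This is only a sketch: the function name is hypothetical, `--build_arrow=ON` is assumed to be the counterpart of the `--build_arrow=OFF` value used in this PR, and the check looks only at the Maven repository, not at the native libs installed to system paths.

```shell
#!/usr/bin/env bash
# Sketch only: choose the --build_arrow value from whether Arrow jars already
# exist in the local Maven repository. Function name and the ON value are assumptions.
arrow_build_flag() {
  m2_repo="$1"
  if [ -d "$m2_repo/org/apache/arrow" ]; then
    echo "--build_arrow=OFF"   # reuse the artifacts baked into the image
  else
    echo "--build_arrow=ON"    # assumed counterpart of the OFF value in this PR
  fi
}

# Example: inspect the current user's Maven repository.
arrow_build_flag "${HOME}/.m2/repository"
```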

Contributor

Can we reuse the binary on centos 7?

Contributor Author

@zhouyuan, I will try this.

     steps:
       - uses: actions/checkout@v2
       - name: Generate cache key
@@ -63,7 +63,6 @@ jobs:
         with:
           path: |
             ./cpp/build/releases/
-            /root/.m2/repository/org/apache/arrow/
           key: cache-velox-build-centos-7-${{ hashFiles('./cache-key') }}
       - name: Build Gluten native libraries
         if: ${{ steps.cache.outputs.cache-hit != 'true' }}
4 changes: 1 addition & 3 deletions .github/workflows/velox_docker_cache.yml
@@ -30,7 +30,7 @@ concurrency:
 jobs:
   cache-native-lib-centos-7:
     runs-on: ubuntu-20.04
-    container: apache/gluten:gluten-vcpkg-builder_2024_08_05 # centos7 with dependencies installed
+    container: apache/gluten:vcpkg-centos-7
     steps:
       - uses: actions/checkout@v2
       - name: Generate cache key
@@ -43,7 +43,6 @@ jobs:
           lookup-only: true
           path: |
             ./cpp/build/releases/
-            /root/.m2/repository/org/apache/arrow/
           key: cache-velox-build-centos-7-${{ hashFiles('./cache-key') }}
       - name: Build Gluten native libraries
         if: steps.check-cache.outputs.cache-hit != 'true'
@@ -57,7 +56,6 @@ jobs:
         with:
           path: |
             ./cpp/build/releases/
-            /root/.m2/repository/org/apache/arrow/
           key: cache-velox-build-centos-7-${{ hashFiles('./cache-key') }}

   cache-native-lib-centos-8:
7 changes: 2 additions & 5 deletions dev/ci-velox-buildstatic-centos-7.sh
@@ -2,12 +2,9 @@

set -e

yum install sudo patch java-1.8.0-openjdk-devel -y
Contributor

This was originally kept in case someone runs this script from a clean centos 7 image.

Contributor Author

@zhouyuan, maybe we can just tell users to refer to the weekly build job when building from a clean image; centos-7/8/9 and ubuntu-20.04/22.04 are covered in that job. I will document this. Thanks!
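
As a stopgap for users on a clean image, a small pre-flight check could point them at the packages this script used to install. A minimal sketch, assuming that `sudo`, `patch`, and `javac` are the tools provided by the packages in the removed `yum install` line; the function name `missing_cmds` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: report which required tools are not on PATH.
# The function name is hypothetical; package names come from the removed line.
missing_cmds() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# sudo, patch and javac come from sudo, patch and java-1.8.0-openjdk-devel.
if [ -n "$(missing_cmds sudo patch javac)" ]; then
  echo "Missing tools; on a clean CentOS 7 image run:"
  echo "  yum install -y sudo patch java-1.8.0-openjdk-devel"
fi
```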

cd $GITHUB_WORKSPACE/ep/build-velox/src
./get_velox.sh
source /opt/rh/devtoolset-9/enable
cd $GITHUB_WORKSPACE/
source ./dev/vcpkg/env.sh
sed -i '/^headers/d' ep/build-velox/build/velox_ep/CMakeLists.txt
export NUM_THREADS=4
./dev/builddeps-veloxbe.sh --build_tests=OFF --build_benchmarks=OFF --enable_s3=ON --enable_gcs=ON --enable_hdfs=ON --enable_abfs=ON
./dev/builddeps-veloxbe.sh --build_tests=OFF --build_benchmarks=OFF --build_arrow=OFF --enable_s3=ON \
--enable_gcs=ON --enable_hdfs=ON --enable_abfs=ON
2 changes: 1 addition & 1 deletion dev/vcpkg/Makefile
@@ -31,7 +31,7 @@ docker-image:

 docker-image-gha:
 	docker build \
-		--file docker/Dockerfile.gha \
+		--file docker/gha-centos-7.dockerfile \
 		--tag "$(DOCKER_IMAGE)" \
 		--build-arg HTTPS_PROXY="" \
 		--build-arg HTTP_PROXY="" \
@@ -11,7 +11,7 @@ RUN sed -i \
     -e 's/mirror\.centos\.org/vault.centos.org/' \
     /etc/yum.repos.d/CentOS-SCLo-scl-rh.repo

-RUN yum install -y git patch wget sudo
+RUN yum install -y git patch wget sudo java-1.8.0-openjdk-devel

 # build
 RUN git clone --depth=1 https://github.com/apache/incubator-gluten /opt/gluten
@@ -22,4 +22,7 @@ RUN cd /opt/gluten && bash ./dev/vcpkg/setup-build-depends.sh

# vcpkg env
ENV VCPKG_BINARY_SOURCES=clear;files,/var/cache/vcpkg,readwrite
RUN source /opt/rh/devtoolset-9/enable && cd /opt/gluten && source dev/vcpkg/env.sh

# Build arrow, then install the native libs to system paths and jar package to .m2/ directory.
RUN cd /opt/gluten && source ./dev/vcpkg/env.sh && bash ./dev/builddeps-veloxbe.sh build_arrow && \
rm -rf ep/_ep/ && rm -rf /tmp/velox-deps/
3 changes: 3 additions & 0 deletions dev/vcpkg/ports/gflags/portfile.cmake
@@ -22,6 +22,9 @@ vcpkg_configure_cmake(
         -DBUILD_gflags_nothreads_LIB:BOOL=OFF
         -DGFLAGS_USE_TARGET_NAMESPACE:BOOL=ON
         -DCMAKE_DEBUG_POSTFIX=d
+        -DGFLAGS_BUILD_STATIC_LIBS:BOOL=ON
+        -DGFLAGS_BUILD_SHARED_LIBS:BOOL=ON
+        -DGFLAGS_BUILD_gflags_LIB:BOOL=ON
 )

 vcpkg_install_cmake()