merge upstream #46

Merged: 103 commits, Nov 27, 2024

Commits (103)
f245cc2
scripts : fix missing key in compare-llama-bench.py (#10332)
ggerganov Nov 16, 2024
bcdb7a2
server: (web UI) Add samplers sequence customization (#10255)
MaggotHATE Nov 16, 2024
8ee0d09
make : auto-determine dependencies (#0)
ggerganov Nov 16, 2024
db4cfd5
llamafile : fix include path (#0)
ggerganov Nov 16, 2024
4e54be0
llama/ex: remove --logdir argument (#10339)
JohannesGaessler Nov 16, 2024
0fff7fd
docs : vulkan build instructions to use git bash mingw64 (#10303)
FirstTimeEZ Nov 16, 2024
5c9a8b2
scripts : update sync
ggerganov Nov 16, 2024
8a43e94
ggml: new optimization interface (ggml/988)
JohannesGaessler Nov 16, 2024
68fcb47
ggml : fix compile warnings (#0)
ggerganov Nov 16, 2024
84274a1
tests : remove test-grad0
ggerganov Nov 16, 2024
a4200ca
make : add ggml-opt (#0)
ggerganov Nov 16, 2024
5d9e599
ggml : adapt AMX to tensor->grad removal (#0)
ggerganov Nov 16, 2024
24203e9
ggml : inttypes.h -> cinttypes (#0)
ggerganov Nov 16, 2024
eda7e1d
ggml : fix possible buffer use after free in sched reserve (#9930)
slaren Nov 17, 2024
467576b
CMake: default to -arch=native for CUDA build (#10320)
JohannesGaessler Nov 17, 2024
c3ea58a
CUDA: remove DMMV, consolidate F16 mult mat vec (#10318)
JohannesGaessler Nov 17, 2024
a431782
ggml : fix undefined reference to 'getcpu' (#10354)
FirstTimeEZ Nov 17, 2024
cf32a9b
metal : refactor kernel args into structs (#10238)
ggerganov Nov 17, 2024
20a780c
gitignore : ignore local run scripts [no ci]
ggerganov Nov 17, 2024
be5cacc
llama : only use default buffer types for the KV cache (#10358)
slaren Nov 17, 2024
ce2e59b
CMake: fix typo in comment [no ci] (#10360)
JohannesGaessler Nov 17, 2024
76e9e58
CUDA: fix MMV kernel being used for FP16 src1 (#10357)
JohannesGaessler Nov 17, 2024
75207b3
docker: use GGML_NATIVE=OFF (#10368)
JohannesGaessler Nov 17, 2024
9b75f03
Vulkan: Fix device info output format specifiers (#10366)
0cc4m Nov 18, 2024
2eb76b2
flake.lock: Update (#10346)
ggerganov Nov 18, 2024
f139d2e
vulkan: remove use of null initializer (#10372)
jeffbolznv Nov 18, 2024
531cb1c
Skip searching root path for cross-compile builds (#10383)
bandoti Nov 18, 2024
d3481e6
cuda : only use native when supported by cmake (#10389)
slaren Nov 18, 2024
557924f
sycl: Revert MUL_MAT_OP support changes (#10385)
Alcpz Nov 19, 2024
b3e5859
vulkan: Optimize soft_max (#10301)
jeffbolznv Nov 19, 2024
2a1507c
sycl : Add option to set the SYCL architecture for all targets (#10266)
Rbiessy Nov 19, 2024
a88ad00
llama : add OLMo November 2024 support (#10394)
2015aroras Nov 19, 2024
8e752a7
llama : add check for KV cache shifts (#10401)
ggerganov Nov 19, 2024
3ee6382
cuda : fix CUDA_FLAGS not being applied (#10403)
slaren Nov 19, 2024
2a11b6b
Add required ggml-base and backend libs to cmake pkg (#10407)
bandoti Nov 19, 2024
342397d
cmake: force MSVC compiler charset to utf-8 (#9989)
shou692199 Nov 19, 2024
12b0ad9
metal : add `GGML_UNARY_OP_ELU` kernel (ggml/1018)
PABannier Nov 18, 2024
611fabd
metal : fix offset integer overflows in im2col (ggml/1015)
pminev Nov 18, 2024
9fe0fb0
sync : ggml
ggerganov Nov 19, 2024
42ae10b
add cmake rvv support (#10411)
lhpqaq Nov 19, 2024
3952a22
Fix missing file renames in Makefile due to changes in commit ae8de6d…
avdg Nov 19, 2024
ad21c9e
update rel to 4040 (#10395)
NeoZhangJianyu Nov 20, 2024
1bacb9f
vulkan: further optimize mul_mat_vec using larger loads (#10387)
jeffbolznv Nov 20, 2024
8fd4b7f
vulkan: copy iq4_nl LUT into shared memory (#10409)
jeffbolznv Nov 20, 2024
fab5d30
llama : add .clang-format file (#10415)
slaren Nov 20, 2024
f95caa7
cmake: add link dependencies to cmake find pkg (#10433)
bandoti Nov 20, 2024
9abe9ee
vulkan: predicate max operation in soft_max shaders/soft_max (#10437)
jeffbolznv Nov 20, 2024
02e4eaf
ggml-opt: fix data corruption (ggml/1022)
JohannesGaessler Nov 20, 2024
59b9172
ggml/sched : do not skip views in pre-assignments
slaren Nov 20, 2024
87a533b
sync : ggml
ggerganov Nov 21, 2024
1bb30bf
llama : handle KV shift for recurrent models (#10402)
ggerganov Nov 21, 2024
a5e4759
cuda : optimize argmax (#10441)
slaren Nov 21, 2024
c18610b
CANN: Support Ascend310P to accelerate F32 and F16 Model (#10216)
leo-pony Nov 22, 2024
599b3e0
GitHub: ask for more info in issue templates (#10426)
JohannesGaessler Nov 22, 2024
6dfcfef
ci: Update oneAPI runtime dll packaging (#10428)
shou692199 Nov 22, 2024
55ed008
ggml : do not use ARM features not included in the build (#10457)
slaren Nov 23, 2024
96fa2c5
fix gguf-py: Conversion error when multiple licenses are configured …
mmngays Nov 24, 2024
9336db4
convert : XLMRoberta Type Vocab Size (#10458)
gabe-l-hart Nov 24, 2024
dc39012
llama : fix op mul check with command-r-plus (#10476)
slaren Nov 24, 2024
cce5a90
flake.lock: Update (#10470)
ggerganov Nov 24, 2024
d9d54e4
speculative : refactor and add a simpler example (#10362)
ggerganov Nov 25, 2024
5a89877
[SYCL] Fix building Win package for oneAPI 2025.0 update (#10483)
NeoZhangJianyu Nov 25, 2024
b756441
metal : minor code formatting
ggerganov Nov 25, 2024
f6d12e7
tests : fix compile warning
ggerganov Nov 25, 2024
5931c1f
ggml : add support for dynamic loading of backends (#10469)
slaren Nov 25, 2024
9ca2e67
server : add speculative decoding support (#10455)
ggerganov Nov 25, 2024
a9a678a
Add download chat feature to server chat (#10481)
brucepro Nov 25, 2024
1f92225
Github: update issue templates [no ci] (#10489)
JohannesGaessler Nov 25, 2024
10bce04
llama : accept a list of devices to use to offload a model (#10497)
slaren Nov 25, 2024
80acb7b
Rename Olmo1124 to Olmo2 (#10500)
2015aroras Nov 25, 2024
106964e
metal : enable mat-vec kernels for bs <= 4 (#10491)
ggerganov Nov 25, 2024
47f931c
server : enable cache_prompt by default (#10501)
ggerganov Nov 25, 2024
9fd8c26
server : add more information about error (#10455)
ggerganov Nov 25, 2024
50d5cec
ci : build docker images only once daily (#10503)
slaren Nov 25, 2024
0cc6375
Introduce llama-run (#10291)
ericcurtin Nov 25, 2024
0eb4e12
vulkan: Fix a vulkan-shaders-gen argument parsing error (#10484)
sparkleholic Nov 26, 2024
7066b4c
CANN: RoPE and CONCAT operator optimization (#10488)
noemotiovon Nov 26, 2024
9a4b79b
CANN: Improve the Inferencing Performance for Ascend NPU Device (#10454)
shen-shanshan Nov 26, 2024
811872a
speculative : simplify the implementation (#10504)
ggerganov Nov 26, 2024
84e1c33
server : fix parallel speculative decoding (#10513)
ggerganov Nov 26, 2024
25669aa
ggml-cpu: cmake add arm64 cpu feature check for macos (#10487)
chaxu01 Nov 26, 2024
c6807b3
ci : add ubuntu cuda build, build with one arch on windows (#10456)
slaren Nov 26, 2024
7db3846
ci : publish the docker images created during scheduled runs (#10515)
slaren Nov 26, 2024
ab96610
cmake : enable warnings in llama (#10474)
ggerganov Nov 26, 2024
0bbd226
restore the condition to build & update package on merge (#10507)
NeoZhangJianyu Nov 26, 2024
45abe0f
server : replace behave with pytest (#10416)
ngxson Nov 26, 2024
904109e
vulkan: fix group_norm (#10496)
jeffbolznv Nov 26, 2024
249cd93
mtgpu: Add MUSA_DOCKER_ARCH in Dockerfiles && update cmake and make (…
yeahdongcn Nov 26, 2024
be0e350
Fix HIP flag inconsistency & build docs (#10524)
tristandruyen Nov 26, 2024
30ec398
llama : disable warnings for 3rd party sha1 dependency (#10527)
slaren Nov 26, 2024
5a349f2
ci : remove nix workflows (#10526)
slaren Nov 26, 2024
de50973
Add OLMo 2 model in docs (#10530)
2015aroras Nov 26, 2024
c9b00a7
ci : fix cuda releases (#10532)
slaren Nov 26, 2024
4a57d36
vulkan: optimize Q2_K and Q3_K mul_mat_vec (#10459)
jeffbolznv Nov 27, 2024
71a6498
vulkan: skip integer div/mod in get_offsets for batch_idx==0 (#10506)
jeffbolznv Nov 27, 2024
249a790
vulkan: further optimize q5_k mul_mat_vec (#10479)
jeffbolznv Nov 27, 2024
5b3466b
vulkan: Handle GPUs with less shared memory (#10468)
jeffbolznv Nov 27, 2024
c31ed2a
vulkan: define all quant data structures in types.comp (#10440)
jeffbolznv Nov 27, 2024
9150f8f
Do not include arm_neon.h when compiling CUDA code (ggml/1028)
frankier Nov 26, 2024
fee824a
sync : ggml
ggerganov Nov 27, 2024
9e2301f
metal : fix group_norm support condition (#0)
ggerganov Nov 27, 2024
46c69e0
ci : faster CUDA toolkit installation method and use ccache (#10537)
slaren Nov 27, 2024
289e208
Merge branch 'layla-build' into merge
l3utterfly Nov 27, 2024
Files changed
161 changes: 161 additions & 0 deletions .clang-format
@@ -0,0 +1,161 @@
---
Language: Cpp
AlignAfterOpenBracket: Align
AlignArrayOfStructures: Left
AlignConsecutiveAssignments: AcrossComments
AlignConsecutiveBitFields: AcrossComments
AlignConsecutiveDeclarations: AcrossComments
AlignConsecutiveMacros: AcrossComments
# AlignConsecutiveShortCaseStatements: AcrossComments
AlignEscapedNewlines: Left # LeftWithLastLine
AlignOperands: Align
AlignTrailingComments:
Kind: Always
OverEmptyLines: 1
AllowAllArgumentsOnNextLine: true
AllowAllParametersOfDeclarationOnNextLine: false
# AllowBreakBeforeNoexceptSpecifier: OnlyWithParen
AllowShortBlocksOnASingleLine: Never
AllowShortCaseLabelsOnASingleLine: false
AllowShortFunctionsOnASingleLine: Inline
AllowShortIfStatementsOnASingleLine: Never
AllowShortLambdasOnASingleLine: Inline
AllowShortLoopsOnASingleLine: false
AlwaysBreakBeforeMultilineStrings: true
BinPackArguments: true
BinPackParameters: true # OnePerLine
BitFieldColonSpacing: Both
BreakBeforeBraces: Custom # Attach
BraceWrapping:
AfterCaseLabel: true
AfterClass: false
AfterControlStatement: false
AfterEnum: false
AfterFunction: false
AfterNamespace: false
AfterObjCDeclaration: false
AfterStruct: false
AfterUnion: false
AfterExternBlock: false
BeforeCatch: false
BeforeElse: false
BeforeLambdaBody: false
BeforeWhile: false
IndentBraces: false
SplitEmptyFunction: false
SplitEmptyRecord: false
SplitEmptyNamespace: false
# BreakAdjacentStringLiterals: true
BreakAfterAttributes: Never
BreakBeforeBinaryOperators: None
BreakBeforeInlineASMColon: OnlyMultiline
BreakBeforeTernaryOperators: false
# BreakBinaryOperations: Never
BreakConstructorInitializers: AfterColon
# BreakFunctionDefinitionParameters: false
BreakInheritanceList: AfterComma
BreakStringLiterals: true
# BreakTemplateDeclarations: Yes
ColumnLimit: 120
CommentPragmas: '^ IWYU pragma:'
CompactNamespaces: false
ConstructorInitializerIndentWidth: 4
ContinuationIndentWidth: 4
Cpp11BracedListStyle: false
DerivePointerAlignment: false
DisableFormat: false
EmptyLineBeforeAccessModifier: Leave
EmptyLineAfterAccessModifier: Never
ExperimentalAutoDetectBinPacking: false
FixNamespaceComments: true
IncludeBlocks: Regroup
IncludeCategories:
- Regex: '^<.*\.h>'
Priority: 1
SortPriority: 0
- Regex: '^<.*'
Priority: 2
SortPriority: 0
- Regex: '.*'
Priority: 3
SortPriority: 0
IncludeIsMainRegex: '([-_](test|unittest))?$'
IncludeIsMainSourceRegex: ''
IndentAccessModifiers: false
IndentCaseBlocks: true
IndentCaseLabels: true
IndentExternBlock: NoIndent
IndentGotoLabels: false
IndentPPDirectives: AfterHash
IndentWidth: 4
IndentWrappedFunctionNames: false
InsertBraces: true # NOTE: may lead to incorrect formatting
InsertNewlineAtEOF: true
JavaScriptQuotes: Leave
JavaScriptWrapImports: true
KeepEmptyLinesAtTheStartOfBlocks: false
LambdaBodyIndentation: Signature
LineEnding: LF
MacroBlockBegin: ''
MacroBlockEnd: ''
MaxEmptyLinesToKeep: 1
NamespaceIndentation: None
ObjCBinPackProtocolList: Auto
ObjCBlockIndentWidth: 4
ObjCSpaceAfterProperty: true
ObjCSpaceBeforeProtocolList: true
PPIndentWidth: -1
PackConstructorInitializers: CurrentLine
PenaltyBreakAssignment: 2
PenaltyBreakBeforeFirstCallParameter: 1
PenaltyBreakComment: 300
PenaltyBreakFirstLessLess: 120
PenaltyBreakString: 1000
PenaltyBreakTemplateDeclaration: 10
PenaltyExcessCharacter: 1000000
PenaltyReturnTypeOnItsOwnLine: 200
PointerAlignment: Middle
QualifierAlignment: Left
#QualifierOrder: ['static', 'inline', 'friend', 'constexpr', 'const', 'volatile', 'type', 'restrict']
RawStringFormats:
- Language: Cpp
Delimiters:
- cc
- CC
- cpp
- Cpp
- CPP
- 'c++'
- 'C++'
CanonicalDelimiter: ''
ReferenceAlignment: Middle
ReflowComments: false # IndentOnly
SeparateDefinitionBlocks: Always
SortIncludes: CaseInsensitive
SortUsingDeclarations: LexicographicNumeric
SpaceAfterCStyleCast: true
SpaceAfterLogicalNot: false
SpaceAfterTemplateKeyword: true
SpaceBeforeAssignmentOperators: true
SpaceBeforeCpp11BracedList: false
SpaceBeforeCtorInitializerColon: true
SpaceBeforeInheritanceColon: true
SpaceBeforeParens: ControlStatements
SpaceBeforeRangeBasedForLoopColon: true
SpaceInEmptyBlock: false
SpaceInEmptyParentheses: false
SpacesBeforeTrailingComments: 2
SpacesInAngles: Never
SpacesInContainerLiterals: true
SpacesInLineCommentPrefix:
Minimum: 1
Maximum: -1
SpacesInParentheses: false
SpacesInSquareBrackets: false
SpaceBeforeSquareBrackets: false
Standard: c++17
TabWidth: 4
UseTab: Never
WhitespaceSensitiveMacros: ['STRINGIZE']
...
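Note: the config above can be exercised locally before committing. A minimal sketch, assuming a reasonably recent clang-format (several options are left commented out, presumably pending newer releases) run from the repository root so the .clang-format file is discovered automatically; the file path is illustrative:

# reformat a file in place using the repository's .clang-format
clang-format -i src/llama.cpp

# check formatting without modifying the file; exits non-zero on violations
clang-format --dry-run --Werror src/llama.cpp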

2 changes: 1 addition & 1 deletion .devops/full-cuda.Dockerfile
@@ -26,7 +26,7 @@ COPY . .
RUN if [ "${CUDA_DOCKER_ARCH}" != "default" ]; then \
export CMAKE_ARGS="-DCMAKE_CUDA_ARCHITECTURES=${CUDA_DOCKER_ARCH}"; \
fi && \
-cmake -B build -DGGML_CUDA=ON -DLLAMA_CURL=ON ${CMAKE_ARGS} -DCMAKE_EXE_LINKER_FLAGS=-Wl,--allow-shlib-undefined . && \
+cmake -B build -DGGML_NATIVE=OFF -DGGML_CUDA=ON -DLLAMA_CURL=ON ${CMAKE_ARGS} -DCMAKE_EXE_LINKER_FLAGS=-Wl,--allow-shlib-undefined . && \
cmake --build build --config Release -j$(nproc) && \
cp build/bin/* .

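Note: the conditional export above lets a single CUDA architecture be selected at image build time (it feeds CMAKE_CUDA_ARCHITECTURES), while GGML_NATIVE=OFF keeps host-specific CPU flags out of a portable image. A usage sketch; the arch value and image tag are illustrative:

# build only for compute capability 8.6 (Ampere consumer GPUs)
docker build -f .devops/full-cuda.Dockerfile --build-arg CUDA_DOCKER_ARCH=86 -t llama-cpp:full-cuda .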
9 changes: 8 additions & 1 deletion .devops/full-musa.Dockerfile
@@ -6,6 +6,9 @@ ARG BASE_MUSA_DEV_CONTAINER=mthreads/musa:${MUSA_VERSION}-devel-ubuntu${UBUNTU_V

FROM ${BASE_MUSA_DEV_CONTAINER} AS build

+# MUSA architecture to build for (defaults to all supported archs)
+ARG MUSA_DOCKER_ARCH=default

RUN apt-get update && \
apt-get install -y build-essential cmake python3 python3-pip git libcurl4-openssl-dev libgomp1

@@ -19,7 +22,11 @@ WORKDIR /app

COPY . .

-RUN cmake -B build -DGGML_MUSA=ON -DLLAMA_CURL=ON ${CMAKE_ARGS} -DCMAKE_EXE_LINKER_FLAGS=-Wl,--allow-shlib-undefined . && \
+# Use the default MUSA archs if not specified
+RUN if [ "${MUSA_DOCKER_ARCH}" != "default" ]; then \
+export CMAKE_ARGS="-DMUSA_ARCHITECTURES=${MUSA_DOCKER_ARCH}"; \
+fi && \
+cmake -B build -DGGML_NATIVE=OFF -DGGML_MUSA=ON -DLLAMA_CURL=ON ${CMAKE_ARGS} -DCMAKE_EXE_LINKER_FLAGS=-Wl,--allow-shlib-undefined . && \
cmake --build build --config Release -j$(nproc) && \
cp build/bin/* .

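Note: the same pattern applies to MUSA via MUSA_ARCHITECTURES; left unset, the image builds for all supported archs. A sketch, with the arch value an assumption (arch 21 is commonly cited for MTT S80-class GPUs):

docker build -f .devops/full-musa.Dockerfile --build-arg MUSA_DOCKER_ARCH=21 -t llama-cpp:full-musa .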
2 changes: 1 addition & 1 deletion .devops/llama-cli-cann.Dockerfile
@@ -22,7 +22,7 @@ ENV LD_LIBRARY_PATH=${ASCEND_TOOLKIT_HOME}/runtime/lib64/stub:$LD_LIBRARY_PATH

RUN echo "Building with static libs" && \
source /usr/local/Ascend/ascend-toolkit/set_env.sh --force && \
-cmake -B build -DGGML_CANN=ON -DBUILD_SHARED_LIBS=OFF && \
+cmake -B build -DGGML_NATIVE=OFF -DGGML_CANN=ON -DBUILD_SHARED_LIBS=OFF && \
cmake --build build --config Release --target llama-cli

# TODO: use image with NNRT
2 changes: 1 addition & 1 deletion .devops/llama-cli-cuda.Dockerfile
@@ -22,7 +22,7 @@ COPY . .
RUN if [ "${CUDA_DOCKER_ARCH}" != "default" ]; then \
export CMAKE_ARGS="-DCMAKE_CUDA_ARCHITECTURES=${CUDA_DOCKER_ARCH}"; \
fi && \
-cmake -B build -DGGML_CUDA=ON ${CMAKE_ARGS} -DCMAKE_EXE_LINKER_FLAGS=-Wl,--allow-shlib-undefined . && \
+cmake -B build -DGGML_NATIVE=OFF -DGGML_CUDA=ON ${CMAKE_ARGS} -DCMAKE_EXE_LINKER_FLAGS=-Wl,--allow-shlib-undefined . && \
cmake --build build --config Release --target llama-cli -j$(nproc) && \
mkdir -p /app/lib && \
find build -name "*.so" -exec cp {} /app/lib \;
2 changes: 1 addition & 1 deletion .devops/llama-cli-intel.Dockerfile
@@ -15,7 +15,7 @@ RUN if [ "${GGML_SYCL_F16}" = "ON" ]; then \
export OPT_SYCL_F16="-DGGML_SYCL_F16=ON"; \
fi && \
echo "Building with static libs" && \
-cmake -B build -DGGML_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx \
+cmake -B build -DGGML_NATIVE=OFF -DGGML_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx \
${OPT_SYCL_F16} -DBUILD_SHARED_LIBS=OFF && \
cmake --build build --config Release --target llama-cli

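Note: the SYCL images gate FP16 support on a GGML_SYCL_F16 build argument, assumed to be declared as an ARG earlier in the Dockerfile (outside this hunk). A hedged usage sketch; the tag is illustrative:

docker build -f .devops/llama-cli-intel.Dockerfile --build-arg GGML_SYCL_F16=ON -t llama-cpp:cli-intel .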
9 changes: 8 additions & 1 deletion .devops/llama-cli-musa.Dockerfile
@@ -8,14 +8,21 @@ ARG BASE_MUSA_RUN_CONTAINER=mthreads/musa:${MUSA_VERSION}-runtime-ubuntu${UBUNTU

FROM ${BASE_MUSA_DEV_CONTAINER} AS build

+# MUSA architecture to build for (defaults to all supported archs)
+ARG MUSA_DOCKER_ARCH=default

RUN apt-get update && \
apt-get install -y build-essential git cmake

WORKDIR /app

COPY . .

-RUN cmake -B build -DGGML_MUSA=ON ${CMAKE_ARGS} -DCMAKE_EXE_LINKER_FLAGS=-Wl,--allow-shlib-undefined . && \
+# Use the default MUSA archs if not specified
+RUN if [ "${MUSA_DOCKER_ARCH}" != "default" ]; then \
+export CMAKE_ARGS="-DMUSA_ARCHITECTURES=${MUSA_DOCKER_ARCH}"; \
+fi && \
+cmake -B build -DGGML_NATIVE=OFF -DGGML_MUSA=ON ${CMAKE_ARGS} -DCMAKE_EXE_LINKER_FLAGS=-Wl,--allow-shlib-undefined . && \
cmake --build build --config Release --target llama-cli -j$(nproc) && \
mkdir -p /app/lib && \
find build -name "*.so" -exec cp {} /app/lib \;
2 changes: 1 addition & 1 deletion .devops/llama-cli-vulkan.Dockerfile
@@ -14,7 +14,7 @@ RUN wget -qO - https://packages.lunarg.com/lunarg-signing-key-pub.asc | apt-key
# Build it
WORKDIR /app
COPY . .
-RUN cmake -B build -DGGML_VULKAN=1 && \
+RUN cmake -B build -DGGML_NATIVE=OFF -DGGML_VULKAN=1 && \
cmake --build build --config Release --target llama-cli

# Clean up
2 changes: 1 addition & 1 deletion .devops/llama-server-cuda.Dockerfile
@@ -22,7 +22,7 @@ COPY . .
RUN if [ "${CUDA_DOCKER_ARCH}" != "default" ]; then \
export CMAKE_ARGS="-DCMAKE_CUDA_ARCHITECTURES=${CUDA_DOCKER_ARCH}"; \
fi && \
-cmake -B build -DGGML_CUDA=ON -DLLAMA_CURL=ON ${CMAKE_ARGS} -DCMAKE_EXE_LINKER_FLAGS=-Wl,--allow-shlib-undefined . && \
+cmake -B build -DGGML_NATIVE=OFF -DGGML_CUDA=ON -DLLAMA_CURL=ON ${CMAKE_ARGS} -DCMAKE_EXE_LINKER_FLAGS=-Wl,--allow-shlib-undefined . && \
cmake --build build --config Release --target llama-server -j$(nproc) && \
mkdir -p /app/lib && \
find build -name "*.so" -exec cp {} /app/lib \;
2 changes: 1 addition & 1 deletion .devops/llama-server-intel.Dockerfile
@@ -15,7 +15,7 @@ RUN if [ "${GGML_SYCL_F16}" = "ON" ]; then \
export OPT_SYCL_F16="-DGGML_SYCL_F16=ON"; \
fi && \
echo "Building with dynamic libs" && \
-cmake -B build -DGGML_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DLLAMA_CURL=ON ${OPT_SYCL_F16} && \
+cmake -B build -DGGML_NATIVE=OFF -DGGML_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DLLAMA_CURL=ON ${OPT_SYCL_F16} && \
cmake --build build --config Release --target llama-server

FROM intel/oneapi-basekit:$ONEAPI_VERSION AS runtime
9 changes: 8 additions & 1 deletion .devops/llama-server-musa.Dockerfile
@@ -8,14 +8,21 @@ ARG BASE_MUSA_RUN_CONTAINER=mthreads/musa:${MUSA_VERSION}-runtime-ubuntu${UBUNTU

FROM ${BASE_MUSA_DEV_CONTAINER} AS build

+# MUSA architecture to build for (defaults to all supported archs)
+ARG MUSA_DOCKER_ARCH=default

RUN apt-get update && \
apt-get install -y build-essential git cmake libcurl4-openssl-dev

WORKDIR /app

COPY . .

-RUN cmake -B build -DGGML_MUSA=ON -DLLAMA_CURL=ON ${CMAKE_ARGS} -DCMAKE_EXE_LINKER_FLAGS=-Wl,--allow-shlib-undefined . && \
+# Use the default MUSA archs if not specified
+RUN if [ "${MUSA_DOCKER_ARCH}" != "default" ]; then \
+export CMAKE_ARGS="-DMUSA_ARCHITECTURES=${MUSA_DOCKER_ARCH}"; \
+fi && \
+cmake -B build -DGGML_NATIVE=OFF -DGGML_MUSA=ON -DLLAMA_CURL=ON ${CMAKE_ARGS} -DCMAKE_EXE_LINKER_FLAGS=-Wl,--allow-shlib-undefined . && \
cmake --build build --config Release --target llama-server -j$(nproc) && \
mkdir -p /app/lib && \
find build -name "*.so" -exec cp {} /app/lib \;
2 changes: 1 addition & 1 deletion .devops/llama-server-vulkan.Dockerfile
@@ -14,7 +14,7 @@ RUN wget -qO - https://packages.lunarg.com/lunarg-signing-key-pub.asc | apt-key
# Build it
WORKDIR /app
COPY . .
-RUN cmake -B build -DGGML_VULKAN=1 -DLLAMA_CURL=1 && \
+RUN cmake -B build -DGGML_NATIVE=OFF -DGGML_VULKAN=1 -DLLAMA_CURL=1 && \
cmake --build build --config Release --target llama-server

# Clean up
2 changes: 1 addition & 1 deletion .devops/nix/python-scripts.nix
@@ -34,7 +34,7 @@ let

# server tests
openai
-behave
+pytest
prometheus-client
];
in
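Note: with behave swapped for pytest, the server behavior tests run as ordinary pytest suites. A hedged invocation sketch, assuming the suite lives under examples/server/tests as in upstream llama.cpp and that llama-server has already been built:

# run the server test suite from the repository root
cd examples/server/tests
pytest -v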
50 changes: 0 additions & 50 deletions .github/ISSUE_TEMPLATE/01-bug-low.yml

This file was deleted.
