
Generate pipeline #334

Merged
merged 158 commits on Jun 7, 2024
Changes from 120 commits
158 commits
ba91fde
initial generate
pavel-esir Mar 26, 2024
9d85a0e
LLM pipeline
pavel-esir Mar 28, 2024
b21c6c1
Added calculating for several batches
pavel-esir Apr 2, 2024
e52e90d
Greedy search works
pavel-esir Apr 3, 2024
745a804
rename to GenerationConfig
pavel-esir Apr 4, 2024
8895ed0
Add fluent interface
pavel-esir Apr 5, 2024
b24977d
Update text_generation/causal_lm/cpp/generate_pipeline/generate_pipel…
pavel-esir Apr 5, 2024
c933ca0
cosmetic changes in main
pavel-esir Apr 5, 2024
c43e901
greedy search with batches and left padding works
pavel-esir Apr 10, 2024
5a914f6
combine LLModel with LLMPipeline
pavel-esir Apr 10, 2024
c1e0c9d
wip: enable calling tokenize/detokenize for LLMPipeline
pavel-esir Apr 10, 2024
8d66353
add callback to generate
pavel-esir Apr 11, 2024
fa12da7
cleanup generate_sample.cpp
pavel-esir Apr 11, 2024
5ceb9d5
add speculative decoding
pavel-esir Apr 16, 2024
a5083c7
separate Tokenizer
pavel-esir Apr 17, 2024
7692160
wip
pavel-esir Apr 23, 2024
d3f6339
add start/stop conversation
pavel-esir Apr 24, 2024
3776433
use text in streamer instead of raw tokens
pavel-esir Apr 23, 2024
964a5e8
add apply_chat_template
pavel-esir Apr 23, 2024
e57aa4c
fix difference between accumulating conversation as text and keeping …
pavel-esir Apr 26, 2024
d0c1341
cleanup
pavel-esir Apr 26, 2024
8dcea1f
add Jinja2cpp submodule
pavel-esir Apr 26, 2024
754a462
add ov namespace
pavel-esir May 2, 2024
9b19c6f
return scores for batched outputs
pavel-esir May 2, 2024
9bf6caa
add AnyMap
pavel-esir May 3, 2024
39fd73c
Merge remote-tracking branch 'upstream/master' into generate_pipeline
pavel-esir May 3, 2024
63d8f6d
cleanup
pavel-esir May 3, 2024
a833760
before moving to pimpl
pavel-esir May 6, 2024
1681654
move to separate include & src
pavel-esir May 6, 2024
9fe73c6
pimpl implementation
pavel-esir May 6, 2024
053708f
temporary disable jinja2cpp
pavel-esir May 6, 2024
bd6849a
add python api draft, hide implementations from user & refactor imple…
pavel-esir May 7, 2024
62c471e
extract decoding methods to separate files
pavel-esir May 7, 2024
f1d54f4
extended python api, added python api test
pavel-esir May 7, 2024
3c82e11
remove call method
pavel-esir May 8, 2024
5543cee
init
Wovchena May 6, 2024
abb8835
add_subdirectory
Wovchena May 7, 2024
0998abc
add files
Wovchena May 8, 2024
15492c4
add __init__.py
Wovchena May 8, 2024
005d3fb
removed set_streamer
pavel-esir May 8, 2024
cc44bc8
use std::optional
pavel-esir May 8, 2024
d8cab05
started to add Readme docs
pavel-esir May 8, 2024
2535394
reorder Readme
pavel-esir May 8, 2024
95c1bfb
rm generate_pipeline/python
Wovchena May 9, 2024
4510f71
update Readme; cleanup LLMPipeline and add docstring
pavel-esir May 9, 2024
507bc49
refactor folder structure
pavel-esir May 9, 2024
af747d4
cleanup generation_config and ov::Tokenizer
pavel-esir May 9, 2024
c6620d9
move includes to a separate openvino/genai folder
pavel-esir May 10, 2024
59c3e0b
Merge branch 'generate_pipeline' into package
Wovchena May 10, 2024
be84345
align names
Wovchena May 10, 2024
bced64a
Dont modify text_generation/causal_lm/cpp/CMakeLists.txt
Wovchena May 10, 2024
f4e82b6
rm -r text_generation/causal_lm/cpp/generate_pipeline/python-bindings/
Wovchena May 10, 2024
5b2b0ca
fix build
Wovchena May 10, 2024
0dd8f59
add tokenizers only once
Wovchena May 10, 2024
23638ff
change cmake.source-dir
Wovchena May 10, 2024
d8c5349
restore openvino/genai inits
Wovchena May 10, 2024
24faefe
Integrate JinjaCpp
ilya-lavrenov May 10, 2024
598dda3
install genai lib
Wovchena May 10, 2024
f274b93
Merge pull request #2 from ilya-lavrenov/jinja-integration-pavel
pavel-esir May 10, 2024
02d0eae
import openvino for win and lin
Wovchena May 10, 2024
e6695f3
Merge branch 'generate_pipeline' into package
Wovchena May 10, 2024
a27c5a7
put the line back
Wovchena May 10, 2024
0849c41
Added cmake build type before project clause
ilya-lavrenov May 10, 2024
34cddff
one line properties
Wovchena May 10, 2024
023cf1e
Merge pull request #3 from ilya-lavrenov/cmake-build-type
pavel-esir May 10, 2024
6a5d750
Export API symbols
ilya-lavrenov May 10, 2024
27f385e
Merge pull request #4 from ilya-lavrenov/generate_pipeline
pavel-esir May 10, 2024
a9332f0
Merge branch 'generate_pipeline' into package
Wovchena May 10, 2024
9ef488c
rename
Wovchena May 10, 2024
4fad7d5
add .github/workflows/genai_lib.yml
Wovchena May 10, 2024
51e03a2
on: pull_request
Wovchena May 10, 2024
e23a7bb
spelling
Wovchena May 10, 2024
fc5b753
install openvino
Wovchena May 10, 2024
09f8806
add syntactic sugar for generate, optimize value passing by reference
pavel-esir May 10, 2024
af22a8a
remove speculative decoding
pavel-esir May 11, 2024
e7db7e8
update
Wovchena May 13, 2024
f279363
add rpath
Wovchena May 13, 2024
83d77c8
add rpath to libopenvino.so
Wovchena May 13, 2024
167f924
py_generate_pipeline
Wovchena May 13, 2024
a111a3f
reorder tokenizer.cpp, add comments to BaseStreamer
pavel-esir May 11, 2024
813d80a
install centos7
Wovchena May 13, 2024
6227b65
install nightly
Wovchena May 13, 2024
74fc107
Merge branch 'generate_pipeline' into package
Wovchena May 13, 2024
9b83a7e
propagate _GLIBCXX_USE_CXX11_ABI
Wovchena May 13, 2024
2d15752
Populate python with the libraries to allow skipping wheel installation
Wovchena May 13, 2024
8025554
run setupvars
Wovchena May 13, 2024
2b14286
update .gitignore, install numpy
Wovchena May 13, 2024
1c11bc7
quotes
Wovchena May 13, 2024
e7fce82
fix PYTHONPATH
Wovchena May 13, 2024
64608d1
fix PYTHONPATH
Wovchena May 13, 2024
43b87c7
quotes
Wovchena May 13, 2024
fef9674
reorder vars
Wovchena May 14, 2024
b21286c
openvino.genai-
Wovchena May 14, 2024
d393f89
Merge pull request #1 from Wovchena/package
pavel-esir May 14, 2024
2b8954d
Merge branch 'master' into generate_pipeline
pavel-esir May 14, 2024
11e872b
Update CMakeLists.txt
pavel-esir May 14, 2024
442dcbf
move group beam searcher to src
pavel-esir May 13, 2024
53d534e
Update .gitignore (#5)
Wovchena May 15, 2024
dcb4b86
Merge remote-tracking branch 'origin/generate_pipeline' into generate…
pavel-esir May 15, 2024
72c045e
fixed difference between old greedy sample and generate
pavel-esir May 15, 2024
11fbaa2
tokenizer minor fixes
pavel-esir May 15, 2024
264e99f
apply comments
pavel-esir May 15, 2024
11032b4
remove accidentally added test_cpp_samples.py
pavel-esir May 15, 2024
7d0c80b
fix build
pavel-esir May 15, 2024
2e3cd73
fix causal_lm comparison error
pavel-esir May 15, 2024
e7fa974
fix different outputs
pavel-esir May 15, 2024
78d0b88
Archive (#7)
Wovchena May 20, 2024
5eb59ea
add tests
pavel-esir May 16, 2024
ce4eb00
Apply suggestions from code review
pavel-esir May 22, 2024
aa90e9d
names correction
pavel-esir May 22, 2024
54cbb52
update URL_HASH
Wovchena May 22, 2024
82a9449
remove submodules from .gitmodules
Wovchena May 22, 2024
5a0079b
install openvino_tokenizers for genai_python_lib
pavel-esir May 22, 2024
73e4312
Update Jinja2Cpp fork commit
Wovchena May 22, 2024
75b7c37
remove group_beam_searcher.hpp; copy fast_tokenizer
pavel-esir May 22, 2024
70f1177
Fix archive (#8)
Wovchena May 23, 2024
da729ba
Apply suggestions from code review
pavel-esir May 24, 2024
28c313b
add groups to GenerationConfig docstring
pavel-esir May 24, 2024
c395a8d
refactor namespace ov::* -> ov::genai::*
pavel-esir May 24, 2024
bbc8c25
removed ov_tokenizers_path when ov::genai::Tokenizer is passed to LLMP…
pavel-esir May 24, 2024
9e37273
Add sampling decoding (#6)
as-suvorov May 27, 2024
81ec069
Fix library loading by updating dependencies (#10)
Wovchena May 28, 2024
88c44fe
Add extension near to genai library, tokenizers from fork (#11)
Wovchena May 29, 2024
220035d
set openvino_tokenizers path via environment; cleared LLMPipeline con…
pavel-esir May 29, 2024
5c6c14f
update environment util for win
pavel-esir May 29, 2024
174f67a
Add callback binding (#12)
Wovchena May 29, 2024
6709a67
Add streamer binding (#13)
Wovchena May 29, 2024
1a4bd68
remove reset_state, multibatch tests added
pavel-esir May 29, 2024
9389930
fix win build
pavel-esir May 29, 2024
7d1d616
map stop_criteria in pybind;
pavel-esir May 30, 2024
9208110
fix chat_sample build on Win
pavel-esir May 30, 2024
7021c87
fix tests failing
pavel-esir May 30, 2024
680e362
add return bool to streamer to stop generation
pavel-esir May 31, 2024
9ba0a71
Add tests for macOS (#9)
yatarkan May 31, 2024
ac26bf8
add genai into llm_bench (#15)
eaidova Jun 3, 2024
dd619e6
allow model without position_ids
pavel-esir Jun 3, 2024
04003d4
Merge remote-tracking branch 'origin/generate_pipeline' into generate…
pavel-esir Jun 3, 2024
1718bfb
Return eos_token from decoding algos (#16)
as-suvorov Jun 3, 2024
1b35935
Merge remote-tracking branch 'origin/generate_pipeline' into generate…
pavel-esir Jun 3, 2024
2c2a34a
Cache a model, rename genai target, fix Windows (#14)
Wovchena Jun 4, 2024
b180faf
Merge remote-tracking branch 'origin/generate_pipeline' into generate…
pavel-esir Jun 4, 2024
59c1096
Merge branch 'master' into generate_pipeline
eaidova Jun 4, 2024
28ebc87
read special tokens only from tokenizer_config.json and config.json
pavel-esir Jun 4, 2024
da96019
Leftovers (#18)
Wovchena Jun 5, 2024
a7f73a6
minor typos fix
pavel-esir Jun 5, 2024
a74baa2
Split text samples to separate folders (#19)
Wovchena Jun 5, 2024
13ebf9f
update llm_bench (#17)
eaidova Jun 5, 2024
67b1cfa
Assume GenAI is installed (#20)
Wovchena Jun 5, 2024
0bd9cb3
fix segfault in tests
pavel-esir Jun 5, 2024
b618673
fix converting unfinished utf strings
pavel-esir Jun 6, 2024
80a17be
load special tokens leftovers
pavel-esir Jun 6, 2024
743f348
add config loading tests
pavel-esir Jun 6, 2024
51a9a73
commit forgotten py_generate_pipeline.cpp
pavel-esir Jun 7, 2024
2494df1
fix ScopedVar in Tokenizer for ov_tokenizers_path
pavel-esir Jun 7, 2024
8f1399f
skip config modification in tmp dir on Win
pavel-esir Jun 7, 2024
57830ba
return back win tests after disabling cleanup
pavel-esir Jun 7, 2024
2175796
Disable unfinished utf string test in Win
pavel-esir Jun 7, 2024
7c07136
disable failing win workflows
pavel-esir Jun 7, 2024
4 changes: 4 additions & 0 deletions .github/dependabot.yml
Original file line number Diff line number Diff line change
@@ -1,5 +1,9 @@
version: 2
updates:
  - package-ecosystem: "pip"
    directory: "./"
    schedule:
      interval: "weekly"
  - package-ecosystem: "pip"
    directory: "image_generation/stable_diffusion_1_5/cpp/scripts/"
    schedule:
81 changes: 41 additions & 40 deletions .github/workflows/causal_lm_cpp.yml

Large diffs are not rendered by default.

63 changes: 63 additions & 0 deletions .github/workflows/genai_package.yml
@@ -0,0 +1,63 @@
name: genai_package
on: pull_request
jobs:
  ubuntu_genai_package:
    strategy:
      matrix:
        build-type: [Release, Debug]
    runs-on: ubuntu-20.04
    steps:
      - uses: actions/checkout@v4
        with:
          submodules: recursive
      - uses: actions/setup-python@v4
        with:
          python-version: 3.8
      - run: mkdir ./ov/
      - run: curl https://storage.openvinotoolkit.org/repositories/openvino/packages/2024.1/linux/l_openvino_toolkit_ubuntu20_2024.1.0.15008.f4afc983258_x86_64.tgz | tar --directory ./ov/ --strip-components 1 -xz
      - run: sudo ./ov/install_dependencies/install_openvino_dependencies.sh
      - run: sudo apt-get install libtbb-dev


it's not required anymore since OpenVINO package contains TBB runtime and dev files

      - run: source ./ov/setupvars.sh && cmake -DCMAKE_BUILD_TYPE=${{ matrix.build-type }} -S ./ -B ./build/
      - run: source ./ov/setupvars.sh && cmake --build ./build/ --config ${{ matrix.build-type }} --target package -j
      - run: source ./ov/setupvars.sh && cmake --install ./build/ --config ${{ matrix.build-type }} --prefix ov
      - run: ov/samples/cpp/build_samples.sh -i ${{ github.workspace }}/s\ pace
        if: ${{ 'Release' == matrix.build-type }} # build_samples enforces Release build
      - run: source ./ov/setupvars.sh && python -m pip install --upgrade-strategy eager -r text_generation/causal_lm/cpp/requirements.txt
Contributor

@Wovchena
I suppose it should be installed after tokenizers, because text_generation/causal_lm/cpp/requirements.txt forces installation of the released version of tokenizers.

        if: ${{ 'Release' == matrix.build-type }}
      - run: source ./ov/setupvars.sh && python -m pip install ./thirdparty/openvino_tokenizers/[transformers]
        if: ${{ 'Release' == matrix.build-type }}
      - run: source ./ov/setupvars.sh && optimum-cli export openvino --trust-remote-code --weight-format fp16 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 TinyLlama-1.1B-Chat-v1.0
        if: ${{ 'Release' == matrix.build-type }}
      - run: source ./ov/setupvars.sh && timeout 50s ${{ github.workspace }}/s\ pace/samples_bin/greedy_causal_lm ./TinyLlama-1.1B-Chat-v1.0/ ""
        if: ${{ 'Release' == matrix.build-type }}

  windows_genai_package:
    strategy:
      matrix:
        build-type: [Release, Debug]
    runs-on: windows-latest
    defaults:
      run:
        shell: cmd
    steps:
      - uses: actions/checkout@v4
        with:
          submodules: recursive
      - uses: actions/setup-python@v4
        with:
          python-version: 3.8
      - run: curl --output ov.zip https://storage.openvinotoolkit.org/repositories/openvino/packages/nightly/2024.2.0-15349-765302e0de1/w_openvino_toolkit_windows_2024.2.0.dev20240515_x86_64.zip
      - run: unzip ov.zip
      - run: call w_openvino_toolkit_windows_2024.2.0.dev20240515_x86_64\setupvars.bat && cmake -DCMAKE_BUILD_TYPE=${{ matrix.build-type }} -S ./ -B ./build/
      - run: call w_openvino_toolkit_windows_2024.2.0.dev20240515_x86_64\setupvars.bat && cmake --build ./build/ --config ${{ matrix.build-type }} --target package -j
      - run: call w_openvino_toolkit_windows_2024.2.0.dev20240515_x86_64\setupvars.bat && cmake --install ./build/ --config ${{ matrix.build-type }} --prefix w_openvino_toolkit_windows_2024.2.0.dev20240515_x86_64
      - run: call w_openvino_toolkit_windows_2024.2.0.dev20240515_x86_64\samples\cpp\build_samples_msvc.bat -i "${{ github.workspace }}/samples_install"
        if: ${{ 'Release' == matrix.build-type }} # build_samples enforces Release build
      - run: call w_openvino_toolkit_windows_2024.2.0.dev20240515_x86_64\setupvars.bat && python -m pip install --upgrade-strategy eager -r text_generation/causal_lm/cpp/requirements.txt
        if: ${{ 'Release' == matrix.build-type }}
      - run: call w_openvino_toolkit_windows_2024.2.0.dev20240515_x86_64\setupvars.bat && python -m pip install ./thirdparty/openvino_tokenizers/[transformers]
        if: ${{ 'Release' == matrix.build-type }}
      - run: call w_openvino_toolkit_windows_2024.2.0.dev20240515_x86_64\setupvars.bat && optimum-cli export openvino --trust-remote-code --weight-format fp16 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 TinyLlama-1.1B-Chat-v1.0
        if: ${{ 'Release' == matrix.build-type }}
      - run: call w_openvino_toolkit_windows_2024.2.0.dev20240515_x86_64\setupvars.bat && "${{ github.workspace }}/samples_install/samples_bin/greedy_causal_lm" .\TinyLlama-1.1B-Chat-v1.0\ ""
        if: ${{ 'Release' == matrix.build-type }}
58 changes: 58 additions & 0 deletions .github/workflows/genai_python_lib.yml
@@ -0,0 +1,58 @@
name: genai_python_lib
on: pull_request
jobs:
  ubuntu_genai_python_lib:
    runs-on: ubuntu-20.04
    steps:
      - uses: actions/checkout@v4
        with:
          submodules: recursive
      - uses: actions/setup-python@v4
        with:
          python-version: 3.8
      - run: mkdir ./ov/
      - run: curl https://storage.openvinotoolkit.org/repositories/openvino/packages/nightly/2024.1.0-14758-22bd6ff0494/l_openvino_toolkit_centos7_2024.1.0.dev20240315_x86_64.tgz | tar --directory ./ov/ --strip-components 1 -xz # Install CentOS7 instead of Ubuntu to match PyPI distribution ABI
      - run: sudo ./ov/install_dependencies/install_openvino_dependencies.sh
      - run: source ./ov/setupvars.sh && cmake -DCMAKE_BUILD_TYPE=Release -S ./ -B ./build/
      - run: source ./ov/setupvars.sh && cmake --build ./build/ --config Release -j
      - run: python -m pip install --pre openvino --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly # Can't load CentOS libraries from the archive
      # GitHub Actions already provides what is listed in ./requirements-build.txt but the internal
      # build system doesn't. Install ./requirements-build.txt to detect possible conflicts.
      - run: source ./ov/setupvars.sh && python -m pip install ./thirdparty/openvino_tokenizers/[transformers] -r ./requirements-build.txt
      - run: PYTHONPATH=./src/python/ python -c "from openvino_genai import LLMPipeline"
      - run: source ./ov/setupvars.sh && CMAKE_BUILD_PARALLEL_LEVEL="" python -m pip install --pre . --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly
      - run: python -c "from openvino_genai import LLMPipeline"
      - name: GenAI Python API tests
        run: |
          source ./ov/setupvars.sh
          cd ./tests/python_tests/
          python -m pip install -r requirements.txt
          models=$(python list_test_models.py)
          echo "$models" | while read -r model_name model_path; do
            optimum-cli export openvino --trust-remote-code --weight-format fp16 --model "$model_name" "$model_path"
          done
          python -m pytest test_generate_api.py

  windows_genai_python_lib:
    runs-on: windows-latest
    defaults:
      run:
        shell: cmd
    steps:
      - uses: actions/checkout@v4
        with:
          submodules: recursive
      - uses: actions/setup-python@v4
        with:
          python-version: 3.8
      - run: curl --output ov.zip https://storage.openvinotoolkit.org/repositories/openvino/packages/nightly/2024.2.0-15349-765302e0de1/w_openvino_toolkit_windows_2024.2.0.dev20240515_x86_64.zip
      - run: unzip ov.zip
      - run: call w_openvino_toolkit_windows_2024.2.0.dev20240515_x86_64\setupvars.bat && cmake -DCMAKE_BUILD_TYPE=Release -S ./ -B ./build/
      - run: call w_openvino_toolkit_windows_2024.2.0.dev20240515_x86_64\setupvars.bat && cmake --build ./build/ --config Release -j
      - run: python -m pip install "numpy<1.27"
      # GitHub Actions already provides what is listed in ./requirements-build.txt but the internal
      # build system doesn't. Install ./requirements-build.txt to detect possible conflicts.
      - run: call w_openvino_toolkit_windows_2024.2.0.dev20240515_x86_64\setupvars.bat && python -m pip install ./thirdparty/openvino_tokenizers/[transformers] -r ./requirements-build.txt
      - run: set "PYTHONPATH=./src/python;" && call w_openvino_toolkit_windows_2024.2.0.dev20240515_x86_64\setupvars.bat && python -c "from openvino_genai import LLMPipeline" # cmd evaluates variables in a different way. Setting PYTHONPATH before setupvars.bat instead of doing that after solves that.
      - run: set CMAKE_BUILD_PARALLEL_LEVEL=&& call w_openvino_toolkit_windows_2024.2.0.dev20240515_x86_64\setupvars.bat && python -m pip install .
      - run: python -c "from openvino_genai import LLMPipeline"


@Wovchena
let's create actual pipeline to ensure tokenizers are found

4 changes: 4 additions & 0 deletions .gitignore
@@ -1,3 +1,7 @@
# They are copied to python folder during the build to allow skipping wheel installation
src/python/openvino_genai/*genai*
src/python/openvino_genai/py_generate_pipeline*

# build/artifact dirs
_*
[Bb]uild*/
26 changes: 26 additions & 0 deletions CMakeLists.txt
@@ -0,0 +1,26 @@
# Copyright (C) 2018-2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
#

cmake_minimum_required(VERSION 3.15)

# Multi config generators such as Visual Studio ignore CMAKE_BUILD_TYPE. Multi config generators are configured with
# CMAKE_CONFIGURATION_TYPES, but limiting options in it completely removes such build options
get_property(GENERATOR_IS_MULTI_CONFIG_VAR GLOBAL PROPERTY GENERATOR_IS_MULTI_CONFIG)
if(NOT GENERATOR_IS_MULTI_CONFIG_VAR AND NOT DEFINED CMAKE_BUILD_TYPE)
Contributor

@Wovchena we can also set default build type for ninja openvinotoolkit/openvino_tokenizers#162

    message(STATUS "CMAKE_BUILD_TYPE is not defined, 'Release' will be used")
    # Setting CMAKE_BUILD_TYPE as CACHE must go before project(). Otherwise project() sets its value and set() doesn't take an effect
    set(CMAKE_BUILD_TYPE Release CACHE STRING "Choose the type of build, options are: None Debug Release RelWithDebInfo MinSizeRel ...")
endif()

project(OpenVINOGenAI VERSION 2024.2.0.0)
ilya-lavrenov marked this conversation as resolved.

add_subdirectory(./thirdparty/openvino_tokenizers/ "${CMAKE_CURRENT_BINARY_DIR}/openvino_tokenizers/")
add_subdirectory(src)
add_subdirectory(text_generation/causal_lm/cpp)

install(DIRECTORY text_generation/causal_lm/cpp/ DESTINATION samples/cpp/causal_lm COMPONENT cpp_samples_genai)
pavel-esir marked this conversation as resolved.
install(FILES LICENSE DESTINATION licensing COMPONENT licensing_genai RENAME LICENSE-GENAI)
install(FILES third-party-programs.txt DESTINATION licensing COMPONENT licensing_genai RENAME third-party-programs-genai.txt)
set(CPACK_GENERATOR "ZIP")
ilya-lavrenov marked this conversation as resolved.
include(CPack)
41 changes: 41 additions & 0 deletions pyproject.toml
@@ -0,0 +1,41 @@
[project]
name = "openvino_genai"
version = "2024.2.0.0"
description = "Python bindings for https://github.com/openvinotoolkit/openvino.genai"
requires-python = ">=3.8"
readme = {file = "text_generation/causal_lm/cpp/README.md", content-type="text/markdown"}
ilya-lavrenov marked this conversation as resolved.
license = {text = "OSI Approved :: Apache Software License"}
authors = [
    { name = "OpenVINO Developers", email = "[email protected]" },
]
classifiers = [
    "Programming Language :: Python :: 3.8",
    "Programming Language :: Python :: 3.9",
    "Programming Language :: Python :: 3.10",
    "Programming Language :: Python :: 3.11",
    "Programming Language :: Python :: 3.12",
]
dependencies = [
    "openvino_tokenizers~=2024.1.0.0"
]

[tool.scikit-build]
cmake.source-dir = "./"
cmake.build-type = "Release"
cmake.targets = ["py_generate_pipeline", "genai"]
pavel-esir marked this conversation as resolved.
install.components = ["wheel_genai"]
sdist.cmake = true
wheel.packages = ["src/python/openvino_genai"]
wheel.install-dir = "openvino_genai"
wheel.build-tag = "000"
wheel.license-files = ["LICENSE", "SECURITY.md", "third-party-programs.txt"]

[[tool.scikit-build.generate]]
path = "openvino_genai/__version__.py"
template = '''
__version__ = "${version}"
'''

[build-system]
requires = ["scikit-build-core~=0.8.0"] # See https://github.com/openvinotoolkit/openvino_tokenizers/pull/123
build-backend = "scikit_build_core.build"
2 changes: 2 additions & 0 deletions requirements-build.txt
@@ -0,0 +1,2 @@
cmake~=3.23
build~=1.2.1
Contributor

@Wovchena
do you remember why we need it separately?

Collaborator

It's @akladiev's request. It enables `python -m build --wheel --outdir {GENAI_BUILD_PY_DIR}`.

Contributor @ilya-lavrenov (Jun 3, 2024)

is it possible to use:

export OpenVINO_DIR=xxx # or call setupvars.sh
python -m pip wheel -v --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/pre-release <GenAI source dir>

Pip will build openvino_genai-2024.2.0.0-000-cp310-cp310-manylinux_2_35_x86_64.whl to current folder

See https://pip.pypa.io/en/stable/cli/pip_wheel/

Collaborator

It also puts all other .whl to the same dir which isn't desirable when the intention is to build one .whl.

Contributor

(test_env) devuser@ov-spr-19:~/ilavreno/openvino.genai$ python -m pip wheel -v  --no-deps --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/pre-release .
...
(test_env) devuser@ov-spr-19:~/ilavreno/openvino.genai$ ls *.whl
openvino_genai-2024.2.0.0-000-cp310-cp310-manylinux_2_35_x86_64.whl

Collaborator

Yes, I guess --no-deps may help. @akladiev, can you move to pip wheel so requirements-build.txt could be removed?

13 changes: 13 additions & 0 deletions src/CMakeLists.txt
@@ -0,0 +1,13 @@
# Copyright (C) 2018-2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
#

# Find OpenVINODeveloperPackage first to compile with SDL flags
find_package(OpenVINODeveloperPackage QUIET
             PATHS "${OpenVINO_DIR}")
if(NOT OpenVINODeveloperPackage_FOUND)
    find_package(OpenVINO REQUIRED COMPONENTS Runtime)
endif()

add_subdirectory(cpp)
add_subdirectory(python)
163 changes: 163 additions & 0 deletions src/README.md
@@ -0,0 +1,163 @@
# OpenVINO Generate API

## Usage

First of all, you need to convert your model with optimum-cli:
``` sh
optimum-cli export openvino --model "TinyLlama/TinyLlama-1.1B-Chat-v1.0" --weight-format fp16 --trust-remote-code "TinyLlama-1.1B-Chat-v1.0"
pip install openvino-genai
```

Contributor: I suppose we should not explicitly use the weights format and let Optimum decide on that matter.

`LLMPipeline` is the main object used for decoding. You can construct it straight away from the folder with the converted model. It will automatically load the main model, tokenizer, detokenizer and default generation configuration.

### Python

A minimalist example:
```python
import openvino_genai as ov_genai
pipe = ov_genai.LLMPipeline(model_path, "CPU")
print(pipe.generate("The Sun is yellow because"))
```

Calling generate with custom generation config parameters, e.g. a config for grouped beam search:
```python
import openvino_genai as ov_genai
pipe = ov_genai.LLMPipeline(model_path, "CPU")

result = pipe.generate("The Sun is yellow because", max_new_tokens=30, num_groups=3, group_size=5, diversity_penalty=1.5)
print(result)
```

output:
```
'it is made up of carbon atoms. The carbon atoms are arranged in a linear pattern, which gives the yellow color. The arrangement of carbon atoms in'
```
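Conceptually, the keyword arguments above override fields of the pipeline's default generation config. A pure-Python sketch of that merge, using field names that mirror the example (the `GenerationConfig` dataclass below is a hypothetical stand-in, not the real openvino_genai type):

```python
from dataclasses import dataclass, replace

@dataclass
class GenerationConfig:
    # Hypothetical stand-in for the pipeline's generation config
    max_new_tokens: int = 20
    num_groups: int = 1
    group_size: int = 1
    diversity_penalty: float = 0.0

def with_overrides(base: GenerationConfig, **kwargs) -> GenerationConfig:
    # Reject unknown option names early instead of silently ignoring them
    unknown = set(kwargs) - set(base.__dataclass_fields__)
    if unknown:
        raise ValueError(f"unknown generation options: {unknown}")
    return replace(base, **kwargs)

cfg = with_overrides(GenerationConfig(), max_new_tokens=30, num_groups=3,
                     group_size=5, diversity_penalty=1.5)
print(cfg.num_groups, cfg.diversity_penalty)  # 3 1.5
```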

A simple chat in Python:
```python
import openvino_genai as ov_genai
pipe = ov_genai.LLMPipeline(model_path)

config = {'num_groups': 3, 'group_size': 5, 'diversity_penalty': 1.5}
pipe.set_generation_config(config)

pipe.start_chat()
while True:
    print('question:')
    prompt = input()
    if prompt == 'Stop!':
        break
    print(pipe(prompt))
pipe.finish_chat()
```

Contributor: I tried and it does not work like this for me; should it work? It works for me if config is a GenerationConfig object that I got with get_generation_config beforehand.

Contributor: Is it possible to add a description of the options which we can configure in config?
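The start_chat/finish_chat control flow above can be exercised without a model by swapping in a stub pipeline. Everything below is a hypothetical illustration (`StubPipe` is not part of openvino_genai); it only shows the loop structure with scripted input instead of `input()`:

```python
class StubPipe:
    # Hypothetical stand-in for ov_genai.LLMPipeline
    def __init__(self):
        self.in_chat = False
        self.history = []

    def start_chat(self):
        self.in_chat = True

    def finish_chat(self):
        self.in_chat = False
        self.history.clear()

    def __call__(self, prompt):
        assert self.in_chat, "call start_chat() first"
        self.history.append(prompt)
        return f"answer to: {prompt}"

def chat(pipe, prompts):
    # Same loop shape as the README sample, but fed from a list
    answers = []
    pipe.start_chat()
    for prompt in prompts:
        if prompt == 'Stop!':
            break
        answers.append(pipe(prompt))
    pipe.finish_chat()
    return answers

print(chat(StubPipe(), ["hi", "why is the sun yellow?", "Stop!", "ignored"]))
```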

Tests comparing outputs with Hugging Face can be found in tests/python_tests/test_generate_api.py.

### C++

A minimalistic example:
```cpp
#include "openvino/genai/llm_pipeline.hpp"
#include <iostream>

int main(int argc, char* argv[]) {
    std::string model_path = argv[1];
    ov::genai::LLMPipeline pipe(model_path, "CPU");
    std::cout << pipe.generate("The Sun is yellow because");
}
```

Using Group Beam Search Decoding
```cpp
#include "openvino/genai/llm_pipeline.hpp"
#include <iostream>

int main(int argc, char* argv[]) {
    std::string model_path = argv[1];
    ov::genai::LLMPipeline pipe(model_path, "CPU");

    ov::genai::GenerationConfig config = pipe.get_generation_config();
    config.max_new_tokens = 256;
    config.num_groups = 3;
    config.group_size = 5;
    config.diversity_penalty = 1.0f;

    std::cout << pipe.generate("The Sun is yellow because", config);
}
```
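Group beam search penalizes later groups for repeating tokens already chosen by earlier groups at the same decoding step, which is what `diversity_penalty` controls. A simplified pure-Python sketch of that scoring bias (an illustration of the idea, not the pipeline's actual implementation):

```python
import math

def penalized_scores(logprobs, prev_group_tokens, diversity_penalty):
    # Subtract diversity_penalty once for each time a token id was already
    # emitted by an earlier group at this step (Hamming diversity).
    counts = {}
    for t in prev_group_tokens:
        counts[t] = counts.get(t, 0) + 1
    return [lp - diversity_penalty * counts.get(i, 0)
            for i, lp in enumerate(logprobs)]

# Token 0 is the most likely, but two earlier groups already picked it,
# so the penalty pushes this group toward token 1.
scores = penalized_scores([math.log(0.5), math.log(0.3), math.log(0.2)],
                          prev_group_tokens=[0, 0], diversity_penalty=1.5)
best = max(range(3), key=lambda i: scores[i])
print(best)
```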

A simple chat in C++ using grouped beam search decoding
``` cpp
#include "openvino/genai/llm_pipeline.hpp"
#include <iostream>

int main(int argc, char* argv[]) {
    std::string prompt;

    std::string model_path = argv[1];
    ov::genai::LLMPipeline pipe(model_path, "CPU");

    ov::genai::GenerationConfig config = pipe.get_generation_config();
    config.max_new_tokens = 256;
    config.num_groups = 3;
    config.group_size = 5;
    config.diversity_penalty = 1.0f;

    pipe.start_chat();
    for (;;) {
        std::cout << "question:\n";
        std::getline(std::cin, prompt);
        if (prompt == "Stop!")
            break;

        std::cout << "answer:\n";
        auto answer = pipe(prompt, config);
        std::cout << answer << std::endl;
    }
    pipe.finish_chat();
}
```

Streaming example with lambda function
``` cpp
#include "openvino/genai/llm_pipeline.hpp"
#include <iostream>

int main(int argc, char* argv[]) {
    std::string model_path = argv[1];
    ov::genai::LLMPipeline pipe(model_path, "CPU");

    auto streamer = [](std::string word) { std::cout << word << std::flush; };
    std::cout << pipe.generate("The Sun is yellow because", streamer);
}
```

Streaming with a custom class
``` cpp
#include "openvino/genai/streamer_base.hpp"
#include "openvino/genai/llm_pipeline.hpp"
#include <iostream>

class CustomStreamer: public ov::genai::StreamerBase {
public:
    void put(int64_t token) {
        /* custom decoding/tokens processing code
        tokens_cache.push_back(token);
        std::string text = m_tokenizer.decode(tokens_cache);
        ...
        */
    };

    void end() {
        /* custom finalization */
    };
};

int main(int argc, char* argv[]) {
    CustomStreamer custom_streamer;

    std::string model_path = argv[1];
    ov::genai::LLMPipeline pipe(model_path, "CPU");
    std::cout << pipe.generate("The Sun is yellow because", custom_streamer);
}
```

Contributor suggested change: `class CustomStreamer: public ov::genai::StreamerBase {` → `class CustomStreamer: public ov::genai::IStreamer {`?

Contributor: what do you think? @Wovchena @pavel-esir
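The put/end contract sketched in the comments above can be made concrete in pure Python with a stub detokenizer. Everything here is a hypothetical illustration of the pattern (cache tokens, flush decoded text on word boundaries); the real interface is the C++ StreamerBase shown above:

```python
class WordStreamer:
    # Caches token ids and flushes decoded text; decode() is a stub
    # detokenizer driven by a tiny id -> text vocabulary.
    def __init__(self, vocab):
        self.vocab = vocab
        self.tokens_cache = []
        self.flushed = []

    def decode(self, tokens):
        return "".join(self.vocab[t] for t in tokens)

    def put(self, token):
        # Called once per generated token
        self.tokens_cache.append(token)
        text = self.decode(self.tokens_cache)
        if text.endswith(" "):  # flush only on a word boundary
            self.flushed.append(text)
            self.tokens_cache.clear()

    def end(self):
        # Flush whatever is left when generation finishes
        if self.tokens_cache:
            self.flushed.append(self.decode(self.tokens_cache))
            self.tokens_cache.clear()

s = WordStreamer({0: "Hello", 1: " ", 2: "world"})
for t in [0, 1, 2]:
    s.put(t)
s.end()
print(s.flushed)  # ['Hello ', 'world']
```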