
Merge pull request #353 from TylunasLi/bug_fix_att_opt
Win32Demo工程增加GLM模型 (Add the GLM model to the Win32Demo project)
ztxz16 authored Oct 30, 2023
2 parents 0370ee9 + fb39b4c commit da50cbd
Showing 7 changed files with 151 additions and 75 deletions.
26 changes: 16 additions & 10 deletions example/README.md
Original file line number Diff line number Diff line change
@@ -2,7 +2,9 @@

## Benchmark

- 测速示例程序,方便大家测试不同软硬件下的推理性能。作者测速度可以可参考[这里](doc/benchmark.md)
+ 测速示例程序,方便大家测试不同软硬件下的推理性能。作者测试的速度可以参考[这里](doc/benchmark.md)
+
+ 由于实际使用时很难满足batch的条件,也并非贪婪解码,该速度与真实使用时的速度有一定差异。

### 使用方法:

@@ -38,13 +40,17 @@ fastllm工程目前分为CPU版本和GPU版本,为简单上手,在没有cmak

签出代码后,**修改 include/fastllm.h**,Visual Studio中点击”文件“ -> "高级保存选项",在编码中选择”Unicode (UTF-8 **带签名**) -代码页 65001“,或在其他文本编辑器中转为”UTF-8 BOM“编码。(由于linux下gcc不识别BOM头,该修改只能手动处理。)

+ * **CPU版本**
+   * 如果本机没有安装CUDA,在Win32Demo项目“属性”中找到"链接器" -> "输入" -> "附加依赖项",点击'从父级或项目设置继承'。

  * **GPU版本**
-     - 需要正确安装CUDA
+     - 需要正确安装CUDA及其中的Visual Studio Integration
      - 正确配置CUDA_PATH环境变量,指向要编译的CUDA版本;
-     - 在解决方案中删除fastllm.vcproj,引入fastllm-gpu.vcproj,
+     - 在解决方案资源管理器中移除fastllm.vcproj,引入fastllm-gpu.vcproj,
      - 对fastllm-gpu项目,在”生成依赖项“ -> "生成自定义" 中手动添加已安装的CUDA的自定义项文件;
      - 对fastllm-gpu项目,在”属性“中找到"CUDA C/C++" -> "Device" -> "Code Generation" 中配置编译后支持的[GPU计算能力](https://developer.nvidia.com/cuda-gpus#compute)
      - 在Win32Demo项目上选择”添加“ -> "引用“,勾选fastllm-gpu项目;
-     - 配置预处理器定义”USE_CUDA“。
+     - 在Win32Demo项目上配置预处理器定义”USE_CUDA“。
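As a reference for the "Code Generation" step above, this is the shape of the resulting `CudaCompile` entry in fastllm-gpu.vcxproj (a sketch only; which `compute_XX,sm_XX` pairs to list depends on the GPUs you want the build to support):

```xml
<CudaCompile>
  <!-- Each compute_XX,sm_XX pair targets one GPU generation; add or remove
       pairs to match the compute capability of your target cards. -->
  <CodeGeneration>compute_61,sm_61;compute_75,sm_75;compute_86,sm_86;%(CodeGeneration)</CodeGeneration>
</CudaCompile>
```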

### 使用方法:

@@ -61,15 +67,15 @@ Android,使用Android studio工具建立的一個Android平台上运行LLM程

### 使用方法:

- 1.直接AS打开运行
+ 1.在Android Studio直接打开工程运行

  2.直接下载release目录里里面的apk体验。

  3.可以通过CMake工具链编译main文件(具体步骤见主页的readme),通过adb shell运行,

-     1. adb push main /data/local/tmp 将main文件放到手机的tmp文件夹,
-     2. adb shell ,
-     3. cd /data/local/tmp
-     4. ./main 运行。
+     1. `adb push main /data/local/tmp` 将main文件放到手机的tmp文件夹,
+     2. `adb shell` ,
+     3. `cd /data/local/tmp`
+     4. `./main` 运行。

- 注意:demo apk 会将模型文件复制到应用 data 目录以方便 native 读取,因此设备需准备至少两倍模型大小的空余空间
+ 注意:demo apk 会将模型文件复制到应用 data 目录以方便 native 读取,因此设备需准备至少两倍模型大小的空余空间
1 change: 1 addition & 0 deletions example/Win32Demo/Win32Demo.cpp
@@ -135,6 +135,7 @@ int chatllm(const char* prompt, int type) {

}, *generationConfig);
history = model->MakeHistory(history, sRound, input, ret);
+    sRound++;
return ret.length();
}

14 changes: 6 additions & 8 deletions example/Win32Demo/fastllm-gpu.vcxproj
@@ -71,29 +71,25 @@
<LinkIncremental>true</LinkIncremental>
<LibraryPath>$(CUDA_PATH)\lib\Win32;$(LibraryPath)</LibraryPath>
<IncludePath>$(CUDA_PATH)\include;$(IncludePath)</IncludePath>
-    <TargetExt>.lib</TargetExt>
<OutDir>$(SolutionDir)$(Platform)\$(Configuration)\</OutDir>
<IntDir>$(Platform)\$(Configuration)\</IntDir>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">
<LinkIncremental>true</LinkIncremental>
<IncludePath>$(CUDA_PATH)\include;$(IncludePath)</IncludePath>
<LibraryPath>$(CUDA_PATH)\lib\x64;$(LibraryPath)</LibraryPath>
-    <TargetExt>.lib</TargetExt>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">
<LinkIncremental>false</LinkIncremental>
<LibraryPath>$(CUDA_PATH)\lib\Win32;$(LibraryPath)</LibraryPath>
<IncludePath>$(CUDA_PATH)\include;$(IncludePath)</IncludePath>
-    <TargetExt>.lib</TargetExt>
<OutDir>$(SolutionDir)$(Platform)\$(Configuration)\</OutDir>
<IntDir>$(Platform)\$(Configuration)\</IntDir>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
<LinkIncremental>false</LinkIncremental>
<IncludePath>$(CUDA_PATH)\include;$(IncludePath)</IncludePath>
<LibraryPath>$(CUDA_PATH)\lib\x64;$(LibraryPath)</LibraryPath>
-    <TargetExt>.lib</TargetExt>
</PropertyGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
<ClCompile>
@@ -111,7 +107,7 @@
<SubSystem>Windows</SubSystem>
</Link>
<CudaCompile>
-      <CodeGeneration>compute_61,sm_61;%(CodeGeneration)</CodeGeneration>
+      <CodeGeneration>compute_61,sm_61;compute_75,sm_75;compute_86,sm_86;%(CodeGeneration)</CodeGeneration>
</CudaCompile>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">
@@ -130,7 +126,7 @@
<SubSystem>Windows</SubSystem>
</Link>
<CudaCompile>
-      <CodeGeneration>compute_61,sm_61;%(CodeGeneration)</CodeGeneration>
+      <CodeGeneration>compute_61,sm_61;compute_75,sm_75;compute_86,sm_86;%(CodeGeneration)</CodeGeneration>
</CudaCompile>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">
@@ -153,7 +149,7 @@
<OptimizeReferences>true</OptimizeReferences>
</Link>
<CudaCompile>
-      <CodeGeneration>compute_61,sm_61;%(CodeGeneration)</CodeGeneration>
+      <CodeGeneration>compute_61,sm_61;compute_75,sm_75;compute_86,sm_86;%(CodeGeneration)</CodeGeneration>
</CudaCompile>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
@@ -176,7 +172,7 @@
<OptimizeReferences>true</OptimizeReferences>
</Link>
<CudaCompile>
-      <CodeGeneration>compute_61,sm_61;%(CodeGeneration)</CodeGeneration>
+      <CodeGeneration>compute_61,sm_61;compute_75,sm_75;compute_86,sm_86;%(CodeGeneration)</CodeGeneration>
<FastMath>true</FastMath>
</CudaCompile>
</ItemDefinitionGroup>
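The MSBuild `CodeGeneration` values changed in this file map to nvcc `-gencode` flags via the CUDA Visual Studio integration. Roughly (illustrative only; the real command line is generated by the CUDA MSBuild targets, and `your_kernel.cu` is a hypothetical file name):

```shell
# Rough nvcc equivalent of CodeGeneration
# "compute_61,sm_61;compute_75,sm_75;compute_86,sm_86"
nvcc -gencode=arch=compute_61,code=sm_61 \
     -gencode=arch=compute_75,code=sm_75 \
     -gencode=arch=compute_86,code=sm_86 \
     -c your_kernel.cu
```

Listing three architecture pairs produces binaries for Pascal (sm_61), Turing (sm_75), and Ampere (sm_86) cards in one build, at the cost of longer compile time and a larger library.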
@@ -191,6 +187,7 @@
<ClInclude Include="..\..\include\models\basellm.h" />
<ClInclude Include="..\..\include\models\chatglm.h" />
<ClInclude Include="..\..\include\models\factoryllm.h" />
+    <ClInclude Include="..\..\include\models\glm.h" />
<ClInclude Include="..\..\include\models\llama.h" />
<ClInclude Include="..\..\include\models\moss.h" />
<ClInclude Include="..\..\include\models\qwen.h" />
@@ -208,6 +205,7 @@
<ClCompile Include="..\..\src\model.cpp" />
<ClCompile Include="..\..\src\models\basellm.cpp" />
<ClCompile Include="..\..\src\models\chatglm.cpp" />
+    <ClCompile Include="..\..\src\models\glm.cpp" />
<ClCompile Include="..\..\src\models\llama.cpp" />
<ClCompile Include="..\..\src\models\moss.cpp" />
<ClCompile Include="..\..\src\models\qwen.cpp" />
6 changes: 6 additions & 0 deletions example/Win32Demo/fastllm-gpu.vcxproj.filters
@@ -63,6 +63,9 @@
<ClInclude Include="..\..\include\models\factoryllm.h">
<Filter>头文件\models</Filter>
</ClInclude>
+    <ClInclude Include="..\..\include\models\glm.h">
+      <Filter>头文件\models</Filter>
+    </ClInclude>
<ClInclude Include="..\..\include\models\llama.h">
<Filter>头文件\models</Filter>
</ClInclude>
@@ -113,6 +116,9 @@
<ClCompile Include="..\..\src\models\chatglm.cpp">
<Filter>源文件\models</Filter>
</ClCompile>
+    <ClCompile Include="..\..\src\models\glm.cpp">
+      <Filter>源文件\models</Filter>
+    </ClCompile>
<ClCompile Include="..\..\src\models\llama.cpp">
<Filter>源文件\models</Filter>
</ClCompile>
10 changes: 2 additions & 8 deletions example/Win32Demo/fastllm.vcxproj
@@ -72,25 +72,21 @@
<LinkIncremental>true</LinkIncremental>
<LibraryPath>$(CUDA_PATH)\lib\Win32;$(LibraryPath)</LibraryPath>
<IncludePath>$(CUDA_PATH)\include;$(IncludePath)</IncludePath>
-    <TargetExt>.lib</TargetExt>
<OutDir>$(SolutionDir)$(Platform)\$(Configuration)\</OutDir>
<IntDir>$(Platform)\$(Configuration)\</IntDir>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">
<LinkIncremental>true</LinkIncremental>
-    <TargetExt>.lib</TargetExt>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">
<LinkIncremental>false</LinkIncremental>
<LibraryPath>$(CUDA_PATH)\lib\Win32;$(LibraryPath)</LibraryPath>
<IncludePath>$(CUDA_PATH)\include;$(IncludePath)</IncludePath>
-    <TargetExt>.lib</TargetExt>
<OutDir>$(SolutionDir)$(Platform)\$(Configuration)\</OutDir>
<IntDir>$(Platform)\$(Configuration)\</IntDir>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
<LinkIncremental>false</LinkIncremental>
-    <TargetExt>.lib</TargetExt>
</PropertyGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
<ClCompile>
@@ -172,10 +168,6 @@
<EnableCOMDATFolding>true</EnableCOMDATFolding>
<OptimizeReferences>true</OptimizeReferences>
</Link>
-    <CudaCompile>
-      <CodeGeneration>compute_61,sm_61;%(CodeGeneration)</CodeGeneration>
-      <FastMath>true</FastMath>
-    </CudaCompile>
</ItemDefinitionGroup>
<ItemGroup>
<ClInclude Include="..\..\include\device.h" />
@@ -187,6 +179,7 @@
<ClInclude Include="..\..\include\models\basellm.h" />
<ClInclude Include="..\..\include\models\chatglm.h" />
<ClInclude Include="..\..\include\models\factoryllm.h" />
+    <ClInclude Include="..\..\include\models\glm.h" />
<ClInclude Include="..\..\include\models\llama.h" />
<ClInclude Include="..\..\include\models\moss.h" />
<ClInclude Include="..\..\include\models\qwen.h" />
@@ -202,6 +195,7 @@
<ClCompile Include="..\..\src\model.cpp" />
<ClCompile Include="..\..\src\models\basellm.cpp" />
<ClCompile Include="..\..\src\models\chatglm.cpp" />
+    <ClCompile Include="..\..\src\models\glm.cpp" />
<ClCompile Include="..\..\src\models\llama.cpp" />
<ClCompile Include="..\..\src\models\moss.cpp" />
<ClCompile Include="..\..\src\models\qwen.cpp" />
6 changes: 6 additions & 0 deletions example/Win32Demo/fastllm.vcxproj.filters
@@ -57,6 +57,9 @@
<ClInclude Include="..\..\include\models\factoryllm.h">
<Filter>头文件\models</Filter>
</ClInclude>
+    <ClInclude Include="..\..\include\models\glm.h">
+      <Filter>头文件\models</Filter>
+    </ClInclude>
<ClInclude Include="..\..\include\models\llama.h">
<Filter>头文件\models</Filter>
</ClInclude>
@@ -101,6 +104,9 @@
<ClCompile Include="..\..\src\models\chatglm.cpp">
<Filter>源文件\models</Filter>
</ClCompile>
+    <ClCompile Include="..\..\src\models\glm.cpp">
+      <Filter>源文件\models</Filter>
+    </ClCompile>
<ClCompile Include="..\..\src\models\llama.cpp">
<Filter>源文件\models</Filter>
</ClCompile>
