From 02663ae5e5a24faaec8adb6b5b39f8db43169684 Mon Sep 17 00:00:00 2001
From: Hyesoo Kim <100982596+duper203@users.noreply.github.com>
Date: Mon, 7 Oct 2024 14:46:56 -0700
Subject: [PATCH] 1) env setup 2) solar-pro model 3) new results
---
.../11_summary_writing_translation.ipynb | 584 ++++++++++--------
1 file changed, 321 insertions(+), 263 deletions(-)
diff --git a/Solar-Fullstack-LLM-101/11_summary_writing_translation.ipynb b/Solar-Fullstack-LLM-101/11_summary_writing_translation.ipynb
index 26fb562..9eba6dc 100644
--- a/Solar-Fullstack-LLM-101/11_summary_writing_translation.ipynb
+++ b/Solar-Fullstack-LLM-101/11_summary_writing_translation.ipynb
@@ -1,274 +1,332 @@
{
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "\n",
- "\n",
- ""
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "# 11. Summary, Writing, Translation\n",
- "\n",
- "## Overview \n",
- "In this exercise, we will use the Solar API for translation and summarization. By applying the LLM Chain techniques we have previously learned, we can generate summaries in the desired format and translate English to Korean using the Solar Translation API. This tutorial will guide you through the process of combining summarization and translation effectively.\n",
- " \n",
- "## Purpose of the Exercise\n",
- "The purpose of this exercise is to attain proficiency in utilizing the LLM Chain and modify output formats by applying prompt engineering techniques. By the end of this tutorial, users will be able to create summaries and translate text seamlessly, improving their ability to manage multilingual content using the Solar API.\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 2,
- "metadata": {},
- "outputs": [],
- "source": [
- "! pip3 install -qU langchain-upstage requests python-dotenv"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 4,
- "metadata": {},
- "outputs": [],
- "source": [
- "# @title set API key\n",
- "import os\n",
- "import getpass\n",
- "from pprint import pprint\n",
- "import warnings\n",
- "\n",
- "warnings.filterwarnings(\"ignore\")\n",
- "\n",
- "from IPython import get_ipython\n",
- "\n",
- "if \"google.colab\" in str(get_ipython()):\n",
- " # Running in Google Colab. Please set the UPSTAGE_API_KEY in the Colab Secrets\n",
- " from google.colab import userdata\n",
- " os.environ[\"UPSTAGE_API_KEY\"] = userdata.get(\"UPSTAGE_API_KEY\")\n",
- "else:\n",
- " # Running locally. Please set the UPSTAGE_API_KEY in the .env file\n",
- " from dotenv import load_dotenv\n",
- "\n",
- " load_dotenv()\n",
- "\n",
- "if \"UPSTAGE_API_KEY\" not in os.environ:\n",
- " os.environ[\"UPSTAGE_API_KEY\"] = getpass.getpass(\"Enter your Upstage API key: \")\n"
-]
-
- },
- {
- "cell_type": "code",
- "execution_count": 5,
- "metadata": {},
- "outputs": [],
- "source": [
- "solar_text = \"\"\"\n",
- "SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling\n",
- "\n",
- "We introduce SOLAR 10.7B, a large language model (LLM) with 10.7 billion parameters, \n",
- "demonstrating superior performance in various natural language processing (NLP) tasks. \n",
- "Inspired by recent efforts to efficiently up-scale LLMs, \n",
- "we present a method for scaling LLMs called depth up-scaling (DUS), \n",
- "which encompasses depthwise scaling and continued pretraining.\n",
- "In contrast to other LLM up-scaling methods that use mixture-of-experts, \n",
- "DUS does not require complex changes to train and inference efficiently. \n",
- "We show experimentally that DUS is simple yet effective \n",
- "in scaling up high-performance LLMs from small ones. \n",
- "Building on the DUS model, we additionally present SOLAR 10.7B-Instruct, \n",
- "a variant fine-tuned for instruction-following capabilities, \n",
- "surpassing Mixtral-8x7B-Instruct. \n",
- "SOLAR 10.7B is publicly available under the Apache 2.0 license, \n",
- "promoting broad access and application in the LLM field.\n",
- "\"\"\""
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 6,
- "metadata": {},
- "outputs": [],
- "source": [
- "from langchain_core.prompts import PromptTemplate\n",
- "from langchain_core.output_parsers import StrOutputParser\n",
- "from langchain_upstage import ChatUpstage\n",
- "\n",
- "\n",
- "llm = ChatUpstage()\n",
- "\n",
- "prompt_template = PromptTemplate.from_template(\n",
- " \"\"\"\n",
- " Please do {action} on the following text: \n",
- " ---\n",
- " TEXT: {text}\n",
- " \"\"\"\n",
- ")\n",
- "chain = prompt_template | llm | StrOutputParser()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 10,
- "metadata": {},
- "outputs": [
+ "cells": [
{
- "data": {
- "text/plain": [
- "'\\nSOLAR 10.7B, a 10.7 billion parameter large language model, demonstrates superior performance in various NLP tasks. The model uses depth up-scaling (DUS) method, which includes depthwise scaling and continued pretraining. DUS is a simple yet effective way to scale up high-performance LLMs without requiring complex changes to training and inference. Additionally, SOLAR 10.7B-Instruct, a variant fine-tuned for instruction-following capabilities, is available. SOLAR 10.7B is publicly available under the Apache 2.0 license.'"
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "cAxq9H408P7P"
+ },
+ "source": [
+ "\n",
+ "\n",
+ ""
]
- },
- "execution_count": 10,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "chain.invoke({\"action\": \"three line summarize\", \"text\": solar_text})"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 12,
- "metadata": {},
- "outputs": [
+ },
{
- "data": {
- "text/plain": [
- "'We created a new computer program called SOLAR 10.7B that is very good at understanding and speaking different languages. This program can do many things, like answering questions, writing stories, and even helping with homework.\\n\\nTo make this program better, we used a special way called \"depth up-scaling\" or DUS. This means we made the program bigger and stronger, but we did it in a simple way that is easy to understand and use. We also made sure that the program can learn new things on its own, just like a child learning from their experiences.\\n\\nWe showed that our simple way of making the program bigger and stronger works really well. It is much better than other ways that use more complicated ideas.\\n\\nWe also made a special version of the program called SOLAR 10.7B-Instruct. This version is even better at following instructions and doing what it\\'s told. It\\'s so good that it\\'s better than other programs that use even more complicated ideas.\\n\\nNow, everyone can use our program for free and make new things with it. We hope that our program will help people learn and do cool things with computers.'"
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "PS6sWBAe8P7Q"
+ },
+ "source": [
+ "# 11. Summary, Writing, Translation\n",
+ "\n",
+ "## Overview \n",
+ "In this exercise, we will use the Solar API for translation and summarization. By applying the LLM Chain techniques we have previously learned, we can generate summaries in the desired format and translate English to Korean using the Solar Translation API. This tutorial will guide you through the process of combining summarization and translation effectively.\n",
+ "\n",
+ "## Purpose of the Exercise\n",
+ "The purpose of this exercise is to attain proficiency in utilizing the LLM Chain and modify output formats by applying prompt engineering techniques. By the end of this tutorial, users will be able to create summaries and translate text seamlessly, improving their ability to manage multilingual content using the Solar API.\n"
]
- },
- "execution_count": 12,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "chain.invoke(\n",
- " {\n",
- " \"action\": \"Please rewrite this so that children and grandma can understand it easily.\",\n",
- " \"text\": solar_text,\n",
- " }\n",
- ")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 14,
- "metadata": {},
- "outputs": [],
- "source": [
- "from langchain_core.prompts import ChatPromptTemplate\n",
- "\n",
- "enko_translation = ChatUpstage(model=\"solar-1-mini-translate-enko\")\n",
- "\n",
- "chat_prompt = ChatPromptTemplate.from_messages(\n",
- " [\n",
- " (\"human\", \"{text}\"),\n",
- " ]\n",
- ")\n",
- "\n",
- "chain = chat_prompt | enko_translation | StrOutputParser()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 16,
- "metadata": {},
- "outputs": [
+ },
{
- "data": {
- "text/plain": [
- "'우리는 107억 개의 매개변수를 가진 대규모 언어 모델 (LLM)인 SOLAR 10.7B를 소개합니다.'"
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {
+ "id": "mVqSkn4s8P7Q"
+ },
+ "outputs": [],
+ "source": [
+ "! pip3 install -qU langchain-upstage requests python-dotenv"
]
- },
- "execution_count": 16,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "chain.invoke(\n",
- " {\n",
- " \"text\": \"\"\"We introduce SOLAR 10.7B, a large language model (LLM) with 10.7 billion parameters, \n",
- "demonstrating superior performance in various natural language processing (NLP) tasks. \"\"\"\n",
- " }\n",
- ")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 28,
- "metadata": {},
- "outputs": [],
- "source": [
- "style_prompt = ChatPromptTemplate.from_messages(\n",
- " [\n",
- " (\"human\", \"we present a method for scaling LLMs called depth up-scaling (DUS)\"),\n",
- " (\"ai\", \"DUS라는 방법에 대해 알려줄께. DUS는 LLMs를 확장하는 방법이야.\"),\n",
- " (\n",
- " \"human\",\n",
- " \"In contrast to other LLM up-scaling methods that use mixture-of-experts, DUS does not require complex changes to train and inference efficiently.\",\n",
- " ),\n",
- " (\n",
- " \"ai\",\n",
- " \"다른 LLM 확장 방법과는 달리, DUS는 복잡한 변경 사항 없이 효율적으로 훈련하고 추론할 수 있어.\",\n",
- " ),\n",
- " (\"human\", \"{text}\"),\n",
- " ]\n",
- ")\n",
- "\n",
- "chain = style_prompt | enko_translation | StrOutputParser()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 29,
- "metadata": {},
- "outputs": [
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {
+ "id": "70AeDpel8P7Q"
+ },
+ "outputs": [],
+ "source": [
+ "# @title set API key\n",
+ "from pprint import pprint\n",
+ "import os\n",
+ "\n",
+ "import warnings\n",
+ "\n",
+ "warnings.filterwarnings(\"ignore\")\n",
+ "\n",
+ "if \"google.colab\" in str(get_ipython()):\n",
+ " # Running in Google Colab. Please set the UPSTAGE_API_KEY in the Colab Secrets\n",
+ " from google.colab import userdata\n",
+ "\n",
+ " os.environ[\"UPSTAGE_API_KEY\"] = userdata.get(\"UPSTAGE_API_KEY\")\n",
+ "else:\n",
+ " # Running locally. Please set the UPSTAGE_API_KEY in the .env file\n",
+ " from dotenv import load_dotenv\n",
+ "\n",
+ " load_dotenv()\n",
+ "\n",
+ "assert (\n",
+ " \"UPSTAGE_API_KEY\" in os.environ\n",
+ "), \"Please set the UPSTAGE_API_KEY environment variable\""
+ ]
+ },
{
- "data": {
- "text/plain": [
- "'우리는 107억 개의 매개 변수를 가진 대규모 언어 모델(LLM)인 SOLAR 10.7B를 소개하며, 다양한 자연어 처리(NLP) 작업에서 우수한 성능을 보여줘.'"
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {
+ "id": "G4RC8p_M8P7Q"
+ },
+ "outputs": [],
+ "source": [
+ "solar_text = \"\"\"\n",
+ "SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling\n",
+ "\n",
+ "We introduce SOLAR 10.7B, a large language model (LLM) with 10.7 billion parameters,\n",
+ "demonstrating superior performance in various natural language processing (NLP) tasks.\n",
+ "Inspired by recent efforts to efficiently up-scale LLMs,\n",
+ "we present a method for scaling LLMs called depth up-scaling (DUS),\n",
+ "which encompasses depthwise scaling and continued pretraining.\n",
+ "In contrast to other LLM up-scaling methods that use mixture-of-experts,\n",
+ "DUS does not require complex changes to train and inference efficiently.\n",
+ "We show experimentally that DUS is simple yet effective\n",
+ "in scaling up high-performance LLMs from small ones.\n",
+ "Building on the DUS model, we additionally present SOLAR 10.7B-Instruct,\n",
+ "a variant fine-tuned for instruction-following capabilities,\n",
+ "surpassing Mixtral-8x7B-Instruct.\n",
+ "SOLAR 10.7B is publicly available under the Apache 2.0 license,\n",
+ "promoting broad access and application in the LLM field.\n",
+ "\"\"\""
]
- },
- "execution_count": 29,
- "metadata": {},
- "output_type": "execute_result"
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "metadata": {
+ "id": "IHYzAiCT8P7R"
+ },
+ "outputs": [],
+ "source": [
+ "from langchain_core.prompts import PromptTemplate\n",
+ "from langchain_core.output_parsers import StrOutputParser\n",
+ "from langchain_upstage import ChatUpstage\n",
+ "\n",
+ "\n",
+ "llm = ChatUpstage(model=\"solar-pro\")\n",
+ "\n",
+ "prompt_template = PromptTemplate.from_template(\n",
+ " \"\"\"\n",
+ " Please do {action} on the following text:\n",
+ " ---\n",
+ " TEXT: {text}\n",
+ " \"\"\"\n",
+ ")\n",
+ "chain = prompt_template | llm | StrOutputParser()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "metadata": {
+ "id": "KwTDNcFq8P7R",
+ "outputId": "67a163fa-feb0-432e-bc9e-fab96960fa7e",
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 70
+ }
+ },
+ "outputs": [
+ {
+ "output_type": "execute_result",
+ "data": {
+ "text/plain": [
+ "'SOLAR 10.7B is a 10.7B parameter LLM with improved NLP performance, scaled using a simple yet effective depth up-scaling (DUS) method. DUS involves depthwise scaling and continued pretraining, without the need for complex changes like mixture-of-experts. SOLAR 10.7B-Instruct, a fine-tuned variant for instruction-following, outperforms Mixtral-8x7B-Instruct. The model is publicly accessible under the Apache 2.0 license.'"
+ ],
+ "application/vnd.google.colaboratory.intrinsic+json": {
+ "type": "string"
+ }
+ },
+ "metadata": {},
+ "execution_count": 5
+ }
+ ],
+ "source": [
+ "chain.invoke({\"action\": \"three line summarize\", \"text\": solar_text})"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "metadata": {
+ "id": "PdjOdZjV8P7R",
+ "outputId": "f63096a0-977c-47f2-80ae-0a772d72c02e",
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 139
+ }
+ },
+ "outputs": [
+ {
+ "output_type": "execute_result",
+ "data": {
+ "text/plain": [
+ "'We\\'re excited to share about our new \"SOLAR 10.7B\" toy! It\\'s a big, smart toy with 10.7 billion tiny parts, and it can do many cool things with words and sentences. \\n\\nWe made SOLAR 10.7B by using a special trick called \"depth up-scaling.\" This trick helps us make our smart toy even better without making it too complicated. \\n\\nOur smart toy is different from other big toys because it doesn\\'t need a special helper called \"mixture-of-experts\" to work well. \\n\\nWe also made a special version of SOLAR 10.7B called \"SOLAR 10.7B-Instruct.\" This version is great at following instructions and is even better than another popular toy called \"Mixtral-8x7B-Instruct.\"\\n\\nWe want everyone to have fun with SOLAR 10.7B, so we\\'re sharing it for free! You can use it for many word and sentence projects, and it\\'s available under the \"Apache 2.0 license.\"'"
+ ],
+ "application/vnd.google.colaboratory.intrinsic+json": {
+ "type": "string"
+ }
+ },
+ "metadata": {},
+ "execution_count": 7
+ }
+ ],
+ "source": [
+ "chain.invoke(\n",
+ " {\n",
+ " \"action\": \"Please rewrite this so that children and grandma can understand it easily.\",\n",
+ " \"text\": solar_text,\n",
+ " }\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "metadata": {
+ "id": "fCQN_KNg8P7R"
+ },
+ "outputs": [],
+ "source": [
+ "from langchain_core.prompts import ChatPromptTemplate\n",
+ "\n",
+ "enko_translation = ChatUpstage(model=\"solar-1-mini-translate-enko\")\n",
+ "\n",
+ "chat_prompt = ChatPromptTemplate.from_messages(\n",
+ " [\n",
+ " (\"human\", \"{text}\"),\n",
+ " ]\n",
+ ")\n",
+ "\n",
+ "chain = chat_prompt | enko_translation | StrOutputParser()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "metadata": {
+ "id": "3-hmu_8N8P7R",
+ "outputId": "b200c14a-833e-4a5c-cee5-6f970638a9cf",
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 35
+ }
+ },
+ "outputs": [
+ {
+ "output_type": "execute_result",
+ "data": {
+ "text/plain": [
+ "'107억 개의 매개변수를 가진 대규모 언어 모델(LLM)인 SOLAR 10.7B를 소개합니다. \\n다양한 자연어 처리(NLP) 작업에서 뛰어난 성능을 보여줍니다.'"
+ ],
+ "application/vnd.google.colaboratory.intrinsic+json": {
+ "type": "string"
+ }
+ },
+ "metadata": {},
+ "execution_count": 9
+ }
+ ],
+ "source": [
+ "chain.invoke(\n",
+ " {\n",
+ " \"text\": \"\"\"We introduce SOLAR 10.7B, a large language model (LLM) with 10.7 billion parameters,\n",
+ "demonstrating superior performance in various natural language processing (NLP) tasks. \"\"\"\n",
+ " }\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "metadata": {
+ "id": "9-Bwaj9f8P7S"
+ },
+ "outputs": [],
+ "source": [
+ "style_prompt = ChatPromptTemplate.from_messages(\n",
+ " [\n",
+ " (\"human\", \"we present a method for scaling LLMs called depth up-scaling (DUS)\"),\n",
+ " (\"ai\", \"DUS라는 방법에 대해 알려줄께. DUS는 LLMs를 확장하는 방법이야.\"),\n",
+ " (\n",
+ " \"human\",\n",
+ " \"In contrast to other LLM up-scaling methods that use mixture-of-experts, DUS does not require complex changes to train and inference efficiently.\",\n",
+ " ),\n",
+ " (\n",
+ " \"ai\",\n",
+ " \"다른 LLM 확장 방법과는 달리, DUS는 복잡한 변경 사항 없이 효율적으로 훈련하고 추론할 수 있어.\",\n",
+ " ),\n",
+ " (\"human\", \"{text}\"),\n",
+ " ]\n",
+ ")\n",
+ "\n",
+ "chain = style_prompt | enko_translation | StrOutputParser()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "metadata": {
+ "id": "SiqtsUMT8P7S",
+ "outputId": "3b3c263d-c2b9-4f50-e8f0-2aaaf86418f8",
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 35
+ }
+ },
+ "outputs": [
+ {
+ "output_type": "execute_result",
+ "data": {
+ "text/plain": [
+ "'우리는 SOLAR 10.7B라는 대형 언어 모델(LLM)을 소개하고, 다양한 자연어 처리(NLP) 작업에서 우수한 성능을 보여줘.'"
+ ],
+ "application/vnd.google.colaboratory.intrinsic+json": {
+ "type": "string"
+ }
+ },
+ "metadata": {},
+ "execution_count": 11
+ }
+ ],
+ "source": [
+ "chain.invoke(\n",
+ " {\n",
+ " \"text\": \"\"\"We introduce SOLAR 10.7B, a large language model (LLM) with 10.7 billion parameters, demonstrating superior performance in various natural language processing (NLP) tasks. \"\"\"\n",
+ " }\n",
+ ")"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.10"
+ },
+ "colab": {
+ "provenance": []
}
- ],
- "source": [
- "chain.invoke(\n",
- " {\n",
- " \"text\": \"\"\"We introduce SOLAR 10.7B, a large language model (LLM) with 10.7 billion parameters, demonstrating superior performance in various natural language processing (NLP) tasks. \"\"\"\n",
- " }\n",
- ")"
- ]
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "Python 3 (ipykernel)",
- "language": "python",
- "name": "python3"
},
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.10.10"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 4
-}
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
\ No newline at end of file