[Neural Speed] Support continuous batching + beam search inference in LLAMA #145

zhentaoyu · 2024-02-29T09:02:33Z

Type of Change

as title

Description

detail description
Issues: jira-1186

This PR:
- adds continuous batching inference (no padding) and beam search support for the llama architecture
- makes continuous batching the default (default=True), since it is the only batching mode we have decided to support on the other models and it improves throughput
- updates developer_document.md
- makes the model_server example use llama by default
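To illustrate the "no padding" aspect of the first bullet, here is a minimal Python sketch. It is not Neural Speed's implementation; `pad_batch` and `pack_batch` are hypothetical helper names, and the token ids are made up. It only shows why packing variable-length sequences with offsets processes fewer tokens than padding them to a common length.

```python
from typing import List, Tuple


def pad_batch(seqs: List[List[int]], pad_id: int = 0) -> List[List[int]]:
    """Static batching: right-pad every sequence to the longest one."""
    max_len = max(len(s) for s in seqs)
    return [s + [pad_id] * (max_len - len(s)) for s in seqs]


def pack_batch(seqs: List[List[int]]) -> Tuple[List[int], List[int]]:
    """Padding-free batching: concatenate tokens and record start offsets."""
    flat: List[int] = []
    offsets: List[int] = [0]
    for s in seqs:
        flat.extend(s)
        offsets.append(len(flat))  # offsets[i]..offsets[i+1] delimits sequence i
    return flat, offsets


# Three requests with different prompt lengths (made-up token ids).
seqs = [[11, 12, 13, 14, 15], [21, 22], [31, 32, 33]]

padded = pad_batch(seqs)
flat, offsets = pack_batch(seqs)

padded_tokens = sum(len(row) for row in padded)  # includes pad positions
packed_tokens = len(flat)                        # real tokens only
second = flat[offsets[1]:offsets[2]]             # recover the second request
```

With these inputs the padded layout holds 15 token slots while the packed layout holds 10, and each request can still be recovered from the offsets.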

llama2-7b beam search results alignment

PyTorch fp32 vs. Neural Speed f32 (NS_SIMD_VEC_DOT_F16=OFF)

test prompts:

  prompts = [
      "she opened the door and see",
      "tell me 10 things about jazz music",
      "What is the meaning of life?",
      "To be, or not to be, that is the question: Whether 'tis nobler in the mind to suffer"\
      " The slings and arrows of outrageous fortune, "\
      "Or to take arms against a sea of troubles."\
      "And by opposing end them. To die—to sleep,",
      "Tell me an interesting fact about llamas.",
      "What is the best way to cook a steak?",
      "Are you familiar with the Special Theory of Relativity and can you explain it to me?",
      "Recommend some interesting books to read.",
      "What is the best way to learn a new language?",
      "How to get a job at Intel?",
      "If you could have any superpower, what would it be?",
      "I want to learn how to play the piano.",
  ]
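For readers unfamiliar with the decoding strategy being compared on these prompts, the following is a toy beam search in Python. It is a sketch only: `beam_search` and `toy_model` are hypothetical names, the probability table is invented, and there is no end-of-sequence handling or length penalty. It does not reflect the beam size or implementation used in this PR.

```python
import math
from typing import Callable, Dict, List, Tuple

Seq = Tuple[int, ...]


def beam_search(
    next_log_probs: Callable[[Seq], Dict[int, float]],
    start: Seq,
    num_beams: int,
    max_new_tokens: int,
) -> List[Tuple[Seq, float]]:
    """Keep the num_beams highest-scoring sequences at every step."""
    beams: List[Tuple[Seq, float]] = [(start, 0.0)]
    for _ in range(max_new_tokens):
        candidates: List[Tuple[Seq, float]] = []
        for seq, score in beams:
            # Expand each live beam with every possible next token.
            for tok, lp in next_log_probs(seq).items():
                candidates.append((seq + (tok,), score + lp))
        # Sort by cumulative log-probability and prune to the beam width.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:num_beams]
    return beams


def toy_model(seq: Seq) -> Dict[int, float]:
    """Invented next-token distribution that depends only on the last token."""
    table = {
        0: {1: math.log(0.6), 2: math.log(0.4)},
        1: {3: math.log(0.5), 4: math.log(0.5)},
        2: {5: math.log(0.9), 6: math.log(0.1)},
    }
    return table.get(seq[-1], {7: 0.0})


best_seq, best_score = beam_search(toy_model, (0,), num_beams=2, max_new_tokens=2)[0]
```

In this toy table a greedy decoder would pick token 1 first (probability 0.6) and end with a sequence of probability 0.3, while the beam of width 2 keeps token 2 alive and finds `(0, 2, 5)` with probability 0.36.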

PyTorch results:


she opened the door and see a man standing in front of her.
she said "who are you and what do you want?"
the man said "i'm your husband and i want a divorce"
she said "but we've been married for 20 years"
the man said "i know but i want a divorce"
she said "but we have 3 kids together"
the man said "i know but i want a divorce"
she said "but we have a mortgage together"
the man said "i know but i want a divorce"
she said "but we
===========================

tell me 10 things about jazz music
10 Things You Didn’t Know About Jazz Music
Jazz music is one of the most popular genres of music in the world. It has been around for over 100 years and continues to be popular today. Here are 10 things you didn’t know about jazz music.
1. Jazz music originated in New Orleans, Louisiana in the late 1800s.
2. The first jazz band was formed in New Orleans in 1895.
3. The first jazz record was released in 1917.
4. The first jazz festival
===========================

What is the meaning of life? What is the meaning of death? What is the meaning of love? What is the meaning of life? What is the meaning of death? What is the meaning of love? What is the meaning of life? What is the meaning of death? What is the meaning of love? What is the meaning of life? What is the meaning of death? What is the meaning of love? What is the meaning of life? What is the meaning of death? What is the meaning of love? What is the meaning of life? What is the meaning of death? What is the meaning of love? What is the meaning of life? What is
===========================

To be, or not to be, that is the question: Whether 'tis nobler in the mind to suffer The slings and arrows of outrageous fortune, Or to take arms against a sea of troubles.And by opposing end them. To die—to sleep,No more; and by a sleep to say we endThe heart-ache and the thousand natural shocksThat flesh is heir to, 'tis a consummationDevoutly to be wish'd. To die, to sleep;To sleep: perchance to dream: ay, there's the rub;For in that sleep of death what dreams may comeWhen we have shuffled off this mortal coil,Must give us pause: there's the respectThat makes calamity of so long life;For who would bear the whips and scorns of time,The oppress
===========================

Tell me an interesting fact about llamas. Tell me an interesting fact about llamas. Tell me an interesting fact about llamas. Tell me an interesting fact about llamas. Tell me an interesting fact about llamas. Tell me an interesting fact about llamas. Tell me an interesting fact about llamas. Tell me an interesting fact about llamas. Tell me an interesting fact about llamas. Tell me an interesting fact about llamas.
===========================

What is the best way to cook a steak?
What is the best way to cook a steak in the oven?
What is the best way to cook a steak on the stove?
What is the best way to cook a steak in the microwave?
What is the best way to cook a steak on the grill?
What is the best way to cook a steak in a pan?
What is the best way to cook a steak in a cast iron skillet?
What is the best way to cook a steak on a George Foreman grill?
What is the best way to cook a steak
===========================

Are you familiar with the Special Theory of Relativity and can you explain it to me?
What is the Special Theory of Relativity?
The Special Theory of Relativity was developed by Albert Einstein in 1905. It states that the laws of physics are the same for all non-accelerating observers, and that the speed of light in a vacuum is the same for all observers, regardless of their relative speed or motion.
The Special Theory of Relativity states that the laws of physics are the same for all non-accelerating observers, and that the speed of light in a vacuum is the same for all observers, regardless of their relative
===========================

Recommend some interesting books to read.
Recommend some interesting books to read. (Read 1046 times)
Re: Recommend some interesting books to read.
I'm currently reading "The 48 Laws of Power" by Robert Greene. It's a very interesting book.
I'm currently reading "The 48 Laws of Power" by Robert Greene. It's a very interesting book. I'm currently reading "The 48 Laws of Power" by Robert Greene. It's a very interesting book. I'm currently reading "The 48 La
===========================

What is the best way to learn a new language?
The best way to learn a new language is to immerse yourself in it. This can be done by living in a country where the language is spoken, taking classes, or using language-learning software.
What is the best way to learn a new language fast?
There is no one-size-fits-all answer to this question, as the best way to learn a new language fast will vary depending on the individual’s learning style and preferences. However, some tips on how to learn a new language fast include:
1. Start by learning the basics of the language, such as the alphabet and
===========================

How to get a job at Intel?
How to find Intel jobs?
Ask a question about working or interviewing at Intel. Our community is ready to answer.Ask a Question
Questions about Intel
If you were in charge, what would you do to make Intel a better place to work?
On average, how many hours do you work a day at Intel?
How long does it take to get hired from start to finish at Intel? What are the steps along the way?
On average, how many hours do you work a day at Intel Corporation?
How long does it take to get hired from start to finish at Intel
===========================

If you could have any superpower, what would it be? Would you be able to fly? Would you be able to read people’s minds? Would you be able to control the weather? Would you be able to turn invisible? Would you be able to stop time? Would you be able to control fire? Would you be able to control electricity? Would you be able to control gravity? Would you be able to control magnetism? Would you be able to control sound waves? Would you be able to control light waves? Would you be able to control radio waves? Would you be able to control x-rays? Would you be able to control gamma rays? Would you
===========================

I want to learn how to play the piano. I want to learn how to play the piano. I want to learn how to play the piano. I want to learn how to play the piano. I want to learn how to play the piano. I want to learn how to play the piano. I want to learn how to play the piano. I want to learn how to play the piano. I want to learn how to play the piano. I want to learn how to play the piano. I want to learn how to play the piano. I want to learn how to play the piano. I want to learn how to play the piano. I want to learn how to play the
===========================

Neural Speed results:


  she opened the door and see a man standing in front of her.
  she said "who are you and what do you want?"
  the man said "i'm your husband and i want a divorce"
  she said "but we've been married for 20 years"
  the man said "i know but i want a divorce"
  she said "but we have 3 kids together"
  the man said "i know but i want a divorce"
  she said "but we have a mortgage together"
  the man said "i know but i want a divorce"
  she said "but we
  ===========================

  tell me 10 things about jazz music
  10 Things You Didn’t Know About Jazz Music
  Jazz music is one of the most popular genres of music in the world. It has been around for over 100 years and continues to be popular today. Here are 10 things you didn’t know about jazz music.
  1. Jazz music originated in New Orleans, Louisiana in the late 1800s.
  2. The first jazz band was formed in New Orleans in 1895.
  3. The first jazz record was released in 1917.
  4. The first jazz festival
  ===========================
  What is the meaning of life? What is the meaning of death? What is the meaning of love? What is the meaning of life? What is the meaning of death? What is the meaning of love? What is the meaning of life? What is the meaning of death? What is the meaning of love? What is the meaning of life? What is the meaning of death? What is the meaning of love? What is the meaning of life? What is the meaning of death? What is the meaning of love? What is the meaning of life? What is the meaning of death? What is the meaning of love? What is the meaning of life? What is
  ===========================
  To be, or not to be, that is the question: Whether 'tis nobler in the mind to suffer The slings and arrows of outrageous fortune, Or to take arms against a sea of troubles.And by opposing end them. To die—to sleep,No more; and by a sleep to say we endThe heart-ache and the thousand natural shocksThat flesh is heir to, 'tis a consummationDevoutly to be wish'd. To die, to sleep;To sleep: perchance to dream: ay, there's the rub;For in that sleep of death what dreams may comeWhen we have shuffled off this mortal coil,Must give us pause: there's the respectThat makes calamity of so long life;For who would bear the whips and scorns of time,The oppress
  ===========================
  Tell me an interesting fact about llamas. Tell me an interesting fact about llamas. Tell me an interesting fact about llamas. Tell me an interesting fact about llamas. Tell me an interesting fact about llamas. Tell me an interesting fact about llamas. Tell me an interesting fact about llamas. Tell me an interesting fact about llamas. Tell me an interesting fact about llamas. Tell me an interesting fact about llamas.
  ===========================
  What is the best way to cook a steak?
  What is the best way to cook a steak in the oven?
  What is the best way to cook a steak on the stove?
  What is the best way to cook a steak in the microwave?
  What is the best way to cook a steak on the grill?
  What is the best way to cook a steak in a pan?
  What is the best way to cook a steak in a cast iron skillet?
  What is the best way to cook a steak on a George Foreman grill?
  What is the best way to cook a steak
  ===========================
  Are you familiar with the Special Theory of Relativity and can you explain it to me?
  What is the Special Theory of Relativity?
  The Special Theory of Relativity was developed by Albert Einstein in 1905. It states that the laws of physics are the same for all non-accelerating observers, and that the speed of light in a vacuum is the same for all observers, regardless of their relative speed or motion.
  The Special Theory of Relativity states that the laws of physics are the same for all non-accelerating observers, and that the speed of light in a vacuum is the same for all observers, regardless of their relative
  ===========================
  Recommend some interesting books to read.
  Recommend some interesting books to read. (Read 1046 times)
  Re: Recommend some interesting books to read.
  I'm currently reading "The 48 Laws of Power" by Robert Greene. It's a very interesting book.
  I'm currently reading "The 48 Laws of Power" by Robert Greene. It's a very interesting book. I'm currently reading "The 48 Laws of Power" by Robert Greene. It's a very interesting book. I'm currently reading "The 48 La
  ===========================
  What is the best way to learn a new language?
  The best way to learn a new language is to immerse yourself in it. This can be done by living in a country where the language is spoken, taking classes, or using language-learning software.
  What is the best way to learn a new language fast?
  There is no one-size-fits-all answer to this question, as the best way to learn a new language fast will vary depending on the individual’s learning style and preferences. However, some tips on how to learn a new language fast include:
  1. Start by learning the basics of the language, such as the alphabet and
  ===========================
  How to get a job at Intel?
  How to find Intel jobs?
  Ask a question about working or interviewing at Intel. Our community is ready to answer.Ask a Question
  Questions about Intel
  If you were in charge, what would you do to make Intel a better place to work?
  On average, how many hours do you work a day at Intel?
  How long does it take to get hired from start to finish at Intel? What are the steps along the way?
  On average, how many hours do you work a day at Intel Corporation?
  How long does it take to get hired from start to finish at Intel
  ===========================
  If you could have any superpower, what would it be? Would you be able to fly? Would you be able to read people’s minds? Would you be able to control the weather? Would you be able to turn invisible? Would you be able to stop time? Would you be able to control fire? Would you be able to control electricity? Would you be able to control gravity? Would you be able to control magnetism? Would you be able to control sound waves? Would you be able to control light waves? Would you be able to control radio waves? Would you be able to control x-rays? Would you be able to control gamma rays? Would you
  ===========================
  I want to learn how to play the piano. I want to learn how to play the piano. I want to learn how to play the piano. I want to learn how to play the piano. I want to learn how to play the piano. I want to learn how to play the piano. I want to learn how to play the piano. I want to learn how to play the piano. I want to learn how to play the piano. I want to learn how to play the piano. I want to learn how to play the piano. I want to learn how to play the piano. I want to learn how to play the piano. I want to learn how to play the
  ===========================
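The two result sets above appear to differ only in indentation and blank lines. A check of that kind could be automated roughly as follows. This is a sketch: `normalize` and `mismatches` are hypothetical helper names, not part of Neural Speed or PyTorch, and the sample strings are shortened stand-ins for real outputs.

```python
from typing import List


def normalize(text: str) -> str:
    """Strip per-line indentation and drop blank lines before comparing."""
    lines = [ln.strip() for ln in text.splitlines()]
    return "\n".join(ln for ln in lines if ln)


def mismatches(reference: List[str], candidate: List[str]) -> List[int]:
    """Return the indices of prompts whose outputs differ after normalization."""
    if len(reference) != len(candidate):
        raise ValueError("both runs must cover the same prompts")
    return [
        i
        for i, (ref, cand) in enumerate(zip(reference, candidate))
        if normalize(ref) != normalize(cand)
    ]


# Shortened stand-ins: the first pair differs only in whitespace,
# the second pair differs in content.
reference = ['she said\n"who are you"', "1. first\n2. second"]
candidate = ['  she said\n  "who are you"', "1. first\n\n2. other"]

diff = mismatches(reference, candidate)
```

An empty list would mean every prompt produced the same text in both runs; here only index 1 is reported.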

Q4_Bestla_g32_cint8 results

non-MHA fusion:


  she opened the door and see the room was filled with people, all dressed in white, and they were all staring at her, their eyes filled with a mixture of shock and disbelief, as if they couldn't believe she was actually there, as if she was a ghost or a figment of their imagination, and she felt a shiver run down her spine as she realized that she was indeed a ghost, a ghost who had been trapped in this place for far too long, and she knew that she had to find a way to escape, to find a way to break free from this prison, to find a way to
  ===========================
  tell me 10 things about jazz music
  everybody should know
  Jazz is a genre of music that originated in the African-American communities of the southern United States in the late 19th and early 20th centuries. It is known for its improvisation, syncopated rhythms, and blues and swing influences. Here are 10 things that everyone should know about jazz music:

  1. Jazz is a genre of music that originated in the African-American communities of the southern United States in the late 19th and early 20th centuries.

  2. Jazz is known for its impro
  ===========================
  What is the meaning of life? This is a question that has puzzled philosophers, theologians, and scientists for centuries. everybody has their own answer to this question, and there is no one definitive answer. However, here are some possible answers:

  1. To find happiness and fulfillment: Many people believe that the meaning of life is to find happiness and fulfillment. This can be achieved through personal growth, relationships, and experiences that bring joy and satisfaction.
  2. To make a positive impact: Some people believe that the meaning of life is to make a positive impact on the world. This can be done through acts of
  ===========================
  To be, or not to be, that is the question: Whether 'tis nobler in the mind to suffer The slings and arrows of outrageous fortune, Or to take arms against a sea of troubles.And by opposing end them. To die—to sleep,To sleep—To sleep, perchance to dream: Ay, there's the rub; For in that sleep of death what dreams may comeWhen we have shuffled off this mortal coil,Must give us not a jot of pause: There's nothing either good or bad,But to sleep, perchance to dream: Ay, there's the rub; For in that sleep of death what dreams may comeWhen we have shuffled off this mortal coil,Must give us not a jot of pause: There's nothing either good or bad,But to
  ===========================
  Tell me an interesting fact about llamas. everybody knows that they spit, but did you know that they also have a unique way of communicating with each other?
  Yes, llamas have a unique way of communicating with each other. They use a variety of vocalizations, such as bleats, grunts, and snorts, to convey information and express their emotions. They also use body language, such as posture and facial expressions, to communicate with each other.
  Here are some interesting facts about llamas:
  1. Llamas have a unique way of communicating with each other. They use a variety of vocalizations,
  ===========================
  What is the best way to cook a steak?
  everyone has their own preferences when it comes to cooking a steak, but there are a few general tips that can help you achieve a perfectly cooked steak. Here are some of the best ways to cook a steak:

  1. Grilling: Grilling is a great way to cook a steak, as it allows the meat to get a nice char on the outside while staying juicy on the inside. To grill a steak, simply season the steak with salt, pepper, and any other seasonings you like, then place it on a preheated grill. Cook
  ===========================
  Are you familiar with the Special Theory of Relativity and can you explain it to me?
  everyone knows that Einstein's theory of relativity is one of the most important scientific discoveries of the 20th century, and it has had a profound impact on our understanding of space, time, and gravity. But what exactly is the Special Theory of Relativity, and how does it differ from the General Theory of Relativity?

  The Special Theory of Relativity, proposed by Albert Einstein in 1905, is a theory of physics that challenges our classical understanding of space and time. In essence, it states that the laws of physics are the same for all observers in
  ===========================
  Recommend some interesting books to read. everybody has their own preferences when it comes to books, but here are a few that i think are worth checking out:

  1. "The Brief Wondrous Life of Oscar Wao" by Junot Díaz: This Pulitzer Prize-winning novel tells the story of Oscar, a young Dominican-American man growing up in New Jersey, and his struggles with identity, culture, and family.
  2. "The Immortal Life of Henrietta Lacks" by Rebecca Skloot: This book explores the story of Henrietta Lacks, a poor black tob
  ===========================
  What is the best way to learn a new language?
  everybody learns in different ways, and there is no one-size-fits-all approach to language learning. However, there are some general principles and strategies that can help you learn a new language effectively. Here are some tips to help you learn a new language:
  1. Set clear goals: Before you start learning a new language, it's important to set clear goals for yourself. What do you want to achieve with your language learning? Are you looking to become fluent, or do you just want to learn basic phrases and vocabulary? Setting goals will help you stay motivated and focused
  ===========================
  How to get a job at Intel? (2023 Guide)
  hopefully, this guide will help you learn how to get a job at Intel, one of the world's leading technology companies. Here are some tips and insights to help you increase your chances of being hired by Intel:

  1. Research Intel's products and services: Before applying for a job at Intel, it's essential to research the company's products and services. This will help you understand the types of jobs available and the skills and qualifications required for each position.
  2. Meet the basic qualifications: Intel typically requires a bachelor'
  ===========================
  If you could have any superpower, what would it be? 🦸‍♀️🦸‍♂️🔥💨
  everybody wants to have a superpower, but what if you could have any superpower you wanted? 🤔 what would you choose? 🤔

  Personally, I would choose the ability to teleport anywhere in the world instantly. 🌏🛫 No more long flights or traffic jams, just teleport to your destination and enjoy your trip! 😍

  What about you? What superpower would you choose? �
  ===========================
  I want to learn how to play the piano. nobody in my family plays the piano, and I don't know anyone who can teach me. Can you help me find a teacher or a class to learn how to play the piano?

  You're in luck! There are many resources available to help you learn how to play the piano, even if you don't have a family member or friend who can teach you. Here are a few options you can consider:

  1. Local Music Schools: Many cities have music schools that offer piano lessons for beginners. These schools typically have experienced teachers who can guide you through the learning process. You can search
  ===========================

MHA fusion:


she opened the door and see the room was filled with people, all dressed in white, and they were all staring at her, their eyes filled with a mixture of shock and disbelief, as if they couldn't believe she was actually there, as if she was a ghost or a figment of their imagination, and she felt a shiver run down her spine as she realized that she was the only one not dressed in white, and she felt like an outsider, a stranger in a strange land, and she wondered how she had ended up here, and why she was the only one not dressed in white, and she felt
  ===========================
  tell me 10 things about jazz music
  everybody should know
  Jazz is a genre of music that originated in the African-American communities of the southern United States in the late 19th and early 20th centuries. It is known for its improvisation, syncopated rhythms, and blues and swing influences. Here are 10 things that everybody should know about jazz music:

  1. Jazz is a genre of music that originated in the African-American communities of the southern United States in the late 19th and early 20th centuries.
  2. Jazz is known for its improvis
  ===========================
  What is the meaning of life? This is a question that has puzzled philosophers, theologians, and scientists for centuries. everybody has their own answer to this question, and there is no one definitive answer. However, here are some possible answers:

  1. To find happiness and fulfillment: Many people believe that the meaning of life is to find happiness and fulfillment. This can be achieved through personal growth, relationships, and experiences that bring joy and satisfaction.
  2. To make a positive impact: Some people believe that the meaning of life is to make a positive impact on the world. This can be done through acts of
  ===========================
  To be, or not to be, that is the question: Whether 'tis nobler in the mind to suffer The slings and arrows of outrageous fortune, Or to take arms against a sea of troubles.And by opposing end them. To die—to sleep, To sleep—To sleep, perchance to dream: Ay, there's the rub; For in that sleep of death what dreams may come When we have shuffled off this mortal coil, Must give us not a jot of pause: There's nothing either good or bad, But to sleep, perchance to dream: Ay, there's the rub; For in that sleep of death what dreams may come When we have shuffled off this mortal coil, Must give us not a jot of pause: There's nothing either good or bad, But to sleep,
  ===========================
  Tell me an interesting fact about llamas. everybody knows that they spit, but did you know that they also have a unique way of communicating with each other?
  Yes, llamas have a unique way of communicating with each other. They use a variety of vocalizations, including low grunts, high-pitched squeaks, and snorts, to convey information and express themselves. They also use body language, such as posture and facial expressions, to communicate with each other and with humans.
  Here are some interesting facts about llamas:
  1. Llamas have a unique way of communicating with each other. They
  ===========================
  What is the best way to cook a steak?
  obviously, the best way to cook a steak is a matter of personal preference, but here are a few popular methods:

  1. Grilling: Grilling is a popular way to cook steaks because it allows for a nice sear on the outside while keeping the inside juicy. To grill a steak, simply season the steak with salt, pepper, and any other seasonings you like, then place it on a preheated grill. Cook for 3-5 minutes per side, or until the steak is cooked to your desired level of doneness.

  2.
  ===========================
  Are you familiar with the Special Theory of Relativity and can you explain it to me?
  everyone knows that Einstein's theory of relativity is one of the most important scientific discoveries of the 20th century, and it has had a profound impact on our understanding of space, time, and gravity. But what exactly is the theory of relativity, and how does it work? In this article, we'll take a closer look at the Special Theory of Relativity, which was developed by Albert Einstein in 1905.

  The Special Theory of Relativity is based on two main postulates:

  1. The laws of physics are the same for all obser
  ===========================
  Recommend some interesting books to read. everybody has their own preferences when it comes to books, but here are a few that i think are worth checking out:

  1. "The Brief Wondrous Life of Oscar Wao" by Junot Díaz: This Pulitzer Prize-winning novel tells the story of a young Dominican-American man growing up in New Jersey and his struggles with identity, culture, and family.
  2. "The Immortal Life of Henrietta Lacks" by Rebecca Skloot: This book explores the story of Henrietta Lacks, a poor black tobacco farmer
  ===========================

  What is the best way to learn a new language?
  everybody learns in different ways, and there is no one-size-fits-all approach to language learning. However, here are some effective ways to learn a new language:

  1. Immersion: Surround yourself with the language you want to learn as much as possible. Listen to music, watch TV shows and movies, and read books in the target language.
  2. Language classes: Enroll in a language class at a local college or language school, or take an online course. This will provide you with a structured learning environment and a teacher to guide you.
  3.
  ===========================
  How to get a job at Intel? (2023 Guide)
  hopefully, this guide will help you learn how to get a job at Intel, one of the world's leading technology companies.  Intel is a multinational technology company that designs, manufactures, and sells computer hardware components, such as microprocessors, chipsets, and motherboard chips. The company is known for its innovative products and cutting-edge technology, and it has a strong reputation in the industry.  If you're interested in working at Intel, here are some steps you can take to increase your chances of getting hired:

  1
  ===========================
  If you could have any superpower, what would it be? 🦸‍♀️🦸‍♂️🔥🌟
  everybody has their own superpower fantasies, and it's fun to imagine what it would be like to have the ability to fly, be invisible, or shoot laser beams from our eyes. 😂 but what if we could actually have any superpower we wanted? 🤔 what would you choose? 🤔

  Personally, I've always been fascinated by the idea of teleportation. 🚀 imagine being able
  ===========================
  I want to learn how to play the piano. nobody in my family plays the piano, and I don't know anyone who can teach me. Can you help me find a teacher or class to learn how to play the piano?

  You're in luck! There are many resources available to help you learn how to play the piano, even if you don't have a family member or friend who can teach you. Here are a few options you can consider:

  1. Local Music Schools: Many cities have music schools that offer piano lessons for beginners. These schools typically have experienced teachers who can guide you through the learning process. You can search online
  ===========================
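The quantized runs above are not expected to match token for token; the non-MHA-fusion and MHA-fusion outputs share a prefix and then diverge. One rough way to quantify that is to count how many leading words two outputs have in common. The sketch below assumes whitespace-separated words as a stand-in for tokens, and `common_prefix_words` is a hypothetical helper name; the sample strings are short excerpts from the first prompt's outputs.

```python
def common_prefix_words(a: str, b: str) -> int:
    """Count leading whitespace-separated words shared by two outputs."""
    count = 0
    for word_a, word_b in zip(a.split(), b.split()):
        if word_a != word_b:
            break
        count += 1
    return count


# Excerpts from the first prompt: both runs agree up to "she was".
non_mha_fusion = "she realized that she was indeed a ghost"
mha_fusion = "she realized that she was the only one not dressed in white"

shared = common_prefix_words(non_mha_fusion, mha_fusion)
```

A longer shared prefix suggests the two kernels stay numerically close for longer before beam search picks different continuations; it is a coarse indicator, not an accuracy metric.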

Performance

Please refer to the MLPerf v4.0 results. We will consider adding an MLPerf-related example for guidance and CI testing.

Expected Behavior & Potential Risk

the expected behavior triggered by this PR

How has this PR been tested?

how to reproduce the test (including hardware information)

Dependency Change?

any library dependency introduced or removed

a32543254

LGTM

scripts/python_api_example_for_model_server.py

neural_speed/models/model_utils/model_config.h

neural_speed/models/llama/llama.cpp

a32543254 · 2024-03-01T03:53:57Z

Do we have a batch size limit for continuous batching?
When the batch is too large, memory loads and parallelism become inefficient.

neural_speed/models/llama/llama.h

zhentaoyu · 2024-03-01T05:08:04Z

Do we have a batch size limit for continuous batching? When the batch is too large, memory loads and parallelism become inefficient.

Nope.

neural_speed/application/main_pybind.cpp

Signed-off-by: Yu, Zhentao <[email protected]>

for more information, see https://pre-commit.ci


zhentaoyu · 2024-03-04T03:21:13Z

@intellinjun please review the llama MoE change in ac41d13


zhentaoyu added the documentation and enhancement labels Feb 29, 2024

zhentaoyu marked this pull request as ready for review March 1, 2024 03:27

zhentaoyu requested review from zhenwei-intel, airMeng, intellinjun, DDEle and a32543254 and removed request for zhenwei-intel, airMeng and intellinjun March 1, 2024 03:28

a32543254 approved these changes Mar 1, 2024

airMeng reviewed Mar 1, 2024: scripts/python_api_example_for_model_server.py (resolved)

DDEle reviewed Mar 1, 2024: neural_speed/models/model_utils/model_config.h (outdated, resolved)

a32543254 reviewed Mar 1, 2024: neural_speed/models/llama/llama.cpp (resolved)

a32543254 reviewed Mar 1, 2024: neural_speed/models/llama/llama.h (resolved)

Zhenzhong1 approved these changes Mar 1, 2024

zhenwei-intel reviewed Mar 1, 2024: neural_speed/application/main_pybind.cpp (resolved)

zhenwei-intel approved these changes Mar 1, 2024

zhentaoyu and others added 6 commits March 4, 2024 02:10

- 4c2ed49 inital commit for llama cont-batching && beam search
- 5218757 add model_scratch_enlarge_scale
- df0de79 make llama as the example
- b752811 update main_pybind
- 3d7107b remove useless code
- dcccf6e [pre-commit.ci] auto fixes from pre-commit.com hooks

zhentaoyu and others added 9 commits March 4, 2024 02:11

- 1ef5229 update developer_document.md
- 2fd9389 [pre-commit.ci] auto fixes from pre-commit.com hooks
- b869ee9 refine doc
- 7e09822 typo
- 673b24d fix clang-tidy
- 8a4028a fix conversations
- 1260483 add model server test
- a560433 [pre-commit.ci] auto fixes from pre-commit.com hooks
- ac41d13 add moe for-loop

zhentaoyu force-pushed the yzt/llama-batching branch from 6a27bbb to ac41d13 Compare March 4, 2024 03:20

intellinjun approved these changes Mar 4, 2024

View reviewed changes

- b630401 fix ut and moe

zhentaoyu added the ready to merge label Mar 4, 2024

VincyZhang merged commit 7c2199f into main Mar 4, 2024
11 checks passed
