Update Generate prompt for leonardo #56

Open
wants to merge 4 commits into
base: main
159 changes: 158 additions & 1 deletion src/engine.ts
@@ -73,6 +73,7 @@ export function init({
return {
getSuggestions,
isContentSafe,
generateAPromptForLeonardo,
};
}

@@ -122,7 +123,6 @@ async function getWordSuggestions({
.map((word) => word.trim())
.slice(0, maxWords)
: [];
console.log(wordsSuggestionsList);
if (!wordsSuggestionsList.length)
throw new Error("ERROR: Suggestion list is empty or maxToken reached");
return wordsSuggestionsList;
@@ -185,6 +185,163 @@ async function getSuggestions({
return suggestions;
}

// Builds one Leonardo AI image-generation prompt per word by sending the
// template below to gpt-3.5-turbo-instruct and parsing its "word: 'prompt'" lines.
export async function generatePromptForImageGeneration({
words,
}: {
words: string[];
}): Promise<Array<{ word: string; prompt: string }>> {
const completionRequestParams = {
model: "gpt-3.5-turbo-instruct",
prompt: `Create a detailed prompt to generate a pictogram for each word of the words array: '${words}'.
First, determine if this is primarily an ACTION or OBJECT, then create a prompt following the appropriate template below.

For ACTIONS (verbs, activities):
- Show a figure actively performing the action
- Include clear motion indicators where appropriate
- Focus on the most recognizable moment of the action
- Use side view if it better shows the action
- Include minimal but necessary context elements

Style requirements:
- Bold black outlines
- Flat colors
- High contrast
- Centered composition
- White background
- Simple geometric shapes

Return only the prompt for each word, no explanations. Keep it under 100 words for each word.
The returned template should be like this:
word1: 'prompt',
word2: 'prompt',
...
wordN: 'prompt'`,
temperature: 0,
max_tokens: 1500,
};

const response = await globalConfiguration.openAIInstance.createCompletion(
completionRequestParams
);
const promptText = response.data?.choices?.[0]?.text;
if (!promptText)
throw new Error("Error generating prompt for image generation");
try {
// Split the text by newlines and parse one "word: 'prompt'," entry per line
const lines = promptText.split("\n").filter((line) => line.trim());
return lines.map((line) => {
// Split on the first colon only, so colons inside the prompt are preserved
const [word, ...promptParts] = line.split(":");
return {
word: word.trim(),
prompt: promptParts
.join(":")
.trim()
.replace(/,$/, "") // Drop the trailing comma requested by the template
.replace(/^['"]|['"]$/g, ""), // Remove surrounding quotes if present
};
});
} catch (error) {
throw new Error("Error parsing image generation prompts: " + error);
}
}
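
The line-by-line parsing above can be exercised without calling the API. The following is a minimal sketch, assuming the completion follows the `word: 'prompt',` template requested in the prompt. `WordPrompt`, `parsePromptLines`, and `sampleCompletion` are hypothetical names used for illustration and are not part of this PR. The sketch strips the template's trailing comma before removing quotes, and it skips lines that contain no colon.

```typescript
// Hypothetical helper, not part of this PR: parses completion text shaped like
//   word1: 'prompt',
//   word2: 'prompt'
interface WordPrompt {
  word: string;
  prompt: string;
}

function parsePromptLines(text: string): WordPrompt[] {
  const results: WordPrompt[] = [];
  for (const rawLine of text.split("\n")) {
    const line = rawLine.trim();
    if (!line) continue; // ignore blank lines

    // Use the first colon as the separator so colons inside the prompt survive.
    const separatorIndex = line.indexOf(":");
    if (separatorIndex === -1) continue; // not a "word: prompt" line

    const word = line.slice(0, separatorIndex).trim();
    const prompt = line
      .slice(separatorIndex + 1)
      .trim()
      .replace(/,$/, "") // drop the trailing comma from the template
      .replace(/^['"]|['"]$/g, "") // then strip one surrounding quote on each side
      .trim();

    if (word && prompt) results.push({ word, prompt });
  }
  return results;
}

// Assumed sample output, written by hand for this sketch.
const sampleCompletion =
  "run: 'A figure running: side view',\n\napple: \"A red apple\"\nnot a prompt line";
console.log(parsePromptLines(sampleCompletion));
```

Skipping malformed lines is a design choice for this sketch; throwing on them instead would surface unexpected model output earlier.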

export async function getPromptsForLeonardo({
prompt,
maxWords,
language,
}: {
prompt: string;
maxWords: number;
language: string;
}) {
const promptedWords = await getWordSuggestions({
prompt,
maxWords,
language,
});
const leonardoPrompts = await generatePromptForImageGeneration({
words: promptedWords,
});

return leonardoPrompts;
}
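
The pipeline above hands the suggested words to a prompt template that interpolates the array directly as `'${words}'`. The sketch below shows how a string array renders inside a template literal, and how an explicit join makes the separator visible. `exampleWords` and `formatWordList` are hypothetical names for illustration only; whether quoting each word improves the model's output is an assumption, not something this PR measures.

```typescript
const exampleWords: string[] = ["run", "apple", "happy"];

// Implicit conversion: the array's default string form joins with "," and no spaces.
const implicitList = `${exampleWords}`;

// Explicit join: the separator is visible and easy to change.
const explicitList = exampleWords.join(", ");

// Hypothetical helper, not part of this PR: trims and quotes each word so that
// multi-word entries stay unambiguous inside the prompt.
function formatWordList(words: string[]): string {
  return words.map((w) => `"${w.trim()}"`).join(", ");
}

console.log(implicitList); // run,apple,happy
console.log(explicitList); // run, apple, happy
console.log(formatWordList(exampleWords)); // "run", "apple", "happy"
```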

export async function generateAPromptForLeonardo({
word,
}: {
word: string;
}): Promise<string> {
const max_tokens = 2 * 100 + 460; // evaluates to 660
const response = await globalConfiguration.openAIInstance.createChatCompletion({
model: "gpt-4o-mini",
messages: [
{
role: "system",
content: `You are an expert in creating pictogram prompts. Analyze the word and create a detailed prompt following these guidelines:

CLASSIFICATION CRITERIA:
For ACTIONS:
-Can it be performed/demonstrated?
-Does it involve movement or change?
-Can you complete the phrase "to [word]"?

For OBJECTS:
-Can it be touched or physically exist?
-Is it a person, place, or thing?
-Can you put "the" or "a" before it?

For ADJECTIVES:
-Does it describe a quality or state?
-Can you put "very" before it?
-Can you add "-er" or "-est" to compare it?

TEMPLATE REQUIREMENTS:
For ACTIONS:
-Show simplified human figure mid-action
-Capture distinctive moment
-Include motion indicators
-Use appropriate view angle
-Include essential props only

For OBJECTS:
-Show complete item in recognizable form
-Use optimal viewing angle
-Follow specific guidelines for category
-Avoid interaction/movement

For ADJECTIVES:
-Show clear comparison/extreme example
-Use split scenes if needed
-Include reference objects
-Use universal symbols
-Emphasize through composition

STYLE:
-Bold black outlines (3px)
-Flat colors
-High contrast
-Centered composition
-White background
-No gradients/shadows
-1:1 ratio

Return only the prompt, under 100 words, no explanations.`
},
{
role: "user",
content: `Create a pictogram prompt for the word: '${word}'`
}
],
temperature: 0,
max_tokens: max_tokens
});

const promptText = response.data?.choices?.[0]?.message?.content;
if (!promptText)
throw new Error("Error generating prompt for image generation");
return promptText;
}
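
The system message asks for a prompt "under 100 words", but the function returns whatever the model produces. The following is a minimal sketch of a guard that could check that limit before the prompt is passed on. `countWords` and `assertWithinWordLimit` are hypothetical helpers, not part of this PR, and counting whitespace-separated tokens is an assumed definition of a word.

```typescript
// Hypothetical helper, not part of this PR: counts whitespace-separated tokens.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Hypothetical helper, not part of this PR: returns the trimmed prompt, or
// throws when it is not strictly under the word limit.
function assertWithinWordLimit(promptText: string, maxWordCount = 100): string {
  const total = countWords(promptText);
  if (total >= maxWordCount) {
    throw new Error(
      `Prompt has ${total} words; expected fewer than ${maxWordCount}`
    );
  }
  return promptText.trim();
}

console.log(countWords("Bold black outlines, flat colors")); // 5
```

Throwing is one option; truncating the prompt or retrying the request are alternatives with different trade-offs.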

async function isContentSafe(textPrompt: string): Promise<boolean> {
try {
const contentSafetyConfig = globalConfiguration.contentSafety;