Smart AI World

Build an AI-Powered WhatsApp Sticker Generator with Python

Imagine sending your own custom memes or cartoons instead of ones copied from the internet. With OpenAI’s new GPT-Image-1 model, you can transform your selfies or photos into fun, stylized stickers. In this tutorial, we’ll build a WhatsApp sticker generator in Python that applies various art styles, including caricature and Pixar-style filters, to your images.

You’ll learn how to set up the OpenAI image editing API, capture or upload images in Colab, define categories of funny phrases or supply your own text, and process three stickers in parallel using multiple API keys for speed. By the end, you’ll have a working sticker maker powered by GPT-Image-1 and custom text prompts.

Why GPT-Image-1?

We evaluated several cutting-edge image-generation models, including Gemini 2.0 Flash, Flux, and Phoenix, on the Leonardo.ai platform. In particular, all these models struggled with rendering text and expressions correctly. For instance:

  • Google’s Gemini 2.0 image API often produces misspelled or jumbled words even when given exact instructions. For example, when asked for the exact text “Big Sale Today!”, Gemini returned outputs like “Big Sale Todai” or random gibberish.
  • Flux delivers high image quality in general, but users report that it “quickly introduced little errors” into any text it renders. These are tiny spelling mistakes or garbled letters, and they become more frequent as the text gets longer. Flux also defaults to very similar faces (“all faces are looking the same”) unless heavily constrained.
  • Phoenix is optimized for fidelity and prompt adherence, but like most diffusion models, it still views text visually and can introduce errors. We found that Phoenix could generate a sticker with the correct wording only sporadically, and it tended to repeat the same default face for a given prompt.

Together, these limitations led us to choose GPT-Image-1. Unlike the setups above, our GPT-Image-1 workflow includes a dedicated prompt pipeline that explicitly enforces correct text and expression changes.


How GPT-Image-1 Powers Image Editing

GPT-Image-1 is OpenAI’s flagship multimodal model. It creates and edits images from text and image prompts and produces high-quality output. Essentially, we can instruct GPT-Image-1 to apply an edit to a source image based on a text prompt. In our case, we use the images.edit API endpoint with GPT-Image-1 to apply fun, humorous filters and overlay text on an input photo to create stickers.

The prompt is carefully constructed to enforce a sticker-friendly output (1024×1024 PNG). Then GPT-Image-1 essentially becomes the AI-powered sticker creator, where it will change the appearance of the subject in the photo and add hilarious text.
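The prompt only asks for a 1024×1024 PNG, so it can be useful to confirm what actually came back. The following is a minimal sketch, using only the standard library, that reads the PNG header to check size and alpha channel. The helper names read_png_info and is_sticker_ready are hypothetical and are not part of the original script.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def read_png_info(data):
    """Return (width, height, has_alpha) parsed from a PNG's IHDR chunk."""
    if len(data) < 26 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # IHDR payload starts at byte 16: width, height, bit depth, color type
    width, height, _bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    # Color types 4 (grey + alpha) and 6 (RGBA) carry an alpha channel.
    # Palette images with a tRNS chunk are not detected by this simple check.
    return width, height, color_type in (4, 6)

def is_sticker_ready(data, size=1024):
    """True when the PNG is size x size and has an alpha channel."""
    width, height, has_alpha = read_png_info(data)
    return width == size and height == size and has_alpha
```

In the script, this check could be run on the decoded bytes before they are written to disk, for example is_sticker_ready(image_bytes).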

# Set up OpenAI clients for each API key (to run parallel requests)

clients = [OpenAI(api_key=key) for key in API_KEYS]

To do that, we create one OpenAI client per API key. With three keys, we can make three simultaneous API calls. This multi-key, multi-thread approach uses ThreadPoolExecutor to generate three stickers in parallel on each run. In the script's own log output, it uses “3 API keys for SIMULTANEOUS generation”, which shortens the overall wait compared with generating the stickers one after another.
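The script indexes the client list directly with the task index, which works as long as there are at least as many keys as stickers. As a hedged sketch of one way to keep that mapping safe with fewer keys, the hypothetical helper assign_clients below cycles through the available clients; it is not part of the original code.

```python
def assign_clients(num_tasks, num_clients):
    """Map each task index to a client index, cycling when tasks outnumber clients."""
    if num_clients < 1:
        raise ValueError("at least one client is required")
    return [task_idx % num_clients for task_idx in range(num_tasks)]

print(assign_clients(3, 3))  # [0, 1, 2]
print(assign_clients(5, 3))  # [0, 1, 2, 0, 1]
```

With three keys and three styles the mapping is one key per sticker, exactly as in the tutorial. Sharing a key between tasks may run into that key's rate limits.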

Step-by-Step Guide

The idea of creating your own AI sticker generator may sound complex, but this guide breaks the process into simple steps. We begin by preparing the environment in Google Colab, then review the API, define the phrase categories, validate the text, set up the artistic styles, and finally generate stickers in parallel. Each part is accompanied by code snippets and explanations so you can follow along easily. Now, let’s proceed to the code:

Installing and Running on Colab

To generate stickers, we first need the right setup. This project uses Pillow (PIL) and rembg for basic image processing and the openai package for the GPT-Image-1 calls; the install commands also pull in google-genai, onnxruntime (needed by rembg), and python-dotenv. The first step is to install the dependencies directly in your Colab notebook.

!pip install --upgrade openai google-genai pillow rembg
!pip install --upgrade onnxruntime
!pip install python-dotenv

OpenAI Integration and API Keys

After installation, import the modules and set up API keys. The script creates one OpenAI client per API key. This lets the code distribute image-edit requests across multiple keys in parallel. The client list is then used by the sticker-generation functions.

API_KEYS = [  # 3 API keys
    "API KEY 1",
    "API KEY 2",
    "API KEY 3"
]

"""# Stickerverse

"""

import os
import random
import base64
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed
from openai import OpenAI
from PIL import Image
from io import BytesIO
from rembg import remove
from google.colab import files
from IPython.display import display, Javascript
from google.colab.output import eval_js
import time

# One client per API key so requests can run in parallel
clients = [OpenAI(api_key=key) for key in API_KEYS]
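Hard-coded keys are easy to leak when a notebook is shared. As a hedged alternative, the sketch below reads the keys from environment variables instead. The function load_api_keys and the variable names OPENAI_API_KEY_1 to OPENAI_API_KEY_3 are assumptions made for illustration, not something the original script defines.

```python
import os

def load_api_keys(env=None, prefix="OPENAI_API_KEY_", count=3):
    """Collect up to `count` keys named <prefix>1 .. <prefix>count, skipping unset ones."""
    env = os.environ if env is None else env
    keys = [env.get(f"{prefix}{i}", "").strip() for i in range(1, count + 1)]
    keys = [key for key in keys if key]
    if not keys:
        raise RuntimeError(f"no API keys found; set {prefix}1 ... {prefix}{count}")
    return keys
```

The result could then replace the literal list, for example API_KEYS = load_api_keys(), before the clients are created.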

Image upload & camera capture (logic)

The next step is to get a photo, either by capturing one from the camera or by uploading an image file. capture_photo() uses JavaScript injected into Colab to open the webcam and return the captured image. upload_image() uses Colab’s file upload widget and verifies the uploaded file with PIL.

# Camera capture via JS
def capture_photo(filename="photo.jpg", quality=0.9):
    js_code = """
    async function takePhoto(quality) {
        const div = document.createElement('div');
        const video = document.createElement('video');
        const btn = document.createElement('button');
        btn.textContent="📸 Capture";
        div.appendChild(video);
        div.appendChild(btn);
        document.body.appendChild(div);
        const stream = await navigator.mediaDevices.getUserMedia({video: true});
        video.srcObject = stream;
        await video.play();
        await new Promise(resolve => btn.onclick = resolve);
        const canvas = document.createElement('canvas');
        canvas.width = video.videoWidth;
        canvas.height = video.videoHeight;
        canvas.getContext('2d').drawImage(video, 0, 0);
        stream.getTracks().forEach(track => track.stop());
        div.remove();
        return canvas.toDataURL('image/jpeg', quality);
    }
    """
    display(Javascript(js_code))
    # The JS call returns a data URL; the base64 payload follows the comma
    data = eval_js("takePhoto(%f)" % quality)
    binary = base64.b64decode(data.split(',')[1])
    with open(filename, 'wb') as f:
        f.write(binary)
    print(f"Saved: {filename}")
    return filename

# Image upload function
def upload_image():
    print("Please upload your image file...")
    uploaded = files.upload()
    if not uploaded:
        print("No file uploaded!")
        return None
    filename = list(uploaded.keys())[0]
    print(f"Uploaded: {filename}")
    # Validate if it's an image
    try:
        img = Image.open(filename)
        img.verify()
        print(f"📸 Image verified: {img.format} {img.size}")
        return filename
    except Exception as e:
        print(f"Invalid image file: {str(e)}")
        return None

# Interactive image source selection
def select_image_source():
    print("Choose image source:")
    print("1. Capture from camera")
    print("2. Upload image file")
    while True:
        try:
            choice = input("Select option (1-2): ").strip()
            if choice == "1":
                return "camera"
            elif choice == "2":
                return "upload"
            else:
                print("Invalid choice! Please enter 1 or 2.")
        except KeyboardInterrupt:
            print("\nGoodbye!")
            return None
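capture_photo() relies on the browser returning a data URL and takes everything after the first comma as base64. The sketch below shows a slightly more defensive version of that decoding step. The helper name decode_data_url is hypothetical and only illustrates the idea.

```python
import base64

def decode_data_url(data_url):
    """Decode a 'data:image/...;base64,<payload>' string into raw bytes."""
    header, sep, payload = data_url.partition(",")
    if not sep or not header.startswith("data:image/") or not header.endswith(";base64"):
        raise ValueError("expected a base64 image data URL")
    return base64.b64decode(payload)

sample = "data:image/jpeg;base64," + base64.b64encode(b"fake-jpeg-bytes").decode("ascii")
print(decode_data_url(sample))  # b'fake-jpeg-bytes'
```

A check like this gives a clear error if the camera permission is denied or the page returns something other than an image.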


Examples of Categories and Phrases

Next, we define the phrase categories for our stickers. We use a PHRASE_CATEGORIES dictionary that holds many categories, such as corporate, Bollywood, Hollywood, Tollywood, sports, memes, and others; the excerpt below shows three of them. When a category is chosen, the code randomly selects three unique phrases, one for each of the three sticker styles.

PHRASE_CATEGORIES = {
    "corporate": [
        "Another meeting? May the force be with you!",
        "Monday blues activated!",
        "This could have been an email, boss!"
    ],
    "bollywood": [
        "Mogambo khush hua!",
        "Kitne aadmi the?",
        "Picture abhi baaki hai mere dost!"
    ],
    "memes": [
        "Bhagwan bharose!",
        "Main thak gaya hoon!",
        "Beta tumse na ho payega!"
    ]
}
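Picking three unique phrases with random.sample works only when a category holds at least three entries; with fewer, random.sample raises ValueError. A minimal sketch of a more forgiving picker follows. The name pick_phrases and the fallback of repeating phrases are assumptions for illustration, not the original behavior.

```python
import random

def pick_phrases(phrases, count=3, rng=random):
    """Pick `count` phrases, unique when possible, repeating only if the list is too short."""
    if not phrases:
        raise ValueError("category has no phrases")
    if len(phrases) >= count:
        return rng.sample(phrases, count)
    # Too few phrases: use each once in random order, then fill with repeats
    picked = rng.sample(phrases, len(phrases))
    picked += rng.choices(phrases, k=count - len(phrases))
    return picked

print(pick_phrases(["Monday blues activated!", "Bhagwan bharose!", "Kitne aadmi the?"]))
```

Passing a seeded random.Random instance as rng makes the selection reproducible, which can help when comparing sticker outputs between runs.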

Phrase Categories and Custom Text

The generator uses a dictionary of phrase categories. The user can either select a category for random phrase selection or enter their own custom phrase. There are also helper functions for interactive selection, as well as a simple function to validate the length of a custom phrase.

def select_category_or_custom():
    print("\nChoose your sticker text option:")
    print("1. Pick from phrase category (random selection)")
    print("2. Enter my own custom phrase")
    while True:
        try:
            choice = input("Choose option (1 or 2): ").strip()
            if choice == "1":
                return "category"
            elif choice == "2":
                return "custom"
            else:
                print("Invalid choice! Please enter 1 or 2.")
        except KeyboardInterrupt:
            print("\nGoodbye!")
            return None

# NEW: Function to get custom phrase from user
def get_custom_phrase():
    while True:
        phrase = input("\nEnter your custom sticker text (2-50 characters): ").strip()
        if len(phrase) < 2:
            print("Too short! Please enter at least 2 characters.")
            continue
        elif len(phrase) > 50:
            # Exactly 50 characters is still accepted
            print("Too long! Please keep it to 50 characters or fewer.")
            continue
        else:
            print(f"Custom phrase accepted: '{phrase}'")
            return phrase

For custom phrases, input length is checked (2–50 characters) before acceptance.
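The length rule is easier to test when it is separated from the input() loop. The sketch below expresses the same 2 to 50 character check as a pure function. The name validate_phrase is hypothetical; the limits are the ones used above.

```python
def validate_phrase(raw, min_len=2, max_len=50):
    """Return (True, phrase) when the text is acceptable, else (False, error message)."""
    phrase = raw.strip()
    if len(phrase) < min_len:
        return False, f"Too short! Please enter at least {min_len} characters."
    if len(phrase) > max_len:
        return False, f"Too long! Please keep it to {max_len} characters or fewer."
    return True, phrase

print(validate_phrase("  Monday blues activated!  "))  # (True, 'Monday blues activated!')
print(validate_phrase("a"))
```

get_custom_phrase() could call this helper inside its loop and print the message whenever the first element is False.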

Phrase Validation and Spelling Guardrails

def validate_and_correct_spelling(text):
    spelling_prompt = f"""
    Please check the spelling and grammar of the following text and return ONLY the corrected version.
    Do not add explanations, comments, or change the meaning.
    Text to check: "{text}"
    """
    # The first client handles the text check; low temperature keeps the output stable
    response = clients[0].chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": spelling_prompt}],
        max_tokens=100,
        temperature=0.1
    )
    corrected_text = response.choices[0].message.content.strip()
    return corrected_text
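A language model asked to fix spelling may occasionally rewrite or translate a phrase, which is a risk for the Hinglish lines in the phrase categories. One possible guardrail, sketched here with the standard library's difflib, is to accept the correction only when it stays close to the original. The function accept_correction and the 0.8 threshold are assumptions for illustration, not part of the original script.

```python
from difflib import SequenceMatcher

def accept_correction(original, corrected, min_similarity=0.8):
    """Return the corrected text only if it is still close to the original."""
    # Remove whitespace and any quotes the model may have wrapped around the text
    candidate = corrected.strip().strip('"').strip()
    if not candidate:
        return original
    ratio = SequenceMatcher(None, original.lower(), candidate.lower()).ratio()
    return candidate if ratio >= min_similarity else original

print(accept_correction("Monday bluse activated!", "Monday blues activated!"))
print(accept_correction("Kitne aadmi the?", "How many men were there?"))
```

The first call keeps the small spelling fix; the second falls back to the original phrase because the rewrite is too different. The threshold is a tuning knob and would need adjusting for very short phrases.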

Now we’ll create a build_prompt function that sets up the basic instructions for the model. Note that build_prompt() first calls the spelling validator and then embeds the corrected text into the strict sticker prompt:

# Concise Prompt Builder with Spelling Validation
def build_prompt(text, style_variant):
    corrected_text = validate_and_correct_spelling(text)
    base_prompt = f"""
    Create a HIGH-QUALITY WhatsApp sticker in {style_variant} style.
    OUTPUT:
    - 1024x1024 transparent PNG with 8px white border
    - Subject centered, balanced composition, sharp details
    - Preserve original facial identity and proportions
    - Match expression to sentiment of text: '{corrected_text}'
    TEXT:
    - Use EXACT text: '{corrected_text}' (no changes, no emojis)
    - Bold comic font with black outline, high-contrast colors
    - Place text in empty space (top/bottom), never covering the face
    RULES:
    - No hallucinated elements or decorative glyphs
    - No cropping of head/face or text
    - Maintain realistic but expressive look
    - Ensure consistency across stickers
    """
    return base_prompt.strip()

Style Variants: Caricature vs Pixar

The three style templates live in STYLE_VARIANTS. The first two are caricature transformations, and the last is a Pixar-esque 3D look. These strings are passed directly to the prompt builder and dictate the visual style.

STYLE_VARIANTS = [
    "Transform into detailed caricature with slightly exaggerated facial features...",
    "Transform into expressive caricature with enhanced personality features...",
    "Transform into high-quality Pixar-style 3D animated character..."
]

Generating Stickers in Parallel

The real strength of the project is parallel sticker generation. All three stickers are generated at the same time on separate threads, each using its own API key, so the total wait is much shorter than running the three requests one after another.

# Generate single sticker using OpenAI GPT-image-1 with specific client (WITH TIMING)
def generate_single_sticker(input_path, output_path, text, style_variant, client_idx):
    try:
        start_time = time.time()
        thread_id = threading.current_thread().name
        print(f"[START] Thread-{thread_id}: API-{client_idx+1} generating {style_variant[:30]}... at {time.strftime('%H:%M:%S', time.localtime(start_time))}")
        prompt = build_prompt(text, style_variant)
        # Open the source photo in a context manager so the file handle is closed after the request
        with open(input_path, "rb") as image_file:
            result = clients[client_idx].images.edit(
                model="gpt-image-1",
                image=[image_file],
                prompt=prompt,
                # input_fidelity="high"
                quality="medium"
            )
        # The edited image comes back as base64; decode it and save it as a PNG file
        image_base64 = result.data[0].b64_json
        image_bytes = base64.b64decode(image_base64)
        with open(output_path, "wb") as f:
            f.write(image_bytes)
        end_time = time.time()
        duration = end_time - start_time
        style_type = "Caricature" if "caricature" in style_variant.lower() else "Pixar"
        print(f"[DONE] Thread-{thread_id}: {style_type} saved as {output_path} | Duration: {duration:.2f}s | Text: '{text[:30]}...'")
        return True
    except Exception as e:
        print(f"[ERROR] API-{client_idx+1} failed: {str(e)}")
        return False

# NEW: Create stickers with custom phrase (all 3 styles use the same custom text)
def create_custom_stickers_parallel(photo_file, custom_text):
    print(f"\nCreating 3 stickers with your custom phrase: '{custom_text}'")
    print("   • Style 1: Caricature #1")
    print("   • Style 2: Caricature #2")
    print("   • Style 3: Pixar Animation")
    # Map futures to their info
    tasks_info = {}
    with ThreadPoolExecutor(max_workers=3, thread_name_prefix="CustomSticker") as executor:
        start_time = time.time()
        print(f"\n[PARALLEL START] Submitting 3 API calls SIMULTANEOUSLY at {time.strftime('%H:%M:%S', time.localtime(start_time))}")
        # Submit ALL tasks at once (non-blocking) - all using the same custom text
        for idx, style_variant in enumerate(STYLE_VARIANTS):
            output_name = f"custom_sticker_{idx+1}.png"
            future = executor.submit(generate_single_sticker, photo_file, output_name, custom_text, style_variant, idx)
            tasks_info[future] = {
                'output_name': output_name,
                'text': custom_text,
                'style_variant': style_variant,
                'client_idx': idx,
                'submit_time': time.time()
            }
        print("All 3 API requests submitted! Processing as they complete...")
        completed = 0
        completion_times = []
        # Process results as they complete
        for future in as_completed(tasks_info.keys(), timeout=180):
            try:
                success = future.result()
                task_info = tasks_info[future]
                if success:
                    completed += 1
                    completion_time = time.time()
                    completion_times.append(completion_time)
                    duration = completion_time - task_info['submit_time']
                    style_type = "Caricature" if "caricature" in task_info['style_variant'].lower() else "Pixar"
                    print(f"[{completed}/3] {style_type} completed: {task_info['output_name']} "
                          f"(API-{task_info['client_idx']+1}, {duration:.1f}s)")
                else:
                    print(f"Failed: {task_info['output_name']}")
            except Exception as e:
                task_info = tasks_info[future]
                print(f"Error with {task_info['output_name']} (API-{task_info['client_idx']+1}): {str(e)}")
        total_time = time.time() - start_time
        print(f"\n [FINAL RESULT] {completed}/3 custom stickers completed in {total_time:.1f} seconds!")

# UPDATED: Create 3 stickers in PARALLEL (using as_completed)
def create_category_stickers_parallel(photo_file, category):
    if category not in PHRASE_CATEGORIES:
        print(f" Category '{category}' not found! Available: {list(PHRASE_CATEGORIES.keys())}")
        return
    # Choose 3 unique phrases for 3 stickers
    chosen_phrases = random.sample(PHRASE_CATEGORIES[category], 3)
    print(f" Selected phrases for {category.title()} category:")
    for i, phrase in enumerate(chosen_phrases, 1):
        style_type = "Caricature" if i <= 2 else "Pixar Animation"
        print(f"   {i}. [{style_type}] '{phrase}' → API Key {i}")
    # Map futures to their info
    tasks_info = {}
    with ThreadPoolExecutor(max_workers=3, thread_name_prefix="StickerGen") as executor:
        start_time = time.time()
        print(f"\n [PARALLEL START] Submitting 3 API calls SIMULTANEOUSLY at {time.strftime('%H:%M:%S', time.localtime(start_time))}")
        # Submit ALL tasks at once (non-blocking)
        for idx, (style_variant, text) in enumerate(zip(STYLE_VARIANTS, chosen_phrases)):
            output_name = f"{category}_sticker_{idx+1}.png"
            future = executor.submit(generate_single_sticker, photo_file, output_name, text, style_variant, idx)
            tasks_info[future] = {
                'output_name': output_name,
                'text': text,
                'style_variant': style_variant,
                'client_idx': idx,
                'submit_time': time.time()
            }
        print("All 3 API requests submitted! Processing as they complete...")
        print("   • API Key 1 → Caricature #1")
        print("   • API Key 2 → Caricature #2")
        print("   • API Key 3 → Pixar Animation")
        completed = 0
        completion_times = []
        # Process results as they complete (NOT in submission order)
        for future in as_completed(tasks_info.keys(), timeout=180):  # 3 minute total timeout
            try:
                success = future.result()  # Returns immediately: as_completed only yields finished futures
                task_info = tasks_info[future]
                if success:
                    completed += 1
                    completion_time = time.time()
                    completion_times.append(completion_time)
                    duration = completion_time - task_info['submit_time']
                    style_type = "Caricature" if "caricature" in task_info['style_variant'].lower() else "Pixar"
                    print(f"[{completed}/3] {style_type} completed: {task_info['output_name']} "
                          f"(API-{task_info['client_idx']+1}, {duration:.1f}s) - '{task_info['text'][:30]}...'")
                else:
                    print(f"Failed: {task_info['output_name']}")
            except Exception as e:
                task_info = tasks_info[future]
                print(f"Error with {task_info['output_name']} (API-{task_info['client_idx']+1}): {str(e)}")
        total_time = time.time() - start_time
        print(f"\n[FINAL RESULT] {completed}/3 stickers completed in {total_time:.1f} seconds!")
        if len(completion_times) > 1:
            fastest_completion = min(completion_times) - start_time
            print(f"Parallel efficiency: Fastest completion in {fastest_completion:.1f}s")

Here, generate_single_sticker() builds the prompt and calls the images.edit endpoint using the client selected by client_idx. The parallel functions create a ThreadPoolExecutor with max_workers=3, submit the three tasks, and process results with as_completed, so each sticker is logged as soon as it finishes rather than in submission order. The logs also show what each thread is doing: its start time, its duration, and whether it produced a caricature or a Pixar-style sticker.
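One detail worth knowing is that the timeout passed to as_completed raises TimeoutError from the iterator itself, outside any try block placed inside the loop body. The following self-contained sketch shows the same submit-then-as_completed pattern with that case handled. fake_sticker_job and run_jobs are hypothetical stand-ins that sleep instead of calling the API.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError, as_completed

def fake_sticker_job(name, delay):
    """Stand-in for generate_single_sticker: sleeps instead of calling the API."""
    time.sleep(delay)
    return name

def run_jobs(jobs, timeout):
    """Run (name, delay) jobs in parallel and return names in completion order."""
    finished = []
    executor = ThreadPoolExecutor(max_workers=len(jobs))
    futures = {executor.submit(fake_sticker_job, name, delay): name for name, delay in jobs}
    try:
        for future in as_completed(futures, timeout=timeout):
            finished.append(future.result())
    except TimeoutError:
        # Raised by the as_completed iterator when the overall deadline passes
        pending = [name for future, name in futures.items() if not future.done()]
        print(f"Timed out waiting for: {pending}")
    finally:
        # Do not block on jobs that are still running
        executor.shutdown(wait=False)
    return finished

order = run_jobs([("caricature_1", 0.3), ("caricature_2", 0.1), ("pixar", 0.2)], timeout=5)
print(order)
```

With these delays the shortest job normally finishes first, so results arrive in completion order rather than submission order. Wrapping the loop like this lets the script report which stickers are still pending instead of stopping with an unhandled exception.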

Main execution block 

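At the bottom of the script, the __main__ guard defaults to running sticker_from_camera(). You can comment or uncomment lines as desired to run interactive_menu(), create_all_category_stickers(), or other functions instead. These entry-point functions are part of the complete script and are not reproduced in this article.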

# Main execution

if __name__ == "__main__":

    sticker_from_camera()


For the complete version of this WhatsApp sticker generator code, visit this GitHub repository.

Conclusion

In this tutorial, we walked through setting up GPT-Image-1 calls, constructing a detailed prompt for stickers, capturing or uploading images, selecting amusing phrases or custom text, and running three style variants simultaneously. In just a few hundred lines of code, this project converts your pictures into comic-style stickers.

By combining OpenAI’s image model with some creative prompt engineering and multi-threading, you can generate fun, personalized stickers quickly. The result is an AI-based WhatsApp sticker generator whose output you can share right away with your friends and groups. Now try it with your own photo and your favorite joke!

Frequently Asked Questions

Q1. What does the AI-Powered WhatsApp Sticker Generator do?

A. It transforms your uploaded or captured photos into fun, stylized WhatsApp stickers with text using OpenAI’s GPT-Image-1 model.

Q2. Why is GPT-Image-1 better than other image models?

A. GPT-Image-1 handles text accuracy and facial expressions better than models like Gemini, Flux, or Phoenix, ensuring stickers have correct wording and expressive visuals.

Q3. How does the script speed up sticker generation?

A. It uses three OpenAI API keys and a ThreadPoolExecutor to generate three stickers in parallel, cutting down processing time.

Hello! I’m Vipin, a passionate data science and machine learning enthusiast with a strong foundation in data analysis, machine learning algorithms, and programming. I have hands-on experience in building models, managing messy data, and solving real-world problems. My goal is to apply data-driven insights to create practical solutions that drive results. I’m eager to contribute my skills in a collaborative environment while continuing to learn and grow in the fields of Data Science, Machine Learning, and NLP.
