Imagine sending your own custom memes or cartoons instead of the ones floating around the internet. With OpenAI’s new GPT-Image-1 model, you can transform your selfies or photos into fun, stylized stickers. In this tutorial, we’ll build a WhatsApp sticker generator in Python that applies various art styles, including caricature and Pixar-style filters, to your images.
You’ll learn how to set up the OpenAI image-editing API, capture or upload images in Colab, define funny and humorous text categories or use your own text, and process three stickers in parallel using multiple API keys for speed. By the end, you’ll have a working sticker maker powered by GPT-Image-1 and custom text prompts.
Why GPT-Image-1?
We evaluated several cutting-edge image-generation models, including Gemini 2.0 Flash, Flux, and Phoenix (on the Leonardo.ai platform). All of them struggled with rendering text and expressions correctly. For instance:
- Google’s Gemini 2.0 image API often produces misspelled or jumbled words even when given exact instructions. For example, asking Gemini for the exact text ‘Big Sale Today!’ yields outputs like “Big Sale Todai” or random gibberish.
- Flux delivers high image quality in general, but users report that it “quickly introduced little errors” into any text it renders. Flux also makes small spelling mistakes or garbles letters, especially as the text length increases, and it defaults to very similar face generations, i.e., “all faces are looking the same”, unless heavily constrained.
- Phoenix is optimized for fidelity and prompt adherence, but like most diffusion models, it still treats text as a visual pattern and can introduce errors. We found that Phoenix generated a sticker with the correct wording only sporadically, and it tended to repeat the same default face for a given prompt.
Together, these limitations led us to choose GPT-Image-1. Unlike the models above, our GPT-Image-1 pipeline uses a carefully constructed prompt that explicitly enforces correct text and expression changes.
Read more: How to run the Flux model?
How GPT-Image-1 Powers Image Editing
GPT-Image-1 is OpenAI’s flagship multimodal image model. It creates and edits images from text and image prompts to produce high-quality outputs. Essentially, we can instruct GPT-Image-1 to apply an edit to a source image based on a text prompt. In our case, we use the images.edit API endpoint with GPT-Image-1 to apply fun, humorous filters and overlay text on a photo input to create stickers.
The prompt is carefully constructed to enforce a sticker-friendly output (1024×1024 PNG). GPT-Image-1 then effectively becomes the AI-powered sticker creator: it changes the appearance of the subject in the photo and adds hilarious text.
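To make this concrete, here is a minimal, self-contained sketch of a single images.edit call with GPT-Image-1. The file names and the short prompt are placeholders; the full prompt builder comes later in the tutorial:
import base64
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY")  # placeholder key

# Ask GPT-Image-1 to restyle a source photo according to a text prompt
result = client.images.edit(
    model="gpt-image-1",
    image=[open("photo.jpg", "rb")],  # source photo to edit
    prompt="Turn this photo into a cartoon sticker with the text 'Monday blues activated!'",
)

# GPT-Image-1 returns the edited image as base64; decode and save it as a PNG
with open("sticker.png", "wb") as f:
    f.write(base64.b64decode(result.data[0].b64_json))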
# Set up OpenAI clients for each API key (to run parallel requests)
clients = [OpenAI(api_key=key) for key in API_KEYS]
So, for that, we create one OpenAI client per API key. With three keys, we can make three simultaneous API calls. This multi-key, multi-thread approach uses ThreadPoolExecutor, letting us generate three stickers in parallel on each run. As the code prints, it uses “3 API keys for SIMULTANEOUS generation”, dramatically speeding up sticker creation.
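Conceptually, the fan-out looks like this simplified sketch; generate_one() is just a stand-in for the real sticker call shown later in this guide:
from concurrent.futures import ThreadPoolExecutor, as_completed

# One worker per API key, three styles submitted at once
def generate_one(client_idx, style):
    return f"client {client_idx + 1} rendered the {style} style"

styles = ["caricature #1", "caricature #2", "pixar"]
with ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(generate_one, idx, s) for idx, s in enumerate(styles)]
    for future in as_completed(futures):
        print(future.result())  # results arrive as each request finishes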
Step-by-Step Guide
The idea of creating your own AI sticker generator may sound complex, but this guide simplifies the entire process. You’ll begin by preparing the environment in Google Colab; then we’ll review the API setup, define categories of phrases, validate text, define the artistic styles, and finally generate stickers in parallel. Each part is accompanied by code snippets and explanations so you can follow along easily. Now, let’s get to the code:
Installing and Running on Colab
To generate stickers, we need the right setup. This project uses the Python libraries PIL and rembg for basic image processing, google-genai and python-dotenv in the Colab instance, and the OpenAI SDK for the actual image-editing calls. The first step is to install the dependencies directly in your Colab notebook.
!pip install --upgrade google-genai pillow rembg
!pip install --upgrade onnxruntime
!pip install --upgrade openai python-dotenv
OpenAI Integration and API Keys
After installation, import the modules and set up API keys. The script creates one OpenAI client per API key. This lets the code distribute image-edit requests across multiple keys in parallel. The client list is then used by the sticker-generation functions.
API_KEYS = [  # 3 API keys
    "API KEY 1",
    "API KEY 2",
    "API KEY 3"
]
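Hard-coding keys works for a quick Colab experiment, but since python-dotenv is installed above you could instead keep them in a .env file. A minimal sketch, assuming variable names of your own choosing (they are illustrative, not part of the original script):
import os
from dotenv import load_dotenv

load_dotenv()  # reads a local .env file into environment variables

# Example .env contents (names are illustrative):
# OPENAI_API_KEY_1=sk-...
# OPENAI_API_KEY_2=sk-...
# OPENAI_API_KEY_3=sk-...
API_KEYS = [
    os.environ["OPENAI_API_KEY_1"],
    os.environ["OPENAI_API_KEY_2"],
    os.environ["OPENAI_API_KEY_3"],
]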
"""# Stickerverse
"""
import os
import random
import base64
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed
from openai import OpenAI
from PIL import Image
from io import BytesIO
from rembg import remove
from google.colab import files
from IPython.display import display, Javascript
from google.colab.output import eval_js
import time
clients = [OpenAI(api_key=key) for key in API_KEYS]
Image upload & camera capture (logic)
Now the next step is to access the camera to capture a photo or upload an image file. capture_photo() uses JavaScript injected into Colab to open the webcam and return a captured frame, while upload_image() uses Colab’s file-upload widget and verifies the uploaded file with PIL.
# Camera capture via JS
def capture_photo(filename="photo.jpg", quality=0.9):
    js_code = """
    async function takePhoto(quality) {
        const div = document.createElement('div');
        const video = document.createElement('video');
        const btn = document.createElement('button');
        btn.textContent = "📸 Capture";
        div.appendChild(video);
        div.appendChild(btn);
        document.body.appendChild(div);
        const stream = await navigator.mediaDevices.getUserMedia({video: true});
        video.srcObject = stream;
        await video.play();
        await new Promise(resolve => btn.onclick = resolve);
        const canvas = document.createElement('canvas');
        canvas.width = video.videoWidth;
        canvas.height = video.videoHeight;
        canvas.getContext('2d').drawImage(video, 0, 0);
        stream.getTracks().forEach(track => track.stop());
        div.remove();
        return canvas.toDataURL('image/jpeg', quality);
    }
    """
    display(Javascript(js_code))
    data = eval_js("takePhoto(%f)" % quality)
    binary = base64.b64decode(data.split(',')[1])
    with open(filename, 'wb') as f:
        f.write(binary)
    print(f"Saved: {filename}")
    return filename
# Image upload function
def upload_image():
    print("Please upload your image file...")
    uploaded = files.upload()
    if not uploaded:
        print("No file uploaded!")
        return None
    filename = list(uploaded.keys())[0]
    print(f"Uploaded: {filename}")
    # Validate if it's an image
    try:
        img = Image.open(filename)
        img.verify()
        print(f"📸 Image verified: {img.format} {img.size}")
        return filename
    except Exception as e:
        print(f"Invalid image file: {str(e)}")
        return None
# Interactive image source selection
def select_image_source():
    print("Choose image source:")
    print("1. Capture from camera")
    print("2. Upload image file")
    while True:
        try:
            choice = input("Select option (1-2): ").strip()
            if choice == "1":
                return "camera"
            elif choice == "2":
                return "upload"
            else:
                print("Invalid choice! Please enter 1 or 2.")
        except KeyboardInterrupt:
            print("\nGoodbye!")
            return None
Output:

Examples of Categories and Phrases
Now we’ll create the phrase categories that go on our stickers. A PHRASE_CATEGORIES dictionary holds several categories, such as corporate, Bollywood, Hollywood, Tollywood, sports, and memes. When a category is chosen, the code randomly selects three unique phrases, one for each sticker style (a quick sanity check of that selection follows the dictionary below).
PHRASE_CATEGORIES = {
    "corporate": [
        "Another meeting? May the force be with you!",
        "Monday blues activated!",
        "This could have been an email, boss!"
    ],
    "bollywood": [
        "Mogambo khush hua!",
        "Kitne aadmi the?",
        "Picture abhi baaki hai mere dost!"
    ],
    "memes": [
        "Bhagwan bharose!",
        "Main thak gaya hoon!",
        "Beta tumse na ho payega!"
    ]
}
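Each run draws three unique phrases from the chosen category, one per sticker style. A quick sanity check, assuming the dictionary above:
import random

# Pick 3 distinct phrases from the "memes" category, one per sticker style
chosen = random.sample(PHRASE_CATEGORIES["memes"], 3)
for i, phrase in enumerate(chosen, 1):
    print(f"Sticker {i}: {phrase}")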
Phrase Categories and Custom Text
The generator uses a dictionary of phrase categories. The user can either select a category for random phrase selection or enter their own custom phrase. There are also helper functions for interactive selection, as well as a simple function to validate the length of a custom phrase.
def select_category_or_custom():
    print("\nChoose your sticker text option:")
    print("1. Pick from phrase category (random selection)")
    print("2. Enter my own custom phrase")
    while True:
        try:
            choice = input("Choose option (1 or 2): ").strip()
            if choice == "1":
                return "category"
            elif choice == "2":
                return "custom"
            else:
                print("Invalid choice! Please enter 1 or 2.")
        except KeyboardInterrupt:
            print("\nGoodbye!")
            return None
# NEW: Function to get custom phrase from user
def get_custom_phrase():
    while True:
        phrase = input("\nEnter your custom sticker text (2-50 characters): ").strip()
        if len(phrase) < 2:
            print("Too short! Please enter at least 2 characters.")
            continue
        elif len(phrase) > 50:
            print("Too long! Please keep it under 50 characters.")
            continue
        else:
            print(f"Custom phrase accepted: '{phrase}'")
            return phrase
For custom phrases, input length is checked (2–50 characters) before acceptance.
Phrase Validation and Spelling Guardrails
Before any image is generated, the chosen text passes through a small guardrail: a gpt-4o-mini call corrects spelling and grammar so the sticker never renders a typo.
def validate_and_correct_spelling(text):
    spelling_prompt = f"""
    Please check the spelling and grammar of the following text and return ONLY the corrected version.
    Do not add explanations, comments, or change the meaning.
    Text to check: "{text}"
    """
    response = clients[0].chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": spelling_prompt}],
        max_tokens=100,
        temperature=0.1
    )
    corrected_text = response.choices[0].message.content.strip()
    return corrected_text
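A quick way to try the guardrail on its own; the exact correction depends on the model, so treat the expected output as illustrative:
# A typo like the "Big Sale Todai" example from earlier should come back fixed,
# e.g. "Big Sale Today!" (actual output may vary slightly between runs)
print(validate_and_correct_spelling("Big Sale Todai!"))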
Next, we create the build_prompt() function, which assembles the instructions sent to the model. Note that build_prompt() first calls the spelling validator and then embeds the corrected text into a strict sticker prompt:
# Concise Prompt Builder with Spelling Validation
def build_prompt(text, style_variant):
    corrected_text = validate_and_correct_spelling(text)
    base_prompt = f"""
    Create a HIGH-QUALITY WhatsApp sticker in {style_variant} style.
    OUTPUT:
    - 1024x1024 transparent PNG with 8px white border
    - Subject centered, balanced composition, sharp details
    - Preserve original facial identity and proportions
    - Match expression to sentiment of text: '{corrected_text}'
    TEXT:
    - Use EXACT text: '{corrected_text}' (no changes, no emojis)
    - Bold comic font with black outline, high-contrast colors
    - Place text in empty space (top/bottom), never covering the face
    RULES:
    - No hallucinated elements or decorative glyphs
    - No cropping of head/face or text
    - Maintain realistic but expressive look
    - Ensure consistency across stickers
    """
    return base_prompt.strip()
Style Variants: Caricature vs Pixar
The three style templates live in STYLE_VARIANTS. The first two are caricature transformations and the last is a Pixar-style 3D look. These strings are passed straight into the prompt builder and dictate the visual style.
STYLE_VARIANTS = [
    "Transform into detailed caricature with slightly exaggerated facial features...",
    "Transform into expressive caricature with enhanced personality features...",
    "Transform into high-quality Pixar-style 3D animated character..."
]
Generating Stickers in Parallel
The real strength of the project is parallel sticker generation: all three stickers are generated at the same time on separate threads, each using its own API key, so wait times drop dramatically.
# Generate single sticker using OpenAI GPT-image-1 with specific client (WITH TIMING)
def generate_single_sticker(input_path, output_path, text, style_variant, client_idx):
    try:
        start_time = time.time()
        thread_id = threading.current_thread().name
        print(f"[START] Thread-{thread_id}: API-{client_idx+1} generating {style_variant[:30]}... at {time.strftime('%H:%M:%S', time.localtime(start_time))}")
        prompt = build_prompt(text, style_variant)
        result = clients[client_idx].images.edit(
            model="gpt-image-1",
            image=[open(input_path, "rb")],
            prompt=prompt,
            # input_fidelity="high"
            quality='medium'
        )
        image_base64 = result.data[0].b64_json
        image_bytes = base64.b64decode(image_base64)
        with open(output_path, "wb") as f:
            f.write(image_bytes)
        end_time = time.time()
        duration = end_time - start_time
        style_type = "Caricature" if "caricature" in style_variant.lower() else "Pixar"
        print(f"[DONE] Thread-{thread_id}: {style_type} saved as {output_path} | Duration: {duration:.2f}s | Text: '{text[:30]}...'")
        return True
    except Exception as e:
        print(f"[ERROR] API-{client_idx+1} failed: {str(e)}")
        return False
# NEW: Create stickers with custom phrase (all 3 styles use the same custom text)
def create_custom_stickers_parallel(photo_file, custom_text):
    print(f"\nCreating 3 stickers with your custom phrase: '{custom_text}'")
    print(" • Style 1: Caricature #1")
    print(" • Style 2: Caricature #2")
    print(" • Style 3: Pixar Animation")
    # Map futures to their info
    tasks_info = {}
    with ThreadPoolExecutor(max_workers=3, thread_name_prefix="CustomSticker") as executor:
        start_time = time.time()
        print(f"\n[PARALLEL START] Submitting 3 API calls SIMULTANEOUSLY at {time.strftime('%H:%M:%S', time.localtime(start_time))}")
        # Submit ALL tasks at once (non-blocking) - all using the same custom text
        for idx, style_variant in enumerate(STYLE_VARIANTS):
            output_name = f"custom_sticker_{idx+1}.png"
            future = executor.submit(generate_single_sticker, photo_file, output_name, custom_text, style_variant, idx)
            tasks_info[future] = {
                'output_name': output_name,
                'text': custom_text,
                'style_variant': style_variant,
                'client_idx': idx,
                'submit_time': time.time()
            }
        print("All 3 API requests submitted! Processing as they complete...")
        completed = 0
        completion_times = []
        # Process results as they complete
        for future in as_completed(tasks_info.keys(), timeout=180):
            try:
                success = future.result()
                task_info = tasks_info[future]
                if success:
                    completed += 1
                    completion_time = time.time()
                    completion_times.append(completion_time)
                    duration = completion_time - task_info['submit_time']
                    style_type = "Caricature" if "caricature" in task_info['style_variant'].lower() else "Pixar"
                    print(f"[{completed}/3] {style_type} completed: {task_info['output_name']} "
                          f"(API-{task_info['client_idx']+1}, {duration:.1f}s)")
                else:
                    print(f"Failed: {task_info['output_name']}")
            except Exception as e:
                task_info = tasks_info[future]
                print(f"Error with {task_info['output_name']} (API-{task_info['client_idx']+1}): {str(e)}")
        total_time = time.time() - start_time
        print(f"\n[FINAL RESULT] {completed}/3 custom stickers completed in {total_time:.1f} seconds!")
# UPDATED: Create 3 stickers in PARALLEL (using as_completed)
def create_category_stickers_parallel(photo_file, category):
    if category not in PHRASE_CATEGORIES:
        print(f"Category '{category}' not found! Available: {list(PHRASE_CATEGORIES.keys())}")
        return
    # Choose 3 unique phrases for 3 stickers
    chosen_phrases = random.sample(PHRASE_CATEGORIES[category], 3)
    print(f"Selected phrases for {category.title()} category:")
    for i, phrase in enumerate(chosen_phrases, 1):
        style_type = "Caricature" if i <= 2 else "Pixar Animation"
        print(f" {i}. [{style_type}] '{phrase}' → API Key {i}")
    # Map futures to their info
    tasks_info = {}
    with ThreadPoolExecutor(max_workers=3, thread_name_prefix="StickerGen") as executor:
        start_time = time.time()
        print(f"\n[PARALLEL START] Submitting 3 API calls SIMULTANEOUSLY at {time.strftime('%H:%M:%S', time.localtime(start_time))}")
        # Submit ALL tasks at once (non-blocking)
        for idx, (style_variant, text) in enumerate(zip(STYLE_VARIANTS, chosen_phrases)):
            output_name = f"{category}_sticker_{idx+1}.png"
            future = executor.submit(generate_single_sticker, photo_file, output_name, text, style_variant, idx)
            tasks_info[future] = {
                'output_name': output_name,
                'text': text,
                'style_variant': style_variant,
                'client_idx': idx,
                'submit_time': time.time()
            }
        print("All 3 API requests submitted! Processing as they complete...")
        print(" • API Key 1 → Caricature #1")
        print(" • API Key 2 → Caricature #2")
        print(" • API Key 3 → Pixar Animation")
        completed = 0
        completion_times = []
        # Process results as they complete (NOT in submission order)
        for future in as_completed(tasks_info.keys(), timeout=180):  # 3 minute total timeout
            try:
                success = future.result()  # This only waits until ANY future completes
                task_info = tasks_info[future]
                if success:
                    completed += 1
                    completion_time = time.time()
                    completion_times.append(completion_time)
                    duration = completion_time - task_info['submit_time']
                    style_type = "Caricature" if "caricature" in task_info['style_variant'].lower() else "Pixar"
                    print(f"[{completed}/3] {style_type} completed: {task_info['output_name']} "
                          f"(API-{task_info['client_idx']+1}, {duration:.1f}s) - '{task_info['text'][:30]}...'")
                else:
                    print(f"Failed: {task_info['output_name']}")
            except Exception as e:
                task_info = tasks_info[future]
                print(f"Error with {task_info['output_name']} (API-{task_info['client_idx']+1}): {str(e)}")
        total_time = time.time() - start_time
        print(f"\n[FINAL RESULT] {completed}/3 stickers completed in {total_time:.1f} seconds!")
        if len(completion_times) > 1:
            fastest_completion = min(completion_times) - start_time
            print(f"Parallel efficiency: Fastest completion in {fastest_completion:.1f}s")
Here, generate_single_sticker() builds the prompt and calls the images.edit endpoint using the specified client_idx. The parallel functions create a ThreadPoolExecutor with max_workers=3, submit the three tasks, and process results with as_completed, so the script logs each finished sticker as soon as it is ready. The per-thread logs also show the timing and whether each result was a caricature or a Pixar-style sticker.
Main execution block
At the bottom of the script, the __main__ guard defaults to running sticker_from_camera(). However, you can comment/uncomment lines as desired to run interactive_menu(), create_all_category_stickers(), or other functions (a hypothetical sketch of the sticker_from_camera() flow appears after the repository link below).
# Main execution
if __name__ == "__main__":
    sticker_from_camera()
Output:
Output Image:

For the complete version of this WhatsApp sticker generator code, visit this GitHub repository.
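The sticker_from_camera() helper itself isn’t reproduced above. Judging from the building blocks in this tutorial, a hypothetical version would wire them together roughly like this (function and variable names here are illustrative, not the repository’s exact code):
def sticker_from_camera():
    # 1. Get a source photo from the webcam or an upload
    source = select_image_source()
    if source is None:
        return
    photo_file = capture_photo() if source == "camera" else upload_image()
    if photo_file is None:
        return

    # 2. Decide where the sticker text comes from, then generate in parallel
    text_mode = select_category_or_custom()
    if text_mode == "custom":
        create_custom_stickers_parallel(photo_file, get_custom_phrase())
    elif text_mode == "category":
        print(f"Available categories: {list(PHRASE_CATEGORIES.keys())}")
        category = input("Enter a category name: ").strip().lower()
        create_category_stickers_parallel(photo_file, category)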
Conclusion
In this tutorial, we walked through setting up GPT-Image-1 calls, constructing a strict sticker prompt, capturing or uploading images, selecting amusing phrases or custom text, and running three style variants simultaneously. In just a few hundred lines of code, this project converts your photos into comic-styled stickers.
By combining OpenAI’s image model with some creative prompt engineering and multi-threading, you can generate fun, personalized stickers in seconds. The result is an AI-based WhatsApp sticker generator whose output you can share with any of your friends and groups in a single click. Now try it with your own photo and your favorite joke!
Frequently Asked Questions
Q1. What does this WhatsApp sticker generator do?
A. It transforms your uploaded or captured photos into fun, stylized WhatsApp stickers with text using OpenAI’s GPT-Image-1 model.
Q2. Why use GPT-Image-1 instead of other image models?
A. GPT-Image-1 handles text accuracy and facial expressions better than models like Gemini, Flux, or Phoenix, ensuring stickers have correct wording and expressive visuals.
Q3. How does the generator create three stickers so quickly?
A. It uses three OpenAI API keys and a ThreadPoolExecutor to generate three stickers in parallel, cutting down processing time.