Thumbnail Ranking
POST /youtube/v1/thumbnail-ranking
What it does
You upload 2–10 thumbnail variants for the same video. The API analyzes each one across six visual quality signals, scores them 0–100, and tells you which one will perform best — with clear, human-readable reasons why.
The six quality signals
Each thumbnail is scored on these six dimensions. The final score is a weighted average:
| Signal | Weight | What it measures |
|---|---|---|
| Face Presence | 20% | Detects faces and their size relative to the image. Sweet spot: 8–35% of image area. No face = neutral score (40). |
| Semantic Alignment | 20% | How well the thumbnail visually matches the video title and script. Requires the title field. Adding a scriptExcerpt gives the AI richer context and improves accuracy. Thumbnails with very low alignment (≤25) receive a 15-point penalty and a misalignment warning. |
| Mobile Legibility | 20% | Sharpness and readability at mobile size (200px wide). Penalizes visual clutter and low-contrast elements. |
| Contrast Robustness | 15% | How well the thumbnail holds up across different UI contexts (light/dark mode, compression). Tests JPEG compression stability. |
| Saliency | 15% | How attention-grabbing the thumbnail is. AI vision analysis identifies focal points and visual hierarchy. |
| Text Density | 10% | OCR-based analysis of text on the thumbnail. Ideal range: 5–40% coverage, 3–50 characters. Too much text hurts; too little misses an opportunity. |
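As a rough illustration, the weighted average can be sketched like this. The signal names and weights come from the table above; the rounding and any server-side penalties (such as the 15-point misalignment penalty) are not modeled here, which is why this sketch lands near, but not necessarily exactly on, the published scores:

```python
# Weights from the signals table above; combination logic is an assumption.
WEIGHTS = {
    "facePresence": 0.20,
    "semanticAlignment": 0.20,
    "mobileLegibility": 0.20,
    "contrastRobustness": 0.15,
    "saliencyProxy": 0.15,
    "textDensity": 0.10,
}

def overall_score(subscores: dict) -> int:
    """Weighted average of the six 0-100 signal scores, rounded to an int."""
    return round(sum(subscores[name] * weight for name, weight in WEIGHTS.items()))

# Subscores taken from the response example later in this page.
subscores = {
    "mobileLegibility": 92, "contrastRobustness": 88,
    "facePresence": 95, "saliencyProxy": 78,
    "textDensity": 72, "semanticAlignment": 82,
}
print(overall_score(subscores))  # → 86 (close to the published 87)
```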
💡 Tip: Always include the `title` field in your request. Without it, the semantic alignment signal (20% of the score) can't work properly and defaults to a neutral 50. You're missing out on a fifth of the analysis.
💡 Pro tip: Add a `scriptExcerpt` (first 10–1500 characters of your script) for even better alignment accuracy. The AI uses both the title and script to understand what your video is about, catching thumbnails that look good visually but don't match your content.
Why use it?
- Eliminate guesswork. Instead of asking friends or posting polls, get an objective, data-backed ranking in seconds.
- Catch problems you can't see. A thumbnail that looks great on your 27″ monitor might be unreadable on a phone. The API simulates mobile viewing and scores legibility.
- Understand why one thumbnail wins. It's not just "A is better than B." You get subscores for contrast, face presence, saliency, text density, and more — so you know exactly what to improve.
- Test before publishing. Run your thumbnails through the API before uploading to YouTube. Fix issues while they're still free to fix.
- Batch A/B variants. Create 5–8 variants with different approaches (bold text vs. clean, face vs. no face) and let the API find the winner.
Examples
Request example
curl -X POST https://api.creatornode.io/youtube/v1/thumbnail-ranking \
-H "Content-Type: application/json" \
-H "X-API-Key: YOUR_KEY" \
-d '{
"variants": [
{ "id": "bold-text", "imageBase64": "dGh1bWItYS1iYXNlNjQ=" },
{ "id": "clean-face", "imageBase64": "dGh1bWItYi1iYXNlNjQ=" },
{ "id": "dramatic", "imageBase64": "dGh1bWItYy1iYXNlNjQ=" }
],
"title": "I Quit My Job to Build an App — Here'\''s What Happened",
"scriptExcerpt": "What'\''s up everyone, so last month I decided to quit my 9-to-5 and go all in on building a SaaS app..."
}'
Response example
{
"success": true,
"data": {
"winner": {
"id": "clean-face",
"score": 87,
"reasons": [
"🏆 Strong face presence with genuine emotion (95/100)",
"✓ Excellent mobile legibility — sharp at small sizes",
"✓ High semantic alignment with the video title"
]
},
"variants": [
{
"id": "clean-face", "rank": 1, "score": 87,
"subscores": {
"mobileLegibility": 92, "contrastRobustness": 88,
"facePresence": 95, "saliencyProxy": 78,
"textDensity": 72, "semanticAlignment": 82
}
},
{ "id": "bold-text", "rank": 2, "score": 74, ... },
{ "id": "dramatic", "rank": 3, "score": 68, ... }
],
"confidence": { "level": "high", "value": 0.91 }
}
}
Image input options
You can provide thumbnails in two ways:
| Method | How | Best for |
|---|---|---|
| Base64 | "imageBase64": "..." | Local files — encode and embed directly in JSON |
| Multipart | multipart/form-data | Multiple files from disk — great for frontend upload UIs |
⚠️ Image limits: Max 2 MB per image. JPEG, PNG, and WebP supported. Minimum 50×50 pixels.
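For local files, a minimal Python client using the base64 method might look like this. The endpoint URL, `X-API-Key` header, and JSON field names are taken from the curl example above; the function names and the absence of error handling are assumptions for this sketch:

```python
import base64
import json
import urllib.request

API_URL = "https://api.creatornode.io/youtube/v1/thumbnail-ranking"

def encode_variant(variant_id: str, path: str) -> dict:
    """Read an image from disk and wrap it in the variant shape the API expects."""
    with open(path, "rb") as f:
        return {
            "id": variant_id,
            "imageBase64": base64.b64encode(f.read()).decode("ascii"),
        }

def rank_thumbnails(api_key: str, title: str, paths: dict) -> dict:
    """POST 2-10 variants and return the parsed JSON body."""
    body = json.dumps({
        "variants": [encode_variant(vid, p) for vid, p in paths.items()],
        "title": title,
    }).encode("utf-8")
    req = urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
    )
    with urllib.request.urlopen(req) as resp:
        # Credit accounting is exposed via response headers (see Tips below).
        print("credits used:", resp.headers.get("X-Credits-Used"))
        return json.loads(resp.read())
```

Remember the limits above: keep each encoded image under 2 MB, and use JPEG, PNG, or WebP at 50×50 pixels or larger.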
Understanding the response
The response includes a clear `winner` object with human-readable reasons, plus a `variants` array sorted by rank, each with subscores and diagnostics; the full shape is shown in the response example above.
Confidence scoring
The confidence object tells you how reliable the ranking is, based on three factors:
- Signal coverage (40%) — how many of the 6 signals returned valid data. More signals = higher confidence.
- Score stability (30%) — low variance across subscores means the thumbnail is consistently good (or bad), not just lucky in one dimension.
- Margin strength (30%) — the score gap between #1 and #2. A 15+ point gap = high confidence. Under 2 points = essentially a tie, lower confidence.
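Loosely, the blend works like this. The 40/30/30 weights come from the bullets above; the per-factor normalization to 0–1 and the `level` cutoffs are illustrative guesses, not documented values:

```python
def confidence(signal_coverage: float, score_stability: float, margin_strength: float) -> dict:
    """Blend three 0-1 factors using the documented 40/30/30 weights."""
    value = round(0.40 * signal_coverage + 0.30 * score_stability + 0.30 * margin_strength, 2)
    # Level cutoffs are assumptions for illustration, not documented thresholds.
    level = "high" if value >= 0.75 else "medium" if value >= 0.50 else "low"
    return {"level": level, "value": value}

# All signals valid, stable subscores, a solid gap between #1 and #2:
print(confidence(1.0, 0.9, 0.87))  # → {'level': 'high', 'value': 0.93}
```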
Tips & tricks
💡 Tip: If confidence is `"low"` and the margin is tiny, your variants are very similar. Try making bolder changes — different compositions, not just color tweaks.
- Test diverse approaches. Don't upload 5 variations of the same idea. Include at least one radically different concept — face vs. no face, text-heavy vs. clean, bright vs. dark. The biggest insights come from comparing different strategies, not minor tweaks.
- Use 3–5 variants for best results. Two thumbnails give you a winner. Five give you a clear hierarchy of what works and what doesn't. More than 8 usually means you haven't narrowed your concept enough.
- Read the subscores, not just the final score. A thumbnail might win overall but have a poor `textDensity` score. You can fix that specific issue and make it even better.
- Mobile legibility is king. Over 70% of YouTube views happen on mobile. If your thumbnail scores below 70 on `mobileLegibility`, simplify it — larger text, fewer elements, stronger contrast.
- Face presence matters — but size matters more. A tiny face in the corner scores lower than no face at all. If you use a face, make it 8–35% of the image area. Fill the frame.
- Keep text to 3–50 characters. The sweet spot for overlay text. More than 50 characters on a thumbnail is always penalized — on mobile, nobody can read it.
- Check the `X-Credits-Used` and related response headers. They show exactly how many credits were consumed and how many remain on your key.
Cost & Limits
| Feature | Detail |
|---|---|
| Credit cost | 13 credits per request |
| Extra variant cost | +1 credit per variant above 4 |
| Max image size | 2 MB per image |
| Supported formats | JPEG, PNG, WebP |
| Includes | AI analysis (visual scoring) |
Tier Limits
| Limit | Free | Premium |
|---|---|---|
| Max variants | 3 | 10 |
| AI models | Previous generation | Latest models |
💡 Example calculations:
- 3 thumbnails: 13 + 0 = 13 credits
- 4 thumbnails: 13 + 0 = 13 credits
- 6 thumbnails: 13 + 2×1 = 15 credits
- 8 thumbnails: 13 + 4×1 = 17 credits
Formula: total = 13 + max(0, variants − 4) × 1
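The formula above translates directly into code; this one-liner reproduces the example calculations:

```python
def credit_cost(variants: int) -> int:
    """Total credits for one request: 13 base + 1 per variant above 4."""
    return 13 + max(0, variants - 4)

print([credit_cost(n) for n in (3, 4, 6, 8)])  # → [13, 13, 15, 17]
```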