Thumbnail Ranking
POST /youtube/v1/thumbnail-ranking
What it does
You upload 2–10 thumbnail variants for the same video. The API analyzes each one across six visual quality signals, scores them 0–100, and tells you which one will perform best — with clear, human-readable reasons why.
The six quality signals
Each thumbnail is scored on these six dimensions. The final score is a weighted average:
| Signal | Weight | What it measures |
|---|---|---|
| Face Presence | 20% | Detects faces and their size relative to the image. Sweet spot: 8–35% of image area. No face = neutral score (40). |
| Semantic Alignment | 20% | How well the thumbnail visually matches the video title and script. Requires the title field. Adding a scriptExcerpt gives the AI richer context and improves accuracy. Thumbnails with very low alignment (≤25) receive a 15-point penalty and a misalignment warning. |
| Mobile Legibility | 20% | Sharpness and readability at mobile size (200px wide). Penalizes visual clutter and low-contrast elements. |
| Contrast Robustness | 15% | How well the thumbnail holds up across different UI contexts (light/dark mode, compression). Tests JPEG compression stability. |
| Saliency | 15% | How attention-grabbing the thumbnail is. AI vision analysis identifies focal points and visual hierarchy. |
| Text Density | 10% | OCR-based analysis of text on the thumbnail. Ideal range: 5–40% coverage, 3–50 characters. Too much text hurts; too little misses an opportunity. |
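As a rough illustration, the weighted average can be sketched like this. The signal names and weights come from the table above; the rounding and any server-side penalties (such as the 15-point misalignment penalty) are not modeled here, which is why this sketch lands near, but not necessarily exactly on, the published scores:

```python
# Weights from the signals table above; combination logic is an assumption.
WEIGHTS = {
    "facePresence": 0.20,
    "semanticAlignment": 0.20,
    "mobileLegibility": 0.20,
    "contrastRobustness": 0.15,
    "saliencyProxy": 0.15,
    "textDensity": 0.10,
}

def overall_score(subscores: dict) -> int:
    """Weighted average of the six 0-100 signal scores, rounded to an int."""
    return round(sum(subscores[name] * weight for name, weight in WEIGHTS.items()))

# Subscores taken from the response example later in this page.
subscores = {
    "mobileLegibility": 92, "contrastRobustness": 88,
    "facePresence": 95, "saliencyProxy": 78,
    "textDensity": 72, "semanticAlignment": 82,
}
print(overall_score(subscores))  # → 86 (close to the published 87)
```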
💡 Tip: Always include the `title` field in your request. Without it, the semantic alignment signal (20% of the score) can't work properly and defaults to a neutral 50. You're missing out on a fifth of the analysis.
💡 Pro tip: Add a `scriptExcerpt` (first 10–1500 characters of your script) for even better alignment accuracy. The AI uses both the title and script to understand what your video is about, catching thumbnails that look good visually but don't match your content.
Why use it?
- Eliminate guesswork. Instead of asking friends or posting polls, get an objective, data-backed ranking in seconds.
- Catch problems you can't see. A thumbnail that looks great on your 27″ monitor might be unreadable on a phone. The API simulates mobile viewing and scores legibility.
- Understand why one thumbnail wins. It's not just "A is better than B." You get subscores for contrast, face presence, saliency, text density, and more — so you know exactly what to improve.
- Test before publishing. Run your thumbnails through the API before uploading to YouTube. Fix issues while they're still free to fix.
- Batch A/B variants. Create 5–8 variants with different approaches (bold text vs. clean, face vs. no face) and let the API find the winner.
Examples
Request example
curl -X POST https://api.creatornode.io/youtube/v1/thumbnail-ranking \
-H "Content-Type: application/json" \
-H "X-API-Key: YOUR_KEY" \
-d '{
"variants": [
{ "id": "bold-text", "imageBase64": "dGh1bWItYS1iYXNlNjQ=" },
{ "id": "clean-face", "imageBase64": "dGh1bWItYi1iYXNlNjQ=" },
{ "id": "dramatic", "imageBase64": "dGh1bWItYy1iYXNlNjQ=" }
],
"title": "I Quit My Job to Build an App — Here'\''s What Happened",
"scriptExcerpt": "What'\''s up everyone, so last month I decided to quit my 9-to-5 and go all in on building a SaaS app..."
}'
Response example
{
"success": true,
"data": {
"winner": {
"id": "clean-face",
"score": 87,
"reasons": [
"🏆 Strong face presence with genuine emotion (95/100)",
"✓ Excellent mobile legibility — sharp at small sizes",
"✓ High semantic alignment with the video title"
]
},
"variants": [
{
"id": "clean-face", "rank": 1, "score": 87,
"subscores": {
"mobileLegibility": 92, "contrastRobustness": 88,
"facePresence": 95, "saliencyProxy": 78,
"textDensity": 72, "semanticAlignment": 82
}
},
{ "id": "bold-text", "rank": 2, "score": 74, ... },
{ "id": "dramatic", "rank": 3, "score": 68, ... }
],
"confidence": { "level": "high", "value": 0.91 }
}
}
Image input options
You can provide thumbnails in two ways:
| Method | How | Best for |
|---|---|---|
| Base64 | "imageBase64": "..." | Local files — encode and embed directly in JSON |
| Multipart | multipart/form-data | Multiple files from disk — great for frontend upload UIs |
⚠️ Image limits: Max 2 MB per image. JPEG, PNG, and WebP supported. Minimum 50×50 pixels.
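For local files, a minimal Python client using the base64 method might look like this. The endpoint URL, `X-API-Key` header, and JSON field names are taken from the curl example above; the function names and the absence of error handling are assumptions for this sketch:

```python
import base64
import json
import urllib.request

API_URL = "https://api.creatornode.io/youtube/v1/thumbnail-ranking"

def encode_variant(variant_id: str, path: str) -> dict:
    """Read an image from disk and wrap it in the variant shape the API expects."""
    with open(path, "rb") as f:
        return {
            "id": variant_id,
            "imageBase64": base64.b64encode(f.read()).decode("ascii"),
        }

def rank_thumbnails(api_key: str, title: str, paths: dict) -> dict:
    """POST 2-10 variants and return the parsed JSON body."""
    body = json.dumps({
        "variants": [encode_variant(vid, p) for vid, p in paths.items()],
        "title": title,
    }).encode("utf-8")
    req = urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
    )
    with urllib.request.urlopen(req) as resp:
        # Credit accounting is exposed via response headers (see Tips below).
        print("credits used:", resp.headers.get("X-Credits-Used"))
        return json.loads(resp.read())
```

Remember the limits above: keep each encoded image under 2 MB, and use JPEG, PNG, or WebP at 50×50 pixels or larger.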
Understanding the response
The response includes a clear `winner` object with human-readable reasons, plus a `variants` array sorted by rank, each with subscores and diagnostics; the full shape is shown in the response example above.
Confidence scoring
The confidence object tells you how reliable the ranking is, based on three factors:
- Signal coverage (40%) — how many of the 6 signals returned valid data. More signals = higher confidence.
- Score stability (30%) — low variance across subscores means the thumbnail is consistently good (or bad), not just lucky in one dimension.
- Margin strength (30%) — the score gap between #1 and #2. A 15+ point gap = high confidence. Under 2 points = essentially a tie, lower confidence.
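Loosely, the blend works like this. The 40/30/30 weights come from the bullets above; the per-factor normalization to 0–1 and the `level` cutoffs are illustrative guesses, not documented values:

```python
def confidence(signal_coverage: float, score_stability: float, margin_strength: float) -> dict:
    """Blend three 0-1 factors using the documented 40/30/30 weights."""
    value = round(0.40 * signal_coverage + 0.30 * score_stability + 0.30 * margin_strength, 2)
    # Level cutoffs are assumptions for illustration, not documented thresholds.
    level = "high" if value >= 0.75 else "medium" if value >= 0.50 else "low"
    return {"level": level, "value": value}

# All signals valid, stable subscores, a solid gap between #1 and #2:
print(confidence(1.0, 0.9, 0.87))  # → {'level': 'high', 'value': 0.93}
```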
Tips & tricks
💡 Tip: If confidence is `"low"` and the margin is tiny, your variants are very similar. Try making bolder changes — different compositions, not just color tweaks.
- Test diverse approaches. Don't upload 5 variations of the same idea. Include at least one radically different concept — face vs. no face, text-heavy vs. clean, bright vs. dark. The biggest insights come from comparing different strategies, not minor tweaks.
- Use 3–5 variants for best results. Two thumbnails give you a winner. Five give you a clear hierarchy of what works and what doesn't. More than 8 usually means you haven't narrowed your concept enough.
- Read the subscores, not just the final score. A thumbnail might win overall but have a poor `textDensity` score. You can fix that specific issue and make it even better.
- Mobile legibility is king. Over 70% of YouTube views happen on mobile. If your thumbnail scores below 70 on `mobileLegibility`, simplify it — larger text, fewer elements, stronger contrast.
- Face presence matters — but size matters more. A tiny face in the corner scores lower than no face at all. If you use a face, make it 8–35% of the image area. Fill the frame.
- Keep text to 3–50 characters. The sweet spot for overlay text. More than 50 characters on a thumbnail is always penalized — on mobile, nobody can read it.
- Check the `X-Credits-Used` and related response headers. They show exactly how many credits were consumed and how many remain on your key.
Cost & Limits
| Feature | Detail |
|---|---|
| Credit cost | 13 credits per request |
| Extra variant cost | +1 credit per variant above 4 |
| Max image size | 2 MB per image |
| Supported formats | JPEG, PNG, WebP |
| Includes | AI analysis (visual scoring) |
Tier Limits
| Limit | Free | Premium |
|---|---|---|
| Max variants | 3 | 10 |
| AI models | Previous generation | Latest models |
💡 Example calculations:
- 3 thumbnails: 13 + 0 = 13 credits
- 4 thumbnails: 13 + 0 = 13 credits
- 6 thumbnails: 13 + 2×1 = 15 credits
- 8 thumbnails: 13 + 4×1 = 17 credits
Formula: total = 13 + max(0, variants − 4) × 1
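The formula above translates directly into code; this one-liner reproduces the example calculations:

```python
def credit_cost(variants: int) -> int:
    """Total credits for one request: 13 base + 1 per variant above 4."""
    return 13 + max(0, variants - 4)

print([credit_cost(n) for n in (3, 4, 6, 8)])  # → [13, 13, 15, 17]
```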