Appearance
Chapter 7 — Toolkit 2026
🧰
"AI tools change every quarter. Don't memorize tools — memorize the ROLE of each layer."
How to use this chapter
- Read overview tables → know the landscape
- Quick picker — fast tool choice for specific use case
- Re-check every 3 months — pricing/features change fast
00 Interactive stack picker
🎯 Gen Stack Picker — Bạn cần tool gì?
Trả lời 3 câu để pick tool phù hợp.
01 Image Generation
Full comparison
| Tool | Price (May 2026) | Strength | Weakness | Best for |
|---|---|---|---|---|
| Midjourney V7 | $10-120/month | Aesthetic, art style, character consistency (Omni Reference), voice input | Discord-heavy, weak text-in-image | Branding, illustration, mood boards |
| Flux.2 (Pro/Max/Klein) | $0.05+/image API | 4MP photoreal, light physics, multi-ref control | New ecosystem, fewer community LoRAs | Commercial product photo, ads |
| Nano Banana Pro (Gemini 3) | $0.134/image API | Top prompt adherence, 4K, in-image text accurate | Premium cost, strict safety filters | Brand edits, complex composition |
| Recraft V3 | $0-96/month | #1 ELO 1172, SVG vector export, brand canvas | Smaller community | Logos, vectors, icons, brand |
| Ideogram 3 | $0-48/month | Best in-image text (~75% accuracy) | Less photoreal than Flux | Ads with text, posters, packaging |
| GPT Image 1.5/2 (DALL-E) | $0.009-0.04/image | Native ChatGPT, prompt accuracy | Slow (60s-4min), over-stylize | Conversational edit, ChatGPT flow |
| Adobe Firefly 4 | CC subscription | IP-indemnified (commercial safe) | Lower aesthetic ceiling | Enterprise / IP-sensitive |
| Imagen 4 | $0.02-0.04/image API | Photoreal, near real-time, sharp text | Less artistic | High-volume production |
| Krea 1 | $0-30/month | Real-time canvas, sketch-to-image | Heavy on credits | Iterative ideation |
| Stable Diffusion 3.5 | Free (local) | Open weights, full control | Needs GPU, learning curve | Pro pipeline, ComfyUI |
02 Video Generation
| Tool | Price (May 2026) | Strength | Best for |
|---|---|---|---|
| Sora 2 | ChatGPT Pro $20+ | Physics first-class, sync audio 10-25s, Cameo | Social-native, cameo content |
| Veo 3.1 | Gemini / Vertex AI | 4K, native 48kHz dialogue, vertical 9:16 | Cinematic, audio-synced |
| Kling 2.5 Turbo | $6.99-64.99/month | Outperforms Seedance/Veo3-fast, ~30% cheaper | Cost-efficient, anime style |
| Runway Gen-4 | $15-95/month | Single-image character consistency | Filmmaker, character scenes |
| Pika 2.2 | $10-95/month | Pikaframes, Scene Ingredients, lip-sync | Memes, playful effects |
| MiniMax Hailuo 2.3 | $14.99/month | Budget photoreal, style packs | Stylized motion, budget |
| Luma Ray 3 | $9.99-99/month | Native 16-bit HDR, physics | Multi-model lab, HDR |
| Hunyuan Video 1.5 | Free open-source | 13B params, strong motion | Open-source pipeline |
| Wan 2.2 | Apache 2.0 (Alibaba) | MoE arch, beat Sora on VBench (86.22%) | Self-host, ComfyUI |
| Higgsfield Cinema | $9-99/month | Aggregator + camera presets | Cinematic camera moves |
03 Audio Generation
Music
| Tool | Price | Strength | Best for |
|---|---|---|---|
| Suno v5.5 | $0-30/month | Voice clone, Personas, 8-min songs, charted #1 Billboard | Full songs with vocals |
| Udio | UMG/Udio 2026 | Licensed (UMG settle Oct 2025) | Licensed music workflow |
| Riffusion | Free + paid | Image-diffusion based | Backing tracks, samples |
| Stable Audio 3.0 | Open weights | On-device, dev-friendly | Indie devs, sound design |
Voice / TTS
| Tool | Price | Strength |
|---|---|---|
| ElevenLabs v3 | $5-1,320/month | Instant + Pro voice clone, emotional tag, 32+ languages |
| Cartesia (Sonic) | API per-minute | Lowest latency TTS market, on-device |
| Play.ht | $19-99/month | Conversational AI voice |
| Murf AI | $19-79/month | 120+ voices, video sync |
04 3D Generation
| Tool | Price | Strength | Best for |
|---|---|---|---|
| Tripo v3.1 | $0.133/gen | 50% faster, PBR default | Game prop, indie |
| Meshy-6 | ~$0.80+/gen | Best geometry, 3D-print ready | High-fidelity, 3D print |
| Luma Genie | Free + paid | Text/image → 4 previews ~10s | Simple character, prop |
| CSM | API + sub | Multi-view consistency, AR-ready | AR/VR multi-view |
| Spline AI | $0-22/month | Browser-native, web-embed | Web 3D, marketing |
05 Multimodal / Workflow
| Tool | Price | What it does | Best for |
|---|---|---|---|
| ComfyUI | Free open-source | Node-based graph for SD/Flux/Wan/Hunyuan | Pro full-control, local |
| Krea Nodes | $0-60/month | Visual canvas integrating Flux/Sora/Veo/3D | Designers, node workflow |
| Magnific (Freepik) | $5.75/month+ | 250M asset library + Upscaler/Relight | Stock-heavy workflow |
| Higgsfield | $9-99/month | Multi-model camera control wrapper | Cinematic cross-model |
| ElevenLabs Flows | Bundled | Node canvas AI filmmaking pipeline | Short film end-to-end |
| Adobe Firefly App/Foundry | CC sub | Hosts Firefly + Veo 3 + Runway Gen-4 in Adobe | Enterprise creative |
| Pomelli (Google Labs) | Free beta | Scans website → "Business DNA" → branded ads | Small business, solo founders |
06 Ideal stacks by persona
Persona 1: Indie creator (budget $50-100/month)
Image: Midjourney V7 ($30)
Video: Kling 2.5 ($14.99) or Sora 2 (ChatGPT Plus $20)
Audio: Suno Pro ($10) + ElevenLabs Starter ($5)
3D: Luma Genie (free tier)
Edit: CapCut Pro ($7.99)
─────────────
Total: ~$80-90/monthPersona 2: Solo SaaS founder
Image API: Flux Pro via Replicate (~$0.05/gen × volume)
Backend: Stripe + Supabase
Frontend: Next.js + Vercel
Auth: Clerk
─────────────
Variable cost (70-85% margin if pricing right)Persona 3: Faceless YouTuber
Image: Midjourney V7 ($30)
Video: Veo 3.1 (Gemini Advanced $19.99)
Voice: ElevenLabs Creator ($22)
Edit: CapCut Pro ($7.99)
SEO: TubeBuddy + VidIQ ($15-30)
─────────────
Total: ~$95-110/monthPersona 4: Virtual influencer agency
Image: Flux Pro API + own LoRA
Render: ComfyUI local (RTX 4090 ~$1,600 one-time)
3D / animation: Runway Gen-4 ($95)
Voice: ElevenLabs Pro ($99)
Posting: Buffer ($15)
─────────────
Total: ~$200/month + GPUPersona 5: AI music producer
Music: Suno Premier ($30)
Voice: ElevenLabs Creator ($22)
Master: LANDR ($4/track) or BandLab Mastering (free)
Distribute: DistroKid ($23/year)
Cover art: Midjourney ($30)
─────────────
Total: ~$85/month07 Selection framework
Decision tree
Q1: Cost model fits business model?
- Pay-per-use (API) → OK for SaaS
- Subscription → OK for creators
- One-time → rare for AI
Q2: Does region work?
- International card OK?
- Sub has regional tier?
- Local regulations OK?
Q3: Does output have commercial license?
- Personal tiers usually don't allow commercial
- Read ToS carefully before launch
Q4: Backup if primary fails?
- Tools rotate, don't over-rely on one
- Test 2-3 alternatives every 6 months
08 Continue reading
- Chapter 1 — Solo Studio
- Chapter 4 — Solo SaaS
- Chapter 5 — Sora 2 & TikTok
- Chapter 6 — Faceless Empire
- Chapter 8 — Ethics 2026
- Chapter 9 — 30-Day Roadmap
Update cadence
Tools and pricing accurate May 2026.Re-check every 3 months — pricing changes fast.