Ima Claw Creator's Training Guide
Preface
Fu Sheng (傅盛) wrote an excellent piece called "AI Assistant Training Manual" about his 25 days with his AI assistant. Every OpenClaw user should read it.
But Fu Sheng's guide is about general AI assistants — managing emails, running scripts, handling messages.
My world is different. I'm a content creator. I need an AI employee that makes videos, generates images, writes articles, and manages social media.
This is my version: How creators can train an AI lobster that actually does the work.
Chapter 1: Four Misconceptions Creators Have About AI
Many creators try AI and conclude "it's not that great." It's not that AI can't do it; it's that the approach is wrong.
❌ Misconception 1: "AI Can Directly Make My Video"
Not in one step.
AI video generation reality: 5 seconds per shot. You say "make me a one-minute video" — it can't.
But you can do this:
Break into 3 shots → 5 seconds each → auto-stitch → auto-score → 15-second film
The key isn't "can AI do it" but how you break down the task. You can do this yourself, or let the AI figure it out — but it needs to know how the world works.
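To make "auto-stitch → auto-score" concrete, here's a minimal sketch of that final step using ffmpeg, assuming three already-generated 5-second clips and a music track (all file names are placeholders):

```python
import subprocess
from pathlib import Path

# Hypothetical inputs: three 5-second clips generated shot by shot,
# plus a background track.
clips = ["shot1.mp4", "shot2.mp4", "shot3.mp4"]
music = "bgm.mp3"

# ffmpeg's concat demuxer wants a list file: one "file '<path>'" line per clip.
Path("clips.txt").write_text("".join(f"file '{c}'\n" for c in clips))

# Stitch the clips without re-encoding the video.
subprocess.run(
    ["ffmpeg", "-y", "-f", "concat", "-safe", "0", "-i", "clips.txt",
     "-c", "copy", "stitched.mp4"],
    check=True,
)

# Lay the music underneath, cutting to the shorter stream (-shortest).
subprocess.run(
    ["ffmpeg", "-y", "-i", "stitched.mp4", "-i", music,
     "-map", "0:v", "-map", "1:a", "-c:v", "copy", "-shortest", "final.mp4"],
    check=True,
)
```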
❌ Misconception 2: "Send a Reference Image and AI Will Replicate It"
Not quite.
AI image generation has two modes:
- Text-to-image: You describe in words → AI imagines → probably doesn't match
- Image-to-image: You provide a reference → AI generates based on it → 10x more accurate
My lesson: I asked AI to create an iPhone 17 Pro Max promo without reference images. The AI "imagined" a phone that looked nothing like the real thing.
Rule: For specific products/people/scenes, always search for reference images first. Use image-to-image.
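To see the difference in code, here's a sketch of both modes using the open-source diffusers library (not necessarily what Claw runs under the hood; the checkpoint, file paths, and strength value are illustrative assumptions):

```python
import torch
from diffusers import AutoPipelineForText2Image, AutoPipelineForImage2Image
from diffusers.utils import load_image

model = "stabilityai/stable-diffusion-xl-base-1.0"  # assumption: any SDXL checkpoint works

# Mode 1: text-to-image. The model "imagines" the product from words alone.
t2i = AutoPipelineForText2Image.from_pretrained(model, torch_dtype=torch.float16).to("cuda")
imagined = t2i(prompt="iPhone 17 Pro Max on a marble desk, studio light").images[0]

# Mode 2: image-to-image. A real reference photo anchors the geometry and details;
# strength controls how far the model may drift from the reference.
i2i = AutoPipelineForImage2Image.from_pretrained(model, torch_dtype=torch.float16).to("cuda")
reference = load_image("real_product_photo.jpg")  # placeholder path
grounded = i2i(prompt="studio product shot, marble desk",
               image=reference, strength=0.4).images[0]
```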
❌ Misconception 3: "Good Prompts Are All You Need"
Prompts matter, but they're not the most important thing.
When you make a video, the prompt is maybe 10% of the decisions. The other 90%:
- Which model? (Kling O1 for character consistency, Wan 2.6 for visual quality, Veo 3 for complex scenes)
- Which mode? (text-to-video / image-to-video / reference image?)
- How many shots? What transitions?
- What music? What tempo?
A good AI assistant should make these decisions for you, not wait for instructions on each one.
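Here's a hypothetical sketch of that routing logic. The model names come from the list above; the rules are illustrative, not Claw's actual policy:

```python
from dataclasses import dataclass

@dataclass
class VideoTask:
    has_reference_image: bool   # did the user attach a photo?
    recurring_character: bool   # same subject across shots?
    complex_scene: bool         # crowds, physics, multi-object interaction

def pick_model_and_mode(task: VideoTask) -> tuple[str, str]:
    # Mode first: a reference image always beats pure text.
    mode = "image_to_video" if task.has_reference_image else "text_to_video"
    if task.recurring_character:
        return "Kling O1", mode    # strongest character consistency
    if task.complex_scene:
        return "Veo 3", mode       # handles complex scenes best
    return "Wan 2.6", mode         # default: best visual quality

print(pick_model_and_mode(VideoTask(True, True, False)))
# ('Kling O1', 'image_to_video')
```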
❌ Misconception 4: "AI Output Is Ready to Use"
Not yet.
First-attempt pass rate is roughly 60-70%. That means 3-4 out of 10 need a redo.
And AI often can't tell what went wrong — like when a door opens in the wrong direction in a video.
You look at it, say "direction's wrong," and it fixes it.
This isn't a flaw. It's the current workflow: humans decide, AI executes.
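That loop is simple to sketch. Both helpers below are hypothetical stand-ins for whatever backend and chat surface you actually use:

```python
def generate_shot(prompt: str) -> str:
    # Hypothetical: call your video backend, return the clip's path.
    return "shot.mp4"

def revise_prompt(prompt: str, feedback: str) -> str:
    # Hypothetical: fold the human's note back into the prompt.
    return f"{prompt}. Correction: {feedback}"

def shot_with_review(prompt: str, max_redos: int = 3) -> str:
    clip = generate_shot(prompt)
    for _ in range(max_redos):
        feedback = input(f"Review {clip}. Enter to accept, or describe the problem: ")
        if not feedback:            # human approved, ship it
            return clip
        prompt = revise_prompt(prompt, feedback)   # e.g. "direction's wrong"
        clip = generate_shot(prompt)
    return clip                     # best effort after max_redos attempts
```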
Chapter 2: How Ima Claw's Creative Workflow Works
It's Not a Tool — It's an Employee
Traditional creative tool: You open Photoshop → operate it yourself → export.
AI creative tool: You open Midjourney → write prompt → wait → rewrite if unsatisfied.
Ima Claw is different: You say one sentence, and it decides how to do it.
Real case — One cat photo becomes a 15-second film:
- I sent a cat photo and said "make a short video"
- Claw decided: preserve the cat's real appearance → chose reference_image_to_video mode
- Auto-selected Kling O1 (strongest character consistency)
- Self-planned 3 shots (cat scratching door → door opens → cat runs out)
- Shot 2 was wrong (door direction reversed) → caught its own mistake → rewrote prompt → regenerated
- Three shots auto-stitched + auto-generated BGM + merged output
I spent 2 minutes. Claw worked for 40 minutes.
That's the difference between an "AI employee" and an "AI tool": tools wait for your input; employees think for you.
The Creative Stack
| Capability | Models | Use Cases |
|---|---|---|
| Image Generation | Midjourney / Nano Banana Pro / Seedream | Covers, posters, product shots |
| Video Generation | Kling O1 / Wan 2.6 / Veo 3.1 / Seedance | Short films, ads, demos |
| Music Generation | DouBao BGM / Suno | Background music, soundtracks |
| Copywriting | Claude / GPT | Blogs, social copy, scripts |
| Auto-Publishing | Xiaohongshu / WeChat | One-click distribution |
You don't need to know what these models are — tell Claw what you want, it picks.
What Does It Cost?
| Project | Cost | If Done Manually |
|---|---|---|
| 15-sec film (3 shots + BGM) | 174 credits ≈ $1.70 | Half-day shoot ≈ $300+ |
| AI cover image | 10-18 credits ≈ $0.15 | Designer ≈ $30+ |
| Bilingual blog post | 0 credits (text only) | Translator ≈ $70+ |
It's not just 90% cheaper. It's 90% faster AND 90% cheaper.
Chapter 3: 10 Practical Creative Tips
All from real mistakes, now coded into Claw's rule files.
1. Product Images: Always Search for References First
❌ "Make an iPhone promo" → generate from text ✅ Get the task → search real product photos → use image-to-image
2. Wrong Video Direction? Use Light as a Guide
"Light pouring through the door crack, getting brighter" → AI understands the door is opening.
Claw's own lesson: light and motion direction matter more than subject description.
3. Dual-Model Comparison for Images
Generate with Midjourney + Nano Banana Pro simultaneously. Show both, let user choose. 2x efficiency vs single model.
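A sketch of the fan-out, with hypothetical async clients standing in for the real API calls:

```python
import asyncio

# Hypothetical clients; swap in your real Midjourney / Nano Banana Pro calls.
async def midjourney(prompt: str) -> str:
    await asyncio.sleep(0)  # placeholder for the real API round-trip
    return "midjourney.png"

async def nano_banana_pro(prompt: str) -> str:
    await asyncio.sleep(0)
    return "nano_banana_pro.png"

async def dual_generate(prompt: str) -> list[str]:
    # Fire both models at once; total wait is the slower of the two,
    # not the sum. That's where the 2x efficiency comes from.
    return list(await asyncio.gather(midjourney(prompt), nano_banana_pro(prompt)))

candidates = asyncio.run(dual_generate("minimal tech blog cover, dark blue"))
print(candidates)  # show both to the user, let them choose
```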
4. Never Use Sub-Agents for Writing
Sub-agents can't see the main session context. Articles come out completely off-target. Write/code/design → always in the main session.
5. Send Files Directly, Never Paste File Paths
❌ "File is at /root/workspace/output/xxx.mp4" (user can't open it) ✅ Send the file directly via Feishu/messaging
6. One-Sentence vs Step-by-Step
Simple creative task → one sentence: "make a cat video"
Complex task with standards → step by step: "cover first → I review → then video → I confirm → then copy"
7. Self-Test Before Every Delivery
Mandatory checklist: HTTP 200 verification, no 404 links, DOM validation, visual check. No testing = rework.
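The first three checks are automatable. A minimal sketch with requests and BeautifulSoup (the visual check stays human):

```python
import requests
from bs4 import BeautifulSoup

def self_test(url: str) -> list[str]:
    """Pre-delivery check: page loads, DOM parses, and no link on it 404s."""
    problems = []
    resp = requests.get(url, timeout=10)
    if resp.status_code != 200:                      # HTTP 200 verification
        return [f"{url} returned {resp.status_code}"]
    soup = BeautifulSoup(resp.text, "html.parser")   # DOM validation
    for a in soup.find_all("a", href=True):
        href = a["href"]
        if not href.startswith("http"):
            continue  # this sketch skips anchors and relative links
        if requests.head(href, timeout=10, allow_redirects=True).status_code == 404:
            problems.append(f"broken link: {href}")
    return problems

print(self_test("https://example.com") or "all checks passed")
```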
8. Video Poster Thumbnails
AI videos embed with a black first frame by default on the web. Fix: extract the first frame via ffmpeg → convert to webp → add a poster attribute. Small detail, huge UX difference.
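The fix, sketched with a plain ffmpeg call (file names are placeholders):

```python
import subprocess

video = "output.mp4"    # the clip you are about to embed
poster = "poster.webp"

# Grab the first frame and encode it as webp.
subprocess.run(["ffmpeg", "-y", "-i", video, "-frames:v", "1", poster], check=True)

# Reference it from the embed, so the player shows a real frame, not black.
html = f'<video src="{video}" poster="{poster}" controls></video>'
print(html)
```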
9. Install CJK Fonts First
Servers don't have Chinese fonts by default. Generated covers show □□□□ instead of text. One command to fix, but without it everything breaks.
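The fix, assuming a Debian/Ubuntu server (other distros need their own package manager):

```python
import subprocess

# fonts-noto-cjk covers Chinese, Japanese, and Korean;
# fc-cache rebuilds the font cache so renderers pick the fonts up.
subprocess.run(["apt-get", "install", "-y", "fonts-noto-cjk"], check=True)
subprocess.run(["fc-cache", "-f"], check=True)
```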
10. Write Lessons as Rules, Not Memory
Same as Fu Sheng's point: AI doesn't "remember." Mistake → write it into AGENTS.md/TOOLS.md → becomes a permanent rule.
Chapter 4: The Creator's 7-Day Path
📅 Day 1: Adopt + Establish Identity
- Name your lobster, set its personality (SOUL.md)
- Tell it who you are, what content you make, your style preferences (USER.md)
- First task: have it generate a profile picture for you
📅 Day 2: First Creative Task
- Send a photo, say "make a short video"
- Watch how it breaks down the task, selects models, generates
- Not satisfied? Tell it what's wrong, watch how it adapts
📅 Day 3: Establish Creative Rules
- Write your brand colors, font preferences, content style into TOOLS.md
- Write lessons ("always use image-to-image for products") into AGENTS.md
- Set up your platforms and publishing formats
📅 Day 4-5: Batch Creation
- Try creating 3-5 pieces at once (images + videos + copy)
- Have it write bilingual versions
- Try auto-publishing to social platforms
📅 Day 6: Establish Daily Routines
- Set up Heartbeat: daily industry news scan
- Set up Cron: weekly content calendar generation
- Competitor tracking automation
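Claw has its own scheduler, but if you drive the weekly calendar job with plain Unix cron instead, it's one line (the script path is a hypothetical placeholder):

```cron
# Every Monday at 09:00: regenerate the week's content calendar.
0 9 * * 1  python3 /root/scripts/weekly_content_calendar.py
```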
📅 Day 7: Review + Optimize
- Review the week's output — what worked, what needs fixing
- Write lessons into rule files
- Slim down MEMORY.md, keep it focused
📅 Day 8+: Continuous Evolution
As Fu Sheng says — AI doesn't self-evolve. You evolve, and the rule system you build for it evolves.
But here's what's different for creators: your work is the best proof of evolution.
A month ago, you might have been struggling with prompts. A month later, you send a photo, say a sentence, and a 15-second film is done.
Final Thought
Fu Sheng said: "Your job isn't to make AI smarter — it's to make sure AI sees the right information."
I'll add: For creators, your job isn't to learn every AI tool — it's to raise a lobster that learns them all for you.
This lobster selects models, writes scripts, generates videos, adds music, and publishes.
You just need to have ideas, and say them out loud.