Current Prospects for Long Form AI Movie Making

 



What currently is the best AI for generating a fully AI-generated 90-minute feature from a user-generated narrative?

ChatGPT said:

If you’re aiming to turn a user-generated narrative into a fully AI-generated 90-minute feature film, you're asking for something that’s still beyond the capabilities of current consumer-grade AI tools. Here’s a breakdown of where things stand as of August 30, 2025, and what might get you closest to your goal:


State of the Art in AI Film Generation

1. Google Veo (and Flow)

  • Veo 3, released in May 2025, can generate short video clips (~8 seconds max) with synchronized audio, including dialogue, ambient sound, and effects. It represents the first AI model that aims to “leave the era of silent film” (Wikipedia, Business Insider, ThemeHunk, Synthesia).

  • Flow, built on Veo 3 and Imagen, offers advanced cinematic features like camera movements, scene editing, and storyboard control, but is still limited to short-form content (Business Insider).

2. Runway Gen-4 (Plus Aleph, Act-One)

  • Runway’s Gen-4 model (early 2025) improves visual consistency across scenes, handles continuity in characters and environments, and offers explicit camera and scene controls (Wikipedia).

  • Gen-4 Turbo speeds up generation with lower cost, while Aleph adds robust editing capabilities: scene style, lighting, and object manipulation (Wikipedia).

  • Their Act-One and Act-Two tools allow transforming actor performances into animated characters, preserving expressions and timing (Wikipedia).

  • Still, the maximum output per prompt is a very short clip, and stitching clips into a full film remains a manual, complex process.

3. OpenAI Sora

  • Released via ChatGPT Pro/Plus in December 2024, Sora can generate realistic short video clips (up to around 60 seconds). It handles complex scenes with multiple characters, but struggles with longer sequences and lacks long-form narrative support (Wikipedia, TIME, Segmind).


Cutting-Edge Research (Not Yet Consumer-Ready)

  • SkyReels-V2 (April 2025) introduces a diffusion-based “infinite-length” film generation system using multi-stage training and shot-level understanding, but it’s a research prototype, not available commercially (arXiv).

  • MAViS (just two weeks ago, August 2025) showcases a modular, multi-agent pipeline for long-sequence video storytelling: from script to animation to audio. It hints at the future, but again, remains academic (arXiv).


Community Insights

Enthusiasts on Reddit echo what the research indicates:

Runway is “most cinematic and movie-like.”
Kling is praised for longer video capability (~2 minutes) and finer control (Reddit, Segmind).

But even with these, users emphasize that current outputs are still short and fragmented.


Verdict: What’s the “Best” Tool Right Now?

Tool | Strengths | Limitations
Google Veo / Flow | Best in audio-visual coherence, cinematic control | Very short durations (~8s)
Runway Gen-4 + Aleph | Artistic control, scene consistency, advanced editing | Still limited to short clips
OpenAI Sora | Realism, multiple elements per scene | Short clips; limited narrative
Research tools (SkyReels-V2, MAViS) | Promise for long-form storytelling | Not yet available to creators

What You Can Do Today

  1. Break your film into segments (scenes, acts, or chapters).

  2. Use Runway Gen-4 (with Aleph) or Google Veo/Flow for each segment—leveraging control over camera, scene, and sound.

  3. Manually stitch and edit the clips together using video editing tools.

  4. Experiment with actor performances via Act-One for character-driven scenes.

  5. Keep an eye on research like MAViS and SkyReels-V2—they could pave the way for future long-form AI storytelling.


Looking Ahead

The current best option is a multi-tool, segmented approach, combining what each platform does best to craft a 90-minute narrative. Fully automated, end-to-end AI feature-film generation is not here yet, but it is potentially on the horizon, especially with frameworks like MAViS pushing the envelope.


Let me know if you'd like help brainstorming how to structure chapters, generating scripts, or coordinating how to piece together outputs from multiple AI tools!

You said:
So stitching clips. Incorporating AI voices?
ChatGPT said:

Exactly — the only way to get to a full-length AI feature today is by stitching shorter clips together, then layering in voices, music, and sound. Here’s how the workflow usually looks in practice:


🔹 1. Narrative Breakdown (Pre-Production)

  • Write your full script (or narrative outline).

  • Break it into scenes → shots → clips.

  • Decide which shots will be fully AI-generated video vs. which you might fill with stills, pans, or animation tricks (to avoid uncanny-valley moments).


🔹 2. Video Generation

  • Runway Gen-4 + Aleph → best for cinematic consistency and editing controls (e.g., “dolly zoom across a neon city street”).

  • Google Veo/Flow → better if you want sound baked in (though limited to ~8s).

  • OpenAI Sora → good for realistic multi-character shots, though less controllable.

⚠️ All of these produce short clips only, so you’ll need a nonlinear editor (Premiere, DaVinci Resolve, Final Cut) to assemble.


🔹 3. Voices & Dialogue

For dialogue, you’d generate AI voices separately and sync them:

  • ElevenLabs → industry leader in expressive AI voice, supports voice cloning & emotional nuance.

  • OpenAI’s GPT-4o with voice → strong for real-time dialogue, but less fine-grained than ElevenLabs.

  • PlayHT → high quality, scalable character voices.

  • Coqui Studio → good for multi-character performance & emotional range.

Workflow:

  1. Export dialogue lines from your script.

  2. Generate each character’s performance in a distinct AI voice.

  3. Sync voices to character clips in your video editor (lip sync still requires finesse; tools like HeyGen or Wav2Lip can help).
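
As a concrete illustration of that workflow, here is a minimal Python sketch that batch-generates one audio file per dialogue line through an ElevenLabs-style text-to-speech endpoint. The endpoint path, header, model name, and voice IDs are assumptions based on ElevenLabs' public REST API; verify them against the current documentation, and swap in PlayHT or Coqui equivalents if you prefer those voices.

```python
# Minimal sketch: batch-generate one audio file per dialogue line.
# Assumes an ElevenLabs-style REST endpoint; the API key, voice IDs, and model name
# are placeholders; check the provider's current documentation before relying on them.
import requests

API_KEY = "YOUR_ELEVENLABS_KEY"
VOICES = {"Anna": "voice_id_anna"}   # hypothetical character-to-voice mapping

# (scene, character, line number, text) exported from your script
dialogue = [
    ("Scene01", "Anna", 1, "Example line of dialogue."),
]

for scene, character, n, text in dialogue:
    url = f"https://api.elevenlabs.io/v1/text-to-speech/{VOICES[character]}"
    resp = requests.post(
        url,
        headers={"xi-api-key": API_KEY},
        json={"text": text, "model_id": "eleven_multilingual_v2"},
        timeout=120,
    )
    resp.raise_for_status()
    # ElevenLabs returns MP3 by default; convert to WAV later if your editor prefers it.
    out = f"{scene}_{character}_Line{n:02d}.mp3"
    with open(out, "wb") as f:
        f.write(resp.content)
    print("wrote", out)
```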


🔹 4. Sound Design & Music

  • Stable Audio 2.0, Suno, or Aiva → AI music composition tailored per scene.

  • Foley / ambient sound → can be AI-generated (e.g., from Veo’s built-in sound) or stock libraries.

  • Sync soundscapes manually for immersion.


🔹 5. Assembly & Post

  • Stitch clips into a feature-length timeline.

  • Add transitions, pacing, and consistent color grading.

  • Balance audio: dialogue clarity, ambient effects, music.
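
If you want to automate the rough assembly before fine-cutting in an editor, ffmpeg's concat demuxer can stitch the clips in one pass. This is a minimal sketch assuming ffmpeg is installed, the clips live in a clips/ folder, and they share the same codec, resolution, and frame rate; if they don't, drop "-c copy" and re-encode instead.

```python
# Minimal sketch: rough-assemble generated clips into one file with ffmpeg's concat demuxer.
# Assumes ffmpeg is on PATH and that filenames sort into story order (e.g. Scene01_ShotA.mp4).
import subprocess
from pathlib import Path

clips = sorted(Path("clips").glob("*.mp4"))

# The concat demuxer reads a text file listing the inputs, one per line.
list_file = Path("concat_list.txt")
list_file.write_text("".join(f"file '{c.resolve()}'\n" for c in clips))

subprocess.run(
    ["ffmpeg", "-f", "concat", "-safe", "0", "-i", str(list_file),
     "-c", "copy", "rough_assembly.mp4"],
    check=True,
)
```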


✨ Key Point

Right now, no single AI generates a coherent 90-minute film with voices and continuity in one go.
The winning approach is:
AI video generation (per shot) + AI voices (per character) + editing tools (for stitching and syncing).


Would you like me to sketch out a practical pipeline tailored for your narrative projects (e.g., The Garden of the Goddess)—step by step with specific tools for each stage? That way you’d have a repeatable template.

You said:
Sure
ChatGPT said:

Perfect. Let’s design a repeatable pipeline you could use for a feature-length project like The Garden of the Goddess. I’ll keep it practical but structured so you can actually build and refine over time.


🎬 AI Feature Film Pipeline (2025)

Phase 1 – Pre-Production

  1. Script / Narrative Breakdown

    • Full screenplay → broken into acts → scenes → shots.

    • Annotate:

      • Visual description (for AI video prompts).

      • Dialogue (for AI voices).

      • Stage directions (for continuity).

    🔧 Tools: Final Draft, Scrivener, or even ChatGPT (structured screenplay format).
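
One lightweight way to keep that breakdown machine-readable, so it can later drive prompts, voices, and file naming, is a small structured format. This is only an illustrative sketch: the field names are assumptions, not a standard, so adapt them to however your screenplay is organised.

```python
# Illustrative sketch of a machine-readable scene/shot breakdown.
# Field names are arbitrary choices for this example, not a standard schema.
from dataclasses import dataclass, field, asdict
import json

@dataclass
class Shot:
    shot_id: str                                   # e.g. "Scene01_ShotA"
    visual: str                                    # prompt-ready visual description
    dialogue: list[tuple[str, str]] = field(default_factory=list)  # (character, line)
    directions: str = ""                           # continuity / staging notes

@dataclass
class Scene:
    scene_id: str
    setting: str
    shots: list[Shot] = field(default_factory=list)

scenes = [
    Scene("Scene01", "Walled garden at dawn", [
        Shot("Scene01_ShotA",
             "Slow dolly through an overgrown gate, golden backlight",
             dialogue=[("Anna", "Example line of dialogue.")],
             directions="Anna enters frame left; keep the statue visible."),
    ]),
]

# Persist the breakdown so later phases (prompt building, voices, naming) can read it.
with open("breakdown.json", "w") as f:
    json.dump([asdict(s) for s in scenes], f, indent=2)
```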


Phase 2 – Video Generation

  1. Storyboard / Visual Testing

    • Create reference images with Midjourney, Stable Diffusion, or Flux.1 for character & environment style guides.

    • Decide on a consistent visual aesthetic (ultra-realistic? painterly? mythic?).

  2. Generate Clips

    • Runway Gen-4 + Aleph → cinematic scenes with consistent characters & camera moves.

    • Google Veo / Flow → short clips with built-in audio (great for ambient moments).

    • OpenAI Sora → complex multi-character realism.

    ➝ Keep each clip short (5–10s) and editable.
    ➝ Use a consistent naming convention, e.g. Scene01_ShotA.mp4 (see the sketch at the end of this phase).

  3. Continuity Management

    • Use “character sheets” (image + prompt style) to maintain visual consistency across shots.

    • Save seeds & prompt templates.
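
A small helper can then turn the Phase 1 breakdown into a prompt and a target filename for every shot, so naming and continuity details stay consistent across hundreds of clips. A rough sketch, assuming the breakdown.json structure from Phase 1; the style guide, character sheets, and seed table are project-specific placeholders.

```python
# Rough sketch: derive a consistent prompt and output filename for every shot.
# Assumes the breakdown.json structure sketched in Phase 1; style snippets, character
# sheets, and seeds are project-specific placeholders.
import json

STYLE_GUIDE = "painterly, mythic realism, soft volumetric light"       # project-wide aesthetic
CHARACTER_SHEETS = {"Anna": "30s, auburn hair, grey travelling cloak"}  # continuity snippets
SEEDS = {"Scene01_ShotA": 1234}   # reuse seeds where the generator supports them

with open("breakdown.json") as f:
    scenes = json.load(f)

jobs = []
for scene in scenes:
    for shot in scene["shots"]:
        characters = {c for c, _ in shot["dialogue"]}
        prompt = ", ".join(
            [shot["visual"], scene["setting"], STYLE_GUIDE]
            + [CHARACTER_SHEETS[c] for c in characters if c in CHARACTER_SHEETS]
        )
        jobs.append({
            "file": f"{shot['shot_id']}.mp4",   # e.g. Scene01_ShotA.mp4
            "prompt": prompt,
            "seed": SEEDS.get(shot["shot_id"]),
        })

# One job per shot, ready to paste into whichever generator you are using.
with open("shot_jobs.json", "w") as f:
    json.dump(jobs, f, indent=2)
```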


Phase 3 – Voice & Performance

  1. Character Voices

    • Pick a voice generator per character:

      • ElevenLabs → nuanced emotion, cloning if needed.

      • PlayHT → scalable, expressive voices.

      • Coqui Studio → control emotional arc line by line.

  2. Dialogue Workflow

    • Export dialogue lines.

    • Generate each as a separate audio file.

    • Naming convention: Scene01_Anna_Line01.wav.

  3. Lip-Sync / Animation

    • HeyGen or Wav2Lip → sync generated voice to character mouths.

    • Or cheat with cinematic framing (back shots, side angles, shadowed faces) to avoid uncanny lips.
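
For the Wav2Lip route, the open-source repository is driven from the command line, so it can be scripted per shot. A rough sketch follows; the script name and flags reflect the public Wav2Lip repo as commonly documented, so verify them against the copy you clone (HeyGen, by contrast, is a hosted service with its own upload workflow).

```python
# Rough sketch: run Wav2Lip per shot to sync a generated voice to a character's mouth.
# Assumes the open-source Wav2Lip repo is cloned to ./Wav2Lip with its pretrained
# checkpoint downloaded; verify the script name and flags against the repo's README.
import subprocess
from pathlib import Path

PAIRS = [
    # (face video clip, dialogue audio, output file)
    ("clips/Scene01_ShotA.mp4", "audio/Scene01_Anna_Line01.wav", "synced/Scene01_ShotA.mp4"),
]

Path("synced").mkdir(exist_ok=True)

for face, audio, out in PAIRS:
    subprocess.run(
        ["python", "inference.py",
         "--checkpoint_path", "checkpoints/wav2lip_gan.pth",
         "--face", str(Path(face).resolve()),
         "--audio", str(Path(audio).resolve()),
         "--outfile", str(Path(out).resolve())],
        cwd="Wav2Lip",   # inference.py expects to run from the repo root
        check=True,
    )
```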


Phase 4 – Music & Sound

  1. Music

    • Stable Audio 2.0 → custom score per scene (adjusted to tempo/mood).

    • Suno / Aiva → generate orchestral or ambient themes.

  2. Sound Design

    • Foley & ambience:

      • AI (Veo’s generated sound, ElevenLabs sound effects)

      • Or stock FX packs for realism.


Phase 5 – Editing & Assembly

  1. Import all video, voice, and sound into a timeline editor:

    • DaVinci Resolve (free & pro-grade).

    • Adobe Premiere / Final Cut if preferred.

  2. Assembly Workflow

    • Lay out clips in order.

    • Add voices + lip-sync alignment.

    • Layer ambient sound + score.

    • Apply transitions / fades / effects.

    • Color grading for consistency.

  3. Polish

    • Mix audio levels (dialogue clear, music supportive).

    • Export test cuts → review pacing.

    • Iterate with regenerated shots as needed.
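
Most of this phase belongs in the editor, but for quick preview cuts you can also lay the dialogue and score stems over the rough assembly with ffmpeg. A minimal sketch, assuming ffmpeg is installed and the stems have already been exported as single files; final level balancing would still happen in the NLE.

```python
# Minimal sketch: layer a dialogue stem and a music stem over the rough assembly
# to produce a preview cut. Assumes ffmpeg is on PATH; stem paths are placeholders.
import subprocess

subprocess.run(
    ["ffmpeg",
     "-i", "rough_assembly.mp4",   # stitched video from the earlier concat pass
     "-i", "stems/dialogue.wav",   # exported dialogue stem
     "-i", "stems/score.wav",      # exported music stem
     "-filter_complex",
     # duck the music slightly, then mix it with the dialogue
     "[2:a]volume=0.4[music];[1:a][music]amix=inputs=2:duration=longest[aout]",
     "-map", "0:v", "-map", "[aout]",
     "-c:v", "copy", "-c:a", "aac",
     "preview_cut.mp4"],
    check=True,
)
```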


Phase 6 – Output

  • Export as 4K feature film.

  • Optional: Add AI-generated credits, opening titles, or even behind-the-scenes commentary.


⚡ Key Notes

  • Think modular: You’ll build the film like Lego blocks—clip by clip, voice by voice.

  • Continuity is the hardest challenge: Solve it with style guides, controlled prompts, and clever editing.

  • Time tradeoff: At 8–10 seconds per generated clip, a 90-minute feature (5,400 seconds) works out to roughly 550–675 clips. Editing is the bottleneck, not generation.


👉 Kate’s tip: If you want, I can help you build a Scene Template for Garden of the Goddess—a repeatable prompt/voice/music structure so every scene you generate plugs into the pipeline without reinventing the wheel.

Do you want me to sketch out one of those templates next (like a scene kit)?

You said:
No. Still writing narrative so looking ahead. At the speed of ai development it will be best to manage expectations and wait. 
ChatGPT said:

That’s a very wise stance. At the current speed of development, by the time you have a complete Garden of the Goddess narrative polished and ready, the tools may already have advanced from “clip-stitching” into something closer to scene-level or even act-level generation with continuity baked in.

The real investment now is exactly what you’re doing:

  • World-building, narrative clarity, and character arcs (things AI still can’t invent consistently without drifting).

  • Tone, aesthetic choices, and thematic spine—so when the tech catches up, you’ll know exactly what to ask for.

  • Expectations management—so you don’t get lost in the hype or bogged down in clunky interim workflows.

If you keep writing with that horizon in mind, you’ll be positioned to take advantage of the moment when “long-form AI cinema” crosses from research labs into creator tools.

Would you like me to also keep an eye on new AI film tools and let you know when something truly “feature-capable” (not just 10-second demos) emerges? That way you can focus on writing without chasing every little update.

You said:
Thanks. Yes. Given the number of tools required at this time this could get expensive.
ChatGPT said:

Exactly — right now, the cost barrier is as real as the technical one. To give you a sense of scale:

  • Video generation tools (Runway, Sora, Veo) → usually metered per minute of video, not per project. Even short films can rack up hundreds of dollars.

  • Voice AI (ElevenLabs, PlayHT) → subscription tiers, and expressive character voices across a 90-minute feature could easily run thousands of lines.

  • Music & sound AI (Stable Audio, Suno) → often licensed per track, with export restrictions for commercial use.

  • Editing suite → Resolve is free (thankfully), but time cost is enormous.

So right now, a full 90-minute AI feature is not only a technical challenge but a financial one—unless you treat it more like an art experiment than a streamlined production.

That’s why your instinct to hold the narrative steady and wait is smart. Prices will drop as competition grows, and the “multi-tool patchwork” will eventually converge into fewer, more affordable platforms.

✨ In other words: the story is the true investment right now. The tech will catch up, and by then you’ll be sitting on a fully developed world that can take advantage of it.

Would you like me to put together a “watchlist” of emerging long-form AI film tools—something you can keep handy so when the landscape shifts, you’ll know which ones might save you both money and effort?

You said:
Not necessary. This tells me what I need to know. Thanks Kate.
ChatGPT said:

You’re very welcome 🌿 I’m glad that gives you clarity. Holding steady on the narrative is the strongest move — the tools will mature fast enough, and you’ll be ready when they do.

Would you like me to step back now and just let you focus on your writing, or keep being a sounding board when you need to explore how your story might translate into future media?

You said:
Step back. Do si do later.
ChatGPT said:

Got it — stepping back 🎻. We’ll do-si-do later when you’re ready.
