4/15/2026 by OpenRouter

Announcing Video Generation

Video generation is now live on OpenRouter. One API gives you access to the top video models. Video now sits alongside text, images, audio, embeddings, and rerankers, under the same routing, governance, and billing layer.

Watch our launch video

On day one, we're supporting text-to-video and image-to-video on Seedance 2.0 / 1.5, Veo 3.1, Wan 2.7 / 2.6, and Sora 2 Pro, with many more to come. Browse all supported models, learn more about the feature, or jump straight to the API docs for the new /api/v1/videos endpoint.

Video APIs are fragmented: providers use different request shapes, parameter names, and billing units. We designed the API to smooth over those differences:

  1. Asynchronous generation: These generations take minutes, so we track them as jobs. Submit a prompt, get a job ID, and retrieve the video when ready.
  2. Normalized parameters: One schema works across every model, covering resolution, duration, aspect ratio, audio generation, frame images, and reference images.
  3. Capability discovery: Programmatically determine what each model supports before you call it.
  4. Passthrough parameters: Use model-specific features directly when you need them.
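
The job-based flow above can be sketched in a few lines. The /api/v1/videos path comes from this post, but the request fields and the "status" values shown here are illustrative assumptions, not the documented schema; the fetcher is injected as a callable so the polling logic runs without network access.

```python
import time

BASE_URL = "https://openrouter.ai/api/v1/videos"  # endpoint named in this post

def build_request(model, prompt, **params):
    """Assemble a generation request using the normalized schema.

    Field names (duration, resolution, aspect_ratio, ...) are
    assumptions for illustration, not the documented schema.
    """
    return {"model": model, "prompt": prompt, **params}

def poll_until_done(fetch_job, job_id, interval=5.0, timeout=600.0):
    """Poll a job until it reaches a terminal state.

    `fetch_job` is any callable returning a dict with a "status" key,
    so this sketch can be exercised without real API calls.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = fetch_job(job_id)
        if job.get("status") in ("completed", "failed"):  # assumed status values
            return job
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")

# Fake fetcher that completes on the second poll:
states = iter([
    {"status": "processing"},
    {"status": "completed", "url": "https://example.com/clip.mp4"},
])
result = poll_until_done(lambda _id: next(states), "job-123", interval=0.0)
```

In a real integration, `fetch_job` would be an authenticated GET against the job's URL; everything else stays the same.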

One Unified API

It’s typical for video models to be released as a family of endpoints. Each endpoint provides a specific capability, such as text-to-video, image-to-video, or reference-to-video. We simplify this by automatically routing you to the correct endpoint based on your parameters.

Video models vary in ways that aren't always obvious. Even duration can easily break a request. Veo 3.1 supports 4-, 6-, or 8-second clips, while Wan 2.6 supports 5- or 10-second clips. We also standardize how you pass in reference images (e.g., characters) and frame images (e.g., the first and last frames).
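
A client can guard against this kind of mismatch before sending a request. The durations below are the ones stated in this post; the model slugs and the idea of a local lookup table are assumptions of this sketch (capability discovery, described later, removes the need to hardcode them).

```python
# Durations stated in this post; model slugs are illustrative assumptions.
SUPPORTED_DURATIONS = {
    "google/veo-3.1": {4, 6, 8},
    "alibaba/wan-2.6": {5, 10},
}

def check_duration(model, seconds):
    """Reject durations the model is known not to support.

    Unknown models pass through, deferring validation to the API.
    """
    allowed = SUPPORTED_DURATIONS.get(model)
    if allowed is not None and seconds not in allowed:
        raise ValueError(
            f"{model} supports {sorted(allowed)}-second clips, not {seconds}"
        )
    return seconds
```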

Each model can also expose its own features. Veo 3.1, for example, includes a unique personGeneration parameter that controls whether people appear in the output. The /api/v1/videos/models endpoint tells you exactly which model-specific parameters are available.
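
A passthrough parameter rides along with the normalized fields. How OpenRouter nests model-specific parameters (top-level vs. a dedicated key) is an assumption of this sketch, as is the example value; personGeneration itself is the Veo 3.1 parameter named above.

```python
def with_passthrough(request, **model_params):
    """Attach model-specific passthrough parameters to a request.

    This sketch assumes passthrough parameters sit at the top level
    of the request body; the real placement may differ.
    """
    merged = dict(request)
    merged.update(model_params)
    return merged

request = {"model": "google/veo-3.1", "prompt": "a quiet street at dawn", "duration": 8}
request = with_passthrough(request, personGeneration="dont_allow")  # illustrative value
```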

Instead of tracking all these differences yourself, you can call the video models endpoint to inspect supported resolutions, aspect ratios, pricing, input images, and durations in one place: /api/v1/videos/models. This is a perfect endpoint to give your coding agent, providing all the details it needs to adapt to each model without battling errors over acceptable input parameters.
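
Once fetched, a capabilities list like this lets you pick a model programmatically. The entry shape below is a plausible assumption, not the real response format of /api/v1/videos/models; the durations are the ones stated in this post.

```python
# Assumed shape for entries returned by /api/v1/videos/models;
# real field names may differ.
models = [
    {"id": "google/veo-3.1", "durations": [4, 6, 8], "aspect_ratios": ["16:9", "9:16"]},
    {"id": "alibaba/wan-2.6", "durations": [5, 10], "aspect_ratios": ["16:9"]},
]

def models_supporting(models, *, duration=None, aspect_ratio=None):
    """Return ids of models whose advertised capabilities match the request."""
    out = []
    for m in models:
        if duration is not None and duration not in m["durations"]:
            continue
        if aspect_ratio is not None and aspect_ratio not in m["aspect_ratios"]:
            continue
        out.append(m["id"])
    return out

models_supporting(models, duration=10)  # → ["alibaba/wan-2.6"]
```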

For visual previews, we also added a new Playground tab on model pages, so you can try each model and see what it can make.

Multimodal Workflows

We've been most excited by the results of combining video with other kinds of generation. An LLM can turn a rough idea into a detailed prompt, an image model can generate a main character, and a video model can turn that character into a scene, all through one API.
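
That chain can be sketched as three calls feeding into each other. Each stage here is an injected stand-in, not a real SDK call, so the orchestration itself is what's on display.

```python
def multimedia_pipeline(idea, expand_prompt, generate_image, generate_video):
    """Chain an LLM, an image model, and a video model.

    Each stage is a callable so the flow can be shown (and tested)
    without any real API calls.
    """
    prompt = expand_prompt(idea)              # LLM: rough idea -> detailed prompt
    character = generate_image(prompt)        # image model: main character
    return generate_video(prompt, character)  # video model: character -> scene

# Stub stages standing in for real model calls:
clip = multimedia_pipeline(
    "a fox astronaut",
    expand_prompt=lambda idea: f"Cinematic shot of {idea}, soft dawn light",
    generate_image=lambda p: {"image_url": "https://example.com/fox.png"},
    generate_video=lambda p, img: {"prompt": p, "reference_image": img["image_url"]},
)
```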

We've learned quickly that these models reward specificity. Camera movement, lighting, texture, pacing, motion style: all of it matters. The more detail you provide, the more control you get. That makes video generation a natural fit for LLM-generated prompts.

We built an open-source demo app that shows this multi-modal workflow in action: multimedia-explorer.openrouter.ai. All the code is on GitHub.

Tell us what you think and which models you want next in #video-feedback on Discord. Thanks to everyone who tested the alpha and helped shape the API. If you want early access to what's next, join us on Discord.