Stable Diffusion’s Generative AI Video Maker Has Enormous Potential

November 29, 2023


Stable Diffusion has already attained plenty of fame and popularity with its AI image-generating tool. Now it also tentatively offers generative AI video.

In case you haven’t already heard about it, the new product is imaginatively called Stable Video Diffusion (SVD) and it comes in two different models.

If this seems odd, bear in mind that Stable Diffusion has also created a whole range of AI image generator models for different uses.

With SVD, users can generate video at frame rates of 3 to 30 frames per second, though the model is limited to just 14 frames per input image. The more advanced SVD-XT model, on the other hand, can deliver up to 25 frames of video from a single still image at a resolution of 576×1024.
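To put those limits in perspective, clip length is simply the frame count divided by the frame rate. The short Python sketch below works through that arithmetic with the figures quoted above. The function name is made up for illustration, and applying the 3 to 30 fps range to both models is an assumption rather than something confirmed by Stability's documentation.

```python
def clip_duration_seconds(num_frames: int, fps: float) -> float:
    """Return clip length in seconds: frame count divided by frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


# Frame limits quoted in the article for each model.
MODEL_FRAMES = {"SVD": 14, "SVD-XT": 25}

# Frame-rate range quoted in the article (assumed here to apply to both models).
MIN_FPS, MAX_FPS = 3, 30

for name, frames in MODEL_FRAMES.items():
    shortest = clip_duration_seconds(frames, MAX_FPS)
    longest = clip_duration_seconds(frames, MIN_FPS)
    print(f"{name}: {shortest:.2f}s at {MAX_FPS} fps, {longest:.2f}s at {MIN_FPS} fps")
```

Under these assumptions, even the 25-frame model yields under a second of footage at 30 fps and a little over eight seconds at 3 fps, which is why the output is best thought of as a short animated clip rather than a video in the usual sense.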

Overall, these performance metrics might seem just a bit underwhelming, but remember that the technology is just getting off the ground and will almost certainly advance rapidly given time.

After all, AI image rendering was an absolute mishmash of warped creations just a couple of years ago, and we all know where it sits now, with results capable of almost perfectly emulating real photos and human-drawn works of art.

Also, the current Stable Diffusion AI video models are available only for research purposes, with potential users having to contact Stability AI for waitlist access.

This, too, will likely change as the technology is refined for commercial use. Stability hasn’t said so explicitly, but it’s the obvious outcome for the company to aim for.

With that said, the technology is impressive even if it has lots of room for improvement. Even Stability AI itself admits its weaknesses: “The generated videos are rather short (four seconds), and the model does not achieve perfect photorealism.”

Stability also adds other caveats:

“The model may generate videos without motion, or very slow camera pans. The model cannot be controlled through text. The model cannot render legible text. Faces and people in general may not be generated properly.”

However, Stability does optimistically mention, “This state-of-the-art generative AI video model represents a significant step in our journey toward creating models for everyone of every type.”

Personally, I wouldn’t scoff at it. It would be easy to see this technology used for full-blown short movies and cartoons in just a couple of years, or sooner.

Stable Diffusion Video AI generator

As for the source of the data used to train the SVD models, it’s an issue as sensitive as that of the training data for the platform’s AI image rendering technology.

For at least part of the image creation tool, Stability used the LAION dataset, and this ended up leading to a lawsuit from Getty Images.

Stability has been tight-lipped about the video AI training data but did claim that it’s based on “publicly available” sources. This could of course mean all kinds of things that the courts will later resolve in all kinds of lawyer-approved ways.

For now, the market for AI video is tiny in practice simply because the technology is so limited. Stability’s new release really does present a major new development.

On the other hand, once it becomes robust, generative AI video technology could completely outclass AI image rendering in its potential for all kinds of uses, both positive and negative.