AI for better Camera Tracking in Blender? (NOT Image Generation)

Heh… Please, before anyone starts trashing me on this post:

I am NOT talking about Image Generation like Midjourney, Sora, etc… NO

Like most artists, I don’t like the problematic copyright side of AI…

.__________________

What I do consider interesting is this:

AI seems to be getting better and better at reading pictures.

Could that be used to improve annoying tasks like Camera Tracking?

No more loong hours trying to get the perfect Track,

Just one click and your video tracking is done…

.

Blender now has a “Pseudo” Auto Tracker,
but the result always comes out full of errors that you have to manually and slooowly refine.

.

Would be nice if AI could automate tasks nobody likes, like

Camera Tracking, UV Unwrapping, Rotoscoping, Retopology, etc.

.

Anyway… just some ideas

Better go tell that to AI developers. No point in daydreaming about that here.

You mean in general, or specifically for Blender?

NVidia has been using AI for this (or for everything) to “boost” the appeal of their cards for… ages?

Just one example:

And (of course)

with or for Blender:

But there are also some specialised addons for camera tracking that make it easier: BlenderMarket search: Camera tracking

Reliable AI camera tracking is coming. Wonder Dynamics has been making some meaningful strides there. I think when it arrives, it will likely be a separate app that can ingest footage and then spit out FBX or Alembic files for easy import into most platforms.

Considering the Blender devs have given very little time to the Camera tracker in years, I doubt they would invest in researching this.

Wonder Dynamics has been acquired by Autodesk.

The article is in German.

Wonder Dynamics’ approach
could be an interesting case study for the Blender devs,
in order to improve the Camera Tracking automation :thinking:

I’ve been wondering about this as well. I don’t know how the tracking algorithms work, but there seems to be a way to calculate a ‘cost’ (pixel accuracy), so a neural network could be trained to minimize it…

It’s actually pretty simple:
The result of the camera tracking is a bunch of points in 3D space (each corresponding to a marker in the footage).
Intuitively, said cost (formally the reprojection error) is the image-space distance between a given marker and the result of ‘viewing’ the corresponding 3D point through the reconstructed camera.
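For instance, a minimal sketch of that error for a single marker could look like this (hypothetical NumPy values and a simple pinhole camera, just to illustrate the idea; this is not Blender’s actual solver code):

```python
import numpy as np

def project(point_3d, K, R, t):
    """Project a 3D point into image space with a simple pinhole camera.

    K: 3x3 intrinsics (focal length, principal point)
    R, t: camera rotation (3x3) and translation (3,), world -> camera
    """
    p_cam = R @ point_3d + t        # world space -> camera space
    p_img = K @ p_cam               # camera space -> homogeneous image coords
    return p_img[:2] / p_img[2]     # perspective divide -> pixel position

# Made-up example values (hypothetical camera and marker):
K = np.array([[1000.0,    0.0, 960.0],
              [   0.0, 1000.0, 540.0],
              [   0.0,    0.0,   1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 5.0])

point_3d  = np.array([0.1, -0.2, 0.0])   # reconstructed 3D point
marker_2d = np.array([982.0, 498.0])     # where the tracker actually saw it

# Reprojection error = image-space distance between the marker and the
# reprojected 3D point.
error = np.linalg.norm(project(point_3d, K, R, t) - marker_2d)
print(f"reprojection error: {error:.2f} px")   # ~2.83 px here
```

Blender’s solver then adjusts the camera and the 3D points to minimize the sum of these errors over all markers and frames (bundle adjustment), so that sum is the natural ‘cost’ a learned approach would try to minimize or speed up.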

greetings, Kologe
