Camera Tracking : depth estimation

Zao29 · December 17, 2020, 9:53am

Hello, I used camera tracking to solve the camera motion from video footage and add a 3D object to it, and that works well. However, for a smoother augmented-reality result, I would like to integrate my 3D scene using a depth estimate for each pixel of my video (to know whether my 3D object is behind or in front of the elements in my video). Is there a way to do that in Blender? I know the floor trick, but it only works with the floor… For instance, what if I would like to add a car to my video that passes through a forest in the footage (behind some trees, in front of others)?

Zao29 · December 18, 2020, 2:01pm

Please, any ideas?

sundialsvc4 · December 19, 2020, 5:38pm

In “making of” videos, I usually see people walking around with 100’ tape measures.

If you can’t go back to the actual location where the video was shot, try to re-create “something that looks like it,” then actually measure it. Adapt this to whatever is the scale of your model.

What is the bounding box of that real-world car, and how many real-world feet away from the camera is it as it drives through that scene? The things in front of it: exactly how big are they, and exactly how far away? The more accurately you capture this information and then incorporate it into your modeling, lighting and cinematography, the more satisfying and believable the final results will be. Keep very accurate records with pencil and paper.

You then “do the scaling math” and enter these (X,Y,Z) coordinates and object dimensions, as numbers.
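As a rough sketch of that “scaling math”: the snippet below converts tape-measure readings in feet into scene coordinates. It assumes a Blender scene where 1 unit equals 1 metre (the default unit scale) and an origin and axis convention of your own choosing (here X to the right, Y forward, Z up). The function names and the sample numbers are hypothetical, for illustration only.

```python
FEET_TO_METRES = 0.3048  # exact by definition of the international foot


def feet_to_scene_units(feet, scene_units_per_metre=1.0):
    """Convert a real-world length in feet to scene units.

    scene_units_per_metre is an assumption about your scene setup:
    1.0 means one scene unit represents one metre.
    """
    return feet * FEET_TO_METRES * scene_units_per_metre


def position_from_measurements(right_ft, forward_ft, up_ft,
                               scene_units_per_metre=1.0):
    """Turn three measured offsets (in feet) from a chosen origin into (X, Y, Z)."""
    return tuple(
        feet_to_scene_units(value, scene_units_per_metre)
        for value in (right_ft, forward_ft, up_ft)
    )


# Hypothetical example: a tree measured 10 ft to the right of the origin,
# 40 ft in front of it, standing on the ground.
tree_xyz = position_from_measurements(10.0, 40.0, 0.0)
print(tree_xyz)
```

The resulting numbers are what you would type into the object’s location fields, provided the tracked camera has been scaled to the same real-world units.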

You can’t reliably “estimate depth” from a two-dimensional video sequence.

Walker · December 19, 2020, 10:21pm

Possibly some useful links and videos:

And they can do it with a single image!!

https://learn.foundry.com/nuke/content/reference_guide/3d_nodes/depthgenerator.html

The closest I have actually come to what you want is when I used this addon here to export a motion-tracked camera from Blender into a photogrammetry application, along with a few still frames from the footage. I then generated a textured mesh of the scene which, when imported back into Blender, was perfectly aligned and was technically good enough to have been used as a holdout/mask.

Zao29 · December 21, 2020, 9:31am

Thank you for your reply! So much information ^^. I have already tried some “Python CNN monocular depth estimators”, but they didn’t seem very accurate in my case. I will try the Blender addon!

Zao29 · December 21, 2020, 9:39am

If you look at the link below, you can see that it is possible to “estimate” depth even from a single image. It’s a class of algorithm called a “convolutional neural network”: basically, you train the algorithm on a set of images for which you know the depth map, and when you give it an image that “looks like” the images it learned from, it gives you the depth map of that photo. But thanks for your reply.
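Once a depth map of the footage is available, the in-front/behind decision comes down to a per-pixel comparison. The sketch below is only an illustration in plain Python: it assumes the footage depth map and the rendered object’s depth are both expressed as distance from the camera in the same units (monocular estimators often return only relative depth, so a rescaling step may be needed first). The function name and the tiny sample arrays are hypothetical.

```python
def occlusion_mask(footage_depth, object_depth):
    """Return a mask that is 1.0 where the CG object should be visible.

    footage_depth: 2D list of estimated distances for each footage pixel.
    object_depth:  2D list of the rendered object's distances, with None
                   where the object does not cover the pixel.
    Both are assumed to use the same units and the same resolution.
    """
    mask = []
    for footage_row, object_row in zip(footage_depth, object_depth):
        mask_row = []
        for footage_d, object_d in zip(footage_row, object_row):
            # The object shows only where it exists and is closer to the
            # camera than whatever the footage contains at that pixel.
            visible = object_d is not None and object_d < footage_d
            mask_row.append(1.0 if visible else 0.0)
        mask.append(mask_row)
    return mask


# Hypothetical 2x4 frame: trees at 5 and 8 units, background at 30 units.
footage_depth = [
    [5.0, 30.0, 30.0, 8.0],
    [5.0, 30.0, 30.0, 8.0],
]
# The car covers only the bottom row, at 12 units from the camera.
object_depth = [
    [None, None, None, None],
    [12.0, 12.0, 12.0, 12.0],
]

mask = occlusion_mask(footage_depth, object_depth)
print(mask)
```

In this example the car is hidden behind the two near trees and visible against the distant background. The mask plays the same role as the holdout mentioned earlier in the thread, and its quality depends entirely on how accurate the estimated depth map is.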