This is how we go from images to somewhat depth understanding using neural networks.

Monocular Depth Estimation

The paper that started it all ( 11 years ago from today 2025 )
It was released in 2014
Pasted image 20250419140727.png
Pasted image 20250419140800.png

No one does this anymore, but it was really good as one of the starter DL papers on inverse graphics.

The limitations were there because the ground truth data wasn't that great, edge estimation wasn't great, black holes etc.

To train NN , we need a lot of data, and some data is not scaled.

To tackle this we make scale invariant loss functions for our NN to train on.

From "Towards Robust Monocular Depth Estimation: Mixing Datasets
for Zero-Shot Cross-Dataset Transfer, Ranftl et al. 2022 "
Pasted image 20250419152623.png
Pasted image 20250419152749.png