Depth and Camera Pose

Projection (π) and Backprojection (π⁻¹)

Step 1: Backprojection (π⁻¹)

Given a pixel (u, v) in image Iᵢ and its depth d, we compute its 3D coordinates in the camera frame of Iᵢ:

$$P_i = \pi^{-1}(u, v, d)$$
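As a minimal sketch of this step, the snippet below backprojects one pixel under a pinhole camera model. The function name `backproject`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the sample values are illustrative assumptions, as is the convention that the depth d is the z-coordinate of the point in the camera frame.

```python
import numpy as np

def backproject(u, v, d, fx, fy, cx, cy):
    """Lift pixel (u, v) with depth d to a 3D point in the camera frame.

    Assumes a pinhole camera and that d is the z-coordinate of the point.
    """
    x = (u - cx) / fx * d
    y = (v - cy) / fy * d
    return np.array([x, y, d], dtype=float)

# Illustrative values: a pixel at the principal point with unit depth
# lies on the optical axis, one unit in front of the camera.
P_i = backproject(320, 320, 1.0, fx=500.0, fy=500.0, cx=320.0, cy=320.0)
```

A pixel away from the principal point picks up a lateral offset proportional to its depth, which is why the same pixel displacement corresponds to a larger 3D displacement for distant points.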

Step 2: Transform from Frame i to Frame j (Tᵢⱼ)

The 3D point Pᵢ is transformed to the camera frame of Iⱼ using:

$$T_{ij} = (T_j^w)^{-1} \, T_i^w$$

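where Tᵢʷ and Tⱼʷ are the camera-to-world poses of frames i and j (4×4 rigid transforms), so that Tᵢⱼ maps points from the camera frame of Iᵢ to the camera frame of Iⱼ.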

The transformed point is:

$$P_j = T_{ij} P_i$$
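The relative transform can be sketched with 4×4 homogeneous matrices. This assumes the two poses are camera-to-world transforms stored as NumPy arrays; the helper names `relative_pose` and `transform_point` and the sample poses are hypothetical.

```python
import numpy as np

def relative_pose(T_i_w, T_j_w):
    """Return T_ij = inv(T_j^w) @ T_i^w, mapping frame-i coordinates to frame j."""
    return np.linalg.inv(T_j_w) @ T_i_w

def transform_point(T, P):
    """Apply a 4x4 rigid transform T to a 3D point P."""
    P_h = np.append(P, 1.0)   # homogeneous coordinates [x, y, z, 1]
    return (T @ P_h)[:3]

# Hypothetical poses: camera i at the world origin, camera j shifted
# 0.1 units along the world x-axis, with no rotation.
T_i_w = np.eye(4)
T_j_w = np.eye(4)
T_j_w[0, 3] = 0.1

T_ij = relative_pose(T_i_w, T_j_w)
P_j = transform_point(T_ij, np.array([0.0, 0.0, 1.0]))
```

Moving camera j to the right makes the point appear further to the left in frame j, so the translation part of `T_ij` has the opposite sign from the camera motion.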

Step 3: Projection (π)

Project Pⱼ onto the image plane of Iⱼ:

$$(u', v') = \pi(P_j)$$

For a pinhole camera:

$$u' = f_x \frac{P_{j,x}}{P_{j,z}} + c_x, \qquad v' = f_y \frac{P_{j,y}}{P_{j,z}} + c_y$$
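The pinhole projection can be written directly from the formula above. In this sketch the function name `project` and the check for points at or behind the camera are additions for illustration and are not part of the original formulation.

```python
import numpy as np

def project(P, fx, fy, cx, cy):
    """Project a 3D point P = (x, y, z) in the camera frame to pixel coordinates."""
    x, y, z = P
    if z <= 0:
        # Assumption: points at or behind the camera plane are rejected.
        raise ValueError("point must lie in front of the camera (z > 0)")
    u = fx * x / z + cx
    v = fy * y / z + cy
    return np.array([u, v])

# Illustrative values: a point 0.1 units left of the optical axis at unit depth.
uv = project(np.array([-0.1, 0.0, 1.0]), fx=500.0, fy=500.0, cx=320.0, cy=320.0)
```

The division by z is what makes nearby points move more in the image than distant ones for the same camera motion.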

Step 4: Compute Optical Flow

The flow vector for pixel (u, v) is:

$$\hat{f}(u, v) = (u' - u, \; v' - v)$$

Intuition

Example

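Given: a pixel (u, v) = (320, 320) with depth d = 1, a pinhole camera with fx = fy = 500 and cx = cy = 320, and the relative pose Tᵢⱼ shown in Step 2.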

Step 1: Backprojection

$$P_i = \pi^{-1}(320, 320, 1) = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$$

Step 2: Transform to Frame j

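$$T_{ij} = (T_j^w)^{-1} \, T_i^w = \begin{bmatrix} 1 & 0 & 0 & -0.1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

$$P_j = T_{ij} P_i = \begin{bmatrix} -0.1 \\ 0 \\ 1 \end{bmatrix}$$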

Step 3: Projection

$$u' = 500 \cdot \frac{-0.1}{1} + 320 = 270$$

$$v' = 500 \cdot \frac{0}{1} + 320 = 320$$

Step 4: Flow Vector

$$\hat{f}(320, 320) = (270 - 320, \; 320 - 320) = (-50, 0)$$
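The four steps of this example can be checked numerically. The sketch below uses the values from the example (fx = fy = 500, cx = cy = 320, depth 1, and a relative pose that translates by -0.1 along x); the function name `flow_for_pixel` is a hypothetical helper introduced here.

```python
import numpy as np

# Intrinsics used in the worked example.
fx = fy = 500.0
cx = cy = 320.0

def flow_for_pixel(u, v, d, T_ij):
    """Compute the flow induced at pixel (u, v) with depth d by relative pose T_ij."""
    # Step 1: backproject to a homogeneous 3D point in frame i.
    P_i = np.array([(u - cx) / fx * d, (v - cy) / fy * d, d, 1.0])
    # Step 2: transform into frame j.
    P_j = T_ij @ P_i
    # Step 3: project onto the image plane of frame j.
    u2 = fx * P_j[0] / P_j[2] + cx
    v2 = fy * P_j[1] / P_j[2] + cy
    # Step 4: flow is the displacement between the two pixel positions.
    return np.array([u2 - u, v2 - v])

# Relative pose from the example: pure translation of -0.1 along x.
T_ij = np.eye(4)
T_ij[0, 3] = -0.1

flow = flow_for_pixel(320.0, 320.0, 1.0, T_ij)
```

Up to floating-point rounding, `flow` should agree with the hand-computed vector (-50, 0).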

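Interpretation: the point seen at pixel (320, 320) in Iᵢ appears at (270, 320) in Iⱼ, i.e. it shifts 50 pixels in the negative u direction with no vertical motion.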

Dense Flow Field

Instead of computing flow for one pixel, we compute it for all pixels (u, v) in a grid (h × w), given a dense depth map d. This gives a flow field:

$$h_{ij}(T_i^w, T_j^w, d_i) = \pi\left(T_{ij} \, \pi^{-1}(u, v, d_i)\right) - \begin{bmatrix} u \\ v \end{bmatrix}$$
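A vectorized version of the flow field might look like the following. This is a sketch under stated assumptions: a pinhole camera, camera-to-world poses as 4×4 NumPy arrays, a depth map of shape (h, w) holding z-depths, and pixel coordinates with u along columns and v along rows. The function name `dense_flow` is hypothetical, and invalid or zero depths are not handled.

```python
import numpy as np

def dense_flow(depth, T_i_w, T_j_w, fx, fy, cx, cy):
    """Return an (h, w, 2) flow field from frame i to frame j."""
    h, w = depth.shape
    # Pixel grid: u indexes columns, v indexes rows; both have shape (h, w).
    u, v = np.meshgrid(np.arange(w, dtype=float), np.arange(h, dtype=float))

    # Backproject every pixel to a homogeneous 3D point in frame i: (h, w, 4).
    X = (u - cx) / fx * depth
    Y = (v - cy) / fy * depth
    P_i = np.stack([X, Y, depth, np.ones_like(depth)], axis=-1)

    # Transform all points into frame j (row-vector form of T_ij @ P).
    T_ij = np.linalg.inv(T_j_w) @ T_i_w
    P_j = P_i @ T_ij.T

    # Project into frame j and subtract the original pixel coordinates.
    u2 = fx * P_j[..., 0] / P_j[..., 2] + cx
    v2 = fy * P_j[..., 1] / P_j[..., 2] + cy
    return np.stack([u2 - u, v2 - v], axis=-1)

# Hypothetical setup: constant unit depth, camera j shifted 0.1 along x.
depth = np.ones((4, 6))
T_i_w = np.eye(4)
T_j_w = np.eye(4)
T_j_w[0, 3] = 0.1
flow = dense_flow(depth, T_i_w, T_j_w, fx=500.0, fy=500.0, cx=320.0, cy=320.0)
```

With constant depth and a pure sideways translation, every pixel receives the same flow vector; varying depth would make nearer pixels move more than distant ones.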

Why is this useful?