## Background

RealityCapture supports several camera models, and you can specify which one is used once all cameras are registered (aligned). This is referred to as the final model. By default it is set to Brown3, a 3-parameter polynomial Brown model without tangential distortion. You can change this setting in Alignment settings -> Advanced -> Distortion model.

Each distortion model defines how points in 3D space are projected into 2D image space, and RealityCapture can convert between these models. The selected Distortion model (see figure above) determines the camera model and the corresponding projection equation.

## Camera Coordinate system

All models share a common first step: transforming a 3D point into the camera coordinate system.

Let *X* be a homogeneous 3D point, *X = [Xx; Xy; Xz; Xw]*, *R* be a 3x3 rotation (orthogonal) matrix and *t* be a 3x1 camera translation vector. Then *x* is a 3x1 vector *[u; v; w]* given by

x = [ R t ] * X.

If *Xw = 1*, then *x* represents the point *X* in the camera coordinate frame.

In the RealityCapture XMP file, the rotation matrix and the position component are stored as:

<xcr:Rotation>-0.600806990019897 0.799386597570746 -0.00346819369376912 -0.210825371930633 -0.162635135620312 -0.963899618846316 -0.771092486861039 -0.578386445454965 0.266243303052733</xcr:Rotation> <xcr:Position>2111.44219951044 1607.86624656544 2302.25896526736</xcr:Position>.

Here the rotation is stored row-wise, i.e. (rows separated by semicolons),

R = [ -0.600806990019897 0.799386597570746 -0.00346819369376912; -0.210825371930633 -0.162635135620312 -0.963899618846316; -0.771092486861039 -0.578386445454965 0.266243303052733 ]

and the position is

Position = [ 2111.44219951044; 1607.86624656544; 2302.25896526736 ].

The XMP file stores the position component, not *t*. The position is the point that maps to the origin of the camera coordinate frame, so *t* follows from these equations:

[ R t ] * [Position; 1] = [0; 0; 0],

R * Position + t = 0,

and therefore:

t = -R * Position.
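As a rough illustration of the steps above, the following Python sketch reads the two XMP elements shown earlier, rebuilds *R* row-wise and derives *t*. The helper names (`read_floats`, `camera_pose`, `world_to_camera`) are hypothetical and not part of RealityCapture; the regular-expression parsing assumes the elements appear exactly in the form shown in this article.

```python
import re

# The XMP fragment quoted in this article.
XMP_SNIPPET = (
    "<xcr:Rotation>-0.600806990019897 0.799386597570746 -0.00346819369376912 "
    "-0.210825371930633 -0.162635135620312 -0.963899618846316 "
    "-0.771092486861039 -0.578386445454965 0.266243303052733</xcr:Rotation> "
    "<xcr:Position>2111.44219951044 1607.86624656544 2302.25896526736</xcr:Position>"
)


def read_floats(xmp_text, tag):
    """Return the whitespace-separated numbers inside <xcr:tag>...</xcr:tag>."""
    match = re.search(r"<xcr:%s>(.*?)</xcr:%s>" % (tag, tag), xmp_text, re.S)
    if match is None:
        raise ValueError("element xcr:%s not found" % tag)
    return [float(value) for value in match.group(1).split()]


def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]


def camera_pose(xmp_text):
    """Return (R, t, position) with R rebuilt row-wise and t = -R * Position."""
    r = read_floats(xmp_text, "Rotation")
    position = read_floats(xmp_text, "Position")
    R = [r[0:3], r[3:6], r[6:9]]
    t = [-c for c in mat_vec(R, position)]
    return R, t, position


def world_to_camera(R, t, X):
    """x = [R t] * [X; 1] for a 3D point X with Xw = 1."""
    return [rx + ti for rx, ti in zip(mat_vec(R, X), t)]


R, t, position = camera_pose(XMP_SNIPPET)
# The stored position should map to the origin of the camera frame.
origin = world_to_camera(R, t, position)
```

A quick sanity check is that `origin` comes out as (numerically) zero, which is just the equation R * Position + t = 0 evaluated on the example data.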

## General Projection Equation

The projection of *X* into the image is defined as:

m = DistortionModel( x ),

where *m* is a 3x1 vector with *m = [x; y; depth]* and DistortionModel is an arbitrary function. The image coordinates *x* and *y* of *m* (not to be confused with the camera-frame point *x*) are always in the range <-0.5, 0.5>.

The following cases are considered:

1) calibration is applied after the lens distortion, i.e., DistortionModel( x ) = K * Distort( x ),

2) calibration is applied prior to the lens distortion, i.e., DistortionModel( x ) = Distort( K * x ),

where *K* is an upper triangular 3x3 calibration matrix (rows separated by semicolons)

K = [ focal skew ppU; 0 aspect*focal ppV; 0 0 1 ].

Entries of this matrix are stored in XMP as follows:

1. Principal point coordinates

ppU = xcr:PrincipalPointU="0.00621063808526977",

ppV = xcr:PrincipalPointV="-0.0214264554930412",

2. Focal length, which XMP stores relative to the 35 mm film format

xcr:FocalLength35mm="82.2539160239028",

focal = xcr:FocalLength35mm * w/36,

where *w* is the sensor width.

3. The camera skew, skew = xcr:Skew="0",

4. Aspect ratio, aspect = xcr:AspectRatio="1".
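The list above can be turned into a calibration matrix along these lines. This is only a sketch: `calibration_matrix` is a hypothetical helper, and the sensor width of 36.0 used in the example is an illustrative placeholder, not a value taken from the XMP file.

```python
def calibration_matrix(focal_length_35mm, sensor_width,
                       principal_point_u, principal_point_v,
                       skew=0.0, aspect_ratio=1.0):
    """Build the upper triangular 3x3 matrix K as a list of rows."""
    # focal = xcr:FocalLength35mm * w / 36
    focal = focal_length_35mm * sensor_width / 36.0
    return [
        [focal, skew, principal_point_u],
        [0.0, aspect_ratio * focal, principal_point_v],
        [0.0, 0.0, 1.0],
    ]


# Attribute values copied from the XMP examples in this article;
# sensor_width=36.0 is an assumed placeholder.
K = calibration_matrix(
    focal_length_35mm=82.2539160239028,
    sensor_width=36.0,
    principal_point_u=0.00621063808526977,
    principal_point_v=-0.0214264554930412,
    skew=0.0,
    aspect_ratio=1.0,
)
```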

## Lens Distortion Models

### Division Model

The basic equation for this model is as follows:

m = x + [0;0;k*r],

r = sqrt( ||x - [0;0;1]||^2 ),

where *k* is the distortion parameter. Note that *x* is assumed to be de-homogenized, i.e., *w = 1*.
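The basic equation can be sketched in Python as follows. This implements only the formula as written above, in camera coordinates; it is not the image-space variant used inside RealityCapture. The function names and the value of `k` in the example are illustrative assumptions.

```python
import math


def division_model(x, k):
    """Evaluate m = x + [0; 0; k*r] for a point x = [u; v; w]."""
    u, v, w = x
    if w == 0:
        raise ValueError("point cannot be de-homogenized (w = 0)")
    # De-homogenize so that w = 1, as the equation assumes.
    u, v = u / w, v / w
    # r = ||x - [0; 0; 1]||, i.e. the distance from the optical axis.
    r = math.sqrt(u * u + v * v)
    return [u, v, 1.0 + k * r]


def dehomogenize(m):
    """Divide by the third component to obtain 2D coordinates."""
    return [m[0] / m[2], m[1] / m[2]]


# Illustrative distortion parameter (assumed value).
m = division_model([0.3, 0.4, 1.0], k=-0.08)
xy = dehomogenize(m)
```

Because the distortion enters through the third (homogeneous) component, de-homogenizing *m* divides both image coordinates by 1 + k*r, which is where the name "division model" comes from.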

In the XMP, the distortion parameter *k* is stored as the first element of the coefficient vector:

<xcr:DistortionCoeficients>-0.0831553227672967 0 0 0 0 0</xcr:DistortionCoeficients>.

In RealityCapture, the division model is applied in image coordinates instead of camera coordinates. This means that:

DivisionModel( K^-1*m ) = x,

or alternatively:

m = K*DivisionModel^-1( x ).

The inverse form leads to a more complicated projection equation, which we omit here.

### Brown Models

All other supported models are based on polynomial distortion with the basic distortion equation:

Xd = Brown( x ) = x + (x - o) * ( k1*r^2 + k2*r^4 + k3*r^6 + k4*r^8 ... ) + [ t1*(r^2 + 2*xx^2) + 2*t2*xx*xy; t2*(r^2 + 2*xy^2) + 2*t1*xx*xy ],

where *Xd* is the distorted coordinate of an undistorted point *x = [xx; xy; 1]*, *o* is the origin/distortion center and *r* is the distance of the point *x* from the center of distortion (approximated to 0,0):

r = sqrt( ||x - [0;0;1]||^2 ).
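A minimal Python sketch of the distortion equation above, under two assumptions: the distortion center *o* is taken at (0,0), as the text says it is approximated, and the radial series is truncated after *k4*. The function name `brown` and the coefficient values in the example are illustrative only.

```python
def brown(x, k1=0.0, k2=0.0, k3=0.0, k4=0.0, t1=0.0, t2=0.0):
    """Return Xd for a normalized point x = [xx; xy; 1]."""
    xx, xy = x[0], x[1]
    # r^2 is the squared distance from the distortion center (0,0).
    r2 = xx * xx + xy * xy
    # Radial polynomial k1*r^2 + k2*r^4 + k3*r^6 + k4*r^8.
    radial = k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3 + k4 * r2 ** 4
    # Tangential terms, exactly as in the second bracket of the equation.
    tangential_x = t1 * (r2 + 2 * xx * xx) + 2 * t2 * xx * xy
    tangential_y = t2 * (r2 + 2 * xy * xy) + 2 * t1 * xx * xy
    return [
        xx + xx * radial + tangential_x,
        xy + xy * radial + tangential_y,
        1.0,
    ]


# Illustrative coefficients (assumed values, not from a real calibration).
radial_only = brown([0.1, 0.2, 1.0], k1=0.1)
tangential_only = brown([0.1, 0.0, 1.0], t1=0.01)
```

With all coefficients left at 0 the function returns the input point unchanged, which is a convenient sanity check.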

Please note that the point *x* is normalized w.r.t. its *w* component. The complete 3D-to-2D equation then has the following form:

x = [ R t ] * X,

x’ = x / x.w,

Xd = Brown( x’ ),

m = K*Xd.
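The four steps above can be chained as in the following sketch. It assumes the distortion center at (0,0) and a radial series truncated after *k4*. The names `project`, `brown` and `mat_vec` are hypothetical, and the camera in the example (identity rotation, zero translation, simple *K*) is made up purely to keep the arithmetic easy to follow.

```python
def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]


def brown(x, k1=0.0, k2=0.0, k3=0.0, k4=0.0, t1=0.0, t2=0.0):
    """Polynomial distortion of a normalized point x = [xx; xy; 1]."""
    xx, xy = x[0], x[1]
    r2 = xx * xx + xy * xy
    radial = k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3 + k4 * r2 ** 4
    return [
        xx + xx * radial + t1 * (r2 + 2 * xx * xx) + 2 * t2 * xx * xy,
        xy + xy * radial + t2 * (r2 + 2 * xy * xy) + 2 * t1 * xx * xy,
        1.0,
    ]


def project(X, R, t, K, **coefficients):
    """Project a homogeneous 3D point X = [Xx; Xy; Xz; Xw] to m = K * Xd."""
    Xx, Xy, Xz, Xw = X
    # x = [R t] * X
    x = [rx + ti * Xw for rx, ti in zip(mat_vec(R, [Xx, Xy, Xz]), t)]
    if x[2] == 0:
        raise ValueError("point lies in the camera plane (w = 0)")
    # x' = x / x.w
    x_normalized = [c / x[2] for c in x]
    # Xd = Brown(x')
    Xd = brown(x_normalized, **coefficients)
    # m = K * Xd
    return mat_vec(K, Xd)


# Made-up camera for illustration only.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 0.0]
K = [[2.0, 0.0, 0.1], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]]

m_undistorted = project([1.0, 0.0, 2.0, 1.0], R, t, K)
m_distorted = project([1.0, 0.0, 2.0, 1.0], R, t, K, k1=0.1)
```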

For the Brown models, the distortion coefficients are stored in the XMP file in xcr:DistortionCoeficients as:

<xcr:DistortionCoeficients>k1 k2 k3 k4 t1 t2</xcr:DistortionCoeficients>.

For Brown3, and for models without tangential distortion compensation, the coefficients that the model does not use (*k4*, *t1* and *t2* in the case of Brown3) are set to 0. The order of the entries in the vector is unchanged.
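Since the order of the entries is fixed, the vector can be mapped to named coefficients. The sketch below assumes the element appears as plain text in the form shown above; `parse_distortion` is a hypothetical helper and the numeric values in the example are invented for illustration.

```python
import re

COEFFICIENT_NAMES = ("k1", "k2", "k3", "k4", "t1", "t2")


def parse_distortion(xmp_text):
    """Map the six stored values to the names k1, k2, k3, k4, t1, t2."""
    match = re.search(
        r"<xcr:DistortionCoeficients>(.*?)</xcr:DistortionCoeficients>",
        xmp_text,
        re.S,
    )
    if match is None:
        raise ValueError("xcr:DistortionCoeficients not found")
    values = [float(value) for value in match.group(1).split()]
    if len(values) != len(COEFFICIENT_NAMES):
        raise ValueError("expected 6 coefficients, got %d" % len(values))
    return dict(zip(COEFFICIENT_NAMES, values))


# Invented Brown3-style example: k4, t1 and t2 are 0.
example = "<xcr:DistortionCoeficients>-0.08 0.01 0.002 0 0 0</xcr:DistortionCoeficients>"
coefficients = parse_distortion(example)
```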

## Pixel Coordinates

Recall that the *x* and *y* coordinates of *m* are in the range <-0.5, 0.5>. This makes the projection equations independent of the image scale, so an end user can freely downscale images without breaking the equations.

To scale the point *m* to the image pixels:

scale = max( image width, image height ),

px = scale*m[0] + image width / 2,

py = scale*m[1] + image height / 2,

where *px* and *py* are coordinates in the image.
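The scaling step is small enough to show directly. In this sketch `to_pixels` is a hypothetical helper name and the image sizes are arbitrary examples.

```python
def to_pixels(m, image_width, image_height):
    """Convert normalized image coordinates m[0], m[1] to pixel coordinates."""
    scale = max(image_width, image_height)
    px = scale * m[0] + image_width / 2
    py = scale * m[1] + image_height / 2
    return px, py


# The normalized origin lands at the image center for any resolution.
center_full = to_pixels([0.0, 0.0, 1.0], 6000, 4000)
center_half = to_pixels([0.0, 0.0, 1.0], 3000, 2000)
# A point on the right edge of the longer side.
right_edge = to_pixels([0.5, 0.0, 1.0], 6000, 4000)
```

The same normalized point maps to proportionally the same place in the full-size and the downscaled image, which is the scale independence described above.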
