Camera Calibration Theory

This page describes the mathematics underlying camera calibration in Calcam.

Camera Model

Calcam is based on fitting, or otherwise creating, a model which describes the relationship between 3D real-world coordinates and image coordinates. It supports two different models: one for “conventional” rectilinear lenses and one for fisheye lenses. In both cases, we wish to relate the coordinates of a point \((X,Y,Z)\) in the lab frame to its pixel coordinates \((x_p,y_p)\) in the camera image.

First, we must consider the position and viewing direction of the camera in the lab frame, which are described by a 3D translation and rotation. These translation and rotation parameters are known as the extrinsic parameters of the model. Knowing these, we can apply a suitable translation and rotation to obtain the point of interest’s coordinates in the camera frame: a 3D real-space coordinate system in which the camera pupil is at the origin and the camera looks along the positive \(Z\) axis. We denote the coordinates of our point of interest in the camera frame as \((X^\prime,Y^\prime,Z^\prime)\).

In order to find the pixel coordinates of this point in the camera image, we start with a simple perspective projection, where the height of an object in the image is inversely proportional to its distance from the camera pupil:

\[\begin{split}\begin{pmatrix}x_n\\y_n\end{pmatrix} = \begin{pmatrix}X^\prime/Z^\prime\\Y^\prime/Z^\prime\end{pmatrix}. \label{eqn:cmmodel_pinhole}\end{split}\]
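As a minimal sketch of these first two steps, the snippet below moves a lab-frame point into the camera frame and then applies the perspective projection. It is illustrative only and is not Calcam's own code. The function names, the example numbers, and the convention \(X^\prime = RX + t\) for applying the extrinsic rotation and translation are assumptions made for this example; the text above does not specify the convention.

```python
def lab_to_camera(point, rotation, translation):
    """Transform a lab-frame point into the camera frame.

    Assumes the convention X' = R X + t, with `rotation` a 3x3 nested
    list and `translation` a length-3 sequence (illustrative choice).
    """
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def perspective_project(point_cam):
    """Return normalised coordinates (x_n, y_n) = (X'/Z', Y'/Z')."""
    x_c, y_c, z_c = point_cam
    if z_c <= 0:
        raise ValueError("point is not in front of the camera (Z' <= 0)")
    return x_c / z_c, y_c / z_c


# Hypothetical extrinsics: camera axes aligned with the lab axes, with the
# lab origin sitting 2 units in front of the camera pupil.
R = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 2.0]

point_cam = lab_to_camera((1.0, 0.5, 2.0), R, t)
x_n, y_n = perspective_project(point_cam)
print(point_cam, (x_n, y_n))
```

With these made-up values the point lies 4 units along the camera's \(Z\) axis, so the normalised coordinates are simply the camera-frame \(X^\prime\) and \(Y^\prime\) divided by 4.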

The normalised coordinates \((x_n,y_n)\) are then transformed by a model which describes the image distortion due to the optical system. This model depends on the lens type, and the models for rectilinear and fisheye lenses are described in the following sections. Here we simply denote the resulting distorted normalised coordinates as \((x_d, y_d)\). Finally, the normalised, distorted coordinates are related to the actual pixel coordinates \((x_p, y_p)\) in the image plane by multiplication with the camera matrix:

\[\begin{split}\begin{pmatrix}x_p\\y_p\\1\end{pmatrix} = \begin{pmatrix}f_x & 0 & c_x \\ 0 & f_y & c_y\\0 & 0 & 1\end{pmatrix}\begin{pmatrix}x_d\\y_d\\1\end{pmatrix}. \label{eqn:cammatrix}\end{split}\]

Here \(f_x\) and \(f_y\) are the effective focal lengths of the imaging system, measured in units of detector pixels in the horizontal and vertical directions respectively; they are expected to be equal for square pixels and non-anamorphic optics. \(c_x\) and \(c_y\) are the pixel coordinates of the centre of the perspective projection on the sensor, expected to be close to the detector centre. The parameters in the camera matrix, along with those describing the distortion model, constitute the intrinsic camera parameters, i.e. they are characteristic of the camera and optical system and are independent of how that system is placed in the lab.
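The camera matrix multiplication can be written out in a few lines. The following is a sketch only: the helper names and the intrinsic values (a 1000 pixel focal length and a projection centre at pixel (640, 512)) are hypothetical and chosen purely for illustration.

```python
def camera_matrix(fx, fy, cx, cy):
    """Build the 3x3 camera matrix from the intrinsic parameters."""
    return [[fx, 0.0, cx],
            [0.0, fy, cy],
            [0.0, 0.0, 1.0]]


def to_pixels(K, x_d, y_d):
    """Multiply the homogeneous vector (x_d, y_d, 1) by the camera matrix."""
    homogeneous = (x_d, y_d, 1.0)
    x_p, y_p, w = (
        sum(K[i][j] * homogeneous[j] for j in range(3)) for i in range(3)
    )
    # w is always 1 for this matrix; dividing keeps the homogeneous form explicit.
    return x_p / w, y_p / w


# Hypothetical intrinsics: square pixels, centre of projection at (640, 512).
K = camera_matrix(fx=1000.0, fy=1000.0, cx=640.0, cy=512.0)
x_p, y_p = to_pixels(K, 0.25, 0.125)
print(x_p, y_p)
```

Because the last row of the matrix is \((0, 0, 1)\), the product reduces to \(x_p = f_x x_d + c_x\) and \(y_p = f_y y_d + c_y\).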

Rectilinear Lens Distortion Model

For rectilinear lenses (which, in the ideal distortion-free case, reproduce straight lines in the scene as straight lines in the image), Calcam uses the Brown–Conrady model [1] to characterise geometrical lens distortions. The equation relating the undistorted and distorted normalised image coordinates in this model is:

\[\begin{split}\begin{pmatrix}x_d\\y_d\end{pmatrix} = \left[ 1 + k_1r^2 + k_2r^4 + k_3r^6\right]\begin{pmatrix}x_n\\y_n\end{pmatrix} + \begin{pmatrix}2p_1x_ny_n + p_2(r^2 + 2x_n^2)\\2p_2x_ny_n + p_1(r^2 + 2y_n^2)\end{pmatrix}, \label{eq:perspective_distortion}\end{split}\]

where \(r = \sqrt{x_n^2 + y_n^2}\). The polynomial in \(r^2\) describes radial distortion: the \(k_1\) term corresponds to barrel or pincushion distortion, and the combination of the \(k_1\) and \(k_2\) terms can describe moustache distortion. Including a \(k_3\) term allows for an even higher-order radial distortion model. The second term, with coefficients \(p_1,p_2\), describes tangential (or decentring) distortion, which results from decentring or misalignment of individual optical elements. When fitting a camera calibration in Calcam, each term of the radial distortion polynomial, and the tangential distortion term, can either be enabled or set to zero to simplify the distortion model.
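The rectilinear distortion equation translates directly into code. The sketch below is an illustration rather than Calcam's implementation (Calcam relies on OpenCV for this); the function name is hypothetical, and leaving a coefficient at its default of zero corresponds to disabling that term.

```python
def distort_rectilinear(x_n, y_n, k1=0.0, k2=0.0, k3=0.0, p1=0.0, p2=0.0):
    """Apply the Brown-Conrady model to normalised coordinates (x_n, y_n)."""
    r2 = x_n * x_n + y_n * y_n                      # r^2
    radial = 1.0 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    # Tangential (decentring) contribution, term by term as in the equation.
    dx = 2.0 * p1 * x_n * y_n + p2 * (r2 + 2.0 * x_n * x_n)
    dy = 2.0 * p2 * x_n * y_n + p1 * (r2 + 2.0 * y_n * y_n)
    return radial * x_n + dx, radial * y_n + dy


# Illustrative coefficients only: a negative k1 pulls points towards the
# centre, which is the barrel-distortion case.
print(distort_rectilinear(0.5, 0.0, k1=-0.1))

# With every coefficient at zero the model reduces to the identity.
print(distort_rectilinear(0.5, 0.25))
```

In the first call, \(r^2 = 0.25\), so the radial factor is \(1 - 0.1 \times 0.25 = 0.975\) and the point moves from \(x_n = 0.5\) to \(x_d = 0.4875\).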

Fisheye Lens Distortion Model

Fisheye lenses use deliberate geometrical distortion to produce an image with a much wider field of view than rectilinear lenses. In this case, the polynomial describing the radial distortion is a function of an angular distance from the centre of perspective, rather than a linear distance in the image as in the rectilinear lens model:

\[\begin{split}\begin{pmatrix}x_d\\y_d\end{pmatrix} = \frac{\theta}{r}\left[ 1 + k_1\theta^2 + k_2\theta^4 + k_3\theta^6 + k_4\theta^8\right]\begin{pmatrix}x_n\\y_n\end{pmatrix}, \label{eqn:fisheye_distortion}\end{split}\]

where \(r = \sqrt{x_n^2 + y_n^2}\) and \(\theta = \tan^{-1}(r)\).

Similarly to the rectilinear lens model, when fitting a fisheye calibration in Calcam the individual polynomial terms can be switched on or off to fit a different order of distortion model.
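The fisheye distortion equation can be sketched in the same way. This is an illustration, not Calcam's code: the function name is hypothetical, and the handling of the on-axis point (\(r = 0\), where \(\theta/r\) tends to 1) is a choice made for this example to avoid dividing by zero.

```python
import math


def distort_fisheye(x_n, y_n, k1=0.0, k2=0.0, k3=0.0, k4=0.0):
    """Apply the fisheye model to normalised coordinates (x_n, y_n)."""
    r = math.hypot(x_n, y_n)
    if r == 0.0:
        # On the optical axis theta/r -> 1, so the point is unchanged.
        return 0.0, 0.0
    theta = math.atan(r)
    t2 = theta * theta
    poly = 1.0 + k1 * t2 + k2 * t2**2 + k3 * t2**3 + k4 * t2**4
    scale = theta * poly / r
    return scale * x_n, scale * y_n


# A ray 45 degrees off-axis has r = 1; with all coefficients zero it maps
# to a distorted radius of theta = pi/4.
print(distort_fisheye(1.0, 0.0))
```

With all \(k_i = 0\) the distorted radius equals the angle \(\theta\) itself, so it stays bounded by \(\pi/2\) however large \(r\) becomes; this is how the model accommodates very wide fields of view.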

Underlying OpenCV Documentation

Calcam does not implement the above camera models within its own code; under the hood it uses the OpenCV camera calibration functions. It may therefore be helpful to also refer to the OpenCV camera calibration documentation, which can be found on the OpenCV webpages.

References

[1] Duane C. Brown, “Decentering distortion of lenses”, Photogrammetric Engineering 32 (3): 444–462 (1966). PDF available on archive.org.