ArUco: a minimal library for Augmented Reality applications based on OpenCv

 

NEWS: Aruco version 1.2.4 released!

The main features of ArUco are:

  • Detect markers with a single line of C++ code
  • Detection of AR boards (markers composed by several markers)
  • Requires only OpenCv (>=2.1)
  • Up to to 1024 different markers
  • Trivial integration with OpenGL and OGRE
  • Fast, reliable and cross-platform because relies on OpenCv
  • Examples that will help you to get running your AR application in less than 5 minutes
  • BSD licence

Videos in YouTube:

Video 1: Aruco and Ogre

Video 2: Single marker detection

Video 3: Board of markers detection

Examples of Projects using Aruco :

Soldamatic Project (Augmented Reality Welder) 

OpenSpace3D: http://www.openspace3d.com/

JavaScript: http://www.youtube.com/watch?v=_wzPupbww4I&feature=related

Contact info: rmsalinas at uco dot es

Download:

Download ArUco

Download ArUco Test Data

Download ArUco+OGRE example

Fast Code sample

Take a look:

Camera Calibration

Concepts

Camera calibration is the process of obtaning the fundamental parameters of a camera. These parameter allows us to determine where a 3D point in the space projects in the camera sensor. The camera parameters can be divided into intrinsics and extrinsics.

Intrinsic parameters are

  • fx,fy: Focal lenght of the camera lens in both axes. These are normally expressed in pixels
  • cx,cy:  Optical center of the sensor(expressed in pixels)
  • k1,k2,p1,p2: distorsion coefficients.

In a ideal camera, a 3D point (X,Y,Z) in the space would project in the pixel:

x= (X fx /Z) +cx ; y= (Y fy/Z)+cy.

However camera lenses normally distorts the scene making the points far from the center to be even farther. Thus, the vertical stripes near the image borders appears slightly bended. As a consequence, if we want to know a pixel's projection, then we must consider the distorsion components. Merely to say that are two type of distorsions (radial and tangential) and these are represented by the paremeters p1,p2,k1,k2.

 

The  above assumes that you know the 3D location of the point with relation to the camera reference system. If you want to know the projection of a point referred to an arbitrary reference system, then you must use th extrinsic parameters. The extrinsic parameters are basically the 3D rotations (Rvec={Rx,Ry,Rz})and 3D translations (Tvec={Tx,Ty,Tz}) required to translate the camera reference system to the arbitrary one.

Calibrating in OpenCv

Calibration is the process of obtaining the intrinsic camera parameters and OpenCv allows you to easily do it. You only need to show images of a chessboard panel of known dimensions.

 

You should take at least five different pictures. A frontal one, and four where the border of the pattern is near the image border. By placing the pattern near the image border it will be possible to estimate accurately the camera distorsion. You'll find a printable pdf of the chessboard pattern in OpenCv2.2 distribution "doc/pattern.pdf".

Once you had taken the pictures with the camera you want to use, use the "samples/cpp/calibration.cpp" application provided in OpenCv >= 2.2 to calibrate your camera. You'll have to indicate the number of corners of the pattern in both axes, and the real size of the square (use your measurer).

In my case, I ran the program with the parameters

>calibration  -w 8 -h 6 -s 0.025 -o camera.yml -op

As output, the program generates a .yml file that can be used in ArUco. In the process of obtaining the intrinsics, OpenCv  also obtains the extrinsics of the camera to each one of the locations of the pattern. However, we will not use it here.

How to Augment Reality?

Augmenting the reality is the process of adding virtual information to images. To do so, we need to know where to paint the virtual information (normally using OpenGL or another 3D engine). Here is where the markers will help you.

An AR Marker, like the one used in Aruco, is a very distintive element that can be easily detected. The marker will be used to calculate the extrinsics of the camera in relation to the marker so that you'll be able to render in 3D knowing where is the (0,0,0) of the world reference system. Since the AR marker is very easy to detect, you can process images and do rendering in real time.

 

Detection Process in Aruco

The marker detection process of ArUco is as follows:

  • Apply an Adaptive Thresholding so as to obtain borders (Figure 1)
  • Find contours. After that, not only the real markers are detected but also a lot of undesired borders. The rest of the process aims to filter out unwanted borders.
    • Remove borders with an small number of points (Figure 2)
    • Polygonal approximation of contour and keep the concave contours with exactily 4 corners (i.e., rectangles) (Figure 3)
  • Sort corners in anti-clockwise direction.
  • Remove too close rectangles. This is required because the adaptive threshold normally detects the internal an external part of the marker's border. At this stage, we keep the most external border. (Figure 4)
  • Marker Identification
    • Remove the proyection perspective so as to obtain a frontal view of the rectangle area using an homography (Figure 5)
    • Threshold the area using Otsu. Otsu's algorithms assumes a bimodal distribution and finds the threshold that maximizes the extra-class variance while keeping a low intra-class variance.
    • Identification of the internal code. If it is a marker, then it has an internal code. The marker is divided in a 6x6 grid, of which the internal 5x5 cells contains id information. The rest correspond to the external black border. Here, we first check that the external black border is present. Afterwards, we read the internal 5x5 cells and check if they provide a valid code (it might be required to rotate the code to get the valid one).
  • For the valid markers, refine corners using subpixel interpolation
  • Finally, if camera parameters are provided, it is computed the extrinsics of the markers to the camera.

Marker coding

Each marker has an internal code given by 5 words of 5 bits each. The codification employed is a slight modification of the Hamming Code. In total, each word has only 2 bits of information out of the 5 bits employed. The other 3 are employed for error detection. As a consequence, we can have up to 1024 diferent ids.

The main difference with the Hamming Code is that the first bit (parity of bits 3 and 5) is inverted. So, the id 0 (which in hamming code is 00000) becomes 10000 in our codification. The idea is to prevent a completely black rectangle from being a valid marker id with the goal of reducing the likelihood of false positives with objects of the environment.

 

Benefits of using AR Boards


Detecting  a single marker might fail for different reasons such as poor lightning conditions,  fast camera  movement, occlusions, etc. To overcome that problem, ArUco allows the use of boards. An Augmented Reality board is a marker composed by several markers arranged in a grid. Boards present two main advantages. First, since there have more than one markers, it is less likely to lose them all at the same time. Second, the more markers are detected, the more points are available for computing the camera extrinsics. As a consecuence, the accuracy obtained increases.