ArUco: a minimal library for Augmented Reality applications based on OpenCV



News: Aruco 2.0.x released: A Library to rule them all. The new version can detect markers from other libraries such as AprilTags, ArTootKit+ and ARTAG. 

IMPORTANT: see the related project markermapper

The main features of ArUco are:

  • Detect markers with a single line of C++ code.
  • Detect varios dictionaries: ARUCO, ARUCO-HRM, AprilTag,ArToolKit+,ARTAG.
  • Few dependencies  OpenCV (>=2.4.9) and eigen3 (included in the library).
  • Detection of  Marker Maps (several markers).
  • Trivial integration with OpenGL and OGRE.
  • Fast, reliable and cross-platform because relies on OpenCV.
  • Examples that will help you to get running your AR application in less than 5 minutes.
  • Calibrate you camera using Aruco ChessBoard
  • BSD licence.
  • Faster than AprilTags


Highlights features of version 2.x

  • Detect varios dictionaries which are hardcoded (no need to use external files).
  • Improved speed using pyramid computation.
  • Added tracking methods to alleviate the ambiguity problem in planar pose estimation.
  • Detection of chessboards of markers
  • Calibration tool using Aruco ChessBoards added (see below)


Download ArUco

Download ArUco Test Data

Download ArUco+OGRE example  (UNDER ADAPTATION TO VERSION 2.x)



Please cite the following papers if you use aruco:

1. Main ArUco paper:
"Automatic generation and detection of highly reliable fiducial markers under occlusion"
2. Generation of marker dictionaries:
"Generation of fiducial marker dictionaries using mixed integer linear programming"


Videos in YouTube:

Video 1: Aruco and Ogre

Video 2: Single marker detection

Video 3: Board of markers detection

Examples of Projects using ArUco :

Augmented Reality Chess

Vision Based Navigation  and Fuzzy Vision Based Navigation

Soldamatic Project (Augmented Reality Welder) 


JavaScript port : watch demonstration video

ROS (Robot Operative System) port:#1 #2

Panoramic camera calibratio

Contact info:

Rafael Muñoz-Salinas:  rmsalinas at uco dot es

Sergio Garrido-Jurado:   sgarrido2011 at gmail dot com

Fast Code sample

Take a look:



You can learn how to use the library first reading this document, then reading the README and finally, analyzing the many examples in the utils directories of the library. Also, you can have a more in-depth knowledge by reading the above mentioned papers.

Camera Calibration


Camera calibration is the process of obtaning the fundamental parameters of a camera. These parameter allows us to determine where a 3D point in the space projects in the camera sensor. The camera parameters can be divided into intrinsics and extrinsics.

Intrinsic parameters are

  • fx,fy: Focal lenght of the camera lens in both axes. These are normally expressed in pixels
  • cx,cy:  Optical center of the sensor(expressed in pixels)
  • k1,k2,p1,p2: distorsion coefficients.

In a ideal camera, a 3D point (X,Y,Z) in the space would project in the pixel:

x= (X fx /Z) +cx ; y= (Y fy/Z)+cy.

However camera lenses normally distorts the scene making the points far from the center to be even farther. Thus, the vertical stripes near the image borders appears slightly bended. As a consequence, if we want to know a pixel's projection, then we must consider the distorsion components. Merely to say that are two type of distorsions (radial and tangential) and these are represented by the paremeters p1,p2,k1,k2.


The  above assumes that you know the 3D location of the point with relation to the camera reference system. If you want to know the projection of a point referred to an arbitrary reference system, then you must use th extrinsic parameters. The extrinsic parameters are basically the 3D rotations (Rvec={Rx,Ry,Rz})and 3D translations (Tvec={Tx,Ty,Tz}) required to translate the camera reference system to the arbitrary one. The rotation elements are expressed in Rodrigues formula, so you can obtain the equivalen 3x3 rotation matrix using the opencv function cv::Rodrigues()

A very popular alternative to manage both rotation and translation is using Homogenoeus coordinates. You can easily create a 4x4 matrix than comprises both rotation and translation (use the following piece of code to create the 4x4 matrix from rvec and tvec).

This matrix allows you to transform a point from one reference system to the other. You can also invert the 4x4 matrix to obtain the opposite transform.

Calibrating in OpenCV

Calibration is the process of obtaining the intrinsic camera parameters and OpenCV allows you to easily do it. You only need to show images of a chessboard panel of known dimensions.


You should take at least five different pictures. A frontal one, and four where the border of the pattern is near the image border. By placing the pattern near the image border it will be possible to estimate accurately the camera distorsion. You'll find a printable pdf of the chessboard pattern in OpenCV 2.2 distribution "doc/pattern.pdf".

Once you had taken the pictures with the camera you want to use, use the "samples/cpp/calibration.cpp" application provided in OpenCV >= 2.2 to calibrate your camera. You'll have to indicate the number of corners of the pattern in both axes, and the real size of the square (use your measurer).

In my case, I ran the program with the parameters

>calibration  -w 8 -h 6 -s 0.025 -o camera.yml -op

As output, the program generates a .yml file that can be used in ArUco. In the process of obtaining the intrinsics, OpenCV  also obtains the extrinsics of the camera to each one of the locations of the pattern. However, we will not use it here.

Calibration with Aruco

Aruco has its own calibration board that has an advantage over the classical opencv chessboard: since it is composed by several markers, you do not need to see it completely in order to obtain calibration points. Instead, you can see if partially and still works. This is a very convenient feature in many applications. You have programs in utils to use calibrate using our chessboard.


How to Augment Reality?

Augmenting the reality is the process of adding virtual information to images. To do so, we need to know where to paint the virtual information (normally using OpenGL or another 3D engine). Here is where the markers will help you.

An AR Marker, like the one used in ArUco, is a very distintive element that can be easily detected. The marker will be used to calculate the extrinsics of the camera in relation to the marker so that you'll be able to render in 3D knowing where is the (0,0,0) of the world reference system. Since the AR marker is very easy to detect, you can process images and do rendering in real time.

Detection Process in ArUco

The marker detection process of ArUco is as follows:

  • Apply an Adaptive Thresholding so as to obtain borders (Figure 1)
  • Find contours. After that, not only the real markers are detected but also a lot of undesired borders. The rest of the process aims to filter out unwanted borders.
    • Remove borders with an small number of points (Figure 2)
    • Polygonal approximation of contour and keep the concave contours with exactily 4 corners (i.e., rectangles) (Figure 3)
  • Sort corners in anti-clockwise direction.
  • Remove too close rectangles. This is required because the adaptive threshold normally detects the internal an external part of the marker's border. At this stage, we keep the most external border. (Figure 4)
  • Marker Identification
    • Remove the proyection perspective so as to obtain a frontal view of the rectangle area using an homography (Figure 5)
    • Threshold the area using Otsu. Otsu's algorithms assumes a bimodal distribution and finds the threshold that maximizes the extra-class variance while keeping a low intra-class variance.
    • Identification of the internal code. If it is a marker, then it has an internal code. The marker is divided in a 6x6 grid, of which the internal 5x5 cells contains id information. The rest correspond to the external black border. Here, we first check that the external black border is present. Afterwards, we read the internal 5x5 cells and check if they provide a valid code (it might be required to rotate the code to get the valid one).
  • For the valid markers, refine corners using subpixel interpolation
  • Finally, if camera parameters are provided, it is computed the extrinsics of the markers to the camera.

Marker coding

Each marker has an internal code given by 5 words of 5 bits each. The codification employed is a slight modification of the Hamming Code. In total, each word has only 2 bits of information out of the 5 bits employed. The other 3 are employed for error detection. As a consequence, we can have up to 1024 diferent ids.

The main difference with the Hamming Code is that the first bit (parity of bits 3 and 5) is inverted. So, the id 0 (which in hamming code is 00000) becomes 10000 in our codification. The idea is to prevent a completely black rectangle from being a valid marker id with the goal of reducing the likelihood of false positives with objects of the environment.


Benefits of using Marker Maps

Detecting  a single marker might fail for different reasons such as poor lightning conditions,  fast camera  movement, occlusions, etc. To overcome that problem, ArUco allows the use of boards. An Augmented Reality board is a marker composed by several markers arranged in a grid. Markermaps present two main advantages. First, since there have more than one markers, it is less likely to lose them all at the same time. Second, the more markers are detected, the more points are available for computing the camera extrinsics. As a consecuence, the accuracy obtained increases.


Managing the output of ArUco

When you know the camera intrinsics (camera matrix and distortion coefficients obtained by calibration), and you specify the size of the marker, the library can tell you the relative position of the markers and the camera.

ArUco uses the a class named Marker which represents a marker observed in the image. Each marker is a vector of 4 2d points (representing the corners in the image), an unique id, its size (in meters), and the translation and rotation  that relates the center of the marker and the camera location.

The 3D information is represented by the class variables Rvec and Tvec. The latter, reprents a 3D translation. The first, is a rotation transform expressed as given by the Rodrigues formula. To obtatin the full 3x3 rotation matrix, you can use cv::Rodrigues(Rvec,R33).

Then, you can compose a full 4x4 transform matrix by adding the 3x3 matrix to the translation vector. You get a matrix like

T=[ R00 R01 R01 T0;R10 R11 R11 T1;R20 R21 R21 T2; 0 0 0 1]

So, if you want to project a 3D point in the space (measured wrt the marker ref system), you first must transform it to the camera ref system (multiplying the point in homogeneous corrdinates by T). The resulting 3d point is projected using the camera matrix, and finally you need to apply the distortion model.


All these steps are done by cv::projectPoints, which uses as input the CameraMatrix the Distortion Coefficients, and Rvec,Tvec.


Tracking the position of a marker 

Inevitably, the pose of a single marker is subject to the ambiguity problem, i.e., two valid solutions can explain the observed projection of the marker. It generally happens when the marker is small in relation to the image, or in very inclined positions. When tracking a marker, you see this effect as a strange change flip of the z-axis  of the marker.  To alleviate that problem, the library has the possibility of tracking the marker position. If you start from a good position, for the next frame, you can fo an optimization looking for the best pose near from the last one. Then, the abiguity problem can be solved. To do so, use the MarkerPoseTracker and MarkerMapPoseTracker classes.

Using ArUco in your project

Here we show how to create your own project using ArUco and cmake. This example can be downloaded at SourceForge.

First, create a dir for your dir with the project (e.g. aruco_testproject). Go in and create the CMakeLists.txt as

cmake_minimum_required(VERSION 2.8)
find_package(aruco REQUIRED )
add_executable(aruco_simple aruco_simple.cpp)
target_link_libraries(aruco_simple  ${aruco_LIBS})
cmake_minimum_required(VERSION 2.8)
find_package(aruco REQUIRED )
add_executable(aruco_simple aruco_simple.cpp)
target_link_libraries(aruco_simple  ${aruco_LIBS})

Then, create the program file aruco_simple.cpp :

#include <iostream>
#include <aruco/aruco.h>
#include <aruco/cvdrawingutils.h>
#include <opencv2/highgui/highgui.hpp>
using namespace cv;
using namespace aruco;
int main(int argc,char **argv)
        if (argc!=2) {
            cerr<<"Usage: in.jpg "<<endl;
            return -1;
        MarkerDetector MDetector;
        vector<Marker> Markers;
        //read the input image
        cv::Mat InImage;
    //Ok, let's detect
        //for each marker, draw info and its boundaries in the image
        for (unsigned int i=0;i<Markers.size();i++) {
        cv::waitKey(0);//wait for key to be pressed
    } catch (std::exception &ex)
        cout<<"Exception :"<<ex.what()<<endl;
Finally, create a directory for building (e.g. build), go in, compile and execute
mkdir build
cd build
cmake ..
./aruco_simple image_withmarkers.jpg