Adaptive Interfaces for Marker-based Augmented Reality

CS290I Mixed and Augmented Reality [2012 Winter]

Project Report. 

March 21, 2012
[Individual Project]  Byungkyu (Jay) Kang

Overview of the Project

Augmented reality(AR) techniques have been developed for more than a decade in a variety of fields including Computer Science and its possible application practices are being exponentially increased.  As augmented reality technology is being developed rapidly, we are not only facing its increased demand in the real world, but being asked for more advanced interfaces as well as smooth adaptation to merge the technology with real world.  In this project, we propose a few interfaces that can be utilized in both legacy platform and mobile environment.  We suggest four different user interfaces (recognition, reflection, sound and motion interactions) for experiencing more realistic augmented reality application.

Main Feature

The contribution of this work is harnessing four different simple but effective means for user interface in augmented reality. In this section, brief description of each interface is introduced. Detailed specification and implementation methodology of each technique will be discussed in the following section.

  1. Nested Marker for extensible distance
    For the first user interface, we exploit nested architecture of a single marker.  Nested marker can provide longer distance by having larger marker print.  Our implementation also provides seamless transition between two different layers in a nested marker.
  2. Light source responsive object rendering
    We have been experiencing unrealistic composition of both rendered image and real image frame. For example, crispy 3D or 2D object on a blurry camera image and bright augmented object in a dark surrounding are the most common unrealistic compositions. We provide a lighting mechanism interacting with real background light sources in OpenGL rendering loop.
  3. Sound manipulator
    There are a number of possible interfaces between human and computer.  Even a conveniently enough mobile device can not be reached sometimes.  We can interact with AR application/objects by making a specific sounds such as clapping or snapping of fingers.
  4. Mask based object interaction
    Direct interaction in real space with augmented objects through our body(i.e. finger tip) is perhaps the most effective and intuitive interface in virtual or augmented reality.  In this project, we firstly detect markers and find their vertices of each corner.  By making use of them, we are able to generate four different regions of interest(ROIs) and assign specific effect such as rotation or re-scaling.

Implementation and Evaluation

1. System Design

The implemented application for this project is originally intended for mobile environment because we can exploit more sensing modalities(i.e. gyro sensor, compass, etc.).  Due to time constraint, however, implementation on mobile device(iOS) is incomplete yet, and now in progress. (Image input/output, basic user interface(storyboard environment for iOS 4 and preliminary image processing are implemented.)  We will update upcoming progress as well as its final result on this post immediately.

1.1. System Overview

– Platform : Open Platform (Developed on MacOSX 10.7.2 – XCode)
– Used Libraries : OpenCV, OpenGL, ARToolkit 2.72.1, Portaudio

1.2. System Architecture

1.2.1. Feature Matching

Not only ARToolkit but most of augmented reality frameworks and libraries follow the basic feature matching algorithm. This algorithm is as follows.

Figure 1. Feature Matching in ARToolkit

1.2.2. System Architecture

Our system has mainly three tracks between both each raw frame and final rendering.  Nested Marker can be individually trained by ARToolkit.  We generated marker pattern files by the flash online marker generator [2] and added smooth scaler which is calculated from detected marker size.

Figure 2. System Architecture

2. Proposed Method

2.1. Nested Marker

Figure 3. Prototype of nested marker

Since a nested marker has more than four markers of the same image, we can extend recognition distance as much as necessary as shown in Figure 3.  However, we must have bigger marker print for longer distance to recognize, and this is one of the drawbacks of our interface.  The problem statement of our approach is physical limitation of recognition distance, particularly outdoor augmented reality.  Similar technique was introduced by [1] Tateno et al., but our goal purely focuses on providing smooth and seamless extension of recognition in longer distance.  [1] Tateno et al. intended to provide various user experiences with multi-layered nested marker by having different sub-images.  This previous objective now can be achieved by incorporating multiple renderings with different offsets in an application.  Smooth-scaling is done by detecting the area size of recognized marker in real-time.  We can rescale rendered object by multiplying coefficient derived from detected maker size.  When nested sub-images are not aligned in center, we can also assign offset value as well.

ARToolkit documentation also explicitly mentions about the limitation of the distance of recognition(range issues). Figure 3.1 is the table about range issue from the ARToolkit documentation and Figure 3.2 describes our nested-marker model.

Figure 3.1. Tracking range for different sized patterns (from ARToolkit Documentation)

Figure 3.2. Nested marker model

2.2. Light source responsive object rendering

Figure 4. Flowchart of the algorithm of light aware rendering technique

As can be seen in Figure 2., the light source aware rendering algorithm utilizes real world camera input with appropriate image processing.  A similar approach has been made in [3] Lie et al. which utilizes the spatial and temporal coherence of illumination from real camera input image.  Although their algorithm is comparatively robust, it shows nearly real-time performance with optimized configuration according to the paper.  Our algorithm shows us real-time rendering using OpenGL without having optimized condition(approximation of obtained contours from light source detection).  The real-time performance is especially crucial for mobile augmented reality applications.  By employing light source aware rendering in any augmented reality application, a user of AR system can experience realism from rendered object responding to surrounding light condition.  The algorithm of the approach is described in Figure 4 as below.

Figure 5. Light Source Extraction - Upper two images are from OpenGL and OpenCV output, bottom left is for Lab image with highlight of detected light sources. Bottom right image is found contours from cvFindContour function.

Figure 6. Fast Fourier Transform visualization of the sound of snapping fingers fed as input interaction

2.3. Sound manipulator

One of two additional approaches we intended to add in this project was sound based interaction with augmented realistic objects.  This approach can be very simple but also powerful unless an user is interacting with a system in a very noisy public space.  Since each sound has its unique feature in its frequency distribution, the system can recognize certain frequency against average amplitude of the whole input sound.  Preliminary implementation has been done with Processing.  For example, the unique distribution in fast fourier transform(FFT) visualization was found as can be seen in Figure 6.  Since we need to merge this functionality with our system, C based sound processing library must be used instead of Processing which is Java based.  Portaudio library is chosen for audio processing layer in order to analyze microphone input.  Implementation of this module is still in progress.


2.4. Mask based object interaction

This interaction approach has been made from a number of literatures previously.  Therefore, we can briefly mention about this approach with the following Figure 7.  We can firstly find the orientation of detected markers from ‘marker_info‘ pointer variable in ARToolkit library.  After up/down and left/right positions of a marker is found, four regions of interest with half size of the marker area is going to be assigned to each edge of a marker.  For each region, we can link to specific event which will be activated once any motion flow is detected in the corresponding ROI.

Figure 7. A systemic description of marker interaction with motion flow detection

Figure 8. Finding four distinctive corners of detected marker






[1] Hirokazu Kato , Mark Billinghurst, Marker Tracking and HMD Calibration for a Video-Based Augmented Reality Conferencing System, Proceedings of the 2nd IEEE and ACM International Workshop on Augmented Reality, p.85, October 20-21, 1999

[2] ARToolKit Marker Generator Online

[3] Yanli Liu, Xavier Granier: Online Tracking of Outdoor Lighting Variations for Augmented Reality with Moving Cameras. IEEE Trans. Vis. Comput. Graph. 18(4): 573-580 (2012)

[4] The skeleton code is brought to this project by SimpleTest.c in ARToolkit 2.72.1 open sourcecode.

[5] OpenCV 2.0 Documentation (OpenCV 2.0 C Reference)