Mirror Image

Mostly AR and Stuff

Symbian Multimarker Tracking Library

#augmentedreality
A demo version of the binary Symbian multimarker tracking library SMMT is available for download.
SMMT is a SLAM multimarker tracker for Symbian. The library works on Symbian S60 9.1 devices like the Nokia N73 and on Symbian 9.2 devices like the Nokia N95 and N82; it may also work on some later versions. This version supports only landscape 320×240 resolution, for an algorithmic reason – that size is used in the optimization.
This is a slightly more advanced version of the tracker used in the AR Tower Defense game.
PS The corrupted file is fixed.

5, September, 2009 Posted by | Coding AR | , , , , , , , , , | Comments Off on Symbian Multimarker Tracking Library

Another prospective AR device

Nokia RX-51. It is reported to have an OMAP3 600MHz CPU with hardware 3D, a camera, phone connectivity (it's not a pure tablet like the N800), GPS, accelerometers, and – most tasty – Maemo 5 Linux.
No reports of an electronic compass though.

9, August, 2009 Posted by | Mobile | , , | Comments Off on Another prospective AR device

Augmented reality on S60 – basics

Blair MacIntyre asked on ARForum how to get video out of the Symbian image data structure and upload it into an OpenGL ES texture. So here is how I did it for my games:
I get the viewfinder RGB bitmap, access its RGB data and use glTexImage2D to upload it into a background texture, which I stretch over a background rectangle. On top of the background rectangle I draw the 3D models.
This code snippet is for a 320×240 screen and OpenGL ES 1.x (WordPress completely screwed the tabs).

PS Here is a binary static library for multimarker tracking on S60 which uses that method.

#define VFWIDTH 320
#define VFHEIGHT 240

Two textures are used for the background, because texture dimensions should be powers of two: 256×256 and 256×64.

#define BKG_TXT_SIZEY0 256
#define BKG_TXT_SIZEY1 64

The Nokia camera example can be used as the base.

1. Override the ViewFinderFrameReady function

void CCameraCaptureEngine::ViewFinderFrameReady(CFbsBitmap& aFrame)
{
iController->ProcessFrame(&aFrame);
}
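For this callback to fire, the viewfinder has to be started in bitmap mode when the capture engine is set up – roughly like this (the method and member names here are only an assumption, error handling omitted):

void CCameraCaptureEngine::StartViewFinderL()
{
	// iCamera is the engine's CCamera instance; bitmaps are then delivered
	// to MCameraObserver::ViewFinderFrameReady()
	TSize size(VFWIDTH, VFHEIGHT);
	iCamera->StartViewFinderBitmapsL(size);
}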

2. iController->ProcessFrame calls CCameraAppBaseContainer::ProcessFrame

void CCameraAppBaseContainer::ProcessFrame(CFbsBitmap* pFrame)
{
// here the RGB buffer for the background is filled
iGLEngine->FillRGBBuffer(pFrame);
// and the greyscale buffer for tracking is filled
iTracker->FillGreyBuffer(pFrame);

// tracking
TBool aCaptureSuccess = iTracker->Capture();
//physics
if(aCaptureSuccess)
{
iPhEngine->Tick();
}
//rendering
glClear( GL_DEPTH_BUFFER_BIT);
iGLEngine->SetViewMatrix(iTracker->iViewMatrix);
iGLEngine->Render();

iGLEngine->Swap();
};
void CGLengine::Swap()
{
eglSwapBuffers( m_display, m_surface);
};

3. Now, how the buffers are filled: the RGB buffers are filled and bound to the textures.

// Swap R and B: the viewfinder bitmap stores a pixel as 0xXXRRGGBB, while
// GL_RGBA/GL_UNSIGNED_BYTE expects R,G,B,A byte order. Whatever ends up in the
// top (alpha) byte doesn't matter – blending is disabled for the background.
inline unsigned int byte_swap(unsigned int v)
{
	return (v<<16) | (v&0xff00) | ((v >> 16)&0xff);
}

void CGLengine::FillRGBBuffer(CFbsBitmap* pFrame)
{
pFrame->LockHeap(ETrue);
unsigned int* ptr_vf = (unsigned int*)pFrame->DataAddress();

FillBkgTxt(ptr_vf);

pFrame->UnlockHeap(ETrue); // unlock global heap

BindRGBBuffer(m_bkgTxtID0, m_rgbxBuffer0, BKG_TXT_SIZEY0);
BindRGBBuffer(m_bkgTxtID1, m_rgbxBuffer1, BKG_TXT_SIZEY1);
}

void CGLengine::FillBkgTxt(unsigned int* ptr_vf)
{
unsigned int* ptr_dst0 = m_rgbxBuffer0 +
(BKG_TXT_SIZEY0-VFHEIGHT)*BKG_TXT_SIZEY0;
unsigned int* ptr_dst1 = m_rgbxBuffer1 +
(BKG_TXT_SIZEY0-VFHEIGHT)*BKG_TXT_SIZEY1;

for(int j =0; j < VFHEIGHT; j++)
for(int i =0; i < BKG_TXT_SIZEY0; i++)
{
ptr_dst0[i + j*BKG_TXT_SIZEY0] = byte_swap(ptr_vf[i + j*VFWIDTH]);
}

ptr_vf += BKG_TXT_SIZEY0;

for(int j =0; j < VFHEIGHT; j++)
for(int i =0; i < BKG_TXT_SIZEY1; i++)
{
ptr_dst1[i + j*BKG_TXT_SIZEY1] = byte_swap(ptr_vf[i + j*VFWIDTH]);
}

}

void CGLengine::BindRGBBuffer(TInt aTxtID, GLvoid* aPtr, TInt aYSize)
{
glBindTexture( GL_TEXTURE_2D, aTxtID);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, aYSize, BKG_TXT_SIZEY0, 0,
GL_RGBA, GL_UNSIGNED_BYTE, aPtr);
}
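The texture objects m_bkgTxtID0 and m_bkgTxtID1 have to be created once before the first frame; that part isn't shown above. It could look roughly like this (the method name and the filtering/wrap settings are my choice):

void CGLengine::CreateBkgTextures()
{
	GLuint txtIDs[2];
	glGenTextures(2, txtIDs);
	m_bkgTxtID0 = txtIDs[0];
	m_bkgTxtID1 = txtIDs[1];

	for(int i = 0; i < 2; i++)
	{
		glBindTexture(GL_TEXTURE_2D, txtIDs[i]);
		// no mipmaps – the textures are re-uploaded every frame
		glTexParameterx(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
		glTexParameterx(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
		glTexParameterx(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
		glTexParameterx(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
	}
}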

4. The greyscale buffer is filled, smoothed via an integral image:

void CTracker::FillGreyBuffer(CFbsBitmap* pFrame)
{
	pFrame->LockHeap(ETrue);
	unsigned int* ptr = (unsigned int*)pFrame->DataAddress();

	if(m_bIntegralImg)
	{
		// calculate integral image values

		unsigned int rs = 0;
		for(int j=0; j < VFWIDTH; j++)
		{
			// cumulative row sum, first row
			rs = rs + Raw2Grey(ptr[j]);
			m_integral[j] = rs;
		}

		for(int i=1; i < VFHEIGHT; i++)
		{
			unsigned int rs = 0;
			for(int j=0; j < VFWIDTH; j++)
			{
				// cumulative row sum plus the value from the row above
				// (this statement was eaten by WordPress, reconstructed)
				rs = rs + Raw2Grey(ptr[i*VFWIDTH+j]);
				m_integral[i*VFWIDTH+j] = m_integral[(i-1)*VFWIDTH+j]+rs;
			}
		}

		// downsample into the tracking buffer: 2x2 box sums at the borders,
		// 4x4 box sums in the interior; Area() sums a rectangle via the integral image
		iRectData.iData[0] = m_integral[1*VFWIDTH+1]>>2;

		int aX, aY;

		for(aY = 1; aY < MAX_SIZE_Y; aY++)
		{
			// left and right border columns (loop bound and left-column line reconstructed)
			iRectData.iData[aY*MAX_SIZE_X] = Area(0, 2*aY, 2, 2)>>2;
			iRectData.iData[MAX_SIZE_X-1 + aY*MAX_SIZE_X] = Area(2*MAX_SIZE_X-2, 2*aY, 2, 2)>>2;
		}

		for(aX = 1; aX < MAX_SIZE_X; aX++)
		{
			// top and bottom border rows (loop bound and top-row line reconstructed)
			iRectData.iData[aX] = Area(2*aX, 0, 2, 2)>>2;
			iRectData.iData[aX + (MAX_SIZE_Y-1)*MAX_SIZE_X] = Area(2*aX, 2*MAX_SIZE_Y-2, 2, 2)>>2;
		}

		for(aY = 1; aY < MAX_SIZE_Y-1; aY++)
			for(aX = 1; aX < MAX_SIZE_X-1; aX++)
			{
				// interior: 4x4 average (Area() arguments reconstructed)
				iRectData.iData[aX + aY*MAX_SIZE_X] = Area(2*aX-1, 2*aY-1, 4, 4)>>4;
			}
	}
	else
	{
		// no integral image: plain downsampling of the greyscale values
		if(V2RX == 2 && V2RY == 2)
		{
			for(int j =0; j < MAX_SIZE_Y; j++)
				for(int i =0; i < MAX_SIZE_X; i++)
				{
					// 2x2 average (reconstructed)
					iRectData.iData[i + j*MAX_SIZE_X] =
						(Raw2Grey(ptr[2*i   +  2*j   *VFWIDTH]) +
						 Raw2Grey(ptr[2*i+1 +  2*j   *VFWIDTH]) +
						 Raw2Grey(ptr[2*i   + (2*j+1)*VFWIDTH]) +
						 Raw2Grey(ptr[2*i+1 + (2*j+1)*VFWIDTH]))>>2;
				}
		}
		else
		{
			for(int j =0; j < MAX_SIZE_Y; j++)
				for(int i =0; i < MAX_SIZE_X; i++)
				{
					// arbitrary scale: point sampling (reconstructed)
					iRectData.iData[i + j*MAX_SIZE_X] = Raw2Grey(ptr[V2RX*i + V2RY*j*VFWIDTH]);
				}
		}
	}

	pFrame->UnlockHeap(ETrue); // unlock global heap
}
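The Area() helper is not shown in the post; it is just the usual integral-image box sum. Presumably something like this (the handling of the x = 0 / y = 0 border is my guess):

// Sum of the aW x aH pixel rectangle with top-left corner (aX, aY),
// computed with the standard four-corner formula over m_integral.
unsigned int CTracker::Area(TInt aX, TInt aY, TInt aW, TInt aH)
{
	TInt x0 = aX - 1;
	TInt y0 = aY - 1;
	TInt x1 = aX + aW - 1;
	TInt y1 = aY + aH - 1;

	unsigned int sum = m_integral[x1 + y1*VFWIDTH];
	if(x0 >= 0)            sum -= m_integral[x0 + y1*VFWIDTH];
	if(y0 >= 0)            sum -= m_integral[x1 + y0*VFWIDTH];
	if(x0 >= 0 && y0 >= 0) sum += m_integral[x0 + y0*VFWIDTH];
	return sum;
}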

The background can be rendered like this:

#define GLUNITY (1<<16)
static const TInt quadTextureCoords[4 * 2] =
{
0, GLUNITY,
0, 0,
GLUNITY, 0,
GLUNITY, GLUNITY
};

static const GLubyte quadTriangles[2 * 3] =
{
0,1,2,
0,2,3
};

static const GLfloat quadVertices0[4 * 3] =
{
0, 0, 0,
0, BKG_TXT_SIZEY0, 0,
BKG_TXT_SIZEY0, BKG_TXT_SIZEY0, 0,
BKG_TXT_SIZEY0, 0, 0
};

static const GLfloat quadVertices1[4 * 3] =
{
BKG_TXT_SIZEY0, 0, 0,
BKG_TXT_SIZEY0, BKG_TXT_SIZEY0, 0,
BKG_TXT_SIZEY0+BKG_TXT_SIZEY1, BKG_TXT_SIZEY0, 0,
BKG_TXT_SIZEY0+BKG_TXT_SIZEY1, 0, 0
};

void CGLengine::RenderBkgQuad()
{
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
glOrthof(0, VFWIDTH, 0, VFHEIGHT, -1.0, 1.0);
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
glViewport(0, 0, VFWIDTH, VFHEIGHT);

glClear( GL_DEPTH_BUFFER_BIT);
glDisable(GL_BLEND);
glDisable(GL_ALPHA_TEST);
glDisable(GL_DEPTH_TEST);
glDisable(GL_CULL_FACE);

glColor4x(GLUNITY, GLUNITY, GLUNITY, GLUNITY);

glBindTexture( GL_TEXTURE_2D, m_bkgTxtID0);
glVertexPointer( 3, GL_FLOAT, 0, quadVertices0 );
glTexCoordPointer( 2, GL_FIXED, 0, quadTextureCoords );
glDrawElements( GL_TRIANGLES, 2 * 3, GL_UNSIGNED_BYTE, quadTriangles );

glBindTexture( GL_TEXTURE_2D, m_bkgTxtID1);
glVertexPointer( 3, GL_FLOAT, 0, quadVertices1 );
glTexCoordPointer( 2, GL_FIXED, 0, quadTextureCoords );
glDrawElements( GL_TRIANGLES, 2 * 3, GL_UNSIGNED_BYTE, quadTriangles );

glEnable(GL_CULL_FACE);
glEnable(GL_BLEND);
glEnable(GL_DEPTH_TEST);
glEnable(GL_ALPHA_TEST);

}
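One thing referenced earlier but not shown is SetViewMatrix(); with the fixed-function pipeline it only has to load the tracker's pose into GL_MODELVIEW and set a perspective projection that matches the camera. A possible sketch, assuming iViewMatrix is a column-major 4×4 GLfloat array (the frustum numbers are placeholders – in practice they come from the camera calibration):

void CGLengine::SetViewMatrix(const GLfloat* aViewMatrix)
{
	// projection roughly matching the camera field of view (placeholder values)
	glMatrixMode(GL_PROJECTION);
	glLoadIdentity();
	glFrustumf(-0.66f, 0.66f, -0.5f, 0.5f, 1.0f, 1000.0f);

	// camera pose estimated by the marker tracker
	glMatrixMode(GL_MODELVIEW);
	glLoadMatrixf(aViewMatrix);
}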

27, July, 2009 Posted by | Coding AR | , , , , , , , , | Comments Off on Augmented reality on S60 – basics

Marker vs markerless (bundle adjustment)

#augmentedreality
Here is a sample of image registration with a fiduciary marker (actually the marker I used in my games) vs registration with bundle adjustment. Blue lines are point heights (relative to the marker plane) calculated using marker registration and triangulation. White lines are the same using (modified) bundle adjustment. Points were extracted with multiscale FAST and fitted with log-polar Fourier descriptors for correspondence (actually the SURF descriptor produces the same correspondences).
marker vs markerless
As you can see, markerless is in no way worse than markers, at least in this example ))).
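For reference, the triangulation part is nothing exotic: the marker gives the camera pose in each frame, so every matched feature defines two rays in the marker coordinate frame, and the point is placed where the rays come closest. A minimal midpoint-method sketch (generic code, not taken from the tracker):

struct Vec3 { float x, y, z; };

static Vec3 Sub(const Vec3& a, const Vec3& b)  { Vec3 r = {a.x-b.x, a.y-b.y, a.z-b.z}; return r; }
static Vec3 Add(const Vec3& a, const Vec3& b)  { Vec3 r = {a.x+b.x, a.y+b.y, a.z+b.z}; return r; }
static Vec3 Scale(const Vec3& a, float s)      { Vec3 r = {a.x*s, a.y*s, a.z*s}; return r; }
static float Dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// Two rays in the marker coordinate frame: camera centers c1/c2 and unit
// directions d1/d2 (camera pose known from the marker in each frame).
// Returns the midpoint of the shortest segment between the rays; its z
// component is the height above the marker plane shown in the picture.
Vec3 TriangulateMidpoint(const Vec3& c1, const Vec3& d1, const Vec3& c2, const Vec3& d2)
{
	Vec3 w0 = Sub(c1, c2);
	float a = Dot(d1, d1), b = Dot(d1, d2), c = Dot(d2, d2);
	float d = Dot(d1, w0), e = Dot(d2, w0);
	float den = a*c - b*b;                 // ~0 means nearly parallel rays
	float s = (b*e - c*d) / den;
	float t = (a*e - b*d) / den;
	return Scale(Add(Add(c1, Scale(d1, s)), Add(c2, Scale(d2, t))), 0.5f);
}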

23, July, 2009 Posted by | Coding AR | , , , , , , | 2 Comments

Computer vision accelerator in FPGA for smartphone

#augmentedreality
Tony Chun from Intel's integrated platform research group talks about a "methodology" for putting computer vision algorithms (or speech recognition) into hardware. He specifically mentions smartphones and mobile augmented reality. Tony suggests that this accelerator should be programmable, with some software language to make it flexible. It's not clear whether he is talking about an FPGA prototype or about putting an FPGA into a smartphone. The idea of using an FPGA chip for mobile CV tasks is not new; for example, in this LinkedIn discussion Stanislav Sinyagin suggested some specific hardware to play with.

Thanks to artimes.rouli.net for pointing this one out.

7, July, 2009 Posted by | Augmented Reality | , , , | 2 Comments

Augmented Reality on Android – now with NDK

With the release of the native code kit, Android now looks more like a functional AR platform. The NDK allows native C/C++ libraries, though a complete application still seems to need a Java wrapper. It's still not clear to me how accessible the video and OpenGL APIs are from the NDK – I'll have to look into it.
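In practice "native library plus Java wrapper" means exposing the C/C++ entry points through JNI and calling them from a thin Java class. A minimal sketch of what the native side could look like (class, package and method names are made up for illustration):

#include <jni.h>

// The Java side would declare, in a class com.example.artest.NativeTracker:
//   public static native long createTracker(int width, int height);
//   public static native int  processFrame(long tracker, byte[] frame);
// and call processFrame() from its camera preview callback.

extern "C" JNIEXPORT jlong JNICALL
Java_com_example_artest_NativeTracker_createTracker(JNIEnv*, jclass, jint aWidth, jint aHeight)
{
	// allocate whatever per-tracker state the native code needs (placeholder here)
	int* state = new int[2];
	state[0] = aWidth;
	state[1] = aHeight;
	return reinterpret_cast<jlong>(state);
}

extern "C" JNIEXPORT jint JNICALL
Java_com_example_artest_NativeTracker_processFrame(JNIEnv* env, jclass, jlong /*aTracker*/, jbyteArray aFrame)
{
	// get at the camera frame passed down from Java
	jbyte* data = env->GetByteArrayElements(aFrame, 0);
	// ... run feature detection / marker tracking on 'data' here ...
	env->ReleaseByteArrayElements(aFrame, data, JNI_ABORT); // nothing written back
	return 0; // e.g. number of markers found
}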
On a related note – there are rumors about a pretty powerful 1GHz phone for Android 2.0.

5, July, 2009 Posted by | Augmented Reality, Coding AR | , , , | Comments Off on Augmented Reality on Android – now with NDK

Air-fueled batteries

As I’d already written I think the battery life is the key to adoption of high-performance mobile devices, strong enough for advanced image processing and real-time augmented reality.
Here are some news – Technologyreview report it seems there are some advances in lithium-air batteries. Air-fueled batteries is something similar to fuel-air explosives Like FAE air-fueled batteries are not storing oxidizer in themselves, but use oxidizer from the air. AFB should allow ten times energy density of the common batteries.
Lithium AFB are developed by IBM, Hitachi and could use not only lithium but zinc and aluminium

28, June, 2009 Posted by | Mobile | , , | Comments Off on Air-fueled batteries

Nokia considering Maemo Linux as an alternative to Symbian?

As cnet points out, Symbian is not mentioned in the joint Intel-Nokia press release about 3G and open source software collaboration. Only Maemo and Moblin are mentioned. Symbian, though also open-sourced, is left out. It could be that Nokia is less enthusiastic about Symbian OS now. Existing Symbian OS UIs are inferior to the iPhone UI, Symbian OS third-party applications are not getting enough traction, and most Symbian users are not even aware they exist. Symbian Signed restrictions are not helping either. BTW, most Symbian users are not even aware they are Symbian users.
So Nokia seems to be hedging its bets with Maemo Linux. Cnet thinks Nokia could switch to Maemo for high-end devices and leave Symbian for the mid-range.

25, June, 2009 Posted by | Mobile, Symbian | , , , , , | 1 Comment

Augmented reality, enforced locality, geometric hashing

I had a discussion with Lester Madden in the LinkedIn MAR group. The thing we discussed was the concept of locality in AR – that is, each AR object should be attached to a specific location and be accessible only from that location.
I'll try to explain it more in depth here.
Augmented graffiti, augmented reality mail/drop boxes and billboards, user-built reality overlays – all of those should be attached to a specific location. This locality could be enforced – only local data would be available (filtered in) at the specific location. This locality of data prevents the user from drowning in the augmented noise generated all over the world, and reduces the possibility of spam.
For example, you could have a neighborhood billboard, leave a note for friends in the park, and so on. All this AR object data could be accessed only locally, for both read and write – to read a billboard or to post a message on it you would have to go to it.
The user should get the data/content only if he is physically present at the specific location. In the same way, the poster/producer of the data or AR object should physically visit each location where it is placed.
If locality is enforced, to place a note for your friend in the park you have to visit the park, and there is no way around it.
Locality could be enforced with location-based encryption. I think this encryption could be done using geometric hashing. The user scans the environment and makes a 3D registration with his mobile or wearable device. The encryption key is generated by the mobile device from the scanned 3D model of the environment.
If the user wants to get data attached to the location, he accesses the server, retrieves the local data and decrypts it with that key.
In the opposite direction, if the user wants to attach some object or data to a location, the mobile device encrypts the data with part of the hash key and sends the other part of the key to the server. Before storing the data the server does a uniqueness check: nearby data already stored on the server is checked, and the new data is allowed in only if there is some distance from the new key to the keys of all the other stored data. After that the new data is encrypted with the second part of the key by the server and stored.
So each object is encrypted with two keys, one of which is server-side. The server has no access to the content of the data, but has access to part of the location hash key. That way no two objects or pieces of data are attached to exactly the same location, and cluttering of AR objects could be reduced. More importantly, if the poster has to physically visit the location where he wants to place an AR object, he has at least some relation to that location and is not some spammer from the other end of the world.
If a spammer forges a location key without actually visiting the place, it will most probably correspond to a non-existent location, and no one will be hit by his data.
All of that is, of course, only a rough outline of how enforced locality could work. Building a robust algorithm for extracting a geometric hash could be non-trivial.
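To make the client side a bit more concrete, here is a toy sketch of the key-derivation step: quantize the scanned 3D points so that small registration errors still map to the same key, hash the result, and split the key into a device half and a server half. Everything here (the cell size, the FNV hash, the plain split) is for illustration only – a real system would need a locality-sensitive, viewpoint-invariant hash and a proper KDF and cipher:

#include <cstdint>
#include <vector>
#include <algorithm>

struct Point3 { float x, y, z; };

// Quantize a coordinate to a coarse grid so that small registration errors
// still fall into the same cell (the cell size is an arbitrary choice here).
static int64_t Quantize(float v, float cellSize = 0.25f)
{
	return static_cast<int64_t>(v / cellSize + (v >= 0 ? 0.5f : -0.5f));
}

// Toy "geometric hash": sort the quantized cells and run FNV-1a over them.
// A real scheme would also need rotation/translation invariance and
// robustness to missing points – this only illustrates key derivation.
uint64_t GeometricHash(const std::vector<Point3>& scan)
{
	std::vector<uint64_t> cells;
	for (size_t k = 0; k < scan.size(); k++)
	{
		uint64_t cx = static_cast<uint64_t>(Quantize(scan[k].x));
		uint64_t cy = static_cast<uint64_t>(Quantize(scan[k].y));
		uint64_t cz = static_cast<uint64_t>(Quantize(scan[k].z));
		cells.push_back((cx << 42) ^ (cy << 21) ^ cz);   // pack cell coordinates
	}
	std::sort(cells.begin(), cells.end());               // order-independent
	cells.erase(std::unique(cells.begin(), cells.end()), cells.end());

	uint64_t h = 1469598103934665603ull;                  // FNV-1a, 64-bit
	for (size_t k = 0; k < cells.size(); k++)
		for (int i = 0; i < 8; i++)
		{
			h ^= static_cast<uint8_t>(cells[k] >> (8 * i));
			h *= 1099511628211ull;
		}
	return h;
}

// Split into the two halves discussed above: one stays on the device and
// encrypts the content, the other goes to the server for the uniqueness check.
void SplitKey(uint64_t key, uint32_t& devicePart, uint32_t& serverPart)
{
	devicePart = static_cast<uint32_t>(key & 0xffffffffu);
	serverPart = static_cast<uint32_t>(key >> 32);
}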

1, May, 2009 Posted by | Augmented Reality | , , , , | 8 Comments

Why 3D markerless tracking is difficult for mobile augmented reality

I often hear from users that they don't like markers, and they wonder why there is relatively little markerless AR around. First I want to say that there is no excuse for using markers in a static scene with an immobile camera, or if a desktop computer is used. Brute-force methods for tracking, like bundle adjustment and the fundamental matrix, are well developed and have been used for years and years in computer vision and photogrammetry. However, those methods in their original form could hardly produce an acceptable frame rate on mobile devices. On the other hand, marker trackers on mobile devices can be made fast, stable and robust.
So why are markers easy and markerless is not?
The problem is the structure, or "shape", of the point cloud generated by the feature detector of the markerless tracker. The problem with structure is that the depth coordinate of the points is not easily calculated. It is even more difficult because camera frames taken from a mobile device have a narrow baseline – frames are taken from positions close to one another, so "stereo" depth perception is quite rough. This is called the structure from motion problem.
In the case of the marker tracker all feature points of the marker are on the same plane, which allows calculating the position of the camera (up to a constant scale factor) from a single frame. Essentially, if all the points produced by the detector are on the same plane – like, for example, points from pictures lying on a table – the structure from motion problem goes away. A planar cloud of points is essentially the same as a set of markers – for example, any four points could be considered a marker and the same algorithm could apply. The structure from motion problem is why there is no easy step from a "planar only" tracker to a real 3D markerless tracker.
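To see why the planar case is easy, here is the textbook single-frame pose computation: with the marker points at z = 0, the plane-to-image homography factors as H ~ K·[r1 r2 t]. The sketch below (plain arrays, not the SMMT code) assumes the homography has already been estimated, e.g. by DLT from four correspondences:

// Camera pose from a single view of a planar marker.
// H is the 3x3 plane-to-image homography (row-major), K the camera matrix.
// Output: rotation R (row-major) and translation t. A real tracker would also
// re-orthonormalize R and refine the pose by minimizing reprojection error,
// and may need to flip the sign of s so the marker ends up in front of the camera.
#include <cmath>

static void Invert3x3(const float m[9], float inv[9])
{
	float det = m[0]*(m[4]*m[8]-m[5]*m[7])
	          - m[1]*(m[3]*m[8]-m[5]*m[6])
	          + m[2]*(m[3]*m[7]-m[4]*m[6]);
	float id = 1.0f/det;
	inv[0] =  (m[4]*m[8]-m[5]*m[7])*id;
	inv[1] = -(m[1]*m[8]-m[2]*m[7])*id;
	inv[2] =  (m[1]*m[5]-m[2]*m[4])*id;
	inv[3] = -(m[3]*m[8]-m[5]*m[6])*id;
	inv[4] =  (m[0]*m[8]-m[2]*m[6])*id;
	inv[5] = -(m[0]*m[5]-m[2]*m[3])*id;
	inv[6] =  (m[3]*m[7]-m[4]*m[6])*id;
	inv[7] = -(m[0]*m[7]-m[1]*m[6])*id;
	inv[8] =  (m[0]*m[4]-m[1]*m[3])*id;
}

void PoseFromHomography(const float H[9], const float K[9], float R[9], float t[3])
{
	float invK[9];
	Invert3x3(K, invK);

	// columns of K^-1 * H  (H is row-major, so column j is H[j], H[3+j], H[6+j])
	float h[3][3];
	for(int j = 0; j < 3; j++)
		for(int i = 0; i < 3; i++)
			h[j][i] = invK[i*3+0]*H[j] + invK[i*3+1]*H[3+j] + invK[i*3+2]*H[6+j];

	// scale so that the first rotation column has unit length
	float s = 1.0f / std::sqrt(h[0][0]*h[0][0] + h[0][1]*h[0][1] + h[0][2]*h[0][2]);

	float r1[3] = { s*h[0][0], s*h[0][1], s*h[0][2] };
	float r2[3] = { s*h[1][0], s*h[1][1], s*h[1][2] };
	// third rotation column is r1 x r2
	float r3[3] = { r1[1]*r2[2]-r1[2]*r2[1],
	                r1[2]*r2[0]-r1[0]*r2[2],
	                r1[0]*r2[1]-r1[1]*r2[0] };

	for(int i = 0; i < 3; i++)
	{
		R[i*3+0] = r1[i];
		R[i*3+1] = r2[i];
		R[i*3+2] = r3[i];
		t[i]     = s*h[2][i];
	}
}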
However, not everything is so bad for a mobile markerless tracker. If the tracking environment is indoors, or a cityscape, there are a lot of rectangles, parallel lines and other planar structures around. Those could be used as an initial approximation for one of the structure from motion algorithms, and/or as substitutes for markers.
Another approach, of course, is to find some variation of a structure from motion method which is fast and works on mobile. Some variation of the bundle adjustment algorithm looks most promising to me.
PS The PTAM tracker, which has been ported to the iPhone, uses yet another approach – instead of running bundle adjustment for each frame, bundle adjustment runs asynchronously in a separate thread, and a simpler method is used for frame-to-frame tracking.
PPS And the last thing, from 2011:

30, March, 2009 Posted by | Coding AR | , , , , , , , , | 4 Comments