Monday, March 05, 2007

Clicking real world objects

If you’ve ever seen anyone approach a desktop computer for the first time, you appreciate what a terrific concept the mouse is. More often than not, newbies will look at the monitor, then look down to see how they can interact with what is on screen. They’ll focus on that cool-looking oval thingy, the mouse, and move it. The cursor moves with the mouse and the light bulb goes on immediately: “This oval thingy here allows me to move to the different objects on the screen.” Eventually they’ll click on those inviting buttons and figure out how to start programs. If you care about usability, it’s an exhilarating lesson in natural design.

Taking the Idea to Business Plan

So what if you could take this concept and let people click on real-world objects like billboards, cars, tables and clothes to trigger an interaction? Let’s say I’m walking down a street and happen to see a billboard for War of the Worlds. I see a picture of Tom Cruise and go “I absolutely need that jacket”. I “click” on the picture and find out where to buy it and how much it’ll cost. In fact, I might even be able to order it right there and have it shipped out.

Even if you aren’t impulsive about getting all kinds of information delivered to your phone, the underlying theme here is being able to connect the physical world with the cyber world: importing objects from your surroundings into the digital world so they can be processed there. The potential use cases are mind-boggling.

How would something like this work? There are three things that need to happen before you can click on a real world object: (1) you need a device that can perform the click, (2) the device needs to be able to recognize what it’s clicking on, and (3) it needs to show relevant information about the object to its owner.

Picking a device as a platform

Let’s break down (1) first. What’s the most ubiquitous device that you could find on a person today? If you answered “cell phone”, you did great. Wristwatches would also be a good answer but we can’t use those for what we are about to do. A lot of people carry a mobile at all times, so it’s a perfect device to pull out and click on something.

How do you actually perform a click? You use the camera that your mobile is likely to have and take a picture of the object (we’ll get to the details of that in a little bit). A mobile phone is a highly connected device. If it is in coverage, it can take the picture, extract some information from it and send that information via a Short Message Service (SMS) message to a server somewhere. The server figures out the relevant and related information, then sends it back to the mobile as an incoming SMS. SMS messages usually move quickly through the phone network, so seconds later you may find your SMS ringtone alerting you that the object you clicked on has been identified and everything you need to know about it is now at your fingertips.
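To make the round trip concrete, here is a minimal sketch of what the server side of a click might look like. Everything in it is an assumption for illustration: the catalog, the identifier format and the function name are hypothetical, and a real service would sit behind an SMS gateway rather than a plain function call. The one hard fact it leans on is that a single SMS carries at most 160 characters in the standard GSM alphabet, so the reply has to be short.

```python
from typing import Tuple

# Hypothetical catalog, keyed by whatever identifier the phone
# extracts from the picture. The key format is made up for this sketch.
CATALOG = {
    "billboard:war-of-the-worlds:jacket":
        "Jacket as seen on the billboard. Reply BUY to order.",
}

SMS_MAX_CHARS = 160  # single-message limit with the GSM 7-bit alphabet


def handle_click_sms(sender: str, body: str) -> Tuple[str, str]:
    """Return (recipient, reply_text) for an incoming "click" SMS."""
    object_id = body.strip().lower()
    info = CATALOG.get(object_id, "Sorry, we could not identify that object.")
    # Trim the reply so it still fits in one SMS.
    return sender, info[:SMS_MAX_CHARS]


if __name__ == "__main__":
    print(handle_click_sms("+15550100", "billboard:war-of-the-worlds:jacket"))
```

The reply goes back to the number the click came from, which is all the addressing the phone network needs.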

In other words, you can’t tell your guests you bought that Target furniture at Walter E. Smithe. Not if they have a cell phone with a camera on it that they can then use when you are not looking.

Making the "click" work

How does the “click” itself work? There are several technologies to simulate a click. In each case, your mobile must take a picture with its built-in camera.

The first technology we’ll look at is the qr-code. A qr-code is like a bar code, but two-dimensional and designed for very quick processing. Each qr-code encodes a short piece of data, typically an Internet address (a URL). A mobile scans the qr-code by taking its picture and decoding the embedded URL. This decoding requires software installed on the phone. As the technology becomes widespread (it is already in use in Japan, which leads the market in a lot of mobile innovation), your phone might come with this software preinstalled. It is important to understand that the mobile doesn’t have to know what object it just clicked on. Once it has the URL, it simply sends an SMS with the URL embedded in it to a server for processing. The server can then use a variety of protocols, including web services, to fetch the information about the object and send it back to the mobile in an SMS.
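As a rough sketch of the phone side, assume the reader software has already turned the picture into a plain text string. The code below only covers what happens next: check that the string looks like a URL and wrap it in an outgoing SMS. The short code, the function name and the dictionary shape are all hypothetical; a real handset would hand the message to its own messaging API.

```python
from urllib.parse import urlparse

SMS_MAX_CHARS = 160       # single-message limit with the GSM 7-bit alphabet
LOOKUP_NUMBER = "55555"   # hypothetical short code for the lookup server


def build_lookup_sms(decoded_payload: str):
    """Turn the text decoded from a qr-code into an outgoing SMS.

    Returns a dict with "to" and "body", or None if the payload is not
    a usable URL. The phone never needs to know what the object is.
    """
    url = decoded_payload.strip()
    parts = urlparse(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return None  # not a web address, nothing to look up
    if len(url) > SMS_MAX_CHARS:
        return None  # too long for one message in this simple sketch
    return {"to": LOOKUP_NUMBER, "body": url}


if __name__ == "__main__":
    print(build_lookup_sms("http://example.com/objects/1234"))
```

Notice that nothing here inspects the picture or the object. The URL is the only thing that travels.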

Alternate Technologies

There is also a set of technologies that can be attached to an object and used to send information to your mobile, but they all require special hardware on your phone. Some examples are RFID and Near Field Communication (NFC). These technologies require a small hardware tag, which could be as small as a postage stamp and not much thicker, attached to the object. On the other end, a hardware reader needs to be present to talk to the tag, and that reader needs to be built into your phone. The implication here is: your mobile today won’t work without hardware modifications.

Newer mobiles might well have RFID or NFC readers. RFID in particular has been generating a lot of buzz as a technology with endless applications. And once deployed, these tags can give us more information than a printed tag like a qr-code. But until phones with RFID or NFC readers become widespread, we’ll have to wait to use this technology.

Image Recognition Makes its Case

So let’s say qr-codes are our best option for the near future. You still need to go put the codes themselves on objects everywhere before they can be processed. This is a fairly daunting task that can take years to accomplish – after everyone agrees it is in their best interests. The workaround is to take a picture of the object and send it back to a server, where special image recognition software will identify the object and figure out the related information to be sent back to the user. No qr-code tags are necessary on the object.

This technology has some issues to be worked out. First, a picture cannot be sent as an SMS. You need a Multimedia Messaging Service (MMS) message to send an image from a phone. Current network latencies mean that the window for receiving an MMS message after it is sent is anywhere from a few seconds to a few hours. Most MMS messages arrive pretty quickly, but a short delivery window is not guaranteed. So if delivery takes long, the information may well have lost its value along with the urgency of the user’s attention.
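Some back-of-the-envelope arithmetic shows why a picture doesn’t fit in an SMS. A single SMS carries 140 bytes of user data. Messages can be chained together, but each segment then gives up 6 bytes to a concatenation header, and the chain tops out at 255 segments. The sketch below works out the segment count; the 100 KB picture size is an assumption for illustration, not a measurement.

```python
import math

SMS_PAYLOAD_BYTES = 140      # user data in one stand-alone SMS
SEGMENT_PAYLOAD_BYTES = 134  # 140 minus a 6-byte concatenation header
MAX_SEGMENTS = 255           # upper bound on a concatenated message


def sms_segments_needed(payload_bytes: int) -> int:
    """Number of SMS segments needed to carry payload_bytes of binary data."""
    if payload_bytes <= SMS_PAYLOAD_BYTES:
        return 1
    return math.ceil(payload_bytes / SEGMENT_PAYLOAD_BYTES)


def fits_in_sms(payload_bytes: int) -> bool:
    """True if the payload fits within one concatenated SMS."""
    return sms_segments_needed(payload_bytes) <= MAX_SEGMENTS


if __name__ == "__main__":
    url_bytes = len("http://example.com/objects/1234".encode("ascii"))
    picture_bytes = 100 * 1024  # assumed size of a camera phone picture
    print(sms_segments_needed(url_bytes), fits_in_sms(url_bytes))
    print(sms_segments_needed(picture_bytes), fits_in_sms(picture_bytes))
```

A URL fits in one message with room to spare. The assumed picture would need hundreds of segments, more than the chain allows, which is why the image route has to go over MMS.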

In addition, image recognition itself is a tricky problem to solve. While it has improved vastly over the years, recognizing a wide variety of objects photographed in real-world conditions, with different lighting and angles, is unproven at best.
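A toy example gives a feel for the lighting and angle problem. The sketch below uses average hashing, a very simple image fingerprint that I am swapping in for illustration; it is not what any of the companies mentioned in this post use, and the 2x2 “images” are made up. Each pixel becomes a 1 if it is brighter than the image’s average and a 0 otherwise, and two fingerprints are compared by counting the bits that differ.

```python
def average_hash(pixels):
    """Fingerprint a grayscale image given as a 2D list of 0-255 values."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if p > mean else 0 for p in flat)


def hamming(a, b):
    """Count the positions where two fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))


def rotate_clockwise(pixels):
    """Rotate a 2D list by 90 degrees clockwise."""
    return [list(row) for row in zip(*pixels[::-1])]


# Made-up 2x2 "photo" of an object, for illustration only.
logo = [[10, 200],
        [200, 10]]
brighter = [[p + 40 for p in row] for row in logo]  # same object, more light
turned = rotate_clockwise(logo)                     # same object, new angle

if __name__ == "__main__":
    reference = average_hash(logo)
    print(hamming(reference, average_hash(brighter)))
    print(hamming(reference, average_hash(turned)))
```

The evenly brightened copy produces the same fingerprint as the original, while the rotated copy differs in every bit. Real recognition software is far more sophisticated than this, but the sketch hints at why a change of angle is harder to absorb than a change of light.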

Here and Now

If there is money to be made, challenges can be addressed. And proximity marketing (the ability to market a product that is in a user’s immediate proximity) is such a holy grail that the money should be there in the short term. But as we’ve seen, the problems aren’t only technical. They are also social (acceptance) and political (adoption). Several companies are betting these barriers can be surmounted. Shotcodes, Smartpox, Scanbuy and Semacode make qr-code-style two-dimensional codes and the mobile readers to go with them. Daem Interactive recently showed its image recognition technology at 3GSM in Barcelona. Ontella and ScanR operate in the same space. MotionDSP has technology to enhance images taken on a mobile.

Image from Daem Interactive

1 comment:

streetstylz said...

qode® is a ground-breaking technology, developed and patented by Neomedia Technologies, which turns brand-names and barcodes into hyperlinks to the mobile Internet. Web-enabled handsets with the patented qode® software are able to take consumers - or enterprise users - direct to desired pages on the mobile Web, simply by clicking on a code with the handset's camera, or by entering keywords or product codes in a search-style window. The handset is now your mouse and the brand name or barcode is now your hyperlink.