SURFACE COMPUTING



ABSTRACT

The name Surface comes from "surface computing," and Microsoft envisions the coffee-table machine as the first of many such devices. Surface computing uses a blend of wireless protocols, special machine-readable tags and shape recognition to seamlessly merge the real and the virtual world — an idea the Milan team refers to as "blended reality." The table can be built with a variety of wireless transceivers, including Bluetooth, Wi-Fi and (eventually) radio frequency identification (RFID) and is designed to sync instantly with any device that touches its surface.

It supports multiple touch points – Microsoft says "dozens and dozens" – as well as multiple users simultaneously, so more than one person could be using it at once, or one person could be doing multiple tasks.

The term "surface" describes how it's used. There is no keyboard or mouse. All interactions with the computer are done via touching the surface of the computer's screen with hands or brushes, or via wireless interaction with devices such as smartphones, digital cameras or Microsoft's Zune music player. Because of the cameras, the device can also recognize physical objects; for instance credit cards or hotel "loyalty" cards.

For instance, a user could set a digital camera down on the tabletop and wirelessly transfer pictures into folders on Surface's hard drive. Setting a music player down would let a user drag songs from his or her home music collection directly into the player, or between two players, using a finger. A user could even transfer mapping information for a restaurant where a reservation was just made through a Surface tabletop over to a smartphone just before walking out the door.

Introduction

For years, engineers and computer technicians have looked for a better way for people to communicate with their computers. The keyboard, while feeling natural to many of us, has advanced very little beyond the typewriter, which has been around for well over a hundred years. The mouse is a step above that, but it still takes practice for someone who has never used one to become comfortable with it, and even after years of using a computer many older people still struggle with concepts such as double-clicking, right-clicking, dragging and dropping that seem simple to more advanced computer users.

Computing is usually defined as the activity of using and developing computer technology, computer hardware and software. It is the computer-specific part of information technology.

Surface computing, or Microsoft Surface (codename: Milan), is a multi-touch product from Microsoft, developed as a combined software and hardware technology that allows one or more users to manipulate digital content through natural motions, hand gestures, or physical objects.

Microsoft Surface is the first in a new category of surface computing products from Microsoft that will break down traditional barriers between people and technology. It represents the next generation of computer interfaces, built on multi-touch technology. Unlike most touch screens, a surface computer can respond to more than one touch at a time, without a keyboard or a mouse. The next generation of computer interfaces will be hands-on.

Over the past couple of years, a new class of interactive device has begun to emerge that can best be described as "surface computing". Two examples are discussed in this report:

The Surface tabletop

Perceptive Pixel

The Surface tabletop typically incorporates a rear-projection display coupled with an optical system that captures touch points by detecting shadows from below. Different approaches to the detection have been used, but most employ some form of IR illumination coupled with IR cameras. With today's camera and signal-processing capability, reliable, responsive and accurate multi-touch input can be achieved.
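To make the camera-based approach concrete, here is a minimal, self-contained C# sketch of the kind of image processing involved: it thresholds a single made-up infrared frame and finds connected bright regions, reporting each blob's centroid as a touch point. The frame size, threshold and coordinates are illustrative assumptions, not details of any shipping product.

```csharp
using System;
using System.Collections.Generic;

// Minimal sketch (not Microsoft's actual pipeline): given one grayscale frame from an
// infrared camera under the table, find bright "blobs" where fingers or objects reflect
// IR light back, and report their centroids as touch points.
class TouchDetectionSketch
{
    const int Width = 16, Height = 12;   // toy frame size; real cameras are far larger
    const byte Threshold = 128;          // brightness above this counts as a touch

    static void Main()
    {
        // Hypothetical frame: mostly dark, with two bright patches (two fingertips).
        byte[,] frame = new byte[Height, Width];
        foreach (var (y, x) in new[] { (3, 4), (3, 5), (4, 4), (9, 11), (9, 12) })
            frame[y, x] = 200;

        foreach (var (cx, cy, size) in FindBlobs(frame))
            Console.WriteLine($"Touch at ({cx:F1}, {cy:F1}), area {size} px");
    }

    // Simple connected-component labelling by flood fill over thresholded pixels.
    static IEnumerable<(double cx, double cy, int size)> FindBlobs(byte[,] frame)
    {
        var visited = new bool[Height, Width];
        for (int y = 0; y < Height; y++)
            for (int x = 0; x < Width; x++)
            {
                if (visited[y, x] || frame[y, x] < Threshold) continue;
                var stack = new Stack<(int, int)>();
                stack.Push((y, x));
                visited[y, x] = true;
                long sumX = 0, sumY = 0; int count = 0;
                while (stack.Count > 0)
                {
                    var (py, px) = stack.Pop();
                    sumX += px; sumY += py; count++;
                    foreach (var (ny, nx) in new[] { (py - 1, px), (py + 1, px), (py, px - 1), (py, px + 1) })
                    {
                        if (ny < 0 || ny >= Height || nx < 0 || nx >= Width) continue;
                        if (visited[ny, nx] || frame[ny, nx] < Threshold) continue;
                        visited[ny, nx] = true;
                        stack.Push((ny, nx));
                    }
                }
                yield return ((double)sumX / count, (double)sumY / count, count);
            }
    }
}
```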

The multitouch pioneer and his company, Perceptive Pixel, have devoted the better part of two years to building an entirely new multitouch framework from the ground up. Instead of simply mapping multitouch technology to familiar interfaces and devices, Han's goal is far more sweeping: To use the technology as a foundation for an entirely new operating system.

Because they are new to most, the tendency in seeing these systems is to assume that they are all more-or-less alike. Well, in a way that is true. But on the other hand, that is perhaps no more so than to say that all ICs are more-or-less alike, since they are black plastic things with feet like centipedes which contain a bunch of transistors and other stuff. In short, the more that you know, the more you can differentiate. But even looking at the two systems in the photo, there is evidence of really significant difference.

The really significant difference is that one is vertical and the other is horizontal. Why is this significant? Well, this is one of those questions perhaps best answered by a child in kindergarten. They will tell you that if you put a glass of water on the vertical one, it will fall to the floor, leading to a bout of sitting in the corner. On the other hand, it is perfectly safe to put things on a table. They will stay there.

Computing Curricula 2005 defines computing as follows:

In a general way, we can define computing to mean any goal-oriented activity requiring, benefiting from or creating computers. Thus, computing includes designing and building hardware and software systems for a wide range of purposes; processing, structuring and managing various kinds of information; doing scientific studies using computers; making computer systems behave intelligently; creating and using communications and entertainment media; finding and gathering information relevant to any particular purpose, and so on. The list is virtually endless, and the possibilities are vast.

Surface Computing

Surface computing is a new way of working with computers that moves beyond the traditional mouse-and-keyboard experience. It is a natural user interface that allows people to interact with digital content the same way they have interacted with everyday items such as photos, paintbrushes and music their entire lives: with their hands, with gestures and by putting real-world objects on the surface. Surface computing opens up a whole new category of products for users to interact with.

Surface computing is the term for the use of a specialized computer GUI in which traditional GUI elements are replaced by intuitive, everyday objects. Instead of a keyboard and mouse, the user interacts directly with a touch-sensitive screen. It has been said that this more closely replicates the familiar hands-on experience of everyday object manipulation.

Early work in this area was done at the University of Toronto, Alias Research and MIT. Surface work has included customized solutions from vendors such as GestureTek, Applied Minds for Northrop Grumman, and SmartSurface. Major computer-vendor platforms are in various stages of release: the iTable by PQLabs, Linux MPX and Microsoft Surface.

Surface computing is slowly starting to catch on and is beginning to be used in real-world applications. Here is just a sample of where surface computing technologies have been used.

The Microsoft Surface is starting to pick up popularity and has been used in various places and venues. AT&T became the first retailer to use Surface to help its customers purchase phones: customers could place the phones on the Surface and receive full phone specifications as well as pricing. It has also been used in a wide variety of locations, including hotel lobbies such as Sheraton Hotels, as well as venues such as Super Bowl XLIII, where it helped police organize and monitor the event in great detail. It is also starting to gain use in the broadcasting industry and was used by MSNBC during the 2008 US presidential elections. However, at USD 15,500 (device only), it is still considered expensive for most businesses.

There are other new surface computing applications that are still being developed, one of which is from the MIT Media Lab where students are developing wearable computing systems that can be used on almost any surface. The name of this device is SixthSense.

Surface computing is a completely intuitive and liberating way to interact with digital content. It blurs the lines between the physical and virtual worlds. By using your hands or placing other unique everyday objects on the surface – such as an item you’re going to purchase at a retail store or a paint brush – you can interact with, share and collaborate like you’ve never done before. Imagine you’re out at a restaurant with friends and you each place your beverage on the table – and all kinds of information appears by your glass, such as wine pairings with a restaurant’s menu. Then, with the flick of your finger, you order dessert and split the bill. We really see this as broadening content opportunities and delivery systems.

Surface computing is a powerful movement. In fact, it’s as significant as the move from DOS [Disk Operating System] to GUI [Graphic User Interface]. Our research shows that many people are intimidated and isolated by today’s technology. Many features available in mobile phones, PCs and other electronic devices like digital cameras aren’t even used because the technology is intimidating. Surface computing breaks down those traditional barriers to technology so that people can interact with all kinds of digital content in a more intuitive, engaging and efficient manner. It’s about technology adapting to the user, rather than the user adapting to the technology. Bringing this kind of natural user interface innovation to the computing space is what Surface Computing is all about.

History of Surface Computing

Surface computing is a major advancement that moves beyond the traditional user interface to a more natural way of interacting with digital content. Microsoft Surface™, Microsoft Corp.'s first commercially available surface computer, breaks down the traditional barriers between people and technology to provide effortless interaction with all forms of digital content through natural gestures, touch and physical objects instead of a mouse and keyboard. People will be able to interact with Surface in select restaurants, hotels, retail establishments and public entertainment venues.

In 2001, Stevie Bathiche of Microsoft Hardware and Andy Wilson of Microsoft Research began working together on various projects that took advantage of their complementary expertise in the areas of hardware and software. In one of their regular brainstorm sessions, they started talking about an idea for an interactive table that could understand the manipulation of physical pieces. Although there were related efforts happening in academia, Bathiche and Wilson saw the need for a product where the interaction was richer and more intuitive, and at the same time practical for everyone to use.

This conversation was the beginning of an idea that would later result in the development of Surface, and over the course of the following year, various people at Microsoft involved in developing new product concepts, including the gaming-specific PlayTable, continued to think through the possibilities and feasibility of the project. Then in October 2001 a virtual team was formed to fully pursue bringing the idea to the next stage of development; Bathiche and Wilson were key members of the team.

In early 2003, the team presented the idea to Bill Gates, Microsoft chairman, in a group review. Gates instantly liked the idea and encouraged the team to continue to develop their thinking. The virtual team expanded, and within a month, through constant discussion and brainstorming, the first humble prototype was born and nicknamed T1. The model was based on an IKEA table with a hole cut in the top and a sheet of architect vellum used as a diffuser. The evolution of Surface had begun. A variety of early applications were also built, including pinball, a photo browser and a video puzzle. As more applications were developed, the team saw the value of the surface computer beyond simply gaming and began to favor those applications that took advantage of the unique ability of Surface to recognize physical objects placed on the table. The team was also beginning to realize that surface computing could be applied to a number of different embodiments and form factors. Over the next year, the team grew significantly, including the addition of Nigel Keam, initially software development lead and later architect for Surface, who was part of the development team eventually tasked with taking the product from prototype to a shipping product. Surface prototypes, functionality and applications were continually refined. More than 85 early prototypes were built for use by software developers, hardware developers and user researchers.

One of the key attributes of Surface is object recognition and the ability of objects placed on the surface to trigger different types of digital responses, including the transfer of digital content. This feature went through numerous rounds of testing and refining. The team explored various tag formats of all shapes and sizes before landing on the domino tag (used today) which is an 8-bit, three-quarter-inch-square tag that is optimal thanks to its small size. At the same time, the original plan of using a single camera in the vision system was proving to be unreliable. After exploring a variety of options, including camera placement and different camera lens sizes, it was decided that Surface would use five cameras that would more accurately detect natural movements and gestures from the surface.

TIMELINE

The technology behind Surface is called multi-touch and has at least a 25-year history, beginning in 1982 with pioneering work at the University of Toronto (multi-touch tables) and Bell Labs (multi-touch screens). The product idea for Surface was initially conceptualized in 2001 by Steven Bathiche of Microsoft Hardware and Andy Wilson of Microsoft Research.

In October 2001, a virtual team was formed with Bathiche and Wilson as key members, to bring the idea to the next stage of development.

In 2003, the team presented the idea to Microsoft chairman Bill Gates in a group review. Later, the virtual team was expanded and a prototype nicknamed T1 was produced within a month.

The prototype was based on an IKEA Table with a hole cut in the top and a sheet of architect vellum used as a diffuser. The team also developed some applications, including pinball, a photo browser and a video puzzle.

Over the next year, Microsoft built more than 85 early prototypes for Surface. The final hardware design was completed in 2005.

A similar concept was used in the 2002 science-fiction movie Minority Report. As noted in the DVD commentary, director Steven Spielberg stated that the concept of the device came from consultation with Microsoft during the making of the movie. One of the film's technology consultant's associates from MIT later joined Microsoft to work on the Surface project.

Surface was unveiled by Microsoft CEO Steve Ballmer on May 30, 2007 at The Wall Street Journal's 'D: All Things Digital' conference in Carlsbad, California. Surface computing is part of Microsoft's Productivity and Extended Consumer Experiences Group, which is within the Entertainment and Devices Division.

The first few companies to deploy Surface will include Harrah’s Entertainment, Starwood Hotels and Resorts Worldwide, T-Mobile and a distributor, International Game Technology.

On April 17, 2008, AT&T became the first retailer to launch Surface. In June 2008, Harrah's Entertainment launched Microsoft Surface at the Rio iBar, and Disneyland launched it in Tomorrowland's Innoventions Dream Home.

On August 13, 2008 Sheraton Hotels introduced Surface in hotel lobbies at 5 locations.

Hardware Design

By late 2004, the software development platform of Surface was well-established and attention turned to the form factor. A number of different experimental prototypes were built including “the tub” model, which was encased in a rounded plastic shell,

[Figure: the "tub" model]

a desk-height model with a square top and cloth-covered sides, and even a bar-height model that could be used while standing. After extensive testing and user research, the final hardware design (seen today) was finalized in 2005. Also in 2005, Wilson and Bathiche introduced the concept of surface computing in a paper for Gates’ twice-yearly “Think Week,” a time Gates takes to evaluate new ideas and technologies for the company.

From Prototype to Product

The next phase of the development of Surface focused on continuing the journey from concept to product. Although much of what would later ship as Surface was determined, there was significant work to be done to develop a market-ready product that could be scaled to mass production.

[Figure: the T1 prototype]

In early 2006, Pete Thompson joined the group as general manager, tasked with driving end-to-end business and growing development and marketing. Under his leadership, the group has grown to more than 100 employees. Today Surface has become the market-ready product once only envisioned by the group, a 30-inch display in a table like form factor that’s easy for individuals or small groups to use collaboratively.

The sleek, translucent surface lets people engage with Surface using touch, natural hand gestures and physical objects placed on the surface. Years in the making, Microsoft Surface is now poised to transform the way people shop, dine, entertain and live. This is a radically different user-interface experience from anything before, and it's really a testament to the innovation that comes from marrying brilliance and creativity.

Key attributes of Surface Computing

Surface computing features four key attributes:

Direct interaction: Users can actually "grab" digital information with their hands and interact with content through touch and gesture, without the use of a mouse or keyboard. Customers will benefit from Microsoft Surface instantly. Interacting with content is natural, simple, intuitive and fun.

Multi-touch contact: Surface computing recognizes many points of contact simultaneously, not just from one finger as with a typical touch screen, but up to dozens and dozens of items at once.

Multi-user experience: The 30-inch diagonal display and the horizontal form factor make it easy for several people to gather around surface computers together, providing a collaborative, face-to-face computing experience.

Object recognition: Users can place physical objects on the surface to trigger different types of digital responses, including the transfer of digital content.

MULTI-TOUCH

Multi-touch is an enhancement to touchscreen technology, which provides the user with the ability to apply multiple finger gestures simultaneously onto the electronic visual display to send complex commands to the device.

[Figure: a multi-touch screen]

Multi-touch has been implemented in several different ways, depending on the size and type of interface. Both touch tables and touch walls project an image through acrylic or glass, and then backlight the image with LEDs. When a finger or an object touches the surface, the light scatters, and the reflection is caught by sensors or cameras that send the data to software, which determines the response to the touch depending on the type of reflection measured. Touch surfaces can also be made pressure-sensitive by adding a pressure-sensitive coating that flexes differently depending on how firmly it is pressed, altering the reflection.

Handheld technologies use a panel that carries an electrical charge. When a finger touches the screen, the touch disrupts the panel's electrical field. The disruption is registered and sent to the software, which then initiates a response to the gesture.

In the past few years, several companies have released products that use multitouch. In an attempt to make the expensive technology more accessible, hobbyists have also published methods of constructing DIY touchscreens.

History

The use of touch technology to control electronic devices predates the personal computer. Early synthesizer and electronic-instrument builders like Hugh Le Caine and Bob Moog experimented with touch-sensitive capacitance sensors to control the sounds made by their instruments. IBM began building the first touch screens in the late 1960s, and in 1972, Control Data released the PLATO IV computer, a terminal used for educational purposes that employed single-touch points in a 16x16 array as its user interface.

Multi-touch technology began in 1982, when the University of Toronto's Input Research Group developed the first human-input multi-touch system. The system used a frosted-glass panel with a camera placed behind the glass. When a finger or several fingers pressed on the glass, the camera would detect the action as one or more black spots on an otherwise white background, allowing it to be registered as an input. Since the size of a dot was dependent on pressure (how hard the person was pressing on the glass), the system was somewhat pressure-sensitive as well.

In 1983, Bell Labs at Murray Hill published a comprehensive discussion of touch-screen-based interfaces. In 1984, Bell Labs engineered a touch screen that could change images with more than one hand. In 1985, the University of Toronto group, including Bill Buxton, developed a multi-touch tablet that used capacitance rather than bulky camera-based optical sensing systems.

A breakthrough occurred in 1991, when Pierre Wellner published a paper on his multi-touch “Digital Desk”, which supported multi-finger and pinching motions.

Various companies expanded upon these inventions at the beginning of the twenty-first century. Mainstream exposure to multi-touch technology came in 2007, when the iPhone gained popularity, with Apple stating it had "invented multi-touch" as part of the iPhone announcement.

Microsoft followed with the unveiling of its Microsoft Surface table-top touch platform. Small-scale touch devices are rapidly becoming commonplace, with the number of touch-screen telephones expected to increase from 200,000 shipped in 2006 to 21 million in 2012. More robust and customizable multi-touch and gesture-based solutions are beginning to become available, with interfaces that register multiple touch points and gestures. Recently, Displax unveiled a new approach to multi-touch that also detects airflow movement. According to Daniel Wigdor, a user experience architect for Microsoft who focuses on multi-touch and gestural computing, "If Displax can do this for larger displays, it will really be one of the first companies to do what we call massive multitouch (...) If you look at existing commercial technology for large touch displays, they use infrared camera that can sense only two to four points of contact. Displax takes us to the next step."

Major brands and manufacturers

Many companies in recent years have expanded into multitouch, with systems designed for everything from the casual user to multinational organizations.

Laptop manufacturers have begun to include multi-touch trackpads on their laptops, as well as constructing tablet PCs that respond to touch input rather than traditional stylus input.

In the wake of the iPhone, several mobile phone manufacturers have begun to replace traditional push-button interfaces with multitouch interfaces on their handheld devices as well. So far, such innovations are mostly restricted to the higher-end smartphones used for web browsing and computing in addition to phone-based functions.

A few companies are focusing on large-scale surface computing rather than personal electronics, either large multitouch tables or wall surfaces. These systems carry a hefty price tag and are generally used by government organizations, museums, and companies as a means of information or exhibit display.

Apple Inc. lists "Multi-Touch" on its page of trademarks; however, this was only added some time after October 2007, and Apple was awarded a patent covering multi-touch on 20 January 2009.

Companies that manufacture multitouch devices

HCI — Multi-Touch Table, Multi-Touch Wall, Multi-Touch Screen, Multi-Touch Frame, Multi-Touch Company

3M — M2256PW with ten-finger support.

Acer — Acer Aspire 1820PT & 5738PG.

Apple — iPhone, iPad, iPod Touch, MacBook, MacBook Air, MacBook Pro, Magic Mouse.

Asus — EEE PC T91MT & T101MT.

Circle Twelve — DiamondTouch.

Dell — Latitude XT & XT2, Mini 5, Studio 17.

Google — Nexus One.

Hewlett-Packard — HP Touchsmart, HP Slate PC.

HTC — HTC Hero, HTC HD2, HTC Legend, HTC Desire.

Ideum — MT-50 Multitouch Table.

Lenovo — X200 & T400, Ideapad S10 3T.

LG Electronics — Arena, BL40 New Chocolate.

Microsoft — Surface, Zune HD.

Mindstorm — iBar, Aurora, Vortex, Eclipse.

Motorola — Droid.

MULTIVISION — Multi-Touch LCD — up to 32 fingers.

Nortd — TouchKit.

Palm — Pre, Pixi.

Perceptive Pixel — Multi-Touch Collaboration Wall.

Shuttle Inc. — Multi-touch LCD X50v2.

Sony — VAIO L Series All-in-one desktops.

TouchTable — TOUCHTABLE TT45 & TOUCHTABLE TT84.

Wacom — Bamboo tablets.

Displax — 16 fingers, also airflow detection.

LamasaTech — Multitouch Multi-user platform for restaurants and bespoke API for practical application.

Software

Many recent operating systems support multitouch, including Mac OS X, Windows 7, Windows Vista, Windows XP Tablet PC Edition and Ubuntu (since version 7.10), Apple's iPhone OS, Google's Android, Palm's webOS and Xandros.

Popular culture references

Pop culture has also portrayed potential uses of multi-touch technology in the future, including several installments of the Star Trek franchise.

The television series CSI: Miami introduced both surface and wall multi-touch displays in its sixth season. Another television series, NCIS: Los Angeles, makes use of multi-touch surfaces and wall panels as part of an initiative to go digital. Another form of multi-touch computer was seen in the motion picture The Island, where the professor, played by Sean Bean, has a multi-touch desktop to organize files, based on an early version of Microsoft Surface. Multi-touch technology can also be seen in the James Bond film Quantum of Solace, where MI6 uses a touch interface to browse information about the criminal Dominic Greene. In a parodic episode of the popular TV series The Simpsons, when Lisa Simpson travels to the underwater headquarters of Mapple to visit Steve Jobs, the erstwhile pretender to the throne of Mapple is shown performing multiple multi-touch hand gestures on a large touch wall.

A device similar to the Surface was seen in the 1982 movie Tron. It took up an executive's entire desk and was used to communicate with the Master Control computer. The interface used to control the alien ship in the movie District 9 features similar technology.

Microsoft's Surface was also used in the movie The Day the Earth Stood Still (2008).

OBJECT RECOGNITION

The object recognition feature of Microsoft Surface is the first of its kind. Most touch screens depend on electrical resistance or heat, so they cannot recognize objects. But since Surface is based purely on touch as seen by cameras (a natural user interface, or NUI), it can recognize not only human touch but objects as well.

In fact, object recognition works almost exactly the same way as touch recognition. In a game with bouncing balls, a ball would bounce off a camera placed on the table just as it would off a finger or hand. The nice thing about this is that it helps merge technology with the real world. That means that in the paint application you can use a real paintbrush rather than your finger and have the same effect, and in the air-hockey application the puck and goalie mallets from any other table work perfectly.

But when it comes to Microsoft Surface, there is object recognition and there is object recognition. What I mean is that Surface can do more than just say “hey, there is an object on me”.

Microsoft Surface can also recognize specific objects, identify what they are, and interact with them. In order for Surface to recognize what an object is, you have to put a tag on it. These are called byte tags, and they look a lot like a domino:

[Figure: a domino-style byte tag]

When an object with a tag is placed on the Surface, the relation between the tag and the object is recognized, and from there on out, it is recognized as that object. So if you put a camera down with a tag in relation to that camera, the Surface recognizes the object as a camera.

This has no use in simple things such as a paintbrush or airhockey puck/goalie mallets as mentioned before. This is used when sharing data between digital electronics and Surface, such as cameras, mp3 players, and cell phones.

But you need more than just a tag to do that. For the data transfers, Microsoft Surface uses Bluetooth 2.0. Once you place your camera (which has a tag on it) and it is recognized as a camera, Bluetooth 2.0 downloads all of the images onto Surface and creates a spill-out effect. It's as simple as that, but at the same time very advanced.
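As an illustration of the tag idea only (the actual domino-tag layout, encoding and error checking are not documented in this report), the following C# sketch shows how a small pattern of dots sampled from the camera image could be turned into an 8-bit identifier and then looked up as a registered object. The dot pattern and registry entries are invented.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch only: turn eight dot samples (read from the infrared image at
// known offsets inside the tag) into an 8-bit value, then map it to a known object.
class ByteTagSketch
{
    static void Main()
    {
        // Hypothetical samples: true = a dot is present (bright in the IR image).
        bool[] dots = { false, true, false, true, false, true, false, false };

        byte id = 0;
        for (int i = 0; i < dots.Length; i++)
            if (dots[i]) id = (byte)(id | (1 << i));   // one bit per dot position

        // Applications register which tag values correspond to which objects.
        var registry = new Dictionary<byte, string> { [42] = "digital camera", [7] = "loyalty card" };

        Console.WriteLine(registry.TryGetValue(id, out string obj)
            ? $"Tag {id} recognized: {obj}"
            : $"Tag {id} is not registered");
    }
}
```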

Technology behind Surface Computing

Microsoft Surface uses cameras to sense objects, hand gestures and touch. This user input is then processed and displayed using rear projection. Specifically:

Microsoft Surface uses a rear projection system which displays an image onto the underside of a thin diffuser.

Objects such as fingers are visible through the diffuser to a series of infrared-sensitive cameras positioned underneath the display.

An image processing system processes the camera images to detect fingers, custom tags and other objects such as paint brushes when touching the display.

The objects recognized with this system are reported to applications running in the computer so that they can react to object shapes, 2D tags, movement and touch.
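The last step, reporting recognized contacts to applications, can be pictured with a small hedged C# sketch: a vision component publishes simple "contact" records (finger, tag or object, with position and orientation) through an event that application code subscribes to. The types, names and values here are invented for illustration and are not the Surface SDK.

```csharp
using System;

// Hedged sketch (not the actual Surface SDK): the vision system publishes recognized
// contacts, and application code subscribes and reacts to touches, tags and objects.
enum ContactKind { Finger, Tag, Object }

record Contact(ContactKind Kind, double X, double Y, double OrientationDegrees, byte? TagValue);

class VisionSystemSketch
{
    public event Action<Contact> ContactDown;   // raised once per newly detected contact

    public void Simulate()
    {
        // Pretend the cameras just saw a finger and a tagged object on the table.
        ContactDown?.Invoke(new Contact(ContactKind.Finger, 312, 540, 0, null));
        ContactDown?.Invoke(new Contact(ContactKind.Tag, 700, 210, 35, (byte)42));
    }

    static void Main()
    {
        var vision = new VisionSystemSketch();
        vision.ContactDown += c => Console.WriteLine(
            c.Kind == ContactKind.Tag
                ? $"Tagged object {c.TagValue} at ({c.X}, {c.Y}), rotated {c.OrientationDegrees} degrees"
                : $"{c.Kind} at ({c.X}, {c.Y})");
        vision.Simulate();
    }
}
```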

One of the key components of surface computing is a "multitouch" screen. It is an idea that has been floating around the research community since the 1980s and is swiftly becoming a hip new product interface — Apple's new iPhone has multitouch scrolling and picture manipulation. Multitouch devices accept input from multiple fingers and multiple users simultaneously, allowing for complex gestures, including grabbing, stretching, swiveling and sliding virtual objects across the table. And the Surface has the added advantage of a horizontal screen, so several people can gather around and use it together. Its interface is the exact opposite of the personal computer: cooperative, hands-on, and designed for public spaces.

What Options are Available?

Many companies have begun developing some type of surface computing. Some, like Microsoft, turn customized furniture (i.e., tabletops or bars) into interactive surfaces, while others, such as GestureTek, design their systems to work with pre-existing structures like walls and floors. While the following does not detail all of the surface computing solutions on the market today, it does provide an overview of the major players.

Microsoft Surface – Microsoft Surface is arguably the best known surface computing solution in market today. Surface is a table-top only, multi-touch display that uses cameras (within the tables) and rear-projection to provide interactivity through natural gestures, touch, and physical objects.

Laser Touch (Microsoft) – Laser Touch is a low-cost solution that can transform any display (monitor, projector, etc.) into a touch screen. The biggest difference from Surface, aside from price, is Laser Touch’s ability to be used on multiple displays, not just tables. Unfortunately, there are no plans to commercialize it.

GestureTek – GestureTek's solutions include interactive displays for any surface (tables, floors and walls), as well as virtual gaming and interactive signage. GestureTek also uses its solutions for industry-based specialties (i.e., health and mobile), enough that it has created separate divisions within the company for these two. It has enough solutions to offer a custom toolkit to potential clients, including components of its different solutions or whole solutions themselves.

Perceptive Pixel – Perceptive Pixel was founded by Jeff Han, considered by many to be the revolutionary mind behind multi-touch displays. Han has developed large-scale, multi-touch displays for corporations and the government, and he is also rumored to be the mind behind the iPhone's multi-touch display. Perceptive Pixel specializes in giant, wall-sized touch screens that support multiple inputs. These displays were used on CNN during the 2008 election season.

Diamond Touch (Mitsubishi) – Diamond Touch is a table-top-only, multi-touch display that supports small-group collaboration. Diamond Touch was specifically intended for in-office business use. Its unique technology uses antennas instead of cameras.

Smart Table – Smart Table is a table-top only, multi-touch display intended for child education.

Catchyoo – Catchyoo provides interactive solutions for floors, walls and tables. Its solutions are designed for large system deployments and include worldwide network capabilities. These networks are similar to comprehensive digital signage networks, with features like content management, real-time administration and scheduling.

Reactrix – Reactrix’s solutions are more sophisticated than Catchyoo’s, but almost identical. According to MediaWeek, as of October 2008, Reactrix is up for sale and is in discussions with potential buyers.

Sensacell – Sensacell is an interactive floor system comprised of different "modules" that can form any shape of any size (up to thousands of square feet). Once a user is within six feet of the modules or steps on them, sensors detect the proximity or pressure and react by illuminating.

Microsoft Surface Overview

Microsoft Surface turns an ordinary tabletop into a vibrant, interactive computing experience. The product provides effortless interaction with digital content through natural gestures, touch and physical objects. In essence, it's a surface that comes to life for exploring, learning, sharing, creating, buying and much more. Currently available in select restaurants, hotels, retail establishments and public entertainment venues, this experience will transform the way people shop, dine, entertain and live.

Microsoft Surface is a touch-based graphical user interface. Using specialized hardware designed to replace the keyboard and mouse used in typical computing applications, Surface enables a level of interaction previously unattainable with conventional hardware. The system is composed of a horizontal touchscreen under a coffee table-like surface, with cameras mounted below to detect user interaction activities. All interface components such as dialogs, mouse pointer, and windows, are replaced with circles and rectangles outlining "objects" that are manipulated via drag and drop.

The "objects" in question can be either virtual objects displayed on the screen, or physical objects such as cellphones, digital cameras, and PDAs placed on the screen. Physical objects are automatically identified and connected to the Surface computer upon their placement on the screen. With no interface text, the Surface computer can be used by speakers of any language and any competency level.

Surface's main feature is the apparent simplicity with which common computing tasks can be performed. Most operations are performed without dialogs or wizards. For instance, pictures in a digital camera placed on the surface are automatically downloaded to the device and displayed on the screen. Transferring those pictures to another device, such as a compatible cellphone, simply requires the user to place the cellphone on the surface and drag the pictures in its direction. The potential security implications of this type of interaction are obvious, and Microsoft's solutions to the issue are vague at best. Devices are identified by a one-byte "domino" tag on their sides, which is easily forged with a pencil.
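One plausible way to resolve "drag the pictures in its direction", sketched here as an assumption rather than Microsoft's documented behavior: every tagged device on the table has a known position, and a released photo is routed to the nearest device within a cutoff distance. All names, positions and the cutoff are illustrative.

```csharp
using System;
using System.Collections.Generic;

// Hedged sketch of drop routing: pick the nearest recognized device to the drop point,
// if any is close enough, and hand the photo to it (e.g., over Bluetooth).
class DropRoutingSketch
{
    static void Main()
    {
        // Devices recognized on the table (name -> centre position in pixels).
        var devices = new Dictionary<string, (double X, double Y)>
        {
            ["camera"] = (200, 150),
            ["phone"] = (820, 600),
        };

        var drop = (X: 790.0, Y: 580.0);   // where the user released the photo
        const double MaxDistance = 100;    // only count drops reasonably close to a device

        string target = null;
        double best = double.MaxValue;
        foreach (var kv in devices)
        {
            double dx = kv.Value.X - drop.X, dy = kv.Value.Y - drop.Y;
            double d = Math.Sqrt(dx * dx + dy * dy);
            if (d < best && d <= MaxDistance) { best = d; target = kv.Key; }
        }

        Console.WriteLine(target == null
            ? "No device close enough; keep the photo on the table."
            : $"Send the photo to the {target} over Bluetooth.");
    }
}
```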

Although the underlying Bluetooth and Wi-Fi technologies are considered safe for the transfer of the data itself, the ease with which documents can be accidentally or maliciously copied is alarming. This is typical of Microsoft products, which generally sacrifice security for convenience and simplicity of use.

The technology behind Microsoft Surface has been under heavy development for over five years. Microsoft installed a team of researchers in an unofficial building outside its Redmond headquarters, guarded in secrecy with no direct support from other Microsoft entities. Although the pre-production Surface uses the latest Microsoft operating system, Vista, the hardware involved is fairly close to the minimum required by that OS. An Intel dual-core processor backed by 2 GB of RAM forms the base system, and a modest 256 MB video card provides the graphics-processing power. Five video cameras operating in the infrared spectrum detect objects and hand gestures at the screen's surface. The 30-inch screen runs at a nominal 1024 by 768 resolution, easily covered by the camera array. Clearly, the Surface's interface innovations were designed with standard hardware in mind, a fact that may help lower its price and promote its adoption.

Surface is a 30‐inch display in a table‐like form factor that’s easy for individuals or small groups to interact with in a way that feels familiar, just like in the real world. Surface can simultaneously recognize dozens and dozens of movements such as touch, gestures and actual unique objects that have identification tags similar to bar codes.

Surface computing breaks down traditional barriers between people and technology, changing the way people interact with all kinds of everyday content, from photos to maps to menus. The intuitive user interface works without a traditional mouse or keyboard, allowing people to interact with content and information by using their hands and natural movements. Users are able to access information either on their own or collaboratively with their friends and families, unlike any experience available today.


TECHNICAL SPECIFICATIONS

Display

Type: 30-inch XGA DLP® projector
ATI X1650 graphics card with 256 MB of memory
Maximum resolution: 1024 x 768
Lamp mean-life expectancy: 6,000+ hours
Maximum pressure on the display: 50 pounds per square inch (3.5 kg per square centimeter)
Maximum load: 200 pounds

Input Devices

Camera-based vision system with LED infrared direct illumination

Computing System

2.13-GHz Intel® Core™ 2 Duo processor
Memory: 2 GB dual-channel DDR2
Storage: Minimum 250 GB SATA hard-disk drive

Audio

Output type: Stereo flat-panel built-in speakers
Output compliant standards: Stereo
Input: None

Network Protocols and Standards

Network adapter: Intel Gb LAN
Wireless LAN connectivity supported: Yes
Networking and data protocols: IEEE 802.11b, IEEE 802.11g, Bluetooth 2.0, Gigabit Ethernet

I/O Connections

2 headphone jacks
6 USB 2.0 ports
RGB component video
S-VGA video (DB15 external VGA connector)
Component audio
Ethernet port (Gigabit Ethernet card, 10/100/1000)
External monitor port
Bays for routing cables
On/Standby power button

AC Input Ratings

AC input: 100-240 VAC, 50/60 Hz, 10 A, 650 W

The Hardware

Essentially, Microsoft Surface is a computer embedded in a medium-sized table, with a large, flat display on top that is touch-sensitive. The software reacts to the touch of any object, including human fingers, and can track the presence and movement of many different objects at the same time. In addition to sensing touch, the Microsoft Surface unit can detect objects that are labeled with small "domino" stickers, and in the future, it will identify devices via radio-frequency identification (RFID) tags.

The demonstration unit I used was housed in an attractive glass table about three feet high, with a solid base that hides a fairly standard computer equipped with an Intel Core 2 Duo processor, an AMI BIOS, 2 GB of RAM, and Windows Vista. The team lead would not divulge which graphics card was inside, but they said that it was a moderately-powerful graphics card from either AMD/ATI or NVIDIA.

[Figure: inside Microsoft Surface]

Screen: A diffuser turns the Surface's acrylic tabletop into a large horizontal "multitouch" screen, capable of processing multiple inputs from multiple users. The Surface can also recognize objects by their shapes or by reading coded "domino" tags.

Infrared: Surface's "machine vision" operates in the near-infrared spectrum, using an 850-nanometer-wavelength LED light source aimed at the screen. When objects touch the tabletop, the light reflects back and is picked up by multiple infrared cameras with a net resolution of 1280 x 960.

CPU: Surface uses many of the same components found in everyday desktop computers: a Core 2 Duo processor, 2 GB of RAM and a 256 MB graphics card. Wireless communication with devices on the surface is handled using Wi-Fi and Bluetooth antennas (future versions may incorporate RFID or Near Field Communication). The underlying operating system is a modified version of Microsoft Vista.

Projector: Microsoft's Surface uses the same DLP light engine found in many rear-projection HDTVs. The footprint of the visible light screen, at 1024 x 768 pixels, is actually smaller than the invisible overlapping infrared projection, to allow for better recognition at the edges of the screen.

The display screen is a 4:3 rear-projected DLP display measuring 30 inches diagonally. The screen resolution is a relatively modest 1024x768, but the touch detection system had an effective resolution of 1280x960. Unlike the screen resolution, which for the time being is constant, the touch resolution varies according to the size of the screen used—it is designed to work at a resolution of 48 dots per inch. The top layer also works as a diffuser, making the display clearly visible at any angle.

Unlike most touch screens, Surface does not use heat or pressure sensors to indicate when someone has touched the screen. Instead, five tiny cameras take snapshots of the surface many times a second, similar to how an optical mouse works, but on a larger scale. This allows Surface to capture many simultaneous touches and makes it easier to track movement, although the disadvantage is that the system cannot (at the moment) sense pressure.

Five cameras mounted beneath the table read objects and touches on the acrylic surface above, which is flooded with near-infrared light to make such touches easier to pick out. The cameras can read a nearly infinite number of simultaneous touches and are limited only by processing power. Right now, Surface is optimized for 52 touches, or enough for four people to use all 10 fingers at once and still have 12 objects sitting on the table.

The unit is rugged and designed to take all kinds of abuse. Senior director of marketing Mark Bolger demonstrated this quite dramatically by slamming his hand onto the top of the screen as hard as he could—it made a loud thump, but the unit itself didn't move. The screen is also water resistant. At an earlier demonstration, a skeptical reporter tested this by pouring his drink all over the device. Microsoft has designed the unit to put up with this kind of punishment because it envisions Surface being used in environments such as restaurants where hard impacts and spills are always on the menu.

The choice of 4:3 screen was, according to Nigel Keam, mostly a function of the availability of light engines (projectors) when the project began. Testing and user feedback have shown that the 4:3 ratio works well, and the addition of a slight amount of extra acrylic on each side leaves the table looking like it has normal dimensions.

Built-in wireless and Bluetooth round out the hardware capabilities of Surface. A Bluetooth keyboard with a built-in trackpad is available to diagnose problems with the unit, although for regular use it is not required.

System software

Microsoft Surface works much like another Microsoft product, Media Center, in that the main application runs on top of Windows and takes over the whole screen. Like Media Center, it is designed to be difficult to exit the application without using a mouse or keyboard. I asked if the Surface team considered allowing the user to drop into Windows mode while retaining the touch functionality, but they felt that the product worked better if it stayed in this mode.

The various demonstration programs are accessed from a main menu, which scrolls left and right in an endless loop. The user moves the selection by swiping back and forth and selects an application with a single tap. This works reasonably well and feels quite natural. When an application is selected, a swirly purple ring appears in the center of the screen to indicate that the program is loading.

There were eight different programs available: Water, Video Puzzle, Paint, Music, Photos, Casino, a T-Mobile demonstration app, and Dining. Much of the software was written using Microsoft's WPF (Windows Presentation Foundation), though the XNA development toolkit, a framework originally created for writing PC and Xbox 360 games, is also supported. XNA allows programmers to use managed code written in C# to manipulate various DirectX features; managed code frees the programmer from worrying about memory management by allocating and discarding memory automatically. This approach has allowed Microsoft and its partners to write impressive-looking demonstration programs for Surface more quickly than would otherwise be possible.
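As a rough flavor of what such managed touch code looks like, here is a small hedged sketch in C#. It does not use the Surface SDK; instead it relies on the multi-touch events that later became part of standard WPF (TouchDown, TouchMove, TouchUp) to let a finger drag a rectangle around a canvas.

```csharp
using System;
using System.Windows;
using System.Windows.Controls;
using System.Windows.Input;
using System.Windows.Media;
using System.Windows.Shapes;

// Hedged sketch, not the Surface SDK: standard WPF touch events drag a rectangle.
static class TouchDragSketch
{
    [STAThread]
    static void Main()
    {
        var square = new Rectangle { Width = 120, Height = 120, Fill = Brushes.SteelBlue };
        var canvas = new Canvas();
        canvas.Children.Add(square);
        Canvas.SetLeft(square, 100);
        Canvas.SetTop(square, 100);

        // When a finger lands on the square, capture it so we keep receiving its moves.
        square.TouchDown += (s, e) => { square.CaptureTouch(e.TouchDevice); e.Handled = true; };

        // Follow the finger: re-position the square under the touch point.
        square.TouchMove += (s, e) =>
        {
            Point p = e.GetTouchPoint(canvas).Position;
            Canvas.SetLeft(square, p.X - square.Width / 2);
            Canvas.SetTop(square, p.Y - square.Height / 2);
            e.Handled = true;
        };

        square.TouchUp += (s, e) => square.ReleaseTouchCapture(e.TouchDevice);

        new Application().Run(new Window { Title = "Touch drag sketch", Content = canvas });
    }
}
```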

Features

Multi-touch display. The Microsoft Surface display is capable of multi-touch interaction, recognizing dozens and dozens of touches simultaneously, including fingers, hands, gestures and objects.


Perceptive Pixel's touch screens work via frustrated total internal reflection (FTIR) technology. The acrylic surface has infrared LEDs on its edges. When undisturbed, the light travels along predictable paths, a process known as total internal reflection. When one or more fingers touch the surface, the light diffuses at the contact points, changing the internal-reflection pathways. A camera below the surface captures the diffusion and sends the information to image-processing software, which translates it into a command.

Multitouch technology has been around since early research at the University of Toronto in 1982. With multitouch devices, one or more users activate advanced functions by touching a screen in more than one place at the same time. For example, a person could expand or shrink images by pinching the edges of the display window with the thumb and forefinger of one hand, explained Microsoft principal researcher Bill Buxton. Users could also, while in contact with a point on a map, touch other controls to make the system display information, such as nearby restaurants, about the area surrounding the indicated location. This is accomplished much as it has been in PCs for years.

For example, desktop users can press the Alt and Tab keys at the same time to toggle between open windows. The OS translates the simultaneous keystrokes into a single command. Industry observers say tabletop computers are likely to become a popular multitouch-screen implementation. Because multiple users at different positions will work with tabletop systems, the computers must be able to display material in different parts of the screen and move controls around to keep them from blocking reoriented content. The systems can determine users’ locations based on the positions from which they input commands or data. The computers then orient their displays toward the tabletop edge nearest to the user. Vendors are beginning to release commercial multitouch systems. For example, Mitsubishi Electric Research Laboratories’ Diamond Touch table, which includes a developer’s kit, can be used for small-group collaboration.
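The arithmetic behind the pinch gesture described above is simple enough to show directly. The following C# sketch, with made-up coordinates, compares the distance between the two contacts at the start of the gesture and now, and uses the ratio as the zoom factor.

```csharp
using System;

// Hedged sketch of a two-finger "pinch": the ratio of current to starting distance
// between the contacts gives the zoom factor applied to the image under them.
class PinchGestureSketch
{
    static double Distance((double X, double Y) a, (double X, double Y) b)
        => Math.Sqrt((a.X - b.X) * (a.X - b.X) + (a.Y - b.Y) * (a.Y - b.Y));

    static void Main()
    {
        var startA = (X: 400.0, Y: 300.0);   // finger positions when the gesture began
        var startB = (X: 600.0, Y: 300.0);

        var nowA = (X: 350.0, Y: 300.0);     // finger positions in the current frame
        var nowB = (X: 650.0, Y: 300.0);

        double zoom = Distance(nowA, nowB) / Distance(startA, startB);
        Console.WriteLine($"Zoom factor: {zoom:F2}");   // > 1 means the fingers moved apart (enlarge)
    }
}
```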

Horizontal orientation. The 30-inch display in a table-sized form factor allows users to share, explore and create experiences together, enabling a truly collaborative computing experience.

Dimensions. Microsoft Surface is 22 inches high, 21 inches deep and 42 inches wide.

Materials. The Microsoft Surface tabletop is acrylic, and its interior frame is powder- coated steel.

Surface computers

A surface computer is a computer that interacts with the user through the surface of an ordinary object, rather than through a monitor and keyboard.

The category was created by Microsoft with Surface (codenamed Milan), the surface computer from Microsoft which was based entirely on a Multi-Touch interface and using a coffee-table like design, and was unveiled on 30 May 2007. Users can interact with the machine by touching or dragging their fingertips and objects such as paintbrushes across the screen, or by setting real-world items tagged with special bar-code labels on top of it.

The Surface is a horizontal display on a table-like form. Somewhat similar to the iPhone, the Surface has a screen that can incorporate multiple touches and thus uses them to navigate multimedia content. Unlike the iPhone, which uses fingers' electrical properties to detect touch, the Surface utilizes a system of infrared cameras to detect input. Uploading digital files only requires each object (e.g. a Bluetooth-enabled digital camera) to be placed on the Surface. People can physically move around the picture across the screen with their hands, or even shrink or enlarge them. The first units of the Surface will be information kiosks in the Harrah's family of casinos.

Besides the Microsoft-created devices, other computer firms have also entered the surface computing market. These include Mitsubishi Electric with its DiamondTouch, and Smart Surface Sdn Bhd with its SmartSurface.

Also receiving units will be T-Mobile, for comparing several cell phones side-by-side, and Sheraton Hotels and Resorts, which will use Surface to service lobby customers in numerous ways.

The Surface has a 2.0 GHz Core 2 Duo processor, 2 GB of memory, an off-the-shelf graphics card, a scratch-proof, spill-proof surface, a DLP projector, and five infrared cameras, as mentioned above. However, the expensive components required for the interface also give the Surface a price tag of between $12,500 and $15,000.

Perceptive Pixel

Computer scientists see technologies such as surface computing and multitouch as the key to a new era of ubiquitous computing, where processing power is embedded in almost every object and everything is interactive. Last year, New York University professor Jeff Han launched a company called Perceptive Pixel, which builds six-figure-plus custom multitouch drafting tables and enormous interactive wall displays for large corporations and military situation rooms. "I firmly believe that in the near future, we will have wallpaper displays in every hallway, in every desk. Every surface will be a point of interaction with a computer," Han says, "and for that to happen, we really need interfaces like this."


The display's surface is a six-millimeter-thick piece of clear acrylic, with infrared LEDs on the edges. Left undisturbed, the light passes along predictable paths within the acrylic, a process known as total internal reflection. When objects such as fingers touch the surface, the light diffuses at the contact point, causing the acrylic's internal-reflection pathways to change. A camera below the surface captures the diffusion and sends the information to image-processing software, which can read multiple touches simultaneously and translate them into a command. The system sends information about screen touches to applications via the lightweight Open Sound Control protocol, utilized for network-based communication between computers and multimedia devices, and User Datagram Protocol data transport technology. The applications then take the appropriate actions. Perceptive Pixel, which has built a prototype that measures 36 x 27 inches, is still working on applications for its displays, Han noted. They could be used for collaborative work on design-related and other projects, perhaps in place of interactive whiteboards, he said.
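To make the transport step concrete without reproducing the exact Open Sound Control encoding, here is a simplified C# stand-in that sends one touch point as a plain UDP datagram to a hypothetical listener on localhost port 9000. The message format and port number are assumptions for illustration, not Perceptive Pixel's actual protocol.

```csharp
using System;
using System.Net.Sockets;
using System.Text;

// Simplified stand-in for the touch-over-UDP pipeline described above: one datagram
// carrying a touch id and its x/y coordinates, sent to an image-consuming application.
class TouchUdpSenderSketch
{
    static void Main()
    {
        using (var client = new UdpClient())
        {
            // "touch <id> <x> <y>" in screen coordinates; format is illustrative only.
            byte[] message = Encoding.ASCII.GetBytes("touch 7 512.0 384.0");
            client.Send(message, message.Length, "127.0.0.1", 9000);
        }
        Console.WriteLine("Touch point sent.");
    }
}
```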

Short-term success for a technology can be measured by how much attention a product gathers when it is new. Long-term success is measured by how effectively that product disappears into the everyday routine of life. Surface computing has enormous potential to do both — it is a splashy new computer interface, surrounded by hype, but it is also, quite literally, furniture. It is a technology in its infancy, where even the engineers behind it can't predict its full impact; but the possibilities are everywhere, underhand and underfoot — on every surface imaginable.

Advantages

Large surface area to view different windows and applications.

Data Manipulation - Selecting, moving, rotating and resizing (manipulating objects on the screen is similar to manipulating them in the manual world).

Quick and easy to use.

More Than One User – Several people can orient themselves on different sides of the surface to interact with an application simultaneously (max 52 points of touch).

Objects Recognition - Increased functionality aiding user in speed and ease of use.

Disadvantages

Incredibly expensive and not Portable.

Currently available only in limited areas.

Loss of Privacy - Open for many to view.

Tailored to high end clients.

Applications of Surface Computing

Water

Water is used as an "attract mode" for the Surface desktop, and it is certainly attractive. The default background picture is an image of smooth pebbles that appear to sit beneath a thin layer of rippling water. By itself, the water moves as if it were being disturbed by a light breeze, but it is when you touch the screen that it becomes more interesting than just another screensaver.


Tapping anywhere on the surface causes larger ripples to spread out from the point of contact. Many people can tap at the same time, making an effect similar to a rainstorm. But by far the most fun is when you sweep your whole hand across and cause waves to bounce back and forth. The physics of the water simulation is not perfect: the ripples never get above a certain intensity, and there is no way to simulate diffraction. However, the overall effect is strangely compelling and is certainly a good way to introduce people to Surface.

One interesting feature of Water is that if you take any object (the team used a regular stove dial) and stick an identification sticker on the bottom, the program will switch background pictures whenever you turn the dial.

Video Puzzle

Video Puzzle showcases the power of the little identification tags mentioned above. The tags consist of a pattern of variously-sized dots; Keam mentioned that the dots currently represent an 8-bit code (256 permutations) but that 128-bit tags were in the works. The neat thing about the tags is that they can be very nearly transparent and the system will still pick them up. Not only can the tags transmit numerical information, but the geometrical arrangement of the dots means that Surface can also tell, to a high degree of accuracy, how much the tag (and therefore the object) has rotated.
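A hedged sketch of how a tag's geometry can yield its rotation: if the detector knows the centres of two reference dots, the angle of the vector between them gives the tag's orientation on the table. The dot roles and coordinates below are invented for illustration and do not reflect the real tag layout.

```csharp
using System;

// Illustrative only: recover a tag's rotation from two detected dot centres with Atan2.
class TagRotationSketch
{
    static void Main()
    {
        var anchorDot = (X: 500.0, Y: 400.0);    // e.g. a designated "anchor" dot
        var headingDot = (X: 470.0, Y: 430.0);   // a second dot that defines "up" for the tag

        double angleRadians = Math.Atan2(headingDot.Y - anchorDot.Y, headingDot.X - anchorDot.X);
        double angleDegrees = angleRadians * 180.0 / Math.PI;

        Console.WriteLine($"Tag rotated by {angleDegrees:F1} degrees");
        // Video Puzzle can then rotate the clip under the glass square by the same angle.
    }
}
```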

In Video Puzzle, these virtually invisible tags are placed upon small squares of glass. When the pieces of glass are put on the table, the screen starts playing video clips underneath each one. Because the video moves whenever you move the squares, it creates the illusion that the glass itself is displaying the video, which looks very futuristic. As you move the squares around, you quickly realize that the video clips are all pieces of a larger video. Flipping the glass squares over inverts the video playing underneath, making completing the puzzle even more of a challenge.

When you complete the puzzle correctly, the system senses the achievement, congratulates you, and shows you the time taken to finish. According to Mark Bolger, the current record for finishing when the pieces are fully randomized is 1 minute and 53 seconds. On my first attempt, I finished in just over 2 minutes, but the squares were all right side up to begin with.

Paint

Paint programs have been a natural demonstration application for new platforms ever since MacPaint graced the first Macintosh back in 1984. Surface’s paint program is even lighter on features than MacPaint was, but the natural user interface makes up for this deficiency.

[pic]

There are three draw modes that can be toggled by touching an icon on the bottom of the toolbar: brush, paint, and reveal, the last of which is kind of a negative brush that shows a background bitmap underneath. The brush mode is a bit spotty and tends to skip, but the paint mode is smooth and fun. You can draw using one finger, all your fingers at once (good for drawing hair), the palm of your hand, or using any natural object such as a regular paintbrush. Using the program is like having a flashback to finger painting back in kindergarten (minus the mess), and certainly children will have tons of fun with this kind of application.

That said, having this great touch interface absolutely cries out for a more full-featured program, something that can mix colors (like Microsoft's own paint program that comes with the Tablet PC version of Windows) and play around with textures and natural materials. I immediately thought of Fractal Design Painter and how much fun it would be with this interface. Of course, real digital artists have been using advanced pressure-sensitive graphics tablets for years, and Surface is not aimed at replacing this kind of workflow. Still, a more full-featured Paint program would be nice to have, and Keam mentioned that the team is still deciding whether or not to add features to Paint or instead take an existing paint program and rework it for Surface.

MUSIC

The Music application works like a virtual jukebox, displaying music arranged by album and allowing the user to flip over albums, select songs, and drag them to the "Now Playing" section. The album browser works a bit like Apple’s Cover Flow, although many albums are visible at once without scrolling.

[pic]

In addition to playing music that is already stored on the unit's hard drive, Music can also transfer songs from portable music players. Mark Bolger demonstrated this by placing two Zunes on top of the Surface and using the wireless connection to drag and drop songs between the units, the song list, and the Now Playing section. I mentioned to the team that this was the first time I had ever seen even one Zune "in the wild," and they joked that Microsoft headquarters didn’t really count as being in the wild. Bolger noted that sharing songs in this manner would be "subject to DRM restrictions, of course."

Photos

Sharing photos is a much more unrestricted activity, thanks to the fact that the consumer is also the creator of the content, and the photo album application reflected this freedom.

[pic]

By simply placing a Bluetooth-equipped digital camera on the tabletop, Surface was able to import the photos and place them in a pile on the screen, which Bolger verified by taking a picture of Cindy, my Microsoft PR contact who was sitting in the next chair. Most of the other photos were pictures of Microsoft employees' children; Bolger joked that only the cutest kids were allowed to be put in the demonstration.

Photos are arranged into albums that look like piles. Tapping the pile once spreads it around the screen and from there you can drag, rotate, and resize the images to your heart’s content. Since Surface can detect many touches at the same time, multiple people can sort and resize pictures, which could potentially turn a tedious job into a fun family affair. The program can also apparently sort photos into stacks by using metadata tags, although I did not see this feature demonstrated.

Not only pictures but full-motion videos can be viewed in this way; tapping the video once starts the playback, and it can be smoothly resized and rotated while it plays.

Casino

The Casino application was developed in cooperation with Harrah's of Las Vegas and is a good example of how Surface can be used in a hospitality environment. The background image is a giant map of the hotel and casino, with all the attractions marked for further inspection. Hotel customers can place their card anywhere on the screen and reserve tickets to any of these shows. The background map can be easily scrolled with a brush of the hand, and zoomed in and out by performing the two-finger pinch.
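The two-finger pinch mentioned above is typically turned into a zoom factor by comparing the current distance between the fingers with the distance when the gesture began. The minimal sketch below shows that general technique; it is an illustration only, not the Casino application's actual code.

```python
import math

def pinch_zoom_factor(p1_start, p2_start, p1_now, p2_now):
    """Return the zoom factor implied by a two-finger pinch.

    Each argument is an (x, y) touch position. The map scale is
    multiplied by the ratio of the current finger spread to the
    spread when the pinch began.
    """
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    return dist(p1_now, p2_now) / dist(p1_start, p2_start)

# Fingers start 100 px apart and spread to 150 px -> zoom in by 1.5x.
print(pinch_zoom_factor((0, 0), (100, 0), (-25, 0), (125, 0)))  # 1.5
```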

Dining

[pic]

The application allows diners to preview the entire menu by choosing a category (drinks, appetizers, main courses, and so forth) and then scrolling left and right through the available options. Items can be dragged into a central "ordering area" and, when everyone is satisfied with their choices, a single tap on the Order button sends the list out to the waiter. This could potentially save serving staff significant time and would be very useful for busy restaurants. The software can display the daily specials, and for regular customers with their own identification cards, it could display a list of "favorites" to make ordering even easier. Combine this with entertainment activities for the kids (perhaps Paint?) and you can see how many restaurants could view this as a compelling application.

Understanding Multi-touch Manipulation

Two-handed, multi-touch surface computing provides a scope for interactions that are closer analogues to physical interactions than classical windowed interfaces. The design of natural and intuitive gestures is a difficult problem as we do not know how users will approach a new multi-touch interface and which gestures they will attempt to use. In this paper we study whether familiarity with other environments influences how users approach interaction with a multi-touch surface computer, as well as how efficiently those users complete a simple task.

Inspired by the need for object manipulation in information visualization applications, we asked users to carry out an object sorting task on a physical table, on a tabletop display, and on a desktop computer with a mouse. To compare users' gestures we produced a vocabulary of manipulation techniques that users apply in the physical world and we compare this vocabulary to the set of gestures that users attempted on the surface without training. We find that users who start with the physical model finish the task faster when they move over to using the surface than users who start with the mouse.

The rapidly-developing world of multi-touch tabletop and surface computing is opening up new possibilities for interaction paradigms. Designers are inventing new ways of interacting with technology and users are influenced by their previous experience with technology.

Tabletop gestures are an important focal point in understanding these new designs. Windowing environments have taught users to experience computers with one hand, focusing on a single point. What happens when those constraints are relaxed, as in multi-touch systems? Does it make sense to allow or expect users to interact with multiple objects at once? Should we design for users having two hands available for their interactions? Both the mouse-oriented desktop and the physical world have constraints that limit the ways in which users can interact with multiple objects, and users come to the tabletop very accustomed to both of these.

There is no shortage of applications where users might need to manipulate many objects at once. From creating diagrams to managing files within a desktop metaphor, users need to select multiple items in order to move them about. A number of projects in the visual analytics and design spaces have attempted to take advantage of spatial memory by simulating sticky notes, a mixed blessing when rearranging the notes is expensive and difficult. As it becomes simpler to move objects and the mapping between gesture and motion becomes more direct, spatial memory can become a powerful tool.

We would like to understand what tools for managing and manipulating objects the tabletop medium affords and how users respond to it. Particularly, we would like to understand the techniques that users adopt to manipulate multiple small objects. What techniques do they use in the real world and how do those carry over to the tabletop context? Do they focus on a single object as they do in the real world or look at groups? Do they use one hand or two? How dexterous are users in manipulating multiple objects at once with individual fingers?

The problems of manipulating multiple objects deftly are particularly acute within the area of visual analytics, where analysts need to sort, filter, cluster, organize and synthesize many information objects in a visualization. Example systems include In-Spire, Jigsaw, Oculus nSpace, or Analyst's Notebook, i.e., systems where analysts use virtual space to organize iconic representations of documents into larger spatial representations for sensemaking or for presenting results to others. In these tasks, it is important to be able to efficiently manipulate the objects and it is often helpful to manipulate groups of objects. Our general hypothesis is that multi-touch interaction can offer rich affordances for manipulating a large number of objects, especially groups of objects.

A partial answer to these questions comes from recent work by Wobbrock et al. Users in that study were asked to develop a vocabulary of gestures; the investigators found that most (but not all) of the gestures that users invented were one-handed. However, their analysis emphasized manipulating single objects: they did not look at how users would handle gestures that affect groups of items. In this paper we explore how users interact with large numbers of small objects.

We discuss an experiment in which we asked users to transition from both a mouse and a physical condition to an interactive surface, as well as the reverse. We present a taxonomy of user gestures showing which ones were broadly used and which were more narrowly attempted. We also present timing results showing that two-handed tabletop operations can be faster than mouse actions, although not as fast as physical actions. Our research adds a dimension to Wobbrock et al.'s conclusions, showing that two-handed interaction forms a vital part of surface gesture design.

Background

Typical interactions on groups of items in mouse-based systems first require multi-object selection and then a subsequent menu selection to specify an action on the selected objects. Common techniques for multi-object selection include drawing a selection rectangle, drawing a lasso, or holding modifier keys while clicking on several objects. In gestural interfaces this two-step process can be integrated into one motion. Yet, the design of appropriate gestures is a difficult task: the designer must develop gestures that can be both reliably detected by a computer and easily learned by people.

Similar to the mouse, pen-based interfaces only offer one point of input on screen, but research on pen gestures is relatively advanced compared to multi-touch gestures. Pen-based gestures for multiple object interaction have, for example, been described by Hinckley et al. Through a combination of lasso selection and marking-menu-based command activation, multiple targets can be selected and a subsequent action can be issued. A similar combination of lasso selection and a subsequent gesture (e.g., a pigtail for deletion) was proposed for Tivoli, an electronic whiteboard environment.

For multi-touch technology, a few gesture sets have been developed which include specific examples of the types of multi-object gestures we are interested in. For example, Wu et al. describe a Pile-n-Browse gesture. By placing two hands on the surface, the objects between both hands are selected and can be piled by scooping both hands in or browsed through by moving the hands apart. This gesture received a mixed response in an evaluation. Tse et al. explore further multi-touch and multi-modal group selection techniques. To select and interact with multiple digital sticky notes, users can choose between hand-bracketing, single-finger mouse-like lasso selection, or a speech-and-gesture command such as "search for similar items". Groups can then be further acted upon through speech and gestures.

For example, groups of notes can be moved around by using a five-fingered grabbing gesture and rearranged through a verbal command. Using a different approach, Wilson et al. explore a physics-based interaction model for multi-touch devices.

Here, multiple objects can be selected by placing multiple fingers on objects or by pushing with full hand shapes or physical objects against virtual ones to form piles. Many of the above multi-selection gestures are extremely similar to the typical mouse-based techniques. Wobbrock et al. present a series of desired effects and invite users to act out corresponding gestures in order to define a vocabulary. Participants described two main selection gestures, tap and lasso, for both single and group selection.

This research also showed a strong influence of mouse-based paradigms in the gestures participants chose to perform. Similarly, our goal was to first find out which gestures would be natural choices for information categorization and whether a deviation from the traditional techniques of lasso or selection rectangles would be a worthwhile approach.

Previous studies have examined the motor and cognitive effects of touch screens and mouse pointers, and the advantages of two-handed interaction over one-handed techniques, primarily for specific target selection tasks. Our goal is to take a more holistic view of multi-touch interaction in a more open-ended setting of manipulating and grouping many objects.

Baseline Multi-touch Surface Interaction

Our goal is to study tasks in which users manipulate large numbers of small objects on screen. For our study, we abstracted such analytic interactions with a task involving sorting colored circles in a simple bounded 2D space.

Our study tasks, described below, involved selecting and moving colored circles on a canvas. We were particularly interested in multi-touch support for single and group selection of such objects. To provide a study platform for comparison with standard mouse-based desktop and physical objects conditions, we had to make some interaction design decisions for our baseline multi-touch system. Our design incorporates several assumptions about supporting object manipulation for surface computing:

One or two fingers touching the surface should select individual objects.

A full hand, or three or more fingers touching the surface, should select groups of objects.

Contacts far apart probably indicate separate selections (or accidental contact) instead of a very large group. Unintentionally selecting a large group is more detrimental than selecting small groups.

Multiple contacts that are near each other but initiated at different times are probably intended to be separate selections. Synchronous action might indicate coordinated intention.

The system is implemented on the Microsoft Surface, a rear-projection multi-touch tabletop display. The Surface Software Development Kit provides basic support for hit testing of users' contact points on the display. It also provides coordinates and an ellipsoidal approximation of the shape of the contact, as well as contact touch, move, and release events.

Our testing implementation supports selecting and dragging small colored circles both individually and in groups. The interaction design was intentionally kept simple to support our formative study goals. Contacts from fingers and palms select all the circles within their area. As feedback of a successful selection, the circles are highlighted by changing the color of their perimeters, and can be dragged to a new position. From there, they can be released and de-selected. A (small) fingertip contact selects only the topmost circle under the contact, enabling users to separate overlapping circles. Large contacts such as palms select all circles under the contact. Using multiple fingers and hands, users can manipulate multiple circles by such direct selection and move them independently. Such direct selection techniques are fairly standard on multi-touch interfaces.
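A minimal sketch of this direct-selection rule is given below. The area threshold, the circle-shaped approximation of the contact ellipse, and the data structures are assumptions made for illustration; they are not taken from the study's implementation.

```python
import math

FINGERTIP_MAX_AREA = 400.0  # px^2; illustrative threshold, not from the study

def circles_under_contact(contact, circles):
    """Direct-selection rule sketched from the description above.

    contact: dict with 'x', 'y' (centre of the contact) and 'area' in px^2.
    circles: list of dicts with 'x', 'y', 'r', ordered bottom-to-top,
             so the last hit is the topmost circle.
    """
    # Approximate the contact ellipse by a circle of equal area.
    contact_r = math.sqrt(contact["area"] / math.pi)
    hits = [c for c in circles
            if math.hypot(c["x"] - contact["x"],
                          c["y"] - contact["y"]) <= contact_r + c["r"]]
    if not hits:
        return []
    if contact["area"] <= FINGERTIP_MAX_AREA:
        return [hits[-1]]   # small fingertip: topmost circle only
    return hits             # palm-sized contact: everything underneath
```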

[pic]

We also provide an analogue to the usual mouse-based rectangular marquee selection of groups of objects. However, a simple rectangular marquee selection does not make effective use of the multi-touch capability. Instead, users can multi-select by defining a convex hull with three or more fingers. If three or more contacts occur within 200 ms and within a distance of 6 inches from each other (approximately a hand-span), then a convex hull is drawn around these contacts and a group selection is made of any circles inside this hull. The background area inside the hull is also colored light grey to give the user visual feedback. These hulls, and the circles within them, can then be manipulated with affine transformations based on the users' drag motions. For example, users can spread out or condense a group by moving their fingers or hands together or apart. While the group selection is active, users can grab it with additional fingers to perform the transformations as they desire. The group selection is released when all contacts on the group are released.
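The grouping trigger just described (three or more contacts that begin within 200 ms and lie within roughly six inches of each other) can be sketched as follows. The pixels-per-inch figure is derived from the 24-inch-wide, 1024-pixel display described in this study; the function signature and data layout are assumptions for illustration, not the study's code.

```python
import itertools
import math

PX_PER_INCH = 1024 / 24          # display used in the study: 1024 px over 24 inches
GROUP_WINDOW_MS = 200            # contacts must begin within 200 ms of each other
GROUP_SPAN_PX = 6 * PX_PER_INCH  # ...and within about a hand-span (6 inches)

def forms_group(contacts):
    """Decide whether a set of fresh contacts should start a hull selection.

    contacts: list of (x, y, t_ms) tuples, one per contact that just went
    down. Three or more contacts that begin within 200 ms and within
    roughly six inches of each other define a convex hull; circles inside
    the hull then become one group selection (hull drawing and the affine
    transformations are handled elsewhere).
    """
    if len(contacts) < 3:
        return False
    times = [t for _, _, t in contacts]
    if max(times) - min(times) > GROUP_WINDOW_MS:
        return False
    return all(math.hypot(x1 - x2, y1 - y2) <= GROUP_SPAN_PX
               for (x1, y1, _), (x2, y2, _) in itertools.combinations(contacts, 2))
```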

Study Design

The goal of this study is to discover how users manipulate many small objects, in three different interaction paradigms: physical, multi-touch, and mouse interaction. To support our formative design goals, we took a qualitative exploratory approach with quantitative evaluation for comparisons.

Participants

We recruited 32 participants (25 males and 7 females) and 2 pilot testers via email from our institution. We screened participants for color blindness. They were mainly researchers and software developers who were frequent computer users. The average age of participants was 34, ranging from 21 to 61. None of the participants had significant experience using the Surface; they had either never used the Surface before or had tried it a few times at demonstrations. Participants each received a US$10 lunch coupon for their participation. To increase motivation, additional US$10 lunch coupons were given to the participants with the fastest completion time for each interface condition in the timed task.

Conditions and Groups

We compared three interface conditions: Surface, Physical and Mouse. For both the Surface and Physical conditions, we used a Microsoft Surface system measuring 24" × 18". For the Surface condition, we ran the multi-touch implementation described above at 1024 × 768 resolution. For the Physical condition, we put 2.2 cm diameter circular plastic game chips on top of the Microsoft Surface tabletop with the same grey background (for consistency with the Surface condition). The circles in the Surface condition were the same apparent size as the game chips in the Physical condition.

[pic]

(Left) Physical condition and (Right) Mouse condition.

For the Mouse condition, we ran a C# desktop application on a 24'' screen. This application supported basic mouse-based multi-selection techniques: marquee selection by drawing a rectangle as well as control- and shift-clicking nodes. Circles were sized so that their radii as a proportion of display dimensions were the same on both the desktop and surface.

Since our goal is to compare the Surface condition against the other two conditions, each participant used only two conditions: Surface and one of the others. Users were randomly divided into one of four groups: Physical then Surface (PS), Surface then Physical (SP), Mouse then Surface (MS), Surface then Mouse (SM). This resulted in data from 32 Surface, 16 Physical, and 16 Mouse sessions.

Tasks

Participants performed four tasks, each task requiring spatially organizing a large number of small objects. The first and second tasks were intended to model how analysts might spatially cluster documents based on topics, and manage space as they work on a set of documents, and were designed to capture longer-term interaction strategies. The tasks required a significant amount of interaction by the participants and gave them a chance to explore the interface.

All participants worked on the four tasks in the same order, and were not initially trained on the Surface or our application. Participants were presented with a table of 200 small circles, with 50 of each color: red, green, blue, and white. At the start of the first task, the 200 circles were positioned randomly in small clusters on the Surface.

With the exception of Task 3, which was timed, we encouraged participants to think aloud while performing the tasks so that we could learn their intentions and strategies.

Task 1: Clustering task. This task was designed to elicit users' intuitive sense of how to use gestures on the surface. The task was to organize the blue and white circles into two separate clusters that could be clearly divided from all others. Participants were told that the task would be complete when they could draw a line around the cluster without enclosing any circles of a different color. The figure below shows one possible end condition of Task 1.

[pic]

Example end condition of Task 1.

Task 2: Spreading Task. Participants spread out the blue cluster such that no blue circles overlap, moving other circles to make room as needed. Participants start this task with the end result of their Task 1.

Task 3: Timed Clustering Task. This task was designed to evaluate user performance time for comparison between interface conditions and to examine the strategies which users adopt over time. Task 3 repeated Task 1, but participants were asked to complete the task as quickly as possible. They were not asked to think aloud, and a prize was offered for the fastest time.

Task 4: Graph Layout Task. Inspired by the recent study of van Ham and Rogowitz, we asked participants to lay out a social network graph consisting of 50 nodes and about 75 links. In the Physical condition, participants did not attempt this task. Due to the broader scope and complexity of this task, the analysis of the results of Task 4 will be reported elsewhere.

Procedure

Each participant was given an initial questionnaire to collect their demographics and prior experience with the Microsoft Surface system. Participants completed Tasks 1 and 2 without training, in order to observe the gestures they naturally attempted. Participants in the Surface and Mouse conditions were given a brief tutorial about the available interaction features after Task 2. At the end of each condition participants answered a questionnaire about their experience. They then repeated the same procedure with the second interface condition. At the end of the session participants answered a final questionnaire comparing the systems. Each participant session lasted at most an hour.

We recorded video of the participants to capture their hand movements, their verbal comments and the display screen. The software also recorded all events and user operations for both the Surface and Mouse conditions.

RESULTS

We divide our results into an analysis of the set of gestures that users attempted for Tasks 1 and 2, timing results from Task 3, and user comments from the post-study survey.

Gestures

The video data for Task 1 and 2 (clustering and spreading) were analyzed for the full set of operations users attempted in both the Physical and Surface conditions. We first used the video data to develop a complete list of all gestures, both successful and unsuccessful. For example, if a participant attempted to draw a loop on the surface, we coded that as an unsuccessful attempt to simulate a mouse lasso gesture. The gestures were aggregated into categories of closely-related operations. Once the gestures were identified, the videos were analyzed a second time to determine which gestures each user attempted.

Table 1 provides a listing of all classes of gestures that participants performed during the study; six of them are illustrated below. These gestures are divided into several categories: single-hand operations that affect single or groups of objects, two-handed gestures that affect multiple groups of objects, and two-handed gestures that affect single groups. Last, we list gestures that apply only to one medium: just surface, and just physical.

In order to understand how gestures varied by condition, we classed gestures by which participants attempted them. Table 2 lists all of the gestures that were feasible in both the Physical and Surface conditions. This table also lists the percentage of participants who utilized each gesture at least once during the session. This data is aggregated by the Physical and Surface conditions, followed by a further classification by which condition was performed first (Physical, Mouse, Surface). Table 3 lists additional gestures that were only feasible for the Surface condition, while Table 4 lists gestures that were only used in the Physical condition.

[pic] [pic]

(Left) One-hand shove. (Right) Drag two objects with pointer fingers.

[pic] [pic]

(Left) Two hands grab groups. (Right) Add/remove from selection.

[pic] [pic]

(Left) Two-hand transport. (Right) Both hands coalesce large group to small.

Fig: Six selected one- and two-handed gestures attempted by participants during the study

TABLE 1. DESCRIPTION OF GESTURES

ONE HAND, Individual Items:
Drag single object. Drag a single item across the tabletop with a fingertip.
Drag objects with individual fingers. Using separate fingers from one hand, drag individual items across the table.
Toss single object. Use momentum to keep an object moving across the tabletop.

ONE HAND, Groups:
Splayed hand pushes pieces. An open hand pushes pieces; could define a hull.
Hand and palm. A single hand is pressed flat against the table to move the items underneath it.
One hand shove. Moves many points as a group.
Pinch a pile. Several fingers "pinch" a group of pieces together. In the Surface condition this would define a (very small) hull.

TWO HANDS, Coordinated, more than one group:
Drag two objects with pointer fingers. Does not entail any grouping operations.
Two hands grab points in sync. Each hand has multiple fingers pulling the items under them.
Rhythmic use of both hands. Hand-over-hand, synchronized motion, repeated several or many times.
Two hands grab groups. Hands operate separately to drag groups or individual points.

TWO HANDS, Coordinated, one group:
Both hands coalesce large group to small.
Two-hand transport. Use two hands to grab a group and drag it across the region.
Add/remove from selection. Use one hand to pull an object out of a group held by the other.

BY CONDITION, Surface only:
One hand expand/contract. Use a single hand with a convex hull to grow or shrink a hull.
Two-hand hull tidy. Use fingers from two hands with a convex hull to shrink the hull to make more space.
Two-hand hull expand/contract. Use fingers from two hands with a convex hull to manipulate the hull.
Expand hull to cover desired nodes. Define a hull first, then expand it to cover more nodes. Does not work on our Surface implementation.
Treat finger like a mouse. Includes drawing a lasso or marquee with one or two hands, using different fingers of the hand for a "right click", or holding down one hand to "shift-click" with the other.
Push hard to multi-select. Press a finger harder into the table in the hope of growing the selection or selecting more items in the near vicinity.

BY CONDITION, Physical only:
Lift up. Pick up chips in the hand, carry them across the surface, and deposit them on the other side.
Go outside the lines. Move, stack, or slide chips on the margin of the table, outside the screen area.
Slide around objects. When sliding circles, choose paths across the space that avoid other circles.
"Texture"-based gestures. Slide chips under palms and fingers and shuffle them, using the feel of the chip in the hand.
Toss items from one hand to the other. Take advantage of momentum to slide chips from one hand to the other.
Drag a handful, dropping some on the way. Intentionally let some chips fall out of the hand while holding others, to either spread out a pile or sort them into different groups.

Across all participants the most popular gestures were those that entailed using fingertips to move circles across the table—all participants moved at least some items around that way. While all participants realized they could move physical objects with two hands, six of them never thought to try that with the Surface (three that started in Surface condition; three from group MS). Closer examination of the gesture data revealed that participants who started with the physical condition were much more likely (88%) to try multiple fingers with both hands than users who started with the mouse (56%) or the surface (50%).

When participants worked with two hands on the surface they almost always used them on separate groups: only 30% of participants performed operations that used both hands at once to affect a single group. However, both hands were often used to move groups separately.

We observed several habits from the other conditions that crept into the Surface interactions. For example, 56% of users tried to use their fingers as a mouse, experimenting with using a different finger on the same hand for a multi-select or trying to draw marquees or lassos. Half of the users who started with the mouse continued to try mouse actions on the surface, while 25% of users who started with the physical condition tried mouse actions. More results are summarized in Table 3.

TABLE 2 Gestures that apply to both the physical and surface conditions. Values indicate the percentage of subjects who used the gesture at least once.

| |Physical |Surface |Surface, by 1st condition | | |
|Gesture |(n=16) |(n=32) |After Mouse (n=8) |After Physical (n=8) |After Surface (n=16) |
|1 Hand, Individual Items | | | | | |
|Drag single object |75% |94% |100% |75% |100% |
|Drag objects with fingers |81% |69% |50% |50% |88% |
|Toss single object |38% |19% |0% |13% |31% |
|1 Hand, Groups | | | | | |
|Splayed hand pushes pieces |50% |28% |25% |25% |31% |
|One hand shove |75% |47% |38% |38% |56% |
|Hand and palm |31% |41% |25% |38% |56% |
|Pinch a pile |6% |38% |13% |25% |56% |
|2 Hands, Coordinated, >1 Group | | | | | |
|Drag 2 objects with pointer fingers |63% |63% |50% |88% |56% |
|Two hands grab points in sync |88% |50% |38% |88% |38% |
|Rhythmic use of both hands |56% |41% |50% |63% |25% |
|Both hands grab groups |81% |34% |38% |50% |25% |
|2 Hands, Coordinated, 1 Group | | | | | |
|Both hands coalesce large group |75% |9% |13% |13% |6% |
|Two-hand transport |69% |41% |38% |63% |31% |
|Add/remove from selection |25% |19% |0% |13% |31% |

TABLE 3 GESTURES THAT ONLY APPLY TO SURFACE

| |Mouse 1st (n=8) |Physical 1st (n=8) |Surface 1st (n=16) |
|Hull Resizing | | | |
|One hand hull expand/contract |13% |13% |25% |
|Two-hand hull tidy |0% |25% |6% |
|Two-hand hull expand/contract |25% |63% |56% |
|Expand hull to cover desired nodes (doesn't work) |13% |25% |6% |

We wanted to understand what additional physical operations might be applied to a digital representation. In Table 4, we list operations that users performed in the physical condition that do not have a direct digital analogue. For example, 75% of all participants in the physical condition lifted the chips off the table; and 69% also pushed chips outside of the bounds of the table. Some of these gestures were attempted in the surface condition, but participants quickly realized that they were not supported on the surface. The one exception to this was a gesture to slide objects around other objects when moving them, which was possible in the surface condition although it was unnecessary since selected circles could be dragged through unselected circles.

TABLE 4 GESTURES THAT REFER TO PHYSICAL CONDITION

| |Physical (n=16) |Surface (n=32) |

|Physical Gestures | | |

|Lift Up |75% |3% |

|Go outside the lines |69% |0% |

|Slide around objects |88% |34% |

|"Texture"-based gestures (e.g. flattening a pile |44% |3% |

|Toss items from one hand to other |38% |0% |

|Drag a handful, dropping some on the way |25% |6% |

Timing Results for Task 3

In addition to articulating the set of possible operations, we also wanted to understand which ways of moving multiple objects were most efficient. Do participants do better with the two-handed grouping operations of the surface, or the familiar mouse? We analyzed the task time data with a 2 (Condition) × 2 (Group) mixed ANOVA. Table 5 shows mean completion times with standard deviations for Task 3.

TABLE 5 MEAN COMPLETION TIME FOR TASK 3 IN SECONDS

|Condition Order |MS (n=8) |PS (n=8) |SM (n=8) |SP (n=8) |

|Physical |- |71.0 (14.5) |- |107.6 (13.8) |

|Mouse |123.9 (30.9) |- |144.5 (32.5) |- |

|Surface |116.7 (21.8) |94.9 (30.3) |118.7 (31.2) |146.4 (37.5) |

Surface is faster than Mouse. For the 16 participants who completed the Surface and Mouse conditions, we ran a 2 x 2 mixed ANOVA with condition {Surface, Mouse} as the within subjects variable and order of conditions as the between subjects variable. A significant main effect of condition was found (F1,14=6.10, p=.027) with the surface condition being significantly faster (116 sec) than the mouse condition (134 sec). No significant effect of order was found (F1,14=.928, p=.352) and there was no interaction effect between condition and order (F1,14=1.38, p=.260).
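For readers who want to reproduce this style of analysis, the sketch below shows one way to run a 2 x 2 mixed ANOVA in Python using the pingouin library. The data frame, column names, and numbers are toy values made up for illustration, not the study's actual data.

```python
import pandas as pd
import pingouin as pg

# Toy long-format data: one row per participant per condition.
rows = []
for pid in range(16):
    order = "MS" if pid < 8 else "SM"   # between-subjects factor (condition order)
    rows.append({"participant": pid, "order": order,
                 "condition": "Surface", "time_sec": 110 + pid})
    rows.append({"participant": pid, "order": order,
                 "condition": "Mouse", "time_sec": 125 + 2 * pid})
df = pd.DataFrame(rows)

# 2 (condition, within subjects) x 2 (order, between subjects) mixed ANOVA,
# mirroring the analyses reported in this section.
aov = pg.mixed_anova(data=df, dv="time_sec", within="condition",
                     subject="participant", between="order")
print(aov[["Source", "F", "p-unc"]])
```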

Physical is faster than Surface, and trains users to be faster. For the 16 participants who completed the Surface and Physical conditions we again ran a 2 x 2 mixed ANOVA with condition {Surface, Physical} as the within subjects variable and order of conditions as the between subjects variable. A significant main effect of condition was found (F1,14=11.96, p=.004) with the physical condition being significantly faster (89 sec) than the surface condition (120 sec). In addition, a significant effect of condition order was found (F1,14=11.482, p<.05); for the follow-up comparison, completion times greater than Average + 1.5 * SD were treated as outliers and excluded. An independent samples t-test revealed that participants who performed the physical condition first were significantly faster on the surface than participants who performed the mouse condition first (t12=2.38, p=.035).

Number of Group Operations. In attempting to understand the time difference reported in the previous section, we found that the physical-surface (PS) group used more group operations than the mouse-surface (MS) group: an average of 33 group operations across participants in the PS group against 26 for the MS group. However, this difference was not statistically significant (t14=0.904, p=.381). Of course, multi-touch interaction on the surface affords a number of other types of interaction that may increase efficiency in such a clustering task, e.g., simultaneous selection with multiple fingers or independent hand-over-hand gestures.

User Comments

We asked participants to rate the difficulty of the clustering task on a 7 point Likert scale (1=Very difficult, 7=Very easy). We ran a 2 x 2 mixed ANOVA with condition as the within subjects variable and order of conditions as the between subjects variable. We found a significant main effect of condition (F1,14=5.8, p=.03) with the surface condition being significantly easier (5.5) than the mouse condition (4.9). No significant effect was found between Physical and Surface.

Participants seemed to appreciate the manipulation possibilities of the Surface: when we asked which condition they preferred for performing the clustering task, 14 participants (88%) preferred Surface to Mouse. However, only 7 (44%) preferred Surface to Physical. Interestingly, the performance advantage of the Surface over the Mouse was greater than some participants thought. When we asked which condition felt faster, only 9 participants (56%) felt Surface was faster than Mouse even though 12 (75%) actually did perform faster with Surface. However, 4 participants (25%) felt Surface was faster than Physical even though 3 (19%) were actually faster with Surface.

In verbal comments from participants who used both Physical and Surface, the most commonly cited advantage of Physical was the tactile feedback, i.e. selection feedback by feel rather than visual highlights. In contrast, the most cited advantage of the Surface was the ability to drag selected circles through any intervening circles instead of needing to make a path around them. For the participants who used both Mouse and Surface, the most cited advantage of the Mouse was multi-selecting many dispersed circles by control-clicking, while the most cited advantage of Surface was the ability to use two hands for parallel action.

DISCUSSION

Tabletop multi-touch interfaces such as the Microsoft Surface present new opportunities and challenges for designers. Surface interaction may be more like manipulating objects in the real world than indirectly through a mouse interface, but it still has important differences from the real world, with its own advantages and disadvantages.

We observed that participants use a variety of two-handed coordination. Some participants used two hands simultaneously, some used two hands in sync (hand over hand), some used coordinated hand-offs, and others used some combination of these. As a result, defining a group-by gesture requires some care because participants have different expectations about how grouping may be achieved when they first approach the Surface. In our particular implementation, participants sometimes had difficulty working with two hands independently and close together when our heuristic would make a group selection. We caution future designers of tabletop interfaces to consider this complexity in finding a good balance between physical metaphors and supporting gestures to invoke automation.

Multi-touch grouping turned out to be very useful. Many participants manipulated groups, and seemed to do so without thinking about it explicitly. Possibly the most valuable and common type of group manipulations were ephemeral operations such as the small open-handed grab and move. Massive group operations, such as moving large piles, also helped participants efficiently perform the clustering task. While our current implementation of group-select worked reasonably well as a baseline, we observed some difficulty with our hull system. We believe a better implementation of group select and increased user familiarity with multi-touch tabletop interfaces may bring user efficiency closer to what we observed in the Physical condition.

We have introduced a particular task that may be a useful benchmark for testing the efficiency and ergonomics of a particular type of basic tabletop interaction, but there is a great deal of scope for further studies. As was briefly mentioned in this paper, our study included a more challenging and creative task involving the layout of a network diagram. We intend to follow up on this first exploration with an evaluation of a user-guided automatic layout interface that attempts to exploit the unique multi-touch capability of tabletop systems.

User-Defined Gestures for Surface Computing

Many surface computing prototypes have employed gestures created by system designers. Although such gestures are appropriate for early investigations, they are not necessarily reflective of user behavior. We present an approach to designing tabletop gestures that relies on eliciting gestures from non-technical users by first portraying the effect of a gesture, and then asking users to perform its cause. In all, 1080 gestures from 20 participants were logged, analyzed, and paired with think-aloud data for 27 commands performed with 1 and 2 hands. Our findings indicate that users rarely care about the number of fingers they employ, that one hand is preferred to two, that desktop idioms strongly influence users’ mental models, and that some commands elicit little gestural agreement, suggesting the need for on-screen widgets. We also present a complete user-defined gesture set, quantitative agreement scores, implications for surface technology, and a taxonomy of surface gestures. Our results will help designers create better gesture sets informed by user behavior.

INTRODUCTION

Recently, researchers in human-computer interaction have been exploring interactive tabletops for use by individuals and groups, as part of multi-display environments, and for fun and entertainment. A key challenge of surface computing is that traditional input using the keyboard, mouse, and mouse-based widgets is no longer preferable; instead, interactive surfaces are typically controlled via multi-touch freehand gestures. Whereas input devices inherently constrain human motion for meaningful human-computer dialogue, surface gestures are versatile and highly varied; almost anything one can do with one's hands could be a potential gesture. To date, most surface gestures have been defined by system designers, who personally employ them or teach them to user-testers [14,17,21,27,34,35]. Despite skillful design, this results in somewhat arbitrary gesture sets whose members may be chosen out of concern for reliable recognition. Although this criterion is important for early prototypes, it is not useful for determining which gestures match those that would be chosen by users. It is therefore timely to consider the types of surface gestures people make without regard for recognition or technical concerns.

What kinds of gestures do non-technical users make? In users’ minds, what are the important characteristics of such gestures? Does number of fingers matter like it does in many designer-defined gesture sets? How consistently are gestures employed by different users for the same commands? Although designers may organize their gestures in a principled, logical fashion, user behavior is rarely so systematic. As McNeill writes in his laborious study of human discursive gesture, “Indeed, the important thing about gestures is that they are not fixed. They are free and reveal the idiosyncratic imagery of thought” .

To investigate these idiosyncrasies, we employ a guessability study methodology that presents the effects of gestures to participants and elicits the causes meant to invoke them. By using a think-aloud protocol and video analysis, we obtain rich qualitative data that illuminates users' mental models. By using custom software with detailed logging on a Microsoft Surface prototype, we obtain quantitative measures regarding gesture timing, activity, and preferences. The result is a detailed picture of user-defined gestures and the mental models and performance that accompany them. Although some prior work has taken a principled approach to gesture definition [20,35], ours is the first to employ users, rather than principles, in the development of a gesture set. Moreover, we explicitly recruited non-technical people without prior experience using touch screens (e.g., the Apple iPhone), expecting that they would behave with and reason about interactive tabletops differently than designers and system builders.

This work contributes the following to surface computing research:

a quantitative and qualitative characterization of user-defined surface gestures, including a taxonomy

a user-defined gesture set

insight into users’ mental models when making surface gestures

an understanding of implications for surface computing technology and user interface design.

RELATED WORK

Relevant prior work includes studies of human gesture, eliciting user input, and systems defining surface gestures.

Classification of Human Gesture

Efron conducted one of the first studies of discursive human gesture resulting in five categories on which later taxonomies were built. The categories were physiographics, kinetographics, ideographics, deictics, and batons. The first two are lumped together as iconics in McNeill’s classification. McNeill also identifies metaphorics, deictics, and beats. Because Efron’s and McNeill’s studies were based on human discourse, their categories have only limited applicability to interactive surface gestures.

Kendon showed that gestures exist on a spectrum of formality and speech-dependency. From least to most formal, the spectrum was: gesticulation, language-like gestures, pantomimes, emblems, and finally, sign languages. Although surface gestures do not readily fit on this spectrum, they are a language of sorts, just as direct manipulation interfaces are known to exhibit linguistic properties.

Poggi offers a typology of four dimensions along which gestures can differ: relationship to speech, spontaneity, mapping to meaning, and semantic content. Rossini gives an overview of gesture measurement, highlighting the movement and positional parameters relevant to gesture quantification.

Tang analyzed people collaborating around a large drawing surface. Gestures emerged as an important element for simulating operations, indicating areas of interest, and referring to other group members. Tang noted actions and functions, i.e., behaviors and their effects, which are like the signs and referents in our guessability methodology.

Morris offer a classification of cooperative gestures among multiple users at a single interactive table. Their classification uses seven dimensions. These dimensions address groups of users and omit issues relevant to single-user gestures, which we cover here.

Working on a pen gesture design tool, Long showed that users are sometimes poor at picking easily differentiable gestures. To address this, our guessability methodology resolves conflicts among similar gestures by using implicit agreement among users.

Eliciting Input from Users

Some prior work has directly employed users to define input systems, as we do here. Incorporating users in the design process is not new, and is most evident in participatory design. Our approach of prompting users with referents, or effects of an action, and having them perform signs, or causes of those actions, was used by Good to develop a command-line email interface. It was also used by Wobbrock to design EdgeWrite unistrokes. Nielsen describe a similar approach.

A limited study similar to the current one was conducted by Epps, who presented static images of a Windows desktop on a table and asked users to illustrate various tasks with their hands. They found that the use of an index finger was the most common gesture, but acknowledged that their Windows-based prompts may have biased participants to simply emulate the mouse.

Liu observed how people manipulated physical sheets of paper when passing them on tables and designed their TNT gesture to emulate this behavior, which combines rotation and translation in one motion. Similarly, the gestures from the Charade system were influenced by observations of presenters' natural hand movements.

Other work has employed a Wizard of Oz approach. Mignot studied the integration of speech and gestures in a PC-based furniture layout application. They found that gestures were used for executing simple, direct, physical commands, while speech was used for high-level or abstract commands. Robbe followed this work with additional studies comparing unconstrained and constrained speech input, finding that constraints improved participants' speed and reduced the complexity of their expressions. Robbe Reiter employed users to design speech commands by taking a subset of terms exchanged between people working on a collaborative task. Beringer elicited gestures in a multimodal application, finding that most gestures involved pointing with an arbitrary number of fingers, a finding we reinforce here.

Finally, Voida studied gestures in an augmented reality office. They asked users to generate gestures for accessing multiple projected displays, finding that people overwhelmingly used finger-pointing.

Systems Utilizing Surface Gestures

Some working tabletop systems have defined designer-made gesture sets. Wu and Balakrishnan built RoomPlanner, a furniture layout application for the DiamondTouch supporting gestures for rotation, menu access, object collection, and private viewing. Later, Wu described gesture registration, relaxation, and reuse as elements from which gestures can be built. The gestures designed in both of Wu’s systems were not elicited from users, although usability studies were conducted.

Some prototypes have employed novel architectures. Rekimoto created SmartSkin, which supports gestures made on a table or slightly above it. Physical gestures for panning, scaling, rotating and "lifting" objects were defined. Wigdor studied interaction on the underside of a table, finding that techniques using underside-touch were surprisingly feasible. Tse combined speech and gestures for controlling bird's-eye geospatial applications using multi-finger gestures.

Recently, Wilson et al. used a physics engine with Microsoft Surface to enable unstructured gestures to affect virtual objects in a purely physical manner.

Finally, some systems have separated horizontal touch surfaces from vertical displays. Malik et al. defined eight gestures for quickly accessing and controlling all parts of a large wall-sized display. The system distinguished among 1-, 2-, 3-, and 5-finger gestures, a feature our current findings suggest may be problematic for users. Moscovich and Hughes defined three multi-finger cursors to enable gestural control of desktop objects.

DEVELOPING A USER-DEFINED GESTURE SET

User-centered design is a cornerstone of human-computer interaction. But users are not designers; therefore, care must be taken to elicit user behavior profitable for design.

Overview and Rationale

A human’s use of an interactive computer system comprises a user-computer dialogue, a conversation mediated by a language of inputs and outputs. As in any dialogue, feedback is essential to conducting this conversation. When something is misunderstood between humans, it may be rephrased. The same is true for user-computer dialogues. Feedback, or lack thereof, either endorses or deters a user’s action, causing the user to revise his or her mental model and possibly take a new action.

In developing a user-defined gesture set, we did not want the vicissitudes of gesture recognition to influence users’ behavior. Hence, we sought to remove the gulf of execution from the dialogue, creating, in essence, a monologue in which the user’s behavior is always acceptable. This enables us to observe users’ unrevised behavior, and drive system design to accommodate it. Another reason for examining users’ unrevised behavior is that interactive tabletops may be used in public spaces, where the importance of immediate usability is high.

In view of this, we developed a user-defined gesture set by having 20 non-technical participants perform gestures on a Microsoft Surface prototype. To avoid bias, no elements specific to Windows or the Macintosh were shown. Similarly, no specific application domain was assumed. Instead, participants acted in a simple blocks world of 2D shapes. Each participant saw the effect of a gesture (e.g., an object moving across the table) and was asked to perform the gesture he or she thought would cause that effect (e.g., holding the object with the left index finger while tapping the destination with the right). In linguistic terms, the effect of a gesture is the referent to which the gestural sign refers. Twenty-seven referents were presented, and gestures were elicited for 1 and 2 hands. The system did not attempt to recognize users' gestures, but did track and log all hand contact with the table. Participants used the think-aloud protocol and were videotaped. They also supplied subjective preference ratings.

The final user-defined gesture set was developed in light of the agreement participants exhibited in choosing gestures for each command. The more participants that used the same gesture for a given command, the more likely that gesture would be assigned to that command. In the end, our user-defined gesture set emerged as a surprisingly consistent collection founded on actual user behavior.

Referents and Signs

Conceivably, one could design a system in which all commands were executed with gestures, but this would be difficult to learn. So what is the right number of gestures to employ? For which commands do users tend to guess the same gestures? If we are to choose a mix of gestures and widgets, how should they be assigned? To answer these questions, we presented the effects of 27 commands (i.e., the referents) to 20 participants, and then asked them to invent corresponding gestures (i.e., the signs). The commands were application-agnostic, obtained from desktop and tabletop systems. Some were conceptually straightforward, others more complex. The three authors independently rated each referent's conceptual complexity before participants made gestures. Table 1 shows the referents and ratings.

Participants

Twenty paid participants volunteered for the study. Nine were female. Average age was 43.2 years (sd = 15.6). All participants were right-handed. No participant had used an interactive tabletop, Apple iPhone, or similar. All were recruited from the general public and were not computer scientists or user interface designers. Participant occupations included restaurant host, musician, author, steelworker, and public affairs consultant.

Apparatus

The study was conducted on a Microsoft Surface prototype measuring 24" × 18" set at 1024 × 768 resolution. We wrote a C# application to present recorded animations and speech illustrating our 27 referents to the user. For example, for the pan referent (Figure 1), a recorded voice said, “Pan. Pretend you are moving the view of the screen to reveal hidden off-screen content. Here’s an example.” After the voice finished, our software animated a field of objects moving from left to right. After the animation, the software showed the objects as they were before the panning effect, and waited for the user to perform a gesture.

The Surface vision system watched participants’ hands from beneath the table and reported contact information to our software. All contacts were logged as ovals having millisecond timestamps. These logs were then parsed by our software to compute trial-level measures.
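A sketch of how such contact logs might be reduced to trial-level measures is shown below. The record format and the two measures computed here are hypothetical illustrations; the study's own log format and parser are not described in enough detail to reproduce exactly.

```python
from dataclasses import dataclass

@dataclass
class ContactEvent:
    """One logged contact sample (hypothetical format, not the study's)."""
    t_ms: int        # millisecond timestamp
    contact_id: int  # id assigned to a finger or palm while it is down
    x: float
    y: float

def trial_measures(events):
    """Reduce one trial's contact events to simple trial-level measures."""
    if not events:
        return {"duration_ms": 0, "distinct_contacts": 0}
    times = [e.t_ms for e in events]
    return {
        "duration_ms": max(times) - min(times),
        "distinct_contacts": len({e.contact_id for e in events}),
    }
```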

Participants’ hands were also videotaped from four angles. In addition, two authors observed each session and took detailed notes, particularly concerning the think-aloud data.

[pic]

[pic]

Procedure

Our software randomly presented 27 referents (Table 1) to participants. For each referent, participants performed a 1-hand and a 2-hand gesture while thinking aloud, and then indicated whether they preferred 1 or 2 hands. After each gesture, participants were shown two 7-point Likert scales concerning gesture goodness and ease. With 20 participants, 27 referents, and 1 and 2 hands, a total of 20 × 27 × 2 = 1080 gestures were made. Of these, 6 were discarded due to participant confusion.

RESULTS

Our results include a gesture taxonomy, the user-defined gesture set, performance measures, subjective responses, and qualitative observations.

Classification of Surface Gestures

As noted in related work, gesture classifications have been developed for human discursive gesture [4,11,15], multimodal gestures with speech [20], cooperative gestures [17], and pen gestures [13]. However, no work has established a taxonomy of surface gestures based on user behavior to capture and describe the gesture design space.

Taxonomy of Surface Gestures

The authors manually classified each gesture along four dimensions: form, nature, binding, and flow. Within each dimension are multiple categories, shown in Table 2.

The scope of the form dimension is within one hand. It is applied separately to each hand in a 2-hand gesture. One-point touch and one-point path are special cases of static pose and static pose and path, respectively. These are worth distinguishing because of their similarity to mouse actions. A gesture is still considered a one-point touch or path even if the user casually touches with more than one finger at the same point, as our participants often did. We investigated such cases during debriefing, finding that users’ mental models of such gestures involved only one contact point.

[pic]

[pic]

In the nature dimension, symbolic gestures are visual depictions. Examples are tracing a caret (“^”) to perform insert, or forming the O.K. pose on the table for accept. Physical gestures should ostensibly have the same effect on a table with physical objects. Metaphorical gestures occur when a gesture acts on, with, or like something else. Examples are tracing a finger in a circle to simulate a “scroll ring,” using two fingers to “walk” across the screen, pretending the hand is a magnifying glass, swiping as if to turn a book page, or just tapping an imaginary button. Of course, the gesture itself usually is not enough to reveal its metaphorical nature; the answer lies in the user’s mental model. Finally, abstract gestures have no symbolic, physical, or metaphorical connection to their referents. The mapping is arbitrary, which does not necessarily mean it is poor. Triple-tapping an object to delete it, for example, would be an abstract gesture.

In the binding dimension, object-centric gestures only require information about the object they affect or produce. An example is pinching two fingers together on top of an object for shrink. World-dependent gestures are defined with respect to the world, such as tapping in the top-right corner of the display or dragging an object off-screen. World-independent gestures require no information about the world, and generally can occur anywhere. We include in this category gestures that can occur anywhere except on temporary objects that are not world features. Finally, mixed dependencies occur for gestures that are world-independent in one respect but world-dependent or object centric in another. This sometimes occurs for 2-hand gestures, where one hand acts on an object and the other hand acts anywhere.

A gesture’s flow is discrete if the gesture is performed, delimited, recognized, and responded to as an event. An example is tracing a question mark (“?”) to bring up help. Flow is continuous if ongoing recognition is required, such as during most of our participants’ resize gestures. Discrete and continuous gestures have been previously noted.
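To make the four dimensions concrete, the sketch below encodes the taxonomy as simple enumerations and classifies one example gesture. The category names come from the text above (the complete form breakdown appears in Table 2); the data structure itself is only an illustration, not part of the original study.

```python
from dataclasses import dataclass
from enum import Enum

class Form(Enum):                             # coded separately for each hand
    STATIC_POSE = "static pose"
    STATIC_POSE_AND_PATH = "static pose and path"
    ONE_POINT_TOUCH = "one-point touch"       # special case of static pose
    ONE_POINT_PATH = "one-point path"         # special case of static pose and path
    # Table 2 of the study lists the complete set of form categories.

class Nature(Enum):
    SYMBOLIC = "symbolic"
    PHYSICAL = "physical"
    METAPHORICAL = "metaphorical"
    ABSTRACT = "abstract"

class Binding(Enum):
    OBJECT_CENTRIC = "object-centric"
    WORLD_DEPENDENT = "world-dependent"
    WORLD_INDEPENDENT = "world-independent"
    MIXED = "mixed dependencies"

class Flow(Enum):
    DISCRETE = "discrete"
    CONTINUOUS = "continuous"

@dataclass
class GestureClassification:
    form: Form
    nature: Nature
    binding: Binding
    flow: Flow

# Example: dragging an object with one fingertip would plausibly be coded
# as a one-point path that is physical, object-centric, and continuous.
drag = GestureClassification(Form.ONE_POINT_PATH, Nature.PHYSICAL,
                             Binding.OBJECT_CENTRIC, Flow.CONTINUOUS)
```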

Taxonometric Breakdown of Gestures in our Data

We found that our taxonomy adequately describes even widely differing gestures made by our users. Figure 2 shows for each dimension the percentage of gestures made within each category for all gestures in our study.

An interesting question is how the conceptual complexity of referents (Table 1) affected gesture nature (Figure 2). The average conceptual complexity for each nature category was: physical (2.11), abstract (2.99), metaphorical (3.26), and symbolic (3.52). Logistic regression indicated that these differences were significant. Thus, simpler commands more often resulted in physical gestures, while more complex commands resulted in metaphorical or symbolic gestures.

A User-defined Gesture Set

At the heart of this work is the creation of a user-defined gesture set. This section gives the process by which the set was created and the properties of the set. Unlike prior gesture sets for surface computing, this set is based on observed user behavior and joins gestures to commands.

Agreement

After all 20 participants had provided gestures for each referent for one and two hands, we grouped the gestures within each referent such that each group held identical gestures. Group size was then used to compute an agreement score A that reflects, in a single number, the degree of consensus among participants.

A = \frac{\sum_{r \in R} \sum_{P_i \subseteq P_r} \left( \frac{|P_i|}{|P_r|} \right)^{2}}{|R|}    (Eq. 1)

In Eq. 1, r is a referent in the set of all referents R, Pr is the set of proposed gestures for referent r, and Pi is a subset of identical gestures from Pr. The range for A is [|P_r|^{-1}, 1]. As an example, consider agreement for move a little (2-hand) and select single (1-hand). Both had four groups of identical gestures. The former had groups of size 12, 3, 3, and 2; the latter of size 11, 3, 3, and 3. For move a little, we compute

\left( \tfrac{12}{20} \right)^{2} + \left( \tfrac{3}{20} \right)^{2} + \left( \tfrac{3}{20} \right)^{2} + \left( \tfrac{2}{20} \right)^{2} = 0.415

For select single, we compute

\left( \tfrac{11}{20} \right)^{2} + 3 \left( \tfrac{3}{20} \right)^{2} = 0.37
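The agreement computation reduces to a few lines of code. The sketch below takes, for each referent, the sizes of its groups of identical gestures and reproduces the two worked examples; it assumes nothing beyond the formula above, and the function names are my own.

```python
def referent_agreement(group_sizes):
    """Agreement for one referent, from the sizes of its identical-gesture groups."""
    total = sum(group_sizes)  # |Pr|: all gestures proposed for this referent
    return sum((size / total) ** 2 for size in group_sizes)

def overall_agreement(groups_per_referent):
    """Average the per-referent agreement over all referents (Eq. 1)."""
    return (sum(referent_agreement(g) for g in groups_per_referent)
            / len(groups_per_referent))

# The two worked examples above:
print(round(referent_agreement([12, 3, 3, 2]), 3))  # 0.415 (move a little, 2 hands)
print(round(referent_agreement([11, 3, 3, 3]), 3))  # 0.37  (select single, 1 hand)
```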

Agreement for our study is graphed in Figure 3. The overall agreement for 1- and 2-hand gestures was A1H = 0.32 and A2H = 0.28, respectively. Referents' conceptual complexities (Table 1) correlated significantly and inversely with their agreement (r = -.52, F1,25 = 9.51, p ...).
