Synesthetic Recipes: foraging for food with the family, in taste-space



Presented at SIGGRAPH 2003 and published in Conference Abstracts and Applications

Published in the Proceedings of SIGGRAPH 2002, Sketches and Applications, in Conference Abstracts and Applications.

Textable Movie: improvising with a personal movie database

Hover: Conveying Remote Presence

Synesthetic Recipes:
Hugo Liu (Counter Intelligence Group, MIT Media Lab), hugo@media.mit.edu
Matthew Hockenberry (Counter Intelligence Group, MIT Media Lab), hock@media.mit.edu
Ted Selker (Counter Intelligence Group, MIT Media Lab), selker@media.mit.edu
MIT Media Lab, 77 Massachusetts Ave, Cambridge, MA 02139

Textable Movie:
Catherine Vaucelle (Story Networks Group, Media Lab Europe / MIT Media Lab), cati@media.mit.edu
Glorianna Davenport (Interactive Cinema Group / Story Networks Group, MIT Media Lab / Media Lab Europe), gid@media.mit.edu
Tristan Jehan (Hyperinstruments Group, MIT Media Lab), tristan@media.mit.edu

Dan Maynes-Aminzade, Ken Goulding*, Beng-Kiang Tan**, Catherine Vaucelle***

MIT Media Lab, Tangible Media Group, One Cambridge Centre, Cambridge, MA 02139 USA, monzy@media.mit.edu

MIT, Dept. of Urban Studies & Planning*, 77 Mass Ave, Cambridge, MA 02139 USA, goulding@mit.edu*

Harvard Graduate School of Design, 48 Quincy Street, Cambridge, MA 02138 USA, btan@gsd.harvard.edu**

MIT Media Lab, Gesture and Narrative Language Group, 20 Ames St., Cambridge, MA 02139 USA, cati@media.mit.edu***


Abstract

This paper presents a new answer to one very old question: "What's for dinner?" Synesthetic Recipes is a graphical interface that allows a person to brainstorm dinner recipe ideas by describing how they imagine the recipe should taste (e.g. "hearty, mushy, moist, aromatic"); to keep mindful of the tastebuds of family members, on-screen avatars anticipating their reactions to recipes enrich the brainstorm with just-in-time family feedback.

This sketch presents a new approach to improvising movies according to the inter-relationship between personal videos and the story of an experience. Textable Movie is a graphical interface that invites a storyteller of any age to compose and visualize movies, images, and sound environments while writing a story; the system self-selects and self-edits movies in real time based on the teller's textual input. Textable Movie aims to exalt the imagination of its authors (writer and film-maker) by immersing them, in real time, in a co-constructed narration.

This sketch presents Hover, a device that enhances remote telecommunication by providing a sense of the activity and presence of remote users. The motion of a remote person is manifested as the playful movements of a ball floating in midair. Hover is both a communication medium and an aesthetic object.

Keywords

Tangible interface, sense of presence, awareness, personal communication

1 Introduction

Deciding "What's for dinner?" is such a basic problem facing humankind, yet with all our technological might, there is a paucity of elegant tools to support this task. Form- and ontology-based search interfaces to recipe databases unfairly reduce the artistic and creative process of foraging for food to something as bluntly straightforward as 'locating a document'. Also, whereas people may naturally articulate their cravings in terms of how they imagine a dish to taste, e.g. 'rich', 'spicy', 'homey', there was, until now, no way to navigate recipes using this taste-space language, since the literal searchable text of recipes is restricted to ingredients, cooking algorithms, and basic classifications. Intending to design a more natural interaction, we introduce Synesthetic Recipes, a graphical interface that supports creative and social navigation over a 60,000-recipe database.

When Marcel Proust writes about having tea and cookies, he is inspired by having the experience himself, which brings back memories to his mind. This is well known as the "madeleine of Proust" phenomenon [1]. With Textable Movie, we would like to recreate this same phenomenon by instantly presenting to users videos from their own footage. By immersion in their own memories, they can become engaged in telling rich and passionate stories based on past experience. Textable Movie is inspired by previous work that annotates images in order to retrieve them within a specific context [2], and it extends the concept to the making of movies with a number of automatic functions.

Video-editing tools usually provide the user with many parametric functions for creating movies, but do so with a certain constraint on spontaneity, and they generally miss the spontaneous relationship between the narration of a recalled experience and its audio-visual recording. With Textable Movie, the videos can stimulate the author's imagination while the experience is being recalled. Even though no empirical data has been collected at this time, we hypothesize that the pictures may lead the fantasy of the user during this improvisation. The inter-relationship between personal videos and past experiences can enhance the creative process.

Textable Movie both retrieves segments of movies from the database and plays them in real time above the typed story. It creates a new movie, a storybook, and a narrative in real time, in a transparent and easy way.

Face-to-face dialogue is more engaging than telephone conversation because of the added elements of gesture, touch, and body language. Video conferencing attempts to provide these missing elements, but does so at the cost of high bandwidth, expensive equipment, and heightened demands on the attention of the user. Hover provides a low-cost, low-bandwidth, less distracting solution that enhances the experience of telephone conversations with family and friends.

Textable Movie provides a sense of the direct relation between text and movies in a recalled story: it gives real-time visual feedback on the storyline of the narration by instantly creating a storybook and by finalizing a movie together with its written story, while commands for inserting scenes, zooming in and out, and modifying the colors of the movies guide the teller throughout the narration.

Hover provides visual awareness of remote persons in the form of an abstract physical representation with several affordances:

The ability to "touch" the remote person;

A real-time indication of the level of physical activity of the remote person;

A real-time indication of the presence and absence of the remote person;

The ability to personalize the representations of the remote persons in a way that makes sense to the user; and

The ability to grasp and interact with a "surrogate" representing the remote person.

Textable Movie is not intended to convey an automatic story or the meaning of movie segments, but rather the story of the author, immersing the author in his or her own past experience and exalting his or her imagination. Likewise, Hover is not intended to convey the meaning of gestures in a conversation, but rather a sense of the presence of the remote person in a captivating and poetic fashion.


RELATED WORK

Previous studies of awareness and embodiments [1][2] have concentrated on detecting presence, peripheral awareness, or integrating audio and video for communication. Hover focuses on intentional awareness, representing presence and communicating a general sense of the physical activity of a remote person. It is similar to approaches that use "tangibles" (i.e. physical objects) to communicate, to show activity or presence, or to support intimacy [2][3][4][5][6]. We wanted a solution that augments existing personal communication on a telephone, but that can function as an easy-to-use, low-bandwidth desktop accessory.

2 Description

Textable Movie retrieves movie segments and sound environments from a specified database by analyzing textual input. It loads and plays them in real time while the story is being typed; consequently, a novel movie is created in a very transparent and easy manner.

Synesthetic Recipes re-conceptualizes recipe search as interactive, inexact, and family-mindful brainstorming. Forms and ontologies are eliminated in favor of a simple Google-style query box, which strives to accept any keywords a person might use to describe their craving. Currently, it accepts 5,000 ingredient keywords (e.g. "chicken", "Tabasco sauce"), 1,000 taste-space sensorial keywords (e.g. "spicy", "chewy", "silky", "colorful"), and 400 nutrient keywords (e.g. "vitamin A", "manganese"), as well as all their negations (e.g. "no chicken", "not spicy"). As the food 'forager' types keywords (Fig. 1a), relevant recipe suggestions continuously fade in and out on little scraps of paper laid over a plate; the opacity of a suggestion scrap signifies its relevance. Here, the information 'pull' of a traditional search is turned into a soft 'push' of suggestions, and the whole process is more amenable to interactive refinement: for example, a forager sees some beef recipes, decides against beef, and so simply adds "no beef" as the next keyword, all without switching contexts. To make recipes searchable by taste-space keywords, which are often left out of a recipe's text, the system benefits from commonsense knowledge about cooking from the Open Mind Thought for Food database, cf. [Singh, Barry & Liu 2004], containing 21,000 sensorial facts about 4,200 ingredients (e.g. "lemons are sour") and 1,300 facts about 400 procedures (e.g. "whipping something makes it fluffy").
The 60,000-recipe database is first parsed using natural language tools, and food commonsense is applied to infer the likely tastes, aromas, textures, and character of a cooked recipe; these annotations allow recipes to be searched using more intuitive vocabularies of taste. When the forager clicks on a recipe suggestion (Fig. 1d), the recipe text is rendered with semantic highlighting such that the essence of the query is intelligible at a glance; e.g. you searched for spicy, and now all the spicy ingredients are highlighted in the recipe view. Of course, deciding what to make for dinner cannot occur in a social vacuum, as the tastes of family members need to be considered. To enrich the Synesthetic Recipes brainstorm with family-mindfulness, avatars embody the tastebuds of family members (Fig. 1b); Sally can fill in her avatar with keywords and be assured that Mom makes a dinner she likes. As Mom forages for recipes (Fig. 1c), Sally's avatar emotes love, neutral, or hate reactions based on how well the current recipe satisfies Sally's tastebuds; avatars also pop up dialogs to express more specifically what they love or hate about a recipe. That emoting character interfaces can positively support decision-making in socially mindful tasks was reported in [Taylor et al. 1999].
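The fade-in/fade-out behavior of the suggestion scraps can be illustrated with a small relevance-to-opacity sketch. This is a hypothetical reconstruction, not the system's actual code: the tag sets, function names, and the 0.15 opacity floor are all illustrative assumptions.

```python
NEG_PREFIXES = ("no ", "not ")

def negated(word):
    """Return the term being negated, or None (e.g. 'no beef' -> 'beef')."""
    for prefix in NEG_PREFIXES:
        if word.startswith(prefix):
            return word[len(prefix):].strip()
    return None

def score_recipe(recipe_tags, query):
    """Score a recipe's annotation tags against query keywords; negations veto."""
    hits = 0
    for word in query:
        term = negated(word)
        if term is not None:
            if term in recipe_tags:
                return 0.0            # negated term present: recipe is irrelevant
        elif word in recipe_tags:
            hits += 1
    return hits / max(len(query), 1)  # normalized relevance in [0, 1]

def opacity(score):
    """Map relevance to suggestion-scrap opacity; faint but visible at zero."""
    return round(0.15 + 0.85 * score, 2)

# Illustrative tags inferred from commonsense annotation of a recipe.
tags = {"beef", "spicy", "hearty", "stew"}
print(opacity(score_recipe(tags, ["spicy", "hearty", "no chicken"])))
```

A negated keyword vetoes a recipe outright, which matches the "no beef" refinement behavior described above; a softer penalty would also be a reasonable design choice.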

|[pic] |[pic] |

|[pic] |[pic] |

Figure 1. (clockwise from top-left) a) as keywords are typed, recipe suggestions fade in and out; b) avatars' tastebuds are keyword-programmable; c) avatars react to browsing with love-neutral-hate reactions and dialogs; d) recipe text is semantically highlighted to show the 'essence' of the query. (web.media.mit.edu/~hugo/demos/SR-SIGGRAPH2005.mov)

The simplicity of use and the immediate response could help the user focus on storytelling rather than on the technical load of editing.

The system can easily be connected to any personal movie database, and simply requires a text file with a series of descriptive keywords for each clip. For example, the short keyword sequence [forest.mov, forest nature tree wood leaves Yosemite;] could describe a personal 10-second video clip of the Yosemite park forest, called "forest.mov". The personal labeling is important, as it allows the user to give the medium his or her own meaning. The current version also features a series of simple commands, which add instant manipulations of the movie being played. These commands are typed directly in the text, and include: [closeup] to zoom in on the frame, [faster] and [slower] to change the speed rate, [loop] and [palindrome] to loop in a normal or palindrome fashion, and [spring], [summer], [fall], and [winter] to alter the overall coloration of the image. Deleting the command makes the effect disappear (see Figure 1).
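The label file and the bracketed commands described above suggest a very small matcher. The following sketch is a hypothetical reconstruction under the file format given in the text ("filename, keywords;"); it is not the actual Textable Movie implementation, and the function names are illustrative.

```python
COMMANDS = {"closeup", "faster", "slower", "loop", "palindrome",
            "spring", "summer", "fall", "winter"}

def load_index(label_text):
    """Parse lines like 'forest.mov, forest nature tree wood leaves Yosemite;'
    into a keyword -> [clip filenames] index."""
    index = {}
    for line in label_text.splitlines():
        line = line.strip().rstrip(";")
        if not line:
            continue
        filename, _, keywords = line.partition(",")
        for kw in keywords.split():
            index.setdefault(kw.lower(), []).append(filename.strip())
    return index

def interpret(story, index):
    """Return (clips to play, active commands) for the story typed so far.
    Deleting a [command] from the text removes its effect on re-interpretation."""
    clips, commands = [], []
    tokens = story.lower().replace("[", " [").replace("]", "] ").split()
    for token in tokens:
        if token.startswith("[") and token.endswith("]"):
            name = token[1:-1]
            if name in COMMANDS:
                commands.append(name)
        elif token.strip(".,!?") in index:
            clips.extend(index[token.strip(".,!?")])
    return clips, commands

idx = load_index("forest.mov, forest nature tree wood leaves Yosemite;")
print(interpret("I was in a forest [closeup]", idx))  # clips and commands found
```

Because the whole story is re-interpreted on every keystroke, removing a command token from the text naturally makes its effect disappear, as the paper describes.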


Textable Movie can be connected to any personal database of movies: it is a graphical interface that looks for movies and sound files on a computer according to a list of keywords. The user places personal movies on the computer and personalizes them with labels representing his or her own meaning of the media; e.g. the text "forest" is associated with a movie file such as "forest.mov", but it could also be linked to "leaves.mov" if that is what the author thinks represents his or her notion of the forest. The author can associate many keywords with one movie, and the storyline is created automatically according to storybook decisions (e.g. scene 1: in the park; scene 2: at home), to facilitate the construction of the storyboard in real time.

Hover is designed to be an aesthetic object as well as a functional one. It uses technology that can operate in a home or office environment. Hover is connected to a computer and a telephone or an Internet phone. There is a stand on the Hover platform on which the user places multicolored balls representing the persons with whom she frequently communicates (e.g. family and friends). She can personalize the balls by painting different colors and patterns on them, or by pasting stickers or photos on them.

Synesthetic Recipes is rooted in two years of research on computer representations of food, done in collaboration with Barbara Wheaton, food historian and curator of Harvard's Schlesinger Food Library. Its earlier versions have been exhibited at several open houses of the MIT Media Lab, where hundreds of people have played with the system; many technologists, designers, mothers, and food critics have experimented with the system and contributed to its evolving design. In the near future, we hope to make the tool widely available for Mums (and Dads) everywhere.

In a Textable Movie scenario, Jane wants to tell the story of her holidays to her friend Paul. She first digitizes her movies onto her computer and labels them with her own descriptions. She opens Textable Movie and begins her story: "I was in a forest". The movie of leaves appears instantly on the screen, floating over the text. These bright leaves remind her of a giant rock she saw behind a tree, so she types: "Then, behind this tree, an amazing rock was lying there, in the middle of nowhere..." and the huge rock appears. She wants to zoom inside the video and types [closeup]; the movie of the rock becomes bigger and bigger. She continues: "I heard birds around me", and the birds sing. When she decides her movie is finished, she clicks on the "Play all" button, looks at the movie she has created, and is surprised by the association of movies and sounds. She finalizes her narrative and sends movie and text together to her friend. Paul receives an unusual movie: it is a way for him to notice the elements that are most important to Jane. The text supports the narration, but the visualized story and its storyline express even more of what Jane wants to share, less formally and more spontaneously.

In a Hover scenario, Peter calls Jane. On Jane's end, the ball that represents Peter floats when the call comes in (Figure 1). If Jane wants to pick up the call, she grabs the ball and puts it on the ramp. The ball rolls across a sensor on the ramp, sending a signal to Peter's end to indicate that Jane has picked up the call. At the foot of the ramp, an air stream levitates the ball (Figure 2). As Peter speaks, Jane sees the ball floating up and down in correspondence with Peter's movements (Figure 3). This conveys to Jane a sense of Peter's presence and level of activity. If Peter is inactive, the ball hovers at a fixed height. When the telephone conversation ends and both parties hang up, the ball stops floating and the user places it back in its original position.

Our current implementation of Textable Movie uses a software engine to detect the presence and absence of specific keywords in the typed story. It can associate multiple keywords with the same movie file and plays it on the screen in real time; as the keys are typed, videos, images, and sound environments are played.

[pic] [pic]

Figure 1. At Jane's end, the ball floats when a call comes in; she grabs the ball to answer the call. In Textable Movie, as the author types "forest", a video of the forest he has experienced appears on the screen and plays in real time.

Following the story, the video of a rock appears on the screen.

[pic]

Figure 1. Hover at Jane's end.

If a person feels the need to "touch" the remote person, he can hold the ball in his hand or touch it while it is floating. However, the prototype does not currently support two-way communication of the tactile sense.

[pic] [pic]

Figure 2. At Jane's end, the ball rolls down the ramp and an air stream levitates it. In Textable Movie, because of the [closeup] command, the rock instantly becomes bigger; and because the author mentions that it was cold and wants a winter look, the [winter] command applies the pre-set "winter" colors to the current movie.

[pic] [pic][pic]

Figure 3. In conversation: the ball floats up and down in correspondence with the remote person's movement.

Future work: application of Textable Movie in cross-cultural settings

Our future work is based on research on an online intercultural training system that uses multimedia tools to reflect on someone else's culture [3]. Textable Movie would be applied to a System of Networked Stories that would connect movies made by adolescents from different communities and countries. Each community would create its own database of movies and, using Textable Movie, would tell stories about its everyday life activities to the other communities through the networked system. This would be a meaningful way of reflecting on someone else's culture, as well as an engaging experience of one's own.

Discussion and future work

Textable Movie has been presented to, and experimented with by, many tellers at an open house of Media Lab Europe in Dublin, and has engaged them in various tales, even though they all used the same footage. The direct and instant relationship between text and movie seems to be quite effective, surprising, aesthetic, inspiring, and fun.

The core engine of Textable Movie has already been used in other applications, e.g. a system that retrieves and displays images by analyzing text messages sent from a cellular phone. In its future version, based on research about online intercultural training [3], our system will be networked and used as a multimedia tool to reflect on someone else's culture. The communities (currently, adolescents from Dublin, Boston, and Sao Paulo) will share their movies and will learn about each other's perception of their environment.


Acknowledgments

We would like to thank Glorianna Davenport for her support, Tristan Jehan for his help with the text analysis of Textable Movie, the researchers in the Story Networks Group at MLE for their feedback, and the students in the Interactive Cinema Group for their valuable advice.

References

Proust, M. À la recherche du temps perdu: Du côté de chez Swann, 1913; tr. Remembrance of Things Past, 1922.

Lieberman, H., and Liu, H. Adaptive Linking between Text and Photos Using Common Sense Reasoning. Conference on Adaptive Hypermedia and Adaptive Web Systems, Malaga, Spain, May 2002.

Asakawa, T., et al. To appear in the Proceedings of ED-MEDIA 2003, World Conference on Educational Multimedia.

Singh, P., Barry, B., and Liu, H. 2004. Teaching Machines about Everyday Life. BT Technology Journal 22(4), 227-240. Kluwer.

Taylor, I.C., et al. 1999. Providing animated characters with designated personality profiles. Embodied Conversational Characters, 87-92.

IMPLEMENTATION

Our current implementation uses a camera with vision tracking, attached to a computer. The computer analyzes the video stream and detects the level of motion onscreen. It sends this information to the remote computer, which relays the motion data to the remote Hover device. The Hover unit reads the data from the serial port and sends the appropriate electrical pulses to the fan and servo motor to control the position and the height of the ball. Varying the duty cycle of the pulse sent to the fan motor changes the volume of air produced, which in turn controls how high the ball levitates. If the remote person is gesticulating wildly, the ball floats rapidly up and down, conveying a sense of urgency.
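The duty-cycle control described here can be sketched as a simple mapping from detected motion to fan power. This is an illustrative reconstruction, not the Hover firmware: the idle duty cycle and gain values below are assumptions (the text does not specify them), and the motion level is assumed to be normalized to the range 0 to 1.

```python
def fan_duty_cycle(motion_level, idle_duty=0.45, gain=0.4):
    """Map normalized remote motion (0.0 to 1.0) to a PWM duty cycle.

    idle_duty keeps the ball hovering at a fixed height when the remote
    person is inactive; gain scales how strongly gestures push the ball
    up and down. Both values are illustrative assumptions.
    """
    level = min(max(motion_level, 0.0), 1.0)   # clamp noisy vision estimates
    return min(idle_duty + gain * level, 1.0)  # never exceed full fan power

# Inactive remote person: ball hovers at a fixed height.
print(fan_duty_cycle(0.0))
# Wild gesturing: near full airflow, ball floats rapidly up and down.
print(fan_duty_cycle(1.0))
```

Varying the duty cycle of the pulse train changes the average fan power, and hence the air volume and the ball's height, matching the behavior described above.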

CONCLUSION AND FUTURE WORK

One user commented that the floating ball is so magical that she actually pays much more attention to it. Even though the device currently supports only a one-to-one telephone conversation, it could be extended to support multi-party telephone conferencing or a chat-room environment of family or close friends. In the future, we would also like to integrate the capture of expressions more closely with a telephone device. For instance, the telephone could have embedded sensors that detect the intensity of the remote person's grip and map it onto Hover.

As a generic three-dimensional display device, Hover may also be suitable for other applications where physical movement can be mapped into an artistic representation of that movement, e.g. baby monitoring or pet monitoring.

We believe that the poetic and aesthetic nature of the display could be a powerful way to complement existing communication systems.

ACKNOWLEDGMENTS

We thank Prof. Hiroshi Ishii of the MIT Media Lab for his support and the students of the Tangible Interfaces course for their feedback.

REFERENCES

Pedersen, E.R. and T. Sokoler: AROMA – Abstract representation of mediated presence supporting mutual awareness. Proceedings of the CHI ’97 Conference, ACM Press, 1997.

Kuzuoka, H., and Greenberg, S. (1999). Mediating Awareness and Communication through Digital and Physical Surrogates. Proceedings of the ACM SIGCHI '99 Conference Extended Abstracts.

Strong, R., and Gaver, W.W. (1996). Feather, scent and shaker: Supporting simple intimacy. Proceedings of CSCW’96.

Ishii, H., Ren, S. and Frei, P., Pinwheels: Visualizing Information Flow in an Architectural Space. Extended Abstracts of CHI '01, ACM Press, pp.111-112.

Ishii, H., and Ullmer, B. Tangible Bits: Towards Seamless Interfaces between People, Bits, and Atoms. Proceedings of CHI '97, ACM Press, 1997, pp. 234-241.

Chang, A., Resner, B., Koerner B., Wang, X and Ishii, H., LumiTouch: An Emotional Communication Device. Extended abstracts of CHI’01, ACM Press, pp. 313-314.

