Tuesday, April 28, 2009

Getting to Know the XO

Since our project will run on the hardware of the XO laptops, it makes sense to get familiar with the machine. Below is a video recorded with the XO's built-in camera; the resolution is poor and discouraging. Everything suggests that the best option is to use a webcam connected to the XO instead of the camera that ships with the machine.


video


Next, a similar video recorded with the webcam of a conventional laptop.


video




Here is the output of the "lspci" command run on the XO:

00:01.0 Host bridge: Advanced Micro Devices [AMD] Unknown device 0028 (rev 21)
00:01.1 VGA compatible controller: Advanced Micro Devices [AMD] Geode LX Video
00:01.2 Entertainment encryption device: Advanced Micro Devices [AMD] Geode LX AES Security Block
00:0c.0 FLASH memory: Marvell Technology Group Ltd. Unknown device 4100 (rev 10)
00:0c.1 SD Host controller: Marvell Technology Group Ltd. Unknown device 4101 (rev 10)
00:0c.2 Multimedia video controller: Marvell Technology Group Ltd. Unknown device 4102 (rev 10)
00:0f.0 ISA bridge: Advanced Micro Devices [AMD] CS5536 [Geode companion] ISA (rev 03)
00:0f.3 Multimedia audio controller: Advanced Micro Devices [AMD] CS5536 [Geode companion] Audio (rev 01)
00:0f.4 USB Controller: Advanced Micro Devices [AMD] CS5536 [Geode companion] OHC (rev 02)
00:0f.5 USB Controller: Advanced Micro Devices [AMD] CS5536 [Geode companion] EHC (rev 02)


The XO's hardware specifications can be found here.

Building a Prototype with openFrameworks

We have started building a basic prototype. So far it takes the webcam as input and recognizes the contours and orientations of the objects shown to it.

We also obtain some properties of the objects, such as their area (for now we do not consider three-dimensional objects; we work with 2D drawings), and we are working on recognizing shapes such as quadrilaterals, triangles, and circles.
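
As an illustration of the kind of shape recognition described above, here is a minimal Python sketch. It is not the prototype's openFrameworks code, and the post does not say which method the prototype will use. The sketch assumes the contour has already been simplified to a short list of (x, y) vertices; `classify_contour` and its 0.9 circularity threshold are hypothetical choices made only for this example.

```python
import math


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of the edge lengths of the closed polygon."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:] + points[:1]))


def classify_contour(points, circle_threshold=0.9):
    """Hypothetical classifier over an already simplified contour.

    `points` is a list of (x, y) vertices. Three vertices are called a
    triangle and four a quadrilateral; anything else is called a circle
    only if its circularity 4*pi*A/P^2 (1.0 for a perfect circle) is high.
    """
    if len(points) == 3:
        return "triangle"
    if len(points) == 4:
        return "quadrilateral"
    area = polygon_area(points)
    perimeter = polygon_perimeter(points)
    circularity = 4 * math.pi * area / perimeter ** 2
    return "circle" if circularity >= circle_threshold else "other"
```

Counting vertices only works if the contour is simplified first (for example with a polygon approximation step), so a real implementation would need that stage as well.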

We found a paper, "Desarrollo de interfaces naturales para aplicaciones dirigidas a niños" (Development of natural interfaces for applications aimed at children), which proposes the following recognition strategy.

This section details the recognition process and the techniques used in each of its phases. For now, the system can recognize the shape, color, and orientation of pieces of card stock spread over the desk (a uniform white background); to do so, basic standard visual recognition algorithms have been implemented.

The process is explained below (see Figure 1).


1. Capture. As mentioned, the visual capture hardware is a webcam connected to the PC through a USB port. A call to the Windows "VideoForWindows" library returns an array of pixels of dimension 320 x 200, in which each pixel is represented by its RGB (Red, Green, Blue) color components, with 8 bits of resolution per component.



2. Image thresholding. An automatic threshold value is generated, representing a luminance value (between 0 and 255) that separates the desk from the tangible elements. A binary array is generated. A median filter is applied to remove noise from the video signal.


3. Retinal vision. This phase handles motion detection. The goal is to avoid showing "false" results on screen and to wait until the user has finished arranging the tangible elements.



Figure 1. Flow of the visual recognition process


4. Image segmentation. With the image static, and starting from the binary array, the tangible elements on the desk are counted and located. The standard blob labeling algorithm is then applied (blobs are points or regions in an image that are either brighter or darker than their surroundings).


5. Blob parametrization. Once the blobs have been isolated and labeled, they are parametrized: the values that characterize the physical properties of the tangible elements detected during segmentation are obtained: area, contour, orientation, color.
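
The steps above can be sketched roughly as follows. This is a minimal pure-Python illustration, not the paper's implementation: the paper does not specify its threshold algorithm, so an iterative mean-of-class-means threshold is swapped in, and motion detection is reduced to counting changed binary pixels between two frames. It assumes tangible elements are darker than the white desk, and it leaves out the median filter, the contour, and the color. All function names are hypothetical.

```python
from collections import deque
import math


def luminance(rgb):
    """Approximate luminance (0-255) of an (R, G, B) pixel (BT.601 weights)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def auto_threshold(gray):
    """Step 2: iterate the threshold towards the midpoint of the two class means."""
    values = [v for row in gray for v in row]
    t = sum(values) / len(values)
    while True:
        low = [v for v in values if v <= t]
        high = [v for v in values if v > t]
        if not low or not high:
            return t
        new_t = (sum(low) / len(low) + sum(high) / len(high)) / 2
        if abs(new_t - t) < 0.5:
            return new_t
        t = new_t


def binarize(gray, t):
    """1 = tangible element (assumed darker than the desk), 0 = desk."""
    return [[1 if v < t else 0 for v in row] for row in gray]


def is_static(prev, curr, max_changed=0):
    """Step 3 stand-in: the scene is static if few binary pixels changed."""
    changed = sum(a != b for pr, cr in zip(prev, curr) for a, b in zip(pr, cr))
    return changed <= max_changed


def label_blobs(binary):
    """Step 4: 4-connected component labeling; returns one pixel list per blob."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] and not seen[y][x]:
                seen[y][x] = True
                queue, pixels = deque([(x, y)]), []
                while queue:
                    cx, cy = queue.popleft()
                    pixels.append((cx, cy))
                    for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                        if 0 <= nx < w and 0 <= ny < h and binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((nx, ny))
                blobs.append(pixels)
    return blobs


def describe(pixels):
    """Step 5: area, centroid, and orientation (degrees) from central moments."""
    area = len(pixels)
    cx = sum(x for x, _ in pixels) / area
    cy = sum(y for _, y in pixels) / area
    mu20 = sum((x - cx) ** 2 for x, _ in pixels)
    mu02 = sum((y - cy) ** 2 for _, y in pixels)
    mu11 = sum((x - cx) * (y - cy) for x, y in pixels)
    angle = 0.5 * math.degrees(math.atan2(2 * mu11, mu20 - mu02))
    return {"area": area, "centroid": (cx, cy), "orientation": angle}
```

A frame would be converted with `luminance`, thresholded and binarized, compared against the previous binary frame with `is_static`, and only then passed to `label_blobs` and `describe`. The orientation is measured in image coordinates, where the y axis points down.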


Monday, April 27, 2009

Perceptual User Interface (PUI)

Perceptual User Interface – Matthew Turk and George Robertson

PUIs are characterized by interaction techniques that combine an understanding of natural human capabilities (particularly communication, motor, cognitive, and perceptual skills) with computer I/O devices and machine perception and reasoning. They seek to make the user interface more natural and compelling by taking advantage of the ways in which people naturally interact with each other and with the world, both verbally and nonverbally.

A perceptive UI (as opposed to PUI) is one that adds human-like perceptual capabilities to the computer, for example, making the computer aware of what the user is saying or what the user’s face, body, and hands are doing. These interfaces provide input to the computer while leveraging human communication and motor skills.

Multimodal UI - We use multiple modalities when we engage in face-to-face communication, leading to more effective communication. Most work on multimodal UI has focused on computer input (for example, using speech together with pen-based gestures). Multimodal output uses different modalities, like visual display, audio, and tactile feedback, to engage human perceptual, cognitive, and communication skills in understanding what is being presented.

Multimedia UI uses perceptual and cognitive skills to interpret information presented to the user. Text, graphics, audio, and video are the typical media used. Multimedia research focuses on the media, while multimodal research focuses on the human perceptual channels. From that point of view, multimedia research is a subset of multimodal output research.

PUI integrates perceptive, multimodal, and multimedia interfaces to bring our human capabilities to bear on creating more natural and intuitive interfaces. Perceptual interfaces will enable multiple styles of interaction (such as speech only, speech and gesture, text and touch, vision, and synthetic sound), each of which may be appropriate in different circumstances, whether that be desktop apps, hands-free mobile use, or embedded household systems.

Perceptual User Interface – Matthew Turk and George Robertson

http://www.cs.ucsb.edu/~mturk/Papers/CACM2000.pdf

Leveraging Human Capabilities in Perceptual Interfaces - George Robertson, Microsoft Research

http://www.cs.ucsb.edu/conferences/PUI/PUIWorkshop98/Format.htm

Perceptual User Interface – Matthew Turk Microsoft Research

For some time, graphical user interfaces (GUIs) have been the dominant platform for human-computer interaction. However, as the way we use computers changes and computing becomes more pervasive and ubiquitous, GUIs will not easily support the range of interactions necessary to meet users’ needs. In order to accommodate a wider range of scenarios, tasks, users, and preferences, we need to move toward interfaces that are natural, intuitive, adaptive, and unobtrusive. The aim of a new focus in HCI, called Perceptual User Interfaces (PUIs), is to make human-computer interaction more like how people interact with each other and with the world.

In recent years, people have been discussing post-WIMP (where WIMP means windows, icons, menus, and pointing devices) interfaces and interaction techniques, including such pursuits as desktop 3D graphics, multimodal interfaces, tangible interfaces, virtual reality and augmented reality. These arise from a need to support natural, flexible, efficient, and powerfully expressive interaction techniques that are easy to learn and use [5]. In addition, as computing becomes more pervasive, we will need to support a plethora of form factors, from workstations to handheld devices to wearable computers to invisible, ubiquitous systems. The GUI style of interaction, especially with its reliance on the keyboard and mouse, will not scale to fit future HCI needs.

Era      Paradigm              Implementation
1950s    None                  Switches, wires, punched cards
1970s    Typewriter            Command-line interface
1980s    Desktop               GUI / WIMP
2000s    Natural interaction   PUI (multimodal input and output)

Fig. 1. The evolution of user interfaces

Perceptual user interfaces may be defined as:

Highly interactive, multimodal interfaces modeled after natural human-to-human interaction, with the goal of enabling people to interact with technology in a similar fashion to how they interact with each other and with the physical world.

Perceptual user interfaces should take advantage of human perceptual capabilities in order to present information and context in meaningful and natural ways. So we need to further understand human vision, auditory perception, conversational conventions, haptic capabilities, etc. Similarly, PUIs should take advantage of advances in computer vision, speech and sound recognition, machine learning, and natural language understanding, to understand and disambiguate natural human communication mechanisms.

Vision is clearly an important element of human-human communication. Although we can communicate without it, people still tend to spend endless hours travelling in order to meet face to face. Why? Because there is a richness of communication that cannot be matched using only voice or text. Body language such as facial expressions, silent nods and other gestures add personality, trust, and important information in human-to-human dialog. We expect it can do the same in human-computer interaction. Vision-based interfaces (VBI) are a subfield of perceptual user interfaces which concentrates on developing visual awareness of people.

VBI (and, in general, PUIs) can be categorized into two aspects: control and awareness. Control is explicit communication to the system, e.g., put that object there. Awareness, picking up information about the subject without an explicit attempt to communicate, gives context to an application (or to a PUI). The system may or may not change its behavior based on this information. For example, a system may decide to stop all unnecessary background processes when it sees me enter the room, not because of an explicit command I issued, but because of a change in its context. Current computer interfaces have little or no concept of awareness. While many research efforts emphasize VBI for control, it is likely that VBI for awareness will be more useful in the long run.

VBI projects:

  • Track a user’s head and use this for both awareness and control.
  • Recognize a set of gestures in order to control virtual instruments.
  • Track the subject’s body using an articulated kinematic model.

http://ilab.cs.ucsb.edu/projects/turk/Turk%20EC-NSF%20Workshop.pdf

Más allá de Internet: La Red Universal Digital – Fernando Sáez Vacas

I am attaching two links. The first is the book itself; although not all of its pages are available, I still found it interesting, especially chapter 7, which covers Anthropocentric Technology (graphical, natural-language, and perceptual user interfaces, and PUIs).

The other link has PDF excerpts from some of the book's chapters.

http://books.google.com.uy/books?id=RejZS5pXNL0C&dq=M%C3%81S+ALL%C3%81+DE+INTERNET:+LA+RED+UNIVERSAL+DIGITAL.+X-ECONOM%C3%8DA+Y+NUEVO+ENTORNO+TECNOSOCIAL&printsec=frontcover&source=bl&ots=X_49snATxD&sig=dRx5aoiOWMQg3mNV9HoIZMUFri8&hl=es&ei=MhrZSZSaA9fulQe5npXKDA&sa=X&oi=book_result&ct=result&resnum=1#PPP1,M1

http://www.gsi.dit.upm.es/~fsaez/intl/Red%20Universal%20Digital/index.html

Alt.interface (net.art wiki)

This short piece discusses where virtual interfaces are heading and describes PUIs as interfaces that allow transparent human expression, with mobility, making communication easier.

http://netart.iua.upf.edu/wiki/index.php/Alt.interface

A Simple Habituation Mechanism for Perceptual User Interfaces – O.Déniz, José Lorenzo Blanco, Martin Hernández.

Complex human-computer interfaces are more and more making use of high-level concepts extracted from sensory data for detecting aspects related to emotional states like fatigue, surprise, boredom, etc. Repetitive sensory patterns, for example, almost always will mean that the robot or agent will switch to a "bored" state, or that it will turn its attention to another entity. Novel structures in sensory data will normally cause surprise, increase of attention or even defensive reactions. The aim of this work is to introduce a simple mechanism for detecting such repetitive patterns in sensory data. Basically, sensory data can present two types of monotonous patterns: constant frequency (be it zero or greater than zero, be it a unique frequency or a wide spectrum) and repetitive frequency spectrum changes. Both types are considered by the proposed method in a conceptually and computationally simple framework. Experiments carried out using sensory data extracted both from the visual and auditory domains show the validity of the approach.

http://cabrillo.lsi.uned.es:8080/aepia/Uploads/23/51.pdf

Perceptual user interface for human-computer interaction - Weidong Geng, Vladimir Elistratov, Marina Kolesnik, Thomas Kulessa, Wolfgang Strauss

In this paper we present our effort towards perceptual user interface for main interaction tasks, such as navigation/travel, selection/picking and personal data access, in e-commerce environment. A set of intuitive navigation devices, including Treadmill, Virtual Balance and Cyberwheel, are described and evaluated for web/public space. Vision-based pointing is explored for free-hand selection/picking, and wireless personal access system is developed for unobtrusive transmission of personal information from hand-held devices. Furthermore, we implement an integrated interaction platform, which could couple these devices together in complex scenarios such as portals for shopping.

http://netzspannung.org/cat/servlet/CatServlet?cmd=netzkollektor&subCommand=showEntry&entryId=41614&lang=de