How to use manual labelers in evaluation of lip analysis systems?
Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics. (Digital Media Lab)
Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics.
Umeå University, Faculty of Science and Technology, Department of Applied Physics and Electronics. (Digital Media Lab)
2009 (English). In: Visual speech recognition: Lip segmentation and mapping / [ed] Shilin W & Alan Liew, USA: IGI Global, 2009, 239-259 p. Chapter in book (Other academic)
Abstract [en]

The purpose of this chapter is not to describe any lip analysis algorithm but to discuss some of the issues involved in evaluating and calibrating lip features labeled by human operators. We question a common practice in the field: using manual lip labels directly as the ground truth for evaluating lip analysis algorithms. Our empirical results, obtained with an Expectation-Maximization procedure, show that subjective noise among manual labelers can be significant enough to affect the measured extraction performance of both humans and algorithms. To train and evaluate a lip analysis system, one can simultaneously measure the performance of the human operators and infer the “ground truth” from their manual labels.
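The idea of estimating labeler quality and the "ground truth" at the same time can be illustrated with a minimal sketch. This is not the chapter's actual procedure: it assumes a simple Gaussian model in which each labeled lip feature is a scalar, the true value is drawn from a common normal prior, and each labeler adds independent noise with their own variance. The function name `em_ground_truth` and all parameters are hypothetical.

```python
import numpy as np

def em_ground_truth(labels, n_iter=200, eps=1e-9):
    """Jointly estimate true values and per-labeler noise variances.

    Assumed model (an illustration only):
        z_i ~ N(mu, tau2)                 true value of item i
        y_ij | z_i ~ N(z_i, sigma2_j)     label given by labeler j
    labels: array of shape (n_items, n_labelers), n_labelers >= 2.
    """
    y = np.asarray(labels, dtype=float)
    n_items, n_labelers = y.shape

    # Crude starting values taken from the raw labels.
    mu = y.mean()
    tau2 = y.mean(axis=1).var() + eps
    sigma2 = np.full(n_labelers, y.var(axis=1, ddof=1).mean() + eps)

    for _ in range(n_iter):
        # E-step: Gaussian posterior of each true value given its labels.
        post_var = 1.0 / (1.0 / tau2 + np.sum(1.0 / sigma2))
        post_mean = post_var * (mu / tau2 + y @ (1.0 / sigma2))

        # M-step: update the prior and each labeler's noise variance
        # using the expected squared deviations under the posterior.
        mu = post_mean.mean()
        tau2 = np.mean((post_mean - mu) ** 2) + post_var + eps
        sigma2 = np.mean((y - post_mean[:, None]) ** 2, axis=0) + post_var + eps

    return post_mean, sigma2
```

Under these assumptions the returned `post_mean` is a precision-weighted consensus that trusts consistent labelers more than noisy ones, and `sigma2` gives one noise estimate per labeler. An algorithm's output could then be scored against the inferred consensus rather than against any single labeler.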

Place, publisher, year, edition, pages
USA: IGI Global, 2009. 239-259 p.
Keyword [en]
Lip Analysis, Expectation Maximization Algorithm, Performance Evaluation
National Category
Physical Sciences; Engineering and Technology
Research subject
Signal Processing; Computerized Image Analysis; Computing Science
URN: urn:nbn:se:umu:diva-20300
ISBN: 978-160566186-5
OAI: diva2:208420
Available from: 2009-03-18 Created: 2009-03-18 Last updated: 2012-03-20
In thesis
1. Expressing emotions through vibration for perception and control
2010 (English). Doctoral thesis, comprehensive summary (Other academic)
Alternative title [en]
Expressing emotions through vibration
Abstract [en]

This thesis addresses a challenging problem: “how to let the visually impaired ‘see’ others’ emotions”. We human beings depend heavily on facial expressions to express ourselves. A smile shows that the person you are talking to is pleased, amused, relieved, etc. People use emotional information from facial expressions to switch between conversation topics and to determine the attitudes of individuals. Missing the emotional information carried by facial expressions and head gestures makes it extremely difficult for the visually impaired to interact with others at social events. To enhance the ability of the visually impaired to interact socially, in this thesis we have worked on the scientific topic of ‘expressing human emotions through vibrotactile patterns’.

It is quite challenging to deliver human emotions through touch, since our touch channel is very limited. We first investigated how to render emotions through a vibrator. We developed a real-time “lipless” tracking system to extract dynamic emotions from the mouth and employed mobile phones as a platform for the visually impaired to perceive primary emotion types. Later, we extended the system to render more general dynamic media signals, for example rendering live football games as vibration on a mobile phone to improve mobile users’ communication and entertainment experience. To display more natural emotions (i.e. emotion type plus emotion intensity), we developed technology that enables the visually impaired to interpret human emotions directly. This was achieved by using machine vision techniques and a vibrotactile display. The display comprises a ‘vibration actuators matrix’ mounted on the back of a chair, and the actuators are activated sequentially to provide dynamic emotional information. The research focus has been on finding a global, analytical, and semantic representation of facial expressions to replace the state-of-the-art Facial Action Coding System (FACS) approach. We proposed using the manifold of facial expressions to characterize dynamic emotions. The basic emotional expressions with increasing intensity become curves on the manifold that extend from the center. Blends of emotions lie between those curves and can be defined analytically by the positions of the main curves. The manifold is the “Braille Code” of emotions.
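The geometry described above (basic emotions as curves leaving a neutral center, intensity as distance along the curve, blends lying in between) can be pictured with a toy sketch. Everything here is an assumption made for illustration: the curves are simplified to straight rays in a 2D plane, the six emotion names and their evenly spaced angles are hypothetical, and the 3x3 actuator grid is an invented size, since the abstract does not give the matrix dimensions or the embedding used.

```python
import math

# Hypothetical layout: one direction (ray) per basic emotion around a neutral center.
BASIC_EMOTIONS = ["happiness", "surprise", "fear", "anger", "disgust", "sadness"]
ANGLES = {name: 2 * math.pi * k / len(BASIC_EMOTIONS)
          for k, name in enumerate(BASIC_EMOTIONS)}

def emotion_point(name, intensity):
    """Return the 2D point for a basic emotion at an intensity in [0, 1]."""
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must lie in [0, 1]")
    angle = ANGLES[name]
    return (intensity * math.cos(angle), intensity * math.sin(angle))

def blend_point(name_a, name_b, weight, intensity):
    """Return a point between two emotion rays (weight 0 gives a, 1 gives b)."""
    ax, ay = emotion_point(name_a, intensity)
    bx, by = emotion_point(name_b, intensity)
    return ((1 - weight) * ax + weight * bx, (1 - weight) * ay + weight * by)

def to_actuator(point, rows=3, cols=3):
    """Map a point in [-1, 1]^2 to a (row, col) cell of an assumed actuator grid."""
    x, y = point
    col = min(cols - 1, int((x + 1) / 2 * cols))
    row = min(rows - 1, int((1 - y) / 2 * rows))
    return row, col
```

In this simplified picture, an expression growing in intensity traces a path of cells outward from the center of the grid, which is one way a sequence of actuator activations could convey both emotion type and intensity.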

The developed methodology and technology have been extended to build assistive wheelchair systems for a specific group of disabled people: cerebral palsy or stroke patients who lack fine motor control and cannot operate a wheelchair by conventional means such as a joystick or chin stick. The solution is to extract the manifold of head or tongue gestures for controlling the wheelchair. The manifold is rendered by a 2D vibration array to provide the wheelchair user with action information from gestures and with system status information, which is very important for the usability of such an assistive system. The current research work provides not only a foundation for vibrotactile rendering systems based on object localization but also a concrete step toward a new dimension of human-machine interaction.

Place, publisher, year, edition, pages
Umeå: Umeå universitet, Institutionen för tillämpad fysik och elektronik, 2010. 159 p.
Digital Media Lab, ISSN 1652-6295; 12
Keyword [en]
Multimodal Signal Processing, Mobile Communication, Vibrotactile Rendering, Locally Linear Embedding, Object Detection, Human Facial Expression Analysis, Lip Tracking, Object Tracking, HCI, Expectation-Maximization Algorithm, Lipless Tracking, Image Analysis, Visually Impaired
National Category
Signal Processing; Computer Vision and Robotics (Autonomous Systems); Computer Science; Telecommunications; Information Science
Research subject
Computerized Image Analysis; Computing Science; Electronics; Systems Analysis
URN: urn:nbn:se:umu:diva-32990
ISBN: 978-91-7264-978-1
Public defence
2010-04-28, Naturvetarhuset, N300, Umeå universitet, Umeå, Sweden, 09:00 (English)
Taktil Video
Available from: 2010-04-07 Created: 2010-04-06 Last updated: 2010-04-20 Bibliographically approved

By author/editor
ur Réhman, Shafiq; Liu, Li; Li, Haibo