Very low bitrate facial video coding: based on principal component analysis
Umeå University, Faculty of Science and Technology, Applied Physics and Electronics.
2006 (English) Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

This thesis introduces a coding scheme for very low bitrate video coding based on principal component analysis. The principal information of a person's facial mimic can be extracted and stored in an Eigenspace. Entire video frames of this person's face can then be compressed with the Eigenspace to only a few projection coefficients. Principal component video coding encodes entire frames at once, and an increased frame size does not increase the bitrate needed for encoding, as it does in standard coding schemes. This enables video communication with high frame rate, spatial resolution and visual quality at very low bitrates. No standard video coding technique provides these four features at the same time.
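The projection step described above can be illustrated with a minimal sketch. It assumes the Eigenimages have already been computed and are orthonormal; the function names, the 4-pixel "frames" and the two Eigenimages are hypothetical toy values, not taken from the thesis.

```python
# Sketch: encode a frame as projection coefficients and reconstruct it.
# Assumption: the eigenimages are precomputed and orthonormal.
# All names and the 4-pixel toy data are hypothetical.

def encode(frame, mean, eigenimages):
    """Project the mean-centred frame onto each eigenimage."""
    centred = [p - m for p, m in zip(frame, mean)]
    return [sum(c * e for c, e in zip(centred, eig)) for eig in eigenimages]

def decode(coeffs, mean, eigenimages):
    """Rebuild the frame as the mean plus a weighted sum of eigenimages."""
    frame = list(mean)
    for a, eig in zip(coeffs, eigenimages):
        for i, e in enumerate(eig):
            frame[i] += a * e
    return frame

mean = [100.0, 100.0, 100.0, 100.0]
eigenimages = [
    [0.5, 0.5, 0.5, 0.5],    # unit length
    [0.5, -0.5, 0.5, -0.5],  # unit length, orthogonal to the first
]
frame = [110.0, 90.0, 130.0, 110.0]

coeffs = encode(frame, mean, eigenimages)           # one number per eigenimage
reconstructed = decode(coeffs, mean, eigenimages)   # lossy approximation
```

Only `coeffs` would need to be sent per frame once both sides hold the mean image and the Eigenimages. Its length equals the number of Eigenimages, not the number of pixels, which is why a larger frame size does not raise the per-frame bitrate.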

Theoretical bounds for using principal components to encode facial video sequences are presented. Two different theoretical bounds are derived: one that describes the minimal distortion when a certain number of Eigenimages are used, and one that describes the minimum distortion when a minimum number of bits are used.

We investigate how the reconstruction quality of the coding scheme is affected when the Eigenspace, mean image and coefficients are compressed to enable efficient transmission. The Eigenspace and mean image are compressed with JPEG compression, while the coefficients are quantized. We show that high compression ratios can be used with almost no decrease in reconstruction quality.
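As an illustration of coefficient quantization, the sketch below uses a uniform scalar quantizer. The abstract does not say which quantizer is used, so the uniform design, the coefficient range, the function names and the numbers in the bitrate estimate are all assumptions made for this example.

```python
# Sketch: uniform scalar quantization of one projection coefficient.
# Assumptions (not from the thesis): uniform quantizer, a known
# coefficient range [lo, hi], and the toy numbers below.

def quantize(value, lo, hi, bits):
    """Map a coefficient in [lo, hi] to an integer index of `bits` bits."""
    levels = (1 << bits) - 1
    clipped = min(max(value, lo), hi)
    return round((clipped - lo) / (hi - lo) * levels)

def dequantize(index, lo, hi, bits):
    """Map an integer index back to a coefficient value."""
    levels = (1 << bits) - 1
    return lo + index / levels * (hi - lo)

lo, hi, bits = -128.0, 127.0, 8
index = quantize(20.3, lo, hi, bits)
restored = dequantize(index, lo, hi, bits)
step = (hi - lo) / ((1 << bits) - 1)  # quantization step size

# Hypothetical bitrate estimate for the coefficients alone
# (excludes sending the Eigenspace and the mean image).
n_coeffs, fps = 10, 25
bits_per_second = n_coeffs * bits * fps
```

With in-range inputs the reconstruction error of such a quantizer is at most half a step, so adding bits shrinks the error until it becomes negligible next to the distortion already caused by truncating the Eigenspace.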

Different ways of re-using the Eigenspace for a person extracted from one video sequence to encode other video sequences are examined. The most important factor is the positioning of the facial features in the video frames.

Through a user test we find that it is extremely important to consider secondary workloads and how users make use of video when experimental setups are designed.

Place, publisher, year, edition, pages
Umeå: Tillämpad fysik och elektronik, 2006. 54 p.
Series
Digital Media Lab, ISSN 1652-6295 ; 7
Keyword [en]
Image processing, Video processing, Very low bitrate coding
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
URN: urn:nbn:se:umu:diva-895
ISBN: 91-7264-172-x
OAI: oai:DiVA.org:umu-895
DiVA: diva2:144942
Presentation
(English)
Available from: 2006-10-13 Created: 2006-10-13 Last updated: 2010-01-26 Bibliographically approved
List of papers
1. Full-frame video coding for facial video sequences based on principal component analysis
2005 (English) In: Proceedings of Irish Machine Vision and Image Processing Conference, 25-32 p. Article in journal (Refereed) Published
Identifiers
urn:nbn:se:umu:diva-3375 (URN)
Available from: 2008-09-05 Created: 2008-09-05 Last updated: 2016-02-23 Bibliographically approved
2. Representation bound for human facial mimic with the aid of principal component analysis
2010 (English) In: International Journal of Image and Graphics, ISSN 0219-4678, Vol. 10, no 3, 343-363 p. Article in journal (Refereed) Published
Abstract [en]

In this paper, we examine how much information is needed to represent the facial mimic, based on Paul Ekman's assumption that the facial mimic can be represented with a few basic emotions. Principal component analysis is used to compact the important facial expressions. Theoretical bounds for facial mimic representation are presented both for using a certain number of principal components and a certain number of bits. When 10 principal components are used to reconstruct color image video at a resolution of 240 × 176 pixels the representation bound is on average 36.8 dB, measured in peak signal-to-noise ratio. Practical confirmation of the theoretical bounds is demonstrated. Quantization of projection coefficients affects the representation, but a quantization with approximately 7-8 bits is found to match an exact representation, measured in mean square error.
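For reference, peak signal-to-noise ratio relates to mean square error as PSNR = 10 log10(peak² / MSE). The sketch below assumes 8-bit samples (peak value 255); the function names and the toy pixel values are hypothetical.

```python
import math

# Sketch: PSNR from mean square error, assuming 8-bit samples (peak = 255).
# Function names and the toy pixel values are hypothetical.

def mse(original, reconstructed):
    """Mean square error between two equally long pixel sequences."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)

def psnr(mse_value, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for a perfect match."""
    if mse_value == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse_value)

def mse_from_psnr(psnr_db, peak=255.0):
    """Invert the PSNR formula to get the corresponding mean square error."""
    return peak ** 2 / 10.0 ** (psnr_db / 10.0)

toy_mse = mse([110.0, 90.0, 130.0, 110.0], [120.0, 100.0, 120.0, 100.0])
toy_psnr = psnr(toy_mse)

# Mean square error implied by the 36.8 dB bound quoted above.
bound_mse = mse_from_psnr(36.8)
```

Under the 8-bit assumption, `mse_from_psnr(36.8)` gives the mean square error that corresponds to the 36.8 dB figure reported for 10 principal components.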

Place, publisher, year, edition, pages
World Scientific Publishing Company, 2010
Keyword
Distortion bound, rate-distortion bound, facial mimic, basic emotions, principal component analysis, PCA
National Category
Media Engineering
Identifiers
urn:nbn:se:umu:diva-5428 (URN)
10.1142/S0219467810003810 (DOI)
Available from: 2011-09-19 Created: 2006-10-13 Last updated: 2011-09-19 Bibliographically approved
3. Eigenspace compression for very low bitrate transmission of facial video
2007 (English) In: IASTED International conference on Signal Processing, Pattern Recognition and Applications. Article in journal (Refereed) Published
Identifiers
urn:nbn:se:umu:diva-3377 (URN)
Available from: 2008-09-05 Created: 2008-09-05 Last updated: 2010-01-26 Bibliographically approved
4. Re-use of Eigenspaces to encode new facial video sequences
(English) Manuscript (Other academic)
Identifiers
urn:nbn:se:umu:diva-5430 (URN)
Available from: 2006-10-13 Created: 2006-10-13 Last updated: 2016-02-23 Bibliographically approved
5. Internet card play with video conferencing
2006 (English) In: Proceedings SSBA 2006: Symposium on Image Analysis, Umeå, March 16-17, 2006 / [ed] Fredrik Georgsson, Niclas Börlin, Umeå: Umeå universitet, 2006, 93-96 p. Chapter in book (Other academic)
Abstract [en]

In an experiment, groups of four participants played “bluffstopp” — a card game based on deception — over the Internet while communicating through multicast video and audio. Higher frame rates lead to lower video quality ratings. The result is explained as an effect of increasing visual workload.

Place, publisher, year, edition, pages
Umeå: Umeå universitet, 2006
Series
Report / UMINF, ISSN 0348-0542 ; 11
Identifiers
urn:nbn:se:umu:diva-5480 (URN)
Available from: 2006-11-02 Created: 2006-11-02 Last updated: 2010-01-26 Bibliographically approved

Open Access in DiVA

fulltext (761 kB), 538 downloads
File information
File name: FULLTEXT01.pdf
File size: 761 kB
Checksum SHA-1:
0a33c03a4cb874396aadfad592f99d94ef02da66ed24d28101327fdf3a0e4ea866ef4537
Type: fulltext
Mimetype: application/pdf

Authority records

Söderström, Ulrik
