Very Low Bitrate Video Communication: A Principal Component Analysis Approach
Umeå universitet, Teknisk-naturvetenskapliga fakulteten, Institutionen för tillämpad fysik och elektronik.
2008 (English). Doctoral thesis, comprising papers (Other academic)
Abstract [en]

A large part of the information in a conversation comes from non-verbal cues such as facial expressions and body gestures. These cues are lost when we don't communicate face-to-face. But face-to-face communication doesn't have to happen in person. With video communication we can at least deliver information about the facial mimic and some gestures. This thesis is about video communication over distances: communication that can be made available over low-capacity networks because the bitrate it needs is low.

A visual image needs to have high quality and resolution to be semantically meaningful for communication. Delivering such video over networks requires that the video be compressed. The standard way to compress video images, used by H.264 and MPEG-4, is to divide the image into blocks and represent each block with mathematical waveforms, usually frequency features. These waveforms are quite good at representing any kind of video since they do not resemble anything; they are just frequency features. But because they are completely arbitrary, they cannot compress video enough to enable use over networks with limited capacity, such as GSM and GPRS.

Another issue is that such codecs have high complexity because of the redundancy removal with positional shift of the blocks. High complexity and bitrate mean that a device has to consume a large amount of energy for encoding, decoding and transmission of such video, and energy is a very important factor for battery-driven devices.

These drawbacks of standard video coding mean that video compressed with such codecs cannot be delivered anywhere and anytime. To resolve these issues we have developed a totally new type of video coding. Instead of using mathematical waveforms for representation, we use faces to represent faces. This makes the compression much more efficient than if waveforms are used, even though the faces are person-dependent.

A model of the changes in the face, the facial mimic, can be used to encode the images. The model consists of representative facial images, and we extract it with a powerful mathematical tool: principal component analysis (PCA). This coding has very low complexity since encoding and decoding consist only of multiplication operations. The faces are treated as single encoding entities and all operations are performed on full images; no block processing is needed. These features mean that PCA coding can deliver high quality video at very low bitrates with low complexity for encoding and decoding.
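The full-frame PCA coding described above can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the synthetic 8 × 8 "frames" and the use of NumPy's SVD to obtain the principal components are all assumptions made for illustration. It shows only that, once the model (mean image plus components) exists, encoding and decoding each reduce to one matrix-vector product.

```python
import numpy as np

def fit_pca_model(frames, num_components):
    """Build the model from training frames (one flattened frame per row)."""
    mean = frames.mean(axis=0)
    centered = frames - mean
    # Rows of vt are the principal components ("eigenimages") of the frames.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:num_components]

def encode(frame, mean, basis):
    """Project a full frame onto the model: one matrix-vector product."""
    return basis @ (frame - mean)

def decode(coeffs, mean, basis):
    """Rebuild a full frame from its coefficients: one matrix-vector product."""
    return mean + basis.T @ coeffs

# Synthetic stand-in for facial video (assumption): 50 frames of 8x8 pixels
# whose variation is spanned by 3 underlying patterns.
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 3))
patterns = rng.normal(size=(3, 64))
frames = 128.0 + latent @ patterns

mean, basis = fit_pca_model(frames, num_components=3)
coeffs = encode(frames[0], mean, basis)   # 3 numbers instead of 64 pixels
recon = decode(coeffs, mean, basis)
```

In this toy setup three coefficients reproduce the frame exactly because the synthetic data has only three degrees of freedom; real facial video would be reconstructed approximately.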

With the use of asymmetrical PCA (aPCA) it is possible to use only semantically important areas for encoding while decoding full frames or a different part of the frames.
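One way to picture the asymmetry is sketched below, under stated assumptions: the components used for encoding are computed from a foreground region only, while a second set of full-frame "pseudo" components is driven by the same coefficients for decoding. The names `fit_apca`, `encode_fg`, `decode_full` and `fg_mask`, and the synthetic data, are hypothetical; the abstract does not specify the construction, so this should be read as an illustration rather than the author's algorithm.

```python
import numpy as np

def fit_apca(frames, fg_mask, k):
    """Fit foreground components and matching full-frame pseudo components."""
    mean_full = frames.mean(axis=0)
    centered_full = frames - mean_full
    centered_fg = centered_full[:, fg_mask]
    u, s, vt = np.linalg.svd(centered_fg, full_matrices=False)
    fg_basis = vt[:k]  # components of the foreground region only
    # Full-frame images that follow the foreground coefficients.
    pseudo_basis = (u[:, :k].T @ centered_full) / s[:k, None]
    return mean_full, fg_basis, pseudo_basis

def encode_fg(frame, mean_full, fg_basis, fg_mask):
    """Encode using only the foreground pixels of the frame."""
    return fg_basis @ (frame[fg_mask] - mean_full[fg_mask])

def decode_full(coeffs, mean_full, pseudo_basis):
    """Decode a full frame from foreground-derived coefficients."""
    return mean_full + coeffs @ pseudo_basis

# Synthetic data (assumption): 50 frames of 64 pixels, 16 of them foreground.
rng = np.random.default_rng(1)
latent = rng.normal(size=(50, 3))
patterns = rng.normal(size=(3, 64))
frames = 128.0 + latent @ patterns
fg_mask = np.zeros(64, dtype=bool)
fg_mask[24:40] = True

mean_full, fg_basis, pseudo_basis = fit_apca(frames, fg_mask, k=3)
coeffs = encode_fg(frames[0], mean_full, fg_basis, fg_mask)
recon = decode_full(coeffs, mean_full, pseudo_basis)
```

The encoder here touches 16 pixels per frame instead of 64, while the decoder still returns all 64; with real video the non-foreground areas would be reconstructed less accurately.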

We show that a codec based on PCA can compress facial video to a bitrate below 5 kbps and still provide high quality. This bitrate can be delivered on a GSM network. We also show the possibility of extending PCA coding to encoding of high definition video.
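A rough back-of-the-envelope check of the bitrate claim can be written as follows. The 10 components and 8 bits per coefficient echo figures mentioned in the abstract of paper 2 in this record; the frame rate of 25 frames per second is an assumption, and the calculation ignores entropy coding, audio and the cost of transmitting the model itself.

```python
def raw_coefficient_bitrate(num_components, bits_per_coefficient, frames_per_second):
    """Bits per second needed to send one coefficient vector per frame."""
    return num_components * bits_per_coefficient * frames_per_second

# 10 components and 8 bits follow paper 2's abstract; 25 fps is an assumption.
bitrate_bps = raw_coefficient_bitrate(10, 8, 25)
bitrate_kbps = bitrate_bps / 1000
```

Under these assumptions the coefficient stream alone needs 2 kbps, which is consistent with the "below 5 kbps" figure but does not by itself confirm it.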

Place, publisher, year, edition, pages
Umeå: Tillämpad fysik och elektronik, 2008. p. 90
Series
Digital Media Lab, ISSN 1652-6295 ; 11
Keywords [en]
Video compression, Very low bitrate, Principal component analysis, Complexity, Semantically important areas, Wearable video
Identifiers
URN: urn:nbn:se:umu:diva-1808
ISBN: 978-91-7264-644-5 (print)
OAI: oai:DiVA.org:umu-1808
DiVA, id: diva2:142053
Public defence
2008-09-26, N200, Naturvetarhuset, Umeå universitet, Umeå, 10:00 (English)
Available from: 2008-09-05 Created: 2008-09-05 Last updated: 2018-06-09 Bibliographically approved
List of papers
1. Full-frame video coding for facial video sequences based on principal component analysis
2005 (English). In: Proceedings of Irish Machine Vision and Image Processing Conference, pp. 25-32. Article in journal (Refereed). Published
Identifiers
urn:nbn:se:umu:diva-3375 (URN)
Available from: 2008-09-05 Created: 2008-09-05 Last updated: 2018-06-09 Bibliographically approved
2. Representation bound for human facial mimic with the aid of principal component analysis
2010 (English). In: International Journal of Image and Graphics, ISSN 0219-4678, Vol. 10, no. 3, pp. 343-363. Article in journal (Refereed). Published
Abstract [en]

In this paper, we examine how much information is needed to represent the facial mimic, based on Paul Ekman's assumption that the facial mimic can be represented with a few basic emotions. Principal component analysis is used to compact the important facial expressions. Theoretical bounds for facial mimic representation are presented both for using a certain number of principal components and a certain number of bits. When 10 principal components are used to reconstruct color image video at a resolution of 240 × 176 pixels the representation bound is on average 36.8 dB, measured in peak signal-to-noise ratio. Practical confirmation of the theoretical bounds is demonstrated. Quantization of projection coefficients affects the representation, but a quantization with approximately 7-8 bits is found to match an exact representation, measured in mean square error.
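The two measurements mentioned in this abstract, peak signal-to-noise ratio and quantization of projection coefficients to a fixed number of bits, can be sketched as below. The function names, the uniform quantizer and the coefficient range `[lo, hi]` are assumptions for illustration; the abstract does not state which quantizer the paper uses.

```python
import math

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)

def quantize(value, lo, hi, bits):
    """Map a coefficient in [lo, hi] to an integer index using `bits` bits."""
    levels = (1 << bits) - 1
    clipped = min(max(value, lo), hi)
    return round((clipped - lo) / (hi - lo) * levels)

def dequantize(index, lo, hi, bits):
    """Map an integer index back to a coefficient value."""
    levels = (1 << bits) - 1
    return lo + index / levels * (hi - lo)

# Hypothetical coefficient and range, quantized with 8 bits.
index = quantize(0.3, -1.0, 1.0, 8)
restored = dequantize(index, -1.0, 1.0, 8)
```

With a uniform quantizer the round-trip error is at most half a step, `(hi - lo) / (2 * (2**bits - 1))`, which shrinks by roughly half for each extra bit.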

Place, publisher, year, edition, pages
World Scientific Publishing Company, 2010
Keywords
Distortion bound, rate-distortion bound, facial mimic, basic emotions, principal component analysis, PCA
Identifiers
urn:nbn:se:umu:diva-5428 (URN)
10.1142/S0219467810003810 (DOI)
Available from: 2011-09-19 Created: 2006-10-13 Last updated: 2018-06-09 Bibliographically approved
3. Eigenspace compression for very low bitrate transmission of facial video
2007 (English). In: IASTED International conference on Signal Processing, Pattern Recognition and Applications. Article in journal (Refereed). Published
Identifiers
urn:nbn:se:umu:diva-3377 (URN)
Available from: 2008-09-05 Created: 2008-09-05 Last updated: 2018-06-09 Bibliographically approved
4. Ultra low bit-rate video communication: video coding = pattern recognition
2006 (English). In: Proceedings of the 25th Picture Coding Symposium, 2006. Conference paper, Published paper (Refereed)
Identifiers
urn:nbn:se:umu:diva-22982 (URN)
Available from: 2009-05-25 Created: 2009-05-25 Last updated: 2018-06-08 Bibliographically approved
5. Asymmetrical Principal Component Analysis: Theory and Its Applications to Facial Video Coding
2011 (English). In: Effective Video Coding for Multimedia Applications / [ed] Sudhakar Radhakrishnan, InTech Open, 2011, pp. 95-110. Chapter in book, part of anthology (Refereed)
Place, publisher, year, edition, pages
InTech Open, 2011
Identifiers
urn:nbn:se:umu:diva-3379 (URN)
Available from: 2008-09-05 Created: 2008-09-05 Last updated: 2018-06-09 Bibliographically approved
6. Side view driven facial video coding
(English). Article in journal (Refereed). Submitted
Identifiers
urn:nbn:se:umu:diva-3380 (URN)
Available from: 2008-09-05 Created: 2008-09-05 Last updated: 2018-06-09 Bibliographically approved
7. High definition wearable video communication
2009 (English). In: Image analysis: 16th Scandinavian Conference, SCIA 2009, Oslo, Norway, June 15-18, 2009. Proceedings / [ed] Arnt-Børre Salberg, Jon Yngve Hardeberg, Robert Jenssen, Heidelberg: Springer Berlin, 2009, pp. 500-512. Conference paper, Published paper (Refereed)
Abstract [en]

High definition (HD) video can provide video communication which is as crisp and sharp as face-to-face communication. Wearable video equipment also provides the user with mobility: the freedom to move. HD video requires high bandwidth and yields high encoding and decoding complexity when encoding based on DCT and motion estimation is used. We propose a solution that can drastically lower the bandwidth and complexity for video transmission. Asymmetrical principal component analysis can initially encode HD video into bitrates which are low considering the type of video (< 300 kbps), and after a startup phase the bitrate can be reduced to less than 5 kbps. The complexity for encoding and decoding of this video is very low, which will save battery power for mobile devices. All of this comes only at the cost of lower quality in frame areas which aren't considered semantically important.

Place, publisher, year, edition, pages
Heidelberg: Springer Berlin, 2009
Series
Lecture Notes in Computer Science, ISSN 1611-3349 ; 5575
Identifiers
urn:nbn:se:umu:diva-35809 (URN)
10.1007/978-3-642-02230-2_51 (DOI)
978-3-642-02229-6 (ISBN)
Conference
Scandinavian Conference on Image Analysis, SCIA
Available from: 2010-09-06 Created: 2010-09-06 Last updated: 2019-07-05 Bibliographically approved

Author
Söderström, Ulrik
