Kouma, Jean-Paul
Publications (5 of 5)
Kouma, J.-P. & Söderström, U. (2016). Wyner-Ziv Video Coding using Hadamard Transform and Deep Learning. International Journal of Advanced Computer Sciences and Applications, 7(7), 582-589
2016 (English). In: International Journal of Advanced Computer Sciences and Applications, ISSN 2158-107X, E-ISSN 2156-5570, Vol. 7, no 7, p. 582-589. Article in journal (Refereed), Published
Abstract [en]

Predictive schemes are the current standard in video coding. Unfortunately, they are poorly suited to lightweight devices such as mobile phones: their high encoding complexity is the bottleneck for the Quality of Experience (QoE) of a video conversation between mobile phones. A considerable amount of research has been directed at that bottleneck. Most of the proposed schemes use the so-called Wyner-Ziv video coding paradigm, with results still not comparable to those of predictive coding. This paper presents a novel approach to Wyner-Ziv video compression, based on Reinforcement Learning and the Hadamard transform. Our scheme shows very promising results.
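
The abstract does not say how the Hadamard transform is applied in the codec. As a minimal sketch of the transform itself (not the authors' implementation), the following computes an unnormalised fast Walsh-Hadamard transform using only additions and subtractions, which is what makes it attractive for low-complexity encoders. The function name `fwht` and the sample values are illustrative assumptions.

```python
def fwht(values):
    """Unnormalised fast Walsh-Hadamard transform in natural (Hadamard) order.

    Illustrative sketch only; the input length must be a power of two.
    """
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly stage: combine elements that are h positions apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


print(fwht([1, 0, 1, 0]))  # [2, 2, 0, 0]
```

Applying `fwht` twice returns the input scaled by its length, so the same routine serves as the inverse after dividing by `len(values)`.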

Keywords
Wyner-Ziv, video coding, rate distortion, Hadamard transform, Deep learning, Expectation maximization
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:umu:diva-126345 (URN); 10.14569/IJACSA.2016.070779 (DOI); 000381940300080 ()
Available from: 2016-10-25. Created: 2016-10-03. Last updated: 2018-06-09. Bibliographically approved
Abedan Kondori, F., Yousefi, S., Kouma, J.-P., Liu, L. & Li, H. (2015). Direct hand pose estimation for immersive gestural interaction. Pattern Recognition Letters, 66, 91-99
2015 (English). In: Pattern Recognition Letters, ISSN 0167-8655, E-ISSN 1872-7344, Vol. 66, p. 91-99. Article in journal (Refereed), Published
Abstract [en]

This paper presents a novel approach for performing intuitive gesture-based interaction using depth data acquired by Kinect. The main challenge in enabling immersive gestural interaction is dynamic gesture recognition. This problem can be formulated as a combination of two tasks: gesture recognition and gesture pose estimation. Incorporating a fast and robust pose estimation method would lessen the burden to a great extent. In this paper we propose a direct method for real-time hand pose estimation. Based on the range images, a new version of the optical flow constraint equation is derived, which can be used to estimate 3D hand motion directly, without imposing any other constraints. Extensive experiments show that the proposed approach runs in real time with high accuracy. As a proof of concept, we demonstrate the system performance in 3D object manipulation on two different setups: desktop computing and a mobile platform. This shows the system's capability to accommodate different interaction procedures. In addition, a user study is conducted to evaluate learnability, user experience and interaction quality in 3D gestural interaction in comparison to 2D touchscreen interaction.

Keywords
Immersive gestural interaction, Dynamic gesture recognition, Hand pose estimation
National Category
Signal Processing
Identifiers
urn:nbn:se:umu:diva-86748 (URN); 10.1016/j.patrec.2015.03.013 (DOI); 000362271100011 ()
Available from: 2014-03-06. Created: 2014-03-06. Last updated: 2018-06-08. Bibliographically approved
Kouma, J.-P. & Li, H. (2010). Large-scale face images retrieval: a transform coding approach. Paper presented at 15th IEEE Symposium on Computers and Communications (IEEE ISCC’10).
2010 (English). Conference paper, Published paper (Other academic)
Abstract [en]

Huge efforts have been devoted to face recognition technology, with remarkable results. Such advances will make it possible to build a new generation of search engine: fetching photos of persons. It is a real computing challenge to find a person in a very large or extremely large database, which might hold face images of millions or hundreds of millions of people. A candidate solution is to use partial information (a signature) about all the face images for search, making the retrieval speed approximately proportional to the size of a signature image. In this paper we investigate a totally new way to compress the signature images, based on the observation that the face signature images and the query images are highly correlated if they are from the same individual. The face signature image can be greatly compressed (an improvement of one or two orders of magnitude) by using knowledge of the query images. We can expect the new compression algorithm to speed up face search 10 to 100 times. The challenge is that query images are not available when we compress their signature image. Our approach is to transform the face search problem into the so-called "Wyner-Ziv Coding" problem, which could give the same compression efficiency even if the query images are not available until we decompress signature images. A practical compression scheme based on LDPC codes is developed to compress and retrieve face signature images.
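
The idea of compressing against side information that is only available at the decoder can be illustrated with a toy syndrome code. The sketch below is not the scheme from the paper: it swaps the LDPC code for the much smaller Hamming(7,4) code, works on raw bits rather than quantised face signatures, and assumes the decoder's side information `y` differs from the source `x` in at most one bit. The names `syndrome` and `decode_with_side_info` are hypothetical.

```python
# Hamming(7,4) parity-check matrix: column j (1-based) holds j in binary, LSB first.
H = [[(j >> k) & 1 for j in range(1, 8)] for k in range(3)]


def syndrome(bits):
    """Encoder: compress 7 source bits to a 3-bit syndrome, without seeing y."""
    return [sum(h * b for h, b in zip(row, bits)) % 2 for row in H]


def decode_with_side_info(s, y):
    """Decoder: recover x from its syndrome s and correlated side information y.

    Assumes x and y differ in at most one position (toy correlation model).
    """
    # H*x XOR H*y = H*(x XOR y), which encodes the index of the differing bit.
    diff = [a ^ b for a, b in zip(s, syndrome(y))]
    pos = diff[0] + 2 * diff[1] + 4 * diff[2]
    x_hat = list(y)
    if pos:
        x_hat[pos - 1] ^= 1
    return x_hat


x = [1, 0, 1, 1, 0, 0, 1]   # source bits (e.g. part of a stored signature)
y = [1, 0, 1, 0, 0, 0, 1]   # side information, seen only by the decoder
print(decode_with_side_info(syndrome(x), y) == x)  # True
```

Here 7 bits are stored as 3, yet the decoder recovers them exactly because the side information resolves the remaining ambiguity. Practical Wyner-Ziv schemes typically combine quantisation with a stronger channel code used in this way.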

National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:umu:diva-36772 (URN)
Conference
15th IEEE Symposium on Computers and Communications (IEEE ISCC’10)
Available from: 2011-04-01. Created: 2010-10-11. Last updated: 2018-06-08. Bibliographically approved
Karlsson, J., Kouma, J.-P., Li, H., Wark, T. & Corke, P. (2009). Demonstration of Wyner-Ziv video compression in a wireless camera sensor network. In: The 9th Scandinavian Workshop on Wireless Ad-hoc & Sensor Networks (ADHOC'09), 4-5 May 2009, Uppsala, Sweden. Paper presented at ADHOC'09, 4-5 May 2009, Uppsala, Sweden. Uppsala
2009 (English). In: The 9th Scandinavian Workshop on Wireless Ad-hoc & Sensor Networks (ADHOC'09), 4-5 May 2009, Uppsala, Sweden. Uppsala, 2009. Conference paper, Published paper (Refereed)
Abstract [en]

Sending video over wireless sensor networks is a challenging task. The encoding and transmission of video is very resource-hungry, and the sensor nodes have very limited resources in terms of communication bandwidth, memory, computation and typically 5-10 times. In this paper we present a practical implementation of a Wyner-Ziv video codec in which the reversed asymmetry in complexity between encoder and decoder can be achieved. We also present the sensor network platform used in this demonstration, known as Fleck™-3, as well as two different co-processor daughterboards for image processing. The daughterboards are then compared in terms of speed and energy consumption.

Place, publisher, year, edition, pages
Uppsala, 2009
Identifiers
urn:nbn:se:umu:diva-38022 (URN)
Conference
ADHOC'09, 4-5 May 2009, Uppsala, Sweden.
Available from: 2010-11-23. Created: 2010-11-22. Last updated: 2018-06-08. Bibliographically approved
Kouma, J.-P. & Li, H. (2009). Large-scale face images retrieval: a distribution coding approach. In: ICUMT 2009 - International Conference on Ultra Modern Telecommunications. Paper presented at ICUMT 2009 - International Conference on Ultra Modern Telecommunications, St Petersburg, Russia, 12-14 October.
2009 (English). In: ICUMT 2009 - International Conference on Ultra Modern Telecommunications, 2009. Conference paper, Published paper (Other academic)
Abstract [en]

Great progress has recently been made in face recognition technology. Such advances will make it possible to build a new generation of search engine: Face Google, searching from photos of persons. It is very challenging to find a person in a very large or extremely large database, which might hold face images of millions or hundreds of millions of people. The indexing technology used in most commercial search engines, such as Google, is very efficient for text-based search; unfortunately, it is no longer useful for image search. A solution is to use partial information (a signature) about all the face images for search. The retrieval speed is approximately proportional to the size of a signature image. In this paper we study a totally new way to compress the signature images, based on the observation that the face signature images and the query images are highly correlated if they are from the same individual. The face signature image can be greatly compressed (an improvement of one or two orders of magnitude) by using knowledge of the query images. We can expect the new compression algorithm to speed up face search 10 to 100 times. The challenge is that query images are not available when we compress their signature image. Our approach is to transform the face search problem into the so-called "Wyner-Ziv Coding" problem, which could give the same compression efficiency even if the query images are not available until we decompress signature images. A practical compression scheme based on LDPC codes is developed to compress face signature images.

National Category
Telecommunications
Identifiers
urn:nbn:se:umu:diva-36769 (URN)
Conference
ICUMT 2009 - International Conference on Ultra Modern Telecommunications, St Petersburg, Russia, 12-14 October
Available from: 2011-04-01. Created: 2010-10-11. Last updated: 2018-06-08. Bibliographically approved