Cognition reversed: Robot learning from demonstration
Umeå University, Faculty of Science and Technology, Department of Computing Science.
2009 (English). Licentiate thesis, comprehensive summary (Other academic).
Abstract [en]

The work presented in this thesis investigates techniques for learning from demonstration (LFD). LFD is a well-established approach to robot learning, where a teacher demonstrates a behavior to a robot pupil. This thesis focuses on LFD where a human teacher demonstrates a behavior by controlling the robot via teleoperation. After the demonstration, the robot should be able to execute the demonstrated behavior under varying conditions.

Several views on representation, recognition and learning of robot behavior are presented and discussed from a cognitive and computational perspective. LFD-related concepts such as behavior, goal, demonstration, and repetition are defined and analyzed, with focus on how bias is introduced by the use of behavior primitives. This analysis results in a formalism where LFD is described as transitions between information spaces. Assuming that the behavior recognition problem is partly solved, ways to deal with remaining ambiguities in the interpretation of a demonstration are proposed.

A total of five algorithms for behavior recognition are proposed and evaluated, including the dynamic temporal difference algorithm Predictive Sequence Learning (PSL). PSL is model-free in the sense that it makes few assumptions about what is to be learned. One strength of PSL is that it can be used both for robot control and for recognition of behavior. While many methods for behavior recognition are concerned with identifying invariants within a set of demonstrations, PSL takes a different approach by using purely predictive measures. This may be one way to reduce the need for bias in learning. PSL is, in its current form, subject to combinatorial explosion as the input space grows, which makes it necessary to introduce some higher-level coordination for learning of complex behaviors in real-world robots.
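
To make the predictive approach concrete, the following is a minimal sketch (Python, not the thesis implementation) of a sequence learner that predicts the next sensor-motor event from the longest recently observed context. The class and parameter names (SequencePredictor, max_context) are illustrative only, and the way PSL actually creates and weights hypotheses is described in the thesis, not here.

from collections import defaultdict

class SequencePredictor:
    """Toy predictive sequence learner: counts which event followed each
    recent-event context and predicts using the longest context seen before."""

    def __init__(self, max_context=5):
        self.max_context = max_context
        # counts[context][next_event] = how often next_event followed context
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, events):
        """Record, for every position, which event followed each context of
        length 1..max_context ending at that position."""
        for t in range(1, len(events)):
            for n in range(1, min(self.max_context, t) + 1):
                context = tuple(events[t - n:t])
                self.counts[context][events[t]] += 1

    def predict(self, history):
        """Return the most frequent follower of the longest matching context,
        or None if no stored context matches the recent history."""
        for n in range(min(self.max_context, len(history)), 0, -1):
            context = tuple(history[-n:])
            if context in self.counts:
                followers = self.counts[context]
                return max(followers, key=followers.get)
        return None

Because contexts of every length up to max_context are stored, the table grows rapidly with the size of the input space, which is one way to picture the combinatorial explosion mentioned above.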

The thesis also gives a broad introduction to computational models of the human brain, where a tight coupling between perception and action plays a central role. With a focus on the generation of bias, typical features of existing attempts to explain humans' and other animals' ability to learn are presented and analyzed, from both a neurological and an information-theoretic perspective. Based on this analysis, four requirements for implementing general learning ability in robots are proposed. These requirements provide guidance on how a coordinating structure around PSL and similar algorithms should be implemented in a model-free way.

Abstract [sv]

The work presented in this thesis investigates techniques for teaching robots from demonstrations (LFD). LFD is a well-established technique in which a teacher shows the robot what to do. This thesis focuses on LFD where a human teleoperates the robot, which in turn interprets the demonstration so that it can repeat the behavior at a later time, even when the environment has changed. Several perspectives on representation, recognition, and learning of behavior are presented and discussed from a cognitive-science and computer-science perspective. LFD-related concepts such as behavior, goal, demonstration, and repetition are defined and analyzed, with a focus on how prior knowledge can be introduced through behavior primitives. The analysis results in a formalism in which LFD is described as transitions between information spaces. In terms of this formalism, ways are also proposed to handle ambiguities remaining after a demonstration has been interpreted through recognition of behavior primitives.

Five algorithms for behavior recognition are presented and evaluated, among them the algorithm Predictive Sequence Learning (PSL). PSL is model-free in the sense that it introduces few assumptions about the learning situation. PSL can serve as an algorithm for both control and recognition of behavior. Unlike most behavior recognition techniques, PSL does not rely on similarities in behavior across demonstrations. Instead, PSL exploits predictive measures, which can reduce the need for domain knowledge in learning. One problem with the algorithm, however, is that it suffers from combinatorial explosion as the input space grows, which means that some form of higher-level coordination is needed for learning complex behaviors.

The thesis also gives an introduction to computational models of the brain in which a strong coupling between perception and action plays a central role. Typical features of these models are presented and analyzed from a neurological and an information-theoretic perspective. This analysis results in four requirements for implementing general learning ability in robots. These requirements provide guidance on how a coordinating structure for PSL and similar algorithms could be implemented in a model-free way.

Place, publisher, year, edition, pages
Umeå: Umeå universitet, Institutionen för datavetenskap, 2009. 138 p.
Series
Report / UMINF, ISSN 0348-0542 ; 09:20
National Category
Computer Science; Human Computer Interaction
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:umu:diva-32494
ISBN: 978-91-7264-925-5 (print)
OAI: oai:DiVA.org:umu-32494
DiVA: diva2:303591
Presentation
2009-12-16, Naturvetarhuset, N300, Umeå Universitet, Umeå, 13:00 (Swedish)
Available from: 2010-03-15. Created: 2010-03-14. Last updated: 2010-03-15. Bibliographically approved.
List of papers
1. Cognitive Perspectives on Robot Behavior
2010 (English). In: Proceedings of the 2nd International Conference on Agents and Artificial Intelligence: Special Session on Computing Languages with Multi-Agent Systems and Bio-Inspired Devices / [ed] Joaquim Filipe, Ana Fred and Bernadette Sharp. Portugal: INSTICC, 2010, 373-382 p. Conference paper, Published paper (Refereed).
Abstract [en]

A growing body of research within the field of intelligent robotics argues for a view of intelligence drastically different from that of classical artificial intelligence and cognitive science. The holistic and embodied ideas expressed by this research promote the view that intelligence is an emergent phenomenon. Similar perspectives, where numerous interactions within the system lead to emergent properties and cognitive abilities beyond those of the individual parts, can be found within many scientific fields. With the goal of understanding how behavior may be represented in robots, the present review tries to grasp what this notion of emergence really means and to compare it with a selection of theories developed for the analysis of human cognition, including the extended mind, distributed cognition, and situated action. These theories reveal a view of intelligence where common notions of objects, goals, language, and reasoning have to be rethought: a view where behavior, as well as the agent itself, is defined by the observer rather than given by its nature, and where structures in the environment emerge through interaction rather than being recognized. In such a view, the fundamental question is how emergent systems appear and develop, and how they may be controlled.

Place, publisher, year, edition, pages
Portugal: INSTICC, 2010
Keyword
Behavior based control, Cognitive artificial intelligence, Distributed cognition, Ontology, Reactive robotics, Sensory-motor coordination, Situated action
National Category
Computer Science
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-31868 (URN)
978-989-674-021-4 (ISBN)
Conference
2nd International Conference on Agents and Artificial Intelligence
Available from: 2010-02-26. Created: 2010-02-19. Last updated: 2012-01-04. Bibliographically approved.
2. Behavior recognition for segmentation of demonstrated tasks
2008 (English). In: IEEE SMC International Conference on Distributed Human-Machine Systems (DHMS), 2008. Conference paper, Published paper (Refereed).
Abstract [en]

One common approach to the robot learning technique Learning from Demonstration is to use a set of pre-programmed skills as building blocks for more complex tasks. An important part of this approach is recognition of these skills in a demonstration comprising a stream of sensor and actuator data. In this paper, three novel techniques for behavior recognition are presented and compared. The first technique is function-oriented and compares actions for similar inputs. The second technique is based on auto-associative neural networks and compares reconstruction errors in sensory-motor space. The third technique is based on S-Learning and compares sequences of patterns in sensory-motor space. All three techniques compute an activity level, which can be seen as an alternative to a pure classification approach. The tests performed show how this activity-level approach allows a more informative interpretation of a demonstration, by not determining "correct" behaviors but rather presenting a number of alternative interpretations.
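
As an illustration of the second technique, the sketch below uses a simplified linear auto-associator trained with plain gradient descent (an assumption made for this example; the network used in the paper may differ) and turns reconstruction error in sensory-motor space into an activity level. The function names train_autoassociator and activity_level are hypothetical.

import numpy as np

def train_autoassociator(samples, hidden=4, lr=0.01, epochs=500):
    """Fit encode/decode matrices W, V so that V @ W @ x approximates x.
    samples: array of shape (n_samples, dim) with sensory-motor vectors."""
    dim = samples.shape[1]
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.1, size=(hidden, dim))
    V = rng.normal(scale=0.1, size=(dim, hidden))
    for _ in range(epochs):
        for x in samples:
            h = W @ x
            err = V @ h - x                      # reconstruction error for this sample
            grad_V = np.outer(err, h)
            grad_W = np.outer(V.T @ err, x)
            V -= lr * grad_V
            W -= lr * grad_W
    return W, V

def activity_level(W, V, x, scale=1.0):
    """High when the sensory-motor sample x is well reconstructed by this skill's model."""
    err = np.linalg.norm(V @ (W @ x) - x)
    return float(np.exp(-err / scale))

One such model would be trained per pre-programmed skill; at each time step the skill whose model best reconstructs the current sensory-motor sample receives the highest activity, and several skills can remain active at once instead of forcing a single classification.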

Keyword
Learning from demonstration, Segmentation, Generalization, Sequence Learning, Auto-associative neural networks, S-Learning
National Category
Computer Science
Identifiers
urn:nbn:se:umu:diva-9300 (URN)
978-80-01-04027-0 (ISBN)
Available from: 2008-03-19. Created: 2008-03-19. Last updated: 2012-01-04. Bibliographically approved.
3. A formalism for learning from demonstration
2010 (English). In: Paladyn Journal of Behavioral Robotics, ISSN 2080-9778, 2081-4836 (e-version), Vol. 1, no. 1, 1-13 p. Article in journal (Refereed), Published.
Abstract [en]

The paper describes and formalizes the concepts and assumptions involved in Learning from Demonstration (LFD), a common learning technique used in robotics. LFD-related concepts like goal, generalization, and repetition are here defined, analyzed, and put into context. Robot behaviors are described in terms of trajectories through information spaces and learning is formulated as mappings between some of these spaces. Finally, behavior primitives are introduced as one example of good bias in learning, dividing the learning process into the three stages of behavior segmentation, behavior recognition, and behavior coordination. The formalism is exemplified through a sequence learning task where a robot equipped with a gripper arm is to move objects to specific areas. The introduced concepts are illustrated with special focus on how bias of various kinds can be used to enable learning from a single demonstration, and how ambiguities in demonstrations can be identified and handled.
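
The three-stage structure can be pictured with a few illustrative data types. Everything below is a hypothetical sketch of how such a segmentation-recognition-coordination pipeline might be wired together, not the formalism defined in the paper.

from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Event = Tuple[tuple, tuple]                     # (sensor reading, motor command)
Demonstration = Sequence[Event]

@dataclass
class BehaviorPrimitive:
    name: str
    controller: Callable[[tuple], tuple]        # sensor reading -> motor command
    matches: Callable[[Demonstration], float]   # how well a segment fits this primitive

def segment(demo: Demonstration, boundary: Callable[[Event, Event], bool]) -> List[List[Event]]:
    """Split a (non-empty) demonstration wherever the boundary test fires."""
    segments, current = [], [demo[0]]
    for prev, ev in zip(demo, demo[1:]):
        if boundary(prev, ev):
            segments.append(current)
            current = []
        current.append(ev)
    segments.append(current)
    return segments

def recognize(segments: List[List[Event]], primitives: List[BehaviorPrimitive]) -> List[BehaviorPrimitive]:
    """Pick, for each segment, the primitive that fits it best."""
    return [max(primitives, key=lambda p: p.matches(seg)) for seg in segments]

def coordinate(plan: List[BehaviorPrimitive]) -> List[Callable[[tuple], tuple]]:
    """A trivial coordinator: execute the recognized primitives in order."""
    return [p.controller for p in plan]

Ambiguities show up naturally in this picture: several primitives may match a segment almost equally well, and the coordination stage is where such alternative interpretations would have to be resolved.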

Place, publisher, year, edition, pages
Versita, 2010
Keyword
Learning from demonstration, ambiguities, behavior, bias, generalization, robot learning
National Category
Human Computer Interaction; Computer Science
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-32492 (URN)
10.2478/s13230-010-0001-5 (DOI)
Note
Co-published with Springer-Verlag GmbH; published online: 31 March 2010.
Available from: 2010-06-24. Created: 2010-03-14. Last updated: 2012-01-04. Bibliographically approved.
4. Behavior recognition for learning from demonstration
2010 (English). In: Proceedings of IEEE International Conference on Robotics and Automation / [ed] Nancy M. Amato et al., 2010, 866-872 p. Conference paper, Published paper (Refereed).
Abstract [en]

Two methods for behavior recognition are presented and evaluated. Both methods are based on the dynamic temporal difference algorithm Predictive Sequence Learning (PSL) which has previously been proposed as a learning algorithm for robot control. One strength of the proposed recognition methods is that the model PSL builds to recognize behaviors is identical to that used for control, implying that the controller (inverse model) and the recognition algorithm (forward model) can be implemented as two aspects of the same model. The two proposed methods, PSLE-Comparison and PSLH-Comparison, are evaluated in a Learning from Demonstration setting, where each algorithm should recognize a known skill in a demonstration performed via teleoperation. PSLH-Comparison produced the smallest recognition error. The results indicate that PSLH-Comparison could be a suitable algorithm for integration in a hierarchical control system consistent with recent models of human perception and motor control.
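
The general idea can be illustrated with a generic prediction-error comparison (this is not the specific PSLE or PSLH measure from the paper): each known skill keeps its own forward model, for example a SequencePredictor as sketched earlier in this record, and the skill whose model best predicts the demonstration is reported.

def prediction_error(model, demonstration):
    """Fraction of events in the demonstration that the model fails to predict."""
    misses = 0
    for t in range(1, len(demonstration)):
        if model.predict(demonstration[:t]) != demonstration[t]:
            misses += 1
    return misses / max(1, len(demonstration) - 1)

def recognize_skill(models, demonstration):
    """models: dict mapping skill name -> forward model trained on that skill.
    Returns the name of the skill with the lowest prediction error."""
    return min(models, key=lambda name: prediction_error(models[name], demonstration))

Because the same forward model can also be run generatively to produce the next action, recognition and control are two uses of one model, which is the point made in the abstract above.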

Keyword
learning and adaptive systems, neurorobotics, autonomous agents
National Category
Computer Science
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-32375 (URN)
978-1-4244-5040-4 (ISBN)
Conference
ICRA 2010, IEEE International Conference on Robotics and Automation, Anchorage, Alaska, May 3-8, 2010
Available from: 2010-07-05. Created: 2010-03-10. Last updated: 2012-01-04. Bibliographically approved.
5. Model-free learning from demonstration
2010 (English). In: ICAART 2010 - Proceedings of the International Conference on Agents and Artificial Intelligence: Volume 2 / [ed] Joaquim Filipe, Ana LN Fred, Bernadette Sharp. Portugal: INSTICC, 2010, 62-71 p. Conference paper, Published paper (Refereed).
Abstract [en]

A novel robot learning algorithm called Predictive Sequence Learning (PSL) is presented and evaluated. PSL is a model-free prediction algorithm inspired by the dynamic temporal difference algorithm S-Learning. While S-Learning has previously been applied as a reinforcement learning algorithm for robots, PSL is here applied to a Learning from Demonstration problem. The proposed algorithm is evaluated on four tasks using a Khepera II robot. PSL builds a model from demonstrated data, which is then used to repeat the demonstrated behavior. After training, PSL can control the robot by continually predicting the next action, based on the sequence of past sensor and motor events. PSL was able to successfully learn and repeat the first three (elementary) tasks, but it was unable to successfully repeat the fourth (composed) behavior. The results indicate that PSL is suitable for learning problems up to a certain complexity, while higher-level coordination is required for learning more complex behaviors.
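
A sketch of the control loop described here: after training, the predictor is fed the growing history of sensor and motor events, and its prediction is executed as the next motor command. The robot interface functions read_sensors and send_motor_command are hypothetical, and sensor readings are assumed to be discretized so that events can be compared and hashed; this illustrates the idea, not the controller used in the paper.

def run_controller(predictor, read_sensors, send_motor_command, steps=1000):
    """Drive the robot by continually predicting the next event in the
    interleaved sensor-motor sequence (sketch only). Events are assumed to be
    tagged tuples: ("sensor", reading) or ("motor", command)."""
    history = []
    for _ in range(steps):
        history.append(("sensor", read_sensors()))
        predicted = predictor.predict(history)   # e.g. the SequencePredictor sketched earlier
        if predicted is None or predicted[0] != "motor":
            break                                # no usable prediction: stop, or hand control back to the teacher
        send_motor_command(predicted[1])
        history.append(predicted)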

Place, publisher, year, edition, pages
Portugal: INSTICC, 2010
Keyword
Learning from Demonstration, Prediction, Robot Imitation, Motor Control, Model-free Learning
National Category
Computer Science
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-31865 (URN)
978-989-674-022-1 (ISBN)
Conference
ICAART 2010 - The International Conference on Agents and Artificial Intelligence - Agents, Valencia, Spain, January 22-24, 2010
Available from: 2010-02-25. Created: 2010-02-19. Last updated: 2011-05-27. Bibliographically approved.

Open Access in DiVA

fulltext (1055 kB)
File information
File name: FULLTEXT01.pdf
File size: 1055 kB
Checksum (SHA-512):
4dc6357d883c39f21502fa0bbeb3a6176d1925fbe6c56d03b033c56d2350b901afcc112e004b0b97256c7b5f357ae4260fe03ec6a37b896c40848924d7e9a2bf
Type: fulltext
Mimetype: application/pdf

Other links

http://www.cognitionreversed.com/


By author/editor
Billing, Erik
By organisation
Department of Computing Science