Torby's Blog: 2012

Thursday, October 04, 2012

Tele-assisted living

The idea behind tele-assisted living is that many services can be provided remotely if there is a mobile manipulator connected to the Internet in someone's home.
Tele-operating this platform will remove the requirement for high levels of robot autonomy which we are not likely to see for decades.

Such a platform could also provide a large amount of training data to speed up the development of robot autonomy. My interest is in the area of robot learning and the delegation of skills from a tele-operator to the robot.

Many research activities are currently ongoing in this area and this blog post is meant as a list of these.

FP7 projects

ACCOMPANY: Acceptable robotiCs COMPanions for AgeiNg Years (includes Hertfordshire and Birmingham)
AALIANCE2: The European Ambient Assisted Living Innovation Alliance (includes Tunstall Healthcare Ltd., UK)
CAPSIL: International Support of a Common Awareness and Knowledge Platform for Studying and Enabling Independent Living (FP7 Support Action, includes Imperial College, London)
CONFIDENCE: Ubiquitous Care System to Support Independent Living (no UK partners)
DOMEO: Domestic robot for elderly assistance (no UK partners)
FLORENCE: Multi-Purpose Mobile Robot for Ambient Assisted Living (no UK partners)

KSERA: Knowledgeable Service Robots for Ageing (no UK partners)
MOBISERV: An Integrated Intelligent Home Environment for the Provision of Health, Nutrition and Mobility Services to the Elderly (includes Bristol)
ROBOT-ERA: Implementation and Integration of Advanced Robotic Systems and Intelligent Environments in Real Scenarios for the Ageing Population (includes Plymouth)
SCRIPT: Supervised Care and Rehabilitation Involving Personal Tele-Robotics (includes Hertfordshire and Sheffield)
SRS: Multi-Role Shadow Robotic System for Independent Living (includes Cardiff and Bedfordshire)

Interest groups

KT-EQUAL (Sheffield)

Thursday, April 26, 2012

Horizontal Connections

I have yet to find a neural model of the competitive selection mechanism that finds the winning node of a SOM. The selection process has been parallelised many times in SOM implementations based on FPGAs or CUDA (such as Aquila). There is also a lot of work on horizontal/lateral connections in the cortex. It would be very interesting to study a SOM implementation that uses such lateral connections to select SOM winners.
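To make the contrast concrete, here is a minimal sketch of the two selection schemes. The first function is the usual global search for the closest node. The second is only an illustration of winner-take-all through mutual inhibition, under the assumption that every node inhibits every other node equally; it is not a neural model, and the function names and the `inhibition` parameter are invented for this example.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def winner_argmin(weights, x):
    """Standard SOM winner: index of the node whose weight vector is closest to x."""
    distances = [euclidean(w, x) for w in weights]
    return min(range(len(weights)), key=distances.__getitem__)

def winner_lateral(weights, x, inhibition=0.2, max_steps=1000):
    """Winner-take-all through all-to-all lateral inhibition (illustrative only).

    Assumes 0 < inhibition < 1, so the most active node can never be silenced.
    """
    n = len(weights)
    if n == 1:
        return 0
    # Initial activity is higher for nodes closer to the input.
    act = [1.0 / (1.0 + euclidean(w, x)) for w in weights]
    for _ in range(max_steps):
        total = sum(act)
        # Each node is suppressed by the mean activity of all the other nodes.
        act = [max(0.0, a - inhibition * (total - a) / (n - 1)) for a in act]
        # Rescale so the most active node stays at 1 (keeps the numbers stable).
        peak = max(act)
        act = [a / peak for a in act]
        if sum(1 for a in act if a > 0.0) == 1:
            break
    # Exact ties never separate; the lowest index among them is returned.
    return max(range(n), key=act.__getitem__)
```

Because the update preserves the ordering of the activities, both functions agree on the winner whenever it is unique. The interesting question is whether the inhibition could be made local, so that it resembles cortical lateral connections.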

Endogenously Active Elements

Endogenous activity has been raised as an important mechanism for cognition, both in terms of neural self-organisation (Choa and Choi 2012) and for control (Bechtel and Abrahamsen 2010). Such a mechanism would be a very interesting extension to our PLANCS framework for behavior-based robotics (Dahl and Giraud-Carrier 2005).

Bechtel, W. and Abrahamsen, A. (2010) Understanding the brain as an endogenously active mechanism. In Proceedings of the 32nd Annual Conference of the Cognitive Science Society, Austin, Texas. (see Bechtel's web site http://mechanism.ucsd.edu/~bill/index.html)
Choa, M. W. and Choi, M. Y. (2012) Spontaneous organization of the cortical structure through endogenous neural firing and gap junction transmission, Neural Networks 31:46–52.
Dahl, T. S. and Giraud-Carrier, C. (2005) Incremental Development of Adaptive Behaviors using Trees of Self-Contained Solutions, Adaptive Behavior, 13(3):243-260.

Monday, March 19, 2012

Interpolation with SOMs

The traditional use of a SOM is to reduce the dimensionality of the data through vector quantization.
It may be possible, however, to use SOMs to interpolate across a data space based on a small number of data points. It turns out that not only is this possible, but it has already been done by a number of people!

Goppert and Rosenstiel (1997) The Continuous Interpolating Self-Organizing Map, Neural Processing Letters, 5:185-192.
Yin and Allinson (1999) Interpolating self-organizing map (iSOM), Electronics Letters, 35(19):1649-1650.
Kawano, Orii, Shiraishi and Maeda (2010) A Method for Multiple Image Interpolation Employing Self-Organizing Map, Proceedings of the SMC'2010, pages 4035-4040, 10-13 October, Istanbul.
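
As a rough illustration of the difference between quantization and interpolation, the sketch below blends the values stored at the nearest prototypes instead of returning the value of the single closest one. It uses plain inverse-distance weighting, which is a generic technique and not the method of any of the papers listed above; the function name, the `values` list attached to the prototypes and the parameter `k` are assumptions made for this example.

```python
import math

def interpolate(prototypes, values, x, k=2, eps=1e-12):
    """Inverse-distance-weighted blend of the values at the k prototypes nearest to x.

    prototypes: list of weight vectors (e.g. taken from a trained SOM)
    values:     one scalar output per prototype (hypothetical, for illustration)
    """
    # Pair each distance with its value and keep the k closest prototypes.
    nearest = sorted(
        (math.sqrt(sum((pi - xi) ** 2 for pi, xi in zip(p, x))), v)
        for p, v in zip(prototypes, values)
    )[:k]
    # If x sits on a prototype, return its value directly (avoids division by zero).
    if nearest[0][0] < eps:
        return nearest[0][1]
    weights = [1.0 / d for d, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)
```

With prototypes at 0, 1 and 2 carrying the values 0, 10 and 20, a query at 1.25 gives 12.5 with `k=2`, whereas pure vector quantization (`k=1`) would return 10.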

Now, applying this to RL so that we can construct an attractor landscape is a very interesting idea.

Wednesday, February 15, 2012

Incremental exploration

Learning from demonstration typically means learning from training data in the form of a relatively small number of complex sequences of observations and, potentially, actions. The strength of this learning paradigm is that the data provided relate to the crucial areas of the problem space. In the case of reinforcement learning, this would involve the key reward states and effective paths to these states from relevant starting states. However, because demonstrations can cover only a restricted part of the problem space, this form of learning typically leads to brittle behaviours that cannot compensate for perturbations that place the robot outside the known area.

One solution to this problem is to use the training data to learn a policy that generalises across large areas of the problem space, such as the Nonlinear Dynamical Systems presented by Ijspeert et al. [3]. Another approach is to hard-code a mechanism for returning to the known area, such as the extension to Gaussian Mixture Models presented by Calinon [2]. Abbeel and Ng [1] argued, from their experience in the domain of autonomous helicopter control, that an explicit exploration policy is not required in order to improve performance up to or beyond that of the teacher. Instead, the natural perturbations would provide sufficient exploration.

Bill Smart and Leslie Kaelbling [5] developed the JAQL (Joystick and Q-learning?) algorithm to overcome this problem. The JAQL algorithm has two different learning phases. In the first phase, the robot is driven through the "interesting" parts of the problem space by a hand-coded controller or by a human controller using a joystick. In the second phase, the learned policy is in control and responsible for further exploration, running in a more standard reinforcement learning mode. The JAQL algorithm has an explicit exploration policy designed to work with policies learnt from demonstration.

The JAQL exploration policy creates slight deviations from the greedy action by adding a small amount of Gaussian noise [4]. This policy creates actions that are "similar to, but different from", the greedy action.
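
The idea of perturbing the greedy action with Gaussian noise can be sketched in a few lines. This is only an illustration of the general technique, not the JAQL implementation: the function name, the noise scale `sigma`, the action bounds and the clipping step are all assumptions made for this example.

```python
import random

def explore(greedy_action, sigma=0.05, low=-1.0, high=1.0, rng=random):
    """Return an action that is "similar to, but different from" the greedy one.

    greedy_action: list of continuous action components
    sigma:         standard deviation of the added noise (hypothetical default)
    low, high:     assumed actuator limits; the noisy action is clipped to them
    rng:           random module or a random.Random instance, for reproducibility
    """
    noisy = [a + rng.gauss(0.0, sigma) for a in greedy_action]
    return [min(high, max(low, a)) for a in noisy]
```

A small `sigma` keeps the robot close to the demonstrated behaviour, while a larger value trades safety for a wider search around it.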

Our RLSOM algorithm has so far been applied only to learning by demonstration, but it should be capable of handling learning from exploration without modifications other than the addition of a reasonable exploration policy. This is one of the most exciting directions in which to take our research.

[1] Pieter Abbeel and Andrew Y. Ng, Exploration and apprenticeship learning in reinforcement learning. In the Proceedings of the 22nd International Conference on Machine Learning (ICML'05), pp1-8, August 7-11, Bonn, Germany, 2005.

[2] Sylvain Calinon, Robot Programming by Demonstration: A Probabilistic Approach. EPFL/CRC Press, 2009.

[3] Auke J. Ijspeert, Jun Nakanishi and Stefan Schaal, Movement imitation with nonlinear dynamical systems in humanoid robots. In the Proceedings of the International Conference on Robotics and Automation (ICRA'02), pp1398-1403, May 11 - 15, Washington, DC, 2002.

[4] William D. Smart, Making Reinforcement Learning Work on Real Robots. Ph.D. thesis, Department of Computer Science, Brown University, 2002.

[5] William D. Smart and Leslie Pack Kaelbling, Reinforcement Learning for Robot Control. In Mobile Robots XVI (Proceedings of the SPIE 4573), pp92-103, Douglas W. Gage and Howie M. Choset (eds.), Boston, Massachusetts, 2001.

Sunday, February 05, 2012

Dimensions of cognition: A Generalist Manifesto

The article below builds on some ideas discussed in my ECAL 2001 paper, Evolution, Adaptation, and Behavioural Holism in Artificial Intelligence, and also included in my PhD dissertation, Behaviour-Based Learning: Evolution-Inspired Development of Adaptive Robot Behaviours. They were originally a comment on behavior-based robotics as presented in Rodney Brooks's papers A Robust Layered Control System for a Mobile Robot and Elephants Don't Play Chess, and in Ronald C. Arkin's book Behavior-based Robotics.

These ideas gained new relevance recently when I participated in the Challenges for Artificial Cognitive Systems II workshop arranged by the European Network for the Advancement of Artificial Cognitive Systems, Interaction and Robotics (EUCogII). Some ideas from that workshop were captured in the workshop wiki. Below I summarize what I took from the workshop.

There is a history, in the science of AI and its related fields, of looking at problems in isolation and of simplifying problems to the extent that the solutions do not contribute significantly to our knowledge of cognition. Three examples are symbolic problem solving, computer vision and speech recognition. While each of these areas has produced valuable technologies in its own right, they have also, by narrowing their focus, removed themselves so far from the problems faced by animals, including humans, that their solutions have not, to any great extent, helped our understanding of cognitive systems. One cannot criticize this work for simplifying and narrowing its scope, as this is a necessary part of developing working solutions to given problems. The specificity of the solutions, however, raises the question of whether it is possible to develop systems that both solve a specific problem and also provide key insights into cognition. Cognition is, arguably, the ability to apply a general understanding of the world to new problems and situations, and, if so, cognition is generalization rather than specialization.

A generalist approach to modelling cognition raises two fundamental questions:

Generalize across what?
What kind of problems can provide tractable challenges for generalist cognitive systems?

By tractable challenges we mean challenges for which it is possible to imagine solutions based on the incremental development or integration of existing technologies. There are many examples of interesting but intractable challenges for generalist cognitive systems. Most challenges requiring human-like cognitive abilities, such as autonomous robot workers or companions, are clearly still intractable, but many challenges which have, arguably, lower cognitive requirements, e.g., robotic sheep dogs, guard dogs, steeds or even pack animals, also look intractable with respect to many of the sub-problems they contain.

Beyond finding tractable challenges it is also interesting to consider whether we can identify a sequence of challenges that could form milestones along a path of increasingly high levels of generalist cognitive abilities. Following from the second question, we can also ask whether any such problems would be practical, by which we mean that solving them would provide a technology that could be useful to society.

Physiology

Recognizing that cognition is dependent on physiology introduces a number of physiological dimensions of cognition:

Sensors: vision (stereo, colour), audition (stereo), proprioceptive, tactile
Actuators: muscles, legs, arms, hands, thumbs
Nervous system: spine, limbic system, cortical architecture

Neil R. Carlson's Physiology of Behavior is a good introduction to sensors, actuators and related physiology. The architecture of the complete brain is described in Larry W. Swanson's book Brain Architecture: Understanding the Basic Plan. The cortical architecture is discussed in Joaquin M Fuster's book Cortex and Mind: Unifying Cognition.

Environment

The environment is also a crucial actor in enabling or prohibiting intelligent behaviour. I have divided this into:

Physical environment
Social environment

A book that discusses some issues in this respect is John Alcock's Animal Behavior: An Evolutionary Approach.

Cognition

Annette Karmiloff-Smith, in her book Beyond Modularity: A Developmental Perspective on Cognitive Science, builds on traditional developmental approaches such as those of Piaget and Fodor to suggest that a child's cognitive development takes place within five different domains before the process of representational redescription produces domain-general knowledge from the previous domain-specific knowledge. Karmiloff-Smith considers the following domains:

The child as a linguist
The child as a physicist
The child as a mathematician
The child as a psychologist
The child as a notator

Looking for dimensions of cognition, Steven Mithen, in his book The Prehistory of the Mind (1996), suggests a number of specialized intelligences that act as a foundation, or 'chapels', around an initial general intelligence. On top of these chapels, the 'superchapel' of meta-representation is then built to provide the cognitive capabilities of modern humans. The specialized intelligences suggested by Mithen are:

Technical intelligence
Natural history intelligence
Social intelligence
Linguistic intelligence

Finally, in the book A Roadmap for Cognitive Development in Humanoid Robots, David Vernon, Claes von Hofsten and Luciano Fadiga have used knowledge from human cognitive development to define a road map for cognitive development in humanoid robots. This work is very much in the spirit of what I suggest here, but I would like to consider a wider scientific context that will give us a better chance of identifying realistic milestones.

Examples

The Kismet robot was developed at MIT and used in a wide range of research activities.

The iCub robot is a popular humanoid research robot that has also been used for a wide range of research activities. As a result, it also has a well developed cognitive architecture.

Monday, January 23, 2012

Prioritized sweeping

In considering how we can make our RLSOM algorithm more efficient, the idea of focusing the activation spreading around the area of maximum activity was raised (by my PhD student Georgios Pierris). This concept was formalised as 'prioritized sweeping' by Moore and Atkeson in 1993.

Like growing SOMs, this is a concept I would like to explore further.
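
For reference, here is a minimal tabular sketch of the idea, assuming a known, deterministic model of the task. It follows the common textbook form of prioritized sweeping rather than the exact algorithm in the paper, and it has nothing RLSOM-specific in it; the function name, the `model` dictionary layout and the parameters `gamma`, `theta` and `max_updates` are assumptions made for this example.

```python
import heapq
from collections import defaultdict

def prioritized_sweeping(model, actions, gamma=0.9, theta=1e-6, max_updates=10000):
    """Compute Q-values by always updating the pair with the largest pending change.

    model:   dict mapping (state, action) -> (reward, next_state), deterministic
    actions: list of all actions; states missing from the model count as terminal
    States and actions must be orderable (e.g. ints or strings) for heap ties.
    """
    q = defaultdict(float)
    predecessors = defaultdict(set)
    for (s, a), (_, s2) in model.items():
        predecessors[s2].add((s, a))

    def target(s, a):
        r, s2 = model[(s, a)]
        return r + gamma * max(q[(s2, b)] for b in actions)

    def push_if_urgent(queue, s, a):
        priority = abs(target(s, a) - q[(s, a)])
        if priority > theta:
            # heapq is a min-heap, so store the negated priority.
            heapq.heappush(queue, (-priority, s, a))

    queue = []
    for (s, a) in model:
        push_if_urgent(queue, s, a)

    updates = 0
    while queue and updates < max_updates:
        _, s, a = heapq.heappop(queue)
        q[(s, a)] = target(s, a)  # stale duplicates in the queue are harmless
        updates += 1
        # A change at s may change the targets of everything that leads into s.
        for (ps, pa) in predecessors[s]:
            push_if_urgent(queue, ps, pa)
    return q
```

On a three-step chain with a reward of 1 on the final transition, the value is propagated backwards in exactly three updates, starting at the rewarded transition, which is the kind of focus around the area of maximum activity described above.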

Reference:

Moore, A.W., Atkeson, C.G.: Prioritized sweeping: Reinforcement learning with less data and less time. Machine Learning, 13:103–130, 1993.