A Formal Model of Affordances for Flexible Robotic Task Execution

In recent years, it has been shown that robotic agents can generate human-like behavior for everyday tasks such as preparing a meal or making a pancake. These demonstrations, while very impressive, impose many hard constraints on the environment, so deploying, for example, a robotic cook to an arbitrary household is still impossible. The step change is to make it possible to deploy robots to many different environments and for many different tasks without the need to re-program them. We believe that one of the key steps toward this goal is an abstraction of the interaction between the robotic agent and its environment. Such an abstraction enables the robot to execute plans without hardcoding interaction patterns. This can be achieved through reasoning about object affordances, which is the topic of a paper we have submitted to the 24th European Conference on Artificial Intelligence (ECAI 2020) [1].

The term affordance was first introduced by James Gibson in 1979 to describe the interaction of living beings with their environment. He defined it as “what the environment offers to the animal, what it provides or furnishes, either for good or ill”. An affordance is not just a property of the environment: it also depends on the agent, which must be aware of it, and on whether the agent's embodiment is suitable to perform the interaction. However, the scientific community has not reached a consensus on how an affordance should be characterized ontologically. Gibson already acknowledged this by stating that “an affordance is neither an objective property nor a subjective; or it is both if you like. An affordance cuts across the dichotomy of subjective-objective. It is both physical and psychical, yet neither”. A number of different approaches to affordance modeling have been proposed in the literature, including characterizing affordances as events or as object qualities. Our work is based on the dispositional object theory proposed by Turvey, in which a disposition of an object (the bearer) can be realized when it meets another suitably disposed object (the trigger) in the right conditions (the background).

Our goal is to enable a robot to answer different types of questions about potential interactions with its environment. One of these questions is “what can be used for a particular purpose?”, that is, which combinations of objects may interact with each other such that a given goal can be achieved. As an example, a robot cook may need to find a suitable substance that can be used as a thickener for some gravy that would otherwise be too thin. Being able to answer this question is particularly relevant for goal-directed behavior. Another aspect is that a robot may first need to explore an unknown environment. To discover interaction potentials, it may ask “what can this be used for?” after it has detected some object, let's say a package of flour, where the answer might include that flour can be used as a gravy thickener. Similarly, the robot may ask “what can this be used with?” to discover complementary objects in the environment, for example, that the flour can be used with the gravy. As knowledge about the world is likely to be incomplete, it is also worth considering negative variants of the previous questions, such as “what cannot be used for a particular purpose?”. These are useful to distinguish cases where it is explicitly known that an object cannot be used for some purpose from cases where this is unknown.
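The question types above can be illustrated with a minimal Python sketch. It is not the formalization from the paper: the class AffordanceKB, its method names, the (bearer, trigger, task) triple encoding, and the toy facts (including which object is treated as bearer and which as trigger) are all assumptions made for illustration. The point it shows is that positive and negative knowledge are stored separately, so that "explicitly cannot" stays distinguishable from "unknown".

```python
from dataclasses import dataclass, field
from typing import Set, Tuple

# Hypothetical encoding: (bearer, trigger, task). Not taken from the paper.
Triple = Tuple[str, str, str]


@dataclass
class AffordanceKB:
    """Toy knowledge base with separate positive and negative facts."""
    can: Set[Triple] = field(default_factory=set)
    cannot: Set[Triple] = field(default_factory=set)

    def what_can_be_used_for(self, task: str) -> Set[Tuple[str, str]]:
        # "What can be used for a particular purpose?"
        return {(b, t) for (b, t, k) in self.can if k == task}

    def what_can_this_be_used_for(self, obj: str) -> Set[str]:
        # "What can this be used for?"
        return {k for (b, t, k) in self.can if obj in (b, t)}

    def what_can_this_be_used_with(self, obj: str) -> Set[str]:
        # "What can this be used with?" (complementary objects)
        partners = set()
        for b, t, _ in self.can:
            if b == obj:
                partners.add(t)
            elif t == obj:
                partners.add(b)
        return partners

    def what_cannot_be_used_for(self, task: str) -> Set[Tuple[str, str]]:
        # Negative variant: only explicitly known impossibilities.
        return {(b, t) for (b, t, k) in self.cannot if k == task}

    def status(self, bearer: str, trigger: str, task: str) -> str:
        # Distinguishes known-positive, known-negative, and unknown.
        triple = (bearer, trigger, task)
        if triple in self.can:
            return "can"
        if triple in self.cannot:
            return "cannot"
        return "unknown"


# Made-up example facts; only the flour/gravy pair comes from the text.
kb = AffordanceKB(
    can={("flour", "gravy", "thicken_gravy")},
    cannot={("water", "gravy", "thicken_gravy")},
)

print(kb.what_can_be_used_for("thicken_gravy"))
print(kb.what_can_this_be_used_for("flour"))
print(kb.what_can_this_be_used_with("flour"))
print(kb.status("sugar", "gravy", "thicken_gravy"))  # prints "unknown"
```

Keeping the negative facts in their own set, rather than inferring them from the absence of a positive fact, is what lets the last query answer "unknown" for an object the robot has no information about.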

Turvey interprets a disposition as a property of an object that is a potential. Consequently, in our theory we define a disposition as a property of an object that can enable an agent to perform a certain task. Dispositions are seen as absolute properties that do not depend on context and that are implied by the existence of the object that hosts them. We further say that a dispositional match is a potential for interaction between the bearer and the trigger of some disposition. That is, two objects are complementary to each other with respect to some task, such as the flour and gravy from the earlier example, which form a dispositional match for the task of thickening gravy, or a valve with a diameter large enough to serve as a hiding spot for some agent, as displayed on this slide. An affordance is then defined as a description that conceptualizes a dispositional match, hence the name descriptive affordance theory. The notion of description is derived from the upper-level model that we use, which is partly based on the Descriptions and Situations ontology. The pattern is that concepts defined in descriptions are used to classify entities within situations that satisfy the description. We say that the manifestation of an affordance is a situation that satisfies the conceptualization of a dispositional match, meaning that an action was performed that executes the afforded task, with appropriate objects that have a dispositional match taking roles during that action.
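The chain of definitions above (disposition, dispositional match, affordance as description, manifestation as a satisfying situation) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the ontological model from the paper: all class and function names are hypothetical, a trigger is matched purely by a string-valued object kind, and a situation is reduced to the executed task plus the names of the objects that took roles.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Obj:
    name: str
    kind: str  # assumption: objects are typed by a plain string


@dataclass(frozen=True)
class Disposition:
    """A property of the bearer that can enable a task."""
    bearer: Obj
    task: str
    trigger_kind: str  # kind of object that can act as trigger


@dataclass(frozen=True)
class Affordance:
    """A description that conceptualizes a dispositional match."""
    task: str
    bearer: Obj
    trigger: Obj


@dataclass(frozen=True)
class Situation:
    """Simplified situation: the executed task and who took part."""
    executed_task: str
    participants: FrozenSet[str]


def dispositional_match(d: Disposition, candidate: Obj) -> Optional[Affordance]:
    # Bearer and candidate are complementary if the candidate has the
    # kind required of a trigger; the match is described by an Affordance.
    if candidate.kind == d.trigger_kind:
        return Affordance(task=d.task, bearer=d.bearer, trigger=candidate)
    return None


def manifests(s: Situation, a: Affordance) -> bool:
    # The situation satisfies the description if the afforded task was
    # executed and both matched objects took roles in it.
    return (s.executed_task == a.task
            and {a.bearer.name, a.trigger.name} <= s.participants)


# Illustrative individuals; names and kinds are made up.
flour = Obj("flour_1", "Flour")
gravy = Obj("gravy_1", "Gravy")
spoon = Obj("spoon_1", "Spoon")

thickener = Disposition(bearer=flour, task="thicken_gravy", trigger_kind="Gravy")
affordance = dispositional_match(thickener, gravy)

cooking = Situation("thicken_gravy", frozenset({"flour_1", "gravy_1", "spoon_1"}))
stirring = Situation("stir", frozenset({"gravy_1", "spoon_1"}))

print(affordance is not None)            # flour and gravy match
print(dispositional_match(thickener, spoon) is None)
print(manifests(cooking, affordance))    # afforded task executed with both objects
print(manifests(stirring, affordance))   # different task, flour not involved
```

In this sketch the disposition exists as soon as the flour object does, independent of context, while the affordance only appears once a suitable trigger is found, which mirrors the distinction drawn in the text.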

[1] Daniel Beßler, Robert Porzel, Mihai Pomarlan, Michael Beetz, Rainer Malaka, John Bateman,
    "A Formal Model of Affordances for Flexible Robotic Task Execution",
    In: Proceedings of the 24th European Conference on Artificial Intelligence (ECAI), 2020.