University of Southern California
Research Group

Planning and coordination for robot teams: DCOPs and Distributed POMDPs

 

Current Team:

Milind Tambe

Matthew Taylor

Manish Jain

Jun-Young Kwak

Onur Sert

Jagrut Sharma

Greg Rathjen

Prateek Tandon

 

Motivation

In planning and coordination for robot teams, we focus on realizing the promise of distributed constraint optimization (DCOPs) and distributed partially observable markov decision problems (Distributed POMDPs) frameworks. Our initial work on coordinating iRobot CREATEs used DCOPs, and work is on-going on implementing distributed POMDPs.

 

DCOP

Buoyed by recent successes in the area of distributed constraint optimization problems (DCOPs), this paper addresses challenges faced when applying DCOPs to real-world domains. Three fundamental challenges must be addressed for a class of real-world domains, requiring novel DCOP algorithms. First, agents may not know the payoff matrix and must explore the environment to determine rewards associated with variable settings. Second, agents may need to maximize total accumulated reward rather than instantaneous final reward. Third, limited time horizons disallow exhaustive exploration of the environment. We propose and implement a set of novel algorithms that combine decision-theoretic exploration ap- proaches with DCOP-mandated coordination. In addition to simulation results, we implement these algorithms on robots, deploying DCOPs on a distributed mobile sensor network.

   

Publications:

 

POMDP

In realistic domains, uncertainty in terms of an agent’s actions and observations is fundamental. A partially observable markov decision problem (POMDP) offers an expressive and powerful computational mechanism for generating the optimal policy that maximizes an agent’s total expected reward in such stochastic domains. More recently, the problem of deriving joint policies for a group of agents that maximize joint reward function has been modeled as distributed POMDPs (DEC-POMDPs) which provide a highly expressive framework for modeling multiagent collaboration problems. Given the NEXP-Complete complexity of DEC-POMDPs, however, the emerging consensus is to pursue approximate solutions or sacrifice expressiveness by identifying useful subclasses of DEC-POMDPs. Such approaches, through finding approximate joint policies or exploiting the structure of a subclass, are able to significantly improve the performance.

Publications:

More Info...