In planning and coordination for robot teams, we focus on realizing the promise of
distributed constraint optimization (DCOPs) and distributed partially observable markov decision problems (Distributed POMDPs)
frameworks. Our initial work on coordinating iRobot CREATEs used DCOPs, and work is on-going on implementing distributed
POMDPs.
DCOP
Buoyed by recent successes in the area of
distributed constraint optimization problems
(DCOPs), this paper addresses challenges faced
when applying DCOPs to real-world domains.
Three fundamental challenges must be addressed
for a class of real-world domains, requiring novel
DCOP algorithms. First, agents may not know the
payoff matrix and must explore the environment
to determine rewards associated with variable
settings. Second, agents may need to maximize
total accumulated reward rather than instantaneous
final reward. Third, limited time horizons disallow
exhaustive exploration of the environment. We
propose and implement a set of novel algorithms
that combine decision-theoretic exploration ap-
proaches with DCOP-mandated coordination. In
addition to simulation results, we implement these
algorithms on robots, deploying DCOPs on a
distributed mobile sensor network.
In realistic domains, uncertainty in terms of an agent’s actions and observations is fundamental. A partially observable markov decision problem (POMDP) offers an expressive and powerful computational mechanism for generating the optimal policy that maximizes an agent’s total expected reward in such stochastic domains. More recently, the problem of deriving joint policies for a group of agents that maximize joint reward function has been modeled as distributed POMDPs (DEC-POMDPs) which provide a highly expressive framework for modeling multiagent collaboration problems. Given the NEXP-Complete complexity of DEC-POMDPs, however, the emerging consensus is to pursue approximate solutions or sacrifice expressiveness by identifying useful subclasses of DEC-POMDPs. Such approaches, through finding approximate joint policies or exploiting the structure of a subclass, are able to significantly improve the performance.