To make home-helper robots that can quickly navigate unpredictable and disordered spaces, University of Michigan researchers have developed an algorithm that lets robots spatially understand their environments orders of magnitude faster than previous approaches. “Robot perception is one of the biggest bottlenecks in providing capable assistive robots that can be used in people’s homes,” says Karthik Desingh, a grad student in computer science and engineering at the University of Michigan. “In industrial settings, where there is structure, robots can complete tasks such as building cars quickly. But we live in unstructured environments, and we want robots that can deal with our clutter.”
Historically, robots have operated most effectively in structured environments—e.g., behind guard rails or cages—that keep humans safe and the robot’s workspace clean and orderly. A human’s environment, however, at work or at home, is typically a jumble of objects: papers strewn across a keyboard, a bag hiding car keys, or an apron hiding half-open cupboards.
The team’s new algorithm is called Pull Message Passing for Nonparametric Belief Propagation. In 10 minutes, it can compute an estimate of an object’s pose (position and orientation) to a level of accuracy that takes previous approaches more than an hour and a half to match.
The team demonstrated the algorithm with a Fetch robot, showing that it can correctly perceive and use a set of drawers even when half of the drawers are covered with a blanket, when a drawer is half-open, or when the robot’s own arm blocks a full view of the drawers. The algorithm can also scale beyond a simple dresser to objects with several complicated joints; the robot can even accurately perceive its own body and gripper arm.
“The concepts behind our algorithm, such as Nonparametric Belief Propagation, are already used in computer vision and perform well in capturing the world’s uncertainties. But these models have had limited use in robotics as they are expensive computationally, requiring more time than practical for an interactive robot to help in everyday tasks,” explains Chad Jenkins, professor of computer science and engineering at Michigan’s Robotics Institute.
The Nonparametric Belief Propagation technique, along with the similar Particle Message Passing technique, was first described in 2003. Both are effective in computer vision, which attempts to thoroughly understand a scene through images and video, because two-dimensional images and video require less computational power and time than the three-dimensional scenes involved in robot perception.
These earlier approaches understand a scene by translating it into a graph model of nodes and edges, which represent each component of an object and their relationships between one another. The algorithms then hypothesize, or create beliefs of, component locations and orientations when given a set of constraints. These beliefs, which researchers call particles, vary across a range of probabilities.
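The graph-and-particles representation described above can be sketched in a few lines of Python. This is a minimal illustrative sketch, not the authors' code: the `Node` class, particle count, and pose parameterization are all assumptions made for clarity.

```python
import random

# Hypothetical sketch of a graph model for an articulated object (a dresser
# with three drawers). Each component is a node, and each node's belief over
# its pose is a set of weighted particles: ((x, y, orientation), weight).
class Node:
    def __init__(self, name, num_particles=100):
        self.name = name
        self.particles = [((random.uniform(-1, 1),
                            random.uniform(-1, 1),
                            random.uniform(-3.14, 3.14)),
                           1.0 / num_particles)
                          for _ in range(num_particles)]

# Nodes: the dresser frame and its three drawers.
nodes = {name: Node(name) for name in
         ["dresser", "drawer_0", "drawer_1", "drawer_2"]}

# Edges encode the relationships (constraints) between components:
# each drawer is attached to the dresser and slides along one axis.
edges = [("dresser", "drawer_0"),
         ("dresser", "drawer_1"),
         ("dresser", "drawer_2")]
```

Each node starts with uniform weights; message passing (described next) reshapes these particle sets until they concentrate on poses consistent with the constraints and the sensor data.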
To narrow down the most likely locations and orientations, the nodes use “push messaging” to send probable location information back and forth across the graph’s edges. That location information is then compared with sensor data. The process takes several iterations to arrive at an accurate belief of a scene.
For example, given a dresser with three drawers, each component of the object—each drawer and the dresser itself—would be a node. Constraints would demand that the drawers stay within the dresser and move laterally but not vertically.
The information, passed among the nodes, is compared with real observations from sensors, such as a 2D image and a 3D point cloud. Messages are repeated through iterations until there is an agreement between the beliefs and sensor data.
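One iteration of this push-style loop can be sketched as follows. This is a deliberately simplified one-dimensional illustration, not the published method: the `likelihood` function, the Gaussian noise, the `offset` constraint, and the observation value are all assumptions chosen to make the idea concrete.

```python
import math
import random

# Simplified 1D sketch of a push-style message-passing iteration.
# A node proposes poses for its neighbor from its own particles plus the
# kinematic constraint, then the proposals are reweighted against sensor data.

def likelihood(x, observation, sigma=0.1):
    # How well a hypothesized position x matches the sensor observation.
    return math.exp(-((x - observation) ** 2) / (2 * sigma ** 2))

def push_iteration(particles, observation, offset):
    # "offset" stands in for a constraint, e.g. a drawer sits a fixed
    # distance from the dresser frame.
    proposals = [x + offset + random.gauss(0, 0.05) for x, _ in particles]
    # Compare each proposed position with the observation and renormalize.
    weights = [likelihood(p, observation) for p in proposals]
    total = sum(weights) or 1.0
    return [(p, w / total) for p, w in zip(proposals, weights)]

particles = [(random.uniform(0, 1), 0.01) for _ in range(100)]
belief = push_iteration(particles, observation=0.8, offset=0.3)
```

Repeating such iterations concentrates each particle set on poses consistent with both the constraints and the sensor data, which is the “agreement” the article describes.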
To simplify the demands on computing, Desingh and the Michigan team used what is called “pull messaging.” Their approach turns the cacophony of back-and-forth, information-dense messages into a concise conversation between an object’s components.
In this example, instead of the dresser sending location information to a drawer only after computing information from the other drawers, the dresser checks with the drawers first. It asks each drawer for its own belief of its location, then, for accuracy, weighs that belief against information from the other drawers. It converges on an accurate understanding of a scene through iterations, just as the push approach does.
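The pull variant can be sketched in the same simplified setting. Again, this is an illustrative assumption-laden sketch, not the authors' implementation: the key difference shown is that the receiving node starts from its own particles and weighs them against a neighbor’s belief, rather than composing a dense outgoing message from all other neighbors first.

```python
import math
import random

# Simplified 1D sketch of pull-style messaging: the node samples from its
# OWN current belief, then weighs each sample by its consistency with a
# neighbor's belief, pulled through the kinematic constraint ("offset").

def likelihood(x, y, sigma=0.1):
    return math.exp(-((x - y) ** 2) / (2 * sigma ** 2))

def pull_iteration(own_particles, neighbor_particles, offset):
    weighted = []
    for x, _ in own_particles:
        # Weigh our own hypothesis by its best agreement with any of the
        # neighbor's hypothesized positions under the constraint.
        w = max(likelihood(x, nx + offset) for nx, _ in neighbor_particles)
        weighted.append((x, w))
    total = sum(w for _, w in weighted) or 1.0
    return [(x, w / total) for x, w in weighted]

drawer = [(random.uniform(0, 1), 0.01) for _ in range(100)]
dresser = [(random.gauss(0.5, 0.05), 0.01) for _ in range(100)]
drawer_belief = pull_iteration(drawer, dresser, offset=0.3)
```

Because each node evaluates only its own samples against incoming beliefs, the expensive all-pairs composition of the push approach is avoided, which is the source of the speedup the article reports.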
To directly compare their new approach with previous approaches, the researchers tested it on a simple 2D scene of a circle with four rectangular arms hidden among a pattern of similar circles and rectangles.
The previous approaches required more than 20 minutes of processing time per iteration to pass messages, while the team’s new method took fewer than two minutes, and the improvement grows exponentially as the number of beliefs, or particles, increases.
In these trials, it took five iterations of the new algorithm to achieve less than a 3.5-inch average error in estimating the locations of the drawers and dresser, and less than an 8-inch average error when the dresser is partly obscured by a blanket.
This is on par with previous approaches, and it varies depending on an object’s size, its number of parts, and how much of it is visible to sensors. Most importantly, continued iterations increase accuracy enough for a robot to successfully manipulate objects.
“This is just the start of what we can do with belief propagation in robot perception,” Desingh says. “We want to scale our work up to several objects and track them during action execution, even if the robot is not currently looking at an object. Then, the robot can use this ability to continually observe the world for goal-oriented manipulation and successfully complete tasks.”