Researchers are rushing to build AI-powered robots. But will they work?

Chelsea Finn (left) and Moo Jin Kim conduct a demonstration with a robot at Stanford University.
Moo Jin Kim/Stanford University

STANFORD, Calif. — Artificial intelligence can find you a recipe or generate a picture, but it can't hang a picture on a wall or cook you dinner.

Chelsea Finn wants that to change. Finn, an engineer and researcher at Stanford University, believes that AI may be on the cusp of powering a new era in robotics.

"In the long term we want to develop software that would allow the robots to operate intelligently in any situation," she says.

A company she co-founded has already demonstrated a general-purpose AI robot that can fold laundry, among other tasks. Other researchers have shown AI's potential for improving robots' ability to do everything from package sorting to drone racing. And Google just unveiled an AI-powered robot that could pack a lunch.

But the research community is split over whether generative AI tools can transform robotics the way they've transformed some online work. Robots require real-world data and face much tougher problems than chatbots.

"Robots are not going to suddenly become this science fiction dream overnight," says Ken Goldberg, a professor at UC Berkeley. "It's really important that people understand that, because we're not there yet."

Dreams and disappointment

Few parts of science and engineering have a larger gap between expectation and reality than robotics. The very word "robot" was coined by Karel Čapek, a Czech writer who, in the 1920s, wrote a play that imagined human-like beings that could carry out any task their owner commanded.

In reality, robots have had a great deal of trouble doing even trivial jobs. Machines are at their best when they perform highly repetitive movements in a carefully controlled environment, such as an automotive assembly line inside a factory, but the world is filled with unexpected obstacles and uncommon objects.

In Finn's laboratory at Stanford University, graduate student Moo Jin Kim demonstrates how AI-powered robots at least have the potential to fix some of those problems. Kim has been developing a program called "OpenVLA," which stands for Vision, Language, Action.

"It's one step in the direction of ChatGPT for robotics, but there's still a lot of work to do," he says.

Moo Jin Kim sets up an AI-powered robot at Stanford University.

The robot itself appears pretty unremarkable: just a pair of mechanical arms with pincers. What makes it different is what's inside. Regular robots must be carefully programmed, with an engineer writing detailed instructions for every task. This robot is instead powered by a teachable AI neural network. The network operates the way scientists believe the human brain might: mathematical "nodes" share billions of connections with one another, much as neurons in the brain are wired together. "Programming" such a network simply means reinforcing the connections that matter and weakening the ones that don't.
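
For readers who want a concrete picture, here is a toy sketch, in Python, of what "reinforcing the connections that matter" looks like. It is not OpenVLA or any lab's actual code; it just trains three connection weights and shows the useful one growing while the others stay near zero.

    import numpy as np

    # A toy network with three "connections" (weights). Only the first input actually
    # predicts the target, so training should reinforce that connection and leave the
    # others near zero. Purely an illustration, not real robotics code.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))        # three input signals
    y = 2.0 * X[:, 0]                    # the target depends only on the first signal

    w = np.zeros(3)                      # connection strengths start at zero
    learning_rate = 0.1
    for _ in range(200):
        error = X @ w - y                # how wrong the current connections are
        grad = X.T @ error / len(y)      # each connection's share of the blame
        w -= learning_rate * grad        # strengthen helpful connections, weaken the rest

    print(w.round(2))                    # roughly [2.0, 0.0, 0.0]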

In practice, this means Kim can train the OpenVLA model how to do a bunch of different tasks, simply by showing it.

Attached to the robot are a pair of joysticks that control each arm. To train it, a human operator uses the joysticks to "puppeteer" the robot as it does a desired task.

"Basically like whatever task you want it to do you just keep doing it over and over like 50 times or 100 times," he says.

That repetition is all that's required. Connections between nodes in the robot's AI neural network are reinforced each time it's shown the action. Soon it can repeat the task without the puppeteer.
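
In machine-learning terms, this demonstrate-and-repeat approach is usually called imitation learning, or behavior cloning: record what the human operator did, then fit a model that maps what the robot saw to what the operator commanded. The sketch below is a deliberately simplified stand-in for that pipeline; record_demo() and the linear policy are illustrative assumptions, not the lab's actual software.

    import numpy as np

    def record_demo(num_steps=50, obs_dim=8, act_dim=2, seed=0):
        """Hypothetical stand-in for teleoperating the real robot: fake (observation, action) pairs."""
        rng = np.random.default_rng(seed)
        obs = rng.normal(size=(num_steps, obs_dim))
        acts = obs[:, :act_dim] * 0.5          # pretend the operator reacts to what they see
        return obs, acts

    # "Keep doing it over and over, like 50 times or 100 times."
    all_obs, all_acts = [], []
    for demo in range(50):
        o, a = record_demo(seed=demo)
        all_obs.append(o)
        all_acts.append(a)
    O = np.concatenate(all_obs)
    A = np.concatenate(all_acts)

    # Supervised learning: fit a policy that maps observations to the operator's actions.
    # (A real system trains a deep network; a linear fit keeps the sketch self-contained.)
    W, *_ = np.linalg.lstsq(O, A, rcond=None)

    def policy(observation):
        return observation @ W                 # the robot now acts without the puppeteer

    print(policy(O[0]).round(2), A[0].round(2))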

To demonstrate, Kim brings out a tray of different kinds of trail mix. He's already taught it how to scoop. Now I want some of the mix that has green M&Ms and nuts, and all I have to do is ask.

"Scoop some green ones with the nuts into the bowl," I type. Very slowly the robot's arms jerk into action.

On a video feed, OpenVLA places a star over the correct bin. That means the first part of the model, which has to take my text and interpret its meaning visually, has worked correctly.

It doesn't always, Kim says. "That's the part where we hold our breath."

Then slowly, hesitantly, it reaches out with its claw, picks up the scoop and gets the trail mix.

"It looks like it's working!" says Kim excitedly.

It's a very small scoop. But a scoop in the right direction.

Anything bots

Stanford researcher Chelsea Finn has co-founded a company in San Francisco called Physical Intelligence, which is seeking to take this training approach to the next level.

She envisions a world in which robots can quickly adapt to do simple jobs, like making a sandwich or restocking grocery shelves. Contrary to the current thinking on robotics, she suspects that the best way to get there might be to train a single model to do lots of different tasks.

"We actually think that trying to develop generalist systems will be more successful than trying to develop a system that does one thing very, very well," she says.

Physical Intelligence has developed an AI neural network that can fold laundry, scoop coffee beans and assemble a cardboard box, though the network that lets it do all those things is too computationally demanding to run on the robot itself.

"In that case we actually had a workstation that was in the apartment that was computing the actions and then sending them over the network to the robot," she says.

But the next step — compiling training data for its robot AI program — is a far more difficult task than simply gathering text from the Internet to train a chatbot.

"This is really hard," Finn concedes. "We don't have an open internet of robot data, and so oftentimes it comes down to collecting the data ourselves on robots."

Still, Finn believes it's doable. In addition to human trainers, robots can also try repeatedly to do tasks on their own and quickly build up their knowledge base, she says.

Data dilemma

But Berkeley's Ken Goldberg is more skeptical that the real-world gap can be bridged quickly. AI chatbots have improved massively over the past couple of years because they have had a huge amount of data to learn from. In fact, they've scooped up pretty much the entire Internet to teach themselves how to write sentences and draw pictures.

Ken Goldberg, co-founder of Ambi Robotics and professor at UC Berkeley.
Niall David Cytryn / Ambi Robotics

Just building up an Internet's worth of real-world data for robots is going to go much more slowly. "At this current rate, we're going to take 100,000 years to get that much data," he says.

"I would say that these models are not going to work just the way they are being trained today," agrees Pulkit Agrawal, a robotics researcher at MIT.

Agrawal is an advocate for simulation: putting the AI neural network running the robot into a virtual world, and allowing it to repeat tasks again and again.

"The power of simulation is that we can collect very large amounts of data," he says. "For example, in three hours worth of simulation we can collect 100 days worth of data."

That approach worked well for researchers in Switzerland who recently trained a drone how to race by putting its AI-powered brain into a simulator and running it through a pre-set course over and over again. When it got into the real world it was able to fly the course faster and better than a skilled human opponent, at least part of the time.

But simulation has its drawbacks. The drone worked pretty well for an indoor course, but anything that wasn't simulated, like wind, rain or sunlight, could throw the drone off course.

And flying and walking are relatively simple tasks to simulate. Goldberg says that picking up objects, or performing other manual tasks that humans find completely straightforward, is much harder to replicate in a computer. "Basically there is no simulator that can accurately model manipulation," he says.

Grasping the problem

Some researchers think that even if the data problem can be overcome, deeper issues may bedevil AI robots.

"In my mind, the question is not, do we have enough data… it is more what is the framing of the problem," says Matthew Johnson-Roberson, a researcher at Carnegie Mellon University in Pittsburgh.

Johnson-Roberson says for all the incredible skills displayed by chatbots, the task they're asked to do is relatively simple — look at what a human user types and then try to predict the next words that user wants to see. Robots will have to do so much more than just compose a sentence.

"Next best word prediction works really well and it's a very simple problem because you're just predicting the next word," he says. Moving through space and time to execute a task is a far larger set of variables for a neural network to try and process.

"It's not clear right now that I can take 20 hours of Go-Pro footage and produce anything sensible with respect to how a robot moves around in the world," he says.

Johnson-Roberson says he thinks more fundamental research needs to be done into how neural networks can better process space and time. And he warns that the field needs to be careful because robotics has been burned before — by the race to build self-driving cars.

"So much capital rushed in so quickly," he says. "It incentivized people to make promises on a timeline they couldn't possibly deliver on." Much of the capital then left the field, and there are still fundamental problems for driverless cars that remain unsolved.

Still, even the skeptics believe that robotics will be forever changed by AI. Goldberg has co-founded a package-sorting company called Ambi Robotics that rolled out a new AI-driven system known as PRIME-1 earlier this year. It uses AI to identify the best points for a robotic arm to pick up a package. Once it has the pick point laid out by the AI, the arm, which is controlled by more conventional programming, makes the grab.
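
The division of labor Goldberg describes, with a learned model proposing where to grab and conventional code doing the grabbing, can be sketched in a few lines. Everything below, including the function names and the hard-coded pick point, is a hypothetical illustration rather than Ambi Robotics' actual system.

    from dataclasses import dataclass

    @dataclass
    class PickPoint:
        x: float            # where on the package to grab, in image or table coordinates
        y: float
        confidence: float   # how sure the model is that this grasp will hold

    def predict_pick_point(depth_image) -> PickPoint:
        """Hypothetical stand-in for the learned model that scores candidate grasp points."""
        # A real system would run a trained network over the image; here we hard-code a result.
        return PickPoint(x=0.42, y=0.17, confidence=0.93)

    def move_arm_and_grasp(point: PickPoint) -> bool:
        """Conventional, scripted motion: plan a path to the point and close the gripper."""
        print(f"Moving to ({point.x:.2f}, {point.y:.2f}) and closing gripper")
        return point.confidence > 0.5   # pretend the grab succeeds when the model is confident

    success = move_arm_and_grasp(predict_pick_point(depth_image=None))
    print("picked" if success else "dropped")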

The new system has dramatically reduced the number of times that packages get dropped, he says. But he adds with a laugh, "if you put this thing in front of a pile of clothes, it's not going to know what to do with that."

Back at Stanford, Chelsea Finn says she agrees that expectations need to be kept under control.

"I think there's still a long way for the technology to go," she says. Nor does she expect universal robots will even entirely replace human labor — especially for complex tasks.

But in a world with aging populations and projected labor shortages, she thinks AI-powered robots could bridge some of the gap.

"I'm envisioning that this is really going to be something that's augmenting people and helping people," she says.

Copyright 2025 NPR

Geoff Brumfiel
Geoff Brumfiel works as a senior editor and correspondent on NPR's science desk. His editing duties include science and space, while his reporting focuses on the intersection of science and national security.