Robots perform better at a range of tasks when they draw on a growing body of experience. That's the assertion of a team of researchers hailing from DeepMind, who in a preprint paper propose a technique called reward sketching. They claim it's an effective way of eliciting human preferences to learn a reward function (a function describing how an AI agent should behave) that can be used to retrospectively annotate all historical data, collected for different tasks, with predicted rewards for the new task. This annotated data set can then be used to learn manipulation policies (probability distributions over actions given certain states), the team says, with reinforcement learning from visual input and without interaction with a real robot.
The work builds on a DeepMind study published in January 2020, which described a technique, continuous-discrete hybrid learning, that optimizes for discrete and continuous actions simultaneously, treating hybrid problems in their native form. As something of a precursor to that paper, in October 2019, the Alphabet subsidiary demonstrated a novel way of transferring skills from simulation to a physical robot.
"[Our] approach makes it possible to scale up RL in robotics, as we no longer need to run the robot for each step of learning. We show that the trained batch [reinforcement learning] agents, when deployed in real robots, can perform a variety of challenging tasks involving multiple interactions among rigid or deformable objects," wrote the coauthors of this latest paper. "Moreover, they display a significant degree of robustness and generalization. In some cases, they even outperform human teleoperators."
As the team explains, at the heart of reward sketching are three key ideas: efficient elicitation of user preferences to learn reward functions, automatic annotation of all historical data with learned reward functions, and harnessing the data sets to learn policies from stored data via reinforcement learning.
For instance, a human teleoperates a robot with a six-degree-of-freedom mouse and a gripper button or a handheld virtual reality controller to provide first-person demonstrations of a target task. To specify a new target task, the operator controls the robot to provide several successful (and optionally unsuccessful) examples of completing the task, and these demonstrations help to bootstrap the reward learning by providing examples of successful behavior with high rewards.
In the researchers' proposed approach, all robot experience (including demonstrations, teleoperated trajectories, human play data, and experience from the execution of either scripted or learned policies) is accumulated into what's called NeverEnding Storage (NES). A metadata system implemented as a relational database ensures it's appropriately annotated and can be queried; it attaches environment and policy metadata to every trajectory, as well as arbitrary human-readable labels and reward sketches.
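To make the metadata idea concrete, here is a minimal sketch of how trajectories might be tagged and queried in a relational database. It uses Python's built-in sqlite3 module; the table names, column names, and helper functions are hypothetical and are not taken from the paper.

```python
import sqlite3


def build_metadata_store():
    """Create an in-memory database with a hypothetical NES-style schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE trajectory (
            id INTEGER PRIMARY KEY,
            environment TEXT NOT NULL,
            policy TEXT NOT NULL
        );
        CREATE TABLE label (
            trajectory_id INTEGER NOT NULL REFERENCES trajectory(id),
            name TEXT NOT NULL
        );
        """
    )
    return conn


def add_trajectory(conn, environment, policy, labels=()):
    """Store one trajectory's metadata plus any human-readable labels."""
    cur = conn.execute(
        "INSERT INTO trajectory (environment, policy) VALUES (?, ?)",
        (environment, policy),
    )
    trajectory_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO label (trajectory_id, name) VALUES (?, ?)",
        [(trajectory_id, name) for name in labels],
    )
    return trajectory_id


def find_by_label(conn, name):
    """Return the ids of all trajectories carrying the given label."""
    rows = conn.execute(
        "SELECT t.id FROM trajectory t "
        "JOIN label l ON l.trajectory_id = t.id "
        "WHERE l.name = ? ORDER BY t.id",
        (name,),
    )
    return [row[0] for row in rows]
```

Keeping labels in their own table lets a trajectory carry any number of them, which is one plausible reading of "arbitrary human-readable labels."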
In the reward-sketching phase, humans annotate a subset of episodes from NES (including task-specific demos) with reward values, using a technique that allows a single person to produce hundreds of annotations per minute. These annotations feed into a reward model that's then used to predict reward values for all experience in NES, so that all historical data can be leveraged in training a policy for a new task without requiring manual annotation of the whole repository.
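The annotate-a-subset, predict-for-everything pattern can be illustrated with a deliberately simple stand-in. The sketch below fits a one-feature linear model to sketched rewards and then applies it to unannotated episodes. This is an assumption made for illustration only: the article does not describe the model, and the actual system learns from visual input. The per-frame scalar feature (say, a gripper-to-object distance) and both function names are hypothetical.

```python
def fit_linear_reward(features, sketched_rewards):
    """Least-squares fit of reward = slope * feature + intercept.

    `features` and `sketched_rewards` are equal-length lists covering
    only the frames a human has annotated.
    """
    n = len(features)
    mean_x = sum(features) / n
    mean_y = sum(sketched_rewards) / n
    var_x = sum((x - mean_x) ** 2 for x in features)
    cov_xy = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(features, sketched_rewards)
    )
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def annotate(episodes, model):
    """Predict a reward for every frame of every stored episode.

    `episodes` maps an episode id to its list of per-frame features.
    """
    slope, intercept = model
    return {
        episode_id: [slope * x + intercept for x in frames]
        for episode_id, frames in episodes.items()
    }
```

Once fitted on the small annotated subset, the same model labels every stored episode, which is what lets old data be reused for a new task.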
An agent is trained with 75% of the batch drawn from the entirety of NES and 25% from the data specific to the target task. Then, it's deployed to a robot, which enables the collection of more experience to be used for reward sketching or reinforcement learning.
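The 75/25 split could be implemented along the following lines. This is a rough sketch, assuming episodes are sampled uniformly with replacement from each pool; the function name and parameters are hypothetical, and the paper's actual sampling scheme may differ.

```python
import random


def sample_mixed_batch(nes_episodes, task_episodes, batch_size,
                       task_fraction=0.25, rng=None):
    """Draw a training batch mixing general and task-specific experience.

    A `task_fraction` share of the batch comes from the task-specific
    pool and the remainder from the full store, both sampled with
    replacement.
    """
    rng = rng or random.Random()
    n_task = round(batch_size * task_fraction)
    n_nes = batch_size - n_task
    batch = rng.choices(nes_episodes, k=n_nes)
    batch += rng.choices(task_episodes, k=n_task)
    rng.shuffle(batch)  # avoid a fixed ordering of the two sources
    return batch
```

With a batch size of 8, this yields six episodes from the full store and two from the task-specific pool.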
In experiments, the DeepMind team used a Sawyer robot with a gripper and a wrist force-torque sensor. Observations were provided by three cameras around a cage, as well as two wide-angle cameras and one depth camera mounted at the wrist and proprioceptive sensors in the arm. In total, the team collected over 400 hours of multiple-camera videos, proprioception (i.e., perception or awareness of position and movement), and actions from behavior generated by human teleoperators, as well as random, scripted, and learned policies.
The researchers trained multiple reinforcement learning agents in parallel for 400,000 steps and evaluated the most promising on the real-world robot. Tasked with lifting and stacking rectangular objects, the Sawyer successfully lifted 80% of the time and stacked 60% of the time, and 80% and 40% of the time, respectively, when those objects were positioned in adversarial ways. Perhaps more impressively, in a separate task involving the precise insertion of a USB key into a computer port, the agent, when provided reward sketches from over 100 demonstrators, reached a success rate of over 80% within 8 hours.
"The multi-component system allows a robot to solve a variety of challenging tasks that require skillful manipulation, involve multi-object interaction, and consist of many time steps," wrote the researchers. "There is no need to worry about wear and tear, limits of real time processing, and many of the other challenges associated with operating real robots. Moreover, researchers are empowered to train policies using their batch [reinforcement learning] algorithm of choice."
They leave to future work identifying ways to minimize human-in-the-loop training and to reduce the agent's sensitivity to significant perturbations in the setup.