This article is from the WeChat public account QbitAI (量子位), author: Guo Yipu; title image from Visual China.

From Shanghai to Beijing, every city is doing garbage sorting.

Dry garbage, wet garbage, harmful garbage; what pigs can eat, what pigs can't eat, what kills pigs if they eat it... Are you still agonizing over how to classify all this garbage?

Wouldn't it be great if garbage could be sorted automatically?

Alphabet X, the Google X lab that previously incubated the self-driving car project, has finally built a garbage-sorting robot.

These robots can sort garbage, move misplaced garbage to the right place, and roam the office picking up trash.

For example, if you put a mineral water bottle in the wrong bin, the robot can pick it up and put it in the right place:

Put a can in the wrong place, and the robot will return it to the other cans:

After testing in Alphabet's offices, the robots were found to significantly reduce contamination: the share of recyclable garbage wrongly sent to landfill fell from 20% to less than 5%.

How they did it

Teaching a robot to sort waste involves perception, locomotion, and manipulation. Computer vision for perception and for driving the robot around is already familiar territory for the industry. Manipulation, learning to use its "hands" to open cabinets, pull out drawers, and remove garbage in a complex environment, is what the machine still has to learn.

Alphabet X therefore uses three methods to teach robots to use their "hands" flexibly.

Learning from humans

The first is learning from humans: imitating human actions and practices.

Here the team used the Play-LMP algorithm, which lets the robot learn from humans without training on a task-specific dataset, ultimately reaching an average success rate of 85.5%.

The results look roughly like this, with the task requirement on the left and the execution process on the right:

Learning from other robots

Learning from other robots is done through model-free reinforcement learning, which lets many robots share their experience.

The specific implementation goes like this:

Take teaching the robots to open a door by its handle as an example: each robot in the group has its own neural network, and every one of them is connected to a central server.

Each robot starts by looking at the door and the handle and figuring out how to open it.

Throughout this process, each robot's actions and the result of every step are transmitted back to the central server, whose neural network uses this pooled experience to iteratively improve itself.

The whole process is like a commander sending several soldiers out to scout: the clues each soldier sends back are combined into an overall operational strategy, which is then relayed back to tell the soldiers how to act.

After this improvement loop, the robots learned the skill of opening doors.


Learning in the cloud

For robots to pick up garbage, they must learn to grasp things with their own "hands", which requires constant practice and a large amount of data to train the model.

In the real world, a robot can only practice about 5,000 grasps a day, which is nowhere near enough data.

With Randomized-to-Canonical Adaptation Networks (RCANs), simulated training data generated in the cloud can be used to train the model for the real world, raising the robot's success rate at grasping objects to 70%.

Afterwards, the model was fine-tuned with 5,000 grasps collected in the real world, and the success rate reached 91%.

This process is equivalent to collecting 580,000 grasp results in the real world, saving about 99% of the practice.

Training a robot to grasp used to take 3 months; this way, it takes less than a day.

In addition, this work was published as a paper at this year's CVPR.

Robots in structured and unstructured environments

In the robotics field today, there are plenty of mature robots, but they are all single-skilled and expensive.

They handle one task with great efficiency in a single, structured environment such as a factory assembly line, but they cannot cope with chores like laundry and cooking: troublesome tasks set in varied, complex, unstructured environments that change every day.

Alphabet X's ultimate goal is to create robots for everyday life: doing the laundry and folding the quilts at home, fetching the take-out tea in the office... Because they would be used every day, the project is called Everyday Robots.

The difficulty, however, is easy to imagine.

This chart from the National Highway Traffic Safety Administration illustrates the point.

On the horizontal axis, specialized tasks sit on the left and complex everyday tasks on the right; on the vertical axis, structured environments sit below and unstructured environments above.

Obviously, the robots in the upper right corner, which must adapt to all kinds of complex environments and master all kinds of skills, are far harder for humans to build than the industrial robots in the lower left corner, which only perform fixed tasks at fixed positions on an assembly line.

The upper right quadrant also holds self-driving cars, which are still under development, and Everyday Robots is an order of magnitude harder than autonomous driving.

Links

Finally, here are the papers and technical blog posts for this garbage-sorting robot's three learning methods:

Learning from humans: Learning Latent Plans from Play. Authors: Corey Lynch, Mohi Khansari, Ted Xiao, Vikash Kumar, Jonathan Tompson, Sergey Levine, Pierre Sermanet. https://learning-from-play.github.io/

Learning from other robots: https://ai.googleblog.com/2016/10/how-robots-can-acquire-new-skills-from.html

Learning in the cloud: Sim-to-Real via Sim-to-Sim: Data-efficient Robotic Grasping via Randomized-to-Canonical Adaptation Networks. Authors: Stephen James, Paul Wohlhart, Mrinal Kalakrishnan, Dmitry Kalashnikov, Alex Irpan, Julian Ibarz, Sergey Levine, Raia Hadsell, Konstantinos Bousmalis. https://arxiv.org/abs/1812.07252

If you are doing similar research, remember to copy the links and take a look~

