This article is from the WeChat official account Tencent Research Institute (ID: cyberlawrc). Author: Wang Huanchao. Original title: "Can Google's latest breakthrough, LaMDA, cure your voice assistant's artificial stupidity?" Header image: Visual China.

Today, few people take "smart voice assistants" seriously; for many, the term has become a synonym for "artificial stupidity."

Since Apple released Siri in 2011, related technologies have iterated one after another and imitators have proliferated. Yet the intelligence of these assistants has not improved nearly as fast as we imagined.

After repeated disappointments, our expectations keep dropping: beyond asking it to set an alarm for 8:00 tomorrow morning or open an app, we hardly hope for anything else.

The first episode of the recent hit second season of "Love, Death & Robots" shows just how serious the consequences of a dim-witted voice assistant can be: after the cleaning robot "goes crazy" and begins attacking indiscriminately, the heroine calls the intelligent customer service, which solves nothing and only adds to the chaos; in the end she escapes only by her own human effort. Must we really put up with such hopeless voice assistants in the future?

Fortunately, there is a turn for the better. On May 18, 2021 (US time), the annual Google I/O conference arrived as scheduled. Among the many products and technologies unveiled, LaMDA looked unremarkable, but it may be the savior of today's dim-witted voice assistants.

1. What exactly is LaMDA?

The full name of LaMDA is Language Model for Dialogue Applications. Simply put, it is a more capable language model designed for dialogue applications.

Like its predecessors BERT and GPT-3, LaMDA is built on the Transformer architecture, a neural network architecture that Google released and open-sourced in 2017. Models built on this architecture can be trained to read a set of words (such as a sentence or a paragraph), attend to how those words relate to one another, and then predict which word is likely to come next. [1] Unlike other models, LaMDA received far more training on dialogue.
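The mechanism described above, weighing how strongly each word in a sequence relates to the others before predicting what comes next, is self-attention. The toy sketch below is plain Python with made-up two-dimensional word vectors, purely to illustrate the idea; it is not Google's implementation, and real models use high-dimensional learned vectors:

```python
import math

def softmax(scores):
    """Turn raw similarity scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    Returns the attention weights over the sequence (how strongly
    the query word is "connected" to each word) and the weighted
    mix of value vectors (the word's context-aware representation).
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    output = [sum(w * v[i] for w, v in zip(weights, values))
              for i in range(len(values[0]))]
    return weights, output

# Toy 2-d vectors for the words in "the cat sat" (entirely made up).
vectors = {"the": [0.1, 0.0], "cat": [0.9, 0.2], "sat": [0.3, 0.8]}
seq = ["the", "cat", "sat"]
keys = values = [vectors[w] for w in seq]

# How much does "sat" attend to each word in the sequence?
weights, rep = attend(vectors["sat"], keys, values)
```

In a full Transformer this computation runs for every word at once, across many layers and attention heads; a final layer turns the resulting representations into a probability distribution over the next word.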

Before going further, we should think carefully: why are today's voice assistants so dim-witted?

The root cause is a shortfall in technical capability. It shows up first as answers that miss the point, failing to give us what we actually asked for. That problem is relatively easy to fix: with more training, the answers gradually improve. The harder problem is that voice assistants understand each question in isolation and answer it in isolation. In other words, you cannot expect them to follow context and hold a long, continuous conversation with us.

Bear in mind that real-world conversation is completely open-ended, often starting on one topic, drifting to another, and ending somewhere entirely unrelated. Running into a friend, for example, we might start with "Have you eaten?", move on to a game that launched a few days ago, and end up planning a movie for the weekend.

This open-endedness makes real-world dialogue one of the hardest problems in machine learning. It involves a crucial capability, natural language understanding (NLU), which requires the AI to judge semantics, context, and emotion; this is more demanding than surface-level natural language processing (NLP).

At present, most intelligent assistants are designed around narrow, predefined dialogue paths and cannot hold open, continuous conversations. This is why they still seem so dim-witted.

LaMDA's breakthrough targets exactly this problem. It builds on Google research from 2020 [2] showing that a Transformer-based language model, once trained on dialogue, can converse on almost any topic.

During training, LaMDA picked up the subtle differences that distinguish open-ended conversation from other forms of language. At its core is the ability to hold "open domain" dialogue, and the key support for that ability is a grasp of conversational context better than existing dialogue models: by reading sentences or paragraphs, LaMDA can "decipher" the intent behind a conversation, discover the connections between words, and predict the words likely to come next, producing answers that fit the context.
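One practical consequence of this context-conditioning is easy to illustrate: rather than feeding the model only the latest utterance, a dialogue system feeds it the whole (or recent) conversation history as a single input. The helper below is a generic, hypothetical sketch of that idea; Google has not published LaMDA's actual input format, so the turn layout here is an assumption:

```python
def build_model_input(history, max_turns=10):
    """Flatten a multi-turn dialogue into a single model input.

    `history` is a list of (speaker, utterance) pairs. Keeping the
    recent turns in the input is what lets a model resolve words
    like "it" or "there" that refer back to earlier turns.
    """
    recent = history[-max_turns:]  # truncate very long conversations
    lines = [f"{speaker}: {text}" for speaker, text in recent]
    lines.append("Assistant:")  # the model continues from here
    return "\n".join(lines)

history = [
    ("User", "Tell me about Pluto."),
    ("Assistant", "Pluto is a dwarf planet in the Kuiper belt."),
    ("User", "Is it cold there?"),  # "there" only makes sense with context
]
prompt = build_model_input(history)
```

A model given only the last line, "Is it cold there?", has no way to know what "there" means; given the flattened history, the reference is resolvable.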

With this foundation, LaMDA can follow endlessly shifting topics and sustain long, open conversations. In Google's official words, it can "engage in a free-flowing way about a seemingly endless number of topics."

2. From “Pluto” to “Paper Airplane”

At this year's Google I/O, LaMDA put its conversational ability on full display. [3] In the demo session, LaMDA played the role of Pluto in a dialogue with users.

In the example scenario, LaMDA gave precise answers to the user's questions and could also steer one topic toward another, continuously advancing the conversation. These transitions were not abrupt; they felt natural and reasonable.

When asked what it wished people knew about it,

it replied: "I want people to know that I am more than just a random ice ball. I am actually a beautiful planet."

LaMDA also answered accurately when asked whether Pluto had ever had any visitors, and even thoughtfully reminded the user to bring a coat if they plan to visit, because Pluto is very cold.

Chatting like this feels like talking with a knowledgeable friend: however quickly the topics shift and surface, LaMDA always catches the thread and carries the conversation on naturally.

In another demonstration, LaMDA again showed superb conversational skill.

This time, LaMDA played the role of a paper airplane. Asked what the worst place it had ever landed was, it replied: "Probably a small puddle."

When a user asked, "What is the secret to a really good paper airplane?"

it asked back: "What do you mean by 'good'?", showing plenty of flexibility and quick wit.

When the user replied that they cared about distance, LaMDA narrowed in on the topic of how to maximize a paper airplane's flight distance and shared the relevant knowledge.

Bear in mind that these replies from LaMDA are not preset but generated on the fly. This means LaMDA needs no special training to hold a different conversation, nor does it repeat the same answers. That capability is genuinely impressive.
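The contrast between preset and generated replies can be made concrete. A scripted bot maps inputs to canned strings; a language model instead samples each next word from a probability distribution, so it can produce replies nobody wrote down. The sketch below uses a tiny hand-written bigram distribution purely for illustration; real models learn these probabilities from huge amounts of text:

```python
import random

# A scripted bot: fixed input -> fixed output, brittle outside its script.
CANNED = {"hello": "Hi! How can I help you?"}

def preset_reply(utterance):
    return CANNED.get(utterance.lower().strip(), "Sorry, I don't understand.")

# A toy generative model: made-up bigram probabilities, sampled word by word.
BIGRAMS = {
    "<s>": {"pluto": 0.6, "i": 0.4},
    "pluto": {"is": 1.0},
    "i": {"am": 1.0},
    "is": {"cold": 0.7, "beautiful": 0.3},
    "am": {"beautiful": 1.0},
    "cold": {"</s>": 1.0},
    "beautiful": {"</s>": 1.0},
}

def sampled_reply(rng, max_len=10):
    """Generate a reply by sampling from the next-word distribution."""
    word, out = "<s>", []
    for _ in range(max_len):
        dist = BIGRAMS[word]
        word = rng.choices(list(dist), weights=list(dist.values()))[0]
        if word == "</s>":  # end-of-sentence token
            break
        out.append(word)
    return " ".join(out)

reply = sampled_reply(random.Random(0))
```

The scripted bot fails on anything outside its table, while the sampler composes a (tiny) space of possible replies on the fly; scale the distribution up from a handful of hand-written bigrams to a learned Transformer and you get the generation behavior described above.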

Even the few exchanges in these two examples make it clear that LaMDA gives meaningful answers, and that this comes from its grasp of conversational context. With that ability, LaMDA comes across as remarkably sensible and sharp.

Google also stated that sensibleness and specificity are not the only qualities LaMDA pursues; it is exploring dimensions such as insight and humor as well. At the same time, Google cares greatly about factuality, that is, whether LaMDA's answers accord with the facts. [4] After all, for a voice assistant, being fun matters, but being right matters more.


3. LaMDA still has a long road ahead

Whether through more advanced AI or smarter chatbots, Google has spent the past several years working to help AI communicate better with humans.

Pichai noted in his keynote that the richness and flexibility of language make it one of humanity's greatest tools, and at the same time one of computer science's greatest challenges. Although LaMDA can already offer suggestions and answers grounded in conversational context, keeping the dialogue coherent, it is still at an early stage of development, and it will take time before it can function as a full AI assistant.

So what is the point of improving an AI assistant's conversational ability? For Google, at least, the payoff is substantial: many of Google's core products revolve around information retrieval, and all of them rest on the computational interpretation of language, whether translation or understanding what users are searching for. If Google can make AI understand language better, it can improve core products such as Google Search, Assistant, and Workspace. "It can even turn search into a conversation, which is more natural and fluid," Pichai said.

Of course, this is not a breakthrough for Google alone. Progress in conversational ability will undoubtedly open new possibilities for every field that involves human-machine dialogue.

But the richness and flexibility of language, and the complexity that comes with them, make this a formidable challenge. Facing such difficult terrain, LaMDA's capabilities are far from mature; in practice it can still make mistakes and give absurd responses.

For example, while playing Pluto, it said that it could "jump really high," that it often practiced flips, and that it loved playing catch with its favorite ball, the Moon. Such answers plainly defy common sense.

In addition, as a language model, LaMDA inevitably faces some of AI's long-standing problems, such as misuse and the propagation of bias. Algorithmic bias is an extremely complex problem: it can originate in the design of the model architecture or in the training data, and at root it is social bias extended to the algorithmic level.

As Google puts it: "Language might be one of humanity's greatest tools, but like all tools it can be misused. Models trained on language can propagate that misuse, for instance by internalizing biases, mirroring hateful speech, or replicating misleading information. And even when the language it's trained on is carefully vetted, the model itself can still be put to ill use." [5]

LaMDA will also face real-world risks that are hard to anticipate. It could, for example, be used by criminals for online fraud; such news is already commonplace, and the more lifelike the dialogue, the more convincing the scam.

Even at the technical level, LaMDA has room to grow. Today it is built mainly around text dialogue; in the future it may extend to other media, including images, audio, and video. A hint of that direction is MUM (Multitask Unified Model), also announced at this conference. Together, these two technologies may revolutionize how humans and computers interact.

How much LaMDA will actually deliver remains to be seen. After all, Google has stumbled before: in 2018 it unveiled Duplex, an AI restaurant-reservation service, and it later emerged that human operators were helping complete many of the calls [6]. Still, a more natural, open dialogue between humans and AI is, I believe, not far away.

References:

1. https://www.blog.google/technology/ai/lamda

2. https://ai.googleblog.com/2020/01/towards-conversational-agent-that-can.html

3. https://www.youtube.com/watch?v=aUSSfo5nCdM

4. https://www.blog.google/technology/ai/lamda

5. https://www.blog.google/technology/ai/lamda

6. https://wallstreetcn.com/articles/3567850
