Semantic understanding has always been considered “the jewel in the crown of artificial intelligence.”

Recently, it was announced that iDeepwise Artificial Intelligence Robotics Technology (Beijing) Co., Ltd. (hereinafter "Deep Thinking") received a strategic investment of 10 million RMB from Hubble Technology Investment Co., Ltd. (hereinafter "Hubble Investment"), a wholly-owned subsidiary of Huawei.

According to Deep Thinking's official website, its flagship technology is a "multimodal deep semantic understanding engine (iDeepwise.ai) and human-computer interaction technology." The site states that the engine can understand the deep semantics behind multimodal unstructured data such as text and visual images. Its standout strengths are machine reading comprehension for long text, free cross-domain multi-turn human-machine dialogue, and semantic understanding of multimodal information.

My curiosity centers on two questions. First, what exactly is Deep Thinking's core technology, multimodal deep semantic understanding and human-computer interaction, and in what scenarios does it land? Second, why would Huawei, or rather Huawei's subsidiary, invest in Deep Thinking? What business considerations and ecosystem strategy lie behind the move?

The article below may offer a glimpse of the answers.

Multimodality and Multimodal Semantic Understanding

In the "2019 Machine Reading Comprehension Competition," which concluded in August, Deep Thinking ranked first on both core technical indicators, standing out from more than 2,000 teams worldwide to win the championship. At the time, I interviewed Dr. Yang Zhiming, Deep Thinking's CEO and AI algorithm scientist.


"When humans speak, they are often colloquial, discontinuous, fragmented, and even inverted in word order. Speech recognition stops at voice commands; it cannot understand the user's language or the logic behind it, and so cannot address users' real needs in many scenarios," Yang Zhiming said. For example, when watching a movie, humans not only look at the images and listen to the sound, but also read the subtitles, and even make associations based on the film's theme.

Each source or form of information can be called a modality: human senses such as vision, touch, hearing, smell, and taste; information media such as voice, pictures, video, and text; and sensors such as infrared, radar, and electromagnetic.

Multimodal artificial intelligence aims to help AI think and learn in a more human-like way by drawing on different dimensions and sources of information.
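To make the idea concrete, here is a minimal late-fusion sketch: features from two modalities are combined into one representation before a decision is made. This is a hypothetical toy, not iDeepwise.ai's actual architecture; the weights, features, and threshold are all invented for illustration.

```python
# Toy late-fusion example: weight and concatenate feature vectors from
# two modalities (e.g. text and image), then run a trivial classifier.
# This sketches the concept only; real systems learn these components.

def fuse(text_features, image_features, w_text=0.5, w_image=0.5):
    """Weighted concatenation of two modality feature vectors."""
    return ([w_text * x for x in text_features] +
            [w_image * x for x in image_features])

def classify(fused, threshold=1.0):
    """Toy classifier: fires when combined evidence passes a threshold."""
    return "positive" if sum(fused) > threshold else "negative"

text_vec = [0.9, 0.2]    # e.g. sentiment cues from subtitles
image_vec = [0.7, 0.4]   # e.g. visual cues from video frames
label = classify(fuse(text_vec, image_vec))
```

The point of the sketch is that each modality alone may be ambiguous, while the fused evidence supports a decision, which is why multimodal systems can behave more like human viewers who combine picture, sound, and subtitles.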

With the maturation of algorithms, computing power, cloud, and chips, artificial intelligence, especially strong AI, has developed rapidly in the past few years. According to WIPO's 2019 Artificial Intelligence Trends report, 50% of AI patents were published in the past five years, meaning the AI industry entered a period of rapid development between 2014 and 2018.

Of course, multimodal AI is more complex than single-modality AI in both algorithms and computing power; the complexity may even grow exponentially. Ultimately, though, the effect it presents will also be closer to that of human thinking.

Take the smart home scene as an example. The effect of speech recognition technology is to hear a voice command and execute it. Once a more complicated spoken description comes in, the system falls back to "I didn't understand what you said" or "Did you mean this?" and requires further confirmation and refinement of the instruction.
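This fallback behavior can be sketched in a few lines (a hypothetical toy, not any vendor's actual assistant; the command table and prompts are invented):

```python
# Toy voice-command handler: exact known commands execute; anything more
# complicated falls back to a clarification prompt, because there is no
# semantic understanding behind the matcher.

KNOWN_COMMANDS = {
    "turn on the light": "light_on",
    "turn off the light": "light_off",
}

def handle(utterance):
    action = KNOWN_COMMANDS.get(utterance.strip().lower())
    if action is None:
        # Without semantic understanding, the assistant can only ask back.
        return "Did you mean this? Please repeat the command."
    return f"executing {action}"
```

A command-table assistant like this executes "turn on the light" but is stumped by "it's too dark in here," which is exactly the gap semantic understanding is meant to close.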

What really helps machines "understand" is semantic understanding, and machine reading comprehension has long been considered the landmark pinnacle of semantic understanding and natural language processing (NLP).

According to statistics from the Tencent Research Institute at the end of 2018, among China's leading artificial intelligence companies, the three best-funded areas were computer vision and imaging, natural language processing, and autonomous/assisted driving. Natural language processing ranked second, with 12.2 billion yuan in financing, or 19% of the total.

Deep Thinking's "multimodal deep semantic understanding engine (iDeepwise.ai) and human-computer interaction technology," in simple terms, moves from simple machine perception to deep semantic understanding. This makes human-machine interaction smarter and is the key to machines truly "understanding."

This may be one of the reasons Deep Thinking attracted investment from a Huawei subsidiary.

But obviously, this is not the only reason.

Huawei and Huawei's Ecosystem

Today, it is difficult to summarize Huawei in one sentence. Its industry chain spans communications equipment, semiconductors, consumer electronics, cloud computing, security, and more, and its revenue has grown from $18.3 billion in 2008 to $105.2 billion in 2018.

Not only have its phone sales surpassed Apple's, but Huawei's performance in 5G, chips, and smart hardware is also eye-catching, especially in 5G: the latest flagship Mate 30 series, for example, carries the Kirin 990, the first officially commercial 5G SoC.

Another example is Hongmeng OS, a distributed operating system for all scenarios. According to an earlier launch event, Hongmeng OS will be used first on phones and tablets, and will then be applied to smart watches, smart screens, in-car devices, smart speakers, and other smart terminals.

But an OS, chips, and 5G alone cannot fully realize a world where everything is connected. These technologies are the foundation of Huawei's future AIoT strategy; on top of them, more applied technologies are needed to strengthen that foundation, land more scenarios, reach more users, and deliver a frictionless experience.

Multimodal semantic understanding and brain-inspired artificial intelligence techniques can play a key role here.

At present, Deep Thinking's products, built on AI multimodal deep semantic understanding and human-machine dialogue, are applied mainly in smart connected-car digital cockpits, automotive smart marketing, smart mobile terminals, smart homes, and smart medical and health scenarios.

Take the mobile terminal scene as an example. On smartphones, Deep Thinking's multimodal deep semantic understanding and human-machine dialogue engine (iDeepWise.ai) provides intelligent dialogue interaction and the iDeepWise.ai.mobile AI SaaS service across travel, health consultation, smart office, and entertainment scenarios. In travel in particular, it offers one-stop AI travel services to 200 million smart terminal users, including automatically booking air and train tickets and completing hotel reservations through human-machine dialogue.

Through Hubble, Huawei has invested in Shandong Tianyue Advanced Materials Technology Co., Ltd. (third-generation semiconductor materials), integrated circuit design firm Jiewate Microelectronics (Hangzhou) Co., Ltd., and the Deep Thinking discussed in this article. It is not hard to see that these three companies, all invested in within a few months, give Huawei the raw materials, the chip design and production, and the applied artificial intelligence technology that AI requires. It can be said that an ideal strategic layout has been achieved through investment.

I believe these three companies will have more exchanges and cooperation within Huawei's broader ecosystem in the future.

Huawei's investment in Deep Thinking also seems to indicate that artificial intelligence has entered the stage of full commercialization: no longer confined to testing and training in the laboratory, but constantly moving into real scenarios, and thus moving closer to success.

Artificial Intelligence Out of the Lab

For giant companies, especially giants in the ICT field, the huge volumes of data they own are a rich gold mine; but if the value of that data cannot be tapped, its existence is meaningless. The giants have users, products, and scenes, but lack the "alchemy" of artificial intelligence to refine wealth from the gold mine and ultimately beat companies of the same type.

For artificial intelligence companies, finding a good ecosystem partner, or an investor with real business needs, can quickly bring the technology to the ground, and most