The artificial intelligence that amazes us is inseparable from the tempering of AI trainers’ “Samadhi True Fire.”

Editor’s note: This article is from the WeChat public account “The power of machines” (ID: almosthuman2017).


AI trainers have become an official occupation!

According to the website of the Ministry of Human Resources and Social Security, on March 2 the ministry, the State Administration for Market Regulation, and the National Bureau of Statistics jointly announced 16 new occupations, including intelligent manufacturing engineering technician and artificial intelligence trainer. This is the second batch of new occupations released since the 2015 edition of the Occupational Classification Dictionary of the People’s Republic of China was promulgated.

In China, more than 2 million hours of voice data and hundreds of millions of images need to be labeled every year … Collecting, sorting, cleaning, and labeling this basic data is a prerequisite for training artificial intelligence models.

Huge volumes of data must be cleaned and labeled before their value can be unlocked. Data labeling practitioners have emerged in response, and they have become a new profession in the era of artificial intelligence.

The artificial intelligence that amazes us is inseparable from the tempering of their “Samadhi True Fire.” Every skill an AI learns has been trained and reviewed by them.

As artificial intelligence continues to evolve, the profession of artificial intelligence trainer is becoming steadily more refined, specialized, and large in scale. You can see them all over the technical battlefield of the epidemic.

Writing | Yingjun, Liqin

Edit | April

In an epidemic battle that is a matter of life and death, artificial intelligence has charged to the front and stood out in scenarios such as epidemic prevention and control, online consultation, and AI temperature measurement, appearing on the front line as an “information officer”, a “little nurse”, a “protection officer”, and more.

These different incarnations of artificial intelligence all point to the same special professional group: artificial intelligence trainers. The artificial intelligence that amazes us is inseparable from the tempering of their “Samadhi True Fire.” Every skill an AI learns has been trained and reviewed by them.


In China, more than 2 million hours of voice data and hundreds of millions of images need to be labeled each year … Collecting, sorting, cleaning, and labeling this basic data is a prerequisite for training artificial intelligence models. According to one report, China’s artificial intelligence basic data service market was worth 2.586 billion yuan in 2018, of which 86% came from customized data resource services. The market is expected to exceed 11.3 billion yuan in 2025.
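As a rough back-of-the-envelope check on the figures above, the short Python sketch below derives the 2018 revenue from customized services and the annual growth rate implied by the 2025 forecast. The constant-growth assumption and all function names are illustrative and do not come from the report; because the forecast is stated as "will exceed", the computed rate is only a lower bound.

```python
# Figures quoted in the article (billion yuan).
MARKET_2018 = 2.586
CUSTOM_SHARE = 0.86      # share taken by customized data resource services
MARKET_2025 = 11.3       # forecast lower bound ("will exceed")


def customization_revenue(market: float, share: float) -> float:
    """Revenue attributable to customized services."""
    return market * share


def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Compound annual growth rate, ASSUMING constant year-on-year growth."""
    return (end / start) ** (1 / years) - 1


custom_2018 = customization_revenue(MARKET_2018, CUSTOM_SHARE)
growth = implied_annual_growth(MARKET_2018, MARKET_2025, 2025 - 2018)

if __name__ == "__main__":
    print(f"Customization revenue, 2018: {custom_2018:.2f} billion yuan")
    print(f"Implied annual growth, 2018 to 2025: at least {growth:.0%}")
```

Under these assumptions, customized services account for roughly 2.2 billion yuan of the 2018 market, and the forecast implies growth of a little over 23% a year.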

The supply side of this market consists mainly of artificial intelligence basic data service providers and of the labeling teams that algorithm research and development units build themselves or outsource directly. Their practitioners have become a new profession in the era of artificial intelligence.

01, A special group of trainers

In the Alibaba ecosystem alone, there are more than 200,000 artificial intelligence trainers.

On January 27, the third day of the Lunar New Year, the COVID-19 epidemic entered its outbreak phase, and the visits people had planned for the Spring Festival were cancelled. It was also the day the Alibaba DAMO Academy epidemic robot went online.

The robot’s main functions are to survey the epidemic situation by phone call and to provide citizens with epidemic consultation services on Internet platforms. Deployment proceeds in two steps. The first step is to design and train a universal robot; the second is to do supplementary training according to the needs of each locality so that the robot can be deployed there.

As an artificial intelligence trainer at Alibaba, Huihui could not stop for a moment. Five days earlier, Huihui had been assigned a supplementary training task, mainly responsible for bringing the robot online in the Guangxi Zhuang Autonomous Region.

In Guangxi dialects, some sounds are often not distinguished, so “yes” (shì) and “4” (sì) can sound the same. For the robot to grasp what locals mean, it needs targeted training, specifically dialect speech recognition training and user semantic understanding training.

“Feeding the robot training data and strengthening the training of semantic understanding models, so that the robot can better understand humans, is the most important part of our work. On top of the universal robot model, we can develop dialogue scripts and train models according to local needs.”
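The two-step idea described above, a universal model plus region-specific supplements, can be illustrated with a deliberately tiny Python sketch. It is not how the DAMO Academy robot works: real systems retrain speech and semantic models, whereas this toy only layers a regional lookup table over a shared one. The lexicon contents, the region key, and the function names are all hypothetical.

```python
from typing import Dict

# Step 1: a "universal" lexicon shared by every region (hypothetical entries).
BASE_LEXICON: Dict[str, str] = {"yes": "affirm", "no": "deny"}

# Step 2: supplementary entries per region, layered on top of the base.
# The "4" entry mirrors the article's Guangxi example, where "yes" and "4"
# can sound alike, so a transcribed "4" is treated as an affirmative answer.
REGIONAL_SUPPLEMENTS: Dict[str, Dict[str, str]] = {
    "guangxi": {"4": "affirm"},
}


def build_lexicon(region: str) -> Dict[str, str]:
    """Copy the base lexicon and apply the region's supplements, if any."""
    lexicon = dict(BASE_LEXICON)
    lexicon.update(REGIONAL_SUPPLEMENTS.get(region, {}))
    return lexicon


def interpret(transcript: str, region: str) -> str:
    """Map a transcribed yes/no answer to an intent, or 'unknown'."""
    return build_lexicon(region).get(transcript.strip().lower(), "unknown")


if __name__ == "__main__":
    print(interpret("4", "guangxi"))   # regional supplement applies
    print(interpret("4", "elsewhere")) # base lexicon only
```

Keeping the regional entries separate from the base lexicon means a new locality can be supported by adding a supplement rather than changing the shared model, which is the point of the second training step.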

For robots to understand humans, trainers must carefully design the dialogue flow and the response scripts so that communication between robot and user stays smooth. According to Huihui, because Guigang and Beihai in Guangxi are tourist areas, the scripts there had to be adjusted on the basis of user research.

In recent years, artificial intelligence trainers have gradually attracted attention.

According to an announcement from China’s Technical Training Center for Professional Training, the precise definition of an artificial intelligence trainer is: “Personnel who use intelligent training software to carry out database management, algorithm parameter setting, human-computer interaction design, performance test tracking, and other auxiliary work during the actual use of artificial intelligence products.”

In the “Notice on the Promulgation of Public Information of New Occupation Information”, the work content of this special group of trainers is described as:

  • 1. Annotate and process raw business data such as images, text, and speech;

  • 2. Analyze and refine the characteristics of professional domains, and train and evaluate the algorithms, functions, and performance of artificial intelligence products;

  • 3. Design interactive processes and application solutions for artificial intelligence products;

  • 4. Monitor, analyze, and manage the application data of artificial intelligence products;

  • 5. Adjust and optimize the parameters and configuration of artificial intelligence products.

Their work resembles that of software operations and maintenance engineers. They take part in every step, from initial data annotation to product parameter optimization. They are a key link in taking algorithms and technologies from theory to application, and an indispensable link in the industrialization of AI technology.

02, From “small-town youth” to “top specialists”

In fact, as early as 2015, Alibaba’s Customer Experience Group incubated China’s first batch of artificial intelligence trainers within its customer service team. The occupation was first proposed by the Alibaba Xiaomi (AliMe) team and registered with the state.

The entry threshold for early AI trainers was not high. They mainly collected data with web crawlers and did mechanical, repetitive work, which attracted large numbers of “small-town youth” without deep technical backgrounds. The AI trainer industry was once regarded as “the Foxconn of the AI industry”, and it was hard to associate it with words such as “professional”, “technical”, and “creative”.

According to Alipay’s survey data on new occupations, “small-town youth” are the main force in more than 40 new occupations: about half of the practitioners live in third-, fourth-, and fifth-tier cities and counties, and more than two-thirds of them work part-time.


However, as artificial intelligence enters the deployment stage, data for vertical scenarios has become the main demand, and the requirements for data type, quality, and precision have risen significantly. Voice, image, and NLP datasets have begun to appear, and the strength of leading companies and professional third-party companies in the data service field has gradually become apparent.

According to related reports, in 2018, about 34% of business volume went to third-party companies that specialize in data acquisition. The demand for professional data is evident.


Improving the professionalism and accuracy of data also requires practitioners to have relevant professional knowledge and to exercise creativity in order to meet users’ customized needs.

The labeling process is no longer a coarse, “foolproof” operation of simply outlining “sky”, “vehicle”, and “crowd”. Instead, labeling dimensions have become more finely subdivided and vertical. What began as face recognition alone developed into emotion detection and, later, into deeper subdivisions such as micro-expression recognition, all of which require data service practitioners to have the corresponding domain knowledge.

Against this backdrop, the once highly fragmented data labeling industry has begun to stratify. Data labeling has gradually shifted from labor-intensive to skill-intensive. Assembly-line AI trainers have moved toward a more professional and refined way of working, and they are gradually becoming “experts” in the field.

In addition, the AI trainer’s working model does not stop at one person collaborating with one machine. A growing body of research suggests that machine-simulated or machine-generated data may become a new outlet in the future.

AI trainer teams have introduced machine annotation, increasing the number of dimensions that machines can annotate and improving the accuracy of machine-processed data. This not only improves efficiency and expands market boundaries, but also better fits AI’s nature of reducing manual work.

Although AI is becoming more intelligent over the long run and can assist with labeling, it still struggles with judgments that are prone to bias. For example, AI’s ability to recognize how text evolves and to recognize emotion remains weak. In the future, AI will have to deal with more complex problems in the industry, but human perception and judgment cannot be replaced.
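One common way to combine machine annotation with human judgment is to accept machine labels only when the model is confident and to send everything else to a human trainer. The Python sketch below shows that routing step under stated assumptions: the threshold value, the `MachineLabel` record, and the sample items are hypothetical, and the article does not say that any particular team works this way.

```python
from typing import List, NamedTuple, Tuple


class MachineLabel(NamedTuple):
    item_id: str
    label: str
    confidence: float  # model's self-reported score in [0, 1]


# Hypothetical cut-off; in practice it would be tuned per task.
REVIEW_THRESHOLD = 0.9


def route(
    labels: List[MachineLabel], threshold: float = REVIEW_THRESHOLD
) -> Tuple[List[MachineLabel], List[MachineLabel]]:
    """Split machine labels into auto-accepted and needs-human-review."""
    accepted: List[MachineLabel] = []
    needs_review: List[MachineLabel] = []
    for item in labels:
        if item.confidence >= threshold:
            accepted.append(item)
        else:
            needs_review.append(item)
    return accepted, needs_review


if __name__ == "__main__":
    batch = [
        MachineLabel("img-001", "vehicle", 0.97),
        MachineLabel("img-002", "micro-expression: contempt", 0.55),
    ]
    accepted, needs_review = route(batch)
    print("auto-accepted:", [m.item_id for m in accepted])
    print("sent to trainers:", [m.item_id for m in needs_review])
```

In this toy batch the coarse “vehicle” label passes automatically, while the fine-grained emotional label falls below the threshold and goes to a human, which matches the division of labor the article describes.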

It is reported that by 2025, the basic data services market for autonomous driving alone will exceed 2.4 billion yuan, and the total number of industry data tasks will exceed 100 million. As artificial intelligence is widely applied in intelligent manufacturing, intelligent transportation, smart cities, intelligent healthcare, intelligent agriculture, intelligent logistics, intelligent finance, and other industries, the number of artificial intelligence trainers will see explosive growth.