Big Data Digest

Author: Liu Junhuan

The video age is coming.

The 2019 Douyin Data Report released this week states that Douyin's daily active users have exceeded 400 million; globally, according to figures YouTube released in 2019, its monthly active users exceed 1.9 billion.

Alongside this flood of video content, violent and pornographic material has also poured onto the Internet, becoming a "time bomb" lurking in users' video feeds.

In this era of AI empowering everything, artificial intelligence seems to be playing its part in reviewing video content.

Facebook launched DeepText, which uses a deep neural network architecture to understand content; YouTube long ago launched Content ID to monitor and remove videos involving pornography and violence, and has spent more than $100 million on the technology over the years. Many media outlets have even predicted that AI, with its ability to organize massive amounts of data, will soon replace manual review.

But is this really the case?

Recently, the well-known technology outlet The Verge visited Google's team of human content reviewers in Austin, Texas, and reported on the severe psychological trauma team members have suffered from reviewing large volumes of violent content.

YouTube video review is by no means a simple task.

At present, there are about 50 million independent creators on YouTube worldwide, and up to 500 hours of video are uploaded every minute, which puts enormous pressure on video reviewers.

Google's review team in Austin currently numbers more than 100 people, mainly responsible for reviewing extreme content involving violence and pornography. The team also employs dozens of low-paid immigrant workers from the Middle East to assist with the work.

To maintain throughput, each member of the Austin content review team must watch more than five hours of violent and pornographic video per day. Reviewers are paid $18.50 per hour, about $37,000 per year, but have not received a raise in the past two years. YouTube CEO Susan Wojcicki told the media that Google promised last year to reduce each reviewer's daily workload to four hours, but the change has yet to be implemented.

As a result of prolonged exposure to extreme video, reviewers on the Google team have suffered severe psychological trauma. Although Google provides reviewers with first-class medical services and benefits, many are still diagnosed with mental health problems such as PTSD and chronic anxiety.


Note: PTSD (post-traumatic stress disorder) is a delayed-onset, persistent mental disorder that can develop after an individual experiences, witnesses, or is confronted with actual or threatened death, serious injury, or a threat to the physical integrity of themselves or others.

What are video reviewers watching every day?

In Digest's imagination, a reviewer's job is simply to watch the videos users upload to the site. It sounds a bit like getting paid to scroll through Douyin all day. A dream job, you might say.

But whether on YouTube or on domestic platforms, video review is nowhere near as pleasant as Digest imagined.

On Zhihu, related topics have drawn 167,951 views. One user, @white, gave an example: "The ISIS hostage execution videos terrified me. The cruelty I saw left me afraid of the dark. Perhaps the essence of humanity is killing: people with slit throats struggling desperately, dark red blood slowly flowing out, corpses in Japan's suicide forest…"

Similarly, life has not been easy for YouTube's content moderators.

Peter, a YouTube content moderator, told The Verge that he reviews violent extremist content every day, arguably the most dispiriting assignment in the review process. As mentioned above, Google's review team follows a strict work schedule, and he must watch a set quota of violent and pornographic videos.

"Every day you see people hacking people to death, or shooting relatives and friends," Peter said. "You start to feel the world is crazy. It makes you sick; you don't even want to live anymore. Why do we do this to each other?"

The lives of Peter and his colleagues have been deeply affected over the past year. One colleague suffered a nervous breakdown; another developed anxiety and depression at work, his diet and sleep gradually became disordered, and he eventually developed an acute vitamin deficiency and had to be hospitalized.

It's not just Peter and his colleagues. Daisy, another reviewer, who was once responsible for reviewing terrorism and child-abuse content on Google's video platforms, had difficulty interacting with her own children for a time. A psychiatrist diagnosed her with PTSD, and she is still receiving treatment.

Reportedly, applicants for the job usually do not realize how much physical and psychological harm extreme video can cause. According to YouTube's reviewers, the workload and job demands that Google describes to candidates are often understated.

Google has established health-care standards for full-time reviewers, who can take months off to deal with psychological problems that seriously affect their work and life. But this applies only within Google itself; many contract content reviewers have been coldly ignored by the company after suffering psychological trauma.

Can AI save video reviewers?

AI's involvement in video review is nothing new. As early as around 2000, companies were attempting it, but video review then relied on hand-crafted features and rules, such as the distribution of skin-colored regions in a frame. Only with the development of deep learning did video review finally become "flexible".

Even so, "human-machine collaboration" remains the industry norm in video review, and humans still carry a large share of the work.

An algorithm engineer at YouTube told Digest that most videos on YouTube still require manual review. Of the videos under review, some are flagged by AI and some are reported by users, but professional reviewers must check them and decide whether they violate the rules.

According to Leo, an algorithm engineer at iQIYI, two "human-machine collaboration" review methods are common in the industry:

  • In one, AI first classifies videos and recommends them to a small set of users, observing their reactions; videos that become popular are prioritized for manual review;

  • In the other, AI marks each video as "good" or "bad"; when reviewers encounter a video marked "bad", they examine it more carefully, which also improves review efficiency.

iQIYI currently adopts the second model. A video is first pre-screened by machine review, then goes through a first manual review and a re-review. The machine's results serve mainly as a reference to assist the human reviewers, and a spot-check mechanism is also in place.
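A minimal sketch of that second mode, assuming a toy risk-scoring model; every name, score, and threshold here is hypothetical, not iQIYI's actual system:

```python
# Hypothetical sketch of the "AI pre-label + manual review" triage mode.
# machine_review() is a stand-in for a trained classifier: it returns a
# risk score in [0, 1], and videos above a threshold are labeled "bad".

def machine_review(video):
    """Toy model: in a real system this would be a trained classifier."""
    return video["risk_score"]

def triage(videos, threshold=0.5):
    """Split uploads into a priority queue (careful manual review)
    and a normal queue (lighter manual review)."""
    priority, normal = [], []
    for v in videos:
        label = "bad" if machine_review(v) >= threshold else "good"
        (priority if label == "bad" else normal).append({**v, "ai_label": label})
    return priority, normal

uploads = [
    {"id": 1, "risk_score": 0.9},   # likely violent/pornographic
    {"id": 2, "risk_score": 0.1},
    {"id": 3, "risk_score": 0.7},
]
priority, normal = triage(uploads)
print([v["id"] for v in priority])  # → [1, 3]
```

The point of the design is that human attention is spent where the model is most suspicious, while the machine's label never makes the final call.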

As for the claim that AI will replace manual review, Leo believes it is far too early. AI does a good job on objectively identifiable content, but once subjective content tied to contextual semantics is involved, it falls short.

There are two technical difficulties in AI review. The first is algorithm accuracy. An industry saying goes, "quoting accuracy only on a benchmark dataset is meaningless": models trained on a dataset do not necessarily match real-world behavior. Even if AI review reached 99% accuracy, given the volume of videos users upload, the remaining 1% would still accumulate into a staggering amount.

And any omission exposes the video site to huge risks.
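To see the scale, a back-of-the-envelope calculation using the figures cited earlier (500 hours uploaded per minute, and the hypothetical 99% accuracy):

```python
# Back-of-the-envelope: even 99%-accurate AI review leaves a huge residue.
# Figure from the article: ~500 hours of video uploaded to YouTube per minute.

hours_per_minute = 500
hours_per_day = hours_per_minute * 60 * 24   # 720,000 hours uploaded per day
miss_rate = 0.01                             # the "remaining 1%"
missed_hours_per_day = hours_per_day * miss_rate

print(f"{hours_per_day:,} hours uploaded per day")                 # 720,000
print(f"{missed_hours_per_day:,.0f} hours/day escape AI review")   # 7,200
```

Seven thousand hours of unflagged video per day is far more than any review team can absorb, which is why even a small error rate matters.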

The other difficulty is subjective judgment of content. Simply put, not all nudity is pornographic, and not all pornographic videos contain nudity. Moreover, a video mixes text, speech, and imagery; that mix is easy for humans to judge, but machines need multiple algorithms layered on top of one another.

Leo gives an example: when processing audio content, the system needs ASR to convert speech into text on the one hand, and sound classification on the other, covering non-speech audio such as gasping. If text appears in the frame, OCR is needed to extract it from the video; and in the end, everything comes down to NLP, that is, text understanding.
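Leo's description amounts to a multi-branch pipeline: audio goes through ASR and sound classification, on-screen text through OCR, and all extracted text through NLP. A structural sketch is below; every function is a hypothetical stub standing in for a real model, not an actual API:

```python
# Hypothetical skeleton of the multi-modal review pipeline Leo describes.
# Each stage is a stub; a real system would plug in trained models here.

def asr(audio):
    """Speech-to-text stub (stands in for a real ASR model)."""
    return audio.get("speech", "")

def classify_sound(audio):
    """Sound-classification stub for non-speech audio such as gasping."""
    return audio.get("sound_labels", [])

def ocr(frames):
    """OCR stub: extracts on-screen text from video frames."""
    return " ".join(f.get("text", "") for f in frames)

def nlp_flags(text, banned=("violence", "porn")):
    """Text-understanding stub: flags banned keywords in extracted text."""
    return [w for w in banned if w in text.lower()]

def review(video):
    # All text branches (ASR + OCR) feed the NLP step; the audio branch
    # contributes sound labels directly.
    text = asr(video["audio"]) + " " + ocr(video["frames"])
    flags = nlp_flags(text) + [s for s in classify_sound(video["audio"]) if s == "gasping"]
    return sorted(set(flags))

clip = {
    "audio": {"speech": "scenes of violence follow", "sound_labels": ["gasping"]},
    "frames": [{"text": "18+ PORN"}],
}
print(review(clip))  # → ['gasping', 'porn', 'violence']
```

The structure, not the stubs, is the point: several independent recognizers have to agree on a joint interpretation before a video can be judged.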

As a result, manual review remains a vital part of the overall review pipeline. iQIYI's professional review team is mainly responsible for screening user-uploaded video content and filtering out anything that violates national laws and regulations or platform standards.

Defining sensitive content? AI: "I just can't do it"

Beyond the accuracy and subjective-judgment problems discussed above, which AI still cannot solve, another reason that cannot be ignored is that the definition of sensitive content is itself unclear.

In China, illegal content is generally regulated uniformly by SARFT, and video websites are passive rule-takers: they must conduct strict self-review according to SARFT's standards. Some companies even set up dedicated legal-consulting positions just to study the administration's policies.

Globally, more video sites define sensitive content themselves. But greater initiative also means greater responsibility. Achieving uniform global review standards is undoubtedly a tricky task; a video site that fails to account for local cultural factors can, in serious cases, find itself in a fierce battle with governments and the public.

For example, in July 2018 the Indonesian government banned TikTok, the popular international version of the short-video app Douyin. Not long ago, the U.S. military also explicitly banned TikTok for security reasons.

According to Indonesian media reports, the government banned TikTok because the app contained too many negative videos, and public sentiment in Indonesia also generally opposed young people using it, since some videos could harm them. One such video started with a dance, then the camera suddenly cut to a corpse; investigators found the corpse was a relative of the person filming.

Beyond the death-related videos mentioned above, platforms worldwide have been especially cautious about the following:

  • Religious hate speech that incites violence

  • Fake news spread for political purposes

  • Derogatory language against individuals or organizations

In addition to "video violence", the definition of "video pornography" is also highly subjective and arbitrary. Instagram previously drew protests from many women because it allows men's bare nipples but prohibits women's.

Compared with Instagram, the rules of some social networking sites seem much more relaxed, allowing nudity in certain special situations.

Tumblr, which recently updated its content rules, is an interesting footnote: prohibited content includes photos and videos of human genitals, of female-presenting nipples, and of any sex acts, including in illustrations. Exceptions include nude classical statues and political protests featuring nudity. The new guidelines exclude text, so written erotica is still allowed; illustrations and art featuring nudity remain allowed as long as sex is not explicitly depicted, as do breastfeeding and post-birth photos.

You can also compare the "pornography" and "nudity" rules of four high-traffic social platforms, including Facebook and Reddit:

As this shows, each platform must craft its own rules and exceptions based on its values, the user groups it serves, and their cultural sensitivities. In other words, given how subjective the content is, creating a once-and-for-all global content standard is all but impossible.

What can AI do in content review?

Despite the many limitations and shortcomings, none of this changes the fact that AI review is the trend.

Currently, besides evaluating and flagging extreme text content such as spam and abusive comments, AI on some social platforms can also block offending images and even help target and investigate harassment and bullying.

However, three things deserve attention when using AI for content review:

  • Content review requires cultural awareness and a contextual understanding of the relevant community's "standards". Although AI can pre-screen content to reduce the manual workload, human participation remains an indispensable link.

  • AI faces public mistrust, especially over possible human or technical bias, and algorithms may also miss violations. For this reason, the algorithms need regular analysis and adjustment on the one hand, and stakeholders should ensure AI transparency on the other.

  • Given the variety of formats and the complexity of content, user-generated video is increasingly difficult to analyze, and it must be interpreted as a whole to identify violations. To better understand user behavior and keep the definition of harmful content up to date, platforms and service providers would ideally share datasets, helping stakeholders gain better cultural awareness and contextual understanding.

In an ideal world, wouldn't it be great if AI could do all of the above?

Let's return to content review. Even when AI review has been optimized as far as it can go and reviewers' efficiency has improved dramatically, the review team's psychological problems still seem unresolved.

In interviews, Digest also learned that, beyond the video review team, algorithm engineers must watch large numbers of violent and pornographic videos every day in order to design more accurate and usable review algorithms, and they are inevitably affected in the same way. Technological progress may be unstoppable, but the AI that media and the public pin such high hopes on is also destined to carry personal sacrifices.

As bystanders in the technological torrent, we have neither the right nor the ability to change the work of reviewers or algorithm engineers, but at least we can pay more attention to this group. As Daisy said: "We need more people to participate in this work, but we need to change the entire system and work structure to support these people, and provide them with tools and resources to deal with the problem; otherwise the problem will only get worse."

Related reports:

https://www.theverge.com/2019/12/16/21021005/google-youtube-moderators-ptsd-accenture-violent-disturbing-content-interviews-video