Princeton Report: How to Recognize AI “Snake Oil”

Editor’s note: This article is from the WeChat public account Xinzhiyuan (新智元, ID: AI_era); source: cs.princeton.edu; edited by Xiaoqin and Daming.

A Princeton University professor’s recent report, “How to Recognize AI Snake Oil,” has been making the rounds. Many technologies that claim to use AI algorithms to predict social outcomes in fact perform no better than linear regression models. What do you make of AI “snake oil”? Come discuss it with the AI heavyweights in Xinzhiyuan’s AI circle of friends.

AI is not a cure-all, yet more and more people describe it as one, and, encouraged by them, more people may actually start treating it as one.

So how do you keep a cool head amid all the hype and tell the real from the fake? Recently, Arvind Narayanan, an associate professor in the Department of Computer Science at Princeton University, addressed this in a report titled “How to Recognize AI Snake Oil.”


The main points of the report are as follows:

1. Many things that have nothing to do with AI are being labeled as AI. The genuine, socially consequential AI advances that do exist have inadvertently become cover for these counterfeits.

2. Many technologies that claim to use AI algorithms involve predicting social outcomes. The truth is that we cannot predict the future, but when AI is involved, this common sense seems to be set aside.

3. For predicting risky behavior, manual scoring rules are at least as reliable as AI scoring, and more transparent. Traffic violations, for example, are handled with a manual point system, with the license revoked once deductions reach a threshold; this kind of scoring is still best left to people.

The author opens with an example. The website below claims that from a 30-second video it can assess your career prospects and job stability. Sounds amazing, right? Just upload a video, and the site automatically evaluates multiple indicators, visualizes them, and produces an overall score.


It even claims that the score has nothing to do with what you say in the video; it is based entirely on your body language and your manner and style of speaking.

In reality, this is just a “random number generator” in a fancy shell: whether your career is judged stable or not comes down to luck.

Why are so many fakes packaged as AI?

First, “AI” is now a fashionable label, and attaching it adds perceived value.

Second, some AI technologies have genuinely achieved real, widely recognized progress.

Third, most people do not understand AI, so companies can slap the AI label on almost anything and sell it.

This example only illustrates the problem in the HR field; in other fields, the deliberate exaggeration of AI technology may be even worse. In the report, the author roughly divides current AI applications into three categories.


The first category: cognitive AI. This mainly includes content recognition (such as reverse image search), face recognition, computer-aided diagnosis from medical images, speech-to-text conversion, and DeepFakes. The author holds that this category is making genuinely rapid technical progress; indeed, the uncannily realistic output of DeepFakes has raised ethical concerns.

The author argues that the main reason there is little room for fraud or hype with this kind of AI is the certainty of the results and the judging criteria: whether for face recognition or speech-to-text conversion, the standard for what counts as an error is very clear.


The second category: AI for automating judgment. This includes spam detection, pirated-content detection, automated essay grading, content recommendation, and so on. Although these applications are far from perfect, they are improving, and their prospects are steadily expanding.

For this type of AI, the judgment criteria start to get fuzzy. Is an essay well written? Is an email spam? Different people may disagree on these questions. AI gradually learns to imitate human judgment and reasoning, but mistakes are often unavoidable.


The third category: AI for predicting social outcomes. This includes predicting job performance, predicting recidivism, predictive policing, and predicting terrorist attacks. The author regards the validity of this category as fundamentally dubious.

The author’s point: given that we ourselves cannot predict the future, handing the task to AI and making policy based on its output runs contrary to common sense and is likely to cause real harm.

AI that predicts social outcomes? Little better than linear regression

The third type of AI application concerns predicting social outcomes, and most of the time it is fundamentally dubious:

  • Predicting criminal recidivism

  • Predicting job performance

  • Predictive policing

  • Predicting terrorism risk

  • Predicting at-risk children

This article focuses on this third type of AI application, because that is where the snake oil is concentrated.

We saw one such tool earlier, which claims to predict job suitability. Similarly, bail decisions are made on the basis of algorithmic predictions of recidivism, and people have been turned away at the border on the basis of algorithms that analyze social media posts to predict terrorism risk.

These problems are hard because we cannot predict the future. That should be common sense. But once artificial intelligence is involved, people seem willing to put common sense on hold.


Genuine, rapid progress:

  • Shazam (a music recognition app)

  • Reverse image search

  • Face recognition

  • Diagnosis from medical imaging

  • Speech to text

  • Deepfakes

Not perfect, but improving:

  • Spam detection

  • Copyright-infringement detection

  • Automated essay grading

  • Hate speech detection

  • Content recommendation

Fundamentally dubious:

  • Predicting recidivism

  • Predicting job performance

  • Predictive policing

  • Predicting terrorism risk

  • Predicting at-risk children

Of course, this is far from a complete taxonomy of AI uses (robotics, games, and so on are not included). The point is to show how the limits of accuracy differ, quantitatively and qualitatively, across different types of tasks.

And, as we will see next, the third type of application shows no real improvement no matter how much data is poured in.

Case study: Can social outcomes be predicted?


Princeton University sociologist Matthew Salganik and colleagues ran the Fragile Families and Child Wellbeing Study and, building on its data set, organized a machine learning challenge that drew 457 researchers.

The Fragile Families and Child Wellbeing Study (“fragile families” being those formed by unmarried parents and their children) tracks nearly 5,000 children born in large U.S. cities between 1998 and 2000, roughly three-quarters of them to unmarried parents. These families face a greater risk of breakup and poverty than the average family.

The study revolves around four questions: (1) What are the circumstances and capabilities of unmarried parents, especially fathers? (2) What is the nature of the relationships between unmarried parents? (3) How do children born into these families fare? (4) How do policies and environmental conditions affect these families and children? Six waves of data are publicly available through the project’s population-research data archive.

As far as I know, this is the most rigorous effort to measure the predictability of social outcomes.


They collected a large amount of data about each child and family based on years of in-depth interviews and repeated family observations.


The Fragile Families Challenge (FFC) was set up like many other machine learning competitions: the task is to learn the relationship between background data and outcome data from training instances. Accuracy is ranked on a leaderboard during the competition and evaluated on held-out data afterward.

Given all background data from birth to age 9, plus some training data from age 15, participants had to accurately predict outcomes in the following key categories:

  • Child’s grade point average (academic performance)

  • Child’s grit (passion and perseverance)

  • Household material hardship (a measure of extreme poverty)

  • Eviction of the family (for failure to pay rent or mortgage)

  • Layoff of the primary caregiver

  • Job training (whether the primary caregiver would participate in a job-skills program)


A perfect prediction corresponds to a coefficient of determination, R^2, approaching 1. Predicting the average for every instance corresponds to an R^2 approaching 0 (i.e., the model has not learned to distinguish instances at all).
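
To make these two endpoints concrete, here is a minimal sketch (with made-up numbers, not challenge data) of how R^2 behaves:

```python
import numpy as np
from sklearn.metrics import r2_score

# Hypothetical held-out outcomes (e.g., GPA values); the numbers are invented.
y_true = np.array([2.1, 3.4, 2.8, 3.9, 2.5])

# A model that predicts the mean for every child learns nothing about
# individual differences, so R^2 comes out to exactly 0.
y_mean = np.full_like(y_true, y_true.mean())
print(r2_score(y_true, y_mean))  # 0.0

# A (hypothetical) perfect model reproduces every outcome: R^2 = 1.
print(r2_score(y_true, y_true))  # 1.0
```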

Most people’s intuition was that R^2 would land somewhere between 0.5 and 0.8, and many of the experts organizing the challenge had similarly high expectations.


The actual results, however, were disappointing: R^2 ranged from 0.03 to 0.23.

Bear in mind that hundreds of professional AI/ML researchers and students took part in the challenge, that they were motivated to maximize predictive accuracy, and that each family came with about 13,000 features. These were the best-performing models.

[Slide: held-out accuracy of the best-performing models, with a four-variable linear regression shown as a green line]

By contrast, a linear regression model with only four variables (the green line above) performed scarcely worse than the AI models.

In other words, “AI” is not much better than a simple linear formula!

This is the crux: regression analysis is a hundred years old.
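
The spirit of this comparison can be reproduced on synthetic data (a sketch only; the real challenge used the Fragile Families data, not these invented features): fit a flexible model on many mostly-noise features, fit a plain four-variable linear regression, and compare held-out R^2.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000

# Synthetic stand-in for a hard-to-predict social outcome: four weak
# signals buried among hundreds of pure-noise features.
X_signal = rng.normal(size=(n, 4))
X_noise = rng.normal(size=(n, 496))
X = np.hstack([X_signal, X_noise])
y = X_signal @ np.array([0.3, 0.2, 0.2, 0.1]) + rng.normal(size=n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# A flexible model with access to all 500 features...
gb = GradientBoostingRegressor().fit(X_tr, y_tr)
# ...versus plain linear regression on just the four signal variables.
lr = LinearRegression().fit(X_tr[:, :4], y_tr)

print("boosting R^2:    ", r2_score(y_te, gb.predict(X_te)))
print("4-var linear R^2:", r2_score(y_te, lr.predict(X_te[:, :4])))
```

When the outcome is mostly irreducible noise, both models bump against the same low ceiling, and the extra features and model complexity buy almost nothing.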

[Slide: accuracy of a recidivism-prediction tool]

The same findings are found in many other areas.

The picture above shows an AI that predicts recidivism. Note that the figure is accuracy, not R^2, so 65% is only slightly better than chance. The true accuracy may be lower still: although the tool claims to predict recidivism, what it actually predicts is re-arrest, since that is what gets recorded. At least some of the algorithm’s predictive performance therefore comes from predicting policing bias.

The claim: for predicting social outcomes, artificial intelligence is no better than manual scoring using just a few features.

This is a falsifiable claim. Of course, given evidence to the contrary, I am willing to change my mind or qualify the statement appropriately. But on the current evidence, it seems the most prudent position.


Points deducted from a driver’s license can be seen as a way of predicting accident risk, and some studies have found such systems to be quite well calibrated. We have long known that in many domains, if prediction is really all we want (it usually isn’t), then simple formulas are more accurate than human predictions, even those of experts with years of training.

Daniel Kahneman explains that this is because human predictions are often “noisy”: given the same input, different people (even the same person at different times) make very different predictions. Using statistical formulas eliminates noise.
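
A minimal simulation of this point (the weights and noise level below are assumed purely for illustration): the same inputs, scored once by noisy judges and once by a fixed formula.

```python
import numpy as np

rng = np.random.default_rng(1)
n_cases = 10_000

# Two observable inputs per case, and the true risk they determine.
x1 = rng.normal(size=n_cases)
x2 = rng.normal(size=n_cases)
true_risk = 0.6 * x1 + 0.4 * x2

# Human judges see the same inputs but add idiosyncratic noise to each call.
human_pred = true_risk + rng.normal(scale=0.8, size=n_cases)

# A fixed formula: slightly mis-weighted, but perfectly consistent.
formula_pred = 0.5 * x1 + 0.5 * x2

def mse(pred):
    return np.mean((pred - true_risk) ** 2)

print("human MSE:  ", round(mse(human_pred), 3))    # ~0.64, dominated by noise
print("formula MSE:", round(mse(formula_pred), 3))  # ~0.02, despite wrong weights
```

Even with the wrong weights, the consistent formula beats the noisy judges, which is exactly Kahneman’s point.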


The dangers of using artificial intelligence to predict social outcomes:

  • The need for personal data

  • Transfer of power from domain experts to unaccountable technology companies

  • Lack of interpretability

  • Distracts from effective interventions

  • A veneer of accuracy

  • ……

Artificial intelligence prediction has many drawbacks compared to manual scoring rules.

The most important is the lack of interpretability. Imagine a system in which, when you are stopped by the traffic police, they feed your data into a computer instead of deducting points from your license. Most of the time you drive freely, until one day the black-box system tells you that you may no longer drive. Unfortunately, we already have systems like this in many areas today.
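
For contrast, a manual scoring rule of the kind the report favors is fully inspectable. Here is a toy sketch (the point values and threshold are hypothetical, not any real jurisdiction’s schedule):

```python
# Hypothetical demerit-point schedule: every deduction is visible, and a
# driver can see exactly how far they are from losing their license.
POINTS = {
    "speeding": 3,
    "running_red_light": 6,
    "drunk_driving": 12,
}
SUSPENSION_THRESHOLD = 12

def license_suspended(violations):
    """Return True if accumulated points reach the suspension threshold."""
    total = sum(POINTS[v] for v in violations)
    return total >= SUSPENSION_THRESHOLD

print(license_suspended(["speeding", "speeding"]))                       # False: 6 points
print(license_suspended(["speeding", "speeding", "running_red_light"]))  # True: 12 points
```

Nothing here needs to be reverse-engineered: anyone can read the rule, check the arithmetic, and contest a mistake.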

Summary

  • Artificial intelligence excels at certain tasks, but it cannot predict social outcomes.

  • We must resist the enormous commercial interests that seek to obscure this fact.

  • In most cases, manual scoring rules are equally accurate, more transparent, and worth considering.