People are demanding more and more control over their mobile phone data, but large-scale “eavesdropping” is in fact unlikely to be happening.

Editor’s note: This article is from the WeChat public account “CaijingEleven” (ID: caijingEleven).

Author | Liu Shuqi

Editor | Xie Lirong

More and more people are feeling uneasy about the insecurity that apps bring.

The “App Security Awareness Public Questionnaire Report,” released during National Cyber Security Awareness Week in September this year, showed that among 320,000 respondents, nearly one-third were disgusted by apps’ precisely targeted advertising and felt they had been spied on or overheard.

A product you just discussed with a friend shows up in your recommendations; an app pushes local information even though you never granted it location permission; something you searched for in one app appears in another… In the age of big data, no one can escape the flood of precisely targeted pushes.

People used to say “on the Internet, nobody knows you’re a dog” to describe the anonymity of the Internet. Now, the Internet not only knows who you are, but also knows whether you like dogs, and pushes dog food ads at you.

In the British drama “Black Mirror,” a girl who missed her boyfriend terribly used all the traces he had left on social networks to piece together an AI model almost identical to him.

“There is nowhere to escape.” Professor Zhou Yajin of the College of Computer Science and Technology at Zhejiang University has long studied cyberspace security. He told the Caijing reporter that the predicament of individuals on the Internet is becoming more and more obvious.

Zhou Hongyi, CEO of 360, said at this year’s Internet Security Conference that some software turns on users’ cameras or microphones to collect information, then extracts keywords from what it gathers to match users’ interests.

Does the covert photographing and recording by apps that so many users perceive really exist? If not, how do apps achieve such uncannily accurate, targeted pushes? Where is the boundary between misuse and fair use of user data? And how can artificial intelligence keep developing while data security is ensured?

The process of data collection, exchange, and application is like a fully enclosed black box to ordinary users, with intricate internal operating mechanisms. The Caijing reporter tried to uncover a corner of this black box.

01 The person in the world who knows you best may be an algorithm

On October 13, the highly anticipated draft of the Personal Information Protection Law made its debut before the Standing Committee of the National People’s Congress. This law, dedicated to the protection of personal information, makes clear that “the processing of personal information shall require the individual’s consent given with full prior notice; the individual shall have the right to withdraw consent; if important matters change, consent shall be obtained anew; and a processor shall not refuse to provide products or services on the grounds that an individual withholds consent.”

This reflects the most critical principle in personal information protection: informed consent. In past practice, however, Internet companies have ignored this principle from time to time.

As early as two years ago, there were rumors of apps secretly photographing and recording. The first to be implicated were QQ Browser and Baidu Input Method. At the time, the front camera of the vivo NEX was hidden inside the phone body and rose only when called. When users opened QQ Browser, they could plainly see the small camera module pop up. Likewise, even when no operation was performed in the Baidu Input Method interface, the phone would indicate that recording was in progress.

QQ Browser and Baidu Input Method subsequently explained, each in turn, that they had not secretly invoked users’ recording or camera functions: in one case, certain websites were reading camera parameters; in the other, the input method was pre-warming the microphone to optimize voice input.

But the clamor did not die down, and a worry lingers in people’s minds: is our life under surveillance at every moment? Such suspicions, however, usually rest on personal experience rather than conclusive evidence.

A number of experts and veterans in network security and app development told the Caijing reporter that covert photographing and recording is technically possible, but not cost-effective: it is expensive and inefficient, and it carries serious legal risks. In other words, there is no need to worry much about apps secretly taking pictures and recordings.

Jiang Lin, head of the Nandu Personal Information Protection Research Center, gave the Caijing reporter three reasons why the covert photography and recording theory does not hold up.

First, it would require hardware support for voice wake-up, as well as the ability to distinguish dialects and maintain recognition accuracy in noisy real-world environments.

Second, uploading huge recording files consumes a great deal of data traffic, which would be hard to keep from users’ attention.

The more practical factor is cost: capturing user preferences through surveillance is extremely expensive, and from an input-output perspective, companies have no reason to pay for it.

“These technologies (covert photography and recording) usually have specific application scenarios, such as economic investigations and wiretapping, targeted at important individuals. Such sophisticated techniques will not be deployed broadly in ordinary commercial settings,” Zhou Yajin said.

The next question: if covert photographing and recording is not widespread, why do so many users share the feeling of being monitored?

An engineer with many years of iOS development experience told the Caijing reporter that there are many ways to obtain user data, directly or indirectly, and to refine user portraits. The most common source is still the user’s own profile and browsing data, including search history, time spent on each page, and which page led to which. From these, a portrait with multiple tags is built for each user, such as “male, bachelor’s degree, 30 years old, middle income, married, no children, keeps cats.”
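As a toy illustration (not any real company’s pipeline; the categories, thresholds, and rules below are all invented), a tag portrait of this kind can be derived from nothing more than page categories and dwell times:

```python
from collections import Counter

def build_portrait(events):
    """events: list of (page_category, seconds_on_page) tuples."""
    dwell = Counter()
    for category, seconds in events:
        dwell[category] += seconds
    tags = []
    # Rule-of-thumb inferences of the kind described above:
    if dwell["pet_food"] + dwell["pet_care"] > 120:
        tags.append("pet owner")
    if dwell["baby_products"] == 0:
        tags.append("likely no children")
    # The single most-browsed category becomes a headline interest.
    top = dwell.most_common(1)[0][0] if dwell else None
    if top:
        tags.append(f"top interest: {top}")
    return tags

events = [("pet_food", 90), ("pet_care", 60), ("electronics", 30)]
print(build_portrait(events))
# → ['pet owner', 'likely no children', 'top interest: pet_food']
```

The point is that none of these inputs is a photo or a recording; dwell time and click paths alone are enough to produce tags that feel uncannily personal.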

The formation of this tag system can be extremely complicated. Zhou Yajin gave an example: if several users connect to the same Wi-Fi network every night, the system judges that they are very likely one family, and pushed content is then likely to be cross-shared among them. Moreover, even if a user denies an app location permission, Wi-Fi can still determine his approximate location to match local advertisements.

Less noticeable are the SDKs (software development kits) embedded in apps. An SDK is a plug-in that provides a specific function or service within an app, such as advertising, payment, or maps. The point of an SDK is that when developers need a certain function, they do not have to build it from scratch; they simply integrate the SDK.

“When App A and App B both embed the same advertising SDK, the data collected in A and in B may both be uploaded to that SDK, and data sharing naturally forms between A and B. Your search records and usage habits in A may be reflected in B,” Zhou Yajin explained.
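The mechanism Zhou describes can be sketched in a few lines. This is a hypothetical model, not any real SDK’s API: both apps report to the same backend keyed by the same device identifier, so their data merges into one profile.

```python
class AdSDKBackend:
    """Toy stand-in for the server behind a shared advertising SDK."""

    def __init__(self):
        self.profiles = {}  # device_id -> set of interest keywords

    def report(self, device_id, app_name, keyword):
        # Whichever app embeds the SDK, events land in one profile,
        # because the key is the device, not the app.
        self.profiles.setdefault(device_id, set()).add(keyword)

    def pick_ad(self, device_id):
        interests = self.profiles.get(device_id, set())
        return f"ad for {sorted(interests)}"

backend = AdSDKBackend()                               # one shared backend
backend.report("device-42", "AppA", "running shoes")   # searched in App A
backend.report("device-42", "AppB", "protein powder")  # browsed in App B
# App B can now show an ad informed by activity in App A:
print(backend.pick_ad("device-42"))
# → ad for ['protein powder', 'running shoes']
```

This is why a search in one app can surface as an ad in another even though the two apps never talk to each other directly: the SDK backend is the common channel.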

Because of this concealment, SDKs are also a difficult point in data security and user privacy protection.

The engineer, who asked not to be named, also pointed out an easily overlooked “leaker”: the mobile phone input method. Many third-party input methods track users’ word-frequency data, he said, and that data can in theory be sold to other companies. This could explain how you tell a friend in a chat that you want to buy a certain item one moment, and find a shopping app pushing that very item to the top of your feed the next.

Even data that is not personal information to begin with may, after continuous aggregation, reveal previously unknown connections between things and thereby expose users’ private information. Wu Danjun, a partner at Beijing Guantao Zhongmao (Shanghai) Law Firm and an expert with the Chief Data Officer Alliance, told the Caijing reporter that such results are often unforeseeable before the data is processed.

The supposed covert photographing and listening “is very likely just the app’s ‘Guess You Like’ guessing very accurately,” Jiang Lin said. After all, you never know how detailed your user portrait is in an app’s backend. The algorithm may understand your heart better than you do.

02 Users have become easily spooked

“The frequent privacy leaks of the past have made everyone overly anxious and nervous.” He Yanzhe is an expert with the App governance working group jointly established by four government departments, and has long worked on assessing apps’ use of personal information and drafting related technical guidance documents. He was on a business trip when interviewed by the Caijing reporter: “I have been too busy these days.” Drafting new industry standards, studying operating systems and app stores, handling emergencies in app information security: He Yanzhe feels the pressure and pace of this job at every moment.

“On the one hand, it is a good thing that users’ awareness of privacy protection keeps rising. On the other hand, many people fall into a misunderstanding, as if any collection of personal information were simply wrong. In fact, in many cases an app collects personal information precisely in order to serve its users.”

He Yanzhe said that he personally is not particularly bothered by cross-platform Internet advertising, because he knows the information passed along is a device-level user portrait rather than personally identifiable information such as a phone number, ID number, or address. “We are not going to ban personalized advertising outright. Without it, we could only return to the advertising era of traditional media.”

Recently, Apple’s updated iOS 14 added the ability for users to see which apps access the clipboard. Overseas users discovered that TikTok, the Chrome browser, CNN, Google News, and Starbucks had all been invoking the clipboard.

This once again spooked users worried about privacy leaks. “Reading the clipboard is a perfectly normal function; the clipboard was born for copying and pasting,” He Yanzhe explained to the Caijing reporter. Only when an app reads and uploads clipboard contents while the user is not using it, or while it runs in the background, might this constitute illegal collection and use of personal information.

Worries about personal information security reflect users’ increasingly sensitive nerves, and their lack of both knowledge of and control over their own data.

In September this year, Beijing Global Law Firm, the Nandu Personal Information Protection Research Center, and the Security Research Institute of the China Academy of Information and Communications Technology jointly released the “Personalized Display Security and Compliance Report.” The report evaluated 20 commonly used apps, including Taobao, JD.com, WeChat, Weibo, Kuaishou, and Ctrip; only 5 had a unified tag management system allowing tags to be edited and deleted. Even then, the editable tags were preset by the system rather than generated from personal historical data.

In stark contrast is Google. On Google, users can see the full series of tags the system has generated from their personal preferences, including educational background, family income, marital status, pets, and so on, and can learn what Google may push to them. Users can also turn off personalized pushes with one click.

All these measures, in the final analysis, exist to let users answer three questions: What data does the app hold on me? What is that data used for? And can I refuse to let it use my data?

User demand has forced mobile phone manufacturers to act. In April this year, Xiaomi made privacy protection one of the main selling points of its MIUI 12 system: its “flare” function records all of an app’s sensitive behaviors, including camera, recording, and location access, and users can check the log at any time.

The official release of Apple’s iOS 14, launched in September, has similar functions. In addition, iOS 14 changed the advertising identifier (IDFA) from on by default to off by default; with IDFA off, apps can no longer track user data for precisely targeted advertising.

Zhou Yajin told Caijing that these technical measures make personal information harder for apps to obtain and easier for users to perceive. Changing the underlying operating system, however, may also introduce compatibility problems.

In other words, if an app’s notion of privacy protection stays at the level of a lengthy privacy policy that extracts user permission, and its protection measures are not updated with the times, it may find itself in the awkward position of being unable to keep up with the operating system.

03 Grey areas in privacy protection

From individuals to enterprises, to operating systems and laws and regulations, the importance of personal information protection has reached unprecedented heights. At the same time, the ambiguity of several important issues has made legislation and enforcement more difficult.

First of all, the definition of privacy remains vague. The Civil Code defines privacy as the tranquility of a natural person’s private life, and the private space, private activities, and private information that he or she does not wish others to know.

The recently published draft of the Personal Information Protection Law makes clear that “personal information is all kinds of information, recorded electronically or by other means, that relates to an identified or identifiable natural person, excluding anonymized information. The processing of personal information includes its collection, storage, use, processing, transmission, provision, and disclosure.”

It is worth noting that the scope of personal information excludes anonymized information, which leaves room for the big data industry to develop. Jiang Lin pointed out, however, that different people perceive privacy differently in different scenarios, and each scenario still needs to be analyzed on its own terms.

Second, who owns the data? Does the data a user generates on WeChat belong to the user or to WeChat? Does WeChat have the right to share user data with other companies? From a common-sense standpoint, Zhou Yajin said, the data certainly belongs to the user, but in practice this is difficult to pin down.

Third, how to delineate the scope of data sharing also lacks a settled answer. Zhou Yajin gave an example: if a person uses application A, and applications A and B belong to the same company, does A need the user’s authorization to pass his data to B? From the user’s perspective, he uses only A, not B, so B should not have his data; from the company’s perspective, sharing data between A and B is entirely natural.

A more important issue: the field of information security has a generally accepted rule called the “minimum necessity principle,” meaning apps should collect only the information they need. Yet some companies do exactly the opposite, following the principle of “collect whatever you can.”

In practice, determining what is “minimally necessary” is not just a technical question. Sometimes the judgment can rest on common sense: a takeaway app demanding access to the address book, for instance, is usually unnecessary.

But Zhou Yajin pointed out that if an app needs only a rough location yet obtains a precise address, whether it follows the minimum necessity principle is worth questioning. Then again, the app can claim that the most precise address helps it serve users better.

What’s more, even a simple tool app, such as a weather forecast or a compass, sometimes bundles in several marginal features precisely so that it can openly ask for more permissions.

“Between common sense and norms there are many ambiguities. It is like a judge handing down a ruling: a great deal of expert knowledge still has to be brought in,” Zhou Yajin said.

04 How to balance privacy protection and industry development

The early mobile Internet and big data industries grew wildly, and both enterprises’ and users’ understanding of personal information security was muddled. There is even a view that one important reason China’s Internet industry rose so quickly to become a pole of the global industry was the nourishment provided by vast quantities of user data.

“The big data industry is like a car.” He Yanzhe said, “In the past, the acceleration was too strong and the speed was too high. Now it is necessary to step on the brakes and do maintenance. This is also to run better in the future.”

Zhou Yajin emphasized to the Caijing reporter that although the development of artificial intelligence and big data must be built on multi-dimensional, multi-angle, multi-user data, that does not mean user privacy must be violated. Several interviewees said that technologies such as federated learning and secure multi-party computation can now complete data modeling and improve AI without the underlying data ever being disclosed.

For example, if one company holds users’ credit card data and another holds home purchase data, the two can integrate and match the datasets without either side seeing the other’s user data. This is technically achievable.
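One building block behind such matching is private set intersection (PSI). Below is a minimal, Diffie-Hellman-style sketch of the idea, simulating both parties in one process; it is illustrative only (the record names are invented, and a production protocol would use a proper prime-order group, pad set sizes, and run over authenticated channels), but it shows how two sides can learn only their overlap:

```python
import hashlib
import secrets

P = 2**521 - 1  # a large Mersenne prime serving as the modulus

def h(record: str) -> int:
    """Hash a record (say, a customer identifier) into the group."""
    return int.from_bytes(hashlib.sha256(record.encode()).digest(), "big") % P

def psi(set_a, set_b):
    """Simulate both parties; in reality each keeps its exponent secret."""
    a = secrets.randbelow(P - 2) + 1  # company A's secret exponent
    b = secrets.randbelow(P - 2) + 1  # company B's secret exponent
    # A blinds its records as h(x)^a and sends them to B, which raises
    # them to b and returns them in order, so A can map values back.
    doubly_a = {r: pow(h(r), a * b, P) for r in set_a}  # h(x)^(a*b)
    # B does the symmetric thing with its own records: h(y)^(b*a).
    doubly_b = {pow(h(r), b * a, P) for r in set_b}
    # Exponentiation commutes, so the double-blinded values collide
    # exactly on the records both sides hold.
    return {r for r, v in doubly_a.items() if v in doubly_b}

common = psi({"alice", "bob", "carol"}, {"bob", "carol", "dave"})
print(sorted(common))  # → ['bob', 'carol'] — the overlap, nothing else
```

Neither side ever sees the other’s raw list; each sees only blinded group elements, yet the shared customers are identified.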

But the reality is that many SMEs have no incentive to adopt these cutting-edge technologies. Relatively high technical thresholds, costly investment, and limited penalties are all obstacles to compliance for SMEs.

He Yanzhe did the math for the Caijing reporter: for a domestic company to achieve information security compliance, it must at minimum hire a legal team, purchase security equipment, and conduct the relevant assessments, and every one of these is a cost.

The European Union’s General Data Protection Regulation (GDPR) is currently the world’s strictest data protection regulation. According to a report released in February this year by the privacy management platform DataGrail, 74% of small and medium-sized enterprises spent more than $100,000 on GDPR compliance, 20% spent more than $1 million, and only 6% spent less than $50,000.

In contrast, in China, a small company usually spends less than 100,000 yuan a year on network security, which is far from enough.

“The cost of compliance is too high, which amounts to one more obstacle to small companies’ development. In many cases even survival is a problem; where is the room for compliance?” He Yanzhe said. Inadequate privacy and security protection therefore does not necessarily mean a company is deliberately “doing evil”; it may simply be willing but unable, trusting to luck. But once it crosses a red line, it will inevitably be held accountable.

He Yanzhe’s philosophy is to “help” companies comply as much as possible rather than “force” them to. His App special governance working group has launched a new compliance assessment tool, completely free, that lets apps run online self-assessments and helps small and medium-sized enterprises find compliance problems. “It is not perfect, but it is already very useful for SMEs, and its functions will keep being updated.”

Wu Danjun has long provided legal services to companies on network security and data compliance. She senses that more and more companies are willing to seek a lawyer’s help in advance, or entrust a lawyer with drafting their privacy policy, rather than rectifying problems only after something goes wrong. How to write and display a privacy policy, and the legality of obtaining personal information from third parties, are the most common questions in corporate legal consultations.

The Nandu Personal Information Protection Research Center, where Jiang Lin works, has been conducting personal information protection compliance evaluations of 100 commonly used apps since 2017. She has felt firsthand that taking personal information protection seriously has become a consensus across the Internet industry.

Compared with 2017, the 2019 privacy policy transparency evaluation of the same 100 apps found that the share of companies with high transparency rose from under 10% to over 60%, while the share of unqualified apps dropped from over 80% to 17%. “This is the best performance in all our evaluations.” Jiang Lin stressed that this is likely a good start.