This article is from the WeChat public account China Economic Weekly (ID: ChinaEconomicWeekly). Author: Sun Bing. It was published in China Economic Weekly, issue 20, 2019. Header image: Oriental IC.

A recent string of incidents has brought a huge, hidden business to the surface and shown us the less attractive side of big data.

On October 21, Hangzhou police issued an announcement confirming that outsourced collection companies engaged by 51 Credit Card (02051.HK) are suspected of picking quarrels and provoking trouble, among other crimes. The company’s improper use of crawlers to take user data and its abuse of user information in debt collection have come to light. Earlier, 51 Credit Card was among the companies criticized by the Ministry of Industry and Information Technology for collecting personal information without users’ consent.

51 Credit Card Hangzhou Headquarters (Vision China)

More worrying still, 51 Credit Card is not an isolated case. This year, and especially in the last two months, many big data companies, credit reporting companies and Internet finance companies with similar businesses have been investigated. A regulatory storm is coming; 51 Credit Card, which “played with fire,” is not the first to be caught, and it clearly will not be the last.

At the just-concluded 6th World Internet Conference in Wuzhen, the legal protection of data in cyberspace was also an important topic. Government officials, academic experts and representatives of leading companies from around the world shared their views on issues such as “data security, personal information protection and the rule of law in cyberspace” and “the rule of law in data governance,” with the aim of strengthening data risk prevention and building a safe and trustworthy digital world.

When the car was first invented, someone once went to court to demand that cars be banned from the road: they were too fast, the consequences of hitting a pedestrian were unimaginable, and a carriage was quite enough. Of course, this did not stop the arrival of the “car era.” But people did draw up a set of laws and rules, and educated everyone who drives or rides in a car, so that they could enjoy the new world the car brought while avoiding as much of its harm as possible.

Big data may be the “car” that has only just taken to the road in our era. While we enjoy its bright prospects and boundless appeal, this is also the moment to set rules for it. Otherwise it really will hurt people, and the harm may be far greater than we imagine. The whole of society needs to work together to develop a sound set of rules, and everyone may need a big data “driver’s license.”

01 “The big data industry is almost gone”

“The big data industry is almost gone,” a big data industry insider joked on WeChat Moments. Although it is a joke, it reflects the recent tightening of policy and strengthening of supervision on the one hand, and on the other it reveals how serious the industry’s past problems were.

In September of this year, a number of big data companies, including Tianyi Credit Information, Hangzhou Chuangxin Data, Xinyan Technology and Konjac Technology, were inspected. Dozens of companies have been placed on the investigation list, among them many star unicorns valued at a billion dollars. An important reason for the investigations is the use of crawler technology to over-collect, illegally steal and sell personal data. The reporter also found that many big data companies have simply shut down their crawler business, and some have even disbanded the teams.

Earlier, a case in which employees of Data Hall (831428.OC), the “first stock of the big data industry,” sold citizens’ information caused a sensation across the country. Over an eight-month period, the company transmitted more than 130 million pieces of citizens’ personal information per day on average, and the accumulated data came to about 4,000 GB when compressed. After that, another company, Qiaoda Technology, was exposed for selling 800 million resumes…

“This is an industry earthquake of a kind not seen since the big data industry was born in China. It is impossible for the industry to disappear, but a reshuffle is certain,” the insider quoted above told China Economic Weekly. But this is not an earthquake for the big data industry alone. Because big data serves as an “energy industry” in the industrial chain, changes in it may have an impact far more profound than we think.

The insider even assured the reporter: “If anyone really wants to check, no one’s data is 100% clean.”

In fact, the big data industry has been in a state of “savage growth” since its inception. For an emerging industry, sound rules and better supervision take time, but the industry itself has already run far ahead. “Innovations” tinged with gray have emerged one after another, especially in Internet finance, the field that is closest to the money and offers the greatest temptation.

Some people think that China’s Internet industry, especially its financial technology and artificial intelligence, has been able to develop fast enough to overtake Europe and the United States thanks to the richness of its big data. The industry has a standing metaphor: big data is the “oil” and algorithmic power is the “engine.” The “engines” in Europe and the United States are very good, but there is not enough “oil” to fuel them, so they can only run in fits and starts. China lags in algorithmic power but is rich in big data resources, so it can keep running, and run far, even with a weaker “engine.”

However, these rich data resources come partly from China’s enormous “digital” population and partly from the existence of a large amount of gray-area data, obtained at the expense of personal privacy while China’s privacy protection and data security systems remain imperfect.

When users share a moderate amount of their data, they get services that are more convenient, cheaper and better to use, while Internet companies can keep iterating on algorithms, innovating on products and developing faster. But how is the boundary of “moderate” defined? Where should the red line be drawn? How should protecting privacy and controlling risk be balanced against industrial development and encouraging innovation? Too many important questions remain to be answered.

After all, no industry can achieve real growth in chaos.

02 Gray “crawlers” and “streaking” data

The source of much of this data is the crawler. A web crawler (spider) is, in short, a program that automatically fetches data from the web; search engines, for example, rely on this kind of technology. Crawling is not technically difficult, and the technology itself is neither good nor evil. What matters is how it is used: which data may be “crawled” and which may not, whether the user knows about and consents to the crawling, and whether the crawled data is properly encrypted to prevent theft…
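The question of which data may be “crawled” has at least one machine-readable answer: a site’s robots.txt file, in which the operator states which paths automated clients should stay away from. The following is a minimal sketch, not a description of how any company named in this article works. The robots.txt content, the URLs and the crawler name `example-crawler` are all hypothetical. It uses Python’s standard `urllib.robotparser` module to filter a list of candidate URLs before anything is fetched.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary site. A real crawler would
# download this file from the site before requesting any page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /account/
Disallow: /orders/
Allow: /
"""


def build_policy(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a policy object that can be queried."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(policy: RobotFileParser, user_agent: str, urls: list[str]) -> list[str]:
    """Keep only the URLs the site operator permits this user agent to fetch."""
    return [url for url in urls if policy.can_fetch(user_agent, url)]


policy = build_policy(ROBOTS_TXT)

# Hypothetical candidate pages: one public, two holding personal data.
candidates = [
    "https://example.com/news/article-1",
    "https://example.com/account/profile",
    "https://example.com/orders/12345",
]

permitted = allowed_urls(policy, "example-crawler", candidates)
for url in permitted:
    print("may crawl:", url)
```

A check like this only reflects the site operator’s wishes. It says nothing about whether the users whose data appears on a page know about or agree to the crawling, which is the harder question raised above.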

Many Internet companies set up anti-crawler mechanisms to prevent outside crawlers from stealing important information, but as the saying goes, when defenses rise by a foot, attackers rise by ten. Recently, many large companies in China and abroad, including Capital One, the seventh-largest commercial bank in the United States, British Airways, Marriott Hotel Group and China Lodging Group, have suffered customer data leaks, and even Facebook has not escaped.

Users can also install security products and applications to keep personal information from leaking, but leaks are often hard to guard against. In reality, many users have neither the awareness to protect their personal data nor the ability to secure it. Their personal data is simply “streaking,” and some even hand over their data in exchange for the “small favors” that certain companies offer.

The big data industry has long operated in a gray area, and the sources of much of its data are not “clean.” This is no secret. It is just that most people are either unaware or choose to look away for the sake of profit, which leads more and more people to cross the red line.

Two reports from the China Consumers Association are telling. In August last year, its “APP Personal Information Disclosure Survey Report” showed that over 80% of respondents had experienced a leak of personal information. The main causes were app operators collecting personal information and deliberately disclosing it without authorization.

Another report, the “100 APP Personal Information Collection and Privacy Policy Evaluation Report” released in November last year, is even more shocking. Of the 100 apps evaluated, as many as 91 collected users’ personal information excessively. Typical practices include covert collection of user information, misleading users into consenting, compulsory authorization, excessive permission requests, obtaining personal information beyond what users would reasonably expect, and making account cancellation difficult.

Apps’ illegal collection of personal information has attracted regulators’ attention. In January of this year, four departments, namely the Central Network Information Office, the Ministry of Industry and Information Technology, the Ministry of Public Security and the General Administration of Market Supervision, jointly announced a one-year special campaign against apps that collect and use personal information in violation of laws and regulations, and commissioned the establishment of an APP Special Governance Working Group. So far, the working group has received nearly 9,000 reports (counting those found valid after its preliminary verification), involving more than 2,000 apps and more than 800 rectification issues.

In July this year, the Ministry of Industry and Information Technology launched a special campaign to improve network data security protection in the telecommunications and Internet industries. It requires data security inspections of all basic telecommunications companies (including professional companies), 50 key Internet companies and 200 mainstream apps.

At the institutional level, the Central Network Information Office has also drafted a series of documents, including the “Data Security Management Measures,” the “Personal Information Exit Safety Assessment Measures” and the “Basic Specification for Collecting Personal Information in Mobile Internet Applications (APP),” which are currently open for public comment.

The current wave of investigations into big data companies for violations is just the beginning.

03 From advertising to lending, attractive big data business

The big data industry took shape around one major early demand: the precise targeting of advertising. By analyzing users with big data, companies build a “portrait” of each user and identify behavioral traits and preferences. Information platforms, e-commerce platforms and others make personalized recommendations based on big data, which not only improves the user experience but also helps businesses improve the reach and conversion rate of their advertising.

With the rise of Internet finance, user data analysis began to be used for credit assessment, helping financial institutions find suitable borrowers who need loans while also reducing the bad-debt rate. Moving from pushing ads to lending, this application clearly needs data that is more granular and more comprehensive than before, and that sits closer to users’ privacy.

Take the companies that have been investigated as examples. Konjac’s data calls number in the hundreds of millions, and it serves more than 2,000 customers in banking, insurance, consumer finance and Internet finance. Over the eight-month period, Data Hall transmitted more than 130 million pieces of personal information per day, an enormous volume of data.

After the company was shut down, police found that it had illegally obtained the resume information of 220 million natural persons, along with more than 1 billion address books and related data on social, organizational and family relationships. Qiaoda Technology has claimed to hold “cognitive data” on more than 800 million natural persons, which means that the information of more than half of China’s population sits in Qiaoda Technology’s database.

Was this data obtained and used properly? That is hard to ensure, both in theory and in practice. Even more frightening, once such fine-grained private information leaks, the harm goes well beyond harassing calls, spam text messages and scams. The recent spate of violent debt collection and predatory lending is mostly related to leaks of private data. The disclosure of personal information therefore endangers not only individuals’ personal and property safety but possibly public safety as well.

For example, some online lending companies steal personal information with crawlers or buy it, analyze users’ spending power, exact home address and social relationships, and then use cash loans as bait to lure victims into high-interest traps, resorting to violent collection when they cannot repay.

Some big data companies provide “location” services for online lending companies. Borrowers will be found even if they flee to the ends of the earth and change their names. If the lender cannot find you, it can find your family and friends, threaten and intimidate them, and force you to repay the loan’s exorbitant interest. There have been several cases in which college students fell into “trap loans,” a debt of a few thousand snowballed into millions, and in the end they were driven to suicide by harassment and threats.

Even when data sources are compliant, the use of big data profiling has raised “ethical issues” in recent years. Practices such as “big data price discrimination against loyal customers,” “different prices for the same room” and “red envelopes that vary by person” are all controversial. The methods once used to serve you precisely are now used to “fleece” you precisely; the one who knows you best is the one who hurts you most.

Because financial institutions and Internet finance platforms earn far more than the advertising industry, the big data companies that serve them also earn more, which makes this type of data more valuable. With such profits at stake, some people have started getting ideas, and black- and gray-market operators are also eyeing this enticing data business.

The reporter has learned that some small and medium-sized banks and financial institutions, especially some Internet finance companies, have not accumulated enough user data of their own, so they rely on third-party data companies such as Konjac for credit and risk-control services. Whether those companies’ data sources are legitimate or not, they either do not know or do not want to know.

Some big data companies not only develop Alipay crawlers, WeChat crawlers, carrier crawlers and the like to pull data from large platforms that are rich in user data, but also plant crawlers on users’ mobile phones through malicious SDKs to steal their data. The leak of biometric information is especially harmful. A name, mobile phone number, bank card or password can be changed right away, but fingerprints, irises and facial data cannot, so the hidden danger remains, unseen, after they are stolen.

04 Why is the EU launching a GDPR that “blocks” technological innovation?

The issue of data privacy is not confined to China; it has become a global one, and the strongest response has come from Europe, whose culture places greater weight on personal privacy.