A concise history of Internet information distribution

This is a reflection on the past and present of the Internet.

Editor’s note: This article is from the WeChat public account “Yuan Shoujin Laohan” (ID: chairmanJLH), written by 金叶宸.

1/The Internet as a bit mirror of the world

I have discussed a very simple question with many friends: in your eyes, what is the Internet?

To answer this question, you must first answer another one: why did the Internet come into being?

In my opinion, the world is made up of three elements: matter, energy, and information. So what is information?

According to the definition attributed to C. E. Shannon, the founder of information theory, information is a description of the uncertainty in a thing’s state of motion and mode of existence.

The process by which humans acquire information is a process of eliminating uncertainty. Because the concept that measures the disorder and uncertainty of information is “information entropy”, you can also regard acquiring information as a process of reducing information entropy.

Information entropy of binary sources

In plain language: getting information takes you from “not understanding” to “understanding”.
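To make the binary-source entropy concrete, here is a minimal sketch in Python. The function name `binary_entropy` and the sample probabilities are illustrative choices, not something from the original article. It shows that a fair coin flip (maximum uncertainty) carries one full bit, while a source whose outcome is already certain carries none.

```python
import math

def binary_entropy(p: float) -> float:
    """Shannon entropy, in bits, of a source that emits 1 with probability p."""
    if p in (0.0, 1.0):
        return 0.0  # the outcome is certain, so there is no uncertainty left
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# The less predictable the source, the more uncertainty one observation removes.
for p in (0.5, 0.9, 1.0):
    print(f"p = {p}: H = {binary_entropy(p):.3f} bits")
```

Learning the outcome of such a source removes exactly that much uncertainty, which is the sense in which acquiring information lowers entropy.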

In order to survive in this world, our genes have “designed” us as efficient systems for acquiring and processing information. For example, we (like primates in general) have evolved the ability to recognize colors: our eyes respond to light with wavelengths of roughly 380-780 nm, which lets us distinguish foods, find shelter, and sense danger through color.

Of course, this is only a very basic capability for acquiring information. Over millions of years of human evolution, in order to meet the needs of our social organization, we have evolved more sophisticated abilities to acquire and understand abstract information: the ability to process text, pictures, music, and so on.

In short, getting information is important to us. But what does that have to do with the Internet?

Here we need to understand two concepts: the signal and the channel.

Transfer of information

The information publisher (the source) publishes a message, which is carried in some physical form (the signal), passed along through a medium (the channel), and received by the recipient of the information (the sink).

For example, you say a sentence (which contains information); the sentence is converted into a sound-wave signal, travels to me through the channel of air, and I hear it. That is one complete process of information transfer. Along the way there will be “noise” in the channel; noise interferes with the signal, distorts the information, and can ultimately cause the transmission to fail. Therefore every combination of signal form and channel has a physical limit on how far and how reliably it can carry information. For example, if you say a sentence 1 km away, I will probably not hear it.

Therefore, the distance over which a signal can travel without being overwhelmed by interference has a great impact on how efficient a social organization we can maintain.
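The source, channel, noise, and sink chain described above can be sketched as a toy simulation. This is only an illustration under stated assumptions: the channel is modeled as a binary symmetric channel (each bit is flipped independently with a fixed probability), and the names `noisy_channel` and `error_rate` and the noise levels are hypothetical.

```python
import random

def noisy_channel(bits, flip_prob, rng):
    """Binary symmetric channel: flip each bit independently with probability flip_prob."""
    return [b ^ 1 if rng.random() < flip_prob else b for b in bits]

def error_rate(sent, received):
    """Fraction of bits that arrived at the sink different from what the source sent."""
    return sum(s != r for s, r in zip(sent, received)) / len(sent)

rng = random.Random(0)  # fixed seed so repeated runs behave the same way
message = [rng.randint(0, 1) for _ in range(10_000)]  # the source's message as bits

# More noise in the channel means more of the message is corrupted on arrival.
for flip_prob in (0.0, 0.01, 0.2):
    received = noisy_channel(message, flip_prob, rng)
    print(f"noise level {flip_prob}: {error_rate(message, received):.2%} of bits corrupted")
```

A voice fading over one kilometre corresponds to the high-noise end of this range: the signal still leaves the source, but too little of it reaches the sink intact.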

So, in order to keep signals as free from interference as possible and to transmit and obtain information over the greatest possible distance, we have invented all kinds of technologies. To date, the most advanced tool for delivering information is the “Internet”.

If you studied information science at university, the content above is basically the first lesson.

But if we understand the Internet only as a tool, our understanding is not deep enough. The Internet does not deliver information one way; it preserves the real-world ability of everyone to both publish and obtain information, which is why it forms a “network”. This network extracts as much as possible of the objective information and the abstract, subjective information known in the world, maps it onto the Internet, and then transmits it as electrical signals (at the speed of light).

So I often say that the Internet is a bit mirror of the real world (the bit being the unit of information).

And the information in this bit mirror travels at the “speed of light”.

S=VT

We all know this simple formula: distance = speed × time. Simply put, if information travels at the speed of light, then in the same amount of time the radius from which we can obtain information becomes enormous. How enormous? You must have heard of the “global village”. In the decade after the Internet first arrived in China, people especially liked to talk about this concept. It sounds a little dated now, but the image is apt: the Earth has become a village.

This was good at the beginning: the radius of our information expanded, and with computers and the Internet we completed a kind of “evolution”. But we have not really adapted to this change, because for the first time in the roughly 10,000 years since the birth of civilization, the environment humans face has shifted from “information scarcity” to “information overload”.
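As a rough worked example of S = VT, the sketch below compares how long a signal takes to reach the far side of the Earth at the speed of light and at the speed of sound. The figures are assumptions for illustration: an equatorial circumference of about 40,075 km, sound at about 343 m/s in air, and an idealized straight path with no routing or processing delay (real networks are slower).

```python
SPEED_OF_LIGHT_M_S = 299_792_458             # speed of light in a vacuum
SPEED_OF_SOUND_M_S = 343                     # assumed: sound in air at about 20 °C
HALF_EARTH_CIRCUMFERENCE_M = 40_075_000 / 2  # assumed equatorial circumference, halved

def travel_time_s(distance_m, speed_m_s):
    """Rearranged S = V * T: seconds needed to cover distance_m at speed_m_s."""
    return distance_m / speed_m_s

light_ms = travel_time_s(HALF_EARTH_CIRCUMFERENCE_M, SPEED_OF_LIGHT_M_S) * 1000
sound_h = travel_time_s(HALF_EARTH_CIRCUMFERENCE_M, SPEED_OF_SOUND_M_S) / 3600
print(f"light: {light_ms:.1f} ms, sound: {sound_h:.1f} hours")
```

Under these assumptions the light-speed signal arrives in roughly 67 milliseconds, while sound would need around 16 hours even if it could carry that far. That gap is the sense in which the planet shrinks to a village.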

In fact, the development of the Internet can be sorted into two main threads:

There is more and more information on the Internet;
There are more and more people on the Internet.

These two threads reinforce each other and form a closed loop: more and more people create, publish, and obtain more and more information on the Internet, and the more information there is on the Internet, the more people it attracts.

We all know that by 2019 the number of Internet users worldwide was probably just over 4 billion, of whom slightly fewer than 900 million were Chinese netizens. But do you know how much information there is on the Internet?

Data Source: IDC, 2017 Data Age 2025 White Paper

In 2017, IDC (International Data Corporation) carried out a study estimating that there were more than 20 ZB of data on the Internet (1 ZB = 1 trillion GB). At the time, it predicted that this figure would roughly double by 2019, to 40 ZB.

Global Data Volume in 2019

So, as a rough calculation, each netizen is theoretically allotted about 10 TB of information on average (40 ZB divided among some 4 billion users). Of course, information is not really shared out in such an egalitarian way; the division is only meant to make the scale easy to grasp. My point is that even under such egalitarianism, you still could not take in that much information in your short life. What is more, information can be consumed over and over by different people, so the reality is that you must search the entire 40 ZB sea to find the few dozen TB that you will actually need over your lifetime.

When the information you can obtain exceeds what you can process, that is information overload.
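The back-of-the-envelope division can be checked in a few lines. The totals (40 ZB, about 4 billion users) come from the figures quoted above. The 50 TB “lifetime need” is a hypothetical stand-in for “a few dozen TB”, chosen only for illustration.

```python
ZB = 10**21  # bytes in one zettabyte (decimal units)
TB = 10**12  # bytes in one terabyte (decimal units)

TOTAL_DATA_BYTES = 40 * ZB       # estimated global data volume for 2019
INTERNET_USERS = 4_000_000_000   # roughly 4 billion users
LIFETIME_NEED_BYTES = 50 * TB    # hypothetical: what one person actually needs

# Egalitarian share: total data divided evenly among all users.
share_tb = TOTAL_DATA_BYTES / INTERNET_USERS / TB
# Fraction of the whole "sea" that one person's needs represent.
needed_fraction = LIFETIME_NEED_BYTES / TOTAL_DATA_BYTES

print(f"average share per user: {share_tb:.0f} TB")
print(f"needed fraction of all data: {needed_fraction:.2e}")
```

Even with a generous lifetime budget, the part that matters to any one person is on the order of a billionth of what exists, which is why finding it is the hard problem.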

The pioneers of the Internet actually realized this problem very early.

The world’s first website

http://info.cern.ch/hypertext/WWW/TheProject.html

While inventing the World Wide Web (WWW), its father, Tim Berners-Lee, also launched the world’s first website, which used hypertext technology to link together the information of CERN’s laboratories. Thanks to hypertext (what we later came to call links), people could easily browse the aggregated information. Tim Berners-Lee announced the project publicly on August 6, 1991, and this day is regarded as the birthday of the World Wide Web. The website introduced the hypertext specification, details of how to set up a site, how to install and use a browser, and so on. Later it also listed examples of other websites, so it is also considered the world’s first web directory.

From that day on, everyone with access to the World Wide Web (or, as you know it, the Internet) could create a graphical website of their own with a corresponding http URL. Because http URLs use more natural spelling, they were much easier to visit than the earlier FTP addresses.

Of course, although Tim Berners-Lee tried to do some directory work for World Wide Web sites in the era of its birth, the number of http URLs later exploded so dramatically that this work clearly needed dedicated organizations to take it on. From that era, the long road of Internet information distribution began.

We can roughly divide the history of Internet information distribution into four eras, according to the period in which each dominant distribution model prevailed:

Category Index – Portal Age;
Search Engine – Search Age;
Subscription and Following – SNS Age;
Recommendation Algorithm – Feed Age.

Beyond these four eras, there is also a long-standing “Popularity and Recency – Community Hot Posts” model (because of the rise of Baidu Tieba in the search era, this became a very important information distribution model in China). It is important to note that most of the information distribution models born in these eras emerged as the corresponding technological changes were put to use, and behind each was a thorough upgrade of the business model; each such innovation ushered in a new era of the network.

The evolution of these information distribution models shaped the Internet giants of each era. But the models did not completely replace one another; more often, the new distribution model was backward compatible with the earlier ones, and then, through business model innovation, the later leaders ground their predecessors into the dust (or at least, to some extent, broke their predecessors’ dominance).

2/Category Index-Portal Age & Search Engine-Search Age

Although BBS forums and search engine technology were both born before the World Wide Web, productivity applications have to follow the objective laws of how an era develops. The first model to dominate information distribution on the World Wide Web was neither the search engine nor the BBS forum, but the category index built on hypertext technology.

Yahoo! was not the first category index service on the Internet, but it is certainly the best known and most successful.

The first edition of Yahoo!

This website, co-founded by the Chinese-American entrepreneur Jerry Yang (Yang Zhiyuan), quickly defeated its main rivals in the mid-1990s through efficient management and an aggressive marketing strategy, and became the world’s most important web portal (yes, China’s so-called four major portals, Sina, Sohu, NetEase, and Tencent, all started out as Yahoo! imitators). The concept of a “portal” may be unfamiliar to young people today, but for people at the time the word was a vivid description: it was the first stop for most people going online.

People today may find it hard to imagine: more than 25 years ago, when the World Wide Web had just been born and there were no search engines, where was someone looking for “information” supposed to start? A directory website like Yahoo!, classified by hand and entered by hand, was the fastest place to find a website and to learn what new websites had appeared on the Internet. You can think of such a site as the “yellow pages” (a directory of business phone numbers); to some extent, Yahoo! was the yellow pages of the network. (It is no coincidence that Jack Ma (Ma Yun) first set out to build an Internet yellow pages and later took investment from Jerry Yang.)

Human imagination does not come from nothing. We always make guesses about the future based on things that already exist, combined with certain trends. For people who had never seen this future, it was reasonable to design a skeuomorphic product modeled on the existing yellow pages.

Yahoo!’s business model was simple and crude: selling advertising, or more precisely, selling banner ads on the website.

The conspicuous car banner ad above Yahoo!’s search box

To increase revenue at the time, Yahoo!’s method was simple: add more banner ad slots (raise the ad load) on the home page and on the top-level category pages of the directory, and sell each ad at a higher price (raise the ad price). To this end, Yahoo! was willing to pay its ad sales force high commissions, while running large amounts of marketing to motivate the sales team and educate advertisers.

A pile of banner ads on the Yahoo! home page

After its category-index model of information distribution succeeded, early Yahoo! evolved its product rapidly. First, a search function was added in 1995, but this was not the same as the later search engines: it was mainly a way to search the category directory quickly, and its results usually pointed directly to a site’s URL (a detail that matters when we come to search engines). Then, in 1997, Yahoo! launched its mail service, and later expanded into news and information (a business positioned against traditional newspapers).

As the world’s largest portal, web directory, e-mail, and news website of its time, and with the number of Internet users worldwide growing rapidly, Yahoo! once reached a market value of more than 100 billion US dollars. Even today, Google and China’s four major portals still carry a strong sense of Yahoo! déjà vu.

It was against this backdrop that the search engine, which would eventually end the portal era, made its debut.

Strictly speaking, search engine technology was born a year earlier than the World Wide Web. The ancestor of the modern search engine was Archie, created in 1990, a tool for finding files on FTP hosts by file name. In 1994, Lycos, one of the first modern search engines to use spider (crawler) technology, was born (the name comes from Lycosidae, the wolf spider family). Four years later, the Google we are all familiar with was born.

The initial difference between these search engines and the category-index portals lay in how they obtained information about the network. A category index was entered by hand, while a search engine used spider (crawler) programs to crawl information exhaustively. In addition, a hand-built directory mainly recorded a website’s URL, whereas a crawler could fetch each individual web page. For the user, the former meant entering the website and continuing to look for the information there; the latter got there in one step, which was convenient and fast.
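The contrast between a hand-built directory and a crawler can be sketched with a toy example. Everything here is hypothetical: the miniature “web” is an in-memory dictionary rather than real HTTP fetching, and the `example` site names are invented for illustration. The crawler follows links breadth-first, which is one common strategy and not necessarily what any particular search engine used.

```python
from collections import deque

# Hypothetical miniature web: each page URL maps to the URLs it links to.
TOY_WEB = {
    "a.example/": ["a.example/news", "b.example/"],
    "a.example/news": ["a.example/news/1"],
    "a.example/news/1": [],
    "b.example/": ["b.example/about"],
    "b.example/about": ["a.example/"],
}

def crawl(seeds, web):
    """Breadth-first spider: follow links from the seed pages until no new page is found."""
    seen = set(seeds)
    queue = deque(seeds)
    visited_order = []
    while queue:
        url = queue.popleft()
        visited_order.append(url)
        for link in web.get(url, []):
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return visited_order

# A hand-edited directory only records each site's front page...
directory = ["a.example/", "b.example/"]
# ...while a crawler starting from a single seed reaches every individual page.
crawled_pages = crawl(["a.example/"], TOY_WEB)
print(f"directory entries: {len(directory)}, crawled pages: {len(crawled_pages)}")
```

The directory user still has to click into `a.example/` and hunt for the article, whereas the crawler’s index already contains `a.example/news/1` itself.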

Obviously, search engines are far more efficient than category indexes at gathering network information. So why didn’t people use search engines much at first? Because although early search engines could crawl a great deal, they could not accurately “guess” which of the many relevant results was the one you wanted.

Google rose so quickly not merely because it crawled web pages tirelessly or kept its product design simple. Its early victory came from creatively redefining how search results are ranked.

Larry Page, one of Google’s two founders, invented the PageRank algorithm, which measures the value of a page by the number and quality of the links pointing to it. Google’s efforts to optimize the ranking of search results set it apart. In fact, with PageRank together with the Hilltop, HITS, and TrustRank algorithms, and the SandBox used to deal with certain low-quality web pages, Google became the most efficient information distribution company of the years around 2000.
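The idea behind PageRank can be illustrated with a simplified sketch based on the publicly described form of the algorithm, not Google’s production system. The four-page link graph, the function name `pagerank`, and the iteration count are assumptions for illustration; the damping factor of 0.85 is the value commonly quoted for the original formulation.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Simplified PageRank by power iteration.

    links maps each page to the pages it links to; every link target
    must itself appear as a key of the dictionary.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal rank everywhere
    for _ in range(iterations):
        # Every page keeps a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # A page with no outlinks spreads its rank evenly over all pages.
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical link graph: C is linked to by A, B, and D.
toy_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph C ends up on top because three pages link to it, and A comes second even though only one page links to it, because that one link comes from the highly ranked C. This is the “quality, not just quantity, of links” idea in miniature.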

But although early Google was technologically advanced, as a technology company it was in no position to confront Yahoo!, with its huge user base and huge cash reserves, head on. In fact, far from confronting Yahoo!, Google was Yahoo!’s exclusive outsourced search technology provider from 2002 to 2004. With Google’s technology, Yahoo!’s user experience improved quickly and its clicks grew rapidly. The giant Yahoo! was delighted, never realizing that this little brother would one day become its gravedigger.

As Yahoo!’s little brother, Google’s Larry Page had once wanted to sell BackRub, Google’s predecessor, to Yahoo!. And in 2002 Yahoo! tried to buy Google for 3 billion US dollars; had Google not countered with 5 billion, a price Yahoo! found too expensive and walked away from, the history of the world’s Internet would almost have been rewritten (stories like this play out on the Internet again and again).
