This article comes from the WeChat public account Programming Technology Universe (ID: xuanyuancoding), author: Regulus wind; title image: Vision China

Remember me? I'm Ah Q, from Workshop No. 1 of the CPU.

I wasn't too busy today, so I dropped by the factory's Address Translation Department, where Xiaohei, who is in charge of this work, was so busy he was dripping with sweat.

Seeing my arrival, Xiaohei pointed to the seat next to him and motioned me to sit down.


After I had sat for a while, Xiaohei turned around from his workstation. "I'm really sorry, Ah Q. I have too much work today and no time to entertain you."

"What's keeping you so busy? You're sweating buckets," I asked.

"Ugh, don't ask. I keep running into memory page faults and have to keep telling the operating system to deal with them. I really miss the old days, when there weren't so many tiresome things to manage," Xiaohei sighed.

That piqued my interest. "Xiaohei, tell me about your work. What is address translation, and why do you miss the old days?"

Xiaohei shifted in his seat, gulped down a few mouthfuls of water, and said, "It's a long story."

Next, Xiaohei began to tell me historical stories ...

8086

It turns out that our ancestor was called the 8086, and Xiaohei showed me its picture.


It was an era of innocence and simplicity. Performance was not high, but the programs of that era were very simple. Our ancestor became a star as soon as it came out, truly the top performer of its time.

Did you see the metal pins in the photo? Those are the tentacles that connect our CPU to the outside world, and each one has a different role.

Through these tentacles, the CPU can work with the memory, get instructions and data, and work hard.

Conditions were fairly poor in those days: whatever could be saved was saved, and whatever could be shared was shared. See, the address bus pins and the data bus pins of the ancestor CPU are shared.

The ancestor is a 16-bit CPU. Its data bus is 16 bits wide, so it can transfer 16 bits at a time. Those pins double as address lines, which is why they are named AD0-AD15.

But the ancestor has more than 16 address lines: there are 4 more, A16-A19! That makes 20 address lines in total, enough to address 1MB of memory!

But the ancestor's registers are all 16 bits wide and can only hold 16-bit addresses. The ancestor was clever, though, and invented a method called segmented memory management, which divides memory into blocks of up to 64KB. Why 64KB? Because a 16-bit address can reach no further than that. A few segment registers were then added to point to the start of these blocks. The segment address is shifted left by 4 bits and the offset within the segment is added to it, giving a 20-bit address that can reach the whole 1MB.
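
To make the ancestor's trick concrete, here is a minimal Python sketch of the segment-plus-offset arithmetic. The names `physical_address` and `ADDRESS_LINES` are invented for this illustration; the sketch assumes the 8086 behavior of shifting the segment left by 4 bits and wrapping any result that does not fit on the 20 address lines.

```python
ADDRESS_LINES = 20                    # AD0-AD15 plus A16-A19
ADDRESS_MASK = (1 << ADDRESS_LINES) - 1
SEGMENT_SIZE = 1 << 16                # a 16-bit offset reaches 64KB


def physical_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shift the segment left by 4 bits (multiply by 16), add the offset,
    # and keep only the 20 bits that fit on the address lines.
    return ((segment << 4) + offset) & ADDRESS_MASK


print(1 << ADDRESS_LINES)                      # bytes reachable with 20 lines
print(hex(physical_address(0x1234, 0x0010)))   # segment 0x1234, offset 0x10
```

With 20 address lines there are 2^20 = 1,048,576 distinct addresses, which is the 1MB mentioned above, and each segment covers at most `SEGMENT_SIZE` bytes.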

32-bit era

Later, the ancestor's computing power became more and more stretched, and it really could not keep up with the times. The younger generation of the family began to shoulder the load: the 80286 and 80386 CPUs came out one after another, and the 80386 in particular was epoch-making.


In the 80386 era, we had more pins for communicating with the outside world and became 32-bit CPUs. Living conditions had improved by then, and the address lines and data lines no longer shared pins.


Later, humans became more and more greedy: they wanted to listen to music and surf the Internet while editing documents, which requires running multiple programs at the same time.

Someone spotted a business opportunity and developed a thing called the operating system. From then on, programs no longer dealt with us CPUs directly but with the operating system, and the operating system dealt with us. It is the proverbial middleman taking a cut!

The operating system is very clever. It has us execute multiple programs in turn using time slices: music playback for a moment, the browser for a moment, the document editor for a moment. We don't care what the code is and we aren't picky; we just work hard. Humans react far more slowly than we do, so they believed these programs really were running at the same time.
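
The turn-taking described here can be sketched as a simple round-robin loop. This is only an illustration under stated assumptions: the function `run_round_robin`, the "work units", and the program names are all made up, and a real operating system scheduler is far more elaborate.

```python
from collections import deque


def run_round_robin(programs, time_slice):
    """Run each program for at most `time_slice` units per turn.

    `programs` maps a program name to its remaining work units.
    Returns the order in which the CPU executed them as (name, units) pairs.
    """
    queue = deque(programs.items())
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        ran = min(time_slice, remaining)
        timeline.append((name, ran))
        if remaining > ran:
            # Not finished yet: go to the back of the line for another turn.
            queue.append((name, remaining - ran))
    return timeline


timeline = run_round_robin({"music": 3, "browser": 5, "editor": 2}, time_slice=2)
for name, ran in timeline:
    print(f"CPU runs {name} for {ran} unit(s)")
```

Because each turn is short, every program makes progress quickly, which is why it looks simultaneous to a human.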

Virtual Memory

But then a big problem appeared. With so many programs to run and everyone squeezed into the same memory, friction and conflicts were frequent.


The ancestors racked their brains over this and finally came up with a good solution, one that is still in use today.

They came up with something called the virtual address. Every address a program uses is a virtual address, and when it is time to deal with memory, the staff inside our CPU translate it into a real memory address. The memory guy has been kept in the dark about all this.

In this way, each program can use a full 4GB address space, from 0x00000000 to 0xffffffff. Of course, no program is really given that much space (the memory guy has only 4GB in total); instead, memory has to be requested and allocated. Allocation is done in units of pages, and a page on a 32-bit CPU is 4KB. The hard work of managing these allocations is left to the operating system; the middleman can't just take a cut without doing any real work. As for us CPUs, we just do the address translation.


For this reason, a new register, CR3, was added inside us, pointing to a lookup dictionary for address translation. The dictionary has two levels of directories. We split a 32-bit address into three parts: the first two parts index entries in the two directory levels, which locates the physical memory page the address belongs to, and the last part is the offset within that physical page. That completes the address translation.
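
The two-level lookup can be sketched in a few lines of Python. Two things here are assumptions added for illustration: the 10/10/12-bit split of the address (the usual layout for 32-bit x86 paging with 4KB pages) and the use of nested dictionaries to stand in for the tables that CR3 points to. The function names are hypothetical.

```python
PAGE_SIZE = 4096  # 4KB pages, so the offset needs 12 bits


def split_virtual_address(vaddr):
    """Split a 32-bit virtual address into (directory index, table index, offset)."""
    directory_index = (vaddr >> 22) & 0x3FF  # top 10 bits
    table_index = (vaddr >> 12) & 0x3FF      # middle 10 bits
    offset = vaddr & 0xFFF                   # low 12 bits
    return directory_index, table_index, offset


def translate(page_directory, vaddr):
    """Walk the two directory levels and return the physical address.

    `page_directory` maps a directory index to a page table, and each page
    table maps a table index to the base address of a physical page.
    """
    directory_index, table_index, offset = split_virtual_address(vaddr)
    page_table = page_directory.get(directory_index)
    if page_table is None or table_index not in page_table:
        raise LookupError(f"page fault at {vaddr:#010x}")
    return page_table[table_index] + offset


# Stand-in for the dictionary CR3 points to: one mapped page.
cr3 = {0x001: {0x002: 0x0008_0000}}
vaddr = (0x001 << 22) | (0x002 << 12) | 0x345
print(hex(vaddr), "->", hex(translate(cr3, vaddr)))
```

An address with no entry in the dictionary raises `LookupError` in this sketch, standing in for the page faults Xiaohei complains about.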

Each process has its own address space. When switching between processes, we just change the contents of CR3 and use the new process's translation dictionary, which is particularly convenient.
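
A small sketch shows why swapping CR3 is enough to separate processes. For brevity it flattens the two directory levels into a single dictionary from virtual page number to physical page number, which is a simplification; the `Cpu` class and the two example mappings are hypothetical.

```python
PAGE_SIZE = 4096


class Cpu:
    """Toy CPU that translates addresses through whatever CR3 points to."""

    def __init__(self):
        self.cr3 = None  # translation dictionary of the running process

    def switch_to(self, page_map):
        # A process switch only has to repoint CR3.
        self.cr3 = page_map

    def translate(self, vaddr):
        page_number, offset = divmod(vaddr, PAGE_SIZE)
        return self.cr3[page_number] * PAGE_SIZE + offset


# Both processes use virtual page 0x10, backed by different physical pages.
music_map = {0x10: 0x200}
browser_map = {0x10: 0x300}

cpu = Cpu()
vaddr = 0x10 * PAGE_SIZE + 0x2A

cpu.switch_to(music_map)
music_phys = cpu.translate(vaddr)

cpu.switch_to(browser_map)
browser_phys = cpu.translate(vaddr)

print(hex(vaddr), "->", hex(music_phys), "for music,", hex(browser_phys), "for browser")
```

The same virtual address lands in different physical pages depending on which dictionary is active, so the two programs never step on each other.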

We call this memory management method paged memory management.

I really admire the wisdom of the ancestors for separating the programs so cleverly. Later we called this working mode protected mode, and the earlier working mode that uses real memory addresses is called real address mode.

Page swapping

Humans became more and more greedy: more and more programs, demanding more and more memory. As these programs kept requesting memory pages, memory would soon run out.

We watched this anxiously and then talked with the operating system about what to do.

The operating system guy is pretty sharp and came up with a good idea. Memory is limited, but the hard disk is impressive: it has far more space. So an area is set aside on the hard disk, pages that have not been used for a long time are moved out of memory into it, and each one is marked. If anyone later tries to access such a page, our CPU sees the mark and raises a page fault interrupt to tell the operating system to swap the page back in.

Through our cooperation, the memory shortage crisis was solved. Later we called this technique page swapping.
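
The cooperation can be sketched as a toy memory with a handful of page frames. This is an illustration only: the `TinyMemory` class is invented, evicting the least recently used page is an assumption about how "not used for a long time" is decided, and a plain dictionary stands in for the swap area on the hard disk.

```python
from collections import OrderedDict


class TinyMemory:
    """Toy physical memory with a fixed number of page frames."""

    def __init__(self, frames):
        self.frames = frames
        self.resident = OrderedDict()  # page -> data, least recently used first
        self.swap_area = {}            # pages moved out to the "hard disk"
        self.page_faults = 0

    def access(self, page):
        if page in self.resident:
            # Hit: remember that this page was used just now.
            self.resident.move_to_end(page)
            return self.resident[page]
        # Miss: the CPU raises a page fault and the operating system steps in.
        self.page_faults += 1
        if len(self.resident) >= self.frames:
            # Memory is full, so swap out the page unused for the longest time.
            victim, data = self.resident.popitem(last=False)
            self.swap_area[victim] = data
        # Swap the page back in, or create it on first use.
        data = self.swap_area.pop(page, f"contents of page {page}")
        self.resident[page] = data
        return data


mem = TinyMemory(frames=2)
for page in [1, 2, 1, 3, 2]:
    mem.access(page)

print("page faults:", mem.page_faults)
print("in memory:", list(mem.resident), "on disk:", list(mem.swap_area))
```

In this sketch the first touch of a page also counts as a page fault, since the page has to be brought into memory either way.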

Now

Time flies. In our generation, memory has grown much larger: 16GB is nothing special, and 32GB is also very common.

In addition to memory, our CPUs themselves are more advanced. To say nothing else, just look at how many more pins we have now than our ancestors' generations did.


We have gone not only from 32-bit to 64-bit, but also from single-core to multi-core. The CPU I live in, for example, has 8 workshops, that is, 8 cores running in parallel.

Easter eggs

While I was chatting with Xiaohei, Old K from our workshop suddenly appeared at the door.

"Ah Q, so this is where you are. I've been looking all over for you. Hurry back: Huzi from Workshop No. 2 next door says we changed their data and has come to our door to make trouble ..."

To find out what happens next, stay tuned for the next installment ...
