This article is from the official WeChat account QbitAI (量子位, ID: QbitAI); author: Xiao Cha; title image from Visual China.

2020 is destined to be an extraordinary year. Beyond the COVID-19 pandemic, it may also be remembered as "year one of Skynet."

Because this is the year that gave us "the most significant human invention since Bitcoin," a destroyer of graphics cards, and the most versatile AI model in history.

Now you only need to give it an instruction:

Please help design a webpage like XX website to introduce my product.

A few seconds after you type that in, it outputs a polished web design.

Recall that in early June, when the model had just been released, outsiders thought it was "just flexing money" and offered few technical advances over the previous generation.

But once the closed beta opened up, opinion did a 180-degree turn, and programmers who had tried it all admitted: "It really is that good."

Facts have proven that "Microsoft's banknote power (a Chinese pun on 'superpower') plus Nvidia's nuclear-grade GPUs" really can do whatever it wants.

Yes, this is OpenAI's GPT-3, a model nominally meant only for generating text. Within a week, programmers with wide-open imaginations had built more than 30 applications on top of it: database engineer, accountant, ops and maintenance, intelligent customer service… as if it were set on replacing humans.

Now it has become a true "full-stack engineer."

The all-around player GPT-3

On the back end, the AI model GPT-3 has pulled off a neat recursive trick: it can write AI models itself. Just give GPT-3 a concrete requirement:

Build an image classification model that sorts images into 5 categories. The training dataset contains 25,000 images in total, and the input image size is 500×500.

Within seconds, GPT-3 outputs the complete model code:

GPT-3 even writes comments for the code, lest you fail to understand it.
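To give a concrete sense of what such output looks like, here is a minimal sketch of a model matching that prompt (5 classes, 500×500 inputs). It is an illustrative example written for this article, not GPT-3's actual output, and the Keras framing is an assumption:

import tensorflow as tf
from tensorflow.keras import layers

# Illustrative sketch only (not GPT-3's actual output): a 5-class classifier
# for 500x500 RGB images, as described in the prompt above.
model = tf.keras.Sequential([
    layers.Input(shape=(500, 500, 3)),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.GlobalAveragePooling2D(),
    layers.Dense(128, activation="relu"),
    layers.Dense(5, activation="softmax"),  # the 5 output categories
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=10)  # the 25,000 training images would go here
model.summary()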

As an AI that writes AI models, it naturally has to know a bit about databases too, and SQL is no problem at all.

Statistical queries over data are now basically a one-sentence affair: state the requirement, and GPT-3 returns the SQL query code within a few seconds.
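To make this concrete, here is a hypothetical sketch of how one might prime GPT-3 for English-to-SQL using the gpt3-sandbox wrapper that appears at the end of this article; the "orders" table, its columns, and the questions are all invented for illustration:

from api import GPT, Example  # the gpt3-sandbox project's wrapper (see the end of this article)

# Prime GPT-3 with a couple of English-to-SQL pairs (the "orders" table is hypothetical).
gpt = GPT(engine="davinci", temperature=0.3, max_tokens=100)
gpt.add_example(Example("How many orders were placed in July?",
                        "SELECT COUNT(*) FROM orders WHERE MONTH(order_date) = 7;"))
gpt.add_example(Example("Total revenue per customer",
                        "SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id;"))

# Ask a new question; get_top_reply (per the sandbox's API) returns GPT-3's best completion.
print(gpt.get_top_reply("Average order amount by month"))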

Since it is a full-stack engineer, back-end code alone is obviously not enough.

We have already seen GPT-3's ability to design web pages; it can also play the HTML front-end engineer.

GPT-3 can generate a web page's CSS code from a description, and with minor tweaks the result can be used directly in web design.

As for operations and maintenance, GPT-3 knows a thing or two about that as well.

In addition, GPT-3 can also do things that programmers can’t.

Even if you know nothing about finance, just tell it about each of your expenditures and GPT-3 will automatically generate financial statements in the required format. With GPT-3, your boss can save the money that would have gone to an accountant.

It works hard and plays hard, too: after studying game records, GPT-3 has picked up chess, and who knows when it will get to face AlphaGo.

Someone even ran a Turing-style test on GPT-3, and the results read as if a person were answering. The tester said: "If I had asked the same questions ten years ago, I would definitely have assumed the answers came from a human."

Reading this, do you feel a twinge of fear? Is Skynet really about to come online?

Hinton, a father of deep learning and a Turing Award winner, sees even further. He believes GPT-3's future is not merely about replacing humans but about the far more distant stars and seas: perhaps it could be used to crack the ultimate secrets of the universe.

Extrapolating from GPT-3's spectacular performance, he quipped, the answer to life, the universe, and everything is just 4.398 trillion parameters.

GPT-3 and its "banknote ability"

All of this is "the fault of" OpenAI, which recently opened up testing of the GPT-3 API, letting programmers' imaginations run wild and giving ordinary people chills.

GPT-3's super-capability comes partly from the rapid progress of NLP techniques amid fierce competition, but the main reason is that OpenAI is rich enough to go brute force.

Keep in mind that GPT-2 (GPT-3's predecessor), released by OpenAI last year, has only 1.5 billion parameters, while this year's GPT-3 has ballooned to 175 billion, more than 100 times as many!

Such a jump in parameters undoubtedly brings a big improvement in model performance, but the question is where the extra computing resources come from.

For OpenAI, founded less than five years ago, to pull off such a leap with GPT-3 in a single year, the backing of a deep-pocketed sponsor was of course indispensable.

Last year, Microsoft invested US$1 billion in OpenAI, finally giving the previously cash-strapped lab room to develop a more powerful AI model.

GPT-3 uses a two-stage approach.

First, GPT-3 undergoes unsupervised pre-training on a massive corpus. The dataset contains roughly 300 billion tokens, and the training objective is simply to have the model predict the next word.

If the model's prediction is wrong, the error is computed and the model is updated so that it predicts better next time. This process is repeated millions of times, until the model can generate correct sentences.
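As a rough illustration of that next-word-prediction loop (this is not OpenAI's actual training code; the tiny stand-in model, the sizes, and the PyTorch framing are all assumptions made for this sketch):

import torch
import torch.nn as nn

vocab_size, embed_dim = 50257, 128               # toy sizes; GPT-3 is vastly larger
model = nn.Sequential(                           # stand-in for the real Transformer
    nn.Embedding(vocab_size, embed_dim),
    nn.Linear(embed_dim, vocab_size),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

tokens = torch.randint(0, vocab_size, (8, 65))   # a dummy batch of token ids
inputs, targets = tokens[:, :-1], tokens[:, 1:]  # each position must predict the next token

logits = model(inputs)                           # (batch, sequence, vocab)
loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
loss.backward()                                  # compute the error...
optimizer.step()                                 # ...and update the model; repeat millions of times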

This step is by far the most expensive. Google's BERT has "only" about 300 million parameters, yet Nvidia needed a computing cluster of more than 1,400 V100 GPUs running for nearly an hour to train it.

Now consider GPT-3: a pre-training dataset of more than 500 GB and 175 billion parameters. The difficulty of training it can be imagined.

To train GPT-3, Microsoft spent the money to put together a supercomputer.

In May of this year, Microsoft officially announced a supercomputer ranking among the world's top five, built specifically for OpenAI's model training. It has 285,000 CPU cores and 10,000 Nvidia V100 GPUs. (Jensen Huang must have smiled when he saw this configuration.)

With this supercomputer, OpenAI can realize “bolder ideas.”

Some professionals have estimated that training GPT-3 once takes "355 GPU-years" (a single GPU running continuously for 355 years), with training costs alone reaching US$4.6 million.
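The arithmetic behind that figure is easy to reproduce, assuming a cloud V100 price of roughly US$1.5 per GPU-hour (the hourly rate is an assumption, not a number from the estimate itself):

gpu_years = 355
gpu_hours = gpu_years * 365 * 24        # about 3.11 million GPU-hours
price_per_gpu_hour = 1.5                # assumed cloud V100 rate, in USD
cost = gpu_hours * price_per_gpu_hour
print(f"≈ ${cost / 1e6:.1f} million")   # ≈ $4.7 million, in line with the $4.6M estimate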

Under Microsoft's and OpenAI's hard work (read: hard cash), GPT-3 was finally trained.

That is only the first stage; the pre-trained model cannot be used directly for specific tasks. In the second stage, you simply fine-tune the model from the first stage, and it can then handle all kinds of complex NLP tasks.

This second part requires far less computation, an amount ordinary users can afford. That is how GPT-3 came to be used for writing code and doing design work.

The controversial GPT-3

GPT-3 has become an internet celebrity, so heated debate among netizens was of course inevitable.

Some Reddit users believe the arrival of GPT-3 proves that artificial general intelligence is not far off. In their view, GPT-3 has already done everything that needs doing; we don't even need better methods, just a few more years to grow the computing power and expand the dataset another tenfold.

Others feel we are far too optimistic about GPT-3 and that calling it "Skynet" is an exaggeration. GPT-3 is, at heart, an NLP model with the same architecture as GPT-2; the only difference is that it is bigger.

Like other neural network models, GPT-3 is still a black box: we don't know why it makes the inferences it does. Moreover, it only predicts text; it has no logical reasoning and no thoughts, and its ability to generalize beyond the training set is poor.

The code writing mentioned earlier, for example, may owe more to related content on technical forums that GPT-3 has simply reproduced.

Faced with the netizens' praise, OpenAI's CEO has been very cautious. He believes there is too much hype around GPT-3: it still has serious flaws, sometimes makes very basic mistakes, and leaves much room for improvement.

Because GPT-3 is trained on content from the internet, some users have already elicited biased outputs from it. In the current overseas climate, if that kind of negative impact were to spread, it would be a huge blow to OpenAI, and this is probably one reason OpenAI dares not widen the scope of the test.

OpenAI originally chose not to open-source the model because it deemed it "too dangerous." The currently open API is likewise a way of testing the waters: if anything goes wrong, access can be shut off quickly before the problem spreads.

△ OpenAI's official notice: GPT-3 is trained on web content and may produce offensive content.

Because of this, GPT-3 API access is currently a very scarce resource, no easier to obtain than winning the Beijing license-plate lottery. You can only try your luck.

How to apply for a trial

If you too are interested in GPT-3, go to the OpenAI official website to apply for a trial: fill out the application form and wait for the official notice.

If the application succeeds, you will receive an API trial key. Then all you need is Python 3 and yarn installed; follow the instructions of the GPT-3 sandbox project, and you can experience its various "banknote abilities" in a web page.

Here is a piece of code that converts natural language into LaTeX:

from api import GPT, Example, UIConfig  # imports follow the gpt3-sandbox project layout
from api import demo_web_app

# Construct GPT object and show some examples
gpt = GPT(engine="davinci",
          temperature=0.5,
          max_tokens=100)
gpt.add_example(Example('Two plus two equals four', '2 + 2 = 4'))
gpt.add_example(Example('The integral from zero to infinity', '\\int_0^{\\infty}'))
gpt.add_example(Example('The gradient of x squared plus two times x with respect to x', '\\nabla_x x^2 + 2x'))
gpt.add_example(Example('The log of two times x', '\\log{2x}'))
gpt.add_example(Example('x squared plus y squared equals z squared', 'x^2 + y^2 = z^2'))

# Define UI configuration
config = UIConfig(description="Text to equation",
                  button_text="Translate",
                  placeholder="x squared plus 2 times x")

demo_web_app(gpt, config)
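Running this script launches the sandbox's local web demo: a page with a text box (placeholder "x squared plus 2 times x") and a Translate button. Type an English description of a formula, and GPT-3 returns the corresponding LaTeX.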

Now that it is a grown-up AI, it ought to be able to write the Newton-Leibniz formula on its own.

How about it, do you want to try it?

Reference link:

https://blogs.microsoft.com/ai/openai-azure-supercomputer/
https://jalammar.github.io/how-gpt3-works-visualizations-animations/
https://www.reddit.com/r/MachineLearning/comments/hymqof/d_gpt3_and_a_typology_of_hype_by_delip_rao/
https://www.datanami.com/2020/07/21/openais-gpt-3-language-generator-is-impressive-but-dont-hold-your-breath-for-skynet/

GPT-3 application case: https://gpt3examples.com/

GPT-3 sandbox: https://github.com/shreyashankar/gpt3-sandbox

OpenAI API Developer Toolkit: https://www.notion.so/API-Developer-Toolkit-49595ed6ffcd413e93ebff10d7e70fe7

This article is from the official WeChat account QbitAI (量子位, ID: QbitAI); author: Xiao Cha.