This article comes from the WeChat public account "I Think Pot I Am" (ID: angelplusdevil); author: I Think Pot I Am GN.

Introduction: "Zero to One," regarded as the Silicon Valley entrepreneurship bible, says that monopoly enterprises have their own barriers, which come down to: proprietary technology, network effects, economies of scale, or brand advantages. Megvii, regarded by the capital markets as "the first AI stock," is about to list on the Hong Kong Stock Exchange. Its prospectus opens with a letter from CEO Yin Qi to investors, which states at the very beginning that deep learning is Megvii's "core competitiveness."

Deep learning? That seems inconsistent with the four barriers named in the book. So what are the core competencies of the AI companies Megvii represents? To understand this, you must first understand what their business model looks like. Several questions immediately come to mind:

• Is an AI company a software company? Is it a SaaS vendor, a PaaS vendor, or a traditional software vendor?


• Is an AI company a solution provider? A systems integrator? Or an outsourcing firm?


• Where are the barriers of AI companies? Are they really the AI technology represented by "deep learning"?


In 2016, AlphaGo ushered in the so-called "first year of artificial intelligence," but only two or three years later the talk became "investors are fleeing artificial intelligence." The questioning focuses on little more than technical breakthroughs hitting "bottlenecks" and business models remaining "unclear."

Yin Qi's sentence reads like a response to the first point, but the second question requires complete data support and rigorous logical analysis. Judging directly from where capital flows and from subjective feelings is, I think, speculation. The rest of this article therefore aims to walk through the questions above one by one, and finally unlock a more central topic:

Similar to SaaS's revolution against traditional software: do the AI companies Megvii represents really have a disruptive innovation in their business model?

Whether or not the answer turns out to be correct, only by understanding the problem can we objectively judge the core indicators, competitiveness, and future of AI companies. If you read to the end, there is an easter egg waiting.

What is an "AI company"?

First, we need a basic definition of the AI companies Megvii represents. Here we refer specifically to companies that independently research and develop artificial intelligence as a native, irreplaceable technology, and that have expanded, or are expanding, into vertical industries and formed related product or solution businesses.

To make this definition more concrete, refer to the "Artificial Intelligence: A Way to Win the Future" report released by Alibaba Cloud at the 2016 Yunqi Conference; the industry has reached the following consensus on the AI industry chain:

At the basic layer, established Internet companies and chip makers have an obvious first-mover advantage. Most domestic AI startups therefore cut in from the technology layer or the application layer, and as technology accumulates and business expands, the boundary between those two layers is gradually blurring. Overall, though, startups follow two development paths:

• Take a scene (such as face recognition) as the breakthrough point: obtain data by connecting to corporate customers' internal systems or through self-built scene entrances such as sensors, continuously train models and optimize algorithms on multi-dimensional data, find the best solution to a problem in one scene, and then replicate it to similar scenes in other industries;


• Take a general technology (such as machine vision) as the breakthrough point: cultivate algorithms and underlying frameworks deeply, especially now that machine learning is accepted by the industry. Driving model training from the bottom up can not only improve the universality and computational efficiency of solutions across different scenes, but also ultimately improve real-world application results.


The former has a deep understanding of the scene, which makes it easy to reach customers and accumulate data, so its products are more readily accepted and its monetization ability is strong. The latter hopes to use the advantages of algorithms and the underlying framework to reach more industries efficiently through partners, obtaining data through open cooperation; during this period it may not reach customers directly, so its coverage is wide but its monetization ability is weak.

We will not discuss which path is better for now. What needs to be agreed is this: the "AI companies" I am talking about also cut in from the technology or application layer, grow along one or more of the development paths above, and treat AI technology as a core that cannot be replaced by other computing methods.

For fairness, I select AI companies recognized by the capital markets and the industry, along with their public data. Last July, ArcSoft, which mainly serves mobile phone cameras, registered on the STAR Market; in August, Megvii Technology, one of the "four little dragons of CV (computer vision)," submitted an IPO prospectus to the Hong Kong Stock Exchange; recently, the A-share listed company iFlytek, whose core technology is speech recognition, also reached its highest market value in two years.

I cannot think of a more appropriate time to think through the questions raised above.

Megvii's fundamentals are also the truth about AI

Before any in-depth analysis, I attach great importance to understanding a company's history and development milestones. This not only reveals the founders' original intention but also shows the driving forces behind the company's breakthroughs; on that basis one can make independent, objective judgments about the company's strategy, risks, and development goals.

(Source: public information, prospectus)

From public sources I learned that back in 2013, Megvii made several consumer-side attempts at face-recognition unlocking apps, and even built an entertainment app called "Face Master." Amid the mobile Internet entrepreneurship boom of the time, that was understandable. It was not until 2015, when the "Smile to Pay" project with Ant Financial launched, that the company's road to commercialization and deepening into industries officially opened. Megvii's development resembles the second path described above, but interestingly, in the second year after its founding the company opened the Face++ platform to the outside world, which seemed to foreshadow its future technical route and strategic direction.

Megvii's business and business model

Based on the prospectus, I have sorted out the company's products, commercialization, and business model in detail as follows:

Combined with the relevant financial data as of June 30, 2019 (2019H1), we can get an initial grasp of Megvii's operating status.

(The chart above compares data source costs in the industry SaaS segment as proportions of total revenue and of total cost; the chart below shows the breakdown of cost of sales by type)

From the charts above, we can observe:

• Compared with cloud service costs, data source costs make up the bulk of the industry SaaS cost of sales, close to 60%. Since the average proportion for full-year 2018 was larger than for the first half of 2018, we have reason to believe the full-year 2019 average will also be higher than the first-half 2019 figure, and the proportion has been rising over the past two years;


• As a proportion of industry SaaS revenue, data source costs do show a downward trend. However, the to-B business generally peaks in the second half of the year; referring also to how data source costs behaved in 2018, even though the ratio dropped to 7% in the first half of 2019, that does not mean the full-year 2019 share will fall significantly;


• Most current industry solutions use face recognition as the typical scene. As recognition technology extends to other targets (human bodies, objects, text, and so on), it is naturally necessary to purchase rich third-party data for model training, so the proportion of data source costs may not keep declining year by year. (A back-of-the-envelope illustration follows this list.)
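To make the margin sensitivity concrete, here is a back-of-the-envelope sketch in Python. Every figure is hypothetical, not taken from the prospectus; the point is only the mechanism by which a dominant data-source cost share turns swings in data purchasing into swings in gross margin.

```python
# Hypothetical illustration (no figures from Megvii's prospectus): when data
# sources dominate the cost of sales, swings in data spend hit gross margin
# almost one-for-one.

def gross_margin(revenue: float, cost_of_sales: float) -> float:
    """Gross margin = (revenue - cost of sales) / revenue."""
    return 1 - cost_of_sales / revenue

revenue = 100.0      # hypothetical segment revenue
other_costs = 8.0    # hypothetical non-data cost of sales (cloud, labour)

for data_cost in (7.0, 12.0, 18.0):   # hypothetical data-source spend
    cost_of_sales = other_costs + data_cost
    share = data_cost / cost_of_sales
    print(f"data spend {data_cost:5.1f} -> {share:.0%} of cost of sales, "
          f"gross margin {gross_margin(revenue, cost_of_sales):.0%}")
```

In this toy setup, moving data spend from 7 to 18 drags gross margin from 85% down to 74%, which is the kind of fluctuation the conclusions at the end of this article worry about.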


Let us put this finding together with the first question we set aside earlier:

How do you prevent AI from becoming a cost center for the enterprise? The key lies in productization ability: how to produce "AI" the way Ford once invented the efficient assembly line, that is, to produce algorithms at scale while reducing data source costs and computing power consumption (cloud service costs) as much as possible.

Cloud service costs are relatively controllable. But data sets are the necessary input for AI algorithm optimization: can their marginal cost really decrease indefinitely? As with SaaS or Internet products, do data sets really have a "network effect"?

The answer is: not necessarily, especially in applied AI scenarios.

The "network effect," in popular terms, means the more people use a product, the greater its value; think of social software. In AI scenarios, a data "network effect" would mean: the more data, the higher the quality of the trained algorithm, reflected in better recognition or higher accuracy, and ultimately greater commercial value in real applications. Then, as more and more scenes adopt high-quality algorithms, the acquisition cost of data sets should become lower and lower.

But is all this actually the case?

A16Z, the well-known US venture capital firm, wrote an article called "The Empty Promise of Data Moats." It notes that although data does show the "network effect" described above under most conditions, algorithms trained on massive data can usually reach 50% accuracy or even higher fairly quickly, but pushing beyond that becomes very difficult. The article cites an example of applying natural language processing (NLP) to a credit card center's intelligent customer service:

The blue line represents individual user requests (such as "I want to update my personal information"), and the red area represents accumulated requests. The study found that when the training data (contributed by the customer service center) accumulated to 20%, it could basically cover 20% of user requests. But as the chart shows, the curve gradually flattens: the more data used for training, the less the scene coverage on the horizontal axis grows with it, eventually settling at about 40%. The intelligent system will never be able to handle the remaining 60% of customer calls.
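This plateau is easy to reproduce with a toy simulation. The sketch below assumes, as the a16z example implies, that user requests follow a long-tailed distribution; none of the numbers come from the article itself.

```python
# Toy simulation of the coverage plateau: requests follow a Zipf-like long
# tail, so extra training data keeps running into intents it has never seen.
import random

random.seed(0)

NUM_INTENTS = 10_000   # distinct request types (hypothetical)
weights = [1 / (rank + 1) for rank in range(NUM_INTENTS)]  # Zipf-like tail
requests = random.choices(range(NUM_INTENTS), weights=weights, k=100_000)

train, live = requests[:50_000], requests[50_000:]
for pct in (5, 10, 20, 40, 80, 100):
    seen = set(train[: 50_000 * pct // 100])  # intents covered by training data
    covered = sum(r in seen for r in live) / len(live)
    print(f"train on {pct:3d}% of corpus -> covers {covered:.0%} of live traffic")
```

Coverage climbs quickly at first and then crawls: doubling the training data no longer doubles the share of live traffic handled, which is exactly the shape of the curve described above.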

Combining this research with basic knowledge of AI, I think the main reasons are as follows:

• Model training requires machine learning or even deep learning. Because the multi-layer neural networks used in deep learning are still "black boxes," it is difficult for engineers to thoroughly understand the internals of a deep network, or to fully grasp the relationships among inputs, outputs, and parameters, so tuning and optimization is a very slow process;


• Deep learning requires large amounts of data or samples, and the larger the data volume, the higher the quality demanded; the extreme cases (corner cases) in real scenes need to be covered as much as possible. For example, in a customer service scenario a user asks, "How much did I spend in the school cafeteria the afternoon before last, before buying a cappuccino at Starbucks?" The problem sets supplied by third-party data companies can rarely reach that level of granularity. To optimize the algorithm further, the extreme cases must be collected within the system or from real scenes.


So when an AI company first enters a new scene, it needs at least a minimum viable corpus to assemble algorithms for the basic scene, and then collects as many extreme cases as possible for continuous iteration. Two problems lurk behind this (a schematic sketch follows the list):

• Data acquisition costs will rise as the algorithm is upgraded;


• Meanwhile, the data is likely to go stale, and old data needs to be removed or relabeled in time.
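A schematic version of that iteration loop, with hypothetical costs and names (this is not Megvii's pipeline), shows how the two problems compound:

```python
# Schematic data-iteration loop (hypothetical numbers): start from a minimum
# viable corpus, keep buying corner cases at a rising unit cost, and retire
# samples that have gone stale.
from dataclasses import dataclass

@dataclass
class Sample:
    label: str
    age_days: int = 0

corpus = [Sample("base") for _ in range(1000)]   # minimum viable corpus
spend, unit_cost = 0.0, 1.0

for round_no in range(1, 6):
    for s in corpus:                              # one quarter passes
        s.age_days += 90
    corpus = [s for s in corpus if s.age_days <= 300]   # drop stale data
    corner_cases = [Sample(f"corner-r{round_no}") for _ in range(200)]
    corpus += corner_cases
    spend += unit_cost * len(corner_cases)
    unit_cost *= 1.5   # assumption: rarer corner cases cost more each round
    print(f"round {round_no}: {len(corpus)} samples, cumulative spend {spend:.0f}")
```

Each round buys fewer truly new cases per yuan while earlier batches expire, so spend accelerates even as the usable corpus grows slowly.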


Also, as algorithm iteration stretches on, server costs (usually cloud) grow non-linearly on the one hand, and on the other hand human intervention in data processing likely becomes necessary.

The staff structure in Megvii's prospectus also shows that "data annotation" personnel account for 17% of the company's total headcount, second only to the R&D team.

So how do you solve the various cost problems caused by the gradually failing "data network effect" in AI?

Megvii's answer is the deep learning platform "Brain++", built on automated machine learning (AutoML), and the data management and annotation platform "Data++".

According to the Megvii Research Institute, Brain++ has developed into a base layer supporting algorithm R&D, consisting of three modules: the deep learning framework MegEngine, the deep learning cloud computing platform MegCompute, and the data management platform MegData (which later became Data++), corresponding to the three major elements of AI. Its functions range from data acquisition, cleaning, and preprocessing, through algorithm architecture design, experimentation, training environments, training, tuning, and model evaluation by researchers, to final model distribution and deployment, with particular emphasis on several unique advantages:

• Customized optimization for computer vision tasks, especially large-scale image or video training tasks;


• AutoML technology automatically designs deep neural networks to automate the production of algorithms, so researchers can use minimal manpower and time to customize algorithm combinations for the needs of fragmented vertical fields, including the "long tail" (i.e., extreme cases);


• Intelligent deployment of infrastructure, data storage, and computing supports multi-user, multi-task operation, improves training efficiency, and indirectly reduces cloud service costs;


I am not an AI expert, and I will not discuss Brain++'s performance and technology here. But the earlier questions have gradually been answered. The reason Megvii attaches such importance to investing in Brain++ and calls it the company's "core competitiveness" is that, to me, the deep learning framework looks like an operating system: it helps researchers find near-optimal solutions as automatically as possible, given different scene applications, differences in terminal hardware, and the expected return on investment. Meanwhile, the semi-automatic data processing and annotation supported by Data++ lets multiple people work on the same data set for training at the same time, again with the goal of fundamentally reducing bandwidth and annotation costs. A minimal illustration of the AutoML idea follows.
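Brain++'s internals are not public, so the sketch below is only a minimal illustration of the AutoML idea it reportedly builds on: searching over candidate network designs automatically instead of hand-tuning each one. The search space, the dataset (scikit-learn's bundled digits), and the hyperparameters are all stand-ins chosen for illustration.

```python
# Minimal AutoML-style sketch: random search over tiny network designs.
# An illustration of the idea only, not Megvii's Brain++.
import itertools
import random

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# Hypothetical search space: network shape and learning rate.
space = list(itertools.product(
    [(32,), (64,), (64, 32)],   # hidden-layer configurations
    [1e-3, 1e-2],               # initial learning rates
))

random.seed(0)
best_config, best_score = None, 0.0
for hidden, lr in random.sample(space, k=4):          # random search
    model = MLPClassifier(hidden_layer_sizes=hidden, learning_rate_init=lr,
                          max_iter=300, random_state=0)
    score = model.fit(X_train, y_train).score(X_val, y_val)
    if score > best_score:
        best_config, best_score = (hidden, lr), score
    print(f"hidden={hidden}, lr={lr}: validation accuracy {score:.3f}")

print("selected design:", best_config, f"({best_score:.3f})")
```

Real neural architecture search replaces the random sampler with learned or evolutionary strategies, but the contract is the same: feed in a search space and a metric, get back a design, with far less human tuning per vertical scene.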

This also helps explain why Megvii released its robot operating system "Hetu" early last year, which we will return to below.

From Hikvision's core financial data in the chart above, we can glimpse this traditional giant's strategy in recent years. The company proposed "AI Cloud" in 2018; its core is an end-to-end computing architecture covering multi-dimensional data collection, intelligent analysis, back-end algorithm iteration, and resource scheduling. Looking back at the earlier discussion of Brain++, I believe everyone can easily understand the core concept of "AI Cloud" and the product form shown in the figure below.

For Megvii, if Brain++ is its technically valuable sword when compared with similar AI machine vision companies, then in commercialization a structured underlying system may be what lets it have the last laugh in a protracted battle with the giants. The logic resembles Brain++'s own development: only an open system platform can be compatible with both the existing market and new requirements. Especially for enterprise customers who will inevitably comparison-shop, being able to connect to their existing equipment and related software while delivering better analysis results greatly lowers the threshold for customers and partners. Megvii calls this underlying system "platform software."

This picture reminds me not only of the "middle platform" (zhongtai) concept, but also of a concept proposed by Greylock, the venture firm behind star enterprise-service companies such as Cloudera and Docker, called "Systems of Intelligence" (as shown below). I think this will be the foundation on which a new generation of companies, especially AI companies, builds moats.

Simply put, the intelligent system layer can acquire and integrate all underlying information and data sources across platforms and, combined with ABC (AI, Big Data, Cloud) capabilities, provide customers with real-time, accurate, or personalized analysis. The future battle for the moat will shift from "how to get more data" to "how to use data more intelligently." This is a continuously iterating process, and the barriers will grow higher and higher. A schematic sketch follows.
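As a sketch of how I read the "system of intelligence" idea: a thin layer that pulls from several systems of record, fuses the data, and hands a model the combined picture. All source names and the toy decision rule below are hypothetical.

```python
# Schematic "system of intelligence": fuse cross-platform data sources and
# run an analysis model on the combined picture. All names are hypothetical.
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class IntelligenceLayer:
    sources: Dict[str, Callable[[], dict]]   # connectors to systems of record
    model: Callable[[dict], str]             # the analysis on top of the data

    def analyze(self) -> str:
        fused = {name: fetch() for name, fetch in self.sources.items()}
        return self.model(fused)

# Hypothetical connectors: a store camera feed and a staffing system.
layer = IntelligenceLayer(
    sources={
        "camera": lambda: {"people_in_store": 42},
        "staffing": lambda: {"staff_on_shift": 3},
    },
    model=lambda d: ("add staff"
                     if d["camera"]["people_in_store"]
                        / max(d["staffing"]["staff_on_shift"], 1) > 10
                     else "ok"),
)
print(layer.analyze())   # -> add staff (14 customers per staffer)
```

The moat here is not any single data source but the fused, continuously refreshed picture and the decisions layered on top of it.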

Therefore, from the "city management brain" proposed two years ago to last year's release of "Hetu," an operating system compatible with many types of robots, Megvii has confirmed that the core of AI commercialization is system software, rather than selling customers ever more smart hardware to grab a slice of the market.

(The ecological connection design of the "Hetu" platform)

We often hear the phrase "commercial closed loop." Building the intelligent system layer is precisely the process of achieving two-way internal and external interaction and closing that loop. More importantly, once the closed loop from data collection, transmission, and analysis to decision feedback is formed, the traditional model of selling hardware products will gradually give way to a service model in which software-driven hardware generates real-time results and customers pay flexibly by effect. Given national conditions, this may not play out ideally in urban IoT management and the security market, but it will surely find a suitable foothold in logistics, retail, pan-finance, and even overseas markets.

An intelligent system with AI at its core can not only build a stronger moat for the enterprise, but is also more likely to challenge the old players of traditional industries with a SaaS-like business model, just as Salesforce once shook Oracle. From this perspective, the "AIaaS" Megvii represents seems able to converge ever closer to SaaS in its business model.

To review the core questions from the beginning:

• Will "AI" really become Megvii's "cost center"? How can the company solve the seemingly unavoidable costs of data sources and cloud services?


• Is the "industry SaaS" Megvii speaks of really SaaS? And what is the "AIaaS" model it once championed?


• Where is the moat of "AIaaS"? How does Megvii build it?


Finally, my answers:

• We still cannot equate the "industry SaaS" in the current business with "SaaS" products as the market generally understands them, because the temporarily unavoidable data source costs cause unpredictable gross margin fluctuations;


• Data does not produce idealized network effects in the process of generating algorithm models. To prevent data and computing resources from becoming cost centers, automated algorithm production and data annotation may be the most efficient solutions, and this tests a company's underlying technical attainments, such as deep learning, and its strategic planning;


• AI companies will meet all kinds of obstacles in commercialization. Only by transforming product-sales thinking into a sustainable service model can they break through, and the core is to design the intelligent system layer into the strategy from the very beginning and establish ecological connections.


The future of the "first AI stock"

Now that we have a thorough understanding of Megvii's history and fundamentals, let us also look to the future.

Here are several things I think Megvii might consider after listing:

• Open-source Brain++ (or part of it): The two major open-source deep learning frameworks, Google's TensorFlow and Facebook's PyTorch, are competing fiercely; I will not comment on performance. What is interesting is that TensorFlow, which reached the mainstream first, firmly occupies industry thanks to its stable performance and security, while the later PyTorch opened a crack in academia through ease of use and simple operation. By contrast, Megvii's advantage must lie in its proud vertical, machine vision, and the key way to keep its lead in this field is to build a developer ecosystem. Given China's unique data and business scenarios, if machine learning or even deep learning becomes the standard for the next generation of IT construction, Megvii could in future hold the right timing, place, and people at least in the vision field;


• Create a standard language for model training and open it to the ecosystem: I hear the team has been planning a programming language for deep learning training since 2018, to balance flexibility in training against performance requirements in inference. Last February, Facebook's chief AI scientist Yann LeCun also asked whether deep learning design needs a language more flexible than Python. Exploration here is still at an early stage both at home and abroad, so it seems Megvii and even its domestic peers are on a par with the international giants. I believe this opportunity belongs to those who lay out early;


• Lay out manufacturing: Manufacturing accounts for nearly one-third of China's GDP, and abroad, machine vision first entered the industrial field, mainly for dimensional measurement and appearance inspection. On the hardware side, options now abound, from natural light and infrared to laser, and from 2D to 3D cameras; Hikvision also released an industrial camera product line in 2017. On the software side, AI exploration has only just begun. Although insufficient sample size and quality hamper the deployment of deep learning, perhaps that is exactly the opportunity for Megvii and its open ecosystem: lay out in advance, before the real scenes and needs become clear. This is another market worth tens of billions.
