Is this a precursor to an IPO?

On the 22nd of this month, Databricks, the US big data and artificial intelligence company, announced that it has raised $400 million in Series F financing. The lead investors are Andreessen Horowitz’s Late Stage Venture Fund, BlackRock, T. Rowe Price Associates and Tiger Global. Management also participated in this round. The funds will reportedly be used mainly to expand research and development and the company’s global market presence. After this round, Databricks is valued at $6.2 billion.

This is the company’s second financing within a year, following the $250 million Series E round it completed in February this year. Investors in the Series E round included Andreessen Horowitz, Coatue Management, Microsoft and New Enterprise Associates (NEA). Microsoft reportedly partnered with Databricks in 2017 to launch Azure Databricks, a tool for processing and analyzing large amounts of enterprise data.

Big data company Databricks completed two rounds of financing within one year, and its valuation has reached $6.2 billion

Image from Databricks

For a late-stage startup, financing on this scale is relatively rare. Ali Ghodsi, co-founder and CEO of Databricks, called this round a milestone. He said the company will conduct an IPO at some point in the future; no specific date has been disclosed, but going public is the company’s “end goal,” so Databricks may reach the public market in the near future. Databricks reportedly also hired Dave Conte as its new CFO while completing this round. He previously served as chief financial officer of Splunk and helped that company go public.

Databricks was founded in 2013 by the team at the University of California, Berkeley (UCB) that developed the open source Apache Spark data processing framework. It provides a unified analytics platform on which data science teams, data engineering and business units work together to build data products.

Databricks currently offers four products: Delta Lake, an open source data lake product; MLflow, an open source framework that helps data teams use machine learning; Koalas, which provides a single framework for Spark and pandas and simplifies work with both tools; and Spark, an open source analytics engine. Among them, MLflow and Delta Lake received new releases presented at the Spark + AI Summit Europe in Amsterdam this October.

MLflow is integrated into the Unified Data Analytics Platform (UDAP), but because it is open source it can also be integrated with other platforms. MLflow assists with machine learning experiments and model management: it records the different algorithms and hyperparameter configurations that were tried, along with the model accuracy each one produced. MLflow also defines a model persistence format that allows models to be shared.
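The record-keeping idea described above can be illustrated with a minimal sketch. It is not the MLflow API: `Run`, `ExperimentLog`, `record` and `best` are hypothetical names invented for illustration, and the accuracy values are made-up placeholders rather than real results.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    """One experiment run: which algorithm, which settings, what it scored."""
    algorithm: str
    params: dict
    metrics: dict = field(default_factory=dict)


class ExperimentLog:
    """Hypothetical in-memory tracker; a real tool would persist runs."""

    def __init__(self):
        self.runs = []

    def record(self, algorithm, params, accuracy):
        # Copy params so later changes by the caller do not alter the record.
        run = Run(algorithm, dict(params), {"accuracy": accuracy})
        self.runs.append(run)
        return run

    def best(self, metric="accuracy"):
        # Return the run with the highest value for the chosen metric.
        return max(self.runs, key=lambda r: r.metrics[metric])


# Illustrative values only, not measurements.
log = ExperimentLog()
log.record("logistic_regression", {"C": 1.0}, accuracy=0.81)
log.record("random_forest", {"n_estimators": 100, "max_depth": 5}, accuracy=0.86)
log.record("random_forest", {"n_estimators": 300, "max_depth": 8}, accuracy=0.84)

best = log.best()
print(best.algorithm, best.params, best.metrics["accuracy"])
```

Keeping the algorithm, its hyperparameters and the resulting metric together in one record is what makes runs comparable later, which is the purpose the article attributes to MLflow.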

Delta Lake is a storage layer built on top of Spark SQL and Parquet files stored in the Databricks file system. By using differential (“delta”) files and special indexes, Databricks adds significant functionality to its data lake: higher performance, plus the same transaction management and ACID compliance as traditional relational databases. This means that new data can be added to the lake and then queried efficiently right away, which addresses a key pain point of data lakes.
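Why a transactional layer lets newly added data be queried safely can be shown with a toy model. The sketch below is a deliberate simplification and not Delta Lake itself: it assumes that a table is a set of data files plus an ordered commit log, and that readers only trust files named in the log. `ToyDeltaTable` and its methods are hypothetical names, and plain Python lists stand in for Parquet files.

```python
import json


class ToyDeltaTable:
    """In-memory stand-in: data files plus an ordered commit log."""

    def __init__(self):
        self.files = {}  # file name -> rows (stands in for Parquet files)
        self.log = []    # ordered commit entries, stored as JSON strings

    def append(self, rows):
        version = len(self.log)
        name = f"part-{version:05d}"
        # Step 1: write the data file. Readers cannot see it yet.
        self.files[name] = list(rows)
        # Step 2: commit by adding one log entry. Only now is it visible.
        self.log.append(json.dumps({"version": version, "add": [name]}))
        return version

    def read(self, version=None):
        # Replay the log up to the requested version (default: latest).
        entries = self.log if version is None else self.log[: version + 1]
        rows = []
        for entry in entries:
            for name in json.loads(entry)["add"]:
                rows.extend(self.files[name])
        return rows


table = ToyDeltaTable()
table.append([{"id": 1}, {"id": 2}])
table.append([{"id": 3}])

# A file that was written but never committed is ignored by readers.
table.files["part-orphan"] = [{"id": 99}]

latest = table.read()
snapshot_v0 = table.read(version=0)
print([r["id"] for r in latest], [r["id"] for r in snapshot_v0])
```

Because a write becomes visible only through a single log entry, a reader sees either all of an append or none of it, and the orphaned file never leaks into query results.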

Databricks’ Unified Data Analytics Platform (UDAP) is a cloud-based, managed and optimized Spark service that can be obtained directly on Amazon Web Services or from Microsoft on the Azure cloud. Databricks recently added these new features to UDAP so that it goes beyond basic capabilities such as Spark and notebooks.

The open source versions of Databricks products can be downloaded directly from the Internet, but putting them to productive use is not easy. Databricks delivers its products to customers as SaaS and takes responsibility for solving the problems they encounter in use. Of course, “subscribing” to these services carries a fee.

According to figures provided by Databricks, the company’s revenue this year is forecast to be more than 2.5 times last year’s, and current revenue stands at $200 million. Databricks has more than 2,000 customers worldwide, including Nielsen, Shell, HP and ZEISS. These large customers continue to drive improvement in the company’s financial performance. Ghodsi said, “A year ago we did not predict this trend at all.”

With strong performance across its businesses and new funding in hand, Databricks plans to set up a dedicated engineering team to advance optimizations and upgrades for Delta Lake, MLflow and Koalas. Ghodsi also said that the company’s European engineering center in Amsterdam has nearly tripled in size over the past two years, so it plans to invest a further 100 million euros in Europe. In addition, Databricks plans to fund market expansion in Europe, the Middle East, Africa, Asia Pacific and Latin America.

While many people are still debating how to build a successful business model on open source, Databricks has achieved strong growth in exactly this way. Ghodsi said that this growth proves the effectiveness of the company’s open source strategy. The cloud-based open source SaaS business model has a better-known name, the “Red Hat business model,” in which providers distribute open source software and rely on technical support services, training and consulting to generate revenue.