Union AI, a Bellevue, Washington–based open source startup that helps businesses build and orchestrate their AI and data workflows on a cloud-native automation platform, today announced that it has raised a $19.1 million Series A round from NEA and Nava Ventures. The company also announced the general availability of its fully managed Union Cloud service.
At the core of Union is Flyte, an open source tool for building production-grade workflow automation platforms with a focus on data, machine learning and analytics stacks. The idea was to build a single platform that teams can use to create their ETL pipelines and analytics workflows, as well as their machine learning pipelines. And while other projects on the market offer similar orchestration capabilities, the idea here is to build a tool tailored to the needs of machine learning teams.
Flyte was originally developed inside of Lyft, where Union AI CEO and co-founder Ketan Umare developed some of the company's earliest machine learning–based ETA and traffic models in 2016. At the time, Lyft had to glue together various open source systems to put these models into production.
"We got something running, but behind the scenes, it was a man behind the curtain. It was happening, but it was a lot of work," Umare said. "What we learned was that other teams in the company were also struggling -- and these were massive teams. And what happens when teams struggle is that they cannot keep the talent on. That's a big problem, but what was the root of that? They were not able to deliver their things and they were not able to articulate why they were not able to deliver. It turns out to be an infrastructure problem."
So he set out with a small team to build out the infrastructure tooling to make it easier for these teams to build their models and put them into production. But there was always friction between the software engineers and machine learning specialists. "The reason was that -- at least in the way I have distilled it -- I think software and machine learning systems or AI products are inherently different beasts," Umare argued. In his view, software typically matures over time while AI models tend to deteriorate. These models, he noted, also often change based on external factors that users have little control over. "So you cannot use the same infrastructure that you use for [software deployments]," he said.
At that point, the team decided to open source its work in the form of Flyte and work with others to build out a more machine-learning-native platform.
As is so often the case, Umare and four other members of the original Flyte team then decided to build a startup around these core ideas and the Flyte open source project, with Union AI launching in late 2020.
Currently, Flyte is being used by companies like blackshark.ai, HBO, Intel, LinkedIn, Spotify, Stripe, Wolt, and ZipRecruiter.
"The fun thing about working with these large companies -- what we do in the open source -- is that we are working on some of the biggest models on our platform. So we know it works and we didn't have to build anything specifically because we've been doing this for years. We just had to extend a couple of things," Umare said.
"Based on a single team, we see 10x more offline training jobs dispatched from Flyte, and that results in 5x more frequent model releases with sizable business gains," said Mick Jermsurawong, a machine learning infrastructure engineer with Stripe. "I think the realization here is that ML productivity is not a nice to have but actually a business requirement."
But Union AI isn't simply offering Flyte as a service. The team also built Pandera (a framework for data testing) and Union ML (a framework that sits on top of Flyte and helps teams build and deploy their models using their existing set of tools). Union Cloud combines all these elements and layers a set of enterprise tools, such as single sign-on, on top of them.
"Machine learning, and especially large language models, raise big issues around privacy and information security. Companies are becoming increasingly wary of using services where they lose control over what precisely happens with their data," said Greg Papadopoulos, venture partner, NEA. "Combining the power of big models with rich company data has to be handled with care -- that’s one of the reasons why we’re so excited about the progress made by the Union.AI team, first with Flyte and now with Union Cloud. This is exactly what people are demanding and a real differentiator: Let me exploit the power of large language models while maintaining control and ownership of my data."