The case for open AI

May 05, 2023

The memo that leaked from Google (which is real) makes the case for an open future for AI. I have increasing conviction that aspects of this future will unfold, but I disagree with the conclusion. I believe we are heading to a world where there’s an oligopoly in the largest, most capable models, accompanied by a thriving open ecosystem of smaller models.

Let’s get this out of the way upfront: I’m an OpenAI shareholder, and I’m now actively investing in the future I foresee as a VC at Spark. I’m biased, I have conflicts, etc.

AGI or Bust

I don’t see how there’s anything but a small number of groups providing the largest, most capable models. At this point it seems clear that it will be OpenAI and Anthropic and then tbd on whether anyone else will join them.

The window to recreate the dynamics that have led to these groups emerging is closing. These are technical and research problems on the scale of history’s largest projects, requiring the capital of such projects, while being executed at the velocity of a ruthless startup.

Even in terms of “recreating the dynamics” it’s worth remembering that OpenAI and Anthropic share essentially the same origin, the second having split from the first in a mitosis-like event (what a scene this moment will make when the book is eventually written). Which is to say, despite an oligopoly forming, there remains zero evidence that anyone is able to recreate the required dynamics from scratch. Further, the talent required to train models of this nature is becoming scarcer, as the flow of talent accelerates toward a small number of orgs.

On some cadence - tick, tick, tick - OpenAI and other AGI or Bust companies will release the world’s largest models, capabilities will step forward, and the world will gasp as AI accomplishes previously impossible feats.

There’s no chance open source will move in lock-step with these private labs, let alone surpass them, when it comes to training the largest models. But open source will still play a critical role in the future of AI.

Open AI

Today, the state of open source AI is rather dismal: performance on benchmarks against comparable models is good, but performance in production settings is not. This will all get fixed, and relatively quickly. Here’s why I’m optimistic.

Open source efforts are just getting started and accelerating quickly.

Llama was released just two months ago. Pipelines, tools, datasets, etc. are all evolving quickly. Performance looks good on academic benchmarks because these models originated from research-motivated researchers, and evals for production aren’t really a thing (yet). But product-motivated researchers and developers will continue to iteratively move the ecosystem toward solutions that work well in production.

The world is the R&D lab for open source.

Talent may be flowing toward a few private labs, but they will only ever have a sliver of the total talent available across the world. The permissionless innovation that open source enables means that professors at Berkeley, alongside pseudonymous students in Bangladesh, are all advancing the state of the ecosystem. There will be an explosion of experiments, creating not only toolsets and datasets but also integrations with the weights themselves that tailor open models for very specific use-cases.

Many product use-cases will have a ceiling in terms of the AI capabilities they require.

As GPT-5, -6, -7, … arrive, new use-cases will be unlocked by the advancing capabilities. But with every generational step forward, some use-cases will no longer be able to absorb the increase in capability, and other measures will dictate which solution is optimal.

We’re already seeing this. While GPT-4 performs breathtakingly well on standard exams, it wouldn’t make sense to use this model in production at scale for a simple classification task. Replit is showing us that the same may already be true for more complex tasks such as code completion. Sometimes it will make sense to pay up (in every sense) to use the most general, most capable model. Many times it won’t, as your needs will be best met by a narrow model that’s tailored to your specific use-case.

AI-enabled products will provide the optimal user experience by going ‘full stack’.

History has shown us that delivering the best user experience matters. And the best user experience is the result of every product detail being considered and constructed with intention. In AI-enabled products, this means that control, and the ability to make opinionated decisions across all aspects of the model, will be important for delivering the optimal product experience. As the open ecosystem evolves, it will become increasingly easy for product teams to go “full stack” with their AI model to deliver the best experience.

For example, the nature of the training data - such as form, style, length, etc. - impacts what the AI generates, and what the AI generates is increasingly a large part of the user experience. Product teams will start to care deeply about the training data because it will have a material impact on the user experience.

The likely future

We’re heading toward a world where a small number of players have a defensible oligopoly on the most capable, most general models. Product teams will use these models when required. Adjacent to this will be a vibrant, open ecosystem of smaller models, tailored and customizable to specific product needs.

There’s a likely future where the aggregate volume from open models is orders of magnitude larger than that from the private ones. What’s unclear is how much of the value will accrue to the open ecosystem.

Misc

I joined Spark Capital and am leading investments in seed and Series A startups. I love entrepreneurship and technology. I love that a small team can ship and change the world. As a former founder, I’m excited to support founders building remarkable products.
We’re hiring a hacker in residence at Spark. Come build with me.
