
When it comes to large language models, should you build or buy?



Tanmay Chopra

Contributor

Tanmay Chopra works in machine learning at AI search startup Neeva, where he wrangles language models large and small. Previously, he oversaw the development of ML systems globally to counter violence and extremism on TikTok.

Last summer could only be described as an “AI summer,” especially with large language models making an explosive entrance. We saw huge neural networks trained on massive corpora of data that can accomplish exceedingly impressive tasks, none more famous than OpenAI’s GPT-3 and its newer, hyped offspring, ChatGPT.

Companies of all shapes and sizes across industries are rushing to figure out how to incorporate and extract value from this new technology. But OpenAI’s business model has been no less transformative than its contributions to natural language processing. Unlike almost every previous release of a flagship model, this one does not come with open-source pretrained weights — that is, machine learning teams cannot simply download the models and fine-tune them for their own use cases.

Instead, they must either pay to use the models as-is, or pay to fine-tune them and then pay four times the as-is usage rate to employ them. Of course, companies can still choose other peer open-sourced models.

This has given rise to an age-old corporate — but entirely new to ML — question: Would it be better to buy or build this technology?

It’s important to note that there is no one-size-fits-all answer to this question; I’m not trying to provide a catch-all answer. I mean to highlight pros and cons of both routes and offer a framework that might help companies evaluate what works for them while also providing some middle paths that attempt to include components of both worlds.

Buying: Fast, but with clear pitfalls

Let’s start with buying. There are a whole host of model-as-a-service providers that offer custom models as APIs, charging per request. This approach is fast, reliable and requires little to no upfront capital expenditure. Effectively, this approach de-risks machine learning projects, especially for companies entering the domain, and requires limited in-house expertise beyond software engineers.

Projects can be kicked off without requiring experienced machine learning personnel, and the model outcomes can be reasonably predictable, given that the ML component is being purchased with a set of guarantees around the output.

Unfortunately, this approach comes with very clear pitfalls, primary among which is limited product defensibility. If you’re buying a model that anyone can purchase and integrate into their systems, it’s not too far-fetched to assume your competitors can achieve product parity just as quickly and reliably. That will be true unless you can create an upstream moat through non-replicable data-gathering techniques or a downstream moat through integrations.

What’s more, for high-throughput solutions, this approach can prove exceedingly expensive at scale. For context, OpenAI’s DaVinci costs $0.02 per thousand tokens. Conservatively assuming 250 tokens per request and similar-sized responses, you’re paying $0.01 per request. For a product with 100,000 requests per day, you’d pay more than $300,000 a year. Obviously, text-heavy applications (attempting to generate an article or engage in chat) would lead to even higher costs.
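The arithmetic above is easy to sketch out for your own numbers. A minimal back-of-envelope calculator, using the DaVinci rate and token assumptions cited in the text (the exact token counts for your workload will differ):

```python
PRICE_PER_1K_TOKENS = 0.02  # USD, the DaVinci rate cited above

def annual_api_cost(requests_per_day: int,
                    tokens_per_request: int = 250,
                    tokens_per_response: int = 250) -> float:
    """Yearly API spend in USD for a given daily request volume."""
    tokens_per_call = tokens_per_request + tokens_per_response
    cost_per_call = tokens_per_call / 1000 * PRICE_PER_1K_TOKENS
    return cost_per_call * requests_per_day * 365

# 100,000 requests/day at ~$0.01 per request
print(round(annual_api_cost(100_000)))  # 365000
```

At 100,000 requests a day this works out to roughly $365,000 a year, which is where the “more than $300,000” figure comes from.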

You must also account for the limited flexibility tied to this approach: You either use models as-is or pay significantly more to fine-tune them. It is worth remembering that the latter approach would involve an unspoken “lock-in” period with the provider, as fine-tuned models will be held in their digital custody, not yours.

Building: Flexible and defensible, but expensive and risky

On the other hand, building your own tech allows you to circumvent some of these challenges.

In most cases, “building” refers to leveraging and fine-tuning open-source backbones, not building from scratch (although that also has its place). This approach grants you exponentially greater flexibility for everything from modifying model architectures to reducing serving latencies through distillation and quantization.

It’s worth remembering that while purchased models might be impressive at many tasks, models trained in-house may well achieve sizable performance improvements on a specific task or domain. At scale, these models are much cheaper to deploy and can lead to the development of significantly defensible products that can take competitors much longer to replicate.

The most prominent example of this is the TikTok recommendation algorithm. Despite many of its details being publicly available in various research papers, even massive ML teams at its competitors have yet to replicate and deploy a similarly effective system.

Of course, there are no free lunches: Developing, deploying and maintaining elaborate machine learning systems in-house requires data engineering, machine learning and DevOps expertise, all of which are scarce and highly sought-after. Obviously, that requires high upfront investment.

The success of machine learning projects is also less predictable when you’re building them in-house, and some estimates put the likelihood of success at around the 20% mark. This may prolong the time-to-market.

All in all, while building looks extremely attractive in the long run, it requires leadership with a strong appetite for risk over an extended time period as well as deep coffers to back said appetite.

The middle road

That said, there are middle ground approaches that attempt to balance these positives and negatives. The first and most oft-discussed is prompt engineering.

This approach starts with buying, then building a custom input template that serves, in some sense, to replace fine-tuning. It aims to guide the off-the-shelf model with clear examples or instructions, creating a middling level of defensibility in the form of custom prompts while retaining the benefits of buying.
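To make the idea concrete, here is a minimal illustration of such a custom template: a few-shot prompt wrapped around whatever off-the-shelf completion API you buy. The task, the examples and the template text are all hypothetical; the proprietary value lives in the template itself.

```python
# A custom few-shot prompt template: the "built" asset in a buy-then-build setup.
FEW_SHOT_TEMPLATE = """Classify the sentiment of each review as Positive or Negative.

Review: The battery lasts all day and the screen is gorgeous.
Sentiment: Positive

Review: It stopped working after two days.
Sentiment: Negative

Review: {review}
Sentiment:"""

def build_prompt(review: str) -> str:
    """Fill the template with the user's input before sending it to the API."""
    return FEW_SHOT_TEMPLATE.format(review=review)

print(build_prompt("Setup was painless and support was friendly."))
```

The resulting string would then be passed to the purchased model's completion endpoint in place of a fine-tuned model.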

Another way is to seek open-source alternative backbones of closely equivalent quality and build atop them. This reduces upfront costs and, to some extent, the unpredictability of outputs, while retaining the flexibility offered by building. For example, GPT-J and GPT-Neo are two open-source alternatives to GPT-3.

A slightly more intricate and newer approach is closed-source approximation. This involves training an in-house model that aims to minimize the difference between GPT-3’s outputs and its own, either at the final output or at an earlier embedding stage. This reduces time-to-market by leveraging GPT-3 in the short term, then transitioning to in-house systems as their quality improves, enabling cost optimization and defensibility in the long term.

Still confused about which way to go? Here’s a three-question quiz:

Are you currently an AI business?

If yes, you’ll need to build to maintain defensibility.

If you’re not, buy for now and use prompt engineering to tailor the model to your use cases.

If you want to be an AI business, work toward that over time: store data cleanly, start building an ML team and identify monetizable use cases.

Is your use case addressed by existing pre-trained models?

Can you simply afford to buy without putting in much additional work? If so, you should probably shell out the cash if time-to-market is a factor.

Building is not fast, easy or cheap. This is especially true if your use case is non-monetizable or you need a model for internal use.

Do you have unpredictable or extremely high request volumes?

If yes, buying might not be economically feasible, especially in a consumer setting. That said, be realistic — quantify your request volumes and buying costs to whatever extent possible. Building can be deceptively expensive too, especially because you’ll need to hire ML engineers, buy tooling and pay for hosting.
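One way to quantify that comparison is a simple breakeven calculation: at what daily request volume does a fixed annual build budget (salaries, tooling, hosting) undercut per-call API pricing? All figures below are placeholder assumptions to be replaced with your own quotes.

```python
def breakeven_requests_per_day(build_cost_per_year: float,
                               cost_per_call: float) -> float:
    """Daily volume above which building beats buying on cost alone."""
    return build_cost_per_year / (cost_per_call * 365)

# e.g. a hypothetical $600k/year in-house budget vs. $0.01 per API call
print(round(breakeven_requests_per_day(600_000, 0.01)))  # 164384
```

This ignores the risk and time-to-market factors discussed above, so treat it as a floor on the volume needed to justify building, not a decision rule by itself.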

Hopefully, this helps you kick off your journey!
