
Powering the brains of tomorrow’s intelligent machines

Software will ultimately enable, and differentiate, the best hardware accelerating AI



Shahin Farshchi

Contributor

Shahin Farshchi is a partner at Lux Capital.


Sense and compute are the electronic eyes and ears that will be the ultimate power behind automating menial work and encouraging humans to cultivate their creativity. 

These new capabilities for machines will depend on the best and brightest talent, and investors who are building and financing companies aiming to deliver the AI chips destined to be the neurons and synapses of robotic brains.

Like any other Herculean task, this one is expected to come with big rewards. And it will bring with it big promises, outrageous claims and suspect results. Right now, it’s still the Wild West when it comes to measuring AI chips up against each other.

Remember laptop shopping before Apple made it easy? Cores, buses, gigabytes and GHz have given way to “Pro” and “Air.” Not so for AI chips.

Roboticists are struggling to make heads or tails of the claims made by AI chip companies. Every passing day without autonomous cars puts more lives at risk from human drivers. Factories want humans to be more productive while out of harm’s way. Amazon wants to get as close as possible to Star Trek’s replicator by getting products to consumers faster.

A key component of all of these efforts is the AI chips that will power them. A talented engineer betting her career on building AI chips, an investor looking to underwrite the best AI chip company and AV developers seeking the best AI chips all need objective measures to make decisions that can have huge consequences.

A metric that gets thrown around frequently is TOPS, or trillions of operations per second, to measure performance. TOPS/W, or trillions of operations per second per Watt, is used to measure energy efficiency. These metrics are as ambiguous as they sound. 

What are the operations being performed on? What’s an operation? Under what circumstances are these operations being performed? How does the timing by which you schedule these operations impact the function you are trying to perform? Is your chip equipped with the expensive memory it needs to maintain performance when running “real-world” models? Phrased differently, do these chips actually deliver these performance numbers in the intended application?


What’s an operation?

The core mathematical function performed in training and running neural networks is a convolution, which is simply a sum of multiplications. A multiplication itself is a bunch of summations (or accumulation), so are all the summations being lumped together as one “operation,” or does each summation count as an operation? This little detail can result in a difference of 2x or more in a TOPS calculation. For the purpose of this discussion, we’ll use a complete multiply and accumulate (or MAC) as “two operations.” 
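To make the counting ambiguity concrete, here is a minimal sketch that tallies the MACs in a single convolution layer and converts them to "operations" under two conventions. The layer dimensions and the function names `conv_macs` and `ops_from_macs` are hypothetical, chosen only for illustration; they do not describe any particular chip or model.

```python
def conv_macs(out_h, out_w, out_channels, kernel_h, kernel_w, in_channels):
    """Multiply-accumulates for one standard convolution layer.

    Each output value is a sum of kernel_h * kernel_w * in_channels products,
    and there are out_h * out_w * out_channels output values.
    """
    return out_h * out_w * out_channels * kernel_h * kernel_w * in_channels


def ops_from_macs(macs, ops_per_mac=2):
    """Convert MACs to 'operations' under a chosen counting convention."""
    return macs * ops_per_mac


# Hypothetical layer: 56x56 output, 64 output channels, 3x3 kernel, 64 input channels.
macs = conv_macs(56, 56, 64, 3, 3, 64)

# The same layer, counted two ways. The convention alone changes the total by 2x.
ops_mac_as_one = ops_from_macs(macs, ops_per_mac=1)
ops_mac_as_two = ops_from_macs(macs, ops_per_mac=2)

print(f"MACs: {macs:,}")
print(f"Operations if a MAC counts once:  {ops_mac_as_one:,}")
print(f"Operations if a MAC counts twice: {ops_mac_as_two:,}")
```

Nothing about the hardware changes between the two totals; only the definition of "operation" does, which is why a TOPS figure is hard to interpret without it.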

What are the conditions?

Is this chip operating full-bore at close to a volt or is it sipping electrons at half a volt? Will there be sophisticated cooling or is it expected to bake in the sun? Running chips hot, and trickling electrons into them, slows them down. Conversely, operating at modest temperature while being generous with power allows you to extract better performance out of a given design. Furthermore, does the energy measurement include loading up and preparing for an operation? As you will see below, overhead from “prep” can be as costly as performing the operation itself.
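As a rough illustration of why it matters whether "prep" energy is counted, the sketch below computes TOPS/W with and without that overhead. All numbers are hypothetical and picked so that prep costs as much energy as the computation itself; `tops_per_watt` is an illustrative helper, not a standard definition used by any vendor.

```python
def tops_per_watt(operations, compute_energy_j, prep_energy_j=0.0):
    """Energy efficiency in TOPS/W.

    TOPS/W is (operations per second) / (joules per second), so the seconds
    cancel and the figure reduces to tera-operations per joule.
    """
    total_energy_j = compute_energy_j + prep_energy_j
    return operations / total_energy_j / 1e12


# Hypothetical workload: 2 trillion operations.
OPERATIONS = 2e12
COMPUTE_ENERGY_J = 1.0  # assumed energy spent on the arithmetic itself
PREP_ENERGY_J = 1.0     # assumed energy spent fetching weights and setting up

compute_only = tops_per_watt(OPERATIONS, COMPUTE_ENERGY_J)
with_prep = tops_per_watt(OPERATIONS, COMPUTE_ENERGY_J, PREP_ENERGY_J)

print(f"TOPS/W counting compute only: {compute_only:.2f}")
print(f"TOPS/W counting prep as well: {with_prep:.2f}")
```

Under these assumed numbers, leaving prep out of the measurement doubles the reported efficiency of the same chip on the same workload.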

What’s the utilization?

Here is where it gets confusing. Just because a chip is rated at a certain number of TOPS, it doesn’t necessarily mean that when you give it a real-world problem it can actually deliver the equivalent of the TOPS advertised. Why? It’s not just about TOPS. It has to do with fetching the weights, or values against which operations are performed, out of memory and setting up the system to perform the calculation. This is a function of what the chip is being used for. Usually, this “setup” takes more time than the process itself. The workaround is simple: fetch the weights and set up the system for a bunch of calculations, then do a bunch of calculations. The problem with that is that you’re sitting around while everything is being fetched, and then you’re going through the calculations.  
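The setup-versus-compute tension described above can be sketched with a toy model. It assumes a fixed setup cost per batch and a fixed compute time per item, and that the two do not overlap; real chips may pipeline them, so treat this as an illustration rather than a model of any specific device. The timings are hypothetical.

```python
def utilization(batch_size, setup_ms, compute_ms_per_item):
    """Fraction of wall-clock time the compute units spend doing useful math."""
    busy_ms = batch_size * compute_ms_per_item
    return busy_ms / (setup_ms + busy_ms)


def batch_latency_ms(batch_size, setup_ms, compute_ms_per_item):
    """Time until the whole batch is finished, including the wait for setup."""
    return setup_ms + batch_size * compute_ms_per_item


# Hypothetical timings: setup takes four times as long as one calculation.
SETUP_MS = 4.0
COMPUTE_MS_PER_ITEM = 1.0

for batch_size in (1, 4, 16, 64):
    util = utilization(batch_size, SETUP_MS, COMPUTE_MS_PER_ITEM)
    latency = batch_latency_ms(batch_size, SETUP_MS, COMPUTE_MS_PER_ITEM)
    print(f"batch={batch_size:3d}  utilization={util:.0%}  latency={latency:.0f} ms")
```

In this model, larger batches amortize the setup cost and push utilization up, but every item in the batch waits longer for its result. That is a real cost for applications such as driving, where each frame needs an answer quickly.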

Flex Logix (my firm Lux Capital is an investor) compares the Nvidia Tesla T4’s actual delivered TOPS performance against the 130 TOPS Nvidia advertises on its website. They use ResNet-50, a common neural network model used in computer vision: it requires 3.5 billion MACs (each equivalent to two operations, per the above explanation of a MAC) for a modest 224×224 pixel image. That’s 7 billion operations per image. The Tesla T4 is rated at 3,920 images/second, so multiply that by the required 7 billion operations per image, and you’re at 27,440 billion operations per second, or roughly 27 TOPS, well shy of the advertised 130 TOPS.
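The arithmetic above is easy to reproduce. The sketch below uses only the figures quoted in this article (3.5 billion MACs per image, two operations per MAC, 3,920 images per second, 130 advertised TOPS); the function name `delivered_tops` is just an illustrative label.

```python
MACS_PER_IMAGE = 3.5e9    # ResNet-50 at 224x224, as quoted above
OPS_PER_MAC = 2           # counting convention used in this article
IMAGES_PER_SECOND = 3920  # rated throughput quoted above
ADVERTISED_TOPS = 130     # nameplate figure quoted above


def delivered_tops(images_per_second, macs_per_image, ops_per_mac=2):
    """Trillions of operations per second actually spent on the model."""
    return images_per_second * macs_per_image * ops_per_mac / 1e12


tops = delivered_tops(IMAGES_PER_SECOND, MACS_PER_IMAGE, OPS_PER_MAC)
utilization = tops / ADVERTISED_TOPS

print(f"Delivered: {tops:.2f} TOPS")
print(f"Share of advertised TOPS: {utilization:.0%}")
```

By this measure, the delivered figure works out to about a fifth of the nameplate number.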

Batching is a technique where data and weights are loaded into the processor for several computation cycles. This allows you to make the most of compute capacity, but at the expense of added cycles to load up the weights and perform the computations. Therefore, even if your hardware is rated at 100 TOPS, memory and throughput constraints can leave you with only a fraction of the nameplate TOPS performance.

Where did the TOPS go? Scheduling the setup and weight loading ahead of the actual number crunching, also known as batching, takes us down to a fraction of the speed the core is capable of. Some chipmakers overcome this problem by putting a bunch of fast, expensive SRAM on chip rather than relying on slow but cheap off-chip DRAM. But chips with a ton of SRAM, like those from Graphcore and Cerebras, are big and expensive, and better suited to data centers.

There are, however, interesting solutions that some chip companies are pursuing.

Compilers

Traditional compilers translate instructions into machine code to run on a processor. With modern multi-core processors, multi-threading has become commonplace, but “scheduling” on a many-core processor is far simpler than the batching we describe above. Many AI chip companies are relying on generic compilers from Google and Facebook, which will result in many chip companies offering products that perform about the same in real-world conditions. 

Chip companies that build proprietary, advanced compilers specific to their hardware, and offer powerful tools to developers for a variety of applications to make the most of their silicon and Watts, will certainly have a distinct edge. Applications will range from driverless cars to factory inspection to manufacturing robotics to logistics automation to household robots to security cameras.  

New compute paradigms

Simply jamming a bunch of memory close to a bunch of compute results in big chips that soak up a bunch of power. Digital design is all about trade-offs, so how can you have your cake and eat it too? Get creative. Mythic (my firm Lux is an investor) is performing the multiply-and-accumulate operations inside embedded flash memory using analog computation. This allows it to get superior speed and energy performance on older technology nodes. Other companies are using analog and photonic techniques to escape the grip of Moore’s Law.

Ultimately, if you’re doing conventional digital design, you’re limited by a single physical constraint: the speed at which a charge travels through a transistor at a given process node. Everything else is optimization for a given application. Want to be good at multiple applications? Think outside the VLSI box!
