
Everything you know about computer vision may soon be wrong

Ubicept wants half of the world’s cameras to see things differently

6:00 AM PST • March 10, 2023

Closeup of a person holding a camera outside. **Image Credits:** Venu Dontaraboina / EyeEm / Getty Images

Computer vision could be a lot faster and better if we skip the concept of still frames and instead directly analyze the data stream from a camera. At least, that’s the theory that the newest brainchild spinning out of the MIT Media Lab, Ubicept, is operating under.

Most computer vision applications work the same way: A camera takes an image (or a rapid series of images, in the case of video). These still frames are passed to a computer, which then does the analysis to figure out what is in the image. Sounds simple enough.

But there’s a problem: That paradigm assumes that creating still frames is a good idea. As humans who are used to seeing photography and video, that might seem reasonable. Computers don’t care, however, and Ubicept believes it can make computer vision far better and more reliable by ignoring the idea of frames.

The company itself is a collaboration between its co-founders. Sebastian Bauer is the company’s CEO and a postdoc at the University of Wisconsin, where he was working on lidar systems. Tristan Swedish is now Ubicept’s CTO. Before that, he was a research assistant and a master’s and Ph.D. student at the MIT Media Lab for eight years.

“There are 45 billion cameras in the world, and most of them are creating images and video that aren’t really being looked at by a human,” Bauer explained. “These cameras are mostly for perception, for systems to make decisions based on that perception. Think about autonomous driving, for example, as a system where it is about pedestrian recognition. There are all these studies coming out that show that pedestrian detection works great in bright daylight but particularly badly in low light. Other examples are cameras for industrial sorting, inspection and quality assurance. All these cameras are being used for automated decision-making. In sufficiently lit rooms or in daylight, they work well. But in low light, especially in connection with fast motion, problems come up.”

The company’s solution is to bypass the “still frame” as the source of truth for computer vision and instead measure the individual photons that hit an imaging sensor directly. That can be done with a single-photon avalanche diode array (or SPAD array, among friends). This raw stream of data can then be fed into a field-programmable gate array (FPGA, a type of super-specialized processor) and further analyzed by computer vision algorithms.

The newly founded company demonstrated its tech at CES in Las Vegas in January, and it has some pretty bold plans for the future of computer vision.

“Our vision is to have technology on at least 10% of cameras in the next five years, and in at least 50% of cameras in the next 10 years,” Bauer projected. “When you detect each individual photon with a very high time resolution, you’re doing the best that nature allows you to do. And you see the benefits, like the high-quality videos on our webpage, which are just blowing everything else out of the water.”

TechCrunch saw the technology in action at a recent demonstration in Boston and wanted to explore how the tech works and what the implications are for computer vision and AI applications.

A new form of seeing

Digital cameras generally capture a single-frame exposure by “counting” the number of photons that hit each of the sensor’s pixels over a certain period of time. At the end of that period, all of those photons are added together, and you have a still photograph. If nothing in the image moves, that works great, but the “if nothing moves” thing is a pretty big caveat, especially when it comes to computer vision. It turns out that when you are trying to use cameras to make decisions, everything moves all the time.
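A toy sketch of why integrating photons over an exposure blurs motion. The eight-pixel sensor row, the one-photon-per-time-step scene and the `expose` helper are all made up for illustration; this is not how any particular camera is implemented.

```python
import numpy as np

def expose(positions, sensor_width):
    """Accumulate one photon per time step at the given pixel positions."""
    frame = np.zeros(sensor_width, dtype=int)
    for x in positions:
        frame[x] += 1
    return frame

# Hypothetical scene: a point of light observed for 8 time steps.
static = expose(positions=[3] * 8, sensor_width=8)        # light stays on pixel 3
moving = expose(positions=list(range(8)), sensor_width=8)  # light crosses the row

print(static)  # all 8 photons land on one pixel: a sharp, bright dot
print(moving)  # the same 8 photons are smeared across every pixel
```

Both exposures collect the same number of photons; only the moving one loses the information about where the light was at any given moment.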

Of course, with the raw data, the company is still able to combine the stream of photons into frames, which creates beautifully crisp video without motion blur. Perhaps more excitingly, dispensing with the idea of frames means that the Ubicept team was able to take the raw data and analyze it directly. Here’s a sample video of the dramatic difference that can make in practice:

Number plate recognition using Ubicept’s technology

“SPAD sensors are manufactured using a CMOS process,” Swedish said, referring to the tech that’s used in a lot of existing digital cameras. “That means they are very scalable in terms of making them.”

Instead of creating image frames, however, the SPADs are able to detect individual photons, time-stamping them very accurately. The technology is usually used in lidar, where you send a pulse of light, wait for it to return and measure the difference in time. With its SPADs, Ubicept is removing the need for an active component (i.e., the light pulse) and just looking at the raw data that’s received from the sensors.
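For readers unfamiliar with the lidar use of SPADs, the time-of-flight arithmetic is simple: the pulse travels to the target and back, so the distance is half the round-trip time multiplied by the speed of light. A minimal sketch (the function name and the 100-nanosecond example are illustrative):

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre

def distance_from_round_trip(round_trip_seconds: float) -> float:
    """Distance to a target, given how long a light pulse took to return."""
    # The pulse covers the distance twice (out and back), hence the division by 2.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2

# A pulse that comes back after 100 nanoseconds hit something about 15 metres away.
print(distance_from_round_trip(100e-9))
```

This also shows why accurate time-stamping matters: an error of one nanosecond corresponds to roughly 15 centimetres of distance.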

“You can use that data and some very clever computation to get images that are far better than what a conventional camera can produce. We are capturing the data at a more fundamental level,” Swedish said. “Conceptually, we’re digitizing light directly. This allows us to do a lot more in software, where we would previously have required analog hardware.”

As a deep camera nerd, I found that the approach melted my brain a little (in a good way). It means that the worst-case scenario is that the computer vision systems Ubicept has created are as good as conventional cameras. In other words: In perfect lighting with nonmoving targets, the quality of the images and the amount of data that can be gleaned from those images should be the same. As soon as the scene shifts toward less-than-ideal image capturing situations (low light, fast-moving targets), the advantage starts shifting toward Ubicept’s tech.

Of course, it isn’t entirely without its drawbacks: SPAD sensors are a bit more expensive than conventional CMOS sensors, and the vast amount of data streamed from the sensors needs to be processed and stored in a way that is useful to the end application.

An inherent and curious advantage of using SPAD sensors is that they have a lot less of what photographers know as “digital noise.”

“Every time you read data from a [conventional] sensor chip, the sensor tube itself adds noise. That is one major source of noise when you’re taking an image in low light. The interesting thing is that the SPAD sensors are fundamentally digital: There is zero read noise,” Bauer explained. “That means that you can capture as many of these frames per second as you like, you don’t pay your toll. You can do that 100,000 times per second, or a million times per second.”
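To put rough numbers on that “toll,” here is a sketch using a common simplified sensor model: photon shot noise has a variance equal to the photon count, and every readout adds a fixed amount of read-noise variance. The photon budget and the 2-electron read noise are assumed values for illustration, not figures from Ubicept, and the model ignores other noise sources such as dark counts.

```python
import math

def snr(total_photons: float, reads: int, read_noise_e: float) -> float:
    """Signal-to-noise ratio when a fixed photon budget is split across `reads` readouts."""
    # Shot-noise variance equals the signal; each readout adds read_noise_e**2.
    variance = total_photons + reads * read_noise_e ** 2
    return total_photons / math.sqrt(variance)

photons = 100.0  # hypothetical low-light photon budget for one pixel
for reads in (1, 100, 10_000):
    conventional = snr(photons, reads, read_noise_e=2.0)    # assumed 2 e- per read
    zero_read_noise = snr(photons, reads, read_noise_e=0.0)  # idealized SPAD
    print(reads, round(conventional, 2), round(zero_read_noise, 2))
```

Under these assumptions, the sensor with read noise degrades quickly as the same light is sliced into more readouts, while the zero-read-noise case stays at the shot-noise limit no matter how finely time is divided.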

The output of this is what the company refers to as a “photon cube” — essentially a three-dimensional timeline of when each photon hit the imaging sensor. Ubicept’s product is the signal processing and computer-vision algorithms that operate to interpret this single-photon data stream.
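One way to picture a photon cube is as a three-dimensional array indexed by time, row and column, holding a 1 wherever a photon was detected. The sketch below uses randomly generated data and plain summation to turn such a cube into frames of any chosen exposure length. The array sizes, the detection probability and the `reconstruct_frame` helper are assumptions for illustration; Ubicept’s actual processing is not described in this article and is surely more sophisticated.

```python
import numpy as np

# Hypothetical photon cube: (time_bins, height, width), 1 = photon detected.
rng = np.random.default_rng(seed=0)
cube = (rng.random((1000, 4, 4)) < 0.05).astype(np.uint8)

def reconstruct_frame(cube: np.ndarray, start: int, length: int) -> np.ndarray:
    """Sum detections over a chosen time window to form one frame."""
    return cube[start:start + length].sum(axis=0, dtype=np.int64)

# The exposure is chosen after capture, not before it.
short_exposure = reconstruct_frame(cube, start=0, length=10)
long_exposure = reconstruct_frame(cube, start=0, length=1000)

print(short_exposure.shape, long_exposure.shape)
```

Because the time axis is kept, the same capture can yield a short exposure for a fast-moving region and a long one for a dim, static region.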

“What’s interesting about our approach is that you’re constantly streaming data, so we inherently don’t have this issue of, ‘Did I miss the thing that only happens in a brief period of time?’” Swedish explained. “That has a direct impact on being able to improve downstream perception, like pedestrian detection and tracking other kinds of high-level vision applications. This is a shift in thinking.”

In addition to the change of approach, the company has to resolve a few new challenges: Capturing every photon as a raw stream results in a very high-bandwidth firehose of data. The biggest challenge Ubicept is facing, then, is figuring out how to discard the right data and what it needs to keep.
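A back-of-the-envelope calculation gives a feel for the size of that firehose. The one-megapixel resolution and the one-bit-per-pixel readout are assumptions made for this sketch (the article does not state the sensor’s resolution); the 100,000 readouts per second comes from Bauer’s quote above.

```python
def raw_data_rate_gbit_per_s(width: int, height: int,
                             frames_per_second: int,
                             bits_per_pixel: int = 1) -> float:
    """Uncompressed sensor output in gigabits per second."""
    return width * height * frames_per_second * bits_per_pixel / 1e9

# Assumed 1024 x 1024 sensor, read out 100,000 times per second at 1 bit per pixel.
rate = raw_data_rate_gbit_per_s(1024, 1024, 100_000)
print(f"{rate:.1f} Gbit/s")
```

Under these assumptions the raw stream is on the order of 100 gigabits per second, which is why deciding what to throw away early matters so much.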

Implications for computer vision

The company has published a number of proofs of concept that show off what this technology can actually do compared to other computer vision solutions.

The demonstration that caught my eye originally, as a photographer, was the video quality of footage shot out of a moving car:

Of course, the truly impressive thing you get from that is when you run that same video through a computer vision object recognition engine:

The implications for what this does to the field of computer vision may be nontrivial. In a nutshell, it enables industrial, commercial and automation technologies to be an order of magnitude better in low-light and high-speed environments. The company demonstrated what its technology could do in near darkness at 200 mph, with deeply impressive results.

“Vision is so fundamental to seeing and understanding the world for robots and computers. If you’re building a system that moves around, especially near people, it’s really critical that you have a very reliable and robust perception. It’s really important that you understand and see the world,” Swedish said. “We’re not building a consumer-facing product; we’re building a technology stack that can be integrated with solving end-user cases: robotics, autonomous vehicles, monitoring systems, etc.”

Coming out of stealth at CES, the company has started seeing that there’s a lot of demand for its technology from roboticists that are operating in hard-to-control environments.

“But what that means is planes, helicopters, drones, cars, trucks, off-road vehicles, specialty vehicles and robots,” Bauer ticked off, broadening the use cases for the type of tech they are developing. Automated guided vehicles (AGVs) in particular (such as pick-and-pack robots) may prove to be a beachhead audience.

“AGVs usually operate in a warehouse where you can control the lighting. But sometimes you have to move between warehouses. At night, lighting conditions might be not ideal: You don’t have spotlights and illumination from all directions, you have to deal with the fog and all kinds of other environments. And that’s where we really shine. These uncontrollable environments have low light, or extremely bright light. There is motion, which leads to artifacts. We got a lot of really good customer interest from those industries. This is a paradigm change in imaging.”

The company is currently at an early stage. It just released an evaluation kit that developers can use to experiment with new use cases.

Ubicept’s ultimate goal is to make perception systems work more like the human eye.

“People don’t realize that our retina actually is part of the brain. The retina is the photosensitive part in the back of the eye. It has nerve cells and is technically part of the brain. It is computing things at that layer, before sending it down the optic nerve,” Swedish explained. “The optic nerve then goes into the GPU and deep learning accelerator in the form of our visual system. Our technology isn’t inspired by that in a literal sense, but mathematically, we’re having to solve the same problem: We have to reduce that high-bandwidth information and send it over the optic nerve. We do that in two stages, via the FPGAs in our evaluation kit.”

The difference between a 60FPS action cam and Ubicept’s camera setup is dramatic.

From the raw data, the company can reconstruct viewable images, but the team suggests that your brain isn’t literally storing the whole field of vision in front of you just to watch television or read the information in a book. If you’re focusing on something, the rest of the world kind of falls away, your brain discarding the information and letting you settle into the thing that’s important.

In time, that’s what Ubicept hopes to be able to do. In different words: It doesn’t matter if the sun is setting or whether the car on your right is blue or orange. If there’s a pedestrian stepping out into the road, your car needs to know that right away and hit the brakes.

“[What we keep and what we discard] is super application-dependent. If you want to have a more general purpose solution, then really, it’s about frame reconstruction,” Bauer explained, waving his hands at the demonstration videos we embedded above. For more specialist use cases, however, the tech can get both smarter and faster than current perception systems.

The first photograph was taken in 1827. In the 196 years since then, photography has focused heavily on the frame and everything that happens in a frame. Ubicept may not ship this in time for the 200th anniversary of the invention of photography, but we may not have to wait much longer before the tech makes it to our pockets.

“In five to 10 years, I think this will be on smartphones,” Bauer concluded, hinting at the vast market the company has ahead of it and the true revolution in photography that might be coming sooner rather than later.
