BetterData taps the blockchain to help create better synthetic data

As the global data privacy regulatory landscape gets more convoluted and constrictive, engineering teams looking to use structured data to improve their products and AI models are being pushed to jump through plenty of hoops to stay compliant.

BetterData, which is launching onstage at the TechCrunch Disrupt SF Battlefield startup competition, aims to help customers quickly generate representative, synthetic structured data so that technical teams can work with data in a compliant way, without waiting months for clearance to use actual user data or having to generate their own.

The company’s product generates the synthetic data securely: clients upload real user data, which is transmitted and converted without a copy of it landing on BetterData’s servers. User data is tokenized and stored on a blockchain that is accessible only with a user’s private encryption keys.

The generated copy maintains the key properties of the original structured data while anonymizing and scrambling the information. This lets teams train models and build products capable of parsing real user data, while avoiding the lengthy bureaucratic processes often required to gain access to that data.
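To make the idea concrete, here is a toy sketch of what "maintaining key properties" can mean for numeric columns. It is not BetterData's method, which the company has not detailed; it simply fits a mean and standard deviation per column and samples new values from a normal distribution. The column names, the sample records and the helper functions `fit_columns` and `sample_rows` are all made up for illustration, and this naive approach ignores correlations between columns and offers no formal privacy guarantee.

```python
import random
import statistics


def fit_columns(rows):
    """Summarize each numeric column by its mean and sample standard deviation."""
    stats = {}
    for name in rows[0]:
        values = [row[name] for row in rows]
        stats[name] = (statistics.mean(values), statistics.stdev(values))
    return stats


def sample_rows(stats, n, seed=0):
    """Draw n synthetic rows, sampling each column independently from a normal fit."""
    rng = random.Random(seed)
    return [
        {name: rng.gauss(mu, sigma) for name, (mu, sigma) in stats.items()}
        for _ in range(n)
    ]


# Hypothetical "real" records, invented for this example.
real_rows = [
    {"age": 34, "balance": 1200.0},
    {"age": 51, "balance": 8300.0},
    {"age": 27, "balance": 450.0},
    {"age": 45, "balance": 5100.0},
    {"age": 39, "balance": 2750.0},
]

stats = fit_columns(real_rows)
synthetic_rows = sample_rows(stats, 5000)
```

No synthetic row corresponds to a real customer, yet the per-column averages and spreads stay close to the originals. Production systems typically rely on far richer generative models to also preserve relationships between columns.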

The startup’s co-founders, CEO Uzair Javaid and CTO Kevin Yee, have backgrounds in AI data generation and blockchain security. They met at the Entrepreneur First program in Singapore.

The duo has raised $770,000 in funding and grants so far and is in the process of closing a seed raise.

“We’ve spoken to hundreds of data teams… and they all face the problem which is access to data,” Yee told TechCrunch in an interview. “It takes a long time to access data under data protection rules… They’re trying to innovate, but it takes so much time.”

The company announced onstage that it will expand its private beta after a number of successful pilot programs. BetterData is particularly targeting customers in the banking, financial services and insurance (BFSI) sector, as well as data and AI teams at tech companies.

Yee and Javaid hope their product can not only help those teams stay compliant with the increasing sprawl of data privacy regulations but also help them avoid data attacks and leaks by tapping encryption and the blockchain. The blockchain element will also give customers an immutable access log and a full breakdown of data lineage, so they can verify that data is not being mishandled.
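For readers curious how a tamper-evident access log works in principle, the following is a minimal sketch of a hash chain, the basic structure underlying blockchains. It is an illustration under stated assumptions, not BetterData's implementation: the record fields, user names, dataset name and the helper functions `entry_hash`, `append` and `verify` are all hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, record):
    """Hash a record together with the hash of the entry before it."""
    payload = json.dumps({"prev": prev_hash, "record": record}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(log, record):
    """Add a record to the log, chaining it to the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append(
        {"record": record, "prev": prev_hash, "hash": entry_hash(prev_hash, record)}
    )


def verify(log):
    """Return True only if every entry still matches its stored hash and link."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical access events, invented for this example.
log = []
append(log, {"user": "analyst_a", "action": "read", "dataset": "customers_synthetic"})
append(log, {"user": "analyst_b", "action": "export", "dataset": "customers_synthetic"})
```

Because each entry's hash covers the previous entry's hash, quietly editing an old record invalidates the chain from that point on, and `verify(log)` returns `False`. A real blockchain adds distributed consensus on top, so that no single party can simply rewrite the whole chain.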

For now, the company’s product focuses exclusively on processing and generating structured data, but as it builds out its functionality, it plans to start generating text data using natural language processing models. The company plans to launch a public beta of its cloud services solution by the end of this year.