Tonic is betting that synthetic data is the new big data to solve scalability

Big data is a sham. For years now, we have been told that every company should save every last morsel of digital exhaust in some sort of database, lest management lose some competitive intelligence against … a competitor, or something.

There is just one problem with big data though: it’s honking huge.

Processing petabytes of data to generate business insights is expensive and time-consuming. Worse, all that data hanging around paints a big, bright red target on the back of the company for every hacker group in the world. Big data is expensive to maintain, expensive to protect, and expensive to keep private. And the payoff may be modest in the end: oftentimes, well-curated, carefully chosen datasets can provide faster and better insight than endless quantities of raw data.

What should a company do? Well, they need a Tonic to ameliorate their big data sins.

Tonic is a “synthetic data” platform that transforms raw data into more manageable and private datasets usable by software engineers and business analysts. Along the way, Tonic’s algorithms de-identify the original data and create statistically identical but synthetic datasets, which means that personal information isn’t shared insecurely.

For instance, an online shopping platform will have transaction history on its customers and what they purchased. Sharing that data with every engineer and analyst in the company is dangerous, since that purchase history could have personally identifying details that no one without a need-to-know should have access to. Tonic could take that original payments data and transform it into a new, smaller dataset with exactly the same statistical properties, but not tied to original customers. That way, an engineer could test their app or an analyst could test their marketing campaign, all without triggering concerns about privacy.
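
To make the idea concrete, here is a minimal sketch of that kind of transformation. It is not Tonic's algorithm, which the company has not detailed here; the `synthesize` function, the row layout of (customer, category, amount) and the sample data are all hypothetical. It uses a deliberately simple stand-in technique: sample categories in proportion to how often they appear, draw amounts from a normal distribution fitted per category, and attach freshly generated customer IDs. The result only approximates the original statistics rather than matching them exactly.

```python
import random
import statistics
from collections import defaultdict


def synthesize(transactions, n_rows, seed=0):
    """Return n_rows synthetic (customer_id, category, amount) rows.

    Hypothetical helper for illustration only: it keeps the category mix
    and the per-category mean/spread of amounts, and drops real customers.
    """
    rng = random.Random(seed)

    # Group purchase amounts by category, discarding the real customer ID.
    amounts_by_category = defaultdict(list)
    for _customer, category, amount in transactions:
        amounts_by_category[category].append(amount)

    categories = list(amounts_by_category)
    weights = [len(amounts_by_category[c]) for c in categories]

    synthetic = []
    for i in range(n_rows):
        # Pick a category with the same relative frequency as the original.
        category = rng.choices(categories, weights=weights, k=1)[0]
        values = amounts_by_category[category]
        mean = statistics.fmean(values)
        stdev = statistics.pstdev(values)
        # Draw an amount from the fitted normal, clamped to stay positive.
        amount = max(0.01, round(rng.gauss(mean, stdev), 2))
        # The ID is generated, so it cannot be traced to a real customer.
        synthetic.append((f"synthetic-{i:05d}", category, amount))
    return synthetic


# Made-up original data, standing in for a real payments table.
original = [
    ("alice@example.com", "books", 12.50),
    ("alice@example.com", "books", 18.00),
    ("bob@example.com", "electronics", 249.99),
    ("carol@example.com", "electronics", 199.00),
    ("carol@example.com", "books", 9.99),
]

sample = synthesize(original, n_rows=3)
```

Here `sample` is smaller than `original` and contains no real email addresses, which is the property an engineer or analyst would care about. A production system would also have to preserve relationships across tables and guard against rare values that could re-identify someone, neither of which this sketch attempts.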

Synthetic data and other ways to handle the privacy of large datasets have garnered massive attention from investors in recent months. We reported last week on Skyflow, which raised a round to use polymorphic encryption to ensure that employees only have access to the data they need and are blocked from accessing the rest. BigID takes a more overarching view of just tracking what data is where and who should have access to it (i.e., data governance) based on local privacy laws.

Tonic’s approach has the benefit of helping solve not just privacy issues, but also scalability challenges as datasets get larger and larger in size. That combination has attracted the attention of investors: this morning, the company announced that it has raised $8 million in a Series A led by Glenn Solomon and Oren Yunger of GGV, the latter of whom will join the company’s board.

The company was founded in 2018 by a quad of founders: CEO Ian Coe worked with COO Karl Hanson (they first met in middle school as well) and…
