Apache Spark

Databricks builds a data mesh with the launch of Lakehouse Federation

Databricks today launched what it calls its Lakehouse Federation feature at its Data + AI Summit. Using this new capability, enterprises can bring together their various siloed data systems and discov

Databricks launches Project Lightspeed, its next-gen Spark streaming engine

At its Data + AI Summit, Databricks today made the requisite number of announcements one would expect from a company’s flagship developer event. Among those are the launch of Delta Lake 2.0, the

Google Cloud launches a managed Spark service

At its Cloud Next event, Google today announced the launch of Spark on Google Cloud as a fully managed service. With this, the popular open source data processing engine will become a premium offering

Databricks co-founder and CEO Ali Ghodsi is coming to TC Sessions: SaaS

In many industries, Databricks has become synonymous with modern data warehousing and data lakes. Since it’s exactly these technologies that are at the core of what modern businesses are doing a

Adobe brings over 20,000 design assets to Spark

Adobe is launching an update to its Spark social media design tool today that will bring more than 20,000 new design assets to the service. This update, which follows the launch of animations in Spark

Databricks launches SQL Analytics

AI and data analytics company Databricks today announced the launch of SQL Analytics, a new service that makes it easier for data analysts to run their standard SQL queries directly on data lakes. And

Anyscale, from the creators of the Ray-distributed computing project, launches with $20.6M led by a16z

Open source has become a critical building block of modern software, and today a new startup is coming out of stealth to capitalise on one of the newer frontiers in open source: using it to build and

Databricks announces $400M round on $6.2B valuation as analytics platform continues to grow

Databricks is a SaaS business built on top of a bunch of open-source tools, and apparently it’s been going pretty well on the business side of things. In fact, the company claims to be one of th

Databricks brings its Delta Lake project to the Linux Foundation

Databricks, the big data analytics service founded by the original developers of Apache Spark, today announced that it is bringing its Delta Lake open-source project for building data lakes to the Lin

Google brings Cloud Dataproc to Kubernetes

Cloud Dataproc is probably one of the lesser-known products in Google Cloud’s portfolio, but it’s a powerful tool for data wranglers who are looking for a fully managed cloud service that lets the

Databricks raises $250M at a $2.75B valuation for its analytics platform

Databricks, the company founded by the original team behind the Apache Spark big data analytics engine, today announced that it has raised a $250 million Series E round led by Andreessen Horowitz. Coa

Databricks releases serverless platform for Apache Spark along with new library supporting deep learning

Today to kick off Spark Summit, Databricks announced a Serverless Platform for Apache Spark — welcome news for developers looking to reduce time spent on cluster management. The move to simplify d

Yahoo supercharges TensorFlow with Apache Spark

Yahoo, model Apache Spark citizen and developer of CaffeOnSpark, which made it easier for developers building deep learning models in Caffe to scale with parallel processing, is open sourcing a new

IBM releases DataWorks to give enterprise data a home and a brain

While the gears of research are turning fast developing new methods of machine intelligence, another, perhaps more impactful, trend is brewing in the field. Open source frameworks like Apache Spark

How fog computing pushes IoT intelligence to the edge

As the Internet of Things evolves into the Internet of Everything and expands its reach into virtually every domain, high-speed data processing, analytics and shorter response times are becoming more

Scala is the new golden child

Tooling in the data science community evolves quickly, and picking the right tool for a job -- not to mention a career -- can often be divisive. Which tools should you try to master? What is the prope

Spark fragmentation undermines community

Today the Hadoop distribution war comes down to a final battle between Cloudera’s CDH and Hortonworks’ HDP. That wasn’t always the case. At the peak of the market’s fragmentation, numerous com

Microsoft bets on Apache Spark to power its big data and analytics services

Microsoft today announced that it is making a serious commitment to the open source Apache Spark cluster computing framework. After dipping its toes into the Spark ecosystem last year, the company to

Basho open-sources its Riak TS database for the Internet Of Things

It seems as though every device manufacturer in the world wants to connect its products to the internet, from mattresses and washing machines to toasters and juicers. There’s so much data out there

Optimizing Analytics On Time Series Databases

When Matthew Fontaine Maury was restricted to a desk job because of a leg injury, little did he know that he was showcasing an impressive example of crowdsourced, open source and big data time series