Apache Spark
Databricks launches Project Lightspeed, its next-gen Spark streaming engine
At its Data + AI Summit, Databricks today made the requisite number of announcements one would expect from a company’s flagship developer event. Among those are the launch of Delta Lake 2.0, the
Google Cloud launches a managed Spark service
At its Cloud Next event, Google today announced the launch of Spark on Google Cloud as a fully managed service. With this, the popular open source data processing engine will become a premium offering
Databricks co-founder and CEO Ali Ghodsi is coming to TC Sessions: SaaS
In many industries, Databricks has become synonymous with modern data warehousing and data lakes. Since it’s exactly these technologies that are at the core of what modern businesses are doing a
Adobe brings over 20,000 design assets to Spark
Adobe is launching an update to its Spark social media design tool today that will bring more than 20,000 new design assets to the service. This update, which follows the launch of animations in Spark
Databricks launches SQL Analytics
AI and data analytics company Databricks today announced the launch of SQL Analytics, a new service that makes it easier for data analysts to run their standard SQL queries directly on data lakes. And
Anyscale, from the creators of the Ray distributed computing project, launches with $20.6M led by a16z
Open source has become a critical building block of modern software, and today a new startup is coming out of stealth to capitalise on one of the newer frontiers in open source: using it to build and
Databricks announces $400M round on $6.2B valuation as analytics platform continues to grow
Databricks is a SaaS business built on top of a bunch of open-source tools, and apparently it’s been going pretty well on the business side of things. In fact, the company claims to be one of th
Databricks brings its Delta Lake project to the Linux Foundation
Databricks, the big data analytics service founded by the original developers of Apache Spark, today announced that it is bringing its Delta Lake open-source project for building data lakes to the Lin
Google brings Cloud Dataproc to Kubernetes
Cloud Dataproc is probably one of the lesser-known products in Google Cloud’s portfolio, but it’s a powerful tool for data wranglers who are looking for a fully managed cloud service that lets the
Databricks raises $250M at a $2.75B valuation for its analytics platform
Databricks, the company founded by the original team behind the Apache Spark big data analytics engine, today announced that it has raised a $250 million Series E round led by Andreessen Horowitz. Coa
Databricks releases serverless platform for Apache Spark along with new library supporting deep learning
Today to kick off Spark Summit, Databricks announced a Serverless Platform for Apache Spark — welcome news for developers looking to reduce time spent on cluster management. The move to simplify d
Yahoo supercharges TensorFlow with Apache Spark
Yahoo, model Apache Spark citizen and developer of CaffeOnSpark, which made it easier for developers building deep learning models in Caffe to scale with parallel processing, is open sourcing a new
IBM releases DataWorks to give enterprise data a home and a brain
While the gears of research are turning fast developing new methods of machine intelligence, another, perhaps more impactful, trend is brewing in the field. Open source frameworks like Apache Spark
How fog computing pushes IoT intelligence to the edge
As the Internet of Things evolves into the Internet of Everything and expands its reach into virtually every domain, high-speed data processing, analytics and shorter response times are becoming more
Scala is the new golden child
Tooling in the data science community evolves quickly, and picking the right tool for a job -- not to mention a career -- can often be divisive. Which tools should you try to master? What is the prope
Spark fragmentation undermines community
Today the Hadoop distribution war comes down to a final battle between Cloudera’s CDH and Hortonworks’ HDP. That wasn’t always the case. At the peak of the market’s fragmentation, numerous com
Microsoft bets on Apache Spark to power its big data and analytics services
Microsoft today announced that it is making a serious commitment to the open source Apache Spark cluster computing framework. After dipping its toes into the Spark ecosystem last year, the company to
Basho open-sources its Riak TS database for the Internet Of Things
It seems as though every device manufacturer in the world wants to connect its products to the internet, from mattresses and washing machines to toasters and juicers. There’s so much data out there
Optimizing Analytics On Time Series Databases
When Matthew Fontaine Maury was restricted to a desk job because of a leg injury, little did he know that he was showcasing an impressive example of crowdsourced, open source and big data time series
IBM Pours Researchers And Resources Into Apache Spark Project
IBM today pledged it would devote 3500 researchers to the open source big data project, Apache Spark. It also announced that it was open sourcing its own IBM SystemML machine learning technology in a