Onstage today at Google’s Cloud Next conference, the company announced a series of new tools to assist users with data preparation and integration. The updates bolster both the power and agility of Google Cloud for businesses.
The first of these releases is the new private beta of Google Cloud Dataprep. Dataprep makes the data preparation process more visual. The tool includes anomaly detection and employs machine learning to suggest data transformations that can improve the quality of data.
To open data preparation up to non-specialists, Google prioritized a clean interface with drag-and-drop controls. Dataprep is built to integrate with the rest of Google Cloud Platform (GCP), meaning it can create pipelines in Google Cloud Dataflow for easy export to BigQuery.
BigQuery itself also got attention from Google, with a new BigQuery Data Transfer Service. The release aims to simplify the process of merging data from multiple sources. Google is also adding support for commercial data sets from Xignite, HouseCanary, Remind, AccuWeather and Dow Jones.
With BigQuery connected to visualization services like Tableau, users can seamlessly prepare and display analytics. BigQuery will now support Cloud Bigtable for larger projects so users don’t have to waste time copying data from one system to the next.
“We’ve made it really easy for marketing teams to build marketing analytics on GCP,” said Brian Stevens, vice president of cloud platforms at Google.
Python developers will be pleased to know that Google is moving its Python SDK for Cloud Dataflow to general availability. This broadens Dataflow's developer community beyond Java.
Cloud Datalab is also moving to general availability. The workflow tool will make it easier for developers using Jupyter notebook-based environments and standard SQL to perform data analysis. TensorFlow and scikit-learn are getting support, while batch and stream processing will now be possible using Cloud Dataflow or Apache Spark via Cloud Dataproc. Meanwhile, Stackdriver Monitoring for Cloud Dataflow is moving to beta, providing monitoring and diagnostics for apps hosted on GCP or AWS.