Cloudera Unveils Altus to Simplify Hadoop in the Cloud
Running Hadoop, whether on-premises or in the cloud, is neither simple nor easy.
Administrators with specialized skills are needed to configure, manage, and maintain the clusters for their clients, who are data scientists, engineers, and analysts. Now Cloudera is looking to eliminate that burden with a new cloud-based offering dubbed Altus.
Altus, which the Hadoop distributor unveiled today at its Strata Data Conference in London, is a new platform designed to make it easier for users to access and run its Hadoop suite of tools and applications on public cloud infrastructure. The first Altus offering runs on AWS, consumes data from the S3 object store, and targets data engineering workloads, but other Cloudera products and public cloud platforms (and object stores) will be supported over time.
It’s all about making Hadoop easy, says David Tishgart, who heads up product marketing for Cloudera’s Data Engineering, the first Cloudera product to be exposed to the public atop the Altus platform.
“We want to make sure the data engineers, who are more frequently working on transient clusters in cloud, are able to quickly spin up jobs, run their jobs, and terminate them, but do so in a way that doesn’t require them to also deal with cluster operations and management,” Tishgart tells Datanami.
Cloudera is no cloud newbie – 18% of the workloads under Cloudera’s Distribution of Hadoop (CDH) already run on cloud infrastructure, Tishgart says. The difference is that with standard cloud CDH, organizations have had to provide their own personnel to handle cluster and cloud administration. The Altus platform-as-a-service (PaaS) offering eliminates that requirement.
“Before Altus, when you wanted to run your data processing jobs on cloud environments, you also had to deal with the infrastructure overhead, the management and operations of your cluster,” Tishgart says. “What Altus does is it removes that burden completely.”
Cloudera sells its Hadoop-based distribution through various SKUs (or stock keeping units). The Data Engineering SKU is focused on data ingest and transformation tasks, and includes Spark, Hive, Hive on Spark, and MapReduce2 engines.
Up to this point, Cloudera customers could license the Data Engineering SKU, deploy it on the AWS or Azure public clouds, and then manage those clusters with Cloudera Director. Now that Data Engineering is a PaaS offering on AWS, clients using it no longer have to worry about Cloudera Director (although Cloudera points out that they will still need Director for other cloud uses).