New serverless options for Amazon EMR, Amazon MSK, and Amazon Redshift help customers analyze vast amounts of data without having to configure, scale, or manage the underlying infrastructure
Informatica, NextGen Healthcare, and Huron among customers and partners using new serverless analytics options
Amazon Web Services, Inc. (AWS), an Amazon.com, Inc. company (NASDAQ: AMZN), today announced the general availability of three new serverless analytics offerings that make it even easier for customers to analyze vast amounts of data without having to configure, scale, or manage the underlying infrastructure. Today’s announcements include new serverless offerings for three services: Amazon EMR, to enable customers to run analytics applications using open-source big data frameworks (Apache Spark and Hive) without having to manage the underlying infrastructure; Amazon Managed Streaming for Apache Kafka (Amazon MSK), to simplify real-time data ingestion and streaming; and Amazon Redshift, to allow customers to run high-performance data warehousing and analytics workloads on petabytes of data without having to manage clusters. Along with other serverless analytics offerings from AWS such as Amazon QuickSight for business intelligence and AWS Glue for data integration, the new offerings announced today make it significantly easier and more cost-effective for customers to modernize their infrastructure and analyze vast amounts of data without worrying about capacity planning or incurring excess costs by over-provisioning for peak demand. There are no upfront commitments or additional costs to use Amazon EMR Serverless, Amazon MSK Serverless, and Amazon Redshift Serverless, and customers pay only for the precise capacity needed for their analytics workloads.
“By offering the most serverless options for data analytics in the cloud—including options for data warehousing, big data processing, real-time data analysis, data integration, interactive dashboards and visualizations, and more—we are making it even easier for customers to maximize the value of their data to drive innovation, improve customer experiences, and make better decisions faster,” said Swami Sivasubramanian, vice president of Database, Analytics, and Machine Learning at AWS. “With these new serverless options, customers can run even the most variable and intermittent analytics workloads and expand the use of analytics throughout their organizations without worrying about provisioning or scaling capacity—or incurring excess cost.”
AWS customers choose from a wide variety of purpose-built analytics services to derive maximum value from their organizations’ data, including Amazon EMR for processing vast amounts of unstructured data (using open-source big data frameworks like Apache Spark and Hive), Amazon MSK for ingesting real-time data streams, and Amazon Redshift for data warehousing. While many customers appreciate the fine-grained control these services offer, a subset of customers with highly variable or intermittent workloads would prefer to have AWS manage the underlying infrastructure by automatically adding or subtracting resources based on application demand. To remove the complexity of scaling and managing infrastructure, AWS introduced the concept of serverless, event-driven computing in 2014. Many customers have since adopted serverless technologies on AWS—including Amazon Kinesis Data Streams for real-time data streaming, AWS Glue for data integration, and Amazon QuickSight for interactive dashboards and visualizations—to take advantage of benefits like automatic provisioning, on-demand scaling, and pay-for-use pricing. With the new serverless offerings for Amazon EMR, Amazon MSK, and Amazon Redshift, AWS offers the broadest set of serverless analytics capabilities in the cloud, making it even easier for customers to lower costs, expand analytics to more users, and maximize their data’s value.
- Serverless big data analytics with Amazon EMR Serverless: Tens of thousands of customers use Amazon EMR to run open-source frameworks like Apache Spark and Hive for large-scale distributed data processing jobs, interactive SQL queries, and machine learning applications. Amazon EMR supports the most big data frameworks in the cloud, enabling customers to run big data applications and petabyte-scale data analytics faster, and at less than half the cost of on-premises solutions. With Amazon EMR Serverless, customers can simply specify the framework they want to run, and Amazon EMR Serverless automatically provisions, manages, and scales the necessary compute and memory resources as workload demands change. Customers can get started with Amazon EMR Serverless by simply selecting an open-source framework and submitting their jobs using the Amazon EMR application programming interface (API), the AWS Command Line Interface (AWS CLI), or an integrated development environment (IDE) with Amazon EMR Studio. Amazon EMR Serverless is generally available today to customers running Amazon EMR in US East (N. Virginia), US West (Oregon), Asia Pacific (Tokyo), and Europe (Ireland), with availability in additional AWS Regions coming soon. To get started with Amazon EMR Serverless, visit aws.amazon.com/emr/serverless.
- Serverless data streaming with Amazon MSK Serverless: Organizations are increasingly adopting Apache Kafka to capture and analyze real-time data streams from Internet of Things (IoT) devices, website clickstreams, database logs, and many other sources where dynamic data is continuously generated. Amazon MSK Serverless provisions, manages, and scales clusters automatically, so customers no longer have to worry about capacity planning or unpredictable streaming workloads. To take advantage of Amazon MSK Serverless, customers simply create a cluster in the Amazon MSK console, set up a private and secure Apache Kafka endpoint, and use new or existing Apache Kafka clients to stream data. Amazon MSK Serverless is generally available today to customers running Amazon MSK in US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), and Europe (Stockholm), with availability in additional AWS Regions coming soon. To get started with Amazon MSK Serverless, visit aws.amazon.com/msk/features/msk-serverless.
- Serverless data warehouse with Amazon Redshift Serverless: Tens of thousands of customers are collectively processing more than two exabytes of data with Amazon Redshift every day. Amazon Redshift offers up to 3x better price performance than other enterprise cloud data warehouses, providing customers with faster data analytics at lower cost. Amazon Redshift Serverless now makes it even easier to get insights from data quickly without the need to manage data warehouse infrastructure. Customers currently managing their own Amazon Redshift clusters can choose to move them to the new serverless option using the Amazon Redshift console or API without making changes to their applications. Amazon Redshift Serverless is generally available today to customers running Amazon Redshift in US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Seoul), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), Europe (London), and Europe (Stockholm), with availability in additional AWS Regions coming soon. To get started with Amazon Redshift Serverless, visit aws.amazon.com/redshift/redshift-serverless.
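For readers who want a concrete picture of the job-submission step described for Amazon EMR Serverless above, the following is a minimal sketch in Python, not an official AWS example. It assumes the boto3 SDK's `emr-serverless` client and its `start_job_run` operation, and that an EMR Serverless application already exists. The function names and the application ID, role ARN, and S3 paths are hypothetical placeholders.

```python
from typing import Any, Dict, List


def build_spark_job_request(
    application_id: str,
    execution_role_arn: str,
    entry_point: str,
    arguments: List[str],
) -> Dict[str, Any]:
    """Build keyword arguments for the EMR Serverless StartJobRun operation."""
    return {
        "applicationId": application_id,
        "executionRoleArn": execution_role_arn,
        "jobDriver": {
            "sparkSubmit": {
                "entryPoint": entry_point,
                "entryPointArguments": list(arguments),
            }
        },
    }


def submit_spark_job(request: Dict[str, Any], region: str = "us-east-1") -> str:
    """Submit the job and return its job run ID (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the request builder works without boto3 installed

    client = boto3.client("emr-serverless", region_name=region)
    response = client.start_job_run(**request)
    return response["jobRunId"]


# Hypothetical placeholder values, for illustration only.
example_request = build_spark_job_request(
    application_id="00example",
    execution_role_arn="arn:aws:iam::123456789012:role/ExampleJobRole",
    entry_point="s3://example-bucket/scripts/wordcount.py",
    arguments=["s3://example-bucket/output/"],
)
```

Calling `submit_spark_job(example_request)` would send the request to AWS. The request names only the application, the execution role, and the Spark script, with no cluster size or instance type, which mirrors the point of the announcement that capacity is handled by the service.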
Amobee provides advertising solutions that help customers unify audiences to optimize results across all TV, connected TV, and digital media to drive customers’ growth. “While we like the flexibility that Amazon EMR provides to scale resources up or down automatically based on workload requirements, some of our infrequent but heavy jobs were disrupting existing clusters, necessitating us to create and manage additional clusters for these jobs,” said David Ortiz, senior manager of Engineering at Amobee. “Amazon EMR Serverless allowed us to right-size the CPU and memory resources that the jobs required without the overhead of any additional processes, helping us streamline our workflows and cut costs by providing just the right amount of capacity to meet workload demands precisely when we need it.”
Powered by Apache Kylin, Kyligence Cloud accelerates organizations’ business intelligence and analytics on big data. “To help customers make critical business decisions from an extensive volume of data, our platform loads and processes a significant amount of data using Spark jobs. Doing this at scale became costly and required operational overhead,” said Luke Han, co-founder and CEO at Kyligence. “We adopted Amazon EMR Serverless to help us eliminate the costs and administrative tasks of maintaining and tuning clusters. Amazon EMR Serverless has helped us reduce that complexity by taking over the time-consuming tasks of managing, tuning, and optimizing clusters for performance as workload demand changes. And because it is less expensive than our previous solution, we can pass cost savings on to our customers.”
Glas Data provides simplified data management for the agricultural sector. “We ingest and process data in real time using Amazon MSK to inform automated data analytics and alerts for our customers. Our workloads can be highly variable and unpredictable, with some actions generating only a few messages that require a small amount of capacity, and others creating a much larger number of messages that require significantly more capacity,” said Robert Sanders, CTO and founder at Glas Data. “This workload variability makes it difficult to predict which action will be taken at what time, causing us to monitor and adjust capacity constantly to avoid unexpected capacity constraints. Amazon MSK Serverless automatically scales capacity up and down based on workload requirements, removing the system administration overhead and freeing us up to develop our solution without worrying about memory and storage constraints or incurring excess costs.”
NextGen Healthcare is a leading provider of innovative healthcare technology solutions on a mission to improve the lives of those who practice medicine and their patients. “Our NextGen Population Health solution provides actionable insights directly to care teams via the aggregation and transformation of multi-source data. Optimizing our systems to reduce manual interventions like setting up and managing data warehouse infrastructure is critical to our success,” said Owen Zacharias, vice president of Application Delivery at NextGen Healthcare. “With Amazon Redshift Serverless, we’re no longer managing complex warehouse orchestration systems. Amazon Redshift Serverless has improved workload performance, and its auto-scaling capabilities allow us to use the speed of Amazon Redshift for even our most dynamic workloads, while only paying for what we use. We’re excited to migrate additional workloads to Amazon Redshift Serverless. It’s a game changer.”
Informatica provides an end-to-end cloud data management platform that connects, manages, unifies, and governs data, empowering enterprises to modernize and advance their data strategies. “Organizations today are looking to expand data and analytics, but face challenges with data silos, cost constraints, and infrastructure management,” said Rik Tamm-Daniels, GVP of Ecosystems at Informatica. “Amazon Redshift Serverless helps address these challenges by automatically provisioning and scaling resources to meet demand, making it easy to run analytics without the need to set up and manage data warehouse infrastructure or the worry of incurring excess costs by overprovisioning for peak demand. Together with our Intelligent Data Management Cloud on AWS, Amazon Redshift Serverless helps us provide Informatica customers with a serverless data and analytics foundation to power their most business-critical initiatives.”
The Rail Delivery Group (RDG) brings together the companies that run Britain’s railway into a single team to deliver a better railway experience. “Amazon Redshift Serverless delivers high performance for our teams, and because it automatically provisions and manages the underlying data warehouse, more of our business users can quickly and easily get insights from data,” said Toby Ayre, head of Data and Analytics at Rail Delivery Group. “Amazon Redshift Serverless automatically scales data warehouse capacity to handle even our most demanding and unpredictable workloads, helping us lower our costs and expand the use of analytics across our organization.”
Huron is a global professional services firm that collaborates with clients to create sound strategies, optimize operations, accelerate digital transformation, and empower businesses and their people to own their future. “We’re thrilled to include Amazon Redshift Serverless as an exciting addition to our data analytics workflow. This offering seamlessly replaces several parts of our previous infrastructure, and its simplicity makes it very easy to use,” said Harry Gollakota, data engineer at Huron. “Amazon Redshift Serverless drastically helps reduce data engineering latency and acts as a force multiplier in accelerating development. Implementing Amazon Redshift Serverless helped us cut through our data engineering backlog and now allows us to spend more of our time gathering insights from the data.”