Encrypt Data At-Rest and In-Flight on Amazon EMR with Security Configurations
Customers running analytics, stream processing, machine learning, and ETL workloads on personally identifiable information, health information, and financial data have strict requirements for encryption of data at-rest and in-transit. The Apache Spark and Hadoop ecosystems lend themselves to these big data use cases, and customers have asked us to provide a quick and easy way to encrypt data at-rest and data in-transit between nodes in each execution framework.
With the release of security configurations for Amazon EMR releases 5.0.0 and 4.8.0, customers can now easily enable encryption for data at-rest in Amazon S3, HDFS, and local disk, and enable encryption for data in-flight in the Apache Spark, Apache Tez, and Apache Hadoop MapReduce frameworks.
Security configurations let you specify the encryption keys and certificates to use, with options ranging from AWS Key Management Service to your own custom encryption materials provider (for an example of a custom provider, see the Nasdaq post about EMRFS and Amazon S3 client-side encryption). Additionally, you can apply a single security configuration to multiple clusters, which standardizes your security settings. For instance, this helps customers encrypt data consistently across their HIPAA-compliant Amazon EMR workloads.
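To make the shape of a security configuration more concrete, here is a minimal sketch in Python that assembles one as JSON. It assumes AWS KMS keys for both Amazon S3 (SSE-KMS) and local disk encryption, and PEM certificates stored in Amazon S3 for in-flight encryption. The helper name `build_security_configuration`, the key ARN, and the S3 path are hypothetical placeholders, and the field names follow the EMR security configuration JSON format as we understand it, so check them against the current EMR documentation before relying on them.

```python
import json


def build_security_configuration(kms_key_arn, tls_certs_s3_uri):
    """Return a dict describing at-rest and in-flight encryption settings.

    Hypothetical helper for illustration only; the key names are assumed
    to match the EMR security configuration JSON format.
    """
    return {
        "EncryptionConfiguration": {
            "EnableAtRestEncryption": True,
            "EnableInTransitEncryption": True,
            "AtRestEncryptionConfiguration": {
                # Data written to Amazon S3 through EMRFS.
                "S3EncryptionConfiguration": {
                    "EncryptionMode": "SSE-KMS",
                    "AwsKmsKey": kms_key_arn,
                },
                # Local disk encryption on the cluster nodes.
                "LocalDiskEncryptionConfiguration": {
                    "EncryptionKeyProviderType": "AwsKms",
                    "AwsKmsKey": kms_key_arn,
                },
            },
            "InTransitEncryptionConfiguration": {
                # Zipped PEM certificates used for in-flight encryption.
                "TLSCertificateConfiguration": {
                    "CertificateProviderType": "PEM",
                    "S3Object": tls_certs_s3_uri,
                },
            },
        }
    }


# Placeholder values; substitute your own key ARN and certificate bundle.
config = build_security_configuration(
    kms_key_arn="arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",
    tls_certs_s3_uri="s3://example-bucket/certs/my-certs.zip",
)
config_json = json.dumps(config, indent=2)
print(config_json)
```

The resulting JSON string could then be registered once and referenced by name from several clusters, for example with the boto3 EMR client's `create_security_configuration(Name=..., SecurityConfiguration=config_json)` call and the `SecurityConfiguration` parameter of `run_job_flow`. Reusing one named configuration is what keeps settings consistent across clusters.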