Azure HDInsight integration with Data Lake Storage Gen2 preview – ACL and security update

Source: azure.microsoft.com

Today we are sharing an update to the Azure HDInsight integration with Azure Data Lake Storage Gen2. This integration will enable HDInsight customers to securely drive analytics from data stored in Azure Data Lake Storage Gen2 using popular open source frameworks such as Apache Spark, Hive, MapReduce, Kafka, Storm, and HBase.

Azure Data Lake Storage Gen2

Azure Data Lake Storage Gen2 is the only data lake designed specifically for enterprises to run large-scale analytics workloads in the cloud. It unifies the core capabilities of the first generation of Azure Data Lake with a Hadoop-compatible file system endpoint that is now directly integrated into Azure Blob Storage. This enhancement combines the scale and cost benefits of object storage with the reliability and performance typically associated only with on-premises file systems. The new file system includes a full hierarchical namespace that makes files and folders first-class citizens, which translates to faster, more reliable analytics job execution.

Azure Data Lake Storage Gen2 also includes limitless storage, ensuring capacity to meet the needs of even the largest, most complex workloads. In addition, Azure Data Lake Storage Gen2 delivers native integration with Azure Active Directory and supports POSIX-compliant ACLs to enable granular permission assignments on files and folders.

Key benefits

Hadoop compatible access

Azure Data Lake Storage Gen2 allows you to manage and access data just as you would with a Hadoop Distributed File System (HDFS). The ABFS driver is available within all Apache Hadoop environments. File systems are well understood by developers and users alike. There is no need to learn a new storage paradigm when you move to the cloud as the file system interface exposed by Azure Data Lake Storage Gen2 is the same paradigm used by computers, large and small.
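As a rough illustration of what Hadoop-compatible access looks like in practice, the ABFS driver addresses data through `abfs://` or `abfss://` (TLS) URIs of the form `<file_system>@<account>.dfs.core.windows.net/<path>`. The sketch below only builds such a URI. The helper name `abfs_uri` and the account and file system names are hypothetical and are not part of any Azure SDK.

```python
def abfs_uri(file_system: str, account: str, path: str, secure: bool = True) -> str:
    """Build an ABFS URI for a path in a Data Lake Storage Gen2 account.

    Hypothetical helper for illustration only. `secure=True` selects the
    `abfss` scheme, which uses TLS.
    """
    scheme = "abfss" if secure else "abfs"
    return f"{scheme}://{file_system}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


# Example with made-up names: file system "data" in account "myaccount".
uri = abfs_uri("data", "myaccount", "/raw/events")
print(uri)
```

A URI like this can be passed wherever a Hadoop-compatible framework expects a file system path, for example as an input location for a Spark or Hive job.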

Role based access control

The security model for Azure Data Lake Storage Gen2 supports ACLs and POSIX permissions.

These storage ACL capabilities, along with fine-grained access control via Apache Ranger in HDInsight for applications such as Spark, Kafka, Hive, and HBase, make it convenient to open up your data lake to your entire organization with appropriate security controls and auditing in place.
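To make the idea of POSIX-style ACLs more concrete, here is a minimal sketch of how such an access check is commonly evaluated: owner first, then named users, then groups, then everyone else, with a mask limiting named-user and group entries. This is a simplified model for illustration, not the service's actual implementation. The `Acl` class, `is_allowed` function, and all user and group names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Acl:
    """Simplified POSIX-style ACL for one file or folder (illustrative only)."""
    owner: str
    owning_group: str
    owner_perms: str            # e.g. "rwx"
    group_perms: str            # e.g. "r-x"
    other_perms: str            # e.g. "---"
    mask: str = "rwx"           # upper bound for named users and all groups
    named_users: dict = field(default_factory=dict)
    named_groups: dict = field(default_factory=dict)


def _grants(perms: str, wanted: str, mask: str = "rwx") -> bool:
    # Every requested permission must appear in both the entry and the mask.
    return all(p in perms and p in mask for p in wanted)


def is_allowed(acl: Acl, user: str, groups: list, wanted: str) -> bool:
    """Return True if `user` (member of `groups`) holds all `wanted` permissions."""
    if user == acl.owner:
        return _grants(acl.owner_perms, wanted)          # the mask does not apply to the owner
    if user in acl.named_users:
        return _grants(acl.named_users[user], wanted, acl.mask)
    group_entries = []
    if acl.owning_group in groups:
        group_entries.append(acl.group_perms)
    group_entries += [p for g, p in acl.named_groups.items() if g in groups]
    if group_entries:
        # Any single matching group entry may grant the request.
        return any(_grants(p, wanted, acl.mask) for p in group_entries)
    return _grants(acl.other_perms, wanted)


acl = Acl(owner="alice", owning_group="analysts",
          owner_perms="rwx", group_perms="r-x", other_perms="---",
          mask="r-x", named_users={"bob": "rw-"})
print(is_allowed(acl, "bob", [], "r"))   # bob's entry grants read
print(is_allowed(acl, "bob", [], "w"))   # write is removed by the mask
```

In this model, coarse permissions live on the storage ACLs, while finer-grained policies, such as those managed through Apache Ranger, would be layered on top at the application level.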

SSL only access

With this update, ADLS Gen2 accounts can be accessed only over HTTPS, ensuring that all communication between HDInsight and storage is encrypted.

Global availability

Azure Data Lake Storage Gen2 and HDInsight are available across the globe, offering the scale needed to bring big data applications closer to users around the world, preserving data residency, and offering comprehensive compliance and resiliency options for customers.

Atomic directory manipulation

Object stores approximate a directory hierarchy by adopting a convention of embedding slashes (/) in the object name to denote path segments. While this convention works for organizing objects, the convention provides no assistance for actions like moving, renaming, or deleting directories. Without real directories, applications must process potentially millions of individual blobs to achieve directory-level tasks. By contrast, the hierarchical namespace processes these tasks by updating a single entry (the parent directory).

This dramatic optimization is especially significant for many big data analytics frameworks. Tools like Hive and Spark often write output to temporary locations and then rename the location at the conclusion of the job. Without the hierarchical namespace, this rename can often take longer than the analytics process itself. Lower job latency equals lower total cost of ownership (TCO) for analytics workloads.
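The difference between the two models can be sketched with a toy in-memory simulation. This is only an illustration of the cost argument, not how either storage service is implemented, and the function names and paths are hypothetical. A flat store must rewrite every object under the prefix, while a hierarchical namespace updates one entry in the parent directory.

```python
def rename_flat(objects: dict, src: str, dst: str) -> int:
    """Rename a 'directory' in a flat object store.

    Every object whose key starts with the source prefix must be rewritten.
    Returns the number of objects touched.
    """
    src_prefix = src.rstrip("/") + "/"
    dst_prefix = dst.rstrip("/") + "/"
    moved = [key for key in objects if key.startswith(src_prefix)]
    for key in moved:
        objects[dst_prefix + key[len(src_prefix):]] = objects.pop(key)
    return len(moved)


def rename_hierarchical(root: dict, parent_path: list, old: str, new: str) -> int:
    """Rename a directory in a tree-shaped namespace.

    Only the entry in the parent directory changes, however many files the
    directory contains. Returns the number of entries touched.
    """
    parent = root
    for segment in parent_path:
        parent = parent[segment]
    parent[new] = parent.pop(old)
    return 1


# A job wrote 1,000 part files to a temporary location and now commits them.
flat = {f"out/_temporary/part-{i}": b"" for i in range(1000)}
tree = {"out": {"_temporary": {f"part-{i}": b"" for i in range(1000)}}}

print(rename_flat(flat, "out/_temporary", "out/final"))             # touches every object
print(rename_hierarchical(tree, ["out"], "_temporary", "final"))    # touches one entry
```

In the flat case the work grows with the number of objects, and a failure partway through leaves the directory half-renamed. The single-entry update is what makes the operation atomic.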

Scale

HDInsight and Azure Data Lake Storage Gen2 bring new levels of scale for big data workloads. Customers can run workloads that scale to hundreds of Gb/s of throughput and petabytes of storage without needing to shard the data across multiple storage accounts.

Encryption at rest

Encryption in Azure Data Lake Storage Gen2 helps you protect your data, implement enterprise security policies, and meet regulatory compliance requirements. Azure Data Lake Storage Gen2 supports encryption of data both at rest and in transit.

Network firewall

Integrated network firewall capabilities allow you to define rules that restrict access to requests originating from specified networks or from HDInsight clusters in a specific virtual network (VNet).
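Conceptually, a network rule of this kind is an allow list evaluated against the source of each request, with everything else denied. The sketch below shows that logic using Python's standard `ipaddress` module. It is not the Azure firewall API, and the function name and address ranges are made up for illustration.

```python
import ipaddress


def is_request_allowed(source_ip: str, allowed_networks: list) -> bool:
    """Return True if `source_ip` falls inside any allowed CIDR range.

    Deny-by-default: an empty allow list rejects every request.
    """
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(net) for net in allowed_networks)


# Hypothetical rule set: only the cluster's virtual network range is allowed.
rules = ["10.1.0.0/16"]
print(is_request_allowed("10.1.2.3", rules))     # inside the allowed range
print(is_request_allowed("192.168.0.5", rules))  # outside the allowed range
```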

How does the integration work?

The integration between HDInsight and Azure Data Lake Storage Gen2 is based on a user-assigned managed identity. You grant HDInsight appropriate access to your Azure Data Lake Storage Gen2 accounts through this identity. Once configured, your HDInsight cluster can use Azure Data Lake Storage Gen2 as its storage.

[Figure: HDInsight storage example]

Getting started

Start using Azure Data Lake Storage Gen2 with Azure HDInsight today.

Feedback

We look forward to your comments and feedback. If there are any feature requests, customer asks, or suggestions, please contact us at [email protected].

Additional resources

...
