How to Access Azure Data Lake Storage Data Using Entity Framework



This article shows how to access Azure Data Lake Storage data using an Entity Framework code-first approach. Entity Framework 6 is available in .NET 4.5 and above.

Microsoft Entity Framework is an object-relational mapping framework for working with data represented as objects. Although Visual Studio offers the ADO.NET Entity Data Model wizard to generate the Entity Model automatically, this model-first approach can be a disadvantage when your data source changes or when you need more control over how the entities operate. This article covers the code-first approach to accessing Azure Data Lake Storage data through the CData ADO.NET Provider, which gives you more flexibility and control.

  1. Open Visual Studio and create a new Windows Forms Application. This article uses a C# project with .NET 4.5.
  2. Run the command 'Install-Package EntityFramework' in the Package Manager Console in Visual Studio to install the latest release of Entity Framework.
  3. Modify the App.config file in the project to add a reference to the Azure Data Lake Storage Entity Framework 6 assembly and the connection string.

    Authenticating to a Gen 1 DataLakeStore Account

    Gen 1 uses OAuth 2.0 in Azure AD for authentication.

    For this, an Active Directory web application is required. You can create one as follows:

    1. Sign in to your Azure Account through the Azure portal.
    2. Select "Azure Active Directory".
    3. Select "App registrations".
    4. Select "New application registration".
    5. Provide a name and URL for the application. Select Web app for the type of application you want to create.
    6. Select "Required permissions" and change the required permissions for this app. At a minimum, "Azure Data Lake" and "Windows Azure Service Management API" are required.
    7. Select "Key" and generate a new key. Add a description, a duration, and take note of the generated key. You won't be able to see it again.

    To authenticate against a Gen 1 DataLakeStore account, the following properties are required:

    • Schema: Set this to ADLSGen1.
    • Account: Set this to the name of the account.
    • OAuthClientId: Set this to the application Id of the app you created.
    • OAuthClientSecret: Set this to the key generated for the app you created.
    • TenantId: Set this to the tenant Id. See the TenantId property in the provider documentation for more information on how to acquire this.
    • Directory: Set this to the path which will be used to store the replicated file. If not specified, the root directory will be used.
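
As an illustration of how the Gen 1 properties listed above combine into a connection string, here is a minimal sketch that uses the standard System.Data.Common.DbConnectionStringBuilder class. The class name Gen1ConnectionStringSketch, the Build helper, and all argument values are hypothetical placeholders. Whether the provider needs further properties for a Gen 1 account is not covered here, so treat the result as a starting point rather than a verified configuration.

```csharp
using System;
using System.Data.Common;

// Hypothetical helper, not part of the CData provider.
static class Gen1ConnectionStringSketch
{
    // Assembles the Gen 1 properties from the list above into one string.
    // Every argument is a placeholder supplied by the caller.
    public static string Build(string account, string clientId,
                               string clientSecret, string tenantId,
                               string directory = null)
    {
        var builder = new DbConnectionStringBuilder();
        builder["Schema"] = "ADLSGen1";
        builder["Account"] = account;
        builder["OAuthClientId"] = clientId;
        builder["OAuthClientSecret"] = clientSecret;
        builder["TenantId"] = tenantId;

        // Directory is optional; when it is omitted the root directory is used.
        if (!string.IsNullOrEmpty(directory))
        {
            builder["Directory"] = directory;
        }

        // The builder quotes values that contain separators such as ';'.
        return builder.ConnectionString;
    }

    static void Main()
    {
        Console.WriteLine(Build("myAccount", "myClientId", "myClientSecret", "myTenantId"));
    }
}
```

The returned string can be used as the connectionString attribute value in App.config. Using the builder instead of string concatenation mainly helps when a generated key contains characters that need quoting.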

    Authenticating to a Gen 2 DataLakeStore Account

    To authenticate against a Gen 2 DataLakeStore account, the following properties are required:

    • Schema: Set this to ADLSGen2.
    • Account: Set this to the name of the account.
    • FileSystem: Set this to the file system which will be used for this account.
    • AccessKey: Set this to the access key which will be used to authenticate the calls to the API. See the AccessKey property in the provider documentation for more information on how to acquire this.
    • Directory: Set this to the path which will be used to store the replicated file. If not specified, the root directory will be used.

    With a Gen 2 account, the relevant App.config sections resemble the following:

    <configuration>
      ...
      <connectionStrings>
        <add name="ADLSContext" connectionString="Offline=False;Schema=ADLSGen2;Account=myAccount;FileSystem=myFileSystem;AccessKey=myAccessKey;InitiateOAuth=GETANDREFRESH" providerName="System.Data.CData.ADLS" />
      </connectionStrings>
      <entityFramework>
        <providers>
          ...
          <provider invariantName="System.Data.CData.ADLS" type="System.Data.CData.ADLS.ADLSProviderServices, System.Data.CData.ADLS.Entities.EF6" />
        </providers>
      </entityFramework>
    </configuration>
  4. Add a reference to System.Data.CData.ADLS.Entities.EF6.dll, located in the lib -> 4.0 subfolder in the installation directory.
  5. Build the project at this point to ensure everything is working correctly. Once that's done, you can start coding using Entity Framework.
  6. Add a new .cs file to the project and add a class to it. This will be your database context, and it will extend the DbContext class. In the example, this class is named ADLSContext. The following code example overrides the OnModelCreating method to make the following changes:
    • Remove PluralizingTableNameConvention from the ModelBuilder Conventions.
    • Remove requests to the MigrationHistory table.
    using System.Data.Entity;
    using System.Data.Entity.Infrastructure;
    using System.Data.Entity.ModelConfiguration.Conventions;

    class ADLSContext : DbContext
    {
        public ADLSContext() { }

        protected override void OnModelCreating(DbModelBuilder modelBuilder)
        {
            // To remove the requests to the Migration History table
            Database.SetInitializer<ADLSContext>(null);
            // To remove the plural names
            modelBuilder.Conventions.Remove<PluralizingTableNameConvention>();
        }
    }
  7. Create another .cs file and name it after the Azure Data Lake Storage entity you are retrieving, for example, Resources. In this file, define both the Entity and the Entity Configuration, which will resemble the example below:
    using System.Data.Entity.ModelConfiguration;
    using System.ComponentModel.DataAnnotations.Schema;

    [System.ComponentModel.DataAnnotations.Schema.Table("Resources")]
    public class Resources
    {
        [System.ComponentModel.DataAnnotations.Key]
        public System.String FullPath { get; set; }
        public System.String Permission { get; set; }
    }
  8. Now that you have created an entity, add the entity to your context class:
    public DbSet<Resources> Resources { set; get; }
  9. With the context and entity finished, you are now ready to query the data in a separate class. For example:
    ADLSContext context = new ADLSContext();
    context.Configuration.UseDatabaseNullSemantics = true;
    var query = from line in context.Resources select line;
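
The query in step 9 can be extended with a filter and a projection. The sketch below shows the LINQ shape only: an in-memory list stands in for context.Resources so that it runs without the provider installed, and the sample rows, the QuerySketch class, and the PathsWithPermission helper are hypothetical. Whether the provider translates a particular LINQ operator into its own SQL is an assumption to verify against your installation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same shape as the Resources entity from step 7, without the EF attributes.
public class Resources
{
    public string FullPath { get; set; }
    public string Permission { get; set; }
}

// Hypothetical helper class used only for this illustration.
static class QuerySketch
{
    // Returns the FullPath of every resource whose Permission matches exactly.
    public static List<string> PathsWithPermission(IQueryable<Resources> resources,
                                                   string permission)
    {
        var query = from line in resources
                    where line.Permission == permission
                    select line.FullPath;

        // ToList() is the point at which the query is actually executed.
        return query.ToList();
    }

    static void Main()
    {
        // Made-up rows standing in for context.Resources.
        var sample = new List<Resources>
        {
            new Resources { FullPath = "/data/a.csv", Permission = "rwx" },
            new Resources { FullPath = "/data/b.csv", Permission = "r--" }
        }.AsQueryable();

        foreach (var path in PathsWithPermission(sample, "rwx"))
        {
            Console.WriteLine(path);
        }
    }
}
```

Against the real context, the same helper would be called as QuerySketch.PathsWithPermission(context.Resources, "rwx"), since DbSet<Resources> implements IQueryable<Resources>.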