How do I use Entity Framework Extensions BulkInsert and AllowDuplicateKeys

3

I am importing data from a CSV file into a SQL Server DB, the CSV may contain duplicate entries. My existing code uses SqlBulkCopy() and "IGNORE_DUP_KEY = ON", all is well with duplicates.

I am switching to code first using EF Core with Zzz Projects Entity Framework Extensions.

Primary keys are defined in DbContext:

        protected override void OnModelCreating(ModelBuilder modelBuilder)
        {
            if (modelBuilder == null)
                throw new ArgumentNullException(nameof(modelBuilder));

            modelBuilder.Entity<AmbientWeatherData>()
                .HasKey(u => new { u.TimeStamp });
            modelBuilder.Entity<ApcUpsData>()
                .HasKey(u => new { u.TimeStamp });
            modelBuilder.Entity<PurpleAirData>()
                .HasKey(u => new { u.TimeStamp });
            modelBuilder.Entity<Vr1710Data>()
                .HasKey(u => new { u.TimeStamp });
            modelBuilder.Entity<Vr1710EventData>()
                .HasKey(u => new { u.TimeStamp, u.Type });
            modelBuilder.Entity<WeatherGooseData>()
                .HasKey(u => new { u.TimeStamp });
            modelBuilder.Entity<WundergroundData>()
                .HasKey(u => new { u.TimeStamp });
            modelBuilder.Entity<WundergroundDailyData>()
                .HasKey(u => new { u.TimeStamp });
        }

Data is inserted using BulkInsert:

dbContext.BulkInsert(data, options => options.AllowDuplicateKeys = true);

Exception when inserting duplicates:

Microsoft.Data.SqlClient.SqlException (0x80131904): Violation of PRIMARY KEY constraint 'PK_Vr1710Data'. Cannot insert duplicate key in object 'dbo.Vr1710Data'. The duplicate key value is (2014 - 12 - 28 14:26:07.0000000).

The function that inserts the data is generic and type agnostic.

        public static bool Import<TDataType, TLoaderType, TConfigType>(string fileName, string connectionString)
            where TLoaderType : TelemetryLoader<TDataType, TConfigType>, new()
            where TConfigType : TelemetryConfig
            where TDataType : TelemetryData
        {
// ...
}

The online examples for BulkInsert() and AllowDuplicateKeys require the use of ColumnPrimaryKeyExpression From: https://entityframework-extensions.net/bulk-insert#insert-only-if-the-entity-not-already-exists

context.BulkInsert(customers, options => {
    options.InsertIfNotExists = true;
    options.ColumnPrimaryKeyExpression = c => c.Code;
});

How can I ignore duplicates without needing a specialization for ColumnPrimaryKeyExpression for every data type, but instead use my keys as defined in the DbContext?

c#
.net-core
entity-framework-core
asked on Stack Overflow Mar 8, 2020 by PieterV • edited Mar 8, 2020 by stuartd

1 Answer

1

How can I ignore duplicates without needing a specialization for ColumnPrimaryKeyExpression for every data type, but instead use my keys as defined in the DbContext?

By default, the library already takes the key from the DbContext.

In the example, c.Code was representing a custom key (The CustomerID is the real key).

If I understand correctly the problem, you need to use the following code:

context.BulkInsert(customers, options => {
    options.InsertIfNotExists = true;
    options.AllowDuplicateKeys = true;
});

Both options are required.

answered on Stack Overflow Mar 9, 2020 by Jonathan Magnan

User contributions licensed under CC BY-SA 3.0