I am importing data from a CSV file into a SQL Server DB, the CSV may contain duplicate entries. My existing code uses SqlBulkCopy() and "IGNORE_DUP_KEY = ON", all is well with duplicates.
I am switching to code first using EF Core with Zzz Projects Entity Framework Extensions.
Primary keys are defined in DbContext:
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
if (modelBuilder == null)
throw new ArgumentNullException(nameof(modelBuilder));
modelBuilder.Entity<AmbientWeatherData>()
.HasKey(u => new { u.TimeStamp });
modelBuilder.Entity<ApcUpsData>()
.HasKey(u => new { u.TimeStamp });
modelBuilder.Entity<PurpleAirData>()
.HasKey(u => new { u.TimeStamp });
modelBuilder.Entity<Vr1710Data>()
.HasKey(u => new { u.TimeStamp });
modelBuilder.Entity<Vr1710EventData>()
.HasKey(u => new { u.TimeStamp, u.Type });
modelBuilder.Entity<WeatherGooseData>()
.HasKey(u => new { u.TimeStamp });
modelBuilder.Entity<WundergroundData>()
.HasKey(u => new { u.TimeStamp });
modelBuilder.Entity<WundergroundDailyData>()
.HasKey(u => new { u.TimeStamp });
}
Data is inserted using BulkInsert:
dbContext.BulkInsert(data, options => options.AllowDuplicateKeys = true);
Exception when inserting duplicates:
Microsoft.Data.SqlClient.SqlException (0x80131904): Violation of PRIMARY KEY constraint 'PK_Vr1710Data'. Cannot insert duplicate key in object 'dbo.Vr1710Data'. The duplicate key value is (2014 - 12 - 28 14:26:07.0000000).
The function that inserts the data is generic and type agnostic.
public static bool Import<TDataType, TLoaderType, TConfigType>(string fileName, string connectionString)
where TLoaderType : TelemetryLoader<TDataType, TConfigType>, new()
where TConfigType : TelemetryConfig
where TDataType : TelemetryData
{
// ...
}
The online examples for BulkInsert()
and AllowDuplicateKeys
require the use of ColumnPrimaryKeyExpression
From: https://entityframework-extensions.net/bulk-insert#insert-only-if-the-entity-not-already-exists
context.BulkInsert(customers, options => {
options.InsertIfNotExists = true;
options.ColumnPrimaryKeyExpression = c => c.Code;
});
How can I ignore duplicates without needing a specialization for ColumnPrimaryKeyExpression
for every data type, but instead use my keys as defined in the DbContext?
How can I ignore duplicates without needing a specialization for ColumnPrimaryKeyExpression for every data type, but instead use my keys as defined in the DbContext?
By default, the library already takes the key from the DbContext
.
In the example, c.Code
was representing a custom key (The CustomerID
is the real key).
If I understand correctly the problem, you need to use the following code:
context.BulkInsert(customers, options => {
options.InsertIfNotExists = true;
options.AllowDuplicateKeys = true;
});
Both options are required.
User contributions licensed under CC BY-SA 3.0