SQLite bulk insert with a fresh transaction per batch


I need to populate a SQLite table with precalculated values. There is only one table, currently with ~20 rows.

After fiddling around for a day or two I managed to get approach 1 working with bulk inserts. It works great for "smaller" amounts of entries (5_000_000 records) and is quite fast. Unfortunately, approach 1 hangs somewhere around 139_000_000 records, while the iterator should deliver ~200_000_000 records in total. The console application neither stops nor throws an exception I could share here.

And I am wondering where the data goes, because memory usage isn't increasing significantly, nor is the database file being written to (that happens on the commit at the end).

My approach 1:

public static void SqliteBulkInsert()
{
    CustomIterator citer = new CustomIterator();

    using (var connection = new SqliteConnection("Filename=Iterations.db"))
    {
        connection.Open();

        using (var transaction = connection.BeginTransaction())
        {
            var command = connection.CreateCommand();
            command.Transaction = transaction;

            command.CommandText = @"INSERT INTO iterations (V1, V2) VALUES ($V1, $V2)";

            var p1 = command.CreateParameter();
            p1.ParameterName = "$V1";
            command.Parameters.Add(p1);

            var p2 = command.CreateParameter();
            p2.ParameterName = "$V2";
            command.Parameters.Add(p2);

            long i = 0;

            while (citer.Next(out byte v1, out byte v2))
            {
                p1.Value = v1;
                p2.Value = v2;

                command.ExecuteNonQuery();

                if (++i % 10_000 == 0) Console.WriteLine(i);
            }

            transaction.Commit();
        }

        Console.WriteLine("done");
        connection.Close();
    }
}

In a second approach I tried to create a new transaction for every batch of 1_000_000 records. But no matter what I tried, I received various exceptions, such as "the transaction of the command does not belong to the current connection" and so on.

When I simply add a commit inside the batch, as in

if (i % 10_000 == 0) 
{ 
  transaction.Commit(); 
  Console.WriteLine(i);
}

I receive:

System.InvalidOperationException
  HResult=0x80131509
  Message=The transaction object is not associated with the same connection object as this command.
  Source=Microsoft.Data.Sqlite
  StackTrace:
   at Microsoft.Data.Sqlite.SqliteCommand.ExecuteReader(CommandBehavior behavior)
   at Microsoft.Data.Sqlite.SqliteCommand.ExecuteNonQuery()
   at DevTryOuts.Program.PopulateSqliteCombinationsBulkInsert2() in A:\DevTryOuts\Program.cs:line 568
   at DevTryOuts.Program.Main(String[] args) in A:\DevTryOuts\Program.cs:line 115

Any hint on how I could correctly implement a new transaction for each batch?
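(A sketch of the likely fix, untested against Microsoft.Data.Sqlite: `Commit()` ends the transaction, but the command still references the now-finished transaction object, which is exactly what the exception complains about. After each commit, a new transaction must be begun and reattached to the command. Names follow the question's code.)

```csharp
using (var connection = new SqliteConnection("Filename=Iterations.db"))
{
    connection.Open();

    var transaction = connection.BeginTransaction();

    var command = connection.CreateCommand();
    command.Transaction = transaction;
    command.CommandText = "INSERT INTO iterations (V1, V2) VALUES ($V1, $V2)";

    var p1 = command.CreateParameter();
    p1.ParameterName = "$V1";
    command.Parameters.Add(p1);

    var p2 = command.CreateParameter();
    p2.ParameterName = "$V2";
    command.Parameters.Add(p2);

    var citer = new CustomIterator();
    long i = 0;

    while (citer.Next(out byte v1, out byte v2))
    {
        p1.Value = v1;
        p2.Value = v2;
        command.ExecuteNonQuery();

        if (++i % 1_000_000 == 0)
        {
            transaction.Commit();
            transaction.Dispose();

            // Begin a fresh transaction and re-associate it with the
            // command; otherwise the next ExecuteNonQuery throws the
            // "not associated with the same connection" error.
            transaction = connection.BeginTransaction();
            command.Transaction = transaction;

            Console.WriteLine(i);
        }
    }

    transaction.Commit();
    transaction.Dispose();
}
```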


There were some incorrect assumptions:

  1. The iterator does not generate 200 million records; it only generates just under 140 million.
  2. It therefore does reach the transaction commit (which is why the last log entry shows an index of ~139 million).
  3. It "hangs" (including a full PC freeze) while writing the records to disk on commit.
  4. It froze because of insufficient disk space (~50 GB database) and worked when picking another disk with more free space.
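(The disk-space finding in point 4 is consistent with how SQLite handles one huge transaction: once the page cache fills, changes spill to the database file plus a rollback journal, so a bulk load can transiently need considerably more space than the final ~50 GB file. For a one-off load where the database can simply be rebuilt on failure, the journal can be traded away; the PRAGMAs below are standard SQLite, but treat this as an optional sketch, not part of the original question, and note that it sacrifices crash safety.)

```csharp
// Sketch: cut journal/sync overhead for a one-off bulk load.
// WARNING: journal_mode=OFF removes rollback and crash protection;
// only use it when the database can be regenerated from scratch.
// Assumes an open SqliteConnection named "connection".
using (var pragma = connection.CreateCommand())
{
    pragma.CommandText = "PRAGMA journal_mode=OFF; PRAGMA synchronous=OFF;";
    pragma.ExecuteNonQuery();
}
```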
Tags: c#, sqlite, .net-core
asked on Stack Overflow Dec 3, 2020 by monty • edited Dec 7, 2020 by monty

0 Answers



User contributions licensed under CC BY-SA 3.0