I've developed a queue-trigger-based Function App in .NET Standard 2.0. When the app is down for maintenance or a new version is being deployed, more than 20,000 messages can pile up on the queue. The app reads the content of each XML file with an XmlReader and creates one record from it. These records are inserted directly into an Azure SQL Data Warehouse. But when the app is restarted we get quite a few dependency errors, all due to a SQL login error:
System.Data.SqlClient.SqlException (0x80131904): Connection Timeout Expired.
The timeout period elapsed during the post-login phase.
The connection could have timed out while waiting for server to complete the login process and respond; Or it could have timed out while attempting to create multiple active connections.
This failure occurred while attempting to connect to the routing destination.
The duration spent while attempting to connect to the original server was - [Pre-Login]
Looking at the statistics of the data warehouse, I can see that there were over 800 active connections at that moment. I understand that this is probably too many, but how can I solve it? I don't think there is a way to limit the number of simultaneous function instances, ...
If anyone has an idea (even on a Saturday night), please feel free to share.
PS: In normal operation the function works fine; it is only on restart that it fires too much, too quickly ...
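For context, here is a simplified sketch of the pattern; the queue name, XML element names, table name and connection-string setting are placeholders, not the real code:

using System;
using System.Data.SqlClient;
using System.IO;
using System.Xml;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class ProcessXmlMessage
{
    // Sketch only: queue, element, table and setting names are placeholders.
    [FunctionName("ProcessXmlMessage")]
    public static void Run(
        [QueueTrigger("incoming-xml")] string xmlContent,
        ILogger log)
    {
        string id = null, value = null;

        // Pull the needed fields out of the XML payload with an XmlReader.
        using (var reader = XmlReader.Create(new StringReader(xmlContent)))
        {
            if (reader.ReadToFollowing("Id"))
                id = reader.ReadElementContentAsString();
            if (reader.ReadToFollowing("Value"))
                value = reader.ReadElementContentAsString();
        }

        // One connection and one INSERT per message. With a 20,000 message
        // backlog and aggressive scale-out, this is where the hundreds of
        // concurrent logins against the data warehouse come from.
        var connectionString = Environment.GetEnvironmentVariable("SqlDwConnection");
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(
            "INSERT INTO dbo.Records (Id, Value) VALUES (@id, @value)", conn))
        {
            cmd.Parameters.AddWithValue("@id", id);
            cmd.Parameters.AddWithValue("@value", value);
            conn.Open();
            cmd.ExecuteNonQuery();
        }

        log.LogInformation($"Inserted record {id}");
    }
}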
I'd like to understand more about the application, because this is an anti-pattern for loading ASDW.
A more typical approach to this requirement would be to shred the XML into micro-batch files and then ingest those files using PolyBase. Depending on your landing-zone structure, even a restart would be a very simple task.
Which DWU are you running at? The concurrency effect of this model could be substantial: not only poor performance, but also a negative effect on other workloads running at the same time.
Edited after responses:
If I had to handle a workload like this I would use Event Hubs or Kafka into Databricks, shred the XML there, then write to ASDW. Here's a great example that updates a micro-batch to the DW every 30 seconds:
https://azure.microsoft.com/en-au/blog/near-real-time-analytics-in-azure-sql-data-warehouse/
This approach ingests data into ASDW using PolyBase, which will be substantially faster than SQL inserts, and it provides better concurrency.
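To give a feel for the PolyBase side, here is a rough sketch of loading one micro-batch folder from a blob landing zone. All object names (ext.StagingRecords, dbo.Records, BlobLandingZone, PipeDelimitedText) and the connection-string setting are placeholders, and it assumes the external data source and file format were created once, up front:

using System;
using System.Data.SqlClient;

public static class MicroBatchLoader
{
    // Sketch only: object names and the connection-string setting are placeholders.
    // Assumes an external data source (BlobLandingZone) pointing at the landing
    // container and an external file format (PipeDelimitedText) already exist.
    public static void LoadBatch(string batchFolder)
    {
        var createExternalTable = $@"
            IF OBJECT_ID('ext.StagingRecords') IS NOT NULL
                DROP EXTERNAL TABLE ext.StagingRecords;
            CREATE EXTERNAL TABLE ext.StagingRecords
            (
                Id    INT,
                Value NVARCHAR(200)
            )
            WITH
            (
                LOCATION    = '{batchFolder}',
                DATA_SOURCE = BlobLandingZone,
                FILE_FORMAT = PipeDelimitedText
            );";

        // PolyBase reads the micro-batch files in parallel; one statement loads the lot.
        var loadBatch = @"
            INSERT INTO dbo.Records (Id, Value)
            SELECT Id, Value FROM ext.StagingRecords;";

        var connectionString = Environment.GetEnvironmentVariable("SqlDwConnection");
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();
            foreach (var sql in new[] { createExternalTable, loadBatch })
            {
                using (var cmd = new SqlCommand(sql, conn))
                {
                    cmd.CommandTimeout = 0; // loads can run longer than the 30-second default
                    cmd.ExecuteNonQuery();
                }
            }
        }
    }
}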
If you're on the Consumption plan, it's possible that this is happening because your function app is being massively scaled out due to the large queue backlog. In that case, the WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT app setting documented here will help you limit how many VMs your app runs on (though it is not a 100% guaranteed limit, due to how the system behaves when it runs into capacity constraints).
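For example, capping the app at five instances is just an application setting on the Function App (the value 5 is only an example; pick whatever your warehouse can tolerate):

WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT = 5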
This issue tracks improving the overall experience in this area, but there is no ETA: https://github.com/Azure/azure-functions-host/issues/1207