Why is there a timeout in SqlClient / SqlConnection query (ExecuteScalar or ExecuteReader) when using .NET Core, but not .NET regular or SSMS?

I am trying to execute a stored procedure from .NET Core using the code below:

using (DBContext context = new DBContext())
using (var command = context.Database.GetDbConnection().CreateCommand())
{
    command.CommandText = "Sp_Name";
    command.CommandType = CommandType.StoredProcedure;
    command.Parameters.Add(new SqlParameter("@input", SqlDbType.VarChar, 3) { Value = InputValue });
    command.Parameters.Add(new SqlParameter("@Return_Value", SqlDbType.VarChar, 3) { Value = string.Empty });

    context.Database.OpenConnection();

    // Dispose the reader once the single result row has been read.
    using (var dataReader = command.ExecuteReader())
    {
        if (dataReader.Read())
        {
            // The result column has no name, so it is looked up by an empty name.
            var code = dataReader.GetString(dataReader.GetOrdinal(""));
        }
    }
}

The query works fine for some input parameter values but throws an exception for others. For example:

-- This scenario works fine in both the EF code and SQL

    SP - exec Sp_Name @input = 'PDX',  @Return_Value = ''

    --Result (No Column Name) - '3I9' 

-- This scenario does not work in the EF code, but works fine in SQL

    SP - exec Sp_Name @input = 'N01',  @Return_Value = ''

    --Result (No Column Name)  - 'WE5'

Exception message:

System.Data.SqlClient.SqlException (0x80131904): Timeout expired.  The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.ComponentModel.Win32Exception (0x80004005): The wait operation timed out
   at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
   at System.Data.SqlClient.SqlDataReader.TryConsumeMetaData()
   at System.Data.SqlClient.SqlDataReader.get_MetaData()
   at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString)
   at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task& task, Boolean asyncWrite, SqlDataReader ds)
   at System.Data.SqlClient.SqlCommand.ExecuteScalar()
   at Mednax.ReferringPhysician.Data.PdxService.getGPMSCode(String practiceCode) in C:\Work\GIT\ReferringPhysician2\Mednax.ReferringPhysician.WebAPI\Mednax.ReferringPhysician.Data\PdxService.cs:line 971
ClientConnectionId:199f2b1a-cb1b-4752-8632-9f2c54bcefd8
Error Number:-2,State:0,Class:11

The stored procedure looks like this:

SP -

(
@Input varchar(3),
@Return_Value varchar(3) output
)
AS

SET NOCOUNT ON
SET @Return_Value = NULL

SELECT  TOP 1   @Return_Value = pacl.P_Code
    FROM    TABLEA pacl with (nolock)
    LEFT OUTER JOIN TABLEB rpp with (nolock)
        ON  rpp.Code = pacl.Code
        AND rpp.P_Code = @Input
    WHERE   rpp.P_Code IS NULL
    ORDER BY pacl.P_Code

IF @@Rowcount = 0 SET   @Return_Value = '***'

Select @Return_Value

[Screenshot: details inside the TargetSite of the exception]

c#
entity-framework-core
asked on Stack Overflow May 9, 2018 by Pradeep H • edited May 10, 2018 by Camilo Terevinto

1 Answer

What you're seeing here appears to be a bad query-plan cache entry caused by parameter sniffing: a catastrophic query plan that makes the query time out. The issue with parameter sniffing is that, when there is no existing query plan for an operation (one that matches the current execution mode), SQL Server generates a plan based on the first parameter values it sees. If you have heavily biased data, the plan generated can be fine for some values, but catastrophic for others.

For example, consider the scenario where there are 3 rows with one value and 3 million rows with another value. If you generate a query plan based on the "3 rows" value, it might make decisions optimized for that magnitude - it'll work fine for 3, 30 and probably 300 - but for 3 million it could crumble. Likewise in reverse. Here at Stack Overflow, we call this the "Jon Skeet problem": Jon (the #1 user on the users page) has a very different data distribution to a brand new 1-rep user, and query plans for Jon are terrible for that 1-rep user, and vice versa.

Fortunately, SQL Server has a query hint for this situation: OPTIMIZE FOR / UNKNOWN. The simplest usage of this is to add OPTION ( OPTIMIZE FOR UNKNOWN ) to the affected query; this instructs it to not bias the query plan hugely based on the parameter values seen when generating the query. You can also specify individual parameters if only some of them are problematic (@userId for us, for example).
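As a rough sketch of what that could look like here, the hint can be appended to the SELECT from the procedure in the question. The table, column and parameter names are taken from the question as posted; whether this actually cures the timeout is an assumption that would need testing against the real data.

```sql
-- Sketch only: the SELECT from the question's procedure with the hint appended.
SELECT TOP 1 @Return_Value = pacl.P_Code
FROM TABLEA pacl WITH (NOLOCK)
LEFT OUTER JOIN TABLEB rpp WITH (NOLOCK)
    ON  rpp.Code = pacl.Code
    AND rpp.P_Code = @Input
WHERE rpp.P_Code IS NULL
ORDER BY pacl.P_Code
OPTION (OPTIMIZE FOR UNKNOWN);

-- Per-parameter form, if only one parameter is problematic:
-- OPTION (OPTIMIZE FOR (@Input UNKNOWN));
```

With the hint in place, the optimizer uses average statistics for the parameter instead of the specific sniffed value, so the plan should be less sensitive to which caller happens to compile it first.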

So; why might this work in SSMS (query analyzer) and .NET, but not .NET Core? I assume that the problem here is different SET options. The various SET options define the execution mode; some of these options can impact query generation, so a separate plan may be needed for two clients with different SET options. This means that .NET Core may be effectively hitting a different query-plan cache to .NET, so: when one is working, the other is failing. But: this doesn't mean that one is "worse"; rather, it simply means that one of them happened to generate a query plan on data that caused a catastrophic plan. The same problem could have impacted either, at a random time when the plan cache became invalidated for some reason (typically just: gradual data drift) - just as the most awkward user (etc) was using the site. Parameter sniffing issues do not usually show up immediately - they strike in the middle of the night 4 days after anyone has deployed anything.
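One way to check the SET-options theory is to look at the plan cache directly. The query below is a sketch, not a definitive diagnostic: it assumes the procedure is called Sp_Name as in the question, that it is run in the procedure's own database, and that the login has permission to read the server-state DMVs.

```sql
-- Sketch: list cached plans for the procedure, with the SET options each was compiled under.
SELECT cp.usecounts,
       cp.objtype,
       pa.value AS set_options,
       st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
CROSS APPLY sys.dm_exec_plan_attributes(cp.plan_handle) AS pa
WHERE st.dbid = DB_ID()
  AND st.objectid = OBJECT_ID(N'Sp_Name')
  AND pa.attribute = N'set_options';
```

If this returns more than one row with different set_options values, the clients are probably compiling and reusing separate plans, which would fit one of them being stuck on a bad plan while the other is fine.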

answered on Stack Overflow May 9, 2018 by Marc Gravell

User contributions licensed under CC BY-SA 3.0