Esat Erkec
Understand SQL update statement execution plan

SQL Update statement Performance Tips

December 7, 2021 by

The SQL update statement is used to modify an existing record or records in a table and it is commonly widely used in databases applications. In this article, we will examine the update statement in terms of the performance perspective.

Best practices to improve SQL update statement performance

As with the select queries, some best practice suggestions can be made to improve the performance of SQL update statements:

  • To optimize update operations, we should keep the transaction size as short as possible
  • We need to consider the lock escalation mode of the modified table to minimize the usage of too many resources
  • Analyzing the execution plan may help to resolve performance bottlenecks of the update query
  • We can remove the redundant indexes on the table
  • As possible as we need to use the WHERE clause in our update statements
  • If it is possible we have to disable the delete triggers when we perform an update statement
  • We should choose low peak usage times for major updates

Let’s tackle this “Analyzing the execution plan may help to resolve performance bottlenecks of the update query“ best practice with an example.

Understanding a very simple SQL update statement execution plan

One of the most effective ways to understand SQL Server query performance details is to interpret the query plan. The query we will tackle below updates all rows in a heap table with a value that returned from a scalar-valued function. We will try to analyze and overcome its performance issue.

  • Tip: A heap table is a table type that has not owned a clustered index so the data is not stored in a logical order
  • Firstly, with the help of the following query, we will create a Person table and will insert some 100k sample rows into this table.

    After creating the Person table, we will update the ModifiedDate column to use the Uf_LastModification scalar function.

    Understand SQL update statement execution plan

    The Table Scan operator is chosen by the optimizer to read all data of the table. Actually, the optimizer has no other alternative to read the Person table data because the Person table is a heap, and it does not include any clustered index. Let’s start to read the execution plan right to left.

    • Table Scan: The table scan operator begins to read the table at the beginning of the table and goes through every row in the table. Includes the proper records into the result set based on matching criteria
      • Tip: Sometimes the table scan and index scan operations can be confused but they have some differences. In the index scan operation, the data engine uses the existing index to read the data pages. Using the indexes for the data read operation reduces the number of I/O. Another point about the index scan operation is that usually, an index scan is less expensive than a table scan because it looks at sorted data
    • Compute Scalar: Compute scalar operator performs the scalar-valued function. When we look at the Defined Values attribute of this operator, we can see all details about it

      SQL compute scalar operator

    • Table Spool (Eager Spool): This operator stores the data in tempdb until the data flow from the child (previous ) operator is completed and does not transfer it to the next operator. When the data flow is completed, the data is transferred to the next operator. Because of this working mechanism, the Eager Spool is a blocking operator. This operator is used when data is required to reuse during the execution of the query or prevent the query from the Halloween problem
    • Table Update: This operator updates the input rows of the specified table

    Finding the SQL update statement performance problem

    When we look at this query execution plan, actually the problem is very simple because the scalar-valued function is invoked for each of the 100k rows every time so the query is slow down. We can use an extended event to capture to see the how many time the user-defined function is invoked. We can use the module_end event because it captures when a stored procedure or function is called and successfully completed. To create an extended event, we will right-click on the Sessions menu and select the New Session.

    We will give a name to our event session and then we click the :

    • Start the event session immediately after session creation
    • Watch live data on screen as its captured


    How to give a name to the event session

    We click the Events tab and add the module_end event to Selected_events

    After selecting the module_end event, we click the Configure button and set the filters. In this window, we select the session_id and object name as a filter field. The session_id filter helps to capture the events only for this session.

    Filter a SQL Server extended event

    In the Data Storage tab, we will select a target storage type that the event data will be stored. We can choose the event file type. So that, the event data will be stored as SQL Server Extended Event File (XEL) file type in the specified folder.

    As the last step, we will click the Advanced tab and select the Event retention mode. For this event session, we will select the No Event loss because we want to capture all invocation of the scalar-valued function. However, this option may cause performance problems for this reason It is better not to use it in the production environment.

    After clicking the OK button, the Watch Live Data window will open and start the event data broadcasting when we execute the SQL update statement.

    Extended event watch live data

    In this window, we open the Extended Event menu and then click Choose Columns menu item to group the event data according to the object name. So that we can easily see how many times the scalar-valued function is invoked.

    Grouping extended event data

    As the last step, we click the Aggregation menu item and select the SUM aggregate type for the duration column. So we can analyze the total time it takes to invoke the scalar-valued function for this SQL update statement.

    Aggregate extended event data

    • Tip:The duration unit of the module_end event is a microsecond

    SQL update statement and scalar-valued function affect

    As seen above, SQL Server consumes 22 minutes to invoke this function.

    Convert microsecond to minute

    Another problem with this query is that the query optimizer cannot use a parallel plan due to scalar-valued function usage. When we look at the estimated cost of the query is higher than the server Cost threshold for Parallesim. However, the optimizer can not generate a parallel plan.

    Parallelisim and update statement

    At the same time, we can find the NonParallelPlanReason attribute indicates CouldNotGenerateValidParallelPlan. On the other hand, using trace flag 8649 or ENABLE_PARALLEL_PLAN_PREFERENCE hint does not change this situation.

    • Tip: The trace flag 8649 and ENABLE_PARALLEL_PLAN_PREFERENCE query hint forces the optimizer to generate a parallel plan if it is possible

    QUERYTRACEON 8649

    Solving the SQL update statement performance problem

    As we analyzed, this SQL update statement performance problem is related to using scalar-valued function usage. How we can get rid of this function:

    • Use variable: We can assign the function result into a variable and then we can update the table to use this variable

      Use variables in update statement

      This query only takes 1 second to complete

    • Change Database Complibity Level to 150: Scalar UDF Inlining feature announced with SQL Server 2019. Through this feature, the scalar function is converted into an expression or subquery and then adopted into the query automatically

      SQL Server  2019 Scalar UDF Inlining

    As seen above, the execution plan that is created in SQL Server 2019 enables parallelism and is different than the previous one. The section in the red box has been implemented to the query plan due to the scalar UDF inlining feature. So that, we get rid of to call scalar function 100k times.

    Conclusion

    In this article, we have learned performance details about the SQL update statement and we also analyze query performance with all details.

    Esat Erkec
168 Views