Rajendra Gupta
Verify ghost records increased count

The ghost cleanup task for SQL Server Databases

October 6, 2021 by

This article explains the internals of the Ghost cleanup task for SQL Server Databases.

Introduction

Each relational database consists of the background process and user processes.Similarly, SQL Server has many internal processes such as Checkpoint, Lazy Writer, Lock Monitor, System_Health_Monitor, Trace Queue Task and Ghost CleanUp. Broadly, we can divide these system processes into the following categories.

  • Data Writer Processes
    • Lazy writer
    • CHECKPOINT
    • EAGER WRITER
  • Logging Transaction Processes
    • LOG WRITER
    • BACKUP LOG
  • System processes
    • RESOURCE MONITOR
    • LOCK MONITOR
    • GHOST CLEANUP
    • TRACE QUEUE TASK
  • Job Scheduling processes
    • SQL Server Agent

I have seen DBAs have an idea about Checkpoint, Lazy Writer process, but they ignore the importance of the Ghost Cleanup background process. To understand some of these background processes, refer to the article – SQL Server CHECKPOINT, Lazy Writer, Eager Writer and Dirty Pages in SQL Server

Have you seen the GHOST CLEANUP task while checking the processes using sp_who2 or dynamic management view sys.dm_exec_requests? You can identify a background process using the status column output as Background.

This article walks through with an insight on the ghost cleanup task in the SQL Server Database.

Pre-requisites

It would help if you had a basic understanding of SQL Server indexes – Clustered and nonclustered index to understand this article’s ghost cleanup task.

  • Note: You can always refer to Indexes for understanding index requirements, architecture, usage

What is a Ghost Cleanup Task for SQL Server Database?

Ghost cleanup is a background process that periodically runs and cleans up ghost records. The next question is – What is a ghost record?

When users delete records from the clustered index data page or nonclustered index leaf page, SQL Server does not physically remove them. Instead, it marks those records as Ghost Records. Therefore, the deleted row stays on the page; however, the row header indicates that the row is a ghost row.

It is a performance optimization technique so that the delete operation performs logical deletion and finishes quickly. Similarly, during a rollback, SQL Server has to unmark the record as a ghost record. Thus, it eliminates the requirement of reinserting those deleted records for a rollback.

Once the delete transaction commits and the ghost cleanup task occurs, it checks the ghost records on a page and removes them physically. SQL Server also updates the PFS (Page Free Space) page header to mark it as a ghost record page. It also marks the database as having ghosted entries.

The ghost cleanup task performs the following work:

  • The ghost cleanup task occurs every 5 seconds (SQL Server 2012+)
  • It starts checking databases marked as having ghost entries and scans the PFS pages of the ghost database
  • It removes the ghost record rows physically from the database
  • Once it removes all ghost records from a database, it marks it with no ghost entries. Therefore, SQL Server skips the database during the next run of the ghost cleanup task

    Important Notes:

  • A single run of the GHOST CLEANUP task cleans up a maximum of 10 pages in each execution. It is to ensure that it does not impact the system’s performance. If there are multiple pages ( more than 10), it processes them during the subsequent execution of the ghost cleanup task
  • If the cleanup process could not take a shared lock on the database having ghost records, it skips it and checks again during the next run

Usually, we do not see the GHOST CLEANUP task if you query sp_who2 or DMV because it is a quick process. However, for an extensive database with multiple frequent deletes, you can observe this in the session’s output.

In this article, to capture the ghost cleanup task, we will create a table [GhostSession] and insert records into table using a WHILE loop until the GHOST CLEANUP background processis captured.

As shown below, SQL Server triggered the ghost cleanup task for database id 5.

Ghost Cleanup task in SQL Server Database

To find out ghost records in a database, we can use the DMV sys.dm_db_index_physical_stats as per the below script. This script groups records on the database id and orders them in descending order of total ghost records.

  • Note: The DMV returns size,fragmentation information and ghost records information of the specified table or DB. You can refer Microsoft docs for understanding the use cases of the DMV sys.dm_db_index_physical_stats

As shown below, the [AdventureWorks2017] database has two ghost records while running the DMV. We do not have any ghost records for remaining databases.

Check ghost record count SQL Server Databases

Practical demonstration of ghost records for a SQL Server database

Until you have got understanding of the ghost cleanup task for SQL Server. This section will look at internal SQL Server pages to identify the ghost records and cleanup process. For this purpose, let’s create a SQL Server database and insert sample records in it.

Before we proceed further, we enable the following two trace flags:

  • Trace flag 3604 to view DBCC results in SSMS
  • Trace flag 661 to disable the GHOST CLEANUP task
  • Note: Do not enable trace flags in the production environment unless you have a specific requirement

  • Note: Do not disable GHOST CLEANUP in a production instance. If required, have a controlled environment testing before doing it for production
  • Further, we remove records from the demo table and view page data using the DBCC PAGE. You can also use DMV sys.dm_db_page_info in SQL Server 2019.

    The sample output shows entries on the PFS page that has a message:

    View page details

    You can query index DMV to check the number of ghost records. As shown below, it has 2000 ghost records as of now.

    Verify ghost records increased count

    We can also use the fn_dblog() function to check the content of the transaction log.

    You see the operation as LOP_DELETE_ROWS and Context as LCX_MARK_AS_GHOST for the deleted rows in the sample table in the output.

    Check context as LCK_MARK_AS_GHOST

    Disable the ghost cleanup for the SQL Server database

    Disabling the ghost cleanup process can increase the buffer pool size, raise IO requests. It might increase database size because SQL Server will not reuse the space that is cleared by deleting the rows. SQL Server also has to add new pages for accommodating new inserts that increase page splits and ultimately performance issues

    As we saw earlier, you can use trace flag 661 for disabling the ghost cleanup. However, it is not recommended to do so until you have a specific requirement and it is tested well enough for a production release.

    However, if you disable the ghost cleanup task, you can manually delete the ghost records using the following stored procedures.

    Use of stored procedure sp_clean_db_free_space in SQL Server Database

    This procedure cleans the ghost records on the pages of a database. It requires two arguments –database name (mandatory) and cleaning delay time in seconds (optional). The cleanup delay is an interval that you can specify to delay between cleaning of the pages. The default value is 0.

    The syntax of the sp_clean_db_free_space procedure:

    Stored procedure sp_clean_db_file_free_space in SQL Server Database

    This procedure can clean up ghost records in a specific data file for a database. As shown in the syntax, it requires an additional data file id (mandatory) argument.

    The syntax of the sp_clean_db_file_free_space procedure:

    • Note: The execution time of these stored procedures depends on the size of the database, data file, available free space, disk capacity. You should run them on off business hours to avoid any performance impact

    Conclusion

    This article explored the ghost cleanup task in SQL Server Database for removing the ghost records. You should get an understanding of this helpful background process. If you disable it for a specific reason, use the stored procedure to clean them manually. The article’s purpose is to give an understanding of the ghost cleanup task. Therefore, I again reinstate that never disable ghost cleanup task in SQL Server unless you have a particular reason to do so.

    Rajendra Gupta
    Latest posts by Rajendra Gupta (see all)
168 Views