Jonathan Kehayias

Understanding Event Loss with Extended Events

SentryOne Newsletters

The SQLPerformance.com bi-weekly newsletter keeps you up to speed on the most recent blog posts and forum discussions in the SQL Server community.

eNews is a bi-monthly newsletter with fun information about SentryOne, tips to help improve your productivity, and much more.

Subscribe

Featured Author

Erin Stellato is a Principal Consultant with SQLskills and a Microsoft Data Platform MVP.

Erin’s Posts

My colleague, Erin Stellato, recently asked me a question about where and why event loss could happen with Extended Events. The question was the result of a comment someone had made on one of her blog posts that was claiming that showplan_xml events can not be collected by the XE Profiler or through a “live” stream of the events from the server. I happen to know that this is not correct because I have routinely demonstrated the negative performance impacts of using the post_query_execution_showplan event against a production workload by adding the event in the UI and having it watch the live data, so this started a more in depth discussion about how and when Extended Events will discard an event that has been generated during data collection.

Event Size Matters

Extended Events configures internal memory buffer space for an event session when it is initially started on the server, and the configuration of the event session options determines how large the memory buffers are, and the maximum size event that the event session can collect. While most events generated by Extended Events are relatively lightweight and small in binary format, specific events can generate a much larger payload of data that has to be buffered. The default event session options results in a session configuration with three internal memory buffers for holding events that are 1,441,587 bytes in size. The size and number of memory buffers for an event session can be found in the sys.dm_xe_sessions DMV while the session STATE=START on the server:

SELECT
    s.name, 
    s.total_regular_buffers,
    s.regular_buffer_size,
    s.total_large_buffers,
    s.large_buffer_size,
    s.total_buffer_size
FROM sys.dm_xe_sessions AS s;

Notice that there are zero large buffers for each of the system defined event sessions, and the large buffer size is also set to zero, which is the default configuration. The large buffers for an event session are only created when the MAX_EVENT_SIZE session option is configured for the event session. The default value for this option is 0, which means that the largest event the event session can actually consume is the size of a regular memory buffer, which is 1,441,587 bytes. For certain events, like the ones that produce the showplan_xml, it is actually relatively easy to have an event size that is larger than the default memory buffer size for the event session. In these cases, the large event would actually be discarded by the event session since it cannot be placed into a memory buffer for dispatching.

Controlling Event Loss

There are three specific session options that determine how large of an event an event session can actually collect, and one that controls how events are dropped when the buffer memory for the event session is full or under pressure. All four of these matter when we are talking about collecting events that could generate a large event payload and we want to minimize the chance that we could potentially drop an event. An example event session that would be prone to event loss due to memory pressure in the buffer space for the event session is below:

CREATE EVENT SESSION [Locks] ON SERVER 
ADD EVENT sqlserver.lock_acquired,
ADD EVENT sqlserver.lock_released
ADD TARGET package0.event_file(SET filename=N'Locks',max_file_size=(5),max_rollover_files=(4))
WITH (MAX_MEMORY=4096 KB,
MEMORY_PARTITION_MODE=NONE,
EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,
MAX_EVENT_SIZE=0 KB);

Note: This is not an event session I would ever recommend running on a production workload – the volume of data it would generate would be significant, since it is tracking every lock acquire and release.

If we start this session and then run the AdventureWorks Books Online Workload generator that is available on my blog against an instance of SQL Server, the session will rapidly begin dropping events due to the fast event generation and the delay in buffer flushing to the event_file target that is configured. The number of events that have been dropped by an event session can be tracked in the sys.dm_xe_sessions DMV if the event session options have been configured with EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS. If the event session is configured with EVENT_RETENTION_MODE=ALLOW_MULTIPLE_EVENT_LOSS then entire memory buffers of events can be dropped and it only counts how many buffers were dropped and not the number of individual events each buffer contained.

SELECT
   s.name, 
   s.total_regular_buffers,
   s.regular_buffer_size,
   s.total_large_buffers,
   s.large_buffer_size,
   s.dropped_event_count,
   s.dropped_buffer_count,
   s.largest_event_dropped_size
FROM sys.dm_xe_sessions AS s;

Here, we can see that 100,521 events were dropped and the largest size of an event that was dropped was 176 bytes, which is smaller than the size of our regular buffer space, so we are just hitting normal buffer memory space pressure. However, if we create an event session that collects the two of the showplan events (see this article for why this will negatively impact performance severely and should not be done on production servers), along with the batch starting and completed events and generate some larger plans, we can trigger event loss due to event size.

CREATE EVENT SESSION [DropsEvents] ON SERVER 
ADD EVENT sqlserver.query_post_execution_showplan,
ADD EVENT sqlserver.query_pre_execution_showplan,
ADD EVENT sqlserver.sql_batch_completed,
ADD EVENT sqlserver.sql_batch_starting;

Here, we can see that the largest_event_dropped_size is greater than our regular_buffer_size, so this means that we need to change the configuration of our session buffers. If we increase the MAX_MEMORY for the event session, it can increase the size of our regular buffers. The default value is only 4MB, which is where the 1.4MB buffer size shown above comes from. If we change this to be 64MB for the event session, the regular_buffer_size will be 22.4MB in size which would accommodate our 3.7MB dropped event. The other option is to set the MAX_EVENT_SIZE option which provides the large_buffer_size for large events and is divided by two for the session.

CREATE EVENT SESSION [CollectsEvents] ON SERVER 
ADD EVENT sqlserver.query_post_execution_showplan,
ADD EVENT sqlserver.query_pre_execution_showplan,
ADD EVENT sqlserver.sql_batch_completed,
ADD EVENT sqlserver.sql_batch_starting
WITH (MAX_MEMORY=65536 KB,MAX_EVENT_SIZE=65536 KB,MEMORY_PARTITION_MODE=NONE);

So here we can see the two large buffers with a 33.6MB size and after running the same plan generating workload again we have no dropped events for our new CollectsEvents session, but we have doubled the dropped events for our DropsEvents session using the defaults.

So, there you have it; why certain events might not be collected by an event session, how to go about troubleshooting when events are being dropped, and how to determine if it is the size of the event that is causing the issue. Many of the sessions that I see in actual use on client systems have the defaults for event session options, especially where it comes to memory. This is one area that, once you understand the buffering mechanism used by Extended Events, and then consider the size of the events that could potentially be generated, you will begin to make changes in how the session options are defined to minimize the potential for events being dropped due to memory space limits or event size constraints.