Difference between revisions of "Deadlock investigation thru the errorlog"

From SybaseWiki
Jump to: navigation, search
m
 
Line 3: Line 3:
 
A technique is described here to investigate the deadlock information and with a tool this task is automated and statistics can be gathered.  
 
A technique is described here to investigate the deadlock information and with a tool this task is automated and statistics can be gathered.  
  
Software download link: http://www.petersap.nl/documents/deadlock.zip (8 Kb)
+
Software download link: http://www.petersap.nl/documents/deadlock.tar.gz (8 Kb)
  
 
==Sample deadlock message==
 
==Sample deadlock message==

Latest revision as of 21:57, 1 August 2007

When the configuration option "print deadlock information" has been switched on, an informational message is put into the Sybase ASE errorlog when a deadlock is detected. The information can be used to see if the tables involved in a deadlock can benefit from a change of their locking schema.

A technique is described here to investigate the deadlock information and with a tool this task is automated and statistics can be gathered.

Software download link: http://www.petersap.nl/documents/deadlock.tar.gz (8 Kb)

Sample deadlock message

This is an (abbreviated) sample of the information shown in the errorlog when a deadlock has been detected:

Deadlock Id 1 detected
1 deadlock chain(s) involved.

Process (Familyid 0, Spid 21, Suid 4) was executing a INSERT command in the procedure 'objid 0'.
SQL Text: insert into books(BookId,Title) select 208513,"Sybase DBA guide"
Process (Familyid 0, Spid 25, Suid 4) was executing a UPDATE command in the procedure 'sp_priceBook'.
SQL Text: exec sp_priceBook "My Pet Goat",6.99

Process (Familyid 0, Spid 25) was waiting for a 'exclusive page' lock on page 660720 of the 'books' table in database 4 but process (Familyid 0, Spid 21) already held a 'exclusive page' lock on it.
Process (Familyid 0, Spid 21) was waiting for a 'shared page' lock on page 652391 of the 'bookPrice' table in database 4 but process (Familyid 0, Spid 25) already held a 'exclusive page' lock on it.

Process (Familyid 0, Spid 21) was chosen as the victim. End of deadlock information.

The highlighted text shows in which database, object and page number the deadlock was encountered.

Sybase ASE locking schemas

Sybase ASE (version 11.9.2 and later) uses three different locking schemas: AllPages, DataPages and DataRows. Sybase stores rows for a table or index within pages. Depending on the locking schema, information is locked at the page or at the row level. The following table shows how each locking schema locks information.

Locking Schema Locks on Index Locks on Data
All Pages Page Page
DataPages Not locked Page
DataRows Not locked Row

The table shows that, for instance, the DataPages locking schema does not lock indexes but data is locked at the page level. When a single page is locked, all rows on that page are locked with it. (NB: with the DataPages and DataRows locking schema indexes are locked through latches, a non-transactional locking method)

Moving from AllPages to DataPages to DataRows

Deadlock information can be used to tune the locking schema for a table. When a table uses the AllPages schema and deadlocks occur mostly on pages that contain rows for indexes, a move to DataPages might reduce the change of hitting a deadlock. Also, when deadlocks are frequently occurring on the data level, a move to the DataRows locking schema is an option.

When a table is already using the DataRows locking schema other techniques must be used to eliminate deadlocks. Be aware of the fact that deadlocks cannot always be avoided and the application logic should deal with it by re-submitting the transaction.

Large and busy systems might want to avoid moving all tables to the DataRows locking schema because of the overhead involved. It is not uncommon to see configurations with "number of locks" configured to a value of several million and still hitting that limit. This situation can easily occur through a combination of a high transaction isolation level and the DataRows locking schema. Try to avoid these situations by tuning the locking schema, the transaction isolation level, lock promotion thresholds or other settings.

Servers short of physical ram should be aware of the fact that locks consume memory (approx. 140 Mb per 1 million locks).

Index or Data?

The deadlock information in the errorlog doesn't show if the deadlock was related to an index or to data. However, with the undocumented dbcc page command, this can be determined easily. For instance, dbcc page(4,652391) will tell what type of page is involved, through the indid.

Below is an example of the dbcc page(4,652391) output:

Page not found in Cache: default data cache.
Page read from disk.

BUFFER:
Buffer header for buffer 0x0x446d3800 (Mass head)
  page=0x0x446d3000 bdnew=0x(nil) bdold=0x(nil) bhash=0x(nil)
  bmass_next=0x(nil) bmass_prev=0x(nil) bvirtpg=134242919 bdbid=4
  bmass_head=0x0x446d3800 bmass_tail=0x0x446d3800 bcache_desc=0x0x446d6310
  bpool_desc=0x(nil) bdbtable=0x(nil)
  Mass bkeep=0 Mass bpawaited=0 Mass btripsleft=0 Mass btripsleft_orig=0
  bmass_size=2048 (2K pool) bunref_cnt=0
  bmass_stat=0x0800 (0x00000800 (MASS_NOTHASHED))
  bbuf_stat=0x0 (0x00000000)
  Buffer blpageno=652391 bpg_size=2k Mass blpageno=652391 (Buffer slot #: 0)
  bxls_pin=0x(nil) bxls_next=0x(nil) bspid=0
  bxls_flushseq=0 bxls_pinseq=0 bcurrxdes=0x(nil)
Latch and the wait queue:
Latch (address: 0x0x446d3820)
  latchmode: 0x0 (FREE_LATCH) latchowner: 0 latchnoofowners: 0
  latchwaitq: 0x(nil) latchwaitqt: 0x(nil)

Latch wait queue:

PAGE HEADER:
Page header for page 0x0x446d3000
pageno=652391 nextpg=655160 prevpg=652390 objid=233048835 timestamp=0000 01f2b4c2
nextrno=32 level=0 indid=0 freeoff=1961 minlen=39
page status bits: 0x101 (0x0100 (PG_ADDEND), 0x0001 (PG_DATA))

The output shows that indid has a value of 0, and so this page is used to store data. Remember that 0 is used for data, values > 0 and < 255 are used for indexes and 255 itself is used for off-row-objects including text and image data.

The deadlock analyzer tool

The output of a single deadlock is by far not enough to make a decision on. The tool provided here reads through the errorlog, makes the required dbcc page commands and scans the output. At the end several statistics are shown. The tool is a Unix/Linux shell script, executing statements with isql and using bcp.

Sample output

Below is a sample output from the analyzer. The first report shows deadlocks occurring at the table level, the second report shows the total number of deadlock per table at page and or row level and the last one shows the number of deadlocks per table and indid.

Object                                                       Deadlocks on table 
-----------------------------------------------------------  ------------------- 
pubs..authors                                                                  8 
pubs..publishers                                                             252 
(2 rows affected)

Object                                           Page/Row    Deadlocks per table 
-----------------------------------------------  ----------  --------------------- 
pubs..roysched                                   P                              10
pubs..authors                                    P                              91
pubs..sales                                      P                             212 
pubs..salesdetail                                R                             244 
(4 rows affected)

Object                                                indid    Deadlocks per index 
----------------------------------------------  -----------  --------------------- 
pubs..authors                                             0                     91 
pubs..roysched                                            0                      5 
pubs..roysched                                            2                      5 
pubs..sales                                               0                    212 
pubs..salesdetail                                         0                    244 
(5 rows affected)