Monday, October 11, 2010

Identifying High CPU in GC Caused by Json.NET Newtonsoft.Json.JsonReader.ParseString with WinDbg

I just got a high CPU case. Here is how I narrowed down the issue.

Problem: IIS shows high CPU, even though only a small number of requests are being forwarded to this server.

Answer: something is wrong with Json.NET; update the DLL to the latest version. http://json.codeplex.com/releases/view/50552

Recipe to locate the problem:

Create a memory dump using DebugDiag. http://www.microsoft.com/downloads/en/details.aspx?FamilyID=28bd5941-c458-46f1-b24d-f60151d875a3&displaylang=en
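
If you prefer the command line, the adplus script that ships with the Debugging Tools for Windows can also grab a hang-mode dump of the busy worker process (the process id and output folder below are just placeholders):

adplus -hang -p 1234 -o c:\dumps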

Also copy the 3 DLLs (typically the matching mscorwks.dll, mscordacwks.dll and sos.dll) from the production server to the debug machine, then follow

windbg:PDB symbol for mscorwks.dll not loaded when you debug dump of .net app or silverlight apps and How to configure WinDbg to run other versions of the .NET

to make sure WinDbg picks up the right symbols and extensions.
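
On the debug machine the setup usually looks something like this; the c:\dumps paths are only examples for wherever you copied the files from the server:

.symfix c:\symbols
.sympath+ c:\dumps
.cordll -ve -u -lp c:\dumps
.load c:\dumps\sos.dll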

What are those CPU threads doing? Time to pick out the smoking-gun threads.

 

.time

0:055> .time
Debug session time: Thu Oct  7 09:23:37.000 2010 (GMT-7)
System Uptime: 0 days 7:13:54.707
Process Uptime: 0 days 7:10:08.000
  Kernel time: 0 days 23:44:52.000
  User time: 0 days 14:44:43.000

How much CPU time is each thread spending? Note that kernel plus user time already adds up to roughly 38.5 hours of CPU against a process uptime of only about 7.2 hours, so on average more than five CPUs were kept busy. !runaway breaks that time down per thread.

!runaway

0:055> !runaway
User Mode Time
  Thread       Time
  32:930       0 days 1:44:11.109
  30:914       0 days 1:25:19.343
  34:940       0 days 1:16:02.093
  39:9e4       0 days 1:14:45.171
  31:938       0 days 1:14:08.984
  25:17ec      0 days 1:13:34.687
  33:6f4       0 days 1:12:02.671
  29:17fc      0 days 1:10:18.765
  40:9e8       0 days 1:09:28.203
  27:17f4      0 days 1:08:49.453
  37:7d0       0 days 1:07:50.390
  35:3e8       0 days 1:07:15.718
  36:998       0 days 1:07:10.687
  38:9d8       0 days 1:07:08.968
  28:17f8      0 days 0:59:04.093
  26:17f0      0 days 0:58:39.687
  65:1f74      0 days 0:14:17.843
  44:ce8       0 days 0:12:41.843
  52:d2c       0 days 0:12:38.515
  53:d30       0 days 0:12:34.359
  55:d38       0 days 0:11:53.093
  54:d34       0 days 0:11:44.203
  45:cec       0 days 0:11:42.875
  51:d28       0 days 0:11:36.093
  62:1b74      0 days 0:11:20.921
  46:cf0       0 days 0:11:13.515


The top 15 or so threads are each burning an hour or more of CPU cycles. What are they doing now?

Switch to the first one, thread 32, and check its managed stack:
~32s
!clrstack

0:032> !clrstack
OS Thread Id: 0x930 (32)
Unable to walk the managed stack. The current thread is likely not a
managed thread. You can run !threads to get a list of managed threads in
the process

Thread 32 is not a managed thread, but that does not rule the CLR out as the culprit.

Then run kb to look at the native stack:

kb

0:032> kb
ChildEBP RetAddr  Args to Child             
02f3f688 7c827d29 77e61d1e 00000358 00000000 ntdll!KiFastSystemCallRet
02f3f68c 77e61d1e 00000358 00000000 00000000 ntdll!ZwWaitForSingleObject+0xc
02f3f6fc 79e8c5f9 00000358 ffffffff 00000000 kernel32!WaitForSingleObjectEx+0xac
02f3f740 79e8c52f 00000358 ffffffff 00000000 mscorwks!PEImage::LoadImage+0x1af
02f3f790 79e8c54e ffffffff 00000000 00000000 mscorwks!CLREvent::WaitEx+0x117
02f3f7a4 79fd4417 ffffffff 00000000 00000000 mscorwks!CLREvent::Wait+0x17
02f3f7e8 79fcfd70 000f49b8 00000006 00000000 mscorwks!SVR::t_join::join+0xae
02f3f828 79fcf35d 00000002 00000000 000f49b8 mscorwks!SVR::gc_heap::mark_phase+0x176
02f3f854 79fcf77d 00000000 ffffffff 000f49b8 mscorwks!SVR::gc_heap::gc1+0x46
02f3f874 79fcf283 00000000 00000000 000f49b8 mscorwks!SVR::gc_heap::garbage_collect+0x246
02f3f898 79f23b6a 00000000 00000000 00000000 mscorwks!SVR::gc_heap::gc_thread_function+0x6a
02f3ffb8 77e6482f 000f49b8 00000000 00000000 mscorwks!SVR::gc_heap::gc_thread_stub+0x92
02f3ffec 00000000 79f23b1c 000f49b8 00000000 kernel32!BaseThreadStart+0x34


Thread 32 is one of the server GC threads, busy in the mark phase of a garbage collection, which is CPU intensive.
Checking the other top CPU-intensive threads shows they are doing the same thing: working on GC. Thread 38, for example:

0:038> kb
ChildEBP RetAddr  Args to Child             
030bf088 7c827d29 77e61d1e 00000358 00000000 ntdll!KiFastSystemCallRet
030bf08c 77e61d1e 00000358 00000000 00000000 ntdll!ZwWaitForSingleObject+0xc
030bf0fc 79e8c5f9 00000358 ffffffff 00000000 kernel32!WaitForSingleObjectEx+0xac
030bf140 79e8c52f 00000358 ffffffff 00000000 mscorwks!PEImage::LoadImage+0x1af
030bf190 79e8c54e ffffffff 00000000 00000000 mscorwks!CLREvent::WaitEx+0x117
030bf1a4 79fd4417 ffffffff 00000000 00000000 mscorwks!CLREvent::Wait+0x17
030bf1e8 79fcfd70 000fc048 00000006 00000000 mscorwks!SVR::t_join::join+0xae
030bf228 79fcf35d 00000002 00000000 000fc048 mscorwks!SVR::gc_heap::mark_phase+0x176
030bf254 79fcf77d 00000000 ffffffff 000fc048 mscorwks!SVR::gc_heap::gc1+0x46
030bf274 79fcf283 00000000 00000000 000fc048 mscorwks!SVR::gc_heap::garbage_collect+0x246
030bf298 79f23b6a 00000000 030bf2c0 7c82cce3 mscorwks!SVR::gc_heap::gc_thread_function+0x6a
030bffb8 77e6482f 000fc048 00000000 00000000 mscorwks!SVR::gc_heap::gc_thread_stub+0x92
030bffec 00000000 79f23b1c 000fc048 00000000 kernel32!BaseThreadStart+0x34
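
A quick way to confirm this across the board, rather than switching to each suspect one at a time, is to dump the native stack of every thread in one shot and scan for the gc_heap frames:

~* kb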


Searching the GC symbols shows just how many of these server GC routines are in play:

x mscorwks!SVR::gc_heap::*
[screenshot: x mscorwks!SVR::gc_heap::* output listing the server GC symbols]

So all those busy threads are doing GC work. Which thread triggered the demand for GC? That is the next question.
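
One shortcut, by the way, is to dump the managed stack of every thread in one pass and look for the threads that are busy allocating (a general tip, not output from this particular dump):

~*e !clrstack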

Time to check out the next tier of threads, the ones that consume the rest of the CPU time:

  65:1f74      0 days 0:14:17.843
  44:ce8       0 days 0:12:41.843
  52:d2c       0 days 0:12:38.515
  53:d30       0 days 0:12:34.359
  55:d38       0 days 0:11:53.093
  54:d34       0 days 0:11:44.203
  45:cec       0 days 0:11:42.875
  51:d28       0 days 0:11:36.093
  62:1b74      0 days 0:11:20.921
  46:cf0       0 days 0:11:13.515

 

Switch to thread 65 (or 44) and look at the managed stack:
~65s
!clrstack
[screenshot: !clrstack output for thread 65, showing the shopping cart page calling into Newtonsoft.Json.JsonReader.ParseString]

The stack shows that the shopping cart page uses Json.NET to parse some value; in our case, the value is stored in a cookie.

Is this thread asking for more memory? It is. Let me show you why:

kb
[screenshot: kb output for thread 65, showing a large array allocation calling into the garbage collector]

It tries to create a big array, finds there is no space, and asks the runtime to issue a GC. (Any buffer of 85,000 bytes or more also lands directly on the large object heap, which only gets cleaned up by a full generation 2 collection, exactly the expensive work the GC threads above were doing.)

For us, the value to parse was very big: something is wrong with the cookie system. Updating to the new version of Json.NET also gives better parsing performance.

In any case, for JSON deserialization, first make sure the payload size is reasonable before you parse it.
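
A rough sketch of what that guard could look like; the cookie name, the 4 KB limit, and the ShoppingCart type below are made-up placeholders for illustration, not our actual production code:

using System.Web;
using Newtonsoft.Json;

public class ShoppingCart { /* fields omitted */ }

public static class CartCookieReader
{
    // Illustrative limit; tune it to the largest payload you actually expect.
    private const int MaxCartJsonLength = 4096;

    public static ShoppingCart Read(HttpRequest request)
    {
        HttpCookie cookie = request.Cookies["cart"];
        if (cookie == null || string.IsNullOrEmpty(cookie.Value))
            return null;

        string json = HttpUtility.UrlDecode(cookie.Value);

        // Reject oversized payloads up front instead of letting the parser
        // allocate huge buffers and drive the GC.
        if (json == null || json.Length > MaxCartJsonLength)
            return null;

        return JsonConvert.DeserializeObject<ShoppingCart>(json);
    }
}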
