shoyu of lysu

Introduction to GDB

0. Purpose

I am new to gdb and C because I mainly use Java, but there are situations where a Java programmer needs gdb to solve a problem, so I am writing these notes down.

1. Debug Target

gdb can debug programs written in many languages, such as C, C++, and Objective-C. Unlike the debugging tools common in Java, it is mostly used through its command-line interface rather than a GUI.

Before we start debugging, we should compile the code with debug information.

For example, with gcc we add the -g flag:

$ gcc example.c -g -o example

This is also why, if we want to debug the JVM itself, we must first build it with debug info.

2. Run under the debugger

Start gdb with

$ gdb ./example

This drops us into the gdb command line, where the Tab key completes a command or lists all commands.

Some basic commands:

	command	description
	dir [directories]	set the source code directories
	r	run the program
	b [where]	set a breakpoint; where is an expression such as a function name (‘xx_method’) or a line number
	b [where] if [condition]	break only when the condition is true
	l	list source code
	step	step into a function
	c	continue
	finish	run until the current function returns
	p [var name]	print the value of a variable
	x/[format] [address]	examine the memory at an address
	dump memory [file] [start addr] [end addr]	dump the memory between two addresses into a file
	info threads	list threads
	info break	list all breakpoints
	thread [tid]	switch to a thread (by the id shown in info threads)
	bt	show the stack trace
	[enter]	repeat the last command

and so on…
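The commands in the table can also be replayed without typing them. A minimal sketch, assuming a binary ./example built with -g; the function name write_gdb_script, the temporary file, and the breakpoint on main are only illustrative choices:

```shell
# Write a small gdb command file that replays a few commands from the table.
# The function name and the breakpoint location (main) are illustrative assumptions.
write_gdb_script() {
    cat > "$1" <<'EOF'
break main
run
bt
info threads
continue
EOF
}

script=$(mktemp)
write_gdb_script "$script"

# Run it only when gdb and the example binary are actually present.
if command -v gdb >/dev/null 2>&1 && [ -x ./example ]; then
    gdb -batch -x "$script" ./example
fi
rm -f "$script"
```

The -batch option makes gdb exit after the command file has been processed, which is handy for collecting a stack trace from a script.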

3. Attach to a running process

Use

$ gdb ./example [pid]

to attach to the process with that pid.

4. Run with a core dump file

Use

$ gdb ./example [coredump file path]

The previous post has more detail on debugging core dump files.

Also see

GDB调试程序 (Debugging Programs with GDB)

Mar 14th, 2015

debug, os,


Troubleshooting Tools

0. The purpose

There are many tools for tracking down system problems. It is like a detective’s work: first we must find ‘evidence’ as quickly as possible. Besides intuition, using the right tool for the problem is vital. Let’s record them here.

1. Core dump

when

The first ‘tool’ is not really a tool: it is a snapshot file that records the memory, processor, and register state at the moment the OS terminated our program. Like a Java dump file, it is very useful for diagnosing and debugging the problem.

When do we get a core dump file?

Linux uses signals as an asynchronous event handling mechanism. Each signal has a default action, such as Ignore (ignore the signal), Stop (suspend the process), Terminate (terminate the process), or Core (terminate the process and dump core).

So when a signal whose action is Core is delivered and not handled, Linux generates a core dump.

Signal	Action	Description
SIGQUIT	CORE	Quit from keyboard, e.g. `Ctrl+\`
SIGILL	CORE	Illegal instruction, e.g. `kill -ILL $$`
SIGABRT	CORE	Abort signal from abort()
SIGSEGV	CORE	Invalid memory reference, e.g. writing through a null pointer or overflowing the stack
SIGTRAP	CORE	Trace/breakpoint trap

For the full signal list, see the signal manual page. The signal reported with a core dump helps us recognize the type of problem.
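When a process dies from one of these signals, the shell reports an exit status of 128 plus the signal number, which gives a quick hint even before the core file is opened. A small sketch (the function name signal_from_status is hypothetical):

```shell
# Map a shell exit status back to the name of the signal that killed the process.
# Shells report "killed by signal N" as exit status 128 + N.
signal_from_status() {
    code=$1
    if [ "$code" -gt 128 ]; then
        kill -l $((code - 128))
    else
        echo "none"
    fi
}

# Demo: a child shell sends SIGSEGV to itself (core dumps are switched off for the demo).
demo_rc=0
( ulimit -c 0; sh -c 'kill -SEGV $$' ) 2>/dev/null || demo_rc=$?
signal_from_status "$demo_rc"
```

The same helper works for the status of any crashed command, for example the value of $? right after ./seg exits.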

how

I. enable configuration

But we often hear ‘I could not find any core dump when it crashed’.

A few switches control whether the system generates a core dump.

Run ulimit -c. If the result is 0, core dumps are disabled (often the default) and no core file will be generated. Use ulimit -c unlimited to enable core dumps without limiting the core file size.

That command only affects the current terminal session. To make the setting permanent, edit /etc/security/limits.conf:

#/etc/security/limits.conf
#Each line describes a limit for a user in the form:
#<domain>      <type>  <item>         <value>
      *         soft     core        unlimited
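Before reproducing a crash it helps to confirm the limit the shell actually has. A minimal sketch (core_dump_status is a hypothetical helper name):

```shell
# Report whether core dumps are enabled for this shell, based on the soft limit.
core_dump_status() {
    limit=$(ulimit -Sc)
    if [ "$limit" = "0" ]; then
        echo "disabled"
    else
        echo "enabled (limit: $limit)"
    fi
}

core_dump_status
# The soft limit can only be raised up to the hard limit (ulimit -Hc).
ulimit -Sc unlimited 2>/dev/null || echo "cannot raise the soft limit; hard limit is $(ulimit -Hc)"
core_dump_status
```

If the second call still prints "disabled", the hard limit is the obstacle and the limits.conf entry above (or root) is needed.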

II. dump file path

By default the core file is written to the current working directory of the process, with the file name core.
Run sysctl -a | grep core: kernel.core_pattern shows the dump file path, and setting kernel.core_uses_pid to 1 appends the process id to the core file name.
To change them, use sysctl -w, or edit /etc/sysctl.conf and reload it with sysctl -p, as root.
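One common reason for a missing core file is that kernel.core_pattern does not point where we expect. According to the core(5) manual page, a pattern that starts with | pipes the dump to a program instead of writing a file. A sketch that classifies the pattern (describe_core_pattern is a hypothetical name):

```shell
# Classify a kernel.core_pattern value: piped to a helper program, or written to a file.
describe_core_pattern() {
    case "$1" in
        "|"*) echo "piped to: ${1#"|"}" ;;
        /*)   echo "absolute path template: $1" ;;
        *)    echo "relative to the process working directory: $1" ;;
    esac
}

# Inspect the running system, if the sysctl is readable.
if [ -r /proc/sys/kernel/core_pattern ]; then
    describe_core_pattern "$(cat /proc/sys/kernel/core_pattern)"
fi
```

When the result is "piped to", the crash handler named there decides where the dump ends up, so looking in the working directory will not help.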

III. debug core file

Use the command gdb [program] [coredump] to inspect the core file:

vagrant@vagrant-ubuntu-trusty:~/test$ gdb seg core
GNU gdb (Ubuntu 7.8-1ubuntu4) 7.8.0.20141001-cvs
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from seg...(no debugging symbols found)...done.
[New LWP 12887]
Core was generated by `./seg'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000000000400506 in main ()
(gdb) where
#0  0x0000000000400506 in main ()
(gdb) info frame
Stack level 0, frame at 0x7fff961538c0:
 rip = 0x400506 in main; saved rip = 0x7f17f9d0eec5
 Arglist at 0x7fff961538b0, args:
 Locals at 0x7fff961538b0, Previous frame's sp is 0x7fff961538c0
 Saved registers:
  rbp at 0x7fff961538b0, rip at 0x7fff961538b8
(gdb) Quit

IV. generate dump with gdb

We can also use gdb to generate a core dump of a running process.

First, attach to the Java process:

gdb -q --pid=xxx

then

(gdb) generate-core-file

Finally, do not forget to detach from the process:

(gdb) detach

V. for java process

If the core dump comes from a Java process, we can convert it to a Java heap dump with jmap:

jmap -dump:format=b,file=heap.hprof $JAVA_HOME/bin/java core.xxx

Then use a tool like MAT to analyze the memory.

Or we can use jstack to see the Java stacks:

jstack -m $JAVA_HOME/bin/java core.xxx

(Note: jstack may hit a bug here.)


2. dmesg/messages

when

When a process has crashed, we need to check whether it was killed by the OOM killer.

how

We can use dmesg and /var/log/messages* (on Debian and Ubuntu, /var/log/syslog*) to see kernel messages. dmesg reads the kernel ring buffer; the log files keep the full history.

Besides boot information, the OOM killer's messages are recorded there. We can use

sudo dmesg | grep java | grep -i oom-killer

to confirm whether the program was killed by the OOM killer.
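Note that the "invoked oom-killer" line names the process whose allocation triggered the OOM killer, which is not necessarily the process that was killed; the victim is named in the "Killed process" line. A small filter, demonstrated on made-up sample lines in the usual kernel format (oom_victims is a hypothetical name, and the PIDs and sizes are invented):

```shell
# Keep only the kernel log lines that name the process chosen by the OOM killer.
oom_victims() {
    grep -iE 'out of memory|killed process'
}

# Sample input; on a real system use: sudo dmesg | oom_victims
oom_victims <<'EOF'
java invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Out of memory: Kill process 4242 (java) score 901 or sacrifice child
Killed process 4242 (java) total-vm:4096000kB, anon-rss:3000000kB, file-rss:0kB
EOF
```

Piping the result through grep java then answers the original question: was our Java process the victim?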


3. strace

when

“strace - trace system calls and signals”

strace shows the system calls a program makes and the signals it receives, including the parameters and return value of each call.

how

I. follow calls and signals

Run the program under strace:

strace ./[program]

The output shows each system call with its parameters and return value, as well as the signals:

mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f90c0139000
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f90c0137000
arch_prctl(ARCH_SET_FS, 0x7f90c0137740) = 0
mprotect(0x7f90bff19000, 16384, PROT_READ) = 0
mprotect(0x600000, 4096, PROT_READ)     = 0
mprotect(0x7f90c0146000, 4096, PROT_READ) = 0
munmap(0x7f90c013a000, 39504)           = 0
fstat(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f90c0143000
read(0, 0x7f90c0143000, 1024)           = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=13689, si_uid=1000} ---
+++ killed by SIGTERM +++

II. count system calls

strace -c ./[program]

This prints a summary of call counts and times per system call:

vagrant@vagrant-ubuntu-trusty:~/test$ strace -c ./test_strace
^CProcess 13742 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 32.09    0.000060           8         8           mmap
 23.53    0.000044          15         3           fstat
 14.44    0.000027           7         4           mprotect
  9.63    0.000018          18         1           munmap
  8.56    0.000016           8         2           open
  6.95    0.000013           4         3         3 access
  1.60    0.000003           3         1           read
  1.07    0.000002           1         2           close
  1.07    0.000002           2         1           execve
  0.53    0.000001           1         1           brk
  0.53    0.000001           1         1           arch_prctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.000187                    27         3 total

III. other options

option	effect
-o	write the output to a file
-T	show the time spent in each system call
-p	trace a running process, using `-p [pid]`
-e	a qualifying expression that selects which events to trace or how to trace them, e.g. `-e trace=signal` to trace only signal-related system calls, or `-e trace=write` to trace only write calls
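With -T, each line of the trace ends with the time spent in the call in angle brackets, so slow calls can be filtered out afterwards. A sketch using awk (slow_syscalls is a hypothetical helper, and the two sample lines are invented in strace's output format):

```shell
# Print lines of `strace -T` output whose call took at least MIN seconds.
# Usage: slow_syscalls MIN < trace.log
slow_syscalls() {
    awk -v min="$1" '
        match($0, /<[0-9]+\.[0-9]+>$/) {
            t = substr($0, RSTART + 1, RLENGTH - 2)   # strip the angle brackets
            if (t + 0 >= min + 0) print
        }'
}

slow_syscalls 0.1 <<'EOF'
close(3)                                = 0 <0.000010>
read(0, "", 1024)                       = 0 <1.500000>
EOF
```

On a real process the input would come from a trace file written with something like strace -T -o trace.log -p [pid].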


4. ulimit

when

In Linux, every process has resource limits, such as the number of child processes, the number of open files, the core dump file size, and so on.

When a process hits one of these limits, the request fails, which can crash the process.

how

Limits are set per process, but usage is not always counted per process. Most limits, such as the number of open files, are checked against the usage of the process itself. The process-count limit, however, is checked against the total number of processes owned by the same user, so AProc and BProc running as one user share that budget.

This is one reason why software like Apache runs under its own dedicated user.

To see the limits of a process, use

cat /proc/[pid]/limits
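The file is plain text, so a single limit is easy to extract. A sketch that pulls the soft and hard open-file limits (open_files_limit is a hypothetical name; it assumes the Linux /proc layout):

```shell
# Print the soft and hard "Max open files" limits of a process, read from /proc.
open_files_limit() {
    awk '/^Max open files/ { print $4, $5 }' "/proc/$1/limits"
}

# Show the limits of the current shell, if /proc is available.
if [ -r "/proc/$$/limits" ]; then
    open_files_limit $$
fi
```

For the current shell, the first number should match what ulimit -Sn reports.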

Where do the limit values come from?

init process: set by the kernel
system services: set with setrlimit
login shell: set at login from /etc/security/limits.conf, or by an earlier ulimit -Sx command
processes started from a shell: inherited from the shell process

To modify process limits (without root, only the soft limit can be raised, up to the hard limit):

With ulimit we can change the limits of the shell process and of the processes subsequently forked from that shell.

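With echo -n "Max processes=xx:yy" >/proc/<pid>/limits

we can change a limit of a running process (this needs root and a kernel that allows writing to this file; where it does not, the prlimit command from util-linux can be used instead).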
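That a limit set with ulimit is inherited by processes started from the shell can be checked directly. A sketch (child_nofile_limit is a hypothetical name and the value 128 is arbitrary; lowering the soft limit needs no privileges):

```shell
# Lower the soft open-files limit in a subshell and read it back from a child process.
child_nofile_limit() {
    ( ulimit -Sn "$1" && sh -c 'ulimit -Sn' )
}

child_nofile_limit 128
```

Because the change happens inside a subshell, the limit of the calling shell stays untouched.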

Also see:

CGroup Task Counter

Mar 7th, 2015


How to Deal With Non-heap or Native Memory Leak

When the problem lies outside the heap

This week I met two very strange memory problems. Although they occurred in different systems, they had the same symptom: the Java process used much more memory than the JVM options allowed (e.g. the JVM options allowed 1.5 GB, but top showed Java consuming 2.9 GB).

direct buffer memory

The first system was a data-transfer service that uses Netty heavily. In the heap dump we saw many DirectByteBuffer objects, which is one sign of a direct buffer memory problem. We then used java.nio.Bits to confirm that direct buffer memory grew quickly when many requests arrived.

In the end, we found that the Netty sender produced requests faster than the receiver could handle. So we rechecked our Netty configuration, set the WATER_MARK options, and checked channel.isWritable before writing data to confirm that Netty was able to handle the request; if not, we retry later. After that, direct buffer memory was under control.

native memory

The other system was a library scanner. It scans many .jar files every day.

At first we dumped the heap many times, but could not find any trace of the problem in the heap dumps.

So we changed tools and used pmap -x [pid] and cat /proc/[pid]/smaps to look at the memory usage of the Java process.

First, we found many large anon blocks of about 65,500 KB each, like this:

00007f8b00000000   65512   40148   40148 rwx--    [ anon ]
00007f8b03ffa000      24       0       0 -----    [ anon ]
00007f8b04000000   65520   59816   59816 rwx--    [ anon ]
00007f8b07ffc000      16       0       0 -----    [ anon ]

Then we used gdb to dump the content of these blocks:

gdb -pid [pid]

(gdb) dump memory mem.bin 0x00007f8b00000000 0x00007f8b00000000+65512*1024
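pmap -x prints sizes in KB, while gdb works with byte addresses, so the size has to be multiplied by 1024 to get the end address of a block. A small sketch that builds the command from the first two columns of a pmap line (dump_cmd_for is a hypothetical helper):

```shell
# Turn the address and size columns of a `pmap -x` line into a gdb
# "dump memory" command that covers the whole mapping.
# Arguments: start address (hex, no 0x prefix), size in KB, output file.
dump_cmd_for() {
    start=$((0x$1))
    end=$((start + $2 * 1024))
    printf 'dump memory %s 0x%x 0x%x\n' "$3" "$start" "$end"
}

dump_cmd_for 00007f8b00000000 65512 mem.bin
```

For the first block in the listing this yields an end address of 0x7f8b03ffa000, which is exactly where the next mapping in the pmap output begins.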

Then make it readable:

strings mem.bin

Surprisingly, the block content was that of MANIFEST.MF files. At this point we started to suspect the jars:

pmap -x [pid] | grep '\.jar' | awk '{print $6}' | sort | uniq -c | sort -rn

After gathering some statistics, we found that many scanned jar files were mapped into memory more than once and never released. That was the problem.

Reading the JDK code (jdk/src/share/native/java/util/zip/zip_util.c and ZipFile.c) shows that ZipFile/JarFile uses mmap and only frees the mapping when close() is called.

So after we rechecked the code and closed the unclosed ZipFile/JarFile instances, the problem was solved.

UPDATE: besides unclosed jar files, before JDK 1.7 URLClassLoader closes its JarFile lazily, which also leads to an mmap leak. We can use 1.7+, or work around it like this http://snipplr.com/view/24224/class-loader-which-close-opened-jar-files/ or try System.gc().

UPDATE: another workable approach is to stop JarFile/ZipFile from using mmap. We can use the JVM option -Dsun.zip.disableMemoryMapping to disable it. http://www.oracle.com/us/technologies/java/overview-156328.html

Summary

With a heap dump we can find Java heap problems, but Java and Java libraries can also leak non-heap memory.

When using NIO libraries, we must watch direct memory usage, use them carefully, and preferably monitor java.nio.Bits (although that approach is a hack).

Besides NIO problems, misused JNI APIs can also leak non-heap memory. We should use pmap or /proc/[pid]/smaps to find which jar/lib/stack/file might be using a lot of memory, then use gdb to look at the content of those blocks to help the analysis.

Feb 2nd, 2015

jvm


Tuning CMS GC

Introduction to the CMS Collector

CMS (Concurrent Mark-Sweep) is a GC algorithm widely used by response-time-critical applications.

For the young generation (minor GC), CMS also stops all application threads, but ParNew scans with multiple threads, and scanning the young generation is quick because it is small.

For the old generation, instead of stopping the application threads for a full GC, it uses background threads to scan the old generation periodically. Application threads pause only when really necessary, so the overall pause time is much less than with a collector like the throughput collector.

The trade-off is that CMS uses more CPU to drive the scanning threads. In addition, the background scan does not compact the old generation, so it becomes fragmented. When CPU is short, or the old generation is too fragmented to allocate a new object, CMS triggers a full GC to resolve the problem.

CMS is enabled with -XX:+UseConcMarkSweepGC -XX:+UseParNewGC.

General advice

1. Never specify a heap that is larger than physical memory

It makes the system swap to disk, which is very slow.

2. Set the initial and maximum heap size to the same value

This avoids heap resize operations, although the heap may then be bigger than really needed.

3. Size the generations properly

The heap is divided into several regions, as we know. The larger the young generation, the less often young GCs occur (although each one is slower). The smaller the old generation, the more often full GCs take place.

4. Keep the permanent generation from filling up

Resizing the perm gen requires a full GC, which is expensive.

5. Control the number of GC threads

The flag -XX:ParallelGCThreads=X sets the number of threads used for young generation collection and for the stop-the-world phases of old generation collection. More threads shorten the pause time but consume more CPU.
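Put together, the advice above might translate into a command line like the following, on a JVM that still has a permanent generation. The helper name build_java_opts, the sizes, the thread count, and the jar name are placeholders for illustration, not recommendations:

```shell
# Build a JVM option string that follows the general advice above:
# equal initial/max heap, a fixed-size perm gen, CMS with ParNew, explicit GC threads.
# All concrete values passed in below are placeholder assumptions.
build_java_opts() {
    heap=$1; perm=$2; gc_threads=$3
    echo "-Xms$heap -Xmx$heap -XX:PermSize=$perm -XX:MaxPermSize=$perm" \
         "-XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:ParallelGCThreads=$gc_threads"
}

JAVA_OPTS=$(build_java_opts 2g 256m 4)
echo "java $JAVA_OPTS -jar app.jar"
```

Keeping the options in one function makes it easy to compare runs that differ in a single value.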

CMS-specific advice

1. minor gc

The bigger the young generation, the less often GC occurs, but the longer each GC cycle takes.

Survivor size and tenuring

TLABs: todo.

2. concurrent cycle detail

A concurrent cycle is triggered when the old generation is sufficiently full.

It starts with the initial mark phase, which stops the application threads to find the GC root objects in the heap.

Next is the mark phase, which does not stop the application and runs concurrently with it.

Then the preclean phase also runs concurrently, followed by the abortable preclean phase, which waits until the young generation is about 50% full to reduce the chance of a young GC occurring right before the remark.

Then the remark phase stops the application threads again.

Finally, the sweep phase runs concurrently.

A concurrent cycle does not collect the young generation directly, but at least one young GC will occur during it because of the abortable preclean phase.

3. CMS failure

a. concurrent mode failure

When a young GC occurs and there is not enough room in the old generation for the promoted objects, CMS triggers a full GC.

b. promotion failure due to fragmentation

When a young GC occurs and the old generation is too fragmented to hold the promoted objects, then in the middle of the young GC (ParNew) CMS collects and compacts the entire old generation. This takes more time than a concurrent mode failure, because compaction is expensive.

c. full gc without concurrent failure

Normally CMS avoids full GCs, except on concurrent mode failure or excessive fragmentation, but a full perm gen also triggers a full GC, so keep an eye on PermGen.

d. a race

When the old generation is filled to CMSInitiatingOccupancyFraction percent (default 70%), a concurrent cycle starts and a race begins: CMS must finish scanning and freeing objects before the remainder (100% - CMSInitiatingOccupancyFraction) fills up. If it loses, we get a concurrent mode failure.
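The race can be put into numbers. A back-of-the-envelope sketch, where the old generation size and the promotion rate are assumed inputs (cms_headroom is a hypothetical helper, and a constant promotion rate is a simplification):

```shell
# Headroom left in the old generation when a concurrent cycle starts, and how long
# it lasts at a given promotion rate.
# Arguments: old generation size (MB), CMSInitiatingOccupancyFraction, promotion rate (MB/s).
cms_headroom() {
    old_mb=$1; fraction=$2; promo_mb_per_s=$3
    headroom=$(( old_mb * (100 - fraction) / 100 ))
    echo "$headroom MB headroom, about $(( headroom / promo_mb_per_s )) s before the old generation is full"
}

cms_headroom 1000 70 10
```

With a 1000 MB old generation, a fraction of 70, and 10 MB/s promoted, the cycle has roughly 30 seconds to finish; lowering the fraction or adding concurrent threads buys more margin.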

4. Solving concurrent mode failures

a. run the concurrent cycle more often

Setting -XX:+UseCMSInitiatingOccupancyOnly and -XX:CMSInitiatingOccupancyFraction=N controls when the cycle starts.

A smaller CMSInitiatingOccupancyFraction value starts the cycle sooner. But too small a value makes concurrent cycles consume a lot of CPU time and makes the start time unpredictable when CPU is scarce, and running so many concurrent cycles may increase the overall pause time.

b. run the concurrent cycle faster

Setting -XX:ConcGCThreads=N controls the number of threads that run the cycle. More threads make the cycle faster but use more CPU.

If concurrent mode failures are rare, a smaller value saves CPU.

5. CMS PermGen

PermGen is not collected by default when using CMS.

Setting -XX:+CMSPermGenSweepingEnabled and -XX:+CMSClassUnloadingEnabled lets the CMS threads collect PermGen and free class metadata.

6. Use G1

If you are using a big heap and fragmentation is a problem, G1 is worth exploring.

Nov 23rd, 2014

jvm
