Introduction to GDB

0. Purpose

I’m new to GDB and C, since I mainly use Java, but there are situations where a Java programmer needs GDB to solve a problem. So I’m 'bookmarking' it in this article~

1. Debug Target

gdb can debug programs written in many languages, such as C, C++, and Objective-C. Unlike the debugging tools we use in Java, it is usually driven through a command-line interface rather than a GUI.

Before we start debugging, we should compile the code with debug information.

For example, with gcc we add the -g flag:

$ gcc example.c -g -o example

This is also why, if we want to debug the JVM itself, we must first build it with debug info.
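
As a running example for the rest of this post, assume a minimal example.c roughly like the following (the file name and contents are just an illustration):

/* example.c - tiny program used to illustrate the gdb commands below */
#include <stdio.h>

static int add(int a, int b) {
    return a + b;
}

int main(void) {
    int sum = add(1, 2);
    printf("sum = %d\n", sum);
    return 0;
}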

2. Run as debug

using

$ gdb ./example

then we enter the gdb command line, where we can use the Tab key to complete a command or list all candidates.

Here are some basic commands:

command | description
dir [directories] | set the source code search path
r | run the program
b [where] | set a breakpoint; where is an expression like 'xx_method' or a line number
b [where] if [condition] | break only when the condition holds
l | list source code
step | step into a function
c | continue
finish | run until the current function returns
p [var name] | print the value of a variable
x/? [address] | examine memory at an address
dump memory [file] [start_addr] [end_addr] | dump the memory between two addresses into a file
info threads | list threads
info breakpoints | list all breakpoints
thread [tid] | switch to a thread
bt | show the stack trace (backtrace)
[enter] | repeat the last command

and so on…
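
A minimal session against the example program above might look like this (addresses, line numbers and the pid are illustrative, and the output is abbreviated):

$ gdb ./example
(gdb) b main
Breakpoint 1 at 0x40053e: file example.c, line 9.
(gdb) r
Breakpoint 1, main () at example.c:9
9           int sum = add(1, 2);
(gdb) step
add (a=1, b=2) at example.c:5
5           return a + b;
(gdb) p a
$1 = 1
(gdb) bt
#0  add (a=1, b=2) at example.c:5
#1  0x000000000040055c in main () at example.c:9
(gdb) c
Continuing.
sum = 3
[Inferior 1 (process 12345) exited normally]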

3. Attach to a running process

using

$ gdb ./example [pid]

we attach to the running process with that pid (alternatively, gdb -p [pid] attaches without naming the executable).

4. Run with a core dump file

using

$ gdb ./example [coredump file path]

A previous post covers the details of debugging core dump files.


Troubleshooting Tools

0. The purpose


There are many tools for tracking down system problems. It's like detective work: first we must find 'evidence' as quickly as possible. Besides awareness, using the right tool for the problem is vital. Let's record them here.

1. Core dump


when

The first 'tool' is not really a tool: it's a snapshot file that records memory, processor state, registers, and so on when our program crashes for an OS-level reason. Like a Java thread/heap dump, it's very useful for diagnosing and debugging the problem.

When can we get a core dump file?

Linux uses signals as an asynchronous event-handling mechanism. Each signal has a default action, such as Ignore (ignore the signal), Stop (suspend the process), Terminate (terminate the process), and Core (terminate and dump core).

So when the Core action is triggered, Linux generates a core dump.

Signal | Action | Notes
SIGQUIT | Core | Quit from keyboard, e.g. "Ctrl+\"
SIGILL | Core | Illegal instruction, e.g. kill -ILL $$
SIGABRT | Core | Abort signal from abort()
SIGSEGV | Core | Invalid memory reference, e.g. writing through a null pointer or overflowing the stack
SIGTRAP | Core | Trace/breakpoint trap

For the full signal list, please see the signal(7) manual page. The triggering signal is also shown in the core dump, which helps us recognize the type of problem.

how

I. enable configuration

But we often hear that 'I could not find any core dump when it crashed'.

There are a few switches that control whether the system generates a core dump.

If ulimit -c returns 0, core dumps are disabled and no core file will be generated. We can run ulimit -c unlimited to enable core dumps without limiting the core file size.
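
For example, checking and then enabling core dumps in the current shell:

$ ulimit -c
0
$ ulimit -c unlimited
$ ulimit -c
unlimited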

The above command is only effective for the current terminal session; to make it permanent, modify /etc/security/limits.conf:

#/etc/security/limits.conf
#Each line describes a limit for a user in the form:
#<domain>      <type>  <item>         <value>
      *         soft     core        unlimited
II. dump file path
  • By default the core file is written to the process's current working directory, with the file name core.
  • Run sysctl -a | grep core: kernel.core_pattern controls the core file path and name, and setting kernel.core_uses_pid to 1 appends the process id to the file name.
  • Modify them with sysctl -w (or persist them in /etc/sysctl.conf and reload with sysctl -p) as root, as in the example below.
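
For example, to write core files to /tmp named after the executable and pid (the path and pattern here are just one possible choice; %e is the executable name and %p the pid):

$ sudo sysctl -w kernel.core_pattern=/tmp/core.%e.%p
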
III. debug core file

Use the command gdb [program] [coredump] to inspect the core file:

vagrant@vagrant-ubuntu-trusty:~/test$ gdb seg core
GNU gdb (Ubuntu 7.8-1ubuntu4) 7.8.0.20141001-cvs
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from seg...(no debugging symbols found)...done.
[New LWP 12887]
Core was generated by `./seg'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000000000400506 in main ()
(gdb) where
#0  0x0000000000400506 in main ()
(gdb) info frame
Stack level 0, frame at 0x7fff961538c0:
 rip = 0x400506 in main; saved rip = 0x7f17f9d0eec5
 Arglist at 0x7fff961538b0, args:
 Locals at 0x7fff961538b0, Previous frame's sp is 0x7fff961538c0
 Saved registers:
  rbp at 0x7fff961538b0, rip at 0x7fff961538b8
(gdb) Quit
IV. generate dump with gdb

We can also use gdb to generate a core dump on demand.

First, attach to the target (e.g. Java) pid:

gdb -q --pid=xxx

then

(gdb) generate-core-file     

Finally, don't forget to detach from the process:

(gdb) detach 
V. for java process

If the core dump comes from a Java process,

we can transform it into a Java heap dump with jmap:

jmap -dump:format=b,file=heap.hprof $JAVA_HOME/bin/java core.xxx

Then use tools like MAT to analyze the memory.

Or we can use jstack to see the Java stack traces:

jstack -m $JAVA_HOME/bin/java core.xxx

(PS: jstack may hit a bug here.)


2. dmesg/messages


when

When a process has crashed, we may need to confirm whether it was killed by the oom-killer.

how

We can use dmesg and /var/log/messages* (or /var/adm/messages.* on some systems) to see kernel messages (dmesg reads the kernel ring buffer; the messages files keep everything logged).

Besides boot information, oom-killer records will appear there. We can use

sudo dmesg | grep java | grep -i oom-killer

to confirm whether the program was killed by the oom-killer.


3. strace


when

“strace – trace system calls and signals”

We can use strace to watch the system calls and signals issued by a program, including their parameters and return values.

how

I. follow calls and signals

Use the strace command to see the system calls:

strace ./[program]

We will see system call parameters, return values, and signals:

mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f90c0139000
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f90c0137000
arch_prctl(ARCH_SET_FS, 0x7f90c0137740) = 0
mprotect(0x7f90bff19000, 16384, PROT_READ) = 0
mprotect(0x600000, 4096, PROT_READ)     = 0
mprotect(0x7f90c0146000, 4096, PROT_READ) = 0
munmap(0x7f90c013a000, 39504)           = 0
fstat(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f90c0143000
read(0, 0x7f90c0143000, 1024)           = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=13689, si_uid=1000} ---
+++ killed by SIGTERM +++
II. count system calls
strace -c ./[program]

We will see a summary like this:

vagrant@vagrant-ubuntu-trusty:~/test$ strace -c ./test_strace
^CProcess 13742 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 32.09    0.000060           8         8           mmap
 23.53    0.000044          15         3           fstat
 14.44    0.000027           7         4           mprotect
  9.63    0.000018          18         1           munmap
  8.56    0.000016           8         2           open
  6.95    0.000013           4         3         3 access
  1.60    0.000003           3         1           read
  1.07    0.000002           1         2           close
  1.07    0.000002           2         1           execve
  0.53    0.000001           1         1           brk
  0.53    0.000001           1         1           arch_prctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.000187                    27         3 total        
III. other options
option | effect
-o [file] | write the trace output to a file
-T | show the time spent in each system call
-p [pid] | trace a running process
-e [expr] | a qualifying expression selecting which events to trace or how to trace them, e.g. -e trace=signal to trace only signals, or -e trace=write to trace only write calls (see the example below)
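
For example, to attach to a running process and record only its file-open calls to a log file (the pid and file name are placeholders):

$ strace -p 12345 -e trace=open,openat -o open.log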


4. ulimit


when

In Linux, every process has resource limits, such as the number of child processes, the number of open files, the core dump file size, etc.

If our process exceeds a limit, it can fail or crash.

how

Limits are configured per resource, but for some resources the usage is counted at the user level.

So for a limit like 'open files', the check may add up process A's open files and process B's open files (for the same user) to decide whether the limit is exceeded.

This is why software like Apache is often run under its own dedicated user.

To see a process's limits, use

cat /proc/[pid]/limits
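
The output looks roughly like this (values are illustrative and abbreviated):

Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max core file size        0                    unlimited            bytes
Max processes             15000                15000                processes
Max open files            1024                 4096                 files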

Where do the limit values come from?

  • init process: set by the kernel
  • system services: set via setrlimit
  • shell processes: per login from /etc/security/limits.conf, or set by an earlier ulimit command
  • programs executed from the shell: inherited from the shell process

To modify a process's limits (as a normal user we can only change the soft limit, up to the hard limit):

With ulimit we can modify the shell's limits, which are then inherited by processes forked from the shell.

Use echo -n "Max processes=xx:yy" > /proc/<pid>/limits

to modify the limits of a running process (root access is needed).
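
On systems with util-linux installed, prlimit offers another way to inspect or change the limits of a running process (the pid and values below are placeholders):

$ prlimit --pid 12345                            # show all limits
$ sudo prlimit --pid 12345 --nofile=4096:8192    # set soft:hard open-file limit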



How to Deal With Non-heap or Native Memory Leaks

when the problem occurs outside the heap


This week I met two very strange memory problems. Although they were in different systems, they had the same appearance: the Java process used much more memory than the JVM options allowed (e.g. the JVM options capped the heap at 1.5G, but in top the java process consumed 2.9G).

direct buffer memory


The first system was a data-transfer service that uses Netty heavily. In the heap dump we saw many DirectByteBuffer objects, which is one signal of a direct buffer memory problem. We then used the counters in java.nio.Bits to confirm that direct buffer memory grew quickly when many requests came in.

In the end, we found that the Netty sender produced far more requests than the receiver could handle. So we rechecked our Netty configuration, set the write-buffer water mark option, and checked channel.isWritable() before writing, to make sure Netty still had capacity for the request; if not, we retry it later. After that, direct buffer memory was under control.

native memory


The other system was a library scanner; it scans many .jar files every day.

At first we dumped the heap several times, but could not find any footprint of the leak in the heap dumps.

So we changed tools and used pmap -x [pid] and cat /proc/[pid]/smaps to see the Java process's memory usage.

First, we found many big anonymous blocks of 65500+ KB (about 64 MB), like this:

00007f8b00000000   65512   40148   40148 rwx--    [ anon ]
00007f8b03ffa000      24       0       0 -----    [ anon ]
00007f8b04000000   65520   59816   59816 rwx--    [ anon ]
00007f8b07ffc000      16       0       0 -----    [ anon ]

Then we used gdb to dump the contents of these blocks:

gdb -p [pid]

dump memory mem.bin 0x00007f8b00000000 0x00007f8b00000000+65512

Then make the content readable:

cat mem.bin | strings

To our surprise, the block content was MANIFEST.MF files. At this point we started to suspect jar files:

pmap -x [pid] | grep '.jar' | sort -k6 | uniq -c

After doing some statistics, we found that many scanned jar files were mapped into memory more than once and never released. That was the problem.

Opening the JDK source (jdk/src/share/native/java/util/zip/zip_util.c and ZipFile.c) shows that ZipFile/JarFile uses mmap and only unmaps when close() is called.

So after rechecking the code and fixing the unclosed ZipFile/JarFile instances, the problem was solved.

UPDATE: besides unclosed jar files, before JDK 1.7 URLClassLoader closed JarFiles lazily, which also leads to mmap leaks. We can use 1.7+ (which adds URLClassLoader.close()), hack it like this: http://snipplr.com/view/24224/class-loader-which-close-opened-jar-files/, or try System.gc().

UPDATE: another approach is to stop JarFile/ZipFile from using mmap at all; the JVM option -Dsun.zip.disableMemoryMapping=true disables it. http://www.oracle.com/us/technologies/java/overview-156328.html

Summary


With a heap dump we can find Java heap problems, but Java code and Java libraries can also cause non-heap memory leaks.

When using an NIO library, we must watch direct memory usage, use the library carefully, and ideally monitor java.nio.Bits (though that approach is a bit of a hack).

Besides NIO problems, misused JNI APIs also lead to non-heap memory leaks. We should try pmap or /proc/[pid]/smaps to find which jar/lib/stack/file might be using too much memory, then use gdb to inspect the contents of those blocks to help the analysis.


Tuning CMS GC

Introduction to the CMS Collector


CMS (Concurrent Mark-Sweep) is a GC algorithm widely used by response-time-critical applications.

For the young generation (minor GC), CMS still stops all application threads, but ParNew scans with multiple threads, and scanning the young generation is quick because it is small.

For the old generation, instead of stopping the application threads for a full GC, it uses background threads to periodically scan the old generation; application threads only pause when it is really necessary, so its overall pause time is much lower than a collector like the throughput collector.

The trade-off is that CMS uses more CPU to drive the scanning threads. In addition, the background scan does not compact the old generation, so it becomes fragmented. When CPU is scarce or the heap is too fragmented to allocate new objects, CMS falls back to a full GC to resolve the problem.

CMS is enabled by -XX:+UseConcMarkSweepGC -XX:+UseParNewGC.
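
An illustrative startup line might look like this; the heap sizes and thread count are placeholders rather than recommendations:

$ java -Xms4g -Xmx4g -Xmn1g \
       -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
       -XX:ParallelGCThreads=4 \
       -jar app.jar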

General recommendations


1. Never specify a heap that is larger than physical memory

Otherwise the system will swap to disk, which is very, very slow.

2. Set the initial and max heap size to the same value

This avoids heap resize operations, although the process will hold a bigger heap than it really needs.

3. Size the generations correctly

As we know, the heap is divided into regions. The larger the young generation, the fewer young GCs occur (although each one is slower); the smaller the old generation, the more full GCs take place.

4. Keep the perm generation from filling up

Resizing the perm gen requires a full GC, which is expensive.

5. Control the number of GC threads

The flag -XX:ParallelGCThreads=N controls the number of threads used for young generation collection and the stop-the-world phases of the old generation. More threads shorten the pauses, but take more CPU as a consequence.

CMS-specific notes


1. minor gc

The bigger the young generation, the fewer GCs occur; but the bigger the young generation, the longer each GC takes.

Survivor size and tenuring

TLABS..todo.

2. concurrent cycle detail

A concurrent cycle is triggered when the old generation is sufficiently full.

It starts with the initial mark phase, which stops application threads to find the GC root objects in the heap.

Next is the mark phase, which does not stop the application and runs concurrently with it.

Then the preclean phase also runs concurrently, followed by the abortable preclean phase, which waits until the young generation is about 50% full to reduce the chance of a young GC happening right before the remark.

Then the remark phase stops application threads again.

The sweep phase then runs concurrently.

A concurrent cycle does not collect the young generation itself, but because of the abortable preclean phase there will usually be at least one young GC during the cycle.

3. CMS failure

a. concurrent mode failure

When a young GC occurs and there is not enough room in the old generation for the promoted objects, CMS triggers a full GC.

b. promotion failure due to fragmentation

When a young GC occurs but the old gen is too fragmented for the promoted objects, the collector will, in the middle of the young GC (ParNew), collect and compact the entire old generation. This takes even longer than a concurrent mode failure, because compaction is expensive.

c. full gc without concurrent failure

Normally CMS never performs a full GC except for concurrent mode failure or fragmentation, but a full perm gen will trigger a full GC, too. So keep an eye on the perm gen.

d. a race

When the old gen fills up to CMSInitiatingOccupancyFraction percent (default about 70%), the concurrent cycle starts a race: CMS must finish scanning and freeing objects before the remaining (100% − CMSInitiatingOccupancyFraction) fills up. If it loses, we get a concurrent mode failure.

4. Solving concurrent mode failures

a. run concurrent cycle more often

Setting -XX:+UseCMSInitiatingOccupancyOnly and -XX:CMSInitiatingOccupancyFraction=N controls when the cycle starts.

A small CMSInitiatingOccupancyFraction makes the cycle start sooner. But too small a value makes the concurrent cycle consume a lot of CPU time, its start time becomes unpredictable when CPU is scarce, and running that many concurrent cycles can increase the overall pause time.

b. run the concurrent cycle faster

Setting -XX:ConcGCThreads=N controls the number of threads running the cycle. More threads make the cycle faster, at the cost of more CPU usage.

If concurrent mode failures are rare, a smaller value saves CPU.

5. CMS PermGen

The perm gen is not collected by default when using CMS.

Setting -XX:+CMSPermGenSweepingEnabled and -XX:+CMSClassUnloadingEnabled lets the CMS threads collect the perm gen and free class metadata.

6. Use G1

If you are using a big heap and fragmentation problems bother you, G1 is worth exploring.
