annotate gc/doc/debugging.html @ 51488:5de98dce4bd1

*** empty log message ***
author Dave Love <fx@gnu.org>
date Thu, 05 Jun 2003 17:49:22 +0000
parents
children
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
51488
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
1 <HTML>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
2 <HEAD>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
3 <TITLE>Debugging Garbage Collector Related Problems</title>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
4 </head>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
5 <BODY>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
6 <H1>Debugging Garbage Collector Related Problems</h1>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
7 This page contains some hints on
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
8 debugging issues specific to
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
9 the Boehm-Demers-Weiser conservative garbage collector.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
10 It applies both to debugging issues in client code that manifest themselves
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
11 as collector misbehavior, and to debugging the collector itself.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
12 <P>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
13 If you suspect a bug in the collector itself, it is strongly recommended
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
14 that you try the latest collector release, even if it is labelled as "alpha",
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
15 before proceeding.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
16 <H2>Bus Errors and Segmentation Violations</h2>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
17 <P>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
18 If the fault occurred in GC_find_limit, or with incremental collection enabled,
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
19 this is probably normal. The collector installs handlers to take care of
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
20 these. You will not see these unless you are using a debugger.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
21 Your debugger <I>should</i> allow you to continue.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
22 It's often preferable to tell the debugger to ignore SIGBUS and SIGSEGV
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
23 ("<TT>handle SIGSEGV SIGBUS nostop noprint</tt>" in gdb,
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
24 "<TT>ignore SIGSEGV SIGBUS</tt>" in most versions of dbx)
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
25 and set a breakpoint in <TT>abort</tt>.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
26 The collector will call abort if the signal had another cause,
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
27 and there was not other handler previously installed.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
28 <P>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
29 We recommend debugging without incremental collection if possible.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
30 (This applies directly to UNIX systems.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
31 Debugging with incremental collection under win32 is worse. See README.win32.)
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
32 <P>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
33 If the application generates an unhandled SIGSEGV or equivalent, it may
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
34 often be easiest to set the environment variable GC_LOOP_ON_ABORT. On many
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
35 platforms, this will cause the collector to loop in a handler when the
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
36 SIGSEGV is encountered (or when the collector aborts for some other reason),
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
37 and a debugger can then be attached to the looping
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
38 process. This sidesteps common operating system problems related
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
39 to incomplete core files for multithreaded applications, etc.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
40 <H2>Other Signals</h2>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
41 On most platforms, the multithreaded version of the collector needs one or
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
42 two other signals for internal use by the collector in stopping threads.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
43 It is normally wise to tell the debugger to ignore these. On Linux,
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
44 the collector currently uses SIGPWR and SIGXCPU by default.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
45 <H2>Warning Messages About Needing to Allocate Blacklisted Blocks</h2>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
46 The garbage collector generates warning messages of the form
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
47 <PRE>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
48 Needed to allocate blacklisted block at 0x...
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
49 </pre>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
50 when it needs to allocate a block at a location that it knows to be
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
51 referenced by a false pointer. These false pointers can be either permanent
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
52 (<I>e.g.</i> a static integer variable that never changes) or temporary.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
53 In the latter case, the warning is largely spurious, and the block will
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
54 eventually be reclaimed normally.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
55 In the former case, the program will still run correctly, but the block
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
56 will never be reclaimed. Unless the block is intended to be
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
57 permanent, the warning indicates a memory leak.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
58 <OL>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
59 <LI>Ignore these warnings while you are using GC_DEBUG. Some of the routines
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
60 mentioned below don't have debugging equivalents. (Alternatively, write
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
61 the missing routines and send them to me.)
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
62 <LI>Replace allocator calls that request large blocks with calls to
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
63 <TT>GC_malloc_ignore_off_page</tt> or
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
64 <TT>GC_malloc_atomic_ignore_off_page</tt>. You may want to set a
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
65 breakpoint in <TT>GC_default_warn_proc</tt> to help you identify such calls.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
66 Make sure that a pointer to somewhere near the beginning of the resulting block
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
67 is maintained in a (preferably volatile) variable as long as
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
68 the block is needed.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
69 <LI>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
70 If the large blocks are allocated with realloc, we suggest instead allocating
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
71 them with something like the following. Note that the realloc size increment
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
72 should be fairly large (e.g. a factor of 3/2) for this to exhibit reasonable
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
73 performance. But we all know we should do that anyway.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
74 <PRE>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
75 void * big_realloc(void *p, size_t new_size)
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
76 {
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
77 size_t old_size = GC_size(p);
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
78 void * result;
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
79
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
80 if (new_size <= 10000) return(GC_realloc(p, new_size));
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
81 if (new_size <= old_size) return(p);
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
82 result = GC_malloc_ignore_off_page(new_size);
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
83 if (result == 0) return(0);
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
84 memcpy(result,p,old_size);
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
85 GC_free(p);
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
86 return(result);
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
87 }
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
88 </pre>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
89
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
90 <LI> In the unlikely case that even relatively small object
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
91 (&lt;20KB) allocations are triggering these warnings, then your address
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
92 space contains lots of "bogus pointers", i.e. values that appear to
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
93 be pointers but aren't. Usually this can be solved by using GC_malloc_atomic
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
94 or the routines in gc_typed.h to allocate large pointer-free regions of bitmaps, etc. Sometimes the problem can be solved with trivial changes of encoding
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
95 in certain values. It is possible, to identify the source of the bogus
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
96 pointers by building the collector with <TT>-DPRINT_BLACK_LIST</tt>,
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
97 which will cause it to print the "bogus pointers", along with their location.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
98
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
99 <LI> If you get only a fixed number of these warnings, you are probably only
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
100 introducing a bounded leak by ignoring them. If the data structures being
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
101 allocated are intended to be permanent, then it is also safe to ignore them.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
102 The warnings can be turned off by calling GC_set_warn_proc with a procedure
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
103 that ignores these warnings (e.g. by doing absolutely nothing).
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
104 </ol>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
105
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
106 <H2>The Collector References a Bad Address in <TT>GC_malloc</tt></h2>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
107
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
108 This typically happens while the collector is trying to remove an entry from
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
109 its free list, and the free list pointer is bad because the free list link
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
110 in the last allocated object was bad.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
111 <P>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
112 With &gt; 99% probability, you wrote past the end of an allocated object.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
113 Try setting <TT>GC_DEBUG</tt> before including <TT>gc.h</tt> and
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
114 allocating with <TT>GC_MALLOC</tt>. This will try to detect such
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
115 overwrite errors.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
116
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
117 <H2>Unexpectedly Large Heap</h2>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
118
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
119 Unexpected heap growth can be due to one of the following:
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
120 <OL>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
121 <LI> Data structures that are being unintentionally retained. This
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
122 is commonly caused by data structures that are no longer being used,
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
123 but were not cleared, or by caches growing without bounds.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
124 <LI> Pointer misidentification. The garbage collector is interpreting
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
125 integers or other data as pointers and retaining the "referenced"
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
126 objects.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
127 <LI> Heap fragmentation. This should never result in unbounded growth,
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
128 but it may account for larger heaps. This is most commonly caused
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
129 by allocation of large objects. On some platforms it can be reduced
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
130 by building with -DUSE_MUNMAP, which will cause the collector to unmap
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
131 memory corresponding to pages that have not been recently used.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
132 <LI> Per object overhead. This is usually a relatively minor effect, but
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
133 it may be worth considering. If the collector recognizes interior
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
134 pointers, object sizes are increased, so that one-past-the-end pointers
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
135 are correctly recognized. The collector can be configured not to do this
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
136 (<TT>-DDONT_ADD_BYTE_AT_END</tt>).
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
137 <P>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
138 The collector rounds up object sizes so the result fits well into the
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
139 chunk size (<TT>HBLKSIZE</tt>, normally 4K on 32 bit machines, 8K
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
140 on 64 bit machines) used by the collector. Thus it may be worth avoiding
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
141 objects of size 2K + 1 (or 2K if a byte is being added at the end.)
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
142 </ol>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
143 The last two cases can often be identified by looking at the output
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
144 of a call to <TT>GC_dump()</tt>. Among other things, it will print the
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
145 list of free heap blocks, and a very brief description of all chunks in
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
146 the heap, the object sizes they correspond to, and how many live objects
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
147 were found in the chunk at the last collection.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
148 <P>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
149 Growing data structures can usually be identified by
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
150 <OL>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
151 <LI> Building the collector with <TT>-DKEEP_BACK_PTRS</tt>,
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
152 <LI> Preferably using debugging allocation (defining <TT>GC_DEBUG</tt>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
153 before including <TT>gc.h</tt> and allocating with <TT>GC_MALLOC</tt>),
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
154 so that objects will be identified by their allocation site,
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
155 <LI> Running the application long enough so
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
156 that most of the heap is composed of "leaked" memory, and
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
157 <LI> Then calling <TT>GC_generate_random_backtrace()</tt> from backptr.h
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
158 a few times to determine why some randomly sampled objects in the heap are
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
159 being retained.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
160 </ol>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
161 <P>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
162 The same technique can often be used to identify problems with false
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
163 pointers, by noting whether the reference chains printed by
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
164 <TT>GC_generate_random_backtrace()</tt> involve any misidentified pointers.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
165 An alternate technique is to build the collector with
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
166 <TT>-DPRINT_BLACK_LIST</tt> which will cause it to report values that
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
167 are almost, but not quite, look like heap pointers. It is very likely that
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
168 actual false pointers will come from similar sources.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
169 <P>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
170 In the unlikely case that false pointers are an issue, it can usually
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
171 be resolved using one or more of the following techniques:
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
172 <OL>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
173 <LI> Use <TT>GC_malloc_atomic</tt> for objects containing no pointers.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
174 This is especially important for large arrays containing compressed data,
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
175 pseudo-random numbers, and the like. It is also likely to improve GC
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
176 performance, perhaps drastically so if the application is paging.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
177 <LI> If you allocate large objects containing only
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
178 one or two pointers at the beginning, either try the typed allocation
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
179 primitives is <TT>gc_typed.h</tt>, or separate out the pointerfree component.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
180 <LI> Consider using <TT>GC_malloc_ignore_off_page()</tt>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
181 to allocate large objects. (See <TT>gc.h</tt> and above for details.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
182 Large means &gt; 100K in most environments.)
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
183 </ol>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
184 <H2>Prematurely Reclaimed Objects</h2>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
185 The usual symptom of this is a segmentation fault, or an obviously overwritten
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
186 value in a heap object. This should, of course, be impossible. In practice,
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
187 it may happen for reasons like the following:
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
188 <OL>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
189 <LI> The collector did not intercept the creation of threads correctly in
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
190 a multithreaded application, <I>e.g.</i> because the client called
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
191 <TT>pthread_create</tt> without including <TT>gc.h</tt>, which redefines it.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
192 <LI> The last pointer to an object in the garbage collected heap was stored
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
193 somewhere were the collector couldn't see it, <I>e.g.</i> in an
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
194 object allocated with system <TT>malloc</tt>, in certain types of
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
195 <TT>mmap</tt>ed files,
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
196 or in some data structure visible only to the OS. (On some platforms,
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
197 thread-local storage is one of these.)
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
198 <LI> The last pointer to an object was somehow disguised, <I>e.g.</i> by
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
199 XORing it with another pointer.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
200 <LI> Incorrect use of <TT>GC_malloc_atomic</tt> or typed allocation.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
201 <LI> An incorrect <TT>GC_free</tt> call.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
202 <LI> The client program overwrote an internal garbage collector data structure.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
203 <LI> A garbage collector bug.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
204 <LI> (Empirically less likely than any of the above.) A compiler optimization
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
205 that disguised the last pointer.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
206 </ol>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
207 The following relatively simple techniques should be tried first to narrow
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
208 down the problem:
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
209 <OL>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
210 <LI> If you are using the incremental collector try turning it off for
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
211 debugging.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
212 <LI> If you are using shared libraries, try linking statically. If that works,
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
213 ensure that DYNAMIC_LOADING is defined on your platform.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
214 <LI> Try to reproduce the problem with fully debuggable unoptimized code.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
215 This will eliminate the last possibility, as well as making debugging easier.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
216 <LI> Try replacing any suspect typed allocation and <TT>GC_malloc_atomic</tt>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
217 calls with calls to <TT>GC_malloc</tt>.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
218 <LI> Try removing any GC_free calls (<I>e.g.</i> with a suitable
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
219 <TT>#define</tt>).
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
220 <LI> Rebuild the collector with <TT>-DGC_ASSERTIONS</tt>.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
221 <LI> If the following works on your platform (i.e. if gctest still works
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
222 if you do this), try building the collector with
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
223 <TT>-DREDIRECT_MALLOC=GC_malloc_uncollectable</tt>. This will cause
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
224 the collector to scan memory allocated with malloc.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
225 </ol>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
226 If all else fails, you will have to attack this with a debugger.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
227 Suggested steps:
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
228 <OL>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
229 <LI> Call <TT>GC_dump()</tt> from the debugger around the time of the failure. Verify
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
230 that the collectors idea of the root set (i.e. static data regions which
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
231 it should scan for pointers) looks plausible. If not, i.e. if it doesn't
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
232 include some static variables, report this as
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
233 a collector bug. Be sure to describe your platform precisely, since this sort
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
234 of problem is nearly always very platform dependent.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
235 <LI> Especially if the failure is not deterministic, try to isolate it to
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
236 a relatively small test case.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
237 <LI> Set a break point in <TT>GC_finish_collection</tt>. This is a good
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
238 point to examine what has been marked, i.e. found reachable, by the
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
239 collector.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
240 <LI> If the failure is deterministic, run the process
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
241 up to the last collection before the failure.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
242 Note that the variable <TT>GC_gc_no</tt> counts collections and can be used
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
243 to set a conditional breakpoint in the right one. It is incremented just
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
244 before the call to GC_finish_collection.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
245 If object <TT>p</tt> was prematurely recycled, it may be helpful to
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
246 look at <TT>*GC_find_header(p)</tt> at the failure point.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
247 The <TT>hb_last_reclaimed</tt> field will identify the collection number
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
248 during which its block was last swept.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
249 <LI> Verify that the offending object still has its correct contents at
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
250 this point.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
251 Then call <TT>GC_is_marked(p)</tt> from the debugger to verify that the
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
252 object has not been marked, and is about to be reclaimed. Note that
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
253 <TT>GC_is_marked(p)</tt> expects the real address of an object (the
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
254 address of the debug header if there is one), and thus it may
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
255 be more appropriate to call <TT>GC_is_marked(GC_base(p))</tt>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
256 instead.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
257 <LI> Determine a path from a root, i.e. static variable, stack, or
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
258 register variable,
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
259 to the reclaimed object. Call <TT>GC_is_marked(q)</tt> for each object
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
260 <TT>q</tt> along the path, trying to locate the first unmarked object, say
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
261 <TT>r</tt>.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
262 <LI> If <TT>r</tt> is pointed to by a static root,
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
263 verify that the location
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
264 pointing to it is part of the root set printed by <TT>GC_dump()</tt>. If it
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
265 is on the stack in the main (or only) thread, verify that
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
266 <TT>GC_stackbottom</tt> is set correctly to the base of the stack. If it is
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
267 in another thread stack, check the collector's thread data structure
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
268 (<TT>GC_thread[]</tt> on several platforms) to make sure that stack bounds
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
269 are set correctly.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
270 <LI> If <TT>r</tt> is pointed to by heap object <TT>s</tt>, check that the
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
271 collector's layout description for <TT>s</tt> is such that the pointer field
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
272 will be scanned. Call <TT>*GC_find_header(s)</tt> to look at the descriptor
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
273 for the heap chunk. The <TT>hb_descr</tt> field specifies the layout
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
274 of objects in that chunk. See gc_mark.h for the meaning of the descriptor.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
275 (If it's low order 2 bits are zero, then it is just the length of the
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
276 object prefix to be scanned. This form is always used for objects allocated
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
277 with <TT>GC_malloc</tt> or <TT>GC_malloc_atomic</tt>.)
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
278 <LI> If the failure is not deterministic, you may still be able to apply some
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
279 of the above technique at the point of failure. But remember that objects
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
280 allocated since the last collection will not have been marked, even if the
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
281 collector is functioning properly. On some platforms, the collector
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
282 can be configured to save call chains in objects for debugging.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
283 Enabling this feature will also cause it to save the call stack at the
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
284 point of the last GC in GC_arrays._last_stack.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
285 <LI> When looking at GC internal data structures remember that a number
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
286 of <TT>GC_</tt><I>xxx</i> variables are really macro defined to
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
287 <TT>GC_arrays._</tt><I>xxx</i>, so that
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
288 the collector can avoid scanning them.
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
289 </ol>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
290 </body>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
291 </html>
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
292
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
293
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
294
5de98dce4bd1 *** empty log message ***
Dave Love <fx@gnu.org>
parents:
diff changeset
295