[NCLUG] Garbage Collection in C

Tkil tkil at scrye.com
Wed Nov 26 00:23:55 MST 2003


Um... first off, please post text (not HTML) to this list.  Easier for
everyone.

>>>>> "Alex" == Alex Bobby <linuxprogrammer2003 at yahoo.co.in> writes:

Alex> Do we have any better technical approach to implement garbage
Alex> collection in C, other than establishing a counter and freeing
Alex> () the dynamic chunk when the count becomes zero.

Alex> I believe to make use of the storage classes of the pointers
Alex> pointing to the chunk, the information embedded in the object
Alex> file, by the compiler.

Alex> Thus, we go beyond the compilation phase, which I am not
Alex> interested in.

First off, you have to define "better": faster to code?  faster to
execute?  more portable to different architectures?  more portable to
different operating systems?  easier to use?  less memory used?

Note that C is mostly designed for low-level coding.  It can be used
for higher-level applications, but typically through the creation of
appropriate abstraction layers.

Going far enough down this path gets you first to C++, which has all
the "on the metal" capabilities of C but with vastly more powerful
(and more likely to be abused) abstraction capabilities.

Going even further, if you find yourself asking questions like "how do
I implement garbage collection in C", consider using a [much]
higher-level language: Java if you like strongly typed, Perl or Python
if scripting is acceptable.  There are more esoteric languages out
there, as well; PHP has a well-deserved niche for dynamic content
generation, and Ruby seems here to stay (and there are some Ruby
bigots^Wproponents on this list).

Regarding specific techniques, what you describe is known as
"reference counting", and is a fine technique.  Watch out for cycles,
though.  In C++, this is most often implemented through the use of
"smart pointers".  In Perl and Java, reference counts are maintained
by the runtime engine; when the counts go to zero, the objects are
destroyed (possibly asynchronously; Perl prefers to do so when the
last reference is dropped, but Java makes no such guarantee).

"Garbage collection" more often refers to slightly more opaque methods
that have a more intimate knowledge of the application binary
interface (ABI) of a given platform (OS + hardware).  The basic idea
is to track all allocations, then periodically scan all pointers to
see if any of them are pointing at those locations.  Any that are no
longer pointed at are ready to be collected.

Here is where the tradeoffs of memory usage and timeliness come in, as
well as the potential loss of portability.  Should it copy everything
that is used, then blow away what isn't?  How does it track what is
used?  Can it hold up the execution of the entire process while it
tries to determine what is refernced and what isn't?  How much CPU and
memory are you willing to give up for this?

Google keywords like "Boehm Garbage Collector" will likely be
enlightening.

Note that some large C projects do view garbage collection as the
lesser evil, compared to arguing with intricate structures of
indeterminate lifetimes, e.g., GCC.  Generally it is viewed as a type
of laziness and imprecision; you'll note that the Linux kernel prefers
reference counting over more general garbage collection.

So, if you want to continue this discussion:

1. What problem are you trying to solve?

2. What solutions are you proposing?

3. What other solutions have you examined, and why did you decide they
   were inadequate?

Thanks,
t.



More information about the NCLUG mailing list