[Ns-developers] Bug 790: Memory leak in TestSuite routing-aodv-regression

craigdo@ee.washington.edu craigdo at ee.washington.edu
Thu Jan 14 11:06:51 PST 2010


> Compare test.py around line 673 (run_job_synchronously), with wutils.py
> around line 126.  Mathieu (IIRC) had already detected this problem and
> added a regexp matching on the valgrind output to detect memory leaks.
> When the new test.py script was written, this fix was not taken into
> consideration.

I have never observed an instance where valgrind, called directly, reported a block being "definitely lost" but did not return the error specified by "error-exitcode".

In any case, "still reachable" is a different class of error than "definitely lost" and we have never considered "still reachable" an error that needs to be fixed.  This is probably because valgrind doesn't either:

----------

Because there are different kinds of leaks with different severities, an interesting question is this: which leaks should be counted as true "errors" and which should not? The answer to this question affects the numbers printed in the ERROR SUMMARY line, and also the effect of the --error-exitcode option. Memcheck uses the following criteria:

First, a leak is only counted as a true "error" if --leak-check=full is specified. In other words, an unprinted leak is not considered a true "error". If this were not the case, it would be possible to get a high error count but not have any errors printed, which would be confusing.

After that, definitely lost and possibly lost blocks are counted as true "errors". Indirectly lost and still reachable blocks are not counted as true "errors", even if --show-reachable=yes is specified and they are printed; this is because such blocks don't need direct fixing by the programmer.

----------
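
To make the criteria quoted above concrete, here is a minimal sketch of a valgrind invocation that can fail on leaks: it has to combine --leak-check=full with --error-exitcode. This is not the code in test.py; the helper names and the exit code value 2 are assumptions chosen for illustration.

```python
import subprocess


def build_valgrind_command(program_args, error_exitcode=2, show_reachable=False):
    """Return an argv list that runs program_args under valgrind memcheck.

    Hypothetical helper. Only --leak-check=full, --error-exitcode and
    --show-reachable=yes are real valgrind options here.
    """
    cmd = [
        "valgrind",
        "--leak-check=full",                     # leaks only count as errors with this
        "--error-exitcode=%d" % error_exitcode,  # exit status used when errors are found
    ]
    if show_reachable:
        # Prints "still reachable" blocks; per the manual they are still not errors.
        cmd.append("--show-reachable=yes")
    return cmd + list(program_args)


def run_under_valgrind(program_args, error_exitcode=2):
    """Run the program under valgrind and return (valgrind_failed, stderr_text).

    Caveat: if the program itself exits with error_exitcode, the two cases
    cannot be told apart from the return code alone.
    """
    proc = subprocess.run(
        build_valgrind_command(program_args, error_exitcode),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode == error_exitcode, proc.stderr
```

With a call such as run_under_valgrind(["utils/test-runner", "--suite=routing-olsr-header"]), the first element of the result would be expected to be true only when valgrind counted at least one true "error".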

If we really want to, we can easily print error information for "still reachable" blocks by turning on --show-reachable=yes, but these still won't be treated as errors by valgrind.  If we want to error on this case, we will need to look for "still reachable" in the valgrind output.  I don't know of any reason to look for "definitely lost" in the output since valgrind does correctly return "error-exitcode" if it finds such an error.
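
As a rough sketch of what looking for "still reachable" in the valgrind output could mean, the snippet below scans the text for leak-summary lines of the form shown later in this thread ("still reachable: 936 bytes in 7 blocks"). The function names are hypothetical, and this is not the regexp from wutils.py.

```python
import re

# Matches leak-summary lines such as "indirectly lost: 17,256 bytes in 150 blocks".
LEAK_LINE = re.compile(
    r"(definitely lost|indirectly lost|possibly lost|still reachable):"
    r"\s*([\d,]+) bytes in ([\d,]+) blocks"
)


def parse_leak_summary(valgrind_output):
    """Map each leak category found in the text to (bytes, blocks).

    If a category appears more than once, the last occurrence wins.
    """
    summary = {}
    for category, nbytes, nblocks in LEAK_LINE.findall(valgrind_output):
        summary[category] = (
            int(nbytes.replace(",", "")),
            int(nblocks.replace(",", "")),
        )
    return summary


def has_still_reachable(valgrind_output):
    """Return True if the output reports a non-zero amount of still reachable memory."""
    nbytes, _ = parse_leak_summary(valgrind_output).get("still reachable", (0, 0))
    return nbytes > 0
```

A test runner that wanted to treat this case as a failure could call has_still_reachable() on the captured valgrind output in addition to checking the exit code.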

We can discuss turning "still reachable" into an error, but now is probably not the optimal time.

-- Craig

FYI, here's some more on the error categories from the memcheck manual:

Every possible case can be reduced to one of the above nine. Memcheck merges some of these cases in its output, resulting in the following four categories.

* "Still reachable". This covers cases 1 and 2 (for the BBB blocks) above. A start-pointer or chain of start-pointers to the block is found. Since the block is still pointed at, the programmer could, at least in principle, have freed it before program exit. Because these are very common and arguably not a problem, Memcheck won't report such blocks individually unless --show-reachable=yes is specified.

* "Definitely lost". This covers case 3 (for the BBB blocks) above. This means that no pointer to the block can be found. The block is classified as "lost", because the programmer could not possibly have freed it at program exit, since no pointer to it exists. This is likely a symptom of having lost the pointer at some earlier point in the program. Such cases should be fixed by the programmer.

* "Indirectly lost". This covers cases 4 and 9 (for the BBB blocks) above. This means that the block is lost, not because there are no pointers to it, but rather because all the blocks that point to it are themselves lost. For example, if you have a binary tree and the root node is lost, all its children nodes will be indirectly lost. Because the problem will disappear if the definitely lost block that caused the indirect leak is fixed, Memcheck won't report such blocks individually unless --show-reachable=yes is specified.

* "Possibly lost". This covers cases 5--8 (for the BBB blocks) above. This means that a chain of one or more pointers to the block has been found, but at least one of the pointers is an interior-pointer. This could just be a random value in memory that happens to point into a block, and so you shouldn't consider this ok unless you know you have interior-pointers.
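
To tie the four categories back to the error criteria, here is a small sketch of which leak reports would be expected to trigger --error-exitcode under the rules quoted in this message. The names are hypothetical, the byte counts are copied from the test.py output reported in this thread, and other kinds of memcheck errors (invalid reads and so on) are deliberately ignored.

```python
# Per the criteria quoted in this message: with --leak-check=full, only these
# two categories count as true "errors".
ERROR_CATEGORIES = ("definitely lost", "possibly lost")


def counts_as_error(category, leak_check_full=True):
    """Return True if a leak in this category is counted as a true error."""
    if not leak_check_full:
        # An unprinted leak is never counted as an error.
        return False
    return category in ERROR_CATEGORIES


def would_trigger_error_exitcode(byte_counts, leak_check_full=True):
    """byte_counts maps a category name to the number of leaked bytes."""
    return any(
        nbytes > 0 and counts_as_error(category, leak_check_full)
        for category, nbytes in byte_counts.items()
    )


# Figures reported for two of the suites in this thread.
olsr_header = {
    "definitely lost": 224,
    "indirectly lost": 17256,
    "possibly lost": 0,
    "still reachable": 200,
}
aodv_regression = {"still reachable": 936}

print(would_trigger_error_exitcode(olsr_header))      # True
print(would_trigger_error_exitcode(aodv_regression))  # False
```

Under these assumptions only the routing-olsr-header report would be flagged; the suites that show nothing but "still reachable" blocks would not.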

> -----Original Message-----
> From: ns-developers-bounces at ISI.EDU
> [mailto:ns-developers-bounces at ISI.EDU] On Behalf Of Gustavo Carneiro
> Sent: Thursday, January 14, 2010 10:02 AM
> To: Faker Moatamri
> Cc: ns-developers at ISI.EDU
> Subject: Re: [Ns-developers] Bug 790: Memory leak in TestSuite
> routing-aodv-regression
> 
> Compare test.py around line 673 (run_job_synchronously), with wutils.py
> around line 126.  Mathieu (IIRC) had already detected this problem and
> added a regexp matching on the valgrind output to detect memory leaks.
> When the new test.py script was written, this fix was not taken into
> consideration.
> 
> 2010/1/14 Faker Moatamri <faker.moatamri at sophia.inria.fr>
> 
> > Hi,
> > Actually this is not only for routing-aodv-regression, but the worst
> > part is that valgrind is returning 0, which leaves everything green:
> >
> > By running ./test.py -g -v you get the following memory leaks (I
> > removed the uninteresting lines):
> >
> > utils/test-runner --suite=routing-aodv-regression
> > still reachable: 936 bytes in 7 blocks
> >
> > utils/test-runner --suite=routing-aodv
> > still reachable: 280 bytes in 6 blocks
> >
> > utils/test-runner --suite=routing-olsr-regression
> > still reachable: 200 bytes in 4 blocks
> >
> > utils/test-runner --suite=routing-olsr-header
> > definitely lost: 224 bytes in 2 blocks.
> > indirectly lost: 17,256 bytes in 150 blocks.
> > possibly lost: 0 bytes in 0 blocks.
> > still reachable: 200 bytes in 4 blocks.
> >
> > utils/test-runner --suite=ipv6-protocol
> > still reachable: 200 bytes in 4 blocks
> >
> > utils/test-runner --suite=packetbb-test-suite
> > still reachable: 5,816 bytes in 42 blocks
> >
> > utils/test-runner --suite=drop-tail-queue
> > still reachable: 720 bytes in 16 blocks
> >
> > utils/test-runner --suite=packet-metadata
> > still reachable: 936 bytes in 7 blocks
> >
> > utils/test-runner --suite=buffer
> > 200 bytes in 4 blocks
> >
> > utils/test-runner --suite=object-name-service
> > still reachable: 200 bytes in 4 blocks
> >
> > examples/stats/wifi-example-sim
> > still reachable: 200 bytes in 4 blocks
> >
> > examples/tcp/star
> > still reachable: 200 bytes in 4 blocks
> >
> > and there are some others like those...
> >
> > They are not detected as errors and if you remove the -v option you
> > wouldn't see anything. Any thoughts guys?
> >
> > Best regards
> >
> > Faker Moatamri
> >
> >
> 
> 
> --
> Gustavo J. A. M. Carneiro
> INESC Porto, Telecommunications and Multimedia Unit
> "The universe is always one step beyond logic." -- Frank Herbert




