Just Say No to Finalize Methods!


This article was contributed by Brent Rector.

Environment: Managed C++

Introduction

When you learn a new programming language and framework, such as .NET, you naturally try to map what you already know onto the new concepts. Generally, this helps you come up to speed in the new environment more quickly. Occasionally, however, concepts that seem similar are not, and you can end up creating applications that don't behave as you intended.

C++ destructors and .NET finalizers initially seem to be similar, if not identical, concepts. A developer who creates a finalizer, expecting it to work like a C++ destructor, will be unpleasantly surprised. A C++ destructor is a specially named method (~className) that the C++ runtime executes immediately when you tell it to destroy an instance of the class. A C# finalizer is also a specially named method. In fact, you define it using the ~className syntax just like a C++ destructor (which I personally think is unfortunate). However, any similarity between the two concepts stops at this point.

When the .NET garbage collector (GC) decides it can collect an object, it checks whether the object has a Finalize method that needs to be called. If it does, the GC does not collect the object; instead, it keeps the object alive and schedules a background thread to call the Finalize method at some indefinite time in the future. This difference in semantics can have a huge effect on your object's behavior.

It is the C++ programmer's responsibility to control the lifetime of an object instance and tell the C++ runtime when to call the destructor method for an object and reclaim the memory for the object. Note that this model makes a few assumptions.

C++ Destructor, .NET Finalizer, and Lifetime Management Semantics

First, the burden is on the client of a C++ object to control the lifetime of both the object's resources and the object's memory. When you heap allocate a C++ object, you must call delete at the appropriate time to invoke the destructor (which typically releases the object's resources) and free the memory allocated for the object. Alternatively, you must stack allocate the object to tell the runtime to call the destructor and (implicitly) reclaim the memory when the stack frame goes out of scope. Of course, some stack frames don't go out of scope for hours, even days or weeks, so the burden is still on the client to decide whether this is the appropriate lifetime management of the object. When the client gets the lifetime management wrong, the object never gets its destructor called, and the memory leaks.

Second, placing resource clean-up code in a destructor assumes that resource lifetime and object lifetime end at the same point in time. Some designs reuse objects and you may well want to free resources of an object far sooner than releasing the object itself. This is easy to do, actually. You simply add a Close or Dispose method to the object and have the client call the method as soon as the resources should be released. In C++ terms, this means the client must determine the appropriate time to instruct the object to release its resources and the client must determine the appropriate time to tell the object to self-destruct (end of object lifetime).
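In C#, the client-side half of this contract is usually written with IDisposable and a using block, which calls Dispose deterministically when the block exits. A minimal sketch, assuming a hypothetical LogFile wrapper (none of these names come from the article):

```csharp
using System;

// Hypothetical wrapper around a non-memory resource whose lifetime
// the client can end long before the object itself becomes unreachable.
class LogFile : IDisposable
{
    public bool Disposed { get; private set; }

    public void Write(string line)
    {
        if (Disposed) throw new ObjectDisposedException(nameof(LogFile));
        // ... write to the underlying handle ...
    }

    // The client calls Dispose (directly, or implicitly via `using`)
    // as soon as the resource should be released.
    public void Dispose()
    {
        // release the underlying handle here
        Disposed = true;
    }
}

class Program
{
    static void Main()
    {
        using (var log = new LogFile())
        {
            log.Write("hello");
        } // Dispose runs here, deterministically, on this thread
    }
}
```

The using block is just syntactic sugar for a try/finally that calls Dispose, so the resource is released even if Write throws.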

The client of a .NET object must also determine the appropriate resource lifetime and call the Dispose method to free those resources. However, the client of a .NET object never controls the lifetime of the object. At some unknown point in time, after you can no longer use an object, the GC will attempt to collect the object. At some unknown time after that, the .NET runtime will call your finalize method. At some unknown time after that, the .NET runtime will reclaim the memory for the object.

The obvious advantage to this design is that you can never forget to free unreferenced objects. Additionally, you can never mistakenly reference a previously freed object. This eliminates most common memory management bugs. However, this model forces you to separate the lifetime management of resources encapsulated by an object and the lifetime of the object itself.

Therefore, in the .NET world, by design, we place resource clean-up code in a Dispose method and require the client to call it at the appropriate time. The GC itself automatically handles the object lifetime and memory reclamation.
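On the implementer's side, this division of labor is what the standard .NET Dispose pattern encodes: Dispose releases resources deterministically and calls GC.SuppressFinalize so a well-behaved client never pays finalization costs, while the finalizer remains only as a backstop. A sketch with illustrative names, not a definitive implementation:

```csharp
using System;

// Sketch of the standard Dispose pattern. Dispose frees resources
// deterministically; the finalizer is only a safety net for clients
// that forget to call Dispose.
class Resource : IDisposable
{
    public bool Disposed { get; private set; }

    public void Dispose()
    {
        Dispose(true);
        // Tell the GC it no longer needs to run the finalizer,
        // removing this instance from the finalization list.
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (Disposed) return;
        if (disposing)
        {
            // called from Dispose: safe to touch managed members here
        }
        // release unmanaged resources (handles, native memory) here
        Disposed = true;
    }

    ~Resource() { Dispose(false); } // backstop only; members may be dead
}
```

The disposing flag records which path reached the clean-up code: true means a client called Dispose and managed members are still usable; false means the finalizer thread called it and only this object's own unmanaged state may be touched.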

Performance Considerations

class Foo {
  ~Foo() { /* ... */ }
}

The ~Foo method declaration in the above C# code is identical to a destructor syntactically, but in C# it represents a finalizer. In fact, the C# compiler actually generates the following code:

class Foo {
  protected override void Finalize() {
    try {
      // ...
    }
    finally { base.Finalize(); }
  }
}

However, adding a finalizer to a class, even an empty one, makes instances of that class slower to allocate, and the mere presence of such objects on the heap makes the GC itself run more slowly.

Each time you create an instance of a class with a finalizer, the runtime must record that it needs to call the finalizer for that instance before the GC can collect the object. This bookkeeping makes allocations of such objects take longer. (In absolute terms, this isn't a big performance loss, but compared to the extremely fast allocation of non-finalizable objects, it is considerably slower.)

Each time the garbage collector runs, for each collectable object in the heap, the GC has to search this finalization object list to see whether the collectable object is in the list. The longer this list, the more time the GC spends searching the list and the slower the GC runs.

Finally, when the GC finds a collectable object on the finalization list, it cannot collect it right away. Instead, it must keep the object alive and schedule a background thread to call the Finalize method at some time in the future. This operation can have pervasive side-effects.

For example, when the collectable object itself holds references to other objects, keeping the finalizable object alive also keeps alive every object it references, plus their dependencies, and so on. Additionally, because a finalizable object survives its first garbage collection, the GC promotes it from generation zero to generation one. The GC collects objects in higher generations less frequently than objects in lower generations, so the net effect is that the object and everything it references stay in memory much longer than they would if the object had no finalizer at all, even an empty one.
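The survival effect described above can be observed with a WeakReference. This is a sketch, assuming a Release build on the standard workstation GC; exact collection timing is implementation-dependent, so treat the printed values as typical rather than guaranteed. The type names are illustrative:

```csharp
using System;

class Finalizable { ~Finalizable() { } } // an empty finalizer is enough

class Program
{
    // Allocate in a separate method so no live local roots the object.
    public static WeakReference Make() => new WeakReference(new Finalizable());

    static void Main()
    {
        WeakReference wr = Make();

        GC.Collect();
        // Typically still alive: the GC must keep the object around
        // until its finalizer has run on the finalizer thread.
        Console.WriteLine(wr.IsAlive);

        GC.WaitForPendingFinalizers();
        GC.Collect();
        // Now the finalizer has run, so this collection can reclaim it.
        Console.WriteLine(wr.IsAlive);
    }
}
```

A class identical to Finalizable but without the ~Finalizable() method would typically be reclaimed by the very first GC.Collect() call.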

Usage Considerations

Let's now examine what code in a Finalize method realistically can do. First, you should put as little code as possible in the method and make it execute as quickly as possible. A single background thread sequentially calls all Finalize methods, so if your Finalize method runs slowly, it delays the Finalize call for every object queued behind it.

Second, don't block in a finalizer. Technically, this is the same point as the first one because blocking can be considered making the method run extremely slowly.

Third, you cannot, in general, use any member reference variables of an object in its Finalize method. It is impossible to guarantee the finalization order of objects. Therefore, when the GC background thread calls an object's Finalize method, other objects referenced by its member variables may have already had their Finalize methods called. This means that these referenced objects may no longer be functional.
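The hazard can be sketched like this; Outer and Dependency are hypothetical types, and the dangerous call is left commented out precisely because finalization order makes it unsafe:

```csharp
using System;

class Dependency
{
    public bool Finalized { get; private set; }

    public void Use()
    {
        if (Finalized) throw new ObjectDisposedException(nameof(Dependency));
    }

    ~Dependency() { Finalized = true; }
}

class Outer
{
    Dependency dep = new Dependency();

    ~Outer()
    {
        // WRONG in general: the GC gives no ordering guarantee, so
        // dep's finalizer may already have run when this line executes.
        // dep.Use();

        // Safe finalizer code touches only this object's own unmanaged
        // state (for example, an IntPtr handle), never finalizable members.
    }
}
```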

Conclusion

So what's the purpose of a Finalize method? That's a good question. My opinion is that Java had one, so .NET needed one as well. Therefore, marketing comparisons could show that .NET matches Java feature by feature as much as possible. However, I think it wasn't a terribly useful feature in Java and it's not a terribly useful feature in .NET.

The only code you can usefully put in a Finalize method releases non-memory resources. However, it would be far better to have the client release such resources as soon as possible via a Dispose method. So, in effect, you are creating a Finalize method solely to free resources in the case the client screws up and forgets to free them.

In the best-case scenario, you are done using an object (its lifetime ends) at the same time as you are done using its non-memory resources (their lifetimes end), and you have willingly incurred all the performance costs of implementing the Finalize method. Immediately after you could have called the object's Dispose method, the garbage collector runs and attempts to collect the object. It schedules the idle finalizer background thread to run, and the thread immediately calls your Finalize method. The method releases the non-memory resources. The memory for the object still stays around until the next GC run, at which time the GC reclaims the memory for the object and all the objects it references.

In a worst-case scenario, your object's Finalize method won't run until your process terminates gracefully. (Actually, the real worst case is that it never runs at all because the process terminates abnormally.) Shortly thereafter, non-memory resources will be released anyway, so the Finalize method served no real purpose other than to keep the application from running as fast as it otherwise could.

Just say No to Finalize methods!

About the Author

Brent Rector has designed and written operating systems, compilers, and many, many applications. He began developing Windows applications using a beta version of Windows 1.0. Brent is the founder of Wise Owl Consulting, Inc. (http://www.wiseowl.com), and the architect and primary developer of Demeanor for .NET, Enterprise Edition—the premier .NET code obfuscator. In addition, Brent is an instructor for Wintellect (http://www.wintellect.com). He has also written several books on Windows programming, including ATL Internals, Win32 Programming, and others.



Comments

  • Finalizers are not totally evil, but just say No to New!

    Posted by Legacy on 07/14/2003 12:00am

    Originally posted by: Steve Russell

    Brent,

    Thanks for the insightful comments. In my classes, I teach that the GC should be viewed as a "safety net". It's there in case we, as programmers, forget to call the Dispose or Close method of a class.

    I think you're a bit pessimistic though saying "Just say NO". Safety nets are a good idea. We're all human, and we make mistakes. Yes, we run the risk of holding resources for way too long, but that would have happened if we'd forgotten to call "delete" in C++ too. At least with a safety net in place, there's a good chance that the destructor will be called, although we may have to wait until program termination for it to happen. In C++, it would _never_ happen, assuming we're using object pointers. And, yep, if our program dies abnormally, both C++ and .NET are in trouble.

    Will the GC framework make our program run slower? Maybe. The advantage of the GC is that we don't pay the price of complex heap management with lots of new/delete calls. Doing "new" operations in .NET is relatively fast. The disadvantage is that we pay a _BIG_ price when we're running low on resources/memory and the GC kicks in. Where's the break-even point? It depends on your app and system. Good programmers will evaluate the costs. For some applications, the GC wins.

    The real danger of the GC is that programmers get sloppy. They just assume that they can just leave their dirty objects lying around, and the GC will magically find their garbage and clean up after them. Sorry, but there's no Mom hiding in the machine. There are real costs associated with running the GC. A tidy programmer will not want to incur Mom's wrath, and will clean up after themselves.

    But as you point out, there is no way for the object's user to control the lifetime of its memory in .NET. Or is there? I encourage my students to recycle their objects. Each class can (should?) provide an object factory, such as a static/shared Create method (very COM-like, isn't it?), and then the Dispose method can place objects back into an object pool within the class (COM+, anyone?). Users are then encouraged to use Create rather than new.

    To me, saying "NO to NEW" is a good slogan too. It keeps the total memory profile quite small, since we don't just program as if memory is infinite. And we don't have to resign ourselves to regular GC overhead as simply a "fact" of life we must accept.

    As .NET programmers, we are _still_ responsible for managing memory usage. The GC is not a panacea, and we are very foolish if we assume that it is. But at least there is a safety net (or is that safety .net?) in place for those times when we slip up.

    Steve Russell,
    MCT/MCSD/MCSE/...

  • Thank you for the conclusion and ....

    Posted by Legacy on 07/11/2003 12:00am

    Originally posted by: Zhifang Zhao

    But

    As to using reference variables to other objects, what is the difference between using them in a normal class method and in a finalizer?

    If my class foo has a reference variable to a class fofo object that is used in foo's finalizer, is there any chance of the fofo object being collected by the GC before foo's finalizer gets executed? The fofo object still has a reference to it before foo's finalizer executes, right?

    Thank you

    Zhifang
