Performance Tips and Tricks in .NET Applications
A .NET Developer Platform White Paper
Emmanuel Schanzer
Microsoft Corporation
August 2001
Summary: This article is for developers who want to tune their applications for optimal performance in the managed world. It provides sample code, explanations, and design guidelines for database, Windows Forms, and ASP.NET applications, as well as language-specific tips for Microsoft Visual Basic and Managed C++. (25 printed pages)

Contents
Overview
Performance Tips for All Applications
Tips for Database Access
Performance Tips for ASP.NET Applications
Tips for Porting and Developing in Visual Basic
Tips for Porting and Developing in Managed C++
Additional Resources
Appendix: Cost of Virtual Calls and Allocations

Overview
This white paper is designed as a reference for developers writing applications for .NET and looking for various ways to improve performance. If you are a developer who is new to .NET, you should first become familiar with both the platform and your language of choice. This paper builds strictly on that knowledge, and assumes that the programmer already knows enough to get the program running. If you are porting an existing application to .NET, it's worth reading this document before you begin the port. Some of the tips here are helpful in the design phase, and provide information you should be aware of before you begin the port.
This paper is divided into segments, with tips organized by project and developer type. The first set of tips is a must-read for writing in any language, and contains advice that will help you with any target language on the Common Language Runtime (CLR). A related section follows with ASP-specific tips. The second set of tips is organized by language, dealing with specific tips about using Managed C++ and Microsoft® Visual Basic®.
Due to schedule limitations, the version 1 (v1) run time had to target the broadest functionality first, and then deal with special-case optimizations later. This results in a few pigeonhole cases where performance becomes an issue, and this paper covers several tips that are designed to avoid those cases. These tips will not be relevant in the next version (vNext), as these cases are systematically identified and optimized. I'll point them out as we go, and it is up to you to decide whether it is worth the effort.

Performance Tips for All Applications
There are a few tips to remember when working on the CLR in any language. These are relevant to everyone, and should be the first line of defense when dealing with performance issues.

Throw Fewer Exceptions
Throwing exceptions can be very expensive, so make sure that you don't throw a lot of them. Use Perfmon to see how many exceptions your application is throwing. It may surprise you to find that certain areas of your application throw more exceptions than you expected. For better granularity, you can also check the number of exceptions programmatically by using Performance Counters.
Finding and designing away exception-heavy code can result in a decent perf win. Bear in mind that this has nothing to do with try/catch blocks: you only incur the cost when the actual exception is thrown. You can use as many try/catch blocks as you want. Using exceptions gratuitously is where you lose performance. For example, you should stay away from things like using exceptions for control flow.
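As a minimal sketch of designing away an exception used for control flow, the following compares a method that relies on catching DivideByZeroException with one that checks its input first. The class and method names are hypothetical and are not taken from this paper; the point is only that the second version returns the same result without ever throwing.

```csharp
using System;

// Hypothetical names, for illustration only.
public static class ControlFlowSketch
{
    // Exception-driven version: a zero divisor is handled by catching
    // DivideByZeroException, so every bad input pays the cost of a throw.
    public static int DivideWithCatch(int numerator, int denominator, int fallback)
    {
        try
        {
            return numerator / denominator;
        }
        catch (DivideByZeroException)
        {
            return fallback;
        }
    }

    // Check-first version: same result for a zero divisor, no exception thrown.
    public static int DivideWithCheck(int numerator, int denominator, int fallback)
    {
        if (denominator == 0)
        {
            return fallback;
        }
        return numerator / denominator;
    }

    public static void Main(string[] args)
    {
        Console.WriteLine(DivideWithCatch(10, 0, -1));
        Console.WriteLine(DivideWithCheck(10, 0, -1));
    }
}
```

The try/catch block in the first method costs nothing by itself; the expense comes only when the divisor is actually zero and the exception is thrown.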
Here's a simple example of how expensive exceptions can be: we'll simply run through a For loop, generating thousands of exceptions and then terminating. Try commenting out the throw statement to see the difference in speed: those exceptions result in tremendous overhead.
    // Wrapper class added so the sample compiles; its name is arbitrary.
    public class ExceptionTest
    {
        public static void Main(string[] args)
        {
            int j = 0;
            for (int i = 0; i < 10000; i++)
            {
                try
                {
                    j = i;
                    // Comment out the next line to see the difference in speed.
                    throw new System.Exception();
                }
                catch {}
            }
            System.Console.Write(j);
            return;
        }
    }

Beware! The run time can throw exceptions on its own! For example, Response.Redirect() throws a ThreadAbortException. Even if you don't explicitly throw exceptions, you may use functions that do. Make sure you check Perfmon to get the real story, and the debugger to check the source.
To Visual Basic developers: Visual Basic turns on integer checking by default, to make sure that things like overflow and divide-by-zero throw exceptions. You may want to turn this off to gain performance.
If you use COM, you should keep in mind that HRESULTs can return as exceptions. Make sure you keep track of these carefully.

Make Chunky Calls
A chunky call is a function call that performs several tasks, such as a method that initializes several fields of an object. Contrast this with chatty calls, which do very simple tasks and require multiple calls to get things done (such as setting every field of an object with a different call). It's important to make chunky, rather than chatty, calls across boundaries where the overhead is higher than for simple, intra-AppDomain method calls. P/Invoke, interop and remoting calls all carry overhead, and you want to use them sparingly. In each of these cases, you should try to design your application so that it doesn't rely on small, frequent calls that carry so much overhead.
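The difference in shape can be sketched as follows. RemoteCustomer is a hypothetical stand-in for an object that lives on the far side of an expensive boundary (remoting, COM interop, or P/Invoke); here it is an ordinary in-process class, and its CallCount field simply counts how many times the boundary would have been crossed.

```csharp
using System;

// Hypothetical stand-in for an object behind an expensive call boundary.
// CallCount counts how many boundary crossings the caller made.
public class RemoteCustomer
{
    public int CallCount;
    private string name;
    private string city;
    private int age;

    // Chatty interface: one crossing per field.
    public void SetName(string value) { CallCount++; name = value; }
    public void SetCity(string value) { CallCount++; city = value; }
    public void SetAge(int value) { CallCount++; age = value; }

    // Chunky interface: a single crossing sets every field.
    public void Initialize(string newName, string newCity, int newAge)
    {
        CallCount++;
        name = newName;
        city = newCity;
        age = newAge;
    }

    public string Describe()
    {
        return name + ", " + city + ", " + age;
    }
}

public static class ChunkySketch
{
    public static void Main(string[] args)
    {
        RemoteCustomer chatty = new RemoteCustomer();
        chatty.SetName("Ann");
        chatty.SetCity("Redmond");
        chatty.SetAge(30);

        RemoteCustomer chunky = new RemoteCustomer();
        chunky.Initialize("Ann", "Redmond", 30);

        Console.WriteLine("chatty crossings: " + chatty.CallCount);
        Console.WriteLine("chunky crossings: " + chunky.CallCount);
    }
}
```

Both objects end up in the same state, but the chatty caller pays the per-call overhead three times and the chunky caller pays it once.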
A transition occurs whenever managed code is called from unmanaged code, and vice versa. The run time makes it extremely easy for the programmer to do interop, but this comes at a performance price. When a transition happens, the following steps need to be taken:
Perform data marshalling
Fix Calling Convention
Protect callee-saved registers
Switch thread mode so that GC won't block unmanaged threads
Erect an Exception Handling frame on calls into managed code
Take control of thread (optional)
To speed up transition time, try to make use of P/Invoke when you can. The overhead is as little as 31 instructions plus the cost of marshalling if data marshalling is required, and only 8 otherwise. COM interop is much more expensive, taking upwards of 65 instructions.
Data marshalling isn't always expensive. Primitive types require almost no marshalling at all, and classes with explicit layout are also cheap. The real slowdown occurs during data translation, such as text conversion from ASCII to Unicode. Make sure that data that gets passed across the managed boundary is only converted if it needs to be: it may turn out that simply by agreeing on a certain datatype or format across your program you can cut out a lot of marshalling overhead.
The following types are called blittable, meaning they can be copied directly across the managed/unmanaged boundary with no marshalling whatsoever: sbyte, byte, short, ushort, int, uint, long, ulong, float and double. You can pass these for free, as well as ValueTypes and single-dimensional arrays containing blittable types. The gritty details of marshalling can be explored further on the MSDN Library. I recommend reading it carefully if you spend a lot of your time marshalling.

Design with ValueTypes
Use simple structs when you can, and when you don't do a lot of boxing and unboxing. Here's a simple example to demonstrate the speed difference:
    using System;

    namespace ConsoleApplication
    {
        public struct foo
        {
            public foo(double arg) { this.y = arg; }
            public double y;
        }

        public class bar
        {
            public bar(double arg) { this.y = arg; }
            public double y;
        }

        class Class1
        {
            static void Main(string[] args)
            {
                System.Console.WriteLine("starting struct loop...");
                for (int i = 0; i < 50000000; i++)
                {
                    foo test = new foo(3.14);
                }
                System.Console.WriteLine("struct loop complete. starting object loop...");
                for (int i = 0; i < 50000000; i++)
                {
                    bar test2 = new bar(3.14);
                }
                System.Console.WriteLine("All done");
            }
        }
    }
When you run this example, you'll see that the struct loop is orders of magnitude faster. However, it is important to beware of using ValueTypes when you treat them like objects. This adds extra boxing and unboxing overhead to your program, and can end up costing you more than it would if you had stuck with objects! To see this in action, modify the code above to store the foos and bars in an object array (object[]). You'll find that the performance is more or less equal.

Tradeoffs
ValueTypes are far less flexible than Objects, and end up hurting performance if used incorrectly. You need to be very careful about when and how you use them.
Try modifying the sample above, and storing the foos and bars inside object arrays or hashtables. You'll see the speed gain disappear with just one boxing and unboxing operation.
You can keep track of how heavily you box and unbox by looking at GC allocations and collections. This can be done using either Perfmon externally or Performance Counters in your code.
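To make the boxing cost concrete, here is a small sketch of where boxing and unboxing actually happen. The Sample struct and the BoxingSketch class are hypothetical names introduced for illustration, and System.Collections.ArrayList is assumed as the object-based container.

```csharp
using System;
using System.Collections;

// Hypothetical value type, for illustration only.
public struct Sample
{
    public double Y;
    public Sample(double y) { Y = y; }
}

public static class BoxingSketch
{
    // Boxing copies the struct into a new heap object, so a later change to
    // the original struct is not visible through the boxed copy.
    public static double BoxedCopyValue()
    {
        Sample s = new Sample(1.0);
        object boxed = s;            // boxing: heap allocation plus copy
        s.Y = 2.0;                   // changes only the original
        return ((Sample)boxed).Y;    // unboxing: copies the value back out
    }

    // Every Add boxes one struct; every cast in the loop unboxes one.
    public static double SumThroughArrayList(int count)
    {
        ArrayList list = new ArrayList();
        for (int i = 0; i < count; i++)
        {
            list.Add(new Sample(i));
        }
        double sum = 0;
        foreach (object item in list)
        {
            sum += ((Sample)item).Y;
        }
        return sum;
    }

    public static void Main(string[] args)
    {
        Console.WriteLine(BoxedCopyValue());
        Console.WriteLine(SumThroughArrayList(4));
    }
}
```

Each boxing operation is a heap allocation, which is why heavy boxing shows up in the GC allocation and collection counters mentioned above.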
See the in-depth discussion of ValueTypes in Performance Considerations of Run-Time Technologies in the .NET Framework.

Use AddRange to Add Groups
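As a minimal sketch of the two approaches this section compares, the following assumes a System.Collections.ArrayList as the target and a string array as the source. The class and method names are hypothetical and are not taken from this paper.

```csharp
using System;
using System.Collections;

// Hypothetical names, for illustration only.
public static class AddRangeSketch
{
    // Adds the items one at a time, with one method call per item.
    public static ArrayList AddOneByOne(ICollection items)
    {
        ArrayList list = new ArrayList();
        foreach (object item in items)
        {
            list.Add(item);
        }
        return list;
    }

    // Adds the whole group with a single AddRange call.
    public static ArrayList AddAsGroup(ICollection items)
    {
        ArrayList list = new ArrayList();
        list.AddRange(items);
        return list;
    }

    public static void Main(string[] args)
    {
        string[] names = { "alpha", "beta", "gamma" };
        Console.WriteLine(AddOneByOne(names).Count);
        Console.WriteLine(AddAsGroup(names).Count);
    }
}
```

Both methods produce a list with the same contents; only the number of calls made to the collection differs.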
Use AddRange to add a whole collection, rather than adding each item in the collection iteratively. Nearly all Windows controls and collections have both Add and AddRange methods, and each is optimized for a different purpose. Add is useful for adding a single item, whereas AddRange has some extra overhead but wins out when adding multiple items. Here are just a few of the classes that support Add and AddRange:
StringCollection, TraceCollection, etc.
HttpWebRequest
UserControl
ColumnHeader

Trim Your Working Set
Minimize the number of assemblies you use to keep your working set small. If you load an entire assembly just to use one method, you're paying a tremendous cost for very little benefit. See if you can duplicate that method's functionality using code that you already have loaded.
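One rough way to see what you are paying for is to list the assemblies currently loaded into the process. The sketch below assumes that AppDomain.CurrentDomain.GetAssemblies() and the Environment.WorkingSet property are available on your version of the runtime; the class and method names are hypothetical.

```csharp
using System;
using System.Reflection;

// Hypothetical names, for illustration only.
public static class WorkingSetSketch
{
    // Number of assemblies loaded into the current application domain.
    public static int CountLoadedAssemblies()
    {
        return AppDomain.CurrentDomain.GetAssemblies().Length;
    }

    public static void Main(string[] args)
    {
        // Print the simple name of every loaded assembly.
        foreach (Assembly assembly in AppDomain.CurrentDomain.GetAssemblies())
        {
            Console.WriteLine(assembly.GetName().Name);
        }
        Console.WriteLine("assemblies loaded: " + CountLoadedAssemblies());

        // Assumption: Environment.WorkingSet reports the physical memory
        // currently mapped to this process, in bytes.
        Console.WriteLine("working set (bytes): " + Environment.WorkingSet);
    }
}
```

Running this before and after the first call into a library gives a quick, if coarse, indication of what that one call pulled into the process.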
Keeping track of your working set is difficult, and could probably be the subject of an entire paper. Here are some tips to help you out:
Use vadump.exe to track your working set. This is discussed in another white paper covering various tools for the managed environment.
Look at Perfmon or Performance Counters. They can give you detailed feedback about the number of classes that you load, or the number of methods that get JITed. You can get readouts for how much time you spend in the loader, or what percent of your execution time is spent paging.

Use For Loops for String Iteration: version 1
In C#, the foreach keyword allows you to walk across items in a list, string, etc. and perform operations on each item. This is a very powerful tool, since it acts as a general-purpose enumerator over many types. The tradeoff for this generalization is speed, and if you rely heavily on string iteration you should use a For loop instead. Since strings are simple character arrays, they can be walked using much less overhead than other structures. The JIT is smart enough (in many cases) to optimize away bounds-checking and other things inside a For loop, but is prohibited from doing this on foreach walks. The end result is that in version 1, a For loop on strings is up to five times faster than using foreach. This will change in future versions, but for version 1 this is a definite way to increase performance.
Here's a simple test method to demonstrate the difference in speed. Try running it, then removing the For loop and uncommenting the foreach statement. On my machine, the For loop took about a second, with about 3 seconds for the foreach statement.
    // Wrapper class added so the sample compiles; its name is arbitrary.
    public class StringIterationTest
    {
        public static void Main(string[] args)
        {
            string s = "monkeys!";
            int dummy = 0;

            // Build a long string to iterate over.
            System.Text.StringBuilder sb = new System.Text.StringBuilder(s);
            for (int i = 0; i < 1000000; i++)
                sb.Append(s);
            s = sb.ToString();

            //foreach (char c in s) dummy++;
            // Walk every character by index so that both loops do the same work.
            for (int i = 0; i < s.Length; i++)
            {
                char c = s[i];
                dummy++;
            }
            return;
        }
    }

Tradeoffs
Foreach is far more readable, and in the future it will become as fast as a For loop for special cases like strings. Unless string manipulation is a real performance hog for you, the slightly messier code may not be worth it.
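If you would rather measure both loops in one run than edit the sample by hand, a timing harness along the following lines is one option. It assumes System.Diagnostics.Stopwatch is available, which is not the case on version 1 of the run time, and the class and method names are hypothetical. Treat the timings as specific to your own machine and runtime version rather than as confirmation of the figures quoted above.

```csharp
using System;
using System.Diagnostics;
using System.Text;

// Hypothetical names, for illustration only.
public static class StringWalkSketch
{
    // Counts '!' characters by walking the string with an indexed For loop.
    public static int CountWithFor(string s)
    {
        int count = 0;
        for (int i = 0; i < s.Length; i++)
        {
            if (s[i] == '!') count++;
        }
        return count;
    }

    // Counts '!' characters by walking the string with foreach.
    public static int CountWithForeach(string s)
    {
        int count = 0;
        foreach (char c in s)
        {
            if (c == '!') count++;
        }
        return count;
    }

    public static void Main(string[] args)
    {
        // Build the same kind of long test string as the sample above.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000000; i++) sb.Append("monkeys!");
        string s = sb.ToString();

        Stopwatch watch = Stopwatch.StartNew();
        int forCount = CountWithFor(s);
        watch.Stop();
        Console.WriteLine("for:     " + forCount + " matches, " + watch.ElapsedMilliseconds + " ms");

        watch = Stopwatch.StartNew();
        int foreachCount = CountWithForeach(s);
        watch.Stop();
        Console.WriteLine("foreach: " + foreachCount + " matches, " + watch.ElapsedMilliseconds + " ms");
    }
}
```

Because both methods return a count that is printed, the loops do observable work and are less likely to be optimized away entirely.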

