Changes in System.IO classes in .NET Framework 4.0


Earlier versions of the .NET Framework (from .NET 2.0 up to, but not including, .NET Framework 4.0) offered many APIs in the System.IO namespace for reading the lines of a file, listing the files in a directory, and so on. These APIs returned arrays, which could then be looped over to process each item.

For example, to print all the lines in a file, you can use the File.ReadAllLines API to get an array of strings (one string per line) and then print each line by iterating over the array.

  string[] allLines = File.ReadAllLines("foo.txt");
  foreach (string line in allLines)
      Console.WriteLine(line);

Similarly, to get a list of all the files or directories in a directory, you call the GetFiles/GetDirectories APIs on the DirectoryInfo class.

  DirectoryInfo dirInfo = new DirectoryInfo(@"c:\windows\system32");
  FileInfo[] arrayFileInfo = dirInfo.GetFiles();
  DirectoryInfo[] arrayDirectoryInfo = dirInfo.GetDirectories();

Issues With the Old APIs

While the above APIs worked as expected, they incurred a considerable performance hit when exercised on large files. The hit stems from the fact that these APIs are synchronous: the call blocks until all the lines in the file have been read, so that the array can be populated. Imagine how long that takes when you are parsing a 1 GB log file. On a memory-constrained execution environment there is a second issue: the amount of memory needed to allocate the array. Even if you are only interested in the first few lines, you still pay the penalty of loading every line into memory.
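To make the penalty concrete, here is a small sketch (the file name is hypothetical): even when only the first line is needed, File.ReadAllLines still reads and allocates the entire file.

```csharp
using System;
using System.IO;

class OldApiPenalty
{
    static void Main()
    {
        // Hypothetical large file: the ENTIRE file is read into a
        // string[] even though only the first element is used.
        string[] lines = File.ReadAllLines("huge.log");
        Console.WriteLine(lines[0]);
    }
}
```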

New APIs in .NET Framework 4.0

To overcome these issues, in .NET Framework 4.0 the Base Class Library folks on the Common Language Runtime team built new APIs around enumerators rather than arrays. These new APIs are far more efficient because they do not read all the lines into memory at once. And since only one line at a time is read into memory, you can abort your iteration at any point without paying the up-front cost we saw in the older APIs.

The new APIs are:

  • File.ReadLines (+1 overload)
  • File.WriteAllLines (+1 overload)
  • File.AppendAllLines (+1 overload)
  • DirectoryInfo.EnumerateDirectories (+2 overloads)
  • DirectoryInfo.EnumerateFiles (+2 overloads)
  • DirectoryInfo.EnumerateFileSystemInfos (+2 overloads)
  • Directory.EnumerateDirectories (+2 overloads)
  • Directory.EnumerateFiles (+2 overloads)
  • Directory.EnumerateFileSystemEntries (+2 overloads)
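As a quick sketch of the directory-walking side (the filter and Take count are just examples), Directory.EnumerateFiles hands back results lazily, so you can stop after a few matches without touching the rest of the tree:

```csharp
using System;
using System.IO;
using System.Linq;

class LazyDirectoryWalk
{
    static void Main()
    {
        // Lazily enumerate *.dll files; Take(10) means the walk stops
        // as soon as ten results have been produced.
        foreach (string file in Directory.EnumerateFiles(
            Environment.SystemDirectory, "*.dll").Take(10))
        {
            Console.WriteLine(file);
        }
    }
}
```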

These new APIs work by returning an IEnumerable<T>, which is much cheaper to produce than the fully populated array of objects returned by the earlier methods.

The application developer can then iterate over the returned enumerator, deferring the disk I/O that the older APIs performed up front.
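For example, reading just the first line of a potentially huge file becomes cheap (the file name is hypothetical):

```csharp
using System;
using System.IO;
using System.Linq;

class FirstLineOnly
{
    static void Main()
    {
        // Only one line is ever read from disk; the rest of the
        // file is never touched.
        string firstLine = File.ReadLines("huge.log").First();
        Console.WriteLine(firstLine);
    }
}
```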

Hands On

Here is how the APIs can be used:

  using System;
  using System.Collections.Generic;
  using System.IO;

  namespace FileEnumerators
  {
      class Program
      {
          static void Main(string[] args)
          {
              DateTimeOffset tstart = new DateTimeOffset(DateTime.Now);
              string[] oldLines = File.ReadAllLines(Environment.ExpandEnvironmentVariables(@"%TEMP%\registry.reg"));
              DateTimeOffset tstop = new DateTimeOffset(DateTime.Now);
              TimeSpan difference = tstop - tstart;
              Console.WriteLine("Time taken with old API = " + difference.ToString());

              tstart = new DateTimeOffset(DateTime.Now);
              for (int i = 0; i < oldLines.Length; i++)
              {
                  // Don't do anything; just cycle through.
              }
              tstop = new DateTimeOffset(DateTime.Now);
              TimeSpan cycleDifference = tstop - tstart;
              Console.WriteLine("Cycle time taken with old API = " + cycleDifference.ToString());

              tstart = new DateTimeOffset(DateTime.Now);
              IEnumerable<string> allLines = File.ReadLines(Environment.ExpandEnvironmentVariables(@"%TEMP%\registry.reg"));
              tstop = new DateTimeOffset(DateTime.Now);
              difference = tstop - tstart;
              Console.WriteLine("Time taken with new API = " + difference.ToString());

              tstart = new DateTimeOffset(DateTime.Now);
              foreach (string str in allLines)
              {
                  // Don't do anything; just cycle through.
              }
              tstop = new DateTimeOffset(DateTime.Now);
              cycleDifference = tstop - tstart;
              Console.WriteLine("Cycle time taken with new API = " + cycleDifference.ToString());
          }
      }
  }

The results are telling. On my system (which already had one core of a dual-core machine pegged at a constant load), I exported my registry to a temp file and ran the above code against it. On a DEBUG build, the results are below:

Time taken with old API = 00:00:11.6660156
Cycle time taken with old API = 00:00:00.0087891
Time taken with new API = 00:00:00
Cycle time taken with new API = 00:00:09.9238281

The new API (built on enumerators) does not block to read the entire contents of a large file; it returns the enumerator immediately, which is why it appears so fast. The old API loads everything into memory up front, so it takes the initial hit, but its cycle time is almost zero because it never has to hit the hard drive again to fetch the strings.

Usage of the Performant APIs

The new APIs are very useful when you want to list all the files and directories in a directory that has a lot of content, including many sub-directories.

They are equally useful when you want to read the lines of a very large text file.
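Because File.ReadLines returns IEnumerable<string>, it also composes naturally with LINQ. A grep-style sketch (the file name and filter string are hypothetical):

```csharp
using System;
using System.IO;
using System.Linq;

class GrepSketch
{
    static void Main()
    {
        // Stream the log file, keeping only lines that mention ERROR,
        // and stop after the first five matches.
        var errors = File.ReadLines("app.log")
                         .Where(line => line.Contains("ERROR"))
                         .Take(5);

        foreach (string line in errors)
            Console.WriteLine(line);
    }
}
```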


In this article, we saw how the new enumerator-based APIs in System.IO improve performance by reducing the initial lookup time.


About the Author

Vipul Patel

Vipul Patel is a Software Engineer at Microsoft Corporation, currently working in the Office Communications Group; he previously worked on the .NET team in the Base Class Libraries and the Debugging and Profiling teams. He can be reached at

