Microsoft has called me a few times in the past for an interview: once to work for the ASP.NET team and once to work for what used to be called the national practices services, part of Microsoft Consulting. For a while I also worked as a consultant for Microsoft Consulting Services.
Microsoft is building software for general consumption or, thinking of it another way, for the widest audience possible. Part of my interview with the ASP.NET team was about how to build software for general consumption. The challenge with general consumption software is that a developer may want to make things easier, but easy software sometimes opens some doors and closes others. I think this is what happened with EnumerateFiles and EnumerateDirectories. GetFiles and GetDirectories replaced older API methods such as findnext to make traversing the file system easier. The drawback with GetFiles and GetDirectories was that these methods returned arrays, and the code had to wait until the entire array was populated before the calls returned. To eliminate the wait, GetFiles and GetDirectories have newer versions named EnumerateFiles and EnumerateDirectories. You no longer have to wait for EnumerateFiles and EnumerateDirectories to return to begin using the returned files and directories, but there are a couple of minor obstacles involved here too.
With EnumerateFiles and EnumerateDirectories you don't have to map API calls or write your own recursive descent to map items in sub-folders, and you don't have to wait for the entire enumeration to finish to interact with the results. The obstacle is that if you encounter an error, for example when you enumerate folders and child folders, the EnumerateXxx method will throw an exception and the enumeration will fail. This makes it difficult to do something as seemingly simple as create a file system browser with these methods. There is hope though.
Let's take a look at EnumerateFiles and EnumerateDirectories, and at how to use LINQ and exception handlers to walk the file system even in the face of access exceptions.
Using EnumerateFiles and EnumerateDirectories
A version of EnumerateFiles and EnumerateDirectories exists in both the Directory and DirectoryInfo classes in the System.IO namespace. The version in the Directory class can traverse sub-directories and returns just the names of the matched items. If you need the file or directory information then use the DirectoryInfo version. Both EnumerateFiles and EnumerateDirectories have several overloaded versions that accept various arguments, including the starting path, the file mask, and a SearchOption enumeration argument that determines whether just the top folder or all folders are searched.
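As a minimal sketch of the difference between the two classes, the following code builds a small folder tree under the temporary directory (an assumption made so the example runs without touching protected folders) and enumerates it both ways. The module name and the GetNames and GetInfos helpers are hypothetical and exist only for illustration.

```vbnet
Imports System
Imports System.IO
Imports System.Linq

Module EnumerateOverloadsSketch
    ' Hypothetical helper: the Directory class returns full paths as strings.
    ' SearchOption.AllDirectories includes every sub-folder below root.
    Function GetNames(root As String) As String()
        Return Directory.EnumerateDirectories(root, "*", SearchOption.AllDirectories).ToArray()
    End Function

    ' Hypothetical helper: the DirectoryInfo class returns DirectoryInfo objects.
    ' SearchOption.TopDirectoryOnly limits the search to the immediate children.
    Function GetInfos(root As String) As DirectoryInfo()
        Dim info As New DirectoryInfo(root)
        Return info.EnumerateDirectories("*", SearchOption.TopDirectoryOnly).ToArray()
    End Function

    Sub Main()
        ' Assumption: a throwaway tree root\a\b under the temp folder.
        Dim root As String = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName())
        Directory.CreateDirectory(Path.Combine(root, "a", "b"))
        Try
            For Each name As String In GetNames(root)
                Console.WriteLine(name)
            Next
            For Each di As DirectoryInfo In GetInfos(root)
                Console.WriteLine("{0} created {1}", di.Name, di.CreationTime)
            Next
        Finally
            Directory.Delete(root, True)
        End Try
    End Sub
End Module
```

The string version is enough when you only need paths. The DirectoryInfo version saves a second lookup when you also want properties such as the creation time.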
The following code uses Directory.EnumerateFiles to return the files in the root of the C: drive. The resultant enumeration is used as a source in a LINQ query. (In this case it is worth noting that the LINQ query doesn't filter the data so it is superfluous; that is, you could just use Directory.EnumerateFiles by itself.) The call to ToArray converts the enumeration to an array, a requirement of the Array.ForEach method, and a Lambda expression writes the results to the console.
    Dim files = From file In Directory.EnumerateFiles("C:\") Select file
    Array.ForEach(files.ToArray(), Sub(f) Console.WriteLine(f))
In such a simple use case EnumerateFiles and EnumerateDirectories are easy to use. This is what we want from a framework.
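Because the query above does no filtering, the same output can be produced by looping over the enumeration directly, which also avoids building an array before anything is written. The sketch below shows that form alongside a query that does filter. The temporary folder, the ".tmp" extension, and the FilesWithExtension helper are assumptions chosen for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq

Module EnumerateFilesSketch
    ' Hypothetical helper: lazily yields only the files with the given extension.
    Function FilesWithExtension(folder As String, extension As String) As IEnumerable(Of String)
        Return From f In Directory.EnumerateFiles(folder)
               Where String.Equals(Path.GetExtension(f), extension, StringComparison.OrdinalIgnoreCase)
               Select f
    End Function

    Sub Main()
        ' Assumption: the temp folder stands in for the root of the C: drive.
        Dim folder As String = Path.GetTempPath()

        ' No LINQ and no array: each name is written as soon as it is produced.
        For Each f As String In Directory.EnumerateFiles(folder)
            Console.WriteLine(f)
        Next

        ' LINQ becomes useful once the results need filtering.
        For Each f As String In FilesWithExtension(folder, ".tmp")
            Console.WriteLine(f)
        Next
    End Sub
End Module
```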
Handling EnumerateDirectories Exceptions
If you call Directory.EnumerateDirectories with the SearchOption.AllDirectories argument then EnumerateDirectories will perform a directory traversal for you, which includes sub-folders. However, if you start at a high-level directory like the root then you are likely to encounter a folder that will cause an access exception. You won't know this until you touch the directory data though. For example, if you write:
    Dim d = Directory.EnumerateDirectories("C:\", "*.*", SearchOption.AllDirectories)
Then the code will compile and run. If you use the results as the source for a For Each loop such as the following:

    ' Get all directories
    For Each d In Directory.EnumerateDirectories("C:\", "*.*", SearchOption.AllDirectories)
        Console.WriteLine(d)
    Next

Then the code may throw an UnauthorizedAccessException when it hits a folder like C:\Documents and Settings. To catch the exception wrap the For Each loop in a Try Catch block. Here is the revised code.
    Try
        For Each d In Directory.EnumerateDirectories("C:\", "*.*", SearchOption.AllDirectories)
            Console.WriteLine(d)
        Next
    Catch ex As Exception
        Console.WriteLine("Handle exception here")
    End Try
You can use a broader exception class like Exception to catch all possible exceptions or multiple catch blocks if you want to handle individual kinds of exceptions differently.
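The multiple-catch variant might look like the following sketch. The TryWalk helper and the messages it returns are hypothetical, and the set of exceptions handled is an illustrative choice, not an exhaustive list. The more specific DirectoryNotFoundException has to come before IOException because it derives from it.

```vbnet
Imports System
Imports System.IO

Module CatchBlocksSketch
    ' Hypothetical helper: walks the tree and reports how the walk ended.
    Function TryWalk(root As String) As String
        Try
            For Each d As String In Directory.EnumerateDirectories(root, "*", SearchOption.AllDirectories)
                Console.WriteLine(d)
            Next
            Return "completed"
        Catch ex As UnauthorizedAccessException
            ' A folder refused access; the enumeration cannot be resumed.
            Return "access denied: " & ex.Message
        Catch ex As DirectoryNotFoundException
            ' The starting path (or part of it) does not exist.
            Return "not found: " & ex.Message
        Catch ex As IOException
            ' Any other I/O failure.
            Return "I/O error: " & ex.Message
        End Try
    End Function

    Sub Main()
        Console.WriteLine(TryWalk("C:\"))
    End Sub
End Module
```

Whichever handler runs, the loop has already ended, so the folders that would have followed the failing one are never seen.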
A potential challenge here is that if you actually want to display or capture all folders even if access is denied then you need a slightly different approach.
Searching All Folders Without Failing on Exceptions
Suppose you want to traverse all folders. Suppose further that if you encounter an exception you want to skip that folder and continue. Until there is an option like SearchOption.ContinueOnErrors, or an attachable event that performs a semantically similar operation, there is no opportunity to continue on an exception when it only takes one method call to traverse sub-folders. For this reason you need to split the call up.
To effectively skip folders and continue you can use a Stack object. First obtain the top-level folders and push them on the stack. Next, pop each folder off the stack and try to query its sub-folders, repeating the pushing and popping until all folders and sub-folders have been traversed. Because this approach splits the traversal into an outer loop that manages the stack, you can handle an exception and continue processing the items in the stack. Listing 1 provides a solution that will also record unauthorized folders (for example, for your file system management tool) and continue processing other folders.
    Imports System
    Imports System.Collections.Generic
    Imports System.Diagnostics
    Imports System.IO
    Imports System.Linq

    Module Module1
        Sub Main()
            Dim results As List(Of String) = New List(Of String)
            Dim start As String = "c:\"
            results.Add(start)
            Dim stack As Stack(Of String) = New Stack(Of String)
            Do
                Try
                    Debug.WriteLine(start)
                    Dim dirs = From d In Directory.EnumerateDirectories(start, "*.*", SearchOption.TopDirectoryOnly) Select d
                    ' Multi-line Lambda - don't really need this
                    Array.ForEach(dirs.ToArray(), Sub(d)
                                                      stack.Push(d)
                                                  End Sub)
                Catch ex As UnauthorizedAccessException
                    ' Access denied: report it and move on to the next folder
                    Console.WriteLine(ex.Message)
                End Try
                ' Stop once no folders are left; popping an empty stack would throw
                If stack.Count = 0 Then Exit Do
                start = stack.Pop()
                results.Add(start)
            Loop
            For Each d In results
                Console.WriteLine(d)
            Next
            Console.ReadLine()
        End Sub
    End Module
Listing 1: Using a Stack to track top-level folders and then process sub-folders one at a time in the event of an exception.
The List(Of String) will contain the found directories. The variable start will start the search at the root of the C: drive. The stack is used to store the directories that need to be searched. The Do loop starts the process and the Try begins the exception handling block.
The statement beginning with Dim dirs begins searching the sub-folders in the folder assigned to start. SearchOption.TopDirectoryOnly will return only the folders directly inside start. (You could skip the LINQ query if you didn't want to perform any additional filtering, but the code does demonstrate how to incorporate LINQ if you need to.) The Array.ForEach call pushes all of the child folders on the stack using multi-line Lambda Sub syntax. Next, the stack is popped, the popped folder is stored in the results list and becomes the next folder to search, and the process repeats until the stack is empty. If an exception is thrown, the message is written to the console, the next folder is popped and stored, and processing continues.
Because of the way this solution is written, and because it starts at the root, this will be a long-running process. However, since the solution is broken up into multiple stages, you have plenty of opportunities to interact with the results along the way, and you can even split the solution into a background process, which is probably what Windows Explorer does.
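One way to interact with the results along the way is to hand back each folder as soon as it is reached instead of collecting everything in a list first. The following is a sketch only, assuming a compiler that supports Iterator functions and Yield (Visual Basic 11 or later). The WalkFolders name is hypothetical, and this is a variation on the stack approach, not the code from Listing 1.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq

Module LazyWalkSketch
    ' Hypothetical helper: yields each folder as soon as it is reached and
    ' skips the children of any folder that denies access.
    Iterator Function WalkFolders(root As String) As IEnumerable(Of String)
        Dim pending As New Stack(Of String)
        pending.Push(root)
        While pending.Count > 0
            Dim current As String = pending.Pop()
            Yield current

            Dim children As String() = New String() {}
            Try
                ' ToArray forces the enumeration inside the Try block so that
                ' an access error is caught here and not in the caller's loop.
                children = Directory.EnumerateDirectories(current).ToArray()
            Catch ex As UnauthorizedAccessException
                ' Access denied: the folder was reported, its children are skipped.
            End Try

            For Each child As String In children
                pending.Push(child)
            Next
        End While
    End Function

    Sub Main()
        ' The caller can stop early; only the folders actually requested are visited.
        For Each folder As String In WalkFolders(Path.GetTempPath()).Take(10)
            Console.WriteLine(folder)
        Next
    End Sub
End Module
```

Because the caller pulls one folder at a time, the same function could be consumed from a background task that reports folders to the user interface as they arrive.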
EnumerateFiles and EnumerateDirectories make it really easy to get file and directory names and information. The most common complaint on the web is that these methods stop processing on an exception, and the usual request online is for the EnumerateXxx methods to be modified to incorporate continue-on-error behavior. Until then, by splitting the call up by directories you can manage exceptions in an outer loop.