Building the Right Environment to Support AI, Machine Learning and Deep Learning
The inimitable Robbie Powell read my earlier article on keyboard trapping and wanted to use the code. Mr. Powell was tasked with trapping specific key combinations for a testing application. The basic idea is that testees should not be distracted during a test. By eliminating the ability to open an application other than the test application, their attention was more ably focused. Considering the nature and importance of the candidate's future endeavors, a bit of tunnel vision during testing was warranted. Unfortunately, the code from the November article does not port directly from VB6 to VB.NET. Mr. Powell did a superlative job porting the code but something still didn't work quite right. Together we figured out the differences, which are provided here.
Permit me to rehash some of the material in the November article for those who did not have an opportunity to read that article. If you have read the November article and just need to fill in the blanks, I encourage you to skip ahead to the Implementing the Keyboard Delegate section and The Complete Code Listing section. For a complete presentation, continue.
Writing API Declarations
For VB.NET developers, we can use the old-style Declare syntax to import DLL library methods. One also has the ability to use new .NET attributes for declaring API methods, specifically the DllImportAttribute. The Declare keyword is shorthand notation that causes the compiler to add and use the DllImportAttribute. Simply keep in mind that you will need to use the DllImportAttribute if you are programming in some other .NET language besides VB.NET. For our purposes, we will use the convenience notation.
To trap and examine keys before other applications get them, we need to hook the keyboard (ultimately release the hook), call the old keyboard handler, and interpret key combinations. To accomplish this feat, we need to import the SetWindowsHookEx, UnhookWindowsHookEx, CallNextHookEx, and GetAsyncState. Perhaps you will understand the rationale a bit better with some background information. So, before we look at the syntactic mechanics of a declaration statement, let's take a quick historical journey.
Understanding Low-Level Hooks
It seems like just a few brief years ago that you couldn't write anything interesting without writing interrupt handlers. In very low memory, Basic Input and Output (BIOS) code is loaded. This code provides the basic capabilities that your PC needs. (Assuming you are using a DOS-based PC. I imagine a MAC has something analogous to the BIOS for PCs...) These basic services are called interrupt handlers, and they are referred to by number. For example, interrupt 5 is the print screen interrupt. Interrupt 0x10 (hexadecimal) provides direct video input and output, interrupt 0x19 will reboot your computer, and interrupts 0x9 and 0x16 manage keyboard input. Pretty powerful stuff, these interrupt handlers.
Just a few years ago, one would have to write a custom interrupt handler and redirect the BIOS code to the new handler to replace the basic services. For example, prior to Windows, if code attempted to read from the A: drive and no diskette were in the drive, an application would hang. However, if the code provided an interrupt 0x24 handler, the error could be caught and new behavior provided. Supplanting basic BIOS behavior with new behavior is exactly what popup programs and TSR (Terminate and Stay Resident) utilities did all the time. Problematically, working at this level is an all or nothing proposition. Make a mistake and the whole PC crashed. These very low-level capabilities can still be accessed—for example, write asm int 3 end in Delphi and the debugger will stop because interrupt 3 is a low-level breakpoint. However, because replacing basic system services can result in unreliable PC behavior, operating system engineers were motivated to shield programmers from these mistakes.
To aid in productivity, we work at a higher level of abstraction. Instead of writing an interrupt handler for interrupt 0x9 and 0x16 to handle keyboard input directly, we simply write an event handler for the KeyDown (or some related) event handler. However, you can still interact with the operating system at a much lower level of abstraction than the VB KeyDown event. Simply keep in mind that the lower you go, the more responsibility you have. Back to the present.
To trap keystrokes before other applications get them, we have to interact with the operating system somewhere between the BIOS' interrupt handler and the high-level KeyDown event. To trap all keys, we are closer to the BIOS, perhaps, than the KeyDown event. Consequently, care must be exercised.
Declaring API Methods
The convenience syntax for declaring an API method is very similar to the notation used in VB6. We need to use the Declare keyword, match the signature of the API method, indicate the library that contains the API method, and optionally, indicate the visibility. For example, to import the SetWindowsHookEx API method, we might write:
Public Declare Function SetWindowsHookEx Lib "user32" _ Alias "SetWindowsHookExA" (ByVal idHook As Integer, _ ByVal lpfn As KeyboardHookDelegate, ByVal hmod As Integer, _ ByVal dwThreadId As Integer) As Integer
Here is the breakdown of the declaration statement:
- Public—Defines the visibility as Public. (Any code can call this method.)
- Declare—The keyword that indicates that we are implicitly importing a library method
- Function—The library method returns a value
- SetWindowsHookEx—The name we'll use in our code
- Lib "user32"—Specifies the library that contains the method. (You can find the physical API DLL by searching for user32.dll on your PC.)
- Alias "SetWindowsHookExA"—Indicates the real name of the method in the DLL
The rest of the declaration defines the signature of the DLL method. If you look closely at the declaration, you will notice something suspicious—KeyboardHookDelegate. Delegates didn't exist prior to .NET, yet the declaration clearly uses something call KeyboardHookDelegate.
The API method does not use a delegate. The API method actually defines the lpfn argument as a 32-bit integer. The CLR does an excellent job matching the needs of the API—a pointer to a function—with an analogous .NET entity a delegate. Delegates are classes that contain function pointers; however, a delegate is a class that is much more than just the address of a function. A function pointer can be represented as a 32-bit integer, so it is clear that some fudging is done for us to permit a delegate to be passed where only an integer is needed. The net benefit is that we can use more convenient .NET types where previously less convenient raw data types would have been used. Additional declarations are shown in The Complete Code Listing.
Implementing the Keyboard Delegate
To hook the keyboard, we are inserting our method into the address space for the existing low-level handler. This is what we did with interrupt handlers, and we still perform the same basic operation at a moderately higher level of abstraction. As is true with interrupt handlers, we need to hang onto the old handler, and ensure we call it. If we don't call the old handler, we prevent someone else's code from running. This would be rude unless our intention is to prevent someone else's keyboard code from running.
The delegate signature has to play by the same rules as a plain vanilla function pointer. The delegate signature must match an expected signature. Delegates will be invoked with the anticipation and necessity of receiving specific arguments and a return value if one is expected. In our example, the operating system will be calling with two integers and a structure that contains key state information. The caller will be expecting a return value, too. We can name the delegate anything, but as mentioned, the signature must match. The signature of our callback method is defined next.
Public Delegate Function KeyboardHookDelegate( _ ByVal Code As Integer, _ ByVal wParam As Integer, ByRef lParam As KBDLLHOOKSTRUCT) _ As Integer
Decomposed into chunks, we have:
- Public—The Delegate type is public
- Delegate—Defines this method signature as a subclass of the System.Delegate type
- Function—Indicates that the caller will expect a return value
- KeyboardHookDelegate—Is the name of the delegate
- Code—Is the name of the first argument, an Integer, that is passed by value
- wParam—Is a by-value Integer that we don't need in the example but is commonly found in message methods
- lParam—Very important to keyboard hooking; we need a pointer to the keyboard state information. This structure will tell us everything we need to know about the keys being pressed, released, and held. It is important to define this argument ByRef.
- As Integer—Indicates that the caller will be expecting an Integer.
We will actually need a method that very closely matches the signature of the delegate. The only point at which we can deviate is the name of the actual arguments. The callback method can use different names for the arguments, but the order and type of the arguments and the method type—function or subroutine—must match exactly.
Hooking the Keyboard
To hook the keyboard, we need to call the SetWindowsHookEx method. We will need a constant indicating what we want to hook, the idHook argument. We need a method that can be called back, the lpfn argument. A handle of the application doing the hooking, which is our application and the hmod argument, and the thread of the process we want to hook.
When hooking the keyboard in .NET, this part of the revision—from VB6–7 to VB.NET—is the most problematic. To facilitate, I have taken an important excerpt from the complete listing, listing 2. That excerpt is provided in listing 1.
Listing 1: Critical revisions to hooking the keyboard in .NET.
<MarshalAs(UnmanagedType.FunctionPtr)> _ Private callback As KeyboardHookDelegate Public Sub HookKeyboard() callback = New KeyboardHookDelegate(AddressOf KeyboardCallback) KeyboardHandle = SetWindowsHookEx( _ WH_KEYBOARD_LL, callback, _ Marshal.GetHINSTANCE( _ [Assembly].GetExecutingAssembly.GetModules()(0)).ToInt32, 0) Call CheckHooked() End Sub
Delegates are managed objects in .NET. This means that they are garbage collected. A problem occurs when we pass a delegate to the unmanaged code of the user32.dll API. Apparently, the garbage collector doesn't know that the delegate object is in use and after a short interval—roughly 47 seconds in experiments—the delegate is garbage collected. Consequently, when the API method attempts to call the method represented by the delegate back, a null reference exception occurs. To prevent the delegate from getting GC'd, we need to tag a delegate variable with the System.Runtime.InteropServices.MarshalAsAttribute, passing the enumerated value UnmanagedType.FunctionPtr. This tags the delegate argument, preventing it from being GC'd in an untimely fashion.
The first argument to SetWindowsHookEx is WH_KEYBOARD_LL. The second argument is the tagged delegate that contains the address of our local callback method. The third argument is the handle (hWnd) of the application doing the hooking, and passing 0 for the thread id means that we want to hook the keyboard for all threads.
For all of our efforts, if we forget the MarshalAsAttribute, the code fails miserably. You can read more about COMInterop in my new book Visual Basic .NET Power Coding from Addison-Wesley, available July, 2003.
Trapping Key Combinations
Determining if specific key combinations are being pressed requires some tricky gyrations. (Keep in mind that we are working at a pretty low level here.) This code remains pretty much unchanged from the November article. The basic idea is to read the current key press in the KBDLLHOOKSTRUCT.vkCode. If you need to look for specific multi-key combinations, you may need to call GetAsyncKeyState to determine whether additional keys are being held. For example, we call GetAsynckeyState(VK_CONTROL) in listing 2 to see whether the Ctrl key is being held down.
Unhooking the Keyboard
The return value from SetWindowsHookEx is stored. This is the address of the hook we replaced. We don't discard this value because if we want to let some key combinations slip past our hook, we need to use the return value of SetWindwosHookEx to call the old hook. We also use this value to unhook the keyboard, returning the old hook state, when we are finished holding onto the keyboard handler. Call UnhookWindowsHookEx passing the return value from SetWindowsHookEx to restore the original keyboard hook.
The Complete Code Listing
Listing 2 presents the complete revised listing for VB.NET. Most of this code is more of the same kinds of code that we have discussed already, including some additional methods, declare statements, the KDDLLHOOKSTRUCT, and some useful constants. You can copy and paste the code in listing 2 directly into a module to experiment with it. Call HookKeyboard to begin intercepting the three defined key combinations and UnhookKeyboard to restore the old keyboard state.
Listing 2: The complete revised listing for implementing low-level keyboard hooks.
Imports System.Runtime.InteropServices Imports System.Reflection Imports System.Drawing Imports System.Threading Module Keyboard Public Declare Function UnhookWindowsHookEx Lib "user32" _ (ByVal hHook As Integer) As Integer Public Declare Function SetWindowsHookEx Lib "user32" _ Alias "SetWindowsHookExA" (ByVal idHook As Integer, _ ByVal lpfn As KeyboardHookDelegate, ByVal hmod As Integer, _ ByVal dwThreadId As Integer) As Integer Private Declare Function GetAsyncKeyState Lib "user32" _ (ByVal vKey As Integer) As Integer Private Declare Function CallNextHookEx Lib "user32" _ (ByVal hHook As Integer, _ ByVal nCode As Integer, _ ByVal wParam As Integer, _ ByVal lParam As KBDLLHOOKSTRUCT) As Integer Public Structure KBDLLHOOKSTRUCT Public vkCode As Integer Public scanCode As Integer Public flags As Integer Public time As Integer Public dwExtraInfo As Integer End Structure ' Low-Level Keyboard Constants Private Const HC_ACTION As Integer = 0 Private Const LLKHF_EXTENDED As Integer = &H1 Private Const LLKHF_INJECTED As Integer = &H10 Private Const LLKHF_ALTDOWN As Integer = &H20 Private Const LLKHF_UP As Integer = &H80 ' Virtual Keys Public Const VK_TAB = &H9 Public Const VK_CONTROL = &H11 Public Const VK_ESCAPE = &H1B Public Const VK_DELETE = &H2E Private Const WH_KEYBOARD_LL As Integer = 13& Public KeyboardHandle As Integer ' Implement this function to block as many ' key combinations as you'd like Public Function IsHooked( _ ByRef Hookstruct As KBDLLHOOKSTRUCT) As Boolean Debug.WriteLine("Hookstruct.vkCode: " & Hookstruct.vkCode) Debug.WriteLine(Hookstruct.vkCode = VK_ESCAPE) Debug.WriteLine(Hookstruct.vkCode = VK_TAB) If (Hookstruct.vkCode = VK_ESCAPE) And _ CBool(GetAsyncKeyState(VK_CONTROL) _ And &H8000) Then Call HookedState("Ctrl + Esc blocked") Return True End If If (Hookstruct.vkCode = VK_TAB) And _ CBool(Hookstruct.flags And _ LLKHF_ALTDOWN) Then Call HookedState("Alt + Tab blockd") Return True End If If (Hookstruct.vkCode = VK_ESCAPE) And _ CBool(Hookstruct.flags And _ LLKHF_ALTDOWN) Then Call HookedState("Alt + Escape blocked") Return True End If Return False End Function Private Sub HookedState(ByVal Text As String) Debug.WriteLine(Text) End Sub Public Function KeyboardCallback(ByVal Code As Integer, _ ByVal wParam As Integer, _ ByRef lParam As KBDLLHOOKSTRUCT) As Integer If (Code = HC_ACTION) Then Debug.WriteLine("Calling IsHooked") If (IsHooked(lParam)) Then Return 1 End If End If Return CallNextHookEx(KeyboardHandle, _ Code, wParam, lParam) End Function Public Delegate Function KeyboardHookDelegate( _ ByVal Code As Integer, _ ByVal wParam As Integer, ByRef lParam As KBDLLHOOKSTRUCT) _ As Integer <MarshalAs(UnmanagedType.FunctionPtr)> _ Private callback As KeyboardHookDelegate Public Sub HookKeyboard() callback = New KeyboardHookDelegate(AddressOf KeyboardCallback) KeyboardHandle = SetWindowsHookEx( _ WH_KEYBOARD_LL, callback, _ Marshal.GetHINSTANCE( _ [Assembly].GetExecutingAssembly.GetModules()(0)).ToInt32, 0) Call CheckHooked() End Sub Public Sub CheckHooked() If (Hooked()) Then Debug.WriteLine("Keyboard hooked") Else Debug.WriteLine("Keyboard hook failed: " & Err.LastDllError) End If End Sub Private Function Hooked() Hooked = KeyboardHandle <> 0 End Function Public Sub UnhookKeyboard() If (Hooked()) Then Call UnhookWindowsHookEx(KeyboardHandle) End If End Sub End Module
Be aware that mistakes may completely lock up your keyboard and you may need to reboot. To prevent this kind of problem, I use the ThreadPool and a separate thread to release the keyboard after 10 or 15 seconds. This strategy has been invaluable while developing low-level code. You can learn more about multithreading here in past and future articles or by picking up a copy of my book, Visual Basic .NET Unleashed, from Sams.
Run the sample code and you will see that the Windows API is alive and well in .NET. Thankfully, you will need to have very special needs indeed to resort to calling into the Windows API. This is a far cry from VB6, where almost anything useful required interaction with the Windows API.
Disclaimer: The VS IDE hooks the keyboard. You may need to run the sample code outside of the IDE for the keyboard hook API call to succeed.
One of the most important differences between VB6 and VB.NET is the notion of managed code. Code in VB.NET is managed. This means objects can be moved around in memory and garbage collected. Old Windows API methods do not represent managed code. As a result, you may get some quirky behavior when interacting between .NET and the Windows API. If you plan on writing a lot of code that interoperates with the Windows API or COM, I encourage you to pick up a good book on COM Interop and a good advanced book such as my Visual Basic .NET Power Coding from Addison-Wesley that explores these intricate nooks and crannies for you.
About the Author
Paul Kimmel is a freelance writer for Developer.com and CodeGuru.com. Look for his recent book, Visual Basic .NET Power Coding, from Addison-Wesley on Amazon.com. Paul Kimmel is available to help design and build your .NET solutions and can be contacted at firstname.lastname@example.org.
# # #