Most of this section on Character Codes deals with software encoding. However, input must pass from the human and thru the hardware before it ever gets to the software. When a physical key is pressed, the keyboard passes a number called a scan code to the software, which then does the character coding. The scan code is completely different from the character code. See also Shortcuts.

Here are the scan codes for the main part of a US English keyboard.

Illustration showing the scan codes for a US English keyboard

In Windows, once a scan code is acquired, a keyboard layout DLL then maps the scan code to a VK (Virtual Key). Each VK has a symbolic name (VK_x) and a byte value (between 0x00 and 0xFF) as defined in winuser.h, which holds definitions for the USER subsystem of the Windows operating system. VK_0 thru VK_9 and VK_A thru VK_Z have the corresponding ASCII values (0x30-0x39 and 0x41-0x5A). Note that the letter keys use the values of the upper case characters. Calling the GetKeyboardState API fills a 256-byte array with the state of every VK.
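The relationship between keys, VK values, and the state array can be sketched in Python. This is only an illustration under stated assumptions: it does not call the Windows API, the state array is simulated by hand, and the helper names (vk_for_key, is_down, is_toggled) are hypothetical. The modifier values and the meaning of the high-order and low-order bits follow the documented behavior of GetKeyboardState.

```python
# Virtual-key values for a few keys, as listed in winuser.h.
VK_SHIFT = 0x10
VK_CONTROL = 0x11
VK_MENU = 0x12      # the ALT key
VK_CAPITAL = 0x14   # the CAPS LOCK key


def vk_for_key(ch: str) -> int:
    """Return the VK value for a digit or letter key (hypothetical helper).

    Digit and letter VKs equal the ASCII value of the digit or of the
    upper case letter, so 'a' and 'A' both map to 0x41.
    """
    upper = ch.upper()
    if len(upper) == 1 and ("0" <= upper <= "9" or "A" <= upper <= "Z"):
        return ord(upper)
    raise ValueError(f"no simple ASCII-valued VK for {ch!r}")


def is_down(state: bytes, vk: int) -> bool:
    """High-order bit set means the key is currently pressed."""
    return bool(state[vk] & 0x80)


def is_toggled(state: bytes, vk: int) -> bool:
    """Low-order bit set means a toggle key (e.g. CAPS LOCK) is on."""
    return bool(state[vk] & 0x01)


# Simulated 256-byte snapshot: SHIFT and the A key held, CAPS LOCK on.
# On Windows the real array would be filled by GetKeyboardState.
state = bytearray(256)
state[VK_SHIFT] = 0x80
state[vk_for_key("a")] = 0x80
state[VK_CAPITAL] = 0x01

print(hex(vk_for_key("a")), is_down(state, VK_SHIFT), is_toggled(state, VK_CAPITAL))
```

Keeping one byte per VK is what lets an application ask about any key, not just the one that triggered the current event.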

Here are the VKs for the main part of a US English keyboard.

Illustration mapping the virtual keys for a US English keyboard

In addition to the VK array, applications also need to know the Shift State. The Shift State is basically the state of the modifier keys at the moment the current key arrives, which requires knowledge of the key events that came before it. It is usually concerned with combinations of SHIFT, CTRL, ALT, ALTGR (the right-hand ALT), CAPS LOCK, etc., either held simultaneously or pressed in sequence.

Most keyboards have 47-48 Physical Keys for characters. By combining keys with different Shift States, more characters are possible. E.g., if you have a US English International keyboard:

Pressing            Returns
a                   a
SHIFT+a             A
ALTGR+a             á
ALTGR+SHIFT+a       Á
` then a            à
` then SHIFT+a      À
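The table above can be modeled as a lookup keyed on the physical key plus the Shift State, with the grave accent handled as a dead key that modifies the next character. This is a minimal sketch, not how Windows implements layouts: LAYOUT, DEAD_KEYS, and type_keys are hypothetical names, only the rows from the table are modeled, and Unicode NFC composition stands in for the layout's own dead key table.

```python
import unicodedata

# (key, shift state) -> character, covering only the rows in the table.
LAYOUT = {
    ("a", frozenset()): "a",
    ("a", frozenset({"SHIFT"})): "A",
    ("a", frozenset({"ALTGR"})): "\u00e1",           # á
    ("a", frozenset({"ALTGR", "SHIFT"})): "\u00c1",  # Á
}

# Dead keys produce nothing by themselves; they wait for the next key.
DEAD_KEYS = {"`": "\u0300"}  # grave key -> combining grave accent


def type_keys(presses):
    """Turn a sequence of (key, modifiers) presses into text.

    Simplification: a pending dead key is combined with the next
    character through NFC normalization. Keys outside LAYOUT raise
    KeyError.
    """
    out = []
    pending = None
    for key, mods in presses:
        mods = frozenset(mods)
        if pending is None and not mods and key in DEAD_KEYS:
            pending = DEAD_KEYS[key]
            continue
        ch = LAYOUT[(key, mods)]
        if pending is not None:
            ch = unicodedata.normalize("NFC", ch + pending)
            pending = None
        out.append(ch)
    return "".join(out)


print(type_keys([("a", {"ALTGR", "SHIFT"})]))
print(type_keys([("`", set()), ("a", {"SHIFT"})]))
```

The dead key case shows why the Shift State can depend on a sequence of presses and not only on what is held down at one instant.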

Note that for some languages, esp. CJK languages, even 48 Physical Keys and 8 Shift States do not produce enough combinations to cover all the characters in the language.

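The next step is where it really gets messy. The OS gets info about the keys before applications get it, and then passes it on to applications as window messages. These include WM_KEYDOWN (a key was pressed) and WM_CHAR (a character was produced).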

Applications can also mimic some of this early key processing with functions like these: keybd_event, MapVirtualKey, MapVirtualKeyEx, OemKeyScan, SendInput, ToAscii, ToAsciiEx, ToUnicode, ToUnicodeEx, VkKeyScan, and VkKeyScanEx.

After all this, applications may finally get code points for the Character Codes, whether ASCII, ANSI, Unicode, etc., which can be translated into glyphs or characters, like letters, numbers, and punctuation.
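As a small illustration of that last step, the same character can be looked at as a code point and as bytes in several encodings. This sketch assumes that "ANSI" means Windows-1252 (the usual ANSI code page for US English systems), and encode_or_none is a hypothetical helper.

```python
def encode_or_none(text: str, encoding: str):
    """Return the encoded bytes, or None if the encoding lacks the character."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


ch = "\u00e1"  # á, as produced by ALTGR+a on a US International layout

print(hex(ord(ch)))                  # the Unicode code point
print(encode_or_none(ch, "ascii"))   # ASCII has no á
print(encode_or_none(ch, "cp1252"))  # assumed "ANSI" code page
print(encode_or_none(ch, "utf-8"))   # Unicode, UTF-8 encoding form
```

The code point stays the same in every case; only the byte representation changes with the encoding.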



GeorgeHernandez.com. Some rights reserved.