IAccessible2 Implementation Guide
document maintainer: Brian Cragun, IBM
contents last modified: 2010-06-01
canonical URI for this document: http://a11y.org/ia2-implementation
[R2] Proposed updates 7/12/2011. I have added some proposed updates and clarifications to the document. For accessibility purposes, they are indicated by the markup tags [R2]changes[/R2]. Comments may be sent to cragun at us dot ibm dot com.
The goal of this document is to provide a clear set of guidelines for software developers who want to add advanced accessibility features to their applications by using the IAccessible2 set of interfaces. It focuses on supporting access to two common application constructs: rich text editable areas and tables with editable content. By following these guidelines, developers ensure their applications are accessible to screen reader users.
Before continuing, the reader should be familiar with Microsoft® Active Accessibility® (MSAA). For an explanation of important concepts in MSAA that are pertinent to this document such as objects, focus, events, and ancestry, please refer to the MSAA Specification Documentation. The guidelines described herein are predicated on the assumption that developers have correctly followed MSAA implementation.
For additional information, the Reference section at the bottom of this document contains links to in-depth information on creating an MSAA server and the IAccessible and IAccessible2 interfaces.
Furthermore, an understanding of COM clients, servers, objects, and interfaces is essential for understanding the following guidelines.
Note: IAccessible2 is both a specific interface, and a set of interfaces. Hereafter in this document, unless explicitly specified, when the term IAccessible2 is used, it is meant to indicate the specific IAccessible2 interface rather than the set of interfaces.
All objects must, at a minimum, provide the IAccessible, IAccessible2, and IServiceProvider interfaces. The following subsections describe the guidelines for how these interfaces are to be provided.
Providing the IAccessible interface
The IAccessible interface is the entry point for all other interfaces described in this document. When a screen reader receives an event, it requests the IAccessible interface for the object associated with that event. Active Accessibility servers provide objects by responding to the WM_GETOBJECT message. Objects returned by WM_GETOBJECT must support the IAccessible interface. In addition, these objects must also support the IServiceProvider interface, for reasons which will become clear in the next section.
QueryInterface and QueryService
Once it obtains the IAccessible interface, the screen reader, as an assistive technology (AT) client, uses the QueryInterface and IServiceProvider::QueryService methods to retrieve other interfaces.
The difference between these two methods is in the underlying object supporting the requested interfaces.
Any interface returned by a call to QueryInterface must be implemented by the object on which the call was made. Therefore, if object O implements interfaces A and B, when calling either A.QueryInterface(B), or B.QueryInterface(A), the underlying object O is the same. This concept is called symmetry. This allows the client to always have access to interface B from interface A, and vice versa.
In contrast, an interface returned by QueryService may be implemented on a different object than the one on which the call was made. So, if object O supports interface A, and another object P supports interface B, in a call to A.QueryService(B), object O may return interface B as implemented by object P. In other words, the symmetry required by QueryInterface is not required by QueryService. This allows application developers to implement interfaces on different objects.
As described earlier, when a screen reader receives an event, it retrieves the IAccessible interface. At this point the screen reader may call QueryService for two possible interfaces: either IAccessible2 or IAccessibleApplication.
The AT uses QueryService to retrieve the IAccessible2 interface from IAccessible because the system, in oleacc.dll, wraps all IAccessible interfaces returned by MSAA servers in a proxy object, which is then given to the client. The purpose of the proxy wrapper is to provide support for dynamic attribution. Dynamic attribution allows the system to provide base level MSAA support by gathering MSAA information from the underlying window and returning it via MSAA properties. An example of this might be to take the window text of a button and return it in IAccessible::accName. This process provides a minimum of accessibility for applications where the app developer has not specifically implemented MSAA, but has used common windows controls.
The problem with the proxy wrapper is that when the client calls QueryInterface, it is actually calling QueryInterface on the proxy object, not the underlying object provided by the server. And as expected, calling QueryInterface on the proxy object for the IAccessible2 interface will fail, even when the wrapped object supports it. Enter the IServiceProvider interface as the solution. IServiceProvider::QueryService allows the server application to return a new, unwrapped, object to the client. The client can then call QueryInterface on the new object provided directly by the server.
The reason the AT calls QueryService for the IAccessibleApplication interface is to make life easier for server developers. It seems unnecessary for server developers to have to implement the IAccessibleApplication interface on every object they provide. Since the application information is apt to be the same from object to object, it makes sense that servers may want to return one static object with the application information. Using QueryService makes this possible.
These are the only two interfaces the AT requests using QueryService. Because the AT uses QueryService to obtain these interfaces, the IServiceProvider interface must be implemented on the object providing the IAccessible interface. And except for the two mentioned above, and like all the interfaces described hereafter, the IServiceProvider interface will be retrieved using QueryInterface. Therefore, after obtaining the IAccessible2 interface using QueryService, the AT uses QueryInterface to retrieve all others in the set of IAccessible2 interfaces.
Note: Some screen readers pass IID_IAccessible to QueryService as the service ID parameter.
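The retrieval sequence described above can be sketched with a small language-neutral model. This is not the COM API: the class names `ServerObject` and `OleaccProxy`, the string interface names, and the assumption that the proxy simply forwards QueryService to the wrapped object are all illustrative simplifications.

```python
class ServerObject:
    """Hypothetical server-side accessible object implementing several interfaces."""

    INTERFACES = {"IAccessible", "IAccessible2", "IServiceProvider", "IAccessibleText"}

    def query_interface(self, iid):
        # QueryInterface symmetry: every interface is implemented by this same object.
        return self if iid in self.INTERFACES else None

    def query_service(self, service_id, iid):
        # QueryService may return any object; here it returns the unwrapped server object.
        return self.query_interface(iid)


class OleaccProxy:
    """Stand-in for the system wrapper, which knows nothing about IAccessible2."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def query_interface(self, iid):
        return self if iid in {"IAccessible", "IServiceProvider"} else None

    def query_service(self, service_id, iid):
        # Assumed behavior: the request is forwarded to the server's object.
        return self._wrapped.query_service(service_id, iid)


def get_ia2(acc):
    """Client side: QueryService for IAccessible2, then QueryInterface for the rest."""
    provider = acc.query_interface("IServiceProvider")
    if provider is None:
        return None
    return provider.query_service("IAccessible", "IAccessible2")
```

In this model, calling `query_interface("IAccessible2")` on the proxy fails, while `get_ia2` returns the server's own object, on which every further interface can be reached with `query_interface`.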
All objects must support the IAccessible2::uniqueID property. The unique ID for an object may not be zero (0). Otherwise, the specific or relative value of the unique ID is unimportant to the client, but IDs must be unique; that is, no two objects in a text area or document may have the same ID.
The IAccessible2 documentation describes the unique ID as “…an identifier for this object, is unique within the current window, and remains the same for the lifetime of the accessible object.” Consult the IAccessible2::uniqueID documentation for further details.
Unique IDs are not required to remain constant across application sessions.
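One simple way to meet these requirements is to assign each object a counter value when it is created. The following is a minimal sketch under that assumption; the class name `AccessibleObject` and the module-level counter are hypothetical.

```python
import itertools

# Start at 1 so that the value 0 is never issued.
_next_unique_id = itertools.count(1)


class AccessibleObject:
    """Hypothetical accessible object that fixes its unique ID at creation."""

    def __init__(self):
        self._unique_id = next(_next_unique_id)

    @property
    def unique_id(self):
        # The same value is returned for the whole lifetime of the object.
        return self._unique_id
```

Because the counter restarts with each process, the IDs differ between application sessions, which the guideline permits.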
Unhandled exceptions of any kind are not allowed for any method on any interface referred to in this document. This extends to IAccessible and any of the set of IAccessible2 interfaces. Developers may use exceptions internally in their code, but all methods in all interfaces must return to the client as expected. Unhandled exceptions disrupt the client’s code flow, and will likely cause unexpected results. When unhandled exceptions occur, the problems caused are notoriously difficult to track down.
All interface functions must return the HRESULT value of S_OK when they succeed and valid data is returned. If a function succeeds, but cannot return valid data, it may return S_FALSE or a COM error code less than (<) 0. If a function fails for any reason, it must return a COM error code less than (<) 0.
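The two rules above (no exception may escape an interface method, and the HRESULT must reflect whether valid data is returned) can be modeled as a wrapper applied to every interface method. This is a sketch only: the decorator `com_boundary`, the class `TextObject`, and the `(hresult, value)` return convention are illustrative and not part of any real API. The numeric values are the standard ones for S_OK, S_FALSE, and E_FAIL.

```python
import functools

S_OK = 0
S_FALSE = 1
E_FAIL = -2147467259  # 0x80004005 interpreted as a signed 32-bit value


def com_boundary(method):
    """Convert any exception that would escape into a failure HRESULT."""

    @functools.wraps(method)
    def wrapper(*args, **kwargs):
        try:
            return method(*args, **kwargs)
        except Exception:
            return (E_FAIL, None)

    return wrapper


class TextObject:
    def __init__(self, text, caret=None):
        self.text = text
        self.caret = caret

    @com_boundary
    def caret_offset(self):
        if self.caret is None:
            return (S_FALSE, None)  # call succeeded, but there is no valid data
        if not 0 <= self.caret <= len(self.text):
            raise ValueError("caret outside text")  # internal error, caught at the boundary
        return (S_OK, self.caret)
```

Exceptions remain available inside the implementation, but the client only ever sees an HRESULT.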
Editable documents and text fields
Anyone who uses a computer understands the concept of an editable text area. Such areas are most often used to write simple text in forms, chat messages, e-mail messages, rich text documents and so on. They have a caret which indicates the insertion point for typing. And they typically let users copy, cut, paste, and otherwise edit text. Sometimes these areas are read only. In this case, they still provide all the navigational support of an editable field.
AT expects editable areas to behave as explained in the following sections.
Identifying editable areas
The first step in making an editable area accessible is making it clear to the client that it has indeed encountered such an area. There are five criteria AT uses to determine if a text area supports the accessibility features described in this document.
- Focus: Focus must be placed on the text area. Please refer to the MSAA Specification documentation for a detailed discussion of setting focus to an object.
- Role: The role property for the focused object must be either
  - If the object has the role of
    - If the text within spans multiple lines, the object must have the state of
    - This object may also have the state of STATE_SYSTEM_READONLY if the text cannot be modified. However, it is implied that the application supports keyboard navigation as though the text were modifiable.
  - If the object has the role of
    - It must have the state of
    - It is implied that the text within may span multiple lines.
    - Objects with this role may also have the state of STATE_SYSTEM_READONLY if the text within cannot be modified. However, it is implied that the application supports keyboard navigation as though the text were modifiable.
- IAccessibleText: The object must support the IAccessibleText interface. Guidelines for supporting this important interface are described at length further in this document.
- Attributes: The IAccessible2::attributes method must return an attribute specifying which IAccessibleText model the object implements. The next section, IAccessibleText model, provides specifications for this attribute.
The IAccessible2 set of interfaces may be thought of as a set of building blocks that can be assembled in a number of ways in order to support different types of controls. It is possible to implement these interfaces in different combinations, with different sets of expectations. This is especially so when providing support for complex controls such as editable text areas. Therefore, it is important that both the client and server understand and agree on the implementation details.
For editable text areas, AT clients primarily interact with the IAccessibleText interface. The implementation details for this interface and all interfaces and properties described in the “Editable documents and text fields” section of this document are known as an IAccessibleText model.
The specific IAccessibleText model described in this document is “A1”. “A” indicates the general model, and “1” indicates the model version.
Since it may be possible to have multiple IAccessibleText models for different editable controls, the IAccessible2::attributes method must return an attribute specifying which IAccessibleText model it provides. The attribute has the following form: [R2]
Providing the text-model [/R2] attribute as shown above indicates that the implementation details for an editable text area, especially the IAccessibleText interface and its associated interfaces, follow the guidelines set forth in this document.
To screen reader users, keyboard access is essential for editing text. Keyboard access extends well beyond normal typing keys such as alpha-numeric keys, backspace, delete, etc. It includes all possible caret movement commands. This extends not just to commands which simply move the caret, but also those commands for selecting text which highlight text to be copied, cut, inserted, deleted, or replaced.
And an important point to understand is that a screen reader does not control keyboard access to an editable area; it must be provided by the underlying editable control. In other words, keyboard access to a control would ideally be the same whether or not a screen reader were running.
The following is a list of common caret movement and text selection commands. The items labeled “required” are commands without which the edit control is unusable. The items labeled “optional” are still recommended but aren’t strictly necessary. All other options are highly recommended and compose a base set of editing keystrokes which almost all keyboard users know and find useful when editing text.
- Left and right arrows: Move by character (required)
- [R2] Left and right arrows in a table: Move by character unless at the beginning or end of a cell, in which case the caret moves to the next or previous cell (required). [/R2]
- Shift + left and right arrows: Select by character (required)
- Up and down arrows: Move by line (required)
- [R2] Up and down arrows in a table: Move by line in a multiline cell or to previous or next row if at the top or bottom of a cell (required). [/R2]
- Shift + up and down arrows: Select by line (required)
- Home: Move to the beginning of the current line (required)
- Shift + home: Select to the beginning of the line (required)
- End: Move to the end of the current line (required)
- Shift + end: Select to the end of the line (required)
- Ctrl + left and right arrows: Move by word (required)
- Shift + ctrl + left and right arrows: Select by word (required)
- Ctrl + home: Move to the beginning of the document
- Shift + ctrl + home: Select to the beginning of the document
- Ctrl + end: Move to the end of the document
- Shift + ctrl + end: Select to the end of the document
- Page up and page down: Move between pages
- Ctrl + up and down arrows: Move between paragraphs [R2] (optional) [/R2]
- Alt + up and down arrows: Move between sentences (optional)
- [R2] Tab and shift + tab in a table: Move to next or prior cell (required). [/R2]
Common sense would dictate that not all these keystrokes are necessary all the time. For example, moving up and down by line doesn’t make sense for a single line edit field. However, it is something that all multiline edit areas must provide. Another example might be a multiline edit area like an e-mail message which does not support multiple pages. In that case, supporting the page up and page down keystrokes isn’t necessary. However, in cases where multiple pages do exist, these keystrokes are highly recommended. Developers should use this type of reasoning to provide the greatest possible keyboard access to their controls.
The graphic indicating the insertion point for text in an editable area is referred to as the caret. Even when an object with the role of ROLE_SYSTEM_TEXT has focus, it is the caret that is the primary point of interest.
It is important to distinguish between the object with focus and the object with the caret. For an editable control, only one object may have the focus, and that focus must remain constant regardless of the position of the caret. Additionally, the focus object must have the state of STATE_SYSTEM_FOCUSED set as long as the text area has focus. No descendant of the object with focus may have this state. The object with the caret may be the same as the object with focus. But if it is not, the object with the caret must be a descendant of the object with focus. So, the focused object is essentially the root of a document tree, and the caret object is the object within that tree which presently owns the caret.
Furthermore, the document hierarchy is not valid until the client can:
- start from the object with focus, and by recursively using the AccessibleChildren function provided by MSAA, find the object that contains the caret;
- start from the object with focus, and by recursively using the IAccessibleHypertext interface, find the object that contains the caret; and
- start from the object with the caret, and by recursively calling the IAccessible::accParent method, find the object with focus.
The ability for the client to traverse upward and downward through the document object hierarchy is critical. This will become apparent as this document continues.
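The downward and upward traversals listed above can be modeled on a plain tree. The following is a minimal sketch, not MSAA code: `Node`, `find_caret_owner`, `reaches_focus`, and `hierarchy_is_valid` are hypothetical names, the `children` list stands in for what AccessibleChildren would return, and `parent` stands in for accParent. The IAccessibleHypertext path is omitted; it would follow the same downward pattern.

```python
class Node:
    """Hypothetical accessible object with child and parent links."""

    def __init__(self, name, children=(), owns_caret=False):
        self.name = name
        self.parent = None
        self.children = list(children)
        self.owns_caret = owns_caret
        for child in self.children:
            child.parent = self


def find_caret_owner(node):
    """Search downward from the focused object for the object that owns the caret."""
    if node.owns_caret:
        return node
    for child in node.children:
        found = find_caret_owner(child)
        if found is not None:
            return found
    return None


def reaches_focus(node, focus):
    """Walk parent links upward until the focused object is found or the chain ends."""
    while node is not None:
        if node is focus:
            return True
        node = node.parent
    return False


def hierarchy_is_valid(focus):
    """Both directions must work: focus down to the caret, and caret up to the focus."""
    owner = find_caret_owner(focus)
    return owner is not None and reaches_focus(owner, focus)
```

A server could run a check of this kind in its own test suite: a child that is reachable from the focused object but reports the wrong parent makes the hierarchy invalid.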
Providing the caret
Similar to the way the server provides the object with focus by sending the EVENT_OBJECT_FOCUS event, the server provides the object that owns the caret to the AT client by sending the IA2_EVENT_TEXT_CARET_MOVED event. This event must be sent once after the EVENT_OBJECT_FOCUS event to indicate the initial position of the caret, and again every time the caret moves. The object provided with this event must be the object containing the caret.
To “contain” or “own” the caret literally means that for a particular object:
- a call to IAccessibleText::caretOffset returns the HRESULT S_OK and a valid offset value; and
- the offset provided by IAccessibleText::caretOffset is not the offset of an embed character (0xFFFC).
Note1: the process of recursing the IAccessibleText hierarchy using embed characters is explained fully in the section entitled “Embed characters”.
Note2: There is one small exception to the second condition above: the caret offset for the object containing the caret may be an embed char, if the object referred to by that embed char represents a non-text object such as a graphic.
The above conditions can only be true for one object in the document hierarchy at a time.
If an object does not own the caret, and no descendant of that object owns the caret, IAccessibleText::caretOffset must return either S_FALSE or a failure code.
If a descendant of an object owns the caret, IAccessibleText::caretOffset must return S_OK and an offset indicating an embed character. The client must be able to use the embed character in conjunction with the IAccessibleHypertext interface to downwardly traverse through the object hierarchy to find the object which owns the caret. For more detail, please see the sections entitled “Downwardly traversing through embedded objects” and “Embed characters”.
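The caret rules above can be illustrated with a small model in which each object reports either its own caret offset, the offset of the embed character leading toward the caret, or S_FALSE. This is a sketch under simplifying assumptions: `TextNode`, `hyperlink_at`, and `find_caret_owner` are hypothetical names, and the `links` list stands in for the objects that IAccessibleHypertext would return, one per embed character in order.

```python
S_OK, S_FALSE = 0, 1
EMBED = "\ufffc"


class TextNode:
    def __init__(self, text, links=(), caret=None):
        self.text = text
        self.links = list(links)  # one embedded object per embed character, in order
        self.caret = caret        # offset if this object owns the caret, else None

    def hyperlink_at(self, offset):
        # The embedded object's index equals the number of embed characters before it.
        return self.links[self.text.count(EMBED, 0, offset)]

    def caret_offset(self):
        if self.caret is not None:
            return S_OK, self.caret
        position = -1
        for child in self.links:
            position = self.text.index(EMBED, position + 1)
            hr, _ = child.caret_offset()
            if hr == S_OK:
                return S_OK, position  # a descendant owns the caret
        return S_FALSE, None


def find_caret_owner(node):
    """Follow embed characters downward until the reported offset is ordinary text."""
    hr, offset = node.caret_offset()
    if hr != S_OK:
        return None
    while offset < len(node.text) and node.text[offset] == EMBED:
        child = node.hyperlink_at(offset)
        hr, child_offset = child.caret_offset()
        if hr != S_OK:
            break  # embed character for a non-text object such as a graphic
        node, offset = child, child_offset
    return node
```

The `break` corresponds to the exception in Note2: the caret may rest on an embed character when the embedded object has no text of its own.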
Character offsets are 0 based, 0 representing the start of the text. The caret offset indicates the character directly to the right of the caret. And when two offsets indicate a range of text, the ending offset is non-inclusive.
For example, if an object contains the text “ABCD”, and the caret is positioned to the left of the letter “B”, the starting offset for the text is 0, the ending offset is 4, and the caret offset is 1.
For any character with a starting offset of N, the ending offset of that character will be N+1. So, in this example, for the character “C”, the starting offset is 2 and the ending offset is 3.
Finally, servers must never return a starting offset which is greater than (>) the ending offset.
It is critical that servers return correct offsets from all applicable functions.
As mentioned earlier in the section Keyboard navigation, the screen reader does not move the caret. Rather, it merely speaks the text at the caret location. It does this by requesting text from the object containing the caret.
Two simple methods a screen reader uses when retrieving text are IAccessibleText::nCharacters and IAccessibleText::text.
IAccessibleText::nCharacters simply returns the length of the text contained by the object. Each embed character counts as exactly one character, even if the object it represents contains text whose length is greater than one. In other words, if an object has 5 plain text characters and 2 embed characters, the total returned by nCharacters would always be 7.
IAccessibleText::text returns a segment of text, where the starting and ending offsets are specified by the caller. For example, if an object contains the text, “the rain in Spain falls mainly in the plain”; and the client specifies a starting offset of 4 and an ending offset of 8; the server would return the text “rain”.
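The offset conventions used by these two methods (0-based offsets, a non-inclusive ending offset, and one character per embed character) map directly onto string slicing. The following is a minimal model; the class `SimpleText` and its method names are hypothetical.

```python
EMBED = "\ufffc"


class SimpleText:
    """Hypothetical text object modeling nCharacters and text."""

    def __init__(self, text):
        self._text = text

    def n_characters(self):
        # Each embed character counts once, whatever the embedded object contains.
        return len(self._text)

    def text(self, start, end):
        # The starting offset may never exceed the ending offset.
        if not 0 <= start <= end <= len(self._text):
            raise ValueError("invalid offsets")
        return self._text[start:end]  # the ending offset is non-inclusive
```

For example, `SimpleText("the rain in Spain falls mainly in the plain").text(4, 8)` yields "rain", matching the example above.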
Often a screen reader requests just a portion of the text contained by an object. The screen reader may ask for a line of text, a word, or even just a single character. These bits of text are delineated by boundaries. Boundaries are how the screen reader specifies to a server which text increment it needs. Common boundary types are character, word, line, paragraph, and all.
To retrieve the text for a given boundary, the screen reader calls the function IAccessibleText::textAtOffset. This function takes an offset and a boundary type. It returns the starting and ending offsets of that boundary, in addition to the text contained therein.
Example: an object contains the following text where the lines are assumed to be wrapped.
There are 20 characters in this example, including white space characters. The soft line breaks (‘\r’) at the points where the lines wrap are not included in the character count. If the AT client calls IAccessibleText::textAtOffset, passing in an offset of 8 and a boundary type of IA2_TEXT_BOUNDARY_LINE, the server would return the starting offset of 7, the ending offset of 14, and the text “Line 2 ”.
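The line boundary lookup can be sketched as follows. The sample lines used in the usage note ("Line 1 ", "Line 2 ", "Line 3") are an assumption chosen to agree with the character counts quoted above, and `line_at_offset` is a hypothetical helper, not part of IAccessibleText.

```python
import bisect


def line_at_offset(lines, offset):
    """Return (start, end, text) for the wrapped line containing the offset.

    `lines` holds the wrapped lines in order; soft line breaks are not stored.
    """
    if not lines:
        raise ValueError("no text")
    starts = []
    total = 0
    for line in lines:
        starts.append(total)
        total += len(line)
    if not 0 <= offset <= total:
        raise ValueError("offset out of range")
    # Find the last line whose starting offset is not greater than the request.
    index = bisect.bisect_right(starts, offset) - 1
    start = starts[index]
    return start, start + len(lines[index]), lines[index]
```

With the assumed sample, `line_at_offset(["Line 1 ", "Line 2 ", "Line 3"], 8)` gives a starting offset of 7, an ending offset of 14, and the text "Line 2 ".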
Word boundaries and keyboard navigation
It is important that boundary type and keyboard navigation are in agreement about the size of text increments. There is no predetermined rule to govern exactly how commands like the next and prior word keystrokes should work. It is, however, imperative that the movement of the caret and the alignment of boundaries be the same.
This is most often evident, and easiest to demonstrate, in situations involving punctuation.
When addressing this concern, the first question is: How do the keyboard navigation keystrokes work? Does moving by word with the ctrl + left and right keystrokes move the caret to individual punctuation marks? Or do these keystrokes move the caret past the punctuation on to the next word?
Suppose an editable area contains the text
“I will not eat green eggs and ham. I will not eat them Sam I am.”
When starting at the letter “h” in “ham”, if the user presses ctrl + right arrow to move the caret to the next word, where does the caret now sit? Is it positioned on the period at the end of the first sentence? Or is it positioned at the word “I” which begins the second sentence?
If the caret is positioned at the period, and the screen reader requests the text for the word boundary at that position, the server should return “. ” because the boundary between words which end with a period is between the word and the period. Likewise, if the requested offset lies in the word “ham”, and the screen reader requests the text for the word boundary at that position, the returned text should not include the period character for the same reason.
Now take the other case where the caret moved to the word “I”. Since movement of the caret skipped past the punctuation after “ham” and directly to the next alpha-numeric word, when the screen reader requests the text for the word boundary at the original position at the “h” of “ham”, the server should return the text “ham. ”. Likewise, if the screen reader requests the text at the word boundary of the period character, the text returned by the server will be the same. Because there is no stop at the period, IAccessibleText::textAtOffset must return the text of the word boundary including the period.
Another similar case involves URLs such as “www.google.com”. If the caret stops at the first “w”, and the next-word command skips past the entire URL, when the screen reader requests text for the word boundary at any offset in the URL, the server should return the text of the entire URL. Conversely, if the next-word command moves the caret to the individual segments and the punctuation within the URL, the server should return only the word/segment or the punctuation mark respectively when the word boundary is requested by the client.
This principle extends to punctuation characters such as “-”, “_”, “/”, “@”, “%”, and many others not mentioned here.
This principle also applies to white space which may occur after the visible text of words. Notice in the example given above that spaces are included in the text returned after the visible characters in the word “ham. ”. The white space provided by the server should match the navigational boundary of the word.
Lastly, if the client requests the text within a word boundary, and the offset specified by the client indicates a space between words, the server should return the text within the entire word boundary, including the visible characters. This is yet another way of saying that the server should always return all of the text for a given boundary, no matter which offset inside the boundary is requested.
It should be clear though, that regardless of how the caret moves by word, the text returned for the word boundary by IAccessibleText::textAtOffset must be in agreement with that movement.
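One way to guarantee this agreement is to derive both the next-word caret movement and the word boundary from the same segmentation. The sketch below shows two possible policies as regular expressions; the pattern names and helper functions are hypothetical, and neither policy is prescribed by this document.

```python
import re

# Policy A: punctuation is its own stop for next-word movement.
STOP_AT_PUNCTUATION = re.compile(r"\w+\s*|[^\w\s]+\s*|\s+")
# Policy B: punctuation is skipped and belongs to the preceding word.
SKIP_PUNCTUATION = re.compile(r"\w+[^\w\s]*\s*|[^\w\s]+\s*|\s+")


def word_spans(text, pattern):
    """Split the whole text into consecutive (start, end) word spans."""
    return [match.span() for match in pattern.finditer(text)]


def word_at_offset(text, offset, pattern):
    """Model of a word boundary request: the full span containing the offset."""
    for start, end in word_spans(text, pattern):
        if start <= offset < end:
            return start, end, text[start:end]
    return len(text), len(text), ""  # offset just past the last character


def next_word(text, caret, pattern):
    """Model of ctrl + right arrow: move to the start of the next span."""
    for start, _ in word_spans(text, pattern):
        if start > caret:
            return start
    return len(text)
```

Under policy A the caret stops on the period after “ham” and the word there is “. ”; under policy B the caret skips to “I” and any offset in “ham. ” returns the whole of “ham. ”. In both cases trailing white space is part of the span, and a request at the space returns the entire word.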
New line characters
Note: the term “new line character” as used here refers to either soft line breaks (‘\r’) or hard line breaks (‘\n’).
If the client requests text for the line boundary at an offset which indicates a new line character, the server must return the text of the line prior to that new line character. In other words, the end of a line boundary should always be after a new line character. This request often occurs when the caret is just past the last visible character on a line.
If the client specifies the last offset possible, which will be the position just beyond the last character in the object and equal to the number of characters contained by the object, the server should return the prior line when the line boundary is requested, and an empty string (“”) when either the character or word boundary is requested.
Current caret position
Screen readers most often request text for a given boundary at the current caret position. When making such a request, the screen reader passes -2 to IAccessibleText::textAtOffset for the value of the offset parameter. The -2 value indicates the current caret position. (-1 indicates the last offset, equal to the number of characters contained by the object.)
However, passing -2 for the offset value is not merely a convenience. When a long line of text wraps automatically to fit the visible window, the offsets at the point where the text wraps on the first line and where the text begins on the second line may be the same.
Take the following text for example, where the vertical bar (‘|’) indicates the current position of the caret.
Assuming the line wrapped between words, there would be no hard line break (‘\n’) between the space (‘ ‘) and the number ‘1’. Therefore, there would be 11 characters in this text. This means that offset 8 where the caret is positioned can indicate either a spot just past the last character of line one, or just before the first character of line two. Using the -2 value allows the server to disambiguate between these two possibilities based on the current placement of the caret.
The following are the requirements for returning the text of a boundary when the offset passed to IAccessibleText::textAtOffset is -2 and the current caret position is ambiguous.
IA2_TEXT_BOUNDARY_CHAR: The server should return the character at the insertion point, regardless of the line actually containing the caret. In the example above, the insertion point is actually the first character of the second line, even though the cursor is on the first line.
IA2_TEXT_BOUNDARY_WORD: The server should return the word at the insertion point, regardless of the line actually containing the caret. In the example above, the insertion point is actually the first word of the second line, even though the cursor is on the first line.
IA2_TEXT_BOUNDARY_LINE: The server should always return the line containing the caret, regardless of the insertion point. Likewise, the server should never return a line on which the caret is not visible. The user must be able to start at line 1 of a document, and by pressing the down arrow key, read every line of the document with no line being repeated. This should be true regardless of the position of the caret on the line. (i.e. home, end, in between, or no-man’s-land.)
IA2_TEXT_BOUNDARY_PARAGRAPH: not applicable
IA2_TEXT_BOUNDARY_ALL: not applicable
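The character and line rules above can be modeled by having the server remember both the insertion point and the line on which the caret is drawn. This is a sketch under stated assumptions: the class `WrappedText`, its fields, and the sample lines in the usage note are hypothetical; only the special value -2 comes from the text above.

```python
CARET = -2  # special offset meaning "use the current caret position"


class WrappedText:
    def __init__(self, lines, caret_offset, caret_line):
        self.lines = lines                # wrapped lines, in display order
        self.text = "".join(lines)
        self.caret_offset = caret_offset  # insertion point within self.text
        self.caret_line = caret_line      # index of the line on which the caret is drawn

    def _line_index(self, offset):
        # For an explicit offset, a wrap point counts as the start of the next line.
        index, start = 0, 0
        while index + 1 < len(self.lines) and offset >= start + len(self.lines[index]):
            start += len(self.lines[index])
            index += 1
        return index

    def char_at(self, offset):
        if offset == CARET:
            offset = self.caret_offset    # the insertion point decides, not the caret's line
        end = min(offset + 1, len(self.text))
        return offset, end, self.text[offset:end]

    def line_at(self, offset):
        # With -2, the line on which the caret is visible always wins.
        index = self.caret_line if offset == CARET else self._line_index(offset)
        start = sum(len(line) for line in self.lines[:index])
        return start, start + len(self.lines[index]), self.lines[index]
```

For the assumed lines "abcdefg " and "123" with the caret drawn at the end of the first line (offset 8), the character request returns "1" while the line request returns "abcdefg ".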
Automatically inserted characters
Occasionally, word processors will automatically generate characters which appear on a line along with editable text. The characters are not themselves editable, but are part of the document. The most common examples of automatically inserted characters are in bulleted and numbered lists.
When such characters are present, the server must return them as part of the text retrieved by IAccessibleText::textAtOffset, IAccessibleText::text, and other such functions which provide text to the client. Since the caret never traverses these characters, servers need only return them in the following cases:
- the client calls IAccessibleText::text and requests a range starting with offset 0;
- the client asks for the text in the line boundary and the specified offset is within the boundary of the first line of the paragraph; and
- the client requests the text in the paragraph boundary.
With regard to the line boundary case above, if a list item has enough text to wrap to a new line, and the client requests the text in the line boundary for any line other than the first, the server should not return the automatically generated text.
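The line and paragraph cases can be sketched as two small helpers. The function names and the list marker "1. " used in the usage note are hypothetical, and the sketch deliberately leaves out how offsets account for the generated characters, since that is not specified here.

```python
def line_text(marker, wrapped_lines, line_index):
    """Text for a line boundary: the generated marker appears on the first line only."""
    line = wrapped_lines[line_index]
    return marker + line if line_index == 0 else line


def paragraph_text(marker, wrapped_lines):
    """Text for a paragraph boundary: the marker followed by every wrapped line."""
    return marker + "".join(wrapped_lines)
```

For a numbered item that wraps onto two lines, `line_text("1. ", lines, 0)` includes the marker and `line_text("1. ", lines, 1)` does not.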
The embed character is the Unicode character 0xFFFC. Embed characters indicate objects embedded within the text. These objects may be lists, tables, links, graphics, and so on. The embed character, along with the IAccessibleHypertext interface, is the mechanism which allows servers to make rich editable content accessible.
The IAccessibleHypertext and IAccessibleHyperlink interfaces discussed in the following sections are unfortunately named. For the purposes of this document, neither interface refers specifically to web sites, HTTP, or other related topics. Instead, these interfaces are used as a mechanism for structuring and supporting embedded objects of any type.
Objects which support IAccessibleText, and which contain text with embed characters, must support the IAccessibleHypertext interface. And the function IAccessibleHypertext::nHyperlinks must return a value equal to the number of embed characters in the text.
A screen reader uses the IAccessibleHypertext interface as follows.
When the screen reader encounters an embed character in the text, it first requests the index of the object associated with the embed character by calling IAccessibleHypertext::hyperlinkIndex. This function takes a character offset, and the server returns an index which corresponds to a child object.
The screen reader then calls IAccessibleHypertext::hyperlink, passing in the index retrieved from IAccessibleHypertext::hyperlinkIndex. The server returns an object that implements the IAccessibleHyperlink interface. All embedded objects must support the IAccessibleHyperlink interface, for reasons which will become clear later on in this document.
The object returned by IAccessibleHypertext::hyperlink must support the IAccessibleText interface if it contains text, and must support the IAccessibleHypertext interface if its text contains embed characters. In this way, a client can recursively iterate through embedded text using the IAccessibleHypertext interface.
All objects returned by IAccessibleHypertext::hyperlink must at least support the IAccessible interface, and IAccessible::accRole must return a role correctly describing them. If the object has a role from the IAccessible2 set of roles, the object must support the IAccessible2 interface.
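The recursive use of embed characters described above can be modeled as follows. This is a simplified sketch, not the COM interface: the class `HyperText`, the helper `flatten`, and the convention of returning -1 when the offset is not an embed character are assumptions made for illustration.

```python
EMBED = "\ufffc"


class HyperText:
    """Hypothetical object modeling IAccessibleText plus IAccessibleHypertext."""

    def __init__(self, text, children=()):
        self.text = text
        self.children = list(children)
        # nHyperlinks must equal the number of embed characters in the text.
        if self.text.count(EMBED) != len(self.children):
            raise ValueError("one embedded object is required per embed character")

    def n_hyperlinks(self):
        return len(self.children)

    def hyperlink_index(self, offset):
        if self.text[offset] != EMBED:
            return -1  # assumed convention for "no embedded object at this offset"
        return self.text.count(EMBED, 0, offset)

    def hyperlink(self, index):
        return self.children[index]


def flatten(node):
    """Client-side walk that replaces each embed character with the embedded text."""
    parts = []
    for offset, char in enumerate(node.text):
        if char == EMBED:
            parts.append(flatten(node.hyperlink(node.hyperlink_index(offset))))
        else:
            parts.append(char)
    return "".join(parts)
```

A screen reader performs a walk of this kind when it reads a line that contains links, lists, or other embedded objects.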
Some types of text features may require multiple nested objects with multiple roles to properly represent them. The following text describes commonly encountered roles, and how the screen reader expects the server to implement them.
To support a list, servers must first supply an embedded object with the role of ROLE_SYSTEM_LIST. This object must support both the IAccessibleText and IAccessibleHypertext interfaces. An object with the role of ROLE_SYSTEM_LIST may only have embedded objects with the role of ROLE_SYSTEM_LISTITEM. (For simplicity, from now on, objects with the role of ROLE_SYSTEM_LIST will be called “list” and objects with the role of ROLE_SYSTEM_LISTITEM will be called “list item”.)
A list must have one (1) embed character for each list item or nested list that it contains. The screen reader will represent list item objects in the order of their associated embed characters.
List items may contain embed characters representing any type of object. This includes nested lists.
Nesting level is the depth of a list embedded within a list. Servers must provide the nesting level of list item objects with the IAccessible2::groupPosition function.
Although IAccessible2::groupPosition takes and returns three (3) parameters, the server should use the first parameter, “groupLevel”, to provide the nesting level of the list. Please see the IAccessible2 API documentation for further details.
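As a rough illustration of how a server might derive the nesting level, the sketch below counts list ancestors of a list item. The `Node`, `Role`, and `GroupPosition` types are hypothetical and only mirror the three values of IAccessible2::groupPosition. The sketch assumes that levels are 1-based and that 0 means "not applicable" for the two values it does not compute; check the IAccessible2 API documentation before relying on either assumption.

```cpp
#include <cassert>

// Hypothetical roles for the model; a real server would use the MSAA roles.
enum class Role { Document, List, ListItem };

// Hypothetical stand-in for an accessible object.
struct Node {
    Role role;
    const Node* parent;  // stands in for IAccessible::accParent
};

// Mirrors the three values of IAccessible2::groupPosition.
struct GroupPosition {
    long groupLevel;
    long similarItemsInGroup;
    long positionInGroup;
};

// Computes the nesting level of a list item as the number of list objects
// among its ancestors. The other two values are left at 0, which this
// sketch assumes to mean "not applicable".
GroupPosition groupPosition(const Node& item) {
    assert(item.role == Role::ListItem);
    long level = 0;
    for (const Node* p = item.parent; p != nullptr; p = p->parent)
        if (p->role == Role::List) ++level;
    return GroupPosition{level, 0, 0};
}
```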
To support a table, servers [R2] add an embed character in the text. Retrieving the hyperlink associated with the embed character returns an [/R2] object with the role of
[R2] The table object will have children that are table row objects. Each of these table row objects will have cell objects as their children. It is important that each of the cell objects have state IA2_STATE_EDITABLE and support the IAccessibleText interface. It is highly desirable that the server provide the IAccessibleTable2 and IAccessibleTableCell interfaces when supporting tables, as these interfaces provide much more detail for the assistive technology.
The server must provide keyboard navigation for the table as described previously in the section on Keyboard Access. That is, there must be a keystroke that moves focus amongst the cells in the table as well as keys to access the contents of individual cells. [/R2]
To support a link, servers must supply an embedded object with the role of
Links may contain embed characters representing any type of object.
To support an image, servers must supply an embedded object with the role of
Graphics must have text describing their content. The screen reader should look for this text in the object's
Graphics need only support the basic interfaces required by this document (IAccessible, IAccessible2, and IServiceProvider). However, if a graphic object supports IAccessibleText, the screen reader should also render the text provided by that interface.
To support a heading, servers must supply an embedded object with the role of
Servers must provide the level of heading objects with the IAccessible2::groupPosition function. Although IAccessible2::groupPosition takes and returns three (3) parameters, the server should use the first parameter, “groupLevel”, to provide the heading level. Refer to the IAccessible2 API documentation for further details.
Other Control Types
A wide variety of other control types may be supported in a similar fashion. For example, a server may add an embed character that returns an object with role ROLE_SYSTEM_PUSHBUTTON to represent a button control. When the caret is located on such a control, the server must provide a mechanism for interacting with the control. For example, if the embedded control is a button, the server may use the spacebar to activate the button. Similarly, if the embedded control is a list box or combo box, the up and down arrow keys will move the focus inside the control rather than from line to line as is normally the case. Note that control types that might be supported in a similar fashion include but are not limited to buttons, combo boxes, list boxes, drop-down lists, date time pickers, and so on.
Nonstandard control types also may be embedded in text using an embed character and returning a suitable role. For these nonstandard controls, the server must again provide keyboard accessible means for manipulating the control. Something like a tree grid might require the server to implement keystrokes to provide navigation between the tree part of the control and the grid part of the control as well as providing keystrokes to allow navigation within the grid or within the tree itself.
Embeds should be used as described whenever the object attribute "text-model:a1;" is used. With "text-model:a1;", the IAccessibleText interface should contain an embed character for each embedded control, and the IAccessibleHypertext and IAccessibleHyperlink interfaces should work for finding these controls. This applies to all applications, including browsers, and to controls with complex content such as multiline text boxes or tables. Each actual control may be an IAccessible object that is a child of the document.
Downwardly traversing through embedded objects
As mentioned in item B, in the section entitled “The caret”, the client must be able to start from the object with focus, and by recursively using the IAccessibleHypertext interface, find the object that contains the caret. Therefore, a call to IAccessibleText::caretOffset on the object with focus must always return the HRESULT S_OK and a valid offset value. If the offset value returned indicates an embed character, the client must be able to use the IAccessibleHypertext interface as described in the previous section to downwardly traverse the object hierarchy until it finds the object containing the caret.
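The downward traversal can be sketched as a loop over a simplified object model. Everything here is hypothetical scaffolding rather than the COM API: `Node`, `embeddedAt`, `CaretLocation`, and `findCaret` are illustrative names, `caretOffset` stands in for the value a server returns from IAccessibleText::caretOffset, and `embeddedAt` collapses the hyperlinkIndex and hyperlink calls into one step. U+FFFC is assumed as the embed character.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

constexpr char16_t kEmbed = u'\uFFFC';  // assumed embed character

// Hypothetical stand-in for an accessible object.
struct Node {
    std::u16string text;
    std::vector<std::shared_ptr<Node>> links;  // embedded objects, in text order
    std::size_t caretOffset = 0;  // stands in for IAccessibleText::caretOffset

    // Returns the embedded object whose embed character sits at 'offset',
    // or nullptr if that offset does not hold an embed character.
    const Node* embeddedAt(std::size_t offset) const {
        if (offset >= text.size() || text[offset] != kEmbed) return nullptr;
        std::size_t index = 0;
        for (std::size_t i = 0; i < offset; ++i)
            if (text[i] == kEmbed) ++index;
        return index < links.size() ? links[index].get() : nullptr;
    }
};

struct CaretLocation {
    const Node* object;
    std::size_t offset;
};

// Starts at the focused object and descends while the caret offset points
// at an embed character; stops at the object that really holds the caret.
CaretLocation findCaret(const Node& focused) {
    const Node* current = &focused;
    for (;;) {
        std::size_t offset = current->caretOffset;
        assert(offset <= current->text.size());
        const Node* child = current->embeddedAt(offset);
        if (child == nullptr) return CaretLocation{current, offset};
        current = child;
    }
}
```

The loop only terminates correctly if every object on the path reports a valid caret offset, which is why the guideline requires S_OK and a valid offset from the focused object.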
Upwardly traversing from embedded objects
As mentioned in the section of this document entitled “The caret”, clients must be able to start from the object with the caret, and by recursively calling the IAccessible::accParent method, find the object with focus. In addition, the section on embed characters specifies that all embedded objects must support the IAccessibleHyperlink interface. These guidelines become critical when a text boundary spans multiple objects and the starting object for the offset is embedded.
For example, consider the following line containing a link.
"Please visit CNN for further details."
Now imagine that the user moves the caret to the letter C in “CNN”. When this occurs, the server will send the IA2_EVENT_TEXT_CARET_MOVED event indicating that an object with the role of ROLE_SYSTEM_LINK has the caret. This object will contain the text, “CNN”.
In the example, the line boundary obviously extends beyond the object with the caret. So what happens if a screen reader asks for the text of the line boundary at this offset?
First, the screen reader asks the caret object for the text of the line boundary at the caret position. In the example above, the server would return the text “CNN”, a starting offset of 0, and an ending offset of 3.
When the screen reader sees that the starting offset of the boundary is 0, it requests the IAccessibleHyperlink interface from the caret object. Once that interface has been obtained, the screen reader calls IAccessibleHyperlink::startIndex to retrieve the offset of the embed character which corresponds to the caret object. Note: although the function name contains the word “index”, this function returns a character offset, rather than the hyperlink index.
Now, the screen reader calls IAccessible::accParent to retrieve the parent object. It then asks the parent for the text of the line boundary at the offset retrieved by IAccessibleHyperlink::startIndex in the previous step. The text retrieved would be “Please visit [0xFFFC] for further details.” Note: the text within brackets represents a single embed character.
If the starting offset of this line were 0, a screen reader would again ask for the IAccessibleHyperlink interface, this time for the parent object. The screen reader continues traversing recursively up through the object hierarchy in this way until any one of the following conditions is met.
- The starting offset of the specified boundary is non-zero;
- The current object does not support the IAccessibleHyperlink interface (this can only be valid on the topmost object, since all other objects are embedded); or
- The object reports the state of STATE_SYSTEM_FOCUSED as provided by the
Having found the starting object and offset of the text boundary, a screen reader gathers the text for the boundary in the manner described in the section “Embed characters”.
While the example above describes only two objects, it is important to remember that there may be multiple levels of object nesting, and the server must provide for their upward traversal in the manner outlined above.
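The upward traversal and its three stopping conditions can be sketched as follows. This is a simplified model, not the COM API: `Node`, `Location`, `lineStart`, and `findLineStart` are hypothetical names. The sketch assumes that lines end at a newline character, that `startIndex` holds the value IAccessibleHyperlink::startIndex would return, and that an object without a parent is the topmost object, which does not support IAccessibleHyperlink.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for an accessible object.
struct Node {
    std::u16string text;
    const Node* parent = nullptr;  // stands in for IAccessible::accParent
    std::size_t startIndex = 0;    // offset of this object's embed character
                                   // in the parent's text
    bool focused = false;          // stands in for STATE_SYSTEM_FOCUSED
};

// Offset of the first character of the line containing 'offset'.
// Simplification: a line ends at '\n'; real servers decide line boundaries.
std::size_t lineStart(const std::u16string& text, std::size_t offset) {
    assert(offset <= text.size());
    if (offset == 0) return 0;
    std::size_t newline = text.rfind(u'\n', offset - 1);
    return newline == std::u16string::npos ? 0 : newline + 1;
}

struct Location {
    const Node* object;
    std::size_t offset;
};

// Walks up from the caret object until one of the three conditions holds,
// and returns the object and offset where the line boundary starts.
Location findLineStart(const Node& caretObject, std::size_t caretOffset) {
    const Node* current = &caretObject;
    std::size_t offset = caretOffset;
    for (;;) {
        std::size_t start = lineStart(current->text, offset);
        if (start != 0) return {current, start};  // non-zero starting offset
        if (current->parent == nullptr) return {current, 0};  // topmost object
        if (current->focused) return {current, 0};            // focused object
        offset = current->startIndex;  // continue at the embed character
        current = current->parent;
    }
}
```

From the returned location, a client would then gather the text of the boundary downward, expanding embed characters as it goes.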
Selected text is text that is highlighted so that it may be copied, cut, deleted, replaced, formatted, or otherwise acted upon as a single block of text.
Text selection is represented by a starting object which contains the starting offset and an ending object which contains the ending offset. The starting and ending objects may be the same, but when they are, the starting and ending offsets must differ.
When retrieving selected text, a screen reader always starts at the root object of an editable text area. This is the object with focus. When text is selected, the root object will always contain a selection, even if that selection is a single embed character representing an embedded object. Likewise, all descendants between the root object and the starting and ending objects of the selected text will contain a selection. The screen reader searches down through the text to find the beginning and ending objects and offsets for the current selection by recursively calling the IAccessibleText::selection function. Recursion ends when neither the first nor last characters of the selection are the embed character.
The first parameter to IAccessibleText::selection is a selection index. These selection indexes are meant to allow for multiple non-contiguous selections. However, a screen reader does not typically support non-contiguous selections, and therefore would send 0 for this parameter.
When IAccessibleText::selection is called, the server returns a starting and ending offset for the current text selection. If there is no selection, the server may return either S_FALSE or a COM error code. Otherwise the server must return offsets representing the selection where the ending offset is greater than (>) the starting offset. In other words, the selection must always be at least one (1) character wide, or the returned offsets are not valid.
Because the content of an editable text area is represented by objects in a tree-like structure, and the starting and ending objects of a selection need not be the same, it is important to remember that these objects may have any kind of relationship to one another. For example, the starting object may be the same as the ending object; the starting object may be the parent of the ending object; the starting object may be the child of the ending object; the objects may be siblings with the parent containing part of the selected text; and so on. It is therefore important for developers to thoroughly test how their servers provide selected text.
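One way to picture the recursive search for the selection endpoints is the sketch below. It is a model under stated assumptions, not the COM API: `Node`, `embeddedAt`, `Location`, and `selectionEdge` are hypothetical names, `selStart` and `selEnd` stand in for the offsets returned by IAccessibleText::selection with selection index 0, the ending offset is assumed to be one past the last selected character, and U+FFFC is assumed as the embed character.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

constexpr char16_t kEmbed = u'\uFFFC';  // assumed embed character

// Hypothetical stand-in for an accessible object.
struct Node {
    std::u16string text;
    std::vector<std::shared_ptr<Node>> links;  // embedded objects, in text order
    // Stand in for the offsets from IAccessibleText::selection (index 0).
    // selEnd is assumed to be one past the last selected character.
    std::size_t selStart = 0;
    std::size_t selEnd = 0;

    // Returns the embedded object whose embed character sits at 'offset',
    // or nullptr if that offset does not hold an embed character.
    const Node* embeddedAt(std::size_t offset) const {
        if (offset >= text.size() || text[offset] != kEmbed) return nullptr;
        std::size_t index = 0;
        for (std::size_t i = 0; i < offset; ++i)
            if (text[i] == kEmbed) ++index;
        return index < links.size() ? links[index].get() : nullptr;
    }
};

struct Location {
    const Node* object;
    std::size_t offset;
};

// Follows the first (atStart) or last selected character downward for as
// long as that character is an embed character, then reports the object
// and offset where that edge of the selection really lies.
Location selectionEdge(const Node& root, bool atStart) {
    const Node* current = &root;
    for (;;) {
        // A valid selection is at least one character wide.
        assert(current->selEnd > current->selStart);
        std::size_t edge = atStart ? current->selStart : current->selEnd - 1;
        const Node* child = current->embeddedAt(edge);
        if (child == nullptr)
            return {current, atStart ? current->selStart : current->selEnd};
        current = child;
    }
}
```

Calling `selectionEdge` once with `true` and once with `false` yields the starting and ending locations, which may land in the same object, in a parent and child, or in siblings, as described above.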
- Microsoft Active Accessibility Architecture
- IAccessible2 API documentation
- IAccessible interface documentation
- MSAA roles
- MSAA states
- MSAA events
- Implementing an MSAA Server - How Mozilla Does It, and Practical Tips for Developers
Active Accessibility and Microsoft are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.