The treachery of HTML elements

In 1929, René Magritte, the surrealist painter who was only 30 years old at the time, painted one of his most famous works, “The Treachery of Images”. The painting shows a brown wooden pipe with a dark mouthpiece, and its caption reads “Ceci n’est pas une pipe” (French for “This is not a pipe”).

René Magritte – The Treachery of Images (This is Not a Pipe), 1929, used in this blog post as a metaphor for HTML elements.

The inherent contradiction of the image in the painting, its defiant caption, as well as the ideas the artist wished to convey have since been the subject of countless studies and art history books. The prevailing interpretation holds that Magritte wished to emphasize the gap between language and reality by reminding viewers that the object in front of them is not a pipe but only a visual representation of one. In Harry Torczyner’s book “Magritte: Ideas and Images”, Magritte had this to say about the painting:

The famous pipe. How people reproached me for it! And yet, could you stuff my pipe? No, it’s just a representation, is it not? So if I had written on my picture “This is a pipe”, I’d have been lying!
René Magritte

I was recently reminded of Magritte’s painting when I saw a meme in which the caption “This is not a pipe” is replaced with “B***h, I might be.”

At the time, I was busy writing about how semantic landmarks affect the UX of screen reader and voice control software users. I guess that’s why, when I saw this meme, it crossed my mind that both the original painting and the meme referencing it frame the story of HTML semantics. The original picture illustrates the potential gaps between the semantics of the UI’s building blocks (the HTML elements), their visual output, and their actual meaning in the UI, whereas the meme boldly hints at the options we have in HTML today to expand and, if necessary, change the semantics of these building blocks.

Discussing the gap between visual representation and reality is an excellent intellectual exercise that has many variants. When it comes to web pages and applications, however, this gap may exclude users who lack the ability or the tools to bridge it.

The aim of this post is to illustrate, with the help of Magritte’s painting, the importance of HTML semantics and its effect on how assistive technologies convey the UI.

One more side note before we dive into it: Almost all assistive technologies are affected by HTML semantics to some extent. In this post, I mainly refer to screen readers for two intertwined reasons.

Firstly, I am convinced that it is the assistive technology most affected by HTML semantics.

Secondly, because screen readers are so affected by the DOM semantics, they best illustrate the accessibility issues that missing or incorrect semantics can cause.

Nevertheless, the issues we will discuss are relevant to the UX of other assistive technologies as well.

HTML elements tangibility

Let’s stay with Magritte’s painting. If a blind person were standing in front of the painting and we asked them to describe the object in front of them without the help of intermediaries, they would presumably rely mainly on the sense of touch. They would probably be able to say that it is a painted canvas and estimate its dimensions. If they are familiar with the materials, they might also say that it is an oil painting. There may be other small details they could tell. Still, we can assume that “pipe” would not be one of the nouns used in the description, since apart from the flat visual representation of a pipe on it, this object doesn’t have any properties of a pipe, because it is simply not a pipe.

The same issue applies to HTML elements. HTML elements exist in a virtual space and are therefore intangible.

The physical properties that characterize elements in the real world are replaced in the virtual space by semantic values that define the object’s properties. The semantics of an HTML element is the sum of the properties and states that determine its essence, purpose, and the way it is used; in our case, the properties are explicitly and implicitly conveyed to the user by assistive technologies.

Semantics in action

Let’s take two buttons, for example. Both buttons have the same CSS rules applied, and both have the same “onclick” event listener attached. One of the buttons is implemented with a <button> tag, and the other is implemented with a <div> tag. While the <button> based button will be accessible to users of all assistive technologies, the <div> based button will be hard or impossible for those same users to operate. Let’s examine a few examples of why:

Keyboard only users

<div> elements are not focusable by default, and are therefore excluded from the page’s tabbing sequence. HTML elements that are not part of the page’s tabbing sequence cannot be reached by keyboard users.
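To make the tabbing rule concrete, here is a minimal sketch that models it in TypeScript. It is a simplification and not the browser's actual algorithm: the `MarkupElement` type, the `isInTabSequence` function, and the short tag list are all assumptions made for illustration (for instance, disabled controls are ignored).

```typescript
// Partial, illustrative subset of natively focusable tags (not the full HTML rule set).
const NATIVELY_FOCUSABLE_TAGS: string[] = ["button", "input", "select", "textarea"];

// Hypothetical, simplified description of an element: a tag name plus its attributes.
interface MarkupElement {
  tag: string;
  attributes: { [attrName: string]: string | undefined };
}

// Returns true when this simplified model would place the element in the tab sequence.
function isInTabSequence(el: MarkupElement): boolean {
  const tabindex = el.attributes["tabindex"];
  if (tabindex !== undefined) {
    // A negative tabindex takes the element out of the tab order;
    // zero or a positive value puts it in.
    return parseInt(tabindex, 10) >= 0;
  }
  if (el.tag === "a") {
    // Links only take part in tabbing when they have an href.
    return el.attributes["href"] !== undefined;
  }
  return NATIVELY_FOCUSABLE_TAGS.indexOf(el.tag) !== -1;
}

// The two buttons from the example above.
const nativeButton: MarkupElement = { tag: "button", attributes: {} };
const divButton: MarkupElement = { tag: "div", attributes: {} };
const nativeReachable = isInTabSequence(nativeButton);
const divReachable = isInTabSequence(divButton);
```

In this model the `<button>` is reachable with the Tab key while the plain `<div>` is not, even though both may look identical on screen.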

Screen reader users

Screen readers announce each element’s type and, if the element is interactive, how to use it. <div> elements are generic containers; nothing about their semantics implies that the user can interact with them. For a <div>, a screen reader won’t announce a type or a usage hint; it will only read the element’s text node.

Voice control users

Voice control software displays the names of the interactive elements on the page so that the user can control them by voice. For example, saying “click send” will trigger a button labeled “Send”. But <div> elements are not mapped as interactive elements, and will therefore not be offered by the voice control interface as something to interact with.

We saw three examples of how a lack of semantics affects the accessibility of an element. Still, one could claim that all the issues discussed above can be overcome by adding a few more attributes to the element, which brings us to the meme’s side of the story.

Beyond the boundaries of native HTML semantics

Unfortunately, it is not rare to find UI elements with incorrect semantics. Years without orderly HTML standardization, insufficient semantic variety, and the browsers’ tolerance of loosely structured HTML have led to a situation in which the semantics of a page’s building blocks often do not match their actual meaning and purpose. While these mismatches are usually imperceptible to the average user, they significantly affect the user experience of those who depend on assistive technologies, to the point that they may prevent these people from using the interface at all.

Since issues arising from the semantic structure mainly affect accessibility, it was only natural that the WAI (Web Accessibility Initiative) took on the task. In 2014 it introduced the role attribute, backed by the “Roles Model”, as part of WAI-ARIA.

The “role” attribute was designed to extend the semantics of HTML elements and allow for more complex semantic structures than what can be achieved with standard native HTML. The “role” attribute, at least in my opinion, is indeed the most significant leap made in the maturation of the language semantics. It is safe to say that the “role” attribute is the most diverse and complex HTML attribute. It has 61 allowed values, divided into three categories; some act as components of composite UI widgets and require specific parent or child elements. However, this complexity and diversity also make the “role” attribute deceptive, and misuse can often cause deterioration in accessibility instead of improving it.

Know your `role` (attribute)

A playful remake of René Magritte's painting and popular meme, showing a pipe and "b***h I might be" underneath, as a metaphor for HTML elements.

As mentioned earlier, the “role” attribute can be deceptive, since it may seem that setting a role on an element automatically applies all of the role’s attributes to it. In fact, the element will not adopt any of the role’s attributes, nor will it lose any of its default native attributes. For example, <div role="button">Click Me</div> will not become focusable, and will still have most of the issues discussed earlier regarding the <div> based button.

To better understand how the role attribute works, let’s look at the “WAI-ARIA Authoring Practices”. The document’s preface lists the only two principles that WAI defines for correct and effective use of the ARIA attributes. The first one refers to the role attribute:

Principle 1: A role is a promise

It is essential to understand that the only change the role attribute makes is to change the type of element on the accessibility tree. The accessibility tree is parallel to the DOM tree; its function is to map the page elements for assistive technologies. When we set a role to an element, we change how it will be listed and announced by assistive technologies. Therefore, as authors, we make a promise that we have incorporated into the element all the other attributes expected from its new role.
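As a rough illustration of this principle, the sketch below models how an entry in the accessibility tree might be derived from an element. It is not a real browser API: `sketchTreeEntry`, the `TreeEntrySketch` type, and both lookup tables are hypothetical and deliberately incomplete.

```typescript
// Partial, illustrative table of implicit roles for a few native elements.
const IMPLICIT_ROLE_SAMPLE: { [tag: string]: string | undefined } = {
  button: "button",
  nav: "navigation",
  main: "main",
  ul: "list",
};

// Partial, illustrative list of natively focusable tags.
const FOCUSABLE_TAG_SAMPLE: string[] = ["button", "input", "select", "textarea"];

// Hypothetical shape of one accessibility tree entry in this simplified model.
interface TreeEntrySketch {
  role: string | null; // null means "no specific role", as for a plain <div>
  focusable: boolean;
}

function sketchTreeEntry(
  tag: string,
  attributes: { [attrName: string]: string | undefined }
): TreeEntrySketch {
  const explicitRole = attributes["role"];
  const implicitRole = IMPLICIT_ROLE_SAMPLE[tag];
  // The role attribute, when present, decides how the element is listed...
  let role: string | null = null;
  if (explicitRole !== undefined) {
    role = explicitRole;
  } else if (implicitRole !== undefined) {
    role = implicitRole;
  }
  // ...but focusability still comes only from the tag and tabindex, never from the role.
  const focusable =
    attributes["tabindex"] !== undefined || FOCUSABLE_TAG_SAMPLE.indexOf(tag) !== -1;
  return { role: role, focusable: focusable };
}

const divWithRole = sketchTreeEntry("div", { role: "button" });
const realButton = sketchTreeEntry("button", {});
```

Here `divWithRole` is listed as a button but is still not focusable, while `realButton` is both. Everything beyond the announced type remains the author's promise to keep.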

Let’s look at this example again: <div role="button">Click Me</div>. Assistive technologies will register and announce this element as a “button”. You may remember that the missing announcement was one of the accessibility issues of the <div> based button we discussed earlier in this post. However, this button will still not be accessible to keyboard users. It has other screen-reader-related issues as well.

As mentioned above, <div> elements are not focusable by default, and therefore they are not accessible as controls for keyboard users. We can fix that by adding tabindex="0" to the <div> element, which makes any DOM element focusable. Focus alone is not the whole story, though: a native <button> also responds to the Enter and Space keys, so the <div> needs keyboard event handling for those as well. And what if this button should be disabled in some cases? A simple “disabled” attribute that works just fine for a <button> element will not work here, since “disabled” only has an effect on elements that natively support it. As far as the DOM is concerned, the element has not changed; it is still a <div>, and hence it has no disabled/enabled state at all. Therefore, if we want to disable our <div> based button, we have to do the following:

Remove any keyboard/mouse event listeners attached to the element.
Remove the “tabindex” attribute, so that it is not focusable.
Add aria-disabled="true" so that screen readers can announce its correct state.

And you will have to reverse all of these steps when you wish to enable it again.
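The bookkeeping described above can be sketched roughly as follows. This is one possible arrangement, not a recommended implementation: `createDivButton`, `ButtonHost`, and `InputEventLike` are hypothetical names, and `ButtonHost` is a small stand-in that mirrors only the few DOM methods the sketch uses, so that it can run without a browser.

```typescript
// Hypothetical minimal event shape: only the pressed key matters here.
interface InputEventLike {
  key?: string;
}

// Stand-in for the handful of element methods this sketch relies on.
interface ButtonHost {
  setAttribute(attrName: string, value: string): void;
  removeAttribute(attrName: string): void;
  addEventListener(type: string, listener: (event: InputEventLike) => void): void;
  removeEventListener(type: string, listener: (event: InputEventLike) => void): void;
}

interface DivButtonControl {
  enable(): void;
  disable(): void;
}

function createDivButton(host: ButtonHost, onActivate: () => void): DivButtonControl {
  const onClick = (): void => {
    onActivate();
  };
  const onKeyDown = (event: InputEventLike): void => {
    // Native buttons react to Enter and Space; on a div this is wired up by hand.
    // A production version would also call preventDefault for Space to avoid page scrolling.
    if (event.key === "Enter" || event.key === " ") {
      onActivate();
    }
  };

  // The role only changes how the element is announced; everything else is up to us.
  host.setAttribute("role", "button");

  return {
    enable(): void {
      host.setAttribute("tabindex", "0"); // back in the tabbing sequence
      host.removeAttribute("aria-disabled");
      host.addEventListener("click", onClick);
      host.addEventListener("keydown", onKeyDown);
    },
    disable(): void {
      // Step 1: remove the keyboard/mouse listeners.
      host.removeEventListener("click", onClick);
      host.removeEventListener("keydown", onKeyDown);
      // Step 2: remove tabindex so the element is no longer focusable.
      host.removeAttribute("tabindex");
      // Step 3: expose the state to assistive technologies.
      host.setAttribute("aria-disabled", "true");
    },
  };
}
```

In a browser, the host would be the `<div>` itself. With a native `<button>`, all of this collapses into toggling a single `disabled` attribute, which is the maintenance cost the next paragraph refers to.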

Therefore, in the HTML world, a painting can become a pipe. However, changing an element’s semantics when an equivalent native semantic element exists is a bad idea and is considered bad practice. It is harder to maintain over time (since you have to manage all the attributes and states yourself), and it makes the markup less readable.

Another factor that often leads to a misunderstanding of the role attribute is the large number and variety of its allowed values. This variety sometimes creates the impression that any string is a valid value. In reality, not only is there a strict list of allowed values, but many of them also require a specific set of additional attributes and states to fulfill their role’s promise.

The role attribute was created out of the need to enhance HTML’s semantic diversity. In fact, that is its one and only purpose. Using it correctly can significantly improve the UX for assistive technologies that use the accessibility tree to convey content to the user. In the next and last part of this post, we will see how to use the role attribute safely and make sure that its promise is fulfilled.

Rules of thumb for using the role attribute

Always prefer a native semantic element over a role attribute

As mentioned earlier, the role attribute was meant to enhance HTML semantics; it should by no means change or replace them. Role values with a native semantic equivalent were usually meant to be used in composite design patterns such as combo-boxes or multi-level menus, not for changing the semantics of a single element. Let’s now discuss how to use the role attribute accessibly and make the most of it.

Be sure to use a valid role value

An HTML element’s “role” attribute can be assigned one of 61 different values; with that many, one can easily make a mistake such as misspelling a value or using one that does not exist. To ensure that you are using a valid role value, check that it is listed in the WAI-ARIA roles list.
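A check like this can also be automated. The sketch below is only an assumption of how such a helper might look: `isKnownRole` and `KNOWN_ROLES_SAMPLE` are hypothetical, and the list is a small excerpt of real role names rather than the full set from the WAI-ARIA roles list.

```typescript
// Partial excerpt of WAI-ARIA role names, for illustration only.
// A real check should use the complete list from the WAI-ARIA specification.
const KNOWN_ROLES_SAMPLE: string[] = [
  "alert", "button", "checkbox", "combobox", "dialog",
  "link", "list", "listitem", "menu", "menuitem",
  "navigation", "radio", "tab", "tablist", "tabpanel",
];

// Returns true when the value is one of the sample role names above.
function isKnownRole(roleValue: string): boolean {
  return KNOWN_ROLES_SAMPLE.indexOf(roleValue.trim()) !== -1;
}

// A correct value next to two typical slips: a typo and a made-up role.
const validRole = isKnownRole("button");
const misspelledRole = isKnownRole("buton");
const inventedRole = isKnownRole("dropdown");
```

Only the first call reports a known role. A helper of this kind could run in a linter or a unit test, so that typos and invented values are caught before they reach users.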

Check the role’s required attributes, states, and context

As already mentioned, the semantics of an element is the sum of its properties and attributes. Setting a role on an element is a statement of intent: a promise that the element can fulfill whatever its role implies. Certain roles must have a state or value available; others must be placed in a specific context or associated with other elements. In many cases, if these attributes are missing or set incorrectly, the element will become less accessible than it was without a role assigned to it.

Therefore, to ensure that we don’t miss any attribute, we can click the role name on the WAI-ARIA roles list, which scrolls the page to the role’s description and characteristics table.

Let’s look at some of the “checkbox” role’s characteristics:

The "checkbox" role's characteristics table

Required States and Properties: The properties listed under this characteristic are not optional; they are required to satisfy the role’s semantics.
Implicit Value for Role: This is the default value applied to the element when a required state or value was not set.
Accessible Name Required: This characteristic is self-explanatory; nevertheless, it is essential for the user’s ability to perceive the UI correctly.
Related concepts: Since the WAI-ARIA role was designed to enhance HTML semantics, many of its values are based on the semantics of native HTML elements. The “Related concepts” characteristic presents the HTML element that the role is based on or most similar to, if such an element exists. The roles’ characteristics tables focus on attributes that affect how an element is registered in the accessibility tree. Still, these do not necessarily cover all the accessibility aspects of the element. For example, in our case, the “checkbox” role is related to the HTML input[type="checkbox"] element. We know that input elements must be focusable to be accessible by keyboard, so we can infer that our element should be focusable as well.
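Put together, the characteristics above suggest a set of attributes that a custom checkbox has to carry. The following sketch is one way to express that, under stated assumptions: `customCheckboxAttributes` and `toggleChecked` are hypothetical helpers, and the accessible name is supplied through `aria-label` although visible text content would serve as well.

```typescript
type AttrMap = { [attrName: string]: string };

// Builds the attributes a non-native checkbox would need in this sketch.
function customCheckboxAttributes(label: string, checked: boolean): AttrMap {
  return {
    role: "checkbox",
    "aria-checked": checked ? "true" : "false", // required state for this role
    "aria-label": label, // accessible name (assumed to come from aria-label here)
    tabindex: "0", // focusable, inferred from the related native input
  };
}

// Returns a copy of the attributes with the checked state flipped.
// Any value other than "true" (including "mixed") becomes "true".
function toggleChecked(attrs: AttrMap): AttrMap {
  const next: AttrMap = {};
  Object.keys(attrs).forEach(function (attrName: string): void {
    next[attrName] = attrs[attrName];
  });
  next["aria-checked"] = attrs["aria-checked"] === "true" ? "false" : "true";
  return next;
}

const uncheckedBox = customCheckboxAttributes("Subscribe to newsletter", false);
const checkedBox = toggleChecked(uncheckedBox);
```

The toggle has to be triggered by the author's own click and Space key handling, since none of these attributes adds behavior by itself. A native `<input type="checkbox">` provides the state, the focus, and the keyboard handling without any of this.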

The rows of the characteristics table vary between roles, depending on each role’s semantic requirements. Let’s look at the characteristics tables of two more roles, namely “list” and “listitem”. These roles have context requirements that the checkbox did not have. The “list” role has a “Required Owned Elements” characteristic, which details what the type of its direct children must be. Conversely, the characteristics table of the “listitem” role has the opposite characteristic, “Required Context Role”, which specifies the required type of its parent element.

The "list" role's characteristics table with the "Required Owned Elements" row highlighted, beside the "listitem" role's characteristics table with the "Required Context Role" row highlighted
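The two context characteristics can be read as rules that a small checker could verify. The sketch below assumes a simplified tree of roles and covers only the “list” and “listitem” pair discussed above; `RoleNode`, `findContextViolations`, and both rule tables are hypothetical, and the actual specification allows a few more combinations than this.

```typescript
// Hypothetical simplified tree: each node carries only its role and its children.
interface RoleNode {
  role: string;
  children: RoleNode[];
}

// Simplified rules for the two roles discussed in the text.
const REQUIRED_OWNED_SAMPLE: { [role: string]: string[] | undefined } = {
  list: ["listitem"],
};
const REQUIRED_CONTEXT_SAMPLE: { [role: string]: string[] | undefined } = {
  listitem: ["list"],
};

// Walks the tree and collects a message for every broken context rule.
function findContextViolations(node: RoleNode, parentRole: string | null): string[] {
  let problems: string[] = [];

  // "Required Context Role": the parent must have one of the allowed roles.
  const allowedParents = REQUIRED_CONTEXT_SAMPLE[node.role];
  if (
    allowedParents !== undefined &&
    (parentRole === null || allowedParents.indexOf(parentRole) === -1)
  ) {
    problems.push('"' + node.role + '" must be a child of: ' + allowedParents.join(", "));
  }

  // "Required Owned Elements": every direct child must have an allowed role.
  const allowedChildren = REQUIRED_OWNED_SAMPLE[node.role];
  for (const child of node.children) {
    if (allowedChildren !== undefined && allowedChildren.indexOf(child.role) === -1) {
      problems.push('"' + node.role + '" may not own "' + child.role + '"');
    }
    problems = problems.concat(findContextViolations(child, node.role));
  }
  return problems;
}

const validList: RoleNode = {
  role: "list",
  children: [
    { role: "listitem", children: [] },
    { role: "listitem", children: [] },
  ],
};
const orphanItem: RoleNode = { role: "listitem", children: [] };
const validListProblems = findContextViolations(validList, null);
const orphanItemProblems = findContextViolations(orphanItem, null);
```

The well-formed list produces no messages, while the “listitem” without a “list” parent produces one. The same idea extends to other roles by adding entries to the two tables.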

Checking the role’s characteristics table is therefore essential to ensure that we meet all the role’s requirements. However, some roles require a composite structure involving more than one element, where the elements do not necessarily have an explicit connection between them and yet may still affect each other. In these cases, the characteristics tables are a good start, but they don’t give the full picture, so we might want to check the WAI-ARIA Design Pattern Examples list.

ARIA Design Patterns

The roles’ characteristics tables give us a limited perspective: the states and properties required for the specific element that carries the “role” attribute. However, in most cases, an element with a role attribute will not stand by itself; it will be part of a more complex structure. Let’s take a combo-box, for instance. It combines a drop-down list and a single-line editable text field, allowing the user to either type a value directly or select a value from the list. The element with role="combobox" only wraps this orchestration and registers it in the accessibility tree as a “combobox”, but what does it take to make it accessible? We need a way to express whether the list is expanded or collapsed, announce what the currently active value is, indicate whether there is auto-completion, state which component in the widget controls which other components, and generally convey the relationships between the different components of the widget.
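Each of those questions maps to an ARIA state or property. The sketch below shows one possible mapping from widget state to attributes; `ComboboxState` and `comboboxAttributes` are hypothetical, and the exact element that carries each attribute differs between versions of the ARIA combobox pattern, so treat the placement as an assumption and consult the design pattern examples for the real structure.

```typescript
// Hypothetical description of the widget's current state.
interface ComboboxState {
  listboxId: string; // id of the popup list element
  expanded: boolean; // is the list currently shown?
  activeOptionId: string | null; // id of the highlighted option, if any
  autocomplete: "none" | "inline" | "list" | "both";
}

// Derives the ARIA attributes for the combobox element from the state.
function comboboxAttributes(state: ComboboxState): { [attrName: string]: string } {
  const attrs: { [attrName: string]: string } = {
    role: "combobox",
    "aria-expanded": state.expanded ? "true" : "false", // expanded or collapsed
    "aria-controls": state.listboxId, // which component this one controls
    "aria-autocomplete": state.autocomplete, // kind of auto-completion offered
  };
  // The active option is only meaningful while the list is open.
  if (state.expanded && state.activeOptionId !== null) {
    attrs["aria-activedescendant"] = state.activeOptionId;
  }
  return attrs;
}

const openCombobox = comboboxAttributes({
  listboxId: "city-options",
  expanded: true,
  activeOptionId: "city-option-2",
  autocomplete: "list",
});
const closedCombobox = comboboxAttributes({
  listboxId: "city-options",
  expanded: false,
  activeOptionId: "city-option-2",
  autocomplete: "list",
});
```

Deriving the attributes from a single state object keeps them in sync with each other, which is where hand-written composite widgets tend to drift.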

To assist developers in mapping all the accessibility requirements of the various roles, WAI-ARIA made a list of 44 roles and 32 attributes and created one or more code examples for each of them to illustrate different use cases of the role and its accessibility requirements.

The full list is part of the WAI-ARIA Authoring Practices. You can find it here.

One last side note before we wrap up. The role attribute, like the other WAI-ARIA attributes, constitutes significant progress towards an inclusive web. However, the level and manner of support for these attributes vary between assistive technologies and browsers. So, if you intend to rely on an attribute and are not sure how well it is supported, it is a good idea to check, and to look for alternatives if needed.

Wrapping it up

At the beginning of the post, we explored the concept of semantics while looking at René Magritte’s painting, “The Treachery of Images”. We have come to see that the semantics of elements is essential for the accessibility of HTML pages and applications. We have also seen that the semantics of an element consists of attributes and behaviors, all of which need to be considered when the author defines the element’s semantics. Finally, we saw how the role attribute extends native HTML semantics, and how to use it correctly and effectively.

Thank you for reading.