This article, written by Jon Ritzdorf, was originally published by WebsiteMagazine in June, 2012.
If your internet presence is an important part of your business, you’ve probably been exposed to at least some buzz around HTML5. So you may already be thinking of making the big switch to this popular markup language. But if you pull the trigger, will you be better positioned for going multilingual with your website in the near future? And how do the particularities of HTML5 impact the website localization process?
Making the switch: many pros, just a few cons
HTML5 is still under development and not expected to become the new markup standard (beyond W3C Recommendation status) for some time to come. So why bother with it now?
For one, it’s the future. HTML5 is already supported by a range of desktop browsers, and that support is growing rapidly. The adoption of HTML5 is spreading quickly as well, thanks to pioneering work by major search engines and social media sites.
On top of that, smartphones and tablets all support HTML5. So if you consider that, by 2014, mobile devices are predicted to eclipse desktops as the preferred platform for internet surfing, a case can certainly be made to switch to HTML5.
Here’s another considerable advantage: well-constructed HTML5 will help with that ever-so-important aspect of your website’s online success, your Google search rankings. HTML5’s semantic-driven tag structure makes websites more Google-friendly. A website built in HTML5 supports multiple devices, renders faster and is smaller in download size, attributes that will earn you Google’s favor and a higher ranking in natural search results.
One small caveat, however, is that older versions of IE (pre-IE8) do not respond well to HTML5. So if your user base or target audience relies heavily on IE7 or – horror – IE6, and can’t upgrade to newer versions, switching to HTML5 will be problematic. Businesses with a strong presence in China should be careful, as IE6 still has a rather large user base in that market.
New HTML5 features that impact multilingual sites
Some of the biggest changes that HTML5 offers don’t affect multilingual sites significantly, at least not at first glance. Digging a little deeper, however, you’ll find a few elements to address as you strategically groom your website for multilingual web users:
Instead of having to rely on plug-ins like Flash, you can now embed audio and video elements directly in your HTML code. The standard controls that come with the video element are not very sophisticated, but they have the advantage of being language independent, using generic and widely accepted icons instead of text. Not relying on a closed – and therefore hard to localize – system like Flash is, obviously, beneficial.
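As a rough sketch of what such an embed might look like (the file names and the French subtitle track are hypothetical placeholders, not taken from a real site):

```html
<!-- File names below are made up for illustration. -->
<video controls width="640">
  <source src="product-tour.mp4" type="video/mp4">
  <source src="product-tour.webm" type="video/webm">
  <!-- Optional subtitle track; srclang identifies the subtitle language. -->
  <track kind="subtitles" src="product-tour.fr.vtt" srclang="fr" label="Français">
  <!-- Fallback text, shown only by browsers without HTML5 video support. -->
  Your browser does not support HTML5 video.
</video>
```

The built-in play, pause and volume controls need no translation. In this sketch, the only strings a translator would have to touch are the fallback sentence and the subtitle files themselves.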
The new canvas tag creates a space that allows the creation of dynamically drawn graphics, including text. Depending on what’s being done to text, the canvas could complicate localization. When implementing canvas animations, you’ll have to keep in mind that the text is going to be translated. This will require an internationalization mindset to make sure that translation into other languages doesn’t dictate a major overhaul of your code.
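One possible way to apply that mindset is to keep the text out of the drawing script, so it can be handed over for translation with the rest of the page. A minimal sketch, assuming a hypothetical banner canvas and a data-label attribute chosen purely for this illustration:

```html
<!-- The id "banner" and the data-label attribute are assumptions for this sketch. -->
<canvas id="banner" width="400" height="80" data-label="Welcome to our store">
  Welcome to our store
</canvas>
<script>
  var canvas = document.getElementById("banner");
  var ctx = canvas.getContext("2d");
  // Read the text from the markup instead of hard-coding it in the script.
  var label = canvas.getAttribute("data-label");
  ctx.font = "24px sans-serif";
  ctx.textAlign = "center";
  // The fourth argument caps the rendered width, so a longer translation
  // is compressed to fit rather than running off the canvas.
  ctx.fillText(label, canvas.width / 2, 48, canvas.width - 20);
</script>
```

With this approach, a German or French string that runs longer than the English original changes only the markup, not the drawing code.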
One of the major changes, as far as the actual HTML tags are concerned, is the move towards more semantic markup. New tags and attributes have been added, many of them far more descriptive in nature, enabling a more meaningful organization of the content. This will help you create cleaner and more understandable code. Moreover, it will be easier for machines or applications like web crawlers, translation apps or content parsers to interpret content structure and understand the function of page elements.
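To illustrate, a page skeleton using a few of the new structural tags might look like the following. The company name, links and headings are made-up placeholders:

```html
<!DOCTYPE html>
<!-- lang declares the page language for browsers, crawlers and translation tools. -->
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Acme Cameras</title>
</head>
<body>
  <header>
    <h1>Acme Cameras</h1>
    <nav><a href="/products">Products</a> <a href="/support">Support</a></nav>
  </header>
  <article>
    <h2>Choosing your first camera</h2>
    <p>Article text goes here.</p>
  </article>
  <footer><p>Contact us</p></footer>
</body>
</html>
```

Each tag name states the role of its content, which is what lets a parser tell navigation apart from article text without guessing from class names.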
New tags and attributes from a localization perspective
Let’s have a more detailed look at some of the new tags and attributes, specifically from a multilingual and localization perspective.
The translate attribute
Remember the days when you’d create lists of items NOT to translate before handing over your files to your translation vendor? Or when you’d have to write detailed instructions for which elements should stay in English? With the new HTML5 translate attribute, this cumbersome task is thankfully a thing of the past. You can now flag these elements straight in the HTML code as text that should not be translated with translate="no". Your localization vendor will know immediately that the content of the flagged element should stay untouched.
Example: the user manual for your new line of cameras tells the user to click the “Flash Off” button on the camera:
<p>Select <span class="menuItem">Flash Off</span> if you don’t want to use the camera’s flash light.</p>
When translating the user manual, terms referring to the camera’s menu should stay in English. The translate attribute now lets you mark this item as such:
<p>Select <span class="menuItem" translate="no">Flash Off</span> if you don’t want to use the camera’s flash light.</p>
The translate attribute can be used on any element and takes only “yes” or “no” as its value. When set to “no”, automated or workbench translation tools should lock the text of that element so it won’t get translated. Read this blog post for a more detailed discussion of this helpful translate attribute.
<details> and <summary> tags
<details>
<summary>Acclaro’s service commitment</summary>
<p>Our clients come first and meeting their needs throughout the life of a project is primary.</p>
</details>
In this example, on page load, only the summary would be visible to the user. The rest of the content in the details tag would be hidden. The user can now click the summary to show the rest of the content.
Obvious uses for these tags are FAQs, HTML Help systems, collapsible TOC sections, etc. These tags come with a significant downside, however: currently, they are supported only by Chrome.
As far as translation goes, any professional localization vendor should know that content within these tags needs to be translated, so there shouldn’t be a need for you to provide additional instructions. Moreover, their QA team and reviewers will know to click the summary to open and review the rest of the details content.
Ruby annotations are pieces of text that show up alongside the so-called base text. They usually provide additional information about the base text, such as its pronunciation, and are most common in East Asian documents. They can also show translations right above the source text. Here are some examples:
Ruby providing translation
Ruby providing pronunciation guidance
Examples taken from http://www.useragentman.com/blog/2010/10/29/cross-browser-html5-ruby-annotations-using-css/
HTML5 supports Ruby annotation through the following four tags:
<ruby> The Ruby tag
Specifies a Ruby Annotation.
<rb> Ruby base tag
Specifies the base text to be annotated.
<rt> Ruby text
Specifies the actual annotating Ruby text. By default, it appears above the base text.
<rp> Ruby parenthesis
Specifies fallback text, typically parentheses around the Ruby text, for browsers that don’t support Ruby annotations.
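As a small illustration of how these tags fit together, here is a sketch that annotates the two characters of the Japanese word for "kanji" with their readings. In this form the base text is written directly inside the ruby tag:

```html
<ruby>
  漢 <rp>(</rp><rt>kan</rt><rp>)</rp>
  字 <rp>(</rp><rt>ji</rt><rp>)</rp>
</ruby>
```

A browser that supports Ruby shows "kan" and "ji" above the characters and hides the parentheses. One that doesn't falls back to inline text along the lines of 漢 (kan) 字 (ji).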
Because the use of Ruby annotations can complicate the website translation process, you’ll want to partner with a translation agency specializing in website localization. The experts will know how to work with these annotations across your target languages.
The contenteditable attribute
Have you ever dreamed of adding sticky note functionality to your website, allowing users to jot down comments directly on your webpages? Or have you ever wished your translation agency could directly access your web copy to extract text for translation? Your wishes are granted with this new attribute. All you need to do is set contenteditable to true for any element, and the user or translator can then click on it to directly change or copy the text.
Here’s how it works: by default, elements with the contenteditable attribute set will have a grey border as the user hovers over them. Clicking inside the element will give the user access to its content. The user can then edit or copy it directly.
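A minimal sketch of an editable note, with placeholder text invented for this example:

```html
<!-- Any element can be made editable; a div is used here for illustration. -->
<div contenteditable="true" lang="en">
  Click here and start typing to change this note.
</div>
```

Note that edits made this way live only in the visitor's browser. Saving them back to your site would take additional scripting, which this sketch leaves out.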
The spellcheck attribute
If you want to provide spell check for user-generated content, you can now use the new spellcheck="true" attribute. Adding this attribute to an editable element will trigger spell check when the user adds or changes text. The following elements can be spellchecked:
Text values in input elements (exception: password)
Text in <textarea> elements
Text in editable elements
For this to work correctly in the context of a multilingual site, you need to make sure the right language is used for the spell check. The browser will use the language set for the edited element itself first. If the element itself does not have a language set, the setting of its parent element will be used, and so on. Here’s an example:
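The markup below is a sketch put together to match the explanation that follows. The page title is an assumed placeholder:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Spell check demo</title>
</head>
<body>
  <!-- No lang of its own: inherits "en" from the html element. -->
  <textarea spellcheck="true"></textarea>
  <!-- Own lang attribute: French. -->
  <textarea spellcheck="true" lang="fr"></textarea>
  <div lang="ru">
    <!-- No lang of its own: inherits "ru" from the enclosing div. -->
    <textarea spellcheck="true"></textarea>
  </div>
</body>
</html>
```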
In the above sample code, the first text area will be spell checked in English (html lang="en"). The second text area has French set as its language, so it will be spell checked in French. The third text area does not have a language set, so it will use the setting of its nearest parent, which is the <div> with lang="ru" (Russian).
Pay extra attention to the language settings of your elements when you set up spell check for a site that will be in multiple languages.
Now that you’ve seen what HTML5 can do for your website, both in terms of developer conveniences and localization process shortcuts, there’s nothing to stop you from taking full advantage of this wonderful new language. And there are many resources to support you as you make the switch.