In last month’s article, we followed how Hypertext Markup Language (HTML) developed from its humble origins to become the dominant platform for publishing, entertainment, and commerce. Before we can understand how HTML 5 can help us, I want to take a deeper look at HTML’s markup, some of the problems associated with it, and what it was missing.

Basic Markup

Let’s start by revisiting HTML’s most basic premise. As its name implies, HTML is a markup language that provides tags for creating web pages. With a few exceptions, an element consists of a start tag (<tag>) and an end tag (</tag>). For example, to declare a web page you use the <html></html> tags. The web page itself is divided into two parts: a head that includes information about the page, and a body that contains the page’s content:

<html>
    <head>
        <title>Page Title</title>
    </head>
    <body></body>
</html>

The body section includes headings that divide the page into sections of different levels and paragraphs that hold the body text, for example:

<h1>Section 1</h1>
<p>In this section...</p>
<h2>Section 2</h2>

The text inside these elements can be formatted as bold (<b>), italics (<i>), etc.

HTML pages can also include lists, images, links, and tables.
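
As a rough sketch of what these look like in practice, the fragment below shows one of each. The file name photo.jpg, the example.com address, and all of the text are placeholders made up for illustration:

<ul>
    <li>First item</li>
    <li>Second item</li>
</ul>
<img src="photo.jpg" alt="A photo">
<a href="https://www.example.com/">Visit Example</a>
<table>
    <tr>
        <th>Name</th>
        <th>Price</th>
    </tr>
    <tr>
        <td>Widget</td>
        <td>$10</td>
    </tr>
</table>

Note that <img> is one of the exceptions mentioned earlier: it has a start tag but no end tag.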

Growing Pains

As content creators and web developers tried to expand beyond creating basic home pages, it became apparent that HTML’s simplicity was both a blessing and a curse.

Consistently Inconsistent

For a start, HTML’s structure was inconsistent. Some elements (headings, links, tables, and formatting) required both start and end tags; others (paragraphs, horizontal lines, and line breaks) needed only a start tag; and lists (bulleted and numbered) were a mixture of both: the list itself required both tags, but its items needed only a start tag. For example, the following markup is legal HTML:

<h2>Section</h2>
<p>Para 1
<p><b>Para 2</b>
<ol>
     <li>Point 1 
     <li>Point 2
</ol>

At the time, this was thought of as a convenient shortcut and web browsers were designed to be able to handle HTML’s contradictory rules. As the number of platforms and devices has multiplied, we can no longer be sure that a web page created on one platform’s browser will render correctly on another. As a result, web developers must spend more time designing and testing pages. This is one of the key issues that HTML 5 hopes to solve.
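
For comparison, here is the same fragment with every end tag spelled out. It is only a sketch of the structure a browser has to infer from the shortcut version above; the content is unchanged:

<h2>Section</h2>
<p>Para 1</p>
<p><b>Para 2</b></p>
<ol>
     <li>Point 1</li>
     <li>Point 2</li>
</ol>

In the shortcut version, the browser has to work out for itself that each paragraph ends where the next block begins, and that each list item ends where the next item or the end of the list begins.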

It’s Just Semantics

HTML’s basic design metaphor was based on print. It uses the same typographical conventions that enable a human reader to infer semantic meaning and hierarchy. Due to its basic design and its users’ expectations, a web page is a jumble of structure, content, and formatting.
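
A hypothetical fragment in the style of that era shows how the three end up intertwined. The text and attribute values are invented for illustration:

<center><font face="Arial" size="5" color="red"><b>Special Offer</b></font></center>
<p><font size="2">Order before <i>Friday</i> and save 10%.</font></p>

A human reader sees large, red, centered text and infers that the first line is a heading. Nothing in the markup says so; the <center> and <font> tags describe only how the text should look.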

The web evolved quickly and in unforeseen ways. In the early days of the web, all HTML was created by hand and read by humans. While sentient computers, like HAL 9000, Skynet, or Deep Thought, are still in the realm of science fiction, the main consumers of web content are not people, but machines. This means that if your web page cannot be parsed and indexed by the major search engines, it is as if it never existed. Since Google and Bing make money by selling adverts, if content owners can’t attract readers to their sites, they will not be able to attract advertisers.

More Than Words

In November 2006, Google bought the YouTube video-sharing site for $1.65 billion in stock. Apart from the unprecedented sum Google was willing to pay, the deal marked the web’s transition from a text platform to a rich multimedia content distribution platform. In addition to video, the web also provided maps (Google), pictures (Flickr), and music (MySpace).

Video and audio were not the only new forms of content on the web. HTML also lacked good solutions for static and animated graphics. Before HTML 5, web browsers had to rely on non-native plugins to serve multimedia content. At the time, this meant Adobe Flash. Apart from vendor lock-in and other commercial considerations, Flash also had a number of security, performance, and bandwidth issues. As a result, both the browser vendors and content providers wanted better solutions.

Moving On

HTML 5 aims to solve these problems. It will also add new features like geolocation, local storage, and communications. In next month’s article, I will look at HTML 5’s new markup and what it brings to the table. In the meantime, if you want to take a closer look, the HTML 5 spec is a good place to start.