Headings defining document structure

Tomas Jogin has started an interesting discussion on how the choice of heading levels (h1, h2, etc.) can change the perceived structure of a document. For example, running the Clagnut home page through the W3C validator in Outline mode reveals this document structure:

  H2 Site contents
H1 The website of Richard Rutter, a web producer from...
  H2 (blog post) I'm back posted 1 day ago
    H3 (subheading in blog post) So what's been happening...
  H2 (blog post) Back in a week posted 2 weeks ago
  H2 (blog post) Multimap redesign posted 2 weeks ago
  H2 (blog post) Tiger posted 3 weeks ago
  H2 (blog post) Gmail invites posted 3 weeks ago
  H2 (blog post) iTunes Music Store UK is empty posted...
  H2 (blog post) Dynamically underlining accesskeys posted...
  H2 (blog post) Footie mad posted 1 month ago
  H2 (blog post) British Sea Power posted 1 month ago
  H2 (blog post) Collaborative Design posted 1 month ago
  H2 (side bar box) Search blog
  H2 (side bar box – random photo) Back seat drivers
  H2 (side bar box) Switch typefaces
  H2 (side bar box) Blogmarks
  H2 (side bar box) One year ago
  H2 (side bar box) Listening right now
  H2 (side bar box) New Music
  H2 (side bar box) Weather in Brighton
  H2 (side bar box) Webring
  H2 (side bar box) Syndicate
  H2 (blog roll category) Web design
  H2 (blog roll category) Burgled
  H2 (blog roll category) Looks, listens & reads
  H2 (blog roll category) Affiliated efforts
  H2 (blog roll category) Acquaintances

The implication here is that every blog post and every side bar box is an equally important sub-section of the whole page. So, based on headings alone, blog posts are not distinct from side bar boxes, and my blog roll categories are not distinct from other side bar boxes. However, I use other mark-up for this purpose – blog posts are contained in their own list, as is the blog roll – so what’s the problem?

Well, there isn’t really a problem, as the document is still quite well structured, but better use of headings could be useful. Headings can be used to generate tables of contents automatically (there is already a Mozilla sidebar which does this); they are used by JAWS to navigate quickly through a document; and they are used by Google in its ranking algorithms.
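As a rough illustration of how a tool might derive such an outline from headings alone, here is a minimal Python sketch using the standard library's `html.parser`. The names `HeadingOutline` and `outline` and the sample markup are made up for this example; it is not how the W3C validator, the Mozilla sidebar or JAWS actually work, only a sketch of the general idea.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Collect (level, text) pairs for h1-h6 elements in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []   # list of (level, text)
        self._level = None   # level of the heading currently open, if any
        self._parts = []     # text fragments seen inside that heading

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names, so "H2" arrives as "h2".
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            # Collapse runs of whitespace in the heading text.
            text = " ".join("".join(self._parts).split())
            self.headings.append((self._level, text))
            self._level = None


def outline(html):
    """Return one indented line per heading, two spaces per level below h1."""
    parser = HeadingOutline()
    parser.feed(html)
    parser.close()
    return ["  " * (level - 1) + f"H{level} {text}"
            for level, text in parser.headings]


# Hypothetical, heavily shortened stand-in for the home page markup.
sample = """
<h2>Site contents</h2>
<h1>The website of Richard Rutter</h1>
<h2>I'm back</h2>
<h3>So what's been happening</h3>
"""

for line in outline(sample):
    print(line)
```

Note that the sketch only looks at heading elements: the lists that wrap the blog posts and the blog roll never appear in its output, which is exactly why they do nothing for a heading-based outline.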

So it seems my home page structure could be made more useful by moving every existing heading below the h1 down a level (h2s become h3s, and so on) and adding in a few new h2 sub-headings to group them:

H1 The website of Richard Rutter, a web producer from...
  H2 Site contents
  H2 Most recent ten posts
    H3 (blog post) I'm back posted 1 day ago
      H4 (subheading in blog post) So what's been happening...
    H3 (blog post) Back in a week posted 2 weeks ago
    H3 (blog post) Multimap redesign posted 2 weeks ago
    H3 (blog post) Tiger posted 3 weeks ago
    H3 (blog post) Gmail invites posted 3 weeks ago
    H3 (blog post) iTunes Music Store UK is empty posted...
    H3 (blog post) Dynamically underlining accesskeys posted...
    H3 (blog post) Footie mad posted 1 month ago
    H3 (blog post) British Sea Power posted 1 month ago
    H3 (blog post) Collaborative Design posted 1 month ago
  H2 Tools
    H3 (side bar box) Search blog
    H3 (side bar box) Switch typefaces
  H2 Additional Info
    H3 (side bar box) Random Photo
    H3 (side bar box) Blogmarks
    H3 (side bar box) One year ago
    H3 (side bar box) Listening right now
    H3 (side bar box) New Music
    H3 (side bar box) Weather in Brighton
    H3 (side bar box) Webring
    H3 (side bar box) Syndicate
  H2 Recommended Links
    H3 (blog roll category) Web design
    H3 (blog roll category) Burgled
    H3 (blog roll category) Looks, listens & reads
    H3 (blog roll category) Affiliated efforts
    H3 (blog roll category) Acquaintances
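The regrouping can be expressed as a small transformation on a flat list of headings. The following Python sketch assumes each heading has been tagged with a hypothetical `kind` label (such as `"post"` or `"tool"`); the function name `regroup`, the labels and the shortened sample data are all invented for illustration, and the sketch does not reorder anything.

```python
def regroup(headings, groups):
    """Push grouped headings down one level and insert an h2 per group.

    headings: list of (level, text, kind) in document order.
    groups:   dict mapping a kind to the title of its new h2 heading.
    Headings whose kind has no entry in groups are left unchanged.
    """
    result = []
    current_group = None
    for level, text, kind in headings:
        group = groups.get(kind)
        if level == 1 or group is None:
            # The h1 and ungrouped headings keep their level.
            result.append((level, text))
            current_group = None
            continue
        if group != current_group:
            # First heading of a new group: emit the grouping h2.
            result.append((2, group))
            current_group = group
        result.append((level + 1, text))
    return result


# Hypothetical, shortened version of the home page headings.
headings = [
    (1, "The website of Richard Rutter", None),
    (2, "Site contents", None),
    (2, "I'm back", "post"),
    (3, "So what's been happening", "post"),
    (2, "Back in a week", "post"),
    (2, "Search blog", "tool"),
    (2, "Switch typefaces", "tool"),
]
groups = {"post": "Most recent ten posts", "tool": "Tools"}

for level, text in regroup(headings, groups):
    print("  " * (level - 1) + f"H{level} {text}")
```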

This restructuring brings up a question asked by Andy Budd: does it make sense to write a completely structured document only to then hide some of the content? That would be the case with the structure outlined above: I already hide the h1 (it’s there for Google’s benefit) and I would also hide the new h2s. Given how much tools rely on headings to determine document structure, a few hidden ones that add clarity would not be a bad thing.

That we’re having this discussion at all is due to the origins of HTML. It was originally conceived as a way of marking up scientific documents with a conventional heading, sub-heading, sub-sub-heading structure. Nowadays we are trying to apply the same methodology to more mature, hypertextually complex Web pages which, visually and functionally, are somewhat different to academic papers.