WYSIWYG editor spec - allowed HTML

Starting from the building blocks, what should an editor allow? Most browser-based editors let people edit the source HTML. Indeed, one of the ATAG guidelines is:

Allow the author to edit all properties of each element

Allowing people to edit the source is one possible way of meeting that. However, one of the top-priority guidelines is also:

Ensure that the tool automatically generates valid markup.

So the editor will have to filter what gets saved. The requirement is to allow the developer to restrict which HTML elements and attributes are allowed.

Allowed HTML

This filter should remove anything not recognised as valid HTML and, in fact, anything that doesn’t separate style from content. That leaves the following (X)HTML elements (list taken from W3Schools):

  • <a>
  • <abbr>
  • <acronym>
  • <address>
  • <area>
  • <blockquote>
  • <br>
  • <caption>
  • <cite>
  • <code>
  • <col>
  • <colgroup>
  • <dd>
  • <del>
  • <div>
  • <dfn>
  • <dl>
  • <dt>
  • <em>
  • <h1> to <h6>
  • <hr>
  • <i>
  • <img>
  • <ins>
  • <kbd>
  • <li>
  • <map>
  • <object>
  • <ol>
  • <p>
  • <param>
  • <pre>
  • <q>
  • <samp>
  • <span>
  • <strong>
  • <sub>
  • <sup>
  • <table>
  • <tbody>
  • <td>
  • <tfoot>
  • <th>
  • <thead>
  • <tr>
  • <ul>
  • <var>

Assuming that the editor opens only part of the page rather than the whole of it, items outside the content area (e.g. base, title) have been removed.
The list also removes deprecated elements such as font.

Some items, such as form elements, frames, and scripts, should not be edited by a WYSIWYG editor. These could be included with specialist functionality, but that is beyond the scope of this article.

<dfn>, <samp>, <kbd> and <var> are also of limited value, and could be ignored.
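
To make the filtering idea concrete, here is a minimal sketch in Python using the standard library's html.parser. The names ALLOWED_ELEMENTS, ElementFilter and filter_elements are made up for illustration, the allowlist is abbreviated, and attributes are simply dropped. It is not taken from any particular editor.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical, abbreviated allowlist based on the element list above.
ALLOWED_ELEMENTS = {
    "a", "abbr", "blockquote", "br", "cite", "code", "em", "h1", "h2",
    "h3", "img", "li", "ol", "p", "pre", "strong", "table", "td", "th",
    "tr", "ul",
}
# Elements that never have a closing tag.
VOID_ELEMENTS = {"br", "img"}


class ElementFilter(HTMLParser):
    """Drops tags that are not on the allowlist but keeps their text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_ELEMENTS:
            # Attributes are dropped entirely in this sketch.
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_ELEMENTS and tag not in VOID_ELEMENTS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Re-escape text so it cannot be read back as markup.
        self.out.append(escape(data, quote=False))


def filter_elements(html):
    parser = ElementFilter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


print(filter_elements('<p>Hello <font color="red">big</font> <em>world</em></p>'))
```

Note that the text inside a removed element is kept, so a real filter would probably want to discard the contents of elements such as script altogether.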

Attributes allowed

Removing undesired tags is only half the battle: any element can be subverted by applying inline styles or event handlers, both of which belong in separate CSS or JavaScript files. The core attributes to allow would be:

  • class
  • id
  • title
  • dir
  • lang

I would not include style, as this would allow inline styling. I also wouldn’t include tabindex or accesskey. Accesskeys have various problems, and although there are good accesskey implementations, they should be part of the site templates, not editable by authors. Tabindex is primarily useful for elements such as form controls and should only be tackled (if at all) in a consistent, site-wide way, so it should not be controlled by the author either.
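
As a rough illustration of the core attribute rule, the following Python sketch keeps only the five attributes listed above and silently drops everything else, including style, tabindex and accesskey. The names CORE_ATTRIBUTES and keep_core_attributes are hypothetical, and the (name, value) pair format is an assumption chosen to match what Python's html.parser hands to its tag handlers.

```python
# Core attributes allowed on any element, as listed above.
CORE_ATTRIBUTES = {"class", "id", "title", "dir", "lang"}


def keep_core_attributes(attrs):
    """Return only the (name, value) pairs whose name is a core attribute."""
    return [
        (name, value)
        for name, value in attrs
        if name.lower() in CORE_ATTRIBUTES
    ]


attrs = [("class", "note"), ("style", "color:red"), ("tabindex", "1"), ("lang", "en")]
print(keep_core_attributes(attrs))
```

Because this is an allowlist rather than a blocklist, attributes nobody thought about in advance are removed by default.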

Element Specific Attributes

Some elements need particular attributes to add meaning, alt text on images for example. These are the attributes for each element that should be allowed:

  • a
    • charset
    • coords
    • href
    • hreflang
    • rel
    • rev
    • shape
    • type
  • area
    • alt (required)
    • coords
    • href
    • nohref
    • shape
  • blockquote
    • cite
  • col
    • span
  • del
    • cite
    • datetime
  • img
    • alt (required)
    • src (required)
    • width
    • height
    • ismap
    • longdesc
    • usemap
  • ins
    • cite
    • datetime
  • map
    • name
  • object
    • archive
    • classid
    • codebase
    • codetype
    • data
    • declare
    • height
    • name
    • standby
    • type
    • usemap
    • width
  • param
    • name (required)
    • type
    • value
    • valuetype
  • q
    • cite
  • table
    • summary
  • td
    • abbr
    • axis
    • colspan
    • headers
    • rowspan
    • scope
  • th
    • abbr
    • axis
    • colspan
    • headers
    • rowspan
    • scope
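
The per-element rules above could be held in a simple lookup table. The Python sketch below is one possible shape for that, assuming attributes arrive as a plain dict. The names ELEMENT_ATTRIBUTES, REQUIRED_ATTRIBUTES and check_attributes are invented for this example, and the table is abbreviated to a few elements.

```python
# Core attributes allowed on every element.
CORE_ATTRIBUTES = {"class", "id", "title", "dir", "lang"}

# Hypothetical, abbreviated mapping of element -> extra allowed attributes.
ELEMENT_ATTRIBUTES = {
    "a": {"charset", "coords", "href", "hreflang", "rel", "rev", "shape", "type"},
    "blockquote": {"cite"},
    "img": {"alt", "src", "width", "height", "ismap", "longdesc", "usemap"},
    "table": {"summary"},
    "td": {"abbr", "axis", "colspan", "headers", "rowspan", "scope"},
    "th": {"abbr", "axis", "colspan", "headers", "rowspan", "scope"},
}

# Attributes marked "(required)" in the list above.
REQUIRED_ATTRIBUTES = {
    "area": {"alt"},
    "img": {"alt", "src"},
    "param": {"name"},
}


def check_attributes(tag, attrs):
    """Return (kept attributes, names of required attributes that are missing)."""
    allowed = CORE_ATTRIBUTES | ELEMENT_ATTRIBUTES.get(tag, set())
    kept = {name: value for name, value in attrs.items() if name in allowed}
    missing = REQUIRED_ATTRIBUTES.get(tag, set()) - set(kept)
    return kept, missing


kept, missing = check_attributes(
    "img", {"src": "cat.png", "onclick": "x()", "style": "border:0"}
)
print(kept, missing)
```

What the editor does with a missing required attribute (prompt the author, refuse to save, drop the element) is a separate design decision that this sketch leaves open.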

You may have noticed that there were no event attributes for JavaScript. These shouldn’t be part of an HTML editor, as per the principles of Unobtrusive JavaScript.

Editor support for restricting HTML

This section of the page could change. These are just preliminary notes, which I’ll update at the end of the series. If I’ve missed something (quite likely when trying 3 things at once!), please leave a comment, preferably with a useful link.

TinyMCE

The documentation outlines how to restrict the output with a configuration option called valid_elements, which is aimed at providing exactly this functionality. Subject to experimentation, TinyMCE seems to support this requirement well.
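
As I understand the TinyMCE documentation, valid_elements takes a comma-separated string in which each element is followed by its allowed attributes in square brackets, separated by pipes (for example a[href|title]). Assuming that syntax, a small Python helper could generate the string from the lists in this article. The helper name to_valid_elements and the rules dictionary are hypothetical, and the output should be checked against the documentation for the TinyMCE version in use.

```python
# Core attributes to allow on every element.
CORE_ATTRIBUTES = ["class", "id", "title", "dir", "lang"]


def to_valid_elements(rules):
    """Build a valid_elements-style string from {element: [extra attributes]}."""
    parts = []
    for tag, extra in rules.items():
        names = CORE_ATTRIBUTES + list(extra)
        parts.append(f"{tag}[{'|'.join(names)}]")
    return ",".join(parts)


# Abbreviated example rules; a real configuration would cover the full lists.
rules = {
    "p": [],
    "a": ["href", "rel"],
    "img": ["alt", "src", "width", "height"],
}
print(to_valid_elements(rules))
```

Generating the string from one table keeps the editor configuration and any server-side filter in step with each other.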

FCKeditor

The FCKeditor documentation doesn’t mention a configuration option for this purpose. Although you can remove toolbar options easily, unless you take off the HTML button there doesn’t appear to be a way of enforcing certain HTML.

Xstandard

Although the Xstandard documentation doesn’t give any indication of how to restrict the available tags, it does appear to restrict people to valid XHTML. In fact, the code produced when you view source is excellent, actually well-spaced XML rather than HTML.

I suspect this is a major reason Xstandard uses a plugin rather than using JavaScript - they aren’t hamstrung by what each browser produces. (The JavaScript editors have to transform and filter the code generated by the browser). However, using a plugin impacts the interoperability, as it doesn’t support as many browsers or operating systems.

Still, even without this specific configuration option, it is leagues ahead of FCKeditor (and ahead of a default installation of TinyMCE). It almost supports this requirement for restricting the code allowed anyway, although it would be good to remove many of the ‘properties’ in the interface.


11 contributions to “WYSIWYG editor spec - allowed HTML”

  1. You wrote:

    using a plugin impacts the interoperability, as it doesn’t support as many browsers or operating systems.

    If you count the number of OS and browser/version combinations, I think you will find that XStandard will run on more systems than pure JavaScript editors. Most JavaScript editors do not work on Safari or Opera or offer very limited features.

    The same JavaScript editors also generate different markup in different browsers which is not very interoperable.

  2. Hi Vlad,

    Fair point, although at the time I wrote this, there wasn’t a Mac version (or at least it was in beta). Out of interest, is there a Linux version of XStandard? I couldn’t see an easy ‘requirements’ link on the site, although I’m sure there must be something there.

    The same JavaScript editors also generate different markup in different browsers which is not very interoperable.

    That’s going to be very annoying in testing, and one of the reasons there is browser/platform info on the checklist.

  3. I would like to see this in Linux. How hard can it be to make a plugin that already exists for Firefox work in Firefox for Linux as well? Isn’t it all just JavaScript?

  4. The JavaScript ones (e.g. TinyMCE and FCKeditor) should work on Linux in Firefox. Xstandard doesn’t yet, but could support it in future if there were enough demand.
