Office 2007 PDFs – Not (always) accessible

Please note there is an update to this article after new information from Microsoft.

Of course, to create an accessible PDF from Word, the Word document itself needs to be properly structured; otherwise it doesn’t matter what you use to create the PDF, it won’t be accessible.

I had previously heard that Office 2007 (or 12 back then) was going to have built-in PDF support, with tagging (i.e. accessible output). Not too long ago I installed Vista and Office 2007 on my work machine, but there was no sign of it. Then I found the Microsoft Save as PDF or XPS Add-in for 2007 Microsoft Office.

It would be great if you could just ‘save as’ (which itself is an improvement on having to remember that PDFs are print-outs) and create a PDF of your document from Word, Excel or PowerPoint.

Whilst preparing the materials for a course on accessible PDFs, I thought I’d give it a go with a really simple document. The results initially looked great (apart from being several times the file size), but then I dug in a little.

Screen shot of Acrobat Pro showing the first page and the tags dialogue open. There is no text in the tags, everything is an image.

The document is very simple, with just headings, lists, paragraphs and images, but all the text is saved as images.

These images do have ‘actual text’, but that just strikes me as a work-around. According to Adobe’s own Accessibility Techniques, ‘Actual Text’ is for:

text that will be sent to a screen reader. If text is entered, the entered text will be read, and not the text that may comprise the displayed document content. This property should be entered only if you want something other than the content of an element to be read by a screen reader. Basically, this is replacement text.

For comparison, ‘Alternate Text’ is additional or descriptive text that can be used to describe an image, formula, or other item in the document that does not translate naturally into text.

So it isn’t wrong to use this attribute for passing text through to alternative readers, but why isn’t it just text?

Compared to the same document created from Word 2003 & Acrobat 8:

Screen shot of the same document but showing text within each tag.

The difference it makes in a screen reader is fairly stark, even if you consider it technically accessible:

how

difficult

is

it

to

understand

like

this?

Because of all the separated images a screen reader can’t tell what’s a sentence and what’s just a word. (These simple test documents can be made available if you want to check this.)

Worse still, the accessibility features built into Acrobat are useless:

  • Reflowing the document to fit the screen width just blanks out the content.
  • Zooming into the text shows blockier text than when using an Acrobat created PDF.
  • Using different colour schemes doesn’t work (again, blank content).
  • You can’t select text to copy and paste elsewhere (not strictly accessibility, but annoying).

I would say that this is a very misleading dialogue:

The Office 2007 PDF options dialogue, showing a tick for "Document structure tags for accessibility"

Given that the Office 2007 PDF plug-in is a free download (once you’ve paid for Office 2007 of course), I was hoping that most people internally could use that instead of paying the Adobe tax for accessible PDFs.

Unfortunately, without providing reasonably accessible output, I can’t use or recommend that option.
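Since the tick-box in that dialogue says nothing about what actually ends up inside the tags, it is worth looking at the output itself. The following is only a rough sketch of one way to do that without Acrobat Pro, not a proper PDF parser: it inflates the Flate-compressed streams in a file and counts text-painting operators (Tj, TJ) against XObject-painting operators (Do, which is how images are drawn). The function name count_paint_ops and the regular expressions are my own illustration; the approach assumes an unencrypted file, skips any stream that is not Flate-compressed, and could in principle miscount if binary data happens to resemble an operator.

```python
import re
import zlib

# Raw bytes between each "stream" keyword and the following "endstream".
STREAM_RE = re.compile(rb"(?<!end)stream\r?\n(.*?)endstream", re.DOTALL)
# Tj and TJ paint text; their operand ends in ")" or ">" (a string) or "]" (an array).
TEXT_OP_RE = re.compile(rb"[)>]\s*Tj\b|\]\s*TJ\b")
# Do paints a named XObject, which is how images are placed on the page.
DO_OP_RE = re.compile(rb"/\S+\s+Do\b")


def count_paint_ops(pdf_bytes: bytes) -> dict:
    """Crudely count text-painting and XObject-painting operators in a PDF."""
    counts = {"text": 0, "xobject": 0}
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            data = zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate-compressed (e.g. a JPEG image), so skipped
        counts["text"] += len(TEXT_OP_RE.findall(data))
        counts["xobject"] += len(DO_OP_RE.findall(data))
    return counts
```

Calling count_paint_ops(open("test.pdf", "rb").read()) on a file (the file name is just a placeholder) gives a quick hint: a result with zero text operators and many XObject operators would be consistent with the text having been saved as images, though the tags panel in Acrobat remains the real test.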

Update – it’s a font thing

Thanks to the quick responses from Microsoft, it turns out not to be as bad as initially thought. The use of a non-standard font seems to have triggered the use of bitmapped text. I changed the font to Arial, and the structure and text came out fine, possibly clearer than the Adobe version.

Screenshot of the acrobat file and tags dialogue open, showing a nice HTML like structure.

In comparison to the non-standard fonts version:

  • Reflowing the document to fit the screen width shows the text content, but blanks out the image.
  • Zooming into the text is fine.
  • Using different colour schemes is fine.
  • You can select text to copy and paste elsewhere.

So, much better, although I can’t work out why the image disappears.

However, half the reason we use PDF in the first place is that we can embed our (nicer) company font into documents we send out. The Microsoft test team are currently investigating why this font triggered the issue, but for now the advice (from me) has to be to use ‘standard’ fonts and test thoroughly for accessibility.
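One thing that can be checked in the meantime is the font’s own embedding permission. TrueType and OpenType fonts record this in the fsType field of their OS/2 table. The sketch below reads that field directly from a font file; it is my own illustration rather than a description of how the add-in makes its decision, the function names read_fstype and describe_fstype are made up for the example, and it assumes a single .ttf or .otf file rather than a .ttc collection.

```python
import struct

# Embedding-permission flags defined for the OS/2 table's fsType field.
FSTYPE_FLAGS = {
    0x0002: "restricted licence: may not be embedded",
    0x0004: "preview and print embedding",
    0x0008: "editable embedding",
    0x0100: "no subsetting",
    0x0200: "bitmap embedding only",
}


def read_fstype(font_bytes: bytes) -> int:
    """Return the raw fsType value from a TrueType/OpenType font's OS/2 table."""
    # The table count is a 16-bit big-endian integer at byte 4 of the header.
    num_tables = struct.unpack_from(">H", font_bytes, 4)[0]
    for i in range(num_tables):
        # Each 16-byte table record holds: tag, checksum, offset, length.
        tag, _checksum, offset, _length = struct.unpack_from(
            ">4sLLL", font_bytes, 12 + 16 * i
        )
        if tag == b"OS/2":
            # fsType is the fifth 16-bit field, 8 bytes into the table.
            return struct.unpack_from(">H", font_bytes, offset + 8)[0]
    raise ValueError("no OS/2 table found")


def describe_fstype(fs_type: int) -> list:
    """Translate an fsType value into human-readable permission labels."""
    labels = [label for bit, label in FSTYPE_FLAGS.items() if fs_type & bit]
    if (fs_type & 0x000E) == 0:
        # None of the three restriction bits set means installable embedding.
        labels.insert(0, "installable embedding")
    return labels
```

For example, describe_fstype(read_fstype(open("CompanyFont.otf", "rb").read())) would list the permissions for a font file (the file name here is a placeholder). Whatever the flags say, the font’s licence agreement is still the final word on what you are allowed to do.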

14 contributions to “Office 2007 PDFs – Not (always) accessible”

  1. Well, it was something to hope for — but I can’t claim to be surprised that this is not what Microsoft claimed.

    Disappointing, to be sure.

  2. I can’t tell from the screen-shot, but it sounds like you used a font in the original document which could not be embedded in the PDF. (Likely because the rights bits on the font indicated that embedding was not permitted.)

    If this is what’s going on here, you should get much better results (both with accessibility and file size) by changing the document font to a more standard one, such as Arial/TNR or Cambria/Calibri.

  3. on the subject of office suites and PDF, do you have any experience with OpenOffice? its PDF output does create tagged PDFs, but i’m very annoyed that,from a presentation i carefully crafted in OO’s powerpoint equivalent, none of my image ALTs come through in the PDF version, and for some reason any links are flagged as not being in the document tree by the Acrobat accessibility check. am i missing something?

  4. Hi Jeff,

    You are right, it’s not a standard font, I’ll try again tomorrow with a standard one.

    Still, I’m used to Acrobat being able to embed fonts (that’s half the reason we use it), is this not possible with MS Office PDFs?

    Patrick: I have a little experience, but because we had formatting issues between it and MS Office it’s not a route we have tried much.

    Since you need Acrobat Pro to even check the accessibility, we’ve tended to stick with Office & Acrobat.

    Is the link read out using the built in reader? That might give an indication if it’s really a problem.

  5. Alastair – good to hear that switching fonts cleared up the issue for you. The PDFs we generate will embed fonts, but only if the rights bits in the font are set to indicate that the font may be embedded. It turns out that these bits are not always set the way that your license agreement as the purchaser of the fonts would lead you to expect, but the bits are all we have to go on.

    The copy we have of the Myriad Pro Condensed font does indicate that it may not be embedded. You may get better results with other fonts you have licensed.

  6. Jeff – thanks for the update, but without the ability to embed fonts we’ve bought, it means the plug-in is not an option for us.

    How does acrobat get around that?

    Also, have you looked at how they come out when re-flowing the document? I couldn’t see why the image disappeared.

  7. Alastair -
    Unless we can determine that we are reading the permission bits wrong, (and we are re-checking this) then the only solution for you is going to be for font vendors who want to permit font embedding to mark their fonts accordingly. As you note, there may be other tools out there that interpret these rights differently and if you do have the rights to embed these fonts, you may need to use one of those other tools to do so. (Perhaps Adobe the font vendor has an agreement not to sue Adobe the Acrobat producer?)

    Not sure what is happening to drop the image on reflow. Let’s take that up over e-mail.

  8. Adobe say otherwise:

    “All fonts produced by Adobe Systems can be embedded in Adobe® Portable Document Format (PDF) files, as well as other types of files”
    http://www.adobe.com/type/embedding.html

    “Adobe permits embedding certain typefaces into documents for the purposes of viewing and printing only. Documents with embedded typefaces may not be edited unless those embedded typefaces are licensed to and installed on the computer doing the editing.”
    http://www.adobe.com/type/topics/licenseqa.html#q23

    “Embedding: Editable”
    http://www.adobe.com/type/browser/P/P_1729.html

    Also, looking at the version of the font installed on my system, ‘Embedding rights’ is ‘Editable embedding’

    “Editable: Fonts with an editable embedding permission can be embedded in electronic documents, and the embedded font can then be used by the recipient of the electronic document to view, print and further edit or modify the text and structure of the document in which it is embedded. These changes or edits can then be saved in the original document. Several fonts in the Adobe Type Showroom, including all Adobe Originals typefaces, other Adobe-owned typefaces and certain third-party font foundry typefaces, allow for editable embedding.”
    http://www.adobe.com/type/browser/info/embedding.html

    Am I looking at the wrong property, or do we have different versions of the font?

  9. One of the test team kindly mentioned how you can check (in Office 2007) with Publisher (more on that later):

    • Open Publisher
    • Type something into a textbox, and format in the font to test.
    • Go to tools > commercial printing tools > fonts.
    • Check the dialogue for “may not embed” restrictions.

    And trying that, it does indeed say “may not embed”.

    The file properties say “Font embeddability: Editable”, but the licence line is blank, and there is nothing else relevant and readable in the file itself except the URL:
    http://www.adobe.com/type/legal.html

    Which leads onto legalese that I don’t feel qualified to try and translate!

  10. Had to use an xbar for statistical equations (average). It is in the collection in Word, but only in the WP MathB font. Adobe wouldn’t convert it into .pdf format, reporting:
    “[ Error: WP-MathB cannot be embedded because of licensing restrictions. ]
    [ Font vendor (Alts) does not permit this font to be embedded in PDF. ]”

    I can’t find it in any of the other acceptable Word XP fonts. The only workaround is a pain when you use it hundreds of times.

  11. I tried exporting my two-page resume.

    If I use Cambria, the file size is 182 KB, even tho Cambria is listed as an embedded font. For some reason Arial (“ArialMT”) is also embedded.

    If I use only Arial, the file size goes down to 65 KB.

    If I export from OpenOffice 2, the file size is 43 KB, with Cambria embedded, with no apparent loss of formatting.

  12. I tried to use “save as pdf and xps” tool in office 2007 to save word file with many mathematical equations as pdf but there were errors in some of math. equations (strange shapes instead of equations) after the first pages.
