How can the docPr id attribute corrupt a .docx document (HRESULT 0x80004005)?

I have a corrupt document that was produced by document-generation software, but I don't see any good reason why the file is corrupt.

The file contains many images (217 if my count is correct).

I have modified the file to simplify it and to remove sensitive information, and the corruption is still the same as before the modifications. Changing a single character (1 byte) removes the corruption, but I don't understand why it happens in the first place.

Word reports the following error:

"The file xxx cannot be opened because there are problems with the contents." Details: HRESULT 0x80004005

Location: Part: /word/footer5.xml, Line: 0, Column: 0

The footer5.xml file contains an image which has the following docPr tag:

<wp:docPr id="299" name="Group 27"/>

I edited the file manually and changed the docPr id, for example to:

<wp:docPr id="300" name="Group 27"/>

After doing this, the file is no longer corrupt.

Attached are the corrupt file with value 299, and the valid file with value 300.

Changing the id to any of the values 300, 301, or 2990 results in a non-corrupt document.

Changing the id to any of the values 29, 297, or 298 results in a corrupt document.

So my guess is that, for some reason, an id below 300 in this tag triggers the bug.

However, there are many other docPr ids below 300 in the document, so I don't see why this particular one causes problems.
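
One thing that can be checked at this point is whether id 299 (or any other id) is used more than once across the parts of the package. The following is a minimal Python sketch, assuming the .docx is an ordinary zip package and that a regular-expression scan of the parts under word/ is good enough for a diagnosis. The function names and the file name in the usage note are hypothetical, and a duplicate id is only a hypothesis here, not a confirmed cause of the error.

```python
import re
import zipfile
from collections import defaultdict

# Matches the id attribute of a <wp:docPr ...> tag. The whitespace before
# "id" keeps other attributes whose names merely end in "id" from matching.
DOCPR_ID = re.compile(rb'<wp:docPr\b[^>]*?\sid="(\d+)"')


def collect_docpr_ids(docx):
    """Map each docPr id to the list of package parts that use it.

    `docx` may be a file path or a binary file-like object.
    """
    parts_by_id = defaultdict(list)
    with zipfile.ZipFile(docx) as package:
        for part in package.namelist():
            # Only XML parts under word/ (document, headers, footers, ...).
            if part.startswith("word/") and part.endswith(".xml"):
                for match in DOCPR_ID.finditer(package.read(part)):
                    parts_by_id[int(match.group(1))].append(part)
    return dict(parts_by_id)


def duplicated_ids(parts_by_id):
    """Keep only the ids that occur more than once in the package."""
    return {i: parts for i, parts in parts_by_id.items() if len(parts) > 1}
```

For example, `duplicated_ids(collect_docpr_ids("corrupt.docx"))` would return an empty dict if every docPr id in the package is unique, and otherwise the reused ids together with the parts in which they appear.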

Also, if I remove any one image from word/document.xml, the file is no longer corrupt.

I am using the Word 2016 desktop app on Windows to open the file after generation.

On those word-viewers, the file is not seen as corrupt.

Can you explain why the file is corrupt, and how docPr ids should be set in a generated document?

ms-word
openxml
docx
asked on Stack Overflow Sep 26, 2018 by edi9999

0 Answers

Nobody has answered this question yet.


User contributions licensed under CC BY-SA 3.0