Reading formatting from Word with C#


dzonka1
May 30th, 2008, 04:39 AM
Hi,

I have a Word table and have read its values into a data grid. The last column contains long text whose formatting changes even within a single table cell: for example, one part of the text is bold, another italic, another underlined, another a list, and particular fragments use styles such as Normal, Heading 1, Heading 2, etc.

My question is: how can I programmatically read the formatting of the text in a single cell (or in general) in such a way that I can recognize the style?

Thank you very much in advance for your help. Any tip will be very helpful.

TheCPUWizard
May 30th, 2008, 07:58 AM
Your best bet is to use Word 2007's new .docx format.

1) It is a documented file format; previous versions are not officially documented. Internally, a .docx file is a ZIP archive containing XML files.

2) To properly access previous versions, you need to use the Word object model (the "DOM", Document Object Model) through automation. This requires a valid license for Word on the machine. Note that under "normal" circumstances, Word can NOT be accessed (legally) on a server of any type.
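
For point 2, here is a minimal sketch of what reading formatting through the Word object model might look like from C#. It is untested against any particular document and makes several assumptions: the project references the Microsoft.Office.Interop.Word primary interop assembly, Word is installed and licensed on the machine, the file path is a placeholder, the table of interest is the first one in the document, and it has no merged cells (Table.Cell can throw otherwise). It prints the paragraph style and list type for each paragraph in the last column, then the bold/italic/underline state of each word.

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

class CellFormattingDump
{
    static void Main(string[] args)
    {
        // Placeholder path (assumption); pass a real document as the first argument.
        object path = args.Length > 0 ? args[0] : @"C:\temp\sample.doc";
        object readOnly = true;
        object missing = Type.Missing;
        object noSave = Word.WdSaveOptions.wdDoNotSaveChanges;

        Word.Application app = new Word.Application();
        Word.Document doc = null;
        try
        {
            // Word 2003 interop signature: 16 ref parameters, most left as Type.Missing.
            doc = app.Documents.Open(ref path, ref missing, ref readOnly,
                ref missing, ref missing, ref missing, ref missing, ref missing,
                ref missing, ref missing, ref missing, ref missing, ref missing,
                ref missing, ref missing, ref missing);

            Word.Table table = doc.Tables[1];      // collections are 1-based
            int lastColumn = table.Columns.Count;

            for (int row = 1; row <= table.Rows.Count; row++)
            {
                Word.Range cellRange = table.Cell(row, lastColumn).Range;

                foreach (Word.Paragraph para in cellRange.Paragraphs)
                {
                    // Paragraph style, e.g. "Normal" or "Heading 1" (localized name).
                    Word.Style style = (Word.Style)para.get_Style();
                    Word.WdListType listType = para.Range.ListFormat.ListType;
                    Console.WriteLine("Row {0}: style={1}, list={2}",
                        row, style.NameLocal, listType);

                    foreach (Word.Range word in para.Range.Words)
                    {
                        // Bold and Italic are ints: 0 = off, -1 = on,
                        // wdUndefined (9999999) = mixed within the range.
                        bool bold = word.Bold == -1;
                        bool italic = word.Italic == -1;
                        bool underlined =
                            word.Underline != Word.WdUnderline.wdUnderlineNone;
                        Console.WriteLine("  \"{0}\" bold={1} italic={2} underline={3}",
                            word.Text.Trim('\r', '\a', ' '), bold, italic, underlined);
                    }
                }
            }
        }
        finally
        {
            // Casts avoid the method/event name clash on Close and Quit.
            if (doc != null)
                ((Word._Document)doc).Close(ref noSave, ref missing, ref missing);
            ((Word._Application)app).Quit(ref noSave, ref missing, ref missing);
        }
    }
}
```

Walking the Words collection is a compromise: checking every character gives exact formatting boundaries but is much slower over COM, while checking whole paragraphs returns the "mixed" value whenever formatting changes mid-paragraph.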

dzonka1
May 30th, 2008, 09:08 AM
Unfortunately I have to access version 2003, not 2007.
Could you give me some more tips on how I can use the DOM?