I am trying to find out whether there exist anything in the word document that has a font of 2. However, I have not been able to do this. To begin with, I've tried to read the font of each word in a sample word document that only has one line and 7 words. I am not getting the correct results.
Here is my code:
HWPFDocument doc = new HWPFDocument (fileStream);
WordExtractor we = new WordExtractor(doc);
Range range = doc.getRange()
String[] paragraphs = we.getParagraphText();
for (int i = 0; i < paragraphs.length; i++) {
Paragraph pr = range.getParagraph(i);
int k = 0
while (true) {
CharacterRun run = pr.getCharacterRun(k++);
System.out.println("Color: " + run.getColor());
System.out.println("Font: " + run.getFontName());
System.out.println("Font Size: " + run.getFontSize());
if (run.getEndOffSet() == pr.getEndOffSet())
break;
}
}
However, the above code always doubles the font size. i.e. if the actual font size in the document is 12 then it outputs 24 and if actual font is 8 then it outputs 16.
Is this the correct way to read font size from a word document ??
Yes, that's the correct way; the measurement is in half points.
In a docx, you'd have something like:
<w:rPr>
<w:sz w:val="28" />
</w:rPr>
ECMA 376 spec on @sz defines the unit as ST_HpsMeasure (Measurement in Half-Points)
Its the same with the binary doc format, which HWPF supports. If you look at [MS-DOC], you'll see it also specifies the size of text in half-points.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With