How do I read a Word document with bold and italic formatting using POI?

Question 1

WordExtractor returns only the text, nothing else.

The simplest way to get both the text and the formatting of a Word document is to switch to Apache Tika. Apache Tika builds on top of Apache POI (amongst others), and offers both plain text extraction and rich extraction (XHTML with formatting).

Alternatively, if you want to write the code yourself, I'd suggest you review the code in Tika's WordExtractor, which demonstrates how to use Apache POI to get the formatting information out of runs of text.

Question 2

Instead of using WordExtractor, you can read with Range:

...
HWPFDocument doc = new HWPFDocument(fis);
Range r = doc.getRange();
...

Range is the central class of the HWPF model. Once you have a Range, you can do more with the text: for instance, iterate through all of its CharacterRuns and check whether each one is italic (.isItalic()) or make it italic (.setItalic(true)).

for (int i = 0; i < r.numCharacterRuns(); i++) {
    CharacterRun cr = r.getCharacterRun(i);
    cr.setItalic(true);
    ...
}

...
File fon = new File(yourFilePathOut);
FileOutputStream fos = new FileOutputStream(fon);
doc.write(fos); 
...

This works if you are sticking with HWPF. By the way, it is usually more convenient to structure the code around Paragraph objects and work with those.
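
For the reading side of the question, the same loop can inspect each run instead of changing it. Below is a minimal sketch of what you might do with the flags once you have them: wrap each run's text in simple <b>/<i> tags. The RunMarkup class and its wrap and escape helpers are hypothetical names made up for this illustration, not part of POI, and the sample strings in main stand in for real runs. In the HWPF loop you would pass cr.text(), cr.isBold() and cr.isItalic() for each CharacterRun.

```java
class RunMarkup {

    /** Wraps one run's text in <b>/<i> tags according to its formatting flags. */
    static String wrap(String text, boolean bold, boolean italic) {
        String out = escape(text);
        if (italic) {
            out = "<i>" + out + "</i>";
        }
        if (bold) {
            out = "<b>" + out + "</b>";
        }
        return out;
    }

    /** Escapes the characters that would otherwise be read as markup. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        // Stand-in data. With POI, the per-run call would look like:
        //   sb.append(RunMarkup.wrap(cr.text(), cr.isBold(), cr.isItalic()));
        StringBuilder sb = new StringBuilder();
        sb.append(wrap("Plain ", false, false));
        sb.append(wrap("bold", true, false));
        sb.append(wrap(" and ", false, false));
        sb.append(wrap("both", true, true));
        System.out.println(sb); // prints: Plain <b>bold</b> and <b><i>both</i></b>
    }
}
```

Keeping the markup step separate from the POI loop means you can swap the tags for whatever output format you need without touching the document-reading code.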