You've clearly been able to get ArticleExtractor to parse utf-8 text. The (likely) problem is that boilerplate's algorithms are specifically tailored to English and aren't working so well on a Gujarati (?) article. The algorithms use verbosity of phrases (eg: number of words per phrase) as well as some specific phrases (comments, have your say, etc) to determine the barriers of the article, as well as what pieces within the article are content or non content.
Have a look in the boilerpipe/filters/english
directory of the library for more info on the algorithms. Unfortunately to get the same level of accuracy in non-English languages you would need to repeat their study on each language, or have a list of translated stop words and an idea about verbosity for each language you use.