Question

I ran into a problem: I cannot compile my site locally with Jekyll. I get the error below whenever I add any characters outside the 'standard' English ones (I don't even know how to characterize them, since I am using Unicode, which includes all of them).

Generating... error: incompatible encoding regexp match 
    (UTF-8 regexp with IBM866 string). Use --trace to view backtrace

I tried letters from different languages: German, Polish, Ukrainian/Russian (ä ü ß ę ł є ї л), but the result is always the same error.
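For reference, this class of error can be reproduced in plain Ruby without Jekyll. The sketch below rests on an assumption that nothing here confirms: that the page body reached the parser tagged as IBM866 (the Russian OEM console code page, which Ruby on Windows may pick up as its default external encoding) while the regexp involved was a UTF-8 one. The helper name encoding_error_for and the sample byte are made up for illustration.

```ruby
# encoding: utf-8

# Returns the error message if matching raises an encoding error, otherwise nil.
def encoding_error_for(text, pattern)
  text =~ pattern
  nil
rescue Encoding::CompatibilityError => e
  e.message
end

# Hypothetical stand-in for page text tagged with the wrong encoding:
# byte 0xAB is the letter "л" in code page 866.
cyrillic_body = [0xAB].pack("C").force_encoding(Encoding::IBM866)
ascii_body    = "plain text".dup.force_encoding(Encoding::IBM866)

# A regexp literal with a non-ASCII character is fixed to the source encoding (UTF-8).
pattern = /л/

puts encoding_error_for(cyrillic_body, pattern).inspect
puts encoding_error_for(ascii_body, pattern).inspect # ASCII-only text does not raise
```

If that assumption holds, it would explain why plain English text builds fine: an ASCII-only string is compatible with a UTF-8 regexp whatever its encoding tag says.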

I have the following in my Jekyll configuration (_config.yml). My local Ruby version is 1.9.3 (and I really only need a local version, since I deploy precompiled files rather than running Jekyll on the server).

encoding: UTF-8
markdown: redcarpet

(also tried markdown: maruku)

I guess this is really more of a Ruby-related error, because when I use the listed characters in the title

---
title: ä ü ß ę ł є ї л
---

it works fine (and those characters come through correctly in the compiled pages), but not when they are added to the page body to be parsed. Still, I am not a Ruby developer and could not easily locate where to fix this in the sources.

If using such characters is not possible, that makes Jekyll pretty limited.

No correct solution

OTHER TIPS

Thanks to Rafal Chmiel, it became clear what the problem was. It was caused by the Jekyll rss-plugin, which (it seems) cannot deal with languages other than English.

I've filed an issue on GitHub and have stopped using that plugin for now.

This may be useful for anyone using other plugins that cause similar errors.

Just run Jekyll with tracing enabled to see what really happens:

jekyll build --trace

updated:

Some more investigation showed that the problem inside the plugin is the Maruku markdown parsing method. That is what actually causes the error, since, as I understand it, the rss-plugin uses Maruku by default (even if a different 'markdown: ...' engine is selected in the Jekyll settings).

I tried:

markdown: maruku

(without any plugins) and got exactly the same problem, which tells me that multilingual content is effectively not supported by Jekyll when the maruku markdown engine is used, at least at the moment.


one more update:

Further investigation revealed a problem in the Jekyll parser itself. The reason my "title: [non-standard Unicode characters]" worked is that my editor had inserted a BOM at the beginning of the file: the front matter is then not recognized as front matter, so redcarpet processes it as ordinary text and renders it as such.
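The BOM effect can be illustrated with a small sketch. The FRONT_MATTER pattern below is a simplified, hypothetical check and not Jekyll's actual one; it only captures the idea that front matter must start at the very first character of the file.

```ruby
# encoding: utf-8

BOM = "\uFEFF" # U+FEFF, encoded as the bytes EF BB BF in UTF-8

# Simplified, hypothetical front matter check (not Jekyll's actual pattern):
# the content must start with a line of three dashes.
FRONT_MATTER = /\A---\s*\n(.*?)\n---\s*$/m

def front_matter?(content)
  !(content =~ FRONT_MATTER).nil?
end

# Removes a leading BOM, if there is one.
def strip_bom(content)
  content.sub(/\A\uFEFF/, "")
end

page = "---\ntitle: ä ü ß\n---\n\nbody text\n"

puts front_matter?(page)                  # true
puts front_matter?(BOM + page)            # false
puts front_matter?(strip_bom(BOM + page)) # true
```

With the BOM in place the dashes are no longer the first characters, so the header falls through to the markdown engine as ordinary text.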

However, when I removed the BOM, a new error appeared (different from the one above, but also encoding-related):

... test21.markdown: invalid byte sequence in UTF-8
... gems/jekyll-1.2.1/lib/jekyll/excerpt.rb:110:in `scan': invalid byte sequence in UTF-8 (ArgumentError)
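An "invalid byte sequence in UTF-8" failure of this kind typically means the string is tagged as UTF-8 but its bytes are not valid UTF-8. The sketch below shows one common way to get there, namely text saved in a single-byte encoding and then labelled UTF-8 without conversion. Whether that is what happened with test21.markdown is an assumption; the sample bytes and the helper scan_error are hypothetical.

```ruby
# encoding: utf-8

# Returns the error message if scanning raises, otherwise nil.
def scan_error(text)
  text.scan(/\w+/)
  nil
rescue ArgumentError => e
  e.message
end

# Hypothetical example: the word "текст" saved as Windows-1251 bytes,
# then labelled as UTF-8 without being converted.
cp1251_bytes = [0xF2, 0xE5, 0xEA, 0xF1, 0xF2].pack("C*")
mislabelled  = cp1251_bytes.dup.force_encoding(Encoding::UTF_8)

puts mislabelled.valid_encoding? # false
puts scan_error(mislabelled).inspect

# Transcoding from the real source encoding gives a valid UTF-8 string.
converted = cp1251_bytes.dup.force_encoding("Windows-1251").encode(Encoding::UTF_8)
puts converted.valid_encoding? # true
```

Checking valid_encoding? on the file contents before handing them to a parser is a cheap way to tell a mislabelled file from a genuine parser bug.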

The file contents are:

---
layout: post
title: український заголовок
---

український текст
some other text

I have filed an issue on the Jekyll GitHub.

What makes me curious: am I really the first person who has ever tried to use Jekyll for a site with more than standard English content? Otherwise, such errors would be strange behaviour, no?


update:

After talking with the Jekyll team, the way to deal with multi-language characters is to use HTML entities, like this:

---
title: "український"
---

with the value wrapped in quotes.
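In the snippet above the entities show up as plain letters, presumably because the page rendered them. A minimal sketch for producing them, assuming decimal numeric character references are acceptable (the helper name to_html_entities is made up for illustration):

```ruby
# encoding: utf-8

# Replaces every non-ASCII character with a decimal numeric character reference.
def to_html_entities(text)
  text.gsub(/[^[:ascii:]]/) { |char| format("&#%d;", char.ord) }
end

puts to_html_entities("ї")          # &#1111;
puts to_html_entities("title: ä ü") # title: &#228; &#252;
```

The result is pure ASCII, so it sidesteps the encoding checks entirely, at the cost of an unreadable source file.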

Also, the Jekyll team's suggestion is:

Once v1.3 comes out, try assigning encoding: "utf-8" in your _config.yml and trying without the HTML entities. It should correct itself (hopefully).
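My understanding, which is an assumption rather than something the Jekyll team stated, is that such a setting amounts to reading source files with an explicit encoding instead of relying on Ruby's default external encoding. In plain Ruby that idea looks roughly like this (read_as is a hypothetical helper, not part of Jekyll):

```ruby
# encoding: utf-8
require "tempfile"

# Reads a file and tags its content with the given encoding,
# regardless of Encoding.default_external.
def read_as(path, encoding)
  File.read(path, encoding: encoding)
end

Tempfile.create("post") do |file|
  file.binmode
  file.write("український текст")
  file.flush

  content = read_as(file.path, "utf-8")
  puts content.encoding        # UTF-8
  puts content.valid_encoding? # true
end
```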

Licensed under: CC-BY-SA with attribution