The reference Java target for ANTLR can only parse characters in the supplementary planes if you write them as a UTF-16 surrogate pair in the grammar and use a UTF-16 encoding for your input stream. Other targets are written by members of the community and may or may not (as you saw with the Ruby target) support the same range of characters.
Since there is no way to represent anything past 0xFFFE in the grammar itself, you'll be limited to the UTF-16 encoding even if you modify a target to support characters above 0xFFFF.
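
To make the surrogate-pair point concrete, here is a minimal sketch in plain Java (no ANTLR classes involved) that splits a supplementary code point into the two UTF-16 code units you would have to write in the grammar. The class name `SurrogatePairSketch` and the helper `toGrammarEscapes` are made up for illustration, and U+1D11E (MUSICAL SYMBOL G CLEF) is just an arbitrary example character.

```java
public class SurrogatePairSketch {

    // Hypothetical helper: renders a code point as the sequence of
    // 4-hex-digit escapes (one per UTF-16 code unit) that a grammar
    // literal would need.
    static String toGrammarEscapes(int codePoint) {
        StringBuilder sb = new StringBuilder();
        // Character.toChars yields one char for BMP code points and a
        // high/low surrogate pair for supplementary code points.
        for (char unit : Character.toChars(codePoint)) {
            sb.append(String.format("\\u%04X", (int) unit));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int codePoint = 0x1D11E; // example supplementary character

        // The escapes to use inside a grammar literal.
        System.out.println(toGrammarEscapes(codePoint));

        // A Java String is already UTF-16, so the same character
        // occupies two chars but counts as a single code point.
        String text = new String(Character.toChars(codePoint));
        System.out.println(text.length());
        System.out.println(text.codePointCount(0, text.length()));
    }
}
```

For U+1D11E this yields the pair `\uD834\uDD1E`, so a lexer rule would match those two code units in sequence, and the input has to reach the lexer as UTF-16 code units for the match to succeed.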