Parser Generators Supporting Astral Characters

It's important to me that the programming language I'm working on has full Unicode support, which includes supporting the 'astral' characters from planes 1 through 16. I'm not very interested in parsing, so I would really prefer to use a parser generator instead of writing a parser myself. However, the parser generators I've looked at only support plane 0, the Basic Multilingual Plane, despite claiming to have "full" Unicode support.

As a side note, they often don't even warn you when you specify astral characters in the grammar. They happily accept code points over 0xFFFF and output broken code. The generated code doesn't check for out-of-range code points at runtime either; it just behaves unexpectedly. For example, when I allowed the characters [0x010000..0x10FFFF] in identifiers, SableCC generated code that parsed my entire input file as a single gigantic identifier.

So I'm wondering, what popular parser generators support astral characters? Or if there aren't any, what unpopular ones? Am I not going to be able to use a preexisting parser generator to get what I want?

By JamesJustinHarrell at 2008-07-10 19:10 | LtU Forum
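To make the failure mode concrete, here is a minimal Python sketch of one plausible way a generator built around 16-bit characters could turn the range [0x010000..0x10FFFF] into "match everything". This is an assumption for illustration only; nothing here is confirmed to be what SableCC actually does, and the function names (`to_utf16_units`, `truncate_to_16_bits`, and so on) are hypothetical. The sketch also shows how an astral code point is really represented in UTF-16, as a surrogate pair of two 16-bit units.

```python
def to_utf16_units(code_point):
    """Encode one Unicode code point as a list of 16-bit UTF-16 code units."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a Unicode scalar value: %#x" % code_point)
    if code_point <= 0xFFFF:
        # Plane 0 (the BMP) fits in a single 16-bit unit.
        return [code_point]
    # Planes 1 through 16 need a high and a low surrogate.
    offset = code_point - 0x10000
    return [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)]


def truncate_to_16_bits(code_point):
    """Hypothetical bug: store a code point in a 16-bit field, losing the top bits."""
    return code_point & 0xFFFF


ASTRAL_LOW, ASTRAL_HIGH = 0x010000, 0x10FFFF

# Assumed buggy behaviour: the range bounds are silently truncated.
BUGGY_LOW = truncate_to_16_bits(ASTRAL_LOW)    # becomes 0x0000
BUGGY_HIGH = truncate_to_16_bits(ASTRAL_HIGH)  # becomes 0xFFFF


def buggy_is_identifier_char(unit):
    """Range test on 16-bit units using the truncated bounds."""
    return BUGGY_LOW <= unit <= BUGGY_HIGH


def correct_is_identifier_char(code_point):
    """Range test on whole code points using the intended bounds."""
    return ASTRAL_LOW <= code_point <= ASTRAL_HIGH


if __name__ == "__main__":
    print([hex(u) for u in to_utf16_units(0x1D11E)])
    print(buggy_is_identifier_char(ord(" ")))
    print(correct_is_identifier_char(ord(" ")))
```

Under that assumed truncation the identifier range collapses to [0x0000..0xFFFF], so every 16-bit unit in the input, including spaces and newlines, counts as an identifier character. That would be consistent with a whole file lexing as one identifier. A generator that handles astral characters has to either work on whole code points or compile such a range into sequences of surrogate pairs.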