Parser Generators Supporting Astral Characters

It's important to me that the programming language I'm working on has full Unicode support, which includes supporting the 'astral' characters from planes 1 through 16. I'm not very interested in parsing, so I would really prefer to use a parser generator instead of writing a parser myself.

However, the parser generators I've looked at only support plane 0, the Basic Multilingual Plane, despite claiming to have "full" Unicode support. As a side note, they often don't even warn you when you specify astral characters in the grammar. They happily accept code points over 0xFFFF and output broken code. The generated code doesn't check for this at runtime either; it just behaves unexpectedly. For example, when I allowed characters in [0x010000..0x10FFFF] in identifiers, SableCC generated code that parsed my entire input file as a single gigantic identifier.

So I'm wondering, what popular parser generators support astral characters? Or if there aren't any, what unpopular ones? Am I not going to be able to use a preexisting parser generator to get what I want?

By JamesJustinHarrell at 2008-07-10 19:10 | LtU Forum
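To make the failure mode above concrete, here is a minimal Python sketch of why a range such as [0x010000..0x10FFFF] can misbehave in a lexer that works on 16-bit UTF-16 code units. It is an illustration under stated assumptions, not a diagnosis of SableCC: the helper names (`utf16_code_units`, `decode_surrogates`, `in_astral_range`) are hypothetical, and the idea that the range bounds get truncated to 16 bits is only one plausible explanation for the "single gigantic identifier" behaviour.

```python
def utf16_code_units(text):
    """Return the list of 16-bit UTF-16 code units that encode `text`."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def decode_surrogates(units):
    """Combine high/low surrogate pairs back into full code points."""
    code_points = []
    i = 0
    while i < len(units):
        unit = units[i]
        is_high = 0xD800 <= unit <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            low = units[i + 1]
            code_points.append(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00))
            i += 2
        else:
            code_points.append(unit)
            i += 1
    return code_points


def in_astral_range(value):
    """The range test as written in the grammar: [0x010000..0x10FFFF]."""
    return 0x010000 <= value <= 0x10FFFF


# 'a' followed by U+10400, a plane 1 character.
sample = "a\U00010400"
units = utf16_code_units(sample)

# Tested against raw code units, the astral range never matches anything,
# because every code unit is at most 0xFFFF.
matches_on_units = [in_astral_range(u) for u in units]

# Tested against decoded code points, it matches only the astral character.
matches_on_code_points = [in_astral_range(cp) for cp in decode_surrogates(units)]

# Assumption: if a tool silently truncates the bounds to 16 bits, the range
# becomes [0x0000..0xFFFF], which matches every code unit in the input.
truncated_range = (0x010000 & 0xFFFF, 0x10FFFF & 0xFFFF)
```

A generator that supports astral characters has to do one of two things: decode surrogate pairs before matching, as `decode_surrogates` does here, or rewrite each astral range in the grammar into equivalent ranges over pairs of surrogate code units.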