Parser Generators Supporting Astral Characters

It's important to me that the programming language I'm working on has full Unicode support, which includes supporting the 'astral' characters from planes 1 through 16. I'm not very interested in parsing, so I would really prefer to use a parser generator instead of writing a parser myself. However, the parser generators I've looked at only support plane 0, the Basic Multilingual Plane, despite claiming to have "full" Unicode support.

As a side note, they often don't even warn you when you specify astral characters in the grammar. They happily accept code points over 0xFFFF and output broken code. The generated code doesn't check for this at runtime either; it just behaves unexpectedly. For example, when I allowed characters in [0x010000..0x10FFFF] in identifiers, SableCC output code that parsed my entire input file as a single gigantic identifier.

So I'm wondering, what popular parser generators support astral characters? Or if there aren't any, what unpopular ones? Am I not going to be able to use a preexisting parser generator to get what I want?

By JamesJustinHarrell at 2008-07-10 19:10 | LtU Forum
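To make the failure mode concrete, here is a small sketch of one plausible cause. It is an assumption, not something confirmed for SableCC or any other generator: a lexer that works on 16-bit UTF-16 code units never sees a value above 0xFFFF, because each astral character arrives as a surrogate pair. The helper names `to_utf16_units` and `is_astral` are made up for illustration.

```python
def to_utf16_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-be")  # big-endian, no byte order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def is_astral(value):
    """Range test matching the grammar rule [0x010000..0x10FFFF]."""
    return 0x10000 <= value <= 0x10FFFF


# 'a', then U+10400 (a plane 1 character), then 'b'
sample = "a\U00010400b"

# What a code-unit-based lexer would iterate over (hypothetical model)
units = to_utf16_units(sample)

# What a code-point-based lexer would iterate over
code_points = [ord(ch) for ch in sample]

# Applying the same range test to each view of the input
astral_by_unit = [is_astral(u) for u in units]
astral_by_code_point = [is_astral(cp) for cp in code_points]

print([hex(u) for u in units])
print(astral_by_unit)
print(astral_by_code_point)
```

In this model the range test can never succeed on code units, since every unit is at most 0xFFFF. A generator that silently truncates or mishandles the range bounds instead of rejecting them could then produce the kind of unexpected matching described above, though the exact behavior would depend on the tool.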