
Is character as a type meaningless?

Assume 1,2,3 creates a list of three integers: 1, 2, and 3. Similarly, I could create a list of identifiers like func1, func2, func3, or a list of three characters, 'a', 'b', 'c'.

Having 'a', 'b', 'c' is really like having the string "abc", since a string is an array of characters. If you have read up on how Unicode works, you may have noticed that the definition of a character is kind of tricky. Some characters that a reader sees as one are composed of several code points, so what do you mean by "character"? A Unicode code point, as Unicode would like to see it? A single UTF-8 code unit (one byte), as C and C++ typically see it? A UTF-16 code unit, as on Windows?
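To make the ambiguity concrete, here is a small Python sketch. Python is just my choice for illustration, and count_units is a name I made up. It counts the same text three ways: as code points, as UTF-8 code units, and as UTF-16 code units.

```python
def count_units(text: str) -> dict:
    """Count one piece of text under three different notions of 'character'.

    count_units is a hypothetical helper for illustration only.
    """
    return {
        # Python's str is a sequence of Unicode code points.
        "code_points": len(text),
        # One UTF-8 code unit is one byte.
        "utf8_units": len(text.encode("utf-8")),
        # One UTF-16 code unit is two bytes; "-le" avoids a byte-order mark.
        "utf16_units": len(text.encode("utf-16-le")) // 2,
    }


# 'e' followed by U+0301 COMBINING ACUTE ACCENT: displayed as a single accented letter.
print(count_units("e\u0301"))
# {'code_points': 2, 'utf8_units': 3, 'utf16_units': 2}

# U+1F600, a single code point outside the Basic Multilingual Plane.
print(count_units("\U0001F600"))
# {'code_points': 1, 'utf8_units': 4, 'utf16_units': 2}
```

Neither example gives the answer "1" under all three definitions, even though each is displayed as one character.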

All this makes me wonder: is there any reason to treat a character as something other than just a short string? It would be a subtype of string, constrained to hold exactly one Unicode character. Would this make sense, or are there pitfalls with this kind of thinking?
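Here is a rough sketch of what I have in mind, again in Python and only as an illustration. Char is a hypothetical type, and I am assuming "single character" means "single code point", which is exactly the part I am unsure about.

```python
class Char(str):
    """Hypothetical subtype of str that holds exactly one Unicode code point."""

    def __new__(cls, value: str) -> "Char":
        # Assumption: "one character" is taken to mean "one code point".
        if len(value) != 1:
            raise ValueError(f"expected exactly one code point, got {len(value)}")
        return super().__new__(cls, value)


a = Char("a")
print(isinstance(a, str))         # True: a Char can be used wherever a string is expected
print(type(a + Char("b")))        # <class 'str'>: concatenation falls back to a plain string

try:
    Char("e\u0301")               # 'e' plus a combining accent is two code points
except ValueError as err:
    print("rejected:", err)
```

The last case shows one possible pitfall: under this definition, text that is displayed as a single character can still be rejected as "not a character".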