Unicode and Emoji, or The Giant Pawn Mystery
I generally despise emoji, but I reluctantly learned a few things about them this morning.
My last couple of blog posts involved chess, and I sent out a couple of tweets using chess symbols. Along the way I ran into a mystery: sometimes the black pawn is much larger than the other chess symbols. I first noticed this in Excel. Then I noticed that it sometimes happens in the Twitter app and sometimes not, and sometimes on the Twitter web site and sometimes not.
For example, the following screen shot is from Safari on iOS.
What's going on? I explained in a footnote to this post, but I wanted to make this its own post to make it easier to find in the future.
In a nutshell, something in the software environment is deciding that eleven of the twelve chess characters are to be taken literally, but the character for the black pawn is to be interpreted as an emojus [1] representing chess. I'm not clear on whether this is happening in the font or in an app. Probably one, both, or neither depending on circumstances.
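To make the twelve characters concrete, here is a minimal Python sketch that lists them by code point and name. It relies on the chess symbols occupying U+2654 through U+265F, with the black pawn last. The two variation selectors at the end (U+FE0E to request text presentation, U+FE0F to request emoji presentation) are part of Unicode, but whether a given font or app honors them is an assumption you would have to test in your own environment. The variable names are mine.

```python
import unicodedata

# The twelve chess symbols occupy U+2654 through U+265F.
chess = [chr(cp) for cp in range(0x2654, 0x2660)]

for ch in chess:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")

# The black pawn is the last of the twelve.
black_pawn = chess[-1]

# Appending a variation selector asks the renderer for a particular
# presentation. Software is free to ignore the request.
text_pawn = black_pawn + "\uFE0E"   # request text presentation
emoji_pawn = black_pawn + "\uFE0F"  # request emoji presentation
```

If the pawn shows up giant in some context, pasting in `text_pawn` instead of the bare character may or may not shrink it back, depending on the software.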
I erroneously thought that emoji were all outside Unicode's BMP (Basic Multilingual Plane) so as not to be confused with ordinary characters. Alas, that is not true.
Here is a full list of Unicode characters interpreted (by ...?) as emoji. There are 210 emoji characters in the BMP and 380 outside, i.e. 210 at or below U+FFFF and 380 above U+FFFF.
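The BMP test itself is just a comparison against FFFF. Here is a small sketch, with a helper name (`in_bmp`) and two sample characters of my own choosing rather than anything taken from the list above: the black pawn, which sits inside the BMP, and U+1F600, a face emoji that sits outside it.

```python
def in_bmp(ch: str) -> bool:
    """Return True if the single character ch lies in the Basic Multilingual Plane."""
    return ord(ch) <= 0xFFFF

# Two sample characters: the black chess pawn and a face emoji.
samples = ["\u265F", "\U0001F600"]

inside = [c for c in samples if in_bmp(c)]
outside = [c for c in samples if not in_bmp(c)]

for c in samples:
    where = "inside" if in_bmp(c) else "outside"
    print(f"U+{ord(c):04X} is {where} the BMP")
```

Running the same split over a full list of emoji code points would give the two counts, though the totals depend on which version of the emoji data you use.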
***
[1] I know that "emoji" is a Japanese word, not a Latin word, but to my ear the singular of "emoji" should be "emojus."
The post Unicode and Emoji, or The Giant Pawn Mystery first appeared on John D. Cook.