I recently stumbled upon a situation where Notepad counts an emoji as two characters instead of one. I have a text file with the names of every BFDI episode in sequential order, and there's a new episode with a title that's just an emoji. Can someone explain why this happens?
2 Answers
To put it simply, what looks like one emoji on screen isn't necessarily one character to the software. An emoji can be made up of several code points, for example a base emoji followed by a modifier (like skin tone). Even an emoji that is a single code point usually takes two 16-bit units in UTF-16, the encoding Windows uses internally. That's why Notepad registers an emoji as two characters: it counts the underlying units instead of the visual representation. Other programs and systems may count differently, so the number you see depends on the software.
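If you want to see the code points behind an emoji yourself, here's a small Python sketch. It only illustrates the idea and says nothing about how Notepad is implemented; the helper name `code_points` and the thumbs-up sample are my own choices, not anything from your file.

```python
import unicodedata

def code_points(text):
    """Return a (hex code point, Unicode name) pair for each code point in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]

# Written with escapes so the example does not depend on the file's encoding.
thumbs_up = "\U0001F44D"                    # base emoji only
thumbs_up_medium = "\U0001F44D\U0001F3FD"   # base emoji + skin tone modifier

for label, text in [("plain", thumbs_up), ("with modifier", thumbs_up_medium)]:
    # len() on a Python str counts code points, not what you see on screen.
    print(label, len(text), code_points(text))
```

The plain thumbs-up is one code point and the skin-tone version is two, even though both display as a single symbol.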
Character counting in Notepad gets a bit complicated because of how Unicode text is stored. In older single-byte encodings such as ASCII, every character took exactly one byte, which made counting straightforward. Unicode changed that: each character gets a code point, and Windows stores text internally as UTF-16, where each code unit is 16 bits. Most emojis have code points above U+FFFF, which don't fit in a single 16-bit unit, so each one is stored as a pair of units (a surrogate pair). Notepad appears to count those UTF-16 code units rather than what you see on screen, so one emoji moves the count by two. Note that this isn't a byte count: the same emoji takes four bytes in UTF-8. It's a quirk of how the software reads and interprets those characters.
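Here's a rough Python sketch comparing the different ways of counting the same text. The helper names `utf16_units` and `utf8_bytes` are made up for this example, and the idea that Notepad's number matches the UTF-16 column is my assumption based on the behavior you describe.

```python
def utf16_units(text):
    """Count 16-bit code units in the UTF-16 encoding of text."""
    # utf-16-le adds no byte order mark, and every code unit is 2 bytes.
    return len(text.encode("utf-16-le")) // 2

def utf8_bytes(text):
    """Count bytes in the UTF-8 encoding of text."""
    return len(text.encode("utf-8"))

samples = {
    "letter A": "A",
    "e with acute accent": "\u00E9",
    "grinning face emoji": "\U0001F600",
    "thumbs up + skin tone": "\U0001F44D\U0001F3FD",
}

for label, text in samples.items():
    # Columns: code points, UTF-16 code units, UTF-8 bytes.
    print(f"{label}: {len(text)} / {utf16_units(text)} / {utf8_bytes(text)}")
```

The grinning face comes out as 1 code point, 2 UTF-16 units, and 4 UTF-8 bytes, which lines up with the "two characters" you're seeing. The skin-tone thumbs-up is 2 code points, 4 UTF-16 units, and 8 UTF-8 bytes.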
Got it! So it’s all about how Windows interprets that emoji. Thanks for breaking it down!