Few people know that the very common term "UTF" is an acronym for Unicode Transformation Format. These are algorithmic mappings, part of the Unicode standard, that map each code point (the absolute numeric representation of a character) to a unique sequence of bytes representing the given character. Notice that the mappings can be used in both directions, converting back and forth between the different representations.
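As a rough illustration of the two-way mapping, the following Python sketch encodes one code point into bytes and decodes it back. The choice of the euro sign and the variable names are only for illustration; the codec names are those of Python's standard library.

```python
# Illustrative only: round-trip one character through each UTF encoding
# using Python's built-in codecs.
ch = "\u20ac"                    # EURO SIGN
code_point = ord(ch)             # 0x20AC, the character's code point

# Code point -> bytes (explicit big-endian codecs, so no extra marker bytes)
utf8_bytes = ch.encode("utf-8")
utf16_bytes = ch.encode("utf-16-be")
utf32_bytes = ch.encode("utf-32-be")

# Bytes -> code point: decoding recovers the original character
decoded = [
    utf8_bytes.decode("utf-8"),
    utf16_bytes.decode("utf-16-be"),
    utf32_bytes.decode("utf-32-be"),
]

print(hex(code_point), utf8_bytes.hex(), utf16_bytes.hex(), utf32_bytes.hex())
```

The same code point yields three different byte sequences, and each of them decodes back to the same character.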
The standard defines three of these formats, depending on how many bits are used to represent the initial part of the set (the initial 128 characters): 8, 16, or 32. It is interesting to note that all three encoding forms need at most 4 bytes of data for each code point.
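The 4-byte upper bound can be checked with a short Python sketch. The helper `encoded_sizes` and the sample characters are hypothetical choices made for this example; the little-endian codec names are used only so that the reported lengths cover the character's bytes and nothing else.

```python
def encoded_sizes(ch):
    """Return the number of bytes each UTF encoding uses for one character."""
    # Explicit byte-order codecs, so the length covers the character only.
    return {name: len(ch.encode(name))
            for name in ("utf-8", "utf-16-le", "utf-32-le")}

# One of the initial 128 characters, then code points of increasing value.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

For these samples, UTF-8 grows from 1 to 4 bytes, UTF-16 uses 2 bytes until the last sample needs 4, and UTF-32 always uses 4.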
A problem relating to the representations whose units span more than one byte (UTF-16 and UTF-32) is which of the bytes comes first. According to the standard, both byte orders are allowed, so you can have UTF-16 BE (big-endian) or LE (little-endian), and the same for UTF-32.
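The effect of byte order can be seen in a small Python sketch. The variable names are illustrative, and the example assumes the explicit `-be` and `-le` codecs from Python's standard library.

```python
ch = "A"  # code point U+0041

# The same code point, serialized with each byte order
utf16_be = ch.encode("utf-16-be")   # most significant byte first
utf16_le = ch.encode("utf-16-le")   # least significant byte first
utf32_be = ch.encode("utf-32-be")
utf32_le = ch.encode("utf-32-le")

print(utf16_be.hex(), utf16_le.hex(), utf32_be.hex(), utf32_le.hex())

# Reading bytes with the wrong byte order does not fail here:
# it silently produces a different character (U+4100 instead of U+0041).
misread = utf16_be.decode("utf-16-le")
print(f"U+{ord(misread):04X}")
```

This is why a reader of UTF-16 or UTF-32 data has to know which byte order the writer used.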