Related articles:
Unicode
ASCII
Mojibake
UTF-16/UCS-2
XML
Unicode and HTML
Code page
Character encoding
Universal Character Set
Mapping of Unicode characters
Cyrillic alphabet
Windows-1252
Character encodings in HTML
ISO/IEC 8859-1
Ken Thompson
Plan 9 from Bell Labs
Class (file format)
Key terms:
bytes
encoding
unicode
bom
string
ascii
bits
sequence
rfc
invalid
iec
alphabet
api
errors
planes
code points
unix
handling
annex
browsers
decoding
parser
character set
cyrillic
code page
bell labs
unicode standard
scripts
sorting
disadvantages
compatibility
obsolete
ascii characters
two bytes
surrogate
first byte
decomposed
simplistic
character encoding
three bytes
continuation
per character
implementations
legacy encoding
invalid sequences
universal character set
unicode code points
programming language
basic multilingual plane
mapping of unicode character planes