UTF-8: Difference between revisions

HsfBot (talk | contribs)
m Clean up, replaced: metoda → metode using AWB
No edit summary
Tag: Mobile edit Mobile web edit
Line 11:
| publisher = W3Techs
| accessdate = March 30, 2010
}}</ref> The [[Internet Engineering Task Force]] (IETF) requires all [[Internet]] [[protokol (komputer)|protocols]] to identify the ''[[Pengkodean karakter|encoding]]'' used for character data, and requires the supported character encodings to include UTF-8.<ref name="rfc2277">{{Cite journal |first=H. |last=Alvestrand |title=RFC 2277 |contribution=IETF Policy on Character Sets and Languages |publisher=[[Internet Engineering Task Force]] |year=1998}}</ref> The [[:en:Internet Mail Consortium|Internet Mail Consortium]] (IMC) recommends that all e-mail programs be able to display and create mail using UTF-8.<ref name="IMC">{{cite web
| url = http://www.imc.org/mail-i18n.html
| title = Using International Characters in Internet Mail
Line 17:
| date = August 1, 1998
| accessdate = November 8, 2007
}}</ref> UTF-8 is also increasingly used as the ''default character encoding'' in [[sistem operasi|operating systems]], [[bahasa pemrograman|programming languages]], [[application programming interface|APIs]], and [[aplikasi perangkat lunak|software applications]].
 
<!--
UTF-8 encodes each of the 1,112,064 [[code point]]s in the Unicode character set using one to four 8-bit [[byte]]s (termed "[[octet (computing)|octets]]" in the Unicode Standard). Code points with lower numerical values (i.e. earlier code positions in the Unicode character set, which tend to occur more frequently) are encoded using fewer bytes. The first 128 characters of Unicode, which correspond one-to-one with [[ASCII]], are encoded using a single octet with the same binary value as ASCII, making valid ASCII text valid UTF-8-encoded Unicode as well.
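The length rule described above can be illustrated with a minimal Python sketch. The helper name <code>utf8_length</code> and the sample characters are illustrative assumptions, not part of any standard API; the thresholds are the usual code-point boundaries for one- to four-byte sequences, and the result is checked against Python's built-in <code>utf-8</code> codec.

```python
def utf8_length(code_point: int) -> int:
    """Return how many bytes UTF-8 needs for one Unicode code point.

    Hypothetical helper for illustration only.
    """
    if code_point < 0 or code_point > 0x10FFFF:
        raise ValueError("code point outside the Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded in UTF-8")
    if code_point < 0x80:       # U+0000..U+007F: the ASCII range
        return 1
    if code_point < 0x800:      # U+0080..U+07FF
        return 2
    if code_point < 0x10000:    # U+0800..U+FFFF
        return 3
    return 4                    # U+10000..U+10FFFF


# One sample character for each sequence length (arbitrary choices).
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]
for ch in samples:
    encoded = ch.encode("utf-8")
    # The built-in codec should agree with the length rule above.
    assert len(encoded) == utf8_length(ord(ch))
    print(f"U+{ord(ch):04X}", " ".join(f"{b:02x}" for b in encoded))

# ASCII text has the same bytes whether encoded as ASCII or as UTF-8.
assert "plain ASCII".encode("ascii") == "plain ASCII".encode("utf-8")

# 0x110000 code points minus 0x800 surrogates gives the 1,112,064 figure.
assert 0x110000 - 0x800 == 1_112_064
```

The comparison with <code>str.encode</code> is there so that the sketch checks itself against a real encoder rather than relying on the helper alone.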
 
The official [[Internet Assigned Numbers Authority|IANA]] code for the UTF-8 character encoding is <code>UTF-8</code>.<ref>{{cite web |url=http://www.iana.org/assignments/character-sets |title=CHARACTER SETS |publisher=Internet Assigned Numbers Authority |date=November 4, 2010 |accessdate=5 December 2010}}</ref>
 
==History==
By early 1992 the search was on for a good byte-stream encoding of multi-byte character sets. The draft [[Universal Character Set|ISO 10646]] standard contained a non-required [[Addendum|annex]] called [[UTF-1]] that provided a byte-stream encoding of its 32-bit code points. This encoding was not satisfactory on performance grounds, but did introduce the notion that bytes in the range of 0–127 continue representing the ASCII characters in UTF, thereby providing backward compatibility with ASCII.
 
In July 1992, the [[X/Open]] committee XoJIG was looking for a better encoding. Dave Prosser of [[Unix System Laboratories]] submitted a proposal for one that had faster implementation characteristics and introduced the improvement that 7-bit ASCII characters would ''only'' represent themselves; all multibyte sequences would include only bytes where the high bit was set. This original proposal, FSS-UTF (File System Safe UCS Transformation Format), was similar in concept to UTF-8, but lacked the crucial property of self-synchronization.<ref name=pikeviacambridge>{{cite web|url=http://www.cl.cam.ac.uk/~mgk25/ucs/utf-8-history.txt|title=UTF-8 history|first=Rob|last=Pike|date=30 Apr 2003 | accessdate=September 7, 2012}}</ref><ref>{{cite web|url=https://plus.google.com/u/0/101960720994009339267/posts/Rz1udTvtiMg|title=UTF-8 turned 20 years old yesterday|first=Rob |last=Pike|date=September 6, 2012 | accessdate=September 7, 2012}}</ref>