A universal 16-bit (two-byte) standard character set for
representing plain text in
computer processing, which includes the major modern scripts;
classical forms of Greek, Sanskrit, and Pali; the symbols
used in Braille;
mathematical and technical symbols; and over 21,000 East Asian ideographs--7,000
more than the East Asian Character Code (EACC) used in USMARC.
Many more scripts have been proposed for inclusion and are under consideration.
Development of Unicode began in 1987 when Joe Becker and Lee Collins of
Xerox and Mark Davis of Apple sought to devise a character set
as simple as ASCII to
meet the needs of the entire computing world. Joe Becker is credited with
coining the term,
which stands for "unique, universal, and uniform character encoding."
The Research
Libraries Group (RLG), developer of EACC, joined the project in
its early stages, and in 1991 the Unicode Consortium was established to develop
and promote the new standard. At the same time, the Joint Technical Committee 1
(JTC 1) of the International
Organization for Standardization (ISO) and the International
Electrotechnical Commission (IEC) were also working on a global character set.
In 1992, the two initiatives merged. Since then, Unicode has been synchronized
with ISO/IEC 10646.
The current version of Unicode can define approximately
65,000 characters (65,536 16-bit code values), with extensions to accommodate an additional 1 million
characters. Duplication is avoided by assigning a single code when a character
is common to more than one language.
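The arithmetic behind those two figures can be sketched in a few lines of Python. This sketch assumes that the "extensions" mentioned above refer to the surrogate-pair mechanism, in which two 16-bit values drawn from reserved ranges combine to address one additional character. The function names are hypothetical and chosen only for illustration.

```python
# Size of a 16-bit code space: every value from 0x0000 to 0xFFFF.
BASIC_CODE_VALUES = 2 ** 16            # 65,536 ("approximately 65,000")

# Assumption: the extension is the surrogate mechanism, which pairs one
# "high" value (0xD800-0xDBFF) with one "low" value (0xDC00-0xDFFF).
HIGH_START = 0xD800
LOW_START = 0xDC00
VALUES_PER_RANGE = 0x400               # 1,024 values in each reserved range
EXTENDED_CODE_POINTS = VALUES_PER_RANGE ** 2   # 1,048,576 ("1 million")


def to_surrogate_pair(code_point):
    """Split a code point above 0xFFFF into two 16-bit values (hypothetical helper)."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is outside the extended range")
    offset = code_point - 0x10000
    return HIGH_START + (offset >> 10), LOW_START + (offset & 0x3FF)


def from_surrogate_pair(high, low):
    """Recombine two 16-bit values into the original code point (hypothetical helper)."""
    return 0x10000 + ((high - HIGH_START) << 10) + (low - LOW_START)
```

For example, to_surrogate_pair(0x1F600) yields the pair (0xD83D, 0xDE00), and from_surrogate_pair reverses the split.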
The standard also provides guidelines for sorting and searching, compression and
transmission, transcoding to other standards, and truncation. Library issues
center on the use of Unicode data in machine-readable bibliographic
records, since large numbers of existing records are encoded in
7- and 8-bit character sets. The MARBI Committee of
the American
Library Association (ALA), responsible for advising the Library of Congress on
the USMARC formats, has delegated work on the use of Unicode to its
Subcommittee on Character Sets and to special task forces.
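The record-conversion issue described above can be illustrated with a minimal Python sketch. It assumes Latin-1 as a stand-in for a legacy 8-bit character set, because the character sets actually used in USMARC records are not part of Python's standard codecs. The function name and the sample data are hypothetical.

```python
def transcode_to_utf16(legacy_bytes, source_encoding="latin-1"):
    """Convert bytes from a legacy 8-bit encoding to 16-bit Unicode (UTF-16, big-endian).

    Assumption: source_encoding names a codec Python already knows;
    Latin-1 is used here only as a placeholder for a legacy set.
    """
    text = legacy_bytes.decode(source_encoding)   # 8-bit bytes -> Unicode text
    return text.encode("utf-16-be")               # Unicode text -> two bytes per character


# Hypothetical field from an 8-bit record: "Bibliothèque" with è stored as 0xE8.
legacy_field = b"Biblioth\xe8que"
unicode_field = transcode_to_utf16(legacy_field)
```

Each of the 12 characters in the sample occupies one byte in the legacy form and two bytes after conversion, which is one reason converting large files of existing records is not a trivial step.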
Unicode is currently used in Java from Sun, Windows NT and Internet
Explorer from Microsoft, Netscape Navigator, the Macintosh operating
system from Apple, and database applications
from Oracle, Sybase, and others. Many vendors
of integrated
library systems are moving toward implementation of Unicode in
their systems.
Asheesh Kamal