Unicode Memo 4

This time it really is just a memo. No summary.
I plan to write it up properly at a later date.

3 Coded Character Set (CCS)

A coded character set is defined to be a mapping from a set of abstract characters to the set of nonnegative integers. This range of integers need not be contiguous. In the Unicode Standard, the concept of the Unicode scalar value (cf. definition D28, in Chapter 3, "Conformance" of [Unicode]) explicitly defines such a noncontiguous range of integers.

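A coded character set is defined as a mapping from a set of abstract
characters to the set of nonnegative integers.
This range of integers does not have to be contiguous.
In the Unicode Standard, the concept of the Unicode scalar value
explicitly defines such a noncontiguous range of integers.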
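As a rough sketch of the point about noncontiguous ranges: Unicode scalar values cover 0 to 0xD7FF and 0xE000 to 0x10FFFF, skipping the surrogate range in between. The toy mapping `TOY_CCS` and the helper names below are made up for illustration and are not part of any standard.

```python
# Hypothetical toy coded character set: abstract characters -> nonnegative integers.
# The integers deliberately leave gaps, which a coded character set is allowed to do.
TOY_CCS = {"A": 1, "B": 2, "C": 40}


def integers_are_contiguous(ccs: dict) -> bool:
    """Return True if the integers used by the mapping form one unbroken run."""
    values = sorted(ccs.values())
    if not values:
        return True
    return values == list(range(values[0], values[-1] + 1))


def is_unicode_scalar_value(n: int) -> bool:
    """Return True if n is a Unicode scalar value.

    Scalar values are all code points except the surrogates 0xD800..0xDFFF,
    so the range has a hole in the middle.
    """
    return 0 <= n <= 0xD7FF or 0xE000 <= n <= 0x10FFFF


print(integers_are_contiguous(TOY_CCS))
print(is_unicode_scalar_value(0xD7FF), is_unicode_scalar_value(0xD800))
```

The toy set is still a valid mapping even though nothing is assigned between 2 and 40, in the same way that the scalar value range remains well defined with the surrogates left out.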

An abstract character is defined to be in a coded character set if the coded character set maps from it to an integer. That integer is the code point to which the abstract character has been assigned. That abstract character is then an encoded character.

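An abstract character is said to be in a coded character set if
the coded character set maps it to an integer.
That integer is the code point to which the abstract character has been assigned,
and the abstract character is then an encoded character.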
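To make the terms concrete, here is a minimal sketch that treats a coded character set as a plain dictionary. `TOY_CCS`, `code_point_of`, and `is_encoded` are hypothetical names chosen for this memo. The last lines use Python's built-in `ord`, which returns the Unicode code point of a character.

```python
from typing import Optional

# Hypothetical toy coded character set, for illustration only.
TOY_CCS = {"A": 1, "B": 2, "あ": 40}


def code_point_of(ch: str, ccs: dict) -> Optional[int]:
    """Return the integer (code point) the set assigns to ch, or None."""
    return ccs.get(ch)


def is_encoded(ch: str, ccs: dict) -> bool:
    """An abstract character is 'in' the set, i.e. encoded, if the set maps it."""
    return ch in ccs


print(code_point_of("あ", TOY_CCS))  # code point in the toy set
print(is_encoded("Z", TOY_CCS))      # "Z" has no assignment in the toy set

# In Unicode itself, the same character is assigned a different integer.
print(f"U+{ord('あ'):04X}")
```

The same abstract character can be an encoded character in several coded character sets, each assigning it its own code point.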

Coded character sets are the basic object that both ISO and vendor character encoding committees produce. They relate a defined repertoire to nonnegative integers, which then can be used unambiguously to refer to particular abstract characters from the repertoire.

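Coded character sets are the basic objects produced by both ISO and vendor character encoding committees.
They relate a defined repertoire to nonnegative integers.
Those integers can then be used to refer unambiguously to particular abstract characters in the repertoire.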

A coded character set may also be known as a character encoding, a coded character repertoire, a character set definition, or a code page.

A coded character set is also known as a character encoding, a character set
definition, a coded character repertoire, or a code page.

In the IBM CDRA architecture, CP ("code page") values refer to coded character sets.

In IBM's CDRA architecture, a CP value identifies a coded character set.

Note that this use of the term code page is quite precise and limited. It should not be -- but generally is -- confused with the generic use of code page to refer to character encoding schemes.
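As a small illustration of code pages assigning different characters to the same integer, the sketch below decodes one byte value with two of Python's built-in codecs. Note the assumption involved: Python's codec names such as `cp437` are an instance of the generic usage the text warns about, since a codec covers the whole byte-level encoding. For single-byte code pages the integer and the byte coincide, so the comparison still shows the mapping. `decode_byte` is a helper name invented here.

```python
def decode_byte(value: int, code_page: str) -> str:
    """Decode a single byte value using the named single-byte code page."""
    return bytes([value]).decode(code_page)


# The integer 0x80 refers to different abstract characters in each code page.
print(decode_byte(0x80, "cp437"))   # Ç
print(decode_byte(0x80, "cp1252"))  # €

# The ASCII range is shared, so 0x41 is "A" in both.
print(decode_byte(0x41, "cp437"), decode_byte(0x41, "cp1252"))  # A A
```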
