Annotation of lucent/sys/man/1/tcs, revision 1.1.1.1

1.1       root        1: .TH TCS 1
                      2: .SH NAME
                      3: tcs \- translate character sets
                      4: .SH SYNOPSIS
                      5: .B tcs
                      6: [
                      7: .B -slcv
                      8: ]
                      9: [
                     10: .B -f
                     11: .I ics
                     12: ]
                     13: [
                     14: .B -t
                     15: .I ocs
                     16: ]
                     17: [
                     18: .I file ...
                     19: ]
                     20: .SH DESCRIPTION
                     21: .I Tcs
                     22: interprets the named
                     23: .I file(s)
                     24: (standard input default) as a stream of characters from the
                     25: .I ics
                     26: character set or format, converts them to runes,
                     27: and then converts them into a stream of characters from the
                     28: .I ocs
                     29: character set or format on the standard output.
                     30: The default value for
                     31: .I ics
                     32: and
                     33: .I ocs
                     34: is
                     35: .BR utf ,
                     36: the
                     37: .SM UTF
                     38: encoding described in
                     39: .IR utf (6).
                     40: The
                     41: .B -l
                     42: option lists the character sets known to
                     43: .IR tcs .
                     44: Processing continues in the face of conversion errors (the
                     45: .B -s
                     46: option prevents reporting of these errors).
                     47: The
                     48: .B -c
                     49: option forces the output to contain only correctly converted characters;
                     50: otherwise,
                     51: .B 0x80
                     52: characters will be substituted for
                     53: .SM UTF
                     54: encoding errors and
                     55: .B 0xFFFD
                     56: characters will be substituted for unknown characters.
                     57: .PP
                     58: The
                     59: .B -v
                     60: option generates various diagnostic and summary information on standard error,
                     61: or makes the
                     62: .B -l
                     63: output more verbose.
                     64: .PP
                     65: .I Tcs
                     66: recognizes an ever-changing list of character sets.
                     67: In particular, it supports a variety of Russian and Japanese encodings.
                     68: Some of the supported encodings are
                     69: .TF jis-kanji
                     70: .TP
                     71: .B utf
                     72: The Plan 9
                     73: .SM UTF
                     74: encoding, known by ISO as UTF-8
                     75: .TP
                     76: .B utf1
                     77: The deprecated original
                     78: .SM UTF
                     79: encoding from ISO 10646
                     80: .TP
                     81: .B ascii
                     82: 7-bit ASCII
                     83: .TP
                     84: .B 8859-1
                     85: Latin-1 (Western European)
                     86: .TP
                     87: .B 8859-2
                     88: Latin-2 (Czech .. Slovak)
                     89: .TP
                     90: .B 8859-3
                     91: Latin-3 (Dutch .. Turkish)
                     92: .TP
                     93: .B 8859-4
                     94: Latin-4 (Scandinavian)
                     95: .TP
                     96: .B 8859-5
                     97: Part 5 (Cyrillic)
                     98: .TP
                     99: .B 8859-6
                    100: Part 6 (Arabic)
                    101: .TP
                    102: .B 8859-7
                    103: Part 7 (Greek)
                    104: .TP
                    105: .B 8859-8
                    106: Part 8 (Hebrew)
                    107: .TP
                    108: .B 8859-9
                    109: Latin-5 (Finnish .. Portuguese)
                    110: .TP
                    111: .B koi8
                    112: KOI-8 (GOST 19768-74)
                    113: .TP
                    114: .B jis-kanji
                    115: ISO 2022-JP
                    116: .TP
                    117: .B ujis
                    118: EUC-JX: JIS 0208
                    119: .TP
                    120: .B ms-kanji
                    121: Microsoft, or Shift-JIS
                    122: .TP
                    123: .B jis
                    124: (from only) guesses between ISO 2022-JP, EUC or Shift-JIS
                    125: .TP
                    126: .B gb
                    127: Chinese national standard (GB2312-80)
                    128: .TP
                    129: .B big5
                    130: Big 5 (HKU version)
                    131: .TP
                    132: .B unicode
                    133: Unicode Standard 1.0
                    134: .TP
                    135: .B tis
                    136: Thai character set plus ASCII (TIS 620-1986)
                    137: .TP
                    138: .B msdos
                    139: IBM PC: CP 437
                    140: .TP
                    141: .B atari
                    142: Atari-ST character set
                    143: .SH EXAMPLES
                    144: .TP
                    145: .B tcs -f 8859-1
                    146: Convert 8859-1 (Latin-1) characters into
                    147: .SM UTF
                    148: format.
                    149: .TP
                    150: .B tcs -s -f jis
                    151: Convert characters encoded in one of several JIS encodings into
                    152: .SM UTF
                    153: format.
                    154: Unknown Kanji will be converted into
                    155: .B 0xFFFD
                    156: characters.
                    157: .TP
                    158: .B tcs -lv
                    159: Print an up-to-date list of the supported character sets.
                    160: .SH SOURCE
                    161: .B /sys/src/cmd/tcs
                    162: .SH SEE ALSO
                    163: .IR ascii (1), 
                    164: .IR rune (2), 
                    165: .IR utf (6).
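
The conversion model in DESCRIPTION (bytes in the ics character set, to runes, to bytes in the ocs character set) can be illustrated with a minimal Python sketch. This is not the tcs source. The CODECS table, the translate function, and the error-handler names are all hypothetical, and the mapping from tcs names to Python codec names is an assumption. On the input side the sketch follows the substitutions the page describes (0x80 for UTF encoding errors, 0xFFFD for unknown characters); on the output side it falls back on Python's own "replace" handler, which is this sketch's choice and not a statement about what tcs emits.

```python
import codecs

# Assumed mapping from a few tcs character set names to Python codec names.
CODECS = {
    "utf": "utf-8",
    "ascii": "ascii",
    "8859-1": "latin-1",
    "8859-5": "iso8859-5",
    "koi8": "koi8-r",
    "msdos": "cp437",
}


def _sub_0x80(exc):
    # Substitute rune 0x80 for a malformed UTF sequence, then resume after it.
    return ("\u0080", exc.end)


def _sub_fffd(exc):
    # Substitute rune 0xFFFD for a byte with no mapping in the input set.
    return ("\ufffd", exc.end)


# Hypothetical handler names, registered for use by bytes.decode below.
codecs.register_error("sketch-0x80", _sub_0x80)
codecs.register_error("sketch-fffd", _sub_fffd)


def translate(data, ics="utf", ocs="utf", clean=False):
    """Decode data from ics into runes, then encode the runes into ocs.

    clean=True imitates the effect described for -c: anything that does not
    convert is dropped instead of being replaced by a substitute character.
    """
    if clean:
        in_errors = out_errors = "ignore"
    else:
        in_errors = "sketch-0x80" if ics == "utf" else "sketch-fffd"
        out_errors = "replace"  # Python's default substitute, not tcs's
    runes = data.decode(CODECS[ics], errors=in_errors)
    return runes.encode(CODECS[ocs], errors=out_errors)


if __name__ == "__main__":
    # Roughly what `tcs -f 8859-1` does to a Latin-1 byte string.
    print(translate(b"caf\xe9", ics="8859-1"))
```

Both ics and ocs default to utf here, mirroring the defaults stated in DESCRIPTION, so calling translate with only an ics argument corresponds to the first entry in EXAMPLES.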
