term(5) File formats term(5)

NAME

 term - format of compiled term file.

SYNOPSIS

 term

DESCRIPTION

STORAGE LOCATION

 Compiled terminfo descriptions are placed under the directory
 /usr/share/terminfo.  Two configurations are supported (when building the
 ncurses libraries):

 directory tree
      A two-level scheme is used to avoid a linear search of a huge UNIX
      system directory: /usr/share/terminfo/c/name where name is the name
      of the terminal, and c is the first character of name.  Thus, act4
      can be found in the file /usr/share/terminfo/a/act4.  Synonyms for
      the same terminal are implemented by multiple links to the same
      compiled file.
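
 The two-level lookup can be sketched as follows.  This is a minimal
 illustration, not the library's code; the function name terminfo_path
 and its base parameter are hypothetical.

```python
import posixpath

def terminfo_path(name, base="/usr/share/terminfo"):
    """Return the directory-tree location of a compiled entry."""
    if not name:
        raise ValueError("terminal name must not be empty")
    # The intermediate directory is the first character of the name.
    return posixpath.join(base, name[0], name)

print(terminfo_path("act4"))  # /usr/share/terminfo/a/act4
```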

 hashed database
      Using a Berkeley database, two types of records are stored: the
      terminfo data in the same format as stored in a directory tree with
      the terminfo's primary name as a key, and records containing only
      aliases pointing to the primary name.

      If built to write hashed databases, ncurses can still read terminfo
      databases organized as a directory tree, but cannot write entries
      into the directory tree.  It can write (or rewrite) entries in the
      hashed database.

      ncurses distinguishes the two cases in the TERMINFO and
      TERMINFO_DIRS environment variables by assuming a directory tree for
      entries that correspond to an existing directory, and a hashed
      database otherwise.
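
 The rule for telling the two cases apart amounts to a directory test.
 A minimal sketch, assuming the entry has already been extracted from
 the environment variable; storage_kind is a hypothetical helper, not
 part of ncurses.

```python
import os

def storage_kind(entry):
    """Classify one TERMINFO or TERMINFO_DIRS entry."""
    # An existing directory is taken to be a directory tree;
    # anything else is assumed to name a hashed database.
    if os.path.isdir(entry):
        return "directory tree"
    return "hashed database"
```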

LEGACY STORAGE FORMAT

 The format has been chosen so that it will be the same on all hardware.
 An 8 or more bit byte is assumed, but no assumptions about byte ordering
 or sign extension are made.

 The compiled file is created with the tic program, and read by the
 routine setupterm(3).  The file is divided into six parts:

      a) header,

      b) terminal names,

      c) boolean flags,

      d) numbers,

      e) strings, and

      f) string table.

 The header section begins the file.  This section contains six short
 integers in the format described below.  These integers are

      (1) the magic number (octal 0432);

      (2) the size, in bytes, of the terminal names section;

      (3) the number of bytes in the boolean flags section;

      (4) the number of short integers in the numbers section;

      (5) the number of offsets (short integers) in the strings section;

      (6) the size, in bytes, of the string table.
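
 Reading the six header fields might look like the sketch below.  It
 uses Python's struct module; the function name parse_header and the
 dictionary keys are illustrative choices, not names from ncurses.

```python
import struct

LEGACY_MAGIC = 0o432

def parse_header(data):
    """Unpack the six little-endian short integers that begin the file."""
    if len(data) < 12:
        raise ValueError("need at least 12 bytes for the header")
    fields = struct.unpack_from("<6h", data, 0)
    if fields[0] != LEGACY_MAGIC:
        raise ValueError("not a legacy-format terminfo file")
    keys = ("magic", "names_size", "bool_count",
            "num_count", "str_count", "table_size")
    return dict(zip(keys, fields))

# A synthetic header, built only for demonstration.
header = parse_header(struct.pack("<6h", 0o432, 16, 2, 3, 130, 49))
print(header["names_size"], header["str_count"])  # 16 130
```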

 The capabilities in the boolean flags, numbers, and strings sections are
 in the same order as the file <term.h>.

 Short integers are signed, in the range -32768 to 32767.  They are stored
 as two 8-bit bytes.  The first byte contains the least significant 8 bits
 of the value, and the second byte contains the most significant 8 bits.
 (Thus, the value represented is 256*second+first.)  This format
 corresponds to the hardware of the VAX and PDP-11 (that is, little-endian
 machines).  Machines where this does not correspond to the hardware must
 read the integers as two bytes and compute the little-endian value.
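
 The byte arithmetic above can be written without relying on the host's
 byte order.  A small sketch; read_short is a hypothetical name, and the
 sign handling assumes the two's-complement encoding implied by the
 stated range.

```python
def read_short(data, offset=0):
    """Decode one signed short integer stored least significant byte first."""
    first, second = data[offset], data[offset + 1]
    value = 256 * second + first
    # Values of 32768 and above stand for negative numbers.
    return value - 65536 if value >= 32768 else value

print(read_short(b"\x50\x00"))  # 80
print(read_short(b"\xff\xff"))  # -1
```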

 Numbers in a terminal description, whether they are entries in the
 numbers or strings table, are positive integers.  Boolean flags are
 treated as positive one-byte integers.  In each case, those positive
 integers represent a terminal capability.  The terminal compiler tic uses
 negative integers to handle the cases where a capability is not
 available:

 •   If a capability is absent from this terminal, tic stores a -1 in the
     corresponding table.

     The integer value -1 is represented by two bytes 0377, 0377.
     Absent boolean values are represented by the byte 0 (false).

 •   If a capability has been canceled from this terminal, tic stores a -2
     in the corresponding table.

     The integer value -2 is represented by two bytes 0376, 0377 (in file
     order, least significant byte first).
     The boolean value -2 is represented by the byte 0376.

 •   Other negative values are illegal.
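
 A reader of the numbers or strings table could interpret a decoded
 value along these lines.  The constant and function names are
 illustrative only.

```python
ABSENT = -1
CANCELED = -2

def capability_state(value):
    """Classify a decoded entry from the numbers or strings table."""
    if value == ABSENT:
        return "absent"
    if value == CANCELED:
        return "canceled"
    if value < 0:
        # Any other negative value is illegal in a compiled entry.
        raise ValueError("illegal negative value: %d" % value)
    return "present"
```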

 The terminal names section comes after the header.  It contains the first
 line of the terminfo description, listing the various names for the
 terminal, separated by the “|” character.  The terminal names section is
 terminated with an ASCII NUL character.

 The boolean flags section has one byte for each flag.  Boolean
 capabilities are either 1 or 0 (true or false) according to whether the
 terminal supports the given capability or not.

 Between the boolean flags section and the numbers section, a null byte
 will be inserted, if necessary, to ensure that the numbers section begins
 on an even byte.  This is a relic of the PDP-11's word-addressed
 architecture, originally designed to avoid traps induced by addressing a
 word on an odd byte boundary.  All short integers are aligned on a short
 word boundary.
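
 The alignment rule reduces to rounding the offset up to an even value.
 A sketch, assuming the 12-byte header described earlier; numbers_offset
 is a hypothetical helper.

```python
HEADER_SIZE = 12  # six short integers

def numbers_offset(names_size, bool_count):
    """Return the file offset at which the numbers section starts."""
    offset = HEADER_SIZE + names_size + bool_count
    # Skip one padding byte when the offset would otherwise be odd.
    return offset + (offset % 2)

print(numbers_offset(16, 2))  # 30, already even
print(numbers_offset(16, 3))  # 32, one padding byte inserted
```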

 The numbers section is similar to the boolean flags section.  Each
 capability takes up two bytes, and is stored as a little-endian short
 integer.

 The strings section is also similar.  Each capability is stored as a
 short integer.  The capability value is an index into the string table.

 The string table is the last section.  It contains all of the values of
 string capabilities referenced in the strings section.  Each string is
 null-terminated.  Special characters in ^X or \c notation are stored in
 their interpreted form, not the printing representation.  Padding
 information $<nn> and parameter information %x are stored intact in
 uninterpreted form.
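
 Fetching a string capability then means following its offset into the
 string table and reading up to the terminating null.  A minimal sketch;
 get_string is a hypothetical name, and the sample table holds a few
 bytes chosen for illustration.

```python
def get_string(table, offset):
    """Return the null-terminated string at offset, or None if not set."""
    if offset < 0:
        # -1 (absent) and -2 (canceled) have no string value.
        return None
    end = table.index(b"\0", offset)
    return table[offset:end]

table = b"\x07\x00\r\x00\x1a$<1>\x00"
print(get_string(table, 4))  # b'\x1a$<1>'
```

 Note that the padding specification $<1> comes back uninterpreted,
 while the control character is already a single byte.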

EXTENDED STORAGE FORMAT

 The previous section describes the conventional terminfo binary format.
 With some minor variations of the offsets (see PORTABILITY), the same
 binary format is used in all modern UNIX systems.  Each system uses a
 predefined set of boolean, number or string capabilities.

 The ncurses libraries and applications support an extended terminfo
 binary format, allowing users to define capabilities which are loaded at
 runtime.  This extension is made possible by the fact that the other
 implementations stop reading the terminfo data when they have reached
 the end of the size given in the header.  ncurses checks the file size
 and, if it exceeds the size accounted for by the predefined data,
 continues to parse according to its own scheme.

 First, it reads the extended header (5 short integers):

      (1)  count of extended boolean capabilities

      (2)  count of extended numeric capabilities

      (3)  count of extended string capabilities

      (4)  count of the items in extended string table

      (5)  size of the extended string table in bytes
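
 Unpacking these five values could be sketched as below.  The offset at
 which the extended header starts is passed in by the caller, since
 locating it is outside this sketch; the function name and dictionary
 keys are hypothetical.

```python
import struct

def parse_extended_header(data, offset):
    """Unpack the five short integers of the extended header."""
    keys = ("ext_bool_count", "ext_num_count", "ext_str_count",
            "ext_table_items", "ext_table_size")
    values = struct.unpack_from("<5h", data, offset)
    return dict(zip(keys, values))

# Synthetic bytes, for demonstration only.
sample = b"\0\0" + struct.pack("<5h", 1, 0, 2, 5, 40)
ext = parse_extended_header(sample, 2)
```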

 The count- and size-values for the extended string table include the
 extended capability names as well as extended capability values.

 Using the counts and sizes, ncurses allocates arrays and reads data for
 the extended capabilities in the same order as the header information.

 The extended string table contains values for string capabilities.  After
 the end of these values, it contains the names for each of the extended
 capabilities in order, e.g., booleans, then numbers and finally strings.

 Applications which manipulate terminal data can use the definitions
 described in term_variables(3) which associate the long capability names
 with members of a TERMTYPE structure.

EXTENDED NUMBER FORMAT

 On occasion, 16-bit signed integers are not large enough.  With ncurses
 6.1, a new format was introduced by making a few changes to the legacy
 format:

 •   a different magic number (octal 01036)

 •   changing the type for the number array from signed 16-bit integers to
     signed 32-bit integers.
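
 A reader that supports both formats can choose the width of each number
 from the magic number.  The sketch below assumes that the 32-bit values
 are stored least significant byte first, like the legacy short
 integers; the function names are hypothetical.

```python
import struct

MAGIC_LEGACY = 0o432
MAGIC_32BIT = 0o1036

def number_format(magic):
    """Return the struct format for one entry of the numbers section."""
    if magic == MAGIC_LEGACY:
        return "<h"  # signed 16-bit
    if magic == MAGIC_32BIT:
        return "<i"  # signed 32-bit (byte order is an assumption)
    raise ValueError("unrecognized magic number")

def read_numbers(data, offset, count, magic):
    """Decode count numbers starting at offset."""
    fmt = number_format(magic)
    size = struct.calcsize(fmt)
    return [struct.unpack_from(fmt, data, offset + i * size)[0]
            for i in range(count)]
```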

 To maintain compatibility, the library presents the same data structures
 to direct users of the TERMTYPE structure as in previous formats.
 However, that cannot provide callers with the extended numbers.  The
 library uses a similar but hidden data structure TERMTYPE2 to provide
 data for the terminfo functions.

PORTABILITY

setupterm
 Note that it is possible for setupterm to expect a different set of
 capabilities than are actually present in the file.  Either the database
 may have been updated since setupterm was recompiled (resulting in extra
 unrecognized entries in the file) or the program may have been recompiled
 more recently than the database was updated (resulting in missing
 entries).  The routine setupterm must be prepared for both possibilities -
 this is why the numbers and sizes are included.  Also, new capabilities
 must always be added at the end of the lists of boolean, number, and
 string capabilities.

Binary format
 X/Open Curses does not specify a format for the terminfo database.  UNIX
 System V curses used a directory-tree of binary files, one per terminal
 description.

 Despite the consistent use of little-endian for numbers and the otherwise
 self-describing format, it is not wise to count on portability of binary
 terminfo entries between commercial UNIX versions.  The problem is that
 there are at least three versions of terminfo (under HP-UX, AIX, and
 OSF/1) which diverged from System V terminfo after SVr1, and have added
 extension capabilities to the string table that (in the binary format)
 collide with System V and XSI Curses extensions.  See terminfo(5) for
 detailed discussion of terminfo source compatibility issues.

 This implementation is by default compatible with the binary terminfo
 format used by Solaris curses, except in a few less-used details where it
 was found that the latter did not match X/Open Curses.  The format used
 by the other Unix versions can be matched by building ncurses with
 different configuration options.

Magic codes
 The magic number in a binary terminfo file is the first 16 bits (two
 bytes).  Besides making it more reliable for the library to check that a
 file is terminfo, utilities such as file(1) also use that to tell what
 the file-format is.  System V defined more than one magic number, with
 0433, 0435 as screen-dumps (see scr_dump(5)).  This implementation uses
 01036 as a continuation of that sequence, but with a different high-order
 byte to avoid confusion.
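
 A check in the spirit of file(1) could be sketched as follows.  The
 table of descriptions and the function name classify are illustrative,
 not taken from any utility.

```python
MAGIC_NAMES = {
    0o432: "terminfo (legacy format)",
    0o433: "screen dump",
    0o435: "screen dump",
    0o1036: "terminfo (32-bit numbers)",
}

def classify(data):
    """Identify a file from its first two bytes (little-endian)."""
    if len(data) < 2:
        return "unknown"
    magic = data[0] | (data[1] << 8)
    return MAGIC_NAMES.get(magic, "unknown")

# 0432 is 0x011a and 01036 is 0x021e: the low-order byte continues the
# sequence while the high-order byte differs.
print(hex(0o432), hex(0o1036))  # 0x11a 0x21e
```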

The TERMTYPE structure
 Direct access to the TERMTYPE structure is provided for legacy
 applications.  Portable applications should use the tigetflag and related
 functions described in terminfo(3) for reading terminal capabilities.

Mixed-case terminal names
 A small number of terminal descriptions use uppercase characters in their
 names.  If the underlying filesystem ignores the difference between
 uppercase and lowercase, ncurses represents the “first character” of the
 terminal name used as the intermediate level of a directory tree in
 (two-character) hexadecimal form.
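
 The hexadecimal variant of the directory name can be sketched as below.
 The function name and the case_insensitive flag are hypothetical, and
 the use of lowercase hexadecimal digits is an assumption.

```python
import posixpath

def terminfo_path_hex(name, base="/usr/share/terminfo",
                      case_insensitive=False):
    """Return the entry path, using a hex directory name if requested."""
    first = name[0]
    # On case-insensitive filesystems the intermediate directory is
    # the two-digit hexadecimal code of the first character.
    subdir = "%02x" % ord(first) if case_insensitive else first
    return posixpath.join(base, subdir, name)

print(terminfo_path_hex("xterm", case_insensitive=True))
# /usr/share/terminfo/78/xterm
```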

EXAMPLE

 As an example, here is a description for the Lear-Siegler ADM-3, a
 popular though rather stupid early terminal:

     adm3a|lsi adm3a,
             am,
             cols#80, lines#24,
             bel=^G, clear=\032$<1>, cr=^M, cub1=^H, cud1=^J,
             cuf1=^L, cup=\E=%p1%{32}%+%c%p2%{32}%+%c, cuu1=^K,
             home=^^, ind=^J,

 and a hexadecimal dump of the compiled terminal description:

     0000  1a 01 10 00 02 00 03 00  82 00 31 00 61 64 6d 33  ........ ..1.adm3
     0010  61 7c 6c 73 69 20 61 64  6d 33 61 00 00 01 50 00  a|lsi ad m3a...P.
     0020  ff ff 18 00 ff ff 00 00  02 00 ff ff ff ff 04 00  ........ ........
     0030  ff ff ff ff ff ff ff ff  0a 00 25 00 27 00 ff ff  ........ ..%.'...
     0040  29 00 ff ff ff ff 2b 00  ff ff 2d 00 ff ff ff ff  ).....+. ..-.....
     0050  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  ........ ........
     0060  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  ........ ........
     0070  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  ........ ........
     0080  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  ........ ........
     0090  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  ........ ........
     00a0  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  ........ ........
     00b0  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  ........ ........
     00c0  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  ........ ........
     00d0  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  ........ ........
     00e0  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  ........ ........
     00f0  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  ........ ........
     0100  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  ........ ........
     0110  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  ........ ........
     0120  ff ff ff ff ff ff 2f 00  07 00 0d 00 1a 24 3c 31  ....../. .....$<1
     0130  3e 00 1b 3d 25 70 31 25  7b 33 32 7d 25 2b 25 63  >..=%p1% {32}%+%c
     0140  25 70 32 25 7b 33 32 7d  25 2b 25 63 00 0a 00 1e  %p2%{32} %+%c....
     0150  00 08 00 0c 00 0b 00 0a  00                       ........ .
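
 The first 36 bytes of this dump are enough to walk the header, the
 terminal names, the boolean flags and the numbers.  The sketch below is
 one way to do that with Python's struct module; the variable names are
 illustrative.

```python
import struct

# Bytes 0x0000 through 0x0023 of the dump above.
dump = bytes.fromhex(
    "1a 01 10 00 02 00 03 00 82 00 31 00 61 64 6d 33 "
    "61 7c 6c 73 69 20 61 64 6d 33 61 00 00 01 50 00 "
    "ff ff 18 00"
)

magic, names_size, nbool, nnum, nstr, table_size = \
    struct.unpack_from("<6h", dump, 0)

pos = 12                                   # header is six short integers
names = dump[pos:pos + names_size - 1].decode("ascii")  # drop the NUL
pos += names_size
booleans = list(dump[pos:pos + nbool])
pos += nbool
pos += pos % 2                             # align numbers on an even byte
numbers = list(struct.unpack_from("<%dh" % nnum, dump, pos))

# Size of the whole file implied by the header fields.
total = pos + 2 * nnum + 2 * nstr + table_size

print(oct(magic), names)   # 0o432 adm3a|lsi adm3a
print(booleans, numbers)   # [0, 1] [80, -1, 24]
print(hex(total))          # 0x159
```

 The values 80 and 24 correspond to cols#80 and lines#24 in the source,
 and the computed total of 0x159 bytes agrees with the last offset shown
 in the dump (0x158).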

LIMITS

 Some limitations:

 •   total compiled entries cannot exceed 4096 bytes in the legacy format.

 •   total compiled entries cannot exceed 32768 bytes in the extended
     format.

 •   the name field cannot exceed 128 bytes.

 Compiled entries are limited to 32768 bytes because offsets into the
 strings table use two-byte integers.  The legacy format could have
 supported 32768-byte entries, but was limited to a virtual memory page's
 4096 bytes.
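
 These limits can be checked mechanically.  A sketch only; check_limits
 and its parameters are hypothetical, and it reports problems rather
 than enforcing anything.

```python
MAX_ENTRY = {"legacy": 4096, "extended": 32768}
MAX_NAME = 128

def check_limits(entry_size, names_size, fmt="legacy"):
    """Return a list of limit violations for a compiled entry."""
    problems = []
    if entry_size > MAX_ENTRY[fmt]:
        problems.append("entry exceeds %d bytes" % MAX_ENTRY[fmt])
    if names_size > MAX_NAME:
        problems.append("name field exceeds %d bytes" % MAX_NAME)
    return problems
```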

FILES

 /usr/share/terminfo/*/* compiled terminal capability database

SEE ALSO

 curses(3), terminfo(5).

AUTHORS

 Thomas E. Dickey
 extended terminfo format for ncurses 5.0
 hashed database support for ncurses 5.6
 extended number support for ncurses 6.1

 Eric S. Raymond
 documented legacy terminfo format, e.g., from pcurses.

ncurses 6.4 2023-07-01 term(5)