Terms in this glossary attributed to (NIST) are derived from
the National Institute of Standards and Technology (NIST) glossary.
A
|
|---|
| abstract data type (ADT) | | A mathematically specified collection of data-storing
entities with operations to create, access, or change instances.
(NIST) See also data structure.
|
|---|
| algorithm | | A computable set of steps to achieve a desired result. The
word comes from the Persian author Abu Ja'far Mohammed ibn Mūsā al-Khowārizmī who
wrote a book with arithmetic rules dating from about 825 A.D. (NIST)
|
|---|
| API | | See Application Programming Interface.
|
|---|
| Application Programming Interface (API) | | The interface (calling conventions) by which an application
program accesses an operating system and other services.
|
|---|
| array | | A set of items that are randomly accessible by numeric index.
(NIST)
|
|---|
B
|
|---|
| B-tree | | A balanced search tree in which every node has between t-1 and 2t-1 children,
where t is an arbitrary constant. This is a
good structure if much of the tree is in slow memory (disk), since
the height, and hence the number of accesses, can be kept small,
say one or two, by picking a large t. (NIST)
|
|---|
| big-O notation | | A theoretical measurement used to compare two algorithms
that accomplish the same task, usually in terms of the time or memory
needed.
|
|---|
| binary tree | | A tree with at most two children for each node.
(NIST)
|
|---|
| bit array | | An array whose values are bits.
|
|---|
C
|
|---|
| cache | | A small fast memory holding recently accessed data, designed
to speed up subsequent access to the same data. Most often applied
to CPU access but also used for a local copy of data accessible
over a network or disk.
|
|---|
| cache line | | A block of memory local to a CPU chip that caches
the same sized block from the computer's main memory. This is analogous
to using a buffer cache to minimize disk I/O, but is implemented
in hardware. Cache lines are size-aligned and typically 16 words
in size in modern processors.
|
|---|
| cache-line fill | | Filling a cache line by accessing main memory as
the result of a cache miss for one or more bytes not currently in
any cache line. A cache-line fill takes ~30 times longer than a
cache hit.
|
|---|
| cardinal number | | A basic or primary value. Examples of cardinal numbers
are 1, 7, 9, and 123. Contrast with ordinal number.
|
|---|
| count | | In the Judy technology, the number of indexes between
any two indexes, inclusive of the two indexes.
|
|---|
D
|
|---|
| data structure | | An organization of information, usually in memory, for
better algorithm efficiency, such as queue, stack, linked list, heap,
dictionary, and tree. It may include redundant information, such
as length of the list or number of nodes in a subtree. (NIST) See also abstract data type.
|
|---|
| density | | A population divided by its expanse.
|
|---|
| digital tree | | A tree in which the key is decoded one digit or
byte at a time.
|
|---|
| dynamic array | | An array whose size may change over time. Items are
not only added or removed, but memory used changes, too. (NIST)
|
|---|
E
|
|---|
| expanse | | The range of possible indexes that can be stored
under one pointer (root or lower) in a tree structure. For example,
a root pointer to a 32-bit Judy array has an expanse of 0... 232.
|
|---|
H
|
|---|
| hash function | | A function which maps keys to integers, hopefully
to get a more even distribution on a smaller set of values. (NIST)
|
|---|
| hash table | | A dictionary in which keys are mapped to array
positions by a hash function. Having more than one key map to the
same position is called a collision. There are many ways to resolve collisions,
but they may be divided into open addressing, in which all elements
are kept within the table, and chaining, in which external data
structures are used. (NIST)
|
|---|
| hybrid data structures | | Using different abstract data types (ADTs) at different
levels in a tree structure. This means switching to a different
ADT while traversing the tree as the expanse and/or population shrinks
to best represent the population's indexes with the fewest cache-line
fills and the least amount of memory.
|
|---|
I
|
|---|
| index | | A key value to an ADT that appears array-like at
the application interface. Since Judy appears array-like, keys into
Judy ADTs are referred to as indexes.
|
|---|
J
|
|---|
| Judy array | | An ADT that acts much like an ordinary array, but which
appears unbounded (except by the available RAM and word size of
the machine) and is allocated by the first store/insert into the
array. Judy indexes can be inserted, retrieved, deleted, and searched
in sorted order.
|
|---|
| Judy tree | | The internal data structure used to implement the data
stored in what is presented externally as a Judy array.
|
|---|
L
|
|---|
| leaf node | | In tree structures, a node that has no nodes under
it.
|
|---|
| level compression | | Widening the nodes of a tree (branches of a Judy digital
tree) in order to have fewer indirections (address calculations) and
possible cache-line fills to access data.
|
|---|
| linear list | | A simple, packed sequence of data items that are usually
the same size and in some order (such as sorted numerically). For
example, a linear list could be a list of valid indexes of one word
each.
|
|---|
N
|
|---|
| node | | (1) A unit of reference in a data structure. Also
called a vertex in graphs and trees. (2) A collection of information
which must be kept at a single memory location. (NIST)
|
|---|
O
|
|---|
| O notation | | See big-O notation.
|
|---|
| ordinal | | The sequence in which something is in relation to
others of its kind. Examples of ordinal numbers are first, third,
11th, and 123rd. Contrast with cardinal numbers.
|
|---|
P
|
|---|
| population | | The number of indexes that have been assigned or
used.
|
|---|
S
|
|---|
| scalability | | (1) The ability of a computer application or product (hardware
or software) to continue to function well as it (or its context)
is changed in size or volume in order to meet a user need. Typically,
the rescaling is to a larger size or volume. (2) The ability not
only to function well in the rescaled situation, but to actually
take full advantage of it.
|
|---|
| skip list | | A probabilistic alternative to balanced trees that was
invented by William Pugh in 1987. Parallel lists at higher levels skip
geometrically more items. Searching begins at the highest level
to quickly get to the right part of the list, then uses progressively
lower level lists. With enough levels, searching is O(log
n).
|
|---|
| sort | | To arrange items in a predetermined order. There
are dozens of sort algorithms, the choice of which depends on factors such
as the number of items relative to working memory, knowledge of
the orderliness of the items or the range of the keys, the cost
of comparing keys versus the cost of moving items, etc. (NIST)
|
|---|
| sparse array | | An array in which some of the indexes may be invalid.
A sparse data set may be represented by any of a large variety
of data structures; an array is just one possibility. If the range of
the data set is large and the valid indexes (keys) are sparse, a simple
array is wasteful of memory although fast to access.
|
|---|
T
|
|---|
| tree structure | | A hierarchical ADT in which the non-leaf nodes (in
the graph sense) contain pointers to (that is, addresses of) other
nodes, in an acyclical and non-multi-path fashion, possibly in addition
to other data stored in the non-leaf nodes.
|
|---|
| trie | | See digital tree. .
|
|---|
U
|
|---|
| ulong_t | | In C programming, an integer type that is set to long (maximum
word length as defined by the machine) and to unsigned (a positive
integer from 0 - n). On a 32-bit machine, ulong_t is
0 to 232- 1.
On a 64-bit machine, ulong_t is 0 to 264-
1.
|
|---|
| unbounded array | | An array where the maximum size is not arbitrarily
limited (or bounded) at less than the maximum word size available
to the machine. For example, on a 32-bit machine an unbounded array
could be from 0 to 232,
while on a 64-bit machine an unbounded array could be from 0 to 264.
|
|---|
W
|
|---|
| word | | A unit of memory that is the same size as a pointer
to pointer to void, and/or an unsigned long, in native C; typically
32 or 64 bits.
|
|---|