HP C/HP-UX Online Help





Program Organization

Lexical Elements
Declarations
Constants
Structuring a C Program

Lexical Elements

C language programs are composed of lexical elements. The lexical elements of the C language are characters and white space that are grouped together into tokens. This section describes the following syntactic objects: white space, newlines, and continuation lines; comments; identifiers; and keywords.

White Space, Newlines, and Continuation Lines

In C source files, blanks, newlines, vertical tabs, horizontal tabs, and formfeeds are all considered to be white space characters.

The main purpose of white space characters is to format source files so that they are more readable. The compiler ignores white space characters, except when they are used to separate tokens or when they appear within string literals.

The newline character is not treated as white space in preprocessor directives. A newline character is used to terminate preprocessor directives. See Overview of the Preprocessor for more information.

The line continuation character in C is the backslash (\). Use the continuation character at the end of the line when splitting a quoted string or preprocessor directive across one or more lines of source code.

Spreading Source Code Across Multiple Lines

You can split a string or preprocessor directive across two or more lines. To do so, you must place the continuation character (\) at the end of each line that is continued; for example:
#define foo_macro(x,y,z) ((x) + (y))\
                         * ((z) - (x))
printf("This is a very, very, very lengthy and \
very, very uninteresting string.");

Comments

A comment is any series of characters beginning with /* and ending with */. The compiler replaces each comment with a single space character.

HP C allows comments to appear anywhere in the source file except within identifiers or string literals. The C language does not support nested comments.

In the following example, a comment follows an assignment statement:

average = total / number_of_components; /* Find mean value. */
Comments may also span multiple lines, as in:
/*
    This is a
    multi-line comment.
 */

Identifiers

Identifiers, also called names, can consist of letters, digits, underscores (_), and, in HP C, dollar signs ($). The first character must be a letter, an underscore, or a $ sign. Identifiers that begin with an underscore are generally reserved for system use. The ANSI/ISO standard reserves all names that begin with two underscores or an underscore followed by an uppercase letter for system use.

NOTE  HP C allows the dollar sign ($) in identifiers, including using the $ in the first character, as an extension to the language.

Identifiers cannot conflict with reserved Keywords .

Legal Identifiers

    meters
    green_eggs_and_ham
    system_name
    UPPER_AND_lower_case
    $name       Legal in HP C, but non-standard

Illegal Identifiers

    20_meters    Starts with a digit
    int          The int type is a reserved keyword
    no$#@good    Contains illegal characters

Length of Identifiers

HP C identifiers are unique up to 256 characters.

The ANSI/ISO standard requires compilers to treat at least the first 31 characters of a local (internal) name and the first 6 characters of a global (external) name as significant.

To improve portability, it is a good idea to make your local variable names unique within the first 31 characters, and global variable names unique within the first 6 characters.

Case Sensitivity in Identifiers

In C, identifier names are always case-sensitive. An identifier written in uppercase letters is considered different from the same identifier written in lowercase. For example, the following three identifiers are all unique:
kilograms
KILOGRAMS
Kilograms
Some HP-UX programming languages (such as Pascal and FORTRAN) are case-insensitive. When writing an HP C program that calls routines from these other languages, you must be aware of this difference in sensitivity.

Strings are also case-sensitive. The system recognizes the following two strings as distinct:

"THE RAIN IN SPAIN"
"the rain in spain"

Keywords

HP C supports the list of keywords shown below. You cannot use keywords as identifiers; if you do, the compiler reports an error. You cannot abbreviate a keyword, and you must enter keywords in lowercase letters.
NOTE  The const, signed, and volatile keywords are part of the ANSI/ISO standard, but not part of the K&R language definition.

auto

Causes the variable to be dynamically allocated and initialized only when the block containing the variable is being executed. This is the default for local variables.

break

See break .

case

An optional element in a switch statement. The case label is followed by an integral constant expression and a colon (:). No two case constant expressions in the same switch statement can have the same value. For example:
switch (getchar())
{
   case 'r':
   case 'R':
      moveright();
      break;
      ...
}

char

The char keyword defines an integer type that is 1 byte long.

A char type has a minimum value of -128 and a maximum value of 127.

An unsigned char is also 1 byte long, with a minimum value of 0 and a maximum value of 255.

const

Specifies that an object of any type must have a constant value throughout the scope of its name. For example:
/* declare factor as a constant float */
const float factor = 2.54;
The value of factor cannot change after the initialization.

continue

See continue .

default

A keyword used within the switch statement to identify the catch-all-else statement. For example:
switch (grade){
   case 'A':
      printf("Excellent\n");
      break;
   default:
      printf("Invalid grade\n");
      break;
}

do

See do...while .

double

A 64-bit data type for representing floating-point numbers. The approximate range in decimal for double is:

Other floating-point types are float, float80, extended and long double.

else

See if.

enum

See Enumeration .

extern

Used for declarations both within and outside of a function (except for function arguments). Signifies that the object is declared somewhere else.

extended

An 80-bit data type representing floating-point numbers.

The approximate range in decimal for extended is:

Other floating-point types are float, float80, double and long double.

float

A 32-bit data type for representing floating-point numbers.

The approximate range in decimal for float is:

Other floating-point types are extended, float80, double and long double.

float80

An 80-bit data type for representing floating-point numbers on the IPF platform.

The range for float80 is:

Other floating-point types are float, extended, double and long double.

for

See for.

goto

See goto .

if

See if .

int

A 32-bit data type for representing whole numbers.

The range for int is -2,147,483,648 through 2,147,483,647.

The range for unsigned int is 0 through 4,294,967,295.

long

A 32-bit integer data type in the HP-UX 32-bit data model. The range for long is -2,147,483,648 through 2,147,483,647. For the HP-UX 64-bit data model, the long data type is 64 bits and the range is the same as the long long data type.

The long long 64-bit data type is supported as an extension to the language when you use the -Ae compile-line option.

The range for long long is -9,223,372,036,854,775,808 through +9,223,372,036,854,775,807.

long double

A 128-bit data type representing floating-point numbers.

The approximate range in decimal for long double is:

Other floating-point types are extended, float, float80, and double.

register

Indicates to the compiler that the variable is heavily used and may be stored in a register for better performance.

return

See return .

short

A 16-bit integer data type.

The range for short is -32,768 through 32,767.

The range for unsigned short is 0 through 65,535.

signed

All integer data types are signed by default. The high-order bit is used to indicate whether a value is greater than or less than zero. Use this modifier for better source code readability. The signed keyword can be used with the integer data types. Whether char is signed or unsigned by default is implementation-defined; the signed keyword lets you explicitly declare a signed char in a portable way.

sizeof

See sizeof Operator.

static

A variable that has memory allocated for it at program startup time. The variable is associated with a single memory location until the end of the program.

struct

See Structure and Union Tags .

switch

See switch.

__thread

This HP-specific keyword defines a thread specific data variable, distinguishing it from other data items that are shared by all threads. With a thread-specific data variable, each thread has its own copy of the data item. These variables eliminate the need to allocate thread-specific data dynamically, thus improving performance.

This keyword is implemented as an HP-specific type qualifier, with the same syntax as const and volatile, but not the same semantics. Syntax examples:

__thread int var;
int __thread var;
Semantics for the __thread keyword:

Only variables of static duration can be thread specific.

Thread specific data objects cannot be initialized.

Pointers of static duration that are not thread specific may not be initialized with the address of a thread specific object; assignment is okay.

All global variables, thread specific or not, are initialized to zero by the linker implicitly.

Only one declaration, for example,

__thread int x;
is allowed in one compilation unit that contributes to the program (including libraries linked into the executable). All other declarations must be strictly references:
extern __thread int x;
Even though __thread has the same syntax as a type qualifier, it does not qualify the type, but is a storage class specification for the data object. As such, it is type compatible with non-thread-specific data objects of the same type. That is, a thread specific int is type compatible with an ordinary int (unlike a const or volatile qualified int).

Note that use of the __thread keyword in a shared library will prevent that shared library from being dynamically loaded (that is, loaded via an explicit call to shl_load()).

typedef

See Typedef Declarations.

union

See Structure and Union Tags .

unsigned

A data type modifier that indicates that no sign bit will be used. The data is assumed to contain values greater than or equal to zero. All integer data types are signed by default. The unsigned keyword can be used to modify the integer data types.

void

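The void data type has three important purposes: to indicate that a function does not return a value, to declare a function that takes no arguments, and to create generic pointers (void *). To indicate that a function does not return a value, you can write a function definition such as: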
void func(int a, int b)
{
      . . .    
}
This indicates that the function func() does not return a value. Likewise, on the calling side, you declare func() as:
extern void func(int, int);

volatile

Specifies that the value of a variable might change in ways that the compiler cannot predict. If volatile is used, the compiler will not perform certain optimizations on that variable.

while

See while .

Declarations

This section describes variable declarations, typedef declarations, and name spaces.

In general, a variable declaration has the following format:

[storage_class_specifier] [data_type] variable_name
    [=initial_value];
Here are a few sample variable declarations without storage class specifiers or initial values:
int   age;                 /* integer variable "age" */
int length, width;         /* abbreviated declaration of two
                              variables*/
float ph;                  /* floating-point variable "ph" */
char  a_letter;            /* character variable "a_letter" */
int   values[10];          /* array of 10 integers named values */
enum  days {mon, wed, fri}; /* enumeration type "days" */

Typedef Declarations

The C language allows you to create your own names for data types with the typedef keyword. Syntactically, a typedef is similar to a variable declaration except that the declaration is preceded by the typedef keyword.

A typedef declaration may appear anywhere a variable declaration may appear and obeys the same scoping rules as a normal declaration. Once declared, a typedef name may be used anywhere that the type is allowed (such as in a declaration, cast operation, or sizeof operation). You can write typedef names in all uppercase so that they are not confused with variable names.

You may not include an initializer with a typedef.

The statement:

typedef long int FOUR_BYTE_INT;
makes the name FOUR_BYTE_INT synonymous with long int. The following two declarations are now identical:
long int j;
FOUR_BYTE_INT j;

Abstract Global Types

Typedefs are useful for abstracting global types that can be used throughout a program, as shown in the following structure and array declaration:
typedef struct {
    char  month[4];
    int   day;
    int   year;
} BIRTHDAY;
 
typedef char A_LINE[80]; /* A_LINE is an array of
                          * 80 characters */

Improving Portability

Type definitions can be used to compensate for differences in C compilers. For example:
#if SMALL_COMPUTER
    typedef int SHORTINT;
    typedef long LONGINT;

#elif BIG_COMPUTER
    typedef short SHORTINT;
    typedef int LONGINT;

#endif

This is useful when writing code to run on two computers, a small computer where an int is two bytes, and a large computer where an int is four bytes. Instead of using short, long, and int, you can use SHORTINT and LONGINT and be assured that SHORTINT is two bytes and LONGINT is four bytes regardless of the machine.

Simplifying Complex Declarations

You can use typedefs to simplify complex declarations. For example:
typedef float *PTRF, ARRAYF[], FUNCF();
This declares three new types called PTRF (a pointer to a float), ARRAYF (an array of floats), and FUNCF (a function returning a float). These typedefs could then be used in declarations such as the following:
PTRF x[5];  /* a 5-element array of pointers to floats */
FUNCF z;    /* A function returning a float */

Using typedefs for Arrays

The following two examples illustrate what can happen when you mix pointers and typedefs that represent arrays. The problem with the first program is that ptr points to an array of 80 chars, rather than a single element of a char array. Because of scaling in pointer arithmetic, the increment operator adds 80 bytes, not one byte, to ptr.

Wrong:

typedef char STR[80];
STR   string, *ptr;

main()
{
   ptr = string;
   printf("ptr = %d\n", ptr);
   ptr++;
   printf("ptr = %d\n", ptr);
}

*** Run-Time Results ***

ptr = 3997696
ptr = 3997776

Right:

typedef char STR[80];
STR   string;
char  *ptr;

main()
{
   ptr = string;
   printf("ptr = %d\n", ptr);
   ptr++;
   printf("ptr = %d\n", ptr);
}

*** Run-Time Results ***

ptr = 3997696
ptr = 3997697

Name Spaces

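All identifiers (names) in a program fall into one of four name spaces. Names in different name spaces never interfere with each other. That is, you can use the same name for an object in each of the four name spaces without these names affecting one another. The four name spaces are as follows:

Structure, union, and enum tag names
Member names (each struct or union has its own name space for its members)
Statement labels
All other names (such as variables, functions, typedef names, and enumeration constants)
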
NOTE  The separate name spaces for goto labels and for each struct, union, or enum definition are part of the ANSI/ISO standard, but not part of the K&R language definition.

The following example uses the same name, overuse, in four different ways:

int main(void)
{
    int overuse;       /* normal identifier */
    struct overuse {   /* tag name */
        float overuse; /* member name */
        char *p;
    } x;
    goto overuse;
overuse: overuse = 3;  /* label name */
}

Structure, Union, and Enum Names

Each struct, union, or enum defines its own name space, so that different declarations can have the same member names without conflict. The following is legal:
struct A {
    int x;
    float y;
};
struct B {
    int x;
    float y;
};
The members in struct A are distinct from the members in struct B.

Macro Names

Macro names do interfere with the other four name spaces. Therefore, when you specify a macro name, do not use this name in one of the other four name spaces. For example, the following program fragment is incorrect because it contains a macro named square and a label named square:
#define square(arg)  arg * arg

int main(void) {   ... square: ... }

Constants

There are four types of constants in C: integer constants, floating-point constants, character constants, and string constants.
Every constant has two properties: value and type. For example, the constant 15 has value 15 and type int.

Integer Constants

HP C supports three forms of integer constants: decimal; octal, written with a leading 0; and hexadecimal, written with a leading 0x or 0X.

Integer constants may not contain any punctuation such as commas or periods.

Examples of Integer Constants

The following examples show some legal constants in decimal, octal, and hexadecimal form:
 
Table 2:
Decimal  Octal  Hexadecimal
3        003    0x3
8        010    0x8
15       017    0xF
16       020    0x10
21       025    0x15
-87      -0127  -0x57
187      0273   0xBB
255      0377   0xff

Floating-Point Constants

A floating-point constant is any number that contains a decimal point and/or exponent sign for scientific notation.

The number may be followed by an f or F, to signify that it is of type float, or by an l or L, to signify that it is of type long double. If the number does not have a suffix, it is of type double even if it can be accurately represented in four bytes.

If the magnitude of a floating-point constant is too great or too small to be represented in a double, the C compiler will substitute a value that can be represented. This substitute value is not always predictable.

You may precede a floating-point constant with the unary plus or minus operator to make its value positive or negative.

Scientific Notation

Scientific notation is a useful shorthand for writing lengthy floating-point values. In scientific notation, a value consists of two parts: a number called the mantissa followed by a power of 10 called the characteristic (or exponent).

The letter e or E, standing for exponent, is used to separate the two parts.

The floating-point constant 3e2, for instance, is interpreted as 3*(10^2), or 300. Likewise, the value -2.5e-4 is interpreted as -2.5/(10^4), or -0.00025.

Examples of Floating-Point Constants

Here are some examples of legal and illegal floating-point constants.
 
Table 3: Floating-Point Constants 
Constant  Legal or Illegal 
3. legal
35 legal - interpreted as an integer.
3.141 legal
3,500.45 illegal - commas are illegal.
.3333333333 legal
4E illegal - the exponent must be followed by a number
0.3 legal
-3e2 legal
4e3.6 illegal - the exponent must be an integer
3.0E5 legal
+3.6 legal
0.4E-5 legal

Character Constants

A character constant is any printable character or legal escape sequence enclosed in single quotes. A character constant can begin with the letter L to indicate that it is a wide character constant; this notation is ordinarily used for characters in an extended character set. In HP C, an ordinary character constant occupies one byte of storage; a wide character constant occupies the rightmost byte of a 4-byte integer.

The value of a character constant is the integer ISO Latin-1 value of the character. For example, the value of the constant 'x' is 120.

Escape Sequences

HP C supports several escape sequences:
 
Table 4: Character Escape Codes 
Escape Code  Character  What it Does 
\a Audible alert Rings the terminal's bell.
\b Backspace Moves the cursor back one space.
\f Formfeed Moves the cursor to the next logical page.
\n Newline Prints a newline.
\r Carriage return Prints a carriage return.
\t Horizontal tab Prints a horizontal tab.
\v Vertical tab Prints a vertical tab.
\\ Backslash Prints a backslash.
\? Question mark Prints a question mark.
\' Single quote Prints a single quote.
\" Double quote Prints a double quote.

The escape sequences for octal and hexadecimal numbers are commonly used to represent characters. For example, if ISO Latin-1 representations are being used, the letter a may be written as \141 or \x61 and Z as \132 or \x5A. This syntax is most frequently used to represent the null character as \0. This is exactly equivalent to the numeric constant zero (0). When you use the octal format, you do not need to include the zero prefix as you would for a normal octal constant.

Multi-Character Constants

Each character in an ordinary character constant takes up one byte of storage; therefore, you can store up to a 4-byte character constant in a 32-bit integer and up to a 2-byte character constant in a 16-bit integer.

For example, the following assignments are legal:

{
   char    x;              /* 1-byte integer */
   unsigned short int si;  /* 2-byte integer */
   unsigned long int li;   /* 4-byte integer */
 
/* the following two assignments are portable: */
   x  =  'j';      /* 1-byte character constant */
   li = L'j';      /* 4-byte wide char constant */
 
/* the following two assignments are not portable,
   and are not recommended: */
   si = 'ef';     /* 2-character constant */
   li = 'abcd';   /* 4-character constant */
}
The variable si is assigned the values of e and f, where each character takes up 8 bits of the 16-bit value. The HP C compiler places the last character in the rightmost (least significant) byte. Therefore, the constant 'ef' will have a hexadecimal value of 6566. Since the order in which bytes are assigned is machine dependent, other machines may reverse the order, assigning f to the most significant byte. In that case, the resulting value would be 6665. For maximum portability, do not use multi-character constants. Use character arrays instead.

String Constants

A string constant is any series of printable characters or escape characters enclosed in double quotes. The compiler automatically appends a null character (\0) to the end of the string so that the size of the array is one greater than the number of characters in the string. For example,
"A short string"
becomes an array with 15 elements.

Like a character constant, a string constant can begin with the letter L to indicate that it is a string constant in an extended character set.

To span a string constant over more than one line, use the backslash character (\), also called the continuation character. The following, for instance, is legal:

    strcpy(string,"This is a very long string that requires more \
than one line");
Note that if you indent the second line, the spaces will be part of the string.

The compiler concatenates adjacent string constants. Therefore, you can also span a string constant over more than one line as shown:

strcpy(string, "This is a very long string that requires more "
               "than one line");
When you indent the second line with this method, the spaces are not part of the string.

The type of a string is array of char, and strings obey the same conversion rules as other arrays. Except when a string appears as the operand of sizeof or as an initializer, it is converted to a pointer to the first element of the string. Note also that the null string,

""
is legal, and contains a single trailing null character.

Structuring a C Program

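When you write a C program, you can put all of your source code into one file or spread it across many files. A typical C source file contains some or all of the following components: preprocessor directives, global declarations (such as typedefs and variables), and function definitions.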

Example

The following shows how a program can be organized:

/* preprocessor directives */

#include <stdio.h>
#define WEIGHTING_FACTOR 0.6
/* global typedef declaration */
typedef float THIRTY_TWO_BIT_REAL;
/* global variable declaration */
THIRTY_TWO_BIT_REAL correction_factor = 1.15;
/* function definition (prototype-style header) */
float average (float arg1, THIRTY_TWO_BIT_REAL arg2)
/* start of function body */
{
   /* local variable declaration */
    float mean;
   /* assignment statement */
    mean = (arg1 * WEIGHTING_FACTOR) +
           (arg2 * (1.0 - WEIGHTING_FACTOR));
/* return statement */
     return (mean * correction_factor);
/* end of function body */
}
int main(void)
/* start of function body */
{
/* local variable declarations */ 
    float value1, value2, result;
/* statements */
    printf("Enter two values -- ");
    scanf("%f%f", &value1, &value2);
    result = average(value1, value2);
/* continuation line */
    printf("The weighted average using a correction \
factor of %4.2f is %5.2f\n", correction_factor, result);
    return 0;
/* end of function body */
}