Tuesday, June 23, 2009

If "NULL" and "0" are equivalent, which should I use?

Many programmers believe that "NULL" should be used in all pointer contexts, as a reminder that the value is to be thought of as a pointer. Others feel that the confusion surrounding "NULL" and "0" is only compounded by hiding "0" behind a #definition, and prefer to use unadorned "0" instead. There is no one right answer.
C programmers must understand that "NULL" and "0" are interchangeable and that an uncast "0" is perfectly acceptable in initialization, assignment, and comparison contexts. Any usage of "NULL" (as opposed to "0") should be considered a gentle reminder that a pointer is involved; programmers should not depend on it (either for their own understanding or the compiler's) for distinguishing pointer 0's from integer 0's.
NULL should _not_ be used when another kind of 0 is required, even though it might work, because doing so sends the wrong stylistic message. (ANSI allows the #definition of NULL to be (void *)0, which will not work in non-pointer contexts.) In particular, do not use NULL when the ASCII null character (NUL) is desired.
If you want a name for the null character, provide your own definition:
#define NUL '\0'
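As a rough illustration of the stylistic split described above, the sketch below uses NULL only where a pointer is meant and NUL only where a character is meant. The helper name safe_length is hypothetical and is not part of any library.

```c
#include <stddef.h>   /* NULL, size_t */

#define NUL '\0'      /* the null character, as defined above */

/* Hypothetical helper: length of a string, or 0 when s is a null pointer. */
size_t safe_length(const char *s)
{
    size_t n = 0;

    if (s == NULL)          /* pointer context: NULL (or a plain 0) */
        return 0;
    while (s[n] != NUL)     /* character context: NUL, never NULL */
        n++;
    return n;
}
```

Writing s[n] != NULL in the loop might compile on some systems, but it would fail wherever NULL is defined as (void *)0, and it sends the wrong message either way.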

Is the abbreviated pointer comparison "if(p)" to test for non-null pointers valid? What if the internal representation for null pointers is nonzero?

When C requires the boolean value of an expression (in the if, while, for, and do statements, and with the &&, ||, !, and ?: operators), a false value is produced when the expression compares equal to zero, and a true value otherwise. That is, whenever one writes
if(expr)
where "expr" is any expression at all, the compiler essentially acts as if it had been written as
if(expr != 0)
Substituting the trivial pointer expression "p" for "expr," we have
if(p) is equivalent to if(p != 0)
and this is a comparison context, so the compiler can tell that the (implicit) 0 is a null pointer, and use the correct value. There is no trickery involved here; compilers do work this way, and generate identical code for both statements. The internal representation of a pointer does _not_ matter.
The Boolean negation operator, !, can be described as follows:
!expr is essentially equivalent to expr?0:1
It is left as an exercise for the reader to show that
if(!p) is equivalent to if(p == 0)
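The equivalences above can be checked with a small sketch. The four function names are hypothetical; each abbreviated form is paired with its spelled-out form, and the two members of a pair should agree for every pointer value.

```c
#include <stddef.h>

/* Hypothetical helpers: abbreviated and explicit tests for a non-null pointer. */
int is_set_short(const int *p) { return p ? 1 : 0; }       /* the if(p) form      */
int is_set_long(const int *p)  { return p != 0 ? 1 : 0; }  /* the if(p != 0) form */

/* Hypothetical helpers: abbreviated and explicit tests for a null pointer. */
int is_null_short(const int *p) { return !p; }             /* the if(!p) form     */
int is_null_long(const int *p)  { return p == 0; }         /* the if(p == 0) form */
```

Given int x, the calls is_set_short(&x) and is_set_long(&x) both return 1, while is_null_short(NULL) and is_null_long(NULL) both return 1, whatever bit pattern the machine uses for null pointers.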

How should NULL be #defined on a machine which uses a nonzero bit pattern as the internal representation of a null pointer?

Programmers should never need to know the internal representation(s) of null pointers, because they are normally taken care of by the compiler. If a machine uses a nonzero bit pattern for null pointers, it is the compiler's responsibility to generate it when the programmer requests, by writing "0" or "NULL," a null pointer. Therefore, #defining NULL as 0 on a machine for which internal null pointers are nonzero is as valid as on any other, because the compiler must (and can) still generate the machine's correct null pointers in response to unadorned 0's seen in pointer contexts.

WHAT IS NULL AND HOW IS IT #DEFINED?

As a matter of style, many people prefer not to have unadorned 0's scattered throughout their programs. For this reason, the preprocessor macro NULL is #defined (by <stdio.h> or <stddef.h>), with value 0 (or (void *)0, about which more later). A programmer who wishes to make explicit the distinction between 0 the integer and 0 the null pointer can then use NULL whenever a null pointer is required. This is a stylistic convention only; the preprocessor turns NULL back to 0 which is then recognized by the compiler (in pointer contexts) as before. In particular, a cast may still be necessary before NULL (as before 0) in a function call argument.
(The summary table in the entry on how to get a null pointer applies for NULL as well as 0.)
NULL should _only_ be used for pointers.

HOW DO I "get" A NULL POINTER IN MY PROGRAMS?

According to the language definition, a constant 0 in a pointer context is converted into a null pointer at compile time. That is, in an initialization, assignment, or comparison when one side is a variable or expression of pointer type, the compiler can tell that a constant 0 on the other side requests a null pointer, and generate the correctly-typed null pointer value. Therefore, the following fragments are perfectly legal:
char *p = 0;
if(p != 0)
However, an argument being passed to a function is not necessarily recognizable as a pointer context, and the compiler may not be able to tell that an unadorned 0 "means" a null pointer. For instance, the Unix system call "execl" takes a variable-length, null-pointer-terminated list of character pointer arguments. To generate a null pointer in a function call context, an explicit cast is typically required:
execl("/bin/sh", "sh", "-c", "ls", (char *)0);
If the (char *) cast were omitted, the compiler would not know to pass a null pointer, and would pass an integer 0 instead. (Note that many Unix manuals get this example wrong.)
When function prototypes are in scope, argument passing becomes an "assignment context," and most casts may safely be omitted, since the prototype tells the compiler that a pointer is required, and of which type, enabling it to correctly cast unadorned 0's. Function prototypes cannot provide the types for variable arguments in variable-length argument lists, however, so explicit casts are still required for those arguments. It is safest always to cast null pointer function arguments, to guard against varargs functions or those without prototypes, to allow interim use of non-ANSI compilers, and to demonstrate that you know what you are doing.
Summary:

Unadorned 0 okay:              Explicit cast required:

initialization                 function call,
                               no prototype in scope
assignment
                               variable argument in
comparison                     varargs function call

function call,
prototype in scope,
fixed argument
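The two columns of the summary can be sketched in code. Both functions below are hypothetical stand-ins: count_strings imitates the calling convention of execl (a null-pointer-terminated list of char pointers), and is_null is an ordinary fixed-argument function with a prototype in scope.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical varargs function: counts the strings passed, stopping at
   the null char pointer that terminates the list. */
int count_strings(const char *first, ...)
{
    va_list ap;
    int n = 0;
    const char *s = first;

    va_start(ap, first);
    while (s != NULL) {
        n++;
        s = va_arg(ap, char *);   /* the caller must really pass a char * here */
    }
    va_end(ap);
    return n;
}

/* Hypothetical fixed-argument function; its definition serves as the prototype. */
int is_null(const char *p)
{
    return p == NULL;
}
```

A call such as count_strings("sh", "-c", "ls", (char *)0) needs the cast, because the terminator is a variable argument and the prototype says nothing about its type. A call such as is_null(0) does not, because the prototype tells the compiler that a char pointer is wanted.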

WHAT IS THIS INFAMOUS NULL POINTER, ANYWAY?

The language definition states that for each pointer type, there is a special value -- the "null pointer" -- which is distinguishable from all other pointer values and which is not the address of any object. That is, the address-of operator & will never yield a null pointer, nor will a successful call to malloc. (malloc returns a null pointer when it fails, and this is a typical use of null pointers: as a "special" pointer value with some other meaning, usually "not allocated" or "not pointing anywhere yet.")
A null pointer is conceptually different from an uninitialized pointer. A null pointer is known not to point to any object; an uninitialized pointer might point anywhere. See also questions 49, 55, and 85.
As mentioned in the definition above, there is a null pointer for each pointer type, and the internal values of null pointers for different types may be different. Although programmers need not know the internal values, the compiler must always be informed which type of null pointer is required, so it can make the distinction if necessary.
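A minimal sketch of the typical use mentioned above, where a null pointer serves as the "not allocated" signal. The helper name copy_string is hypothetical; malloc, strlen, and strcpy are the standard library functions.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of s, or a null pointer if s is
   null or the allocation fails. The caller frees the result. */
char *copy_string(const char *s)
{
    char *copy;

    if (s == NULL)              /* nothing to copy */
        return NULL;
    copy = malloc(strlen(s) + 1);
    if (copy == NULL)           /* malloc failed and returned a null pointer */
        return NULL;
    strcpy(copy, s);
    return copy;
}
```

Because a successful malloc never returns a null pointer, the caller can tell success from failure with a single comparison against NULL. An uninitialized char * offers no such guarantee, which is one reason to initialize pointers to null until they point somewhere useful.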

Monday, June 22, 2009

WHAT IS A COMPILER?

This is not a silly question. We were all beginners at one time and asked the same question. The following answer is provided for those that have no programming experience.
A computer cannot understand the spoken or written language that we humans use in our day-to-day conversations, and likewise, we cannot understand the binary language that the computer uses to do its tasks. It is therefore necessary for us to write instructions in some specially defined language, in this case C, which we can understand, then have that very precise language converted into the very terse language that the computer can understand. This is the job of the compiler.
A C compiler is itself a computer program whose only job is to convert the C program from our form to a form the computer can read and execute. The computer prefers a string of 1's and 0's that mean very little to us, but can be very quickly and accurately understood by the computer. The original C program is called the "source code", and the resulting compiled code produced by the compiler is usually called an "object file".
One or more object files are combined with predefined libraries by a linker, sometimes called a binder, to produce the final complete file that can be executed by the computer. A library is a collection of pre-compiled "object code" that provides operations that are done repeatedly by many computer programs.
Any good compiler package that you purchase will provide not only a compiler, but also an editor, a debugger, a library, and a linker. Online documentation and help files are usually included, and many compilers have a tutorial to walk you through the steps of compiling, linking and executing your first program.

C IS USUALLY FIRST

The programming language C was originally developed by Dennis Ritchie of Bell Laboratories and was designed to run on a PDP-11 with a UNIX operating system. Although it was originally intended to run under UNIX, there has been a great interest in running it under the MS-DOS operating system on the IBM PC and compatibles. It is an excellent language for this environment because of the simplicity of expression, the compactness of the code, and the wide range of applicability. Also, due to the simplicity and ease of writing a C compiler, it is usually the first high level language available on any new computer, including microcomputers, minicomputers, and mainframes.
C is not the best beginning language because it is somewhat cryptic in nature. It allows the programmer a wide range of operations from high level down to a very low level, approaching the level of assembly language. There seems to be no limit to the flexibility available. One experienced C programmer made the statement, "You can program anything in C", and the statement is well supported by my own experience with the language. Along with the resulting freedom however, you take on a great deal of responsibility because it is very easy to write a program that destroys itself due to the silly little errors that a good Pascal compiler will flag and call a fatal error. In C, you are very much on your own as you will soon find.

Sunday, June 21, 2009