People always say that macros are unsafe, that they do not (directly) type-check their arguments, and so on. Worse: when errors occur, the compiler gives intricate and incomprehensible diagnostics, because the macro is just a mess.
Is it possible to use macros in almost the same way as functions: with safe type-checking, avoiding the typical pitfalls, and in a way that lets the compiler give the right diagnostics?
Let us quickly recall some typical pitfalls produced by macros.
Example 1
#define SQUARE(X) X*X
int i = SQUARE(1+5);
Intended value of i: 36.
True value of i: 11 (with macro expansion: 1+5*1+5). Pitfall!
(Typical) Solution (Example 2)
#define SQUARE(X) (X)*(X)
int i = (int) SQUARE(3.9);
Intended value of i: 15.
True value of i: 11 (after macro expansion: (int) (3.9)*(3.9); the cast binds only to the first factor, so 3*3.9 = 11.7 is truncated to 11). Pitfall!
(Typical) Solution (Example 3)
#define SQUARE(X) ((X)*(X))
It works fine with integers and floats, but it is easily broken:
int x = 2;
int i = SQUARE(++x);
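Intended value of i: 9 (because (2+1)*(2+1)...).
True value of i: 12 (macro expansion: ((++x)*(++x)), which gives 3*4). Pitfall! Strictly speaking, modifying x twice without an intervening sequence point is undefined behavior, so 12 is only what a typical compiler happens to produce.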
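The three examples can be compared side by side. The following is only a sketch: the names SQUARE_V1, SQUARE_V2, SQUARE_V3, square_int and show_square_pitfalls are hypothetical (the versions are renamed so they can coexist), and the ++x case is applied only to the function, because with the macro it is undefined behavior.

```c
#include <stdio.h>

/* The three macro versions from the examples, renamed so they can coexist. */
#define SQUARE_V1(X) X*X
#define SQUARE_V2(X) (X)*(X)
#define SQUARE_V3(X) ((X)*(X))

/* Hypothetical function counterpart: its argument is evaluated exactly once
   and converted to int by the prototype. */
static inline int square_int(int v)
{
    return v * v;
}

/* Prints what each version yields for the inputs used in the examples. */
void show_square_pitfalls(void)
{
    int x = 2;
    int a = SQUARE_V1(1+5);         /* 1+5*1+5            -> 11 */
    int b = (int) SQUARE_V2(3.9);   /* (int)(3.9)*(3.9)   -> 11.7 -> 11 */
    int c = (int) SQUARE_V3(3.9);   /* (int)((3.9)*(3.9)) -> 15 */
    int d = square_int(++x);        /* x becomes 3 once   -> 9 */

    printf("%d\n%d\n%d\n%d\n", a, b, c, d);
}
```

The function gives the intended result in the last case simply because ++x is evaluated once, before the call. That is the behavior the rest of the discussion tries to imitate with macros.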
A nice method for type-checking in macros can be found here:
However, I want more: some kind of interface or "standard" syntax, and a (small) number of easy-to-remember rules. The intent is to be able to use (not to implement) macros that are as similar to functions as possible. That means: well-written fake functions.
Why is that interesting in some way?
I think that is an interesting challenge to achieve in C.
Is it useful?
Edit: In standard C it is not possible to define nested functions. But sometimes one would prefer to be able to define short (inline) functions nested inside other ones. Thus, a function-like prototyped macro would be a possibility to take into account.
Macros and functions also differ in speed. A macro can be faster than a function because it involves no function-call overhead, although compilers often inline small functions anyway.
The Concept of C Macros
Macros can even accept arguments; such macros are known as function-like macros. They can be useful when tokens are concatenated into code to simplify some complex declarations. Macros provide text replacement at preprocessing time.
For portability to C89/C90 compilers, a macro should not have more than 31 parameters (C99 raises the guaranteed minimum to 127). The parameter list may end with an ellipsis (...). In this case, the identifier __VA_ARGS__ may appear in the replacement list.
A macro in C is a named piece of code that the preprocessor replaces with the macro's value. It is defined with the #define preprocessor directive, and the definition does not end with a semicolon (;). A macro is just a name given to certain values or expressions; it does not refer to any memory location.
This answer is divided into 4 sections:
(1.) 1st case. Block macros (or non-returning value macros)
Let us consider easy examples first. Suppose that we need a "command" that prints the square of integer numbers, followed by '\n'.
We decide to implement it with a macro.
But we want the argument to be verified by the compiler as an int.
We write:
#define PRINTINT_SQUARE(X) { \
int x = (X); \
printf("%d\n", x*x); \
}
The parentheses in (X) avoid almost all pitfalls.
X is invoked only once inside the macro. This avoids the pitfall of Example 3 of the question.
The value of X is immediately held in the variable x.
In the rest of the macro, we use the variable x instead of X.
If we systematize this discipline, the typical problems of macros will be avoided.
Now, something like this correctly prints 9:
int i = 3;
PRINTINT_SQUARE(i++);
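To check that the argument really is evaluated only once, here is a small sketch built on the macro above. The helper counter_after_print is hypothetical and exists only to make the effect on i observable.

```c
#include <stdio.h>

#define PRINTINT_SQUARE(X) { \
    int x = (X);             \
    printf("%d\n", x*x);     \
}

/* Hypothetical helper: "calls" the macro with a side-effecting argument
   and returns the counter afterwards. */
int counter_after_print(int start)
{
    int i = start;
    PRINTINT_SQUARE(i++);   /* prints start*start; i++ runs exactly once */
    return i;               /* start + 1 */
}
```

With start equal to 3 the macro prints 9 and the helper returns 4, which is what a real function call with the argument i++ would also leave behind.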
Obviously this approach could have a weak point: the variable x defined inside the macro could conflict with other variables in the program also called x. This is a scope issue. However, it's not a problem since the macro body has been written as a block enclosed by { }. This is enough to handle every scope issue, and every potential problem with the "inner" variable x is tackled.
It could be argued that the variable x is an extra object and maybe not desired.
But x has (only) temporary duration: it is created at the beginning of the macro, with the opening {, and it is destroyed at the end of the macro, with the closing }.
In this way, x works like a function parameter: a temporary variable is created to hold the value of the parameter, and it is finally discarded when the macro "returns".
We are not committing any sin that functions have not already committed!
More important: when the programmer attempts to "call" the macro with a wrong parameter, the compiler gives the same diagnostic that a function would give under the same situation.
So, it seems every macro pitfall has been solved!
However, we have a little syntactical issue, as you can see here:
do ... while (0) macro substitutions
Therefore, it is imperative (I say) to add a do {} while(0) construct to the block-like macro definition:
#define PRINTINT_SQUARE(X) do { \
int x = (X); \
printf("%d\n", x*x); \
} while(0)
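The syntactical issue shows up as soon as the macro is used in an if/else without braces. The sketch below assumes the two definitions are given different names so they can be compared; PRINTINT_SQUARE_BRACES and print_square_if are hypothetical names.

```c
#include <stdio.h>

/* Brace-only version, kept only for comparison. */
#define PRINTINT_SQUARE_BRACES(X) { \
    int x = (X);                    \
    printf("%d\n", x*x);            \
}

/* do { } while(0) version: expands to a single statement that needs
   the trailing semicolon, like a function call. */
#define PRINTINT_SQUARE(X) do { \
    int x = (X);                \
    printf("%d\n", x*x);        \
} while(0)

/* Returns 1 if the square was printed, 0 otherwise. */
int print_square_if(int enabled, int value)
{
    if (enabled)
        PRINTINT_SQUARE(value);
    else
        return 0;
    /* With PRINTINT_SQUARE_BRACES the "call" above would expand to
       "{ ... };" and the stray ';' would detach the else from the if,
       which is a syntax error. */
    return 1;
}
```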
Now, this do { } while(0) stuff works fine, but it is unaesthetic.
The problem is that it has no intuitive meaning for the programmer.
I suggest the use of a meaningful approach, like this:
#define xxbeg_macroblock do {
#define xxend_macroblock } while(0)
#define PRINTINT_SQUARE(X) \
xxbeg_macroblock \
int x = (X); \
printf("%d\n", x*x); \
xxend_macroblock
(The inclusion of } in xxend_macroblock avoids some ambiguity with while(0).)
Of course, this syntax is not safe anymore.
It has to be carefully documented to avoid misuses.
Consider the following ugly example:
{ xxend_macroblock printf("Hello");
(2.) Summarizing
Block-defined macros that do not return values can behave like functions if we write them by following the disciplined style:
#define xxbeg_macroblock do {
#define xxend_macroblock } while(0)
#define MY_BLOCK_MACRO(Par1, Par2, ..., ParN) \
xxbeg_macroblock \
desired_type1 temp_var1 = (Par1); \
desired_type2 temp_var2 = (Par2); \
/* ... ... ... */ \
desired_typeN temp_varN = (ParN); \
/* (do stuff with objects temp_var1, ..., temp_varN); */ \
xxend_macroblock
MY_BLOCK_MACRO() is a statement, not an expression: there is no "return" value of any kind, not even void.
(3.) Can we provide an interface for the parameters of the macro?
Although we solved the problem of type-checking of parameters, the programmer cannot figure out what type the parameters "have". It is necessary to provide some kind of macro prototype! This is possible, and very safely, but we have to tolerate a little tricky syntax and some restrictions, also.
Can you figure out what the following lines do?
xxMacroPrototype(PrintData, int x; float y; char *z; int n; );
#define PrintData(X, Y, Z, N) { \
PrintData data = { .x = (X), .y = (Y), .z = (Z), .n = (N) }; \
printf("%d %g %s %d\n", data.x, data.y, data.z, data.n); \
}
PrintData(1, 3.14, "Hello", 4);
The first line declares a prototype for the macro PrintData.
Inside the macro, we declare a variable data which collects all the arguments of the macro, at once.
The initializer has to pair each field with the intended macro parameter; nothing stops a mix-up such as .x = (N), .n = (X).
To declare a prototype, we write xxMacroPrototype with 2 arguments:
The name of the macro.
The list of types and names of the "local" variables that will be used inside the macro. We will call these items the pseudoparameters of the macro.
The list of pseudoparameters has to be written as a list of type-variable pairs, separated (and ended) by semicolons (;).
In the body of the macro, the first statement will be a declaration of this form:
MacroName foo = { .pseudoparam1 = (MacroPar1), .pseudoparam2 = (MacroPar2), ..., .pseudoparamN = (MacroParN) }
After that, the body refers to the arguments only through foo.pseudoparam1, foo.pseudoparam2, and so on.
The definition of xxMacroPrototype() is as follows:
#define xxMacroPrototype(NAME, ARGS) typedef struct { ARGS } NAME
Simple, isn't it?
The prototype is nothing more than a typedef struct.
The list of pseudoparameters therefore has to obey the syntax rules of a struct declaration. (For example, variable-size arrays can only be at the end of the list.)
(In particular, it is recommended to use pointer-to instead of variable-size array declarators as pseudoparameters.)
The macro can only be used after the xxMacroPrototype invocation is done. However, it is easy to be disciplined with that kind of declaration, and it is easy for the programmer to respect the rules.
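Putting the pieces of this section together, a prototyped block macro could look like the sketch below. FormatPoint, its pseudoparameters, the local name args and the helper format_sample_point are hypothetical, chosen only for illustration. The body is wrapped in do { } while (0) as recommended in Section 1.

```c
#include <stdio.h>
#include <stddef.h>

#define xxMacroPrototype(NAME, ARGS) typedef struct { ARGS } NAME

/* Prototype: FormatPoint(char *out, size_t cap, int x, int y) */
xxMacroPrototype(FormatPoint, char *out; size_t cap; int x; int y; );

/* Each macro parameter is read once, into the matching pseudoparameter;
   the rest of the body uses only the fields of args. */
#define FormatPoint(Out, Cap, X, Y) do {                          \
    FormatPoint args = { .out = (Out), .cap = (Cap),              \
                         .x = (X), .y = (Y) };                    \
    snprintf(args.out, args.cap, "(%d, %d)", args.x, args.y);     \
} while (0)

/* Hypothetical usage: writes the text "(3, 4)" into buf. */
void format_sample_point(char *buf, size_t cap)
{
    FormatPoint(buf, cap, 3, 4);
}
```

The type name and the macro name can coincide because a function-like macro is expanded only when its name is followed by an opening parenthesis. Passing, say, an int where out is expected produces the usual diagnostic for initializing a char * from an int.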
Can a block-macro 'return' a value?
Yes. Actually, it can retrieve as many values as you want, by simply passing arguments by reference, as scanf() does.
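As a sketch of that idea, the block macro below "returns" two values through pointers. DIVMOD, its temporaries and the helper split_minutes are hypothetical names. The macro follows the disciplined style of Section 2 and assumes a nonzero divisor.

```c
#define xxbeg_macroblock do {
#define xxend_macroblock } while(0)

/* Hypothetical block macro with two "outputs", passed by reference. */
#define DIVMOD(Num, Den, QuotPtr, RemPtr) \
xxbeg_macroblock                          \
    int  num  = (Num);                    \
    int  den  = (Den);                    \
    int *quot = (QuotPtr);                \
    int *rem  = (RemPtr);                 \
    *quot = num / den;                    \
    *rem  = num % den;                    \
xxend_macroblock

/* Hypothetical usage: splits a number of minutes into hours and minutes. */
void split_minutes(int total, int *hours, int *minutes)
{
    DIVMOD(total, 60, hours, minutes);
}
```

Because the pointer temporaries have type int *, passing the address of an object of another type is diagnosed just as it would be for a function with the same parameters.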
But you are probably thinking of something else:
(4.) 2nd case. Function-like macros
For them, we need a slightly different method to declare macro prototypes, one that includes a type for the returned value. Also, we'll have to learn a (not hard) technique that lets us keep the safety of block macros, with a return value having the type we want.
The typechecking of arguments can be achieved as shown here:
In block macros we can declare the struct variable of type NAME inside the macro itself, thus keeping it hidden from the rest of the program. For function-like macros this cannot be done (in standard C99). We have to define a variable of type NAME before any invocation of the macro. If we are ready to pay this price, then we can earn the desired "safe function-like macro", with returning values of a specific type.
We show the code, with an example, and then we comment it:
#define xxFuncMacroPrototype(RETTYPE, MACRODATA, ARGS) typedef struct { RETTYPE xxmacro__ret__; ARGS } MACRODATA
xxFuncMacroPrototype(float, xxSUM_data, int x; float y; );
xxSUM_data xxsum;
#define SUM(X, Y) ( xxsum = (xxSUM_data){ .x = (X), .y = (Y) }, \
xxsum.xxmacro__ret__ = xxsum.x + xxsum.y, \
xxsum.xxmacro__ret__)
printf("%g\n", SUM(1, 2.2));
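The first line defines the "syntax" for function-macro prototypes.
Such a prototype has 3 arguments:
The type of the "return" value.
The name of the typedef struct that will hold the data of the macro.
The list of pseudoparameters, written as for block macros.
The "return" value is an additional field in the struct, with a fixed name: xxmacro__ret__.
It is declared, for safety, as the first element in the struct. Then the list of pseudoparameters is "pasted".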
When we use this interface (if you let me call it this way), we have to follow a series of rules, in order:
Write a prototype declaration with xxFuncMacroPrototype. It builds a typedef struct by itself, so you do not have to worry about it; just use it (in the example this type is xxSUM_data).
Declare a variable of that type (in the example: xxSUM_data xxsum;).
Define the macro (in the example: #define SUM(X, Y)).
Enclose the body of the macro in parentheses ( ), in order to obtain an EXPRESSION (thus, a "returning" value).
Inside the parentheses, first fill the variable xxsum. This is done by: xxsum = (xxSUM_data){ .x = (X), .y = (Y) },
Observe that an object of type xxSUM_data is created on the fly with the aid of compound literals, provided by C99 syntax. The fields of this object are filled by reading the arguments X, Y of the macro, just once, and surrounded by parentheses, for safety.
Then we evaluate a list of expressions and function calls, all of them separated by comma operators (,).
Finally, after the last comma, we just write xxsum.xxmacro__ret__, which is the last term in the comma expression, and thus is the "returning" value of the macro.
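The rules above can be exercised with the SUM example itself. In the sketch below the helper sum_and_advance is hypothetical; it only serves to show that an argument with a side effect is evaluated once and that the result has the prototyped type float.

```c
#define xxFuncMacroPrototype(RETTYPE, MACRODATA, ARGS) \
    typedef struct { RETTYPE xxmacro__ret__; ARGS } MACRODATA

/* Prototype: float SUM(int x, float y) */
xxFuncMacroPrototype(float, xxSUM_data, int x; float y; );
xxSUM_data xxsum;   /* storage shared by every SUM "call" */

#define SUM(X, Y) ( xxsum = (xxSUM_data){ .x = (X), .y = (Y) }, \
                    xxsum.xxmacro__ret__ = xxsum.x + xxsum.y,   \
                    xxsum.xxmacro__ret__ )

/* Hypothetical helper: adds y to *n and then advances *n;
   the increment runs exactly once. */
float sum_and_advance(int *n, float y)
{
    return SUM((*n)++, y);
}
```

One consequence of this design is worth keeping in mind: since every invocation writes to the same object xxsum, the macro is not reentrant, and concurrent use from several threads would need extra care.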
Why all that stuff? Why a typedef struct?
Using a struct is better than using individual variables, because the information is packed into one object, and the data stay hidden from the rest of the program.
We don't want to define "a lot of variables" to hold the arguments of each macro in the program.
Instead, by systematically defining a typedef struct associated with each macro, such macros become easier to handle.
Can we avoid the "external variable" xxsum above?
Since compound literals are lvalues, one might believe that this is possible.
In fact, we can define this kind of macro, as shown in:
But in practice, I cannot find a way to implement it safely.
For example, the macro SUM(X,Y) above cannot be implemented with this method only.
(I tried some tricks with pointer-to-struct + compound literals, but it seems impossible.)
UPDATE:
(5.) Breaking my code.
The example given in Section 1 can be broken this way (as Chris Dodd showed me in his comment, below):
int x = 5; /* x defined outside the macro */
PRINTINT_SQUARE(x);
Since inside the macro there is another object named x (this: int x = (X);, where X is the formal parameter of the macro PRINTINT_SQUARE(X)), what is actually "passed" as argument is not the "value" 5 defined outside the macro, but another one: a garbage value.
To understand it, let us unroll the two lines above after macro expansion:
int x = 5;
{ int x = (x); printf("%d", x*x); }
The variable x inside the block is initialized... to its own indeterminate value!
In general, the technique developed in sections 1 to 3 for block macros can be broken in a similar way, as long as the struct object we use to hold the parameters is declared inside the block.
This shows that this kind of code can be broken, so it is unsafe:
Don't try to declare "local" variables "inside" the macro to hold the parameters.
Instead, declare the object that holds the parameters outside the macro, just after the xxMacroPrototype() line. This is less ambitious, but anyway it answers the question: "How much is it possible to...?". On the other hand, now we follow the same approach for the two cases: block and function-like macros.
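One possible reading of this advice, for a block macro, is sketched below. It assumes the parameter-holding struct variable is declared at file scope right after the prototype, as is done for function-like macros in Section 4. SQUARE_INTO, xxSQUARE_INTO_data, xxsquare_into and square_of_local_x are hypothetical names.

```c
#define xxMacroPrototype(NAME, ARGS) typedef struct { ARGS } NAME

/* Prototype: SQUARE_INTO(int *dest, int x) */
xxMacroPrototype(xxSQUARE_INTO_data, int *dest; int x; );
xxSQUARE_INTO_data xxsquare_into;   /* declared outside the macro */

/* No variable is declared inside the block, so nothing in the macro
   can shadow the caller's names. */
#define SQUARE_INTO(Dest, X) do {                                      \
    xxsquare_into = (xxSQUARE_INTO_data){ .dest = (Dest), .x = (X) };  \
    *xxsquare_into.dest = xxsquare_into.x * xxsquare_into.x;           \
} while (0)

/* The caller's own variables may now be called x and dest. */
int square_of_local_x(void)
{
    int x = 5;
    int dest = 0;
    SQUARE_INTO(&dest, x);   /* .x = (x) reads the caller's x, i.e. 5 */
    return dest;             /* 25 */
}
```

The remaining risk is a clash with the file-scope name xxsquare_into itself, which the xx prefix is meant to make unlikely. The shared object also makes the macro non-reentrant.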