Greetings and welcome back to Friday Q&A. This week I'm going to discuss some tips and tricks for using printf-style format strings in C, as suggested by Kevin Avila.
Almost everyone doing C or Objective-C programming uses format strings. In C, they're used by the printf family of functions. In Cocoa, NSLog and NSString both use them. They're a powerful way to build strings, but many people only know the basics. This week I'll delve into some hidden corners so you can take full advantage of the power they offer. Note that if you don't know the basics already, this article isn't going to make a lot of sense to you, so read up on a good printf tutorial before continuing.
Finding the Documentation
Hopefully all my readers know this, but just in case: if you type man printf at your shell prompt, you will get a bunch of confusing stuff that does not appear relevant to C programming. That's because you're actually reading the documentation for the shell command printf, not the C function. To see documentation on the C function, you need to type man 3 printf. The Cocoa documentation also contains information on format strings, but since the only significant difference in Cocoa format strings is the addition of the %@ specifier for printing the -description of objects, I like to just use the man page.
Varargs and Type Promotion
Format strings are always used with a function (or method) that takes variable arguments. This is important for several reasons.
First, the more obvious reason is that C doesn't provide any mechanism for the called function to know how many or what type of variable arguments it got. This means that your format string must exactly match the arguments you provide. Any mismatch could lead to bad output or a crash.
The less obvious reason is that C promotes types in values that get passed as variable arguments. In short, any integer type smaller than an int gets promoted to int, and float gets promoted to double. So when you pass in a char, you'll use a format specifier for int to print it, and likewise when you pass a float you'll use a specifier for double.
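To make the promotion rule concrete, here is a minimal sketch. The helper name format_promoted is hypothetical and exists only for illustration; it formats into a buffer with snprintf so the result can be inspected.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, for illustration only.
   Both values are promoted when they pass through the "..." of
   snprintf: the char arrives as an int, the float as a double. */
int format_promoted(char *buf, size_t size, char small, float ratio)
{
    assert(buf != NULL);
    /* %d expects an int (what the char becomes);
       %.1f expects a double (what the float becomes). */
    return snprintf(buf, size, "%d %.1f", small, ratio);
}
```

Calling format_promoted(buf, sizeof buf, 'A', 2.5f) on an ASCII system should leave "65 2.5" in buf. Neither argument needs a cast, because the promotions happen automatically.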
Types of Unknown Size
Frequently when programming in C or Cocoa you'll use a typedef whose underlying type is not guaranteed. Examples of this are size_t, CGFloat, and NSInteger.
For size_t it's easy: printf actually has a format specifier for size_t: use the z modifier with one of the standard integer specifiers, as in %zu.
For CGFloat it's also easy: because float gets promoted to double, the same %f specifier will work with either. No need to change anything.
For NSInteger you need to get a little cleverer. You can't use %d because an NSInteger might be bigger than an int. You can't use %lld because it might be smaller than a long long, and type promotion doesn't cover that case. It could even be bigger than a long long. What you'll want to do here is explicitly cast your variable to a type you know is large enough to hold it, and then use the specifier for that type: for example, cast to long long and print with %lld.
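Here is a rough sketch of both approaches in plain C. The typedef my_integer_t is a hypothetical stand-in for something like NSInteger, and format_sizes is an illustrative helper, not anything from Cocoa.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a typedef such as NSInteger, whose
   underlying type differs from platform to platform. */
typedef long my_integer_t;

int format_sizes(char *buf, size_t size, size_t count, my_integer_t value)
{
    assert(buf != NULL);
    /* %zu matches size_t directly. The cast guarantees the second
       argument really is a long long, whatever my_integer_t is. */
    return snprintf(buf, size, "%zu %lld", count, (long long)value);
}
```

The cast costs nothing when the types already match and keeps the format string correct if the typedef later changes size.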
Strings of Limited Length
The %s specifier will print a C string. This is tremendously handy. However, sometimes you want to print a sequence of characters that isn't necessarily a C string. For this, you can use the . (that's a period) modifier to specify a length. For example, here is a convenient way to turn a FourCharCode into an NSString:
uint32_t valSwapped = CFSwapInt32HostToBig(fcc); // FCCs are stored backwards on Intel
NSString *str = [NSString stringWithFormat:@"%.4s", &valSwapped];
The .4 tells NSString that the string is only four characters long, which keeps it from running off the end.
Sometimes you don't know the length ahead of time. This used to happen a lot with Pascal strings, but they're getting pretty rare these days. For this, you can use * as your length, and then it will read the length as a separate argument. (Note that this separate argument must be of type int, so beware types of unknown size!)
Here's an example of that:
printf("%.*s", length, charbuffer);
And for a Pascal string, whose first byte holds the length:
printf("%.*s", pstring[0], pstring + 1);
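As a self-contained sketch of the same idea, the two helpers below (format_bytes and format_pascal, both hypothetical names) format a buffer with no terminating NUL and a Pascal-style string. The precision limits how many bytes %s will read.

```c
#include <assert.h>
#include <stdio.h>

/* Formats at most len bytes from a buffer that may lack a trailing NUL. */
int format_bytes(char *out, size_t size, const char *bytes, int len)
{
    assert(out != NULL && bytes != NULL && len >= 0);
    return snprintf(out, size, "%.*s", len, bytes);
}

/* Pascal string layout assumed here: one length byte, then the characters. */
int format_pascal(char *out, size_t size, const unsigned char *pstr)
{
    assert(out != NULL && pstr != NULL);
    /* The length byte is cast to int because * requires an int argument. */
    return snprintf(out, size, "%.*s", (int)pstr[0], (const char *)(pstr + 1));
}
```

Given the four bytes 'T', 'E', 'X', 'T' with no terminator and a length of 4, format_bytes should produce "TEXT" without reading past the end of the buffer.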
Printing pointers is a handy thing to do, but many people don't know how to do it right. You often see code that passes a pointer to an integer specifier such as %x, which breaks whenever a pointer and an int differ in size.
The correct way is easy: just use the %p specifier. You get nice hexadecimal output and the type always matches.
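A minimal sketch of %p in use follows. The helper name format_pointer is hypothetical. The exact text that %p produces is implementation-defined, so the sketch makes no promise about what the output looks like.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes the textual form of ptr into out. */
int format_pointer(char *out, size_t size, void *ptr)
{
    assert(out != NULL);
    /* %p expects a void *. Varargs perform no pointer conversion, so
       other pointer types should be cast to void * at the call site. */
    return snprintf(out, size, "%p", ptr);
}
```

Within a single run of a program, the text can be read back with sscanf and %p to recover the same pointer, which is a handy way to check that nothing was truncated.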
Beware of NULL
This one is so commonly ignored that compilers such as clang actually have a workaround just for it, but it's still interesting to know. NULL can legally just be a 0, like so:
#define NULL 0
If you pass this NULL as a pointer argument to a vararg function like NSLog, your code is no longer conformant, because you're really passing an int! For example, passing a bare NULL for a %p specifier is, strictly speaking, wrong.
This is easy to fix: if you ever need to do this sort of thing, you can just cast the NULL to a pointer type like so:
printf("%p", (void *)NULL);
The same applies to any function that takes a NULL-terminated list of arguments, like execl. Yes, that means all of the code out there which looks like this is, strictly speaking, wrong:
[NSArray arrayWithObjects:a, b, c, nil];
Fortunately, compilers such as clang have a workaround for this. They define NULL to be a magic symbol which has either pointer or integer type depending on the context in which it's used, so the correct pointer value is passed into the function.
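For code that should not depend on that workaround, the strictly conformant pattern is to cast the sentinel. The sketch below uses a hypothetical variadic function, count_strings, that walks its arguments until it reaches a null pointer.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical variadic function: counts the strings passed in,
   stopping at the first null pointer. */
int count_strings(const char *first, ...)
{
    va_list args;
    int count = 0;

    va_start(args, first);
    /* Each va_arg call reads one pointer-sized argument, so the caller
       must really have passed a pointer as the terminator. */
    for (const char *s = first; s != NULL; s = va_arg(args, const char *))
        count++;
    va_end(args);
    return count;
}
```

A strictly conformant call looks like count_strings("a", "b", "c", (const char *)NULL). The cast guarantees that a pointer, and not an int, is what va_arg finds at the end of the list.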
Always Constant Format Strings
I see far too much code which passes a string variable, say someString, directly as the format string to a function like NSLog. What if someString contains the character sequence %@, or another format specifier? Then you probably crash.
It gets worse. What if you do this with printf or similar instead, and someString comes from a source outside your control, like off the internet? Then horrible things can occur.
One of the format specifiers supported by printf (but not Cocoa) is the %n specifier. This is very different from the other specifiers, in that it actually gives you a value back instead of taking one from you. It wants an int * argument, and will write the number of characters written so far into that argument. For example:
printf("%d%n%d", a, &howmany, b);
After this call, howmany will contain the width of the first integer being printed.
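The behavior of %n can be checked with a small sketch like the one below. The helper name first_field_width is hypothetical. Note that some C libraries restrict or disable %n for security reasons, so this assumes an implementation that still honors it when the format string is a constant.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: formats a and b back to back and reports how
   many characters a occupied. Assumes the C library honors %n. */
int first_field_width(char *out, size_t size, int a, int b)
{
    int howmany = 0;

    assert(out != NULL);
    /* %n consumes an int * and stores the count of characters
       produced so far; it prints nothing itself. */
    snprintf(out, size, "%d%n%d", a, &howmany, b);
    return howmany;
}
```

With a = 123 and b = 45, the buffer should hold "12345" and the function should return 3.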
If an attacker has control over the format string, then they can use the %n specifier to write an arbitrary value to a location in memory! This can then be used to take over your program. This attack is not theoretical.
In general, you should not pass anything other than a constant string as a format string. Every so often it is useful to build a format string dynamically, but before you do, think hard about whether you can accomplish your goal without it. If you do go ahead, take extra care to ensure that your string will always be valid.
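The usual way to follow this advice is to keep the format constant and pass the variable text as an argument to %s. A minimal sketch, with the hypothetical helper name format_untrusted:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: formats text that may come from an untrusted source. */
int format_untrusted(char *out, size_t size, const char *untrusted)
{
    assert(out != NULL && untrusted != NULL);
    /* The format string is a constant. The untrusted text is only ever
       data for %s, so any '%' characters in it are copied verbatim
       and never interpreted as specifiers. */
    return snprintf(out, size, "%s", untrusted);
}
```

Passing the input "100%n %s" through this helper should simply copy those eight characters into the buffer. The same input used directly as a format string would make the function read arguments that were never passed.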
Random Access Arguments
Typical format string usage is straight through from start to finish. The first specifier uses the first argument, the second specifier uses the second argument, etc. However, this is not mandatory! You can actually have any specifier use any argument. This is done by adding n$ to the format specifier, where n is the number of the argument to print. Arguments count from 1. For example, this prints the two arguments in reverse order:
printf("a = %2$d b = %1$d", b, a);
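You can also use the same argument more than once. POSIX does not allow mixing numbered and unnumbered specifiers in one format string, so number every specifier:
printf("%1$s could not be accessed, error %2$d. Try rebooting %1$s.", name, err);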
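Be careful about skipping arguments, as in the following. POSIX requires that when argument N is referenced, every argument before it is referenced as well, so this is not portable:
printf("a = %2$d", b, a);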
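A small sketch of numbered arguments that can be checked end to end is shown here. The helper name format_reversed is hypothetical. The n$ syntax is a POSIX extension and not part of ISO C, so this assumes a POSIX-style C library.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: the values are passed in the order (b, a) but
   printed as a first and b second. Assumes POSIX n$ support. */
int format_reversed(char *out, size_t size, int a, int b)
{
    assert(out != NULL);
    /* %2$d picks the second variable argument (a) and %1$d the
       first (b). Every specifier is numbered and none is skipped. */
    return snprintf(out, size, "a = %2$d b = %1$d", b, a);
}
```

Calling format_reversed(out, sizeof out, 1, 2) should produce "a = 1 b = 2" even though b comes first in the argument list.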
That wraps up this week's Friday Q&A. There's a lot more to what format strings can do than what I discussed today. Read the man page and take a look at how you can control precision, padding, output formats, and more.
Friday Q&A will be going on hiatus for at least one week and probably two due to various things which are going to keep me busy in that time.
In the meantime, keep those suggestions coming in. The more topics I have to choose from, the better topics you'll be able to read, so send them in!