Thursday, 29 September 2016

A Modest Alternative II: Taming stdio's scanf

The Ugly Brother

My favourite Irish joke (and I'm Irish enough to tell it) concerns a posh gent who is lost in Dublin. He asks a bystander, "Tell me my good man, how can I get to the National Museum?". Who replies: "Well sir, I wouldn't go from here". This is the received opinion about scanf, the ugly brother of printf.

It tends to appear only in the kind of beginner tutorals where programs ask the user to provide two numbers and then add them up. Serious people avoid scanf because it's hard to use properly: easy to confuse, and tricky to get meaningful errors. Beginners read the tutorials and then ask questions about how to bullet-proof scanf and get told by serious people not to use it.

scanf's glamorous cousin std::cin is easier to use, because it matches against strongly-typed non-const references - reading a constant is a compile error. But error handling is not straightforward.

Besides, sometimes you don't have the resources nor the inclination to use iostreams, as I discussed in my last article

Line by Line

A common strategy is to read a text file, line by line. Here std::getline is superior to fgets; it allocates more buffer if needed and trims end-of-line characters.

The outstreams library provides a very similar interface:

string line;
stream::Reader in("myfile.txt");
while (in.getline(line)) {
   do_something(line);
}
if (! in) {
   stream::errs(in.error())('\n');
}
// if failed:
// --> No such file or directory

Always a good idea to check for errors; if the error state is set, then no further read operations will take place, so it's perfectly fine to check afterwards.

So far, very much like iostreams's istream interface, except you get a sensible error without having to call perror yourself (which belongs to stdio, the irony)

From now on I'll assume that you have broken down and said 'using namespace stream'. Here's a one-liner which populates a container:

vector<string> lines;
Reader in("myfile.txt");
in.getlines(lines);
// errors!?

The container can be anything which understands push_back - a std::list would work just as well here. There is an optional second argument to getlines which gives the maximum number of lines to grab. (It's not how these things are typically organized; you would pretend that Reader was a container - or write an adapter - and use std::copy and std::back_inserter. This is more general but significantly uglifies the common case.)

Then there is a readall method which grabs all of the file and puts it into a std::string. The weakness of the standard string - that it knows nothing about character encoding - becomes a strength when you treat it as sliceable, appendable bag of bytes.

Having captured the lines, those serious people will typically then use serious conversion functions like std::stoul etc to actually extract values and get errors.

A Comedy of Errors

Things become less than simple with both stdio and iostreams when reading items one by one. scanf returns the number of items successfully processed, or EOF if we run out of stream. If the number scanned is too big for the type, garbage results. (The man page does claim that it will set errno if a conversion resulted in a value out of range, but it appears to be lying.) So serious use involves lots of checking.

I've tried to tame these errors by putting a facade around the actual scanf calls, rather as outstreams wrapped printf:

double x;
string s;
short b;

StrReader is ("4.2 boo 42");
is (x) (s) (b);

You always should be aware of errors: files may be mistyped, mangled by the network, or maliciously altered.

is.set("x4.2 boo 42");
is (x) (x) (b);
errs(is.error())('\n');
// --> error reading double
// at 'x4.2'
is.set("x4.2 boo 344455555");
is (x) (x) (b);
errs(is.error())('\n');
// --> error converting int16 
// --> out of range
//  344455555
is.set("x4.2 boo x10");
is (x) (x) (b);
errs(is.error())('\n');
// --> error reading int64 at
// 'x10'

('int64' for reading a short? Because I cannot assume that the read value is in range.)

There's an even more thorough way to handle errors; there is an error struct which can be 'read' from a stream.

Reader::Error err;

if (! is (x) (x) (b) (err)) {
// err.errcode  EOF or errno
// err.msg as returned by error()
// err.pos position in file
}

// safe one-liner:
// capture the error state
// before reader object dies
if (Reader("tmp.txt")
    .getline(line)
    (err)
) {
   // cool
} else {
   // bummer! But at least
   // you know _where_
}

So, in summary so far, Reader (and its string-oriented cousin StrReader) provides a safer way to use stdio for input, with better error handling, without the baroque contortions of iostream errors.

Why no exceptions? It can be a matter of taste; it's often forbidden in embedded coding and it can be argued that paying attention to the error where it happens leads to better code. You are completely free to throw your own exception of course after checking for an error, but it then it won't be some generic 'file not found' exception which makes no sense several stack traces down.

Some useful tricks

It's possible to read a file of numbers directly:

outstreams$ cat numbers.txt
10 20 30 40
50 60 70 80
...
int i;
Reader in("numbers.txt");
while (in (i)) {
    outs(i);
}
outs(eol);
// -> 10 20 30 40 50 60 70 80

Of course, the read could be replaced by in >> i with istream and it would work in the same way, except that any errors would not give as much information.

What if we only wanted the first three numbers on each line?

in (i) (j) (k) ();
outs(i)(j)(k)(eol);
// -> 10 20 30    
in (i) (j) (k) ();
outs(i)(j)(k)(eol);
// -> 50 60 70

The no-argument overload of the call operator is equivalent to the skip method, which generally takes the number of lines to skip.

Binary files

The read method comes in two flavours. The first is given a buffer with its size, and returns the number of bytes actually read.

Reader self ("rx-read");
self.setpos(0,'$');
auto endp = self.getpos();
outs("length was")
    (endp)(eol);
// --> length was 42752
self.setpos(0,'^');
auto buff = 
    new uint8_t [endp];
auto res = 
    self.read(buff,endp);
outs("read")(res)
    ("bytes")(eol);
// --> read 42752 bytes

setpos is a little eccentric, but will make sense to people who use regular expressions - '^' means 'start of file' and '$' means 'end of file'; anything else means current position.

(The std::istream method of the same name and signature returns the stream, which is consistent, but awkward if you're interested in the bytes actually read.)

The other form of read is a template method that takes one argument and returns the stream. It's useful for objects that have fixed size at compile time, like structs and arrays

MyHeader st;
MyInfo info;
inf.read(st).read(info);
// error state will be EOF
// if we didn't read all

Commands

Executing and capturing the output of external problems is a pain point in C++ for me. So naturally I wanted to capture the output of that excellent old dog popen and make it easier to use.

CmdReader ls("ls *.cpp");
vector<string> cpp_files;
ls.getlines(cpp_files);

string uname =
    CmdReader("uname").line();
if (uname == "Linux") { 
    outs("that's a relief")(eol);
}

CmdReader derives from Reader - all it need do is override close_handle so that pclose is called instread of fclose. It has an extra convenience method line for grabbing first line of output. stdout and stdin are merged.

If you aren't particularly interested in the output and just success/failure, there are a few convenient patterns:

// quick cool/uncool check
string res = CmdReader(
    "true", 
    cmd_ok
    ).line();
if (res != "OK") {
    outs("very weird shell")
        (eol);
}

// actual return code
int retcode;
CmdReader("false",cmd_retcode)
    (retcode);

Wrapping Up

We have lost some scanf functionality by breaking up the format into little bits.

However, if you already have a means to split the input stream into parts, then Reader machinery can be re-used to do the actual conversion - see templ-read.cpp:

int n;
double x;
string s;
auto parts = {"42","5.2","hello"};
auto rdr = make_parts_reader(
    parts.begin(),
    parts.end()
);
rdr (n) (x) (s);

This pattern can be made to work with any source of strings, like regular expression matches, database queries, and so forth.

I tried to show that scanf can be tamed, and it then becomes more pleasant and reliable to use. This library is not a heavy dependency (instream and outstream do not depend on each other and are each a single source file) and provides a middle option when choosing between stdio and iostreams.

Highly structured input formats are best read with the correct tools - we do have JSON, CSV, config-file, etc parsers. I suspect the problem here is that there is no browsable and discoverable repository of useful C++ libraries like Cargo for Rust, etc.

Saturday, 9 April 2016

stdio or iostreams? - A Modest Alternative

An Awkward Choice

The standard C++ iostreams library is generally easier to use for output than stdio, if you don't particularly care about exact formatting output and have no resource constraints. No one can say that stdio is pretty.. Portable use of printf requires the use of ugly PRIu64 style macros from and noisy calls to c_str(). It's fundamentally weakly-typed and can only work with primitive types.

However, using iostreams gets ugly fast, with a host of manipulators and formatting specifiers, all tied together with <<. There are also some serious misfeatures - beginners learn to obsessively spell 'end of line' as std::endl but never learn that it also flushes the stream, until their program proves to be an utter dog at I/O. Generally standard C++ streams are slower and allocate far too much to be considered seriously for embedded work. In the immortal words of the readline(3) man page, under the Bugs section "It's too big and too slow".

Embedded systems which can benefit from the abstraction capabilities of C++ don't necessarily have space for the large sprawling monster that the standard iostreams has become. In this realm, printf (or a stripped-down variant) still rules.

It is true that we no longer depend on simple text i/o as much as the old-timers did, since many structured text output standards have emerged. We still lean on stdio/iostreams to generate strings, since string manipulation is still relatively poor compared to other languages. With some classes of problems, debug writes are heavily used. And all non-trivial applications need logging support.

The library I'm proposing - outstreams - is for people who want or need an alternative. It presents another style of interface, where overloading operator() leads to a fluent and efficient style for organizing output. It is still possible to use printf flags when more exact formatting is required. Building it on top of stdio gives us a solid well-tested base.

It is not, I must emphasize, a criticism of the standard C++ library. That would be the technical equivalent of complaining about the weather.[0]

Chaining with operator() for Fun and Profit

By default, the standard outstreams for stdout and stderr define space as the field separator which saves a lot of typing when making readable output with little fuss.

double x = 1.2;
string msg = "hello";
int i = 42;

outs (x) (msg) (i) ('\n');
// --> 1.2 hello 42
outs("my name is")(msg)('\n');
// --> my name is hello

Chaining calls like this has advantages - most code editors are aware of paired parentheses, and some actually fill in the closing bracket when you type '('. It's easy to add an extra argument to override the default formatting, in a much more constant-friendly way than with printf:

const char *myfmt = "%6.2f";
double y = 0;
outs(x,myfmt)(y,myfmt)
    (msg,quote_s)('\n');
// -->   1.20  0.00 'hello'

Another useful property of operator() is that standard algorithms understand callables:

vector<int> vi {10,20,30};
for_each(vi.begin(), vi.end(), outs);
outs('\n');
// -> 10 20 30

Containers

The for_each idiom is cute but outstreams provides something more robust where you can provide a custom format and/or a custom separator:

// signatures
//   Writer& operator() (It start, It finis,
//      const char *fmt=nullptr, char sepr=' ');
//  Writer& operator() (It start, It finis, 
//     char sepr);

outs(vi.begin(),vi.end(),',')('\n');
// --> 10,20,30

string s = "\xFE\xEE\xAA";
outs(s.begin(),s.end(),hex_u)('\n');
// --> FEEEAA

In the C++11 standard there is a marvelous class called std::intializer_list which is implicitly used in bracket initialization of containers. We overload it directly to support brace lists of objects of the same type; there is also an convenient overload for std::pair:

// signature: 
// Writer& operator() (
//    const std::initializer_list<T>& arr,
//    const char *fmt=nullptr, char sepr=' ');
outs({10,20,30})('\n');
// --> 10 20 30

// little burst of JSON
outs('{')({
    make_pair("hello",42),
    make_pair("dolly",99),
    make_pair("frodo",111)
},quote_d,',')('}')();
// --> { "hello":42,"dolly":99,"frodo":111 }

// which will also work when iterating
// over `std::map`

This works because the quote format only applies to strings - anything else ignores it.

Writing to Files and Strings: Handling Errors

The Writer class can open and manage a file stream:

Writer("tmp.txt")(msg)('\n');
// -> tmp.txt: hello\n

C++ manages the closing of the file automatically, as with the iostreams equivalent.

Of course, there's no general guarantee that 'tmp.txt' can be opened for writing: outstreams borrow a trick from iostreams and convert to a bool:

Writer wr("tmp.txt");
if (wr) {
   wr(msg)('\n');
} else {
   errs(wr.error())('\n');
}

It's straightforward to build up strings using this style of output. (Mucking around with sprintf can be awkward and error-prone. Clue: always spell it snprintf) Here is the ever useful generalized string concatenation pattern:

StrWriter sw;
int arr[] {10,20,30};
sw(arr,arr+3,',');
string vis = sw.str();
// -> vis == "10,20,30"

The Problem with "Hello World"

It is perfectly possible to construct a generic 'print' function in modern C++. With variadic templates it can be done elegantly in a type-safe way. The definition is recursive; printing out n items is defined as printing the first value, then printing the last n-1 values; printing 1 value uses outs.

template <typename T>
Writer& print(T v) {
   return outs(v);
}

template <typename T, typename... Args>
Writer& print(T first, Args... args) {
   print(first);
   return print(args...);
}

...
int answer = 42;
string who = "world";

print("hello",who,"answer",answer,'\n');
// -> hello world answer 42

This is cute, but although the implementation shows the flexibility of modern C++, the result shows the limitations of print as a utility. Fine for a quick display of values, but what if you need an integer in hex, or a floating-point number to a particular precision, and so forth? Where print exists in languages - Python 3, Java and Lua have this style - there is this problem of formating. One approach is to define little helper functions and generally lean on string handling; for instance it's easy with StrWriter to define to_str:

template <typename T>
string to_str(T v, 
    const char *fmt=nullptr)
{
   StrWriter sw;
   sw(v,fmt);
   return sw.str();
}
...
print
    ("answer in hex is 0x" + to_str(answer,"X"))
    ('\n');
--> answer in hex is 0x2A

You see this kind of code quite frequently with Java, and it sucks for high performance logging because of the cost of creating and disposing all the temporary strings. Java has since acquired an actual printf equivalent (probably provoked by its competitive younger sister C#) and both Python and Lua programmers use some kind of format to make nice strings. Not to say that to_str isn't a useful function - it's more flexible than std::to_string - but it will have a cost that you might not always want to pay.

Another approach is to create little wrappers, like a Hex class and so forth. So you get code like this: print("answer in hex",Hex(answer))();. The namespace becomes cluttered with these classes, like how std is full of things like hex,dec and so forth. A compromise is to add just another function which wraps a value and a format. This isn't bad for performance. since the wrapper type just transfers references to the values; you can see for yourself in print.cpp.

The other approach is the one taken by iostreams - define some special values which control the formatting of the next value, and so forth[1]. It can be done, but it's messy and makes the concept rather less appealing. It's a nice example of the Hello World Fallacy [2]where the easy stuff is attractively easy and the hard stuff is unnecessarily hard. And I maintain that print and iostreams fall into exactly that space.

This implementation of print does have the nice properly that it's easy to overload for new types, which is not the case for outstreams.

Quick Output And Logging

Use of the preprocessor in modern C++ is considered generally a Bad Thing, for good reasons. Macros stomp on everything, without respect for scope, and C++ provides alternative mechanisms for nearly everything (inlining, constant definition, etc) that C relies on the preprocessor to provide. But it isn't an absolute evil, if macros are always written as UPPER_CASE and so clearly distinct from scoped variables.

Here's a case where developer convenience outweighs ideological purity; dumping out variables. Sometimes symbolic debuggers are too intrusive or simply not available. It's a nice example of old macro magic combined with new operator overloading.

#define VA(var) (#var)(var,quote_d)
#define VX(var) (#var)(var,hex_u)
...
string full_name;
uint64_t  id_number;
char ch = ' ';
...
outs VA(full_name) VA(id_number)
    VX(ch) ('\n');
// --> full_name "bonzo the dog" id_number 666 ch 20

Here is a trick which allows you to completely switch off tracing, with little overhead. (Many loggers will suppress output if the level is too low, but any expressions will still be evaluated.)

// if the FILE* is NULL, then a Writer 
// instance converts to false
// can say logs.set(nullptr) to
// switch off tracing
#define TRACE if (logs) logs ("TRACE")
...
TRACE VA(obj.expensive_method()) ('\n');

I mentioned that logging was something that all serious programs need. It's tempting to write your own logger, but it's tricky to get right and this wheel has been invented before.

We use log4cpp where I work, but only its mother would consider it to be elegant and well-documented:

plogger->log(log4cpp::Priority::DEBUG,
    "answer is %d",42);

It does have an iostreams-like alternative interface but it's a bit clumsy and half-baked. However, it is very configurable and handles all the details of creating log files, rolling them over, writing them to a remote syslog, and so forth.

It is easy to wrap this in a Writer-derived class. In fact, it's easier to derive from StrWriter and override put_eoln, which is used by the 'end of line' empty operator() overload. Normally it just uses write_char to put out '\n', but here we use it to actually construct the call to log4cpp:

using PriorityLevel
    = log4cpp::Priority::PriorityLevel;
class LogWriter: public StrWriter {
    PriorityLevel level;
public:
    LogWriter(PriorityLevel level)
        : StrWriter(' '),level(level) {}

    virtual void put_eoln() {
        plogger->log(level,
            "%s",str().c_str());
        clear();
    }
};

By just exposing references to error, warn etc to Writer, the rest of your program does not have any dependencies on the log4cpp headers - just in case you do want to drop in your own replacement that directly uses syslog. Look at testlog.cpp and how it uses logger.cpp to encapsulate the details of logging.

Thereafter, use as before:

error("hello")(42)('\n');
warn("this is a warning")('\n');
debug("won't appear with defaults)('\n');
// -->
// 2016-04-10 09:32:11,552 [ERROR] hello 42
// 2016-04-10 09:32:11,553 [WARN] this is a warning

Costs and Limitations

In a simple test (writing a million records to a file with five comma-separated fields) outstreams seems to be about 10% more expensive, as we can expect from needing more calls to stdio. (Equivalent test for iostreams shows it seriously lagging behind). The library itself is small, so if your system has vfprintf (or equivalent) then it's a easy dependency. If the macro OLD_STD_CPP is defined it compiles fine for C++-03, without support for initializer lists.

There is a fundamental problem with operator() here - it can only be defined as a method of a type, unlike operator<<. So adding your own types requires overriding a special function and using a template version of operator() with a additional first const char* argument to avoid the inevitable Template Overloading Blues.

As a more traditional alternative, if a type implements the Writeable interface, then things work cleanly. (In any case, how a class wants to write itself out is its own private business.) Writeable provides a handy to_string using the implementation of the overriden write_to method.

Some may find this notation too compressed - << is nice & spacy by comparison. It is certainly less noisy than stdio, since format strings are optional. Sticky separators can be annoying (controlling them properly was probably the most tricky bit of the implementation) but for most applications, they seem appropriate - they can always be turned off.

UPDATE

Some did find the notation too compressed - in particular () seems to vanish. So now ('\n') is completely equivalent to ().

The cppformat also looks like a good alternative. It resolves the specification problem with variadic print by allowing for a format, either traditional or Python-style.

Some commenters felt that operator-chaining was completely old-hat and that variadic templates were obviously superior, which is a matter of taste.

Internationalization represents a problem, since the word order may change in translation - Python-style does solve this issue.

[0] Although that doesn't mean we have to like all of it. And using it is not compulsory.

[1] But don't make them sticky like with iostreams - there are Stackoverflow questions about how to print out in hex and then how to stop printing out in hex.

[2] This fallacy is an independent rediscovery of the exact phrase from an earlier article, but the feeling on Reddit was that the first guy's website was ugly and hence inherently inferior.

stdio or iostreams? - A Modest Alternative

An Awkward Choice

The standard C++ iostreams library is generally easier to use for output than stdio, if you don't particularly care about exact formatting output and have no resource constraints. No one can say that stdio is pretty.. Portable use of printf requires the use of ugly PRIu64 style macros from and noisy calls to c_str(). It's fundamentally weakly-typed and can only work with primitive types.

However, using iostreams gets ugly fast, with a host of manipulators and formatting specifiers, all tied together with <<. There are also some serious misfeatures - beginners learn to obsessively spell 'end of line' as std::endl but never learn that it also flushes the stream, until their program proves to be an utter dog at I/O. Generally standard C++ streams are slower and allocate far too much to be considered seriously for embedded work. In the immortal words of the readline(3) man page, under the Bugs section "It's too big and too slow".

Embedded systems which can benefit from the abstraction capabilities of C++ don't necessarily have space for the large sprawling monster that the standard iostreams has become. In this realm, printf (or a stripped-down variant) still rules.

It is true that we no longer depend on simple text i/o as much as the old-timers did, since many structured text output standards have emerged. We still lean on stdio/iostreams to generate strings, since string manipulation is still relatively poor compared to other languages. With some classes of problems, debug writes are heavily used. And all non-trivial applications need logging support.

The library I'm proposing - outstreams - is for people who want or need an alternative. It presents another style of interface, where overloading operator() leads to a fluent and efficient style for organizing output. It is still possible to use printf flags when more exact formatting is required. Building it on top of stdio gives us a solid well-tested base.

It is not, I must emphasize, a criticism of the standard C++ library. That would be the technical equivalent of complaining about the weather.[0]

Chaining with operator() for Fun and Profit

By default, the standard outstreams for stdout and stderr define space as the field separator which saves a lot of typing when making readable output with little fuss.

double x = 1.2;
string msg = "hello";
int i = 42;

outs (x) (msg) (i) ('\n');
// --> 1.2 hello 42
outs("my name is")(msg)('\n');
// --> my name is hello

Chaining calls like this has advantages - most code editors are aware of paired parentheses, and some actually fill in the closing bracket when you type '('. It's easy to add an extra argument to override the default formatting, in a much more constant-friendly way than with printf:

const char *myfmt = "%6.2f";
double y = 0;
outs(x,myfmt)(y,myfmt)
    (msg,quote_s)('\n');
// -->   1.20  0.00 'hello'

Another useful property of operator() is that standard algorithms understand callables:

vector<int> vi {10,20,30};
for_each(vi.begin(), vi.end(), outs);
outs('\n');
// -> 10 20 30

Containers

The for_each idiom is cute but outstreams provides something more robust where you can provide a custom format and/or a custom separator:

// signatures
//   Writer& operator() (It start, It finis,
//      const char *fmt=nullptr, char sepr=' ');
//  Writer& operator() (It start, It finis, 
//     char sepr);

outs(vi.begin(),vi.end(),',')('\n');
// --> 10,20,30

string s = "\xFE\xEE\xAA";
outs(s.begin(),s.end(),hex_u)('\n');
// --> FEEEAA

In the C++11 standard there is a marvelous class called std::intializer_list which is implicitly used in bracket initialization of containers. We overload it directly to support brace lists of objects of the same type; there is also an convenient overload for std::pair:

// signature: 
// Writer& operator() (
//    const std::initializer_list<T>& arr,
//    const char *fmt=nullptr, char sepr=' ');
outs({10,20,30})('\n');
// --> 10 20 30

// little burst of JSON
outs('{')({
    make_pair("hello",42),
    make_pair("dolly",99),
    make_pair("frodo",111)
},quote_d,',')('}')();
// --> { "hello":42,"dolly":99,"frodo":111 }

// which will also work when iterating
// over `std::map`

This works because the quote format only applies to strings - anything else ignores it.

Writing to Files and Strings: Handling Errors

The Writer class can open and manage a file stream:

Writer("tmp.txt")(msg)('\n');
// -> tmp.txt: hello\n

C++ manages the closing of the file automatically, as with the iostreams equivalent.

Of course, there's no general guarantee that 'tmp.txt' can be opened for writing: outstreams borrow a trick from iostreams and convert to a bool:

Writer wr("tmp.txt");
if (wr) {
   wr(msg)('\n');
} else {
   errs(wr.error())('\n');
}

It's straightforward to build up strings using this style of output. (Mucking around with sprintf can be awkward and error-prone. Clue: always spell it snprintf) Here is the ever useful generalized string concatenation pattern:

StrWriter sw;
int arr[] {10,20,30};
sw(arr,arr+3,',');
string vis = sw.str();
// -> vis == "10,20,30"

The Problem with "Hello World"

It is perfectly possible to construct a generic 'print' function in modern C++. With variadic templates it can be done elegantly in a type-safe way. The definition is recursive; printing out n items is defined as printing the first value, then printing the last n-1 values; printing 1 value uses outs.

template <typename T>
Writer& print(T v) {
   return outs(v);
}

template <typename T, typename... Args>
Writer& print(T first, Args... args) {
   print(first);
   return print(args...);
}

...
int answer = 42;
string who = "world";

print("hello",who,"answer",answer,'\n');
// -> hello world answer 42

This is cute, but although the implementation shows the flexibility of modern C++, the result shows the limitations of print as a utility. Fine for a quick display of values, but what if you need an integer in hex, or a floating-point number to a particular precision, and so forth? Where print exists in languages - Python 3, Java and Lua have this style - there is this problem of formating. One approach is to define little helper functions and generally lean on string handling; for instance it's easy with StrWriter to define to_str:

template <typename T>
string to_str(T v, 
    const char *fmt=nullptr)
{
   StrWriter sw;
   sw(v,fmt);
   return sw.str();
}
...
print
    ("answer in hex is 0x" + to_str(answer,"X"))
    ('\n');
--> answer in hex is 0x2A

You see this kind of code quite frequently with Java, and it sucks for high performance logging because of the cost of creating and disposing all the temporary strings. Java has since acquired an actual printf equivalent (probably provoked by its competitive younger sister C#) and both Python and Lua programmers use some kind of format to make nice strings. Not to say that to_str isn't a useful function - it's more flexible than std::to_string - but it will have a cost that you might not always want to pay.

Another approach is to create little wrappers, like a Hex class and so forth. So you get code like this: print("answer in hex",Hex(answer))();. The namespace becomes cluttered with these classes, like how std is full of things like hex,dec and so forth. A compromise is to add just another function which wraps a value and a format. This isn't bad for performance. since the wrapper type just transfers references to the values; you can see for yourself in print.cpp.

The other approach is the one taken by iostreams - define some special values which control the formatting of the next value, and so forth[1]. It can be done, but it's messy and makes the concept rather less appealing. It's a nice example of the Hello World Fallacy [2]where the easy stuff is attractively easy and the hard stuff is unnecessarily hard. And I maintain that print and iostreams fall into exactly that space.

This implementation of print does have the nice properly that it's easy to overload for new types, which is not the case for outstreams.

Quick Output And Logging

Use of the preprocessor in modern C++ is considered generally a Bad Thing, for good reasons. Macros stomp on everything, without respect for scope, and C++ provides alternative mechanisms for nearly everything (inlining, constant definition, etc) that C relies on the preprocessor to provide. But it isn't an absolute evil, if macros are always written as UPPER_CASE and so clearly distinct from scoped variables.

Here's a case where developer convenience outweighs ideological purity; dumping out variables. Sometimes symbolic debuggers are too intrusive or simply not available. It's a nice example of old macro magic combined with new operator overloading.

#define VA(var) (#var)(var,quote_d)
#define VX(var) (#var)(var,hex_u)
...
string full_name;
uint64_t  id_number;
char ch = ' ';
...
outs VA(full_name) VA(id_number)
    VX(ch) ('\n');
// --> full_name "bonzo the dog" id_number 666 ch 20

Here is a trick which allows you to completely switch off tracing, with little overhead. (Many loggers will suppress output if the level is too low, but any expressions will still be evaluated.)

// if the FILE* is NULL, then a Writer 
// instance converts to false
// can say logs.set(nullptr) to
// switch off tracing
#define TRACE if (logs) logs ("TRACE")
...
TRACE VA(obj.expensive_method()) ('\n');

I mentioned that logging was something that all serious programs need. It's tempting to write your own logger, but it's tricky to get right and this wheel has been invented before.

We use log4cpp where I work, but only its mother would consider it to be elegant and well-documented:

plogger->log(log4cpp::Priority::DEBUG,
    "answer is %d",42);

It does have an iostreams-like alternative interface but it's a bit clumsy and half-baked. However, it is very configurable and handles all the details of creating log files, rolling them over, writing them to a remote syslog, and so forth.

It is easy to wrap this in a Writer-derived class. In fact, it's easier to derive from StrWriter and override put_eoln, which is used by the 'end of line' empty operator() overload. Normally it just uses write_char to put out '\n', but here we use it to actually construct the call to log4cpp:

using PriorityLevel
    = log4cpp::Priority::PriorityLevel;
class LogWriter: public StrWriter {
    PriorityLevel level;
public:
    LogWriter(PriorityLevel level)
        : StrWriter(' '),level(level) {}

    virtual void put_eoln() {
        plogger->log(level,
            "%s",str().c_str());
        clear();
    }
};

By just exposing references to error, warn etc to Writer, the rest of your program does not have any dependencies on the log4cpp headers - just in case you do want to drop in your own replacement that directly uses syslog. Look at testlog.cpp and how it uses logger.cpp to encapsulate the details of logging.

Thereafter, use as before:

error("hello")(42)('\n');
warn("this is a warning")('\n');
debug("won't appear with defaults)('\n');
// -->
// 2016-04-10 09:32:11,552 [ERROR] hello 42
// 2016-04-10 09:32:11,553 [WARN] this is a warning

Costs and Limitations

In a simple test (writing a million records to a file with five comma-separated fields) outstreams seems to be about 10% more expensive, as we can expect from needing more calls to stdio. (Equivalent test for iostreams shows it seriously lagging behind). The library itself is small, so if your system has vfprintf (or equivalent) then it's a easy dependency. If the macro OLD_STD_CPP is defined it compiles fine for C++-03, without support for initializer lists.

There is a fundamental problem with operator() here - it can only be defined as a method of a type, unlike operator<<. So adding your own types requires overriding a special function and using a template version of operator() with a additional first const char* argument to avoid the inevitable Template Overloading Blues.

As a more traditional alternative, if a type implements the Writeable interface, then things work cleanly. (In any case, how a class wants to write itself out is its own private business.) Writeable provides a handy to_string using the implementation of the overriden write_to method.

Some may find this notation too compressed - << is nice & spacy by comparison. It is certainly less noisy than stdio, since format strings are optional. Sticky separators can be annoying (controlling them properly was probably the most tricky bit of the implementation) but for most applications, they seem appropriate - they can always be turned off.

UPDATE

Some did find the notation too compressed - in particular () seems to vanish. So now ('\n') is completely equivalent to ().

The cppformat also looks like a good alternative. It resolves the specification problem with variadic print by allowing for a format, either traditional or Python-style.

Some commenters felt that operator-chaining was completely old-hat and that variadic templates were obviously superior, which is a matter of taste.

Internationalization represents a problem, since the word order may change in translation - Python-style does solve this issue.

[0] Although that doesn't mean we have to like all of it. And using it is not compulsory.

[1] But don't make them sticky like with iostreams - there are Stackoverflow questions about how to print out in hex and then how to stop printing out in hex.

[2] This fallacy is an independent rediscovery of the exact phrase from an earlier article, but the feeling on Reddit was that the first guy's website was ugly and hence inherently inferior.