Developing Talk Ideas

As a conference organizer, I’m always in speaker recruitment mode. Engaging presentations are the lifeblood of conferences and stimulate the most important part of a conference: the attendee discussion outside of scheduled sessions.

I don’t know what to talk about.

One comment that I hear from potential speakers is that they don’t know what to talk about. Sometimes this comes from an engineer who has never done a conference talk and sometimes from a speaker who has presented a few times, covering all of the obvious (to them) topics.

If this situation applies to you (you’d like to present, but feel at a loss for a topic), here is my advice:

watch videos

Start watching videos of other presenters. You know where to find them. Here is the C++Now video channel, the Meeting C++ channel, the CppCon channel, the Pacific++ channel, and, well, all of the C++ videos on YouTube.

But don’t just passively watch them. Take notes as you watch. Make a note about:

  • everything that you find interesting and want to go investigate.

If a talk touches on a feature or idea that is new to you or is used in a new way, you may want to play with it to learn more about it and see how it works in your domain.

You might come up with something that no one has ever done before. Even if what you find isn’t completely new, if it was new to you, it is likely to be new to many in your audience.

  • everything that you think that you could explain better or provide better examples or use cases.

This is where presentation skills are so important. I can’t tell you how many times I’ve seen a talk after which, if you’d asked me, “What new fact did you learn?” I might have been hard pressed to say what was new, but I felt like I understood the topic better because of how well it was presented.

Saying the same thing that others have said, but better, is worth doing. C++ is complicated enough that, for most topics, most of us won’t absorb every detail on the first presentation.

Just repeating someone else’s talk with better examples isn’t going to be very compelling, but offering a better way of thinking about a topic and combining topics to provide better comparison or contrast, yields value to your audience.

  • everything that you disagree with.

If you feel like you have a better solution or approach than what is currently being presented as conventional wisdom, then you’ll be passionate about your presentation and passion can be the difference between a good presentation and a great one. Perhaps you have a lightning talk duel in your future.

Interested is interesting.

Look at your notes. Do you have a list of ideas that sound interesting to you? If you are interested, your audience probably is as well. Interested is interesting.

Are you seeing patterns? Perhaps you are developing a fresh perspective. If it is a point of view that makes more sense to you, it is likely that your audience will appreciate your insights.

Do you have something you want to rant about? I think I’ve got five minutes in our program schedule for you.

the best way to develop C++ presentation ideas is to be engaged with the C++ community.

Hate videos? You can use this approach without watching a single video. Instead, you can listen to CppCast or read any of the many C++ blogs. Because the insight that I’m sharing is that the best way to develop presentation ideas is to be engaged with the community. The ideas that are exciting the rest of the community are going to spark something in you as well.

Good luck and I’ll see you on stage!

A Foolish Consistency

The Hobgoblin of Little Minds

Ralph Waldo Emerson famously said, “A foolish consistency is the hobgoblin of little minds, adored by little statesmen and philosophers and divines.” I don’t think he was talking about code, but that statement couldn’t be more relevant to software engineers.

I’ve experienced a scenario like this a number of times in my career:

I’m sharing a new approach to writing code that offers some clear improvements to what we’ve been doing. Perhaps it is more readable, more efficient, or safer. But the response that I hear from colleagues is, “But we can’t do that here. We have <some large number> lines of code where we didn’t do it that way, so it wouldn’t be consistent.”

This is Why We Can’t Have Nice Things

I know what you’re thinking:

That can’t be right. No one would say “No” to a better way of doing something simply because they can’t improve their entire codebase all at once.

In a sense you are right. I really think that most of the time I’ve heard this response, the individual was really thinking:

I’m uncomfortable with change in general and this change in particular, but I don’t want to admit that I don’t like/understand this just because it wasn’t invented by me. If I try to argue against this change on its merits, I’ll reveal that I don’t really understand the issues and I take the chance that I won’t prevail on the merits, so I’ll play the “consistency” trump card to shut down discussion now.

A Wise Consistency

Don’t get me wrong, I’m all for consistency. It was clearly a mistake that the member function to delete all elements of a specific value from a container was called erase for all containers except list, where it was called remove. (C++20 added the non-member std::erase and std::erase_if, which give list the erase spelling while leaving remove in place.)

Accidental inconsistencies are, of course, foolish and unproductive. Leave gratuitous inconsistencies for creative writing. In technical work, figure out the best way to express something and always express it the same way.

Until you learn a better way. Then embrace the improvement. This inconsistency is called growth.

Improvement is Change

The STL was a monumental intellectual achievement, but one flaw was that member predicates were improperly named. Consider the empty container member function. Is it a command to empty the container or a query about the container’s state? You know, but that is because you are already familiar with the library. Naive users must experience the confusion between empty, which is a query, and clear, which is a command. There is no a priori way of knowing which is which. They could just as easily be the other way around.

The flaw would have been avoided if the policy that predicates should start with is_ had been followed. The function that is named empty should have been named is_empty and the function that is named clear should have been named empty. (It isn’t obvious that clear means “empty the container”; it could just as easily mean “set all elements to the default-constructed state.”)

To be consistent, no predicate in the standard should start with is_. Fortunately the committee elected to be inconsistent when naming the predicates in <type_traits>. Improvements from previous practice are, by their very nature, inconsistent with that practice.
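To make the query/command distinction concrete, here is a minimal sketch. The free functions is_empty and make_empty are hypothetical names invented for this illustration; they are not part of the standard library. The static_asserts show the is_ convention that <type_traits> already follows.

```cpp
#include <type_traits>
#include <vector>

// Hypothetical wrapper: the name reads as a question, so it is a query.
template <class Container>
bool is_empty(Container const& c)
{
    return c.empty();   // the standard spelling of the query
}

// Hypothetical wrapper: the name reads as an action, so it is a command.
template <class Container>
void make_empty(Container& c)
{
    c.clear();          // the standard spelling of the command
}

// <type_traits> predicates start with is_.
static_assert(std::is_integral<int>::value, "int is an integral type");
static_assert(!std::is_pointer<int>::value, "int is not a pointer");
```

With these names a call site such as is_empty(v) cannot be misread as a request to modify v.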

Being consistent means never improving. This is acceptable for people who think that their coding practice is perfect, but not for people that crave continuous improvement.

Street Signs

Imagine driving through an unfamiliar city that had mismatched street signs at every corner. It would be hard to navigate because it would be hard to pick out the street signs amid the visual noise of all the business signage and billboards. That kind of inconsistency could lead to traffic snarls and accidents and should be avoided.

But suppose a city with uniform street signs adopts a new, more readable standard for signs. The city might not have the budget to replace all the signs overnight. It might start replacing the signs on the busiest streets and using the new signs on newly constructed streets. Over time, less traveled streets would get new signs, but perhaps some residential streets, where traffic is slow moving and there is no competing signage, might never be updated.

How hard would it be to navigate a city that had two signage styles? This isn’t the challenge of mismatched signs at every corner; it is simply dealing with the change that is inevitable with improvement.

East const

I’ve not written a post in… well, I don’t want to know… so what got me motivated to vent about hobgoblins and little minds?

Some background: Many years ago I read an article by Dan Saks which recommended placing const after what it modifies. I found the article persuasive. Notice that the rule for const placement is:

const modifies what is on its left. Unless there is nothing on its left, in which case it modifies what’s on its right.

If you consistently place const after what it modifies, the rule becomes much simpler:

const modifies what is on its left.

(I love simple, consistent rules.)

It also makes declarations (which are read inside-out and right to left) easier to read. This isn’t the place to list all the arguments that Dan shared, but they convinced me. For years, I’ve followed this practice. I knew that I was in a stylistic minority, but I thought it was a better way to write code so that is what I did.
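A small sketch of the “const modifies what is on its left” rule, checked by the compiler. The alias names pointer_to_const and const_pointer are made up for this example.

```cpp
#include <type_traits>

// East const and West const spell the same type.
static_assert(std::is_same<int const, const int>::value, "same type");

// Read right to left: "pointer to const char".
using pointer_to_const = char const*;

// Read right to left: "const pointer to char".
using const_pointer = char* const;

// In char const*, const sits to the right of char, so the char is const...
static_assert(
    std::is_const<std::remove_pointer<pointer_to_const>::type>::value, "");
// ...and the pointer itself is not.
static_assert(!std::is_const<pointer_to_const>::value, "");

// In char* const, const sits to the right of *, so the pointer is const...
static_assert(std::is_const<const_pointer>::value, "");
// ...and the char it points to is not.
static_assert(
    !std::is_const<std::remove_pointer<const_pointer>::type>::value, "");
```

Every static_assert here applies the single rule; none needs the “unless there is nothing on its left” exception.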

Recently, I’ve noticed that more and more C++ programmers are starting to adopt this style. In fact, a few months ago, I learned that it had a catchy name: East const.

What prompted this posting was a tweet with a link to this core guideline: NL.26: Use conventional const notation. You can read it yourself, but it essentially says, don’t use East const. The guideline concedes that East const is more logical, but since it is less common, it is forbidden.

As I said, I’ve known for years that I was in the minority and I’m okay with it if you say:

This is how we wrote code back in the Eighties and we liked it!

But to condemn a more logical approach because of inconsistency? Well, you know what Ralph Waldo Emerson said about it.

It really gets me that after invoking consistency, the guidelines’ authors hide behind novices. They are in effect saying that we need to compromise the quality of our code instead of improving our training materials.

What puzzles me is the enforcement:

Flag const used as a suffix for a type.

How is one supposed to declare an immutable pointer that references a mutable int? I might do it like this:

int * const p;

But the type of p ends in const. Perhaps there is another way to declare it that doesn’t have a const suffix. If so you won’t see it in this core guideline because one of its “OK” examples also ends in const. Or perhaps the enforcement engine will just have a complicated rule.
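For what it is worth, one spelling that avoids a trailing const is to name the pointer type first. This is only a sketch: the alias int_ptr and the function write_through are hypothetical, nothing here comes from the guideline, and I can’t say how an enforcement tool would treat it.

```cpp
#include <type_traits>

using int_ptr = int*;   // hypothetical alias, for illustration only

// Applied to an alias, a leading const modifies the pointer, not the int.
static_assert(std::is_same<const int_ptr, int* const>::value,
              "const int_ptr is an immutable pointer to a mutable int");

// Writes through an immutable pointer and returns the stored value.
inline int write_through(int& target, int value)
{
    const int_ptr p = &target;  // same type as: int* const p = &target;
    *p = value;                 // fine: the int is mutable
    // p = nullptr;             // would not compile: the pointer is const
    return target;
}
```

Note that this only works by hiding the * inside an alias, which arguably makes the declaration harder to read, not easier.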

I don’t like complicated rules, but some people seem to like them for const.

If you’d like to share your thoughts, please comment on the reddit post. (I’m well aware that this is a rant, but please keep comments constructive and professional. Thanks.)

My take at times

This article is inspired by Dmitry Ledentsov’s blog post: C++ version of ruby’s Integer::times via user-defined literals.

In this post Dmitry (aka DLed) is inspired to replicate Ruby’s syntax for repeating actions. In Ruby we can write:

    42.times do ...

Where … is some bit of code to be repeated, in this case, forty-two times. Dmitry’s blog shows off the Modern C++ feature of user-defined literals.

But I’m a bit old school on this and I don’t care for that syntax. But the post did inspire me to play around with how I’d implement this with a more traditional syntax. Along the way we’ll touch on decltype, template function overloading, SFINAE, variadic template parameters, assert(), type functions, perfect forwarding, and, of course, trying to write our code as generally as possible.

First cut

So the first cut is pretty straightforward. We want to write this:

    int main()
    {
        times(3, []{std::cout << "bla\n";});
    }

So we need this:

    #include <iostream>

    template <class Callable>
    void times(int n, Callable f)
    {
        for (int i{0}; i != n; ++i)
        {
            f();
        }
    }

We could avoid the template by writing the function like this:

    void times(int n, void(*f)())
    {
        for (int i{0}; i != n; ++i)
        {
            f();
        }
    }

Or, more clearly:

    using Callable = void(*)();
    void times(int n, Callable f)
    {
        for (int i{0}; i != n; ++i)
        {
            f();
        }
    }

But this would only work for functions and function pointers that are a perfect match for this signature. A function with a non-void return type couldn’t be passed, even though it would work for our purposes.

But more importantly, this approach would only work for functions and function pointers. It would not work for function objects and, most importantly of all, it wouldn’t work for lambdas that capture anything. (Only captureless lambdas convert to function pointers.)
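The limitation is easy to see side by side. In this sketch the two versions are renamed times_fp and times_any (hypothetical names, chosen only so that both can live in one file); repeat_with_capture and count_calls are likewise invented helpers.

```cpp
#include <string>

// Function-pointer version.
inline void times_fp(int n, void (*f)())
{
    for (int i{0}; i != n; ++i)
    {
        f();
    }
}

// Template version: accepts anything that can be called with no arguments.
template <class Callable>
void times_any(int n, Callable f)
{
    for (int i{0}; i != n; ++i)
    {
        f();
    }
}

// Builds a string through lambdas that capture by reference.
inline std::string repeat_with_capture(int n)
{
    std::string out;
    times_any(n, [&out] { out += "bla"; });          // capturing lambda
    times_any(n, [&out] { out += "!"; return 0; });  // non-void result ignored
    // times_fp(n, [&out] { out += "bla"; });        // would not compile
    return out;
}

// A captureless lambda converts to void(*)(), so times_fp accepts it.
inline int count_calls(int n)
{
    static int calls;
    calls = 0;
    times_fp(n, [] { ++calls; });
    return calls;
}
```

The template costs nothing at the call site and removes both restrictions, which is why the rest of the post builds on it.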

Of course we want to be as general as possible, so we’ll do this:

    #include <iostream>

    template<class Count, class Callable>
    void times(Count n, Callable f)
    {
        for (Count i{0}; i != n; ++i)
            {
            f();
            }
    }

But just as Dmitry does, we want to be able to also support a version where the Callable takes a parameter that is the loop iteration. This is one thing that I miss when using the Modern range based for syntax–sometimes I want to know what iteration we are on.

Iteration count

So we want to be able to write this:

    int main()
    {
        times(3,
              [](int i){std::cout << "counting: " << i << "\n";});

        times(3, []{std::cout << "bla\n";});
    }

We use the trick suggested by Kirk Shoop (and improved by Alberto Ganesh Barbati) to implement this:

    #include <utility>

    // #1 without argument
    template<class Count,
             class Callable,
             class Enable = decltype(std::declval<Callable>()())
            >
    void times(Count n, Callable f, ...)
    {
        for (Count i{0}; i != n; ++i)
            {
            f();
            }
    }

    // #2 with argument
    template<class Count,
             class Callable,
             class Enable =
                   decltype(std::declval<Callable>()(Count{0}))
            >
    void times(Count n, Callable f)
    {
        for (Count i{0}; i != n; ++i)
            {
            f(i);
            }
    }

Notice that the ellipsis (...) in the parameter list of #1 is introduced to prevent the compiler from objecting to the fact that we are essentially defining the same template twice, with the only difference being the default type of the Enable parameter. (As noted in the updates at the end of this post, this is the classic C-style varargs ellipsis, not a variadic template parameter pack.)

By adding the ellipsis we make the two definitions of times different so that the compiler will accept them both as overloads. (Two definitions that only differ in a default value are a redefinition, not an overload.) But because the ellipsis can match zero arguments (and will in this use case), the two definitions are really the same in practice. It doesn’t matter which definition gets the ellipsis because, as we just discussed, nothing will be passed to it.

The reason that the compiler rejects a redefinition is because it won’t know which function to call, the original or the redefinition. By using the ellipsis trick to fool the compiler into thinking that these are two different definitions when in practice they really aren’t, aren’t we just setting ourselves up for failure when we try to call the function? How will the compiler know which definition to use when we make a call that passes nothing to the ellipsis?

The trick here is that we are using something called SFINAE. If you don’t know about enable_if and/or SFINAE, then you can either just chalk this up to template magic that you’ll learn later, or you could watch this talk and learn it now.

In this code we aren’t using enable_if, which is the typical go-to tool for SFINAE; instead we have the Enable parameter, which is functioning like enable_if. That is to say that if the Callable doesn’t take a Count argument (initialized by 0) then the definition that I’ve labeled #2 with argument gets SFINAE’ed out, and if the Callable can’t be called without arguments, then the definition that I’ve labeled #1 without argument gets SFINAE’ed out.
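The selection mechanism can be isolated in a few lines. This is a stripped-down sketch of the same technique, not the times code itself; the function name arity is hypothetical. Each overload survives only if the expression in its Enable default compiles.

```cpp
#include <utility>

// #1: viable only when f() compiles. The trailing ellipsis keeps this
// signature distinct from #2.
template <class Callable,
          class Enable = decltype(std::declval<Callable>()())>
int arity(Callable, ...)
{
    return 0;
}

// #2: viable only when f(0) compiles.
template <class Callable,
          class Enable = decltype(std::declval<Callable>()(0))>
int arity(Callable)
{
    return 1;
}
```

Calling arity([]{}) leaves only #1 in the overload set and arity([](int){}) leaves only #2. A callable that can be invoked both ways would leave both candidates viable; this sketch makes no attempt to handle that case.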

Since the Enable parameter is only used for SFINAE, we never pass it a value and only use the default. The default value expression uses std::declval, which was added in C++11 specifically for this type of situation. If we didn’t have it, we’d end up using something like this:

    class Enable = decltype((*(Callable*) nullptr)())

declval just yields a reference to its template argument’s type so that it can be used in expressions in unevaluated contexts like decltype().
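Here is a short sketch of why declval is handy. The Sensor type and the Reading alias are hypothetical, invented only to have a class without a default constructor.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical type with no default constructor.
struct Sensor
{
    explicit Sensor(int id) : id_(id) {}
    double read() const { return id_ * 0.5; }
    int id_;
};

// Sensor{} would not compile, but declval<Sensor>() is fine inside decltype
// because the expression is never evaluated, only type-checked.
using Reading = decltype(std::declval<Sensor>().read());

static_assert(std::is_same<Reading, double>::value, "read() returns double");
```

No Sensor object is ever created to compute Reading; using declval in an evaluated expression is an error.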

Exploiting the generality

So far we haven’t done anything that Dmitry didn’t accomplish and some people might prefer his (Ruby inspired) syntax. But I prefer the traditional syntax because I think it is more general. Here is why I think so:

Now that we are taking an iteration count, we might want to play games with that. We’d like to index between any two arbitrary values. We could index from 10 to 15. We could even handle indexing in reverse order:

    int main()
    {
        times(10,
              16,
              [](int i){std::cout << "index: " << i << "\n";});
        times(15,
              9,
              [](int i){std::cout << "index: " << i << "\n";});

        times(3,
              [](int i){std::cout << "counting: " << i << "\n";});

        times(3, []{std::cout << "bla\n";});
    }

The implementation is the same as before with the addition of these definitions:

    #include <utility>
    // #1 without argument
    template<
            class Count,
            class Callable,
            class Enable = decltype(std::declval<Callable>()())
            >
    void times(Count start, Count end, Callable f, ...)
    {
        if (start < end)
            {
            for (auto i(start); i != end; ++i)
                {
                f();
                }
            }
        else if (start > end)
            {
            for (auto i(start); i != end; --i)
                {
                f();
                }
            }
    }

    // #2 with argument
    template<
            class Count,
            class Callable,
            class Enable =
                  decltype(std::declval<Callable>()(Count{0}))
            >
    void times(Count start, Count end, Callable f)
    {
        if (start < end)
            {
            for (auto i(start); i != end; ++i)
                {
                f(i);
                }
            }
        else if (start > end)
            {
            for (auto i(start); i != end; --i)
                {
                f(i);
                }
            }
    }

Notice that if the Callable doesn’t take the index argument, we don’t really care what order we index, but we still need to get the direction correct in order to know if we need to increment or decrement.

Next step

Now that we’ve done this, I’d guess that you see the next step coming. We’d like to increment (or decrement) by values other than one. So we’d like to be able to do this:

    int main()
    {
        times(10,
              20,
              3,
              [](int i){std::cout << "i: " << i << "\n";});
        times(20,
              10,
              3,
              [](int i){std::cout << "i: " << i << "\n";});

        times(10,
              16,
              [](int i){std::cout << "index: " << i << "\n";});
        times(15,
              9,
              [](int i){std::cout << "index: " << i << "\n";});

        times(3,
              [](int i){std::cout << "counting: " << i << "\n";});

        times(3, []{std::cout << "bla\n";});
    }

The implementation is the same as before with the addition of these definitions:

    #include <utility>
    // #1 without argument
    template<
            class Count,
            class Callable,
            class Enable =
                  decltype(std::declval<Callable>()())
            >
    void times(Count start,
               Count end,
               Count delta,
               Callable f,
               ...)
    {
        assert(delta != 0);
        if (delta < 0)
            {
            delta = 0 - delta;
            }
        if (start < end)
            {
            for (auto i(start); i < end; i += delta)
                {
                f();
                }
            }
        else if (start > end)
            {
            for (auto i(start); i > end; i -= delta)
                {
                f();
                }
            }
    }

    // #2 with argument
    template<
            class Count,
            class Callable,
            class Enable =
                  decltype(std::declval<Callable>()(Count{0}))
            >
    void times(Count start, Count end, Count delta, Callable f)
    {
        assert(delta != 0);
        if (delta < 0)
            {
            delta = 0 - delta;
            }
        if (start < end)
            {
            for (auto i(start); i < end; i += delta)
                {
                f(i);
                }
            }
        else if (start > end)
            {
            for (auto i(start); i > end; i -= delta)
                {
                f(i);
                }
            }
    }

If the delta parameter is zero, this is a logic error, so we assert(), as is appropriate for logic errors. That means that we also need:

    #include <cassert>

The direction of the delta is ambiguous in normal conversation. One could say “Count from twenty to ten by threes.” and not be expected to say “Count from twenty to ten by negative threes.” So we’ll take it as given that the direction of the delta is determined by the start and end values of the indexing range.

Note that we need to change our terminating condition. Up until now we’ve followed the STL inspired practice of terminating with the general:

    i != end

but now that we step by values other than one, we need the less general loop termination (i < end or i > end) to avoid overshooting the end point.
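A quick sketch of the overshoot. The helper visited is hypothetical; it handles only the ascending case and assumes delta > 0.

```cpp
#include <vector>

// Collects the values visited when stepping from start toward end
// (exclusive). Assumes delta > 0 and start <= end.
inline std::vector<int> visited(int start, int end, int delta)
{
    std::vector<int> out;
    // With i != end, stepping 10, 13, 16, 19, 22, ... would never hit 20
    // and the loop would run past the end point.
    for (int i = start; i < end; i += delta)
    {
        out.push_back(i);
    }
    return out;
}
```

Stepping from 10 to 20 by 3 visits 10, 13, 16, and 19, then stops because 22 is not less than 20.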

Arbitrary parameters

The last generality I want to address is to allow us to pass arbitrary values to be used as parameters to the Callable. We see this pattern in the call to std::async(). We’d like to be able to write:

    int main()
    {
        times(3,
              6,
              1,
              [](int i) {std::cout << "4: " << i << "\n";},
              4);
        times(3,
              6,
              1,
              [](int i, float f)
                  {std::cout << i << " " << f << "\n";},
              4,
              0.4);
        times(3, 6, 1, [] {std::cout << "no params\n";});
        times(3,
              6,
              1,
              [](int i) {std::cout << "count: " << i << "\n";});
    }

That is to say, we want to be able to pass any number of any type of parameters after the Callable and these parameters should be perfect forwarded to the Callable.

We don’t want to break any of our existing functionality (which depends on the ellipsis trick) and we don’t want to cause any ambiguity if we happen to pass a parameter of the same type as the Count parameter.

For our implementation we just need to add a new definition that will only be enabled (using our friend std::enable_if) when there are parameters passed after the Callable. That is, when the parameter pack is not empty.

Here is our implementation. Note that we need to add this same definition for the flavors that have one and two Count parameters before the Callable.

    template<
            class Count,
            class Callable,
            class... Args
            >
    auto times(Count start,
               Count end,
               Count delta,
               Callable f,
               Args&&... args) ->
    std::enable_if_t<(sizeof...(args) > 0)>
    {
        assert(delta != 0);
        if (delta < 0)
            {
            delta = 0 - delta;
            }
        if (start < end)
            {
            for (auto i(start); i < end; i += delta)
                {
                f(std::forward<Args>(args)...);
                }
            }
        else if (start > end)
            {
            for (auto i(start); i > end; i -= delta)
                {
                f(std::forward<Args>(args)...);
                }
            }
    }

In the previous versions, we didn’t need to name the trailing ellipsis because we never used it. (It was only there to prevent the compiler from complaining about our redefinition of times().) But here we are going to use a real parameter pack (we are going to perfect forward it to the Callable), so we need to give it a type (our template parameter pack Args, declared as Args&&...) and a name, args.

The && following the template type means that this is a forwarding reference. It has the same syntax as an rvalue reference, but due to the magic of reference collapsing, args is a pack of lvalue and rvalue references depending on what was deduced from what is passed. When we pass them on with std::forward<Args>(args)… the compiler will pass them to the Callable so that each parameter is the type (rvalue or lvalue) that the reference is bound to.

This is the secret of perfect forwarding. How this happens is really beyond the scope of what I want to talk about here, but I did want to show how to perfect forward an arbitrary set of parameters à la async(), emplace(), and make_shared().
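Without going into how it works, the effect of std::forward can be observed with a small sketch. The names category, relay, and relay_without_forward are hypothetical.

```cpp
#include <string>
#include <utility>

// Two overloads that report how their argument arrived.
inline std::string category(int&)  { return "lvalue"; }
inline std::string category(int&&) { return "rvalue"; }

// Relays its argument, preserving the value category of the original.
template <class T>
std::string relay(T&& value)
{
    return category(std::forward<T>(value));
}

// Without std::forward, a named parameter is always an lvalue.
template <class T>
std::string relay_without_forward(T&& value)
{
    return category(value);
}
```

Given an int variable n, relay(n) reports an lvalue and relay(2) reports an rvalue, while relay_without_forward(2) reports an lvalue because the rvalue-ness was lost on the way through.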

We are also using std::enable_if to disable this definition when the parameter pack is empty.

Template type functions, which is what enable_if is, are templates that take types (and compile time constant values) as parameters and return a type (as a nested type named type). enable_if takes two parameters, one a compile time Boolean constant and the other a type. If the constant is true, then the type function returns the passed type. If the constant is false, the function definition is disabled, that is it is SFINAE’ed out of the overload set. The overload set is the set of functions that the compiler will consider when, in this case, we call times.

Since enable_if is a type function, we would need to write it like this:

    typename std::enable_if<(sizeof...(args) > 0), void>::type

except there are some shortcuts. The second parameter (the type) defaults to void, which is what we want in this case, so we don’t need to specify it. Also we can use enable_if_t (added in C++14). This is a type alias for the nested type referred to as ::type, so we end up just writing:

    std::enable_if_t<(sizeof...(args) > 0)>

Note that we need the parentheses around the compile-time constant Boolean expression because it contains the “>” character which will be seen as closing the template parameter block if it is not in parentheses.
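The behavior of enable_if as a type function can be checked directly. This is a sketch; the kind overloads are hypothetical and exist only to show one definition being removed from the overload set.

```cpp
#include <string>
#include <type_traits>

// When the condition is true, the nested type is the second argument...
static_assert(std::is_same<std::enable_if<true, int>::type, int>::value, "");
// ...which defaults to void.
static_assert(std::is_same<std::enable_if<true>::type, void>::value, "");
// enable_if_t is shorthand for typename enable_if<...>::type. The
// parentheses keep the > from closing the template argument list.
static_assert(std::is_same<std::enable_if_t<(2 > 1)>, void>::value, "");

// Enabled only for integral T.
template <class T>
std::enable_if_t<std::is_integral<T>::value, std::string> kind(T)
{
    return "integral";
}

// Enabled only for everything else.
template <class T>
std::enable_if_t<!std::is_integral<T>::value, std::string> kind(T)
{
    return "other";
}
```

For any given T exactly one of the two kind definitions survives substitution, so a call is never ambiguous.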

The magic of enable_if is that it makes a function template definition “go away” (when the Boolean value is false) no matter where it is in the definition. It can be any of the parameter types, a default value of a parameter type, or the return value. Since I’m using it as the return value here and because it depends on an identifier defined in the parameter block (args), we need to use the trailing return type syntax:

    auto times(...) -> return_type

The Boolean compile time constant that we are passing to enable_if is sizeof...(args) > 0. This use of sizeof takes a parameter pack (that’s what the ... is for) and, instead of returning a number of bytes as sizeof usually returns, it returns the number of parameters in the parameter pack. In our case we don’t care about the number, we only care that it is non-zero, that is to say, that the parameter pack is not empty. If the parameter pack is empty for a particular call, the compiler ignores this definition for that particular call, because of enable_if. If the parameter pack is not empty (there are parameters after the Callable) then the definition is used and its return value is void.
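A reduced sketch of the same idea, with hypothetical names count_args and has_extras. The variadic has_extras uses the trailing return type exactly as times does, so it disappears when nothing follows the first argument.

```cpp
#include <cstddef>
#include <type_traits>

// sizeof... yields the number of elements in a pack.
template <class... Args>
constexpr std::size_t count_args(Args&&...)
{
    return sizeof...(Args);
}

// Enabled only when at least one argument follows the first.
template <class First, class... Rest>
auto has_extras(First, Rest&&... rest) ->
std::enable_if_t<(sizeof...(rest) > 0), bool>
{
    return true;
}

// Fallback: selected when the pack would have been empty.
template <class First>
bool has_extras(First)
{
    return false;
}
```

A call with one argument sees only the fallback; a call with two or more sees only the variadic definition.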

Comments

This exercise was, to me, less about a useful implementation (unlike Dmitry, I’ve not used Ruby or times and don’t crave them) than about playing with implementing a simple library that explores some nice implementation techniques.

If you’d like to share your thoughts, please comment on the reddit post.


Updates

This post has been updated based on the reddit comments. My thanks to all the commenters.

mark_99 pointed out that the ellipsis trick is classic varargs, not the modern variadic template parameter feature.

Bobby Graese? (TrueJournals) pointed out that I was needlessly using decltype to declare i.

Thanks to Alberto Ganesh Barbati (iaanus) for teaching me about declval.

The comments also contain an application of if constexpr by Vittorio Romeo (SuperV1234) that really cleans up this exercise. I didn’t update for this because if constexpr is a C++17 feature.
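For readers who can use C++17, a sketch of what an if constexpr version might look like follows. This is not Vittorio’s code; the name times17 and the use of std::is_invocable_v are my assumptions about one reasonable way to write it.

```cpp
#include <type_traits>

// C++17 sketch: one definition, with the branch chosen at compile time.
template <class Count, class Callable>
void times17(Count n, Callable f)
{
    for (Count i{0}; i != n; ++i)
    {
        if constexpr (std::is_invocable_v<Callable&, Count&>)
        {
            f(i);   // the callable accepts the iteration index
        }
        else
        {
            f();    // the callable takes no arguments
        }
    }
}
```

One difference from the overload-based version: a callable that can be invoked both ways is not ambiguous here, because the indexed call is simply tried first.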

Vittorio also suggests that “You should also consider passing f by forwarding-reference, in order to avoid unnecessary copies for non-zero-sized callable objects.” This is not consistent with the standard practice in the STL of just passing callables by copy. That is based on the assumption that callables are usually cheap to copy. There is a readability trade-off. You be the judge. Here is this suggestion applied to the first cut:

    #include <iostream>

    template <class Callable>
    void times(int n, Callable&& f)
    {
        for (int i{0}; i != n; ++i)
        {
            std::forward<Callable>(f)();
        }
    }

And thanks to Louis Dionne for reminding us that Hana does it better.

Undefined Behavior and CERT’s Vulnerability Note

There were a lot of interesting comments to last week’s post on Apple’s secure coding guide and I plan to follow up on those in future posts, but I first wanted to make a comment of my own on the vulnerability note from cert.org that was referenced by Apple’s document and by my post.

CERT’s vulnerability note

The vulnerability note’s overview states:

Some C compilers [and C++ compilers – JK] optimize away pointer arithmetic overflow tests that depend on undefined behavior without providing a diagnostic (a warning). Applications containing these tests may be vulnerable to buffer overflows if compiled with these compilers.

Undefined behavior

When your code exhibits undefined behavior, the compiler is not constrained by the language standard and any behavior at all is acceptable (to the standard). That is why we say the behavior is “undefined.” Insert joke on nasal daemons here.

Saying that the compiler is free to generate any code it wants is an obvious way of phrasing this, but it is really looking at it the wrong way around. Looked at from the compiler writer’s perspective a better way of phrasing it would be:

The compiler is free to assume that undefined behavior never happens so no code needs to be generated to handle such cases and no code needs to be generated to test for such cases.

The scary thing is that if you are writing code that actually does encounter undefined behavior, it is extremely unlikely that you’ll be happy with the outcome that results from these optimizations.

So the take-away is that we shouldn’t write code that has undefined behavior. But this isn’t news. That has been standard advice since the beginning of time. (Which, according to Unix, was 1970-01-01.)

What is new is that modern compilers are more and more starting to exploit the freedom granted them in undefined behavior cases and have been more aggressive about identifying those cases and optimizing out code that would deal only with them.
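To illustrate the kind of test at issue, here is a sketch. The function fits and its parameters are hypothetical, and the safe check shown is one defensible rewrite, not necessarily the fix that the vulnerability note itself recommends.

```cpp
#include <cstddef>

// Problematic pattern (shown only as a comment): forming buf + len beyond
// the end of the array is undefined behavior, so a compiler may assume the
// test is always false and remove it.
//
//     if (buf + len < buf) { /* handle "overflow" */ }

// A check that stays within defined behavior: compare len with the space
// actually available. Assumes buf and buf_end point into (or one past the
// end of) the same array, with buf <= buf_end.
inline bool fits(char const* buf, char const* buf_end, std::size_t len)
{
    return len <= static_cast<std::size_t>(buf_end - buf);
}
```

The rewritten check never computes an out-of-range pointer, so there is nothing for the optimizer to assume away.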

Warnings

Note that part of the CERT’s vulnerability note overview states that a diagnostic isn’t required. This is true, but naive readers might be tempted to think that requiring such a diagnostic would be a good idea. It would not.

Consider a function with a precondition of a non-null pointer parameter because the pointer will be dereferenced in the function. Do we want the compiler to warn us that it is optimizing out code for the null pointer case? It isn’t possible for the compiler to determine which optimizations are for undefined behavior which is known about (and which we are careful to prevent) and which optimizations are for undefined behavior which would be a surprise to us.

Requiring the compiler to warn for every undefined behavior optimization would result in an avalanche of false positives, and users would end up silencing all such warnings.

Some Undefined Behaviors are More Equal Than Others

So the problem is that coders are writing code with undefined behavior and they need to fix that, right? Well, not according to CERT:

Application developers and vendors of large codebases that cannot be audited for use of the defective wrapping checks are urged to avoid using compiler implementations that perform the offending optimization. Vendors and developers should carefully evaluate the conditions under which their compiler may perform the offending optimization. In some cases, downgrading the version of the compiler in use or sticking with versions of the compiler that do not perform the offending optimization may mitigate resulting vulnerabilities in applications. [emphasis mine – JK]

That’s right, the problem isn’t that we have code with undefined behavior, the problem is that nasty compilers are using “offending” optimizations.

To give the vulnerability note its due, it does give a coding solution to the example problem, explaining how to fix the issue with better code. But I found the quoted statement both surprising and bothersome. The attitude that it is okay to have broken code, as long as we don’t upgrade our compilers, is hard to swallow.

As compilers mature they are generating better code (modulo some regressions) and it is likely that the code they are generating for you is more secure and less likely to have subtle bugs with each subsequent revision. Asking developers to opt out of compiler improvements so that they can avoid fixing broken code makes me suspicious of the commitment to code quality.

Please post comments on Google Plus.

Undefined Behavior and Apple’s Secure Coding Guide

Recently Apple released its Secure Coding Guide (dated 2014-02-11). This is filled with the kind of good advice you’d expect to see from a high-tech firm that is committed to helping developers on their platform create secure code.

But I want to call your attention to the section called Avoiding Integer Overflows and Underflows. On page 28 is this code snippet:

size_t bytes = n * m;
if (bytes < n || bytes < m) { /* BAD BAD BAD */
    ... /* allocate "bytes" space */
}

Apple’s document doesn’t explicitly say that n and m are signed integers, but we’ll assume they are because the undefined behavior discussed in the reference only occurs for overflows of signed integers.

What this code is attempting to do is to detect if the expression n * m has overflowed. As the document states, this approach won’t work. It references CWE-733, CERT VU#162289. The issue here is that in order for the condition to be true, the expression must have overflowed. But overflow of a signed integer is undefined behavior in C and C++. So the condition can never be true unless the program exhibits undefined behavior. So the compiler is free to remove the if statement.

In other words, in the case where the condition is false, removing the if statement doesn’t matter (because it wouldn’t be executed anyway and removing it speeds up the code) and in the case where the condition is true, the program exhibits undefined behavior and the compiler can emit any code at all in that case, including code without the if statement. So either way, no if statement.

This is a real-world optimization that is produced by modern compilers. They optimize for the case where the code doesn’t produce undefined behavior and they ignore the undefined behavior case.

So Apple is correct to warn about this. The problem is that the document then recommends an example solution with the following snippet.

size_t bytes = n * m;
if (n > 0 && m > 0 && SIZE_MAX/n >= m) {
    ... /* allocate "bytes" space */
}

I have to say I’m a little surprised by this!

How can a document that just explained the problem of undefined behavior of signed integer overflow recommend a solution that has undefined behavior triggered by signed integer overflow?

The confusion here is that the document has not correctly determined the real issue. The document assumes that the problem with the first snippet is that the condition in the if statement can only be true in the case of undefined behavior, so the compiler can and will remove the test.

While that is true as far as it goes, that isn’t the crux of the problem. The undefined behavior happens on the first line of the snippet. If the expression n * m results in signed integer overflow, then it doesn’t matter what follows; the code has entered the realm of undefined behavior and all bets are off.

In the first code snippet the assumption is that the first line exhibits the wrapping behavior that most processors will perform. The document correctly points out that isn’t a correct assumption. But in the second snippet, offered by the document as a solution, the assumption is that the first line won’t abort the program or do something else unexpected if the value overflows. But that is exactly what the language does not promise. How you detect undefined behavior after the fact is irrelevant. Once you’ve stepped outside the line of defined behavior, it is too late to pull back.

So what is the solution to a situation like this?

The document is on the right track to a solution. The key is to be able to detect the overflow situation without triggering it. Or in this specific case, detect that n * m would overflow, without actually calculating the value of bytes. But putting the detection after the calculation of bytes defeats that purpose because by then we’ve triggered undefined behavior. We need to use the detection to avoid the calculation that would result in undefined behavior. Something like this:

if (n > 0 && m > 0 && SIZE_MAX/n >= m) {
    /* multiply as size_t: the check above bounds the product by SIZE_MAX */
    size_t bytes = (size_t)n * (size_t)m; /* will not overflow */
    ... /* allocate "bytes" space */
} else {
    /* handle the overflow case */
}

With this approach, bytes is not calculated until after we have determined that the calculation will not overflow so undefined behavior is avoided.
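Here is a self-contained sketch of the same check-before-compute idea. The helper name checked_byte_count and the choice of std::ptrdiff_t as the signed count type are assumptions made for illustration. Because the check bounds the product by SIZE_MAX, the sketch converts both operands to size_t before multiplying. Multiplying the signed values directly could still overflow the signed type even when the product fits in a size_t.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: every positive ptrdiff_t value is representable as a size_t.
static_assert(sizeof(std::ptrdiff_t) <= sizeof(std::size_t),
              "positive ptrdiff_t values must fit in size_t");

// Hypothetical helper: stores n * m in out only when the product fits in a
// size_t. Returns false, leaving out untouched, when either count is not
// positive or when the product would exceed SIZE_MAX.
bool checked_byte_count(std::ptrdiff_t n, std::ptrdiff_t m, std::size_t& out)
{
    if (n <= 0 || m <= 0) {
        return false;
    }
    const std::size_t un = static_cast<std::size_t>(n);
    const std::size_t um = static_cast<std::size_t>(m);
    if (SIZE_MAX / un < um) {
        return false;      // n * m would not fit; do not compute it
    }
    out = un * um;         // unsigned multiply, already known to fit
    return true;
}
```

A caller would allocate only when the function returns true and handle the overflow case otherwise.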

Marshall Clow has a talk on undefined behavior at the upcoming C++Now.

Shout out to Microsoft MVP Bruce for pointing out this issue.

Please post comments on Google Plus.

 

What is a User-Defined Type?

The meaning of “user-defined type” is so obvious that the Standard doesn’t define it. But it uses the term over a dozen times, so it might be good to know what it means.

Prof. Stroustrup knows what it means and he is very clear that any type that is not built-in is user-defined. (See the second paragraph of section 9.1 in Programming Principles and Practice Using C++.) He even specifically calls out “standard library types” as an example of user-defined types. In other words, a user-defined type is any compound type.

The Standard does seem to agree in several places. For example in [dcl.type.simple] the standard says:

The other simple-type-specifiers specify either a previously-declared user-defined type or one of the fundamental types. (Emphasis mine.)

In context, it is pretty clear that std::string (for example) is a simple-type-specifier and it clearly isn’t a fundamental type so it must be a user-defined type. (In [basic.fundamental], the standard says that there are two kinds of types: fundamental types and compound types. It then lists the fundamental types–there are no Standard Library types in the list.)
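The fundamental/compound split can be checked with the standard type traits. This is only an illustrative sketch, and my_type is a hypothetical type standing in for one that the program defines.

```cpp
#include <string>
#include <type_traits>

struct my_type {};   // hypothetical type defined by the program itself

static_assert(std::is_fundamental<int>::value,
              "int is a fundamental type");
static_assert(!std::is_fundamental<std::string>::value,
              "std::string is not a fundamental type");
static_assert(std::is_compound<std::string>::value,
              "std::string is a compound type");
static_assert(std::is_compound<my_type>::value,
              "my_type is a compound type");
```

Note that the traits put std::string and my_type in the same category. Nothing here separates Standard Library types from types the program defines.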

So what’s the problem? The problem is that sometimes the term seems to be used in a way that implies that Standard Library types are not user-defined.

Consider this from [namespace.std]:

A program may add a template specialization for any standard library template to namespace std only if the declaration depends on a user-defined type and the specialization meets the standard library requirements for the original template and is not explicitly prohibited. (Emphasis mine.)

Here the standard is saying that it is legal to create a specialization like:

namespace std
{
    template <>
    void swap<my_type>(my_type&, my_type&) {…}
}

This is okay because:

  • swap is a Standard Library template and
  • my_type is a user-defined type.

The reason that the standard makes the restriction that the template type must be user-defined is because it wouldn’t do to allow users to do something like this:

namespace std
{
    template <>
    void swap<int>(int&, int&) {…}
}

Allowing a user to define the implementation of std::swap() could lead to nasty surprises for some users. In so many words, the Standard is saying that you only get to define the implementation of std::swap() for types that you define yourself, not for int, double, etc. But what about std::string? Does the Standard intend for users to legally do this:

namespace std
{
    template <>
    void swap<string>(string&, string&) {…}
}

and provide an implementation of std::swap() for std::string of the user’s choosing? This is what the Standard is saying, if we choose to interpret the meaning of user-defined types as any compound type.

There are several similar references in the Standard, where something is permitted only when at least one type is a user-defined type. Consider:

  • common_type in Table 57 of [meta.trans.other],
  • the last requirement in the first paragraph of [unord.hash], and
  • is_error_code_enum and is_error_condition_enum of [syserr].

In these and other references, it really doesn’t make sense to allow users to create specializations for Standard Library types.
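For contrast, here is a sketch of a specialization that is uncontroversial under either reading of the term, because it depends on a type that the program itself defines. The namespace my_lib and the type point are hypothetical, and the hash combination is purely illustrative.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_set>

namespace my_lib {               // hypothetical user namespace

struct point {                   // hypothetical type defined by the program
    int x;
    int y;
};

inline bool operator==(const point& a, const point& b)
{
    return a.x == b.x && a.y == b.y;
}

} // namespace my_lib

namespace std {

// The specialization depends on my_lib::point, a type the program defines.
template <>
struct hash<my_lib::point> {
    std::size_t operator()(const my_lib::point& p) const noexcept
    {
        // Illustrative combination of the two member hashes.
        return std::hash<int>()(p.x) * 31u + std::hash<int>()(p.y);
    }
};

} // namespace std
```

With this in place, std::unordered_set<my_lib::point> works without naming a hash function explicitly.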

My informal, not statistically significant, survey of a couple of friends of mine on the Standards Committee indicated that there are Committee members who don’t use the any-compound-type-is-a-user-defined-type definition, but instead accept (as one of them said) that:

user defined types are, broadly speaking, types that aren’t mentioned in the standard.

I think the Standard should be clear about the definition of this term and not leave it up to us to guess, because, while user-defined types are good for C++, user-defined terms are not.

Let me know what you think.

Unofficial Update on C++Now 2014

I want to point out that what I’m saying here is “unofficial.” Any dates or details about C++Now that are known for certain we would publish on the official C++Now website. I’ve gotten a number of messages from people about the 2014 conference and I wanted to let people know what to expect.

Dates

The dates are May 12-17. This is official, but it might be a little surprising or confusing. For all of conference history, going back to the misty beginnings of BoostCon, the conference has started with registration followed by a reception or other socializing on Sunday and the first technical sessions on Monday morning. This year registration and the reception will be on Monday with the first technical sessions starting on Tuesday morning. The conference isn’t shorter; it will end on Saturday instead of Friday. We’ve made this change to accommodate individuals that don’t want to travel on Sunday, which is Mother’s Day in the US.

Hotel

Also official is that the last day to get the Early Bird discount (saving $20 per night) on hotel rooms at the Aspen Meadows is January 10th. This is coming right up, so make your arrangements right now. Do not wait for registration to open!

Here is what is unofficial:

Content

Although we did have to extend the submission deadline (it seems that we always do) we ended up with a nice number of quality submissions (we always do!), so we’ll have three tracks just like we did the last two years. We haven’t finished reviewing the submissions, so it will be a while before we will have a schedule online or even be able to let submitters know what has been accepted.

Student/Volunteers

We will again have Student/Volunteers. We did this for the first time last year and it was quite successful. Last year we had seven volunteers (grad students, undergrads, and a high school student) and we were able to raise funds for travel and lodging for all of them. We found that seven was more than we needed, so this year we’ll probably have fewer. We will certainly waive registration fees for volunteers, but how much we can help with travel and lodging depends on how successful we are raising funds for that purpose. If you are or know someone who is interested in applying to be a Student/Volunteer, be ready to submit your application. We’ll want to see your résumé and a personal statement about why we should choose you. We’ll probably be accepting these Real Soon Now™.

Registration

But what most people seem to be concerned about is registration. This isn’t surprising. Last year, for the first time, we sold out. It was very gratifying to sell out. But it was very painful to have a waiting list full of people that we had to disappoint.

We are pretty close to opening registration for 2014. We expect to sell out again this year. In fact, we expect to sell out even sooner because, unlike last year, people know that we’ve sold out before. Last year we sold out by the middle of March. How soon we’ll sell out this year is anybody’s guess, but it is clear already that people are concerned about ending up on the wait list.

So my advice, if you are interested in attending is:

  1. Make your hotel arrangements now!
  2. If you need approval from your boss or spouse to attend, start working on that now.
  3. Watch for the “registration is open” announcement.
  4. Don’t hesitate when registration opens. (Unless you have made a presentation submission – see Submitters below.)

Where should you look for the “registration is open” announcement? Well, you could check the official website every day, but who is going to remember to do that? If you use a feed reader, there is an RSS feed you can follow. You can also follow the official twitter account. You can also expect the announcement on any of these Boost mailing lists: the user list, the developer list, and/or the interest list. I also expect to see it reported on isocpp.org. (If you were on last year’s wait list, you’ll receive an email from Marshall Clow, the conference registrar, with the announcement.)

Submitters

If you have made a session submission, do not register now. If you’ve submitted a session, we’ll hold a place for you. When the decision is made about which submissions will be accepted, we’ll contact you with instructions about how you can register. Even if we’ve decided not to accept your submission, you’ll have a chance to register.

Good luck and I hope to see you all in Aspen this May!

Comment on Google+.

C++ Amusements

By the time you read this, I’ll probably be “under the knife” or recovering from same.

The surgeon is predicting a couple of rather uncomfortable days followed by over two weeks of taking it easy. I won’t be able to operate heavy machinery so it’s fortunate that my main development machine is a laptop. I should have several days of naps being broken up by sessions on the computer. Which brings me to the point of this blog.

What are your C++ time killers amusements?

I go to the Standard C++ Foundation’s isocpp.org website at least once a day to see the latest in the C++ world. Hi Herb, Eric, Marshall… (Rumor has it that there is a lot more content coming to that site. Keep an eye on it.) In addition to news about C++ events and postings, it has some recent StackOverflow questions. If one of those looks interesting I may be sucked in to look at a few questions and there goes an hour.

Of course if I know I have an hour or so, I may look for an interesting online video. There are some great sessions recorded at C++Now and GoingNative. Channel 9 sometimes has a C++ and Beyond session. The recording quality is better for the Channel 9 videos, but the BoostCon videos cover a wide range of topics and have a lot of content.

If you didn’t get enough questions on StackOverflow, try out the just launched C++ Quiz. Hi Anders. By the way here is my C++ trivia question (apologies to those that already saw this on twitter):

Q) what is the name of the only std::exception member function to return a pointer
A) true

If you didn’t get enough interesting links on isocpp.org, spend a few minutes on twitter searching for #cpp or #cpp11 (or even #cpp14). You’ll likely find something worth reading and several people worth following. Hi James, Kate, Jens, Andrey, Eric, Marshall, Anna-Jayne, Dean, Anders, Diego.

If twitter isn’t your thing, look at the C++ sub reddit. Hi STL. Just about everything of interest to C++ programmers will get posted there. But there is a bit of noise mixed with the signal. For a purer experience stick with established C++ blogs like the ones I’ve listed on my blog roll at the right. Hi everybody.

You might not expect good technical conversations on a social network, but Google+ is pretty good. Why not drop by and let me know what ideas you have for how I can spend my time while I’m recuperating.

Boost to Git Modular

As has been reported in a few places, Boost is transitioning from Subversion to Git. The Boost Steering Committee has voted to “push the button” which amounts to shutting down the SVN repository and making a final run of the conversion script. There are a lot of details available on the Boost wiki.

The motivating issue for this transition is not just to see Boost move to a repository system that has become the de facto standard for open source projects, but to support better modularity for the Boost Libraries. Boost policy has been to encourage developers to depend on existing Boost libraries rather than to “re-invent the wheel.” But this leads to a lot of inter-library dependencies that discourage users that may be interested in only a small number of libraries.

Moving to Git doesn’t solve this problem, but it does move in the direction of modularity which is more and more important as the number of Boost libraries grows. (When I first visited boost.org, there were four libraries. Now I count one hundred twenty-six libraries.)

I want to make a big shout out to everyone that has worked to make this transition possible. I wish I knew all the names, but some of the ones that I do know are: Dave Abrahams, Daniel Pfeifer, John Wiegley, and Beman Dawes. If you know someone who helped with this transition, please add a comment.


My goal is to update this blog every Tuesday, but next Tuesday I’ll be having major surgery so it is unlikely that I’ll be online for a while. If I don’t have something up by Monday night, it will likely be a while.