Tutorials for C++Now

The C++Now Call for Submissions for 2014 is out. You might think that after releasing it, we just sit back and wait for all the great submissions to come rolling in. I hope it looks that easy to the outside world, but the truth is that encouraging potential speakers to submit sessions is a year-round job that just gets more intense after the Call for Submissions. We are constantly looking for people with interesting things to say about C++ and asking them if they’d like to share at C++Now.

In 2012, when we changed the conference name from BoostCon to C++Now, we also added a third track of presentations. We wanted to create space for tutorials, specifically tutorials on the new language and library features. For this track, we are not just looking for good speakers, we also are looking for coverage of specific topics.

This year Scott Schurr is leading the effort to make certain that we are presenting an excellent tutorial track. I have his permission to share some of the ideas that we are pursuing.

Although some of the talks have been proposed with specific presenters in mind, most of these talks are “unclaimed” by any particular presenter. Why am I sharing them? Boost is an open-source library and I thought we could experiment with adding some transparency to our conference planning.

I’m also hoping that this list might be inspirational. Perhaps it will inspire you to claim one of these topics for yourself and make a submission of it or of a completely different topic (the deadline is December 8th). If you have any ideas or questions about making a submission, you can contact the program committee. If you’d just like to comment on the list, you can do that at Google Plus or reddit.

“Moving Well.” Guidance for the best way(s) to design a movable class. This can be complicated by interactions between movability and perfect forwarding (See Scott Meyers’ reflections on this problem at http://scottmeyers.blogspot.com/2013/07/video-for-universal-referenceoverloadin.html)
“How and Why to Implement Value Semantics in C++.” This talk might start out with an introduction to the ideas behind value semantics. A talk that discusses moving, deleting, defaulting, explicit conversion functions and delegating construction would be very valuable.
“The Libraries That You Should Know About.” Consider the “optional” library (no longer in C++14), “flat_set”, and reference wrappers. Consider about 5 minutes each on twelve small libraries. This could include standardized but under-appreciated libraries as well as Boost and Adobe Source Libraries.
“Unicode Support in Standard C++.” A lot of work has gone on here. What can you do with Unicode and just standard C++ in real-life situations?
“Intro to Functional Programming in C++.” What’s the point of functional programming? Don’t I have to do that in Haskell? Is functional programming an all-or-nothing thing? Can I benefit from making my C++ “more functional”? If so, how would I do that?
“Intro to Concurrency.” Obviously this could span two or three sessions and would make a terrific workshop.
“A Survey of Algorithms.” If you haven’t seen Sean Parent’s talks at Going Native, stop reading this and go watch them. (http://channel9.msdn.com/Events/GoingNative/2013/Cpp-Seasoning and http://channel9.msdn.com/Events/GoingNative/2013/Inheritance-Is-The-Base-Class-of-Evil) One reaction to these talks is that “we” don’t really know what algorithms are available to us. Another point is that even if we know the available algorithms we don’t necessarily recognize them in the wild. Sean recognized a particular operation as a kind of rotate. Insights into how to recognize algorithms would be invaluable. If this topic were restricted to std, then it might be a short talk, but if we include Boost and the Adobe Source Library there could be real gems.
“How To Do Random Numbers Right (and How Not To Do Them).” We should ask STL to do his talk from Going Native as a tutorial.
“RegEx in C++.” STL did a great tutorial on this at Going Native. We could ask STL for a repeat, or we could ask someone else to give their take on it.
“Building Great Hashing Functions.” Perhaps this could be combined with bucket size/rehashing strategies for unordered containers.
“Fixed Size Arrays.”
“Alignment Control in C++11.” Most programmers don’t think about alignment. This talk could start with an introduction to alignment including where the requirement comes from. From there the talk could go into alignof, alignas, and aligned_storage. It could finish up with a discussion of over-aligned data types (see N3396) and how to manage them.
“How To Build a Perfect Function Wrapper.” Suppose I want to do “something.” (I always use the example of timing how long we spend inside an API. This is conceptually very simple. You just start timing when you are about to make an API call and stop when you finish.) But you want to be able to wrap arbitrary calls. So you need to use variadic template parameters and perfect forwarding. This may be too simple to justify a complete session. Maybe it could be combined with a tutorial on bind, function, and lambda. (Or a tutorial on the time library.)
“Managing Object Lifetimes.” Start with RAII, move on to smart pointers, then show how the combination gives you 90% of the exception-safety that you want. Finish up with a discussion on passing parameters – how to make your APIs be more explicit about ownership and lifetimes.
“Compilation Errors: The Best Kind of Errors.” When I’ve tried talking to typical programmers about compile-time error checking they lift their noses and suggest that it couldn’t be useful. But with careful design many kinds of errors (admittedly not all) can be hoisted to compile- or link-time. We need to promote finding programming errors early.
“SFINAE Boot Camp.” This is a really important technique that is more of a black art than a science. If someone can present rhyme and reason for how to use SFINAE in a variety of circumstances I think a lot of people would benefit from that.
“Understanding &&.” You’ve no doubt noticed that && in a signature does not act like & or * in a signature. A session that helps folks develop an intuition for && and its effects would be great. All the experts already have this nailed, but this could be a valuable topic for mid-level programmers. Perhaps this topic could incorporate the first topic, “Moving Well.”
“New Features in C++14.” Surely we ought to cover this topic.
“C++11/14 Glossary.” C++11 extends rvalues and lvalues to include a total of five different value categories. PODs used to be important, but now aggregates, trivial classes, and standard-layout classes carry more of the semantic load. The volatile keyword has a new meaning in C++11 that it didn’t have in C++98. A deep dive into some of these vocabulary changes (and others) could be useful for many mid-level programmers.
“What’s My Value?” To mix it up a bit, we could do C++ related questions in some sort of game show format. Imagine 45 minutes of glossary followed by a 45 minute game show format presentation with questions about the preceding material. This could also be applied to a talk about the memory model followed by game show style questions.
“Building a Boost Reference Application.” Boost could have a reference application (or maybe several) which uses many Boost libraries to do something useful. The reference application would be an example of how to use Boost libraries in the real world.
Topics that don’t focus on a specific feature. The Meeting C++ conference (http://meetingcpp.com/index.php/schedule13.html) has several such topics this year:
- Simpler C++ code through C++11
- Clean C++ – throw down the gauntlet against software entropy!
- Generic Programming for the rest of us
- Efficient Team Development for C++ Projects
Presentations on tools C++ developers typically use or might want to use: Git, CMake, Quickbook (and how to use the tool chain to generate documentation), etc.
Workshop to organize “review teams” (probably easier than finding individuals who have to do all the work 🙂) which start working their way through the libraries in the review queue (“Review manager in a week” instead of “Library in a week”? 😉).
“Google Summer of Code.” This is a successful program. Can we leverage it into a useful tutorial topic?
“C++11/14 Overview: Core Language Features”
- C++ Timeline
- Goals for C++11
- The Simpler Core Language Features
  - auto, decltype, trailing return types
  - nullptr
  - Range for
  - >> in template specializations
  - static_assert
  - noexcept
  - extern template
  - constexpr
“C++11/14 Overview: Features Specific to Class Design”
- Generated functions: default / delete
- Override control: override / final
- Delegating constructors
- Inheriting constructors
- Increased flexibility for in-class initializers
- Explicit conversion operators
“C++11/14 Overview: Larger Language Features”
- Initialization
  - Initializer lists
  - Uniform initialization
  - Prevention of narrowing
- Lambdas
- Rvalue references, move semantics, perfect forwarding:
  - lvalues and rvalues (the modern view)
  - “move” semantics
  - Universal references
  - Perfect forwarding
“C++11/14 Overview: Concurrency”
- Threads
- Passing arguments to threads
- Data lifetime considerations
- Synchronization with mutexes and locking
- Returning values from threads using async() and futures
- Condition variables
- Thread-local storage
- Atomics
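
To make the “perfect function wrapper” idea from the list above a little more concrete, here is a minimal sketch of the timing example using variadic template parameters and perfect forwarding. The names `ScopedTimer`, `timed_call`, and `example_usage` are made up for illustration, and the choice of `std::chrono::steady_clock` is an assumption; a real session would likely go much further.

```cpp
#include <cassert>
#include <chrono>
#include <utility>

// Hypothetical helper: adds the lifetime of this object to `total`.
// Using a destructor means the clock stops even if the wrapped call throws
// or returns void.
class ScopedTimer {
public:
    explicit ScopedTimer(std::chrono::steady_clock::duration& total)
        : total_(total), start_(std::chrono::steady_clock::now()) {}
    ~ScopedTimer() { total_ += std::chrono::steady_clock::now() - start_; }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    std::chrono::steady_clock::duration& total_;
    std::chrono::steady_clock::time_point start_;
};

// Hypothetical wrapper: calls f(args...) and accumulates the elapsed time.
// Arguments are perfectly forwarded, so lvalues stay lvalues and rvalues
// stay rvalues; the return type is whatever the wrapped call returns.
template <class F, class... Args>
auto timed_call(std::chrono::steady_clock::duration& total, F&& f, Args&&... args)
    -> decltype(std::forward<F>(f)(std::forward<Args>(args)...))
{
    ScopedTimer timer(total);
    return std::forward<F>(f)(std::forward<Args>(args)...);
}

// Example use: time a stand-in "API call" (here just an addition).
inline int example_usage() {
    auto total = std::chrono::steady_clock::duration::zero();
    int sum = timed_call(total, [](int a, int b) { return a + b; }, 2, 3);
    assert(total >= std::chrono::steady_clock::duration::zero());
    return sum;
}
```

This sketch only handles things callable with function-call syntax; wrapping pointers to member functions is one of the wrinkles a full tutorial could cover.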

Please comment on Google Plus or reddit.

Ode to a Flat Set

In my previous post I discussed the most important changes in hardware since the STL was created. These changes are the increasing cost of malloc and the increasing importance of caching considerations.

I’ll group the standard containers into contiguous containers: vector, string, and deque, and node-based containers: list, *set, *map, and unordered_*. The implications are that we want to favor contiguous containers over node-based containers both because they have fewer allocations per item and also because they are more cache friendly.
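
One rough way to see the “fewer allocations per item” point is to count allocations with a minimal C++11-style allocator. This is only a sketch: `CountingAllocator`, `allocation_count`, and the two `allocations_to_fill_*` functions are hypothetical names invented here, and the exact counts depend on your standard library implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <new>
#include <set>
#include <vector>

// Hypothetical helper: one shared tally, so that rebound copies of the
// allocator (e.g. the one std::set uses for its nodes) all report here.
inline std::size_t& allocation_count() {
    static std::size_t count = 0;
    return count;
}

// Minimal allocator that counts calls to allocate().
template <class T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() noexcept = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++allocation_count();
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return false; }

// Number of allocations needed to put n ints into a vector (reserved up front).
std::size_t allocations_to_fill_vector(int n) {
    assert(n >= 0);
    allocation_count() = 0;
    std::vector<int, CountingAllocator<int>> v;
    v.reserve(static_cast<std::size_t>(n));
    for (int i = 0; i < n; ++i) v.push_back(i);
    return allocation_count();
}

// Number of allocations needed to put n distinct ints into a set.
std::size_t allocations_to_fill_set(int n) {
    assert(n >= 0);
    allocation_count() = 0;
    std::set<int, std::less<int>, CountingAllocator<int>> s;
    for (int i = 0; i < n; ++i) s.insert(i);
    return allocation_count();
}
```

Comparing `allocations_to_fill_vector(1000)` with `allocations_to_fill_set(1000)` should show the set making at least one allocation per element while the vector makes only a handful, but, as always, measure on your own platform.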

This isn’t earth shaking news. Alex Stepanov, the STL’s creator, has been quoted as saying, “Use vectors whenever you can. If you cannot use vectors, redesign your solution so that you can use vectors.”

Using vector, which has always been good advice, is now even better advice. It has always been true that for small values of N, vector beats containers with better order operations. Do we care about performance if N is small? If we don’t have much data, why worry about which container might be faster? It is true that a small data set may mean that the cost of any one operation is insignificant, but we may still be concerned about optimizing container performance. If our look-ups are in a small data set, but we are doing them in a loop, performance is still a consideration. With modern hardware the values for N where contiguous containers can beat node-based containers can be larger than ever. How large? Follow Andrei Alexandrescu’s advice: measure.

If your entire vector fits in the cache, it will be hard for a node-based container such as list, set, or map to beat it, and perhaps hard even for the unordered containers (with their constant-time accesses).

That is, if your vector is sorted.

In Item 23 of Effective STL, Scott Meyers discussed replacing associative containers with sorted vectors. There are trade-offs, of course. If you are doing inserts, erases, and lookups in an unpredictable pattern, then sorted vectors look less attractive. They also look less attractive if the container holds a very large number of items. Large numbers of items make cache misses likely even for vector and also favor the constant-time access of hashes.

They look more attractive as the ratio of lookups to modifications gets larger or if you can segregate the container modification phase from the lookup phase. The ideal scenario for sorted vector is that you populate the vector, sort it once, then read it many times.
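
As a small sketch of that ideal scenario (populate, sort once, then read many times), assuming set-like semantics where duplicates are dropped; the function names `build_sorted_unique` and `contains_sorted` are invented for this example:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Build phase: take items in any order, sort once, and drop duplicates so
// the result behaves like a set.
std::vector<int> build_sorted_unique(std::vector<int> items) {
    std::sort(items.begin(), items.end());
    items.erase(std::unique(items.begin(), items.end()), items.end());
    return items;
}

// Read phase: logarithmic membership test on the sorted data.
bool contains_sorted(const std::vector<int>& sorted, int key) {
    // Debug-only precondition check; note that it is linear, so it is
    // compiled out (via NDEBUG) in builds where lookup speed matters.
    assert(std::is_sorted(sorted.begin(), sorted.end()));
    return std::binary_search(sorted.begin(), sorted.end(), key);
}
```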

Lookups on sorted vector use lower_bound, upper_bound, or equal_range which have the same order (log) as associative container lookups. Although the order is the same, sorted vector will outperform tree-based containers because tree-based containers are dereferencing pointers and often getting cache misses while sorted vector is calculating offsets and getting cache hits.
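
For readers who haven’t used these algorithms for lookups, here is a brief sketch. The `Entry` alias and the functions `find_value` and `count_equal` are hypothetical names chosen for illustration; the first treats a sorted vector of pairs as a map, the second counts duplicates the way a multiset would.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using Entry = std::pair<int, std::string>;  // key, value

// Map-like lookup: `sorted` must be ordered by key. Returns a pointer to the
// value, or nullptr if the key is absent.
const std::string* find_value(const std::vector<Entry>& sorted, int key) {
    auto it = std::lower_bound(sorted.begin(), sorted.end(), key,
                               [](const Entry& e, int k) { return e.first < k; });
    if (it != sorted.end() && it->first == key) return &it->second;
    return nullptr;
}

// Multiset-like count: equal_range returns the [first, last) run of matches.
std::size_t count_equal(const std::vector<int>& sorted, int key) {
    assert(std::is_sorted(sorted.begin(), sorted.end()));  // debug-only, linear
    auto range = std::equal_range(sorted.begin(), sorted.end(), key);
    return static_cast<std::size_t>(range.second - range.first);
}
```

Note that lower_bound only tells you where the key would go; the extra equality check is what turns it into a “find.”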

The story for populating the container is more nuanced. An arbitrary insert into a sorted vector will be order N because vector will need to move all the elements following the insertion point. The log order of associative containers is better. On the other hand, the associative container is going to allocate a node. Vector can move a lot of elements in the time it takes to allocate a node.

Of course the interface for vector isn’t the same as that of an associative container. In order for lookups to be log order, the vector must be sorted and it must stay sorted (or be re-sorted) in the face of modifications. Insertions will need to be preceded by calls to lower_bound to find the insertion point. A single lapse may create a hard-to-find bug.
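
A minimal sketch of that insertion discipline, assuming set-like semantics (no duplicates); `sorted_insert_unique` is a made-up name, and its return value simply mimics what std::set::insert gives you:

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Insert `value` into an already-sorted vector, keeping it sorted.
// Returns the element's position and whether an insertion took place.
std::pair<std::vector<int>::iterator, bool>
sorted_insert_unique(std::vector<int>& sorted, int value) {
    assert(std::is_sorted(sorted.begin(), sorted.end()));  // debug-only, linear
    // Logarithmic search for the insertion point...
    auto pos = std::lower_bound(sorted.begin(), sorted.end(), value);
    if (pos != sorted.end() && *pos == value) return {pos, false};
    // ...followed by a linear shift of everything after it.
    return {sorted.insert(pos, value), true};
}
```

Every modification has to go through a helper like this one; a stray push_back anywhere else silently breaks the invariant that the lookups depend on.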

Enter flat_set.

The Boost Container library has a family of flat_* containers that have associative container interfaces and semantics, but are implemented as sorted vectors.

In addition to faster lookup, the flat_* containers have much faster iteration, less memory overhead, fewer allocations, and improved cache performance.

However, as with all things, there are trade-offs. Unlike items in an associative container which are allocated into their own nodes and are never moved or copied (and thus never throw while being moved or copied), items in a flat_* container must be movable or copyable and may throw when being moved or copied during insertion or erasure. Further, insertion and erasure invalidates iterators and may be slower (particularly for non-movable types).

In short, the smaller your container and the closer your usage is to read-only, the better flat_* containers look as replacements for associative containers.

In a future post I’ll discuss an alternative to consider when the sorted vector’s limitations make it unattractive.

Please comment on Google Plus or reddit.

STL is Dated

I’ve been using the STL since before it became part of the 1998 standard, so I’ve heard a number of opinions on it. Still it was a little surprising to hear Alex Stepanov, the library’s creator, say that it was out-of-date. We both work for A9.com, one of the perks of which is that I have occasional conversations with him. (Submit your resumes to jonkalb (at) a9.com.)

He knew he had my attention.

Alex has devoted his professional life to the intersection of mathematical purity and hardware-based practicality. The STL is the product of that work and is the best expression of applying mathematics to the hardware of its day.

But hardware has changed since the STL was designed, in at least two important ways. The first is that multi-core and even many-core machines are the common target so heap allocation, always expensive, now requires a lock. The second is that the assumption that all memory accesses are equally expensive is no longer valid. Caching has a huge impact on performance. Accessing memory in the near vicinity (a cache hit) is dramatically less expensive than arbitrary memory accesses.

So what are the implications for STL users?

One implication is that allocators are becoming more and more important. Allocators as originally designed were not created to deal with allocation performance, but to deal with near and far pointers (if you don’t know about them, count your 32-bit blessings). For this reason, the standard definition of allocators hasn’t been sufficient to deal with some of the allocation strategies users want to exploit.

This has changed in C++11.

Alisdair Meredith is the Chair of the Standard Library Subcommittee and an employee of Bloomberg, one of the companies that is really concerned about the impact of allocators on performance. Alisdair spoke on “Allocators in C++” at C++Now 2013. I’ll refer you to that talk because it brings you up to date on what is in C++11 and what is being proposed for future standards and because referring you to it allows me to pretend that I actually understand it all.

In a future post, I’ll discuss the implications for containers, both contiguous containers and node-based containers.

Please comment on Google Plus or reddit.