libsigc++ 2.99.5: Even less code

I managed to rip out another chunk of slightly-incomprehensible code from libsigc++, and those improvements are in libsigc++-3.0’s 2.99.5 version.

Now there’s no common functor_base base class and I removed all those interdependent result_type typedefs that were scattered throughout the code. When I had unravelled the whole result_type chain, I could replace it with just a handful of decltype(auto) uses, realising that that’s what the whole mess was trying to achieve. I love decltype(auto).

BigOQuiz: C++ Standard Library Algorithms

I’ve reworked bigoquiz.com, which I created a few weeks ago. to allow multiple quizzes instead of just the default quiz about the Big-O complexity of algorithms and data structures. For some quizzes it also now offers reversed questions, asking you for the question that matches the answer.

For instance, there is now a C++ Standard Algorithms quiz. You could, for instance, check that you know the descriptions of the Sort Operations functions. Or you could check that you know the names of those Sort Operations functions based on their descriptions (or as multiple-cdon’t always hoice). Maybe this is one way to get more familiar with the many standard C++ algorithms that we often don’t realize are available. I hope this helps me as I strive to use “no raw loops“.

As before, you can track your progress in each section of the quiz, and identify problem questions that you keep getting wrong.

screenshot_bigoquiz_cpp_std_algorithms_sections

I’ve also added a Design Patterns quiz, based on the patterns mentioned in the Head-First Design Patterns book.

The user profile page also now shows overall statistics for all the quizzes that you’ve tried.

screenshot_bigoquiz_userprofile

 

Trying CMake Again

I’ve recently been playing with CMake again, with much more success than last time. I still don’t love it, but:

I first tried it in the cmake branch of my little PrefixSuffix application because its build is rather simple.

I then finally allowed libsigc++ to have an additional CMake build system, at least in libsigc++-3.0, and Marcin Kolny got it done. This was possible now that libsigc++ doesn’t generate code from .m4 files. We wouldn’t have wanted to maintain two complicated build systems, but two simple build systems seems acceptable. I noticed many questions on StackExchange about actually using libsigc++ in CMake builds systems, so I added a hint about that to the documentation.

I also added a CMake build system to my Glom source code, to test something bigger, with many awkward dependencies such as Python, Boost, and PostgreSQL. It’s working well now, though I haven’t yet added all the tests to the build. This let me really try out CLion.

Minor Conclusions

I’m fairly pleased with the result. CMake doesn’t now feel massively more annoying than autotools, though it makes no attempt to do the dist and distcheck that I really need. I have now achieved a level of acceptance that means I wouldn’t mind much working on a project that uses CMake. I wouldn’t even need to hold my nose. The following complaints don’t seem to bother me quite so much any more:

I still find invoking CMake horribly obscure compared to ./configure. For instance

cmake -DCMAKE_INSTALL_PREFIX:PATH=/opt/something

seems ugly compared to

./configure --prefix=/opt/gnome

and I don’t see an equivalent for “./configure –help” to list all generic and project-specific build options. I also haven’t tried to replace mm-commons‘s wonderful  –enable-warnings=fatal to easily turn on compiler warnings as errors, encouraging my code to be free of warnings and deprecated API while not troubling people who just want to build the project with the least difficulties.

I still cannot understand how it’s normal in the CMake world to create these awful little Find*() scripts for each library that you might depend on, instead of just using pkg-config, though CMake’s pkg-config support seems to work well now. And hoping that your own library’s Find*() script ends up in CMake itself does not seem scalable.

It’s also a little odd that I cannot specify the link directories (-L flags) per target. link_directories() seems to apply to all targets.

CMake’s syntax still seems like a toy language. Its quoting (none/implicit/unknown) and separating (spaces, not commas) and use of whitespace (occasionally significant) sometimes makes m4 feel almost friendly. The non-case-sensitivity of function names (but case-sensitivity of variable names) just leads to arbitrarily inconsistent code and unnecessary style differences between projects.

I don’t like the CMake tradition of having separate CMakeList.txt files in sub-directories, because this can lead to duplication and makes it harder to see the whole build system at once. I prefer non-recursive autotools, so I tried to use just one CMakeList.txt for my CMake projects too, which was only a little awkward.

libsigc++: std::function-style syntax.

Yesterday I released version 2.99.2 of libsigc++-3.0. This changes the declaration syntax for sigc::slot and sigc::signal to use the same style as std::function, which was added in C++11. We don’t want to be arbitrarily and unnecessarily different.

We’ve also simplified sigc::mem_fun() to always take a reference, instead of either a reference or a pointer.

So, where you might do this for libsigc++-2.0:

sigc::slot<void, int, A> slot = sigc::mem_fun(this, &Thing::on_something);
signal<void, int, A> m_signal;

You would now do this for libsigc++-3.0.

sigc::slot<void(int, A)> slot = sigc::mem_fun(*this, &Thing::on_something);
signal<void(int, A)> m_signal;

Actually, as of version 2.9.1 of libsigc++-2.0, you can use both syntaxes with libsigc++-2.0, allowing you some time to adapt to the new API before one day moving to libsigc++-3.0.

libsigc++ 3.0: Very variadic

I have just released libsigc++ 2.99.1, the first release of the libsigc++-3.0 API, which installs in parallel with the existing libsigc++-2.0 API. This is libsigc++ using much more modern C++, specifically C++14. The API itself is almost completely unchanged, but the implementation is now clearer and easier to contribute to. I’m rather proud of my work, but I’m quite sure that there’s still room for improvement.

Unfortunately, glibmm and gtkmm can’t use libsigc++-3.0 until they have their own parallel-installable versions. That’s not likely to happen until the mythical GTK+ 4 happens. Even though libsigc++ is mostly all templates in headers, its symbols do appear in the linker symbols for gtkmm method signatures, so changes to libsigc++ can break gtkmm ABI. I do have a sigc3 branch of glibmm in git just to show that it works.

With around 150 commits, I gradually refactored libsigc++ to use variadic templates with C++14 instead of generating code from nasty m4 files. Over the years, we have built up quite a large set of regression tests, run during “make check” and “make distcheck”, including various corner cases. Without these tests, the refactoring would have been much harder. With the tests, I could make small commits, knowing that each commit had not broken anything. When something did seem wrong, I could add a test and go back through the git history to find the problem, sometimes splitting commits into even smaller changes. I did a lot of that, rebasing several times, sometimes stopping and starting again after help from Jonathan Wakeley and Marcin Kolny. Those tests give me much more confidence in the end result than I could have if I had chosen to just reimplement the entire API from scratch.

The API is almost all .h files and according to wc, there are 24,145 lines of code in those files in libsigc++ 2.7.1 (after make), and  6,507 in libsigc++ 2.99.1. So there is now only 27% as much code.

This is possible because:

  • C++ variadic templates allow us to have one class or function where we previously had to generate multiple versions for 1 to 6 function parameters, sometimes with additional versions for const and non-const parameters or const and non-const (and volatile and non-volatile) member function pointers.
  • decltype(auto) lets us avoid lots of templated type traits just to correctly specify the correct type for methods.
  • The standard C++ type traits, such as std::conditional<>, std::result_of<>, std::is_base_of<>, std::remove_volatile<> and std::is_const<> let us write very generic code, sometimes replacing our own type traits. I added some more to libsigc++ to get compile-time type traits for member method pointers.
  • template aliases (like typedefs for templates) avoided the need for multiple functor classes deriving from a common base, even though I ended up not needing most of these aliases either.
  • I replaced our sigc::ref() and sigc::reference_wrapper() with std::ref() and std::reference_wrapper<>. Presumably these share a common history.
  • I removed some configure checks and ifdef-ed workarounds for older MSVC++ and Sun Forte compilers. Hopefully they aren’t necessary now, but we will see.

For some adaptors  I used the tuple utilities that I’ve been working on recently, for instance in sigc::bind(). These are copied into the libsigc++ source code, and I’d particularly welcome improvements to them in the form of patches or github pull requests.

I’m still not completely happy with all the overloads we have for sigc::mem_fun(), to take member functions that are non-const, const, non-volatile and/or volatile, but I have some things still to try. We might also remove several by not allowing both mem_fun(pointer, func) and mem_fun(reference, func).

Please do suggest ways to simplify the code yet further.

Playing with C++ Variadic Templates and Dynamic Programming

Dynamic Programming C++ base classes

Over the last few months I’ve been exploring various dynamic programming (DP) algorithms, trying to get a feel for how to choose suitable sub-problems and the dimensions of those sub-problems, and when a top-down or bottom-up strategy is best. In principle, dynamic programming lets you break a problem down into suitable sub-problems, as long as they obey the “principle of optimality“, meaning you can use some broad state (result) of the sub-problem calculation, without needing the actual specifics of each solution.

I’ve also been playing lots with variadic templates in modern C++.

My little murrayc-dp-algorithms project is the result. It has base classes for both bottom-up DP and top-down DP and several examples that use them. The end result is only slightly useful, and is currently far from optimal for performance. But I still think it’s an interesting exercise that shows how these various DP algorithms really relate to each other. It should also explain how I came up with those examples in my earlier blog post.

Top-Down DP

Let’s start with top-down dynamic-programming, because it’s generally simpler. Calculating Fibonacci numbers is the classic example. This is clearly inefficient because every recursive call recalculates almost the same lesser Fibonacci numbers, twice. This is exponential, with time complexity of roughly O(2^n).

int f(int i) {
  if (n == 0)
    return 0;
  else if (n == 1)
    return 1;

  return f(i - 1) + f(i -2);
}

One simple way to avoid the wasteful recalculation is to cache the results and reuse them. That’s generally called Memoization. For Fibonacci number calculation, this is still not the best way, but it’s a useful example.

DpTopDownBase provides a framework for dynamic programming with memoization. You provide a calc_subproblem() implementation, which should use get_subproblem(). That get_subproblem() will return a previously-calculated result if it exists, or recursively call calc_subproblem().

1-Dimensional Top-Down DP

For instance, the top-down DpFibonacci class is declared like so, deriving from DpTopDownBase:

class DpFibonacci
  : public murraycdp::DpTopDownBase< unsigned long, // subproblem type unsigned int // i > {
public:
  explicit DpFibonacci(unsigned int n)
  : n_(n) {}

This specifies that its calc_subproblem() (and the base class’ get_subproblem())will take an unsigned int and return an unsigned long:

unsigned long
calc_subproblem(
  type_base::type_level level,
  unsigned int i) const override {
  // Base cases:
  if (i == 0) {
    return 0;
  } else if (i == 1) {
    return 1;
  }

  return get_subproblem(level, i - 1)
    + get_subproblem(level, i - 2);
}

The level parameter is just for optional debug output, to show how deep we are in the recursive call stack.

It also implements get_goal_cell() so that the base class knows which value of n to use to get the result f(n). This just returns the n provided to DpFibonacci’s constructor.

We can then instantiate the class and call calc():

DpFibonacci dp(n);
auto result = dp.calc();

The base DpTopDownBase::calc() method calls get_goal_cell() and calls calc_subproblem() for that cell. Because our calc_subproblem() implementation uses get_subproblem(), it will automatically avoid recalculating already-calculated sub-problems.

That DpTopDownBase::calc() method is a fairly simple example of the usefulness of C++ variadic template parameters to call one variadic method with the result of another variadic method, using tuples and std::experimental::apply(). And it’s type-safe at compile time all the way.

2-Dimensional Top-Down DP

But DpTopDownBase isn’t just good for a 1-dimensional cache, based on one parameter. The wonder of C++ variadic templates lets us specify multiple parameters for calc_subproblem and get_goal_cell(). Those overridden methods take just the right parameter types and just the right number of parameters.

And this is where top-down DP can be useful, because it might not be necessary to calculate all possible sub-problems for all possible values of these parameters.

For instance, in the Knapsack Problem, we have a knapsack with a certain maximum weight capacity, and we have several items of various weights and values. We want to find the set of items to put in the knapsack that maximises the value. We give each item an index number.

In the classic Knapsack DP algorithm, we then have 2 dimensions (or parameters) for each sub-problem: A count of items and a maximum weight. So if we have 5 items and total capacity of 100Kg, we can calculate the max value, f(5, 100Kg) by using other sub-problems such as f(4, 80Kg) along the way. It’s not likely that we’ll need to calculate all 5 * 100 possible subproblems, though it is likely that we’ll reuse some sub-problem calculations.

For instance, the top-down DpKnapsack class derives from the base DpTopDownBase like so:

class DpKnapsack
: public murraycdp::DpTopDownBase<
    SubSolution,
    SubSolution::type_vec_items::size_type,
    Item::type_weight
  > {

Again it overrides calc_subproblem() and overrides get_goal_cell(). This time it takes 2 parameters instead of just i:

SubSolution
calc_subproblem(
  type_level level,
  type_size items_count,
  type_weight weight_capacity) const override {

And it can call get_subproblem() with those 2 parameters, letting the base class worry about how to cache the sub-problems and when to recalculate them. (This currently uses a rather inefficient generic hashing function for std::tuple<>s. )

Notice that we return a custom SubSolution class for the sub-problem result instead of just a number. This lets us store both the maximum value for that sub-problem and the items used in that sub-problem solution. That helps us to reconstruct the complete set of items for the sub-solutions along the way. However, that is inefficient because we won’t need all sub-solution item sets . It should instead be possible to write some generic code to  reconstruct the solution by working backwards from the result of the goal. I’ll try that out sometime.

Again we can just instantiate the class and call calc():

//Items with their values and weights:
DpKnapsack::type_vec_items items{
  {13, 15}, {12, 14}, {2, 13}, {10, 12},
  {1, 11}, {7, 10}, {6, 9}, {5, 8},
  {11, 7}, {14, 6}, {9, 5}, {4, 4}, {3, 3},
  {22, 2}, {8, 1}};
DpKnapsack::type_value weight_capacity = 45;

DpKnapsack dp(items, weight_capacity);
const auto result = dp.calc();
std::cout << "solution: value: " << result.value << std::endl
  << "with items: ";
print_vec(result.solution);
std::cout << std::endl;

The top-down DpContextFreeGrammarParser , top-down DpMakeChange , top-down DpEditDistance , top-down DpParenthesization do much the same thing. The top-down DpTsp uses 3 dimensions and overrides calc() to choose the minimum result over all the sub-problems at the top level instead of just letting the default base calc() use the result of one goal sub-problem as the overall result.

I think it’s nice to separate the real heart of each DP algorithm from the code that does the boring caching and checking.

Bottom-Up DP

In bottom-up DP, we calculate every single possible sub-problem result, starting from our base case. Even if we won’t need every single possible result, this is often more efficient because we sometimes know when we can discard whole sets of sub-problems. For instance, looking again at the classic Fibonacci Numbers example, we only really need the last 2 sub-problem results:

int f(int n) {
  if (n == 0)
    return 0;
  else if (n == 1)
    return 1;

  int result = 0;
  int prev1 = 1; // f(1) = 0;
  int prev2 = 0; // f(0) = 0;
  for (int i = 1; i < n; ++i) {
    result = (prev1 + prev2);
    prev2 = prev1;
    prev1 = result;
  }

  return result;
}

So we don’t need to cache all i results along the way. For a multi-dimensional DP algorithm, that can save us a lot of space.

1-Dimensional Bottom-Up DP

The bottom-up DpFibonacci class derives from DpBottomUpBase, specifying:

  • the number of sub-problems to keep along the way: 2 to keep f(n-1) and f(n-2).
  • the type of each sub-problem’s (and the final) calculation result: long for the result of f(5), for instance.
  • the type of the input: int for 5, for instance, in f(5).
class DpFibonacci
: public murraycdp::DpBottomUpBase<
    2, // count of sub-problems to keep
    unsigned long, // sub problem type
    unsigned int // i
  > {

DpFibonacci again overrides calc_subproblem() and get_goal_cell(), with exactly the same implementation as the top-down version.

The base class’ calc() method then just iterates from 0 to n, calling the virtual calc_subproblem() method and then returning the result for the goal cell, based on calling the virtual get_goal_cell() method.

DpBottomUpBase keeps just the right number of previously-calculated sub-problems, while reusing the discarded sub-problem storage, by using my (funky) circular_vector.

DpTripleStep is very similar, though it needs to keep the 4 previous sub-problems instead of 2, and of course it does a different calculation per sub-problem.

2-Dimensional Bottom-Up DP

The bottom-up DpKnapsack class is again much like the top-down DpKnapsack implementation, but it again specifies that it only needs the previous 2 sets of sub-problems.

class DpKnapsack
  : public murraycdp::DpBottomUpBase<
    2, // count of sub-problems to keep.
    SubSolution,
    SubSolution::type_vec_items::size_type,
    Item::type_weight
  > {

Where the bottom-up DpFibonacci passed one count to the base class’ constructor, the bottom-up DpKnapsack passes two, one for each calc_subproblem() parameter:

DpKnapsack(
  const type_vec_items& items,
  type_weight weight_capacity)
  : DpBottomUpBase(
      items.size() + 1,
      weight_capacity + 1),
    items_(items),
    weight_capacity_(weight_capacity) {}

The base DpBottomUpBase class uses these constructor parameters as ends of ranges, assuming that each range starts at 0. It will then iterate over the first range, iterating over the second range each time, like a nested for loop:

for (int items_count = 0; items_count < MAX_ITEMS; ++items_count) {
  for (int weight_capacity = 0; weight_capacity < MAX_WEIGHT_CAPACITY; ++ weight_capacity) {
    ...
}

There is no way in C++ to nest an arbitrary number of for loops. But we can use variadic templates to declare an arbitrarily-nested vector. And thanks to my funky variadic templates for vectors of vector (of vectors of vectors, etc), the base class can generically call calc_subproblem() as if it had nested for loops, regardless of how much nesting there should be. My for_vector_of_vectors() takes the (nested) vector, and a series of start and end parameters for each level and eventually uses std::experimental::apply() to call the functor (a lambda, in this case) for each combination. To get that series of start and end parameters, it uses my funky tuple_interlace() to alternate starts (0) with ends.

The bottom-up DpEditDistance and bottom-up DpSubstringMatching do much the same thing.

Problems with DpBottomUpBase

The bottom-up base class has some serious limitations:

It only lets us discard older sub-problems in the first dimension, such as the Knapsack sub-problems for items less than n-2. But you might understand enough about your algorithm to know other sets of sub-problems that can be discarded at various times, based on less simple criteria. This would make the best use of space while avoiding recalculation. Maybe some recalculation would be OK as long as you could minimize it.

Also, it is limited to a simple iteration, using a range from start to end, using a simple operator++ to move to the next iteration. But sometimes you want to do something a little more complicated than an iterating for loop. For instance, my top-down DpTsp class finds a solution for the Travelling Salesman Problem by manipulating sub-sets in its calc_subproblem(). I could use gosper’s hack to create a suitable n-choose-k subset iterator, but it still needs to iterate over a further set of subsets missing each item from the parent subset, and I suppose that would be doable with another custom iterator. But the start and end iterators at each level would then have to be instantiated based on the values of the parent level, and cannot just be supplied before the whole calc() starts. Maybe I could do something with lambda functions to let these iterators be instantiated along the way.

Big-O Quiz: An experiment with Google AppEngine

bigoquiz.com

screenshot_bigoquiz_webOver the last few weeks I’ve been working on bigoquiz.com. I wanted to learn the Big-O algorithmic complexity of the main algorithms and data structures by heart. So, of course, I first built a system to help me do that.

bigocheatsheet.com and bigoref.com already do a good job of listing the information, but bigoquiz.com lets you check that you really know each one, and shows you the ones you keep getting wrong.

Tech I learned along the way

I find it very hard to resist a project when I see that it would let me play with multiple new things. This one was a chance to learn about:

Details

I noticed that bigoref.com had some corrections and additions compared to bigcheatsheet.com. And I made some additions of my own. I’ve listed it all here for anybody who is interested:

For instance,  bigoref.com adds Graph Search algorithms such as Dijkstra’s and Bellman-Ford (missing from bigocheatsheet.com), under “Searching“. I instead listed the graph search algorithms in bigoquiz.com under “Graph Search”  and added more graph algorithms that I’d learned about recently in my Coursera course, though I’m tempted to create a completely different graph algorithms quiz: Kruskal’s Minimum Spanning Tree, Prim’s Minimum Spanning Tree, Floyd Warshall All Pairs Shortest Path, and Johnson’s All Pairs Shortest Path.

bigoref.com also adds linear and binary search of arrays (missing from bigocheatsheet.com). For bigoquiz.com I just split Array into sorted and unsorted in “Data Structure Operations”.

I found just one error on bigoquiz.com that was corrected on bigoref.com, though there might be others:

And I noticed that bigoref.com has some additional errors so I corrected those on bigoquiz.com and  filed github issues for bigoref:

Also:

  • bigocheatsheet.com has entries for Stack, but bigoref.com does not. I kept that in bigoquiz.com.
  • bigoref.com has entries for Ternary Search Tree, apparently a kind of Trie (Prefix Tree), but bigocheatsheet.com does not. I kept this in bigoquiz.com. Curiously, bigoref.com doesn’t have entries for regular Tries (Prefix Trees), or Suffix Trees. I added them to bigoquiz.com, though I’m not completely sure about the various best/worst/average time/space complexities.
  • bigoref.com uses the term “indexing” where bigocheatsheet.com uses “access”. I’ve stuck with “access” for bigoquiz.com.
  • bigoref.com splits array into Basic Array and Dynamic Array, not allowing insertion or deletion in a Basic Array. I don’t find the distinction useful, so I kept it as just Array in bigoquiz.com.

GWTP: Model/View/Presenter for GWT with GIN/GUICE dependency injection

I’ve building a new website project with GWT, so I thought I’d take the opportunity to try GWTP, which provides an MVP framework for GWT. I’ve previously used bare GWT to do much the same thing while developing OnlineGlom, but I hoped it could be easier. Both are really for structuring client-side UI and behaviour.

Once you’ve got your GWT application built, you can almost forget how difficult it was to create the basic structure. So I’ve taken the time to describe both the old (GWT) way and the new (GWTP) way of doing things to remind myself how much better life is with GWTP. I mostly just discuss how the activities/presenters and views interact in the two systems. You would need to learn a bit more to write a full application. If you are familiar with GWT, you might skip to the GWTP section.

You’ll see that, even with GWTP, the code is far from concise, but that’s hard to avoid completely with Java, particularly when trying to split code into Presenters and Views, and splitting implementation and interface. Overall, I really wish that GWTP was just part of GWT with the old GWT Activities deprecated. Then people could focus on this one right way of doing things, and making it even better.

For GWT, I refer to my old OnlineGlom project. For GWTP, I refer to the little murrayc-gwtp-appengine-example project I created recently.

GWT: Activities/Views/Places

GWT provides the Activities and Places framework, and suggests you create Views for your UI. That page has a good overview and explanation, but I’ll go over the parts here too.

Place

GWT’s Activities and Places framework is mixed up with GWT’s history support, which lets the different states of your application have URLs that can appear in browser bookmarks and history. That’s how we have a “history token”, which describes a Place.

For instance, for each main location in your app, you would have SomethingPlace, extending GWT’s Place. Here’s an example from OnlineGlom: DetailsPlace is for a page that shows the details from a database record.

The @Prefix(details) annotation on its Tokenizer tells GWT that this place can handle URLs that begin with/ #/details, such as http:://onlineglom.something.com/#/details?table=things?primaryKey=123. The Place class takes the place’s parameters in its constructor and stores them. Those parameters usually refer to the ?something=blah parameters in the URLs.

getPlace() constructs a Place object from a token (mostly a URL), and getToken() constructs a token from a Place object. There isn’t any convenience API to avoid the tedious splitting and concatenation of strings, though we created a couple of helper methods for this in OnlineGlom.

Activity

Each place in your app will have an Activity, whose constructor will take an instance of your derived Place class. For instance, in OnlineGlom, the DetailsActivity takes a DetailsPlace. The Activity can use the Place’s getter methods to discover what the URL’s parameters have specified.

Your app will derive an ActivityMapper that maps the places to the Activities. For instance, in OnlineGlom, our DataActivityMapper‘s getActivity() method instantiates a DetailsActivity when it gets a DetailsPlace.

You’ll pass your ActivityMapper when creating an ActivityManager. For instance, in OnlineGlom, we create an ActivityManager that takes our DataActivityMapper. You’ll then call setDisplay() on your ActivityManager, passing an AcceptsOneWidget that will be used to show your activitity’s View.

This is already a lot of boilerplate code for some basic functionality.

View

Each place has a View. You’ll have one view interface and at least one implementation of that interface. For instance, OnlineGlom has its DetailsView interface and its DetailsViewImpl class that implements that interface. Your Activity can call methods on the view to make it show appropriate UI for the Activity’s state.

GWT has no base View class or interface, though we created a base View class for OnlineGlom, which has a View.Presenter interface implemented by all the Activities. Each Activity can then call setPresenter(this) on the view, giving the view a way to respond back to the Activity, such as telling the Activity that the user has clicked on something that should take the user to a new location.

Your ClientFactoryImpl, which implements your ClientFactory interface, instantiates the view implementation based on the view interface. For instance, OnlineGlom’s ClientFactoryImpl creates a DetailsViewImpl when asked for a DetailsView.

Your Activity takes your ClientFactory and uses it to create its View. For instance, in OnlineGlom’s DetailsActivity constructor, which was itself called by the DetailsActivityManager that I mentioned earlier. The actual ClientFactoryImpl is instantiated by a call to GWT.create(ClientFactory.class), using the ClientFactory to ClientFactoryImpl mapping in a .gwt.xml file. For instance, in OnlineGlom.gwt.xml:

<!-- Use ClientFactoryImpl by default -->
<replace-with class="org.glom.web.client.ClientFactoryImpl">
  <when-type-is class="org.glom.web.client.ClientFactory" />
</replace-with>

In theory, you might want different implementations of your View interface for different devices. Maybe you’d have DetailsViewMobileImpl, for instance, but that’s not a technique that I’ve personally found useful. However, allowing multiple implementations of the View also allows testing of your Activity by mocking the View,  letting you test logic without having to test UI behaviour at the same time.

Again, you can see that you have to implement lots of repetitive boilerplate code and you might wonder why you need to bother.

GWTP: Presenters/Views/Places

GWTP uses Presenters rather than Activities, but they are much the same thing: code that tells your UI View what to do without telling the UI View how to do it. But GWTP requires less repetitive code, by using dependency injection with GIN (GUICE for GWT) to tie the pieces together.

If you are returning to Java after some time away then Dependency Injection is your Tirimasu. This Google I/O 2009: Big Modular Java with GUICE talk seems to be an excellent introduction to dependency injection in general, and dependency injection with GUICE.

For instance, this lets us add a parameter to a constructor, and as long as the constructor has the @Inject annotation, and as long as we’ve told GIN (or GUICE) what implementation types to use, GIN (or GUICE) can end up calling that constructor with an appropriate instance for the parameter.

Place

WIth GWTP, you don’t need to implement any actual Place classes. Instead, you just use the @NameToken annotation on your Presenter’s proxy (declared via the @ProxyStandard annotation). For instance, in murrayc-gwtp-appengine-example’s ThingPresenter:

@ProxyStandard
@NameToken(NameTokens.THING)
interface MyProxy extends ProxyPlace {
}

That NameTokens.THING is just a string constant that I’ve put together with the others.

Presenter

Your Presenter should derive from GWTP’s Presenter, which is a generic class that takes your View interface and your Proxy interface.

For instance in murrayc-gwtp-appengine-example’s ThingPresenter:

public class ThingPresenter
  extends Presenter<ThingPresenter.MyView, ThingPresenter.MyProxy>

The presenters’s prepareFromRequest() override uses simple getParameter() calls to get the URL’s parameters, without the need for a separate Place class and manual string parsing. For instance, in ThingPresenter’s prepareFromRequest():

@Override
public void prepareFromRequest(final PlaceRequest request) {
  super.prepareFromRequest(request);
  final String thingId =
    request.getParameter(NameTokens.THING_PARAM_THING_ID, null);

View

Your View should extend GWTP’s ViewImpl and implement your Presenter’s View interface. For instance:

public class ThingView
  extends ViewImpl
  implements ThingPresenter.MyView

However, you will very often want to derive instead from GWTP’s ViewWithUiHandlers, specifying a small UiHandlers interface you’ve created, so your View can notify its presenter about user interaction. For instance, in murrayc-gwtp-appengine-example’s ThingView:

public class ThingView
  extends ViewWithUiHandlers<ThingUserEditUiHandlers>
  implements ThingPresenter.MyView {

That ThingUserEditUiHandler should also be implemented by the presenter. For instance, in ThingPresenter:

public class ThingPresenter
  extends Presenter<ThingPresenter.MyView, ThingPresenter.MyProxy>
  implements ThingUserEditUiHandlers {

And your presenters’s View interface should extend HasUiHandlers. For instance, in ThingPresenter:

interface MyView
  extends View,
  HasUiHandlers<ThingUserEditUiHandlers> {

The View can then indirectly call a method on the presenter like so:

getUiHandlers().onThingChosen("blah");

However, you must first call your view’s setUIHandler() from your presenters constructor, like so:

getView().setUiHandlers(this);

Modules

Each Presenter/View should have a Module. For instance, murrayc-gwtp-appengine-example’s ThingModule links the presenter and view together like so:

public class ThingModule extends AbstractPresenterModule {
  @Override
  protected void configure() {
    bindPresenter(ThingPresenter.class,
      ThingPresenter.MyView.class,
      ThingView.class,
      ThingPresenter.MyProxy.class);
  }
}

These sub-modules would then be combined in one module for the whole application. For instance:

public class ApplicationModule extends AbstractPresenterModule
  @Override
  protected void configure() {
    install(new ThingModule());
    install(new UserProfileModule());

This is where you can control the dependency injection. By providing a different ApplicationModule or different sub-modules, you can cause different implementations to be instantiated. For instance, to create mock Views.

Events: Communication Between Presenters

You application might have several nested presenters on a page, so one presenter might need to respond to a change to another presenter. GWTP lets us do this by defining our own GwtEvent. For instance,  murrayc-gwtp-appengine-example’s ThingUserAddedEvent:

public class ThingUserAnswerAddedEvent
extends GwtEvent<ThingUserAnswerAddedEvent.EventHandler> {

One presenter may then fire that event, by calling its fire() method, providing any parameters needed by the event.

A presenter may handle the event by implementing your event’s EventHandler interface. For instance, in murrayc-gwtp-appengine-example’s UserHistoryRecentPresenter:

public class UserHistoryRecentPresenter
  extends PresenterWidget<UserHistoryRecentPresenter.MyView>
  implements ThingUserAnswerAddedEvent.EventHandler {

And then registering the presenter as a handler for the event, with GWTP’s addRegisteredHandler(). For instance in UserHistoryRecentPresenter’s constructor:

addRegisteredHandler(ThingUserAnswerAddedEvent.TYPE, this);

You must also use GWTP’s @ProxyEvent annotation (see this too) on the handler method:

@ProxyEvent
@Override
public void onThingUserAnswerAdded(final ThingUserAnswerAddedEvent event) {

 

Simple GWTP and Objectify example with Maven

I’ve been playing with GWTP (an MVP framework for GWT) and Google’s AppEngine (via Objectify). I prefer to use the Maven build system. So I created a simple small example that does this: murrayc-gwtp-appengine-example. Improvements are very welcome. I’d like to learn how the code should be improved.

In case it’s interesting to anyone, here’s why I wanted this little example to exist:

GWT

The official GWT website’s equivalent StockWatcher example/documentation does not demonstrate best practices. It:

There is official GWT sample code that uses maven and Objectify, but it’s all mixed up with other stuff and I wanted something much simpler.

There is also an official GWT MVP example from 2010 hidden away, but it’s not in git and I didn’t even find it until I started writing this.

Objectify

Objectify’s documentation seems pretty good now that I look at it again, though I struggled at first. I think Google (and StackExchange) kept taking me to example code that used one of the various older versions of the API.

Oddly, Objectify’s documentation suggests that you use objectify-gwt with GWT. but objectify-gwt doesn’t seem to have any documentation and I didn’t seem to need it myself.

GWTP

The official GWTP website’s documentation is rather scattered, with several simple typos that make its example code snippets inconsistent with each other. The documentation seems to have been restructured from some other source. For instance, several internal links don’t take you to what is apparently intended. Some internal links take you to the page itself.  Lots of the best documentation is spread across their blog posts, but not brought together properly. They do at least show use of maven in their beginner’s GWTP example. but that example code itself is not in version control as a real project. GWTP is a nice clean API that deserves better.

Of course, it is incredibly hard to describe source code in documentation and keep that documentation and source code up to date and in sync. It needs a system. It never works if you just put the source code inline where no compiler can touch it. Although it’s nice to show source code inline (ideally taken automatically from a real source file), you also need links to the real source code so people can see the latest version.

C++ Core Guidelines: Ownership and Parameters

I’ve been catching up on the wonderful CppCon talks from 2014 and 2015. It looks like we are getting towards a consensus about how to write modern C++, though I guess we are still early in the process,  and still coming up with as many questions as answers.

When this all settles down, it might need Scott Meyers to come out of retirement from C++ to combine and update his Effective C++ books. Maybe he’d do it if a really modern C++ emerged from it all.

Talks, Guidelines, Smartpointers and Ownership

Some of the most important talks focus on when pointers, or smartpointers, have “ownership”. Three talks in particular progress towards a rather utopian vision in which (shared) object ownership is clearer in various parts of your code and in which there are no object lifetime bugs:

  • CppCon 2014: Herb Sutter “Back to the Basics! Essentials of Modern C++ Style”  (slides)
    Herb discussed some best practices, particularly around auto and smartpointers.
  • CppCon 2015: Bjarne Stroustrup: “Writing Good C++14” (slides)
    Bjarne discussed how we can agree on guidelines that static analysis tools can use, and how following the rules can let static analysis tools find more bugs. For example, by using types to indicate whether a (smart)pointer is owning. As in Herb’s 2014 talk, he suggested that we should get() the raw pointer out of a shared_ptr<> before passing it to a function that doesn’t really want to share ownership. He also introduced the GSL, which has owned<> for when you can’t use shared_ptr<> or unique_ptr<>.
  • CppCon 2015: Herb Sutter: “Better C++14 by Default” (slides)
    The next day, Herb discussed many more ways that static analysis tools could avoid several classes of bugs, and how we can break the rules down into sets called profiles. He suggested some annotations, via the C++ generalized attributes syntax, already in C++11, which could let the tools find even more problems, but showed that we can make good assumptions by default. He showed a Microsoft static analysis tool doing this already.

I like the idea of tools telling me what to do. C++ is more complicated than ever, though also better than ever, and there’s more scope than ever for people to disagree about how to use it. At some point I like to move on and get stuff done. I’m sceptical that we can ever really make our code as safe as Bjarne and Herb suggest, but I’m convinced that we could catch several new categories of problems before runtime, and I want to try.

C++ Core Guidelines, Smartpointers, and Ownership

The C++ Core Guidelines are currently a bit of a muddle. There are many rules that seem to overlap too much, often with insubstantial examples. Too often there are admonitions about what not to do rather than examples of how to do things properly. Trying to comply with one rule will sometimes violate another rule, with no clues about resolving the conflict. The github issues and commits show that conflicts are being clarified, but the whole thing seems so big and spread so thinly that I guess it will be a long time until it feels consistent and cohesive.

The advice often seems to assume that you are writing application code rather than reusable library code, though large real world systems are rarely just one or the other. Therefore there is little consideration of ABI stability. For instance, it suggests “F.7: For general use, take T* or T& arguments rather than smart pointers” and “I.11: Never transfer ownership by a raw pointer (T*)” (and more overlapping advice) but what happens when you need to change the implementation and need to share ownership of the passed instance? You can’t just change a T* to a std::shared_ptr<T> without making existing applications crash. The document suggests GSL’s owned<> in this case, though I guess it will still affect shared library symbol names for methods, and it won’t actually do any reference counting. So in a library where you know that instances get passed around a lot, and there’s a reasonable chance that ownership will be shared, it doesn’t seem unreasonable to default to std::shared_ptr<> even if that obscures the actual ownership in some parts of the code.

Furthermore, when looking at other peoples’ code, how do we distinguish the foo(const T*) method that probably doesn’t keep a copy of the pointer, from all the foo(const T*) methods that might do anything at all because their authors never heard of the guidelines?

Raw pointers as non-owners

As a start, I’m playing with this idea of not overusing smartpointers and letting the lack of a smartpointer indicate non-involvement in the ownership. For instance:

std::shared_ptr<Something> something = get_something();

// This takes a shared_ptr:
do_something_and_keep_it(something); 

// This doesn't care about ownership:
do_something_and_dont_keep_it(something.get());

I’m ready to try this out in an application such as Glom, but fearful of doing it in a library API, such as gtkmm.

Bear in mind that, In the past,

  • I have been one of the maintainers that made glibmm and gtkmm use Glib::RefPtr<>, an intrusive reference-counting smartpointer, for all our wrapper classes of reference-counted glib/GTK+ objects (not widgets). For instance, see the doxygen graph of glibmm/giomm classes deriving from Glib::ObjectBase, and there are many in gtkmm too (not the widgets).
  • I converted most of my Glom codebase to use all of its data structure classes by shared_ptr (a custom shared_ptr<> before C++11) rather than trying to decide when they can be deleted.

These have worked out very well, partly because “only use this via this smartpointer” is a nice simple rule to follow. I’ve resisted letting anyone easily get the underlying pointer out of a RefPtr, to preserve this simplicity. Now that there’s this new (at least to me) consensus, I guess we’ll actually add RefPtr::get() and RefPtr::operator*() soon.

But I still hesitate, because I don’t expect the typical developer to be careful enough until there are tools to stop misuse and I don’t expect the typical developer to use the tools even when they exist. So the price of a better experience for experts may be more ways for non-experts to make mistakes. The gtkmm API has tried to be a pure and modern C++ while also being pragmatic and obvious, but it will be hard to keep the balance in this case.

Trying out raw pointers as non-owners in Glom

I tried this out in an “ownership” branch of Glom. I first looked at some static (standalone) functions that took a std::shared_ptr<> because those were unlikely to store any state between calls. Here is the fairly small patch that changed those parameters.

I then looked at some other functions that took a shared pointer to const (std::shared_ptr<const Something>), suggesting that the implementation just wanted to look at it and then forget it. Changing one function’s parameter forces you to look at the calling functions, where you might find that the shared_ptr<> is itself a parameter that should be changed to a non-owning raw pointer. I followed this logic up the call stack for a while and ended up with a larger patch. I forget which function I tried to change first.

I like that this reduces the amount of code where I need to worry about circular references, which I would break with a std::weak_ptr<>. But I still worry enough to wish there was some way to automatically identify where these might happen.

I’m still a little uncomfortable with this mix of shared_ptr<> and raw pointers, even in purely application code, so I’m not ready to put this in git master just yet. But I think I’ll come around to the idea.

std::unique_ptr parameters to take ownership

The C++ Core Guidelines, suggest “R.32: Take a unique_ptr<widget> parameter to express that a function assumes ownership of a widget” though they currently tell you to avoid std::move() (update: fixed now) which you’d need when calling such a method with a named variable. They don’t mean  gtkmm Widget, but I do.

Watch Herb Sutter’s 2014 talk for the rationale for taking std::unique_ptr<>, but not all types, by value. It’s also mentioned in the C++ Core Guidelines “F.15: Prefer simple and conventional ways of passing information” section.

For instance:

void
SomeClass::take_something(std::unique_ptr<Something> something) {
...
}

...
auto something =
  std::make_unique<Something>("something", 2, 'a');
something->set_extra_foos(foo1, foo2);
bar.take_something(std::move(something));

// Or without the intermediate variable:
bar.take_something(
  std::make_unique<Something>("something", 2, 'a'));

// Or again without the intermediate variable:
bar.take_something(
  create_something("something", 2, 'a'));

Trying out std::unique_ptr<> parameters to take ownership in gtkmm

gtkmm has Gtk::manage(), which roughly means “let the container widget worry about deleting this child widget later”. For instance, you can new a Gtk::Button and call Gtk::manage() on that button before you add() it to a Gtk::Grid. It works well and it’s very simple. Incidentally, all it really does is enable the default behaviour for GtkWidgets – the only clever stuff about Gtk::manage() is how we change the default for gtkmm.

Passing a std::unique_ptr<> to add() could have much the same meaning, and would have the advantage of using standard C++ API and a known convention. I have a patch in bugzilla that does this and there’s been some discussion on gtkmm-list.

However, the result is sometimes hard to love. You have to move the add() to later in your code, to avoid a use after std::move(). And, for some of our API, you really do need to use the pointer right after you’ve add()ed it to a container. So you need to take a temporary “observing” (non-owning) raw pointer before the add().

Currently, the worst part of the gtkmm API for this, which has never felt
quite right, though we can blame GTK+, is this:

auto cell = std::make_unique<Gtk::CellRendererText>();
cell->property_style() = Pango::STYLE_ITALIC

auto cell_nonowning = cell.get(); // yuck
auto column = std::make_unique<Gtk::TreeViewColumn>("something", std::move(cell));
column->add_attribute(cell_nonowning->property_text(), columns.title);
column->add_attribute(cell_nonowning->property_style_set(), columns.italic);

treeview.append_column(std::move(column));

I guess we’ll add the add(std::unique_ptr<>) overloads to allow this, but not deprecate Gtk::manage() until we feel more comfortable with how they make us rearrange our code.