Tag Archives: bigoquiz

Using GitHub Actions to check builds

GitHub Actions

A couple of weeks ago I spent some time adding GitHub Actions to some of my projects. I was lucky enough to get access to the beta, but it has now become available for everyone.

GitHub Actions lets you specify builds, tests, etc, to run on certain triggers, such as after every commit push. I’ve now used it in several projects, each using different build environments. It has worked very well, and has really improved the developer experience, and therefore improved the code, in a few of my projects. It has encouraged me to use pull requests even for changes to niche personal projects so I can benefit fully.

Actually, I’m not sure that Actions is the best name for the system. It really lets you describe Jobs, and each Job might use a few Actions. But maybe that’s just a matter of perspective.

Now that it’s public, the pricing suggests that most open source projects will have enough time for free, for which I’m grateful. Nevertheless, history shows that it wouldn’t be wise to tie your projects too closely to one proprietary system. For instance, that’s another reason not to duplicate your build rules in a GitHub Actions workflow file. You should have a simple build system that you can just call. I generally do that via a simple Makefile, as I mention below.

Here are the GitHub projects that I’ve added GitHub Actions to:

libsigc++

We now have GitHub Actions to build and run tests for libsigc++ with several versions of 3 C++ compilers: 3 versions of gcc, 5 versions of clang, and even 1 version of msvc (thanks to Stuart Dootson). We will also add an action to check our C++ code formatting via clang-format.  If you’d like every libsigc++ commit to work with some other compiler, please do try to make that happen via a pull request.

This will give us valuable confidence that our changes will work everywhere, and give us valuable clues when they don’t. It’s great to know that people can get this feedback automatically when they submit a PR, without waiting for a human.

I might be abusing GitHub’s generosity with so many builds jobs, but luckily there are not that many commits to this project. I wonder if GitHub hope to fund some of the GitHub Actions costs for open source projects via GitHub Sponsors.

angular-bigoquiz -client (The bigoquiz.com frontend)

I added GitHub Actions to build and run tests for this Angular project. I put the npm/ng commands behind a Makefile to keep the CI rules as simple as possible. The only thing worse than having undocumented or overly-complicated build steps, is repeating those incantations in multiple places.

I like how it uses just one Workflow definition, but runs it with several Node versions, thanks to the strategy.matix specifier, and the with.node-version specifier made available by the actions/setup-node action.

See the Workflow definitions.

go-bigoquiz -server (The bigoquiz.com backend)

I added GitHub Actions to build and run tests for this Go microservice. It only builds for go 1.12, but I think I could parameterise that like the node version for angular-bigoquiz-client.

Again, I put the go commands behind a Makefile to keep the CI rules as simple as possible.

I did some major refactoring of this code, adding tests along the way, and having CI made that fairly painless.

See the Workflow definitions.

Minor Imperfections

I have lots of experience using Atlassian’s competing Bitbucket Pipelines system. GitHub Actions feels very similar, though I still miss a few things:

Can’t trigger builds for individual commits

I often create PRs with many small commits while refactoring old code. 10 commits is not so unusual, and 30 has happened. Good test coverage makes this possible, and inevitably that test coverage shows a problem in one of the commits. When that happens in a BitBucket PR, with BitBucket Pipelines, I like doing a manual binary search, starting a new CI run for individual commits until I see where the problem started. (Actually, I wish that it would automatically do that binary search, like a git bisect.)

Unfortunately, I can’t see any way to start a job for an arbitrary commit, so I can only ever see the result for the latest commit in a PR.

No caching

Doing full builds of projects on every commit is simple but wasteful. Ideally, we could do reliable incremental builds, though that’s hard to get right. But some caching could save time and heat.

For instance, Bitbucket Pipelines lets me cache previous Docker build steps and even gives me a ccache cache to reuse C++ compilation units. Well, the ccache cache is not a real cache – it’s just a dump of files that’s re-generated every couple of weeks, and is stale in the meantime, but it’s better than nothing.

Providing CI build time to companies is obviously meant to be a profitable business. Unfortunately, it makes the incentives a bit confused. For instance, I don’t believe that Bitbucket are eager to help their customers spend less money on build time, so they don’t offer very capable caching. But competing services, such as GitHub Actions, if they get this right, could make them care about losing customers.

GWT: Changing from GWT-RPC to JSON with RestyGWT

GWT but with REST and JSON

My bigoquiz.com website uses my gwt-bigoquiz project, based on Java and GWT. One advantage of GWT is the serialization between the server and client. Using GWT-RPC, you can use the same Java classes on the server and the (compiled to JavaScript) browser client. (Well, at least when it works – when it doesn’t work it can be very hard to find out why). For instance, your Java service can have a getThing() method which your client code can call asynchronously, getting a Thing. It avoids duplication and avoids manually writing code to parse an intermediate format and recreate objects.

However, GWT-RPC ties you into using GWT on both the server and client. I would rather have a standard REST API that serves, and receives, JSON. I could then experiment with alternative implementations independently on the server side or the client side.

RestyGWT, together with Jersey, makes this quite easy. I have now ported gwt-bigoquiz from GWT-RPC to RestyGWT and Jersey. Here I describe how you might do that for your project.

You might also like to watch David Chandler’s talk about RestyGWT and Jersey at GWT.Create 2015 (slides, GitHub), which covers the same stuff in a little more detail, using slightly older versions.

Build

We need to add the following dependencies to our pom.xml file, so we can use RestyGWT, Jersey, Jackson and the JAX-RS annotations.

I’ve added comments to explain why we need each dependency, based on my best guesses. Please let me know about any errors you find. pom.xml files too often just contain cargo-culted blocks of XML pasted from blog entries or StackOverflow answers, far removed from whoever first wrote them. Still, for later versions of these dependencies, or different environments, you might need a different combination of versions or different workarounds.

<!-- For JAX-RS annotations, such as @GET, @POST, @Path, @PathParam, @QueryParam, etc -->
<dependency>
	<groupId>javax.ws.rs</groupId>
	<artifactId>javax.ws.rs-api</artifactId>
	<version>2.1-m09</version>
	<!-- Note: 2.0 gives us a ClassNotFoundException about RxInvokerProvider when we try to GET from the URL. -->
</dependency>

<!-- For client-side code to query a REST/JSON server API. This uses methods and classes annotated with JAX-RS annotations, doing serialization/deserialization with Jackson. -->
<dependency>
	<groupId>org.fusesource.restygwt</groupId>
	<artifactId>restygwt</artifactId>
	<version>2.2.0</version>
</dependency>

<!-- For JSON serialization/deserialization based on the JAX-RS annotations. Used by RestyGWT on the client side. Also adds some annotations such as @JsonInclude and @JsonIgnore, -->
<dependency>
	<groupId>com.fasterxml.jackson.jaxrs</groupId>
	<artifactId>jackson-jaxrs-json-provider</artifactId>
	<version>2.8.9</version>
</dependency>

<!-- To serve REST/JSON queries of REST resources.
	 A <servlet> tag in the web.xml file indicates that Jersey should use certain classes as REST resources/servelets.
	 These REST resources/servlets use Java methods and classes annotated with JAX-RS and Jackson annotations.

	 jersey-container-servlet needs the Java Servlet API version 3, supported by the AppEngine Java 8 runtime (currently beta).
	 Alternatively, jersey-container-servlet-core needs the Java Servlet API version 2, supported by the AppEngine Java 7 runtime. -->
<dependency>
	<groupId>org.glassfish.jersey.containers</groupId>
	<artifactId>jersey-container-servlet</artifactId>
	<version>2.26-b07</version>
</dependency>

<!-- To let Jersey use Jackson. -->
<dependency>
	<groupId>org.glassfish.jersey.media</groupId>
	<artifactId>jersey-media-json-jackson</artifactId>
	<version>2.26-b07</version>
</dependency>

<!-- Workaround this error: java.lang.IllegalStateException: InjectionManagerFactory not found. See https://stackoverflow.com/a/44546979/1123654 -->
<dependency>
	<groupId>org.glassfish.jersey.inject</groupId>
	<artifactId>jersey-hk2</artifactId>
	<version>2.26-b07</version>
</dependency>

Configuration

We must edit some XML files so we can use RestyGWT and Jersey.

.gwt.xml Module file

Our .gwt.xml Module definition must mention RestyGWT inside the tag, so our client code can use RestyGWT.

<inherits name="org.fusesource.restygwt.RestyGWT"/>

web.xml file

We must add <servlet> and <servlet-mapping> tags in the web.xml file to state that Jersey should use our classes to serve REST resources as servlets.

<servlet>
    <servlet-name>restServlet</servlet-name>
    <servlet-class>org.glassfish.jersey.servlet.ServletContainer</servlet-class>
    <init-param>
        <param-name>jersey.config.server.provider.packages</param-name>
        <param-value>com.murrayc.myproject.server.api</param-value>
    </init-param>
</servlet>
<servlet-mapping>
    <servlet-name>restServlet</servlet-name>
    <url-pattern>/api/*</url-pattern>
</servlet-mapping>

Server-Side Code Changes

The REST Resource

The classes for our REST resources must be in the package specified in the web.xml file., where Jersey can find them. These correspond roughly to the *ServiceImpl classes, derived from GWT’s RemoteServiceServlet, that we used with GWT-RPC.

These classes don’t need to derive from any special base class or implement any special interface. They do need to be annotated with @Path to specify the first part of the name of the resource. For instance:

@Path("thing")
public class ThingResource {
    ...
}

We then annotate methods of the class with @GET or @POST and @Produces.

Without a @Path annotation on the method, the method matches the base path of the class. For instance, this might return all available Things, when a client GETs from the thing/ URL.

@GET
@Produces("application/json")
public Collection<Thing> get() {
    ...
}

We can use additional @Path annotations on the method. For instance, this might return a Thing with a matching Thing ID, when a client GETs from the thing/123 URL, for instance.

@GET
@Path("/{id}")
@Produces("application/json")
public Thing getById(@PathParam("id") String id) {
    ...
}

Note that the Java method name does not appear in the URL.

The ServletContext

In our GWT-RPC service classes, we could get the ServletContext via getServletConfig().getServletContext(). With our REST classes, we instead use the JAX-RS @Context attribute on a member field, like so:

@Context
ServletContext context;

Jersey apparently then assigns the context at runtime.

Client-Side Code Changes

With GWT-RPC, we had a client interface like this:

public interface ThingServiceAsync {
    void getThings(AsyncCallback<List<Thing>> async);
    ...
}

which we called like so:

AsyncCallback<List<Thing>> callback = new AsyncCallback<List<Thing>>() {
    @Override
    public void onFailure(Throwable caught) {
        ...
    }

    @Override
    public void onSuccess(List<Thing> result) {
        getView().setThings(result);
    }
};

ThingServiceAsync.Util.getInstance().getThings(callback);

With RestyGWT, we’ll instead have a client interface like this:

@Path("/api/thing")
public interface ThingClient extends RestService {
    @GET
    public void getThings(MethodCallback<List<Thing>> callback);
    
    @GET
    @Path("/{id}")
    public void getThing(@PathParam("id") String id, MethodCallback<Thing> callback);
}

And we’ll use it like so:

MethodCallback<List<Thing>> callback = new MethodCallback<List<Thing>>() {
    @Override
    public void onFailure(Method method, Throwable caught) {
        ...
    }

    @Override
    public void onSuccess(Method method, List<Thing> result) {
        getView().setThingList(result);
    }
};

Defaults.setServiceRoot(GWT.getHostPageBaseURL());
ThingClient client = GWT.create(ThingClient.class);
client.getThings(callback);

So to change the GWT-RPC client code to RestyGWT client code, you only need to:

  • Create the client interface.
  • Use it instead of the GWT-RPC Async client class.
  • Change AsyncCallback to MethodCallback.
  • Add the Method parameter to onFailure() and onSuccess().

Parameters

A method of a GWT-RPC service just takes Java parameters. But for a REST service, you need to decide whether these are path parameters, or query parameters. For instance, if the URL is /things/123?color=blue then there is one path parameter (123), and one query parameter (color, with value blue). You would indicate this in your service class like so, using the JAX-RS attributes:

@GET
@Path("/{thing-id}")
@Produces("application/json")
public Thing getById(@PathParam("thing-id") String thingId, @QueryParam("color-id") String colorId) {
    ...
}

and in the client interface like so:

@Path("/api/thing")
public interface ThingClient extends RestService {
    @GET
    @Path("/{thing-id}")
    public void getThing(@PathParam("thing-id") String thingId, @QueryParam("color-id") String colorId, MethodCallback<Thing> callback);
}

Ignoring some Getters

Not all get methods should result in items in the JSON. You can avoid this with the Jackson @JsonIgnore annotation. For instance:

public class Thing {
    ...

    @JsonIgnore
    public getIrrelevantStuff() {
       ...
    }
}

Ignoring null and empty objects

To keep your JSON small, you can use Jackson’s @JsonInclude attribute to stop Jersey from writing name/value pairs with null or empty values. Of course this can obscure the JSON format by hiding its possible contents. For instance:

@JsonInclude(JsonInclude.Include.NON_EMPTY)
public class Thing {
    ...
}

You might choose to use NON_NULL rather than NON_EMPTY.

Adding Setters

If our REST GET method returns an instance of Thing, Jersey will serve a JSON string based on any public get*() methods that take no parameters, and the public fields. So, this class,

public class Thing {
    public String id;
    public getStuff() {
       ...
    }
}

would produce JSON like this, recursing into contained class instances:

{
  "id": "123",
  "stuff" {
     "foo": true,
     "bar": 789
  }
}

However, if there are no corresponding setter functions, such as setStuff(), RestyGWT (or Jackson, which it uses) will silently ignore these parts of the JSON when deserializing the JSON into a client-side Java instance. If there is a way to detect this at runtime, I’d like to know about it.

This makes it particularly important to test the serialization and deserialization. For instance:

@Test
public void ThingJsonTest() throws IOException {
    // The Jackson ObjectMapper:
    ObjectMapper objectMapper = new ObjectMapper();

    // Get the JSON for an object:
    Thing objToWrite = new Thing();
    objToWrite.setFoo(2);
    // etc.
    
    final String json = objectMapper.writeValueAsString(objToWrite);
    assertNotNull(json);
    assertFalse(json.isEmpty());

    // Get an object from the JSON: 
    Thing obj = objectMapper.readValue(json, Thing.class);
    assertNotNull(obj);
    assertEquals(2, obj.getFoo());
    // etc.
}

Removing the GWT-RPC code

Once everything is working, we can remove the unused GWT-RPC classes and configuration:

  • The ThingService interface, which extends RemoteService.
  • The ThingServiceAsync interface.
  • The ThingServiceImpl class.
  • The <servlet> and <servlet-mapping> tags from web.xml

C++ Implementations of “Engineering the Compiler” Pseudocode

Implementing is understanding

I’m gradually reading through the Engineering a Compiler book. I’m enjoying it but after a while I started wondering if I really understood the algorithms and I wanted to make sure of that before going further. So I implemented the algorithms in C++ (C++17, because it’s nice), testing them against the example inputs and grammars from the book. For instance, here is my code to construct, and use, the action and goto tables which define a DFA for a bottom-up table-driven LR(1) parser, along with some test code to use it with a simple expression grammar.

So far I’ve done this for Chapter 2 (Scanners) and Chapter 3 (Parsers) and I’ll try to keep going. It is a useful exercise for me, and maybe it’s useful to someone else reading the book.  Please note that the code will probably be meaningless to you if you haven’t read the book or something similar. On the other hand, the code will probably seem childlike if you’ve studied compilers properly.

Trying to get  the code to work often showed me that I had had only the illusion of fully understanding. Much of the pseudocode in the book is not very clear, even when you adjust your mind to its vaguely mathematical syntax, and to know what it really means you have to closely read the text descriptions, sometimes finding clues several pages back. For instance it’s not always clear what is meant by a particular symbol, and I found at least one conditional check that appeared in the description but not in the code. I would much rather see descriptions inline in the code.

Pseudocode is vague

I’m not a fan of pseudocode either, though I understand the difficulty in choosing a real programming language for examples. It means putting some readers off. But there could at least be some real code in an appendix. The book has some detailed walkthroughs that show that the authors must have implemented the algorithms somehow, so it’s a shame that I couldn’t just look at that code. I wonder what real programming language might be the most compact for manipulating sets of states when dealing with finite state automata states and grammar symbols.

For the parsers chapter this wasn’t helped by the, presumably traditional, ambiguity of the “FIRST” sets terminology. FIRST(symbol), FIRST(symbols), FIRST(production), and FIRST+(production) are all things.

My code isn’t meant to be efficient. For instance, my LR(1) parser table-building code has some awfully inefficient use of std::set as a key in a std::map. The code is already lengthier than I’d like, so I would prefer not to make it more efficient at the cost of readability. I’ve also probably overdone it with the half-constexpr and vaguely-concepty-generic Grammar classes. But I am tempted by the idea of making these fully constexpr with some kind of constexpr map, and then one day having the compiler build the generated (DFA) tables at compile time, so I wanted to explore that just a little.

But the book is good

Despite my complaints about the presentation of the algorithms, so far I’d still recommend the book. I feel like it’s getting the ideas into my head , it’s not really that bad, and as far as I know there’s nothing better. Of course, given that I haven’t read the alternatives, my recommendation shouldn’t mean much.

Also, these days I always write up my notes as a bigoquiz.com quiz, so here are my quizzable compiler notes so far.

Big-O Quiz: Graph Theory

I’ve been studying more graph theory recently. To take notes that I might remember, I added a Graphs quiz to my Big-O Quiz site. It covers the graph theory mentioned in The Algorithm Design Manual and in the Stanford/Coursera “Algorithms: Design and Analysis Parts 1 & 2“  courses by Tim Roughgarden. I’m adding to it as I work through Tim Roughgarden’s follow-up “Second course in Algorithms“.

It’s often a slightly silly quiz, offering you algorithm descriptions that make no sense for the question’s problem, along with the correct one, but the aim is to just to reinforce my memory by exercising my memory. So I’ve tried to use concise descriptions of problems and algorithms, without trying to be exhaustively correct.

BigOQuiz: C++ Standard Library Algorithms

I’ve reworked bigoquiz.com, which I created a few weeks ago. to allow multiple quizzes instead of just the default quiz about the Big-O complexity of algorithms and data structures. For some quizzes it also now offers reversed questions, asking you for the question that matches the answer.

For instance, there is now a C++ Standard Algorithms quiz. You could, for instance, check that you know the descriptions of the Sort Operations functions. Or you could check that you know the names of those Sort Operations functions based on their descriptions (or as multiple-choice). Maybe this is one way to get more familiar with the many standard C++ algorithms that we often don’t realize are available. I hope this helps me as I strive to use “no raw loops“.

As before, you can track your progress in each section of the quiz, and identify problem questions that you keep getting wrong.

screenshot_bigoquiz_cpp_std_algorithms_sections

I’ve also added a Design Patterns quiz, based on the patterns mentioned in the Head-First Design Patterns book.

The user profile page also now shows overall statistics for all the quizzes that you’ve tried.

screenshot_bigoquiz_userprofile

 

Big-O Quiz: An experiment with Google AppEngine

bigoquiz.com

screenshot_bigoquiz_webOver the last few weeks I’ve been working on bigoquiz.com. I wanted to learn the Big-O algorithmic complexity of the main algorithms and data structures by heart. So, of course, I first built a system to help me do that.

bigocheatsheet.com and bigoref.com already do a good job of listing the information, but bigoquiz.com lets you check that you really know each one, and shows you the ones you keep getting wrong.

Tech I learned along the way

I find it very hard to resist a project when I see that it would let me play with multiple new things. This one was a chance to learn about:

Details

I noticed that bigoref.com had some corrections and additions compared to bigcheatsheet.com. And I made some additions of my own. I’ve listed it all here for anybody who is interested:

For instance,  bigoref.com adds Graph Search algorithms such as Dijkstra’s and Bellman-Ford (missing from bigocheatsheet.com), under “Searching“. I instead listed the graph search algorithms in bigoquiz.com under “Graph Search”  and added more graph algorithms that I’d learned about recently in my Coursera course, though I’m tempted to create a completely different graph algorithms quiz: Kruskal’s Minimum Spanning Tree, Prim’s Minimum Spanning Tree, Floyd Warshall All Pairs Shortest Path, and Johnson’s All Pairs Shortest Path.

bigoref.com also adds linear and binary search of arrays (missing from bigocheatsheet.com). For bigoquiz.com I just split Array into sorted and unsorted in “Data Structure Operations”.

I found just one error on bigoquiz.com that was corrected on bigoref.com, though there might be others:

And I noticed that bigoref.com has some additional errors so I corrected those on bigoquiz.com and  filed github issues for bigoref:

Also:

  • bigocheatsheet.com has entries for Stack, but bigoref.com does not. I kept that in bigoquiz.com.
  • bigoref.com has entries for Ternary Search Tree, apparently a kind of Trie (Prefix Tree), but bigocheatsheet.com does not. I kept this in bigoquiz.com. Curiously, bigoref.com doesn’t have entries for regular Tries (Prefix Trees), or Suffix Trees. I added them to bigoquiz.com, though I’m not completely sure about the various best/worst/average time/space complexities.
  • bigoref.com uses the term “indexing” where bigocheatsheet.com uses “access”. I’ve stuck with “access” for bigoquiz.com.
  • bigoref.com splits array into Basic Array and Dynamic Array, not allowing insertion or deletion in a Basic Array. I don’t find the distinction useful, so I kept it as just Array in bigoquiz.com.