Tag Archives: bigoquiz

C++ Implementations of “Engineering the Compiler” Pseudocode

Implementing is understanding

I’m gradually reading through the Engineering a Compiler book. I’m enjoying it but after a while I started wondering if I really understood the algorithms and I wanted to make sure of that before going further. So I implemented the algorithms in C++ (C++17, because it’s nice), testing them against the example inputs and grammars from the book. For instance, here is my code to construct, and use, the action and goto tables which define a DFA for a bottom-up table-driven LR(1) parser, along with some test code to use it with a simple expression grammar.

So far I’ve done this for Chapter 2 (Scanners) and Chapter 3 (Parsers) and I’ll try to keep going. It is a useful exercise for me, and maybe it’s useful to someone else reading the book.  Please note that the code will probably be meaningless to you if you haven’t read the book or something similar. On the other hand, the code will probably seem childlike if you’ve studied compilers properly.

Trying to get  the code to work often showed me that I had had only the illusion of fully understanding. Much of the pseudocode in the book is not very clear, even when you adjust your mind to its vaguely mathematical syntax, and to know what it really means you have to closely read the text descriptions, sometimes finding clues several pages back. For instance it’s not always clear what is meant by a particular symbol, and I found at least one conditional check that appeared in the description but not in the code. I would much rather see descriptions inline in the code.

Pseudocode is vague

I’m not a fan of pseudocode either, though I understand the difficulty in choosing a real programming language for examples. It means putting some readers off. But there could at least be some real code in an appendix. The book has some detailed walkthroughs that show that the authors must have implemented the algorithms somehow, so it’s a shame that I couldn’t just look at that code. I wonder what real programming language might be the most compact for manipulating sets of states when dealing with finite state automata states and grammar symbols.

For the parsers chapter this wasn’t helped by the, presumably traditional, ambiguity of the “FIRST” sets terminology. FIRST(symbol), FIRST(symbols), FIRST(production), and FIRST+(production) are all things.

My code isn’t meant to be efficient. For instance, my LR(1) parser table-building code has some awfully inefficient use of std::set as a key in a std::map. The code is already lengthier than I’d like, so I would prefer not to make it more efficient at the cost of readability. I’ve also probably overdone it with the half-constexpr and vaguely-concepty-generic Grammar classes. But I am tempted by the idea of making these fully constexpr with some kind of constexpr map, and then one day having the compiler build the generated (DFA) tables at compile time, so I wanted to explore that just a little.

But the book is good

Despite my complaints about the presentation of the algorithms, so far I’d still recommend the book. I feel like it’s getting the ideas into my head , it’s not really that bad, and as far as I know there’s nothing better. Of course, given that I haven’t read the alternatives, my recommendation shouldn’t mean much.

Also, these days I always write up my notes as a bigoquiz.com quiz, so here are my quizzable compiler notes so far.

Big-O Quiz: Graph Theory

I’ve been studying more graph theory recently. To take notes that I might remember, I added a Graphs quiz to my Big-O Quiz site. It covers the graph theory mentioned in The Algorithm Design Manual and in the Stanford/Coursera “Algorithms: Design and Analysis Parts 1 & 2”  courses by Tim Roughgarden. I’m adding to it as I work through Tim Roughgarden’s follow-up “Second course in Algorithms“.

It’s often a slightly silly quiz, offering you algorithm descriptions that make no sense for the question’s problem, along with the correct one, but the aim is to just to reinforce my memory by exercising my memory. So I’ve tried to use concise descriptions of problems and algorithms, without trying to be exhaustively correct.

BigOQuiz: C++ Standard Library Algorithms

I’ve reworked bigoquiz.com, which I created a few weeks ago. to allow multiple quizzes instead of just the default quiz about the Big-O complexity of algorithms and data structures. For some quizzes it also now offers reversed questions, asking you for the question that matches the answer.

For instance, there is now a C++ Standard Algorithms quiz. You could, for instance, check that you know the descriptions of the Sort Operations functions. Or you could check that you know the names of those Sort Operations functions based on their descriptions (or as multiple-cdon’t always hoice). Maybe this is one way to get more familiar with the many standard C++ algorithms that we often don’t realize are available. I hope this helps me as I strive to use “no raw loops“.

As before, you can track your progress in each section of the quiz, and identify problem questions that you keep getting wrong.

screenshot_bigoquiz_cpp_std_algorithms_sections

I’ve also added a Design Patterns quiz, based on the patterns mentioned in the Head-First Design Patterns book.

The user profile page also now shows overall statistics for all the quizzes that you’ve tried.

screenshot_bigoquiz_userprofile

 

Big-O Quiz: An experiment with Google AppEngine

bigoquiz.com

screenshot_bigoquiz_webOver the last few weeks I’ve been working on bigoquiz.com. I wanted to learn the Big-O algorithmic complexity of the main algorithms and data structures by heart. So, of course, I first built a system to help me do that.

bigocheatsheet.com and bigoref.com already do a good job of listing the information, but bigoquiz.com lets you check that you really know each one, and shows you the ones you keep getting wrong.

Tech I learned along the way

I find it very hard to resist a project when I see that it would let me play with multiple new things. This one was a chance to learn about:

Details

I noticed that bigoref.com had some corrections and additions compared to bigcheatsheet.com. And I made some additions of my own. I’ve listed it all here for anybody who is interested:

For instance,  bigoref.com adds Graph Search algorithms such as Dijkstra’s and Bellman-Ford (missing from bigocheatsheet.com), under “Searching“. I instead listed the graph search algorithms in bigoquiz.com under “Graph Search”  and added more graph algorithms that I’d learned about recently in my Coursera course, though I’m tempted to create a completely different graph algorithms quiz: Kruskal’s Minimum Spanning Tree, Prim’s Minimum Spanning Tree, Floyd Warshall All Pairs Shortest Path, and Johnson’s All Pairs Shortest Path.

bigoref.com also adds linear and binary search of arrays (missing from bigocheatsheet.com). For bigoquiz.com I just split Array into sorted and unsorted in “Data Structure Operations”.

I found just one error on bigoquiz.com that was corrected on bigoref.com, though there might be others:

And I noticed that bigoref.com has some additional errors so I corrected those on bigoquiz.com and  filed github issues for bigoref:

Also:

  • bigocheatsheet.com has entries for Stack, but bigoref.com does not. I kept that in bigoquiz.com.
  • bigoref.com has entries for Ternary Search Tree, apparently a kind of Trie (Prefix Tree), but bigocheatsheet.com does not. I kept this in bigoquiz.com. Curiously, bigoref.com doesn’t have entries for regular Tries (Prefix Trees), or Suffix Trees. I added them to bigoquiz.com, though I’m not completely sure about the various best/worst/average time/space complexities.
  • bigoref.com uses the term “indexing” where bigocheatsheet.com uses “access”. I’ve stuck with “access” for bigoquiz.com.
  • bigoref.com splits array into Basic Array and Dynamic Array, not allowing insertion or deletion in a Basic Array. I don’t find the distinction useful, so I kept it as just Array in bigoquiz.com.