Looking at the Zooniverse code

Recently I’ve been looking over the Zooniverse citizen science project and its source code on github, partly because it’s interesting as a user and partly because I thought writing an Android app for Galaxy Zoo would be a good learning exercise and something useful to open source.

So far my Android app can’t do more than show images, but I thought I’d write up some notes already. I hesitate to implement the Android App further because the classification decision tree is so tied up in the web site’s code, as I describe below.

Hopefully this won’t feel like an attack from a clueless outsider. I’m just a science enthusiast who happens to have spent years developing open source software in various communities. I’ve seen the same mistakes over and over again and I’ve seen how to make things better.

Zooniverse is organised by the Citizen Science Alliance (CSA). Incidentally, though the CSA has an organizational structure, I can’t see its actual legal form. Is it a foundation or a company? The zooniverse.org and citizensciencealliance.org domains are registered to Chris Lintott. Maybe it’s just a loose association of researchers with academic institutions assigning funds to work they care about, and maybe that’s normal in academia. The various Zooniverse employees actually seem to work for the member organisations such as the Adler Planetarium or the University of Oxford, though I guess some funding is for specific Zooniverse projects and some funding is for the overall Zooniverse development and hosting. That probably makes coordination difficult, like a big open source project.

Open Source

Since early 2013, the main Galaxy Zoo project has had the code for its website on github, along with the Zooniverse JavaScript library that it shares with other Zooniverse projects.

But most projects highlighted at zooniverse.org (such as Planet Four, Asteroid Zoo, Moon Zoo, or Solar StormWatch) don’t yet have their website’s code on github. This looks like a worrying trend. It doesn’t look like open sourcing has become the default as planned.

The zooniverse github repositories list is a poor overview, particularly because most of the repositories have no github description whatsoever. Github should make them mandatory, even though they’d need to be updated later. Most projects don’t even have a basic description in their README.md either. Furthermore, I’d like to see a clear separation between front-ends, server-side code, and utilities (for processing data or for installing/maintaining servers), maybe laid out on a github wiki page.

Also, they apparently have no plans to open source the server-side code (Ouroboros at api.zooniverse.org) that serves new subjects (such as galaxy images) to classify and receives classifications of these subjects. I think I’ve read that it’s a Ruby on Rails system. The client-side and server-side code is tightly bound, so this is a bit awkward. There is clearly room at least for some of the data structures and descriptions to be abstracted out and shared between the server, the client, and the analysis tools.

I can’t find any real documentation about the various Zooniverse code or APIs so there’s an awful chance of this blog post being the only introductory documentation that exists. I’d really welcome corrections and I’d gladly help. Neither can I find any place for public discussion of the software’s development, such as a mailing list. It’s hard for any open source project to mature without at least somewhere to discuss it.


Arfon Smith at Zooniverse wrote some blog entries about the Zooniverse Domain Model, Tools and Technologies, and Server-side logic (my title). (Arfon has since left Zooniverse to work at Github.) I also found some useful documentation at the zooniverse npm.org page. But I had to look at the code and the network traffic to get a more complete picture.

Languages, Libraries, Frameworks

The zooniverse front-end web-sites generally seem to be written in CoffeeScript (a mostly nicer language on top of JavaScript), using the Spine framework, which seems to make it easier to separate data and code into an MVC structure and to write code that deals asynchronously with the server while caching some data locally.

Some CoffeeScript is written inline with the HTML, in Eco (.eco) files.

The CSS is written in the Stylus syntax, as expected by hem, which they use to bundle the code up for deployment.

I’m no JavaScript expert, but these seem like fairly wise choices.

Zooniverse web sites communicate with the Ouroboros server using RESTful GET (get subjects to classify) and POST (return a classification of a subject) HTTP requests, using JSON syntax. I think the JSON syntax is generated/parsed by the base Spine.Module. I don’t know of any implementation-independent documentation for this web API.

The website code uses the Zooniverse library as a helper to communicate with the server, for instance to log in, to get subjects, to submit classifications, and to support the lists of recent and favourite subjects. The Zooniverse library is also implemented in CoffeeScript. Strangely, the generated JavaScript is also checked into git. The Api class seems to be the most interesting.

Questions and Answers

Let’s look at the Galaxy-Zoo website, though it’s maybe the most complicated. It allows users to classify images of galaxies. Those images may be from one of several astronomical surveys, such as Sloan or UKIDSS. Each survey has an ID and a Workflow ID listed in config.coffee (with much duplication of magic numbers). Each survey has a human-readable description and title in the list of English strings.

Each survey has a question/decision tree under app/lib, such as Galaxy-Zoo’s sloan_tree.coffee. I wonder if this is generated or duplicated from somewhere in the server software. Why are the long question titles duplicated and used as IDs for leadsTo instead of using short codes? Is this tree validated somehow during the build?
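To make the short-codes question concrete, here is a minimal sketch in Python rather than CoffeeScript. The tree below, its q-/a- codes, the leads_to key and most of the question titles are my own inventions and are not copied from sloan_tree.coffee. It only shows how short IDs would let a build step check that every answer leads somewhere that exists.

```python
# Hypothetical decision tree: the codes and most titles below are
# illustrative only and are not taken from sloan_tree.coffee.
TREE = {
    "q-shape": {
        "title": "Is the galaxy smooth, or does it have features?",
        "answers": {
            "a-smooth": {"text": "Smooth", "leads_to": "q-odd"},
            "a-features": {"text": "Features", "leads_to": "q-spiral"},
        },
    },
    "q-spiral": {
        "title": "Is there any sign of a spiral arm pattern?",
        "answers": {
            "a-yes": {"text": "Spiral", "leads_to": "q-odd"},
            "a-no": {"text": "No spiral", "leads_to": "q-odd"},
        },
    },
    "q-odd": {
        "title": "Is there anything odd?",
        "answers": {
            "a-yes": {"text": "Yes", "leads_to": None},  # None ends the tree
            "a-no": {"text": "No", "leads_to": None},
        },
    },
}


def validate_tree(tree):
    """Return a list of problems; an empty list means the tree is consistent."""
    problems = []
    for question_id, question in tree.items():
        for answer_id, answer in question["answers"].items():
            target = answer["leads_to"]
            # Every non-final answer must point at a question that exists.
            if target is not None and target not in tree:
                problems.append(
                    f"{question_id}/{answer_id} leads to unknown question {target!r}"
                )
    return problems


print(validate_tree(TREE))
```

A check like this could run during the build, so a renamed or mistyped question would fail early instead of silently breaking the classification flow.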

These IDs, Workflow IDs, and decision trees are listed in the Subject class.

Question IDs

The zero-based indexes of the questions in the decision trees are used as IDs when submitting the classification. For instance, a submitted classification POST might contain the following parameter to show that, when classifying a Sloan image, for the “Is there any sign of a spiral arm pattern” question (sloan-3, and the 4th question asked of me) I answered “Spiral” (a-0):

classification[annotations][4][sloan-3]: "a-0"
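As a rough sketch of how a client might build that parameter, here is a small Python helper. The function name is hypothetical, and I am assuming that the parameter travels as an ordinary form field; I have only seen the parameter itself, not the code that builds it.

```python
from urllib.parse import urlencode


def annotation_param(index, question_id, answer_id):
    """Build one (name, value) form parameter for a single answer."""
    name = f"classification[annotations][{index}][{question_id}]"
    return (name, answer_id)


name, value = annotation_param(4, "sloan-3", "a-0")
print(f'{name}: "{value}"')

# The same parameter, percent-encoded as it would appear in a request body.
body = urlencode([annotation_param(4, "sloan-3", "a-0")])
print(body)
```

Note that both the annotation index and the implicit question ID end up inside the parameter name, so any reordering of the tree changes what the client has to send.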

These implicit IDs, such as sloan-3, are also used in the translations and throughout the code, for instance to reuse some translation strings and to decide if there should be a talk-page link. That i18n hack in particular belongs as an association in the decision tree.

These implicit IDs are also used in the CSS (via the Stylus .styl files) to identify the relevant icons. The icons are in one workflow.png file (in order to use the CSS Sprites technique for performance). The various sub-parts of that image are selected by CSS in common.styl.

This seems very fragile. It would be safer if the icon files were stored separately and then the combined file was generated, along with that .styl CSS. I guess that the icons are already stored separately somewhere, maybe as SVG. One parent file could define the decision tree and all the associated descriptions and icon files.
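Here is a minimal sketch of the generation step suggested above, covering only the CSS half. I am assuming square icons of one size, laid out left to right in a grid. The class names, the grid width and the function name are all hypothetical. Actually stitching the separate icon files into workflow.png would need an image library and is left out.

```python
ICON_SIZE = 100  # pixels; assumed, not measured from the real workflow.png


def sprite_rules(icon_ids, columns=10, size=ICON_SIZE, sprite="workflow.png"):
    """Return CSS text that selects each icon from a grid-shaped sprite sheet."""
    lines = [
        f".icon {{ background-image: url('{sprite}'); "
        f"width: {size}px; height: {size}px; }}"
    ]
    for position, icon_id in enumerate(icon_ids):
        row, column = divmod(position, columns)
        # Negative offsets shift the sheet so that the wanted cell is visible.
        x, y = -column * size, -row * size
        lines.append(f".icon.{icon_id} {{ background-position: {x}px {y}px; }}")
    return "\n".join(lines)


print(sprite_rules(["sloan-0-a-0", "sloan-0-a-1", "sloan-1-a-0"], columns=2, size=50))
```

Because the offsets are derived from the same ordered list that would drive the image stitching, the stylesheet and the sprite sheet could not drift apart.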

Ideally much of this structure would be described in configuration files separately from the code. That generalisation would allow more code reuse between Zooniverse projects and could allow reuse by other front-ends such as iPhone and Android apps. Presumably it’s this fragility that has caused Galaxy Zoo to withdraw its previous mobile apps. Even with such an improvement, you’d still need a proper release process to coordinate development of interdependent software.

Subject and Classification

Galaxy-Zoo has a Subject class, as does the Operation War Diaries project. These usually derive from the base Subject class in the zooniverse library, though the Snapshot Serengeti Subject class does not.

The Ouroboros server at api.zooniverse.org provides a list of subjects for each group (a group is a survey, I think) to be classified via JSON. Here is the list of subjects for Galaxy Zoo’s Sloan survey. And here is the subjects list for Snapshot Serengeti with a simpler URI because there is only one group/survey.

The surveyId (for the group) for Galaxy Zoo is chosen randomly, though it’s currently hard-coded to always choose the Sloan survey. This JSON message contains the URLs of images for each subject, in the list of “locations”. The Subject’s fetch() method calls the Api.get() method from the Zooniverse library and then creates Subjects for each item that the JSON message mentions.

The Subject’s constructor seems to take the JSON fragment to populate its member fields using the default Spine.Model’s AJAX functionality.

Galaxy-Zoo has a Classification class, and Snapshot Serengeti has one too. There doesn’t seem to be any common base Classification class in the zooniverse library. The Classification’s send() method calls the Classification’s toJSON() method before POSTing the message to the server via the Zooniverse library’s Api.post() method.
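Purely as a thought experiment, a shared base class might look something like this Python sketch. None of it is taken from the Zooniverse code. The field names, the "/classifications" path and the post(path, body) callable (standing in for Api.post()) are all assumptions.

```python
import json


class Classification:
    """Hypothetical shared base class; project classes supply the annotations."""

    def __init__(self, subject_id):
        self.subject_id = subject_id

    def annotations(self):
        raise NotImplementedError("each project describes its own answers")

    def to_json(self):
        message = {"subject_id": self.subject_id, "annotations": self.annotations()}
        return json.dumps(message, sort_keys=True)

    def send(self, post):
        """Serialise first, then hand the message to a post(path, body) callable."""
        return post("/classifications", self.to_json())


class MultipleChoiceClassification(Classification):
    """A series of answers to questions from a decision tree."""

    def __init__(self, subject_id):
        super().__init__(subject_id)
        self.answers = []

    def answer(self, question_id, answer_id):
        self.answers.append({question_id: answer_id})

    def annotations(self):
        return list(self.answers)


classification = MultipleChoiceClassification("subject-1")
classification.answer("sloan-3", "a-0")
classification.send(lambda path, body: print(path, body))
```

A project that asks users to draw shapes would subclass the same base and return coordinates from annotations() instead.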

It’s hard to see any great commonality between the various projects. For instance, a Galaxy Zoo classification is a series of answers to multiple-choice questions, with the questions being from a decision tree. I guess that Snapshot Serengeti’s animal classification is similar, though you can provide multiple sets of answers to the same questions about what animal it is and what it is doing, to identify multiple animals in each image. Moon Zoo and Planet Four ask you to draw specific shapes on an image and also classify each shape you’ve drawn, probably resulting in coordinates and identifications.

I wonder if the server-side code has any common model for these data structures or if the classifications just get fed into project-specific databases for project-specific data analysis later.

Android Glom Experiments

Over the past few weeks I’ve been diving into Android development using a semi-realistic project to force me to learn properly. I wrote a rough first version of a read-only Glom database UI for Android, called android-glom. So now there’s a version in gtkmm (C++), Qt (C++), GWT (Java) and Android (Java). It’s a good way to really try out a framework and I’ve really enjoyed doing that with no pressure.

Here are some of my thoughts about the experience. I’d welcome feedback about the opinions I’ve formed and about the code I’ve written.

I’ve been doing this while reading O’Reilly’s Programming Android book which is pretty good. I had already read the Busy Coder’s Guide to Android Development a few years ago, which I also liked, but I didn’t realise until recently that it’s been kept up to date as a subscription, because Amazon only has the old edition.

Android Studio

I tried out Android Studio for this project. It’s not quite officially stable but I already find it preferable to Eclipse even though its UI is only slightly less cluttered than Eclipse. It has never asked me to choose a “Perspective”, which is nice. Like Eclipse, it has refactoring tools so you can move Java code around and rename stuff easily. I miss this now when I use C++.


Android doesn’t really support JDBC, making sure that you don’t even try to access an external database server on a tablet or handheld that probably doesn’t have a reliable connection. Those database servers are not meant to be exposed directly on the internet anyway. But it’s still annoying that you have to use a similar-but-separate Android-specific and SQLite-specific database API (SQLiteDatabase, SQLiteOpenHelper and Cursor) instead, making code less portable.

Luckily I was still able to reuse much of my SQL-building and data-structure Java code from gwt-glom with just minor changes.

Activities and Fragments

Fragments are a fairly new way to let you rearrange your Android UI in various ways for different sized screens or different orientations. For instance a tablet UI might have one Activity that shows a list fragment at the left and a detail fragment at the right, depending on what item is selected in the list. But a handset version of that UI might open a second detail Activity to show that detail fragment, because it doesn’t have a big enough screen to show both at once.

Unfortunately, there are various official examples that use fragments to support both tablet and handset UIs in one app, but they don’t all use the same techniques or API. For instance, the “Building a Flexible UI” documentation instead suggests just replacing the one fragment in the one single activity for the handset UI and using FragmentTransaction.addToBackStack() so the back button works.

However, the Master/Detail new-activity template code used in Eclipse and Android Studio uses the multiple Activity idea (only one Activity is ever used in the Tablet UI), setting an mTwoPane boolean after detecting which XML layout file has been loaded. This works because you can specify separate layouts (or other resources) for Android to use depending on, for instance, screen size or orientation, and you just need to check what it has decided to use.

So this is the system that I’ve used, later checking that mTwoPane boolean when I want to navigate to another part of the UI, either telling the main Activity to do something with one of its fragments, putting the necessary parameters in a Bundle given to Fragment.setArguments() or just using startActivity() to start a new Activity with the appropriate parameters in its Intent. It’s slightly annoying that fragments take a Bundle but Activities take an Intent, with both having very similar but separate APIs.

Content Providers

Many parts of the Android API, such as ListView, ListFragment, and ListActivity, assume the use of a Cursor to access data, which in turn requires that you use either a SQLiteDatabase or a ContentProvider.

The Android API documentation implicitly pushes you, via deprecation, to use a Content Provider, rather than just a SQLiteDatabase, to separate your UI and data into separate processes even when you have no need to share your application’s data with other applications, even though the high-level documentation says “You don’t need to develop your own provider if you don’t intend to share your data with other applications.”

Specifically, you can tell your ListView, ListFragment, or ListActivity to show your SQLite database data via a CursorAdapter, such as SimpleCursorAdapter, which takes a Cursor. That Cursor can be the result of a SQLiteDatabase.query() or rawQuery(). But you’ll need to call Activity.startManagingCursor() on that Cursor and that is deprecated (probably because it’s not asynchronous) in favour of using CursorLoader (by implementing LoaderManager.LoaderCallbacks). And that means using a ContentProvider. See the “Running a Query with a CursorLoader” documentation. I wish that the documentation and examples just started off with this clear recommendation.

You can instead implement a custom CursorLoader that uses a SQLiteDatabase directly, but you then lose some functionality such as automatic ListView updating via notification when the data changes. And I think I’ve read of other ListView functionality that only works with a ContentProvider but I can’t find that documentation now. I think it was something about searching or auto-completion.

Content Providers are a bit awkward

Unfortunately a ContentProvider doesn’t provide quite as much loose binding as you’d hope. The ContentProvider API is very much like a SQL API. Most example code seems to just expose the SQLite database structure directly, sometimes with a simple mapping of column names as a thin separation. Given that this is the most common ContentProvider implementation that gets cargo-culted, it seems that it should be a lot less verbose.

(Update: I noticed that the mapping of external column names to internal column names is useless anyway because you end up exposing the internal column names when your Content Provider’s query() returns the database query’s cursor as your Content Provider query()’s cursor. Client code will often then need those internal database column names to call Cursor.getColumnIndex() to then get the values in the columns. This is only a problem when you don’t specify specific columns, which would then be mapped by SQLiteQueryBuilder.setProjectionMap().)
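The leak is easy to reproduce outside Android. This sketch uses Python’s sqlite3 module as a stand-in for the Android classes. The films table, its col_17/col_18 columns and the query() helper are made up for illustration. It shows a “SELECT *” cursor exposing the internal names, and one way around it: always selecting through the mapping, even when the caller names no columns.

```python
import sqlite3

# External (public) column name -> internal database column name.
# Both sets of names are invented for this example.
PROJECTION = {"title": "col_17", "year": "col_18"}


def query(connection, columns=None):
    """Always select through the projection, even when no columns are named."""
    wanted = columns or sorted(PROJECTION)
    select_list = ", ".join(f"{PROJECTION[name]} AS {name}" for name in wanted)
    return connection.execute(f"SELECT {select_list} FROM films")


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE films (col_17 TEXT, col_18 INTEGER)")
connection.execute("INSERT INTO films VALUES ('Example', 2013)")

# Handing back the raw cursor exposes the internal column names.
leaky = connection.execute("SELECT * FROM films")
leaky_names = [column[0] for column in leaky.description]

# Going through the projection exposes only the public names.
mapped_names = [column[0] for column in query(connection).description]

print(leaky_names, mapped_names)
```

Client code that looks columns up by name then only ever sees the public names, so the internal schema can change without breaking it.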

I also don’t like how this forces so much implementation code into one ContentProvider class, separated only by switch statements in the query(), insert(), update() and delete() implementations. I’d like an easy way to delegate my various ContentProvider URI implementations to separate classes.
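The kind of delegation I have in mind could be sketched like this, in Python rather than Java to keep it short and self-contained. The Dispatcher and TableHandler classes and the URI patterns are hypothetical and only show the shape of the idea. None of this is Android API.

```python
import re


class TableHandler:
    """Hypothetical per-URI handler; a real one would talk to the database."""

    def __init__(self, table):
        self.table = table

    def query(self, **params):
        return f"query {self.table} {params}"


class Dispatcher:
    """Routes a URI path to the handler registered for the first matching pattern."""

    def __init__(self):
        self._routes = []

    def register(self, pattern, handler):
        self._routes.append((re.compile(pattern), handler))

    def resolve(self, path):
        for regex, handler in self._routes:
            match = regex.fullmatch(path)
            if match:
                # Named groups in the pattern become handler parameters.
                return handler, match.groupdict()
        raise LookupError(f"no handler for {path!r}")

    def query(self, path):
        handler, params = self.resolve(path)
        return handler.query(**params)


dispatcher = Dispatcher()
dispatcher.register(r"/systems", TableHandler("systems"))
dispatcher.register(r"/systems/(?P<system_id>\d+)", TableHandler("system"))
print(dispatcher.query("/systems/7"))
```

The provider’s query(), insert(), update() and delete() would each become a one-line call into the dispatcher, and each handler class could be read and tested on its own.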

Here’s a link to my ContentProvider to show what I mean. It feels like a mess.

A database as a web service

On the other hand, it’s good that the ContentProvider provides a route to storing your data on the internet instead, maybe just caching it locally, without changing your UI code. For instance, it could use some RESTful service whose API would closely match a database API.

In fact, I’d like some easy way to just expose a database on the web, with multiple user-level and table-level access control. I know that developers spend huge amounts of time implementing very specific “business logic” web APIs to hide their underlying databases, and I know that developers rightly fear anyone having direct access to their databases. But couldn’t it be done properly? I guess that most of these systems have much the same code with much the same security mistakes, just to put the names of their database tables and columns behind a few levels of programming language class names, method names and parameter names that provide just a thin sense of security and a token hint at modularity.

I’ve found a few systems that do this, but I have no idea of which ones are trusted and used much.


Family Research

I’ve been doing some family tree research. My parents and their parents were so averse to talking that I unfairly associate Scottishness with unusual levels of reticence. Luckily, Scotland has excellent public records of births, marriages and deaths, so you can be sure of some facts without having to talk to each other.

I didn’t find out as much as I’d hoped about my immediate family, but I did fill in a few details. I hoped to find a few people who might have photographs, but that hasn’t happened so far. However, I was surprised to be so interested in the lives of people who were so long forgotten.

First I’ll mention some of the websites I used, and then I’ll mention some of the family members that I researched, so that search engines can find their names.

Scotland’s People: Public Records

I’m lucky that almost all my ancestors for several generations lived in Scotland, in Edinburgh or nearby. For some reason, Scotland took excellent records of births, marriages and deaths from 1855 as well as regular census records, and you can search for these records online at scotlandspeople.gov.uk. You can download most of them online, though you need to order some as paper copies – usually the more recent ones. They have some church records from before 1855 too, but these don’t generally include enough detail (such as parents of people getting married) to identify people properly. It’s not awfully expensive.

The Scottish birth certificates, marriage certificates and death certificates include the names of the parents and the professions of the fathers, and the birth certificates include the marriage years too, so you can use them without much doubt. Often, a death certificate will mention the spouse, or be signed by a previously-unknown child or sibling. Sometimes a sibling is the witness for a marriage. Then you can track that person down via birth and death certificates. I don’t know of any other way to find siblings because scotlandspeople.gov.uk doesn’t let you search births by both parents’ names.

The occupations are particularly interesting. For my family they seem to show a move into Edinburgh from surrounding villages at the end of the industrial revolution, moving from menial labour into trades and then the (newly created?) middle class with office jobs. For instance, my great-grandparents and great-great-grandparents include several “scavengers” with occasional labourers, bakers, miners, brush makers or factory workers, but some of their (male) children became mechanics or clerks. Their children became clerks and shopkeepers or even doctors.

English Public Records

This is far harder for the English birth, marriage and death (BMD) records. They seem to exist, but they aren’t online, and you can’t even order them easily unless you know the exact place and year. freebmd.org.uk only has an incomplete index, but it looks like you’d have to look through the complete set of BMD microfiches in a major English city to be thorough. This is a huge waste.

For instance, I can’t find the certificate for my aunt’s (Joan Isabel Lawrence) death in London around 1995.

Ancestry.co.uk / ancestry.com / ancestry.de

The Ancestry.com service (or their regional versions) is genuinely useful. They have many (but apparently not all) English records that I wouldn’t have found otherwise, and most of the Scottish records. Most usefully, you don’t need to do individual searches – their Hints system will find possible supporting records for people in your family tree and offer to add new family members that those records mention. However, they don’t have all the Scottish records, so you’ll still need to use scotlandspeople.gov.uk.

They have some other sets of records of varying levels of completeness, such as boat passenger lists, phone books and military service records, which I’ve often found useful.

Likewise, the Hints system lets you expand your tree by using the public family trees from other users. I’ve even found a few photographs of slightly-distant relatives this way. This can tend to offer the illusion of certainty when people accept conclusions that may be based on inadequate records.


I’ve also used 23andme’s DNA Relatives feature. This is generally full of a thousand 4th or 5th cousins who are far enough removed that it’s almost impossible to see the connection, but I’ve had one 2nd/3rd cousin (and her son) show up. Because I got my father into 23andme too, it could show me that she was related on my mother’s side. She didn’t know much about her family, but she gave me enough clues for me to use scotlandspeople.gov.uk to see that her grandmother was the sister of my mother’s grandmother.

In general, 23andme’s list of possible relatives has not been very useful so far. Most connections that it shows are almost impossible to investigate using public records or family knowledge. Many must be due to unofficial biological fathers or mothers that don’t show up in marriage and birth records.

Most people on 23andme are in America and I have noticed a tendency for them to fixate on proving that they are directly related to settlers from northern Europe, happily ignoring the hundreds of other ancestors who would have been in the same generation. Maybe this will be harder once 23andme has a broader cross-section of even American society.

My Grandfather, Ian Murray Lawrence, and my grandmother, Isabella Forsyth

My grandfather, Ian Murray Lawrence (1914-08-17 to 2006-09-26), lived in Edinburgh but my parents had moved to Swindon, which was around 10 hours away by train at that time. My mother didn’t talk to him. I didn’t meet him often until I was in Edinburgh at University in 1991/1992 and even then he was a wall of silence. My mother, Margaret Maya Lawrence (1944-12-03 to 1999-07-15), died in 1999 and he died in 2006.

He was a pharmacist, as was my mother. My father says his pharmacy was in Pilton, Edinburgh, and the 1950 phonebook on ancestry.co.uk lists him at 33 W(est) Pilton Park, Davidson’s Mains, Edinburgh, near where he lived in Blackhall, Edinburgh. I’d love to find a photograph of it though it was apparently a fairly temporary structure that’s since been replaced with a health center.

My grandfather mentioned once vaguely, while drunk, that he had been in the Pacific during World War 2, but never wanted to talk about it. Indeed, my mother’s birth certificate in 1944 mentioned that he was a Sick Bay Petty Officer in the Royal Navy. Now that I’m his next of kin, I could request his service records from the Royal Navy (this works for the Air Force and Army too). The forms are slightly awkward and they only take payment via paper cheque, but after a couple of months they gave me his complete list of postings. They showed that he was mostly at onshore Royal Navy hospitals, starting in Wales and England, but then in Australia and Hong Kong. This is his complete record, in case anyone else with similar results (or photos!) finds it via Google:

1942-12-30 to 1943-02-08: HMS Glendower: Probationary Sick Berth Attendant
1943-02-09 to 1943-02-09: Passage: Probationary Sick Berth Attendant
1943-02-10 to 1943-05-26: RNH Plymouth: Probationary Sick Berth Attendant
1943-05-27 to 1943-06-27: RNH Plymouth: Sick Berth Attendant
1943-06-28 to 1943-09-02: RNH Plymouth: Sick Berth Petty Officer
1943-09-03 to 1944-03-27: HMS Drake: Sick Berth Petty Officer
1944-03-28 to 1944-03-28: HMS Pembroke: Sick Berth Petty Officer
1944-03-29 to 1944-03-29: Passage: Sick Berth Petty Officer
1944-03-30 to 1945-03-05: RNH Chatham: Sick Berth Petty Officer
1945-03-06 to 1945-03-08: HMS Pembroke: Sick Berth Petty Officer
1945-03-09 to 1945-04-09: Passage: Sick Berth Petty Officer
1945-04-10 to 1945-04-14: HMS Golden Hind: Sick Berth Petty Officer
1945-04-15 to 1945-04-30: HMS Furneaux: Sick Berth Petty Officer
1945-05-01 to 1945-09-10: HMS Brisbane: Sick Berth Petty Officer
1945-09-11 to 1945-09-20: HMS Furneaux: Sick Berth Petty Officer
1945-09-21 to 1945-10-10: Passage: Sick Berth Petty Officer
1945-10-11 to 1946-03-22: RNH Hong Kong: Sick Berth Petty Officer
1946-03-23 to 1946-05-29: HMS Pembroke: Sick Berth Petty Officer

My grandfather’s wife, Isabella Forsyth (1915-10-02 to 1996-02-19), was said to have disappeared in the late 1950s or early 1960s, but I went with my mother to her cremation in 1996. I now have her death certificate showing that she died (apparently of suicide) in a (psychiatric?) care home in Edinburgh (since closed). Presumably my grandfather and mother knew all along where she was. I’d like to know when and why she was there, but I don’t think I’m entitled to access to any of her medical records.

I did discover that she had a sister, Margaret Pevey Forsyth (1911-10-26 to 1986-03-13), who had two sons, David Shaw and John Shaw, my mother’s cousins. I even found those sons’ children (my second cousins) on Facebook, but there’s no real way to contact strangers on Facebook. Three are in England and one is in Florida. I have a picture of the English second cousins’ father with my mother, and I do wonder if they have more pictures of my family.

My grandfather’s (Ian Murray Lawrence) father’s death certificate from scotlandspeople.gov.uk was signed by his brother Alexander Lawrence, who I’d never heard of. I then found him on the 1911 census (the parents’ ages and father’s profession matched) via ancestry.co.uk, and found his death certificate via scotlandspeople.gov.uk. He died of a heart attack in 1982 on North Bridge in Edinburgh while staying at the Salvation Army hostel at 1 Pleasance. Presumably they couldn’t find any relatives because he was listed in 1982 under “estates fallen to the crown”. My mother apparently had no idea that this uncle existed. So my grandfather seems to have mislaid at least his brother, his wife, and two daughters.

My aunt, Joan Isabel Lawrence

This is still the biggest family mystery.

I never met my mother’s sister, Joan Isabel Lawrence (1954-02-09 to approx. 1995), who nobody seemed to have heard from since before I was born. In 1995 we heard that she had died in London of cirrhosis of the liver – she’s assumed to have drunk herself to death. But that’s just what my father says that my mother said that my grandfather found out when someone replied to a Christmas card that he sent her. We were surprised that he had an address for her and we don’t know what it was. Apparently it took a few months for them to identify her body and my father thinks she was using the name Joan Maclaren, maybe as a stage or modelling name. The GRO say they can’t find any death certificate for Joan Lawrence, Joan Forsyth (her mother’s maiden name) or Joan Maclaren, for England and Wales for 1994 to 1996.

My grandmother, Isabella Stubbs Sparks

I actually knew my grandmother Isabella Stubbs Sparks (1911-07-10 to 2000-03-10) and grandfather Richard Cummings (1904-06-02 to 1990-02-11) because they moved to London and then to Swindon, where I lived. She was always strange, but now I can see that her family life must have been difficult. The last of 9 children, her parents lost five of her siblings within just a few years, mostly before she was born. In 1903 the 7th child died 3 days after birth. In 1906 the 8th child died after 1 year, of spina bifida. In 1908 the second-oldest died at age 11 from drinking tuberculosis-infected milk. In 1914, the 5th child died at age 16. In 1917 the oldest died at age 25 at the Somme in WW1. This cannot have been a happy home.

This is the other major change that I see when looking at the family tree that I built up. People just 2 or 3 generations ago regularly had around 10 children and lost half of them. We forget how lucky we are.

Google Music is a bit Awkward

This is a little addition to my post about building a Heart of Rock and Soul playlist on Google Music.

I’m surprisingly unbothered that I need to use Google Music on an Android device (or iPhone) or web browser to play my music. I mostly use my Android tablet or phone through a bluetooth speaker. However, it could use lots of improvement:

  • There’s no way to export a playlist, and your Google Music data is not part of Google Takeout.
  • I don’t think there’s a way to purchase all the tracks in my playlist. I’m afraid of what would happen if I cancelled my All Access subscription. Would it offer me the chance to buy the tracks? Would it permanently delete my playlist? Would I lose everything before I knew if I would lose everything?
  • You can’t search for tracks in a playlist. The browser’s (Firefox or Chrome) Find feature generally only searches through one page of the playlist.
  • You can’t easily move tracks around. If you want to add a track near the start of a 1000 song playlist then you’ll spend a long time dragging and dropping it from the bottom of the list to the top.
  • In the Google Music Android app, one false touch of the screen during an unexpected screen rotation can cause you to accidentally swipe a song out of your playlist. There’s no way to undo it, and no way to know exactly which version of the track you have just lost.
  • Songs you add to your library (or playlist) today might not play tomorrow. I’ve had songs just refuse to play. I guess the albums were removed from Google Music, but it would be nice to see some onscreen explanation when it happens. I’ve had this happen with Mitch Ryder and the Detroit Wheels’ “Rev Up” compilation, Ann Peebles’ “The Hi Singles A’s & B’s” compilation and a Gene Pitney song.
  • It gets confused sometimes and fails silently. Playlists sometimes won’t sync between different browser tabs, as if the changes haven’t reached the server. Recently my playlist was stuck at 991 songs. Trying to add one more seemed to work, but it didn’t show up in other browser tabs or in the Android app. Trying to move the song in the playlist resulted in a “Couldn’t change order. Please try again.” error message. By chance I found that the playlist in the Android app had some of those now-unplayable songs (see above) in strange positions that were not showing up in the browser. After removing them in the Android app, I could make changes in the browser again.

Heart of Rock and Soul: 992 out of 1001

This blog post will seem rather long, purist, and completist. That is the point. Sorry.

Wasted Youth

In around 1989, when I was 16 or 17, I got Dave Marsh’s Heart of Rock and Soul book, probably on the strength of a Q Magazine review. I spent the next few years searching for the songs it described, mostly from the 50s, 60s and 70s. Living in Swindon (UK), before the web, before Amazon, this involved many disappointing visits to awful second hand record shops.

At the time, UK radio was dominated by some awful crap and UK politics was coldly populist, as if it wasn’t hard enough being a teenager. Dave Marsh’s passionate rants gave me something better. My opinions of music and US politics have been pretty much lifted from him ever since, maturing only slightly.

His book introduced me to artists I would never have found otherwise, such as Al Green, Curtis Mayfield, Sly Stone, Roy Orbison, and Bobby Bland. It invited me to love Motown, Stax, Chess, and Atlantic soul wholeheartedly. It elevated Springsteen, Madonna and Prince above their mediocre company. I made time for George Jones and Patsy Cline. But I found only a fraction of the 1001 songs and over the years I stopped looking and graduated to other distractions. I lost my collection of CDs and cassettes in a series of moves, when I couldn’t take what I couldn’t carry. My small box of vinyl was allegedly taken to a charity shop in the 90s. I even lost the book.

Google Music All Access

A couple of months ago, I signed up for Google Music All Access, got a new copy of the book, and started hunting again. I’ve built up a Google Music playlist of most of the songs from Heart of Rock and Soul. Until now, I had still never heard a third of the songs. Almost all of them fill me with joy.

This time around I’ve realized how many doo-wop songs are in the list, and how good they are. And there are a couple of wonderfully strange old R&B songs that are new to me: such as Brenton Wood’s Oogum Boogum. I’m not enthused about most of the country songs that are new to me, but they might grow on me.

The kids seem to like it too. I had been trying to bombard them with Motown and Stax collections without much success, but the variety of this playlist sometimes grabs them. They like the doo-wop and the girl groups.

The rest of this post is a rather specific snapshot of the distribution and licensing problems that still seem to affect online music, along with a lack of curation by actual humans. I guess it’s representative.

I’ve blogged separately about some of the problems with Google Music specifically.

Multiple versions, re-recordings

I guess this has been possible for ages with Apple’s iTunes (though I use Linux and Android) but only All Access lets me listen to multiple versions of each song to decide which one is the original recording, without forcing me to pay for each attempt. That’s important because many 50s and 60s songs have been re-recorded so badly so many times. Presumably the original contracts wouldn’t have let the artists earn money with the originals, though I doubt they ever made much from the faked greatest hits compilations either.

I used 45cat.com and Google image searches to find pictures of the original 45 rpm singles, whose labels often mention the duration, so I knew where to start.

Just for instance, Google Music has:

(And that’s ignoring the Karaoke versions that exist of almost all songs. I wish I could just turn them off in Google Music.)

It would be great if Google could at least calculate the duration after stripping silence from the start and end of tracks. It would be even better if it analyzed the tracks and offered to group them into apparent duplicates. It would be a Googly problem to solve.
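As a rough illustration of what I mean, here is a minimal Python sketch. It assumes each track is already decoded into a plain list of samples between -1.0 and 1.0. The function names, the silence threshold and the one-second tolerance are all made up for this example; I have no idea how Google Music actually stores or compares tracks.

```python
from collections import defaultdict


def trimmed_duration(samples, sample_rate, threshold=0.01):
    """Return the seconds between the first and last non-silent sample."""
    loud = [i for i, s in enumerate(samples) if abs(s) > threshold]
    if not loud:
        return 0.0  # The whole track is silence.
    return (loud[-1] - loud[0] + 1) / sample_rate


def group_by_duration(tracks, sample_rate, tolerance=1.0):
    """Group track names whose trimmed durations land in the same bucket.

    tracks maps a (hypothetical) track name to its list of samples.
    """
    groups = defaultdict(list)
    for name, samples in tracks.items():
        bucket = round(trimmed_duration(samples, sample_rate) / tolerance)
        groups[bucket].append(name)
    return list(groups.values())
```

Bucketing by rounded duration is crude: two versions that differ by half a second can still land in different buckets. A real duplicate detector would presumably compare the audio itself, but even this would narrow the search.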

It would then be nice to blacklist some compilation albums so they don’t even appear in the search results by default. For instance, this “The Premium Collection” Drifters compilation seems to be all half-hearted re-recordings but this “Essentials” Drifters collection, though incomplete, is full of wonderful.

Google Music is Incomplete

I’ve been far more successful with Google Music than I ever was with second-hand record shops, but I had to go to Amazon for some MP3s, and some were only on iTunes. I’ve also had to buy a few CDs to get stuff that I can’t buy digitally anywhere. That got my count up to 992 out of 1001.

Some of the fairly mainstream stuff that was so hard to find the first time is still hard to find online now. For instance, Mitch Ryder and the Detroit Wheels and Bob Seger seem to have had licensing problems or objections back then that continue even now.

There are probably even more songs from the list that aren’t in Google Music – I had already uploaded my personal collection of MP3s when I started.

Not on Google Music, but on Amazon (amazon.de) as MP3s

  • The Young Rascals: Good Lovin’
    Google has the album that’s meant to have it, but the track is missing.
    amazon.de has the MP3 from that album. amazon.com has the MP3 too.
    It’s also on iTunes, which has the full version of the album.
  • John Lennon: Instant Karma (We All Shine On)
    amazon.de has it (remastered) as an MP3. amazon.com has the MP3 too.
  • Big Youth: Streets in Africa
    It’s on Google Music, but with the artist listed as The Heptones, and with a burst of distortion at the start.
    It’s on amazon.de as MP3 for instance on the “The Chanting Dread Inna Fine Style” compilation album, or amazon.com also as MP3.

Not on Google Music, not on Amazon (amazon.de), but on iTunes.

  • The Beatles: “Ticket to Ride”, “She Loves You”, “I Saw Her Standing There”, “Twist and Shout”, “Help”, “Day Tripper”, “I’m Down”, “Get Back”, “Revolution”, “We Can Work it Out”, “Strawberry Fields Forever”, “I Feel Fine”.
    Of course, it’s well known that iTunes has a Beatles exclusive. That’s just annoying.
  • Lenny O’Henry: Across the Street
    Not on Google, not on Amazon, not on Spotify. iTunes has it.
  • The Manhattans: Follow Your Heart
    It’s on amazon.com as MP3, but amazon.de has it only on a CD. iTunes has it.

On CD Only

These were on CDs, but not on Google, Amazon.de, or iTunes as MP3s. I bought the CDs mostly from Amazon 3rd-party sellers and ebay.

  • Mitch Ryder and the Detroit Wheels: Devil with a Blue Dress On / Good Golly Miss Molly (Medley) and Little Latin Lupe Lu.
    amazon.de has the Rev Up compilation CD. amazon.com has a different compilation album of the same name on MP3.
    Google Music had the album at first but then silently refused to actually play its songs (see below). Google Music also has some later lame re-recordings by Mitch Ryder.
    I used to have a similar compilation on cassette in the car. I’ve missed it.
  • Van Morrison: Wavelength
    amazon.com has the Wavelength album on CD.
    amazon.de has the Wavelength album on CD from third-parties.
    This song felt worth the twenty year wait.
  • Van Morrison: It’s All in the Game
    amazon.com has it as MP3 on the Into The Music album.
    amazon.de has it only on the Into The Music album CD.
  • Bob Seger: Roll Me Away, Night Moves, 2 + 2 = ?, Ramblin’ Gamblin’ Man
    Google Play’s regular search takes me to his “Ultimate Hits” compilation listing a price even though I’m paying for All Access. Strangely, Google Music’s search doesn’t show it. Maybe that’s because it’s not part of All Access for some reason.
    Amazon and iTunes have everything but “2 + 2 = ?” on “Ultimate Hits”, only available on CD on amazon.de but available in MP3 on amazon.com.
  • Johnny Rivers: Secret Agent Man
    amazon.com only has a re-recording as MP3.
    Amazon.de and iTunes don’t have it as MP3 either, though they have an album (compilation?) of that name, without the title song.
    Amazon.de has Bear Family’s Summer Rains compilation on CD. Amazon.com has the CD too.
  • The Parliaments: (I Wanna) Testify
    Google Music only has later (longer, looser) versions, by Parliament instead of The Parliaments.
    iTunes has the single version, correctly listed under The Parliaments.
    amazon.de has it on CD from third-party sellers. amazon.com has it on CD too for crazy prices.
  • David & David: Welcome to the Boomtown
    It’s on amazon.com as an MP3 but amazon.de has it only on the album CD.
  • Planet Patrol: I Didn’t Know I Loved You (‘Til I Saw You Rock and Roll)
    It’s on iTunes, on “The Tommy Boy Story volume 1”, which is also on Google Music, but with only 1 track.
    amazon.com has it as MP3 from that album and amazon.de has it as MP3 too.
  • Larry Williams and Johnny Watson: Mercy Mercy Mercy
    It’s on amazon.de, on the Two For the Price of One album CD.
    It’s on amazon.com, also as MP3.
    It’s not on iTunes either.
  • Jerry Lee Lewis: One Has My Name
    It’s not on Google and not on iTunes.
    It’s on amazon.com as MP3.
    It’s on amazon.de, but only as a CD.
  • Wallace Brothers: Precious Words
    Not on Google, not on iTunes.
    On amazon.com on a compilation, but only as a CD.
    On amazon.de, as a CD.
  • Donna Fargo: A Sign of the Times
    Not on Google Music, not on iTunes.
    On amazon.com as an MP3.
    On amazon.de, but only on a CD from third-parties.
    When I finally heard it, it didn’t seem worth all the effort.
  • War: Slippin’ Into Darkness
    Not on Google Music (though a live version is there.)
    amazon.com has it as an MP3.
    amazon.de has it only on a compilation CD.

Still Searching

The last few songs that I can’t find anywhere (not Google Music, not Amazon MP3s, not iTunes, not Spotify) are:

I have a second version of the playlist that removes some stuff that I’d rather not hear too often, such as Billy Ocean’s creepy (and not that good) “Get Outta My Dreams, Get Into My Car”, talky stuff like the Special AKA’s The Boiler, Donna Fargo’s Sign of the Times, Isaac Hayes’ By The Time I Get To Phoenix, and the Christmas songs.

cluttermm 1.18

cluttermm provides gtkmm-like C++ bindings for the Clutter API. It hasn’t had much attention over the last few years, while the Clutter API has moved on considerably.

I don’t have a great interest in Clutter, though I’d like cluttermm to be ready to inspire gtkmm if any future GTK+ 4 absorbs Clutter. However, Ian Martin has recently put lots of work into updating cluttermm and I’ve been helping him to get his changes upstream and I released some cluttermm-1.17.x tarballs. It’s in the cluttermm-1-18 branch.

This has involved deprecating a huge amount of API and adding (hopefully all of) the API that replaces it. I’ve already removed all this API in the parallel-installable cluttermm git master branch but I have some linker errors that stop that from building, and I’m not going to spend time on it until there are some releases from clutter’s git master (clutter-2.0).

This also means that the cluttermm-tutorial is even more wildly out of date. I’m not likely to spend my free time updating it.

I don’t want to be cluttermm maintainer again, but I didn’t want Ian’s work to be wasted and I hope he wants to keep at it.


glibmm 2.40 and gtkmm 3.12

Only a little late, we released stable glibmm 2.40.0 and gtkmm 3.12.0 versions.

This was only possible thanks to lots of work from Kjell Ahlstedt and Juan Rafael García Blanco. For instance, Juan added the Gtk::ActionBar, Gtk::FlowBox and Gtk::Popover classes, after having added Gtk::HeaderBar, Gtk::PlacesSidebar, Gtk::Revealer and Gtk::SearchBar in gtkmm 3.10. They also added example code in gtkmm-documentation.

Openismus Over

Last week I was at a notary here in Munich to officially put Openismus GmbH into its liquidation phase, after seven years. The company is closing down, though with no debts and with a little left over. I feel good about that.

This has been the plan for a while since it became harder to get reliable customer work, though that was more a result of the company structure and my own time constraints than any particular change in the tech economy. It became possible once the last few employees had found good jobs to move on to.

I got past any sadness about this a long time ago. I guess it would be nice for things to be running along at their best again, but there was never a sense of security and always a stressful balance of risk and responsibility. It was a good problem to have.

For the last year or so I’ve mostly been busy with all the tedious work of shutting down the offices in Munich and Berlin along with the day to day paperwork involved in a company. Now I feel a sense of relief to be free of these responsibilities.

It will be a few months until I start to look seriously for what’s next. I’ll probably look for a nice stable development and management job here in Munich, and I’ll try particularly hard to find something that lets me work part time so I can pick my kids up from school.

In the meantime I’m enjoying the small sense of achievement that comes from taking care of all the little things I’ve let slip over the past few years and catching up with a bunch of tech stuff that I haven’t had time to learn in depth.

Lego Wedo with MIT’s Scratch: Simple Robotics for Kids

Lego Wedo

Last week my son, who has just turned 6, tried out the Lego Wedo kit that I’ve had sitting in the cupboard until I thought he was ready. It’s a very simple system of sensors and a motor that plug into a computer via a USB hub so the child can write simple programs to control it. For instance, the program can turn the motor on or off depending on whether a sensor detects something.

It’s a nice simple first step into programming and robotics for kids that aren’t old enough to deal with Lego Mindstorms, which I guess needs a higher level of reading and writing skills. Controlling real objects in the real world is interesting enough to small children. As robotics is not yet pervasive in everyday life, the limited functionality is helpfully simple without seeming unimpressive.

It’s the best equivalent I’ve found for my own first programming experiences with BASIC on the Sinclair ZX81 when I was 8 years old. That didn’t allow anything but programming from the moment you turned it on, and the BASIC keywords were just a keyboard press away. Putting text on the screen and reading text input was new enough to keep my interest. Today, children expect computers to do much more impressive things on a screen. But they don’t expect so much from Lego.

I talked my son through building the programs, encouraging him to start with simple steps, checking that they worked, finding out why they didn’t work, and then building upon that step, until he had a whole program. For instance, first we would make something move, then make it move as we wanted, and only then think about how to make it move only at certain times or how to make things happen onscreen. Plenty went wrong so he had a chance to learn that errors and debugging are normal.

Lego Wedo is part of the Lego Education line, which is aimed at schools, and their procurement processes, rather than general consumers. This need to support a network of distributors, and to provide curriculum support and lesson plans, is probably why it’s so expensive. The Basic Lego Wedo set (Lego kit 9580) is 129.95 Dollars in the US, or 148.74 EUR in Germany. Until recently, it was hard to buy this stuff as an individual, at least here in Germany, but the Lego Education online shops (listed here) now allow this. And the Lego Wedo kit is available on Amazon.de from third parties. It’s probably worth looking on ebay too.

Scratch instead of the Lego Wedo Software

The kit is useless without software, so you usually need to buy the Lego Wedo software separately, making the whole thing even more expensive (89.95 USD in the US, or 101.14 EUR in Germany, though I think it used to be twice as much). It runs only on Windows and Mac.

Luckily, MIT’s Scratch 1.4 has support for Lego Wedo. It “just works” on Linux, at least on Ubuntu Linux and Fedora Linux. Scratch is available for Windows and Mac, so I guess that it will work there too.

We have an OLPC XO laptop which has Scratch installed by default, which works perfectly with the Lego Wedo. I recommend this setup wholeheartedly because then the child doesn’t need to deal with all the surrounding crap in an adult’s operating system and doesn’t need permission or help to start playing. Unfortunately, I don’t know how individuals can buy these now, though you might have luck on ebay.

I’m not a fan of Scratch’s user interface because it expects kids to drag and drop, or click, on tiny targets and read tiny text. I guess that the official software is much easier. But Scratch is way better than nothing, and my 6-year-old child can handle it now.

When the Lego Wedo USB Hub is connected, the Motion and Sensing areas have extra Lego Wedo Scratch blocks for use with the motors and sensors.

Unfortunately, the latest version (2.0) of Scratch, as seen on the main Scratch site, is browser-based, using Flash, though they are apparently aiming to rewrite Scratch in JavaScript. It doesn’t seem likely that the online version will support hardware such as Lego Wedo any time soon, though it’s apparently in progress. Luckily the older versions are still available and still work.

Lego Wedo Hardware

The Lego Wedo basic kit has one motor, a distance sensor and a tilt sensor, along with the lego pieces, such as blocks, cogs, and axles, needed to build the models seen below. That’s obviously fairly limiting, but it’s enough for some first steps for young children.

The official Lego software can apparently control multiple motors, and multiple USB hubs, but MIT’s Scratch can only control one motor without installing a custom thingy.

The Lego Wedo motor is the same as the Power Function motors that can be bought separately or that come with big Lego kits such as the Lego Technic 4×4 Crawler which my son has just built. However, these motors and sensors are not compatible with the Lego Mindstorms system (neither the older NXT nor the newer EV3 systems), which is rather annoying.

Lego Wedo Models

We built a few of the basic Lego Wedo models using the online PDF instructions. The advanced models, such as the ferris wheel, need (unusual) parts from the LEGO Education WeDo Resource Set (Or “Expansion Set” – Lego kit 9585) but we might try building something similar with what we have.

We had to think up suitable Scratch programs for ourselves, but that’s a useful exercise. The Lego Wedo Teachers guide (see Activities here) has some helpful suggestions about what you might try to achieve.

We built these models:
(View this at the original page if you can’t see the (done in a rush) embedded videos.)

Hungry Alligator

The alligator snaps its jaws open and shut when your finger gets close enough. Our program kept checking the distance sensor, and when that had a small enough value, we turned the motor on in one direction for some time and then in the other direction for some time. We had to make sure that the mouth was open when it finished, or it would block its own sensor and keep snapping forever.
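For readers who don’t use Scratch, the logic described above looks roughly like this in Python. This is only a simulation of the decision-making, not our actual Scratch program: the function names, the threshold of 10 and the half-second timings are invented for the example, and nothing here talks to the real hardware.

```python
def alligator_step(distance, threshold=10):
    """Return the motor commands for one pass through the loop.

    Each command is a (direction, seconds) pair. The values are
    illustrative, not the ones from our Scratch program.
    """
    if distance < threshold:
        # Snap shut, then reopen, so the jaw never ends up
        # blocking its own distance sensor.
        return [("close", 0.5), ("open", 0.5)]
    return []


def run(readings, threshold=10):
    """Feed a sequence of simulated sensor readings through the loop."""
    commands = []
    for distance in readings:
        commands.extend(alligator_step(distance, threshold))
    return commands
```

Calling run([50, 5, 50]) produces one close-then-open pair, for the single reading that is below the threshold.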

Here’s a picture of our program:

Goal Keeper

The goal keeper moves around in front of the goal. The distance sensor behind the goal detects balls moving past it, so the program can keep score. We had two sets of blocks – one to move the goal keeper at random, and one to keep checking the sensor, incrementing a variable and showing its value on screen. We hacked in a wait to stop it from incrementing the variable too much as the ball passed the sensor but this could be cleverer.
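The score-keeping half, including the wait hack, can be sketched in Python like this. Again it is a simulation with made-up names and numbers (a threshold of 10, a cooldown of 3 readings), not a copy of our Scratch blocks.

```python
def count_goals(readings, threshold=10, cooldown=3):
    """Count balls passing the sensor.

    After each goal, the next `cooldown` readings are skipped so that
    one ball is not counted several times while it is still in front
    of the sensor.
    """
    goals = 0
    wait = 0
    for distance in readings:
        if wait > 0:
            wait -= 1  # Still waiting after the previous goal.
            continue
        if distance < threshold:
            goals += 1
            wait = cooldown
    return goals
```

With the readings [50, 5, 5, 5, 50, 50, 5, 50] this counts 2 goals, where a version without the wait (cooldown=0) would count 4. A cleverer version might count only the moment when the sensor changes from clear to blocked, instead of relying on a fixed wait.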

Here’s a picture of our program:

Flying Bird

The bird’s body and wings can be moved manually. There’s no motor in this model. It’s just a fancy way to trigger the distance and tilt sensors. When the body moves, the tilt sensor triggers a flapping sound from the computer, and when the head blocks the distance sensor, the computer makes a chirping sound. Scratch has a bird chirp sound by default but we had to record ourselves saying “flap” for the other sound.

The tilt sensor is rather imprecise, and only provides one value at a time to indicate one of flat, up, down, left or right, which doesn’t make much sense. Scratch displays the value as a number between 0 and 4 which makes things hard for kids. The meaning of each tilt value is mentioned here.

Drumming Monkey

The monkey’s arms move up and down due to the motion of the rotating cams attached to the motor, letting you change the rhythm by changing the motor’s speed and duration, and by changing the positions of the cams and the arms. This model has no sensors. It’s rather underwhelming.

Here’s a picture of our program:

jhbuild and clang’s scan-build

I love C and C++ compiler warnings, particularly when I can fix the problems they show. I love getting new warnings for code I thought was clean. There are always a few warnings that show serious mistakes.

I recently noticed that, since 2011, jhbuild can do some extra static code analysis, using clang’s scan-build tool. This is thanks to Jeremy Huddleston and friends.

It’s pretty simple, though I don’t think it’s in the jhbuild manual yet. It should work if you just install scan-build (from your distro package, such as the clang-analyzer package on Fedora), and add this line to your .config/jhbuildrc file in your home directory:

static_analyzer = True

My scan-build here on Fedora 19 didn’t seem to find its own clang executable, so I had to add this line too:

static_analyzer_template = 'scan-build --use-analyzer=/usr/bin/clang -v -o %(outputdir)s/%(module)s'

I wonder if any systems are running this regularly. I’d like to just view it online every now and then. My local /tmp/jhbuild_static_analyzer/ directory is full of lovely scan-build reports for various GNOME modules but I’m sure others would enjoy that bounty.
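In the meantime, something like this small Python script can list the reports that are already there. It assumes that scan-build leaves an index.html in each report directory that it creates under the output directory, which is what I see here, but I haven’t checked that across scan-build versions. The find_reports name is just for this example.

```python
from pathlib import Path


def find_reports(root):
    """Return the sorted paths of every index.html found below root."""
    root = Path(root)
    if not root.is_dir():
        return []  # Nothing has been analyzed yet.
    return sorted(root.rglob("index.html"))


if __name__ == "__main__":
    for report in find_reports("/tmp/jhbuild_static_analyzer"):
        print(report)
```

Each path printed can be opened directly in a browser.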