CNK's Blog

Getting started with Erlang

Tomorrow Tim and I start a month-long Erlang class from Erlang Solutions. The class notes say you should come with Erlang installed, but the easy option for installing (brew install erlang) would have given me R14 (and R16 is just about out) and the official instructions on how to install from source were a little off-putting. I really don’t want to use MacPorts if I don’t have to. We use it at work and its dependency management is a bit messy, so you end up installing the entire universe to get the one tool you want.

Fortunately the Erlang Solutions folks have a DMG installer for Erlang R16A at: https://www.erlang-solutions.com/downloads/download-erlang-otp. It says it is compiled for Snow Leopard (10.6.8) but the DMG installer ran fine and the little bit of playing around I did in the Erlang shell seems to work.

I was very pleased that the Getting Started docs mentioned that Erlang ships with a set of tools - including an Erlang mode for Emacs. The binary installer I used placed the lib files in /usr/local/lib/erlang/ rather than /usr/local/otp but with that small change, I now have emacs with Erlang support!!

Can’t wait for class to start!

Etags

Time to use ctags (or really etags in emacs). Most of the time I work on projects that are small enough that I can keep a lot of what I need to know in my head - or look up information from my frameworks on the internet. However, I am currently working on some code that I am having trouble sorting through. All of it is home grown - written by someone who no longer works with us - so no way to find documentation.

Parts of the code are quite clear and have been pretty easy to modify, but I am having trouble tracing the larger flow of data and messages. So perhaps the answer is a tool that would make it easier to go forward and back (particularly back) within the code.

It looks like I have ctags; however, a bunch of tutorials I have looked at suggest that one should use exuberant ctags instead of just plain ctags. May as well give that a try. The instructions in this blog post say you can just use brew install ctags-exuberant. That worked fine until the symlink stage when brew said I already had something at /usr/local/bin/ctags. Looks like ctags is one of the things that was installed when Homebrew installed Emacs 24.1 for me. I might want to go back to that version, so I changed the name of that symlink to ctags-from-emacs and then reran the linking step. Now I see:

    $ /usr/local/bin/ctags --version
    Exuberant Ctags 5.8, Copyright (C) 1996-2009 Darren Hiebert
      Compiled: Jan 26 2013, 23:23:25
      Addresses: <dhiebert@users.sourceforge.net>, http://ctags.sourceforge.net
      Optional compiled features: +wildcards, +regex

And, more importantly, when I run ctags --list-languages I see Ruby in the list of supported languages, which I didn’t see in the ctags version that came with emacs.

I went looking for advice on how to generate my TAGS file and found mactag which will let you set up a configuration file to tell ctags which things you want indexed (your code, gems, etc) and where to find them. Looks pretty useful.

Notes on 'Practical Object-Oriented Design In Ruby'

I have been looking forward to reading Sandi Metz’s “Practical Object-Oriented Design In Ruby” since I heard she was writing it. The LA Ruby Study Group has chosen it as our next book, so I’ll have some folks to discuss it with. But I still want to record some ideas I have been struggling with as I read.

First, I am surprised that Sandi manages to be so thought-provoking with such concise examples. Chapters 2 and 3 revolve around a code example that contains about 50 lines of code. But she still manages to create several plausible alternative implementations, each with its own advantages and faults. Her examples remind me of problems I have run into in other code. More importantly, the book offers ideas for refactoring such messes - but with the following caution against over-engineering:

Do not feel compelled to make design decisions prematurely. … When the future cost of doing nothing is the same as the current cost, postpone the decision. Make the decision only when you must with the information you have at the time.

Chapters 2 & 3 - Constructing Objects

Depend on behavior, not data.

Concretely this usually amounts to accessing data/attributes via their getters (and setters). At first that seems a bit high-ceremony for Ruby - but being Ruby, it really isn’t. If you don’t need the getter to do anything special, you can create it with attr_reader :blah. 99% of the time the result of the method #blah is just going to be @blah. When you find something in that last 1%, it is great that the only refactoring you need to do is to define a more complex getter #blah.
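
A minimal sketch of that idea, assuming a hypothetical Gear class (the class, attribute names, and the "worn tooth" rule are made up for illustration, not taken from the book). The ratio method only ever talks to the getters, so swapping in a smarter getter later does not touch it:

```ruby
# Hypothetical class for illustration only.
class Gear
  attr_reader :chainring, :cog   # the 99% case: plain getters

  def initialize(chainring, cog)
    @chainring = chainring
    @cog       = cog
  end

  # Depends on the getters (behavior), never on @chainring / @cog directly.
  def ratio
    chainring / cog.to_f
  end
end

# The 1% case: only the getter changes; ratio is inherited untouched.
class WornGear < Gear
  def cog
    @cog - 1   # pretend one tooth has worn away
  end
end

puts Gear.new(52, 11).ratio
puts WornGear.new(52, 11).ratio
```

A subclass is used here only to keep both versions side by side; in real code you would more likely just replace the attr_reader with the hand-written getter.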

If you have some data that needs to travel together but isn’t really enough to warrant its own class (yet), then use a Ruby Struct to bundle it up - with named attributes to make its meaning clearer.
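
For example, a rough sketch assuming the incoming data is an array of [rim, tire] pairs (the WheelSize name and the numbers are hypothetical):

```ruby
# Hypothetical Struct: gives names to the two values that travel together.
WheelSize = Struct.new(:rim, :tire)

# Convert anonymous pairs into named Structs once, then work with names
# instead of array positions like pair[0] and pair[1].
def diameters(pairs)
  pairs.map { |rim, tire| WheelSize.new(rim, tire) }
       .map { |wheel| wheel.rim + (wheel.tire * 2) }
end

puts diameters([[622, 20], [559, 30]]).inspect   # prints [662, 619]
```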

Refactor to reveal intent

The book is filled with gems like this one - after a section demonstrating several very small refactorings:

Do these refactorings even when you don’t know the ultimate design. They are needed, not because the design is clear, but because it isn’t. You do not have to know where you’re going to use good design practices to get there. Good practices reveal design.

Isolate dependencies

One of the best ways to reduce coupling between classes is through dependency injection. Where possible, pass in the things you depend on as parameters. One immediate payoff for this is that it makes your testing easier. Instead of using mocks and stubs to intercept method calls while running your tests, you can just pass in an appropriately constructed fake that provides just enough support so you can write your tests. For example, if the current test depends on data from the class you are depending on, instead of passing in the entire object, pass in a Struct containing the data you need for the current test.
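
Here is a small sketch of what that can look like. The Drivetrain and FakeWheel names are hypothetical; the only assumption Drivetrain makes is that whatever it is handed responds to #diameter:

```ruby
# Hypothetical class for illustration only.
class Drivetrain
  attr_reader :chainring, :cog, :wheel

  # The wheel is injected rather than built inside this class.
  def initialize(chainring, cog, wheel)
    @chainring = chainring
    @cog       = cog
    @wheel     = wheel
  end

  def gear_inches
    (chainring / cog.to_f) * wheel.diameter
  end
end

# In a test, a Struct is enough of a stand-in for the real wheel class.
FakeWheel = Struct.new(:diameter)

drivetrain = Drivetrain.new(52, 13, FakeWheel.new(26))
puts drivetrain.gear_inches   # prints 104.0
```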

Sometimes it isn’t feasible to refactor to use dependency injection. When your code already has some issues with tight coupling, you may not be able to fully extract a hidden object right away - or you may not be able to change the class’s initialization signature without breaking a ton of other things. So the book shows examples of using a wrapper to initialize your object using the interface you wish you had - or of isolating the methods that are making you wish you had a separate, dependent object so they are ready to extract when you can (p 32).

Sandi also showed an example of creating a method in your class whose entire purpose is to wrap a call made on a dependent object. This can be particularly useful if that dependent object is in active development and frequently changes its method signatures - or if you are afraid that call to the external dependency will be overlooked within a much larger method (p 50).
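
Roughly, and with hypothetical Wheel and Odometer classes standing in for the book's example, the wrapper looks like this. If Wheel ever renames #diameter, only the one private method has to change:

```ruby
# Hypothetical collaborator whose method names might change.
class Wheel
  def initialize(rim, tire)
    @rim  = rim
    @tire = tire
  end

  def diameter
    @rim + (@tire * 2)
  end
end

class Odometer
  def initialize(wheel)
    @wheel = wheel
  end

  # Distance covered after a number of wheel revolutions.
  def distance(revolutions)
    revolutions * Math::PI * wheel_diameter
  end

  private

  # The single place in this class that knows Wheel responds to #diameter.
  def wheel_diameter
    @wheel.diameter
  end
end

puts Odometer.new(Wheel.new(26, 1.5)).distance(10)
```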

Using hashes for initialization (and merging them with a hash of default attributes) is very useful. It frees you from trying to recall an order for the initialization parameters and helps instance creation code serve as some of the documentation about what the object contains.
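
A minimal sketch of the pattern, using a hypothetical Tire class and made-up default values:

```ruby
# Hypothetical class for illustration only.
class Tire
  attr_reader :width, :pressure

  def initialize(args = {})
    args = defaults.merge(args)   # the caller's values win over the defaults
    @width    = args[:width]
    @pressure = args[:pressure]
  end

  def defaults
    { :width => 23, :pressure => 100 }
  end
end

# Order does not matter and the call site documents itself.
tire = Tire.new(:pressure => 80)
puts tire.width      # prints 23
puts tire.pressure   # prints 80
```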

Chapter 4 - Creating Flexible Interfaces

Once your object has a single responsibility, then you need to work on giving it an optimal interface.

Object-oriented applications are defined by the messages that pass between objects.

This chapter focuses on how to determine if your messages are right: are you sending the right messages? and are you sending them to the right receiver? On the sending side, the message should specify what it wants, not how the receiving object should behave. If the sender is doing a lot of micro-management, then perhaps the sender needs to fully delegate to the receiver. If the receiver does not have all the knowledge to take care of the delegated request, that may be a sign that you need some other intermediate object that manages the request.

Context

The things that Trip knows about other objects make up its context…. The context that an object expects has a direct effect on how difficult it is to reuse…. Objects that have a complicated context are hard to use and hard to test; they require complicated setup before they can do anything.

I recognize the complicated setup code smell but I hadn’t explicitly thought about having a lot of context in terms of an object knowing too much about its collaborators. Does your class make a bunch of calls to methods in other objects? If so, even if you have minimized coupling by using dependency injection, your object knows the names of many methods in its collaborators - and may need to know a lot about the parameters for those methods. The second refactoring of this chapter (fig 4.7 on p 72) gives an example of how to reduce what a trip needs to know about its collaborator, the mechanic. Instead of handing the mechanic individual bicycles and asking him to prepare them, the trip just tells the mechanic to make the preparations it needs to make for this trip. This is how you move to specifying what you want done, not how you want it done - by increasing the trust with which one object delegates to another.

The examples in the book are great, but I do have one question about the example on p 72, figure 4.7. Doesn’t passing the trip instance along to the mechanic as the argument to the prepare method potentially increase the coupling between the trip and mechanic classes? Not really - it merely changes which object is in control. One of the two classes needs to know that they collaborate around preparing bicycles. In the initial code, the trip knows about bicycles and it knows that the mechanic needs to prepare them. In the final example, the mechanic knows it is responsible for preparing bicycles and asks the trip to hand them over. The point of the trip passing ‘self’ when calling the mechanic’s prepare method is 1) it is a form of dependency injection that facilitates isolated testing and 2) if sometime later the mechanic’s preparations change to need more information from my_trip than just the list of bicycles, then we don’t have to add additional parameters to my_trip’s call to my_mechanic#prepare. When I first saw that it felt like the mechanic instance suddenly had a much closer relationship with EVERYTHING about a trip, but in practice, my_mechanic could always have queried my_trip for that information anyway using my_trip’s public interface. Passing the trip instance into my_mechanic encourages the mechanic class to access what ever information it needs from my_trip via that injected dependency.
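
To make that concrete for myself, here is a rough sketch of the final arrangement. It is my own simplification, not the book's code: the method bodies and return values are invented, and only the shape (trip passes self, mechanic asks trip for bicycles) follows the discussion above.

```ruby
class Mechanic
  # The mechanic asks the trip for what it needs via the trip's public interface.
  def prepare(trip)
    trip.bicycles.map { |bicycle| "checked #{bicycle}" }
  end
end

class Trip
  attr_reader :bicycles

  def initialize(bicycles, mechanic)
    @bicycles = bicycles
    @mechanic = mechanic   # injected collaborator
  end

  # Trip says what it wants (to be prepared), not how to do it.
  def prepare
    @mechanic.prepare(self)
  end
end

trip = Trip.new(['road bike', 'tandem'], Mechanic.new)
puts trip.prepare.inspect   # prints ["checked road bike", "checked tandem"]
```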

Perhaps I am so wowed by POODR because it seems to anticipate the exact difficulties I have. The very next section, “Trusting Other Objects”, directly addresses my unease with example 4.7 and points out that now that trip is fully delegating the bicycle preparations to the mechanic, you could use the same strategy to delegate different preparations to other classes - using the exact same interface. For example, you could loop over an array of collaborators and call prepare(self) on each.

This blind trust is a keystone of object-oriented design. It allows objects to collaborate without binding themselves to context and is necessary in any application that expects to grow and change.

So I guess the answer is that I just must get comfortable with this design paradigm, sometimes summarized as “Tell, don’t ask”.

Law of Demeter

The last section of the chapter discusses how to fix long message chains (Law of Demeter violations) using a message passing perspective. Long method chains are problematic because they tie your object to specific public methods of several other objects. This increases the chances that your object may need to change because of changes in a distant object.

The train wrecks of Demeter violations are clues that there are objects whose public interfaces are lacking.

Instead of using the existing public interfaces of the intermediate objects to construct these long chains, you need to figure out what additional public interfaces you need.

Focusing on messages reveals objects that might otherwise be overlooked. When messages are trusting and ask for what the sender wants instead of telling the receiver how to behave, objects naturally evolve public interfaces that are flexible and reusable in novel and unexpected ways.

Chapter 5 - Duck Typing

Methods that check kind_of? or respond_to? before sending a message are both indications that your object doesn’t trust its collaborators to do the right thing. When you see this, you know you are missing an abstraction which would unify your collaborators. When you have discovered this abstraction, sometimes it is sufficient to add a single method to each of the collaborators. In Sandi’s example, each collaborator class got a prepare_trip method in which their part of the trip preparations could be defined. Then instead of trip micro-managing the preparations, it can just call prepare_trip on each of its collaborators and let them take care of it.
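
A sketch of the resulting duck type, assuming two preparer classes (the strings they return and the constructor arguments are invented for illustration). Trip never asks what class a preparer is; it just sends prepare_trip:

```ruby
class Mechanic
  def prepare_trip(trip)
    "mechanic tuned #{trip.bicycles.size} bicycles"
  end
end

class TripCoordinator
  def prepare_trip(trip)
    "coordinator bought food for #{trip.customers.size} customers"
  end
end

class Trip
  attr_reader :bicycles, :customers

  def initialize(bicycles, customers)
    @bicycles  = bicycles
    @customers = customers
  end

  # No kind_of? or respond_to? checks: every preparer is trusted
  # to respond to prepare_trip.
  def prepare(preparers)
    preparers.map { |preparer| preparer.prepare_trip(self) }
  end
end

trip = Trip.new(['road bike', 'tandem'], ['Ann', 'Bob', 'Cy'])
puts trip.prepare([Mechanic.new, TripCoordinator.new])
```

Adding a third kind of preparer later means writing one new class with a prepare_trip method; Trip does not change.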

Chapter 6 - Acquiring Behavior Through Inheritance

In Ruby you can affect an object’s method lookup tree (aka inheritance hierarchy) in a couple of ways. You can create a Class -> SubClass relationship. You can use extend and include to add modules. Or you may add methods to a class’s singleton class. There are some differences (e.g. you cannot create an instance of a module, only a class) but to a first-order approximation, these three things are the same. All of them add methods which can be found automatically by your object. If you set up these inheritance relationships correctly, that’s great. But done incorrectly it’s a recipe for unexpected failures. Fortunately Sandi provides some great advice on how to stay out of trouble.

First, how do you know you need subclasses? One clue is often having a variable called type or category and methods that check the value of that variable to decide what to do. Sandi’s first piece of advice is to take note of this sign - but to wait until your category or list gets a third member before refactoring to use inheritance. Having more examples makes it easier for you to figure out what behavior should be in the parent class and what is specific to the subclasses. When you have enough information to create your class hierarchy, create the superclass as an empty class and have your existing class inherit from it. Then start fleshing out your other subclasses. Any time your second subclass needs a method (or version of a method) that is in your original class (now considered your first subclass), refactor the method to move the shared behavior up to the superclass. If you are rigorous about only moving abstract behavior up into the superclass, you avoid much unnecessary overriding of methods to work around an imperfect abstraction in your superclass.

Template Method Pattern

One thing that often differs between subclasses is the defaults; in the example in chapter 6, road bikes and mountain bikes have different default tire sizes. So each subclass will need to have a default_tire_size method with a different value. In addition, it is important that the parent class also have a default_tire_size method - even if all it does is raise a NotImplementedError. That way, any additional bike type you create that forgets to supply its own default fails immediately with an error that names the missing method.

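    class Bicycle
      # Subclasses must override this; the base class only documents the requirement.
      def default_tire_size
        raise NotImplementedError, "Instances of #{self.class} cannot respond to: 'default_tire_size'"
      end
    end

    # A new bike type that has not yet defined its own default.
    class RecumbentBicycle < Bicycle
    end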

    irb> RecumbentBicycle.new.default_tire_size
      NotImplementedError:
        Instances of RecumbentBicycle cannot respond to: 'default_tire_size'

If RecumbentBicycle is a Bicycle, then by some perspectives the two classes are by definition tightly coupled. But you should still employ techniques to spare your subclass from having to know details of how its superclasses implement methods it wants to extend. A class’s initialize method is one that subclasses often need to override. And a common mistake is to forget to call super at the appropriate point in your subclass’s initialize method.

…forcing a subclass to know how to interact with its abstract superclass causes many problems. It pushes knowledge of the algorithm down into the subclasses, forcing each to explicitly send super to participate. It causes duplication of code across subclasses, requiring that all send super in exactly the same places. And it raises the chance that future programmers will create errors when writing new subclasses, because programmers can be relied upon to include the correct specializations but can easily forget to send super.

Hook Messages

One way around the super problem is for the superclass to send hook messages at appropriate integration points. If a subclass needs to add or modify the behavior of the superclass, it can implement an appropriate hook method. As noted above, the superclass must always have an implementation of any shared methods, though usually the superclass’s hook method is just a no-op.
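
A minimal sketch of an initialization hook. I am calling the hook post_initialize here; treat that name, the attributes, and the values as assumptions for illustration:

```ruby
class Bicycle
  attr_reader :size

  def initialize(args = {})
    @size = args[:size]
    post_initialize(args)   # hook: subclasses contribute here, no super needed
  end

  # Default hook is a no-op so every subclass can safely be sent the message.
  def post_initialize(_args)
    nil
  end
end

class RoadBike < Bicycle
  attr_reader :tape_color

  # Only the specialization lives here; there is no initialize to get wrong.
  def post_initialize(args)
    @tape_color = args[:tape_color]
  end
end

bike = RoadBike.new(:size => 'M', :tape_color => 'red')
puts bike.size         # prints M
puts bike.tape_color   # prints red
```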

A similar pattern when dealing with shared and specialized data is to have the shared attributes defined in the parent class, e.g. in the spares method in the book’s example. Then each subclass overrides the method to add additional attributes. Again this can be an invitation to forget to merge the shared data from the parent class with your specializations. A safer pattern is for the parent class to manage the shared data - and to manage melding in the specialized data from each subclass. In our spares example, the parent class declares the spares method with all the shared information. Then it calls local_spares to get any additional data and merges it into the method’s output. So instead of declaring its own spares method, the subclass adds specialized data by implementing local_spares.
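
Sketched out, with made-up spare parts and values standing in for the book's data:

```ruby
class Bicycle
  # The parent owns the shared data and does the merging itself.
  def spares
    { :tire_size => '23', :chain => '10-speed' }.merge(local_spares)
  end

  # Hook with an empty default, so subclasses without extras need do nothing.
  def local_spares
    {}
  end
end

class MountainBike < Bicycle
  # The subclass only states what is special about it.
  def local_spares
    { :rear_shock => 'Manitou' }
  end
end

puts MountainBike.new.spares.inspect
```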

Ubuntu 12.04 on VirtualBox 4.2

I got a call today from someone who had upgraded their desktop from Ubuntu 8 to Ubuntu 12 and was having trouble getting Ruby reinstalled. She had previously installed Ruby from source (as apparently everyone needs to do in the Debian/Ubuntu world) without any problem - but not this time. I do a reasonable amount of Ruby and system administration but I work in a RedHat shop so I don’t know a ton about Debian-derived distros. But I thought I would try to see what I could do to help.

Installing Ubuntu 12.04

First thing I need is an Ubuntu box to practice on. I had VirtualBox installed and have used Vagrant to manage some boxes. However, since Vagrant is written in Ruby and usually does its configuration of the box with Chef or Puppet, I think the premade vagrant images will already have Ruby installed - making it difficult to figure out the correct steps for installing Ruby from source. So I think I should install Ubuntu without my favorite virtualization tool.

First, I upgraded my VirtualBox software to 4.2.4. (My older boxes appear to still run but they appear to have lost their Guest Additions; I’ll have to go back and sort that out).

Ubuntu is pretty common so I searched for a prebuilt (hopefully minimal) VirtualBox image and found a promising looking one at http://virtualboxes.org/images/ubuntu/. I downloaded this one:

    Ubuntu Linux 12.04 x86
    Size (compressed/uncompressed): 769 MB/3.2 GB
    Link: http://sourceforge.net/projects/virtualboximage/files/Ubuntu%20Linux/12.04/ubuntu_12.04-x86.7z
    Active user account(s) (username/password): ubuntu/reverse
    Notes: Guest Additions NOT installed; tip: set Video RAM 64MB minimum

First issue: that is compressed with 7zip so I’ll need p7zip to uncompress it. Fortunately I have Homebrew installed so I just did:

    $ brew install 7zip

This installed a formula it said was p7zip from https://github.com/mxcl/homebrew/commits/master/Library/Formula/p7zip.rb and brew list p7zip shows:

    $ brew list p7zip
    /usr/local/Cellar/p7zip/9.20.1/bin/7zr
    /usr/local/Cellar/p7zip/9.20.1/bin/7za
    /usr/local/Cellar/p7zip/9.20.1/bin/7z
    /usr/local/Cellar/p7zip/9.20.1/lib/p7zip/ (6 files)
    /usr/local/Cellar/p7zip/9.20.1/share/doc/ (50 files)
    /usr/local/Cellar/p7zip/9.20.1/share/man/ (3 files)

After a little fishing around I think the uncompress command is ‘7z x’:

    $ man 7z
    $ 7z x Downloads/ubuntu_12.04-x86.7z
    Extracting  ubuntu_12.04/ubuntu_12.04.vbox

This produces a folder named ubuntu_12.04 containing a small .vbox file and a 3.5 GB .vdi file. The latter is the ‘virtual disk image’ file that we will want to use to create our new VM. I followed these instructions (http://www.thelinuxdaily.com/2010/02/how-to-setup-a-pre-built-virtualbox-guest-image-tutorialguide/) and created a new VM with 512 MB RAM, using the pre-existing .vdi file. Then I just started the newly created VM from the menu.

There are a couple of peculiarities - the main one is that the requested keyboard is Italian. I found the keyboard settings and chose an English keyboard layout instead. Then I changed the update server to the server for the US and upgraded packages. The GUI says it wants to do 489 updates!

VirtualBox Guest Additions

Oh, and I don’t need the Guest Additions for preventing the VirtualBox from capturing my mouse but I do need them so I can cut and paste between the host and guest OSs. Under the VirtualBox Devices menu there is an item for Install Guest Additions. I clicked that to get them installed. However, even though I rebooted the virtual machine, I still can’t copy and paste between the VM and my Mac.

Installing Ruby

OK now for installing Ruby. The person I am helping was installing Ruby globally from source but I wanted to see if RVM could install Ruby 1.8.7 for me. First I installed rvm:

    $ curl -L https://get.rvm.io | bash

That added a .bash_profile to my home directory. That didn’t play well with my shell in emacs so I moved that line to my .bashrc file. Then I installed ruby and made it my default ruby:

    $ rvm install 1.8.7
    $ rvm use --default 1.8.7

That worked just fine - but I can’t install any gems.

    $ gem install bundler
    ERROR: Loading command: install (LoadError)
           no such file to load -- zlib
    ERROR: While executing gem ... (NameError)
           uninitialized constant Gem::Commands::InstallCommand

Sounds like I am missing some dependencies. Fortunately RVM will tell me what it needs - and in Ubuntu syntax:

    $ rvm requirements
    $ sudo apt-get install build-essential openssl libreadline6 \
         libreadline6-dev curl git-core zlib1g zlib1g-dev libssl-dev \
         libyaml-dev libsqlite3-dev sqlite3 libxml2-dev libxslt-dev \
         autoconf libc6-dev ncurses-dev automake libtool bison \
         subversion pkg-config

However, once I installed those, I still wasn’t able to install any gems. Apparently I should have installed the prerequisites BEFORE I installed the ruby. Once I got the order correct, installing ruby 1.8.7 via RVM also installed 4 gems:

    bundler
    rake
    rubygems-bundler
    rvm

Packaging Ruby Code

For the last month or so the LA Ruby Study group has been reading “Building Awesome Command-Line Applications in Ruby”. For the most part it is a pretty good book and I think it makes a good intro book for folks who are new to Ruby, new to programming, and new to Unix. I get the sense that not everyone in the group is enjoying the tour of ‘the Unix way’ as much as I am but perhaps someday they will have reason to need ‘man’ or pipe (‘|’) and will remember that they heard about that in study group.

The examples the author chose - a script to backup MySQL databases and a to do application - are pretty good: small enough to be understood in 5 minutes but with plenty of room to add features, make tweaks, improve things in some small way. I am particularly enjoying the to do app because I haven’t written a command suite before. Judging from dependencies I have seen while installing other gems, the thor gem seems to be more popular than the GLI gem the author chose but I am pretty impressed with the breadth of support GLI’s scaffold created for us. For example the scaffold generates a Rakefile with a nicely tailored task for building Rdoc documentation. Before I found that, I had tried just ‘rdoc todo’ but that tried to process every file in the directory - including several that had no reason to be in my Rdocs. I could have read the rdoc docs and figured out how to document only the project files I need to, but the generated rake task is a LOT easier - and gave me some examples of parameters I might want to set to customize further.

This week we are discussing how to distribute your code - as a Gem, an RPM, and as source on GitHub. The generated gemspec looks pretty straightforward and the gem builds and installs. And there are instructions on how to distribute open source gems on rubygems.org and how to run a private gem server if you need to do that.

The author then goes on to repackaging your gem as an RPM using gem2rpm. His instructions seem to work - though these are a bit more concise. The generated RPM spec file looks reasonable - except for the fact that the development/testing dependencies are listed as full on ‘Requires’ when they are not actually needed to run the program. This Stack Overflow post notes the same issue and suggests a patch to gem2rpm but for just a handful of things to make into RPMs, it may be just as satisfactory to remove the development dependencies by hand while you are inspecting the RPM spec file.

A bigger issue is the value it automatically picked up for ‘gemdir’; it is my personal .rvm gemset directory. I wonder what the correct answer should be. If I just need to match where the rubygems installed via RPM expects gems, then I can probably hardcode a guess. But that seems less than ideal. I see that gem2rpm tries to calculate the correct value using:

    %global gemdir %(ruby -rubygems -e 'puts Gem::dir' 2>/dev/null)
    %global geminstdir %{gemdir}/gems/%{gemname}-%{version}
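
To see why the spec picked up my RVM directory, it helps to compare two values RubyGems exposes. This is only a diagnostic sketch, not a packaging recommendation: Gem.dir follows the current environment (for example a GEM_HOME set by RVM), while Gem.default_dir is the location this particular Ruby build falls back to. Whether either is the right value for an RPM is exactly the open question.

```ruby
require 'rubygems'

# Gem.dir reflects the current environment (e.g. GEM_HOME set by RVM).
puts "Gem.dir:         #{Gem.dir}"

# Gem.default_dir is this Ruby build's default gem location.
puts "Gem.default_dir: #{Gem.default_dir}"
```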

Poking around in the gem2rpm gem README, I found a link to the official packaging guidelines. I’ll have to read that and see if there is an official resolution for this.