CNK's Blog

Testing in Django

Test Runner

First the good part, Django, by default, uses Python’s built in unittest library - and as of Python 2.7+ that has a reasonable set of built in assertion types. (And for versions of django before 1.8, django backported the python 2.7 unittest library.) Django has a pretty good test discovery system (apparently from the upgraded Python 2.7 unittest library) and will run any code in files matching test*.py. So to run all your test, you just type ./manage.py test at the top of your project. You can also run the tests in individual modules or classes by doing something like: ./manage.py test animals.tests - without having to put the if name == __main__ stuff in each file. You can even run individual tests - though you have to know the method name (more or less like you have to in Ruby’s minitest framework): ./manage.py test animals.tests.AnimalTestCase.test_animals_can_speak

The best part of Django’s test runner is the setup it does for you - creating a new test database for each run, running your migrations, and if you ask it to, importing any fixtures you request. Then, after collection all the discovered tests, it runs each test inside a transaction to provide isolation.

Test Output

But coming from the Ruby and Ruby on Rails worlds, the testing tools in Python and Django are not as elegant as I am used to. At one point I thought the Ruby community’s emphasis on creating testing tools that display English-like output for the running tests bordered on obsessive. But having spent some time in Python/Django which doesn’t encourage tools like that, I have come to appreciate the Rubyist’s efforts. Both RSpec and Minitest have multiple built in output format options - and lots of people have created their own reporter add ons - so you can see your test output exactly the way you want it with very little effort. The django test command allows 4 verbosity levels but for the most part they only change how much detail you get about the setup process for the tests. The only differences in the test output reporting are that you get dots at verbosity levels 0 and 1 and the test names and file locations at levels 2 and 3:

    $ python ./manage.py test -v 2

    ..... setup info omitted .......

    test_debugging (accounts.tests.test_models.UserModelTests) ... ok
    test_login_link_available_when_not_logged_in (accounts.tests.test_templates.LoginLinkTests) ... ok
    test_logout_link_available_when_logged_in (accounts.tests.test_templates.LoginLinkTests) ... ok
    test_signup_link_available_when_not_logged_in (accounts.tests.test_templates.LoginLinkTests) ... ok
    test_user_account_link_available_when_logged_in (accounts.tests.test_templates.LoginLinkTests) ... ok
    test_profile_url (accounts.tests.test_urls.AccuntsURLTests) ... ok
    test_signup_url (accounts.tests.test_urls.AccuntsURLTests) ... ok
    test_url_for_home_page (mk_web_core.tests.GeneralTests) ... ok

    ----------------------------------------------------------------------
        Ran 8 tests in 0.069s

So increasing the verbosity level is useful for debugging your tests but disappointing if you are trying to use the tests to document your intentions.

Behave and django-behave

This is the main reason why, despite being unenthusiastic about Cucumber in Ruby, I am supporting using Python’s behave with django-behave for our new project. One of the things I don’t like about cucumber is it all to frequently becomes an exercise in writing regular expressions (for the step matching). I don’t like that if you need to pass state between steps, you set instance variables; this is effective, but it looks kind of like magic.

With ‘behave’, you need to do the same things but in more explicit ways. The step matching involves litteral text with placeholders. If you want to do full regular expression matching you can, but you need to set the step matcher for that step to be ‘re’ - regular expression matching isn’t the default. For sharing state, there is a global context variable. When you are running features and scenarios, additional namespaces get added to the root context object - and then removed again as they go out of scope again. Adding information to the context variable seems more explicit - but with the namespace adding and removing - I am not sure that this isn’t more magical than the instance variables in Cucumber.

Django’s TestCase Encourages Integration Tests

The main testing tool that Django encourages using is it’s TestCase class which tests a bunch of concerns - request options, routing, the view’s context dictionary, response status and template rendering.

It’s odd to me that Django’s docs only discuss integration tests and not proper unit tests. With Django’s Pythonic explicitness, it is fairly easy to set up isolated pieces of the Django stack by just importing the pieces you care about into your test. For example, you can test your template’s rendering by creating a dictionary for the context information and then rendering the template with that context. Harry Percival’s book “Test-Driven Development with Python” does a very nice job of showing you how to unit test the various sections of the Django stack - routing, views, templates, etc.

More than just not discussing isolated unit tests, at least some of Django’s built in assertions actively require you to write a functional / integration test. I tried rendering my template to html and then called assertContains to test some specific html info. But I got an error about the status code! In order to use assertContains on the template, I have to make a view request.

Coming from the Rails world, I don’t really want the simple assertContains matching. What I really want is Rails’ built-in html testing method, assert_select. I found a Django library that is somewhat similar, django-with-asserts. But like assertContains, django-with-assert’s test mixin class uses the Django TestCase as it’s base and so also wants you to make a view request so it can test the StatusCode. I really wanted django-with-asserts functionality but I want to use it in isolation when I can, so I forked it and removed the dependency on the request / response cycle.

A Send-Only Email Server

Our ZenPhoto install wants to be able to notify us when there are new comments. I also may eventually want to set up exception notifications for some of my dynamic sites. At least for now, I don’t want to run a full-blown mail server for our domains; I don’t want to deal with spam detection and restricting who can use the mail server to relay mail, etc. But I know that many of the common Unix email servers can be configured so that they don’t receive mail and only send mail if it originates on one or more specific servers. I don’t have a lot of experience setting up mail servers. The ones I am most familiar with are qmail (which is what ArsDigita used everywhere) and Postfix. I am betting that it will be easier to set up Postfix on Ubuntu so let’s look for some instructions.

Installing Postfix

There are some promising looking instructions on the Digital Ocean site - for Postfix on Ubuntu 14.04. Postfix is apparently the default mail server for Ubuntu because sudo apt-get install mailutils installs postfix as one of the “additional packages”. The install process asked me two questions: what kind of mail server configuration I needed (I chose ‘Internet Site’), and what is the domain name for the mail server. I debated whether I should leave this set to the hostname for the server, which is a subdomain of one of our domains, or if I should set it to just the domain. Tim may have our domain name registrar set up for email forwarding for the domain so it may be slightly safer to configure this mail server with the subdomain. And it will make it a lot clearer to me where the email is coming from.

    $ sudo apt-get install mailutils
    ...
    ... Lots of install info....
    ...
    Setting up postfix (2.11.0-1ubuntu1) ...
    Adding group `postfix' (GID 114) ...
    Done.
    Adding system user `postfix' (UID 106) ...
    Adding new user `postfix' (UID 106) with group `postfix' ...
    Not creating home directory `/var/spool/postfix'.
    Creating /etc/postfix/dynamicmaps.cf
    Adding tcp map entry to /etc/postfix/dynamicmaps.cf
    Adding sqlite map entry to /etc/postfix/dynamicmaps.cf
    Adding group `postdrop' (GID 115) ...
    Done.
    setting myhostname: trickster.ictinike.org
    setting alias maps
    setting alias database
    changing /etc/mailname to trickster.ictinike.org
    setting myorigin
    setting destinations: trickster.ictinike.org, localhost.ictinike.org,
    , localhost
    setting relayhost:
    setting mynetworks: 127.0.0.0/8 [::ffff:127.0.0.0]/104 [::1]/128
    setting mailbox_size_limit: 0
    setting recipient_delimiter: +
    setting inet_interfaces: all
    setting inet_protocols: all
    /etc/aliases does not exist, creating it.
    WARNING: /etc/aliases exists, but does not have a root alias.

    Postfix is now set up with a default configuration.  If you need to
    make changes, edit /etc/postfix/main.cf (and others) as needed.
    To view Postfix configuration values, see postconf(1).

    After modifying main.cf, be sure to run '/etc/init.d/postfix reload'.

    Running newaliases
     * Stopping Postfix Mail Transport Agent postfix
        ...done.
     * Starting Postfix Mail Transport Agent postfix
        ...done.
    Processing triggers for ufw (0.34~rc-0ubuntu2) ...
    Processing triggers for ureadahead (0.100.0-16) ...
    Setting up mailutils (1:2.99.98-1.1) ...
    update-alternatives: using /usr/bin/frm.mailutils to provide /usr/bin/frm (frm) in auto mode
    update-alternatives: using /usr/bin/from.mailutils to provide /usr/bin/from (from) in auto mode
    update-alternatives: using /usr/bin/messages.mailutils to provide /usr/bin/messages (messages) in auto mode
    update-alternatives: using /usr/bin/movemail.mailutils to provide /usr/bin/movemail (movemail) in auto mode
    update-alternatives: using /usr/bin/readmsg.mailutils to provide /usr/bin/readmsg (readmsg) in auto mode
    update-alternatives: using /usr/bin/dotlock.mailutils to provide /usr/bin/dotlock (dotlock) in auto mode
    update-alternatives: using /usr/bin/mail.mailutils to provide /usr/bin/mailx (mailx) in auto mode
    Processing triggers for libc-bin (2.19-0ubuntu6.6) ...

Configuring Postfix to only accept mail from localhost

The installer had set up Postfix to listen on all available interfaces. So netstat -ltpn shows

    Active Internet connections (only servers)
    Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
    tcp        0      0 127.0.0.1:3306          0.0.0.0:*               LISTEN      2028/mysqld
    tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      11341/sshd
    tcp        0      0 0.0.0.0:25              0.0.0.0:*               LISTEN      15201/master
    tcp6       0      0 :::80                   :::*                    LISTEN      2176/apache2
    tcp6       0      0 :::22                   :::*                    LISTEN      11341/sshd
    tcp6       0      0 :::25                   :::*                    LISTEN      15201/master

So, following the instructions, I edited /etc/postfix/main.cf and changed inet_interfaces = all to inet_interfaces = localhost and restarted the postfix service. Now I see postfix only on the local interface (ipv4 and ipv6):

    tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      15405/master
    tcp6       0      0 ::1:25                  :::*                    LISTEN      15405/master

I tested email sending with: echo "test email body" | mail -s "Test email" cnk@<destination> and it went through just fine. YEAH!

Now, I need to forward system mail (e.g. root mail) to me. To do this, I need to add a line to /etc/aliases for root + the destination emails. Then I got the new entries in /etc/aliases into /etc/aliases.db by running the newaliases command. I tested the new root works by sending a second test email: echo "test email body" | mail -s "Test email for root" root And this one also got to me.

There was an additional section about how to protect my domain from being used for spam - especially in this case, being impersonated. The article on setting up an SPF record doesn’t look too hard - if the service we are using to do DNS lets us set that up. I’ll have to look into it when we are switching DNS.

Configuring Email in ZenPhoto

Having the ability to get root mail is good - but the main reason I wanted email on this server was for ZenPhoto’s comment functionality. So now, on the plugin page of the ZenPhoto admin site, there is a Mail tab with two options. For now I chose zenphoto_sendmail which just uses the PHP mail facility to send mail using the local mail server.

Upgrading ZenPhoto

I have shared a web site with a photographer friend of mine for several years. I would like to do some other, more modern things, with my web server so it’s time to upgrade. I have been using RedHat as my Linux distribution for …. well since RedHat version 4 or 5. But when I started using Vagrant VMs when I looked for RH Virtual Box images, the CentOS or Fedora images where generally much larger than the Ubuntu images. So I started playing with Ubuntu. And at work we are planning to use Ubuntu for our new project because we want something that supports AppArmor - so means Arch or something in the Debian family. Ubuntu is widely used in the Rails and Django communities so seems like a good choice. The latest long-term supoort version is Ubuntu 14.04, aka Trusty.

Having chosen my Linux distribution, I need to chose a hosting service. The two current contenders for low cost VPS’s are DigitalOcean and Linode. A couple of years ago Linode sponsored RailsRumble and provided the hosting for all the contestants. It seemed fairly decent and had some nice options like their StackScripts. So I think I’ll use Linode.

New Linode Server

I spun up a Linode 2G instance on Ubuntu 14.04 LTS with a 512 Mb swap disk (max it would allow me to set). That same form asked me to set a root password for the new server.

Using that password, I logged in and immediately did:

    apt-get update
    apt-get upgrade
    apt-get install git-core curl zlib1g-dev build-essential libssl-dev libreadline-dev libyaml-dev libsqlite3-dev \
            sqlite3 libxml2-dev libxslt1-dev libcurl4-openssl-dev python-software-properties libffi-dev

    useradd -g staff -m -s /bin/bash cnk
    useradd -g staff -m -s /bin/bash tim
    # and set passwords for both of us
    apt-get install emacs24-nox
    # added both tim and cnk to the sudoer's file

From the Getting Started Guide

hostname

One of the first things the Getting Started guide asks you to do is to set the hostname.

    root@localhost:/# echo "trickster" > /etc/hostname
    root@localhost:/# hostname -F /etc/hostname

Edited /etc/hosts to add one line mapping the public IP assigned to me to the hostname I just configured. The final file is:

    127.0.0.1       localhost
    127.0.1.1       ubuntu
    45.79.nnn.nnn   trickster.<domain>.org trickster

    # The following lines are desirable for IPv6 capable hosts
    ::1     localhost ip6-localhost ip6-loopback
    ff02::1 ip6-allnodes
    ff02::2 ip6-allrouters

Time zone

The default for Ubuntu is for servers to use UTC until that is changed to something else:

    root@localhost:~# date
    Wed Jun 17 06:04:31 UTC 2015

    root@localhost:~# dpkg-reconfigure tzdata
    ..... I get a 'GUI' that let's me choose my timezone
    Current default time zone: 'US/Pacific'
    Local time is now:      Tue Jun 16 23:05:07 PDT 2015.
    Universal Time is now:  Wed Jun 17 06:05:07 UTC 2015.

Securing the server

Linode also has an excellent security guide so let’s work through that.

sshd Configuration

It suggests disabling password authentication and only allowing keys. OK so I copied my public key into the ~/.ssh/authorized_keys file on the linode box. So I can now ssh without a password.

Then I edited /etc/ssh/sshd_config to disable password authentication and root login and restarted sshd.

Setting Up iptables

The guide has a fairly sensible looking set of firewall rules for IP tables and good instructions for how to create them in a file and then load them into iptables. The defaults look just fine to me for now so I just followed the instructions. And similarly I followed the instructions for how to get loading the firewall rules into the boot sequence before the network interface is enabled. I would need to reboot the server to see if that really worked but don’t feel like it just now.

The guide also suggested setting up Fail2Ban but since we are not allowing logins with passwords, I am not really sure how helpful that is. We do have ZenPhoto set up to password protect most of our albums, mostly because we were getting comment spam on the old version of the site. So perhaps I will want to set that up at some point - but not for now.

Installing MySQL, Apache, and PHP

Linode also has a guide for setting up a LAMP stack on their server (including some tuning for their smallest (1G RAM) offering). But I had found this other guide for setting up Rails on Ubuntu 14.04, so I mostly used it.

Installing and Configuring MySQL

So first, install some packages:

    sudo apt-get install mysql-server mysql-client libmysqlclient-dev

For reference, this gives me mysqld Ver 5.5.43-0ubuntu0.14.04.1 for debian-linux-gnu on x86_64 ((Ubuntu))

I was prompted to set a root password for the database which I did. The default install bound MySQLd to 127.0.0.1:3306 which is good and the default_storage_engine is InnoDB - also good. But the default server character set is latin1. It is the 21st century and I think we should all be using UTF8 all the time. So I created a file in /etc/mysql/conf.d/default_charset_utf8.cnf and used what I had used yesterday when I set up MySQL a CentOS system with MySQL 5.5:

    [mysqld]
    default-character-set = utf8

But the server would not restart. In the error log I see:

    /usr/sbin/mysqld: unknown variable 'default-character-set=utf8'

Hun? Searching for that error message turned up this blog post which claims that option was deprecated in MySQL 5.0. Searching the MySQL docs, I only see default-character-set as a command line option. Apparently the more correct way to do this now is:

    [mysqld]
    character_set_server = utf8

That works:

    mysql> show variables like 'char%';
    +--------------------------+----------------------------+
    | Variable_name            | Value                      |
    +--------------------------+----------------------------+
    | character_set_client     | utf8                       |
    | character_set_connection | utf8                       |
    | character_set_database   | utf8                       |
    | character_set_filesystem | binary                     |
    | character_set_results    | utf8                       |
    | character_set_server     | utf8                       |
    | character_set_system     | utf8                       |
    | character_sets_dir       | /usr/share/mysql/charsets/ |
    +--------------------------+----------------------------+
    8 rows in set (0.00 sec)

Checking the security of my MySQL server set up, there do not appear to be any test databases or anonymous access to the database. And root can only log in locally. While I was in the mysql console, I created the zenphoto user and zenphoto_prod database.

Apache, PHP, and ZenPhoto

A week or so ago, I had installed LAMP + the new version of ZenPhoto on a VM (using Vagrant) so I could see how feasible it would be to migrate directly from our super old ZenPhoto to the latest version by creating database migration files. So now I need to do the same thing on the new server. First I installed apache and PHP:

    apt-get install apache2 libapache2-mod-php5 php5-mysqlnd php5-gd

I was able to use the defaults for most of the apache configuration. I may want to use the status command. In /etc/apache2/envvars file there is the following comment:

    ## The command to get the status for 'apache2ctl status'.
    ## Some packages providing 'www-browser' need '--dump' instead of '-dump'.
    # export APACHE_LYNX='www-browser -dump'

Looks like w3m satisfies that requirement so I changed the line above to export APACHE_LYNX='w3m -dump'

I now see the default Apache page for Ubuntu when I go to the IP address of my server. I didn’t make any changes to the php parameters. Later, I may need to tweak the parameters in /etc/php5/apache2/php.ini but for now I am just going to move on to installing ZenPhoto.

ZenPhoto

I copied zenphoto-zenphoto-1.4.7.tar.gz up to the server, untarred it into /var/www/html/ and renamed the folder to just ‘zenphoto’.

    chown -R www-data:www-data zenphoto

ZenPhoto supplies an example config file: zp-core/zenphoto_cfg.txt. You copy that to zp-data/zenphoto.cfg.php and then edit it to provide the database connection information, etc.

    chmod 600 zp-data/zenphoto.cfg.php

Normally what one would do next is navigate to the zenphoto/zp-core/setup.php url and let the set up script install everything. But I want to use the data from our current site. Fortunately zenphoto is set up so people can share a database by setting a tablename prefix. So by setting the table prefix for the new site to something different than our original install, I was able to have complete sets of tables for the old and new installations in the same database. I went through the tables field by field and found that although there were some new columns, I was able to write SQL statements to copy info from the original site into the new tables. The admin / configuration information was very different so I did not attempt to tranfer that. But the image and album information was very similar so it was pretty straightfoward to transfer most of the data into the new schema.

Once I had the original album and image data in the new tables, I navigated to http://<ipaddress>/zenphoto/ to run the setup script on top of the mostly set up database. With the database in place and the software set up, I used rsync to transfer the photos into the albums directory on the new site.

Rewrite rules for pretty urls

Now I can navigate to the photo albums - but the urls are different. The old site was using rewrite rules to create pretty urls like: http://ictinike.org/zenphoto/tim/misc/ted_09/ but the new site is serving that alubum page as http://45.79.100.71/zenphoto/index.php?album=tim/misc/ted_09

The install script had asked if I wanted it to create a .htaccess file and I had said yes. mod_rewrite was not enabled by default in my apache install but I had enabled it using sudo a2enmod rewrite but still no dice. The .htaccess files from the old and new versions of ZenPhoto are very different so it was hard to tell if that was the issue, or something else. In fact the new file says:

    # htaccess file version 1.4.5;
    #       Rewrite rules are now handled by PHP code
    # See the file "zenphoto-rewrite.txt" for the actual rules
    #
    # These rules redirect everything not directly accessing a file to the
    #       Zenphoto index.php script
    ...

In the ZenPhoto admin interface there is a checkbox under URL options for mod rewrite. When I check that, the links in the pages are now the ‘pretty urls’ that I expect. But clicking on them gives me a 404 error. Unchecking the box gives me back the index.php + query args urls - which work. Hmmmmm. It took me a while to figure out that the issue was that my main apache configration was set to ignore .htaccess files. In my /etch/apache2/apache2.conf:

    <Directory /var/www/>
        Options Indexes FollowSymLinks
        AllowOverride None
        Require all granted
    </Directory>

I could have allowed .htaccess files to be used by changing the AllowOverride directive. But since I have full control over the configs, it seemed more sensible to just copy the rules from the .htaccess file into the virtual host config file for my ZenPhoto site. Since ZenPhoto does a lot of magical set up, I didn’t remove the unused .htaccess file from the zenphoto directory - in case something in the ZenPhoto admin checks to see if it is available.

    <Directory />
        # CNK copied this from the .htaccess file in /zenphoto
        # rather than change the AllowOverride settings globally
        #
        # htaccess file version 1.4.5;
        #   Rewrite rules are now handled by PHP code
        # See the file "zenphoto-rewrite.txt" for the actual rules
        #
        # These rules redirect everything not directly accessing a file to the Zenphoto index.php script
        #
        <IfModule mod_autoindex.c>
            IndexIgnore *
        </IfModule>
        <IfModule mod_rewrite.c>
            RewriteEngine On

            RewriteBase /zenphoto

            RewriteCond %{REQUEST_FILENAME} -f [OR]
            RewriteCond %{REQUEST_FILENAME} -d
            RewriteRule ^.*$ - [L]

            RewriteRule ^.*/?$   index.php [L,QSA]
        </IfModule>
    </Directory>

Comment configuration

OK almost everything is up and running. But I don’t see the comments I migrated - nor do I see a comment form on each page. I didn’t transfer the configuration stuff via the database, so I need to enable comments from the admin interface There is a configuration page for the comments plugin - asking if we need people’s names and email addresses and whether or not to allow anonymous or private commenting. I am going to defer to Tim on how he wants that set up. I did go ahead and enable one of the captcha plugins so at least there is a small barrier to comment spam. That and the comment system will email us about new comments - once I get email set up. See the next post for how I set up an outgoing mail server on the new machine. With that working, all I had to do was enable the zenphoto_sendmail plugin to get new comments emailing us. I think that just sends mail as if it were using /usr/bin/sendmail (which postfix will pick up as a backwards compatibility nicety). If we want something more configurable, we may want to switch to the PHPMailer plugin which allows you to set more configuration options in the ZenPhoto admin interface.

There is an archive page that shows when all the photos were taken (based on the EXIF data I think). The listings were there but the pages it linked to said “no images found”. But when I clicked the “Refresh Metadata” button on the main admin page, it rebuilt whatever index was needed to make the images show up on the by date listings.

MySQL Character Set Configuration

I want to set up MySQL server and all to it connections to it to use utf8. For work, we are using MySQL server 5.5 and RHEL 6.5 (or the equivalent CentOS on our Vagrant VMs).

Step 1. What is it currently using?

    mysql> show variables like 'char%';
    +--------------------------+----------------------------+
    | Variable_name            | Value                      |
    +--------------------------+----------------------------+
    | character_set_client     | latin1                     |
    | character_set_connection | latin1                     |
    | character_set_database   | latin1                     |
    | character_set_filesystem | binary                     |
    | character_set_results    | latin1                     |
    | character_set_server     | latin1                     |
    | character_set_system     | utf8                       |
    | character_sets_dir       | /usr/share/mysql/charsets/ |
    +--------------------------+----------------------------+
    8 rows in set (0.00 sec)

    mysql> show variables like '%collation%';
    +----------------------+-------------------+
    | Variable_name        | Value             |
    +----------------------+-------------------+
    | collation_connection | latin1_swedish_ci |
    | collation_database   | latin1_swedish_ci |
    | collation_server     | latin1_swedish_ci |
    +----------------------+-------------------+
    3 rows in set (0.00 sec)

Step 2. Changing the server’s character set.

The my.cnf file has a number of sections:

    mysqld_safe
    mysqld
    mysql
    client

Settings in the [mysqld] section apply to the server. Settings in the [mysql] section apply to the mysql command line client. And settings in the [client] section apply to all clients. Since I am mainly trying to change how the server and our python (or ruby) clients interact with the database, I want to make changes in the [mysqld] and [client] sections.

If I add default-character-set = utf8 to the [mysqld] section, and restart the server, my settings are now:

    mysql> show variables like 'char%';
    +--------------------------+----------------------------+
    | Variable_name            | Value                      |
    +--------------------------+----------------------------+
    | character_set_client     | latin1                     |
    | character_set_connection | latin1                     |
    | character_set_database   | utf8                       |
    | character_set_filesystem | binary                     |
    | character_set_results    | latin1                     |
    | character_set_server     | utf8                       |
    | character_set_system     | utf8                       |
    | character_sets_dir       | /usr/share/mysql/charsets/ |
    +--------------------------+----------------------------+
    8 rows in set (0.00 sec)

    mysql> show variables like '%collation%';
    +----------------------+-------------------+
    | Variable_name        | Value             |
    +----------------------+-------------------+
    | collation_connection | latin1_swedish_ci |
    | collation_database   | utf8_general_ci   |
    | collation_server     | utf8_general_ci   |
    +----------------------+-------------------+
    3 rows in set (0.01 sec)

When I create a new database without specifying a character set, it is created with server’s default, UTF8. That’s good. But the client and connection character set still say latin1.

    mysql> create database project_database;
    Query OK, 1 row affected (0.00 sec)

    mysql> SELECT schema_name, DEFAULT_CHARACTER_SET_NAME,
    DEFAULT_COLLATION_NAME FROM INFORMATION_SCHEMA.SCHEMATA;
    +--------------------+----------------------------+------------------------+
    | schema_name        | DEFAULT_CHARACTER_SET_NAME | DEFAULT_COLLATION_NAME |
    +--------------------+----------------------------+------------------------+
    | information_schema | utf8                       | utf8_general_ci        |
    | mysql              | utf8                       | utf8_general_ci        |
    | project_database   | utf8                       | utf8_general_ci        |
    +--------------------+----------------------------+------------------------+
    3 rows in set (0.00 sec)

Step 3a. Setting the connection characterset explicitly

I can explicitly set the character set for character_set_client, character_set_connection, and character_set_results after I have logged into the database but doing either: mysql> SET NAMES 'utf8'; OR mysql> charset utf8

In either case, all of my character set variables are now utf8. With ‘set names’, the character set reverts to the default ‘latin1’ if the client has to reconnect but with the ‘charset utf8’ command, the change survives needing to reconnect. (Note the syntax differences betweeen the 2 commands - set names requires quoting ‘utf8’ and a semicolon to execute. Charset does not require either.)

    mysql> show variables like 'char%';
    +--------------------------+----------------------------+
    | Variable_name            | Value                      |
    +--------------------------+----------------------------+
    | character_set_client     | utf8                       |
    | character_set_connection | utf8                       |
    | character_set_database   | utf8                       |
    | character_set_filesystem | binary                     |
    | character_set_results    | utf8                       |
    | character_set_server     | utf8                       |
    | character_set_system     | utf8                       |
    | character_sets_dir       | /usr/share/mysql/charsets/ |
    +--------------------------+----------------------------+
    8 rows in set (0.00 sec)

But both of those commands have to be issued after the connection is established - and need to be repeated each time I connect. I want something more automatic.

Step 3b. Changing the client / connection character set.

It looked like I should be able to bet the client settings working by adding the same line, default-character-set = utf8, to the [client] section. But in my initial trials, that didn’t seem to change anything. I also tried adding default-character-set = utf8 to both the [client] and [mysql] sections. But I still get:

    mysql> show variables like 'char%';
    +--------------------------+----------------------------+
    | variable_name            | Value                      |
    +--------------------------+----------------------------+
    | character_set_client     | latin1                     |
    | character_set_connection | latin1                     |
    | character_set_database   | utf8                       |
    | character_set_filesystem | binary                     |
    | character_set_results    | latin1                     |
    | character_set_server     | utf8                       |
    | character_set_system     | utf8                       |
    | character_sets_dir       | /usr/share/mysql/charsets/ |
    +--------------------------+----------------------------+
    8 rows in set (0.01 sec)

    mysql> show variables like '%collation%';
    +----------------------+-------------------+
    | Variable_name        | Value             |
    +----------------------+-------------------+
    | collation_connection | latin1_swedish_ci |
    | collation_database   | utf8_general_ci   |
    | collation_server     | utf8_general_ci   |
    +----------------------+-------------------+
    3 rows in set (0.00 sec)

Some of the things I was reading indicated that my problem might be that connections for superusers (like root) might behave differently than connections for normal users. I tried creating a normal user and then retesting various combinations of parameters - with the same results.

I finally found something that consistently gave me utf8 connection parameters without having to set them expliclty (see digression below). In this article about converting to a better version of MySQL utf8, utf8mb4, I see this directive: character-set-client-handshake = FALSE. Adding that to my [mysqld] section gives me consistent utf8 connection parameters with root and other users. However, I don’t think I really want to do that. I may want to be able to override the character set information for some connections (for example if I am connecting to legacy databases that are not in utf8). I think most of the frameworks I use already pass a character set parameter with the connection - at least Rails has a character set option in it’s database.yml config file.

So I am still looking for the right answer.

StackEdit

Installing StackEdit Locally

I have been looking into different editing and publishing tools for a new project at work. One of the formats of interest in Markdown because it is fairly straightforward to do a number of basic things and the stored format is cleaner than html. But since most people, myself included, don’t have the various tags committed to memory, we are going to need to provide an in browser preview for the Markdown input. So I went looking for libraries that would allow us to add a preview pane to our textareas.

One of the tools I found is an in browser editor that uses Markdown as it’s fundamental input format and can save the markdown to a variety of collaboration and blogging tools or export the text as html or PDF. It will also save your file to a ssh server. I am not sure I want to put my username and password into a web site - any web site. So instead of using the web interface, I think I’ll try to install it on my laptop and use it from there.

So, following the developer instructions at https://github.com/benweet/stackedit/blob/master/doc/developer-guide.md, I did:

  1. Git clone https://github.com/benweet/stackedit.git into ~/Code/javascript

  2. StackEdit needs node and npm. I have those, but I bet my versions are out of date, so let’s update all my homebrew software: brew update brew upgrade

  3. Update my version of npm (gets released more frequently than node.js): npm install npm -g

  4. I need gulp and bower and probably want them global / in my path: npm install -g bower npm install -g gulp

  5. install local dependencies (in ~/Code/javascript/stackedit): npm install

  6. Now back on the developer instructions: bower install

  7. The instructions for running the node server were export PORT=80 && node server.js but that gave me errors because I was trying to bind to a privileged port. So instead I did export PORT=9999 && node server.js and I am now editing this on http://localhost:9999/editor

Unfortunately, I was not able to get the saving to an ssh server to work - at least not when the ssh server is my local Mac. I am pretty sure this is an issue with how my Mac’s ssh daemon is configured. But without that, I don’t know that editing in the browser with side by side preview is all that much more convenient than creating posts in emacs and using Octopress’s rake preview task. So installing StackEdit locally was a fun exercise. But I think I will mostly be using it on the web at http://stackedit.io/