Why I don’t rely on Google

Recently several otherwise tech-savvy friends have been perplexed that I don’t just use Google for everything. They explain that I could use Google Voice for my U.S. phone number, use Google Checkout for The Mathematician’s Dice, sync my contacts and calendars through GMail, and log in to many things on the web using their OpenID service. And they wonder why I suffer a bit of spam instead of using GMail for my primary email account.

Continue reading →

Introducing Fragments

Today I am announcing Fragments, a project I’ve been working on for a few months. It’s on GitHub and PyPI.

Fragments uses concepts from version control to replace many uses of templating languages. Instead of a templating language, it provides diff-based templating; instead of revision control, it provides “fragmentation control”.

Fragments enables a programmer to violate the DRY (Don’t Repeat Yourself) principle; it is a Multiple Source of Truth engine.

What is diff-based templating?

Generating HTML with templating languages is difficult because templating languages often have two semi-incompatible purposes. The first purpose is managing common HTML elements & structure: headers, sidebars, and footers; across multiple templates. This is sometimes called page “inheritance”. The second purpose is to perform idiosyncratic display logic on data coming from another source. When these two purposes can be separated, templates can be much simpler.

Fragments manages this first purpose, common HTML elements and structure, with diff and merge algorithms. The actual display logic is left to your application, or to a templating language whose templates are themselves managed by Fragments.

What is fragmentation control?

The machinery to manage common and different code fragments across multiple versions of a single file already exists in modern version control systems. Fragments adapts these tools to manage common and different versions of several different files.

Each file is in effect its own “branch”, and whenever you modify a file (“branch”) you can apply (“merge”) that change into whichever other files (“branches”) you choose. In this sense Fragments is a different kind of “source control”–rather than controlling versions/revisions over time, it controls fragments across many files that all exist simultaneously. Hence the term “fragmentation control”.

As I am a linguist, I have to point out that the distinction between Synchronic and Diachronic Linguistics gave me this idea in the first place.

How does it work?

The merge algorithm is a version of Precise Codeville Merge modified to support cherry-picking. Precise Codeville Merge was chosen because it supports accidental clean merges and convergence. That is, if two files are independently modified in the same way, they merge together cleanly. This makes adding new files easy; use Fragment’s fork command to create a new file based on other files (or just cp one of your files), change it as desired, and commit it. Subsequent changes to any un-modified, common sections, in that file or in its siblings, will be applicable across the rest of the repository.

Like version control, you run Fragments on the command line each time you make a change to your HTML, not before each page render.

What is it good for?

Fragments was designed with the task of simplifying large collections of HTML or HTML templates. It could replace simpler CMS-managed websites with pure static HTML. It could also handle several different translations of an HTML website, ensuring that the same HTML structure was wrapped around each translation of the content.

But Fragments is also not HTML specific. If it’s got newlines, Fragments can manage it. That means XML, CSS, JSON, YAML, or source code from any programming language where newlines are common (sorry, Perl). cFragments is even smart enough to know not to merge totally different files together. You could use it to manage a large set of configuration files for different servers and deployment configurations, for example. Or you could use it to manage bug fixes to that mess of duplicated source files on that legacy project you wish you didn’t have to maintain.

In short, Fragments can be used anyplace where you have thought to yourself “this group of files really is violating DRY”.

Use it

Fragments is released under the BSD License. You can read more about it and get the code on GitHub and PyPI. And you can find me on Twitter as @glyphobet.

Special thanks to Ross Cohen (@carnieross) for his thoughts on the idea, and for preparing Precise Codeville Merge for use in Fragments.

Hackers, it is time to rethink, redesign, or replace GNU Gettext

GNU Gettext may be the de facto solution for internationalizing software, but every time I work with it, I find myself asking the same questions:

Why, in this age of virtual machines and dynamic, interpreted languages, do I still have to compile .po files to .mo files before I can use my translations?
I can reconfigure my web application, modify its database, and clear its caches whenever I want, so why do I have to do a code push and restart the entire runtime just so that “Login” can be correctly translated to “Anmelden”? Try explaining that to a business guy.
To translate new messages in my application, I have to run a series of arcane commands before the changed text is available to be translated. Specifically, the process involves generating .pot files, then updating .po files from them. Why isn’t this automatic?
Why is it still possible for bad translations to cause a crash? Translators do the weirdest things when presented with formatting directives in their translations… I’ve seen %s translated as $s and as %S, %(domain)s translated as %(domaine)s, and ${0} translated as #0, but the most common is to just remove the weird formatting directives entirely. And they all cause string formatting code to crash.
Why isn’t there a better option for translating HTML? Translators shouldn’t be expected to understand that in Click <a href="..." class="link">Here!</a>, “Click” and “Here!” should be translated, but “class” and “link” should not be. And they certainly can’t be expected to understand that if their language swaps the order of “Click” and “Here”, the <a> tag should move along with “Here”.
Why isn’t there something better than the convention to assign the gettext function to the identifier _, and then wrap all your strings in _()? Not only is this phenomenally ugly, but one misplaced parenthesis breaks it: _("That username %s is already taken" % username)
Why is support for languages that have more than two plural forms still an awful, confusing, fragile hack? Plural support was clearly designed by someone who thought that all languages were like English in having merely singular and plural. I’ve seen too many .po files for singular/dual/plural languages, where the translator obviously did not understand that msgstr[0] is the singular, msgstr[1] the dual, and msgstr[2] the plural.
Why, in this age of distributed version control, experimental merge algorithms, and eventually consistent noSQL databases, is the task of merging several half-translated .po files from several different sources still a nightmarish manual process?
Why, if I decide I need an Oxford comma or a capital letter in my English message, do I risk breaking all of the translations for that message?

There are libraries that allow you to use .po files directly, and I’m sure you can hack up some dynamic translation reloading code. Eliminating the ugliness of _() in code, and avoiding incorrectly placed parentheses after it, could be done with a library that inspects the parse tree and monkeypatches the native string class. Checking for consistency of formatting directives is not that hard. A better HTML translation technique would take some work, but it’s not impossible. The confusion around plural forms is just a user-interface issue. Merging translated messages may not be fully automatable, but at least it could be made a lot more user-friendly. And the last point can be avoided by using message keys, but that hack shouldn’t be necessary.

Gettext is behind the times. Or is it? Half of me expects someone to tell me that all these projects I’ve worked on are just ignoring features of Gettext that would solve these problems. And the other half of me expects someone to tell me I should be using some next-generation Gettext replacement that doesn’t have enough Google juice for me to find. (Let me know on Twitter: @glyphobet.)

GNU Gettext is is based on Sun’s Gettext, whose API was designed in the early ’90s. Hackers, it’s 2012. Technology has moved forward and left Gettext behind. It is time to rethink, redesign, or replace it.

[CUSTOMER NAME REDACTED] or Anything is possible on the internets

I just received this email. Details have been redacted to anonymize it. Rant follows.

From: [CUSTOMER NAME REDACTED]

Matt,

Sorry for bothering you, but I found your CV online and saw that you used to be the Lead Developer for [PRODUCT NAME REDACTED] a few years back. My wife owns a small [BUSINESS TYPE REDACTED] in [LOCATION REDACTED] and we recently migrated to [PRODUCT NAME REDACTED], which was a fluid easy process, no doubt due to some of your work — thank you for that!

One question I’ve had since moving over though is regarding their scheduling and if there’s any way to make it play with google cal or ical — I’ve asked [PRODUCT NAME REDACTED] and the techs there and it seems to be a pretty straight forward “no”… but knowing the internets and that “anything is possible” more or less, I gotta think that there must be a way to write some kind of script that could at least scrape the [REDACTED] calendar and at least provide a way just subscribe or “view” the schedule– I’m not even talking about two-way functionality… viewing would be a huge help for us and her colleagues. Moreover, my guess is that we aren’t the only ones who would love to have a way to check the schedule that wasn’t dependent on logging in to [PRODUCT NAME REDACTED], especially since they have yet to offer any mobile apps for smart phones, and that any script/app/plugin/program that’s created could even be shared with other [REDACTED].

Anyway… I won’t carry on as this is a straight cold call… but if you do have any advice and have a chance to respond, I would be most grateful!

Cheers!

[CUSTOMER NAME REDACTED]

This is jaw-droppingly awful. Let me count the ways:

This guy is asking me to think about a job and a piece of software that I stopped working on years ago. Since a programmer’s job is, in many ways, to think, he’s essentially asking me to work for free.
He is fishing for me to contradict what he has been told by the company I used to work for, which would be a totally unacceptable thing for a programmer to do even when still employed by said company.
Even if I was willing to think about a software project that I haven’t looked at in years and undercut my former employer by contradicting them, it’s likely that the project has changed since I left in ways I cannot even begin to imagine. So even if I did remember enough about the project to confidently answer his question, I would probably be totally wrong.
What would he do if I told him it would be totally easy to implement? Go back to my former employers and tell them that some random who used to work for them said that it would be easy? Is that going to make them change their mind about implementing this feature? No.
Anything is possible? On the internets?

This is the kind of obnoxious customer that small software companies just don’t need. End of rant.

Python else in loops: survey results

As promised, here are the results of two totally unscientific surveys, one conducted at PyCon 2011 and the other over Twitter just a few days ago, about the behavior of else in Python loops. The results show that only 25% of respondents know what else in a Python loop actually does, and 55% think they know but are wrong.

A week with Snow Leopard and Lion

After starting a new job a few weeks ago, I found myself in the somewhat unusual position of having a new Mac OS X Lion (10.7) installation at work and a Snow Leopard (10.6) installation at home. Switching every day focused my attention on the differences between the two. Read on to hear my thoughts.

Mission Control

What used to be Dashboard, Spaces, and Exposé have been combined into Mission Control. Overall, this is a tremendous improvement.

The All Windows mode scales all the windows proportionally, like Exposé did in Leopard. If you use any application with small windows (like Stickies), you’ll know how silly Snow Leopard’s Exposé looked when it scaled a 1200×1200 browser window to the same size as a bunch of 100×100 sticky notes and threw them all in a grid. (The non-proportional scaling in Snow Leopard’s Exposé is so useless I hack the Dock after every upgrade to bring back Leopard’s scaling.) The Application Windows mode in Lion uses Snow Leopard’s non-proportional scaling, which is generally ok because multiple windows in the same app are usually similarly sized.

All Windows also groups windows together, which I find helpful, although not critical. You can drag windows into other desktops from the All Windows mode, too. But if you move your pointer even slightly while clicking on a window to activate it, Mission Control thinks you’re starting to drag it, and ends up doing nothing when you meant to activate a window. This is maddening. There should be a threshold below which any mouse movement while clicking just ends up as a click (or perhaps there is; if so, it’s too low).

Mission Control stole Ctrl-Left and Ctrl-right keystrokes to switch desktops, which I use in the terminal all the fricking time. Luckily these keystrokes can be turned off in the settings. I was a multiple desktop power-user back when I used Xwindows, but I never turned on Spaces on Snow Leopard and still haven’t used them on Lion. I’m not sure why not.

There are a few display bugs. Sometimes the blue window highlights are way bigger than the window previews. Textmate’s Find dialog doesn’t (always) show up in Mission Control, which I hope will be fixed sometime before Textmate 2 is released later this year.

The biggest problem for Mission Control is that it doesn’t let you restore or even see minimized windows in All Windows mode, only in Application Windows mode, where they are lumped in with previously opened documents. If you enable “Minimize to Application Icon”, the only way to get at your minimized windows is via the Application Windows mode.

Scroll bars

I got used to the new scroll bars in 30 seconds. I haven’t missed the up/down arrows at all. Weirdly, the default scrolling direction on the MacMini was set up out of the box for trackpads, but on the MacBook, it was set up for scroll wheels.

Migration Assistant

Earlier this year I helped my mom migrate from a vintage MacMini running Tiger (10.4) to a new one running Snow Leopard. The Migration Assistants on the two machines completely refused to cooperate with each other during the initial setup of the new Mini. It took many, many attempts after the install was complete to I get it to migrate her documents. I thought this was because I was migrating between two major releases.

I started out using Lion on an elderly MacMini while my new laptop was in the mail, so I got to experience Migration Assistant again. Before attempting to migrate, I set up the new laptop using a dummy account, and ran Software Update on both computers until they were both running 10.7.1. I hoped the up-to-date Migration Assistant would do better at migrating between two identical versions of Mac OS, but it was just as recalcitrant as before. The two computers steadfastly refused to recognize each other at the same time, even though they were both on the same wireless network just inches away from each other (and from the wireless router). I finally borrowed an ethernet cable, turned the wireless off on both, and hooked them up ethernet port to ethernet port. It felt dirty, but it seemed to help; after a few more tries my account was migrated. I think it took longer to coax Migration Assistant into cooperation than it did to actually migrate my files.

So Migration Assistant is just as cranky between two Lion installs as it was between Tiger and Snow Leopard. I can’t imagine how frustrating it must be for the average, non-technical Mac user to migrate to a new Mac.

The new Mail

I forced myself to use the new side-by-side Mail interface for the first week, hoping that I’d come to like it. I didn’t. I hate it. I switched back to classic view after a week.

I can’t really say what bugs me about Mail, but it just feels wrong. One glaring issue is that the text and icons in the message list have been made bigger, which is a pain for someone with over ten years of email in lots of different folders. Some people, including Jon Gruber, speak highly of it, so I’m holding out hope that it will grow on me.

But when I install Lion on my personal machine, I’ll make a backup of Snow Leopard’s Mail.app and see if it works on Lion. Just in case.

Autocorrect

Autocorrect is for people who aren’t fast, good typers. It’s not for me. I turned it off. In Mail and system-wide. I wonder why Mail has its own setting for autocorrect that overrides the system setting.

XCode

Seriously, Apple? You mean I have to create an iTunes account, even though this is a work computer and I won’t ever be buying software for it, and verify it, just in order to download XCode? And then, the XCode application that gets installed is actually the XCode installer? What a hassle.

Ok, ok, to be fair, this isn’t a Lion thing. But seriously, Apple?

MacPorts

Aside from mysql5-server, I got everything I needed installed fine. I was even able to beat mysql5-server into submission, with a judicious application of MacPorts-kung-fu.

Textmate

After hearing many rumors about Textmate not really working under Lion, I’m happy to report that it seems to work fine. (Aside from the aforementioned wacky interaction with Mission Control.)

Launchpad

I won’t use Launchpad. Apple is clearly trying to unify the experience of iOS and Mac OS. But I already have Spotlight, the Dock, and the Applications folder. How many views of my applications do I need?

Resume

The resume feature rocks, at least in Terminal. Quit Terminal or restart your machine, and when you relaunch Terminal, all your windows are there, in the same places, with the same tabs, and the scrollback buffers are still there.

I honestly haven’t used or noticed Resume from any other applications. Of course, all the web browsers already have something like it.

Resize from any edge

For the last twenty-odd years, Windows (since 3.1) and most Xwindows window managers have had the ability to resize windows from any edge And Mac OS, going back just as long, to Classic Mac OS, has always forced you to resize (only down and to the right) by grabbing a little, 16×16 pixel box at the lower right of the window. This has always been a minor shortcoming in Mac OS’s interface.

Leave it to Apple’s UI designers to realize you could resize windows from any edge without having an fat, ugly, click-target border all the way around every single window. Resize from any edge is a perfect illustration of Apple’s design genius: Less clutter, same power. I’ve wanted this on Mac OS for so long, I’d be willing to pay the €23 just for this feature alone.

Eleven years of email

Phobos Labs’ nine years of sleep inspired me to make this chart of the last eleven years of my email. Read more….

Push to Github using Mercurial

It’s simple to push Mercurial source code repositories to GitHub (or another git server). This is useful if you can’t, or don’t want to, switch off of Mercurial to git. You can take advantage of GitHub’s superior community and features (sorry, BitBucket) without having to puzzle over ____ (anti-git screed left as an exercise to the reader).

To do this, you’ll need the hg-git plugin for Mercurial, by Scott Chacon and others. The technique is outlined on hg-git’s read-me, but it still took me some effort to get working right, so I thought this blog post might be helpful.

I assume you already have Mercurial installed, a project in a Mercurial repository you want to push, and a GitHub account. Here’s how you do it.

Install hggit

If you’re using MacPorts and Python 2.6, you should be able to do this with:

sudo port install py26-hggit

At the end of hggit’s install process, hggit prints out a line to add to your ~/.hgrc. For MacPorts & Python 2.6, this will look like:

hggit=/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/hggit

Note: Installing hggit in a virtualenv is probably not a good idea unless you also are using a version of Mercurial installed in the same virtualenv. You could probably make it work, but it’s likely more work than it’s worth.

Set up and connect to a GitHub repository

Create an empty repository on GitHub to store your Mercurial project. Copy the git URL from the new repository. Append git+ssh:// to the beginning, and change the : to a /. For example, this:

git@github.com:mercurialpoweruser/mercurialrocks.git

becomes this:

git+ssh://git@github.com/mercurialpoweruser/mercurialrocks.git

This URL then goes in the [paths] section of your Mercurial repository’s .hg/hgrc. If you want to push and pull from GitHub by default, it will look like this:

[paths]
default = git+ssh://git@github.com/mercurialpoweruser/mercurialrocks.git

Or, if you already have a remote Mercurial repository that you want to keep using, you can add the GitHub repository with a different name:

[paths]
default = ssh://hg@bitbucket.org/mercurialpoweruser/mercurialrocks
github = git+ssh://git@github.com/mercurialpoweruser/mercurialrocks.git

Next, push your repository to GitHub for the first time, with hg push (if GitHub is your default) or hg push github (if GitHub isn’t the default).

Now, go to GitHub and reload the page, and revel in the feeling that you are living in a utopian future where cars fly, cold fusion supplies our electricity, and different DVCSs interoperate with each other.

Gotchas

If you find that Mercurial mysteriously stops pushing new changes to GitHub, even though there are new changes to push and it’s pushing them to Mercurial servers fine, you might want to take a look your bookmarks. Do this with hg bookmarks. Make sure that the changeset ID for master is the changeset ID of tip: hg tip. If they’re not the same, update the master bookmark with hg bookmark -f master. This happened to me when I used an older version of Mercurial (1.3.1) without the bookmarks extension enabled. Don’t do that.

Your mileage may vary

I haven’t used this technique on repositories that have lots of branches, file renames, merge conflicts, or other “fancy” distributed version control features. I wouldn’t advise switching your entire development team and mission-critical repositories over to this technique without some more extensive testing.

Mathematician’s Dice now for sale

The Mathematician’s Dice are now for sale to the general public!

The Kickstarter project to get these dice to market was an amazing experience. I plan to write the whole thing up and post it here soon.

Geospatial Queries in MongoDB

Here’s a little gist that I wrote to illustrate the difference between MongoDB‘s use of degrees vs. radians in its non-spherical and spherical geospatial queries:

glyphobet • глыфобет • γλυφοβετ

musings over a tuna fish sandwich