Tag Archives: linguistics

Let’s code about bike locks some more

I just got sucked into Peter Norvig’s Let’s Code About Bike Locks. If you haven’t read that yet, read it first, or at least the beginning. Otherwise this will make no sense.

The strategy of starting from the first tumbler and then calculating the subsequent ones seemed like it could be improved on. What if we looked at all the four-letter English words, and chose the most common letter, on any tumbler, fixed that letter on its tumbler, and then continued on to the second most common letter, on any tumbler, and so on? How well could we do?
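Here is a minimal sketch of that idea in Python, the language Norvig's notebook uses. It is one reading of the description above, not the code actually used for this post: the function name `greedy_lock` and its parameters are made up for illustration, and it assumes a static ranking of (position, letter) pairs by how many words contain them.

```python
from collections import Counter

def greedy_lock(words, tumblers=4, letters_per_tumbler=10):
    """Fill each tumbler with the most frequent letters for its position."""
    # Only words with one letter per tumbler can be spelled on the lock.
    words = [w.upper() for w in words if len(w) == tumblers]
    # Count how many words have each letter at each position.
    counts = Counter(
        (pos, letter)
        for word in words
        for pos, letter in enumerate(word)
    )
    lock = [set() for _ in range(tumblers)]
    # Walk the pairs from most to least common, on any tumbler,
    # skipping tumblers that are already full.
    for (pos, letter), _ in counts.most_common():
        if len(lock[pos]) < letters_per_tumbler:
            lock[pos].add(letter)
    return ["".join(sorted(t)) for t in lock]
```

Calling `greedy_lock(open("words4.txt").read().split())` on a local copy of Norvig's word list would produce a lock in the same format as the ones shown in this post, though ties between equally common letters may be broken differently.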

So I coded it up, and here’s the result, a lock that can make 1,410 words from Norvig’s list at http://norvig.com/ngrams/words4.txt, 170 more words than his best:

Lock: ABCDGLMPRST AEHILNORUY ACEILMNORST ADEKLNORSTY
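For anyone who wants to check a word count like this themselves, a rough sketch follows. The helper name `words_made` is hypothetical, and the lock is assumed to be written as space-separated tumblers, as above.

```python
def words_made(lock, words):
    """Return the words that can be spelled on the lock, one letter per tumbler."""
    tumblers = [set(t) for t in lock.split()]
    return [
        w for w in words
        if len(w) == len(tumblers)
        and all(letter in t for letter, t in zip(w.upper(), tumblers))
    ]

lock = "ABCDGLMPRST AEHILNORUY ACEILMNORST ADEKLNORSTY"
print(words_made(lock, ["LOCK", "BIKE", "MILE", "WORD"]))  # ['LOCK', 'MILE']
```

Passing the full contents of words4.txt instead of the four sample words and taking `len()` of the result gives the lock's score.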

I believe Norvig’s strategy of improving a lock with random permutations would also be less likely to improve on this lock. Changing any letter would, by definition, mean choosing a letter that occurs in that position in fewer words. However, it’s still possible to improve; there might be some letter choices that, while poorer overall, are still better in combination with the specific letters already chosen for the other tumblers.

Update 15 Jun 2015: Someone was wrong on the internet and this time it was me! Astute readers will notice that a tiny off-by-one bug in my implementation (see the fifth revision) led it to generate a lock with three tumblers with eleven letters each, and one tumbler with ten letters.

The new best lock from this implementation generates only 1,161 words, leaving Norvig’s solution still the best:

Lock: ABCDLMPRST AEHILNORUY AEILMNORST ADEKLNOSTY
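The effect of the off-by-one bug is easy to see by counting the letters on each tumbler of the two locks printed in this post. This is just a quick sanity check; `tumbler_sizes` is a throwaway helper written for this purpose.

```python
def tumbler_sizes(lock):
    """Return the number of letters on each space-separated tumbler."""
    return [len(t) for t in lock.split()]

buggy = "ABCDGLMPRST AEHILNORUY ACEILMNORST ADEKLNORSTY"
fixed = "ABCDLMPRST AEHILNORUY AEILMNORST ADEKLNOSTY"

print(tumbler_sizes(buggy))  # [11, 10, 11, 11]
print(tumbler_sizes(fixed))  # [10, 10, 10, 10]
```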

There’s no such thing as bugs or features

Unless you’ve only ever worked with technical people, you’ve run into the old “is it a bug or is it a feature” argument. Generally, a business person reports something as a bug because it’s not working properly, but the technical people respond that the feature in question just isn’t built yet, or that the specific detail was not in the original specification. This can be a source of great friction because it usually involves technical people saying the product will take longer than expected to finish. Less often, the business side will report a problem as a new feature when it is really already-built code that is just not functioning properly. This is less contentious because it usually means the fix takes less work than the business side originally thought.

Is there a way to eliminate this debate?


Two new projects: German Grammar and Möbius

I’ve been hacking on two new projects in my spare time.

The German Grammar Explorer (mainly the German Declension Explorer) is helping me wrap my head around some of the more complex patterns in the German language. It’s also an experiment in deliberate synæsthesia: it uses a palette of eight colors plus white to color-code similar patterns and related morphosyntax. The idea is to give an intuitive feel for when the general patterns of the language are broken.

Möbius is a totally useless experiment in binding scroll events and doing funny stuff with them, and experimenting with some newer features of HTML 5 and CSS 3.

Fuchsia, baige [sic], puke, butter yellow, pistachio…

In a fascinating bit of amateur lexical analysis, Stephen Von Worley created this color strata image, using data collected from XKCD’s Color Survey:

Sadly, pistachio, a hue that’s notoriously difficult to pin down, is nowhere to be found. Spencer Finch will be disappointed.

If anyone’s interested in the actual linguistics behind color names, Berlin & Kay’s Basic Color Terms is the seminal work, although I’m pretty sure they didn’t discover color names like baige [sic], puke, or butter yellow.