Category Archives: essay

Essays

Monolithic repositories versus project repositories

Recently, Hacker News got ahold of Gregory Szorc’s article on monolithic repositories, and even Wired is weighing in on how big codebases should be organized. While the discussion is interesting, it seems to focus on two extremes: on the one hand, putting all of a company’s code into a single monolithic repository, and, on the other hand, breaking a big company’s code up into many, many small repositories. Both extremes are too simplistic. The better approach is to align repository boundaries with the software’s build, deployment, and versioning boundaries. What does this mean exactly? Perhaps it is best illustrated with some examples:

You have a small app that knows how to deploy itself, and you rarely, if ever, need to deploy anything other than the current version of it. Keep the deployment scripts in the same repository as the app.
You have a few existing APIs written in the same language and built using similar libraries. You’re about to build a replacement for one of those APIs in a different, high-performance language, which will use a different storage backend. Not only will the build and deployment steps for this new code be different, but you will want to deploy new versions of both the new and old app independently. Here you have build, deployment and versioning boundaries, so it’s natural to use a different repository for the new code. Since it’s a different language, the new code won’t be relying on any existing code, so there’s no temptation to copy and paste code into the new project.
You need to support a complex set of deployment environments: clusters of virtual machines running different services and connecting to different backends; testing, staging, and production environments running different versions of services; and so on. Here we have clear versioning and deployment boundaries: your deployment process needs to know about the different versions of what it is deploying, since you might need to roll back and deploy an earlier version. It also needs to be able to deploy, for example, the current stable version of the site backed by a MySQL database to the production environment, and also deploy the head of an experimental branch backed by PostgreSQL to the staging environment. You won’t be able to encapsulate the switch from MySQL to PostgreSQL in a single merge of a feature branch anyway, so it’s just a headache to maintain deployment scripts which know how to deploy to MySQL in the same branch where you have just removed all your MySQL-dependent code. So the deployment scripts are better kept in their own repository.
You have an API for internal use only, and a single consumer of that API, both running inside your private network. Here you have total control over versioning: you can change the API and its consumer in a single commit, ensure that both are updated at deploy time, and there’s no versioning boundary. If the API and its consumer are also written in the same language, then there should be little to no reason to keep them in separate repositories.
You have an API accessible over the public internet, and Android and iOS apps which talk to that API. Here you already have significant versioning boundaries. You can’t be sure that every app out there is up-to-date, so you have to keep the old APIs up and running for a while, or make sure the latest APIs are backwards compatible. And you have significant deployment boundaries: you wouldn’t want to wait two weeks to deploy an update to the API because you are waiting for the new API support in the iOS app to make it to the App Store. Since any version of your API must support several different versions of your mobile apps, and any version of your mobile app must be able to talk to several different versions of your API, there is again no hope of rolling any change into a single (merge) commit, and there’s no added cleanliness or simplicity to gain from having these different pieces of the software in the same repository. So this code can be split into multiple repositories.

There are a few more things to remember when setting up repositories.

Splitting vs. Merging

It’s far simpler to split out part of a monolithic repository than it is to merge two independent repositories:

If you’re splitting, just create a new repository, push the old code there, remove everything but what you want to split out, and commit. You can then re-organize it if you like, but you don’t have to (and if you’re using git, you can even filter the history). All the files keep their history, and you don’t have to worry about file name collisions.
If you’re merging, you most likely have to do some top level reorganization first, then you can merge one repository into the other, but now looking at history before this merge point will show a jumbled mess of commits from both repositories.
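
The splitting case above can be sketched with git's own history filter. This is a minimal sketch, not a recipe from the article: the function names and their arguments are hypothetical, and it assumes `git filter-branch` with `--subdirectory-filter`, which rewrites history so that only one subdirectory remains. Newer versions of git warn that `filter-branch` is slow and point to other tools, so treat this as an illustration only.

```python
import subprocess
from typing import List


def split_commands(source_repo: str, new_repo_dir: str, subdir: str) -> List[List[str]]:
    """Build the git commands that carve `subdir` out of `source_repo`.

    All three arguments are hypothetical placeholders for illustration.
    """
    return [
        # Start from a full clone, so every file keeps its history.
        ["git", "clone", source_repo, new_repo_dir],
        # Optional step: rewrite history so that only `subdir` remains,
        # with its contents moved to the top level of the new repository.
        ["git", "-C", new_repo_dir, "filter-branch",
         "--subdirectory-filter", subdir, "--", "--all"],
    ]


def run_split(source_repo: str, new_repo_dir: str, subdir: str) -> None:
    """Run the commands in order, stopping at the first failure."""
    for command in split_commands(source_repo, new_repo_dir, subdir):
        subprocess.run(command, check=True)
```

Calling `run_split` needs git installed and a real repository to clone. Keeping the command list separate from its execution makes it possible to inspect the plan before any history is rewritten.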

Increasing Fragmentation

If you feel like you’re being forced into creating a third repository to store code that is needed by two other repositories, then that’s probably a sign that those two repositories should be a single one. This is a common trap that projects get into; once they have split their repositories too finely, then the only solution seems to be more splitting. When considering this option ask yourself: do these two repositories really have different build, deployment, and versioning boundaries? If not, bite the bullet and merge the repositories, rather than creating a third one for the shared code.

Ease of Access

Ease of access to code is often presented as an advantage of the monolithic repository model. But this argument is unconvincing. A programmer can still have access to all the company’s code, even if that requires cloning multiple repositories. And in the vast majority of cases, a programmer is going to work on two, maybe three, different projects at the same time. It’s not as if Google is filled with programmers who work on WebM encoding for YouTube on Monday, map-reduce for Google search on Tuesday, CSS and JavaScript for Gmail on Wednesday, Java for Android on Thursday, and Chrome on Windows on Friday. Programmers like that are extremely rare, and you shouldn’t optimize your repository structure to make them happy, especially not if it means forcing everybody else to download and track changes to large amounts of code they will never touch.

In Conclusion

To sum up, both the one monolithic repository dogma, and the many small project-based repositories dogma are oversimplified to the point of being harmful. Instead, focus on splitting your code into repositories along its natural versioning, build, and deployment boundaries.

Someone suggested that my Five Eyes Flag would work as a flag for the English language. While this isn’t quite right—any flag for the English language would have to include Ireland, probably South Africa, and arguably many other places (Belize, India, etc.)—it got me thinking what languages could use a flag of their own.

It would have to be a language (officially) spoken in more than one, but not more than a handful of countries. French, Spanish & Arabic are too widely spoken, and there are already a boatload of bad flags for the German language. So I decided to try designing a flag for the Chinese language.

Chinese is the official language of five polities. A symbol from each polity’s flag appears on this flag, atop a color also taken from that polity’s flag. In order from left to right, they are Singapore, Macau, Hong Kong, The Republic of China (a.k.a. Taiwan), and The People’s Republic of China (a.k.a. just China). The symbols are arranged in an arc that mirrors the geographic locations of the five in East Asia, from Singapore in the south, to Taiwan off the east coast of mainland China.

Also, note that I said polities and not countries: Macau and Hong Kong are technically not countries but Special Administrative Regions of China, and Taiwan is not widely recognized as a country. There’s lots to be offended about by this flag; not only the animosity between China and Taiwan but the fact, pointed out to me by a friend, that the flag of Taiwan is actually the flag of the KMT, the dominant political party there.

I hereby release this flag into the public domain so it may stoke the flames of many internet flame wars. Enjoy!

Better, Fewer, Shorter Meetings

Many organizations are plagued with intolerably long, frequent, and ineffective meetings. Here’s what I’ve learned about how to have better, fewer, shorter meetings. Most of these insights come from working with Freyr Guðmundsson at my start-up (formerly GoodsCloud, now NewStore), and he tells me that many are originally from the book Death by Meeting.

Meeting types

There are three types of meetings:

Strategic: meetings about what we should do.
Tactical: meetings about how we are going to accomplish the strategy we already agreed on.
Reflective: meetings about what went right and wrong and how we can do better next time.

These meeting types should occur in this sequence, in a loop.

Everyone should understand these meeting categories. Every meeting should fall into only one category. If a meeting category starts to change, that’s ok, but the group should be conscious of that when it happens.

Strategic meetings can change into tactical ones: “we can’t choose that strategy until we know how it will work” or reflective ones: “we tried that last time and it didn’t work”. Tactical meetings can change into strategic ones: “Well now maybe with these new thoughts we picked the wrong strategy” or reflective ones: “we tried that last time and it didn’t work”. And reflective meetings can turn into either, as you discover a better strategy or tactic and want to start exploring it.

When a strategic or tactical meeting changes to a different type, it’s a sign that previous meetings were incomplete or skipped, that an important participant was excluded, or that institutional knowledge and experience have not spread to the whole team. When a reflective meeting changes to a different type, it’s a sign of the team getting ahead of itself.

When a meeting type is changing, one thing the participants can do is explicitly note that it’s now a different kind of meeting and agree to the change. But it’s often better to put the triggering issue aside and reschedule a new meeting of the new type for that issue, because frequently different people need to be in the meeting now that it’s a different type.

Controlling meeting lengths

Strategic meetings are the only kind that should be open-ended. It helps to loosely time-box them, but if they go over, that’s ok.
Tactical and reflective meetings should be strictly time-boxed and rescheduled, not continued, if they go over time.
Develop zero tolerance for anyone arriving late to meetings. Being five minutes late for a six-person meeting wastes thirty minutes of company time. There are stories about companies that charge people $1 per minute per person that they are late to meetings. Two minutes late to an eight-person meeting? $16 off your paycheck. I’m not suggesting this (it is probably illegal in some places) but it’s good to make people aware that showing up late has a real cost.
Give a five minute grace time at the end of the meeting for people to take a break, go to the bathroom, get a coffee or a smoke, before their next meeting. So, schedule them for 11:00-11:25, or 11:00-11:55, not 11:00-11:30, or 11:00-12:00. This grace period will also help with tardiness.
Meetings tend to expand to their allotted time. Experiment with five-, ten-, or fifteen-minute meetings; don’t just make everything a half hour or an hour.

Anecdote: I once had a two-day workshop with a client who flew in three people to meet with two from our company. We spent about an hour understanding the problem, and another hour working out a solution. Then we spent the remainder of the day and a half mediating a contentious argument between two people from the client company, which was ultimately irrelevant to the solution. And we couldn’t stop it. These two people were in a power struggle and they suddenly had the free time allocated to pursue it. At the end of day two, as the client was packing up to go to the airport and get back to work, the two guys agreed to disagree on their contentious point, and we all reiterated our commitment to the solution we’d come up with in the first two hours the day before. It had been a huge waste of everyone’s time, but a great example of how people can fill up left-over meeting time with something else.

Meeting Participants

Meetings should contain only the minimum number of people necessary. The absolute best is to keep meetings to no more than two or three people.

If someone is sitting there, typing/tapping on his/her laptop/phone, and not really listening, that person was not necessary and shouldn’t have been invited.

Enforcing Meeting Etiquette

The most important point about meeting etiquette is to develop a culture where everyone, not just the boss or meeting leader, is aware of and advocating for the company’s meeting etiquette and rules.

The person who calls the meeting is responsible for publicizing the meeting type and the allotted time. If this is not done, then the rest of the team should ask for that information at the beginning of the meeting, or it should be canceled.
It’s everyone’s responsibility, not just the boss or meeting leader, to keep an eye on the time, to stick to the meeting type, and to call out when the time is up or when the meeting is changing type.
It’s also everyone’s responsibility to speak up when there is too much conflict. Often it’s the two most powerful/influential people in the meeting who end up arguing, and the others feel unable to speak up about it, and end up trapped listening to an argument. Develop a culture where anybody, even the person who just started yesterday, can speak up and say “calm down!” or ask “can you continue this discussion at a different time?”

Examples

I would hold strategic architecture meetings with just two or three of my devs from a nine-person team, and schedule them for an hour, and almost always end them early.
Our nine-person dev team’s reflective meetings were strictly time-boxed to one hour (well, 55 minutes). We frequently had more to talk about at the end, but usually they were points that only one or two people really cared about, so we would break out into little groups and talk about those remaining issues informally over the next day or so.

Conclusion

It might seem like a lot of work to stick to all of these guidelines and ideas, but actually it wasn’t that hard for us. I hope it helps you enjoy better, fewer and shorter meetings too.

A revised double-decker g for Apple’s San Francisco font

Based on some great feedback on Reddit regarding my revisions to Apple’s San Francisco font, I’ve revised my revised double-decker g:

As before, the original:

With a humanist a:

And with the new double-decker g:

And SVG versions: alternate a, alternate g, and an animated version over on tumblr.

Let’s code about bike locks some more

I just got sucked into Peter Norvig’s Let’s Code About Bike Locks. If you haven’t read that yet, read it first, at least the beginning. Otherwise this will make no sense.

The strategy of starting from the first tumbler and then calculating the subsequent ones seemed like it could be improved on. What if we looked at all the four-letter English words, and chose the most common letter, on any tumbler, fixed that letter on its tumbler, and then continued on to the second most common letter, on any tumbler, and so on? How well could we do?

So I coded it up, and here’s the result, a lock that can make 1,410 words from Norvig’s list at http://norvig.com/ngrams/words4.txt, 170 more words than his best:

Lock: ABCDGLMPRST AEHILNORUY ACEILMNORST ADEKLNORSTY

I believe Norvig’s strategy of improving a lock with random permutations would also be less likely to improve on this lock. Changing any letter would, by definition, mean choosing a letter that occurs in that position in fewer words. However, it’s still possible to improve; there might be some letter choices that, while poorer overall, are better for the specific letters already chosen for the other tumblers.

Update 15 Jun 2015: Someone was wrong on the internet and this time it was me! Astute readers will notice that a tiny off-by-one bug in my implementation (see the fifth revision) led it to generate a lock with three tumblers with eleven letters each, and one tumbler with ten letters.

The new best lock from this implementation only generates 1,161 words, leaving Norvig’s solution still the best:

Lock: ABCDLMPRST AEHILNORUY AEILMNORST ADEKLNOSTY
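
The greedy strategy described above can be sketched roughly as follows. This is one reading of the description, not the implementation referred to in the post: the function names are made up for illustration, and the limit of ten letters per tumbler is taken from the corrected lock above. It counts how often each letter appears in each position, then walks those counts from most to least common, placing each letter on its tumbler until that tumbler is full.

```python
from collections import Counter
from typing import Iterable, List


def greedy_lock(words: Iterable[str], tumblers: int = 4,
                letters_per_tumbler: int = 10) -> List[str]:
    """Fill each tumbler with the letters most common in that position."""
    # Count how many words have each letter at each position.
    counts = Counter()
    for word in words:
        if len(word) != tumblers:
            continue  # ignore words that cannot fit the lock
        for position, letter in enumerate(word.upper()):
            counts[(position, letter)] += 1

    lock = [[] for _ in range(tumblers)]
    # Walk (position, letter) pairs from most to least common, on any tumbler,
    # and fix each letter on its tumbler unless that tumbler is already full.
    for (position, letter), _ in counts.most_common():
        if len(lock[position]) < letters_per_tumbler:
            lock[position].append(letter)
    return ["".join(sorted(tumbler)) for tumbler in lock]


def count_words(lock: List[str], words: Iterable[str]) -> int:
    """Count the words that the lock can spell, one letter per tumbler."""
    return sum(
        1 for word in words
        if len(word) == len(lock)
        and all(letter in tumbler for letter, tumbler in zip(word.upper(), lock))
    )
```

With Norvig's word list saved locally, something like `greedy_lock(open("words4.txt").read().split())` would build a lock and `count_words` would score it. Checking `len(tumbler)` for every tumbler in the result is a cheap guard against the kind of off-by-one mistake described in the update.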

Apple’s San Francisco font: adding double-decker g’s and humanist a’s

Apple’s new San Francisco font is going to be a vast improvement on Helvetica as a system font in iOS 9 and OS X 10.11. But it features a double-story a without a double-decker, looptail, or eyeglass g. It’s always seemed right to me for a font to have either both, or neither, of these special letters.

Turns out the font world disagrees with my intuition. Futura is the only one of my favorite fonts with a single-story a, and while Gill Sans, Trebuchet, Times, Palatino, Optima and American Typewriter all have both double-story as and double-decker gs (left side), Helvetica, Arial, Courier, Verdana, and Lucida Grande (right side) all have mixed double-story as with simple humanist gs.

However, I still wanted to see a more Futura-like and a more Gill-Sans-like San Francisco, and although the font is only available at the moment to Apple developers, I was able to get a copy, and I’ve made alternate glyphs.

Here’s the original:

And here it is with a simple humanist a:

And here it is again with a double-decker g:

And just for fun, here are SVG versions of my alternate a and alternate g, and an animated version over on tumblr.

Update: I hereby release it all into the public domain. Apple, if you’re listening, feel free to incorporate one or the other of these into San Francisco.

Three Player Chess Variants

I recently got curious about three player games in general, and three player chess variants in particular. This spawned a whole page about three-player chess variants, complete with a discussion of the topology of particular boards, some of the more interesting moves and patterns, and printable and/or creative-commons licensed, downloadable board designs.

If you’re interested in chess, three-player games, or geometry & topology, check it out.

There’s no such things as bugs or features

Unless you’ve only ever worked with technical people, you’ve run into the old “is it a bug or is it a feature” argument. Generally, a business person reports something as a bug because it’s not working properly, but the reaction from technical people is that that particular feature just isn’t built yet or that specific detail was not in the original specification. This can be a source of great friction because it usually involves technical people saying the product will take longer than expected to finish. Less often, business will report a problem as a new feature, but the problem is already-built code that is just not functioning properly. This is less contentious because it usually means it’s less work to fix than business originally thought.

Is there a way to eliminate this debate?

Continue reading →

A tour of the differences between JavaScript and Python

Introduction

JavaScript and Python are two very important languages today. Too many programmers, however, work in both languages, but know just one of them well. This means they end up writing code in one language in the same style as the other, unaware of some of the more subtle differences between the two. If you know JavaScript or Python well, and you want to improve your skills and knowledge of the other, I wrote this article for you.

Disclaimer: I know Python (slightly) better than I know JavaScript, and I’ve not done any JavaScript outside of the browser, so I tried to keep the bits about JavaScript agnostic to the host environment, but, fair warning, there may be subtle differences in server-side JavaScript that are not mentioned here because I don’t know them.

Continue reading →

Why you shouldn’t use git merge --rebase

There is a common belief that git merge --rebase is somehow preferable to normal merging. The general assertion seems to be that a linear history is somehow “cleaner”, “easier to understand”, and that normal merging introduces “extra commits” and “merge bubbles”, the latter presumably being only slightly less objectionable than economic bubbles. Some organizations even go so far as to mandate always merging with --rebase. But ask someone to give a real, technical justification (just one) for this belief, and they mumble some aesthetic vapidities and then start talking about the weather.

Let’s put aside for a moment the ridiculous assertion that a directed acyclic graph is somehow more difficult for programmers—programmers!—to understand than a linear history. I want to show you how normal merging is in fact preferable to using --rebase all the time.

Continue reading →

glyphobet • глыфобет • γλυφοβετ

musings over a tuna fish sandwich
