Greater Than Code

The Human Side of Technology

Commit Etiquette: git commit -m “considered impolite”

Andrew Mason is a Software Engineer at Yelp and an avid reader and podcast listener. He is an active participant in CodeNewbies and spends the little remaining free time writing Rust, scripts to automate menial tasks, and the occasional blog post.

Ding! Ada’s¹ phone buzzed, telling her about a new JIRA² ticket assigned to her. It seemed like a fairly straightforward ticket – just fix this tiny bug in one of their older libraries.

Hours later, Ada was banging her head against her keyboard. The code in this library had tentacles entangled everywhere. It felt like the butterfly effect applied to software – delete one parameter from one function, and another thread breaks under a particular race condition. But she was nothing if not persistent. A git blame on the line in question showed that no one had touched it literally in years. No surprises there. As a last bastion, she decided to look at the git logs for the two files side by side, hoping some past author left any breadcrumbs of how these two pieces interacted.

Ada face-desked. Abject horror flashed across her face, which she promptly buried in her hands in despair. The commit read: Add param to function so other thread doesn't crash, and she knew the author used “-m”.

Commit Kindly

A lot of git tutorials instruct beginners to do something like the following:

git commit -m "my first commit"

I get it. If you don’t instruct them to use the m flag, you have to talk about gitconfig, EDITOR, and probably how to exit vim. That adds cognitive overhead to the task of “learning git” that may discourage initial committers. But – I would argue – it’s fundamental to using git well, and further that not talking about it reinforces bad practices.

Committing from the command line encourages conciseness over verbosity. If you make a typo or come up with a more clear phrasing, you probably won’t bother to whack the arrow key to go back and fix it. But commit messages should be verbose! Commit messages are not just a gratuitous step that we go through to appease git (although some people may think this). Writing commits – just like writing code – is fundamentally an exercise in communication. They will eventually be consumed by humans, and, when they are, they ought to provide value to the human.

With this in mind, I advocate the following practices when committing changes.

Keep It Small

Commit small, coherent sets of changes. This has nothing to do with how you write commit messages, and everything to do with how you stage your changes. “git add -p” is your friend here. Ensuring your diff is small and coherent will make the next practice far easier to follow.

Context is Everything

Commit messages should be structured as follows:

High-level summary of the changes. Keep it short

Paragraphs upon paragraphs of _actual prose_, providing any relevant
details about the changes. If you want to take advantage of
`git shortlog` to have prettier logs, you should wrap the lines at 72
characters.

Relevant details can include:  potential “gotchas”, reasonings behind
changes, motivations, relevant context for the changes, or anything
else that may be helpful to some future developer spelunking through
the commit history.

These details should be much more of “why” behind the change, rather
than a “what”. The "what" can be obtained by looking at the diff.

Minor Nits

The main point of my writing this is to talk about the content of commit messages, rather than the format, but because of Reasons, there are two formatting nitpicks that I need to talk about. The high-level summary should not end in a period, because of git send-mail. There should, however, be an additional newline to separate the summary line from the detailed notes. I’m actually not sure whythis is, but Linus has spoken, so maybe we’re all just too stupid to get it (he once actually said “if you still don’t like it, that’s OK: that’s why I’m boss. I simply know better than you do.” Ouch.).

Commit Showcases

I went looking through some repos to pull out some examples to illustrate my point. Here’s what I found.

The Good

[Derek Prior](https://github.com/rails/rails/commit/13fd5586cef628a71e0e2900820010742a911099) (Rails):

Add `redirect_back` for safer referrer redirects

`redirect_to :back` is a somewhat common pattern in Rails apps, but it
is not completely safe. There are a number of circumstances where HTTP
referrer information is not available on the request. This happens often
with bot traffic and occasionally to user traffic depending on browser
security settings.

When there is no referrer available on the request, `redirect_to :back`
will raise `ActionController::RedirectBackError`, usually resulting in
an application error.

`redirect_back` takes a required `fallback_location` keyword argument
that specifies the redirect when the referrer information is not
available.  This prevents 500 errors caused by
`ActionController::RedirectBackError`.

Sean Griffin (Rails):

Allow `Relation#or` to accept a relation with different `references`

Note that the two relations must still have the same `includes` values
(which is the only time `references` actually does anything). It makes
sense for us to allow this, as `references` is called implicitly when
passing a hash to `where`.

Fixes #29411

Ariel Ben-Yehuda (Rust):

typeck::check::coercion - roll back failed unsizing type vars

This wraps unsizing coercions within an additional level of
`commit_if_ok`, which rolls back type variables if the unsizing coercion
fails. This prevents a large amount of type-variables from accumulating
while type-checking a large function, e.g. shaving 2GB off one of the
4GB peaks in #36799.

Marcus Buffett (Rust):

Catch IOError

If config.toml doesn't exist, then an IOError will be raised
on the `with open(...)` line. Prior to e788fa7, this was
caught because the `except` clause didn't specify what
exceptions it caught, so both IOError and OSError were
caught

Alex Crichton (Rust):

rustc: Don't use DelimToken::None if possible

This commit fixes a regression from #44601 where lowering attribute to HIR now
involves expanding interpolated tokens to their actual tokens. In that commit
all interpolated tokens were surrounded with a `DelimToken::None` group of
tokens, but this ended up causing regressions like #44730 where the various
attribute parsers in `syntax/attr.rs` weren't ready to cope with
`DelimToken::None`. Instead of fixing the parser in `attr.rs` this commit
instead opts to just avoid the `DelimToken::None` in the first place, ensuring
that the token stream should look the same as it did before where possible.

Closes #44730

Sean Griffin (Diesel):

Allow ad-hoc inserts using tuples

This provides a more lightweight API for inserts that mirrors what is
possible with `.set` for updates. The implementation provided ensures
that all elements within the tuple are valid for the same table, and
that there's no funkiness like sticking an upsert value in there.
(Secret way to break Diesel -- It will totally allow you to stick a
`Vec` in there, and will generate invalid SQL as a result. We should fix
that one eventually, but it's pretty low priority)

One thing I would also like to allow is to allow mixing ad-hoc values
and `Insertable` structs. (See 10374a5
for rationale). I tried a naive approach where I moved the parens out
from the `InsertValues` impl on tuples, and into `InsertStatement`, but
that breaks PG upsert, so I'll need to spend more time on it. I've left
the test cases for that in place, but I will defer actually making them
pass for a later PR in order to keep the PR size and scope reasonable.

I expected to need more tests/compile-tests, but what's included here
felt like it pretty exhaustively tests everything.

Fixes #789

Sean Griffin (Diesel):

rm `IntoInsertStatement`

This gets rid of `IntoInsertStatement`, and turns `BatchInsertStatement`
into the only visible type for constructing queries. In its
implementations of `LoadDsl` and `ExecuteDsl`, it ends up constructing
`InsertStatement` which is what actually performs the execution.

The names are now super poor and misleading. The reason I've opted not
to rename them in this commit is that I actually think I'm going to end
up getting rid of `BatchInsertStatement` entirely, and only have
`InsertStatement`. If we don't end up going that route, I'm going to
rename `InsertStatement` to `ExecutableInsertStatement` and rename
`BatchInsertStatement` to just `InsertStatement`.

Aside from simplifying our codebase, this change also makes sure that
all insert statements constructed will include the 0-row checking case.
I think we can eventually just move this into the `QueryFragment` impl
by doing `SELECT returning_clause FROM table WHERE 1=0`. This would mean
the only thing we have to special case is SQLite batch insert.

Fixes #797

Sean Griffin (Diesel):

Refactor `InsertValues` for tuples

This changes the implementation of `InsertValues` on tuples to not care
about its interior type, and moves the SQLite special handling to
`ColumnInsertValue` directly.

The addition of the `Table` parameter on `InsertValues` is required for
the tuple implementation to enforce that all values are targeting the
same table.

One side effect of this structure is that it will (eventually) allow
arbitrary nesting of `Insertable` types. This is important for use cases
like rocket, where you will likely have a `user_id` field which comes
from the session, but we want you to derive `Insertable` and `FromForm`
on the same struct. This means we should be able to write code like the
following:

    #[derive(Insertable, FromForm)]
    #[table_name = "posts"]
    struct NewPost {
        title: String,
        body: String,
    }

    #[post("/posts", data = "<form>")]
    fn create_post(data: NewPost, user: AuthenticatedUser, conn: DbConn)
        -> QueryResult<Redirect>
    {
        use schema::posts::dsl::*;

        let created = insert(&(user_id.eq(user.id), data))
            .into(posts)
            .returning(id)
            .get_result::<i32>(&*conn)?;
        let url = format!("/posts/{}", created);
        Ok(Redirect::to(&url))
    }

This should also let us simplify the `default_values` code dramatically.

This commit message is fantastic! It has all the detail you could possibly want to know about the change, along with a code sample showing what the change is going for!

The Bad

Me! (Conway):

single commit implementation :O

Yup. I committed (get it?) a sin here. I am ashamed, and I should feel ashamed. All I can do is try to be better in the future.

Sean Griffin (Diesel):

Don't look at this commit

It's too embarrassing

Sean brought this commit up on The Bike Shed. His commits are usually so good (see above) that I couldn’t not put this here to tease him a bit.

The Ugly

Warning: Most of these contain expletives, and are not safe for work. But if you want to see some really poor commit messages, check out Commits Logs From Last Night.

In Closing

Commits are meant to be read by humans. Provide lots of context when describing your changes. It’s like taking the time to eat well or get enough sleep – you may want to take shortcuts, but the benefits to doing it Right will pay off in major ways in the long term.

PS: In the original story, I played both roles. That is to say, I was both Ada and the Bad Committer. Sometimes, the reader of your breadcrumbs is Future You. So at the very least, be kind to Future You, and don’t commit with “-m”.

¹ Yes, the Ada Lovelace.
² Believe it or not, JIRA was around even then.