TeamCity, Containers and SSH keys

TeamCity is our Continuous Integration (CI) environment of choice at Airsource, and we love it. Yes, there are open source options for CI, but when you spend your days developing commercial software it makes sense to break out the checkbook for tools that make everyone's job easier.

Even with the best tools for the job, you still sometimes run into interesting issues. We have recently been doing more web development for our clients, and have settled on using containers both for deployment and for CI. This means that your build agents are pulling exactly the right (consistent) environment during the build process, and further reduces the scope for build agent configurations to diverge.

This was the scenario I ran into the other week:

  • A TeamCity build step running an "npm build" command
  • The npm environment needing to pull an in-house package from our GitLab deployment
  • The build agent being unable to pull the package due to SSH key problems

The symptoms

Well, lots of this:

Node.js: code 128, An unknown git error occurred

and looking more deeply into the logs:

npm ERR! An unknown git error occurred
npm ERR! Warning: Permanently added 'git.airsource.co.uk,10.0.1.178' (ECDSA) to the list of known hosts.
npm ERR! git@git.airsource.co.uk: Permission denied (publickey).
npm ERR! fatal: Could not read from remote repository.
npm ERR!
npm ERR! Please make sure you have the correct access rights
npm ERR! and the repository exists.

This points to git being unable to authenticate with the repository. It actually makes perfect sense: the Node.js container is isolated from the host system, so of course it can't access the agent's SSH keys. We need to supply the SSH credentials to the container somehow.

Attempt 1: ssh-agent

This is the clean solution: use the SSH agent to expose the credentials to the container. And it should work. Unfortunately, the Node.js build process doesn't seem to pick up the SSH agent. Phooey. Same result.
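The exact arguments tried here aren't shown, so the following is only a sketch of what agent forwarding into a container commonly looks like: bind-mount the agent's socket and point SSH_AUTH_SOCK at it. The function name, the /ssh-agent path inside the container, and the fallback socket path are all placeholders, not anything from our configuration.

```shell
#!/bin/sh
# Sketch only: a common way to forward an SSH agent into a container.
# agent_forward_args is a hypothetical helper; /ssh-agent is an arbitrary
# path chosen for the socket inside the container.
agent_forward_args() {
    # $1 = path of the agent socket on the Docker host
    printf '%s\n' "-v $1:/ssh-agent -e SSH_AUTH_SOCK=/ssh-agent"
}

# Print the arguments for the current agent socket (placeholder if unset).
agent_forward_args "${SSH_AUTH_SOCK:-/path/to/agent.sock}"
```

One thing to check if you try this: Docker Desktop for Mac documents a dedicated socket, /run/host-services/ssh-auth.sock, for agent forwarding rather than the host's own SSH_AUTH_SOCK path. Whether that is related to the failure above is not something I verified.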

Attempt 2: expose the keyfile

This solution actually improved things. In the additional docker run arguments, mount the agent's SSH key somewhere the container can find it:

TeamCity build step configuration: attempt 2

A little finessing is required - we need to ensure that the GIT_SSH_COMMAND is set to use the identity you have carefully exposed to the container. Assuming your build agent is in /Users/agent, and you've created the necessary ssh key for it with ssh-keygen, you should be able to do the following:

-v "/Users/agent/.ssh/id_rsa:/tmp/id_rsa"
-e "GIT_SSH_COMMAND=/usr/bin/ssh -i /tmp/id_rsa"

and it will all start to work. Well, nearly...

npm ERR! An unknown git error occurred
npm ERR! command git --no-replace-objects ls-remote git@git.airsource.co.uk:airsource/fine-video-scrubber.git
npm ERR! Host key verification failed.
npm ERR! fatal: Could not read from remote repository.
npm ERR!
npm ERR! Please make sure you have the correct access rights
npm ERR! and the repository exists.

The npm resolution process is trying to log in to our GitLab to retrieve the npm package, but unfortunately ssh inside the container has no record of the host's key! So it refuses to connect to what it sees as an unknown host.
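A quick way to confirm this kind of diagnosis is to ask ssh-keygen whether a given known_hosts file has an entry for the host. The sketch below assumes a reasonably recent OpenSSH, where ssh-keygen -F exits non-zero when nothing matches; has_host_key and the KNOWN_HOSTS variable are hypothetical names for illustration.

```shell
#!/bin/sh
# Sketch: check whether a known_hosts file contains an entry for a host.
# has_host_key is a hypothetical helper; it relies on ssh-keygen -F exiting
# non-zero when no matching entry is found.
has_host_key() {
    host="$1"
    known_hosts="$2"
    ssh-keygen -F "$host" -f "$known_hosts" >/dev/null 2>&1
}

# Default path follows the /Users/agent layout assumed in this post.
KNOWN_HOSTS="${KNOWN_HOSTS:-/Users/agent/.ssh/known_hosts}"
if has_host_key git.airsource.co.uk "$KNOWN_HOSTS"; then
    echo "host key present in $KNOWN_HOSTS"
else
    echo "no host key for git.airsource.co.uk in $KNOWN_HOSTS"
fi
```

Run on the build agent, this tells you whether the agent itself already trusts the host. A fresh container normally starts with no known_hosts at all, and a non-interactive build cannot answer the usual "are you sure you want to continue connecting" prompt.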

Security misstep: don't do this

The tempting solution is to just ignore host keys by adding -o StrictHostKeyChecking=no to GIT_SSH_COMMAND. And you know what? That will actually work. But it opens you up to so-called man-in-the-middle attacks: someone who breaks into your network can insert a malicious fake git host that hoovers up credentials. A lot of people suggest it online, and I can see why it's tempting to just shove in the change and call it a day.

But we can do better...

Attempt 3: expose both the keyfile AND the known_hosts file

Much better, of course, is to ensure that the container knows about the same servers as your container host. We can do that very easily by exposing the known_hosts file to the container as follows:

TeamCity build step configuration: attempt 3

-v "/Users/agent/.ssh/id_rsa:/tmp/id_rsa"
-v "/Users/agent/.ssh/known_hosts:/tmp/known_hosts"
-e "GIT_SSH_COMMAND=/usr/bin/ssh -o UserKnownHostsFile=/tmp/known_hosts -i /tmp/id_rsa"

With this minor tweak, the Node.js container fires up and is able to pull npm packages from our GitLab just fine, no drama.
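If you want the build to fail early with a clear message when one of those files is missing on an agent, a small pre-flight check can help. This is a sketch rather than something from our setup: check_ssh_files, docker_ssh_args, and the SSH_DIR variable are hypothetical names, and the default path simply follows the /Users/agent assumption above.

```shell
#!/bin/sh
# Sketch: verify the files that attempt 3 mounts, then print the arguments.
# check_ssh_files, docker_ssh_args and SSH_DIR are hypothetical names.

# Return non-zero if the key or known_hosts file is missing or unreadable.
check_ssh_files() {
    ssh_dir="$1"
    for f in id_rsa known_hosts; do
        if [ ! -r "$ssh_dir/$f" ]; then
            echo "missing or unreadable: $ssh_dir/$f" >&2
            return 1
        fi
    done
    return 0
}

# Print the three docker run arguments, one per line, for a .ssh directory.
docker_ssh_args() {
    ssh_dir="$1"
    printf '%s\n' \
        "-v \"$ssh_dir/id_rsa:/tmp/id_rsa\"" \
        "-v \"$ssh_dir/known_hosts:/tmp/known_hosts\"" \
        "-e \"GIT_SSH_COMMAND=/usr/bin/ssh -o UserKnownHostsFile=/tmp/known_hosts -i /tmp/id_rsa\""
}

SSH_DIR="${SSH_DIR:-/Users/agent/.ssh}"
if check_ssh_files "$SSH_DIR"; then
    docker_ssh_args "$SSH_DIR"
fi
```

Since the container only needs to read these files, appending :ro to each -v mount (for example /tmp/id_rsa:ro) is an option worth considering; I have not needed it here, so treat it as a suggestion.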