Git continually impresses me. I've gone down a bit of a rabbit hole with how it works, and it's just a delight. I want to find a good way to see if a git repo changed, and I wonder if there's a way to do this using http directly, and there is.

Investgation

First thing I did was to setup a proxy and run a git clone through it so I could see what the url is.

First setup a proxy

This will create a web proxy on localhost:8080 with a ui on localhost:8081. Once you kill this container everything cleans up.

1
  docker run --rm -it -p 8080:8080 -p 8081:8081 mitmproxy/mitmproxy mitmweb --web-host 0.0.0.0

Open up http://localhost:8081 to check it out.

Setup the test environment

Since I'm going to be messing with my git config global state, lets just spin up a throw away container so I don't need to worry about it after. Also, install git in the environment.

We'll also set the -global config variables for http.proxy and http.sslVerify to allow the mitm proxy to be accepted.

1
2
3
4
5
6
  docker run --rm -it --network host debian:10
  # You should be in the container now
  apt-get update && apt-get install -y git

  git config --global http.proxy http://localhost:8080
  git config --global http.sslVerify false

Then clone

1
git clone https://github.com/wschenk/willschenk.com/

Now we can see that

request urlGET https://github.com/wschenk/willschenk.com/info/refs?service=git-upload-pack HTTP/2.0
user agentgit/2.20.1
accepts/

And it returns a binary file format of type application/x-git-upload-pack-advertisement

We can kill our docker containers, and save the file to start parsing!

Note that github anyway looks at the user agent to determine what to serve, not the Accept header:

1
curl -H "Accept: */*" https://github.com/wschenk/willschenk.com/info/refs?service=git-upload-pack
Not Found

So use -A to download the data.

1
2
  curl -A "git/2.20.1" --output ref.bin https://github.com/wschenk/willschenk.com/info/refs?service=git-upload-pack
  ls -l ref.bin
-rw-r--r-- 1 wschenk wschenk 2223 Feb 11 16:55 ref.bin

Parsing the file

Now that we have this binary file, we need to figure out how to read it. Basically, file contains records with 4 byte headers telling how long the each message is. We read the header, then those number of bytes afterwards (hex count).

The first record also contains a null byte, which is a list of capabitilities that the git server supports. One thing that's interesting here is symref=HEAD:refs/head/master which is what tells you what HEAD is pointing to. We'll look for that and seperate out and print that line if needed.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
  # Adapted from page 543 of Building Git
  def recv_packet input
    head = input.read(4)
    return head unless /[0-9a-f]{4}/ =~ head

    size = head.to_i(16)
    return nil if size == 0

    input.read(size - 4).sub(/\n$/, "")
  end

  File.open( "ref.bin", "rb" ) do |input|
    while !input.eof?
      packet = recv_packet input
      if packet
        line, options = packet.split( "\0", 2 )
        puts options if options
        puts line
      end
    end
  end

Which returns

# service=git-upload-pack
multi_ack thin-pack side-band side-band-64k ofs-delta shallow deepen-since deepen-not deepen-relative no-progress include-tag multi_ack_detailed allow-tip-sha1-in-want allow-reachable-sha1-in-want no-done symref=HEAD:refs/heads/master filter object-format=sha1 agent=git/github-g2ff1cad44179
847ea6126ee5f8ec5c4ad24e51b527face7a2979 HEAD
8fa8f73e7afa4564d6fb9d3788b71d06b705a98a refs/heads/askgit
e6a7c3f659b0da619ca3a9093084c57bd75705ca refs/heads/edit-effigy
614fa2058304444b2bcbbc963829c99252ecbb8e refs/heads/gh-pages
aaeb84f5c985786ed5905356325f37f5a7f136f2 refs/heads/hugo
b8dfbe29de07d79f115f5814df0981c592046e78 refs/heads/look-gemfiles
847ea6126ee5f8ec5c4ad24e51b527face7a2979 refs/heads/master
514bd4d82776edbb973576315fe5a479821a8fd7 refs/heads/tailwindedit
3c2ca082e7b8331a127afae2dec968d7cebc2ddf refs/meta/_netlify_cms
041b7a90c63d744418df2634d85d6f2d8974a9c8 refs/pull/1/head
03cee5d1d11a75e03b374c6d7600095332d4d178 refs/pull/10/head
e710ce9b944db4d418638ccaac4ade28b6f05bf8 refs/pull/11/head
d30312612ecdda3f46bc617c7167254ed6adcd6a refs/pull/12/head
25be0d44efb3cd2d890c5cbd151a84fb302efb16 refs/pull/13/head
5e8f6760d36c9a2cb6be656b68d82dc63ce91b61 refs/pull/14/head
e6a7c3f659b0da619ca3a9093084c57bd75705ca refs/pull/15/head
52b9dd7146cee391519307f64fa3b7039e54a2dc refs/pull/16/head
b93d017621b4324a5267b58cbd52a2b9ae2e2c0f refs/pull/17/head
8fa8f73e7afa4564d6fb9d3788b71d06b705a98a refs/pull/18/head
b8dfbe29de07d79f115f5814df0981c592046e78 refs/pull/19/head
a6bac7782c04514fe084449b27ec8eba9c794f7e refs/pull/2/head
d265bda5243b1ee0991cd241f52f7f9e908d1c72 refs/pull/20/head
dd032eab146ab1a8d15a406806f35da9a14f4842 refs/pull/3/head
54297948071e5531b1c0550b7e80f7af2b310db4 refs/pull/4/head
a3d591c21779b5ff9ecf5ee75f1e88ba37f9f11e refs/pull/5/head
4ff3703c2166bd4d7f5b123f90bb22da526dd3bc refs/pull/6/head
514bd4d82776edbb973576315fe5a479821a8fd7 refs/pull/7/head
2bbb09950d485304907850438cb381ddedc68aeb refs/pull/8/head
9eae9ff2e6f708ba605813ed4bb7bd6f3b8fbf23 refs/pull/9/head
9a843f070f3f11e09c689ba5c159fb261236029d refs/tags/middleman

Next steps

This solves my use case because I can now compare the references to the previously seen versions, and if they've changed do something else. I would really recommend looking in deeper to see how the object transfer protocol is implemented, it's pretty amazing how git is put together.

Previously

Installing faasd cgi was good, serverless is better

2021-02-08

Next

Jarred-Sumner/git-peek

2021-02-12

howto

Previously

Installing faasd cgi was good, serverless is better

2021-02-08

Next

Building static OpenFaas templates Packaging up the packager

2021-02-12