Release code diffs
What changes between releases
- tags
- packagemanagers
- npm
- bundler
- git
When tracking and upgrading software you want to have an idea of what changed. Looking at the readme is helpful, and projects that keep a changelog are polite and friendly, but it's nice to get down to it and see what the changes actually are.
Loading the repo and finding the tags
We first need to look at where the code is from. In looking at
gemfiles we found how to see what gem you are currently working with,
and in looking at package.json we did the same for npm. The logic is
the same once we have:
- Repo
- Original tag
- New tag
For testing purposes, we'll assume that you've already pulled this information out of your package manager, so that we know the following:
Package | middleman |
Repository | https://github.com/middleman/middleman |
Original Version | 4.3.3 |
New version | 4.3.9 |
We'll start by cloning or updating the repo into a directory. In this
case, the repo will go into /tmp/middleman/repo. Our analysis will go
in /tmp/middleman.
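One way to do the clone-or-update step, as a minimal sketch in Ruby. It assumes git is on the PATH and that fetching tags is enough of an "update" for this analysis; the names sync_command and sync_repo are hypothetical helpers, not part of any library.

```ruby
require "fileutils"

REPO_URL = "https://github.com/middleman/middleman"
WORK_DIR = "/tmp/middleman"
REPO_DIR = File.join(WORK_DIR, "repo")

# Pick the git command needed to bring repo_dir up to date:
# clone when there is no checkout yet, otherwise fetch new tags.
def sync_command(repo_url, repo_dir)
  if Dir.exist?(File.join(repo_dir, ".git"))
    ["git", "-C", repo_dir, "fetch", "--tags", "origin"]
  else
    ["git", "clone", repo_url, repo_dir]
  end
end

# Create the parent directory and run the command (needs network access).
def sync_repo(repo_url, repo_dir)
  FileUtils.mkdir_p(File.dirname(repo_dir))
  system(*sync_command(repo_url, repo_dir)) or raise "git failed for #{repo_url}"
  puts "Repo directory: #{repo_dir}"
end

# Uncomment to actually hit the network:
# sync_repo(REPO_URL, REPO_DIR)
```

Keeping the command builder separate from the part that shells out makes it easy to check the logic without a network connection.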
Repo directory: /tmp/middleman/repo
Tags
Now that we have the repo, we need to find the tags in the project. We'll pull down a list of all the tags in the project.
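A sketch of listing the tags, assuming that "latest" means most recently created, which is why the tags are sorted with --sort=-creatordate. The helper names are hypothetical.

```ruby
require "open3"

# Turn the raw output of `git tag` into a clean list of tag names.
def parse_tag_list(output)
  output.lines.map(&:strip).reject(&:empty?)
end

# List tags newest-first by creation date (assumes git is on the PATH).
def list_tags(repo_dir)
  out, status = Open3.capture2("git", "-C", repo_dir, "tag", "--sort=-creatordate")
  raise "git tag failed" unless status.success?
  parse_tag_list(out)
end

# tags = list_tags("/tmp/middleman/repo")
# puts "latest_tag : #{tags.first}"
```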
latest_tag : v5.0.0.rc.1
Semver
Let's parse the tags to see whether they are version numbers and whether they are prereleases, so that we can tell if there is a higher patch version that matches our criteria.
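A minimal sketch of the parsing, assuming tags look like v4.3.9 or v5.0.0.rc.1 and that anything after the third number counts as prerelease information. TagVersion and parse_tag are hypothetical names, and the pattern is a simplification rather than a full semver parser.

```ruby
# A parsed tag: numeric parts plus any prerelease suffix (e.g. "rc.1").
TagVersion = Struct.new(:tag, :major, :minor, :patch, :pre) do
  def prerelease?
    !pre.nil?
  end

  # Numeric key used to compare releases; the prerelease text is ignored.
  def key
    [major, minor, patch]
  end
end

# Optional "v", three dot-separated numbers, optional ".suffix" or "-suffix".
TAG_PATTERN = /\Av?(\d+)\.(\d+)\.(\d+)(?:[.-](.+))?\z/

# Returns a TagVersion, or nil when the tag is not a version number.
def parse_tag(tag)
  m = TAG_PATTERN.match(tag)
  return nil unless m

  TagVersion.new(tag, m[1].to_i, m[2].to_i, m[3].to_i, m[4])
end

%w[v4.3.9 v5.0.0.rc.1 docs-update].each do |tag|
  v = parse_tag(tag)
  puts v ? "#{tag}: #{v.key.join('.')} prerelease=#{v.prerelease?}" : "#{tag}: not a version"
end
```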
We can filter through the versions to find only those without any
prerelease info (like .rc.1 or .beta-2, which mark candidate releases)
and print out what the current latest is:
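A sketch of the filter, this time leaning on Gem::Version from RubyGems for the comparison rather than a hand-rolled parser (an assumption of this example). The tag list is a small sample and latest_release is a hypothetical helper.

```ruby
require "rubygems"

# Pair each tag with a Gem::Version, skipping tags that are not versions.
def tag_versions(tags)
  tags.map do |tag|
    text = tag.sub(/\Av/, "")
    [tag, Gem::Version.new(text)] if Gem::Version.correct?(text)
  end.compact
end

# Highest tag whose version carries no prerelease component.
def latest_release(tags)
  releases = tag_versions(tags).reject { |_, version| version.prerelease? }
  releases.max_by { |_, version| version }&.first
end

tags = %w[v4.3.9 v4.3.10 v5.0.0.rc.1 v4.3.3 not-a-version]
puts "latest_release : #{latest_release(tags)}"
```

Comparing parsed versions rather than strings matters here: as plain text, v4.3.9 would sort after v4.3.10.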
latest_release : v4.3.10
Now we can write some methods to match our versions to tags:
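A small sketch of the matching, assuming the project tags releases either as v4.3.3 or as a bare 4.3.3. The helper name tag_for is hypothetical.

```ruby
# Find the tag for a version string, trying the common "v" prefix first.
def tag_for(version, tags)
  ["v#{version}", version].find { |candidate| tags.include?(candidate) }
end

tags = %w[v4.3.3 v4.3.9 v4.3.10 v5.0.0.rc.1]
puts "current_tag: #{tag_for('4.3.3', tags)}"
puts "new_tag    : #{tag_for('4.3.9', tags)}"
```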
current_tag: v4.3.3
new_tag : v4.3.9
We can also look to see what is the latest patch release for the same
major and minor version that we're running. According to semver
standards – and it's not totally clear how strictly these are
followed or even if they make sense – patch releases should be bug
fixes only and totally compatible.
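A sketch of picking the highest patch release on the same major and minor line, again assuming Gem::Version for the comparison. latest_patch is a hypothetical helper and the tag list is a sample.

```ruby
require "rubygems"

# Highest non-prerelease tag sharing the major and minor of `version`.
def latest_patch(version, tags)
  major_minor = Gem::Version.new(version).segments.first(2)
  candidates = tags.map { |tag| [tag, tag.sub(/\Av/, "")] }
                   .select { |_, text| Gem::Version.correct?(text) }
                   .map { |tag, text| [tag, Gem::Version.new(text)] }
                   .reject { |_, v| v.prerelease? }
                   .select { |_, v| v.segments.first(2) == major_minor }
  candidates.max_by { |_, v| v }&.first
end

tags = %w[v4.3.3 v4.3.9 v4.3.10 v4.4.0 v5.0.0.rc.1]
puts "most compatible patch: #{latest_patch('4.3.3', tags)}"
```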
most compatible patch: v4.3.10
Change information
Now that we have identified the tags, we can start to ask questions
about what happened between those two results. We'll use the git log
command, which is amazingly powerful and super cool.
Super cool. You heard me.
These are the formatting options that we'll be using:
Option | Description |
%aI | Author date, strict ISO 8601 |
%h | Hash |
%ae | Author email |
%an | Author name |
%s | Summary |
We're going to output a log of the changes that happened between the
tags. We'll have to do some crazy regex to pull out the fields.
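A sketch of the log step, assuming the fields are joined with a "|" separator (a choice made for this example) and pulled apart with one regex using named captures. The sample line in the usage is made up for illustration.

```ruby
require "open3"

# Fields in the order %aI, %h, %ae, %an, %s, separated by "|".
LOG_FORMAT = "%aI|%h|%ae|%an|%s"
LOG_LINE = /\A(?<time>[^|]+)\|(?<hash>[0-9a-f]+)\|(?<email>[^|]*)\|(?<name>[^|]*)\|(?<summary>.*)\z/

def log_command(repo_dir, from_tag, to_tag)
  ["git", "-C", repo_dir, "log", "--pretty=format:#{LOG_FORMAT}", "#{from_tag}..#{to_tag}"]
end

# Turn raw log output into an array of hashes, skipping lines that do not match.
def parse_log(output)
  output.each_line.map { |line| LOG_LINE.match(line.chomp) }.compact.map do |m|
    { time: m[:time], hash: m[:hash], email: m[:email], name: m[:name], summary: m[:summary] }
  end
end

# Runs git; git log prints the newest commit first.
def commits_between(repo_dir, from_tag, to_tag)
  out, status = Open3.capture2(*log_command(repo_dir, from_tag, to_tag))
  raise "git log failed" unless status.success?
  parse_log(out)
end

# Made-up sample line, only to show the shape of the parsed result.
p parse_log("2020-01-01T00:00:00+00:00|abc1234|dev@example.com|Example Dev|Fix thing (#12)")
```

Only the summary may contain the separator in this sketch; an author name with a "|" in it would need a different delimiter.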
Oldest change : 2018-01-20T08:26:10+09:00
Most recent change: 2020-09-09T14:06:57-07:00
We can calculate how many days have passed between the releases like so:
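A minimal sketch of the calculation: parse both ISO 8601 timestamps, subtract them to get seconds, and divide by the number of seconds in a day. days_between is a hypothetical helper.

```ruby
require "time"

SECONDS_PER_DAY = 86_400.0

# Difference between two ISO 8601 timestamps, in fractional days.
def days_between(older, newer)
  (Time.iso8601(newer) - Time.iso8601(older)) / SECONDS_PER_DAY
end

days = days_between("2018-01-20T08:26:10+09:00", "2020-09-09T14:06:57-07:00")
puts "Days passed : #{days}"
```

Rounding with days.round or days.floor would drop the extra precision if whole days are all you need.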
Days passed : 963.9033217592593
Which is a bit unnecessarily precise but gives you a sense of all the years that have gone by.
Tickets
A lot of projects put a ticket number in the commit message, in the
format #nnn where nnn is an integer. Projects that use Jira generally
use keys made of a few letters, a dash, and a number instead. Here we
only look for the #nnn form, so let's print out the issues that we
find in the commit summary messages.
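A sketch of the extraction, assuming we only care about #nnn references. The summaries below are a few of the commit messages from this release range, and ticket_numbers is a hypothetical helper.

```ruby
# Collect the distinct "#123"-style references, sorted numerically.
def ticket_numbers(summaries)
  summaries.flat_map { |summary| summary.scan(/#(\d+)/).flatten }
           .uniq
           .sort_by(&:to_i)
end

summaries = [
  "Update kramdown to avoid CVE-2020-14001 in v4 (#2348)",
  "Fix ignore of I18n files (#2143)",
  "Fix #2083",
  "Prep 4.3.9"
]
p ticket_numbers(summaries)
```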
Which gives us:
["2083", "2143", "2287", "2316", "2323", "2327", "2348"]
From here we could try to figure out which issue tracker this repository is using, and then cross-reference the ticket numbers to see what has been going on.
In the case of this repository we see in .github/CONTRIBUTING.md that
it uses GitHub Issues, which is pretty common and popular for GitHub
hosted projects, and not like shocking or anything.
A brief excursion into the CHANGELOG
The commit messages are semi-automated, and if you look at the Keep a Changelog site they recommend against dumping commit messages in directly. Let's try to parse the changelog in the repo to see if we are missing any other issues or interesting things.
This is a fairly straightforward way to "parse" this file, but since it's freeform we don't know if many projects support it. This works as a quick scaffold now though.
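A quick scaffold along these lines, assuming each release starts with a Markdown heading that contains the version number, such as "# 4.3.3" or "## [4.3.3]". That heading convention is an assumption of this sketch, not something every changelog follows, and the helper names are hypothetical.

```ruby
# Heading like "# 4.3.3", "## v4.3.3" or "## [4.3.3] - 2019-01-01".
HEADING = /\A#+\s*\[?v?(\d+\.\d+\.\d+[^\s\]]*)\]?/

# Split changelog text into { "version" => "body text" }.
def changelog_entries(text)
  entries = {}
  current = nil
  text.each_line do |line|
    if (m = HEADING.match(line))
      current = m[1]
      entries[current] = String.new
    elsif current
      entries[current] << line
    end
  end
  entries
end

def report(entries, version)
  entries.key?(version) ? "Found change log for #{version}" : "No entry for #{version}"
end

# entries = changelog_entries(File.read("/tmp/middleman/repo/CHANGELOG.md"))
# puts report(entries, "4.3.9")
# puts report(entries, "4.3.3")
```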
Which yields:
No entry for 4.3.9
Found change log for 4.3.3
So not that useful in this case.
Seeing the authors
Using the CLI
One thing you can do with the regular (awesome) git CLI is run git
shortlog, which shows you a rolled-up version of commits grouped by
author. Here I'm using -n, which sorts authors by number of commits.
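A sketch of running this from Ruby, assuming the two tags found earlier. The wrapper names are hypothetical; the git flags are the standard ones.

```ruby
require "open3"

# `git shortlog -n` for the commits reachable from to_tag but not from_tag.
def shortlog_command(repo_dir, from_tag, to_tag)
  ["git", "-C", repo_dir, "shortlog", "-n", "#{from_tag}..#{to_tag}"]
end

# Passing an explicit range matters: with no revision and no terminal,
# git shortlog reads a log from standard input instead.
def shortlog(repo_dir, from_tag, to_tag)
  out, status = Open3.capture2(*shortlog_command(repo_dir, from_tag, to_tag))
  raise "git shortlog failed" unless status.success?
  out
end

# puts shortlog("/tmp/middleman/repo", "v4.3.3", "v4.3.9")
```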
And we can see that Thomas Reynolds seems to have done most of the
maintenance work.
Thomas Reynolds (13): Bump minor Lock old bundler Disable bind test on travis Update changelog [ci skip] Prep Add Ruby 2.7.0 to CI Prepare 4.3.6 Disable therubyracer Bump Prep release Update changelog Fix #2083 Prep 4.3.9
Alexey Vasiliev (1): Update kramdown to avoid CVE-2020-14001 in v4 (#2348)
Johnny Shields (1): Fix ignore of I18n files (#2143)
Julik Tarkhanov (1): Reset Content-Length header when rewriting (#2316)
Leigh McCulloch (1): Loosen activesupport dependence (#2327)
Maarten (1): Fix i18n with anchor v4 (#2287)
bravegrape (1): Add empty image alt tag if alt text not specified (#2323)
We can also include -s to show the summary only; in other words, it
prints just the commit count per author and leaves out the commit
one-liners.
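The summary output is easy to read back into a program. A sketch, assuming each line of git shortlog -s -n is a right-aligned count, a tab, and then the author name; parse_shortlog_summary is a hypothetical helper.

```ruby
# Parse `git shortlog -s -n` output into [author, count] pairs.
def parse_shortlog_summary(output)
  output.each_line
        .map { |line| line.strip.split(/\s+/, 2) }
        .select { |parts| parts.length == 2 }
        .map { |count, name| [name, count.to_i] }
end

sample = "    13\tThomas Reynolds\n     1\tAlexey Vasiliev\n     1\tJohnny Shields\n"
parse_shortlog_summary(sample).each { |name, count| puts "#{count} #{name}" }
```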
13 Thomas Reynolds
1 Alexey Vasiliev
1 Johnny Shields
1 Julik Tarkhanov
1 Leigh McCulloch
1 Maarten
1 bravegrape
Using code to summarize authors
We can recreate this view pretty simply:
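A sketch of the roll-up, assuming the commits have already been parsed into hashes with a :name key (a shape chosen for this example). author_counts is a hypothetical helper and the commits are a short sample.

```ruby
# Count commits per author and list the busiest authors first.
def author_counts(commits)
  counts = Hash.new(0)
  commits.each { |commit| counts[commit[:name]] += 1 }
  counts.sort_by { |_, count| -count }
end

commits = [
  { name: "Thomas Reynolds", summary: "Prep 4.3.9" },
  { name: "Maarten", summary: "Fix i18n with anchor v4 (#2287)" },
  { name: "Thomas Reynolds", summary: "Fix #2083" }
]
author_counts(commits).each { |name, count| puts "#{count} #{name}" }
```

Authors with the same count come out in no particular order here, whereas git shortlog breaks ties alphabetically.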
13 Thomas Reynolds
1 Johnny Shields
1 Maarten
1 Julik Tarkhanov
1 bravegrape
1 Leigh McCulloch
1 Alexey Vasiliev
The sort order is slightly different, but 1 is 1…
Listing the files changed
Between the two versions we want to see everything that changed. We
can do this using the git diff command, and pass in --numstat to see
the files that changed.
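A sketch of running and parsing this, assuming the tab-separated layout git uses for --numstat (added, deleted, path), with "-" in both count columns for binary files. NumstatEntry and the helpers are hypothetical names.

```ruby
require "open3"

NumstatEntry = Struct.new(:added, :deleted, :path)

def numstat_command(repo_dir, from_tag, to_tag)
  ["git", "-C", repo_dir, "diff", "--numstat", from_tag, to_tag]
end

# Parse "added<TAB>deleted<TAB>path" lines; binary files get nil counts.
def parse_numstat(output)
  output.each_line.map do |line|
    added, deleted, path = line.chomp.split("\t", 3)
    next if path.nil?

    NumstatEntry.new(added == "-" ? nil : added.to_i,
                     deleted == "-" ? nil : deleted.to_i,
                     path)
  end.compact
end

# Runs git and returns the parsed entries.
def numstat(repo_dir, from_tag, to_tag)
  out, status = Open3.capture2(*numstat_command(repo_dir, from_tag, to_tag))
  raise "git diff failed" unless status.success?
  parse_numstat(out)
end

# numstat("/tmp/middleman/repo", "v4.3.3", "v4.3.9").each do |e|
#   puts "#{e.added || '-'} #{e.deleted || '-'} #{e.path}"
# end
```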
18 0 .devcontainer/Dockerfile
37 0 .devcontainer/devcontainer.json
4 0 .travis.yml
29 0 CHANGELOG.md
2 2 Gemfile
17 24 Gemfile.lock
316 315 middleman-cli/features/preview_server.feature
11 0 middleman-core/features/default_alt_tag.feature
4 0 middleman-core/features/i18n_link_to.feature
67 11 middleman-core/features/ignore.feature
5 2 middleman-core/features/liquid.feature
17 0 middleman-core/features/markdown_kramdown.feature
1 1 middleman-core/features/relative_assets.feature
1 1 middleman-core/features/relative_assets_helpers_only.feature
0 0 middleman-core/fixtures/default-alt-tags-app/config.rb
1 0 middleman-core/fixtures/default-alt-tags-app/source/empty-alt-tag.html.erb
- - middleman-core/fixtures/default-alt-tags-app/source/images/blank.gif
1 0 middleman-core/fixtures/default-alt-tags-app/source/meaningful-alt-tag.html.erb
2 0 middleman-core/lib/middleman-core/core_extensions/default_helpers.rb
9 0 middleman-core/lib/middleman-core/core_extensions/i18n.rb
5 0 middleman-core/lib/middleman-core/core_extensions/inline_url_rewriter.rb
4 0 middleman-core/lib/middleman-core/renderers/kramdown.rb
1 1 middleman-core/lib/middleman-core/template_renderer.rb
1 1 middleman-core/lib/middleman-core/version.rb
1 1 middleman-core/middleman-core.gemspec
1 1 middleman/middleman.gemspec
--numstat shows you the number of lines added and deleted in each
file, and from this it looks like most of the work on the repo was in
the test directory, for a feature named preview server. The actual
number of changes to the main source code seems pretty small. If we
want to take a look at what those changes are, we can diff the two tags:
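A sketch of the full diff between the two tags, assuming the same repo directory as before. The wrapper names are hypothetical.

```ruby
require "open3"

# Plain `git diff` between two tags.
def diff_command(repo_dir, from_tag, to_tag)
  ["git", "-C", repo_dir, "diff", from_tag, to_tag]
end

# Runs git and returns the unified diff as one big string.
def full_diff(repo_dir, from_tag, to_tag)
  out, status = Open3.capture2(*diff_command(repo_dir, from_tag, to_tag))
  raise "git diff failed" unless status.success?
  out
end

# puts full_diff("/tmp/middleman/repo", "v4.3.3", "v4.3.9")
```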
This produces a huge output of the diffs of all the code files from
the one tag to the other. I'll spare you from scrolling through, but
if we look just at the version.rb file you can see that it shows the
diff from where you start to where you end up, in this case from
version 4.3.3 to 4.3.9.
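One way to narrow things down to a single file is to split the full diff text on its "diff --git" headers and pick out the path you want. A sketch, with split_diff as a hypothetical helper and a shortened sample diff; passing the path to git after "--" would achieve the same thing.

```ruby
# Split unified diff text into { "path" => "that file's diff" }.
def split_diff(text)
  chunks = {}
  current = nil
  text.each_line do |line|
    if (m = %r{\Adiff --git a/.+ b/(.+)$}.match(line))
      current = m[1]
      chunks[current] = String.new
    end
    chunks[current] << line if current
  end
  chunks
end

# Shortened sample: headers only, plus one changed line each.
sample = <<~DIFF
  diff --git a/Gemfile b/Gemfile
  -old line
  +new line
  diff --git a/middleman-core/lib/middleman-core/version.rb b/middleman-core/lib/middleman-core/version.rb
  -  VERSION = '4.3.3'.freeze unless const_defined?(:VERSION)
  +  VERSION = '4.3.9'.freeze unless const_defined?(:VERSION)
DIFF

puts split_diff(sample)["middleman-core/lib/middleman-core/version.rb"]
```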
diff --git a/middleman-core/lib/middleman-core/version.rb b/middleman-core/lib/middleman-core/version.rb
index 42bc84bc..753d3c87 100644
--- a/middleman-core/lib/middleman-core/version.rb
+++ b/middleman-core/lib/middleman-core/version.rb
@@ -1,5 +1,5 @@
 module Middleman
   # Current Version
   # @return [String]
-  VERSION = '4.3.3'.freeze unless const_defined?(:VERSION)
+  VERSION = '4.3.9'.freeze unless const_defined?(:VERSION)
 end
In summary
Given a Gemfile.lock or a package-lock.json we can see which version of a module you are currently running, where the code is hosted, and what the latest version is. From here we can pull down the repo, look for the tags that mark each specific version, and see who worked on it and what the overall diffs are, so we know exactly what code has changed. This works if the code is hosted on GitHub, or in any other git repository.
In addition to looking at the repository between the tags, we can also pull in static analysis for other parts of the project. We can put the git log in SQLite and do further analysis.
Each of these steps needs further refinement but we've got all of the major pieces together.