Newsletter Archive

Inspecting changes locally before pushing

If you work on your branch you run into the situation that you would like to push your changes to the remote repository. CI will then pick up your changes and run the linting and code quality checks on it. Afterwards, you will see whether you improved the quality. But perhaps there are some new violations that crept into the code? Happens to all of us!

I usually like to see and check if any new issues might come up on CI. This lets me improve them before I push.

The checks I run locally depend on the kind of work that I do. Lately that’s a lot of Ruby on Rails again — which is great. I love that framework and the language.

To grade my code, I use Rubocop, Reek and Flay. If you run their respective commands on your repository, they will check the whole code base. This might be ok, if you didn’t have any issues before. Since I join teams, these days, that work on legacy projects it is rare that there are no problems with the code. If I run the commands just so, I will get a long list and couldn’t possibly see the issues that I introduced through my changes. Lucikly, there is Git and some “command line foo” that can help us here:

git fetch && git diff-tree -r --no-commit-id --name-only master@\{u\} head | xargs ls -1 2>/dev/null | grep '\.rb$' | xargs rubocop

This command will fetch the current state from the remote and diff your branch/changes to master branch. It then runs rubocop on these changes.

In my ~/.aliases.local file I added three lines for all three linters.

# Code Quality
alias rubocop-changes="git fetch && git diff-tree -r --no-commit-id --name-only master@\{u\} head | xargs ls -1 2>/dev/null | grep '\.rb$' | xargs rubocop"
alias reek-changes="git fetch && git diff-tree -r --no-commit-id --name-only master@\{u\} head | xargs ls -1 2>/dev/null | grep '\.rb$' | xargs reek"
alias flay-changes="git fetch && git diff-tree -r --no-commit-id --name-only master@\{u\} head | xargs ls -1 2>/dev/null | grep '\.rb$' | xargs flay"

I am still working on a way to just call one command and have all thee commands run. That doesn’t yet work. Probably because of exit-code reasons, when one linter finds issues.

These simple commands offer a convenient way to find local issues and correct them before pushing to CI.

Avdi Grimm’s view on deleting tests

A few weeks ago I wrote about deleting your tests. Yesterday I received the weekly email from Avdi Grimm, where he touches on this subject.

Some premises about my relationship with unit testing:

I like test-driven development.

I like driving out individual object design small, isolated (e.g. no database) unit tests.

I think of these unit tests as a design aid, full stop. Any help they provide with preventing regressions is gravy.

I treat unit tests as disposable. Once they have served their purpose as design aids, I will only keep them around so long as they aren’t getting in the way.

These four premises are strongly interconnected. Take away #1 (test first), and tests are no longer a design aid and none of the other points are valid. Take away #2 (isolation) and we get into integration test territory where there’s a much higher possibility of tests having regression-prevention value. Take away #3 (design focus) and the other points lose their justification. Take away #4 (disposability) and I spend all my time updating tests broken code changes.

This makes it easy for me to find myself at cross purposes with others in discussions about unit testing, because often they come into the conversation not sharing my premises. For instance, if your focus in unit testing is on preventing regressions, you might well decide isolated unit tests are more trouble than they are worth. I can’t really argue with that.

A recent conversation on this topic inspired the thought from the driving-design perspective, maybe unit tests are really just a crutch for languages that don’t have a sufficiently strong REPL experience. While I don’t think this is strictly true, I think the perspective shines a useful light on what we’re really trying to accomplish with unit tests. I almost think that what I really want out of unit tests is the ability to fiddle with live objects in a REPL until they do what I want, and then save that REPL session directly to a test for convenience in flagging API changes.

That conversation also spawned the idea of immutable unit tests: unit tests that can only be deleted, never updated. A bit like TCR. I wonder if this would place some helpful incentives on the test-writing process.

You should subscribe to his newsletter, if you haven’t yet.

Why should you use Linters?

What is a linter?

If you write code, you want you code to look a certain way. You want the syntax to meet your stylistic requirements. In the beginning, when you learn coding, you don’t have much preference for a certain style. It is common to copy what you see others do. That is good. Continue doing that. After doing that for a longer while and learning to write code on your own, you might join a team.

When a team of developers writes code, the syntax should not show the author.

When a team of developers writes code, the syntax should not show the author. The code look the same and conform to the same rules regardless of who wrote it. A linter is a tool that helps you achieve that goal. It scans your code for deviations from the rules your team set up. You can even use it to automatically correct your code.

Why are linters necessary?

Code is read often, but written only once. It is easier for a team to read code, if it is written in one style. This includes rules on whitespace, punctuation and specifics of the language you write in. It isn’t always easy to follow the rules and conform to the styleguide. A linter will warn you when you deviate from the standard and help you correct your code.

How do you use a linter?

Linters observe the code you write, while you write it. They are integrated into your editor, or your IDE. A few editors and IDEs already bring preconfigured linters. Most of the time you have to configure the rules to match you team’s rules.

When you write code that your linter does not like, you will get a notice in your editor. This is what it looks like for Rubocop in the IDE RubyMine.

Should you lint all your code?

Yes. If you want your team or other developers to work with you code, you should lint it. There are reasons to deviate from common standard coding guidelines. You might have your reasons and I won’t challenge you here. Just be aware that the standard is used by the majority of developers on the majority of projects. Consider changing jobs or teams. You suddenly would be facing new guidelines. You’d have to adept your coding style. If you know the standard you can follow it more easily.

How does linting work for teams?

The same way it works for individuals. The team comes to a conclusion on the style they want to use. This could be a standard style with some minor customizations. Next, the team writes or adepts configuration files for the linters they want to use. These are then made available to every developer to use.

If you follow my guidelines on achieving software quality, you will have a continuous integration workflow that enforces your linting rules as well.

In the end your code reads easier and looks cleaner. Don’t be afraid that all code will look the same. There are still ways to find your own handwriting style when coding

External forces

I am occupied with learning these days. Learning on my own about visualizations of data among other topics. But also learning about learning. For that I read what other people think about learning. There are many things I have to learn about this whole topic. One thought I saw repeatedly, was about external forces, or limiting factors.

Let me elaborate what I mean by that: There are people that can motivate themselves more easily than others can. They reach their goals or at least try very hard. Others give up more easily when they face some resistance. As always, there are people in the middle between these extremes. You know best which group you belong to. 💪

What has this do with software quality? I am getting there… 😉

I am wondering how external forces could help improve quality. If you need to reach your goal and you don’t belong to the group of highly self-motivated people there are options like hiring a coach. Athletes do that all the time. I pay for a “virtual” coach that guides my running efforts.

How could you hire a “virtual” coach for your coding efforts, for reaching your targets on your software quality metrics? You could hire me or other “real” coaches, of course. But that doesn’t scale too well and might be too expensive.

Again, for some people it is easy enough to use static analysis or linting — a kind of coach in it’s own right — and follow their guidelines. Yet, still there are people that ignore the warnings or guidelines imposed upon them by the tools. Reasons may be a hard deadline or too much workload. How could we offer external forces, limiting factors that help them, guide them, towards doing the right thing?

One solution I can think of is to have a robot not accept your code when it is below standard or ignores guidelines. A robot could be anything that measures and grades your code and reports back to your team. Some tools already offer this, for example GitLab. If you want to merge code that decreases the overal quality metrics, you are not allowed to do so. So that would be one.

Another idea: If you try to commit or merge such code, you need to consult with another developer about the code. Once you worked on it together, the other dev has to enter her secret key, to remove the lock on the merge. This forces you to pair on code more often.

When it comes to teaching there is this saying of the “glass has to be empty (enough).” You cannot pour water into it, when it’s already filled. Said ideas 👆probably won’t work for a team that isn’t aiming for learning and improving.

I will continue to think.

Complex conditionals

The other day we dealt with code coverage and gnarly conditionals. I promised to offer a way to be able to test them properly.

THERE IS NONE.

Ha, what a bad joke. But the real answer might not be better, depending on your point of view.

What you have to do is create a table.

(A || B) && C

This is our conditional.

| m | A | B | C | RESULT |
----------------------
| 0 | T | T | T |    T   | 
| 1 | T | F | T |    T   | x   
| 2 | F | F | T |    F   | x   
| 3 | F | T | T |    T   | x   
| 4 | T | T | F |    F   |    
| 5 | T | F | F |    F   | x   
| 6 | F | T | F |    F   |    
| 7 | F | F | F |    F   |    

# m is the test case
# A, B, C are the atomic parts of the conditional
# RESULT is the result of the evaluation of the conditional

For three terms in a conditional, you can have 8 different cases (2^3). You don’t need to test every case. You have to find those cases where switching one term (A, B or C) changes the RESULT. You take those cases and write tests for them. You can ignore the rest as they don’t bring you any new information. For our example these could be the test cases 1, 2, 3, 4. I marked them with an x,

The general rule of thumb is that you can solve this with n + 1 test cases where n is the number of terms.

This technique is called Modified Condition/Decision Coverage or short MC/DC. I like this name, it’s easy to remember 🤘.

It gets harder to do, when one term of the conditional is used more than once (coupled). Another thing to take note of is that depending on the decision statement in the code, it may not be possible to vary the value of the coupled term such that it alone causes the decision outcome to change. You can deal with this by only testing uncoupled atomic decisions. Or you analyse every case where coupling occurs one-by-one. Then you know which ones to use.

You’ll have to do this in the aerospace software industry or where you deal with safety-critical systems.

If you read this far: Congratulations. This is one of the more in-depth topics of testing software. You deserve a 👏 for learning about these things!

Code coverage can be misleading

During the last week, I had two discussions about code coverage. Code coverage is the metric of how many lines of code are covered by your automated test suite. Many test frameworks have built-in ways to measure this. Other times you have to install another tool manually. When you run your tests you then see how many lines are not covered by a test. That means that no test was run where this line of code was evaluated or executed or interpreted.

When you reach 100% code coverage, what then? Are you done? Could you guarantee that there are absolutely no bugs in your code?

If you are tempted to say Yes, or “maybe?” then let me tell you that you are wrong.

Consider this piece of code.

If you write a unit test for this method, the line eval... will be interpreted because of the if emergency at the end. The line is thus covered.

But the code is not covered or tested.

Admittedly, this is a very trivial example that I made up. In reality, there are some more profound things to consider.

If you have complex conditionals you might need a logic table where you compare all possible combinations of the atomic parts of the conditional.

You cannot possibly evaluate this in your head and know whether you checked for every possible, sensible combination. Yet when you cover that line you are at 100% coverage and can go home, right?

So what do you do? Let’s look at this tomorrow.

Quick wins, part 2: Method names revisited

Yesterday, we had the first part of this series on quick wins and simple steps to improve your code quality. It was about naming — specifically variables and method names.

Two things were not 100% right in these examples.

The first error

You might have noticed that my loop examples were written in Ruby code. Yet the method name doSomething was written in camelCase. This is unusual for Ruby code where developers tend to use snake_case for method names.

I did not lint that email. Hence no robot told me about my error. I believe it is a good example of the benefit of automatic linting. This error would have been found. If you read the code and were put off by this naming scheme, you even experienced why conventions and rules are necessary: Because coding by the rules helps developers to focus on the semantics of the code, not the syntax.

The second problem

Do you remember that I wrote about JavaScript loops and gave the classic for loop example? I complained about the i variable and that it should be called iterator. Perhaps you did not like this idea and my example? Let me take a step back for a second.

When naming variables and method names, you have to make sure they “speak.” The names should indicate their meaning and make it easier for another developer to understand the semantics of your code. Yet, when you are fully aware of the problem domain you code deals with, it can happen that you try to be too specific. If you are, you tend to use long, verbose names for variables and methods. An example could be this:

A common value for the allowed length of method names is 30 chars. The above example could be broken into two methods.

The next possible quick win might be to think about the existence of these methods inside the Article class. Verifying and fixing meta tags surely does not need to be done inside this class. If you want to follow the Single Responsibility Principle, you should make sure that the Article class does not have a reason to change, if you decide to change something about verifying and fixing meta tags. Rather the Article class might change when you decide that an article should have a mandatory sub-headline.

Back to the iterator. This name could pass the test for too specific. Only, when i refers to a variable that is declared and initialized outside of the scope of the loop, does it make sense to choose a different name. Or doesn’t it? As always it depends. The classic for loop is taught in almost every book on JavaScript, and it is easily identifiable as to what it is. But there might be reasons to deviate from that, as I indicated above.

Conclusion

To summarize:

  1. Choose good names. 😎
  2. Spot places where knowledge about something does not belong.

Tomorrow we’ll look at another example for 2 and a practical idea of what to do about it.

How do you do PRs?

Before opening a PR

I advocate for using before-commit hooks in Git. These hooks enable you to run custom scripts before commiting changes. There a quite a number of quality checks, if you ask me (and you kind-of do, since you are reading this! 😉), and almost all of them should be run before comitting your changes. I’ll mention the checks later on, as well.

Baseline measurements of a PR

One of the most important things to remember when doing PRs:
Don’t let them get too big. A size I found to work well was around 400 lines changed. While this is no hard limit, going over it tends to incorporate too many changes at once. This makes the review harder for the reviewers.

Another thing not to lose track of, is the number of days a PR is open. You should review PRs as soon, as they are assigned to you, or when you are being mentioned in their description (or in comments). It is your duty not to let your team wait for too long on your feedback/review. If there are no major errors in the changes, a 400 lines feature should be reviewed and merged within 1–2 business days.

Who do you assign a PR to? Who should review?

This depends very much on your team structure and how you are organized. If it is possible, it would be great to get two people to review your changes. A healthy mix could be one person from your immediate team, and another one from a different team. This can have the benefit of a different perspective. The one from the outside-team probably doesn’t know too much about your feature and the specifics of your code. This forces you to be extra detailled when writing your description and documentation and they might not make some assumptions that you and your team take for granted. This situation needs to be handled carefully, you don’t want to get derailed by fundamental questions. But it could also prove to be beneficial to your feature, ymmv. It also increases knowledge across team boundaries, which I consider to be good practice.

If cross-team doesn’t work for you, try to get a more experienced (or same level as you) and a less experienced person from your team. While both reviewers will help you improve your result, the less experienced might get a chance to learn from reviewing your code.

In any case, @mention your reviewers in the description and (if possible in your software) assign the PR to the one you want to be responsible for merging.

What should you write in the description of a PR?

The first thing that you should write is a good title. This is the first thing that reviewers see, and it usually also goes into the emails they get about the PR. So make sure they understand what it is about. This is also a good place to let the reviewers know if you only want a review, without the goal of merging the PR. Perhaps you have an early draft of your feature, and want guidance on the general direction. You can do that by prefixing your title with something like

WIP: Integrate Stripe as payment provider

This WIP (work in progress) makes your intentions clear. If you are using GitHub, common practice is to use labels for that. So you could have a "Do not merge" label, or again a "WIP" label. I tend to always write it into the title as well, so people can’t miss it.

The next point to take care of is the description. This is where you should spent most of your time, when creating a PR. Consider writing about the following points:

  • What is changed?
  • Why was the change necessary?
  • How does the project / part of the code / microservice look after making your changes?
  • Are there any side-effects of your change?
  • Do you want to add any commentary about your changes? Perhaps link to Stack Overflow answers that helped you, or blog articles and books you referenced during development?
  • Is there anything unclear with your changes? Perhaps you have a road-block that you need help with? Or you are uncertain how your changes interfere with another component? Ask about that. This is the place to do that.

If it makes sense, you should add screenshots of the before/after situation. If your changes contain animations or important visual effects, provide a gif or a short screencast of your changes.

Finally, make clear what you want the reviewers to look for, test or what they should comment on. If they follow this guide, they will look for more than that, but it is always appreciated to get a little guidance from the author of a PR.

Don’t forget to link/reference the issue/ticket in your management software that this feature/PR belongs to.

Best practices on doing pull requests

This article will deal with a few best practices that should increase the quality of your PRs as well as make your life easier, when using PRs.

Deploy a test build

I wrote about it before, but you should have your build server (or continuous integration server) deploy your branch-to-be-reviewed to a seperate, clean staging environment. This is where testers can have a look at your changes. This environment is a 100% copy for the production environment. The only thing missing is the real data from production (since that’s sensitive data). During the deployment the CI server will populate the environment’s database with seed data that is close to the data on production (and mirrors the size of production’s database).

You could name this a staging or remote test environment.

Document the lifecycle in your project management tool

It doesn’t matter whether you use JIRA, Trello, Asana or something else. As long as the tool integrates with your build environment and repository in a way, that you can easily jump between the ticket, the PR and any documentation on the wiki. The ticket needs to document the complete lifecycle from the creation of the new branch with every commit on it, to the PR and the CI server’s build results as well as the merge and deployment to the remote-test environment and, lastly, to production. You want to have a clear audit-trail easily accessible. This is not done to be able to point fingers, but to be able to understand why things are the way they are. Even 12 months later. That’s what your project management tool is responsible for.

Merging when everything is green and 👍

Merging PRs can be a story in itself. The most important thing is: Only merge when every reviewer gave you the thumbs-up and when the CI server indicates that there were no errors at all.

The merging should be done by the assigned reviewer. If there are multiple people responsible, you can talk about who should merge. It shouldn’t be the author of the PR. Allowing the author to merge her own changes leads to a bad practice where people merge to quickly, without a proper review. Other than that, it doesn’t matter whether the person merging is more or less senior than the author.

Where do you merge into, or: What’s the base branch?

The simplest strategy is to merge a feature branch (and most other branches) into the master branch. This is usually the branch that gets deployed.
It could be, that you follow a different dev methodology, like [git-flow]. If that is the case, you will probably merge into develop. So your base branch would be develop.

No matter what your base branch, note that:

Smaller features make merging easier

The less changes you want to merge, the easier it will be to review them. You should count on the fact that development of other features will continue while you are developing your feature branch. Those might have started earlier, or there were other branches that fixed bugs. In any case, your base branch will have moved on and evolved. It had a certain state when you branched off of it, but that state won’t be the same it is in right now. The longer you wait with your feature, the more work you’ll have to do — after you already thought you were done — when you want to merge your changes. This can be frustrating. You wouldn’t be the first one to introduce bugs at this point. Since the code has evolved, you are not yet that familiar with the new pieces. But you’ll have to make your code work with them. That’s where bugs are hiding.

The smaller the feature, the earlier you can merge and the less painful a process it will be.

Commit early and often with good commit messages

I wrote about it before: Let your commits be small and many. Your commits should follow the SRP (Single Responsibilty Principle). Your commit should change one thing. It might and could have side-effects. But it’s purpose it to change one thing. In your commit message’s subject (50 char length at max) you should succinctly describe what changes this commit introduces. Whenever you reach to the word "and", you know your commit is too big.

A clean history of 2000 commits that are each tightly scoped and with clearly written commit messages is easier to understand and more helpful than 200 commits that each contain countless changes with minimal commit messages. Bad examples for commit messages, that are too often, are "fix", "kill bug", "add stripe" etc.

The commit message serves as your audit trail to your future self. Perhaps you have to do software development for a few years, but there will be a time where you scroll through your commit messages, searching for this one commit that helps you understand why a bug occurs, or why a library was rolled back to an older version. I’ve seen it with enough teams, that I really suggest you start improving your commit messages.

Do not squash your commits

This might be a contested opinion, but I am against sqashing commits. Please see the point above, but your commits paint a picture and tell a story. Squashing them loses this. Sure, the changes are there, but you are not able to identify how a change came to be.

There is one situation where squashing improves things: When you commit your code, without taking care to write good commit message or how to reduce your commit sizes. Then you squash your commits in the end and write a good, detailled commit message, that adheres to the before-mentioned best practices. This does increase the quality. The history, the how is still lost, but at least you can understand that one commit. That’s still better than many commits without proper messages.

There are people that do not like merge commits. I understand the reasoning, it seems to be an unnecessary commit. This is true, you can prevent this when rebasing your changes onto the base branch. This can be difficult if your repository changes very quickly.

If you do need to squash before merging (I don’t know or understand why, please tell me!) please only do it right before merging. Don’t do this when opening the PR. Your reviewers should be able to have a look at the history of the development of your branch.

Use templates to make your lives easier

Now that I explained why you need a well-written description and detailled commit messages, you might wonder how you can facilitate their creation.
I would propose you create templates.
We’ll start with commit message templates.

Git commit message template

Inside your project folder, you’ll find the .git folder. Open the .git/config file in your favorite text editor. Add the following lines:

[commit]
    # can be local to your project or global
    template = path/to/your/.gitmessage

It makes no sense, to create this for every project, so you can add this inside your global git configuration. On unix systems this can be found in

~/.gitconfig

You would add the same thing in there. I would propose you create a repository for your company or department where you store these templates. This makes it easier to update them for the whole team. Then you just clone the repository and link it correctly: template = path/to/cloned/repository/.gitmessage.

Here’s a link to my Git templates.

This is a basic template that you could tweak:

# 50-character subject line
#
# 72-character wrapped longer description. This should answer:
#
# * Why was this change necessary?
# * How does it address the problem?
# * Are there any side effects?
#
# Include a link to the ticket, if any.
#
# Add co-authors if you worked on this code with others:
#
# Co-authored-by: Full Name <email@example.com>
# Co-authored-by: Full Name <email@example.com>

You would have this pasted into every commit message you create as a comment (the # signs denote comments in git messages). Then you never forget what to write about.

These templates work locally for every developer.

Pull request templates

If you use GitHub or GitLab you can create a template that will be used when somebody creates a PR in your repository. It works mostly the same as the git commit message template: Your template gets pasted into the description of the PR and you can use it as an aid to fill in the textbox. As far as I know, there is currently no way to require certain information to be filled out.

A way around that would be to have your CI server parse the PR description and complain on the PR if certain things are missing. This would mean that your developers need to be very disciplined in filling out the description, or else risk complaints by the CI server. If you would be interested in something like that, I will show how to do that in another article.

If you are using GitHub, you can find their [description on how to use a pull request template on their help pages].
GitLab offer a similar description, showing you [how to create merge request templates for their service].

Both services let you use markdown to write your descriptions. There are some useful additions to Markdown available for you. You can put them to use in your template, so your developers gradually learn to use them properly. One of the biggest features is to create checkboxes in your description (or comments). This is done like this:

- [ ] Make logo more bigger

While your checkboxes won’t be mandatory to be checked for your pull request to be accepted, they do provide a nice visual cue about any open tasks.

Merge the pull request

I said it earlier already, the person merging the PR shouldn’t be the one who opened it. This helps to enforce the principle of four-eyes (or less, depending on your developers).

Finally the PR is merged and you can work on the next feature.

🎉