Like many tools, the Git version control system provides the core functionality that most users need on a daily basis through a handful of commands. In today’s post, I will touch on one that doesn’t seem to be used as often or, for many users, hasn’t been used at all.

Making One of Two

One of Git’s primary features, and one that was a huge draw for me when I was first exposed to it, is branching. Unlike in many other systems, branching in Git is simple, fast, and the expected way to introduce new functionality, fix bugs, and more.

If you are working alone in a repository, there usually isn’t much of a need to do more than simple branch and merge operations. When you add more contributors, however, managing the work can become a bit more involved, though it can still be handled easily with the branch and merge commands.

A Common Occurrence

Let’s say you’ve created a new feature branch off the dev branch, done some work (with commits) locally in that branch, then see that someone else working in a different feature branch has merged their work into the dev branch.

[Image: oak-city-labs-trevor-history-lesson-1]

You might continue work in your feature branch because the new code doesn’t affect what you’re implementing and vice versa. But often, you will find yourself wanting to incorporate changes and fixes into your current working branch.

At this point, most will reach for the merge command which will definitely do the job it’s intended to: incorporate changes from another branch into the current one. But if you find yourself (or other team members) doing this often, you can wind up with quite a few merge commits which, among other things, can make it a bit more difficult to understand the history of the project. Merge commits don’t go away when you eventually fold your feature branch back into dev.

Rebase

While merging is a common workflow for handling these kinds of scenarios, another option is the rebase command.

The rebase command performs the same work as a merge operation, but in a way that results in a different historical view.

Using the scenario from above, how would things look if you reached for rebase instead of merge?

When you choose to rebase the changes in your feature branch onto those that have been committed in another branch (such as the dev branch your feature branch is based on), the command goes back to the common ancestor of the two branches and saves the changes introduced by each feature branch commit to temporary files. It then resets the feature branch to the most recent commit of the branch you’re rebasing onto (dev) and replays those stored commits, one by one, on top of it.

Now, a look at the history will appear to show that the work you’ve done in the feature branch came after the commits (done in parallel) in the dev branch. No merge commits will “litter” the history of either branch.

[Image: oak-city-labs-trevor-history-lesson-2]

While this workflow may seem useful only for producing clean commit histories, Scott Chacon (Pro Git) makes a good point that it also provides value when working in a repository that you don’t maintain. If you rebase your work on origin/dev, for example, the maintainer does not need to do any integration work to incorporate your changes; a fast-forward merge is enough.
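The scenario above can be tried out in a throwaway repository. The following is only a sketch: the file names, commit messages, and the Demo identity are made up for illustration, and the branch names dev and feature simply follow the ones used in this post.

```shell
# Throwaway identity and repository so the sketch runs anywhere (assumed names).
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q && git checkout -q -b dev
echo base > base.txt && git add base.txt && git commit -q -m "Initial commit"

# Start a feature branch and commit some work on it.
git checkout -q -b feature
echo mine > feature.txt && git add feature.txt && git commit -q -m "Feature work"

# Meanwhile, a teammate's work lands on dev.
git checkout -q dev
echo theirs > other.txt && git add other.txt && git commit -q -m "Teammate's work"

# Replay the feature commits on top of the updated dev branch.
git checkout -q feature
git rebase -q dev

# Integrating the feature is now a plain fast-forward. --ff-only refuses
# anything that would need a merge commit, so success means dev simply moved.
git checkout -q dev
git merge -q --ff-only feature

# The history is linear, with the feature commit on top.
git log --oneline --graph
```

Running the same steps with git merge dev in place of git rebase dev would leave a merge commit in the feature branch, which is the difference in history this post is about.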

The Golden Rule

There is at least one situation in which you will not want to employ the rebase command: when your commits have been published outside of your repository.

If you’ve pushed local commits to a remote, do not rebase them. Rebasing rewrites those commits, which can cause lots of pain and suffering for teammates who have already based work on them (even though there are ways to work around it).
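One way to check which commits are still private before reaching for rebase is to list what your branch has that the remote-tracking branch does not. This is a sketch in a throwaway setup: the bare repository stands in for a real remote, and the names (remote.git, local, a.txt, the Demo identity) are invented for the example.

```shell
# Throwaway identity and directories (assumed names).
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d) && cd "$work"
git init -q --bare remote.git          # stands in for the shared remote
git clone -q remote.git local          # cloning an empty repository prints a warning
cd local && git checkout -q -b dev

# One commit that gets published...
echo one > a.txt && git add a.txt && git commit -q -m "Published commit"
git push -q -u origin dev

# ...and one that exists only locally.
echo two >> a.txt && git commit -q -am "Local-only commit"

# Commits reachable from HEAD but not from origin/dev have not been pushed,
# so they are the ones that are safe to rebase.
git log --oneline origin/dev..HEAD
```

Only the local-only commit is listed. Anything that does not show up here has already been shared and falls under the rule above.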

To rebase or not to rebase…

Most people are comfortable with just using the merge command for combining branch work. It’s a completely serviceable practice and, as they will often argue, shows the “true” history of the repository compared to rebasing.

I often reach for rebase when hot fixes or other commits that contain changes I’d like to have while working in a feature branch are made before I’m done. The feature branch will be clear of those merge commits letting me see a clear history of the work and will maintain that clarity when it is merged back into the destination branch.

At Oak City Labs we develop using a particular stack of technologies we are not only comfortable with, but believe to be the most effective tools in solving whatever challenges come our way. Too often when a developer says “We use X framework or Y library”, it sounds like utter gibberish and has no real meaning to non-technical people. This disconnect is the inspiration for my blog series: Non Technical Overviews of Technical Things.


Hey there!

Welcome back to my series of non technical overviews of technical things. This week I’ll be educating you on the powerful yet soothing nature of Git, a modern version control system. According to the Git docs, version control is “a system that records changes to a file or set of files over time so that you can recall specific versions later.” Sounds useful, right?

You may find version control useful if you have ever found yourself saying any of the following:

  • “Oh #&(@ I ‘accidentally’ deleted EVERYTHING!!!”
  • “Oh #&(@ I accidentally deleted EVERYTHING!!!”
  • “What did version 4.0.5.6.2.10.549.1 of this file look like?”
  • “Who made this change to this file?”
  • “How can my team make changes to the same file without overwriting each other’s work?”
  • “Do we back up our files?”

Intrigued? Great! Proper use of version control is one of the easiest and most efficient ways to keep your projects organized. Now a twist on that old familiar story to help explain things…

You’re A Web Developer, Charlie Brown

Chuck, Patty and Linus are working to build a website together. Chuck is a graphic artist, while Patty and Linus are developers.

The trio decides to use Git to organize and store their project. We will visualize their Git process using a sticky note board, a la realtimeboard.com. For clarity, we’ll match their sticky note colors up with their shirt colors: Green for Patty, Yellow for Chuck and Red for Linus.

Step 1: Set up Git

A nice aspect of Git is that it allows you to store your main project files anywhere. Git has a concept called a ‘repository’ which you can think of as the main storage location. In this example, it will be the whiteboard, but in real life, it can be any computer with the ability to store files. The most popular ‘hosting as a service’ platforms are GitHub and Bitbucket. For people who prefer to host the files themselves, GitLab is a trusted option.

With Git, each person working on the project maintains a copy of all the project files at all times. The remote repository is the copy of the project from which all others pull. When someone finishes a task in their local repository, they will push it up to the remote repository, and when they want to make sure they have the latest versions of all files, they pull down from the remote repository.

To start their project, they store one file on their remote repository:

  • index.html – The main website file

Here is a visual representation of Team Peanuts’ Git usage for their project:

Step 2: Assign tasks

Now that the team has set up their remote repository, they choose tasks for each of them to work on.

Step 3: Setup for development

Now that the team has decided what they will work on, it’s time to start developing! Since the project is currently only stored on the remote repository, all three of our favorite characters use git clone to create a clone, or an exact copy, of the remote repository. Now Chuck, Patty and Linus are all set up and ready to go!

Step 4: Development!

Being the efficient, diligent team they are, the Peanuts gang all get to work immediately doing what they do best! Chuck gets to work on the website icon, while Linus and Patty begin coding.

While Chuck is busy creating the website graphics in Adobe Illustrator, Patty and Linus each work on their copy of the index.html file, which was retrieved from the remote repository and acts as the main code file for their website.

Step 5: Sharing Work

Patty finishes her work first. Since she edited index.html, she needs to make sure her teammates get the very latest version of that file. The Peanuts team project is structured so that each of the three characters’ local repositories (clones of the remote repository) only receives files from the remote repository, so Patty has to push her changes up to the remote repository for her teammates to see.

To communicate with a Git repository, local or remote, you have to package your code changes in a commit. Think of a commit as an envelope containing all the changes that have been made recently. As soon as a commit is created, any changes made after that will fall into the next commit envelope.

Patty bundles her index.html changes into a commit called “Add boilerplate page layout”. Now, her local repository has been notified of these changes and can communicate to the remote repository whenever Patty is ready.

Since Patty works quickly, she goes ahead and pushes, or shares, her commit with the remote repository. In practice, Patty could send one push with many commits to the remote repository.
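Patty’s add, commit, and push steps might look like the sketch below. It is simplified and rests on a few assumptions: a local bare repository stands in for the remote, the remote starts out empty instead of already holding index.html, and the branch name main and the Demo identity are placeholders.

```shell
# Throwaway identity and directories (assumed names).
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d) && cd "$work"
git init -q --bare remote.git          # stands in for the remote repository
git clone -q remote.git patty          # Patty's local repository
cd patty && git checkout -q -b main    # branch name is an assumption

echo "<html><body></body></html>" > index.html

git add index.html                               # choose what goes in the envelope
git commit -q -m "Add boilerplate page layout"   # seal the envelope
git push -q -u origin main                       # share it with the remote
```

Several commits can pile up locally before the push; a single git push then sends all of them at once.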

Before Patty pushes her files from local to remote, the Git board looks like this:

After pushing, the Git board now looks like:

At this point, Patty has completed her job and it’s up to Chuck and Linus to update their local repositories to mirror the remote repository and remain up-to-date.

Step 6: Sharing Work Pt. 2 // Updating Local Repositories

Right now, Chuck and Linus’s repositories are outdated. They lack the changes that Patty made. That’s alright for Chuck, though. His work does not touch the same files as Patty, so he’s not worried. He bundles his website icon in a commit.

Now that his local repository has the commit, Chuck is ready to push his local repository changes to the remote repository.

Chuck cannot do that, though. Why? Because Git would reject his push. Git would say “Chuck, you don’t have the latest files. Your local repository is not in sync with the remote repository. Pull down the latest files before you try to push your changes.” Okay, Git wouldn’t actually say that…but it does reject his push. How does Git know Chuck’s local repository is outdated? Magic. Also, because Git tracks every change in the history of the project, and it sees that the remote repository contains a commit (Patty’s) that is missing from Chuck’s local history. In order to make sure the project remains logically sound, Git requires that a local repository be up-to-date with the remote repository before it sends new commits.

Chuck pulls the latest code from the remote repository. To perform this operation, his local repository pulls down all of the commits that Chuck doesn’t have. Chuck can now push his icon, so he does.
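Chuck’s rejected push can be reproduced with two clones of one throwaway remote. As before, this is a sketch under assumptions: the directory names, file contents, branch name main, and Demo identity are invented, and the pull is told explicitly to merge (--no-rebase) so that it behaves the same regardless of local Git configuration.

```shell
# Throwaway identity and directories (assumed names).
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d) && cd "$work"
git init -q --bare remote.git
git clone -q remote.git patty
(cd patty && git checkout -q -b main \
  && echo "<html></html>" > index.html && git add index.html \
  && git commit -q -m "Add index.html" && git push -q -u origin main)
git clone -q -b main remote.git chuck

# Patty pushes a new commit while Chuck is still working.
(cd patty && echo "<!-- layout -->" >> index.html \
  && git commit -q -am "Add boilerplate page layout" && git push -q origin main)

# Chuck commits his icon, but the remote has a commit he lacks.
cd chuck
echo icon > icon.svg && git add icon.svg && git commit -q -m "Add website icon"
if git push -q origin main 2>/dev/null; then
  first_push=accepted
else
  first_push=rejected
fi
echo "First push was $first_push"

# Pulling brings in Patty's commit, after which the push goes through.
git pull -q --no-rebase --no-edit origin main
git push -q origin main
```

Because Patty and Chuck touched different files, the pull merges their work without any conflict.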

Step 7: Merge Conflict Resolution

So far, all has been fine and dandy. Patty and Chuck were working on separate files, so there were no conflicts when Chuck’s commits were laid on top of Patty’s. But Linus has been working on the same index.html file as Patty, and now Linus is two commits behind the remote repository (one Patty commit, one Chuck commit).

Let’s say Linus finishes his work and bundles it in a commit envelope titled “Website spinning.” We already know he won’t be able to push, as we saw with Chuck, until Linus pulls the latest commits from the remote repository. Linus does so.

!!! Oh no !!! CONFLICTS DETECTED. What happened? Patty and Linus edited the same file. This means in order for Linus’ work to be applied on top of the commits from remote, any conflicts that arose from editing the same lines in index.html must be fixed before Linus can push his work to remote. In about 90% of cases, Git will actually be able to automagically combine the two sets of changes without you having to do any extra work. The other ~10% of the time, however, requires manual conflict resolution. To do this, Git will show you which parts of the files are in conflict so that you can go in and fix them yourself.
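For the manual case, here is a small sketch of what a conflict looks like and how it gets resolved. To keep it short, two branches in one throwaway repository stand in for Patty’s and Linus’ separate repositories; the branch names, the HTML content, and the Demo identity are all invented for the example.

```shell
# Throwaway identity and repository (assumed names).
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q && git checkout -q -b main
echo "<h1>Title</h1>" > index.html
git add index.html && git commit -q -m "Add index.html"

# Linus' edit, on his own line of history.
git checkout -q -b linus
echo "<h1>Spinning site</h1>" > index.html
git commit -q -am "Website spinning"

# Patty's edit to the very same line.
git checkout -q main
echo "<h1>Peanuts site</h1>" > index.html
git commit -q -am "Add boilerplate page layout"

# Combining the two fails, and Git marks the clashing region in the file.
git checkout -q linus
git merge main || cat index.html

# Resolve by editing the file into the desired result, then commit.
echo "<h1>Spinning Peanuts site</h1>" > index.html
git add index.html && git commit -q -m "Merge main, resolving index.html conflict"
```

When the merge stops, index.html contains both versions of the line, separated by marker lines that begin with <<<<<<<, ======= and >>>>>>>. Deleting the markers and keeping the text you want is the whole of the manual fix.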

Luckily for Linus (and for you, blog consumer), Git handled all conflicts that arose with aplomb. Now that Linus’ local repository has the latest commits from remote AND has his latest local commit, Linus can push his code to the remote repository.

Voila! Now Patty and Chuck can pull the latest changes from the remote repository and tada! Everybody’s local repository is synced up with the remote repository, and website development is well on its way.

Commands

Here are the Git concepts I covered in this post, alongside the actual Git command you would use to perform such an action.

  • Add – git add
  • Commit – git commit
  • Pull – git pull
  • Push – git push

To create a local repository:

  • If remote already exists:
    • git clone <url>
  • If remote doesn’t exist:
    • git init

To create a remote repository:

  • If using a provider such as GitHub or Bitbucket:
    • Create one via their website
  • If self-hosting:
    • Create one the same as you would create a local repository
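The repository-creation commands above can be sketched as follows. A local path stands in for <url>, the directory names are made up, and the --bare option for the self-hosted remote is an addition to what the list says: a bare repository keeps the history without a working copy of the files, which is the usual choice for a repository that people only push to and pull from.

```shell
work=$(mktemp -d) && cd "$work"

# Self-hosted remote: history only, no working files.
git init -q --bare project.git

# Local repository when a remote already exists
# (cloning an empty repository prints a warning).
git clone -q project.git cloned

# Local repository when no remote exists yet.
git init -q fresh
```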

What We Learned

Git is an incredibly powerful tool that we only scratched the surface of in this post. There are far more sophisticated aspects like branching and rebasing that weren’t even mentioned.

If you want to get started with Git, this simple guide is a great place to begin your journey. Until then, the doctor is out.

When I think about automation, the first thing that pops into my head is a giant warehouse teeming with robots that scurry around, filling orders and shipping them off to fulfill the whims of internet shoppers. Something like this:

It’s not exactly the Jetsons yet, but they’re getting there. And robots are cool.

But automation really applies to anything that can save us time, reduce errors and make us more efficient (which is code for saving money). The best targets for automation are tasks that are well defined, repetitive and common. In other words, chores that are boring, tedious and error prone. Fortunately for developers, we have plenty of these targets.

Our primary goal here is to automate the journey from code repository to finished product. A developer should be able to check code into the git repo and the system will build an installable product with no human intervention.

Benefits of Automation — Fast, Consistent, Correct

Let’s look more specifically at what automation can do for developers. At Oak City Labs, we build mobile apps and their backend servers that live in The Cloud. What sort of concrete things do we get from automation? We’re able to save time, ensure dependability and have confidence in every build. Automation creates builds that are fast, consistent and correct.

Saving time is the most obvious benefit of the whole process. When I have a bug to fix, I track it down and fix the errors on my laptop. Once I check that fix into git, I’m done. The automation kicks in, notices the change, runs through the build process and uploads the fixed app to a beta distribution service. Our QA folks get an email that the new app is ready to install and test. That whole process takes 30 minutes, but I’m free and clear as soon as I check into git. That’s 30 minutes of waiting for builds, running tests, and waiting for uploads that I don’t have to worry about or monitor. I’m on to tracking down the next bug. Saving half an hour a couple of times a week adds up, and sometimes it’s more like a couple of times a day. With those extra hours, I can fix more bugs and write more tests!

Less obvious than the time savings is consistency. Automation is codified process, so these builds happen exactly the same way each time. The automation always takes the same steps, in the same order, in the same context every time. Doing builds manually, I might use my laptop or my desktop, which are mostly the same, but not quite. Because I’m in a hurry, I might forget one of those simple steps, like tagging the repo with the build number, which won’t matter until I try to backtrack a buggy build later. With automation, we just don’t have those worries. Even better, I can go on vacation. Any of the developers on our team can build the app correctly. If the client needs a trivial fix, like changing a copyright date, it’s simple. Anyone on the team can update the text in the code repository and a few minutes later, a build is ready to test. Not only does the automation reduce the chances of human error, but it makes sure we no longer have to rely on a particular human to operate the controls.

Along with consistency, we also have confidence in every build. Consistency builds confidence, but so does testing and regimen. We build our software with a healthy dose of testing. As part of our automated builds, the testing ensures that the code behaves as we expect and that a change in one part of the code hasn’t inadvertently caused an error elsewhere. Of course we don’t catch every bug, but when there is a bug, we add a test to make sure we don’t make that mistake again.

Our automation is a tool we use every day as part of our development habits. It’s not a special task that’s only run at the full moon to bless our new release. It’s our daily driver, that reliable Toyota Camry we drive every day that always runs. It might need a little maintenance now and then, but it’s not your crazy uncle’s antique Mustang that only works a third of the time when he tries to take it out on a Saturday afternoon. This is really important when crisis mode comes around. Imagine if the app has some critical bug that needs to be patched ASAP. Because we have confidence in our consistent and correct automation, we can focus on fixing the bug and know that once it’s fixed, releasing the new version to users will be a smooth standard procedure.

Automation has been something we’ve grown to rely on in our development process because it makes us more efficient and saves us time. For our clients, saving time means saving them money. We can get more work done because we can focus on the real challenge of writing apps and leave the boring tedium to our trusted automation. With the consistency and confidence that automation adds to our workflow, we can always be proud to deliver a top notch product to our clients. Fast, consistent and correct: automation delivers all three!

How the Magic Happens

So… automation is great and wonderful and makes the grass greener and the sun shine brighter, but how does it work? We’ve been talking in very vague terms so far, so let’s get down to the bits and bytes of how to put it all together.

For the last several years, we’ve been using a git branching strategy based on the excellent git-flow model. You can read all the brilliant details in the link, but the short story is that you have two long-lived branches, master and dev. The master branch contains production level code that is always ready to release. The dev branch is the main development version that gets new features and maintenance fixes. Once a new feature set is ready in dev, it gets merged into master. This maps very nicely onto our concrete goals for automation.

Our projects have two deployment modes: beta and production. Beta is code from the dev branch. This is code ready for internal testing. For mobile apps, beta builds are distributed to QA devices for testing against the staging server. For server apps, beta builds are deployed to the staging server for testing before rolling out to production. Production mode is the real deal. Production mobile apps go to the app stores and to real users. Production server apps are rolled out to public servers ready to support users.

The automation workflow maps from git-flow into our deployment environments with the dev branch always feeding the beta environment and the master branch feeding the production environment.
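Written down as code, that mapping is tiny. The function below is a hypothetical helper, not part of TeamCity or of our actual configuration, and the choice to return none for every other branch is an assumption made for the sketch.

```shell
# Hypothetical helper: map a git branch to the environment it feeds.
environment_for_branch() {
  case "$1" in
    dev)    echo beta ;;
    master) echo production ;;
    *)      echo none ;;   # assumption: other branches are not deployed
  esac
}

environment_for_branch dev      # prints "beta"
environment_for_branch master   # prints "production"
```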

The engine we use for our automation is a continuous integration server called TeamCity from JetBrains. It’s a commercial product, but it’s free for small installations. TeamCity coordinates the automation workflow by

  1. monitoring both master and dev branches in git
  2. downloading new code
  3. building the new version of the app
  4. running all the tests
  5. deploying to beta or production

If any of those steps fail, the process stops and the alarms go off. Our dev team is alerted via email and Slack and we’re on the problem like minions on a banana.
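The stop-on-first-failure behavior can be illustrated with a few lines of shell. This is not how TeamCity is configured; every function here is a made-up placeholder, and run_tests is rigged to fail so the alert path is visible.

```shell
# Placeholder steps; real ones would call the actual build and deploy tools.
fetch_code()  { echo "downloading new code"; }
build_app()   { echo "building the app"; }
run_tests()   { echo "running tests"; return 1; }   # simulate a failing test run
deploy()      { echo "deploying"; }
notify_team() { echo "ALERT: step '$1' failed"; }   # stands in for email and Slack

run_pipeline() {
  for step in fetch_code build_app run_tests deploy; do
    if ! "$step"; then
      notify_team "$step"
      return 1              # stop here; later steps never run
    fi
  done
}

run_pipeline || echo "pipeline stopped"
```

With the simulated test failure, the deploy step is never reached, which is the point: a broken build should not make it to beta or production.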

The last three steps are specific to the product type. For iOS, we use the wonderful Fastlane tools to orchestrate building the app, running tests, and uploading to either Fabric’s Crashlytics beta distribution service or to Apple’s App Store for production release.

We use Fastlane for Android, as well, so the flow is very similar. Beta releases go to Crashlytics, but production releases are shipped to the Google Play Store.

Our servers are written in Python and our web applications are in Angular. Testing for both uses the respective native testing tools, driven from TeamCity. For deployment, we use Ansible to push new versions to our beta and production clusters. We love Ansible because it’s simple and only requires an ssh connection to a node for complete management. Also, since Ansible is Python, it’s easy to extend when we need to do something special.

Since all of our build paths go through TeamCity, the TC dashboard is a great place to get a quick rundown of the build status across all projects.

With TeamCity coordinating our workflow automation and using tools like Ansible and Fastlane to enable builds and deployments, we’ve been able to build a system that is fast, consistent and correct, relieving us of the tedium of builds and letting us focus on the hard problems of building great apps for ourselves and our awesome clients.