Wednesday, June 18, 2014

The Life of Code

Yet another thing to come out of the recent WeBWorK::Asheville code camp was a crystallization of our development protocols.  We now have a more formal process for how code gets reviewed and integrated into the openwebwork repository.  In this blog post I will go through the process of creating, submitting, and eventually merging a feature into WeBWorK.  This is just an overview and will likely be out of date.  You can read up on the most current iteration here.

Suppose we want to make a feature which adds a chat client to WeBWorK so students who are logged in at the same time can chat with each other.  You might think the first thing you need to do to create a new feature for WeBWorK would be to write some code.  In truth the process starts much earlier.  Before you can write code you need to have a good coding environment.  Of course, it all begins with a WeBWorK installation.  Your development installation should not be the same machine that you use for students as it will be unstable.  I recommend a virtual machine setup as described here.  Your setup can be whatever you would like it to be, however, as long as it is based off of git.

Git is our source code management system with a central repository hosted by github.com at this address.  If you don't already have one, you should sign up for an github account.  Next you need your own personal copy, or fork, of the openwebwork repository.  Go here and click the fork icon in the upper right hand side, and follow the instructions.  Now do the same thing for the pg repository here.  Now you need to link your github repository to your development machine.  If you used the installation script, the openwebwork repository is a remote on your development machine called origin.  The following code will add your personal repository as a remote called personal
cd /opt/webwork/webwork2
git remote add personal https://github.com/myusername/webwork2.git
cd /opt/webwork/pg
git remote add personal https://github.com/myusername/pg.git
Now we have git setup and linked to our repository.  You can pull code from a repository using git pull remote-name branch-name and you can push code to a repository using git push remote-name branch-name.  You can check out what remote repositories you have setup using git remote -v and you can see what branches you have by using git branch -v.

Be warned, git is an extremely flexible code management system, but it has a very steep learning curve.  Its quite easy to get yourself in a situation where you have to start over because you messed up your commits or your branches.  If you haven't used git before I recommend trying out the interactive tutorial here.   You should also feel free to ask questions, either on the forum or the freenode #webwork channel. 

You might think we could start coding, but we have another decision to make first.  All new WeBWorK features should go in their own "feature branch".  You (almost) never make commits directly to one of the openwebwork branches.  Furethermore, different features should go into different branches so they can be kept separate until they are merged into the main code.  So my chat feature will go into a branch which I will call studentchat.  However, before I can create that branch I need to choose where I want the feature to end up.  There are generally three choices.
  • master:  The master branch always represents the current release.  You should only target a feature for this branch if you are making a small fix for an important bug.
  • release/x.y: This branch contains the beta version of the next WeBWorK release.  It is primarily used for testing and bugfixing purposes. In particular feature branches should only target this branch if they are fixing bugs for an upcoming release, or if they add a feature which is either superficial or has little chance of breaking anything.
  • develop:  The develop branch is where we generally collect new features. Features are only merged into this branch when they are more or less done, which means that develop should be relatively stable.  Anything that is merged into develop represents something that will be included in the next release branch.  In development terms this means your feature has to be out of "alpha" and well into "beta". Most developers should base their feature branches off of develop.
My chat feature is a reasonably substantial addition so I will be aiming to get it included into develop.  If you are unsure about what you should target either go with develop or ask someone.  Now that I have decided where my branch is going I can create it as follows
git checkout -b studentchat origin/develop
If I wanted to target release/x.y I would change origin/develop to origin/release/x.y  Because we make this choice at the time we create the feature branch, we have to have a good idea of where our feature will end up before we even start writing it.

Note:  With rare exception, branches created or "branched" from a certain branch have to be merged back into that same branch.   So this studentchat feature branch could only be merged into develop.  Sometimes you decide that you targeted the wrong branch after you have already started on your feature.  In this case you can use the following command to move your changes from, say, develop, to release/x.y
git rebase --onto release/x.y develop studentchat
Whew! And all of that before we even wrote our first line of code.  We are at the good part now, though.  We can start implementing our chat client.  In addition to practicing good code techniques, like integrating and using existing code, striving for consistency, and being security conscious, we should do the following:
  1. Commit often with git commit -a, but only commit changes that are relevant to the feature you are implementing.  
    • If you end up finding a bug, don't just fix it as part of the feature branch.  Use git stash to stash your current changes, switch to a new branch (maybe one based off of release/x.y) and fix the bug there.  Commit the changes to that branch, then switch back to your feature branch and pick up where you left off with git stash pop.  
  2. Pull often with git pull origin develop, fixing conflicts if necessary.  Only pull from the branch you are targeting.  
    • If you based a branch off of develop then the safest thing to do is to only ever pull develop into your feature branch.  There are situations where its OK to pull other branches, even other feature branches, into your feature branch, but its almost always more trouble than its worth. 
A well disciplined work flow might look something like this:
code for chat feature
git commit -a
git pull origin develop
fix conflicts
git commit -a
write more code, but find a bug
git stash
git checkout -b bugfix origin/release/x.y
fix bug
git commit -a
git push personal bugfix
git checkout studentchat
git stash pop
write more code for chat client
git commit -a
and so on ...
 Note:  If your feature requires you to change code in pg then you will need to create a separate branch in pg for your changes there.  This is because pg and webwork2 are two different git repositories.  As a result, if you are writing code for both the you will need to have two branches and, eventually, two separate pull requests.  On the other hand, if you are only changing pg code then you don't have to make a branch for webwork2.

Eventually you will be done with your feature.  Of course, by done I mean you have finished writing your feature and you have tested it.  Remember, even code that is submitted to develop should be out of alpha and well into beta.  There are four types of testing which you should be aware of:
  • Ad Hoc Testing: This is the most common type of testing and just involves using your code for a while and seeing if anything breaks. This is a good first step but is not sufficient.
  • Black Box Testing: This is where you systematically go through and test the functionality of your feature. Click on everything that can be clicked and make sure it does what it is supposed to do. For very small features or bug fixes this can be sufficient. 
  • White Box Testing: This is where you test the code itself. First, click the green "Compare, Review, and Create a Pull Request" button on your personal github repository. Use the edit button to select your feature branch and its target branch. This will show you a list of changes between your feature branch and its target. Go through change by change and for each perform an action in WeBWorK which causes the code changes to be run. Check that the results are what you expect them to be. This level of testing is needed for any sizable feature, or any code which impacts core WeBWorK functionality. 
  • Regression Testing: This is the longest and most thorough type of testing. It involves checking to see that the code you wrote didn't introduce bugs into previously working code. Some of this is done at the same time as you are doing the Black Box or White Box testing, and for many things full scale regression testing is impractical. However if you need to do this level of testing then go to the Testing Checklist and start going through the list, one item at a time, adding new items as necessary, and fixing any bugs you come across. 
So lets suppose we have written and tested (white box and black box) our chat feature.  Now we let the little bird out of its nest into the open world.  Its time to create a pull request.
  1. Make sure your feature branch can be merged cleanly. In other words, pull it's target branch, fix any conflicts, commit the results, and push to your git repository.
  2. Go to the webwork2 repository page in your personal github account, select your feature branch from the branch dropdown and click the green button.
  3. Click "Edit" on the pull request bar and change the base of openwebwork/webwork2 to the target branch for your feature.   In our case we would use develop.   
  4. Review your pull request. In particular take a close look at the file changes.
    • Are all of the changes relevant to your feature? Did anything unexpected sneak in?
    • Do you have any "configuration" changes or changes with hard-coded path? Any site specific code?
    • Are there a reasonable number of changes? 
    • Will it be easy for a reviewer to look over your submission?
  5. Pick a title for your pull request and write a description. You should describe the major changes included in the pull request as well as fairly detailed instructions on how to test to see if the changes are working. The better your description is and the clearer your instructions are the more likely someone will be able to test and merge your pull request in a timely manner.
  6. Double check that the pull request is for the correct branch and submit. 
One your pull request is submitted all you can do is wait.  In general it can take days for someone to get back to you about it, and it can be even longer during finals or midterms.  Eventually someone will get back to you, though, and the review process begins.  This is a little like the review process for publishing a paper, with faster turnaround (hopefully!).  The person reviewing your pull request will start by posting a message with any questions or issues or bugs that they have come across.  Don't be disheartened if they have a lot of concerns.  Testing code is like proofreading; other people are always going to find things you missed.  Next you address the reviewers questions, fix any bugs and consider making any other changes you need to make.  When you make changes, commit the result, and then push the branch to your personal github repository the pull request will automatically update with the latest changes.  There is no need to submit a new pull request.  The reviewer can then take a look at the latest version and the cycle will repeat itself over and over.... (sometimes for months).  Eventually, after hard work and patience, the feature will be polished enough that the reviewer will merge the pull request into its target branch.  Once this happens you can delete your feature branch
git branch -D studentchat
Congratulations!  Your code is part of WeBWorK now.   If you end up having to make any changes to it (and you almost certainly will) you will need to do so on a new feature branch, starting the whole process over again.  (This is not strictly speaking true, but it is the safest practice.)


No comments:

Post a Comment

Note: Only a member of this blog may post a comment.