Chapter 15 Self Documenting Help
As we build up our Snakemake workflow the Snakefile becomes increasingly complex.
It is easy to imagine that a future version of you,
or a co-author,
might have difficulty understanding what each rule is doing.
This chapter introduces an approach to add some comments to the Snakefile to make it easier to navigate.
We also add a new help
rule so that we can get a summary of the comments printed to screen.
15.1 Our Commenting Syle
If we look at the Snakefile in it’s current state, we see that there are a bare minimum amount of comments already integrated.
Because Snakemake is a Python library all comments must begin with one #
.
We then added some additional formatting of comments to create two types.21
Our first type of comment is a ‘double hash’ - ##
.
We use this to represent information that someone who reads our code might find helpful to build understanding of the file.
For example, the head of the Snakefile contains information on the name of the workflow and the authors:
Second, we have broken up the Snakefile somewhat into sections such as Dictionaries, Build Rules and Clean Rules using the # --- Something --- #
notation.
15.2 Adding Further Comments
The information we have added in comments so far is not really enough to help us remember much.
What we want is a simple one or two line summary of what each of our rules do so that we can look back at them in the future.
We are going to make comments above each rule using the ##
notation.
The structure will follow a common pattern: rule_name : some description
For example, for the textbook_solow
rule:
## textbook_solow: construct a table of regression estimates for textbook solow model
rule textbook_solow:
input:
script = "src/tables/tab01_textbook_solow.R",
models = expand("out/analysis/{iModel}_ols_{iSubset}.rds",
iModel = MODELS,
iSubset = DATA_SUBSET)),
params:
filepath = "out/analysis/",
model_expr = "model_solow*.rds"
output:
table = "out/tables/tab01_textbook_solow.tex"
shell:
"Rscript {input.script} \
--filepath {params.filepath} \
--models {params.model_expr} \
--out {output.table}"
Exercise: Adding Comments
Add some comments using our suggested style to each of the Snakemake rules we have constructed so far.
15.3 A help
Rule
The Snakefile is now much easier to understand when reading through it.
Future you will be grateful for the effort.
We can go one step further: construct a rule that prints the output of these comments to the screen.
Let’s call this rule help
, so that if we run snakemake help
all our useful comments will be printed to screen.
We usually put this help rule at the bottom of our Snakefile, so that we don’t keep bumping into it as we work.
The rule has the following structure:
# --- Help Rules --- #
## help : prints help comments for Snakefile
rule help:
input: "Snakefile"
shell:
"sed -n 's/^##//p' {input}"
The rule depends on the Snakefile
itself - because if the file changes, the might too.
The shell comand that we use is a little confusing when we look at it the first time.
Intuitively heres what it is doing:
- The comments we want to print are those that have the double hash notation,
##
. sed
is a progam that can perform text manipulations to some input- We want it to find all lines starting with
##
in our input file … - … and print out the remainder of those lines
- We want it to find all lines starting with
Let’s see it in action:
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
count jobs
1 help
1
[Wed Feb 6 19:13:51 2019]
rule help:
input: Snakefile
jobid: 0
Snakemake - MRW Replication
@yourname
all : builds all final outputs
augment_solow : construct a table of estimates for augmented solow model
textbook_solow : construct a table of regression estimates for textbook solow model
make_figs : builds all figures
figures : recipe for constructing a figure (cannot be called)
estimate_models : estimates all regressions
ols_models : recipe for estimating a single regression (cannot be called)
gen_regression_vars: creates variables needed to estimate a regression
rename_vars : creates meaningful variable names
clean : removes all content from out/ directory
help : prints help comments for Snakefile
[Wed Feb 6 19:13:51 2019]
Finished job 0.
1 of 1 steps (100%) done
We see that after the usual Snakemake output telling us what it is doing, our comments are printed to screen.
Exercise: Saving help
output to file
Modify the help
rule so that the printed content is saved in a file called HELP.txt
Do notice that the exact formatting of all this help is totally arbitrary. Noone tells you that two
#
’s or a mix of#
’s and-
’s are meaningful, correct or standard ways of putting comments in a file. Our choice reflects path dependence in the way we started writing Snakefiles, and we have found them useful and persisted with our initial choice. Feel free to create your own. The only important thing is the line starts with one#
so that Snakemake doesnt try and execute that line as code.↩︎