Finally on time
Winter is coming, and winter tires are coming even faster! Normally I'm part of the crowd that rushes to the workshop at the first snow; this year I'm part of the (surprisingly small) crowd that rushes there at the first night frost already…

We finally managed to assemble our genotyping-by-sequencing paper and published it on the preprint server bioRxiv, while the manuscript now goes to peer review. You can already find it here:
https://www.biorxiv.org/content/10.1101/2023.10.03.560633v1

After quite some time I decided (once again) to start working on updates for this webpage. For now, I have added information on the different Snakemake pipelines I wrote for the most common bioinformatics use cases; have a look here: Pipelines
Let's see how long this flow lasts, though.

The standard command table() counts the frequency of each element of a vector like this:
R> df <- data.frame(var = c("A", "A", "B", "B", "C", "C", "C"))
R> table(df)
var
A B C
2 2 3
So it tells us that A and B each occur twice and C occurs three times.
However, suppose we now have a situation like this:
df <- data.frame(var = c("A", "A", "B", "B", "C", "C", "C"), value = c(10, 20, 20, 40, 15, 25, 35))
That is, we have a categorical variable var and a numeric variable value, and for each level of the categorical variable we would like to get the sum of the numeric variable. For this we can simply use the base-R command aggregate like this:
R> aggregate(value ~ var, data = df, FUN = sum)
  var value
1   A    30
2   B    60
3   C    75
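If a result that looks more like the output of table() is preferred, base R also offers xtabs() and tapply(). The following is just a minimal sketch on the same example data, not the only way to do it; the object names agg, xt and ta are made up for illustration. It checks that all three approaches give the same sums per category:

```r
# Same example data as above
df <- data.frame(var = c("A", "A", "B", "B", "C", "C", "C"),
                 value = c(10, 20, 20, 40, 15, 25, 35))

# aggregate() returns a data frame with one row per level of var
agg <- aggregate(value ~ var, data = df, FUN = sum)

# xtabs() returns a table object, i.e. a "weighted" version of table()
xt <- xtabs(value ~ var, data = df)

# tapply() returns a named one-dimensional array
ta <- tapply(df$value, df$var, sum)

# All three should hold the same sums per category
stopifnot(all(agg$value == as.vector(xt)),
          all(agg$value == as.vector(ta)))
```

Which one to pick is mostly a matter of what comes next: aggregate() is handy if the result should stay a data frame, while xtabs() fits better if it is passed on to functions that expect a table.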
As I also often use the data.table package, here is a simple solution using it as well, assuming we convert the data frame to a data table (or already have a data table from some other source, like fread):
library("data.table")
dt <- data.table(df)
Then we can just sum over one column with respect to another column like this (and assign the result to a new column tot):
dt[, .(tot = sum(value)), by = var]