New research published
We finally managed to assemble our genotyping-by-sequencing paper and have published it as a preprint on bioRxiv while the manuscript goes through peer review. You can already find it here:
https://www.biorxiv.org/content/10.1101/2023.10.03.560633v1

New updates
After quite some time I decided (once again) to start updating this webpage. For now, I have added information about the different Snakemake pipelines I wrote for the most common bioinformatics use cases; have a look here: Pipelines
Let's see how long this flow lasts, though.
Creating weighted tables with R / sum of numerics associated to some categorical variable
The standard table() command counts how often each element of a vector occurs, like this:
R> df <- data.frame(var = c("A", "A", "B", "B", "C", "C", "C"))
R> table(df)
var
A B C
2 2 3
So it tells us that A and B each occur twice and C occurs three times.
However, suppose we now have a situation like this:
df <- data.frame(var = c("A", "A", "B", "B", "C", "C", "C"), value = c(10, 20, 20, 40, 15, 25, 35))
That is, we have a categorical variable var and a numeric variable value, and for each level of the categorical variable we would like the sum of the numeric variable. For this we can simply use the base-R command aggregate, like this:
R> aggregate(value ~ var, data = df, FUN = sum)
  var value
1   A    30
2   B    60
3   C    75
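If you prefer a result that looks more like the output of table(), base R has a couple of alternatives. The following is just a minimal sketch using the same example data; the object names wt and sums are my own choice for illustration.

```r
df <- data.frame(
  var   = c("A", "A", "B", "B", "C", "C", "C"),
  value = c(10, 20, 20, 40, 15, 25, 35)
)

# xtabs() sums the left-hand side of the formula within each level of var
# and returns a table-like object, similar to what table() gives.
wt <- xtabs(value ~ var, data = df)
print(wt)

# tapply() computes the same per-group sums and returns them as a named array.
sums <- tapply(df$value, df$var, sum)
print(sums)
```

Both give one sum per level of var, so which one to pick is mostly a matter of which output format is more convenient downstream.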
As I also often use the data.table package, here is a simple solution with it as well, assuming we first convert the data frame (or already have a data table from some other source, such as fread):
library("data.table")
dt <- data.table(df)
Then we can just sum over one column with respect to another column like this (and assign the result to a new column tot):
dt[, .(tot = sum(value)), by = var]
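In case it is useful, the plain count and the weighted sum can also be computed in one go. This is only a sketch and assumes the data.table package is installed; the column names n and tot are arbitrary.

```r
# Assumes the data.table package is installed.
library(data.table)

dt <- data.table(
  var   = c("A", "A", "B", "B", "C", "C", "C"),
  value = c(10, 20, 20, 40, 15, 25, 35)
)

# .N is the number of rows per group (what table() reports),
# sum(value) is the per-group total of the numeric column.
res <- dt[, .(n = .N, tot = sum(value)), by = var]
print(res)
```

The groups are returned in order of first appearance; add keyby = var instead of by = var if you want them sorted.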
Nice new video release
As usual, it has been rather quiet here on my little blog again, and instead of offering any proper content, I would like to recommend an excellent band from Sweden, the ‘Viagra Boys’. They released a great music video yesterday!
