Daniel

Holiday read arrived

6. October 2023 Daniel

With the autumn holiday coming up, my holiday read arrived just in time! I am looking forward to a light read: my Python has become a little rusty lately, so reading about it from an introductory perspective is a nice way to kick-start it again. At first glance it looks really nice!

New research published

5. October 2023 Daniel

We finally managed to assemble our genotyping-by-sequencing paper and have published it on the preprint server bioRxiv while the manuscript goes to peer review. You can already find it here:

https://www.biorxiv.org/content/10.1101/2023.10.03.560633v1

New updates

15. September 2022 Daniel

After quite some time, I decided (once again) to start working on updates for this webpage. For now, I have added information about the different Snakemake pipelines I wrote for the most common bioinformatics use cases. Have a look here: Pipelines

Let's see how long this flow lasts, though.

Creating weighted tables with R / sum of numerics associated to some categorical variable

31. January 2022 Daniel

The standard table() command counts the frequency of each element of a vector like this:

R> df <- data.frame(var = c("A", "A", "B", "B", "C", "C", "C"))
R> table(df)
var
A B C 
2 2 3

So it tells us that A and B each occur twice and C occurs three times.

However, suppose we now have a situation like this:

df <- data.frame(var = c("A", "A", "B", "B", "C", "C", "C"), value = c(10, 20, 20, 40, 15, 25, 35))

That is, we have a categorical variable var and a numeric variable value, and for each level of var we would like to get the sum of value. For this we can simply use the base-R command aggregate like this:

R> aggregate(value ~ var, data = df, FUN = sum)
  var value
1   A    30
2   B    60
3   C    75
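
If the result should look like the output of table() rather than a data frame, a minimal sketch using base R's xtabs() and tapply() could look like the following. It assumes the same df as above; the object names weighted and sums are only illustrative.

```r
# Same example data as above
df <- data.frame(
  var   = c("A", "A", "B", "B", "C", "C", "C"),
  value = c(10, 20, 20, 40, 15, 25, 35)
)

# With a left-hand side in the formula, xtabs() sums that column within
# each level of var and returns a table-like object
weighted <- xtabs(value ~ var, data = df)
print(weighted)

# tapply() computes the same per-group sums as a plain named vector
sums <- tapply(df$value, df$var, sum)
print(sums)
```

Both give one sum per level of var, so which one to use is mostly a matter of the output shape needed downstream.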

As I also often use the data.table package, here is a simple solution using it, assuming we create a data table as below (or already have one from some other source, like fread):

library("data.table")
dt <- data.table(df)

Then we can just sum over one column with respect to another column like this (and assign the result to a new column n):

dt[, .(n = sum(value)), by = var]

Nice new video release

28. January 2022 Daniel

As usual, it has been rather quiet here on my little blog again, and instead of posting any proper content, I would like to recommend an excellent band from Sweden, the ‘Viagra Boys’. They released a great music video yesterday!