Daniel

Happy New Year 2026

2. January 2026 Daniel Comments 0 Comment

A year of rather mixed feelings, 2025 is now over. From my perspective, there were many ups and downs, and to some extent I am afraid that 2026 will be much the same—so there may be a bumpy ride ahead. Let’s see what will happen. Nevertheless, all the best for the upcoming year!

We started the year with rather cold weather: on New Year's night we had around -14 degrees.

Limiting Parallel Jobs in Snakemake Using Resources

1. April 2025 Daniel Comments 0 Comment

Introduction

When running computationally intensive workflows with Snakemake, you might encounter issues where too many jobs are running in parallel, causing excessive I/O load, memory pressure, or high latency on your hard drive. This can lead to failed jobs or degraded performance.

Snakemake provides a way to limit parallel execution per rule using the resources directive, but this only works if you also specify a global resource limit when executing the workflow.

In this blog post, we will demonstrate how to properly limit the number of parallel jobs for a specific rule using Snakemake’s resource management system.

The Problem: Too Many Jobs Running at Once

Consider the following Snakemake rule:

rule process_data:
    input:
        "{sample}.raw"
    output:
        "{sample}.processed"
    resources:
        process_data_jobs=1  # Assign a resource unit for limiting the number of jobs
    shell:
        """
        some_tool --input {input} --output {output}
        """

Why Doesn’t `resources` Alone Limit Job Execution?

You might expect that setting resources: process_data_jobs=1 would automatically limit Snakemake to running only 1 job of this rule at a time. However, a resource declared in a rule only states how much each job requests. Snakemake does not restrict scheduling on that resource unless you specify a global limit for it when launching the workflow.

Without a global limit, Snakemake may still launch too many jobs in parallel, overloading your system.

The Solution: Enforce Resource Limits

To actually restrict the number of parallel jobs, run Snakemake with:

snakemake --resources process_data_jobs=10

How Does This Work?

Each job of process_data requests 1 unit of process_data_jobs.
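The global limit process_data_jobs=10 ensures that at most 10 jobs (10 / 1 = 10) of this rule run in parallel. You can also request more than one unit per job if you like; with 2 units per job, for example, the same limit allows 5 parallel jobs (10 / 2 = 5).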

Before setting this limit, too many jobs were running at once. After applying it, at most 10 jobs of process_data ran simultaneously.
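The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the idea, not Snakemake's actual scheduler: the function names max_parallel_jobs and schedule_waves are made up for this post, and the sketch assumes jobs start in fixed "waves", whereas Snakemake starts a new job as soon as a running one finishes. It also ignores other limits such as the number of cores.

```python
def max_parallel_jobs(global_limit, units_per_job):
    """Return how many jobs fit under the global limit.

    global_limit=None models the case where no --resources limit was given,
    so this resource does not restrict parallelism at all.
    """
    if global_limit is None:
        return None
    return global_limit // units_per_job


def schedule_waves(n_jobs, global_limit, units_per_job):
    """Simplified model: run jobs in batches that respect the resource limit."""
    cap = max_parallel_jobs(global_limit, units_per_job)
    if cap is None:
        # No limit on this resource: all jobs may be started at once.
        return [n_jobs] if n_jobs > 0 else []
    if cap == 0:
        raise ValueError("a single job requests more units than the global limit")
    waves = []
    remaining = n_jobs
    while remaining > 0:
        batch = min(cap, remaining)  # never exceed the allowed number of jobs
        waves.append(batch)
        remaining -= batch
    return waves


if __name__ == "__main__":
    print(schedule_waves(25, 10, 1))    # limit 10, 1 unit per job
    print(schedule_waves(25, 10, 2))    # limit 10, 2 units per job
    print(schedule_waves(25, None, 1))  # no global limit given
```

With 25 pending jobs, a limit of 10 and 1 unit per job, the sketch never runs more than 10 jobs together; without a limit, all 25 are eligible to start at once.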

Conclusion

If you are facing high disk latency, I/O pressure, or excessive job execution in Snakemake, the best way to control it is by:

Using resources to define per-job resource requirements.
Setting a global resource limit (--resources process_data_jobs=10) when executing Snakemake.

This approach ensures your workflow runs efficiently and reliably without overloading your system!

Genomic Prediction for Timothy Grass in Finland

19. March 2025 Daniel Comments 0 Comment

Timothy (Phleum pratense L.) is a key forage grass for Finnish agriculture, and improving its yield, winter hardiness, and digestibility is crucial for sustainable production. Our recent study explored the potential of genomic prediction to accelerate breeding progress by leveraging genotyping-by-sequencing and advanced statistical models.

Key findings:
* Heritability estimates ranged from 0.13 (yield at first cut) to 0.86 (digestibility at second cut).
* Genetic correlations suggest trade-offs between yield and winter survival but positive links between digestibility traits.
* Genomic breeding values were estimated using advanced statistical approaches, including a novel scaling of the genomic relationship matrix.
* Predictive ability reached up to 0.62 for digestibility, and validation confirmed moderate accuracy with little dispersion.

Despite concerns that genotype quality might impact predictions, our results show that genomic prediction remains a powerful tool for Timothy breeding in Finland. This research highlights the potential for data-driven breeding strategies to enhance forage crop resilience and quality.

https://link.springer.com/article/10.1007/s00122-025-04860-9

New R-package started

28. February 2025 Daniel Comments 0 Comment

I just started a new R package called ‘SnakebiteTools’. I would like to collect small helper functions there to better analyse and monitor Snakemake runs. The output can later be used for resource optimization, for checking the status of an ongoing Snakemake run (which can get messy for runs with plenty of jobs), etc.

Creating Drop-Down Menus in Excel

27. February 2025 Daniel Comments 0 Comment

I often do not remember how to create simple drop-down menus in Excel, so I decided to write a short note here. This is what I want to have:

In one tab, I want to have a column with the possible values for my drop-down menu, e.g. my project names.
In another tab, I want to have in each cell of a column a drop-down button that allows me to choose from these values.

This is in principle rather easy to achieve:

Step 1: Prepare the List on Another Sheet

Open your Excel file and go to the sheet where you want to store the drop-down values (e.g., Projects).
Enter the list of values in a column (e.g., A1:A10 in Projects).

Step 2: Name the List (Optional but Recommended)

Select the range of values in Projects (e.g., A1:A10 or the whole column).
Click on the Formulas tab → Define Name.
Enter a name (e.g., MyProjects) and click OK.

Step 3: Create the Drop-Down List

Go to the sheet where you want the drop-down (e.g., Tasks).
Select the cell(s) where you want the drop-down.
Click on the Data tab → Data Validation.
In the Allow box, choose List.
In the Source box:
- If you named the range: enter =MyProjects
- If not: enter e.g. =Projects!A1:A10
Click OK.
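If you do not name the range, the Source entry can also be built by a small helper, sketched here in Python. The function name validation_source is made up for this note. The sketch rests on two assumptions: it uses absolute references ($A$1:$A$10), so that the list range does not shift when the validation is applied to several cells at once, and it always wraps the sheet name in single quotes. Excel requires the quotes for names containing spaces, and they should be harmless for plain names.

```python
def validation_source(sheet, column, first_row, last_row):
    """Build the text for the Source box of a list-type data validation."""
    if first_row < 1 or last_row < first_row:
        raise ValueError("rows must satisfy 1 <= first_row <= last_row")
    if not column.isalpha():
        raise ValueError("column must be given as letters, e.g. 'A'")
    # Inside a quoted sheet name, an apostrophe is written as two apostrophes.
    quoted_sheet = "'" + sheet.replace("'", "''") + "'"
    col = column.upper()
    # The $ signs make column and rows absolute references.
    return f"={quoted_sheet}!${col}${first_row}:${col}${last_row}"


if __name__ == "__main__":
    print(validation_source("Projects", "A", 1, 10))
    print(validation_source("My Projects", "b", 2, 50))
```

The printed text can be pasted into the Source box as it is.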

If the first row contains a header (e.g. the column names), you will want to exclude it from the drop-down options. You can do that like this: