Christian Meesters
RSA
MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCKyTtRc5W2LD8SIQhd+T7IOhgr4GinEVpynNUiUppNi1Yvdo+qHljDplzsXlxc4pPwEB2+/RBYT75rMeD/VrV12s5Aq9t5XalDFKnxPE1vFN4P3dINAOq8ikc1yzU0fpJpz09+tefHG+Zk1Mz90PnLWLCs30wM6Iq6hI5NHmjklQIDAQAB
Christian Meesters
RSA
MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQC+DVJTSCzdJUJMzlwRfb2Qm2NZpuZ932RpVliWxMAhmZ6ZkeeibmJlW/Z0q51YkFArJIWbJ0f7rLQWM5epuZV4I5cjMqepOWGey6LrPolMs8lI9dJf9zW9lJsneV+py3U8WsrB6mg//pkEeXnVDd+WscADyXJ5EeLwoZjCkPn8/QIDAQAB
I am a sceptic of web applications and reproducibility. However, knowing some participants of this hackathon personally, I gained some trust.
interesting summary for ligand screening and validation
cornerstone of data analysis
config.yml
samples: samples.csv
ref:
species: "Drosophila melanogaster"
genome: ""
annotation: ""
accession: "GCF_000001215.4"
ensembl_species: "" # e.g., "homo_sapiens"
build: "" # e.g., "GRCh38"
release: "" # e.g., "105"
read_filter:
min_length: 200
minimap2:
index_opts: ""
opts: ""
maximum_secondary: 100
secondary_score_ratio: 1.0
samtools:
samtobam_opts: "-b"
bamsort_opts: ""
bamindex_opts: ""
bamstats_opts: ""
quant:
salmon_libtype: "U"
deseq2:
fit_type: ""
design_factors:
- "condition"
lfc_null: 1.0
alt_hypothesis: "greaterAbs"
point_width: 20
mincount: 10
alpha: 0.05
threshold_plot: 10
colormap: "Blues"
figtype: "png"
batch_effect:
- ""
isoform_analysis:
FLAIR: true
qscore: 1
exp_thresh: 10
col_opts: "--annotation_reliant generate --generate_map --stringent"
protein_annotation:
lambda: false
uniref: "https://ftp.imp.fu-berlin.de/pub/lambda/index/lambda3/gen_0/uniref50_20230713.lba.gz"
num_matches: 3
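The `read_filter` section of the configuration above sets `min_length: 200`. As a minimal sketch of what such a length filter could look like, the standard-library-only Python below keeps FASTQ records whose sequence reaches the threshold. The function names and the inclusive comparison (`>=`) are assumptions made for illustration; this is not the workflow's own filtering script.

```python
from typing import Iterable, Iterator, Tuple

# (header, sequence, quality) of one FASTQ record
FastqRecord = Tuple[str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield records from well-formed, four-line-per-record FASTQ text."""
    it = iter(lines)
    # zip over the same iterator consumes four lines per step
    for header, seq, _plus, qual in zip(it, it, it, it):
        yield header.rstrip("\n"), seq.rstrip("\n"), qual.rstrip("\n")


def filter_by_length(
    records: Iterable[FastqRecord], min_length: int = 200
) -> Iterator[FastqRecord]:
    """Keep records whose sequence has at least min_length bases (assumed inclusive)."""
    for header, seq, qual in records:
        if len(seq) >= min_length:
            yield header, seq, qual


# Hypothetical demo reads: one below and one above the threshold
demo_lines = [
    "@short_read", "ACGT", "+", "IIII",
    "@long_read", "A" * 250, "+", "I" * 250,
]
kept = list(filter_by_length(parse_fastq(demo_lines), min_length=200))
kept_headers = [header for header, _seq, _qual in kept]
print(kept_headers)
```

With the demo input only the 250-base read passes; lowering `min_length` to 4 would keep both.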
RAjHDlPDghZzc9ZvQ3uJQNJ9Jd_KAYzZt7dk5PXKgjRyE
This workflow performs differential expression analysis of RNA-seq data obtained from Oxford Nanopore long-read sequencing technology.
First, a transcriptome FASTA is constructed using gffread [https://github.com/gpertea/gffread]. Reads are then mapped to the transcriptome with the long-read-optimized alignment tool minimap2 [https://github.com/lh3/minimap2].
Next, quantification is performed using salmon [https://github.com/COMBINE-lab/salmon], before normalization and differential expression analysis are conducted with PyDESeq2 [https://github.com/owkin/PyDESeq2].
The workflow can optionally analyze splice isoforms by integrating the FLAIR [https://github.com/BrooksLabUCSC/flair] workflow.
Additionally, NanoPlot [https://github.com/wdecoster/NanoPlot] is used to assess the initial sequencing data, and QualiMap [https://github.com/EagleGenomics-cookbooks/QualiMap] is used to evaluate the mapping results.
2026-05-05T18:00:57.134252+00:00
sorted_alignments/{sample}_sorted.bam
QC/qualimap/{sample}
github.com/snakemake/snakemake-wrappers/bio/qualimap/bamqc/environment.yaml@v4.4.0
QC/qualimap/{sample}
qualimap/{sample}/qualimapReport.html
python>=3.12.4
sorted_alignments/{sample}_sorted.bam
sorted_alignments/{sample}_sorted.bam.bai
github.com/snakemake/snakemake-wrappers/bio/samtools/index/environment.yaml@v7.6.0
alignments/{sample}.bam
sorted_alignments/{sample}_sorted.bam
github.com/snakemake/snakemake-wrappers/bio/samtools/sort/environment.yaml@v7.6.0
alignments/{sample}.bam
QC/bamstats/{sample}.txt
github.com/snakemake/snakemake-wrappers/bio/samtools/stats/environment.yaml@v3.13.4
sorted_alignments/{sample}_sorted.bam
sorted_alignments/{sample}_sorted.bam.bai
iso_analysis/beds/{sample}.bed
flair=2.0.0
references/genomic.fa
index/flair_genome_index.mmi
github.com/snakemake/snakemake-wrappers/bio/minimap2/index/environment.yaml@v7.6.0
transcriptome/corrected_transcriptome.fa
index/transcriptome_index.mmi
github.com/snakemake/snakemake-wrappers/bio/minimap2/index/environment.yaml@v7.6.0
iso_analysis/beds/barcode10.bed
iso_analysis/beds/barcode11.bed
iso_analysis/beds/barcode12.bed
iso_analysis/beds/barcode13.bed
iso_analysis/beds/barcode15.bed
iso_analysis/beds/barcode16.bed
iso_analysis/beds/all_samples.bed
python>=3.12.4
transcriptome/transcriptome.fa
transcriptome/corrected_transcriptome.fa
gffread>=0.12.7
alignments/{sample}.bam
transcriptome/corrected_transcriptome.fa
counts/{sample}_salmon/quant.sf
salmon>=1.10.3
de_analysis/all.rds
de_analysis/{factor}_{prop_a}_vs_{prop_b}_MA_plot.svg
de_analysis/{factor}_{prop_a}_vs_{prop_b}_count_heatmap.svg
de_analysis/{factor}_{prop_a}_vs_{prop_b}_dispersion_plot.svg
de_analysis/{factor}_{prop_a}_vs_{prop_b}_l2fc.tsv
de_analysis/{factor}_{prop_a}_vs_{prop_b}_sample_heatmap.svg
de_analysis/{factor}_{prop_a}_vs_{prop_b}_top_count_heatmap.svg
bioconductor-deseq2 =1.46.0
r-ashr =2.2_63
r-pheatmap
r-rcolorbrewer
r-stringr =1.5.1
/fshpc/meesters/projects/snakemake-workflows/rna-longseq-de-isoform/config/demo/samples.csv
merged/all_counts_gene.tsv
de_analysis/all.rds
de_analysis/normcounts.tsv
bioconductor-deseq2 =1.46.0
r-ashr =2.2_63
r-pheatmap
r-rcolorbrewer
r-stringr =1.5.1
references/ensembl_annotation.gff3
github.com/snakemake/snakemake-wrappers/bio/reference/ensembl-annotation/environment.yaml@v7.5.0
references/ensembl_genome.fa
github.com/snakemake/snakemake-wrappers/bio/reference/ensembl-sequence/environment.yaml@v7.5.0
references/ncbi_dataset_annotation.zip
ncbi-datasets-cli>=18.14.0
unzip>=6.0.0
references/ncbi_dataset_genome.zip
ncbi-datasets-cli>=18.14.0
unzip>=6.0.0
filter/{sample}_filtered.fq
biopython >=1.84
pandas>=2.2.2
python>=3.12.4
filter/barcode10_filtered.fq
filter/barcode11_filtered.fq
filter/barcode12_filtered.fq
filter/barcode13_filtered.fq
filter/barcode15_filtered.fq
filter/barcode16_filtered.fq
index/flair_genome_index.mmi
references/genomic.fa
iso_analysis/align/flair.bam
iso_analysis/align/flair.bam.bai
iso_analysis/align/flair.bed
flair=2.0.0
filter/barcode10_filtered.fq
filter/barcode11_filtered.fq
filter/barcode12_filtered.fq
filter/barcode13_filtered.fq
filter/barcode15_filtered.fq
filter/barcode16_filtered.fq
iso_analysis/align/flair_all_corrected.bed
references/genomic.fa
references/standardized_genomic.gtf
iso_analysis/collapse/flair.isoforms.bed
iso_analysis/collapse/flair.isoforms.fa
flair=2.0.0
iso_analysis/align/flair.bed
references/genomic.fa
references/standardized_genomic.gtf
iso_analysis/align/flair_all_corrected.bed
flair=2.0.0
iso_analysis/quantify/flair.counts.tsv
iso_analysis/diffexp/genes_deseq2_QCplots_{condition_value1}_v_{condition_value2}.pdf
iso_analysis/diffexp/genes_deseq2_{condition_value1}_v_{condition_value2}.tsv
iso_analysis/diffexp/isoforms_deseq2_QCplots_{condition_value1}_v_{condition_value2}.pdf
iso_analysis/diffexp/isoforms_deseq2_{condition_value1}_v_{condition_value2}.tsv
iso_analysis/diffexp/isoforms_drimseq_{condition_value1}_v_{condition_value2}.tsv
flair=2.0.0
iso_analysis/collapse/flair.isoforms.bed
iso_analysis/quantify/flair.counts.tsv
iso_analysis/plots
flair=2.0.0
iso_analysis/collapse/flair.isoforms.bed
iso_analysis/collapse/flair.isoforms.fa
iso_analysis/reads_manifest.tsv
iso_analysis/quantify/flair.counts.tsv
flair=2.0.0
de_analysis/{factor}_{prop_a}_vs_{prop_b}_l2fc.tsv
transcriptome/corrected_transcriptome.fa
protein_annotation/{factor}_{prop_a}_vs_{prop_b}_de_genes.fa
biopython >=1.84
pandas>=2.2.2
python>=3.12.4
references/genomic.fa
references/standardized_genomic.gff
references/genomic.fa.fai
transcriptome/transcriptome.fa
gffread>=0.12.7
references/genomic.gff
ncbi-datasets-cli>=18.14.0
unzip>=6.0.0
references/genomic.fa
ncbi-datasets-cli>=18.14.0
unzip>=6.0.0
protein_annotation/index/UniRef.lba.gz
wget>=1.21.4
protein_annotation/blast_results_{factor}_{prop_a}_vs_{prop_b}.m8
protein_annotation/proteins_{factor}_{prop_a}_vs_{prop_b}.csv
biopython >=1.84
pandas>=2.2.2
python>=3.12.4
references/standardized_genomic.gff
references/standardized_genomic.gtf
gffread>=0.12.7
iso_analysis/plots
iso_analysis/report/isoforms
iso_analysis/report/usage
python>=3.12.4
protein_annotation/index/UniRef.lba.gz
protein_annotation/{factor}_{prop_a}_vs_{prop_b}_de_genes.fa
protein_annotation/blast_results_{factor}_{prop_a}_vs_{prop_b}.m8
lambda>=3.1.0
filter/{sample}_filtered.fq
index/transcriptome_index.mmi
alignments/{sample}.sam
github.com/snakemake/snakemake-wrappers/bio/minimap2/aligner/environment.yaml@v7.6.0
counts/barcode10_salmon/quant.sf
counts/barcode11_salmon/quant.sf
counts/barcode12_salmon/quant.sf
counts/barcode13_salmon/quant.sf
counts/barcode15_salmon/quant.sf
counts/barcode16_salmon/quant.sf
merged/all_counts.tsv
pandas>=2.2.2
python>=3.12.4
de_analysis/all.rds
de_analysis/pca_{variable}.svg
bioconductor-deseq2 =1.46.0
r-ashr =2.2_63
r-pheatmap
r-rcolorbrewer
r-stringr =1.5.1
iso_analysis/reads_manifest.tsv
pandas>=2.2.2
python>=3.12.4
alignments/{sample}.sam
alignments/{sample}.bam
github.com/snakemake/snakemake-wrappers/bio/samtools/view/environment.yaml@v7.6.0
NanoPlot/{sample}/NanoPlot-report.html
nanoplot
references/genomic.gff
references/standardized_genomic.gff
agat>=1.4.0
/lustre/project/nhr-zdvhpc/dtest/raw/barcode10.fastq.gz
/lustre/project/nhr-zdvhpc/dtest/raw/barcode11.fastq.gz
/lustre/project/nhr-zdvhpc/dtest/raw/barcode12.fastq.gz
/lustre/project/nhr-zdvhpc/dtest/raw/barcode13.fastq.gz
/lustre/project/nhr-zdvhpc/dtest/raw/barcode15.fastq.gz
/lustre/project/nhr-zdvhpc/dtest/raw/barcode16.fastq.gz
NanoPlot/NanoPlot-report.html
nanoplot
merged/all_counts.tsv
references/standardized_genomic.gff
merged/all_counts_gene.tsv
merged/transcriptid_to_gene_plot.svg
anndata=0.10.8
bioinfokit
pydeseq2=0.4.10
seaborn>=0.13.2
from workflow configuration
workflow rules
config.yml
samples: samples.csv
ref:
species: "Drosophila melanogaster"
genome: ""
annotation: ""
accession: "GCF_000001215.4"
ensembl_species: "" # e.g., "homo_sapiens"
build: "" # e.g., "GRCh38"
release: "" # e.g., "105"
read_filter:
min_length: 200
minimap2:
index_opts: ""
opts: ""
maximum_secondary: 100
secondary_score_ratio: 1.0
samtools:
samtobam_opts: "-b"
bamsort_opts: ""
bamindex_opts: ""
bamstats_opts: ""
quant:
salmon_libtype: "U"
deseq2:
fit_type: ""
design_factors:
- "condition"
lfc_null: 1.0
alt_hypothesis: "greaterAbs"
point_width: 20
mincount: 10
alpha: 0.05
threshold_plot: 10
colormap: "Blues"
figtype: "png"
batch_effect:
- ""
isoform_analysis:
FLAIR: true
qscore: 1
exp_thresh: 10
col_opts: "--annotation_reliant generate --generate_map --stringent"
protein_annotation:
lambda: false
uniref: "https://ftp.imp.fu-berlin.de/pub/lambda/index/lambda3/gen_0/uniref50_20230713.lba.gz"
num_matches: 3
RAjHDlPDghZzc9ZvQ3uJQNJ9Jd_KAYzZt7dk5PXKgjRyE
This workflow performs differential expression analysis of RNA-seq data obtained from Oxford Nanopore long-read sequencing technology.
First, a transcriptome FASTA is constructed using gffread [https://github.com/gpertea/gffread]. Reads are then mapped to the transcriptome with the long-read-optimized alignment tool minimap2 [https://github.com/lh3/minimap2].
Next, quantification is performed using salmon [https://github.com/COMBINE-lab/salmon], before normalization and differential expression analysis are conducted with PyDESeq2 [https://github.com/owkin/PyDESeq2].
The workflow can optionally analyze splice isoforms by integrating the FLAIR [https://github.com/BrooksLabUCSC/flair] workflow.
Additionally, NanoPlot [https://github.com/wdecoster/NanoPlot] is used to assess the initial sequencing data, and QualiMap [https://github.com/EagleGenomics-cookbooks/QualiMap] is used to evaluate the mapping results.
2026-05-05T15:42:06.149516+00:00
sorted_alignments/{sample}_sorted.bam
QC/qualimap/{sample}
github.com/snakemake/snakemake-wrappers/bio/qualimap/bamqc/environment.yaml@v4.4.0
v4.4.0
QC/qualimap/{sample}
qualimap/{sample}/qualimapReport.html
../envs/base.yml
sorted_alignments/{sample}_sorted.bam
sorted_alignments/{sample}_sorted.bam.bai
github.com/snakemake/snakemake-wrappers/bio/samtools/index/environment.yaml@v7.6.0
v7.6.0
alignments/{sample}.bam
sorted_alignments/{sample}_sorted.bam
github.com/snakemake/snakemake-wrappers/bio/samtools/sort/environment.yaml@v7.6.0
v7.6.0
alignments/{sample}.bam
QC/bamstats/{sample}.txt
github.com/snakemake/snakemake-wrappers/bio/samtools/stats/environment.yaml@v3.13.4
v3.13.4
sorted_alignments/{sample}_sorted.bam
sorted_alignments/{sample}_sorted.bam.bai
iso_analysis/beds/{sample}.bed
../envs/flair.yml
references/genomic.fa
index/flair_genome_index.mmi
github.com/snakemake/snakemake-wrappers/bio/minimap2/index/environment.yaml@v7.6.0
v7.6.0
transcriptome/corrected_transcriptome.fa
index/transcriptome_index.mmi
github.com/snakemake/snakemake-wrappers/bio/minimap2/index/environment.yaml@v7.6.0
v7.6.0
iso_analysis/beds/barcode10.bed
iso_analysis/beds/barcode11.bed
iso_analysis/beds/barcode12.bed
iso_analysis/beds/barcode13.bed
iso_analysis/beds/barcode15.bed
iso_analysis/beds/barcode16.bed
iso_analysis/beds/all_samples.bed
../envs/base.yml
transcriptome/transcriptome.fa
transcriptome/corrected_transcriptome.fa
../envs/gffread.yml
alignments/{sample}.bam
transcriptome/corrected_transcriptome.fa
counts/{sample}_salmon/quant.sf
../envs/salmon.yml
de_analysis/all.rds
de_analysis/{factor}_{prop_a}_vs_{prop_b}_MA_plot.svg
de_analysis/{factor}_{prop_a}_vs_{prop_b}_count_heatmap.svg
de_analysis/{factor}_{prop_a}_vs_{prop_b}_dispersion_plot.svg
de_analysis/{factor}_{prop_a}_vs_{prop_b}_l2fc.tsv
de_analysis/{factor}_{prop_a}_vs_{prop_b}_sample_heatmap.svg
de_analysis/{factor}_{prop_a}_vs_{prop_b}_top_count_heatmap.svg
../envs/deseq2.yml
/fshpc/meesters/projects/snakemake-workflows/rna-longseq-de-isoform/config/demo/samples.csv
merged/all_counts_gene.tsv
de_analysis/all.rds
de_analysis/normcounts.tsv
../envs/deseq2.yml
references/ensembl_annotation.gff3
github.com/snakemake/snakemake-wrappers/bio/reference/ensembl-annotation/environment.yaml@v7.5.0
v7.5.0
references/ensembl_genome.fa
github.com/snakemake/snakemake-wrappers/bio/reference/ensembl-sequence/environment.yaml@v7.5.0
v7.5.0
references/ncbi_dataset_annotation.zip
../envs/reference.yml
references/ncbi_dataset_genome.zip
../envs/reference.yml
filter/{sample}_filtered.fq
../envs/biopython.yml
filter/barcode10_filtered.fq
filter/barcode11_filtered.fq
filter/barcode12_filtered.fq
filter/barcode13_filtered.fq
filter/barcode15_filtered.fq
filter/barcode16_filtered.fq
index/flair_genome_index.mmi
references/genomic.fa
iso_analysis/align/flair.bam
iso_analysis/align/flair.bam.bai
iso_analysis/align/flair.bed
../envs/flair.yml
filter/barcode10_filtered.fq
filter/barcode11_filtered.fq
filter/barcode12_filtered.fq
filter/barcode13_filtered.fq
filter/barcode15_filtered.fq
filter/barcode16_filtered.fq
iso_analysis/align/flair_all_corrected.bed
references/genomic.fa
references/standardized_genomic.gtf
iso_analysis/collapse/flair.isoforms.bed
iso_analysis/collapse/flair.isoforms.fa
../envs/flair.yml
iso_analysis/align/flair.bed
references/genomic.fa
references/standardized_genomic.gtf
iso_analysis/align/flair_all_corrected.bed
../envs/flair.yml
iso_analysis/quantify/flair.counts.tsv
iso_analysis/diffexp/genes_deseq2_QCplots_{condition_value1}_v_{condition_value2}.pdf
iso_analysis/diffexp/genes_deseq2_{condition_value1}_v_{condition_value2}.tsv
iso_analysis/diffexp/isoforms_deseq2_QCplots_{condition_value1}_v_{condition_value2}.pdf
iso_analysis/diffexp/isoforms_deseq2_{condition_value1}_v_{condition_value2}.tsv
iso_analysis/diffexp/isoforms_drimseq_{condition_value1}_v_{condition_value2}.tsv
../envs/flair.yml
iso_analysis/collapse/flair.isoforms.bed
iso_analysis/quantify/flair.counts.tsv
iso_analysis/plots
../envs/flair.yml
iso_analysis/collapse/flair.isoforms.bed
iso_analysis/collapse/flair.isoforms.fa
iso_analysis/reads_manifest.tsv
iso_analysis/quantify/flair.counts.tsv
../envs/flair.yml
de_analysis/{factor}_{prop_a}_vs_{prop_b}_l2fc.tsv
transcriptome/corrected_transcriptome.fa
protein_annotation/{factor}_{prop_a}_vs_{prop_b}_de_genes.fa
../envs/biopython.yml
references/genomic.fa
references/standardized_genomic.gff
references/genomic.fa.fai
transcriptome/transcriptome.fa
../envs/gffread.yml
references/genomic.gff
../envs/reference.yml
references/genomic.fa
../envs/reference.yml
protein_annotation/index/UniRef.lba.gz
../envs/wget.yml
protein_annotation/blast_results_{factor}_{prop_a}_vs_{prop_b}.m8
protein_annotation/proteins_{factor}_{prop_a}_vs_{prop_b}.csv
../envs/biopython.yml
references/standardized_genomic.gff
references/standardized_genomic.gtf
../envs/gffread.yml
iso_analysis/plots
iso_analysis/report/isoforms
iso_analysis/report/usage
../envs/base.yml
protein_annotation/index/UniRef.lba.gz
protein_annotation/{factor}_{prop_a}_vs_{prop_b}_de_genes.fa
protein_annotation/blast_results_{factor}_{prop_a}_vs_{prop_b}.m8
../envs/lambda3.yml
filter/{sample}_filtered.fq
index/transcriptome_index.mmi
alignments/{sample}.sam
github.com/snakemake/snakemake-wrappers/bio/minimap2/aligner/environment.yaml@v7.6.0
v7.6.0
counts/barcode10_salmon/quant.sf
counts/barcode11_salmon/quant.sf
counts/barcode12_salmon/quant.sf
counts/barcode13_salmon/quant.sf
counts/barcode15_salmon/quant.sf
counts/barcode16_salmon/quant.sf
merged/all_counts.tsv
../envs/pandas.yml
de_analysis/all.rds
de_analysis/pca_{variable}.svg
../envs/deseq2.yml
iso_analysis/reads_manifest.tsv
../envs/pandas.yml
alignments/{sample}.sam
alignments/{sample}.bam
github.com/snakemake/snakemake-wrappers/bio/samtools/view/environment.yaml@v7.6.0
v7.6.0
NanoPlot/{sample}/NanoPlot-report.html
../envs/nanoplot.yml
references/genomic.gff
references/standardized_genomic.gff
../envs/agat.yml
/lustre/project/nhr-zdvhpc/dtest/raw/barcode10.fastq.gz
/lustre/project/nhr-zdvhpc/dtest/raw/barcode11.fastq.gz
/lustre/project/nhr-zdvhpc/dtest/raw/barcode12.fastq.gz
/lustre/project/nhr-zdvhpc/dtest/raw/barcode13.fastq.gz
/lustre/project/nhr-zdvhpc/dtest/raw/barcode15.fastq.gz
/lustre/project/nhr-zdvhpc/dtest/raw/barcode16.fastq.gz
NanoPlot/NanoPlot-report.html
../envs/nanoplot.yml
merged/all_counts.tsv
references/standardized_genomic.gff
merged/all_counts_gene.tsv
merged/transcriptid_to_gene_plot.svg
../envs/pydeseq2.yml
from workflow configuration
workflow rules
A gentle introduction to Nanopubs
Nanopub 101
Welcome!
config.yml
samples: samples.csv
ref:
species: "Drosophila melanogaster"
genome: ""
annotation: ""
accession: "GCF_000001215.4"
ensembl_species: "" # e.g., "homo_sapiens"
build: "" # e.g., "GRCh38"
release: "" # e.g., "105"
read_filter:
min_length: 200
minimap2:
index_opts: ""
opts: ""
maximum_secondary: 100
secondary_score_ratio: 1.0
samtools:
samtobam_opts: "-b"
bamsort_opts: ""
bamindex_opts: ""
bamstats_opts: ""
quant:
salmon_libtype: "U"
deseq2:
fit_type: ""
design_factors:
- "condition"
lfc_null: 1.0
alt_hypothesis: "greaterAbs"
point_width: 20
mincount: 10
alpha: 0.05
threshold_plot: 10
colormap: "Blues"
figtype: "png"
batch_effect:
- ""
isoform_analysis:
FLAIR: true
qscore: 1
exp_thresh: 10
col_opts: "--annotation_reliant generate --generate_map --stringent"
protein_annotation:
lambda: false
uniref: "https://ftp.imp.fu-berlin.de/pub/lambda/index/lambda3/gen_0/uniref50_20230713.lba.gz"
num_matches: 3
RAjHDlPDghZzc9ZvQ3uJQNJ9Jd_KAYzZt7dk5PXKgjRyE
This workflow performs differential expression analysis of RNA-seq data obtained from Oxford Nanopore long-read sequencing technology.
First, a transcriptome FASTA is constructed using gffread [https://github.com/gpertea/gffread]. Reads are then mapped to the transcriptome with the long-read-optimized alignment tool minimap2 [https://github.com/lh3/minimap2].
Next, quantification is performed using salmon [https://github.com/COMBINE-lab/salmon], before normalization and differential expression analysis are conducted with PyDESeq2 [https://github.com/owkin/PyDESeq2].
The workflow can optionally analyze splice isoforms by integrating the FLAIR [https://github.com/BrooksLabUCSC/flair] workflow.
Additionally, NanoPlot [https://github.com/wdecoster/NanoPlot] is used to assess the initial sequencing data, and QualiMap [https://github.com/EagleGenomics-cookbooks/QualiMap] is used to evaluate the mapping results.
2026-04-17T15:19:46.851430+00:00
QC/qualimap/barcode10
alignment_qa
qualimap/barcode13/qualimapReport.html
alignment_qa_report
qualimap/barcode15/qualimapReport.html
alignment_qa_report
qualimap/barcode16/qualimapReport.html
alignment_qa_report
sorted_alignments/barcode10_sorted.bam
bam_sort
sorted_alignments/barcode11_sorted.bam
bam_sort
sorted_alignments/barcode12_sorted.bam
bam_sort
sorted_alignments/barcode13_sorted.bam
bam_sort
sorted_alignments/barcode15_sorted.bam
bam_sort
sorted_alignments/barcode16_sorted.bam
bam_sort
QC/bamstats/barcode10.txt
bam_stats
QC/qualimap/barcode11
alignment_qa
QC/bamstats/barcode11.txt
bam_stats
QC/bamstats/barcode12.txt
bam_stats
QC/bamstats/barcode13.txt
bam_stats
QC/bamstats/barcode15.txt
bam_stats
QC/bamstats/barcode16.txt
bam_stats
index/flair_genome_index.mmi
build_flair_genome_index
index/transcriptome_index.mmi
build_minimap_index
transcriptome/corrected_transcriptome.fa
correct_transcriptome
counts/barcode10_salmon/quant.sf
count_reads
counts/barcode11_salmon/quant.sf
count_reads
QC/qualimap/barcode12
alignment_qa
counts/barcode12_salmon/quant.sf
count_reads
counts/barcode13_salmon/quant.sf
count_reads
counts/barcode15_salmon/quant.sf
count_reads
counts/barcode16_salmon/quant.sf
count_reads
de_analysis/condition_wt_vs_mt_MA_plot.svg
de_analysis/condition_wt_vs_mt_count_heatmap.svg
de_analysis/condition_wt_vs_mt_dispersion_plot.svg
de_analysis/condition_wt_vs_mt_l2fc.tsv
de_analysis/condition_wt_vs_mt_sample_heatmap.svg
de_analysis/condition_wt_vs_mt_top_count_heatmap.svg
deseq2
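The concrete `de_analysis/condition_wt_vs_mt_*` file names listed above correspond to the wildcard patterns of the form `de_analysis/{factor}_{prop_a}_vs_{prop_b}_...` with `factor=condition`, `prop_a=wt`, and `prop_b=mt`. The sketch below reproduces that substitution with plain `str.format`; it only illustrates the naming scheme and does not use Snakemake itself. The helper name `fill_pattern` is hypothetical.

```python
# Patterns copied from the report metadata
L2FC_PATTERN = "de_analysis/{factor}_{prop_a}_vs_{prop_b}_l2fc.tsv"
FILTER_PATTERN = "filter/{sample}_filtered.fq"

# Sample names as they appear in the per-sample output paths
SAMPLES = ["barcode10", "barcode11", "barcode12",
           "barcode13", "barcode15", "barcode16"]


def fill_pattern(pattern: str, **wildcards: str) -> str:
    """Substitute named wildcards into a path pattern (illustrative helper)."""
    return pattern.format(**wildcards)


l2fc_path = fill_pattern(L2FC_PATTERN, factor="condition", prop_a="wt", prop_b="mt")
filtered_paths = [fill_pattern(FILTER_PATTERN, sample=s) for s in SAMPLES]

print(l2fc_path)
print(filtered_paths[0])
```

The same substitution explains why one pattern entry in the rule listing maps to six per-sample files in the results listing.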
de_analysis/all.rds
de_analysis/normcounts.tsv
deseq2_init
references/ncbi_dataset_annotation.zip
download_ncbi_annotation
references/ncbi_dataset_genome.zip
download_ncbi_genome
filter/barcode10_filtered.fq
filter_reads
filter/barcode11_filtered.fq
filter_reads
QC/qualimap/barcode13
alignment_qa
filter/barcode12_filtered.fq
filter_reads
filter/barcode13_filtered.fq
filter_reads
filter/barcode15_filtered.fq
filter_reads
filter/barcode16_filtered.fq
filter_reads
iso_analysis/align/flair.bam
iso_analysis/align/flair.bam.bai
iso_analysis/align/flair.bed
flair_align
iso_analysis/collapse/flair.isoforms.bed
iso_analysis/collapse/flair.isoforms.fa
flair_collapse
iso_analysis/align/flair_all_corrected.bed
flair_correct
iso_analysis/diffexp/genes_deseq2_QCplots_wt_v_mt.pdf
iso_analysis/diffexp/genes_deseq2_wt_v_mt.tsv
iso_analysis/diffexp/isoforms_deseq2_QCplots_wt_v_mt.pdf
iso_analysis/diffexp/isoforms_deseq2_wt_v_mt.tsv
iso_analysis/diffexp/isoforms_drimseq_wt_v_mt.tsv
flair_diffexp
iso_analysis/plots
flair_plot_isoforms
iso_analysis/quantify/flair.counts.tsv
flair_quantify
QC/qualimap/barcode15
alignment_qa
references/genomic.fa.fai
transcriptome/transcriptome.fa
genome_to_transcriptome
references/genomic.gff
get_annotation
references/genomic.fa
get_genome
references/standardized_genomic.gtf
gff_to_gtf
iso_analysis/report/isoforms
iso_analysis/report/usage
iso_analysis_report
alignments/barcode10.sam
map_reads
alignments/barcode11.sam
map_reads
alignments/barcode12.sam
map_reads
alignments/barcode13.sam
map_reads
alignments/barcode15.sam
map_reads
QC/qualimap/barcode16
alignment_qa
alignments/barcode16.sam
map_reads
merged/all_counts.tsv
merge_read_counts
iso_analysis/reads_manifest.tsv
reads_manifest
alignments/barcode10.bam
sam_to_bam
alignments/barcode11.bam
sam_to_bam
alignments/barcode12.bam
sam_to_bam
alignments/barcode13.bam
sam_to_bam
alignments/barcode15.bam
sam_to_bam
alignments/barcode16.bam
sam_to_bam
NanoPlot/barcode10/NanoPlot-report.html
sample_qa_plot
qualimap/barcode10/qualimapReport.html
alignment_qa_report
NanoPlot/barcode11/NanoPlot-report.html
sample_qa_plot
NanoPlot/barcode12/NanoPlot-report.html
sample_qa_plot
NanoPlot/barcode13/NanoPlot-report.html
sample_qa_plot
NanoPlot/barcode15/NanoPlot-report.html
sample_qa_plot
NanoPlot/barcode16/NanoPlot-report.html
sample_qa_plot
references/standardized_genomic.gff
standardize_gff
NanoPlot/NanoPlot-report.html
total_sample_qa_plot
merged/all_counts_gene.tsv
merged/transcriptid_to_gene_plot.svg
transcriptid_to_gene
qualimap/barcode11/qualimapReport.html
alignment_qa_report
qualimap/barcode12/qualimapReport.html
alignment_qa_report
from workflow configuration
2026-04-07
2026-04-11
Open Science Retreat Global - 2026
Machynlleth, Wales, UK
https://digital-research.academy/
https://www.wikidata.org/wiki/Q137737148?wprov=acrw1_0
OpenScience Community members
The Snakemake Developer Community is a support group for software packages of the Snakemake Workflow Management System. It organizes events such as hackathons and workshops revolving around better and more reproducible data analysis.
Snakemake Community
config/config.yaml
samples:
A: data/samples/A.fastq
B: data/samples/B.fastq
2026-03-17T19:26:28.393653+00:00
1
-R '@RG\tID:{sample}\tSM:{sample}'
2
samtools
3
coordinate
4
1
calls/all.vcf
calls/positions.png
calls/quals.png
all
all
calls/all.vcf
envs/bcftools.yml
shell
bcftools_call
sorted_reads/A.bam
sorted_reads/B.bam
sorted_reads/{sample}.bam
envs/bwa.yml
v8.1.1/bio/bwa/mem
["-R '@RG\\tID:{sample}\\tSM:{sample}'", "samtools", "coordinate", ""]
bwa_mem
calls/positions.png
envs/matplotlib.yml
scripts/plot-positions.py
plot_positions
calls/quals.png
envs/matplotlib.yml
scripts/plot-quals.py
plot_quals
sorted_reads/A.bam.bai
sorted_reads/B.bam.bai
sorted_reads/{sample}.bam.bai
envs/samtools.yml
v5.7.0/bio/samtools/index
[""]
samtools_index
from workflow configuration
2026-03-17T16:56:38.233861+00:00
1
-R '@RG\tID:{sample}\tSM:{sample}'
2
samtools
3
coordinate
4
1
all
all
calls/all.vcf
envs/bcftools.yml
shell
bcftools_call
sorted_reads/A.bam
sorted_reads/B.bam
sorted_reads/{sample}.bam
envs/bwa.yml
v8.1.1/bio/bwa/mem
["-R '@RG\\tID:{sample}\\tSM:{sample}'", "samtools", "coordinate", ""]
bwa_mem
calls/positions.png
envs/matplotlib.yml
scripts/plot-positions.py
plot_positions
calls/quals.png
envs/matplotlib.yml
scripts/plot-quals.py
plot_quals
sorted_reads/A.bam.bai
sorted_reads/B.bam.bai
sorted_reads/{sample}.bam.bai
envs/samtools.yml
v5.7.0/bio/samtools/index
[""]
samtools_index
{"samples": {"A": "data/samples/A.fastq", "B": "data/samples/B.fastq"}}
config/config.yaml
Snakefile
config/config.yaml
envs/bcftools.yml
envs/bwa.yml
envs/matplotlib.yml
envs/samtools.yml
scripts/plot-positions.py
scripts/plot-quals.py
/home/meesters/Documents/Teaching/snakemake-hpc-teaching-material/snakemake-tutorial/Snakefile
/home/meesters/Documents/Teaching/snakemake-hpc-teaching-material/snakemake-tutorial/Snakefile
{}
8
2
6
Work is already ongoing for the Java library: https://github.com/Nanopublication/nanopub-py/issues/215 and https://github.com/Nanopublication/nanopub-java/compare/master...56-add-sem-release-workflow
Could nanopub-py be linked with the release-please action (creating automated releases and registering with PyPI in one go)? I would require an organizational GitHub token (from which the Java library could profit, too).
the snakemake tutorial workflow in various conditions.
Snakemake HPC Teaching Material
HTML
XML
PDF
workflow report
has the description
has license
has the title
is a
has primary source workflow
data catalog
is archived at
was published on
will no longer be available from
has repository
results from
has source
URI of institution/organization where report is archived
Template for describing a workflow report
Describing a workflow report
Report: ${title}
Computational Workflow
description of the report
date when report availability expires (YYYY-MM-DD)
license of the published report
URI, ROR of institution/organization where report was produced
URI of research project in which report was produced
date when made publicly available (YYYY-MM-DD)
short URI suffix or full URI
URI of repository where report is published
title of the report
URI of workflow nanopub that is the primary source of this report
This is an example report for the Snakemake workflow for differential gene expression and splice-isoform analysis of Nanopore long-read RNA-Seq data using minimap2, Salmon, and DESeq2: https://github.com/snakemake-workflows/rna-longseq-de-isoform/
Example Snakemake Report
2026-03-04
This nanopub template contains an error, should not be placed under datasets and is currently not retractable.
date
is a
has name
is further described by
end time
located in
start time
hybrid event
in-person event
virtual event
has event organizer
has event sponsor
has participating community
This template can be used to define a Retreat - describe the purpose, here.
Defining a Retreat
Events
community - a community that has participated in the event
(starting) date of the event (e.g. 2020-12-31)
[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]
short name, used as URI suffix
finish date of the event (e.g. 2020-12-31)
location of the event (URL or name)
the name of the event
online resource
organizer of the event (ROR, URL or name)
sponsor of the event (ROR or name)
0
1
2
3
4
6
7
8
9
10
11
5
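The start and finish dates in the event template above are constrained by the pattern `[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]`. That pattern checks only the shape of the string, so a value such as `2026-13-45` would still pass. The sketch below pairs the pattern with a calendar check from Python's standard library; the function names are hypothetical, and matching the whole string is an assumption about how the template applies the pattern.

```python
import re
from datetime import date
from typing import Optional

# Pattern copied verbatim from the template
DATE_PATTERN = re.compile(r"[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]")


def parse_event_date(text: str) -> Optional[date]:
    """Return a date if text has the template's shape and is a real calendar day."""
    if DATE_PATTERN.fullmatch(text) is None:
        return None
    try:
        return date.fromisoformat(text)
    except ValueError:
        # Right shape, but not an existing day (e.g. month 13)
        return None


def is_valid_range(start_text: str, end_text: str) -> bool:
    """True if both dates parse and the event does not end before it starts."""
    start, end = parse_event_date(start_text), parse_event_date(end_text)
    return start is not None and end is not None and start <= end


print(parse_event_date("2026-04-07"))
print(parse_event_date("2026-13-45"))
print(is_valid_range("2026-04-07", "2026-04-11"))
```

The last call uses the retreat dates recorded earlier in this collection (2026-04-07 to 2026-04-11).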
Test data set for a Snakemake workflow for differential expression and splice-isoform detection based on ONT long-read RNA-seq data
FASTA
FASTQ
fitch program
GCG
GenBank format
genpept
GFF2-seq
GFF3-seq
giFASTA format
pdbatom
created by
has description
needs or produces data format
has license
has title
improves upon method
CC-BY-4.0
serves purpose
applies to domains
has source repository
has keywords
uses computational approach
is accessible via
Apache 2.0
BSD 3 Clause
GPL 3.0
MIT
applicable research domains
https://www.wikidata.org/w/api.php?action=wbsearchentities&language=en&format=json&limit=5&search=
primary computational purpose
https://www.wikidata.org/w/api.php?action=wbsearchentities&language=en&format=json&limit=5&search=
Template for describing a computational workflow implemented for and supported by a Workflow Management System
Computational Workflow
Method: ${title}
Computational Workflow
method this improves upon (DOI or URI)
computational approach used
https://www.wikidata.org/w/api.php?action=wbsearchentities&language=en&format=json&limit=5&search=
URI of the workflow repository
ORCID of creator
https://orcid.org/
ORCID (just the 16-digit ID)
0000-\d{4}-\d{4}-\d{4}
description of what the method does
[\s\S]{10,1000}
DOI URL
https://doi.org/
DOI (starting with '10.')
10.(\d)+/(\S)+
GitHub repository URL
expected input data format
input:
Input:
research keywords from Wikidata
https://www.wikidata.org/w/api.php?action=wbsearchentities&language=en&format=json&limit=5&search=
software license
title of the Workflow
[\s\S]{5,200}
Workflow Management System
uses workflow management system
Snakemake
Workflow Management System
Nextflow
Galaxy
CWL
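The workflow template above validates the ORCID and DOI fields with the patterns `0000-\d{4}-\d{4}-\d{4}` and `10.(\d)+/(\S)+`. The sketch below applies both patterns exactly as written, assuming they must match the whole input (an assumption about the template engine). It also shows two edge cases that may be worth knowing: the ORCID pattern rejects identifiers whose final checksum character is `X`, and the unescaped `.` in the DOI pattern accepts any character after `10`. All test values except the shape of the identifiers are hypothetical.

```python
import re

# Patterns copied verbatim from the template
ORCID_PATTERN = re.compile(r"0000-\d{4}-\d{4}-\d{4}")
DOI_PATTERN = re.compile(r"10.(\d)+/(\S)+")


def matches(pattern: "re.Pattern[str]", value: str) -> bool:
    """True if the whole value matches the pattern (assumed full-string match)."""
    return pattern.fullmatch(value) is not None


checks = {
    "orcid_digits": matches(ORCID_PATTERN, "0000-0001-2345-6789"),
    "orcid_checksum_x": matches(ORCID_PATTERN, "0000-0001-2345-678X"),
    "doi_plain": matches(DOI_PATTERN, "10.1234/example.5678"),
    "doi_with_prefix": matches(DOI_PATTERN, "doi:10.1234/example.5678"),
    "doi_wrong_separator": matches(DOI_PATTERN, "10x1234/abc"),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```

If `X`-terminated ORCID iDs should be accepted, a final group such as `\d{3}[\dX]` would be one possible adjustment; escaping the dot (`10\.`) would tighten the DOI check.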
A Snakemake workflow that analyses ONT RNA-Seq transcriptomics data to yield differential expression results along with splice-isoform detection.
rna-longseq-de-isoform
FASTA
FASTQ
fitch program
GCG
GenBank format
genpept
GFF2-seq
GFF3-seq
giFASTA format
pdbatom
created by
has description
needs or produces data format
has license
has title
improves upon method
CC-BY-4.0
serves purpose
applies to domains
has source repository
has keywords
uses computational approach
reports performance results
is accessible via
Apache 2.0
BSD 3 Clause
GPL 3.0
MIT
applicable research domains
https://www.wikidata.org/w/api.php?action=wbsearchentities&language=en&format=json&limit=5&search=
primary computational purpose
https://www.wikidata.org/w/api.php?action=wbsearchentities&language=en&format=json&limit=5&search=
Template for describing a computational workflow implemented for and supported by a Workflow Management System
Computational Workflow (without target namespace)
Method: ${title}
Computational Workflow
method this improves upon (DOI or URI)
computational approach used
https://www.wikidata.org/w/api.php?action=wbsearchentities&language=en&format=json&limit=5&search=
URI of the workflow repository
ORCID of creator
https://orcid.org/
ORCID (just the 16-digit ID)
0000-\d{4}-\d{4}-\d{4}
description of what the method does
[\s\S]{10,1000}
DOI URL
https://doi.org/
DOI (starting with '10.')
10.(\d)+/(\S)+
GitHub repository URL
expected input data format
input:
Input:
research keywords from Wikidata
https://www.wikidata.org/w/api.php?action=wbsearchentities&language=en&format=json&limit=5&search=
software license
quantitative performance measurements
[\s\S]{10,1000}
title of the Workflow
[\s\S]{5,200}
Workflow Management System
uses workflow management system
Snakemake
Workflow Management System
Nextflow
Galaxy
CWL
I think we need to have an "onboarding SOP" for
- people who start authenticating others for the first time
- 3rd party lecturers who introduce nanopubs in workshops
It could be a slide set or a static part of the homepage or ... anything else without too much effort.
What do you think?
2026-03-09
2026-03-13
Snakemake Hackathon
Munich, Germany
Christian Meesters
Clemens Lange
David Lähnemann
Johannes Köster
Lukas Heinrich
Matthew Feickert
date
is a
has name
is further described by
end time
located in
start time
hybrid event
in-person event
virtual event
has event organizer
has event sponsor
has participating community
This template can be used to define a Hackathon Event.
Defining a Hackathon Event
Events
community - a community that has participated in the event
(starting) date of the event (e.g. 2020-12-31)
[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]
short name, used as URI suffix
finish date of the event (e.g. 2020-12-31)
location of the event (URL or name)
the name of the event
online resource
organizer of the event (ROR, URL or name)
sponsor of the event (ROR or name)
0
1
2
3
4
6
7
8
9
10
11
5
Ah, I was thinking, should the "Plan to attend" nanopub template not have a description? A text label, as not all URLs are descriptive?
And there it is! The template is updated. Already!
How about adding the option to link it with another nanopub? A declared event? And the declared event should have a descriptive field, too. ;-)
What do you think?
The German Research Software Engineering Conference (deRSE26) takes place from March 3 to March 5, 2026, at the University of Stuttgart.
Contributions from all computational fields are welcome.
German Research Software Engineering Conference 2026
9 October 2025
An introduction to Continuous Integration
Continuous Integration Course
Data Analysis Workflow on HPC Systems
Snakemake HPC Teaching Material
virtual screening results (tabulated enthalpies)
Quick, “Imputation-free” meta-analysis with proxy-SNPs