Skip to content

PEKA jobs mismatching peak and crosslink inputs #33

@lconde-ucl

Description

@lconde-ucl

ISSUE

Hi,

There seems to be a bug in how PEKA is executed in the pipeline: the jobs are mixing up input files, using peak files from one sample and crosslink files from another. For example:

[...]
TMPDIR=$(pwd)/tmp peka  \
   -i SAMPLE-1_R1_genome.clippy._rollmean10_minHeightAdjust1.0_minPromAdjust1.0_minGeneCount5_Peaks.bed  \
   -x SAMPLE-2_R1.genome.xl.bed \
   -g GRCh38.primary_assembly.genome.fa  \
   -gi GRCh38.primary_assembly.genome.fa.fai \
   -r gencode_filtered_regions_genicOtherfalse.resolved.gtf         
[...]

SUGGESTED SOLUTION:

I think is an issue with the way the CLIPPY_GENOME.out.peaks and ch_genome_crosslink_bed channels are passed separately into the PEKA module, without being joined on their metadata. So nextflow seems to combine whichever files are available first, instead of proper sample-matched pairs. I believe that Joining the channels beforehand should fix the issue. Suggested change:

Currently:

/*
* MODULE: Run peka on genome-level crosslinks
*/
PEKA (
   CLIPPY_GENOME.out.peaks,
   ch_genome_crosslink_bed,
   ch_fasta.collect{ it[1] },
   ch_fasta_fai.collect{ it[1] },
   ch_regions_resolved_gtf.collect{ it[1] }
)
ch_versions = ch_versions.mix(PEKA.out.versions)

Suggested fix:

/*
* MODULE: Run peka on genome-level crosslinks
*/
ch_peka_input = CLIPPY_GENOME.out.peaks
   .join(ch_genome_crosslink_bed, by: 0)
PEKA (
   ch_peka_input.map { meta, peaks, crosslinks -> [meta, peaks] },
   ch_peka_input.map { meta, peaks, crosslinks -> [meta, crosslinks] },
   ch_fasta.collect{ it[1] },
   ch_fasta_fai.collect{ it[1] },
   ch_regions_resolved_gtf.collect{ it[1] }
)
ch_versions = ch_versions.mix(PEKA.out.versions)

Best
Lucia

PS: I’ve submitted a pull request to address this issue: #42

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions