Sleuth-Error in check_target_mapping #257

nmorf · 2021-05-31T16:15:30Z

Hello,

I'm trying to use biomaRt to retrieve the gene names for Apis mellifera from Ensembl, so that I can analyze the data generated by kallisto using sleuth.

I encounter the error posted in 2017 (link below). I haven't been able to fix it myself. I was wondering if someone could direct me to a possible solution without editing the fasta files?

#111

Here is the error message that I get.

mart <- useMart('metazoa_mart', host = 'metazoa.ensembl.org')
mart <- useDataset('amellifera_eg_gene', mart)

t2g <- biomaRt::getBM(attributes = c("ensembl_transcript_id", "ensembl_gene_id",
                                     "external_gene_name"), mart = mart)

t2g <- dplyr::rename(t2g, target_id = ensembl_transcript_id,
                     ens_gene = ensembl_gene_id, ext_gene = external_gene_name)

so <- sleuth_prep(s2c, ~ condition, target_mapping = t2g)
reading in kallisto results
dropping unused factor levels
........................
Error in check_target_mapping(tmp_names, target_mapping, !is.null(aggregation_column)) :
couldn't solve nonzero intersection
In addition: There were 25 warnings (use warnings() to see them)

Thank you,
nm


gcamprecios · 2022-08-04T09:40:16Z

Good morning,

This is my first time using sleuth after pseudoalignment with kallisto, so I am quite new to this.
Everything runs well, except when I try to collapse transcripts to genes with target_mapping. I get exactly the same error as nmorf above, and I was wondering whether it has been solved somewhere else. I can't find an answer, and I've tried generating all kinds of files to use with this function. Here is the code I am using, which is basically what I see in the walkthroughs and from everybody else!
To generate the t2g file:

mart <- biomaRt::useMart(biomart = "ENSEMBL_MART_ENSEMBL",
                         dataset = "hsapiens_gene_ensembl",
                         host = 'ensembl.org')
t2g <- biomaRt::getBM(attributes = c("ensembl_transcript_id", "ensembl_transcript_id_version",
                                     "ensembl_gene_id", "ensembl_gene_id_version",
                                     "external_gene_name", "description",
                                     "chromosome_name", "start_position",
                                     "end_position", "strand",
                                     "entrezgene_id"), mart = mart)
t2g <- dplyr::rename(t2g, target_id = ensembl_transcript_id,
                     ens_gene = ensembl_gene_id, ext_gene = external_gene_name)

t2g <- dplyr::select(t2g, c('target_id', 'ens_gene', 'ext_gene'))

To run the sleuth_prep function:

so122 <- sleuth_prep(metadata122,
                     target_mapping = t2g,
                     aggregation_column = 'ens_gene',
                     read_bootstrap_tpm = TRUE,
                     extra_bootstrap_summary = TRUE,
                     transformation_function = function(x) log2(x + 0.5),
                     num_cores = 2)

The error I get all the time (no matter how I construct the t2g data.frame):

Warning: It appears that you are running Sleuth from within Rstudio.
Because of concerns with forking processes from a GUI, 'num_cores' is being set to 1.
If you wish to take advantage of multiple cores, please consider running sleuth from the command line.
reading in kallisto results
dropping unused factor levels
Error in check_target_mapping(tmp_names, target_mapping, !is.null(aggregation_column)) :
couldn't solve nonzero intersection

And here are the first rows of our abundance .tsv file from kallisto (I use the .h5 for sleuth_prep):

target_id								length	eff_length
ENST00000456328.2	ENSG00000223972.5	OTTHUMG00000000961.2	OTTHUMT00000362751.1	DDX11L1-202	DDX11L1	1657	processed_transcript	1657	1453.07
ENST00000450305.2	ENSG00000223972.5	OTTHUMG00000000961.2	OTTHUMT00000002844.2	DDX11L1-201	DDX11L1	632	transcribed_unprocessed_pseudogene	632	428.3
ENST00000488147.1	ENSG00000227232.5	OTTHUMG00000000958.1	OTTHUMT00000002839.1	WASH7P-201	WASH7P	1351	unprocessed_pseudogene	1351	1147.07
ENST00000619216.1	ENSG00000278267.1	-	-	MIR6859-1-201	MIR6859-1	68	miRNA	68	34.625
ENST00000473358.1	ENSG00000243485.5	OTTHUMG00000000959.2	OTTHUMT00000002840.1	MIR1302-2HG-202	MIR1302-2HG	712	lncRNA	712	508.07

sigusn · 2022-08-04T14:19:04Z

Hi, I think there could be some issue with the abundance file. I usually only have one column with "target_id" but you have more columns without headings.
Example of my abundance.tsv
target_id length eff_length est_counts tpm
ENST00000631435.1 12 6.64286 0 0
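
One way to check for this kind of mismatch is to compare the IDs kallisto wrote against the target_id column of the mapping. The following is a minimal sketch in base R, not sleuth's own check. The vectors kallisto_ids and t2g_ids and the helper strip_version are made-up stand-ins for illustration; in practice the IDs would come from the first column of abundance.tsv and from t2g$target_id.

```r
# Hypothetical stand-ins: in practice, kallisto_ids would be the first column
# of abundance.tsv and t2g_ids would be t2g$target_id.
kallisto_ids <- c("ENST00000456328.2", "ENST00000450305.2", "ENST00000488147.1")
t2g_ids      <- c("ENST00000456328", "ENST00000450305", "ENST00000488147")

# Count IDs that match exactly. Zero overlap is consistent with the
# "couldn't solve nonzero intersection" error.
n_exact <- length(intersect(kallisto_ids, t2g_ids))

# Count again after dropping a trailing version suffix such as ".2".
strip_version <- function(ids) sub("\\.[0-9]+$", "", ids)
n_unversioned <- length(intersect(strip_version(kallisto_ids), t2g_ids))

cat("exact matches:", n_exact, "\n")
cat("matches ignoring version:", n_unversioned, "\n")
```

If both counts are zero on real data, the two ID sets probably differ by more than a version suffix, for example by extra fields attached to each name.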

gcamprecios · 2022-08-04T14:32:10Z

Hi @sigusn, thanks very much for the response. Indeed, I found another page where this whole issue was discussed and solved back in 2019. My problem is that I generated my kallisto results with GENCODE v40, and all my abundance files had looong "target_id" names, which made them impossible to match. I used their code to change the names in all the abundance files at once, leaving only the ENS name.

Here is the page with the discussion and solution.
Thanks!

https://groups.google.com/g/kallisto-and-applications/c/KQ8782UD35E/m/hbqqMOgGBwAJ
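
The trimming step described above could look roughly like the sketch below. This is not the code from the linked discussion. It assumes the long names are "|"-separated GENCODE-style headers with the versioned transcript ID as the first field, and the vectors long_ids and short_ids are made up for illustration.

```r
# Assumed example: GENCODE-style names are "|"-separated, with the versioned
# transcript ID as the first field. These values are illustrative only.
long_ids <- c(
  "ENST00000456328.2|ENSG00000223972.5|OTTHUMG00000000961.2|OTTHUMT00000362751.1|DDX11L1-202|DDX11L1|1657|processed_transcript|",
  "ENST00000619216.1|ENSG00000278267.1|-|-|MIR6859-1-201|MIR6859-1|68|miRNA|"
)

# Keep only the text before the first "|".
short_ids <- sub("\\|.*$", "", long_ids)
print(short_ids)
```

The shortened IDs still carry a version suffix, so they have the same form as the ensembl_transcript_id_version attribute fetched earlier in the thread, not the unversioned ensembl_transcript_id.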
