Tabix error using UCSC bedgraph example #4

lynxoid · 2013-11-08T20:06:01Z

I am trying to create a custom track for the WashU EpiGenome browser (instructions here: http://washugb.blogspot.com/2012/09/generate-tabix-files-from-bigwig-files.html), so I am using the bedGraph example file posted on the UCSC page: http://genome.ucsc.edu/goldenPath/help/bedgraph.html

The file looks like this:

browser position chr19:49302001-49304701
browser hide all
browser pack refGene encodeRegions
browser full altGraph
#   300 base wide bar graph, autoScale is on by default == graphing
#   limits will dynamically change to always show full range of data
#   in viewing window, priority = 20 positions this as the second graph
#   Note, zero-relative, half-open coordinate system in use for bedGraph format
track type=bedGraph name="BedGraph Format" description="BedGraph format" visibility=full color=200,100,0 altColor=0,100,200 priority=20
chr19 49302000 49302300 -1.0
chr19 49302300 49302600 -0.75
chr19 49302600 49302900 -0.50
chr19 49302900 49303200 -0.25
chr19 49303200 49303500 0.0
chr19 49303500 49303800 0.25
chr19 49303800 49304100 0.50
chr19 49304100 49304400 0.75
chr19 49304400 49304700 1.00

I run bgzip first:

bgzip input.bedgraph

and then I run tabix:

tabix -p bed input.bedgraph.gz

at which point I get these errors:

[get_intv] the following line cannot be parsed and skipped: browser position chr19:49302001-49304701
[ti_index_core] the indexes overlap or are out of bounds

If bedgraph is not the file format tabix expects, what is the input file format?

Thanks!

pd3 · 2013-11-11T08:39:29Z

I was unable to reproduce the problem. What version of tabix and bgzip are you using? This is what I did:

# cat | while read A B C D; do echo -e "$A\t$B\t$C\t$D"; done | bgzip -c > rmme.bed.gz
chr19 49302000 49302300 -1.0
chr19 49302300 49302600 -0.75
chr19 49302600 49302900 -0.50
chr19 49302900 49303200 -0.25
chr19 49303200 49303500 0.0
chr19 49303500 49303800 0.25
chr19 49303800 49304100 0.50
chr19 49304100 49304400 0.75
chr19 49304400 49304700 1.00

$ tabix -p bed rmme.bed.gz
$ tabix rmme.bed.gz chr19:49303800-49304100
chr19   49303500    49303800    0.25
chr19   49303800    49304100    0.50

lynxoid · 2013-11-12T16:14:10Z

I figured it out: my file used spaces as separators, while tabix expects tabs by default. I did not see this mentioned anywhere obvious, so it may be a good idea to document it more prominently, or to parse on whitespace instead.
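
For anyone hitting the same problem, here is a minimal sketch of the conversion in Python. The function name `bedgraph_to_tabbed` is hypothetical and not part of tabix. It rewrites whitespace-separated data lines as tab-separated ones and drops the `browser`, `track`, and `#` header lines from the UCSC example. Dropping the headers is a conservative assumption on my part, based on the `browser` line being reported as unparsable above; the thread does not establish whether tabix needs them removed.

```python
def bedgraph_to_tabbed(lines):
    """Yield tab-delimited data lines from a whitespace-delimited bedGraph.

    Assumption: blank lines, '#' comments, and UCSC 'browser'/'track'
    header lines are dropped rather than passed through.
    """
    for line in lines:
        fields = line.split()  # splits on any run of spaces or tabs
        if not fields:
            continue  # blank line
        if fields[0].startswith("#") or fields[0] in ("browser", "track"):
            continue  # comment or UCSC header line
        yield "\t".join(fields)


# A few lines taken from the example file in this issue.
sample = [
    "browser position chr19:49302001-49304701",
    "#   300 base wide bar graph, autoScale is on by default == graphing",
    'track type=bedGraph name="BedGraph Format" visibility=full',
    "chr19 49302000 49302300 -1.0",
    "chr19 49302300 49302600 -0.75",
]

for row in bedgraph_to_tabbed(sample):
    print(row)
```

The output can be written to a file and then passed through bgzip and tabix -p bed exactly as in the original report.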

lh3 · 2013-11-12T16:27:40Z

I just realized that the UCSC format page does not explicitly require that BED should be TAB delimited. However, the "BED detail format" does require TAB as the only separator. Conventionally, BED files are TAB delimited, too. That said, it would be good to throw a warning/error when the line is space delimited or optionally parse space-delimited files. This is not of high priority, though.
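
The kind of warning suggested here could be prototyped outside tabix before indexing. The sketch below is an illustration only, not how tabix is implemented: `find_space_delimited` and its `min_fields` parameter are hypothetical names. It flags data lines that have too few tab-separated columns but enough whitespace-separated ones, which is the signature of a space-delimited BED line.

```python
def find_space_delimited(lines, min_fields=3):
    """Return 1-based numbers of lines that look space-delimited.

    A line is flagged when splitting on tabs gives fewer than
    `min_fields` columns but splitting on any whitespace gives enough.
    Blank lines, '#' comments, and 'browser'/'track' lines are ignored.
    """
    suspects = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        words = text.split()
        if not words or words[0].startswith("#"):
            continue
        if words[0] in ("browser", "track"):
            continue
        if len(text.split("\t")) < min_fields and len(words) >= min_fields:
            suspects.append(number)
    return suspects


rows = [
    "chr19 49302000 49302300 -1.0\n",      # spaces: flagged
    "chr19\t49302300\t49302600\t-0.75\n",  # tabs: fine
]
for number in find_space_delimited(rows):
    print(f"warning: line {number} appears to be space-delimited")
```

The default of three columns reflects the chromosome, start, and end fields that BED requires; a stricter check for bedGraph could pass `min_fields=4`.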

GarrettJenkinson mentioned this issue May 28, 2018

BED files produces are not tabulated GarrettJenkinson/informME#11

Closed
