faq 29229112

Billy Charlton edited this page Sep 5, 2018 · 2 revisions

HDF5 network files?

by GregoryM on 2015-06-05 12:51:00


The xml-based format is nice, but it has some drawbacks too. First, xml spends a lot of non-data characters on attribute names and structural tags, so a given file can be much larger than it needs to be. Second, reading and writing xml can be slow and confusing.

I noticed that [pandana](http://synthicity.github.io/pandana/tutorial.html#create-the-network) stores edge and node tables in an HDF5 object. This format seems to be open, fast, and extensible. What would it take to allow MATSim to handle that kind of file as an alternative to xml?
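
To make the "edge and node tables" idea concrete, here is a minimal Python sketch that flattens a MATSim-style network xml into two plain tables (lists of dicts). The sample xml is a simplified assumption of the network layout, and `parse_network` is a hypothetical helper, not part of MATSim or pandana. Writing the resulting tables to HDF5 would need an additional library and is not shown here.

```python
import xml.etree.ElementTree as ET

# Simplified, assumed sample of a MATSim-style network file.
SAMPLE_NETWORK = """<network>
  <nodes>
    <node id="1" x="0.0" y="0.0"/>
    <node id="2" x="100.0" y="0.0"/>
  </nodes>
  <links>
    <link id="a" from="1" to="2" length="100.0" freespeed="13.9"/>
  </links>
</network>"""


def parse_network(xml_text):
    """Hypothetical helper: return (nodes, edges) as lists of plain dicts."""
    root = ET.fromstring(xml_text)
    # One row per <node>, with coordinates converted to floats.
    nodes = [
        {"id": n.get("id"), "x": float(n.get("x")), "y": float(n.get("y"))}
        for n in root.iter("node")
    ]
    # One row per <link>, keyed by the ids of its end nodes.
    edges = [
        {
            "id": e.get("id"),
            "from": e.get("from"),
            "to": e.get("to"),
            "length": float(e.get("length")),
        }
        for e in root.iter("link")
    ]
    return nodes, edges


nodes, edges = parse_network(SAMPLE_NETWORK)
print(len(nodes), "nodes,", len(edges), "edges")
```

Once the data is in this tabular shape, the choice of on-disk container (HDF5, csv, or something else) is mostly a question of which reader and writer someone is willing to maintain.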


Comments: 1


Re: HDF5 network files?

by Kai Nagel on 2015-06-05 19:54:15

Some remarks (maybe not an "answer"):

  • Some discussion about "the future of matsim" is under JIRA (matsim.atlassian.net), issue MATSIM-189.  We should probably eventually decide whether the above question is a q&a or rather an "issue".
  • We certainly want our "main" file format to be directly editable.  Some of us have worked with file formats that were not directly editable and that was always a fairly large hindrance.  (I can edit xml.gz directly in emacs.)
  • The space consumption problem is actually not very large; the gz compression removes the repeated material.
  • Having said that, nothing speaks against having additional input file formats.  The two main problems are: (a) someone needs to write the writers and readers; (b) someone needs to maintain them when the data structures change (not very often, but still).  I would say that if the material is well implemented, including regression tests, in all likelihood (b) would not be a large problem.
  • Finally, if some standard emerges in the community, we would consider switching matsim to that standard.  Clearly, I can only promise the "consideration"; we would need to make the decision in a group.
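
The remark above about gz compression removing the repeated material can be checked with a small Python sketch using only the standard library. The synthetic link elements below are an assumed, simplified layout chosen for illustration, not an actual MATSim network, so the exact ratio on real files will differ.

```python
import gzip

# Build a synthetic, highly repetitive xml document (assumed link layout).
links = "".join(
    f'<link id="{i}" from="{i}" to="{i + 1}" length="100.0" freespeed="13.9"/>\n'
    for i in range(10000)
)
xml_bytes = f"<network><links>\n{links}</links></network>".encode("utf-8")

# Compress in memory and compare sizes.
compressed = gzip.compress(xml_bytes)
ratio = len(compressed) / len(xml_bytes)
print(f"raw: {len(xml_bytes)} bytes, gzipped: {len(compressed)} bytes, ratio: {ratio:.2f}")
```

The repeated attribute names and tags are exactly the kind of redundancy that the compressor encodes cheaply, which is why the size argument against xml is weaker for xml.gz than for plain xml.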
