Project Meeting 2020.11.03

Technical call

Discuss telecommute model design with Joel
- This is the initial design, all components are up for discussion
- What exactly is telecommuting? It is the replacement of travel
- Day of the week matters a lot in telecommute modeling
- The model is based on the SANDAG model spec
- What about worker occupation? It is not in many synpops, but it is important for telecommute prediction
- What are the policy knobs for what-if analysis? Both pre and post COVID
- Income is also an important variable
- Telecommute frequency model affects CDAP, INMTF, and NMTSF submodels
- How is work at home usually modeled?
- Add a work from home or work out of the home model as well
- If work from home, then there's no commute to replace with telecommuting
- ActivitySim doesn't have a work from home model but needs one
- DaySim has it
- MAG telecommute model is like SANDAG's model
- MAG did COVID scenario analysis with it; Joel has a paper they wrote
- They varied work from home rates by worker occupation
- The SEMCOG data for estimation should work fine
- The MWCOG model, where RSG is also building an ActivitySim model, has few employment types
- Recommend:
  - Worker telecommute frequency model
  - Work from home extension to work location choice
  - Use SEMCOG data to estimate
- Work from home model includes an employment accessibility term
- SANDAG work from home model doesn't have occupation/industry since it was estimated from an old survey
- Person occupation/industry can be supported by ActivitySim since it is just additional data on the person table
- The location choice size term selector is user definable but can be only one person variable, which is currently income
- Person occupation is important for COVID analysis
- We won't implement downstream effects for all models, just some to illustrate that it works; we don't want to bite off too much at once
- SEMCOG wants to get going so comments are due next week
- Can we include transit service in the model estimation? The Bay Area data shows this matters for telecommuting
- Maybe we could pool the Bay Area data and the SEMCOG data but that is probably too big for the scope
- SEMCOG will contribute the example, which could later be updated for a Bay Area example
- We'll create a wiki page with Joel's initial design presentation and Wu's background info
- The presentation is sufficient for the first deliverable
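
The notes above mention varying work from home rates by worker occupation and carrying occupation as an extra column on the person table. A minimal sketch of that kind of policy knob follows. It is not ActivitySim code: the `occupation` column name, the occupation labels, the `assign_work_from_home` helper, and the rates are all hypothetical placeholders chosen for illustration.

```python
import numpy as np
import pandas as pd


def assign_work_from_home(persons, wfh_rate_by_occupation, seed=0):
    """Flag workers as working from home using scenario rates by occupation.

    persons: DataFrame with a (hypothetical) 'occupation' column.
    wfh_rate_by_occupation: dict of occupation -> probability in [0, 1].
    """
    rng = np.random.default_rng(seed)
    rates = persons["occupation"].map(wfh_rate_by_occupation)
    if rates.isna().any():
        missing = sorted(persons.loc[rates.isna(), "occupation"].unique())
        raise ValueError(f"no work from home rate for occupations: {missing}")
    # One uniform draw per person; work from home if the draw is below the rate
    draws = rng.random(len(persons))
    out = persons.copy()
    out["work_from_home"] = draws < rates.to_numpy()
    return out


# Placeholder person table and scenario rates (not estimated values)
persons = pd.DataFrame(
    {
        "person_id": [1, 2, 3, 4],
        "occupation": ["office", "office", "retail", "construction"],
    }
).set_index("person_id")
scenario_rates = {"office": 1.0, "retail": 0.25, "construction": 0.0}

result = assign_work_from_home(persons, scenario_rates, seed=42)
print(result)
```

Changing only the `scenario_rates` dictionary is enough to run a different what-if scenario, for example pre-COVID versus post-COVID rates.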
Discuss progress on TVPB / skim performance and caching strategies with Doyle
- Jeff Newman's feather backend for skims testing is very promising
- So Doyle is testing a feather memmap caching backend in ActivitySim that would replace (or sit alongside) the in-memory skims
- This would free up a lot of RAM to use for other purposes and hopefully get overall faster runtimes
- In addition, the skim architecture needed some re-writing / updating based on all the updates over the years so that's getting cleaned up too
- There are performance differences between numpy memmap and feather memmap
- Instead of 6 GB of RAM for the TM1 skims, there's a memmap file on disk and 200 MB RAM usage
- Single-threaded performance is similar so far
- This could replace the need for scaled int skims
- Could still store scaled int skims in the cache too in order to reduce its size
- With memmapped designs, you need to organize the data in contiguous blocks that match how the data will be accessed in order to get good performance
- So making the TimeOfDay dimension the first index in the existing ODT (Origin, Destination, TimeOfDay) queries makes the queries much faster
- This means rearranging the skims after reading them from the OMX files but before putting them in the cache
- So may be able to avoid the existing multiprocessing shared memory setup that currently handles skims and needs to be extended for multiple zones and TVPB data
- SANDAG has 4D skims, with VOT bin as well, is that a problem? It can either be collapsed to a 3rd dimension or support can be extended with a little work
- How much disk space is the cache? About the same as RAM would be
- Want to use a fast SSD drive
- Disk/RAM usage in modern OSs is changing because of how paging works so the traditional disk/RAM distinctions are blurring
- Have you tested threading yet? Not yet, but Doyle believes it should work because the OS is doing the paging
- All this work is in support of improved skim/TVPB data management for eventual TVPB performance tuning
- If we can use the cache really fast then let's do that, it makes the whole system easier to maintain, if not, then back to multiprocessing shared memory objects
- Much RAM usage is for the household processing chunks/threads so maybe we should focus more on that?
- We could work on slimming them down in terms of data types
- TOD is stored as 8 bytes for example
- CT-RAMP has this issue as well
- Freeing up more RAM to give to the chunks/threads would provide more throughput as well
- Expect to focus on TVPB data in addition to just skims later this week
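
The memmap layout point above (rearrange the skims so TimeOfDay is the first index, then write them to an on-disk cache) can be sketched as follows. This uses plain `numpy.memmap` rather than the feather backend under test, and the array sizes, file name, and `lookup` helper are assumptions made for illustration, not the ActivitySim implementation.

```python
import os
import tempfile

import numpy as np

# Hypothetical sizes; real skims have thousands of zones
n_zones, n_periods = 50, 5

# Stand-in for skims read from OMX, indexed (Origin, Destination, TimeOfDay)
rng = np.random.default_rng(0)
skims_odt = rng.random((n_zones, n_zones, n_periods), dtype=np.float32)

# Move TimeOfDay to the first axis so each period's OD matrix is one
# contiguous block on disk
skims_tod = np.ascontiguousarray(np.moveaxis(skims_odt, -1, 0))

# Write the rearranged array to a memmap cache file
cache_path = os.path.join(tempfile.mkdtemp(), "skim_cache.mmap")
cache = np.memmap(cache_path, dtype=np.float32, mode="w+", shape=skims_tod.shape)
cache[:] = skims_tod
cache.flush()
del cache

# Reopen read-only; the OS pages data in on demand instead of holding it all in RAM
skims = np.memmap(
    cache_path, dtype=np.float32, mode="r", shape=(n_periods, n_zones, n_zones)
)


def lookup(orig, dest, period):
    """Vectorized ODT lookup against the TimeOfDay-first cache."""
    return skims[period, orig, dest]


orig = np.array([0, 3, 7])
dest = np.array([1, 4, 9])
period = np.array([2, 2, 0])
values = lookup(orig, dest, period)
print(values)
```

Callers still query by origin, destination, and period; only the storage order changes, so the reordering cost is paid once when the cache is built.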
Discuss progress on running the full scale TM2 Marin work tour mode choice example with me
- I've got the full scale Marin TM2 work tour mode choice example running on my development machine
- It's just running single threaded for now since Jeff is working on the multiprocessing/performance stuff
- It includes the tap lines trimming functionality
- I've summarized mode shares and boarding tap counts
- Work tour transit mode share is 19% in Marin TM2 and 18% in asim right now
- But no drive transit, which is 6% in Marin TM2
- Also the tap count distribution looks pretty good, but there are some outliers that need review, such as tap 5117
- There's lots of good logging/tracing to review for debugging
- Plan to trace a couple HHs to find the issues - no drive transit and tap 5117 being popular in TM2 but not in asim
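
One way to pick which taps to trace is to rank them by the difference in boarding share between the two models. The sketch below assumes boarding counts are available as one pandas Series per model, indexed by tap id. The `compare_tap_shares` helper is hypothetical, and the counts are made-up placeholders rather than results from either model; only the tap id 5117 comes from the notes above.

```python
import pandas as pd


def compare_tap_shares(tm2_boardings, asim_boardings):
    """Rank taps by absolute difference in boarding share between two runs."""
    counts = pd.concat(
        {"tm2": tm2_boardings, "asim": asim_boardings}, axis=1
    ).fillna(0)
    # Convert counts to shares so runs with different totals are comparable
    shares = counts / counts.sum()
    shares["diff"] = shares["asim"] - shares["tm2"]
    order = shares["diff"].abs().sort_values(ascending=False).index
    return shares.reindex(order)


# Placeholder counts for illustration only
tm2 = pd.Series({5117: 40, 12: 30, 77: 30})
asim = pd.Series({5117: 2, 12: 50, 77: 48})

comparison = compare_tap_shares(tm2, asim)
print(comparison)
```

Taps at the top of the ranking are candidates for household tracing.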
Clint and Jeff to discuss ARC runtimes and improvements
