Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use geom_box_tree for fragment stats calculation #996

Closed
stevengj opened this issue Aug 20, 2019 · 5 comments
Closed

use geom_box_tree for fragment stats calculation #996

stevengj opened this issue Aug 20, 2019 · 5 comments

Comments

@stevengj
Copy link
Collaborator

To speed up the computation of box_overlap_with_object, you can:

  1. compute the geom_box_tree object with create_geom_box_tree0 for the whole computational cell during init_libctl or similar.

  2. When you are computing the cost of a box, walk down the tree (using the bounding boxes stored in each tree node) to quickly find a list of objects that may potentially intersect your box, and only for these objects call the slow box_overlap_with_object routine

@oskooi
Copy link
Collaborator

oskooi commented Aug 21, 2019

For reference, the following plot shows the time for choose_chunkdivision (for split_by_cost) as a fraction of the total runtime for an OLED simulation (script shown in #988:comment) vs. the number of chunks/cores. Results are shown for two sets of runs with resolution of 50 and 60.

oled_choose_chunkdivision_fraction

The time for choose_chunkdivision increases monotonically with the chunk count and is a large fraction of the total time (i.e., >0.2) already for the smallest chunk counts. Improvements in the split_by_cost method should aim to considerably reduce these times.

Additionally, the following are the same results showing the time in seconds which demonstrate that choose_chunkdivision takes a few minutes.

oled_choose_chunkdivision

@stevengj
Copy link
Collaborator Author

stevengj commented Sep 18, 2019

Summary of method for using geom_box_tree. See also the implementation of tree_search (https://github.com/NanoComp/libctl/blob/master/utils/geom.c#L1831-L1852).

Suppose you want to compute the intersection of a box B with all of the objects, and you have a tree T. Then:

  1. Walk through the tree recursively. Visit all of the nodes. Exit whenever you get to a node whose bounding box b does not intersect B.

  2. For node where nobjects > 0 (in practice this is only the leaves), compute the intersection L of B with the node's bounding box b.

  3. Loop through that node's objects[i] for i=0; i < nobjects; ++i. Find objects for which objects[i].box intersects L.

  4. Call box_overlap_with_object to compute the intersection of these objects[i].o with L - objects[i].shiftby.

The sum of the intersections in step 4, from all of the leaves and objects, is then what you want (replacing your current loop over objects).

@stevengj
Copy link
Collaborator Author

Looks like dimensions is used in only two places in libctl/utils/geom.c. In both cases, it seems like we can allow dimensions == 3 and just skip dimensions with zero size. For example, in the LOOP_PERIODIC macro, just do:

      int iii, jjj, kkk; \
	      for (iii = -1; iii <= 1; ++iii) { \
		   shiftby.x = iii * geometry_lattice.size.x; \
		   for (jjj = -1; jjj <= 1; ++jjj) { \
			shiftby.y = jjj * geometry_lattice.size.y; \
			for (kkk = -1; kkk <= 1; ++kkk) { \
			     shiftby.z = kkk * geometry_lattice.size.z; \
			     body; \
                             if (geometry_lattice.size.z == 0) break; \
			} \
                        if (geometry_lattice.size.y == 0) break; \
		   } \
                    if (geometry_lattice.size.x == 0) break; \
	      } \

@stevengj
Copy link
Collaborator Author

and in divide_geom_box_tree, for the for (i = 0; i < dimensions; ++i) loop, do:

     for (i = 0; i < dimensions; ++i) {
           if (VEC_I(t->b.high, i) == VEC_I(t->b.low, i)) continue; /* skip empty dimensions */
	  find_best_partition(t->nobjects, t->objects, i, &division_point[i],
			      &division_nobjects[i][0],
			      &division_nobjects[i][1]);
	  if (MAX(division_nobjects[i][0], division_nobjects[i][1]) <
	      MAX(division_nobjects[best][0], division_nobjects[best][1]))
	       best = i;
     }

@ChristopherHogan
Copy link
Contributor

This was implemented in #1030, but it didn't seem to improve performance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants