-
Notifications
You must be signed in to change notification settings - Fork 121
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
optimize mvps_push_or_pull #662
Conversation
apperently the CI uses a different lua version than minetest 5.7 since it lacks table.move. not sure what to make of that |
|
LuaJIT adds table.move. You can check at load if it exists, and implement |
mesecons_mvps/init.lua
Outdated
@@ -71,40 +71,49 @@ function mesecon.mvps_get_stack(pos, dir, maximum, all_pull_sticky) | |||
local pos_set = {} | |||
local frontiers = mesecon.fifo_queue.new() | |||
frontiers:add(vector.new(pos)) | |||
-- micro-optimization: lift local definitions out of loop |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be new to me that lifting local variable declarations out of their scope is cheaper. May I ask where you read about this?
(Also, imho it hinders code quality.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is based on personal testing, and is likely not worth it.
unless LuaJIT automatically does this, a local variable would have to have space for it allocated, which necessarily takes time.
|
||
on_mvps_move(moved_nodes) | ||
|
||
on_mvps_move(moved_nodes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was changed here, whitespace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah this must've happened when i was deleting and inserting lines. are random formatting changes considered a major problem?
if vector.equals(link, np) then | ||
frontiers:add(adjpos) | ||
-- optimization: don't check blocks that are already part of the stack | ||
if not pos_set[minetest.hash_node_position(adjpos)] then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What’s about r==dir?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't fully understand, do you mean i should add an additional optimization to skip this code when r==dir, since that block will already have been added to the queue?
i guess that's a missed optimization then, yeah. do you want me to add it?
end | ||
local nodedef = minetest.registered_nodes[nn.name] | ||
if nodedef and nodedef.mvps_sticky then | ||
local connected = nodedef.mvps_sticky(np, nn) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
all this code review is nice, but what would really be helpful is for someone benchmark this to see if the speedup is worth me fixing everything up. or if someone could point me in the direction of some lua/minetest benchmarking tools i could try to do it myself. my limited testing was based off of realtime and framerate, both of which can be influenced by other processes. |
Minetest has an internal profiler:
According to |
personally i used stacks of slimeblocks on top of sticky pistons connected to a 1-tick clock. |
this function is a big bottleneck, since it contains several layers of nested loops, and has quite a few allocations in the innermost levels.
i did my best to optimize it, and i'm only confident in about half my changes, although some imprecice testing implied as much as a 10x speedup.
if someone could benchmark this properly and see what changes are worth it, that would be greatly appreciated.