This repository has been archived by the owner on Dec 21, 2023. It is now read-only.

better progress report #3117

Open
wants to merge 11 commits into
base: main
from


guihao-liang
Collaborator

@guihao-liang guihao-liang commented Apr 13, 2020

update 05/08/2020

Implemented a thread-safe stopwatch to measure the elapsed time spent on concurrent IO.
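For readers who have not seen the diff, here is a minimal sketch of what a stopwatch shared by several IO threads could look like. It is an illustration only, not the class added in this PR: the name `SharedStopwatch` and its methods are hypothetical, and it assumes the goal is wall-clock time from the first thread starting until the last thread finishing.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <mutex>

// Hypothetical sketch; not the stopwatch implemented in this PR.
class SharedStopwatch {
 public:
  using Clock = std::chrono::steady_clock;

  // Called by each thread when it begins an IO operation.
  // Only the first concurrent caller records (or resets) the start time.
  void start() {
    std::lock_guard<std::mutex> lock(mu_);
    if (active_ == 0) begin_ = Clock::now();
    ++active_;
  }

  // Called by each thread when its IO operation ends.
  // The last thread to leave freezes the elapsed time.
  void stop() {
    std::lock_guard<std::mutex> lock(mu_);
    assert(active_ > 0 && "stop() called without a matching start()");
    if (--active_ == 0) end_ = Clock::now();
  }

  // Wall-clock time since the first start(). It keeps growing while any
  // thread is still active and stays fixed once all threads have stopped.
  std::chrono::milliseconds elapsed() const {
    std::lock_guard<std::mutex> lock(mu_);
    const Clock::time_point now = (active_ > 0) ? Clock::now() : end_;
    return std::chrono::duration_cast<std::chrono::milliseconds>(now - begin_);
  }

  // Number of threads currently between start() and stop().
  std::size_t active() const {
    std::lock_guard<std::mutex> lock(mu_);
    return active_;
  }

 private:
  mutable std::mutex mu_;
  std::size_t active_ = 0;
  Clock::time_point begin_{};
  Clock::time_point end_{};
};
```

A monotonic clock (`std::chrono::steady_clock`) is used so that system clock adjustments cannot produce negative or jumping elapsed times.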


Report progress to users during slow IO operations so they know the program is still working. This reduces the chance that users think the program has hung.

Tries to solve #3119. Prompts should be based on elapsed time. Beyond that, we should keep metrics inside SFrame that track how many bytes have been downloaded, and report that progress to the user.
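As a rough illustration of the byte-count metric mentioned above, a counter shared by the download threads could look like the sketch below. The class `TransferProgress` and its methods are hypothetical names chosen for this example; nothing here is taken from the SFrame code base.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical sketch of a byte counter shared by concurrent IO threads.
class TransferProgress {
 public:
  explicit TransferProgress(std::uint64_t total_bytes) : total_(total_bytes) {}

  // Called by any worker thread after it has transferred n more bytes.
  void add(std::uint64_t n) { done_.fetch_add(n, std::memory_order_relaxed); }

  std::uint64_t done() const { return done_.load(std::memory_order_relaxed); }
  std::uint64_t total() const { return total_; }

  // Completion percentage, clamped to [0, 100].
  // An empty transfer (total of 0 bytes) is treated as complete.
  double percent() const {
    if (total_ == 0) return 100.0;
    const std::uint64_t d = done();
    if (d >= total_) return 100.0;
    return 100.0 * static_cast<double>(d) / static_cast<double>(total_);
  }

 private:
  const std::uint64_t total_;
  std::atomic<std::uint64_t> done_{0};
};
```

Relaxed atomics are enough here under the assumption that the counter is only read for display and never used to synchronize other data.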


update

upload

05/08/2020

Finished fetching block 0. Elapsed 0s for downloading s3://tc_qa/integration/manual/upload/medium_sframe_ac/dir_archive.ini
Uploading s3://tc_qa/integration/manual/upload/medium_sframe_ac/dir_archive.ini. Elapsed time 0 seconds
Finished uploading s3://tc_qa/integration/manual/upload/medium_sframe_ac/dir_archive.ini. Elapsed time 0 seconds
Uploading s3://tc_qa/integration/manual/upload/medium_sframe_ac/m_9cfcba377b62f99.0000. Elapsed time 0 seconds
Finished uploading s3://tc_qa/integration/manual/upload/medium_sframe_ac/m_9cfcba377b62f99.0000. Elapsed time 69 seconds
Uploading s3://tc_qa/integration/manual/upload/medium_sframe_ac/m_9cfcba377b62f99.sidx. Elapsed time 0 seconds
Finished uploading s3://tc_qa/integration/manual/upload/medium_sframe_ac/m_9cfcba377b62f99.sidx. Elapsed time 0 seconds
Uploading s3://tc_qa/integration/manual/upload/medium_sframe_ac/m_9cfcba377b62f99.frame_idx. Elapsed time 0 seconds
Finished uploading s3://tc_qa/integration/manual/upload/medium_sframe_ac/m_9cfcba377b62f99.frame_idx. Elapsed time 0 seconds
Uploading s3://tc_qa/integration/manual/upload/medium_sframe_ac/dir_archive.ini. Elapsed time 0 seconds
Finished uploading s3://tc_qa/integration/manual/upload/medium_sframe_ac/dir_archive.ini. Elapsed time 0 seconds

download

Prompts are based on a time interval.

Finished fetching block 0. Elapsed 0s for downloading s3://tc_qa/integration/manual/sframes/big_sframe_od/dir_archive.ini
Finished fetching block 0. Elapsed 0s for downloading s3://tc_qa/integration/manual/sframes/big_sframe_od/m_ea214100aab1ac60.sidx
Finished fetching block 0. Elapsed 0s for downloading s3://tc_qa/integration/manual/sframes/big_sframe_od/m_ea214100aab1ac60.frame_idx
Finished fetching block 1. Elapsed 10s for downloading s3://tc_qa/integration/manual/sframes/big_sframe_od/m_ea214100aab1ac60.0000
Finished fetching block 0. Elapsed 74s for downloading s3://tc_qa/integration/manual/sframes/big_sframe_od/m_ea214100aab1ac60.0000
+------------------------+--------------+----------+
|         image          |     name     |  label   |
+------------------------+--------------+----------+
| Height: 480 Width: 640 |   bike_248   |   bike   |
| Height: 480 Width: 640 |   bike_254   |   bike   |
| Height: 480 Width: 640 |   bike_267   |   bike   |
| Height: 480 Width: 640 |   bike_107   |   bike   |
| Height: 480 Width: 640 | carsgraz_225 | carsgraz |
| Height: 480 Width: 640 |   bike_187   |   bike   |
| Height: 480 Width: 640 |   bike_077   |   bike   |
| Height: 480 Width: 640 | carsgraz_053 | carsgraz |
| Height: 480 Width: 640 |   bike_362   |   bike   |
| Height: 480 Width: 640 | carsgraz_351 | carsgraz |
+------------------------+--------------+----------+

@guihao-liang guihao-liang added this to the 6.2 milestone Apr 13, 2020
@guihao-liang guihao-liang self-assigned this Apr 13, 2020
@TobyRoseman
Collaborator

@guihao-liang - As we discussed in person, it would be better if progress were reported at a fixed time interval rather than every time a 64MB block is uploaded/downloaded. Here is an example of where we do something like that in the Python layer.
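One way to report on a time interval without an extra thread is to keep the per-block call site and gate it on elapsed time. The sketch below is a hypothetical C++ illustration of that idea, not the Python code referenced above and not the code in this PR; `IntervalGate` and `ready` are made-up names, and the first call is assumed to report immediately.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>

// Hypothetical sketch: lets a report through at most once per interval.
class IntervalGate {
 public:
  using Clock = std::chrono::steady_clock;

  explicit IntervalGate(Clock::duration interval) : interval_(interval) {
    assert(interval_ > Clock::duration::zero() && "interval must be positive");
  }

  // Returns true if at least one interval has passed since the last time
  // this returned true (or if it has never returned true). Safe to call
  // from several IO threads; `now` is a parameter so it can be tested.
  bool ready(Clock::time_point now = Clock::now()) {
    std::lock_guard<std::mutex> lock(mu_);
    if (has_last_ && now - last_ < interval_) return false;
    last_ = now;
    has_last_ = true;
    return true;
  }

 private:
  const Clock::duration interval_;
  std::mutex mu_;
  Clock::time_point last_{};
  bool has_last_ = false;
};
```

A caller would check `gate.ready()` after each finished block and print a progress line only when it returns true. The limitation is that nothing is printed while a single block is stalled, since no block completion triggers the check.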

@guihao-liang
Collaborator Author

> @guihao-liang - As we discussed in person, it would be better if progress were reported at a fixed time interval rather than every time a 64MB block is uploaded/downloaded. Here is an example of where we do something like that in the Python layer.

Thanks! I had already written my own implementation yesterday, and it's pretty similar. I think a better solution is to have a separate thread that reports every 20s, but I have no time for that. This should be my last S3 PR; I need to work on other stuff.

@guihao-liang guihao-liang linked an issue Apr 14, 2020 that may be closed by this pull request
@guihao-liang
Collaborator Author

A possible solution. 80% finished. Feel free to pick it up.

@guihao-liang guihao-liang removed this from the 6.2 milestone Apr 15, 2020
@TobyRoseman TobyRoseman reopened this Apr 15, 2020
@guihao-liang guihao-liang force-pushed the 04-13-better-report branch 2 times, most recently from ca4f444 to 471b0b9 Compare May 8, 2020 20:10
@guihao-liang guihao-liang requested review from brtal and nickjong May 8, 2020 21:01
@guihao-liang guihao-liang added this to the 6.3 milestone May 8, 2020
@guihao-liang
Collaborator Author

Passed internally. Job ID 112426.

@guihao-liang guihao-liang force-pushed the 04-13-better-report branch from c5ec0b1 to e163555 Compare May 18, 2020 23:10
Development

Successfully merging this pull request may close these issues.

Provide S3 prompts for user using SFrame to download S3.
2 participants