distributed.worker - WARNING - Could not find data #2733

Open
TonyHuBD opened this issue May 21, 2024 · 2 comments

TonyHuBD commented May 21, 2024

Code Sample, a copy-pastable example to reproduce your bug.

feature_matrix, feature_defs = ft.dfs(
    entityset=es,
    target_dataframe_name="acc",
    agg_primitives=["count", "sum"],
    trans_primitives=["MultiplyNumericBoolean"],
    cutoff_time=cutoff_times,
    cutoff_time_in_index=True,
    training_window="24 hour",
    max_depth=2,
    verbose=True,
    n_jobs=36,
)

Warning message

2024-05-18 18:25:09,703 - distributed.worker - WARNING - Could not find data: {'bytes-e7e617a37c90e401634a701acdeac78d': ['tcp://127.0.0.1:41237', 'tcp://127.0.0.1:43061', 'tcp://127.0.0.1:46849', 'tcp://127.0.0.1:40099', 'tcp://127.0.0.1:46353']} on workers: [] (who_has: {'EntitySet-878e4a0191d9aef3e784de699c988216': ['tcp://127.0.0.1:41237', 'tcp://127.0.0.1:43061', 'tcp://127.0.0.1:46493', 'tcp://127.0.0.1:40099', 'tcp://127.0.0.1:46353'], 'bytes-e7e617a37c90e401634a701acdeac78d': ['tcp://127.0.0.1:41237', 'tcp://127.0.0.1:43061', 'tcp://127.0.0.1:46849', 'tcp://127.0.0.1:40099', 'tcp://127.0.0.1:46353']})
2024-05-18 18:25:09,704 - distributed.scheduler - WARNING - Worker tcp://127.0.0.1:45941 failed to acquire keys: {'bytes-e7e617a37c90e401634a701acdeac78d': ('tcp://127.0.0.1:41237', 'tcp://127.0.0.1:43061', 'tcp://127.0.0.1:46849', 'tcp://127.0.0.1:40099', 'tcp://127.0.0.1:46353')}
2024-05-18 18:28:00,306 - distributed.worker - WARNING - Could not find data: {'bytes-e7e617a37c90e401634a701acdeac78d': ['tcp://127.0.0.1:45941', 'tcp://127.0.0.1:41237', 'tcp://127.0.0.1:46849', 'tcp://127.0.0.1:46353']} on workers: [] (who_has: {'bytes-e7e617a37c90e401634a701acdeac78d': ['tcp://127.0.0.1:45941', 'tcp://127.0.0.1:41237', 'tcp://127.0.0.1:46849', 'tcp://127.0.0.1:46353']})

I don't know why. I can't get the result when I use n_jobs.

thehomebrewnerd commented May 21, 2024

@TonyHuBD n_jobs seems to be working as expected for me. Unfortunately, the information above isn't detailed enough to tell what might be going wrong. If you are not using the most recent versions of Featuretools and Dask, my first suggestion would be to upgrade to the latest released versions and try again.

@BohdanBilonoh

I can confirm that the latest featuretools does not work with distributed>=2024.8.2, because of this line in featuretools and this func in the latest distributed. It could be fixed by passing the client to Future, like so:

...
num_scattered_workers = len(
    client.who_has([Future(es_token, client=client)]).get(es_token, []),
)
...

@thehomebrewnerd please let me know what you think about the fix so I could make a PR
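
For anyone trying to work out whether their environment falls in the range reported above, here is a minimal sketch of a version check. It only encodes the distributed>=2024.8.2 threshold quoted in this thread; the helper names (parse_calver, is_reported_affected, installed_distributed_version) are hypothetical and not part of featuretools or distributed.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# First distributed release reported as affected in this thread (assumption
# taken from the comment above, not from any changelog).
FIRST_REPORTED_AFFECTED = (2024, 8, 2)


def parse_calver(version_string):
    """Return the leading YYYY.M.P components of a version string as ints."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_string!r}")
    return tuple(int(part) for part in match.groups())


def is_reported_affected(version_string):
    """True if the version is at or above the threshold quoted in the thread."""
    return parse_calver(version_string) >= FIRST_REPORTED_AFFECTED


def installed_distributed_version():
    """Return the installed distributed version, or None if it is not installed."""
    try:
        return version("distributed")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    installed = installed_distributed_version()
    if installed is None:
        print("distributed is not installed")
    else:
        print(installed, "reported affected:", is_reported_affected(installed))
```

If the check reports an affected version, running dfs without n_jobs or trying an older distributed release may help narrow things down, though neither is confirmed as a fix in this thread.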
