Use of statistics from Zhou et al. #47

Closed
johnurbanik opened this issue Mar 31, 2020 · 3 comments
Labels
duplicate This issue or pull request already exists priority This needs to get done first

Comments

johnurbanik commented Mar 31, 2020

As mentioned in #40, I'm very worried about the validity of the Zhou et al. statistics, both here and across the epidemiological modeling space right now. I'm not sure the study design yields a valid estimate of length of hospitalization. The study is still useful for understanding co-morbidities, but I think it may be necessary to find a different source for length-of-stay information.

I'd really appreciate it if you all took a look. I'm hoping I'm wrong here, as a lot of modeling seems to be using the medians from this study (and treating them as means, rather than fitting a distribution as you are). Perhaps another study shows similar results.
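To illustrate why treating a median as a mean matters for a right-skewed length of stay, here is a minimal sketch. The log-normal shape and both parameters (`ASSUMED_MEDIAN_DAYS`, `ASSUMED_SIGMA`) are hypothetical values chosen for illustration; they are not taken from Zhou et al. or from any fitted model.

```python
import math
import random
import statistics

# Hypothetical parameters, chosen only for illustration (not from Zhou et al.).
ASSUMED_MEDIAN_DAYS = 11.0
ASSUMED_SIGMA = 0.6  # log-scale spread; larger values give a heavier right tail

rng = random.Random(42)

# Draw right-skewed lengths of stay; exp(mu) is the median of a log-normal.
stays = [
    rng.lognormvariate(math.log(ASSUMED_MEDIAN_DAYS), ASSUMED_SIGMA)
    for _ in range(50_000)
]

sample_median = statistics.median(stays)
sample_mean = statistics.mean(stays)

# Expected bed-days scale with the mean stay, so plugging in the median
# understates demand whenever the distribution is right-skewed.
admissions = 1_000
bed_days_using_median = admissions * sample_median
bed_days_using_mean = admissions * sample_mean

print(f"median stay: {sample_median:.1f} days, mean stay: {sample_mean:.1f} days")
print(f"bed-days for {admissions} admissions using median: {bed_days_using_median:.0f}")
print(f"bed-days for {admissions} admissions using mean:   {bed_days_using_mean:.0f}")
```

Under these assumptions the mean exceeds the median, so a model that plugs the reported median in as a mean would understate bed demand. The size of the gap depends entirely on the assumed spread.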

Quoting the study:

Since these two hospitals were the only designated hospitals for transfer of patients with COVID-19 from other hospitals in Wuhan until Feb 1, 2020, our study enrolled all adult inpatients who were hospitalised for COVID-19 and had a definite outcome (dead or discharged) at the early stage of the outbreak.

First, due to the retrospective study design, not all laboratory tests were done in all patients, including lactate dehydrogenase, IL-6, and serum ferritin. Therefore, their role might be underestimated in predicting in-hospital death. Second, patients were sometimes transferred late in their illness to the two included hospitals. Lack of effective antivirals, inadequate adherence to standard supportive therapy, and high-dose corticosteroid use might have also contributed to the poor clinical outcomes in some patients. Third, the estimated duration of viral shedding is limited by the frequency of respiratory specimen collection, lack of quantitative viral RNA detection, and relatively low positive rate of SARS-CoV-2 RNA detection in throat-swabs. Fourth, by excluding patients still in hospital as of Jan 31, 2020, and thus relatively more severe disease at an earlier stage, the case fatality ratio in our study cannot reflect the true mortality of COVID-19.

The two biggest red flags for me are that 'patients were sometimes transferred late in their illness to the two included hospitals' and that the study reports 'time from illness onset to death or discharge' >>> 'hospital length of stay'. Combined with the graphics in Figures 1 and 2, this suggests that a large percentage of the patients were hospitalized for other reasons before transfer (how else would they have labs for these patients, and why else would they suspect such a long infectious period before hospitalization?).

I expect that the bar for admission in most hospitals will be higher than just fever and a positive test (e.g. I've heard reports that in some cases in New York, those without dyspnea are sent home, sometimes without testing). Even so, the median onset of dyspnea in the dataset is 4 days earlier than hospital admission (which, inexplicably, appears in Table 1 and is not copied into Table 2).

Further, patients without a definite outcome were excluded from the sample. Patients who were admitted before Jan 19 but not yet discharged are therefore missing, and including them would shift the distribution toward longer hospitalization times.
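The effect of enrolling only patients with a definite outcome can be sketched with a small simulation. Everything here is an assumption made for illustration: the gamma-shaped stay distribution, the 14-day mean, the 30-day enrolment window, and the function name `simulate_observed_stays` are hypothetical and do not come from Zhou et al. or from the analysis notebook.

```python
import random
import statistics


def simulate_observed_stays(n_patients=20_000, window_days=30.0,
                            mean_stay_days=14.0, seed=0):
    """Return (true_stays, observed_stays) under an outcome-by-cutoff rule.

    All parameters are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    true_stays = []
    observed_stays = []
    for _ in range(n_patients):
        # Admissions are spread uniformly over the enrolment window.
        admit_day = rng.uniform(0.0, window_days)
        # Right-skewed stay; gamma(shape=2, scale=mean/2) has the given mean.
        stay = rng.gammavariate(2.0, mean_stay_days / 2.0)
        true_stays.append(stay)
        # Only patients discharged or dead by the cutoff enter the sample.
        if admit_day + stay <= window_days:
            observed_stays.append(stay)
    return true_stays, observed_stays


true_stays, observed_stays = simulate_observed_stays()
true_median = statistics.median(true_stays)
observed_median = statistics.median(observed_stays)

print(f"patients simulated: {len(true_stays)}, enrolled: {len(observed_stays)}")
print(f"true median stay:     {true_median:.1f} days")
print(f"observed median stay: {observed_median:.1f} days")
```

In this sketch the enrolled sample has a shorter median stay than the full population, because long stays are more likely to still be in progress at the cutoff. A survival-analysis approach that treats those patients as right-censored, rather than dropping them, would be one way to avoid the bias.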

If these two effects interact, it is possible that the actual distribution has a substantially larger median than suggested (or at least a fatter tail). I hope that I am wrong.

@johnurbanik
Author

I performed some analysis using the empirical dataset from Wuhan covering the later stages of the initial outbreak. It isn't my best analysis (I tried to get it out quickly after work, and I'd welcome contributions from anyone), but the results certainly seem to indicate that the Zhou et al. figures are pretty far off in terms of both the mean and the tail behavior of the distribution.

Please feel free to take a look.

https://github.com/understand-covid/proposal/blob/master/parameter%20estimation/hospital_stay_analysis.ipynb

@thibautjombart
Owner

This is pretty useful, thanks! I think we will:

How does that sound?

@thibautjombart thibautjombart added the priority This needs to get done first label Apr 1, 2020
@thibautjombart thibautjombart added the duplicate This issue or pull request already exists label Apr 1, 2020
@thibautjombart
Owner

Closing to follow on #49 and #9
