Null values currently fail validation for EmergencyCareEpisodeSchema due to int64 type #44
Comments
Having had a better look at feature_maps.py, I think this might be better managed by doing a fillna(0) on the relevant columns! I was wondering if I could clarify a few other things, however, @vvcb
Missing SNOMED codes should be replaced with 0. This avoids pandas NaN issues (I have included a link in the documentation). Is PR #46 still necessary if this is already done? If there is a specific code for missing values, then this should be included in feature_maps. It would be great if you are happy to do this.
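For illustration, the fill-with-0 approach could look roughly like the sketch below. The column patterns are the ones listed in the issue description; the helper name `fill_missing_codes` and the example frame are hypothetical and are not taken from feature_maps.py.

```python
import re

import numpy as np
import pandas as pd

# Column-name patterns copied from the issue description.
CODE_COLUMN_PATTERNS = [
    r"edcomorb_[0-9]{2}$",
    r"eddiag_[0-9]{2}$",
    r"eddiagqual_[0-9]{2}$",
    r"edentryseq_[0-9]{2}$",
    r"edinvest_[0-9]{2}$",
    r"edtreat_[0-9]{2}$",
]


def fill_missing_codes(df, patterns=CODE_COLUMN_PATTERNS, fill_value=0):
    """Hypothetical helper: replace nulls in the coded columns and cast to int64."""
    combined = re.compile("|".join(f"(?:{p})" for p in patterns))
    out = df.copy()
    for col in out.columns:
        # re.match anchors at the start; the patterns themselves anchor the end with "$".
        if combined.match(str(col)):
            out[col] = out[col].fillna(fill_value).astype(np.int64)
    return out


# Made-up example data: one coded column with a gap, one unrelated column.
example = pd.DataFrame(
    {"eddiag_01": [386661006, None], "patient_id": ["a", "b"]}
)
filled = fill_missing_codes(example)
```

After the fill, `filled["eddiag_01"]` is a plain `int64` column with 0 in place of the missing code, so it would satisfy a schema that declares `dtype=np.int64`. The trade-off is that 0 then has to be understood downstream as "missing".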
a) Treat absence of diagnosis as 'Non-ACSC'. We have about 10% of patients where there is no diagnosis assigned within the emergency care dataset.
Ah...I see it now 😊. Option b may be the correct one, but it is worth checking with the lead team regarding how they want this handled. 10% is a sizable proportion to be discarding.
The following columns are currently failing validation because they contain null values but are set to `dtype=np.int64` in the `EmergencyCareEpisodeSchema`:
edcomorb_[0-9]{2}$
eddiag_[0-9]{2}$
eddiagqual_[0-9]{2}$
edentryseq_[0-9]{2}$
edinvest_[0-9]{2}$
edtreat_[0-9]{2}$
Suggest changing to `pd.Int64Dtype()` to allow null values. They could be changed to a float type, but then you get the awkward situation where pandas adds a decimal point onto the end of the SNOMED code.
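A small sketch of the difference between the three dtypes, using a made-up series named after one of the failing columns (the code values are only placeholders for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical coded column with one missing value.
raw = pd.Series([386661006, None, 22298006], name="eddiag_01")

# Plain np.int64 cannot represent a null, so the cast raises.
try:
    raw.astype(np.int64)
    cast_failed = False
except (ValueError, TypeError):
    cast_failed = True

# float64 accepts the null as NaN, but the codes gain a trailing ".0" when rendered.
as_float = raw.astype("float64")
float_strings = as_float.dropna().astype(str).tolist()

# The nullable extension dtype keeps whole integers and stores the gap as pd.NA.
as_nullable = raw.astype(pd.Int64Dtype())
nullable_strings = as_nullable.dropna().astype(str).tolist()

print(cast_failed)
print(float_strings)
print(nullable_strings)
```

With `pd.Int64Dtype()` the codes stay as integers and the missing entry is reported by `isna()`, which is the behaviour the schema change is aiming for.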