rbmi expects the incoming dataset to be complete, that is, to have 1 row per patient per time point, even if the analysis value is missing for that row.
We currently offer the helper functions expand(), fill_locf() and expand_locf() to help create this dataset. However, in practice these don't work well because they depend on LOCF to assign default values to the missing covariate values in the newly created rows. The issue is that if the first / baseline row was missing, then there is nothing for LOCF to populate the missing covariate values with.
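To make the failure mode concrete, here is a minimal base R sketch. It does not call rbmi; the toy data, the column names (`id`, `visit`, `sex`, `val`) and the `locf()` helper are all made up for illustration, and the expansion is done by hand with `expand.grid()` and `merge()`.

```r
# Toy data (hypothetical): patient "2" has no baseline ("week 0") row
dat <- data.frame(
  id    = c("1", "1", "2"),
  visit = c("week 0", "week 1", "week 1"),
  sex   = c("M", "M", "F"),
  val   = c(1.2, 1.5, 2.3),
  stringsAsFactors = FALSE
)

# Expansion step: 1 row per patient per visit; new rows are NA everywhere
grid <- expand.grid(
  id    = unique(dat$id),
  visit = c("week 0", "week 1"),
  stringsAsFactors = FALSE
)
full <- merge(grid, dat, by = c("id", "visit"), all.x = TRUE)
full <- full[order(full$id, full$visit), ]

# Illustrative LOCF helper: carry the previous value forward over NAs
locf <- function(x) {
  for (i in seq_along(x)[-1]) {
    if (is.na(x[i])) x[i] <- x[i - 1]
  }
  x
}

# Apply LOCF to the covariate within each patient
full$sex <- ave(full$sex, full$id, FUN = locf)

# The new baseline row for patient "2" has nothing before it to carry
# forward, so its covariate is still missing after LOCF
full[full$id == "2", ]
```

Patient "2" does have a known value of `sex`, but it sits on a later row, so LOCF never reaches the newly created baseline row.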
In internal training we have been recommending that users first create a reference row containing all the covariate values, which can then be used in combination with LOCF. However, this is clunky, as the user needs to remember to drop the reference row afterwards. It also means the user needs to add a "REFERENCE" factor level to the visit variable and remember to re-level afterwards to remove it.
I would propose a new function expand_ref() which does the expansion step and fills in the newly formed rows with values from a reference dataset, where the reference dataset is strictly 1 row per patient, e.g.
expand_ref(
    data = adeff,
    ref = adref,
    AVISIT = c("week 1", "week 2", ...)
)
I would not apply any LOCF here, though; if the user needs that, they can pipe the result into a call to fill_locf(). (In hindsight I think expand_locf() tries to do too much at once and shouldn't exist, but we are where we are, and I don't think we should remove it as people have already built code around it.)
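As a rough illustration of the intended behaviour, here is a base R sketch. It is not an existing rbmi function: the name `expand_ref_sketch()`, the explicit `id` argument, and the column names `USUBJID`, `SEX` and `AVAL` are all assumptions made for the example, and a real implementation would also need to handle factor levels and input validation properly.

```r
# Hypothetical sketch of the proposed behaviour (not part of rbmi)
expand_ref_sketch <- function(data, ref, id, ...) {
  levels_list <- list(...)  # e.g. AVISIT = c("week 1", "week 2")
  stopifnot(
    length(levels_list) >= 1,
    !is.null(names(levels_list)),
    all(names(levels_list) != ""),
    !anyDuplicated(ref[[id]])  # reference must be strictly 1 row per patient
  )
  keys <- c(id, names(levels_list))

  # Expansion step: every patient in `ref` crossed with the requested levels
  grid <- expand.grid(
    c(stats::setNames(list(ref[[id]]), id), levels_list),
    stringsAsFactors = FALSE
  )

  # Flag rows that already exist so that new rows can be identified
  data$.existing <- TRUE
  out <- merge(grid, data, by = keys, all.x = TRUE)
  is_new <- is.na(out$.existing)
  out$.existing <- NULL

  # Fill only the newly formed rows from the reference dataset (no LOCF)
  ref_vars <- intersect(setdiff(names(ref), id), names(out))
  idx <- match(out[[id]], ref[[id]])
  for (v in ref_vars) {
    out[[v]][is_new] <- ref[[v]][idx[is_new]]
  }

  out[do.call(order, unname(as.list(out[keys]))), , drop = FALSE]
}

# Toy inputs (hypothetical): patient "2" has no "week 1" row
adeff <- data.frame(
  USUBJID = c("1", "1", "2"),
  AVISIT  = c("week 1", "week 2", "week 2"),
  SEX     = c("M", "M", "F"),
  AVAL    = c(1.2, 1.5, 2.3),
  stringsAsFactors = FALSE
)
adref <- data.frame(
  USUBJID = c("1", "2"),
  SEX     = c("M", "F"),
  stringsAsFactors = FALSE
)

res <- expand_ref_sketch(
  adeff, adref,
  id = "USUBJID",
  AVISIT = c("week 1", "week 2")
)
```

In this sketch the new "week 1" row for patient "2" takes `SEX` from `adref`, while `AVAL` stays missing because it is not a reference variable. Rows that already existed in `adeff` are left untouched.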