Could the intervals be extended to month and/or month-year? #14
Hi @Lextuga007 - I've only just seen this, not sure why I wasn't notified before. As far as I know, if it works with […] I'd be happy to take a look if you have some trial data you could share (offline)?
Hey @johnmackintosh did you guys end up finding out if this worked? I'm potentially going to be counting folks who were added to a register before a specific date but not yet removed by it, over multiple years. It appears that specifying "year" would be fine - how would I go about setting the day & month to check at?
@will-ball I never got round to looking into this in detail. In reference to @Lextuga007's comment, the package doesn't necessarily only go to day level, but it does expect date-times rather than dates.

If you use the individual level, the function returns a row per individual per interval, including the original start and end datetimes, plus the interval's base date and hour, which you can use to filter results to a specific date and time.

Alternatively, maybe you could use data.table's rolling joins?

https://www.gormanalysis.com/blog/r-data-table-rolling-joins/
https://r-norberg.blogspot.com/2016/06/understanding-datatable-rolling-joins.html

If you have some fake data to play around with, I would be happy to take a look at all the options.
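For anyone who has not used rolling joins, here is a minimal sketch of the idea. The `events` and `cutoffs` tables and their column names are hypothetical, made up for illustration, and this is not how patientcounter works internally. With `roll = TRUE`, each cutoff date is matched to the most recent event on or before it, so the `status` column tells you whether the person was on the register at that cutoff.

```r
library(data.table)

# Hypothetical register events for one person: each row is a status change.
events <- data.table(
  id = c(1L, 1L, 1L, 1L),
  event_date = as.Date(c("2019-03-01", "2019-09-15", "2020-05-10", "2020-06-20")),
  status = c("added", "removed", "added", "removed")
)

# Hypothetical cutoff dates to check the register at.
cutoffs <- data.table(
  id = c(1L, 1L),
  cutoff = as.Date(c("2019-07-31", "2020-07-31"))
)

# Rolling join: for each cutoff, take the last event dated on or before it.
# In the result, event_date holds the cutoff values from `cutoffs`.
status_at_cutoff <- events[cutoffs, on = .(id, event_date = cutoff), roll = TRUE]

print(status_at_cutoff)
```

This layout assumes one row per status change; data held as one row per episode (start and end columns) would need reshaping first.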
Thanks for getting back to me @johnmackintosh. I've not encountered rolling joins before so will take a look, thanks for flagging. I've got a toy dataset to illustrate:

```r
# Simple Example
library(tidyverse)
library(lubridate)
library(truncnorm)

n_people <- 1000
start_date <- as_date("2012-01-01")
end_date <- as_date("2021-12-31")

set.seed(20221214)

# One row per registration episode; ids are sampled with replacement,
# so the same person can appear more than once.
data <- as_tibble(
  list(
    id = sample(1:n_people, replace = TRUE),
    added = start_date + sample.int(end_date - start_date, n_people))) %>%
  mutate(
    # Episode length in days, drawn from a truncated normal (1 to 1000 days)
    removed = added + rtruncnorm(n_people, mean = 30, sd = 15, a = 1, b = 1000),
    days = added %--% removed %/% days(1))
```

From data which essentially looks like this, I'd like to count how many people are 'registered' on the 31st July each year. I don't think it should complicate anything, but the same person can be added/removed multiple times. I will have a play myself, but if you get bored and want to take a look let me know.
See if this gives you what you need, @will-ball?
will give you tallies for each cutoff date
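As an illustration of tallying per cutoff date, here is a minimal base R sketch. It is not necessarily the code posted in this comment; the `episodes` data, the `count_registered` helper, and the rule that someone counts as registered when `added <= cutoff < removed` are all assumptions made for the example.

```r
# Hypothetical episodes: one row per registration episode.
episodes <- data.frame(
  id      = c(1, 1, 2, 3),
  added   = as.Date(c("2012-07-01", "2013-07-20", "2012-06-15", "2014-01-10")),
  removed = as.Date(c("2012-08-15", "2013-08-05", "2013-09-01", "2014-03-01"))
)

# One cutoff per year, on 31 July.
cutoffs <- seq(as.Date("2012-07-31"), as.Date("2014-07-31"), by = "year")

# Count distinct people with an episode open at the cutoff
# (assumed rule: added on or before the cutoff, removed after it).
count_registered <- function(cutoff, episodes) {
  open <- episodes$added <= cutoff & episodes$removed > cutoff
  length(unique(episodes$id[open]))
}

tallies <- data.frame(
  cutoff = cutoffs,
  n_registered = vapply(
    seq_along(cutoffs),
    function(i) count_registered(cutoffs[i], episodes),
    integer(1)
  )
)

print(tallies)
```

Counting distinct ids rather than rows means a person with several overlapping episodes is only counted once per cutoff.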
That works perfectly thanks 😄
Nice one @will-ball
Yes, it does look like "year" is supported as
id 1 should get 2019 and 2020, but because its end date is on the 1st, 2020 doesn't show. I'm guessing, but is this something related to the date-times, with the time tipping it to 2019-12-31? The same happens with id 3, which should be 2019, 2020, 2021 and 2022, but 2022 is dropped.
Hmm, I wonder if that is timezone related. I don't have much bandwidth to look into this at present. Another possible influencing factor is my use of "within" as the overlap type with foverlaps. Will try and get that sorted soon.
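To show how a timezone could produce this kind of off-by-one year, here is a small base R sketch. It only demonstrates the general mechanism and has not been confirmed as the cause in patientcounter; the two timezones are arbitrary choices for illustration.

```r
# A date-time at midnight on 1 January 2020, stored as UTC.
end_time <- as.POSIXct("2020-01-01 00:00:00", tz = "UTC")

# The same instant, read in UTC and in a timezone behind UTC.
year_utc <- format(end_time, "%Y", tz = "UTC")
year_ny  <- format(end_time, "%Y", tz = "America/New_York")
date_ny  <- as.Date(end_time, tz = "America/New_York")

print(year_utc)  # year as seen in UTC
print(year_ny)   # year as seen in New York
print(date_ny)   # calendar date as seen in New York
```

Midnight UTC on 1 January is still the evening of 31 December in New York, so any step that converts between timezones before extracting the year can drop the final year of an interval.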
Tom Jemmett (https://github.com/tomjemmett) wrote this code, which I've adapted for the data I used. It's made me realise that what I need to count is not really a census: for something like prevalence, I don't want to subtract people who leave.
I think for prevalence I'd need to drop the generating of -1 for an exit.
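The +1/-1 idea can be sketched in base R as below. This is not Tom Jemmett's code, just a minimal illustration with hypothetical data: each entry contributes +1 and each exit -1, and a running sum gives the census. Dropping the -1 rows leaves a running total of entries, which is an assumption about what is wanted for prevalence here.

```r
# Hypothetical episodes: one row per person.
episodes <- data.frame(
  id      = c(1, 2, 3),
  added   = as.Date(c("2019-02-01", "2019-06-01", "2020-03-01")),
  removed = as.Date(c("2019-11-30", "2020-08-15", "2021-01-10"))
)

# Turn each episode into two events: +1 on entry, -1 on exit.
events <- rbind(
  data.frame(date = episodes$added,   change = 1L),
  data.frame(date = episodes$removed, change = -1L)
)
events <- events[order(events$date), ]

# Census: number of people currently on the register after each event.
events$census <- cumsum(events$change)

# Without exits: running total of everyone who has ever been added.
entries <- events[events$change == 1L, ]
entries$ever_registered <- cumsum(entries$change)

print(events)
print(entries)
```

If two events share a date, the order of ties affects the intermediate census values, so a real implementation would need a rule for that.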
I want to give patientcounter a try with smoking prevalence data by team or ward. I have information over many years, so the best way to 'count' the open people in a team or ward is by referrals per month-year. patientcounter only goes to day level - is that right?