What is survival analysis?
The objective in survival analysis — also referred to as reliability analysis in engineering — is to establish a connection between covariates and the time of an event. The name survival analysis originates from clinical research, where predicting the time to death, i.e., survival, is often the main objective. Survival analysis is a type of regression problem (one wants to predict a continuous value), but with a twist. It differs from traditional regression by the fact that parts of the training data can only be partially observed – they are censored.
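Concretely, each training sample is a pair of a time and an event indicator: when the indicator is false the sample is censored, i.e., we only know the event had not yet happened at that time. Below is a minimal, dependency-free Rust sketch (illustrative names only, not from any existing library) showing this encoding together with the textbook Kaplan-Meier estimator of the survival function:

```rust
/// One observation: time on study plus whether the event
/// (e.g. death, churn, hardware failure) was actually seen.
/// `event == false` means the sample was censored: we only
/// know the event had not happened yet at `time`.
#[derive(Clone, Copy)]
struct Observation {
    time: f64,
    event: bool,
}

/// Kaplan-Meier estimate of the survival function S(t),
/// returned as (time, survival probability) steps.
fn kaplan_meier(mut obs: Vec<Observation>) -> Vec<(f64, f64)> {
    obs.sort_by(|a, b| a.time.partial_cmp(&b.time).unwrap());
    let mut at_risk = obs.len() as f64;
    let mut surv = 1.0;
    let mut curve = Vec::new();
    let mut i = 0;
    while i < obs.len() {
        let t = obs[i].time;
        // Count events and censorings tied at this time point.
        let mut deaths = 0.0;
        let mut leaving = 0.0;
        while i < obs.len() && obs[i].time == t {
            if obs[i].event {
                deaths += 1.0;
            }
            leaving += 1.0;
            i += 1;
        }
        if deaths > 0.0 {
            surv *= 1.0 - deaths / at_risk;
            curve.push((t, surv));
        }
        at_risk -= leaving;
    }
    curve
}

fn main() {
    // Two censored samples: still event-free when observation ended.
    let data = vec![
        Observation { time: 2.0, event: true },
        Observation { time: 3.0, event: false },
        Observation { time: 5.0, event: true },
        Observation { time: 5.0, event: true },
        Observation { time: 8.0, event: false },
    ];
    for (t, s) in kaplan_meier(data) {
        println!("S({t}) = {s:.3}");
    }
}
```

Naively dropping the censored rows, or treating their times as event times, biases the estimate; the event indicator is what lets the estimator use those rows correctly.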
In view of the great popularity of time series data and time series databases, survival analysis provides a straightforward way to reason about time-to-event predictions. We learnt about Smartcore from PostgresML (machine learning for Postgres), which recently switched to Smartcore for performance reasons. We are trying to estimate how hard it would be to perform time-to-event prediction in our Postgres setup; it seems that doing so would require a Smartcore implementation of survival analysis (originally implemented in scikit-survival's sksurv.linear_model).
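To make the ask concrete, here is a purely hypothetical sketch of the kind of interface we imagine, mirroring the usual fit/predict shape; none of this exists in Smartcore today, and all names are invented:

```rust
/// Hypothetical trait: what a survival estimator in the spirit of
/// sksurv.linear_model (e.g. a Cox proportional hazards model)
/// might expose. Invented for illustration; not Smartcore API.
trait SurvivalEstimator: Sized {
    /// x: covariates (one row per sample),
    /// times: observed or censoring time per sample,
    /// events: true if the event was observed, false if censored.
    fn fit(x: &[Vec<f64>], times: &[f64], events: &[bool]) -> Self;

    /// Relative risk score per sample; higher means the event is
    /// expected sooner.
    fn predict_risk(&self, x: &[Vec<f64>]) -> Vec<f64>;
}

fn main() {} // interface sketch only; no implementation yet
```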
Exactly how hard would it be, how much would have to be done from scratch, and where would we even begin if we don't have expertise in the subject? Honestly, we would love to pay somebody to do this, but where would we look to hire for it?
Best regards
Hi, thanks for using Smartcore. We really like hearing from organizations leveraging the library; the engineers at PostgresML have been very kind in contributing useful code recently.
I have never heard of scikit-survival, but I will take a look. I have worked a little with SCMs (structural causal models; you can read some examples in my blog posts), and at first sight it looks like the two fields try to tackle similar problems, but I don't know of any formal relation between the approaches. New day, new things to learn.
As usual, scikit provides a nice API that we can work back from. Having the support of a scientist who knows the math in depth is sometimes hard to arrange; fortunately, we can usually collect a good number of tests from existing libraries, experiments, and papers to feel confident about being numerically accurate. Becoming an expert in a particular framework indeed takes years, and the same goes for making a good library.

In my experience, the effort depends mostly on how many layers of nested functions are needed. In very simple terms, implementing a pipeline with multiple pre-processing and processing steps takes longer (usually months even with existing code to reference, since many "classes" and methods call each other) than a simple regression, which usually means implementing a handful of methods. In Smartcore, most classifiers fit in one file; multiple files usually means compounded effort.
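To give a sense of scale for the "simple regression" end of that spectrum: the numerical core of a Cox proportional hazards model (the main estimator in sksurv.linear_model) is a single objective, the negative partial log-likelihood, which an optimizer then minimizes. A minimal, dependency-free Rust sketch (illustrative only, not Smartcore code):

```rust
/// Negative Cox partial log-likelihood with Breslow handling of ties:
///
///   -l(beta) = -sum over events i of
///              [ x_i . beta - ln( sum_{j : t_j >= t_i} exp(x_j . beta) ) ]
///
/// O(n^2) for clarity; a real implementation sorts samples by time
/// and keeps a running sum of exp scores to get O(n log n).
fn neg_partial_log_likelihood(
    x: &[Vec<f64>],  // covariates, one row per sample
    times: &[f64],   // event or censoring time per sample
    events: &[bool], // true = event observed, false = censored
    beta: &[f64],    // coefficients being evaluated
) -> f64 {
    let scores: Vec<f64> = x
        .iter()
        .map(|row| row.iter().zip(beta).map(|(a, b)| a * b).sum())
        .collect();
    let mut nll = 0.0;
    for i in 0..x.len() {
        if !events[i] {
            continue; // censored samples contribute only through risk sets
        }
        // Risk set of i: everyone still under observation at time t_i.
        let risk_sum: f64 = (0..x.len())
            .filter(|&j| times[j] >= times[i])
            .map(|j| scores[j].exp())
            .sum();
        nll -= scores[i] - risk_sum.ln();
    }
    nll
}

fn main() {
    let x = vec![vec![0.5], vec![1.0], vec![1.5]];
    let times = vec![2.0, 3.0, 4.0];
    let events = vec![true, false, true]; // middle sample is censored
    println!("{:.4}", neg_partial_log_likelihood(&x, &times, &events, &[0.1]));
}
```

Fitting then amounts to minimizing this objective with Newton's method or L-BFGS (after deriving the gradient and Hessian), and that optimizer plus the test suite is where most of the real effort would go.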