Add support for stats::formula objects #35
Comments
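In R there are two approaches for declaring label and features:
1. "Matrix interface": model(x = features, y = label)
2. "Formula interface": model(label ~ features, data = data)
Arithmetic operations are supported only with the "formula interface", because this way they become part of the model object (eg. can be serialized/deserialized in RDS data format). However, support for the "formula interface" varies considerably between R packages: it is best supported by several built-in packages (eg. the stats package, which provides glm() and lm() functions), reasonably supported by several others (eg. earth and randomForest packages), and not at all supported by many more.
You need to check the documentation of your target R package/function to see whether it supports the "formula interface".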
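A minimal sketch of the difference between the two interfaces, using only the built-in stats package. The toy data and the column names x1, x2 and y are assumptions made for illustration; they do not come from the thread.

```r
# Toy data; the column names x1, x2 and y are illustrative only
set.seed(42)
data <- data.frame(x1 = rnorm(20), x2 = rnorm(20))
data$y <- 2 * (data$x1 + data$x2) + rnorm(20, sd = 0.1)

# "Matrix interface": the derived feature has to be computed by hand,
# outside of the model object
features <- cbind(intercept = 1, sum = data$x1 + data$x2)
fit_matrix <- lm.fit(x = features, y = data$y)

# "Formula interface": the arithmetic lives inside the formula,
# so it is stored together with the fitted model object
fit_formula <- lm(y ~ I(x1 + x2), data = data)

# Both fits estimate the same coefficients
print(unname(fit_matrix$coefficients))
print(unname(coef(fit_formula)))

# The formula survives an RDS round trip as part of the model object
path <- tempfile(fileext = ".rds")
saveRDS(fit_formula, path)
restored <- readRDS(path)
print(formula(restored))
```

With the matrix interface, the x1 + x2 step exists only in the calling script, so it is not available to anything that later reads the saved model.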
See the following presentation: https://www.slideshare.net/VilluRuusmann/converting-r-to-pmml-82182483
There are many in-formula feature engineering examples starting from slide 13. |
Thanks a lot for the many explanations, comments and the link. That is great.
Looking over it, you have many examples with "as.formula". These are
exactly the things which I would like to have WITHOUT wrapping them into a
linear model or anything else. Just these formulas, straight. No special R package.
Is that really not possible?
I hoped for something like a plain "model" function provided by r2pmml, which
would be a kind of identity wrapper around the formula or something similar.
On Tue, 5 Dec 2017, 19:15 Villu Ruusmann wrote:
In R there are two approaches for declaring label and features:
1. "Matrix interface": model(x = features, y = label)
2. "Formula interface": model(label ~ features, data = data)
Arithmetic operations are supported only with "formula interface", because
this way they become part of the model object (eg. can be
serialized/deserialized in RDS data format). However, the support for
"formula interface" varies considerably between R packages - it is best
supported by several built-in packages (eg. the stats package, which
provides glm() and lm() functions), reasonably supported by several
others (eg. earth and randomForest packages), and not at all supported by
many more.
You need to check the documentation of your target R package/function if
it supports the "formula interface" or not.
If possible, can you provide example code?
See the following presentation:
https://www.slideshare.net/VilluRuusmann/converting-r-to-pmml-82182483
There are many in-formula feature engineering examples starting from slide
13.
|
You mean taking a [...]
formula = as.formula(...)
r2pmml(formula, "formula.pmml")
What will happen to those PMML fragments afterwards? Do you want to copy-paste them manually to someplace else?
The PMML thinking is that formula objects cannot exist in isolation. They have to be associated with a model object or, alternatively, be converted to some sort of function definition (typically a [...]). However, it would be possible to teach the [...] |
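To illustrate the "function definition" idea in plain R, here is a minimal sketch that wraps the right-hand side of a formula in an ordinary function, with no model involved. The helper name formula_to_function and the sample data are hypothetical; nothing here is part of r2pmml.

```r
formula <- as.formula(y ~ I(x1 + x2))

# Hypothetical helper (not an r2pmml function): turn the right-hand side
# of a formula into a function of a data frame
formula_to_function <- function(f) {
  rhs <- f[[length(f)]]  # last element of a formula is its right-hand side
  function(data) {
    # Evaluate the expression with the data frame columns as variables;
    # as.numeric() drops the "AsIs" class that I() attaches
    as.numeric(eval(rhs, envir = data, enclos = environment(f)))
  }
}

compute_y <- formula_to_function(formula)

data <- data.frame(x1 = c(1, 2, 3), x2 = c(10, 20, 30))
result <- compute_y(data)
print(result)
```

This only shows that the arithmetic in a formula can be evaluated on its own in R; it says nothing about where such a function would live inside a PMML document.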
Thank you very much for the explanations and for paraphrasing my thoughts. Thanks a lot! |
Suppose you create a [...]
#library("r2pmml")
formula = as.formula(y ~ I(x1 + x2))
#r2pmml(formula, "formula.pmml")
A formula object could be translated to a singleton [...]
A corresponding PMML fragment might look like this:
<PMML>
<DataDictionary>
<DataField name="x1" dataType="double" optype="continuous"/>
<DataField name="x2" dataType="double" optype="continuous"/>
</DataDictionary>
<TransformationDictionary>
<DerivedField name="y" dataType="double" optype="continuous">
<Apply function="+">
<FieldRef field="x1"/>
<FieldRef field="x2"/>
</Apply>
</DerivedField>
</TransformationDictionary>
</PMML>
This kind of "partial conversion" can be very helpful if you're trying to convert a piece of R (or Python) code into PMML. It will be very easy to copy the above [...] |
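As a rough sketch of what such a "partial conversion" could involve, the following base R code walks the right-hand side of the formula and prints a DerivedField element shaped like the fragment above. The helper to_pmml is hypothetical and handles only binary +, -, *, / plus I() and parentheses; it is not how r2pmml works internally, and the dataType and optype values are simply assumed to be "double" and "continuous".

```r
# Hypothetical helper (not part of r2pmml): translate an R expression
# into nested PMML <Apply>, <FieldRef> and <Constant> elements
to_pmml <- function(expr, indent = "") {
  if (is.name(expr)) {
    return(sprintf('%s<FieldRef field="%s"/>', indent, as.character(expr)))
  }
  if (is.numeric(expr)) {
    return(sprintf('%s<Constant dataType="double">%s</Constant>', indent, format(expr)))
  }
  if (is.call(expr)) {
    op <- as.character(expr[[1]])[1]
    if (op %in% c("I", "(")) {
      # I(...) and parentheses only group the arithmetic; unwrap them
      return(to_pmml(expr[[2]], indent))
    }
    # Only binary arithmetic is handled in this sketch
    stopifnot(op %in% c("+", "-", "*", "/"), length(expr) == 3)
    args <- vapply(as.list(expr)[-1], to_pmml, character(1),
                   indent = paste0(indent, "  "))
    return(paste(c(sprintf('%s<Apply function="%s">', indent, op),
                   args,
                   sprintf("%s</Apply>", indent)), collapse = "\n"))
  }
  stop("unsupported expression")
}

formula <- as.formula(y ~ I(x1 + x2))

# The left-hand side names the derived field, the right-hand side defines it
derived_field <- paste(
  sprintf('<DerivedField name="%s" dataType="double" optype="continuous">',
          as.character(formula[[2]])),
  to_pmml(formula[[3]], indent = "  "),
  "</DerivedField>",
  sep = "\n"
)
cat(derived_field, "\n")
```

A real converter would also need to infer data types and emit the matching DataDictionary entries, which this sketch leaves out.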
The package's README (https://github.com/jpmml/r2pmml#model-formulae) says that one can use nice R syntax to define ordinary arithmetic processing of the data when using GLM or similar models.
Are formulas also supported independently of LM/GLM? I mean for creating simple models that involve only simple arithmetic.
If possible, can you provide example code? If not, can it be supported in general?