itree {itree} | R Documentation |
Fit an itree model.
itree(formula, data, weights, subset, na.action = na.itree, method,
      penalty = "none", model = FALSE, x = FALSE, y = TRUE, parms,
      control, cost, ...)
formula |
a formula, with a response but no interaction terms. |
data |
an optional data frame in which to interpret the variables named in the formula. |
weights |
optional case weights. |
subset |
optional expression saying that only a subset of the rows of the data should be used in the fit. |
na.action |
the default action deletes all observations for which
|
method |
one of |
penalty |
one of |
model |
if logical: keep a copy of the model frame in the result?
If the input value for |
x |
keep a copy of the |
y |
keep a copy of the dependent variable in the result. If
missing and |
parms |
optional parameters for the splitting function. |
control |
a list of options that control details of the
|
cost |
a vector of non-negative costs, one for each variable in the model. Defaults to one for all variables. These are scalings to be applied when considering splits, so the improvement on splitting on a variable is divided by its cost in deciding which split to choose. Note that costs are not currently supported by the extremes or purity methods. |
... |
arguments to |
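The scaling described for the cost argument can be illustrated with a little base R arithmetic. This is a minimal sketch, not itree's internal code: the improvement values are made-up numbers chosen for illustration, and the variable names `improvement`, `scaled`, and `chosen` are hypothetical.

```r
# Hypothetical raw improvements for the best split on each variable
# (illustrative numbers only, not output from itree).
improvement <- c(Age = 0.30, Number = 0.24, Start = 0.36)

# One non-negative cost per variable; the documented default is 1 for all.
cost <- c(Age = 1, Number = 1, Start = 2)

# The improvement for a variable is divided by its cost before
# candidate splits are compared.
scaled <- improvement / cost

# Variable that would be preferred under this scaling.
chosen <- names(which.max(scaled))

print(scaled)
print(chosen)
```

With unit costs the split on Start would have the largest improvement; doubling its cost halves its scaled improvement, so Age is preferred in this toy case.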
itree is based on the code of rpart, but with some extensions targeted at growing interpretable/parsimonious trees. Bug reports and the like should be directed to this package's maintainer – not rpart's.
An object of class itree. See itree.object.
Breiman, Friedman, Olshen, and Stone (1984). Classification and Regression Trees. Wadsworth.
Buja, Andreas and Lee, Yung-Seop (2001). Data Mining Criteria for Tree-Based Regression and Classification. Proceedings of KDD 2001, 27-36.
itree.control, itree.object, summary.itree, print.itree
library(itree)  # load the package before running the examples

# CART (same as rpart):
fit <- itree(Kyphosis ~ Age + Number + Start, data=kyphosis)
fit2 <- itree(Kyphosis ~ Age + Number + Start, data=kyphosis,
              parms=list(prior=c(.65,.35), split='information'))
fit3 <- itree(Kyphosis ~ Age + Number + Start, data=kyphosis,
              control=itree.control(cp=.05))
par(mfrow=c(1,2), xpd=NA) # otherwise on some devices the text is clipped
plot(fit)
text(fit, use.n=TRUE)
plot(fit2)
text(fit2, use.n=TRUE)

#### new to itree:
# same example, but using one-sided extremes:
fit.ext <- itree(Kyphosis ~ Age + Number + Start, data=kyphosis,
                 method="extremes", parms=list(classOfInterest="absent"))
# we see buckets with every y="absent":
plot(fit.ext); text(fit.ext, use.n=TRUE)

library(mlbench); data(BostonHousing)

# one-sided purity:
fit4 <- itree(medv~., BostonHousing, method="purity", minbucket=25)

# low means tree:
fit5 <- itree(medv~., BostonHousing, method="extremes", parms=-1, minbucket=25)

# new variable penalty:
fit6 <- itree(medv~., BostonHousing, penalty="newvar", interp_param1=.2)

# ema penalty:
fit7 <- itree(medv~., BostonHousing, penalty="ema", interp_param1=.1)

# one-sided purity + new variable penalty:
fit8 <- itree(medv~., BostonHousing, method="purity", penalty="newvar",
              interp_param1=.2)

# one-sided extremes for classification must specify a "class of interest"
data(PimaIndiansDiabetes)
levels(PimaIndiansDiabetes$diabetes)
fit9.a <- itree(diabetes~., PimaIndiansDiabetes, minbucket=50,
                method="extremes", parms=list(classOfInterest="neg"))
plot(fit9.a); text(fit9.a)

# can also pass the index of the class of interest in levels().
fit9.b <- itree(diabetes~., PimaIndiansDiabetes, minbucket=50,
                method="extremes", parms=list(classOfInterest=1))
# so fit9.a = fit9.b