R rpms

rpms is an implementation of decision trees and random forests for R.


Installation

install.packages('rpms')


Usage

A model is trained like:

tree <- rpms(y ~ va + vb + vc, data=data)

Note that the dependent variable (y in the above example) must be numeric. Any other class produces error messages such as "'list' object cannot be coerced to type 'double'".
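One way to avoid that error is to check and convert the dependent variable before fitting. The following is a minimal sketch in base R; the toy data frame and its values are made up for illustration, and only the column names y and va follow the example above.

```r
# Toy data (hypothetical): y was read in as text rather than as numbers
data <- data.frame(
  y  = c("1.5", "2.0", "3.25"),
  va = c(0.1, 0.4, 0.9),
  stringsAsFactors = FALSE
)

# Convert only if needed; as.character() first so factors convert by label
if (!is.numeric(data$y)) {
  data$y <- as.numeric(as.character(data$y))
}

# Fail early, before the model call, if the conversion did not work
stopifnot(is.numeric(data$y), !anyNA(data$y))
```

Values that cannot be parsed as numbers become NA with a warning, so the final check stops the script before the model is fit.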

Partitions are determined through randomized permutation and hypothesis tests.
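The general idea of a permutation test for a candidate split can be sketched in base R. This illustrates the technique only and is not the package's internal algorithm; the simulated data, the split_stat helper, the cut point of 0.5, and the 999 permutations are all assumptions made for the example.

```r
set.seed(123)

# Simulated data (hypothetical): y shifts upward when va exceeds 0.5
n  <- 100
va <- runif(n)
y  <- 2 * (va > 0.5) + rnorm(n)

# Test statistic: absolute difference in mean outcome across the split
split_stat <- function(y, x, cut) {
  abs(mean(y[x > cut]) - mean(y[x <= cut]))
}

observed <- split_stat(y, va, 0.5)

# Shuffling y breaks any association with va, giving the null distribution
perm_stats <- replicate(999, split_stat(sample(y), va, 0.5))

# Share of permuted statistics at least as extreme as the observed one
p_value <- (1 + sum(perm_stats >= observed)) / (1 + length(perm_stats))
```

A small p_value suggests the split separates the outcome more than chance alone would, which is the kind of evidence used to decide whether to partition.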

Using a trained model, the predicted clusters can be attached to a (new) dataset like:

data$node <- end_nodes(tree, newdata=data)

Similarly, the predicted outcomes (which are uniform within a predicted cluster) can be attached to a (new) dataset like:

data$prediction <- predict(tree, newdata=data)
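Once both columns are attached, the claim that predictions are uniform within a cluster can be checked directly. The sketch below uses a small stand-in data frame with made-up node labels and predictions in place of real model output.

```r
# Stand-in for a dataset that already has node and prediction attached
data <- data.frame(
  node       = c(2, 2, 3, 3, 3),
  prediction = c(1.2, 1.2, 4.7, 4.7, 4.7)
)

# Count the distinct predictions inside each node; uniform means exactly one
distinct_per_node <- tapply(data$prediction, data$node,
                            function(p) length(unique(p)))
all_uniform <- all(distinct_per_node == 1)

# One row per node with its single predicted value
node_table <- unique(data[, c("node", "prediction")])
```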

Options for Complex Survey Design

The hypothesis tests used in this package support complex survey designs.

tree <- rpms(y ~ va + vb + vc, data=data, weights=~wtvar, strata=~stratavar, clusters=~clustervar)

Given clusters, permutation is performed in a two-step algorithm: first across clusters, then within clusters. This algorithm does not perform well when the clusters vary significantly in (effective) size.
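The two steps can be sketched in base R as follows. This is one reading of the idea and not the package's actual code: the two_step_permute function is hypothetical, and it assumes equal-sized clusters so that whole clusters can be swapped. That assumption also hints at why clusters of very different sizes are a problem.

```r
set.seed(42)

# Toy data (hypothetical): 4 equal-sized clusters of 3 observations each
cluster <- rep(1:4, each = 3)
y <- 1:12

two_step_permute <- function(y, cluster) {
  idx <- split(seq_along(y), cluster)   # row positions of each cluster
  # This sketch only handles equal-sized clusters
  stopifnot(length(unique(lengths(idx))) == 1)

  # Step 1: permute across clusters (each cluster receives another's block)
  src <- sample.int(length(idx))

  out <- y
  for (k in seq_along(idx)) {
    block <- y[idx[[src[k]]]]
    # Step 2: permute within the cluster
    out[idx[[k]]] <- block[sample.int(length(block))]
  }
  out
}

y_perm <- two_step_permute(y, cluster)
```

Every value in y_perm still travels with the other members of its original cluster, so the within-cluster dependence is preserved under the permutation.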

Visualization

To plot a specific partition, try:

node_plot(tree, node=1, data=data)

To render the entire tree, try the qtree function. This generates LaTeX figure markup which can be rendered separately. Note that rendering the figure requires the lscape and tikz-qtree LaTeX packages to be loaded in the document preamble.
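A wrapper document that loads both LaTeX packages can be written from R. In the sketch below the figure markup is a placeholder comment, not real qtree output, and the file name tree.tex is arbitrary. Collecting the real markup with capture.output(qtree(tree)) is an assumption about how qtree emits its output.

```r
# Placeholder standing in for the markup generated by qtree(tree).
# Assumption: the real markup could be collected with
# capture.output(qtree(tree)); that is not verified here.
figure_markup <- "% qtree output goes here"

# Minimal LaTeX document that loads the two required packages
tex_lines <- c(
  "\\documentclass{article}",
  "\\usepackage{lscape}",
  "\\usepackage{tikz-qtree}",
  "\\begin{document}",
  figure_markup,
  "\\end{document}"
)

# Write to a temporary location; compile it with any LaTeX engine
tex_path <- file.path(tempdir(), "tree.tex")
writeLines(tex_lines, tex_path)
```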

Random Forests

forest <- rpms_forest(y ~ va + vb + vc, data=data)

Uniformly random trees are generated, and their predictions are then aggregated as a weighted average. Each tree is weighted by its inverse variance.
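Inverse-variance weighting can be illustrated in base R. The numbers below are made up, and which variance the package uses for each tree is not stated here, so tree_var is only a hypothetical per-tree variance.

```r
# Hypothetical predictions from three trees for a single observation
tree_pred <- c(10, 12, 20)

# Hypothetical variance associated with each tree (lower = more trusted)
tree_var <- c(1, 4, 16)

# Inverse-variance weights, normalized to sum to 1
w <- (1 / tree_var) / sum(1 / tree_var)

# The aggregated prediction is the weighted average of the tree predictions
forest_pred <- sum(w * tree_pred)
```

The first tree has the smallest variance and therefore the largest weight, so forest_pred lands much closer to 10 than the unweighted mean of 14 does.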



R/Rpms (last edited 2026-04-07 20:56:05 by DominicRicottone)