= R rpms =

'''rpms''' is an implementation of [[Statistics/DecisionTrees|decision trees]] and [[Statistics/RandomForest|random forests]] for R.

----



== Installation ==

{{{
install.packages('rpms')
}}}

----



== Usage ==

A model is trained like:

{{{
tree <- rpms(y ~ va + vb + vc, data=data)
}}}

Note that the dependent variable (`y` in the above example) must be [[R/DataTypes#Numeric|numeric]]; any other class leads to error messages like "'list' object cannot be coerced to type 'double'".

Partitions are determined through randomized permutation hypothesis tests.

Using a trained model, the predicted clusters can be attached to a (new) dataset like:

{{{
data$node <- end_nodes(tree, newdata=data)
}}}

Similarly, the predicted outcomes (which are uniform within a predicted cluster) can be attached to a (new) dataset like:

{{{
data$prediction <- predict(tree, newdata=data)
}}}



=== Options for Complex Survey Design ===

The hypothesis tests used in this package support complex survey designs.

{{{
tree <- rpms(y ~ va + vb + vc, data=data, weights=~wtvar, strata=~stratavar, cluster=~clustervar)
}}}

Given clusters, the permutation is performed in a [[APermutationTestOnComplexSampleData|two-step algorithm]]: first across clusters and then within clusters. This algorithm does not perform well when the clusters vary substantially in (effective) size.



=== Visualization ===

To plot a specific partition, try:

{{{
node_plot(tree, node=1, data=data)
}}}

To render the entire tree, try the `qtree` function. This generates [[LaTeX]] figure markup which can be rendered separately. Rendering the figure requires the `lscape` and `tikz-qtree` LaTeX packages.



=== Random Forests ===

{{{
forest <- rpms_forest(y ~ va + vb + vc, data=data)
}}}

Uniformly random trees are generated and then aggregated as a weighted average. The trees are weighted by [[Statistics/InverseVarianceWeights|inverse variance]].



----
CategoryRicottone