R XGBoost
XGBoost is a software implementation of gradient boosting for estimating decision trees. This article is specifically about the official R bindings.
Installation
install.packages('xgboost')
Usage
Some data preparation is required, e.g. partitioning the data into training and testing sets.

data(mtcars)

# Identify the independent variables
predictors <- c("disp", "wt", "cyl", "gear", "carb")

# Partition the data set (createDataPartition comes from the caret package;
# list = FALSE returns a vector of row indices rather than a list)
library(caret)
set.seed(42)  # for reproducibility
parts <- createDataPartition(mtcars$mpg, p = .8, list = FALSE)
train <- mtcars[parts, ]
test  <- mtcars[-parts, ]

# Create analytic matrices (xgboost requires numeric matrices, not data frames)
train.x <- data.matrix(train[, predictors])
test.x  <- data.matrix(test[, predictors])
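The data.matrix step matters because xgboost operates on numeric matrices rather than data frames. A minimal base-R illustration of the conversion (toy data, no xgboost required):

```r
# data.matrix coerces every column of a data frame to numeric and
# returns a matrix -- the form the DMatrix constructor expects.
df <- data.frame(disp = c(160, 108), wt = c(2.62, 2.32), cyl = c(6, 4))
m <- data.matrix(df)
stopifnot(is.matrix(m), is.numeric(m))
print(dim(m))  # 2 rows, 3 columns
```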
Try:
library(xgboost)

# Note: data is the analytic matrix, label is the outcome
xgb.train <- xgb.DMatrix(data = train.x, label = train$mpg)
xgb.test  <- xgb.DMatrix(data = test.x, label = test$mpg)

# Estimate model on the training data set
my.model <- xgboost(data = xgb.train, max.depth = 3, nrounds = 70)
[snip]

# Predict the outcome using the trained model and the test data set
pred <- predict(my.model, xgb.test)
Note that the xgboost function is a wrapper around xgb.train.
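Because of this, the fit above can also be written against the lower-level interface directly. A sketch, assuming the xgb.train DMatrix from the previous block; note that the tuning parameters move into a params list (the objective name follows recent xgboost releases):

```r
library(xgboost)

# Lower-level equivalent of xgboost(data = xgb.train, max.depth = 3, nrounds = 70):
# booster parameters are collected in a list and passed as params.
params <- list(max_depth = 3, objective = "reg:squarederror")
my.model <- xgb.train(params = params, data = xgb.train, nrounds = 70)
```

Despite the clash between the variable name xgb.train and the function of the same name, R looks up the call position as a function, so this runs as written.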
There is also an xgb.cv function that automatically partitions the data set into random, equal-sized folds. Each fold is held out in turn for testing, while the remaining folds are used for training, so every observation contributes to both training and validation across the cross-validation run.
# xgb.DMatrix expects a numeric matrix, hence data.matrix
my.data <- xgb.DMatrix(data = data.matrix(mtcars[, predictors]), label = mtcars$mpg)
my.cv   <- xgb.cv(data = my.data, nfold = 5, nrounds = 3, max_depth = 3)
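The list returned by xgb.cv carries the per-round cross-validation metrics in its evaluation_log element (a data.table; column names such as test_rmse_mean follow the xgboost documentation). A sketch, rerunning the call above into its own variable so the result is unambiguous, assuming my.data from the previous block:

```r
library(xgboost)

# Rerun the cross-validation and inspect the per-round test error
cv.res <- xgb.cv(data = my.data, nfold = 5, nrounds = 3, max_depth = 3)
print(cv.res$evaluation_log)
```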
To get the importance matrix (the importance metrics as a data.table object), pass a trained booster such as the my.model returned by xgboost earlier; the object returned by xgb.cv holds cross-validation summaries, not a single trained model, and cannot be used here. Try:

my.importance <- xgb.importance(model = my.model)
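The importance table can also be visualized; xgb.plot.importance (part of the xgboost package) takes the data.table produced by xgb.importance:

```r
library(xgboost)

# Bar chart of feature importance (the Gain measure, per the xgboost docs)
xgb.plot.importance(importance_matrix = my.importance)
```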