A Swift interface for XGBoost.
The current interface wraps the C API of XGBoost and tries to conform to the Python API. For documentation, see the docs.
If you run into any problems, please file an issue or, even better, a pull request.
Follow the XGBoost documentation to install or build the library.
Developed and tested under macOS 10.15 with XGBoost installed via brew install xgboost. The C header file and library are located through pkg-config, so it should work directly. Otherwise, place a pkg-config file at /usr/local/lib/pkgconfig/xgboost.pc with the following content:
prefix=/usr/local/Cellar/xgboost/1.1.0
exec_prefix=${prefix}/bin
libdir=${prefix}/lib
includedir=${prefix}/include
Name: xgboost
Description: XGBoost
Version: 1.1.0
Cflags: -I${includedir}
Libs: -L${libdir} -lxgboost
Please read through the following links for more configuration detail.
Ubuntu is tested using Swift's Docker image; the latest version is Ubuntu 18.04 for now.
Please follow the XGBoost documentation to install or build the library, or check the Dockerfile Dockerfile_test_ubuntu that is used for testing.
It evolves quickly, so please check the version regularly.
.package(url: "https://github.com/ddxgz/XGBoost.swift.git", from: "0.6.0")
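For context, a minimal Package.swift sketch showing where that dependency line goes. The package and target name MyApp is hypothetical, and the product name "XGBoost" is an assumption; check the repository's own Package.swift for the actual product name.

```swift
// swift-tools-version:5.1
import PackageDescription

let package = Package(
    name: "MyApp", // hypothetical package name
    dependencies: [
        .package(url: "https://github.com/ddxgz/XGBoost.swift.git", from: "0.6.0"),
    ],
    targets: [
        .target(
            name: "MyApp", // hypothetical target name
            // Assumption: the library product is called "XGBoost".
            dependencies: ["XGBoost"]
        ),
    ]
)
```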
You can find more examples in the test files in the code repository.
Still in early development; use with caution.
let train = try DMatrix(fromFile: "data/agaricus.txt.train")
let test = try DMatrix(fromFile: "data/agaricus.txt.test")
let bst = try xgboost(data: train, numRound: 10)
let pred = bst.predict(data: test)
let cvResult = try xgboostCV(data: train, numRound: 10)
// save and load model as binary
let modelBin = "bst.bin"
try bst.saveModel(toFile: modelBin)
let bstLoaded = try xgboost(data: train, numRound: 0, modelFile: modelBin)
// save and load model as json
let modelJson = "bst.json"
try bst.saveModel(toFile: modelJson)
let bstJsonLoaded = try xgboost(data: train, numRound: 0, modelFile: modelJson)
// save model config
try bst.saveConfig(toFile: "config.json")
// Default LibSVM text format file
let datafile = "data/agaricus.txt.train"
let train = try DMatrix(fromFile: datafile)
// Load DMatrix from a CSV file by setting the `format` parameter
let csv = "data/train.csv"
let trainCSV = try DMatrix(fromFile: csv, format: "csv")
// or by providing format URI
let csv2 = "data/train.csv?format=csv"
let trainCSV2 = try DMatrix(fromFile: csv2, format: "csv")
// Construct from array of Floats, by setting shape, missing values will be filled
// in automatically or by setting `missing`.
let matWithNa = try DMatrix(fromArray: [Float]([0, 1, 2, 3]), shape: (10, 10))
// Data slicing by array of indexes
let trainSliced = train.slice(rows: [0, 3])!
// by range
let trainRanged = train.slice(rows: 0 ..< 10)!
// allow groups
let trainSlicedGroup = train.slice(rows: [0, 3], allowGroups: true)!
// get shape of DMatrix
train.shape
train.numRow
train.numCol
// Save DMatrix to binary file
let dmFile = "Tests/tmp/dmfile.sliced"
try trainSliced.saveBinary(toFile: dmFile)
// Get DMatrix labels
let labels = train.label
// Get weights
let weights = train.weight
// Set weights
let weightSet = [Float]([1, 3, 4])
// Get base_margins
let base_margins = train.base_margin
let train = try DMatrix(fromFile: "data/agaricus.txt.train")
let test = try DMatrix(fromFile: "data/agaricus.txt.test")
let params = [
("objective", "binary:logistic"),
("max_depth", "9"),
("eval_metric", "auc"),
("eval_metric", "aucpr"),
]
// Construct booster while boosting
let bst = try xgboost(params: params, data: train, numRound: 10)
// Set parameters by passing dictionary or name-value pair
bst.setParam(params)
bst.setParam(name: "alpha", value: "0.1")
let result = bst.predict(data: test)
// Save model to file
let modelfile = "Tests/tmp/bst.model"
try bst.saveModel(toFile: modelfile)
// Construct booster from file
let bstJsonLoaded2 = try Booster(params: params, cache: [train],
modelFile: modelfile)
// Save config to json file
let configfile = "Tests/tmp/config.json"
try bst.saveConfig(toFile: configfile)
// Cross Validation
let cvResults = try xgboostCV(params: params, data: train, numRound: 10, nFold: 5)
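The parameter list above repeats the eval_metric name, which an ordered list of name-value pairs can hold but a dictionary cannot. The following standalone sketch illustrates the difference using only the Swift standard library; the helper values(for:in:) is hypothetical and not part of XGBoost.swift.

```swift
// Same shape as the parameter list above: ordered (name, value) pairs.
let params: [(String, String)] = [
    ("objective", "binary:logistic"),
    ("max_depth", "9"),
    ("eval_metric", "auc"),
    ("eval_metric", "aucpr"),
]

// Hypothetical helper (not a library function): collect every value given for a name.
func values(for name: String, in params: [(String, String)]) -> [String] {
    return params.filter { $0.0 == name }.map { $0.1 }
}

// Both metrics survive in the pair list, in their original order.
let metrics = values(for: "eval_metric", in: params)
print(metrics)

// Folding the same pairs into a dictionary keeps only one value per name
// (here the last one wins), so one of the two metrics is lost.
let collapsed = Dictionary(params, uniquingKeysWith: { _, last in last })
print(collapsed["eval_metric"] ?? "")
```

If several values for one name matter, as with multiple evaluation metrics, the pair list is the safer form to pass.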
SimplePrintEvalution and EarlyStop are simple built-in examples of callbacks.
You can also define a custom callback that conforms to the XGBCallback protocol; see the protocol's documentation for more.
let train = try DMatrix(fromFile: "data/agaricus.txt.train")
let test = try DMatrix(fromFile: "data/agaricus.txt.test")
let callbacks = [SimplePrintEvalution(period: 5),
EarlyStop(stoppingRounds: 5)]
let bst = try xgboost(data: train, numRound: 10,
evalSet: [(train, "train"), (test, "test")],
callbacks: callbacks)
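To give a feel for what a stopping rule such as stoppingRounds: 5 usually means, here is a generic patience-based early-stopping sketch on a plain array of per-round scores. It is an illustration only and not the library's EarlyStop implementation, which may differ in details; the function name earlyStopRound is hypothetical, and it assumes a higher score is better (for example AUC).

```swift
// Generic patience-based early stopping (illustration, not the library's EarlyStop).
// Returns the index of the round at which training would stop, or nil if the
// score kept improving often enough. Assumes a higher score is better.
func earlyStopRound(scores: [Float], stoppingRounds: Int) -> Int? {
    var bestScore = -Float.infinity
    var bestRound = 0
    for (round, score) in scores.enumerated() {
        if score > bestScore {
            // Strict improvement: remember it and reset the patience window.
            bestScore = score
            bestRound = round
        } else if round - bestRound >= stoppingRounds {
            // No improvement for `stoppingRounds` consecutive rounds.
            return round
        }
    }
    return nil
}

let aucPerRound: [Float] = [0.70, 0.75, 0.80, 0.79, 0.80, 0.78, 0.77]
let stopAt = earlyStopRound(scores: aucPerRound, stoppingRounds: 3)
print(stopAt as Any)
```

In this sketch a tie with the best score does not count as an improvement, which is one of several reasonable choices.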
A custom objective function has the signature FuncObj, and a custom evaluation function has the signature FuncEval.
import Foundation // provides exp, used in logLossObj below
let train = try DMatrix(fromFile: "data/agaricus.txt.train")
let test = try DMatrix(fromFile: "data/agaricus.txt.test")
func dumEval(preds: [Float], dmatrix: DMatrix) -> (String, Float) {
let labels = dmatrix.label
let predicts = preds.map { x -> Float in
if x > 0 { return 1.0 } else { return 0.0 }
}
var cnt: Float = 0
for (label, pred) in zip(labels, predicts) {
if label == pred {
cnt += 1
}
}
return ("dumEval", cnt / Float(labels.count))
}
func logLossObj(preds: [Float], dmatrix: DMatrix) -> ([Float], [Float]) {
let labels = dmatrix.label
let predicts = preds.map { x -> Float in
Float(1.0 / (1.0 + exp(-x)))
}
var grad = [Float](), hess = [Float]()
for (label, pred) in zip(labels, predicts) {
grad.append(pred - label)
hess.append(pred * (1.0 - pred))
}
return (grad, hess)
}
let callbacks = [SimplePrintEvalution(period: 5)]
let bst = try xgboost(data: train, numRound: 10,
evalSet: [(train, "train"), (test, "test")],
fnObj: logLossObj,
fnEval: dumEval,
callbacks: callbacks)
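The gradient and Hessian returned by logLossObj follow from the logistic loss: with p = 1 / (1 + exp(-x)) and label y, the first derivative with respect to the raw score x is p - y and the second is p * (1 - p). The sketch below repeats the same math on plain arrays, with no DMatrix, so it can be checked without XGBoost; the names sigmoid and logLossGradHess are hypothetical.

```swift
import Foundation

// Logistic sigmoid of a single raw score.
func sigmoid(_ x: Float) -> Float {
    return 1.0 / (1.0 + exp(-x))
}

// Same math as logLossObj above, but taking the labels as a plain array.
func logLossGradHess(preds: [Float], labels: [Float]) -> ([Float], [Float]) {
    let probs = preds.map(sigmoid)
    let grad = zip(probs, labels).map { p, y in p - y } // dL/dx = p - y
    let hess = probs.map { p in p * (1.0 - p) }         // d2L/dx2 = p(1 - p)
    return (grad, hess)
}

// A raw score of 0 maps to probability 0.5.
let (grad, hess) = logLossGradHess(preds: [0, 0], labels: [1, 0])
print(grad, hess) // [-0.5, 0.5] [0.25, 0.25]
```

Since the sigmoid of a raw score is above 0.5 exactly when the score is positive, the threshold x > 0 in dumEval corresponds to a 0.5 probability cutoff, assuming the predictions passed to it are raw scores.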