Closed
Description
Hi there,
lime does not currently seem to support NAs in the data. Here's a reproducible example:
library(caret)
library(lime)
set.seed(123)
x = as.data.frame(matrix(rnorm(100*10), ncol=10))
x$V1 = ifelse(x$V2 > 0, NA, x$V1) # introduce random NAs in V1
y = round(runif(100))
y = as.factor(y)
levels(y) = c("no", "yes")
data = cbind(x, target = y)
fitControl <- trainControl(method = "repeatedcv",
                           number = 10,
                           repeats = 1,
                           allowParallel = TRUE,
                           classProbs = TRUE,
                           summaryFunction = twoClassSummary)
XGBModel = train(target ~ .,
                 data = data,
                 trControl = fitControl,
                 method = "xgbTree",
                 search = "random",
                 metric = "ROC",
                 na.action = na.pass) # force XGB to take NAs into account
prediction = predict.train(XGBModel, data, na.action = na.pass, type = "prob") # works fine
explain = lime(data, XGBModel, bin_continuous = TRUE, n_permutations = 1000) # error
# Error in quantile.default(x[[i]], seq(0, 1, length.out = n_bins + 1)) :
#   missing values and NaN's not allowed if 'na.rm' is FALSE
explain = lime(na.omit(data), XGBModel, bin_continuous = TRUE, n_permutations = 1000) # works fine
It would be great if lime could handle NAs the way XGBoost does.
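To illustrate what I mean, here is a rough base-R sketch of NA-tolerant binning. It assumes the binning step relies on quantile cut points, as the error message suggests. The helper name `bin_with_na` is hypothetical and not part of lime; this is only one possible approach, not how lime works internally.

```r
# Hypothetical helper (not part of lime): bin a numeric vector on quantile
# cut points and keep missing values as their own explicit level.
bin_with_na <- function(x, n_bins = 4) {
  # na.rm = TRUE avoids the "missing values and NaN's not allowed" error
  cuts <- quantile(x, seq(0, 1, length.out = n_bins + 1), na.rm = TRUE)
  # unique() guards against duplicated break points in heavily tied data
  bins <- cut(x, breaks = unique(cuts), include.lowest = TRUE)
  # addNA() turns NA into a factor level when any NA is present
  addNA(bins, ifany = TRUE)
}

set.seed(123)
v <- rnorm(100)
v[v > 1] <- NA                          # introduce some NAs
table(bin_with_na(v), useNA = "ifany")  # NAs are counted in their own bin
```

With something like this, rows containing NAs would not have to be dropped with `na.omit()` before building the explainer.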
Thanks guys for your work!