Integer variables converted to double during LIME #32

@fennovj

I'll try to produce a minimal reproducible example. I tried a few caret methods, and none of them failed. However, it does fail with the ctree classifier from the party library, which appears to be strict about rejecting a test set that contains doubles where the training set had integers.

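library(magrittr)  # provides the %>% pipe used in predict.myclass

# Wrap the ctree model in a custom class so lime dispatches to our own methods
makeGeneric <- function(ctreemodel){
  return(structure(list(ctreemodel), class = "myclass"))
}

# Return class probabilities as a data frame (binary case only)
predict.myclass <- function(model, newdata, type="prob", ...){
  stopifnot(type == "prob")
  predict(model[[1]], newdata, type = "prob") %>% data.frame %>% t %>% 
    data.frame("false" = 1 - ., "true" = .)
}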

model_type.myclass <- function(x, ...) "classification"

FT <- read.csv("https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/datasets/Titanic.csv")
FT <- na.omit(FT)
FT$Age <- as.integer(FT$Age)

ctreemodel <- party::ctree(Survived ~ PClass + Sex + Age, FT[-1,])
genericModel <- makeGeneric(ctreemodel) 

explainer <- lime::lime(FT[-1,], genericModel)
explanation <- lime::explain(FT[1,], explainer, n_labels = 1, n_features = 2) 

As you can see, I define a class that predicts using the ctree model. The data.frame("false" = 1 - ., "true" = .) call only works for binary classification, but that is not the issue here, since it is easy to extend to multiclass classification. party then throws the following error:

 Error in checkData(oldData, RET) : 
  Classes of new data do not match original data 
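
To see which columns trigger a class mismatch like this, the column classes of the training data and the data passed to predict can be compared directly. The following is a minimal base R sketch; `class_mismatches` is a hypothetical helper name, not part of lime or party, and the two small data frames are made-up stand-ins for the real data.

```r
# Hypothetical helper (not part of lime or party): list the columns whose
# first class differs between the original data and the new data.
class_mismatches <- function(original, newdata) {
  shared <- intersect(names(original), names(newdata))
  orig_class <- vapply(original[shared], function(col) class(col)[1], character(1))
  new_class  <- vapply(newdata[shared],  function(col) class(col)[1], character(1))
  differs <- orig_class != new_class
  data.frame(
    column   = shared[differs],
    original = unname(orig_class[differs]),
    newdata  = unname(new_class[differs]),
    stringsAsFactors = FALSE
  )
}

# Made-up example data: Age is integer in `train` but double in `perturbed`.
train     <- data.frame(Age = c(29L, 2L, 30L),
                        Sex = factor(c("female", "female", "male")))
perturbed <- data.frame(Age = c(29, 2.5, 30),
                        Sex = factor(c("female", "female", "male")))

mismatch <- class_mismatches(train, perturbed)
print(mismatch)
```

Calling something like this from inside predict.myclass (for example under browser()) would show whether Age is the only affected column.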

Note that this error does not occur when I comment out the FT$Age <- as.integer(FT$Age) line.
I used browser() inside the predict.myclass function, and it turned out that the newdata passed by lime had its integer variable replaced by a double. The predict(model[[1]], newdata, type = "prob") call then fails, because it expects Age to be an integer, but lime has converted it to a double somewhere along the way.
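
A possible workaround on the user side, until the conversion is handled in lime itself, is to coerce the affected columns back to integer before calling predict. This is only a sketch in base R: `restore_integer_columns` is a hypothetical helper, and rounding is an assumption on my part about what is acceptable for a column like Age.

```r
# Hypothetical helper: wherever the reference (training) data stores a column
# as integer, coerce the same column in `newdata` back to integer.
restore_integer_columns <- function(newdata, reference) {
  for (name in intersect(names(reference), names(newdata))) {
    if (is.integer(reference[[name]]) && !is.integer(newdata[[name]])) {
      # Assumption: rounding to the nearest whole number is acceptable here.
      newdata[[name]] <- as.integer(round(newdata[[name]]))
    }
  }
  newdata
}

# Made-up example data standing in for the training set and lime's newdata.
reference <- data.frame(Age = c(29L, 2L, 30L))
perturbed <- data.frame(Age = c(28.7, 3.2, 30))

fixed <- restore_integer_columns(perturbed, reference)
str(fixed)
```

In the example above, this could be called at the top of predict.myclass, with the training data kept alongside the model in the wrapper object. Storing the reference data there is itself an assumption about how the wrapper would be extended.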
