R/preprocessing.R
filter.Rd
To keep the paper similarity computation reasonable, [filter_var_type()] subsets the decision table to the most common variable-type combinations across all papers. [filter_papers()] subsets it to the papers with at least x decisions, or to the top x papers with the most decisions.
filter_papers(df, n = NULL, n_value = NULL)
filter_var_type(df, n = NULL, n_value = NULL)
raw_df <- read.csv(system.file("papers.csv", package = "dossier")) |> tibble::as_tibble()
tbl_df <- as_decision_tbl(raw_df)
df <- tbl_df |> filter_var_type(n = 6)
df2 <- tbl_df |> filter_var_type(n_value = 3)
identical(df, df2)
#> [1] TRUE
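
The two selection rules in the description (a top-x cut and an "at least x" threshold) can return the same rows, as the `identical()` call above shows. The following is a minimal base R sketch of that idea on a toy table, not the package's implementation. The data frame `decisions`, its columns, and the helpers `count_by_paper()`, `keep_top_n()` and `keep_at_least()` are all hypothetical, and the mapping of `n` to the top-x rule and `n_value` to the threshold rule is an assumption read from the description.

```r
# Hypothetical toy decision table: one row per decision recorded in a paper.
decisions <- data.frame(
  paper    = c("a", "a", "a", "b", "b", "c"),
  variable = c("temp", "time", "pm", "temp", "time", "temp"),
  type     = c("spatial", "temporal", "spatial",
               "spatial", "temporal", "spatial"),
  stringsAsFactors = FALSE
)

# Number of decisions per paper, sorted from most to fewest.
count_by_paper <- function(df) {
  sort(table(df$paper), decreasing = TRUE)
}

# Top-x rule (assumed to correspond to `n`): keep the n papers
# with the most decisions.
keep_top_n <- function(df, n) {
  counts <- count_by_paper(df)
  top <- names(counts)[seq_len(min(n, length(counts)))]
  df[df$paper %in% top, , drop = FALSE]
}

# Threshold rule (assumed to correspond to `n_value`): keep papers
# with at least n_value decisions.
keep_at_least <- function(df, n_value) {
  counts <- count_by_paper(df)
  keep <- names(counts)[counts >= n_value]
  df[df$paper %in% keep, , drop = FALSE]
}

top2 <- keep_top_n(decisions, n = 2)
min2 <- keep_at_least(decisions, n_value = 2)

# Both rules keep papers "a" and "b" here, so the results coincide.
identical(top2, min2)
```

In this toy table the papers have 3, 2 and 1 decisions, so "top 2 papers" and "at least 2 decisions" select the same rows. With different counts or ties the two rules can diverge, which is why both arguments are offered.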