Subset the decision table by the most common variable-type combinations and by paper

To keep the computation of paper similarity reasonable, [filter_var_type()] subsets the decision table to the most common variable-type combinations across all papers. [filter_papers()] subsets it to the papers with at least x decisions, or to the top x papers with the most decisions.

filter_papers(df, n = NULL, n_value = NULL)

filter_var_type(df, n = NULL, n_value = NULL)

Arguments

df: The decision table object
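n: Optional. A numeric value indicating the number of papers to keep, counted from the papers with the most decisions.
n_value: Optional. A numeric value indicating the minimum number of decisions a paper must have to be kept.
For filter_var_type(), both arguments apply to variable-type combinations rather than to papers.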

Examples

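# Read the example data shipped with the package and convert it to a decision table
raw_df <- read.csv(system.file("papers.csv", package = "dossier")) |> tibble::as_tibble()
tbl_df <- as_decision_tbl(raw_df)
# Keep the 6 most common variable-type combinations
df <- tbl_df |> filter_var_type(n = 6)
# Alternatively, filter by a minimum count of 3
df2 <- tbl_df |> filter_var_type(n_value = 3)
# For this data set both calls select the same rows
identical(df, df2)
#> [1] TRUE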
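
The two filtering modes described for filter_papers() can be illustrated in base R on a toy table. This is only a sketch of the idea, not the package's implementation: the `decisions` data frame, its `paper` and `decision` columns, and the thresholds are hypothetical and chosen for illustration.

```r
# Hypothetical toy table: one row per decision, `paper` names the source paper.
decisions <- data.frame(
  paper = c("a", "a", "a", "b", "b", "c"),
  decision = paste0("d", 1:6),
  stringsAsFactors = FALSE
)

# Count decisions per paper, ordered from most to fewest.
counts <- sort(table(decisions$paper), decreasing = TRUE)

# Mode analogous to `n`: keep the top 2 papers by number of decisions.
top_papers <- names(counts)[seq_len(min(2, length(counts)))]
top_n_df <- decisions[decisions$paper %in% top_papers, ]

# Mode analogous to `n_value`: keep papers with at least 2 decisions.
min_papers <- names(counts)[as.vector(counts) >= 2]
min_df <- decisions[decisions$paper %in% min_papers, ]

# Here both modes keep papers "a" and "b", so the results coincide.
identical(top_n_df, min_df)
```

The two modes agree only when the count threshold happens to cut the ranking at the same place as the top-n limit, which is what the package example above shows for `n = 6` and `n_value = 3`.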