`calc_decision_similarity()` computes a text similarity score for each matched decision within every paper pair. `compute_text_embed()` produces the text embedding object that can be passed to it as `embed`. `calc_paper_similarity()` then aggregates the decision-level scores into a single similarity score per paper pair.
calc_decision_similarity(
  df,
  embed = NULL,
  text_model = "bert-base-uncased",
  new_names = c("paper1", "paper2")
)
compute_text_embed(df, text_model = "bert-base-uncased")
calc_paper_similarity(res, .f = mean)
`df`: A decision table object.
`embed`: Optional. A text embedding object from `compute_text_embed()`.
`text_model`: The name of the text model to use. See `text::textEmbed()`.
`new_names`: A character vector of length 2 giving the column names for the paper pairs. Defaults to `c("paper1", "paper2")`.
`res`: The output from `calc_decision_similarity()`.
`.f`: Optional. A function that aggregates decision similarity scores into a paper similarity score. Defaults to `mean`.
A tibble object
if (FALSE) { # \dontrun{
library(text)
library(readr)
# preprocessing
tbl_df <- readr::read_csv(system.file("papers.csv", package = "dossier")) |>
  as_decision_tbl()
paper_df <- tbl_df |>
  filter_var_type(n = 6) |> # first 6 variable-type pairs
  filter_papers(n_value = 3)
# compute the text embeddings
embed_df <- paper_df |> compute_text_embed()
# calculate decision similarity
distance_decision_df <- calc_decision_similarity(paper_df, embed = embed_df)
# aggregate from decision similarity to paper similarity
distance_df <- distance_decision_df |> calc_paper_similarity()
} # }
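The aggregation step can be illustrated without the package or a language model. The following is a minimal base R sketch, not the package's implementation: the data frame, the `similarity` column name, and the helper `aggregate_pairs()` are all hypothetical and only mimic the idea of collapsing decision-level scores into one score per paper pair with a user-supplied `.f`.

```r
# Hypothetical decision-level scores. The column name `similarity` is an
# assumption for illustration, not the documented output of
# calc_decision_similarity().
decision_scores <- data.frame(
  paper1 = c("A", "A", "A", "B"),
  paper2 = c("B", "B", "C", "C"),
  similarity = c(0.9, 0.7, 0.4, 0.6)
)

# Hypothetical stand-in for the aggregation step: collapse the scores of
# each (paper1, paper2) pair into one value using the function `.f`.
aggregate_pairs <- function(scores, .f = mean) {
  stats::aggregate(similarity ~ paper1 + paper2, data = scores, FUN = .f)
}

# One row per paper pair, aggregated with the default (mean)
paper_scores <- aggregate_pairs(decision_scores)

# Any summary function can be swapped in, for example the median
paper_median <- aggregate_pairs(decision_scores, .f = median)
```

With the real functions, the analogous choice would be made through the `.f` argument, for example `calc_paper_similarity(res, .f = median)`.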