Shows the differences in a table between two snapshot versions or timestamps. Returns the added and removed rows and, when key_cols is supplied, the modified rows.
Usage
db_diff(
  schema = "main",
  table,
  from_version = NULL,
  to_version = NULL,
  from_timestamp = NULL,
  to_timestamp = NULL,
  key_cols = NULL,
  collect = TRUE
)
Arguments
- schema
Schema name (default "main")
- table
Table name
- from_version
Starting snapshot version (integer) or NULL to use from_timestamp
- to_version
Ending snapshot version (integer, default: current) or NULL to use to_timestamp
- from_timestamp
Starting timestamp (alternative to from_version)
- to_timestamp
Ending timestamp (alternative to to_version, default: current)
- key_cols
Character vector of columns that uniquely identify rows. If provided, enables detection of modified rows (not just added/removed).
- collect
If TRUE (default), returns collected data.frames. If FALSE, returns lazy tbl references.
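To make the role of key_cols concrete, the following is a conceptual sketch in base R on two in-memory data frames. It is not the package's implementation and does not touch a database: `sketch_diff`, `old`, and `new` are hypothetical names introduced here for illustration, and the string-based row comparison is a deliberate simplification.

```r
# Conceptual sketch only: illustrates added / removed / modified semantics.
# `sketch_diff`, `old`, and `new` are hypothetical, not part of the package.
sketch_diff <- function(old, new, key_cols) {
  # Collapse the chosen columns of each row into one comparable string.
  row_string <- function(df, cols) {
    do.call(paste, c(unname(df[cols]), sep = "\r"))
  }
  old_key <- row_string(old, key_cols)
  new_key <- row_string(new, key_cols)

  # Keys present on only one side are additions or removals.
  added   <- new[!new_key %in% old_key, , drop = FALSE]
  removed <- old[!old_key %in% new_key, , drop = FALSE]

  # Keys present on both sides, aligned to the same row order.
  common     <- intersect(old_key, new_key)
  old_common <- old[match(common, old_key), , drop = FALSE]
  new_common <- new[match(common, new_key), , drop = FALSE]

  # A row counts as modified when any non-key column differs
  # (compared as strings, which is a simplification).
  value_cols <- setdiff(names(new), key_cols)
  changed <- row_string(old_common, value_cols) !=
    row_string(new_common, value_cols)

  list(added = added, removed = removed,
       modified = new_common[changed, , drop = FALSE])
}

old <- data.frame(product_id = c(1, 2, 3), price = c(10, 20, 30))
new <- data.frame(product_id = c(2, 3, 4), price = c(20, 35, 40))
res <- sketch_diff(old, new, key_cols = "product_id")
res$added     # the row with product_id 4
res$removed   # the row with product_id 1
res$modified  # the row with product_id 3, whose price changed
```

Without a key, a changed row can only show up as one removal plus one addition, which is why modified rows are reported only when key_cols is provided.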
Examples
if (FALSE) { # \dontrun{
  db_lake_connect()
  # Compare versions 3 and 5
  diff <- db_diff(table = "products", from_version = 3, to_version = 5)
  diff$added
  diff$removed
  # Compare with key columns to see modifications
  diff <- db_diff(table = "products", from_version = 3, to_version = 5,
                  key_cols = "product_id")
  diff$modified
} # }