Create a list of nanoparquet options.
Arguments
- class
The extra class or classes to add to data frames created in read_parquet(). By default nanoparquet adds the "tbl" class, so data frames are printed differently if the pillar package is loaded.
- use_arrow_metadata
TRUE or FALSE. If TRUE, then read_parquet() and parquet_column_types() will make use of the Apache Arrow metadata to assign R classes to Parquet columns. This is currently used to detect factor columns and "difftime" columns.
If this option is FALSE:
  - "factor" columns are read as character vectors.
  - "difftime" columns are read as real numbers whose unit is one of seconds, milliseconds, microseconds or nanoseconds. It is impossible to tell which without using the Arrow metadata.
- write_arrow_metadata
Whether write_parquet() should add the Apache Arrow types as metadata to the file it writes.
Examples
if (FALSE) {
  # the effect of using Arrow metadata
  tmp <- tempfile(fileext = ".parquet")
  d <- data.frame(
    fct = as.factor("a"),
    dft = as.difftime(10, units = "secs")
  )
  write_parquet(d, tmp)
  read_parquet(tmp, options = parquet_options(use_arrow_metadata = TRUE))
  read_parquet(tmp, options = parquet_options(use_arrow_metadata = FALSE))
}
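The class and write_arrow_metadata arguments can be exercised in the same way. The following is a minimal sketch, not taken from the package documentation: it assumes that write_parquet() accepts an options argument just as read_parquet() does above, and "my_tbl" is a hypothetical class name chosen only for illustration.

```r
library(nanoparquet)

tmp <- tempfile(fileext = ".parquet")
d <- data.frame(fct = as.factor("a"))

# Write the file without the Apache Arrow type metadata
# (assumes write_parquet() takes `options`, like read_parquet()).
write_parquet(d, tmp, options = parquet_options(write_arrow_metadata = FALSE))

# Read it back, asking for an extra class on the result.
# "my_tbl" is a made-up class name used only for this sketch.
tagged <- read_parquet(tmp, options = parquet_options(class = "my_tbl"))

# Inspect the classes of the data frame and of the column that was
# a factor before writing.
class(tagged)
class(tagged$fct)

unlink(tmp)
```

Comparing class(tagged$fct) with the result from a file written with the default options should show whether the factor information survived the round trip.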