df2yaml is an R package distributed through CRAN. To install the package,
start R and enter:
# install via CRAN
install.packages("df2yaml")
# install via GitHub
# install.packages("remotes") # in case it is not installed yet
remotes::install_github("showteeth/df2yaml")
In general, installing from the GitHub repository is recommended, because it is updated more promptly.
Once df2yaml is installed, it can be loaded with the following command:
library(df2yaml)
The goal of df2yaml is to simplify the process of converting a dataframe
to YAML. A dataframe with multiple key columns and one value column (the
value column can itself contain key-value pairs) is converted to a
multi-level hierarchy.
Load the test data. It contains two key columns (paras and subcmd) and
one value column. The value column can also hold key-value pairs, written
as "key: value" and separated from each other by ";".
# library
library(df2yaml)
# load test file
test_file <- system.file("extdata", "df2yaml_l3.txt", package = "df2yaml")
test_data <- read.table(file = test_file, header = TRUE, sep = "\t")
head(test_data)
#>      paras      subcmd                                            values
#> 1   picard insert_size                                  MINIMUM_PCT: 0.5
#> 2   picard     markdup CREATE_INDEX: true; VALIDATION_STRINGENCY: SILENT
#> 3   preseq                                     -r 100 -seg_len 100000000
#> 4 qualimap                           --java-mem-size=20G -outformat HTML
#> 5    rseqc             mapq: 30; percentile-floor: 5; percentile-step: 5
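To make the structure of the value column concrete, here is a minimal base-R sketch of how one such string could be split into keys and values. The helper `parse_pairs()` and its arguments are hypothetical and written only for illustration; this is not df2yaml's actual implementation, and the ";" and ":" separators are taken from the test data above.

```r
# Hypothetical helper (not part of df2yaml): split "k1: v1; k2: v2"
# into a named list of character values.
parse_pairs <- function(x, pair_sep = ";", key_sep = ":") {
  # Strings without the key separator are plain values; return them unchanged.
  if (!grepl(key_sep, x, fixed = TRUE)) {
    return(trimws(x))
  }
  # One element per "key: value" pair.
  pairs <- trimws(strsplit(x, pair_sep, fixed = TRUE)[[1]])
  # Split each pair at the first key separator only, so values may contain ":".
  pos <- regexpr(key_sep, pairs, fixed = TRUE)
  keys <- trimws(substr(pairs, 1, pos - 1))
  vals <- trimws(substr(pairs, pos + nchar(key_sep), nchar(pairs)))
  as.list(setNames(vals, keys))
}

# A value holding two key-value pairs becomes a two-element named list.
str(parse_pairs("CREATE_INDEX: true; VALIDATION_STRINGENCY: SILENT"))

# A value without ":" is kept as a single string.
parse_pairs("-r 100 -seg_len 100000000")
```

The sketch keeps every value as a character string; type handling (for example `true` versus `30`) is left to whatever writes or reads the YAML.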
Convert the dataframe above to YAML:
yaml_res = df2yaml(df = test_data, key_col = c("paras", "subcmd"), val_col = "values")
cat(yaml_res)
#> preseq: -r 100 -seg_len 100000000
#> qualimap: --java-mem-size=20G -outformat HTML
#> rseqc:
#>   mapq: 30
#>   percentile-floor: 5
#>   percentile-step: 5
#> picard:
#>   insert_size:
#>     MINIMUM_PCT: 0.5
#>   markdup:
#>     CREATE_INDEX: true
#>     VALIDATION_STRINGENCY: SILENT
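One way to check the result is to parse the YAML back into a nested R list and save it to disk. The sketch below assumes the yaml package is installed (it is listed in the session information) and, to stay self-contained, uses an abridged literal copy of the YAML above with two-space indentation assumed; in a real session `yaml_res` could be passed instead of `yaml_text`.

```r
library(yaml)

# Abridged literal copy of the YAML above (an assumption for this sketch);
# in practice, pass yaml_res instead.
yaml_text <- paste(
  "rseqc:",
  "  mapq: 30",
  "picard:",
  "  markdup:",
  "    CREATE_INDEX: true",
  "    VALIDATION_STRINGENCY: SILENT",
  sep = "\n"
)

# Parse the YAML string into a nested list and walk the hierarchy.
parsed <- yaml::yaml.load(yaml_text)
parsed$picard$markdup$VALIDATION_STRINGENCY
parsed$rseqc$mapq

# Write the YAML to a temporary file and read it back.
out_file <- tempfile(fileext = ".yaml")
writeLines(yaml_text, out_file)
from_disk <- yaml::read_yaml(out_file)
str(from_disk)
```

Each key column becomes one level of the list, so a value is reached by chaining the keys with `$`.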
There is no limit on the number of key columns that can be used for the conversion.
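As an illustration, the call below passes three key columns instead of two. The dataframe `df3` and its column names (`tool`, `subcmd`, `stage`) are made up for this sketch and are not shipped with the package; only the `df`, `key_col` and `val_col` arguments are taken from the example above. The expected result is one more nesting level than in the two-column case, but the output is not shown here because it has not been run.

```r
library(df2yaml)

# Hypothetical data: three key columns and one value column.
df3 <- data.frame(
  tool   = c("picard", "picard", "rseqc"),
  subcmd = c("markdup", "markdup", "bam_stat"),
  stage  = c("input", "output", "filter"),
  values = c("VALIDATION_STRINGENCY: SILENT", "CREATE_INDEX: true", "mapq: 30"),
  stringsAsFactors = FALSE
)

# Same call as before, with a longer key_col vector.
yaml_res3 <- df2yaml(df = df3, key_col = c("tool", "subcmd", "stage"), val_col = "values")
cat(yaml_res3)
```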
sessionInfo()
#> R version 4.4.2 (2024-10-31)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.1 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] df2yaml_0.3.1 prettydoc_0.4.1
#>
#> loaded via a namespace (and not attached):
#> [1] vctrs_0.6.5 cli_3.6.3 knitr_1.49 rlang_1.1.4
#> [5] xfun_0.49 rrapply_1.2.7 generics_0.1.3 jsonlite_1.8.9
#> [9] glue_1.8.0 buildtools_1.0.0 htmltools_0.5.8.1 maketools_1.3.1
#> [13] sys_3.4.3 sass_0.4.9 fansi_1.0.6 rmarkdown_2.29
#> [17] tibble_3.2.1 evaluate_1.0.1 jquerylib_0.1.4 fastmap_1.2.0
#> [21] yaml_2.3.10 lifecycle_1.0.4 compiler_4.4.2 dplyr_1.1.4
#> [25] pkgconfig_2.0.3 digest_0.6.37 R6_2.5.1 tidyselect_1.2.1
#> [29] utf8_1.2.4 pillar_1.9.0 magrittr_2.0.3 bslib_0.8.0
#> [33] tools_4.4.2 cachem_1.1.0