Title: | Feel at Home using R, Thanks to Shortcuts Functions Making it Simple |
---|---|
Description: | A collection of personal functions designed to simplify and streamline common R programming tasks. This package provides reusable tools and shortcuts for frequently used calculations and workflows. |
Authors: | Raphaël Flambard [aut, cre], Adrien Cocuaud [ctb] |
Maintainer: | Raphaël Flambard <[email protected]> |
License: | GPL-3 |
Version: | 1.0.0 |
Built: | 2025-02-12 10:16:04 UTC |
Source: | https://github.com/r-alpha-act/r.alpha.home |
Adds dummy columns to reach the number specified by the user. this is mostly useful to ensure straightforward and easy data updating when using pivot tables in Excel. It allows replacement of the previous data sheet by the new one, without having to take care about the number of columns, which will always be the same.
cols_pad(data, nCols = 100, colPrefix = "x_")
cols_pad(data, nCols = 100, colPrefix = "x_")
data |
The data frame to which dummy columns will be added. |
nCols |
the total number of columns required : default is 100 |
colPrefix |
A string used as the prefix for the names of dummy columns. |
A data frame with the specified total number of columns.
table <- data.frame(a = 1:5, b = letters[1:5]) extraTable <- cols_pad(table, nCols = 6, colPrefix = "extra_") print(extraTable)
table <- data.frame(a = 1:5, b = letters[1:5]) extraTable <- cols_pad(table, nCols = 6, colPrefix = "extra_") print(extraTable)
Compares column names in two tables based on a given pattern. Provides information about which columns are present in which tables.
compareVars(x, y, pattern = "")
compareVars(x, y, pattern = "")
x |
A data frame representing the first table. |
y |
A data frame representing the second table. |
pattern |
A string pattern used to filter and compare only a subset of variables (column names). |
A list containing:
all
: All column names from both tables.
common
: Column names found in both tables.
onlyX
: Column names found only in the first table (x
).
onlyY
: Column names found only in the second table (y
).
exclusive
: Column names found in only one of the two tables.
# Example tables table1 <- data.frame(exclusive_1 = 1:5, common_1 = 6:10, common_2 = 11:15) table2 <- data.frame(common_1 = 16:20, common_2 = 21:25, exclusive_2 = 26:30) # Compare all columns (no pattern given) compare_all <- compareVars(table1, table2) compare_all$common compare_all$exclusive compare_all$onlyX compare_all$onlyY # compare only columns following a specific pattern compare_wPattern <- compareVars(table1, table2, pattern = "1") compare_wPattern$all compare_wPattern$common
# Example tables table1 <- data.frame(exclusive_1 = 1:5, common_1 = 6:10, common_2 = 11:15) table2 <- data.frame(common_1 = 16:20, common_2 = 21:25, exclusive_2 = 26:30) # Compare all columns (no pattern given) compare_all <- compareVars(table1, table2) compare_all$common compare_all$exclusive compare_all$onlyX compare_all$onlyY # compare only columns following a specific pattern compare_wPattern <- compareVars(table1, table2, pattern = "1") compare_wPattern$all compare_wPattern$common
This function aims at identifying sections and sub-sections numbers, based on markers of section starts and ends.
Given a data frame, and the name of a column giving the start/stop markers, it will add columns giving infos about the successive section levels
countSwitches( data, colNm, sttMark, endMark, includeStt = TRUE, includeEnd = TRUE )
countSwitches( data, colNm, sttMark, endMark, includeStt = TRUE, includeEnd = TRUE )
data |
A data frame containing the column to process. |
colNm |
A string specifying the column name in 'data' to evaluate. |
sttMark |
A value indicating the start of a series. |
endMark |
A value indicating the end of a series. |
includeStt |
Logical. Should the start marker be included as part of the series? Default is 'TRUE'. |
includeEnd |
Logical. Should the end marker be included as part of the series? Default is 'TRUE'. |
A modified version of the input data frame with additional columns including:
'catLvl': The current series level calculated as the difference between the cumulative counts of start and end markers.
'lvl_1', 'lvl_2', 'lvl_3': Final series counts returned for each respective level.
This function is currently mostly useful internally, to perform foldAllBr().
# example code library(dplyr) tribble( ~step , "start" , "content of section 1" , "start" , "subsection 1.1" , "end" , "end" , "out of any section" , "start" , "section 2" , "start" , "subsection 2.1" , "end" , "start" , "subsection 2.2" , "end" , "end" ) %>% countSwitches(colNm = "step", "start", "end")
# example code library(dplyr) tribble( ~step , "start" , "content of section 1" , "start" , "subsection 1.1" , "end" , "end" , "out of any section" , "start" , "section 2" , "start" , "subsection 2.1" , "end" , "start" , "subsection 2.2" , "end" , "end" ) %>% countSwitches(colNm = "step", "start", "end")
This function works with code split into parts identified by brackets. The format is as follows:
{ ... code from part 1 ... } # part 1 { ... } # part 2
It automatically identifies parts to fold/unfold easily.
Shortcuts required:
"fold all brackets": shift + alt + S (Windows) / ctrl + shift + up (Mac)
"expand fold": shift + alt + D (Windows) / ctrl + shift + down (Mac)
foldAllBr(time = FALSE, debug_getTbl = FALSE)
foldAllBr(time = FALSE, debug_getTbl = FALSE)
time |
Logical. If 'TRUE', the function will return |
debug_getTbl |
Logical. If 'TRUE', returns the 'docContent' table with tags for debugging purposes. |
A list containing:
debug_info
: A data frame with debugging information if debug_getTbl = TRUE
.
timer_plot
: A ggplot
object visualizing execution times if time = TRUE
.
If both parameters are FALSE
, the function returns a list with NULL
values.
Imports multiple files into a list, concatenates them into a single table, and adds an 'fName' variable.
The files can be selected either by giving a file list (character vector), or by specifying a pattern.
importAll( path = ".", pattern = "", ignore.case = FALSE, importFunction = NULL, fill = FALSE, fileList = NULL )
importAll( path = ".", pattern = "", ignore.case = FALSE, importFunction = NULL, fill = FALSE, fileList = NULL )
path |
Path to the directory, passed to 'list.files'. |
pattern |
Pattern to match file names, passed to 'list.files'. |
ignore.case |
Logical. If 'TRUE', ignores case when matching file names. Passed to 'list.files'. Default behavior is case-sensitive ('FALSE') |
importFunction |
A custom function for importing files. If not set, the function selects an import method based on the file extension. |
fill |
Logical. Passed to 'rbind' to allow filling missing columns. |
fileList |
A character vector of file names to import (used instead of 'pattern'). |
A data frame containing the concatenated table with the fName column
# Directory containing test files test_path <- tempdir() # Create test files write.csv( data.frame(a = 1:3, b = 4:6) , file.path(test_path, "file1.csv")) write.csv( data.frame(a = 7:9, b = 10:12) , file.path(test_path, "file2.csv")) write.csv( data.frame(a = 3:5, b = 8:10) , file.path(test_path, "file3.csv")) saveRDS( data.frame(a = 1:5, b = 6:10) , file.path(test_path, "file1.rds")) saveRDS( data.frame(a = 11:15, b = 16:20), file.path(test_path, "file2.rds")) # Example 1 : Import all csv files result <- importAll(path = test_path, pattern = "\\.csv$") print(result) # Example 2: Import only selected files file_list <- c("file1.csv", "file2.csv") result <- importAll(path = test_path, fileList = file_list) print(result) # Example 3: Import all .rds files result <- importAll(path = test_path, pattern = "\\.rds$") print(result) # Example 4: Use a custom import function custom_import <- function(file) { data <- read.csv(file, stringsAsFactors = FALSE) return(data) } result <- importAll(path = test_path, pattern = "\\.csv$", importFunction = custom_import) print(result)
# Directory containing test files test_path <- tempdir() # Create test files write.csv( data.frame(a = 1:3, b = 4:6) , file.path(test_path, "file1.csv")) write.csv( data.frame(a = 7:9, b = 10:12) , file.path(test_path, "file2.csv")) write.csv( data.frame(a = 3:5, b = 8:10) , file.path(test_path, "file3.csv")) saveRDS( data.frame(a = 1:5, b = 6:10) , file.path(test_path, "file1.rds")) saveRDS( data.frame(a = 11:15, b = 16:20), file.path(test_path, "file2.rds")) # Example 1 : Import all csv files result <- importAll(path = test_path, pattern = "\\.csv$") print(result) # Example 2: Import only selected files file_list <- c("file1.csv", "file2.csv") result <- importAll(path = test_path, fileList = file_list) print(result) # Example 3: Import all .rds files result <- importAll(path = test_path, pattern = "\\.rds$") print(result) # Example 4: Use a custom import function custom_import <- function(file) { data <- read.csv(file, stringsAsFactors = FALSE) return(data) } result <- importAll(path = test_path, pattern = "\\.csv$", importFunction = custom_import) print(result)
a custom usage of left_join, with more detailed checks. Performs a left join and verifies that no unexpected duplicates or mismatches occur. In cas of unexpected results, gives details about what caused the problem.
left_join_checks( x, y, ..., req_xAllMatch = 1, req_preserved_x = 1, behavior = "error", showNotFound = FALSE, showProblems = TRUE, time = FALSE )
left_join_checks( x, y, ..., req_xAllMatch = 1, req_preserved_x = 1, behavior = "error", showNotFound = FALSE, showProblems = TRUE, time = FALSE )
x |
A data.table representing the left table. |
y |
A data.table representing the right table. |
... |
Additional arguments passed to 'dplyr::left_join'. |
req_xAllMatch |
Logical. Ensure that all rows in 'x' find a match in 'y'. Default: FALSE. |
req_preserved_x |
Logical. Ensure that the number of rows in 'x' remains unchanged after the join. Default: TRUE. |
behavior |
Character. Specifies behavior if validation fails. Options: '"warning"' or '"error"'. (default: '"warning"') |
showNotFound |
Logical. Show rows from 'x' that did not match with 'y'. Default: FALSE. |
showProblems |
Logical. Display the problems encountered during the joining process, if any. |
time |
Logical. Internal argument used only for testing purposes, timing the function steps |
A data.table containing the joined table.
library(data.table) library(dplyr) # Example 1: Simple left join with all matches table_left <- data.table(id = 1:3, value_left = c("A", "B", "C")) table_right <- data.table(id = 1:3, value_right = c("X", "Y", "Z")) result <- left_join_checks(table_left, table_right, by = "id", req_preserved_x = TRUE) print(result) # Ensures all rows in table_left are preserved # Example 2: Left join with missing matches table_left <- data.table(id = 1:5, value_left = c("A", "B", "C", "D", "E")) table_right <- data.table(id = c(1, 3, 5), value_right = c("X", "Y", "Z")) result <- left_join_checks( table_left, table_right, by = "id", req_preserved_x = TRUE, showNotFound = TRUE, behavior = "warning" ) print(result) # Rows from table_left with no matches in table_right are shown
library(data.table) library(dplyr) # Example 1: Simple left join with all matches table_left <- data.table(id = 1:3, value_left = c("A", "B", "C")) table_right <- data.table(id = 1:3, value_right = c("X", "Y", "Z")) result <- left_join_checks(table_left, table_right, by = "id", req_preserved_x = TRUE) print(result) # Ensures all rows in table_left are preserved # Example 2: Left join with missing matches table_left <- data.table(id = 1:5, value_left = c("A", "B", "C", "D", "E")) table_right <- data.table(id = c(1, 3, 5), value_right = c("X", "Y", "Z")) result <- left_join_checks( table_left, table_right, by = "id", req_preserved_x = TRUE, showNotFound = TRUE, behavior = "warning" ) print(result) # Rows from table_left with no matches in table_right are shown
Modifies the brightness level of the active graphics window by adjusting its background color.
This is especially useful when using dark RStudio themes, where a 100 graphic window creates an unconfortable contrast.
lum_0_100(lum = NULL)
lum_0_100(lum = NULL)
lum |
Numeric. Brightness level, ranging from 0 (completely dark) to 100 (maximum brightness). |
no return value : only apply the theme_set() function
Saves a file with current date in its name in a sub directory located in the same directory as the original file. Optionally, a note is added after the file name.
quickSave( saveDir, filePath = NULL, saveNote = NULL, overwrite = FALSE, verbose = FALSE )
quickSave( saveDir, filePath = NULL, saveNote = NULL, overwrite = FALSE, verbose = FALSE )
saveDir |
Choose the directory used to store saves. Suggested : 'old' |
filePath |
Optional, if you want to save another file than the current one : full path of the file you want to save. |
saveNote |
An optional custom note to append to the file name for the save, allowing to keep track of why this save has been done. |
overwrite |
Logical. Should an existing save with the same name be overwritten? Default is 'FALSE'. |
verbose |
logical. If turned to 'TRUE', the save path is displayed |
the output value of the function used to copy file
Generates a vector of random dates within a specified range. This function tries to replicate the usage of the r* functions from stats package, such as runif(), rpois(), ...
rdate( x, min = paste0(format(Sys.Date(), "%Y"), "-01-01"), max = paste0(format(Sys.Date(), "%Y"), "-12-31"), sort = FALSE, include_hours = FALSE )
rdate( x, min = paste0(format(Sys.Date(), "%Y"), "-01-01"), max = paste0(format(Sys.Date(), "%Y"), "-12-31"), sort = FALSE, include_hours = FALSE )
x |
Integer. Length of the output vector (number of random dates to generate). |
min |
Date. Optional. The minimum date for the range. Defaults to the 1st of January of the current year. |
max |
Date. Optional. The maximum date for the range. Defaults to the 31st of December of the current year. |
sort |
Logical. Should the dates be sorted in ascending order? Default is 'FALSE'. |
include_hours |
Logical. Should the generated dates include time? Default is 'FALSE' (dates only). this will slow down the function |
A vector of random dates of length 'x'.
# Generate 5 random dates between two specific dates, sorted rdate(5, min = as.Date("2020-01-01"), max = as.Date("2020-12-31"), sort = TRUE) # Generate 7 random datetime values (with hours) rdate(7, include_hours = TRUE)
# Generate 5 random dates between two specific dates, sorted rdate(5, min = as.Date("2020-01-01"), max = as.Date("2020-12-31"), sort = TRUE) # Generate 7 random datetime values (with hours) rdate(7, include_hours = TRUE)
Modifies the brightness of a color by multiplying its RGB components by a specified factor.
Mostly for internal usage inside lum_0_100 function.
ret_lum(hexCol, rgbFact)
ret_lum(hexCol, rgbFact)
hexCol |
Character. The color to adjust, specified in hexadecimal format (e.g., "#FF5733"). |
rgbFact |
Numeric. The luminosity factor : - use a factor between 0 and 1 to decrease luminosity - use a factor >1 to increase it The final Brightness value will be maintained between 0 and 1. |
A modified hex color in hexadecimal format.
# Example 1: Lightening a color ret_lum("#FF5733", 1.5) # Returns a lighter version of the input color # Example 2: Darkening a color ret_lum("#FF5733", 0.7) # Returns a darker version of the input color
# Example 1: Lightening a color ret_lum("#FF5733", 1.5) # Returns a lighter version of the input color # Example 2: Darkening a color ret_lum("#FF5733", 0.7) # Returns a darker version of the input color
Returns the directory path where the current source code file is located.
It is especially useful when the same source code is used by multiple users, each using his own environment, with different file paths.
the aim is to avoid writing full paths in raw text inside source codes.
root()
root()
A character string representing the absolute path of the directory containing the current source file.
A wrapper for the 'format' function, designed to format numbers with custom defaults for thousands separator, number of significant digits, and scientific notation.
sepThsd(x, big.mark = " ", digits = 1, scientific = FALSE)
sepThsd(x, big.mark = " ", digits = 1, scientific = FALSE)
x |
Numeric. The input values to format. |
big.mark |
Character. The separator for thousands (e.g., '" "' for "1 000" or '","' for "1,000"). Default is '" "'. |
digits |
Integer. The number of significant digits to display. Default is '1'. |
scientific |
Logical. Should the numbers be displayed in scientific notation? Default is 'FALSE'. |
A character vector of formatted numbers.
# Format with a comma as a thousands separator and 3 significant digits sepThsd(1234567.89, big.mark = ",", digits = 3) # Use scientific notation sepThsd(1234567.89, scientific = TRUE)
# Format with a comma as a thousands separator and 3 significant digits sepThsd(1234567.89, big.mark = ",", digits = 3) # Use scientific notation sepThsd(1234567.89, scientific = TRUE)
Adjust the background color of a Shiny app's main body and sidebar based on a specified luminosity level.
The purpose is the same as lum_0_100() function, avoiding problems with high contrast between with graphic windows and dark themes.
shiny_lum_0_100(lum)
shiny_lum_0_100(lum)
lum |
Numeric. Luminosity level, ranging from 0 (black) to 100 (white). |
The HTML tags for setting the background and sidebar colors.
The 'timer' function allows you to append timeStamps to a data.table, and include additional metadata provided as arguments. The last call calculates time differences between timeStamps.
timer(timer_table = data.table(), end = FALSE, ...)
timer(timer_table = data.table(), end = FALSE, ...)
timer_table |
A data.table containing the timer log to continue from. Defaults to an empty 'data.table(). |
end |
A logical, inidicating the end of the timer, defaulted to FALSE. 'timer()' calls must be placed at the beginning of each part : therefore, this 'closing' step is necessary to compute time for the last part. Time differences between timeStamps are calculated only when closing the timer. |
... |
Additional specifications. Use named arguments to provide documentation on the code parts you are timing : naming the current step, the version of the code you are trying, or any other useful specification |
A 'data.table' containing the original data, plus one new timeStamp, and optionally computed time differences :
'timeStamp': The current timeStamp ('POSIXct').
'timeStamp_num': timeStamp converted to numeric, useful for intermediary calculations.
'dt_num': The time difference in seconds between consecutive rows as a numeric value.
'dt_text': The formatted time difference in seconds with milliseconds as a character string.
Additional columns for any information provided by the user via '...'. It allows documentation about the current step running, substeps, which version is being tested, ...
# compare code speed between using a loop, or the mean() function library(data.table) library(dplyr) tmr <- data.table() # Initialize timer vec <- rnorm(1e6) # Example vector tmr <- timer(tmr, method = "loop") # timeStamp : 1st step ================= total <- 0 for (i in seq_along(vec)) total <- total + vec[i] mean_loop <- total / length(vec) tmr <- timer(tmr, method = "mean()") # timeStamp : 1st step ================= mean_func <- mean(vec) tmr <- timer(tmr, end = TRUE) # timeStamp : close timer ============== t_step1 <- tmr[method == "loop"]$dt_num t_step2 <- tmr[method == "mean()"]$dt_num diff_pc <- (t_step2/t_step1 - 1) * 100 diff_txt <- format(diff_pc, nsmall = 0, digits = 1) # view speed difference print(tmr %>% select(-matches("_num$"))) paste0("speed difference : ", diff_txt, "%")
# compare code speed between using a loop, or the mean() function library(data.table) library(dplyr) tmr <- data.table() # Initialize timer vec <- rnorm(1e6) # Example vector tmr <- timer(tmr, method = "loop") # timeStamp : 1st step ================= total <- 0 for (i in seq_along(vec)) total <- total + vec[i] mean_loop <- total / length(vec) tmr <- timer(tmr, method = "mean()") # timeStamp : 1st step ================= mean_func <- mean(vec) tmr <- timer(tmr, end = TRUE) # timeStamp : close timer ============== t_step1 <- tmr[method == "loop"]$dt_num t_step2 <- tmr[method == "mean()"]$dt_num diff_pc <- (t_step2/t_step1 - 1) * 100 diff_txt <- format(diff_pc, nsmall = 0, digits = 1) # view speed difference print(tmr %>% select(-matches("_num$"))) paste0("speed difference : ", diff_txt, "%")