sorandf creates a data frame in which each column holds a different type of random data. It is intended to make reproducible examples easier to build.

sorandf(
  rows = 10L,
  cols = c("race", "gender", "age"),
  names = make.names(cols)
)

sorandf_add(newf, name)

sorandf_reset()

Arguments

rows

integer. Number of rows in the data.frame.

cols

character. Vector of column types. Each type must match the name of a defined column function (see Details).

names

character. Vector of names of columns in data.frame output.

newf

function. Column function to add to those available to sorandf. Must take exactly one parameter, n.

name

character. Name under which the new function is registered; use this name in cols to request the column.
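
The requirement on newf can be illustrated with a small base R sketch. The names height and has_valid_signature are hypothetical and not part of the package, and this page does not say whether sorandf_add performs such a check itself.

```r
# Hypothetical generator: takes only n and returns a vector of n values.
height <- function(n) round(rnorm(n, mean = 170, sd = 10), 1)

# Hypothetical check of the documented contract: exactly one parameter, named n.
has_valid_signature <- function(f) {
  is.function(f) && identical(names(formals(f)), "n")
}

ok  <- has_valid_signature(height)
bad <- has_valid_signature(function(n, sd) rnorm(n, sd = sd))
x   <- height(5)
```

A function that passes this check could then be registered with sorandf_add(height, "height") and requested by including "height" in cols.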

Details

Each column type must have a function defining it. New functions can be added with sorandf_add, and the defaults can be restored with sorandf_reset. The following column functions are available by default:

id = function(n) paste0("ID.", 1:n)

group = function(n) sample(c("control", "treat"), n, replace = TRUE)

hs.grad = function(n) sample(c("yes", "no"), n, replace = TRUE)

race = function(n) sample(c("black", "white", "asian"), n, replace = TRUE, prob=c(.25, .5, .25))

gender = function(n) sample(c("male", "female"), n, replace = TRUE)

age = function(n) sample(18:40, n, replace = TRUE)

m.status = function(n) sample(c("never", "married", "divorced", "widowed"), n, replace = TRUE, prob=c(.25, .4, .3, .05))

political = function(n) sample(c("democrat", "republican", "independent", "other"), n, replace= TRUE, prob=c(.35, .35, .20, .1))

n.kids = function(n) rpois(n, 1.5)

income = function(n) sample(c(seq(0, 30000, by=1000), seq(0, 150000, by=1000)), n, replace=TRUE)

score = function(n) rnorm(n)
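
The relationship between column types and their defining functions can be sketched in base R as a named list of generators, with one generator called per requested column. This is only an illustration under assumptions: generators and make_random_df are hypothetical names, and the package's actual implementation may differ.

```r
# Hypothetical registry holding three of the defaults listed above.
generators <- list(
  id    = function(n) paste0("ID.", 1:n),
  age   = function(n) sample(18:40, n, replace = TRUE),
  score = function(n) rnorm(n)
)

# Hypothetical helper: call one generator per requested column type.
make_random_df <- function(rows, cols, names = make.names(cols)) {
  unknown <- setdiff(cols, names(generators))
  if (length(unknown) > 0) {
    stop("No generator defined for: ", paste(unknown, collapse = ", "))
  }
  columns <- lapply(generators[cols], function(f) f(rows))
  out <- data.frame(columns, stringsAsFactors = FALSE)
  names(out) <- names
  out
}

df <- make_random_df(5L, c("id", "age", "score"),
                     names = c("card", "years", "points"))
```

In this sketch, adding a column type amounts to adding an element to the list, and requesting a type with no generator stops with an error.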

Author

Sebastian Campbell, Tyler Rinker

Examples

if (FALSE) {
sorandf(15, c("id", "age", "score"), names = c("card", "years", "points"))
sorandf_add(function(n){sample(1:10, n)}, "newf")
sorandf(10, c("gender", "race", "newf"))
sorandf_reset()
}
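
The default functions draw from R's random number generator (sample, rnorm, rpois), so seeding it first should make the output repeatable. A minimal base R sketch follows, using the age function copied from Details so that it runs without the package; the variable names are illustrative only.

```r
# Copied from the defaults in Details.
age <- function(n) sample(18:40, n, replace = TRUE)

set.seed(42)
first <- age(5)

set.seed(42)          # same seed, so the same draws follow
second <- age(5)

same <- identical(first, second)  # the two draws match
```

Under the same assumption, calling set.seed() immediately before sorandf() in a posted example should let readers regenerate the same data frame.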