Get Multivariate Functional Data from a data frame

get_mfd_df(
  dt,
  domain,
  arg,
  id,
  variables,
  n_basis = 30,
  n_order = 4,
  basisobj = NULL,
  Lfdobj = 2,
  lambda = NULL,
  lambda_grid = 10^seq(-10, 1, length.out = 10),
  ncores = 1
)

Arguments

dt

A data.frame containing the discrete data in long format. For each functional variable, a single column, whose name is given in the argument variables, contains the discrete values of that variable for all functional observations. The column named by the argument id identifies the functional observation each row belongs to. The column named by the argument arg gives the argument value at which the functional variables are observed in each row.

domain

A numeric vector of length 2 defining the interval over which the functional data object can be evaluated.

arg

A character variable, which is the name of the column of the data frame dt giving the argument values at which the functional variables are evaluated for each row.

id

A character variable, which is the name of the column of the data frame dt identifying the functional observation each row belongs to.

variables

A character vector with the names of the columns of the data frame dt that contain the functional variables.

n_basis

An integer variable specifying the number of basis functions; default value is 30. See details on basis functions.

n_order

An integer specifying the order of the B-splines, which is one higher than their degree. The default of 4 gives cubic splines.

basisobj

An object of class basisfd defining the basis function expansion. Default is NULL, in which case a basisfd object is created by calling create.bspline.basis(rangeval = domain, nbasis = n_basis, norder = n_order).

Lfdobj

An object of class Lfd defining a linear differential operator of order m. It is used to specify a roughness penalty through fdPar. Alternatively, a nonnegative integer specifying the order m can be given and is passed as Lfdobj argument to the function fdPar, which indicates that the derivative of order m is penalized. Default value is 2, which means that the integrated squared second derivative is penalized.

lambda

A non-negative real number. If supplied, it is used as a single smoothing parameter for all functional data objects in the dataset and is passed to the function fda::fdPar. Default value is NULL; in this case the smoothing parameter is chosen by minimizing the generalized cross-validation (GCV) criterion over the grid of values given by the argument lambda_grid. See details on how smoothing parameters work.

lambda_grid

A vector of non-negative real numbers. If lambda is provided as a single number, this argument is ignored. If lambda is NULL, then this provides the grid of values over which the optimal smoothing parameter is searched. Default value is 10^seq(-10, 1, length.out = 10).

ncores

The number of cores/threads to use when GCV is performed separately on each functional observation. Default value is 1, which means no parallelization.

Value

An object of class mfd. See also ?mfd for additional details on the multivariate functional data class.

Details

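Unless basisobj is supplied, basis functions are created with fda::create.bspline.basis(rangeval = domain, nbasis = n_basis, norder = n_order). With the default n_order = 4, this gives cubic B-spline basis functions with equally spaced knots, which are used to create mfd objects.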
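As an illustration of the default basis described above, the following minimal sketch builds the same kind of B-spline basis directly with the fda package. The domain c(1, 10) is an assumption borrowed from the example at the end of this page; it is not a statement about what get_mfd_df does internally beyond the call shown above.

```r
library(fda)

# Assumed domain, taken from the Examples section of this page.
domain <- c(1, 10)

# Order-4 (cubic) B-spline basis with 30 basis functions,
# mirroring the defaults n_basis = 30 and n_order = 4.
bs <- create.bspline.basis(rangeval = domain, nbasis = 30, norder = 4)

# For a B-spline basis, the number of interior knots is nbasis - norder.
n_interior_knots <- length(bs$params)

# With no breaks supplied, the interior knots are equally spaced.
knot_spacing <- diff(bs$params)
```

An object built this way could be passed through the basisobj argument when a different number of basis functions or a different order is wanted.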

The smoothing parameter lambda is passed to fda::fdPar(bs, Lfdobj, lambda), where bs is the basis object. With the default Lfdobj = 2, the integrated squared second derivative is penalized.
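To make the role of lambda and lambda_grid more concrete, here is a hedged sketch of GCV-based selection for a single curve using fda::smooth.basis. It is not the package's internal code: the test curve, the choice of 15 basis functions, and the variable names are assumptions made only for illustration.

```r
library(fda)

# Hypothetical single curve observed at 25 points on [1, 10].
x <- seq(1, 10, length.out = 25)
y <- cos(x)

# Cubic B-spline basis; 15 basis functions is an arbitrary choice
# that stays below the number of observed points.
bs <- create.bspline.basis(rangeval = c(1, 10), nbasis = 15, norder = 4)

# Same grid as the default value of lambda_grid.
lambda_grid <- 10^seq(-10, 1, length.out = 10)

# For each candidate lambda, smooth the curve with a second-derivative
# penalty and record the GCV criterion reported by smooth.basis.
gcv <- sapply(lambda_grid, function(lam) {
  fit <- smooth.basis(argvals = x, y = y, fdParobj = fdPar(bs, 2, lam))
  sum(fit$gcv)
})

# The selected smoothing parameter is the grid value with the smallest GCV.
best_lambda <- lambda_grid[which.min(gcv)]
```

Supplying lambda directly skips this search, which is why lambda_grid is ignored in that case.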

This function expects a data frame in long format, i.e. with all functional observations stacked in a single column for each functional variable. If instead all functional observations are observed on a common equally spaced grid, the discrete data may be available in matrix form for each functional variable. In this case, see get_mfd_list.
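The difference between the two layouts can be sketched in base R. The snippet below reshapes a long-format data frame into one matrix per functional variable, assuming every observation shares the same grid. The helper to_matrix is hypothetical, and the assumption that get_mfd_list takes a named list of matrices with one row per observation should be checked against ?get_mfd_list.

```r
# Long-format data: two observations of two functional variables
# on a common grid of 25 points.
x <- seq(1, 10, length.out = 25)
df <- data.frame(id = factor(rep(1:2, each = length(x))),
                 x = rep(x, times = 2),
                 y1 = c(cos(x), cos(2 * x)),
                 y2 = c(sin(x), sin(2 * x)))

# Sort by observation and then by argument value, so that each
# consecutive block of rows is one complete curve.
df <- df[order(df$id, df$x), ]

# Hypothetical helper: one row per observation, one column per grid point.
to_matrix <- function(variable) {
  m <- matrix(df[[variable]], nrow = nlevels(df$id), byrow = TRUE)
  rownames(m) <- levels(df$id)
  m
}

# Named list of matrices, one element per functional variable.
data_list <- lapply(c(y1 = "y1", y2 = "y2"), to_matrix)
```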

Examples

library(funcharts)

x <- seq(1, 10, length.out = 25)
y11 <- cos(x)
y21 <- cos(2 * x)
y12 <- sin(x)
y22 <- sin(2 * x)
df <- data.frame(id = factor(rep(1:2, each = length(x))),
                 x = rep(x, times = 2),
                 y1 = c(y11, y21),
                 y2 = c(y12, y22))

mfdobj <- get_mfd_df(dt = df,
                     domain = c(1, 10),
                     arg = "x",
                     id = "id",
                     variables = c("y1", "y2"),
                     lambda = 1e-5)