PCFA
is a partially confirmatory approach covering a wide range of
the exploratory-confirmatory continuum in factor analytic models (Chen, Guo, Zhang, & Pan, 2021).
PCFA applies only to continuous data, while the generalized PCFA (GPCFA; Chen, 2021)
covers both continuous and categorical data.
There are two major model variants with different constraints for identification. One assumes local
independence (LI) and has a more exploratory tendency; it can also be called the E-step.
The other allows local dependence (LD) and has a more confirmatory tendency; it can also
be called the C-step. Parameters are estimated by sampling from their posterior distributions with
Markov chain Monte Carlo (MCMC) techniques. Different Bayesian Lasso methods are used to
regularize the loading pattern and the LD terms. The estimation results can be summarized with summary.lawbl,
and the factor eigenvalues can be plotted with plot_lawbl.
Usage
pcfa(
dat,
Q,
LD = TRUE,
cati = NULL,
cand_thd = 0.2,
PPMC = FALSE,
burn = 5000,
iter = 5000,
update = 1000,
missing = NA,
rfit = TRUE,
sign_check = FALSE,
sign_eps = -0.5,
rs = FALSE,
auto_stop = FALSE,
max_conv = 10,
rseed = 12345,
digits = 4,
alas = FALSE,
verbose = FALSE
)
Arguments
- dat
A \(N \times J\) data matrix or data.frame consisting of the responses of \(N\) individuals to \(J\) items.
- Q
A \(J \times K\) design matrix for the loading pattern with \(K\) factors and \(J\) items. Elements are 1, -1, and 0 for specified, unspecified, and zero-fixed loadings, respectively. For models with LI or the E-step, one can specify a few (e.g., 2) loadings per factor. For models with LD or the C-step, the sufficient condition of one specified loading per item is suggested, although there can be a few items without any specified loading. See Examples.
- LD
logical; TRUE for allowing LD (model with LD or the C-step).
- cati
The set of categorical (polytomous) items in sequence number (i.e., 1 to \(J\)); NULL for none and -1 for all items (default is NULL).
- cand_thd
Candidate parameter for sampling the thresholds with the MH algorithm.
- PPMC
logical; TRUE for conducting posterior predictive model checking.
- burn
Number of burn-in iterations before posterior sampling.
- iter
Number of formal iterations for posterior sampling (> 0).
- update
Number of iterations between updates of the displayed sampling information.
- missing
Value for missing data (default is NA).
- rfit
logical; TRUE for providing relative fit indices (DIC, BIC, AIC).
- sign_check
logical; TRUE for checking sign switches of the loading vector.
- sign_eps
Minimum value for a sign switch of the loading vector (if sign_check = TRUE).
- rs
logical; TRUE for enabling the recommendation system.
- auto_stop
logical; TRUE for enabling auto stop based on EPSR < 1.1.
- max_conv
Maximum consecutive number of convergences required for auto stop.
- rseed
An integer for the random seed.
- digits
Number of significant digits to print when printing numeric values.
- alas
logical; TRUE for adaptive Lasso (default is FALSE).
- verbose
logical; TRUE to display the sampling information every update iterations, including:
Feigen: Eigenvalue for each factor.
NLA_le3: Number of loading estimates >= .3 for each factor.
Shrink: Shrinkage (or average shrinkage for each factor for adaptive Lasso).
EPSR & NCOV: EPSR for each factor & number of convergences.
Ave. Thd: Average thresholds for polytomous items.
Acc Rate: Acceptance rate of thresholds (MH algorithm).
LD>.2 >.1: Number of LD terms larger than .2 and .1, and the LD shrinkage parameter.
#Sign_sw: Number of sign switches for each factor.
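The design matrix Q drives identification, so it is worth sanity-checking before estimation. The following base-R sketch (an illustration only, independent of LAWBL) builds a Q matrix for 18 items and 3 factors in the style of the Examples and checks the identification conditions described above: at least one specified loading per item for the C-step, or a few specified loadings per factor for the E-step.

```r
# Hypothetical base-R sketch (not part of LAWBL): build and check a
# J x K design matrix Q, where 1 = specified, -1 = unspecified, 0 = fixed at zero.
J <- 18
K <- 3
Q <- matrix(-1, J, K)                        # start fully unspecified (exploratory)
Q[1:6, 1] <- Q[7:12, 2] <- Q[13:18, 3] <- 1  # one specified loading per item

# C-step (LD = TRUE): sufficient condition of one specified loading per item
items_specified <- rowSums(Q == 1)
all(items_specified >= 1)                    # TRUE here

# E-step (LD = FALSE): a few specified loadings per factor suffice
loads_per_factor <- colSums(Q == 1)
loads_per_factor                             # 6 6 6
```

A design that fails these checks may still run, but the loading pattern can be poorly identified; checking up front saves a long MCMC run.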
Value
pcfa
returns an object of class lawbl
without item intercepts. It contains extensive information about
the posteriors, which can be summarized using summary.lawbl.
References
Chen, J., Guo, Z., Zhang, L., & Pan, J. (2021). A partially confirmatory approach to scale development with the Bayesian Lasso. Psychological Methods, 26(2), 210–235. DOI: 10.1037/met0000293.
Chen, J. (2021). A generalized partially confirmatory factor analysis framework with mixed Bayesian Lasso methods. Multivariate Behavioral Research. DOI: 10.1080/00273171.2021.1925520.
Examples
# \donttest{
#####################################################
# Example 1: Estimation with continuous data & LD #
#####################################################
dat <- sim18cfa1$dat
J <- ncol(dat)
K <- 3
Q <- matrix(-1, J, K)
Q[1:6, 1] <- Q[7:12, 2] <- Q[13:18, 3] <- 1
m0 <- pcfa(dat = dat, Q = Q, LD = TRUE, burn = 2000, iter = 2000)
summary(m0) # summarize basic information
#> $NJK
#> [1] 1000 18 3
#>
#> $`Miss%`
#> [1] 0
#>
#> $`LD Allowed`
#> [1] TRUE
#>
#> $`Burn in`
#> [1] 2000
#>
#> $Iteration
#> [1] 2000
#>
#> $`No. of sig lambda`
#> [1] 23
#>
#> $Selected
#> [1] TRUE TRUE TRUE
#>
#> $`Auto, NCONV, MCONV`
#> [1] 0 0 10
#>
#> $EPSR
#> Point est. Upper C.I.
#> [1,] 1.7125 3.5603
#> [2,] 1.0459 1.1902
#> [3,] 1.1321 1.4495
#>
#> $`No. of sig LD terms`
#> [1] 6
#>
#> $`DIC, BIC, AIC`
#> [1] 3597.172 2497.062 1378.094
#>
#> $Time
#> user system elapsed
#> 35.08 0.02 35.18
#>
summary(m0, what = 'qlambda') #summarize significant loadings in pattern/Q-matrix format
#> 1 2 3
#> I1 0.7233 0.0000 0.0000
#> I2 0.6469 0.0000 0.0000
#> I3 0.7660 0.0000 0.0000
#> I4 0.7660 0.0000 0.0000
#> I5 0.7658 0.1835 0.0000
#> I6 0.7425 0.0000 0.0000
#> I7 0.0000 0.7646 0.0000
#> I8 0.0000 0.7091 0.0000
#> I9 0.0000 0.7420 0.0000
#> I10 0.0000 0.7220 0.0000
#> I11 0.0000 0.7087 0.2341
#> I12 0.0000 0.7163 0.2194
#> I13 0.0000 0.0000 0.7084
#> I14 0.0000 0.0000 0.6745
#> I15 0.0000 0.0000 0.7426
#> I16 0.0000 0.0000 0.7343
#> I17 0.2472 0.0000 0.7259
#> I18 0.2328 0.0000 0.7234
summary(m0, what = 'offpsx') #summarize significant LD terms
#> row col est sd lower upper sig
#> [1,] 14 1 0.2876 0.0416 0.2042 0.3671 1
#> [2,] 7 2 0.2495 0.0608 0.1408 0.3637 1
#> [3,] 4 3 0.2692 0.0516 0.1744 0.3658 1
#> [4,] 13 8 0.2654 0.0614 0.1418 0.3899 1
#> [5,] 10 9 0.3080 0.0608 0.2008 0.4382 1
#> [6,] 16 15 0.2631 0.0635 0.1374 0.3843 1
######################################################
# Example 2: Estimation with categorical data & LI #
######################################################
dat <- sim18ccfa40$dat
J <- ncol(dat)
K <- 3
Q <- matrix(-1, J, K)
Q[1:2, 1] <- Q[7:8, 2] <- Q[13:14, 3] <- 1
m1 <- pcfa(dat = dat, Q = Q, LD = FALSE, cati = -1, burn = 2000, iter = 2000)
summary(m1) # summarize basic information
#> $NJK
#> [1] 1000 18 3
#>
#> $`Miss%`
#> [1] 9.888889
#>
#> $`LD Allowed`
#> [1] FALSE
#>
#> $`Burn in`
#> [1] 2000
#>
#> $Iteration
#> [1] 2000
#>
#> $`No. of sig lambda`
#> [1] 24
#>
#> $Selected
#> [1] TRUE TRUE TRUE
#>
#> $`Auto, NCONV, MCONV`
#> [1] 0 0 10
#>
#> $EPSR
#> Point est. Upper C.I.
#> [1,] 1.2016 1.6828
#> [2,] 1.1469 1.4762
#> [3,] 1.0670 1.2669
#>
#> $`Cat Items`
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
#>
#> $`max No. of categories`
#> [1] 4
#>
#> $`DIC, BIC, AIC`
#> [1] 4133.164 3834.688 3466.606
#>
#> $Time
#> user system elapsed
#> 65.82 0.11 66.26
#>
summary(m1, what = 'qlambda') #summarize significant loadings in pattern/Q-matrix format
#> 1 2 3
#> I1 0.7423 0.0000 0.0000
#> I2 0.7406 0.0000 0.0000
#> I3 0.7442 0.0000 0.0000
#> I4 0.7620 0.0000 0.0000
#> I5 0.7688 0.2287 0.0000
#> I6 0.7184 0.2275 0.0000
#> I7 0.0000 0.7350 0.0000
#> I8 0.0000 0.7322 0.0000
#> I9 0.0000 0.7314 0.0000
#> I10 0.0000 0.6976 0.0000
#> I11 0.0000 0.7014 0.2681
#> I12 0.0000 0.6945 0.2641
#> I13 0.0000 0.0000 0.7506
#> I14 0.0000 0.0000 0.7243
#> I15 0.0000 0.0000 0.7331
#> I16 0.0000 0.0000 0.7115
#> I17 0.2784 0.0000 0.7283
#> I18 0.3019 0.0000 0.6667
summary(m1, what = 'offpsx') #summarize significant LD terms
#> NULL
summary(m1,what='thd') #thresholds for categorical items
#> [,1] [,2] [,3]
#> [1,] -1.3566 0.0754 1.6245
#> [2,] -1.4324 0.0571 1.4929
#> [3,] -1.4491 0.0539 1.4929
#> [4,] -1.3762 0.0598 1.5975
#> [5,] -1.4274 0.0147 1.4727
#> [6,] -1.4862 0.0849 1.5689
#> [7,] -1.4943 0.0361 1.6604
#> [8,] -1.5413 -0.0550 1.4096
#> [9,] -1.4773 -0.0132 1.5135
#> [10,] -1.5041 0.0126 1.6268
#> [11,] -1.4495 0.0218 1.5745
#> [12,] -1.5340 0.0205 1.6335
#> [13,] -1.5586 -0.0522 1.4086
#> [14,] -1.4153 0.0509 1.5145
#> [15,] -1.5415 -0.0238 1.4803
#> [16,] -1.4598 -0.0093 1.3759
#> [17,] -1.4275 0.0653 1.4807
#> [18,] -1.5253 0.0380 1.5541
# }
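Example 2 reports about 9.9% missing responses via `$`Miss%``. The missing argument lets the data carry a user-defined missing-data code instead of NA. The following base-R sketch (the -999 code and the small simulated matrix are assumptions for illustration, not part of LAWBL) shows the two equivalent options: pass the code to pcfa directly, or recode to the default NA beforehand.

```r
# Hypothetical base-R sketch: responses coded with -999 for missing values.
set.seed(1)
dat <- matrix(sample(1:4, 20, replace = TRUE), nrow = 5)
dat[c(2, 9)] <- -999                 # inject two missing codes

# Option 1: pass the code to pcfa directly (assuming LAWBL is loaded):
# m <- pcfa(dat = dat, Q = Q, missing = -999)

# Option 2: recode to the default NA before calling pcfa:
dat[dat == -999] <- NA
mean(is.na(dat)) * 100               # percent missing, cf. $`Miss%` in summary()
```

Either way, the proportion of missing responses is reported by summary() as `Miss%`, so a quick check like the one above confirms the code was recognized.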