Procedures for testing unidimensionality

Prof. Dr. Ricardo Primi, USF

Monday, 16 September 2013

Outline

• Definition

• Models (unidimensional, multidimensional, bi-factor)

• "Operational" model: essential unidimensionality

• Procedures

• EFA scree plot

• EFA parallel analysis

• EFA with dichotomous items

• Full information factor analysis (FIFA)

• Comparison of the bi-factor model with unidimensionality

• Factor analysis of residuals

• CFA

• Software

• SPSS: scree plot and parallel analysis

• TESTFACT: EFA, bi-factor CFA and FIFA

• MPLUS: EFA, bi-factor CFA and FIFA

• WINSTEPS


What is unidimensionality?

• A single dimension controlling the item score

• Discrimination and factor loading


Latent constructs (example from education)

– Knowledge domain (MAT): equations, systems of equations, functions, etc. (http://www.khanacademy.org/math/algebra).

– Cause: mathematical reasoning and knowledge

– Transmission (how the cause is structured): via education, teaching and learning

Gq and RQ

Cause (latent)    Mathematical problem solving (observable)


Latent constructs (example from medicine)

– Roseola: fever, runny nose, cough and red spots on the skin

– Cause: human herpesvirus 6 (HHV-6)

– Transmission (how the cause is structured): via saliva

HHV-6

Cause (latent)    Symptoms (observable)


Factor analysis

• For example, how do we know whether a set of symptoms has a common cause?

– Association/correlation:

Cause (latent)    Symptoms (observable)

?


Factor analysis: psychology

• Factorial study of intelligence

– Several tests with different contents are administered (logical reasoning, memory, creativity, problem solving, etc.)

– A matrix of associations among the tests is computed (correlation matrix)

– The tests are grouped according to their correlations

– The tests are analyzed and the factors (causes of performance) are inferred.
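The logic of the steps above can be sketched with simulated data. This is only an illustration: the four "tests", their loadings and the sample size are hypothetical values chosen for the example, not results from the slides. When a single latent cause drives all tests, the correlation between any two tests is roughly the product of their loadings.

```python
import numpy as np

rng = np.random.default_rng(0)
n_people, n_tests = 5000, 4

# Hypothetical common cause (latent factor) and test-specific noise.
g = rng.standard_normal(n_people)
loadings = np.array([0.8, 0.7, 0.6, 0.5])  # assumed values, for illustration only
noise = rng.standard_normal((n_people, n_tests))
scores = g[:, None] * loadings + noise * np.sqrt(1 - loadings**2)

# Correlation matrix of the observed tests.
R = np.corrcoef(scores, rowvar=False)

# Under one common cause, r_ij is approximately loading_i * loading_j.
implied = np.outer(loadings, loadings)
np.fill_diagonal(implied, 1.0)
max_gap = np.abs(R - implied).max()

print(np.round(R, 2))
print("largest gap between observed and implied correlations:", round(max_gap, 3))
```

Factor analysis works in the opposite direction: it starts from the observed matrix R and asks which loadings, and how many factors, could have produced it.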





Parallel analysis

• Solutions with 1 to 5 groups of items (factors) were extracted and examined to see whether they made sense

• The two-factor model was the most coherent (correlated factors)

[Figure: parallel analysis plot, eigenvalues of the tetrachoric/polychoric matrix. x-axis: Factor Number; y-axis: eigenvalues of original and simulated factors and components. Series: PC Actual Data, PC Simulated Data, FA Actual Data, FA Simulated Data.]


Graphical representation of SEM models

the researcher is hoping to measure versus how much is due to secondary dimensions? This is exactly the question the bifactor model can address.

In most IRT applications, researchers explore whether their item-level data are sufficient for Model A as opposed to Models B or C. We argue that the more interesting comparisons are between Model A (unidimensional) versus Model D (bifactor), and the choice between Model C (multiple correlated dimensions) versus Model D. For the remainder, we will explore these comparisons using the CAHPS® 2.0 data introduced previously. Specifically, let us assume that we wish to measure a single construct (i.e., PPC), and thus we ask the standard question, "does this item set represent a single construct, or are the data too multidimensional for the application of a unidimensional IRT model?"

The first set of columns in Table 3 display estimated factor loadings from exploratory principal axis factorings of the polychoric correlation matrix for one-, two-, and five-factor extractions, with oblimin rotations. The five factor solution represents the a priori domains while the two factor solution represents a plausible alternative [15]. MICROFACT [39] was used to conduct these analyses.

The first column in Table 3 displays the loadings under Model A. Although the items vary widely in their loadings, an argument can be made that there is a "strong" general dimension here. All items load reasonably well (>.40) on the first factor; the first five eigenvalues for this matrix are 6.8, 1.4, .9, .8, and .7, and thus the ratio of the first to second eigenvalues is 4.9. The Goodness-of-fit (GFI) statistic [12] is .982, the mean residual is .001 with standard deviation of .06. Finally, the total variance explained by the first factor is 43%.

On the other hand, reasonable arguments could also be made that the data are multidimensional and that application of a unidimensional IRT model is not appropriate. Specifically, both two-factor and five-factor solutions are substantively interpretable. The two-factor solution can be

Fig. 1 Four possible latent variable models (panels: Model A, Model B, Model C, Model D)

Qual Life Res (2007) 16:19–31
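The two summary figures quoted in the Reise, Morizot and Hays excerpt can be checked by hand. The sketch below assumes 16 items (the number the article's abstract reports for the CAHPS item set) and assumes that the eigenvalues come from a full correlation matrix, whose eigenvalues sum to the number of items; neither assumption is stated in the excerpt itself.

```python
# First five eigenvalues quoted in the excerpt.
eigenvalues = [6.8, 1.4, 0.9, 0.8, 0.7]

# Assumption: 16 items, as reported in the article's abstract.
n_items = 16

# Ratio of the first to the second eigenvalue.
ratio = eigenvalues[0] / eigenvalues[1]

# Share of total variance attributed to the first factor, assuming the
# eigenvalues of a correlation matrix sum to the number of items.
share_first = eigenvalues[0] / n_items

print(f"Eig1/Eig2 = {ratio:.1f}")
print(f"Variance explained by the first factor = {share_first:.1%}")
```

Under these assumptions the ratio reproduces the reported 4.9 and the share comes out close to the reported 43%.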


Types of multidimensional models


What is multidimensionality?

[Figure 16-2: two path diagrams linking items 1 to 9 to latent dimensions 1 to 3. Panels: Between Item Multi-Dimensionality; Within Item Multi-Dimensionality.]

Figure 16-2. A Graphic Depiction of Within and Between Item Multi-dimensionality

5. FIRST ISSUE: WITHIN VERSUS BETWEEN MULTIDIMENSIONALITY

To assist in the discussion of different types of multidimensional models and tests we have introduced the notions of within and between item multidimensionality (Adams, Wilson & Wang, 1997; Wang, 1994; Wang & Wilson, 1996). A test is regarded as multidimensional between item if it is made up of several unidimensional sub-scales. A test is considered multidimensional within item if at least one of the items relates to more than one latent dimension.

The Multidimensional Between-Item Model. Tests that contain several sub-scales each measuring related, but supposedly distinct, latent dimensions are very commonly encountered in practice. In such tests each item belongs to only one particular sub-scale and there are no items in common across the sub-scales. In the past, item response modelling of such tests has proceeded by either (a) applying a unidimensional model to each of the scales
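The between/within distinction can be expressed as a property of an item-by-dimension pattern matrix (1 if the item relates to the dimension, 0 otherwise). The sketch below is an illustration only; the function name and the 9-item, 3-dimension layout are hypothetical and simply mirror the shape of Figure 16-2.

```python
import numpy as np

def dimensionality_type(pattern):
    """Classify a 0/1 item-by-dimension pattern matrix."""
    pattern = np.asarray(pattern)
    # Number of dimensions that at least one item relates to.
    used_dims = int((pattern.sum(axis=0) > 0).sum())
    if used_dims <= 1:
        return "unidimensional"
    # Between-item: every item relates to exactly one dimension.
    if np.all(pattern.sum(axis=1) == 1):
        return "between-item multidimensional"
    # Otherwise at least one item relates to two or more dimensions.
    return "within-item multidimensional"

# Nine items in three unidimensional sub-scales of three items each.
between = np.kron(np.eye(3, dtype=int), np.ones((3, 1), dtype=int))

# Same layout, but the second item also relates to the second dimension.
within = between.copy()
within[1, 1] = 1

print(dimensionality_type(between))
print(dimensionality_type(within))
```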

EDMS 724 – Modern Measurement Theories (Spring 2008, Dr. André A. Rupp)

Graphical Illustration

aj1 = 1.8, aj2 = .3, dj = .3, Dj = -.273

see Ackerman (1994)
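The slide does not spell out the model behind these parameters. As a minimal sketch, assume a compensatory two-dimensional logistic model, P = 1 / (1 + exp(-(a1·θ1 + a2·θ2 + d))), with no extra scaling constant; under that assumption the slide's values a_j1 = 1.8, a_j2 = .3 and d_j = .3 give the response probabilities below. The function name is hypothetical, and the D_j value is not recomputed here.

```python
import math

def m2pl_probability(theta1, theta2, a1=1.8, a2=0.3, d=0.3):
    """Probability of a correct/endorsed response under an assumed
    compensatory two-dimensional logistic model."""
    z = a1 * theta1 + a2 * theta2 + d
    return 1.0 / (1.0 + math.exp(-z))

p00 = m2pl_probability(0.0, 0.0)  # both abilities at the mean
p10 = m2pl_probability(1.0, 0.0)  # one unit higher on dimension 1
p01 = m2pl_probability(0.0, 1.0)  # one unit higher on dimension 2

print(round(p00, 3), round(p10, 3), round(p01, 3))
```

Because a1 is much larger than a2, a one-unit gain on the first dimension raises the probability far more than the same gain on the second, which is why the response surface for such an item is steep along one axis and nearly flat along the other.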

Assessment 18(3), p. 294

to the standard ICC in order to account for the second latent variable.

Nonparametric, nonmonotone, and multiple group IRT models have all emerged (see van der Linden & Hambleton, 1997). However, many of these extensions pose formidable challenges to applied researchers (e.g., complex formulas, lack of accessible software for performing the analyses, and unknown statistical properties). Technological and theoretical advances must be made before the wide array of IRT models becomes accessible.

Applications of Item Response Theory to Clinical Assessment

Model Selection

The fundamental role of an IRT equation is to model examinees' test response behavior. As previously reviewed, a number of IRT models can be selected for this purpose. In clinical assessment, Rasch models enjoy great popularity in Europe and have seen moderate use in the United States. However, because Rasch models demand identical item discrimination parameters, they often fail to fit scales developed with older technology (e.g., Tenenbaum, Furst, & Weingarten, 1985). Although Rasch models may be applicable to content-specific subscales couched within more complex frameworks (e.g., Bouman & Kok, 1987; Chang, 1996; Cole, Rabin, Smith, & Kaufman, 2004), they appear to be inappropriate for scales measuring psychological syndromes. Nevertheless, because of the beneficial properties of the model, fit of a Rasch framework should be given thorough consideration. Two-parameter models appear to be more congruent with existing clinical measures than their Rasch (one-parameter) counterparts (Reise & Waller, 1990). They have accurately reproduced observed data where Rasch models have failed (e.g., Aggen, Neale, & Kendler, 2005; Cooper & Gomez, 2008; Ferrando, 1994; Gray-Little, Williams, & Hancock, 1997). Owing to its greater flexibility and its congruence with common factor theory, the two-parameter model is more common in clinical assessment.

The three-parameter model has been applied to clinical assessment less commonly. Although the model adds flexibility to analyses, conceptualizing the impact of "pseudo-guessing" on items related to personality and psychopathology can be difficult. On clinical tests, the lower asymptote parameter has occasionally been thought of as being indicative of a response style (e.g., social desirability, true response bias, etc.; see Zumbo, Pope, Watson, & Hubley, 1997). For example, if examinees are unwilling to respond openly to an item concerning sexual practices, drug use, mental health, and so on, responses could be drawn toward more conservative options. Rouse, Finger, and Butcher (1999) fit a three-parameter model to scales from the second edition of the Minnesota Multiphasic Personality Inventory (MMPI-2; Butcher et al., 2001) and found substantial correlations between estimates of lower asymptotes and indices

[Figure: item characteristic surface. Horizontal axes: Anxiety and Depression (−2 to 2); vertical axis: 0.0 to 1.0.]

Figure 2. Item characteristic surface for a multidimensional item response model. Anxiety and depression are used as examples of two distinct latent variables that both influence the probability of item endorsement.



Essential unidimensionality

• An assumption of almost all classical analyses!

• One dominant factor; secondary factors, although present, are negligible



Procedures

• SPSS: scree plot

• Eigenvalue plotted against its order

• Eig 1 / Eig 2 > 5

• SPSS: RanEigen (Enzmann, 1997)

• Creates V variables for N subjects from random distributions, so that r(v1, v2) = 0 (parallel data)

• Extracts the factors from this matrix

• Repeats i times

• Computes the mean eigenvalue of the first, second, third, ... factors

• Enzmann, D. (1997). RanEigen: a program to determine the parallel analysis criterion for the number of principal components. Applied Psychological Measurement, 21, 232. (http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Software/Enzmann_Software.html)
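The parallel analysis steps listed above can be sketched in a few lines. This is not RanEigen itself: it is a minimal NumPy illustration that assumes normally distributed random data, principal components of the Pearson correlation matrix, and the mean random eigenvalue as the criterion. The function name, the number of replications and the simulated one-factor data set are all hypothetical.

```python
import numpy as np

def parallel_analysis(data, n_reps=100, seed=0):
    """Compare observed eigenvalues with mean eigenvalues of random data."""
    rng = np.random.default_rng(seed)
    n, v = data.shape

    def sorted_eigs(x):
        # Eigenvalues of the correlation matrix, largest first.
        return np.sort(np.linalg.eigvalsh(np.corrcoef(x, rowvar=False)))[::-1]

    observed = sorted_eigs(data)
    # V uncorrelated random variables for N subjects, repeated n_reps times.
    random_eigs = np.array(
        [sorted_eigs(rng.standard_normal((n, v))) for _ in range(n_reps)]
    )
    mean_random = random_eigs.mean(axis=0)
    # Retain components up to the first one that does not exceed the random mean.
    exceeds = observed > mean_random
    n_retain = v if exceeds.all() else int(np.argmin(exceeds))
    return observed, mean_random, n_retain

# Simulated example: 500 subjects, 10 variables, one common factor.
rng = np.random.default_rng(1)
g = rng.standard_normal((500, 1))
data = 0.7 * g + np.sqrt(1 - 0.7**2) * rng.standard_normal((500, 10))

observed, mean_random, n_retain = parallel_analysis(data)
print("Eig1/Eig2 =", round(observed[0] / observed[1], 2))
print("components retained:", n_retain)
```

For dichotomous items the same comparison is usually run on tetrachoric or polychoric correlations rather than Pearson correlations; that variant is not shown here.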




Example 1: ENEM 2006

Results

Two factors: 20.6% and 2.9% of the variance (Eig1/Eig2 = 4.54); Promax rotation, r(f1, f2) = 0.67

Rasch modeling and extraction of contrasts: r = 0.50


Example 2: QFCP

• SPSS ...


Exploratory factor analysis (EFA) with dichotomous items

Ferguson, G. A. (1941). The factorial interpretation of test difficulty. Psychometrika, 6(5), 323-329.
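The slide cites Ferguson (1941) without elaborating. One well-known reason dichotomous items need special treatment is that the Pearson (phi) correlation between two 0/1 items is capped by their pass rates: items of very different difficulty cannot correlate perfectly even if they measure the same trait. The sketch below illustrates that bound with hypothetical pass rates; it is offered as background, not as a summary of Ferguson's paper.

```python
import math

def phi_max(p1, p2):
    """Largest possible phi correlation between two 0/1 items
    with pass rates p1 and p2 (both strictly between 0 and 1)."""
    lo, hi = sorted((p1, p2))
    return math.sqrt(lo * (1 - hi) / (hi * (1 - lo)))

equal_difficulty = phi_max(0.5, 0.5)    # two items of the same difficulty
unequal_difficulty = phi_max(0.2, 0.8)  # one hard item, one easy item

print(round(equal_difficulty, 3), round(unequal_difficulty, 3))
```

Factoring phi correlations can therefore group items by difficulty rather than by content, which is one motivation for using tetrachoric correlations or full information methods with dichotomous items.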




Full information factor analysis and factor analysis of residuals

• FIFA (TESTFACT)

• Factor analysis via IRT

• The item parameters are estimated (usually with the three-parameter model, with c supplied by the user) and the loadings are derived from the a parameters.

• Multidimensional IRT!

• Factor analysis of residuals (WINSTEPS)

• Fit the unidimensional model (1st factor) to all items

• Compute the residuals

• Run a principal components analysis on the residuals (2nd, 3rd, 4th factors) without rotation

• Split the items into the groups they form in the contrasts

• Check whether the scores estimated from these item groups are correlated (this assesses the interference of the secondary dimensions)
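The FIFA step that derives loadings from the a parameters can be sketched with the standard conversion between IRT slopes and factor loadings. The sketch assumes orthogonal factors and, when the slopes are on the logistic metric, the conventional scaling constant 1.702; it does not reproduce TESTFACT's own output, and the function name and example slopes are hypothetical.

```python
import numpy as np

D = 1.702  # conventional logistic-to-normal-ogive scaling constant

def loadings_from_slopes(a, logistic=True):
    """Convert IRT slopes to factor loadings.

    a: array of shape (n_items, n_dims); use one column for a
       unidimensional model. Factors are assumed to be orthogonal.
    """
    a = np.asarray(a, dtype=float)
    if logistic:
        a = a / D  # move the slopes to the normal-ogive metric
    # loading_jk = a_jk / sqrt(1 + sum over k of a_jk^2)
    return a / np.sqrt(1.0 + (a**2).sum(axis=1, keepdims=True))

# Hypothetical logistic slopes for two items on a single dimension.
slopes = np.array([[1.702], [0.851]])
loadings = loadings_from_slopes(slopes)
print(np.round(loadings, 3))
```

The conversion is monotone, so a more discriminating item always has the higher loading, which is the link between discrimination and factor loading noted earlier in the deck.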


Comparing the bi-factor model with unidimensionality

• In essence, a CFA

• Step 1: traditional unidimensional analysis

• Step 2: bi-factor analysis

• Compare the general-factor loadings from the two solutions

• If the specific factors do not "distort" the measure, the general-factor loadings will be similar in both models. This is interpreted to mean that, even in the presence of specific factors, the measurement of the general factor is not distorted
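The comparison in the last two bullets can be summarized numerically once both models have been fitted. The loadings below are hypothetical values for six items, invented for illustration; in practice they would come from the unidimensional and bi-factor solutions estimated in Steps 1 and 2.

```python
import numpy as np

# Hypothetical loadings for six items (not taken from the slides).
uni = np.array([0.62, 0.58, 0.70, 0.55, 0.66, 0.49])         # unidimensional model
bifactor_g = np.array([0.60, 0.55, 0.68, 0.56, 0.63, 0.47])  # general factor, bi-factor model

# How far apart are the two sets of general-factor loadings?
mean_abs_diff = np.abs(uni - bifactor_g).mean()
r = np.corrcoef(uni, bifactor_g)[0, 1]

print(f"mean absolute difference = {mean_abs_diff:.3f}")
print(f"correlation between loading vectors = {r:.3f}")
```

Small differences and a high correlation would support the reading that the specific factors do not distort the general factor; what counts as "small" is a judgment call that the slides do not quantify.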



The role of the bifactor model in resolving dimensionality issues in health outcomes measures

Steven P. Reise · Julien Morizot · Ron D. Hays

Received: 25 August 2006 / Accepted: 30 January 2007 / Published online: 4 May 2007
© Springer Science+Business Media B.V. 2007

Abstract

Objectives We propose the application of a bifactor model for exploring the dimensional structure of an item response matrix, and for handling multidimensionality.

Background We argue that a bifactor analysis can complement traditional dimensionality investigations by: (a) providing an evaluation of the distortion that may occur when unidimensional models are fit to multidimensional data, (b) allowing researchers to examine the utility of forming subscales, and, (c) providing an alternative to non-hierarchical multidimensional models for scaling individual differences.

Method To demonstrate our arguments, we use responses (N = 1,000 Medicaid recipients) to 16 items in the Consumer Assessment of Healthcare Providers and Systems (CAHPS® 2.0) survey.

Analyses Exploratory and confirmatory factor analytic and item response theory models (unidimensional, multidimensional, and bifactor) were estimated.

Results CAHPS® items are consistent with both unidimensional and multidimensional solutions. However, the bifactor model revealed that the overwhelming majority of common variance was due to a general factor. After controlling for the general factor, subscales provided little measurement precision.

Conclusion The bifactor model provides a valuable tool for exploring dimensionality related questions. In the Discussion, we describe contexts where a bifactor analysis is most productively used, and we contrast bifactor with multidimensional IRT models (MIRT). We also describe implications of bifactor models for IRT applications, and raise some limitations.

Keywords Bifactor model · Unidimensionality assumption · Item response theory · Multidimensional item response model · Health outcomes measurement

Item response theory (IRT) [1] methods were developed in the context of large-scale assessment to more efficiently and accurately measure broadband dimensional constructs such as verbal and quantitative aptitude. In recent years, IRT methods have been applied in a wider variety of substantive contexts, especially health outcomes research [2–6]. This article is not an introductory or didactic review of IRT methods as applied in the health outcomes domain. Such reports are available in a variety of sources [7, 8]. Rather, this article is aimed toward researchers who are familiar with IRT methods, and who are actively working on IRT applications. We assume the reader is familiar with the literature that relates factor analytic and IRT models [9–13].

We discuss two related topics in the application of IRT models to health outcomes measures: (a) (uni)dimensionality assessment, and (b) application of hierarchical versus non-hierarchical multidimensional models. We first draw a distinction between narrow and broad constructs/measures. We then argue that for broader measures with diverse indicators, researchers should consider use of the bifactor model for representing the (multi)dimensional structure of their data. We argue that a bifactor model: (a) allows for the examination of the distortion that may occur when unidimensional IRT models are fit to multidimensional data, (b) allows researchers to empirically examine the utility of forming subscales, and, (c) provides an alternative

S. P. Reise (✉) · J. Morizot · R. D. Hays
Department of Psychology, University of California, Franz Hall, Los Angeles, CA 90095-1563, USA
e-mail: [email protected]

Qual Life Res (2007) 16:19–31

DOI 10.1007/s11136-007-9183-7



La utilización del modelo bifactorial para testar la unidimensionalidad de una batería de pruebas de raciocinio.

The use of the bi-factor model to test the uni-dimensionality of a battery of reasoning tests

Running Head: Item factor analysis of a Battery of Reasoning Tests

Ricardo Primi,∗ Marjorie Cristina Rocha da Silva and Priscila Rodrigues Santana,

Graduate Program in Psychology,

University of San Francisco, Brazil

Monalisa Muniz,

University of Vale do Sapucaí, Brazil

and

Leandro S. Almeida

Institute of Education, University of Minho, Portugal.

Please address correspondence to:

Ricardo Primi Universidade São Francisco Laboratório de Avaliação Psicológica e Educacional (LabAPE) Rua Alexandre Rodrigues Barbosa, 45 CEP 13251-900, Itatiba São Paulo, Brazil. E-mail: [email protected]

Web: www.labape.com.br

∗ The research activities of the first author, which resulted in this article, are financed by the

Brazilian National Council for Scientific Research (CNPq) and the São Paulo Research

Foundation (FAPESP).


Abstract

The Battery of Reasoning Tests 5 (BPR-5) aims to assess the reasoning ability of individuals using sub-tests with different formats and contents that require basic processes of inductive and deductive reasoning in their resolution. The BPR has three sequential forms: BPR-5i (for children from first to fifth grade), BPR-5 – Form A (for children from sixth to eighth grade) and BPR-5 – Form B (for high school and undergraduate students). The present study analysed 412 questionnaires concerning BPR-5i, 603 questionnaires concerning BPR-5 – Form A and 1748 questionnaires concerning BPR-5 – Form B. The main goal was to test the uni-dimensionality of the battery and its tests in relation to items using the bi-factor model. Results indicated that the assumption of a general reasoning factor underlying items of different contents is supported.

Key-words: Battery of reasoning tests, factorial validity, item response theory, bi-factor model

Resumen (Spanish abstract, translated)

The Battery of Reasoning Tests (BPR-5) aims to assess people's reasoning ability using sub-tests with different items and contents that nonetheless share the inductive and deductive processes involved in solving the task. The BPR has a sequential organization: BPR-5i (for children from 1st to 5th grade), BPR-5 Form A (6th to 8th grade) and BPR-5 Form B (secondary and tertiary education). The present study evaluated data from 289 protocols of the BPR-5i, 603 of the BPR-5 Form A and 1748 of the BPR-5 Form B. The main goal was to test the unidimensionality of the battery and of its component tests. The results confirmed the existence of a single factor related to reasoning, regardless of task content.

Keywords: Battery of Reasoning Tests, factorial validity, cognitive development, item response theory, test calibration.


Example: BPR-5 Forms A, B and i

Figure 1. Item examples of BPR sub-tests

Abstract Reasoning A B C D E

?

Verbal Reasoning

Day : Night is as Bright :

A. Light B. Energy C. Dark D. Clarity E. Cloud

Numerical Reasoning

1 3 5 7 9 ? ?

Spatial Reasoning A B C D E

Mechanical Reasoning

What level (A, B, C) allows a person to reach a greater depth after jumping? If equal mark D.

Practical Reasoning (Only in BPR-5i)

John's house is nearby Anthony’s home. One house is white and the other is grey. Anthony's

house is not white. State what is the colour of the house of each of these two men.



Table 1. Summary results of bi-factor and full information factor models

                             BPR-5i            BPR-5 A            BPR-5 B

Bi-factor model
g bi (%)                     35.00             34.43              30.72
s AR (%)                     8.61              2.35               2.39
s VR (%)                     6.57              2.24               2.77
s MR (%)                     -                 3.04               4.28
s SR (%)                     -                 2.88               2.58
s NR (%)                     6.29              3.93               4.41
s PR (%)                     3.25              -                  -
Uniqueness                   40.25             51.13              52.83
Reliability (of g-factor)    .92               .90                .90
χ2                           27040.8 / 105     63719.7 / 256      177566.0 / 1443
CFI                          .981              .974               .951
TLI                          .980              .973               .949
RMSEA                        .017              .014               .017

Uni-dimensional model (Full information)
Eig 1 / Eig 2                35.9/6.4 (5.6)    37.7/5.15 (7.32)   39.37/6.41 (6.14)
g full                       44.08             37.42              40.42
Reliability (of g-factor)    .95               .95                .95
χ2                           34891.1 / 207     65848.91 / 371     182457.3 / 1558
CFI                          .904              .924               .876
TLI                          .902              .922               .873
RMSEA                        .038              .024               .027

Corrected Chi-Square difference test for the weighted least squares estimator (WLSMV)
Δχ2                          1421.33           1540.46            3905.26
df                           102               115                113
p                            <.0001            <.0001             <.0001
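The bi-factor percentages in Table 1 can be combined into a single summary: the share of common variance that belongs to the general factor. This index is not reported in the table; the sketch below computes it under the assumption that the g, specific and uniqueness rows are additive percentages of total variance (their sums are close to 100, which is consistent with that reading). The variable and function names are hypothetical.

```python
# Percentages copied from the bi-factor rows of Table 1.
table = {
    "BPR-5i":  {"g": 35.00, "specific": [8.61, 6.57, 6.29, 3.25], "uniqueness": 40.25},
    "BPR-5 A": {"g": 34.43, "specific": [2.35, 2.24, 3.04, 2.88, 3.93], "uniqueness": 51.13},
    "BPR-5 B": {"g": 30.72, "specific": [2.39, 2.77, 4.28, 2.58, 4.41], "uniqueness": 52.83},
}

def general_share(row):
    """General-factor variance as a proportion of all common variance."""
    common = row["g"] + sum(row["specific"])
    return row["g"] / common

for form, row in table.items():
    total = row["g"] + sum(row["specific"]) + row["uniqueness"]
    print(f"{form}: total = {total:.2f}%, general share = {general_share(row):.2f}")
```

Under this assumption the general factor accounts for roughly 59% of the common variance in BPR-5i, 70% in Form A and 65% in Form B.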



Figure 2. Scatter plots of factor loadings in g factor with the bi-dimensional model for the youngest children (top), A (lower left) and B Forms (lower right).

