Summary  Multivariate Data Analysis

1 Lecture: Factor Analysis

What is factor analysis? Factor analysis is a general name denoting a class of procedures primarily used for data reduction and summarization.

How is factor analysis an interdependence technique? Factor analysis is an interdependence technique in that an entire set of interdependent relationships is examined without distinguishing between dependent and independent variables.

In what circumstances is factor analysis used?
 To identify underlying dimensions, or factors, that explain the correlations among a set of variables.
 To identify a new, smaller set of uncorrelated variables to replace the original set of correlated variables in subsequent multivariate analysis (e.g. regression or ANCOVA).

What are the applications of factor analysis?
 Assess the validity of construct measurements
 Market segmentation for identifying the underlying variables on which to group the customers.
 Product research: determine the brand attributes that influence consumer choice
 Price management: identify the characteristics of price-sensitive consumers

What are the two forms of factor analysis? Exploratory and confirmatory.

What is meant by exploratory factor analysis?
 The researcher intends to find an underlying structure
 The researcher assumes that superordinate (latent) factors cause the correlations between variables
 The researcher intends to reveal interrelationships
 Generation of hypotheses

What is meant by confirmatory factor analysis?
 The researcher has a priori ideas of the underlying factors, derived from theory
 The researcher specifies the relationships between variables and factors before conducting the factor analysis
 Testing of hypotheses
 Special case of structural equation models

What is meant by the data matrix X? X contains the characteristics of objects, derived from questioning persons.

What is meant by the standardized data matrix Z? Z contains the standardized characteristics of objects (standardized data always have a mean of 0 and a standard deviation of 1).

What is meant by the correlation matrix R? R describes the statistical interrelationships between the variables (the diagonal contains only 1s).

What is meant by the reduced correlation matrix R*? R* is the correlation matrix with the estimated communalities on its main diagonal.

What is meant by the factor loadings A? A contains the correlations between factors and variables.

What is meant by the rotated factor loadings A*? A* contains the correlations between factors and variables after rotation of the coordinate axes.

What is meant by the factor scores P? P contains the values of the extracted factors.
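
The chain from X to Z to R can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the data values and the function names `standardize` and `correlation_matrix` are assumptions made for the example, not part of the summary.

```python
import statistics

# Hypothetical data matrix X: 5 respondents (rows) x 3 rating variables (columns).
X = [
    [7, 6, 2],
    [5, 5, 3],
    [6, 7, 1],
    [3, 2, 6],
    [4, 3, 5],
]

def standardize(X):
    """Return Z: every column rescaled to mean 0 and sample standard deviation 1."""
    cols = list(zip(*X))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.stdev(c) for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in X]

def correlation_matrix(Z):
    """Return R = Z'Z / (n - 1) for an already standardized matrix Z."""
    n, p = len(Z), len(Z[0])
    return [
        [sum(Z[k][i] * Z[k][j] for k in range(n)) / (n - 1) for j in range(p)]
        for i in range(p)
    ]

Z = standardize(X)
R = correlation_matrix(Z)

for row in R:
    print([round(r, 3) for r in row])
```

The diagonal of R comes out as 1 because each standardized variable correlates perfectly with itself.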

What are the steps of conducting Factor analysis?
 Problem formulation
 Construction of the correlation matrix
 Selection of the method of factor analysis
 Determination of the number of factors
 Rotation of factors
 Interpretation of factors
 Calculation of factor scores, selection of surrogate variables, or creation of summated scores
 Determination of model fit

What is meant by formulating the problem in factor analysis?
 The objectives of factor analysis should be identified: data summarization or data reduction.
 The variables to be included in the factor analysis should be specified based on past research, theory, and the judgment of the researcher. It is important that the variables be appropriately measured on an interval or ratio scale.
 An appropriate sample size should be used. As a rough guideline, there should be at least four or five times as many observations (sample size) as there are variables.

There are two statistics that can be used to see if factor analysis is appropriate. Which two and what do they say?
 Bartlett's test of sphericity can be used to test the null hypothesis that the variables are uncorrelated in the population.
 The Kaiser-Meyer-Olkin (KMO) statistic measures sampling adequacy. Small values of the KMO statistic indicate that the correlations between pairs of variables cannot be explained by other variables and that factor analysis may not be appropriate.

What is meant by constructing the correlation matrix?
 The analytical process is based on a matrix of correlations between the variables.
 Bartlett's test of sphericity can be used to test the null hypothesis that the variables are uncorrelated in the population. In other words, the population correlation matrix is an identity matrix. If this hypothesis cannot be rejected, then the appropriateness of factor analysis should be questioned.
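
As a rough sketch of how Bartlett's test statistic is obtained from a correlation matrix, the code below uses the usual chi-square approximation, chi2 = -((n - 1) - (2p + 5) / 6) * ln|R| with p(p - 1) / 2 degrees of freedom. The matrix `R_example`, the sample size and the function names are assumptions for illustration only; the p-value would still have to be looked up in a chi-square table or computed with a statistics package.

```python
import math

def determinant(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [list(map(float, row)) for row in M]
    n = len(A)
    det = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(A[r][c]))
        if abs(A[pivot][c]) < 1e-12:
            return 0.0
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            det = -det  # a row swap flips the sign of the determinant
        det *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= f * A[c][k]
    return det

def bartlett_sphericity(R, n):
    """Return (chi-square statistic, degrees of freedom) for correlation matrix R."""
    p = len(R)
    chi2 = -((n - 1) - (2 * p + 5) / 6) * math.log(determinant(R))
    df = p * (p - 1) // 2
    return chi2, df

# Hypothetical correlation matrix of three variables, n = 100 respondents.
R_example = [
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
]
chi2, df = bartlett_sphericity(R_example, n=100)
print(round(chi2, 2), df)
```

For an identity matrix the determinant is 1, its logarithm is 0, and the statistic is 0, which is exactly the case in which the null hypothesis cannot be rejected.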

What is the Kaiser-Meyer-Olkin (KMO) statistic? It is another useful statistic: a measure of sampling adequacy. Small values of the KMO statistic indicate that the correlations between pairs of variables cannot be explained by other variables and that factor analysis may not be appropriate.
The KMO statistic can also be used, rather than inspecting the correlation matrix directly, in the step of constructing the correlation matrix.

What are the two methods of conducting a factor analysis?
 Principal component analysis: how can we reduce the number of variables to a smaller number of components?
 Common factor analysis: what are the dimensions underlying the variables?

What is meant by principal component analysis, and when is it appropriate? In principal component analysis, the total variance in the data is considered. The diagonal of the correlation matrix consists of unities, and the full variance is brought into the factor matrix. Principal component analysis is recommended when the primary concern is to determine the minimum number of factors that will account for maximum variance in the data for use in subsequent multivariate analysis. The factors are called principal components.

What is meant by common factor analysis, and when is it appropriate? In common factor analysis, the factors are estimated based only on the common variance. Communalities are inserted in the diagonal of the correlation matrix. This method is appropriate when the primary concern is to identify the underlying dimensions and the common variance is of interest. This method is also known as principal axis factoring.

When do we use principal component analysis or common factor analysis?
 Principal component analysis: when the question is how to reduce the number of variables to a smaller number of components.
 Common factor analysis: when the question is which dimensions underlie the variables.

How does the principal component model work? Mathematically, each variable is expressed as a linear combination of the components. The covariation among the variables is described in terms of a small number of principal components. If the variables are standardized, the principal component model may be represented as: $X_i = A_{i1} C_1 + A_{i2} C_2 + A_{i3} C_3 + \dots + A_{im} C_m$
 where $X_i$ = the i-th standardized variable
 $A_{ij}$ = standardized multiple regression coefficient of variable i on principal component j
 $C_j$ = principal component j
 $m$ = number of components
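
A small worked case may make the model concrete. For two standardized variables with correlation r, the correlation matrix has eigenvalues 1 + r and 1 - r, and each loading equals the eigenvector entry times the square root of the eigenvalue. The sketch below uses that closed form; the function name `pca_loadings_2x2` and the value r = 0.6 are assumptions for illustration.

```python
import math

def pca_loadings_2x2(r):
    """Loadings A for two standardized variables with correlation r (closed form)."""
    eigenvalues = [1 + r, 1 - r]
    eigenvectors = [
        (1 / math.sqrt(2), 1 / math.sqrt(2)),   # first component
        (1 / math.sqrt(2), -1 / math.sqrt(2)),  # second component
    ]
    # A[i][j] = (entry i of eigenvector j) * sqrt(eigenvalue j)
    return [
        [eigenvectors[j][i] * math.sqrt(eigenvalues[j]) for j in range(2)]
        for i in range(2)
    ]

A = pca_loadings_2x2(0.6)

# With all components retained, the squared loadings of a variable sum to its
# full variance (1), and the loadings reproduce the original correlation.
variance_x1 = sum(a * a for a in A[0])
reproduced_r = sum(A[0][j] * A[1][j] for j in range(2))
print(round(variance_x1, 6), round(reproduced_r, 6))
```

This reflects the point that principal component analysis brings the full variance of each variable into the solution.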

How does the common factor analysis model work? Mathematically, each variable is expressed as a linear combination of underlying factors. The covariation among the variables is described in terms of a small number of common factors plus a unique factor for each variable. If the variables are standardized, the factor model may be represented as: $X_i = A_{i1} F_1 + A_{i2} F_2 + A_{i3} F_3 + \dots + A_{im} F_m + V_i U_i$
 where $X_i$ = the i-th standardized variable
 $A_{ij}$ = standardized multiple regression coefficient of variable i on common factor j
 $F_j$ = common factor j
 $V_i$ = standardized regression coefficient of variable i on unique factor i
 $U_i$ = the unique factor for variable i
 $m$ = number of common factors
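
The split into a common and a unique part can be written as a variance identity. The derivation below assumes an orthogonal solution: the common factors F_j are standardized and mutually uncorrelated, and the unique factor U_i is standardized and uncorrelated with every F_j. The symbol h_i^2 for the communality is notation introduced here, not taken from the summary.

```latex
\operatorname{Var}(X_i)
  = \sum_{j=1}^{m} A_{ij}^{2}\,\operatorname{Var}(F_j) + V_i^{2}\,\operatorname{Var}(U_i)
  = \underbrace{\sum_{j=1}^{m} A_{ij}^{2}}_{h_i^{2}\ \text{(communality)}} + V_i^{2}
  = 1
\qquad\Longrightarrow\qquad
V_i^{2} = 1 - h_i^{2}
```

Under these assumptions, the variance of a standardized variable that is not explained by the common factors is carried entirely by its unique factor.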

What is, in essence, the difference between the principal component model and the common factor analysis model? In the principal component model, each variable is expressed as a linear combination of the components. In the common factor analysis model, each variable is expressed as a linear combination of underlying common factors plus a unique factor.

What is meant by a priori determination? Sometimes the researcher has prior knowledge and therefore knows how many factors to expect.

What is meant by determination based on eigenvalues? In this approach, only factors with eigenvalues greater than 1.0 are retained. An eigenvalue represents the amount of variance associated with the factor. Hence, only factors with a variance greater than 1.0 are included. Factors with a variance less than 1.0 are no better than a single variable, since, due to standardization, each variable has a variance of 1.0. If the number of variables is less than 20, this approach will result in a conservative number of factors.
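
The eigenvalue rule is simple enough to sketch directly. The eigenvalues below are hypothetical values for six standardized variables (they sum to 6), and `kaiser_criterion` is a name chosen for this example.

```python
def kaiser_criterion(eigenvalues, threshold=1.0):
    """Keep only the eigenvalues strictly greater than the threshold."""
    return [ev for ev in eigenvalues if ev > threshold]

# Hypothetical eigenvalues of a 6 x 6 correlation matrix.
eigenvalues = [2.73, 2.22, 0.44, 0.34, 0.18, 0.09]
retained = kaiser_criterion(eigenvalues)
print(len(retained), retained)
```

With these assumed values, two factors would be retained, because only the first two each account for more variance than a single standardized variable.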

What is meant by determination based on percentage of variance? In this approach, the number of factors extracted is determined so that the cumulative percentage of variance extracted by the factors reaches a satisfactory level. It is recommended that the factors extracted should account for at least 60% of the variance.
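
A minimal sketch of the percentage-of-variance rule, assuming hypothetical eigenvalues and a helper name (`factors_for_variance`) invented for the example:

```python
def factors_for_variance(eigenvalues, target=0.60):
    """Return (number of factors, cumulative share) once the target share is reached."""
    total = sum(eigenvalues)
    cumulative = 0.0
    for k, ev in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += ev
        if cumulative / total >= target:
            return k, cumulative / total
    return len(eigenvalues), 1.0

# Hypothetical eigenvalues of a 6 x 6 correlation matrix.
eigenvalues = [2.73, 2.22, 0.44, 0.34, 0.18, 0.09]
n_factors, share = factors_for_variance(eigenvalues)
print(n_factors, round(share * 100, 1))
```

Here the first factor alone explains about 45.5% of the variance, so a second factor is needed to pass the 60% guideline.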

What is meant by determination based on the scree plot? A scree plot is a plot of the eigenvalues against the number of factors in order of extraction. The number of factors is read off at the point where the curve levels off (the elbow).

When do we speak of orthogonal rotation? When the axes are maintained at right angles.

When do we speak of oblique rotation? When the axes are not maintained at right angles, and the factors are correlated. Oblique rotation should be used if the factors are likely to be highly correlated.
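
To illustrate what keeping the axes at right angles means, the sketch below rotates a two-factor loading matrix by a fixed angle. The loadings, the angle of 40 degrees and the function name `rotate_loadings` are assumptions for the example; a real orthogonal method such as varimax would choose the angle by an optimization criterion rather than by hand.

```python
import math

def rotate_loadings(A, degrees):
    """Orthogonally rotate a two-factor loading matrix A by the given angle."""
    t = math.radians(degrees)
    T = [
        [math.cos(t), -math.sin(t)],
        [math.sin(t), math.cos(t)],
    ]
    # A* = A T, where T is an orthogonal (rotation) matrix.
    return [[row[0] * T[0][j] + row[1] * T[1][j] for j in range(2)] for row in A]

# Hypothetical unrotated loadings of four variables on two factors.
A = [[0.70, 0.50], [0.65, 0.55], [0.60, -0.55], [0.55, -0.60]]
A_star = rotate_loadings(A, 40)

communalities_before = [a * a + b * b for a, b in A]
communalities_after = [a * a + b * b for a, b in A_star]
print([round(h, 4) for h in communalities_before])
print([round(h, 4) for h in communalities_after])
```

The individual loadings change, but each variable's communality stays the same, which is why an orthogonal rotation redistributes explained variance among the factors without changing how much is explained in total.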

Why would you sometimes use sum scores?
 Easier to use for prediction-oriented research
 Only a small loss of information compared to factor scores
 Can be assessed more easily through reliability coefficients

What are the two statistics associated with factor analysis?
 Bartlett's test of sphericity: a test statistic used to examine the hypothesis that the variables are uncorrelated in the population. In other words, the population correlation matrix is an identity matrix; each variable correlates perfectly with itself (r = 1) but has no correlation with the other variables (r = 0).
 Correlation matrix: a lower-triangle matrix showing the simple correlations, r, between all possible pairs of variables included in the analysis. The diagonal elements, which are all 1, are usually omitted.

What is meant by communality? Communality is the amount of variance a variable shares with all the other variables being considered. This is also the proportion of variance explained by the common factors.

What is meant by the eigenvalue? The eigenvalue represents the total variance explained by each factor.
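
Both quantities can be read off a loading matrix: squared loadings summed across a row give a variable's communality, and squared loadings summed down a column give the variance explained by that factor (the eigenvalue in an unrotated orthogonal solution). The loading values below are hypothetical.

```python
# Hypothetical loadings of four standardized variables (rows) on two factors (columns).
A = [
    [0.80, 0.10],
    [0.75, 0.20],
    [0.15, 0.85],
    [0.10, 0.70],
]

# Communality: variance of each variable explained by the factors (row-wise).
communalities = [sum(a * a for a in row) for row in A]

# Variance explained by each factor (column-wise).
eigenvalues = [sum(row[j] ** 2 for row in A) for j in range(len(A[0]))]

# Percentage of total variance; the total equals the number of standardized variables.
percent_variance = [ev / len(A) * 100 for ev in eigenvalues]

print([round(h, 4) for h in communalities])
print([round(ev, 4) for ev in eigenvalues])
print([round(pv, 2) for pv in percent_variance])
```

The row totals and the column totals add up to the same overall amount of explained variance, since both sum the same squared loadings.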

What are factor loadings? Factor loadings are simple correlations between the variables and the factors. Variables load on factors, not the other way around.

What is a factor loading plot? A factor loading plot is a plot of the original variables using the factor loadings as coordinates.

What is a factor matrix? A factor matrix contains the factor loadings of all the variables on all the factors extracted.

What are factor scores? Factor scores are composite scores estimated for each respondent on the derived factors.

What does the Kaiser-Meyer-Olkin (KMO) statistic measure? The KMO measure of sampling adequacy is an index used to examine the appropriateness of factor analysis. High values (between 0.5 and 1.0) indicate that factor analysis is appropriate. Values below 0.5 imply that it may not be appropriate.
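
As a sketch of how the index is computed, the code below uses the standard definition: the sum of squared correlations divided by the sum of squared correlations plus squared partial correlations, with the partial correlations taken from the inverse of R. The example matrix and the helper names `invert` and `kmo` are assumptions for illustration.

```python
import math

def invert(M):
    """Matrix inverse by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    A = [
        list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(M)
    ]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(A[r][c]))
        if abs(A[pivot][c]) < 1e-12:
            raise ValueError("matrix is singular")
        A[c], A[pivot] = A[pivot], A[c]
        pv = A[c][c]
        A[c] = [v / pv for v in A[c]]
        for r in range(n):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [row[n:] for row in A]

def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    p = len(R)
    P = invert(R)
    sum_r2 = sum_q2 = 0.0
    for i in range(p):
        for j in range(p):
            if i != j:
                # Partial correlation of i and j, controlling for all other variables.
                partial = -P[i][j] / math.sqrt(P[i][i] * P[j][j])
                sum_r2 += R[i][j] ** 2
                sum_q2 += partial ** 2
    return sum_r2 / (sum_r2 + sum_q2)

# Hypothetical correlation matrix of three variables.
R_example = [
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
]
kmo_value = kmo(R_example)
print(round(kmo_value, 3))
```

Small partial correlations relative to the simple correlations push the index towards 1, which is the situation in which the correlations can be explained by the other variables.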

What is meant by the percentage of variance? The percentage of the total variance attributed to each factor.

What are residuals? Residuals are the differences between the observed correlations, as given in the input correlation matrix, and the reproduced correlations, as estimated from the factor matrix.
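
A minimal sketch of the residual calculation, assuming a one-factor orthogonal solution so that the reproduced correlation between two variables is the product of their loadings. The observed matrix, the loadings and the function name are hypothetical.

```python
# Hypothetical observed correlations and one-factor loadings.
R = [
    [1.00, 0.60, 0.50],
    [0.60, 1.00, 0.40],
    [0.50, 0.40, 1.00],
]
A = [[0.85], [0.70], [0.60]]

def residual_correlations(R, A):
    """Observed minus reproduced correlation for every pair of variables i < j."""
    p = len(R)
    result = {}
    for i in range(p):
        for j in range(i + 1, p):
            reproduced = sum(a * b for a, b in zip(A[i], A[j]))
            result[(i, j)] = R[i][j] - reproduced
    return result

residuals = residual_correlations(R, A)
for pair, value in residuals.items():
    print(pair, round(value, 3))
```

Many large residuals would suggest that the factor model does not reproduce the observed correlations well.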

When conducting a reliability analysis, what value is accepted? In the early stages of research, a value of 0.7 is accepted. In later stages, higher values should be used.
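
The summary does not name the reliability coefficient. The sketch below assumes it is Cronbach's alpha, the coefficient most commonly reported for summated scales, and uses hypothetical item scores.

```python
import statistics

def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of respondent scores per item."""
    k = len(items)
    sum_item_variances = sum(statistics.variance(col) for col in items)
    totals = [sum(scores) for scores in zip(*items)]  # summated score per respondent
    return k / (k - 1) * (1 - sum_item_variances / statistics.variance(totals))

# Hypothetical scores of five respondents on three items of one scale.
items = [
    [4, 5, 3, 2, 4],
    [4, 4, 3, 2, 5],
    [5, 5, 2, 2, 4],
]
alpha = cronbach_alpha(items)
print(round(alpha, 3), alpha >= 0.7)
```

The result would then be compared with the 0.7 guideline for early-stage research.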

Is it true that in common factor analysis, each factor consists of a common and a unique part? True or not?

Is it true that in order to determine the model fit, it is common to compare the symptotic correlations with the asymptotic correlations? True or not?

A variable should be interpreted in terms of the factors that load high on it. True or not?

When do we speak of a dependence technique? As soon as one variable depends on others, that is, a distinction is made between dependent and independent variables, it is a dependence technique.

Which techniques are dependence techniques and which are interdependence techniques?
 Factor analysis: interdependence, because there is no dependent variable
 ANOVA: dependence, because we have a dependent variable and independent variables
 ANCOVA: dependence, because we have a dependent variable and independent variables
 Cluster analysis: interdependence, because we do not distinguish between dependent and independent variables
 Structural equation modeling: dependence

Which measurement levels are necessary for factor analysis, ANOVA, ANCOVA, cluster analysis, regression and structural equation modeling?
 ANOVA: the dependent variable must be metric, the independent variables are categorical
 ANCOVA: the dependent variable must be metric; the independent variables include at least one categorical factor and one or more metric covariates
 Regression: the dependent variable is metric; the independent variables may be metric or categorical (dummy-coded)
 Structural equation modeling: usually all variables are metric, but when used for surveys, the dependent variable is metric and the independent variables could be categorical (large number of cases)
 Cluster analysis: usually all variables are metric, but as long as you choose the right distance measure, the measurement level matters less
Latest added flashcards
How is a significant moderator effect represented?
It is represented by a significant change in R-squared.
What are regression coefficients?
Regression coefficients describe changes in the dependent variable due to changes in an independent variable if all other independent variables remain constant (ceteris paribus).
Why is multicollinearity a problem?
Multicollinearity is a problem because it undermines the statistical significance of an independent variable.
What is multicollinearity?
This exists whenever an independent variable is highly correlated with one or more of the other independent variables in a multiple regression equation.
What should the VIF score be?
Below 10, and preferably below 5.
How is multicollinearity measured?
With the variance inflation factor (VIF).
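
As a small illustration, the VIF of a predictor follows from the R-squared obtained when that predictor is regressed on all the other predictors: VIF = 1 / (1 - R-squared). The function name `vif` and the input values are assumptions for the example.

```python
def vif(r_squared):
    """Variance inflation factor from the R-squared of one predictor on the others."""
    if not 0 <= r_squared < 1:
        raise ValueError("r_squared must lie in [0, 1)")
    return 1 / (1 - r_squared)

# Hypothetical R-squared values for one predictor regressed on the others.
for r2 in (0.0, 0.5, 0.8, 0.9):
    print(r2, round(vif(r2), 2))
```

Read this way, the thresholds of 5 and 10 correspond to a predictor whose variance is 80% and 90% explained by the other predictors.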
In the context of corrective actions for influential observations, what should be done with an exceptional observation that has no likely explanation?
This presents a special problem because there is no reason for deleting the case, but its inclusion cannot be justified either, suggesting analyses with and without the observation to make a complete assessment.
In the context of corrective actions for influential observations, what should be done with a valid but exceptional observation that is explainable by an extraordinary situation?
The remedy is to delete the case, unless the variables reflecting the extraordinary situation are included in the regression equation.
In the context of corrective actions for influential observations, how should an error in observations or data entry be solved?
The remedy is correcting the data or deleting the case.
What is practical significance?
Practical significance concerns whether the relationship between variables matters in real-world applications. It indicates the relevance of the study's findings.