# Summary: Multivariate Data Analysis

ISBN-10 129202190X ISBN-13 9781292021904
289 flashcards and notes
3 students


## This is the summary of the book "Multivariate Data Analysis". The ISBN of this book is 9781292021904 or 129202190X. This summary was written by students who study effectively with the Study Smart With Chris study tool.


## 1 Lecture: Factor Analysis

• What is factor analysis?
Factor analysis is a general name denoting a class of procedures primarily used for data reduction and summarization.
• How is factor analysis an interdependence technique?
Factor analysis is an interdependence technique in that an entire set of interdependent relationships is examined without distinguishing between dependent and independent variables.
• In what circumstances is factor analysis used?
• To identify underlying dimensions, or factors, that explain the correlations among a set of variables.
• To identify a new, smaller set of uncorrelated variables to replace the original set of correlated variables in subsequent multivariate analysis (e.g. regression or ANCOVA).
• What are the applications of factor analysis?
• Assessing the validity of construct measurements
• Market segmentation: identifying the underlying variables on which to group the customers
• Product research: determining the brand attributes that influence consumers' choice
• Price management: identifying the characteristics of price-sensitive consumers
• What are the two forms of factor analysis?
Exploratory and confirmatory.
• What is meant with exploratory factor analysis?
• Researcher intends to find an underlying structure
• Researcher assumes that superordinate factors cause the correlations between variables
• Researcher intends to reveal interrelationships
• Generation of hypotheses
• What is meant with confirmatory factor analysis?
• Researcher has a priori ideas of the underlying factors that derive from theory
• Researcher identifies relationships between variables and factors before conducting the factor analysis
• Testing hypotheses
• Special case of structural equation models
• What is meant with data matrix X?
X contains the characteristics of objects, as obtained from questioning persons.
• What is meant with standardized data matrix Z?
Z contains the standardized characteristics of objects (standardized data always have a mean of 0 and a standard deviation of 1).
• What is meant with correlation matrix R
R describes statistical interrelationships between the variables. (The diagonal just contains 1)
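The step from X to Z to R can be sketched in a few lines of NumPy. This is a minimal illustration, not part of the original summary: the ratings in `X` and the helper names `standardize` and `correlation_matrix` are made up for the example.

```python
import numpy as np

# Hypothetical data matrix X: 6 respondents (rows) rated 3 items (columns).
X = np.array([
    [7.0, 3.0, 6.0],
    [1.0, 3.0, 2.0],
    [6.0, 2.0, 7.0],
    [4.0, 5.0, 4.0],
    [1.0, 2.0, 2.0],
    [6.0, 3.0, 6.0],
])

def standardize(X):
    """Return Z: every column gets mean 0 and sample standard deviation 1."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def correlation_matrix(Z):
    """Return R = Z'Z / (n - 1) for an already standardized matrix Z."""
    n = Z.shape[0]
    return Z.T @ Z / (n - 1)

Z = standardize(X)
R = correlation_matrix(Z)  # symmetric, with 1s on the diagonal
```

Because Z is standardized, R computed this way equals the ordinary Pearson correlation matrix of X.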
• What is meant with reduced correlation matrix R*?
R* is the correlation matrix with the estimated communalities on its main diagonal (in place of the 1s).
The factor loading matrix contains the correlations between factors and variables.
The rotated factor loading matrix contains the correlations between factors and variables after rotation of the coordinate axes.
• What is meant with the factor scores P?
P contains the scores of each object on the extracted factors.
• What are the steps of conducting factor analysis?
1. Problem formulation
2. Construction of the correlation matrix
3. Selecting the method of factor analysis
4. Determination of the number of factors
5. Rotation of factors
6. Interpretation of factors
7. Calculation of factor scores, or selection of surrogate variables, or creation of summated scores
8. Determination of model fit
• What is meant with formulating the problem in factor analysis?
• The objectives of factor analysis should be identified.
• Data summarization
• Data reduction
• The variables to be included in the factor analysis should be specified based on past research, theory, and judgment of the researcher. It is important that the variables be appropriately measured on an interval or ratio scale.
• An appropriate sample size should be used. As a rough guideline, there should be at least four or five times as many observations (sample size) as there are variables.
• There are two statistics that can be used to see if factor analysis is appropriate. Which two and what do they say?
• Bartlett's test of sphericity can be used to test the null hypothesis that the variables are uncorrelated in the population.
• The Kaiser-Meyer-Olkin (KMO) statistic measures sampling adequacy. Small values of the KMO statistic indicate that the correlations between pairs of variables cannot be explained by other variables and that factor analysis may not be appropriate.
• What is meant with constructing the correlation matrix?
• The analytical process is based on a matrix of correlations between the variables.
• Bartlett's test of sphericity can be used to test the null hypothesis that the variables are uncorrelated in the population: In other words, the population correlation matrix is an identity matrix. If this hypothesis cannot be rejected, then the appropriateness of factor analysis should be questioned.
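As a rough sketch of how Bartlett's test of sphericity is computed: the usual chi-square approximation is -(n - 1 - (2p + 5)/6) · ln|R| with p(p - 1)/2 degrees of freedom, where p is the number of variables and n the sample size. The function name, the example matrix and the sample size of 100 below are assumptions for illustration only.

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Chi-square statistic and degrees of freedom for H0: R is an identity matrix.

    R is a p x p correlation matrix, n the number of observations.
    """
    p = R.shape[0]
    _, log_det = np.linalg.slogdet(R)  # ln|R|, computed in a numerically stable way
    statistic = -(n - 1 - (2 * p + 5) / 6) * log_det
    df = p * (p - 1) // 2
    return statistic, df

# Hypothetical correlation matrix of three clearly correlated variables.
R_example = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])
stat, df = bartlett_sphericity(R_example, n=100)

# For an identity matrix ln|R| = 0, so the statistic is 0 and H0 is not rejected.
stat_identity, _ = bartlett_sphericity(np.eye(3), n=100)
```

A large statistic relative to the chi-square distribution with `df` degrees of freedom rejects the null hypothesis; if SciPy is available, `scipy.stats.chi2.sf(stat, df)` gives the p-value.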
• What is the Kaiser-Meyer-Olkin (KMO) statistic?
It is another useful statistic: a measure of sampling adequacy. Small values of the KMO statistic indicate that the correlations between pairs of variables cannot be explained by other variables and that factor analysis may not be appropriate.

This statistic can be examined, next to Bartlett's test, in the step of constructing the correlation matrix.
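A minimal sketch of the overall KMO index, assuming the standard definition: the sum of squared simple correlations divided by the sum of squared simple correlations plus squared partial correlations (off-diagonal elements only). The partial correlations are taken from the inverse of R. The function name `kmo` and the example matrix are illustrative.

```python
import numpy as np

def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure for a correlation matrix R."""
    inv_R = np.linalg.inv(R)
    scale = np.sqrt(np.diag(inv_R))
    # Partial correlation of each pair, controlling for all other variables.
    partial = -inv_R / np.outer(scale, scale)
    off_diagonal = ~np.eye(R.shape[0], dtype=bool)
    r2 = np.sum(R[off_diagonal] ** 2)
    q2 = np.sum(partial[off_diagonal] ** 2)
    return r2 / (r2 + q2)

# Hypothetical correlation matrix.
R_example = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])
kmo_value = kmo(R_example)
```

When the partial correlations are small relative to the simple correlations, the index approaches 1; when they are equally large, as is always the case with only two variables, it equals 0.5.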
• What are the two methods of conducting a factor analysis?
• Principal component analysis
• How can we reduce the number of variables to a smaller number of components?
• Common factor analysis
• What are the dimensions that are underlying the variables?
• What is meant with principal component analysis and when is this one appropriate?
In principal component analysis, the total variance in the data is considered. The diagonal of the correlation matrix consists of unities, and full variance is brought into the factor matrix. Principal component analysis is recommended when the primary concern is to determine the minimum number of factors that will account for maximum variance in the data for use in subsequent multivariate analysis. The factors are called principal components.
• What is meant with common factor analysis and when is it appropriate?
In common factor analysis, the factors are estimated based only on the common variance. Communalities are inserted in the diagonal of the correlation matrix. This method is appropriate when the primary concern is to identify the underlying dimensions and the common variance is of interest. This method is also known as principal axis factoring.
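One common way to carry out principal axis factoring is to iterate: put communality estimates on the diagonal of R, extract factors from that reduced matrix, recompute the communalities from the loadings, and repeat. The sketch below assumes squared multiple correlations as start values and a fixed number of factors; the function name, the stopping rule and the example loadings are assumptions, not taken from the summary.

```python
import numpy as np

def principal_axis_factoring(R, n_factors, max_iter=200, tol=1e-6):
    """Iterated principal axis factoring on a correlation matrix R."""
    # Start values: squared multiple correlation of each variable with the rest.
    communalities = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(max_iter):
        R_reduced = R.copy()
        np.fill_diagonal(R_reduced, communalities)  # communalities replace the 1s
        eigenvalues, eigenvectors = np.linalg.eigh(R_reduced)
        order = np.argsort(eigenvalues)[::-1][:n_factors]
        kept = np.clip(eigenvalues[order], 0.0, None)
        loadings = eigenvectors[:, order] * np.sqrt(kept)
        new_communalities = (loadings ** 2).sum(axis=1)
        converged = np.max(np.abs(new_communalities - communalities)) < tol
        communalities = new_communalities
        if converged:
            break
    return loadings, communalities

# Hypothetical population: six variables driven by two uncorrelated factors.
A_true = np.array([
    [0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
    [0.0, 0.8], [0.0, 0.7], [0.0, 0.6],
])
R = A_true @ A_true.T
np.fill_diagonal(R, 1.0)

loadings, communalities = principal_axis_factoring(R, n_factors=2)
```

The loadings are only determined up to rotation, so they need not match `A_true` column by column; the communalities are the quantity to compare.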
• When do we use principal component or common factor analysis?
• Principal component analysis:
How can we reduce the number of variables to a smaller number of components?
• Common factor analysis:
What are the dimensions that are underlying the variables?
• How does the principal component model work?
Mathematically, each variable is expressed as a linear combination of the components. The covariation among the variables is described in terms of a small number of principal components. If the variables are standardized, the principal component model may be represented as:

$$X_i = A_{i1} C_1 + A_{i2} C_2 + A_{i3} C_3 + \dots + A_{im} C_m$$

• $X_i$ = ith standardized variable
• $A_{ij}$ = standardized multiple regression coefficient of variable i on principal component j
• $C_j$ = principal component j
• $m$ = number of components
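The principal component model can be illustrated by an eigendecomposition of the correlation matrix. In this sketch the loadings are the eigenvectors scaled by the square roots of their eigenvalues, which assumes that the components are standardized to unit variance; the function name and the example matrix are hypothetical.

```python
import numpy as np

def principal_components(R):
    """Return eigenvalues (descending) and the loading matrix A of R."""
    eigenvalues, eigenvectors = np.linalg.eigh(R)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = np.clip(eigenvalues[order], 0.0, None)
    eigenvectors = eigenvectors[:, order]
    # Column j of A holds the loadings of all variables on component j.
    loadings = eigenvectors * np.sqrt(eigenvalues)
    return eigenvalues, loadings

# Hypothetical correlation matrix of three variables.
R = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])
eigenvalues, A = principal_components(R)

# Keeping only the first component gives the reduced model with m = 1.
A_reduced = A[:, :1]
```

With all components retained, `A @ A.T` reproduces R exactly and the eigenvalues sum to the number of variables; dropping components trades reproduction accuracy for parsimony.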
• How does the common factor analysis model work?
Mathematically, each variable is expressed as a linear combination of underlying factors. The covariation among the variables is described in terms of a small number of common factors plus a unique factor for each variable. If the variables are standardized, the factor model may be represented as:

$$X_i = A_{i1} F_1 + A_{i2} F_2 + A_{i3} F_3 + \dots + A_{im} F_m + V_i U_i$$

• $X_i$ = ith standardized variable
• $A_{ij}$ = standardized multiple regression coefficient of variable i on common factor j
• $F_j$ = common factor j
• $V_i$ = standardized regression coefficient of variable i on unique factor i
• $U_i$ = the unique factor for variable i
• $m$ = number of common factors
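To see what the common factor model implies, one can simulate data from it. The sketch below assumes that the common factors are uncorrelated with each other and with the unique factors (an orthogonal factor model, which the summary does not state explicitly); the loadings in `A` are invented for the example.

```python
import numpy as np

# Hypothetical loadings A (4 variables, 2 common factors).
A = np.array([
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.9],
    [0.2, 0.6],
])
communality = (A ** 2).sum(axis=1)
V = np.sqrt(1.0 - communality)  # coefficient of each variable on its unique factor

# Correlation matrix implied by the model (assumption: orthogonal factors).
R_implied = A @ A.T + np.diag(V ** 2)

# Simulate X_i = sum_j A_ij * F_j + V_i * U_i for many respondents.
rng = np.random.default_rng(0)
n = 50_000
F = rng.standard_normal((n, 2))  # common factors
U = rng.standard_normal((n, 4))  # unique factors, one per variable
X = F @ A.T + U * V
R_sample = np.corrcoef(X, rowvar=False)
```

The sample correlations come close to `R_implied`: the common factors account for the off-diagonal correlations, while the unique factors only fill up each variable's variance to 1.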
• What is in essence the difference between the principal component model and the common factor analysis model?
With the principal component model, each variable is expressed as a linear combination of the components only.

With the common factor analysis model, each variable is expressed as a linear combination of underlying common factors plus a unique factor for that variable.
• What is meant with a priori determination?
Sometimes the researcher knows from prior knowledge how many factors to expect and can therefore specify the number of factors to extract beforehand.
• What is meant with determination Based on Eigenvalues?
In this approach, only factors with Eigenvalues greater than 1.0 are retained. An Eigenvalue represents the amount of variance associated with the factor. Hence, only factors with a variance greater than 1.0 are included. Factors with variance less than 1.0 are no better than a single variable, since, due to standardization, each variable has a variance of 1.0. If the number of variables is less than 20, this approach will result in a conservative number of factors.
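The eigenvalue rule is easy to apply once the correlation matrix is available. A minimal sketch, with a made-up correlation matrix in which variables 1 and 2 and variables 3 and 4 form two correlated pairs:

```python
import numpy as np

def kaiser_criterion(R):
    """Return the number of eigenvalues of R above 1.0 and the sorted eigenvalues."""
    eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]
    return int(np.sum(eigenvalues > 1.0)), eigenvalues

# Hypothetical correlation matrix with two blocks of correlated variables.
R = np.array([
    [1.0, 0.7, 0.1, 0.1],
    [0.7, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.6],
    [0.1, 0.1, 0.6, 1.0],
])
n_factors, eigenvalues = kaiser_criterion(R)
```

Here two eigenvalues exceed 1.0, so two factors would be retained. The eigenvalues always sum to the number of variables, which is why 1.0 corresponds to the variance of a single standardized variable.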
• What is meant with determination based on percentage of variance?
In this approach, the number of factors extracted is determined so that the cumulative percentage of variance extracted by the factors reaches a satisfactory level. It is recommended that the factors extracted should account for at least 60% of the variance.
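The percentage-of-variance rule can be sketched as a cumulative sum over the sorted eigenvalues. The 60% threshold is the one named above; the function name and the example matrix are assumptions for illustration.

```python
import numpy as np

def factors_for_variance(R, threshold=0.60):
    """Smallest number of factors whose cumulative variance share reaches threshold."""
    eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]
    cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
    # Index of the first cumulative share at or above the threshold, plus one.
    n_factors = int(np.argmax(cumulative >= threshold - 1e-12)) + 1
    return n_factors, cumulative

# Hypothetical correlation matrix with two blocks of correlated variables.
R = np.array([
    [1.0, 0.7, 0.1, 0.1],
    [0.7, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.6],
    [0.1, 0.1, 0.6, 1.0],
])
n_factors, cumulative = factors_for_variance(R)
```

In this example the first factor alone stays below 60% of the variance, while the first two together account for 82.5%, so two factors are extracted.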
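• What is meant with determination based on the scree plot?
A scree plot is a plot of the eigenvalues against the number of factors in order of extraction. Typically the plot shows a distinct break between the steep slope of the factors with large eigenvalues and a gradual trailing off (the scree); the number of factors is read off at that break.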
• When are we talking about orthogonal rotation?
When the axes are maintained at right angles, so that the rotated factors remain uncorrelated.
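Varimax is a widely used orthogonal rotation; the summary does not name a specific method, so it is used here only as an example. The sketch follows the usual SVD-based iteration. The function name, the iteration limits and the unrotated loadings are illustrative assumptions.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Rotate a loading matrix orthogonally; returns (rotated loadings, rotation T)."""
    p, k = loadings.shape
    T = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ T
        # Gradient-like target of the varimax criterion.
        target = rotated ** 3 - rotated @ np.diag((rotated ** 2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        T = u @ vt  # nearest orthogonal matrix, so the axes stay at right angles
        new_criterion = s.sum()
        if criterion != 0.0 and new_criterion < criterion * (1.0 + tol):
            break
        criterion = new_criterion
    return loadings @ T, T

# Hypothetical unrotated loadings: every variable loads on both factors.
A_unrotated = np.array([
    [0.7, 0.5],
    [0.7, 0.4],
    [0.6, -0.5],
    [0.6, -0.6],
])
A_rotated, T = varimax(A_unrotated)
```

Because T is orthogonal, the communalities (row sums of squared loadings) are unchanged by the rotation; only the distribution of the loadings over the factors changes, which makes the factors easier to interpret.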
• When are we talking about oblique rotation?
When the axes are not maintained at right angles, and the factors are correlated. Oblique rotation should be used if it is likely that the factors are highly correlated.
• Why would you sometimes use sum scores?
• Easier to use for prediction-oriented research
• Only a small loss of information compared to factor scores
• Can be assessed more easily through reliability coefficients
• What are the two statistics associated with factor Analysis?
• Bartlett's test of sphericity. This is a test statistic used to examine the hypothesis that the variables are uncorrelated in the population. In other words, the population correlation matrix is an identity matrix; each variable correlates perfectly with itself (r=1) but has no correlation with the other variables (r=0)
• Correlation Matrix: A correlation matrix is a lower triangle matrix showing the simple correlations, r, between all possible pairs of variables included in the analysis. The diagonal elements, which are all 1, are usually omitted
• What is meant with communality?
Communality is the amount of variance a variable shares with all the other variables being considered. This is also the proportion of variance explained by the common factors.
• What is meant with the eigenvalue?
The eigenvalue represents the total variance explained by each factor.
Factor loadings are simple correlations between the variables and the factors. Variables load on factors, not the other way around.
• What is a factor matrix?
A factor matrix contains the factor loadings of all the variables on all the factors extracted.
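Given a factor matrix, the communalities, eigenvalues and percentages of variance follow from sums of squared loadings. A small sketch with invented loadings, assuming uncorrelated factors:

```python
import numpy as np

# Hypothetical factor matrix: rows are variables, columns are factors.
A = np.array([
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.9],
    [0.2, 0.6],
])

# Communality of a variable: sum of its squared loadings across the factors (row-wise).
communalities = (A ** 2).sum(axis=1)

# Variance explained by a factor: sum of squared loadings down its column.
eigenvalues = (A ** 2).sum(axis=0)

# Percentage of total variance: each standardized variable contributes variance 1.
pct_variance = eigenvalues / A.shape[0] * 100.0
```

For these loadings the communalities are 0.65, 0.53, 0.82 and 0.40, and the two factors explain 29.5% and 30.5% of the total variance. After a rotation the column sums are usually reported as sums of squared loadings rather than eigenvalues.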
• What are the factor Scores?
Factor scores are composite scores estimated for each respondent on the derived factors.
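One common way to estimate factor scores is the regression method, which multiplies the standardized data by the weights R⁻¹A. The summary does not say which estimator is meant, so this is only one possible reading; the simulated data and all names below are hypothetical.

```python
import numpy as np

# Simulated survey data: two pairs of items, each pair driven by one dimension.
rng = np.random.default_rng(0)
n = 200
common = rng.standard_normal((n, 2))
X = np.column_stack([
    common[:, 0] + 0.5 * rng.standard_normal(n),
    common[:, 0] + 0.5 * rng.standard_normal(n),
    common[:, 1] + 0.5 * rng.standard_normal(n),
    common[:, 1] + 0.5 * rng.standard_normal(n),
])
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(Z, rowvar=False)

# Loadings of the first two principal components.
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1][:2]
A = eigenvectors[:, order] * np.sqrt(eigenvalues[order])

def factor_scores(Z, R, A):
    """Regression-method scores: one row per respondent, one column per factor."""
    weights = np.linalg.solve(R, A)  # factor score coefficients R^-1 A
    return Z @ weights

scores = factor_scores(Z, R, A)
```

With principal component loadings the resulting scores are standardized and mutually uncorrelated, which is what makes them convenient inputs for subsequent regression analysis.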
• What does the Kaiser-Meyer-Olkin (KMO) measure?
The KMO measure of sampling adequacy is an index used to examine the appropriateness of factor analysis. High values (Between 0.5 and 1.0) indicate factor analysis is appropriate. If below 0.5, it may not be appropriate.
• What is meant with the percentage of variance?
The percentage of the total variance attributed to each factor
• What are residuals?
Residuals are the differences between the observed correlations, as given in the input correlation matrix, and the reproduced correlations, as estimated from the factor matrix.
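The residuals can be sketched as the observed correlation matrix minus the correlations reproduced from the loadings. Counting residuals above 0.05 in absolute value is a common convention for judging model fit, but that cutoff is an assumption here, as are the example matrix and the names.

```python
import numpy as np

def residual_correlations(R, A):
    """Observed minus reproduced correlations; the diagonal is set to 0."""
    residuals = R - A @ A.T
    np.fill_diagonal(residuals, 0.0)
    return residuals

# Hypothetical correlation matrix with two blocks of correlated variables.
R = np.array([
    [1.0, 0.7, 0.1, 0.1],
    [0.7, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.6],
    [0.1, 0.1, 0.6, 1.0],
])
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
A_all = eigenvectors[:, order] * np.sqrt(eigenvalues[order])
A_two = A_all[:, :2]  # keep two components

residuals = residual_correlations(R, A_two)
upper = residuals[np.triu_indices_from(residuals, k=1)]
share_large = np.mean(np.abs(upper) > 0.05)  # share of large residuals
```

Many large residuals indicate that the retained factors do not reproduce the observed correlations well; with all components retained the residuals vanish.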
• When conducting a reliability analysis, what value is accepted?
In the early stages of research, a value of 0.7 is accepted. In later stages, higher values should be required.
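The card does not name the reliability coefficient; assuming it refers to Cronbach's alpha, which is the usual choice for summated scales, a minimal sketch looks as follows. The item ratings are invented.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n respondents x k items) array."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_variances.sum() / total_variance)

# Hypothetical ratings of 5 respondents on 3 items that load on the same factor.
items = np.array([
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
    [3, 3, 3],
])
summated_score = items.sum(axis=1)  # one sum score per respondent
alpha = cronbach_alpha(items)
```

For these ratings alpha is well above the 0.7 level mentioned above, so the three items could reasonably be combined into one summated score.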
• Is it true that in common factor analysis, each factor consists of a common and a unique part?
True or not?
• Is it true that in order to determine the model fit, it is common to compare the symptotic correlations with the asymptotic correlations?
True or not?
• A variable should be interpreted in terms of the factors that load high on it?
True or not?
• When are we talking about a dependence technique?
As soon as one or more variables are treated as depending on other (independent) variables, it is a dependence technique.
• Which techniques are dependence techniques and which are interdependence techniques?
• Factor analysis: interdependence, because there is no dependent variable
• Anova: dependence, because we have a dependent variable and independent variables
• Ancova: dependence, because we have a dependent variable and independent variables
• Cluster: interdependence, because we do not distinguish between dependent and independent variables
• Structural equation modeling: dependence
• Which measurement levels are necessary for Factor, Anova, Ancova, Cluster, Regression and Structural equation modeling?
• Factor: the variables should be metric (interval or ratio scale)
• Anova: dependent must be metric, independent is categorical
• Ancova: dependent must be metric; the independents combine categorical factors with metric covariates
• Regression: dependent is metric, independent does not matter
• Structural equation modeling: usually all variables are metric, but when used for surveys, dependent is metric and independent could be categorical (large number of cases)
• Cluster: usually all metric, but as long as you choose the right distance measure it is not that important.

### Most recently added flashcards

How is a significant moderator effect represented?
It is represented by a significant change in R-squared when the moderator (interaction) term is added to the model.
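The change in R-squared can be sketched by fitting the regression with and without the interaction term and computing the F statistic for the change. The simulated data, the coefficient values and the variable names are hypothetical; the F formula is the standard test for added predictors.

```python
import numpy as np

def r_squared(design, y):
    """R-squared of an OLS fit; design must include a column of ones."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    centered = y - y.mean()
    return 1.0 - (residuals @ residuals) / (centered @ centered)

# Simulated data in which m moderates the effect of x on y.
rng = np.random.default_rng(42)
n = 300
x = rng.standard_normal(n)
m = rng.standard_normal(n)
y = 1.0 + 0.5 * x + 0.3 * m + 0.8 * x * m + rng.standard_normal(n)

ones = np.ones(n)
r2_reduced = r_squared(np.column_stack([ones, x, m]), y)
r2_full = r_squared(np.column_stack([ones, x, m, x * m]), y)
r2_change = r2_full - r2_reduced

q = 1       # number of added terms (the interaction)
k_full = 3  # number of predictors in the full model
f_change = (r2_change / q) / ((1.0 - r2_full) / (n - k_full - 1))
```

The statistic `f_change` is compared with the F distribution with q and n - k_full - 1 degrees of freedom; a significant value means the moderator term adds explanatory power.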
What are regression coefficients?
Regression coefficients describe changes in the dependent variable due to changes in an independent variable if all other independent variables remain constant (ceteris paribus).
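The ceteris paribus reading can be made concrete with a small ordinary least squares fit. The data are simulated with known coefficients (2.0, 1.5 and -0.5), all of which are assumptions chosen for the example.

```python
import numpy as np

# Simulated data with known coefficients.
rng = np.random.default_rng(1)
n = 500
x1 = rng.standard_normal(n)
x2 = rng.standard_normal(n)
y = 2.0 + 1.5 * x1 - 0.5 * x2 + 0.1 * rng.standard_normal(n)

# OLS fit: the first column of ones estimates the intercept.
design = np.column_stack([np.ones(n), x1, x2])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)

# Ceteris paribus: raise x1 by one unit while x2 stays fixed.
before = np.array([1.0, 0.0, 0.3]) @ coef
after = np.array([1.0, 1.0, 0.3]) @ coef
change_in_y = after - before  # equals the coefficient of x1
```

The predicted change in y for a one-unit increase in x1, with x2 held constant, is exactly the estimated coefficient of x1.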
Why is multicollinearity a problem?
Multicollinearity is a problem because it undermines the statistical significance of an independent variable.
What is multicollinearity?
This exists whenever an independent variable is highly correlated with one or more of the other independent variables in a multiple regression equation.
What should the VIF score be?
<10, better even <5
How is multicollinearity measured?
With the VIF score
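The VIF of predictor j equals 1 / (1 - R²_j), where R²_j comes from regressing predictor j on the other predictors. For standardized predictors the same values are found on the diagonal of the inverse correlation matrix, which is what this sketch uses. The simulated predictors are hypothetical.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column of the predictor matrix X."""
    R = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(R))

rng = np.random.default_rng(7)
n = 1000
x1 = rng.standard_normal(n)
x2 = rng.standard_normal(n)
x3 = x1 + x2 + 0.1 * rng.standard_normal(n)  # almost a linear combination of x1 and x2

vif_independent = vif(np.column_stack([x1, x2]))
vif_collinear = vif(np.column_stack([x1, x2, x3]))
```

The two independent predictors have VIF values close to 1, while adding the nearly redundant x3 pushes all three values far above the threshold of 10.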
In the context of corrective actions for influential observations, what should be done with an exceptional observation with no likely explanation?
It presents a special problem because there is no reason for deleting the case, but its inclusion cannot be justified either, suggesting analyses with and without the observation to make a complete assessment.
In the context of corrective actions for influential observations, what should be done with a valid but exceptional observation that is explainable by an extraordinary situation?
The remedy is to delete the case. This should not be done when the variables reflecting the extraordinary situation are included in the regression equation.
In the context of corrective actions for influential observations, how should an error in observations or data entry be solved?
The remedy is correcting the data or deleting the case.
What is practical significance?
Practical significance concerns whether the relationship between variables matters in real-world applications. It expresses the relevance of the study.