Factor Structure of the Social Appearance Anxiety Scale in Turkish Early Adolescents

Although the Social Appearance Anxiety Scale (SAAS) has most often been validated with confirmatory factor analysis (CFA) in undergraduate samples, its underlying factor structure also needs to be examined with exploratory factor analysis (EFA) and multiple factor retention decision criteria, both to prevent over- and under-factoring and to reveal where different factor analysis methods converge and diverge. This study examines the factor structure of the SAAS using EFA with multiple factor retention decision criteria, followed by CFA, in a large sample of adolescents in Turkey via secondary data analysis. The participants were 2,098 students (1,072 female and 1,026 male; M = 12.77 years, SD = .96) attending 22 junior high schools. Results suggest that the SAAS is a valid and reliable measure with a unidimensional structure for Turkish early adolescents.


Introduction
Social anxiety and body image disorders affect a considerable number of people in the world. Lifetime prevalence estimates for both disorders range from 1.7% to 13.3% in large-scale surveys of diverse populations [1][2][3]. Thus, measuring the symptoms of social anxiety and body image disorders in children, adolescents, and adults is crucial for the early identification of individuals at risk and for the development of intervention programs.
A large number of self-report measures of social anxiety/social phobia and body image are available in the literature for adolescents and adults, such as the Liebowitz Social Anxiety Scale [4], the Social Anxiety Scale for Adolescents [5], the Body-Image Ideals Questionnaire [6], the Appearance Schemas Inventory [7], and the Body-Esteem Scale for Adolescents and Adults [8].
However, only a few scales assess concerns regarding an individual's physical appearance. The Social Appearance Anxiety Scale (SAAS) is a relatively new instrument in the psychology literature that was developed to measure anxiety in situations where one's physical appearance may be evaluated [9].
The SAAS is a brief scale composed of 16 items that participants rate on a five-point Likert-type scale ranging from strongly disagree to strongly agree. The items consist of cognitive, emotional, and behavioral statements associated with social appearance anxiety. In their initial development study, Hart et al. [9] reported that the SAAS is a one-factor scale, using a polychoric correlation matrix with the weighted least squares (WLS) factor extraction method. They used the scree test and the root mean square error of approximation (RMSEA) to decide the number of factors to retain. Confirmatory factor analyses also demonstrated a good fit to the data.
After this initial study, Ko [10] translated the instrument into German and Korean in his cross-cultural doctoral study and found two factors for Germans and three factors for Koreans, using the Pearson correlation matrix and the eigenvalue-greater-than-one rule to determine the number of factors to retain. Kadir, Rahman, and Desa [11] also found a three-factor structure for the SAAS using principal component analysis with varimax rotation in Malaysian undergraduate students. Levinson and Rodebaugh [12] examined the factor structure using both exploratory and confirmatory factor analysis in two studies. EFA suggested two factors, with Item 2 and Item 3 loading on a separate factor and the remaining fourteen items loading on the other factor. However, neither a two-factor CFA model nor a one-factor model with correlated error terms for Item 2 and Item 3 improved model fit compared with a one-factor model (Study 1, Study 2). Study 1 also revealed that Item 1 might be problematic.
A series of studies by Doğan [13,14] also investigated the factor structure of the SAAS in both adolescents and undergraduate students, employing Pearson's correlation matrix. He used the total variance criterion for the unrotated factor solution as a factor retention rule and reported a single-factor structure, in line with Levinson and Rodebaugh's study. The SAAS also demonstrated a single-factor structure in female eating disorder patients using a polychoric correlation matrix with a weighted least squares estimation method and the scree test as the factor retention decision rule [15]. Even though these studies supported the single-factor structure of the SAAS, they need to be replicated for at least three reasons.
Firstly, despite the evidence for the factorial validity of the SAAS in English-speaking samples, cross-cultural studies report inconsistent findings: English-speaking samples produce evidence for a single-factor structure, whereas other cross-cultural studies have failed to confirm this structure (e.g., Ko [10], Kadir et al. [11]). Secondly, most of the previous studies have examined the factor structure using a polychoric correlation matrix. It has long been argued that Likert-type response formats can be assumed to yield interval scales, and may even approximate a ratio scale if properly anchored [16]. Measurement experts also recommend using polychoric correlations when the univariate distributions of the items are asymmetric or show excess kurtosis; if both indices are lower than one in absolute value, the Pearson correlation is advised [17]. Additionally, simulation studies have shown that when items are measured on an ordinal scale with five or more categories (such as the SAAS items, rated on a 5-point Likert-type scale), they can be safely treated as continuous if they are not skewed or kurtotic [18]. Although skewness and kurtosis values are sample-dependent, analyzing SAAS items using a Pearson correlation matrix is more appropriate when these values fall within the expected range (e.g., skewness lower than 2 and kurtosis lower than 7), as in this study. At the same time, the polychoric correlation matrix has some disadvantages, such as inflating factor loadings. Compared with the polychoric correlation matrix, the Pearson correlation matrix produces lower factor loadings, so the replicability of the factor loadings and the factor structure may increase.
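The distributional screening described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the study: the function names, the example responses, and the choice of simple moment-based estimators (with kurtosis expressed as excess kurtosis) are all assumptions; the cutoffs of 2 and 7 are the ones quoted in the text.

```python
def skewness(xs):
    """Moment-based skewness: m3 / m2**1.5 (0 for a symmetric distribution)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def excess_kurtosis(xs):
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (0 for a normal curve)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3.0


def pearson_is_defensible(xs, skew_cut=2.0, kurt_cut=7.0):
    """True when |skewness| and |kurtosis| both fall below the quoted cutoffs."""
    return abs(skewness(xs)) < skew_cut and abs(excess_kurtosis(xs)) < kurt_cut


# Hypothetical responses to one 5-point item (not data from this study).
item = [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5]
print(round(skewness(item), 2), round(excess_kurtosis(item), 2))
print(pearson_is_defensible(item))
```

Passing `skew_cut=1.0, kurt_cut=1.0` would apply the stricter rule attributed to [17] instead.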
Lastly, a typical factor analysis conducted for scale development or validation consists of collecting data, generating a correlation or covariance matrix, and deciding how many factors to retain. It additionally includes performing factor rotation to reach a final solution (preferably), interpreting the factor structure, and constructing factor scores for use in further research.
For many researchers, probably the most vexing part of this process is deciding how many factors to retain after extraction. Many rules of thumb have been proposed to determine the optimum number of factors to retain. Failing to retain a theoretically sound number of factors threatens both the reliability of the factors across samples and construct validity. It should be noted that over-factoring is less detrimental than under-factoring [19], which is more likely to introduce a considerable amount of error into the estimated factors, obscure the true factor structure, and distort factor loadings.
In order to prevent under-factoring and over-factoring in the scale development process, researchers have developed techniques for deciding on the optimal number of factors to retain, such as the eigenvalue-greater-than-unity rule [20], the scree test [21], parallel analysis [22], and the minimum average partial test [23]. More recently developed alternatives include the standard error scree test, 95th percentile parallel analysis [24], parallel analysis based on the polychoric correlation matrix [25], and the Hull method [26]. Among these techniques, the eigenvalue-greater-than-unity rule and the scree test are widely used in applied psychological research because they are available in popular software packages (e.g., SPSS, SAS, and STATA). The sum of all squared factor loadings for a factor is called the eigenvalue [27]. The eigenvalue-greater-than-one rule, also known in the literature as the 'Kaiser-Guttman rule', posits that a factor with an eigenvalue higher than one contributes substantially to the variance, whereas a factor with an eigenvalue lower than one explains only a trivial amount of variance. The scree test aims to find the number of true factors through visual examination of a plot of the eigenvalues. The point at which the eigenvalues level off in a broken-stick pattern indicates the number of factors to retain; the remaining factors account for trivial or minor variance. Similar to the scree test, the standard error scree test fits a series of regressions to the eigenvalues and uses their standard errors of estimate to identify the point at which only trivial factors remain.
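The eigenvalue-greater-than-one rule and the definition of an eigenvalue as a sum of squared loadings can be illustrated with a minimal NumPy sketch. The simulated data, the function name `kaiser_guttman`, and the use of principal component loadings are assumptions made for illustration; none of them come from the study.

```python
import numpy as np


def kaiser_guttman(data):
    """Return PCA eigenvalues, loadings, and the count of eigenvalues above 1."""
    corr = np.corrcoef(data, rowvar=False)       # Pearson correlations between items
    vals, vecs = np.linalg.eigh(corr)            # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1]               # re-sort, largest first
    vals, vecs = vals[order], vecs[:, order]
    loadings = vecs * np.sqrt(np.clip(vals, 0.0, None))  # component loadings
    return vals, loadings, int(np.sum(vals > 1.0))


# Simulated example: six items driven by one common factor plus noise.
rng = np.random.default_rng(0)
common = rng.normal(size=(500, 1))
items = 0.7 * common + 0.7 * rng.normal(size=(500, 6))

eigenvalues, loadings, n_retained = kaiser_guttman(items)
ss_loadings = (loadings ** 2).sum(axis=0)        # column sums of squared loadings

print(np.round(eigenvalues, 2))
print(np.round(ss_loadings, 2))                  # matches the eigenvalues
print(n_retained)
```

Because the eigenvalues of a correlation matrix sum to the number of items, an eigenvalue of one corresponds to the variance of a single standardized item, which is the rationale behind the cutoff.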
Parallel analysis generates random correlation matrices with the same number of variables and the same sample size as the actual data, and computes their eigenvalues so that researchers can compare the mean random eigenvalues with the eigenvalues from the actual sample. The number of factors to retain is the number of actual eigenvalues that exceed the corresponding mean random eigenvalues. Because parallel analysis tends to overestimate the number of factors in some cases [22], Glorfeld [24] examined the accuracy of using the 95th percentile of the random eigenvalues instead of their mean and recommended the 95th percentile when deciding how many factors to retain. Both parallel analysis and 95th percentile parallel analysis use the Pearson correlation matrix to obtain the randomly generated eigenvalues and assume a multivariate normal distribution. Timmerman and Lorenzo-Seva [25] extended this method to non-normal Likert variables, which require the use of a polychoric correlation matrix in factor analysis.
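The comparison of actual and random eigenvalues described above can be sketched as follows. This is a simplified illustration, assuming normally distributed random data and principal component eigenvalues; the function name, the number of simulations, and the simulated items are assumptions and do not reproduce the software used in the study.

```python
import numpy as np


def parallel_analysis(data, n_sims=200, percentile=None, seed=0):
    """Horn's parallel analysis on the Pearson correlation matrix.

    percentile=None compares against the mean random eigenvalues;
    percentile=95 gives the stricter 95th percentile variant.
    """
    rng = np.random.default_rng(seed)
    n, p = data.shape
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    random_eigs = np.empty((n_sims, p))
    for i in range(n_sims):
        noise = rng.normal(size=(n, p))          # same n and p as the real data
        random_eigs[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    if percentile is None:
        threshold = random_eigs.mean(axis=0)
    else:
        threshold = np.percentile(random_eigs, percentile, axis=0)
    # Retain factors up to the first observed eigenvalue that fails to beat chance.
    exceeds = observed > threshold
    n_factors = p if exceeds.all() else int(np.argmin(exceeds))
    return n_factors, observed, threshold


# Simulated example: eight items driven by one common factor plus noise.
rng = np.random.default_rng(42)
common = rng.normal(size=(400, 1))
items = 0.7 * common + 0.7 * rng.normal(size=(400, 8))

k_mean, observed, thr_mean = parallel_analysis(items)
k_p95, _, thr_p95 = parallel_analysis(items, percentile=95)
print(k_mean, k_p95)
```

The 95th percentile threshold is higher than the mean threshold for the leading eigenvalues, which is why that variant can retain fewer factors than the mean-based rule.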
The minimum average partial (MAP) test rests on the idea that, as successive components are partialled out of the correlation matrix, the average squared partial correlation among the items decreases while common variance is being removed and begins to rise once only unique variance remains [23]. In other words, MAP stops extracting factors when the average of the squared partial correlations reaches a minimum [28]. The Hull method aims to find the model with an optimal balance between model fit and number of factors, using goodness-of-fit indexes such as the root mean square error of approximation (RMSEA) and the comparative fit index (CFI) [26].
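A minimal sketch of the MAP logic is given below, assuming principal components of the Pearson correlation matrix and the original squared-partial-correlation criterion. The function name and the simulated data are illustrative assumptions, not the implementation used in the study.

```python
import numpy as np


def velicer_map(data):
    """Velicer's MAP: components at which the mean squared partial r is smallest."""
    corr = np.corrcoef(data, rowvar=False)
    p = corr.shape[0]
    vals, vecs = np.linalg.eigh(corr)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    loadings = vecs * np.sqrt(np.clip(vals, 0.0, None))
    off_diag = ~np.eye(p, dtype=bool)

    avg_sq = [np.mean(corr[off_diag] ** 2)]      # m = 0: nothing partialled out
    for m in range(1, p):
        # Residual covariance after removing the first m components.
        resid = corr - loadings[:, :m] @ loadings[:, :m].T
        d = np.sqrt(np.clip(np.diag(resid), 1e-12, None))
        partial = resid / np.outer(d, d)         # rescale to partial correlations
        avg_sq.append(np.mean(partial[off_diag] ** 2))
    avg_sq = np.array(avg_sq)
    return int(np.argmin(avg_sq)), avg_sq


# Simulated example: eight items driven by one common factor plus noise.
rng = np.random.default_rng(7)
common = rng.normal(size=(400, 1))
items = 0.7 * common + 0.7 * rng.normal(size=(400, 8))

n_components, avg_sq = velicer_map(items)
print(n_components)
print(np.round(avg_sq[:4], 3))
```

The returned index is the number of components at which the average squared partial correlation is lowest; an index of zero would mean that no component should be extracted.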
Although each of these methods helps researchers discover the proper underlying factor structure in the scale development process, each also has its own weaknesses. For example, researchers criticize the eigenvalue-greater-than-one rule for over-factoring, the scree test for subjectivity, and the MAP test for under-factoring in some conditions. Some methods, such as parallel analysis, the standard error scree test, and the MAP test, are also time consuming for many researchers and cannot be run directly in commonly used statistical software (e.g., SPSS). The Hull method is effective when the number of measured variables per factor is high [26]. In order to handle these weaknesses, researchers recommend simultaneously using multiple criteria to ascertain the appropriate number of factors before rotation [29,30]. Unfortunately, this has not been the case for the SAAS. Most previous research has relied on only one criterion to determine the factor structure of the SAAS; using multiple criteria to determine the number of factors is not common practice. Lastly, these studies used relatively small samples compared with this study, and there is a paucity of research examining the factor structure of the SAAS in adolescents. To the best of our knowledge, only one study to date has examined the factor structure of the scale in adolescents [13].
The purpose of the current study is to examine the factor structure of the SAAS using EFA with multiple factor retention decision criteria in a large sample of adolescents in Turkey. Additionally, CFAs are used to examine how well the theoretically proposed factor structure and the factor structures suggested by the multiple factor retention decision criteria fit the SAAS responses. Internal consistency estimates of the SAAS are also evaluated.

Participants
To achieve the purpose of this research, a secondary analysis was conducted on the data from Şahin's [31] master's thesis. Parts of the data that are not related to the current research purpose were analyzed and presented elsewhere [32][33][34]. A total of 22 junior high schools participated in the study. The participants were 2,098 junior high school students from the town of Merzifon, a highly populated district located in the central Black Sea region of Turkey [32]. The sample consisted of 1,072 (51%) females and 1,026 (49%) males. The participants' ages ranged from 11 to 15 years. Of the participants, 8.2% (n = 173) were 11 years old, 33.6% (n = 705) were 12 years old, 32.9% (n = 691) were 13 years old, 23.4% (n = 491) were 14 years old, and 1.8% (n = 38) were 15 years old. The overall mean age was 12.77 (SD = .96). Furthermore, 33.6% (n = 705) of the participants were sixth-grade students, 32.7% (n = 687) were seventh-grade students, and 33.7% (n = 706) were eighth-grade students. The education levels of the participants' parents were low: approximately 55.8% (n = 1,170) of participants' mothers and 32% (n = 671) of participants' fathers had five years of education or less.
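The reported mean age and standard deviation can be recovered from the age frequencies given above. The short Python check below uses only those published counts; the variable names and the choice of the sample (n - 1) formula for the standard deviation are assumptions.

```python
import math

# Age frequencies as reported in the Participants section.
age_counts = {11: 173, 12: 705, 13: 691, 14: 491, 15: 38}

n = sum(age_counts.values())
mean_age = sum(age * count for age, count in age_counts.items()) / n
sum_sq_dev = sum(count * (age - mean_age) ** 2 for age, count in age_counts.items())
sd_age = math.sqrt(sum_sq_dev / (n - 1))   # sample standard deviation

print(n, round(mean_age, 2), round(sd_age, 2))   # 2098 12.77 0.96
```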

Measures
Sociodemographic Questionnaire: Participants were asked to describe their school, sex, grade level, and the education level of their mother and father.
Social Appearance Anxiety Scale (SAAS) [9]: As described in the introduction, the SAAS is a self-report scale made up of 16 items. Each item is rated on a five-point scale ranging from 1 (not at all) to 5 (extremely), and Item 1 is reverse-coded. Possible scores range from 16 to 80, with higher scores indicating higher levels of social appearance anxiety. The SAAS was adapted to Turkish culture by Doğan [13,14] in two separate studies on university students and adolescents. Doğan reported good levels of convergent and divergent validity as well as excellent internal consistency in these studies [13,14]. Example items from the SAAS are "I am concerned that I have missed out on opportunities because of my appearance" and "I worry people will judge the way I look negatively."

Procedure
Participants completed the research instruments under the supervision of teachers during regular class time. The purpose of the study, anonymity, confidentiality, and voluntary participation were explained on the front page of the sociodemographic questionnaire. Before any analysis was conducted, the assumptions of the statistical analyses were checked. Univariate normality was checked via graphical approaches and skewness and kurtosis values, along with descriptive statistics. The data were screened for univariate and multivariate outliers. Participants whose SAAS scores were more than three standard deviations above or below the mean were considered univariate outliers [35]; three such outliers were detected and removed from the dataset. The Mahalanobis distance test was used to identify multivariate outliers using a conservative significance level of α = .001 [36]. Because our analysis focused on the factor structure of the SAAS items, multivariate outliers were calculated using only the SAAS items. One hundred and twenty-one participants were identified as multivariate outliers and were also omitted from the dataset. However, the statistical analyses were also run with the complete cases, and the results were identical to the findings reported here. Therefore, only the analyses conducted without the multivariate outliers are reported; the analyses of the complete cases may be obtained from the first author upon request. Statistical analyses were carried out in Factor 9.2 [37], SPSS 22, the Standard Error Scree Test (SE) program [38], and Lisrel 8.71.
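The two outlier screens described above can be sketched in NumPy as follows. The data are simulated and the function names are hypothetical. The constant 39.252 is the tabled chi-square critical value for 16 degrees of freedom at α = .001, matching the 16 SAAS items; whether the study used exactly this cutoff is an assumption.

```python
import numpy as np

CHI2_CRIT_DF16_P001 = 39.252   # tabled chi-square critical value, df = 16, alpha = .001


def univariate_outliers(total_scores, z_cut=3.0):
    """Flag total scores more than z_cut standard deviations from the mean."""
    z = (total_scores - total_scores.mean()) / total_scores.std(ddof=1)
    return np.abs(z) > z_cut


def mahalanobis_sq(items):
    """Squared Mahalanobis distance of each respondent from the item centroid."""
    centered = items - items.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(items, rowvar=False))
    return np.einsum("ij,jk,ik->i", centered, inv_cov, centered)


# Simulated 16-item data with one planted extreme respondent (row 0).
rng = np.random.default_rng(1)
items = rng.normal(size=(1000, 16))
items[0] = 6.0

uni_flags = univariate_outliers(items.sum(axis=1))
d2 = mahalanobis_sq(items)
multi_flags = d2 > CHI2_CRIT_DF16_P001

print(int(uni_flags.sum()), int(multi_flags.sum()))
clean = items[~(uni_flags | multi_flags)]        # cases kept for factor analysis
print(clean.shape)
```

Squared Mahalanobis distances are compared with a chi-square distribution whose degrees of freedom equal the number of variables, which is why the cutoff depends on the 16 items rather than on the sample size.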
The data were randomly divided into two samples. In the first sample, EFA was conducted with multiple factor retention decision criteria to reveal the underlying factor structure. A CFA was then conducted on the second sample to test the recommended factor structure. Structural equation modeling scholars recommend using various fit indexes to evaluate model fit in CFA. This study employed the following fit indices to evaluate congruence between the model and the data: chi-square (χ²) with its degrees of freedom, RMSEA with its 90% confidence interval, the standardized root mean square residual (SRMR), and the CFI. Following the suggestions of Marsh, Balla, and McDonald [39], we also report the Tucker-Lewis Index (TLI), because previous research provides evidence that it is less affected by sample size [39]. There is general agreement that values close to .90 for CFI and TLI represent an acceptable model fit and values of .95 and over represent a good model fit, whereas an RMSEA value equal to or less than .08 represents an acceptable fit [40] and values close to .05 or less indicate a good fit to the data [41]. Lastly, an SRMR value close to or less than .08 represents a good fit to the data [42].
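The cutoffs listed above can be written as a small decision helper. The sketch below is illustrative only: it turns the "close to" guidelines into hard thresholds, which is a simplification, and it uses the commonly cited point-estimate formula for RMSEA, which may differ in detail from what a given SEM program computes. The function names and the numerical inputs are hypothetical and are not the values in Table 2.

```python
import math


def rmsea_point_estimate(chi2, df, n):
    """RMSEA = sqrt(max(chi2 - df, 0) / (df * (n - 1))), a commonly cited form."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def classify_fit(cfi, tli, rmsea, srmr):
    """Label each index using the cutoffs quoted in the text as hard thresholds."""
    verdicts = {}
    for name, value in (("CFI", cfi), ("TLI", tli)):
        if value >= 0.95:
            verdicts[name] = "good"
        elif value >= 0.90:
            verdicts[name] = "acceptable"
        else:
            verdicts[name] = "poor"
    if rmsea <= 0.05:
        verdicts["RMSEA"] = "good"
    elif rmsea <= 0.08:
        verdicts["RMSEA"] = "acceptable"
    else:
        verdicts["RMSEA"] = "poor"
    verdicts["SRMR"] = "good" if srmr <= 0.08 else "poor"
    return verdicts


# Hypothetical inputs for illustration only.
rmsea = rmsea_point_estimate(chi2=312.0, df=104, n=987)
print(round(rmsea, 3))
print(classify_fit(cfi=0.96, tli=0.95, rmsea=rmsea, srmr=0.04))
```

A one-factor model for 16 items has 104 degrees of freedom (136 unique variances and covariances minus 32 free parameters), which is why that value is used in the hypothetical call.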
To assess the reliability of the SAAS in the total sample, McDonald's omega (ω) was computed using Factor 9.2 [37]. Compared with Cronbach's alpha, ω has the advantage of taking into account the strength of association between items and constructs as well as item-specific measurement errors [43]. Thus, ω provides a more realistic estimate of the true reliability of the scale.

Results
Table 1 shows the mean, standard deviation, skewness, and kurtosis values computed for each SAAS item. Item means fell slightly above or below the midpoint of 2.5, ranging from 1.76 to 2.93, with standard deviations ranging from 1.11 to 1.44, indicating fairly negative responses to the items and a spread of scores around the mean. The absolute skewness values for the items were between .02 and 1.42, and the absolute kurtosis values were between .06 and 1.30. These values were lower than 2 and 7, respectively, indicating univariate normality [44]; the data therefore did not violate the normality assumption. To uncover the factor structure of the SAAS, principal component analysis was conducted on the first sample. The Kaiser-Meyer-Olkin measure of sampling adequacy was .96, which indicates that the sample was excellently suited to factor analysis. Bartlett's test of sphericity was also significant (χ²(120) = 8850.04, p < .001). Together, these results show the factorability of the correlation matrix. Before rotation, multiple criteria were used to determine the number of factors to extract. The eigenvalue-greater-than-one rule, the scree test, the standard error scree test, and Horn's parallel analysis suggested retaining two factors. However, Velicer's MAP, 95th percentile parallel analysis, the Hull method, and theoretical considerations suggested retaining one factor.
Because the criteria disagreed on the number of factors to extract, principal component analysis was conducted on the first sample for both one- and two-factor solutions. When two factors were extracted, varimax rotation was used. Factor loadings for the one-factor solution ranged from .36 to .78. In the two-factor solution, Item 2 and Item 3 constituted Factor II and the other items made up Factor I. Factor I loadings ranged from .41 to .75, whereas the Factor II loadings (Item 2 and Item 3) were .80 and .79, respectively. Based on previous research and the current factor analysis results, three possible models were tested with confirmatory factor analysis in the second sample: a one-factor model (Model 1), a one-factor model with correlated uniquenesses for Item 2 and Item 3 (Model 2), and a two-factor model with Item 2 and Item 3 as a separate factor (Model 3). The models were tested using the covariance matrix of the items and maximum likelihood estimation for continuous data. As seen in Table 2, the best-fitting model was Model 2. Model 1 provided a satisfactory fit to the data, but its χ² value was too high. Model 2 reduced the χ², RMSEA, and SRMR values and improved overall model fit. Model 3 had fit indices analogous to those of Model 2. A nested chi-square difference test was performed to compare Model 2 with Model 3; the result was not significant (p > .05), indicating that the two models did not differ significantly in their fit to the data. The reliability of the SAAS as measured by ω was .92, which indicates excellent reliability.
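For a one-factor model with standardized loadings and uncorrelated errors, ω can be computed directly from the loadings. The sketch below assumes that simplified formula, which may differ from the exact computation in Factor 9.2. The sixteen loadings are hypothetical values chosen to lie within the reported range of .36 to .78; they are not the study's estimates.

```python
def mcdonald_omega(loadings):
    """Omega for a one-factor model: (sum of loadings)^2 over total variance."""
    common = sum(loadings) ** 2                            # variance due to the factor
    unique = sum(1.0 - loading ** 2 for loading in loadings)  # item error variances
    return common / (common + unique)


# Hypothetical standardized loadings for 16 items (illustration only).
hypothetical_loadings = [0.36, 0.55, 0.58, 0.60, 0.62, 0.64, 0.66, 0.68,
                         0.70, 0.70, 0.72, 0.73, 0.74, 0.75, 0.76, 0.78]

omega = mcdonald_omega(hypothetical_loadings)
print(round(omega, 2))
```

Unlike Cronbach's alpha, this expression lets each item contribute according to its own loading rather than assuming that all items relate equally to the construct.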

Discussion
Although the SAAS has mostly been validated via CFA conducted on undergraduate students, its underlying factor structure also needs to be examined with exploratory factor analysis and multiple factor retention decision criteria, both to prevent over- and under-factoring and to reveal convergence and divergence across factor analysis methods. This study replicated and extended previous studies by analyzing the factor structure of the SAAS using EFA with the Pearson correlation matrix, multiple factor retention decision criteria, and confirmatory factor analysis, as well as its internal consistency reliability, in a large sample of Turkish junior high school adolescents. The present study shows that, when multiple factor retention decision criteria are considered, only three of the seven criteria (Velicer's MAP, 95th percentile parallel analysis, and the Hull method) support the theoretical one-factor structure. All other criteria suggested retaining two factors. Previous Monte Carlo studies examining the performance of factor retention decision criteria with simulated data recommended the use of MAP and parallel analysis [45]. The current results support these studies using empirical data. The CFA results also demonstrated that a one-factor model was the best fit to the sample data. This finding is in agreement with previous studies that examined the factor structure using CFA in adolescents and undergraduate students [12][13][14] as well as in eating disorder patients [15], using Pearson and polychoric correlation matrices. Reliability, as measured by ω, was also excellent. In the psychological literature, a reliability of .70 is considered satisfactory for research purposes and .90 for clinical practice. In this respect, the SAAS may be used for both research purposes and clinical practice.
Consequently, these results suggest that researchers may use the SAAS as a valid, reliable, and unidimensional scale in their research on early adolescents, at least in Turkey. Across the life span, early adolescence is a critical period of development marked by considerable changes in personal life. Adolescents have to deal with many developmental tasks, such as keeping up with their rapidly changing bodies, choosing a high school, forming an independent personality, establishing close relationships with same- and opposite-sex friends, and becoming contributing members of society.
Among these challenges, the most difficult is probably keeping up with their changing bodies. Adolescents with low body image satisfaction are more likely to have difficulties in their interpersonal relationships and to develop mental health problems, including low self-esteem, depression, body image disorder, and social anxiety. Therefore, early identification of adolescents at risk with valid and reliable measures is crucial, and the SAAS is a valuable measure for early identification in this context.
Like all studies, this study has some limitations. First, it was based on secondary data analysis, which restricts the researcher's ability to investigate other constructs related to the SAAS for convergent and divergent validity. Secondly, this study examined factor structure in early adolescents; thus, the findings are only generalizable to similar samples. Additionally, EFA and CFA are valuable tools for investigating construct validity, but they are not the only way to establish the validity of the SAAS. In future studies, researchers could examine the validity of the SAAS using other methods and examine its factor structure in older adults.