Knowledge and Use of Grammar among Indonesian Second Language Learners of Arabic: Focus on Grammatical Gender Agreement

This study investigates the spoken production of Indonesian L2 learners of Arabic within the frameworks of the Local Impairment Hypothesis (LIH) and the Missing Surface Inflection Hypothesis (MSIH). Because feature strength is absent in L2 learners, LIH claims that they will suffer from impairment at every stage of development. Accordingly, all participants should show variability in the use of gender agreement. Conversely, MSIH predicts that the impairment L2 learners exhibit is superficial: variability in surface inflection arises from difficulty in mapping abstract features onto the correct inflected forms. The thirty-two subjects who participated in this study were divided into two groups: intermediate and advanced. A Grammaticality Judgment Task was employed to collect data on abstract knowledge of features, while a Sentence Fragment Completion Task was used to elicit surface inflected forms. Although accuracy in the production of inflectional morphology was moderately low, the findings revealed that most subjects were aware of the grammatical features examined in the study. Overall, the findings support MSIH and are not in line with LIH, which claims that high variability will occur in both tasks at every stage of development.


Introduction
In recent years, numerous studies have addressed foreign and second language acquisition, examining the difficulties that learners encounter in the learning process. These studies have helped language teachers to become more aware of the areas of difficulty faced by their students and to give those areas particular attention. The analysis of areas of learning difficulty has traditionally been a central concern of second language teaching research in many parts of the world [1][2][3][4][5][6].
Several scholars have noted that spoken production can be one of the main sources of error when someone starts learning a new language (L2). Speaking requires applying the correct grammatical rules after selecting the appropriate lexical item for the context. Because oral production is a spontaneous process, L2 learners may find it challenging to utter an expression while simultaneously deciding on the appropriate grammar. This phenomenon is frequent not only among less proficient learners but also among advanced L2 learners [7][8][9][10][11].
Within L2 learners' oral production, one of the most problematic features, and one that has been examined extensively, is grammatical gender agreement [12][13][14][15][16][17][18]. It is argued that grammatical gender agreement may cause persistent difficulty, not only for beginning L2 learners but also for those whose proficiency is more advanced [10]. One plausible reason for this difficulty is that selecting the appropriate grammatical form comprises two tasks. First, to assign nouns to the correct gender, L2 learners must acquire knowledge at an abstract level. This prerequisite may be challenging since many languages have noun classification systems divided into masculine and feminine, and sometimes neuter. Second, after mastering gender assignment, L2 learners are required to supply the bound morphemes attached to elements such as verbs, determiners, adjectives, or adverbs that must agree with the head noun. For this reason, it is believed that L2 learners will encounter significant difficulties, because grammatical gender agreement is a highly demanding task [9]. Although L2 learners might be aware that what they have said is grammatically incorrect, errors in grammatical gender agreement may occur repeatedly in their oral production. This phenomenon raises the question of whether L2 learners' difficulties lie at an abstract level or in surface manifestation. The Local Impairment Hypothesis (LIH) argues that, regardless of L1, such difficulties arise from a substantial impairment of the features associated with functional categories and of feature strength [7,19,20]. Consequently, L2 learners will show variability when dealing with inflectional morphology.
However, the Missing Surface Inflection Hypothesis (MSIH) claims that such errors are superficial and arise because L2 learners are unable to map abstract features onto their surface morphological realisation [21][22][23]. Therefore, since agreement morphology is highly complex, the optionality of inflection is attributed to the complexity of the surface forms and the difficulty of spelling them out correctly [13].
The current study investigates the two contradictory points of view proposed by LIH and MSIH to discover whether errors in L2 learners' language production are caused by underlying abstract features or by surface morphological realisation. The primary focus of this research is Arabic S-V agreement as produced by adult Indonesian L2 learners, whose L1 lacks gender agreement. Since production problems in the initial stages have been widely investigated in previous studies, focusing on late learners in this research is significant [7,19,24,20]. Moreover, this study examines Arabic because it has an exceptionally rich gender agreement system. To show agreement in S-V order, Arabic inflected verbs must encode features such as number, person, gender, and tense. Because the verb agrees with the subject in number, person, and gender in S-V order but not in V-S order, this research focuses solely on S-V rather than V-S agreement. Given the complexity of Arabic agreement, the two viewpoints (LIH and MSIH) offer distinct explanations for the variability in the use of grammatical gender agreement in Arabic. The current research is in accordance with MSIH, arguing that optionality in L2 learners' production does not reflect a severe impairment. Instead, the difficulty is due to surface manifestation.

The Current Study
The current research investigates the inflectional properties of number (singular, dual, plural) and gender (masculine, feminine) agreement between Arabic nouns and verbs. In addition, since Arabic applies a different agreement rule to plural non-human nouns, humanness (human, non-human) is also examined. Producing Arabic singular, dual, and plural forms in S-V order is assumed to be more challenging than in V-S order, and thus greater optionality is expected. That is, the more demanding task is likely to yield more informative data. The research questions are, therefore, as follows:

Theories on L2 Manifestation of Functional Features and Functional Categories
In recent years, the nature of interlanguage grammar has been widely investigated. According to several scholars, functional categories and features other than those present in the L1 are not available to L2 learners. In contrast, others propose that functional features and categories are acquirable in interlanguage (IL) grammar, regardless of L1. More specifically, the discussion revolves around two fundamental issues, namely Universal Grammar (UG) and L1 transfer.
The first assumption claims that a restricted part of IL grammar will be impaired, specifically in the functional domain. One theory related to this claim is the Failed Functional Features Hypothesis (FFFH) proposed by Hawkins and Chan [25]. According to this proposal, formal features in the L2 that differ from those of the L1 will be unattainable. For example, functional categories such as AGR, Comp, and Det will not be attainable for L2 learners if those categories are not present in their L1. Hawkins and Chan [25] first introduced the theory by examining the acquisition of English wh-operator movement by Chinese L2 learners. The subjects in this research comprised two experimental groups, Chinese and French participants, and a control group of native speakers of English. The results showed that Chinese-speaking L2 learners of English were relatively less accurate on wh-operator movement, since the features investigated in the study were not present in their L1. Accordingly, it would be difficult for Chinese L2 learners to acquire these functional features because they were absent from the L1. In contrast, it is argued that, with an adequate degree of L2 exposure, several L2 features can be attained with the assistance of UG, even though such features might differ from those available in the L1.
Universal Journal of Educational Research 8(2): 709-722, 2020
Nevertheless, an alternative argument asserts that functional properties in L2 are not impaired. Instead, L2 learners can acquire them through the assistance of UG regardless of L1. This proposal is in line with the Full Access hypothesis, which suggests that L2 learners can achieve parameter resetting with the assistance of UG, even though the L1 features might differ significantly from the L2 features. By implication, since IL grammars are not bound to L1 functional categories, parameter resetting is feasible, and thus features and feature strength can be fully attained [26]. However, a more developed account proposed by Schwartz and Sprouse [27] claims that L1 knowledge might be necessary for L2 acquisition. Their theory, called the Full Transfer/Full Access (FTFA) hypothesis, assumes that the whole L1 grammar, including its functional properties, could support the L2 initial state. Therefore, when the L1 grammar cannot accommodate new L2 features, L2 learners can draw on UG to acquire new feature strength and parameter settings.
Following this approach, Ali [28] examined Arabic grammatical gender agreement in V-S order. The research comprised L2 learners whose L1 has a V-S gender agreement system (+gender group), learners whose L1 lacks this feature (-gender group), and a control group of native speakers. The experimental groups were categorised as beginner, intermediate, and advanced. The study used a Grammaticality Judgment Task (GJT), one picture description task, and two sentence completion tasks to elicit data on language comprehension and production. The results revealed that the -gender group was as accurate as the +gender group on V-S grammatical gender agreement. Ali [28] therefore concluded that the results were in line with a full access approach rather than FTFA, because L2 learners could not perform in a native-like manner. It is interesting to note that the data, to some extent, also supported FTFA, since a few learners were as accurate as native speakers.
Previously published studies employing the frameworks of UG and L1 transfer are also worth mentioning because they rest on related principles. More specifically, they offer explanations of the variability in the use of inflectional morphology that is attributed to impaired grammatical features. In line with the FTFA approach, MSIH suggests that such impairment is superficial, because L2 grammatical features are attainable with the assistance of UG through the L1. In contrast, LIH claims that UG assistance through L1 transfer is inaccessible, as L2 grammatical features and feature strength are impaired. That is, FFFH, with its principle that there is no transfer, supports the impairment theories of Vainikka and Young-Scholten [29,30] and Eubank [24]. These proposals are explained in detail in the next section.

Language Impairment
From one perspective, it is assumed that L1 and L2 learners have different language acquisition mechanisms. Thus, while L1 learners are constrained by a Language Acquisition Device, L2 learners rely on linear sequencing strategies [20]. Consequently, it is claimed that L2 learners will never reach native-like competence, and thus IL grammars are more prone to impairment during language development than is the case with L1 learners. Regarding this account, there are at least three relevant theoretical viewpoints addressing language impairment, namely Eubank's [24] underspecification view, Vainikka and Young-Scholten's [29,30] gradual development view, and Beck's [1] local impairment view.
Vainikka and Young-Scholten's theory of gradual development [29,30] assumes that impairment is present at an early stage of L2 because, at this level, L2 learners cannot achieve full functional projections. In terms of verb raising, Vainikka and Young-Scholten [29,30] argue that this is absent in the VP stage or the initial L2 learners' state. If L2 input accommodates verb raising, it will be overt at the FP stage. However, FP projection itself is not specific and thus, as a result, variability in the use of verb raising emerges in mature L2 learners. Verb raising is then only obligatory in the IP stage, where the functional head is certainly specific. In conclusion, they assumed that the optionality of verb raising in their study is a developmental phenomenon, implying that impairment will disappear in conjunction with the level of proficiency.
This proposal is broadly in accordance with Eubank's theory [24], as both agree that access to grammatical features is available through UG. However, the main difference between the two approaches lies in the domain of L1 transfer. This is based on the FFFH principle, where L2 is constrained by UG, but not by L1 transfer. In his proposal, Eubank [24] argued that the L1 final state, especially inflectional strength, does not transfer to the L2 early state. Therefore, rather than claiming there is no verb raising in the L2 early grammar, he suggested that L2 learners show variability in the use of verb raising in this state. In his study, Eubank investigated English mental representation in the early state among French-speaking learners of English. As he argued that L1 grammar does not transfer, he found that French L2 learners of English did not exhibit L1 aspects such as strong agreement and tense. He therefore concluded that, if the relevant inflection that specifies feature strength is not yet acquired, verb raising in interlanguage grammar will freely take place. The proposals of both Eubank [24] and Vainikka and Young-Scholten [29,30] claim that verb raising is optional in the early stage. Nevertheless, both also argue that, once the relevant feature strength has been acquired, impairment no longer appears and, therefore, verb raising becomes obligatory. However, while Vainikka and Young-Scholten predict no verb raising at the initial stage, Eubank made no comment on the initial stage.
A strong argument for impairment comes from a proposal by Meisel [20] that claims a total impairment of UG (the no access approach). Meisel [20] claims that, because functional categories and features are permanently impaired, variability will always appear in interlanguage grammar. In his study, the findings showed that German negation was frequently placed before the finite verb in L2 production, while L1 participants always produced pas and nicht in the final position. Based on these results, Meisel [20] claimed that L2 development differs significantly from L1 development, indicating that Universal Grammar (UG) is not available in L2. He therefore suggested that, rather than relying on the general operations provided by UG, L2 learners generate their own tools using linear sequencing strategies. Moreover, Meisel's [20] findings were in line with Clahsen and Muysken's [19] study exploring different German word-order developmental sequences in L1 and L2. Their results showed that, with the assistance of UG, L1 learners could recognise without difficulty that German is an SOV and VSO language (depending on linguistic context), while L2 learners assumed that German is only an SVO language. The researchers concluded that, irrespective of L1 background and proficiency level, UG will not be accessible to L2 learners. Hence, in this case, L2 German word order is substantially impaired.
Narrowing the focus, Beck proposed a more specific theory called the Local Impairment Hypothesis (LIH). This posits that L2 learners will never attain a functional feature system at any stage, and thus feature strength will be permanently impaired [7]. In her research, Beck [1] argued that the functional features of L2 learners that permit verbs to be raised become inert, leading to L2 learners' inability to distinguish between verb-raised and unraised-verb stimuli. To investigate this proposal, Beck [7] explored verb raising among adult English-speaking learners of German. English does not have overt verb raising, and for this reason the study aimed to explore L2 learners' handling of overt verb raising in German. The study included 48 English-speaking learners of German and 27 native speakers of German as a control group. The experimental subjects were divided into more advanced and less advanced groups. Beck used a sentence matching task to investigate L2 learners' preference for S-V-Adv-O and S-Adv-V-O patterns, and an oral translation task to assess S-V inversion and overt morphology. The Local Impairment Hypothesis predicted that, because feature strength has become impaired, all participants, irrespective of proficiency level, would show variability and resort to using the S-V-Adv-O and S-Adv-V-O patterns optionally. With respect to verb raising, Beck's findings showed that less advanced learners permitted verb raising while more advanced learners could not distinguish between the raised verb and the unraised verb, resulting in invariability. Given these findings, Beck refuted the proposals of Vainikka and Young-Scholten [29,30] and Eubank [24], which argued that feature strength will be fully specified as proficiency increases.
Because, in Beck's [1] findings, more advanced learners showed optionality in their use of verb raising, the proposals of Vainikka and Young-Scholten [29,30] and Eubank [24] were not strongly supported. Beck therefore claimed that her results were more consistent with the Local Impairment Hypothesis, as optionality was more likely to be found in advanced learners than in less advanced learners.
All three proposals presented above hold that interlanguage grammar will exhibit a specific deficit in the functional feature system. However, Vainikka and Young-Scholten propose a weaker version of impairment, while Beck insists that L2 learners, even at an advanced level of proficiency, will suffer from significant impairment due to the absence of feature strength. The following proposal by Lardiere [10], however, occupies a position somewhere between the two, arguing that L2 learners do not lack feature strength. Instead, variability occurs due to an obstacle in surface morphology.

Missing Surface Inflection
As noted above, Lardiere [10] refutes the three proposals presented thus far as she argues that the problem is not simply an impairment of underlying abstract features and feature strength. Instead, Lardiere [10] argues that the problem is attributable to the surface forms as she found that adult L2 learners who show impoverished inflectional morphology do indeed have knowledge associated with syntactic functional categories or features. This assumption is in line with the FTFA proposal, which claims that L1 transfer may enable interlanguage grammar to access UG. Therefore, because parameter resetting is accessible, new functional features and feature strength in L2 can be acquired. This means that, if variability is found during language development, it is not simply due to the absence of feature strength but may instead be due to problems in surface realisation.
To provide support for this view, Lardiere [10] conducted a longitudinal study of an adult Chinese speaker, Patty, who had moved to the USA and had been living there for 18 years. Owing to her constant and substantial exposure to an English-speaking environment, she was assumed to have reached her end-state grammar at the age of 41. However, Patty's accuracy in spontaneous production was surprisingly low, even though she lived in an almost exclusively English-speaking environment. Her use of tense marking stabilised at approximately 34%, and her production of 3rd person singular agreement was no more than 17%. Despite her low rate of past tense marking, Patty was virtually perfect on nominative and accusative pronominal case, indicating that she had tense and agreement at an abstract level. Lardiere thus concluded that Patty's problem lay in mapping surface morphological forms rather than in a serious impairment of underlying features.
The Missing Surface Inflection Hypothesis (MSIH) in second language acquisition was first proposed in Prevost and White's [23] study, which investigated the spontaneous production of two L2 learners of French and two L2 learners of German in a longitudinal study. Their main interest lay in examining the optional use, if any, of finite and non-finite verbs, and verbal agreement. Regarding the use of finite and non-finite verbs, the results showed high accuracy in finiteness placement, which meant that L2 learners distributed finite forms appropriately, rarely placing them in non-finite contexts. Regarding verbal agreement, L2 learners of French and German were largely accurate in matching inflected verbs to the given subjects. In this respect, the results failed to support an impairment view, as L2 learners did not exhibit variability in finite verb forms and agreement. Functional features and categories for finiteness and agreement were in evidence at the L2 abstract level. The findings therefore appeared to support MSIH. Following Lardiere's research, the phenomenon of resorting to default forms was taken to reflect mapping problems between abstract features and their morphological manifestation.
Addressing the theoretical frameworks of FFFH, the Full Access hypothesis, LIH and MSIH, De Garavito and White [31] [25] employed a game to collect spontaneous production data from French-speaking L2 learners of Spanish, using four cards and asking the participants to describe the picture on each card. The findings revealed that L2 learners accurately produced N-Adj orders instead of Adj-N orders, indicating that nouns were correctly raised over adjectives. Agreement accuracy on determiners was relatively high, at 85%. Compared with determiners, L2 learners were significantly less accurate on gender agreement for adjectives. They also showed variability in gender agreement on adjectives across the two groups, with masculine forms tending to be overgeneralised to feminine contexts. Based on the findings, De Garavito and White [31] argued that there is no evidence to support LIH, which predicts optionality in N-Adj order and in agreement on determiners and adjectives. Instead, they argued that the findings supported Lardiere's [4] assumption of 'mapping problems', and thus refuted FFFH as well as the Full Access hypothesis.
To summarise, the two proposals, LIH and MSIH, make different predictions regarding production errors and the appearance of optionality due to impairment. With respect to the production of S-V gender agreement in Arabic, LIH predicts that L2 learners will exhibit impairment at any stage of development due to the absence of feature strength. Therefore, variability in the use of gender agreement is assumed to occur amongst all learners without exception, even advanced L2 learners. In contrast, MSIH suggests that L2 learners will not suffer from serious impairment. Any variability in the inflectional surface feature system is due to difficulty in mapping abstract gender features onto appropriately inflected forms. Like FTFA, MSIH predicts that L2 learners are able to acquire functional features and categories in interlanguage grammar regardless of L1.
However, there is little published research focusing on the production of grammatical gender agreement among L2 learners of Arabic, especially in relation to singular, dual, and plural verbs that agree with head nouns in both the past and present tense. The present study therefore aims to address this gap in the Arabic SLA domain.

Methodology
This section describes the research methodology used to address the proposed research questions and to test the predictions made by LIH and MSIH. It describes in detail the participants, the reading proficiency test, the two main tasks (the GJT and the sentence fragment completion task), and the procedures for data collection.

Participants
Thirty-three L2 learners of Arabic were recruited to take part in the study. Twenty-two were studying for a BA in Arabic Language Education, while the remainder were enrolled in other BA programmes. All were undergraduate students at Sunan Kalijaga State Islamic University, Indonesia. After completing the language proficiency test, the participants were divided into two groups: advanced (n=9) and intermediate (n=23). One participant was excluded from the study as she did not attend the spoken data collection session. Thus, of the initial 33 participants, 32 remained. At the time the data were collected, all participants were adult learners of Arabic and their ages ranged from 19 to 23 (M = 21.03). It is interesting to note that most of the participants began learning or were exposed to Arabic in early adolescence. Recruitment was conducted by email, by phone, and through social media.

Reading Proficiency Test
A reading proficiency test was administered to the participants to test their levels of proficiency. Based on the results, the participants were divided into two groups: advanced and intermediate speakers.

Grammaticality Judgment Task (GJT)
This task was mainly employed to elicit abstract knowledge of agreement features from the participants. The written GJT consisted of 72 target sentences and 36 fillers. The target sentences were divided into 36 grammatical sentences and 36 ungrammatical sentences. The grammatical sentences contained 18 masculine head nouns and 18 feminine head nouns, each paired with a correctly agreeing verb. The 18 head nouns of each gender were divided into 6 singular, 6 dual and 6 plural nouns, each group of which contained 3 human nouns in the past tense and 3 non-human nouns in the present tense. The grammatical sentences thus fell into 12 categories, as follows.
1. Singular masculine human subject - singular masculine verb, past tense
2. Singular masculine non-human subject - singular masculine verb, present tense
3. Dual masculine human subject - dual masculine verb, past tense
4. Dual masculine non-human subject - dual masculine verb, present tense
5. Plural masculine human subject - plural masculine verb, past tense
6. Plural masculine non-human subject - singular feminine verb, present tense
7. Singular feminine human subject - singular feminine verb, past tense
8. Singular feminine non-human subject - singular feminine verb, present tense
9. Dual feminine human subject - dual feminine verb, past tense
10. Dual feminine non-human subject - dual feminine verb, present tense
11. Plural feminine human subject - plural feminine verb, past tense
12. Plural feminine non-human subject - singular feminine verb, present tense
The ungrammatical sentences, on the other hand, showed disagreement between the head nouns and the verbs that followed. In these sentences, when the noun was presented in a masculine form, the verb was provided in a feminine form, and vice versa. For example, a singular, masculine, human noun would be followed by a singular, feminine verb. Like the grammatical sentences, the ungrammatical sentences were divided into 12 categories. The difference between them lay in the gender mismatches between masculine and feminine.
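The item counts in this design can be checked with a short sketch. It is only an illustration of the arithmetic described above, not the authors' materials: the names `build_categories`, `count_items`, and `ITEMS_PER_CELL` are hypothetical, and the sketch assumes that each ungrammatical sentence is the gender-mismatched counterpart of one grammatical sentence.

```python
from itertools import product

GENDERS = ("masculine", "feminine")
NUMBERS = ("singular", "dual", "plural")
# In the design described above, human nouns appear in the past tense
# and non-human nouns in the present tense.
HUMANNESS_TENSE = (("human", "past"), ("non-human", "present"))
ITEMS_PER_CELL = 3  # three head nouns per category (hypothetical constant name)


def build_categories():
    """Return the 12 (gender, number, humanness, tense) categories."""
    return [
        (gender, number, humanness, tense)
        for gender, number, (humanness, tense)
        in product(GENDERS, NUMBERS, HUMANNESS_TENSE)
    ]


def count_items(n_fillers=36):
    """Count sentences in the task under the assumptions stated above."""
    categories = build_categories()
    grammatical = len(categories) * ITEMS_PER_CELL
    ungrammatical = grammatical  # assumed: one mismatched counterpart each
    return {
        "categories": len(categories),
        "grammatical": grammatical,
        "ungrammatical": ungrammatical,
        "targets": grammatical + ungrammatical,
        "total_with_fillers": grammatical + ungrammatical + n_fillers,
    }
```

Under these assumptions the 2 x 3 x 2 crossing yields 12 categories, and three items per category give the 36 grammatical and 36 ungrammatical target sentences reported for the task.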
One of the research questions investigates L2 learners' preferences with respect to gender, number, and humanness. Humanness must be considered because plural non-human nouns follow a different rule in Arabic. When the head noun is a plural non-human noun, whether masculine or feminine, the verb must show agreement in the singular feminine form. For this reason, it will be interesting to know whether this rule is applied by Indonesian-speaking L2 learners. By implication, if a plural non-human noun is treated as a regular plural human noun, variability will be detected. Hence, it should be possible to determine whether optional use takes place at the abstract level or in surface morphological form.
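The agreement rule just described for S-V order can be stated as a small function. This is a minimal sketch of the rule as given in this section only; the function names and the string labels are hypothetical, and person and tense are left out.

```python
def expected_verb_agreement(gender, number, human):
    """Return the (number, gender) a verb should carry after its subject.

    Plural non-human subjects take a singular feminine verb; in all
    other cases the verb copies the subject's number and gender.
    """
    if number == "plural" and not human:
        return ("singular", "feminine")
    return (number, gender)


def is_grammatical(subject, verb):
    """Check a (gender, number, human) subject against a (number, gender) verb."""
    gender, number, human = subject
    return expected_verb_agreement(gender, number, human) == tuple(verb)
```

For instance, a plural masculine non-human subject followed by a plural masculine verb is flagged as a mismatch, which is the kind of over-regularisation the humanness factor is meant to detect.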
Thirty-six fillers were included in this test to distract the participants from the target items. Some fillers consisted of verbal nouns (masdar), which are derived from verbs. The verbal nouns express an action without an actor or time information. For example, rather than saying ʔana: ʔasurru to express 'I am happy', one could also say ʔana: masru:run to express the same feeling. Other fillers comprised well-known Arabic words of wisdom.
The grammatical sentences, ungrammatical sentences, and fillers were presented in random order in the task. The vocabulary used in this test was common vocabulary used in daily conversation. Before the actual test, trials took place to familiarise participants with the questions. Questions regarding the task procedure were permitted in this session. A few minutes after the trial, the participants were asked to judge whether each sentence was grammatically correct or incorrect. Each correct answer accrued 1 point with 0 points awarded for an incorrect answer.
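The scoring scheme can be sketched as follows. This is an illustrative sketch rather than the authors' procedure: the function name `score_task` and the item identifiers are hypothetical, and it assumes that only the items in the answer key are scored and that an unanswered item earns 0 points. The percentage is one possible way of expressing the total.

```python
def score_task(responses, answer_key):
    """Award 1 point per correct judgment and 0 otherwise.

    Both arguments map an item id to True (grammatical) or False
    (ungrammatical). Items missing from `responses` score 0 (assumption).
    Returns the raw points and the percentage of items answered correctly.
    """
    points = sum(
        1 for item, correct in answer_key.items()
        if responses.get(item) == correct
    )
    percent = 100.0 * points / len(answer_key)
    return points, percent
```

With a hypothetical key of four items and two correct judgments, `score_task` returns 2 points and 50.0 percent.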

Sentence Fragment Completion Task
A sentence fragment completion task was also administered to investigate how participants realised inflected forms in speech. Because the GJT and the spoken test are related measurements, it was important to keep the vocabulary used in both tasks similar. For this reason, all the nouns and verbs used in the sentence fragment completion task were taken from the Grammaticality Judgment Task.
The participants were shown withe fragments of nouns followed by verbs on the researcher's computer screen. The participants were then asked to repeat the fragments when they appeared on the screen and complete the sentence with the verb (the object and the adverb, if possible) that agreed with the noun. It was important to ask the participants completing the sentences to keep their attention away from the point the researcher wanted to examine. To obtain spontaneous spoken answers, the fragments of nouns and verbs on the screen disappeared after 10 seconds. The Sentence Fragment Completion task contained 36 fragments of nouns and verbs that were kept in their base forms. This meant there were 36 verbs that must agree with 18 masculine nouns and 18 feminine nouns. The 18 nouns were divided into 6 singular, 6 dual and 6 plural nouns, and then each group of 6 was split into 3 human and 3 non-human nouns. Each correct answer was worth 1 point while each incorrect answer was awarded 0 points. As the purpose of this task is to scrutinize subject-verb gender agreement, 1 point was still awarded even if the participants did not use the object and adverb, or used them incorrectly. An example of target item is as follows: 1. A sentence containing plural, feminine, non-human nouns followed by a singular, masculine verb in the present tense. To complete the task, the participants were asked to complete the fragments above, for example, by placing the verb that agrees in a complete sentence and in an appropriate tense 1 , such as: ϳϭΗ 훈 Α ⺰ϳ Ᏺ m ϛΑ al-ʕankabu:t-iya:tu ta-qifu ʔama:ma ∫ ∫ ajarati (the-spider-pl.nh f.sing-stand in front of the-tree.s.f.nh) ' The spiders stand in front of the tree. ' However, if participants answered only with the subject and the verb, they would still be awarded 1 point if the verb was inflected correctly. 
Any semantically inappropriate use of an adverb, such as al-ʕankabu:t-iya:tu ta-qifu baʕda tana:wuli l-ʕaʃa:ʔi (the-spider-pl.f.nh f.s-stand after having the-dinner) 'The spiders stand after having dinner', was disregarded and 1 point was still awarded. The allocated time was strictly adhered to while participants answered, to ensure spontaneous spoken production. To summarise, S-V order was chosen in this study rather than V-S order as it offers more features: number, gender, and person. The study employed the GJT and the sentence fragment completion task to elicit primary data from Indonesian-speaking L2 learners of Arabic. The GJT was used to test L2 learners' abstract knowledge of features, while the spoken production task was employed to observe participants' use of surface inflected forms. Regarding Beck's theory, LIH, the prediction is that: 1. As features and feature strength are inert, variable use of S-V agreement will occur in L2 learners in both tasks. In contrast, MSIH, the proposal from Lardiere and Prevost, predicts that: 1. Because the production problem is due to mapping abstract features to surface morphological forms, GJT scores measuring abstract knowledge will be similar to, or may even exceed, results on the sentence fragment completion task, which measures surface morphological realisation.
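The item design and scoring rule described for the Sentence Fragment Completion Task can be sketched in a few lines of Python. This is only an illustrative sketch, not the instrument used in the study: the names `build_design`, `score_response`, and `percent_correct` are hypothetical, and the sketch assumes that the 36 fragments fill a fully crossed design of 2 genders by 3 numbers by 2 humanness values with 3 nouns per cell, as the counts in the text suggest.

```python
from itertools import product

GENDERS = ("masculine", "feminine")
NUMBERS = ("singular", "dual", "plural")
HUMANNESS = ("human", "non-human")
ITEMS_PER_CELL = 3  # assumed: three nouns per gender x number x humanness cell


def build_design():
    """Enumerate every fragment slot as (gender, number, humanness, index)."""
    return [
        (g, n, h, i)
        for g, n, h in product(GENDERS, NUMBERS, HUMANNESS)
        for i in range(1, ITEMS_PER_CELL + 1)
    ]


def score_response(verb_agrees: bool, extras_ok: bool = True) -> int:
    """Return 1 if the verb agrees with the subject, otherwise 0.

    extras_ok (object/adverb present and semantically apt) is accepted
    but deliberately ignored, mirroring the scoring rule in the text.
    """
    return 1 if verb_agrees else 0


def percent_correct(scores):
    """Convert a list of 0/1 item scores into a percentage."""
    return 100.0 * sum(scores) / len(scores)


design = build_design()
print(len(design))                                    # 36
print(sum(1 for g, *_ in design if g == "feminine"))  # 18
```

Under these assumptions, a response with correct agreement but a missing or odd adverb still scores 1, so the percentage reflects verb agreement only.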

The Reading Proficiency Task
As mentioned earlier, the reading proficiency test was administered to 32 respondents and consisted of 40 multiple choice questions. Each correct answer was worth 1 point. Table 1, below, presents the mean scores of the respondents on the reading proficiency test. The mean score of L2 learners' performance on the Reading Proficiency Task was 76.25. Table 2, below, shows the mean scores of the respondents when divided into two different levels of proficiency: advanced and intermediate.

Main Experimental Tasks
The experimental tasks utilised in this study were the Grammaticality Judgment Task (GJT) and the Sentence Fragment Completion Task. [...] of feminine nouns and masculine verbs. The participants were asked to judge whether each sentence was grammatically correct or incorrect. Each correct answer was awarded 1 point while 0 was given for an incorrect answer. Table 3, below, contains the scores of the 32 respondents on the Grammaticality Judgment Task. L2 learners achieved average agreement ratios with a mean of 70.75 on the GJT. The following table compares the mean scores at the two levels of proficiency, advanced and intermediate. Tables 5 and 6 contain the results of the Grammaticality Judgment Task for humanness (human vs non-human) as a conceptual factor and for the gender (masculine vs feminine) feature. [...] = 63.75). A one-way ANOVA showed that the differences across the three levels of this variable were significant (F (2,93) = 5.48, p = .006). Thus, the respondents achieved better agreement when the subject was singular and performed worse when the subject was plural. No significant difference was found between singular verbs and dual verbs. Table 7 contains the results of the Grammaticality Judgment Task by number (a verb that agrees with a subject in the singular, dual, or plural). [...] = 65.78). Regarding feature preference, L2 learners performed better when verbs had to agree with head nouns in singular human form. In contrast, L2 learners achieved equal scores on masculine and feminine nouns, indicating that no preference regarding gender was found in the GJT.

The Sentence Fragment Completion Task
The Sentence Fragment Completion Task consisted of 36 subject-verb fragments in which the verb was kept in its root or default base form. The participants were asked to complete each fragment with a suitable verb that agreed with the subject and, if possible, to produce a complete sentence containing a verb, an object, and an adverb. The participants were not given new subjects and verbs in this task, as all subjects and verbs were taken from the Grammaticality Judgment Task. Like the GJT, the task was broken down into categories with respect to gender (masculine and feminine), number (singular, dual, and plural) and humanness (human and non-human). Each correct answer was given 1 point while each incorrect answer was awarded 0 points. As the primary concern in this study was to identify correct verb agreement, 1 point was still awarded for correct agreement even when the object and/or the adverb was semantically inappropriate. Table 8 displays the results of the respondents on the Sentence Fragment Completion Task. This table shows that the mean score of the 32 respondents on the Sentence Fragment Completion Task was M = 63.84, which was lower than the mean score on the GJT (M = 70.75). Table 9 highlights the results of the respondents when divided into two levels of proficiency: advanced and intermediate. Table 9 shows that, once again, the advanced participants had higher agreement accuracy ratios (M = 78.55) than their intermediate counterparts (M = 58.08). The independent samples t-test showed that the difference in scores on this task was significant, t (30) = 2.71, p = .01 (two-tailed).
Tables 10, 11, and 12 present the scores on the Sentence Fragment Completion Task by two features, gender (masculine vs feminine) and number (singular vs dual vs plural), with humanness (human vs non-human) as a conceptual factor. Table 10 shows that L2 learners performed better on masculine verbs (M = 70.97) than on feminine verbs (M = 59.00) in spoken production. In the domain of humanness, human verbs (M = 71.31) obtained better agreement accuracy than non-human verbs (M = 58.66). Independent samples t-tests showed significant differences for both the gender feature, t (62) = 2.22, p = .03, and humanness, t (62) = 2.27, p = .02. Table 12 shows different results with respect to verbs in singular (M = 84.16), dual (M = 59.66), and plural forms (M = 63.75). A one-way ANOVA revealed that the differences among these three levels were significant (F (2,93) = 17.37, p < .001). Thus, the respondents achieved better agreement on singular verbs, which was similar to the results found for the GJT. The following section presents a breakdown of the number feature (singular, dual, and plural) by gender and humanness, in order to determine where L2 learners' preferences lay. Table 13 presents singular verbs by gender and humanness. The results show that singular masculine human verbs (92%) were preferred to singular feminine non-human verbs (66%). Moreover, both singular masculine human (92%) and singular masculine non-human verbs (90%) were preferred to singular feminine human (85%) and singular feminine non-human (66%) verbs. A one-way ANOVA revealed that these differences were significant (F (3,124) = 9.88, p < .001). Therefore, based on the results, the order of preference is as follows: singular feminine non-human verbs < singular feminine human verbs < singular masculine non-human verbs < singular masculine human verbs.
It is interesting to note that, out of 32 respondents, 28 respondents had perfect scores when verbs needed to match singular masculine human subjects. Although the remaining 4 participants exhibited errors, 3 gave correct agreement in the present tense. Because the questions were constructed in the past tense, their answers were considered incorrect. Nevertheless, by implication, all participants performed perfectly on singular masculine human verbs except one.
A breakdown of dual verbs by gender and humanness is presented in Table 14. The results show that masculine verbs still outperformed feminine verbs. It is interesting to note that, among dual verbs, masculine non-human verbs (78%) exhibited the highest correct agreement ratio of all categories. A one-way ANOVA was conducted to compare the groups displayed in Table 14. This showed there was a significant difference between the groups (F (3,124) = 6.88, p < .001). The figure below shows L2 learners' preferences on plural verbs by gender and humanness. The results show that L2 learners performed better on plural human verbs than on plural non-human verbs. It is interesting to note that, compared to singular and dual non-human verbs, plural (masculine / feminine) non-human verbs produced the lowest scores (32% and 43% respectively). A one-way ANOVA revealed that the differences between groups were significant (F (3,124) = 6.14, p = .001).
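The degrees of freedom reported for the tests above can be checked with simple arithmetic. The sketch below assumes, since the text does not state it explicitly, that each of the 32 participants contributed one percentage score per category and that the category scores were treated as independent groups; the function names are hypothetical.

```python
N_PARTICIPANTS = 32


def t_df_independent(n_total: int) -> int:
    """Degrees of freedom for an independent-samples t-test: N - 2."""
    return n_total - 2


def anova_df(k_groups: int, n_total: int):
    """(between-groups, within-groups) degrees of freedom for a one-way ANOVA."""
    return (k_groups - 1, n_total - k_groups)


# Advanced vs intermediate: one score per participant.
print(t_df_independent(N_PARTICIPANTS))      # 30
# Gender or humanness: two scores per participant (64 observations).
print(t_df_independent(2 * N_PARTICIPANTS))  # 62
# Number (singular/dual/plural): three scores per participant.
print(anova_df(3, 3 * N_PARTICIPANTS))       # (2, 93)
# Gender x humanness cells within one number: four scores per participant.
print(anova_df(4, 4 * N_PARTICIPANTS))       # (3, 124)
```

Under that assumption the computed values match the reported t (30), t (62), F (2,93), and F (3,124).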

Discussion
This study investigated S-V agreement in Arabic among 32 Indonesian-speaking L2 learners. The study utilised the Grammaticality Judgment Task and the Sentence Fragment Completion Task to elicit data on abstract feature knowledge and on language use in relation to surface inflected forms. The main concern of the study was to examine the nature of any feature variability, and to determine whether the optional use of features supports LIH or MSIH. To avoid asymmetry between the written and oral task results, the researcher kept vocabulary and time constraints constant: basic and simple vocabulary was used in both tasks, and participants were subject to a time limit when completing them.
Each research question will now be addressed in turn. The first research question was stated as follows: 1. Will L2 Arabic learners show any tendencies regarding gender (masculine, feminine), number (singular, dual, plural), and humanness (human, non-human)?
Based on the findings, the answer to this question is that they do. L2 learners in this study achieved better agreement ratios on singular masculine verbs (human at 92% and non-human at 90%) than on the other groups (dual and plural verbs), as shown in Table 13. A possible explanation is that the third person singular masculine acts as the default dictionary form in Arabic. In other words, singular masculine verbs take a zero morpheme {-0} when they agree with singular masculine head nouns. This is likely to be the easiest form in the Sentence Fragment Completion Task because all verbs in this task are given in their base form (third person singular masculine). Hence, once the participants have correctly identified a third person singular masculine head noun, they need only repeat the third person singular masculine verb that has already been provided in its default form.
The findings also revealed that participants displayed higher accuracy on singular feminine verbs (human at 85% and non-human at 66%) than on dual and plural feminine verbs. The most plausible reason for this is that singular feminine forms are marked by the gender marker {-a} [13]. L2 learners can rely on this overt suffix to assign singular feminine gender correctly, and then inflect verbs accordingly to show agreement. Reliance on the feminine gender marker was clearly evident in this study. Of the three singular feminine nouns provided in the spoken production task, one was a crypto feminine noun carrying zero-morpheme marking, like masculine subjects. The problem was illustrated by the fact that 22 participants failed to use a singular feminine verb when the head noun was ʔalarḍu (the earth)². Overall, L2 learners showed better accuracy of agreement on singular feminine and masculine verbs. This is unsurprising, as the singular feminine form is usually marked by a gender suffix, while the singular masculine takes a zero morpheme. Moreover, the superiority of the singular masculine might be attributable to the fact that this form is acquired first in L1 child acquisition and is the form most frequently used by less advanced speakers of Arabic.
It is worth noting that the participants showed the worst agreement accuracy when verbs had to be matched with plural masculine/feminine non-human head nouns, where scores of 32% and 43% were achieved respectively (see Table 15). The explanation for these extensive errors is that plural non-human nouns follow a distinct agreement rule in Modern Standard Arabic: when plural nouns do not refer to human beings, they must be treated as singular feminine. Correct agreement in this domain is therefore achieved by matching plural non-human head nouns with singular feminine verbs. Within this domain, 18% of error ratios were found using plural verbs, while 62% of errors involved the use of singular masculine verbs. A further explanation of this phenomenon will be discussed later in support of MSIH. Overall, the greater number of errors in plural non-human forms may be due to this specific form of gender assignment, which makes the agreement process more complicated. Complicated plural non-human forms were also included in Khalid's [35] study, whose results are consistent with the present research. In her study, she investigated the oral production of number and gender agreement in bilingual Arab-American speakers. Her findings revealed that plural, masculine/feminine, non-human nouns were among the most difficult grammatical elements encountered by the participants.
² ʔalarḍu (the earth) is an example of a crypto feminine noun. This is the case when a small subclass of feminine nouns is not marked by the regular suffix {-a}, but is instead marked by a zero morpheme {-0}, like a masculine noun.
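The agreement rule for plural non-human nouns described above can be illustrated with a short Python sketch. It is a deliberately simplified model of S-V agreement that covers only gender and number (person and tense are ignored), and the names `Features`, `expected_verb_features`, and `describe_response` are hypothetical labels introduced here for illustration.

```python
from typing import NamedTuple


class Features(NamedTuple):
    gender: str  # "masculine" or "feminine"
    number: str  # "singular", "dual" or "plural"


def expected_verb_features(gender: str, number: str, human: bool) -> Features:
    """Verb features required after a preverbal subject (S-V order).

    Plural non-human nouns take singular feminine agreement regardless of
    their own gender; all other nouns are mirrored by the verb.
    """
    if number == "plural" and not human:
        return Features("feminine", "singular")
    return Features(gender, number)


def describe_response(produced: Features, target: Features) -> str:
    """Label how a produced verb form relates to the target form."""
    if produced == target:
        return "correct"
    if produced.number == target.number:
        return "number matched, gender not"
    if produced.gender == target.gender:
        return "gender matched, number not"
    return "neither feature matched"


target = expected_verb_features("masculine", "plural", human=False)
print(target)
print(describe_response(Features("masculine", "singular"), target))
```

In this sketch, a singular masculine verb produced after a plural non-human subject is labelled as matching the target in number but not in gender, which corresponds to the most frequent error type reported for this domain.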
The second research question in this study was stated as follows: 2. Will the results support the Local Impairment Hypothesis (LIH) or Missing Surface Inflectional Hypothesis (MSIH)?
The Local Impairment Hypothesis predicts that L2 learners cannot fully attain functional features, regardless of L1 background, and therefore that feature strength in L2 is essentially impaired [7,24]. This implies that L2 Arabic learners will exhibit variability when they encounter inflectional morphology and thus that Arabic S-V agreement will be missing in their oral production. Based on the findings, the overall results on the oral production data were relatively low (M = 63.84). This implies somewhat low accuracy, but not significant variability. In comparison, Patty, whose language production was investigated by Lardiere [10], achieved a lower, stabilised inflectional morphology score of 34%, even though she showed perfect accuracy on nominative and accusative pronominal case, thus indicating that she had knowledge of the relevant features at an abstract level. The results of the present study might seem to support LIH, as optional use of inflectional morphology in verbal gender agreement is evident. Nevertheless, LIH is not strongly supported, for the following reasons.
There is insufficient evidence to claim that the gender feature is impaired, because participant 1 showed no errors in either the GJT or the spoken production data. As highlighted in figure 16, this participant obtained a perfect score of 100% on both the Grammaticality Judgment Task and the Sentence Fragment Completion Task. Moreover, participants 5 and 6 achieved high scores of 97% and 94% in the spoken production task even though their scores on the GJT, 88% and 83% respectively, were lower. Hence, although the overall findings imply there is variability in the use of the inflectional feature system, the excellent scores of participants 1, 5, and 6 suggest that full attainment is possible for L2 learners. Therefore, the findings of the present study disconfirm LIH, as full L2 attainment is possible and variability in inflectional morphology seems temporary rather than permanent.
[Figure: S-V agreement scores on the GJT and the sentence fragment completion task]
MSIH, on the other hand, may be able to account for variability in the use of verbal gender agreement. Within this framework, it would be predicted that there will be temporary impairment due to the difficulty of realising functional features. L2 learners would therefore find it difficult to map abstract features to surface morphological forms; that is, for the Arabic L2 learners in the present study, the problem will lie in mapping gender agreement features appropriately to their surface morphological manifestation [21][22][23]. Strictly speaking, it would be predicted that L2 learners will be aware of the grammatical gender agreement rules being examined, but will fail to provide the correct morphological forms. The present findings support these predictions to some extent. An indication of this is that the participants exhibited statistically higher correct agreement on the written task (M = 70.75) than on the oral production task (M = 63.84).
Because the written task aimed to determine L2 learners' knowledge of gender features and the oral task explored the production of proper inflected surface morphology, the better results on GJT indicate that L2 learners of Arabic do have conscious knowledge of gender agreement rules. This suggests that abstract knowledge of gender features has been acquired and refutes the view that feature strength is essentially impaired in L2 learners. Moreover, it is very difficult to claim that L2 learners at any stage will suffer from impairment as some L2 learners of Arabic produced superior results, indicating that the impairment is temporary and not related to all stages of L2 development.
Moreover, another possible explanation supporting MSIH relates to the production of dual verbs in the past tense. Surprisingly, the findings show that the correct use of dual masculine non-human verbs surpassed that of dual masculine human verbs, even though human verbs were the most accurately used by participants in both tasks (see Figures 1-3). Scrutinising participants' errors in depth, however, reveals that these errors were not related to humanness. Instead, most of the participants made errors in the use of verb tenses. The most noticeable errors in dual masculine human verbs concerned the use of the suffix reserved for the present tense, {-a:ni}. Proper agreement is achieved by producing dual masculine human verbs in the past tense, as all human forms in this task were presented in the past tense. Thus, while the appropriate agreement required the past-tense suffix {-a:}, most of the participants produced the present-tense suffix {-a:ni}. The following are examples of such errors:
a) IL: *ar-rajul-a:ni ɦaḍar-a:ni wali:matal ʕursi
   the-man.du.masc.hum attended.du.masc.hum wedding party
   TL: ar-rajul-a:ni ɦaḍar-a: wali:matal ʕursi
   the-man.du.masc.hum attended.du.masc.hum wedding party
   'The men attended a wedding party.'
b) [...] 'The teachers explained the lesson.'
Nevertheless, the participants did provide the correct suffix alif {-a:} on the base or default singular form, but tended to overgeneralise by adding the suffix alif nun {-a:ni} as if in the present tense. In total, 28 errors involving misuse of the suffix {-a:ni} in the past tense were found, while no significant errors were detected in the present tense. In this instance, the participants were assumed to have assigned dual number successfully, as the proper suffix for dual verbs in the past tense, {-a:}, was produced. This phenomenon therefore suggests that feature strength is not impaired.
It is more the case that the problem of overgeneralising the suffix {a:ni} in the past tense is due to realisation of surface morphology which supports the MSIH proposal.
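The distinction between the target past-tense suffix and the overgeneralised present-tense suffix can be made concrete with a small classifier over transliterated verb forms. This is a hypothetical sketch for illustration only: it inspects nothing but the final suffix of a transliterated form, and the function name `classify_dual_past` is not taken from the study.

```python
def classify_dual_past(form: str) -> str:
    """Classify a transliterated dual masculine verb elicited in a past-tense item."""
    stripped = form.strip()
    # Check the longer suffix first, since {-a:ni} contains {-a:}.
    if stripped.endswith("a:ni"):
        return "present-tense suffix {-a:ni} on a past-tense item"
    if stripped.endswith("a:"):
        return "target past-tense suffix {-a:}"
    return "other"


print(classify_dual_past("ɦaḍar-a:ni"))
print(classify_dual_past("ɦaḍar-a:"))
```

Both forms carry dual marking, which is why the error is treated as a problem of surface realisation rather than of number or gender assignment.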
It is instructive to review the earlier finding that participants made the most errors in the plural (masculine/feminine) non-human verb domain. As mentioned earlier, the agreement rule for plural non-human nouns in MSA is an exception: they are treated as singular feminine rather than as plural forms. Examining the error patterns shows that the most frequent error in this area was the use of singular masculine forms (61/192), rather than the misuse of dual or plural forms. This indicates that some L2 learners knew that plural non-human forms must be treated differently from regular (human) plurals. Thus, total impairment of feature strength is not confirmed, as those who made these errors did recognise that plural non-human nouns must agree with singular verbs, but failed to turn the verbs into feminine forms. Therefore, a mapping problem may indeed be the explanation.
To sum up, variability in the use of inflectional morphology is evident in L2 learners' oral production in the present study. However, based on the findings, this variability appears to be attributable to difficulties in mapping abstract features to surface forms rather than to impaired feature strength in the interlanguage grammar. Because some learners produced superior results, with one participant displaying perfect accuracy in both tasks, the findings provide support for MSIH rather than for the significant impairment proposed by LIH.

Conclusions and Recommendations
This study explored Arabic S-V agreement among 32 Indonesian-speaking L2 learners of Arabic. Its primary focus was on investigating the inflectional morphology of gender (masculine/feminine) and number (singular/dual/plural), and the influence of the conceptual factor of humanness (human/non-human). The findings of this study were then used to examine the degree of support they provide for two explanatory frameworks, LIH and MSIH.
The reading proficiency task was employed to divide participants into intermediate and advanced groups. Two experimental tasks, the GJT and the sentence fragment completion task, were then administered to elicit data on L2 learners' language knowledge and their use of gender agreement. The findings showed that most participants were aware of the grammatical features being investigated in the study, even though accuracy in the production of inflectional morphology was relatively low. The findings were therefore more in line with the MSIH framework, and disconfirm the claim by LIH that there will be greater variability in both tasks at any stage of development. Furthermore, the aim of the study was to select L2 learners who had reached a final state of grammar, so that the comparison between abstract knowledge and surface form manifestation would be more reliable. The researcher acknowledges, however, that the participants in this study might still be at low-intermediate and high-intermediate levels of proficiency. This might be due to the use of a partial version of a reading proficiency test that, in full, contains approximately 200 questions and normally takes 2.5 hours to complete. Due to limited time, this study utilised only 40 items, which may have affected the accuracy with which proficiency level was determined.
To conclude, very little research in Arabic SLA has investigated dual or plural S-V agreement, which is generally more challenging as it involves very complex attached affixes. Because of time constraints, the present study utilised only a small number of target items per variable. Future studies need to employ a larger number of target items for each variable, for example 50 target items for dual/plural S-V agreement, to obtain more meaningful data and produce more robust results. Nevertheless, clear trends have been identified which will serve as a foundation for future research.