Abstract
This study investigated differences in lexical density and diversity in the English L1 vs. L2 academic writing of EFL, ESL, and English L1 postgraduate students in order to compare their lexical proficiency in EFL vs. English L1 academic settings. A corpus of 210 dissertation abstracts was analysed using three natural language processing tools (LCA, TAALED, and Coh-Metrix), with the effects of text length and topic controlled. In doing so, we examined the relationship between 15 lexical indices and the construct-distinctiveness of lexical density and diversity. The measure-testing process also assessed the effectiveness of each measure within a pair or group of closely related measures (in terms of their quantification methods) in capturing lexical diversity differences among these texts. The aim was to obtain a small number of unique measures that capture lexical diversity as an indicator of lexical proficiency and to assist future writing researchers in selecting measures from the multitude available. The findings have important implications for writing assessment and research on lexical indicators of writing proficiency, for materials development in EFL academic settings, especially for thesis/dissertation writing modules, and for the possible contribution of ESL academic immersion programmes to bringing English L1 and L2 proficiency closer together.
1. Introduction
Lexical density and diversity, two dimensions of lexical complexity and aspects of productive lexical knowledge, remain two of the most reliable indicators of the lexical and linguistic proficiency and development of language users in the first and second language, as well as in writing and academic studies (see e.g., Bulté & Housen, 2012; Lu, 2012). Lexical density is the proportion of lexical/content words to all words/tokens. It is regarded, especially where nouns are used densely, as an indicator of condensed academic writing and advanced informational prose (e.g., in Biber, 2006; Biber & Gray, 2016; Pietilä, 2015) and as a strong predictor of academic writing proficiency (e.g., Kim, 2014). Lexical diversity is the use of a range of diverse words (also known as unique word types) to convey meaning and is regarded as an indicator and predictor of lexical proficiency and development (Gonzalez, 2013; Mazgutova & Kormos, 2015; Yoon, 2017). Although interrelated, the two constructs can be differentiated: lexical density shows how densely lexical items are packed into syntactic structures, while lexical diversity reflects the non-repetitive and/or different lexical and grammatical items used in language production, e.g., in a text. Accordingly, a learner can produce statements with higher lexical density and lower lexical diversity, and vice versa (Johansson, 2008).
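The distinction between the two constructs can be illustrated with a minimal sketch. It assumes a simple type-token ratio as a stand-in for lexical diversity and a small, hypothetical function-word list for separating content words from function words. It is not the procedure implemented in LCA, TAALED, or Coh-Metrix; all function names and the word list are illustrative assumptions.

```python
import re

# Hypothetical, deliberately tiny function-word list for illustration only;
# real tools typically identify content words with part-of-speech tagging.
FUNCTION_WORDS = {
    "a", "an", "the", "and", "or", "but", "of", "in", "on", "at", "to",
    "for", "with", "by", "from", "is", "are", "was", "were", "be", "been",
    "it", "this", "that", "these", "those", "he", "she", "they", "we", "i",
    "you", "not", "as", "than", "then", "so", "if", "which", "who",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def lexical_density(tokens):
    """Proportion of content (non-function) words to all tokens."""
    if not tokens:
        return 0.0
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(content) / len(tokens)


def type_token_ratio(tokens):
    """Proportion of unique word types to all tokens (simple TTR)."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# A text made only of content words, but highly repetitive.
dense_repetitive = tokenize("Data analysis needs data. Data analysis needs data.")
# A text made only of function words, with almost no repetition.
sparse_varied = tokenize("It is this that we were in, and they are not of it.")

for label, tokens in [("dense, repetitive", dense_repetitive),
                      ("sparse, varied", sparse_varied)]:
    print(label, round(lexical_density(tokens), 3),
          round(type_token_ratio(tokens), 3))
```

In this toy example the first text contains only content words but just three types across eight tokens, so its density is high and its diversity low; the second text shows the reverse pattern. The simple type-token ratio used here is sensitive to text length, which is one reason length needs to be controlled when texts are compared.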