ISSN: 1738-1460

Volume 6
Teachers Articles
May 2005
Article 1


Article Title

Comparison of Three Methods of Assessing Difficulty

Author

Dr. Nasrin Shokrpour
Shiraz University of Medical Sciences
Shiraz
Iran

Introduction

Text difficulty has concerned educational researchers and practitioners for more than 70 years (Chall and Conrad, 1991), and many methods have been used to assess it (Chall and Dale, 1995). Educational publishers regard providing texts of appropriate difficulty as one of the most important aspects of textbook development. Publishers, writers, editors and teachers have all tried to match the difficulty of textbooks to readers' reading ability so that the texts can be used successfully (Chall and Conrad, 1991; Chall and Dale, 1995; Harris-Sharples, 1983; Day, 1994).

Efforts to control the readability of reading instructional material date back to the early 19th century and led to the emergence of different readability formulas. Many such formulas have been widely used for several decades (Fry, 1977; Dale and Chall, 1948; Chall, 1956; 1958) and have been considered the most reliable and valid (Klare, 1963; 1984). However, most of them focus on syntactic and semantic measures of difficulty and do not include other variables known to predict difficulty. After some years of application, even their developers acknowledged several weaknesses in their formulas (Klare, 1984). Accepting these weaknesses, some researchers attempted to improve and modify the existing measures of difficulty and, as they claim, to develop qualitative measures (Chall, 1956; 1958; Chall and Dale, 1995).

Qualitative assessments have proved valid in psychological and educational research for more than 80 years (Thorndike's writing scale, 1910, 1912). Porter and Popp (1975) found a high correlation between judgments of the difficulty of children's books and the difficulty of those books as measured by cloze scores and oral reading errors. Chall (1958) found a correlation of 0.8 between judges' ratings of the difficulty of passages and their readability levels.

Interest in qualitative assessment has grown further in the past decade; the Reading Recovery Program at Ohio State University (1990) and Weaver (1992) are examples. Overall, the characteristics of difficulty used in these studies are quite similar to those used in classic studies of readability such as Gray and Leary (1935) and Chall and Dale (1995).

In an attempt to develop a qualitative approach to readability, Chall et al. (1996) presented a method based on matching samples of text to exemplars that have been scaled for comprehension difficulty. The method includes six scales ranging from reading level 1 to 16+ in literature, science and social studies. The authors claim that their qualitative assessment is more sensitive to the great variety of variables that differentiate texts, including vocabulary, syntax, conceptual load, text structure and cohesion, rather than focusing only on text features. This type of qualitative assessment is, however, mostly based on a total reaction to the text.

On the other hand, linguistic analysis of the text based on systemic functional grammar is an approach to readability which focuses on mode as the factor contributing to text complexity. In this approach, differences in mode and their relationship to complexity are considered in determining the level of difficulty of texts. As stated by Martin (1992), mode "refers to the role language is playing in realizing social action" (pp. 508-9). Spoken language is concerned with process, that is, language in action, and is therefore less complex. The written mode, by contrast, is related to product, i.e. language in reflection, which makes this type of language more abstract and, as a result, more complex. According to systemicists, the lexico-grammatical features of a text and their variations contribute to variations in complexity. Complexity is derived from text features including: (1) lexical density, which is the proportion of lexical items as a ratio of the number of clauses in a text; (2) grammatical intricacy, which is the use of long and intricate clause complex patterns; (3) complex nominal groups, or the embedding structure of nominal groups, including a noun and pre- and post-modifiers consisting of embedded clauses; and (4) grammatical metaphor, which is an atypical realization of process, participant and circumstance functions in the language system. For full details of how these features are operationalised in determining the difficulty level of texts, refer to Shokrpour (2004). A study by Shokrpour (1998) found that a systemic functional approach to complexity is a better measure of difficulty than readability formulas.
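The definition of lexical density given above (lexical items as a ratio of the number of clauses) can be sketched as a one-line calculation. This is only an illustration of the ratio, not the full procedure in Shokrpour (2004): the function name and the sample counts are hypothetical, and identifying lexical items and clauses is assumed to be done by hand beforehand.

```python
def lexical_density(lexical_items: int, clauses: int) -> float:
    """Lexical items per clause, following the definition in the text."""
    if clauses <= 0:
        raise ValueError("clauses must be a positive count")
    return lexical_items / clauses


# Hypothetical counts for illustration only; they are not taken from the study.
sample_density = lexical_density(lexical_items=26, clauses=8)
print(round(sample_density, 2))  # 3.25
```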

Since reading difficulty has been and continues to be one of the most important aspects of reading comprehension, finding a good approach to readability assessment would be useful for both teachers and writers. This study therefore compares three methods of estimating difficulty: one classic (Fry) and two qualitative (Shokrpour and Chall et al.). It attempts to determine whether there is a relation between the judgments of difficulty given by Fry's readability formula, Shokrpour's method and Chall et al.'s method, and the difficulty of the texts as measured by cloze tests taken by first year university students.

Materials and methods
Materials: Four passages from Chall et al.'s science scale at different levels of difficulty (1, 4, 8, 16+), as determined by them, were selected. The difficulty level of these passages was calculated by Fry's formula (1977) and also by systemic functional grammar criteria (Shokrpour, 2004). The passages were converted into cloze tests by deleting every seventh word.
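The every-seventh-word deletion described above can be sketched as follows. This is a minimal illustration, not the procedure actually used to prepare the tests: the function name, the whitespace tokenisation, the blank marker and the choice to start counting from the first word are all assumptions, and punctuation attached to a deleted word is removed along with it.

```python
def make_cloze(text: str, n: int = 7, blank: str = "____"):
    """Replace every n-th word with a blank; return the cloze text and the answer key."""
    words = text.split()
    answers = []
    for i in range(n - 1, len(words), n):
        answers.append(words[i])
        words[i] = blank
    return " ".join(words), answers


# Made-up passage for illustration only.
passage = (
    "one two three four five six seven eight nine ten "
    "eleven twelve thirteen fourteen fifteen"
)
cloze_text, key = make_cloze(passage)
print(cloze_text)
print(key)  # ['seven', 'fourteen']
```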

Participants: The tests were administered to 114 first year medical students enrolled in General English I courses at Shiraz University of Medical Sciences, who were at a comparable level of proficiency.

Procedure: The tests were administered during two consecutive weeks under standard conditions in the 2004 academic year. After completing the tests, the students were presented with the same four passages, this time with all the deletions intact, and were asked to say which of them they thought was the easiest, which was about in the middle and which was the hardest. The scores were then calculated as the total number of correctly answered blanks. The data were analyzed using descriptive analysis, correlation and t-tests.
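Scoring by the total number of correctly answered blanks can be sketched as below. The text does not say whether only the exact deleted word was accepted or whether acceptable alternatives also counted; this sketch assumes case-insensitive exact-word matching, and the function name and sample answers are hypothetical.

```python
def score_cloze(responses, answer_key):
    """Count blanks whose response matches the key (case-insensitive exact match)."""
    return sum(
        1
        for given, expected in zip(responses, answer_key)
        if given.strip().lower() == expected.strip().lower()
    )


# Made-up answer key and responses for illustration only.
answer_key = ["water", "heat", "plants"]
responses = ["Water", "light", "plants "]
print(score_cloze(responses, answer_key))  # 2
```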

Results
The tests were in increasing order of difficulty according to Chall et al.'s scale (1, 4, 8, 16+). The analysis of text difficulty using Fry's formula gave the following results, which show exactly the same order as Chall et al.'s, the tests being in increasing order of difficulty.

Table 1: Difficulty level as determined by Fry's readability formula

Text    Level
1       1
2       4
3       7
4       13

The results of the linguistic analysis of the text using the method based on systemic functional criteria are displayed in the following Table:

Table 2: Difficulty level as determined by systemic functional grammar criteria

Text    Lexical density    Grammatical intricacy    Complex nominal group    Grammatical metaphor
1       2.26               1.14                     0                        0.07
2       3.23               2.08                     0.31                     0.58
3       3.5                4                        0.38                     1
4       5.2                3                        1.3                      3.42

As shown in the Table, the order of difficulty determined by systemic functional grammar agrees with that proposed by Chall et al., with one exception. According to systemic functional grammar, the higher the grammatical intricacy of a text, the less abstract and, therefore, the easier the text. This accords with Chall et al.'s order of difficulty in tests 3 and 4 but not in tests 1 and 2, since the order of difficulty here is 4, 3, 1, 2. As to the last two criteria, the order of difficulty in this model confirms that of Chall et al., the tests being in order of increasing difficulty.
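One way to check which criteria follow Chall et al.'s order is to test whether each column of Table 2 rises steadily from text 1 to text 4. The sketch below does only that. The values are copied from Table 2 (reading "0,58" as 0.58), while the dictionary keys and function name are illustrative assumptions.

```python
# Values from Table 2, listed for texts 1 to 4 in Chall et al.'s order.
table2 = {
    "lexical_density": [2.26, 3.23, 3.5, 5.2],
    "grammatical_intricacy": [1.14, 2.08, 4.0, 3.0],
    "complex_nominal_group": [0.0, 0.31, 0.38, 1.3],
    "grammatical_metaphor": [0.07, 0.58, 1.0, 3.42],
}


def rises_with_level(values):
    """True if each value is strictly greater than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


rising = {name: rises_with_level(values) for name, values in table2.items()}
for name, result in rising.items():
    print(name, result)
```

Three of the four criteria rise steadily across the texts; grammatical intricacy is the exception.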

The descriptive analysis indicates that the proportion of students who filled in all the blanks correctly was highest in test 1 and decreased from test 1 to test 4 (38.6%, 24.6%, 7% and 1.8%, respectively). The mean number of correct answers was highest in test 1 (12.2) and lowest in test 4 (4.25). The scores therefore mostly accord with the levels determined by the three methods. Moreover, the t-tests between the students' scores on each pair of tests (Table 3) indicate significant differences for all pairs (p = .006 for tests 1 and 3; p = .001 for all other pairs).

Table 3. Paired samples test

Pair           Sig. (2-tailed)
TEST1-TEST2    .001
TEST2-TEST3    .001
TEST3-TEST4    .001
TEST1-TEST3    .006
TEST2-TEST4    .001
TEST1-TEST4    .001
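For readers unfamiliar with the paired samples test reported in Table 3, the statistic behind it can be sketched as follows. The scores are made up for illustration and are not the study's data; the function name is hypothetical. Only the t statistic and degrees of freedom are computed here. Obtaining the two-tailed significance requires the t distribution, for example through a statistics package such as SciPy (`scipy.stats.ttest_rel`).

```python
import math
import statistics


def paired_t(x, y):
    """Return the t statistic and degrees of freedom for a paired samples t-test."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    diffs = [a - b for a, b in zip(x, y)]
    sd_d = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd_d == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = statistics.mean(diffs) / (sd_d / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical scores of five students on two cloze tests.
easier_test = [12, 10, 14, 9, 11]
harder_test = [9, 8, 10, 8, 7]
t_stat, df = paired_t(easier_test, harder_test)
print(round(t_stat, 2), df)
```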

To determine whether the students' scores on the tests are correlated, a correlation test was performed between every pair of tests (Table 4).

Table 4. Correlation between the tests

Pair           N      Correlation    Sig.
TEST1-TEST2    114    .697           .001
TEST2-TEST3    114    .399           .001
TEST3-TEST4    114    .593           .001
TEST1-TEST3    114    .324           .001
TEST2-TEST4    114    .341           .001
TEST1-TEST4    114    .254           .006


As shown in the results, the correlations between tests 1 and 2 (.697) and between tests 3 and 4 (.593) are higher than those between tests 1 and 3 (.324) and between tests 1 and 4 (.254). That is, the correlation is higher between two easy or two difficult tests than between an easy and a difficult one, which confirms the orders proposed by the three methods of assessing difficulty. Based on these results, therefore, there is no significant difference between the orders of difficulty proposed by the three methods, as shown in the scores.
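The pairwise correlations discussed above can be illustrated with a short sketch. The text does not name the coefficient used; Pearson's r is assumed here. The scores are made up and are not the study's data, and the function name is hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length samples with at least two scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one sample is constant")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical scores of five students on two cloze tests.
first_test = [12, 10, 14, 9, 11]
second_test = [9, 8, 10, 8, 7]
r = pearson_r(first_test, second_test)
print(round(r, 3))
```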

The results of the interview with the students as to their impression about the difficulty of each text are displayed in the following Table:

Table 5. Students' impression about the difficulty of the texts

TEXT 1
Level    Frequency    Percent
1        87           76.3
2        23           20.2
3        4            3.5

TEXT 2
Level    Frequency    Percent
1        44           38.7
2        47           41.2
3        16           14
4        7            6.1

TEXT 3
Level    Frequency    Percent
1        13           11.4
2        31           27.2
3        46           40.4
4        24           21.0

TEXT 4
Level    Frequency    Percent
1        28           24.6
2        12           10.5
3        37           32.4
4        37           32.5


As shown in the Table, the highest percentage of students view text 1 as the easiest (76.3%), and this figure is lower for the other texts (text 2 = 38.6%, text 3 = 11.4%, text 4 = 24.6%). As to the most difficult texts, the lowest percentage of students see text 1 as the most difficult (3.5%), and this figure increases for texts 2 to 4 (text 2 = 14%, text 3 = 21.1%, text 4 = 32.5%). It can therefore be concluded that the order presented by our methods is confirmed by the students' impressions of the difficulty of these four texts.

Discussion
The results of this study show that the students' mean scores on the easiest and the most difficult tests agree with the levels determined by the three methods. Various methods of assessing the readability of a text have been proposed. The most direct is to administer a reading comprehension test based on the material to a group of readers of known language ability. Readability can also be measured by experts' judgment, and a third approach is to use readability formulas. The first and second approaches are time consuming and costly. As to readability formulas, although recent studies show that they are valid in the EFL context (Greenfield, 2003), many researchers have found that classic formulas are not very accurate predictors of EFL difficulty (Brown, 1998; Shokrpour & Gibbons, 1998). In this study there was a relationship between students' tested comprehension scores and Fry's readability formula, but this is because Chall et al. used their own readability formula in making their exemplars, and that formula focuses on the same characteristics as other such formulas. Overall, these formulas are strictly text based and do not address the interactive nature of the reading process. Moreover, they do not distinguish between written and spoken discourse.

On the other hand, qualitative methods are more precise and more sensitive to the great variety of text variables that contribute to difficulty. Chall et al.'s method is based on total impression rather than on the analysis of text features. They claim that their qualitative measure has the advantages of simplicity and time effectiveness. Although linguistic analysis of a text takes some time, it is worth doing because it focuses on the differences between spoken and written language and their effect on the complexity of the text. Brown (1998) came to the same conclusion, stating that the analysis of the linguistic characteristics of the text is highly related to EFL difficulty.

As to the components of the linguistic analysis based on systemic functional grammar criteria, lexical density has been found to be highly related to EFL difficulty (Brown, 1998). This was confirmed in our study, since the order of difficulty for lexical density was exactly the same as that of the students' mean scores (1, 2, 3, 4). Moreover, complex nominal groups, including those with two or more modifying elements, a prepositional phrase as qualifier, an embedded clause as modifier or a noun acting as a modifier, contribute to the packing of information while adding to the structural complexity and carrying the main burden of the lexical content of the text. As to grammatical metaphor, in an atypical realization a process may be realized as a thing and circumstantial meaning as a process; these are metaphorical lexicogrammatical forms of a semantic configuration and are more complex than non-metaphorical ones.

Systemic functional grammar proposes that spoken language is grammatically intricate but lexically sparse. Grammatical intricacy, the use of long sentences, is therefore lower in written language, which relies more on embedding and so has shorter sentences. This was confirmed at two levels of Chall et al.'s scale but not in tests 1 and 2. However, a newer readability formula, the Lexile Framework software, assumes that passages consisting of short sentences are easier to read than passages consisting of longer sentences. This point needs further investigation.

Some recent studies are in line with systemicists in regard to complexity. Shin (2002) reports that abstract texts are generally assumed to be more difficult to understand than texts describing real objects, since the former require more exacting referencing skills than the latter. This is exactly what systemic functional grammar claims to be the difference between spoken and written language. Research has also shown that lexical and syntactic knowledge in L2 are the strongest predictors of L2 reading performance among other factors (Bernhardt and Kamil, 1995; Cooper, 1984). Similarly, Day (1994) considers lexical density and background knowledge to be the two most important elements contributing to the complexity of a text.

Conclusion
As qualitative assessments, both our methods focus on vocabulary difficulty and on idea density and difficulty, but their approaches differ: one uses a readability formula and focuses on impression, while the other uses linguistic analysis of the text. In general, this study indicates that further studies are required to arrive at the best EFL readability index. In the course of this study, a number of questions occurred to me that other researchers need to address in the future:

1. Would similar results be obtained if this study were repeated using other students in other EFL or ESL contexts?
2. Would similar results be obtained if other approaches to readability assessment were used?

References
Bernhardt, E. B. & Kamil, M. L. (1995). Interpreting relationship between L1 and L2 reading: Consolidating the linguistic threshold and the linguistic interdependence hypothesis. Applied Linguistics, 16(1), 15-34.

Brown, J. D. (1998). An EFL readability index. JALT Journal, 20(2).

Chall, J. S. (1956). A survey of users of the Dale-Chall formula. Educational Research Bulletin, 35, 197-212.

Chall, J. S. (1958). Readability: An appraisal of research and application. Columbus, OH: Ohio State University Press.

Chall, J. S. & Conrad, S. (1991). Should textbooks challenge students: A case for easy or hard textbooks. New York: Teachers College Press.

Chall, J. S. & Dale, E. (1995). Readability revisited: The new Dale-Chall readability formula. Cambridge, MA: Harvard University Press.

Chall, J. S., Bissex, G. L., Conrad, S. S. & Harris-Sharples, S. (1996). Qualitative assessment of text difficulty: A practical guideline for teachers and writers. Cambridge: Brookline Books.

Cooper, M. (1984). Linguistic competence of practiced and unpracticed nonnative readers of English. In J.C. Alderson and A.H. Urquhart (Eds.), Reading in a foreign language (pp. 122-35). New York: Longman.

Dale, E. & Chall, J. S. (1948). A formula for predicting readability and instructions. Educational Research Bulletin, 27, 11-20, 28.

Day, R. R. (1994). Selecting a passage for the EFL reading class. Forum, 32(1), p. 20.

Fry, E. (1977). Fry's readability graph: classification, validity and extension to level 17. Journal of Reading, 21(3), 242-52.

Gray, W. S. & Leary, B. E. (1935). What makes a text readable. Chicago: University of Chicago Press.

Greenfield, J. (2003). The Miyazaki EFL readability index. Comparative Culture, 9, 41-49.

Harris-Sharples, S. (1983). A study of the "match" between student reading ability and textbook difficulty during classroom instruction. Unpublished doctoral dissertation. Harvard Graduate School of Education, Cambridge, MA.

Klare, G. (1963). The measurement of readability. Ames, IA: Iowa State University Press.

Klare, G. (1984). Readability. In P.D. Pearson (Ed.), Handbook of reading research (pp. 681-744). New York: Longman.

Martin, J. R. (1992). English text, system and structure. Philadelphia, Amsterdam: John Benjamins.

Porter, D. & Popp, H. (1975). Measuring the readability of children's trade books. Report to Ford Foundation.

Reading Recovery Program. (1990). Reading recovery booklist. Columbus, OH: Ohio State University.

Shin, S. (2002). Effects of subskills and text types on Korean EFL reading scores. Second Language Studies, 20(2), 107-130.

Shokrpour, N. & Gibbons, J. (1998). Register complexity and the inadequacy of readability formulas as a measure of difficulty. IJOAL, 24(2), 19-36.

Shokrpour, N. (2004). Systemic functional grammar as a basis for assessing text difficulty. IJOAL, 30(2), 5-26.

Thorndike, E. L. (1912). Handwriting. NY: Teachers College (originally published in Teachers College Record, 11, March 1910).

Weaver, B. (1992). Defining literacy levels. Charlottesville, NY: Story House Corporation.


Copyright © 1999-2008 Asian EFL Journal. Last updated 7 May 2008.