Introduction
Text difficulty has been a concern of educational
researchers and practitioners for more than
70 years (Chall and Conrad, 1991) and many
have used different methods to assess the
difficulty of the text (Chall and Dale, 1995).
In fact, educational publishers have considered
providing texts of appropriate difficulty to be
one of the most important aspects of textbook
development. Publishers, writers, editors and
teachers have all taken the match between textbook
difficulty and readers' reading ability into
account so that texts can be used successfully
(Chall and Conrad, 1991; Chall and Dale, 1995;
Harris-Sharples, 1983; Day, 1994).
Controlling the readability of reading instructional
material dates back to the early 19th century,
leading to the emergence of different readability
formulas. Many such formulas have been
widely used for several decades (Fry, 1977;
Dale and Chall, 1948; Chall, 1956; 1958) and
have been considered the most reliable
and valid measures (Klare, 1963; 1984). However,
most of these formulas focus on syntactic
and semantic measures of difficulty and do
not include other variables known to predict
difficulty. After some years of application,
even their developers acknowledged
several weaknesses in these formulas
(Klare, 1984). Accepting these weaknesses,
some researchers attempted
to improve and modify the existing measures
of difficulty and to develop what they describe
as qualitative measures (Chall, 1956; 1958; Chall
and Dale, 1995).
Qualitative assessments have been shown to
be valid for more than 80 years in psychological
and educational research (e.g. Thorndike's writing
scale, 1910, 1912). Porter and Popp (1975)
found a high correlation between judgments
of the difficulty of children's books and the
difficulty of those books as measured by cloze
scores and oral reading errors. Chall (1958)
found a 0.8 correlation between judges' ratings
of the difficulty of passages and their readability
levels.
Interest in qualitative assessment has grown
even further in the past decade; the
Reading Recovery Program at Ohio State University
(1990) and Weaver (1992) are examples.
In all, the characteristics of difficulty
used by these studies are quite similar to
those used in classic studies of readability
such as Gray and Leary (1935) and Chall and
Dale (1995).
In an attempt to develop a qualitative approach
to readability, Chall et al. (1996) presented
a method based on matching samples of text
to exemplars that have been scaled for comprehension
difficulty, including six scales ranging from
reading level 1 to 16+ in literature, science
and social studies. They claim that their
qualitative assessment can be more sensitive
to the great variety of text variables that
differentiate texts, including vocabulary,
syntax, conceptual load, text structure and
cohesion, rather than focusing only on isolated
text features. This type of qualitative
assessment is, of course, based mostly on an
overall reaction to the text.
On the other hand, linguistic analysis of
the text based on systemic functional grammar
is an approach to readability which focuses
on mode as the factor contributing to text
complexity. In this approach, differences
in mode and its relationship to complexity
are considered in determining the level of
difficulty of texts. As stated by Martin (1992),
mode "refers to the role language is
playing in realizing social action"
(pp. 508-9). Spoken language is concerned with
process, that is, language in action, and is
therefore less complex. The written mode,
on the other hand, is concerned with product,
that is, language in reflection, which makes it
more abstract and, as a result, more complex.
According to systemicists, variations in the
lexico-grammatical features of a text
contribute to variations in its complexity, which
derives from four text features:
(1) lexical density, which is the number
of lexical items per clause in a text;
(2) grammatical intricacy,
which is the use of long and intricate clause
complex patterns; (3) complex nominal groups,
that is, nominal groups with an embedding
structure in which a noun takes pre- and
post-modifiers that include embedded clauses;
and (4) grammatical metaphor,
which is an atypical realization of process,
participant and circumstance functions in
the language system. For full details of how
these features are operationalised in determining
the difficulty level of texts, refer to
Shokrpour (2004). A study by Shokrpour
(1998) found that a systemic functional
approach to complexity is a better measure
of difficulty than readability formulas.
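As a rough illustration of the first two features, the sketch below computes lexical density and grammatical intricacy from hand-coded counts. Lexical density follows the definition given above (lexical items per clause). Treating grammatical intricacy as clauses per clause complex is an assumption made here for illustration; the operationalisation actually used is described in Shokrpour (2004). The class name, function names and example counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PassageCounts:
    """Hand-coded counts for one passage (hypothetical container)."""
    lexical_items: int     # content words counted by the analyst
    clauses: int           # ranking clauses in the passage
    clause_complexes: int  # clause complexes (roughly, sentences)


def lexical_density(counts: PassageCounts) -> float:
    """Lexical items per clause, as defined in the text."""
    if counts.clauses <= 0:
        raise ValueError("a passage needs at least one clause")
    return counts.lexical_items / counts.clauses


def grammatical_intricacy(counts: PassageCounts) -> float:
    """Clauses per clause complex (assumed operationalisation)."""
    if counts.clause_complexes <= 0:
        raise ValueError("a passage needs at least one clause complex")
    return counts.clauses / counts.clause_complexes


# Hypothetical counts, not taken from the study's passages.
example = PassageCounts(lexical_items=52, clauses=10, clause_complexes=4)
print(lexical_density(example))        # lexical items per clause
print(grammatical_intricacy(example))  # clauses per clause complex
```

The counting itself (deciding what is a lexical item or a ranking clause) remains a manual, analyst-dependent step; the code only does the arithmetic.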
Since reading difficulty has been and continues
to be one of the most important aspects of
reading comprehension, finding a good approach
to readability assessment would be useful
for both teachers and writers. Thus, this
study compares three methods of estimating
difficulty: one classic (Fry) and two
qualitative (Shokrpour and Chall et al.).
It attempts to determine whether difficulty as
judged by Fry's readability formula and by
Shokrpour's and Chall et al.'s methods is
related to the difficulty of the texts as measured
by cloze tests taken by first-year university
students.
Materials and methods
Materials: Four passages from Chall et
al.'s science scale at different levels of
difficulty (1, 4, 8, 16+), as determined by
them, were selected. The difficulty level of
these passages was also calculated using Fry's
formula (1977) and systemic functional grammar
criteria (Shokrpour, 2004). The passages were
then converted into cloze tests by deleting
every 7th word.
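The cloze construction described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used: it assumes counting starts at the first word of the passage, that punctuation attached to a deleted word is kept, and that scoring accepts only the exact deleted word (the source says only that correctly answered blanks were counted). The function names are hypothetical.

```python
import string


def make_cloze(text, n=7, blank="_____"):
    """Replace every n-th word with a blank; return (cloze_text, answers)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    cloze_words, answers = [], []
    for position, word in enumerate(text.split(), start=1):
        if position % n == 0:
            # Separate the word from surrounding punctuation so that
            # commas and full stops survive the deletion.
            core = word.strip(string.punctuation)
            if core:
                answers.append(core)
                cloze_words.append(word.replace(core, blank, 1))
                continue
        cloze_words.append(word)
    return " ".join(cloze_words), answers


def score_cloze(responses, answers):
    """Count blanks filled with the exact deleted word (case-insensitive)."""
    return sum(
        given.strip().lower() == expected.lower()
        for given, expected in zip(responses, answers)
    )


passage = ("one two three four five six seven eight nine ten "
           "eleven twelve thirteen fourteen, fifteen")
cloze_text, key = make_cloze(passage)
print(cloze_text)
print(key)
```

An acceptable-word scoring scheme would need a list of accepted alternatives per blank, which is outside this sketch.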
Participants: The tests were administered to 114
first-year medical students enrolled in General
English I courses at Shiraz University of Medical
Sciences, who were at a comparable level of
proficiency.
Procedure: The tests were administered over two
consecutive weeks under standard conditions
in the 2004 academic year. After completing the
tests, the students were presented with the same
four passages, this time with the deleted words
restored, and were asked to say which of them
they thought was the easiest, which was about
in the middle and which was the hardest. The
scores were then calculated as the total
number of correctly answered blanks. The data
were analyzed using descriptive statistics,
correlation and t-tests.
Results
The tests were in increasing order of
difficulty according to Chall et al.'s scale
(1, 4, 8, 16+). The analysis of text difficulty
using Fry's formula gave the following results,
which show exactly the same order as Chall
et al.'s, with the tests in increasing
order of difficulty.
Table 1: Difficulty level as determined by Fry's readability formula
| Text | Level |
| 1 | 1 |
| 2 | 4 |
| 3 | 7 |
| 4 | 13 |
The
results of the linguistic analysis of the
text using the method based on systemic
functional criteria are displayed in the
following Table:
Table 2: Difficulty level as determined by systemic functional grammar criteria
| Text | Lexical Density | Grammatical Intricacy | Complex Nominal Group | Grammatical Metaphor |
| 1 | 2.26 | 1.14 | 0 | 0.07 |
| 2 | 3.23 | 2.08 | 0.31 | 0.58 |
| 3 | 3.5 | 4 | 0.38 | 1 |
| 4 | 5.2 | 3 | 1.3 | 3.42 |
As shown in the Table, the order of difficulty
determined by systemic functional grammar
agrees with that proposed by Chall et al.,
with one exception. In systemic functional
grammar, the higher the grammatical intricacy
of a text, the less abstract and, therefore,
the easier the text. On this criterion the
difficulty ranks of texts 1 to 4 are 4, 3, 1, 2,
which accords with Chall et al.'s order of
difficulty for tests 3 and 4 but not for tests
1 and 2. As to the last
two criteria, the order of difficulty
in this model confirms that of Chall et
al., the tests being in order of increasing
difficulty.
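The rank comparison described above can be reproduced from the values in Table 2. The sketch below is only an illustration of that reading: it ranks the four texts on each criterion (rank 1 = easiest) and, following the text, treats higher grammatical intricacy as easier. The variable and function names are hypothetical.

```python
# Values copied from Table 2, in the order text 1, 2, 3, 4.
table2 = {
    "lexical_density": [2.26, 3.23, 3.5, 5.2],
    "grammatical_intricacy": [1.14, 2.08, 4, 3],
    "complex_nominal_group": [0, 0.31, 0.38, 1.3],
    "grammatical_metaphor": [0.07, 0.58, 1, 3.42],
}

# Chall et al.'s scale puts the texts in the order 1, 2, 3, 4.
chall_ranks = [1, 2, 3, 4]


def difficulty_ranks(values, higher_is_easier=False):
    """Return one rank per text, 1 = easiest (no ties in Table 2)."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=higher_is_easier)
    ranks = [0] * len(values)
    for rank, text_index in enumerate(order, start=1):
        ranks[text_index] = rank
    return ranks


for name, values in table2.items():
    # Only grammatical intricacy is read as "higher means easier".
    ranks = difficulty_ranks(
        values, higher_is_easier=(name == "grammatical_intricacy"))
    print(name, ranks, "agrees" if ranks == chall_ranks else "differs")
```

With these values, only grammatical intricacy departs from Chall et al.'s order, giving the ranks 4, 3, 1, 2 for texts 1 to 4.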
The results of the descriptive analysis indicate
that the proportion of students who filled all
the blanks correctly was highest in test 1 and
decreased from test 1 to test 4 (38.6%, 24.6%,
7%, and 1.8%, respectively). The mean number of
correct answers was highest in test 1
(12.2) and lowest in test 4 (4.25).
The scores therefore mostly accord with
the levels determined by the three methods.
Moreover, the results of the paired t-tests on
the students' scores (Table
3) indicate significant differences between all
pairs of tests (p ≤ .006).
Table 3. Paired samples test
| Pair | Sig. (2-tailed) |
| TEST1-TEST2 | .001 |
| TEST2-TEST3 | .001 |
| TEST3-TEST4 | .001 |
| TEST1-TEST3 | .006 |
| TEST2-TEST4 | .001 |
| TEST1-TEST4 | .001 |
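For readers unfamiliar with the paired-samples test reported in Table 3, the sketch below shows how the t statistic is obtained from two sets of scores produced by the same students. The scores are hypothetical and are not the study's data; the function name is also an assumption.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, degrees_of_freedom) for a paired-samples t-test."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long samples of size >= 2")
    # The test works on the per-student differences between the two tests.
    differences = [a - b for a, b in zip(first, second)]
    spread = stdev(differences)  # sample standard deviation (n - 1)
    if spread == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(differences) / (spread / math.sqrt(len(differences)))
    return t, len(differences) - 1


# Hypothetical cloze scores for five students on an easier and a harder test.
easier = [12, 14, 10, 13, 11]
harder = [10, 11, 9, 10, 10]
t_value, df = paired_t(easier, harder)
print(round(t_value, 3), df)
```

The two-tailed significance value is then read from the t distribution with the returned degrees of freedom; a statistics package such as SciPy (`scipy.stats.ttest_rel`) reports it directly.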
To determine whether the students' scores on
the tests are correlated, a correlation
test was performed between every two tests
(Table 4).
Table 4. Correlation between the tests
| Pair | N | Correlation | Sig. |
| TEST1-TEST2 | 114 | .697 | .001 |
| TEST2-TEST3 | 114 | .399 | .001 |
| TEST3-TEST4 | 114 | .593 | .001 |
| TEST1-TEST3 | 114 | .324 | .001 |
| TEST2-TEST4 | 114 | .341 | .001 |
| TEST1-TEST4 | 114 | .254 | .006 |
As shown in the results, the correlation is higher
between tests 1 and 2 (.697) and between tests 3
and 4 (.593) than between tests 1 and 3 (.324)
and tests 1 and 4 (.254). That is, two easy or
two difficult tests correlate more highly than
an easy and a difficult one, which confirms the
orders proposed by these three methods of
assessing difficulty. Therefore, based on
these results, the scores show no significant
difference between the orders of difficulty
proposed by the three methods.
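The pairwise correlations in Table 4 are of the kind computed below. This is a minimal sketch assuming a Pearson product-moment coefficient (the source does not name the coefficient); the scores are hypothetical and the function name is an assumption.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples of size >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores of five students on two tests.
test_a = [1, 2, 3, 4, 5]
test_b = [2, 4, 5, 4, 5]
print(round(pearson_r(test_a, test_b), 3))
```

A coefficient near 1 means that students who score high on one test also tend to score high on the other, which is the pattern reported for tests of similar difficulty.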
The
results of the interview with the students
as to their impression about the difficulty
of each text are displayed in the following
Table:
Table 5. Students' impression about the difficulty of the texts
| Text | Level | Frequency | Percent |
| 1 | 1 | 87 | 76.3 |
| 1 | 2 | 23 | 20.2 |
| 1 | 3 | 4 | 3.5 |
| 2 | 1 | 44 | 38.7 |
| 2 | 2 | 47 | 41.2 |
| 2 | 3 | 16 | 14 |
| 2 | 4 | 7 | 6.1 |
| 3 | 1 | 13 | 11.4 |
| 3 | 2 | 31 | 27.2 |
| 3 | 3 | 46 | 40.4 |
| 3 | 4 | 24 | 21.0 |
| 4 | 1 | 28 | 24.6 |
| 4 | 2 | 12 | 10.5 |
| 4 | 3 | 37 | 32.4 |
| 4 | 4 | 37 | 32.5 |
As shown in the Table, most students view text 1
as the easiest (76.3%), and this figure is lower
for the other texts (text 2 = 38.6%, text 3 = 11.4%,
text 4 = 24.6%). As to the most difficult texts,
only 3.5% of the students place text 1 at the
highest level reported for it (level 3), while
the percentage placing a text at level 4
increases from text 2 to text 4 (text 2 = 6.1%,
text 3 = 21.1%, text 4 = 32.5%). Therefore,
it can be concluded that the order presented
by the methods is confirmed by the students'
impression about the difficulty of these four
texts.
Discussion
Based on the results of this study, the students'
mean scores in the easiest and the most difficult
tests agree with the levels determined by
the three methods. Various methods of assessing
the readability of the text have been proposed
so far. The most direct way is to measure
it by administering a reading comprehension
test based on related material to a group
of readers with known abilities in language.
It can also be measured by experts' judgment.
A third approach is to use readability formulas.
The first and second approaches are time-consuming
and costly. As to readability formulas, although
recent studies show that they are valid for
use in the EFL context (Greenfield, 2003),
many researchers have found that classic formulas
are not very accurate predictors of EFL difficulty
(Brown, 1998; Shokrpour & Gibbons, 1998).
Although in this study there was a relationship
between students' tested comprehension scores
and Fry's readability formula, this is because,
in making their exemplars, Chall et al. used
their own readability formula, which focuses
on the same characteristics as other such formulas.
In all, these formulas are strictly text-based
and do not address the interactive nature
of the reading process. Moreover, they do
not distinguish between written and spoken
discourse.
On the other hand, qualitative methods are more
precise and more sensitive to the great variety
of text variables that contribute to text difficulty.
Chall et al.'s method is based on total impression
rather than on the analysis of text features.
They claim that their qualitative measure
has the advantage of simplicity
and time effectiveness. Although linguistic
analysis of a text takes some time,
it is worth doing since it focuses
on the differences between spoken and written
language and their effect on the complexity
of the text. Brown (1998) came
to the same conclusion, stating that the analysis
of linguistic characteristics of the text
is highly related to EFL difficulty.
As
to the components of linguistic analysis of
the text using the systemic functional grammar
criteria, lexical density has been found to
be highly related to EFL difficulty (Brown,
1998). This was confirmed in our study since
the order of difficulty for lexical density
was exactly the same as that of the mean scores
of the students (1,2,3,4). Moreover, complex
nominal groups including those with two or
more modifying elements, with a prepositional
phrase as qualifier, an embedded clause as
modifier and a noun acting as a modifier contribute
to the packing of information while adding
to the structural complexity and carrying
the main burden of the lexical content of
the text. As to grammatical metaphor, in an
atypical realization a process may be realized
as a thing and a circumstantial meaning as a
process; these are metaphorical lexicogrammatical
forms of a semantic configuration and are
more complex than non-metaphorical ones.
In systemic functional grammar it is proposed
that spoken language is grammatically intricate
but lexically sparse. Grammatical intricacy,
the use of long sentences, is therefore lower
in written language, since written language
uses more embedding, which makes the sentences
shorter. This was confirmed at two
levels of Chall et al.'s method but not in
tests 1 and 2. However, a newer readability
measure, the Lexile Framework software,
assumes that passages consisting of
short sentences are easier to
read than passages consisting of longer sentences.
This point needs to be investigated further.
Some recent studies are in line with
systemicists in regard to complexity.
Shin (2002) reports that it is generally assumed
that abstract texts will be more difficult
to understand than texts describing real objects
since the former requires more exacting referencing
skills than the latter. This is exactly what
systemic functional grammar claims to be the
difference between spoken and written language.
Research has also shown that lexical and syntactic
knowledge in L2 are the strongest predictors
of L2 reading performance among other factors
(Bernhardt and Kamil, 1995; Cooper, 1984).
Similarly, Day (1994) considers lexical density
and background knowledge as the two most important
elements that contribute to the complexity
of the text.
Conclusion
As qualitative assessments, both our methods
focus on vocabulary difficulty and on idea density
and difficulty, although their approaches differ:
one uses a readability formula and focuses
on overall impression, and the other uses
linguistic analysis of the text. In general,
this study indicates that further studies are
required to provide us with the best EFL
readability index. In the course of this study,
a number of questions occurred to me that need
to be addressed in the future by other researchers:
References
Bernhardt, E. B. & Kamil, M. L. (1995). Interpreting relationship between L1 and L2 reading: Consolidating the linguistic threshold and the linguistic interdependence hypothesis. Applied Linguistics, 16(1), 15-34.
Brown, J. D. (1998). An EFL readability index. JALT Journal, 20(2).
Chall, J. S. (1956). A survey of users of the Dale-Chall formula. Educational Research Bulletin, 35, 197-212.
Chall, J. S. (1958). Readability: An appraisal of research and application. Columbus, OH: Ohio State University Press.
Chall, J. S. & Conrad, S. (1991). Should textbooks challenge students: A case for easy or hard textbooks. New York: Teachers College Press.
Chall, J. S. & Dale, E. (1995). Readability revisited: The new Dale-Chall readability formula. Cambridge, MA: Harvard University Press.
Chall, J. S., Bissex, G. L., Conrad, S. S., & Harris-Sharples, S. (1996). Qualitative assessment of text difficulty: A practical guideline for teachers and writers. Cambridge: Brookline Books.
Cooper, M. (1984). Linguistic competence of practiced and unpracticed nonnative readers of English. In J. C. Alderson and A. H. Urquhart (Eds.), Reading in a foreign language (pp. 122-35). New York: Longman.
Dale, E. & Chall, J. S. (1948). A formula for predicting readability and instructions. Educational Research Bulletin, 27, 11-20, 28.
Day, R. R. (1994). Selecting a passage for the EFL reading class. Forum, 32(1), p. 20.
Fry, E. (1977). Fry's readability graph: classification, validity and extension to level 17. Journal of Reading, 21(3), 242-52.
Gray, W. S. & Leary, B. E. (1935). What makes a text readable. Chicago: University of Chicago Press.
Greenfield, J. (2003). The Miyazaki EFL readability index. Comparative Culture, 9, 41-49.
Harris-Sharples, S. (1983). A study of the "match" between student reading ability and textbook difficulty during classroom instruction. Unpublished doctoral dissertation, Harvard Graduate School of Education, Cambridge, MA.
Klare, G. (1963). The measurement of readability. Ames, IA: Iowa State University Press.
Klare, G. (1984). Readability. In P. D. Pearson (Ed.), Handbook of reading research (pp. 681-744). New York: Longman.
Martin, J. R. (1992). English text, system and structure. Philadelphia, Amsterdam: John Benjamins.
Porter, D. & Popp, H. (1975). Measuring the readability of children's trade books. Report to Ford Foundation.
Reading Recovery Program. (1990). Reading recovery booklist. Columbus, OH: Ohio State University.
Shin, S. (2002). Effects of subskills and text types on Korean EFL reading scores. Second Language Studies, 20(2), 107-130.
Shokrpour, N. & Gibbons, J. (1998). Register complexity and the inadequacy of readability formulas as a measure of difficulty. IJOAL, 24(2), 19-36.
Shokrpour, N. (2004). Systemic functional grammar as a basis for assessing text difficulty. IJOAL, 30(2), 5-26.
Thorndike, E. L. (1912). Handwriting. NY: Teachers College (originally published in Teachers College Record, 11, March 1910).
Weaver, B. (1992). Defining literacy levels. Charlottesville, NY: Story House Corporation.