Project on Test Design
I. Introduction
Test design is a complex process that involves the simultaneous development of the test's purpose, structure, candidate profile and language, among other features. Proficiency tests are particularly complex to design because their purpose is to measure the overall language ability of the candidates for whom they are intended, and they must therefore sample the full range of the candidates' language abilities. The following assessment examines an international proficiency test designed to elicit a range of language skills, including listening, reading, speaking and writing. The points covered concentrate on the particular questions and texts chosen and the reasons behind their design.
II. Designing an international proficiency test
2.1. Literature review
The listening skills covered in the proficiency test designed for this paper (Appendix I) include listening for specific information and using and manipulating the information to which the candidates have been exposed (Lewkowicz, 1991). The reading skills assessed, however, require a more complex assessment strategy because reading is a 'receptive' skill and therefore more difficult to elicit (Hughes, 2003). The influence of textual features and structures on the candidate's comprehension, and the issues this creates for test design, is also discussed (Liu, 2011). For the assessment of speaking, the issues of eliciting speaking abilities through controlled short questions (Hughes, 2003) are discussed, along with the design of the longer free-speech topics, which raises issues such as the comparability and cultural sensitivity of topics (Lee & VanPatten, 2003). The design of the writing section covers issues such as designing a writing test that assesses the candidate's writing ability rather than knowledge of the topic, creating writing tasks that are inclusive of different cultural writing constructs (Uysal, 2010; Hall, 2010), and eliciting a representative sample of a student's writing ability (Hughes, 2003).
2.2. Test design
2.2.1. Background to the test designed
The test designed for this paper is an international proficiency test. Its purpose is to assess the candidates' general listening, reading, speaking and writing skills. The test is therefore not designed for a specific group level or a specific course, and care has been taken to design it so that candidates are assessed on nothing other than their ability in the target language, which is English.
2.2.2. Listening test
Listening is a skill that requires listeners to attend to a number of language features, including grammar, syntax, morphemes, phonemes and lexis, as well as the spoken language in its context. All these features compete for the listener's attention, which makes it hard for listeners to concentrate on every feature all the time. For this reason listening comprehension questions need to be specific. The first listening question (Appendix I, Q1) was designed with the aim of eliciting specific information from the listening text. For this purpose the items were designed as multiple choice items, which require the candidate to listen for specific information and choose only one right answer. The first item requires the listener to listen for the cues in the recording and identify the name of the next tour from among the four options. Multiple choice questions are a good way to elicit specific information in a listening test because they require only a short response time and can therefore be numerous. On the other hand, multiple choice questions are also limited in that we cannot be sure how many of the correct responses were chosen by chance (Hughes, 2003). For this reason the multiple choice questions were designed with four options in order to minimize the likelihood of guessing correctly by chance. The distractors also cannot be chosen at random, because they need to be relevant to the question. The first multiple choice question (Appendix I, Q1, Item 1) asks for a specific name, so besides the right answer the distractors also need to be specific names taken from the text, in order to test whether the candidate has truly understood it. Other distractors might be names or words from the text that sound similar to the right answer, such as the options 'Back to Bass' and 'Back to Basics'. The second item (Appendix I, Q1, Item 2), besides eliciting specific information, also requires higher order thinking because it requires the responder not only to identify the correct answer but also to identify the context.
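To make the point about chance more concrete, a brief illustrative calculation may help (assuming, purely for illustration, that each item offers k options of which exactly one is correct; the item count of ten used below is hypothetical and not taken from the appendix):

\[
P(\text{correct by guessing}) = \frac{1}{k}, \qquad E[\text{chance score on } n \text{ items}] = \frac{n}{k}.
\]

With four options (k = 4), a candidate guessing blindly across n = 10 items would be expected to answer only 10/4 = 2.5 items correctly by chance, whereas with three options the expected chance score rises to roughly 3.3. This is why offering more plausible options reduces, though never eliminates, the contribution of guessing to the final score.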
In addition to testing for specific information, a listening test can also test whether participants are able to manipulate and use the information they have heard. In real life, the listener often not only has to identify the information they have listened to, but must also be able to 'do something with what has been heard', be it responding to the speaker or noting what they have heard (Lewkowicz, 1991, p. 26). For this reason, it makes sense that listening comprehension questions should also require the listener to use the information they have heard. The second technique used for the listening section (Appendix I, Q2) is a cloze passage exercise that is not an exact word-for-word replica of what the candidates hear, but a summary of the recording written in the third person. This technique has been used so that the responders are forced to use the information they have heard in a different context, rather than following a written replica of the listening text that only requires them to fill in the missing words. The second listening technique therefore demands a more difficult skill and will be used to distinguish the weaker candidates, who can only listen for specific information, from the stronger candidates, who will be able not only to listen for specific information but also to use it productively.
2.2.3. Reading test
While reading is often classified as a less problematic skill to acquire, it is still hard to test because it is a receptive skill. When testing reading we cannot measure exactly what the reader has perceived, because the exercise of receptive skills does not manifest itself in 'overt behaviour' (Hughes, 2003). This means that what the reader produces in response to a question might differ from what they perceived while reading the text, because no question can elicit all the information interpreted by the reader. For this reason, it is important that the test contains a variety of questions designed to measure a number of reading abilities. To this end, the questions for the reading texts in the proficiency test include short answer questions (Appendix II, Q2), a technique that can test multiple abilities. Question 2 in the reading section is used to test the students' reading comprehension with questions such as 'why' and 'which'. It is also designed to test the respondents' ability to identify specific information from the text, with items such as Item 2 and Item 5 (Appendix II, Q2) asking the respondents to find and list specific information from the text.
The text itself can also affect the students' reading and their responses, depending on its topic, content, text type, genre, linguistic variables and typographical features, among other factors (Liu, 2011). This means that not only does the students' interpretation of the text vary, but the text itself has inherent variables that influence its comprehensibility.
It is therefore desirable that, besides comprehending the content, the test responders are also able to identify the specific features of a text. The first question in the reading section (Appendix II, Q1) has been designed to test the respondents' ability to identify text structure and the function of individual paragraphs by asking the respondents to complete sentences by matching the possible headings to the correct paragraph number. At the same time, the question also tests the respondents' ability to identify the 'gist', or general meaning, of the individual paragraphs, because in order to match the headings to the right paragraph the respondent must first understand the general idea of each paragraph as well as its purpose within the text. The abilities tested in Question 1 require a complex process to identify the correct paragraph but only a short response, so the questions have been designed using a fill-in-the-blank technique in which the respondents complete the sentences with the correct paragraph number. This way the respondents can concentrate on comprehending and identifying the paragraphs rather than worrying about providing extensive information. Questions such as these, which require complex abilities but are measured by short responses, are more reliable for scoring purposes than questions that require an extensive response.
2.2.4. Speaking test
The first part of the speaking test involves the candidate answering general questions about themselves, which might include their interests and hobbies, their education and their future ambitions. This part has been designed with the purpose of eliciting specific language features connected to the topic. Oral tasks often require the candidate to produce free-flowing language; however, this is not always desirable because the candidate might not produce a representative sample of the language features. Hughes (2003) states that unless the speaking tasks are restrictive 'it is not possible to predict all the operations which will be performed' (p. 117), meaning that free-flowing speech is unpredictable and might not produce the oral skills which need to be tested. The short questions in the speaking test (Appendix III, Q1) are designed to elicit different language functions. The first item in Q1 requires the candidate to talk about their interests, which will prompt them to use functions such as expressing preferences, expressing likes and dislikes, justifying opinions, making comparisons and describing. The second item (Appendix III, Q1, Item 2) requires the candidate to talk about their future education and/or work plans, which will prompt them to use functions such as explaining, speculating, expressing preferences, describing and drawing conclusions. The second item also has a follow-up question asking the candidate to explain the reasons for their previous choices if they have not already done so, which will prompt them to use functions such as describing, explaining, justifying and elaborating on an idea if they did not provide enough representative samples of these functions in their first answer. These short questions have been designed to control the language that the candidate will produce, so that the examiners obtain a representative sample of a variety of oral tasks (Wigglesworth, 2000). Pre-set short questions such as these allow the examiners to elicit the right oral skills from the candidate, who might not produce all the desired oral skills if they were not guided by specific questions.
The second part of the speaking test requires the respondents to talk for five minutes on a topic randomly selected by the examiners from a pool of possible topics. Ideally, the topics available for selection should not be based on cultural, educational or any other specific knowledge that might disadvantage particular candidates. For this reason the pool of topics for the speaking test needs to be carefully chosen to include a variety of globally acceptable topics and language features inclusive of diverse English varieties (Whong, 2011, p. 174). The topics used should also be comparable, in the sense that the scores obtained from one topic should be similar to those obtained from all the other topics (Lee & VanPatten, 2003, p. 99). The second speaking component (Appendix III, Q2, Items 1 and 2) has been designed to test candidates' ability to express their opinions on a given topic. The topics of the environment (Appendix III, Q2, Item 1) and technological advances (Appendix III, Q2, Item 2) are both relevant throughout the world and do not disadvantage particular candidates. The higher order oral skills required by this task, however, are designed to distinguish between weak and strong candidates. While in the first speaking section the candidates were prompted to explain and justify their preferences, this longer speaking section requires more difficult language functions. Giving an opinion on a topic such as the environment or technological advances requires the candidates to resort to higher order speaking functions, including elaborating on ideas, expressing and justifying opinions, analysing, drawing conclusions, making comments, indicating attitude and developing an argument. The ability to use these language functions will distinguish the stronger candidates from the weaker ones.
2.2.5. Writing test
The first writing question is a short response which requires the candidates to analyse a graph by describing and comparing the main features they see (Appendix IV, Q1). This question was designed to elicit specific language features from the candidates, including expressing observations and opinions, making comments, and reporting descriptions and conclusions. The purpose of the question is to ease the candidates into the writing tasks. For this reason the word length was kept at 150 words, and the language functions remain relatively easy, requiring only observations and comparisons that can be drawn by looking at the given graph (Appendix IV, Q1). The graph presented has been chosen for its universal content and easy interpretability; all candidates are given the same question and the same graph to analyse and are not required to include any outside information. This ensures that all the candidates are assessed only on their writing ability and not on their knowledge of a particular field or topic (O'Loughlin & Wigglesworth, 2007).
As an international proficiency test, the writing tasks should be designed to cater to all possible candidate profiles without disadvantaging any candidates on the basis of their origin or culture (Uysal, 2010; Hall, 2010). This reasoning stems from the belief that different cultures around the world have different writing constructs, which can vary in style, use of language and directness. To create a level testing field for all the candidates, the writing topics need to be culturally sensitive in order to give every candidate an equal chance to show their writing abilities (Davies et al., 2003). The second writing task consists of a topic that has been selected to be inclusive of all possible candidate profiles. The second question (Appendix IV, Q2) asks the candidate to discuss either the advantages and disadvantages of moving to a new country or the advantages and disadvantages of their country's educational or employment system. These topics are both expected to be familiar to the candidates, because most will have considered moving to a new country to study or work, and all will have had some experience of their country's education or employment system; the test takers are therefore not disadvantaged on the basis of particular cultural knowledge.
The long response writing items (Appendix IV, Q2) have also been designed to elicit specific higher order language features from the candidates. The questions ask the candidates to 'discuss the advantages and disadvantages', which requires the use of more difficult language features than the first writing question. The two writing questions have been designed to elicit different language features in order to create variability and include a selection of representative tasks, because the more tasks there are, the more representative the test will be of the candidates' writing ability (Hughes, 2003, p. 86). These more demanding language features will therefore distinguish the weaker candidates, who can only make observations and comparisons, from the stronger candidates, who can discuss the topic in further detail and provide justifications for their observations.
III. Conclusion
The design of this international proficiency test has highlighted the complexity of the issues that need to be taken into account when undertaking such a difficult task. The design of the listening test highlighted the need to assess specific skills as well as the manipulation of the information comprehended. The design of the reading test highlighted the complexities of assessing a receptive skill and the effect of textual features. Designing a speaking test brought forward issues such as eliciting a representative sample of language functions through controlled questions, along with the comparability and cultural sensitivity of the speaking topics. Finally, the writing section highlighted the issue of designing writing tasks that assess the candidate's use of language rather than their knowledge of the topic, as well as eliciting a representative sample of a student's writing abilities. Together, these issues provide a glimpse into the difficult process of designing an international proficiency test, as well as the rewards it offers in providing the fairest possible testing field for candidates with such varied and diverse profiles.
IV. References
Davies, A., Hamp-Lyons, L. & Kemp, C. (2003). Whose norms? International proficiency tests in English. World Englishes, 22(4), 571-584.
Hall, G. (2010). International English language testing: A critical response. ELT Journal, 64(3), 321-328.
Hughes, A. (2003). Testing for language teachers (2nd ed.). Cambridge: Cambridge University Press.
Lee, J. & VanPatten, B. (2003). Making communicative language teaching happen (2nd ed.). New York: McGraw-Hill.
Lewkowicz, J. (1991). Testing listening comprehension: A new approach? Hong Kong Papers in Linguistics and Language Teaching, 14. Accessed 10 October 2011 from http://sunzi.lib.hku.hk/hkjo/article.jsp?book=4&issue=40003
Liu, F. (2011). A short analysis of the text variables affecting reading and testing reading. Studies in Literature and Language, 2(2), 44-49.
O’Loughlin, K. & Wigglesworth, G. (2007). Investigating task design in academic writing prompts. In Milanovic, M. and Weir, C., Language Testing 19: IELTS Collected Papers. Cambridge: Cambridge University Press.
Uysal, H. H. (2010). A critical review of the IELTS writing test. ELT Journal, 64(3), 314-320.
Whong, M. (2011). Language teaching: Linguistic theory in practice. Edinburgh: Edinburgh University Press.
Wigglesworth, G. (2000). Issues in the development of oral tasks for competency-based assessments of second language performance. In G. Brindley (Ed.), Studies in immigrant English language assessment, Volume 1. Sydney: National Centre for English Language Teaching and Research.