Features of Testing in Language Learning

Assessment of what has been learned is one of the most significant considerations in learning. Only comprehensive testing can show the outcome of everything learned throughout a course. This is particularly important in language learning, where testing is the only means of checking both the learner's competence and the learner's actual performance. The necessity of language testing can therefore hardly be questioned. The more significant question is how effective language testing is. What are the criteria for testing a learner's language skills, and what makes a test effective? How are reliability and validity related, and how important is each? Is it possible to design test tasks that are both authentic and reliable? These are the questions that particularly interest us. A statement by Bachman and Palmer deserves attention in this respect: “Language testers have been told that the qualities of reliability and validity are essentially in conflict, or that it is not possible to design test tasks that are authentic and at the same time reliable.” (Bachman & Palmer, 1996, p. 18). The four qualities of useful language assessment include reliability, which is the consistency of measurement; construct validity, which is the meaningfulness of the interpretations that we make on the basis of assessment scores; authenticity, which is the degree of correspondence between the characteristics of a given assessment task and the characteristics of a relevant non-assessment language use task; and internal impact (Four Qualities of Useful Language Assessments). In this essay we discuss the statement by Bachman and Palmer, analyzing the various threats to validity and reliability and some remedial measures.
“In the past twenty years, language testing research and practice have witnessed the refinement of a rich variety of approaches and tools for research and development, along with a broadening of philosophical perspectives and the kinds of research questions that are being investigated.” (Bachman). When we say that an effective test is practical, we mean that it is not excessively expensive, that it stays within appropriate time limits, that it is relatively easy to administer, and that its scoring system is explicit and efficient. As Bachman and Palmer argue, the usefulness of a particular test is the sum of its reliability, construct validity, authenticity, interactiveness, impact and practicality (Bachman & Palmer, 1996, p. 18). “A reliable test is consistent and dependable.” (Brown, 2004, p. 20). In the view of Bachman and Palmer, reliability is a function of the consistency of test scores from one set of tests to another. Brown identifies validity as the most complex and most important criterion for an effective test. Validity concerns the relation between the conclusions drawn from a test and the purpose of the assessment, judged by how appropriate, meaningful and useful those conclusions are. As Brown notes, validity is supported by several different kinds of evidence, since there is no absolute measure of it. It is now time to consider the actual meaning of the statement by Bachman and Palmer quoted at the opening of this discussion: “Language testers have been told that the qualities of reliability and validity are essentially in conflict, or that it is not possible to design test tasks that are authentic and at the same time reliable.” (Bachman & Palmer, 1996, p. 18). What was the purpose of these language experts in making this statement?
Did they mean that reliability and validity really are in conflict, or that it really is impossible to design test tasks that are authentic and at the same time reliable? In fact, they were not primarily interested in the tension among the different qualities. Rather, they wanted testers to understand the complementarity that exists among them. Only by understanding the qualities in combination can a tester strike an appropriate balance among reliability, validity and the other qualities of a test. “The Bachman and Palmer (1996) framework of test usefulness can be relevant in helping teachers decide which type of test to use. This framework proposes six qualities of test usefulness: Reliability, Construct Validity, Authenticity, Interactiveness, Impact, and Practicality. Bachman and Palmer suggest that test developers develop an appropriate balance among these qualities by setting minimum acceptable standards.” (Nakamura, 2004). Let us look more closely at the qualities of reliability and validity and the major threats to them. They are crucial qualities of a large-scale test, and many factors affect them in a language test. First of all, only a consistent and dependable test is reliable. That is, if a test is given on different occasions to the same student or to matched students, it should produce similar results if it is to be recognized as reliable. However, many conditions and factors, such as variations in the student, the scoring, the test administration, and the test itself, can influence the reliability of a test. Most significantly, learner-related issues such as sickness, exhaustion, anxiety, a "bad day", and other physical and psychological factors can affect the reliability of a test (Brown, 2004, p. 21). Rater-related issues also affect the reliability of a test.
They include human error, partiality, subjectivity and prejudice, which threaten inter-rater reliability (the consistency of scores between different raters), and the lack of definite criteria, fatigue, carelessness and bias on the part of an individual teacher, which threaten intra-rater reliability (the consistency of a single rater's scores). Another threat to reliability comes from the conditions of test administration. Poor test conditions can be a deciding factor in the reliability of a test. A further factor is the nature of the test itself. These elements, then, are the most crucial threats to the quality of reliability.
In an effort to counter these threats, let it be noted that reliability applies to the teacher as well as to the test. The issues discussed in this paper need special consideration if their effect on reliability is to be minimized. Simple but careful measures can be decisive. The threats raised by the administration and by the test itself may be countered by well-organized and error-free management of the test. If the test input is of the same standard for all students, we can expect the test to be more reliable. It is also necessary to provide physical conditions that ensure an error-free testing system; suitable physical conditions minimize the risks that affect student performance in the test. The test inputs themselves should be error free and unambiguous, although some factors, such as a student's illness or anxiety, lie outside our control. Many issues related to reliability can be solved by careful and definite means.
For example, the testing environment can be made suitable for the student so that problems arising from it are avoided. There are, on the other hand, certain conditions that we cannot change, such as illness. The best remedy for the threats to reliability is to identify the issues on which we can act in favor of the test and the student, and then to act on them. To establish the reliability of any test, examiner variability needs to be virtually eliminated by objective formats. “Variability of testing conditions is reduced by meticulous care in providing instructions to test administrator & in formulating the explanations to the candidate (if necessary give some preliminary practice with rubrics etc so first timers are not handicapped)” (Methods of Assessment). For high stability reliability, which is of primary importance in testing, the relative differences between the individuals in the group should remain the same or almost the same on each occasion. Apart from stability reliability, equivalence reliability, which means that equivalent forms give equivalent results when applied to the same subjects, is also necessary. The often disregarded rater reliability is a more difficult threat to overcome. Teachers need to take every possible step to minimize the problems related to rater reliability, and many kinds of measures may be adopted to that end. For such measures to be effective, the part played by the teacher is of great significance. Intra-rater reliability can be enhanced by following the guidelines suggested by Brown. He suggests that consistent sets of criteria for a correct response must be used.
The other recommendations include ‘Give uniform attention to those sets throughout the evaluation time,’ ‘Read through tests at least twice to check for your consistency,’ ‘If you have made “mid-stream” modifications of what you consider as correct response, go back and apply the same standards to all,’ and ‘Avoid fatigue by reading the tests in several sittings, especially if the requirement is a matter of several hours.’ (Brown, 2004, p. 21). These are very useful guidelines for enhancing the reliability of testing. Among the many sources of unreliability, the test itself is a main one, and the measures that address it are of great importance because they have wide-ranging effects on the other issues as well. “Firstly, making sure that areas to be tested are thoroughly covered ensures that, for example, a correct answer was not simply a lucky guess, or that a wrong answer does not give an unbalanced view of the candidates ability. Obviously, the longer the test is, the more reliable in this respect it is. The only limitation on just how comprehensive it is will be practicality. Generally speaking, the more important the test is, the more time it will take to complete, as the results will be more trustworthy.” (Principles of Language Testing, 2003). Students must also be familiar with the format, techniques and other features of the test. For good reliability, the test should be administered under identical and non-distracting conditions.
Having considered the issues related to reliability, we also need a clear idea of the issues connected with validity. Validity, as we understand it, is the most important quality of test use, and it concerns the extent to which meaningful inferences can be drawn from test scores (Bachman, 1990).
“In order to examine the validity of a test, it requires a validation process by which a test user presents evidence to support the inferences or decisions made on the basis of test scores.” (Crocker & Algina, 1986). There are three main types of validity: construct validity, predictive validity, and content validity. “When a test measures what it is intended to measure and nothing else, it is valid. Validity is the extent to which a test measures what it is intended to measure.” (Methods of Assessment). To quote the same source, “the validity of your test design rests on its relationship with your own goals and objectives i.e. its success in measuring the behaviors you wish your learners to develop or the skills they need to further their own objectives. There is some consensus in societies about useful skills and socially responsible behavior, though within most educational systems it is for test designers to define & / or take account of the aims and content of the program of study which is the subject of the assessment.” (Methods of Assessment). The validity of a test depends on many factors, and guidelines for improving it are therefore of great significance. The conclusions drawn from a test's results must be related to the purpose of the test: the results should support exactly the kind of conclusion the purpose calls for, and this is the most important criterion in checking validity. For a test to be valid, it must remain suitable, meaningful and effective. Issues that threaten the validity of a test can be identified, and measures need to be adopted accordingly. Let us always remember that a valid test is designed to test the progress made by the student in the areas of study. “The test would be invalid if representative samples from the whole syllabus (whether they are grammar points, vocabulary items or skills in reading, writing, listening or speaking) were not present.
It is important, in particular, not simply to test those areas which are easy to test.” (Principles of Language Testing, 2003). Validity is relative to the other qualities of testing. Thus, identifying both the reliability and the validity issues, and solving them with the right strategies, can be crucial in the testing of a language.
At the end of this discussion, the relation between reliability and validity is evident, and there is much evidence to clarify it. It is no wonder that many testers feel that the two concepts are essentially in conflict, or that it is not possible to have test tasks that are both reliable and valid. However, such a view is an extreme one, and a better understanding of the two qualities is required. “The fact that reliability is a measure of whether a measuring device measures a construct in the same way from context to context suggests that any valid measure must first be reliable, and if measures are not reliable, they obscure the construct they measure and hence, may obstruct validity. In that sense, the combined discussion of reliability and validity serves as a synthesis of those two issues in addressing any research, including language testing.” (Brown & Hudson, 2002).
In conclusion, a carefully prepared language test will reflect the different qualities of testing. These qualities are important because they give real meaning to any test that is conducted. The principles of reliability, validity and the other qualities need to be handled carefully in tests that check the student's language awareness. Making clear why a test is essential will encourage students to take part in it more fully. Reliability and validity are topics that merit close analysis.
“The reliability of a test is connected with the accuracy of measurement: a test which measures more accurately is more reliable because it has a smaller error of measurement” and a test can be “valid if it measures what it is supposed to measure, or if it does what it is supposed to do.” (Raatz et al). As for the relation between reliability and validity, we may state that the two are related in complex ways and cannot be considered in isolation from each other in testing. However, there is also much complementarity among the test qualities. There are various threats to these qualities, as we have identified at length in this discussion. This does not mean that they are without remedies. In fact, the measures put forward here have a place in any discussion of the topic.
References
Bachman, Lyle F., & Palmer, Adrian S. (1996). Language Testing in Practice. Oxford: Oxford University Press. p. 18.
Nakamura, Yuji. (2004). A comparison of holistic and analytic scoring methods in the assessment of writing. Tokyo Keizai University. Retrieved December 5, 2007, from http://jalt.org/pansig/2004/HTML/Nakamura.htm
Four Qualities of Useful Language Assessments. Retrieved December 5, 2007, from http://www.culi.chula.ac.th/dia/DIA-WEB/pp_files/Palmer%20impact.ppt#669,4
Bachman, Lyle F. Modern language testing at the turn of the century: Assuring that what we count counts. Retrieved December 5, 2007, from http://ltj.sagepub.com/cgi/content/abstract/17/1/1
Brown, H.D. (2004). Language assessment: Principles and classroom practices. New York: Pearson Education. pp. 20-21.
Methods of Assessment. English Language Learning and Teaching. Ted Power. Retrieved December 5, 2007, from http://www.btinternet.com/~ted.power/esl0736.html
Brown, James Dean, & Hudson, Thom. (2002). Criterion-referenced Language Testing. Tesl-ej. Retrieved December 5, 2007, from http://tesl-ej.org/ej26/r2.html
Raatz, Ulrich et al. Introduction to language testing and C-Tests. Retrieved December 5, 2007, from http://www.uni-duisburg.de/FB3/ANGLING/FORSCHUNG/HOWTODO.HTM
Principles of Language Testing. (2003). Edited guide entry. BBC.co.uk. Retrieved December 5, 2007, from http://www.bbc.co.uk/dna/h2g2/A1297910
Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory. Cronbach, 1971. Belmont, CA: Wadsworth Group/Thomson Learning.
