Evaluation Methodology

The American Evaluation Association (AEA, 2009) defines evaluation as “assessing the strengths and weaknesses of programs, policies, personnel, products, and organizations to improve their effectiveness.” Because evaluation is a systematic process, it requires definite models on which to base its design. In general, models help in designing an evaluation; yet they are more than tools or templates, because they rest on concepts and beliefs about the nature and function of evaluation and extend into approaches and methodologies (Smith, 1997).

Models have been categorized by the object of evaluation and by method, as well as by the other factors that influence an evaluation and its outcomes. The position of the evaluator with reference to the organization (external or internal), the extent of stakeholder participation, the purpose of the evaluation, and the nature of the object evaluated are among the criteria that weigh heavily in the choice of model (Davidson, 2005). For example, when accountability and the pressure to document and provide evidence drive the evaluation, an “independent evaluation” is possible; a more participatory model is likely when the focus is on developing a “learning organization” (Davidson, 2005). Razik and Swanson (2001) group models under five headings: goal attainment, judgmental, decision facilitation, naturalistic, and self-evaluation models. Models suited to performance appraisal, such as teacher evaluation, illustrate purpose-based classification; Kein and Aikins’s model, for instance, takes an objectives-based approach to teacher evaluation (Razik and Swanson, 2001). Another popular purpose-based classification of key models has been given by Preskill and Russ-Eft (2005). Evaluation models are also categorized by the methods of data collection they employ: qualitative, quantitative, or mixed (Guskey, 2000). Quantitative approaches employ statistical analysis tools and are more technical; their data collection instruments yield tangible data for quantitative analysis. Qualitative approaches, by contrast, employ tools such as focus groups and interviews, which provide intangible data for qualitative analysis. There are individual methodologies based on specific approaches, and over time integrated methodologies have developed that combine more than one methodology to give a multidimensional view of evaluation (Smith, 1997).

Because there are broad divisions within evaluation (qualitative and quantitative, traditional and naturalistic), evaluators continue to disagree about the best model or methodology for program evaluation. The suitability of a methodology depends in turn on the nature of the program, whether educational, developmental, training, or otherwise. Posavac and Carey (2007) hold that the significance of any evaluation model lies in its usefulness for addressing the specific evaluation questions of a given program evaluation. The choice of model therefore comes only after a thorough analysis of the evaluation and stakeholder requirements and a decision on the evaluation questions.
Moreover, it is the evaluation goals that decide the “relative emphasis” on the quantitative and qualitative methods used in a program evaluation. Light and Pillemer (1984), cited in Posavac and Carey (2007, p. 166), say, “The pursuit of good science should transcend personal preferences for numbers or narrative.” Evaluation of a political campaign, for example, might primarily need a qualitative method supplemented with a quantitative one, whereas evaluation of a medical experiment is possible only with numbers, in other words quantitative methods, yet it still needs a qualitative approach to validate the results (Mcsweeny and Creer, 1995, cited in Posavac and Carey, 2007). Clearly, qualitative and quantitative methods are best used in tandem to arrive at the best results. Various methodologies, and designs based on them, are available in evaluation; the approach each takes helps evaluators select a suitable method and design their evaluation.

Traditional evaluation methodologies were “impressionistic” and “informal,” and the evaluators were mostly the people in charge of the program; in clinical evaluation, for example, the physicians themselves made the evaluations. Though the motive is the same as in modern evaluation, traditional methodologies lack “disciplined analysis,” and “cognitive biases may have influenced even the well-intentioned evaluators” (Posavac and Carey, 2007, p. 24).

The social science research model addresses the shortcomings of the traditional approach by making evaluations “rigorous” and free of bias. It brings evaluation close to research in many ways, making it more systematic, inquiry based, and objective. However, it has its limitations in applied settings, as pointed out by Boruch (1997) and Lipsey (1990), cited in Posavac and Carey (2007).

Objectives-based evaluation is one of the most used methodologies; it judges effectiveness by a program’s design and its ability to achieve its objectives. Its major drawback is that evaluators tend to fixate on the stated goals and fail to look for the reasons behind any failure or problem, or for unintended consequences.

Goal-free evaluation stands opposed to the objectives-based model: Scriven’s goal-free evaluation holds that “the goals of any program or activity should not be taken as given” (Guskey, 2000). It fills the gap left by Tyler’s model by addressing the unexpected outcomes of a program without neglecting the predetermined goals, so its findings can list the outcomes actually achieved, intended and unintended alike. The underlying belief is that, absent stated goals, evaluators will not narrow their focus but will try to “assess the total impact” and record “all the positive and negative impacts of the program” (Posavac and Carey, 2007, p. 25). Difficulty in finding tools to measure unintended outcomes is a major drawback, and the approach can be time-consuming, expensive, and open-ended.

Fiscal evaluation, as laid out by Levin and McEwan (2001), cited in Posavac and Carey (2007), focuses on financial investments and on calculating the return on investment (ROI). It interests the corporate and manufacturing sectors, as well as service-sector programs, where investors want to see value for the money they have invested, expressed as effectiveness and profit. The final judgment, however, is based not on quality but on quantity, in other words, on financial considerations.
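To make the fiscal model’s arithmetic concrete, here is a minimal Python sketch of the ROI and benefit-cost calculations it turns on. The program and all figures are invented for illustration; a real cost analysis of the kind Levin and McEwan describe involves far more careful accounting of costs and monetized benefits.

```python
# Hypothetical fiscal evaluation of a training program.
# All figures are invented for illustration.

def roi(total_benefit: float, total_cost: float) -> float:
    """Return on investment: net gain per dollar invested."""
    return (total_benefit - total_cost) / total_cost

def benefit_cost_ratio(total_benefit: float, total_cost: float) -> float:
    """Dollars returned per dollar spent."""
    return total_benefit / total_cost

costs = 120_000.0      # staff time, materials, facilities
benefits = 150_000.0   # e.g., productivity gains attributed to the program

print(f"ROI: {roi(benefits, costs):.1%}")                                # 25.0%
print(f"Benefit-cost ratio: {benefit_cost_ratio(benefits, costs):.2f}")  # 1.25
```

As the surrounding discussion notes, the hard part is not this arithmetic but deciding, together with the program authorities, which costs and which monetized benefits belong in the figures at all.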
The accountability model, otherwise known as the audit model, is an extension of the fiscal model. It highlights the responsibility attached to specific roles that are accountable for the funds invested in projects or programs. The model depends on regulations and is useful only in specific cases.

The naturalistic or qualitative model is used when evaluators need a thorough understanding of a program that uncertain quasi-experimental designs cannot supply. Though the reports tend to be lengthy, the information helps in understanding the quantitative data gathered.

Single-group, nonexperimental outcome evaluation is an outcome-based approach that can be used in the initial stages, when evaluators or stakeholders want to know whether a program really needs an evaluation. It helps identify whether participants have improved at all, so that a more complex evaluation design can then measure the improvement. Within this framework there are designs such as the single-group design and the descriptive design. They are nonexperimental in that they do not place participants under the controlled observation of experimental or quasi-experimental designs. The design allows the evaluator to test participants either both before and after participation in the program or only after participation; the purpose is to obtain data showing that participants have “changed in the direction that the program was planned to encourage” (Posavac and Carey, 2007, p. 156). It takes two different approaches: one gathers outcomes that are meaningful in themselves, while the other measures “outcomes using proxy variables” (Posavac and Carey, 2007, p. 158).

This design strives to identify outcomes and measure them, but it cannot explain the reasons for the changes well; it simply assumes that all changes were caused by the program. It fails to consider that changes in participants can occur anyway, owing to external and internal factors, the so-called “threats to internal validity.” These threats, a major disadvantage of the design, reduce the validity and reliability of the evaluation results. The changes observed depend on the kind of people tested: their profiles and their knowledge and skill levels before the program matter. Likewise, low scores on a post-test may reflect not an inefficient program but health or other personal problems. In addition, the duration of the evaluation determines how many of the same participants take both the pre-test and the post-test; attrition rates are likely to rise if the evaluation takes a long time. There can also be errors in how the evaluation tools and methods are employed; the standards used to score the pre- and post-assessments, for example, have a large bearing on the measured change. Many argue that better results are possible only if participants are tested at both times, and some also argue that the pre-test should be scored only after the post-test has been taken.
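As an illustration of this pretest-posttest logic, the sketch below runs a paired t-test on invented scores for a single group (it assumes scipy is installed; the participant data are fabricated for the example). It shows the kind of change the design can detect, and why a detected change is not by itself proof of a program effect.

```python
# Single-group pretest-posttest design: did participants change in the
# direction the program intended? Scores below are invented.
from scipy import stats

pre  = [52, 48, 61, 55, 47, 58, 50, 63, 45, 57]
post = [58, 50, 66, 60, 46, 64, 55, 70, 50, 61]

# Paired t-test on the same participants' pre/post scores.
result = stats.ttest_rel(post, pre)
mean_change = sum(b - a for a, b in zip(pre, post)) / len(pre)

print(f"mean change: {mean_change:+.1f} points")
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.4f}")

# A significant positive change is consistent with a program effect, but
# maturation, practice effects, attrition, and other threats to internal
# validity can produce the same pattern without any program at all.
```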
If evaluators test a single group, they may find a reasonable change, but one likely influenced by various factors. If instead they employ more than one group, or test several groups repeatedly over a period, errors can be minimized; this is the strategy adopted in quasi-experimental designs. Quasi-experimental designs aim to show the connection between causes and effects in outcome evaluation. They test “causal hypotheses” (Posavac and Carey, 2007, p. 175) by observing participants, as well as additional natural groups, over a period before and after the program, and by using a range of variables, some likely to be affected by the program and some likely to remain unaffected. This yields interpretable data in a way that single-group nonexperimental designs do not. Several designs fall under this banner: the time-series design, nonequivalent control group designs, the regression-discontinuity design, and a combination of the first two.

The time-series design gathers information at many points over a certain period to add validity to the interpretation of results; economists use it to interpret trends, while social researchers apply it to individuals. Collecting data over a period reduces the “threats to internal validity” to some extent (Posavac and Carey, 2007, p. 180). This is otherwise called an “interrupted time series” (Posavac and Carey, 2007, p. 180); a brief sketch of this design follows at the end of this section. There is a danger in simply comparing mean values before and after the program’s intervention to arrive at results, which amounts to falling back into a single-group nonexperimental design.

The nonequivalent control group design, otherwise called the comparison group design, is proposed to address the threat to validity that comes from observing a single group of participants: it recommends observing a comparison group to reduce the error. But both groups are exposed to more than one influence, which again reduces the accuracy of the results, and no group can be considered perfectly parallel or similar to the participant group. For example, a teacher evaluating a math course for her second-grade class may choose another second-grade group from the same school or a different school for comparison, but many variables decide the change, and the groups cannot be equivalent; this model is therefore liable to regression effects (Posavac and Carey, 2007). The regression-discontinuity design is more effective at comparing “non-equivalent groups” in evaluation. The success of a combined design lies largely in repeating the tests or observations several times over a period with both the participants and the comparison group. The selective control design is most helpful when “appropriate non-equivalent groups are available” (Posavac and Carey, 2007, p. 194); the rate of change is calculated after recording change over the period, which increases the validity of results, since they rest on the mean of the different rates of change over the period. All the quasi-experimental designs attempt to overcome the internal threats to the validity of results, but identifying the threats is a huge task in itself, and there is no consensus on the statistical tools to be used for analysis.
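The interrupted time-series idea mentioned above can be illustrated with a simple segmented regression, sketched here on simulated data (the figures, the 24-month window, and the month-12 start date are all invented). It estimates a level change and a slope change at the intervention rather than merely comparing before-and-after means; real analyses typically also model autocorrelation in the series.

```python
# Interrupted time series via segmented ordinary least squares.
# Data are simulated: a baseline trend plus a level shift at month 12.
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(24)                     # 24 monthly observations
start = 12                            # program begins at month 12
after = (t >= start).astype(float)

y = 40 + 0.5 * t + 6.0 * after + rng.normal(0, 1.5, size=t.size)

# Columns: intercept, baseline trend, level change, post-start slope change.
X = np.column_stack([np.ones(t.size), t, after, after * (t - start)])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

print(f"level change at intervention: {coef[2]:.2f}")   # true value: 6.0
print(f"slope change after start:     {coef[3]:.2f}")   # true value: 0.0
```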
Experimental designs are based on experiments that yield observable data for evaluation. This design is more effective than the earlier ones because it is not affected by the threats they face; experiments serve as tools in evaluation much as qualitative measurement tools serve qualitative methods. Experimental designs are close to the approach of using pre- and post-tests to present findings to stakeholders. Though experiments help in observing people over a period and in obtaining concrete evidence of change, there are many objections to their use in evaluation: they are often thought to affect participants’ health or psychological state. The design can prove helpful for evaluating a new program, when documentation is of dire need, or when all stakeholders believe the current program is ineffective but are unsure what changes to make (Posavac and Carey, 2007). A branch of outcome evaluation methodology, it requires good planning and structuring, as well as well-coordinated work with stakeholders to minimize attrition and put them at ease. The experiments themselves must be perfected to yield statistically analyzable data, and analyzing and interpreting the data so as to give valid conclusions is a further challenge.

Cost analysis, a form of the fiscal analysis model in evaluation, deals with quantitative data to show the efficiency of a program in terms of ROI. Interpreting the data, however, requires a good understanding of the program policies and working in tandem with the authorities.

Qualitative evaluation methods have a place even in programs whose identifiable objectives can be measured quantitatively: they work alongside quantitative data collection to ensure valid results. They add value to the quantitative data and explain the ‘why’ and ‘how’ factors that are essential to evaluation findings. Participant observation, interviews, and focus groups are examples of qualitative data collection; open-ended questions draw out participants’ responses and reactions about their experience and the other valuable factors that contribute to program effectiveness, and they may at times throw light on unintended outcomes and the reasons behind them. Before engaging in these, however, the qualitative method starts with close observation of the program, its content, and its setting. Sampling of respondents for focus groups or surveys may be random or purposive; the evaluation questions decide the kind of sampling. The main drawback is that the interpretations may be subjective, so qualitative methods are usually used along with quantitative methods to arrive at reliable conclusions.

The various methods and designs available are to be used wisely by evaluators, depending on the evaluation requirements, objectives, and stakeholder requirements. Cultural factors must also be taken into account before choosing any evaluation tool, experiments for instance. The analysis shows that a mixed approach can work better, ensuring a higher level of quality in the evaluation results and minimizing errors to a great extent.
References

American Evaluation Association. (2009). About us. Retrieved October 2, 2009, from http://www.eval.org/aboutus/organization/aboutus.asp

Davidson, E. J. (2005). Evaluation methodology basics: The nuts and bolts of sound evaluation. Thousand Oaks, California: Sage.

Guskey, T. R. (2000). Evaluating professional development. Thousand Oaks, California: Sage.

Posavac, E. J., & Carey, R. G. (2007). Program evaluation: Methods and case studies (7th ed.). NJ: Prentice Hall.

Razik, T. A., & Swanson, A. D. (2001). Fundamental concepts of educational leadership (2nd ed.). Upper Saddle River, New Jersey: Prentice-Hall.

Smith, N. L. (1997). Evaluation models and approaches. In J. P. Keeves (Ed.), Educational research, methodology, and measurement: An international handbook (2nd ed., pp. 217-224). Cambridge, UK: Pergamon.