Materials Evaluation (Pt1)

This is a big one, an important one, and sadly, it is this area of materials design that makes my head hurt slightly. There is so much that goes into an honest and reliable evaluation. It is one of the key aspects of materials design that I identified in the first week of the course.

Over the years, in my many teaching roles, I have been involved in many informal coursebook evaluations. These generally took place during my institution’s staff meetings and training sessions: proposed coursebooks were handed out in the meeting and we were asked to give our opinions and say which book we would prefer to use in our classes. Despite our clear lack of training (and, for some of us, experience), an unsystematic few minutes’ flick-through was all that was allowed for what should be a very important decision. The chosen coursebook would go on to form part of the syllabus for those classes and would hence dictate the topics, language and delivery in them. Teachers were obliged to use it and students were prompted to buy it.

Thinking and reading about evaluation has led me to the realisation that I evaluate materials every day. The flexible way in which my skills classes are structured means that I can use a variety of resources, either published or teacher-authored. This requires daily evaluation of what learners may need, and cross-referencing those needs against the suitability of the content, pedagogy and methods of the materials. However, I don’t feel that I have previously based my decisions on anything more than instinct or experience. That is not to say that those things aren’t important and valid, but I am keen to move forward. I need to recognise my articulated principles (as discussed in my previous posts) and apply them to my evaluation criteria as a means of having a systematic approach. If I am honest with myself, I don’t always consider the whole paradigm of methods, e.g. I may opt for an affective approach over a cognitive one, or function over form. With a more rigorous examination of other evaluators’ principles, criteria and checklists, I hope to dig deeper into my understanding of developing a well-rounded approach to evaluation.

Tomlinson (2013) states that no two evaluations will ever be the same, as needs, objectives, backgrounds and preferred styles differ from context to context. No matter how structured, criterion-referenced and rigorous an evaluation is, it will essentially be subjective. A starting point has to be asking: who are the materials being evaluated for? The main point is that globally published materials cannot be evaluated in such a way, yet their effect is significant for everyone who comes into contact with them (learners, teachers, the syllabus, the evaluation itself). This echoes my sentiments about the coursebook evaluations I have experienced.

An evaluation is not an analysis, because the objectives and procedures are different (Tomlinson, 2013). Analysis aims at objectivity by asking what the materials contain and what they aim to achieve. These questions can be answered factually, with “yes” or “no” responses or a verifiable description. It should be acknowledged, though, that any questionnaire written by evaluators might still be influenced by their own ideology and experience, and accordingly be seen as biased. Evaluation questions, on the other hand, are about making judgements: answering on a sliding or Likert scale (so that scores can be totalled) a question such as “Are the listening texts likely to engage the learner?”, choosing a grade between 1 and 5 (1 = very, 5 = not at all). This is something that I struggle with myself when presented with scales. The criteria can be very specific, but an element of subjectivity can still creep into your answers. Over a larger-scale questionnaire this may have a significant impact on the outcome of the evaluation.
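To make the totalling of scale scores concrete, here is a minimal sketch in Python. The criteria and scores are invented for illustration, not taken from any published checklist; the point is simply that with the reversed scale described above (1 = very, 5 = not at all), a lower total is the more favourable judgement, which is easy to misread:

```python
# Hypothetical Likert-scale scores for three invented criteria.
# On this scale 1 = very, 5 = not at all, so a LOWER total is the
# more favourable judgement.
criteria_scores = {
    "Listening texts likely to engage the learner": 2,
    "Instructions clear for the target level": 1,
    "Tasks achievable in the time available": 4,
}

total = sum(criteria_scores.values())
best_possible = len(criteria_scores) * 1   # all criteria rated "very"
worst_possible = len(criteria_scores) * 5  # all criteria rated "not at all"

print(f"Total: {total} (range {best_possible}-{worst_possible}; lower is better)")
```

Seen this way, the direction of the scale is part of the instrument’s design, and stating it explicitly is one small way of keeping the arithmetic honest.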

The unique situation of learning and teaching means that evaluators (professional or amateur) will adhere to their own conscious or subconscious principles. This in turn will drive the criteria used to glean the appropriate information to make evaluative judgements. The more experienced a teacher, the more likely they are to bring the bias of their experience to an evaluation, thus potentially rendering it less valid. Tomlinson (2013) advocates that this possible bias should be articulated from the start in order to give the evaluation greater validity and reliability, and less scope to be misleading.

McGrath (2002) draws on Cunningsworth’s point of view that course materials are not intrinsically good or bad; rather, they are more or less effective in helping learners to reach particular goals in specific contexts. This is a very sobering thought that I must comprehend before I am drunk on evaluating power. The materials that others have designed are suitable for other contexts, maybe just not my classroom, and therefore my judgement of them needs to be appropriate. It is important for me to note that when I evaluate any materials, my teaching requirements (academic skills and IELTS examinations) may not match the materials I am presented with, and vice versa. This does not mean that those materials should be judged on that context alone; I don’t evaluate all the clothes in a department store when all I want to buy is gloves. The merits and demerits of socks are just not worth comparing in that context.

Evaluations differ in purpose, personnel, formality and timing, e.g. helping a publisher, conducting your own research, developing your own materials, or writing a review for a journal. There are three stages at which an evaluation can, or should, take place. The most common, it seems, in the institutions I have worked in is the pre-use evaluation, and from talking to colleagues that is their experience as well.

There are two dimensions to a systematic approach to materials evaluation that I must define before using them. The macro-dimension consists of a series of stages (the approach in the broad sense); the micro-dimension is what occurs within each stage (the steps or techniques employed). The pre-use, in-use and post-use evaluations are the macro-dimension, while the criteria within each of them form the micro-dimension (McGrath, 2002).

A pre-use evaluation makes predictions about potential value of materials that can be:

  • Context-free – reviewing for a journal
  • Context-influenced – reviewing draft materials for a publisher with target users in mind
  • Context-dependent – selecting for use in a particular class (Tomlinson, 2013)

They are often conducted at an impressionistic level and form part of the quick-flick culture that I have most often encountered. Tomlinson (2013) describes pre-use evaluation as “fundamentally a subjective rule of thumb activity”. However, McGrath (2002) notes that checklists and more in-depth criterion-referenced evaluations can be used. These can reduce subjectivity and offer a more principled, rigorous, systematic and reliable basis for judgements to be made.

In fact, McGrath (2002) supports a procedure in which a materials analysis is conducted first, followed by a first-glance evaluation, user feedback and evaluation using context-specific checklists. This is not the only method in use; others include:

  • Riazi (2003) surveys the teaching and learning situation, conducting a neutral analysis and then carrying out a belief-driven evaluation.
  • Rubdy (2003) uses a dynamic model of evaluation with categories for psychological validity, pedagogical validity, and process and content validity.
  • Mukundan (2006) uses a composite framework combining checklists, reflective journals and computer software to evaluate ELT textbooks in Malaysia.
  • McDonough (2013) develops criteria evaluating the suitability of materials in relation to usability, generalisability, adaptability and flexibility.

(All cited in McGrath, 2013)

The second type of evaluation is the in-use one; this offers a more objective and reliable perspective on the materials. It can make use of measurement rather than relying solely on prediction, because it reflects on the materials as they are being used and on their immediate reactions and effects. What can be measured in an in-use evaluation includes:

  • Clarity of instruction
  • Clarity of lay-out
  • Comprehensibility of texts
  • Credibility of tasks
  • Achievability of tasks
  • Achievement of performance objectives
  • Potential for localisation
  • Practicality, teachability, flexibility and appeal of materials
  • Motivating power of materials
  • Impact of materials
  • Effectiveness in facilitating short-term learning.

This is not to say that this type of evaluation is without limitations. It can make judgements about observable criteria and about the materials’ effects on short-term memory. However, it cannot claim to measure effective learning, nor what is happening in the learner’s brain, because of the delayed effect of instruction.

Post-use evaluation is probably the most valuable (yet least administered) way to make judgements on the potential affordances and pertinence of materials for your classroom. It is least administered because it is not economical or pragmatic for most institutions: it takes time and expertise to complete successfully. Administered effectively, it has the potential to note short-term effects with regard to motivation, impact and achievability. It can also examine and feed back on the long-term effects of durable learning and application. It can answer important questions such as:

  • What do learners know that they did not know before using the materials?
  • What do learners still not know despite using the materials?
  • What can learners do that they could not do before?
  • What can learners still not do despite using the materials?
  • To what extent have the materials prepared learners for examinations?

The benefit of a post-use evaluation is that it can measure actual learning outcomes in various ways: tests of what the materials have taught, tests of what the learners can do, interviews, questionnaires, criterion-referenced evaluations by the user, and so on. Such data would provide reliable and robust feedback for decisions about the use, adaptation or replacement of the materials. There is still a need for caution, because learning is not an exact science, and variables such as teacher effectiveness, parental support, language exposure outside the classroom and intrinsic motivation may affect outcomes in numerous ways.

What is apparent is that, for any evaluation to be systematic, it is preferable for a criterion-referenced checklist to be employed to aid the gathering of data. Much as in the design of materials, for an evaluation to have any coherence and validity it needs a framework of ELT principles against which it can be compared.
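As one way of picturing how such a checklist might gather data, here is a small, hypothetical sketch in Python. The criteria, weights and scores below are invented for illustration and are not taken from any of the authors cited; the weighting is one possible way of letting articulated principles shape the outcome:

```python
# Hypothetical criterion-referenced checklist. Each criterion is scored
# 1-5 (here 5 = fully met) and weighted by its importance to my context.
checklist = [
    # (criterion, weight, score)
    ("Input appears to engage the learners", 3, 4),
    ("Appropriate for learners' current level", 3, 5),
    ("Tasks prompt interaction with target language", 2, 3),
    ("Communicative tasks transfer outside the classroom", 2, 2),
]

weighted_score = sum(weight * score for _, weight, score in checklist)
maximum_score = sum(weight * 5 for _, weight, _ in checklist)
percentage = round(100 * weighted_score / maximum_score)

print(f"Weighted score: {weighted_score}/{maximum_score} ({percentage}%)")
```

The weights are where the evaluator’s principles become explicit: giving engagement and level-appropriateness more weight than transferability is itself a stated, inspectable bias rather than a hidden one, which is exactly what Tomlinson argues for.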

Based on my principles and what I have read so far, I would want to employ all three stages of evaluation of any material. This is obviously a long and time-consuming process. A pre-use evaluation is what my colleagues and I encounter most often. In line with my beliefs I would look for the following criteria in a pre-use evaluation:

  • Does the input appear to engage the learners?
  • Is it appropriate for their current level?
  • Is the input accessible and engaging?
  • Will the input and the tasks motivate/prompt students to interact with target language?
  • Do learners have opportunities to use target language in communicative tasks?
  • Are those communicative tasks useful for the language outside of the classroom?

In my next post I will look at the criteria and frameworks that can be used to evaluate materials. I will also discuss what some of my colleagues and I discussed in our seminar about materials evaluation.

References

McGrath, I. (2002) Materials evaluation and design for language teaching, Edinburgh, Edinburgh University Press.

McGrath, I. (2013) Teaching materials and the roles of EFL/ESL teachers: theory versus practice, London, Continuum.

Tomlinson, B. (2012) Materials development for language learning and teaching, Language Teaching, 45 (2), 143-179.

Tomlinson, B. (ed.) (2013) Developing materials for language teaching, London, Bloomsbury Academic.
