In this post I want to look at what goes into the construction of an evaluation framework. This means determining the format of a checklist, which is likely to produce a tension between breadth and depth, and between informativity and economy. Moving from a pre-use impressionistic evaluation towards a comprehensive and more informative one requires careful deliberation over the micro- and macro-characteristics that make up a criterion-referenced checklist. What is clearly vital at the start of the evaluation process is consideration of both the learners’ needs and the context of use.
The micro-considerations for this area of evaluation are the characteristics of the learners (age, level, learning style, socio-cultural background, etc.), the learners’ needs (language, skills, functions, language systems, etc.), the teachers who will use the materials (experience, confidence, methodological competence, etc.), and the programme and the institution (level within the educational system, timetable, physical environment, public or private sector). The macro-features focus on the external context: the aims of education (examination systems, curriculum content, language policy, the role of the target language within that country) and the aims of language education (national syllabus, cultural and religious considerations) (McGrath, 2002).
As mentioned in my previous post, there is still a lot of debate about the use of analysis questions in evaluations, and authors differ in their techniques and in the steps (micro) and stages (macro) involved in an evaluation. It is clear that there is a benefit in performing a pre-evaluation analysis. The idea is that this should separate the ‘wheat from the chaff’ by highlighting key objective aspects that materials must contain; if they don’t, they can be rejected. Accurate assumptions can be made about the layout, images, and the types of skills being assessed from an informal scan of materials. An analysis can be characterised as impressionistic, checklist-based, or in-depth, the idea being that these analytical descriptions are compared against the identified needs before moving on to an evaluation. The materials analyses that I encounter are impressionistic in nature and rely on flick tests. My assumption is that this is due to the economy of time and the lack of training given in this field. It does not comfort me that important decisions about materials are made in this way. This module has gone a long way towards changing my approach to, and understanding of, these decisions.
What does it take to create a checklist for an evaluation of ELT materials that is holistic and economical?
McGrath (2002) offers three approaches to checklist design. Firstly, you can borrow and adapt checklists that are available to you for your own context. Secondly, you can brainstorm and draft your own original fit-for-purpose checklist. The final option is to research the people that the materials immediately affect (teachers and learners) and find out what is important to them. From my experience, the evaluations I perform sit somewhere in between the first and last of McGrath’s options. The criteria I use (not a hard-copy checklist, but an internal one) are based on my own personal context: my students, my principles, and my observation of the characteristics of the class, e.g. level, age, culture.
McGrath (2002) suggests the following steps for designing a checklist:
- Decide general categories within which specific criteria will be organised
- Decide specific criteria within each category
- Decide ordering of general categories and specific criteria
- Decide format of prompts and responses.
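Purely as a personal aid, and not something McGrath himself provides, the four steps above could be sketched as a small data structure. The category and criterion names below are invented examples:

```python
from dataclasses import dataclass, field

@dataclass
class Criterion:
    text: str                      # the evaluation question itself (step 2)
    prompt_format: str = "yes/no"  # response format decided in step 4

@dataclass
class Category:
    name: str                      # general category decided in step 1
    criteria: list = field(default_factory=list)  # kept in order (step 3)

# Step 1: general categories; step 2: specific criteria within each
checklist = [
    Category("Design and layout", [
        Criterion("Are instructions succinct and self-standing?"),
    ]),
    Category("Language content", [
        Criterion("Is the target language contextually relevant?", "1-5 scale"),
    ]),
]

def summarise(checklist):
    """Return (category, criterion count) pairs in their decided order."""
    return [(c.name, len(c.criteria)) for c in checklist]
```

Writing the checklist down like this, rather than keeping it internal, at least forces the ordering and response-format decisions to be made explicitly.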
As a means of comparison, Tomlinson (2013) suggests brainstorming a list of universal criteria applicable to any language learning materials, and then deriving principles of language teaching and learning from classroom observation. This should provide a fundamental basis for materials evaluation. Following this, the criteria need to be sub-divided to help pinpoint specific aspects to be revised or adapted.
For example, if looking at instructions, are they:
- succinct
- sufficient
- self-standing
- standardised
- separated
- sequenced / staged
The universal criteria need to be revised and monitored to maintain consistency and validity. Questions should reflect the evaluators’ principles, but not impose a rigid methodology as a requirement of the materials; that may lead to some materials being dismissed due to pedagogical bias or assumptions. Are questions reliable, so that other evaluators would interpret them in the same way? Are the terms and concepts applicable across differing interpretations of applied linguistics? If not, it is suggested that they are avoided or glossed (Tomlinson, 2013). I can relate to this point; there have been occasions, especially early on in my career, when my knowledge of meta-language and learning theories felt insufficient to contribute to a discussion about the pedagogical merits of materials. My lack of experience meant that I did not have the depth of knowledge to interpret tasks from different contexts and needs.
Williams (1983, as cited in McGrath, 2002) makes a salient point that “a checklist is not a static phenomenon”. Every context is different, and therefore the strength of a list of criteria is only relevant to the situation in which it is to be used. ‘Off-the-shelf’ checklists are likely to need adjustments to suit different contexts. Categories and checklists are supposed to be instruments of objective analysis, evaluation and observation, but they are as much a reflection of the time at which they were conceived and of the beliefs of their author (in the same way as the materials they are being used to evaluate). Learning theories have evolved and will continue to do so, and evaluation must observe this fact too. You would not review an early mobile phone using the same criteria as a modern smartphone; it just isn’t the same device any more.
The evaluation checklist has to be relevant to the ELT context in which it is to be used. The framework for the checklist should be criterion-referenced, based upon the principles that the evaluator(s) believe are most apt. The issue of what should be included in a checklist and what is superfluous to requirements is where evaluation can become very muddied and complicated. Tomlinson (2013) makes a valid distinction between general criteria, i.e. essential features of any good teaching-learning material, and specific, context-related criteria. In other words, general criteria are essential, while specific criteria can only be determined on the basis of individual circumstances. McGrath (2002) believes that moving from general to specific criteria will lead to the identification of a set of core criteria to be used in any situation, irrespective of evaluation method. He also suggests that once the general criteria have been tentatively decided, the next step of populating the checklist with specific criteria that are comprehensive and relevant is potentially ‘messy’. McGrath advises that reference to published checklists may help in avoiding over-zealous and context-heavy criteria.
I feel there are some excellent points raised here, because it is important for my principles to run alongside what the ELT domain already regards as good practice and essential to language learning. The general criteria must present a holistic picture of ELT. The general criteria may very well remain stable, but the specific criteria will shift and adapt to the context and to research developments. A general framework will inevitably lead to a more holistic and economical method of evaluation. For a consistent and balanced framework, the general criteria must consider current learning theories based on the research findings that are most convincing and applicable in ELT.
Tomlinson (2013) suggests the following generally agreed-upon criteria:
- Deep processing: processing that is semantically focused on the meaning of the intake and its relevance to the learner.
- Affective engagement
- Mental connections: between new and familiar.
- Experiential learning: apprehension before comprehension
- Learners need to want to learn
- Multidimensional processing – sensory imaging, affective association, use of an inner voice, connecting learning experiences with emotions, attitudes, opinions and ideas.
- An informal personal voice is more likely to facilitate learning than one which is formal and distant.
- Informal discourse
- Active rather than passive
- Concreteness e.g. examples and anecdotes
- Inclusiveness
- Sharing experiences and opinions
- Occasional casual redundancies rather than always being concise
In addition to theories of ELT, there are also Second Language Acquisition (SLA) theories to consider. Adding to an already tricky task is the inconclusiveness of, and controversy within, this field. This reinforces the idea of avoiding rigidity; I must be careful here not to hold too tightly to my own principles and beliefs, and to allow other recognised pedagogical factors to influence my criteria. Tomlinson (2013) suggests some of the agreed-upon ideas are:
Materials should:
- Achieve impact
- Help learners feel at ease
- What is being taught should be perceived as relevant and useful by learners
- Facilitate learner investment
- Take into account learners’ readiness to acquire what is being taught: linguistic and developmental readiness, and psychological readiness
- Expose learners to authentic language use: semi-planned and unplanned discourse requiring a mental response
- Draw learners’ attention to the linguistic features of the input
- Provide opportunities to use target language to achieve a communicative purpose
- Take into account that the positive effects of instruction are usually delayed
- Take into account learners’ different learning styles
- Take into account differing affective attitudes
- Maximise learning potential by encouraging intellectual, aesthetic and emotional involvement, stimulating both right- and left-brain activities
- Provide opportunities for outcome feedback
The list of criteria could be infinite unless the evaluation is principled and the evaluator’s principles are overt and referenced to procedures (Tomlinson, 2013). The danger is that an ad-hoc evaluation could produce misleading results. This has definitely been the case in the past when I have been pre-evaluating coursebooks.
McGrath (2002) says that the selection of individual criteria is a matter of judgement based on the evaluator’s circumstances. Again, it comes down to context and the specific criteria being used. If I were to evaluate materials for use outside of my classroom, would those that do not meet a specific criterion be rejected, or would they be suitable for something else? The rating, weighting and scoring format is of paramount importance, because the responses to the criteria are what determine decisions about those materials. In addition, the interpretation of the data needs rationalised assessment, because high scores in one section of criteria do not automatically indicate suitable materials. What should be studied are a wide spread of desired features and the concentration of scores in those areas. Care needs to be taken so that answers do not appear non-committal; questionnaires that have several options could result in lots of ‘safe’ decisions being made. This opens up a debate about open-ended questions, as they require more investment and are likely to offer more thoughtful responses. McGrath (2002) simply states that the acid test for the clarity of criteria is to try them out.
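To make the weighting and spread points concrete, here is a rough sketch of my own (an illustration, not a procedure from McGrath): each criterion score is multiplied by a weight, and alongside the total we check what fraction of categories actually reached a desired threshold, since a high total concentrated in one section should not, on its own, indicate suitable materials.

```python
def weighted_total(scores, weights):
    """Weighted sum of criterion scores; both are dicts keyed by criterion name.
    Criteria with no explicit weight default to 1.0."""
    return sum(score * weights.get(name, 1.0) for name, score in scores.items())

def spread(category_scores, threshold):
    """Fraction of categories whose average score meets the threshold.
    Guards against a high overall total that is concentrated in one category."""
    met = [cat for cat, vals in category_scores.items()
           if sum(vals) / len(vals) >= threshold]
    return len(met) / len(category_scores)

# Hypothetical 1-5 ratings: strong on engagement, weak on relevance
ratings = {"engagement": [5, 5, 5], "relevance": [1, 2, 1]}
coverage = spread(ratings, threshold=3)  # only engagement passes -> 0.5
```

A spread below 1.0 would flag that some desired areas were not met, even when the raw total looks healthy.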
The ordering and the number of questions in each section or category should be determined on their merits alone; there should be no strict rule enforcing equality across categories or items, because in fact not every part of the evaluation is as important as another. This is where the articulated principles should play a part in weighting the different items present in the evaluation.
In order to make my checklist as reliable and valid as possible, the criterion-referenced checklist itself needs to be evaluated. Tomlinson and Masuhara (2010) advise the use of five clear questions to be asked of your evaluation criteria:
- Is each question an evaluation question?
- Does each question only ask one question?
- Is each question answerable?
- Is each question free of dogma?
- Is each question reliable in the sense that other evaluators would interpret it in the same way?
Tomlinson’s (2012) state-of-the-art article states that checklists can rarely satisfy all of these questions, which goes some way to further underlining that most evaluation checklists are not generalizable or transferable. Therefore, using other people’s published checklists does not finish the job. There is still plenty of work to be done.
This week’s section of the course has been really tough. Some of my peers did fantastic jobs on their evaluations. They cross-referenced their findings and made valid and reliable judgements for their teaching and learning contexts. That is the thing: there is so much of them in there, and that is what makes each evaluation what it is. The beliefs and principles that they set out from the start were completely different to mine from the previous week. There is no way of knowing, had we worked together, whose principles would have taken precedence. The thing that scares me is evaluating the evaluation: the checking questions needed to check the actual evaluation questions. Does it tumble out of control, so that we then need further questions to check the checking questions? If validity and reliability are to be trusted, bias needs to be limited as much as possible. I am not sure what my evaluation criteria will look like yet. Once I have created some materials, I will evaluate them accordingly.
My general criteria (based on my articulated principles) would be:
- They are contextually relevant to the learners’ needs.
- Materials should provide opportunities for communication to take place.
- Target language has a real-world use and function beyond the classroom
- Variety of approaches and types of task (this is more for a coursebook, I suspect)
- Engagement – images, video, topics.
References
McGrath, I. (2002) Materials Evaluation and Design for Language Teaching. Edinburgh: Edinburgh University Press.
Tomlinson, B. (2012) Materials development for language learning and teaching. Language Teaching, 45 (2), 143-179.
Tomlinson, B. (ed.) (2013) Developing Materials for Language Teaching. London: Bloomsbury Academic.
Tomlinson, B. & Masuhara, H. (2010) Research for Materials Development in Language Learning: Evidence for Best Practice. London: Continuum.