

Effectiveness And Difficulties Of Assessment For Learning Techniques In Science: Literature And Practice

This article investigates the pros and cons of assessment for learning (AfL) and argues that, despite the many benefits of AfL techniques, there are also many concerns and difficulties that need to be taken into account. Research evidence from the literature and reflection on practice are used to illustrate this, and some practical strategies are discussed. The discussion also addresses some aspects of AfL that can be tailored specifically to teaching science.

Date : 12/05/2019


Author Information

Uploaded by : Shahram
Uploaded on : 12/05/2019
Subject : General Studies

Throughout history, most teaching has involved students passively listening to the teacher and taking notes. New pedagogic practice challenges this approach because it undermines the role of learners as active participants in learning through assessment processes (Pierce, 2013). It is now widely accepted that assessment is an effective tool with a high impact on the quality of learning and the motivation of students, and that it can also provide a positive and supportive environment (Wanous, Procter & Murshid, 2009).

Assessment in its broadest sense can include all types of testing. In one form it can be used to make schools accountable by presenting numerical test results for pupils. In another form it can be used for certification purposes in official exams, enabling appropriate choices to be made. For example, a student can choose a suitable job or further education, an employer can select competent applicants, and education centres can enrol suitable applicants. For this reason these types of assessment should be comparable across all schools in the region or the whole country. They are normally formal tests, taken at set times and in isolation, with minimal or no influence from the students' own teachers.

Assessment for learning

Assessment can also be used during learning, both to provide information that lets students assess themselves and each other and to give teachers feedback from students, so that they can adjust their teaching level, style and activities to improve and promote students' learning. This latter purpose is the topic of this study and is called assessment for learning, or simply AfL. It can also be referred to as formative assessment if the evidence from the assessment is actually used to improve teaching and learning. Unlike the first two types, assessment for learning is informal and can be carried out at any time during the teaching and learning process by any individual teacher. It comprises several different techniques and can be used in every lesson as a teaching tool, embedded in most teaching activities and built into lesson plans to keep learning on track.

How can assessment for learning serve to improve learning, and what evidence is there in the literature? An extensive literature survey, covering many books and over 160 journals and including 580 articles or chapters across 9 years, was completed in 1997 by Black and Wiliam, and the results were published in 1998 as a booklet called Inside the Black Box. They highlighted many success stories and examples showing that innovations in formative assessment have produced considerable learning gains (Black et al., 2003).

The booklet argues that formative assessment is at the heart of effective teaching (Black and Wiliam, 1998). The authors borrowed the term Black Box from systems engineering, in which a demand is fed into the system as input and the system needs to produce outputs that meet the requirements. In education, the inputs are students, teachers and other staff, parents' expectations, test papers and so on, and the outputs could be knowledgeable students, satisfied parents and more expert teachers. The central point of their argument is that an effective policy for gaining proper results needs to focus on what happens inside the box, and that the missing element inside the system is AfL.

I think the idea of the black box comes from the fact that the authors consider summative assessment to be something outside the box, taking place once learning has happened and been finalised, whereas formative assessment is intermediate and continuous and happens inside the classroom while learning is in progress. If this is true, I challenge it, because summative assessment is still inside the box; it simply takes place at certain periods and at the end of each milestone.

The authors seem to adopt a positivist methodology in their study, since they generalise their findings as a universal truth. A close look at each success story reported in the book shows that it rests on many limitations and assumptions, and that the results are generalised from limited populations to a wider population to generate a law-like recipe for the whole education system in the UK or even beyond. For example, one case described in Black et al. (2003) is a study carried out by 25 Portuguese mathematics teachers with students in Years 8 and 9. Their results show a double gain for the experimental groups, which had daily self-assessment, compared with the control groups, which had none. However, these teachers also raised the concern that they had to teach the success criteria in addition to the lesson objectives, and they highlighted that a significant change in classroom pedagogy is necessary for formative assessment to be effective.

In spite of AfL's benefits, there are also some concerns and issues highlighted in the literature. It is not an easy task to apply AfL methods and gain significant results in every classroom. Applying some AfL techniques requires a significant change in teaching style and practice. Teachers need to spend time, both in planning and during lessons, modifying their activities based on the results of the assessments.

Black et al. (2003) identified three categories of difficulty in the application of formative assessment. The first concerns the nature of effective learning, which needs to be defined first on the basis of some assumptions, since the success of AfL methods is more obvious if students' active involvement is considered an important part of the effective learning process. There is some evidence that AfL designed by individual teachers can lead to superficial and shallow learning rather than deep understanding. In addition, these tests are not normally shared, discussed or critically reviewed by other teachers. The second category concerns the possible negative impact on some students, who may feel that they are competing with other students rather than making individual progress, particularly when feedback takes the form of numerical grades and marks. This can damage the confidence and enthusiasm of some children and needs to be taken into account. The third group of difficulties relates to the impact of formative assessment on future decisions, that is, the managerial role of these assessments: teachers may report their own marks rather than analysing students' work, and may make quick predictions about the results of official exams based on their own assessments.

Elements of assessment for learning

The main components of AfL suggested by Black and Wiliam (1998) are questioning, talk, feedback, sharing criteria, and peer and self-assessment.

Questioning techniques are one of the important aspects of AfL. In this study I focus on only one aspect of questioning, namely wait time. Mary Budd Rowe spent more than 20 years researching the effect of wait time during questioning and found that the pause between a teacher's question and the teacher's intervention when students had no answer was normally less than a second. In later research she introduced two measures: wait time 1, the pause following the teacher's question, and wait time 2, the pause following the student's response. She argued that such a short time is not enough to allow pupils to come up with an effective answer, and suggested that increasing both of these variables to around 3 seconds can improve students' use of language and logic as well as the attitudes and expectations of students and teachers (Rowe, 1986, p43). Elsewhere Rowe gives 2.7 seconds as the threshold for waiting.

When I first reviewed the article, I questioned whether wait time could be measured to a precision of 0.1 second, given the difficulty of measurement during research and, more importantly, in practice. However, the researchers used tape recordings and servo-chart plotters, which recognise speech patterns and measure the pauses, and this justifies the accuracy of the measurement. For classroom practice, though, a 3-second guide is still more practical than 2.7 seconds. According to the results, increasing the wait time increased the length of student answers to 3-7 times their previous length, and led to more inferences supported by evidence and logic, more questions asked, more exchanges between students, fewer failures to answer and fewer disciplinary moves (Rowe, 1986, p44).

Another questioning technique is to start the lesson with an open, in-depth and challenging question rather than a closed one. For example, Black et al. (2003) suggest that a closed question such as "What is the photosynthesis equation?" can be replaced by an open-ended question such as "Why do large plants not grow in deserts, where it is sunny and there is plenty of light for photosynthesis?" This allows all students to think and take part in the discussion, rather than only those who remember the formula for photosynthesis.

On the other hand, there are some counter-arguments, for example that students who cannot answer even after 3 seconds could feel more embarrassed (Rowe, 1986). Some teachers also reported that longer wait times produced unbearable silences that can make pupils switch off or misbehave (Black et al., 2003, p72).

Feedback is another essential part of AfL. It is the information that helps students relate their learning and performance to the desired objectives, which leads to learning gains (Nicol & Macfarlane-Dick, 2006). Walker (2012) suggests that feedback should form a loop and be part of a dialogue between teacher and student, both to provide a prompt response to students' issues and to let the teacher check that their method is appropriate and pitched at the right level. For this purpose Walker compares two methods of assessment, student-evaluated teaching (SET) and classroom assessment techniques (CAT), and recommends the second, which can provide immediate information for assessing learning progress in the classroom.

SET can provide diagnostic feedback for teachers, measuring their effectiveness in meeting government criteria on the basis of student input (Marsh, 1987). However, the literature contains some criticisms of and questions about the competence of SET, such as: How valid are students' votes? Do they really vote on teaching effectiveness or on the teacher's charisma? Is there a consensus about what effective teaching is? Why is a teacher who spoon-feeds information sometimes viewed more favourably than a teacher who challenges students to learn deeply? (Platt, 1993)

CAT techniques, as a form of AfL, are by contrast simple and quick activities that provide immediate formative feedback for both teacher and students. They therefore give both the chance to close the feedback loop and can enhance deep learning. CAT was first introduced by Angelo and Cross in 1993 and directly assesses the effectiveness of learning rather than other parameters such as the easiness of the course or the teacher's charisma. The National Teaching and Learning Forum (1998) summarises some CAT methods, such as the knowledge probe, minute paper, one-sentence summary, directed paragraphing, application cards and muddiest point. In the last of these, for example, pupils are asked to read a text and note the parts they do not understand as the muddiest point, which they can discuss with their peers and which also helps the teacher to identify the areas that need to be elaborated (Walker, 2012). Using IT facilities, students could be asked to answer CAT questions with keypads or mobile phones, and the answers can be presented immediately on the board and analysed by the teacher and students (Markett et al., 2006).

Wiliam (1999) compares the effectiveness of different types of feedback by highlighting two studies by Butler, one on 132 Year 7 Israeli students (Butler, 1988) and one on 200 students in Years 6 and 7 (Butler, 1987). These studies, which used a randomised controlled trial (RCT) design, showed an improvement in the quality of students' work after they received comment-only feedback, but no improvement when there was no feedback (the control group) or when the feedback took the form of grades, praise, or marks plus comments. Wiliam does not explain in his paper why he is convinced that Butler's work, on a limited group of students in a different country, under different conditions and in specific year groups, can be generalised to all schools in the UK, particularly when he concludes that changing the kinds of feedback we use in mathematics classrooms could have more effect than all the government initiatives put together (Wiliam, 1999). Instead of checking validity, and following a positivist methodology and approach, Wiliam justifies the results by pointing to their harmony with the old pastoral advice that the behaviour should be commented on, not the child, and to their compatibility with earlier work (Good and Grouws, 1975), which indicated that praise is not necessarily a positive tool and should be limited to praise that is effective.

Assessment for learning in science

Assessment for learning has some generic features, which mean that it can be applied at all stages and in all subjects, but it also has some specific features and techniques that make it more suitable for a particular level or subject. Within the literature there are works, such as those of Keogh and Naylor in 1997, 1998, 2004 and 2007, that target the application of AfL in science and address misconceptions and the importance of students' discussions in science (Hodgson and Pyle, 2010). Black and Harrison (2004, p3) suggest that, since science requires students to question and discuss findings, formative assessment fits well in science, where pupils develop understanding of the world by interacting and developing ideas about phenomena. Students can also predict, by observation and through experiments, what might happen if the conditions for a phenomenon change.

Black et al. (2003) suggest that, although science provides correct models and explanations of natural phenomena, some misconceptions persist among pupils, for example that astronauts are weightless on the moon because there is no air. The more effective way of resolving a misconception is to open up a discussion and give feedback using evidence that supports the scientific view, rather than presenting the scientific model directly.

Questioning can help develop a good understanding of concepts in science, especially when the questions are open. For example, take the question "Some people describe friction as the opposite of slipperiness. Do you agree or disagree?" To make it an open question that helps discussion, the second part could be changed to "What do you think?" To create an encouraging atmosphere for discussion, it is important to concentrate on what students say rather than concentrating on the correct answer and moving on.

Gioka (2007) carried out another study on the application of assessment for learning in biology classes (photosynthesis and osmosis) in four London secondary schools serving middle- and low-income families. The aim of the study, which can be regarded as a set of case studies, was to find out, through observation, interviews and the collection of students' work, the extent to which AfL is used by science graduate teachers. The results showed that only a few of the 9 teachers used AfL techniques effectively throughout the academic year. In terms of communicating assessment criteria, for instance, all teachers did this to some extent, but the differences in how they did so showed that AfL techniques can be used at different levels in science lessons. For example, one teacher communicated the exam board criteria directly to help students succeed in exams, while another teacher also communicated the criteria for good investigation and writing skills by setting out specific assessment criteria. The study also highlights other differences in the extent to which AfL techniques are used in questioning and marking in typical science classes. It does not offer any new AfL technique; it is limited to the application of previously known methods and to encouraging science teachers to apply more effective methods of assessment for learning.

School changes for the application of AfL

Some difficulties and challenges in applying AfL methods arise at whole-school level and are a challenge for the whole school rather than only the teacher. One study shows that as the number of students in a class increases, as in some university courses with large class sizes, assessment for learning becomes more challenging. This is due to the limited possibilities for contact and engagement at an individual level (Wanous, Procter & Murshid, 2009). Changes towards the application of AfL methods are also difficult to effect in large secondary schools (Gilmore, 2008, cited in Hill, 2011).

Hill (2011) suggests some strategies for managing a change towards AfL: everyone at the school, including the principal and senior staff, should have a clear vision and take an active role in organising development programmes, working groups and meetings. Teachers also need to plan together and act as learners until AfL becomes an element of the school culture.

Personal Experiences

Throughout my first placement I observed many science lessons, and occasionally lessons in other subjects, taught by experienced teachers. I also taught many science topics and benefited from the teachers' constructive feedback. Below are my reflections on some of these real experiences, which demonstrate how AfL techniques are used in science lessons and what their impacts and challenges are for students and teachers.

One important point I have consistently observed is the benefit of making sure that all pupils have understood the task after the teacher's first introduction. The teachers I observed asked several questions and sought feedback for this purpose before moving on. For example, after introducing the lesson objective "Explain how static electricity can be generated using a Van de Graaff generator", the teacher asked: "X, can you explain what we are going to do?" X explained. The teacher added more details and clarifications, then asked: "Is anyone still unclear?" After waiting a few seconds, the teacher said: "OK, let's do it now."

One barrier to pupil participation in my classes, identified by the teachers observing me, was that I gave the answer myself when students could not answer. They advised me to wait longer and to give students clues to help them enjoy finding the answer. According to Rowe (1986), wait time encourages students to participate more and to give more detailed and better-quality answers. This is something that I aimed to practise during my second placement.

Giving feedback on good behaviour to the whole class also seemed to me a very effective method; I used it a few times and really appreciated the difference. Having learned this from the class teacher, when I wanted pupils to collect their equipment quickly after a practical, rather than repeating the request I announced: "X has collected her stuff, Y is ready, Z is almost ready." This simple feedback helped everyone to follow the expectations. Similarly, teachers often say: "I am really impressed with the work you improved."

Sometimes the AfL questions I observed were more systematic, in written form or embedded in fun activities. The teacher of a low-level Year 10 class sometimes starts the lesson with a set of questions as a recap of previous lessons (for example, a set of 15 short questions in the form 5, 4, 3, 2, 1, in which the first 5 have one-word answers and the questions gradually open up until the last is an open question that needs explanation), or ends it with a set of questions (such as bingo) as a plenary to reinforce the keywords.

A Year 7 teacher asked pupils to read the practical instructions for making sugar solutions in hot and cold water, then asked a series of questions: "Look at your method and tell me what you need for your practical. How much sugar? How much water? How long do you stir? How do you measure the time? What do you stir with? How? Why?..."

I also observed an English language lesson specifically to improve my questioning skills. The teacher was known in the school as an expert in questioning, and I was impressed by the different types of question that the teacher asked continuously on different occasions. Some examples: "What page are we reading?" The teacher reads part of a poem, then asks: "What's the story all about? X, what do you think about ...? Y, do you have any words to improve that? OK so, Z, do you have anything to add? OK fine! Can you tell me any emotional words you have found? Any other word? Any other words that have ...? Any other? Yeah, I like that. Would you agree? Why would you agree then? OK cool! What about tear? Is it emotional or physical? This table, I haven't heard anything from you yet! Do you want to tell me about ...? What else do stars represent? X said stars can represent god; what do you think, Y?" Y answers: "I don't know." The teacher explains and asks another question, which the pupil can answer. "Nice, OK, cool! We haven't discussed structure yet! Anything else which is worth noting? R, I haven't heard from you today yet. Does anyone have anything other than ...?"

In the school where I am spending my main placement, it is common practice to check students' exercise books regularly and leave comments indicating how they can improve or correcting their work where necessary. According to Butler (1988) and Wiliam (1999), such non-numerical comments are more effective than numerical marks. In practice, I found that students were more eager to know the grades on their test papers than to identify their weaknesses and read the comments. It makes sense that if feedback is limited to comments, students will focus on the comments and benefit more from them.

Another common procedure in our science department is to plan a DIRT session after each post-topic test. I observed one such session and then planned and taught a lesson of this kind on another occasion. DIRT, which stands for Dedicated Improvement and Reflection Time, is a session in which students receive feedback from the teacher on their test results and dedicate an hour to identifying the areas they need to improve, setting these as targets based on the teacher's feedback on the test paper, and working on those targets. The idea is that, rather than going through the answer to each question, it is more beneficial for students to identify their weaknesses through self-assessment and work to improve those specific areas. To help students, a set of pre-defined targets can be provided for some classes, for example directing them to follow target 1 if they have not achieved a good score in question 1, and so on. I think DIRT can be considered an effective AfL method that meets Walker's (2012) criterion of closing the feedback loop: the data from a summative assessment (test results, open circuit) is used for further learning in a self-assessment process (closed loop) in which pupils are involved in setting their own targets and work on those targets to improve their learning.


This study addressed some advantages of assessment for learning methods and their positive impact on the learning, progress and motivation of pupils. It also showed that, alongside these advantages, there are difficulties in applying these techniques, which call for group work and cooperation among the management, senior staff and teachers in schools to make AfL part of the school culture.

Some case studies in science lessons and some studies based on RCT methods were discussed, and it was noted that this research rests on many assumptions and limitations, yet its results on AfL have been inappropriately generalised from non-representative samples to the whole UK population. Research in this area can be improved by further large-scale and comprehensive research alongside the current small-scale studies.

