Learner observation tasks as a learning tool for pre-service teachers

Chapter 1

Introduction

1.1. Teaching Practicum in Kazakhstan

Teaching Practicum is compulsory for graduate-level student teachers enrolled in the English Language Teaching Department. Student teachers take the Teaching Practicum at state schools and follow the Teaching Practicum Curriculum issued by the Department of High Education of Kazakhstan. According to this Curriculum, the Teaching Practicum consists of two periods: a five-week period for third-year students at the end of the 5th semester (December), and a seven-week period for fourth-year students at the beginning of the 7th semester (September and October).

Lesson observation is one of the major components of the Teaching Practicum. Both Teaching Practicums involve observation weeks: two weeks for the third-year students and one week for the fourth-year students. Observation weeks are devoted to observing lessons and becoming familiar with the school’s facilities, policies, procedures, pedagogical practices, and the preparation of the timetable.

During the Observation Weeks student teachers have to observe lessons given by their monitor teachers in order to become aware of the methods and techniques of their teaching. In addition, they observe the relationship between the teacher and students, students’ learning styles and their behaviour. To gain a better understanding of the learners’ personalities, student teachers are recommended to observe lessons in other subject areas that are taught to the class they are allocated. At the same time pre-service teachers observe lessons of other experienced teachers who display exemplary teaching practices, and of novice teachers, in order to evaluate various teaching techniques at different levels of professional experience.

During the Observation Weeks student teachers are required to record their observations of fifteen English language classes for the third-year students and ten classes for the fourth-year students to be assessed. Students must have daily entries of their observations reflecting on various types of teaching or participation experience. Moreover, student teachers are strongly recommended to conduct peer observation and provide feedback on at least one lesson per day, and written feedback on at least two lessons per week during the Teaching Weeks.

1.1.1 Types of records at the Teaching Practicum and trainees’ problems

There are no fixed observation instruments in the National Teaching Practicum Curriculum. Every English Language Teaching Department compiles its own, in ethnographic or structured format. Some Departments prescribe that student teachers must keep diaries, whereas others provide trainees with observation schemes. The former technique requires pre-service teachers to describe, in narrative form, their reaction to the lesson observed, the learners, the relationship between teacher and pupils, school policy in general and their initial teaching experience. The latter are introduced in different formats: either a detailed structured checklist with pre-specified categories of the teacher’s or learner’s behaviour, where the trainee’s role is to record their occurrence and accompany it with evidence or jotted comments that they consider relevant to the observation, or a general lesson report in which student teachers note the strengths and weaknesses of the lesson observed.

As a teacher trainer at a state university in Kazakhstan I have read, analysed and assessed more than 200 diaries and observation sheets over six years. This work has raised my doubts about the usefulness of observation as a learning tool. The comments of trainees are mainly descriptive; the student teachers note down what the teacher and the learners have done during the lesson and whether the learners are "interested", "involved", "active" or not. I have noticed that trainees have problems identifying the aims of the lesson, means of transition, the teacher’s prompts and learning outcomes. There is very little analysis or reflection. They observe that the teacher has no problems with discipline but do not ask themselves why this is so. Very few trainees have made any connection between their observations and their own teaching.

I can name some reasons for these problems. The main one is the small amount of time allotted to the TESOL course in Kazakhstan. For this reason, pre-service teachers are introduced to observation skills and strategies only formally. Student teachers need help in observation, but the university supervisor and the educational psychology instructor are far too seldom in the classroom with pre-service teachers to guide them and to conduct observation, further analysis and reflection in a collaborative way. Another reason is that the format of the observation schemes seems to limit the student teachers very much. They feel obliged to fill in the space, often repeating the same remarks in subsequent observation sheets. Finally, observation sheets prescribe categories or tasks in the form of broad statements without explaining the reason for the observation, what to write and in what sequence. The teaching process is a complex procedure that covers teaching behaviour, learning behaviour, patterns of interaction, and patterns of group dynamics. Some aspects of this process are overt, for example question-answer work, but others are far more covert, such as the learner’s interest. So student teachers face the dilemma of what is worth mentioning, how to interpret the teacher’s or learner’s remarks or behaviour, and how long the notes should be.

1.1.2 Tasks as a solution to the problem

In this paper I look for ways to help my students make their observation experience more meaningful. Student teachers should know that the reason for observing and filling in the observation sheets is that we want them to learn something from doing so, and only then to grade them. The features of a good observer should be made clear to them. They should realize that the skills of observation can be learnt. The university supervisor should try to transfer some of her observation skills by observing a lesson together with the trainees and analyzing the observation sheets with them afterwards in a collaborative and consultative way.

The main suggestion concerns the format of the observation schemes. Numerous schedules of observation have been introduced: the Flanders System of Interaction Analysis (FIAC) by Flanders (1970), the Foreign Language INTeraction (FLINT) system by Moskowitz (1971), FOCUS by Fanselow (1977), COLT by Allen, Fröhlich and Spada (1984), and the Stirling system by Mitchell, Johnstone and Parkinson (1981). They are valid and do not require trials. The main problem with these instruments, however, is that they were originally designed for educational research and for in-service teacher development. Some of these instruments, described in Chapter 2.5.2, are recommended for teacher training education. However, the researchers do not deny that all of them are complex and require intensive training. Thus for teacher training education we need reliable observation instruments, based on scientific grounds, that develop observation skills gradually and improve them with practice.

Observation tasks were introduced by Wajnryb (1992) and are widely used, in modified form, around the world in teacher development programmes. She clearly identified the advantages of observation tasks. They limit the scope of observation and allow an observer to focus her/his attention on one or two particular aspects. Concrete subsequent statements provide a convenient means of collecting data and free student teachers from interpreting the behaviour and making evaluations during the lesson. A list of questions after a lesson guides them as to which aspects of the teaching/learning process they should reflect on. What is more, the tasks allow student teachers to personalize the data and to view their own teaching experience. Thus the nature of the task-based experience is ‘inquiry-based, discovery-oriented, inductive and potentially problem-solving’ (Wajnryb 1992:15).

However, classroom observation tasks were initially introduced for teachers’ professional growth, not for teacher training education. That is why they need to be adapted for this purpose as well. Learner observation tasks offer sample categories to the student teachers without restricting them. Student teachers can decide in which form to take notes, either putting down actual utterances or making jottings. This is important because it allows student teachers to be independent and autonomous. Other modifications are described in Chapter 3.

The two main purposes of the tasks can be formulated as raising trainees’ awareness of the aspects of the teaching process and guiding student teachers to make their own decisions about the teaching process. In addition, observation tasks may serve as the basis for further, deeper case study research and provide student teachers with data for writing course work according to the National Programme for Teaching English Language Department.

1.1.3 The problem of assessment of observation documents

At the end of the Teaching Practicum, observation sheets or diaries must be included in the Practicum Folder to be assessed. Here the supervisor faces another problem: there are no explicit criteria for assessing student teachers’ observation sheets. S. Gill, a university teacher from the Czech Republic, in his reply to my request about Teaching Practicum experience in different countries noted: ‘What we use to arrive at these decisions (assess or not assess student’s observation schedules) is our internal and doubtless highly subjective criteria’. These criteria include the full answer to the questions, evidence of student teachers’ ability to describe what they have seen and link it to the activities of the lesson, evidence of reflection, and language explicitness. It is evident that all these criteria are ambiguous. What should we treat as ‘the full answer’, ‘evidence of reflection’ and ‘language explicitness’? In this paper I introduce scientific criteria for the assessment of observation for research purposes and adapt them to observation as a learning tool for teacher training education.

1.2. Learner as a central focus of observation

1.2.1 Learner’s central role in the teaching process

For my dissertation I have designed observation tasks which are directed at observing and studying learners’ behaviour and their attitude to each other, the teacher and the subject, and which guide student teachers to reflect on the motives and reasons behind these behaviours. There are many reasons to place the learner at the centre of observation. Historically, owing to the teacher-centred approach in education, observation focused on aspects of the teacher’s behaviour: opening/closing procedures, use of voice, handling discipline problems and many others. However, humanistic and language acquisition theories approach the teaching process on the assumption that an individual learner can bring his/her own experience, knowledge and ideas to the classroom. One of the main aims of the present teaching process is to help learners to be responsible for their learning progress and to promote their autonomy in language learning. To accomplish this aim, student teachers should know learners’ individual differences, subjective needs and preferences. This knowledge will help them ‘to make instructional procedures more flexible to individual learning pace and needs’ (Tudor 1996:11), which enhances learners’ involvement in the learning process and, accordingly, their progress.

1.2.2 Reasons to observe learner’s behaviour

Another motive that gave me the idea of designing learner observation tasks is the reports of my trainees after the teaching practicum. They have noted that ‘students are of different levels but they are given the same tasks; tasks for students with lower level should be adapted; students should have not only group work but individual work; pupils demonstrate lack of interest in doing some tasks’. These quotes clearly indicate student teachers’ awareness of individual differences and of the importance of an individual approach to every learner or group of learners. However, student teachers enter the classroom with ‘a critical lack of knowledge’ (Kagan 1992:131) about pupils. To acquire knowledge of pupils, direct observation appears to be crucial. This requires structured, guided observation that allows trainees to study pupils’ behaviours and to know their differences and needs, so as to respond to them appropriately through a variety of learning activities in their future lesson planning.

In an extensive review of around a hundred studies of beginning teachers, Veenman (1984:144) ranked classroom discipline, motivation of students, and individual differences among students as their first three concerns. The purpose of compiling learner observation tasks is to bring about change in the trainee’s knowledge of a class in terms of a progression: beginning with classroom climate and management, moving to motivation of students and their individual learning styles, and finally turning to students’ language proficiency.

1.3 Overview of chapters

The dissertation is intended to provide university supervisors and student teachers at Teaching Practicum with four observation tasks that are directed at observing learners’ behaviours.

The Introduction explains the background situation of the teaching practicum in TESOL Departments in higher education in developing countries, particularly in the Republic of Kazakhstan. I introduce the motives that gave me the idea of developing materials for observation during the teaching practicum. The subsequent chapters are divided into specific areas.

Chapter 2 gives a detailed account of observation in educational research and in language classroom studies. Observation is defined as a direct research method and a learning tool for data collection. The chapter emphasizes the characteristic features of observation as a scientific method and its difference from the natural process of looking. Some weaknesses of observation are specified, among which errors in representing data, objectivity of data recording and the limitation of observable items are classified and described. Reliability and validity are two key processes that can enhance the ‘trustworthiness of reported observations, interpretations, and generalizations’ (Mishler 1990:419). A typology of reliability and evidence of validity introduce methodological strategies and judgement criteria for the objective assessment of observation data. To ensure scientific observation an observer must clarify the focus of observation, the approach to data collection, and the ways of recording observation data. The paper presents four perspectives on a lesson for pre-service teacher education: teacher-centred, learner-centred, curriculum-centred and context-centred focus. Two approaches (system-based and ethnographic) are described in opposition, and the ad-hoc instrument as a combination of both. Methods and techniques of observation focus on the main instruments that have been developed for pre-service teacher education: on the one hand field notes, anecdotal records, diaries, journals, personal logs and case studies, and on the other checklists, observation schedules, observation tasks, selective verbatim and numerical rating scales. They are classified as procedures of a low degree and a high degree of explicitness (Seliger and Shohamy 1989:158) respectively. Data evaluation is a late and crucial stage in the observation method. For teacher training education, evaluation of observation records constitutes a part of the teaching practicum assessment. In qualitative and quantitative research two approaches to the analysis of documents are presented: manual and computer-based. A set of procedures and criteria is specified for manual evaluation.

Chapter 3 describes the details of the design of the learner observation tasks. It explains the choice of area for learner observation and the reasons for modifying the classroom observation tasks elaborated by Wajnryb (1992). A description of the task frame and categories is provided.

Chapter 4 gives a self-evaluation of the designed materials in the context of the literature review. It explains the choice of the ad-hoc approach as the most appropriate instrument for teacher training education. I emphasise the combined features of the ethnographic and structured approaches in the design of the learner observation tasks. This is followed by evidence of the reliability and validity of the documents.

Chapter 5 introduces brief background on the particular facet of learner behaviour that is the focus of each observation task. This is followed by the actual description of the task, its objectives and the procedure for working on the task before, during and after the lesson. I explain the choice of categories and symbols of the task that student teachers are recommended to employ in their descriptive notes.

Chapter 6 indicates how the learner observation tasks can be further implemented in the Teaching Practicum Curriculum. Three phases of working with the tasks are also given for university supervisors. I have adapted the evaluation criteria proposed by Scott (1990) for manual assessment of trainees’ documents. Finally, some recommendations are introduced for future improvement of the assessment procedure with the use of computer packages.

Chapter 2

Literature review

2.1 What is observation?

2.1.1. Observation in scientific research

The literature repeatedly refers to observation as a method of data collection and a process involving representations and recordings in which reality is depicted. Techniques of observation are not themselves new: they have been used in scientific research for studying the behaviour of men and animals. Anthropologists, sociologists and psychologists were concerned primarily with describing ‘observable behaviours and activities’ (Seliger and Shohamy 1989:118), with the ‘systematic recording in objective terms of behaviour in the process of occurring’ (Jersild and Meigs 1939), and with describing these in their entirety from beginning to end.

One could treat observation as a familiar and natural phenomenon that does not need any definition. Hutt and Hutt (1970) give no definition of observation in their book ‘Direct Observation and Measurement of Behaviour’. A definition of general observation is given by Wright (1960:71): ‘research methods… rest upon direct observation as a scientific practice that includes observing and recording and analysis of naturally occurring events and things’. According to Wright (1960:71) observation is direct in that no arrangements stand between the observer and the observed, and the records are usually compiled immediately after the observation. In a review article, Weick (1968:360) defines an observational method more elaborately as ‘the selection, provocation, recording and encoding of that set of behaviours and setting organism “in situ” which is consistent with empirical aims’.

So, I can define the characteristic features of observation as a scientific method as follows: a limited amount of information should be collected; the data should be recorded systematically and analysed over a period of time; the data should be congruent with the aims; the observation session must be planned; and, finally, the observation and analysis must be objective.

2.1.2. Approaches to observation in the language classroom studies

Observation in the language classroom is treated either as a research procedure for in-service professional development or as a learning tool for pre-service teachers. Hargreaves (1980:212) suggests that the 1970s were a ‘notable decade’ for classroom studies thanks to the number of projects and the wide range of methodological approaches, and he identifies ‘three great traditions’ of studying classrooms: systematic observation, ethnographic observation and sociolinguistic studies. Sociolinguistics studies the connections between language and society. These aspects are not of prime interest for pre-service classroom observation, which is why I do not dwell upon this approach in this paper.

Hammersley (1986:47) proposes that systematic observation and ethnography are treated as ‘self-contained and mutually exclusive paradigms’. The following description of both approaches supports this idea. Croll (1986:5) illustrates some fundamental aspects of systematic observation as follows: explicit purposes which are worked out before data collection; explicit and rigorous categories and criteria for classifying phenomena; data presented in quantitative form to be analysed with statistical techniques; and any observer recording a particular event in an identical fashion to any other. The ethnographic approach involves a complete cycle of events that occur within the interaction between the society and environment. Lutz (1986:108) defines ethnography as ‘a holistic, thick description of the interactive process involving the discovery of important and recurring variables in the society as they relate to one another, under specific conditions, and as they affect or produce certain results and outcomes in the society’. So, systematic observation is described as a highly eclectic study of an event with pre-specified categories, with detailed analysis presented in a quantitative manner, whereas ethnography describes and interprets events holistically in their naturally occurring contexts. More detailed characteristics of the systematic and ethnographic approaches are provided in Chapter 2.3.
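The quantitative side of systematic observation can be illustrated with a minimal sketch. It assumes a hypothetical set of four pre-specified categories and a list of coded events noted by an observer during one lesson; neither the category names nor the data come from Croll or from any published schedule. The sketch only shows how pre-specified categories turn a lesson record into counts and proportions that could then be analysed statistically.

```python
from collections import Counter

# Hypothetical pre-specified categories (illustrative only, not taken from
# any published observation schedule).
CATEGORIES = ("teacher_question", "pupil_answer", "pupil_question", "silence")


def tally(events):
    """Return {category: (count, proportion)} for a sequence of coded events.

    Events outside the pre-specified categories are rejected, mirroring the
    requirement that categories are fixed before data collection.
    """
    unknown = [e for e in events if e not in CATEGORIES]
    if unknown:
        raise ValueError(f"events outside the coding scheme: {unknown}")
    counts = Counter(events)
    total = len(events)
    return {
        c: (counts[c], counts[c] / total if total else 0.0)
        for c in CATEGORIES
    }


if __name__ == "__main__":
    # Invented record of one short lesson segment.
    lesson = [
        "teacher_question", "pupil_answer", "teacher_question",
        "silence", "pupil_answer", "pupil_question",
    ]
    for category, (count, share) in tally(lesson).items():
        print(f"{category:18s} {count:2d}  {share:.2f}")
```

An ethnographic record of the same segment would keep the full narrative description, whereas this tally keeps only the frequencies of the categories chosen in advance.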

2.2. Observation as a problem

2.2.1. Classifications of errors in the process of observation

There is always the possibility of error in the observation process. Fassnacht (1982:43) reviews Campbell’s (1958) classifications of errors in representing data in psychological and social studies. Some of these errors frequently occur when making judgements and primarily concern language behaviour:

a) error of central tendency

b) error of leniency or generosity

c) primacy or recency effect

d) halo effect

e) logical error

The first error occurs in using a rating scale. Hollingworth (1910) called the effect ‘central tendency’: in a series of judgements about quantifiable stimuli, the large stimuli are underestimated and the small ones overestimated.

An error of leniency or generosity could arise in making favourable verbal judgements using personality scales. Fassnacht (1982:40) clarifies that in the personality scales a number of questions relating to one particular personality trait are drawn together and the answers to these questions are given in the form of ‘yes’, ‘no’, ‘sometimes’, ‘often’ which might not reflect objective reality.

A third error occurs as a result of the order in which perceptual events happen. The problem is that in behaviour testing the first impression could have a distorting effect on later data collection and thus lead to errors. Bailey (1990:218) admits that in diary keeping, events that are embarrassing or painful when they occur ‘often lose their sting after weeks of reflection’.

The fourth error, the halo effect, is described by Mandl (1971) as occurring when the evaluator ‘has the tendency when judging a personality trait to be influenced by a general impression or a salient characteristic’.

Logical error, or error of theory, arises from the theoretical assumptions of the observer. It is now widely accepted that observation is always ‘theory-laden’ (Phillips 1993:62). Phillips continues that observations cannot be ‘pure’, free from the influence of background theories or hypotheses or personal hopes and desires. Ratcliffe (1983:148) supports this assumption in that ‘most research methodologists are now aware that all data are theory-, method-, and measurement-dependent’. As Bailey (1990:226) suggests, in conducting ‘pure research’ it is better to avoid reading the research literature in the field, to keep from biasing the results.

2.2.2. The problem of ‘observable’ items

The term ‘observable’ in the definition given by Seliger and Shohamy (1989:118) mentioned above raises the problem of which items are to be treated as observable in a classroom setting. Thus, Smith and Geoffrey (1968) make valid assertions criticising systematic observation systems:

The way the teacher poses his problems, the kind of goals and sub-goals he is trying to reach, the alternatives he weighs … are aspects of teaching which are frequently lost to the behavioural oriented empirical who focuses on what the teacher does to the exclusion of how he thinks about teaching. Smith and Geoffrey (1968:96)

McIntyre and Macleod (1986:14) generalize the problem of observable items and limitation of data obtained through systematic observation claiming that there is ‘no direct evidence on the actions of participants which are not overt’. The detailed criticism of systematic observation is given in Chapter 2.6.2.

2.2.3. Data recording problems

The problem of accurate recording

Data collection and description procedures face problems of the accuracy and explicitness of records. ‘The crucial problem is to be able to render interpretable the process of events and behaviour as it occurs naturally’ (McKernan 1996:60).

Hutt and Hutt (1970:34) emphasise the difficulty of describing behaviour accurately. They point to the problem of vocabulary choice: there are many thousands of words which describe motor and language behaviour but ‘unfortunately, the words are injunctive concepts, learned by usage rather than by definition’ (Hutt and Hutt 1970:34). Moreover, some definitions are over-encompassing in that they cover patterns of behaviour for which ordinary language has two or more terms. Lofland and Lofland (1995:93) recommend employing behaviouristic and concrete vocabulary rather than abstract adjectives and adverbs, which are based on paraphrase and general recall.

The problem of objective recording

Another problem with written commentary is that of objectivity. Researchers generally agree that the data are often subjective, inferential and interpretative, and reflect personal impressions. Events may not be viewed the same way by different observers. ‘It is common to find that witnesses to an accident give differing accounts of what happened’ (Lofland and Lofland 1995:127).

Eisner (1993:49) defines objectivity as being ‘fair, open to all sides of the argument’. He considers that to reduce subjectivity the observer must achieve correspondence not only in what s/he perceives or understands but also in how s/he represents it. Schaffer (1982:75) continues the discussion of vocabulary choice, saying that there are some aspects of reality which can be described fairly objectively and others which can only be described subjectively, and ‘it is difficult to know where the borderline between objectivity and subjectivity lies’. Scheurich (1997:161) doubts ‘the very existence of gross material reality’. He claims that research mainly addresses interpretation of meaning or constructions of ‘reality’.

To sum up the problems with data recording, an observer may describe and interpret an event subjectively because of personal bias or theoretical assumptions, and may experience difficulty in choosing an object or behaviour to observe and in finding words to record an event accurately and explicitly.

2.2.4. The choice of an approach to observation

An observer faces a dilemma in choosing between the systematic and ethnographic approaches. The main problem of the ethnographic approach lies in its very nature: it is so broad that it demands a highly trained observer to carry out a competent and reliable observation. ‘An untrained observer may be overwhelmed by the complexity of what goes on and not be able to focus on important events in the classroom’ (Day 1990:44). Pre-specified coding systems in systematic observation, by contrast, are exclusively concerned with ‘what can be categorized or measured’ (Simon and Boyer 1974). Thus they may distort or ignore the qualitative features which they claim to investigate. At the same time, limiting the attention of the observer can help improve reliability.

2.3. Reliability and Validity

2.3.1 Types of reliability

Reliability and validity are the most important criteria for assuring the quality of the data collection procedures. The criterion of reliability provides information on whether the data collection procedure is ‘consistent and accurate’ (Seliger and Shohamy 1989:185). The researchers suspect that observers may unintentionally impose their own biases and impressions on the observed situation. Seliger and Shohamy (1989:185) claim that for different types of data collection procedures different types of reliability are relevant. Thus they identify the following types for the ethnographic approach:

a) inter-rater reliability (to examine the extent to which different observers agree on the data collected from the observation);

b) test-retest reliability (to check the stability of data collection over time);

c) regrounding (to repeat the data collection and compare both results);

d) parallel form (to examine the extent to which two versions of the same data collection procedure are really collecting the same data).

To assure reliability, different methodologists suggest involving at least two observers to carry out a ‘sequential analysis’ (Becker 1970:79), or to achieve ‘inter-observer agreement’ (Croll 1986:150). The idea of the former procedure is to carry out the analysis concurrently with data collection in the sense that ‘one may ‘step back’ from the data, so as to reflect on their possible meaning’ (Fielding 2001:158). Subsequent data gathering will then direct the observer either to abandon or to pursue the original hypothesis. In the latter procedure two observers look at the same events from different locations to categorise these events and compare the outcomes. Using systematic schemes with pre-specified categories, they refine, or ‘index’ (Fielding 2001:159), the definitions and categories of observation by ‘applying in a consistent manner the procedures for data selection, collection, grouping, inclusion, exclusion etc.’ (Simpson and Tuson 1995:65).
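How far two observers agree can be expressed numerically. The sketch below is an illustration under stated assumptions, not the procedure prescribed by Croll or by Seliger and Shohamy: it assumes that two observers have each assigned one category to the same sequence of events, and it computes simple percentage agreement together with Cohen's kappa, a standard statistic (not named in the sources above) that corrects agreement for chance. The category labels and the data are invented.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Share of events to which both observers assigned the same category."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both observers must code the same non-empty event list")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa: (observed - chance agreement) / (1 - chance agreement)."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    # Chance agreement: probability that both observers pick the same
    # category if each coded at random with their own category frequencies.
    chance = sum(
        freq_a[c] * freq_b[c] for c in set(codes_a) | set(codes_b)
    ) / (n * n)
    if chance == 1.0:
        return 1.0  # both observers used one identical category throughout
    return (observed - chance) / (1 - chance)


if __name__ == "__main__":
    # Invented codings of the same eight events by two observers.
    observer_1 = ["on_task", "on_task", "off_task", "on_task",
                  "question", "off_task", "on_task", "question"]
    observer_2 = ["on_task", "off_task", "off_task", "on_task",
                  "question", "off_task", "on_task", "on_task"]
    print(f"agreement: {percent_agreement(observer_1, observer_2):.2f}")
    print(f"kappa:     {cohens_kappa(observer_1, observer_2):.2f}")
```

A low value would suggest that the category definitions need to be refined before the scheme is used again, which corresponds to the ‘indexing’ step described above.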

2.3.2 Types and evidences of validity

Just as there are different types of reliability, Seliger and Shohamy (1989:102) suggest that there are different types of validity which provide ‘evidence’ for validity. Thus, their typology of ‘evidences’ of validity comprises:

a) evidence on content validity which demonstrates appropriateness of data collection against the content to be measured;

b) criterion validity which provides an indication as to whether the instrument can be measured against some other criterion and compared with the previous results (concurrent validity), and whether the procedure is capable of foretelling certain behaviour (predictive validity);

c) construct validity which examines whether the data collection procedure is a good representation of and consistent with current theories underlying the variable being measured.

Chaudron (1988:24) gives another term for content validity, ‘treatment validity’, which relates to the process component of a process-product study and demonstrates that the treatment was in fact implemented and that it was identifiably different from whatever it was being compared with.

For the results of the second language research Seliger and Shohamy (1989:104) identify internal and external validity. They propose that a study has internal validity if the outcomes of the observational data can be directly and unambiguously attributed to the treatment that is applied to the observed group, and that the interpretation of these data is not dependent on the subjective judgement of an individual researcher. Internal validity in this sense relates to three areas: ‘representativeness, retrievability, and confirmability of the data’ (Seliger and Shohamy 1989:104). External validity involves the extent to which the findings of a study can be generalized and applied to another situation and the categories of the study are treated as basic, applied, and practical.

To obtain evidence of validity, the items or questions of an instrument must be analyzed in the process of data collection. A researcher or observer should find out whether the items are ‘low-inference’ or ‘high-inference’ (Long 1980), whether they are too easy or too difficult, and whether they are clearly phrased and easily understood by the respondents. It is recommended that all these aspects be examined in the pilot phase of the research and supported by evidence from a variety of sources, such as additional questionnaire data from pupils or teachers, interviews and surveys. Another way of examining the validity of observation is to ask colleagues to study the categories and to define the purpose of the observation. Simpson and Tuson (1995:65) treat this method as a useful check on face validity. Thus, to achieve reliable and valid observation an evaluator should take into account the spatial location of the observer, engage more than one observer, use ‘low-inference’ categories that do not require complex interpretation, and check agreement on key aspects against independent studies.

2.4. Items of observation

2.4.1 The importance of items

Language classroom observation ‘does not simply mean watching classes’ (Wallace 1991:123). An observer may record either very narrowly defined data, such as a specific speech act, or more general kinds of language learning activity, such as turn-taking or group work.

Any scientific research or observation is characterised by such terms as ‘structured’, ‘organised’, ‘methodical’, and ‘systematic’. To meet these characteristics, any data collection has a structure or format and is guided by questions or variables. Croll (1986:55) defines a variable as a basic unit that represents the process by which a concept of interest is turned into a set of working definitions whereby the results of observation or some other data collecting process can be categorized and measured.

2.4.2 Items of observation in the language classroom

For classroom observation as a learning tool in pre-service training, Richards (1998:143) proposes three perspectives on a lesson, intended to develop a deeper understanding of how and why teachers teach the way they do and of the different ways teachers approach their lessons. They are:

1) Teacher-centered focus: the teacher is the primary focus; factors include the teacher’s role, classroom management skills, questioning skills, presence, voice quality, manner, and quality of instructions.

2) Curriculum-centered focus: the lesson as an instructional unit is the primary focus; factors include lesson goals, opening, structuring, task types, flow, and development and pacing.

3) Learner-centered focus: the learners are the primary focus; factors include the extent to which the lesson engaged them, participation patterns, and extent of language use.

Wallace (1998:68) substitutes the focus on the curriculum with the focus on the context in which the teacher teaches: the classroom layout, the teaching aids available and how they are used.

Low-inference and high-inference categories

The presentation of items involves constructing sets of categories into which occurrences must be coded unambiguously. In this respect Long (1980:3) introduces low-inference and high-inference measures. Low-inference categories include things that can be counted or coded without the observer having to infer their meaning from observable behaviour. According to Allwright and Bailey (2000:73), such categories include the number of times a student raises her/his hand, or the frequency with which the teacher uses a student’s name. High-inference items demand that the observer make a judgement that goes beyond what is immediately observed. Examples of this type of category are the learner’s attention and the social climate. I conclude that observation data should cover categories of observable behaviour that do not require much interpretation.
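Because low-inference categories can simply be counted, recording them amounts to keeping a tally. A minimal sketch follows, assuming a hypothetical list of coded events and two hypothetical category names based on the examples from Allwright and Bailey; it is not an instrument taken from the literature.

```python
from collections import Counter

# Hypothetical low-inference categories agreed on before the lesson.
LOW_INFERENCE_CATEGORIES = {"hand_raised", "teacher_uses_name"}


def tally(events):
    """Count occurrences of each pre-specified category.

    Events outside the agreed categories are rejected, so that every
    occurrence is coded unambiguously.
    """
    unknown = set(events) - LOW_INFERENCE_CATEGORIES
    if unknown:
        raise ValueError(f"uncoded categories: {sorted(unknown)}")
    return Counter(events)


# Hypothetical record of one lesson, in order of occurrence.
lesson_events = [
    "hand_raised",
    "teacher_uses_name",
    "hand_raised",
    "hand_raised",
    "teacher_uses_name",
]

counts = tally(lesson_events)
for category, count in sorted(counts.items()):
    print(category, count)
```

A high-inference item such as ‘learner’s attention’ could not be handled this way, since each entry would first require a judgement by the observer.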

2.5. Typology of observation

Wallace (1991:66) works out a typology of classroom observation instruments and presents the following oppositions:

1. system-based, ethnographic or ad-hoc

2. global or specific

3. evaluative, formative or research-related

4. teacher-focused, learner-focused or neutral in focus

5. quantitative or qualitative

He admits that some of the oppositions are not clear-cut and overlap. For example, observation techniques which are primarily evaluative may be employed for formative purposes, and the ethnographic approach is treated as global and qualitative. A system-based approach can focus on both the teacher’s and the learners’ activities. The system-based (systematic), ethnographic and ad-hoc approaches encompass the other characteristics of the classification. I therefore outline the features of the first opposition.

2.5.1 System-based approach

By system-based observation Wallace (1991:67) means observation that is based on a system of fixed and pre-specified categories. Such systems are global in nature, i.e. ‘they are intended to give general coverage of the most salient aspects of the classroom process’ (Wallace 1991:110). Any system contains a finite array of categories. The aim of all system-based observation instruments is the analysis of teacher-class interaction. The two most influential systems were devised by Bellack (1966:267) and by Flanders (1970:314). Wallace (1991:112) identifies the characteristic features of the first system as:

1) the data are measured from a transcript, i.e. the data have to be first recorded and then transcribed;

2) the central labelled units of discourse are structure, solicit, response and reaction.

In the ‘Flanders tradition’ there is a form of documented recall where tallies are made every three seconds under one range of categories. Section 2.6 presents a range of interaction schemes, with their advantages and disadvantages, in more detail. They are widely used by researchers as they are ready-made and well known, and do not need to be ‘trialled and validated’ (Wallace 1991:111).

2.5.2. Ethnographic approach

Observation techniques share many of the qualities of ethnographic practices. Ethnography is a detailed sociological observation of people which immerses the researcher in an intense period of observation ‘which guides and informs all subsequent data gathering’ (Radnor 2002:49).

The ethnographic approach was originally developed from the methodologies of field anthropologists and sociologists concerned with studying human behaviour within the context in which that behaviour would naturally occur. Methodologically, ‘anthropological’ classroom studies are based on participant observation, during which the observer immerses him/herself in the ‘new culture’. Initial data gathered by the ethnographer are open-ended and relatively unstructured, which ‘allows and encourages the development of new categories’ (Delamont and Hamilton 1976:13). An ethnographer uses a holistic framework. S/he makes no attempt to manipulate, control or eliminate variables. At the same time s/he systematically reduces the breadth of research problems to give more concentrated attention to the emerging salient issues.

The great strength of ethnographic research is that it gets away from the simplistic behavioural emphasis of the pre-specified codes (Delamont and Hamilton 1976:37).

The main purpose of the ethnographic approach is the search for meaning and is based on the description of the studied phenomenon. However, Lutz (1986:112) warns that not everyone who can write a paragraph describing an encounter between a teacher and a student is an ethnographer, and he points out that an observer should be trained in ethnographic methods, particularly participant-observer field methods.

2.5.3 Ad-hoc approach

The term ‘ad-hoc’ is used to describe something that has been devised for a particular purpose, ‘with no claims to generality’ (Wallace 1991:113). The ad-hoc approach is related to structured approaches, but its categories derive from a particular problem or research topic. That is why this approach is more popular with practising teachers. Moreover, it is flexible and eclectic, and involves both quantitative and qualitative data wherever each seems appropriate. Wallace (1991:113) assumes that each different area of concern will yield a different system of analysis. The ad-hoc approach is considered to be the most appropriate in teacher training, as it is basically a guided discovery approach that drives student teachers to focus and reflect on an important area of language teaching and provides a meta-language with which to discuss it. The instrument of the ad-hoc approach is known as observation tasks (Wajnryb 1992) and is described in Section 2.6.2.

2.6. Methods and techniques of observation

2.6.1 Classification of data collection techniques

Seliger and Shohamy (1989:158) classify data collection procedures according to their degree of explicitness. At one end of the scale they place broad and general techniques which do not focus on a particular type of data and are considered to have a low degree of explicitness; at the other end they place procedures which are more explicit and structured and thus have a high degree of explicitness. Data collection with procedures of a low degree of explicitness relies on open and informal description, which tends to be written as the events occur. Typical procedures of this kind are field notes, records, diaries, journals, lesson reports, personal logs, life history accounts and informal interviews with the subjects of observation. Data collection with procedures of a high degree of explicitness involves formal and structured instruments. Examples are interaction schemes, checklists, observation schedules, observation tasks, formal interviews, surveys, structured questionnaires, case studies and numerical rating scales. Different procedures imply different techniques for data collection. Data obtained from more structured observations are presented in the form of checks, tallies, frequencies, and ratings, while data obtained from informal observations are presented in the form of narration, field notes, or transcripts.

According to this classification I am going to describe a range of procedures that are applied to pre-service classroom observation.

2.6.2 Observation instruments

Field notes

Field notes are records of naturalistic observation, made in the natural context of the behaviour researched through direct listening and watching. The main focus of observation notes is accurate description rather than interpretation. An observer can write down interesting details on various aspects of school life in general and of the teaching process in particular. ‘Each observational note represents a happening or event – it approximates the who, what, when, and how of the action observed’ (McKernan 1996:94). McKernan considers field notes a useful tool because

1. they are simple records to keep requiring direct observation

2. no outside observer is necessary

3. problems can be studied in the teacher’s own time

4. they can function as an aide-memoire

5. they provide clues and data not dredged up by quantified means.

At the same time an observer should consider some drawbacks in the use of this technique presented by McKernan (1996:96) as follows:

1. It is difficult to record lengthy conversations

2. They can be fraught with problems of researcher response, bias, and subjectivity

3. It is time-consuming to write up on numerous characters

4. They are difficult to structure

5. They should be triangulated with other methods, such as diaries and analytic notes.

The case study

Elliot and Ebbutt (1986:75) treat the case study as a research technique in which teachers identify, diagnose and attempt to resolve major problems they face in teaching for understanding. Richards (1998:73) considers that case materials help students to explore how teachers in different settings ‘arrive at lesson goals and teaching strategies, and to understand how expert teachers draw on pedagogical schemes and routines in the process of teaching’. McKernan (1996:76) reminds us that the researcher or observer should use a ‘conceptual framework’, which can relate to existing science. The researcher thus employs various concepts to make sense of the observed data.

Richards (1998:76) enumerates advantages for using case studies in teacher education:

1. students are provided with vicarious teaching problems that present real issues in context;

2. students can learn how to identify issues and frame problems;

3. cases can be used to model the process of analysis and inquiry in teaching;

4. students can acquire an enlarged repertoire and understanding of educational strategies;

5. cases help stimulate the habit of reflective inquiry.

Diary/journal

Some researchers use the two terms interchangeably. Allport (1942:95) has made the point that ‘the spontaneous, intimate diary is the personal document par excellence’. Many researchers have kept diaries as a self-evaluative tool for their own experience. The most notable study of the diary-keeping method is described by Bailey (1990). She used the diary study approach as one option for the classroom-centered research project required in the practicum. The resulting journals focused on issues related to lesson planning and creativity, time management, problems faced by non-native teachers of English, classroom control, group work, and difficult student-teacher relations. Bailey’s (1990:218) overall impression is that diaries were often extremely useful exercises for the teachers-in-preparation, both in generating behavioural changes and in developing self-confidence.

She identifies the following requirements for writing diary entries:

a) to set aside time each day immediately following the class, in a pleasant place free of interruptions;

b) the time allotted to writing about the language teaching or learning experience should at least equal the time spent in class;

c) to set up the conditions for writing so that the actual process of writing is or can become relatively free, since getting started is difficult;

d) in recording entries in the original uncensored version of the diary, one should not worry about style, grammar, or organisation. The goal is to get complete and accurate data while the recollections are still fresh.

Her studies reveal some problems in keeping diaries. In actual practice, students experience difficulties in describing events freely, and the process of writing seems tedious to them; they are not used to criticising, reflecting, expressing frustration, or raising questions in written form. Some students were reluctant to edit their private journals.

Porter, Goldstein, Leatherman, and Conrad (1990:240) consider that the journal is not a personal diary. They emphasise that the journal is a place to go beyond notes made during observation by exploring, reacting, and making connections. The journal entries are intended to be polished pieces of writing. However, journals, like diaries, are not assessed. The problem with assessment is that there is no rigid regulation of the frequency of entries per day or week; it depends on the nature and structure of the course. At the same time, writing every week is considered to be productive since the journal is meant to be ongoing. Sometimes students need to process what they are reading and make connections among a number of readings.

Porter et al. (1990:287) see the benefits of using journals as follows:

1) students can get help with areas of course content where they are having difficulty; get a teacher’s response;

2) they promote autonomous learning, encouraging students to take responsibility for their own learning and to develop their own ideas;

3) students can gain confidence in their ability to learn, to make sense of difficult material, and to have original insights;

4) the journal encourages students to make connections between course content and their own teaching;

5) the journals create interaction beyond the classroom, both between teacher and student, and among students. It allows an ongoing dialogue between teacher and students;

6) the journals make the class more process-oriented. Students’ input can in part shape the curriculum. The teacher can use this information to restructure the course.

Anecdotal records

McKernan (1996:67) describes anecdotal records as narrative, verbatim descriptions of meaningful incidents and events which have been observed in the behavioural setting. They focus on narrative, conversation and dialogue and provide short, sharp, incisive summaries of points that stick in the mind after the event. Anecdotal records are considered useful in teacher training because they provide directly observed behavioural data which enable students to ‘see’ the incident and gain an ‘inside’ perspective. One of the key tasks for the observer is to watch for the beginning and ending of ‘episodes’ of behaviour. McKernan (1996:68) lists some disadvantages of anecdotal records which, as with any piece of descriptive writing, are similar to those of diaries and journals:

1. they require extensive time to observe, write and interpret;

2. maintenance of ‘objectivity’ is difficult;

3. observers require training in the use of anecdotes;

4. they are often reported without taking account of the setting;

5. read out of context, they can be misunderstood and misinterpreted;

6. some observers focus on ‘negative’ or ‘undesirable’ events only.

Personal action logs

McKernan (1996:110) defines personal action logs as record sheets which document a researcher’s activities over a lengthy time period ‘to get a full-blown representation’ of a day. Thornbury (1991:141) clarifies the purpose of log-keeping as ‘to direct trainees’ attention towards areas they may have overlooked or avoided; to measure the trainees’ assessment against our own; to make adjustments, if necessary, to the course design and/or content’. Logs may be kept in chart summary form, describing the main events with time sampling, or in a more descriptive form similar to a diary. McKernan (1996:111) recommends that personal logs be kept over a lengthy period of time and in connection with more extensive accounts, such as field notes, diaries and audio transcripts, to validate findings.

Check-lists

The use of check-lists presupposes the formulation of well-defined and ‘clearly delineated behaviour categories, which in turn presupposes more than a superficial acquaintance with the data’ (Hutt and Hutt 1970:38). A check-list is used to focus ‘the observer’s attention to the presence, absence, or frequency of occurrence of each point of the prepared list as indicated by checkmarks’ (Hopkins and Antes 1985:467). Thus a prerequisite for obtaining reliable and valid data from check-lists is a set of clearly defined categories. For this reason a check-list would be unsuitable for recording behaviour with which the observer was not completely familiar or for recording the complete range of activities in a free-field situation. Researchers confirm that although in principle a large number of categories is feasible, in practice an observer is unable to cope reliably with more than fifteen. Different methodologists note that as the number of categories increases, so do the problems involved in scanning them. That is why Hutt and Hutt (1970:69) recommend, from a practical point of view, keeping check-lists as compact as possible, since they are most commonly used in situations where the observer is attempting to record unobtrusively and with the minimum of distraction to the subject.

The greatest advantage of check-lists is the ease and speed with which they can be analysed, as the observer simply ticks off each phenomenon observed against the appropriate category. Measures that can easily be obtained are as follows:

1. frequency with which there is a change in activity;

2. number of different activities;

3. number of stimuli encountered;

4. duration of specific activity;

5. changes in nature and duration of activities with time.
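To show how such measures could be derived from a completed check-list, the sketch below assumes a hypothetical time-stamped log in which the observer notes the minute at which each activity begins. The activity labels, the log format and the lesson length are all assumptions made for illustration.

```python
from collections import defaultdict


def checklist_measures(log, lesson_end):
    """Derive simple measures from a chronological activity log.

    log: list of (start_minute, activity) pairs in order of occurrence.
    lesson_end: minute at which the lesson finished.
    """
    durations = defaultdict(int)
    changes = 0
    for i, (start, activity) in enumerate(log):
        # An activity lasts until the next entry begins or the lesson ends.
        end = log[i + 1][0] if i + 1 < len(log) else lesson_end
        durations[activity] += end - start
        if i > 0 and activity != log[i - 1][1]:
            changes += 1
    return {
        "changes_in_activity": changes,
        "different_activities": len(durations),
        "duration_by_activity": dict(durations),
    }


# Hypothetical 45-minute lesson.
log = [
    (0, "whole-class"),
    (10, "pair work"),
    (25, "whole-class"),
    (35, "individual"),
]

measures = checklist_measures(log, lesson_end=45)
print(measures)
```

Here the frequency of change, the number of different activities and the duration of each activity all come from the same ticked record, without any interpretation by the observer.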

However, McKernan (1996:108) cautions that the arrangement of the points is crucial: they should follow the logical sequence in which the task is completed. An observer or designer of this instrument must ensure that:

1. points to be observed are listed in their actual sequence of happening;

2. all similar attributes are included in categories;

3. all the relevant and specified points are listed.

Observation schemes

Over the years numerous schemes have been developed for recording classroom interaction. Chaudron (1988:19), modifying the analysis originated by Long (1980), identifies twenty-four various schemes. In his review Chaudron (1988:17) points out that Long (1980) has included only those instruments which were designed to observe verbal interaction in a classroom, whereas the range of categories is great due to various purposes of observation. Chaudron interprets categories as

a) social interactive (Allwright’s (1980:169) ‘turn-taking’ and ‘turn-giving’, Moskowitz’s (1970) ‘jokes’, ‘praises or encourages’)

b) pedagogical (Jarvis’s (1968:336) ‘classroom management’, ‘repetition reinforcement’, or Fanselow’s (1977:18) ‘solicit’, ‘respond’)

c) objective behaviour (Naiman, Fröhlich, Stern, and Todesco’s (1978) ‘student hand-raising’, ‘student callout’, or Moskowitz’s (1970) ‘student response - choral’)

d) semantic or cognitive content of behaviours (Fanselow’s (1977:31) ‘characterize’)

e) type and grouping of participants (Mitchell et al.’s (1981:19) ‘whole class’, ‘individuals doing the same task’)

For teacher training purposes Chaudron (1988:18) recommends eleven schemes, among which those of Capelle, Jarvilla, and Revelle (n.d.), Moskowitz (1970), Politzer (1980) and Seliger (1977) are coded in real time and use categories with a low degree of inference.

Wallace (1991:121) describes the advantages of interaction schemes as a basis for reflection on experiential knowledge, claiming that these systems

1) objectify the teaching process;

2) provide a reliable record (by a trained observer);

3) promote self-awareness in the teacher;

4) provide a meta-language, which enables teachers to talk about their profession;

5) make teacher training more effective by improving the quality of teaching.

At the same time systematic observation schemes have attracted some criticism. Delamont and Hamilton’s (1976:3) main critique is levelled at the use of pre-specified categories to ‘code’ or classify the behaviour of teachers and pupils, which cannot capture and reflect the whole complexity of classroom life.

Delamont and Hamilton (1976:8) identify seven criticisms of systematic observational systems:

1) Systematic observation provides data only about ‘average’ or ‘typical’ classrooms, teachers and pupils.

2) All the interaction analysis systems ignore the temporal and spatial context in which the data are collected: most systems use data gathered during very short periods of observation, and the observer is not expected to record information about the physical setting.

3) Interaction analysis systems are usually concerned only with overt, observable behaviour. Where intentions lie behind the observed behaviour, the observer must impute them himself/herself.

4) Interaction analysis systems are concerned with ‘what can be categorized or measured’ (Simon and Boyer 1986:1). They may obscure, distort or ignore the qualitative features which they claim to investigate, by having ill-defined boundaries between the categories.

5) Interaction analysis systems focus on ‘small bits of action or behaviour rather than global concepts’ (Simon and Boyer 1986:1). Delamont and Hamilton clarify that there is a tendency to generate a superabundance of data which must be linked either to a complex set of descriptive concepts or to a small number of global concepts.

6) The systems utilize pre-specified categories.

7) Placing arbitrary boundaries on continuous phenomena obscures the flux of social interaction.

Walker and Adelman (1976:136) emphasize the problems of recording child-child talk and of objectively incorporating this kind of talk into the normal flow of the teacher-centred classroom. They illustrate that there is no research instrument to code spontaneous talk or the social function of jokes and humour. ‘Talk is seen to be a highly complex, problematic activity, rich in contradictory and bizarre meanings and frequently with difficulties and confusions’ (Walker and Adelman 1976:137). This organisation is taken for granted in observation schemes.

Rating scales

McKernan (1996:118) reviews various styles of rating scales – category, numerical, graphic and pictorial. They all share the common feature of having a rater place an object, person or idea along a sequential scale in terms of its estimated value to the rater. Rating scales are treated as a helpful instrument for measuring non-cognitive areas where an observer is interested in cooperativeness, industriousness, tolerance, enthusiasm, or group skills. At the same time McKernan (1996:119) notes that all rating sheets need to

a) include observable behavior;

b) rate significant outcomes as opposed to minor or trivial behaviours;

c) employ clear, unambiguous scales, never using fewer than three or more than ten points on a scale;

d) arrange for several raters to observe the same phenomena to increase reliability of ratings;

e) keep items short and to the point.
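Two of these requirements, the limit on scale points and the use of several raters, lend themselves to a short sketch. The example below assumes a hypothetical five-point scale and two hypothetical raters; averaging the ratings is only one possible way of combining them and is not prescribed by McKernan.

```python
from statistics import mean


def validate_scale(points):
    """Enforce the recommended range of three to ten scale points."""
    if not 3 <= points <= 10:
        raise ValueError("a scale should have between three and ten points")


def mean_ratings(ratings_by_rater, points):
    """Average each item's score across all raters who rated it."""
    validate_scale(points)
    scores_by_item = {}
    for rater, ratings in ratings_by_rater.items():
        for item, score in ratings.items():
            if not 1 <= score <= points:
                raise ValueError(f"{rater} gave {item} an off-scale score: {score}")
            scores_by_item.setdefault(item, []).append(score)
    return {item: mean(scores) for item, scores in scores_by_item.items()}


# Hypothetical ratings of one learner group by two observers.
ratings = {
    "rater_a": {"cooperativeness": 4, "enthusiasm": 3},
    "rater_b": {"cooperativeness": 5, "enthusiasm": 3},
}

summary = mean_ratings(ratings, points=5)
print(summary)
```

Large differences between raters on the same item would indicate that the item is not phrased in terms of observable behaviour.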

Rating scales are often contrasted with direct observation as an assessment strategy. Sattler (1982:33) points out that rating-scale results may not correspond with data obtained by direct observation. He suggests that internal consistency and ‘inter-rater’ reliability are important features of behaviour rating scales (Sattler 1982:34). Another criticism of observational data obtained through ratings is that they involve human judgment and that the sample of behaviour may be limited.

Selective verbatim

This technique is described by McKernan (1996:170). Unlike interaction analysis, the selective verbatim technique is directed at studying ‘selective’ verbal reactions. These are interactions that reflect effective or ineffective teaching. The procedure involves recording the actual words and analysing them afterwards. The main advantage of the selective verbatim technique is that it allows an observer to concentrate on one aspect of the teaching/learning behaviour at a time and provides an objective, non-interpretive record of verbal behaviour which can be analyzed later.

Observation tasks

An observation task is ‘a focused activity to work on while observing a lesson in progress’ (Wajnryb 1992:7). Like the selective verbatim technique it focuses on one or a small number of aspects of the teaching/learning process, but it covers nonverbal behaviour as well. The purpose of the task is to collect actual facts or patterns of interaction that emerge in a lesson. The advantage of collecting information with the help of selective tasks is that ‘it provides a convenient means of collecting data that frees the observer from forming an opinion or making an on-the-spot evaluation during the lesson’ (Wajnryb 1992:7).

As a general conclusion about the techniques of observation, I can say that some of them involve either too broad or too narrow a study of the teaching process. This does not suit the main objectives of the Observation Weeks of the Teaching Practicum, which are to acquaint trainees gradually with all the facets of the complex teaching/learning process and to practise and develop their observation skills.

2.7. Evaluation of documents

2.7.1 Criteria for manual evaluation

The data evaluation process in qualitative and quantitative research is a complex, laborious and time-consuming procedure. In social research there are two main approaches to the analysis and evaluation of data: manual and computer-based. In the former case qualitative research evaluation is treated as ‘intuitive, idiosyncratic and creative’ (Stroh 2000:226). Owing to the immersive nature of participant observation and closeness to the subject, a researcher is inclined to see things from the member’s perspective. Thus Cohen and Manion (1994:52) suggest evaluating materials in two stages: ‘external’ and ‘internal criticism’. External criticism is concerned with establishing the ‘authenticity’ (Scott 1990:37) or genuineness of the material. It is aimed at the document itself rather than the statements it contains, and endeavors to analyse the form of the data rather than their interpretation or meaning. That is why it sets out to discover frauds, inventions or distortions. A set of questions proposed by Platt (1981) can be employed to test the authenticity of observation material:

Does the document make sense or does it contain glaring errors?

Are there different versions of the original document available?

Is there consistency of literary style, handwriting or typeface?

Has the document been transcribed by many copyists?

Does the version available derive from a reliable source?

Internal criticism deals with the accuracy of the data presentation and an evaluator has to establish ‘credibility’, ‘representativeness’ and ‘meaning’ (Scott 1990:53) of the document.

Credibility refers to the question of whether the task is ‘free from error and distortion’ (Macdonald 2001:204). The latter may occur when the comments and discussion were written a long time after the actual observation, or when the account has passed through different hands and the author was not present at the lesson. The task is considered to be representative if all its aspects have been completed accurately. If some categories are missing, the evaluator should consider what is missing, how much, and why.

Representativeness can be affected by the interest or bias of the author: wishing to please the reader, or acting under pressure, out of fear or vanity, the writer can distort or omit some facts.

The meaning of a docu