Fundamental considerations of the evaluation process: goals, reliability, validity and utility.
Grundsätzliche Überlegungen zu Evaluationsverfahren: Ziele, Verläßlichkeit, Validität und Nutzen
1. Introduction
Answers to the question of how to assess and improve instructional quality in higher education have been sought for several years. Many studies have been conducted to identify components of the instructional process that determine its effectiveness, to develop measurement instruments, and to synthesize evaluation models. Over the past few years, attention has shifted from measurement issues to processes of decision-making, the management of university organizations, and the use of evaluation data (GIJSELAERS & WOLFHAGEN, 1996).
Instructional evaluation is nowadays widely applied at North American universities. It is used by administrators to make decisions about the promotion and tenure of instructors, by teachers as feedback to improve their courses, and by students to select courses (MARSH, 1984). Many US medical schools have adopted evaluation models to ensure educational quality. IRBY (1993) reports that today almost all medical schools in the United States evaluate teaching with student ratings and that 63% of medical schools incorporate other forms of evaluation such as peer review of teaching, portfolios, or teaching files. Evaluation is mandatory in more than 80% of all courses.
The European situation differs quite distinctly. Here, evaluation has only recently gained more attention. For example, in the Netherlands (GIJSELAERS, 1988; WOLFHAGEN, 1993) and Germany (EITEL et al., 1992) successful attempts have been made to implement evaluation models in medical schools to improve medical education. In contrast to most US medical schools, the major purpose of these models is to use evaluation data for instructional improvement rather than for accountability purposes (decisions about the promotion and tenure of faculty).
An obvious question is why evaluation has gained so much attention in medical education. Two kinds of factors may be identified that have changed established perspectives on medical education. First, professional bodies have severely criticized the contents and nature of traditional medical education. For example, the publication of the GPEP report (1984) and the Edinburgh Declaration (1989) led to a reorientation in medical education. According to these reports, medical education should, among other things, pay more attention to the integration of basic science teaching and clinical instruction, emphasize the development of self-learning skills, and promote problem-solving skills. Second, educational research shows that traditional medical education is regarded as very demanding. It places a heavy burden on medical students to retain the knowledge they acquired during their basic science studies until it is needed in clinical work (BARROWS, 1984). As a consequence, students are often poorly motivated, have disappointing learning results, and are not able to use their knowledge when confronted with patients in clinical clerkships. FELTOVITCH et al. (1991) argue that in medical education students encounter serious difficulties in applying conceptual basic science knowledge to clinical problem-solving tasks. MANDL et al. (1993) found that medical students did not sufficiently relate signs and symptoms to diagnosis formulation, ignored information that did not fit their primary hypothesis about a patient's disease, and were not capable of restructuring and synthesizing the information presented in a case.
Given the pressure for curriculum reform in medical education, the need for "change" instruments has grown. In this respect, evaluation is regarded as an instrument that may serve the purpose of change and improvement in medical education, provided that certain conditions are met (GIJSELAERS & WOLFHAGEN, 1996). The present article focuses on basic issues in evaluation: evaluation goals, measurement issues, and the utilization of evaluation results. The question addressed is how evaluation can be an effective instrument for changing and improving medical education.
2. Evaluation Goals
Evaluation may serve two basic purposes: improvement and accountability. In the case of improvement purposes, evaluation may be used to generate information that is suitable for staff development activities, course and program improvement, changes in the design of skills training courses, etc. Evaluation for accountability purposes yields information that is used for career decisions (e.g. promotion or tenure of faculty) or for decisions about school status (e.g. ranking of institutions or accreditation).
DARLING-HAMMOND, WISE and PEASE (1983) argue that different evaluation methods are better suited to one or the other of these purposes. Focusing on individual or organizational concerns leads to different evaluation methods. For instance, teachers are likely to require different input, according to their stage of professional development and their particular teaching situations. They expect information which offers them the opportunity to change their particular courses or facets of teaching behavior. Feedback that does not address what teachers want to find out about their teaching is not accepted and is not perceived as valuable for improving teaching (ROTEM & GLASMAN, 1979). By contrast, decision-makers are primarily interested in information providing them with an overview of the program as a whole.
According to DARLING-HAMMOND et al. (1983), an emphasis on one evaluation purpose may tend to limit the pursuit of another because of the conflicting information needs mentioned. It is therefore important to consider which purposes are best served by the available evaluation methods. For example, test results may be useful to provide indications about the quality of graduates or the quality of a curriculum, but they hardly provide useful information on how to improve particular details of individual courses. The use of outcome data to assess educational quality is based on the assumption that quality is determined by the effectiveness of a program. It is argued that educational effectiveness equals the degree to which an instructor or program facilitates student learning, that is, the amount students learn. Hence, the degree of concordance between educational goals and educational outcomes may serve as an indicator of educational quality. However, outcome data (test results, assessment data, etc.) have been shown to be of limited use for purposes of educational improvement, because they do not provide information about the causes that lead to certain outcomes.
3. Reliability and Validity Issues
An essential question is how components of high-quality teaching may be defined, and how acceptable teaching practices can be distinguished from unacceptable ones. Most evaluation approaches assess quality against a set of prespecified objectives. It is assumed that educational quality is reflected by educational effectiveness. It is argued that educational effectiveness equals the degree to which an instructor facilitates student achievement: the amount students learn in a particular course (MCKEACHIE, 1979). Hence, the degree of concordance between student performance and intended outcomes determines educational quality. Consequently, evaluation is required to identify characteristics of classroom teaching which facilitate the learning process and the learning outcomes of students. Measures are needed that describe teaching behavior which facilitates student learning and which in turn may provide data for program improvement.
Student ratings are generally seen as an instrument that meets these requirements (MARSH, 1984). The Medical Faculty of the University of Limburg has implemented an evaluation system which uses student ratings as the primary data source to describe instructional processes. There are other advantages in the use of student ratings. Student ratings are relatively inexpensive and have a high degree of reliability, usually ranging from .80 to .90 and above (FELDMAN, 1976). In addition, meta-analyses have shown moderate correlations between student ratings of teachers and student achievement (COHEN, 1981). These results are considered to support the validity of student ratings as a measure of teaching effectiveness (ABRAMI, D'APPOLONIA & COHEN, 1990). Furthermore, research findings such as MARSH's studies (MARSH, 1984) have demonstrated that student ratings have a multidimensional nature, are stable, relatively valid against a variety of indicators of effective teaching, relatively unaffected by potential biases, and considered useful by faculty as feedback about their teaching. In conclusion, the majority of studies shows that student ratings can be used in a reliable and valid way to evaluate parts of the instructional process. Despite these findings, however, the quality of student ratings is always questioned, especially when the outcomes of an evaluation are not in favor of a course, teacher or program. So, before we continue with the use of student evaluation data, we will take a closer look at the nature of these data.
The use of student ratings of instruction assumes that they measure dimensions of quality: measures that contain descriptions of the instructional process. It is generally assumed that the raters, the students, are representative (except for random error) in their measurement of instructional attributes or characteristics. That is, students within a class will give a similar rating on an item measuring an element of teacher behavior, except for some random error. The amount of random error determines the reliability of the ratings. Research has shown that random error, also called within-class variability, is normally modest or small, depending on the number of students and the quality of the items, resulting in sufficiently high degrees of reliability. For example, GIJSELAERS and SCHMIDT (1991) found that interrater reliabilities within small groups (when based upon 7 ratings) are normally above .80. This result is regarded as highly satisfactory.
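To make this concrete, the sketch below estimates the reliability of a mean rating with a one-way random-effects intraclass correlation, ICC(1,k). The data and the exact analysis are hypothetical illustrations, not those used by GIJSELAERS and SCHMIDT (1991).

```python
# Minimal sketch (hypothetical data, not the analysis of the study cited):
# estimate the interrater reliability of mean ratings when each target
# (e.g. a course or tutorial group) is rated by k students on a 5-point item.
import numpy as np

def icc_of_mean_rating(ratings: np.ndarray) -> float:
    """One-way random-effects ICC for the mean of k raters, ICC(1,k).

    ratings: array of shape (n_targets, k_raters), e.g. courses x students.
    """
    n, k = ratings.shape
    target_means = ratings.mean(axis=1)
    grand_mean = ratings.mean()
    # Between-target and within-target mean squares from a one-way ANOVA.
    ms_between = k * np.sum((target_means - grand_mean) ** 2) / (n - 1)
    ms_within = np.sum((ratings - target_means[:, None]) ** 2) / (n * (k - 1))
    return (ms_between - ms_within) / ms_between

# Hypothetical example: 10 courses, each rated by 7 students.
rng = np.random.default_rng(0)
true_level = rng.uniform(2.5, 4.5, size=(10, 1))               # "true" quality per course
ratings = np.clip(np.round(true_level + rng.normal(0, 0.5, (10, 7))), 1, 5)
print(f"ICC(1,k) of the course means: {icc_of_mean_rating(ratings):.2f}")
```

With a modest within-class variability relative to the differences between courses, such a sketch reproduces reliabilities of the order reported above.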
Whether differences in ratings between students within classes are to be interpreted as random or as systematic has been an issue of considerable debate in the literature on student ratings. As FELDMAN (1977) pointed out: "together with the fact that students are not trained as either observers or raters, it might well be expected that ratings done in typical classroom settings would be dependent to some extent on the characteristics and experiences of the student observers". Consequently, the issue is whether dissimilarities in ratings reflect not only random error but also systematic error. If the latter is the case, then some of the patterned variability in ratings represents so-called true variance. Differences among students may then result from legitimate or genuine sources of influence on their ratings. However, studies on the relationship between student attributes and ratings have in general shown only weak relations. For example, FELDMAN (1977) mentions that only students' motivation and expected grade appear to be consistently related to student ratings of instruction. MARSH (1984) found that the effect of prior subject interest on student ratings of instruction was greater than that of any of 15 other background variables considered. Prior subject interest was most highly correlated with students' ratings of the course's learning value (rs about .4); correlations with dimensions of teaching behavior were lower (rs between .3 and -.12). According to MARSH (1984), higher student interest in the subject matter apparently created a more favorable learning environment and facilitated effective teaching.
Comparable results were found in a study by JONES (1981). He showed that students' ratings of teaching are in part related to what they consider good teaching. Students' criteria for what constitutes good teaching may differ depending on their basic expectations of the course. Highly motivated students tended to give more favorable ratings for the same course than poorly motivated students. GIJSELAERS and NUY (1995) found that evaluations of teacher behavior were significantly, though modestly to weakly, correlated with intrinsic motivation (students' interest in the course subject matter). Other motivation variables were not related to ratings of tutor behavior. In general, it may be concluded from reviews of the student ratings literature (FELDMAN, 1977; MARSH, 1984) that the effects of background variables tend to be small, except for motivation variables and students' expectations.
4. Utility
IRBY (1993) formulates several guidelines for a comprehensive system to evaluate and improve teaching. In our view, these guidelines may be characterised as: 1) pay attention to the quality of data collection, and 2) take care of the organisational context and change management. Studies at the University of Limburg confirm IRBY's notion that, next to the quality of data collection, additional factors play a decisive role in the utilization of student ratings: acceptance of the evaluation results, the quality of communication between evaluator and teacher, the way of presenting evaluation results, the credibility of the evaluator, and teacher commitment. In fact, GIJSELAERS (1988) and WOLFHAGEN (1993) found that although the quality of evaluation data is a necessary prerequisite for change, it is not the essential factor.
4.1. Design of Measurement Instruments
If we consider IRBY's first factor, "quality of data collection", the question arises how evaluation instruments can be constructed. Usually, a factor-analytic approach is used in investigations of student rating data to identify the essential components of instruction. For example, most evaluation instruments for lecturing are typically based on an empirical approach. Students and teachers are asked to specify the behaviors they feel are most important for superior teaching practices. Subsequently, their answers are factor analyzed to identify the common dimensions underlying these answers, which in turn are considered to represent characteristics of good teaching (FELDMAN, 1976). The general outcome of these studies is that six to nine factors are identified that distinguish between effective and ineffective lecturing (MARSH, 1984):
- "Skill", teacher's capability to teach a class;
- "Structure", teacher's preparation and organization of a course;
- "Workload", teacher's requirements set for students;
- "Rapport", teacher's interest in individual students, closeness to students;
- "Instructor-group interaction", teacher's interaction within class;
- "Feedback", the nature of teacher's comments on student's work;
- "Learning/value", student's perceived relevancy of a course;
- "Enthusiasm", teacher's motivation to teach a class;
- "Breadth of coverage", teacher's presentations of background of various theories.
Meta-analyses (COHEN, 1981) have shown that some of these factors are significantly correlated with student achievement: e.g. "skill" (r = .50), "structure" (r = .47). The nine factors may serve as a basis to construct evaluation questionnaires for courses that follow a traditional lecturing format. Literature on student ratings contains many examples of questionnaires that use these factors as the starting point for the design of evaluation items (e.g. BRASKAMP, BRANDENBURG & ORY, 1984).
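As an illustration of this factor-analytic approach, the sketch below runs an exploratory factor analysis on simulated rating items; the items, factor labels and loadings are invented for the example and do not reproduce any of the published instruments.

```python
# Hedged illustration of the factor-analytic approach on simulated (not real)
# rating data: items that share an underlying teaching dimension end up
# loading on the same factor.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(1)
n_students = 300
latent = rng.normal(size=(n_students, 2))       # two hypothetical dimensions, e.g. "Skill", "Structure"
loadings = np.array([[0.8, 0.1], [0.7, 0.2], [0.9, 0.0],    # items 1-3 load on factor 1
                     [0.1, 0.8], [0.0, 0.9], [0.2, 0.7]])   # items 4-6 load on factor 2
items = latent @ loadings.T + rng.normal(0, 0.4, (n_students, 6))
items = np.clip(np.round(3 + items), 1, 5)      # map onto a 1-5 rating scale

fa = FactorAnalysis(n_components=2, rotation="varimax").fit(items)
print(np.round(fa.components_.T, 2))            # estimated item loadings per factor
```

Items with high loadings on the same factor are then read as indicators of one dimension of teaching, which becomes a scale of the evaluation questionnaire.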
Less frequently, the identification of desired teaching behavior is grounded in theories of school learning. At the Maastricht medical school, evaluation questionnaires have been designed based on models of school learning (SCHMIDT, DOLMANS, GIJSELAERS & DES MARCHAIS, 1995). These models assume that learning within the context of a school can be described by input variables, notably student characteristics, teacher behavior and the adequacy of learning materials; process variables such as the learning activities carried out by students, time on task and features of the instructional process; and finally output variables (e.g. achievement) and affective outcomes such as interest in the subject matter studied. For example, in Maastricht the so-called CTEQ (Clinical Teaching Evaluation Questionnaire) was developed exclusively to evaluate clinical clerkships (WOLFHAGEN, 1993; WOLFHAGEN, VLEUTEN & ESSED, 1995). It consists of fourteen domains related to an effective learning environment in the setting of the clinical clerkship. Each domain reflects a particular aspect of clinical education: 1) preparedness for the clerkship, 2) quality of the clerkship manual, 3) personal supervision in general, 4) supervision of technical performance, 5) attention to non-patient related affairs (context of patient situations), 6) quality of educational activities not associated with daily patient care, 7) patient mix with respect to problem variation and frequency, 8) quality of patient-related learning situations (ward attendance, participation in patient rounds), 9) quality of the outpatients' clinic (office facilities for examining patients), 10) quality of the learning facilities available, 11) quality of clinical competency assessment procedures (end-of-clerkship examination), 12) opportunity for self-study as a result of patient contacts, 13) global judgement of clerkship quality (atmosphere, organisation, and instructiveness), and 14) spread of activities over time. An example of the CTEQ is included in the appendix of this article.
The CTEQ contains 56 items, covering the fourteen domains mentioned above. The majority of items are rated on a five-point Likert-type scale, ranging from (1) totally disagree to (5) totally agree. Some items require a write-in response. Studies by GIJSELAERS (1988) and WOLFHAGEN (1993) have shown that this theory-guided approach to designing evaluation questionnaires is valid and useful. For a review of the construction of evaluation questionnaires we refer to the work of MARSH (1984), IRBY (1993) and SCHMIDT et al. (1995).
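To show how a domain-structured instrument such as the CTEQ can be scored, the sketch below aggregates Likert item responses into domain means; the item-to-domain mapping and the data are invented and do not reproduce the actual CTEQ scoring key.

```python
# Hypothetical sketch of scoring a domain-structured questionnaire: 1-5 Likert
# responses are averaged per domain. The mapping below is invented and does not
# reproduce the CTEQ key.
import numpy as np

domain_items = {                     # hypothetical item indices per domain
    "preparedness for the clerkship": [0, 1, 2],
    "personal supervision in general": [3, 4, 5, 6],
    "patient mix": [7, 8],
}

def domain_scores(responses: np.ndarray, mapping: dict) -> dict:
    """responses: (n_students, n_items) matrix of 1-5 ratings; NaN = item crossed out."""
    return {domain: round(float(np.nanmean(responses[:, idx])), 2)
            for domain, idx in mapping.items()}

rng = np.random.default_rng(2)
responses = rng.integers(1, 6, size=(25, 9)).astype(float)     # 25 students, 9 items
responses[rng.random(responses.shape) < 0.05] = np.nan         # items marked irrelevant
print(domain_scores(responses, domain_items))
```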
4.2. Evaluation Procedure
GIJSELAERS (1990) and GIJSELAERS and WOLFHAGEN (1996) provide detailed descriptions of how the medical school in Maastricht applies an evaluation approach that focuses on identification of deficiencies in courses, provides guidelines for improvement, and implements changes whenever necessary. In this section a summary of the major features of the approach will be given.
The evaluation procedure is characterized by its emphasis on comparative data and its compromise between uniformity and flexibility. The procedure comprises two distinct stages. The first stage involves a general screening of all courses, carried out with the help of the evaluation instruments mentioned above. Standardized questionnaires, based on theory-guided models of teaching and learning, are used to evaluate all courses in the curriculum. Provision for supplemental questions at the end of the standard instruments allows individual teachers to design items unique to their specific needs. Shortly afterwards, an evaluation report is written on each course, presenting the findings on both the standard and the specific questions. The report includes factor scores, overall summary ratings, and graphic comparisons with all other courses. The evaluator's interpretations of the evaluation findings and an inspection of the course materials are added, and suggestions for possible improvement of the course are made. Usually the evaluation report is discussed with the persons in charge of the course. This is considered important for the actual utilization of the evaluation findings.
Copies of the report are also sent to the curriculum committee of the medical faculty. This committee bears major responsibility for the choice of curricular contents, the instructional methods applied, and the operational management of the educational program. A summary of the report containing students' ratings of the functioning of the discussion group and the tutor is returned to each individual tutor. This report serves the purpose of individualized feedback.
The second stage concentrates on courses that show unacceptable or persistent deficiencies. The problem is obvious: how does one decide whether or not a course is deficient? The literature on educational evaluation supplies only a few tentative answers. Our solution to this problem is a comparative approach. Comparisons are made between courses for similar components of the instructional process. As a rule of thumb, attention is paid only to differences that are interpretable, i.e. differences between courses of more than .5 scale point (on a five-point scale). This decision rule makes clear that the standards used to judge educational quality are not absolute but relative: the mean rating of all courses serves as the standard.
The decision whether or not a course, or part of it, is deficient is not based solely on a deviating score on a single item. In most cases a pattern of deviation emerges. If a course falls short on one item with respect to an element of that course, it will tend to fall short on related items concerning this element of the course. This pattern enables the curriculum evaluation project to formulate hypotheses on the nature of the deficiency. These hypotheses can be tested in further in-depth research. GIJSELAERS and WOLFHAGEN (1996) present various examples of how in-depth evaluation research can be conducted to increase the understanding of problems identified in stage 1.
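A sketch of this comparative decision rule is given below; the course names, items and ratings are invented. An item is flagged when a course falls more than .5 scale point below the mean of all courses, and a deficiency is only suspected when several related items are flagged.

```python
# Illustrative sketch of the comparative rule of thumb (invented data): flag an
# item when a course's mean rating falls more than 0.5 scale point below the
# mean of all courses on that item; suspect a deficiency only for a pattern.
import numpy as np

items = ["course structure", "tutor functioning", "learning materials"]
courses = {
    "Course 1.1": [4.1, 3.9, 4.0],
    "Course 1.2": [3.2, 3.1, 3.3],   # consistently low: a pattern, not one stray item
    "Course 1.3": [4.0, 4.2, 3.9],
}

ratings = np.array(list(courses.values()))   # shape: (n_courses, n_items)
item_means = ratings.mean(axis=0)            # relative standard: mean of all courses

for course, scores in courses.items():
    flagged = [item for item, score, mean in zip(items, scores, item_means)
               if mean - score > 0.5]
    if len(flagged) > 1:
        print(f"{course}: possible deficiency, falls short on {flagged}")
    elif flagged:
        print(f"{course}: isolated low score on {flagged[0]}; monitor only")
```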
4.3. Organization of the Evaluation
Obviously, running an evaluation program like the one at the Maastricht Medical School costs money. Maastricht Medical School has organized the evaluation around a project team consisting of educational psychologists, a physician, basic science teachers and teachers from clinical departments. The daily evaluation work is carried out by an educational psychologist (.8 full-time equivalent), who is responsible for the actual evaluation. Support is given by a research assistant (.5 full-time equivalent) and a member of the secretarial staff (.5 full-time equivalent). The other members of the project team are allocated two hours a week for the evaluation project. These members are responsible for communicating evaluation findings to the medical departments, advising the educational psychologist, providing feedback on the quality of the evaluation (regarding evaluation instruments, evaluation reports, etc.), and developing ideas, if necessary, for new evaluation studies. This model of organizing evaluation through a project team is regarded as essential for making evaluation effective. It increases faculty commitment to evaluation, and evaluation is not regarded as a playground for educational psychologists. That is, evaluation becomes part of planning the educational process, with responsibilities shared among medical teachers.
Student enrollment in Maastricht is 150 students a year. Drop-out rates are extremely low (about 10%). The majority of students graduate after six years. The medical program consists of 24 preclinical courses in years 1-4. Years 5 and 6 are dedicated to clinical education. All courses and clinical clerkships are evaluated by the evaluation project team.
5. Conclusions
Since 1981, the Maastricht Medical School has adopted an evaluation approach that provides information about shortcomings in the program, offers guidelines for improvement, and enables change whenever necessary. During the first few years of evaluation activities, faculty debates focused on measurement issues. It was felt that reliability and validity were the essential requirements for change. However, although the reliability and validity of student ratings were reasonably well supported by research findings (e.g. GIJSELAERS, 1988), awareness increased that organizational change is not only a matter of having data available. Medical schools are like any other organization. If they are receptive to management issues, they will learn that organizational behavior is also a matter of organizational control and organizational culture. MARSH (1984) noted in his review article that "Any procedure used to evaluate teaching effectiveness would prove to be threatening and highly criticized". In addition, he observed that "much of the debate is based on ill-founded fears about student ratings, but the fears still exist". The Maastricht experiences show that evaluation is also a matter of how an organization pays attention to teaching, in addition to research and patient care. Prominent questions are whether high-quality teaching is rewarded (formally or informally), whether central curriculum committees have the courage and power to discuss the educational quality of courses run by their colleagues, and whether professional bodies in medicine pay attention to the proficiencies required of graduates working in professional practice. If an organization is ready to handle these kinds of questions, evaluation may become an effective instrument for quality improvement. If not, evaluation will be reduced to mere ritual data-processing.
6. References
ABRAMI, P.C., D'APPOLONIA, S., & COHEN, P.A. (1990). Validity of student ratings of instruction: what we know and what we do not. Journal of Educational Psychology, 82, 219-231.
Association of American Medical Colleges (1984). Physicians for the Twenty-First Century. Report of the Project Panel on the General Professional Education of the Physician and College Preparation for Medicine (GPEP-report). Washington.
BARROWS, H.S. (1984). A specific, problem-based, self-directed learning method designed to teach medical problem-solving skills, self-learning skills and enhance knowledge retention and recall. In H.G. SCHMIDT & M.L. De Volder (Eds.), Tutorials in Problem-based learning. Maastricht, the Netherlands: Van Gorcum.
BRASKAMP, L.A., BRANDENBURG, D.C., & ORY, J.C. (1984). Evaluating Teaching Effectiveness. A practical Guide. Beverly Hills: Sage Publications.
COHEN, P.A. (1981). Student ratings of instruction and student achievement: a meta-analysis of multisection validity studies. Review of Educational Research, 51, 281-309.
DARLING-HAMMOND, L., WISE, A.E., & PEASE, S.R. (1983). Teacher evaluation in the organizational context: a review of the literature. Review of Educational Research, 53, 285-328.
EITEL, F., KANZ, K., SEIBOLD, R., SKLAREK, J., FEUCHTGRUBER, G., STEINER, B., NEUMANN, A., SCHWEIBERER, L., HOLZBACH, R., & PRENZEL, M. (1992). Verbesserung des Studentenunterrichts - Sicherung der Strukturqualität Medizinischer Versorgung. Chirurgische Klinik und Poliklinik, Ludwig-Maximilians-Universität München.
FELDMAN, K.A. (1977). Consistency and variability among college students in rating their teachers and courses. Research in Higher Education, 9, 199-242.
FELTOVITCH, P.J., COULSON, R.L., SPIRO, R.J., & DAWSON-SAUNDERS, B.K. (1991). Knowledge Application and Transfer for Complex Tasks in Ill-Structured Domains: Implications for Instruction and Testing in Biomedicine. In D.A. EVANS & V.L. PATEL (Eds.), Advanced Models of Cognition for Medical Training and Practice. Berlin: Springer-Verlag.
GIJSELAERS, W.H. (1988). Kwaliteit van het onderwijs gemeten (Measuring instructional quality). PhD-thesis. Maastricht, the Netherlands: University of Limburg.
GIJSELAERS, W.H. (1990). Curriculum evaluation. In C. van der VLEUTEN & W.H.F.W. WIJNEN (Eds.), Problem-based learning: Perspectives from the Maastricht experience (pp 51-62). Amsterdam: Thesis Publishers.
GIJSELAERS, W.H., & NUY, H. (1995, April). Effects of motivation on students' ratings of tutor behavior. Paper presented at the Annual Meeting of the American Educational Research Association, April 18-22, San Francisco, Calif. (ERIC Document Reproduction Service ED 383668).
GIJSELAERS, W.H., & SCHMIDT, H.G. (1991, April). Using students' ratings as measure for educational quality: the case of problem-based medical education. Paper presented at the Annual Meeting of the American Educational Research Association, Chicago, Ill.
GIJSELAERS, W.H., & WOLFHAGEN, H.A.P. (1996). Konsequenzen der Lehrevaluation -die Maastrichter Erfahrungen. In E. NEUGEBAUER, & J. KOEBKE (Eds.). Qualität der Lehre. München, Berlin: Urban & Schwarzenberg.
IRBY, D. (1993). Evaluating instructional scholarship in medicine. Journal of the American Podiatric Medical Association, 83, 332-337.
JONES, J.J. (1981). Students' models of university teaching. Higher Education, 10, 529-549.
MANDL, H.M., GRUBER, H., & RENKL, A. (1993). Das träge Wissen. Psychologie Heute, September, 64-69.
MARSH, H.W. (1984). Students' evaluations of university teaching: dimensionality, reliability, validity, potential biases, and utility. Journal of Educational Psychology, 76, 707-754.
MCKEACHIE, W.J. (1979). Student Ratings of Faculty: A Reprise. Academe, 65, 384-397.
ROTEM, A., & GLASMAN, N.S. (1979). On the effectiveness of students' evaluative feedback to university instructors. Review of Educational Research, 49, 497-511.
SCHMIDT, H.G., DOLMANS, D., GIJSELAERS, W.H., & DES MARCHAIS, J.E. (1995). Theory-guided design of a rating scale for course evaluation in problem-based curricula. Teaching and Learning in Medicine, 7, 82-91.
WOLFHAGEN, H.A.P. (1993). Kwaliteit van klinisch onderwijs (Quality of Clinical Education). PhD-thesis. Maastricht, the Netherlands: Universitaire Pers Maastricht.
WOLFHAGEN, H.A.P., VLEUTEN, C.P.M., & ESSED, G.G.M. (1995). Improving the quality of clinical clerkships using program evaluation results. In A.I. ROTHMAN & R. COHEN (Eds.), Proceedings of the Sixth Ottawa Conference on Medical Education (pp. 484-487). Toronto: University of Toronto Bookstore Publishing.
World Federation for Medical Education (1989). The Edinburgh Declaration. Ann. Community-Oriented Education, 2, 111-113.
7. Appendices:
7.1 CTEQ
To: Students of years 5 and 6 serving a clerkship
From: Project Programme Evaluation
Subject: Programme Evaluation in a Clinical Setting
Dear Students,
The Faculty of Medicine is collecting information on the effectiveness of the clinical clerkships. A very important aspect is the evaluation by students of the clerkship or clerkships they have followed. We therefore request you to fill in the evaluation form accompanying this letter. The results will be reported to all those involved in clinical training. The data reported will be anonymous.
Most of the items on the questionnaire are statements, to which you are asked to respond by circling a number.
1 means that you "disagree entirely" with the statement
2 means that you "disagree"
3 means "neutral"
4 means that you "agree"
5 means that you "agree entirely" with the statement.
If you consider a particular statement to be irrelevant, please cross out the number of the statement considered.
The questionnaire concludes with some open questions.
Insofar as possible, this questionnaire has been adapted specifically to the various clerkships. As a result, a number of questions (corresponding with a question number from an item bank) are not included in this list.
Thank you for your cooperation.
Programme Evaluation Project
Department of Educational Development and Educational Research
Faculty of Medicine
University of Maastricht, the Netherlands
7.2 Programme Evaluation Questionnaire for Clinical Setting
The questions below refer to the clerkship you have just concluded.
In questions 1 and 3, please tick where appropriate.
1. Clerkships concluded:
0 1. GP
0 2. Internal Medicine
0 3. Surgery
0 4. Gynaecology/Obstetrics
0 5. Paediatrics
0 6. Neurology
0 7. Ophthalmology
0 8. ENT
0 9. Dermatology
0 10. Psychiatry
2. This questionnaire refers to the clerkship
3. Location of present clerkship
0 1. Maastricht
0 2. Heerlen
0 3. Sittard
0 4. Roermond
0 5. Eindhoven
0 6. Other, namely: ....................................
4. Date of end of clerkship:
For example: