ASSIGNMENT No. 1
Q.1 What are the types of assessment? Differentiate assessment for learning, of learning, and as learning.
Formative assessment is used to monitor students’ learning and to provide ongoing feedback that instructors or teachers can use to improve their teaching and students can use to improve their learning.
Summative assessment, however, is used to evaluate students’ learning at the end of an instructional unit by comparing it against some standard or benchmark.
You can tell from their definitions that those two evaluation strategies are not meant to evaluate in the same way. So let’s take a look at the biggest differences between them.
Differences between formative and summative assessments
The first big difference is when the assessment takes place in a student’s learning process.
As the definition already gave away, formative assessment is an ongoing activity. The evaluation takes place during the learning process. Not just one time, but several times.
A summative evaluation takes place at a completely different time: not during the process, but after it. The evaluation takes place after a course or unit has been completed.
There is also a big difference between the assessment strategies in how they gather information about the student’s learning.
With formative assessments you try to figure out whether a student’s doing well or needs help by monitoring the learning process.
When you use summative assessments, you assign grades. The grades tell you whether the student achieved the learning goal or not.
The purposes of the two assessments lie miles apart. For formative assessment, the purpose is to improve students’ learning. To do this, you need to be able to give meaningful feedback.
For summative assessment, the purpose is to evaluate students’ achievements.
So do you want your students to be the best at something, or do you want your students to transcend themselves each time over and over again?
Remember when I said that with formative assessment the evaluation takes place several times during the learning process, and with summative assessment at the end of a chapter or course? This also explains the size of the evaluation packages.
Formative assessment covers small content areas. For example: 3 formative evaluations of 1 chapter.
Summative assessment includes complete chapters or content areas. For example: just 1 evaluation at the end of a chapter. The lesson material package is much larger now.
The last difference you may already have guessed. Formative assessment considers evaluation as a process. This way, the teacher can see a student grow and steer the student in an upwards direction.
With summative assessment it’s harder for you to steer the student in the right direction. The evaluation is already done. That’s why summative assessments or evaluations are considered to be more of a “product”.
Examples of formative assessments
Formative assessments can be classroom polls, exit tickets, early feedback, and so on. But you can make them more fun too. Take a look at these three examples.
- In response to a question or topic inquiry, students write down 3 different summaries. 10-15 words long, 30-50 words long and 75-100 words long.
- The 3-2-1 countdown exercise: Give your students cards to write on, or they can respond orally. Students have to respond to three separate statements: 3 things you didn’t know before, 2 things that surprised you about this topic and 1 thing you want to start doing with what you’ve learned.
- One minute papers are usually done at the end of the lesson. Students answer a brief question in writing. The question typically centers around the main point of the course, most surprising concept, most confusing area of the topic and what question from the topic might appear on the next test.
Examples of summative assessments
Most of you have been using summative assessments throughout your teaching careers. And that’s normal. Education is a slow learner, and giving students grades is the easier thing to do.
Examples of summative assessments are midterm exams, end-of-unit or end-of-chapter tests, final projects or papers, district benchmarks, and scores used for accountability for schools and students.
Q.2 What do you know about the taxonomy of educational objectives? Write in detail.
Information and quotations in this summary, except where otherwise noted, are drawn from Krathwohl, D. R. (2002). A revision of Bloom’s taxonomy: An overview. Theory into Practice, 41(4), 212-218. Krathwohl participated in the creation of the original Taxonomy, and was the co-author of the revised Taxonomy.
“The Taxonomy of Educational Objectives is a framework for classifying statements of what we expect or intend students to learn as a result of instruction. The framework was conceived as a means of facilitating the exchange of test items among faculty at various universities in order to create banks of items, each measuring the same educational objective (p. 212).”
The Taxonomy of Educational Objectives provides a common language with which to discuss educational goals.
Bloom’s Original Taxonomy
Benjamin Bloom of the University of Chicago developed the Taxonomy in 1956 with the help of several educational measurement specialists.
Bloom saw the original Taxonomy as more than a measurement tool. He believed it could serve as a:
- common language about learning goals to facilitate communication across persons, subject matter, and grade levels;
- basis for determining in a particular course or curriculum the specific meaning of broad educational goals, such as those found in the currently prevalent national, state, and local standards;
- means for determining the congruence of educational objectives, activities, and assessments in a unit, course, or curriculum; and
- panorama of the range of educational possibilities against which the limited breadth and depth of any particular educational course or curriculum could be contrasted (Krathwohl, 2002).
Bloom’s Taxonomy provided six categories that described the cognitive processes of learning: knowledge, comprehension, application, analysis, synthesis, and evaluation. The categories were meant to represent educational activities of increasing complexity and abstraction.
Bloom and associated scholars found that the original Taxonomy addressed only part of the learning that takes place in most educational settings, and developed complementary taxonomies for the Affective Domain (addressing values, emotions, or attitudes associated with learning) and the Psychomotor Domain (addressing physical skills and actions). These can provide other useful classifications of types of knowledge that may be important parts of a course.
The Affective Domain
- Receiving
- Responding
- Valuing
- Organization
- Characterization by a value or value complex
From Krathwohl, Bloom, & Masia. Taxonomy of Educational Objectives, the Classification of Educational Goals. Handbook II: Affective Domain. (1973).
The Psychomotor Domain
- Reflex movements
- Basic-fundamental movements
- Perceptual abilities
- Physical abilities
- Skilled movements
- Nondiscursive communication
From Harrow. A taxonomy of psychomotor domain: a guide for developing behavioral objectives. (1972).
The Revised Taxonomy
Bloom’s Taxonomy was reviewed and revised by Anderson and Krathwohl, with the help of many scholars and practitioners in the field, in 2001. They developed the revised Taxonomy, which retained the same goals as the original Taxonomy but reflected almost half a century of engagement with Bloom’s original version by educators and researchers.
Original vs Revised Bloom’s Taxonomy
- Unlike Bloom’s original “Knowledge” category, “Remember” refers only to the recall of specific facts or procedures.
- Many instructors, in response to the original Taxonomy, commented on the absence of the term “understand”. Bloom did not include it because the word could refer to many different kinds of learning. However, in creating the revised Taxonomy, the authors found that when instructors used the word “understand”, they were most frequently describing what the original Taxonomy had named “comprehension”.
Structure of the Cognitive Process Dimension of the Revised Taxonomy
One major change of the revised Taxonomy was to address Bloom’s very complicated “knowledge” category, the first level in the original Taxonomy. In the original Taxonomy, the knowledge category referred both to knowledge of specific facts, ideas, and processes (as the revised category “Remember” now does), and to an awareness of possible actions that can be performed with that knowledge. The revised Taxonomy recognized that such actions address knowledge and skills learned throughout all levels of the Taxonomy, and thus added a second “dimension” to the Taxonomy: the knowledge dimension, comprised of factual, conceptual, procedural, and metacognitive knowledge.
Structure of the Knowledge Dimension of the Revised Taxonomy
- Factual knowledge – The basic elements that students must know to be acquainted with a discipline or solve problems in it.
- Conceptual knowledge – The interrelationships among the basic elements within a larger structure that enable them to function together.
- Procedural knowledge – How to do something; methods of inquiry; and criteria for using skills, algorithms, techniques, and methods.
- Metacognitive knowledge – Knowledge of cognition in general as well as awareness and knowledge of one’s own cognition.
The two dimensions – knowledge and cognitive – of the revised Taxonomy combine to create a taxonomy table with which written objectives can be analyzed. This can help instructors understand what kind of knowledge and skills are being covered by the course to ensure that adequate breadth in types of learning is addressed by the course.
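As a rough illustration of how such a taxonomy table might be used, the sketch below places a few objectives into the grid and reports which cells are left uncovered. The objectives and their classifications are hypothetical examples invented for this sketch, and the function names `taxonomy_table` and `empty_cells` are assumptions, not part of the Taxonomy itself.

```python
from collections import Counter

# Dimension labels of the revised Taxonomy (Anderson & Krathwohl, 2001).
KNOWLEDGE = ["Factual", "Conceptual", "Procedural", "Metacognitive"]
COGNITIVE = ["Remember", "Understand", "Apply", "Analyze", "Evaluate", "Create"]

# Hypothetical objectives, each hand-classified as (knowledge, cognitive process).
objectives = {
    "List the six cognitive process categories": ("Factual", "Remember"),
    "Explain how formative and summative assessment differ": ("Conceptual", "Understand"),
    "Write a multiple choice item for a given objective": ("Procedural", "Apply"),
    "Judge which study strategy works best for you": ("Metacognitive", "Evaluate"),
}


def taxonomy_table(classified):
    """Count objectives in each (knowledge, cognitive) cell of the table."""
    counts = Counter(classified.values())
    return {k: {c: counts[(k, c)] for c in COGNITIVE} for k in KNOWLEDGE}


def empty_cells(table):
    """Return the cells that no objective covers, to reveal gaps in breadth."""
    return [(k, c) for k in KNOWLEDGE for c in COGNITIVE if table[k][c] == 0]


table = taxonomy_table(objectives)
gaps = empty_cells(table)
print(f"{len(gaps)} of {len(KNOWLEDGE) * len(COGNITIVE)} cells have no objective")
```

An instructor would not necessarily aim to fill every cell; the table only makes visible which kinds of learning a course currently addresses.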
For examples of learning objectives that match combinations of knowledge and cognitive dimensions see Iowa State University’s Center for Excellence in Learning and Teaching interactive Flash Model by Rex Heer. http://www.celt.iastate.edu/teaching/effective-teaching-practices/revised-blooms-taxonomy
Structure of Observed Learning Outcomes (SOLO) taxonomy
Like Bloom’s taxonomy, the Structure of Observed Learning Outcomes (SOLO) taxonomy developed by Biggs and Collis in 1982 distinguishes between increasingly complex levels of understanding that can be used to describe and assess student learning. While Bloom’s taxonomy describes what students do with the information they acquire, the SOLO taxonomy describes the relationships students articulate between multiple pieces of information. Atherton (2005) provides an overview of the five levels that make up the SOLO taxonomy:
- Pre-structural: here students are simply acquiring bits of unconnected information, which have no organization and make no sense.
- Unistructural: simple and obvious connections are made, but their significance is not grasped.
- Multistructural: a number of connections may be made, but the meta-connections between them are missed, as is their significance for the whole.
- Relational: the student is now able to appreciate the significance of the parts in relation to the whole.
- Extended abstract: the student is making connections not only within the given subject area, but also beyond it, and is able to generalize and transfer the principles and ideas underlying the specific instance.
Q.3 How will you define attitude? Elaborate its components.
Attitudes represent our evaluations, preferences or rejections based on the information we receive.
It is a generalized tendency to think or act in a certain way in respect of some object or situation, often accompanied by feelings. It is a learned predisposition to respond in a consistent manner with respect to a given object.
This can include evaluations of people, issues, objects, or events. Such evaluations are often positive or negative, but they can also be uncertain at times.
Attitudes are ways of thinking, and they shape how we relate to the world both at work and outside of work. Researchers also suggest that several different components make up attitudes.
One can see this by looking at the three components of an attitude: cognition, affect and behavior.
The three components of attitude are:
- Cognitive Component.
- Affective Component.
- Behavioral Component.
The cognitive component of attitudes refers to the beliefs, thoughts, and attributes that we associate with an object. It is the opinion or belief segment of an attitude. It refers to the part of an attitude that is related to a person’s general knowledge.
Typically these come to light in generalities or stereotypes, such as ‘all babies are cute’, ‘smoking is harmful to health’ etc.
The affective component is the emotional or feeling segment of an attitude.
It is the part of an attitude expressed in statements about how a person feels toward an object.
It deals with feelings or emotions that are brought to the surface about something, such as fear or hate. Using the above example, someone might have the attitude that they love all babies because they are cute or that they hate smoking because it is harmful to health.
The behavioral component of an attitude consists of a person’s tendencies to behave in a particular way toward an object. It refers to the part of an attitude that reflects a person’s intention in the short run or the long run.
Using the above example, the behavioral attitude may be ‘I cannot wait to kiss the baby’ or ‘we had better keep those smokers out of the library’.
Attitude is composed of three components: a cognitive component, an affective or emotional component, and a behavioral component.
Basically, the cognitive component is based on information or knowledge, whereas the affective component is based on feelings.
The behavioral component reflects how attitude affects the way we act or behave. Viewing attitudes in terms of these three components is helpful in understanding their complexity and the potential relationship between attitudes and behavior.
But for clarity’s sake, keep in mind that the term attitude essentially refers to the affective part of the three components.
In an organization, attitudes are important for their goal or objective to succeed. Each one of these components is very different from the other, and they can build upon one another to form our attitudes and, therefore, affect how we relate to the world.
Q.4 What are the types of questions? Also write their advantages and disadvantages.
Multiple choice items are a common way to measure student understanding and recall. Wisely constructed and utilized, multiple choice questions will make stronger and more accurate assessments. At the end of this activity, you will be able to construct multiple choice test items and identify when to use them in your assessments. Let’s begin by thinking about the advantages and disadvantages of using multiple-choice questions. Knowing the advantages and disadvantages of using multiple choice questions will help you decide when to use them in your assessments.
Advantages
- Allow for assessment of a wide range of learning objectives
- Objective nature limits scoring bias
- Students can quickly respond to many items, permitting wide sampling and coverage of content
- Difficulty can be manipulated by adjusting similarity of distractors
- Efficient to administer and score
- Incorrect response patterns can be analyzed
- Less influenced by guessing than true-false
Disadvantages
- Limited feedback to correct errors in student understanding
- Tend to focus on low level learning objectives
- Results may be biased by reading ability or test-wiseness
- Development of good items is time consuming
- Measuring ability to organize and express ideas is not possible
Multiple choice items consist of a question or incomplete statement (called a stem) followed by 3 to 5 response options. The correct response is called the key while the incorrect response options are called distractors.
For example: This is the most common type of item used in assessments. It requires students to select one response from a short list of alternatives.
- True-false (distractor)
- Multiple choice (key)
- Short answer (distractor)
- Essay (distractor)
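The anatomy of an item (stem, key, distractors) can be sketched as a small data structure. This is only an illustrative model: the class name `MultipleChoiceItem` and its fields are assumptions made for this sketch, and the check for 3 to 5 options simply mirrors the range given above.

```python
from dataclasses import dataclass


@dataclass
class MultipleChoiceItem:
    stem: str            # the question or incomplete statement
    options: list[str]   # all response options, in display order
    key_index: int       # position of the correct option within `options`

    def __post_init__(self):
        if not 3 <= len(self.options) <= 5:
            raise ValueError("use 3 to 5 response options")
        if not 0 <= self.key_index < len(self.options):
            raise ValueError("key_index must point at one of the options")

    @property
    def key(self) -> str:
        """The correct response."""
        return self.options[self.key_index]

    @property
    def distractors(self) -> list[str]:
        """The incorrect response options."""
        return [o for i, o in enumerate(self.options) if i != self.key_index]

    def score(self, chosen_index: int) -> int:
        """Objective scoring: 1 point for the key, 0 for any distractor."""
        return 1 if chosen_index == self.key_index else 0


item = MultipleChoiceItem(
    stem="It requires students to select one response from a short list of alternatives.",
    options=["True-false", "Multiple choice", "Short answer", "Essay"],
    key_index=1,
)
print(item.key, item.distractors)
```

Because scoring depends only on whether the chosen option matches the key, two scorers will always give the same mark, which is what limits scoring bias.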
Following these tips will help you develop high quality multiple choice questions for your assessments.
- Use 3-5 responses in a vertical list under the stem.
- Put response options in a logical order (chronological, numerical), if there is one, to assist readability.
- Use clear, precise, simple language so that wording doesn’t affect students’ demonstration of what they know (avoid humor, jargon, cliché).
- Each question should represent a complete thought and be written as a coherent sentence.
- Avoid absolute or vague terminology (all, none, never, always, usually, sometimes).
- Avoid using negatives; if required, highlight them.
- Assure there is only one interpretation of meaning and one correct or best response.
- Stem should be written so that students would be able to answer the question without looking at the responses.
- All responses should be written clearly, approximately homogeneous in content, length and grammar.
- Make distractors plausible and equally attractive for students who do not know the material.
- Ensure stems and responses are independent; don’t supply or clue the answer in a distractor or another question.
- Avoid “all of the above” or “none of the above” when possible, and especially if asking for the best answer.
- Include the bulk of the content in the stem, not in the responses.
- The stem should include any words that would be repeated in each response.
Examine the examples below and think about the tips you just learned. As you look at each one, think about whether it is a good example or needs improvement.
- As a public health nurse, Susan tries to identify individuals with unrecognized health risk factors or asymptomatic disease conditions in populations. This type of intervention can best be described as
- case management
- health teaching
- none of the above
This item should be revised. It should not have “none of the above” as a choice if you are asking for the “best” answer.
- Critical pedagogy
- is an approach to teaching and learning based on feminist ideology that embraces egalitarianism by identifying and overcoming oppressive practices.
- is an approach to teaching and learning based on sociopolitical theory that embraces egalitarianism through overcoming oppressive practices.
- is an approach to teaching and learning based on how actual day-to-day teaching/learning is experienced by students and teachers rather than what could or should be experienced.
- is an approach to teaching and learning based on increasing awareness of how dominant patterns of thought permeate modern society and delimit the contextual lens through which one views the world around them.
This item should be revised because the repetitive wording should be in the stem. So the stem should read “Critical pedagogy is an approach to teaching and learning based on:”
- Katie weighs 11 pounds. She has an order for ampicillin sodium 580 mg IV q 6 hours. What is her daily dose of ampicillin as ordered?
- 1160 mg
- 1740 mg
- 2320 mg
- 3480 mg
This example is well written and structured.
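The key for this item can be checked with a short calculation: a dose given every 6 hours is given 24 / 6 = 4 times a day. The sketch below assumes only the figures stated in the stem; the function name `daily_dose_mg` is made up for illustration, and the weight does not enter the calculation of the dose as ordered.

```python
def daily_dose_mg(dose_mg: float, interval_hours: float) -> float:
    """Total dose per 24 hours for a fixed dose repeated every `interval_hours`."""
    doses_per_day = 24 / interval_hours
    return dose_mg * doses_per_day


# 580 mg every 6 hours, as stated in the stem.
ordered = daily_dose_mg(580, 6)
print(ordered)  # 2320.0
```

The other options correspond to 2, 3, and 6 doses a day, which is what makes them plausible distractors for a student who miscounts the dosing interval.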
- The research design that provides the best evidence for a cause-effect relationship is an:
- experimental design
- control group
- quasi-experimental design
- evidence-based practice
This example contains a grammatical cue and grammatical inconsistency. Additionally, not all distractors are equally plausible.
- The nurse supervisor wrote the following evaluation note: Carol has been a nurse in the post-surgical unit for 2 years. She has good organizational and clinical skills in managing patient conditions. She has a holistic grasp of situations and is ready to assume greater responsibilities to further individualize care.
Using the Dreyfus model of skill acquisition, identify the stage that best describes Carol’s performance.
- Advanced beginner
This is a good example.
Multiple choice questions are commonly used in assessments because of their objective nature and efficient administration. To make the most of these advantages, it’s important to make sure your questions are well written.
Q.5 Construct a test, administer it and ensure its reliability.
Reliability is a measure of the consistency of a metric or a method.
Every metric or method we use, including things like methods for uncovering usability problems in an interface and expert judgment, must be assessed for reliability.
In fact, before you can establish validity, you need to establish reliability.
Here are the four most common ways of measuring reliability for any empirical method or metric:
- inter-rater reliability
- test-retest reliability
- parallel forms reliability
- internal consistency reliability
Because reliability comes from a history in educational measurement (think standardized tests), many of the terms we use to assess reliability come from the testing lexicon. But don’t let bad memories of testing allow you to dismiss their relevance to measuring the customer experience.
Inter-Rater Reliability
The extent to which raters or observers respond the same way to a given phenomenon is one measure of reliability. Where there’s judgment there’s disagreement.
Even highly trained experts disagree among themselves when observing the same phenomenon. Kappa and the correlation coefficient are two common measures of inter-rater reliability. Some examples include:
- Evaluators identifying interface problems
- Experts rating the severity of a problem
For example, we found that the average inter-rater reliability of usability experts rating the severity of usability problems was r = .52. You can also measure intra-rater reliability, whereby you correlate multiple scores from one observer. In that same study, we found that the average intra-rater reliability when judging problem severity was r = .58 (which is generally considered low reliability).
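To make the kappa statistic mentioned above concrete, here is a minimal sketch of Cohen’s kappa for two raters assigning categories to the same items. The ratings are hypothetical data invented for this example, and the function name `cohens_kappa` is an assumption, not taken from any particular library.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (observed - expected) / (1 - expected)


# Hypothetical severity ratings of ten usability problems by two experts.
expert_1 = ["low", "low", "low", "low", "low", "high", "high", "high", "high", "high"]
expert_2 = ["low", "low", "low", "low", "high", "high", "high", "high", "high", "low"]

kappa = cohens_kappa(expert_1, expert_2)
print(round(kappa, 2))  # 0.6
```

Here the experts agree on 8 of 10 problems, but half of that agreement would be expected by chance alone, so kappa is noticeably lower than the raw 80% agreement.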
Test-Retest Reliability
Do customers provide the same set of responses when nothing about their experience or their attitudes has changed? You don’t want your measurement system to fluctuate when all other things are static.
Have a set of participants answer a set of questions (or perform a set of tasks). Later (by at least a few days, typically), have them answer the same questions again. When you correlate the two sets of measures, look for very high correlations (r > 0.7) to establish retest reliability.
As you can see, there’s some effort and planning involved: you need for participants to agree to answer the same questions twice. Few questionnaires measure test-retest reliability (mostly because of the logistics), but with the proliferation of online research, we should encourage more of this type of measure.
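The retest procedure comes down to correlating two sets of scores from the same participants. The sketch below computes the Pearson correlation by hand; the scores are hypothetical, and `pearson_r` is a helper written for this example rather than a library function.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list has no variation")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores from six participants, first and second administration.
first_administration = [4, 5, 3, 2, 5, 4]
second_administration = [4, 4, 3, 2, 5, 5]

r = pearson_r(first_administration, second_administration)
print(f"test-retest r = {r:.2f}, meets 0.7 threshold: {r > 0.7}")
```

With real data, far more than six participants would be needed before the correlation could be trusted as an estimate of retest reliability.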
Parallel Forms Reliability
Getting the same or very similar results from slight variations on the question or evaluation method also establishes reliability. One way to achieve this is to have, say, 20 items that measure one construct (satisfaction, loyalty, usability) and to administer 10 of the items to one group and the other 10 to another group, and then correlate the results. You’re looking for high correlations and no systematic difference in scores between the groups.
Internal Consistency Reliability
This is by far the most commonly used measure of reliability in applied settings. It’s popular because it’s the easiest to compute using software—it requires only one sample of data to estimate the internal consistency reliability. This measure of reliability is described most often using Cronbach’s alpha (sometimes called coefficient alpha).
It measures how consistently participants respond to one set of items. You can think of it as a sort of average of the correlations between items. Cronbach’s alpha typically ranges from 0.0 to 1.0. Since the late 1960s, the minimally acceptable measure of reliability has been 0.70; in practice, though, for high-stakes questionnaires, aim for greater than 0.90. For example, the SUS has a Cronbach’s alpha of 0.92.
The more items you have, the more internally reliable the instrument, so to increase internal consistency reliability, you would add items to your questionnaire. Since there’s often a strong need to have few items, however, internal reliability usually suffers. When you have only a few items, and therefore usually lower internal reliability, having a larger sample size helps offset the loss in reliability.
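As a sketch of how internal consistency can be estimated from a single sample, the code below applies the standard formula for Cronbach’s alpha, alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The response data are hypothetical, and `cronbach_alpha` is a helper written for this example.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha; `responses` has one row per participant, one column per item."""
    if len(responses) < 2 or len(responses[0]) < 2:
        raise ValueError("need at least two participants and two items")
    k = len(responses[0])
    # Sample variance of each item (column) across participants.
    item_variances = sum(variance(column) for column in zip(*responses))
    # Sample variance of each participant's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("alpha is undefined when all total scores are equal")
    return (k / (k - 1)) * (1 - item_variances / total_variance)


# Hypothetical answers of five participants to three items on a 1-5 scale.
responses = [
    [4, 4, 5],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.2f}")
```

In this made-up sample the three items rise and fall together across participants, so alpha comes out above 0.90; items that did not track one another would pull it down.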
Here are a few things to keep in mind about measuring reliability:
- Reliability is the consistency of a measure or method over time.
- Reliability is necessary but not sufficient for establishing a method or metric as valid.
- There isn’t a single measure of reliability; instead, there are four common measures of consistent responses.
- You’ll want to use as many measures of reliability as you can (although in most cases one is sufficient to understand the reliability of your measurement system).
- Even if you can’t collect reliability data, be aware of the ways in which low reliability may affect the validity of your measures, and ultimately the veracity of your decisions.