In my area of work, educational evaluation, standards refer to desired behaviors, skills, or measurable knowledge. I like standards. If I disagree with a particular standard, I still find it useful to know what others expect. But even when a set of standards is universally accepted, I cannot help asking, “So, how is that working out?”

Channeling that trait into professional practice, I adopted the following process of reflecting on a series of questions whenever I apply a set of standards in my evaluation work:

What is the point of the standard?

The creators of a standard, usually an organization or committee (e.g., National Science Teachers Association, International Society for Technology in Education, Joint Committee on Standards for Educational Evaluation), believe things will be better if people meet their criteria. Knowing what “better” means is important; it anticipates the summative evaluation of the project.

What does it look like to meet the standard?

Sometimes, as in the Next Generation Science Standards, criteria are very detailed, describing a scope and sequence of specific knowledge and skills across multiple grades. In those cases, it is relatively easy to list the attributes of “better.”

Sometimes, observable attributes of meeting a standard are not clear; I have to develop them. I spent several years evaluating teacher professional development programs against the National Education Technology Standards (NETS, now the ISTE Standards), a framework for incorporating technology in learning. At the time, the NETS included a section on Creativity and Innovation. In applying the standard, I parsed every sentence, reached out to the ISTE Standards project, and enlisted the help of graduate students to research how artists and other creative professionals defined the concept. It became clear that the NETS emphasized the innovation aspect. That is, the creative student understands and builds on prior knowledge.

How can I document the standard in practice?

If the standard is well-defined, I can walk back through a logic model from the desired outcome, through the attributes of each standard, to a rubric that I can score: Yes/No; Not/Somewhat/Completely; 0%–100%, etc. In the case of the NETS, understanding the attributes the standard writers had in mind let me create checklists that enabled classroom observers and curriculum analysts to quantify the presence and extent of NETS activities and behaviors.
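
The walk from standard to scorable rubric can be sketched in code: each observable attribute gets an ordered rating scale, and every rating is normalized to a common percentage. This is a minimal illustration only; the attribute names and scales below are hypothetical, not items from the actual NETS instrument.

```python
# Hypothetical rubric: each observable attribute maps to an ordered
# rating scale. Attribute names are illustrative, not actual NETS items.
RUBRIC = {
    "builds_on_prior_knowledge": ("Not", "Somewhat", "Completely"),
    "produces_original_work": ("Not", "Somewhat", "Completely"),
    "uses_digital_tools": ("No", "Yes"),
}

def score_observation(ratings, rubric=RUBRIC):
    """Convert one observer's ratings to a 0-100 score per attribute."""
    scores = {}
    for attribute, rating in ratings.items():
        scale = rubric[attribute]
        # Position on the ordered scale, normalized to a percentage.
        scores[attribute] = 100 * scale.index(rating) / (len(scale) - 1)
    return scores

print(score_observation({
    "builds_on_prior_knowledge": "Somewhat",
    "produces_original_work": "Completely",
    "uses_digital_tools": "Yes",
}))
# → {'builds_on_prior_knowledge': 50.0, 'produces_original_work': 100.0,
#    'uses_digital_tools': 100.0}
```

Putting every item on a common 0–100% scale is one way to make ratings comparable across attributes, observers, and settings.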

This process turns the conceptual heavy lifting already done by standard developers into a tool that gives me replicable evidence for judgments. Ideally, I can teach the criteria to others and apply them in any medium (field observations, documents, vlogs, etc.).

Besides providing comparative ratings of success across time, activities, and settings, the results help validate the standards themselves. Are individuals or activities scoring higher on a particular attribute more likely to display whatever outcome the standard supports? If not, what conditions mediate results?

This is a reductionist approach. It breaks abstract concepts down into observable, countable events. It could tempt an evaluator to ignore context and events that fall outside the scope of a standard. However, attention to context is fundamental to evaluation practice. When this process includes careful item development, training of observers, and analysis of instrument reliability and validity, it yields evidence that is replicable and relatively easy to communicate within and outside a program.
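
The reliability analysis mentioned above can be as simple as a chance-corrected agreement statistic computed between pairs of observers. A minimal sketch using Cohen's kappa, assuming two hypothetical observers rated the same six lessons on one rubric item:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters (Cohen's kappa)."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    # Observed agreement: proportion of items rated identically.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if the raters assigned categories independently
    # at their observed marginal rates.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)

# Two hypothetical observers scoring six lessons on one rubric item.
obs_1 = ["Completely", "Somewhat", "Not", "Somewhat", "Completely", "Not"]
obs_2 = ["Completely", "Somewhat", "Not", "Completely", "Completely", "Not"]
print(round(cohens_kappa(obs_1, obs_2), 2))  # → 0.75
```

Kappa near 1 indicates agreement well beyond chance; values near 0 flag items that need rewriting, or observers who need more training.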

For instance, I can advise school administrators that if they want teachers to create lessons that meet a new standard within a single school year, they should let teachers develop the lessons in teams and practice them multiple times with one another’s students (i.e., a “lesson study” approach). Otherwise, it takes about three years for a teacher to implement a new practice. Administrators do not have to take my word for it. They can carry one of my rubrics into a classroom and see for themselves.

Application

Think of this approach as an extension of the review an evaluator does of any goal-based project. In ATE projects, the goal is producing qualified technicians (NSF 24-584); that is, not just teaching about technology, but placing people in jobs. Reverse-engineering the logic model, we ask what it takes to get that job. The ATE answer is effective training. What is effective? We are looking for attributes—observable behaviors and qualities that we can describe, count, and modify. Ideally, the professional standards or certifications a program relies on make those characteristics clear. But if they do not, or if successful certification does not lead to success in the workforce, then evaluators should look under the hood and try to identify the attributes of experience and context that make a difference.

Selected Journal Articles and Book Chapters

  • Bielefeldt, T. (2012). Guidance for technology decisions from classroom observations. Journal of Research on Technology in Education, 44(3), 205–210.
  • Hayden, K., Ouyang, Y., Scinski, L., Olszewski, B., & Bielefeldt, T. (2011). Increasing student interest and attitudes in STEM: Professional development and activities to engage and inspire learners. Contemporary Issues in Technology and Teacher Education, 11(1). Retrieved from http://www.citejournal.org/vol11/iss1/science/article1.cfm.

Presentations and Magazine Articles

  • Bielefeldt, T. (2000, June 24). Objective assessment of technology integration in learning environments [Presentation]. Preparing Tomorrow’s Teachers to Use Technology Conference, Atlanta, GA.
  • Bielefeldt, T. (2012–2013). Know the ISTE Standards for Students. (Monthly column of training activities for classroom observation.) Learning and Leading with Technology.
  • Bielefeldt, T. (2014, October). Classroom observation as participatory evaluation [Presentation]. American Evaluation Association, Denver.
  • Bielefeldt, T. (2015, October 22). XCOT: Cross-classroom, cross-standards observation technology [Presentation]. National Science Foundation Advanced Technological Education, Washington.
  • Bielefeldt, T. (2015, November 12). Making classroom observation easy—well, possible. American Evaluation Association, Chicago.
  • Bielefeldt, T., & Olszewski, B. (2013, June 25). Documenting 21st Century skills: Classroom observation with the NETS [Workshop training]. ISTE 2013, San Antonio.
  • Bielefeldt, T., & Olszewski, B. (2011, November 3). Technology and classroom observation [Presentation]. American Evaluation Association, Anaheim.
  • Rubino-Hare, L., Bielefeldt, T., Bloom, N., & Blevins, K. (2017, June). Using ISTE Standards to improve a PBL/geospatial technology professional learning program [Presentation]. ISTE 2017, San Antonio.

About the Authors

Mr. Talbot Bielefeldt

Senior Program Evaluator, Clearwater Program Evaluation

Talbot Bielefeldt has been an educational program evaluator since 1996, working on a variety of federal, state, local, and corporate initiatives at all educational levels. He holds a Master’s in Educational Policy and Management from the University of Oregon. His business, Clearwater Program Evaluation in Eugene, Oregon, currently consults on NSF ATE and S-STEM grants.

Creative Commons

Except where noted, all content on this website is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

EvaluATE is supported by the National Science Foundation under grant number 2332143. Any opinions, findings, and conclusions or recommendations expressed on this site are those of the authors and do not necessarily reflect the views of the National Science Foundation.