GUEST POST by LAUREL NORRIS: Robust Responses to Open-Ended Questions: Good Surveys Prime Respondents to Think Critically

This is a guest post by Laurel Norris (https://twitter.com/neutrinosky).

Laurel is a Training Specialist at Widen Enterprises, where she is involved in developing and delivering training, focusing on data, reporting, and strategy.

--------------------------------------------------

Robust Responses to Open-Ended Questions: Good Surveys Prime Respondents to Think Critically

By Laurel Norris

--------

I’ve always been a fan of evaluation. It’s a way to better understand the effectiveness of programs, determine if learning objectives are being met, and reveal ways to improve web workshops and live trainings.

Or so I thought.

It turns out that most evaluations don’t do those things. Performance-Focused Smile Sheets (the book is available at http://SmileSheets.com) taught me that, and when I implemented the book’s recommendations, I discovered something interesting. Using Dr. Thalheimer’s method improved the quality and usefulness of my survey data, and it gave me much more robust responses to open-ended questions.

By more robust, I mean that respondents revealed what was helpful and why, talked about the challenges they expected in trying it themselves, discussed which areas could use more emphasis, and shared where they would have appreciated more examples. In short, they provided a huge amount of useful information.

Before using Dr. Thalheimer’s method, only a few open-ended responses were helpful. Most were along the lines of “Thanks!”, “Good webinar”, or “Well presented”. While those kinds of answers make me feel good, they don’t help me improve trainings.

I’m convinced that the improved survey primed people to be more engaged with the evaluation process and enabled them to easily provide useful information to me.

So what did I do differently? I’ll use real examples from web workshops I conducted. Both workshops ran around 45 minutes and had 30 responses to the end-of-workshop survey. They did differ in style, something that I will discuss towards the end of this article.

 

The Old Method

Let’s talk about the old method, what Dr. Thalheimer might call a traditional smile sheet. It was (luckily) short, with three multiple choice questions and two open-ended. Multiple choice questions included:

  • How satisfied are you with the content of this web workshop?
  • How satisfied are you with the presenter's style?
  • How closely did this web workshop align with your expectations?

Participants answered the questions with options on Likert-like scales ranging from “Very Unsatisfied” to “Very Satisfied” or “Not at all Closely” to “Very Closely”. Of course, in true smile-sheet style, the multiple choice yielded no useful information. People were 4.1 level satisfied with the content of the webinar, “data” which did not enable me to make any useful changes to the information I provided.

Open-ended questions invited people to “Share your ideas for web workshop topics” and offer “Additional Comments”. Of the thirteen open-ended responses I got, only five provided useful information. The rest were either a thank you or some form of praise.

 

The New Method

Respondents were asked four multiple choice questions that judged effectiveness of the web workshop, how much the concepts would help them improve work outcomes, how well they understand the concepts taught, and whether or not they would use skills they learned in the workshop at their job.

The web workshop was about user engagement, in particular, administrators increasing engagement with the systems they manage. Questions were:

  • In regard to the user engagement, how able are you to put what you’ve learned into practice on the job?
  • From your perspective, how valuable are the concepts taught in the workshop? How much will they help improve engagement with your site?
  • How well do you feel you understand user engagement?
  • How motivated will you be to utilize these user engagement skills at your work?

Responses were specific and adapted from Dr. Thalheimer’s book. For example, here were the optional responses to the question “In regard to the user engagement, how able are you to put what you’ve learned into practice on the job?”

  • I'm not at all able to put the concepts into practice.
  • I have general awareness of the concepts taught, but I will need more training or practice to complete user engagement projects.
  • I am able to work on user engagement projects, but I'll need more hands-on experience to be fully competent in using the concepts taught.
  • I am able to complete user engagement projects at a fully competent level in using the concepts taught.
  • I am able to complete user engagement projects at an expert level in using the concepts taught.

All four multiple choice questions had similarly complete options to choose from. From those responses, I was able to more appropriately determine the effectiveness of the workshop and whether my training content was performing as expected.

The open-ended question was relatively bland. I asked “What else would you like to share about your experience during the webinar today?” and received twelve specific, illuminating responses, such as:

“Loved the examples shown from other sites. Highly useful!”

“It validated some of the meetings I have had with my manager about user engagement and communication about our new site structure. It will be valuable for upcoming projects about asset distribution throughout the company.”

“I think the emphasis on planning the plan is helpful. I think I lack confidence in designing desk drops for Design teams. Also - I'm heavily engaged with my users now as it is - I am reached out to multiple times per day...but I think some of these suggestions will be valuable for more precision in those engagements.”

Even responses that didn’t give me direct feedback on the workshop, like “Still implementing our site, so a lot of today's content isn't yet relevant”, gave me information about my audience.
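
One rough way to compare the two surveys is to look at useful open-ended comments per survey respondent. The short Python sketch below uses only the counts reported above (30 survey responses per workshop, five useful comments under the old method, twelve under the new one). The function name `useful_rate` is just an illustrative choice, and with samples this small the numbers are suggestive rather than conclusive.

```python
def useful_rate(useful_comments: int, survey_responses: int) -> float:
    """Share of survey respondents who left a useful open-ended comment."""
    if survey_responses <= 0:
        raise ValueError("survey_responses must be positive")
    return useful_comments / survey_responses


# Counts taken from the two workshops described above.
old_rate = useful_rate(5, 30)   # old method: 5 useful comments, 30 surveys
new_rate = useful_rate(12, 30)  # new method: 12 useful comments, 30 surveys

print(f"Old method: {old_rate:.0%} of respondents left a useful comment")
print(f"New method: {new_rate:.0%} of respondents left a useful comment")
```

By this measure, the share of respondents leaving a useful comment roughly doubled, although the two workshops also differed in design.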

 

Conclusion

Clearly, I’m thrilled with the kind of information I am getting from using Dr. Thalheimer’s methods. I get useful, rich data from respondents that helps me better evaluate my content and understand my audience.

There is one positive aspect of using the new method that might have skewed the data. I designed the second web workshop after I read the book, and Dr. Thalheimer’s Training Effectiveness Taxonomy influenced the design. I thought more about the goals for the workshop, provided cognitive supports, repeated key messages, and did some situation-action triggering.

Based on those changes, the second web workshop was probably better than the first and it’s possible that the high-quality, engaging workshop contributed to the robust responses to open-ended questions I saw.

Either way, my evaluations (and learner experiences) are revolutionized. Has anyone seen a similar improvement in open-ended response rates since implementing performance-focused smile sheets?

 

Another Reason to Learn About Performance-Focused Smile Sheets

This has been a great year for the Performance-Focused Smile Sheet approach. Not only did the book, Performance-Focused Smile Sheets: A Radical Rethinking of a Dangerous Art Form, win a prestigious Award of Excellence from the International Society for Performance Improvement, but people are flocking to workshops, conference sessions, and webinars to learn about this revolutionary new method of gathering learner feedback.

Now there's even more reason to learn about this method. In the July 2017 issue of TD (Talent Development), it was reported that the Human Capital Institute (HCI) issued a report that said that measurement/evaluation is the top skill needed by learning and development professionals!

Go to SmileSheets.com to get the book.

Want to Diagnose Your Organization’s Smile Sheets for FREE?

We are coming up to the one-year anniversary of my book, Performance-Focused Smile Sheets: A Radical Rethinking of a Dangerous Art Form. To celebrate, I've created a diagnostic to help you and your organization give yourselves a robust Smile-Sheet Checkup.

To access the diagnostic instrument, go to the diagnostic page on the book's website.

Testing for Instructional Designers — A Common Mistake

Somebody sent me a link to a YouTube video today -- a video created to explain to laypeople what instructional design is. Most of it was reasonable, until it gave this example, narrated as follows:

"... and testing is created to clear up confusion and make sure learners got it right."

image from https://s3.amazonaws.com/feather-client-files-aviary-prod-us-east-1/2016-11-03/d044b9e8-731e-42ff-a2ef-a9b5df14f533.png

Something is obviously wrong here -- something an instructional designer ought to know. What is it?

Scroll down for the answer...

Before you scroll down, come up with your own answer...

.

.

.

.

.

.

.

.

.

.

.

.

Answer: 

The test question is devoid of real-world context. Instead of asking a text-based question, we could provide an image and ask them to point to the access panel.

Better yet, we could have them work on a simulated real-world task and follow steps that would enable them to complete the simulated task only if they used the access panel as part of their task completion.

Better yet, we could have them work on an actual real-world task... et cetera...

Better yet, we might first ask ourselves whether anybody really needs to "LEARN" where the access panel is -- or would they just find it on their own without being trained or tested on it?

Better yet, we might first ask ourselves whether we really need a course in the first place. Maybe we'd be better off to create a performance-support tool that would take them through troubleshooting steps -- with zero or very little training required.

Better yet, we might first ask ourselves whether we could design our equipment so that technicians don't need training or performance support.

.

.

.

Or we could ask ourselves existential questions about the meaning and potency of instructional design, about whether a career devoted to helping people learn work skills is worthy to be our life's work...

Or we could just get back to work and crank out that test...

SMILE...

 

 

Providing Feedback on Quiz Questions — Yes or No?

I was asked the following question today by a learning professional in a large company:

It will come as no surprise that we create a great deal of mandatory/regulatory required eLearning here. All of these eLearning interventions have a final assessment that the learner must pass at 80% to be marked as completed; in addition to viewing all the course content as well. The question is around feedback for those assessment questions. 

  • One faction says no feedback at all, just a score at the end and the opportunity to revisit any section of the course before retaking the assessment.

  • Another faction says to tell them correct or incorrect after they submit their answer for each question.

  • And a third faction argues that we should give them detailed feedback beyond just correct/incorrect for each question. 

Which approach do you recommend? 


Here is what I wrote in response:

It all depends on what you’re trying to accomplish…

If this is a high-stakes assessment you may want to protect the integrity of your questions. In such a case, you’d have a large pool of questions and you’d protect the answer choices by not divulging them. You may even have proctored assessments, for example, having the respondent turn on their web camera and submit their video image along with the test results. Also, you wouldn’t give feedback because you’d be concerned that students would share the questions and answers.

If this is largely a test to give feedback to the learners—and to support them in remembering and performance—you’d not only give them detailed feedback, but you’d retest them after a few days or more to reinforce their learning. You might even follow-up to see how well they’ve been able to apply what they’ve learned on the job.

We can imagine a continuum between these two points where you might seek a balance between a focus on learning and a focus on assessment.

This may be a question for the lawyers, not just for us as learning professionals. If these courses are being provided to meet certain legal requirements, it may be most important to consider what might happen in the legal domain. Personally, I think the law may be behind learning science. Based on talking with clients over many years, it seems that lawyers and regulators often recommend learning designs and assessments that do NOT make sense from a learning standpoint. For example, lawyers tell companies that teaching a compliance topic once a year will be sufficient -- when we know that people forget and may need to be reminded.

In the learning-assessment domain, lawyers and regulators may say that it is acceptable to provide a quiz with no feedback. They are focused on having a defensible assessment. This may be the advice you should follow given current laws and regulations. However, this seems ultimately indefensible from a learning standpoint. Couldn't a litigant argue that the organization did NOT do everything they could to support the employee in learning -- if the organization didn't provide feedback on quiz questions? This seems a pretty straightforward argument -- and one that I would testify to in a court of law (if I was asked).

By the way, how do you know 80% is the right cutoff point? Most people use an arbitrary cutoff point, but then you don’t really know what it means.

Also, are your questions good questions? Do they ask people to make decisions set in realistic scenarios? Do they provide plausible answer choices (even for incorrect choices)? Are they focused on high-priority information?

Do the questions and the cutoff point truly differentiate between competence and lack of competence?

Are the questions asked after a substantial delay -- so that you know you are measuring the learners' ability to remember?

Bottom line: Decision-making around learning assessments is more complicated than it looks.

Note: I am available to help organizations sort this out... yet, as one may ascertain from my answer here, there are no clear recipes. It comes down to judgment and goals.

If your goal is learning, you probably should provide feedback and provide a delayed follow-up test. You should also use realistic scenario-based questions, not low-level knowledge questions.

If your goal is assessment, you probably should create a large pool of questions, proctor the testing, and withhold feedback.
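
The two rules of thumb above can be written down as a simple lookup. This is only a sketch in Python, not a recipe: the names `RECOMMENDATIONS` and `recommend` are made up for illustration, and real decisions usually sit somewhere on the continuum between the two goals.

```python
# Rule-of-thumb practices for each primary goal, as summarized above.
RECOMMENDATIONS = {
    "learning": [
        "provide feedback",
        "provide a delayed follow-up test",
        "use realistic scenario-based questions",
    ],
    "assessment": [
        "create a large pool of questions",
        "proctor the testing",
        "withhold feedback",
    ],
}


def recommend(goal: str) -> list[str]:
    """Return the rule-of-thumb practices for a primary goal."""
    if goal not in RECOMMENDATIONS:
        raise ValueError(f"unknown goal: {goal!r}")
    # Return a copy so callers cannot modify the shared lookup table.
    return list(RECOMMENDATIONS[goal])


for practice in recommend("learning"):
    print("-", practice)
```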

 

Benchmarking Your Smile Sheets Against Other Companies may be a Fool’s Game!

Original post appeared here. I'm moving it and updating it.

Updated Article

When companies think of evaluation, they often first think of benchmarking their performance against other companies. There are important reasons to be skeptical of this type of approach, especially as a sole source of direction.

I often add this warning to my workshops on how to create more effective smile sheets: Watch out! There are vendors in the learning field who will attempt to convince you that you need to benchmark your smile sheets against your industry. You will spend (waste) a lot of money with these extra benchmarking efforts!

Two forms of benchmarking are common: (1) idea-generation, and (2) comparison. Idea-generation involves looking at other companies’ methodologies and then assessing whether particular methods would work well at our company. This is a reasonable procedure only to the extent that we can tell whether the other companies have similar situations to ours and whether the methodologies have really been successful at those other companies.

Comparison benchmarking for training and development looks further at a multitude of learning methods and results and specifically attempts to find a wide range of other companies to benchmark against. This approach requires stringent attempts to create valid comparisons. This type of benchmarking is valuable only to the extent that we can determine whether we are comparing our results to good companies or bad and whether the comparison metrics are important in the first place.

Both types of benchmarking require exhaustive efforts and suffer from validity problems. It is just too easy to latch on to other companies’ phantom results (i.e., results that seem impressive but evaporate upon close examination). Picking the right metrics is difficult (e.g., a business can be judged on its stock price, its revenues, profits, market share, etc.). Comparing companies across industries presents the proverbial apples-to-oranges problem. It’s not always clear why one business is better than another (e.g., it is hard to know what really drives Apple Computer’s current success: its brand image, its products, its positioning versus its competitors, its leaders, its financial savvy, its customer service, its manufacturing, its project management, its sourcing, its hiring, or something else). Finally, and most pertinent here, it is extremely difficult to determine which companies are really using best practices (e.g., see Phil Rosenzweig’s highly regarded book, The Halo Effect) because companies’ overall results usually cloud and obscure the on-the-job realities of what’s happening.

The difficulty of assessing best practices in general pales in comparison to the difficulties of assessing training-and-development practices. The problem is that there just aren’t universally accepted and comparable metrics to utilize for training and development. Where baseball teams have wins and losses, runs scored, and such, and businesses have revenues and profits and the like, training and development efforts produce fuzzier numbers, certainly ones that aren’t comparable from company to company. Reviews of the research literature on training evaluation have found very low levels of correlation (usually below .20) between different types of learning assessments (e.g., Alliger, Tannenbaum, Bennett, Traver, & Shotland, 1997; Sitzmann, Brown, Casper, Ely, & Zimmerman, 2008).

Of course, we shouldn’t dismiss all benchmarking efforts. Rigorous benchmarking efforts that are understood with a clear perspective can have value. Idea-generation brainstorming is probably more viable than a focus on comparison. By looking to other companies’ practices, we can gain insights and consider new ideas. Of course, we will want to be careful not to move toward the mediocre average instead of looking to excel.

The bottom line on benchmarking from other companies is: be careful, be willing to spend lots of time and money, and don’t rely on cross-company comparisons as your only indicator.

Finally, any results generated by brainstorming with other companies should be carefully considered and pilot-tested before too much investment is made.

 

Smile Sheet Issues

Both of the meta-analyses cited above found that smile sheets were correlated with learning results at r = 0.09, which is virtually no correlation at all. I have described smile-sheet design problems in detail in my book, Performance-Focused Smile Sheets: A Radical Rethinking of a Dangerous Art Form. In short, most smile sheets focus on learner satisfaction and fail to focus on factors related to actual learning effectiveness. Most smile sheets utilize Likert-like scales or numeric scales that offer learners very little granularity between answer choices, leaving responses open to bias, fatigue, and disinterest. Finally, most learners have fundamental misunderstandings about their own learning (Brown, Roediger & McDaniel, 2014; Kirschner & van Merriënboer, 2013), so asking general questions about their perceptions is too often a dubious undertaking.
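
To put r = 0.09 in perspective, squaring a correlation gives the share of variance in one measure that is statistically accounted for by the other. A quick Python sketch of that arithmetic (the variable names are only illustrative):

```python
r = 0.09  # correlation reported in the meta-analyses cited above

# Squaring a correlation coefficient gives the proportion of shared variance.
variance_explained = r ** 2

print(f"Shared variance: {variance_explained:.2%}")
```

In other words, a correlation of this size means smile-sheet ratings account for less than one percent of the variation in the other measure.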

The bottom line is that traditional smile sheets are providing almost everyone with meaningless data in terms of learning effectiveness. When we benchmark our smile sheets against other companies' smile sheets we compound our problems.

 

Wisdom from Earlier Comments

Ryan Watkins, researcher and industry guru, wrote:

I would add to this argument that other companies are no more static than our own -- thus if we implement in September 2011 what they are doing in March 2011 from our benchmarking study, then we are still behind the competition. They are continually changing and benchmarking will rarely help you get ahead. Just think of all the companies that tried to benchmark the iPod, only to later learn that Apple had moved on to the iPhone while the others were trying to "benchmark" what they were doing with the iPod. The competition may have made some money, but Apple continues to win the major market share.

Mike Kunkle, sales training and performance expert, wrote:

Having used benchmarking (carefully and prudently) with good success, I can't agree with avoiding it, as your title suggests, but do agree with the majority of your cautions and your perspectives later in the post.

Nuance and context matter greatly, as do picking the right metrics to compare, and culture, which is harder to assess. 70/20/10 performance management somehow worked at GE under Welch's leadership. I've seen it fail miserably at other companies and wouldn't recommend it as a general approach to good people or performance management.

In the sales performance arena, at least, benchmarking against similar companies or competitors does provide real benefit, especially in decision-making about which solutions might yield the best improvement. Comparing your metrics to world-class competitors and calculating what it would mean to you to move in that direction, allows for focus and prioritization, in a sea of choices.

It becomes even more interesting when you can benchmark internally, though. I've always loved this series of examples by Sales Benchmark Index:
http://www.salesbenchmarkindex.com/Portals/23541/docs/why-should-a-sales-professional-care-about-sales-benchmarking.pdf

 

Citations

Alliger, G. M., Tannenbaum, S. I., Bennett, W., Jr., Traver, H., & Shotland, A. (1997). A meta-analysis of the relations among training criteria. Personnel Psychology, 50, 341-357.

Brown, P. C., Roediger, H. L., III, & McDaniel, M. A. (2014). Make It Stick: The Science of Successful Learning. Cambridge, MA: Belknap Press of Harvard University Press.

Kirschner, P. A., & van Merriënboer, J. J. G. (2013). Do learners really know best? Urban legends in education. Educational Psychologist, 48(3), 169–183.

Sitzmann, T., Brown, K. G., Casper, W. J., Ely, K., & Zimmerman, R. D. (2008). A review and meta-analysis of the nomological network of trainee reactions. Journal of Applied Psychology, 93, 280-295.

Smile-Sheet Workshop in Suffolk, VA — June 10th, 2016

OMG! The best deal ever for a full-day workshop on how to radically improve your smile-sheet designs! Sponsored by the Hampton Roads Chapter of ISPI. Free book and subscription-learning thread too!

 

Friday, June 10, 2016

Reed Integration

7007 Harbour View Blvd #117

Suffolk, VA

 

Click here to register now...

 

Performance Objectives:

By completing this workshop and the after-course subscription-learning thread, you will know how to:

  1. Avoid the three most troublesome biases in measuring learning.

  2. Persuade your stakeholders to improve your organization’s smile sheets.

  3. Create more effective smile sheet questions.

  4. Create evaluation standards for each question to avoid bias.

  5. Envision learning measurement as a bulwark for improved learning design.

 

Recommended Audience:

The content of this workshop will be suitable to those who have at least some background and experience in the training field. It will be especially valuable to those who are responsible for learning evaluation or who manage the learning function.

 

Format:

This is a full-day workshop. Participants are encouraged to bring laptops if they prefer to use a computer to write their questions.  

 

Bonus Take-Away:

Each Participant will receive a copy of Dr. Thalheimer’s Book, Performance-Focused Smile Sheets: A Radical Rethinking of a Dangerous Art Form.

Book Publication Day!!

Wow!!

I almost can't believe it. After 17 years of research and writing, I'm finally a published author.

Today is the day!

It's kind of funny really.

When I began this journey back in 1997 I had a well-paying job running a leadership-development product line, building multimedia simulations, and managing and working with a bunch of great folks.

As I looked around the training-and-development field -- that's what we called it back then -- I saw that we jumped from one fad to another and held on sanctimoniously to learning methods that didn't work that well. I concluded that what was needed was someone to play a role in bridging the gap between the research side and the practice side.

I had a very naive idea about how I might help. I thought the field needed a book that would specify the fundamental learning factors that should be baked into every learning design. I thought I could write such a book in two or three years, that I'd get it published, that consulting gigs would roll in, that I'd make good money, that I'd make a difference.

Hah! The blind optimism of youth and entrepreneurship!

I've now written over 700 pages on THAT book...without an end in sight.

 

How The Smile-Sheet Book Got its Start

Back in 2007, as I was mucking around in the learning research, I began to see biases in how we were measuring learning. I noticed, for instance, that we always measured at the top of the learning curve, before the forgetting curve had even begun. We measured with trivial multiple-choice questions on definitions and terminology -- when these clearly had very little relevance for on-the-job performance. I wrote a research-to-practice report on these learning measurement biases and suddenly I was getting invited to give keynotes...

In my BIG book, I wrote hundreds of paragraphs on learning measurement. I talked about our learning-measurement blind spots to clients, at conferences, and on my blog.

Where feedback is the lifeblood of improvement, we as learning professionals were getting very little good feedback. We were practicing in the dark.

I'd also come to ruminate on the meta-analytic research findings that showed that traditional smile sheets were virtually uncorrelated with learning results. If smile sheets were feeding us bad information, maybe we should just stop using them.

It was about three or four years ago that I saw a big client get terrible advice about their smile sheets from a well-known learning-measurement vendor. And, of course, because the vendor had an industry-wide reputation, the client almost couldn't help buying into their poor smile-sheet designs.

I concluded that smile-sheets were NOT going away. They were too entrenched and there were some good reasons to use them.

I also concluded that smile sheets could be designed to be more effective, more aligned with the research on learning, and designed to better support learners in making smile-sheet decisions.

I decided to write a shorter book than the aforementioned BIG book. That was about 2.5 years ago.

I wrote a draft of the book and I knew I had something. I got feedback from learning-measurement luminaries like Rob Brinkerhoff, Jack Phillips, and Bill Coscarelli. I got feedback from learning gurus Julie Dirksen, Clark Quinn, and Adam Neaman. I made major improvements based on the feedback from these wonderful folks. The book then went through several rounds of top-tier editing, making it a much better read.

As the publication process unfolded, I realized that I didn't have enough money on hand to fund the printing of the book. I turned to Kickstarter, and 227 people raised their hands to help, reserving over 300 books in return for their generous Kickstarter contributions. I will be forever indebted to them.

Others reached out to help as well, from people on my newsletter list, to my beloved clients, to folks in trade organizations and publications, to people I've met through the years, to people I haven't met, to followers on Twitter, to the industry luminaries who agreed to write testimonials after getting advanced drafts of the book, to family members, to friends.

Today, all the hard work, all the research, all the client work, all the love and support comes together for me in gratitude.

Thank you!

 

-- Will Thalheimer

 

P.S. To learn more about the book, or buy it:  SmileSheets.com

Kirkpatrick Model Good or Bad? The Epic Mega Battle!

Clark Quinn and I have started debating top-tier issues in the workplace learning field. In the first one, we debated who has the ultimate responsibility in our field. In the second one, we debated whether the tools in our field are up to the task.

In this third installment of the series, we've engaged in an epic battle about the worth of the 4-Level Kirkpatrick Model. Clark and I believe that these debates help elucidate critical issues in the field. I also think they help me learn. This debate still intrigues me, and I know I'll come back to it in the future to gain wisdom.

And note, Clark and I certainly haven't resolved all the issues raised. Indeed, we'd like to hear your wisdom and insights in the comments section.

--------------------------

Will:
 
I want to pick on the second-most renowned model in instructional design, the 4-Level Kirkpatrick Model. It produces some of the most damaging messaging in our industry. Here’s a short list of its treacherous triggers: (1) It completely ignores the importance of remembering to the instructional design process, (2) It pushes us learning folks away from a focus on learning—where we have the most leverage, (3) It suggests that Level 4 (organizational results) and Level 3 (behavior change) are more important than measuring learning—but this is an abdication of our responsibility for the learning results themselves, (4) It implies that Level 1 (learner opinions) are on the causal chain from training to performance, but two major meta-analyses show this to be false—smile sheets, as now utilized, are not correlated with learning results! If you force me, I’ll share a quote from a top-tier research review that damns the Kirkpatrick model with a roar. “Buy the ticket, take the ride.”

 

Clark:
 
I laud that you’re not mincing words!   And I’ll agree and disagree.  To address your concerns: 1) Kirkpatrick is essentially orthogonal to the remembering process. It’s not about learning, it’s about aligning learning to impact.  2) I also think that Kirkpatrick doesn’t push us away from learning, though it isn’t exclusive to learning (despite everyday usage). Learning isn’t the only tool, and we should be willing to use job aids (read: performance support) or any other mechanism that can impact the organizational outcome.  We need to be performance consultants! 3) Learning in and of itself isn’t important; it’s what we’re doing with it that matters. You could ensure everyone could juggle chainsaws, but unless it’s Cirque du Soleil, I wouldn’t see the relevance.

So I fully agree with Kirkpatrick on working backwards from the org problem and figuring out what we can do to improve workplace behavior.  Level 2 is about learning, which is where your concerns are, in my mind, addressed.  But then you need to go back and see if what they’re able to do now is what is going to help the org!  And I’d counter that the thing I worry about is the faith that if we do learning, it is good.  No, we need to see if that learning is impacting the org.  4) Here’s where I agree, that Level 1 (and his numbering) led people down the garden path: people seem to think it’s ok to stop at level 1!  Which is maniacal, because what learners think has essentially zero correlation with whether it’s working (as you aptly say).  So it has led to some really bad behavior, serious enough to make me think it’s time for some recreational medication!

 

Will:
 
Actually, I’m flashing back to grad school. “Orthogonal” was one of the first words I remember learning in the august halls of my alma mater. But my digression is perpendicular to this discussion, so forget about it! Here’s the thing. A model that is supposed to align learning to impact ought to have some truth about learning baked into its DNA. It’s less than half-baked, in my not-so-humble opinion.

As they might say in the movies, the Kirkpatrick Model is not one of God's own prototypes! We're responsible people, so we ought to have a model that doesn’t distract us from our most important leverage points. Working backward is fine, but we’ve got to go all the way through the causal path to get to the genesis of the learning effects. Level 1 is a distraction, not a root. Yes, Level 2 is where the K-Model puts learning, but learning back in 1959 is not the same animal that it is today. We actually have a pretty good handle on how learning works now. Any model focused on learning evaluation that omits remembering is a model with a gaping hole.

 

Clark:
 
Ok, now I’m confused.  Why should a model of impact need to have learning in its genes?  I don’t care whether you move the needle with performance support, formal learning, or magic jelly beans; what K talks about is evaluating impact.  What you measure at Level 2 is whether they can do the task in a simulated environment.  Then you see if they’re applying it at the workplace, and whether it’s having an impact.  

No argument that we have to use an approach to evaluate whether we’re having the impact at level 2 that we should, but to me that’s a separate issue.  Kirkpatrick just doesn’t care what tool we’re using, nor should it.  Kirkpatrick doesn’t care whether you’re using behavioral, cognitive, constructivist, or voodoo magic to make the impact, as long as you’re trying something.  

We should be defining our metric for level 2, arguably, to be some demonstrable performance that we think is appropriate, but I think the model can safely be ignorant of the measure we choose at level 2 and 3 and 4.  It’s about making sure we have the chain.  I’d be worried, again, that talking about learning at level 2 might let folks off the hook about level 3 and 4 (which we see all too often) and make it a matter of faith. So I’m gonna argue that including the learning into the K model is less optimal than keeping it independent. Why make it more complex than need be?  So, now, what say you?

 

Will:
 
Clark! How can you say the Kirkpatrick model is agnostic to the means of obtaining outcomes? Level 2 is “LEARNING!” It’s not performance support, it’s not management intervention, it’s not methamphetamine. Indeed, the model was focused on training.

The Kirkpatricks (Don and Jim) have argued—I’ve heard them live and in the flesh—that the four levels represent a causal pathway from 1 to 4. In addition, the notion of working backward implies that there is a causal connection between the levels. The four-level model implies that a good learner experience is necessary for learning, that learning is necessary for on-the-job behavior, and that successful on-the-job behavior is necessary for positive organizational results. Furthermore, almost everybody interprets it this way. 

The four levels imply impact at each level, but look at all the factors that they are missing! For example, learners need to be motivated to apply what they’ve learned. Where is that in the model? Motivation can be an impact too! We as learning professionals can influence motivation. There are other impacts we can make as well. We can make an impact on what learners remember, whether learners are supported back on the job, etc.

Here’s what a 2012 seminal research review from a top-tier scientific journal concluded: “The Kirkpatrick framework has a number of theoretical and practical shortcomings. [It] is antithetical to nearly 40 years of research on human learning, leads to a checklist approach to evaluation (e.g., ‘we are measuring Levels 1 and 2, so we need to measure Level 3’), and, by ignoring the actual purpose for evaluation, risks providing no information of value to stakeholders…” (p. 91). That’s pretty damning!

 

Clark:

I don’t see the Kirkpatrick model as an evaluation of the learning experience, but instead of the learning impact.   I see it as determining the effect of a programmatic intervention on an organization.  Sure, there are lots of other factors: motivation, org culture, effective leadership, but if you try to account for everything in one model you’re going to accomplish nothing.  You need some diagnostic tools, and Kirkpatrick’s model is one.

If they can’t perform appropriately at the end of the learning experience (level 2), that’s not a Kirkpatrick issue, the model just lets you know where the problem is. Once they can, and it’s not showing up in the workplace (level 3), then you get into the org factors. It is about creating a chain of impact on the organization, not evaluating the learning design.  I agree that people misuse the model, so when people only do 1 or 2, they’re wasting time and money. Kirkpatrick himself said he should’ve numbered it the other way around. 

Now if you want to argue that that, in itself, is enough reason to chuck it, fine, but let’s replace it with another impact model with a different name, but the same intent of focusing on the org impact, workplace behavior changes, and then intervention. I hear a lot of venom directed at the Kirkpatrick model, but I don’t see it as ‘antithetical to learning’.

And I worry the contrary; I see too many learning interventions done without any consideration of the impact on the organization.  Not just compliance, but ‘we need a course on X’ and they do it, without ever looking to see whether a course on X will remedy the biz problem. What I like about Kirkpatrick is that it does (properly used) put the focus on the org impact first.

 

Will:

Sounds like you’re holding on to Kirkpatrick because you like its emphasis on organizational performance. Let’s examine that for a moment. Certainly, we’d like to ensure that Intervention X produces Outcome Y. You and I agree. Hugs all around. Let’s move away from learning for a moment. Let’s go Mad Men and look at advertising. Today, advertising is very sophisticated, especially online advertising because companies can actually track click-rates, and sometimes can even track sales (for items sold online). So, in a best-case scenario, it works this way:

  • Level 1 – Web surfers say they like the advertisement.
  • Level 2 – Web surfers show comprehension by clicking on link.
  • Level 3 – Web surfers spend time reading/watching on splash page.
  • Level 4 – Web surfers buy the product offered on the splash page.

A business person’s dream! Except that only a very small portion of sales actually happen this way (although, I must admit, the rate is increasing). But let’s look at a more common example. When a car is advertised, it’s impossible to track advertising through all four levels. People who buy a car at a dealer can’t be definitively tracked to an advertisement.

So, would we damn our advertising team? Would we ask them to prove that their advertisement increased car sales? Certainly, they are likely to be asked to make the case…but it’s doubtful anybody takes those arguments seriously… and shame on folks who do!

In case I’m ignorant of how advertising works behind the scenes—which is a possibility, I’m a small “m” mad man—let me use some other organizational roles to make my case.

  • Is our legal team asked to prove that their performance in defending a lawsuit is beneficial to the company? No, everyone appreciates their worth.
  • Do our recruiters have to jump through hoops to prove that their efforts have organizational value? They certainly track their headcounts, but are they asked to prove that those hires actually do the company good? No!
  • Do our maintenance staff have to get out spreadsheets to show how their work saves on the cost of new machinery? No!
  • Do our office cleaning professionals have to utilize regression analyses to show how they’ve increased morale and productivity? No again!

There should be a certain disgust in feeling we have to defend our good work every time…when others don’t have to.

I use the Mad Men example to say that all this OVER-EMPHASIS on proving that our learning is producing organizational outcomes might be a little too much. A couple of drinks is fine, but drinking all day is likely to be disastrous.

Too many words is disastrous too…But I had to get that off my chest…

 

Clark:

I do see a real problem in communication here, because I see that the folks you cite *do* have to have an impact. They aren’t just being effective, but they have to meet some level of effectiveness. To use your examples: the legal team has to justify its activities in terms of the impact on the business. If they’re too tightened down about communications in the company, they might stifle liability, but they can also stifle innovation. And if they don’t provide suitable prevention against legal action, they’re turfed out.   Similarly, recruiters have to show that they’re not interviewing too many, or too few people, and getting the right ones. They’re held up against retention rates and other measures.  The maintenance staff does have to justify headcount against the maintenance costs, and those costs against the alternative of replacement of equipment (or outsourcing the servicing).  And the office cleaning folks have to ensure they’re meeting environmental standards at an efficient rate.  There are standards of effectiveness everywhere in the organization except L&D.  Why should we be special?

Let’s go on: sales has to estimate numbers for each quarter, and put that up against costs. They have to hit their numbers, or explain why (and if their initial estimates are low, they can be chastised for not being aggressive enough). They also worry about the costs of sales, hit rates, and time to a signature. Marketing, too, has to justify expenditure. To use your example, they do care about how many people come to the site, how long they stay, how many pages they hit, etc. And they try to improve these. At the end of the day, the marketing investment has to impact the sales. Eventually, they do track site activity to dollars. They have to. If we don’t, we get boondoggles. If you don’t rein in marketing initiatives, you get these shenanigans where existing customers are boozed up and given illegal gifts that eventually cause a backlash against the company. Shareholders get a wee bit stroppy when they find that investments aren’t paying off, and that the company is losing unnecessary money.  

It’s not a case of ‘if you build it, it is good’! You and I both know that much of what is done in the name of formal learning (and org L&D activity in general) isn’t valuable. People take orders and develop courses where a course isn’t needed. Or create learning events that don’t achieve the outcomes. Kirkpatrick is the measure that tracks learning investments back to impact on the business, and that’s something we have to start paying attention to. As someone once said, if you’re not measuring, why bother? Show me the money! And if you’re just measuring your efficiency, that your learning is having the desired behavioral change, how do you know that behavior change is necessary to the organization? And until we get out of the mode where we do the things we do on faith, and start understanding whether we have a meaningful impact on the organization, we’re going to continue to be the last to have an influence on the organization, and the first to be cut when things are tough. Yet we have the opportunity to be as critical to the success of the organization as IT! I can’t stand by seeing us continue to do learning without knowing that it’s of use. Yes, we do need to measure our learning for effectiveness as learning, as you argue, but we have to also know that what we’re helping people be able to do is what’s necessary. Kirkpatrick isn’t without flaws: numbering, level 1, etc. But it’s a clear value chain that we need to pay attention to. I’m not saying in lieu of measuring our learning effectiveness, but in addition. I can’t see it any other way.

 

Will:

Okay, I think we’ve squeezed the juice out of this tobacco. I would have said “orange” but the Kirkpatrick Model has been so addictive for so long…and black is the new orange anyway…

I want to pick up on your great examples of individuals in an organization needing to have an impact. You noted, appropriately, that everyone must have an impact. The legal team has to prevent lawsuits, recruiters have to find acceptable applicants, maintenance has to justify their worth compared to outsourcing options, cleaning staff have to meet environmental standards, sales people have to sell, and so forth.

Here is the argument I’m making: Employees should be held to account within their circles of maximum influence, and NOT so much in their circles of minimum influence.

So for example, let’s look at the legal team.

  Kirkpatrick with Clark Quinn -- Legal Team

Doesn’t it make sense that the legal team should be held to account for the number of lawsuits and amount paid in damages more than they should be held to account for the level of innovation and risk taking within the organization?

What about the cleaning professionals?

Kirkpatrick with Clark Quinn -- Cleaning Professionals

Shouldn’t we hold them more accountable for measures of perceived cleanliness and targeted environmental standards than for the productivity of the workforce?

What about us learning-and-performance professionals?

Kirkpatrick with Clark Quinn -- Learning and Performance

Shouldn’t we be held more accountable for whether our learners comprehend and remember what we’ve taught them more than whether they end up increasing revenue and lowering expenses?

I agree that we learning-and-performance professionals have NOT been properly held to account. As you say, “There are standards of effectiveness everywhere in the organization except L&D.” My argument is that we, as learning-and-performance professionals, should have better standards of effectiveness—but that we should have these largely within our maximum circles of influence.

Among other things, we should be held to account for the following impacts:

  • Whether our learning interventions create full comprehension of the learning concepts.
  • Whether they create decision-making competence.
  • Whether they create and sustain remembering.
  • Whether they promote a motivation and sense-of-efficacy to apply what was learned.
  • Whether they prompt actions directly, particularly when job aids and performance support are more effective.
  • Whether they enable successful on-the-job performance.
  • Et cetera.

Final word, Clark?

 

Clark:

First, I think you’re hoist by your own petard.  You’re comparing apples and your squeezed orange. Legal is measured by lawsuits, maintenance by cleanliness, and learning by learning. Ok that sounds good, except that legal is measured by lawsuits against the organization. And maintenance is measured by the cleanliness of the premises.  Where’s the learning equivalent?  It has to be: impact on decisions that affect organizational outcomes.  None of the classic learning evaluations evaluate whether the objectives are right, which is what Kirkpatrick does. They assume that, basically, and then evaluate whether they achieve the objective.  

That said, Will, if you can throw around diagrams, I can too. Here’s my attempt to represent the dichotomy. Yes, you’re successfully addressing the impact of the learning on the learner. That is, can they do the task. But I’m going to argue that that’s not what Kirkpatrick is for. It’s to address the impact of the intervention on the organization. The big problem is, to me, whether the objectives we’ve developed the learning to achieve are objectives that are aligned with organizational need. There’s plenty of evidence they’re not.

Clark's Blog Diagram for Kirkpatrick

So here I’m trying to show what I see K doing. You start with the needed business impact: more sales, fewer compliance problems, what have you. Then you decide what has to happen in the workplace to move that needle. Say, shorter time to sales, so the behavior is decided to be timeliness in producing proposals. Let’s say the intervention is training on the proposal template software. You design a learning experience to address that objective, to develop ability to use the software. You use the type of evaluation you’re talking about to see if it’s actually developing their ability. Then you use K to see if it’s actually being used in the workplace (are people using the software to create proposals), and then to see if it’s affecting your metrics of quicker turnaround. (And, yes, you can see if they like the learning experience, and adjust that.)

And if any one element isn’t working: learning, uptake, impact, you debug that.  But K is evaluating the impact process, not the learning design. It should flag if the learning design isn’t working, but it’s not evaluating your pedagogical decisions, etc. It’s not focusing on what the Serious eLearning Manifesto cares about, for instance. That’s what your learning evaluations do, they check to see if the level 2 is working. But not whether level 2 is affecting level 4, which is what ultimately needs to happen. Yes, we need level 2 to work, but then the rest has to fall in line as well. 

My point about orthogonality is that K is evaluating the horizontal, and you’re saying it should address the vertical. That, to me, is like saying we’re going to see if the car runs by ensuring the engine runs. Even if it does, if the engine isn’t connected through the drivetrain to the wheels, it’s irrelevant. So we do want a working, well-tuned engine, but we also want a clutch or torque converter, transmission, universal joint, driveshaft, differential, etc. Kirkpatrick looks at the drivetrain, learning evaluations look at the engine.

We don’t have to come to a shared understanding, but I hope this at least makes my point clear. 

 

Will:

Okay readers! Clark and I have fought to a stalemate… He says that the Kirkpatrick model has value because it reminds us to work backward from organizational results. I say the model is fatally flawed because it doesn’t incorporate wisdom about learning. Now it’s your turn to comment. Can you add insights? Please do!

 

New Website Launched…To Promote the Practice of Learning Audits

Last week I launched the website LearningAudit.com to promote the practice of learning audits.

It is my passionate belief that our learning interventions would be tremendously improved if we took a research-based systematic approach to reviewing them. LearningAudit.com is dedicated to the proposition that we can all do this.

On the site there is the research-to-practice report, "How to Conduct a Learning Audit" and a job aid to support the learning-audit process.