The following is a transcript of the presentation video, edited for clarity.
I’m very honored to have been invited to speak with you this evening about a topic that is near and dear to me. And one that I’ve been working on quite a long time. I developed my first patient reported outcome instrument and published that work as part of my dissertation in 1978 and 1979. And I’ve been doing a lot of measurement development over the years. And so, I have to say, it’s been very exciting to see patient reported outcomes and patient centered care becoming much more mainstream. And back when I began it was really seen as much too soft to be really considered serious science. But those days have changed and I tell you it’s been very encouraging to see the growing interest in patient reported outcomes.
What I thought I would try to do this evening is give you some sense of my views on what PROs or PROMs are, and provide a context for thinking about PROs and what’s going on in the challenging healthcare system in which all of you are working. And describe some of what I see as very exciting innovations that allow us to develop better PROs than we had in the early years when, as Margaret was describing, we kind of developed PROs on a napkin: very clinically grounded, very useful in their time. But the science of developing PROs has gotten much more sophisticated. And I’ll talk a little bit about that this evening, and I’ll try to talk about some applications and examples to try to make it a little more interesting.
So why do I think this topic is important? I think one of the reasons it’s become so much more mainstream is we are enjoying, and I use that term very intentionally, we’re enjoying a very turbulent time in our society with respect to what’s going on in the healthcare system. There’s a tremendous amount of change afoot. I know change can be very disturbing and upsetting for people. But what I’m noticing is that change creates tremendous opportunities to introduce innovation into systems that otherwise can be very resistant to change and innovation, and that’s why I think it’s such an exciting time right now for those of us who are interested in doing research in healthcare.
But it can be very challenging. Just to give you one example, in January the HHS secretary announced a national plan that’s going to move us away from fee for service in Medicare toward more innovative value-based payment models, including accountable care organizations, bundled payment arrangements, and medical homes. These are entities that 10 years ago didn’t really exist in our country, and it’s clear the federal government is moving very strongly in this direction. She’s seeking to tie as much as 50% of traditional fee for service payments to these alternative models by the end of 2018. Who knows if that timeframe is realistic, but there’s no question all types of efforts to introduce and change incentives for the care that’s being provided are occurring today.
How Should Our Professions Be Responding?
So how should our professions be responding? This is, I think, where PROs as well as other forms of data can be extremely useful to us, and that’s really where I want to focus my remarks this evening. I want to do it in the context of systems thinking, and for this I draw heavily on a surgeon, Atul Gawande, whom many of you may know. He works at Brigham and Women’s Hospital in Boston. He’s written some really excellent books on many different topics in healthcare. He came up with a checklist for surgery to try to reduce errors in surgeries. He’s a very gifted writer. He’s a very practical individual. He’s written a recent book on end of life, and he also writes a lot for The New Yorker, where those of you who follow it may have seen his work.
And back in 2011, this was my first introduction to Gawande, he wrote a piece called “Cowboys and Pit Crews,” and he was talking about medicine, not speech, language and hearing, or physical therapy, or rehab in general. And he said that the culture of his profession of medicine was really one of cowboys, where everyone was kind of working in isolation and the value was around autonomy and independence. And that what we needed to do is become much more like pit crews, where we’re working in interconnected teams. And I must say he hit a responsive chord for me, because I am a physical therapist by clinical background, and this has been a very strong theme in my profession for many years now: to become more autonomous, more independent, more disconnected, and not having to work under the direction of physicians. And I thought Gawande really put his finger on it. And what he argued is that if we’re going to become less like cowboys and more like pit crews, what we need to do is understand how to think in terms of systems, not just in terms of our clinical practice. And that’s where I think PROs really fit in very nicely.
He argued three major pieces in systems thinking, and the first one was becoming interested in data. And the second was learning and developing skills to use the data to address system problems, come up with better solutions. And then finally, and in my view the most challenging one was to become much more skillful at disseminating innovations that change the way in which we practice on a larger scale. And I want to talk briefly about each of these tonight and show how PROs fit into this way of thinking.
Data Interest
So the first is interest in data. To start with, I’ll draw on David Cutler, an economist who works at Harvard. He argues that healthcare is the most information-intense industry in our society, in our economy, but it uses IT or data the least, and I think he’s really right, if you think about it. Data is so important to healthcare in all of our professions, and we’re very unsophisticated about it. And he says the primary emphasis in improving quality of care, in economics and in healthcare today, is really focused on cost containment. The regulatory approach is focused much more on cost than on quality, and his argument is that we need to put much more effort into looking at the benefits of what we’re doing and the care that we provide to our patients. And that’s very consistent with where HHS is going, which is to provide more incentives for us to focus not only on cost but also on benefits. And Cutler argues that’s far more difficult than looking at cost.
And so to do that I think we need to become much more interested intrinsically in data, and not just the data themselves, but learning how to use data to create new information to answer important questions. So we need to learn how to do this in real time, not as something that we do outside of our practice, but as something that we do as an intrinsic part of our practice. In my experience in working with clinicians over the years, particularly in organizations that are accredited, we all know we have to collect data, particularly for accrediting agencies. And we get big three-ring notebooks full of data, and when the accreditors come we bring all that stuff out, and they look at it, and we all nod. And the accreditors go home and it goes back up on the shelf, and we go back to our work until the next time these crazy accreditors come to visit us.
So I’m not talking about that kind of data collection. I’m talking about collecting data and converting it to information because it’s important to our practice. And that’s what Gawande is arguing. And specifically not for the purpose of trying to prove that what we’re doing works. I hear that all the time in my clinical field of physical therapy; people talk about trying to prove that physical therapy is beneficial. I think that’s asking the wrong question. That’s not what we should be trying to do. We should be trying to understand which parts of what we do as professionals work. Discovering what works for what patients, under what conditions, in what circumstances, to achieve what outcomes, and at what cost.
I think a lot of times we’re very insecure when we talk about trying to prove what we do is valuable and works. I don’t ever hear our colleagues in medicine do that. They don’t talk about trying to prove that medicine works. They’re much more secure in understanding that some things that they do are really not very beneficial, and that we need to identify what those are and stop doing them. And identifying those things that are most valuable, that’s where data becomes intrinsically important to the practice that we do in our respective professions. And I think that’s what Gawande was arguing. And I think this dovetails really beautifully into the growing role of patient reported outcomes. As we move toward a more patient centered approach to providing healthcare, incorporating the view of the patient in looking at where we’re effective and where we’re not is, I think, the sweet spot for PROs.
So the PROs represent the impact of a health condition on our clients’ lives. PROs are a measurement of any aspect of a patient’s health status that comes directly from the person, without the interpretation of the person’s responses by a clinician or another individual. Information comes directly from the individual him or herself.
PROs offer a structured approach, a structured technique that allows us to minimize the error involved in gaining this information and to maximize consistency, ultimately providing a more reliable and hopefully more valid measurement of important outcomes. In other words, I’m not talking about just interviewing the patient and getting the patient’s impressions, I’m talking about using standardized techniques to really get highly consistent, reproducible, valid information about the impact that we have on our patients.
And PROs are particularly useful because many aspects of our care are known only to the client. Even the FDA has come to recognize this, and the FDA is now beginning to use PROs in their work. Ten years ago, when I was consulting with groups going up in front of the FDA, we were hitting a brick wall; they wanted nothing to do with PROs. And there’s been a sea change, and the FDA is now recognizing the importance of PROs in looking at the impact of various interventions. And it gets us away from this, the quotation on this little cartoon, “Are you pissing and moaning or can you verify what you’re saying with data?” I think this kind of puts a finger on it. It’s extremely important in the work that we do.
Now, as I look back across my career, it’s quite clear that the psychometrics, and the way in which we develop PROs has changed fundamentally over the past 20, 30 years. And there’s been a lot of improvement that leads to much better approaches to documenting the impact of care on patient outcomes. It’s been increasingly formalized, there are very standardized techniques for developing and evaluating these tools. There’s much better linkage between concepts and the PROs themselves. The early PROs were very empirical and not very conceptual at all. Clinicians identified areas that seemed to be important and relevant to the care that they provide, they developed questions around those seemingly important areas and they really weren’t very conceptually grounded.
PROs today are much more conceptually grounded, using frameworks like the International Classification of Functioning, Disability and Health, as well as many others. There’s been much better qualitative work with focus groups of patients and clinicians in developing PROs so that they have really good content validity. A lot of the early tools did not have that.
And ePRO methods are increasingly available, and eventually we’re going to get to the point where we have the infrastructure in our healthcare system to really allow us to collect this kind of information. And what I’m finding is that, although we talk about the introduction of the electronic medical record, we’re not at the point yet where we can easily incorporate standardized PROs into most healthcare systems that I’ve run into in our country. We’re just not there yet.
The NIH has invested heavily in PROs through the PROMIS initiative that some of you may be aware of, as well as the Neuro-QoL initiative and other initiatives. For a long time the NIH was not particularly interested in funding research on the development of PROs, and over the past 10 years that has changed dramatically and there’s been a tremendous investment by the NIH into the development of standardized PROs. So a lot has changed in the whole science and development of PROs.
There’s one pair of advances that I want to share with you, that I know some of you are quite familiar with, that I think is very exciting in helping us address one of the major problems that has really held us back on PROs, having to do with the balance between psychometric quality and the burden that we place on our patients in trying to collect this information. I saw early in my career that we can develop good, psychometrically adequate instruments, but the problem is they became very long and burdensome in order to really capture what it was we were trying to understand. And, therefore, clinicians didn’t want to use them, for good reason, because they wanted to have time to work with their patients, not just assess their status using PROs. So this has been a real tension in the area of PROs for many years, particularly in using classical test theory.
I came up learning how to do measurement using classical test theory. I think most of us did. It’s been widely used; it has been the predominant way in which we developed PROs in the early years. And with a classical approach to measurement you basically develop a fixed set of items that goes into what we call an instrument, and you present that fixed set of items to a clinician or a client, regardless of whether or not all of those items are appropriate. If you have 36 items in an instrument, you present all 36 items to the client, and then you either assign or generate some kind of summative score that reflects where the patient is across all of those items. And you try to do it in a way that maximizes the true score and minimizes error. That’s classical test theory.
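As a minimal sketch of the fixed-form, summative scoring just described: the example below assumes a hypothetical 36-item instrument in which every item is rated from 1 to 5 and the raw sum is rescaled linearly to 0 to 100. The rating range, the rescaling, and the function name are illustrative assumptions, not the scoring rule of any particular instrument.

```python
def summative_score(responses, min_rating=1, max_rating=5):
    """Classical fixed-form scoring: every item is administered and summed.

    The raw sum is rescaled linearly to 0-100. The rating range and the
    rescaling are illustrative assumptions, not a published scoring rule.
    """
    if not responses:
        raise ValueError("a fixed-form instrument needs every item answered")
    for r in responses:
        if not min_rating <= r <= max_rating:
            raise ValueError(f"rating {r} is outside {min_rating}-{max_rating}")
    raw = sum(responses)
    lowest = min_rating * len(responses)
    highest = max_rating * len(responses)
    return 100.0 * (raw - lowest) / (highest - lowest)


# Hypothetical patient: all 36 items are presented, relevant or not.
answers = [4] * 30 + [3] * 6
score = summative_score(answers)
```

Note that the score depends on the whole fixed set: dropping or swapping items changes what the number means, which is part of why classical instruments are administered in full.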
This is kind of a graphic illustrating how classical test theory works. It’s a linear approach to measurement. You administer all the items, you get a response to each item from the patient, and then you generate a score, in this case 73 on a scale of zero to 100. It’s very linear in its approach. You start with the first one, you go through them all, you generate a score. Now you run into a couple of problems using classical test theory. The first is illustrated on the left here, and that is, if you’re worried about the length of an instrument, what usually happens is you develop a short form, and you get a very crude instrument with a lot of gaps in between different levels of what it is you’re trying to measure, which makes it very difficult to detect change using such an instrument.
An example would be the Short Form 12 or the Short Form 36, which are widely used across the world today to look at health status outcomes, but they’re extremely crude, and there are problems with not being able to detect change. The other problem, illustrated on the right of the slide, is that to get around the concern of having too many questions you focus on a narrow range of the continuum that you’re trying to understand, and you get a much finer gradation of understanding. But the problem is, if you move into a patient population that is located elsewhere on that continuum, you’re going to get a lot of ceiling effects, with everyone at the top of the scale, or floor effects, with everyone at the bottom of the scale. And, again, it’s not going to be very useful in looking at change and what’s going on with our patients.
So there are real problems in using classical test theory, which is why contemporary measurement theory is an extremely attractive solution for some of these problems. There are two innovations that we have taken from the field of education and have begun to use in healthcare measurement, particularly in the area of PROs, and the first one is item response theory. The most commonly used model is called the Rasch model; it’s one of many different item response models. It’s used to build instruments, and it’s a different approach than classical test theory. The outcome scores are item based and not test based, as I described with classical test theory. You spend a lot of time developing items that you put into an instrument, looking at the person’s level of an outcome and the characteristics of each item and how difficult each item is on whatever the continuum is you’re trying to measure. And then you generate outcome scores based on probability models that represent the likelihood that a person will respond in a particular way, given their ability level on the underlying outcome you’re trying to understand.
So it’s a very attractive approach, and it looks something like this. If you’re trying to measure an outcome such as physical function, you build a lot of items. And if you think about the continuum as a ruler, you try to write and build items that cover the full range of that continuum so you can really understand what’s happening to a patient on a very broad continuum, from very low ability to very high ability.
This approach works extremely well to develop very precise, very accurate, and very sensitive measures. The problem is that item response theory on its own also produces very long instruments. It’s the second innovation, tied with IRT, that really becomes useful, and it’s called computerized adaptive testing. This is the technique that you use for administering IRT instruments. It allows you to take very broad based instruments that cover a broad range of ability on the outcomes that you’re interested in, but administer only a small subset of the items in each instrument to get an accurate score of where that patient is on that outcome. The combination of the IRT instrument and the very efficient administration using computerized adaptive testing, or CAT, really results in a tremendous innovation: achieving sensitivity to change while maximizing the feasibility of doing short assessments in a clinical context. So it integrates IRT with computerized administration, and this is the direction in which many PROs are going. For example, the PROMIS initiative is all CAT based.
So here is the way it works with computerized adaptive testing, and, again, this technique is taken directly from the education field, where they’ve been administering standardized tests using CAT for many years now. Computer algorithms select questions from the underlying IRT pool based on responses to prior questions. So every time a patient responds to a question and gives a response to an item, the algorithm goes into the pool, picks the next best item, and continues to do that until you get to the solution that you’re looking for, which is a very different approach than classical test theory, which is very linear.
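A simple way to see a ceiling or floor effect in a set of scores is to count how many patients sit at the very top or bottom of the scale. The sketch below assumes a 0 to 100 scale; the scores are made up for illustration, and the function name is hypothetical.

```python
def floor_ceiling_rates(scores, scale_min=0, scale_max=100):
    """Return the share of scores at the bottom and at the top of the scale."""
    if not scores:
        raise ValueError("no scores supplied")
    n = len(scores)
    at_floor = sum(1 for s in scores if s == scale_min) / n
    at_ceiling = sum(1 for s in scores if s == scale_max) / n
    return at_floor, at_ceiling


# Made-up scores from a high-functioning group measured with an instrument
# that only covers the lower part of the continuum.
scores = [100, 100, 95, 100, 80, 100, 100, 60, 100, 100]
floor_rate, ceiling_rate = floor_ceiling_rates(scores)
```

When most of a group is already at the maximum, as in this made-up sample, the instrument has no room left to register improvement.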
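To make the probability model concrete: in the standard dichotomous Rasch model, the chance that a person endorses an item depends only on the gap between the person’s ability and the item’s difficulty, both expressed on the same logit scale, as P = exp(ability − difficulty) / (1 + exp(ability − difficulty)). The item names and difficulty values below are hypothetical, chosen only to illustrate the idea.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of endorsing a yes/no item under the Rasch model.

    Both arguments are on the same logit scale.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Hypothetical item difficulties on a physical-function continuum (logits).
items = {
    "walk across a room": -2.0,
    "climb one flight of stairs": 0.0,
    "run a mile": 2.5,
}
person_ability = 0.0
probabilities = {
    name: rasch_probability(person_ability, b) for name, b in items.items()
}
```

A person whose ability equals an item’s difficulty has a 50% chance of endorsing it; easier items are more likely to be endorsed and harder items less likely.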
So the measurement is adapted to each individual patient. So you may in fact end up administering completely different items to each of your patients, which is something clinicians aren’t used to. It takes a little getting used to. We’re used to administering the same items over time in order to see whether or not a patient has changed. With CAT and IRT, many times you end up administering different items if the patient has changed, because they’re no longer at the same point on that continuum, they’re either higher or lower, and the algorithm picks different items to help you figure out where they are.
So it’s a different paradigm of assessment. What’s nice about CAT administration is it skips irrelevant items that don’t apply to your particular patient. So it not only saves time, but it’s more respectful of the patient. You’re not asking them for irrelevant information. And you can get precision and feasibility together, which makes it very attractive. It’s very iterative.
I’ve got a little graphic here that shows the difference between an IRT approach to measurement versus classical test theory. So you start with an individual item, usually in the middle of the range of ability, and you get a score on that first item, then the algorithm picks the next item. In this case it’s a more difficult item. You get a score and then the algorithm picks the next item. And you can see the algorithm is picking an item in the middle, between the first and the second. And then you pick another item, and then finally you get to the score. You don’t have to ask all the items, you just ask items until you’ve homed in on where that individual is on that continuum. So you save a lot of time and burden, it’s more respectful of the patient, and you get to the place you’re trying to go in the measurement.
The other advantage of IRT/CAT instruments is that they take a lot of the typical ordinal scales that we use in clinical PRO measurement and convert them to an interval level, which allows you to do more statistical manipulation. It allows you to maximize precision across a broader range of an outcome that you’re interested in. It also has a lot of flexibility. You can use different response scales. As long as you can put them on the same underlying metric it works extremely well. You don’t have to use all one type. Therefore you can reduce floor and ceiling effects, and you can reduce burden, cost of administration, and clinical time. So it has a lot of benefits; it’s highly efficient.
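The adaptive loop described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the algorithm used by PROMIS or any other system: it assumes a Rasch item bank with hypothetical difficulties, estimates ability as a posterior mean over a grid with a standard normal prior, picks the unused item whose difficulty is closest to the current estimate, and stops once the posterior standard deviation falls below a chosen target or an item cap is reached. All names and thresholds are illustrative.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a 'yes' response under the Rasch model (logit scale)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def estimate_ability(administered, grid):
    """Posterior mean and SD of ability over a grid, standard normal prior."""
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)  # unnormalized N(0, 1) prior
        for difficulty, answered_yes in administered:
            p = rasch_probability(theta, difficulty)
            w *= p if answered_yes else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, math.sqrt(var)


def run_cat(bank, answer, se_target=0.6, max_items=15):
    """Administer items adaptively until the estimate is precise enough."""
    grid = [x / 10.0 for x in range(-40, 41)]
    remaining = list(bank)
    administered = []
    theta, se = 0.0, float("inf")  # start in the middle of the range
    while remaining and len(administered) < max_items and se > se_target:
        # Under the Rasch model the most informative item is the one whose
        # difficulty is closest to the current ability estimate.
        item = min(remaining, key=lambda b: abs(b - theta))
        remaining.remove(item)
        administered.append((item, answer(item)))
        theta, se = estimate_ability(administered, grid)
    return theta, se, [b for b, _ in administered]


# Hypothetical 41-item bank spread evenly from -4 to +4 logits.
bank = [x / 5.0 for x in range(-20, 21)]
# Simulated patient with true ability 1.3 who endorses every easier item.
theta, se, used = run_cat(bank, answer=lambda difficulty: difficulty <= 1.3)
```

Because the first answer is a “yes,” the second item selected is harder than the first, mirroring the graphic: each response moves the estimate, and the next item is chosen near wherever the estimate now sits. Only a fraction of the bank is ever administered.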
Using Data to Devise Solutions
So from the point of view of becoming intrinsically interested in data, PROs using these more contemporary techniques, I think, offer clinicians the opportunity to get the information that they need but do it in a way that’s much more feasible. Which leads me to the second part of Gawande’s model that I want to talk briefly about, and that is becoming more sophisticated in using data to solve the systems problems that we’re faced with today.
This is a little table I’ve put together to help think about different ways in which we use data in our work as clinicians. We use data within the clinical encounter, we use data outside the clinical encounter. We can use data at the individual level, Mrs. Jones, or we can use data at the aggregate level.
So at the individual level within a clinical encounter we use data all the time, to screen patients, to monitor whether or not our patient has changed, to help us in care planning.
We also use individual data outside the clinical encounter. For example, if you’re doing discharge planning with other professionals you might bring data about Mrs. Jones to the discharge planning. We use individual patient data, client data all the time.
We don’t use aggregate data as much, but there are many useful ways, particularly from a systems perspective in using aggregate data within the clinical encounter. For example, using evidence from the literature can help you work with a patient in deciding what’s an appropriate treatment approach to use. Aggregate data can also be extremely useful for generating prognostic indications when a patient first comes to you.
This is being done a lot in the field of physical therapy now. We’ve created databases that allow us to generate very simple prediction models. When a patient comes to be assessed for clinical care in physical therapy, if you just know a little bit about them, you can predict how many visits they’re likely to need and how much improvement they’re likely to show based on looking at aggregate data in your database.
Outside the clinical encounter these data are being used more and more for clinical staffing, continuous quality improvement, marketing our services, and reducing practice variation. In my experience we’re much more comfortable using data at the individual level; that’s really our comfort zone. When I work with clinicians, most of the time they want data that they can use for care planning, for monitoring their patient, to help screen their patient. They’re much less comfortable thinking from a systems perspective and using aggregate data to devise solutions to system problems, such as treatment decision aids, prognosis, quality monitoring, and identifying best practices. And this is the area where I think PROs can be increasingly useful to us to help us move our practice forward.
One of the ways I like to think about it, and I think this also comes from Gawande, is that he talks about positive deviants. In all of our fields we have people who are practicing at an extremely high level, and data can help us identify who these positive deviants are so that we can begin to lay a foundation, a culture, of innovation and quality improvement in practice within our professions. In physical therapy we’re doing this a lot more; we’re holding national conferences where we’re identifying people doing really innovative stuff around the country. We’re bringing them together. We’re allowing them to showcase the kinds of innovations that they’re doing so that they’re more visible within the field and begin to influence the practice of other individuals. So the use of data to identify positive deviants I think can be extremely useful. And PROs can have an important role in generating the kind of data that we need to identify these deviants in our professions.
Lots of fields are doing it, and I don’t have time tonight to go into it in any detail, but this is where data registries can be extremely useful and important. The folks at Dartmouth-Hitchcock were early into this area. Back in 2010 they started the High Value Healthcare Collaborative. They brought together a select few institutions around the country. They started it with Mayo, Denver Health, and Cleveland Clinic, and they began to collect standardized data together across those institutions to begin to identify who the positive deviants were across those five institutions. Their goal was to try to improve healthcare, to lower costs, and to move best practices out to the national provider community. And this has begun to grow in medicine. In 2013 it was expanded to 19 different institutions that have all agreed that they’re going to collect information in the same way to begin to ask important systems questions.
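The simplest version of such a prediction is just an average over past patients who look like the new one. The sketch below is only an illustration of that idea: the records are made up, the two matching characteristics (condition and age band) are assumptions, and real registries typically use richer, risk-adjusted models.

```python
from statistics import mean

# Made-up aggregate records for illustration only:
# (condition, age_band, visits, functional_gain)
past_patients = [
    ("low back pain", "40-64", 8, 12.0),
    ("low back pain", "40-64", 10, 15.0),
    ("low back pain", "65+", 12, 9.0),
    ("knee replacement", "65+", 14, 20.0),
    ("knee replacement", "65+", 16, 24.0),
]


def predict(condition, age_band, records):
    """Average the outcomes of past patients who share the same characteristics."""
    similar = [r for r in records if r[0] == condition and r[1] == age_band]
    if not similar:
        return None  # no comparable patients in the database
    return {
        "n": len(similar),
        "expected_visits": mean(r[2] for r in similar),
        "expected_gain": mean(r[3] for r in similar),
    }


forecast = predict("low back pain", "40-64", past_patients)
```

Returning the number of matching patients alongside the averages is a deliberate choice: a forecast based on two past patients deserves far less weight than one based on two hundred.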
Disseminating at Scale to Change Practice
Now the third element, and the last one that I want to talk about tonight, is the challenge of dissemination. It’s one thing to begin to use data intrinsically to begin to identify our positive deviants, but those of you who have been practicing for any period of time know the big challenge is trying to change the norms by which we practice on a broad scale. And this is the challenge that I think we really need to attend to much more in the years ahead. Don Berwick has been on top of this for many years now. He’s a pediatrician, and was based at Harvard Medical School for many years. He became the head of the Centers for Medicare and Medicaid Services for several years. And many years ago he started an institute called the Institute for Healthcare Improvement, where he was really trying to help identify positive deviants and, from a systems approach, disseminate that deviation on a broad scale. And he has argued in his writings that in healthcare, invention is extremely hard, but in Berwick’s view dissemination is even harder. And from my experience this is spot on. And he argues we need a much more coordinated deployment of practice innovations on a large scale. And this is where I think professional organizations could be very helpful and very effective, so that we can become better at disseminating innovations that we’ve identified.
Gawande also argues in another piece that he wrote in 2013 that we have to move away from regulation if our goal is to change the norms of how we practice. He argues that diffusion is a social process, and I think he is spot on. His argument is that penalties and incentives won’t achieve cultural change, which is what we’re talking about. That regulation is not the way to go if we’re going to change the culture of how we practice. It’s a social process. The challenge is to get to the point where X, whatever the innovation is, becomes the norm of what we do in our professions. And so to create norms we need to understand the existing norms and the barriers to changing them, that’s what Gawande is arguing. And many others, like Everett Rogers, have written about this for years. Mass media can be useful, regulations and penalties and incentives, maybe, but fundamentally, if we’re going to achieve large-scale dissemination of innovations we’ve got to take a much more social approach.
What do I mean by a social approach? I’m going to use a few examples that I’ve been exposed to. The Agricultural Extension Service is an approach to large-scale innovation dissemination. It’s basically the application of scientific research and new knowledge of agricultural practices at a local level. People from universities going to the local communities, working with farmers, working with future farmers, to try to institute change, innovation, into how farmers were practicing. Very social in the approach. Very much based on local face-to-face networks. And I think very innovatively working with the youth so that the next generation gets socialized early on.
In our own field of healthcare, the pharmaceutical industry figured this out a long time ago in terms of detailing. It’s a very effective approach to changing physician prescribing behavior, and it’s totally based on this social approach. They talk about the rule of seven touches. You personally touch a doctor seven times, they get to know you, they get to trust you, and then they begin to change the kind of practice patterns that the pharmaceutical industry is trying to get them to change.
Now this is not an inexpensive way to do it, but it’s a very effective social approach to instituting innovations on a much broader scale. And this is where Berwick has been working for many years. In his Institute for Healthcare Improvement he’s been using these collaborative models where he brings providers together from all over the country for a short period of time, six to 15 months, and they develop a learning system where they come together and they talk about systems thinking. They go home and begin to collect data in their local environment. They bring the data back to the next meeting, and they begin to develop a learning culture as part of the network that he brings together around topics such as low back pain and so forth. And that’s the approach that he tries to use to institute innovation on a much broader scale.
Conclusion
So in conclusion, in my view, systems thinking must become a much more important approach to how we practice, and I believe PROs can play an extremely important role, as well as other data, to assess the outcomes that we’re trying to achieve. We need to develop things such as data registries so we can identify innovative solutions to systems problems, and then we can begin to move to the next level and that is dissemination. And in my view, academic programs need to learn how to teach future clinicians the importance of systems thinking and the skills that clinicians of the future are going to need in order to really flourish in the kind of challenging healthcare environments that we’re going to be working in in the years ahead. Thank you.