Measuring the Measurement

Physicians Wary of Performance-Rating Schemes

Cover Story - July 2007  

By Ken Ortolon
Senior Editor

When Austin gynecologist Patricia A. Gunter, MD, received stars from UnitedHealthcare's Premium Designation Program for appropriate use of Pap smears, mammography, and chlamydia screening, she was pleased.

"Those are recognized health screening tools that are easy to measure," Dr. Gunter said. "So that didn't bother me."

A few weeks later, however, Blue Cross and Blue Shield of Texas informed her that she would not be accepted as a provider in its BlueChoice Solutions network because her utilization of office visits and laboratory tests exceeded that of other Austin physicians in her specialty. When she looked at the data upon which the Blue Cross decision was made, though, she was surprised to find that Blue Cross said she was spending too much money on two of the things for which United had recognized her: Pap smears and sexually transmitted disease (STD) screening.

"I thought, 'Well this is backwards,'" Dr. Gunter said. "United gives me two stars because I do a good job of doing Pap smears and STD tests, and then this other company won't let me participate in their network because I do too many Pap smears and STD tests. That made me mad, particularly because I was offering that insurance company to my family and my employees."

Dr. Gunter was even more confused earlier this year when Blue Cross launched its new BlueCompare physician performance-rating program and gave her a dark blue ribbon, its highest quality rating.

She's not alone in receiving confusing ratings from health insurance plans.

"I was down-rated for not doing chlamydia screens on women of childbearing age. I am a pediatrician," exclaimed one physician, after seeing his ranking on one insurer's list. The physician was responding to a TMA survey. "I am appalled at the idea that they lead patients to believe they're assessing quality without ever seeing a chart."

Charlotte Smith, MD, an Austin physical medicine specialist, also encountered flaws in the ranking system, which she is still trying to rectify after months of effort.

"I was (and continue to be) harassed by an agent of a tiered network who is sending me multitudes of 'friendly reminders' to do diagnostic tests, and order medications and other interventions for a patient in their network," Dr. Smith explained. The problem is the patient is not under her care.

"I've never seen nor met this patient. Yet on the basis of my so-called 'care' for this patient I've never seen, I was awarded a 'black' ribbon [a poor score], which was then changed to a gray ribbon and then no ribbon," she said. "What's more, the stuff they are 'reminding' me to do is completely outside the scope of what a physician of my specialty should do (and, in fact, would probably constitute malpractice if I did)."

She added that a patient's treatment "must be based on the evidence-based science, not … a health plan's balance sheet."

Such systems, physicians assert, paint inaccurate pictures and are ripe for errors.

Physician performance-rating programs, such as United's Premium Designation Program and BlueCompare, are the latest generation of health insurance industry efforts to rein in costs. Most rely on claims data to purportedly measure physicians' cost efficiency and quality of care.

But physicians are concerned these programs amount to nothing more than economic credentialing and argue that scientific studies have proven claims data alone are inadequate to truly judge quality of care.

They're also worried that the rating systems will confuse patients.

"I fear that people will choose doctors solely on the discount that insurance companies give them upfront when it might not be in their best interest," said Austin orthopedic surgeon C. Bruce Malone, MD, chair of the Texas Medical Association Board of Trustees. "The most expensive care is not necessarily the best, but neither is the cheapest."

He, too, was surprised by what he found on one health plan's panel.

"Some doctors listed on the panel have been retired for several years or even dead for years. And my data included a report on my partner, who has been retired for three years."

Still, TMA officials warn that health plans are under such intense pressure from employers to hold down premium costs that these rating schemes are likely here to stay. And they urge physicians to arm themselves to combat potentially inaccurate ratings by learning to mine data from their own practices.

"Physicians have got to educate themselves on what individual health plans are measuring and do their best to measure it themselves," said Houston neonatologist Michael Speer, MD, who chaired the TMA Special Committee on Physician Performance, which played a large role in TMA's successful effort earlier this year to delay implementation of and then modify the BlueCompare program.

During the session that ended in May, TMA urged the Texas Legislature to pass legislation to force health plans to use evidence-based tools to measure the quality and efficiency of care. Unfortunately, that bill failed to pass late in the session. 

Applying the Pressure

Both TMA leaders and health plan executives say large employers looking to cut their operating costs by controlling health insurance premiums for their employees are driving the push for performance measurement.

"As I have visited with many large customers, their concern is that they are not getting the value they think they should with the dollars they're spending on health care," said Paul Handel, MD, vice president and chief medical officer for Blue Cross and Blue Shield of Texas. "The transparency programs are being pushed by the employers onto the insurance carriers to do something to help better define quality and cost effectiveness."

But it's not just employers. Government payers, such as Medicare and Medicaid, also are jumping on the performance measurement bandwagon.

"There is a general sense on the part of the payers, both public and private, that the value [in health care] is not good," said David Gregg, MD, senior consultant and principal of Mercer Health and Benefits LLC, a division of Mercer Human Resource Consulting. The company consults with both employers and health plans on employee health benefit issues. Mercer was involved in creating a performance measurement program in Massachusetts called the Clinical Performance Improvement Initiative in 2003. More recently, it launched a national database of claims data called Care Focused Purchasing in partnership with large employers and health plans. (See "Mercer, Employers Launch Nationwide Claims Database.")

Dr. Gregg says several recent studies have led payers to conclude that they are not getting value for their premium dollar. These include a RAND Corp. study that found that only about 55 percent of patients receive what medical literature has concluded are "best practices," a Dartmouth Atlas report outlining wide variances in the cost of care across the country, and the Institute of Medicine report, To Err Is Human, that said up to 98,000 patients die each year from medical errors in hospitals.

"When a payer looks at that combination of information, they say, 'I'm paying 20 or 30 percent more than I need to, and I'm getting roughly half the quality outcome,'" he said. "When they look at any other line of their business, they would never accept that from a supply chain vendor." 

Defining Quality

So the push is on to define and measure quality of care and cost effectiveness and to apply those measurements to health insurance products, such as tiered or narrowed networks, that offer supposedly higher quality and cost-efficient provider networks at lower premiums.

Blue Cross rolled out its BlueCompare ratings this spring. United's Premium Designation Program and Aetna's Aexcel network, which encourages members to select specialists who have demonstrated effectiveness in clinical performance and cost efficiency, have been around since at least 2005. CIGNA and Humana also have performance-rating programs in Texas.

The problem, says TMA President-Elect Josie R. Williams, MD, is the plans are attempting to measure quality with the only tool they have - claims data.

"Those numbers were never designed to be used to judge quality of care," said Dr. Williams, the director of the Rural and Community Health Institute and Institute for Healthcare Evaluation at the Texas A&M University System Health Science Center.

"When physicians are measured by insurers, they want to know, are they being measured appropriately, is the measure valid, is it scientifically accurate, does it really mean something, and, then, did the insurer get it," added Dr. Speer, also a member of the TMA Board of Trustees. "In other words, did they have the tools to actually determine whether or not that measure was being done? And claims data don't do it."

In a report on the BlueCompare program written by the Special Committee on Physician Performance and adopted by the TMA Board of Trustees in February, TMA concluded that there are "significant problems inherent in relying on claims/administrative data" for measuring quality. Among problems TMA pointed out are:

  • Whether patients decline services or are noncompliant with their physicians' recommendations cannot be determined from claims data.
  • Cost of care cannot accurately be attributed to physicians because claims data don't capture who ordered services or treatment.
  • Claims data fail to capture practice and cost differences between urban and rural areas.

TMA also concluded that claims data might identify overutilization of services but not underutilization and may not identify poor diagnostic skills.

Drs. Williams and Speer agree that medical chart review is the only way to truly measure quality, but Dr. Handel says health plans are not likely to go that route because of two impediments. Chart abstractions are "incredibly time consuming and disruptive to a practice" and "extraordinarily expensive" for the plans, he says.

Dr. Speer also says using claims data in evaluating a group practice is problematic. "If the group has a single billing number, then all members of that group who care for a given patient are tarred by the same brush," he said. "Thus, it is impossible, without chart review, to determine which member of the group practices effective, quality, and cost-efficient medicine versus one who does not." 

What Are We Measuring?

While the value of data gleaned from claims is a major concern for physicians, TMA leaders say there's added concern over just what is being measured by these rating schemes. They say United and Aetna rolled out their programs with little advance communication with physicians. And while the TMA special committee had extensive discussions with Blue Cross executives about BlueCompare, Dr. Speer says they couldn't fully answer the committee's questions about what was being measured because Blue Cross was using proprietary software developed by Health Benchmarks Inc.

The committee concluded that BlueCompare's rating system simply added several measures developed by Health Benchmarks onto another program previously developed to measure cost efficiency.

TMA also faulted BlueCompare for placing physicians with insufficient data to rate, and specialists for whom there were no quality measures, in the gray ribbon category alongside low-performing physicians. The gray ribbon was the lowest of three performance ratings.

Dr. Speer was among those who got a gray ribbon. "Now, I happen to know they don't have measures for neonatology so I wasn't quite as upset as some others, but it offended me to be lumped in with those folks who were judged inadequate by whatever measurement was being used," he said.

At TMA's request, Blue Cross eliminated the gray ribbons from its BlueCompare program.

Another concern is that the health plans seem to be measuring different things, which produces conflicting results such as those experienced by Dr. Gunter.

Dr. Handel says Blue Cross is working with the Dallas-Fort Worth Business Group on Health and other carriers to come up with a uniform system that would eliminate conflicting scores among plans.

"At the present time, each of the plans uses different criteria for determining either cost effectiveness or compliance with HEDIS [Health Plan Employer Data and Information Set] measures or appropriate ordering of tests," he said. "As a result, what is happening over and over again is people are comparing apples and oranges. At the end of the day, the insurance industry needs to decide what the appropriate measures are, and let's all measure the same thing, so there will not be that situation of one plan giving you an A+ and one plan giving you an F."

Dr. Gregg from Mercer also says several national initiatives are under way to identify appropriate quality measures. One of them is the Ambulatory Care Quality Alliance, a group founded by insurers, the Agency for Healthcare Research and Quality, the American College of Physicians, and the American Academy of Family Physicians. It is attempting to check quality measures produced by various specialty societies. The alliance began with 26 quality measures and has expanded the list to more than 80 different measures recommended by specialty organizations, he says.

Many of these measures are familiar to physicians already because they are the same as or similar to HEDIS guidelines developed by the National Committee for Quality Assurance, Dr. Gregg says. An example is regular hemoglobin A1C testing and annual eye exams for diabetes management.

But Dr. Speer says many of the measures being developed by specialty organizations are "consensus measures" that may sound good but have not been validated. He serves on an American Academy of Pediatrics subcommittee looking at proposed measures from various national organizations and says it is difficult to find out what science such measures are based on.

"Yes, they're national, but nobody knows whether they're going to work," he said. 

Measuring Cost

Physicians also have raised concerns about how cost effectiveness is being measured in the various rating programs. TMA officials met in May with United executives and discovered that physicians were getting dinged on their cost ratings for sending patients to higher cost facilities. Yet, doctors don't necessarily know that one hospital has a richer contract with United than another and therefore is more expensive.

Dr. Gunter also says the cost data in her BlueChoice Solutions ranking included such items as bone and joint treatment that she did not order.

"I don't treat bone and joint diseases, I'm a gynecologist," said Dr. Gunter. "If I had a patient who had to have hip surgery, it's not my fault."

With so many concerns about how performance measurement is being done, TMA has developed a framework to evaluate physician performance-rating programs adopted by Texas carriers. The eight-point framework asks:

  1. Is there a willingness to appoint an expert team of TMA members to review clinical measures with input on the rating program and physician communication?
  2. Has the performance-rating methodology been externally and objectively validated?
  3. Is there an option for a pilot testing of the program in a selected market to evaluate results and make appropriate modifications?
  4. Is there an "opt out" option for physicians who do not wish to participate in the rating program?
  5. To assure transparency, will physicians be able to obtain the data on which they are rated?
  6. Will there be a timely review and appeals process?
  7. Are communication materials for patients written at an eighth-grade level to address health literacy issues?
  8. Will there be timely updates on rating metrics that are made available to the public (e.g., changes in star, ribbon, or other icons on the plan's Web site)?

Dr. Williams says Blue Cross has demonstrated a willingness to work with physicians to address concerns about its BlueCompare program.

Following action by the special committee and TMA trustees, Blue Cross temporarily delayed further implementation of BlueCompare and agreed to drop the gray ribbon designation from the program. Blue Cross gives physicians who rank in the top 20 percent in quality and cost efficiency a dark blue ribbon. Those who rank in the second 20 percent receive a light blue ribbon. Physicians who score lower, have insufficient data, or are not rated get no ribbon.

Blue Cross also agreed to establish an advisory committee of Blue Cross medical directors and TMA leaders to examine ongoing physician concerns with the program, strengthen the appeals process, and give TMA an opportunity to review communications about BlueCompare sent to physicians.

Dr. Handel says he hopes the advisory committee will achieve "something that is meaningful for the patients and the physicians as well as the business community that has demanded some type of scorecard, and that this will not be just an exercise in futility but will have a positive impact on patient care."

United also has expressed an interest in creating an advisory committee to review concerns about its Premium Designation Program. 

Getting an Education

Meanwhile, TMA and other physician organizations are urging doctors to educate themselves about performance-rating schemes and be prepared to refute inaccurate data.

The Physicians Advocacy Institute (PAI), created with funds from the antiracketeering class action lawsuit settlement against health plans, is developing educational materials, including an instructional video, about performance-measurement programs. That video is expected to be available in September through TMA and other PAI member organizations.

"One of PAI's primary objectives is to educate physicians about the various methodologies used to rate physicians based on costs," said Robert W. Seligson, PAI president and executive vice president/chief executive officer of the North Carolina Medical Society. "Physicians can only ensure the accuracy and fairness of the ratings if they understand how they are determined."

PAI officials and others say adoption of electronic medical records systems that allow physicians to analyze their own practice data against plans' claims data also will be important in dealing with performance-rating schemes.

Meanwhile, Dr. Gunter maintains a healthy skepticism about any rating system, saying health insurance plans are more concerned about money than quality.

"I would much rather be recognized for good care than be recognized for cheap care," she said.

Ken Ortolon can be reached by telephone at (800) 880-1300, ext. 1392, or (512) 370-1392; by fax at (512) 370-1629; or by email at Ken Ortolon.

SIDEBAR

Mercer, Employers Launch Nationwide Claims Database

Mercer Health & Benefits LLC and a group of large U.S. employers have launched a national physician claims database they hope will enable them to analyze physician quality and cost-efficiency performance over all payers.

Care Focused Purchasing (CFP) was incorporated as a 501(c) corporation in 2005 and has been collecting physician claims data from 50 major national employers and eight national health insurance plans. David Gregg, MD, senior consultant and principal in Mercer Health and Benefits, says the database currently includes claims data on more than 23 million insured lives. The program went live this year, and CFP is now working on its first set of data analysis, he says.

"No individual health plan has sufficient data to really get a good look at a doctor's whole patient panel," Dr. Gregg said. "They can only get a look at that percent of the patient panel that comes from their membership. The main idea was to get the self-insured employers to contribute their claims data to a central data aggregation and then to leverage the health plans that they participate with to contribute their whole book of business."

The database includes both physician and hospital claims data, and CFP plans to analyze data on physician quality, physician cost efficiency, hospital quality, and hospital cost efficiency measures. The data are being analyzed based on a set of about 90 quality-of-care measures and methodologies for rating cost efficiency.

"The results of that information will go back to the health plans, where they will be able to see detail on their own population and their own network," Dr. Gregg said. "And they'll see an aggregate picture of the rest of the data set."

Large employers also will be able to review the data on their employee population.

Dr. Gregg says health plans can use the information to create different network products or different benefit design products. They also can use the data to drive quality improvement, he says.

"The data is available to support the dialogue between the health plans and the doctors in the network," he said. "Where the performance results for an individual doc can be reviewed with the medical directors of the health plan there can be opportunities for quality improvement, performance improvement."

Whether a health plan shares its data with its network physicians is being left up to the individual plans, Dr. Gregg added.

Employers will use the data, he says, to educate employees about performance differences across a plan's network.

In a news release issued in March 2006, CFP said its program is distinguished by its focus on transparency "as a means of building a consumer-driven market that identifies better physicians, better hospitals, and better treatment options. Transparency means that health care purchasers and consumers can easily access all the information on cost and quality they need to compare and select doctors, hospitals, and treatment options."

As with other physician performance-measurement programs, doctors are concerned about CFP's reliance on claims data and quality measures whose scientific validity has not been proven.

"To expect that one can determine physician quality, physician cost efficiency, hospital quality, and hospital cost-efficiency measures by analyzing a set of about 90 quality-of-care measures and methodologies for rating cost efficiency when no one can identify which measures are valid is unrealistic at best," said Michael Speer, MD, chair of TMA's Special Committee on Physician Performance.

Participants in CFP include BellSouth Corp., The Boeing Co., Texas Instruments Inc., Sprint Corp., and Honeywell International. Carriers participating in CFP include CIGNA and Aetna.

CFP also is urging the U.S. Centers for Medicare & Medicaid Services to release Medicare claims data to its data warehouse.
