Pre-Clinical Testing of Spinal Implants: Safety Versus Substantial Equivalence
February 2008
by Glenn Stiegman & Justin Eggleton

 

I. INTRODUCTION
Historically, the orthopedic industry has appreciated the difference between the performance data threshold necessary to earn Investigational Device Exemption (IDE) approval and the threshold to get 510(k) clearance. For IDE approval, a company is attempting to demonstrate to the FDA that a device is safe enough to initiate a clinical study (i.e., offering a reasonable assurance of safety). In contrast, for the 510(k), a company is attempting to demonstrate substantial equivalence (SE) to another device with regard to indications, intended use, labeling, fundamental scientific technology, and performance. The distinction between these two concepts has been very simple in the past because most devices were clearly either class II or class III, and their design clearly distinguished their intended use. However, a new trend is changing the landscape. Now, the same device can follow a class III IDE/PMA (Investigational Device Exemption/Premarket Approval) pathway or a class II 510(k) pathway depending on small modifications to its intended use and indications.

The most popular of these devices is the Dynesys® Dynamic Stabilization System from Zimmer Spine. This device was cleared through a 510(k) as a fusion system to treat spondylolisthesis, which is a class II indication. However, the device also underwent an IDE to study non-fusion use for class III indications to treat spinal stenosis. Consequently, the presence of bone graft caused this device to shift from a class III to a class II device treating different lumbar spine pathologies. The obvious benefit of this pathway is that the device could be commercialized within a year at a cost of thousands of dollars, instead of the 5+ years and many millions of dollars necessary for an IDE study and PMA submission.

Aside from the obvious differences in time and money, the thought processes, information, and testing needed for these two pathways are very different and not consistent from device to device. The novel design features of these devices, as well as their limited use, have resulted in a lack of any uniform standards for their assessment. This situation has led to a great deal of frustration and confusion among regulatory bodies, industry, and clinicians alike. Given that the general goal is to bring safe and effective devices to the U.S. market in the least burdensome manner, each group will need to accept a paradigm shift in the way new spine devices are tested.

II. DEVELOPING SUBSTANTIAL EQUIVALENCE
For many devices, there are very straightforward pathways to earn a substantial equivalence determination. A company usually has access to guidelines or standards that clearly outline the necessary steps, labeling considerations, and testing. Most of the time, sufficient data is available in the literature or in the company’s own documentation that can be used for comparison of results. However, for the non-traditional devices or those with a unique design, indication, or regulatory pathway, no regulatory precedent is available. Consequently, the company must rely on novel mechanical testing, cadaver testing, animal data, or even clinical data to demonstrate substantial equivalence.

For a device with a unique design that warrants new testing parameters that fall outside the FDA-recognized standard, the FDA normally requires a predicate to be tested in an identical way. In other words, as long as a company is able to demonstrate that a device is as strong and stiff as the predicate in the same, somewhat physiologic test, the device can be deemed substantially equivalent, assuming that all other regulatory requirements are satisfied. This does not take into account the possible unique failure modes that may exist with the new design. If side-by-side testing is not feasible, animal or clinical data would be necessary to demonstrate substantial equivalence.

An example of this scenario is the current trend in posterior dynamic stabilization systems. Although one of the closest and most abundant predicates to dynamic stabilization systems is the pedicle screw system, most of these systems cannot be tested in the same manner because they are too flexible. Indeed, the device may not even meet the standardized testing parameters (e.g., ASTM F1717, the standard by which spinal implant constructs are tested) because of its flexibility; and if it does, it will inevitably demonstrate inferior results to a more conventional pedicle screw system when tested in accordance with the standard. Neither of these outcomes demonstrates substantial equivalence to a predicate. Therefore, the company must demonstrate that the subject device is mechanically equal to or better than the predicate through a test that evaluates both devices similarly. This process would be called upon to demonstrate that the device, when tested in a unique but clinically relevant fashion, is mechanically equivalent to a predicate with the same intended use. Mechanical equivalence is often the only unknown factor in the substantial equivalence decision paradigm.

However, the question remains: How does this 510(k)-based testing evaluate or correlate with the safety and effectiveness of the device? The premarket notification regulations are designed around the idea that by showing equivalence to a predicate, which often either has a significant history of clinical use or was likely equivalent to another predicate with significant historical use, the subject device is as safe and as effective. Although the company is able to demonstrate that its device is as stiff and strong or can bend in certain ways when tested in a unique fashion, that does not necessarily mean these results will translate to clinical settings and accomplish the intended use for the device in question. For example, to characterize the cyclic loading on a dynamic stabilization device, a company must show that it takes certain loads to flex or extend the device a certain distance. This load should be higher than that of the predicate, meaning the subject device is stiffer. But what if the subject device breaks when trying to move that distance because it is too stiff? For a fusion scenario, a stiffer device is ideal, but when comparing it to a flexible predicate, it may be considered a failure.

III. DEVELOPING SAFETY
Now, if we analyzed the same device (a posterior stabilization system) for non-fusion use for initiation of an IDE, what testing would be needed? Would comparative testing be required? Because the device is now indicated for non-fusion, extended testing would be needed to demonstrate that it can maintain its integrity through the life of the patient in the absence of fusion. Therefore, testing the device out to 10 million cycles instead of 5 million cycles would be required. Furthermore, additional testing modalities would be necessary that were not needed for the 510(k). For dynamic stabilization systems, dynamic shear testing may be required, or even dynamic torsion. These concerns relate back to the new intended use in that now we are interested in device safety throughout the life expectancy of the device, and not so much the stiffness required to facilitate fusion.

In other words, many more rigorous tests are needed to demonstrate that the device is safe to implant into one patient, while different and perhaps significantly less testing would be needed to put the same device onto the market as a fusion device.

Even though the indications would not be the same, many companies would still prefer a 510(k) clearance for a device because it is cheaper and quicker than a lengthy IDE and PMA process. But the issue remains: for the same device, the threshold for determining safety for implantation into a single patient may be different from the threshold for getting clearance for a marketed device, which can then be implanted into thousands of patients.

A great deal of weight is placed solely on the label, or in this case, the indications, to differentiate between what the FDA would deem a marketable device and one that would not have sufficient data to justify implanting even one patient. In stabilization systems, the simple addition of “has to be used with bone graft” may save millions of dollars and years of company time.

The easier 510(k) route is viewed by many companies and investors as the best way to get a device on the market. These companies may not appreciate the limitations that the labeling and marketing will have by getting a fusion-based indication. In addition, the risk placed on the surgeon and company by possibly using or marketing the device off-label is a major consideration. Only time will tell if this strategy positively impacts doctors, patients and device companies.

IV. CASE STUDY: INTERVERTEBRAL SPACERS
For intervertebral “spacers,” which are indicated for fusion in the United States, the regulatory pathway is driven by the presence or absence of bone graft. An example of this is the Satellite™ Spinal System from Medtronic Sofamor Danek, Inc., for which the company was able to use the Harmon Sphere as the predicate device. Similar devices are now being studied for non-fusion use and discussed in podium presentations. The fusion version of the Satellite Spinal System was cleared through a 510(k) in September 2005, whereas a non-fusion version of the device would require an IDE/PMA. However, both devices, regardless of their intended use (i.e., fusion or non-fusion), have similar safety issues.

Because of the pre-1976 Amendments status afforded the Harmon Sphere, and its similarity to the Satellite Spinal System, the regulatory pathway was fairly simple for fusion indications. However, as noted above, the same safety issues exist even though the device has been determined substantially equivalent. The primary failure modes and safety issues that exist for these devices are either subsidence or migration/expulsion, which have been observed not only with the Harmon Sphere but also with nucleus replacements that have similar designs and materials. Both of these risks are characterized in both types of regulatory submissions (i.e., 510(k) and IDE), but their importance differs.

In fusion, some surgeons believe subsidence may be desired; however, the literature has shown that loss of disc height is correlated with pain and function. In fact, both the Harmon Sphere and the Fernstrom Ball system exhibited significant subsidence due to the point contact of their spherical shapes (Fernstrom, U. “Arthroplasty with intercorporal endoprosthesis in herniated disc and in painful disc.” Acta Chir Scand 1966;357:154–9). Current interbody fusion devices have higher surface areas and higher contact areas with the vertebral endplate to prevent subsidence. As with all 510(k) devices, to demonstrate substantial equivalence, the company must compare the device to a predicate. Thus, for spacers, sponsors would need to demonstrate better subsidence results (i.e., more load to push the device into bone). However, because subsidence was not only expected with the Harmon Sphere, but in fact was a desired result, and indeed was the theoretical cause of clinical success due to the added translational stability, it would seem that lower subsidence results would lead to a better device than the predicate (Satellite Spinal System).

For an IDE (non-fusion) device, the same rationale applies: the device should demonstrate a higher resistance to subsidence, which for this indication seems more clinically accurate. Much of the 510(k) review, as with dynamic stabilization systems, is based on demonstrating mechanical equivalence. This raises a question as to whether the device is stronger or performs better under certain loading scenarios. However, in the case of characterizing subsidence for spacers, a possible clinical success (total subsidence and disc collapse) for one regulatory pathway is a failure for the other.

The confusion in the area of spacer subsidence raises the question: how significant is the difference between the clinical relevance of preclinical testing for an IDE versus a 510(k)? In an IDE, preclinical testing would need to be performed in a cadaver specimen to assess subsidence, range of motion, endplate disposition, and other factors. Clinically, if this loss in disc height becomes significant, not only will the treatment effect of maintaining disc space be lost, but the patient’s pain and function could worsen to the point of needing a reoperation, which would be considered a treatment failure. However, as with dynamic stabilization systems, the criterion for determining substantial equivalence in a 510(k) device is significantly different for spacers. Such testing generally involves polyurethane foams in a static test to assess subsidence. Although these data allow comparison between devices, the resulting stiffness data, the size and shape of the device, and the material have yet to be correlated to any clinical performance, and it is impossible to determine safety and effectiveness from them. This typifies one of the pitfalls in the methods for evaluating “safety” in a 510(k).

With a company developing a spacer to be substantially equivalent to the Satellite Spinal System, the burden is on the company to demonstrate mechanical equivalence (i.e., that the device is as stiff and strong) to the predicate, and to demonstrate methods to mitigate the associated device-dependent risks. In other words, for Medtronic, Inc., the substantial equivalence determination would be based on the Satellite Spinal System, and thus the Harmon Sphere.

The other main failure mode of spacers is expulsion or migration. In either a 510(k) or an IDE, the risk of migration needs to be characterized. Therefore, substantial equivalence would not address the risks of subsidence and expulsion, both of which are clinical in nature. Now that the Satellite Spinal System is a valid predicate, the landscape for spacer devices has changed. The threshold for demonstrating substantial equivalence has been set, whether it is measuring subsidence or migration or ignoring both of these issues.

If one were to seek an IDE for one of these devices, both of these issues would need to be addressed, either on the bench or in a functional animal model, before pursuing the clinical study. In addition, the risk of migration often cannot be studied adequately on the bench; therefore, animal or clinical data would be needed to address this risk. In this example, the safety questions seem to overlap both fusion and non-fusion indications but are only addressed in the IDE.

Although risks and failure modes have been associated with these spacers, the regulatory process allows a difference in characterization of these risks depending on the submission. Once again, the dependence on the labeling is critical because the only difference between a device that is fully characterized and studied extensively in patients and one with little characterization is the association with the word “fusion.”

With the new precedents set, one would have to ask: is this the end of truly dynamic stabilization systems and nucleus replacements? The current market offers no non-fusion dynamic stabilization systems and no nucleus replacements, yet many companies are developing these systems for the U.S. market. Although there is a reliance on mechanical testing, what happens if the spacer or dynamic stabilization system is not as stiff and strong as the predicate?

V. CAGE VERSUS VERTEBRAL BODY REPLACEMENT
With the recent FDA down-classification of interbody fusion cages to class II, requiring 510(k) clearance, comes a new approach to testing for regulatory purposes. Results from testing with interbody cages over the past 10 to 15 years justified an IDE-specific battery of tests to confirm safety. This evaluation generally required a series of static and dynamic axial compression and torsion tests with instances of subsidence or expulsion testing included. Now the goal of preclinical testing is to demonstrate an interbody cage that is equivalent to or better than a predicate (e.g., PMA-approved or new 510(k)-cleared) cage. The June 12, 2007 Guidance Document established the requisite test battery for interbody cages. This battery of tests is more inclusive than that described in the PMA. This test battery is outlined in Table 1.

The design and clinical outcomes for the BAK, Brantigan Cage, Interfix, Affinity, BAK-C, and all other approved cages were based on metallic designs. These cages were extremely robust. Now, some of the partial VBRs made out of much weaker materials have to demonstrate substantial equivalence. Although some data are publicly available for the PMA-approved cages, many of the newer cages will likely have lower performance results due to polymer (e.g., PEEK) designs. In such cases, the burden will be on the company to provide a rationale for the substantial equivalence argument. This is more problematic than simply providing a clinical justification for an IDE, especially in the case of shear testing where the test setup is not the most clinically relevant configuration to evaluate the device performance in translation under physiological conditions.

In looking at the test battery (Table 1), one may notice that it is very similar to what is required for VBRs. Comparing the two respective test batteries side by side, several key differences become evident. First, VBR testing requires expulsion testing, whereas interbody cages do not. Second, interbody cage testing may require shear testing (depending upon cage design), whereas VBRs do not. The VBR test battery omits subsidence; however, the FDA has requested this testing in the past few years due to the influx of partial VBRs intended for use in softer cancellous bone. Understanding and appreciating these differences is necessary to ensure resources are correctly allocated for these projects.

So why is this important? For a long time, companies have been getting their lumbar cages cleared as VBRs and their cervical cages cleared as partial VBRs. Now that cages are reclassified, will there be a tremendous barrier to entry for cages compared to VBRs? A new interbody fusion cage submission will require more than simply changing the indications and citing the existing VBR data. The 510(k) must also include new results from testing and new rationales to demonstrate substantial equivalence. This translates into conducting static/dynamic compression shear and subsidence testing if it has not already been performed. Additionally, a new rationale for every test is required to demonstrate substantial equivalence. These rationales are critical because the VBR-turned-cage needs to be equivalent to a predicate interbody cage for every test, not just the new compression shear and subsidence testing.
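The gap between an existing VBR data package and what a cage submission needs can be sketched as a simple set comparison. The Python sketch below is only an illustration: the two test lists are assumptions reconstructed from the discussion in this article (they are not the guidance document's table), and the function name gap_analysis is hypothetical.

```python
# Illustrative test batteries reconstructed from the discussion in this
# article. They are assumptions, not the FDA guidance's actual table.
# Shear testing for cages may depend on cage design; it is included here
# to represent the fuller case.
CAGE_BATTERY = {
    "static axial compression",
    "dynamic axial compression",
    "static torsion",
    "dynamic torsion",
    "static compression shear",
    "dynamic compression shear",
    "subsidence",
}

VBR_BATTERY = {
    "static axial compression",
    "dynamic axial compression",
    "static torsion",
    "dynamic torsion",
    "expulsion",
}


def gap_analysis(completed, required):
    """Compare tests already performed against a required battery."""
    return {
        # Tests that must be newly conducted for the cage submission.
        "new_tests": sorted(required - completed),
        # Every required test needs its own equivalence rationale,
        # including the ones carried over from the VBR package.
        "rationale_needed": sorted(required),
        # Tests already performed that the new battery does not list.
        "carried_over_no_longer_required": sorted(completed - required),
    }


report = gap_analysis(VBR_BATTERY, CAGE_BATTERY)
for heading, tests in report.items():
    print(heading + ": " + ", ".join(tests))
```

Under these assumed lists, the new testing is limited to compression shear and subsidence, but a rationale is still owed for all seven tests in the cage battery, which is the point made above.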

Conducting new tests and providing the proper justifications can be difficult depending on how stringent the FDA is. What if a company’s cage is much weaker than the PMA-approved cages and weaker than anything described in the literature? One would think that most companies will simply seek a partial VBR indication. Therefore, the easier and more familiar pathway to market is to seek clearance for a VBR. We believe the FDA has begun to recognize this dilemma in recent years, and now seems to review cage submissions more liberally on a case-by-case basis. The pathway for interbody fusion cages is unfamiliar and possibly more expensive, and when investors are looking for predictability and confidence, many companies will choose to submit a VBR 510(k). As with dynamic stabilization systems and spacers, there is an easier pathway to market for VBRs that may prevent cages from ever really being accurately marketed. Nevertheless, both of these pathways are better than predicating your cage on a cement restrictor.

VI. WHEN TO CONSIDER ANIMAL TESTING
When approached with a new dynamic stabilization system or other novel technology, the FDA often asks a company either to do side-by-side testing or to characterize the failure modes of the device and demonstrate that it can fuse in a functional animal model. A similar thought process would be appropriate for those spacers that are not as strong and stiff as the Satellite Spinal System or for interbody cages not demonstrating equivalent or better mechanics. Due to anatomical and biomechanical variations, no ideal animal model exists for evaluating spinal devices. Thus, trying to show proof of concept or mechanism of action in an animal is almost impossible, unless, of course, you want to show that something can fuse. Most animal studies demonstrate great fusion results; therefore, when asked to demonstrate that a device can indeed facilitate fusion, relying on a functional animal model should yield acceptable results. This sets the bar even lower for getting these devices to the market.

If a company has a very flexible system or spacer made of a very “formable” polymer, then it can possibly test the system appropriately and put it in an animal model. The company would have to conduct a controlled study evaluating the presence of fusion, ideally without bone graft to represent the worst case; however, one could justify including bone graft. Demonstrating fusion in these cases can be accomplished relatively easily, resulting in the company’s device being determined substantially equivalent. Conducting a 6-month or 1-year animal study still saves a company millions of dollars and years. This is a very simplistic view of these devices and only takes into account the stiffness and strength of the device. Many other factors play into the FDA’s decision-making process, including material and the possible clinical effect that material may have in treating or achieving the indications.

In some cases, a company’s only survival mechanism is to seek clearance of a device because so many competitors have sought the same pathway. The dynamic stabilization field will become so crowded in the near future that many companies will not seek the non-fusion indications for which the device was originally designed. In these cases, the advertisements, presentations, and literature promoting such a device as a non-fusion system may not compensate for the millions of dollars and years saved by getting to market 5+ years earlier.

A company’s ultimate strategic goal should be to set the highest barrier to entry, such that all competitive devices have to go through the same route to market or be accompanied by the same amount of data. Often, the more data the company generates, the more publications and conference presentations can follow. Although the time to market for many of these devices can be very long, the marketability can bring huge dividends on sales. In addition, those companies that do conduct the proper clinical study or follow the proper regulatory pathway have an easier time getting their devices paid for by the Centers for Medicare & Medicaid Services (CMS) and other insurers.

Above all, when evaluating a device and its regulatory pathway, all monetary, marketing, and timing concerns should be set aside and the primary focus should be on the patient being treated. Is the device safe for whatever pathway the company plans on using? If this means submitting a spacer as a 510(k) or an IDE, what indication is the safest for the patient? Many times these questions are lost among the key decision makers, while the only people truly seeing the effects of the device are the patients.

ABOUT THE AUTHORS
Mr. Stiegman manages and directs the regulatory affairs for MCRA and its clients. Mr. Stiegman is responsible for management of approximately 10 regulatory professionals at MCRA. Mr. Stiegman leads the firm’s submission process, regulatory strategy, analysis and development: from pre-clinical testing, to FDA submissions, to market approval and post commercialization.

Prior to joining MCRA in February 2006, Mr. Stiegman served as the Chief of the Orthopedic Devices Branch for the US Food and Drug Administration. As Branch Chief, Mr. Stiegman managed a team of scientists, clinicians, and engineers in the regulation of all orthopedic devices marketed in the United States. In addition, Mr. Stiegman was responsible for overseeing all FDA guidance documents and FDA policy determinations for orthopedic devices marketed in the United States. Furthermore, he assisted in and oversaw all integrity, compliance, and monitoring issues regarding the orthopedic industry in collaboration with the Office of Compliance.

Mr. Stiegman was also a member of several leveraging groups, such as the Orthopedic Device Forum and Orthopedic Surgical Manufacturer Association, where he represented the FDA. As the head of the Orthopedic Devices Branch, Mr. Stiegman pursued the advancement and consistency in the regulation of all orthopedic devices. This was evident by the pursuit of reclassifying several types of orthopedic devices, developing guidance documents on state-of-the-art orthopedic devices, and educating and assisting the orthopedic community in the regulatory strategies to get devices to market.

Prior to becoming Branch Chief, Mr. Stiegman was a reviewer in the Orthopedic Devices Branch where he was the team leader on many state-of-the-art spinal technologies. He was a leader in the field of artificial disc replacements, nucleus replacements, posterior stabilization systems, and many of the current widely used fusion spinal systems. He authored a guidance document for industry on spinal systems indicated for fusion, and he also developed documents that assisted companies in getting other devices to market, such as artificial disc replacements, nucleus replacements, and posterior stabilization systems. Mr. Stiegman received his Bachelor in Science at Tulane University in Biomedical Engineering and his Master in Science at Clemson University in Bioengineering with a focus on biomaterials and biomechanics.

Mr. Eggleton is responsible for regulatory affairs relevant to spine devices for MCRA clients. In particular, Mr. Eggleton writes 510(k)s, IDEs, PMAs and international regulatory submissions. Mr. Eggleton also develops specific regulatory pathways for new, innovative devices and drafts preclinical and clinical test protocols designed to position devices for regulatory approval. Prior to joining MCRA in July 2006, Mr. Eggleton served as a lead reviewer in the Orthopedic Devices Branch of the US Food & Drug Administration. In this role, Mr. Eggleton reviewed more than 300 510(k), IDE and PMA submissions regarding cutting-edge spine, trauma and bone cement technologies, among others. Technologies of note include artificial disc replacements, nucleus pulposus replacements, interspinous process stabilization devices, posterior stabilization systems and novel materials used in fusion spinal systems.

Mr. Eggleton also contributed to several guidance documents and ASTM technical committees regarding orthopedic device testing.

Preceding his experience at the FDA, Mr. Eggleton worked in the Drexel University Biomaterials Laboratory where he aided in the development of novel hydrogels designed to replace the nucleus pulposus of the intervertebral disc. Mr. Eggleton also served as a research associate at Therics, Inc., specializing in the characterization of bone substitute materials.

Mr. Eggleton received his Bachelor of Science in Biomedical Engineering at Drexel University, with a focus on biomaterials and tissue engineering.

 
 
Phone: 202.552.5800
Fax: 202.552.5798
Email: info@mcra.com