Likelihood Ratio: A modern approach for classical homeopathy

Lex Rutten, Erik Stolper, Roland Lugten, Rob Barthels
Commissie Methode en Validering of the Dutch society of homeopathic physicians (VHAN)

The most striking difference between homeopathy and conventional medicine is that a homeopathic medicine cannot be prescribed on diagnosis alone. In other words: in conventional medicine one item makes sure that a medicine works, whereas in homeopathy one item only makes the effectiveness of the medicine more probable. Put this way, the philosophy of science can describe what is happening here: the conventional doctor has a Popperian vision; the homeopathic doctor’s view is Bayesian. Popper states that hypotheses should be tested by attempts to falsify them. If we believe that all swans are white, the first black swan falsifies this opinion (it is no longer true) and we should form a new hypothesis. This is a very fundamental approach that is congruent with the design of the Randomised Clinical Trial (RCT): a medicine is a placebo until proved otherwise. It works or it does not work. Until recently, conventional medicine relied strongly on Popperian ideas, but this is changing. More and more, medicine realises that diagnoses are based on more than one fact. So we prescribe on one diagnosis, but that diagnosis is based on several facts. Ultimately a conventional therapy is also based on several facts, and no single fact can give certainty about effectiveness.

The 18th century reverend and mathematician Thomas Bayes described a more pragmatic design for the search for truth. The conviction that something is true is built up gradually by subsequent observations. If you see the first black swan, you begin to doubt that all swans are white. After 50 observations of different black swans, you are pretty sure that swans can be black. Many facts are neither true nor false, but probable. So is the result of most medical treatments. Bayes reformulated the formula for conditional probability and thus described how our conviction of the truth of a certain fact increases or decreases with subsequent observations. In epidemiology this is expressed by the Likelihood Ratio (LR) and odds. LR+ stands for the increase in likelihood if a symptom is present; LR- stands for the decrease in likelihood if the symptom is absent.

How do we value homeopathic symptoms? There is of course Hahnemann’s famous § 153, stating that peculiar, rare symptoms are most valuable. But we know that each remedy has its own characteristics, also called ‘keynotes’. These symptoms are not so rare, but they are observed more frequently in relation to that remedy than in the rest of our population. In epidemiological terms: the prevalence of the symptom in the remedy population is higher than in the rest of the population. Or, if we divide the prevalence in the remedy population by the prevalence in the rest of the population, the outcome is >1. If the outcome of this division is 5, the corresponding symptom is 5 times more likely to occur in the remedy population; the LR+=5.

If we ask participants in a materia medica validation to estimate the importance of a symptom regarding a certain remedy they say things like: "Loquacity occurs in 40% of the Lachesis-patients". Then we ask: "How frequently do you see loquacious patients in general?" The answer is 10%, so the estimated LR+ of this symptom for Lachesis is 4. It appears that experienced practitioners can estimate these data for well-known symptoms, because we already validated this symptom. But this is only the case for a limited number of symptoms and medicines.
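The estimate described above is a simple division of two prevalences. A minimal sketch in Python, assuming both prevalences are given as fractions and that "patients in general" is an acceptable stand-in for the rest of the population (the function name is hypothetical, chosen for illustration):

```python
def estimated_lr_plus(prev_in_remedy_group: float, prev_in_rest: float) -> float:
    """Estimate LR+ as the ratio of two symptom prevalences (fractions, 0-1)."""
    if prev_in_rest <= 0:
        raise ValueError("prevalence in the rest of the population must be > 0")
    return prev_in_remedy_group / prev_in_rest


# Loquacity: estimated 40% in Lachesis patients, 10% in patients in general.
lachesis_loquacity = estimated_lr_plus(0.40, 0.10)
print(f"Estimated LR+ for loquacity and Lachesis: {lachesis_loquacity:.1f}")
```

Such an estimate is only as good as the two guesses that go into it, which is the reason for measuring the prevalences prospectively.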

Introduction to epidemiology

In conventional medicine we are used to assessing diagnostic tests. To assess a diagnostic test we need a gold standard to compare the test with. The gold standard is regarded as the best approximation of the truth. For instance, to assess ultrasonography for the diagnosis of appendicitis the best standard is the result of laparotomy (and histology). After laparotomy we can divide patients into four groups, according to outcome:

a: the test (ultrasonography) is positive and the illness (appendicitis) is present: true positive

b: the test is positive and the illness is absent: false positive

c: the test is negative and the illness is present: false negative

d: the test is negative and the illness is absent: true negative

This can be depicted in a 2x2 table (Table 1):

 

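|               | illness present | illness absent | total   |
|---------------|-----------------|----------------|---------|
| test positive | a               | b              | a+b     |
| test negative | c               | d              | c+d     |
| total         | a+c             | b+d            | a+b+c+d |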

Table 1: 2x2 table showing relationships between the results of diagnostic tests and the presence of illness

We will use the notation a-d to indicate the possible results of a test.

Likelihood ratio

Now we come to the likelihood ratio (LR). The likelihood ratio is a constant that indicates the relation between prior-odds (the odds before the test) and posterior-odds (the odds after the test). This relation is given by the formula:

Posterior-odds (+) = LR(+) x prior-odds.

The transformations between odds and chance are as follows:

Odds = chance / (1-chance) and Chance = odds / (1+odds)

The LR can indicate the change if the test is positive (defined as LR+) and if the test is negative (defined as LR-). The mathematical formulas regarding the parameters in the 2x2 table are:

LR+ = (a/(a+c)) / (b/(b+d))

LR- = (c/(a+c)) / (d/(b+d))
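The two formulas can be written out directly in terms of the cells a to d. The following Python sketch is only an illustration: the class and function names are hypothetical, and the example counts are invented for the purpose of showing the arithmetic.

```python
from typing import NamedTuple


class TwoByTwo(NamedTuple):
    a: int  # test positive, illness present (true positive)
    b: int  # test positive, illness absent (false positive)
    c: int  # test negative, illness present (false negative)
    d: int  # test negative, illness absent (true negative)


def lr_plus(t: TwoByTwo) -> float:
    """LR+ = (a/(a+c)) / (b/(b+d))"""
    return (t.a / (t.a + t.c)) / (t.b / (t.b + t.d))


def lr_minus(t: TwoByTwo) -> float:
    """LR- = (c/(a+c)) / (d/(b+d))"""
    return (t.c / (t.a + t.c)) / (t.d / (t.b + t.d))


# Invented counts, not taken from any study.
example = TwoByTwo(a=80, b=10, c=20, d=90)
print(f"LR+ = {lr_plus(example):.2f}, LR- = {lr_minus(example):.2f}")
```

Note that both functions fail with a division by zero if b or d is 0; a real implementation would have to decide how to handle empty cells.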

If LR = 1 nothing changes. The higher the LR(+) the better the test if the result is positive. For a negative result the test is better if the LR(-) is closer to zero. As an example we take the ultrasonography for the diagnosis of appendicitis. Literature shows that the LR(+) = 7.6 and the LR(-) = 0.27. Now we can calculate the posterior-chance for all prior-chances using the formula mentioned above. This is graphically represented in Figure 1.

This graph shows how the probability of appendicitis changes after positive (LR+) and after negative (LR-) ultrasonography. If our suspicion of the existence of appendicitis was 33%, the probability after a positive test rises to 79%; a negative ultrasonography would lower the probability to 12%.
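The appendicitis figures can be checked by converting the prior chance to odds, multiplying by the LR, and converting back. A short Python sketch of that calculation, using only the formulas given earlier (the function names are hypothetical):

```python
def to_odds(chance: float) -> float:
    """Odds = chance / (1 - chance)"""
    return chance / (1 - chance)


def to_chance(odds: float) -> float:
    """Chance = odds / (1 + odds)"""
    return odds / (1 + odds)


def posterior_chance(prior_chance: float, lr: float) -> float:
    """Posterior-odds = LR x prior-odds, expressed as a chance again."""
    return to_chance(lr * to_odds(prior_chance))


prior = 0.33  # suspicion of appendicitis before ultrasonography
after_positive = posterior_chance(prior, 7.6)   # LR(+) from the text
after_negative = posterior_chance(prior, 0.27)  # LR(-) from the text
print(f"after positive test: {after_positive:.0%}")
print(f"after negative test: {after_negative:.0%}")
```

Evaluating `posterior_chance` over a range of prior chances gives the two curves of the kind of graph described above.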

Symptoms and cure

There are differences between diagnostic tests like ultrasonography and the questions used in homeopathy. The most important problem is that our ‘gold standard’ is not easy to define. There is a somewhat vague general understanding about the meaning of ‘cure’. One of the main principles in our method is the relation between a symptom and the curative effect of a medicine. The symptom indicates that a medicine is more likely to have an effect than we could expect it by mere coincidence. The ‘gold standard’ in homeopathy is the fact that the medicine worked. Instead of the diagnostic value of a test we measure the prognostic value of a symptom.

The 2x2 diagram for a homeopathic symptom is shown in Table 2:

 

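|                 | medicine worked | rest | total   |
|-----------------|-----------------|------|---------|
| symptom present | a               | b    | a+b     |
| symptom absent  | c               | d    | c+d     |
| total           | a+c             | b+d  | a+b+c+d |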

Table 2: 2x2 table for a homeopathic symptom. In this table:

a+c = all the patients that got the medicine with a positive effect

b+d = all the other patients, including the ones that got the medicine without positive effect

Assessing questions

In homeopathy we ask questions to get ideas about possible medicines or to confirm medicines that might be applicable. Some questions seem more effective than others. In this respect homeopathy does not differ from conventional medicine. Can we assess questions the same way we assess diagnostic tests?

We take the symptom as the diagnostic test and the medicine as the illness to be diagnosed. If we take a closer look at the formula for the LR(+) we see:

a/(a+c) is for the prevalence of the symptom in the population that responds to a medicine.

b/(b+d) is for the prevalence of the symptom in the population that does not respond to that medicine.

So the LR(+) = (a/(a+c)) / (b/(b+d)) = (prevalence of the symptom with the medicine) / (prevalence of the symptom with all others). Or in words: the likelihood ratio (+) of a symptom compares the frequency of that symptom in the successful prescriptions of that medicine with the frequency of the symptom in the unsuccessful prescriptions of the same medicine and all prescriptions of other medicines.

If the symptom is more frequently present where the medicine was successful than in the rest of the population the LR(+)>1. In other words the more the symptom is confined to the medicine (and not to the rest) the higher the likelihood ratio(+). In fact the likelihood ratio(+) is a mathematical representation of §153 of the Organon by Samuel Hahnemann that states: pay particular attention to the peculiar and characteristic symptoms.

An old rule can be translated into modern terminology, and there are several advantages in doing so. The most important advantage is a more accurate and quantitative description of the importance of a symptom, based on empirical data. This implies of course the gathering of the necessary data.

Pitfalls and limitations

If we want to base homeopathy on accurate and quantitative data we have to be sure that our algorithm is correct. To paraphrase Disraeli: ‘There are lies, damned lies and epidemiology’. If theory and experience in practice do not match, something is wrong: either the formula or the data. At present it is not possible to test the usefulness of the LR in homeopathy, mainly because we have no data about the prevalence of our symptoms in our patient population, and we are just starting to gather data about successful cases. The prevalence of a symptom in patients responding to a particular homeopathic medicine will be determined by collecting a sufficient number of cases that showed a curative effect from that medicine. But even then we must be careful. Homeopathic symptoms are not easy to define, and often vague. Our gold standard, the cure, is perhaps even more difficult to define. Also, the present materia medica and repertory are not very clear about the exact meaning of many symptoms.

Rare medicines; typeface, prevalence and likelihood.

The importance of a symptom in relation to a certain medicine is represented by the typefaces in homeopathic repertories. Bold type, or bold and underlined type, represents the most important symptoms of medicines. There are, however, some inconsistencies in the repertory. One is the representation of rare medicines. Changing typefaces on the basis of LR and power of the argument could correct this shortcoming.

The meaning of the typefaces in the repertory is not very clear. Kent gives no explanation in the preface to his repertory; he merely explains which kinds of symptoms are more valuable. We find some explanation in one of the sources of the repertory, Hering’s ‘Guiding symptoms’. Hering gives indications for: ‘Symptoms occasionally confirmed’, ‘Symptoms more frequently confirmed’, ‘Symptoms verified by cures’ and ‘Symptoms repeatedly verified’. When a symptom is frequently confirmed in the treatment with a certain medicine it becomes more important, especially when the symptom is rare.

Rare medicines are medicines with little data. If there is little experience with a medicine, its symptoms are not frequently confirmed. This means that there is no emphasis for these symptoms in the repertory, even if the symptoms are characteristic for the medicine. As an example, consider the rubric "Fear of death" in the repertory of RADAR-synthesis (v. 7.2). This rubric contains 146 medicines, among them Aconitum (bold and underlined), Latrodectus mactans (plain) and Sulphur (plain). In this repertory Aconitum has 4376 symptoms, Latrodectus mactans has 109 symptoms and Sulphur has 11451 symptoms.

In Clarke’s materia medica, ‘Fear of death’ is an important symptom for Aconitum. For Sulphur ‘Fear of death’ is not mentioned. For Latrodectus mactans ‘Fear of death’ is the most important symptom.

Small remedies: Latrodectus mactans

Latrodectus is a difficult medicine to handle because so few data are known for it. Personal experience led to a surprising effect on a patient with cardiac problems. Several other medicines had a moderate effect. Because of the persistent pain radiating from the heart to the axilla, Latrodectus was tried, with good result. Aconitum had no effect on this patient; the second best medicine was Lycopodium. The fear of death in this patient was deep but not easily expressed. Since then Latrodectus has been used in cases with heart problems (esp. with pain radiating to the shoulder or axilla) and fear of death more frequently and with more success than Aconitum, and sometimes with deep ‘constitutional’ effect.

This kind of experience with small remedies is not uncommon for experienced colleagues. Maybe the only reason that a medicine is rarely prescribed is lack of knowledge. The more a medicine is used, the greater the knowledge about it. The chance of finding a medicine by the use of the repertory is proportional to the number of its symptoms in the repertory. The chance of finding Aconitum (4376 symptoms) is therefore 40 times greater than the chance of finding Latrodectus (109 symptoms). More use of these rare medicines seems justified.

If a medicine is seldom used there is not only the handicap of few recorded symptoms. The emphasis implied by typeface is also absent because of the infrequent use. We might suspect that ‘Fear of death’ is at least as important for Latrodectus as for Aconitum, in spite of the difference in typeface in the rubric. Let us investigate the advantage of using the LR instead of typeface. We assume that Latrodectus is prescribed once in about a thousand prescriptions and that the symptom ‘Fear of death’ is present in 40% of the cases where Latrodectus acted. Furthermore we assume that the prevalence of ‘Fear of death’ is 5% in the rest of the population. The LR(+) of ‘Fear of death’ for Latrodectus is then 0.40 / 0.05 = 8. See Table 3.

 

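|               | Latrodectus mactans | Other medicines | total |
|---------------|---------------------|-----------------|-------|
| Fear death    | 4 (a)               | 500 (b)         | 504   |
| no fear death | 6 (c)               | 9500 (d)        | 9506  |
| total         | 10                  | 10000           | 10010 |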

Table 3: hypothetical 2x2 table for ‘Fear of death’ and Latrodectus mactans

If we do the same for Aconitum, we assume that the prevalence of ‘Fear of death’ in this medicine is also 40%. Assuming that it is prescribed once in about every hundred prescriptions, the figures in the 2x2 table are different, but the LR(+) of ‘Fear of death’ for Aconitum is also about 8. See Table 4.

 

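|               | Aconitum | Other medicines | total |
|---------------|----------|-----------------|-------|
| Fear death    | 4 (a)    | 50 (b)          | 54    |
| no fear death | 6 (c)    | 940 (d)         | 946   |
| total         | 10       | 990             | 1000  |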

Table 4: hypothetical 2x2 table for ‘Fear of death’ and Aconitum

In a former paragraph we showed the graphical representation of the likelihood ratio. The likelihood graph for this symptom with Aconitum and Latrodectus is shown in Figure 2.

We can do this for Sulphur assuming that the prevalence of this symptom is the same as in the rest of the population and that Sulphur is prescribed once in every 50 prescriptions. The likelihood ratio (+) for ‘Fear of death’ for Sulphur is 1 (but it might even be less than 1). See Table 5.

 

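|               | Sulphur | Other medicines | total |
|---------------|---------|-----------------|-------|
| Fear death    | 1 (a)   | 49 (b)          | 50    |
| no fear death | 19 (c)  | 931 (d)         | 950   |
| total         | 20      | 980             | 1000  |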

Table 5: hypothetical 2x2 table for ‘Fear of death’ and Sulphur

In the cases of Aconitum and Latrodectus we see a prior-chance of 10% climb to a posterior-chance of 47% after the symptom ‘Fear of death’ (see graph in the appendix). In the case of Sulphur the posterior-chance will remain 10%.
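The figures for the three hypothetical tables can be recomputed from their cells a to d. The sketch below is an illustration in Python, not part of the original analysis; the function and variable names are hypothetical, and the cell values are the ones given in Tables 3 to 5.

```python
def lr_plus(a: int, b: int, c: int, d: int) -> float:
    """LR+ = (a/(a+c)) / (b/(b+d)) for one 2x2 table."""
    return (a / (a + c)) / (b / (b + d))


def posterior_chance(prior_chance: float, lr: float) -> float:
    """Apply posterior-odds = LR x prior-odds and convert back to a chance."""
    prior_odds = prior_chance / (1 - prior_chance)
    posterior_odds = lr * prior_odds
    return posterior_odds / (1 + posterior_odds)


# Cells (a, b, c, d) of the hypothetical Tables 3, 4 and 5.
tables = {
    "Latrodectus mactans": (4, 500, 6, 9500),
    "Aconitum": (4, 50, 6, 940),
    "Sulphur": (1, 49, 19, 931),
}

prior = 0.10  # prior chance used in the text
results = {}
for medicine, cells in tables.items():
    lr = lr_plus(*cells)
    results[medicine] = (lr, posterior_chance(prior, lr))
    print(f"{medicine}: LR+ = {lr:.1f}, posterior chance = {results[medicine][1]:.0%}")
```

The LR+ for Aconitum comes out slightly below 8 because 50 of 990 is a little more than 5%; the posterior chance still rounds to the same percentage as for Latrodectus.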

When we rely on our materia medica and when we have experience with a rare medicine, in this case Latrodectus, we will prescribe Latrodectus as many times as Aconitum, maybe even more. If we don’t have this experience and rely on the repertory we will frequently prescribe other medicines, like Sulphur, instead of Latrodectus in a case with ‘Fear of death’, despite the fact that this symptom is no clue for Sulphur. Of course these figures are estimated and we need to investigate them further.

Towards research

LR gives us the opportunity to assess homeopathic symptoms scientifically. The best way to do this is prospective research. If we want to perform prospective research on LR we will have to evaluate an enormous number of symptoms. This requires a method that interferes with daily practice as little as possible.

Our first goal is to investigate the possibility of assessing the LR of homeopathic symptoms. Our long-term goal is to update our materia medica and repertory by means of statistical instruments that match the homeopathic methodology.

We do this investigation for ourselves; it will not convince anybody that homeopathy works. We want to assess the best symptoms to include or exclude a certain medicine. The presence and the intensity of many symptoms are judged by clinical experience, just as we do in practice. We want to investigate the same instruments we use in daily practice.

Methods

The first prospective assessment of homeopathic symptoms started in June 2004; ten practices participated. We assessed six symptoms: diarrhoea from anticipation, fear of death, grinding teeth at night, herpes lips, sensitivity to injustice and loquacity. This is a mix of vague and less vague symptoms. We purposely chose symptoms that are not related to the same medicine as far as we know (but this cannot be excluded). Several computer programs were adapted to record and export the presence of the symptoms in each patient, together with all medicines prescribed to each patient and their results. The ten participating doctors were already trained in assessing results during consensus meetings that we have organised since 1997. During these meetings doctors present their best cases regarding two medicines and discuss results: what is the score according to the Glasgow Homeopathic Hospital Outcome Scale (GHHOS), and was it due to the medicine?

During the prospective assessment of LR we held two consensus meetings each year to define symptoms and to discuss intermediate results. These meetings revealed differences between doctors in interpreting results and difficulties in interpreting vague symptoms.

Results March 2007

In March 2007, 3367 patients were included and 3246 prescriptions evaluated. Some results regarding the symptom 'Fear of death' are shown in Table 6. There were 131 patients with fear of death in the total population of 3367 patients (3.9%). Patients reacting well were defined by GHHOS results between 2 and 4, i.e. not only was the presenting complaint better, but constitutional effects were also visible.

   
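| Fear of death (n=131) | LR+  | 95% CI       |
|-----------------------|------|--------------|
| Aconitum              | 6.5  | 1.9 to 21.9  |
| Anacardium            | 12.1 | 6.2 to 23.7  |
| Arsenicum album       | 6.4  | 3.1 to 13.2  |
| Conium                | 3.7  | 1.0 to 13.6  |
| Veratrum album        | 10.4 | 3.5 to 30.9  |
| Sulphur               | 0.35 | 0.05 to 2.5  |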

Table 6: LR results of the symptom 'Fear of death' after 3246 evaluated prescriptions, with 95% confidence intervals (95% CI)

The amount of data permitted us to assess only a small number of LRs with significant values. Many values, such as those for Calcarea carbonica (Calc.) (LR=1.2), Cimicifuga (Cimic.) (LR=4.3), Lac caninum (Lac-c.) (LR=4.3), Nitricum acidum (Nit-ac.) (LR=1.7) and Phosphorus (Phos.) (LR=1.4), had 1 in their 95% confidence interval. If we look at the underlying figures, however, we can see that these data are more reliable than the data of the original repertory. If only 3 of 64 patients responding well to Calcarea carbonica have a fear of death, we can hardly imagine that this symptom strongly indicates Calcarea, as suggested by the repertory. The symptom is repeatedly seen in patients responding well to this medicine, but the medicine is also prescribed many times. By contrast, 5 out of 11 Anacardium patients had fear of death.
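The text does not state how the confidence intervals were calculated. The following Python sketch uses the common log method for the confidence interval of a ratio of two proportions, which is an assumption on our part. It also assumes that the reported counts refer to the 3367 included patients, so that the patients with fear of death who did not respond to the medicine in question are counted among "the rest". The function and constant names are hypothetical.

```python
import math

TOTAL_PATIENTS = 3367   # reported size of the research population
WITH_FEAR = 131         # reported number of patients with fear of death


def lr_plus_with_ci(a: int, n_responders: int, z: float = 1.96):
    """Return (LR+, lower, upper) using the log method (assumed, see lead-in).

    a            -- responders to the medicine who had the symptom
    n_responders -- all responders to the medicine (a+c)
    """
    b = WITH_FEAR - a                       # symptom present in the rest
    n_rest = TOTAL_PATIENTS - n_responders  # all other patients (b+d)
    lr = (a / n_responders) / (b / n_rest)
    # Standard error of ln(LR+) for a ratio of two independent proportions.
    se_log = math.sqrt(1 / a - 1 / n_responders + 1 / b - 1 / n_rest)
    lower = math.exp(math.log(lr) - z * se_log)
    upper = math.exp(math.log(lr) + z * se_log)
    return lr, lower, upper


for medicine, a, n in [("Anacardium", 5, 11), ("Calcarea carbonica", 3, 64)]:
    lr, lower, upper = lr_plus_with_ci(a, n)
    print(f"{medicine}: LR+ = {lr:.1f} (95% CI {lower:.1f} to {upper:.1f})")
```

Under these assumptions the Anacardium result agrees with the reported LR+ of 12.1 and an interval of roughly 6.2 to 23.7, and the interval for Calcarea carbonica includes 1, which is consistent with the text. This agreement suggests, but does not prove, that a similar method was used.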

We conclude that the bold type entries of Calc. and Phos. in this repertory rubric are incorrect; they are just due to the frequent use of these medicines. Nit-ac. should probably not be mentioned in bold type, and Sulphur should not be in this rubric. On the other hand, Anacardium is a surprising outcome; this medicine is strongly related to fear of death.

Of course, we must realise that this is one group of doctors, with their training and experience. There could be differences in other groups in other countries. But if we constitute repertory-rubrics this way we are much more informed than by the existing repertory.

This first study led to three conclusions:

  1. LR research is feasible when using the proper software in daily practice.
  2. It is possible to gather a large amount of data without interfering with daily practice.
  3. There are many mistakes in the repertory. Most of them concern unjust entries of frequently used medicines, but there is no general rule that permits us to discard all frequently used medicines from a rubric.

Changing repertory

When we perform prospective studies on LR, the repertory will change gradually while more symptoms are investigated and symptom rubrics become more complete as the assessment is maintained longer. So we place the new information on top of the old repertory. This is visualised here by a partial reconstruction of the repertory-rubric with LR values added between parentheses.

MIND - Fear - death, of (prevalence 3.9%)
Acon.(6.5), act-sp., agn., all-s., alum., ......, Anac.(12.1), ....., Arg-n.(4.2), Ars.(6.4), ....., calc.(1.2),...., cimic.(2.5),...., con.(3.7)

These figures can only be interpreted in the context of the assessment, so the prevalence of the symptom in the research population is added to the symptom. This way the user can compare the assessment with his interpretation of the symptom and his own population. After the assessment some medicines will prove to have an LR+ of about 1, which means that the symptom is no indication for these medicines (like Calcarea in the example). One might ask if these entries in the repertory are superfluous. It is even possible that LR+ for a medicine is smaller than one, and in that case the entry is wrong, because the symptom pleads against that medicine, as is the case for Sulphur.

A repertory based on LR calls for new considerations, like what to include in or exclude from the repertory. If you have few data, like 1 out of 6 Cimicifuga patients having fear of death, there is no statistically significant value for this medicine. But we would lose valuable information if we did not include this medicine in the rubric. How should we indicate our uncertainties? By giving the raw data, like 1 out of 6? Or the confidence interval, and then which confidence interval? Or p-values?

Many rubrics of the repertory will only change in the long run, especially the smaller rubrics, because they represent infrequent symptoms. The assessment of these symptoms will take a very long time because the prevalence is low, and the gain in assessing them is not so great because we already know that the LR of these symptoms is high. But also in small rubrics there may be medicines with different importance as to the symptom, like Aconitum and Argentum nitricum in the rubric ‘Fear of death, predicts the time’. These differences, represented by typeface, should be based on estimations using the method of LR to make the repertory consistent. The estimations can be made from the information at hand; the materia medica gives us indications about the importance of the symptom in relation to the remedy. This enables us to give a rough estimation of the prevalence of the symptom in the remedy population. We can therefore divide the importance of the symptom into different classes: ‘not important’, ‘important’ and ‘very important’.

Rubric analysis

In the present situation, rubric analysis is only practical for smaller rubrics; in that case we can consider the likeness of each medicine in the rubric with the picture of the patient. In general, small rubrics are more important, i.e. LR is probably high for each medicine in that rubric. This means that every medicine in that rubric is worth considering. For large rubrics this procedure is too time-consuming, and for many medicines in the rubric LR is low, but until we have assessed LR we do not know for which medicines.

Once we have investigated LR we can analyse any rubric, large or small, because we can select the interesting remedies (with high LR). A computer repertory can easily produce a graph that shows the increase of probability that a medicine will work with the corresponding LR. We can also estimate the number of symptoms needed for a reliable prescription. This is represented in a simplified way in Figure 2, where two of those graphs are combined. Suppose that we have two patients, one with three symptoms with LR+=6 and one with four symptoms with LR+=4. If the prior chance that any medicine will work is 1%, the probability of an effect will develop as in Table 7, graphically represented by Figure 4.

 

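|           | Certainty with LR+=6 | Certainty with LR+=4 |
|-----------|----------------------|----------------------|
| Symptom 1 | 6%                   | 4%                   |
| Symptom 2 | 26%                  | 14%                  |
| Symptom 3 | 68%                  | 39%                  |
| Symptom 4 |                      | 72%                  |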

Table 7: development of posterior chance with different LR

We see that we need three symptoms with LR+=6 or four symptoms with LR+=4 to get about the same certainty that the medicine will work. This is a very straightforward example. Real cases have symptoms with different LR. If, say, in the second case (with four symptoms) Symptom 4 had LR+=6, we could extend the last vertical red line in Figure 2 to the upper curve. Posterior chance would become 80%.
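The stepwise build-up of certainty in Table 7 can be sketched as repeated application of the formula posterior-odds = LR x prior-odds, where the posterior of one symptom becomes the prior for the next. The Python sketch below assumes that the symptoms are independent of each other, so that their LRs may simply be multiplied; that assumption is ours and would need checking for real symptoms. The function name is hypothetical.

```python
from typing import List


def chance_after_each_symptom(prior_chance: float, lrs: List[float]) -> List[float]:
    """Return the posterior chance after each successive symptom.

    Assumes independent symptoms: the odds are multiplied by each LR in turn.
    """
    odds = prior_chance / (1 - prior_chance)
    chances = []
    for lr in lrs:
        odds *= lr
        chances.append(odds / (1 + odds))
    return chances


prior = 0.01  # prior chance that any given medicine will work
three_strong = chance_after_each_symptom(prior, [6, 6, 6])
four_weaker = chance_after_each_symptom(prior, [4, 4, 4, 4])
mixed = chance_after_each_symptom(prior, [4, 4, 4, 6])

for label, chances in [("LR+=6", three_strong), ("LR+=4", four_weaker), ("mixed", mixed)]:
    print(label, [f"{c:.1%}" for c in chances])
```

The computed values can differ from Table 7 by about one percentage point because of rounding in the table.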

Conclusion

Our methodology conforms to the generally accepted and increasingly popular scientific principles of Bayes’ theorem. If we estimate the importance of a symptom in homeopathy, we implicitly describe the LR of that symptom. By making these estimations more explicit and by performing prospective studies assessing the LR of homeopathic symptoms we achieve significant improvement of our method:
  • We are more certain about the value of our symptoms and able to estimate the certainty that our medicine will be effective.
  • Randomised trials can show better results by selecting patients with ‘good’ symptoms.
  • Symptoms are better described and results become reproducible.
  • Structural mistakes of the repertory are solved.
  • We develop our own scientific identity.

 

References:

1. Stolper CF, Rutten ALB, Lugten RF, Barthels RJ. Improving homeopathic prescribing by applying epidemiological techniques: the role of likelihood ratio. Homeopathy (2002) 91, 230-238
2. Chalmers A. Wat heet wetenschap. 10de druk. 1999. Boom. ISBN 90 5352494 0
3. Sackett DL, Haynes RB, Guyatt GH, Tugwell P. Clinical epidemiology: a basic science for clinical medicine. 2nd ed. Boston: Little, Brown, 1991
4. Vermeulen M. Dwalingen in de methodologie. XXXVI. Van ‘likelihood’-ratio’s en de regel van Bayes. Ned Tijdschr Geneeskd 2001/50: 2421-4
5. Horton MD, Counter SF, Florence MG, Hart MJ. A prospective trial of computed tomography and ultrasonography for diagnosing appendicitis in the atypical patient. Am J Surg 2000;179(5):379-81
6. Ende J van den, Derese A, e.a. Medische besliskunde. Huisarts Nu 1996;25(9):283-344
7. Rutten ALB, Stolper CF, Lugten RF, Barthels RJ. Assessing likelihood ratio of clinical symptoms: handling vagueness. Homeopathy. 2003;92:182-6
8. Kent JT. Repertory of the Homeopathic Materia Medica. 6th edition, World Homeopathic Links, New Delhi, 1982
9. Hering C. Guiding Symptoms of our Materia Medica. B. Jain Publishers 1974
10. Clarke JH. A dictionary of practical materia medica. Jain Publishing, New Delhi, 1984
11. Rutten ALB, Stolper CF, Lugten RF, Barthels RJ. Is assessment of likelihood ratio of homeopathic symptoms possible? A pilot study. Homeopathy. 2003;92:213-6
12. Rutten ALB, Stolper CF, Lugten RF, Barthels RJ. 'Cure' as the gold standard for likelihood ratio assessment: theoretical considerations. Homeopathy 2004;93:78-83
13. Rutten ALB, Stolper CF, Lugten RF, Barthels RJ. Repertory and likelihood ratio: time for structural changes. Homeopathy. 2004;93:120-124
14. Rutten ALB, Stolper CF, Lugten RF, Barthels RJ. A Bayesian perspective on the reliability of homeopathic repertories. Homeopathy. 2006;95:88-93
15. Rutten ALB. Bayesian homeopathy: talking normal again. Homeopathy. 2007;96:120-124. DOI 10.1016/j.homp.2007.03.004