New repertory, new considerations

A.L.B. Rutten1, C.F. Stolper1, RFG Lugten1, RWJM Barthels1

1 Commissie Methode en Validering VHAN (Dutch Association of Homeopathic Physicians), The Netherlands

Abstract

The criteria for entering medicines in repertory rubrics are unclear and partly incorrect. A new repertory should be based on clear and objective criteria. Retrospective and prospective assessment of medicines and symptoms by the Dutch Committee for Methods and Validation gives an indication of the validity of existing repertory entries. Relying on the experience of one expert is the cause of part of the shortcomings of the repertory. This experience is highly influenced by chance. Another part is due to the use of absolute instead of relative occurrence of symptoms. Yet another part is caused by not comparing prevalence in the population ‘cured’ by a medicine with the remainder of the population. In many cases we need better definitions of symptoms.
A clear protocol and prospective research could overcome many shortcomings of the repertory. Statistics help to get more objective criteria, but we still need to reach consensus about how to handle probabilities of outcomes of our assessments. Homeopathy 2008;97:16-21.

Keywords: likelihood ratio, Bayesian analysis, repertory, clinical symptoms, expert opinion, probability

Introduction

A repertory of homeopathic symptoms, such as Kent’s, is an impressive work, but Kent’s Repertory is one century old. There are systematic mistakes in the existing repertory, for instance using absolute occurrences instead of relative, as we have previously pointed out. If we were to start anew and make a repertory according to modern standards, would this be the same repertory? We should start with a protocol, including rules covering which entries should be made, definitions of symptoms, methods of data gathering, reference standard (gold standard), and handling of bias and of statistical uncertainty.
We cannot disregard the existing repertory, or the fast growing number of variations on Kent’s original repertory. We all suspect that many entries in the repertory are wrong, but it is still unclear which and why. To get rid of old and prevent new false entries we need clear and objective criteria.
The problem is that expert opinion is one of the most important sources of the repertory. Medicine cannot do without it, but its validity is much criticised. We can estimate the value of expert opinion from a theoretical perspective using statistics and especially Bayes’ theorem. This paper presents some points arising from preliminary results of a prospective assessment of the symptom ‘Sensitive to injustice’. This is one of six symptoms we investigated. Our group is also organising consensus meetings retrospectively gathering best cases of the use of a single medicine by 10-20 experienced homeopaths. Some of the retrospective and prospective data can be compared with each other and with entries in the existing repertory. The symptom ‘Sensitive to injustice’ was present in some of the retrospectively analysed medicines. In this paper we compare retrospective and prospective data with each other and with the existing repertory rubric.

Existing rubric

There are many different versions of the symptom-rubric ‘Sensitive to injustice’. For this purpose we used RADAR-Synthesis version 8.1.40, ‘modern to 1987’, which is as follows:

Calcarea carbonica (Calc), Causticum (Caust), Cuprum (Cupr), Drosera (Dros), Ignatia (Ign), Mercurius (Merc), Nux-vomica (Nux-v), Sepia (Sep), Staphisagria (Staph), Veratrum album (Verat).

The entry of Ignatia in italics indicates that the symptom is a stronger indication for Ignatia than for the medicines in plain type. Entries in bold type indicate that the presence of the symptom is a strong indication for those medicines. The authors referenced are C.M. Boger, C. Coulter, Gallavardin, S. Hahnemann, O. Hansen, R. Sankaran, T. Smith and G. Vithoulkas. Nux vomica is attributed to Hahnemann.
If the patient presents an interesting symptom we consult the repertory to see which medicines could be indicated. The repertory entry of a particular medicine is based, among other things, on the fact that the symptom was seen in former cases cured by that medicine. This means that experience from the past tells us what to expect in the case at hand. There is a sound statistical theory to back this procedure: the law of conditional probability. Reverend Thomas Bayes (1702-1761) stated this law as Bayes’ theorem. If we translate this theorem for homeopathy, we can calculate the (posterior) odds (and chance) that a medicine will work given the presence of a certain symptom, see Box.
If we disregard the awkward handling of odds, the chance that the medicine will cure increases if a corresponding symptom has been seen in the past in cured cases. According to Bayes’ theorem there is one vital condition: the cured population should be compared with the remainder of the population. If the prevalence of the symptom in the cured population is not larger than in the remainder of the population, the probability that the medicine will cure is not increased. As yet, there is no comparison with the remainder of the population in the present repertory.
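
The Box referred to above is not reproduced in this text. As a reminder, the usual odds form of Bayes’ theorem, which the argument appears to rely on, can be written as follows. The symbols are chosen here for illustration and are not taken from the source: M means "the medicine works for this patient", S means "the symptom is present", and \bar{M} stands for the remainder of the population.

```latex
\[
\underbrace{\frac{P(M \mid S)}{1 - P(M \mid S)}}_{\text{posterior odds}}
  \;=\; \mathrm{LR} \times
\underbrace{\frac{P(M)}{1 - P(M)}}_{\text{prior odds}},
\qquad
\mathrm{LR} \;=\; \frac{P(S \mid M)}{P(S \mid \bar{M})},
\qquad
\text{chance} \;=\; \frac{\text{odds}}{1 + \text{odds}}
\]
```

If the symptom is equally common in the cured population and in the remainder, LR = 1 and the posterior odds equal the prior odds, which is the condition stated in the text.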

Methods

Since 1997 we have organised retrospective evaluation of the ‘best’ cases of experienced homeopathic doctors (Materia Medica Validation) to learn how medicines could be successfully chosen and how we think. During these consensus meetings results of each case are evaluated using a modified GHHOS scale. We use the nominal group method, known in qualitative research as a method to reach consensus. Participants present their best cases concerning one medicine and discuss criteria to evaluate the prescription. Before this evaluation of cases we asked participants to estimate the prevalence of some symptoms, known as ‘keynotes’ for the given medicines. After evaluation of cases we have a (retrospective) indication about the prevalence of the most important symptoms of each medicine.
Since June 2004, in our prospective study (LR project), 10 experienced homeopathic doctors have been gathering data on all new patients older than two years. The study will go on until December 2007. The symptom ‘Sensitive to injustice’ was one of six symptoms that were checked in all patients. The participating doctors specified the symptom as ‘Sensitivity to injustice done towards others, resulting in subsequent behaviour like turning off the television, writing letters, protesting etc’, but clinical judgement prevails. A paper on vagueness of symptoms was prepared for this meeting. Training in the use of the GHHOS was given at the consensus meetings. Consensus was also reached about required follow-up before entering results in the database. Result GHHOS 2 was entered at least one month after prescription, result GHHOS 3 or 4 after at least 6 months. Participants used database programs to enter data. The co-ordinator (AR) used an Excel spreadsheet and the statistical program Epi-info to evaluate results. All participants met twice a year after sending their data. Participants received feedback on their output during these meetings and by e-mail newsletter.

Results

Retrospective data

Bayes’ theorem is a learning algorithm. We learn how to diagnose appendicitis in practice, after a few cases. We learn because we see certain symptoms in our appendicitis patients that we do not see in other patients. This implicit comparison between appendicitis patients and other patients can be translated explicitly into LR and then Bayes’ formula can be applied.
Ten years’ experience of Materia Medica validation tells us that homeopathic doctors are able to estimate occurrences of important symptoms. We were able to compare the estimates of six symptoms with the outcome of our later prospective assessment. There is interpersonal variation but the mean of all estimates agreed with the prospective outcome within a 5-10% interval. According to those estimates interesting (keynote) symptoms have a prevalence in the whole population of 15% or less. The evaluation of more than 30 medicines also showed that most ‘keynote symptoms’ occur in less than 50% of our best cases concerning one medicine. As an example ‘fear of dark’ was present in only 5 out of the 12 best Stramonium cases (42%).
A symptom that occurs in 45% of the patients responding well to one medicine and in 15% of the rest-population has a moderate LR of 3 (45%/15%). If the symptom had occurred less often, say in 5% of the rest-population, the LR would have been 9. This is consistent with Bayes’ theorem and Hahnemann’s aphorism 153 about the importance of peculiar symptoms. Symptoms with prevalence over 15% in the whole population will have lower LR. Symptoms with much lower prevalence in the whole population, say less than 1%, will not be seen on a daily basis. We therefore estimate that most prescriptions are based on symptoms with prevalence between 1 and 10%. The accordance between estimates and assessed values indicates that practice experience is valuable, but it has some flaws.
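
The arithmetic behind these two LR values can be sketched in a few lines. The function name and argument names are illustrative choices, not part of the source; the two prevalence pairs are the ones quoted in the paragraph above.

```python
def likelihood_ratio(prev_cured: float, prev_rest: float) -> float:
    """LR = prevalence of the symptom among patients responding well to the
    medicine, divided by its prevalence in the rest-population.
    Both arguments are fractions between 0 and 1 (hypothetical helper)."""
    if not 0.0 <= prev_cured <= 1.0:
        raise ValueError("prev_cured must lie between 0 and 1")
    if not 0.0 < prev_rest <= 1.0:
        raise ValueError("prev_rest must be above 0 and at most 1")
    return prev_cured / prev_rest


# 45% among responders against 15% in the rest-population
print(round(likelihood_ratio(0.45, 0.15), 2))  # 3.0
# same responders, but a rarer symptom (5%) in the rest-population
print(round(likelihood_ratio(0.45, 0.05), 2))  # 9.0
```

The rarer the symptom is in the rest-population, the larger the ratio becomes for the same prevalence among responders.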
Bayes’ theorem and our knowledge from bringing cases from several doctors together also explain the problem with expert opinion. During Materia Medica validation 10 to 20 experienced doctors bring their best cases of one homeopathic medicine. It appears that even experienced doctors seldom have more than 3 ‘best’ cases of one medicine. In November 2002 our consensus meeting (retrospectively) showed that 2 out of 18 Staphisagria patients (11%) were sensitive to injustice. The prevalence of the symptom ‘sensitive to injustice’ in the whole population was estimated to be 10%. Later (June 2004), prospective LR research confirmed that the prevalence of the symptom was indeed 10% in 2506 patients, and 2 of the 23 patients (8%) who responded well to Staphisagria during the LR research were sensitive to injustice (see later).
Many entries in the repertory are based on the opinion of one expert. Suppose that the editor of the repertory asks the opinion of an experienced homeopath with 5 good Staphisagria cases. Exact binomial chance calculation based on the data of our prospective research indicates that the chance that one of these 5 patients is sensitive to injustice is about 47%. So 1 out of 2 doctors with 5 cases responding to Staphisagria estimates that the prevalence of the symptom in the Staphisagria population is 1 out of 5 (20%), more than his estimate of 1 out of 10 (10%) for his whole practice. So in the experience of this doctor from his Staphisagria population the symptom ‘sensitive to injustice’ is an indication for Staphisagria. There is a possibility of 11% (corresponding with 1 out of 9 doctors) that 2 out of 5 Staphisagria cases (40%) are sensitive to injustice, meaning that ‘sensitive to injustice’ is a rather strong indication for Staphisagria. The gathering of cases of several colleagues in our consensus meeting and LR assessment increased numbers and showed that the real prevalence of this symptom in the Staphisagria population is less. ‘Sensitivity to injustice’ might not be an indication for Staphisagria. This demonstrates that many entries in the repertory could be influenced by chance. Bias, like confirmation bias, could also influence expert opinion.
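
A minimal sketch of this kind of exact binomial calculation follows. The source does not state which prevalence was entered into the calculation, nor whether "one of these 5" means exactly one or at least one. The sketch reads the figures as "at least one" and "at least two", and uses an assumed prevalence of 0.12, chosen only because it gives values close to the 47% and 11% quoted above. All names are illustrative.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k patients with the symptom among n cases."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def prob_at_least(k: int, n: int, p: float) -> float:
    """Probability of k or more patients with the symptom among n cases."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


# ASSUMPTION: 0.12 is not given in the source; it is an inferred value.
ASSUMED_PREVALENCE = 0.12
CASES_PER_EXPERT = 5  # "an experienced homeopath with 5 good Staphisagria cases"

print(round(prob_at_least(1, CASES_PER_EXPERT, ASSUMED_PREVALENCE), 2))  # 0.47
print(round(prob_at_least(2, CASES_PER_EXPERT, ASSUMED_PREVALENCE), 2))  # 0.11
```

With so few cases per expert, a sizeable share of experts will see the symptom over-represented among their own cases purely by chance, which is the point the paragraph makes.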

Prospective study

After 33 months (February 2007) 3367 patients entered the study and 3246 prescriptions were evaluated. The symptom ‘Sensitive to injustice’ was present in 330 patients (10%). Outcomes for these patients are shown in Table 1.
This table has some similarities to the existing repertory-rubric: Calc, Caust, Ign, Merc, Sep and Staph are indeed repeatedly observed in relation to this symptom. The medicines Anacardium (Anac), Aurum (Aur), Belladonna (Bell), Carcinosinum (Carc), Carcinosinum con cuprum (Carc-c-c), Cocculus (Cocc), Kalium bichromicum (Kali-bi), Natrium muriaticum (Nat-m) and Phosphoricum acidum (Ph-ac) are new. The symptom has been seen in the Natrium muriaticum population (13 patients) not much less often than in the Causticum population (15 patients).

Medicine   Patients with result GHHOS 2-4, sensitive to injustice
anac       5
aur        2
bell       4
calc       6
carc       8
carc-c-c   4
caust      15
cocc       4
ign        6
merc       7
nat-m      13
nux-v      2
ph-ac      3
sep        7
staph      2

Table 1: Preliminary outcome of the assessment of the symptom 'Sensitive to injustice'. Number of patients in each population with result GHHOS 2-4, absolute occurrence.

Discussion

The fact that a symptom is repeatedly observed in a population cured by a certain medicine was so far enough to include the medicine with emphasis in the repertory-rubric, but is not enough to apply Bayes’ theorem. If the symptom occurs fairly frequently and if the medicine is often used, like Calcarea in this case, it is possible that the symptom is not characteristic for Calcarea patients as a group. We must also know how many patients cured by that medicine are not sensitive to injustice, see Table 2.

Medicine   Sensitive to injustice   Not sensitive to injustice   Prevalence   LR     95% CI
anac       5                        6                            45%          4.69   2.44 to 9.04
aur        2                        7                            22%          2.28   0.67 to 7.76
bell       4                        13                           24%          2.42   1.02 to 5.73
calc       6                        58                           10%          0.96   0.44 to 2.06
carc       8                        27                           23%          2.37   1.28 to 4.39
carc-c-c   4                        4                            50%          5.15   2.56 to 10.38
caust      15                       23                           39%          4.17   2.78 to 6.27
cocc       4                        6                            40%          4.12   1.92 to 8.86
ign        6                        17                           26%          2.69   1.34 to 5.40
merc       7                        38                           16%          1.60   0.80 to 3.19
nat-m      13                       116                          10%          1.03   0.61 to 1.74
nux-v      2                        28                           7%           0.68   0.18 to 2.60
ph-ac      3                        16                           16%          1.62   0.57 to 4.59
sep        7                        69                           9%           0.94   0.46 to 1.92
staph      2                        23                           8%           0.82   0.21 to 3.09

Table 2: Preliminary outcome of the assessment of the symptom 'Sensitive to injustice', prevalence of the symptom in each ‘medicine-population’ and LR. Prevalence in the whole population is 10%. 95% CI = 95% Confidence Interval.

We calculated LR knowing that the prevalence of the symptom was 10% in the whole population. Then we see that the prevalence of this symptom is less among Nux vomica and Staphisagria patients than in the rest-population, and therefore LR<1. Sensitivity to injustice is no indication for Nux vomica and Staphisagria although we have several patients with this symptom. For Calcarea and Sepia LR is just below 1, in other words it is a contraindication!
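
One common way to obtain an LR with a 95% confidence interval from such counts is the log method, sketched below. The source does not say which method or which denominators were used, so this is an assumption. The counts for the remainder of the population are also an assumption: they are derived here from the 330 symptomatic patients and 3246 evaluated prescriptions quoted in the text. The results therefore come out close to, but not identical with, the Anacardium row of Table 2.

```python
from math import exp, log, sqrt


def lr_with_ci(sym_med: int, nosym_med: int, sym_rest: int, nosym_rest: int,
               z: float = 1.96) -> tuple[float, float, float]:
    """LR and confidence interval by the log method (assumed here, not
    confirmed by the source). All four counts must be greater than zero."""
    n_med = sym_med + nosym_med      # patients responding to the medicine
    n_rest = sym_rest + nosym_rest   # remainder of the population
    lr = (sym_med / n_med) / (sym_rest / n_rest)
    # standard error of ln(LR) for a ratio of two proportions
    se = sqrt(1 / sym_med - 1 / n_med + 1 / sym_rest - 1 / n_rest)
    return lr, exp(log(lr) - z * se), exp(log(lr) + z * se)


# Anacardium: 5 sensitive and 6 not sensitive (from the table above).
# ASSUMED remainder: 330 - 5 symptomatic out of 3246 - 11 other patients.
sym_rest = 330 - 5
nosym_rest = (3246 - 11) - sym_rest
lr, low, high = lr_with_ci(5, 6, sym_rest, nosym_rest)
print(f"LR = {lr:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

With these assumed denominators the sketch gives an LR of roughly 4.5 with an interval of roughly 2.3 to 8.7, against 4.69 (2.44 to 9.04) in the table; the relative width of the interval is very similar.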

Translating LR into type

To make a comparison with the existing entries of Kent’s repertory we have to translate type (expressing importance of the symptom related to that medicine) into numbers. Such a translation is rather arbitrary; a possible translation from type into LR is shown in Table 3:

Type      LR
Plain     1.5-3.0
Italics   3.0-6.0
Bold      > 6.0

Table 3: Repertory entries translated into LR values

The choice of LR>6 for bold type could be justified as follows: There is consensus that three good symptoms pointing to the same medicine indicate a reliable prescription. Three symptoms with LR=6 give a combined LR of 6x6x6=216 (LRs should be multiplied). Suppose that the prior chance that any medicine works is 1%, then the posterior chance after three symptoms with LR=6 becomes 69%. We don’t know prior chances that homeopathic medicines work, but these values of prior chance and LR reasonably fit the results of our assessment so far. If we have five symptoms with LR=3, the resulting combined LR=3x3x3x3x3=243. With this combined LR chances go from 1% to 71%. So five symptoms with LR=3 give about the same result as three symptoms with LR=6. The choice of LR=1.5 for a plain type entry is arbitrary.
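
The step from prior chance to posterior chance in this paragraph can be checked with a short sketch. It multiplies the LRs, as the text does, which treats the symptoms as independent. The function name is illustrative; the 1% prior and the LR values are those used in the paragraph.

```python
from math import prod


def posterior_chance(prior_chance: float, lrs: list[float]) -> float:
    """Convert the prior chance to odds, multiply by every LR, convert back."""
    prior_odds = prior_chance / (1.0 - prior_chance)
    posterior_odds = prior_odds * prod(lrs)
    return posterior_odds / (1.0 + posterior_odds)


# three symptoms with LR=6: combined LR 216
print(round(posterior_chance(0.01, [6, 6, 6]), 2))  # 0.69
# five symptoms with LR=3: combined LR 243
print(round(posterior_chance(0.01, [3] * 5), 2))    # 0.71
```

Working through odds is what makes the multiplication valid: a chance of 1% is an odds of 1/99, and 216/99 converts back to a chance of about 69%.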

Developing a new repertory

As far as we know our assessment is the first prospective assessment of the prognostic properties of homeopathic symptoms, and assessment of the prevalence of symptoms in the whole population. Our retrospective validation of medicines just indicated the prevalence of the most important symptoms in the population ‘cured’ by that medicine. If we want to enter our results into the repertories we should make a new start guaranteeing reliability and reproducibility of new entries, and maybe we should discard old entries that become questionable with these results. But we must not lose or disregard useful information, and existing entries should be handled carefully.

What would a scientifically sound repertory rubric look like? According to our preliminary results it would be:

Anac, Aur, Bell, Carc, Carc-c-c?, Caust, Cocc, Ign, Merc?, Ph-ac?.

Numbers are still small, particularly for Carcinosinum con cuprum and Phosphoricum acidum so these results should be handled cautiously. Still, this rubric is based on a better defined and more reliable process than most existing repertory entries. There are a few problems. The fact that there are no Cuprum and Drosera cases could be explained by the fact that those medicines are not commonly used. It could be that our group had no ‘specialists’ in prescribing those medicines.

Prospective research slightly alters assessment of symptoms. If you ask every patient if he is sensitive to injustice many patients will answer affirmatively. We made some specifications for this symptom, but this process is rather subjective. In this population the prevalence of the symptom was 10%. But if a patient has a very strong sensitivity to injustice, such as occurs in, say, only 1% of all people, the LRs are larger than the ones we measured. This is intuitively understandable: a stronger symptom is a stronger indication.

The medicines Phosphoricum acidum and Mercurius have LRs >1.5, but this could still depend on chance. For Nux-vomica LR could still be above 1.5 according to the 95% Confidence Interval.

If we are to constitute a repertory with scientific assessment we should regard the influence of chance. How reliable are our numbers – or, if we measured a certain value for LR repeatedly, would it always be the same? Statistics teaches that possible values from repeated measurements are distributed around the sample value. Figure 1 shows simplified distributions for possible values of the LR of the symptom ‘Sensitive to injustice’ for Aurum (LR=2.28) and for Nux vomica (LR=0.68). The chance that the LR for Aurum is larger than 1.5 is indicated by the area under the left curve to the right of the vertical line at LR=1.5; it is 85.9%, calculated as exact binomial chance. The chance that the LR for Nux vomica is larger than 1.5 is 15.1%, as shown in the right curve.

Figure 1: Distributions of chances that Aurum (left) and Nux vomica (right) have LR>1.5.

Adding or discarding entries

When should a new entry be added, and moreover, when should an old entry be removed? We could discuss this question in this example regarding the entries in Table 4.

Medicine LR p-value
Aurum 2.28 0.859
Calcarea carbonica 0.96 0.136
Carcinosinum 2.37 0.931
Carcinosinum con cuprum 5.15 0.997
Cocculus 4.12 0.990
Ignatia 2.69 0.954
Mercurius 1.68 0.639
Nux vomica 0.68 0.151
Sepia 0.94 0.100
Staphisagria 0.82 0.254

Table 4: Some medicines correlated to the symptom 'Sensitive to injustice', LR and probability that LR>1.5.

The p-values in Table 4 indicate chances that LRs of these medicines are >1.5. For Calcarea carbonica chances are about 14% (p=0.136). If we have many cases, like 76 for Sepia, we are confident (p=0.100) that the LR-value will not be larger than 1.5. With a smaller number of cases, like 25 for Staphisagria with the same LR value, we are less confident (p=0.254).
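
The source does not spell out the formula behind these p-values. The sketch below rests on an assumption that reproduces the Table 4 values for Aurum, Nux vomica and Staphisagria: the p-value is taken as the cumulative binomial chance of seeing at most the observed number of symptomatic patients if the prevalence in that medicine-population were exactly the threshold LR times the 10% whole-population prevalence. Case counts come from Table 2; the function names are illustrative.

```python
from math import comb


def binom_cdf(x: int, n: int, p: float) -> float:
    """Chance of x or fewer patients with the symptom among n cases."""
    return sum(comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(x + 1))


def chance_lr_exceeds(threshold_lr: float, with_symptom: int, total: int,
                      population_prevalence: float = 0.10) -> float:
    """ASSUMED reading of the Table 4 p-value: cumulative binomial chance
    evaluated at prevalence = threshold_lr * population_prevalence."""
    return binom_cdf(with_symptom, total, threshold_lr * population_prevalence)


# (medicine, patients with symptom, all patients responding to the medicine)
for name, with_symptom, total in [("Aurum", 2, 9),
                                  ("Nux vomica", 2, 30),
                                  ("Staphisagria", 2, 25)]:
    print(name, round(chance_lr_exceeds(1.5, with_symptom, total), 3))
# prints 0.859, 0.151 and 0.254 respectively
```

Strictly speaking this quantity is the chance of observing so few symptomatic patients if the true LR were exactly at the threshold; the text reads it as the chance that the LR exceeds the threshold.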
We emphasise that our aim is not to prove or falsify a hypothesis. According to scientific convention we should form a hypothesis and try to falsify it. Such a hypothesis could be that ‘Sensitivity to injustice’ is no indication for Aurum. This hypothesis for Aurum is not falsified by our data: chances are 0.141 (1-0.859), so larger than p=0.05, the generally accepted value for significant results. If we want to maintain such criteria our repertory would become very thin and most existing entries should be discarded. The knowledge that ‘sensitivity to injustice’ has 64% chance of indicating Mercurius (LR>1.5) might still be relevant for daily practice. But what should be the cut-off value of the p-value for adding an entry to the repertory if LR should be larger than 1.5? And should we take LR>1.5 as lower limit? If we take all values for LR>1.0, the chance that the LR for Mercurius >1.0 is 92%.
The next question is when do we discard an existing entry? According to our assessment Nux-vomica should be discarded; chances that the LR could be more than 1.5 are 15%, chances that LR could be larger than 1 are 41% (exact binomial chance). Should it make a difference that this entry comes from Hahnemann? Hahnemann could also be a victim of chance. It is also possible that Hahnemann used a different definition for ‘sensitive to injustice’. A limitation of assessment by one group is that there may be colleagues that have much experience with medicines that are not known to the doctors that participated in this assessment. This kind of assessment is also not suited for rarely prescribed medicines; existing entries should not be removed from the repertory when LR research does not indicate them. We should strive for consensus about infrequently used medicines that come up in LR research, like Carcinosinum con cuprum. The data for Carcinosinum con cuprum from this assessment come from two observers; for this medicine we could not claim that results are derived from a multicentre assessment.
Another reason to be careful with discarding medicines from the repertory is possible interaction between symptoms. In the repertory, as in our assessment, symptoms are considered as independent entities. But it is possible that the combination of two symptoms with low LR is a much stronger indication for a medicine than expected from the LRs separately. Repertorisation could metaphorically be compared with a weather forecast. You get variables like temperature, wind and rain as independent values, but your decision about what to do tomorrow depends on a complex weighting of these variables.
It doesn’t seem wise to just discard entries with a certainty of less than, say, 80%, but we should set a certain limit. Above this limit we could indicate statistical certainties. How do we indicate certainty in a feasible way? We could use p-values like in Table 4 or confidence intervals. We could also give the measured values, like in Table 2. In any case, we have to get accustomed to replacing some of our intuition about expert-experience by statistical considerations about scientific assessment. But it will be a slow shift and there will probably remain the need for intuition.

Bias

The method we present here is an improvement compared to the unsatisfactory process that led to the development of existing repertories. We increased numbers to diminish the influence of chance in our retrospective assessment and we made a comparison with the remainder of the population in our prospective assessment. We got a clearer insight into the influence of chance, but bias is still possible. Maybe we used a different definition than former contributors for this symptom. We don’t know if our results are valid all over the world. The symptom ‘Sensitive to injustice’, like many other homeopathic symptoms, is very subjective and therefore liable to confirmation bias5; it will be detected sooner if we think of Causticum on other grounds. Prospective investigation slightly alters our consultation; will our results differ from retrospective data and why? Our retrospective analysis of 10 best Causticum cases (November 1998) showed 4 patients (40%) who were sensitive to injustice, not different from our prospective research. We have seen several similarities in our retrospective and prospective data, but only for symptoms that are the most prominent for a particular medicine. Our reference is ‘cure’ measured according to the GHHOS scale, but is this scale valid?

Conclusion

The homeopathic method is, unconsciously, based on the sound scientific theory of conditional probability, but our repertories are not. Expert opinion is valuable, but liable to bias and the influence of chance. The influence of chance could be diminished by gathering more cases from different doctors. If we want to use our experience in a state-of-the-art way we must know the prevalence of homeopathic symptoms in the populations cured by the medicines, but also in the remainder of the populations. The relation between these two values is expressed as Likelihood Ratio (LR). The best way to assess this is prospective research.

We should develop new criteria for entering or discarding entries in the repertory. We probably need a different way of handling statistical uncertainty than in hypothesis testing. Symptoms should be more clearly defined. Our scales measuring cure should be validated. Most of the questions we presented here have been issues as long as repertories have existed, but were avoided. To do it right we should handle these questions properly. Our considerations are about clinical data, but many may apply to homeopathic pathogenetic trials (provings).

Acknowledgements

Our retrospective research was supported by the SHO (homeopathic post-grade training for doctors). Our retrospective research was sponsored by KVHN (royal Dutch patient’s organisation for homeopathy), the Louise van Eeghen foundation, SFWOH (foundation for homeopathic research) and VHAN (Dutch homeopathic doctor’s association). The doctors participating in the prospective research in June 2006 were Rob Barthels, Paul Fruijtier, Gerard Jansen, Jean Pierre Jansen, Christien Klein, Roland Lugten, René van der Reijden, Lex Rutten, Erik Stolper and Janny Verhey.