Les Crises
15 April 2020

The tremendous ethical and methodological flaws in the Raoult clinical trial: analysis, by Olivier Berruyer

Today we provide a scientific analysis of the chloroquine trial conducted by Raoult, Gautret et al., which was widely covered by the press two weeks ago and triggered the current controversy.

This is a translation of an article originally written in French. Our apologies for any typos…

I. Outline of the article

We will show in this article that the ethical and methodological flaws of the Raoult/Gautret trial made it impossible to interpret its results.

Outline :

    1. Outline of the article
    2. The Philippe Gautret/ Didier Raoult trial dating from March 2020
    3. Results of the trial and discussion
    4. Was this trial legal ?
    5. Conclusions that can be drawn from the trial
    6. Reactions and analysis from scientists
    7. Summary of the problems encountered
    8. Towards retraction

II. The Philippe Gautret/ Didier Raoult trial dating from March 2020

2-1 « Hydroxychloroquine plus Azithromycin as a Treatment for covid-19: Results of a Non-Randomized Open-Label Clinical Trial »

2-2 The specifications of the trial

2-3 Criteria for judging the result

2-4 Demographics

2-5 No conflicts of interest?

2-6 Weird dates

2-1 « Hydroxychloroquine plus Azithromycin as a Treatment for Covid-19: Results of a Non-Randomized Open-Label Clinical Trial »

This trial was published by the IHU of Marseille on its website on March 17 (source, archive, pdf archive) and in the International Journal of Antimicrobial Agents on March 20 (source, archive, pdf archive):

As is traditional at the IHU (though certainly not only there), the paper carries a truckload of 18 signatories: Philippe Gautret, Jean-Christophe Lagier, Philippe Parola, Van Thuan Hoang, Line Meddeb, Morgane Mailhe, Barbara Doudier, Johan Courjon, Valérie Giordanengo, Vera Esteves Vieira, Hervé Tissot Dupont, Stéphane Honoré, Philippe Colson, Éric Chabrière, Bernard La Scola, Jean-Marc Rolain, Philippe Brouqui, Didier Raoult.

With so many brains at work, you’d think we’d be dealing with a pretty damn good study…

In addition, let’s mention that Raoult presented these results to his students on March 16 on YouTube (source, at 14’31):

2-2 The specifications of the trial

This trial is open-label and non-randomized: patients know what they are being given, and allocation to the groups was not random. All this greatly reduces the robustness of the trial, though it can still be of interest provided we do not jump to conclusions.

So there are going to be 2 groups: one with chloroquine and one without chloroquine.

The group being given chloroquine is at the IHU, the one without it is divided between the IHU, Nice, Avignon and Briançon.

First problem: the control-group patients are scattered across 3 other centres, which were probably overwhelmed as well. The big problem is that this multi-centre design does not split the infected patients between treatment and control within each centre. Marseille is practically the only centre giving chloroquine, where almost 100% of patients are treated, and all the other centres have only control patients.

It is thus impossible to ensure that the protocol is properly followed: the control patients could, for example, receive poorer care, or simply medical care that is different from what was planned.

Right then, so who are these patients, what are the criteria for joining the study?

The investigators therefore decided that 2 criteria had to be met:

  • be over 12 years old
  • have some virus at the back of the nose

Health condition is not a criterion :

So here we are, with three groups of patients :

  1. asymptomatic : no clinical signs;
  2. « URTI » : who suffer from rhinitis, pharyngitis, or moderate fever and muscle pain;
  3. « LRTI » : who suffer from pneumonia or bronchitis.

So there is a gradation of seriousness, even though apparently URTIs can be hospitalized and LRTIs not.

Some patients have been excluded: those with particular pathologies (eye or heart problems) or pregnant women :

On the other hand, those who were excluded and those who refused treatment were placed in the control group :

New problem: the control group therefore has particular characteristics (excluded pathologies, refusal of treatment) that can bias the comparison with the treated group.

A quick perusal of the article suggests that the trial is over – just look at its title:

But in fact no, these are actually preliminary results after six days of treatment:

We are led to understand that the trial is to last 14 days

2-3 Criteria for judging the result

When the sponsor plans a clinical trial, the objective and purpose of the trial should be clearly defined.

To this end, a main criterion, the « primary endpoint », is defined: it is the main outcome of the study, on which the conclusion about the efficacy of the treatment rests.

Generally speaking, trial sponsors also add secondary endpoints. However, only the primary endpoint, if statistically significant, can support a conclusion; this is never the case for secondary endpoints taken on their own. A trap some authors fall into is to draw conclusions from the secondary endpoints when the result on the primary endpoint is not significant. This is improper and a significant analytical error: when the primary endpoint is negative, nothing can be concluded from the study on the basis of the secondary endpoints that were met. The role of the secondary endpoints is simply to supplement the message of the primary endpoint.

Here is the one from the Raoult & Gautret trial:

With this trial, the primary objective is to measure the clearance of the virus (in the back of the nose) on day 6 after inclusion.

The secondary endpoints are therefore:

  • elimination of the virus (in the back of the nose) on day 14;
  • improved clinical outcome: body temperature, respiratory rate, length of hospital stay and mortality;
  • the occurrence of side effects.

2-4 Demographics

All the parameters having been determined, the groups of patients still need to be constituted. To do so, statistics are used to define their size:

The team therefore indicated that their statistical analysis showed they would need to recruit 48 patients, treating 24 and keeping 24 in a control group.

Problem: Dominique Costagliola, Member of the Academy of Sciences, Vice-Dean Research Delegate of the Faculty of Medicine, Sorbonne University, Deputy Director of the Pierre Louis Institute of Epidemiology and Public Health, at the Sorbonne University, and a specialist in trials, has recalculated the figures and is unable to find the same result. (source)

And thus, the IHU included 42 patients « who met the conditions for inclusion » in this study and divided them into two groups:

  • 26 were treated with hydroxychloroquine;
  • 16 were the control group, and were not treated with hydroxychloroquine.

Problem: the control group is one third smaller than the 24 patients required by the team’s own statistical analysis.

Another problem: the trial is not « randomized ». Patients are not randomly assigned to the control group or the treatment group. The investigators chose whom they were going to treat with hydroxychloroquine (HCQ) (here, the allocation is in fact geographical and depends on the treatment centre) and, above all, whom they would also treat with an antibiotic, which introduces a very strong bias.

Finally, you will note that they report having included « 36 of the 42 selected patients ». But this is false; they included all 42 patients! Let’s consider what happened.

2-5 No conflicts of interest?

As far as we can tell, the authors haven’t mentioned anything concerning possible conflicts of interest (source):

N/A = not available

And this is really a pity, as this trial is an IHU Marseille study

on chloroquine, which is in fact produced by the Sanofi laboratory,

and Sanofi is one of IHU Marseille’s partners (sources there and here).

Here are the partners of IHU-IM:

Be aware that Sanofi Aventis is one of the funders that finance the Institute, which means that Raoult often meets with them (this is the 3rd largest pharmaceutical laboratory in the world). One just has to keep in mind that there is a link there.

2-6 Weird dates

We would like to draw your attention to the problem with the dates of this trial.

This is what the published article indicates (source, archive) :

Patients were therefore included in « a single-arm protocol from early March to March 16th. »

First of all, the characteristic precision of the Marseille IHU on the « beginning of March » is to be noted. They therefore do not seem to want to indicate what the first calendar day of the trial is.

This being said, they indicate an ending date: March 16th.

But the trial is a 14-day trial, with a primary endpoint assessed on day 6 (or day 7, depending on how you count the day of inclusion; as we have seen, the team calls it D0, D for day).

So there are only 2 options: either March 16th is D6, or it is D14.

2-6-1 Scenario D14

This hypothesis is supported by the sentence mentioning « a single-arm protocol from early March to March 16th ». This trial is not supposed to have two protocols.

Therefore, assuming March 16th is D14, this would mean that inclusion day D0 would be March 2nd, and patients would have been treated from March 3rd to March 16th, and therefore D6 would be March 8th and the trial would have ended on March 17th.

Furthermore, since Didier Raoult presented the D6 findings on March 16th, that would have allowed the team one week to analyze the results and write the article.

But in fact, this doesn’t seem feasible. First of all, if the trial ended that day, why « urgently » publish an evaluation at D6 on that same March 16th ?

And more importantly, we are told that the trial was not approved by the authorities until March 5th and 6th:

It is obviously illegal to carry out a clinical trial without the agreement of the authorities, and this is a criminal offence:

Article L.1126-5 of the Public Health Code

« Conducting, or having conducted, research involving a human being […] without having obtained the positive approval of a committee for the protection of persons shall be liable to one year’s imprisonment and a fine of 15,000 euros ».

Which means that D0 can only be, at best, March 6th or 7th, and therefore D6 can only be Thursday March 12th or Friday March 13th.

Since assumption D14 seems to be invalidated, then the following one is the correct one.

2-6-2 Scenario D6

We then have to accept that the wording of the protocol is unfortunate and that March 16th is D6. This would mean that D0, the inclusion day, was March 10th, that the patients were treated from March 11th to March 24th, and that the trial ended on March 25th.

But when you look at the publication:

So we can see that it means that on Monday, March 16th, it would have been necessary to :

  • collect the tests from all 30 patients;
  • perform all 30 tests;
  • report and synthesize the results in Marseille;
  • carry out the complete study and the graphs;
  • have it re-read and signed by the 18 people;
  • get the slides ready for the PowerPoint presented by Didier Raoult on March 16th;
  • send the article to the International Journal of Antimicrobial Agents (IJAA) for review.

So on March 17th, the IJAA found a reviewer (maybe two), who reviewed this article and validated it within the same day. That’s really fast. Too fast, it seems…

As can be seen, this hypothesis, although legal, raises serious problems of credibility.

Problem: It is very difficult to establish the calendar dates D0, D6 and D14 for this clinical trial.
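The date arithmetic behind the two scenarios can be checked with a minimal sketch. It assumes, as in the reasoning above, that D0 is the inclusion day and that Dn falls n calendar days later; the function name `trial_calendar` is ours and does not come from the protocol.

```python
from datetime import date, timedelta

def trial_calendar(anchor_day: int, anchor_date: date) -> dict:
    """Return the calendar dates of D0, D6 and D14, given that
    D<anchor_day> falls on anchor_date (assumption: Dn = D0 + n days)."""
    d0 = anchor_date - timedelta(days=anchor_day)
    return {f"D{n}": d0 + timedelta(days=n) for n in (0, 6, 14)}

march_16 = date(2020, 3, 16)

# Scenario D14: March 16th is the last day of treatment.
scenario_d14 = trial_calendar(14, march_16)
# Scenario D6: March 16th is the day of the primary endpoint.
scenario_d6 = trial_calendar(6, march_16)

for name, cal in (("D14", scenario_d14), ("D6", scenario_d6)):
    print(name, {k: v.isoformat() for k, v in cal.items()})

# Earliest possible D6 if D0 cannot precede the approval of March 6th.
earliest_legal_d6 = date(2020, 3, 6) + timedelta(days=6)
print("Earliest legal D6:", earliest_legal_d6.strftime("%A %d %B %Y"))
```

Under the D14 scenario, D0 falls on March 2nd, before the approvals; under the D6 scenario, D14 falls on March 24th, after the article was published.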

III. Results of the trial and discussion

3-1 Strange lost people !

3-2 The 36 patients (remaining)

3-3 Hydroxychloroquine results

3-4 Detailed patient data

3-5 The problem of test reliability

3-6 Analysis of the results for hydroxychloroquine

3-7 Results for hydroxychloroquine plus azithromycin

3-8 Viral carriage

3-9 Let’s put some seriousness into this trial…

3-10 One last big problem

As we previously mentioned, understanding a clinical trial is not very complicated… Let’s move on to the analysis of the results.

3-1 Strange lost people !

Clinical trials for drug development are frequently carried out on thousands of patients over months or even years. It may therefore happen that some people leave the trial without informing anyone (relocation, weariness, etc.). These are the patients said to be « lost to follow-up », or « lost » for short: they were there at the beginning but are no longer there at the end, without the reason necessarily being known.

And, you won’t believe this, but out of the 26 people treated with chloroquine, the authors indicate that there were… 6 lost people within 6 days!

So all those who « stopped the treatment early » are called « lost ». And the reasons why they did so are very interesting:

The first of them decided to leave the hospital on day 3, and on days 1 and 2 he had no virus left in his samples. That is what we call a cured patient. On the one hand, we can say that it was a success, but on the other hand, the treatment obviously had nothing to do with it. One can even wonder if he was really sick the day the trial started.

The second voluntarily stopped the treatment because of nausea on the 3rd day, while he was still infected with the virus. We can imagine that he must have been extremely nauseated to choose to stop the treatment… So in fact he is not « lost » at all: it is a classic case of treatment failure, linked to side effects that were too difficult to cope with.

But let’s continue the analysis of our lost to follow up patients:

Here’s something different: 3 have simply gone into emergency intensive care ! On days two, three and four. So these are serious treatment failures.

For the sixth « lost » patient it’s even worse:

He died on the third day.

His nose testing revealed that he was not carrying the virus anymore.

So he died with no virus in his nose… (which is common: death is often actually due to a runaway immune response; and the virus could also be somewhere else)

So here we are again with another serious chloroquine treatment failure.

And Raoult’s team swept those 5 failures out of the study, and quietly made them look like lost to follow up patients !

Which in fact means that they gave a patient a new treatment, he died 3 days after that, they just shrugged their shoulders and dropped him from the study as if he had decided to go home. Same for the 3 sent to intensive care and the one with intolerable side effects.

Never have I seen anything of the kind in a trial report before! This comes close to the threshold of scientific fraud.

Quite obviously, there’s no evidence that chloroquine was involved in any of the five failures. But there’s also nothing to say that it wasn’t.

Because in fact, in the control group (without chloroquine): no death, nor any transfer to the intensive care unit:

This might have been a piece of luck, it might have been normal. Or maybe it wasn’t…

For those of you that are interested, here is a summary of what is now a nugget among clinical trials:

But let’s carry on, because the study, having camouflaged its failures, continues – on the basis of only 20 patients treated with chloroquine and still 16 patients in the control group:

Problem: 20 and 16 are therefore quite far from the statistical need for 24 and 24.
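The gap between the planned and the analysed groups can be quantified with a short sketch. The figures are those quoted above (24 planned per arm, 26 treated at inclusion, 6 dropped, 16 controls); the variable and function names are ours.

```python
PLANNED_PER_ARM = 24        # from the team's own sample-size calculation
treated_included = 26       # patients who started hydroxychloroquine
treated_excluded = 6        # patients dropped from the analysis before day 6
control_analysed = 16

treated_analysed = treated_included - treated_excluded

def shortfall(analysed: int, planned: int = PLANNED_PER_ARM) -> float:
    """Fraction of the planned arm size that is missing from the analysis."""
    return (planned - analysed) / planned

# Share of the treated arm that disappeared from the analysis in 6 days.
attrition = treated_excluded / treated_included

print(f"Treated arm analysed: {treated_analysed} (shortfall {shortfall(treated_analysed):.1%})")
print(f"Control arm analysed: {control_analysed} (shortfall {shortfall(control_analysed):.1%})")
print(f"Attrition in the treated arm within 6 days: {attrition:.1%}")
```

Both arms fall short of the planned size, and nearly a quarter of the treated patients are missing from the day-6 analysis.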

And it is strange to note that, while the number of patients treated with chloroquine was 26 at the beginning and 20 at the end, Raoult publicly speaks of « 24 treated patients » (sources here, here and here) :

3-2 The 36 (remaining) patients

Let’s then discover these 36 patients who are still present in the trial on the 6th day:

This quite difficult-to-read chart mainly deals with: their age, sex, clinical status, progress in the disease, whether they were treated with hydroxychloroquine and its concentration in the blood, and their virus load on each of the 6 days.

One detail stands out. Let’s remember this:

These are 36 patients who « meet the inclusion criteria ». Criteria which are simple:

There are only two of them:

  1. Be over twelve years old
  2. Have some virus in the nose

Good enough.

And then, what do we notice ?

Patients 1 and 4 are ten years old: they don’t satisfy the admission criterion for the trial (even though they did not get chloroquine).

That’s it, let’s call it game over. Bravo « to the most cited microbiology researcher in France ».

But let’s go on.

Let us point out another problem: there are serious differences in the data for the control group between the preprint of 20 March on medRxiv (source) and the final, authoritative version in the IJAA journal on ScienceDirect (also 20 March, source) (this was pointed out by Leonid Schneider, from a PubPeer alert):

Problem: Was the collection of test data from the control group centers really reliable?

3-3 Hydroxychloroquine results

The potential efficacy of chloroquine in vitro, when poured on cell cultures that are in test tubes, has been repeatedly cited by investigators:

It just happens that in vitro efficacy says nothing about effectiveness in humans. And the team knows this very well, since for 10 years they have been trying this against many viruses, with no proven effectiveness in humans at reducing the viral load, and even with a demonstrated increase in the viral load in infections such as AIDS, Chikungunya and influenza, as shown in this post.

Here is the result as reported:

After 6 days, « 70% of patients treated with chloroquine » were « cured of the virus », compared with « 12.5% in the control group ».

Sounds impressive when you put it like that. It sounds very convincing. As long as you don’t review it, of course.

So, let’s review it.

3-4 Detailed data on the patients

But first of all let’s have a close look at the patients

The chart provided in the article being all mixed up and not very clear, we have redrawn it, more neatly, classifying the patients in a better way :

This is better, but we are still going to improve it, don’t worry

The black line separates the groups, the control group at the top, and the group with chloroquine at the bottom. We simply sorted out the asymptomatic, the URTI with cough and the LRTI with pneumonia.

First remark: the groups are not homogeneous :

A/ neither with regard to age:

37 years old versus 51 years old – with no evidence of which group has an advantage in terms of the rate of viral load decline. (source)

Age and gender distribution in the groups

B/ nor with regard to clinical status:

Note that 25% of the patients in the control group are asymptomatic.

Second remark: many data points are missing (the black boxes). Indeed, in the centres that are not in Marseille, and which include the majority of the control group, the tests are not done on a daily basis.

And we even observe that 5 patients in the control group were not tested on the 6th day, which is astounding, since it is the day on which the primary endpoint is assessed! The investigators arbitrarily counted these 5 patients as still positive. This is not at all rigorous.

A sixth patient (from the chloroquine group this time) was also not tested in Marseille on days 5 and 6 (light grey). As he was negative on days 2, 3 and 4, one can reasonably assume that he was still negative. But again, this is not rigorous: why was he not tested at the IHU, since, not being lost to follow-up, he was obviously still in hospital?
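How much the control-group figure depends on that arbitrary choice can be sketched as follows. The counts come from the text (16 control patients, 5 of them untested on day 6, 12.5% reported negative, i.e. 2 patients); the three scenarios and their labels are our own illustration, not an analysis performed by the authors.

```python
control_total = 16
untested_day6 = 5
negative_tested = 2   # 12.5% of 16, as reported by the authors

tested = control_total - untested_day6

scenarios = {
    # Authors' choice: every untested patient is counted as still positive.
    "untested counted as positive": negative_tested / control_total,
    # Untested patients are left out of the denominator.
    "untested excluded": negative_tested / tested,
    # Opposite extreme: every untested patient is counted as negative.
    "untested counted as negative": (negative_tested + untested_day6) / control_total,
}

for label, rate in scenarios.items():
    print(f"{label}: {rate:.1%} negative on day 6")
```

Depending on the convention, the control-group clearance rate ranges from 12.5% to about 44%, which shows how sensitive the headline comparison is to the missing tests.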

3-5 The problem of how reliable the tests are

Didier Raoult’s choice was to carry out a clinical trial not on the question « Will chloroquine help to treat and protect serious cases in order to save lives » but on the question « Does chloroquine help the organism get rid of the virus at the back of the nose more quickly ». In fact, he removed all severe deteriorations from the study.

So his test basically measures how fast the virus is cleared from the nose (the term « cure » he uses is a bit hasty, there’s nothing to say that there isn’t some virus left elsewhere in the body – but never mind) – and he does this by testing people who are sick.

These are the famous tests that Raoult is always talking about. But, as usual, Raoult doesn’t say everything, far from it.

To put it simply, the test has 2 main steps: 1/ sampling, 2/ PCR analysis.

The sampling is not easy to do: the swab has to go to the far end of the nasopharynx to soak up the secretions, and it is uncomfortable. This is a major cause for failure – even when it is done by professionals – because if it’s not done properly, you won’t be able to take enough virus, and the result will be falsely negative.

PCR (Polymerase Chain Reaction) is a laboratory technique for amplifying nucleic acids. Let’s schematize. We take the virus sample from the swab and put it into the PCR machine. A primer, a short piece of genetic code that mirrors part of the virus’s RNA (different ones can be used), is introduced; the primer sticks to the corresponding code of the virus; this copy is then duplicated, and we recover twice the initial amount of genetic code Q (PCR 1). We run the machine again: at PCR 2, we have 4 times the quantity Q. At PCR 3, we have 8 times the quantity Q, etc. The growth is exponential. There comes a moment when there is so much viral material that we are able to detect it (by fluorescence), because it exceeds the mass M, which is the detection limit (remember, this is a rough sketch for the general public).

The measure of viral load shown in the charts is not « a weight of virus » but the number of times the PCR must be run to reach the « M » detection limit. This number of cycles is called the Ct (« cycle threshold ») and represents the point at which the signal is significantly greater than the background noise, i.e. the minimum number of cycles needed for the amplified virus RNA to be detectable. A higher Ct therefore means less virus in the original sample. (PCR enthusiasts may wish to refer to the dedicated article on Wikipedia).
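The doubling scheme described above can be sketched in a few lines. This is the same rough, general-public model (perfect doubling at every cycle, a fixed detection level M); the quantities are arbitrary units chosen for illustration, not real assay values.

```python
def cycles_to_detection(initial_quantity: float, detection_level: float) -> int:
    """Count the PCR cycles needed for a quantity that doubles at each
    cycle to reach the detection level M (idealised model)."""
    if initial_quantity <= 0:
        raise ValueError("no genetic material to amplify")
    quantity, cycles = initial_quantity, 0
    while quantity < detection_level:
        quantity *= 2
        cycles += 1
    return cycles

M = 2.0 ** 40  # hypothetical detection level, arbitrary units

high_load = cycles_to_detection(2.0 ** 20, M)  # a lot of virus on the swab
low_load = cycles_to_detection(2.0 ** 8, M)    # 2^12 = 4,096 times less virus

print("cycles needed, high load:", high_load)
print("cycles needed, low load: ", low_load)
```

In this model the sample with 4,096 (2^12) times less material needs 12 more cycles to become detectable: the cycle count goes up as the viral load goes down.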

Then you still have to define a threshold at which to stop the PCR and declare that, since no RNA has been detected by that point, no RNA was present in the sample. Typically, this threshold is defined using positive and negative control samples, and depends on the PCR primers selected. This work is rarely reported in the methods section of a publication, and the scientific community most often trusts it as long as the results appear consistent. However, when the results seem inconsistent, for example when the virus disappears and then reappears as if by magic, there is reason to doubt the quality of this calibration and the choice of the detection threshold. Whatever the threshold chosen, there will always be false negatives and false positives: setting it too low will lead to a sample being too easily considered negative, and setting it too high will lead to a sample being too easily considered positive.

In the analysis of Raoult’s trial, this threshold is only specified once, in very small print below the chart with the list of patients (see above). It is 35:

Let us underline, for the follow-up, that this important information is not clearly included in the dedicated part of the article which talks about PCR (in one line…):

nor here:

Hence, Raoult’s team decided that if, after running the PCR 35 times, there was still no RNA detection, then the patient no longer had any virus in the back of his nose. But this threshold is totally arbitrary: they could have picked 32 or 38.

However, this method also has a problem of reliability, which adds to the problem of sampling. And it is not anecdotal – although there is no information on the reliability of the PCR used in Marseille. But we can cite this edifying scientific article (source ; pdf) from March 4th (which we have translated in this post) :

« In this study, we have developed and compared the performance of three new real-time RT-PCR assays targeting the SARS-CoV-2 RNA polymerase (RdRp)/helicase (Hel) genes […] with the RdRp-P2 assay which is used in more than 30 European laboratories. Of the three new assays, the COVID-19-RdRp/Hel assay had the lowest in vitro detection limit […]. Of the 273 specimens from 15 patients with laboratory-confirmed COVID-19 in Hong Kong, 77 (28.2%) were positive with both the COVID-19-RdRp/Hel assay and the RdRp-P2 assay. The COVID-19-RdRp/Hel assay was positive for an additional 42 RdRp-P2-negative specimens [119/273 (43.6%) versus 77/273 (28.2%), P<0.001] ».

To sum up, when 273 samples that you are sure contain the virus are tested using the widely used « RdRp-P2 » PCR, the result is positive in only 28% of the cases: that’s 72% false negatives! Of course, laboratories use different methods to greatly reduce inaccuracy…
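The percentages quoted from the Hong Kong study follow directly from the counts in the abstract. A quick check, with variable names of our own:

```python
specimens = 273
positive_reference = 77    # positive with the widely used reference assay
additional_positive = 42   # extra positives found only by the new assay

positive_new_assay = positive_reference + additional_positive

rate_reference = positive_reference / specimens
rate_new_assay = positive_new_assay / specimens

print(f"Reference assay: {positive_reference}/{specimens} = {rate_reference:.1%}")
print(f"New assay:       {positive_new_assay}/{specimens} = {rate_new_assay:.1%}")
print(f"Not detected by the reference assay: {1 - rate_reference:.1%}")
```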

Let us also mention this article from the American CDC (source ; pdf) :

« Negative results are not sufficient to rule out Covid-19 infection and should not be used as the sole basis for treatment or other decisions regarding patient care. A false negative result may occur if a specimen is improperly collected, transported, or handled. False negative results may also occur if amplification inhibitors are present in the specimen or if insufficient organisms are present in the specimen. […] Positive and negative predictive values are highly dependent on prevalence. False negative test results are more likely when the prevalence of the disease is high. » [CDC, March 30, 2020]
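The CDC’s remark on prevalence follows from Bayes’ rule, and a minimal sketch makes it concrete. The sensitivity and specificity values below are hypothetical placeholders chosen for illustration, not measured characteristics of any particular test.

```python
def negative_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """Probability that a person who tests negative is truly uninfected."""
    true_negatives = specificity * (1 - prevalence)
    false_negatives = (1 - sensitivity) * prevalence
    return true_negatives / (true_negatives + false_negatives)

SENSITIVITY = 0.70   # hypothetical: 30% of infected people test negative
SPECIFICITY = 0.99   # hypothetical

for prevalence in (0.01, 0.10, 0.50, 0.90):
    npv = negative_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.0%}: a negative result is wrong "
          f"{1 - npv:.1%} of the time")
```

With the same test, a negative result becomes less and less trustworthy as the share of truly infected people among those tested rises, which is the situation in a trial where every patient was included because of a positive test.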

But let’s not be too harsh about the reliability of PCR testing; it really is a fantastic tool. Just think: we discovered the virus less than 3 months ago; we sequenced it in 2 weeks, and were able to have PCR tests less than a month later, and in France thousands of them can be carried out per day. Of course, this was not enough at the beginning of the epidemic, one more mistake by our country. But if we were in 1960, we would still be wondering what this weird virus was made of…

Lastly, we refer the reader to this post quoting Professor Vincent Thibault, head of the virology laboratory at Rennes University Hospital, who explains on France Bleu radio station that coronavirus screening tests are only 70% reliable. « In the case of this young girl [16 years old, who died from Covid-19] it seems that two initial samples were negative. She was sent home and her condition deteriorated rapidly with a result that finally turned out positive. This sad case illustrates the problem we face today. » And, last but not least, he adds:

« Today we do nasal swabs, but we know that the virus is not in the nose at every stage of the disease (…) A test can therefore be negative although the patient is symptomatic and indeed infected. This is because the virus is located much deeper, in the lungs for example. »

This sentence totally invalidates the very core of the Gautret / Raoult protocol – whose acceptance by the authorities is all the more surprising.

Indeed, the nasal viral load of the most severely affected patients could very well decrease because the virus is in fact migrating into the lungs (as was perhaps the case for the patient who was judged and presented as « negative »). It is therefore baffling that the protocol did not include the clinical status of the patients on day 6, to make sure they were cured and not on their way to the intensive care unit… Another serious scientific blunder.

Does anyone know if Didier Raoult ever mentioned this « minor » problem of non-reliability of the tests?

3-6 Analysis of the hydroxychloroquine results

This problem appears very clearly in the Raoult paper:

Look at the lines framed in red: some patients are positive one day, negative the next day, and positive again the day after !

Let’s consider the first one, patient no. 4 on days 0, 1 and 2: « 24 / NEG / 33 ». Since NEG means 35, this means that this patient’s viral load would have been divided by about 2,000 (2^11) and then, the next day, multiplied by about 4 (2^2)…

The measurements framed in blue are very surprising. For instance, for patient 21 (2nd blue-framed line) the recorded values are « 16 / 34 / 24 », which means that the viral load was divided by about 250,000 (2^18) within one day and then multiplied by about 1,000 (2^10) the following day.
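These orders of magnitude can be recomputed with a small sketch. It relies on the idealised assumption that the quantity doubles at each cycle, so that a difference of n cycles corresponds to a factor of 2^n, and it reads NEG as a Ct of 35 as in the text; the function name is ours.

```python
NEG = 35  # detection threshold used by the team; NEG is read as Ct = 35

def fold_change(ct_before: int, ct_after: int) -> float:
    """Factor by which the viral load changes between two measurements,
    assuming perfect doubling per cycle (> 1 means the load increased)."""
    return 2.0 ** (ct_before - ct_after)

patient_4 = [24, NEG, 33]   # days 0, 1 and 2
patient_21 = [16, 34, 24]   # three consecutive days

for name, cts in (("patient 4", patient_4), ("patient 21", patient_21)):
    for before, after in zip(cts, cts[1:]):
        change = fold_change(before, after)
        if change < 1:
            print(f"{name}: Ct {before} -> {after}, load divided by {1 / change:,.0f}")
        else:
            print(f"{name}: Ct {before} -> {after}, load multiplied by {change:,.0f}")
```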

In short, the measurements do not appear very reliable. And again, this is blindingly obvious when you study the chart of the results published by the team:

Note: A reviewer who has repeated the calculations reports that he finds P-values that are twice as high (source). This should be checked.

So there’s one scientist who wrote without batting an eyelid – and 17 others are supposed to have reviewed before signing – that the control group had :

  • 1 negative out of 16 on day 3;
  • 4 negatives out of 16 on day 4;
  • but 3 negatives out of 16 on day 5;
  • and thus only 2 negatives out of 16 on day 6 (the day of the result of this study).
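The four counts above translate into the following percentages of control patients still positive (a simple conversion of the published figures; the variable names are ours):

```python
control_size = 16
negatives_by_day = {3: 1, 4: 4, 5: 3, 6: 2}  # day -> negatives reported

# Share of the control group still counted as positive on each day.
positive_pct = {
    day: 100 * (control_size - negatives) / control_size
    for day, negatives in negatives_by_day.items()
}

for day, pct in positive_pct.items():
    print(f"day {day}: {pct:.2f}% still positive")
```

The share of positives falls to 75% on day 4, then climbs back to 81.25% on day 5 and 87.5% on day 6, as if patients who had cleared the virus were becoming carriers again.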

And this is clearly visible on their graph, which was widely circulated when the study was published :

Everyone focused on the difference between the curves, without taking any notice of the fact that the percentage of positives increased 3 times in the control group, which is obviously ridiculous!

Here is another problem. As we have seen, many measurements are missing for the control group (the famous « NDs » on the black background).

All scientists know that the results must be analysed in light of the statistical uncertainty of the result measurement.

Let’s take an example. Let’s imagine that we want to determine the average age of the Presidents of the French Republic on the day of their first election. With only one value, for Macron, the average is 39. With 2 values (Hollande, 57) the average is 48. With Sarkozy (52) the average is 49. When including the 8 presidents of the Fifth Republic, we reach an average of 56. With all the others, we reach the real value of 60. It is therefore obvious that the robustness of the 39-year average corresponding to the Macron measure alone is much lower than that of the 56-year average corresponding to 8 values. The more measurements there are, the closer we get to the correct value.
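The first steps of this example can be reproduced with a short sketch, using only the three ages quoted above (the later averages are not recomputed here because the other ages are not listed):

```python
ages_at_first_election = [
    ("Macron", 39),
    ("Hollande", 57),
    ("Sarkozy", 52),
]

running_means = []
total = 0
for count, (president, age) in enumerate(ages_at_first_election, start=1):
    total += age
    running_means.append(total / count)
    print(f"after {count} value(s) (adding {president}): mean = {total / count:.1f}")
```

Each new value moves the average less than the previous one did, which is why an estimate built on a single measurement is so fragile.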

To reflect this, scientists use error bars which are graphical representations of the variability of data and are used on graphs to indicate the error, or uncertainty in a reported measurement. They give a general idea of the accuracy of the measurement, or conversely, how far away from the reported value the true value is. In most cases, error bars represent a standard deviation of uncertainty, a standard error, or a certain confidence interval (e.g. a 95% confidence interval) (source: Wikipedia). See the following 2 examples for a better understanding:

Here the measurements displayed as histograms are accompanied by a confidence interval (in red) indicating the amplitude where the actual value lies with a probability of, let’s say, 95%. The height depends on the size of the sample on which the measurement is made (it would be very large for Macron’s 39 years, but much smaller for the 56 years of the 8 measurements).

Similarly, the above graph shows the theoretical variability of the curve measurements.

To make a long story short, a PubPeer contributor (source) presented the uncertainty that should have been included in Raoult’s paper as follows:

It appears clearly that the sample is so small that the uncertainty bars are overlapping across the two curves, which means that they have absolutely no statistical robustness, and that the differences between the curves may be as much the result of chance as of treatment.

Problem: The sample is far too small to enable any conclusion to be drawn.
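To give an idea of how quickly uncertainty grows when the sample shrinks, here is a rough sketch using the Wilson score interval for a proportion. The choice of this interval is our assumption (we do not know which method the PubPeer contributor used), and the counts below are purely hypothetical, not the trial's data.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

# Hypothetical counts: the same observed proportion (60%) at two sample sizes.
small = wilson_interval(6, 10)
large = wilson_interval(600, 1000)

print("n=10:   %.2f to %.2f" % small)
print("n=1000: %.2f to %.2f" % large)
```

With 10 patients the interval spans about half of the whole scale; with 1,000 it narrows to a few percentage points around the same observed value.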

And that’s not all !

3-7 Results for Hydroxychloroquine and Azithromycin

We’ll come back to this in the next post, but it should be noted that Raoult also tested a hydroxy-chloroquine + azithromycin (antibiotic) combination, with the following results:

By day 5, all patients treated with Raoult’s cocktail are cured!

We have to admit that, presented this way, the effectiveness of the treatment seems very convincing indeed.

And there is more to it:

This has to be read carefully:

1/ « hydroxychloroquine is efficient in clearing viral nasopharyngeal carriage (…) in only three to six days, in most patients. »

It is quite unbelievable to draw such definitive conclusions on such a small sample in a trial full of biases;

2/ This difference with « control group starts even as early as day 3 post-inclusion »

This is, indeed, quite remarkable, considering the fact that this control group is very poorly tested on a daily basis:

3/ These results are « of great importance because a recent paper has shown that the mean duration of viral shedding (…) in China was 20 days (even 37 days for the longest duration) ».

First of all, it is odd that a small sample treated in Marseilles be compared in a clinical trial with a population hospitalised in China 2 months earlier – it is not even certain, for instance, that the strain of virus is the same…

This is yet another proof of the lack of seriousness attached to such an important study. One reviewer (source) pointed out that the previous graph is not identical to the one presented by Raoult on March 16 th, during his presentation (in his video, or here, archive, source here, archive):

However, it is certain that if the treatment were able to suppress the viral carriage « in just 3 to 6 days », when the average duration « in China was 20 days », it would be spectacular.

If it were true…

3-8 Viral carriage

This section, although it may seem a bit technical, is crucial to the understanding of a very serious bias in the study.

3-8-1 Overview

The purpose of this trial is therefore to compare the speed with which the virus can be evacuated from the back of the nose – also called viral carriage.

This study, published by the Lancet in February, traced the evolution of the viral load in two Chinese patients, in their throat (in red) and sputum (in blue):

There clearly is a measurement abnormality on day 7, on the right – the red line of the patient on the left (patient 1) is more easily readable.

We see that the viral load starts increasing in the nose between the 2nd and the 4th day, reaches a peak between day 5 and day 7, and for those 2 patients disappears on day 9 – which doesn’t necessarily mean that those patients were healed at that moment.

Thus, as we want to know about the speed of viral shedding among patients, we must take into account the moment at which we start monitoring the patient: the shedding duration will be much shorter if you start monitoring the patient on the 7th day after the onset of the symptoms than if you start on the 2nd day…

This is why the analysis specifies that, on average, the patients join the trial 4 days after the onset of symptoms.

This means that the patients are monitored, at the day-6 assessment of the trial, 10 days after the onset of symptoms (4 days before inclusion plus 6 days in the trial).
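The two points above come down to simple arithmetic, sketched here. The function name is ours, and the clearance day of 9 is taken from the two Lancet patients described earlier; this is an illustration, not a model of the disease.

```python
def observed_shedding_days(clearance_day, monitoring_start_day):
    """Days of shedding actually observed when monitoring starts late.

    Both arguments are counted in days since the onset of symptoms.
    """
    return max(0, clearance_day - monitoring_start_day)

# Same patient, same clearance on day 9, two different monitoring start days.
early_start = observed_shedding_days(clearance_day=9, monitoring_start_day=2)
late_start = observed_shedding_days(clearance_day=9, monitoring_start_day=7)

# Figures quoted in the article for the Marseilles trial.
mean_delay_before_inclusion = 4   # days from symptom onset to inclusion
assessment_day = 6                # D6 of the trial
days_since_onset_at_d6 = mean_delay_before_inclusion + assessment_day

print(early_start, late_start, days_since_onset_at_d6)
```

The same patient appears to shed the virus for 7 days or for 2 days depending only on when the monitoring begins.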

This would be half of the duration of the viral shedding in China; the amount of negative tests would then be a great result for the treatment.

« WOULD BE »…

3-8-2 « The average duration of viral carriage […] in China was 20 days »

This information about the average duration of 20 days is a key point in the argumentation of the Raoult article. The reference is to be found in this study (there):

This article was published on March 11th (and is therefore quoted in the Raoult article on March 16th) and indeed says:

Let’s first observe that, although the duration of viral shedding for this group of hundreds of hospitalized Chinese patients is indeed 20 days (between 17 and 24, to be precise), ranging from 8 to 37 days, the article gives the median duration (it is longer for half of the patients, shorter for the other half) and not the average duration. These are different figures.

Problem: The Raoult team conflates average durations and median durations.
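To see why the two figures cannot be swapped, here is a small illustration with invented durations (these are not the Chinese data): a single long-shedding patient pulls the average up while the median barely moves.

```python
from statistics import mean, median

# Purely illustrative shedding durations in days, with one long outlier.
durations = [5, 6, 7, 8, 9, 10, 40]

median_duration = median(durations)
average_duration = mean(durations)

print(f"median = {median_duration}, average = {average_duration:.1f}")
```

In this invented sample the median is 8 days while the average is above 12 days.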

However, the study calls for caution regarding the robustness of this data:

« the estimated duration of viral shedding is limited by the frequency of respiratory specimen collection, lack of quantitative viral RNA detection, and relatively low positive rate of SARS-CoV-2 RNA detection in throat-swabs »

But actually, that’s not the point. Because in order to average viral shedding, you need to have the measurements history of the patients who are no longer carrying the virus. And therefore, we need to define when these patients are no longer carrying the virus. We therefore need to look into the methodology of the study – which actually refers to another study (available here):

The Chinese scientists target the envelope gene of the virus, over 45 cycles.

So they have a very thorough search for traces of the virus. Which is not of the same nature or level as the one carried out in Marseilles…

Therefore, one cannot compare average durations and median durations on such different bases!

3-8-3 The Marseilles case

As we’ve seen, the article refers to another article on the methodology of the PCR technique that has been used (it’s here)

The article indicates the RNA primer of the virus:

We’ve also checked: the team from Marseilles has simply shifted the Chinese primer:

The PCR will replicate the part of the genetic code of the virus displayed here, between the primers (arrows).

Unfortunately, the article doesn’t mention the number of cycles, referring to another article (here):

We finally discover that, it seems, the practice should be to do 45 cycles:

We’ll see that this issue is not a minor one.

3-8-4 Another example to make it clear

This excellent article, SARS-CoV-2 Viral Load in Upper Respiratory Specimens of Infected Patients (source), dated February 19th, gives us interesting information, drawn from viral load measurements in about 15 Chinese patients.

Here is the viral carriage (the higher the number, the less virus there is, the reverse scale is therefore normal). The patients in intensive care are indicated in red, in blue are those whose state is moderate.

The curve of the average viral load according to the cycle threshold (Ct) is shown in blue. This means that a viral load is indeed observed up to 18-21 days, but this is because the detection threshold is 40 Ct.

Thus, with a value of 35 Ct (which is not fully comparable to that of Raoult, but should be quite close), an average duration of 8-9 days would probably have been necessary for the virus to disappear. That is to say a little less than in the Marseilles sample treated with chloroquine…
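The effect of the detection threshold can be sketched as follows. The Ct readings are hypothetical (they only mimic the general shape described above), and the rule that a sample counts as positive when its Ct is below the threshold is our simplifying assumption.

```python
# Hypothetical Ct readings for one patient, keyed by day since symptom onset.
# A higher Ct means less virus in the sample.
ct_by_day = {2: 22, 4: 25, 6: 30, 8: 34, 10: 36, 14: 38, 18: 39, 21: 41}

def last_positive_day(readings, threshold):
    """Last sampled day on which the Ct is below the threshold (virus detected)."""
    positive_days = [day for day, ct in readings.items() if ct < threshold]
    return max(positive_days) if positive_days else None

for threshold in (35, 40, 45):
    print(f"threshold {threshold} Ct: last positive on day "
          f"{last_positive_day(ct_by_day, threshold)}")
```

The same patient, with the same samples, is declared negative after day 8 at 35 Ct but stays positive until day 18 at 40 Ct and until day 21 at 45 Ct.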

3-8-5 Conclusion

In view of the methodological weakness of the article, it would help if the investigators could:

  • specify the primer that has been used;
  • confirm the Ct value of 35;
  • indicate whether this Ct value is customary in Marseilles – and if not explain why;
  • and explain how they can compare their results with those of the Chinese study with a Ct value of 45.

Indeed, it is more than likely that a Ct of 45 would have led to 100% of the patients treated with chloroquine being, like the control group, positive on day 6, invalidating the result.

A Ct value of 45 in Marseilles would probably have led to this:

But we have to admit that this would have made it more difficult to sell to the media.

Let’s hope the IHU will give some explanation about this (was the Ct used in the trial the usual practice? why not use a Ct value of 45 as in China, since the results were compared with those of the Chinese study?) in order to dispel the doubt.

Last remark: it was noted that patients in the control group that were not treated in Marseilles had no Ct value but only a POSitive or NEGative status. It can therefore be assumed that the samples were not analysed in Marseilles but in the laboratories of the various hospitals where they were. Therefore, there is no way of knowing whether they used the same primer and a Ct value of 35 for their PCR.

And there is also nothing to indicate that all of them detect the same level of virus. If their analysis was more sensitive, they would detect the virus for significantly longer, which would totally distort the comparison with the group treated with chloroquine (and would then account for the discrepancy) – and would therefore make the clinical trial totally useless.

3-9 Let’s put some seriousness in this trial…

We are therefore dealing with a trial comparing the speed of recovery of 2 groups, this speed depending on how long each patient has already had the disease.

The article indicates that the control group, comprising 16 people with a mean time of 3.9 days, is comparable to the treated group:

But this is a lie.

Patients 1, 4, 2 and 3 are asymptomatic; no date of first symptoms is available for patients 13 and 7! Same thing for the 2 in the group treated with chloroquine.

3.9 is a mean value for the 10 other patients in the control group. Not for the 16 patients obviously.
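A short sketch of the problem: the individual delays below are invented (chosen only so that the 10 known values average 3.9), but the structure matches what is described above, with 6 of the 16 control patients having no usable date.

```python
from statistics import mean

# Hypothetical delays (days from symptom onset to inclusion) for a 16-patient
# control group. None marks a patient with no usable date: 4 asymptomatic and
# 2 undated. The numeric values are invented for illustration.
delays = [1, 2, 2, 3, 3, 4, 4, 5, 7, 8] + [None] * 6

known = [d for d in delays if d is not None]
mean_known = mean(known)
coverage = len(known) / len(delays)

print(f"mean over {len(known)} of {len(delays)} patients: {mean_known}")
print(f"share of the group actually covered: {coverage:.0%}")
```

The mean describes only the patients who have a date; it says nothing about the other 6, so it cannot be presented as a characteristic of the whole group.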

Indeed, it is not scientifically acceptable to include in this trial (on speed of recovery) people for whom the duration of the disease so far is not known!

Similarly, it is not scientifically acceptable to include 6 patients who have not been tested on day 6 in the evaluation of the trial on day 6 ! As Didier Raoult likes to say, « It’s delirious! »

So here is what this little trial would look like once cleared of these abnormalities:

Unfortunately, there are only 5 people left in the control group (but we will see that Didier Raoult is not really interested in control groups) as compared with 18 in the treated group.

We can, however, calculate the average day of passage in negative, for the negatives: 8th day for URTIs, 9th day for LRTIs. This is exactly what was shown in the previous Chinese study for a Ct of 35…

Here is, therefore, what remains of this trial – when excluding the numerous above-mentioned biases

When considering the last column alone, the treatment seems to be working very well.

But if you consider all of it:

  • for URTIs (bad colds): as the HCQ treatment is 4 days old, and measured on day 6, we measure on day 10 of symptoms; the average is 8 days of recovery, so it is not surprising to have 60% people healed. The subgroup with the antibiotic is 34 years old on average, so these people have probably eliminated the virus more quickly. Remains the control group with no negative; but the « group » consists of 3 people, older than the others, with only 2 days of clearance; with an average of 8 days of clearance, it is not very surprising to have no negative on the 6th day…
  • for LRTI (pneumonia): the HCQ+AZM group is once again the youngest of the groups; it has 5 days of seniority, for an average clearance time of 9 days: it is normal to have a good result in this example. The control group only has 2 patients; the one with 2 days of seniority would have just had time to go into negative; the second, starting on day 10, should have gone into negative, starting with a seniority higher than the average, but this was not the case; he may have to go to the ICU if his condition worsens.

Let’s mention one more thing: the concentration of chloroquine in the body (which is given in the chart) does not seem to have a very strong influence on the speed of shedding.

3-10 But in fact, what was Raoult’s objective on March 9?

In this video dated March 9 th, Raoult’s objective was as follows:

« Our research project on hydroxy-chloroquine has just been accepted, and we’re implementing it with two objectives :

  • Point one is to improve clinical management, that is specifically for patients with rather severe symptoms,
  • and on the other hand, and that is our second objective, it is to see if we can quickly, because that’s what the Chinese said, reduce viral carriage, i.e. when the natural viral carriage is apparently around 12 days, Mr. Zhong reported that under chloroquine the viral carriage was reduced to 4 days.

And so we do hope to confirm these data because then it will allow, especially for those who carry considerable amounts of virus, to decrease this viral load, and the risk of secondary contamination. « [Didier Raoult, March 9, 2020]

Thus, on March 9th, according to Raoult, the average duration of carriage in China was 12 days, but on the 16th, in his paper, he cites, without question, a Chinese study that mentions 20 days. His goal on March 9th was to bring it down to 4 days, but on the 16th he welcomes a treatment that indicates an average viral carriage of probably 9 to 11 days. Moreover, his article does not discuss the improvement of clinical management – but with 1 death and 3 transfers to intensive care out of 26, it is true that there is nothing to boast about.

This trial is therefore a failure in relation to its initial objectives, which he refrains from saying.

3-11 One last big problem

Back to the patients

The patients are therefore all hospitalized. We also know that they are classified according to their clinical state:

But in fact, when is the classification established? On the first day, as it is likely, or on the 6th?

1/ If it’s on the 6th day, it means that some patients’ status will differ depending on whether it is assessed on the 6th day or on the 14th.

And why isn’t there anywhere a « cured » status? Although there was at least one, the patient, reported as « lost from sight »:

But why on earth is being cured considered as being lost from sight?

Does this mean that, when a patient has a non detected viral load several days in a row, and no more symptoms, the investigators still go on treating them, until the 14th day? Why that?

2/ If this happens on the first day, then it would make more sense. But then, what happens when the patient’s condition worsens? The team would be lucky if, among the 22 URTI patients included in a period of time of 0 to 10 days after the onset of symptoms, none worsened into a pneumonia (LRTI)…

We have shown here the evolution of the viral load of the URTI patients (common symptoms of a bad cold), measured in Ct (from 15, meaning many viral particles, to 35, meaning few viral particles), as a function of the number of days after the onset of symptoms:

The thick green line represents the only URTI patient from the control group; the thick blue lines represent the 2 patients who received the HCQ + antibiotics treatment; all the others received HCQ.

It is easy to see that some measurements are quite surprising, with zigzag patterns. We do see that both patients receiving HCQ had a sharp decrease of the viral load (they are, however, young: 20 and 48 years old); we must also note that they started with a small viral load.

Let’s now focus on the LRTI (pneumonia) patients:

They are all treated with HCQ + antibiotics, except for the two thick lines, pink and red, who received HCQ only.

We observe that the pneumonia patients (LRTI) seem to have a faster decrease in the viral load at the back of their nose than those who have common cold symptoms (URTI). But as we’ve seen, the question is: is it because they’re healing, or is it because the virus tends to concentrate elsewhere in the body?

If we had some information about the clinical condition of each patient it would be possible to answer this question, but the investigators, strangely enough, decided not to communicate any of this information on the 6th day but to do so only on the 14th.

Therefore, it is impossible to know whether the disappearance of the viral load always implies that the patient has healed, as the investigators clearly imply. Let’s use the example of the two URTI patients receiving antibiotics (blue lines): did they really heal quickly, or were they actually evolving towards a pneumonia? How are they? The investigators decided to give them antibiotics, but not on a random basis: might they have chosen the most severe URTI cases?

Once more, let us observe the evolution of the patients:

Patient 34, for example: he’s 20, he suffers from rhinitis; after 3 days in the hospital and 5 days with symptoms, he no longer has a detectable viral load. But he stays at the hospital for 4 days, and he is supposed to stay 8 more days. Healed? There, he only sees a nurse once a day, she takes a Covid sample from his nose, and that’s it.

All these patients who are seemingly healed, look highly motivated in their desire to help science by staying at the hospital. Or might not it be that they are staying because they are still ill? Unfortunately, we know nothing of that.

In any case, this diagram from an article in Nature (source) provides a reason to remain cautious:

Indeed, we see here that the viral load (here measured in RNA copies per mL, on a logarithmic scale) of 9 patients decreases much faster in the nose (yellow lines) than in the sputum (mucus from the lower airways, orange lines), or in the stools (grey lines)… The virus often stops being detectable in the nose within 7 to 10 days – and there is no reason to believe that the detection limit in Marseilles, Ct 35, is better than that of this analysis, which is of high quality.

Do not hesitate to report other methodological problems here.

IV. Was this trial legal ?

  • 4-1 European legal requirements in the case of a clinical trial
  • 4-2 The authorisations concerning the Gautret/Raoult trial protocol
  • 4-3 Raoult has never done randomized trials
  • 4-4 Questionable official statements
  • 4-5 But what was the treatment that was tested in this trial?
  • 4-6 The question of the dates
  • 4-7 Thanks to Didier Raoult, French research is shining brightly

4-1 European legal requirements in the case of a clinical trial

Health tragedies following insufficient clinical trials have led to a strict regulation of clinical trials.

Here is a little regulatory reminder, from the European regulation 536/2014 dated April 16 th, 2014 (available here).

Article 2 – General principle

A clinical trial can only be conducted according to the following directives:

« In a clinical trial the rights, safety, dignity and well-being of subjects should be protected and the data generated should be reliable and robust. The interests of the subjects should always take priority over all other interests.« 

Article 4 – Prior authorization

The ethical review shall be performed by an ethics committee in accordance with the law of the Member State concerned. […]

Article 5 – Submission of an application

1. In order to obtain an authorisation, the sponsor shall submit an application dossier to the intended Member States concerned through the portal referred to in Article 80 (the ‘EU portal’). […]

Article 47 – Compliance with the protocol and good clinical practice

The sponsor of a clinical trial and the investigator shall ensure that the clinical trial is conducted in accordance with the protocol and with the principles of good clinical practice.

The sponsor must therefore submit its protocol to obtain authorization (in France from the Agence Nationale de la Sécurité du Médicament) and the approval of an ethics committee, and then obviously comply with the said protocol during the trial.

4-2 Authorizations of the Gautret/Raoult trial protocol

The results of the test read as follows:

It has of course obtained authorization from an Ethics Committee (CPP Île de France) and has been registered on the European Union’s portal, EudraCT, in accordance with the regulations.

4-3 Raoult has never done a randomized trial.

As a reminder, Raoult indicated that he has never done a randomized trial, because he considers this… useless!

« And so that is why we decided to see here, with you, a certain number of points to consider. The first one, it’s in the wrong order, is against randomized trials. I have never done a randomized trial, but I can assure you that there are hundreds of thousands of people who are being treated with treatments that I have devised. I’ve never done a randomized trial because all I was interested in was the diseases people were dying from. Anyway, it’s easy to see. The first ones I worked on had a 65 % mortality rate, so it’s pretty easy and fast to find out if it works. The second one is that it’s true that in our world of microbes… to see if there are no more microbes, that’s quite easy too. So there’s no point in giving a placebo to treat sepsis. It’s a crazy story, it doesn’t make any sense. You just have to look to see if the germs are gone and if people are healed. The thing with the randomized stuff, maybe it works or it’s useful when you take 100,000 people who have had a myocardial infarction to see, but to put that in infectious diseases, if you want, it doesn’t make sense. It doesn’t make sense, so since it doesn’t make sense… Maybe we’ll have to reverse it all. Since it doesn’t make sense… It doesn’t make any sense at all. It’s silly. The only good thing about randomized trials, but now we know about it so it’s not very interesting, is that it allows you to evaluate the placebo effect.« [Didier Raoult, February 13, 2020]

And indeed, when we search the European clinical trials database, we find only one trial whose sponsor was the IHU: the one dated March 2020 (source ; see more extensively here)

The IHU seems to have no experience in conducting clinical trials on difficult issues – as the aftermath clearly shows.

4-4 Questionable official statements

So let’s look at this submission of application, which was filed by the IHU on March 10th:

We get to know more about the purpose of the trial:

« Reduce the period of viral carriage, and thus the contagion. »

The main goal is therefore not to measure a clinical improvement in infected people, but to reduce the duration of viral carriage at the back of the nose – which leads the IHU to automatically conclude that there is a decrease in contagiousness, something that remains to be proven (the virus might still be in the throat or lungs).

As reporting dates for the main assessment, they declared days 1, 4, 7 and 14, without it being known whether day 1 corresponds to D0 or D1 of the published article (which measures at D6) :

The 14-day trial is scheduled to end in 12 months’ time.

As for the participants, on March 10 th, they indicated 25 people (including 5 teenagers):

Yet this is a trial planned for 24 treated patients and 24 controls.

Here is the actual population distribution in the trial compared to this official statement:

So one can see that the trial departs considerably from it, which is very surprising, because that statement was made on March 10th, and the results on Day 6 were reported on March 16th… But on which day were the groups constituted by the IHU?

4-5 But what was the treatment that was tested in this trial?

As can be seen, the sponsor seems quite at a loss to define his trial:

He says he has a single-armed protocol… but still has a control group!

An « arm », in a clinical trial, is a group of patients, treated or untreated (source). So there are 2 arms here – and therefore the declaration on the site raises questions, since only the treated group (« 25 ») was declared – and that was done incorrectly.

But there is a much more serious problem: what was the exact treatment that was tested ?

The statement is very clear:

« Treatment of SARS-Cov-2 Coronavirus Respiratory Infections with Hydroxy-chloroquine. »

and the title of the trial in non-technical language is: « Hydroxychloroquine as a treatment for coronavirus disease Covid-19 ».

Clearly, the IHU did declare a test for hydroxy-chloroquine, in the form of Plaquenil:

Moreover, in the slides announcing the results presented by Didier Raoult on March 16th, the trial was added at the last minute, in the bibliography section (here, archive, source here, archive):

« Hydroxychloroquine as a treatment of Covid-19 » That is the mentioned title.

But there’s a big problem. The official title of the published article is (source):

« Hydroxychloroquine and azithromycin as a treatment for Covid-19. »

Yet, the official statement indicates that there is no sub-study

But the trial did cover azithromycin: it is in the very title of the publication!

And it’s obvious in the results chart:

or even in the patients’ list only:

Finally, it was so much a trial of the Hydroxychloroquine plus Azithromycin combination that, in addition to the title, it has become the treatment that Didier Raoult now distributes to thousands of Marseilles residents, based on this trial.

A combination which, as we have seen, potentially prolongs the QT interval in 11% of the patients according to the American study.

The question is therefore simple: have the people at the Île-de-France (Paris region) Committee for the Protection of People, and the ANSM, given their agreement to a « Hydroxychloroquine plus Azithromycin » test, yes or no?

This is a very important issue, as it is obviously illegal to carry out a clinical trial without the agreement of the authorities; it is a criminal offence:

Article L.1126-5 of the Public Health Code

« Is punishable by one year’s imprisonment and a fine of 15,000 euros the fact of carrying out or having carried out research involving a human being […] without having obtained the positive agreement of a Committee for the Protection of People« .

It is even difficult to see things straight, as the article mentions an agreement of the ANSM on March 5 th and of the CPP on March 6th:

But the statement mentions a CPP agreement on March 3rd:

There is one final point related to one more requirement:

Article 42 – Reporting of suspected unexpected serious adverse reactions by the sponsor to the Agency

1. The sponsor of a clinical trial performed in at least one Member State shall report electronically and without delay […] all relevant information about the following suspected unexpected serious adverse reactions:

(a) all suspected unexpected serious adverse reactions to investigational medicinal products occurring in that clinical trial […].

The period for the reporting of suspected unexpected serious adverse reactions by the sponsor to the Agency shall take account of the seriousness of the reaction and shall be as follows:

(a) in the case of fatal or life-threatening suspected unexpected serious adverse reactions, as soon as possible and in any event not later than seven days after the sponsor became aware of the reaction;[…].

It should be ascertained whether, as a matter of caution, the death and 3 transfers to an ICU for patients treated with chloroquine have been reported – as it seems difficult to exclude categorically and without any doubt the slightest influence of the drug.

It is therefore important to get to the bottom of all of this rapidly.

4-6 The question of dates

Finally, note the title in the pre-publication (source) :

Thus, it had been planned to call the article « preliminary results ».

But the word « preliminary » has disappeared, and these seem to be the final results. As suggested by the main assessment criterion in the article:

However, as we have seen, there are several criteria, including the one on day 14. On March 16th the team communicated the results on Day 6 « because it was an emergency », but as of April 10th, there is still no publication of the results on Day 14.

We raise again the problem mentioned in the previous post (see here) :

So, what does March 16 th refer to? What was the first day of the trial, i.e. the first day of treatment of the patients ? And how was it possible to have an article written about the results on day 6 on Monday, March 16?
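The calendar constraint behind these questions can be checked with simple date arithmetic. This is only a sketch, assuming that D0 is the day of inclusion and that D6 falls six days later; the approval dates are those quoted from the article earlier in this post.

```python
from datetime import date, timedelta

results_presented = date(2020, 3, 16)   # presentation of the day-6 results
assessment_day = 6                      # D6, assuming D0 is the inclusion day
cpp_approval_in_article = date(2020, 3, 6)
declaration_filed = date(2020, 3, 10)   # date of the filing quoted above

# For a D6 result to exist on March 16, the last patient must have been
# included on this date at the latest.
latest_possible_d0 = results_presented - timedelta(days=assessment_day)

print("latest possible D0:", latest_possible_d0.isoformat())
print("weekday of presentation:", results_presented.strftime("%A"))
print("days between CPP approval and latest D0:",
      (latest_possible_d0 - cpp_approval_in_article).days)
```

Under these assumptions, every patient would have had to be included by March 10, the very day the declaration was filed, and within 4 days of the ethics approval mentioned in the article.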

4-7 Thanks to Didier Raoult, French research is shining brightly

For, as we have seen, if a form of blind fascination continues in France, the sad truth has been well perceived across the Atlantic by specialists in « Fake-Science » (who have no particular dispute with Raoult). We are thinking here of Professor David Gorski, oncology surgeon, professor at Wayne State University (source: Science-Based Medicine website):

«  It turns out that Didier Raoult’s group played even faster and looser with the data in this study than my post indicates. Worse, Raoult’s group has a documented history of data fabrication, likely due to his tyrannical manner of running the institute and his demand for results, where he publicly humiliates students, postdocs, and researchers who do not produce the results he wants.[…]

he is feared in the French scientific community due to his propensity to use his power and influence to silence critics.[…]

He’s also been quoted as saying: « I have never done randomized trials… The effect of randomized stuff, maybe it works on 100,000 people who had a myocardial infarction, but putting that in infectious diseases, it doesn’t make sense. It’s silly. »

That explains a lot about why his trial design was so incompetent.

Overall, Raoult strikes me as a “brave maverick” who might have been a great scientist in his prime but who has now become arrogant and dictatorial and has now come down with a serious case of the Dunning-Kruger effect.

He has no expertise in clinical trials and in fact disputes the usefulness of randomized clinical trials in infectious disease.

Anything he publishes on COVID-19 should be taken with a huge grain of salt. »

But we can now also quote, by way of example, this article of April 9 from CNN talking about the Raoult trial (source):

« The study was a complete failure.  » [Kevin Tracey, CEO of the Feinstein Institutes for Medical Research in New York City]

« It was pathetic.  » [Art Caplan, Head of the Division of Medical Ethics at New York University School of Medicine]

It could not be better said…

V. Conclusions from the trial

5-1 Definitive conclusion

We can definitively conclude that :

  • the Raoult team does not have the most elementary competency to carry out such a trial;
  • there are serious problems with their medical ethics and intellectual probity: among the 26 patients, those they chose to label « lost to follow-up » include one who died, 3 whose condition severely worsened, and one who suffered such terrible side effects that he decided to leave the trial ;
  • this trial does not in any way make it possible to determine whether chloroquine has little effect or no effect at all ;
  • taking chloroquine during 3 days at the hospital does not prevent patients from dying or going to the ICU;
  • the 18 people who signed this article are at the least severely incompetent, or obliging people who signed without having read with full attention ;
  • the peer review committee that approved the article only did, at best, a very quick analysis of the article;
  • the IHU is wasting our valuable time by not carrying out serious studies – this center was not even able to find in the hospital 30 Covid-19 patients to constitute a solid placebo group…

5-2 Issues to look further into

We hope that in the future satisfying answers will be brought to the following questions:

  • Of course, the first question remains: is chloroquine efficient? It’s a very important question, to which other, more competent teams will have to answer.
  • Did chloroquine worsen or not the symptoms and clinical situation of the 4 patients ?

Finally, it would be very useful if the team of investigators answered the following questions :

  1. What are the exact calendar dates of this clinical trial (D0, D6, D14)?
  2. Why were the 5 serious setbacks of the trial recorded as patients lost to follow-up?
  3. How did you select the URTI patients to whom you gave antibiotics?
  4. Why are patients 1 and 4 aged 10 when the inclusion criterion was over 12 years old?
  5. How can you be sure that the data concerning the control groups were collected reliably at the other centers?
  6. In your view, how reliable are those tests?
  7. Why did you not mention this issue in the article? Especially when, in the control group, the ratio of positive patients increases on 3 occasions during the trial.
  8. Why are confidence intervals almost never indicated in your charts and diagrams ?
  9. Why did you include asymptomatic cases ?
  10. How did you evaluate the average duration of the disease among the 16 patients of the control group, which includes asymptomatic cases ?
  11. Did you go on treating patients once their viral load was no longer detectable? And if so, why?
  12. Why didn’t you give a placebo ?
  13. Why didn’t you indicate the clinical condition on D6 ?
  14. Did some URTI cases worsen into LRTI symptoms? How does this appear in the overview table ?
  15. Are there any patients with a negative viral load who were not immediately healed?
  16. What was the cause of death of the patient treated with chloroquine? How is it possible to know the exact role played by the treatment? Same question for the patients sent to the ICU.
  17. What was the PCR method used in the trial (primer and Ct) ? Is it exactly the same as for the other tests done in Marseilles ?
  18. Was the PCR method in Marseilles the same as in the other centers ? If not, what are the differences ?
  19. How did you compare the average duration of viral shedding in Marseilles with those in China, given that they had different PCR parameters ?
  20. Why did you talk about a « mean duration of viral shedding » in China when it was a median duration?

5-3 But this treatment probably works (a little)

That being said, let us repeat that we are not « against » chloroquine; that would make no sense. We’d like it to work (and there are a few encouraging signs), but we would simply like to find that out through high-quality studies – nothing more.

And we’re almost sure that the treatment that is suggested in the trial works – at least a little.

Indeed, it’s called the placebo effect, and it is very likely that it plays a role in the least serious cases. But Didier Raoult adamantly refuses to measure it, which is strange, because doing so would demonstrate the real efficacy of chloroquine.

But this is not his first time (source):

We see that, during this serious public health crisis, in an article about the methods to fight the epidemic, Raoult offers a comparison with acupuncture and homeopathy !

Those methods, by the way, never proved any better than mere placebos (see here and there, for example).

5-4 How could such an article be published?

This article was published in the International Journal of Antimicrobial Agents:

Whose editor-in-chief is J.M. Rolain, and one of the editors is P. Colson from Marseilles:

who are two of the authors of the article (what are the odds!).

Yet another example of conflict of interest in science.

5-5 Conclusion

Finally, there’s this gold nugget:

Thus, the study had some « limitations », such as « the 6 dropouts under study. » Including 1 death, 3 in intensive care and a serious side effect ….

But also:

« For ethical reasons », because of results « so significant and evident » they decided to share their results, « given the urgent need for an effective drug ».

We may well ask whether it was really so urgent and necessary to publish on March 16th, given the low quality of the trial, the outrageous dismissal of the setbacks (1 death, 3 in ICU, 1 with severe side effects), the poor results, and a sample size too small to be significant.

Because, if we did get this on day 6…:

…we would probably have obtained something like this on day 14 :

We would then have seen that we didn’t have a « miraculous » drug, but only, maybe, a drug speeding up the viral shedding in the nose – which could be useful, depending on the side effects, but doesn’t give the same image of efficacy. Without even mentioning the serious setbacks…

This being said, the outcome of day 6 was published on March 16th. On the 5th of April, we still don’t have the outcome of day 14. « For ethical reasons », it would have been good to have access to it just as quickly, meaning on March 24th… The secondary criteria concerning mortality would also have been interesting to analyze.

VI. Reactions and analyses from scientists

In the debates about Didier Raoult, the words « talented scientist », « great professional », « international authority » are often used, but here is how many specialists judge his work – and given what we have just analyzed, it’s easy to understand why.

Dominique Costagliola, member of the French Academy of Sciences (one of only 36 members of the « human biology and medical sciences » section), vice-dean of research at the Faculty of Medicine of Sorbonne University, deputy director of the Pierre Louis Institute of Epidemiology and Public Health (Sorbonne University):

« In a non-randomized trial in which the control group includes an unknown proportion of people who met non-inclusion criteria, who refused to participate in the trial, or who were treated at other centers, it is impossible to tell whether the 2 groups are comparable, and therefore to judge whether any differences can be attributed to receiving the treatment or not, especially as the most severe clinical cases (at least 4 of the 6 lost patients, 3 in ICU and one dead) were excluded from the description and the analysis.

A further complication arises from the fact that we do not know for certain whether the virological analyses were carried out in a centralized manner, as suggested by the simple general mention « positive » and not a quantitative measure for a large proportion of the control group patients, whereas this is never the case for the patients treated with chloroquine, and this is definitely something which could introduce differences due to the method of measurement and not to any possible treatment effect.

All in all, the study is carried out, described and analyzed in a non-rigorous manner, with imprecisions and ambiguities, and this study is subject to a high risk of bias according to international standards. In this context, it is impossible to interpret the described effect as a consequence of the chloroquine treatment.

Furthermore, if the prevention of the transmission is an important criterion, in the current crisis situation the relevant criterion for the evaluation of a treatment is rather a clinical criterion such as the need for a ventilator, or death, which was the case for 4 of the 26 patients treated with chloroquine in this study. »

Prof. Xavier Lescure, infectious disease specialist at the Bichat-Claude-Bernard hospital (source):

« As a researcher, a teacher and a doctor I am deeply concerned about the polemic around chloroquine and hydroxychloroquine. I won’t dwell on the pilot study presented by professor Didier Raoult, which had only a few patients and is methodologically outrageous. It’s not a randomised clinical trial, with randomly selected patients. They are selected without paying attention to their condition. Some are asymptomatic and do not need a treatment, some whose condition worsens towards the ICU or death are excluded from the study, etc. It’s scientifically shameful. A researcher may have convictions, but no certainties. He must be guided by doubt.

Right now, I’m mostly concerned as a doctor treating patients in their hospital beds. Everyone wants Plaquenil, now, despite the fact that this treatment might cause the clinical condition of our contaminated patients to worsen. And we have the greatest difficulty carrying out the serious clinical trials that have started on several molecules, because the patients want « their » Plaquenil. This unrestrained communication might slow down the enrolment of patients into clinical trials of medicines against Covid-19, because we have to spend a lot of time re-informing them. Lastly, I am also concerned as a teacher, as the communication about these molecules goes against everything that I have learnt and pass on to my students. »

Prof. Gilbert Deray, head of the nephrology department at the Pitié Salpêtrière hospital:

« Science, medicine, therapeutic indications are not decided by articles published in peer-reviewed journals, by experts, they are decided on Facebook and Twitter. I am flabbergasted that various high-profile people can tell those who listen to them « That’s the treatment you should receive ». […] Politicians saying « I just received chloroquine for 7 days, I’m cured, take some ». They don’t realize it, but it’s the antithesis of medicine. First, they don’t realize it’s absolutely stupid, I state this strongly. They don’t realize the influence they have. I don’t understand how some politicians, doctors, former doctors can say: « I called the government to ask them to give this treatment ». Because the study showed that… Despite everything that people who actually read these studies say. […]

I am sad and very angry. I’m very angry, because what I believe is that these divisions within society will cause deaths. It’s important to know that among medical teams, there are debates around this problem, and that it is tiring for everyone. And that the trials which are currently trying to prove the efficacy of this drug are slowed down because patients do not want to enter them. […]

It’s important to know that when they tell us « Yes, but we are in a special period, it’s war, we must act quickly »… Who said we shouldn’t act quickly? We could do the same studies, very quickly, with the same deadlines, and with a real result. So, I’ll tell you one last thing about this. When I’m told « we shouldn’t give placebo in this period, it’s wrong, it’s unethical ». I believe exactly the opposite. Because we don’t know if it works. Do you know if patients will improve or worsen, when you give them placebo? Let me remind you that chloroquine, which is effective against chikungunya in a test tube, worsens the symptoms of chikungunya in a patient. I don’t know if this treatment is effective or makes things worse. And if we had given a placebo to 20 or 30 persons, we would have had an answer, we would have saved millions more. »

François Séverac, methodologist, biostatistician and doctor at Strasbourg’s civilian hospital (source):

« Given the methodological problems and the number of biases in the study, which is merely preliminary, we cannot confirm its results. However, although all those biases prevent us from demonstrating that the treatment works, that doesn’t mean that it doesn’t work. »

Nicolas Martin, host of the radio program « the scientific method » on France Culture (source):

« In the collective mind, many seem to believe that it’s only scientists and scientific journalists who are griping, when there are positive results: but no, there are no proven positive results. The study doesn’t follow any method, so it’s like building a wall without cement: if you lean against it, it collapses. »

Let’s finish with a view from abroad (which we invite you to take with a grain of salt), by Prof. David Gorski, surgical oncologist, professor at Wayne State University (source: the Science-Based Medicine website):

« It turns out that Didier Raoult’s group played even faster and looser with the data in this study than my post indicates. Worse, Raoult’s group has a documented history of data fabrication, likely due to his tyrannical manner of running the institute and his demand for results, where he publicly humiliates students, postdocs, and researchers who do not produce the results he wants.

He is feared in the French scientific community due to his propensity to use his power and influence to silence critics.

He’s also been quoted as saying: « I have never done randomized trials… The effect of randomized stuff, maybe it works on people who had a myocardial infarction, but putting that in infectious diseases, it doesn’t make sense. »

That explains a lot about why his trial design was so incompetent.

Overall, Raoult strikes me as a “brave maverick” who might have been a great scientist in his prime but who has now become arrogant and dictatorial and has now come down with a serious case of the Dunning-Kruger effect.

He has no expertise in clinical trials and in fact disputes the usefulness of randomized clinical trials in infectious disease.

Anything he publishes on COVID-19 should be taken with a huge grain of salt. »

VII. Summary of the problems

Here is a summary of the main problems, mistakes and errors :

  1. It is very difficult to figure out the exact calendar dates of the beginning and the end of this clinical trial;
  2. The trial is not randomized;
  3. The control group includes people with particular characteristics, which might modify the efficacy of the treatment;
  4. The control group is significantly smaller (by a third) than the 24 determined by the team’s own statistical analysis;
  5. The chloroquine group itself is smaller than needed (20 instead of 24);
  6. The sample is too small to lead to any conclusion;
  7. There is no placebo control group;
  8. The control group is scattered across 3 other centers, with no breakdown of patients by center;
  9. The investigators casually swept 5 setbacks aside from the study (including 1 death and 3 patients moved to the ICU), passing them off as lost patients;
  10. 2 patients are 10 years old, when the protocol requires patients to be over 12 to join the study;
  11. The clinical condition on day 6 is not known (have all the patients who no longer show any virus in the nose recovered?);
  12. Patients with an unknown date of onset of the disease were included in the trial, including asymptomatic patients (but not only them);
  13. There are doubts about the seriousness and thoroughness of the collection of the test data at the centers where the control groups were tested;
  14. Confidence intervals are almost never indicated in the charts and diagrams;
  15. The tests carried out are not highly reliable (on 3 occasions the ratio of patients testing positive increased in the control group);
  16. Without any further information, Raoult’s team tries to compare the speed of recovery in China in January with that in Marseilles in March;
  17. Raoult’s team mixes up average and median durations for a fundamental variable in order to compare its results with China’s;
  18. We don’t know for sure that there are no differences between the detection thresholds in Marseilles and in the other centers, or between Marseilles and China (or even between this trial and the routine daily tests done in Marseilles).
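To give a rough feel for points 6 and 14, here is a minimal sketch of what a confidence interval looks like with so few patients. It uses the figure quoted above by Dominique Costagliola (4 of the 26 treated patients ended up in the ICU or dead) and a Wilson score interval. The choice of the Wilson method and the function name `wilson_interval` are our own assumptions for illustration; this is not a calculation taken from the Gautret/Raoult article.

```python
import math


def wilson_interval(events, n, z=1.96):
    """Wilson score interval for a proportion (illustrative helper, our own naming).

    events: number of patients with the outcome
    n:      total number of patients
    z:      normal quantile (1.96 corresponds to a 95% interval)
    """
    if n <= 0 or not 0 <= events <= n:
        raise ValueError("need n > 0 and 0 <= events <= n")
    p = events / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# 4 severe outcomes (ICU or death) among the 26 treated patients, as quoted above
low, high = wilson_interval(4, 26)
print(f"Observed: {4 / 26:.1%}, 95% interval: {low:.1%} to {high:.1%}")
```

With only 26 patients the interval is very wide (roughly 6% to 34% under these assumptions), which is precisely why reporting such intervals matters before drawing any conclusion from samples of this size.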

VIII. Towards retraction

Let us finish with the reaction of the International Society of Antimicrobial Chemotherapy (ISAC), which issued this press release (source):

« Hydroxychloroquine and azithromycin as a treatment of COVID-19: results of an open-label non-randomized clinical trial (Gautret P et al. PMID 32205204)

ISAC shares the concerns regarding the above article recently published in the International Journal of Antimicrobial Agents (IJAA). The ISAC Board believes the article does not meet the Society’s expected standard, particularly with respect to the lack of better explanations of the inclusion criteria and patients triage to ensure patient safety.

Despite some suggestions online as to the reliability of the article’s peer review process, the process did adhere to the industry’s peer review rules. Given his role as Editor in Chief of this journal, Jean-Marc Rolain had no involvement in the peer review of the manuscript and has no access to information regarding its peer review. Full responsibility for the manuscript’s peer review process was delegated to an Associate Editor.

Although ISAC recognises it is important to help the scientific community by publishing new data fast, this cannot be done at the cost of reducing scientific scrutiny and best practices. Both Editors in Chief of our journals (IJAA and Journal of Global Antimicrobial Resistance) are in full agreement.

Andreas Voss, ISAC President – April 3 2020 »

This is a very rare statement, even though it is the bare minimum: the society did not draw all the necessary conclusions from this sad story, since it did not retract the article, whose online version is not even accompanied by a warning.

And finally, as Leonid Schneider reminds us, the publisher is Elsevier.

It is unfortunately further evidence of the lack of professional ethics which we sometimes see among scientific publishers.

Regarding this particular trial, let us also mention a few whistleblowers: Leonid Schneider (@schneiderleonid; see this article); Elisabeth Bik (@MicrobiomDigest; see this article), but also @Damkyan_Omega, @AndrewALover and @GaetanBurgio – and more generally the amazing PubPeer community.

Failure to retract this article would be a second scientific disgrace.

 
