Accuracy Detection: Applications Where Biostatisticians Can Contribute

In this short communication, we describe a state-of-the-art accuracy detection (polygraph) method, available at Converus.com. It is estimated to detect truthfulness (falseness) with 88% (86%) probability, respectively. While lie detection methods are not admissible in adjudicating guilt in criminal cases, this technology can be used to provide important ancillary information to those trying to fairly resolve disputes. We shall provide three examples: i. Helping Human Resources Departments resolve sexual harassment complaints in the #MeToo era; ii. Screening jailhouse informants in criminal procedures; and iii. Assessing unanimous guilty verdicts in capital murder cases, where jury deliberations took a very long time to reach a verdict. These three applications should be viewed as examples of collaborations amongst those involved in important adjudication, experts in accuracy detection, and applied statisticians and biostatisticians. Other applications are encouraged.


Introduction
An excellent method for credibility assessment is available from [1]. The subject sits in front of a computer screen, and eye movements and other behaviors are tracked by the camera and the computer. According to this website, the process takes about 30 minutes and operator involvement with the subject is minimal. These credibility assessment (polygraph) tests, as described in Kircher and Raskin [2], using the five-fold method, are estimated to detect truthful responses 88% of the time and deceptive responses 86% of the time. In the Methods section below, we provide three applications that rely on the binomial distribution for quantification: a) Helping Human Resources Departments resolve sexual harassment complaints in the #MeToo era; b) Screening jailhouse informants in criminal procedures; and c) Assessing unanimous guilty verdicts in capital murder cases, where jury deliberations took a very long time.

Methods
Help for Human Resources to resolve a sexual harassment case. The question we ask both the accuser (Y) and the accused (N) is basically: Did N sexually abuse Y? Of course, Y will say Yes, and N will say No.
If the polygraph inferences are concordant, which occurs if it infers that one party is truthful and the other is not, this is substantial evidence either for exonerating N or for supporting Y. When the true perceptions are concordant (one party is truthful and the other is not), there is a slightly better than 75% (88% of 86%) chance that the Converus.com polygraph will yield such a conclusive result. Of course, the accuser can truly believe it happened while the accused truly believes it did not. The risk in this case is that the polygraph infers that N is lying while Y is truthful, which has probability 10.6% (12% of 88%). If the results are not concordant, we would conclude that the polygraph was of little help to Human Resources.
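The two percentages in this paragraph are products of the per-test accuracies. The following is a minimal Python sketch of that arithmetic, assuming the two polygraph results err independently of each other; the function and variable names are illustrative and not part of the Converus.com method.

```python
# Detection rates quoted in the text (Kircher and Raskin [2]).
P_DETECT_TRUTH = 0.88  # P(polygraph infers truthful | subject is truthful)
P_DETECT_LIE = 0.86    # P(polygraph infers deceptive | subject is deceptive)


def conclusive_when_one_lies(p_truth=P_DETECT_TRUTH, p_lie=P_DETECT_LIE):
    """Chance that the truthful party is read as truthful AND the
    deceptive party is read as deceptive.

    Assumes the two tests err independently (an assumption of this sketch).
    """
    return p_truth * p_lie


def n_flagged_when_both_sincere(p_truth=P_DETECT_TRUTH):
    """Chance that N is read as lying while Y is read as truthful,
    when both parties sincerely believe their own answer."""
    return (1.0 - p_truth) * p_truth


if __name__ == "__main__":
    print(f"{conclusive_when_one_lies():.4f}")     # 0.7568
    print(f"{n_flagged_when_both_sincere():.4f}")  # 0.1056
```

If the two tests were positively correlated (for example, through a shared operator or environment), these products would only be approximations.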
Screening a jailhouse informant. It is well known that while waiting for trial, incarcerated defendants often provide incriminating information to other inmates. However, in exchange for their testimony, the prosecution often provides incentives for an informant's testimony. Defense attorneys are therefore very skeptical of such testimony. We should be in equipoise about such evidence, as on the one hand it has the potential to help the courts but on the other hand it could yield damaging but fabricated testimony.
Before allowing such a witness, it would be extremely valuable to do the following. Via a deposition, the informant will have the testimony recorded and signed. Once this is completed, the informant is given a polygraph to infer whether the testimony is entirely true or not. Only if the informant passes the test would the evidence be admitted to the court. Of the genuine testimonials, 12% are expected to be rejected. The strength of this approach is that the jury would not see any informant testimony until it was cleared.
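How much a passed test should reassure the court depends on how often informant testimony is genuine to begin with, a quantity the text does not give. The sketch below treats that prevalence as a purely hypothetical input (`prior_genuine`) and applies Bayes' rule with the 88% and 86% detection rates; it is an illustration under stated assumptions, not an estimate for any real population of informants.

```python
def screening_outcomes(prior_genuine, p_truth=0.88, p_lie=0.86):
    """Return (admitted_fraction, genuine_share_among_admitted).

    prior_genuine: hypothetical fraction of informant statements that are
                   entirely true (an assumption, not a figure from the text).
    p_truth:       P(test passes | testimony genuine), 0.88 in the text.
    p_lie:         P(test fails | testimony fabricated), 0.86 in the text.
    """
    if not 0.0 <= prior_genuine <= 1.0:
        raise ValueError("prior_genuine must lie in [0, 1]")
    pass_genuine = prior_genuine * p_truth                   # genuine and admitted
    pass_fabricated = (1.0 - prior_genuine) * (1.0 - p_lie)  # fabricated but admitted
    admitted = pass_genuine + pass_fabricated
    if admitted == 0.0:
        raise ValueError("no testimony would be admitted under these inputs")
    return admitted, pass_genuine / admitted


if __name__ == "__main__":
    # Hypothetical scenario: half of all informant statements are genuine.
    admitted, share = screening_outcomes(0.5)
    print(f"admitted: {admitted:.3f}, genuine among admitted: {share:.3f}")
```

The lower the assumed prevalence of genuine testimony, the smaller the genuine share among admitted statements, which is why the prior is left as an explicit parameter.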
Long jury deliberation in capital murder cases. When it takes a long period of time to reach a guilty verdict in a capital murder case, especially if the jury has been sequestered, we should be concerned about two things:
A. Were there dominant jury members who unduly influenced their peers? and
B. Was there impatience, where jurors cave in just to get back to their normal lives?
These juries consist of 12 members, and a guilty verdict requires all jurors to assert that beyond a reasonable doubt, the defendant is guilty.
There are two items that can be investigated. The first concerns a given trial. When selecting a jury, permission to undergo a polygraph would be an eligibility requirement. When deliberations exceed a set period, 12 Converus.com machines should be made available, and all jurors would independently submit to a test (about 30 minutes) concerning the truthfulness of their verdict. Under present law, the results of these polygraphs would not influence the actual verdict but could be powerful evidence to motivate an appeal.

I. The single case inference: We test the hypothesis that all 12 jurors were truthful in stating that they believed, beyond a reasonable doubt, that the defendant was guilty. Since a truthful juror is inferred to be lying with probability 0.12, the number of jurors so inferred among 12 truthful jurors follows a binomial distribution. If 4 or more (5 or more) jurors are inferred to be lying, we can reject this hypothesis at P=0.046 (0.010), respectively. In other words, the likelihood of seeing this level of contradiction when all members really believed guilt beyond a reasonable doubt would be 4.6% (1.0%), respectively.
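The binomial tail probabilities for the single-case test can be checked with a few lines of Python. This is a minimal sketch, assuming each of the 12 jurors is truthful and is independently misread as lying with probability 0.12 (one minus the 88% truth-detection rate); the function name `tail_prob` is illustrative.

```python
from math import comb


def tail_prob(k_min, n=12, p_flag=0.12):
    """P(at least k_min of n truthful jurors are inferred to be lying).

    Assumes independent tests, each flagging a truthful juror with
    probability p_flag (0.12 = 1 - 0.88 in the text).
    """
    return sum(
        comb(n, k) * p_flag**k * (1.0 - p_flag) ** (n - k)
        for k in range(k_min, n + 1)
    )


if __name__ == "__main__":
    for k in (3, 4, 5):
        print(f"P(X >= {k}) = {tail_prob(k):.4f}")
    # P(X >= 3) = 0.1667
    # P(X >= 4) = 0.0464
    # P(X >= 5) = 0.0095
```

Under these assumptions, the conventional 5% and 1% significance levels are reached at four or more and five or more flagged jurors, while three or more would occur by chance in roughly one of every six fully truthful juries.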
II. Collective review of cases. If we could collect data on a large number of such jurors from many capital cases, we could assess the societal question of whether such verdicts demonstrate a substantial degree of failure in the spirit of unanimity.
Let P represent the fraction of polygraphs that infer truth to the question of guilty beyond a reasonable doubt, and Q represent the true truthfulness fraction. Note that we can get excellent direct estimates of P, but Q (the desired parameter) must be estimated indirectly as follows.
Recall that truthfulness is detected with probability 0.88 and falseness detected with probability 0.86=1-0.14. Hence, we infer truthfulness when a juror truly believes in guilt and the polygraph gets it right, or when a juror has reasonable doubts, but the polygraph fails to pick up the lie. Hence, $P = 0.88\,Q + 0.14\,(1 - Q)$.
Note that if P is close to 0.88, then Q is close to 1.0 (virtually complete truthfulness from the jurors collectively). But suppose P is 0.83, which implies Q=0.932. This may not seem alarmingly different from the ideal value of 1.0, but when you factor in 12 jurors, the probability that all 12 are truthful in a specific case translates to $0.932^{12} = 0.43$. So, the likelihood of all 12 jurors believing guilt beyond a reasonable doubt is only 43% when P=0.83.
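The indirect estimate of Q follows from solving the relation between P and Q for Q, which gives Q = (P - 0.14)/(0.88 - 0.14). The sketch below carries out that inversion and the 12-juror calculation; the function names, and the choice to clip the estimate to the interval [0, 1], are assumptions of this illustration, and the final step assumes the jurors are truthful independently of one another.

```python
def estimate_q(p_observed, p_truth=0.88, p_lie=0.86):
    """Solve P = p_truth*Q + (1 - p_lie)*(1 - Q) for Q.

    p_observed: fraction of polygraphs that infer truthfulness (P in the text).
    The result is clipped to [0, 1], since sampling noise in P could
    otherwise push the estimate outside the valid range (a design choice
    of this sketch).
    """
    false_pass = 1.0 - p_lie  # 0.14: a lie that goes undetected
    q = (p_observed - false_pass) / (p_truth - false_pass)
    return min(1.0, max(0.0, q))


def prob_all_truthful(q, n_jurors=12):
    """Probability that all n_jurors are truthful, assuming independence."""
    return q**n_jurors


if __name__ == "__main__":
    q = estimate_q(0.83)
    print(f"Q = {q:.3f}")                                   # Q = 0.932
    print(f"P(all 12 truthful) = {prob_all_truthful(q):.2f}")  # 0.43
```

Because Q is raised to the 12th power, small shortfalls in the observed P translate into large drops in the probability of genuine unanimity.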

Discussion
There are two elements of this paper that need to be pointed out. First, the reliability numbers (88% for truthfulness and 86% for falseness), though based on a reasonable sample size, need to be refined over time as experience with Converus.com accumulates. Second, there are purportedly "good liars" who can fool polygraphs. For that reason, Converus.com uses control questions to evaluate this. If the subject is deemed to be a "good liar" on these control questions, then the inference from any polygraph might be tainted, and the technology should therefore not be used on such subjects. It is argued here that "good liars" are likely rare. The narrow difference in reliability in detecting truthfulness (88%) and falseness (86%) would presumably have been wider if "good liars" were highly prevalent.
Although polygraph technology is not directly admissible to detect guilt or innocence of an individual criminal defendant, the use of the polygraph for an individual trial could provide powerful rationale for an appeal. One possible further study that could be done is to show a video of the trial to a new unofficial jury, to see if a verdict can be obtained in a timely manner.
Since the standard of evidence in civil trials is much less stringent than in criminal trials, there is great potential to use polygraph technology in these cases. Class action lawsuits are especially amenable to using polygraphs, due to the large number of plaintiffs who could be tested, for example to confirm oral claims disputed by a defendant.
In the capital murder example (iii), the collective data across many trials could estimate Q as a function of the length of deliberation. If there is a large discrepancy between Q and 1.0 (equivalently, between P and 0.88) for trials whose deliberations exceed a time threshold, then it seems reasonable in such a situation to declare a mistrial when the deliberations exceed the threshold. The calculation at the end of the Methods section is alarming!

Funding
This project was self-funded.