2.3. Concurrent use of initial diagnostic tests for diagnosis of TB in people living with HIV and children

The burden of tuberculosis (TB) is substantial among people living with HIV and children, particularly in low- and middle-income countries (LMICs). People living with HIV are at substantially higher risk of developing TB disease because of immunosuppression, and TB is a leading cause of death in this population. Children, especially those under 5 years of age, are at high risk of progressing from TB infection to TB disease and of rapid disease progression. They often present with broad respiratory symptoms, which complicate diagnosis and increase morbidity and mortality if TB is not promptly treated. Addressing TB in these at-risk populations requires concerted efforts that account for their unique clinical presentations and diagnostic needs.

Diagnosing TB in people living with HIV and children is challenging, particularly because of nonspecific clinical presentations and the often low and variable numbers of mycobacteria in their samples, which lower the sensitivity of existing diagnostic tests. Furthermore, children and people living with HIV with advanced immunosuppression may be unable to provide sputum samples and can have disseminated TB, which is challenging to confirm with laboratory methods. To address this challenge in part, WHO recommends the use of stool to aid laboratory confirmation of TB in children, and the use of urine to aid confirmation of TB in people living with HIV. However, even highly sensitive tests for TB diagnosis, such as LC-aNAATs, can miss TB in these groups. Improved diagnostic approaches are therefore needed to accurately confirm TB in these higher-risk populations and ensure early and effective treatment.

Tests based on the detection of the lipoarabinomannan (LAM) antigen are biomarker-based tests that may be used on urine at the point of care for TB detection. The currently available urinary LAM assay is rapid (<1 hour to result) but has suboptimal sensitivity and is therefore not suitable as a general diagnostic test for TB. However, unlike traditional diagnostic methods, it demonstrates improved sensitivity for the diagnosis of TB among individuals coinfected with HIV. The estimated sensitivity is even greater in patients with low CD4 cell counts. The lateral flow urine LAM assay (LF-LAM) strip-test – the Abbott/Alere Determine TB LAM Ag (USA), hereafter referred to as LF-LAM – is currently the only commercially available urinary LAM test.

Using concurrent¹⁰ testing of different sample types offers a promising approach that considers the diagnostic testing barriers for HIV-positive adults and adolescents, HIV-positive children, and children without HIV or for whom HIV status is unknown. For instance, testing of sputum and stool during the same visit, when feasible, using LC-aNAATs increases the likelihood of detecting TB in children who may have scant bacilli in respiratory samples alone. Similarly, for persons living with HIV, testing of sputum and urine during the same visit, when sputum can be produced, using LC-aNAATs and LF-LAM increases the likelihood of detecting TB with a rapid point-of-care result while also ensuring detection of rifampicin resistance. This concurrent testing approach builds on the prior recommendation for LF-LAM test use among eligible persons living with HIV, which underscored the need for mWRD testing of available respiratory samples to support universal patient access to resistance testing services.

Implementing a diagnostic approach that includes concurrent sample testing could simplify diagnostic processes, shorten the patient journey, and improve TB detection rates and health outcomes for these at-risk populations. At the same time, the inability to collect one or more specimens at the same initial visit, or the lack of one of the two test types, should not delay testing of the specimens and tests that are available; instead, it should trigger collection and testing of the missing specimen as soon as possible.

The recommendations cover the following three scenarios:

  • LC-aNAAT on respiratory samples and urine LF-LAM among adults and adolescents living with HIV
  • LC-aNAAT on respiratory samples and stool in children
  • LC-aNAAT on respiratory samples and stool, as well as urinary LF-LAM among children with HIV

These recommendations should be implemented within recommendations for the comprehensive diagnosis and management of persons living with HIV and children.

2.3.1 Concurrent use of tests in people living with HIV
Recommendations
Unamed-Table-7

 

Remarks

  • Serious illness in people living with HIV is defined based on any of the following symptoms: respiratory rate ≥30 breaths per minute, temperature ≥39 °C, heart rate ≥120 beats per minute or unable to walk unaided.
  • Advanced HIV disease in people living with HIV is defined as a CD4 cell count of <200 cells/mm³ or presentation with a WHO stage 3 or 4 AIDS-defining illness.
  • This concurrent testing recommendation supersedes prior guidance on using LF-LAM for people living with HIV and the use of a single molecular test for diagnosis of TB in this group.
  • This recommendation is strong despite the low certainty of evidence because the findings indicate large desirable effects (i.e. rapid and accurate diagnosis of TB in a highly vulnerable population – people living with HIV – in whom diagnosing TB is often challenging) over small undesirable effects (i.e. negative consequences of this testing strategy).
  • The LC-aNAAT products for which eligible data met the class-based performance criteria for this recommendation were Xpert MTB/RIF Ultra and Truenat MTB Plus. Data for performance of Truenat MTB Plus and MTB-RIF Dx were only available for testing among persons living with HIV without concurrent LF-LAM testing.
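
The definitions of serious illness and advanced HIV disease given in the remarks above can be expressed as simple checks. The following is a minimal sketch, not part of the guideline: the record type, field names and function names are hypothetical, and a missing CD4 count or WHO stage is assumed not to satisfy the corresponding criterion.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Vitals:
    # Hypothetical record; field names are illustrative, not from the guideline.
    respiratory_rate: int      # breaths per minute
    temperature_c: float       # degrees Celsius
    heart_rate: int            # beats per minute
    can_walk_unaided: bool


def is_seriously_ill(v: Vitals) -> bool:
    """True if any one of the signs listed in the remarks is present."""
    return (
        v.respiratory_rate >= 30
        or v.temperature_c >= 39.0
        or v.heart_rate >= 120
        or not v.can_walk_unaided
    )


def has_advanced_hiv_disease(cd4_cells_per_mm3: Optional[int],
                             who_stage: Optional[int]) -> bool:
    """True if CD4 < 200 cells/mm3 or WHO stage 3 or 4.

    Assumption: a missing value (None) does not meet that criterion.
    """
    low_cd4 = cd4_cells_per_mm3 is not None and cd4_cells_per_mm3 < 200
    late_stage = who_stage in (3, 4)
    return low_cd4 or late_stage
```

Either criterion alone is sufficient in each definition, which is why both functions combine their checks with a logical "or".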
Justification and evidence

In a 2016 Cochrane systematic review of the diagnostic accuracy of LF-LAM, sensitivity increased by 13% when combining LF-LAM and sputum Xpert MTB/RIF, compared with sputum Xpert alone, while the specificity decreased by 4%. However, results were based on only a few studies, and analyses were restricted to participants able to produce sputum.

Incremental diagnostic accuracy

In 2023, WHO commissioned a series of systematic reviews to evaluate the incremental diagnostic accuracy¹¹ of concurrent use of either two different tests – LC-aNAAT on respiratory samples and LF-LAM on urine among people living with HIV – or the same test on two samples (LC-aNAAT on respiratory and stool samples) in children, or alternatively LC-aNAAT on respiratory and stool samples along with LF-LAM on urine among children with HIV.

What is the incremental diagnostic accuracy of concurrent use of respiratory LC-aNAATs and LF-LAM on urine for diagnosis of TB disease in adults and adolescents with HIV who present with presumptive TB, compared with any of the tests alone?

Of 31 studies, 27 evaluated diagnostic accuracy against an MRS, and 23 against a CRS, with 20 studies evaluating accuracy against both reference standards.

A total of 27 studies (12 651 participants, including 2368 [18.7%] with TB) compared the accuracy of the concurrent use of LC-aNAAT on a respiratory sample and LF-LAM versus each of the tests alone, using an MRS. The pooled differences in sensitivity and specificity between concurrent testing versus LC-aNAAT alone were 6.7% (95% credible interval [CrI]: 3.8 to 10.7; 95% prediction interval [PI]: 0.6 to 45.9) and –6.8% (95% CrI: –9.5 to –4.7; 95% PI: –32.8 to –6.8), respectively (Fig. 2.3.1.1). Certainty of evidence was low for both sensitivity and specificity.

A total of 23 studies (11 109 participants, including 3723 [33.5%] with TB) compared the accuracy of the concurrent use of LC-aNAAT and LF-LAM versus LC-aNAAT alone, using a CRS. The pooled differences in sensitivity and specificity between concurrent testing versus LC-aNAAT alone were 16.0% (95% CrI: 10.7 to 22.9; 95% PI: 2.3 to 60.3) and –3.5% (95% CrI: –6.6 to –1.7; 95% PI: –47.2 to –0.1), respectively (Fig. 2.3.1.1). Certainty of evidence was low for sensitivity and very low for specificity.
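
To give a sense of scale, the pooled differences reported above can be translated into natural frequencies. The sketch below is illustrative only and is not part of the systematic review: it applies the pooled point estimates to a hypothetical cohort of 1000 people with presumptive TB, assumes the TB prevalence observed among participants in the included studies, and ignores the credible and prediction intervals. The function name and parameters are assumptions.

```python
def per_1000(prevalence: float, delta_sens_pts: float, delta_spec_pts: float,
             cohort: int = 1000) -> tuple[float, float]:
    """Return (additional true positives, additional false positives).

    delta_sens_pts and delta_spec_pts are differences in percentage points
    (concurrent testing minus LC-aNAAT alone).
    """
    with_tb = cohort * prevalence
    without_tb = cohort - with_tb
    extra_tp = with_tb * delta_sens_pts / 100
    # A drop in specificity (negative difference) yields extra false positives.
    extra_fp = without_tb * (-delta_spec_pts) / 100
    return extra_tp, extra_fp


# Prevalence taken from the participant counts reported for each reference standard.
mrs = per_1000(2368 / 12651, 6.7, -6.8)
crs = per_1000(3723 / 11109, 16.0, -3.5)
print(f"MRS: +{mrs[0]:.0f} people with TB detected, +{mrs[1]:.0f} false positives per 1000 tested")
print(f"CRS: +{crs[0]:.0f} people with TB detected, +{crs[1]:.0f} false positives per 1000 tested")
```

The balance between additional detections and additional false positives depends strongly on the prevalence assumed, so these figures should not be transferred to settings with a different prevalence of TB among those tested.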

Fig. 2.3.1.1. Forest plot of pooled differences in sensitivity and specificity (all studies combined) by index test: LF-LAM, LC-aNAAT and their concurrent useᵃ

Fig-2-3-1-1

 

CrI: credible interval; CRS: composite reference standard; LC-aNAAT: low-complexity automated nucleic acid amplification test; LF-LAM: lateral flow urine lipoarabinomannan assay; MRS: microbiological reference standard; TB: tuberculosis.

a The diamonds represent the pooled sensitivity and specificity, and the black horizontal line its 95% CrI. The pooled difference in sensitivity and specificity between concurrent testing and LC-aNAAT alone is indicated by a line connecting two diamonds. This pooled difference may not correspond to the difference between the pooled single test accuracy estimates (see Web Annex B.8).

 

In addition to diagnostic accuracy, clinical outcome data on mortality, time to diagnosis and time to treatment were assessed. Data on cure and loss to follow-up were not assessed due to a lack of data. The data from three studies indicated that an intervention including LC-aNAAT on respiratory samples and LF-LAM on urine in adult inpatients with HIV was associated with slightly reduced 8-week mortality (risk ratio: 0.93; 95% CI: 0.74–1.17). The adjusted hazard ratio of time to diagnosis in adult inpatients with HIV was 1.55 (95% CI: 1.29–1.87). This means that participants in the intervention groups (i.e. those undergoing concurrent LC-aNAAT on respiratory samples and LF-LAM on urine) were 1.55 times more likely to be diagnosed with TB within fewer days (relative reduction of 2 days and 1 day to same-day) than those in the control group. The pooled risk ratio of adult inpatients with HIV diagnosed with TB was 1.56 (95% CI: 1.29–1.88), indicating that the intervention group had 1.56 times the risk of being diagnosed with TB (either microbiologically confirmed or clinically diagnosed) compared with the standard of care, which included LC-aNAAT on sputum alone. The pooled risk ratio of adult inpatients with HIV with a bacteriologically confirmed TB diagnosis was 3.06 (95% CI: 1.82–5.16), indicating that the intervention group had three times the risk of being microbiologically confirmed with TB compared with the standard of care. Finally, the pooled risk ratio of adult inpatients with HIV treated for TB was 1.47 (95% CI: 1.25–1.73), indicating that the intervention group had 1.47 times the likelihood of being treated for TB, compared with the standard of care.

Single sample testing in people living with HIV compared with the MRS

Should LC-aNAATs on respiratory samples be used to diagnose pulmonary TB in PLHIV (adults and adolescents) with signs and symptoms or screened positive for pulmonary TB, against a microbiological reference standard?

Twelve studies (2016 participants) evaluated sputum specimens from people living with HIV (Fig. 2.3.1.2). The sensitivities ranged between 54% and 100% and the specificities between 78% and 100%. The summary sensitivity (95% CI) was 87.4% (83.8 to 90.3) and the summary specificity was 95.2% (92.7 to 96.9). The certainty of evidence for both sensitivity and specificity was graded as “High”.

Fig. 2.3.1.2 Forest plot of LC-aNAAT sensitivity and specificity for detection of pulmonary TB in PLHIV using a microbiological reference standard

Fig-2-3-1-2

 

Studies are sorted on the plot by assay and sensitivity (low to high). FN: false negative; FP: false positive; TN: true negative; TP: true positive.

 

Cost–effectiveness analysis

To date, evidence of cost–effectiveness for concurrent testing is limited. Several studies have assessed Xpert MTB/RIF with LF-LAM for diagnosing TB among people living with HIV. These studies have shown that concurrent testing is likely to increase the life expectancy of people living with HIV and be cost-effective compared with using Xpert MTB/RIF in sputum samples alone. Fekuda et al. evaluated the cost–effectiveness of concurrently using Xpert Ultra and LF-LAM among people living with HIV and concluded that concurrent testing is the preferred cost-effective strategy. Previous cost–effectiveness analyses primarily focused on Xpert MTB/RIF or Xpert Ultra, leaving a gap in evidence regarding the other technologies that may meet the LC-aNAAT class criteria. For details of particular studies see Web Annex B.9.

In preparation for the GDG meeting in May 2024, WHO commissioned a study to assess the cost–effectiveness of using LC-aNAATs (including Xpert Ultra, Truenat and other novel LC-aNAATs in the development pipeline) for the detection of TB when used concurrently among people living with HIV and children, including children with HIV, across two different country settings (Malawi and the Philippines). An objective of the study was to assess the cost–effectiveness of concurrent use of LC-aNAAT on respiratory samples and LF-LAM on urine for TB diagnosis and rifampicin-resistance detection among adult people living with HIV with presumptive TB, compared with a single LC-aNAAT on respiratory samples alone.

In the hypothetical model, a cohort of people living with HIV with signs and symptoms of TB progressed through a decision analytical framework. In the intervention arm, TB diagnosis involved the concurrent use of LC-aNAAT on respiratory samples and LF-LAM on urine, whereas the comparator arm exclusively used LC-aNAAT on respiratory specimens. The probability of being able to provide a respiratory sample was considered, and testing was carried out, either on both respiratory and urine samples concurrently or solely on urine. In both intervention and comparator arms, participants not diagnosed through the diagnostic strategy had the opportunity for clinical diagnosis. People with bacteriologically confirmed TB underwent DST for rifampicin and began either drug-susceptible TB or DR-TB treatment, depending on the DST result. All individuals were followed over time, including those with false negative or false positive diagnostic results, to account for unnecessary treatment or additional mortality due to missed diagnoses.

The cost–effectiveness results of concurrent use of LC-aNAAT with LF-LAM among people living with HIV, when used in the emblematic settings of Malawi and the Philippines, are shown in Table 2.3.1.1. In Malawi, the average cost of implementing an LC-aNAAT on a respiratory sample was US$ 276, with a corresponding average DALY of 2.44. When used concurrently with LF-LAM, the average cost rose to US$ 298, while the average DALY decreased to 1.93. The resulting incremental cost per DALY averted was US$ 42, with a 95% uncertainty range (UR) of US$ 18 to US$ 345. Similarly, in the Philippines, LC-aNAAT on a respiratory sample had an average cost of US$ 220, with an average DALY of 2.78, whereas concurrent use with LF-LAM incurred an average cost of US$ 238 and an average DALY of 2.13. The incremental cost per DALY averted was US$ 28 (95% UR: US$ 12 to US$ 249).

Table 2.3.1.1 Cost–effectiveness analysis of concurrent use of LC-aNAAT and LF-LAM among people living with HIV in Malawi and the Philippines

Table-2-3-1-1

 

DALY: disability-adjusted life year; HIV: human immunodeficiency virus; ICER: incremental cost–effectiveness ratio; LC-aNAAT: low-complexity automated nucleic acid amplification test; LF-LAM: lateral flow urine lipoarabinomannan assay; UR: uncertainty range.
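
The incremental cost per DALY averted reported for each country is the difference in average cost divided by the difference in average DALYs between the two strategies. The sketch below recomputes it from the rounded averages quoted in the text. It is illustrative only; the function name is an assumption and the underlying model is not reproduced. Because the published averages are rounded, the recomputed value for Malawi (about US$ 43) differs slightly from the reported US$ 42.

```python
def icer(cost_new: float, cost_old: float, daly_new: float, daly_old: float) -> float:
    """Incremental cost per DALY averted (new strategy versus old strategy)."""
    dalys_averted = daly_old - daly_new
    if dalys_averted <= 0:
        raise ValueError("the new strategy averts no DALYs; the ratio is not meaningful")
    return (cost_new - cost_old) / dalys_averted


# Rounded averages quoted in the text (US$ and DALYs per person).
malawi = icer(cost_new=298, cost_old=276, daly_new=1.93, daly_old=2.44)
philippines = icer(cost_new=238, cost_old=220, daly_new=2.13, daly_old=2.78)
print(f"Malawi: US$ {malawi:.0f} per DALY averted")
print(f"Philippines: US$ {philippines:.0f} per DALY averted")
```

The uncertainty ranges reported in the text come from the full model and cannot be recovered from these point estimates.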

 

More information on the cost–effectiveness analysis of concurrent use of tests in people living with HIV is available in Web Annex B.9.

User perspective

This section deals with the following question:

Are there implications for user preferences and values, equity, acceptability, feasibility and human rights from the implementation of a concurrent testing approach (LC-aNAATs + LF-LAM)?

The GDG assessed whether concurrent testing of multiple samples would increase the diagnostic accuracy (i.e. the benefit to patients or the programme in terms of finding more people with TB). Three PICO questions concerned the different concurrent sample combinations for specific groups facing challenges from reliance on respiratory samples alone (children and people living with HIV). One question focused on the concurrent use of LC-aNAAT on a respiratory sample and LF-LAM on urine for the diagnosis of TB in people living with HIV.

User preferences and values

As important outcomes of the diagnostic test, people in high TB burden settings value:

  • getting an accurate diagnosis and reaching diagnostic closure (finally knowing “what is wrong with me”);
  • avoiding diagnostic delays, as they exacerbate existing financial hardships and emotional and physical suffering and make people feel guilty for infecting others (especially children);
  • having accessible facilities; and
  • reducing diagnosis-associated costs (e.g. travel, missing work).

More details on patient-important outcomes are available in Web Annex B.10.

Equity

Concurrent specimen testing was not practiced in the interview study countries. However, it was believed to improve access to care by minimizing repeat visits and loss to follow-up. According to the interview study respondents, using non-sputum specimens has the potential to improve access to care, especially with a test that can be performed at all levels of the health care system. Challenges with producing a sufficient quality and quantity of sputum are well documented and can lead to repeat testing or false results.

Acceptability

Based on the results of the interview study, LF-LAM is being used inconsistently for people living with HIV and only for very ill patients who cannot produce sputum. Our results are in accordance with published literature on LF-LAM.

Prior research on user perspectives on LF-LAM showed that it is generally described as acceptable by key stakeholders, due to its fast turnaround time, ease of use (lack of technical expertise required), low or no maintenance and equipment required, and urine being more accessible and less stigmatized than sputum. LF-LAM is deemed particularly acceptable when used in combination with other tests and clinical considerations. As the sensitivity of LF-LAM is especially low where the pretest probability is low, participants commented that it should not be used as a standalone test but should instead be used in combination with other tests, and that the results should be interpreted by a doctor considering the full clinical context, rather than being considered in isolation.

Feasibility

Interview study findings highlighted that the benefits of LF-LAM are crucially dependent on how several feasibility challenges are addressed.

  • Hygienic, safe and private sanitary facilities with running water are necessary for LF-LAM implementation at a testing site, but they are not always available, particularly in rural areas. Investments in staffing and sanitary facilities are required.
  • Not everybody can spontaneously produce or collect urine samples. This can be the case, for example, when the patient is too ill or septic or has to be catheterized because collecting urine samples from diapers is impossible, or if the hospital has no clean, private space to produce urine.
  • Visibility of faint results and result interpretation can be problematic. Comprehensive health care worker training in test interpretation (including mandatory use of the reference reading card, where appropriate) is crucial to ensure accurate result interpretation for clinical action.
  • The need for CD4 cell count results to select people for the test is problematic because these are not always immediately available. To facilitate implementation and benefit a wider range of individuals, eliminating the CD4 cell count as an eligibility criterion for people living with HIV should be considered.
  • In a hospital setting, bedside testing may violate patient confidentiality.
  • Results must be captured in a standardized way that feeds into facility and NTP reporting systems.
  • Quality assurance schemes need to be rolled out, and external quality controls need to be made available, to ensure tests and testing processes are quality controlled.

Concurrent testing needs to be framed as a more efficient way of working (i.e. testing two samples concurrently during the same visit, instead of testing one sample during each of two separate visits) that also allows increasing access and reducing costs for patients. According to a laboratory manager, this framing of the benefits outweighing the additional workload, and potentially resulting in reduced work in the long run, will be critical to avoid concurrent testing being perceived as additional work for already overburdened health care workers (see Web Annex B.10).

Prior investments made in frontrunner technologies, donor preferences, limited health systems thinking and unnecessary competition between manufacturers all pose challenges to policy adoption and implementation of novel molecular diagnostics. In addition, national in-country health technology and cost-efficacy assessments can delay decisions to implement newer technologies and diagnostic strategies using different samples (see Web Annex B.10).

Implementation considerations
  • Global and national HIV and TB programmes need to communicate regularly and clearly, indicating responsibilities for concurrent testing for people living with HIV.
  • Concurrent testing maximizes diagnostic opportunity and accuracy of case detection, is a more efficient way to address the needs of this population and is preferred even if the testing workload may increase.
  • A positive result on either test is sufficient to confirm TB diagnosis.
  • Patient loss to follow-up for the second test result should be monitored and prevented. Patients should be provided with information to understand the concurrent testing approach and the need for follow-up.
  • The LF-LAM performed in point-of-care settings may be the first positive result and is sufficient to make the initial diagnosis. A respiratory sample is still required for rifampicin-resistance detection, and is also required when the LF-LAM result is negative.
  • Where LF-LAM is not available for testing of people living with HIV, efforts should be made to ensure access to testing.
  • LF-LAM does not differentiate Mtb from other mycobacterial species. However, the LAM antigen detected in a clinical sample in TB endemic areas is most likely attributable to Mtb.
  • When LF-LAM results are consistently positive, without positive LC-aNAAT results, investigation of the quality of testing and local epidemiology of non-tuberculous mycobacteria and extrapulmonary TB in the tested population is warranted to understand the difference.
  • Interpreting bands on the LF-LAM test strip should be performed using the manufacturer’s reading card to minimize incorrect results.
  • LF-LAM test strips must be stored according to the manufacturer’s instructions (e.g. between 2 and 30 °C) in sealed bags and not used after expiration.
  • Infrastructure to collect a urine sample privately should be available. Patients should be instructed how to properly and sanitarily collect a urine sample to minimize contamination and prevent false positive results.
  • Trained staff will be required to perform the LF-LAM test at the point of care.
  • As with all WHO-recommended TB diagnostics, quality assurance programmes and quality controls for both tests are required.
  • LF-LAM is designed to detect mycobacterial LAM antigen in human urine. Other samples (e.g. sputum, serum, plasma, CSF and other body fluids) or pooled urine specimens should not be used.
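
The interpretation rules in the list above can be summarized in a short sketch. It is a simplification under stated assumptions, not a clinical algorithm: each test is reduced to positive, negative or not yet available, invalid or indeterminate results are not modelled, and the type, function and key names are hypothetical.

```python
from enum import Enum
from typing import Optional


class Result(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"


def interpret(lc_anaat: Optional[Result], lf_lam: Optional[Result]) -> dict:
    """Summarize a concurrent testing episode; None means no result is available yet."""
    # A positive result on either test is sufficient to confirm TB.
    tb_confirmed = Result.POSITIVE in (lc_anaat, lf_lam)
    # A missing test should not delay acting on the other; it is flagged for follow-up.
    pending = [name for name, result in (("LC-aNAAT", lc_anaat), ("LF-LAM", lf_lam))
               if result is None]
    # LF-LAM gives no resistance information, so a TB diagnosis made without a
    # positive LC-aNAAT still needs rifampicin-resistance testing.
    resistance_testing_needed = tb_confirmed and lc_anaat is not Result.POSITIVE
    return {
        "tb_confirmed": tb_confirmed,
        "pending_tests": pending,
        "resistance_testing_needed": resistance_testing_needed,
    }


# Example: LF-LAM is positive at the point of care before the molecular result is back.
print(interpret(lc_anaat=None, lf_lam=Result.POSITIVE))
```

Two negative results return "tb_confirmed" as False, which means only that TB was not confirmed by these tests; it does not rule out a clinical diagnosis.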

Monitoring and evaluation

  • Monitor simultaneous specimen collection and turnaround time for the test results in a concurrent testing approach.
  • Monitor patient access to, and loss to follow-up from, a second test in a concurrent testing approach.
  • Monitor patient access to, and loss to follow-up from, follow-on DST among those with a positive LF-LAM result but a negative LC-aNAAT result.
  • Monitor trends in the discordance rate between the LF-LAM and LC-aNAAT results. If these differences vary from other local or regional patterns, or if the trends change, further investigation is required and outcomes should be tracked for recurrence over time.
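
The discordance rate mentioned in the last item above can be tracked with a simple tabulation of paired results. The following is a minimal sketch, assuming each person with both results is recorded as a pair of booleans; the function name, data layout and example data are hypothetical and are not drawn from any programme.

```python
from collections import Counter
from typing import Iterable, Tuple


def discordance_summary(pairs: Iterable[Tuple[bool, bool]]) -> dict:
    """pairs holds (lf_lam_positive, lc_anaat_positive) for people with both results."""
    counts = Counter(pairs)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no paired results to summarize")
    lam_only = counts[(True, False)]    # LF-LAM positive, LC-aNAAT negative
    naat_only = counts[(False, True)]   # LF-LAM negative, LC-aNAAT positive
    return {
        "n": total,
        "lam_pos_naat_neg": lam_only / total,
        "lam_neg_naat_pos": naat_only / total,
        "discordant": (lam_only + naat_only) / total,
    }


# Made-up example data for one reporting period.
example = [(True, True), (True, False), (False, False), (False, True)]
print(discordance_summary(example))
```

Computing the same summary for each reporting period, and comparing it with other local or regional figures, gives the trend that the monitoring item asks programmes to follow.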
Research priorities
  • Conduct more rigorous studies with higher quality reference standards, including multiple specimen types and extrapulmonary samples, to improve confidence in specificity estimates.
  • Gather evidence on the impact of concurrent testing on TB treatment initiation and mortality.
  • Determine training, competency and quality assessment needs by setting and by cadre of staff (i.e. health care worker, laboratory technician or clinical staff).
  • Perform country-specific cost–effectiveness and cost–benefit analyses of the concurrent testing approaches or sequential testing approaches in different programmatic settings.
  • Develop and apply standardized methods for assessment of costs and cost–effectiveness, to improve comparability and scope of economic evidence.
  • Perform operational research on availability, requirements and best practices for the point-of-care set-up: private specimen collection facility, tabletop space for testing samples, and reporting system (preferably digital) for entry of results, with linkages to existing information management systems (i.e. health and laboratory information management systems).
2.3.2 Concurrent use of tests in children without HIV or with unknown HIV status
Recommendations
Unamed-Table-8

 

Remarks

  • This recommendation prioritizes concurrent testing of two different sample types over the use of a single molecular test for diagnosis of TB in children.
  • Use of LC-aNAATs on isolated specimens was also evaluated. The findings supported the use of LC-aNAATs for initial diagnostic testing for TB in children with signs or symptoms or who screen positive for pulmonary TB, using a respiratory sample, gastric aspirate, stool or nasopharyngeal aspirate, rather than smear or culture.
  • This recommendation is strong despite the low certainty of evidence because the findings indicate large desirable effects (i.e. rapid and accurate diagnosis of TB in a highly vulnerable population – children – in whom diagnosing TB is often challenging) over trivial undesirable effects (i.e. negative consequences of this testing strategy) (for more details, see GRADE evidence to decision [EtD] table, Web Annex A.4).
  • The product for which eligible data met the LC-aNAAT class-based performance criteria for this recommendation was Xpert MTB/RIF Ultra. The performance of Truenat MTB Plus and MTB-RIF Dx for this recommendation could not be assessed, as data were unavailable.
Justification and evidence

LC-aNAATs on respiratory and stool samples are recommended as the first test for symptomatic children presenting with presumptive TB disease, and are widely used to diagnose TB.

Previous systematic reviews have traditionally assessed diagnostic accuracy of LC-aNAATs on two samples in isolation for the detection of TB in children, but in clinical practice the tests may be used concurrently (i.e. LC-aNAAT on a respiratory sample and a stool sample) and together they increase sensitivity.

Incremental diagnostic accuracy of concurrent testing compared with single sample testing 

What is the incremental diagnostic accuracy of concurrent use of LC-aNAATs on respiratory and stool samples for diagnosis of pulmonary TB disease in children who are HIV-negative or have an unknown HIV status, with signs and symptoms or who screened positive for pulmonary TB, compared with use of an LC-aNAAT on one sample type (either respiratory or stool)?

Eight studies (2145 participants, 173 [8.1%] of whom had TB disease) compared the accuracy of concurrent use of LC-aNAATs with respiratory and stool samples (LC-aNAATs combined) versus LC-aNAAT on one sample type (either respiratory or stool) against an MRS.

Compared with LC-aNAAT on respiratory samples alone, concurrent testing changed sensitivity by +7.1 percentage points (95% CrI: 3.2 to 13.4) and specificity by –1.7 percentage points (95% CrI: –3.8 to –0.6). Certainty of evidence for both sensitivity and specificity was low for comparison with LC-aNAAT on respiratory samples alone. Compared with LC-aNAAT on stool alone, concurrent testing changed sensitivity by +22.1 percentage points (95% CrI: 13.7 to 32.7) and specificity by –4.1 percentage points (95% CrI: –8.0 to –1.7). Certainty of evidence was moderate for sensitivity and low for specificity for comparison with LC-aNAAT on stool alone.

Twelve studies (3579 participants, 1464 [40.9%] of whom had TB disease) compared the accuracy of LC-aNAATs combined versus each LC-aNAAT alone against a CRS.

Compared with LC-aNAAT on respiratory samples alone, concurrent testing changed sensitivity by +4.7 percentage points (95% CrI: 2.1 to 8.9) and specificity by –0.5 percentage points (95% CrI: –1.4 to 0). Compared with LC-aNAAT on stool alone, concurrent testing changed sensitivity by +10.5 percentage points (95% CrI: 6.9 to 15.0) and specificity by –0.1 percentage points (95% CrI: –0.7 to –0.005). Certainty of evidence was very low for both sensitivity and specificity for both comparisons (concurrent testing versus respiratory sample alone and stool alone) under a CRS (Fig. 2.3.2.1). The data on Truenat MTB Plus and MTB-RIF Dx were unavailable.

Fig. 2.3.2.1 Forest plot of pooled sensitivity and specificity for all studies, by each index test

Fig-2-3-2-1

 

CrI: credible interval; CRS: composite reference standard; LC-aNAAT: low-complexity automated nucleic acid amplification test; MRS: microbiological reference standard; TB: tuberculosis.

The diamonds represent pooled sensitivity and specificity, and the black horizontal line its 95% CrI. The difference in accuracy between index tests is indicated by solid lines (concurrent versus stool) or dotted lines (concurrent versus respiratory) connecting the diamonds.

 

Single sample testing in children compared with the MRS

Should LC-aNAATs on respiratory samples be used to diagnose pulmonary TB in children with signs and symptoms or who screened positive for pulmonary TB, against an MRS?

Fifteen studies (3024 participants) evaluating sputum were identified, with sensitivities ranging between 57% and 91% and specificities between 82% and 100% (Fig. 2.3.2.2). Eleven studies (2990 participants) were included in the meta-analysis. The summary sensitivity was 75.3% (95% CI: 68.9–80.8) and summary specificity was 95.9% (95% CI: 92.3–97.9). Certainty of evidence was high for both sensitivity and specificity. The data on Truenat MTB Plus and MTB-RIF Dx were unavailable.

Fig. 2.3.2.2 Forest plot of LC-aNAAT sensitivity and specificity for detection of pulmonary TB in sputum samples and MRS

Fig-2-3-2-2

 

CI: confidence interval; FN: false negative; FP: false positive; LC-aNAAT: low-complexity automated nucleic acid amplification test; MRS: microbiological reference standard; TB: tuberculosis; TN: true negative; TP: true positive.

 

Should LC-aNAATs on gastric aspirate specimens be used to diagnose pulmonary TB in children with signs and symptoms or who screened positive for pulmonary TB, against an MRS?

Twelve studies (1959 participants) were identified, with sensitivities between 0% and 100% and specificities between 67% and 100% (Fig. 2.3.2.3). All 12 studies were included in the meta-analysis. The summary sensitivity was 69.6% (95% CI: 60.3–77.6) and summary specificity was 91.0% (95% CI: 82.5–95.6). Certainty of evidence was moderate for both sensitivity and specificity. The data on Truenat MTB Plus and MTB-RIF Dx were unavailable.

Fig. 2.3.2.3 Forest plot of LC-aNAAT sensitivity and specificity for detection of pulmonary TB in gastric aspirate and MRSᵃ

Fig-2-3-2-3

 

CI: confidence interval; FN: false negative; FP: false positive; LC-aNAAT: low-complexity automated nucleic acid amplification test; MRS: microbiological reference standard; TB: tuberculosis; TN: true negative; TP: true positive.

a Studies are sorted on the plot by decreasing sensitivity and specificity.

 

Should LC-aNAATs on nasopharyngeal aspirate specimens be used to diagnose pulmonary TB in children with signs and symptoms or who screened positive for pulmonary TB, against an MRS?

Seven studies (1355 participants) were identified, with sensitivities between 33% and 67% and specificities between 50% and 99% (Fig. 2.3.2.4). Six studies (1353 participants) were included in the meta-analysis. The summary sensitivity was 46.2% (95% CI: 34.9–57.9) and summary specificity was 97.5% (95% CI: 95.1–98.7). Certainty of evidence was moderate for sensitivity and high for specificity. The data on Truenat MTB Plus and MTB-RIF Dx were unavailable.

Fig. 2.3.2.4 Forest plot of LC-aNAAT sensitivity and specificity for detection of pulmonary TB in nasopharyngeal aspirate samples and MRS

Fig-2-3-2-4

 

CI: confidence interval; FN: false negative; FP: false positive; LC-aNAAT: low-complexity automated nucleic acid amplification test; MRS: microbiological reference standard; TB: tuberculosis; TN: true negative; TP: true positive.

 

Should LC-aNAATs on stool be used to diagnose pulmonary TB in children with signs and symptoms or who screened positive for pulmonary TB, against an MRS?

Ten studies (2855 participants) were identified, with sensitivities between 26% and 100% and specificities between 89% and 100% (Fig. 2.3.2.5). All 10 studies were included in the meta-analysis. The summary sensitivity was 68.0% (95% CI: 50.3–81.7) and summary specificity was 98.2% (95% CI: 96.3–99.1). Certainty of evidence was moderate for sensitivity and high for specificity. The data on Truenat MTB Plus and MTB-RIF Dx were unavailable.

Fig. 2.3.2.5 Forest plot of LC-aNAAT sensitivity and specificity for detection of pulmonary TB in stool and MRSᵃ

Fig-2-3-2-5

 

CI: confidence interval; FN: false negative; FP: false positive; LC-aNAAT: low-complexity automated nucleic acid amplification test; MRS: microbiological reference standard; TB: tuberculosis; TN: true negative; TP: true positive.

a Studies are sorted on the plot by decreasing sensitivity.
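The summary estimates for the four sample types can be translated into predictive values to illustrate what they might mean at a given pretest probability. The sketch below is illustrative only: it uses the point estimates reported above against the MRS, ignores their confidence intervals, and assumes a TB prevalence of 10% among children tested. That prevalence is a hypothetical value and not a figure from this guideline; the function and variable names are likewise illustrative.

```python
# Summary (sensitivity %, specificity %) against the MRS, copied from the text above.
SUMMARY = {
    "sputum": (75.3, 95.9),
    "gastric aspirate": (69.6, 91.0),
    "nasopharyngeal aspirate": (46.2, 97.5),
    "stool": (68.0, 98.2),
}

# Hypothetical pretest probability; NOT taken from the guideline.
ASSUMED_PREVALENCE = 0.10


def predictive_values(sens_pct, spec_pct, prevalence):
    """Return (PPV, NPV) as fractions for one test at a given prevalence."""
    sens, spec = sens_pct / 100, spec_pct / 100
    tp = sens * prevalence              # true positives per child tested
    fn = (1 - sens) * prevalence        # false negatives
    fp = (1 - spec) * (1 - prevalence)  # false positives
    tn = spec * (1 - prevalence)        # true negatives
    return tp / (tp + fp), tn / (tn + fn)


for sample, (sens, spec) in SUMMARY.items():
    ppv, npv = predictive_values(sens, spec, ASSUMED_PREVALENCE)
    print(f"{sample}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

Because predictive values depend strongly on prevalence, the assumed value should be replaced with a locally relevant estimate before drawing any conclusion from such a calculation.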

 

Cost–effectiveness analysis

As part of the preparatory process for the GDG meeting in May 2024, WHO commissioned a modelled study to assess the cost–effectiveness of using LC-aNAATs (including Xpert Ultra, Truenat and other novel LC-aNAATs in the development pipeline) for the detection of TB when used concurrently among people living with HIV and children, including children with HIV, across two different country settings (Malawi and the Philippines).

A study objective was to assess the cost–effectiveness of concurrent use of LC-aNAATs on respiratory and stool samples for TB diagnosis and rifampicin-resistance detection among children (aged <10 years) with presumptive TB and without HIV infection, compared with a single LC-aNAAT on a respiratory sample alone.

In this hypothetical model, a cohort of children with presumptive TB progressed through a decision analytical framework. In the intervention arm, TB diagnosis involved the concurrent use of LC-aNAATs on both respiratory and stool samples, whereas the comparator arm solely used LC-aNAATs on respiratory specimens. The probability of being able to provide a respiratory sample was considered, and testing was conducted, either for both respiratory and stool samples concurrently or solely for stool. In both the intervention and comparator arms, participants not diagnosed through the diagnostic strategy had the opportunity for clinical diagnosis. Children with bacteriologically confirmed TB underwent DST for rifampicin and began either drug-susceptible TB or DR-TB treatment, depending on the DST result. All individuals were followed over time, including those with false negative or false positive diagnostic results, to account for unnecessary treatment or additional mortality due to missed diagnoses.

When using the high TB burden setting of Malawi to parametrize the model, cost–effectiveness modelling found that the use of an LC-aNAAT on a respiratory sample resulted in an average cost of US$ 144, with a corresponding average DALY of 0.93. In contrast, the concurrent use of LC-aNAATs on respiratory and stool samples yielded an average cost of US$ 204, and a DALY of 0.57, resulting in an incremental cost per DALY averted of US$ 253 (95% UR: 123–2317) (Table 2.3.2.1).

Similarly, in the Philippines, the cost of an LC-aNAAT on a respiratory sample was US$ 84, associated with a DALY of 1.04. Concurrent testing in the Philippines resulted in an average cost of US$ 149 and a DALY of 0.66, with an ICER of US$ 156 per DALY averted (95% UR: 79–888) (Table 2.3.2.1).

Table 2.3.2.1 Cost–effectiveness analysis of concurrent use of LC-aNAATs among children in Malawi and the Philippines

Table-2-3-2-1

 

DALY: disability-adjusted life year; ICER: incremental cost–effectiveness ratio; LC-aNAAT: low-complexity automated nucleic acid amplification test; UR: uncertainty range.

 

More information on the cost–effectiveness analysis of concurrent use of tests in children is available in Web Annex B.9.

User perspective

The GDG assessed whether concurrent testing of multiple samples would increase the diagnostic yield (i.e. the benefit to patients or the programme in terms of finding more people with TB). One of the PICO questions focused on the concurrent use of LC-aNAATs on respiratory and stool samples for the diagnosis of TB in children.

User preferences and values

As important outcomes of the diagnostic test, people in high TB burden settings value:

  • getting an accurate diagnosis and reaching diagnostic closure (finally knowing “what is wrong with me”);
  • avoiding diagnostic delays, as they exacerbate existing financial hardships and emotional and physical suffering and make people feel guilty for infecting others (especially children);
  • having accessible facilities; and
  • reducing diagnosis-associated costs (e.g. travel, missing work).

Participants appreciate that stool collection is far less invasive than gastric lavage and can thereby reduce physical and emotional suffering of children and their parents (see Web Annex B.10).

Equity

Concurrent specimen testing was not practiced in the interview study countries. However, it was believed to improve access to care by minimizing repeat visits and loss to follow-up.

According to the interview study respondents, using non-sputum specimens has the potential to improve access to care, especially with a test that can be performed at all levels of the health care system. Challenges with producing a sufficient quality and quantity of sputum are well documented and can lead to repeat testing or false results.

Acceptability

Most participants, including health workers and caregivers, did not immediately understand why multiple samples would be tested concurrently at the same visit, if a respiratory sample is available. They highlighted that a sputum sample is the preferred choice, and they would only collect the second-best sample if that were not available. However, participants also thought that concurrent sample testing could be possible if there was a WHO recommendation, altered diagnostic algorithms and specific training and capacity strengthening to facilitate it (see Web Annex B.10).

For young children, stool seems to be an acceptable specimen, especially after adequate training in how to process it. Stool from adults is considered more difficult in terms of both acceptance and processing time. In general, participants had confidence in the results from stool tested by GeneXpert (see Web Annex B.10).

Feasibility

Important feasibility challenges are related to the deteriorating quality of stool, caused by delays between time of collection and time of processing in the laboratory (see Web Annex B.10).

Concurrent testing needs to be framed as a more efficient way of working (i.e. testing two samples concurrently during the same visit, instead of testing one sample during each of two separate visits) that also allows increasing access and reducing costs for patients. The practice of concurrent testing needs to be framed as generating sufficient benefit to justify the additional short-term workload and having the potential to reduce the workload in the longer term. Without such framing, there is a risk that already overburdened health care workers will avoid concurrent testing (see Web Annex B.10).

Prior investments made in frontrunner technologies, donor preferences, limited health systems thinking and unnecessary competition between manufacturers all pose challenges to policy adoption and implementation of novel molecular diagnostics. In addition, national in-country health technology and cost-efficacy assessments can delay decisions to implement newer technologies and diagnostic strategies using different samples (see Web Annex B.10).

More information on the qualitative evidence analysis and synthesis for concurrent use of tests in children is available from Web Annex B.10.

Implementation considerations

  • Concurrent testing maximizes diagnostic opportunity and accuracy of case detection, is a more efficient way to address the needs of this population and is preferred even if the testing workload may increase.
  • A positive result on either test is sufficient to confirm TB diagnosis.
  • Patient loss to follow-up for the second test result should be monitored and prevented. Patients should be provided with information to understand the concurrent testing approach and the need for follow-up.
  • Testing capacity should be secured for the second test, as volumes will increase.
  • Adequate staffing capacity and training are needed to improve the collection of different sample types and laboratory processing of collected samples.
  • Performing the same test on a new sample may need additional regulatory approval on a national and international level.
  • Infrastructure and training on how to collect a stool sample privately should be available.
  • As with all WHO-recommended TB diagnostics, quality assurance programmes for both sample types are required.
  • At a primary health care level, in a situation of sputum paucity or absence, collection of stool and nasopharyngeal aspirate may be feasible, whereas collection of more invasive specimen types (i.e. induced sputum, BAL and gastric aspirate) would require upward referral, depending on local capacity and expertise. In these circumstances, performing stool testing at primary health care level and waiting for a test result before upward referral of the child may be appropriate.

Monitoring and evaluation

  • Monitor simultaneous specimen collection and turnaround time for the test results in a concurrent testing approach.
  • Monitor patient loss to follow-up from a second test in a concurrent testing approach.
  • Monitor trends in the rate of indeterminate test results for both sample types with LC-aNAATs.
  • Monitor trends in the discordance rate between the respiratory and stool LC-aNAAT results. If these differences vary from other local or regional patterns, or if the trends change, further investigation is required.

Research priorities

  • Evaluate the impact of concurrent specimen testing on patient-important outcomes for children (cure, mortality, time to diagnosis and time to start of treatment).
  • Evaluate the impact of concurrent specimen testing on affordability and cost–effectiveness in the intended settings of use.
  • Evaluate the performance of other LC-aNAATs in concurrent testing approaches.
  • Identify an improved reference standard that accurately defines TB disease in children and paucibacillary specimens because the sensitivity of all available diagnostics is suboptimal.
  • Develop new tools that correctly diagnose a higher proportion of TB in children. Ideally, the new tools will be rapid, affordable, feasible and acceptable to children and their parents.
  • Develop rapid point-of-care diagnostic tests and simpler alternative sample types for paucibacillary and extrapulmonary TB in children.
  • Perform operational research to ensure that tests are used optimally in intended settings.
  • Develop and apply standardized methods for assessment of costs and cost–effectiveness, to improve comparability and scope of economic evidence.

2.3.3 Concurrent use of tests in children with HIV

Recommendations

Unamed-table-9

 

Remarks

  • This recommendation prioritizes concurrent testing over the use of molecular testing and LF-LAM in isolation for diagnosis of TB in children with HIV.
  • Use of LC-aNAATs on isolated specimens was also evaluated. The findings supported the use of LC-aNAATs for initial diagnostic testing for TB in HIV-positive children with signs or symptoms or who screen positive for pulmonary TB, using sputum, gastric aspirate, stool or nasopharyngeal aspirate, rather than smear or culture.
  • This recommendation is conditional because the findings indicate moderate undesirable effects (i.e. decreased specificity, resulting in more false positive test results) when compared with a single test strategy.
  • The product for which eligible data met the LC-aNAAT class-based performance criteria for this recommendation was Xpert MTB/RIF Ultra. The performance of Truenat MTB Plus and MTB-RIF Dx for this recommendation could not be assessed, as data were unavailable.

Justification and evidence

LC-aNAATs on respiratory and stool samples and LF-LAM on urine are recommended as the first tests for symptomatic children with HIV presenting with presumptive TB disease, and should be used to diagnose TB.

Previous systematic reviews have traditionally assessed diagnostic accuracy of LC-aNAATs on two samples and LF-LAM on urine in isolation for the detection of TB in children, but in clinical practice the tests may be used concurrently (i.e. LC-aNAAT on a respiratory and stool sample and LF-LAM on urine) and together they increase sensitivity.
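The effect of combining tests can be illustrated with a small sketch of the "positive on any test" rule that defines concurrent use. The ten records below are invented toy data, not study results, and the function names are hypothetical; the example only shows why this rule can raise sensitivity while lowering specificity.

```python
def concurrent_result(*results):
    """Combine per-sample results: positive if any single test is positive."""
    return any(results)


def sensitivity_specificity(index_results, reference_results):
    """Return (sensitivity, specificity) of index results against a reference."""
    pairs = list(zip(index_results, reference_results))
    tp = sum(1 for idx, ref in pairs if idx and ref)
    fn = sum(1 for idx, ref in pairs if not idx and ref)
    tn = sum(1 for idx, ref in pairs if not idx and not ref)
    fp = sum(1 for idx, ref in pairs if idx and not ref)
    return tp / (tp + fn), tn / (tn + fp)


# Invented toy data: (respiratory LC-aNAAT, stool LC-aNAAT, urine LF-LAM, reference)
CHILDREN = [
    (True, True, False, True),
    (True, False, False, True),
    (False, True, False, True),
    (False, False, True, True),
    (False, False, False, True),
    (False, False, False, False),
    (False, False, False, False),
    (False, False, True, False),
    (False, False, False, False),
    (False, False, False, False),
]

reference = [ref for _, _, _, ref in CHILDREN]
respiratory_only = [resp for resp, _, _, _ in CHILDREN]
concurrent = [concurrent_result(resp, stool, lam) for resp, stool, lam, _ in CHILDREN]

print("respiratory alone:", sensitivity_specificity(respiratory_only, reference))
print("concurrent:", sensitivity_specificity(concurrent, reference))
```

In this toy set, the combination finds children missed by the respiratory test alone, but a single false positive on any component test also makes the combined result a false positive.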

Incremental diagnostic accuracy

What is the incremental diagnostic accuracy of concurrent use of LC-aNAATs on respiratory and stool samples and LF-LAM on urine versus each sample type alone for diagnosis of pulmonary TB disease in children with HIV, with signs and symptoms or who screened positive for pulmonary TB, compared with any of the tests (either LC-aNAATs combined or LF-LAM) alone?

Based on six studies (653 participants, including 43 [6.6%] with TB) included in the meta-analysis for an MRS, the concurrent use of LC-aNAAT on respiratory samples plus LC-aNAAT on stool and LF-LAM on urine had a pooled sensitivity of 77.8% (95% CrI: 59.9 to 89.8) and a pooled specificity of 83.9% (95% CrI: 73.9 to 90.4) (Fig. 2.3.3.1). Compared with LC-aNAAT on respiratory samples alone, concurrent testing had higher sensitivity (difference: 6.9 percentage points; 95% CrI: 1.5 to 20.1) and lower specificity (difference: –10.1 percentage points; 95% CrI: –21.6 to –4.9). Certainty of evidence was moderate for sensitivity and low for specificity.

Based on six studies (674 participants, including 286 [42.4%] with TB) included in the meta-analysis for a CRS, the concurrent use of LC-aNAAT on respiratory samples plus LC-aNAAT on stool and LF-LAM on urine had a pooled sensitivity of 30.1% (95% CrI: 13.2 to 54.9) and a pooled specificity of 83.3% (95% CrI: 69.6 to 90.2) (Fig. 2.3.3.1). Compared with LC-aNAAT on respiratory samples alone, concurrent testing had higher sensitivity (difference: 14.9 percentage points; 95% CrI: 0 to 41.1) and lower specificity (difference: –12.0 percentage points; 95% CrI: –27.0 to –2.6). Certainty of evidence was very low for sensitivity and low for specificity.
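To make the trade-off in the MRS comparison more concrete, the differences in sensitivity and specificity can be expressed per 1000 children tested. The sketch below is a rough illustration under stated assumptions: it uses only the point estimates of the differences (6.9 and –10.1 percentage points), ignores the credible intervals, and takes the proportion with TB among study participants (43 of 653) as the prevalence, which may not reflect any particular setting. The function name is hypothetical.

```python
def incremental_per_1000(prevalence, delta_sens_pp, delta_spec_pp, cohort=1000):
    """Return (extra true positives, extra false positives) per cohort.

    delta_sens_pp and delta_spec_pp are differences in percentage points
    (concurrent testing minus respiratory sample alone).
    """
    with_tb = cohort * prevalence
    without_tb = cohort - with_tb
    extra_true_positives = with_tb * delta_sens_pp / 100
    # A negative change in specificity means more false positives.
    extra_false_positives = without_tb * (-delta_spec_pp) / 100
    return extra_true_positives, extra_false_positives


# Prevalence among participants in the MRS meta-analysis (43 of 653).
prevalence = 43 / 653
extra_tp, extra_fp = incremental_per_1000(prevalence, 6.9, -10.1)
print(f"Extra children with TB detected per 1000 tested: {extra_tp:.1f}")
print(f"Extra false positive results per 1000 tested: {extra_fp:.1f}")
```

Under these assumptions, the calculation gives roughly 5 additional children with TB detected and roughly 94 additional false positive results per 1000 children tested. Results counted as false positive against an MRS may include children with TB that was not microbiologically confirmed, so the figures should be read with that limitation in mind.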

Fig. 2.3.3.1 Forest plot of pooled sensitivity and specificity for all studies, by each index testᵃ

Fig-2-3-3-1

 

CrI: credible interval; CRS: composite reference standard; LC-aNAAT: low-complexity automated nucleic acid amplification test; LF-LAM: lateral flow urine lipoarabinomannan assay; MRS: microbiological reference standard; TB: tuberculosis.

a The diamonds represent pooled sensitivity and specificity, and the black horizontal line its 95% CrI. The difference in accuracy between index tests is indicated by solid lines (concurrent versus stool) or dotted lines (concurrent versus respiratory) connecting the diamonds.

 

Cost–effectiveness analysis

In addition to the economic evidence regarding concurrent use of tests in people living with HIV and children (see Sections 2.3.1 and 2.3.2), WHO commissioned a third study that aimed to assess the cost–effectiveness of using LC-aNAATs (including Xpert Ultra, Truenat and other novel LC-aNAATs in the development pipeline) for the detection of TB when used concurrently among children with HIV, across two different country settings (Malawi and the Philippines).

An objective of this study was to assess the cost–effectiveness of concurrent use of LC-aNAATs on respiratory and stool samples and LF-LAM on urine for TB diagnosis and rifampicin-resistance detection among children (aged <10 years) living with HIV and with presumptive TB, compared with a single LC-aNAAT on a respiratory sample alone.

In the hypothetical model that informed this study, a cohort of children with HIV and with signs and symptoms of TB progressed through a decision analytical framework. In the intervention arm, TB diagnosis involved the concurrent use of LC-aNAATs on both respiratory and stool samples, alongside LF-LAM on urine. The comparator arm used LC-aNAAT on respiratory samples alone. The probability of providing a respiratory sample was considered, and testing was conducted, either concurrently on respiratory and stool samples alongside LF-LAM, or on stool alone alongside LF-LAM. In both the intervention and comparator arms, participants not diagnosed through the diagnostic strategy had the opportunity for clinical diagnosis. Children with bacteriologically confirmed TB underwent DST for rifampicin and began either drug-susceptible TB or DR-TB treatment, depending on the DST result. All individuals were followed over time, including those with false negative or false positive diagnostic results, to account for unnecessary treatment or additional mortality due to missed diagnoses.

Table 2.3.3.1 shows the cost–effectiveness of the concurrent use of LC-aNAATs on respiratory and stool samples and LF-LAM on urine among children with HIV in Malawi and the Philippines. In Malawi, the average cost of using an LC-aNAAT on a respiratory sample alone was US$ 319, with a corresponding average DALY of 5.08. With concurrent use, the average cost increased to US$ 460, while the average DALY decreased to 1.8. The resulting ICER per DALY averted was US$ 43 (95% UR: 28–89). Similarly, in the Philippines, use of an LC-aNAAT on a respiratory sample alone cost US$ 249, with an average DALY of 5.13, whereas concurrent use incurred an average cost of US$ 345 and an average DALY of 1.77. The ICER per DALY averted was US$ 29 (95% UR: 18–63).
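The relationship between the reported averages and the ICER can be checked with a short calculation: the ICER is the difference in average cost divided by the DALYs averted. The sketch below is a simplified arithmetic check using the rounded averages quoted above. It is not the commissioned model and does not reproduce the uncertainty ranges, and the function name is illustrative.

```python
def icer(cost_comparator, daly_comparator, cost_intervention, daly_intervention):
    """Incremental cost per DALY averted, intervention versus comparator."""
    incremental_cost = cost_intervention - cost_comparator
    dalys_averted = daly_comparator - daly_intervention
    if dalys_averted <= 0:
        raise ValueError("intervention averts no DALYs; ICER is not meaningful")
    return incremental_cost / dalys_averted


# Rounded averages quoted in the text (US$ and DALYs per child).
malawi = icer(319, 5.08, 460, 1.8)
philippines = icer(249, 5.13, 345, 1.77)

print(f"Malawi: US$ {malawi:.0f} per DALY averted")
print(f"Philippines: US$ {philippines:.0f} per DALY averted")
```

With these inputs, the calculation gives about US$ 43 per DALY averted for Malawi and about US$ 29 for the Philippines, in line with the reported ICERs. Small discrepancies can arise in similar checks because published averages are rounded.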

Table 2.3.3.1 Cost–effectiveness analysis of concurrent use of LC-aNAATs among children living with HIV in Malawi and the Philippines

Table-2-3-3-1

 

DALY: disability-adjusted life year; HIV: human immunodeficiency virus; ICER: incremental cost–effectiveness ratio; LC-aNAAT: low-complexity automated nucleic acid amplification test; LF-LAM: lateral flow urine lipoarabinomannan assay; UR: uncertainty range.

 

More information on the cost–effectiveness analysis of concurrent use of tests in children with HIV is available in Web Annex B.9.

User perspective

The GDG assessed whether concurrent testing of multiple samples would increase the diagnostic yield (i.e. the benefit to patients or the programme in terms of finding more people with TB). One of the PICO questions focused on concurrent use of LC-aNAATs on respiratory and stool samples and LF-LAM on urine for the diagnosis of TB in children living with HIV.

User preferences and values

The interview study and qualitative evidence synthesis produced no data on the use of LF-LAM in children living with HIV. However, in general, as important outcomes of the diagnostic test, patients in high TB burden settings value:

  • getting an accurate diagnosis and reaching diagnostic closure (finally knowing “what is wrong with me”);
  • avoiding diagnostic delays, as they exacerbate existing financial hardships and emotional and physical suffering and make people feel guilty for infecting others (especially children);
  • having accessible facilities; and
  • reducing diagnosis-associated costs (e.g. travel, missing work).

Participants appreciate that stool sample collection is far less invasive than gastric aspirate (see Web Annex B.10).

Equity

Concurrent sample testing was not practiced in the study countries. However, concurrent sample testing could improve access to care by minimizing repeat visits and loss to follow-up (see Web Annex B.10).

Using non-sputum samples can improve access to care, especially with a test that can be performed at all levels of the health care system. Challenges with producing sputum of sufficient quality and quantity are well documented and can lead to repeat testing or false results. Participants highlighted the impact that using stool has on increasing case-finding and access to care, particularly among destitute families (see Web Annex B.10).

Acceptability

The interview study produced no data on the use of LF-LAM in children living with HIV.

Most participants (including parents and legal representatives of children) did not immediately understand why multiple samples would be tested concurrently at the same visit, if a respiratory sample is available. They highlighted that a sputum sample is the preferred choice, and they would only collect the second-best sample if that were not available. However, participants also thought that concurrent sample testing could be possible if there was a WHO recommendation, altered diagnostic algorithms and specific training and capacity strengthening to facilitate it (see Web Annex B.10).

For young children, stool seems to be an acceptable specimen, especially after adequate training in how to process it. Stool from adults is considered more difficult in terms of both acceptance and processing time. There was general confidence among participants regarding results from stool tested by GeneXpert (see Web Annex B.10).

Feasibility

The interview study produced no data on the use of LF-LAM in children living with HIV. In younger and sicker children, urine sample collection is more cumbersome, as it requires both the child’s and the caregiver’s cooperation, and it may be affected by medical issues, such as dehydration (see Web Annex B.10).

Important feasibility challenges are related to the deteriorating quality of the stool sample caused by delays between time of collection and time of processing in the laboratory (see Web Annex B.10).

Concurrent testing needs to be framed as a more efficient way of working (i.e. testing two samples concurrently during the same visit, instead of testing one sample during each of two separate visits) that also allows increasing access and reducing costs for patients. According to a laboratory manager, this framing of the benefits outweighing the additional workload, and potentially resulting in reduced work in the long run, will be critical to avoid concurrent testing being perceived as additional work for already overburdened health care workers (see Web Annex B.10).

Prior investments made in frontrunner technologies, donor preferences, limited health systems thinking and unnecessary competition between manufacturers all pose challenges to policy adoption and implementation of novel molecular diagnostics. In addition, national in-country health technology and cost-efficacy assessments can delay decisions to implement newer technologies and diagnostic strategies using different samples (see Web Annex B.10).

Implementation considerations

  • The implementation considerations are the same as those in Sections 2.3.1 and 2.3.2.

Monitoring and evaluation

  • The monitoring and evaluation considerations are the same as those in Sections 2.3.1 and 2.3.2.

Research priorities

  • The research priorities are the same as those in Sections 2.3.1 and 2.3.2.
 

10 Concurrent use of tests: samples are taken simultaneously (when possible), and testing is conducted for both tests. A positive result on either test is a positive result for the combination.

11 Incremental change in diagnostic accuracy with concurrent testing compared with individual sample testing.
