Contact tracing reveals community transmission of COVID-19 in New York City

Contact tracing in NYC

The NYC Test & Trace Corps initiative was launched in June 2020 (ref. 20). Established as an operation to provide contact tracing, testing, and resources to support isolation and quarantine, the contact tracing program was integrated with a set of intervention efforts designed to limit morbidity and mortality from COVID-19 in NYC (Supplementary Information). Contact tracing was performed through phone calls and text messages, which could reach most residents of NYC. Specifically, contact tracers made phone calls to confirmed cases and symptomatic contacts to conduct a case investigation. For children under 18 years old, parents or legal guardians were contacted. Information about close contacts during the infectious period was elicited during the interview, and reported close contacts were then notified of their exposure status through phone calls or text messages and were encouraged to get tested. Both confirmed/probable cases and their close contacts were monitored daily for the duration of their quarantine.

We analyzed data obtained from case investigations and COVID-19 testing results (molecular and antigen) collected between 1 October 2020 and 10 May 2021 (Supplementary Fig. 1, Supplementary Information). During this period, 691,834 confirmed and probable cases were reported to the New York City Department of Health and Mental Hygiene (DOHMH)21. The circulating strains of SARS-CoV-2 in NYC were dominated by the index virus strain; however, the Iota (B.1.526) and Alpha (B.1.1.7) variants gradually replaced the index virus during the spring of 2021 (Supplementary Fig. 2). After excluding cases residing in residential congregate settings, cases were sent to the NYC Test & Trace Corps for contact tracing. Among these cases, 644,029 were reached by tracers and 450,415 completed an interview. In total, 779,011 contacts with confirmed and probable cases were self-reported via case investigations, of whom 20.9% (162,659/779,011) were subsequently tested. The overall positivity rate among tested exposures was 55.8%. However, as infected individuals were more likely to seek tests, the actual secondary attack rate is likely lower. We further disaggregated testing results for different exposure types (healthcare facility contact, home health aide, household member, intimate partner, large gathering contact, other close proximity, workplace contact) (Supplementary Fig. 3). The positivity rate was highest for household members and lowest for workplace contacts. The median time from specimen collection to reporting results to DOHMH was 2 days. Of index patients, 97% were called by tracers within two days of reporting to DOHMH (Fig. 1a), and 68.4% of contacts were called on the day of reporting to the Test & Trace team (Fig. 1b). Among tested contacts, 66.6% sought testing within one week of notification (Fig. 1c). For traced symptomatic infections, 86.7% were tested after symptom onset, and 13.3% were tested before symptom development (Fig. 1d).

Fig. 1: Key statistics of contact tracing in NYC.

a–d The distributions of: a time between reporting date for index cases and being called by contact tracers; b time between calling index cases and notifying exposed persons; c time between notifying exposed persons and specimen sampling of notified individuals who were tested; d time from symptom onset to specimen sampling for symptomatic COVID infections. A negative value implies that testing preceded symptom onset. Age distributions of index cases (e) and self-reported contacts (f). The contact mixing matrix (g) shows the total number of exposures among age groups reported during the study period.

Adults aged 20 to 49 years old constituted the majority of index cases (Fig. 1e), a finding in agreement with the age distribution of confirmed infections in the United States22. Self-reported contacts were more uniformly distributed among the population under 50 years old (Fig. 1f). The age-stratified contact matrix highlights more frequent interactions among individuals of similar age and inter-generation mixing within the household (Fig. 1g), a pattern also observed in other countries23.
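An age-stratified contact matrix of the kind described above can be tabulated directly from pairs of index cases and reported contacts. The following is a minimal sketch under stated assumptions: the ten-year age bins, the function names, and the toy records are hypothetical and are not taken from the study data.

```python
# Hypothetical lower bounds of ten-year age bins; the study's exact grouping may differ.
AGE_BINS = [0, 10, 20, 30, 40, 50, 60, 70, 80]


def age_group(age):
    """Return the index of the last bin whose lower bound does not exceed `age`."""
    idx = 0
    for i, lower in enumerate(AGE_BINS):
        if age >= lower:
            idx = i
    return idx


def contact_matrix(pairs):
    """Count exposures by (index-case age group, contact age group)."""
    n = len(AGE_BINS)
    matrix = [[0] * n for _ in range(n)]
    for index_age, contact_age in pairs:
        matrix[age_group(index_age)][age_group(contact_age)] += 1
    return matrix


# Toy records: (age of index case, age of reported contact).
pairs = [(34, 36), (34, 5), (41, 72), (25, 27), (67, 70)]
matrix = contact_matrix(pairs)
total_exposures = sum(sum(row) for row in matrix)
```

Entries near the diagonal count contacts of similar age, while off-diagonal bands such as the (30s, under 10) cell reflect mixing across generations.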

Exposure and transmission networks

We reconstructed the self-reported exposure network at the individual level for the study period. The exposure network was highly fragmented, with 947,042 individuals in 242,486 disjoint clusters. Cluster size showed considerable heterogeneity (Fig. 2a), as did the number of contacts reported by each index case (Fig. 2b). We visualize several large exposure clusters in Fig. 2c, color-coded by the home borough of each person. Exposure clusters exhibit diverse structures ranging from hub-and-spoke networks with a single spreader to networks with multiple spreaders. Over half of the clusters shown in Fig. 2c were in Queens and Brooklyn. Within those large exposure clusters in Fig. 2c, 1195 index patients (59.4%) reported contacts living in the same borough, but 817 (40.6%) cross-borough contacts were also recorded.
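The disjoint clusters of an undirected exposure network are its connected components. As an illustration only, the sketch below finds them with a union-find structure; the function names and the toy edge list are hypothetical, and the study's actual implementation is not described in this section.

```python
from collections import Counter


def find(parent, x):
    """Return the root of x, compressing the path along the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def exposure_cluster_sizes(edges):
    """Return the sizes of connected components, largest first."""
    parent = {}
    for a, b in edges:
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        root_a, root_b = find(parent, a), find(parent, b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two clusters
    sizes = Counter(find(parent, node) for node in parent)
    return sorted(sizes.values(), reverse=True)


# Toy undirected links between index cases and their reported contacts.
edges = [("A", "B"), ("A", "C"), ("C", "D"), ("E", "F"), ("G", "H")]
cluster_sizes = exposure_cluster_sizes(edges)

# Number of contacts reported by each index case (first element of each link).
contacts_per_index = Counter(index for index, _ in edges)
```

Here the toy network splits into one cluster of four individuals and two clusters of two, the same kind of size distribution summarized in Fig. 2a.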

Fig. 2: Structure of exposure and transmission networks.

a, b The distributions of cluster size and number of close contacts reported by each index case in the exposure network. Exposure clusters with more than 35 individuals are visualized in (c). The exposure network is undirected. Index cases and reported close contacts are connected. Node size is proportional to the number of connected individuals. Colors indicate the home location of each person (five boroughs in NYC, outside NYC, and unknown). The distributions of cluster size and the number of secondary cases in the transmission network are shown in (d) and (e), respectively. f Visualizes transmission clusters with more than six infected individuals. Node size represents the number of secondary cases. Arrows indicate the direction of transmission.

We additionally reconstructed transmission chains between index cases and their close contacts who were confirmed positive in laboratory tests (molecular and antigen). Due to asymptomatic and pre-symptomatic shedding24,25,26, index cases were not necessarily the source of infections in these putative transmission events. To infer the direction of transmission, we estimated the infection date of lab-positive cases. For symptomatic cases, infection date was estimated from the symptom onset date using an empirical incubation period distribution obtained from a prior study18; for asymptomatic cases, infection date was estimated from the specimen collection date using a model of viral load dynamics coupled with Bayesian inference (Supplementary Fig. 4)27. For each index case and close contact pair, the direction of transmission was then determined by the estimated infection times: the individual infected earlier is the infector and the individual infected later is the infectee. We sampled an ensemble of possible transmission networks compatible with the estimated chronological order of infections. For each sampled transmission network, we computed the likelihood of observing the network given transmission probabilities across age groups, estimated using the test and trace data (Supplementary Table 1, Supplementary Fig. 5). The reconstructed network was selected as the one that maximizes the likelihood among the ensemble of possible transmission networks. We further performed sensitivity analyses demonstrating that the network reconstruction is robust to potential bias of the incubation period distribution28 (Supplementary Information). More details on the transmission network reconstruction are provided in the Supplementary Information.
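The orientation and selection steps can be illustrated with a simplified sketch. It is not the study's implementation: it handles symptomatic cases only, the lognormal incubation parameters and the group-to-group transmission probabilities are hypothetical placeholders, and the likelihood is reduced to a product of per-link probabilities.

```python
import math
import random

# Hypothetical lognormal incubation-period parameters (days); placeholders only.
INCUBATION_MU, INCUBATION_SIGMA = 1.6, 0.4

# Hypothetical transmission probabilities keyed by (infector group, infectee group).
TRANSMISSION_PROB = {
    ("adult", "adult"): 0.30, ("adult", "child"): 0.20,
    ("child", "adult"): 0.15, ("child", "child"): 0.10,
}


def sample_network(pairs, cases, rng):
    """Draw one infection day per case, then orient each pair from earlier to later."""
    infection_day = {
        cid: c["onset_day"] - rng.lognormvariate(INCUBATION_MU, INCUBATION_SIGMA)
        for cid, c in cases.items()
    }
    network = []
    for a, b in pairs:
        if infection_day[a] <= infection_day[b]:
            network.append((a, b))  # a infected first, so a is the infector
        else:
            network.append((b, a))
    return network


def log_likelihood(network, cases):
    """Sum of log transmission probabilities over the directed links."""
    return sum(
        math.log(TRANSMISSION_PROB[(cases[i]["group"], cases[j]["group"])])
        for i, j in network
    )


def reconstruct(pairs, cases, n_samples=200, seed=0):
    """Sample an ensemble of networks and keep the one with the highest likelihood."""
    rng = random.Random(seed)
    ensemble = [sample_network(pairs, cases, rng) for _ in range(n_samples)]
    return max(ensemble, key=lambda net: log_likelihood(net, cases))


# Toy data: symptom onset day and a coarse age group for each lab-positive case.
cases = {
    "p1": {"onset_day": 10.0, "group": "adult"},
    "p2": {"onset_day": 11.0, "group": "child"},
    "p3": {"onset_day": 60.0, "group": "adult"},
}
pairs = [("p1", "p2"), ("p2", "p3")]
best_network = reconstruct(pairs, cases)
```

In the toy data the onsets of p1 and p2 are one day apart, so both orientations appear in the ensemble and the likelihood decides between them, whereas the late onset of p3 fixes the direction of the second link in every sample.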

During the study period, we identified 58,474 potential transmission clusters formed by exposures that resulted in lab-confirmed infections. These transmission clusters had a mean size of 2.3 individuals and together represented 19.6% (135,478/691,834) of recorded cases during the study period. However, transmission cluster size and the number of secondary cases linked to each index case had large variance (Fig. 2d, e); only 0.2% of transmission clusters involved more than 6 infections. The largest identified transmission cluster consisted of 12 cases, and the maximum number of secondary cases for a single index case was 7. Transmission clusters with at least 6 infections are visualized in Fig. 2f.

To quantify the spatial spread of SARS-CoV-2 in NYC at fine geographical scales, we mapped exposure and transmission networks across modified ZIP code tabulation areas (MODZCTAs, referred to as ZIP codes hereafter; Fig. 3a, b). Among 72,191 transmission events where place of residence was known, 7826 (10.8%) included multiple ZIP codes. Among these cross-ZIP code transmission events, only 2536 (32.4%) occurred between neighboring ZIP code areas, indicating that the majority of cross-ZIP code transmission drove non-local disease spread. For 2187 cross-borough transmission events, only 48 (2.2%) were between neighboring ZIP code areas. We observed several local clusters of ZIP codes that were tightly interconnected by exposure and transmission, centered around locations with high community prevalence. Infections in those high-prevalence ZIP code clusters were linked to self-reported contacts in nearby and far locations (Fig. 3a), which may have facilitated the spread of COVID-19 across the city (Fig. 3b). Among the cross-ZIP code transmission chains, we examined distributions of index cases who initiated transmission (Fig. 3c) and the infected contacts (Fig. 3d) across ZIP codes. A distinct skew in the distribution suggests that certain ZIP codes were more involved in the spatial spread of COVID-19. Geographically, most cross-ZIP code transmission events occurred within 10 km; however, long-distance transmission up to 40 km was also evident (Fig. 3e).

Fig. 3: Spatial transmission of SARS-CoV-2 in NYC.

a, b The exposures and transmission events across ZIP codes in NYC identified from contact tracing data. Arrows indicate direction of exposure (from index cases to reported close contacts) and transmission (from index infections to infected contacts). Arrow thickness indicates the number of exposures and transmission events. ZIP code area color represents the cumulative number of confirmed cases during the study period (yellow to red, low to high). For clearer visualization, exposure links with fewer than 30 events and transmission links with fewer than 2 events are not shown on the maps. For cross-ZIP code transmission events, the distributions of index infections and infected contacts across ZIP code areas are presented in c and d. e The distribution of distance between home ZIP codes of index infections and infected contacts in cross-ZIP code transmission events. The population-weighted centroids of ZIP code areas were used to compute the distance.
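One common way to compute the distance between two centroids is the great-circle (haversine) formula. The sketch below assumes that choice, which the text does not confirm, and the centroid coordinates are hypothetical rather than actual MODZCTA centroids.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical population-weighted centroids (latitude, longitude) of two ZIP code areas.
centroids = {"zip_a": (40.75, -73.99), "zip_b": (40.70, -73.80)}

# One toy cross-ZIP code transmission event: (infector's ZIP, infectee's ZIP).
events = [("zip_a", "zip_b")]
distances_km = [haversine_km(*centroids[src], *centroids[dst]) for src, dst in events]
```

Applying the same function to every cross-ZIP code event yields a distance distribution of the kind shown in panel e.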

Evaluation of intervention measures

During the period from October 2020 to March 2021, a dynamic zone-based control strategy was adopted in New York State to limit viral spread in communities with high case growth rates while avoiding undue harm to the economy29. Three tiers of zones (yellow, orange, and red) were identified based on a set of metrics, collectively defined by test positivity rate, hospital admissions per capita, and hospital capacity29,30. Local restrictions on business and services were imposed based on zone conditions. Compliance with these restrictions can be gauged by the number of individuals visiting points-of-interest (POIs, e.g., restaurants, grocery stores, gyms, and bars) in each ZIP code. In December 2020, vaccines became available to the population at highest risk for severe outcomes associated with COVID-19 in NYC and were subsequently made available to all eligible individuals over 15 years old during early April 2021. With the support of the detailed contact tracing data, we evaluated the impact of these public health interventions on community transmission of SARS-CoV-2 in NYC.

We assessed the associations of the numbers of non-household within- and cross-ZIP code transmission events across NYC with demographic, socioeconomic, disease surveillance, vaccination coverage, and human mobility features (Supplementary Information, Supplementary Figs. 6–7). Here cross-ZIP code transmission events include both directions, i.e., transmission for which either infector or infectee lived in a certain ZIP code. As non-household transmission contributed to the expansion of SARS-CoV-2 outside the household, we focused on 4642 non-household transmission events, representing 7% of all transmission events. We used aggregated foot traffic records derived from mobile phone data31 documenting weekly numbers of POI visitors in each ZIP code as an indicator of human mobility and compliance with the zone-based local restrictions (Supplementary Information, Supplementary Fig. 7). We used conditional autoregressive (CAR) models32 to assess the effects of the above factors on within- and cross-ZIP code transmission (Fig. 4). Specifically, for both within- and cross-ZIP code transmission, we fitted Poisson generalized linear mixed models (GLMM) with random effects and CAR priors to account for the inherent spatial-temporal autocorrelation in disease transmission data32,33 (Supplementary Information, Supplementary Figs. 8–9).

Fig. 4: Effects of various features on the transmission of SARS-CoV-2 in NYC.

Incidence rate ratios (exponentiated coefficients) for non-household within-ZIP code transmission and cross-ZIP code transmission are shown for 12 covariates in a and b, respectively (deviance information criterion, DIC = 6342 for a and DIC = 12,644 for b). Coefficients were estimated using a Poisson generalized linear mixed model controlling for spatial-temporal autocorrelations. We used the log-transformed population as the offset in the regression model. Covariates were standardized and are shown on the y-axis. The incidence rate ratio quantifies the multiplicative change in the number of transmission events for each one-standard-deviation increase in a covariate, controlling for other covariates. The violin plots show the distributions of incidence rate ratios. Black dots and horizontal black lines highlight the median estimates and 95% CIs. Distributions in a and b were obtained using \(n = 20{,}000\) MCMC samples of the posterior estimates.

We found that higher vaccination coverage and fewer POI visitors were associated with reduced non-household within- and cross-ZIP code transmission in the same week (Fig. 4). Estimates of coefficients are provided in Supplementary Table 2. The model identifies a strong effect of vaccination on SARS-CoV-2 transmission: during the early phase of vaccine rollout that aligns with the study period, a 12.5% newly vaccinated population was associated with reductions of 28.0% (95% CI: 14.0%–40.0%) and 14.8% (1.7%–26.4%) for within- and cross-ZIP code non-household transmission events, respectively. This marginal benefit may diminish at higher vaccine coverage, as we expect the effect to be nonlinear when the vaccinated population is near 100%. In contrast, a 78.1% increase of POI visitors per capita (ratio of the number of POI visitors to the population of each ZIP code) was associated with increases of 9.6% (0.3%–19.3%) and 14.4% (8.7%–20.2%) for within- and cross-ZIP code transmission outside households, respectively. In the foot traffic data, the POI category with the largest number of visitors was restaurants and bars. It is possible, but not known, that gathering in these places contributed more to cross-ZIP code transmission than to within-ZIP code transmission. We further found that both within- and cross-ZIP code transmission had strong positive associations with log weekly cases per capita. A 13.5% increase of log weekly cases per capita was associated with increases of 158.8% (126.5%–196.4%) and 117.3% (97.7%–137.9%) for non-household within- and cross-ZIP code transmission, respectively. A higher percentage of Hispanic residents and lower cumulative cases per capita were associated with higher non-household transmission (see strength of effect in Supplementary Table 2). For cross-ZIP code transmission, cumulative cases per capita had a stronger effect than vaccination and POI visitors (Fig. 4b, Supplementary Table 2), indicating that prior infections may result in reduced cross-ZIP code transmission in locations with a higher attack rate. These findings reveal how health inequities related to COVID-19 manifest across NYC communities. Results also indicate that promoting vaccination and capacity limits or temporary limits on local businesses, schools, and other POIs in high-prevalence communities were effective in reducing SARS-CoV-2 transmission in NYC. These findings were corroborated by an alternate random-effect model (Supplementary Information) and by testing effect lags of one week and two weeks (Supplementary Figs. 10–12). The findings were also robust to a possibly reduced response rate in contact tracing among children and the elderly (Supplementary Fig. 13).
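The percentage changes reported above follow from the incidence rate ratio (IRR), which is the exponentiated regression coefficient. The following sketch shows the conversion; the function names are hypothetical, and the coefficient is back-calculated from the reported 28.0% reduction for illustration rather than read from Supplementary Table 2.

```python
import math


def irr_to_percent_change(irr):
    """Percent change in expected transmission events implied by an incidence rate ratio."""
    return (irr - 1.0) * 100.0


def coefficient_to_percent_change(beta, sd_increase=1.0):
    """Percent change for an increase of `sd_increase` standard deviations in a covariate."""
    return irr_to_percent_change(math.exp(beta * sd_increase))


# Illustrative coefficient chosen so that IRR = 0.72, i.e. a 28.0% reduction per SD.
beta_vaccination = math.log(0.72)
one_sd_change = coefficient_to_percent_change(beta_vaccination)
two_sd_change = coefficient_to_percent_change(beta_vaccination, sd_increase=2.0)
```

Because the model is log-linear, effects compound multiplicatively: two standard deviations correspond to an IRR of 0.72 squared, a reduction of about 48.2% rather than 56%.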