Automated detection and characterization of small cell lung cancer liver metastasis on computed tomography

Sophia Ty; Fahmida Haque; Parth Desai; Nobuyuki Takahashi; Usamah Chaudhary; Peter L. Choyke; Anish Thomas; Barış Türkbey; Stephanie A. Harmon

doi:10.4274/dir.2025.253310

ABSTRACT

PURPOSE

Small cell lung cancer (SCLC) is an aggressive disease with diverse phenotypes that reflect the heterogeneous expression of tumor-related genes. Recent studies have shown that neuroendocrine (NE) transcription factors may be used to classify SCLC tumors with distinct therapeutic responses. The liver is a common site of metastatic disease in SCLC and can drive a poor prognosis. Here, we present a computational approach to detect and characterize metastatic SCLC (mSCLC) liver lesions and their associated NE-related phenotype as a method to improve patient management.

METHODS

This study utilized computed tomography scans of patients with hepatic lesions from two data sources for segmentation and classification of liver disease: (1) a public dataset from patients of various cancer types (segmentation; n = 131) and (2) an institutional cohort of patients with SCLC (segmentation and classification; n = 86). We developed deep learning segmentation algorithms and compared their performance for automatically detecting liver lesions, evaluating the results with and without the inclusion of the SCLC cohort. Following segmentation in the SCLC cohort, radiomic features were extracted from the detected lesions, and least absolute shrinkage and selection operator regression was utilized to select features from a training cohort (80/20 split). Subsequently, we trained radiomics-based machine learning classifiers to stratify patients based on their NE tumor profile, defined as expression levels of a preselected gene set derived from bulk RNA sequencing or circulating free DNA chromatin immunoprecipitation sequencing.

RESULTS

Our liver lesion detection tool achieved lesion-based sensitivities of 66%–83% for the two datasets. In patients with mSCLC, the radiomics-based NE phenotype classifier distinguished patients as positive or negative for harboring NE-like liver metastasis phenotype with an area under the receiver operating characteristic curve of 0.73 and an F1 score of 0.88 in the testing cohort.

CONCLUSION

We demonstrate the potential of utilizing artificial intelligence (AI)-based platforms as clinical decision support systems, which could help clinicians determine treatment options for patients with SCLC based on their associated molecular tumor profile.

CLINICAL SIGNIFICANCE

Targeted therapy requires accurate molecular characterization of disease, which imaging and AI may aid in determining.

Keywords:

Computer vision, segmentation, neuroendocrine gene expression, radiomics, tumor classification, transcriptomics, molecular tumor profile

Main points

• The liver is a frequent site of metastases in small cell lung cancer, and artificial intelligence helps identify and segment tumors.

• The computed tomography (CT) imaging characteristics of liver lesions have a moderate correlation with neuroendocrine transcription factors.

• An end-to-end machine learning pipeline may help characterize the molecular profile of liver lesions in CT.

Small cell lung cancer (SCLC) is an aggressive form of lung cancer strongly associated with smoking and accounts for 13%–15% of all lung cancer cases.^{1, 2} Patients often present with advanced disease, resulting in a poor prognosis with a 5-year survival rate of 7%.³ These outcomes reflect the challenges in clinical management of a recalcitrant cancer marked by the ubiquitous presence of TP53, RB1 loss-of-function events, and high chromosomal instability,^{4, 5} which drive rapid progression, widespread metastasis,^{5, 6} and treatment resistance following initial response to therapy.⁷ Among many complications associated with cancer progression, SCLC typically leads to hepatic metastasis, which is seen in 21%–27% of patients at presentation and 69% at autopsy.⁸ This makes the liver the most prevalent metastatic site after mediastinal lymph nodes—an important characteristic since liver metastasis is also an independent marker of poor prognosis.^{9, 10}

SCLC also demonstrates a high degree of heterogeneity, manifesting under various transcriptional subtypes. The classification of SCLC subtypes is defined by the expression levels of four transcription regulators, namely neuronal differentiation 1 (NEUROD1), achaete-scute family basic helix-loop-helix transcription factor 1 (ASCL1), POU class 2 homeobox 3 (POU2F3), and yes-associated protein (YAP1).¹¹ The relative expression of these regulators leads to heterogeneous neuroendocrine (NE) gene expression, which has therapeutic implications.^{12, 13} The SCLC tumors associated with relatively high NEUROD1 and ASCL1 expression are considered NE positive and demonstrate greater susceptibility to DNA-damaging agents;^{14, 15} non-NE SCLC tumors have greater POU2F3 and YAP1 expression and have been shown to respond better to immunotherapy.^{13, 16-18}

Despite emerging insights into its molecular subtypes, SCLC is still currently treated as a homogenous disease. As we gain more insights into the molecular underpinnings of SCLC that drive tumor response to treatments, there is a need for clinical workflows that can stratify patients based on their tumor profile. Methods that identify subpopulations of patients with SCLC who are likely to benefit from specific targeted treatments without requiring additional invasive testing can offer physicians actionable insights, especially when treatment response status can be determined at the time of diagnosis. Furthermore, computational platforms that can accurately detect and characterize tumors offer practical utility in supporting physicians from diagnosis to treatment. They can automate critical tasks, integrate different types of medical data (e.g., radiology scans, biopsy findings, blood panel information), extract clinically relevant tumor characteristics, and consolidate medical information for health practitioners.

Within the past decade, artificial intelligence (AI) has been integrated into automating medical image processing tasks. More specifically, deep learning has shown promise in segmenting objects at different imaging scales, including tissue¹⁹ and cellular²⁰ levels, for a variety of medical conditions, including lung cancer^{21, 22} and hepatic diseases.^{23-25} Preceding the popularity of deep learning for medical image segmentation is the use of radiomics, a common research approach to describe tumors quantitatively, including their intensity, shape, and texture, which may be used as image-based biomarkers for downstream analysis and association studies. Radiomics has been adopted to address various clinical decision tasks, such as lesion classification^{26, 27} and treatment response prediction,^{28, 29} including applications in liver-associated malignancies.^{30-32}

Given the implications of transcriptional subtypes to treatment response in SCLC and the association between SCLC hepatic metastasis and prognosis, we investigated whether NE status, as determined by tumor gene expression analysis, can be determined from computed tomography (CT) scans of confirmed SCLC metastasis in the liver. In this study, we present a two-step machine learning framework for automated detection and characterization of SCLC liver metastasis. We employ deep learning for three-dimensional (3D) segmentation of hepatic lesions followed by radiomics-based analysis to characterize and classify image scans by NE status, defined as high expression of a pre-selected gene set.

Methods

A graphical summary of the study objective is shown in Figure 1.

Study population and data description

Two datasets were utilized in this study: (1) the liver tumor segmentation (LiTS) dataset, a publicly available dataset containing 131 CT scans; and (2) a retrospective cohort of patients with metastatic SCLC (mSCLC) who underwent CT at the National Cancer Institute (Bethesda, MD, USA).

The LiTS dataset consists of multi-center scans of primary and secondary hepatic tumors. All scans were manually annotated by a radiologist (>3 years’ experience) using the ITK-SNAP open source software platform to obtain liver and lesion labels. Annotations were confirmed by three additional radiologists; the most senior reader’s findings were used in any labeling conflicts. This research study was conducted retrospectively using human participant data made available as open-access materials by Bilic et al.²⁵ This open-source cohort was used for training and development of a segmentation algorithm for the detection and segmentation of focal liver lesions on CT. This cohort was used exclusively for the segmentation task (Figure 1).

An initial query for the mSCLC dataset identified 88 patients diagnosed with SCLC and undergoing disease monitoring or treatment at the institution under one or more clinical protocols, including the following ClinicalTrials.gov identifiers: NCT02769962 (IRB 16-C-0107; 2016-05-09), NCT03554473 (IRB 18-C-0110; 2018-09-11), NCT02487095 (IRB 15-C-0150; 2015-07-30), NCT02484404 (IRB 15-C-0145; 2015-06-29), and NCT02146170 (IRB 14-C-0105; 2014-05-28). Each protocol was approved by the local institutional review board, and written informed consent was obtained from all patients. From these patients, a total of 346 abdominal CT scans obtained during 178 CT sessions were identified for possible inclusion. This cohort was used for both the segmentation and classification tasks (Figure 1). Multiple series from each study date (for example, thick-slice and soft tissue thin-slice reconstructions) were included to evaluate model robustness (Supplementary Table 1). Radiology reports were manually reviewed for each CT scan to confirm the presence or absence of hepatic lesions. From all available scans, a subset of 82 scans was manually reviewed by an expert radiologist (>15 years’ experience), and liver lesions were segmented using ITK-SNAP. Liver organ annotations were obtained using a previously developed two-dimensional (2D) U-Net liver segmentation model³³ and were manually adjusted using ITK-SNAP. All annotated scans were used for the tumor segmentation task (Figure 1).

All patients in the mSCLC cohort underwent either tissue or blood sampling for bulk RNA or circulating free DNA (cfDNA) chromatin immunoprecipitation sequencing at multiple timepoints, corresponding with CT study dates (± 3 months). Expression profiles from sequencing data were used to classify patients broadly into NE-positive and NE-negative phenotype groups based on previously published methods.^{13, 34} Briefly, single-sample gene set enrichment analysis from a 50-gene signature panel was used to classify samples as NE (score >0) or non-NE (score <0), with a lower score in the non-NE group reflecting more confidence that the sample does not exhibit NE differentiation.³⁵ A strong correlation between cfDNA-derived and RNA-derived expression scores for NE phenotyping has been previously reported;^{13, 34-36} therefore, either reference standard was used for ground truth assignment in this cohort. The NE phenotype expression scores (range: −1 to 1) and classification (NE, non-NE) were recorded for use in this study (Table 1).

Deep learning model development for tumor segmentation

Three deep learning algorithms were selected to build a hepatic lesion detection model: (1) a 3D U-Net, (2) a 3D SegResNet, and (3) a 3D nnU-Net. During initial model development and selection, each algorithm was trained solely using the LiTS dataset, and mSCLC data were used as an independent test set. For all training, data partitions were stratified at the patient level and are summarized in Table 2.

The U-Net and SegResNet models were built using the Medical Open Network for AI platform (version 1.3.0).³⁷ For these two network architectures, training was conducted with the following data pre-processing and augmentations: resampling of CTs to uniform spacing (0.5 mm × 0.5 mm × 1 mm), foreground cropping, variable CT windowing, and random cropping by labels with a sampling ratio of 4:1:3 for the background, liver, and lesion, with 12 samples taken per image. Each sample crop was of size 512 × 512 × 16. Both models were trained using the adaptive moment estimation (Adam) optimizer to minimize lesion-level DICE loss, with a learning rate of 0.0001 for 1,500 epochs. The final model was selected based on the highest validation DICE reported during training. Inference was conducted using the sliding window technique.

The nnU-Net model, an auto-configuring semantic segmentation model, was implemented using the built-in 3d_fullres configuration with five-fold cross-validation.³⁸ Pre-processing configurations selected by the model included spacing (0.789 mm × 0.789 mm × 2 mm), patch size 80 × 80 × 60, and per-image z-score standardization.

For all three models, performance was evaluated in the test set using lesion-level DICE coefficients and tumor detection sensitivity on both the LiTS and mSCLC datasets.

Due to differences in the burden and imaging characteristics of the mSCLC cohort compared with the LiTS cohort, which may potentially impact generalizability, a final nnU-Net model was trained on all LiTS training data along with a subset of mSCLC scans partitioned with an approximate training/test split of 80%/20% scans. Inference with the final finetuned segmentation model was completed for all mSCLC scans for use in the classification model.

Radiomics characterization and neuroendocrine phenotype classification

Each mSCLC scan was labeled based on matched gene expression-based NE score as NE positive (1) or NE negative (0). All CT studies with confirmed hepatic lesions that were matched to NE scores within ± 3 months of the imaging date were included in the NE phenotype classification. Scans were partitioned with an approximate training/test split of 80%/20% images using the same stratification applied during segmentation. Splits were determined at the patient level to avoid bias, resulting in 177 scans for training and 50 scans for testing (Table 2).
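The patient-level partitioning described above can be illustrated with a minimal sketch. It assumes a simple random shuffle of patients and does not reproduce any additional stratification used in the study; the function name, scan identifiers, and patient identifiers are hypothetical.

```python
import random
from collections import defaultdict


def patient_level_split(scan_to_patient, test_fraction=0.2, seed=0):
    """Split scans into train/test so that no patient appears in both sets."""
    scans_by_patient = defaultdict(list)
    for scan_id, patient_id in scan_to_patient.items():
        scans_by_patient[patient_id].append(scan_id)

    # Shuffle patients (not scans) so all scans of a patient stay together.
    patients = sorted(scans_by_patient)
    random.Random(seed).shuffle(patients)

    n_test = max(1, round(test_fraction * len(patients)))
    test_patients = set(patients[:n_test])

    train = [s for p in patients[n_test:] for s in scans_by_patient[p]]
    test = [s for p in patients[:n_test] for s in scans_by_patient[p]]
    return sorted(train), sorted(test), test_patients


# Hypothetical scan and patient identifiers, for illustration only.
scan_to_patient = {
    "scan01": "P1", "scan02": "P1", "scan03": "P2", "scan04": "P3",
    "scan05": "P3", "scan06": "P4", "scan07": "P5", "scan08": "P5",
}
train_scans, test_scans, test_patients = patient_level_split(scan_to_patient)
```

Because the split fraction is applied to patients rather than scans, the resulting scan-level proportions are only approximately 80%/20%, which is consistent with the approximate split reported here.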

Liver lesion contours obtained from the final segmentation model were characterized using radiomics. Quantitative image features were extracted using PyRadiomics (v3.0.1) with a resampling pixel spacing of (1 mm, 1 mm, 1 mm) for the (x, y, z) voxel coordinates and default image standardization parameters. A total of 107 radiomic features were obtained, representing first-order statistics, shape (2D and 3D), gray level co-occurrence matrix, gray level run length matrix (glrlm), gray level size zone matrix (glszm), neighboring gray tone difference matrix, and gray level dependence matrix (gldm). The number of lesions per image was determined using connected-components-3D (v3.12.1) and served as an additional feature, resulting in a total of 108 features. From these, a subset of imaging characteristics correlated with NE phenotype was selected using least absolute shrinkage and selection operator (LASSO) regression (scikit-learn v1.2.2).
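For reference, the LASSO objective in the form minimized by the scikit-learn `Lasso` estimator is shown below. Here X is the matrix of the 108 candidate features for the n training scans, y is the NE label, and α is the regularization strength. The value of α and any feature scaling are not stated in the text, so they are left unspecified.

```latex
\hat{\beta} = \arg\min_{\beta} \; \frac{1}{2n} \lVert y - X\beta \rVert_2^2 + \alpha \lVert \beta \rVert_1
```

Features whose fitted coefficient is nonzero are retained; the L1 penalty drives the remaining coefficients exactly to zero, which is what makes the method usable for feature selection.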

Radiomics-based NE phenotype classification was conducted using three machine learning models: (1) logistic regression (scikit-learn v1.2.2), (2) random forest (scikit-learn v1.2.2), and (3) XGBoost (v2.0.3). For all models, default parameters were used. Code and raw data for how these models were trained are available at https://github.com/NIH-MIP/mSCLC_Segmentation_Classification. All models incorporated the subset of imaging features selected using LASSO regression for binary classification of tumors as NE (1) or non-NE (0) phenotype.

Each model was trained with and without class-based weights. Five-fold cross-validation was implemented, and the best model was selected using the F1 score and area under the receiver operating characteristic curve (AUC) from cross-validation as the primary performance criteria. For test set evaluation, the ensemble of all five folds was utilized (average prediction of the five folds).
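The fold-ensembling step can be sketched as follows. This is an illustrative example, not the study code: the probabilities are made up, and the 0.5 decision threshold is an assumption that is not stated in the text.

```python
def ensemble_probabilities(fold_probs):
    """Average per-scan positive-class probabilities across fold models."""
    n_folds = len(fold_probs)
    return [sum(per_scan) / n_folds for per_scan in zip(*fold_probs)]


def f1_score(y_true, y_pred):
    """F1 score for the positive class (NE = 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical outputs of five fold models for four test scans.
fold_probs = [
    [0.9, 0.4, 0.7, 0.2],
    [0.8, 0.6, 0.6, 0.1],
    [0.7, 0.5, 0.8, 0.3],
    [0.9, 0.3, 0.9, 0.2],
    [0.7, 0.4, 0.5, 0.2],
]
y_true = [1, 1, 1, 0]

mean_probs = ensemble_probabilities(fold_probs)
# Assumed threshold of 0.5 on the averaged probability.
y_pred = [1 if p >= 0.5 else 0 for p in mean_probs]
f1 = f1_score(y_true, y_pred)
```

Averaging probabilities rather than taking a majority vote keeps a continuous score per scan, which is what an AUC calculation requires.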

Statistical analysis

The DICE coefficient,³⁹ a measure that describes spatial agreement between two image sets, was calculated to quantify the performance of the model compared with ground truth annotation from radiologists. To evaluate detection performance metrics at the lesion level, connected-components-3D (v3.12.1) was used to identify unique lesions in both the ground truth segmentations and model output. Next, each lesion was classified as a true positive (i.e., ground truth lesion correctly segmented by AI), false negative (i.e., ground truth lesion was not segmented by the model), or false positive (i.e., model segmented a lesion with no ground truth correlate) per scan. Each segmentation model’s sensitivity, positive predictive value (PPV), and false positive trends were calculated and reported as summary statistics.
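The DICE calculation and the lesion-level bookkeeping described above can be sketched in plain Python. Two assumptions are made that the text does not specify: lesions are separated using 26-connectivity, and any voxel overlap between a ground truth lesion and the prediction counts as a detection. Masks are represented as sets of (x, y, z) voxel coordinates for brevity; the study itself used connected-components-3D on image volumes.

```python
from itertools import product

# Offsets of the 26 neighbours of a voxel (assumed connectivity).
OFFSETS = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]


def dice(a, b):
    """DICE coefficient between two sets of voxel coordinates."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def connected_components(voxels):
    """Group a set of voxels into 26-connected components (lesions)."""
    remaining = set(voxels)
    components = []
    while remaining:
        seed = remaining.pop()
        component, stack = {seed}, [seed]
        while stack:
            x, y, z = stack.pop()
            for dx, dy, dz in OFFSETS:
                neighbour = (x + dx, y + dy, z + dz)
                if neighbour in remaining:
                    remaining.remove(neighbour)
                    component.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


def lesion_detection_counts(truth, pred):
    """Return (true positives, false negatives, false positives) per scan."""
    truth_lesions = connected_components(truth)
    pred_lesions = connected_components(pred)
    tp = sum(1 for lesion in truth_lesions if lesion & pred)
    fn = len(truth_lesions) - tp
    fp = sum(1 for lesion in pred_lesions if not (lesion & truth))
    return tp, fn, fp


# Toy scan: two ground truth lesions, one partly segmented, one missed,
# plus one predicted lesion with no ground truth correlate.
truth = {(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0), (10, 10, 5)}
pred = {(1, 0, 0), (1, 1, 0), (2, 0, 0), (2, 1, 0), (20, 20, 2)}

scan_dice = dice(truth, pred)
tp, fn, fp = lesion_detection_counts(truth, pred)
sensitivity = tp / (tp + fn)
ppv = tp / (tp + fp)
```

A stricter overlap criterion (for example, a minimum fraction of the lesion volume) would lower the reported sensitivity, so the matching rule should be stated alongside any lesion-level metric.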

The relationship between tumor burden and NE scores was examined using two tests: (1) Spearman correlation analysis (SciPy v1.11.1) for continuous NE scores, and (2) Wilcoxon rank sum tests (R v4.4.1) for binarized NE scores. For this analysis, one series per patient per scan date was selected. Tumor volume estimates were calculated using AI-predicted tumor regions from the final segmentation model.
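As an illustration of the rank-based correlation used here, the following library-free sketch computes the Spearman coefficient as the Pearson correlation of average ranks. It does not compute a P value; in practice a statistics library such as SciPy would be used, as in the study. The tumor volumes and NE scores below are invented for the example.

```python
def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical tumor volumes (cm^3) and continuous NE scores.
tumor_volume = [1.2, 15.0, 3.4, 40.1, 8.8]
ne_score = [-0.3, 0.4, 0.1, 0.2, 0.5]
rho = spearman(tumor_volume, ne_score)
```

Because only ranks enter the calculation, the coefficient is unaffected by the heavy right skew typical of tumor volume distributions.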

Finally, the performance of each binary classifier was evaluated. Model accuracy, sensitivity, specificity, PPV, negative predictive value, F1 scores, and AUC were calculated (scikit-learn v1.2.2) and compared. Due to potential bias in multiple scans coming from the same CT study, bootstrap sampling was performed at the study level to select one scan per study per iteration randomly. The mean and 95% confidence intervals (CIs) of each performance metric are reported in the test set.
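The study-level bootstrap can be sketched as below. This is one plausible reading of the procedure, with labeled assumptions: studies are resampled with replacement, one scan is drawn at random from each sampled study, and a percentile interval is reported. The number of iterations, the records, and the function names are hypothetical.

```python
import random
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of scans whose predicted label matches the reference."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def study_level_bootstrap(records, metric, n_iter=1000, seed=0):
    """Bootstrap a metric using one randomly chosen scan per sampled study.

    records: list of (study_id, y_true, y_pred), one entry per scan.
    Returns the mean and a 95% percentile interval of the metric.
    """
    by_study = defaultdict(list)
    for study_id, y_true, y_pred in records:
        by_study[study_id].append((y_true, y_pred))
    studies = sorted(by_study)

    rng = random.Random(seed)
    values = []
    for _ in range(n_iter):
        # Assumption: resample studies with replacement, then pick one scan.
        sampled = [rng.choice(studies) for _ in studies]
        picks = [rng.choice(by_study[s]) for s in sampled]
        values.append(metric([t for t, _ in picks], [p for _, p in picks]))

    values.sort()
    mean = sum(values) / len(values)
    lower = values[int(0.025 * (len(values) - 1))]
    upper = values[int(0.975 * (len(values) - 1))]
    return mean, (lower, upper)


# Hypothetical test records: two series per study in most cases.
records = [
    ("S1", 1, 1), ("S1", 1, 1), ("S2", 1, 0), ("S2", 1, 1),
    ("S3", 0, 1), ("S3", 0, 0), ("S4", 1, 1), ("S5", 0, 0),
]
mean_acc, (ci_low, ci_high) = study_level_bootstrap(records, accuracy)
```

Drawing a single scan per study in each iteration prevents studies with several reconstructions from contributing correlated observations to the same estimate.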

Results

The segmentation model utilized all LiTS data and incorporated a subset of scans from the mSCLC cohort. Of the 88 patients identified for possible study inclusion, a total of 86 patients were included in the final study cohort, with two exclusions due to insufficient data records (no sampling within the required timeframe from a scan date; no segmentations). Key characteristics of the mSCLC cohort are provided in Table 1. Image acquisition characteristics are summarized in Supplementary Table 1 for each cohort and task.

Lesion detection

The cohort information for model training is shown in Table 2. In addition to the LiTS cohort, 82 annotated scans (82 unique studies) from the mSCLC dataset were utilized for the segmentation task, of which 50 scans were positive for containing hepatic lesions. First, models were trained only on the LiTS cohort and applied to mSCLC. Among the three deep learning models implemented for automated hepatic lesion detection and segmentation, the 3D fullres nnU-Net model provided the most accurate and robust results for the test set from both datasets. Its detection performance had a median DICE score of 0.75 and 0.607, lesion-level sensitivity of 0.813 and 0.539, and PPV of 1.0 and 0.99 for the LiTS and mSCLC test sets, respectively. The 3D fullres nnU-Net model also had a range of 0–1 false positive lesions per scan for both datasets. The U-Net and SegResNet models had median DICE scores of 0.439 and 0.417 for the LiTS dataset and 0.297 and 0.418 for the mSCLC dataset, respectively. The U-Net model had lesion sensitivities of 0.395 and 0.244, whereas the SegResNet model achieved 0.662 and 0.447 for LiTS and mSCLC, respectively. A summary of each model’s performance is provided in Table 3.

Next, an evaluation was performed to determine how finetuning of the nnU-Net for liver lesion segmentation on the mSCLC data may improve performance. Here, the 3D fullres nnU-Net model achieved a median DICE of 0.771 and 0.640, with sensitivities of 0.826 and 0.667 for the LiTS test set and mSCLC test set, respectively. For both datasets, the model had a PPV of 1.0 and 0 false positives per scan. Representative images of cases with high and low concordance between the AI-predicted and ground truth annotations of mSCLC liver lesions from the test set are provided in Figure 2.

Correlation analysis of tumor burden and neuroendocrine status

A statistically significant correlation was found between tumor burden and NE scores for continuous data (Spearman: 0.252, P value: 0.0059) and binarized data (Wilcoxon rank sum P value: 0.028). Further analysis revealed that cfDNA samples had greater dependence on tumor volume, with a statistically significant correlation between tumor burden and cfDNA-derived NE scores (Spearman: 0.446, P value: 0.00057; Wilcoxon rank sum P value: 0.0013). Conversely, biopsy-derived NE scores did not have a statistically significant correlation with tumor volume (Spearman: −0.0738, P value: 0.5680; Wilcoxon rank sum P value: 0.253). Correlation plots are provided in Figure 3.

Neuroendocrine phenotype classification

A total of 227 scans from 118 CT studies were used for the NE classification task after excluding scans with no hepatic lesions by radiologist read (n = 92 scans), scans with no corresponding RNA sequencing (RNAseq) or cfDNA data (n = 19), or false negatives by segmentation model (n = 6 volumes). Of the usable data, 172 scans (89 studies) were identified as NE positive and 55 scans (29 studies) were identified as NE negative (Table 1). The LASSO feature selection was performed within the training set, identifying 20/108 radiomic features correlated with NE-related tumor phenotypes for inclusion in the classification model. The distribution of selected radiomic feature categories was as follows: 30% shape, 25% glszm, 20% gldm, 15% first order statistics, and 10% glrlm. Among these, shape was the most dominant radiomic feature type found in the subset. The top five imaging feature characteristics determined during feature selection were minor axis length, maximum 2D diameter row, major axis length, gldm large dependence emphasis, and first order variance. A full list of the selected radiomic features and a distribution summary of the selected feature type are provided in Figure 4.

For all evaluated models, weighted training did not substantially boost the predictive performance of the algorithms (Supplementary Table 2), nor did it improve generalization to the test set (Supplementary Table 3), regardless of the optimization strategy for determining the best weight. Based on the cross-validation performance, the random forest classifier had the highest F1 (mean: 0.86; range: 0.81, 0.91) and AUC (mean: 0.68; range: 0.56, 0.81) (Supplementary Table 2). All calculated metrics for the NE phenotype classification task are summarized in Table 4, where the logistic regression model generalized better than the random forest model by AUC (mean: 0.71; 95% CI: 0.59–0.82 vs. mean: 0.58; 95% CI: 0.48–0.69, respectively), though F1 and accuracy metrics were similar across all models, likely due to the imbalance in favor of the positive class (NE). The 50 scans in the test cohort represented 26 unique CT studies. Breaking accuracy down further by scan-level classification agreement across series volumes: 18/24 scans with multiple series volumes were correctly classified in both series volumes, 4/24 scans were incorrectly classified in both series volumes, and 2/24 scans had different classification results across the different series. The two scans with only a single series volume available were both correctly classified.

Failure analysis revealed that 8/10 of misclassifications were associated with NE scores near the boundary (NE score: 0) and with cfDNA-derived NE scores. Misclassified observations skewed to the NE-negative phenotype (Figure 5a), with a mean NE score of −0.007 as presented in Figure 5b. The NE classification results showed an AUC of 0.528 when evaluated on cfDNA-derived data alone, although they showed an AUC of 0.984 when strictly evaluated on biopsy-derived transcriptomic data, as shown in Figure 5c.

Discussion

Patients with SCLC often present with hepatic metastases.^{9, 40} Characterizing NE profiles of mSCLC lesions offers a pathway to stratify patients based on distinct therapeutic vulnerabilities of their molecular subtypes.¹³ We demonstrate the potential of deep learning on automated liver lesion detection and the feasibility of using radiomics to describe properties of mSCLC tumors. This framework can enable the determination of patients’ NE status as positive or negative for bearing an NE-like phenotype without resorting to invasive biopsy procedures and relying on scans obtained as part of routine staging studies. Our 3D nnU-Net segmentation model demonstrated that hepatic lesions can be accurately detected for patient populations with highly variable disease characteristics (e.g., a wide range of tumor size, varying number of tumors in a single scan) using a fully automated platform. We also showed the possibility of stratifying patients with SCLC by their NE status using radiomic features extracted from routinely acquired abdominal CT scans of metastatic liver lesions.

The finetuned nnU-Net segmentation model was able to identify liver lesions with a median DICE score of 0.771 for the publicly available data (LiTS) and 0.640 for our internal mSCLC dataset. It also showed adeptness in locating regions with suspected lesions in CT images with high sensitivity (LiTS: 0.826 and mSCLC: 0.667) that outperformed previously reported models (ISBI 2017: 0.458, MICCAI 2017: 0.515, MICCAI 2018: 0.554)²⁵ while having low false positive rates (0–1 false positive per scan) and comparable tumor-level DICE scores (ISBI 2017: 0.674, MICCAI 2017: 0.702, MICCAI 2018: 0.739)²⁵ for the public dataset. Further characterization of the model’s behavior on mSCLC cases without ground truth reports of a lesion (n = 94) revealed that false positive predictions typically were small in volume (0.638 ± 1.73 cm³). Overall, our detection tool not only offers valuable improvements to previous liver lesion segmentation benchmarks but also enables automating the lesion annotation process for radiomics analysis.

Correlation analysis of AI-predicted lesions showed that tumor burden has a statistically significant association with NE status. Our results revealed that NE scores exhibit dependence on tumor volume, especially when expression profiles were derived from cfDNA samples. This finding is consistent with previous studies showing a correlation between tumor burden and cfDNA in NE-related^{34, 41} and lung⁴² neoplasms, although their exact relationship remains elusive.⁴³ This is also supported by our radiomics analysis, which shows that shape (which includes measurements of tumor size) is the most frequently selected radiomic feature type during model development. These findings indicate that tumor burden may be an indirect measure of NE expression in mSCLC tumors. Recent studies have demonstrated a strong correlation between cfDNA-derived and RNA-derived expression scores for NE phenotyping.^{13, 34-36} However, the same volume-based relationship was not observed in biopsy-based RNAseq determination of NE phenotype. This discrepancy may be explained by tumor heterogeneity, where cfDNA-derived expression metrics are an aggregate of multiple lesions throughout the patient, whereas biopsy-derived RNAseq expression is sampled directly from a single lesion. This heterogeneity component can also explain why more errors in NE prediction were observed in cfDNA-derived samples, as metastatic lesions elsewhere in the body may contribute to this expression, despite the predominance of liver lesions in these patients. However, these hypotheses cannot be tested in the patients included in this study due to the limited sample size and lack of multiple targeted biopsy-based samples for RNAseq expression.

To our knowledge, techniques for predicting patient-level SCLC NE status have yet to be explored. Our approach in integrating the molecular phenotypic landscape of SCLC with image-extracted tumor markers offers a path towards building translational computing workflows that may help tailor SCLC treatment. In this study, we show that a logistic regression classifier can distinguish NE phenotypes using radiomics data with 80% accuracy (0.79 ± 0.04) and an AUC of 0.73 (0.70 ± 0.08). We noted that the phenotype classifier demonstrated high sensitivity for NE-positive tumors but low specificity for NE-negative tumors. This classifier was trained and tested on heterogeneous image acquisition settings, leading to consistency in performance for multiple reconstructions (series volumes) within a single study; however, further validation is warranted.

Our approach has several limitations. This is a relatively small cohort of patients with metastatic SCLC; the classification algorithm was trained and validated in 227 scans from 50 patients, and further research is warranted. We were underpowered to perform classification based on the Riley et al.⁴⁴ criterion, increasing the likelihood of our algorithm overfitting to the training population. Our data utilize a mixture of biopsy-sampled and cfDNA-based NE scores. The cfDNA sampling is comparatively easier but captures signals from both cancerous and non-cancerous components. This impacts our ability to describe with confidence the NE expression profiles specific to each lesion. Next, our framework utilizes a cascaded algorithm that analyzes biomedical imaging data in a stepwise fashion. This workflow inherently propagates segmentation errors and failures at the first step to the radiomics analysis, which occurs downstream. Since radiomics is volume dependent, the effectiveness of the tumor phenotype classifier relies on the performance of the lesion detection tool. Our DICE performance indicates that the model did not perform as well in mSCLC cases (DICE <0.7) compared with LiTS (DICE >0.7). This may be influenced by high disease burden and complex anatomy from extensive prior treatments, or by cases with very small lesions, which are difficult to identify. Further work on how changes in predicted volumes impact radiomics and downstream classification is needed. Similarly, further investigation into the effects of contrast use and acquisition parameters is warranted. Third, our NE classification task is conducted at the image level rather than the lesion level. This primarily describes the bulk NE profile rather than tumor-specific characteristics. Any heterogeneity reflected across the patients’ disease burden cannot be evaluated through the current workflow. Future validation in lesion-based assessment for samples with targeted sequencing may provide more context to individual signatures. Finally, the data distribution of NE positive to NE negative is largely unbalanced, making it difficult to further optimize the machine learning classifier to better detect NE-negative phenotypes.

In conclusion, deep learning and radiomics-based analysis enable automated detection and characterization of SCLC liver metastasis. Using AI-based platforms, routinely acquired CT scans may be used to determine the NE status of patients with mSCLC liver lesions. This could enable clinicians to tailor SCLC treatments based on a patient’s NE status and its associated molecular tumor profile.

Conflict of interest disclosure

The authors declared no conflicts of interest.

Funding

This research was supported in part by the Center for Cancer Research, National Cancer Institute, National Institutes of Health Intramural Research Program project number ZIABC012163 and ZIABC011793. The research was supported in part by the NIH Undergraduate Scholarship Program (S.T.). The contributions of the NIH author(s) were made as part of their official duties as NIH federal employees, are in compliance with agency policy requirements, and are considered Works of the United States Government. However, the findings and conclusions presented in this paper are those of the author(s) and do not necessarily reflect the views of the NIH or the U.S. Department of Health and Human Services.

References

van Meerbeeck JP, Fennell DA, De Ruysscher DK. Small-cell lung cancer. Lancet. 2011;378(9804):1741-1755.

Thomas A, Pommier Y. Small cell lung cancer: time to revisit DNA-damaging chemotherapy. Sci Transl Med. 2016;8(346):346fs12.

Thomas A, Mohindroo C, Giaccone G. Advancing therapeutics in small-cell lung cancer. Nat Cancer. 2025;6(6):938-953.

George J, Lim JS, Jang SJ, et al. Comprehensive genomic profiles of small cell lung cancer. Nature. 2015;524(7563):47-53.
