Thyroid cancer is one of the most common cancers, with a global increase in incidence rate for both genders. Ultrasound-guided fine-needle aspiration is the current gold standard to diagnose thyroid cancers, but the results are inaccurate, leading to repeated biopsies and unnecessary surgeries. To reduce the number of unnecessary biopsies, we explored the use of multiparametric photoacoustic (PA) analysis in combination with the American Thyroid Association (ATA) Guideline (ATAP). In this study, we performed in vivo multispectral PA imaging on thyroid nodules from 52 patients, comprising 23 papillary thyroid cancer (PTC) and 29 benign cases. From the multispectral PA data, we calculated hemoglobin oxygen saturation level in the nodule area, then classified the PTC and benign nodules with multiparametric analysis. Statistical analyses showed that this multiparametric analysis of multispectral PA responses could classify PTC nodules. Combining the photoacoustically indicated probability of PTC and the ATAP led to a new scoring method that achieved a sensitivity of 83% and a specificity of 93%. This study is the first multiparametric analysis of multispectral PA data of thyroid nodules with statistical significance. As a proof of concept, the results show that the proposed new ATAP scoring can help physicians examine thyroid nodules for fine-needle aspiration biopsy, thus reducing unnecessary biopsies.

Significance:

This report highlights a novel photoacoustic scoring method for risk stratification of thyroid nodules, where malignancy of the nodules can be diagnosed with 83% sensitivity and 93% specificity.

Over recent decades, a dramatic increase in the global incidence of thyroid cancers has been reported for both genders (1, 2). In 2018, the age-standardized rates of thyroid cancer per 100,000 person-years were 10.2 for women and 3.1 for men (3). According to the American Cancer Society, it is estimated that more than 52,000 new cases of thyroid cancer will be diagnosed in the United States in 2020. This rise is attributed to a variety of reasons, including an increase in environmental radiation, obesity, and overdiagnosis (4). Particularly, advances in diagnostic imaging techniques, such as ultrasonography, positron emission tomography, X-ray computed tomography, and magnetic resonance imaging have contributed to the increased number of newly diagnosed thyroid cancer cases (5). However, not all nodules need to be treated immediately. Conventional handling of nonaggressive nodules leads to unnecessary biopsies, wasted healthcare costs, and degradation of patients' quality of life. It is reported that more than 470,000 women and 90,000 men may have been overdiagnosed over the past two decades (5, 6), and the overall costs for treating those patients are substantial (∼$1.6 billion for all U.S. patients since 1985, and a projected cost in excess of $3.5 billion in 2030; ref. 7).

Ultrasound (US)-guided fine-needle aspiration (FNA) is the standard of care for diagnosing thyroid cancers in the clinical workflow (8). Before performing FNA, US imaging (USI)-based risk stratification, such as that described the American Thyroid Association (ATA) Guidelines (9), the British Thyroid Association (BTA) Guidelines (10), or the Thyroid Imaging Reporting and Data System (TIRADS; ref. 11), is used to screen patients. Stratification scores are assigned using such features of thyroid nodules as size, micro- and macrocalcifications, irregular margins, hypoechoic contrast, solid component, and a taller-than-wide shape (12, 13). For nodules meeting the suspicious criteria, FNA is performed under the guidance of USI. While the USI-based risk stratification methods are useful in screening and as quick pathologic tests, they suffer from rather low specificities (ATA, 17.3%; BTA, 50.9%; and TIRADS, 28.2%) in comparison with their high sensitivities (ATA, 98.0%; BTA, 90.0%; and TIRADS, 94.0%; ref. 14). The low specificities of these approaches lead to many unnecessary FNAs on patients with benign nodules. This unnecessary FNA incurs high economic costs for sampling and analysis, and aspiration itself can be a source of anxiety, pain, and discomfort for the patient (15).

To improve thyroid cancer diagnosis, Lyshchik and colleagues explored using US elastography to differentiate malignant from benign nodules by measuring their stiffness (16). For some diagnostic features, the technique showed potential for distinguishing nodules with good sensitivity (82%) and specificity (96%), but the classification results varied for other US features (sensitivities dropped to 30%–50%). The results were operator-dependent and affected by the patient's movements, including breathing and carotid pulsation. In other work, quantitative US shear wave elastography has been studied to predict the malignancy of thyroid nodules (17). The results showed good classification performance, with a sensitivity of approximately 70%–85% and specificity of approximately 70%–94% (18, 19), but displacement by physiologic motion, such as breathing, pulsation of the carotid artery, and patient movement during imaging, was still a major problem in obtaining reproducible results. To overcome these problems, the US elastography using only the natural pulsation of the carotid artery was developed and commercialized (20, 21). This approach attained a sensitivity of 87.8% and a specificity of 77.5%. To go beyond the current classification performance, additional information, such as the metabolism of the thyroid nodules, could augment the existing USI technology (22).

In this study, we explore the use of multiparametric photoacoustic (PA) analysis in combination with the ATA Guideline, with the goal of reducing the number of unnecessary FNA biopsies. PA imaging (PAI) can noninvasively visualize the optical absorption of tissues (23, 24). In addition to providing molecular imaging with the use of exogenous contrast agents (25, 26), PAI also provides functional metabolic information about biological tissues based on intrinsic chromophores (27, 28). To capture molecular and functional imaging as well as obtain anatomical information in a clinical setting, several PA and US imaging (PAUSI) systems have already been developed and applied in selected clinical applications (29–32). Recently, several clinical studies have used PAI to evaluate thyroid nodules. Dogra and colleagues (33) reported promising feasibility for PAI in differentiating thyroid nodules by investigating oxy- and deoxyhemoglobin components but presented no in vivo results. Dima and colleagues (using healthy volunteers; ref. 34) and Yang and colleagues [using 10 papillary thyroid cancer (PTC) patients; ref. 35] demonstrated in vivo structural PAI of thyroid nodules with a single optical wavelength, but neither presented statistical or quantitative analyses. More recently, Roll and colleagues (36) conducted a pilot study and reported statistical analysis of PA data for thyroid nodules regarding hemoglobin oxygen saturation (sO2). However, the number of samples (13 benign and 3 PTC nodules) was limited. No attempts were made to perform multiparametric analysis, to suggest any quantitative score and test its performance, or to test the reproducibility or the interphysician variations. Without these, such a method faces many challenges before routine clinical use.

In this in vivo patient study, we use a multispectral PAUSI system to photoacoustically image human thyroid nodules (23 PTC and 29 benign cases). Comprising 87.8% of all types, PTC is the most common form of malignant thyroid nodule, compared with other types such as follicular (9.2%), medullary (2.1%), and anaplastic (0.9%) thyroid cancers (37). A multiparametric classification based on acquired multispectral PA signal features (e.g., the spectral gradient of PA signals, relative sO2, and skew angle in the sO2 distribution) achieved encouragingly high sensitivity and specificity. We further propose a new scoring method that combines the ATA Guidelines and the probability of PTC calculated from the multiparametric photoacoustic analysis. The proposed scoring method, named ATAP (P for photoacoustic) achieved a sensitivity of 83% and a specificity of 93% on our dataset. We have also confirmed the reliabilities of ATAP and the analysis method by measuring aspects of the intraphysician and interphysician variability. To the best of our knowledge, this is the first study to present statistical multiparametric PA analysis of thyroid nodules in vivo, using the largest number of patients to date.

Real-time photoacoustic and ultrasound imaging system

The real-time PAUSI system consisted of a programmable US machine (EC-12R, Alpinion Medical Systems) and a mobile laser (Phocus Mobile, OPOTEK). A laptop (ThinkPad E440, Lenovo) controlled all the laser operations, including on/off, wavelength tuning, power logging, and power optimization. A fast wavelength-tuning module in the laser system quickly tuned the wavelengths from 690 to 930 nm for each laser pulse (10 Hz). The laser beam was delivered through a custom-designed fiber bundle (Taihan Fiberoptics). A 128-element linear array US transducer (L3–12, Alpinion Medical Systems) was used to acquire all the PA and US images. The center frequency of the transducer was 8.5 MHz, with a fractional bandwidth (−6 dB) of 60%. A sterile US surgical probe cover (Safety Lock, EraeSI) was used to contain water and seal the imaging probe. The total weight and size of the imaging probe were approximately 600 g and 9 × 6 × 10 cm3, respectively.

Real-time image processing

Before data acquisition, all image acquisition parameters, including time gain compensation coefficients (for US images), beamforming methods, apodization methods, demodulation frequency, log compression coefficients, and display modes, were optimized using a custom-built real-time parameter control function. To achieve better image quality in USI, 256-scanline transmit focusing, and 3-angle spatial compounding techniques were applied. After data acquisition, both PA and US data went through the customized functional data processing blocks before PA and US images were shown on the display panel in real time. In short, the sequence of image processing followed five steps: (i) compensation for the acoustic attenuation, (ii) PA beamforming using the delay-and-sum method, (iii) apodization and aperture control to reduce side lobes, (iv) frequency demodulation to extract certain frequency region, (v) log compression to visualize a wide range of the signal, and (vi) scanline conversion to generate the image (38, 39). For the real-time image generation, all the calculations were parallelized and performed by the General Purpose Graphics Processing Unit (GPGPU, GeForce GTX 1080, NVIDIA) on the US machine. Those functional data processing blocks also stored both PA and US imaging data for further off-line analysis.

In vivo PAUSI of patients with thyroid cancer

All the patient enrollment and data acquisition followed the protocols approved by the Institutional Review Boards of the Seoul St. Mary's Hospital (Seoul, Republic of Korea) and Pohang University of Science and Technology (Pohang, Republic of Korea). Informed written consent was obtained from all the enrolled patients after explaining the procedure by clinicians. This clinical trial is registered with www.clinicaltrials.gov (NCT 00170196). The imaging data from patients who were recruited while they were hospitalized for a thyroidectomy (Group I) was acquired one day before the scheduled surgical operation. The results of the surgical pathology were acquired and used as gold standards. We also recruited patients who visited the clinic (Group II). For those Group II patients who did not have a biopsy result, FNA biopsies were performed after the data acquisition and considered as true values. The FNA results were determined in the pathology department. The patients laid supine with their neck extended during data acquisition. Conventional US gel (Ecosonic, SANIPIA) was applied between the imaging probe and the skin surface to match acoustic impedance. After localizing the suspicious thyroid nodule using conventional B-mode US images, the operation sequence of the PAUSI system was switched to acquire real-time PA and US images, while the imaging probe was held steady. Three physicians participated in PAUSI and data acquisition. The raw multispectral PA and US data were stored for off-line processing. During the entire data acquisition procedure, all patients and examiners wore safety glasses (LG9, Thorlabs) to prevent eye exposure from the laser beam. The maximum output fluence on the skin was 10.7 mJ/cm2 at 756 nm, which is less than the safety limit allowed on human skins (40). The fluence of each laser pulse was measured by dividing the optical energy by the illumination area at the end of the fiber bundle. The optical power was measured by a thermal sensor (S350C, Thorlabs) connected to a digital energy meter (PM100D, Thorlabs). To measure the laser illumination area at the skin surface, we shined the laser on a laser burn paper (BP-50, Photonic Solutions), then measured the burned area.

Real-time on-line data acquisition and off-line data processing

We acquired PAUS data of the patients in real time and postprocessed the data for off-line multispectral analysis (Supplementary Fig. S1). When the system was switched to the data acquisition mode, the laser system consecutively produced five wavelengths (700, 756, 796, 866, and 900 nm, Supplementary Fig. S2) for multispectral analysis (41–44), and the US machine stored the acquired PAUS data for each wavelength in the order of the wavelength series. We defined the acquired PA data from one series of wavelengths as a packet. The data collection time for a single packet (i.e., PA data acquisition time for five wavelengths) was 1 second, which was short enough to minimize the patients' movement. We encouraged patients to reduce their movements, including breathing, for 15 seconds. Fifteen packets of data were stored for every single measurement, and these 15 packets of data were defined as one dataset. The data acquisition time for one dataset was 15 seconds. We acquired multiple datasets from each patient. While multiple redundant datasets were acquired for each patient, some of the acquired data was not usable because of one of the following reasons (Supplementary Table S6): (i) The data was acquired incorrectly, (ii) Normal tissue determined by the physicians needed for signal normalization could not be determined in the PA image, or (iii) PA data were not saved because of system or user error. To test the system reproducibility, the clinician removed the imaging probe away from the patients, rested for 1–2 minutes, repositioned the imaging probe, and repeated the imaging procedures between the datasets. The entire imaging procedure for one patient was 8–10 minutes. For patients #14–52, we acquired more than five sets of data to test this measurement reproducibility.

After the real-time data acquisition and storage, off-line data processing was performed with custom-designed software codes (Matlab R2020b, Mathworks). Although we tried to minimize the patients' movements, motion artifacts could not be eliminated because of carotid artery pulsation, breathing, and clinician's movements. To further reduce motion artifacts, we calculated the correlation coefficient (CC) between consecutive US images (as shown as pink in Supplementary Fig. S1), found the five highest CC values, and selected the corresponding PA data as the most-correlated PA spectral sets (shown as blue in Supplementary Fig. S1). The US image with the highest CC was used for delineation of the nodules and normal thyroid tissue using custom-made software (shown as orange in Supplementary Fig. S1, Supplementary Movie S1). The clinicians delineated boundaries by clicking points on the US images. During the delineation, the clinicians were blinded from the FNA results and from each other's boundary decisions. The normal tissue area was determined at the same depth as the corresponding nodule site for normalization (45–47). The determined boundaries were applied to all the selected PA spectral set for multispectral analysis (shown as orange in Supplementary Fig. S1). For each spectral set, three parameters (PA spectral gradient, relative sO2 level, and skew angle in sO2 distribution) are quantified and averaged (Supplementary Fig. S3A and S3B; green in Supplementary Fig. S1). For each patient, we used the averaged values from all the datasets as an input for multispectral analysis. Our method improves the measurement reliability by electing the most correlated data and by averaging values from all the datasets.

Correlation coefficients of ultrasound images

The similarity between US images was quantified by calculating the CC value as follows:

formula

where |{\mu _i}$| and |{\sigma _i}$| are the mean and the SD of the ith US image (⁠|{\rm{U}}{{\rm{S}}_i}$|⁠), respectively. |{\mu _j}$| and |{\sigma _j}$| are the mean and the SD of the jth US image (⁠|{\rm{U}}{{\rm{S}}_j}$|⁠), respectively. The similarity of the five consecutive US images was calculated by the mean of the CC values of all different combinations of five images (i.e., 5C2 = 10 combinations).

Quantification of photoacoustic signals for spectral gradient

On the stored raw data, we applied the conventional delay-and-sum beamforming and generated PA images through frequency demodulation with a center frequency of 2.2 MHz and a cut-off frequency of 0.5 MHz. Those frequencies were tuned to achieve strong PA signals at the nodule area by using real-time parameter control software that was described in our previous report (39). To compensate for the optical fluence variation with wavelength and penetration depth, we normalized the PA signals (31, 46–48). At each wavelength, we extracted the top 50% of the PA signals within the normal boundary and then calculated the average. We normalized the PA amplitude at each pixel by dividing it by the averaged value of the normal tissue, then calculated the mean and SD of the top 50% of the normalized signals to represent the PA signal of the nodule at each wavelength. The portion of data (top 50%) was selected to produce the best classification result (Supplementary Fig. S4). Although the classification performances with various signal thresholds are very similar, the top 50% showed better values for sensitivity and AUC. We determined the linear regression of the representative PA signals with a first-order polynomial fit and defined the PA spectral gradient as the slope of the fitted line (Supplementary Fig. S3A).

Quantification of hemoglobin oxygen saturation by nonnegative spectral unmixing

The spectral unmixing was performed as follows: (i) To normalize the PA signals at each wavelength, the PA amplitude of each pixel in the PA image was divided by the average PA amplitude of the corresponding normal tissue. (ii) A symmetric 2D median filter with a size of 50-by-10 (axial-by-lateral, corresponding to 1-by-3 mm2, which is smaller than the minimum area of 25 mm2) was applied to the normalized PA images to smooth pixel-wise mismatch. The 50-by-10 kernel size was determined to produce the best sO2 classification result (Supplementary Fig. S5). Although the classification performances for various kernel sizes are very similar, the 50-by-10 showed better results in specificity and AUC. (iii) The multispectral 2D PA images are reshaped as 1D vector and are solved for the least-square solution with a nonnegative constraint function as follows:

formula

where |\varepsilon _n^{{c_m}}$| is the absorption cross-section of the mth component at the nth pixel. |\hat{\mu }_{{\lambda _k}}^{{c_m}}$| is the normalized optical absorption coefficient of the mth component at the kth wavelength. |PA_n^{{\lambda _k}}$| is the normalized PA amplitude with the kth wavelength in the nth pixel. |{{\bm{f}}_n}$| is the weighting function, which returns a positive constant value |\alpha $| when the absorption cross-section is negative. By using the unmixed value of the oxy- and deoxyhemoglobin and total hemoglobin, relative sO2 for each pixel was calculated as follows:

formula

where HbO2 is the unmixed oxyhemoglobin value, HbR is the unmixed deoxyhemoglobin value, and HbT is the total hemoglobin value. The relative sO2 level for a patient was calculated as the average value of the top 50% of the relative sO2 data for each pixel. We also quantified the pixel-wise distribution from the top 50% of the relative sO2 pixels by calculating the skew angle of the line that connects the middle of the horizontal axis and the peak point of the Gaussian-fitted distribution (Supplementary Fig. S3B).

Statistical analysis

For each parameter (i.e., PA spectral gradient, relative sO2 level, and skew angle in sO2 distribution), we investigated the statistical difference between PTC and benign nodules. To quantify the statistical difference, we calculated the P value with the Student t test, assuming that the parameters for both nodules were in a normal distribution. First, we excluded data points that were at the outside of 99.3% of the normal distribution. However, those data were not excluded in the subsequent processing.

We also investigated the classification performance of each parameter by calculating the ROC curves. We set threshold values ranging from the minimum to the maximum of the data, then calculated the sensitivity and specificity for each threshold value. We considered the histologic pathology results as the gold standard for those patients who had undergone thyroidectomy (i.e., Group I patients and patient #49 in Group II). For those who did not have a histopathologic examination (i.e., the rest of the Group II patients), we considered the FNA biopsy as the true value. After calculating the empirical ROC curve, we also calculated the binomial and 95% confidence ROC curves by assuming both nodules were in normal distributions. The AUC was also quantified by calculating the area under the binomial ROC curves. The sensitivity and specificity at the critical point of the binomial ROC curves were used to represent the classification result for each parameter. All the processing was performed using custom-designed software (Matlab R2020b, Mathworks).

Multiparametric classification

We obtained validation results for the multiparametric classification using a support vector machine (SVM) and 5-fold leave-p-out cross-validation (49–52). The SVM was implemented using the scikit-learn package version 0.20.4 in a custom script written in Python 3.9.5. Specifically, the scikit-learn C-Support vector classification (svm.SVC) algorithm was used. We set the parameters as follows: the regularization parameter C was set to one (the default value), the class weight was set to “balanced” to account for the imbalanced dataset, and a linear kernel was used for its simplicity; all other parameter values were left at defaults. In the cross-validation procedure, we randomly divided the patients into five subgroups. The division was performed using the KFold tool in the scikit-learn Python package. First, we trained a decision function using the first four of the five subgroups (e.g., subgroup I–IV; 80% of the data), then tested the trained decision function on the remaining subgroup (e.g., subgroup V; 20% of the data). In the next step, we trained another decision function using another four of five subgroups (e.g., subgroups I, III–V; 80% of the data), then tested the remaining subgroup (e.g., subgroup II; 20% of the data). We repeated this procedure for a total of five times, setting each subgroup in turn as a test set. We used the C-support vector classification formulation of the SVM optimization problem in which a hyperplane is defined to optimize the following equation (53).

formula

where |m$| is the number of training data, |{\rm{\ }}{{\bm{x}}_{\bm{i}}} \in {\mathbb{R}^n}$| is a training point, |n$| is the number of parameters, |{\bm{w}}$| is the weighting factor for each parameter, |b$| is a constant offset, |{y_i} \in \{ { - 1,{\rm{\ }}1} \}$| is a sample's class (−1 for benign and 1 for PTC), |C \gt 0{\rm{\ }}$|is a regularization parameter, and |\xi $| is a nonnegative variable. After the training, the test data were classified as:

formula

where |{{\bm{x}}_t}$| is the test data.

Real-time photoacoustic and ultrasound imaging

We used the PAUSI system (38) equipped with a tunable laser to acquire PA and US images (Fig. 1A). The design of the imaging probe was optimized on the basis of Monte Carlo simulation of the optical fluence in biological tissues (39). We used a custom-designed single fiber bundle to reduce the imaging probe size for better maneuverability by clinicians. To avoid PA signal reflection artifacts, our design included a 30-mm gap between the transducer surface and the skin surface. This gap determines the effective imaging depth of the system. Because of its dark color, the transducer surface generates PA waves, thus reflective artifacts exist at the same distance between the transducer and skin surface. We determined the effective imaging depth from the performance benchmarks of the system, where an approximately 20 dB signal-to-noise ratio was achieved at approximately 30 mm depth (38). Finally, we sealed the imaging probe with a disposable membrane, which is clinically certified for US imaging during surgery, then poured water onto the membrane for acoustic coupling. The system configuration is fully described in Materials and Methods.

Figure 1.

System configuration and real-time operation of the clinical PA and US imaging system. A, Photograph of the system. The inset shows the photograph and the cross-sectional schematic of the imaging probe. B, The real-time operation sequence of the system. BF, beamforming; DAS, delay-and-sum; Demod., frequency demodulation; FB, fiber bundle; GPU, graphics processing unit; Log Comp., log compression; PA, photoacoustic; PRT, pulse repetition time; RF, radiofrequency data; SPCD, spatial compounding; TR, transducer; US, ultrasound.

Figure 1.

System configuration and real-time operation of the clinical PA and US imaging system. A, Photograph of the system. The inset shows the photograph and the cross-sectional schematic of the imaging probe. B, The real-time operation sequence of the system. BF, beamforming; DAS, delay-and-sum; Demod., frequency demodulation; FB, fiber bundle; GPU, graphics processing unit; Log Comp., log compression; PA, photoacoustic; PRT, pulse repetition time; RF, radiofrequency data; SPCD, spatial compounding; TR, transducer; US, ultrasound.

Close modal

To acquire PA and US images in real time, the laser and the US machine were synchronized with a customized operation sequence (Fig. 1B). When the laser generates a short (∼4.5 ns) pulse beam with a repetition rate of 10 Hz, it also sends a trigger signal to the US machine. Because the PAUSI system is equipped with a 64-channel data acquisition board, two laser pulses are required to acquire data from the 128 transducer elements, resulting in a PAI rate of 5 Hz. When the US machine receives the trigger signal, it acquires the generated PA signals. Typical B-mode USI is performed immediately after the PAI acquisition. Once the full dataset required for an entire image is acquired and stored in the flash memory buffer, it undergoes several image-processing steps. The data acquisition and processing procedures are detailed in the Methods section and Supplementary Figures (Supplementary Figs. S1 and S3).

Patient selection and clinical workflow

Between January 2019 and January 2020, we screened patients for suspicious thyroid nodules. Our patient selection workflow is shown in Fig. 2. Yellow and green boxes denote examinations and results, respectively. Two cohorts of patients were enrolled: (i) Group I patients whose PTC nodules had been confirmed by FNA and were scheduled to have a thyroidectomy, and (ii) Group II patients whose thyroid nodules had been found in the US images. In all, 52 patients (49.7 ± 15.7 years; 21 Group I patients and 31 Group II patients) met the criteria for both size (≥6 mm; a single nodule size that clinicians could both find and maintain in position for imaging with the PAUSI system) and depth (≤30 mm; an effective PAI imaging depth) and were enrolled for PAI (Table 1). On the basis of FNA, 23 patients (44.1 ± 15.3 years) were confirmed to have PTC nodules, and 29 patients (54.2 ± 14.9 years) were benign. The ATA guidelines were assessed by the physicians, based on US images acquired by sonographers using conventional US machines. In addition, the nodule sizes were measured by sonographers at the largest length of the nodule among several longitudinal and transverse views acquired using conventional US machines (Supplementary Table S1). More detailed characteristics and surgical pathology findings of the enrolled patients are shown in Supplementary Table S1. All of the 21 Group I patients had surgery, and PTC was confirmed by surgical pathology. For all 31 Group II patients, the FNA results were used as ground truth. Although one patient (patient #9) was confirmed to have PTC from FNA, that patient moved to another hospital and thus their surgical pathology result was not available. Because the nodules in the other 29 patients in Group II were found to be benign from FNA, their histopathologic results were not available either. For those who had the surgical operation, we performed molecular tests for BRAF and TERT promoter mutations using the excised tissues for prognostic information. Most of the patients (18/22) showed positive for BRAF mutation, which matched with the previously reported prevalence in the PTC nodules (54, 55). TERT promoter mutation testing was performed on 18 of 22 patients with PTC, and all were found to be negative, indicating a good prognosis after surgery (9).

Figure 2.

Patient selection and study workflow. Yellow and green boxes denote examinations and results, respectively. n, the number of patients. ATA, American Thyroid Association; FNA, fine-needle aspiration; PA, photoacoustic; PTC, papillary thyroid cancer; US, ultrasound.

Figure 2.

Patient selection and study workflow. Yellow and green boxes denote examinations and results, respectively. n, the number of patients. ATA, American Thyroid Association; FNA, fine-needle aspiration; PA, photoacoustic; PTC, papillary thyroid cancer; US, ultrasound.

Close modal
Table 1.

Characteristics of enrolled patients.

GenderAgeGroupaFNA resultATAb GuidelineDepthc [mm]Finald pathology
65 PTC 24 PTC 
35 PTC 16 PTC 
58 II Benign 12 N/A 
67 PTC 25 PTC 
17 PTC 19 PTC 
55 PTC 24 PTC 
26 PTC 19 PTC 
60 II Benign 25 N/A 
44 II PTC 19 N/Ae 
10 53 PTC 21 PTC 
11 42 PTC 22 PTC 
12 59 PTC 19 PTC 
13 39 PTC 10 PTC 
14 71 II Benign N/A 
15 71 II Benign 11 N/A 
16 65 II Benign 22 N/A 
17 38 PTC 21 PTC 
18 49 II Benign 18 N/A 
19 60 PTC 17 PTC 
20 61 PTC 18 PTC 
21 53 II Benign 22 N/A 
22 59 PTC 10 PTC 
23 45 II Benign 20 N/A 
24 56 PTC 20 PTC 
25 31 PTC 19 PTC 
26 41 PTC 20 PTC 
27 58 II Benign 22 N/A 
28 28 PTC 21 PTC 
29 61 II Benign 21 N/A 
30 31 II Benign 19 N/A 
31 50 II Benign 19 N/A 
32 37 II Benign 18 N/A 
33 90 II Benign 17 N/A 
34 38 II Benign 18 N/A 
35 76 II Benign 17 N/A 
36 40 II Benign 18 N/A 
37 56 PTC 22 PTC 
38 16 PTC 20 PTC 
39 39 II Benign 18 N/A 
40 48 II Benign 20 N/A 
41 29 II Benign 17 N/A 
42 42 II Benign 13 N/A 
43 36 II Benign 16 N/A 
44 67 II Benign 22 N/A 
45 61 II Benign 22 N/A 
46 76 II Benign 18 N/A 
47 60 II Benign 20 N/A 
48 47 II Benign 16 N/A 
49 36 II PTC 23 PTC 
50 62 II Benign 17 N/A 
51 52 II Benign 24 N/A 
52 30 PTC 22 PTC 
GenderAgeGroupaFNA resultATAb GuidelineDepthc [mm]Finald pathology
65 PTC 24 PTC 
35 PTC 16 PTC 
58 II Benign 12 N/A 
67 PTC 25 PTC 
17 PTC 19 PTC 
55 PTC 24 PTC 
26 PTC 19 PTC 
60 II Benign 25 N/A 
44 II PTC 19 N/Ae 
10 53 PTC 21 PTC 
11 42 PTC 22 PTC 
12 59 PTC 19 PTC 
13 39 PTC 10 PTC 
14 71 II Benign N/A 
15 71 II Benign 11 N/A 
16 65 II Benign 22 N/A 
17 38 PTC 21 PTC 
18 49 II Benign 18 N/A 
19 60 PTC 17 PTC 
20 61 PTC 18 PTC 
21 53 II Benign 22 N/A 
22 59 PTC 10 PTC 
23 45 II Benign 20 N/A 
24 56 PTC 20 PTC 
25 31 PTC 19 PTC 
26 41 PTC 20 PTC 
27 58 II Benign 22 N/A 
28 28 PTC 21 PTC 
29 61 II Benign 21 N/A 
30 31 II Benign 19 N/A 
31 50 II Benign 19 N/A 
32 37 II Benign 18 N/A 
33 90 II Benign 17 N/A 
34 38 II Benign 18 N/A 
35 76 II Benign 17 N/A 
36 40 II Benign 18 N/A 
37 56 PTC 22 PTC 
38 16 PTC 20 PTC 
39 39 II Benign 18 N/A 
40 48 II Benign 20 N/A 
41 29 II Benign 17 N/A 
42 42 II Benign 13 N/A 
43 36 II Benign 16 N/A 
44 67 II Benign 22 N/A 
45 61 II Benign 22 N/A 
46 76 II Benign 18 N/A 
47 60 II Benign 20 N/A 
48 47 II Benign 16 N/A 
49 36 II PTC 23 PTC 
50 62 II Benign 17 N/A 
51 52 II Benign 24 N/A 
52 30 PTC 22 PTC 

Abbreviations: F, female; M, male; N/A, not available.

aI, hospitalized patients for thyroidectomy; II, outpatients in the clinic.

b1, benign; 2, very low suspicion; 3, low suspicion; 4, intermediate suspicion; 5, high suspicion.

cMeasured during PAUS imaging.

dOnly available for patients who had surgery.

ePatient #9 did not have surgery.

Multispectral photoacoustic images and hemoglobin oxygen saturation of the thyroid nodules

The in vivo PAUS images and multispectral PA signals acquired from one benign patient (#27) and one PTC patient (#9) are shown in Fig. 3. In addition to the PA amplitude, we investigate the relative sO2 level in the nodule area because we hypothesized that the PTC nodules would have lower sO2 levels than the benign nodules because of tumor hypoxia, similar to general cancerous nodules (56–58). The B-mode US images are represented in grayscale, while the PA amplitude and sO2 images are overlaid in pseudocolors. The US images clearly visualize critical structures like the carotid arteries (CA), thyroid lobes (TH; white dashed lines in Fig. 3A), and thyroid nodule boundaries (ND; yellow dashed lines in Fig. 3A). In particular, the thyroid nodule boundaries were selected by the physicians, who considered the clarity of the nodule border and the presence of normal thyroid tissue for PA data processing. Supplementary Figure S6 shows all the nodule boundaries in the US images of corresponding patients. The depth in Table 1 was quantified as the distance between the skin surface and the deepest part of the thyroid nodule boundary in the US images. For better visualization of PA amplitude and sO2 map, the close-up images are shown in Fig. 3A (red dashed boxes), while the original PA images and sO2 map for five wavelengths are shown in Supplementary Fig. S7. For comparison, Supplementary Figs. S8 and S9 show several combinations of wavelengths for spectral unmixing. Most of the combinations exhibit statistically significant differences between PTC and benign nodules. The best empirical ROC curve is obtained for a combination of all the multispectral PA data from five wavelengths. Thus, we used the five wavelengths for spectral unmixing of the oxy- and deoxyhemoglobin components. We used nonnegative spectral unmixing, which is described in the Materials and Methods section. In particular, the true optical fluence within deep biological tissues is required to calculate the absolute value of sO2. However, because the optical attenuation in tissue differs according to the optical wavelength, calculating the absolute value of sO2 is challenging (59). Therefore, we calculated the relative value of sO2 after fluence normalization at the same depth for each wavelength. The color-coded sO2 signals are overlaid on the nodule areas (Fig. 3A). In the example cases, the relative sO2 value in the benign nodule (60 ± 12 a.u.) is much higher than that in the PTC nodule (37 ± 9 a.u.), which agrees well with our hypothesis of tumor hypoxia. We also plot the changes in the PA signals over five wavelengths to investigate the PA spectral responses (i.e., optical spectrum along the wavelengths) in the benign and PTC nodules (Fig. 3B). The quantified data for all benign and PTC nodules are shown in Supplementary Fig. S10A and S10B. In the PTC nodules, the slope of the first-order polynomial fitted line (named the PA spectral gradient) has a decreasing trend with increasing wavelength. In contrast, for benign nodules, the slope remains relatively constant. Also, from the histograms and Gaussian-fitted distribution (solid black line in Fig. 3C) of the sO2 distributions, we quantified the skewness of the histogram by calculating the skew angle (θs in Fig. 3C) of the line that connects the middle of the horizontal axis and the peak point of the Gaussian-fitted distribution. The difference in the skew angles for the benign (1.75 rad) and PTC (1.50 rad) nodules is noticeable. The colors in the histograms represent the corresponding sO2 value. The benign nodules mostly have a higher skew angle than the PTC nodules. Supplementary Figure S11 shows sO2 histograms and Gaussian-fitted distributions for all the patients. Histograms in Supplementary Figs. S3C and S11 were obtained from one representative PA image among multiple PA images of each patient. In our investigation, not only sO2 but also the spectral gradient and the skewness are associated with the oxygen level in the blood, making them suitable for exploring the hypothesis of tumor hypoxia. In addition, the relationship between sO2 and the skew angle was analyzed by a linear regression model, and the R2 was 0.78 (0.75 for physician #2), confirming a high correlation (Supplementary Fig. S12).

Figure 3.

In vivo PAUS images and multispectral analyses of patients with a benign nodule and PTC. A, US, overlaid PAUS, and sO2 images of selected benign and patients with PTC. B, Quantified multispectral PA amplitudes within the benign and PTC nodules. Red dashed lines represent the first-order polynomial fit of the averaged PA amplitudes at each wavelength. C, Histograms of sO2 levels in the nodule areas from the cases. The colors represent sO2 level, and the black solid lines represent Gaussian-fitted distribution. CA, carotid artery; ND, nodule; PA, photoacoustic; PTC, papillary thyroid cancer; sO2, hemoglobin oxygen saturation; TH, thyroid; US, ultrasound; θs, skew angle of the histogram.

Figure 3.

In vivo PAUS images and multispectral analyses of patients with a benign nodule and PTC. A, US, overlaid PAUS, and sO2 images of selected benign and patients with PTC. B, Quantified multispectral PA amplitudes within the benign and PTC nodules. Red dashed lines represent the first-order polynomial fit of the averaged PA amplitudes at each wavelength. C, Histograms of sO2 levels in the nodule areas from the cases. The colors represent sO2 level, and the black solid lines represent Gaussian-fitted distribution. CA, carotid artery; ND, nodule; PA, photoacoustic; PTC, papillary thyroid cancer; sO2, hemoglobin oxygen saturation; TH, thyroid; US, ultrasound; θs, skew angle of the histogram.

Close modal

Multiparametric photoacoustic analysis of thyroid nodules

From the observations of the multispectral PA signals, we used several multispectral signal features to classify the nodules as benign or PTC. The averaged PA spectral gradients of the PTC and benign nodules are shown in Fig. 4A. The overall trends of the PA spectral gradients are matched from the cases in Fig. 3. The PA spectral gradients of the benign nodules are relatively small, while those of the PTC nodules are more negative than those of benign nodules. The difference is statistically significant with a P value of <0.001 (Fig. 4A). The AUC of the ROC curve is 0.69 (Fig. 4B), and the sensitivity and specificity are 87% and 48%, respectively.

Figure 4.

Multispectral PA analysis for differentiating thyroid nodules based on the nodule boundaries determined by physician #1. A and B, PA spectral gradients of the thyroid nodules. C and D, Relative sO2 levels of the thyroid nodules. E and F, The skew angles of the sO2 distributions in the nodule areas. A, C, and E, Student t test results with P values. B, D, and F, Corresponding ROC curves with sensitivities and specificities at the critical points of binomial ROC. Data points (red x) that are at the outside of 99.3% of the normal distribution are determined as outliers and are excluded for P value calculation (but are included in other processing). G, The 3D illustrations of the values with the classification results. H, Classification results of the thyroid nodules based on SVM decision function. The negative values of the SVM decision function denote data points classified as benign. I, ATAP scores and classification results of the nodules. The red shaded areas denote the decision region for PTC, while the blue shaded areas denote the decision region for benign. ***, P < 0.001. ATAP, the American Thyroid Association and the photoacoustic probability of PTC; B, benign; PA, photoacoustic; PTC, papillary thyroid cancer; Se, sensitivity; sO2, hemoglobin oxygen saturation; Sp, specificity.

Figure 4.

Multispectral PA analysis for differentiating thyroid nodules based on the nodule boundaries determined by physician #1. A and B, PA spectral gradients of the thyroid nodules. C and D, Relative sO2 levels of the thyroid nodules. E and F, The skew angles of the sO2 distributions in the nodule areas. A, C, and E, Student t test results with P values. B, D, and F, Corresponding ROC curves with sensitivities and specificities at the critical points of binomial ROC. Data points (red x) that are at the outside of 99.3% of the normal distribution are determined as outliers and are excluded for P value calculation (but are included in other processing). G, The 3D illustrations of the values with the classification results. H, Classification results of the thyroid nodules based on SVM decision function. The negative values of the SVM decision function denote data points classified as benign. I, ATAP scores and classification results of the nodules. The red shaded areas denote the decision region for PTC, while the blue shaded areas denote the decision region for benign. ***, P < 0.001. ATAP, the American Thyroid Association and the photoacoustic probability of PTC; B, benign; PA, photoacoustic; PTC, papillary thyroid cancer; Se, sensitivity; sO2, hemoglobin oxygen saturation; Sp, specificity.

Close modal

We spectrally unmixed the multispectral PA data to isolate the hemoglobin components (e.g., total hemoglobin, oxyhemoglobin, deoxyhemoglobin; Supplementary Fig. S13), and then calculated the relative sO2 levels. The PTC nodules show a lower sO2 level (54 ± 8 a.u.) than the benign nodules (61 ± 7 a.u.), with a P value of <0.001 (Fig. 4C). The ROC curve shows an AUC of 0.77 (Fig. 4D). At the critical point of the fitted binomial ROC curve, a sensitivity of 66% and a specificity of 75% are achieved.

In addition, to verify our previous observation of the differentiable distribution of sO2 in benign and PTC nodules, we quantified the skewness of the histogram by evaluating the asymmetry of the Gaussian-fitted distribution. We calculated the skew angles of the histograms and assessed the statistical differences (Fig. 4E). The difference between benign and PTC nodules is statistically significant, with a P value of <0.001. The ROC curves with the skew angle show an AUC of 0.81 (Fig. 4F), with a sensitivity of 80% and a specificity of 68%.

To further enhance the sensitivity and specificity, we conducted multiparametric analyses using all three parameters together. We placed all the data points into three-dimensional space and explored the boundary between them for better classification (Fig. 4G). To automate the classification process as well as to approximate how well the classifier would perform on unseen data, we defined a linear classifier using 5-fold cross-validation to train a SVM (49). Most benign and PTC nodules were correctly classified (unfilled markers in Fig. 4G), while a few were misclassified (filled markers in Fig. 4G). To visualize the classification results, we also plotted the value of the SVM decision function for each point (Fig. 4H). In Fig. 4G and H, the red circles represent the PTC nodules, while the blue triangles represent the benign nodules. Negative values of the SVM decision function denote data points classified as benign. The results show enhanced classification, with a sensitivity of 78% and a specificity of 93%. To test the robustness of the boundary selection, we repeated the multiparametric PA analyses described above, using nodule boundaries determined by another physician during the off-line process (Supplementary Fig. S14A–S14I). The results (a sensitivity of 74% and a specificity of 90%) match well with those of the first physician.

In addition, we quantified the significance of each parameter by comparing classification results with various combinations of parameters (Supplementary Table S2). It turns out that all the combinations of parameters show accuracy higher than 80%, so we cannot determine the rank of the significances. We noted a statistical difference in the spectrally unmixed oxyhemoglobin values between the PTC and benign cases (Supplementary Fig. S13). However, the multiparametric classification result of a four-parameter SVM (i.e., PA spectral gradient, relative sO2, skew angle in sO2 distribution, and oxyhemoglobin component) was worse (sensitivity of 74% and specificity of 86%) than that of the three-parameter SVM.

Classification of PTC and benign nodules

We wanted to use the results of the multiparametric analysis to establish a potential classification method with higher specificity (93%) than conventional USI-based methods (∼20%–50%), so we calculated the probability of PTC based on the logistic regression model of SVM decision function values (Supplementary Fig. S15) and combined it with the ATA Guidelines. By combining the highly sensitive (i.e., ATA Guidelines) and high-specific (i.e., multiparametric PA analysis) methods, we proposed a complementary classification method. We name the combination the ATAP (P for photoacoustic) score, and define it as follows:

formula

where 0 ≤ γ ≤ 1 is the weight of the ATA Guidelines. To determine the optimal weight, we investigated the performance of the ATAP score using all the possible weights (Supplementary Fig. S16). From the analyses, γ = 0.2 shows the best accuracy of 88% (87% for physician #2), with enhanced sensitivity of 83% while preserving specificity (yellow lines in Supplementary Fig. S16). The sensitivity, specificity, and accuracy are quantified by using an FNA suggestion method defined as follows:

formula

All the quantified values of the parameter, probability, and the ATAP score are shown in Supplementary Table S3 (for physician #1) and Supplementary Table S4 (for physician #2). The ATAP score suggests that FNA need not be performed for most of the 29 benign cases (27 from physician #1, 26 from physician #2), which shows the potential to minimize unnecessary FNA biopsies (Fig. 4I; Supplementary Fig. S14I). The ATAP score corrects the misclassified cases of the SVM decision function (blue rows in Supplementary Tables S3 and S4) and achieves enhanced sensitivity (83% from both physicians) while the specificity is maintained (93% and 90% from physician #1 and #2, respectively). However, there are still false negative and false positive cases (yellow rows in Supplementary Tables S3–S5). Among the false positive cases, patients #36 and #42 have benign nodules, but the sO2 values are unusually low (40 ± 10 and 46 ± 11, respectively). The sizes of those nodules in USI had rapidly increased (more than 1 cm compared with the previously measured size) within 12 months for patient #36 and 8 months for patient #42. This fact may imply that the growth rate of the nodule was related to this low sO2 level. Note that we observed only these two cases of rapidly growing benign nodules in our patient population, and thus future study with more cases is required to confirm this. Moreover, we confirmed that when γ was set to 0.41 (an accuracy of 75% and 77% for physician #1 and #2, respectively), a specificity of 55% (59% from physician #2) could be secured while maintaining a sensitivity of 100% (Supplementary Fig. S16, black lines). Despite the lesser accuracy, this specificity is more than 3-fold higher than the specificity (17.3%) of the conventional ATA guidelines (14), with a sensitivity of 100%.

To differentiate PTC from benign thyroid nodules, we presented a multispectral PAUSI system. We successfully visualized real-time PAUS images of thyroid nodules (23 PTC and 29 benign patients) in vivo, collected multispectral PA data, and classified the nodules offline. To distinguish benign and PTC nodules, we applied several multispectral analysis methods and quantified three parameters. They are highly related, but their specific meanings are different based on the quantification procedure. PA spectral gradient, the multispectral response of the averaged PA value; relative sO2 level, the averaged value of the pixel-wise multispectral response; and skew angle of sO2, the pixel-wise distribution of the pixel-wise multispectral response. All the single-parameter analyses showed encouraging results with statistically differentiable distributions. To enhance the classification accuracy, we used multiple parameters together and defined a new scoring method, the ATAP score, by weighting the probability of PTC and the ATA Guidelines. The ATAP score indicated 83% sensitivity and 93% specificity for physician #1 and 83% sensitivity and 90% specificity for physician #2. These classification metrics are comparable with those of the existing US elastography (sensitivities of ∼70%–85% and specificities of ∼70%–95%). This result implies that these two modalities (multispectral PAI for metabolic information and US elastography for mechanical information), when combined, can complement each other, achieving synergistic benefits. Further studies are needed to confirm this hypothesis.

By selecting the most highly correlated US images, we could minimize the effect of motion perturbation. Intraoperator reproducibility was tested by collecting multiple datasets separated by 1–2 minutes from each patient (patients #14–52, Supplementary Table S6) and comparing their relative sO2 levels (Supplementary Fig. S17). Most of the calculated relative sO2 levels are within ±10% of the mean value. We did not intentionally exclude any data, aside from patients' data that were not available because of irrecoverable errors, including the wrong positioning of the image plane, lack of normal tissue area, and unexpected system errors.

When the nodule sizes Supplementary Table S1 are analyzed, they show a statistical difference between benign nodules and PTC nodules (Supplementary Fig. S18). This difference is an inevitable result of the clinical workflow. In the clinical procedure, when a nodule is smaller than a certain threshold [<6 mm in the Korean Thyroid Association (KTA) guideline (60), and <1 cm in the ATA guideline (9)], FNA is not be performed unless the nodule is determined highly malignant. That said, according to the ATA or KTA guidelines, if the degree of malignancy is high, FNA is performed even if the size is small. If the FNA determines that the nodule is malignant, it is excised promptly. In other words, malignant nodules are often surgically removed before they become large. However, if the degree of malignancy is not high, the nodule's growth is monitored, and FNA is usually performed only when it reaches 1–2 cm. Therefore, the benign nodules identified by FNA are often large. Thus, irrespective of our intentions, this clinical workflow leads to a difference in size between benign and PTC nodules (61). In our study, nodule size was used only as a patient-recruiting criterion: it was not used for data processing at all. To determine whether the difference in ATAP scores between the two groups (benign and PTC) was because of size, ANCOVA was performed on the ATAP scores of both groups, with size as a covariate (Supplementary Table S7). There was no statistically significant effect of size on ATAP (P = 0.157). However, the ATAP score was statistically significantly higher in the PTC group (P < 0.001), with an effect size (⁠|{\eta ^2}$|⁠) of 0.396.

In this study, we acknowledge some limitations. First of all, the histopathologic results of nodules from Group II patients (29 benign and 2 patients with PTC) were not available because the 29 benign patients did not have surgery and 2 patients with PTC moved to other hospitals. Thus, FNA results were used as ground truth for those patients. Because the FNA may have inaccuracies, our results might have been impacted by this limitation. Because this was the first test of the feasibility of using multispectral PA to analyze thyroid nodules, we recruited only patients with PTC who met certain criteria for nodule sizes and depths. There are approximately 12% of other malignant thyroid types, such as follicular thyroid, medullary thyroid, and anaplastic thyroid cancers. Multispectral PA analyses and comparisons between different types of thyroid cancer are the next steps of the study. Furthermore, detailed studies of inter- and intraphysician agreement are needed for clinical translation. To use the PAUSI machine as a clinical triaging tool, we need to further update our system in several regards: (i) processing the multispectral parameters in real time, (ii) improving the quality of the US image so that the ATA Guidelines can be quantified using the PAUSI system, (iii) improving the integrated imaging probe for easier hand-held operation, and (iv) implementing US elastography for integrated analysis.

Although we corrected some misclassified cases by using the ATAP score, there are still four false negative and two false positive cases (three false positive cases for physician #2) that result in the relatively low sensitivity, a concern for clinical translation as a triaging tool. The low sensitivity may be because of the limitation of our parameters based on endogenous chromophores, especially oxy- and deoxyhemoglobin. Further study with additional parameters, including the amplitude, distribution, and temporal change of contrast-enhanced PA signals in nodules using exogenous contrast agents, may improve the classification accuracy.

In this study, we did not quantify the vascularity of the nodules, but rather their sO2, based on the B-mode (i.e., cross-sectional) images that correspond to the position of FNA. To quantify the vascularity, volumetric data would be required, which was not obtained in this study. However, because we calculated the sO2 by multispectral analysis of the series of B-mode images, changes in vascularity might not affect the result. As a next step of the study, we can acquire volumetric data for vascularity assessment by modifying the recently developed 3D hand-held scanner (45).

Despite these limitations, this is the first multiparametric analysis of multispectral PA data of thyroid nodules with statistical significance. We believe the approach presented here, when combined with existing USI technology, offers two benefits: (i) It provides better screening of patients, thus reducing overdiagnosis and unnecessary FNA biopsies. (ii) It increases the accuracy of FNA by localizing an area of interest during the biopsy.

No disclosures were reported.

J. Kim: Conceptualization, data curation, software, formal analysis, supervision, validation, investigation, visualization, methodology, writing–original draft, writing–review and editing. B. Park: Conceptualization, data curation, software, formal analysis, validation, investigation, visualization, methodology, writing–original draft, writing–review and editing. J. Ha: Conceptualization, formal analysis, validation, investigation, methodology, writing–original draft. I. Steinberg: Software. S.M. Hooper: Software, formal analysis. C. Jeong: Investigation. E.-Y. Park: Investigation. W. Choi: Investigation. T. Liang: Software. J.S. Bae: Investigation. R. Managuli: Investigation, writing–review and editing. Y. Kim: Investigation, writing–review and editing. S.S. Gambhir: Conceptualization, investigation, writing–review and editing. D.-J. Lim: Conceptualization, resources, formal analysis, funding acquisition, investigation, writing–original draft, project administration, writing–review and editing. C. Kim: Conceptualization, resources, formal analysis, supervision, funding acquisition, validation, investigation, writing–original draft, project administration, writing–review and editing.

This research was supported by the following sources: the National Research Foundation (NRF) of Korea grants (2020R1A6A1A03047902, 2020M3H2A1078045, 2018R1D1A1A02045433, 2019R1A2C2006269, BK21 FOUR Projects, and 2021R1A5A1032937) and the Artificial Intelligence Graduate School Program funded by the Ministry of Science and ICT (2019-0-01906).

The costs of publication of this article were defrayed, in part, by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1.
Vaccarella
S
,
Dal Maso
L
,
Laversanne
M
,
Bray
F
,
Plummer
M
,
Franceschi
S
. 
The impact of diagnostic changes on the rise in thyroid cancer incidence: a population-based study in selected high-resource countries
.
Thyroid
2015
;
25
:
1127
36
.
2.
Horn-Ross
PL
,
Lichtensztajn
DY
,
Clarke
CA
,
Dosiou
C
,
Oakley-Girvan
I
,
Reynolds
P
, et al
Continued rapid increase in thyroid cancer incidence in California: trends by patient, tumor, and neighborhood characteristics
.
Cancer Epidemiol Biomarkers Prev
2014
;
23
:
1067
79
.
3.
Ferlay
J
,
Colombet
M
,
Soerjomataram
I
,
Mathers
C
,
Parkin
D
,
Piñeros
M
, et al
Estimating the global cancer incidence and mortality in 2018: GLOBOCAN Sources and Methods
.
Int J Cancer
2019
;
144
:
1941
53
.
4.
Davies
L
,
Welch
HG
. 
Increasing incidence of thyroid cancer in the United States, 1973–2002
.
JAMA
2006
;
295
:
2164
7
.
5.
Vaccarella
S
,
Franceschi
S
,
Bray
F
,
Wild
CP
,
Plummer
M
,
Dal Maso
L
. 
Worldwide thyroid-cancer epidemic? The increasing impact of overdiagnosis
.
N Engl J Med
2016
;
375
:
614
7
.
6.
Ahn
HS
,
Kim
HJ
,
Welch
HG
. 
Korea's thyroid-cancer “epidemic”—screening and overdiagnosis
.
N Engl J Med
2014
;
371
:
1765
7
.
7.
Lubitz
CC
,
Kong
CY
,
McMahon
PM
,
Daniels
GH
,
Chen
Y
,
Economopoulos
KP
, et al
Annual financial impact of well-differentiated thyroid cancer care in the United States
.
Cancer
2014
;
120
:
1345
52
.
8.
Cooper
DS
,
Doherty
GM
,
Haugen
BR
,
Kloos
RT
,
Lee
SL
,
Mandel
SJ
, et al
Management guidelines for patients with thyroid nodules and differentiated thyroid cancer: the American Thyroid Association Guidelines Taskforce
.
Thyroid
2006
;
16
:
109
42
.
9.
Haugen
BR
,
Alexander
EK
,
Bible
KC
,
Doherty
GM
,
Mandel
SJ
,
Nikiforov
YE
, et al
2015 American Thyroid Association Management Guidelines for adult patients with thyroid nodules and differentiated thyroid cancer: The American Thyroid Association Guidelines Task Force on Thyroid Nodules and Differentiated Thyroid Cancer
.
Thyroid
2016
;
26
:
1
133
.
10.
Perros
P
,
Boelaert
K
,
Colley
S
,
Evans
C
,
Evans
RM
,
Gerrard BA
G
, et al
Guidelines for the management of thyroid cancer
.
Clin Endocrinol
2014
;
81
:
1
122
.
11.
Kwak
JY
,
Han
KH
,
Yoon
JH
,
Moon
HJ
,
Son
EJ
,
Park
SH
, et al
Thyroid imaging reporting and data system for US features of nodules: a step in establishing better stratification of cancer risk
.
Radiology
2011
;
260
:
892
9
.
12.
Moon
W-J
,
Jung
SL
,
Lee
JH
,
Na
DG
,
Baek
J-H
,
Lee
YH
, et al
Benign and malignant thyroid nodules: US differentiation—multicenter retrospective study
.
Radiology
2008
;
247
:
762
70
.
13.
Frates
MC
,
Benson
CB
,
Charboneau
JW
,
Cibas
ES
,
Clark
OH
,
Coleman
BG
, et al
Management of thyroid nodules detected at US: society of radiologists in ultrasound consensus conference statement
.
Radiology
2005
;
237
:
794
800
.
14.
Chng
CL
,
Tan
HC
,
Too
CW
,
Lim
WY
,
Chiam
PPS
,
Zhu
L
, et al
Diagnostic performance of ATA, BTA and TIRADS sonographic patterns in the prediction of malignancy in histologically proven thyroid nodules
.
Singapore Med J
2018
;
59
:
578
.
15.
Grani
G
,
Lamartina
L
,
Ascoli
V
,
Bosco
D
,
Biffoni
M
,
Giacomelli
L
, et al
Reducing the number of unnecessary thyroid biopsies while improving diagnostic accuracy: toward the “right” TIRADS
.
J Clin Endocrinol Metab
2019
;
104
:
95
102
.
16.
Lyshchik
A
,
Higashi
T
,
Asato
R
,
Tanaka
S
,
Ito
J
,
Mai
JJ
, et al
Thyroid gland tumor diagnosis at US elastography
.
Radiology
2005
;
237
:
202
11
.
17.
Park
AY
,
Son
EJ
,
Han
K
,
Youk
JH
,
Kim
J-A
,
Park
CS
. 
Shear wave elastography of thyroid nodules for the prediction of malignancy in a large scale study
.
Eur J Radiol
2015
;
84
:
407
12
.
18.
Zhao
C-K
,
Xu
H-X
. 
Ultrasound elastography of the thyroid: principles and current status
.
Ultrasonography
2019
;
38
:
106
.
19.
Cosgrove
D
,
Barr
R
,
Bojunga
J
,
Cantisani
V
,
Chammas
MC
,
Dighe
M
. 
WFUMB guidelines and recommendations on the clinical use of ultrasound elastography: part 4. Thyroid
.
Ultrasound Med Biol
2017
;
43
:
4
26
.
20.
Dighe
M
,
Bae
U
,
Richardson
ML
,
Dubinsky
TJ
,
Minoshima
S
,
Kim
Y
. 
Differential diagnosis of thyroid nodules with US elastography using carotid artery pulsation
.
Radiology
2008
;
248
:
662
9
.
21.
Lim
D-J
,
Luo
S
,
Kim
M-H
,
Ko
S-H
,
Kim
Y
. 
Interobserver agreement and intraobserver reproducibility in thyroid ultrasound elastography
.
Am J Roentgenol
2012
;
198
:
896
901
.
22.
Wojakowska
A
,
Chekan
M
,
Widlak
P
,
Pietrowska
M
. 
Application of metabolomics in thyroid cancer research
.
Int J Endocrinol
2015
;
2015
:
1
13
.
23.
Wang
LV
,
Hu
S
. 
Photoacoustic tomography: in vivo imaging from organelles to organs
.
Science
2012
;
335
:
1458
62
.
24.
Kim
C
,
Favazza
C
,
Wang
LV
. 
In vivo photoacoustic tomography of chemicals: high-resolution functional and molecular optical imaging at new depths
.
Chem Rev
2010
;
110
:
2756
82
.
25.
Zhang
Y
,
Jeon
M
,
Rich
LJ
,
Hong
H
,
Geng
J
,
Zhang
Y
, et al
Non-invasive multimodal functional imaging of the intestine with frozen micellar naphthalocyanines
.
Nat Nanotechnol
2014
;
9
:
631
8
.
26.
Jung
H
,
Park
S
,
Gunassekaran
GR
,
Jeon
M
,
Cho
Y-E
,
Baek
M-C
, et al
A peptide probe enables pphotoacoustic-guided imaging and drug delivery to lung tumors in K-rasLA2 mutant mice
.
Cancer Res
2019
;
79
:
4271
82
.
27.
Zhang
HF
,
Maslov
K
,
Stoica
G
,
Wang
LV
. 
Functional photoacoustic microscopy for high-resolution and noninvasive in vivo imaging
.
Nat Biotechnol
2006
;
24
:
848
51
.
28.
Choi
C
,
Ahn
J
,
Kim
C
. 
Intravascular photothermal strain imaging for lipid detection
.
Sensors
2018
;
18
:
3609
.
29.
Kim
J
,
Kim
YH
,
Park
B
,
Seo
HM
,
Bang
CH
,
Park
GS
, et al
Multispectral ex vivo photoacoustic imaging of cutaneous melanoma for better selection of the excision margin
.
Br J Dermatol
2018
;
179
:
780
2
.
30.
Lin
L
,
Hu
P
,
Shi
J
,
Appleton
CM
,
Maslov
K
,
Li
L
, et al
Single-breath-hold photoacoustic computed tomography of the breast
.
Nat Commun
2018
;
9
:
2352
.
31.
Knieling
F
,
Neufert
C
,
Hartmann
A
,
Claussen
J
,
Urich
A
,
Egger
C
, et al
Multispectral optoacoustic tomography for assessment of Crohn's disease activity
.
N Engl J Med
2017
;
376
:
1292
4
.
32.
Steinberg
I
,
Huland
DM
,
Vermesh
O
,
Frostig
HE
,
Tummers
WS
,
Gambhir
SS
. 
Photoacoustic clinical imaging
.
Photoacoustics
2019
;
14
:
77
98
.
33.
Dogra
VS
,
Chinni
BK
,
Valluru
KS
,
Moalem
J
,
Giampoli
EJ
,
Evans
K
, et al
Preliminary results of ex vivo multispectral photoacoustic imaging in the management of thyroid cancer
.
Am J Roentgenol
2014
;
202
:
W552
8
.
34.
Dima
A
,
Ntziachristos
V
. 
In-vivo handheld optoacoustic tomography of the human thyroid
.
Photoacoustics
2016
;
4
:
65
9
.
35.
Yang
M
,
Zhao
L
,
He
X
,
Su
N
,
Zhao
C
,
Tang
H
, et al
Photoacoustic/ultrasound dual imaging of human thyroid cancers: an initial clinical study
.
Biomedical Optics Express
2017
;
8
:
3449
57
.
36.
Roll
W
,
Markwardt
NA
,
Masthoff
M
,
Helfen
A
,
Claussen
J
,
Eisenblätter
M
, et al
Multispectral optoacoustic tomography of benign and malignant thyroid disorders - a pilot study
.
J Nucl Med
2019
;
60
:
1461
6
.
37.
Mao
Y
,
Xing
M
. 
Recent incidences and differential trends of thyroid cancer in the USA
.
Endocr Relat Cancer
2016
;
23
:
313
22
.
38.
Kim
J
,
Park
S
,
Jung
Y
,
Chang
S
,
Park
J
,
Zhang
Y
, et al
Programmable real-time clinical photoacoustic and ultrasound imaging system
.
Sci Rep
2016
;
6
:
35137
.
39.
Kim
J
,
Park
E-Y
,
Park
B
,
Choi
W
,
Lee
KJ
,
Kim
C
. 
Towards clinical photoacoustic and ultrasound imaging: probe improvement and real-time graphical user interface
.
Exp Biol Med
2020
;
245
:
321
9
.
40.
American National Standards Institute
Vol. ANSI Z136.1
. Available from: https://www.lia.org/resources/laser-safety-information/laser-safety-standards/ansi-z136-standards/z136-1
41.
Huda
K
,
Wu
C
,
Sider
JG
,
Bayer
CL
. 
Spherical-view photoacoustic tomography for monitoring in vivo placental function
.
Photoacoustics
2020
;
20
:
100209
.
42.
Lawrence
DJ
,
Escott
ME
,
Myers
L
,
Intapad
S
,
Lindsey
SH
,
Bayer
CL
. 
Spectral photoacoustic imaging to estimate in vivo placental oxygenation during preeclampsia
.
Sci Rep
2019
;
9
:
1
8
.
43.
Ulrich
L
,
Held
KG
,
Jaeger
M
,
Frenz
M
,
Akarçay
HG
. 
Reliability assessment for blood oxygen saturation levels measured with optoacoustic imaging
.
J Biomed Opt
2020
;
25
:
046005
.
44.
Balasundaram
G
,
Ding
L
,
Li
X
,
Attia
ABE
,
Dean-Ben
XL
,
Ho
CJH
, et al
Noninvasive anatomical and functional imaging of orthotopic glioblastoma development and therapy using multispectral optoacoustic tomography
.
Transl Oncol
2018
;
11
:
1251
8
.
45.
Park
B
,
Bang
CH
,
Lee
C
,
Han
JH
,
Choi
W
,
Kim
J
, et al
3D wide-field multispectral photoacoustic imaging of human melanomas in vivo: a pilot study
.
J Eur Acad Dermatol Venereol
2021
;
35
:
669
76
.
46.
Oraevsky
A
,
Clingman
B
,
Zalev
J
,
Stavros
A
,
Yang
W
,
Parikh
J
. 
Clinical optoacoustic imaging combined with ultrasound for coregistered functional and anatomical mapping of breast tumors
.
Photoacoustics
2018
;
12
:
30
45
.
47.
Lee
C
,
Choi
W
,
Kim
J
,
Kim
C
. 
Three-dimensional clinical handheld photoacoustic/ultrasound scanner
.
Photoacoustics
2020
;
18
:
100173
.
48.
Regensburger
AP
,
Fonteyne
LM
,
Jüngert
J
,
Wagner
AL
,
Gerhalter
T
,
Nagel
AM
, et al
Detection of collagens by multispectral optoacoustic tomography as an imaging biomarker for Duchenne muscular dystrophy
.
Nat Med
2019
;
25
:
1905
15
.
49.
Chang
C-C
,
Lin
C-J
. 
LIBSVM: a library for support vector machines
.
ACM Trans Intell Syst Technol
2011
;
2
:
27
.
50.
Shalev-Shwartz
S
,
Ben-David
S
.
Understanding machine learning: from theory to algorithms
.
New York, NY
:
Cambridge University Press
; 
2014
.
51.
Celisse
A
. 
Optimal cross-validation in density estimation with the L2-loss
.
Ann Stat
2014
;
42
:
1879
910
.
52.
Hastie
T
,
Tibshirani
R
,
Friedman
J
.
The elements of statistical learning: data mining, inference, and prediction
.
New York, NY
:
Springer Science & Business Media
; 
2009
.
53.
Pedregosa
F
,
Varoquaux
G
,
Gramfort
A
,
Michel
V
,
Thirion
B
,
Grisel
O
, et al
Scikit-learn: machine learning in python
.
J Mach Learn Res
2011
;
12
:
2825
30
.
54.
Cohen
Y
,
Xing
M
,
Mambo
E
,
Guo
Z
,
Wu
G
,
Trink
B
, et al
BRAF mutation in papillary thyroid carcinoma
.
J Natl Cancer Inst
2003
;
95
:
625
7
.
55.
Song
YS
,
Park
YJ
. 
Genomic characterization of differentiated thyroid carcinoma
.
Endocrinol Metab
2019
;
34
:
1
10
.
56.
Kim
M-H
,
Ko
SH
,
Bae
J-S
,
Lee
S-H
,
Jung
C-K
,
Lim
D-J
, et al
Non–FDG-avid primary papillary thyroid carcinoma may not differ from FDG-avid papillary thyroid carcinoma
.
Thyroid
2013
;
23
:
1452
60
.
57.
Hockel
M
,
Vaupel
P
. 
Tumor hypoxia: definitions and current clinical, biologic, and molecular aspects
.
J Natl Cancer Inst
2001
;
93
:
266
76
.
58.
Vaupel
P
,
Mayer
A
,
Höckel
M
.
In
:
Sen
C
,
Semenza
G
,
editors
.
Methods in enzymology.
vol.
381
Amsterdam, Net
herlands:
Elsevier
; 
2004
.
p.
335
54
.
59.
Hussain
A
,
Petersen
W
,
Staley
J
,
Hondebrink
E
,
Steenbergen
W
. 
Quantitative blood oxygen saturation imaging using combined photoacoustics and acousto-optics
.
Opt Lett
2016
;
41
:
1720
3
.
60.
Shin
JH
,
Baek
JH
,
Chung
J
,
Ha
EJ
,
Kim
J-h
,
Lee
YH
, et al
Ultrasonography diagnosis and imaging-based management of thyroid nodules: revised Korean Society of Thyroid Radiology consensus statement and recommendations
.
Korean J Radiol
2016
;
17
:
370
.
61.
Kamran
SC
,
Marqusee
E
,
Kim
MI
,
Frates
MC
,
Ritner
J
,
Peters
H
, et al
Thyroid nodule size and prediction of cancer
.
J Clin Endocrinol Metab
2013
;
98
:
564
70
.