An automatic peak deconvolution code for Raman spectra of carbonaceous material and a revised geothermometer for intermediate-to moderately high-grade metamorphism

Carbonaceous material (CM) undergoes progressive changes that reflect its thermal history. These changes are in general irreversible and provide valuable information for understanding diagenetic and metamorphic processes of crustal rocks. Among various approaches to quantify these changes, the R2 ratio, area ratio of specific peaks in CM Raman spectra, is widely used to estimate the maximum temperature of intermediate-to moderately high-grade metamorphism. The calculation of the R2 ratio requires peak deconvolution of the original spectrum, and the results depend on the details of how this is carried out. However, a clear protocol for selecting appropriate initial conditions has not been established and obtaining a reliable temperature estimate depends at least in part on the experience and skill of the operator. In this study, we developed a Python code that automatically calculates the R2 ratio from CM Raman spectra. Our code produces R2 ratios that are generally in good agreement with those of Aoya et al. (J Metamorph Geol 28:895–914, 2010, https:// doi. org/ 10. 1111/j. 1525-1314. 2010. 00896.x) for the same Raman data, with much less time and effort than was the case in the previous studies. We have confirmed that the code is also applicable to other previous datasets from both contact and regional metamorphic regions. The overall trend of the recalculated data indicates that samples with R2 greater than ~ 0.7 are not sensitive to the changes in CM maturity and thus should not be used for the calibration of an R2-based geothermometer. We propose a modified geothermometer for contact metamorphism that is strictly applicable to samples with R2 from 0.023 to 0.516, with the proviso that a laser with a wavelength of 532 nm should be used. A slight extrapolation of the newly proposed geothermometer up to R2 of 0.57 provides a temperature estimate that is consistent with the geother-mometer of Kaneki and Kouketsu (Island Arc 31:e12467, 2022; https:// doi. org/ 10. 1111/ iar. 12467); the boundary between the two geothermometers corresponds to a temperature of 391 °C.


Introduction
The thermal history of crustal rocks provides important insights into the diagenetic and metamorphic processes of the Earth's interior.Carbonaceous material (CM) derived from organic matter is a common constituent of sediments and metasedimentary rocks.CM undergoes irreversible changes in its organochemical and crystallographic features-referred to collectively as its maturity-as temperature increases.Quantification of these features of CM has been widely used as a proxy for the thermal history experienced by the host rock.The maturity of CM has been investigated using numerous different analytical techniques such as X-ray diffraction analysis, Raman spectrometry, and elemental analysis (e.g., Buseck and Huang 1985;French 1964;Ganz and Kalkreuth 1987;McKirby and Powell 1974;Seifert 1978;Sweeney and Burnham 1990;Tuinstra and Koenig 1970;van Krevelen 1993).The changes in CM maturity are generally thought to depend on time and temperature alone and be irreversible.However, deformation and in particular brittle deformation may also influence the results (e.g., Nakamura et al. 2015).
Over the past three decades, the use of Raman spectroscopy has dramatically increased in popularity among researchers in the geosciences (see Fig. 1 in Henry et al. 2019 andKouketsu 2023).A Raman spectrum of CM typically exhibits two bands at wavenumbers of 1355 cm -1 (disordered band) and 1575 cm -1 (graphite band), which are attributed to the disordered and ordered graphitic structures, respectively (Tuinstra and Koenig 1970).Some Raman parameters associated with these two peaks, such as width, position, intensity, and area, have been reported to show close correlation with the maturity of CM (e.g., Beyssac et al. 2002;Kouketsu et al. 2014;Schito et al. 2017;Roberts et al. 1995), as reviewed by Henry et al. (2019).Beyssac et al. (2002) proposed the first empirical geothermometer to estimate the maximum temperature T during regional metamorphism from the area ratio of specific peaks, i.e., the R2 ratio, as follows: R2 was defined as where A D1 , A D2 , and A G are the areas of the D1-, D2-, and G-bands.Following their pioneering work, various empirical geothermometers based on Raman spectra of CM have been developed for terrestrial rocks (e.g., Aoya et al. 2010;Kouketsu et al. 2014;Lahfid et al. 2010;Lünsdorf et al. 2017;Rahl et al. 2005).Mori et al. (2015aMori et al. ( , 2017) ) performed thermal modeling of contact metamorphism (2) and concluded that the heating duration required for CM maturation to attain steady state for a given temperature is greater than one hundred years.Throughout the present paper, we assume that such steady state conditions have been achieved for all samples.The idea of estimating T from CM Raman spectra has been extended to fault rocks (e.g., Hirono et al. 2015;Kaneki et al. 2016;Mukoyoshi et al. 2018) and extraterrestrial rocks (e.g., Busemann et al. 2007;Cody et al. 2008;Homma et al. 2015), although the direct application of the geothermometers to these types of rocks may not be appropriate, due to large differences in the environments where CM matures (e.g., Homma et al. 2015;Kouketsu et al. 2017).
Of the CM-based geothermometers that have been proposed (e.g., Aoya et al. 2010;Beyssac et al. 2002;Kouketsu et al. 2014;Lahfid et al. 2010;Lünsdorf et al. 2017;Rahl et al. 2005), those of Kouketsu et al. (2014) and Aoya et al. (2010) (hereafter, referred to as K2014 and A2010, respectively) are among the most widely used.Numerous studies have used one of these geothermometers, or a combination, to estimate T of the sam- ples of interest (e.g., Bonneville et al. 2020;Cavalazzi et al. 2021;Hickman-Lewis et al. 2020;Mori et al. 2015a, b;Nakamura et al. 2019;Shimura et al. 2021;Yamaoka et al. 2022).The geothermometers of K2014 and A2010 are based on the widths of specific peaks and R2 , respec- tively, and have distinct but complementary applicable temperature ranges: for K2014 it is 150-400 °C and for A2010 it is 340-655 °C.Although both approaches require peak deconvolution of the measured Raman spectra, the criteria for setting the initial conditions for nonlinear least-squares fitting are not clearly presented in the original papers, meaning temperature estimates could vary depending on the skill and experience of the analysts.Kaneki and Kouketsu (2022) (hereafter, referred to as KK2022) addressed this issue by developing a Python code that automatically performs peak deconvolution to calculate the widths of the D1-and D2-bands.The main objective of the present study is to develop a similar code to calculate R2.
We first describe the fitting procedures implemented in our code to reproduce R2 of A2010 for the same data- set.The results obtained are then compared with those reported by A2010 for the validation of the code.Finally, we assess the utility of our proposed approach including discussion of its limitations based on results of applying it to datasets other than those of A2010.

Methods
In the development of our code, we reanalyzed the same Raman spectral dataset as that measured and analyzed in A2010.The basic structure of the code is similar to that of Fitting D in KK2022.Note that we developed our code by trial and error to reproduce the analytical results of A2010 as closely as possible for the same dataset.For this purpose, we tried to follow the analytical procedures of A2010 as closely as possible (e.g., fitting function, order of a baseline, number of peaks, etc.).

Data
We first briefly describe the samples and spectral data of A2010.Readers can refer to A2010 for more detailed information, including their experimental setups.Since the present study focuses on the development, validation, and application of our new code, we will not examine the procedure and methodology used in the original experiments reported in A2010.A total of ten analyzed samples was studied, of which two, K08 and N33, were excluded from the final geothermometer of the present study (Table 1).The detailed reasons for this selection are discussed in Sect.4.3.All samples experienced contact metamorphism, and their independently estimated T range from 340 ± 25 °C to 655 ± 25 °C.A2010 measured at least 50 spectra for each sample and a total of 809 spectra.Most of the spectra show two distinct peaks of the disordered and graphite bands (Fig. 1a), except for highly matured CM.
A2010 examined the effect of measurement conditions on the analytical results by varying the magnification of the objective lens (50 × or 100 ×), sample type Fig. 1 Data processing procedures of our code for calculating R2 from a raw Raman spectrum.One of the spectra of the N29 sample is shown here as an example.a Raw spectrum, linear baseline, and baseline-corrected spectrum; b normalized baseline-corrected spectrum (referred to as 'measured spectrum' in this paper); c initial conditions of calculated spectrum; and d calculated spectrum after nonlinear least-squares fitting.Gray areas in a indicate the spectral ranges used to determine a linear baseline.The coefficients of determination R 2 in c and d were calculated using Eq.(8) (thin section or polished chip), and incident angle of laser beam (normal or parallel to the c-axis of the crystal structure of graphite).They concluded that these variations in the experimental conditions had a limited effect on the calculated R2 (Fig. 8 in A2010).

Fitting function
A2010 employed a Voigtian, V , in their fitting proce- dures, which is defined as the convolution of a Gaussian, G , and a Lorentzian, L , as follows: where ω is the center position, A is the area, and σ and γ represent the widths of G and L , respectively.G and L are expressed as Although we need to determine the values of the arguments of V as an initial condition for nonlinear least- squares fitting, it is difficult to constrain σ and γ from a raw Raman spectrum.Some previous studies (e.g., Kouketsu et al. 2014;Lünsdorf and Lünsdorf 2016) employed a pseudo-Voigtian, PV , for peak deconvolution.One can define a typical PV as follows: where η is the mixing ratio of G and L , and Ŵ is full width at half maximum (FWHM).PV in Eq. ( 6) takes its maxi- mum intensity I at x = ω as follows: Note that the first approximation to all arguments of PV can be easily determined from a raw Raman spec- trum.Since we are interested in the area ratio and the difference in A between V and PV is less than 1% (Fig. S1), we concluded that the usage of PV instead of V should have negligible effect on the calculated R2.

Background correction and normalization
Background correction is required for each raw spectrum prior to peak deconvolution.Although it is obvious that A2010 employed a linear baseline, the details of the algorithm used to describe it are not clearly presented.In our code, we determined the slope and intercept of the linear baseline by a linear least-squares method using the raw spectral data in the ranges 1100-1150 cm -1 and 1700-1750 cm -1 (Fig. 1a).However, for samples with a high degree of maturity and thus low effects of fluorescence, a small bulge sometimes appears around 1100 cm -1 (attributable to the overlapping of spectra from quartz or calcite grains with the spectra from CM), which makes this procedure inapplicable (Fig. S2a).In this case, it is better to (3) describe the linear baseline using the data in the ranges 1200-1250 cm -1 and 1700-1750 cm -1 (Fig. S2b).We defined a 'mature' spectrum as one with the maximum intensity at 1320-1380 cm -1 (disordered band) less than half that at 1570-1600 cm -1 (graphite band) after the baseline correction.If a given spectrum satisfies this criterion, the linear baseline is described using the spectral data in the ranges 1200-1250 cm -1 and 1700-1750 cm -1 , rather than 1100-1150 cm -1 and 1700-1750 cm -1 .The baseline-corrected spectrum was then normalized such that its maximum intensity in the range 1100-1750 cm -1 becomes unity (Fig. 1b).This normalized baseline-corrected spectrum is referred to as the 'measured spectrum' in the present study, and we used this spectrum for the peak deconvolution.

Initial conditions and peak deconvolution
To perform the nonlinear least-squares fitting of the measured spectrum, the initial conditions for each peak should be determined.The approach to setting the initial conditions in our code is similar to that of Fitting D in KK2022.The D1-, D2-, D3-, and G-bands of PV are assumed to be always present in all the measured spectra.The maximum intensity of the measured spectrum in the range 1320-1380 cm -1 and the associated wavenumber are employed as the initial values for I and ω of the D1-band, respectively.The same analysis is applied to the G-band in the range 1570-1600 cm -1 .The initial Ŵ of the D1-band is set as the distance between two wavenumbers where the intensities are half the initial I in the ranges from 1200 cm -1 to initial ω and from ini- tial ω to 1500 cm -1 .That of the G-band is set to twice the distance between initial ω and the wavenumber at the point where the intensity is half the initial I in the range from 1500 cm -1 to initial ω .These approaches to deter- mining appropriate values for the initial Ŵ allow us to analyze Raman spectra of CM with different maturities using a single code, eliminating the subjectivity owing to the selection of the fitting method (a detailed discussion of this point can be found in Sect.4.3 in KK2022).The D1-and G-bands are initially assumed to be a pure Lorentzian, and thus η = 1 .To determine the initial con- ditions of the other two bands, we used the residual spectrum after subtracting the initial D1-and G-bands from the measured spectrum (Fig. S3).For the D2-band, 90% of the maximum intensity of the residual spectrum in the range 1610-1630 cm -1 and the associated wavenumber are employed as the initial values for I and ω , respec- tively.The same approach is employed for the D3-band in the range 1500-1540 cm -1 .The initial Ŵ of the D2-band is set to twice the distance between the initial ω and the wavenumber where the intensity is half the initial I in the range from initial ω to 1700 cm -1 .That of the D3-band is set to 130 cm -1 , as in Fitting D of KK2022.The D2-and D3-bands are initially assumed to be half Gaussian and half Lorentzian, and thus η = 0.5 .We calculated the ini- tial A values of these four bands using Eq. ( 7).
After determining the initial conditions of the calculated spectrum (Fig. 1c), we performed nonlinear leastsquares fitting on the measured spectrum in the spectral range 1100-1750 cm -1 (Fig. 1d).We used scipy.optimize.curve_fit function with the Trust Region Reflective algorithm (SciPy v1.10.1), which is applicable even when the bounds on the model parameters are given.We defined the cost function as half the sum of squares of the residual of the measured spectrum after subtracting the calculated spectrum.The maximum number of iterations was set to 10,000.All values of tolerances during the iterations were set to 10 -8 .Ŵ and A of the four bands can take any positive values.The value of η can move from zero to unity.The ω values can vary 1300-1400 cm -1 , 1600- 1650 cm -1 , 1450-1550 cm -1 , and 1550-1600 cm -1 for the D1-, D2-, D3-, and G-bands, respectively.The coefficient of determination, R 2 , was calculated for each fitting curve: where I meas i and I calc i are the intensities of the measured and calculated spectra, and I calc is the mean intensity of the calculated spectrum in the range 1100-1750 cm -1 .
For each sample, we calculated the values of the mean, standard deviation (SD), and standard error (SE) of R2 of the calculated spectra after optimization.As in A2010, we then excluded the data with R2 deviating by more than ± 2SD from the mean value.After this treatment, we recalculated the mean, SD, and SE values.

Performance of the code
Figure 2 shows examples of the initial conditions together with the measured spectra for each sample.The initial fitting curves show a good match with the measured data with R 2 > 0.97 .After optimization, the R 2 of the calcu- lated spectra exceed 0.99 regardless of the sample (Fig. 3).Since the fitting is performed automatically, the same analytical results are obtained for the same spectral data, irrespective of the analyst.Our code typically analyzes one spectrum in less than one second using an ordinary (8) laptop computer and thus greatly reduces the time and effort required to arrive at a result.

Comparison with Aoya et al. (2010)
Figure 4 compares R2 of the present study with those reported by A2010.Error bars indicate the SD.As described in Sect.2.1, A2010 investigates the effect of measurement conditions on the calculated R2 by vary- ing the sample type, magnification of an objective lens, and incident angle of laser beam.Figure 4a shows the comparisons of R2 for each measurement condition.Our code-generated results are generally in good agreement with those of A2010 within the range of SD, regardless of the measurement conditions.When R2 exceeds ~ 0.6, the D2-and G-bands were not well decomposed for some samples (e.g., K08 in Figs. 2 and 3), and the results of the present study showed significantly larger R2 than those of A2010 with differences exceeding one SD.These trends also hold even when the data measured under different conditions are summed and then analyzed to obtain representative R2 for each sample (Fig. 4b).These results are consistent with the negligible effects on R2 by varying the measurement conditions as examined by A2010.All data are summarized in Table 1.

Modified geothermometer
Instead of a linear model proposed by Beyssac et al. (2002) (Eq.( 1)), A2010 presents a quadratic equation to express the relationship between R2 and T: Ideally, R2 calculated using our code should be suitable to use Eq. ( 9) for estimating T .However, there are dif- ferences in methodology of the present study that make this simple approach inappropriate.(1) The present study uses a more limited sample set than A2010 with the data not only for N33 but also for K08 removed when defining the calibration curve.(2) Our code-generated R2 were not identical to those of A2010 even for the same spectra (Fig. 4 and Table 1).(3) The uncertainties in both R2 and T should be taken into consideration when formulating the calibration curve.Due to these differences, we concluded that the previous geothermometer (Eq.9) needs to be modified when employing our code, which is the main issue addressed in this subsection.
Since the calibration data have uncertainties in both R2 and T (Table 1), the coefficients of the quadratic func- tion cannot be calculated as a unique solution to a linear equation.Therefore, we performed the Deming regression (Text S1), which is applicable to a regression problem that requires iterative treatments to optimize the model parameters.Samples K08 and N33 were excluded from our regression analysis, owing to their inappropriateness as the calibration samples (see Sect. 4.3 for the detailed discussion).Following KK2022, we assume that the errors in T of the calibration samples can be repre- sented by the SE.We employed SE rather than SD as the uncertainties of each data point, since SE is required to perform the Deming regression (Text S1).Finally, the regression curve obtained is expressed as follows: The geothermometer of Eq. ( 10) is strictly only applicable to samples with R2 from 0.023 to 0.516 (Table 1).Figure 5 shows the regression curve (Eq.( 10)), its 95% prediction interval (Text S2), and the data points used to calculate them (Table 1).Although A2010 defined (10) T [ • C] = 393.6(R2) 2 − 716.9(R2) + 671.8. the uncertainty of Eq. ( 9) as the maximum difference between the estimated and known T of the calibra- tion samples, we employed the prediction interval for the uncertainty of Eq. ( 10) owing to its clear statistical definition.Therefore, a direct comparison of the uncertainties between Eqs. ( 9) and ( 10) is inappropriate.The regression curve of Eq. ( 10) shows a very good fit to the data with R 2 = 0.998 , which is similar to that of A2010 ( R 2 = 0.995 ).The optimization of the quadratic func- tion by the iterative treatment affected the results of temperature estimation only modestly, with ~ 1 °C difference in T between the regression curves after 0th and 10th iterations (Fig. S4).At a given R2 , the calculated T using Eq. ( 10) differs by a maximum of ~ 10 °C from the results of using Eq. ( 9) (Fig. 6), which is similar to the 95% prediction interval of Eq. ( 10) (~ 15 °C).Note that Fig. 2 Examples of the measured spectra and initial conditions of the calculated spectra for each sample.The coefficients of determination R 2 were calculated using Eq. ( 8) Eq. ( 10) was established using the samples which underwent contact metamorphism, and its direct application to the samples which experienced other processes including regional metamorphism may not be appropriate, although our code may be useful to establish a new R2-based geothermometer for these situations.

Deviation from Aoya et al. (2010)
R2 of the K08 sample in the present study was larger than that of A2010 by an amount that exceeds one SD (Fig. 4 and Table 1).Let us assume that a measured spectrum is perfectly explained by a calculated spectrum, i.e., R 2 in Eq. ( 8) is strictly unity.It then follows from Eq. ( 2) that where A all represents the entire area of the measured spectrum in the spectral range 1100-1750 cm -1 and thus takes a constant value.The Raman spectra of the K08 sample typically show a clear D4-band at around 1200 cm -1 and a strong 'saddle' intensity at around 1500 cm -1 (e.g., Figs. 2 and 3).This indicates that the optimized A D1 and A D3 values may be strongly influenced by which algorithm was employed to perform the peak deconvolution.Therefore, the observed difference in R2 between these two studies may be qualitatively explained Fig. 3 Examples of the measured spectra and the calculated spectra after optimization for each sample.The coefficients of determination R 2 were calculated using Eq. ( 8) if the optimization algorithm of the present study yields larger A D1 and A D3 than A2010.

Applicability of the code to other datasets
Since our code was developed by trial and error to reproduce R2 of A2010 as closely as possible, its applicability to datasets other than those of A2010 should be examined.We reanalyzed the raw Raman spectral data of the natural samples measured and analyzed by Kouketsu et al. (2019Kouketsu et al. ( , 2021)), Shimura et al. (2021), andYamaoka et al. (2022).These studies measured the Raman spectra of natural CM under the same experimental conditions as A2010, performed peak deconvolution by manually setting the initial conditions, and calculated R2 .R2 of the samples to be reanalyzed were reported to range from 0.066 ± 0.041 to 0.655 ± 0.021 (Table S1).
The comparison of R2 obtained in the previous and pre- sent studies is shown in Fig. 7. Error bars represent the SD.All the data obtained are summarized in Table S1.R2 of the present study generally agree with those of the previous studies within the margin of SD.However, we note that when R2 is greater than ~ 0.6, our code yielded signif- icantly larger R2 than the previous studies exceeding one SD, something which was also recognized for the samples used in A2010 (Fig. 4).These results clearly indicate that our code is applicable to datasets other than those of A2010, although the code-generated results significantly differ from the previous analysis when R2 > ∼ 0.6.

Sensitivity of R2 as a proxy for CM maturity and its implications for use as a geothermometer
K2014 reported that FWHM of the D1-band, Ŵ D1 , lin- early decreases with increasing T while it shows almost no change after reaching ~ 40 cm -1 .Therefore, we can consider Ŵ D1 > 40 cm −1 as a necessary condition that Ŵ D1 has its sensitivity to changes in CM maturity.This subsection focuses on the similar sensitivity test for R2  Regression curve 95% prediction interval Fig. 5 Geothermometer modified from that of Aoya et al. (2010).R2 calculated using our code are plotted against T experienced by each of the samples examined by Aoya et al. (2010), together with the obtained geothermometer (Eq.10) and its 95% prediction interval.All plotted data are summarized in Table 1.The black points indicate the results for the samples K08 and N33, which were excluded from the regression analysis (see Sect. 4.3 for the detailed reasons).Error bars for R2 are the standard error, which are almost hidden by the data points owing to their small values.Error bars in T were also taken as the standard errors.The coefficient of determination R 2 was calculated using Eq. ( 8) and discusses its implications for practical temperature estimation.
We reexamined the raw spectral data of Aoya et al. (2010), Kouketsu et al. (2019Kouketsu et al. ( , 2021)), Nakamura et al. (2019), Shimura et al. (2021), andYamaoka et al. (2022).R2 was calculated using our code, while Ŵ D1 was deter- mined using the code for Fitting E of KK2022.Figure 8 compares R2 and Ŵ D1 for each sample.Figure 8b shows an enlarged view of the gray area in Fig. 8a.Error bars indicate the SD.The large SD of Ŵ D1 of highly matured CM with R2 of ~ 0.1 were attributed to the disappear- ance of the D1-band (e.g., K04 in Figs. 2 and 3).All the data obtained are summarized in Table S2.The results obtained show the same trend regardless of datasets (Fig. 8a).Within the region for low-grade material (upper right part in Fig. 8a), Ŵ D1 decreases significantly with increasing CM maturity while R2 shows almost constant values of 0.6-0.8.R2 starts to decrease from ~ 0.7 toward zero when Ŵ D1 decreases to ~ 55 cm -1 (Fig. 8b).After Ŵ D1 reaches ~ 40 cm -1 at R2 of ~ 0.55, only R2 shows a system- atic decrease with CM maturity.Based on these results, we suggest that R2 < ∼ 0.7 is a necessary condition for R2 to be sensitive to changes in CM maturity.In other words, samples with R2 ≥ ∼ 0.7 should not be employed in the calibration of an R2-based geothermometer.This is the reason why our regression curve (Eq.10) did not include the K08 sample ( R2 = 0.737 ± 0.029 ), which ena- bles our geothermometer to yield better results around R2 = 0.516 than that of A2010 (Eq.9) who included the K08 sample (Fig. 6).Since very low-grade CM sometimes exhibits R2 of ~ 0.7 (e.g., dataset of Nakamura et al. 2019 in Fig. 8a), the condition R2 < ∼ 0.7 alone can be mis- leading for the assessment of CM maturity.It is important to check the consistency of the results obtained with other observations, including the strong influence of fluorescence (e.g., Schito et al. 2017) and the existence of the D4-band (e.g., Kouketsu et al. 2014), both of which are absent in the spectra of intermediate-grade CM (Fig. S5).
As described in Sect.3.3, our modified geothermometer is determined in the range 0.023 ≤ R2 ≤ 0.516 (Fig. 5 and Table 1).However, the geothermometers based on Ŵ D1 , including that of KK2022, are applicable only when  Aoya et al. (2010).R2 calculated using our code are plotted against those reported in the previous studies.All plotted data are summarized in Table S1.The gray line indicates exact agreement between the values calculated in the previous and present studies.Error bars indicate the standard deviation R2 exceeds ~ 0.55, owing to the insensitivity of Ŵ D1 to changes in CM maturity for lower R2 (Fig. 8).Therefore, in a strict sense, a combination of the code-based geothermometers of KK2022 and the present study is incapable of estimating T of samples with R2 from 0.516 to ~ 0.55 and Ŵ D1 of ~ 40 cm -1 .A straightforward solution to this prob- lem would be to refine our modified geothermometer further by considering additional calibration data of samples whose T are known, Ŵ D1 are ~ 40 cm -1 , and R2 are larger than ~ 0.55 while smaller than ~ 0.7.Although the N33 sample almost satisfies this requirement (Table 1), the bimodal distribution in its R2 histogram (Fig. S6) indicates its inappropriateness as a calibration sample.An alternative approach is to simply extrapolate the applicable range of Eq. ( 10), as implicitly done in some previous studies (e.g., Kaneki and Kouketsu 2022;Kouketsu et al. 2014;Rahl et al. 2005).In this case, we need to consider an appropriate extent for the extrapolation of our geothermometer.One potential clue can be obtained when we try to smoothly connect the geothermometers of KK2022 and the present study on the data trend shown in Fig. 8.The Ŵ D1 -based geo- thermometer of KK2022 is By combining Eqs. ( 10) and ( 12), we obtain (13) Ŵ D1 cm −1 = −171.1(R2) 2 + 311.7(R2) − 80.8.
When the calculated (R2, Ŵ D1 ) satisfy Eq. ( 13), the two geothermometers yield the same T .For example, if we assume the connecting T is 391 °C, the correspond- ing R2 and Ŵ D1 are 0.57 and 41.3 cm -1 , respectively (Eqs.( 10) and ( 12)).It is noteworthy that the point (R2, Ŵ D1 ) = 0.57, 41.3 cm −1 (indicated by star sym- bol in Fig. 8b) plots well within the data trend, which is drawn by a number of recalculated data both from contact and regional metamorphisms.Therefore, the slight extrapolation of our geothermometer up to R2 = 0.57 (within the standard deviation of the lowest-T sample for our calibration curve, N30) provides a consistent switching to the lower-T geothermometer of KK2022, with a boundary corresponding to T = 391 • C .In addition, set- ting R2 = 0.57 as a boundary between the two geother- mometers enables an unambiguous classification of all the data in Fig. 8 into one of two groups: using our geothermometer (Eq.( 10)) if R2 ≤ 0.57 and using KK2022 (Eq.( 12)) if R2 > 0.57.

Effect of laser wavelength
Differences in the laser wavelength employed in the Raman system can significantly affect the spectral intensity and subsequent analytical results (e.g., Wang et al. 1990;Ferrari and Robertson 2001).K2014 and KK2022 demonstrated that for the Raman data of a specific sample measured under different experimental conditions (i.e., spectrometer, laser wavelength, pinhole size, and grating), some representative Raman parameters (i.e., Ŵ ,  S2. R2 were calculated using our code, while Ŵ D1 were determined using the code for Fitting E of Kaneki and Kouketsu (2022).A close-up of the gray area in a is shown in b.The black-dashed line in b indicates the proposed upper limit R2 = 0.57 for the extrapolation of our geothermometer (Eq.10).When the calculated (R2, Ŵ D1 ) satisfy Eq. ( 13) (indicated by the gray line in b), the geothermometers of Kaneki and Kouketsu (2022) (Eq.( 12)) and the present study (Eq.( 10)) yield the same T .Star symbol in b represents the point (R2, Ŵ D1 ) = 0.57, 41.3 cm −1 where the two geothermometers yield T = 391 °C.Error bars indicate the standard deviation ω , intensity ratio, and area ratio) showed consistent val- ues when the laser wavelength was the same, while the parameters other than Ŵ differed significantly for differ- ent laser wavelengths.These results clearly suggest that the effect of the laser wavelength on the spectral parameters dominates over the other variable experimental conditions.Since these two studies analyzed the data of the sample with T = 165 • C and 330 • C , which is outside the temperature range of the calibration samples used in our regression analysis, this subsection investigates the dependence of the analytical results using our code on the laser wavelength for samples with higher T.
We measured the Raman spectra of the N30 sample (T = 410 ± 30 • C ) under the same experimental con- ditions as for A2010, except that we used a laser with a wavelength of 633 nm instead of the original 532 nm.Using our code, we calculated the following six Raman parameters: intensity ratio of the D1-band to the G-band (i.e., R1 ratio), R2 , Ŵ of the D1-and G-bands, and ω of the D1-and G-bands.The results obtained are compared with those of a 532 nm laser in Fig. 9. Error bars indicate the SD.All the data obtained are summarized in Table S3.Of the six parameters examined, only Ŵ of the D1-and G-bands were consistent between two laser wavelengths within the margin of SD (Fig. 9b), similar to the results reported in K2014 and KK2022.The other four parameters, including R2 , showed significant differences exceed- ing one SD when the laser wavelength was changed from 532 to 633 nm (Fig. 9a, c).The observed larger R2 for the longer laser wavelength (Fig. 9a) may be attributed to (i) use of the longer laser wavelength resulting in the larger R1 ratio (e.g., Ferrari and Robertson 2001), and (ii) the positive correlation between the R1 and R2 ratios (e.g., Aoya et al. 2010).The mean T calculated using Eq. ( 10) were 407 and 364 °C for a 532 and 633 nm laser, respectively, and their difference exceeded 95% prediction intervals.Considering these facts, geothermometers based on R2 cannot be reliably applied to spectra that were meas- ured using a different laser wavelength from the one used to establish it.Therefore, the usage of our geothermometer (Eq.( 10)) should be limited to data measured using a 532 nm laser.

Conclusions
In this study, we have developed a Python code that automatically performs peak deconvolution of the Raman spectra of carbonaceous material and calculates the R2 ratio.A close comparison with the results of Aoya et al. (2010) shows very good agreement for the same original data unless R2 exceeds ~ 0.6.Based on a Deming regres- sion analysis, we propose a modified set of coefficients for the quadratic calibration curve first proposed by Aoya et al. (2010) to model the relationship between R2 and the maximum metamorphic temperature T .This modified curve is used in our code to calculate the mean T and its 95% prediction interval, although its direct application to samples experiencing the processes other than the contact metamorphism may not be appropriate.Our code is applicable to datasets other than those of Aoya et al. (2010), as long as the Raman spectral data are obtained using a 532 nm laser.We document that R2 < ∼ 0.7 is a necessary condition for R2 to be sensitive to changes in the sample maturity.For this reason, our modified geothermometer was calibrated without the K08 sample (mean R2 of 0.737) and should be preferred over the geothermometer of Aoya et al. (2010).Since a combination of the geothermometers of Kaneki and Kouketsu (2022) and the present study is incapable of estimating T of samples with maturity in a particular range, further refinement or a slight extrapolation of the present geothermometer is required.In the latter case, we propose R2 = 0.57 as a possible boundary where users should switch from the geothermometer presented here to the one proposed by Kaneki and Kouketsu (2022).

Fig. 4
Fig. 4 Comparison of R2 of the present study with those of Aoya et al. (2010).The results for a each measurement condition and b each sample are shown.All plotted data are summarized in Table 1.When R2 of both studies are identical, the data are plotted on the gray line.Error bars indicate the standard deviation

Fig. 9
Fig. 9 Effect of laser wavelength on some Raman parameters.Comparisons of some Raman parameters of the N30 sample between the laser wavelengths of 532 and 633 nm are shown.a R1 and R2 ratios, b FWHM of the D1-and G-bands, and c center positions of the D1-and G-bands are shown.Error bars indicate the standard deviation.Note that the error bars in c are hidden by the data points owing to their small values

Table 1
Summary of temperature, measurement condition, and R2 ratio of each sample used for the geothermometer.