Advantage of bulk lightning models for predicting lightning frequency over Japan

This study examined the performance of an explicit bulk lightning model coupled with a meteorological model for forecasting lightning by numerical weather prediction over Japan. The evaluation was conducted by comparing the lightning predicted by the explicit bulk lightning model, diagnosed empirically by the numerical model, and observed by ground base measurements. From the results, the bulk lightning model performed better in terms of lightning frequency than did the diagnostic scheme, which overestimated the lightning frequency, although there were no appreciable differences in the score of each method for the geographical distribution and time correlation compared with the observations. These results suggest that the explicit bulk lightning model is advantageous for predicting lightning frequency. The sensitivity of the simulated lightning to the choice of cloud microphysical model was also examined by using a two-moment and a one-moment bulk microphysical scheme. Sensitivity experiments on the choice of microphysical model indicated that the two-moment bulk scheme reproduced the observed lightning well, while the one-moment bulk scheme overestimated the lightning frequency. Analyses suggested that the overestimation of the lightning in the one-moment bulk scheme originated from active charge separation by riming electrification, in which graupel was produced more frequently and was assumed to fall faster. These results suggest that the explicit bulk lightning model with the two-moment bulk microphysical scheme offers an alternative to conventional lightning prediction methods.


Introduction
Lightning is a threat to human life because it can injure and even kill people.Every year, lightning strikes kill several people in Japan and more than 1000 people worldwide (e.g., Fujibe 2017;Curran et al. 2000).Lightning is also a serious threat to today's highly electrified society by damaging electronics and other equipment.Each year in Japan, the cost of lightning damage exceeds 100 billion yen (Yokoyama 2002), and the cost have been increasing based on the insurance claim costs (https:// www.city-net.or.jp/ wp-conte nt/ uploa ds/ 2015/ 02/ 00_ kouky ou_ raiga itais aku.pdf, accessed on 2022/10/12).However, such damage can be prevented by evacuation and equipment protection if the lightning is predicted in advance, which is why accurate lightning prediction is required to reduce the damage from lightning.
Operational numerical prediction of lightning is not yet feasible, but lightning can be diagnosed empirically based on observations.For example, the National Severe Storms Laboratory and the European Centre for Medium-Range Weather Forecasts diagnose lightning based on hydrometeors in the clouds calculated by their forecasting models (Lopez 2018; https:// www.nssl.noaa.gov/ educa tion/ svrwx 101/ light ning/ forec asting/, accessed on 2022/11/28).Meanwhile, the Japan Meteorological Agency (JMA) provides lightning guidance (Tsuchida 2018) by forecasting the probability of lightning using logistic regression based on explanatory variables considered to be related to lightning, such as the Showalter stability index, convective available potential energy (CAPE), and precipitation amount, simulated by its model for operational numerical weather prediction (NWP).JMA also provides thunder nowcasting (Kasahara 2010a(Kasahara , b, 2011) ) by diagnosing lightning activity up to 1 h ahead based on the observed lightning and precipitation.These diagnostic methods are accurate and have helped greatly to reduce lightning damage, but they cannot predict the frequency of lightning.In addition, JMA's lightning guidance predicts the probability of lightning in each 10-km or 20-km grid for every 3 h and so cannot be used for forecasts finer than its grid size or time step, and the lead time of JMA's thunder nowcasting is between one and a few hours, which is short compared to the NWP lead time.McCaul et al. (2009) developed a scheme to diagnose lightning frequency based on the microphysical properties of lightning clouds, such as upward flux of graupel in the mixed-phase region at − 15 °C and vertically integrated ice mass.This scheme can even forecast the frequency of lightning with the NWP lead time, but it uses empirical equations, which are not always applicable for predicting lightning in storms that are not included in the dataset used to fit the empirical equations, such as storms that have never been experienced before, or in the future climate.
Numerical simulation by an explicit bulk lightning model (BLM) (e.g., Barthe et al. 2005Barthe et al. , 2012Barthe et al. , 2016;;Bovalo et al. 2019;Hayashi 2006;Helsdon et al. 2001;MacGorman et al. 2001;Mansell et al. 2002Mansell et al. , 2005;;Mansell and Ziegler 2013;Takahashi 1984) is a powerful tool for predicting lightning based on the predicted charge density of hydrometeors.However, explicit BLMs are not suitable for operational NWP because of their high computational cost, and they are yet to be used operationally.This lack of experience with prediction by BLMs means that their advantages are as yet unclear.The computationally expensive parts of BLMs are calculating the discharge and solving the inverse matrix to calculate the electric field from the charge density.Fierro et al. (2013) developed a lightning discharge scheme with low computational cost and that performed well in previous studies (e.g., Fierro et al. 2013;Dafis et al. 2018;Chen et al. 2020;Lu et al. 2022;Sato et al. 2022).The calculation for solving the inverse matrix should also be accelerated.Most of those studies were focused on high-flash-rate events such as summer thunderstorms and winter thunderstorms with strong convection caused by low-pressure systems.However, in Japan, in addition to the above types of lightning, some of the lightnings associated with low cumulonimbus clouds along the Sea of Japan in winter have special characteristics.This type of lightning event occurs in other limited areas of the world, such as around the Great Lakes in the United States along the west coast of both Norway and Israel (Montanya et al. 2016;Holzworth et al. 2019).The characteristics of winter lightning differ from those of summer lightning in several aspects.Storms with only a few flashes are often observed and are called "single lightning flash storms" (Michimoto 1993;Hayashi and Marui 2016).The positive cloud-to-ground (CG) flash frequently occurs compared with other lightning.The electric current in a single lightning strike is sometimes large, and the lightning damage is much more serious than that due to single lighting in summer (Goto and Narita 1995;Ishii and Saito 2009).The length of the flash is longer than that of summer lightning, so a wider area of charges is removed (Yoshida et al. 2019).However, although these features would affect the performance of explicit BLMs, only a few previous studies have used them for this type of winter lightning (Altaratz et al. 2005;Kawano et al. 2019).
For the reality in Japan, the only previous studies to use BLMs were those by Sato et al. (2022), Hayashi (2006), and Kawano et al. (2019), but they did not evaluate the performance of the BLMs.To apply a BLM to operational forecasting, its performance should be evaluated in various cases including the specific winter lightning described above.In addition, one-moment cloud microphysics models are often used in operational forecast models because of their low computational cost, but Sato et al. (2022) used a two-moment cloud microphysical scheme.The one-moment scheme predicts the mixing ratio of hydrometeors, while the two-moment scheme predicts the number concentration in addition to the mixing ratio and so offers a more accurate representation of the particle size distribution of hydrometeors but at higher computational cost.Barth et al. (2005) noted the dependence of BLMs on the choice of cloud microphysical scheme, so it is necessary to investigate the sensitivity of a BLM to that choice.
In this study, we conducted numerical simulations using an explicit BLM (Sato et al. 2019) implementing the lightning discharge scheme of Fierro et al. (2013) and targeting six typical lightning events in Japan.By evaluating the BLM for the six events, our aim was to examine its advantages.We also conducted sensitivity experiments with another cloud microphysical scheme.To evaluate the BLM's performance and indicate its advantages, its results were compared with those of conventional lightning diagnostic methods and observations by ground base measurements.

Model
The model used in this study was version 5.3.6 of SCALE (Scalable Computing for Advanced Library and Environment) (Nishizawa et al. 2015;Sato et al. 2015), with an explicit BLM implemented in SCALE (Sato et al. 2019(Sato et al. , 2022) ) used to calculate the electrification of hydrometeors and lightning.Often called riming electrification, the noninductive charge separation was considered based on a look-up table by Takahashi (1978).The discharge process was calculated by the parameterization of Fierro et al. (2013): in this scheme, the initiation point of a discharge is determined as the point where the absolute value of the electric field exceeds the threshold value ( E int ), and the charge of the hydrometeors in a cylinder with the specific radius ( r cyl ) centered at the discharge initiation point is neutralized; for the present work, E int and r cyl were set at 110kVm −1 and 5km , respectively, based on Fierro et al. (2013) and Sato et al. (2022).In the BLM, the Bi-Conjugate Gradient STABilized method (Bi-CGSTAB; van der Vorst 1992) is used to solve the inverse matrix for calculating the electric field from the charge density, and we used symmetric Gauss-Seidel preconditioning for Bi-CGSTAB to accelerate the calculation of the electric field; see Appendix 1 for the contribution of the preconditioning to reducing the computational cost.The radiation and turbulence processes were calculated using the MSTRN-X radiation scheme (Sekiguchi and Nakajima 2008) and the Mellor-Yamada turbulence scheme (Nakanishi and Niino 2006), respectively.The bucket land model and single-layer urban canopy model (Kusaka et al. 2001) were used for calculating surface temperature and moisture over the land in nonurban and urban areas, respectively, and the surface flux was calculated using a Beljaars-type bulk model (Beljaars and Holtslag 1991).
In sensitivity experiments, we used two cloud microphysical schemes to examine the sensitivity of the simulated lightning activity to that choice: (i) a two-moment bulk microphysical scheme of Seiki and Nakajima (2014) (hereinafter SN14), and (ii) a one-moment bulk scheme of Tomita (2008) (hereinafter T08).

Target cases
Our aim was to evaluate our BLM for typical cases of lightning in Japan.To do so, we conducted numerical simulations targeting six cases including both summer (called S1 and S2) and winter (called W1, W2, W3, and W4) in Japan.S1 and S2 are heavy rainfall events in summer, and the others are lightning events in winter.The characteristics and the motivation behind the choice of each case are given below.S1 and S2, which are the same cases discussed by Sato et al. (2022), are heavy rain events in summer, but the lightning frequency differs greatly between these two cases.S1 is a heavy rainfall event in 2017 over the Kyushu area of Japan in summer; large amounts of rainfall were triggered by tall convective clouds generated under unstable conditions with high CAPE, and this event occurred during a short period of time and with a high flash rate (Tsuji et al. 2020;Kawano et al. 2018;Kawano and Kawamura 2020).S2 is a heavy rainfall event in 2018 over a wide area of western Japan in summer; low convective clouds generated under moist and stable conditions produced long-lasting heavy rainfall over the whole area (Tsuji et al. 2020).The flash rate in S2 was lower than that in S1 (Kawano et al. 2018;Nakakita et al. 2019), and comparing the performance of the BLM between the two cases enables us to examine whether it can distinguish the lightning frequency in lightning events with high and low flash rates.
The other cases are lightning event in winter.For W1, we focused on a typical case of lightning in winter, i.e., that triggered by strong convection such as in a lowpressure system.A low-pressure system passed through Hokkaido in Japan on November 11, 2019, and rainfall and lightning from cumulonimbus clouds accompanying a cold front were observed near Sapporo.Evaluating the BLM for this case enables us to examine whether it can reproduce the lightning that occurs in a winter thunderstorm with strong convection.
W2 is a snowfall event caused by a northwest monsoon over Hokkaido on February 9, 2016.In this case, lightning was detected only a few times near Sapporo, and so evaluating the BLM for this case enables us to examine whether it can reproduce lightning with only a few flashes in winter.
W3 and W4 are winter lightning cases that occurred in the Hokuriku region along the Sea of Japan.In both cases, lightning and precipitation were caused by low cumulonimbus clouds generated by a northwest monsoon.Evaluating the BLM for this case enables us to examine its ability to reproduce winter lightning in the Hokuriku region.

Experimental setup
The calculation domain and experimental setup for each case are summarized in Fig. 1 and Table 1, respectively.Numerical simulations were conducted over the shaded areas in Fig. 1, and the regions surrounded by dashed black lines indicate the analysis areas.The data from upwind areas and the first several hours were not used for the analyses to avoid the effects of lateral boundary conditions and spin-up time (see Table 1).The horizontal grid spacing was 1 km.The vertical layer was divided into 60 layers from 40 m above the ground to 21,545 m at the model top, with the layer thickness increased from 40 to 683 m toward the upper layers except for the simulation of S2; in S2, the vertical layer was divided into 57 layers from 40 m above the ground to 19,528 m at the model top, with the layer thickness increased from 40 to 651 m toward the upper layers following Sato et al. (2022), which reported that the cloud top height in S2 was lower than that of S1 and 57 layers is sufficient for simulating the clouds in S2.The horizontal wind, air density, temperature, specific humidity, soil moisture, soil temperature, and sea surface temperature (SST) of Mesoscale ANaLysis (MANL) provided by JMA (2022) were used for the initial and boundary conditions; MANL has a horizontal resolution of 5 km, 50 vertical layers, and temporal resolution of 3 h.SST was fixed as the initial value, and the time interval of the model output was set to 5 min.

Observational data
The 1-km-mesh synthetic-radar grid point values provided by JMA (2009) were used to evaluate the precipitation calculated by the BLM, and the lightning location data observed by the Lightning Detection Network System (LIDEN; Ishii et al. 2014) operated by the JMA were used to evaluate the performance of the BLM in that respect.The LIDEN data was grided to the nearest model grid, but the radar data was used with the original grid setup of the observations.The BLM outputs the flash origin density (FOD) defined by Eq. ( 9) in Fierro et al. (2013), which corresponds to the lightning frequency  of CG and intra-cloud (IC) lightning combined.Based on Fierro et al. (2013), the FOD is a good surrogate for the Earth Networks Total Lightning Network (ENTLN; https:// www.weath erbug.com/ weath er-forec ast/ now/), which is a system similar to LIDEN.Thus, the locations of the initiation discharge points of CG and IC observed by LIDEN can be compared with the FOD.It has been suggested that LIDEN can give false detections caused by electromagnetic waves other than lightning, and our pre-analyses showed that the location of lightning estimated by LIDEN can have errors.To reduce the effects of such errors, JMA conducts quality control of LIDEN data using radar observations, and in this study, we eliminated any LIDEN data with no precipitation observed by radar within a 10 km radius and within 10 min before or after, the same time interval as the radar data, in the same way as done by JMA (Kasahara 2011).et al. (2009, hereafter MC) developed a lightning threat diagnostic scheme, which is a widely used lightning diagnostic method (e.g., Dafis et al. 2018;Fierro et al. 2013;McCaul et al. 2020).In this scheme, lightning frequency (F MC ) is calculated by a linear combination of the upward flux of graupel in the mixed-phase region at −1 5 °C (F 1 ) and the vertically integrated mass of ice, snow, and graupel (F 2 ): where q g , q s , and q i are the mixing ratios of graupel, snow, and ice, respectively, w is the vertical velocity, ρ is the air density, z is the height, k 1 (= 0.042), k 2 (= 0.2), r 1 (= 0.95), and r 2 (= 0.05) are empirical parameters, and F 1 and F 2 were set to zero to reproduce the regional distribution of lightning if they are less than the threshold values of 0.01 and 0.4, respectively.

McCaul
To evaluate the performance of the BLM, we compared the lightning distribution and its frequency predicted by the BLM and diagnosed by the MC scheme with LIDEN observations.Note that the MC scheme was developed for the Weather Research and Forecasting (WRF) onemoment scheme (WRF Single-Moment Six-Species; (1) (2) F 2 = k 2 ρ q g + q s + q i dz, (3) Hong and Lim 2006) and is not adjusted for SN14 or T08 of SCALE; adjusting the empirical parameterization for them may improve the results.To examine reasonable parameter values of the MC scheme for SCALE, we conducted 16 sensitivity experiments with k 2 = 0.2, 0.15, 0.1,  and 0.05 and threshold values of 0.4, 0.6, 0.8, and 1.0, and the results of those experiments are discussed in Sect. 4.

Scores for performance evaluation
The lightning model was evaluated in terms of absolute value, temporal variation, and geographical distribution.
To evaluate these three elements, the observed lightning was converted to gridded data, and the following three statistical scores were used.
To evaluate the errors in the absolute value of the lightning frequency, we calculated the root mean squared error (RMSE) defined as where F pre i is the accumulated lightning frequency calculated by the model for each time step for the model output, and F obs i is the observed one.To verify that the model can reproduce relative changes in lightning frequency, we calculated the Pearson correlation coefficient (r) for the temporal relationship between the model and the observations for hourly area-accumulated lightning frequencies.
The equitable threat score (ETS) was calculated to validate the distribution of precipitation and lightning quantitatively.Traditional ETS calculation methods use a 2 × 2 contingency table at individual grid points, in which the elements of the table are hits (correctly forecast events), misses (observed but not forecast events), false alarms (forecast but not observed events), and correct negatives (correctly forecast nonevents).The ETS is given as where To reduce the effect of displacement between simulated and observed values, the ETS was calculated within a neighborhood framework (Clark et al. 2010;Lynn et al. 2015).In this framework, the grid point is defined as a hit if the event is predicted at any grid point within a certain radius R from the observation.Similarly, the grid point is defined as a hit if the event is observed at any grid point (5) ETS = hits − chance hits + misses + false alarms − chance , (6) chance = (hits + misses)(hits + false alarms) hits + misses + correct negatives + false alarms .
within a radius R of the predicted grid.The value of R was set to 5 km, and the threshold for accumulated precipitation during the entire analysis period was set in increments of 50 mm for S1 and S2, in increments of 10 mm for W1 and W3, and in increments of 5 mm for W2 and W4.Unlike the ETS for precipitation, to examine the spatiotemporal performance of the BLM, hits were considered if there was a LIDEN observation within 5 km and 15 min before or after the FOD simulated by the model at each time step according to Hayashi (2006).The ETS for lightning verification is shown hourly as well as for the entire analysis period to capture time evolution.Because the BLM output can take decimal values, grid points with FOD values above zero were counted as having lightning.

Evaluation of simulated precipitation
First, the simulated precipitation was compared with the observed precipitation to assess the validity of each cloud microphysical model.Figure 2 shows the geographical distribution of accumulated surface precipitation during the analysis period in the analysis area.In all cases, SN14 reproduced the observed accumulated precipitation approximately well, but there were discrepancies in some cases, such as shifts in the location of the precipitation area.In S1, a large precipitation area exceeding 450 mm was observed around 33.4°N, 130.8°E (Fig. 2a).An area with large precipitation amount was reproduced in SN14, but its location shifted to the southeast, near 33°N, 131°E (Fig. 2b).In S2 (Fig. 2d and e), SN14 reproduced well the large precipitation amount around 35.7°N, 135°E, but it overestimated the precipitation amount over the Sea of Japan.In W1 (Fig. 2g and h), SN14 reproduced the location of a large amount of precipitation near Sapporo, but it overestimated the precipitation amount and did not reproduce the precipitation areas near 42°N, 142°E.In W3 and W4 (Fig. 2m, n, p, and q), the model reproduced the observed precipitation area, but the simulated area with a large amount of precipitation exceeding 30 mm showed discrepancies with the observed one.T08 also reproduced the observed precipitation well in all cases (Fig. 2c, f, i, l, o, and r).However, T08 overestimated the maximum precipitation amount in S2 (Fig. 2f ) and W2 (Fig. 2l), and the large-precipitation area shifted southward as with SN14 in W4 (Fig. 2r).
As discussed by Sato et al. (2022), the above discrepancies originated from the mismatched meteorological fields and different cloud microphysical schemes in the child model (i.e., SCALE in this study) and parent model (i.e., MANL in this study), which often occur in such dynamical downscaling simulations.Thus, such discrepancies would be reduced by improving the initial and boundary conditions for the meteorological fields.In spite of such discrepancies, the ETS for the low precipitation threshold (Fig. 3) of both SN14 and T08 is around 0.25-0.5, which is similar to the ETS of the JMA mesoscale model used for operational weather forecasts (e.g., Honda and Sawada 2009), and therefore both SN14 and T08 generally reproduced the geographical distribution of observed precipitation.From these results outlined above, it is concluded that the model reasonably simulated the distribution of the convective clouds that triggered the large amount of precipitation, if we consider the uncertainty of the initial and boundary conditions and model errors.

Evaluation of simulated lightning for summer cases
Figure 4 shows the geographical distribution of the flash density observed by LIDEN, the FOD simulated by the BLM, and flashes diagnosed by the MC scheme accumulated during whole analyses period for S1 and S2.First, differences between the lightning simulated by the BLM with SN14 and the observed lightning are discussed as a basic performance of the BLM, because the BLM with SN14 has been used previously (Sato et al. 2019(Sato et al. , 2021(Sato et al. , 2022)).In S1, the BLM with SN14 reproduced the lightning distribution approximately, but the simulated lightning distribution was displaced from the measured lightning distribution as with the precipitation area (Fig. 4a and b).The displaced lightning distribution originated for the same reason as the displaced precipitation area as discussed in Sect.3.1.The displacement would be lessened by preparing reasonable meteorological fields and correcting the problem of the mismatched cloud microphysical models, so we do not discuss these displacements in this study.
The cumulative number of observed lightning events exceeded 10 per grid point at the most frequent points, but the BLM with SN14 underestimated the maximum number as being about seven per grid point (Fig. 4a and  b).Nevertheless, the BLM with SN14 overestimated the amount of accumulated lightning over the whole analysis area by about 1.4 times the observed amount.
In S2 (Fig. 4f and g), the BLM with SN14 reproduced well the geographical distribution and the maximum accumulated lightning frequency per grid point, but it overestimated the lightning frequency accumulated over the analysis area by an order of magnitude.The overestimation of area-accumulated lightning frequency in these two cases was originated from the grids with lightning but small frequency.The lightning frequency with these grids are not seen clearly in Fig. 4f and g, but their contributions to the cumulative frequency over the whole analysis area were not so small.
These results indicate that the BLM with SN14 reproduced well the lighting with high flash rate (e.g., the  , e, h, k, n, q) and T08 (c, f, i, l, o, r) accumulated within the analysis period over the analysis area for each case lightning in S1) within a factor of 2 (i.e., a factor of about 1.4 in this study).The results also suggest that the BLM with SN14 overestimated the lighting with low flash rate within a factor of 10 (i.e., one order of magnitude).However, the BLM with SN14 reproduced the difference of about one order of magnitude in the observed lightning frequency between the two cases.Thus, we conclude that the BLM with SN14 can distinguish between summer lightning with high and low flash rates, even though it overestimates the frequency of lightning with low flash rate.
In both cases, the BLM with T08 simulated lightning over wider areas than the observations and overestimated the lightning frequency per grid point and the area-integrated lightning counts (Fig. 4c and h).Especially in S2 (Fig. 4h), a large amount of lightning was calculated even on the west side of the analysis area, where no lightning was observed, and the area-integrated lightning amounts were nearly 100 times higher than the observations.The amounts of area-integrated lightning in S1 and S2 were of the same order of magnitude.These results indicate that  (d, i) and MC with T08 (e, j) within the analysis period over the analysis area for S1 (a, b, c, d,  e) and S2 (f, g, h, i, j) the BLM with T08 could not reproduce the difference in lightning frequency between summer lightning with high and low flash rates.The reason for the overestimated lightning frequency by T08 and difference in the electrification in T08 and SN14 are discussed in Sect.3.5.Figure 4d, e, i, and j show the geographical distribution of lightning frequencies calculated by the MC scheme using SN14 and T08.Compared to the observation results (Fig. 4a, and f ), the MC scheme with SN14 overestimated the lightning frequency in both cases and simulated a large amount of lightning even in areas where lightning was not observed.These overestimations originated from the overestimation of high clouds constructed by snow, as also discussed by Sato et al. (2022).Based on Eq. (1), the overestimation of q s results in the large lightning frequency.When T08 was used, the MC scheme showed results similar to those from the BLM with T08.However, the MC scheme did not reproduce the difference in lightning frequency between S1 and S2, with a difference factor of only about 1.5 in the amount of area-integrated lightning between the two cases in both cloud microphysics schemes.
From the results outlined above, we conclude that the BLM with SN14 can distinguish between lightning events with high and low flash rates, whereas the other methods (i.e., BLM with T08, and MC) cannot.
Quantitative evaluations of each scheme based on the three scores support the above analyses.Figure 5 shows time series of the integrated lightning frequency over the analysis area and its hourly RMSE for S1 and S2.The RMSE for the entire analysis time is also shown in the legend of Fig. 5c and d.In both cases, the BLM with SN14 had lower RMSEs than the other methods, indicating that it performs better in reproducing lightning frequencies.The large RMSE with the other methods originated from the overestimation of the lightning frequency.
Scatter plots and correlation coefficients of the accumulated lightning frequency during every hour for LIDEN and BLM or MC are shown in Fig. 6a and b.The correlation coefficients of all the methods do not differ appreciably in both cases.However, there are more plots of the BLM with SN14 than those of the other methods between the two straight lines in Fig. 6a and b showing that the difference from observation is within a factor of 10.This result also suggests that the BLM with SN14 performs better at simulating lightning frequency.
Figure 6c and d show the ETS for each hour and for the entire analysis time (in the legends) in S1 and S2.Because there was no lightning during 3:00-8:00 and 21:00-24:00 in S2, the value of ETS for these hours was zero, and that for the entire analysis time for S2 was also small (Fig. 6d).In S1 using the BLM with SN14, the simulated precipitation area and lightning area were shifted to the southeast in the first half of the analysis period (figure not shown), so the ETS values for this period are lower than those with T08.Except for the period in S1, there is no appreciable difference in ETS values between the BLM with SN14 and the other methods.
Based on these results, we conclude that the BLM with SN14 performs better at simulating lightning frequency in summer events compared with other methods, and the performances for geographical distribution and relative change with time are equivalent with those of the other methods.The advantage of the BLM with SN14 is its ability to reproduce the difference in lightning frequency between summer lightning with high and low flash rates.

Evaluation of simulated lightning for winter cases over Hokkaido
Figure 7 shows the geographical distribution of the flashes observed by LIDEN, the FOD simulated by BLM, and the diagnosed lightning frequency by the MC scheme accumulated during whole analyses period for W1 and W2.In W1, the BLM with SN14 reproduced well the observed lightning frequency and the geographical distribution (Fig. 7a and b).In W2, lightning was observed only once during the analysis period (Fig. 7f ), which is similar to the results of the BLM with SN14 in which no lightning was simulated (Fig. 7g).In both cases, the BLM with T08 calculated a large amount of lightning and overestimated the observations (Fig. 7c and h).The lightning was simulated by the BLM with T08 in almost all areas with precipitation (Fig. 2i and r).The lightning diagnosed by the MC scheme with both T08 and SN14 overestimated the measured lightning in W1 (Fig. 7d and e).In addition, lightning was diagnosed by the MC scheme with both SN14 and T08 in W2 (Fig. 7i and j), when lightning was rarely observed.These results indicate that the BLM with SN14 outperforms the other schemes.
Figure 8 shows time series of the integrated lightning frequency over the analysis area and its RMSE for W1 and W2.In both cases, the BLM with SN14 had lower RMSEs than those with the other methods, which indicates that it performs better in reproducing lightning frequencies.The large RMSEs for the MC scheme and the BLM with T08 originated from the overestimation of the lightning.Figure 9a and b show scatter plots and correlation coefficients of the accumulated lightning frequency during every hour for LIDEN and BLM or MC for W1 and W2, respectively.For W1, all plots due to the BLM with SN14 are within a factor of 10, whereas those due to the other methods are mostly within a factor greater than 10.The correlation coefficient by SN14 was higher than that by T08 with both BLM and MC (Fig. 9a and b), indicating that SN14 reproduced well the relative change of lightning with time.Note that Fig. 9b contains only one data point for W2 because lightning was not observed for most of the analysis period.The ETS is shown in Fig. 9c, but W2 is not shown because lightning was rarely observed and the ETS value was almost zero.There was no appreciable difference in ETS values between the BLM and MC schemes regardless of the choice of cloud microphysical scheme for W1.
All the above results indicate that the BLM with SN14 performed better at simulating lighting in both cases and that it has an advantage for reproducing winter lightning triggered by strong convection by low-pressure systems, which has a small flash rate compared with summer lightning.Another advantage of the BLM with SN14 is its ability to reproduce events with either no or very little lightning, which are not reproduced by the other methods.

Evaluation of simulated lightning for winter cases in Hokuriku
Figure 10 shows the geographical distribution of the flash density observed by LIDEN, the FOD simulated by the BLM, and the diagnosed lightning frequency by the MC scheme accumulated during the whole analysis period for W3 and W4.In both cases, the BLM with SN14 underestimated the lightning frequency per grid point by more than one order of magnitude (Fig. 10b  and g), and it underestimated the area-accumulated lightning frequency by between a fifth and a sixth compared with the observations.In contrast, the BLM with T08 overestimated in both cases, which is similar to the other four cases (Fig. 10c and h).Overestimation was also seen in the results of MC, except for MC with T08 in W4 (Fig. 10j).From these results, it is difficult to determine which method is better for W3 and W4, and so the scores are useful.
The RMSE, scatter plots, correlation coefficients, and ETS are shown in Figs.11 and 12. First, we evaluate the results for W3.The BLM with SN14 showed smaller RMSE than the other methods (Fig. 11c).The correlation coefficient with SN14 was larger than that with T08 for both the BLM and the MC scheme (Fig. 12a and b), and the correlation coefficients with the BLM and the MC scheme were of the same magnitude.These results mean that the performance of each model in terms of the relative change of the lightning frequency was similar.However, the BLM with T08 and the MC scheme with both SN14 and T08 had more data points in the lower right of the scatterplot, which corresponds to overestimation of lightning frequency.In addition, the ETS of all methods was similar, which means that the performance of each model in terms of the geographical distribution of lightning did not differ greatly.From these results, we conclude that there was no appreciable difference in the performance of either method for reproducing the distribution and increase or decrease of lightning.However, the BLM with SN14 was advantageous for simulating the lightning frequency more accurately for W3, albeit with underestimation (Fig. 10b).
In W4, the MC scheme with T08 and that with SN14 had almost equivalent RMSEs, but SN14 had a smaller hourly RMSE during the time when lightning was observed (between 13:00 and 17:00) (Fig. 11d).The BLM with SN14 underestimated the lightning frequency compared to the MC scheme with T08, resulting in a larger RMSE.The correlation coefficient by the MC scheme with T08 was larger than that by the other methods, and the ETS did not differ greatly among all the methods.Thus, for W4, the MC scheme with T08 outperformed the BLM, which was contrary to the performance evaluation for the other five cases.However, the MC scheme involves empirical parameters and is not applicable for lightning events that have never been experienced previously, and the tuning of those parameters based on a specific case does not always improve the performance for other cases, as discussed in Sect. 4. Considering these disadvantages of the MC scheme, the advantages of the BLM with SN14 shown via the analyses of S1-W3 exceed the disadvantages, and we suggest that the BLM with SN14 is more suitable for lightning simulation than other methods.

Sensitivity of simulated lightning to cloud microphysical models
Here, we discuss the sensitivity of the BLM and the MC scheme to the choice of cloud microphysical model based on the vertical profiles of the mixing ratio of hydrometeors around convective core, where the graupel generates frequently (Fig. 13) and the amount of graupel charge production over the cloudy grids (Fig. 14).Compared to SN14, T08 had a lower mixing ratio of total hydrometeors, and its vertical distribution tended to shift to lower layers where graupel was more dominant.According to previous studies (Roh and Satoh 2014;Kondo et al. 2021), T08 shows the large conversion rate from snow to graupel and the fast fall velocity of graupel compared with SN14.This would be because the graupel can be easily created by the accretion process once the graupel is generated by the riming according to the prescribed constant parameters such as interception parameter, slope parameter and so on in the single moment bulk scheme including T08 as briefly summarized in Appendix 3. By these characteristics of T08, hydrometeors in the atmosphere quickly become graupel and fall to the ground as precipitation.Thus, in T08, graupel in the clouds tends to be more dominant, especially around the height of − 10 °C where riming effectively occurs (Pflaum and Pruppacher 1979), and the hydrometeors are distributed in lower layers than in SN14 for all cases.T08 has a higher collision rate between graupel and snow or ice because the graupel mixing ratio tends to be larger and the graupel falls faster (Kondo et al. 2021).This higher collision rate of graupel with snow or ice causes active charge separation through the riming electrification mechanism, and therefore T08 had a larger amount of charge separation and overestimated the lightning frequency in the BLM (Fig. 14).Note that the peak of the charge production by T08 was similar to that by SN14 in S2 and W4 as shown in Fig. 14b and  f.In these cases, the charge separation in T08 occurred wider layer than SN14 due to the higher collision rate graupel and snow or ice.This result supports the conclusion that the charge separation through the riming electrification mechanism in T08 was more active compare to SN14.In summary, the BLM is highly sensitive to the choice of cloud microphysical model, which may cause overestimation of lightning frequency, especially in models in which a large amount of graupel is simulated.

Conclusions and discussion
By comparisons with observations for six summer and winter cases over Japan, this study evaluated a BLM that explicitly predicts lightning, as well as investigating its sensitivity to the choice of cloud microphysical model.The BLM was also compared with the lightning diagnostic method of McCaul et al. (2009).Through these sensitivity experiments and comparisons, the advantages of the BLM were examined.The evaluation results showed that the BLM with SN14 had the following advantages: (1) it reproduced well the lightning frequency of summer lightning with high flash rates; (2) it reproduced the difference between summer lightning with high and low flash rates; (3) it reproduced winter lightning with a lower flash rate than summer lightning; (4) it reproduced the absence of or only a few cases of lightning in winter; and (5) it reproduced the lightning frequency in Hokuriku without an empirical diagnostic scheme.
The results of the sensitivity experiments for the choice of cloud microphysics model showed that SN14 reproduced the distribution and frequency of lightning qualitatively well.By contrast, T08 overestimated the lightning frequency and simulated its distribution over wider areas than the observations; this is because graupel is produced easily by the large conversion rate from ice and snow to graupel (Kondo et al. 2021), thus charge separation through the riming electrification mechanism is caused frequently by collisions with snow and ice in T08.These results suggest that the BLM with SN14 could be a more powerful tool for lightning forecasting than conventional methods, especially for forecasting the lightning frequency.However, for the BLM to be used in operational Fig. 13 Vertical profiles of mixing ratios of hydrometeors (black), graupel (red), and ice (green) simulated with cloud microphysics schemes of SN14 (solid lines) and T08 (dashed lines) averaged over the convective core area in the analysis area during the entire analysis period.Blue dotted and dashed lines indicate the − 10 °C and − 15 °C altitudes, respectively, averaged over the entire clouded grid points in the analysis area during the entire analysis period.The results of T08 and SN14 were both averaged for calculating the − 10 °C and − 15 °C altitudes.The hydrometeor mixing ratio (Q hyd in the legends) is the sum of the mixing ratios of cloud, rain, ice, snow, and graupel, and the ice mixing ratio (Q ice in the legends) is the sum of the mixing ratios of ice, snow, and graupel ( = q i + q s + q g ).A convective core area was defined as a grid where the sum of the liquid water and ice water paths exceeded 1 kg m −2 forecasting, future work is required involving forecast experiments conducted over long periods of time.
As well as the advantage of the BLM, we must discuss its uncertainties.In this paper, we assumed that the electric field threshold (E int ) and the radius of charge neutralization (r cyl ) were constant when calculating discharges with the BLM, but the seasonal dependence of these parameters must be discussed.Previous numerical simulations reported that the electric field required to trigger the breakdown is larger at lower altitudes (e.g., Gurevich and Zybin 2005), which indicates that it may be more appropriate to assume a larger E int in the BLM in winter cases.It is also observed that the average lightning length is longer in winter (Yoshida et al. 2019), and therefore a longer r cyl would be more appropriate in the BLM in winter cases.However, it is difficult to determine these parameters correctly based on theoretical consideration and observations.Thus, the experiments conducted in this study were focused on how well the BLM could perform with the same parameters as in previous studies (Fierro et al. 2013;Sato et al. 2022) and did not vary these parameters.In future work, sensitivity experiments should be conducted to determine reasonable parameter values.
In addition to the uncertainties of the BLM, it is necessary to discuss the empirical parameters of the MC scheme.In this study, the coefficients and threshold values proposed by McCaul (2009) were used without modification in the calculation of the MC scheme, whereas tuning them for SCALE may improve the performance of the MC scheme.Thus, we conducted several sensitivity experiments to examine suitable parameter values for the MC scheme in SCALE and discussed whether the advantages of the BLM persist in comparison with the MC scheme with tuned parameters.
In the MC scheme, the lightning frequency is calculated by a linear combination of F 1 and F 2 as shown in Sect.2.5, but F 2 was dominant for most of the diagnosed lightning frequency in this study.The coefficients and thresholds of the MC scheme are determined empirically, so tuning these parameters against F 2 may improve the overestimation tendency of the MC scheme.Accordingly, for MC with T08, which scored better in the present study, we conducted sensitivity experiments on k 2 and the threshold values under which F 2 set zero.The results of the sensitivity experiments showed that the RMSE was reduced most effectively when k 2 and the threshold were set as 0.1 and 0.6, respectively (Table 2), although the correlation coefficient Fig. 14 As Fig. 13 but for vertical profiles of the amounts of positive (red) and negative (blue) graupel charge production simulated with the cloud microphysics schemes of SN14 (solid lines) and T08 (dashed lines) averaged over the entire clouded grid points in the analysis area during the entire analysis period.Black dotted and dashed lines indicate the − 10 °C and − 15 °C altitudes, respectively, averaged over the entire clouded grid points in the analysis area during the entire analysis period.The results of T08 and SN14 were both averaged for calculating the − 10 °C and − 15 °C altitudes.A clouded grid point is defined as a grid where the sum of the liquid water and ice water paths exceeded 1 g m −2 and ETS did not change appreciably in any experiment.The RMSE of the tuned MC with T08 was comparable to that of the BLM with SN14, except for W4, in which the RMSE became worse and larger than that of the BLM with SN14, and the amount of accumulated lightning over the whole analysis area was underestimated compared to the BLM with SN14.In addition, the difference in the diagnosed number of lightning events between S1 and S2 was only about a factor of three, so it is difficult to reproduce the difference between the summer cases with different lightning frequencies, even if the parameters are tuned.From the results shown above, it is concluded that the MC scheme can be tuned to SCALE, but it is difficult to find suitable parameters for all cases because of the strong dependence of the parameters on each case.This strong dependence of the parameters was also reported in previous observational studies.Basarab et al. (2015) and Hayashi et al. (2021) proposed relational expressions for the observed microphysical properties in clouds and lightning frequency, similar to the relational expression reported by McCaul et al. (2009).They derived the relational expressions from the results of observations of several cases, but each relational expression was highly dependent on the individual cases.Thus, it is not easy to determine a relational expression that is applicable for all cases because the suitable parameters of an MC scheme are not always suitable for other cases.The BLM's strength is that it can calculate lightning without such tuning for each case and can be used for cases that have never been experienced before and for future experiments, which are difficult to predict with empirical parameters.
In addition, we must discuss the uncertainty of the lightning observation system for comparing the BLM.According to Fierro et al. (2013), the FOD (i.e., the output of the BLM) is supposed to be a good surrogate for ENTLN, a system similar to LIDEN, if it detects one or two points per flash.However, the LIDEN observations seem not to meet this requirement, especially for the winter lightning in Hokuriku based on the analyses using the Lightning Location System, which is constructed similarly to LIDEN (e.g., Tajiri et al. 2021).In general, lightning observation systems including LIDEN, which have long base lines between each observation station, are less accurate than three-dimensional lightning location data such as the observation data of the Lightning Mapping Array (Thomas et al. 2001) and BOLT (Yoshida et al. 2014;Wu et al. 2014).Comparing the BLM with such three-dimensional lightning location data would enable us to evaluate it in more detail, and such comparisons should be conducted in the near future.

Application of preprocessing
In our BLM (Sato et al. 2019), Eqs. ( 7) and ( 8) are solved at each time step to calculate the electric field (E) from the charge density (ρ e ) with Neuman boundary condition at lateral boundary, and Dirichlet boundary with electrical potential with zero at the ground and the model top: where φ and ε are the electrical potential and atmospheric permittivity, respectively, and ρ e is the charge density at each grid point, which is defined as the sum of the charge density of each hydrometeor at each grid point.Equation ( 7) is Poisson's equation, and the computational cost of a Poisson solver is large, especially on a state-of-the-art supercomputer using massively parallel computing, this being because global communication is necessary.A Poisson solver based on the Bi-CGSTAB method (van der Vorst 1992) is used in SCALE for solving Eq. ( 7), and because of the large computational cost, the elapsed time for the lightning component is much larger than those for the other physical components in SCALE (Table 3).To reduce the computational cost of the Poisson solver, three preconditioning methods (Saad 2003) were applied for Bi-CGSTAB in SCALE, i.e., the Gauss-Seidel method (GS), the symmetric Gauss-Seidel method (SGS), and incomplete lower-upper factorization (ILU).Table 3 shows that all three preconditioning methods helped greatly to reduce the number of iterations done by Bi-CGSTAB, and fewer iterations resulted in less elapsed time (i.e., smaller computational cost) of Bi-CGSTAB.Of the three preconditioning methods, SGS was the most effective, and the computational cost of Bi-CGSTAB was reduced by about 84% compared with that without preconditioning.Based on the results shown above, SGS preconditioning was applied.

Appendix 2
Enlarged view of Fig. 5 for small value of the lightning frequency   � RIKEN Center for Computational Sciences.The MANL data and the 1-km-mesh synthetic-radar grid point values were supplied via the Weather Business Consortium and the Research Institute for Sustainable Humanosphere of Kyoto University (http://database.rish.kyoto-u.ac.jp/index-e.html).We also thank ThinkSCIENCE, Inc. (Tokyo, Japan) for English language editing.

Fig. 2
Fig. 2 Geographical distribution of observed precipitation (a, d, g, j, m, p) and precipitation simulated with cloud microphysics schemes of SN14 (b, e, h, k, n, q) and T08 (c, f, i, l, o, r) accumulated within the analysis period over the analysis area for each case

Fig. 5
Fig. 5 a, b Time series of flashes detected by LIDEN (black), FOD simulated by BLM with SN14 (solid red) and T08 (solid blue), and flashes diagnosed by MC with SN14 (dashed red) and T08 (dashed blue).c, d Time series of hourly root mean squared error (RMSE) of flashes observed by LIDEN and FOD simulated by BLM with SN14 (solid red) and T08 (solid blue) or flashes diagnosed by MC with SN14 (dashed red) and T08 (dashed blue).The mean RMSE for the entire analysis time is shown in the legends of (c) and (d).a, c and b, d show the results for S1 and S2, respectively

Fig. 6
Fig. 6 Scatter plots of hourly accumulations of flashes observed by LIDEN and FOD simulated by BLM with SN14 (a) and T08 (b) or flashes diagnosed by MC with SN14 (a) and T08 (b), and time series of hourly equitable threat scores (ETS) for FOD simulated by BLM with SN14 (red) and T08 (blue) and flashes diagnosed by MC with SN14 (dashed red) and T08 (dashed blue) for S1 (c) and S2 (d).In the legends, the correlation coefficient between observed flashes and FOD or diagnosed flashes for each case is shown as r (a, b), and ETS for the entire analysis time is shown (c, d).The lines in a and b show simulated value equals 10 times and 0.1 times observed value

Figure 15
Figure15is the same as Fig.5, but with the axis (b) and (d) was changed.

Fig. 15
Fig. 15 Same as Fig. 5 but with the axis (b) and (d) were changed

Table 2
Accumulated flash number over the whole analysis area and RMSE for each case, diagnosed with MC with T08 tuned to k 2 = 0.1 and a threshold of 0.6.RMSEs of BLM with SN14 and MC with T08 not tuned are also given for comparison

Table 3
Elapsed times for calculating physical components in SCALE, evaluated by simulations of five time steps from 03 UTC on 5 July, 2017 for S1 *The mean iteration number means the number of iterations per every call of subroutine for Bi-CGSTAB.For this test, the iteration was finished when the relative residual norm becomes 10 -10 or smaller WO Without preconditioning; GS With Gauss-Seidel preconditioning; SGS With symmetric Gauss-Seidel preconditioning; ILU With incomplete lower-upper factorization preconditioning.For the calculation, the radiation processes was called once, and other physical processes were called five times