## Introduction

Pork is one of the essential items in the Korean diet. Per capita pork consumption averaged 27.1 kg per year as of 2020, the highest among meats, an increase of more than 60% from 16.5 kg in 2000 (MAFRA, 2020). As the pig industry has grown steadily since the mid-1990s, pork production has also increased significantly. Pork production in 2020 was 991,000 tons, an increase of 38.7% compared to 714,000 tons in 2000. The value of pork production reached 7.17 trillion won in 2020, the second highest agricultural production value after rice (APQA, 2020; MAFRA, 2020).

With the growth of the pig industry, the supply of and demand for pork have become a significant concern for both consumers and producers, and balancing them has become a critical challenge for the industry. However, whereas pork consumption is relatively stable, supply is volatile and difficult to keep stable (Gouk et al., 2021; Lee et al., 2022). This volatility is reflected in the supply shocks that arise in pig breeding.

Supply disruptions frequently occur because of abnormal weather conditions and vulnerability to livestock diseases such as porcine reproductive and respiratory syndrome (PRRS), foot and mouth disease (FMD), and African swine fever (ASF). PRRS occurs on some farms every year and harms the productivity of the affected farms. FMD and ASF have resulted in short- and long-term supply reductions through the slaughter of pigs on outbreak farms and restrictions on the operation of neighboring farms.1) Heatwaves have caused a decrease in supply followed by oversupply because of delayed pig growth.2) In addition, seasonal factors directly or indirectly affect shipments. Korea has four distinct seasons due to its geographical location. As a result, supply is insufficient in summer, when demand peaks during the vacation season, because the stocking and breeding of sows are carried out just before summer. On the other hand, in winter, shipments are high, resulting in oversupply. Such supply instability shocks the overall industry.

Due to this imbalance in the supply and demand of pigs, the need for accurate supply and demand forecasting has increased, and the preparation of measures for supply-demand crisis response and supply control in the pig industry has emerged as an urgent task. To effectively control supply and demand when a supply-demand imbalance occurs, it is necessary to recognize and evaluate supply-demand conditions properly. To achieve this, three requirements should be satisfied: 1) securing accurate data, 2) establishing a stable supply model, and 3) developing indicators to identify the supply-demand balance. The quality of information and data related to supply and demand has improved considerably compared to the past. From the third quarter of 2017, the data released by Statistics Korea were replaced with beef traceability system data. In the pork sector as well, more accurate analysis of pig production became possible, for example because the sample survey data were improved on the basis of the report data of the pig traceability system. In particular, short-term and mid-to-long-term forecasting became possible simultaneously, as the quarterly and monthly information contained in the pig traceability system was provided. This secured the primary data necessary for studying responses to supply and demand regulation.

However, alternatives for supply model construction and manuals have been relatively insufficient. Currently, most studies related to supply models in Korea are biased toward Korean beef, and only a relatively small number of studies address the pork supply model. In addition, research institutes and related ministries are paying attention to ways to control the supply and demand of pork. However, the manual for implementing measures is still in the development stage, as the crisis level setting for supply and demand control is not yet systematic. To promptly evaluate the supply-demand control conditions for pigs, a stable supply model and systematic manual development are required. Against this background, this study builds a pork supply prediction model using pig traceability system report data provided by the Korea Institute for Animal Products Quality Evaluation (KAPE). Based on the derived results, it proposes a method for setting the supply–demand crisis stage using a statistical approach.

Literature review

The pork forecasting model provides information related to the supply, demand, and price of pork, and serves to promote a stable supply and price of pork. To this end, it is necessary to build a model that reflects the trend of supply and demand of pigs and price information, so that supply, demand, and price can be predicted through model estimation. The methodological basis of such outlook and forecasting work can be found as early as the early 20th century, and forecasting models developed as statistical figures on the supply and demand of agricultural products were published, enabling comparative analysis between items (Warren and Pearson, 1928).

Afterwards, with the movement to develop models consistent with economic theory in the field of agricultural outlook, work on the systematization of demand and supply models began (McKillop, 1967; Deaton and Muellbauer, 1980). Since then, methods for refining pork predictions have been developed, such as time series analysis, dynamic analysis (Skold and Holt, 1988; Chavas and Holt, 1991), and simulation analysis (Prescott and Stengos, 1987; Devadoss et al., 1989). Recently, techniques using big data or machine learning have also been applied (Chuluunsaikhan et al., 2020; Ryu et al., 2020), and efforts to improve the accuracy of the pig outlook by applying various methodologies are still ongoing.

Meanwhile, there have been several attempts in the Korean academic community to analyze the price and production of meat, but most of the studies were related to the supply control of Korean beef, and there are few studies on pigs. The KREI-KASMO study (Seo et al., 2020) concerns projections for the agricultural sector in Korea. KREI-KASMO is a partial equilibrium model developed in 2008 to forecast aggregate figures for Korea's agricultural sector, and covers a total of 74 items (cultivation: 65, livestock: 9), including pigs. The pork supply and demand forecasting model in KREI-KASMO is structured as a simultaneous equilibrium model that forecasts the number of breeding animals, graded animals, consumption, imports, and wholesale prices using annual data.

Han et al. (2020) developed a monthly pork model that enables recursive simulation based on the report data of the traceability system. In this model, the breeding sector reflected the growth cycle of breeding pigs - reserve pigs - sows - piglets - growing pigs - finishing pigs. To reflect the fact that the elasticity of the number of pigs graded changes from month to month, a variable obtained by multiplying the number of fattening pigs in the previous year by a month dummy variable was introduced as an explanatory variable to capture differences in seasonal effects. Few studies have analyzed the pork supply model, and even these focus on simulation with structure-based explanation, so the discussion of supply control is limited. Therefore, it is necessary to forecast the supply situation and prepare countermeasures to a supply crisis.

This study aims to increase the utilization of the model for supply and demand regulation by constructing a monthly model that allows short-term predictions and flexible responses. The model was also supplemented with variables that were not previously considered, to improve the predictive power of estimating the number of pigs graded. In addition, this paper goes beyond the predictive analysis of the model and discusses a plan for setting the supply crisis stage based on the analysis results, thereby laying the foundation for supply-demand policy analysis that recognizes and evaluates the supply-demand situation of pigs.

## Materials and Methods

Methods

Pork supply model

The model was constructed to forecast short-term pig supply based on information on the number of pigs slaughtered. The dependent variable in the pig supply model is the number of pigs slaughtered (NPS). As explanatory variables, the lagged number of finishing pigs raised (NFP, Lag1), working days (WDS), a monthly variable (Month), and event factors (COVID-19, ASF quarantine measures such as movement restrictions, heatwave, etc.) were used. The relationship between NPS and the explanatory variables can be represented as follows:

$$\text{NPS}_t = f(\text{NFP}_{t-1},\ \text{WDS}_t,\ \text{Month}_t,\ \text{Event}_t) \tag{1}$$

To estimate NPS, linear regression was used, and the relationship between the dependent variable and the explanatory variable can be represented as follows:

$$y = X\beta + \varepsilon \tag{2}$$

where $y$ is the vector of NPS, $X$ is the matrix of explanatory variables, including the number of finishing pigs in the previous period, WDS, slaughter month variables, and event variables that may affect pig supply, and $\varepsilon$ is the error term vector. In this case, the coefficients of the explanatory variables, $\beta$, can be estimated using the following calculation:

$$\hat{\beta} = (X'X)^{-1}X'y \tag{3}$$

Classical linear regression assumes that the errors are independent and homoscedastic. When these assumptions are relaxed, the covariance matrix of the errors can be expressed in the form $\sigma^2\Omega$. Generally, $\Omega$ is a positive definite matrix that acts as a weighting matrix; under the Gauss-Markov assumptions it reduces to the identity matrix (Wooldridge, 2010). In the generalized regression model, the covariance matrix of the least-squares estimator can be expressed as $(X'X)^{-1}X'(\sigma^2\Omega)X(X'X)^{-1}$. If $\sigma^2\Omega$ is unknown, this matrix can still be estimated by relying on convergence in probability of $(X'X)/n$ and $X'(\sigma^2\Omega)X/n$ for sample size $n$. Letting $Q$ denote the latter matrix, it can be represented as sums of squares and cross products that involve $\sigma_{ij}$ and the rows of $X$ (Greene, 2012) as follows:

$$Q = \frac{1}{n}X'(\sigma^2\Omega)X = \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}\sigma_{ij}\,x_i x_j' \tag{4}$$

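With time series data covering $T$ periods, the natural estimator of $Q$ replaces $\sigma_{ts}$ with the product of the least-squares residuals $e_t$ and $e_s$, which estimate the error terms $\varepsilon_t$ and $\varepsilon_s$, as follows:

$$\hat{Q} = \frac{1}{T}\sum_{t=1}^{T}\sum_{s=1}^{T} e_t e_s\,x_t x_s' \tag{5}$$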

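Convergence of this estimator requires that the correlation between observations gradually decreases as they become further apart in time (Andrews, 1991). The autocorrelation problem can then be handled through the heteroskedasticity and autocorrelation consistent (HAC) estimator, which uses linearly decaying weights $w_l$ up to a maximum lag $L$ (Newey and West, 1987; Zeileis, 2004), as follows:

$$\hat{Q} = \frac{1}{T}\sum_{t=1}^{T} e_t^2\,x_t x_t' + \frac{1}{T}\sum_{l=1}^{L}\sum_{t=l+1}^{T} w_l\,e_t e_{t-l}\left(x_t x_{t-l}' + x_{t-l}x_t'\right), \qquad w_l = 1 - \frac{l}{L+1} \tag{6}$$

where $e_t$ is the least-squares residual in period $t$ and $x_t$ is the corresponding row of $X$.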

In this study, NPS was estimated through linear regression. To address the serial correlation that may occur when estimating with time series data, the Newey-West HAC method was used, which builds the covariance matrix of the coefficients from the residuals so that inference remains valid under autocorrelation and heteroskedasticity.
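The estimation steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function name `ols_newey_west` and the `max_lag` argument are assumptions introduced here, and the lag length actually used in the study is not reported.

```python
import numpy as np


def ols_newey_west(X, y, max_lag):
    """OLS coefficients with Newey-West (linearly decaying weights) HAC standard errors.

    X: (T, k) array of explanatory variables, including a constant column.
    y: (T,) array of the dependent variable (NPS in the text).
    max_lag: hypothetical truncation lag L; not specified in the source.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    T = X.shape[0]

    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y          # least-squares coefficients, as in equation (3)
    resid = y - X @ beta              # residuals e_t

    xe = X * resid[:, None]           # row t holds x_t * e_t
    q_hat = xe.T @ xe / T             # lag-0 term: (1/T) * sum of e_t^2 x_t x_t'
    for lag in range(1, max_lag + 1):
        weight = 1.0 - lag / (max_lag + 1.0)    # linearly decaying weight w_l
        gamma = xe[lag:].T @ xe[:-lag] / T      # (1/T) * sum of e_t e_{t-l} x_t x_{t-l}'
        q_hat += weight * (gamma + gamma.T)

    cov = T * xtx_inv @ q_hat @ xtx_inv         # sandwich covariance of beta
    return beta, np.sqrt(np.diag(cov))
```

With `max_lag=0` the weighted sum is empty and the result reduces to heteroskedasticity-robust (White) standard errors, which is a convenient sanity check.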

Evaluation of model performance

In this study, the proximity of predicted and actual values was examined, and the stability and predictive power of the model were tested using root mean square percent error (RMSPE), mean absolute percent error (MAPE), and Theil's inequality coefficient (Theil's U). RMSPE is similar to the method for calculating the root mean square error but is calculated in percentage units (Lee et al., 2021), as follows:

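$$\text{RMSPE} = 100 \times \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\frac{Y_i^S - Y_i}{Y_i}\right)^2} \tag{7}$$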

where $Y_i^S$ is the predicted value obtained from the model, $Y_i$ is the actual value, and $n$ is the number of observations. Since this criterion is calculated in percentage units, it has the advantage that the error does not change even if the measurement unit is changed. MAPE calculates the error as an absolute ratio and has the advantage of being less sensitive to outliers than squared-error measures. MAPE can be calculated as follows:

$$\text{MAPE} = \frac{100}{n}\sum_{i=1}^{n}\left|\frac{Y_i^S - Y_i}{Y_i}\right| \tag{8}$$

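Theil's U measures the accuracy of the predicted values relative to the actual values as follows:

$$\text{Theil's } U = \frac{\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Y_i^S - Y_i\right)^2}}{\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Y_i^S\right)^2} + \sqrt{\frac{1}{n}\sum_{i=1}^{n}Y_i^2}} \tag{9}$$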

RMSPE and MAPE are expressed as percentages, with values generally falling between 0 and 100. In contrast, Theil's U is distributed between 0 and 1. The closer RMSPE, MAPE, and Theil's U are to 0, the greater the stability and predictive power of the model.
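The three criteria can be computed as in the following sketch. The function names are chosen here for illustration, and the Theil's U shown is the bounded form (values between 0 and 1) that the text describes.

```python
import math


def rmspe(actual, predicted):
    """Root mean square percent error, in percent."""
    n = len(actual)
    total = sum(((p - a) / a) ** 2 for a, p in zip(actual, predicted))
    return 100.0 * math.sqrt(total / n)


def mape(actual, predicted):
    """Mean absolute percent error, in percent."""
    n = len(actual)
    return 100.0 * sum(abs((p - a) / a) for a, p in zip(actual, predicted)) / n


def theil_u(actual, predicted):
    """Theil's inequality coefficient, bounded between 0 and 1."""
    n = len(actual)
    rmse = math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / n)
    scale_pred = math.sqrt(sum(p ** 2 for p in predicted) / n)
    scale_act = math.sqrt(sum(a ** 2 for a in actual) / n)
    return rmse / (scale_pred + scale_act)
```

For example, with actual values `[100, 200]` and predictions `[110, 180]`, both percentage errors are 10% in absolute value, so RMSPE and MAPE both equal 10.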

Determining the supply crisis stage

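The confidence interval of the error term was used to set the supply-demand crisis stage from the predicted results. The confidence interval is the interval within which the population mean $\mu$ is estimated to lie at a given confidence level when there are $n$ samples $X_1, X_2, X_3, X_4, \ldots, X_n$, and can be calculated as follows:

$$\bar{X} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} \tag{10}$$

where $\bar{X}$ is the sample mean, $s$ is the sample standard deviation, and $t_{\alpha/2,\,n-1}$ is the critical value of the t distribution with $n-1$ degrees of freedom at confidence level $1-\alpha$.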

The predicted values for each confidence interval of the error term were recalculated. The supply crisis level was determined by comparing the recalculated forecast for each confidence interval with the average-year value for the month. The average-year value is the average over three years obtained by excluding the maximum and minimum values of the monthly NPS for the last five years. The crisis stage can be set, as shown in Fig. 1, by comparing the predicted value for each derived confidence interval with the average-year value. The stages follow the general classification of statistical significance levels. When the two values intersect within the 50% interval, the stage is identified as green; for 50 - 95%, as yellow; for 95 - 99%, as amber; and when the two values cross outside the 99% interval, as red.
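The average-year value described above is a trimmed mean, which can be sketched as follows. The function name and the sample figures in the usage note are hypothetical and are not taken from the study's data.

```python
def average_year_value(last_five_years):
    """Average of three values left after dropping the max and min of five.

    last_five_years: monthly NPS for the same calendar month in each of
    the last five years (five numbers).
    """
    if len(last_five_years) != 5:
        raise ValueError("expected exactly five yearly values")
    trimmed = sorted(last_five_years)[1:-1]   # drop the minimum and the maximum
    return sum(trimmed) / len(trimmed)
```

Calling `average_year_value([1, 2, 3, 4, 100])` drops 1 and 100 and averages the remaining values to 3.0, which shows how a single unusual year is kept from distorting the benchmark.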

Data

This study was conducted based on the pork traceability system report information provided by KAPE. The pork traceability system used in the analysis includes monthly data on sows, piglets, growing pigs, and finishing pigs from December 2017 to February 2021. In this study, NPS was used as a dependent variable for estimating pig supply. In addition, NFP, WDS, slaughter month indicators, and event variables that could affect pig supply were used as explanatory variables for the model estimation. The variables used for model estimation and data summary are shown in Table 1, and monthly NPS change is represented in Fig. 2.

Sd, standard deviation; Max, maximum; Min, minimum.

^{z} Indicates whether quarantine measures have been implemented.

Table 1. Data summary.

When estimating supply in livestock models, the number of breeding animals is the primary variable (Han et al., 2020). Pigs follow a cycle of being born, grown, slaughtered, and distributed to the market. They are piglets (0 - 2 months old) for the first two months after birth, become growing pigs (3 - 4 months old) over the next two months, become finishing pigs (five months and older) after a further two months, and are shipped to the market when they reach an appropriate weight (110 - 115 kg). Therefore, the piglets of 4 - 5 months earlier become the growing pigs of 2 - 3 months earlier, and these growing pigs in turn determine the number of finishing pigs raised. Because traceability system data are based on farmers' reports on piglets, growing pigs, and finishing pigs, it is not easy to determine the exact age in months. Therefore, in this study, the growth process of piglets → growing pigs → finishing pigs was treated as a cohort factor. The number of piglets was assumed to be determined by the number of sows and the number of piglets per sow per year (PSY). The number of growing pigs is determined by the numbers of growing pigs and piglets in the previous month. The number of finishing pigs is determined by the numbers of finishing pigs and growing pigs in the previous month. In this study, the number of animals raised was used as a variable for estimation, and NFP, the stage just before market shipment, was applied.
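The cohort structure described in this paragraph can be summarized schematically. The functions $f_1$, $f_2$, and $f_3$ and the variable labels are placeholders introduced here for illustration; the text does not give functional forms or the timing of the sow variable.

```latex
\begin{aligned}
\text{Piglets}_t &= f_1(\text{Sows},\ \text{PSY}) \\
\text{Growing}_t &= f_2(\text{Growing}_{t-1},\ \text{Piglets}_{t-1}) \\
\text{NFP}_t &= f_3(\text{NFP}_{t-1},\ \text{Growing}_{t-1})
\end{aligned}
```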

WDS is the variable representing the number of days on which pig slaughter operations are carried out per month. Standard WDS are Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, and Sunday; national holidays are considered days off. Around some national holidays, such as Thanksgiving Day and New Year's Day, the NPS decreases sharply, and the delayed slaughter is carried out afterwards. Days with an NPS of less than 35,000 were not counted as WDS, because the sharp declines in NPS on these days could distort the average number of animals slaughtered per working day.3)
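The threshold rule for WDS can be expressed as a short counting function. This is a sketch: the function name and the daily figures in the usage note are hypothetical, and treating a day with exactly 35,000 animals as a working day is an assumption, since the text only excludes days below the threshold.

```python
MIN_DAILY_NPS = 35_000  # threshold stated in the text


def count_working_days(daily_nps):
    """Count the days in a month whose slaughter count reaches the threshold.

    daily_nps: iterable of daily numbers of pigs slaughtered for one month.
    Days below MIN_DAILY_NPS (for example, around major holidays) are skipped.
    """
    return sum(1 for n in daily_nps if n >= MIN_DAILY_NPS)
```

For instance, `count_working_days([72_000, 68_500, 12_000, 0, 70_100])` counts three working days, because the two low-volume days fall below the threshold.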

In addition, event dummies and a structural change variable that can affect pig supply were constructed as exogenous variables to improve the fit and predictive power of the model. For event dummies, Heatwave and COVID-19 were the primary variables. Heatwave is an indicator variable assigned a value of 1 for the months affected by the extreme heatwave, July and August 2018, and 0 otherwise. The COVID-19 variable was assigned a value of 1 for April 2020 and 0 otherwise. The structural change variable was defined by the introduction of the ASF quarantine measures; it was set to 1 after September 2019 and 0 before that.
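The coding of these indicator variables can be sketched as below. The function name and dictionary keys are hypothetical, and treating September 2019 itself as 0 is an assumption made here, because the text only states that the structural change variable equals 1 "after September 2019".

```python
def event_dummies(year, month):
    """Return the event and structural-change indicators for one month."""
    period = (year, month)
    return {
        # 1 for the heatwave months of July and August 2018
        "Heatwave": int(period in {(2018, 7), (2018, 8)}),
        # 1 for April 2020 only
        "COVID19": int(period == (2020, 4)),
        # 1 for months strictly after September 2019 (boundary is an assumption)
        "ASF_SD": int(period > (2019, 9)),
    }
```

Applying the function to each month from December 2017 to February 2021 would produce the dummy columns that enter the regression alongside NFP, WDS, and the month variables.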

## Results and Discussion

Results of the pork supply model estimates

When the NPS from December 2017 to February 2021 was estimated using the traceability data, the R-squared, which indicates the explanatory power of the model, was 0.93, and the Durbin-Watson statistic was 1.74, suggesting no serious autocorrelation (Table 2). The variance inflation factors for the variables were below 10, indicating no multicollinearity issue. The WDS was statistically significant at the 1% level, and the NPS increased by 50,596 animals for each additional working day. The NFP coefficient was 0.13, indicating that a one-animal increase in the number of finishing pigs in the previous month was associated with an increase of 0.13 animals in the NPS in the current month. This coefficient was not statistically significant, even though NFP and NPS are related in practice; it may become significant as the sample size increases.

The coefficient of ASF_SD, the structural change dummy, was 44,392.50 and statistically significant. The positive effect of ASF_SD suggests that the domestic quarantine measures taken immediately after the occurrence of ASF contributed to reducing livestock diseases. The Heatwave variable had a negative effect and was significant at the 1% level; the NPS decreased significantly due to the summer heatwave in July and August 2018. The COVID-19 variable was also statistically significant. Its positive effect appears to reflect the decrease in dining-out consumption together with the increase in expected demand from disaster aid payments immediately after the re-spread of COVID-19 in March 2020.

Performance test of the pork supply model

The accuracy of the forecast is a vital factor in identifying the crisis stage for supply control. If the predictive power is high, the effects of measures such as supply and demand management policies are guaranteed. Still, if the predictive power is low, the actions to set the supply-demand control crisis stage may be reflected differently from the actual conditions. This study tested the model's performance by comparing the predicted values with the actual values based on the estimated model results (Table 3).

The number of finishing pigs is reflected in the estimated supply function used to predict the NPS. To measure the accuracy of the model's predictions, the in-sample error for the period from October 2020 to February 2021 and the out-of-sample error for the period from March to July 2021 were tested using RMSPE, MAPE, and Theil's U. For RMSPE and MAPE, the in-sample errors were 1.6% and 1.5%, respectively, indicating a high level of predictive power. The out-of-sample errors were also within 5%, indicating reasonably stable performance.

On a monthly basis, the periods with the highest prediction accuracy were April and June of 2021, and the periods with relatively low prediction accuracy were March, May and July. In March, NPS increased significantly due to an increase in the proportion of domestic pork consumption resulting from reduced imports during the period, resulting in a prediction error rate of 5.4%. In May, the NPS increased because of more household demand due to Children's Day and the continued increase in domestic pork consumption, resulting in a prediction error rate of 4.2%. In July, the number of slaughtered animals decreased due to a delay in pig growth influenced by the heatwave at the end of the month, and the prediction error rate was 7.6%.

NFP, the number of finishing pigs; WDS, working days; ASF_SD, structure dummy of African swine fever.

*** and ** denote significance at the 1% and 5% levels, respectively.

Table 2. Results of the number of pigs slaughtered (NPS) estimation.

RMSPE, root mean square percent error; MAPE, mean absolute percentage error; Theil’s U, Theil's inequality coefficient.

Table 3. Results of error rate tests.

Identifying pig supply crisis stage

This study identified the supply crisis stage based on the confidence interval, calculated using the predicted value and the standard deviation of the residual. Accordingly, the crisis level was set to four levels (green, yellow, amber, and red). The critical t values for 38 degrees of freedom were 0.681, 2.024, and 2.712 for the 50%, 95%, and 99% confidence levels, respectively. The confidence interval was derived based on these critical values. The confidence interval distribution was derived by applying the simulated NPS value from the pig supply model, as represented in Fig. 3.

Fig. 3 shows the distribution of confidence intervals, applying the July forecast of 1,458,000 pigs to determine the supply–demand control crisis stage. The average-year value of 1,328,000 is less than 1,381,000, the lower limit of the 99% confidence interval of the predicted value. Because the average-year value falls outside the 99% interval and below the forecast, the supply crisis stage is determined as red (surplus).
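The stage assignment can be sketched as a comparison of the standardized gap between the forecast and the average-year value with the critical t values reported above. This is an illustrative reading, not the authors' code: the function name and the `std_error` argument are assumptions, and in the usage lines the standard error is backed out from the reported 99% limit on the assumption that the interval is symmetric around the forecast.

```python
# Critical t values for 38 degrees of freedom, as reported in the text.
STAGE_LIMITS = (("green", 0.681), ("yellow", 2.024), ("amber", 2.712))


def crisis_stage(forecast, average_year, std_error):
    """Return (stage, direction) for a forecast against the average-year value."""
    if forecast > average_year:
        direction = "surplus"
    elif forecast < average_year:
        direction = "shortage"
    else:
        direction = "balanced"
    distance = abs(forecast - average_year) / std_error
    for stage, limit in STAGE_LIMITS:
        if distance <= limit:
            return stage, direction
    return "red", direction   # outside the 99% interval


# July example from the text; std_error is inferred, not reported.
implied_se = (1_458_000 - 1_381_000) / 2.712
july_stage = crisis_stage(1_458_000, 1_328_000, implied_se)  # ("red", "surplus")
```

Expressing the rule as a standardized distance keeps the three thresholds in one place, so the critical values can be updated when the sample size, and hence the degrees of freedom, changes.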

## Conclusion

To forecast supply and demand in Korea's pork industry and respond to rapidly changing conditions, NPS was estimated, and a method for identifying the crisis stage was developed using the predicted value. The NPS was estimated using the number of WDS, lagged NFP, month dummy variables, a structural change variable reflecting productivity improvement immediately after the outbreak of ASF, and event variables for COVID-19 and the heatwave. Based on the NPS estimation results, the model's explanatory power was over 90%, and problems that may appear in results based on time series data, such as autocorrelation and multicollinearity, did not occur. The model's performance was tested using RMSPE, MAPE, and Theil's U for in-sample and out-of-sample error rates. These results indicate that the supply model predicts NPS with low error.

To evaluate supply-demand conditions, the distribution of the confidence interval of the predicted value was established, and the supply crisis stage was identified by comparing it with the average-year value. The method for identifying the crisis stage presented in this study can be used as an index for stabilizing the supply and demand of pork. However, pork imports have recently come to account for a large portion of the supply.4) To better reflect the structure of pork supply, it should be possible to evaluate the supply crisis stage by considering imports, inventories, and other factors that affect supply, as well as domestic production.

This study aimed to identify the stage of the pork supply control crisis using NPS estimates in Korea and to prepare an analytical framework for stable policy implementation. The error for the simulated value was low, indicating that the model's predictive power was excellent and the results were reliable. Because the model uses data prepared from reports to the traceability system, it is necessary to keep building the data and to operate the model continuously through periodic updates. Also, considering that the number of WDS, one of the variables with high explanatory power in the estimation function, can change depending on government policy or the market situation, the market situation must be monitored to maintain the model's performance. Because imports are affected by domestic demand, a rigorous analysis of the pork supply must consider a demand model together with the supply model proposed in this study. In future studies, the forecasting analysis of pork supply and demand should be conducted in a more diversified manner by simultaneously considering production and consumption.

## Footnote

1) When FMD occurred in 2010, about 3.32 million animals (about 30% of the total) were culled, causing so much damage that it was called the “Worst livestock disease catastrophe in history.” ASF first occurred in Korea in 2019 in Gyeonggi Province and is considered a contagious livestock disease that requires large-scale disposal when confirmed cases occur due to high morbidity and mortality rates.

2) A typical example of a heat wave is the summer of 2018. The July - August heatwave delayed the growth of pigs weighing 80 kg or more, significantly decreasing the summer supply of pigs. After that, pigs with delayed growth began to be shipped in earnest in October; the result was an oversupply in the fourth quarter of 2018.

3) It was found that the discrepancy between the reported monthly NPS and the monthly NPS, which considered only the working days when more than 35,000 animals were slaughtered between January and June 2021, was within 0.3%.

4) As the import offer price of pork is now rising, the import volume of pork is decreasing, leading to a decreased total pork supply to the domestic market.