Logarithmic Quadratic Regression Model for Early Periods of COVID-19 Epidemic Count Data
Author(s): Daisuke Tominaga
Background: While the COVID-19 epidemic has been spreading worldwide, its characteristics are still unclear. Good mathematical models for predicting its spread and decline are strongly needed. The epidemic curve, that is, the number of persons found infected each day, shows how the epidemic grows and subsides. Compartment models such as the SIR model are generally used to express this curve mathematically. However, the parameter values of these models, which are based on ordinary differential equations, are very sensitive to errors in the observed data, and it is often difficult to find a reliable model, especially when the amount of data is insufficient. A regression model with a small number of parameters, on the other hand, is more robust against data errors than a highly sensitive nonlinear differential equation model, but it is not clear what a good regression model for epidemic data is.
Methods: We modeled the initial emerging period of the COVID-19 epidemic curve in Tokyo by fitting a quadratic polynomial function to the logarithms of the numbers of infected cases. For comparison, we also fitted other regression models, including the generalized linear model.
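As a minimal sketch of the kind of fit described in the Methods, the Python code below fits a quadratic and a linear polynomial to the logarithms of daily counts by ordinary least squares and compares their residual sums of squares. The function name `fit_log_polynomial`, the use of NumPy, and the synthetic counts are assumptions made for illustration; they are not the Tokyo data or the author's implementation, and the generalized linear model comparison is not reproduced here.

```python
import numpy as np

def fit_log_polynomial(days, counts, degree):
    """Least-squares polynomial fit to log(counts).

    Returns (coeffs, rss): coefficients with the highest power first,
    and the residual sum of squares in log space.
    Counts must be strictly positive, since log(0) is undefined.
    """
    days = np.asarray(days, dtype=float)
    log_counts = np.log(np.asarray(counts, dtype=float))
    coeffs = np.polyfit(days, log_counts, degree)
    residuals = log_counts - np.polyval(coeffs, days)
    return coeffs, float(np.sum(residuals ** 2))

# Hypothetical, noise-free synthetic counts (NOT real case data):
# log(count) = a*t^2 + b*t + c with a < 0.
days = np.arange(30)
true_a, true_b, true_c = -0.005, 0.25, 1.0
counts = np.exp(true_a * days**2 + true_b * days + true_c)

quad_coeffs, quad_rss = fit_log_polynomial(days, counts, degree=2)
lin_coeffs, lin_rss = fit_log_polynomial(days, counts, degree=1)

print("quadratic coefficients (a, b, c):", quad_coeffs)
print("RSS quadratic:", quad_rss, " RSS linear:", lin_rss)
```

With real data, days with zero reported cases would need separate handling before taking logarithms, and a comparison between models would normally rely on a criterion that penalizes the extra parameter rather than on the raw residual sum of squares alone.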
Results: The logarithmic quadratic function model showed good statistical properties even in the early stages of the epidemic, when the number of cases is generally said to increase exponentially and monotonically. By applying the logarithmic quadratic function model to the case count data of each country in the world, we estimated the starting and subsiding dates of the epidemic and the total number of cases in each country.
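To illustrate how dates and totals could be read off a fitted logarithmic quadratic curve, note that if log y(t) = a t^2 + b t + c with a < 0, then y(t) is a Gaussian-shaped curve with its peak at t = -b / (2a). The sketch below rests on two assumptions that are not taken from the paper: the starting and subsiding dates are defined as the days on which the modeled daily count crosses a threshold (one case per day by default), and the total number of cases is approximated by the integral of y(t) over all t. The function name `log_quadratic_summary` is hypothetical.

```python
import math

def log_quadratic_summary(a, b, c, threshold=1.0):
    """Summarize the curve y(t) = exp(a*t^2 + b*t + c) for a < 0.

    Returns (start, peak_day, end, total). start and end are the days
    where y(t) equals `threshold` (an assumed definition), or None if
    the curve never reaches it; total is the integral of y over all t.
    """
    if a >= 0:
        raise ValueError("a must be negative for the curve to subside")
    peak_day = -b / (2 * a)
    log_peak = c - b * b / (4 * a)          # log of the peak daily count
    total = math.exp(log_peak) * math.sqrt(math.pi / -a)
    # Solve a*t^2 + b*t + (c - log(threshold)) = 0 for the crossing days.
    disc = b * b - 4 * a * (c - math.log(threshold))
    if disc < 0:
        return None, peak_day, None, total
    root = math.sqrt(disc)
    t1 = (-b + root) / (2 * a)
    t2 = (-b - root) / (2 * a)
    return min(t1, t2), peak_day, max(t1, t2), total

# Hypothetical coefficients, chosen only for illustration.
a, b, c = -0.005, 0.25, 1.0
start, peak_day, end, total = log_quadratic_summary(a, b, c)
print("start:", start, "peak:", peak_day, "end:", end, "total:", total)
```

The crossing days are symmetric about the peak by construction, which is a property of the quadratic form itself rather than of any particular data set.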
Conclusions: Although an epidemic curve in an early period is generally said to be exponential, namely linear in the logarithmic space, a quadratic curve regression fits better than the linear and the generalized linear models. These es