### Predictors

Individual-level variables that were included in the regression models were gender (male, female), age (20-44, 45-64, 65-79), total household income (low {less than $20,000}, medium {$20,000-$59,999}, high {more than $60,000}), education (low {not completed high school}, medium {high school completion and some post-secondary education} and high {university degree}), number of chronic medical conditions (0, 1, >1) from the following list (asthma, fibromyalgia, arthritis/rheumatism, back problems, high blood pressure, diabetes, epilepsy, heart disease, and cancer), perceived health status (poor/fair, good, very-good/excellent), number of ADG's (0-5, 6-9, >9), RUB status (0-1, 2, 3, 4-5), access to a primary care physician in the community (no, yes), and location of primary residence (rural, urban).

### Regression Models for Count Outcomes

Perhaps the most parsimonious and widely implemented method for modeling count data in the public health sciences is Poisson regression. The Poisson regression model assumes that the number of events (y

_{i}) experienced by patient i follows a Poisson distribution:

where μ

_{i} represents the conditional mean response of a given patient, which is assumed to depend on a set of observed data (x

_{i}) and an estimated vector of coefficients (β). Mathematically, this relationship takes the following form:

Taking the natural logarithm of the conditional mean allows for the response under consideration to vary linearly as a function of observed predictor variables multiplied by the effect of their corresponding regression coefficients. Various numerical maximization methods exist for iteratively estimating the values of the coefficient vector, β, and the associated -covariance matrix. variance Estimates are typically found by finding the parameter estimates that maximize the following log-likelihood function:

Since the natural logarithm of the likelihood function for the Poisson regression model is globally concave, a unique maximum can be found if it exists [21]. A restrictive assumption attached to the Poisson regression model is that the conditional variance is assumed to be equal to the conditional mean. As a result, the Poisson regression model is not always an ideal model for count data, especially in instances where a large mass of observations exists on the corner of the empirical distribution. This typically arises in the form of observed zeroes in a data set that are in excess of what would be predicted by the Poisson distribution. In severe instances, fitting a Poisson model to data with excess zeroes can result in model misspecification, inefficient parameters estimates and incorrect inferences.

A less parsimonious, but more flexible extension to the Poisson regression model is the negative binomial regression model. The negative binomial regression model does not assume that the conditional variance of the response is equal to the conditional mean. A simple extension to the specification of the Poisson conditional mean leads to a negative binomial regression model, which is illustrated below:

Above, the conditional mean for the Poisson model has been adjusted by adding an individual specific random term, ε

_{i}, that is assumed to be uncorrelated with the observation vector, x

_{i}. Typically one assumes that

follows a gamma distribution. By extending the Poisson conditional mean in this manner, we arrive at the negative binomial regression model. The inclusion of the random error in the conditional mean of the negative binomial regression model is useful, as it allows for the modeling of both observed and unobserved heterogeneity whereas, the Poisson model only accounts for observed heterogeneity. In other words, using the Poisson regression model it was assumed that patients with the same observation vector would incur the same conditional mean response. The incorporation of the random term in the negative binomial regression model allows patients with identical observation vectors to experience different conditional mean responses. If we assume (ε

_{i}) has a mean that of 1 and variance of υ then the conditional mean of yi is still μ

_{i}; however, the conditional variance becomes μ

_{i}(1 +uμ

_{i}) = μ

_{i} + uμ

_{i}
^{2}. As u approaches zero binomial regression model converges toward the Poisson model, with a conditional mean that is equal to the conditional variance, μ

_{i} [

19,

21]. For the negative binomial model, the probability that an individual patient incurs yi emergency department visits is dictated by the following density function:

Above, μ

_{i} represents the mean number of events that is expected for an individual with observation vector xi, u represents the negative binomial dispersion parameter and Γ(·) represents the gamma function. Determination of regression coefficients in negative binomial regression proceeds by maximizing the following log-likelihood function with respect to the unknown parameters:

The negative binomial regression model is a useful model for accounting for data in which unobserved heterogeneity or temporal/spatial correlation is present; however, it is not necessarily an optimal model for dealing with data that contain an excess mass of zeroes at the corner of its empirical distribution.

Zero Inflated Poisson (ZIP) regression models were introduced by Lambert [

22] as a method for modeling the factors influencing the number of defects encountered in a manufacturing application. Greene [

23] introduced the idea of the Zero Inflated Negative Binomial (ZINB) model to handle both excess zeroes and over-dispersion as a result of unobserved heterogeneity which commonly arises in economic problems. Each of the models - ZIP and ZINB - assumes that patients can fall into one of two groups. The first group of patients never experience the outcome (eg. always show zero demand for emergency department services) and the second group of patients show some positive demand which is governed by the Poisson or negative binomial density. A patient falls into group 1 with probability ψ

_{i}, and a patient falls into group 2 with probability (1 - ψ

_{i}), where ψ

_{i} is an estimable parameter from available data. Group 1, although homogeneously comprised of persons with zero demand for emergency services over a given interval of time are derived from a combination of processes - resulting from the binomial probability, ψ

_{i}, and the zeroes which accumulate naturally in the Poisson or negative binomial densities. Distinguishing between the two sources of zeroes is not possible, as it is a form of discrete unobserved heterogeneity [

21]. The probability density function for the ZIP model is given below:

Similarly, for zero-inflated negative binomial model, the probability density function is given by:

For both the ZIP and ZINB models the probability of an excess zero, ψ

_{i}, the is modeled using logistic regression (although, any binary regression framework will suffice). As a result, the probability of an excess zero is given by:

In other words, the probability of an excess zero is a function of some observed linear predictor, η

_{i}, which itself is formed from a set of predictor variables, z

_{i}, multiplied by their associated logistic regression coefficients, ε(nb. the set z

_{i}, in the logistic of model need not equal the set of variables, x

_{i}, in the Poisson or negative binomial component regression models). For the ZIP model the conditional mean and variance are:

For the ZINB model, the conditional mean is the same as for the ZIP model; however, the conditional variance differs. The equations for both the conditional mean and variance of the ZINB model are given below:

Considering ψ

_{i} as the probability of excess zeroes, it can be observed that as ψ

_{i} tends toward zero then the probability densities, as well as the conditional mean and variances of the ZIP and ZINB models converge toward the corresponding formulas for the Poisson and negative binomial models, respectively [

18,

19,

21]. Determination of regression coefficients for the ZIP and ZINB models once again occurs by maximization of the log-likelihood functions, which are given below.

Here *I*(·) is an indicator function.

One issue with the application of zero-inflated modeling strategies for emergency department demand is that interpretively some of the zeroes in ZIP/ZINB models are considered to be structural; whereas, others are assumed to arise as a result of a sampling process. Conceptually, it is hard to imagine even the healthiest individuals in the Ontario population not being "at risk" for an emergency department visit and hence representing a structural zero. As a result, even though the ZIP and ZINB models may fit our data well, a more parsimonious explanation of the phenomena under investigation can be derived using a hurdle modeling approach.

Hurdle models account for excess zeroes but are specified and interpreted slightly differently than the ZIP and ZINB models discussed above. The hurdle regression approach was introduced by Mullahy [

14] and incorporates a Bernoulli or right censored count density (the hurdle component) with a left truncated count density (the count component). As a result, hurdle models are composed of two mechanisms, similar to their zero-inflated counterparts; however, hurdle approaches avoid modeling the probability of a zero occurrence as a function of two mixed sources - the structural component (logistic model) and a sampling component (count model). Rather, using a hurdle approach the probability of an observation occurring at zero or not zero is governed by a Bernoulli (or right censored count) density. That is, the probability of a zero occurrence is given by ψ

_{i}. Further, the probability of a non-zero occurrence is given by (1 - ψ

_{i}). Therefore, with probability (1 - ψ

_{i}) we will observe a positive integer count from either a truncated Poisson or negative binomial density (normalized to integrate to 1). In a mathematically more precise manner, Mullahy [

14] illustrated that the general hurdle density is given as:

Above, z_{i} and γ represent the respective variables and coefficients associated with the zero/non-zero (hurdle) process. The xi and β are the respective variables and coefficients associated with the count process. Lastly, the *f*
_{zero}(·) function represents either a Bernoulli or right censored count density; whereas, the *f*
_{count}(·) function typically represents a left truncated Poisson or negative binomial density.

If we purport to use a Bernoulli density to model the probability (ψ

_{i}) of a zero versus a non-zero (1 - ψ

_{i}) count, coupled with a left truncated Poisson density (with mean μ

_{i}) for the count process, then our overall hurdle Poisson (HP) density looks as follows:

Similarly, if we purport to use a Bernoulli density to model the probability (ψ

_{i}) of a zero versus a non-zero (1 - ψ

_{i}) count, coupled with a left truncated negative binomial density (with mean, μ

_{i}, and negative binomial dispersion parameter, υ) for the count process, then our overall hurdle negative binomial (HNB) density looks as follows:

Again, regression coefficients are estimated through determination of the coefficients which maximize the following log-likelihood functions [

18]:

Intuitively, the hurdle approach to handling excess zeroes in medical utilization data is appealing as predicted zeroes are not interpreted as a mix of structural and sampling zeroes. Rather, the first component of the hurdle approach can be used to model whether a person does or does not decide to seek emergency services over the time interval of our study. This process can be modeled using a binary regression framework, such as logistic or probit regression. Given that a person does decide to seek emergency services, the number of visits they make to the emergency department can then be modeled using a left truncated Poisson or negative binomial distribution.