Statistical Methodology | Statistical Models


In law‐related and other social science contexts, researchers often need to account for data with an excess of zeros. Dollar damages in legal cases are also often skewed. This article reviews various strategies for dealing with data of this type. Tobit models are often applied to deal with the excess zeros, but they are more appropriate in cases of true censoring (e.g., when all negative values are recorded as zeros) and less appropriate when zeros are in fact often observed as the amount awarded. Heckman selection models are another methodology applied in this setting, yet they were developed for potential outcomes rather than actual ones. Two‐part models account for actual outcomes and avoid the collinearity problems that often attend selection models. A two‐part hierarchical model is developed here that accounts for both the skewed, zero‐inflated nature of damages data and the fact that punitive damage awards may be correlated within case type, jurisdiction, or time. Inference is conducted using a Markov chain Monte Carlo sampling scheme. Tobit models, selection models, and two‐part models are fit to two punitive damage awards data sets, and the results are compared. We illustrate that the nonsignificance of coefficients in a selection model can be a consequence of collinearity, whereas this problem does not arise with two‐part models.
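As a rough illustration of the two‐part idea only, and not the hierarchical model or the Markov chain Monte Carlo scheme developed in the article, the sketch below fits the simplest possible two‐part model: one categorical covariate, with the first part estimating the probability of a nonzero award and the second part describing the log of the positive awards. The function names, the group labels, and the award amounts are all hypothetical toy values chosen for the example, and the lognormal form for positive awards is an assumption made here.

```python
import math
from collections import defaultdict


def fit_two_part(awards, groups):
    """Fit a saturated two-part model with a single categorical covariate.

    Part 1: P(award > 0 | group), estimated by the group's share of nonzero awards.
    Part 2: mean and variance of log(award) among the group's positive awards
            (an assumed lognormal model for the positive part).
    """
    by_group = defaultdict(list)
    for y, g in zip(awards, groups):
        by_group[g].append(y)

    fit = {}
    for g, ys in by_group.items():
        logs = [math.log(y) for y in ys if y > 0]
        p_nonzero = len(logs) / len(ys)
        if logs:
            mu = sum(logs) / len(logs)
            sigma2 = sum((v - mu) ** 2 for v in logs) / len(logs)
        else:
            mu, sigma2 = float("nan"), float("nan")
        fit[g] = {"p_nonzero": p_nonzero, "mu_log": mu, "sigma2_log": sigma2}
    return fit


def expected_award(params):
    # E[Y] = P(Y > 0) * E[Y | Y > 0]; the second factor is the lognormal mean.
    return params["p_nonzero"] * math.exp(params["mu_log"] + params["sigma2_log"] / 2)


# Made-up toy data: zeros are real observed awards, not censored values.
awards = [0.0, 0.0, 0.0, 100.0, 10000.0, 0.0, 1000.0, 1000.0]
groups = ["contract"] * 5 + ["tort"] * 3

fit = fit_two_part(awards, groups)
for name, params in fit.items():
    print(name, params, expected_award(params))
```

Because the two parts are estimated separately, the zeros inform only the first part and the size of positive awards informs only the second, which is the feature that distinguishes this approach from treating zeros as censored values.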


Earlier versions of this material were presented at the First Annual Conference on Empirical Legal Studies, University of Texas, and at the International Conference on Empirical Legal Studies, conducted by the Cegla Center for Interdisciplinary Research of the Law, Tel Aviv University, Buchmann Faculty of Law.