Electronic rulemaking, E-rulemaking, Public comment, Notice-and-comment process, Rule-specific issue categorization, Flat text categorization, Hierarchical text categorization, Human interannotator agreement, Cornell e-Rulemaking Initiative, CeRI


Administrative Law | Categorical Data Analysis | Legislation


We address the e-rulemaking problem of categorizing public comments according to the issues that they address. In contrast to previous text categorization research in e-rulemaking, and in an attempt to more closely duplicate the comment analysis process in federal agencies, we employ a set of rule-specific categories, each of which corresponds to a significant issue raised in the comments. We describe the creation of a corpus to support this text categorization task and report interannotator agreement results for a group of six annotators. We outline those features of the task and of the e-rulemaking context that engender both a non-traditional text categorization corpus and a correspondingly difficult machine learning problem. Finally, we investigate the application of standard and hierarchical text categorization techniques to the e-rulemaking data sets and find that automatic categorization methods show promise as a means of reducing the manual labor required to analyze large comment sets: the automatic annotation methods approach the performance of human annotators for both flat and hierarchical issue categorization.

Publication Citation

Published in: Proceedings of the 9th Annual International Digital Government Research Conference, Montreal, Canada, May 18-21, 2008.