Cornell e-Rulemaking Initiative Publications

Document Type

Conference Presentations

Publication Date

5-2008

Abstract

We address the e-rulemaking problem of reducing the manual labor required to analyze public comment sets. In current and previous work, for example, text categorization techniques have been used to speed up the comment analysis phase of e-rulemaking by classifying sentences automatically, according to the rule-specific issues [2] or general topics [7, 8] that they address. Manually annotated data, however, is still required to train the supervised inductive learning algorithms that perform the categorization. This paper, therefore, investigates the application of active learning methods to public comment categorization: we develop two new, general-purpose active learning techniques that selectively sample from the available training data for human labeling when building the sentence-level classifiers employed in public comment categorization. Using an e-rulemaking corpus developed for our purposes [2], we compare our methods to the well-known query-by-committee (QBC) active learning algorithm [5] and to a baseline that randomly selects instances for labeling in each round of active learning. We show that our methods achieve statistically significant improvements over both the random-selection active learner and the QBC variant, requiring many fewer training examples to reach the same levels of accuracy on a held-out test set. This provides promising evidence that automated text categorization methods might be used effectively to support public comment analysis.
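For readers unfamiliar with the QBC baseline mentioned above, the sketch below shows a generic query-by-committee active learning loop: a committee of classifiers is retrained each round, and the unlabeled instances they disagree on most (measured here by vote entropy) are selected for human labeling. The committee members, disagreement measure, query batch size, and synthetic data are illustrative assumptions only and do not reflect the paper's actual classifiers, parameters, or corpus.

```python
# A minimal sketch of query-by-committee (QBC) active learning.
# All modeling choices here are assumptions for illustration.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

def vote_entropy(committee, X):
    """Disagreement score: entropy of the committee's label votes per instance."""
    votes = np.stack([clf.predict(X) for clf in committee])  # (n_members, n_samples)
    n_members = len(committee)
    entropies = np.zeros(X.shape[0])
    for j in range(X.shape[0]):
        _, counts = np.unique(votes[:, j], return_counts=True)
        p = counts / n_members
        entropies[j] = -(p * np.log(p)).sum()
    return entropies

# Synthetic stand-in for sentence feature vectors and issue labels.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
labeled = list(range(20))                        # small seed set of labeled instances
pool = [i for i in range(len(X)) if i not in labeled]

for round_ in range(10):                         # 10 rounds, 5 queries each (assumed)
    committee = [
        LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled]),
        GaussianNB().fit(X[labeled], y[labeled]),
        DecisionTreeClassifier(random_state=0).fit(X[labeled], y[labeled]),
    ]
    scores = vote_entropy(committee, X[pool])
    # Query the pool instances the committee disagrees on most.
    queried = [pool[i] for i in np.argsort(scores)[-5:]]
    labeled.extend(queried)                      # oracle (human annotator) reveals y[queried]
    pool = [i for i in pool if i not in queried]
```

A random-selection baseline, such as the one the paper compares against, would simply replace the entropy-based query step with a uniform random draw from the pool.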

Comments

Presented at the 9th International Conference on Digital Government Research, May 18-21, 2008.
