Cornell e-Rulemaking Initiative Publications

Title

An eRulemaking Corpus: Identifying Substantive Issues in Public Comments

Authors

Claire Cardie, Department of Computer Science, Cornell University
Cynthia R. Farina, Cornell Law School
Matt Rawding, Information Science Program, Cornell University
Adil Aijaz, Department of Computer Science, Cornell University

Document Type

Conference Presentations

Publication Date

5-2008

Abstract

We describe the creation of a corpus that supports a real-world hierarchical text categorization task in the domain of electronic rulemaking (eRulemaking). Features of the task and of the eRulemaking domain engender both a non-traditional text categorization corpus and a correspondingly difficult machine learning task. Interannotator agreement results are presented for a group of six annotators. We also briefly describe the results of experiments that apply standard and hierarchical text categorization techniques to the eRulemaking data sets. The corpus is the first in a series of related sentence-level text categorization corpora to be developed in the eRulemaking domain.

Comments

Presented at the 9th International Conference on Digital Government Research, May 18-21, 2008.

Recommended Citation

Cardie, Claire; Farina, Cynthia R.; Rawding, Matt; and Aijaz, Adil, "An eRulemaking Corpus: Identifying Substantive Issues in Public Comments" (2008). Cornell e-Rulemaking Initiative Publications. 6.
https://scholarship.law.cornell.edu/ceri/6

Included in

Administrative Law Commons
