Conference Proceedings Paper

A paper prepared for delivery at a conference and later published in the conference proceedings.

This paper was written to be presented at the 2010 meeting of the Special Interest Group for Design of Communication in São Paulo, Brazil, and was published in the proceedings from that conference.

The paper is a writeup of two components of our work with Eli Review. Both relate to finding a way to measure the helpfulness of the feedback someone gives. This is different from the thumbs-up / thumbs-down rating someone might give a product review on Amazon. Instead, this method tries to determine whether the feedback someone gave a writer actually helped that writer improve.

The first component details an experiment in which Bill Hart-Davidson, Jeff Grabill, and I worked with a group of teachers to determine which factors they weighed when judging the helpfulness of student feedback. In that experiment, we produced a weighting of those criteria that very closely matched the teachers' scores. The second part of the paper, written with (then) MSU graduate student Michael Wojcik and intern Christopher Klerkx, describes a fully automated algorithm that carries out this scoring process.
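To make the weighting idea concrete, here is a minimal sketch in Python of how criterion-level judgments about a single comment might be combined into a weighted score. The criteria names and weights are hypothetical placeholders for illustration, not the criteria or values from our study.

```python
# Minimal sketch (not the paper's actual algorithm) of combining
# criterion-level judgments about a single comment into a weighted score.
# The criteria names and weights below are hypothetical placeholders.

HYPOTHETICAL_WEIGHTS = {
    "addresses_criterion": 0.5,    # comment speaks to a stated review criterion
    "suggests_revision": 0.3,      # comment proposes a concrete change
    "writer_marked_helpful": 0.2,  # the writer flagged the comment as useful
}

def weighted_comment_score(judgments: dict[str, bool]) -> float:
    """Return a 0-1 score for one comment given boolean criterion judgments."""
    return sum(
        weight
        for name, weight in HYPOTHETICAL_WEIGHTS.items()
        if judgments.get(name, False)
    )

# Example: a comment that addresses a criterion and suggests a revision,
# but was not explicitly marked helpful by the writer, scores 0.8.
print(weighted_comment_score({"addresses_criterion": True, "suggests_revision": True}))
```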

This remains our most widely cited piece of scholarship; it is routinely cited by organizations small and large (including Amazon) doing research and writing patents on feedback.

Continue reading for the full citation, abstract, introduction, and download links.

Screenshot from the intro page of our conference proceedings paper (full paper available behind ACM paywall).

Citation:

Hart-Davidson, William, McLeod, Michael, Klerkx, Christopher, and Wojcik, Michael. 2010. A method for measuring helpfulness in online peer review. In Proceedings of the 28th ACM International Conference on Design of Communication (SIGDOC '10). ACM, New York, NY, USA, 115-121. DOI=10.1145/1878450.1878470 http://doi.acm.org/10.1145/1878450.1878470.

Abstract:

This paper describes an original method for evaluating peer review in online systems by calculating the helpfulness of an individual reviewer’s response. We focus on the development of specific and machine scoreable indicators for quality in online peer review.

Introduction:

Following the logic of Benkler [1] in The Wealth of Networks, a culture of massive scale peer production such as we see with Web 2.0 technologies creates a quality problem related to the inherent self-evaluation bias of content producers. To ensure quality in web scale peer production, we need something that approaches web scale peer review. Logistically, this is a difficult problem, and so we have seen the rise of machine-mediated systems for coordinating peer response and review such as Amazon.com’s recommender system or Slashdot’s peer review driven posting system.

While these systems address the quality of the product in peer production – they make the content better – they do not do much to reveal, assess, or facilitate the development of review activity itself. In many systems and contexts, it is desirable to know who the best reviewers are and also, in educational settings, what kinds of instructional interventions may be useful for helping students become better reviewers.


Our review method takes input from a structured but flexible review workflow and stores the artifacts generated in the review process (e.g. comments, suggestions for revisions) along with other user-supplied descriptive and evaluative data that correspond with review metrics (such as whether or not a comment addressed a specific review criterion). The core method is designed as a web service that can receive this input from a variety of different sources and production environments, performing the analytics needed to calculate reviewer helpfulness and visualizations meant to offer both formative and summative feedback of reviewers’ performance. Our method understands a review to be a group activity consisting of reviewers, review targets (documents), and criteria, all under the direction of a review coordinator.
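As a rough illustration of the kind of input described above, here is a small Python sketch of how a review and its artifacts might be represented before being sent to such a service. The class and field names are assumptions made for this example, not the actual schema of our web service.

```python
# Illustrative sketch only: one way the review artifacts described above
# could be represented before being sent to an analytics service.
# The class and field names are assumptions, not the service's actual schema.
from dataclasses import dataclass, field


@dataclass
class Comment:
    reviewer_id: str
    target_id: str                    # the document being reviewed
    text: str
    criteria_addressed: list[str] = field(default_factory=list)
    endorsed_by_writer: bool = False  # evaluative data supplied by the writer


@dataclass
class Review:
    coordinator_id: str               # e.g. the instructor running the review
    reviewers: list[str]
    targets: list[str]
    criteria: list[str]
    comments: list[Comment] = field(default_factory=list)


# A review with one comment that addresses one of two stated criteria.
review = Review(
    coordinator_id="instructor-1",
    reviewers=["student-a", "student-b"],
    targets=["draft-1"],
    criteria=["clear thesis", "uses evidence"],
    comments=[
        Comment(
            reviewer_id="student-a",
            target_id="draft-1",
            text="Your second paragraph needs a source to back this claim.",
            criteria_addressed=["uses evidence"],
        )
    ],
)
```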

There are a number of specific applications for this method, but we will focus on one application called “Eli” that has been developed for online peer review in classroom settings. Eli gives feedback to both teachers and students about reviewers’ helpfulness in both a single review (a helpfulness score) and over time (a helpfulness index) so that students’ ability to do reviews can be meaningfully evaluated.
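The distinction between a per-review score and an over-time index can be shown with a toy calculation. The simple averaging below is an assumption made for illustration only, not Eli's published formula.

```python
# Hedged sketch of the score/index distinction described above: a per-review
# helpfulness score, and an index aggregating those scores over time.
# The simple averaging is an assumption for illustration, not Eli's formula.

def helpfulness_score(comment_scores: list[float]) -> float:
    """Helpfulness for a single review: average score of its comments."""
    return sum(comment_scores) / len(comment_scores) if comment_scores else 0.0

def helpfulness_index(review_scores: list[float]) -> float:
    """Helpfulness over time: mean of a reviewer's per-review scores."""
    return sum(review_scores) / len(review_scores) if review_scores else 0.0

# Example: a reviewer whose three reviews scored 0.8, 0.5, and 0.9
# would have an index of roughly 0.73.
print(helpfulness_index([0.8, 0.5, 0.9]))
```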

The full paper is available freely online or from the Association for Computing Machinery website.