Automatic language and information processing: rethinking evaluation

KAREN SPARCK JONES

doi:10.1017/S1351324901002583

Published online by Cambridge University Press: 26 April 2001

Affiliation: Computer Laboratory, University of Cambridge, New Museums Site, Pembroke Street, Cambridge CB2 3QG, UK; e-mail: sparckjones@cl.cam.ac.uk

Abstract

System evaluation has mattered since research on automatic language and information processing began. However, the (D)ARPA conferences have raised the stakes substantially in requiring and delivering systematic evaluations and in sustaining these through long-term programmes; and it has been claimed that this has both significantly raised task performance, as defined by appropriate effectiveness measures, and promoted relevant engineering development. These controlled laboratory evaluations have made very strong assumptions about the task context. The paper examines these assumptions for six task areas, considers their impact on evaluation and performance results, and argues that for current tasks of interest, e.g. summarising, it is now essential to play down the present narrowly defined performance measures in order to address the task context, and specifically the role of the human participant in the task, so that new measures, of larger value, can be developed and applied.

Type: Research Article
Information: Natural Language Engineering, Volume 7, Issue 1, March 2001, pp. 29-46

DOI: https://doi.org/10.1017/S1351324901002583

Footnotes

I am grateful to James Allan and Fred Jelinek for inviting me to give the talk on which this paper is based, and to two referees for their comments.
