Data Mining and Informatics Research Group, University of Ballarat, Australia. E-mail: email@example.com
Authorship attribution methods aim to determine the author of a document using information gathered from a set of documents with known authors. One way to perform this task is to create profiles containing the distinctive features known to be used by each author. This paper presents a new method of creating an author or document profile, which detects features that are distinctive when compared to normal language usage. This recentring approach creates more accurate profiles than previous methods, as demonstrated empirically on a known corpus of authorship problems. The method, named recentred local profiles, determines authorship accurately relative to other methods in the literature while using only a simple ‘best matching author’ approach to classification. It is also shown to be more stable than related methods as parameter values change. With a weighted voting scheme, recentred local profiles outperforms other methods in authorship attribution, reaching an overall accuracy of 69.9% on the ad-hoc authorship attribution competition corpus, a significant improvement over related methods.
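The general idea of profiling features relative to normal language usage and then picking the best matching author can be illustrated with a minimal sketch. It is not the paper's implementation: the choice of character trigrams, the profile size, the cosine similarity, and all function names (`ngram_freqs`, `recentred_profile`, `best_matching_author`) are assumptions made only for illustration, and the reference text stands in for whatever model of normal usage is actually used.

```python
from collections import Counter
import math


def ngram_freqs(text, n=3):
    """Relative frequency of each character n-gram in text (assumed feature type)."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    total = len(grams)
    return {g: c / total for g, c in Counter(grams).items()} if total else {}


def recentred_profile(text, reference, n=3, size=50):
    """Keep the `size` n-grams whose frequency deviates most from the reference.

    `reference` maps n-grams to their frequency in "normal" language;
    subtracting it is the recentring step in this sketch.
    """
    freqs = ngram_freqs(text, n)
    keys = set(freqs) | set(reference)
    diffs = {g: freqs.get(g, 0.0) - reference.get(g, 0.0) for g in keys}
    # Sort by absolute deviation, breaking ties alphabetically for determinism.
    top = sorted(diffs, key=lambda g: (-abs(diffs[g]), g))[:size]
    return {g: diffs[g] for g in top}


def cosine(p, q):
    """Cosine similarity between two sparse profiles (assumed similarity measure)."""
    dot = sum(v * q.get(g, 0.0) for g, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def best_matching_author(doc, author_texts, reference_text, n=3, size=50):
    """Return the author whose recentred profile is most similar to the document's."""
    reference = ngram_freqs(reference_text, n)
    doc_profile = recentred_profile(doc, reference, n, size)
    scores = {
        author: cosine(doc_profile, recentred_profile(text, reference, n, size))
        for author, text in author_texts.items()
    }
    return max(scores, key=scores.get)
```

In this sketch, an n-gram that an author uses about as often as the reference text contributes almost nothing to the profile, so only usage that departs from the norm, in either direction, influences the match.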
(Received December 20 2010)
(Revised March 18 2011)
(Accepted April 24 2011)
(Online publication June 09 2011)