Survey about citation context analysis: Tasks, techniques, and resources

MYRIAM HERNÁNDEZ-ALVAREZ; JOSÉ M. GOMEZ

doi:10.1017/S1351324915000388


Published online by Cambridge University Press: 05 November 2015


MYRIAM HERNÁNDEZ-ALVAREZ: Escuela Politécnica Nacional, Facultad de Ingeniería de Sistemas, Quito, Ecuador. E-mail: myriam.hernandez@epn.edu.ec
JOSÉ M. GOMEZ: Dpto. de Lenguajes y Sistemas Informáticos, Universidad de Alicante, Alicante, Spain. E-mail: jmgomez@ua.es


Abstract

Bibliometric calculations currently used to assess the quality of researchers, articles, and scientific journals have serious structural problems. Many authors have noted the weakness of citation counts: they are purely quantitative and do not differentiate between high- and low-citing papers. If a paper’s reputation is evaluated simply by the number of its citations, then incomplete, incorrect, or controversial articles may be promoted regardless of their relevance. This creates perverse incentives, because researchers may publish many incorrect or incomplete papers to achieve high impact indexes. It is essential to improve the objective criteria for automatic article-quality assessment. To obtain these new criteria, however, it is necessary to advance the automatic detection of the context, polarity, and function of bibliographic references.

We present an overview of general concepts and review contributions toward solving these problems, with the aim of identifying trends and suggesting possible directions for future research.

Type: Survey Paper
Information: Natural Language Engineering, Volume 22, Issue 3, May 2016, pp. 327–349

DOI: https://doi.org/10.1017/S1351324915000388
Copyright: Copyright © Cambridge University Press 2015


