Document-document similarity approaches and science mapping: experimental comparison of five approaches
2009 (English)
In: Journal of Informetrics, ISSN 1751-1577, Vol. 3, no 1, 49-63 p.
Article in journal (Refereed), Published
This paper treats document-document similarity approaches in the context of science mapping. Five approaches, involving nine methods, are compared experimentally. We compare text-based approaches, the citation-based bibliographic coupling approach, and approaches that combine text-based approaches and bibliographic coupling. Forty-three articles, published in the journal Information Retrieval, are used as test documents. We investigate how well the approaches agree with a ground truth subject classification of the test documents, when the complete linkage method is used, and under two types of similarities, first-order and second-order. The results show that it is possible to achieve a very good approximation of the classification by means of automatic grouping of articles. One text-only method and one combination method, under second-order similarities in both cases, give rise to cluster solutions that to a large extent agree with the classification.
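The pipeline named in the abstract (first-order similarities, second-order similarities, complete linkage grouping) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes cosine similarity as the first-order measure and takes second-order similarity to mean the cosine between two documents' rows of the first-order similarity matrix, which is one common reading of the term. The four toy documents and all function names are hypothetical; the input vectors could equally be term weights or cited-reference indicators for bibliographic coupling.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_matrix(vectors):
    """All pairwise cosine similarities between the given row vectors."""
    return [[cosine(u, v) for v in vectors] for u in vectors]


def complete_linkage(sim, k):
    """Agglomerate until k clusters remain.

    Under complete linkage, the similarity of two clusters is the
    minimum pairwise similarity between their members.
    """
    clusters = [[i] for i in range(len(sim))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                link = min(sim[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or link > best[0]:
                    best = (link, a, b)
        _, a, b = best
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return sorted(clusters)


# Hypothetical toy data: rows are documents, columns are feature weights.
docs = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 2, 1],
]

first_order = similarity_matrix(docs)
# Assumed definition: compare documents by their first-order similarity profiles.
second_order = similarity_matrix(first_order)

clusters_first = complete_linkage(first_order, 2)
clusters_second = complete_linkage(second_order, 2)
```

In this toy case both similarity types recover the same two groups. The resulting partition could then be compared with a reference classification, as the study does with its ground truth subject classification.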
Place, publisher, year, edition, pages
Elsevier BV, 2009. Vol. 3, no 1, 49-63 p.
Keywords
Bibliometrics, Citation data, Text mining, Cluster analysis, Data source combination, Science mapping
Computer and Information Science
Identifiers
URN: urn:nbn:se:umu:diva-37580
DOI: 10.1016/j.joi.2008.11.003
ISI: 000262496700005
OAI: oai:DiVA.org:umu-37580
DiVA: diva2:369081