Improving co-authorship network structures by combining multiple data sources: evidence from Italian academic statisticians

Vittorio Fuccella; Domenico De Stefano; Maria Prosperina Vitale; Susanna Zaccarin

doi:10.1007/s11192-016-1872-y

Publisher: Springer Journals
Copyright: 2016 Akadémiai Kiadó, Budapest, Hungary
ISSN: 0138-9130
eISSN: 1588-2861
DOI: 10.1007/s11192-016-1872-y

Abstract

The aim of the present contribution is to merge bibliographic data for members of a bounded scientific community in order to derive a complete unified archive, covering both top international and nationally oriented production, as a new basis for network analysis of a unified co-authorship network. A two-step procedure is used to identify duplicate records and to disambiguate author names. For the second step, we drew heavily on a well-established unsupervised disambiguation method from the literature, which follows a network-based approach and requires only a restricted set of record attributes. Evidence from Italian academic statisticians is provided by merging data from three bibliographic archives. Non-negligible differences in network results were observed between the disambiguated and non-disambiguated data sets, especially in individual-level network measures.
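
The abstract gives no implementation details, so the following is only a minimal sketch of what a two-step procedure of this general kind could look like, not the authors' method. The record fields (`title`, `year`, `authors`), the toy records, the duplicate rule (same normalized title and year) and the merging rule (same surname and first initial plus at least one shared co-author) are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase and collapse non-alphanumerics so near-identical strings match."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def deduplicate(records):
    """Step 1 (assumed rule): records with the same normalized title and year are duplicates."""
    unique = {}
    for rec in records:
        key = (normalize(rec["title"]), rec["year"])
        unique.setdefault(key, rec)  # keep the first copy seen
    return list(unique.values())


def name_key(name):
    """Blocking key: surname plus first initial, e.g. 'Rossi, Maria' -> 'rossi m'."""
    surname, _, given = name.partition(",")
    return f"{normalize(surname)} {normalize(given)[:1]}"


def disambiguate(records):
    """Step 2 (assumed rule): merge name variants in one block if they share a co-author."""
    coauthors = defaultdict(set)  # name variant -> blocking keys of its co-authors
    for rec in records:
        for a in rec["authors"]:
            coauthors[a] |= {name_key(b) for b in rec["authors"] if b != a}

    parent = {a: a for a in coauthors}  # union-find forest over name variants

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    blocks = defaultdict(list)
    for a in coauthors:
        blocks[name_key(a)].append(a)

    for variants in blocks.values():
        for i, a in enumerate(variants):
            for b in variants[i + 1:]:
                if coauthors[a] & coauthors[b]:  # shared co-author: same person
                    parent[find(a)] = find(b)

    return {a: find(a) for a in coauthors}


# Hypothetical toy records from three archives "A", "B" and "C".
records = [
    {"source": "A", "title": "Network analysis of co-authorship", "year": 2010,
     "authors": ["Rossi, Maria", "Bianchi, Luca"]},
    {"source": "B", "title": "Network Analysis of Co-Authorship.", "year": 2010,
     "authors": ["Rossi, M.", "Bianchi, L."]},
    {"source": "C", "title": "Record linkage in practice", "year": 2012,
     "authors": ["Rossi, M.", "Bianchi, Luca"]},
    {"source": "C", "title": "Spatial models", "year": 2011,
     "authors": ["Rossi, Marco", "Verdi, Anna"]},
]

unique = deduplicate(records)
clusters = disambiguate(unique)
```

In this toy data the first two records collapse into one, "Rossi, Maria" and "Rossi, M." end up in the same cluster because both co-author with a "bianchi l", and "Rossi, Marco" stays separate. A co-authorship network built on cluster labels rather than raw name strings would therefore have different individual-level measures, such as degree, which is the kind of effect the abstract reports.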

Journal

Scientometrics – Springer Journals

Published: Apr 1, 2016

Keywords: Information Storage and Retrieval; Library Science
