A systematic comparison of software dedicated to meta-analysis of causal studies

Leon Bax; Ly-Mee Yu; Noriaki Ikeda; Karel Moons

doi:10.1186/1471-2288-7-40

2007-09-10. BMC Medical Research Methodology 2007, 7:40. http://www.biomedcentral.com/1471-2288/7/40

Background: Our objective was to systematically assess the differences in features, results, and usability of currently available meta-analysis programs.

Methods: Systematic review of software. We did an extensive search on the internet (Google, Yahoo, AltaVista, and MSN) for specialized meta-analysis software. We included six programs in our review: Comprehensive Meta-analysis (CMA), MetAnalysis, MetaWin, MIX, RevMan, and WEasyMA. Two investigators compared the features of the software and their results. Thirty independent researchers evaluated the programs on their usability while analyzing one data set.

Results: The programs differed substantially in features, ease-of-use, and price. Although most results from the programs were identical, we did find some minor numerical inconsistencies. CMA and MIX scored highest on usability and these programs also have the most complete set of analytical features.

Conclusion: In consideration of differences in numerical results, we believe the user community would benefit from openly available and systematically updated information about the procedures and results of each program's validation. The most suitable program for a meta-analysis will depend on the user's needs and preferences and this report provides an overview that should be helpful in making a substantiated choice.

Background

Meta-analysis has been characterized in various ways, from "making order of scientific chaos" [1] to "mega-silliness" [2], and has been the subject of many debates. However, time has taught both opponents and proponents that things are not black and white; meta-analysis, executed with care, has become an important and influential cornerstone of scientific medicine. As the quantitative part of a systematic review, the merit of meta-analysis over qualitative approaches lies in the formal and reproducible investigation of heterogeneity, small study effects, and other data trends. Although meta-analysis is applied in many types of research, the bulk of published meta-analyses are in the domain of therapeutic and, albeit to a lesser extent, observational etiologic studies. This paper focuses on this area of causal medical research and in particular the software that is being used in the corresponding meta-analyses.

Computer software has become indispensable in meta-analysis and in the last decades many programs have been developed. To aid potential users in choosing the software that fits their needs, there are a number of reviews and comparisons available [3-7]. The most recent one, however, dates back 5 years and in the meantime the spectrum of available software has changed substantially. Also, most of the existing reviews have focused on numerical features, such as which analytical models were available or what graphs could be produced. We believe that information on the validity or comparability of results and ease-of-use are equally important factors in the total applicability of the software. Therefore, the purpose of our study was to systematically compare features, results, and usability of the currently available meta-analysis software.

Methods

Software search and selection

We decided, a priori, to focus on software that was solely dedicated to meta-analysis of randomized therapeutic or observational causal studies. General statistics packages were excluded. Furthermore, the software had to be actively maintained and supported, which was judged by either the time of the last software update (less than 5 years), bug report (less than 5 years), or website update (less than 3 years). We also decided to select only software with a graphical interface and mouse-click compatibility, which essentially excluded the DOS programs.

Searches for software and publications related to their development and usage were done by two authors (LB, LMY) with combinations of the following keywords in the Internet search engines Google, Yahoo, AltaVista, and MSN: "meta-analysis", "meta-analyses", "systematic review", "software", "program", "package", "macro", "add-in", and "add-on". The first search was done mid 2005 and the last search in June 2006. The software was purchased or downloaded if it appeared to fulfill the inclusion criteria.

Assessment of numerical and graphical features

The assessment of the numerical and graphical features in the included meta-analysis programs was handled independently by two investigators (LB, LMY) and reviewed by all authors until there was consensus on all items. The programs were installed and tested on Windows XP and Windows 2000 systems in English and Japanese. Details of the documented features are provided in the tables of the results section.

Validity and comparability of meta-analysis results

We searched the internet and literature databases of medical and social sciences (PubMed, EmBase, Eric, and PsychInfo) for articles that reported validations of meta-analysis software. We also checked the website of each included program and made inquiries with its authors about their validation procedures.

In addition to the search for validation reports, two reviewers (LB, LMY) actively investigated the comparability of the numerical results with data sets from three previously published [8-10] meta-analyses (Table 1). These data sets have been used as examples in methodological meta-analysis publications [11-13] and are representative of those commonly encountered in therapeutic or etiologic meta-analyses. The first data set [8] contains per-group data from 16 randomized controlled trial articles with a dichotomous outcome, i.e. group sizes and event rates. One of the 16 included studies has no events in one of the treatment arms and the data set itself is subject to substantial small study effects. The second data set [9] contains per-group data typically found in meta-analyses of controlled trials with a continuous outcome (group sizes, means, standard deviations). It contains data from 11 studies with heterogeneous results. The third data set [10] contains data as they could be found in meta-analyses of observational studies. The data are from 19 studies with a dichotomous outcome, as in the first data set. However, this time there are no per-group data available for each study but only the comparative association measures (odds ratios) and their standard errors.

Table 1: Overview of the data sets used in the criterion validation
No | Author(s) | Date | Studies | Input type
1 | Teo et al. [8] | 1991 | 16 | Group size, events
2 | Wahlbeck et al. [9] | 2000 | 11 | Group size, mean, standard deviation
3 | Pagliaro et al. [10] | 1992 | 19 | Association measure, standard error

For each data set, we compared the combined association measures, tests for heterogeneity, and tests for small study effects (publication bias) derived from each of the studied meta-analysis programs. We focused on the most common association measures such as the risk difference, risk ratio, odds ratio, mean difference, Hedges' g, and Cohen's d, including their 95% confidence intervals. We used the metan (version 1.81) [14], metabias (version 1.4.2) [15], and metatrim (version 1.5.1) [16] programs of the general statistics software STATA [17] as 'reference' in the software comparisons.
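As an illustration of the association measures for dichotomous outcomes named in this section, the following minimal sketch computes the risk difference, risk ratio, and odds ratio with Wald-type 95% confidence intervals for a single study. It is not taken from any of the reviewed programs; the function name `two_by_two_measures` and the example counts are hypothetical, and the sketch assumes a study without zero cells (otherwise a continuity correction would be needed).

```python
import math
from statistics import NormalDist


def two_by_two_measures(events_trt, n_trt, events_ctl, n_ctl, level=0.95):
    """Return (estimate, lower, upper) for RD, RR and OR of one study.

    Hypothetical helper for illustration; assumes no cell of the
    two-by-two table is zero.
    """
    a, b = events_trt, n_trt - events_trt  # treatment: events, non-events
    c, d = events_ctl, n_ctl - events_ctl  # control: events, non-events
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%

    p_trt, p_ctl = a / n_trt, c / n_ctl

    # Risk difference with its standard error on the natural scale.
    rd = p_trt - p_ctl
    se_rd = math.sqrt(p_trt * (1 - p_trt) / n_trt + p_ctl * (1 - p_ctl) / n_ctl)

    # Risk ratio and odds ratio are handled on the log scale.
    log_rr = math.log(p_trt / p_ctl)
    se_log_rr = math.sqrt(1 / a - 1 / n_trt + 1 / c - 1 / n_ctl)
    log_or = math.log((a * d) / (b * c))
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

    return {
        "RD": (rd, rd - z * se_rd, rd + z * se_rd),
        "RR": tuple(math.exp(log_rr + k * z * se_log_rr) for k in (0, -1, 1)),
        "OR": tuple(math.exp(log_or + k * z * se_log_or) for k in (0, -1, 1)),
    }


# Hypothetical counts, not taken from any of the three data sets.
result = two_by_two_measures(events_trt=10, n_trt=100, events_ctl=20, n_ctl=100)
for name, (estimate, lower, upper) in result.items():
    print(f"{name}: {estimate:.3f} ({lower:.3f} to {upper:.3f})")
```

The intervals here rely on the normal (z) distribution; a program that uses a different reference distribution would report different limits for the same point estimates.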
Assessment of usability

Finally, we performed a usability assessment amongst 30 researchers from various institutes and countries: Kitasato University (Japan), Tokai University (Japan), Utrecht University (The Netherlands), University of Amsterdam (The Netherlands), the Dutch Cochrane Center (The Netherlands), the University of Leuven (Belgium), and the Centre for Statistics in Medicine (UK). There were no specific inclusion criteria and the sample consisted of individuals from various departments and with various levels of experience with meta-analysis.

During the assessment sessions, participants were asked to install (evaluation versions of) each of the studied meta-analysis programs and to analyze one small data set of a meta-analysis with a dichotomous outcome (a shortened version of the previously described meta-analysis by Teo et al. [8]). As they completed this task, they scored the usability of each program in an electronic scoring list. This list [see Additional file 1] was developed via a consensus session with (meta-analysis) experts from the disciplines of epidemiology, biostatistics, and medical informatics, who were asked which elements they considered important in meta-analysis software and what items they would use to judge its usability. The order in which each program was installed and assessed was determined by a computer generated randomization list and was different for each participant.

Results

Software search and selection

We found 10 meta-analysis packages that were available for download or purchase via the internet (Table 2). Many were no longer updated or had remained in their DOS stage and were excluded from our study. We included six programs in our comparison: Comprehensive Meta-analysis (CMA) Version 2 [18], MetAnalysis [19], MetaWin 2.1 [20], MIX 1.5 [21], RevMan 4.2.8 [22], and WEasyMA 2.5 [23] (in alphabetical order). Using less stringent inclusion/exclusion criteria did not change this software selection. Using more stringent criteria would exclude WEasyMA as various signs indicate that it may no longer be developed and supported. Initially, our search did not pick up the still relatively unknown program called MetAnalysis. This software comes with a book and cannot be purchased separately. Neither the software nor the book is supported by a website, which is why we did not find it at first. At the time of inclusion, we could no longer assess it in the usability part, but have included it post-hoc in the assessment of comparability and features.

Table 2: Retrieved meta-analysis software
Software name | OS requirements | Meta-analysis interface | Availability | Selected for review
Comprehensive Meta-Analysis | WINDOWS | Graphical | Commercial | ✓
EasyMA | DOS | DOS menu | Free |
EpiMeta | DOS | DOS menu | Free |
Hepima | DOS | DOS menu | Free |
Meta-Analysis 5.3 | DOS | DOS menu | Free |
Meta-Analyst | DOS | DOS menu | Free |
MetAnalysis | WINDOWS | Graphical | Commercial | ✓
Meta-Stat | DOS | DOS menu | Free |
MetaWin | WINDOWS | Graphical | Commercial | ✓
MIX | WINDOWS | Graphical | Free | ✓
RevMan | WINDOWS | Graphical | Free | ✓
WEasyMA | WINDOWS | Graphical | Commercial | ✓

Assessment of numerical and graphical features

Below is a short summary of the numerical and graphical features in each of the reviewed programs; details are available in Tables 3 and 4.

Comprehensive Meta-Analysis (commercial software) has the highest profile in the Internet search engines of all included programs. It distinguishes itself from other programs by the option to enter effect sizes of different formats and the comprehensiveness of the numerical options and output. Data can be entered manually or via copy-and-paste in the CMA spreadsheet; direct import of text or other data files is not possible. The program features all major graphical presentations. The tutorial and manual are to-the-point and extensive. The program is actively maintained and the website is modern and regularly updated.

MetAnalysis 1.0 (commercial software) is not sold separately, but comes as a bonus feature of a book [19]. It is limited to studies with descriptive data on dichotomous outcomes. Data cannot be pasted or imported and must be entered manually, cell by cell. Once the data are entered and the calculations performed, numerical data can be produced in a print preview screen and graphs in separate windows. A nice feature is the radial part of the Galbraith plot, which is lacking in most other software. The software also has the facilities to enter loss to follow-up/drop-out information and use the studies in the meta-analyses with per-protocol or intention-to-treat analysis. The software does not contain help files and does not have a website, but users can consult the book instead.
Table 3: Meta-analysis software - basic feature comparison
Column order: CMA | MetAnalysis | MetaWin | MIX | RevMan | WEasyMA

General
URL: meta-analysis.com | - | metawinsoft.com | mix-for-meta-analysis.info | cc-ims.net/RevMan | weasyma.com
Corporate single user price: ~$1295.00 | ~$75.00 | ~$150.00 | Free | $650 | ~$490.00
Student single user price: ~$395.00 | ~$75.00 | ~$75.00 | Free | Free | ~$280.00
Download/program size: 30 Mb | 5 Mb | 9 Mb | 20 Mb/50 Mb | 9 Mb | 3 Mb
Compatibility: Windows | Windows | Windows | Windows | Windows | Windows
Last update: 2006 | 2005 | 2002 | 2006 | 2005 | 2002
License: Single user | Single user | Single user | Open | Open | Single user

Input options
Manual input: ✓ | ✓ | ✓ | ✓ | ✓ | ✓
Copy & paste: (✓) ✓ ✓ ✓ ✓ *
Text file import: ✓ ✓ *
File import (Excel, other software): ✓ ✓ *
Descriptive dichotomous, e.g. n(total), n(y = 1): ✓ | ✓ | ✓ | ✓ | ✓ | ✓
Descriptive continuous, e.g. n, m, sd: ✓ ✓ ✓ ✓ *
Comparative, e.g. theta, se/var: ✓ ✓ ✓ ✓ *
Multi-format (mixed in one data set): no marks *
Single data input/selection: ✓ ✓ ✓ ✓ ✓ *
Maximum number of studies: Unlimited | Unlimited | Unlimited | 100 | Unlimited | Unlimited

Information sources
Within-program HTML help: (✓) | (✓) | ✓ | ✓ | ✓ | ✓
Printable manual: ✓ ✓ ✓ ✓ *
Description of methods/calculations: (✓) | (✓) | ✓ | ✓ | ✓ | ✓
Additional information sources (PDFs/tutorials): ✓ ✓ ✓ *
Up-to-date website: ✓ ✓ ✓ ✓ ✗ *

Export options
Copy output to clipboard: ✓ | ✓ | ✓ | ✓ | ✓ | ✓
Export to office application(s): no marks *
Report creation: ✓ ✓ ✓ *
Setting copy file type (e.g. bmp, jpg or wmf): ✓ ✓ ✓ *

The '✓' indicates the presence and no mark indicates the absence of a feature. The '(✓)' means that the feature is limited or partially in development, and the '✗' means it was not working correctly at the time of our assessments.
* Fewer than six entries survive for this row and their column positions could not be recovered; the entries are listed in their original order.

MetaWin 2.1 (commercial software) is accompanied by a comprehensive manual in the form of a book and, in this respect, resembles the MetAnalysis package described above. Distinctive features are the effect size calculator, some graphs that are relatively uncommon in meta-analysis (the normal quantile plot and a weighted histogram), and the option to use bootstrap confidence intervals. The interface resembles a spreadsheet program and various data files can be imported. For some changes in the analysis, data range selections have to be repeated, which is somewhat more time-consuming compared to methods used by other programs. In contrast to most other software, all calculations are based on t-distributions, and bootstrap methods are also available. The help files and the book are extensive and detailed.

MIX 1.5 (free software) is the most recently developed program. Its most prominent features are the comprehensive graphical output, detailed numerical options, and educational features like built-in data sets corresponding to those in a number of books, and extensive tutor functions. MIX is the only program that will not function by itself: it requires Microsoft Excel 2000 or later to run. Another limitation is the maximum number of data sets, which is currently 100. Data sets can be created by manual input as well as by importing text delimited data files or Excel workbooks. The numerical and graphical options are diverse and comprehensive.

RevMan 4.2.8 (free for private and academic use) was developed by and for the Cochrane Collaboration. It stands out due to its extensive features for collaborative management of systematic reviews. The analytical functions of the program cannot be accessed without first creating a review structure and, because import and copy-and-paste functionality are also limited, getting started requires more preparation than with most other software. Once data are in the analysis module, analysis is straightforward. Output is detailed, though without tests for publication bias and no other graphs than the forest and funnel plot. The help resources in RevMan are extremely thorough. A new version is to be released in the near future.

WEasyMA 2.5 (commercial software) stands out by the speed with which results become available after data set creation. Data cannot be imported or pasted and need to be entered manually, cell by cell. Another limitation of this program is that it can only handle data from clinical trials with dichotomous outcomes, e.g. two-by-two table data. Although limited to these types of data, the program produces a wide variety of numerical and graphical output. The original author has indicated that the software is currently unsupported by a development team and may soon no longer be available.

Table 4: Meta-analysis software - analytical feature comparison
Column order: CMA | WEasyMA | MetaWin | MetAnalysis | RevMan | MIX

Computational setting options
Number of decimals: ✓ ✓ ✓ ✓ *
Alpha level/confidence intervals: ✓ ✓ ✓ ✓ ✓ *
Constant continuity correction: ✓ | ✓ | ✓ | ✓ | ✓ | ✓
Treatment arm continuity correction: no marks *
Variance for mean differences: ✓ ✓ *
Variances for standardized mean differences: ✓ ✓ *
Bootstrap confidence intervals: no marks *

Numerical output
Individual study data: AM,CI,P,W,other | AM,CI,W,other | AM,other | AM,CI,other | AM,CI,P,W,other | AM,CI,P,W,other
Association measures, risk: RD,RR,OR | RD,RR,OR | RD,RR,OR | RD,OR | RD,RR,OR | RD,RR,OR
Association measures, means & standardized measures: MD,HG,CD,other | - | HG,other | - | MD,HG | MD,HG,CD
Association measures, other: CC,Z; CC *
Fixed effect models/weighting: IV,MH,PETO | IV,MH,PETO,other | IV,MH,PETO | IV,MH,PETO | IV,MH,PETO | IV,MH,PETO
Random effects models/weighting: DL | DL | DL | DL | DL | DL
Cumulative analyses: Several variables | Several variables | Several variables | (✓) | Only graph | Several variables
Heterogeneity: Q,I²,τ² | Q | Q | Q,I² | Q,I² | Q,I²,τ²,other
Small study effect/publication bias: FSN,RC,EGG,TF | EGG | FSN,RC | FSN,EGG | - | FSN,RC,EGG,MAC,TF
Meta-regression: Single moderator; Single moderator *

Graphical output
Forest plot: ✓ | ✓ | ✓ | ✓ | ✓ | ✓
- Points proportional to weights: ✓ ✓ *
- Annotations in rows possible: ✓ ✓ ✓ *
- Cumulative possible: ✓ ✓ *
Funnel plot (1/se, se, var, N, P): 1/se,se | 1/se,se,N | var,N | N | 1/se | 1/se,se,N,P
Galbraith plot (radial): ✓ ✓ ✓ ✓ *
Exclusion sensitivity plot: no marks *
Trim and fill plot: ✓ ✓ *
L'Abbe plot: ✓ ✓ ✓ *
Other plots: HIST,NQ; BOX,HIST,NQ,other *
Graph formatting: ✓ | ✓ | ✓ | ✓ | ✓ | ✓

The '✓' indicates the presence and no mark indicates the absence of a feature. The '(✓)' means that the feature is limited or partially in development, and the '✗' means it was not working correctly at the time of our assessments.
* Fewer than six entries survive for this row and their column positions could not be recovered; the entries are listed in their original order.
Abbreviations: AM = association measure, CI = confidence interval, P = P value, W = weight, RD = risk difference, RR = risk ratio, OR = odds ratio, MD = mean difference, HG = Hedges' g, CD = Cohen's d, CC = correlation coefficient, Z = Fisher's Z, IV = inverse variance weighting, MH = Mantel-Haenszel weighting, PETO = Peto's weighting, DL = DerSimonian & Laird weighting, Q = Cochran's or Breslow & Day's Q, I² = Higgins's inconsistency statistic, τ² = between study variance indicator, FSN = fail-safe number test, RC = rank correlation test, EGG = Egger's regression test, MAC = Macaskill's regression test, TF = trim and fill method, se = standard error, var = variance, N = sample size, TFP = trim and fill plot, HIST = histogram, NQ = normal quantile plot, BOX = box-and-whiskers plot.
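To make the heterogeneity and weighting abbreviations of Table 4 concrete, the sketch below computes inverse variance (IV) pooling, Cochran's Q, Higgins's I², and the DerSimonian & Laird (DL) between study variance τ² from effect estimates and their standard errors, the input format of the third data set. It uses the textbook formulas and is only an assumption about how such quantities are commonly obtained; the reviewed programs may differ in detail. The function name `inverse_variance_meta` and the example values are hypothetical.

```python
import math


def inverse_variance_meta(effects, std_errors):
    """Fixed-effect (IV) and DerSimonian-Laird random-effects pooling.

    Hypothetical helper; effects are on an additive scale
    (e.g. log odds ratios or mean differences).
    """
    weights = [1 / se ** 2 for se in std_errors]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # Higgins's I^2 as a percentage, truncated at zero.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # DerSimonian-Laird moment estimate of tau^2, truncated at zero.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c)

    re_weights = [1 / (se ** 2 + tau2) for se in std_errors]
    random = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)

    return {
        "fixed": fixed,
        "se_fixed": math.sqrt(1 / sum_w),
        "random": random,
        "se_random": math.sqrt(1 / sum(re_weights)),
        "Q": q,
        "df": df,
        "I2": i2,
        "tau2": tau2,
    }


# Two hypothetical studies with clearly different effects.
summary = inverse_variance_meta(effects=[0.0, 2.0], std_errors=[0.5, 0.5])
print(summary)
```

With τ² greater than zero, the random-effects weights are flatter than the fixed-effect weights, which widens the confidence interval of the pooled estimate.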
Validity and comparability of meta-analysis results

Our internet and database search did not yield any publications on the validity or validation of any of the programs, except for MIX [24,25]. Authors of all programs were contacted to determine whether (yet unpublished) evidence of validation procedures was available. Authors of RevMan indicated that validation data were made public via notes and abstracts at Cochrane Collaboration meetings and conferences. The authors of CMA, MetAnalysis, and MetaWin stated that all procedures had been checked extensively with external programs, spreadsheets, and occasionally by hand, though the results had not been made public. For CMA, Excel sheets with such data are available upon request. We received no information on validation procedures from the authors of WEasyMA.

We found no discrepancies in meta-analysis results between STATA, MIX and RevMan. In CMA, we found a small inconsistency in results of publication bias tests, but this was corrected via an update while we were writing this article.

MetaWin's results were different from STATA's results (and thus also from results in CMA, MIX, and RevMan) because MetaWin mostly uses a t-distribution where the aforementioned programs use a z-distribution (although a recent version of MIX also allowed us to use a t-distribution). We did find what seemed to be a terminological inconsistency, as the method labeled Mantel-Haenszel in MetaWin for odds ratio analyses gave results that were identical to those from Peto's method in the other programs (albeit with confidence limits based on a t-distribution).

Since MetAnalysis and WEasyMA can only analyze data from two-by-two tables, the comparability assessments were limited to one data set [8]. Analyses in MetAnalysis were very similar though not always identical to those from STATA. We found that if we entered experimental group data first (as is the case in all other software), an incorrect event coding is applied that causes the software to calculate risk differences and odds ratios of survival even if mortality is entered as event. For risk differences this only changes the sign, but for odds and odds ratios it gives the reciprocal of the intended results [26]. Although the book mentions that control data are to be entered in the first data column, the software currently has no built-in guard against this and we therefore urge users to be careful.

In WEasyMA, we found results that could not be reproduced if a data set with zero events in one study arm was used. Even when using the same continuity correction as reported in the 'Calculation options' dialog in WEasyMA, the results remained different in STATA. The WEasyMA authors did not respond to our inquiry into reasons for the discrepancies.
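The consequence of the MetAnalysis event-coding issue described in this section can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function `risk_measures` and the counts are hypothetical and do not reproduce the software's internals. It compares the measures for an outcome coded as mortality with the same data coded as survival.

```python
def odds(p):
    """Odds corresponding to a probability p (0 < p < 1)."""
    return p / (1 - p)


def risk_measures(events_exp, n_exp, events_ctl, n_ctl):
    """Risk difference, risk ratio and odds ratio, experimental vs control."""
    p_exp = events_exp / n_exp
    p_ctl = events_ctl / n_ctl
    return {
        "RD": p_exp - p_ctl,
        "RR": p_exp / p_ctl,
        "OR": odds(p_exp) / odds(p_ctl),
    }


# Hypothetical trial: deaths out of patients per group.
deaths_exp, n_exp = 10, 100
deaths_ctl, n_ctl = 20, 100

# Intended analysis: mortality is the event.
mortality = risk_measures(deaths_exp, n_exp, deaths_ctl, n_ctl)

# Same data with the event coding flipped to survival.
survival = risk_measures(n_exp - deaths_exp, n_exp, n_ctl - deaths_ctl, n_ctl)

print("mortality:", mortality)
print("survival: ", survival)
print("RD sign flipped:  ", abs(mortality["RD"] + survival["RD"]) < 1e-12)
print("OR is reciprocal: ", abs(mortality["OR"] * survival["OR"] - 1) < 1e-12)
print("RR is reciprocal: ", abs(mortality["RR"] * survival["RR"] - 1) < 1e-12)
```

The risk difference only changes sign and the odds ratio becomes its reciprocal, whereas the risk ratio of survival is not the reciprocal of the risk ratio of mortality, so a flipped coding cannot be undone for risk ratios by simple inversion.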
Assessment of usability

Of the 30 participating researchers, 26 provided quantitative data that were suitable for analysis (Table 5). Trouble with the electronic user form or installation of software made the data from 4 researchers incomplete and they were excluded from the quantitative part. MIX scored highest on the overall usability (8.6), followed by CMA (6.9), MetaWin (6.2), RevMan (6.1), and WEasyMA (4.2).

RevMan was most familiar to the participating researchers. MIX had not been used by any of the participants but the name was familiar to some as they were affiliated to the same institutions as the makers of the MIX software. Stratifying the results in analogous subgroups did not reveal any specific trends in the ratings. Experienced users appeared to be more critical than less experienced users, but relative scores were identical. Installation of WEasyMA and CMA was troublesome for some researchers. Qualitative statements mostly concerned problems with the installation (WEasyMA, CMA), error messages in French (WEasyMA), and difficulties with data set creation (WEasyMA, RevMan). Favorable comments included praise for the user interfaces (MIX, RevMan, CMA), help system (RevMan), speed of analysis (WEasyMA), and within-program tutoring (MIX, CMA).

Table 5: Meta-analysis software - usability ratings (best scoring software from left to right)
Items and subgroups | MIX | CMA | MetaWin | RevMan | WEasyMA

All researchers (26)
Overall rating (min-max) | 8.6 (6.7 to 10) | 6.9 (3.7 to 9.7) | 6.2 (4.3 to 8.7) | 6.1 (4.3 to 8.3) | 4.2 (1 to 7.3)
Getting started | 8.6 | 7.4 | 6.8 | 7.6 | 4.5
Data preparation | 8.3 | 6.3 | 6.3 | 4.5 | 2.6
Usability in analysis | 8.8 | 7.1 | 5.6 | 6.3 | 5.9

Experienced (7)
Overall rating (min-max) | 8.1 (7.0 to 9.7) | 6.8 (6.0 to 7.3) | 5.9 (4.3 to 7.7) | 5.4 (4.3 to 6.3) | 3.3 (1 to 5.7)
Getting started | 8.0 | 7.6 | 6.2 | 7.5 | 2.8
Data preparation | 8.3 | 6.3 | 6.3 | 3.0 | 2.0
Usability in analysis | 8.0 | 6.6 | 5.4 | 6.3 | 5.3

Inexperienced (19)
Overall rating (min-max) | 8.7 (6.7 to 10) | 7 (3.7 to 9.7) | 6.3 (4.3 to 8.7) | 6.3 (4.7 to 8.3) | 4.6 (1.3 to 7.3)
Getting started | 8.8 | 7.3 | 6.9 | 7.7 | 5.0
Data preparation | 8.3 | 6.3 | 6.3 | 5.0 | 2.8
Usability in analysis | 9.1 | 7.3 | 5.6 | 6.3 | 6.1

All scores are summary scores, based on the scores of items in the 'Installation', 'Data preparation', and 'Usability in analysis' categories. Each item was scored from bad to excellent on a scale from 0 to 10.

Discussion

Meta-analysis is an indispensable tool in current-day synthesis of research data from multiple studies, and systematic reviews with meta-analyses occupy the top position in the hierarchy of evidence. Software for meta-analysis has evolved over the years and available reviews are relatively outdated. We therefore considered it timely to provide a systematic overview of the features, criterion validity, and usability of the currently available software that is dedicated to meta-analysis of causal (therapeutic and etiologic) studies. It has some overlaps with existing reviews [3-7], but includes other more recent programs, contains more detailed information on the merits and demerits of the available programs, and follows a more systematic approach.

We studied four commercial programs (CMA, WEasyMA, MetaWin, and MetAnalysis) and two free programs (RevMan and MIX). The features of the commercial programs were not necessarily more extensive than those of the free ones. In particular MIX stood out in terms of numerical options and graphical output. CMA was generally most versatile, in particular in options for analysis of various types of data. With regard to the comparability of results, MIX, RevMan, and CMA produced numerical results that were identical to results from STATA's metan, metabias, and metatrim. MetaWin's results are different and slightly more conservative, since the confidence intervals are based on a t-distribution or bootstraps. WEasyMA produces results that can be disparate from the other programs, especially in data sets with studies with zero events in one or both of the comparison groups. Although most differences were small in the data sets we used, we have reservations on how this will reflect on data sets with more extreme data. The MetAnalysis program should also be used with care as data have to be entered manually and in the correct columns. Exchanging the columns is currently not prevented by warning or error messages and can lead to invalid results.

The usability study shows that preparing data for analysis is the hardest part in each program. MIX and CMA are identified as the most user-friendly programs. WEasyMA scored least favorably. Stratifying user evaluations based on experience with meta-analysis and previous experience or knowledge of the software did not reveal any trends in the ratings.

Our comparison has been limited to software dedicated to meta-analysis only and does not include general statistics packages. The primary reason to leave them out was that they are structurally very different, making direct comparisons inappropriate. Central to this issue is software syntax: most general packages require thorough knowledge of their syntax in order to produce and alter graphs that are common in meta-analysis; the dedicated packages, however, produce such graphs with a few or sometimes even a single click. In addition, the syntax knowledge required to do more advanced meta-analyses with the general packages means that in a usability survey all participants would have to be expert statisticians, capable of writing and adapting syntax for meta-analysis in all major general software packages. This is not only infeasible in the current setting, it would also make the participating individuals no longer representative of the (sometimes relatively inexperienced) users of the software in the scientific and academic community. Although a different approach would be necessary, we believe the user community of meta-analysis software would benefit from an additional review of meta-analysis options in general statistics software.

Due to the lack of a 'gold' standard, we resorted to between-program comparisons and a criterion validation with STATA's user-written commands metan, metabias and metatrim as reference. Our choice for STATA was based on its versatility and use in two major books on meta-analysis [11,12]. We realize that these STATA commands are themselves user-written and potentially subject to similar validity issues as the other programs. The fact that CMA, MIX, and RevMan produced results that were identical to results from STATA, at least with the three data sets we selected, justifies to some extent our use of STATA as a reference standard.

The results of our usability survey should be regarded as exploratory and serve as a rough indication. First, the number of participants was relatively small. Second, it is not unlikely that there may be some bias in favor of RevMan and MIX because some users were already familiar with these programs. Subgroup analyses, however, did not reveal such trends. MetAnalysis could unfortunately not be included in the survey as it was added to the review after the start of the usability assessment. A further point regarding MIX is that it was created following a development focus list [25] that was created in a similar fashion to our usability scoring list. Assessment of both lists reveals that a number of items are very similar. Although this may indicate that the lists are indeed reflecting the demands of statistical software users, it also means that the MIX program was likely to do well in our assessment. We believe, however, that any program that is systematically developed to satisfy its users' demands should perhaps deservedly score high.

Another point to which we would like to draw attention is the lack of accessible public information about the manner in which meta-analysis programs have been validated. Only the website of the MIX program includes specific references to this and MIX is the only program with a peer-reviewed and published validation report [25]. Without such reports, authors, reviewers, editors, and consumers of evidence have no reference for judgments about the suitability of the software for scientific purposes. This is of course equally applicable to the user-written meta-analysis macros for general statistics software. We argue for more rigor and transparency in this area.

Finally, we are fully aware that the world of information technology changes constantly and by the time this manuscript is published, it is possible that some updates have become available or that new products have been launched. We apologize beforehand for our lack of timing. Like a traditional review, we intend to update this investigation in due time.

Conclusion

In conclusion, the most suitable meta-analysis software for a user depends on his or her demands; no single program may be best for everybody. The information provided in this article, in particular the data in Tables 3 and 4, should give users the opportunity to make a substantiated decision.

Competing interests

None of the authors have financial conflicts of interest, although the first author is also primary developer of one of the free programs (MIX) studied in this review. The other authors have been co-authors in an introductory article about MIX. To reduce personal biases, all tasks were handled by multiple investigators and the subjective usability assessments were assigned (by study design) to individuals other than the authors.

Authors' contributions

LB and KGM developed the study concept and designed the study. LB and LMY handled the primary data acquisition and drafted the manuscript. All authors double checked the data tables and analyses, and approved the final version of the manuscript.

Additional material

Additional file 1
Software usability scoring list. The scoring list that was used to evaluate the usability of the meta-analysis software.
[http://www.biomedcentral.com/content/supplementary/1471-2288-7-40-S1.doc]

Acknowledgements

This study was not supported by any particular grant. The authors would like to express their gratitude to all researchers who participated in the usability assessment sessions.

References

1. Hunt M: Making order of scientific chaos. In How Science Takes Stock. New York, Russel Sage Foundation; 1997:1-19.
2. Eysenck HJ: Meta-analysis and its problems. BMJ 1994, 309:789-792.
3. Normand SLT: Meta-analysis software - a comparative review - DSTAT, version 1.10. Am Statistician 1995, 49:298-309.
4. Egger M, Sterne JAC, Smith GD: Meta-analysis software. BMJ 1998, 316(7126): Website only: http://bmj.bmjjournals.com/archive/7126/7126ed9.htm.
5. Sutton AJ, Lambert PC, Hellmich M, Abrams KR, Jones DR: Meta-analysis in practice: A critical review of available software. In Meta-Analysis in Medicine and Health Policy. Edited by: Berry DA, Stangl DK. New York, Marcel Dekker; 2000.
6. Sutton AJ, Lambert PC, Hellmich M, Abrams KR, Jones DR: Meta-analysis software. In Systematic Reviews in Health Care: Meta-Analysis in Context. 2nd edition. Edited by: Egger M, Davey Smith G, Altman DG. London, BMJ Books; 2001.
7. Arthur W, Bennett W, Huffcutt A: Choice of software and programs in meta-analysis research: Does it make a difference? Educ Psychol Measurement 1994, 54:776-787.
8. Teo KK, Yusuf S, Collins R, Held PH, Peto R: Effects of intravenous magnesium in suspected acute myocardial infarction: overview of randomised trials. BMJ 1991, 303(6816):1499-1503.
9. Wahlbeck K, Cheine M, Essali MA: Clozapine versus typical neuroleptic medication for schizophrenia. Cochrane Database Syst Rev 2000:CD000059.
10. Pagliaro L, D'Amico G, Sorensen TI, Lebrec D, Burroughs AK, Morabito A, Tine F, Politi F, Traina M: Prevention of first bleeding in cirrhosis. A meta-analysis of randomized trials of nonsurgical treatment. Ann Intern Med 1992, 117(1):59-70.
11. Egger M, Davey Smith G, Altman DG: Systematic reviews in health care: meta-analysis in context. London, BMJ Publishing Group; 2001.
12. Sutton AJ, Abrams KR, Jones DR, Sheldon TA, Song F: Methods for meta-analysis in medical research. Chichester, Wiley; 2000.
13. Glasziou P, Irwig L, Bain C, Colditz G: Systematic Reviews in Health Care: A Practical Guide. Cambridge, Cambridge University Press; 2001.
14. Bradburn MJ, Deeks JJ, Altman DG: Metan - an alternative meta-analysis command (Metan 1.81). Stata Technical Bulletin 2003, STB 44(sbe24):4-15.
15. Steichen TJ: Tests for publication bias in meta-analysis (Metabias 1.2.4). Stata Journal 2003, SJ3-4(sbe19_5):11.
16. Steichen TJ: Nonparametric trim and fill analysis of publication bias in meta-analysis (Metatrim 1.0.5). Stata Technical Bulletin 2003, STB61(sbe39.2):11.
17.
StataCorp: Stata statistical software, Release 9. College Station, TX , StataCorp LP; 2005. 18. Borenstein M, Hedges L, Higgins J, Rothstein H: Comprehensive Meta-Analysis Version 2. Engelwood, NJ , Biostat; 2005. 19. Leandro G: Meta-analysis in Medical research. Blackwell Pub- lishing, BMJ Books; 2005. 20. Rosenberg MS, Adams DC, Gurevitch J: MetaWin: Statistical Software for Meta-Analysis Version 2. Sunderland, Massachu- setts , Sinauer Associates; 2000. 21. Bax L, Yu LM, Ikeda N, Tsuruta N, Moons KGM: MIX: Comprehen- sive Free Software for Meta-analysis of Causal Research Data - Version 1.5. 2006. 22. The Nordic Cochrane Centre: Review Manager (RevMan). Ver- sion 4.2 for Windows. Copenhagen , The Cochrane Collaboration; 23. Chevarier P, Cucherat M, Freiburger T, Maupas J, Visele N, Bugnard F, Bazog P: WeasyMA. Lyon , ClinInfo; 2000. 24. Bax L, Yu LM, Ikeda N, Tsuruta H, Moons KGM: Conference pro- ceeding: Validation of a freely available and comprehensive meta-analysis add-in for excel. In Eur J Epidemiol Volume 21(sup- plement). European Journal of Epidemiology; 2006:58. 25. Bax L, Yu LM, Ikeda N, Tsuruta H, Moons KG: Development and validation of MIX: comprehensive free software for meta- analysis of causal research data. BMC Med Res Methodol 2006, 6(1):50. 26. Deeks JJ: Issues in the selection of a summary statistic for meta-analysis of clinical trials with binary outcomes. Stat Med 2002, 21(11):1575-1600. Pre-publication history The pre-publication history for this paper can be accessed Publish with Bio Med Central and every here: scientist can read your work free of charge "BioMed Central will be the most significant development for http://www.biomedcentral.com/1471-2288/7/40/prepub disseminating the results of biomedical researc h in our lifetime." 



Publisher: Springer Journals
Copyright: Copyright © 2007 by Bax et al; licensee BioMed Central Ltd.
Subject: Medicine & Public Health; Theory of Medicine/Bioethics; Statistical Theory and Methods; Statistics for Life Sciences, Medicine, Health Sciences
eISSN: 1471-2288
DOI: 10.1186/1471-2288-7-40
pmid: 17845719

Abstract

Background: Our objective was to systematically assess the differences in features, results, and usability of currently available meta-analysis programs. Methods: Systematic review of software. We did an extensive search on the internet (Google, Yahoo, Altavista, and MSN) for specialized meta-analysis software. We included six programs in our review: Comprehensive Meta-analysis (CMA), MetAnalysis, MetaWin, MIX, RevMan, and WEasyMA. Two investigators compared the features of the software and their results. Thirty independent researchers evaluated the programs on their usability while analyzing one data set. Results: The programs differed substantially in features, ease-of-use, and price. Although most results from the programs were identical, we did find some minor numerical inconsistencies. CMA and MIX scored highest on usability and these programs also have the most complete set of analytical features. Conclusion: In consideration of differences in numerical results, we believe the user community would benefit from openly available and systematically updated information about the procedures and results of each program's validation. The most suitable program for a meta-analysis will depend on the user's needs and preferences and this report provides an overview that should be helpful in making a substantiated choice. Background investigation of heterogeneity, small study effects, and Meta-analysis has been characterized in various ways, other data trends. Although meta-analysis is applied in from "making order of scientific chaos"[1] to "mega-silli- many types of research, the bulk of published meta-anal- ness"[2], and has been subject of many debates. However, yses are in the domain of therapeutic and – albeit to a time has taught – both opponents and proponents – that lesser extent – observational etiologic studies. 
This paper things are not black and white; meta-analysis, executed focuses on this area of causal medical research and in par- with care, has become an important and influential cor- ticular the software that is being used in the correspond- nerstone of scientific medicine. As the quantitative part of ing meta-analyses. a systematic review, the merit of meta-analysis over qual- itative approaches lies in the formal and reproducible Page 1 of 9 (page number not for citation purposes) BMC Medical Research Methodology 2007, 7:40 http://www.biomedcentral.com/1471-2288/7/40 Computer software has become indispensable in meta- analysis software. We also checked the website of each analysis and in the last decennia many programs have included program and made inquiries with its authors been developed. To aid potential users in choosing the about their validation procedures. software that fits their needs, there are a number of reviews and comparisons available [3-7]. The most recent In addition to the search for validation reports, two one, however, dates back to 5 years ago and in the mean- reviewers (LB, LMY) actively investigated the comparabil- time the spectrum of available software has changed sub- ity of the numerical results with data sets from three pre- stantially. Also, most of the existing reviews have focused viously published [8-10] meta-analyses (Table 1). These on numerical features, such as which analytical models data sets have been used as examples in methodological were available or what graphs could be produced. We meta-analysis publications [11-13] and are representative believe that information on the validity or comparability of those commonly encountered in therapeutic or etio- of results and ease-of-use are equally important factors in logic meta-analyses. The first data set [8] contains per- the total applicability of the software. 
Therefore, the pur- group data from 16 randomized controlled trial articles pose of our study was to systematically compare features, with a dichotomous outcome, i.e. group sizes and event results, and usability of the currently available meta-anal- rates. One of the 16 included studies has no events in one ysis software. of the treatment arms and the data set itself is subject to substantial small study effects. The second data set [9] Methods contains per-group data typically found in meta-analyses Software search and selection of controlled trials with a continuous outcome (group We decided, a priori, to focus on software that was solely sizes, means, standard deviations). It contains data from dedicated to meta-analysis of randomized therapeutic or 11 studies with heterogeneous results. The third data set observational causal studies. General statistics packages [10] contains data as they could be found in meta-analy- were excluded. Furthermore, the software had to be ses of observational studies. The data are from 19 studies actively maintained and supported, which was judged by with a dichotomous outcome, like in the first data set. either the time of the last software update (less than 5 However, this time there are no per-group data available years), bug report (less than 5 years), or website update for each study but only the comparative association meas- (less than 3 years). We also decided to select only software ures (odds ratios) and their standard errors. with a graphical interface and mouse-click compatibility, which essentially excluded the DOS programs. For each data set, we compared the combined association measures, tests for heterogeneity, and tests for small study Searches for software and publications related to their effects (publication bias) derived from each of the studied development and usage were done by two authors (LB, meta-analysis programs. 
We focused on the most com- LMY) with combinations of the following keywords in mon association measures such as the risk difference, risk Internet search engines of Google, Yahoo, AltaVista, and ratio, odds ratio, mean difference, Hedges' g, and Cohen's MSN: "meta-analysis", "meta-analyses", "systematic d, including their 95% confidence intervals. We used the review", "software", "program", "package", "macro", metan (version 1.81) [14], metabias (version 1.4.2) [15], "add-in", and "add-on". The first search was done mid and metatrim (version 1.5.1) [16] programs of the general 2005 and the last search in June 2006. The software was statistics software STATA [17] as 'reference' in the software purchased or downloaded if it appeared to fulfill the comparisons. inclusion criteria. Assessment of usability Assessment of numerical and graphical features Finally, we performed a usability assessment amongst 30 The assessment of the numerical and graphical features in researchers from various institutes and countries: Kitasato the included meta-analysis programs was handled inde- University (Japan), Tokai University (Japan), Utrecht Uni- pendently by two investigators (LB, LMY) and reviewed by versity (The Netherlands), University of Amsterdam (The all authors until there was consensus on all items. The Table 1: Overview of the data sets used in the criterion programs were installed and tested on Windows XP and validation Windows 2000 systems in English and Japanese. Details of the documented features are provided in the tables of No Author(s) Date Studies Input type the results section. 
1 Teo et al.[8] 1991 16 Group size, events 2 Wahlbeck et al.[9] 2000 11 Group size, mean, Validity and comparability of meta-analysis results standard deviation We searched the internet and literature databases of med- 3 Pagliaro et al.[10] 1992 19 Association measure, ical and social sciences (PubMed, EmBase, Eric, and standard error PsychInfo) for articles that reported validations of meta- Page 2 of 9 (page number not for citation purposes) BMC Medical Research Methodology 2007, 7:40 http://www.biomedcentral.com/1471-2288/7/40 Netherlands), the Dutch Cochrane Center (The Nether- nalysis. This software comes with a book and cannot be lands), the University of Leuven (Belgium), and the Cen- purchased separately. Neither the software nor the book is tre for Statistics in Medicine (UK). There were no specific supported by a website, which is why we did not find it at inclusion criteria and the sample consisted of individuals first. At the time of inclusion, we could no longer assess it from various departments and with various levels of expe- in the usability part, but have included it post-hoc in the rience with meta-analysis. assessment of comparability and features. During the assessment sessions, participants were asked to Assessment of numerical and graphical features install (evaluation versions of) each of the studied meta- Below is a short summary of the numerical and graphical analysis programs and to analyze one small data set of a features in each of the reviewed programs; details are meta-analysis with a dichotomous outcome (a shortened available in Tables 3 and 4. version of the previously described meta-analysis by Teo et al. [8]). As they completed this task, they scored the usa- Comprehensive Meta-Analysis (commercial software) has bility of each program in an electronic scoring list. This list the highest profile in the Internet search engines of all [see Additional file 1] was developed via a consensus ses- included programs. 
It distinguishes itself from other pro- sion with (meta-analysis) experts from the disciplines of grams by the option to enter effect sizes of different for- epidemiology, biostatistics, and medical informatics, who mats and the comprehensiveness of the numerical were asked which elements they considered important in options and output. Data can be entered manually or via meta-analysis software and what items they would use to copy-and-paste in the CMA spreadsheet; direct import of judge its usability. The order in which each program was text or other data files is not possible. The program fea- installed and assessed was determined by a computer gen- tures all major graphical presentations. The tutorial and erated randomization list and different for each partici- manual are to-the-point and extensive. The program is pant. actively maintained and the website is modern and regu- larly updated. Results Software search and selection MetAnalysis 1.0 (commercial software) is not sold sepa- We found 10 meta-analysis packages that were available rately, but comes as a bonus feature of a book [19]. It is for download or purchase via the internet (Table 2). Many limited to studies with descriptive data on dichotomous were no longer updated or had remained in their DOS outcomes. Data cannot be pasted or imported and must stage and were excluded from our study. We included six be entered manually, cell by cell. Once the data are programs in our comparison: Comprehensive Meta-anal- entered and the calculations performed, numerical data ysis (CMA) Version 2 [18], MetAnalysis[19], MetaWin 2.1 can be produced in a print preview screen and graphs in [20], MIX 1.5 [21], RevMan 4.2.8 [22], and WEasyMA 2.5 separate windows. A nice feature is the radial part of the [23] (in alphabetical order). Using less stringent inclu- Galbraith plot, which is lacking in most other software. 
sion/exclusion criteria did not change this software selec- The software also has the facilities to enter loss to follow- tion. Using more stringent criteria would exclude up/drop-out information and use the studies in the meta- WEasyMA as various signs indicate that it may no longer analyses with per-protocol or intention-to-treat analysis. be developed and supported. Initially, our search did not The software does not contain help files and does not pick up the still relatively unknown program called MetA- have a website, but users can consult the book instead. Table 2: Retrieved meta-analysis software Software name OS requirements Meta-analysis interface Availability Selected for review Comprehensive Meta-Analysis WINDOWS Graphical Commercial EasyMA DOS DOS menu Free EpiMeta DOS DOS menu Free Hepima DOS DOS menu Free Meta-Analysis 5.3 DOS DOS menu Free Meta-Analyst DOS DOS menu Free MetAnalysis WINDOWS Graphical Commercial Meta-Stat DOS DOS menu Free MetaWin WINDOWS Graphical Commercial MIX WINDOWS Graphical Free RevMan WINDOWS Graphical Free WEasyMA WINDOWS Graphical Commercial Page 3 of 9 (page number not for citation purposes) BMC Medical Research Methodology 2007, 7:40 http://www.biomedcentral.com/1471-2288/7/40 Table 3: Meta-analysis software – basic feature comparison CMA MetAnalysis MetaWin MIX RevMan WEasyMA General URL meta-analysis.com - metawinsoft.com mix-for-meta-analysis.info cc-ims.net/RevMan weasyma.com Corporate single user price ~$1295.00 ~$75.00 ~$150.00 Free $650 ~$490.00 Student single user price ~$395.00 ~$75.00 ~$75.00 Free Free ~$280.00 Download/program size 30 Mb 5 Mb 9 Mb 20 Mb/50 Mb 9 Mb 3 Mb Compatibility Windows Windows Windows Windows Windows Windows Last update 2006 2005 2002 2006 2005 2002 License Single user Single user Single user Open Open Single user Input options Manual input 9 9 9 9 9 9 Copy & paste () 9 9 9 9 Text file import 9 9 File import (Excel, other software) 9 9 Descriptive dichotomous, e.g. 
n(total), n(y = 1) 9 9 9 9 9 9 Descriptive continuous, e.g. n, m, sd 9 9 9 9 Comparative, e.g. theta, se/var 9 9 9 9 Multi-format (mixed in one data set) Single data input/selection 9 9 9 9 9 Maximum number of studies Unlimited Unlimited Unlimited 100 Unlimited Unlimited Information sources Within-program HTML help () () 9 9 9 9 Printable manual 9 9 9 9 Description of methods/calculations ( ) ( ) 9 9 9 9 Additional information sources (PDFs/tutorials) 9 9 9 Up-to-date website 9 9 9 9 8 Export options Copy output to clipboard 9 9 9 9 9 9 Export to office application(s) Report creation 9 9 9 Setting copy file type (e.g. bmp, jpg or wmf) 9 9 9 The ' ' indicates the presence and no mark indicates the absence of a feature. The '( )' means that the feature is limited or partially in development, and the ' ' means it 9 9 was not working correctly at the time of our assessments. MetaWin 2.1 (commercial software) is accompanied by a used by other programs. In contrary to most other soft- comprehensive manual in the form of a book and, in this ware, all calculations are based on t-distributions and respect, resembles the MetAnalysis package described boot-strap methods are also available. The help files and above. Distinctive features are the effect size calculator, the book are extensive and detailed. some graphs that are relatively uncommon in meta-anal- ysis (the normal quantile plot and a weighted histogram), MIX 1.5 (free software) is the most recently developed and the option to use bootstrap confidence intervals. The program. Its most prominent features are the comprehen- interface resembles a spreadsheet program and various sive graphical output, detailed numerical options, and data files can be imported. For some changes in the anal- educational features like built-in data sets corresponding ysis, data range selections have to be repeated, which is to those in a number of books, and extensive tutor func- somewhat more time-consuming compared to methods tions. 
MIX is the only program that will not function by Page 4 of 9 (page number not for citation purposes) BMC Medical Research Methodology 2007, 7:40 http://www.biomedcentral.com/1471-2288/7/40 Table 4: Meta-analysis software – analytical feature comparison CMA WEasyMA MetaWin MetAnalysis RevMan MIX Computational setting options Number of decimals 9 9 9 9 Alpha level/confidence intervals 9 9 9 9 9 Constant continuity correction 9 9 9 9 9 9 Treatment arm continuity correction Variance for mean differences 9 9 Variances for standardized mean differences 9 9 Bootstrap confidence intervals Numerical output Individual study data AM,CI,P,W,other AM,CI,W,other AM,other AM,CI,other AM,CI,P,W,other AM,CI,P,W,other Association measures – risk RD,RR,OR RD,RR,OR RD,RR,OR RD,OR RD,RR,OR RD,RR,OR Association measures – means & MD,HG,CD,other HG,other MD,HG MD,HG,CD standardized measures Association measures – other CC,Z CC Fixed effect models/weighting IV,MH,PETO IV,MH,PETO,other IV,MH,PETO IV,MH,PETO IV,MH,PETO IV,MH,PETO Random effects models/weighting DL DL DL DL DL DL Cumulative analyses Several variables Several variables Several variables ( ) Only graph Several variables 2 2 2 2 2 2 Heterogeneity Q,I ,t QQ Q,I Q,I Q,I ,t ,other Small study effect/publication bias FSN,RC,EGG,TF EGG FSN,RC FSN,EGG FSN,RC,EGG,MAC,TF Meta-regression Single moderator Single moderator Graphical output Forest plot 9 9 9 9 9 9 - Points proportional to weights 9 9 - Annotations in rows possible 9 9 9 - Cumulative possible 9 9 Funnel plot (1/se, se, var, N, P) 1/se,se 1/se,se,N var,N N 1/se 1/se,se,N,P Galbraith plot (radial) 9 9 9 9 Exclusion sensitivity plot Trim and fill plot 9 9 L'Abbe plot 9 9 9 Other plots HIST,NQ BOX,HIST,NQ,other Graph formatting 9 9 9 9 9 9 The ' ' indicates the presence and no mark indicates the absence of a feature. The '( )' means that the feature is limited or partially in development, and the ' ' means it 9 9 was not working correctly at the time of our assessments. 
Abbreviations: AM = association measure, CI = confidence interval, P = P value, W = weight, RD = risk difference, RR = risk ratio, OR = odds ratio, md = mean difference, hg = Hedges' g, cd = Cohen's d, CC = correlation coefficient, Z = Fisher's Z, IV = inverse variance weighting, MH = Mantel-Haenszel weighting, PETO = Peto's weighting, DL = Dersimonian & Laird weighting, Q = Cochran's or Breslow & Day's Q, I =Higgins's inconsistency statistic, t =between study variance indicator, FSN = fail-safe number test, RC = rand correlation test, Egg = Egger's regression test, Mac = Macaskill's regression test, TF = trim and fill method, se = standard error, var = variance, N = sample size, TFP = trim and fill plot, HIST = histogram, NQ = normal quantile plot, BOX = box-and-whiskers plot. itself and it requires Microsoft Excel 2000 or later to run. RevMan 4.2.8 (free for private and academic use) was Another limitation is the maximum number of data sets, developed by and for the Cochrane Collaboration. It which is currently 100. Data sets can be created by manual stands out due to its extensive features for collaborative input as well as by importing text delimited data files or management of systematic reviews. The analytical func- Excel workbooks. The numerical and graphical options tions of the program cannot be accessed without first cre- are diverse and comprehensive. ating a review structure and because import and copy- Page 5 of 9 (page number not for citation purposes) BMC Medical Research Methodology 2007, 7:40 http://www.biomedcentral.com/1471-2288/7/40 and-paste functionality are also limited, getting started Since MetAnalysis and WEasyMA can only analyze data requires more preparation than with most other software. from two-by-two tables, the comparability assessments Once data are in the analysis module, analysis is straight- were limited to one data set [8]. Analyses in MetAnalysis forward. 
Output is detailed, though without tests for publication bias and no other graphs than the forest and funnel plot. The help resources in RevMan are extremely thorough. A new version is to be released in the near future.

WEasyMA 2.5 (commercial software) stands out by the speed with which results become available after data set creation. Data cannot be imported or pasted and need to be entered manually, cell by cell. Another limitation of this program is that it can only handle data from clinical trials with dichotomous outcomes, e.g. two-by-two table data. Although limited to these types of data, the program produces a wide variety of numerical and graphical output. The original author has indicated that the software is currently unsupported by a development team and may soon no longer be available.

Validity and comparability of meta-analysis results
Our internet and database search did not yield any publications on the validity or validation of any of the programs, except for MIX [24,25]. Authors of all programs were contacted to determine whether (yet unpublished) evidence of validation procedures was available. Authors of RevMan indicated that validation data were made public via notes and abstracts at Cochrane Collaboration meetings and conferences. The authors of CMA, MetAnalysis, and MetaWin stated that all procedures had been checked extensively with external programs, spreadsheets, and occasionally by hand, though these checks had not been made public. For CMA, Excel sheets with such data are available upon request. We received no information on validation procedures from the authors of WEasyMA.

We found no discrepancies in meta-analysis results between STATA, MIX and RevMan. In CMA, we found a small inconsistency in results of publication bias tests, but this was corrected via an update while we were writing this article.

MetaWin's results were different from STATA's results (and thus also from results in CMA, MIX, and RevMan) because MetaWin mostly uses a t-distribution where the aforementioned programs use a z-distribution (although a recent version of MIX also allowed us to use a t-distribution). We did find what seemed to be a terminological inconsistency, as the method labeled Mantel-Haenszel in MetaWin for odds ratio analyses gave results that were identical to those from Peto's method in the other programs (albeit with confidence limits based on a t-distribution).

[Results from MetAnalysis] were very similar though not always identical to those from STATA. We found that if we entered experimental group data first (as is the case in all other software), an incorrect event coding is applied that causes the software to calculate risk differences and odds ratios of survival even if mortality is entered as event. For risk differences this only changes the sign, but for odds and odds ratios it gives the reciprocal of the intended results [26]. Although the book mentions that control data are to be entered in the first data column, the software currently has no built-in guard against this and we therefore urge users to be careful.

In WEasyMA, we found results that could not be reproduced if a data set with zero events in one study arm was used. Even when using the same continuity correction as reported in the 'Calculation options' dialog in WEasyMA, the results in STATA remained different. The WEasyMA authors did not respond to our inquiry into reasons for the discrepancies.

Assessment of usability
Of the 30 participating researchers, 26 provided quantitative data that were suitable for analysis (Table 5). Trouble with the electronic user form or installation of software made the data from 4 researchers incomplete and they were excluded from the quantitative part. MIX scored highest on the overall usability (8.6), followed by CMA (6.9), MetaWin (6.2), RevMan (6.1), and WEasyMA (4.2).

RevMan was most familiar to the participating researchers. MIX had not been used by any of the participants but the name was familiar to some as they were affiliated to the same institutions as the makers of the MIX software. Stratifying the results in analogous subgroups did not reveal any specific trends in the ratings. Experienced users appeared to be more critical than less experienced users, but relative scores were identical. Installation of WEasyMA and CMA was troublesome for some researchers.

Qualitative statements mostly concerned problems with the installation (WEasyMA, CMA), error messages in French (WEasyMA), and difficulties with data set creation (WEasyMA, RevMan). Favorable comments included praise for the user interfaces (MIX, RevMan, CMA), help system (RevMan), speed of analysis (WEasyMA), and within-program tutoring (MIX, CMA).

Discussion
Meta-analysis is an indispensable tool in current-day synthesis of research data from multiple studies, and systematic reviews with meta-analyses occupy the top position in the hierarchy of evidence.
Software for meta-analysis has evolved over the years and available reviews are relatively outdated. We therefore considered it timely to provide a systematic overview of the features, criterion validity, and usability of the currently available software that is dedicated to meta-analysis of causal (therapeutic and etiologic) studies. It has some overlaps with existing reviews [3-7], but includes other more recent programs, contains more detailed information on the merits and demerits of the available programs, and follows a more systematic approach.

Table 5: Meta-analysis software, usability ratings (best scoring software from left to right)

Items and subgroups | MIX | CMA | MetaWin | RevMan | WEasyMA
All researchers (26)
Overall rating (min-max) | 8.6 (6.7 to 10) | 6.9 (3.7 to 9.7) | 6.2 (4.3 to 8.7) | 6.1 (4.3 to 8.3) | 4.2 (1 to 7.3)
Getting started | 8.6 | 7.4 | 6.8 | 7.6 | 4.5
Data preparation | 8.3 | 6.3 | 6.3 | 4.5 | 2.6
Usability in analysis | 8.8 | 7.1 | 5.6 | 6.3 | 5.9
Experienced (7)
Overall rating (min-max) | 8.1 (7.0 to 9.7) | 6.8 (6.0 to 7.3) | 5.9 (4.3 to 7.7) | 5.4 (4.3 to 6.3) | 3.3 (1 to 5.7)
Getting started | 8.0 | 7.6 | 6.2 | 7.5 | 2.8
Data preparation | 8.3 | 6.3 | 6.3 | 3.0 | 2.0
Usability in analysis | 8.0 | 6.6 | 5.4 | 6.3 | 5.3
Inexperienced (19)
Overall rating (min-max) | 8.7 (6.7 to 10) | 7 (3.7 to 9.7) | 6.3 (4.3 to 8.7) | 6.3 (4.7 to 8.3) | 4.6 (1.3 to 7.3)
Getting started | 8.8 | 7.3 | 6.9 | 7.7 | 5.0
Data preparation | 8.3 | 6.3 | 6.3 | 5.0 | 2.8
Usability in analysis | 9.1 | 7.3 | 5.6 | 6.3 | 6.1

All scores are summary scores, based on the scores of items in the 'Installation', 'Data preparation', and 'Usability in analysis' categories. Each item was scored from bad to excellent on a scale from 0 to 10.

We studied four commercial programs (CMA, WEasyMA, MetaWin, and MetAnalysis) and two free programs (RevMan and MIX). The features of the commercial programs were not necessarily more extensive than those of the free ones. In particular, MIX stood out in terms of numerical options and graphical output. CMA was generally most versatile, in particular in options for analysis of various types of data. With regard to the comparability of results, MIX, RevMan, and CMA produced numerical results that were identical to results from STATA's metan, metabias, and metatrim. MetaWin's results are different and slightly more conservative, since the confidence intervals are based on a t-distribution or bootstraps. WEasyMA produces results that can be disparate from the other programs, especially in data sets with studies with zero events in one or both of the comparison groups. Although most differences were small in the data sets we used, we have reservations about how this will carry over to data sets with more extreme data. The MetAnalysis program should also be used with care as data have to be entered manually and in the correct columns. Exchanging the columns is currently not prevented by warning or error messages and can lead to invalid results.

The usability study shows that preparing data for analysis is the hardest part in each program. MIX and CMA are identified as the most user-friendly programs. WEasyMA scored least favorably. Stratifying user evaluations based on experience with meta-analysis and previous experience or knowledge of the software did not reveal any trends in the ratings.

Our comparison has been limited to software dedicated to meta-analysis only and does not include general statistics packages. The primary reason to leave them out was that they are structurally very different, making direct comparisons inappropriate. Central to this issue is software syntax: most general packages require thorough knowledge of their syntax in order to produce and alter graphs that are common in meta-analysis; the dedicated packages, however, produce such graphs with a few or sometimes even a single click. In addition, the syntax knowledge required to do more advanced meta-analyses with the general packages means that in a usability survey all participants would have to be expert statisticians, capable of writing and adapting syntax for meta-analysis in all major general software packages. Not only is this infeasible in the current setting, it would also make the participating individuals no longer representative of the (sometimes relatively inexperienced) users of the software in the scientific and academic community. Although a different approach would be necessary, we believe the user community of meta-analysis software would benefit from an additional review of meta-analysis options in general statistics software.

Due to the lack of a 'gold' standard, we resorted to between-program comparisons and a criterion validation with STATA's user-written commands metan, metabias and metatrim as reference. Our choice for STATA was based on its versatility and use in two major books on meta-analysis [11,12]. We realize that STATA itself is also user-written and potentially subject to similar validity issues as the other programs. The fact that CMA, MIX, and RevMan produced results that were identical to results from STATA, at least with the three data sets we selected, justifies to some extent our use of STATA as a reference standard.

The results of our usability survey should be regarded as exploratory and serve as a rough indication. First, the number of participants was relatively small. Second, it is not unlikely that there may be some bias in favor of RevMan and MIX because some users were already familiar with these programs. Subgroup analyses, however, did not reveal such trends. MetAnalysis could unfortunately not be included in the survey, as it was added to the review after the start of the usability assessment. A further point regarding MIX is that it was created following a development focus list [25] that was created in a similar fashion to our usability scoring list. Assessment of both lists reveals that a number of items are very similar. Although this may indicate that the lists are indeed reflecting the demands of statistical software users, it also means that the MIX program was likely to do well in our assessment. We believe, however, that any program that is systematically developed to satisfy its users' demands should perhaps deservedly score high.

Another point to which we would like to draw attention is the lack of accessible public information about the manner in which meta-analysis programs have been validated. Only the website of the MIX program includes specific references to this and MIX is the only program with a peer-reviewed and published validation report [25]. Without such reports, authors, reviewers, editors, and consumers of evidence have no reference for judgments about the suitability of the software for scientific purposes. This is of course equally applicable to the user-written meta-analysis macros for general statistics software. We argue for more rigor and transparency in this area.

Finally, we are fully aware that the world of information technology changes constantly and by the time this manuscript is published, it is possible that some updates have become available or that new products have been launched. We apologize beforehand for our lack of timing. Like a traditional review, we intend to update this investigation in due time.

Conclusion
In conclusion, the most suitable meta-analysis software for a user depends on his or her demands; no single program may be best for everybody. The information provided in this article, in particular the data in Tables 3 and 4, should give users the opportunity to make a substantiated decision.

Competing interests
None of the authors have financial conflicts of interest, although the first author is also primary developer of one of the free programs (MIX) studied in this review. The other authors have been co-authors in an introductory article about MIX. To reduce personal biases, all tasks were handled by multiple investigators and the subjective usability assessments were assigned (by study design) to individuals other than the authors.

Authors' contributions
LB and KGM developed the study concept and designed the study. LB and LMY handled the primary data acquisition and drafted the manuscript. All authors double checked the data tables and analyses, and approved the final version of the manuscript.

Additional material
Additional file 1
Software usability scoring list. The scoring list that was used to evaluate the usability of the meta-analysis software.
[http://www.biomedcentral.com/content/supplementary/1471-2288-7-40-S1.doc]

Acknowledgements
This study was not supported by any particular grant. The authors would like to express their gratitude to all researchers who participated in the usability assessment sessions.

References
1. Hunt M: Making order of scientific chaos. In How Science Takes Stock. New York, Russell Sage Foundation; 1997:1-19.
2. Eysenck HJ: Meta-analysis and its problems. BMJ 1994, 309:789-792.
3. Normand SLT: Meta-analysis software - a comparative review - DSTAT, version 1.10. Am Statistician 1995, 49:298-309.
4. Egger M, Sterne JAC, Smith GD: Meta-analysis software. BMJ 1998, 316(7126): Website only: http://bmj.bmjjournals.com/archive/7126/7126ed9.htm.
5. Sutton AJ, Lambert PC, Hellmich M, Abrams KR, Jones DR: Meta-analysis in practice: A critical review of available software. In Meta-Analysis in Medicine and Health Policy. Edited by: Berry DA, Stangl DK. New York, Marcel Dekker; 2000.
6. Sutton AJ, Lambert PC, Hellmich M, Abrams KR, Jones DR: Meta-analysis software. In Systematic Reviews in Health Care: Meta-Analysis in Context. 2nd edition. Edited by: Egger M, Davey Smith G, Altman DG. London: BMJ Books; 2001.
7. Arthur W, Bennett W, Huffcutt A: Choice of software and programs in meta-analysis research: Does it make a difference? Educ Psychol Measurement 1994, 54:776-787.
8. Teo KK, Yusuf S, Collins R, Held PH, Peto R: Effects of intravenous magnesium in suspected acute myocardial infarction: overview of randomised trials. BMJ 1991, 303(6816):1499-1503.
9. Wahlbeck K, Cheine M, Essali MA: Clozapine versus typical neuroleptic medication for schizophrenia. Cochrane Database Syst Rev 2000:CD000059.
10. Pagliaro L, D'Amico G, Sorensen TI, Lebrec D, Burroughs AK, Morabito A, Tine F, Politi F, Traina M: Prevention of first bleeding in cirrhosis. A meta-analysis of randomized trials of nonsurgical treatment. Ann Intern Med 1992, 117(1):59-70.
11. Egger M, Davey Smith G, Altman DG: Systematic reviews in health care: meta-analysis in context. London, BMJ Publishing Group; 2001.
12. Sutton AJ, Abrams KR, Jones DR, Sheldon TA, Song F: Methods for meta-analysis in medical research. Chichester, Wiley; 2000.
13. Glasziou P, Irwig L, Bain C, Colditz G: Systematic Reviews in Health Care: A Practical Guide. Cambridge, Cambridge University Press; 2001.
14. Bradburn MJ, Deeks JJ, Altman DG: Metan - an alternative meta-analysis command (Metan 1.81). Stata Technical Bulletin 2003, STB 44(sbe24):4-15.
15. Steichen TJ: Tests for publication bias in meta-analysis (Metabias 1.2.4). Stata Journal 2003, SJ3-4(sbe19_5):11.
16. Steichen TJ: Nonparametric trim and fill analysis of publication bias in meta-analysis (Metatrim 1.0.5). Stata Technical Bulletin 2003, STB61(sbe39.2):11.
17. StataCorp: Stata statistical software, Release 9. College Station, TX, StataCorp LP; 2005.
18. Borenstein M, Hedges L, Higgins J, Rothstein H: Comprehensive Meta-Analysis Version 2. Englewood, NJ, Biostat; 2005.
19. Leandro G: Meta-analysis in Medical research. Blackwell Publishing, BMJ Books; 2005.
20. Rosenberg MS, Adams DC, Gurevitch J: MetaWin: Statistical Software for Meta-Analysis Version 2. Sunderland, Massachusetts, Sinauer Associates; 2000.
21. Bax L, Yu LM, Ikeda N, Tsuruta N, Moons KGM: MIX: Comprehensive Free Software for Meta-analysis of Causal Research Data - Version 1.5. 2006.
22. The Nordic Cochrane Centre: Review Manager (RevMan). Version 4.2 for Windows. Copenhagen, The Cochrane Collaboration.
23. Chevarier P, Cucherat M, Freiburger T, Maupas J, Visele N, Bugnard F, Bazog P: WeasyMA. Lyon, ClinInfo; 2000.
24. Bax L, Yu LM, Ikeda N, Tsuruta H, Moons KGM: Conference proceeding: Validation of a freely available and comprehensive meta-analysis add-in for excel. In Eur J Epidemiol Volume 21(supplement). European Journal of Epidemiology; 2006:58.
25. Bax L, Yu LM, Ikeda N, Tsuruta H, Moons KG: Development and validation of MIX: comprehensive free software for meta-analysis of causal research data. BMC Med Res Methodol 2006, 6(1):50.
26. Deeks JJ: Issues in the selection of a summary statistic for meta-analysis of clinical trials with binary outcomes. Stat Med 2002, 21(11):1575-1600.

Pre-publication history
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/7/40/prepub
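The column-order problem reported for MetAnalysis in the section on validity and comparability of results can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not code from any of the reviewed programs: the function name two_by_two_effects and the trial counts are hypothetical and chosen only for illustration. It shows that exchanging the two group columns gives the same numbers as analyzing survival instead of mortality, so the risk difference changes sign and the odds ratio becomes its reciprocal.

```python
from fractions import Fraction


def two_by_two_effects(events_1, total_1, events_2, total_2):
    """Return (risk difference, odds ratio) of column 1 relative to column 2."""
    risk_1 = Fraction(events_1, total_1)
    risk_2 = Fraction(events_2, total_2)
    odds_1 = Fraction(events_1, total_1 - events_1)
    odds_2 = Fraction(events_2, total_2 - events_2)
    return risk_1 - risk_2, odds_1 / odds_2


# Hypothetical trial: 10 of 100 die under treatment, 20 of 100 under control.
intended = two_by_two_effects(10, 100, 20, 100)  # treatment vs control, mortality
swapped = two_by_two_effects(20, 100, 10, 100)   # same data, columns exchanged
survival = two_by_two_effects(90, 100, 80, 100)  # treatment vs control, survival

rows = [
    ("intended", intended),
    ("columns swapped", swapped),
    ("survival coded", survival),
]
for label, (rd, odds_ratio) in rows:
    print(f"{label:16s} RD = {float(rd):+.3f}  OR = {float(odds_ratio):.3f}")
```

With these made-up counts the intended analysis gives a risk difference of -0.100 and an odds ratio of 0.444, while both the swapped and the survival-coded analyses give +0.100 and 2.250. Exact fractions are used so that the reciprocal relation holds without rounding error.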

Journal

BMC Medical Research Methodology – Springer Journals

Published: Sep 10, 2007
