journal article
Roodman, David; Nielsen, Morten Ørregaard; MacKinnon, James G.; Webb, Matthew D.
doi: 10.1177/1536867X19830877
The wild bootstrap was originally developed for regression models with heteroskedasticity of unknown form. Over the past 30 years, it has been extended to models estimated by instrumental variables and maximum likelihood and to ones where the error terms are (perhaps multiway) clustered. Like bootstrap methods in general, the wild bootstrap is especially useful when conventional inference methods are unreliable because large-sample assumptions do not hold. For example, there may be few clusters, few treated clusters, or weak instruments. The package boottest can perform a wide variety of wild bootstrap tests, often at remarkable speed. It can also invert these tests to construct confidence sets. As a postestimation command, boottest works after linear estimation commands, including regress, cnsreg, ivregress, ivreg2, areg, and reghdfe, as well as many estimation commands based on maximum likelihood. Although it is designed to perform the wild cluster bootstrap, boottest can also perform the ordinary (nonclustered) version. Wrappers offer classical Wald, score/Lagrange multiplier, and Anderson–Rubin tests, optionally with (multiway) clustering. We review the main ideas of the wild cluster bootstrap, offer tips for use, explain why it is particularly amenable to computational optimization, state the syntax of boottest, artest, scoretest, and waldtest, and present several empirical examples.
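The core idea of the wild cluster bootstrap test described in this abstract can be sketched in a few lines. The Python below is a minimal illustration, not the boottest implementation: it assumes a single regressor plus an intercept, tests the null hypothesis that the slope is zero, imposes that null when generating bootstrap samples, and draws one Rademacher weight per cluster. The function names and defaults are hypothetical, and the small-sample correction to the cluster-robust variance is omitted because it rescales the observed and bootstrap statistics by the same constant.

```python
import random
from collections import defaultdict


def cluster_t(x, y, cluster):
    """Cluster-robust t statistic for the slope in y = a + b*x + u, H0: b = 0."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    xd = [xi - xbar for xi in x]              # regressor with the intercept partialled out
    sxx = sum(d * d for d in xd)
    b = sum(d * yi for d, yi in zip(xd, y)) / sxx
    a = ybar - b * xbar
    # Sum the score contributions xd_i * e_i within each cluster.
    scores = defaultdict(float)
    for d, xi, yi, g in zip(xd, x, y, cluster):
        scores[g] += d * (yi - a - b * xi)
    # Sandwich variance of the slope (no small-sample correction).
    var_b = sum(s * s for s in scores.values()) / (sxx * sxx)
    return b / var_b ** 0.5


def wild_cluster_bootstrap_p(x, y, cluster, reps=999, seed=12345):
    """Symmetric bootstrap p-value for H0: slope = 0, with the null imposed."""
    rng = random.Random(seed)
    t_obs = cluster_t(x, y, cluster)
    # Restricted fit under H0: intercept only, so residuals are deviations from the mean.
    ybar = sum(y) / len(y)
    resid_r = [yi - ybar for yi in y]
    groups = sorted(set(cluster))
    exceed = 0
    for _ in range(reps):
        # One Rademacher draw (+1 or -1) per cluster, shared by all its observations.
        w = {g: rng.choice((-1.0, 1.0)) for g in groups}
        y_star = [ybar + w[g] * u for g, u in zip(cluster, resid_r)]
        if abs(cluster_t(x, y_star, cluster)) >= abs(t_obs):
            exceed += 1
    return exceed / reps
```

Calling `wild_cluster_bootstrap_p(x, y, cluster)` with three equal-length lists returns a value between 0 and 1. With G clusters, Rademacher weights admit only 2^G distinct bootstrap samples, which is one reason results with very few clusters should be read with care.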
doi: 10.1177/1536867X19830891
In this article, I propose a new approach to language interfacing for statistical software by allowing automatic interprocess communication between R and Stata. I advocate interactive language interfacing in statistical software by automatizing data communication. I introduce the rcall package and provide examples of how the R language can be used interactively within Stata or embedded into Stata programs using the proposed approach to interfacing. Moreover, I discuss the pros and cons of object synchronization in language interfacing.
doi: 10.1177/1536867X19830892
In this article, I underscore the importance of syntax coloring in teaching statistics. I also introduce the statax package, which includes JavaScript and LaTeX programs for highlighting Stata code in HTML and LaTeX documents. Furthermore, I provide examples showing how to implement this package for developing educational materials on the web or for a classroom handout.
doi: 10.1177/1536867X19830893
In this article, I introduce a new command, nehurdle, that collects maximum likelihood estimators for linear, exponential, homoskedastic, and heteroskedastic tobit; truncated hurdle; and type II tobit models that involve explained variables with corner solutions. I review what a corner solution is as well as the assumptions of the mentioned models.
Jakubowski, Maciej; Pokropek, Artur
doi: 10.1177/1536867X19830909
The OECD Programme for the International Assessment of Adult Competencies (PIAAC) is currently the only international survey of adult skills. It provides rich data on skills, work and life situations, earnings, and attitudes. To ensure representativeness and high reliability, the study is based on a complex survey design and advanced statistical methods. To obtain correct results from publicly available microdata, one must use special methods that are often too advanced for less experienced researchers. In this article, we present piaactools—a package of three commands that facilitate analysis with PIAAC data. The command piaacdes calculates basic statistics, piaactab computes frequencies of adults at each proficiency level, and piaacreg allows for the use of several regression models with PIAAC data. Output is saved as HTML files that can be opened in most spreadsheets and as Stata matrices that can be further processed in Stata. We also explain how to use these commands and provide examples that can be easily modified for use with different models and variables.
doi: 10.1177/1536867X19830910
In this article, I present the lsemantica command, which implements latent semantic analysis in Stata. Latent semantic analysis is a machine learning algorithm for word and text similarity comparison and uses truncated singular value decomposition to derive the hidden semantic relationships between words and texts. lsemantica provides a simple command for latent semantic analysis as well as complementary commands for text similarity comparison.
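The truncated singular value decomposition at the heart of latent semantic analysis can be illustrated with a toy example. The Python sketch below is not the lsemantica implementation; it assumes NumPy is available, uses a hypothetical five-term, four-document count matrix, and the function names are invented for illustration. It keeps the top k singular components and compares documents by cosine similarity in the reduced space.

```python
import numpy as np


def lsa_document_vectors(term_doc, k):
    """Project documents into a k-dimensional latent space via truncated SVD.

    term_doc has one row per term and one column per document.
    Returns an array with one row per document and k columns.
    """
    u, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    # Keep only the k largest singular values and their right singular vectors.
    return (np.diag(s[:k]) @ vt[:k, :]).T


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical counts: rows are the terms cat, dog, pet, stock, market;
# columns are four short documents (two about pets, two about finance).
counts = np.array([
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)

docs = lsa_document_vectors(counts, k=2)
pets_vs_pets = cosine(docs[0], docs[1])
pets_vs_finance = cosine(docs[0], docs[2])
```

In this toy matrix the two pet documents end up pointing in the same latent direction, while a pet document and a finance document are orthogonal, so `pets_vs_pets` is far larger than `pets_vs_finance`. Real corpora are not this cleanly separated, and the choice of k is a tuning decision.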