The Stata Journal

The wild bootstrap was originally developed for regression models with heteroskedasticity of unknown form. Over the past 30 years, it has been extended to models estimated by instrumental variables and maximum likelihood and to ones where the error terms are (perhaps multiway) clustered. Like bootstrap methods in general, the wild bootstrap is especially useful when conventional inference methods are unreliable because large-sample assumptions do not hold. For example, there may be few clusters, few treated clusters, or weak instruments. The package boottest can perform a wide variety of wild bootstrap tests, often at remarkable speed. It can also invert these tests to construct confidence sets. As a postestimation command, boottest works after linear estimation commands, including regress, cnsreg, ivregress, ivreg2, areg, and reghdfe, as well as many estimation commands based on maximum likelihood. Although it is designed to perform the wild cluster bootstrap, boottest can also perform the ordinary (nonclustered) version. Wrappers offer classical Wald, score/Lagrange multiplier, and Anderson–Rubin tests, optionally with (multiway) clustering. We review the main ideas of the wild cluster bootstrap, offer tips for use, explain why it is particularly amenable to computational optimization, state the syntax of boottest, artest, scoretest, and waldtest, and present several empirical examples.

journal article

LitStream Collection

Seamless interactive language interfacing between R and Stata

Haghish, E. F.

2019 The Stata Journal

doi: 10.1177/1536867X19830891; pmid: N/A

In this article, I propose a new approach to language interfacing for statistical software by allowing automatic interprocess communication between R and Stata. I advocate interactive language interfacing in statistical software by automatizing data communication. I introduce the rcall package and provide examples of how the R language can be used interactively within Stata or embedded into Stata programs using the proposed approach to interfacing. Moreover, I discuss the pros and cons of object synchronization in language interfacing.

journal article


On the importance of syntax coloring for teaching statistics

Haghish, E. F.

2019 The Stata Journal

doi: 10.1177/1536867X19830892; pmid: N/A

In this article, I underscore the importance of syntax coloring in teaching statistics. I also introduce the statax package, which includes JavaScript and LaTeX programs for highlighting Stata code in HTML and LaTeX documents. Furthermore, I provide examples showing how to implement this package for developing educational materials on the web or for a classroom handout.

journal article


Estimation methods in the presence of corner solutions

Sánchez-Peñalver, Alfonso

2019 The Stata Journal

doi: 10.1177/1536867X19830893; pmid: N/A

In this article, I introduce a new command, nehurdle, that collects maximum likelihood estimators for linear, exponential, homoskedastic, and heteroskedastic tobit; truncated hurdle; and type II tobit models that involve explained variables with corner solutions. I review what a corner solution is as well as the assumptions of the mentioned models.

journal article


piaactools: A program for data analysis with PIAAC data

Jakubowski, Maciej; Pokropek, Artur

2019 The Stata Journal

doi: 10.1177/1536867X19830909; pmid: N/A

The OECD Programme for the International Assessment of Adult Competencies (PIAAC) is currently the only international survey of adult skills. It provides rich data on skills, work and life situations, earnings, and attitudes. To ensure representativeness and high reliability, the study is based on a complex survey design and advanced statistical methods. To obtain correct results from publicly available microdata, one must use special methods that are often too advanced for less experienced researchers. In this article, we present piaactools—a package of three commands that facilitate analysis with PIAAC data. The command piaacdes calculates basic statistics, piaactab computes frequencies of adults at each proficiency level, and piaacreg allows for the use of several regression models with PIAAC data. Output is saved as HTML files that can be opened in most spreadsheets and as Stata matrices that can be further processed in Stata. We also explain how to use these commands and provide examples that can be easily modified for use with different models and variables.

journal article


lsemantica: A command for text similarity based on latent semantic analysis

Schwarz, Carlo

2019 The Stata Journal

doi: 10.1177/1536867X19830910; pmid: N/A

In this article, I present the lsemantica command, which implements latent semantic analysis in Stata. Latent semantic analysis is a machine learning algorithm for word and text similarity comparison and uses truncated singular value decomposition to derive the hidden semantic relationships between words and texts. lsemantica provides a simple command for latent semantic analysis as well as complementary commands for text similarity comparison.
