L. Humphreys (1962)
The organization of human abilities. American Psychologist, 17
L. Humphreys (1986)
An analysis and evaluation of test and item bias in the prediction context. Journal of Applied Psychology, 71
D. Thissen, L. Steinberg (1984)
A response model for multiple choice items. Psychometrika, 49
R. Darrell Bock (1988)
Full-Information Item Factor Analysis. Applied Psychological Measurement, 12
C. Hirsch (1988)
Curriculum and Evaluation Standards for School Mathematics
F. Lord (1980)
Applications of Item Response Theory To Practical Testing Problems
R. Zwick (1990)
When Do Item Response Function and Mantel-Haenszel Definitions of Differential Item Functioning Coincide? Journal of Educational Statistics, 15
S. Sireci, D. Thissen, H. Wainer (1991)
On the reliability of testlet‐based tests. Journal of Educational Measurement, 28
D. Thissen, L. Steinberg, Joan Mooney (1989)
Trace Lines for Testlets: A Use of Multiple-Categorical-Response Models. Journal of Educational Measurement, 26
P. Holland, Dorothy Thayer (1986)
Differential Item Performance and the Mantel-Haenszel Procedure.
P. Bentler, D. Bonett (1980)
Significance Tests and Goodness of Fit in the Analysis of Covariance Structures. Psychological Bulletin, 88
S. Sireci, D. Thissen, H. Wainer (1991)
On the reliability of testlet‐based tests. ETS Research Report Series, 1991
L. Resnick (1987)
Education and Learning to Think
M. Tatsuoka, F. Lord, M. Novick, A. Birnbaum (1971)
Statistical Theories of Mental Test Scores. Journal of the American Statistical Association, 66
C. Lewis, K. Sheehan (1990)
Using Bayesian Decision Theory to Design a Computerized Mastery Test. Applied Psychological Measurement, 14
F. Samejima (1968)
Estimation of latent ability using a response pattern of graded scores. Psychometrika, 34
Roznowski (1988)
Review of test validity. Journal of Educational Measurement, 25
H. Gulliksen (1952)
Theory of mental tests
R. Bock (1972)
Estimating item parameters and latent ability when responses are scored in two or more nominal categories. Psychometrika, 37
D. Thissen, L. Steinberg, H. Wainer (1988)
Use of item response theory in the study of group differences in trace lines.
L. Humphreys (1981)
The Primary Mental Ability
H. Wainer, G. Kiely (1987)
Item Clusters and Computerized Adaptive Testing: A Case for Testlets. Journal of Educational Measurement, 24
It is sometimes sensible to think of the fundamental unit of test construction as being larger than an individual item. This unit, dubbed the testlet, must pass muster in the same way that items do. One criterion of a good item is the absence of differential item functioning (DIF): the item must function in the same way in all important subpopulations of examinees. In this article, we define what we mean by testlet DIF and provide a statistical methodology to detect it. This methodology parallels the IRT‐based likelihood ratio procedures explored previously by Thissen, Steinberg, and Wainer (1988, in press). We illustrate this methodology with analyses of data from a testlet‐based experimental version of the Scholastic Aptitude Test (SAT).
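The abstract does not spell out the computation, so the following is only a generic sketch of how a likelihood ratio comparison of this kind is commonly set up, not the authors' procedure. It assumes two models have already been fitted elsewhere: one constraining a testlet's parameters to be equal across groups, and one freeing them. The function names (`chi2_sf`, `likelihood_ratio_dif`) and the log-likelihood values in the usage lines are hypothetical placeholders, not figures from the article.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: closed form exp(-y) * sum_{i < df/2} y^i / i!
        total = sum(y ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-y) * total
    # Odd df: start from erfc(sqrt(y)) (the df = 1 case) and add the
    # recurrence terms y^(i - 1/2) / Gamma(i + 1/2).
    total = sum(
        y ** (i - 0.5) / math.gamma(i + 0.5)
        for i in range(1, (df - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(y)) + math.exp(-y) * total


def likelihood_ratio_dif(loglik_constrained, loglik_free, n_freed_params):
    """Return (G2, p) comparing a no-DIF model against a DIF model.

    loglik_constrained: log-likelihood with the studied testlet's parameters
        held equal across groups (assumed to be fitted elsewhere).
    loglik_free: log-likelihood with those parameters free to differ.
    n_freed_params: number of parameters freed, used as degrees of freedom.
    """
    g2 = -2.0 * (loglik_constrained - loglik_free)
    g2 = max(g2, 0.0)  # guard against tiny negative values from rounding
    return g2, chi2_sf(g2, n_freed_params)


# Hypothetical log-likelihoods, for illustration only.
g2, p = likelihood_ratio_dif(-1010.0, -1005.0, 2)
print(g2, p)
```

A small p-value would suggest that freeing the testlet's parameters improves fit, which is the usual reading of DIF under this kind of test. The chi-square reference distribution is itself an assumption that holds only asymptotically and for nested models.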
Journal of Educational Measurement – Wiley
Published: Sep 1, 1991