Performance of the Generalized Mantel-Haenszel Method in Graded-Response Items Using Empirical Data
Abstract
This study examined how the Generalized Mantel-Haenszel (GMHDIF) method performs in detecting differential item functioning (DIF) by gender in graded-response items as the sample size changes. It used real data obtained from the responses of a sample of Tabuk University students on a scale evaluating the quality of academic advising. Six sample sizes were used: 250, 500, 1000, 1500, 2000, and 2500. The results showed that items flagged as exhibiting DIF in small samples may not be flagged in larger samples. Conversely, items that do not appear to exhibit DIF in small samples may do so in larger samples. Some items were flagged as exhibiting DIF at all sample sizes, including the smaller ones. Overall, the method's ability to detect DIF items increases with sample size. Items expected to have a high level of differential functioning are easier for the method to detect, even in smaller samples.
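For readers unfamiliar with the statistic, the following is a minimal sketch of a generalized Mantel-Haenszel chi-square for a single graded-response item, assuming the standard multivariate hypergeometric formulation with examinees matched on a total score. It is not taken from the study, which does not describe its software. The function names `solve` and `gmh_statistic`, the group labels `"R"` (reference) and `"F"` (focal), and the 0-based category coding are all assumptions made for illustration.

```python
from collections import defaultdict


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination (partial pivoting)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[p][c]) < 1e-12:
            raise ValueError("singular covariance matrix")
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def gmh_statistic(responses, groups, strata, n_categories):
    """Return (chi-square, degrees of freedom) for one item.

    responses: category codes 0 .. n_categories - 1 (assumed coding)
    groups:    "R" or "F" for each examinee (assumed labels)
    strata:    matching variable, e.g. total scale score
    """
    t = n_categories - 1  # the last category is redundant given the margins
    tables = defaultdict(lambda: {"R": [0] * n_categories, "F": [0] * n_categories})
    for y, g, k in zip(responses, groups, strata):
        tables[k][g][y] += 1

    diff = [0.0] * t                    # sum over strata of observed - expected
    cov = [[0.0] * t for _ in range(t)]  # sum over strata of covariance matrices
    for tab in tables.values():
        n_r, n_f = sum(tab["R"]), sum(tab["F"])
        n = n_r + n_f
        if n_r == 0 or n_f == 0 or n < 2:
            continue  # a stratum with only one group carries no information
        col = [tab["R"][c] + tab["F"][c] for c in range(n_categories)]
        scale = n_r * n_f / (n * n * (n - 1))
        for i in range(t):
            diff[i] += tab["R"][i] - n_r * col[i] / n
            for j in range(t):
                diag = n * col[i] if i == j else 0
                cov[i][j] += scale * (diag - col[i] * col[j])

    w = solve(cov, diff)  # cov^{-1} diff without forming the inverse
    return sum(d * x for d, x in zip(diff, w)), t
```

Under the null hypothesis of no DIF, the returned statistic is referred to a chi-square distribution with the returned degrees of freedom. With two response categories the sketch reduces to the ordinary Mantel-Haenszel chi-square without continuity correction. Repeating the calculation on subsamples of different sizes would mirror the design described in the abstract.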