Author: Yale, Robert N.

Abstract

Believability has been proposed as a factor influencing the persuasiveness of narratives. A measure of narrative believability was developed and validated. Study 1 details the construction and evaluation of the Narrative Believability Scale (NBS-12) in terms of internal consistency. Study 2 evaluates the criterion-related and construct validity of the scale. Study 3 tests the predictive validity of the measure for identifying juror verdicts and verdict confidence over and above the influence of other measures, including presentation order, attorney credibility, bias, and transportation. The NBS-12 was found to be a psychometrically robust measure of narrative believability and was able to predict variance in verdicts and verdict confidence. These results have implications for narrative persuasion research and understanding juror decision making.

Over the past 3 decades, research on the effects and mechanisms of narrative persuasion has consistently found that narratives have the potential to influence real-world beliefs and behaviors. This influence has been demonstrated with both true and fictional narratives (Busselle & Bilandzic, 2008; Gerrig & Prentice, 1991; Green & Brock, 2000), and may actually become stronger over time (Jensen, Bernat, Wilson, & Goonewardene, 2011). Several mechanisms of narrative persuasion have been examined, including transportation (Green & Brock, 2000), character identification (de Graaf, Hoeken, Sanders, & Beentjes, 2011; Green, 2006; Moyer-Gusé, Chung, & Jain, 2011; Slater & Rouner, 2002), counterfactual thinking (Tal-Or, Boninger, Poran, & Gleicher, 2004), engagement (Busselle & Bilandzic, 2009), causality (Dahlstrom, 2010), emotional reactions (Murphy, Frank, Moran, & Patnoe-Woodley, 2011), and perceived realism (Busselle & Bilandzic, 2008).
Although investigation of each of these constructs has provided insight into various mechanisms by which narratives may influence beliefs, one key construct has received scant attention outside of the context of legal narratives: believability. Beyond the domain of media effects, researchers in psychology, communication, and law have been investigating the function of narrative in the courtroom for several decades and studies overwhelmingly suggest that the narratives believed by jurors during a trial are the primary determinants of the ultimate verdict (Bennett & Feldman, 1981; Huntley & Costanzo, 2003; Pennington & Hastie, 1988, 1991, 1992; Rideout, 2008). However, little is known about what makes a narrative seem veridical, and thus acceptable for informing decisions, despite this overwhelming evidence in the legal context that believability is of primary importance for understanding narrative influence. Pennington and Hastie's story model of juror persuasion (1991) is a theoretical account of the ways jurors construct and assess narratives to render verdicts in court. Briefly stated, the model posits that jurors arrive at a verdict through a three-step process: story construction, representation of the decision alternatives by learning verdict category attributes, and classification of the story into the best fitting verdict category (Pennington & Hastie, 1991). The story model has emerged as the most widely accepted account of juror decision making (Diamond & Rose, 2005; Hannaford, Hans, Mott, & Munsterman, 2000), but has remained largely unexamined for its contributions to the understanding of narrative influence. As an explicit model of narrative influence on decision making, the story model provides an excellent framework for exploring believability as a key factor for narrative acceptance, and thus narrative influence. 
To date, most published research related to the story model has manipulated some aspect of the trial process, such as evidence order (Pennington & Hastie, 1988), witness credibility and story completeness (Pennington & Hastie, 1992), or presentation organization type (Adaval, Isbell, & Wyer, 2007), in order to create differing levels of the certainty principles explained in the model. Although this past research has provided strong empirical support for the model, the absence of measures for model constructs has limited the applicability of the model to more ecologically valid situations involving competing narratives. Further, the lack of measures has thus far confined the use and testing of the story model almost exclusively to the legal arena, hindering application of the model's empirically supported predictive validity in other narrative influence contexts. The purpose of the present research is to develop and validate a measure of narrative believability based on the certainty principles from the story model.

The story model

The story model specifies several “certainty principles” that determine acceptance or rejection of a given narrative in terms of its persuasive impact: coverage, coherence, and uniqueness (Pennington & Hastie, 1991). These principles “determine acceptability of a story and the resulting level of confidence in the story” (Pennington & Hastie, 1991, p. 527). Specifically, the model posits that coverage, coherence, and uniqueness contribute to confidence in a selected story. Coverage and coherence alone determine the acceptability of a story for selection. Narrative coverage refers to “the extent to which the story accounts for evidence presented at trial” (Pennington & Hastie, 1991, pp. 527–528).
For example, a prosecution trial narrative in which no explanation was given for video evidence suggesting the presence of another perpetrator would fail to account for the evidence and would be considered a narrative with low coverage. For nontrial narratives, stories with “loose ends” or those perceived to be missing important information might be considered to have low narrative coverage. Narrative coherence is a construct that is composed of three separate components in the story model: consistency, plausibility, and completeness, and is similar to the concept of evidence persuasiveness (Pennington & Hastie, 1991). Consistency refers to the perception that the facts of a story are not at odds internally or with other information believed by the juror to be true. Plausibility refers to the perception that a story is similar to what typically happens in the world. Completeness refers to the extent to which a story conforms to expectations about story structure. Thus, a story is most coherent when it is highly consistent, plausible, and complete. In summary, these constructs coalesce to form what might be thought of as narrative believability. The final certainty principle specified in the story model is that of story uniqueness. A story is unique to the extent that it is the only explanation of the evidence presented at trial. In cases where a juror constructs multiple coherent stories, the uniqueness of each story is reduced along with the juror's confidence in whichever story is ultimately selected (Einhorn & Hogarth, 1986; Pennington & Hastie, 1993). As uniqueness is only related to juror confidence in the selected story (Pennington & Hastie, 1991), it is less a component of believability than it is a construct related to the relative fitness of a narrative in comparison with alternatives. 
The requirement of alternative narratives for the evaluation of this construct limits the relevance of the subscale outside the domain of legal decision making, so uniqueness was excluded as a dimension of the narrative believability scale (NBS-12). With this in mind, a believable narrative is one that avoids leaving loose ends, is internally consistent and consistent with the perceiver's prior knowledge, and contains the expected elements and structure of a story. The NBS-12 was created and evaluated in three phases. Study 1 included the generation and selection of items, reliability analysis, and confirmatory factor analysis (CFA). Study 2 evaluated the criterion-related and construct validity of the scale. Study 3 tested the predictive validity of the scale by assessing the relationship between scores on the measure and participant verdicts and verdict confidence in two mock trial scenarios.

Study 1: Scale development and reliability testing

Generation of scale items and narrative stimuli

In Study 1, eight scale items were generated for each of the constructs specified by the story model as factors influencing the acceptability of a narrative (coverage, consistency, plausibility, and completeness), resulting in an initial pool of 32 items, each a 7-point Likert scale anchored by strongly disagree and strongly agree. Using the scale development guidelines suggested by DeVellis (2003), a panel of six experts composed of four trial attorneys and two litigation consultants with knowledge of the story model evaluated the pool items. To ensure face validity, the experts reviewed explicit definitions of each of the four constructs and then categorized each of the 32 items as to the underlying construct they believed was being measured. Items with low agreement between experts about the construct being measured were removed from the pool. In order to ensure item sampling adequacy, experts also suggested additional items not present in the initial pool.
This process resulted in a final pool of 24 items. An additional item was created for each of the constructs by providing a short conceptual definition of the construct and asking for a rating of the narrative on that dimension, for a total of 28 narrative believability pool items. Five trial narratives were developed for this study to provide test stimuli for evaluating the narrative believability pool items. The trial narratives were written based on the State v. Lawrence mock trial case file (Rothschild, Siemer, & Bocchino, 2004), which contains evidence and witness statements in a purse-snatching incident. A master prosecution case narrative was developed to be high in the four factors related to narrative believability: coverage, consistency, plausibility, and completeness. Using feature-based manipulations, four additional narratives were produced, each designed to be substantially lower than the master narrative on one of the four factors. In the first manipulation, narrative coverage was reduced by adding information to the narrative suggesting a likely perpetrator other than Mr. Lawrence, for whom no explanation was given. The second manipulation reduced narrative consistency by including contradictory information within the narrative relating to the eyewitness identification and the accused's supposed motive for the crime. The third manipulation reduced narrative plausibility by including information about Mr. Lawrence that provided a strong disincentive for him to have committed the crime. The final manipulation reduced narrative completeness by presenting the case as a series of bullet-pointed statements of evidence and direct quotations from the testimony of Mr. Lawrence, the victim, and one other witness in the case rather than in a narrative format. The final trial narratives ranged from 870 to 950 words and also contained a printed map of key locations related to the case. 
The panel of six experts reviewed the trial narratives and identified each as to the specific dimension they believed was manipulated. Percent agreement was 78.7% for the five narratives, and the fixed-marginal multirater kappa (Siegel & Castellan, 1988) was acceptable (κ = .73).

Method

Participants

Undergraduate students (N = 474) voluntarily completed the study for extra credit. Slightly more males (52.1%) completed the survey than females (47.7%). The mean age of participants was 19.6 years (SD = 3.1). For academic status, first years accounted for 34% of participants, sophomores for 16.2%, juniors for 24.9%, seniors for 24.5%, and the remaining .4% did not respond.

Stimuli

In addition to the five manipulated trial narratives, two story narratives from previous research were used in this study. The story narratives, “Murder at the Mall” (Nuland, 1994) and “Bubbles at the Mall,” were previously used in the development of the narrative transportation scale (Green & Brock, 2000). In “Murder,” a girl is horrified as her younger sister is stabbed to death by an escapee from a mental asylum, while “Bubbles” relates the story of a girl who watches as her younger sister is overcome by giggles while watching a clown's bubble blowing act. Use of these narratives enabled the assessment of the divergent validity of the NBS-12 with narrative transportation.

Procedure

Participants were randomly assigned to conditions in a 2 (order: story or trial narrative first) × 2 (story narrative: “Murder” or “Bubbles”) × 5 (trial narrative: master, low coverage, low consistency, low plausibility, or low completeness) design. Note that the cells are not of particular interest in this study, but randomization of these elements controls for any undesired order or narrative pairing effects.
For example, this study does not examine whether there is an ordering effect for participants who read story narratives before trial narratives, or vice versa, nor is it concerned with specific pairings of narratives (e.g., “Bubbles” with the low coverage trial narrative or “Murder” with the low completeness trial narrative). Participants were students enrolled in courses offered by the Department of Communication at a large university in the Midwestern United States. These courses represent a wide diversity of majors, as most departments at the university require at least one course in communication. At their discretion, instructors may offer up to 3% extra credit in their courses for research participation. Interested students visit the web-based research participation system to enroll in studies of their choosing.

Measures

After reading each narrative, participants completed the narrative believability item pool and the general items from the narrative transportation scale, which measures “absorption into a story” (Green, 2002, p. 2), to assess the relationship between believability and transportation. Finally, participants completed the Pinocchio circling task (Green & Brock, 2000) as a measure of narrative acceptance. Respondents were provided with a printed copy of the narrative they had read and were asked to circle “false notes,” that is, those parts of the story that seemed untrue or unbelievable. Participants were instructed that they could make as many or as few circles as they wished, and that circles could be as small as a single word or as large as an entire paragraph. Responses were scored in two ways: the total number of circles (Pinocchios), and the number of lines of text containing Pinocchios (lines). In past research, this task was found to be related to narrative transportation, with highly transported respondents drawing fewer Pinocchios through fewer lines of text (Green & Brock, 2000).
Results

Item analysis

Each of the 7-item subscales from the narrative believability item pool was analyzed using the ALPHAMAX macro in SPSS (Hayes, 2005) to identify the most efficient items for inclusion in the final measure. A final NBS-12 containing 12 items (three for each subscale) was determined to be the most parsimonious measure of the story model constructs. Table 1 presents the final NBS-12 with means and standard deviations for responses to the story and trial narratives.

Table 1 Narrative Believability Scale (NBS)-12 Item Descriptive Statistics and Standardized Factor Loadings by Model (N = 474)

Indicator | Story narratives M | Story narratives SD | Trial narratives M | Trial narratives SD
--- | --- | --- | --- | ---
Plausibility | | | |
P1: I believe this story could be true. | 4.30 | 1.66 | 5.19 | 1.43
P2: This story was plausible. | 4.27 | 1.63 | 5.02 | 1.40
P3: This story seems to be true. | 4.11 | 1.61 | 4.78 | 1.39
Completeness | | | |
CM1: It was easy to follow the story from beginning to end. | 3.72 | 1.75 | 4.55 | 1.56
CM2: It was hard to follow this story. [a] | 3.93 | 1.88 | 4.65 | 1.68
CM3: If I were writing this story, I would have organized it differently. [a] | 3.49 | 1.83 | 3.99 | 1.64
Consistency | | | |
CN1: The information presented in this story was consistent. | 4.15 | 1.48 | 4.55 | 1.36
CN2: All of the facts in this story agreed with each other. | 4.21 | 1.43 | 3.85 | 1.58
CN3: The “consistency” of a story refers to the extent to which a story does not contradict itself or contradict other things you know to be true or false. How would you rate this story in terms of “consistency”? [b] | 4.15 | 1.50 | 4.48 | 1.40
Coverage | | | |
CV1: There was important information missing from this story. [a] | 4.15 | 1.53 | 3.84 | 1.51
CV2: There were lots of “holes” in this story. [a] | 4.06 | 1.51 | 3.87 | 1.52
CV3: The “coverage” of a story refers to the extent to which the story accounts for all of the information presented in the story. How would you rate this story in terms of “coverage”? [b] | 4.25 | 1.55 | 4.62 | 1.24

[a] Item should be reverse-scored.
[b] Seven-point scale from Very Low to Very High.
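The scoring implied by Table 1 can be sketched in a few lines of Python. This is a minimal illustration, not the author's scoring procedure: the article states that the flagged items are reverse-scored on a 7-point scale, but it does not say how item responses are combined. Averaging the three items of each subscale and all 12 items for the full scale is an assumption (suggested by the reported scale means falling between 1 and 7), and the function name `score_nbs12` is hypothetical.

```python
from statistics import mean

# Item IDs and subscale membership follow Table 1.
SUBSCALES = {
    "plausibility": ["P1", "P2", "P3"],
    "completeness": ["CM1", "CM2", "CM3"],
    "consistency": ["CN1", "CN2", "CN3"],
    "coverage": ["CV1", "CV2", "CV3"],
}
# Items flagged for reverse scoring in the table notes.
REVERSED = {"CM2", "CM3", "CV1", "CV2"}
SCALE_MIN, SCALE_MAX = 1, 7


def score_nbs12(responses):
    """Return subscale and full-scale scores for one respondent.

    ASSUMPTION: scores are item means; the article does not state
    whether means or sums were used.
    """
    recoded = {}
    for items in SUBSCALES.values():
        for item in items:
            if item not in responses:
                raise ValueError(f"missing response for item {item}")
            value = responses[item]
            if not SCALE_MIN <= value <= SCALE_MAX:
                raise ValueError(f"{item}={value} is outside the 1-7 range")
            # Reverse scoring maps 1 to 7, 2 to 6, and so on.
            if item in REVERSED:
                value = SCALE_MIN + SCALE_MAX - value
            recoded[item] = value

    scores = {
        name: mean(recoded[item] for item in items)
        for name, items in SUBSCALES.items()
    }
    scores["nbs12"] = mean(recoded.values())
    return scores


# Hypothetical respondent, for illustration only.
example = {
    "P1": 7, "P2": 6, "P3": 5,
    "CM1": 6, "CM2": 2, "CM3": 3,
    "CN1": 5, "CN2": 4, "CN3": 6,
    "CV1": 3, "CV2": 2, "CV3": 5,
}
print(score_nbs12(example))
```

Reverse scoring is applied before averaging so that higher values always indicate greater believability on every item.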
Calculation of scale and subscale reliability for story narrative responses established a full scale Cronbach's alpha of .91 (M = 4.07, SD = 1.15) and subscale Cronbach alphas of .87 for plausibility (M = 4.23, SD = 1.45), .85 for completeness (M = 3.72, SD = 1.59), .81 for consistency (M = 4.17, SD = 1.25), and .78 for coverage (M = 4.15, SD = 1.27). Responses to trial narratives achieved a full scale Cronbach's alpha of .88 (M = 4.45, SD = .98), and subscale Cronbach alphas of .81 for plausibility (M = 5.00, SD = 1.20), .82 for completeness (M = 4.40, SD = 1.40), .78 for consistency (M = 4.30, SD = 1.21), and .72 for coverage (M = 4.11, SD = 1.14).

Reliability of the narrative transportation scale from participants who read the story narratives was in line with previous findings (α = .80, M = 42.78, SD = 11.03). Since only the general items from the scale were used, the theoretical range for the scale was 11–77. Actual responses ranged from 11 to 68. For those who read the trial narratives, reliability was slightly lower (α = .72, M = 42.77, SD = 8.86). An independent groups t test comparing scores on the narrative transportation scale for the story narratives revealed significantly higher transportation for “Murder” (n = 235, M = 48.57, SD = 9.21) than for “Bubbles” (n = 239, M = 37.08, SD = 9.62), t(472) = 13.28, p < .001, d = 1.22. This was also consistent with previous findings (Green & Brock, 2000).

Approximately .6% of the data were missing and were replaced using the poststratum marginal mean value (Sande, 1982). Seven of the items were skewed and all of the items exhibited significant kurtosis. As a set, the NBS-12 items exhibited significant multivariate nonnormality, skewness = 10.60, z score = 12.98, p < .001, and kurtosis = 206.09, z score = 13.58, p < .001.

Confirmatory factor analysis

Responses to the NBS-12 were analyzed using LISREL 8.80 for CFA. Because the data were nonnormal, CFA was conducted using the asymptotic covariance matrix.
Thus, a Satorra–Bentler (S–B) χ² is reported, as it adjusts for nonnormal distributions (Satorra & Bentler, 2010). The root mean square error of approximation (RMSEA; Nevitt & Hancock, 2000), the comparative fit index (CFI; Bentler, 1990), and the nonnormed fit index (NNFI; Bentler & Bonett, 1980) were used to evaluate model fit, as these indices restrict random variation with varying methods of estimation, sample sizes, and model misspecification (Fan, Thompson, & Wang, 1999). Hu and Bentler (1999) suggest good model fit is indicated by values greater than .95 for the CFI and NNFI, and values less than or equal to .06 for the RMSEA.

On the basis of prior evidence and theory specifying the relationship between coverage and coherence (plausibility, consistency, and completeness), a two-factor measurement model was specified. The two-factor model exhibited poor fit, S–B χ²(53, N = 474) = 528.88, p < .001, RMSEA = .14 (90% CI: .13–.15), NNFI = .92, CFI = .93. As the story model explicates three sub-dimensions of the latent variable of Coherence, a second model was specified in which each three-item coherence subscale loaded on its own latent variable (plausibility, completeness, and consistency), which, together with coverage, produced a four-factor model. The four-factor model exhibited good fit, S–B χ²(48, N = 474) = 104.14, p < .001, RMSEA = .05 (90% CI: .037–.063), NNFI = .99, CFI = .99. Figure 1 presents model specification and factor loadings for both models.

Figure 1 Two-factor model and four-factor model of the relationship between constructs measured by the Narrative Believability Scale (NBS)-12. Single arrowhead paths indicate standardized regression coefficients and double arrowhead arcs indicate inter-factor correlations. All factor loadings are significant at p < .001.

Study 2: Criterion-related and construct validity tests

Although Study 1 established the strong psychometric properties of the NBS-12 and provided evidence of the expected factor structure, it did not evaluate the measure's validity. In Study 2, the data collected in Study 1 were examined to assess the ability of the NBS-12 subscale scores to differentiate between the manipulated trial narratives. The association between NBS-12 scores and a measure of narrative acceptance, the Pinocchio circling task, was also tested. As both the NBS-12 and the Pinocchio circling task measure acceptance of a narrative, the measures should be significantly correlated (H1).

Criterion-related validity

The criterion-related validity of the NBS-12 was evaluated by testing the ability of the subscale scores to differentiate between the master trial narrative and the four trial narratives that were manipulated to be low in plausibility, completeness, consistency, and coverage. A series of independent groups t tests revealed the expected differences: on the plausibility subscale between the master (n = 96, M = 4.87, SD = 1.12) and the low plausibility narrative (n = 96, M = 3.88, SD = 1.37), t(190) = 5.49, p < .001, d = .80; on the completeness subscale between the master (n = 96, M = 4.72, SD = 1.48) and the low completeness narrative (n = 91, M = 3.92, SD = 1.27), t(185) = 3.91, p < .001, d = .58; on the consistency subscale between the master (n = 96, M = 5.43, SD = 1.02) and the low consistency narrative (n = 95, M = 5.11, SD = 1.07), t(189) = 2.11, p = .04, d = .31; and on the coverage subscale between the master (n = 96, M = 4.68, SD = 1.08) and the low coverage narrative (n = 96, M = 4.02, SD = 1.07), t(190) = 4.20, p < .001, d = .62.
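The t and d values reported above can be approximately recovered from the summary statistics alone. The sketch below assumes a pooled-variance (Student) independent groups t test and Cohen's d computed with the pooled standard deviation; the article does not state which formulas were used, and the function names are hypothetical.

```python
import math


def pooled_sd(n1, sd1, n2, sd2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(
        ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    )


def t_and_d(n1, m1, sd1, n2, m2, sd2):
    """Return (t, df, d) from group sizes, means, and SDs.

    ASSUMPTION: equal-variance t test and pooled-SD Cohen's d.
    """
    sp = pooled_sd(n1, sd1, n2, sd2)
    diff = m1 - m2
    t = diff / (sp * math.sqrt(1 / n1 + 1 / n2))
    d = diff / sp
    return t, n1 + n2 - 2, d


# Plausibility subscale: master vs. low plausibility narrative,
# using the n, M, and SD values reported in the text.
t, df, d = t_and_d(96, 4.87, 1.12, 96, 3.88, 1.37)
print(f"t({df}) = {t:.2f}, d = {d:.2f}")
```

For the plausibility comparison this gives roughly t(190) = 5.48 and d = 0.79, close to the reported 5.49 and .80. A gap of this size is what one would expect when recomputing from means and standard deviations that were rounded to two decimals.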
The ability of each subscale to differentiate between the stimulus stories was verified by conducting t tests between all pairs of master and manipulated narratives. Given the relationships between subscales identified in the CFA model, it is not surprising that all four NBS-12 subscales significantly differentiated between the master and each of the manipulated narratives, except that the completeness subscale did not significantly differentiate the master from the low coverage and low consistency narratives. Importantly, both the coverage and the plausibility subscales exhibited the most pronounced differences between the master and the low coverage and plausibility narratives, respectively. Although the coverage subscale slightly outperformed the consistency subscale for differentiating between the master and low consistency narrative, the mean differences between the subscale scores were not significantly different. Similarly, although both the consistency and coverage subscales slightly outperformed the completeness subscale at differentiating between the master and low completeness narrative, the mean differences between the subscale scores were not significantly different. This ability to differentiate between feature-based manipulations of the constructs within the trial narratives provides preliminary evidence of the criterion-related validity of the NBS-12 subscales.

An independent groups t test comparing scores on the NBS-12 for the story narratives revealed significantly higher overall believability for “Murder” (n = 235, M = 4.66, SD = .99) than for “Bubbles” (n = 239, M = 3.48, SD = .99), t(472) = 12.92, p < .001, d = 1.19. Table 2 presents the Pearson correlations, means, and standard deviations for the scale and subscale scores on responses to the trial and story narratives.

Table 2 Descriptive Statistics and Pearson Correlations for Scales and Subscales (Trial and Story Narratives)

Trial Narratives

 | Trans | NBS | Plaus | Comp | Cons | Cov | Pin | Lines
--- | --- | --- | --- | --- | --- | --- | --- | ---
Transportation | | .33* | .26* | .39* | .24* | .12** | −.06 | −.10
NBS | | | .76* | .76* | .84* | .82* | −.21* | −.22*
Plausibility | | | | .36* | .60* | .47* | −.18** | −.20**
Completeness | | | | | .44* | .51* | −.11 | −.08
Consistency | | | | | | .63* | −.20** | −.21*
Coverage | | | | | | | −.19** | −.23**
Pinocchios | | | | | | | | .78*
M | 42.69 | 4.45 | 5.00 | 4.40 | 4.29 | 4.11 | 5.63 | 8.79
SD | 8.89 | .98 | 1.20 | 1.40 | 1.21 | 1.14 | 3.77 | 5.67

Story Narratives

 | Trans | NBS | Plaus | Comp | Cons | Cov | Pin | Lines
--- | --- | --- | --- | --- | --- | --- | --- | ---
Transportation | | .58* | .41* | .54* | .50* | .46* | −.24* | −.33*
NBS | | | .77* | .85* | .86* | .84* | −.33* | −.38*
Plausibility | | | | .47* | .55* | .51* | −.33* | −.40*
Completeness | | | | | .67* | .65* | −.28* | −.29*
Consistency | | | | | | .67* | −.23* | −.31*
Coverage | | | | | | | −.29* | −.29*
Pinocchios | | | | | | | | .70*
M | 42.78 | 3.97 | 4.10 | 3.90 | 4.20 | 3.80 | 8.35 | 7.15
SD | 11.03 | 1.09 | 1.28 | 1.36 | 1.17 | 1.19 | 7.15 | 14.50

Note: N = 474 for correlations between narrative transportation, the full NBS-12 scale, and the subscales of plausibility, completeness, consistency, and coverage. N = 235 for correlations including Pinocchios and lines. OR = odds ratio; NBS = Narrative Believability Scale. * p ≤ .001. ** p ≤ .01.
Construct validity

Construct validity is concerned with the relationship between variables as specified by theory, and is of particular importance in scale development for constructs for which there are no existing validated measures (Cronbach & Meehl, 1955). As the story model posits that plausibility, completeness, consistency, and coverage influence the acceptability of a narrative (Pennington & Hastie, 1991), construct validity of the NBS-12 may be established by the presence of an association between NBS-12 subscale scores and other measures of narrative acceptance such as the Pinocchio circling task (Green & Brock, 2000). For the story narratives, all subscales of the NBS-12 were negatively related to the number of Pinocchios circled: participants who rated a narrative as more believable marked fewer passages as ringing false (Table 2). The results were similar for the trial narratives, with the exception that the relationship between the completeness subscale and the number of Pinocchios failed to reach statistical significance (Table 2). Only half of the participants completed the Pinocchio circling task because of another variable reported elsewhere. Overall, correlations between the two measures of story acceptance (Pinocchio circling and the NBS-12) were stronger for the story narratives than for the trial narratives. The convergence of the measures provides preliminary evidence for the construct validity of the NBS-12 (support for H1).

Study 3

Studies 1 and 2 established the strong psychometric properties of the NBS-12 and provided preliminary evidence for the criterion-related and construct validity of the measure. Study 3 was designed to examine the relationship between believability and verdicts rendered in a legal context to provide evidence of the predictive validity of the NBS-12. As the story model was originally conceived within the domain of juror decision making, this context is particularly appropriate for an initial test of the measure.
Certainty principles in the story model

On the basis of the story model's specified relationships between juror verdicts and verdict confidence and the certainty principles of coverage and coherence, the NBS-12 subscales should function similarly to confirm predictive validity. Specifically, the coverage subscale scores should predict juror verdicts (H2) and juror confidence in the selected verdict (H3). Together, a story's coherence (consistency, plausibility, and completeness) functions similarly to story coverage in that more coherent stories are more acceptable, and once accepted, jurors are more confident in the decision (Pennington & Hastie, 1991, 1992, 1993). As the NBS-12 measures each of these components of narrative coherence separately, it is of interest to understand how the scores on consistency, plausibility, and completeness relate to juror verdicts (RQ1), and to juror confidence in the selected verdict (RQ2). Apart from the ability to predict verdicts and verdict confidence independently, the efficacy of the measure may be tested by determining the predictive validity of the NBS-12 subscales over and above the influence of other relevant measures, such as bias against plaintiffs or corporate defendants (Peck, 2004), narrative transportation (Green & Brock, 2000), and perceptions of attorney competence and trustworthiness (McCroskey & Teven, 1999).

Method

Sample

One hundred forty-nine male and 120 female undergraduate students (N = 269) enrolled in a Department of Communication class at a large Midwestern university in the United States voluntarily completed the study for course extra credit. Only participants who were eligible to serve on a jury in the United States (at least 18 years of age, a United States citizen, and not convicted of a felony) were used for this study. The mean age of participants was 20.3 years (SD = 1.5).
Procedure

Participants were randomly assigned to conditions in a 2 (case: Sanchez or Williams) × 2 (presentation order: plaintiff or defense first) × 3 (plaintiff attorney: Attorney 1, 2, or 3) × 3 (defense attorney: Attorney 1, 2, or 3) between-subjects design. In order to preserve the adversarial nature of the trial simulation, the randomization process was such that the plaintiff and defense cases were always presented by different attorneys. For example, if Attorney 1 presented the plaintiff case, the defense case would be presented by Attorney 2 or 3. Note that these cells are not of particular interest in this study, but randomization of these elements was used to allow for greater generalizability of the results. These experimental factors were controlled for by including the variables in each of the regression analyses.

Upon arrival at the study computer lab, participants selected a computer and launched the study website. Next, they answered a set of preliminary questions confirming their jury-eligible status and collecting demographic information. Participants then completed the Corporate Litigation Bias Scale (Peck, 2004) and viewed web videos of the plaintiff and defense statements. After the videos, participants completed the narrative transportation scale (Green & Brock, 2000) and the NBS-12 in response to the plaintiff and defense statements individually. Finally, participants rendered verdicts in the case and completed credibility measures for each of the attorneys they viewed.

Stimuli

Three actors, two males and one female, aged 40–60 years, were video recorded delivering plaintiff and defense statements for two different civil cases involving allegations of medical negligence, Sanchez v. Southwest Hospital and Williams v. St. James Medical Center. The case statements were prepared by actual attorneys involved with the litigation for use in pretrial strategy research.
Similar to a closing argument, each trial statement outlined the timeline of events related to the case, summarized important evidence and witness testimony, and highlighted key arguments about how the evidence should be interpreted to arrive at a verdict from the perspective of the represented party. The names of the parties and witnesses in each case were changed to protect the confidentiality of the litigants. The video statements were each between 11 and 13 minutes long.

Measures

Prior to viewing the statement videos, participants completed the Corporate Litigation Bias Scale (Peck, 2004). After viewing the statement videos, participants completed the general items from the narrative transportation scale (Green & Brock, 2000) and the NBS-12 for both the plaintiff and defense narratives. Participants also responded to a verdict questionnaire that asked them to indicate a verdict (plaintiff or defense) and their confidence that the selected verdict was correct (measured from 1 = Not at all confident to 7 = Very confident). Finally, participants completed measures of competence and trustworthiness (McCroskey & Teven, 1999) for each attorney. Inclusion of these additional measures provides the ability to estimate the influence of scores on the NBS-12 subscales over and above some existing measures related to persuasiveness in a legal context.

Calculated variables

Reliability coefficients were calculated for all scales used to ensure the integrity of the measures with the particular stimuli employed in this research. Scale scores were calculated for all measures used in the study. Difference scores were calculated for the measures of transportation, competence, trustworthiness, coverage, completeness, plausibility, and consistency by subtracting the score for each measure on the plaintiff narrative from the score for each measure on the defense narrative.
For example, a participant rating plaintiff attorney competence at 6.5 and defense attorney competence at 4 would have a calculated difference score of −2.5. Use of the difference scores in the analysis provides a single metric of the magnitude and direction of the difference between scores on the plaintiff and defense case narratives. In this way, difference scores were negative for participants whose ratings reflected a preference for the plaintiff narrative and positive for those whose ratings reflected a preference for the defense narrative. Where used in the analysis, difference scores are noted by the suffix “Diff” following the subscale identifier. Similarly, absolute value difference scores were computed for each of these values by dropping the sign on all negative difference scores to provide a single metric of the magnitude of the difference between scores on the case narratives without respect to the particular narrative (plaintiff or defense) receiving more support. Where used in the analysis, absolute value difference scores are noted by the suffix “Diff Abs” following the subscale identifier.

Results

Scale reliabilities

All scales used in this study were found to have adequate internal consistency, ranging from a Cronbach's alpha of .72 (coverage subscale for plaintiff narrative) to .87 (completeness subscale for defense narrative).

CFA replication

The four-factor model identified in Study 2 was tested and once again exhibited good fit, S–B χ2(48, N = 269) = 84.59, p < .001, RMSEA = .05 (90% CI: .034–.072), NNFI = .99, CFI = .99.

Relationship between NBS-12 subscale difference scores and verdict

Hypothesis 2 posited that scores on the NBS-12 coverage subscale would predict juror verdicts, and research question 1 asked how the scores on consistency, plausibility, and completeness relate to juror verdicts.
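The difference scores and absolute value difference scores defined under Calculated variables can be illustrated with a minimal Python sketch. The function and variable names here are hypothetical and chosen only for illustration; the only things taken from the text are the direction of subtraction (defense minus plaintiff) and the worked competence example.

```python
def difference_score(plaintiff: float, defense: float) -> float:
    """Signed difference score: defense rating minus plaintiff rating.

    Negative values indicate a preference for the plaintiff narrative,
    positive values a preference for the defense narrative.
    """
    return defense - plaintiff


def absolute_difference_score(plaintiff: float, defense: float) -> float:
    """Magnitude of the difference, ignoring which narrative was preferred."""
    return abs(difference_score(plaintiff, defense))


# Worked example from the text: plaintiff attorney competence rated 6.5,
# defense attorney competence rated 4.
competence_diff = difference_score(plaintiff=6.5, defense=4.0)
competence_diff_abs = absolute_difference_score(plaintiff=6.5, defense=4.0)

print(competence_diff)      # -2.5 ("Competence Diff")
print(competence_diff_abs)  # 2.5  ("Competence Diff Abs")
```

The signed score carries both direction and magnitude and is the form used to predict verdicts; the absolute score carries magnitude only and is the form used to predict verdict confidence.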
The difference scores used in this analysis provide a metric of the magnitude and direction of the difference between scores on the plaintiff and defense case narratives for each of the NBS-12 subscales. Positive difference scores indicate a preference for the defense narrative while negative difference scores indicate a preference for the plaintiff narrative on each measured variable. According to the story model, participant verdicts should correspond to the narrative perceived as more believable.

A binary logistic regression analysis was performed to determine the relationship between the NBS-12 subscale difference scores and participant verdicts. The outcome variable verdict was coded as 0 = Defense and 1 = Plaintiff. Four predictor variables were used in the model: Coverage Diff, Completeness Diff, Consistency Diff, and Plausibility Diff. Variables were entered in a single block using forced entry. The model was statistically significant compared to the null model, χ2(4) = 201.30, p < .001. The strength of association between the NBS-12 subscale difference scores and a Plaintiff verdict was large, with Cox and Snell's R2 = .53 and Nagelkerke's R2 = .70. Overall, the logistic regression equation correctly classified 85.9% of cases as either plaintiff or defense verdicts (Table 3).

Because the Coverage subscale was not a significant predictor of verdicts when entered into the regression with the other NBS-12 subscales, another regression was conducted in which Coverage Diff was the lone predictor. Absent the influence of the other subscales, Coverage Diff performed as expected, χ2(1) = 123.94, p < .001, Cox and Snell's R2 = .37 and Nagelkerke's R2 = .49. Overall, the regression equation correctly classified 77.7% of cases as either plaintiff or defense verdicts.

Table 3 Summary of Logistic Regression Analysis Predicting Verdict from NBS-12 Subscales

Predictor            β        SE β    Wald χ2   df   p      OR
Constant             −.497    .201    6.138     1    .013   .608
Coverage Diff        −.138    .251    .301      1    .583   .871
Completeness Diff    .015     .153    .010      1    .922   1.015
Consistency Diff     −.606    .235    6.640     1    .010   .545
Plausibility Diff    −2.348   .425    30.519    1    .000   .096

Note: N = 269. Summary statistics are reported to three decimal places for accuracy. If the value of OR is greater than 1, it indicates that as the predictor increases, the odds of a Plaintiff verdict increase. If OR is less than 1, it indicates that as the predictor increases, the odds of a Plaintiff verdict decrease. OR = odds ratio; NBS = Narrative Believability Scale.
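The odds ratios in Table 3 follow from the logistic regression coefficients by the standard relationship OR = exp(β). The short Python sketch below recomputes them from the four subscale coefficients reported in the table; the dictionary and function names are illustrative assumptions, and because the published coefficients are rounded to three decimals, the recomputed values match the table only to within rounding.

```python
import math

# Subscale coefficients (beta) as reported in Table 3.
table3_coefficients = {
    "Coverage Diff": -0.138,
    "Completeness Diff": 0.015,
    "Consistency Diff": -0.606,
    "Plausibility Diff": -2.348,
}


def odds_ratio(beta: float) -> float:
    """Odds ratio for a one-unit increase in a logistic regression predictor."""
    return math.exp(beta)


# Verdict is coded 0 = Defense, 1 = Plaintiff, so an OR below 1 means that a
# one-unit shift toward the defense narrative lowers the odds of a Plaintiff
# verdict.
recomputed = {name: odds_ratio(beta) for name, beta in table3_coefficients.items()}

for name, value in recomputed.items():
    print(f"{name}: OR is approximately {value:.2f}")
```

Read this way, a one-unit increase in Plausibility Diff (toward the defense narrative) multiplies the odds of a Plaintiff verdict by roughly 0.1, which is why plausibility dominates the other subscales in the model.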
Hierarchical binary logistic regression analysis was used to determine the relationship between the NBS-12 subscale difference scores and participant verdicts over and above the influence of experimental factors and the predictive ability of other relevant measures. Sixteen predictor variables were used in the model (counting the two dummy variables for each attorney factor): Sex, Case, Case Order, Plaintiff Attorney (dummy coded), Defense Attorney (dummy coded), Plaintiff Bias, Corporate Bias, Transportation Diff, Competence Diff, Trustworthiness Diff, Coverage Diff, Completeness Diff, Consistency Diff, and Plausibility Diff. Variables were entered in three blocks using forced entry for each, as this method is required when testing theory due to the influence of random variation in stepwise techniques (Studenmund, 2010).

A test of the first block model (predictor variables: Sex, Case, Case Order, Plaintiff Attorney, and Defense Attorney) was statistically significant compared to the null model, χ2(7) = 16.45, p < .05. The strength of association between these predictors and a Plaintiff verdict was small, with Cox and Snell's R2 = .06 and Nagelkerke's R2 = .08. The second block (additional predictor variables: Plaintiff Bias, Corporate Bias, Transportation Diff, Competence Diff, and Trustworthiness Diff) produced a model which was statistically significant compared to the model containing only the first block, χ2(5) = 160.06, p < .001. The strength of association between these predictors and a Plaintiff verdict was large, with Cox and Snell's R2 = .48 and Nagelkerke's R2 = .64. The final block (additional predictor variables: Coverage Diff, Completeness Diff, Consistency Diff, and Plausibility Diff) resulted in a model which was statistically significant compared to the model containing the first two blocks, χ2(4) = 59.83, p < .001. The strength of association between these predictors and a Plaintiff verdict was large, with Cox and Snell's R2 = .59 and Nagelkerke's R2 = .78 (Table 4).
Table 4 Summary of Logistic Regression Analysis Predicting Verdict from Bias Measures and Difference Scores on Transportation, Credibility Dimensions, and NBS-12 Subscales

Predictor                 β        SE β     Wald χ2   df   p      OR
Constant                  −.075    2.163    .001      1    .972   .928
Block 1
  Sex                     .283     .494     .327      1    .568   1.327
  Case                    −.719    .489     2.165     1    .141   .487
  Case Order              1.161    .532     4.759     1    .029   3.192
  Plaintiff Attorneyᵃ                       1.908     2    .385
  Defense Attorneyᵃ                         .310      2    .857
Block 2
  Plaintiff Bias          −1.042   .496     4.414     1    .036   .353
  Corporate Bias          .764     .412     3.441     1    .064   2.146
  Transportation Diff     −.064    .040     2.595     1    .107   .938
  Competence Diff         −.551    .250     4.844     1    .028   .576
  Trustworthiness Diff    −.742    .212     12.269    1    .000   .476
Block 3
  Coverage Diff           −.049    .289     .028      1    .866   .952
  Completeness Diff       .049     .185     .071      1    .789   1.051
  Consistency Diff        −.457    .283     2.620     1    .106   .633
  Plausibility Diff       −2.271   .506     20.112    1    .000   .103

Note: N = 269. Summary statistics are for the final model in which all variables were entered and are reported to three decimal places for accuracy. If the value of OR is greater than 1, it indicates that as the predictor increases, the odds of a Plaintiff verdict increase. If OR is less than 1, it indicates that as the predictor increases, the odds of a Plaintiff verdict decrease. OR = odds ratio; NBS = Narrative Believability Scale.
ᵃ Categorical variable recoded into dummy variables in the analysis. Complete results: Defense Attorney 1 vs. 3, β = −.02, SE β = .67, Wald χ2 = .001, df = 1, p = .98, OR = 1.02; Defense Attorney 2 vs. 3, β = .32, SE β = .62, Wald χ2 = .26, df = 1, p = .61, OR = 1.37; Plaintiff Attorney 1 vs. 3, β = .89, SE β = .66, Wald χ2 = 1.84, df = 1, p = .18, OR = 2.44; Plaintiff Attorney 2 vs. 3, β = .29, SE β = .68, Wald χ2 = .18, df = 1, p = .67, OR = 1.33.

Examination of the regression results reveals that Hypothesis 2 was supported only when Coverage Diff was entered as the lone predictor; Coverage Diff was not a significant predictor of Verdict in either model that included the other NBS-12 subscales.
Although large differences were observed by Verdict between scores on Completeness Diff [Defense M = .24, SD = 1.68, Plaintiff M = −1.39, SD = 1.66, t(267) = 8.06, p < .001, d = .98], Consistency Diff [Defense M = .69, SD = 1.20, Plaintiff M = −1.18, SD = 1.40, t(267) = 11.75, p < .001, d = 1.44], and Plausibility Diff [Defense M = .79, SD = 1.12, Plaintiff M = −1.27, SD = 1.26, t(267) = 14.10, p < .001, d = 1.73], only Plausibility Diff was found to predict Verdict over and above the influence of the other factors.

Relationship between NBS-12 subscales and verdict confidence

Hypothesis 3 posited that scores on the NBS-12 coverage subscale would predict juror confidence in the selected verdict. Research question 2 asked how scores on consistency, plausibility, and completeness relate to juror confidence in the selected verdict. Absolute value difference scores were used in this analysis to provide a metric of the magnitude of the difference between the two narratives on the NBS-12 subscales without respect to which story was preferred. A large absolute value difference score would indicate that a participant viewed one narrative as considerably more believable than the alternative narrative. According to the story model, participant confidence in the selected verdict should be highest when there is a large perceived difference between the plaintiff and defense narratives on the dimensions of coverage, consistency, completeness, and plausibility.

Multiple regression analysis was performed to determine the relationship between absolute value difference scores on the NBS-12 subscales and verdict confidence.
The regression equation was statistically significant, R = .39, R2 = .15, adjusted R2 = .14, F(4, 264) = 12.05, p < .001, as approximately 14% of the variance in verdict confidence was accounted for by completeness (β = −.10, SE = .05, p = .044), consistency (β = .23, SE = .07, p = .002), and plausibility (β = .18, SE = .07, p = .007), but not coverage (β = .06, SE = .07, p = .45). As coverage did not perform as expected, another regression was conducted in which Coverage Diff Abs was the only predictor of verdict confidence. When entered alone, Coverage Diff Abs accounted for approximately 7% of the variance in verdict confidence, R = .27, R2 = .07, adjusted R2 = .07, F(1, 267) = 20.94, p < .001, with coefficient β = .25, SE = .06, p < .001. A second multiple regression analysis was performed to determine the relationship between absolute value difference scores on the NBS-12 subscales and verdict confidence over and above the influence of experimental factors and the predictive ability of other relevant measures. Variables were entered in three blocks: Block 1 contained demographics and variables related to the randomized experimental conditions: Sex, Case, Case Order, Plaintiff Attorney (dummy coded), and Defense Attorney (dummy coded). Block 2 contained other measures thought to be relevant to verdicts in medical malpractice litigation: Plaintiff Bias, Corporate Bias, and absolute value difference scores for Transportation, Competence, and Trustworthiness. Block 3 contained the absolute value difference scores for the NBS-12 subscales: Coverage Diff Abs, Completeness Diff Abs, Consistency Diff Abs, and Plausibility Diff Abs. The overall regression with all sixteen predictor variables was statistically significant, R = .46, R2 = .21, adjusted R2 = .16, F(16, 252) = 4.19, p < .001, indicating that approximately 16% of the variance in verdict confidence is accounted for by these variables (Table 5). 
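The block-wise tests in this hierarchical regression rest on the standard F test for a change in R2 between nested models: the gain in R2 per added predictor is divided by the unexplained variance per residual degree of freedom in the larger model. The Python sketch below applies that formula to the rounded R2 values reported in Table 5; the function name is hypothetical, and because the inputs are rounded, the recomputed statistics agree with the reported ones only approximately.

```python
def r2_change_f(r2_change: float, r2_full: float,
                n_added: int, df_residual: int) -> float:
    """F statistic for the R-squared gained by adding a block of predictors.

    r2_change   -- increase in R-squared when the block is added
    r2_full     -- R-squared of the model that includes the block
    n_added     -- number of predictors in the added block
    df_residual -- residual degrees of freedom of the model with the block
    """
    return (r2_change / n_added) / ((1.0 - r2_full) / df_residual)


# Block 2 (five added predictors): reported R2 change F(5, 256) = 4.82.
f_block2 = r2_change_f(r2_change=0.082, r2_full=0.131, n_added=5, df_residual=256)

# Block 3 (four NBS-12 subscales): reported R2 change F(4, 252) = 6.28.
f_block3 = r2_change_f(r2_change=0.079, r2_full=0.210, n_added=4, df_residual=252)

print(round(f_block2, 1), round(f_block3, 1))
```

The Block 3 statistic is the quantity of interest for the NBS-12: it tests whether the four subscales explain variance in verdict confidence beyond the experimental factors, bias measures, transportation, and attorney credibility.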
Table 5 Hierarchical Regression Predicting Verdict Confidence from Bias Measures and Absolute Value Difference Scores on Transportation, Credibility Dimensions, and NBS-12 Subscales

Predictors                      β        SE β   R2     R2Adj   Summary of Model R2 and R2 Changes
Constant                        3.45     .63
Block 1                                         .05    .02
  Sex                           .11      .12                   Model R2 = .050
  Case                          .24***   .12                   Model F(7, 261) = 1.95
  Case Order                    −.20     .12                   R2Change = .050
  Plaintiff Attorney 1 vs. 3ᵃ   .28      .16                   R2Change F(7, 261) = 1.95
  Plaintiff Attorney 2 vs. 3ᵃ   .03      .16
  Defense Attorney 1 vs. 3ᵃ     .27      .16
  Defense Attorney 2 vs. 3ᵃ     .11      .16
Block 2                                         .13*   .09
  Plaintiff Bias                .18***   .09                   Model R2 = .131
  Corporate Bias                .01      .09                   Model F(12, 256) = 2.89*
  Transportation Diff Abs       .00      .01                   R2Change = .082
  Competence Diff Abs           .02      .07                   R2Change F(5, 256) = 4.82*
  Trustworthiness Diff Abs      .03      .05
Block 3                                         .21*   .16
  Coverage Diff Abs             .05      .07                   Model R2 = .210
  Completeness Diff Abs         −.08     .05                   Model F(16, 252) = 4.19*
  Consistency Diff Abs          .18***   .07                   R2Change = .079
  Plausibility Diff Abs         .16***   .07                   R2Change F(4, 252) = 6.28*

Note: N = 269. Coefficients (β) and standard errors (SE) are for the final model in which all variables were entered. R2 and Adjusted R2 represent the amount of variance explained by all of the blocks included up to that point.
ᵃ Categorical variable recoded into dummy variables in the analysis.
* p < .001. *** p < .05.

Examination of the multiple regression analysis results reveals that Hypothesis 3 was also confirmed, but only when Coverage Diff Abs was entered as the lone predictor variable, similar to the previous analysis. Significant zero-order correlations were observed between Confidence and Consistency Diff Abs [r(267) = .34, p < .001], and between Confidence and Plausibility Diff Abs [r(267) = .32, p < .001]. Both consistency and plausibility predicted variance in Confidence over and above the influence of experimental factors and other related measures.

Discussion

Consistent with expectations, narrative coverage was a significant predictor of verdict and verdict confidence, but only when entered as the lone independent variable. In each analysis including the other subscales, experimental factors, and related measures, the influence of the coverage subscale was subsumed by the predictive power of the additional factors. There are several reasonable explanations for this result. It is possible that the NBS-12 fails to capture some important variance in the construct of narrative coverage.
It may be that the particular stimuli used in this research failed to vary enough in terms of coverage. It may also be the case that evidence coverage simply is not a certainty principle. This possibility was first raised by Weinstock and Flaton (2004), who found empirical support for the idea that evidence coverage is a function of each juror's argument skill rather than a certainty principle. They found that jurors with greater argument skill include more of the evidence when explaining their verdict. The very small values of the standardized coefficients (β) for coverage in both of the regression analyses in this study provide further empirical support for this argument. More recently, Weinstock (2011) found that jurors' manner of justifying a verdict choice using either narrative structures or relational argument structures was fairly consistent across cases and was related to juror epistemological orientation (Weinstock & Cronin, 2003) and argument skill. This result suggests that cognitive factors may exert influence on perceived coverage and coherence, reflected not only in verdict choice and confidence, but also in responses to the NBS-12.

The results are more favorable for narrative coherence, as both the consistency and plausibility subscales predicted significant variance in verdict outcomes. Yet the story model posits that all three components of coherence, including completeness, should influence verdict outcomes. Future research should explore whether context moderates these relationships.

This study had several limitations. The use of jury-eligible college students for this study limits the extent to which claims about the accuracy of the story model can be made.
Bornstein's (1999) meta-analysis suggested few differences exist in the decisions made by college students and jury-eligible adults in trial simulation research, but it is possible that a lack of life experience in college-age individuals allows for fewer comparison points for assessing the coverage and coherence of a story. A test of the NBS-12 with an adult sample is needed. Beyond sample concerns, it is possible that participants in Study 3 decided which side they would support and responded to the NBS-12 items accordingly. Future research should manipulate the dimensions of believability and test the direct influence of narrative believability on decisions to rule out this possibility. Finally, the results of the criterion-related validity tests raise significant concerns about the validity of the consistency and completeness subscales. These results may reflect the difficulty of producing stimulus narratives that differ on one dimension of believability without a concomitant influence on the other dimensions. However, the consistency and completeness subscales should be subjected to further criterion-related validity tests in future research.

In spite of these concerns related to criterion-related validity, the strong empirical support for the reliability and predictive validity of the NBS-12 highlights the potential contributions of the scale. This instrument is of practical import for research in narrative influence. As a validated measure, the NBS-12 will facilitate future studies aimed at isolating the mechanisms of narrative-based belief change. In a larger sense, this project provides an additional tool to aid the expansion of scholarly understanding of the characteristics of narratives that differentiate their abilities to influence beliefs and motivate human action.
The measure will open up new areas of investigation within the story model framework itself by allowing the use of competing narratives rather than manipulated versions of the same narrative, creating more possibilities for generating testable hypotheses. Finally, the instrument may be immensely practical for attorneys and trial consultants, as it will identify the elements of a given trial narrative that most support or undermine perceptions of its veracity, allowing the case theory to be strengthened prior to trial.

There are several theoretical implications of this study for scholars studying narrative-based belief change. First, the results of this research strongly suggest that narrative believability is a construct related to narrative acceptance and influence. Narratives that are perceived as more believable are more likely to influence behavior than less believable narratives. This study suggests that the NBS-12 and its component subscales will be useful for all kinds of narrative influence research. Specifically, future research testing the predictive validity of the NBS-12 in contexts such as health, family, and purchasing behavior would be fruitful.

Acknowledgment
I would like to thank Jakob D. Jensen for his insightful guidance in the research process and the preparation of this paper.

References
Adaval, R., Isbell, L. M., & Wyer, R. S. (2007). The impact of pictures on narrative- and list-based impression formation: A process interference model. Journal of Experimental Social Psychology, 43(3), 352–364. doi: 10.1016/j.jesp.2006.04.005
Bennett, W. L., & Feldman, M. S. (1981). Reconstructing reality in the courtroom. New Brunswick, NJ: Rutgers University Press.
Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107(2), 238–246.
doi: 10.1037/0033-2909.107.2.238
Bentler, P. M., & Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychological Bulletin, 88(3), 588–606. doi: 10.1037/0033-2909.88.3.588
Bornstein, B. H. (1999). The ecological validity of jury simulations: Is the jury still out? Law and Human Behavior, 23(1), 75–91. doi: 10.1023/A:1022326807441
Busselle, R., & Bilandzic, H. (2008). Fictionality and perceived realism in experiencing stories: A model of narrative comprehension and engagement. Communication Theory, 18(2), 255–280. doi: 10.1111/j.1468-2885.2008.00322.x
Busselle, R., & Bilandzic, H. (2009). Measuring narrative engagement. Media Psychology, 12(4), 321–347. doi: 10.1080/15213260903287259
Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52(4), 281–302. doi: 10.1037/h0040957
Dahlstrom, M. F. (2010). The role of causality in information acceptance in narratives: An example from science communication. Communication Research, 37(6), 857–875. doi: 10.1177/0093650210362683
DeVellis, R. F. (2003). Scale development: Theory and applications (2nd ed.). Thousand Oaks, CA: Sage.
Diamond, S. S., & Rose, M. R. (2005). Real juries. Annual Review of Law and Social Science, 255–284. doi: 10.1146/annurev.lawsocsci.1.041604.120002
Einhorn, H. J., & Hogarth, R. M. (1986). Judging probable cause. Psychological Bulletin, 99(1), 3–19.
doi: 10.1037/0033-2909.99.1.3
Fan, X., Thompson, B., & Wang, L. (1999). Effects of sample size, estimation methods, and model specification on structural equation modeling fit indexes. Structural Equation Modeling, 6(1), 56. doi: 10.1080/10705519909540119
Gerrig, R. J., & Prentice, D. A. (1991). The representation of fictional information. Psychological Science, 2(5), 336–340. doi: 10.1111/j.1467-9280.1991.tb00162.x
de Graaf, A., Hoeken, H., Sanders, J., & Beentjes, J. W. J. (2011). Identification as a mechanism of narrative persuasion. Communication Research. doi: 10.1177/0093650211408594
Green, M. C. (2002). Narrative worlds, real impact: How stories affect beliefs. IGEL 2002 Proceedings. Retrieved from http://www.arts.ualberta.ca/igel/IGEL2002/Proceedings.htm
Green, M. C. (2006). Narratives and cancer communication. Journal of Communication, 56, S163–S183. doi: 10.1111/j.1460-2466.2006.00288.x
Green, M. C., & Brock, T. C. (2000). The role of transportation in the persuasiveness of public narratives. Journal of Personality and Social Psychology, 79(5), 701–721. doi: 10.1037/0022-3514.79.5.701
Hannaford, P. L., Hans, V. P., Mott, N. L., & Munsterman, G. T. (2000). The timing of opinion formation by jurors in civil cases: An empirical examination. Tennessee Law Review, 67, 627–652.
Hayes, A. F. (2005, November). A computational tool for survey shortening applicable to composite attitude, opinion, and personality measurement scales. Paper presented at the Meeting of the Midwestern Association for Public Opinion Research, Chicago, IL.
Hu, L., & Bentler, P. M. (1999).
Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6(1), 1. doi: 10.1080/10705519909540118
Huntley, J. E., & Costanzo, M. (2003). Sexual harassment stories: Testing a story-mediated model of juror decision-making in civil litigation. Law and Human Behavior, 27(1), 29–51. doi: 10.1023/A:1021674811225
Jensen, J. D., Bernat, J. K., Wilson, K. M., & Goonewardene, J. (2011). The delay hypothesis: The manifestation of media effects over time. Human Communication Research, 37(4), 509–528. doi: 10.1111/j.1468-2958.2011.01415.x
McCroskey, J. C., & Teven, J. J. (1999). Goodwill: A reexamination of the construct and its measurement. Communication Monographs, 66(1), 90–103. doi: 10.1080/03637759909376464
Moyer-Gusé, E., Chung, A. H., & Jain, P. (2011). Identification with characters and discussion of taboo topics after exposure to an entertainment narrative about sexual health. Journal of Communication, 61(3), 387–406. doi: 10.1111/j.1460-2466.2011.01551.x
Murphy, S. T., Frank, L. B., Moran, M. B., & Patnoe-Woodley, P. (2011). Involved, transported, or emotional? Exploring the determinants of change in knowledge, attitudes, and behavior in entertainment-education. Journal of Communication, 61(3), 407–431. doi: 10.1111/j.1460-2466.2011.01554.x
Nevitt, J., & Hancock, G. R. (2000). Improving the root mean square error of approximation for nonnormal conditions in structural equation modeling. The Journal of Experimental Education, 68(3), 251–268. doi: 10.1080/00220970009600095
Nuland, S. B.
(1994). How we die. New York, NY: Knopf.
Peck, M. J. (2004). Construction of the corporate litigation bias scale (CLBS): Measuring civil juror attitudes toward corporate defendants and individual plaintiffs. Dissertation Abstracts International, UMI No. AAT3153978.
Pennington, N., & Hastie, R. (1988). Explanation-based decision making: Effects of memory structure on judgment. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14(3), 521–533. doi: 10.1037/0278-7393.14.3.521
Pennington, N., & Hastie, R. (1991). A cognitive theory of juror decision making: The story model. Cardozo Law Review, 13, 519–557.
Pennington, N., & Hastie, R. (1992). Explaining the evidence: Tests of the story model for juror decision making. Journal of Personality and Social Psychology, 62(2), 189–206. doi: 10.1037/0022-3514.62.2.189
Pennington, N., & Hastie, R. (1993). Reasoning in explanation-based decision making. Cognition, 49(1–2), 123–163. doi: 10.1016/0010-0277(93)90038-w
Rideout, J. C. (2008). Storytelling, narrative rationality, and legal persuasion. The Journal of the Legal Writing Institute, 14, 53–86.
Rothschild, F. D., Siemer, D. C., & Bocchino, A. J. (2004). State v. Lawrence (2nd ed.). Louisville, CO: National Institute for Trial Advocacy.
Sande, I. G. (1982). Imputation in surveys: Coping with reality. The American Statistician, 36(3), 145–152.
Satorra, A., & Bentler, P. (2010). Ensuring positiveness of the scaled difference chi-square test statistic. Psychometrika, 75(2), 243–248.
doi: 10.1007/s11336-009-9135-y
Siegel, S., & Castellan, N. J. (1988). Nonparametric statistics for the behavioral sciences (2nd ed.). New York, NY: McGraw-Hill.
Slater, M. D., & Rouner, D. (2002). Entertainment-education and elaboration likelihood: Understanding the processing of narrative persuasion. Communication Theory, 12(2), 173–191. doi: 10.1111/j.1468-2885.2002.tb00265.x
Studenmund, A. H. (2010). Using econometrics: A practical guide (6th ed.). Englewood Cliffs, NJ: Prentice Hall.
Tal-Or, N., Boninger, D. S., Poran, A., & Gleicher, F. (2004). Counterfactual thinking as a mechanism in narrative persuasion. Human Communication Research, 30(3), 301–328. doi: 10.1111/j.1468-2958.2004.tb00734.x
Weinstock, M. (2011). Knowledge-telling and knowledge-transforming arguments in mock jurors' verdict justifications. Thinking and Reasoning, 17(3), 282–314. doi: 10.1080/13546783.2011.575191
Weinstock, M., & Cronin, M. A. (2003). The everyday production of knowledge: Individual differences in epistemological understanding and juror-reasoning skill. Applied Cognitive Psychology, 17(2), 161–181. doi: 10.1002/acp.860
Weinstock, M., & Flaton, R. (2004). Evidence coverage and argument skills: Cognitive factors in a juror's verdict choice. Journal of Behavioral Decision Making, 17, 191–212.
doi: 10.1002/bdm.470
© 2013 International Communication Association
TI - Measuring Narrative Believability: Development and Validation of the Narrative Believability Scale (NBS-12) JF - Journal of Communication DO - 10.1111/jcom.12035 DA - 2013-06-01 UR - https://www.deepdyve.com/lp/oxford-university-press/measuring-narrative-believability-development-and-validation-of-the-wFCnvmsxMd SP - 578 VL - 63 IS - 3 DP - DeepDyve ER -