Metrics for Interpersonal Speaking Rubrics

Last week we looked at the design constraints on any rubric or other measure that we might develop to capture performance in spontaneous speaking. The top two concerns were that the measure be simple to use and that it offer feedback to the students. This week we look at the possible metrics that we could use to capture a performance. We start with the fundamental question:



What makes spontaneous speaking effective, "good", or compelling?

Starting Point for Metrics

Measurement relies on identifying the features of students' spontaneous speech that best correlate with performance and proficiency. These features also need to be overtly identifiable in a two-minute speech sample. After scouring our professional organizations, state departments of education, and the research on measuring proficiency, here is a list of possible metrics.

  • Language Function (Adair-Hauck, Glisan, & Troyan, 2013, p. 126)
  • Text Type (Adair-Hauck et al., 2013, p. 126)
  • Communication Strategies (Adair-Hauck et al., 2013, p. 126), (ACTFL, 2015), (Theisen, 2014)
  • Comprehensibility (Adair-Hauck et al., 2013, p. 126), (Wertz, 2015, p. 10), (Theisen, 2014)
  • Language Control (Adair-Hauck et al., 2013, p. 126), (ACTFL, 2015), (Theisen, 2014)
  • Vocabulary (ACTFL, 2015)
  • Cultural Awareness and Interculturality (ACTFL, 2015), (Wertz, 2015)
  • Quality of Communication (Wertz, 2015)
  • Task Completion (“AP® Spanish Language Writing and Speaking Scoring Guidelines,” 2007)
  • Topic Development (“AP® Spanish Language Writing and Speaking Scoring Guidelines,” 2007)
  • Language Use (“AP® Spanish Language Writing and Speaking Scoring Guidelines,” 2007)
  • How well task is completed (Theisen, 2014)


The ACTFL performance descriptors combine comprehensibility and language control into one feature, which basically halves the weight of language control. The Ohio Department of Education and the ACTFL performance descriptors define comprehensibility with different features. I was also unable to find the research on which these metrics are based, but that could be because of my own limitations.
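
To make the point about weight concrete, here is a rough sketch of the arithmetic. It rests on two assumptions that do not come from any of the sources above: every row of a rubric counts equally toward the total score, and features that share a row split that row's weight evenly. The row layouts and the function name `criterion_weights` are hypothetical illustrations, not the actual ACTFL rubric.

```python
from fractions import Fraction


def criterion_weights(rows):
    """Return each feature's share of the total score.

    `rows` is a list of rubric rows; each row lists the features it scores.
    Assumption: every row counts equally, and features that share a row
    split that row's weight evenly.
    """
    weights = {}
    row_share = Fraction(1, len(rows))
    for row in rows:
        for feature in row:
            share = row_share / len(row)
            weights[feature] = weights.get(feature, Fraction(0)) + share
    return weights


# Hypothetical rubric in which every feature has its own row.
separate = [
    ["Language Function"],
    ["Text Type"],
    ["Communication Strategies"],
    ["Comprehensibility"],
    ["Language Control"],
]

# Hypothetical rubric in which two features are scored as a single row.
combined = [
    ["Language Function"],
    ["Text Type"],
    ["Communication Strategies"],
    ["Comprehensibility", "Language Control"],
]

print(criterion_weights(separate)["Language Control"])  # 1/5
print(criterion_weights(combined)["Text Type"])         # 1/4
print(criterion_weights(combined)["Language Control"])  # 1/8
```

Under these assumptions, language control in the combined rubric carries half the weight of any stand-alone row (1/8 against 1/4), and less than it would with a row of its own (1/5). A rubric that weights its rows unequally would give different numbers.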

What do you think?

As the lack of agreement among these sources suggests, the metrics are not all equal. We need your expertise from the field to rate these metrics.



