Action Research for SFSU's NASA-NOVA Course:
Planetary Climate Change
January 18, 2002

Dr. Dave Dempsey
Prof. of Meteorology
Dept. of Geosciences
College of Science
ddempsey@sundog.sfsu.edu
Dr. Kathleen O'Sullivan
Professor
Dept. of Secondary Education
College of Education
kaosul@sfsu.edu
Dr. Lisa White
Assoc. Prof. of Geology
Dept. of Geosciences
College of Science
lwhite@sfsu.edu

I. Introduction

Action research for GEOL/METR 302, "Planetary Climate Change", comprised three types of assessment described in section II below, including:
  1. Attitudes about science [draft mostly complete]
  2. Scientific reasoning [draft mostly complete]
  3. Concepts about climate [draft mostly complete]
These assessments were administered during the first week of classes ("pre-tests") and again in the last week or two of the semester ("post-tests"), in the following six classes: The last four of these courses are introductory, general education (GE) courses for non-majors. They served as controls for our NOVA course. The number of students who completed both pre- and post-tests in each class was as follows:
  Class                       Number of Students Scored
  GM310.00 (NOVA course)       4
  GM310.01 (NOVA course)       4
  GM103                        6
  G302                        19
  M302S                       18
  M302M                        5

II. Assessments

A. Attitudes about Science

  1. Question asked: Do students develop a better attitude about science than in existing courses?

  2. Assessment instrument:

  3. Assessment procedure: We gave each student the attitude assessment and a response sheet. Instructions for recording responses are printed on both the attitude assessment and the response sheet itself, but we also explained them orally, using transparencies and an overhead projector to illustrate. The key point to emphasize to students is that on the response sheet, the statement numbers (and the five possible responses for each statement) are organized in four columns and increase from left to right across columns rather than from top to bottom within each column. We gave students enough time to complete the assessment (roughly 15 minutes).

    We administered the assessment on the first day of class (the "pre-test") and again in the last week or so of the semester (the "post-test"). We scored only assessments completed by students on both dates, and we scored both sets of responses only after the end of the semester. (To match each pair of pre- and post-test responses, we used the last four digits of each student's nine-digit student number, which each student was supposed to write on the response form. We did not score single, unmatched assessments.) To decrease the likelihood of mis-scoring, two people independently scored every response, and any major differences between the two sets of scores were resolved by rescoring.
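    The matching step described above can be sketched in a few lines of Python. This is only an illustration, not the procedure we actually ran: the function names and the sample student numbers are hypothetical, and the sketch ignores the possibility that two students in one class share the same last four digits.

```python
def last_four(student_number: str) -> str:
    """Return the last four digits of a student number, ignoring non-digits."""
    digits = "".join(ch for ch in student_number if ch.isdigit())
    return digits[-4:]


def match_pairs(pre: dict, post: dict) -> dict:
    """Pair pre- and post-test responses that share a four-digit key.

    Keys present on only one date are dropped, mirroring the rule that
    single, unmatched assessments are not scored.
    """
    return {key: (pre[key], post[key]) for key in sorted(pre.keys() & post.keys())}


# Hypothetical response sheets, keyed by the digits written on each form.
pre_test = {last_four("912-34-5678"): "pre A", last_four("900001111"): "pre B"}
post_test = {last_four("912345678"): "post A", last_four("955552222"): "post C"}

pairs = match_pairs(pre_test, post_test)
print(pairs)  # only the student who completed both tests remains
```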

  4. Scoring rubric: Because of the way the statements are ordered on the attitude assessment (in groups of 4, one per category of statements), and the way the responses are organized into four columns on the response sheet, all responses to statements in Category 1 appear in the first column of the response sheet, responses in Category 2 appear in the second column, etc. For positively-phrased statements, five points are assigned to "strongly agree" responses, four points to "agree", etc. For negatively-phrased statements, one point is assigned to "strongly agree" responses, two points to "agree", etc. For each student's response sheet, we scored responses and summed them in each column separately. (A transparent template mimicking the student response sheet but with appropriate point values replacing "SA", "A", "N", "D", and "SD" can be created and overlaid on each student response sheet to facilitate manual scoring.) The maximum possible score for each column (that is, each category) is 50 and the minimum is 10.
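    The rubric above amounts to a small lookup-and-sum calculation, sketched here in Python. The sketch assumes 40 statements (10 per category, consistent with the 10-to-50 score range) numbered so that statement n falls in category ((n - 1) mod 4) + 1. Which statements are negatively phrased is not stated here, so the set used below (the even-numbered statements) is purely hypothetical.

```python
POSITIVE = {"SA": 5, "A": 4, "N": 3, "D": 2, "SD": 1}
# Negatively phrased statements use the reversed scale (SA = 1, ..., SD = 5).
NEGATIVE = {label: 6 - points for label, points in POSITIVE.items()}

N_CATEGORIES = 4


def category_of(statement_number: int) -> int:
    """Statements come in groups of four, one per category."""
    return (statement_number - 1) % N_CATEGORIES + 1


def score_sheet(responses: dict, negative_statements: set) -> dict:
    """Sum the points in each category (each column of the response sheet)."""
    totals = {category: 0 for category in range(1, N_CATEGORIES + 1)}
    for number, label in responses.items():
        key = NEGATIVE if number in negative_statements else POSITIVE
        totals[category_of(number)] += key[label]
    return totals


# Hypothetical: treat the even-numbered statements as negatively phrased.
negative = set(range(2, 41, 2))
all_strongly_agree = {n: "SA" for n in range(1, 41)}
print(score_sheet(all_strongly_agree, negative))
```

    With this made-up key, a student who marks "SA" everywhere scores the maximum of 50 in Categories 1 and 3 and the minimum of 10 in Categories 2 and 4.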

  5. Analysis procedure: We calculated post-test minus pre-test difference scores for each student for each category. Three types of statistical tests were performed on each set of difference scores, with a 95% level of significance chosen in advance as the standard for accepting or rejecting null hypotheses:

    The first two types of tests above were also performed on the pre-test scores alone, to test hypotheses that the students in GM310 were initially no different from students in the four GE classes, that students in the four GE classes did not differ significantly from each other, and that the students in the two GM310 classes did not differ significantly from each other (at least as measured by these assessments).
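    As a rough illustration of the calculations involved, the Python sketch below computes difference scores, a one-sample t statistic (for a hypothesis that a mean difference is zero), and a one-way F statistic (for a hypothesis that several class means are equal). These are the standard textbook formulas, not necessarily the exact tests or software we used, and all scores shown are invented. Critical values must still be looked up for the appropriate degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def difference_scores(pre, post):
    """Post-test minus pre-test score for each matched student."""
    return [after - before for before, after in zip(pre, post)]


def one_sample_t(values, mu0=0.0):
    """t statistic for the null hypothesis that the mean equals mu0."""
    return (mean(values) - mu0) / (stdev(values) / sqrt(len(values)))


def one_way_f(groups):
    """F statistic for the null hypothesis that all group means are equal."""
    all_values = [v for group in groups for v in group]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Invented scores for two small classes of four students each.
class_a = difference_scores([30, 32, 35, 31], [33, 36, 36, 35])  # [3, 4, 1, 4]
class_b = difference_scores([28, 30, 33, 29], [28, 31, 32, 29])  # [0, 1, -1, 0]

t = one_sample_t(class_a)          # about 4.243 with 3 degrees of freedom
f = one_way_f([class_a, class_b])  # 13.5 with (1, 6) degrees of freedom
print(f"t = {t:.3f}, F = {f:.3f}")
```

    For these invented numbers, the tabulated two-sided 5% critical value of t with 3 degrees of freedom is about 3.18, and the 5% critical value of F with (1, 6) degrees of freedom is about 5.99, so both null hypotheses would be rejected.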

  6. Results:
    Hypothesis Tested [Type of Test]
        Score: Result of Test (95% significance level)

    (a) Mean scores are the same among all six classes. [F-test]
        Pre-test only: Accept: all four categories
        Post/pre test difference: Accept: Categories 1, 2, 3; Reject: Category 4

    (b) Mean scores are the same among all classes, with GM310 classes lumped together. [F-test]
        Pre-test only: Accept: all four categories
        Post/pre test difference: Accept: Categories 1, 2; Reject: Categories 3, 4

    (c) Mean scores are the same among the four GE classes. [F-test]
        Pre-test only: Accept: all four categories
        Post/pre test difference: Accept: Categories 1, 2, 4; Reject: Category 3

    (d) Mean scores for GM310 lumped together and for all four GE classes lumped together, are the same. [t-test]
        Pre-test only: Accept: all four categories
        Post/pre test difference: Accept: Categories 2 and 3; Reject: Categories 1 and 4 (GM310 higher)

    (e) Mean scores for all classes lumped together are zero. [t-test]
        Post/pre test difference: Accept: Categories 1, 2 and 3; Reject: Category 4 (positive)

    (f) Mean scores for the four GE classes lumped together are zero. [t-test]
        Post/pre test difference: Accept: all four categories

    (g) Mean scores for the two GM310 classes lumped together are zero. [t-test]
        Post/pre test difference: Accept: Categories 2, 3; Reject: Categories 1, 4

    (h) Mean scores for each course separately are zero. [t-test]
        Post/pre test difference: Reject: GM310.00, Cat. 4 (positive); GM310.01, Cat. 4 (positive); G302, Cat. 4 (positive); M302S, Cat. 3 (negative); Accept: all others

  7. Interpretation: [Not done yet]
 

B. Scientific Reasoning

  1. Question asked: Do students learn to reason scientifically better than they do in existing courses?

  2. Assessment instrument:

  3. Assessment procedure: We distributed the two problems to students and briefly described them, noting that students should not only choose one of the four possible answers provided but also explain their reasoning as best they could. We allowed students about 15 minutes to complete this assessment.

  4. Problem solutions and scoring rubric: Each of the two problems is worth five points: one point for choosing the correct answer from among the four possibilities provided and four points for the reasoning supporting the choice. (In some cases it is possible to earn partial credit for an explanation supporting an incorrect choice.)

  5. Analysis procedure: Null hypotheses posed and tests performed were the same as in the assessment of attitudes about science.

  6. Results: All hypotheses tested were accepted.

  7. Interpretation: [Not done yet]
 

C. Climate Concepts

  1. Question asked: Do students learn connections among geosciences and concepts about climate and climate change better than in existing courses?

  2. Assessment instrument:

  3. Assessment procedure: We provided each student with a copy of the assessment instrument, then spent five minutes or so explaining what a hierarchical concept map is, using an example (of the earth's water cycle) on an overhead transparency to illustrate the idea. The generic strategy for creating any hierarchical concept map is to (1) put the main topic in a box at the top of the map; (2) put related, more specific subordinate topics in boxes below, connected to the topmost box by lines; (3) label the lines with (mostly) verbs or prepositions to specify the nature of the relation between the connected topics; (4) iterate for increasingly more specific subordinate topics. Cross-connections between different branches of subordinate topics are possible, and we showed examples. (For a primer on concept maps and their use as a field-tested assessment technique, see the National Institute for Science Education's description of concept maps.)
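    For readers who think in terms of data structures, a hierarchical concept map can be treated as a set of labeled links between topics. The Python sketch below is only an illustration: the water-cycle topics and linking phrases are made up for the example and are not the ones on our transparency.

```python
from collections import defaultdict

# Each link is (parent topic, linking phrase, child topic).
# The entries are a made-up water-cycle illustration.
links = [
    ("water cycle", "includes", "evaporation"),
    ("water cycle", "includes", "precipitation"),
    ("evaporation", "produces", "water vapor"),
    ("water vapor", "condenses into", "clouds"),
    ("precipitation", "falls from", "clouds"),  # cross-connection between branches
]


def children_of(links):
    """Group the links by parent topic."""
    tree = defaultdict(list)
    for parent, phrase, child in links:
        tree[parent].append((phrase, child))
    return tree


def outline(tree, topic, depth=0, seen=None):
    """Return indented lines, main topic first; each topic is expanded once."""
    seen = set() if seen is None else seen
    lines = [topic] if depth == 0 else []
    seen.add(topic)
    for phrase, child in tree.get(topic, []):
        lines.append("  " * (depth + 1) + f"--{phrase}--> {child}")
        if child not in seen:
            lines.extend(outline(tree, child, depth + 1, seen))
    return lines


map_lines = outline(children_of(links), "water cycle")
print("\n".join(map_lines))
```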

    We then went over the instructions provided on the assessment instrument and asked the students to construct a hierarchical concept map in which they organized their own knowledge about climate, prompted by five key questions about climate. After 15-30 minutes, we collected the students' concept maps. (Especially at the beginning of the semester, 15 minutes was plenty of time because most students knew almost nothing about the subject. At the end of the semester, if the students had learned much about the subject at all, 15-30 minutes seemed to be enough time for the students to show a noticeable improvement if any such improvement was forthcoming--the exception rather than the rule in the control groups!)

  4. Scoring rubric: We deemed two components of the concept maps credit-worthy: (1) appropriate topics in boxes; and (2) logical, labeled lines of connections between topics. We constructed our own version of an acceptable concept map based on the five key prompting questions, identified a dozen important topics distributed among these questions, and assigned half a point to each topic for a subtotal of 6 points; and we assigned one point to each of up to four logically coherent, labeled lines of connections between topics (generally corresponding to any of the five key prompting questions about climate) for a subtotal of 4 points. The total points possible was 10 points.
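    The arithmetic of this rubric can be sketched in Python as follows. Two assumptions are built in and should be read as such: the dozen credited topics are represented by placeholder names ("topic 1" through "topic 12"), and each credited line of connections is taken to be worth one point, so that four lines yield the stated 4-point subtotal. Judging whether a line is "logically coherent" remains a human decision; the sketch only takes the resulting count.

```python
TOPIC_POINTS = 0.5  # half a point per credited topic (12 topics -> 6 points)
LINE_POINTS = 1.0   # assumption: one point per credited line (4 lines -> 4 points)
MAX_LINES = 4

# Placeholder names standing in for the dozen credited topics.
CREDITED_TOPICS = {f"topic {i}" for i in range(1, 13)}


def score_map(topics_found, coherent_lines, credited_topics=CREDITED_TOPICS):
    """Score one concept map out of 10 points."""
    topic_score = TOPIC_POINTS * len(set(topics_found) & credited_topics)
    line_score = LINE_POINTS * min(coherent_lines, MAX_LINES)
    return topic_score + line_score


# A hypothetical student map: three credited topics, one uncredited, two good lines.
student_score = score_map({"topic 1", "topic 2", "topic 7", "unrelated"}, 2)
best_score = score_map(CREDITED_TOPICS, 5)  # lines beyond four earn nothing extra
print(student_score, best_score)
```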

    A crude outline of the lines of connections associated with the key prompting questions (italics) and topics (bold-face) that we deemed important (and hence credit-worthy) looked something like this:

    To increase scoring consistency, two of us--a meteorologist (Dempsey) with no previous experience scoring concept maps and some limited experience using concept maps as an instructional tool, and a secondary-science-education faculty member (O'Sullivan) with extensive experience scoring concept maps and using them for instruction but with limited knowledge about climate--scored student concept maps independently. In occasional cases where our scores differed markedly we consulted and rescored, but in general we simply averaged our two scores and performed our statistical analysis on the averaged scores.
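    A minimal Python sketch of this combine-or-rescore step follows. The threshold for what counts as a "marked" difference is hypothetical (we applied judgment rather than a fixed number), as are the student keys and scores.

```python
def combine_scores(scorer_a: dict, scorer_b: dict, threshold: float = 2.0):
    """Average two scorers' marks; flag large disagreements for rescoring.

    `threshold` is a hypothetical cutoff for a "marked" difference.
    """
    averaged, to_rescore = {}, []
    for student in sorted(scorer_a.keys() & scorer_b.keys()):
        a, b = scorer_a[student], scorer_b[student]
        if abs(a - b) >= threshold:
            to_rescore.append(student)
        else:
            averaged[student] = (a + b) / 2
    return averaged, to_rescore


# Invented scores, keyed by the last four digits of each student number.
first_scorer = {"5678": 4.0, "1111": 7.5}
second_scorer = {"5678": 5.0, "1111": 3.0}

averaged, to_rescore = combine_scores(first_scorer, second_scorer)
print(averaged, to_rescore)
```

    Maps flagged for rescoring would be discussed and scored again before entering the statistical analysis.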

  5. Results:
    Hypothesis Tested [Type of Test]
        Score: Result of Test (95% significance level)

    (a) Mean scores are the same among all six classes. [F-test]
        Pre-test only: Reject
        Post/pre test difference: Reject

    (b) Mean scores are the same among all classes, with GM310 classes lumped together. [F-test]
        Pre-test only: Accept
        Post/pre test difference: Reject

    (c) Mean scores are the same among the four GE classes. [F-test]
        Pre-test only: Accept
        Post/pre test difference: Accept

    (d) Mean pre-test scores for the two GM310 classes are the same. [t-test]
        Pre-test only: Reject
        Post/pre test difference: Accept

    (e) Mean pre-test scores for GM310 lumped together and for all four GE classes lumped together, are the same. [t-test]
        Pre-test only: Accept
        Post/pre test difference: Reject (GM310 scores higher)

    (f) Mean score for all classes lumped together is zero. [t-test]
        Post/pre test difference: Reject (difference positive)

    (g) Mean score for the four GE classes lumped together is zero. [t-test]
        Post/pre test difference: Accept

    (h) Mean score for the two GM310 classes lumped together is zero. [t-test]
        Post/pre test difference: Reject (difference positive)

    (i) Mean score for each course separately is zero. [t-test]
        Post/pre test difference: GM310.00: Reject (difference positive); GM310.01: Reject (difference positive); All others: Accept

  6. Interpretation: [Not done yet]



References

Fraser, B. J., 1981, TOSRA: Test of Science-Related Attitudes. Australian Council for Educational Research, Hawthorn, VIC.