Friday, May 30, 2008

Links! PLoS ONE articles of interest in May 2008

The following articles caught my eye this month in PLoS ONE ... they're on my reading list for possible inclusion here on BME. In the meantime, I thought that readers might enjoy checking some of them out.

Orientation Sensitivity at Different Stages of Object Processing: Evidence from Repetition Priming and Naming (Harris et al)

Enhancement of Both Long-Term Depression Induction and Optokinetic Response Adaptation in Mice Lacking Delphilin (Takeuchi et al)

Brain Networks for Integrative Rhythm Formation (Thaut et al)

Linking Social and Vocal Brains: Could Social Segregation Prevent a Proper Development of a Central Auditory Area in a Female Songbird? (Cousillas et al)

Imagine Jane and Identify John: Face Identity Aftereffects Induced by Imagined Faces (Ryu et al)

A Potential Neural Substrate for Processing Functional Classes of Complex Acoustic Signals (George et al)

Comparing the Processing of Music and Language Meaning Using EEG and fMRI Provides Evidence for Similar and Distinct Neural Representations (Steinbeis and Koelsch)

Visual Learning in Multiple-Object Tracking (Makovski et al)

Time Course of the Involvement of the Right Anterior Superior Temporal Gyrus and the Right Fronto-Parietal Operculum in Emotional Prosody Perception (Hoekert et al)

On How Network Architecture Determines the Dominant Patterns of Spontaneous Neural Activity (Galán)

The Encoding of Temporally Irregular and Regular Visual Patterns in the Human Brain (Zeki et al)

Citral Sensing by Transient Receptor Potential Channels in Dorsal Root Ganglion Neurons (Stotz et al)

Long-Term Activity-Dependent Plasticity of Action Potential Propagation Delay and Amplitude in Cortical Networks (Bakkum et al)

Gender Differences in the Mu Rhythm of the Human Mirror-Neuron System (Cheng et al)

Monday, May 26, 2008

Cargo cult psychometrics? Setting standards on standards-based standardized tests

This past week I had the opportunity to work for the Maine Department of Education (DOE) with a small, diverse group of educators on the task of setting achievement standards for the Science component of the Maine High School Assessment (MHSA). The intent of the MHSA is to measure student learning relative to the Maine Learning Results (MLR), a body of learning objectives / outcomes / standards in which all students in Maine are expected to be proficient as a result of their high school education. A few years back, the Maine DOE decided to adopt the College Board's SAT as the MHSA instead of continuing with its own Maine Educational Assessment (MEA), which had been developed with the help of Measured Progress, a non-profit firm based in New Hampshire. Aside from offsetting some development costs, the switch to using the SAT as the primary component of the MHSA was undoubtedly influenced by the nice side effect that all students in Maine would be one step closer to college application readiness. However, unlike the MEA, the SAT does not cover all standards in the MLR. Up until this year the federal DOE did not require states to report student learning in Science (at least in grades 9-12 ... I'm not sure about earlier grade levels), so Maine had not included any Science questions since switching to the SAT. But because Maine needed to report student learning in Science this year, the state DOE worked with Measured Progress to develop a multiple-choice and free-response augmentation for the MHSA that measures student learning relative to the MLR's Science standards and satisfies the federal reporting rules.

Because it had been a few years since Maine students' learning in Science had been assessed, the panel I worked on was tasked with setting achievement standards - expectations - for categorizing overall student performance on the Science augment as "Does Not Meet", "Partially Meets", "Meets", or "Exceeds". We began our work by actually taking the assessment; while I can't discuss any of the test items specifically, I can say that the questions seemed generally well-written and represented a broad and balanced sampling of the Science standards. We were then given a binder containing all the questions in order of difficulty, as determined by Item Response Theory (IRT) analysis. The first step in that analysis, we were told, is to generate an Item Characteristic Curve (ICC) for each question; ours were of the logistic or "S-curve" type, although we didn't see the actual graphs.

IRT is a complicated psychometric analytical framework that I heard of for the first time during this panel - I am still learning about it using the following resources: UIUC tutorial; USF summary; National Cancer Institute's Applied Research pages. We were not taught any of the following specifics on IRT during the panel session. From what I've learned subsequently by reading through the above-linked resources, it appears that the purpose of the ICC is to relate P, the probability of a particular response to a question (in this case, the correct answer), to Theta, the strength of the underlying trait of interest (in this case, knowledge of the MLR's Science standards). In a logistic ICC, the S-shaped curve is shaped by three parameters: "a", the discrimination parameter; "b", the difficulty parameter; and "c", the guessing parameter. What I'm not yet sure of, and perhaps might never be, is whether the rank-order of the questions in our binders was based on some type of integration of the P v. Theta curve for each question, or on the "b" value for each item's ICC - from the way it was described to us, I suspect the latter.
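For readers who, like me, are new to IRT, here's a minimal sketch in Python of the three-parameter logistic ICC and of what ordering a booklet by the "b" parameter would look like. The item parameters below are made up purely for illustration - the actual MHSA item statistics were never shared with us.

```python
import math

def icc_3pl(theta, a, b, c):
    """Three-parameter logistic ICC: probability of a correct response at
    ability level theta, given discrimination a, difficulty b, and
    guessing (lower-asymptote) parameter c."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical items (a, b, c) - not real MHSA values.
items = {
    "item_1": (1.2, -0.5, 0.20),
    "item_2": (0.8,  0.3, 0.25),
    "item_3": (1.5,  1.1, 0.20),
}

# Ordering the booklet by the difficulty parameter b (the simpler of the
# two ordering schemes I speculate about above).
for name, (a, b, c) in sorted(items.items(), key=lambda kv: kv[1][1]):
    print(name, "b =", b, "P(theta=0) =", round(icc_3pl(0.0, a, b, c), 3))
```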

Once we had the binders with ordered questions, we were asked to go through each question and to determine, individually, what the question measured and why it was more difficult than the previous question. A multiple-choice or free-response question can measure a lot of different factors - we were instructed to concentrate on determining which standard(s) from the MLR's Science section were being assessed. So, our analysis of what was measured by each question left out important factors such as wording, inductive reasoning, and deductive reasoning, just to name a few. After finishing our question-by-question analysis and discussing our individual findings as a group, we moved on to another task: describing the Knowledge, Skills, and Abilities (KSAs) that we associated with students at the four different achievement levels (Does Not Meet, Partially Meets, Meets, and Exceeds). Completing this task took quite a while, as there were many different opinions about what kinds of KSAs different educators had observed in and/or expected from students at the different achievement levels.

With achievement level KSAs in mind, we then moved into the "bookmarking" task, the results of which would be sent on to the Maine DOE as the panel's recommendation for the cut scores to categorize students within one of the four achievement levels. In the bookmark standard setting procedure, each of us was given three bookmarks - one to place at the cut between Does Not Meet and Partially Meets, another to place at the cut between Partially Meets and Meets, and a final one to place at the cut between Meets and Exceeds. We were instructed to go through the binders with ordered questions and, starting from the easiest question at the very beginning, to place the bookmarks at the transition point where we felt that only the students with the KSAs characteristic of the next-higher achievement level would be able to get the question correct 2/3 of the time. Again, as with IRT, the underpinnings of the bookmark standard setting procedure weren't explained to us in detail, so I've been reading the following sources to learn more about it: Wisconsin DPI - 1; Wisconsin DPI - 2; Lin's work at U. Alberta's CRAME [PDF].
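As I understand it (and this is my own reconstruction from the linked sources, not something that was shown to us), the bookmarked item's ICC is what eventually turns a page number into an actual cut score: the cut is the Theta at which a student has the criterion probability - 2/3, in our case - of answering the bookmarked item correctly. A minimal sketch, assuming a 3PL item with made-up parameters:

```python
import math

def theta_at_probability(a, b, c, p=2/3):
    """Ability level theta at which a 3PL item's ICC reaches probability p.
    Solves p = c + (1 - c) / (1 + exp(-a * (theta - b))) for theta;
    requires p > c (the criterion must exceed the guessing floor)."""
    if p <= c:
        raise ValueError("criterion probability must exceed the guessing parameter c")
    return b - (1.0 / a) * math.log((1.0 - c) / (p - c) - 1.0)

# Hypothetical parameters for an item bookmarked at, say, the Meets/Exceeds cut.
a, b, c = 1.1, 0.9, 0.22
cut_theta = theta_at_probability(a, b, c)   # the 2/3 ("RP67") mapping criterion
print(round(cut_theta, 2))
```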

And so, we each went through our binders, set our bookmarks, and gave the results to the psychometrician from Measured Progress. Our bookmarks were averaged, and the average placement of the bookmarks was presented back to us for group discussion. We talked about where we had placed our bookmarks individually and why we placed them there - some people were much above the average, some much below, and others very near or at the group's average placement. Conversation revealed that some panelists did not fully understand the instructions on where to place the bookmarks (if I recall correctly, most of the confusion was due to the instruction that only students at the next-higher achievement level should be able to get the question correct 2/3 of the time). Conversation also helped many panelists re-evaluate the placement of their bookmarks based on question characteristics that had not been considered in the first round. We were then given the opportunity to place our bookmarks a second time, and were told that these results (which were not shared with us) would be passed on to the Maine DOE as the panel's recommendation for cut scores for categorizing student achievement.

During one of our breaks on the second day, when we were working on the bookmarking task, another panelist I was talking with asked if I had ever read any Richard Feynman, particularly his essay called "Cargo Cult Science". Although I'd heard of Feynman before, I replied that I hadn't read any of his work. The panelist described the essay to me as pertaining to the distinction between science and pseudoscience, and shared his feeling that our attempts to measure and set standards for student knowledge felt a lot like what Feynman was describing. At the time, I disagreed somewhat - although I know that measuring knowledge of standards via any assessment is bound to have flaws, I don't think it's pseudoscience. I've since read Feynman's essay and understand more about his distinction between science and pseudoscience, which helps me better understand my fellow panelist's remark -- I think it is captured by this quote: "In summary, the idea is to try to give all of the information to help others to judge the value of your contribution; not just the information that leads to judgment in one particular direction or another."

Feynman's essay goes on to discuss the obligation of scientists to other scientists, as well as to non-scientists. I particularly agree with the responsibility of scientists to "bend over backwards" to demonstrate how they could be wrong, particularly when the complexity of the problem and/or the solution makes it likely that the non-expert / non-scientist will believe the scientist's conclusion simply because they can't understand or evaluate the work on their own. My experience in working on this standard-setting panel provided me with invaluable insight into a complex process whose results have significant implications. Even our panel of experienced science educators struggled to understand the complexity of the standard-setting process that we implemented, and the full underlying complexity of the entire process (i.e., Item Response Theory and the bookmark method) was not explained. Given that the KSAs associated with each achievement level could differ significantly depending on the composition of the panel, and given that the underlying complexity of the task is significant, I think it is accurate to label this work "cargo cult science" because only the results are shared with a broad audience. I don't think that the task of measuring knowledge with "pencil and paper" assessment is inherently pseudoscience - but we ultimately do a disservice to the potential for making education more scientific when the full scope of this type of work is not published.

Thursday, May 15, 2008

Autism-like effects and mitochondrial disorders...?

The Role of Thioredoxin Reductases in Brain Development (Soerensen et al)

A Marked Effect of Electroconvulsive Stimulation on Behavioral Aberration of Mice with Neuron-Specific Mitochondrial DNA Defects (Kasahara et al)

One of my usual projects in the Biology classes that I teach is a "controversial issues" research paper and presentation that requires the student to pick a topic, describe it scientifically, explore multiple perspectives on the topic, and detail their own opinion. A commonly-chosen topic has been the autism-vaccination debate, which is an issue that I'm interested in as well. My father was infected by the polio virus shortly before the Salk vaccine became publicly available - details aside, it left him with life-long disabilities that have increased in impact with age. What's interesting to think about is that, despite the significant health consequences, he was lucky in the sense that he survived polio in the first place. As you might reasonably guess, I'm a strong proponent of vaccinations - but I also don't discount the possibility that vaccines (overall, and/or specific ingredients) may have negative health consequences for some individuals exposed to them.

So what does this have to do with the brain? Recently in the news I discovered that a US court had ruled that a vaccine had triggered a mitochondrial disorder that caused the manifestation of autism-like symptoms. I've linked to a couple of articles above that help to demonstrate the important role that mitochondria play in the brain - one of the most metabolically active organs in the body. Research on autism is ongoing, but it's clear that it's a brain-based disorder. Research is also continuing on mitochondrial disease (also see info from the NIH on mitochondrial myopathy), and it's clear that these diseases have the potential to affect brain function. Finally, many are beginning to research the potential link between mitochondrial disease and autism, as evidenced by this article [PDF link] from a peer-reviewed scientific journal on pediatric medicine and this page with information and links from the CDC.

Wednesday, May 7, 2008

Measuring academic ability, teacher retention, and student learning

A recent post on Dangerously Irrelevant (cross-posted to LeaderTalk) prompted me to think quite a bit about how academic ability is measured ... and how measurements of teacher academic ability correlate with measurements of student academic ability. I think the post's conclusions are built on pretty soft data ... just because there's a correlation between teachers' college entrance exam scores and teacher attrition doesn't necessarily mean that there's a "brain drain" from the classroom, and just because there's a correlation between those same teacher scores and student scores doesn't mean that those are necessarily the most academically-able teachers or students. The larger and more important point here is that the measurement of academic ability is difficult, and current instruments are limited in scope - so, although we should certainly support the retention of our best teachers, we shouldn't be satisfied with building our arguments on soft data.

The following are some comments that I made on the post at D.I. ... I'm re-posting them here for my own archival purposes and because the topic of measuring academic ability is clearly within the domain of cognitive psychology. As you can tell, I'm responding to other comments, which I don't feel comfortable re-posting here because they're not mine -- please visit the above link to the post at D.I. to read the full article and ensuing discussion -- it's very interesting.

Comment 1:
I'm on board with the idea that smarter teachers are going to be the best at promoting student learning. But I'd like to echo and add to Orenta's point above ... if we're railing against using standardized tests as such an important measure of current student learning, how can we maintain the integrity of our argument by claiming that the same type of testing is an important and valid measure of teachers' knowledge? Though anecdotal, I know plenty of people who've declined in intelligence / knowledge / learning capacity once the structural support of family and high school settings - which can have a strong positive influence on college entrance exam scores - is gone. Also, I know plenty of people who have developed incredibly as learners both in college and post-college, particularly in the context of teaching others. So, again, I'm all for getting the smartest teachers possible into the classroom, and finding ways to keep them there - but I think it's a bit disingenuous, for multiple reasons, to use college entrance scores on standardized exams to make the point.

Comment 2:
I agree that there's an important and significant role for standardized test scores in the interpretation of student and teacher intelligence. However, I am uncomfortable with your generalized statement - if based only on the citations from Anderson & Carroll, and Guarino et al - that "the percentage of teachers with lower academic ability increases in schools over time. The brightest go elsewhere." and your stated assumption #1 "smart people are less likely to stay in teaching (thus resulting in a concentration of teachers with lower academic ability)." As I said originally, I absolutely support the notion that we should make more effort to retain our brightest teachers; I stand by my claim that scores on standardized tests taken in high school, or even at the end of a college program, by individuals who then become teachers are not the best data to use when making the argument that there is a longitudinal "brain drain" from the classroom. While there may be a correlation between this particular teacher characteristic and student achievement, I hesitate to make the jump into causality, as do Wayne & Youngs: "When statistical methods seem to establish that a particular quality indicator influences student achievement, readers still must draw conclusions cautiously. Theory generates alternative explanations that statistical methods must reject, so a positive finding is only as strong as the theory undergirding the analysis. If the theory is incomplete—or data on the plausible determinants of student achievement are incomplete—the untheorized or unavailable determinants of student achievement could potentially correlate with the teacher quality variable (i.e., correlation between the error term and the teacher quality variable). Thus, student achievement differences that appear connected to teacher qualifications might in truth originate in omitted variables." Further, in the section of their article specific to the review of studies on teacher test scores and student achievement, Wayne & Youngs point out that none of the teacher tests used in those studies are still in use, and emphasize the importance of researching the correlation between student achievement and teacher performance on assessments of their skill beyond standardized tests. My larger point, which I attempted to make by providing anecdotal evidence, is that the teachers who do remain in education for longer periods of time are not necessarily less able to promote student learning, even if there is a correlation that points to their tendency to have scored lower on standardized tests prior to their entry into college and/or the classroom. With that said, I would certainly support hiring policy shifts toward selecting applicants with the highest academic credentials possible, including historical and more recent scores on standardized assessments.

Comment 3:
The older the data, the less appropriate it is as a measure of academic ability. This is well-accepted with IQ tests, for example - the score is compared with others in the same age range, not with the entire population. With that reasoning in mind, I think it's a bit provocative to use college entrance exam scores as the only data you show in your post about the "brain drain" from the classroom. In my reading of the articles you cite, I interpret the authors as being far more reserved in their interpretation of the available data, noting limitations in data availability and suggesting caution in forming inferences based on it. For example, in the Anderson & Carroll DOE study they show that teachers are more likely to earn a graduate degree than non-teachers, and that teachers with a graduate degree are less likely to leave the profession than those with an undergraduate degree. This, to me, is great support for questioning the veracity of the relationship between college entrance exam scores and teacher attrition, and exemplifies the significant personal, formal, and professional learning that occurs in college and within the first few years of experience in teaching (and all "leavers" in that study had at least one year of experience in the profession).

I suppose that I'm being picky about this data because I worry about the impact of this type of provocation on current and prospective teachers. I don't think anyone in any profession would want either of these two possibilities: 1 - to think that their academic ability is being reduced to and summarized by a standardized test score (SAT or ACT) they earned in high school; 2 - to think that the longer they stay in the profession, the more "less-academically-able" people they'll be working with. Especially with regard to point #2, although it *may* be true, I think we're obliged to use better data to make such a negative critique.

My further problem with the use of this data to demonstrate a "brain drain" from the classroom is that there is *so* *much* *more* to being an effective teacher than the (limited) aspects of academic ability indicated by scores on the SAT, ACT, college selectivity, college GPA, Praxis I, Praxis II, or an IQ test. For example, even beyond subject-area expertise, what about being creative and being able to help others be creative? What about the ability to collaborate and to help others collaborate? What about technology skills and the ability to help others use technology? My impression is that we're hoping for these skills to manifest more and more in the classroom over time - but none of them are measured by the above-named instruments ... and we're not citing research in the "brain drain" question that attempts to measure them actively (let alone current academic ability), or to correlate measurements of those important "21st Century Skills" with measures of academic ability (whether old or current).

I am an ardent advocate of data-driven policy development, decision making, and education research, but I think we need to be really careful - more so than in this post - when we're using data to make a point that is critical of those who are currently in the teaching profession and those who are about to enter it. Certainly this provocative writing has inspired productive discussion here - but I must admit that I have some concern that using what I would describe as very suspect data to make what may in fact be a valid point may actually serve to exacerbate the very problem that you're writing about and that I think we're all working to prevent: good teachers leaving the classroom.

Tuesday, May 6, 2008

Mouse genetics and formation of spatial memory

Mouse Cognition-Related Behavior in the Open-Field: Emergence of Places of Attraction (Dvorkin et al)

The mouse in a maze is a familiar image to many, even if they are only loosely familiar with the formalities of psychological studies. In this article, researchers placed different genetic strains of mice in an open space - not a maze - and tracked their patterns of movement. Interestingly, the researchers determined that a correlation exists between the genetic strain of a mouse and its movement behavior, and inferred that perceptual and/or cognitive differences (due to genetics) are the causal factor. The movement behavior investigated was the tendency of a mouse to stop in a particular location in the open space, relative to how many times the mouse had passed through or stopped in that particular location. With various factors (including overall tendency to stop, as well as olfactory influences) controlled for, the researchers discovered that two of the three strains of mice investigated were more likely to stop in a particular location if they had passed through it or stopped there more times (implying that stopping behavior is related to memory of location), whereas the third strain did not show such a relationship. The strain that did not exhibit this movement pattern is of a genetic variety known to cause malfunctions in the hippocampus, an area of the brain with a significant role in memory. The strain that had the greatest tendency to stop in previously-visited locations is reportedly a relatively new genetic strain that is highly similar to wild-type mice. This study provides valuable information supporting the role of genetics in the formation of spatial memory in mice. I would suggest that it also supports the notion that even in different and more complex species such as our own, inherited biological factors may predispose certain individuals to have greater or lesser learning capacity in very specific types of tasks (not just overall).