Part three of a 4-part series examining what happens when science is used for marketing (using brain-training software as the central example).
[Full disclosure: I am a co-PI on federal grants that examine transfer of training from video games to cognitive performance. I am also a co-PI on a project sponsored by a cognitive training company (not Posit Science) to evaluate the effectiveness of their driver training software. My contribution to that project was to help design an alternative to their training task based on research on change detection. Neither I nor anyone in my laboratory receives any funding from the contract, and the project is run by another laboratory at the Beckman Institute at the University of Illinois. My own prediction for the study is that neither the training software nor our alternative program will enhance driving performance in a high fidelity driving simulator.]
In my last post, I examined some of the claims on the Posit Science blog to see what science they were using as the basis for those claims. Posit Science prides itself on being rooted in science, and unlike most purveyors of brain training, they actually can point to published scientific results in support of some of their claims. The primary evidence used to support the claim that DriveSharp training can improve driving appears to be a 2003 paper by Roenker et al. that examined the effects of training on the UFOV (Useful Field of View) task. Today I will examine what that article actually showed to see what sorts of claims are justified.
First, I should say that the Roenker et al (2003) study is an excellent first attempt to study transfer of training from a simple laboratory task to real-world performance. It used performance measures during real-world driving, and was far more systematic than most road-test studies of this sort. As with any such study, though, it is limited in scope to the experimental conditions and subjects it tested. It also had several methodological shortcomings that somewhat weaken the conclusion that training transfers to untrained driving tasks. Here are some characteristics of the study that you might not know if you relied solely on the description of the scientific findings touted on the Posit Science site:
1. The subjects were all 55 or older and were recruited specifically because they might benefit from training. At least some (we don’t know how many) were recruited because they were involved in crashes. Screening criteria excluded participants who performed normally on the UFOV, so these were older participants with driving problems who had existing impairments on a demanding perception/attention task. Given that this transfer-of-training study tested only impaired older drivers, don’t count on any benefits if you are unimpaired, a good driver, under age 55, etc. The claims on the Posit Science website don’t mention these potentially important limitations.
2. The study involved 3 conditions: (a) the critical “speed of processing” training group, (b) a simulator training group, and (c) a relatively unimpaired control group. Not surprisingly, the two training groups tended to improve on the tasks that were specifically trained. The simulator training was a more standard driver training program, and those subjects showed improvements on the same tasks that were emphasized in the training (e.g., proper turning into a lane and using a signal). The critical “speed of processing” group showed no improvements on signaling or turning. Not surprisingly, though, their UFOV performance improved. That’s effectively what their training task was. Similarly, the speed training group responded faster in a choice response time task. Again, these sorts of task-specific benefits are not surprising because we know that training tends to improve performance on the trained task.
Even if training didn't genuinely improve the underlying ability, we might still find apparent improvements if people thought the training should help. Subjects in the simulator condition knew that they were being trained to use their signal correctly and to turn into their lane appropriately, so they would be highly motivated to perform well on those aspects of the driving test (and some even said that they worked hard on doing those tasks well in the post-test). Similarly, a group trained to respond quickly would be motivated to respond quickly on other tasks.
There also was a lot of variability in the outcome measures, and in some cases the speed-trained group underperformed the other groups 18 months later (e.g., on the position-in-traffic composite measure). Given the number of statistical tests involved (3 training conditions, about 10 outcome measures, multiple follow-up tests), some of the statistically significant differences are likely to be spurious in any case.
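To get a rough sense of how easily spurious results arise from that many comparisons, here is a quick back-of-the-envelope calculation. The test count is my own hypothetical estimate, not a figure from the paper, and it assumes independent tests (which the paper's tests surely aren't), but it illustrates the scale of the problem:

```python
# Hypothetical: 3 conditions x ~10 outcome measures, each tested at alpha = .05.
# If none of the effects were real, how often would at least one test
# come out "significant" by chance alone?
alpha = 0.05
n_tests = 30  # my rough estimate, not the paper's actual count

# Probability of at least one false positive across all tests,
# assuming (unrealistically) that the tests are independent.
p_any_false_positive = 1 - (1 - alpha) ** n_tests
print(f"{p_any_false_positive:.2f}")  # roughly 0.79
```

In other words, with that many uncorrected tests, finding a few "significant" differences is close to the expected outcome even if nothing at all is going on.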
3. In the pre-training driving segment, the raters were blind to condition. Following training, however, one or both of the coders knew the training condition. Even if the coders weren't told the condition, they might well have been able to infer it: the training subjects were impaired to start with, so the differences between them and the control subjects might have been apparent in their driving. The paper provided no evidence that the coders actually were unaware of the condition or that they couldn't guess it. More importantly, the coders apparently were informed that a subject was in either the critical training group or the unimpaired control group (that is, they knew the subject wasn't in the other training condition). Why does that matter? If the coders knew that a subject was in the speed training condition and believed that the training might improve some aspects of driving, then any subjective measures of driving performance could be affected by their expectations.
4. The one significant benefit of speed training found in the paper was a reduced number of dangerous maneuvers. Recall the claim that training reduced dangerous maneuvers by 36%. As I noted in the second post of this series, the judgment of what counts as dangerous could be somewhat subjective. It would be interesting to see the data on what constituted a dangerous maneuver: did the raters spot the same dangerous maneuvers, or did they just arrive at the same overall count? Were the two raters blind to each other? That is, could they see each other taking notes about what was dangerous? Either of these factors, in addition to the possibility that the raters knew the training condition, could lead to a spurious claim of improvement. Given that such events were rare in the study, even a slight bias to code an event as dangerous or as safe could produce what looks like a large relative improvement in performance.
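To see why rare events make relative percentages so fragile, consider a toy calculation. The counts below are hypothetical, chosen only to illustrate the arithmetic; they are not the paper's data:

```python
# Hypothetical counts of dangerous maneuvers coded in each group.
# (Chosen for illustration so the headline figure comes out near 36%;
# these are NOT the numbers from Roenker et al.)
comparison_count = 11  # dangerous maneuvers coded in the comparison group
trained_count = 7      # dangerous maneuvers coded in the trained group

reduction = (comparison_count - trained_count) / comparison_count
print(f"{reduction:.0%}")  # 36%

# Now suppose coders judged just one borderline event differently
# in each group -- well within plausible subjective disagreement:
reduction_shifted = ((comparison_count - 1) - (trained_count + 1)) / (comparison_count - 1)
print(f"{reduction_shifted:.0%}")  # 20%
```

With events this rare, a one-or-two-event difference in subjective coding swings the headline percentage dramatically, which is why the blinding questions above matter so much.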
These criticisms are not intended to cast aspersions on the Roenker et al. (2003) study. I actually found the study to be quite impressive. If I had been a reviewer, I would have raised some of these concerns, but I likely would have recommended publication (after requesting some weakening of the claims). It is an important first attempt to study transfer of training from the laboratory to actual driving, a topic that deserves further study. What I find problematic is not the science itself, but the way in which the science is applied in marketing the effectiveness of training more generally. The DriveSharp post flatly claimed that training improves driving and made no mention of these limitations and qualifications. Someone reading the post or the Posit Science website might conclude that training has a proven effect on driving for all people, when the effects are limited to one measure in an already impaired older population. Untrained readers might not delve into the paper itself to see what other limitations the study had. In the final part of this series, I will return to the DriveSharp blog post and briefly discuss the possible negative consequences of sciencey marketing.
Roenker, D. L., Cissell, G. M., Ball, K. K., Wadley, V. G., & Edwards, J. D. (2003). Speed-of-processing and driving simulator training result in improved driving performance. Human Factors, 45(2), 218-233. PMID: 14529195