Mon 24 Apr 2006
In formatting reminiscent of Diana Davis’s ‘07 original Anti-Blog, Assistant Visiting Professor of Statistics David Craft blogs on all sorts of topics. [Side note: I suspect that Craft's title is actually Visiting Assistant Professor and not Assistant Visiting Professor . . . just so we have that straight.]
We are now doing the things in STATS 101 that I have been looking forward to teaching…confidence intervals and hypothesis testing, etc, but I am not sure how well classes are going. Today I felt like I was staring into a sea of disgruntled soldiers, but I wasn’t sure what anyone was disgruntled about. I think I need more treats to get us all through 65 minutes of stats.
…
Lots of email action after the stats 101 grades came out. Wouldn’t want to go through that again.
Been there, done that.
Best part is that Craft has a link to R on his homepage. I have been trying to convince professors at Williams, in economics and statistics, that R should be used for every stats class (STAT 101, STAT 201, ECON 253, and so on). Alas, my efforts have been a miserable failure so far.
No worries though. In 5 years or less, nothing but R will be taught at Williams. You read it here first. (Previous R work at EphBlog here and here.)
2006-04-24 23:25:53
Actually, you will find Diana Davis’s ‘07 (is that the standard possessive form?) original Anti-Blog with the original (lack of) formatting here.
Williams stats uses JMP because Williams stats has been almost entirely Professor De Veaux, and Professor De Veaux is friends with John, the rich guy who wrote John’s Macintosh Program (JMP). If you want to change the department, start lobbying with (relatively) new stats professors Botts and Klingenberg.
2006-04-25 02:30:07
Teaching R to Stat 101 and 201 students is clearly not worth the hassle. JMP is far more user-friendly and has all of the functionality they’ll ever need (and then some). Perhaps to your dismay, David, Professor De Veaux teaches S-Plus when he teaches 346 (regression and forecasting). Professor Botts, on the other hand, swears by R, and I can’t really blame him. It’s pretty nice. I see that your zeal for R likewise shows up on your online forum postings. I incidentally ran into this piece when I was considering whether to put more effort into learning R as opposed to using the school’s licenses to use the slightly more robust S-Plus.
R vs. S-Plus vs. SAS
The tipping point for me was that R is open source, which is very helpful in all sorts of ways, and is consequently 100% free.
2006-04-25 11:24:15
It is inaccurate to describe S+ as “slightly more robust” than R. In fact, R was more robust than S+ 5 years ago! This is one of the reasons that I switched. R is now much more robust than S+.
I have had e-mail communications with Dick on this topic. Since he knows much more about teaching statistics to undergraduates than I ever will, he is much more qualified to comment on this. He (and others at Williams) feels that R is too much for an intro course like STAT 101 or 201.
I couldn’t disagree more, but that is a topic for another day. Also, I don’t worry too much about this since, in another few years (and once R provides a point-and-click front-end), almost every intro course at other elite schools will use R. At that point, Williams will be the outlier.
2006-04-25 12:26:49
Since I know only what I’ve read on forums online, I’ll gladly concede on the S+ vs. R argument. Although I will say that S+ is more user-friendly for the beginner, and most math majors will likely appreciate that it has a similar GUI to Matlab.
In any case, I think you’re way off your rocker on the concept of teaching R to Stat 201 students (I won’t even bring up 101 in this discussion). The fact is that Stat 201 is a survey-like introductory course, where emphasis should be placed on the actual material. To spend extra time (and use time that would otherwise be used for course instruction) trying to teach R to a large group of people who may not ever use it again seems unnecessary. Like I said, JMP is more than adequate for topics covered in 201, has a much more accessible GUI, and is consequently much better suited for an introductory class.
R assumes some familiarity with basic programming, which is unfair to ask of students, many of whom are neither Comp Sci nor Math majors.
2006-04-25 12:38:15
I couldn’t disagree more with David R’s assertions. A large portion of incoming econ grad students do not have any programming experience before arrival and it is central to our education.
Art Goldberger believed that you only understand statistics and econometrics after you have had to program it. There is something revelatory about placing the matrices next to one another. Suddenly, the abstract becomes tangible.
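To make the point about matrices concrete, here is a minimal sketch of ordinary least squares written out as beta = (X'X)^(-1) X'y. It is plain Python with no libraries, the helper names (`transpose`, `matmul`, `inverse_2x2`, `ols`) are invented for this illustration, and the data are made up; it is one way to do the exercise, not a claim about how any course or textbook does it.

```python
# Ordinary least squares "by hand" for one predictor plus an intercept:
#     beta = (X'X)^(-1) X'y
# All helper names are hypothetical and the data below are made up.

def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols]
            for row in a]

def inverse_2x2(m):
    """Invert a 2x2 matrix using the closed-form formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("X'X is singular; the predictor has no variation")
    return [[d / det, -b / det], [-c / det, a / det]]

def ols(x, y):
    """Return (intercept, slope) from the normal equations."""
    X = [[1.0, xi] for xi in x]   # design matrix: a column of ones, then x
    Y = [[yi] for yi in y]        # response as a column vector
    Xt = transpose(X)
    XtX = matmul(Xt, X)           # 2x2 matrix X'X
    XtY = matmul(Xt, Y)           # 2x1 vector X'y
    beta = matmul(inverse_2x2(XtX), XtY)
    return beta[0][0], beta[1][0]

# Made-up data lying exactly on the line y = 1 + 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]
intercept, slope = ols(x, y)
print("intercept:", intercept, "slope:", slope)
```

Writing `X`, `X'X`, and `X'y` as separate objects is the whole exercise: each line of `ols` corresponds to one piece of the formula.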
So much of what I learned in undergrad stats was completely useless. If anything, stats is a wonderful opportunity to introduce people to programming. Whether that is done in R or Matlab or Fortran or C or GAUSS, I don’t care. Everyone should graduate with some knowledge of computer programming just as everyone should graduate proficient in Microsoft Excel.
2006-04-25 12:39:01
This is a big topic which deserves a thread of its own. Two points:
1) There are JMP-like GUIs available for R. R Commander is one (which I have not used). Other efforts are underway. So, using R for STAT 201 does not necessarily mean forcing students to learn how to program.
2) But, even if it did, that would be a bug not a feature. But it depends on what your definition is of “actual material.” If we are talking proofs and definitions, then neither R nor any software is needed. To my mind, the key material in such a course is the skills needed to do simple data analysis, at the very least for a thesis in something like psychology. Even better are the skills to do such an analysis in the outside world.
These skills really need to and can involve working with actual data in a serious fashion. If I could teach my students at Harvard to do it in a semester, I see no reason why Williams professors cannot do the same.
It is an empirical question, though! Perhaps I can teach a section of 201 someday using R. We can then compare. Who knows what we might discover . . .
2006-04-25 12:58:22
Richard: I completely agree. If you intend to go on to Econ grad school, you should make it your business to become highly proficient using statistical software like R. The point I was trying to make was that, yes, using R (or S+ or SAS or SPSS or STATA) is very central to a deeper understanding of the material, but I think it isn’t quite appropriate for an introductory level course. This is why professors (to the best of my knowledge) continue to use JMP for 101 and 201, but my Stat 346 and 441 courses have both made extensive use of R. In 201, the emphasis is less on actual data manipulation than on learning to interpret data.
David: Anyone seriously interested in doing advanced data analysis for a thesis would likely need outside help after taking only 201 regardless of whether or not they learned R. Stat 346 (or Econ 255) is far better suited for this purpose. That said, Stat 201 is not intended to be the end-all for data analysis, and instructors likely take into consideration that their time is better spent teaching students actual concepts (as there are many introduced) than teaching them the intricacies of writing extensions in R.
The reason I don’t include Stat 101 in this discussion is because it is largely geared towards those less mathematically inclined. The topics are the same, but the treatment of the material is very different.
2006-04-25 16:06:47
A major complaint about Stat 201 is already that too much emphasis is placed on learning how to use JMP. People don’t like learning which menus to select when they’re never going to use the program again — which is true for 95% of people whether it’s JMP or R or S+. So making the program even more technical would be an extremely unpopular, and probably anti-educational, move. Stat 201 is about understanding the theory of statistics, and less about learning how to use programs to compute it.
2006-04-25 18:27:31
If we are teaching stats at Williams with theory removed from computation, then we have missed the pedagogical boat. Statistics is one of the most difficult theoretical systems to understand. Many economists avoid statistical theory like the plague because it is so bafflingly mysterious. The mystery only begins to dissolve with practice.
I think students should be forced to work through programming their own statistical routines until they are blue in the face. That is really the only way to “get it.” You get it by doing.
And really, 201 in my time wasn’t about theory, it was about using a calculator and then looking for t-values in the back of your textbook. It was the height of anti-educational because you just memorized what to add and multiply and then what table to use.
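For what it is worth, the calculator routine described above is only a few lines once it is written down. The following is a rough sketch of a one-sample t statistic, with a hypothetical function name (`one_sample_t`) and made-up numbers. It stops where the textbook table would take over, since the Python standard library has no t distribution.

```python
import math

def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: mean == mu0.

    Hypothetical helper for illustration only.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(sample) / n
    # Sample variance with the n - 1 denominator.
    variance = sum((v - mean) ** 2 for v in sample) / (n - 1)
    if variance == 0:
        raise ValueError("sample has no variation; t is undefined")
    standard_error = math.sqrt(variance / n)
    return (mean - mu0) / standard_error, n - 1

# Made-up data; the null hypothesis value 4.0 is also arbitrary.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
t_stat, df = one_sample_t(data, 4.0)
print("t =", round(t_stat, 3), "with", df, "degrees of freedom")
```

Comparing `t_stat` against a critical value still requires a table or a statistics library, but each step that was memorized (mean, variance, standard error, ratio) is now visible as a line of code.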
2006-04-25 23:20:56
David - your emphasis on skills smells very un-liberal-artsy, or at least it feels as if you think statistics is not an intellectual discipline of independent interest but merely exists to serve other subjects. At a liberal arts college, any statistics course, even an introductory one, should be primarily about statistics, not about using statistics for other subjects. I am all for a course which to some extent “deconstructs” statistics and shows in examples from other fields when the tools of statistics can or cannot be used because some assumptions do or do not hold, but this should be in the context of fostering a critical view of statistics and its place in the world.
I agree with Richard that the learning of statistics can be enhanced by computation. It is, however, quite a challenge to structure introductory courses so that they are courses about statistics, enhanced by the use of tools, rather than courses about using the tools, or courses where the half using the tools just gives students the idea that the half without the tools was pointless. Mathematicians have been struggling with the appropriate use of technology in first-year and sophomore-level courses for 15 years, and I haven’t really seen a well-implemented solution. (Actually, I think Prof. Johnson taught the diff. eq. class at Williams with open Mathematica exams while I was there, though I didn’t take the class. Maybe he has figured it out?) I suspect statistics as a field hasn’t really figured it out either.