Quote of the Day
I can't mate in captivity.
- Gloria Steinem on why she never married
I have been fortunate in having worked for companies that have had enlightened education policies, which I have always taken advantage of. These policies allowed me to take training in very diverse areas, with one exception – statistics.
Over my entire career, I have found myself lacking in statistical knowledge, and I have found that training difficult to acquire without going on-campus, which is not an option for a person who travels as much as I do. My staff expresses the same frustration. For example, I was discussing some data analysis yesterday with one of our physicists and he commented that he wished he had taken a Design of Experiments (DoE) course instead of Complex Analysis when he was at university. Nowadays, more and more people find that synthetic data generation can fill some gaps when real data is scarce, but without a core understanding of statistics, little will change in the long term in how that data gets analyzed. Having both would likely be ideal for many people in this field.
I have done all that I could to remedy my personal situation:
- Followed statistics blogs like "Learn and Teach Statistics and Operations Research" with Dr. Nic
- Took engineering courses that were heavily statistics oriented (e.g. digital signal processing)
- Attended seminars from sites like The Analysis Factor
- Watched online videos on statistics-related topics (e.g. the Econometrics Academy is excellent)
This training has been useful, but I feel that I need a broader selection of courses.
Starting late last year, I have been taking classes from Statistics.com and this site may be the solution to my problem – they have an enormous range of classes. I am now taking my third class from this site and I have been very impressed with what I have been able to learn there. I have three more classes that I am planning to take there this year. My focus is on DoE – the limited DoE knowledge that I have acquired recently has changed how I work. I have also been learning about bootstrap and resampling methods.
Here is a summary of my experience with their courses:
- Most courses are four weeks long and very focused on applications.
The four-week course length is both a strength and a weakness. The four-week structure means that Statistics.com has numerous courses and you can pick very specific topics to learn. The four-week structure and a desire to keep the prerequisites limited mean that the courses do not have much theory. I would like a bit more theory because it really aids your ability to apply your knowledge in new situations.
- The instructors often are the authors of the textbooks that they are using.
Again, a strength and a weakness. The strength is that the instructors have thought long and hard about how to present the material and they know their books well. The weakness is that some textbooks are just not very good.
- The instructors have been very available and willing to help when you have questions.
The student forums are the primary communication vehicle. The instructors give very timely feedback to your questions and I have never had a communication problem.
Personally, I love using forums and I believe that they are a powerful way to communicate small bits of information. They are not so good for big pieces of information, like theoretical discussions.
Taking online classes can be a problem for many people because you must make an effort to reach out for help when you do not understand something. For some, reaching out is complicated by the fact that you must communicate your concerns in writing, which is difficult for many. Some students are embarrassed to have their "stupid" or "obvious" questions posted to a forum. My management experience has taught me to always ask "obvious" questions because I frequently receive answers that I do not expect. To successfully use online classes, you must be an active participant.
All my training for the last twenty years has been online and this has forced me to change my approach as a student. In my youth and early adulthood, I was the quiet guy in the back of the class who never said anything. I was so bad that I had one high-school teacher berate me in front of the whole class for my lack of participation – I believe he referred to me as "classroom ballast". If I had a boring class, I just blamed the teacher or the textbook or the subject matter.
My approach today is radically different – I have a responsibility to myself and the other students to actively participate. This not only makes the course more enjoyable for me, but it helps the other students and it motivates the instructor. I often receive comments from the other students on how much they learn from my questions. The instructors comment that they appreciate when I bring up real-world statistical issues that they can help solve.
My approach to participating in an online class is simple:
- If a topic does not seem worthwhile, I ask how this topic is relevant.
- If a topic is unclear, I seek clarification.
- If I can see a connection or analogy with another topic, I mention that in the forum and ask for comment.
- If I have an application for a technique, I write it up and show how I applied it (the instructors love this).
I will report on how these classes go in a later blog post. At this point, I am excited about the learning experience ahead.
Interesting. I just started an EdX course on stats; we'll see how it goes. It's a subject that I'd like to understand better, because I know it's often misused. Unfortunately my one stats class back in university was long on equations and short on understanding when, where, and how we would use the equations to gain which understanding from the data.
Of course, I also don't know which topics within statistics would be most useful to me, which is why the one I'm taking is an intro level course.
I have not used Statistics.com, but from skimming the website, it looks like you can choose your stats language from among the common statistics packages, including R.
If you have never used the stats language R, given your math focus and website, I would recommend you do the R-focused Statistics.com courses.
For statistical modeling and even general mathematical manipulations, R is quite amazing and is completely open source. Like many open source projects, it has a fairly fanatical following! The development team includes academic statisticians and even John Chambers, one of the original developers of the S language at Bell Labs. I used to subscribe to the S list, which morphed into the R list as R replaced Splus. Splus was the commercial version of the Bell Labs S language.
R is closer in style to Matlab than it is to your favorite, Mathcad. The basic syntax of R, S, and Splus is pretty close. Splus had a nice Windows GUI.
The two books I owned and can recommend are Statistical Models in S by Chambers and Hastie, and Modern Applied Statistics with S ("MASS") by Venables and Ripley.
There are tons of free online resources now, so the books may not be as valuable as they were many years back. The latest edition of the "MASS" book does cover R, I believe, and the authors were/are part of the development group.
There is a statistician who keeps stats (!) on the growth of R and other statistical packages. He uses academic citations as a proxy for usage and then plots growth over time. This has been interesting to watch over the years, as R has been rapidly overtaking the commercial giants.
http://r4stats.com/2014/08/20/r-passes-spss-in-scholarly-use-stata-growing-rapidly/
R has been growing in popularity for data mining at Google and elsewhere, and is even creeping in for general business use. Some business-oriented publications are even trying to persuade Excel users to move to R. The R project has adopted Google's R programming style guide suggestions: http://google-styleguide.googlecode.com/svn/trunk/Rguide.xml
R is really not suited for very large volumes of data because of the way it handles and stores the data, but there are a number of projects to improve this so that it can be used on much larger quantities of data and on faster parallel processors.
Since writing that post on my statistics education, I have continued to take courses at Statistics.com. During this period, I have developed a real appreciation of R. I tried Minitab and Statistica, but R has become my statistical tool of choice. I certainly see the shortcomings of Excel for statistical work.
Thanks for the comment.
mathscinotes