The third iteration of Daphne Koller’s Probabilistic Graphical Models class has just started on Coursera. I’d been meaning to take this course since it was first offered, but found the required time commitment (15-20 hours a week) infeasible alongside all my other undergraduate classes. However, with my more flexible schedule as a graduate student, and with others in the lab interested in learning about PGMs, I hope to have sufficient motivation to complete the course this time. Probabilistic graphical models have become an extremely powerful tool in many domains. Within cognitive science they have been cropping up all over the place, so this course is a good excuse to take the time to learn about them deeply!

One great resource for cognitive modelling papers that I’ve recently discovered is Gary Cottrell’s collection of Cognitive Modeling Greatest Hits. The description on the page reads:

This is a list of cognitive modeling papers solicited from a wide range of cognitive modelers, by asking them the following: “I wonder if you would do me the honor of sending me a list of your top 2-5 favorite cognitive modeling papers. I would expect that 1-3 of these would be your papers, and 1-3 would be someone else’s. I am looking for papers where someone really nailed the phenomenon, whatever it is. I would lean towards more recent papers, but oldies but goodies are ok too.”

I’ve only read a fraction of the papers on the list, but among those are some of my favourites. There are also a large number of articles that look quite interesting that I’ve never come across before. All up, I think it’s a great list: it highlights historical papers whose ideas are still relevant to cognitive scientists today, and it identifies interesting ideas that have emerged in the last 5 years or so. To get up to speed with the literature on cognitive modelling, I’m going to attempt to read as many of these papers as I can in the coming months, so expect numerous posts discussing the finer aspects of cognitive models!

While I’m constantly reading journal papers and other books, I’ve never had much success with reading textbooks. This is partly because textbooks seem to require a much higher sustained effort to get through, especially anything technical. However, last month I found the time to read through Machine Learning for Hackers by Drew Conway and John Myles White. Much of the modeling work done in cognitive science and mathematical psychology stems from work in machine learning and statistics, not only in the mathematical aspects, but also in the applied aspects of data analysis and modeling. I figured a book like this would be a good introduction to many of these different areas.

What I liked

For someone like me, coming from a programming background and knowing a little bit of R and a little bit of statistics, Machine Learning for Hackers is an excellent introduction to practical machine learning. Overall, it’s written in an easy-to-read style, with a number of motivating examples for various machine learning algorithms. As somebody who has recently shifted to using R for most of my daily programming, I found the book a useful resource for picking up various R idioms, as well as a solid introduction to a variety of R packages I’ve started using on a regular basis, such as plyr and ggplot2. The book also exemplifies numerous good practices for practical machine learning, such as visualizing your data, using cross-validation and regularisation, and comparing models.

One thing which may turn people off from the book is the sheer amount of time and code spent preprocessing the data into a form that can be used by machine learning algorithms, rather than actually playing with the machine learning algorithms themselves. However, this level of preprocessing of their data is on par with how much I do in my own work before I start having fun with my models, so I feel accurately conveying how much data munging is required is a positive in my book.

What I didn’t like

I’m not a huge fan of the title; I think the book would be more appropriately named ‘Machine Learning in R’. In order to get the most out of this book, readers should definitely have some previous experience with R. The unhappy reviews on Amazon suggest that a number of people were roped in by the title, but were then disappointed as they tripped up more on R’s syntactic quirks than on the real material in the book. A related issue is that because a lot of the book relies on R-specific packages, many of the code examples would not be as beneficial to readers working in other languages. Throughout the book, I also felt that the written explanations behind some of the algorithms did not provide a good intuition of why things behaved the way they did, and I would’ve liked to have seen some equations. However, I can sympathize with the authors in not including any heavy math in a book geared towards practical machine learning. Additionally, there is one unfortunate chapter in the book where they attempt to describe social networks with Twitter using the Google SocialGraph API. As the book went to print, Google decided to close down its SocialGraph API, rendering the code examples of the chapter useless. However, the ideas behind the chapter are very interesting and I’d love to see a different implementation of it someday.


Despite its shortcomings, I would heartily recommend this book to anybody who knows a little bit of R and would like to learn more about practical machine learning. It has provided me with a stepping stone to try my luck in Kaggle competitions, as well as to dive deeper into machine learning.

Jan 13, 2013

Suppose you are with a native in a foreign country where you do not speak the language. All of a sudden, a rabbit passes by, and the native utters the word “gavagai”. One inference you could make is that “gavagai” means rabbit, but as the philosopher Quine points out, “gavagai” could have an infinite number of other meanings, such as “Let’s go hunting” or “There will be a storm tonight”. Why would “rabbit” be the correct inference to make in this case? This is but one example of where our minds must make inferences from a very limited amount of information. Yet our minds are continuously making such inferences, such as determining the correct grammar when learning language, recognizing objects in our visual space, or generalizing unseen objects into categories.

How do our minds learn these remarkable feats of cognition? If we can discover these underlying principles, would we be able to build more intelligent machines? Starting in March, I’ll be undertaking a Ph.D. in computational cognitive science to (hopefully!) shed some light on some of these problems. The hope is this blog will serve as a public repository for my thoughts on this endeavour, as well as an attempt to make cognitive science research more open and to publicize research in the field. Stay tuned!