a mixture of five
Assorted answers to this week's problem set, as Jupyter Notebook pages for download:
Kate: Calls the super-useful command line program wget to download the data file from the network. Uses Pandas to parse it. Great comments and explanation throughout, including some side notes on how to use powerful numpy vectorized operations, if you dare.
Kevin: Includes a full negative binomial mixture estimation, fitting not just the means but also the dispersions of the clusters, and showing how you'd do that.
Sean: My version is verbose, with lots of notes. I broke a lot of stuff out into separate functions so I could talk about what each bit is doing, and so I could make the correspondence between the steps of k-means and the steps of fitting a mixture model by EM as obvious as possible.
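
To give a flavor of that k-means/EM correspondence without opening a notebook, here is a minimal sketch. It is not taken from any of the notebooks above. It swaps in a 1-D Gaussian mixture with a fixed, shared standard deviation as a simplifying assumption, and the function names kmeans_step and em_step are made up for illustration. The point is only that the hard assignment step in k-means lines up with the soft E step, and the centroid update lines up with the M step.

```python
import numpy as np

def kmeans_step(x, mu):
    """One k-means iteration on 1-D data (illustrative helper, not from the notebooks)."""
    # Assignment step: each point goes to its closest centroid (a hard 0/1 choice).
    dist = np.abs(x[:, None] - mu[None, :])            # shape (N, K)
    assign = np.argmin(dist, axis=1)
    # Update step: each centroid becomes the mean of the points assigned to it.
    new_mu = mu.astype(float)                           # astype returns a copy
    for k in range(len(mu)):
        if np.any(assign == k):                         # leave empty clusters where they are
            new_mu[k] = x[assign == k].mean()
    return new_mu, assign

def em_step(x, mu, pi, sigma=1.0):
    """One EM iteration for a 1-D Gaussian mixture; sigma is assumed fixed and shared."""
    # E step: posterior probability of each component for each point (a soft assignment).
    # The Gaussian normalization constant is the same for every component, so it cancels.
    log_p = np.log(pi)[None, :] - 0.5 * ((x[:, None] - mu[None, :]) / sigma) ** 2
    log_p -= log_p.max(axis=1, keepdims=True)           # stabilize before exponentiating
    resp = np.exp(log_p)
    resp /= resp.sum(axis=1, keepdims=True)             # each row now sums to 1
    # M step: posterior-weighted means and mixing proportions.
    nk = resp.sum(axis=0)                               # expected number of points per component
    new_mu = (resp * x[:, None]).sum(axis=0) / nk
    new_pi = nk / len(x)
    return new_mu, new_pi, resp
```

Calling either function in a loop until the means stop changing gives the full algorithm. For the negative binomial mixture in this problem set, the E step would use negative binomial log likelihoods in place of the squared-distance term, and the M step would update whatever parameters you chose to fit.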