- If you are on the job market, Tal Galili from R-bloggers has compiled 3 new R jobs for seekers like you.
- Text mining is currently a live issue in data analysis. Enormous text data resources on the Internet have made it an important component of the Big Data world. If text mining is something that you need to do for your job, you should read Text mining in R – Automatic categorization of Wikipedia articles.
- Randy Olson, a PhD student in Michigan State University’s Computer Science program, studies the percentage of undergraduate degrees conferred on men in the USA and publishes his findings in a blog post titled The double-edged sword of gender equality.
- Who will win the World Cup? See what statisticians say.
- Earlier this month, the results of the 15th annual KDnuggets Software Poll were released, and R’s popularity continues to grow. See Revolution Analytics’ new post for details.
- And finally, Xi’an discusses a new paper by Simon Barthelmé and Nicolas Chopin called The Poisson transform for unnormalised statistical models.

22

Jun 14

## The week in stats (June 23rd edition)

16

Jun 14

## The week in stats (June 16th edition)

- Writing functions is an important part of programming, and in order to write proper functions you need to know how to debug when your functions aren’t working. Slawa Rokicki, PhD student at Harvard, explains How to write and debug an R function.
- It is often said that you should avoid loops in R because R is extremely slow with iterations, and hence many R programmers try to avoid loops by working with matrices and arrays. Did you know that an even better option is to run your loops in C++ and import your result back into R? Here is a quick tutorial on how you can use C++ within R.
- Rasmus Bååth blogs about The Most Comprehensive Review of Comic Books Teaching Statistics.
- Did you know that more and more startups are starting to use R as their primary data analysis tool? According to Revolution Analytics, Uber and CultureAmp have just joined the R camp.
- Xi’an reviews a new paper called Generalizations related to hypothesis testing with the Posterior distribution of the Likelihood Ratio by Smith and Ferrari.
- And finally, DiffusePrioR writes “If history can tell us anything about the World Cup, it’s that the host nation has an advantage of all other teams”. Do you agree or disagree, and what do you think is Brazil’s chance of winning the World Cup?

http://bit.ly/SHnlXH

15

Jun 14

## The week in stats (June 9th edition)

- Like the plots above? Learn how to create these in R from Freakonometrics’ new post called Box plot, Fisher’s style.
- If you are on the job market, Tal Galili from R-bloggers has compiled 6 new R jobs for seekers like you.
- Big Data has gained lots of popularity recently, and every data scientist should know at least something about it. If you are new to data science, consider this introduction to R for Big Data with PivotalR.
- Using Repeated Measures to Remove Artifacts from Longitudinal Data by Dmitry Grapov.
- And finally, Andrew Gelman discusses Why we hate stepwise regression.

12

Jun 14

## A new way to visualize content

Right now I’m working on a project that involves new ways to view units of content and the relationships between them. I’ve posted the comic I worked on; it has a number of stats references throughout. The software is in early alpha, so you may run into issues. To see the relationships, go to the puffball menu and make sure that “Show relationships” is selected.

26

May 14

## The week in stats (May 26th edition)

- Alvaro Galindo reviews Social Media Mining with R by Nathan Danneman and Richard Heimann.
- Some popular articles on R tips and tricks are: R has some sharp corners by Win-Vector LLC, Sample uniformly within a fixed radius by Forester (Assistant Professor at the University of Minnesota Twin Cities), The Birthday Simulation by Wes Stevenson, and didYouMean() Function: Using Google to correct errors in Strings by Sam Weiss.
- R-bloggers compiles a list of R-related positions for those who are on the job market.
- Xi’an discusses a special issue of Statistical Science named Big Bayes Stories: A Collection of Vignettes.
- Last week, we featured an article on R vs. Julia. This week, Matloff (aka Mad (Data) Scientist) writes another comparison called R beats Python! R beats Julia! Anyone else wanna challenge R?

19

May 14

## The week in stats (May 19th edition)

- Are you a self-taught “scientist programmer”? Here is why people think code written by people like you is ugly.
- As always, R articles are extremely popular. This week, we have: Facebook teaches you exploratory data analysis with R by Revolution Analytics, Beyond R, or on the Hunt for New Tools by Quintuitive, Bootstrap Criticism (with example) by Eran Raviv, The apply command 101 by Learning R by Imitation, and Can We do Better than R-squared? by Learning as You Go.
- Julia is a new programming language (only 2 years old) for scientific computing and it has gained lots of popularity recently. In the past, we shared some articles comparing R and Julia. This week, Alvaro Galindo writes another comparison called Julia versus R – Playing around.
- Sébastien Bubeck, assistant professor at Princeton, releases the first draft of his monograph, based on some old lecture notes, called Theory of Convex Optimization for Machine Learning.
- And finally, happy Victoria Day to those in Canada!

12

May 14

## The week in stats (May 12th edition)

- Looking for a job? Here are some jobs compiled by R-bloggers that may be of interest to you.
- Homer White, professor of mathematics at Georgetown College, shares his Five Reasons to Teach Elementary Statistics With R.
- Seven R Quirks That Will Drive You Nutty.
- Some popular R articles this week are: how to build a sales dashboard with R, Optimising your R code, and Modelling seasonal data with GAMs.
- And finally, Xi’an discusses bridging the gap between machine learning and statistics.

05

May 14

## The week in stats (May 5th edition)

- Popular R articles this week are: colormap by Dan Kelley (Professor of Oceanography at Dalhousie University), The new look of learning R by DataCamp, Writing an R package from scratch by Hilary Parker (Data Analyst at Etsy), Test coverage of the 10 most downloaded R packages by Quartz Bio, How to Code Something ‘New’ in R by Francis Smart (PhD student at Michigan State University) and Reading large data tables in R by Fabio Marroni.
- If you roll a fair die 6 times, what is the probability that there is at least one pair of identical consecutive face values?
- Great news! The RSS is setting a data analysis challenge this year. The top three teams will be invited to present their results in a special session at the RSS Annual Conference in September 2014, and submissions will be considered for publication in the Journal of the Royal Statistical Society, Series C. If you are interested, here are the details.
- And finally, do you fly frequently? If so, you may want to know how to Automatically Scrape Flight Ticket Data Using R and Phantomjs.
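
The die question above can be checked with a short calculation. The following is a minimal sketch in Python, not taken from any of the linked posts, and it assumes the rolls are independent and the die is fair. It uses the complement: the first roll is free, and each of the remaining rolls must differ from the one before it, which happens with probability 5/6. The function names are hypothetical, and a brute-force enumeration is included as a cross-check.

```python
from fractions import Fraction
from itertools import product


def prob_adjacent_repeat(rolls=6, sides=6):
    """Exact probability of at least one pair of equal consecutive rolls.

    Complement rule: each roll after the first avoids repeating its
    predecessor with probability (sides - 1) / sides.
    """
    return 1 - Fraction(sides - 1, sides) ** (rolls - 1)


def prob_adjacent_repeat_brute(rolls=6, sides=6):
    """Same probability, found by enumerating every possible sequence."""
    hits = sum(
        any(a == b for a, b in zip(seq, seq[1:]))
        for seq in product(range(sides), repeat=rolls)
    )
    return Fraction(hits, sides ** rolls)


if __name__ == "__main__":
    exact = prob_adjacent_repeat()
    print(exact, float(exact))  # 4651/7776, roughly 0.598
    print(exact == prob_adjacent_repeat_brute())
```

Under these assumptions the answer is 1 - (5/6)^5 = 4651/7776, a little under 60%.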

28

Apr 14

## The week in stats (April 28th edition)

- Two pieces of interesting data visualization work attracted some attention this week: How Americans Die by Matthew Klein of Bloomberg Visual Data and The Music America’s Listening To by Chris Kolmar of Movoto Blog.
- Popular R articles of the week are: Testing for Linear Separability with Linear Programming in R by Raffael Vogler, Twitter Extraction by Ethan Fosse, Simpson’s Paradox Is Back by Mad (Data) Scientist, and Object Oriented Programming with R: An example with a Cournot duopoly by Bruno Rodrigues.
- Have you ever tried Julia or considered adopting it? Econometrics by Simulation reviews Julia from an R user’s perspective for those who are interested in learning this programming language.
- Rapport summarizes some key metrics about the popularity of R, such as the number of R Foundation members per country, and presents the findings in a report called R activity around the world.
- And finally, Why are R users so damn Stingy?!

21

Apr 14

## The week in stats (April 21st edition)

- Do you know anything about the Hilbert Matrix (other than it is probably named after David Hilbert)? In his post this week, Nicholas Horton, Professor of Statistics at Amherst College, explains what it is, and how to create these matrices using both SAS and R.
- Xi’an discusses a new paper by Randal Douc, Florian Maire, and Jimmy Olsson called MCMC for sampling from mixture models.
- Some popular statistical articles this week are: Modeling Data With Functional Programming In R by Cartesian Faith, Make your ggplots shareable, collaborative, and with D3 by Matt Sundquist, Implementing a Principal Component Analysis (PCA) by Sebastian Raschka (for Python), and Ordering Datasets Alphabetically by geomorph.
- And finally, have you ever tried the popular mobile game 2048? If not, here is some code that you can run on your machine to start playing the game with R.
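
For readers curious about the Hilbert matrix mentioned above, its standard definition is that the entry in row i and column j (counting from 1) equals 1/(i + j - 1). The following is a minimal sketch in Python rather than the SAS or R code from the linked post. The function name `hilbert` is a hypothetical choice, and exact fractions are used so that no rounding creeps in.

```python
from fractions import Fraction


def hilbert(n):
    """Return the n-by-n Hilbert matrix as a list of rows of exact fractions.

    Entry (i, j), with i and j counted from 1, is 1 / (i + j - 1).
    """
    return [
        [Fraction(1, i + j - 1) for j in range(1, n + 1)]
        for i in range(1, n + 1)
    ]


if __name__ == "__main__":
    for row in hilbert(3):
        print([str(x) for x in row])
```

The matrix is symmetric, and its entries shrink toward the bottom-right corner, which is part of why it is a classic example of an ill-conditioned matrix.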