Introduction to R

R does not stand for “Rishu”, as I made it out to be when I first heard of this data mining tool. Initially, I assumed R was yet another tool like Pentaho. But my assumptions fell apart when I clicked through to its website, which states its definition up front:

“R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment”

So there it is: R is a language, though to me it feels more like a data mining tool. If you have ever worked with MATLAB, the format and syntax will look familiar. What makes R special is how simple it makes complex mathematical queries and computations. Creating graphs and plots has never been easier. For example, take the image below (a screenshot of code I wrote):


The code is pretty simple. I assigned some values (as vectors) to two separate variables, “a” and “b”. The value of “b” is the square of “a”. As you can see, the computation is done with simple commands: I calculated the mean and variance of b using mean(b) and var(b). The variable “c_lm” holds the linear regression model relating b and a.
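In case the screenshot does not load, here is a rough sketch of what that kind of session might look like. The numbers in "a" are made up for illustration (the original values are not reproduced here), and I am assuming the model regresses b on a, written as lm(b ~ a).

```r
# Illustrative values only; not the ones from the original screenshot.
a <- c(1, 2, 3, 4, 5)

# Element-wise square of the vector a.
b <- a^2

# Mean and sample variance of b.
mean(b)
var(b)

# Linear regression of b on a (assumed direction of the model).
c_lm <- lm(b ~ a)

# Printing the model shows the fitted intercept and slope.
print(c_lm)
```

Typing summary(c_lm) afterwards gives more detail on the fit, such as residuals and R-squared.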

Well, there is loads more. People have gone ahead and created something like “Google Trends”. Google has its own GUI built over R, but nothing is stopping us from creating one as well.

Sources: Google Trends