Quotations are often used to back up claims and lend credibility to a person's views on a topic. Quotes are very popular in newspaper columns and presentations, where they clarify or reinforce the main points and strengthen the arguments. I am also a big fan of quotes and have used them in every chapter of my Master's thesis and doctoral dissertation. Ever since DataMarket and even LinkedIn came up with their lists of quotes, I have planned to publish some of my favorites that were missing from those two lists. So, here they are. Because of my background in machine learning and data mining, the list may be biased in that direction.
Science these days has basically turned into a data management problem.
The purpose of models is not to fit the data but to sharpen the questions.
Although we often hear that data speak for themselves, their voices can be soft and sly.
Data does not equal information; information does not equal knowledge; and, most importantly of all, knowledge does not equal wisdom. We have oceans of data, rivers of information, small puddles of knowledge, and the odd drop of wisdom.
With too little data, you won’t be able to make any conclusions that you trust. With loads of data you will find relationships that aren’t real... Big data isn’t about bits, it’s about talent.
All models are wrong, but some are useful.
Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise.
Statisticians, like artists, have the bad habit of falling in love with their models.
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
Prediction is very difficult, especially about the future.
Statistical thinking will one day be as necessary a qualification for efficient citizenship as the ability to read and write.
If you torture the data enough, nature will always confess.
Conducting data analysis is like drinking a fine wine. It is important to swirl and sniff the wine, to unpack the complex bouquet and to appreciate the experience. Gulping the wine doesn’t work.
We are drowning in information and starving for knowledge.
Data do not speak for themselves; they need context, and they need skeptical evaluation.
Data is the sword of the 21st century, those who wield it well, the Samurai.
With three constants, I can fit a dog. With four, I can make it bark.
All information looks like noise until you break the code.
It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.
There are two things you are better off not watching in the making: sausages and econometric estimates.
Data analysis is simply a dialogue with the data.
Data is the new oil? No: Data is the new soil.