I both like and dislike this article from Smithsonian.com today about an analysis of Twitter data.
I like the analysis because it adheres to the right principle about Big Data - namely, that simply collecting the data without analyzing it does nothing for you. The study authors did a lot of good analysis on place, language, time of day, distance between tweets and retweets, and so on. Done properly, that gives an interesting picture of the way humans communicate - early and often, as the saying goes about voting.
I like even better the added data the investigators brought into the examination. They studied the time of each tweet so they could know whether natural or artificial light was present when it was sent. Subtracting these and mapping the result shows either the lack of rural electrification or that Twitter is a banned political substance in some parts of the world (e.g. Iran and China). The great thing the investigators showed here is that data, properly analyzed, tells you a lot about things you were not expecting to see or even asking questions about. The result is self-evident and powerful once it is revealed. That is the essence of good Big Data.
What I don't like is what they could have done with the Twitter data but didn't. Certainly I care about the novel insights they can glean from the data. But in the end they never tried to look at what people were saying and what they meant - what fears and pains, what successes and joys, are all those millions and millions of people talking about? That's how we would have approached this endeavor here at Big Data Lens.
Maybe it's just drivel. But as with the discoveries about politics and energy, you can probably be sure there is something deep and powerful in all that human communication waiting to be found.