Inspired by a quote

A quote by Albert Einstein caught my attention today. In my poor paraphrase, it says that a person should be judged not by how much he receives from the world, but by how much he contributes to it. This is very much akin to JFK’s famous inaugural punchline about American citizens. I am not sure whether any plagiarism occurred, but the idea was evidently important enough to be repeated by two intellectual giants. What’s amazing is how trivial the line sounds and yet how hard it is to put into daily practice. Indeed, society judges us largely by our intake, which is viewed as the most objective standard of success. Intake can take the form of capital, power, and, increasingly, the number of publications held up as trophies. As material success is increasingly trivialized by the modern boom in agriculture and technology, it is important to revisit the wisdom of giving. People like Alexandre Dumas, who donated their resources copiously, did not live like paupers. Quite the contrary, they possessed the equanimity of a king. Similarly, those who work for non-profits or contribute to open source, and not merely for the sake of fame or karma, radiate great humanity without sacrificing the excitement and versatility of their lives. A friend once mentioned that in order to gain energy, one must first expend it. Perhaps the feeling of richness follows the same principle.

So without further ado, I will share three hours’ worth of coding frustration, along with the eventual solution and some personal philosophizing that will hopefully teach me and the reader how to generalize to future situations. After all, the race against time is always on, weekends or Christmas.

Pig, as a dataflow programming language, is aptly named for its multitude of despicable idiosyncrasies. I have been using the following control flow idiom a lot:
A = foreach ( group B by c ) generate flatten ( ( IsEmpty(B.d) ? {('[]')} : B.d )) as d;

In the past this has worked 99% of the time. Not today, after I refactored some code.

The hard part about debugging a script written in a specialized language such as Pig is that you never know for sure whether it’s your problem or the parser’s problem. While Python has its share of occasional misses, one should expect things to go sour much more frequently in Pig. It turns out all I had to do was replace {('[]')} with
TOBAG(TOTUPLE('[]')). The '[]' part is simply the string I wanted to output when the field is empty.
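For reference, here is a minimal sketch of how the full statement might look with the fix applied. The LOAD line, the file name and the schema are my own assumptions for illustration; only the foreach statement reflects the actual change.

```pig
-- Hypothetical input: the path 'input.tsv' and the chararray schema are assumed.
B = LOAD 'input.tsv' AS (c:chararray, d:chararray);

-- Same idiom as before, but the fallback bag is built with TOBAG(TOTUPLE(...))
-- instead of the bag constant {('[]')}.
A = FOREACH (GROUP B BY c) GENERATE
        FLATTEN((IsEmpty(B.d) ? TOBAG(TOTUPLE('[]')) : B.d)) AS d;

DUMP A;
```

Both branches of the conditional evaluate to a bag, so flatten sees the same kind of value either way; the only difference from the original line is how the one-tuple fallback bag is constructed.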

The main lesson learned from this is that I should try the obvious fix first, rather than eyeballing the code for other potential bugs of my own. The reason I didn’t was that I overestimated the effort required to recall the syntax for TOBAG and TOTUPLE. Instead, I should make a fair estimate of the effort involved and try the right thing first, rather than the safe or easy thing.


About aquazorcarson

math PhD at Stanford, studying probability