多努力分布 (multinoulli distribution)

Today a colleague of mine brought up an interesting mathematical statistics problem: what is the Jensen-Shannon divergence between two multinomial distributions? In the discussion, he mentioned that he had reduced the problem to looking at binomial distributions. I misheard it as Bernoulli distribution, and started wondering what the multinomial analogue of that is called. Surely it is not called the multinomial distribution, since the latter deals with n objects rather than 1.

Later I read about the Rademacher distribution, which is nothing but a rescaled version of the Bernoulli distribution. Outraged by the excessive naming in mathematics, I started looking into my earlier question.

According to Wikipedia, the correct generalizing nomenclature is categorical distribution, or multinoulli distribution. I had never heard of multinoulli before, but the etymology is self-explanatory and Bernoulli seems respectable enough to have a new word coined from his name. The most natural Chinese transliteration, 多努力, reads as “how much effort?”. In fact, if one restricts the multinomial to n = 1, then it is the same as the multinoulli distribution.

Back to the colleague’s problem: recall that the Jensen-Shannon divergence is defined by
$\mathrm{JSD}(u, v) = \frac{1}{2} \sum_i \left[ u_i \log \frac{2u_i}{u_i + v_i} + v_i \log \frac{2v_i}{u_i + v_i} \right]$. For $u = \mathrm{multinomial}(n, \alpha)$ and $v = \mathrm{multinomial}(m, \beta)$, JSD doesn’t make sense unless n = m, which we assume. $\alpha$ and $\beta$ are probability vectors that sum to 1, both of the same dimension k, since otherwise it again doesn’t make sense.

There are $\binom{n+k-1}{k-1}$ different points in the common state space, namely the vectors of k nonnegative counts that sum to n. We can simply calculate the probability under u and v of each point. Taking the case of $k=2$, we are dealing with the special case of the binomial distribution, whose state space has n + 1 points. Writing $\alpha$ and $\beta$ for the two success probabilities and using $u_i \log \frac{2u_i}{u_i + v_i} = u_i \log 2 - u_i \log(1 + v_i/u_i)$, the calculation eventually reduces to a sum of the form
$\sum_{i=0}^{n} \binom{n}{i} \alpha^i (1-\alpha)^{n-i} \log\left(1 + (\beta/\alpha)^i \left(\frac{1-\beta}{1-\alpha}\right)^{n-i}\right)$, together with its mirror image with $\alpha$ and $\beta$ swapped. Mathematica suggests that this is not reducible to closed form in terms of common special functions. I suspected that one could Taylor expand $\log(1 + \epsilon) = \epsilon - \epsilon^2/2 + \epsilon^3/3 - \ldots$ and get a term-wise closed form. It is true that each term in the resulting expansion is summable in closed form, but summing them together becomes just as difficult. In fact, for each n, Mathematica leaves me with n terms in this roundabout summation. So I am finally convinced that this problem has no analytic solution.
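Even without a closed form, the divergence is cheap to evaluate numerically in the binomial case. The following is only a minimal sketch, not anything my colleague or I actually ran: the function names are made up for illustration, the logarithm is natural, and the divergence is normalized with the conventional factor of 1/2 so that the value lies between 0 and log 2.

```python
from math import comb, log


def binomial_pmf(n, p):
    """Return [P(X = i) for i = 0..n] for X ~ Binomial(n, p)."""
    return [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]


def jsd(u, v):
    """Jensen-Shannon divergence of two pmfs on the same finite state space.

    Uses the natural log and the conventional 1/2 normalization.
    Terms with zero probability contribute nothing (0 * log 0 = 0).
    """
    total = 0.0
    for ui, vi in zip(u, v):
        m = ui + vi
        if ui > 0:
            total += ui * log(2 * ui / m)
        if vi > 0:
            total += vi * log(2 * vi / m)
    return 0.5 * total


def binomial_jsd(n, alpha, beta):
    """JSD between Binomial(n, alpha) and Binomial(n, beta), an O(n) sum."""
    return jsd(binomial_pmf(n, alpha), binomial_pmf(n, beta))


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, binomial_jsd(n, 0.3, 0.6))
```

The same `jsd` helper works for any two probability vectors of equal length, so the general multinomial case only needs an enumeration of the count vectors, at a cost that grows quickly with k.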

Posted in Uncategorized | Leave a comment

Reflections on a trip back to China

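I took advantage of my parental leave to go back to China for two weeks, and gained a lot from the trip. Here I will mainly talk about progress on the health front. You do not appreciate the importance of a sound constitution until you get sick, just as nobody cares about market order until a financial crisis hits. There are too many things a person has to look after. As a young man with a sense of social responsibility, I rarely indulge myself on purpose; mostly it is negligence or misplaced priorities.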

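I made a trip to Yunnan and also saw a traditional Chinese medicine doctor. Along the way I posted on WeChat about how dedicated doctors in China are, only to be mocked by a friend in the US, who said that Chinese people do not take exercise seriously. Perhaps we do exercise less than Europeans and Americans, but that is out of necessity. Think of office workers with a 1.5-hour commute each way every day, plus kids, plus occasional overtime: where would they find the time to exercise? I took a few doses of herbal medicine and saw no improvement. It seems exercise is the hard truth after all. Although I retorted sternly on WeChat by citing an 80% obesity rate in the US, I too started exercising once I got back to Shanghai.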

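I picked up basketball again, which I had played in high school, practicing shots alone on the campus court next to the dorms of students born after 1995. It brought on the familiar regret of a man past his youth, but my game improved quite a bit. I am less confident about my health. At the moment I am troubled by a cold and a cough. The stomach trouble seems to have eased somewhat, but some physical signs, such as the horizontal creases at the diaphragm and the exhaustion that follows overeating, remain as before; only the indigestion seems less pronounced. Perhaps I have reached the biological age at which one illness is traded for another. As for the cough, it is still triggered by cold air and fatigue. It seems slightly better, but that is quite possibly a short-term effect of the stimulation from intense exercise, just like two summers ago when I biked to and from work. My weak constitution has not improved.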

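Another benefit of outdoor basketball is sunbathing. I usually play from about 9 to 10:30 in the morning, which gives me a full hour and a half of sun. The sunlight in China is also less harsh, so I would guess the risk of skin cancer is perhaps half as high. Today I came across some folk remedies online that also claim sunning one’s back works wonders. Supposedly the longevity villages owe their reputation to sunshine too. I have seen the same view on Western medicine websites, which say that sunlight strengthens immunity and promotes calcium absorption. I tried it a little in the US before, with poor results, mainly because I worried too much about tanning, skin cancer and the like, and was afraid of the California sun. When I go back this time I may stick with sunscreen, limit my time in the sun, and bring the kid along, which would ease family tensions and keep me fit at once. As for exercise, moderation is still needed. After all, looking after the kid already takes a lot of physical effort. It is best to exercise moderately when the body calls for it, for example one or two games a week. Indoor sports should be avoided as much as possible. Muscle building and the like can wait until I have fully recovered.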

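Now that I am heading back to the US, my biggest worry is still how to meet the physical demands my family places on me. My kid refuses to go to school before 9, and cannot be forced. The key is to leave work early in the afternoon, so that I can take him to the park and get some sun along the way, or go straight home and let the nanny take over. Either beats being stuck in traffic. Leaving too early or too late are both bad; the trick is to time the departure well. The morning routine still needs long-term planning.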


Insurance policy

Only after my second child was born did I realize that I had been using the less economical insurance plan all these years. Even during years without major medical events, my family members, myself included, make hospital visits pretty frequently. The ideal plan in that situation is an EPO. But for the past year I have been using a highly subsidized PPO sponsored through my company, since it was the best advertised one and its deductible seemed to be essentially level with the EPO’s. What I did not understand is that for things like child delivery, the EPO charges a flat $250, as reported by the Chinese community, whereas the PPO plan accumulates a bill in excess of $20k, of which I still owe $2k after coinsurance. Unfortunately the details are in the fine print, and it is not in the insurance company’s best interest to make them transparent. Lesson learned. I will have to trust Chinese-language sources of information far more than the English ones, because the latter are cluttered with irrelevant details.


2.5 hour struggle with technology

I am not impressed with the user-friendliness (or rather user-hostility) of either of the two major cellphone makers. Last night I had to port the contact list from a Galaxy 5s to a newly bought iPhone 7+ for my mother-in-law (MIL); for the record, I would never buy a luxury good like that for myself. Initially the solution seemed straightforward. The conventional way was to set up a Google account (which my MIL did not have because of the restrictions in China), and export and import the contacts through it. It turned out that the Galaxy device would not let me use Gmail at all, possibly because it was configured for Chinese users, who have no legitimate use for Google products. This took me a while to discover and confirm: I tried installing the Gmail app from the built-in Samsung store, which then prompted me to add an email account, only to reject it because the Google Play Store was not installed. The latter turned out to be unavailable in the Samsung store, presumably because Samsung did not want Google to take its share of the mobile app market. Even an unsophisticated user like me can easily sniff out competition going awry at the expense of users in these design choices.

The next option was to use the SIM card as a physical medium of transfer. This again was a dead end: when I tapped the menu option on the iPhone 7+ that says to import contacts from the SIM card, I was instantly worm-holed back to the home screen without any explanation (or apology). Could this be a case of incompatibility with a foreign SIM card (previously used on a Chinese device)? I also tried switching the iPhone locale to en/us, as iPhones are notorious for incomplete feature implementation in secondary locales, but still had no luck. My complaints about localization bugs will be fodder for a later post. After a web search turned up no relevant results, I was briefly flummoxed.

The saving grace was the realization that the iPhone 7+ did carry a scanty few contacts from the old Galaxy phone. Initially I thought this was due to an incomplete export to the SIM card, but after swapping the SIM card between phones several times and consulting with my family members, I started looking at my MIL’s newly created Gmail account (which is inaccessible on the Galaxy). Then it became clear that those few contacts came from an earlier porting attempt by my wife. So a third solution emerged: load the contact list directly into the Gmail account, and hope that it will automatically sync with the iPhone.

The next episode simply proves the adage that bad things all come at once. First it took me a while to figure out how to access the local file system on the Android phone: there turned out to be an app just for that, fortunately already installed. It took me no time to locate the file storing the contact list. But how should I send it to other devices? Gmail was out of the question. This left me with basically only one option: WeChat. In a moment of unequivocal stupidity, I logged out of my MIL’s WeChat account, logged into mine, and sent the file as an attachment to myself. The goal was to retrieve the file on another mobile device or my MacBook so that it could eventually be uploaded to Gmail.

I then started checking my personal Android phone for the sent file, but it was about to run out of battery. I connected it to my MacBook Air and made sure that the battery charging mode was on (indeed the data transfer mode was no longer supported by the iTunes version on my MacBook Air, which was only 4 years old!). But the battery turned out to be really depleted at that point, despite the indicator showing 30% before shutting down. After a few failed attempts to reboot without instantly shutting down, I decided to plug it into a wall socket and simply wait.

Meanwhile, I had the ingenious idea of sending the file to my wife’s Android phone. It was no longer possible for me to log back into my MIL’s WeChat account since she had forgotten her username and password, and my wife, the only person knowledgeable in this matter, was upstairs breastfeeding or something and could not be disturbed. For about 10 minutes, I tried to use WeChat on my MacBook Air directly, only to find out that it required scanning a 2D barcode from a mobile device, which was out of battery at the moment. Even though I eventually succeeded in this regard, the sent file did not show up in any self-conversation tab, on either my phone or the laptop.

So finally I forwarded the file to my wife’s phone, and it appeared instantly on her device. Could that be a bug in WeChat regarding self-conversation? Only John von Neumann knows. The rest was a happy ending, though to be fair I could have spent those 2.5 hours babysitting my younger one or pretending to do some math in my head.


How much it costs to raise a kid

Today my wife made the comment that it is actually easier on the parents to send the kids to extracurricular classes than to have them stay home, despite the extra financial cost. So I was curious enough to do the following back-of-the-envelope calculation. Assume that we send one kid out for every working hour of the week, that is 40 hours a week and 52 weeks a year, for 18 years, at $50 an hour. This amounts to:
echo "40 * 52 * 18 * 50" | bc
1872000

That is a whopping 1.87 million dollars, something perhaps only the top 5% of this country can afford. And this is just one kid, and only weekday working hours. With weekend nannies, diapers, and other material costs, even if we lower the hourly rate to $25, I think the figure still easily exceeds $1 million. So how on earth can people in this country afford to have a kid, let alone multiple ones?
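To see how sensitive the estimate is to the hourly rate, here is a rough Python sketch of the same arithmetic. The function names are made up for illustration, and the 40 hours, 52 weeks, and 18 years are simply the assumptions used above, not data.

```python
WEEKS_PER_YEAR = 52
YEARS = 18


def childcare_cost(hourly_rate, hours_per_week=40):
    """Total cost of paid care over 18 years at a flat hourly rate."""
    return hours_per_week * WEEKS_PER_YEAR * YEARS * hourly_rate


def breakeven_hours_per_week(total, hourly_rate):
    """Weekly paid hours needed for the 18-year cost to reach `total`."""
    return total / (WEEKS_PER_YEAR * YEARS * hourly_rate)


if __name__ == "__main__":
    for rate in (50, 25):
        print(f"${rate}/hour: ${childcare_cost(rate):,}")
    print(breakeven_hours_per_week(1_000_000, 25))
```

At $25 an hour the 40 weekday hours alone come to $936,000, so a few extra paid hours per week, before counting diapers and other material costs, are enough to pass the $1 million mark.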


Doing research of any kind is insurmountably difficult

I have lived in the research world for a while now, 12 years to be precise. My journey has been an extremely inert one. There have been countless times when I thought I was onto something, and it turned out to be a fluke, a bug, or some other uninteresting outcome. In academia, I at least had the leisure of choosing the problems I wanted to pursue, some of which might not be at the center of the community spotlight and hence could yield to persistent effort. In industry, the competition is laid out in plain sight, and the metrics against which success is measured are few. The competition comes not only from contemporary peers but also from accumulated historical knowledge, which is true in academic settings as well. What is more frustrating is that one often gets committed to a seemingly no-brainer project, only to find out later that it is a hole from which one can never crawl out in a wholesome way. This is the key difference between academic and industrial pursuits, although in the former one sometimes has a coauthor’s trust at stake too.

In any event, I have been stuck in such a hole for the better part of 4 months. The goal is not even very lofty: a mere refactoring and space-saving gimmick that does not even qualify as a new idea. It turned out, however, that all those savings come at a cost, namely the metrics are going down, despite all kinds of variations I have tried. Being an honest person, I do not wish to rely on the mercy of the team to launch the project. However, it is also distasteful to let it go to waste, since another colleague has been with me throughout this “wonderful” journey and I have a responsibility not to let him down. While many other folks are anxiously waiting for this bottleneck project to settle down, I continue to bang my head against the wall, especially given how slowly things move within our organization. This may be the most opportune time to fuss about work.

So then I thought about Abraham Lincoln, and how he overcame a seemingly insurmountable amount of personal and political difficulty, only to be shot dead in the end. But the beautiful part of his story is that he carried all such weight with a smile of grace. This I shall emulate, and plod along with animalistic persistence despite ever dwindling peer respect for my intelligence, prospects for promotion, and the opportunity to change the world and shit before I succumb to natural decay. Eventually the organization will figure out the right place for me to grow or rot, and all I should care about is the next local optimum to pursue.


Reading, innovation, and meaning of life

While staying home on paternity leave, I had more time to ponder the meaning of life, away from my hectic programming day job. This was coupled with my grandma’s accidental fall in the bathroom, and the less than optimistic prognosis that her ribs were fractured and her heart and lungs got infected as a result. I think even God appeared to me one night to give consolation, since this is a justifiably depressing time, despite how smoothly things have gone with the newborn. In any case, reflecting on my first 9 months at my current job, one trap I repeatedly fell into was that deep down, I wanted to innovate and make big news so badly that I lost sight of the lifelong pursuit of learning. As a programmer, there are many ways to absorb old and new technology. The whole industry revolves around making learning more accessible to both insiders and outsiders. Maybe the abundance of resources pushed me to the other extreme, completely shutting my brain off from learning and focusing on continuous philosophizing and hypothesis testing. This break allowed me to recognize this as a critical vice that would hinder my long-term productivity.

So having regained some intellectual energy, I revisited a branch of mathematics that I detested as a graduate student, namely analytic number theory, partially motivated by Terence Tao’s most recent blog post on the Bombieri heuristic. Yesterday I managed to get a systematic education on the Möbius function. Today I started reading his earlier post on Goldston-Pintz-Yildirim, Motohashi-Pintz, and Yitang Zhang’s result. I got tripped up by a seemingly innocent estimate, that the Hardy-Littlewood constant relevant for the prime constellation conjecture is bounded away from 0. It turned out to be an elementary consequence of the Prime Number Theorem, which I have always held in awe and never dared to apply to real questions of interest. Going through the whole post looks both rewarding and challenging, but I have set my mind to do so, and hopefully will come up with some follow-up learning items. After all, number theory is an exact science and a mediocre mind like mine should still be able to penetrate it, given enough volition. Hopefully I will then find some common ground with past grad school friends and borrow analytic ideas to solve my own problems in Lie theory and probability. Thank you God for the latest revelation.
