This blog is being migrated to suhasmathur.com. I will no longer be posting new articles here.
Buyers and sellers of a good or service set the price by expressing how much they are willing to buy or sell for. When a very large number of buyers and sellers interact in this way, the final price of the good or service settles to one that incorporates all this information. However, in the healthcare market, this mechanism is absent.
When you are sick and you visit a doctor, you do not get to ask up front what you will be charged for seeing the doctor! Instead, you present your health insurance card to the receptionist, thereby disclosing information about your ability to pay, and creating information assymetry. The other party (i.e. the healthcare provider) can now select a price , taking into account the information you have just disclosed, and that too, after you have made use of their services, rather than before. This is akin to getting to know the price of a haircut in a barber shop, after you have had the haircut. There is no price discovery mechanism in the healthcare industry in US. I recommend:
Ostensibly, providers check a patient’s insurance card in order to make sure that the patient can pay for the services she is about to purchase. Instead, this function should be done by a ‘lookup’ over the internet on a server run by a government agency: Input = ‘name of the medical procedure’, Output = ‘Covered or not’. This will prevent the provider from learning about the extent of coverages available to the patient. Not learning about this will force providers to more honestly assess fees, and compete with each other more realistically, bringing down costs borne by the patients. I think this will also prevent providers from passing on large unnecessary fees to insurance companies.
A formal mechanism design formulation of the problem would be interesting to tackle. 4 players: patients, providers, insurance companies, government.
Conventional wisdom says that the marginal gains accrued per unit increase in effort are diminishing. In some cases this does not hold, and the gains seem to be super-linear, rather than sublinear in the amount of effort.
Consider for a moment that you are driving in your car, let’s call it Car 1, down a straight long road that has a series of traffic lights. A second car, lets call it Car 2, is driving near you at a speed that is only very slightly greater than yours. The other car is inching ahead of your car very slowly. At time t, the distance by which Car 2 is ahead of you is d(t).
As you near the next traffic signal, the light turns amber and you begin to slow down. But the driver of the other car doesn’t feel the need slow down, but instead makes it through the light without stopping, while you wait at the traffic signal. The gap between the two cars has suddenly increased. If the two cars were to maintain the same small different in speeds throughout the road, slowing and stopping only at traffic lights, then at time t, the distance d(t) between them will be a non-linear function, quite different from the linear function that it would have been, had there been no traffic lights. What do you think d(t) would look like?
It’s not hard to reason out the shape of d(t). At any given time, either both cars are moving,, in which case, they have a relative velocity of v2-v1, or car 1 is stopped at a signal (relative velocity of v2), or car 2 is stopped at a signal (relative velocity of -v1). So the plot of d(t) versus t would be piece wise linear with one of 3 possible slopes in any segment. In the figure below, each segment is labelled with which car(s) is moving during that segment.
Over a long enough time t, d(t) will be significantly above the dotted line, i.e. the value of d(t) that one would expect if there were no traffic lights. Therefore, in the long term, Car 2, which started out with a small velocity advantage ends up far far ahead of Car 1, thanks to non-linearities introduced by the traffic lights.
I have often mused about the similarity between the scenario above and other aspects of life, such as career growth. Just like the traffic lights, there are a number of thresholds in real life, e.g cut-off marks for the A+ grade in exams, a threshold for being accepted at a prestigious university or a job interview, or the difference between getting and not getting an award. Two individuals that differ by only a small amount, will, over the course of a long time interval, find themselves on different sides of these thresholds enough number of times. Thus, small differences in skill level and/or determination between very similar workers get amplified, leading to drastic differences in their achievements over a lifetime. Natural processes around us, including interactions between humans, are highly non-linear, and my suspicion is that humans do not accurately perceve the extent of the non-linearities. The fact that long term benefit can be superlinear in effort is a significant realization, because it means that putting in just a little bit of extra effort — e.g. an extra hour at work daily, an extra skill, extra effort in networking or maintaining relationships — can have a disproportionately positive effect in the long term. I’ll end with a passage from Richard Hamming‘s 1986 lecture at Bell Labs titled ‘You and Your Research’ that touches upon this:
.. Now for the matter of drive. You observe that most great scientists have tremendous drive. I worked for ten years with John Tukey at Bell Labs. He had tremendous drive. One day about three or four years after I joined, I discovered that John Tukey was slightly younger than I was. John was a genius and I clearly was not. Well I went storming into Bode’s office and said, “How can anybody my age know as much as John Tukey does?” He leaned back in his chair, put his hands behind his head, grinned slightly, and said, “You would be surprised Hamming, how much you would know if you worked as hard as he did that many years.” I simply slunk out of the office!
What Bode was saying was this: “Knowledge and productivity are like compound interest.” Given two people of approximately the same ability and one person who works ten percent more than the other, the latter will more than twice outproduce the former. The more you know, the more you learn; the more you learn, the more you can do; the more you can do, the more the opportunity – it is very much like compound interest. I don’t want to give you a rate, but it is a very high rate. Given two people with exactly the same ability, the one person who manages day in and day out to get in one more hour of thinking will be tremendously more productive over a lifetime. I took Bode’s remark to heart; I spent a good deal more of my time for some years trying to work a bit harder and I found, in fact, I could get more work done.
1 hour extra per day turns out to be equal to be an extra ~1.5 months of 8 hour work days per year!
The urn model in that post is not really a complete description of habit formation because the limit distribution of the system (i.e., fraction of white balls after a very large number of draws) is completely determined by the initial configuration of the urn (i.e., number of white and black balls). In reality, habits are formed from repeated actions, and our actions are based on not just our inclinations but also the conscious choices that we make, which can sometimes be against our inclinations. Free will is missing from this model. There is no way to break a bad habit if one start out with one.
Therefore, let us alter the basic urn model slightly to capture choice: At each step, before picking a ball, we explicitly declare which color we are interested in for that pick: a white ball ( = follow the desirable habit) or a black ball ( = follow the undesirable habit). Suppose we choose to aim for a white ball. A ball is then sampled from the urn at random, and if a white ball does indeed come up, then:
But if we decide to pick white and a black ball comes up, then there is no payoff, and the black ball is placed back in the urn without any extra ball. However, if we chose black, a black ball is guaranteed to show up, we get a payoff of #B, where # is a different currency from $, and the black ball is placed back along with an extra black. The extra black ball makes picking a white ball harder when a white is desired.
Consider the implications of this model:
I believe human behavior is fundamentally driven by a desire to be happy. However, different individuals can have very different ideas about what will make them happy. Since most bad habits result from a focus on short term happiness, at the expense of long term happiness, we would do well to make the following tweak: # and $ can be added to give a total payoff, but each unit of # currency is worth only unit, 1 iteration after it was earned. The other currency, $, however, does not decay with time. The question is what behavior emerges when the objective is simply maximization of total payoff? It is intuitive that the answer will depend upon whether the game has a finite horizon or an infinite horizon. Since players have finite lifetimes, it is appropriate to use a finite time horizon, but since a person does not know when he/she will die, the horizon must be a life expectancy with some uncertainty (variance). In other words, nobody knows when they will die, but everybody knows they will die. The only time it makes logical sense to pick a black ball is if the player believes his remaining lifetime is so small that the expected cumulative gain from deciding to pick whites is smaller than the expected cumulative gain (including the decay effect) from picking blacks. Of course this depends upon how he spent his past — i.e. whether or not he built up a high fraction of whites.
Breaking a bad habit
Suppose a person starts out with a predominant inclination towards an undesirable habit and wishes to switch to the good habit. It is clear that if she always chooses the white ball, then she will eventually succeed in creating a larger number of white balls and a good $ balance. But she will have to go many tries and no payoff before that happens. In practice, this requires determination, and she may be tempted to pick the black ball because it is so easily available.
Suppose is the fraction of times that (s)he will decide to aim for a white ball, i.e. the player’s determination. It is intuitive that a larger value would help her switch to the good habit in fewer iterations. It would be interesting to see if there is a threshold value of below which the bad habit is never broken. In particular, I expect there to be a threshold value , below which the bad hait is never broken, w.p. 1, where is the number of whites in the urn and is the number of blacks.
Game over when wealth < 1 unit
Now let us further suppose, that in the course of playing the game, a player can never have less than 1 currency units. In other words, the game stops if the total currency units with a player reaches zero. All other rules are the same: 1 unit earned from a white ball, 1 unit from a black ball but units earned from black balls decay to of their value every time slot. Deciding to pick a black guarantees a black, and an extra black goes in. Deciding to pick a white, provides a white with a probability proportional to the fraction of whites in the urn, and an extra white goes in. With the new rule stipulating ‘game over’ when the player’s wealth falls below 1, the player is incentivized to pick a black if she ever approaches 1 unit of wealth. This is meant to model the inclination of individuals with low self worth and happiness levels to engage in activity that would yield short term happiness (alcohol, drugs, binge eating, etc. ) at the expense of long term prospects.
The problem with conventional text documents is that you cannot easily query them. Suppose I have a 20 page piece of text explaining various things, perhaps with mathematical equations, and plots, etc. In order to figure out something, I need to patiently read through the document till I have my answer. When time is short, and usually time is short, I will skip some parts in a hurry to get to what I need. The skipped parts might contain intermediate facts or definitions, making it impossible for me to understand what I am looking for even when I do find it.
This situation is because information is being presented in the document in an order decided by the author. Perhaps this is the best order. But for a specific query (e.g. what is the main result of this paper? ), it would be nice to have a way to do better than having to patiently read the whole document. The best we have right now is Ctrl+F on words and phrases, which just doesn’t cut it for queries of reasonable complexity (e.g. ‘Why did the GDP of Germany double so quickly?’ while reading an article on the history of the German economy).
The computer doesn’t understand the document :-(
It would be pretty useful to somehow model documents as databases that can support complex queries. Especially when the documents are large, like books. Perhaps I am asking for the holy grail of NLP? Surely, if IBM can build a Jeopardy-winning Watson, it should be possible to build a question-answering service for individual documents? Outside information is permitted.
Tasks that require a high cognitive load, such as thinking about a problem, reading a research paper or writing C++ code, are very sensitive to interruptions. I find that when I am interrupted during such activity, say by a phone call, or by a co-worker asking me something, the net cost of the interruption is not just the amount of time I spend attending to the call or person. It’s as if I have a short term RAM in my head, and I have to re-load the entire context of what I was doing when I get back to it. And this can take a lot of time. What I like are large chunks of uninterrupted time, not several small chunks, even if they add up to the big chunk.
However, shutting oneself off from all interactions with others at one’s workplace is not the solution, because one runs the risk of not interacting enough with colleagues. Richard Hamming said about Bell Labs:
I notice that if you have the door to your office closed, you get more work done today and tomorrow, and you are more productive than most. But 10 years later somehow you don’t know quite know what problems are worth working on; all the hard work you do is sort of tangential in importance.
I have personally found that conversations over coffee and lunch have led to much more interesting directions in my work than solitary thinking. While Hamming was speaking about a research lab environment, I suspect his statement holds true in other creative domains as well.
The apparently contradictory requirements above can be resolved by planning out both solitary time and interaction time during the work week. Both are necessary. I feel creative work requires a combination of two distinct phases. One that allows focused concentration on a task at hand, and another for free discussions, exchange of ideas, and making oneself available to others as a sounding board. Exploit and Explore. An ideal workspace should provide for both. It is said of the original Bell Labs building, that its designer deliberately created long corridors, so that researchers would be forced to encounter one another. Steven Johnson explains in his book ‘Where Good Ideas Come From’ that coffee houses played a crucial role, because they were breeding grounds for collective thinking.
During my travels on the Internets, I have come across a number of writings on the impact of interruptions on productivity that I can attest to from my own experience:
Paul Graham writes about meetings:
I find one meeting can sometimes affect a whole day. A meeting commonly blows at least half a day, by breaking up a morning or afternoon. But in addition there’s sometimes a cascading effect. If I know the afternoon is going to be broken up, I’m slightly less likely to start something ambitious in the morning. I know this may sound oversensitive, but if you’re a maker, think of your own case. Don’t your spirits rise at the thought of having an entire day free to work, with no appointments at all? Well, that means your spirits are correspondingly depressed when you don’t. And ambitious projects are by definition close to the limits of your capacity. A small decrease in morale is enough to kill them off.
Joel Spolsky writes about context switching between different projects.
Some interruptions are self created. Compulsive checking of email for instance. Or deciding to multitask, or trying to handle too many disconnected projects. I find I can do one major thing per day, if I try to do 2 or more major things, I risk accomplishing none. Having smartphones set to beep when an email arrives doesn’t help. I switched my phone to not auto check email, long ago.
An urn contain balls of two colors, white and black. A ball is drawn at random, and then put back in the urn along with a new ball of the same color. This process is repeated many times. What is the ratio of white balls to black balls in the urn after tries? What is the ratio as ? The answer depends upon the initial number of white and black balls of course.
This is the Polya Urn problem, and I find it to be an interesting way to think about the process of habit building. Imagine that the two types of balls represent two opposing choices that we can make about a certain aspect or activity in our lives, e.g. waking up early or not waking up early, doing regular exercise or not, smoking a cigarette after dinner, or not, completing homework on time every evening vs. playing video games during that time, etc.
# Polya urn simulation in Python import random, pylab as plt nruns = niter = 200 whitenum = blacknum = 1 for i in range(0,nruns): frac =  nwhite = whitenum nblack = blacknum for j in range(0,niter): if random.random() < nwhite/float(nwhite+nblack): nwhite = nwhite + 1 else: nblack = nblack + 1 frac.append(nwhite/float(nwhite+nblack)) plt.plot(frac)
Let us say that picking a white ball corresponds to the desirable activity in these examples, and picking a black ball corresponds to the undesirable one. The fraction of white balls in the urn at any given time represents our proclivity to pick the desirable activity at that time.
This model has some interesting properties. First, if the number of balls of each type to begin with are equal, then the fraction of white balls at any step, and in particular at is a random variable that behaves as follows:
Above: Histogram of the fraction of white balls after 20k iterations
Note that extreme outcomes are possible as if appropriate choices are enforced early on. Also the final value is attained after only a few iterations. Early iterations provide the opportunity for large swings in the final outcome, but there are hardly any swings after crossing about 50 iterations. Similarly, the that longer a habit has had to cement itself, that harder it is to change it.
But the model above converges to an almost uniform distribution on the fraction of white balls after a large number of tries. Perhaps starting with 1 white and 1 black isn’t quite right, because we are do not necessarily begin life with equal tendencies for picking each side of a binary choice. If we set the number of white balls to be greater than the number of black balls at the start, then the distribution of the fraction of white balls after many tries is, as expected, skewed in favor of white balls. Again, the fraction of white balls usually settles down to a fairly stable value that is typically determined largely by the actions in the first few iterations.
Above: Fraction of whites, starting with 1 black and 15 white balls
Distribution of fraction of whites after 20k iterations starting with 5 whites, 1 black
If the initial number of balls is allowed to be fractional, then the limit distribution has most of its mass near 0 and 1. The plot below shows 500 iterations starting with 0.2 white and 0.2 black balls. Perhaps this better models habit formation in some people.
Habit building is not the only thing this model can describe. There are a number of phenomena in which reinforcement plays a role:
1. In the form above, the urn model displays positive feedback. How can we change the urn model to display negative instead of positive feedback? Simple: put in an extra black ball if a white ball is picked and vice versa. This forces the fraction of white and black balls to remain close to 1/2.
2. Learning by rewards: There has been work in the psychology community that studies reinforcement learning in humans and animals, that is, given a set of choices and associated rewards, how does an animal learn which choice to pick? The Law of Effect is the hypothesis that the probability of picking a choice is proportional to the total reward accumulated from picking that choice in the past. Given a reward scheme, this translates to an Urn model with the colors representing rewards from the various possible actions. For a nice survey of reinforcement models, see: Robin Permantle, A survey of random processes with reinforcement, Probability Surveys, Vol. 4 (2007) .
3. The model above is the simplest urn model. There are several variations of this model, that have proved useful in modeling and learning tasks, other than reinforcement. E.g. using more than 2 colors, or using multiple urns, or allowing for the introduction of new colors. There are a number of interesting distributions for which these Pola Urn variations act as generating processes (Dirichlet-multinomial, Dirichlet Process, Chinese Restaurant Process), but that’s another post.
A word is not the same as the thing it represents, only a representation of it, meant to convey information about it. For e.g. the word ‘Castle’ is not the same as a Castle. This is obvious. Similarly, a map of a place is not the same as the place itself, it is only a map of it. This is also obvious.
An academic paper published at a conference or journal is not scholarship! It is a presentation of the scholarship. The scholarship itself is the research work. Somehow, this is not so obvious to a lot of people.