The Woodshed


What time is lunch?

When people in London propose a time for lunch, they suggest 1pm. In the US it would be noon, right? I find that curious.

I suspect this is a case where it’s kind of arbitrary what time you go to lunch, but people have just converged on a standard practice, and that practice is different in the US and the UK. (You would think there would be an incentive to go a little earlier to avoid the crowds, but on the other hand it’s probably useful for remembering lunch dates to just go with the standard time.) This is therefore a case of what social science types call a coordination game, in which there are “multiple equilibria.” If everyone else is going to lunch at 1, you go at 1; if everyone else is going at noon, you go at noon; so once a society has converged on an equilibrium lunchtime, it is hard to shake (even if it started down that route for random reasons).
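The multiple-equilibria point can be made concrete with a toy simulation (all details hypothetical and purely illustrative): if each diner simply switches to the majority's lunchtime, a population that starts mostly at 1pm settles on 1pm, while one that starts mostly at noon settles on noon. Same game, two stable conventions.

```python
def step(times):
    """Everyone switches to whatever lunchtime the majority uses."""
    majority = max(set(times), key=times.count)
    return [majority] * len(times)

def converge(times):
    """Iterate best responses until nobody wants to switch."""
    while True:
        new = step(times)
        if new == times:
            return times
        times = new

# Where you end up depends only on where you start.
print(converge(["1pm"] * 6 + ["noon"] * 4))   # everyone at 1pm
print(converge(["noon"] * 6 + ["1pm"] * 4))   # everyone at noon
```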

I have not yet determined whether the workplace calendar is generally shifted back an hour or not. I walked to work at around 8:45 this morning and it seemed like rush hour to me.

Also, I checked, and sunrise and sunset are not generally later here in London than in NYC.

I wonder if there’s an interesting story explaining why London started down the 1pm path and e.g. NYC went with noon. Also, is it the same in other cities in the UK? In Europe?

Kahn and Kotchen on unemployment and environmental concern

Matthew Kahn (a teacher of mine during my MA at Tufts) and Matthew Kotchen have an interesting-sounding paper showing that people appear to be less concerned about the environment when the economy is doing worse. Fewer people search Google for “global warming” and fewer survey respondents say they think global warming is occurring when their state’s unemployment rate is higher. (This is with state and month-year fixed effects, meaning that the difference is not just capturing over-time changes in attitudes or stable geographical differences between people in richer and poorer states.)

Some of the results, like the one about Google search terms or another finding about people’s responses to a “most important problem” question, are consistent with the idea that economic concerns crowd out environmental concerns. But the fact that survey respondents say global warming is not happening when their local economy is doing poorly says something different: it suggests that economic problems do not simply change people’s priorities, they also change their views. (Or that, when someone’s priorities are changed, his or her views adjust to become consistent with those priorities: if I don’t spend much time worrying about the environment, the problem must not be happening.) (Sorry: or that it takes time to learn that global warming is happening, and people don’t have that time when they are worried about the economy.)
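A small simulation shows what those fixed effects buy you (pure Python, with made-up numbers; no claim about the paper's actual data or specification). Even when unemployment is correlated with stable state differences and with common time shocks, two-way demeaning, which is what state and month-year dummies accomplish in a balanced panel, recovers the within relationship between unemployment and concern:

```python
import random
from statistics import mean

random.seed(42)
S, T = 20, 24           # states and month-year periods (balanced panel)
b_true = -0.5           # hypothetical effect of unemployment on stated concern

state_fx = [random.gauss(0, 2) for _ in range(S)]
time_fx = [random.gauss(0, 2) for _ in range(T)]

# Unemployment is correlated with the state and time effects: exactly the
# confounding that the fixed effects are meant to absorb.
unemp = [[5 + 0.5 * state_fx[i] + 0.5 * time_fx[t] + random.gauss(0, 1)
          for t in range(T)] for i in range(S)]
concern = [[state_fx[i] + time_fx[t] + b_true * unemp[i][t] + random.gauss(0, 0.3)
            for t in range(T)] for i in range(S)]

def demean(x):
    """Two-way demeaning: subtract state and period means, add back the grand mean."""
    si = [mean(x[i]) for i in range(S)]
    st = [mean(x[i][t] for i in range(S)) for t in range(T)]
    g = mean(si)
    return [[x[i][t] - si[i] - st[t] + g for t in range(T)] for i in range(S)]

xd, yd = demean(unemp), demean(concern)
xs = [v for row in xd for v in row]
ys = [v for row in yd for v in row]
b_hat = sum(a * c for a, c in zip(xs, ys)) / sum(a * a for a in xs)
print(round(b_hat, 2))   # close to b_true = -0.5
```

Running OLS with a dummy for each state and each month-year gives the same slope as this demeaning; the sketch is just meant to show why the raw (confounded) correlation gets corrected.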

Started at LSE

After last year’s very enjoyable post-doc at Yale’s Leitner Center, I have shipped off to the LSE to start as a Lecturer (asst. prof, in US terms) teaching in the MPA program. I am still settling into my office in Connaught House, and still working out housing for next month, but so far I am really enjoying both the city and the school. I met some of the students last week at the introductory session for MPA first-years and I was extremely impressed with their sharpness and the variety of interesting experiences they bring to London.

More soon — I’m going to try to do some more writing here.

Stock trading project written up in Bloomberg/BusinessWeek

The project on the stock portfolios of members of Congress that Jens and I have been working on for over two years is finally almost done, and now it has been written up by Bloomberg. The paper is tentatively titled “Political Capital: The (Mostly) Mediocre Performance of Congressional Stock Portfolios, 2004-2008”. The writeup looks pretty accurate to me. I was curious what aspect the media would end up focusing on, given that the story is kind of subtle and doesn’t play directly into a corruption narrative. In this case the writer chose to focus on the local premium while telling the rest of the story further down.

(Finally) set up on new MacBook Air

I am fully up to speed now with my new MacBook Air.

I’ve had it now for a little over a week, and yesterday finished installing my Rails environment and downloading the databases I have been working with. (For setting up the Rails environment on Snow Leopard, I recommend this guide.)

I love this computer. Above all I love how light and sleek it is: this weekend I went to NYC for a bachelor party and a baby shower and brought only my violin case — with my MacBook Air, a change of clothes, and a toothbrush slipped in the space where you can store sheet music. I just love that efficiency.

I also really like the screen resolution, the way the computer starts up and shuts down very quickly, the way it makes basically no noise (no moving parts!) and does not get hot. I still have my early 2008 15″ MacBook Pro around, and it’s funny how it feels so big and clunky. The old screen does look massive now that I’m accustomed to the 13″ MBA, but somehow I don’t seem to miss the extra space, and I certainly love how portable the new machine is.

So — now that I’m really set up it’s back to work.

Working on a remote server

Since I figured out how to install R on my Dreamhost VPS, I’ve started sending some computations to the server to run. I’m sure my workflow will be honed as I do this more, but I just wanted to share a bit about how this works for me.

The key element of my approach is SVN. If you don’t know what SVN is: SVN is kind of like a cross between Time Machine and Google Docs for geeks — a way to back up your work and collaborate with others. If you do know what SVN is, you probably think I should use git; I know, I know. At least I’m not using CVS.

So, I write code on my laptop and as I go I test it out on data and check the code in to my SVN repository (hosted incidentally at Assembla). When I am ready to do something on the server, I create a space on my VPS and, from that directory, do an SVN “checkout” of the project I’m working on, which grabs the code from my SVN repository and makes a copy on the server. I may in addition need to FTP some data to the server; I could check the data into SVN as well and I may do that going forward, but because I was somewhat constrained in my SVN repository I have not done this so far.

By this stage, I have basically replicated a chunk of my laptop — the code and data I need to do my computations — on the server. So now I ssh into the server and run the code there as I would locally. When it’s done I fetch the results from the server to do more processing locally.

In my current project I had not been planning to work this way, so I had written a lot of absolute paths (e.g. “~/data/X”) in my code. Because things were not set up that way on my server, I had to change a lot of paths to make them relative, which is fine. But I’m thinking in the future I could set up my laptop and server space to look more similar, so that the transition would be more seamless. I guess in the extreme you could check a whole directory of your hard drive (code, data, etc.) into SVN and thus have a complete copy of that repository both on your local machine and on the server. The only issue is how to deal with stuff you don’t want checked in — huge datasets, images/output from code, etc. I’ll keep honing it.
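One common way to make code location-independent (a sketch in Python for concreteness; the same idea works in R with file.path() and a single project-root variable) is to keep exactly one machine-specific setting, the project root, and build every other path relative to it. The PROJECT_ROOT environment variable and the data/X file below are hypothetical, just mirroring the example above:

```python
import os

# The project root is the one machine-specific setting; every other path
# is built relative to it, so the same script runs on laptop and server.
PROJECT_ROOT = os.environ.get("PROJECT_ROOT", os.getcwd())

def data_path(name):
    """Path to a file in the project's data/ directory (e.g. data/X)."""
    return os.path.join(PROJECT_ROOT, "data", name)

print(data_path("X"))   # .../data/X, wherever the checkout lives
```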

Until we’re all seamlessly in the cloud I think this setup will help me be more productive. I had had enough of trying to do work while running CPU-intensive stuff in the background, or having my computer chugging along overnight.

Installing R on Dreamhost VPS

I have had an account at Dreamhost since this summer, and have been using it for web hosting and some storage. But I’ve now started using it to run computations in R. As I went through the installation process I could find very little guidance, so I wanted to share what I’ve learned for the next person who is looking to do this.

First, you need superuser privileges. I believe this is not possible on the shared hosting plan, but with a Dreamhost VPS you can create an admin user in the dashboard under “VPS => Manage Admin Users.” The new user will get his own space on your server. It probably makes sense to make your main user (i.e. you) an admin user; since I had not done this, I found that after ssh’ing in as the admin user I needed to cd up and over to the space on the server in which I had been working, i.e. the space belonging to the existing user.

Then you need to install R under Debian Linux. I found what I needed here:

sudo apt-get install r-base-dev

The server will ask for your password, to confirm that you are a superuser. Then installation should proceed.

To confirm that it worked, and to begin working, type “R” on the command line. (I was actually confused about this, because on my Mac I typically type “r” but this was not recognized on the server even after installing R.) I was very happy when this worked, because I wasn’t sure exactly what I could install on Dreamhost.

In my next post I’ll talk a bit about my workflow using R on the server.

Causal inference is the new research design

I had a look today at the new syllabus for Harvard’s graduate research design course, taught by Mike Hiscox. This looks like a very useful and interesting class. It starts right off with experiments as the gold standard (the third meeting is subtitled “The Appeal of Randomization”) and goes from there to a series of weeks discussing field experiments, lab and survey experiments, natural experiments/IV, RD/DID, matching/regression before getting to what this course used to be about — case studies, “concepts, measures, surveys,” and interviews.

I am a firm believer in the wisdom of the turn towards experiments and causal inference and I think this sounds like a great course, but I hope there will be some kind of research design course that assumes students will not be focusing on the effect of a particular treatment. Most graduate students are not oriented toward questions that these methods are good at answering; I expect more will be over time, but I still believe in methodological diversity.

Comparative politics in the news

I was excited to read two recent posts by Harvard undergrad Dylan Matthews comparing legislative organization in Britain and France to that in the US. Ezra Klein is featuring Dylan’s articles on his Washington Post blog. The two pieces so far have been quite well done, overall, and it’s nice to see that some omissions (such as Dylan’s overlooking the role of local constituency committees in candidate selection in the UK) have been pointed out in the comments. Conclusions about how things would be different in the US with this or that change in electoral law, party organization, or legislative procedure based on how things operate in other countries are of course difficult, but it’s a step in the right direction to see the features of our own government in the context of the variety of institutional forms in existence.

This seems to be a moment of unusual reflection and dissatisfaction about our political institutions: the Left is angry about the filibuster, Congressional paralysis, and Ben Nelson-style horsetrading; the recent Citizens United case has provoked new concern about money in politics and calls for constitutional amendments on political financing. It’s great to see Ezra Klein and Dylan Matthews bringing in a comparative perspective, although I’d like to see some professional political scientists enter the fray and make themselves useful — I’ll try.

Thoughts on Lessig’s “Against Transparency”

I spent time this morning reading and thinking about Lessig’s article Against Transparency, which was published in the New Republic back in October when I was too busy to read and write about things like this. Here’s my report — I may return to this and write something more formal.

Lessig’s central critique against what he calls “naked transparency” is that it reinforces lazy inferences about money’s effect on politics, thereby undermining trust in government, and does not produce sufficient mitigating benefits. Naked transparency produces correlations (a word Lessig uses sneeringly): this guy voted for the bill and got an unusually large amount of money from the bill’s beneficiaries, or (in more careful studies) the guys who voted for the bill got more money from the bill’s beneficiaries than those who voted against. The problem for Lessig is that correlation is not causation. He would view it as corruption if the money caused the votes, i.e. if the votes would have been different in a world where the money had not been given. But there are other, more innocent reasons why money and political positions would be correlated: contributors might enjoy giving money to politicians who agree with them, or perhaps the contributors gave money not to change the politician’s mind but to get a friendly politician elected, which some would view as less problematic. In a particular case, one doesn’t know to what extent corruption is to blame, as opposed to these other channels.

To illustrate the point that transparency tends to produce lazy causal inferences, Lessig offers one of the supposedly problematic correlations that transparency makes possible — First Lady Hillary Clinton’s opposition to bankruptcy reform in 2000 compared to Senator Hillary Clinton’s support for bankruptcy reform in 2001 (after having received $140,000 in campaign contributions from the financial services industry). Lessig’s point here is that we generally assume from this story that the money caused Clinton to change positions, but that in fact there was a less objectionable reason for the switch: Senator Clinton represented New York, home to many companies that would benefit from bankruptcy reform, and it was thus in some sense her job to advocate on behalf of those constituents, regardless of money her campaign had received. Despite this alternative explanation, we tend to assume that money was the cause — what he calls “the default explanation.” “This default, this unexamined assumption of causality, will only be reinforced by the naked transparency movement and its correlations. What we believe will be confirmed, again and again.”

Lessig argues that policymakers and reformers should take into account the public’s limited ability to understand the data that transparency can produce. Central to Lessig’s critique is what he calls “the attention-span problem.” Understanding something nontrivial requires more attention than people are typically (and rationally) willing to give. In the area of money and politics, most of the public will lazily and simplistically view the best correlations the naked transparency movement can produce as evidence of corruption. Lessig does not conclude from this that the public needs to be taught or cajoled to be more careful scientists about the world; rather, he says we should design policy under the assumption that the public will not carefully evaluate correlations.

Ultimately, it is a little unclear what exactly Lessig means by this. He goes on a detour at this point in the essay into the worlds of music and online journalism to consider how those fields have struggled with the challenges posed by internet disruptions that he argues are comparable to the effect of transparency on politics in recent years. The Internet has been good for music and journalism in some senses but disastrous in others, and both industries are starting to come up with ways of handling these disasters. Lessig thinks that the realm of government information and transparency more generally should be viewed similarly as an area where the internet (and its transparency) has good and bad effects, and that transparency reformers should not view their product as such an unmitigated improvement. In particular, whatever the disinfectant benefits of sunlight (and he does not seem to think there are many), they come along with an intensification of cynicism toward the government that he believes is ultimately damaging to our democracy.

I basically agree with what I take to be the two basic points he makes about the reception of transparency data. His first point is that much of the transparency we’ve created does not help us answer causal questions. We can’t answer questions about government corruption by looking at a single contribution or even a set of carefully produced regression coefficients because, after all, correlation is not causation; it is a rare correlation that would provide convincing evidence of corruption as it is usually defined. The second basic point is that the public will not carefully consider the complexity of the issue when presented with these correlations; if indeed they encounter these correlations at all, the data will merely serve to reinforce coarse generalizations like “DC is corrupt.”

But the link between these observations and Lessig’s policy prescription is disappointingly weak. His main recommendation is to address concerns about corruption by reducing money’s role in politics. Transparency itself serves mainly to heighten cynicism, he argues, so we need to go the next step by applying regulation that establishes “a system in which no one could believe that money was buying results” — public financing of elections. I have no particular problem with public financing of elections, but his arguments against naked transparency provide very little reason to think that this is the right solution, for two reasons. First, people will of course continue to believe that money is buying results under the public financing system Lessig advocates: leaving aside the issue of bundling of small contributions, much more is spent on lobbying than on campaign contributions — 10 times as much, according to Ansolabehere et al’s 2003 JEP article — and this proposal wouldn’t touch any of that spending. Second, the proposal essentially ignores the problem of deciding whether campaign contributions are corrupt: having bemoaned the fact that transparency can’t answer that question for us, Lessig essentially assumes that these contributions are indeed corrupt and advocates drastically curtailing that source of financing. If transparency is as uninformative as he claims, on what basis does he make that recommendation?

A more charitable interpretation of Lessig’s article starts with the policy recommendation and goes from there. Lessig believes that public financing of campaigns would reduce the perception and reality of money buying policy, i.e. corruption. He is frustrated with transparency advocates claiming that disclosure is enough to combat corruption in politics. So he points out the limitations in what transparency can accomplish — limitations that arise fundamentally from the difficulty of drawing causal inferences about money and politics — as well as the real harm that is done by stoking cynicism without actually cleaning up politics. Transparency is not enough, argues Lessig, which is why we need to go further by bolstering public financing; but it is also too much in a sense: without reform, transparency merely makes us cynical and less inclined to participate.

Overall, this is a welcome (if somewhat sprawling) critique that left me with some questions. First, as Brandeis’s famous quote indicates, part of the point of transparency is to disinfect, i.e. to deter unjustifiable behavior. Is there good evidence that it indeed has that effect? My own study of municipal councils in France provides some evidence that transparency moderates political outcomes, but generally this is hard to establish: given how difficult it is to show the effect of money on politics at all, it is harder still to show that the effect is smaller when the system is more transparent. Lessig makes passing reference to the fact that increased disclosure of executive compensation led to higher pay (the study of Ontario companies by Park, Nelson, and Huson is the study I know in this area), but that is probably not sufficient to rule out the possibility of valuable deterrent effects of disclosure.

Second, the data on money and politics unleashed by the transparency movement can only provide correlations, just as Lessig says, but this is true of any area, and some correlations are more informative than others about causal relationships. If we do a randomized experiment (perhaps one produced by mandated disclosure) and then observe the outcomes, for example, we can learn something about the effect of the treatment; there are plenty of natural experiments that could be very informative as well. This is not something that a citizen journalist or even an investigative journalist or the average professional researcher can easily pull off; I agree with him that even most of the professional research on these issues provides correlations that tell us little not only about individual instances but also about average effects of money in politics. But I do think that making this data more easily available makes it more likely that academics once in a while will come upon a nice experiment to document the effect of money on politics in a way that could help Lessig and other careful analysts understand what is going on out there.

Third, I wondered whether the benefits of transparency are more considerable if you think of the problem as a policy analyst rather than as a lawyer. Lessig establishes the unrealistic expectation that transparency is supposed to tell us whether individual votes or other actions are corrupt. “Even if we had all the data in the world and a month of Google coders,” he points out, “we could not begin to sort corrupting contributions from innocent contributions.” I agree. But the goal he sets out — identifying corrupting contributions — is what a lawyer attempts to do when undertaking a corruption investigation; a policy analyst would instead try to estimate the average effect of contributions on voting. Why do policy analysts focus on average effects rather than the effects of individual contributions? The technical reason is that we believe we live in a noisy world. Even if you could identify a very close counterfactual for a given case — a politician B who is very similar to politician A but did not receive a campaign contribution, and subsequently voted on a given piece of legislation — there are likely to be unobserved factors that affect the vote, such as some aspect of politician B’s ideology that differs from that of politician A. Now, if those unobserved factors affect contributions, then B actually is not a good counterfactual for A. But otherwise (e.g. if unobserved ideology is conditionally independent of contributions in the population of politicians), then the counterfactual is good but the effect we estimate from any one such comparison is unreliable, in the sense that if we observed it multiple times we might get different answers. We therefore aggregate up a lot of estimated individual-level effects to estimate an average effect that is less influenced by noise. This will not help a lawyer convict a politician for corruption, but it might inform policy.
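The noise argument can be made concrete with a toy simulation (all numbers hypothetical): each matched A-versus-B comparison estimates the true effect plus a large dose of noise, so any single comparison is close to worthless, while the average over many comparisons lands near the truth.

```python
import random
from statistics import mean, stdev

random.seed(1)
true_effect = 2.0    # hypothetical effect of a contribution on a vote score
noise_sd = 5.0       # unobserved factors swamp the effect in any one case

# Each entry is one matched A-vs-B comparison: true effect plus noise.
comparisons = [true_effect + random.gauss(0, noise_sd) for _ in range(1000)]

print(round(comparisons[0], 1))     # any single comparison: mostly noise
print(round(mean(comparisons), 1))  # the average: close to the true effect
```

This is exactly why the policy analyst reports an average effect: the noise washes out in the aggregate even though it dominates every individual case.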

In short, by making more data more easily available, the transparency movement has made it more likely that researchers will be able to estimate the average effects of money on politics in some circumstances. This is a modest step forward, but it is progress. Lessig’s essay should help to remind open government zealots that answering questions about corruption requires more than a website with tons of data, and perhaps it should encourage them to focus slightly more of their energies on organizing and making public the kind of data that researchers could use to answer causal questions and slightly less on making slick web interfaces to allow users to comment on particular pieces of data and write emails. But it should not lead them to give up hope about the value of opening up the government.