August 02, 2004

Groseclose and Milyo respond

On July 5, Geoff Nunberg posted a critique of a recent paper on media bias by Tim Groseclose and Jeff Milyo. Professors Groseclose and Milyo have written a response to Nunberg, and asked us to post it on their behalf. I'm happy to be able to do so.

--Mark Liberman



Geoffrey Nunberg recently posted a critique of our paper, “A Measure of Media Bias,” at this site. In his essay, Nunberg shows a gross misunderstanding of our statistical method and of the actual assumptions upon which it relies. We have decided to provide this response, not only to correct his many errors, but as a caution to other academics who would use blogs to pose as experts on subjects well outside those for which they have the requisite knowledge or technical expertise.

We would have ignored Nunberg’s rant, as we have other equally inflamed and baseless web-bashings, except that his posting has been taken by some to be a particularly powerful counterpoint to our study.  Indeed, had we not been familiar with what we actually wrote in our study, we would have found it quite convincing, too.  This is because Nunberg, in referring to our work, states that "If you take the trouble to read the study carefully, it turns out to be based on unsupported, ideology-driven premises and to raise what would be most politely described as severe issues of data quality…"  This is not an isolated charge; Nunberg accuses us of unprofessional behavior throughout his essay.  In our world, this is very damning; our livelihood and reputations depend crucially on our abilities to conduct scientific research. Such charges should not be made lightly. 

We provide our response in three parts. The first is short and addresses only the most obviously false of Nunberg’s claims. The second is a one-paragraph summary of our response regarding bias generated by our list of think tanks and advocacy groups. Together, these address Nunberg’s most serious criticisms of our work. The final part is an attempt to provide a more detailed point-by-point response to his complaints.

PART I.  A SHORT RESPONSE

Suffice it to say that Nunberg could not have read our study carefully, as his methodological criticisms are directed only at what we repeatedly describe as our "back-of-the-envelope" method and not the procedure upon which we base our conclusions.

The "back-of-the-envelope" estimates are intended as an easy-to-understand initial set of calculations; this procedure is described in the section of our paper titled "Descriptive Statistics." Indeed, we ourselves critique this "back-of-the-envelope" method in order to highlight the strengths of our preferred statistical procedure. Despite this, Nunberg’s summary of our methods is only a summary of the "back-of-the-envelope" method, which we acknowledge to be simplistic and inferior to our primary method.

Anyone who even skims our paper will find a section entitled "The Estimation Method," which describes our primary statistical procedure in detail. Nowhere in Nunberg’s critique does he make even the slightest reference to this statistical technique. We are not surprised if Nunberg did not comprehend the material in this section, as it is intended for a somewhat statistically sophisticated audience. However, it is quite inappropriate for Nunberg to act as if this section does not exist.

For this reason, we believe Nunberg has lied when he implies that he has read the study carefully.  This is a harsh criticism, but the alternative would be less charitable, as it would mean that Nunberg actually did read the study carefully, but purposely chose to misrepresent our work in order to undermine our credibility.  Regardless, by taking on the guise of an informed and careful critic, Nunberg has misled many others who may have trusted him.  This is unprofessional conduct, to say the least; other academics who blog should take care not to behave in a like manner.

PART II:  ON BIAS

Nunberg finds fault with our list of think tanks and advocacy groups used to rate media outlets. But even if our sample of think tanks is skewed left or right, this will not bias our results. To see this, consider a regression involving height and arm length as the independent and dependent variables. Suppose instead of a balance of short and tall subjects, the researcher includes twice as many tall subjects as short subjects. This will not change the expected relationship between height and arm length -- that is, the estimated parameter associated with the independent variable. Of course, it will cause predictions about arm length to be more precise for tall people than for short people. However, it does not cause a bias. E.g., it does not cause the researcher systematically to predict arms to be too long (or too short). As we discuss below, no statistics textbook claims that the set of independent variables must have a certain distribution if an estimator is to be unbiased. For the same reason, our method requires nothing of the ideological distribution of the think tanks for the estimates to be unbiased.
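To make this concrete, here is a minimal simulation sketch of the height/arm-length example (purely illustrative; all numbers are invented, and this is not code from our study):

    import numpy as np

    rng = np.random.default_rng(0)

    def average_ols_slope(heights, n_trials=2000):
        # Average OLS slope of arm length on height over many noisy samples.
        slopes = []
        for _ in range(n_trials):
            # True relationship: arm length is half of height, plus noise.
            arms = 0.5 * heights + rng.normal(0.0, 2.0, size=heights.size)
            slope, _intercept = np.polyfit(heights, arms, 1)
            slopes.append(slope)
        return np.mean(slopes)

    short = rng.uniform(150, 165, 200)   # heights in cm
    tall = rng.uniform(180, 195, 400)    # twice as many tall subjects as short
    print(average_ols_slope(np.concatenate([short, tall])))  # ~0.5: no bias
    print(average_ols_slope(rng.uniform(150, 195, 600)))     # ~0.5 as well

The skewed sample changes how precisely one can predict within each height range, but the average estimated slope sits at the true value either way.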

PART III.  A LONGER RESPONSE

Nunberg makes five general points: 1) Our statistical method for rating think tanks assumes that there is no such thing as a centrist or apolitical think tank, and it does not distinguish between, say, a moderately left think tank and a far-left think tank; 2) Our method "assumes there can be no such thing as objective or disinterested scholarship"; 3) We "have located the political center somewhere in the middle of the Republican Party"; 4) The list of think tanks and policy groups that we choose is an arbitrary mix, and this mix of think tanks causes the media to appear more liberal than they really are; 5) Our data from the Congressional Record "shows some results that would most kindly be described as puzzling" -- most prominent of which are the data that involve the ACLU and the Alexis de Tocqueville Institution.

We show why each point is wrong and in some instances dishonest.

1) Nunberg describes our study as "certainly the most ambitious and analytically complicated" of quantitative studies of media bias. We appreciate the compliment, but we should begin by clarifying the statement. The version of our paper to which Nunberg refers has nine sections, including the introduction. Eight of these sections, in our view, contain no specialized economics or political science jargon, nor do they require any mathematics skill above an eighth-grade level. However, one of these sections, "The Estimation Method," is somewhat analytically complicated. E.g., it describes a maximum-likelihood estimation technique, and it notes a set of random variables that follow a Weibull distribution. Such techniques and concepts are somewhat specialized, but most people with a PhD in economics or statistics will know them, and more and more frequently they are becoming part of the toolbox of newly-minted political-science and other social-science PhDs.
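For readers unfamiliar with these tools, here is a generic sketch of maximum-likelihood estimation of a Weibull distribution (this illustrates only the textbook technique, not our model; the data below are simulated):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    # Simulate data from a Weibull distribution with known parameters.
    data = stats.weibull_min.rvs(c=1.5, scale=2.0, size=5000, random_state=rng)

    # scipy's fit() maximizes the likelihood numerically; floc=0 pins the
    # location parameter so that only the shape and scale are estimated.
    shape, loc, scale = stats.weibull_min.fit(data, floc=0)
    print(shape, scale)  # close to the true values, 1.5 and 2.0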

Our main conclusions are based strictly upon the method that we describe in that section.  However, in another section, entitled “Descriptive Statistics,” we show how a simpler method, which we call the “back of the envelope method” gives nearly identical results.  We ourselves discuss the problems with the back-of-the-envelope method. Yet, we decided to include it, because (i) it is accessible to laypersons, and (ii) it helps to provide some intuition for our primary, more complicated, method.

We strongly suspect that (1) Nunberg did not read the more complicated section.  Or, if he did, (2) he certainly did not understand it.  Here is some evidence.

1a) Nunberg’s essay has four sections. One, entitled “The Study,” appears to describe our statistical method. However, in this section he only describes our “back of the envelope” method. Nowhere in the section, nor in any other section of his critique, does he make even the slightest reference to our primary statistical method.

1b) Nunberg writes “There are ideological implications, too, in Groseclose and Milyo’s decision to split the think tanks into two groups, liberal and conservative.  One effect was to polarize the data.  No group – and hence, no study – could be counted as centrist or apolitical.”  This is true of the back-of-the-envelope method, but it is not true of the primary, more complicated method that we use (which, again, is the method on which we base our main conclusions). 

Our method assumes that legislator i’s preference for citing think tank j is

a_j + b_j x_i + e_ij. 

The key letter in this equation is the subscript j associated with b. As we state in the paper, the j stands for the j-th think tank in our sample. It means that we estimate a different b_j for each different think tank. In contrast, if we had done what Nunberg says we did, we would only estimate two b_j’s, e.g., a b_L for liberal think tanks and a b_C for conservative think tanks. That we estimate a different b_j for each different think tank means that we allow for a continuum of different ideologies for the think tanks. Indeed, that is what we found. E.g., the b_j for the Heritage Foundation is significantly less than the b_j for the American Enterprise Institute, which is significantly less than the b_j for the Brookings Institution, which is significantly less than the b_j for the Urban Institute, and so on. As a consequence, if a media outlet cites a think tank that is cited predominantly by moderates in Congress, or one that is cited nearly equally by conservatives and liberals (e.g., the Brookings Institution was one such think tank), then that will cause our method to rate the media outlet as more centrist. Likewise, if a media outlet cites a far-left think tank, then this will cause our method to rate the outlet more liberal than if it had cited a centrist or moderately-left think tank.
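To make the structure concrete, here is a toy sketch of this kind of citation-choice model (our illustration for this response, not the paper’s code or data; every number is invented, and we assume extreme-value errors so that the choice probabilities take a logit form):

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(2)
    J, N, CITES = 4, 300, 40                  # think tanks, legislators, cites per legislator
    a_true = np.array([0.0, 0.3, -0.2, 0.1])  # invented non-ideological "valence" terms
    b_true = np.array([0.0, 1.0, 2.0, 3.0])   # a separate b_j for each think tank
    x = rng.uniform(0.0, 1.0, N)              # legislators' (rescaled) ideology scores

    def choice_probs(a, b, x):
        u = a[None, :] + np.outer(x, b)       # utility a_j + b_j * x_i
        eu = np.exp(u - u.max(axis=1, keepdims=True))
        return eu / eu.sum(axis=1, keepdims=True)

    # Simulate each legislator's citation counts across the J think tanks.
    counts = np.array([rng.multinomial(CITES, p)
                       for p in choice_probs(a_true, b_true, x)])

    def neg_loglik(theta):
        # Normalize a_0 = b_0 = 0: only relative positions are identified.
        a = np.concatenate([[0.0], theta[:J - 1]])
        b = np.concatenate([[0.0], theta[J - 1:]])
        return -np.sum(counts * np.log(choice_probs(a, b, x)))

    fit = minimize(neg_loglik, np.zeros(2 * (J - 1)), method="BFGS")
    print(fit.x[J - 1:])  # estimates of b_1, b_2, b_3: near 1.0, 2.0, 3.0

Because each b_j is estimated separately, the fitted model can place think tanks anywhere along a continuum; nothing in the estimation forces them into two poles.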

1c) Nunberg makes the same error when he writes “In fact, even though the ADA rating that G & L’s [sic] method assigned to the Rand Corporation (53.6) was much closer to the mean for all groups than that of the Heritage Foundation (6.17), G & L [sic] ignored that difference in computing the effect of citations of one or the other group on media bias, compounding the polarization effect.  That is, a media citation of a moderately left-of-center group (according to G & M’s criteria) balanced a citation of a strongly right-wing group.” 

Again, this is true for our back-of-the-envelope method, but it is not true for our primary method.  For an explanation, see our previous point.  Again, it is the latter method, not the back-of-the-envelope method, on which we base our main conclusions.

(A separate error in Nunberg’s statement is to call the above numbers, 53.6 and 6.17, “ADA ratings.”  We never do that, nor should anyone else.  Here is one reason (which is the simplest to explain).  It is conceivable that a think tank could be more right wing (or left wing) than any member of Congress in our sample.  If so, then the average member citing the think tank would necessarily have an ADA score that is higher than the think tank’s true score.  In fact, in general, if we defined the ADA score of the think tanks by the average score of the members citing them, then this in general would cause think tanks to appear more centrist than they really are.)
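A quick simulation illustrates the compression effect we have in mind (ours and purely illustrative; the positions and the citation rule are invented):

    import numpy as np

    rng = np.random.default_rng(3)
    x = rng.uniform(0, 100, 100_000)        # legislators' ADA scores
    for t in [0, 20, 50, 80, 100]:          # hypothetical think tank positions
        w = np.exp(-np.abs(x - t) / 15.0)   # nearer legislators cite more often
        print(t, round(np.average(x, weights=w), 1))

The average citer of the think tank at 0 scores about 15, and the average citer of the think tank at 100 scores about 85: defining a think tank’s score as its average citer’s score pulls the extremes toward the center, just as the parenthetical above describes.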

1d) Another error occurs where Nunberg writes, “Let’s begin with the assumption that underlies Groseclose and Milyo’s assignment of ratings to the various groups they looked at: if a group is cited by a liberal legislator, it’s liberal; if it’s cited by a conservative legislator, it’s conservative.”

We do not assume this, and in fact, it would be ridiculous if we did.  Nearly every think tank in our sample is cited at least once by a liberal legislator and at least once by a conservative legislator.  Thus, if we literally assumed the above statement, then almost every think tank in our sample would simultaneously be both a conservative and a liberal think tank.  It would be very strange for us to make an assumption that is contradicted almost everywhere in our data.

We think that what Nunberg meant to say is that we assume that “if a think tank tends to be cited by liberals, then it is liberal, and if it tends to be cited by conservatives, then it is conservative.”  This is a more reasonable statement, and it is true for our back-of-the-envelope method.  However, it is not true for our main statistical method.

As mentioned above, our main statistical method estimates a different b_j for each think tank.  These estimates indeed describe relative positions of the think tanks.  However, we do not assume that our method gives an absolute position.  In fact, it cannot give an absolute position.  As we note in the paper, it is actually impossible to identify all the b_j’s.  All our method can do is identify them up to an additive constant. As a consequence, we must set one of the b_j’s to an arbitrary constant.  Substantively, this means that while our method can reveal that the Heritage Foundation is to the right of the Economic Policy Institute, it cannot say, e.g., that the Heritage Foundation is to the right of the political center of the U.S., while the EPI is to the left of the center.  Although our results are consistent with this statement, our results are consistent with many other possibilities, including (1) Heritage is far to the right of the political center while EPI is near the political center, or (2) Heritage is near the political center while EPI is far to the left of the political center.  Indeed any statement that describes EPI to the left of Heritage would be consistent with our results. 
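A small numerical check makes this vivid (our illustration, with invented numbers and under the assumption of a logit-type citation model; it is not code from the paper). Shifting every b_j by the same constant c adds c times x_i to every think tank’s utility for legislator i, and that common term cancels out of the citation-choice probabilities:

    import numpy as np

    a = np.array([0.2, -0.1, 0.4])      # invented a_j's
    b = np.array([1.0, 2.5, 3.0])       # invented b_j's
    x = 0.7                             # one legislator's (rescaled) ADA score

    def choice_probs(a, b, x):
        u = a + b * x                   # utility a_j + b_j * x for this legislator
        eu = np.exp(u - u.max())        # softmax is unchanged by a common shift
        return eu / eu.sum()

    print(choice_probs(a, b, x))        # identical to the line below
    print(choice_probs(a, b + 5.0, x))  # shifting every b_j by c = 5 changes nothing

Since the data cannot distinguish the b_j’s from the b_j’s plus a constant, only differences among the b_j’s are pinned down; that is why one b_j must be set to an arbitrary constant.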

Why is this important? Nunberg says that our method divides think tanks into two dichotomous groups, liberal and conservative, and that we choose as our dividing line the middle of the Republican party. Later, we’ll explain why our paper does not define the political center at the middle of the Republican party. But, for the moment, assume that it does. Even if we did make such a strange (and misleading, we would argue) choice, this would not affect our method’s estimates of the media’s ADA scores. The reason is that to estimate ADA scores our method does not make (and cannot make) any sort of assessment about which side of the political center a think tank lies on.

1e) All the evidence above is consistent with the possibility that Nunberg read “The Estimation Method” section but just did not understand it. However, some other evidence suggests he really did not read the section at all. Here are the first two sentences of the section: “The back-of-the-envelope estimates are less than optimal for at least three reasons: (i) they do not give confidence intervals of their estimates; (ii) they do not utilize the extent [italics in original] to which a think tank is liberal or conservative (they only record the dichotomy, whether the think tank is left or right of center); and (iii) they are not embedded in an explicit choice model. We now describe a method that overcomes each of these deficiencies.” If Nunberg had really read these sentences, especially reason (ii), we do not see how he could possibly make the statements that he made in points 1b and 1c above. (Another possibility is that he read all sentences of the section except the first two. But this would be even stranger. Each of the sentences in the section except the first two and last six requires a fair amount of technical expertise. It would be strange for a person to read the difficult parts of the section but skip the easy parts.)

2)  Another criticism that Nunberg makes is that “In fact, their method assumes that there can be no such thing as objective or disinterested scholarship.”  This is the strangest sentence of all in Nunberg’s critique.  We make six points in response.  i) Our method does not make this assumption, and nowhere in the paper do we state anything like it. ii) Such a statement is neither necessary nor sufficient to justify our method.  iii) As professors at research universities, we consider the primary aspect of our jobs to produce objective and disinterested scholarship.  It would be very strange if we wrote a paper that assumes that such scholarship cannot exist at all.

iv)  Although we did not state it in the paper, our own view is nearly the exact opposite of this assumption.  Namely, by and large, we believe that all studies and quotes by the think tanks in our sample are true and objective.  However, it just happens that some, but not necessarily all, of these true and objective studies appeal differently to conservatives than liberals.  To see why, imagine that a researcher publishes a study in a very prestigious scientific journal such as the New England Journal of Medicine.  Suppose this study gives evidence that a fetus in the early stages of its mother’s pregnancy can feel pain (or cannot feel pain).  We are willing to bet that this true and objective study will appeal more to conservatives (liberals) than liberals (conservatives).  We are also willing to bet that conservatives (liberals) would tend to cite it more. 

This is all that our study assumes—that these studies can appeal differently to different sides of the political spectrum.  We do not assume that the authors of the studies necessarily have a political agenda.  Not only that, we do not even assume that each study will appeal differently to different sides of the political spectrum.  We only assume that it is possible that such studies will appeal differently.  That is, our method does not force each b_j to take a different value.  It allows for the possibility that the estimate of each b_j could be the same (of course, however, that does not happen with our data).

v)  We took great pains to include in our statistical model the possibility that there are factors besides ideology—including possibly a reputation for objective and disinterested scholarship—that can cause a think tank to be cited more frequently by the media and in Congress.  These are represented by the a_j’s that we estimate.  Our decision to include these parameters came at a considerable cost in terms of computer time and our own effort to estimate the model.  Including these parameters approximately doubles the number of parameters that we need to estimate.  This, for reasons that we explain in the last two paragraphs on p. 11, actually quadruples the effort and computer resources that we need to calculate the estimates.  As we explain, once we run the full model, we expect the statistical program to take approximately eight weeks to run.  If instead, we eliminated the a_j’s, the program would only take two weeks.  If we really assumed that there is no such thing as disinterested and objective research, why would we choose to estimate a much more complicated model that tries to account for this possibility?

vi)  In contrast, the assumption that Nunberg claims that we make seems to apply more to his views than ours, at least in regard to research on the media.  His second to last sentence reads, “It seems a pity to waste so much effort on a project that is utterly worthless as an objective study of media bias.”  Is he saying “there can be no such thing as an objective and disinterested” study of media bias?

3) Nunberg claims that “In effect, G & C [sic] have located the political center in the middle of the Republican Party, by which standard the majority of American voters would count as left-of-center.” Here is another case where Nunberg seems not to have read a section of the paper. We devote an entire section to defining the political center (the section is entitled “Digression: Defining the ‘Center’”). We conclude the section with the following sentence: “As a consequence, we think it is appropriate to compare the scores of media outlets with the House median, 39.0.”

We devote an entire table, Table 2, toward comparing the median and means of the entire Congress to the means of each party.  As we note, the Republican mean is 11.2.  Meanwhile the Democratic mean is 74.1.  By no stretch of the imagination is 39.0 in the middle of the Republican party.  In contrast, it is almost exactly equal to the midpoint of the middles (means) of the two parties.

We also illustrate this in Figures 2 and 3. Both figures list the median of the House, 39.0, and the averages of the Republican and Democratic parties. As anyone can see, 39.0 is approximately the midpoint between the two parties’ averages.

Finally, we also devote an entire table, Table 3, toward showing that 39.0 is indeed a moderate score and not a position in the middle of the Republican party. For instance, it is very near the score of Dave McCurdy (39.8), a Democrat who represented southern and central Oklahoma, a district that consistently and significantly voted for Republican presidential candidates. The 1994 Almanac of American Politics notes that he often broke with the Democratic Party, and in 1990 he formed a “Mainstream Forum” for moderate House Democrats. Our definition of the political center is also near the score of Tom Campbell (41.5), a Republican who represented two different districts in Silicon Valley. Both districts voted overwhelmingly for Gore in 2000. Campbell was one of a handful of House members (of either party) who voted against Newt Gingrich for speaker in 1997 while voting in favor of impeaching President Clinton. The 1998 Almanac of American Politics calls him “[c]onservative on economic issues, liberal on cultural issues.” It is also near the scores of Sam Nunn (D.-Ga.) and Arlen Specter (R.-Penn.). No one with an even moderate knowledge of American politics can say that these legislators are in the middle of the Republican Party.

4) Nunberg raises a number of issues about the set of think tanks we choose to analyze.  We make three points in response: a) Despite what he implies, we did not cherry-pick our list;  b) He bolsters this charge by reporting citation data about the Conference of Catholic Bishops and the National Association of Manufacturers.  If we add these groups to our list, this in general makes the media appear more liberal, not less.  c) Nunberg criticizes our list of think tanks for not being the most prominent possible set and for not being a “genuinely balanced” set of think tanks.  Even if these charges are true, we show that they do not necessarily imply a bias to our method.  That is, if we had used a more prominent set of think tanks or a more balanced set, it is just as likely that this would cause the media to appear more liberal as more conservative.

4a)  First, the cherry-picking charge.  When we began our study, Milyo, while searching the internet, found a list of think tanks that seemed to be a good place to start to look for data. This is the list created by Saraf.   We have never met Saraf, nor do we know anything about him except what he lists on his web site.  Further, when we first downloaded the list, we had not even read any other parts of his web site.  In short, we knew nothing about Saraf or how his list was created.  We chose the list simply because (i) it listed many think tanks, (ii) it seemed to include all the major ones, and (iii) it seemed to include a healthy balance of far-right, right-leaning moderate, moderate, left-leaning moderate, and far-left think tanks. 

(As Nunberg mentions, Saraf won an award from a Republican group; thus, it is possible, and maybe likely, that the list is stacked slightly in favor of right-wing groups. Later, we’ll explain why this will not cause a bias to our media estimates. But in the meantime, consider this: Suppose instead we had chosen a list that was stacked in favor of left-wing groups. We are certain that if we had done that, someone, possibly Nunberg himself, would accuse us of intentionally picking a left-wing list in order to make the media look liberal. Here’s how such a critic could explain his or her charge. “Because Groseclose and Milyo’s list has a disproportionate number of left-wing think tanks, this causes media outlets in their sample to appear to cite left-wing groups disproportionately. This, in turn, causes their method to report the media more liberal than it really is.” Later, we’ll explain why this argument is wrong. But for now suppose it is correct. Remember, our list, if anything, seems to be stacked the other way, toward more right-wing groups. This would cause our method to report the media more conservative than they really are.)

It was in the spring of 2002 that we first came across the list. Groseclose gave the list to his research assistants and asked them to begin data collection. After several months we considered adding more think tanks to the list. However, for two reasons we did not. One is simply the extra effort that it would bring upon us and our research assistants. We have now hired a total of 21 research assistants, and they have spent a total of approximately 5000 hours collecting data over a period of 2 ½ years, and we are still not quite finished. If we were, say, to expand our list to 300 think tanks, then this would cause our data-gathering exercise to take another year and a half, a total of about four years. At some point we have to say “Enough.”

But what about adding, say, 10 or 25 more think tanks?  Would that be such a large burden?  No, but if we did, our list would no longer be chosen exogenously by another authority.  We would be even more susceptible to charges that we cherry-picked our list.  Imagine how nefarious someone like Nunberg could make us look, saying, e.g., “Groseclose and Milyo began with a list chosen by another source.  But then for some puzzling reason they chose to add several think tanks.  Did the first list not give them the results they wanted?  One suspects that the media would not look so liberal if they had stuck to their original list.” 

Nunberg says that we should have used a set of think tanks “whose prominence was objectively determined.” We’re not sure how he defines “objectively determined,” but if he means “exogenously chosen” in the sense that econometricians and statisticians use the phrase, we agree. That’s exactly why we use a list chosen by someone else.

As a final word on the possibility we cherry-picked the set of think tanks to rig our result, recall that we have hired 21 research assistants for the data-gathering exercise.  We carefully chose them so that approximately half were Gore supporters in the 2000 election.   If we really did cherry-pick our list or, say, begin with one list and then switch to another, then almost surely one of these research assistants would recognize it.  Imagine the damage to our careers if one of them was able to step forward with such a charge.  Even if we had the lowest possible regard for honesty in research, wouldn’t self-interest alone motivate us not to cherry-pick a list given how many research assistants are involved in the project?

4b)  To bolster the charge that we chose an arbitrary set of think tanks, Nunberg gathers data from two think tanks that we did not include on our list: the National Association of Manufacturers and the Conference of Catholic Bishops.  He states that by not including groups such as these, we “exaggerate the media’s liberal tilt.” 

Our first response is simply to apply Nunberg’s critique to himself.  What is the “objective criterion” that he used to choose these two groups?  In the words of his own critique, he “gives no indication of how his list was compiled, or what criteria were used.”   

We are certain that some think tanks that we did not include would cause the media outlets to appear more liberal than we report.  We are also certain that other think tanks would cause the outlets to appear more conservative than we report.  Accordingly, it would be easy for a critic to cherry-pick two think tanks and then offer them as an example to show that the media are really more conservative than we estimate.  We would accuse Nunberg of engaging in such an exercise, except the two think tanks that he chooses work in the opposite direction.  If we had included them, our results would generally show the media to be more liberal, not less!

To see this, let us focus on our “back-of-the-envelope” method. Although this is not the method on which we base our conclusions, it is the one on which Nunberg bases his conclusions. Thus, if we want to explain Nunberg’s errors, it is better to focus on this method. Further, it happens that its results very closely approximate those of our primary method, and it is easier to explain the reasoning with this method than with our primary method.

Consider Nunberg’s claim, “By excluding conservative groups that are frequently mentioned in the media, the study appears to exaggerate the media’s liberal tilt.” On the surface, this appears to be an obvious and true statement. For instance, as Nunberg suggests (and our sample examination seems to verify), the National Association of Manufacturers is a group that conservative legislators cite more than liberal legislators. Thus, our back-of-the-envelope method would indeed classify it as a “conservative” group. As an example, consider ABC World News Tonight, which, for the period we examine, cites NAM 13 times. (Lexis-Nexis actually lists 17, but four of these are repeat entries.) When we add NAM, World News Tonight necessarily increases its proportion of conservative cites. This would seemingly make its ADA score more conservative. However, when we add NAM to the mix, this also causes Congress to increase its proportion of conservative cites, which makes Congress appear more conservative as well. Our method only estimates the extent to which a media outlet is liberal or conservative relative to Congress. Consequently, the net effect is not clear.

If World News Tonight is to make its ADA score more conservative, it must cite NAM at a relatively greater frequency than does Congress. It does not do this. Namely, when NAM is not in the mix, World News Tonight cites conservative groups 318 times. When we add NAM to the mix, this number becomes 331, an increase of 4.1 percent. Meanwhile, Congress’s conservative cites increase by a much greater degree. Without NAM, Congress cites conservative think tanks 4294 times. When we add NAM, this number becomes 4673, an increase of 8.8% -- more than double the increase associated with WNT.
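The arithmetic is short enough to check directly (the counts are the ones reported above; this snippet is simply a reader’s check of the relative-frequency logic):

    # NAM citation counts reported in the text above.
    wnt_conservative, wnt_nam = 318, 13
    congress_conservative, congress_nam = 4294, 379

    wnt_increase = wnt_nam / wnt_conservative                 # ~0.041, i.e., 4.1%
    congress_increase = congress_nam / congress_conservative  # ~0.088, i.e., 8.8%

    # Congress's conservative cites grow more than twice as fast as WNT's, so
    # relative to Congress, adding NAM makes WNT look more liberal, not less.
    print(f"{wnt_increase:.1%} vs. {congress_increase:.1%}")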

As a consequence, if we add NAM to our list of think tanks, this causes World News Tonight to appear more liberal, not more conservative. Specifically, when we recalculate its ADA score, it increases by 1.16 points. We did the same calculation with all the other media outlets in our sample except the Drudge Report (it is impossible to do the calculation for it because we do not have an archive of its old reports). These outlets are: (i) CBS Evening News, (ii) Fox News Special Report, (iii) L.A. Times, (iv) NBC Nightly News, (v) New York Times, (vi) USA Today. Their respective ADA scores increased by the following when we add NAM: 0.45, 2.32, 2.04, 0.42, –1.52, and 0.85. The New York Times’ score decreased; hence the negative number.

Nunberg reports data about CNN’s cites of the NAM. In the version of our paper that Nunberg criticizes, we do not examine any show on CNN. However, in a presentation that Groseclose made at the Stanford Workshop on Media and Economic Performance in Spring 2004, he presented results from CNN’s Newsnight with Aaron Brown. For the period that we examine, 11/9/01 to 2/5/04, Newsnight never cited NAM. Consequently, if we include NAM among our set of think tanks, then this would cause Newsnight’s ADA score to increase (i.e., to become more liberal). Specifically, it increases by 2.39 points.

(Here are some more details of our calculations. Nunberg reports that NAM received 617 mentions in Congress during the period we consider. In contrast, we found only 541 mentions. We use the latter number. [The anomaly could be explained by the possibility that Nunberg included the 108th Congress in his calculations; our study did not. Regardless, if one uses Nunberg’s number, this works even more in favor of the point we are making.] Next, we read the first 20 cases that Thomas, the official congressional web site, reports of NAM mentions in the 107th Congress. Six of these would not be counted in our sample as bona fide cites. For instance, one mention is a case where Rep. Thomas Sawyer lauds one of his constituents, who has just retired from the Goodyear Tire and Rubber Company. Sawyer notes that his constituent was a member of the National Association of Manufacturers’ Communication Council. Since this is not a case of a member of the NAM being cited as a policy expert, we do not include it in our sample. Three other cases were similar, and in two cases the legislator criticized the group. Thus, an estimate of the total number of citations that our method would count is 379 [= 541 x 14/20]. Of the sample of 14 cites that we read and did not exclude, 11 were made by Republicans and 3 by Democrats. For the media mentions, we excluded all editorials, letters to the editor, and cases where Lexis-Nexis lists the same mention twice. Of the remaining mentions, we excluded six cases where NAM was not cited as a policy expert. Two of these were with Nightly News. One mentioned a lawsuit that NAM had filed but did not quote any member of the group. Another mentioned that a member of NAM would appear on a future NBC show. The four other cases occurred with Special Report. E.g., in one, NAM was mentioned because it recently placed tenth on Fortune Magazine’s most powerful lobbyists list. Again, the story did not cite any member of NAM.)

Like the case with NAM, if we add the Conference of Catholic Bishops to the mix of think tanks, this causes most of the media outlets to appear more liberal, not less. The ADA scores of World News Tonight, Newsnight, and the above six media outlets increase by the following when we add CCB to the mix: 0.10, 0.34, -0.15, 0.21, -0.58, 0.23, -1.09, and 0.21. (The negative numbers indicate that the scores of Evening News, L.A. Times, and New York Times would decrease.) If we include both the CCB and NAM, the average score of the eight media outlets increases (i.e., becomes more liberal) by 0.46 points.

(Here are some more details of our calculations. By our calculations, CCB received 107 mentions by members of Congress. In contrast, Nunberg reports 130. Again, if one uses Nunberg’s number, this works in the direction of making our point even stronger; so let us adopt 107 as the correct figure. We read all 57 of the mentions that occurred in the 106th and 107th Congresses. We would include only 24 of these in our data set. That is, slightly more than half were not bona fide cases where a member of the group was being cited as a policy expert. Instead, most were cases like Rep. John LaFalce’s speech on May 22, 2002, when he eulogized Monsignor George Higgins. In the eulogy, LaFalce quoted kind words about Higgins from the president of the CCB. Of these 24 cites, 14 were by Republicans and 10 by Democrats. If CCB had been included in our list of think tanks, we estimate that this would add approximately 45 more congressional cites [= 107 x 24/57]. When CCB was mentioned in the media, it was usually in regard to the sexual-abuse scandal by priests. We would not count these as cites, since any quote by the CCB would be to defend its own organization, not a quote where it is treated as an outside expert on policy. To eliminate these cases we searched Lexis-Nexis using the search parameters “Conference of Catholic Bishops” and not “sex” and not “abuse.” We read the resulting mentions to make sure our method would count them as bona fide cites. The resulting cites for World News Tonight, Newsnight, and the above-mentioned media outlets were respectively 3, 0, 2, 1, 0, 7, 7, and 18.)

4c) Nunberg also criticizes our list of think tanks for not being the most prominent possible set and for not being a “genuinely balanced” set of think tanks.  However, there is no a priori reason why either criticism would bias our results.  Further, Nunberg does not give one.

First, let us address the charge about not selecting the most prominent set of think tanks. Nunberg writes, “Start with the list of groups from which G & M drew their initial sample. They describe this simply as a list of ‘the most prominent think tanks,’ …” Then he explains why our set is not the most prominent possible set—that is, there are groups not on our list that are more prominent than some of those on our list. Nunberg concludes this point by stating, “On the grounds of sample choice alone, in short, the Groseclose and Milyo study would be disqualified as serious research on ‘the most prominent think tanks.’”

Nunberg implies that we call our list “the 200 most prominent think tanks,” as if there were a way to rank the prominence of all think tanks, and we selected the top 200 from the list. However, we do not claim that. Here’s what we actually write: “The web site, www.wheretodoresearch.com lists 200 of the most prominent think tanks in the U.S.” The key word in the sentence is “of.” That is, we are only claiming, e.g., that of the possibly several hundred think tanks that one can call prominent, our list contains 200 of them. Nunberg is deceptive when he claims that we describe the list as “the most prominent think tanks.”

More important, for our study to give an unbiased estimate of the slant of media outlets, it does not matter if we have selected the 200 most prominent set of think tanks.  All we need is that the set is chosen exogenously (again, that’s why we let someone else choose our list). 

For the same reason, if one is running, say, a univariate regression, it does not matter if the researcher’s independent variable never takes the value that occurs most frequently in the population. For instance, suppose the independent variable is the height of male subjects and the dependent variable is the subjects’ arm length. Since heights follow a uni-modal distribution, the most “prominent” values of the independent variable are the ones associated with moderate heights. Suppose the researcher chose a wide mix of short, medium, and tall subjects, but failed to include any subject whose height is 5’10’’, the most common height among American males. No serious statistician would claim that this causes a bias. Similarly, no statistics or econometrics textbook claims that the set of independent variables must have a certain distribution if an estimator is to be unbiased. For the same reason, if we omit a few (or many) of the most prominent think tanks from our sample, this will not bias our results.

Relatedly, Nunberg criticizes Saraf for choosing a “jumble” of groups. If by “jumble” Nunberg means “random,” then for the purposes of our study, that is a compliment to the set, not a criticism. As we mentioned, what’s most important is that the set be chosen exogenously. As one learns in the most elementary econometrics classes, “random” is a sufficient (but not necessary) condition for “exogenous.” To see this, again consider the height-arm length example. If a researcher chose his subjects randomly, as opposed to choosing those with the most frequently observed (“prominent”) heights, this would not affect his findings about the relationship between height and arm length. That is, he or she will find that arm length is approximately half the subject’s height, and this estimate, “half,” would be the same regardless of which of these two samples he or she chooses.

Nunberg notes that Saraf is “a free-lance researcher with a masters degree in history who lists among his achievements that he was named Man of the Year by the Cheshire (Connecticut) Republican Town Committee.” We’re not sure of Nunberg’s purpose in this description, but we suspect it was to criticize the credentials of Saraf. If so, this is a little vicious. But it matters not a whit to our results. That is, suppose Saraf has even lower research credentials. Suppose even that he’s only a trained monkey who picked the set randomly. That does not cause a bias to our results (nor does Nunberg even attempt to explain why it could). In fact, if Saraf’s research credentials are indeed low, one could even argue that this is all the more reason to believe that the set was formed randomly (thus exogenously), and hence, it’s even better for our method.

Another point that Nunberg raises is that many of our groups are not pure think tanks. E.g., some, such as the NAACP, the NRA, and the ACLU, are more appropriately described as activist groups. We are guilty of calling all of them “think tanks.” We do this only because it is unwieldy to call them, throughout the paper, “think tanks, activist groups, and other policy groups.” But more important, there’s no a priori reason to exclude groups that are not pure think tanks. Likewise, there’s no a priori reason to exclude pure think tanks and to use only activist groups. For our method, the key is to include groups that are cited both by the media and by members of Congress. In fact, just imagine the criticism to which we would expose ourselves if we had used only one type of group. Someone such as Nunberg could say, “It is ‘puzzling’ why Groseclose and Milyo included only pure think tanks in their list. This alone would disqualify the study as serious research.” Or, alternatively, if we had done the opposite, such a critic could say, “It is ‘puzzling’ why Groseclose and Milyo included only activist groups in their list. … ”

A separate issue is whether the list of think tanks is ideologically balanced.  Nunberg is not clear in which direction he thinks Saraf’s set is ideologically imbalanced.  We think, if anything, Saraf’s set is slightly skewed toward containing more conservative groups—e.g. it contains none of the “Nader” groups such as Public Citizen, Center for Auto Safety, and Center for Science in the Public Interest.  And Nunberg notes that Saraf was awarded Man of the Year by a Republican group. (We do not know why Nunberg mentioned this.  It is possible that it was only to denigrate Saraf’s credentials and not to suggest that the list is skewed in the conservative direction.)  On the other hand, Nunberg writes “by excluding conservative groups that are frequently mentioned in the media, the study appears to exaggerate the media’s liberal tilt.”

But even if our sample of think tanks is skewed left or right, this will not bias our results. To see this, consider the above regression where the researcher includes twice as many tall subjects as short subjects. As we explained, this will not affect the expected relationship between height and arm length—that is, the estimated parameter associated with the independent variable. In other words, it will not bias the estimates.

5) Nunberg writes, “Then, too, Groseclose and Milyo’s survey of the citations of groups in the Congressional Record shows some results that would most kindly be described as puzzling.” He focuses especially on the results we report for two groups, the ACLU and the Alexis de Tocqueville Institution. Nunberg is dishonest in his presentation of our ACLU results. In his presentation of the results surrounding the Alexis de Tocqueville Institution, he reveals, once again, that he did not read our paper very well: that organization ranks highly based on the criterion of sentences cited, not total cites (but Nunberg misses this point). Also, with each group, Nunberg makes a suggestion that, if we were to follow it, would make the media outlets in our sample appear more liberal, not more conservative.

Consider the ACLU results.  Nunberg writes:

“At another point G & M explain that they disregarded the ACLU in their final analysis because it turned up with an excessively conservative score, owing to Republicans who cited it for its opposition to McCain-Feingold.”

Here’s what we actually wrote:

“The primary reason the ACLU appears so conservative is that it opposed the McCain-Feingold Campaign Finance bill. Consequently, conservatives tended to cite this fact often. Indeed, slightly more than half of the ACLU sentences cited in Congress were due to one person, Mitch McConnell (R.-Ky.), who strongly opposed the McCain-Feingold bill. If we omit ACLU citations that are due to McConnell, then the average score, weighted by sentences, increases to 70.12. Because of this anomaly, in the Appendix we report the results when we repeat all of our analyses but omit the ACLU data. This causes the average score of the media outlets to become approximately one ?? point more liberal.”

At this point, we ask you, the reader, to re-read these two passages.  With many of Nunberg’s criticisms, he is simply sloppy or careless, or simply misunderstands some technical details of our method.  With this point he is dishonest.

Despite what he writes, our final analysis included the ACLU data. In fact, it turns out that the only analysis that we report in the paper contained the ACLU data. Our passage notes that we did the analysis both ways: with and without the ACLU data. The results with the ACLU data are reported in the main text, and the results without the ACLU data are reported in the Appendix. However, we have not yet written the Appendix (and of course the version of our paper to which Nunberg links lists no Appendix). Thus, the only results we report in the paper are the ones that do not disregard the ACLU data. The paper is still a rough draft, polished enough to present at academic seminars (that is where the paper is listed—on the web page for a Yale seminar series, where Groseclose presented the paper). Yet it is clearly not in its final form. Indeed, throughout the paper we have written “xx” where we intend to fill in details, and in fact the above passage regarding our results when the ACLU is omitted lists “??” in the sentence. We have done some preliminary analysis that suggests that the ADA scores of media outlets will increase by about one point when we omit the ACLU data.

Remember that an increase in an ADA score means the outlet becomes more liberal. Nunberg writes that our final analysis disregarded the ACLU data, and he implies that we should have done the opposite. Of course, if we follow this suggestion (which, it turns out, we did), this makes the media appear more conservative, not more liberal, than if we had disregarded the ACLU data.

Relatedly, Nunberg’s next two sentences after the above sentence are, “Other researchers might wonder whether there might be similar anomalies in the results obtained for other groups, and might even suspect that this result cast some doubt on their overall method. G & M seem untroubled by that possibility.”

How ominous. We are “untroubled by that possibility.” It turns out that out of the 200 think tanks in our sample, there seem to be only two anomalous rankings. First is the Rand Corporation, which our method places to the left of center. We have mentioned this finding to four scholars at Rand. None were surprised, and each agreed that the result is due to the fact that most of the conservative scholars at Rand focus primarily on military research, and these studies tend not to be cited very frequently by the media and members of Congress. Part of the reason is that these studies are often classified. The other anomaly was the ACLU. Our method ranked it (just barely) among the most conservative half of the think tanks. As we mention in the paper, the reason is due to one person, Senator Mitch McConnell. After the ACLU announced that it opposed McCain-Feingold, McConnell seemed to mention this at every opportunity he had. In fact, he alone accounted for half of the total congressional citations to the ACLU. No other think tank had such an odd distribution of citations.

In closing, we have devoted considerable time and effort to responding to Nunberg’s irresponsible charges.  We do not intend to repeat this exercise for every bit of malicious gossip posted by someone on one of these “blogs.” By exposing Nunberg’s errors and deceptions we hope to encourage other scholar/bloggers to behave in a more professional manner. 

August 2nd, 2004

Tim Groseclose
Jeff Milyo

Posted by Mark Liberman at August 2, 2004 05:07 PM