How does one go about creating a quantitative measure of legislative partisan periphery? This is a challenge I face in the DroneScout project, as we try to identify the determinants of Congressional oversight of drones, and whether partisan periphery is one of them.

For a concept as elusive and subjective as partisanship, there is a surprising amount of research already done into quantifying it among Congressional lawmakers.

The most prominent of these measures is DW-NOMINATE, a scaling system that plots legislators along two axes, each running from -1 to 1. The first axis, on which a legislator's position is given by their first dimension DW-NOMINATE score, generally indicates fiscal tendencies. Democrats tend to have negative first dimension scores, and Republicans tend to have positive scores.

The second axis previously indicated social tendencies. However, Poole and Rosenthal, the authors of the original NOMINATE and DW-NOMINATE papers, noted the following:

The 2nd dimension picks up the conflict between North and South on Slavery before the Civil War and from the late 1930s through the mid-1970s, civil rights for African-Americans. After 1980 there is considerable evidence that the South realigns and the 2nd dimension is no longer important.

Here is a visualization of the 111th Congress according to the W-NOMINATE scaling method:

111th Congress according to W-NOMINATE
Source: Wikimedia Commons — Chris Hare

What's great about DW-NOMINATE (and W-NOMINATE, as shown in the previous figure) is that the first dimension, the one that indicates fiscal partisanship, also acts as a good indicator of general perceived partisanship. Consider some of the most liberal and conservative lawmakers, according to their DW-NOMINATE scores:

Most liberal: Goodwin (WV), Warren (MA), Lee (CA), and Sanders (VT).

Most conservative: Paul (KY), Cruz (TX), Broun (GA), and Sasse (NE).

Sounds about right.

Because DW-NOMINATE is tried and true, I'm going to build this measure of partisan periphery on top of it. All of the figures, then, will be based on a legislator's party and first dimension score (nowadays, the second dimension is meaningless if we aren't trying to predict voting outcomes).

Now that a respectable measure of partisanship has been found, it's time to design a measure of partisan periphery.

The final measure should indicate how far a legislator's tendencies are from the norm of their party, and must:

  • be consistent across time, so that a legislator's partisanship score is the same across their career and can be compared with those of other legislators at different times (this is accomplished by using Constant Space DW-NOMINATE scores)
  • indicate both the direction and the magnitude of periphery (i.e. whether the legislator is a moderate or an extremist)
  • not rely on any particular thresholds (i.e. a score of 0.6 or above cannot be called 'extreme conservatism' unless 0.6 has some special statistical significance)
  • be single-dimensional (though it may be signed)

Here are the acceptable caveats of the measure:

  • it may rely on a two party system

Here's an idea: the periphery score of any given legislator is the number of standard deviations between that legislator's first dimension DW-NOMINATE score and the party mean. If a legislator's score deviates in the direction of the mean of the other party, they have a negative periphery score. If their score deviates away from the mean of the other party, they have a positive periphery score.

Thus, the periphery score is simply the number of standard deviations from the party mean, with the sign determined by the direction of the deviation.
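As a sketch, the same definition can be written as a single formula. The symbols are introduced here for illustration and do not come from the DW-NOMINATE papers: x is a legislator's first dimension score, \mu_P and \sigma_P are the mean and standard deviation of that legislator's party, and \mu_Q is the mean of the other party.

```latex
\[
\text{periphery}(x) \;=\; \operatorname{sgn}(\mu_P - \mu_Q)\,\frac{x - \mu_P}{\sigma_P}
\]
```

The fraction is an ordinary z-score within the party. The sign factor only flips it where needed, so that a positive value always means "away from the other party" and a negative value always means "toward it."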

Now it's time to implement this measure in a pre-existing dataset. To do this, I am going to simply expand on the original source code (which conveniently has already loaded all legislators' Constant Space DW-NOMINATE scores).

First, I am going to define an array of all legislators:

raw_legislators = []

To avoid legislator duplicates, I will also define an array of already-included legislator ICPSR identification codes:

_already_included_legislators = []

To keep memory use down (and to prevent irrelevant legislators from the past from influencing the party means), I will define the earliest Congress from which I would like to include data. In the case of this project, the minimum Congress must be the 106th, as it began in 1999 and ended in 2001. (The scope of this project only includes legislative proceedings from the year 2000 onwards.)

minimum_congress = 106

Then, as I iterate through all the legislators, I simply add their data to the master raw_legislators array provided they are not already in it and the Congress they served in is greater than or equal to minimum_congress. (Note that because a legislator's Constant Space DW-NOMINATE score is the same throughout their entire career, it is not incorrect to simply consider the 'earliest' value.)

if dat['icpsr'] not in _already_included_legislators and dat['congress'] >= minimum_congress:
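To make the filtering step concrete, here is a minimal, self-contained sketch of how such a loop might look. The sample rows are hypothetical stand-ins for the real data file, and the set named already_included is my substitution for the list used above; it is not necessarily how the original source is written.

```python
# Hypothetical sample rows; the real loop reads these from the DW-NOMINATE data.
rows = [
    {"icpsr": 1, "congress": 105, "party_code": 100, "dim_1": -0.30},
    {"icpsr": 1, "congress": 106, "party_code": 100, "dim_1": -0.30},
    {"icpsr": 1, "congress": 107, "party_code": 100, "dim_1": -0.30},
    {"icpsr": 2, "congress": 104, "party_code": 200, "dim_1": 0.45},
    {"icpsr": 3, "congress": 110, "party_code": 200, "dim_1": 0.52},
]

minimum_congress = 106

raw_legislators = []
# A set gives constant-time membership tests; a list also works, only slower.
already_included = set()

for dat in rows:
    # Keep the first qualifying row for each legislator and skip the rest.
    if dat["icpsr"] not in already_included and dat["congress"] >= minimum_congress:
        raw_legislators.append(dat)
        already_included.add(dat["icpsr"])

included_ids = [legislator["icpsr"] for legislator in raw_legislators]
print(included_ids)  # [1, 3]
```

Legislator 2 is dropped because their only row predates the 106th Congress, and legislator 1 appears once even though they served in several qualifying Congresses.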

Now that all the legislators are included in a single array, it is time to find the party means for the Republicans (party code 200) and Democrats (party code 100), as well as the standard deviations.

Fortunately, I don't have to calculate the means and standard deviations myself. Instead, NumPy will do it for me:

import numpy

# numpy.std defaults to the population standard deviation (ddof=0).
republican_std_dev = numpy.std([legislator["dim_1"] for legislator in raw_legislators if legislator["party_code"] == 200])
democrat_std_dev = numpy.std([legislator["dim_1"] for legislator in raw_legislators if legislator["party_code"] == 100])

republican_mean = numpy.mean([legislator["dim_1"] for legislator in raw_legislators if legislator["party_code"] == 200])
democrat_mean = numpy.mean([legislator["dim_1"] for legislator in raw_legislators if legislator["party_code"] == 100])
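The four list comprehensions repeat the same filter, so one possible tidy-up is to group scores by party in a single pass. The sketch below is an illustration only: party_stats is a hypothetical helper, the legislators are made up, and it uses the standard library's statistics module rather than NumPy so that it runs on its own. statistics.pstdev computes the population standard deviation, which is also what numpy.std returns by default.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical legislators; only "party_code" and "dim_1" matter here.
raw_legislators = [
    {"party_code": 100, "dim_1": -0.5},
    {"party_code": 100, "dim_1": -0.3},
    {"party_code": 100, "dim_1": -0.1},
    {"party_code": 200, "dim_1": 0.2},
    {"party_code": 200, "dim_1": 0.6},
]

def party_stats(legislators):
    """Return {party_code: (mean, population std dev)} of first dimension scores."""
    scores = defaultdict(list)
    for legislator in legislators:
        scores[legislator["party_code"]].append(legislator["dim_1"])
    return {code: (mean(vals), pstdev(vals)) for code, vals in scores.items()}

stats = party_stats(raw_legislators)
democrat_mean, democrat_std_dev = stats[100]
republican_mean, republican_std_dev = stats[200]
```

Whether the population or the sample standard deviation is the better choice is a judgment call; with several hundred legislators per party the two differ only slightly.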

The resulting data:

Republican standard deviation: 0.160409314961
Democrat standard deviation: 0.139125844637

Republican mean 1st dimension score: 0.439763665595
Democrat mean 1st dimension score: -0.336179190751

With this data, it is now possible to create a simple function that calculates the partisan periphery of any legislator:

def periphery(legislator):
    # Only the two major parties have a mean and standard deviation to compare against.
    if legislator["party_code"] not in [100, 200]:
        print(legislator["party_code"])
        return None
    party_std_dev = None # not pythonic, not a problem
    party_mean = None
    opposite_mean = None
    # Compare party codes with ==, not "is": "is" tests object identity, not equality.
    if legislator["party_code"] == 200:
        party_std_dev = republican_std_dev
        party_mean = republican_mean
        opposite_mean = democrat_mean
    elif legislator["party_code"] == 100:
        party_std_dev = democrat_std_dev
        party_mean = democrat_mean
        opposite_mean = republican_mean

    # Magnitude: number of standard deviations from the party mean.
    score = abs(party_mean - legislator["dim_1"]) / party_std_dev

    # Negative when the legislator deviates toward the other party's mean.
    neg = (legislator["dim_1"] < party_mean) == (opposite_mean < party_mean)
    if neg:
        score *= -1

    return score
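To see the arithmetic at work, here is a compact, self-contained variant that hard-codes the party means and standard deviations reported above. It is a sketch rather than the project's code: periphery_score, PARTY_STATS and OPPOSITE are hypothetical names, and the first dimension scores passed in are invented for illustration.

```python
# Party statistics as reported in the post: party_code -> (mean, std dev).
PARTY_STATS = {
    200: (0.439763665595, 0.160409314961),   # Republicans
    100: (-0.336179190751, 0.139125844637),  # Democrats
}
OPPOSITE = {200: 100, 100: 200}

def periphery_score(party_code, dim_1):
    """Signed number of standard deviations from the party mean.

    Positive means away from the other party's mean, negative means toward it.
    """
    if party_code not in PARTY_STATS:
        return None
    party_mean, party_std_dev = PARTY_STATS[party_code]
    opposite_mean = PARTY_STATS[OPPOSITE[party_code]][0]
    # +1 if the other party lies below this party's mean, else -1.
    away = 1.0 if opposite_mean < party_mean else -1.0
    return away * (dim_1 - party_mean) / party_std_dev

# Invented first dimension scores, chosen to make the signs easy to follow.
print(periphery_score(200, 0.60))   # Republican right of the party mean: positive
print(periphery_score(200, 0.28))   # Republican left of the party mean: negative
print(periphery_score(100, -0.48))  # Democrat left of the party mean: positive
print(periphery_score(328, 0.10))   # code outside the two-party map: None
```

Multiplying by a sign factor instead of taking an absolute value and then negating gives the same result, and it makes the link to an ordinary within-party z-score easier to see.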

This system works surprisingly well. Here are the periphery scores for several Congresspeople who are known for their extreme tendencies (and would be expected to have a high, positive score):

WARREN: 2.67973797551
CRUZ: 2.60730702895
PAUL: 2.63847728861

Other politicians are better known for their "straight-shooting" central-partisan tendencies, and would be expected to have scores close to zero. Consider the periphery scores of the following politicians:

MOULTON: 0.0274629725233
BISHOP: -0.0234628867767
GUINTA: -0.0795693541737

Lastly, some politicians are known for their very moderate tendencies. As such, they would be expected to have large negative scores:

MORELLA: -2.85372246436
GILMAN: -2.47344529644
BOEHLERT: -2.09940218046

Now, the measure is ready to be implemented in the dataset. By using an already-existing measure of partisan tendencies (Constant Space DW-NOMINATE), it was possible to simply extend the metric and calculate the number of standard deviations from the party mean, changing sign as necessary. It is a simple method, but also an effective one.