From Script to Network

Here is how we turned the HBO series into a collection of networks.

The scripts for the show are not publicly available (except for those that have been submitted for Emmy consideration). Therefore, we processed fan-authored scripts available on genius.com. These scripts start from the closed-caption subtitles, which are then organized into scenes and embellished with stage directions. Some seasons are incomplete or missing, so we rolled up our sleeves and created some scripts ourselves.

We processed the scripts, adding a link between Character A and Character B when:

  1. They appear in a scene together
  2. They appear in a stage direction together
  3. They exchange dialog
  4. Character A mentions Character B
  5. Another character mentions Character A and Character B together
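The five rules above can be sketched in code. This is a minimal illustration, not the code we actually ran: the scene layout (a dict with "present", "directions", and "lines"), the helper names, and the naive substring matching of names are all assumptions made for the example, and the toy scene is invented rather than quoted from a script.

```python
from itertools import combinations

def pair(a, b):
    """Undirected link key: the two names in alphabetical order."""
    return tuple(sorted((a, b)))

def links_for_scene(scene, known):
    """Return {pair: set of rule labels} for one scene.

    `scene` is a hypothetical dict with:
      "present":    names of the characters in the scene
      "directions": list of stage-direction strings
      "lines":      list of (speaker, text) tuples, in order
    `known` is the list of character names to look for in text.
    """
    found = {}

    def add(a, b, rule):
        if a != b:
            found.setdefault(pair(a, b), set()).add(rule)

    def named_in(text):
        # Naive substring match; real scripts need alias resolution.
        return [name for name in known if name in text]

    # Rule 1: characters appear in a scene together.
    for a, b in combinations(sorted(scene["present"]), 2):
        add(a, b, "scene")

    # Rule 2: characters appear in a stage direction together.
    for direction in scene["directions"]:
        for a, b in combinations(named_in(direction), 2):
            add(a, b, "direction")

    # Rule 3: consecutive lines spoken by different characters.
    lines = scene["lines"]
    for (s1, _), (s2, _) in zip(lines, lines[1:]):
        add(s1, s2, "dialog")

    for speaker, text in lines:
        mentioned = named_in(text)
        # Rule 4: the speaker mentions another character.
        for name in mentioned:
            add(speaker, name, "mention")
        # Rule 5: a third character mentions two characters together.
        for a, b in combinations(mentioned, 2):
            if speaker not in (a, b):
                add(a, b, "co-mention")
    return found

# Invented toy scene, for illustration only.
toy_scene = {
    "present": {"Ned", "Arya"},
    "directions": ["Ned hands Arya a sword."],
    "lines": [("Ned", "Where is Sansa?"), ("Arya", "With Jon and Sansa.")],
}
toy_links = links_for_scene(toy_scene, ["Arya", "Jon", "Ned", "Sansa"])
```

Recording which rule produced each link, rather than a bare yes/no, makes it easy to weight the rules differently later on.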

[Figure: GOT Script to Network]

When two characters appear in the same scene, they are clearly linked. When a character exchanges dialog with another, our network captures that interaction, and an extended conversation (with lots of back-and-forth) creates a stronger link. When one character mentions another, that signifies some relationship between them, so they are linked as well. Likewise, if two characters are spoken of together, they are linked in the mind of a third character (and also in the mind of the viewer).
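To make the idea of "stronger links" concrete, here is a small sketch of one possible weighting. It assumes that each hand-off from one speaker to the next adds 1 to the link between them; that scheme and the function name are assumptions for illustration, not necessarily the weights we used.

```python
from collections import Counter

def dialog_weights(speakers):
    """Count speaker changes between consecutive lines of a scene.

    `speakers` is the ordered list of who speaks each line. Every
    hand-off between two different characters adds 1 to their link
    (an assumed weighting scheme).
    """
    weights = Counter()
    for a, b in zip(speakers, speakers[1:]):
        if a != b:
            weights[tuple(sorted((a, b)))] += 1
    return weights

# A long back-and-forth followed by a single hand-off to a third speaker.
order = ["Tyrion", "Jon", "Tyrion", "Jon", "Tyrion", "Benjen"]
print(dialog_weights(order))
```

In this toy exchange the Jon and Tyrion link ends up with weight 4, while the Benjen and Tyrion link gets weight 1, so the longer conversation produces the heavier edge.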

Note that our network emphasizes dialog (verbal interaction) over stage direction (physical interaction). Given our data set, this makes sense: the dialog is the most authoritative part of our scripts, since it is taken directly from the closed captioning. But this means that extended action scenes (like epic battles) are de-emphasized by our network creation process. There is no “perfect way” to construct an interaction network. We must remain aware of the limitations of our process, and of how those limitations affect our conclusions.

Nicknames and Disambiguation

Of course, characters have multiple names and nicknames, so we developed code to resolve the many ways of referring to each character. This includes resolving some references by context. For example: “dolt brother” refers to Jon Snow in the example shown above. The scripts are short enough that we can perform this disambiguation in a comprehensive manner.
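One simple way to organize this kind of resolution is a pair of lookup tables: a global alias table, plus a context table consulted first. The sketch below assumes exactly that layout. The table names, the `resolve` function, and the scene identifier "scene-042" are hypothetical, and the entries are a tiny illustrative sample.

```python
# Global aliases: lowercased phrase -> canonical character name.
ALIASES = {
    "jon": "Jon Snow",
    "lord snow": "Jon Snow",
    "the imp": "Tyrion Lannister",
}

# Context-dependent aliases: (scene id, lowercased phrase) -> name.
# "scene-042" is a made-up identifier for illustration.
CONTEXT_ALIASES = {
    ("scene-042", "dolt brother"): "Jon Snow",
}

def resolve(phrase, scene_id=None):
    """Map a name, nickname, or description to a canonical character.

    Context-specific entries take priority over global ones.
    Returns None when the phrase cannot be resolved.
    """
    key = phrase.strip().lower()
    if (scene_id, key) in CONTEXT_ALIASES:
        return CONTEXT_ALIASES[(scene_id, key)]
    return ALIASES.get(key)

print(resolve("Lord Snow"))                  # global alias
print(resolve("dolt brother", "scene-042"))  # resolved by context
print(resolve("dolt brother"))               # ambiguous without context
```

Keeping ambiguous phrases out of the global table means they resolve only where the context has been checked by hand.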

Network Analysis and Visualization

The final step was to import the networks into Gephi, an easy-to-use network analysis package. It is a good choice for our relatively small networks, and it offers many options for making attractive network layouts.
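As a sketch of how a weighted network could be handed to Gephi, the snippet below writes an edge list as CSV with Source, Target, Type, and Weight columns, which Gephi's spreadsheet importer reads as an edges table. The function name and the sample weights are assumptions for illustration; our actual export may have differed.

```python
import csv
import io

def write_edge_list(weights, out):
    """Write {(a, b): weight} as a CSV edge list for Gephi.

    `out` is any text file-like object. Rows are sorted so the
    output is stable from run to run.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target", "Type", "Weight"])
    for (a, b), weight in sorted(weights.items()):
        writer.writerow([a, b, "Undirected", weight])

# Sample weights, invented for illustration.
sample = {("Jon Snow", "Tyrion Lannister"): 4,
          ("Arya Stark", "Jon Snow"): 2}
buffer = io.StringIO()
write_edge_list(sample, buffer)
print(buffer.getvalue())
```

To write a real file instead of an in-memory buffer, open it with `open(path, "w", newline="")` and pass the file object as `out`.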

