What are consensus networks?
Consensus networks are a visually intuitive way to show the relative similarity between objects that can be similar in a variety of distinct ways. The concept of a consensus network was developed in phylogenetics, when researchers wanted a more robust way to compare several organisms by several different aspects at once.
In our consensus networks, each node represents a text. A connection between two nodes indicates that the texts they represent are similar. The thickness of each line shows the strength of the connection: the thicker the line, the more similar the two texts.
What do the nodes represent?
Each node represents a text of about 1000 words. In this network, each node represents one of the Federalist Papers.
What do the connections represent?
Each connection indicates similarity between the two nodes it connects. Thicker lines mark stronger connections, which show a greater degree of similarity between the documents they connect.
Why do some nodes have more connections than others?
When the networks are generated, each node 'reaches out' to the three nodes to which it is most similar. This process is repeated many times, using a different measure to compare the documents each time. Because of this, a node can choose different neighbors in different iterations of the network creation. As a result, some nodes end up with more than three distinct neighbors, whereas others choose the same three every time.
Additionally, although each node is guaranteed to reach out to at least three others, there is no guarantee that any given node will be chosen by another. A node may therefore never be chosen at all, which leaves it with fewer connections.
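The selection process described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used to build our networks: the node names, the two score tables, and the helper three_nearest are all hypothetical, and a single number per text stands in for a real textual measure.

```python
def three_nearest(scores, node, k=3):
    """Return the k nodes whose score is closest to the given node's score."""
    others = [n for n in scores if n != node]
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(others, key=lambda n: (abs(scores[n] - scores[node]), n))[:k]

# Two hypothetical measures; each assigns one made-up score per text.
measure_1 = {"A": 0, "B": 1, "C": 2, "D": 3, "E": 4, "F": 20}
measure_2 = {"A": 0, "B": 1, "C": 2, "D": 4, "E": 3, "F": 20}

# Collect every neighbor each node chooses across both measures.
outgoing = {node: set() for node in measure_1}
for scores in (measure_1, measure_2):
    for node in scores:
        outgoing[node].update(three_nearest(scores, node))

chosen_by_anyone = set().union(*outgoing.values())

print(sorted(outgoing["A"]))      # A picks a different third neighbor under each measure
print(sorted(outgoing["F"]))      # F picks the same three neighbors both times
print("F" in chosen_by_anyone)    # F is never chosen by any other node
```

In this toy data, node A ends up with four distinct neighbors, while node F keeps exactly three and is never chosen by anyone else.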
What determines the placement of the nodes in the graph?
The graphs are made using force-directed graphing software. In this system, the nodes sit in 3-D space, and pull on each other where there are connections between them. Stronger connections between documents pull their nodes closer together. Then, a force is applied to push the network apart. In the resulting graph, the nodes fall into place based on their relative similarity to other documents in the graph.
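For readers curious about the mechanics, here is a rough sketch of a force-directed layout in plain Python. It is not the software used for our graphs. It assumes a simple model in which connected nodes pull on each other like springs scaled by connection strength, and every pair of nodes pushes apart with an inverse-square force; the function name force_layout and all parameter values are illustrative assumptions.

```python
import math
import random

def force_layout(nodes, edges, iterations=2000, step=0.02, repulsion=1.0, seed=0):
    """Place nodes in 3-D space; edges maps (node_a, node_b) to a connection strength."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1.0, 1.0) for _ in range(3)] for n in nodes}
    for _ in range(iterations):
        force = {n: [0.0, 0.0, 0.0] for n in nodes}
        # Every pair of nodes pushes apart (this spreads the network out).
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                delta = [pa - pb for pa, pb in zip(pos[a], pos[b])]
                dist = max(math.sqrt(sum(d * d for d in delta)), 1e-6)
                push = repulsion / dist ** 2
                for axis in range(3):
                    unit = delta[axis] / dist
                    force[a][axis] += push * unit
                    force[b][axis] -= push * unit
        # Connected nodes pull together; stronger connections pull harder.
        for (a, b), weight in edges.items():
            for axis in range(3):
                pull = weight * (pos[b][axis] - pos[a][axis])
                force[a][axis] += pull
                force[b][axis] -= pull
        # Move each node a small, capped amount along its net force.
        for n in nodes:
            for axis in range(3):
                move = step * force[n][axis]
                pos[n][axis] += max(-0.1, min(0.1, move))
    return pos

def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Hypothetical example: A is strongly connected to B and weakly connected to C.
positions = force_layout(["A", "B", "C"], {("A", "B"): 5.0, ("A", "C"): 1.0})
print(distance(positions["A"], positions["B"]))
print(distance(positions["A"], positions["C"]))
```

Under this model, the strongly connected pair A and B should settle closer together than the weakly connected pair A and C.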
What is meant by the terms incoming and outgoing connections?
During network creation, each node 'reaches out' to at least three other nodes. Each of these connections is considered an outgoing connection. If a node is chosen by another node, that connection is considered incoming.
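As a small illustration of the two terms, the Python sketch below counts outgoing and incoming connections from a table of choices. The node names, the choices table, and the function degree_counts are hypothetical and only serve to make the definitions concrete.

```python
from collections import Counter

def degree_counts(choices):
    """choices maps each node to the list of nodes it reached out to."""
    # Outgoing: how many distinct nodes this node chose.
    outgoing = {node: len(set(chosen)) for node, chosen in choices.items()}
    # Incoming: how many distinct nodes chose this node.
    incoming = Counter()
    for chosen in choices.values():
        for target in set(chosen):
            incoming[target] += 1
    return outgoing, {node: incoming[node] for node in choices}

# Hypothetical choices: every node reaches out to three others.
choices = {
    "A": ["B", "C", "D"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["A", "B", "C"],
    "E": ["A", "B", "C"],
}

outgoing, incoming = degree_counts(choices)
print(outgoing)
print(incoming)
```

Here node E has three outgoing connections but no incoming ones, because no other node chose it.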
How are the consensus networks created?
Consensus networks are created by comparing each text to each other text using many different methods of comparison.
Each text is compared to every other text based on a textual feature. Then, the text ‘reaches out’ to its three nearest neighbors. This process is repeated with several textual features, and similarity is measured in several different ways for each textual feature. Each time a text reaches out to one of its neighbors, a connection is made between those two nodes.
All these connections are layered on top of one another, resulting in a robust web of connections between texts. The ‘consensus’ portion of the name refers to this layering of connections, as the network provides a consensus of where strong inter-textual connections lie.
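The layering can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not our actual pipeline: the feature names, the made-up vectors, the two distance measures (Manhattan and Euclidean), and the function build_consensus are all hypothetical, and the strength of a connection is assumed to be the number of times it was made.

```python
import math
from collections import Counter

def manhattan(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def build_consensus(features, measures, k=3):
    """Layer the nearest-neighbor connections from every feature/measure pair."""
    weights = Counter()
    for vectors in features.values():
        for measure in measures:
            for text in vectors:
                others = [t for t in vectors if t != text]
                # Rank the other texts by distance; break ties by name.
                others.sort(key=lambda t: (measure(vectors[text], vectors[t]), t))
                for neighbor in others[:k]:
                    # Each reach adds one layer to the connection between the pair.
                    weights[frozenset((text, neighbor))] += 1
    return weights

# Hypothetical data: two textual features, each giving a small vector per text.
features = {
    "feature_a": {"T1": [1, 0], "T2": [1, 1], "T3": [4, 4], "T4": [5, 4], "T5": [9, 9]},
    "feature_b": {"T1": [2, 2], "T2": [7, 1], "T3": [2, 3], "T4": [6, 1], "T5": [0, 8]},
}

weights = build_consensus(features, [manhattan, euclidean])
for pair, weight in weights.most_common(3):
    print(sorted(pair), weight)
```

Pairs of texts that choose each other under many features and measures accumulate the largest counts, which is what would be drawn as the thickest lines.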