How Degree-of-Separation Search Works |
| The degree-of-separation search works by building a network map displaying the linking structure of a list of Web sites returned in response to a Google query. For example, a search to get the betweenness of “Hillary Clinton” works as follows:
Figure 1. Degree-of-separation Web search for query “Hillary Clinton” Figure 1 illustrates the network map returned to the query “Hillary Clinton”. The level-1 nodes are the ones connected directly to the query, i.e. the original search results. Level-2 nodes are the most highly ranked search results returned by the “link” query, to each of the top ten level-1 nodes. Level-3 nodes are the most highly Google-ranked nodes returned by the “link” queries of each of the level-2 nodes. Figure 1 already gives a visual overview of the betweenness of each of the level-1 and level-2 nodes. The more links a node has pointing to it, the more between it is. For example the node labeled http://clinton.senate.gov is linked by a group of level 2 nodes which themselves are linked by groups of level-3 nodes. This indicates that the node http://clinton.senate.gov will have fairly high betweenness itself. Betweenness ranges from 0, for nodes which are totally peripheral, to 1, for nodes which are on all shortest paths. The most between node in figure 2 is the search query “Hillary Clinton” itself, with a value of 0.61. The second most between node is indeed, as figure 2 illustrates, http://clinton.senate.gov with a betweenness value of 0.36. Some other high-betweenness nodes are www.ovaloffice2008.com and www.hillaryclinton.com. Figure 2. Betweenness of Web Sites indicated by size of square or circle Top Google search results do not necessarily have highest betweenness centrality. Google sorts search results by its patented “Page Rank” algorithm, which looks at what Web pages link back to a particular page. It also weights the links to the page by the page rank of the originating page. In terms of social network analysis Google measures the in-degree of a page, that is the number of incoming links. Page rank is a local algorithm, because it only factors in all the nearest neighbors of the page it is measuring. It includes page-rank of the neighbors, weighting incoming links higher from sites that themselves have a high page rank. Betweeness, on the other hand, is basically a global concept, because it looks at all the shortest paths within the entire network that are going through a particular site. Frequently, a node which has a high page rank will also have high betweenness, but this is not necessarily the case. |