pt.tumba.geoclass
Class ClassNetwork

java.lang.Object
  extended by pt.tumba.geoclass.ClassNetwork

public class ClassNetwork
extends java.lang.Object

A probabilistic graphical model of geographical concepts, which is the basis of the algorithm for classifying web pages according to their geographical scopes. Essentially, ClassNetwork represents a place hierarchy. To propagate scores through the network, this class implements both a Bayesian inference procedure and a variation of the popular PageRank ranking algorithm, as proposed by Rada Mihalcea and Paul Tarau in "TextRank: Bringing Order into Texts".

Author:
Bruno Martins

Constructor Summary
ClassNetwork()
          Constructor for an uninitialized network.
 
Method Summary
 void addEquivalent(java.lang.Integer id1, java.lang.Integer id2)
          Adds an "is-equivalent" relationship between two nodes in the network.
 void addEquivalent(Node node1, Node node2)
          Adds an "is-equivalent" relationship between two nodes in the network.
 Node addNode(java.lang.Integer id, java.lang.String name, java.lang.String type, java.lang.Integer population)
          Adds a new node to the network, constructed from the supplied parameters.
 Node addNode(java.lang.Integer id, java.lang.String name, java.lang.String type, java.lang.Integer population, java.lang.Double value)
          Adds a new node to the network, constructed from the supplied parameters.
 Node addNode(Node node)
          Adds a new node to the network.
 Node addNode(Node node, java.lang.Double value)
          Adds a new node to the network.
 void addPartOf(java.lang.Integer id1, java.lang.Integer id2)
          Adds a "part-of" relationship between two nodes in the network.
 void addPartOf(Node node1, Node node2)
          Adds a "part-of" relationship between two nodes in the network.
 void addRelated(java.lang.Integer id1, java.lang.Integer id2)
          Adds an "is-related" relationship between two nodes in the network.
 void addRelated(Node node1, Node node2)
          Adds an "is-related" relationship between two nodes in the network.
 java.util.List getDirectAncestors(java.lang.Integer id)
          Get the direct ancestors of a given node.
 java.util.List getDirectAncestors(Node node)
          Get the direct ancestors of a given node.
 java.util.List getDirectDescendants(java.lang.Integer id)
          Get the direct descendants of a given node.
 java.util.List getDirectDescendants(Node node)
          Get the direct descendants of a given node.
 java.util.List getDirectRelated(java.lang.Integer id)
          Get the related nodes for a given node.
 java.util.List getDirectRelated(Node node)
          Get the related nodes for a given node.
 java.util.List getEquivalent(Node node)
          Get the equivalent nodes for a given node.
 Node getMaxNode()
          Returns the currently highest scoring node in the network.
 java.lang.Double getMaxNodeScore()
          Returns the score of the currently highest scoring node in the network.
 Node getNode(java.lang.Integer id)
          Returns the node associated with a given ID.
 Node[] getNode(java.lang.String name)
          Returns the node(s) associated with a given name.
 java.lang.Double getValue(java.lang.Integer id)
          Returns the score associated with a given node of the network.
 java.lang.Double getValue(Node node)
          Returns the score associated with a given node of the network.
 java.lang.String prettyPrint(Node netNode)
          Returns a "human-readable" representation of a node from the network.
 void prettyPrint(java.io.PrintStream output)
          Prints a "human-readable" representation of the network.
 void propagateScoresBayeasian()
          Propagate scores across the network using Bayesian inference.
 void propagateScoresTextRank()
          Propagate scores across the network using a variation of the PageRank algorithm, similar to the proposal of Rada Mihalcea and Paul Tarau in "TextRank: Bringing Order into Texts".
 void resetScores()
          Sets the scores for all the nodes in the network to zero.
 java.lang.Double setValue(java.lang.Integer id, java.lang.Double value)
          Updates the score associated with a given node of the network.
 java.lang.Double setValue(NamedEntity ne)
          Updates the score associated with a given node of the network.
 java.lang.Double setValue(NamedEntity ne, java.lang.Double value)
          Updates the score associated with a given node of the network.
 java.lang.Double setValue(Node node, java.lang.Double value)
          Updates the score associated with a given node of the network.
 java.lang.Double setValue(java.lang.String name, java.lang.Double value)
          Updates the score associated with all the nodes sharing a given name.
 java.lang.Double setValue(java.lang.String name, java.lang.String type, java.lang.Double value)
          Updates the score associated with all the nodes sharing a given name and/or a given type.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ClassNetwork

public ClassNetwork()
Constructor for an uninitialized network.

Method Detail

addNode

public Node addNode(Node node)
Adds a new node to the network.

Parameters:
node - The node to add.
Returns:
The node added to the network. If a node with the same ID already existed in the network, the old node is returned.

addNode

public Node addNode(Node node,
                    java.lang.Double value)
Adds a new node to the network.

Parameters:
node - The node to add.
value - The score to associate with the node.
Returns:
The node added to the network. If a node with the same ID already existed in the network, the old node is returned (although the score is updated).

addNode

public Node addNode(java.lang.Integer id,
                    java.lang.String name,
                    java.lang.String type,
                    java.lang.Integer population)
Adds a new node to the network, constructed from the supplied parameters.

Parameters:
id - The ID for this geographical concept at the knowledge base (GKB).
name - The name associated with this geographical concept.
type - The type of this geographical concept.
population - Number of inhabitants associated with this region.
Returns:
The node with the given parameters. If a node with the same ID already existed in the network, the old node is returned.

addNode

public Node addNode(java.lang.Integer id,
                    java.lang.String name,
                    java.lang.String type,
                    java.lang.Integer population,
                    java.lang.Double value)
Adds a new node to the network, constructed from the supplied parameters.

Parameters:
id - The ID for this geographical concept at the knowledge base (GKB).
name - The name associated with this geographical concept.
type - The type of this geographical concept.
population - Number of inhabitants associated with this region.
value - The score to associate with the node.
Returns:
The node added to the network. If a node with the same ID already existed in the network, the old node is returned (although the score is updated).
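
The return contract shared by the addNode overloads (an existing node with the same ID is kept and returned, while a supplied score still replaces the stored one) can be illustrated with a small stand-alone sketch. It does not use the actual ClassNetwork or Node classes: NodeRegistrySketch, its two maps, and the use of a node name as a stand-in for a full Node object are assumptions made purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, not the actual ClassNetwork implementation.
class NodeRegistrySketch {
    // Node names keyed by ID; the name stands in for a full Node object.
    private final Map<Integer, String> nodes = new HashMap<>();
    // Scores keyed by node ID.
    private final Map<Integer, Double> scores = new HashMap<>();

    // Returns the node stored under the ID: the new one, or the old one if the ID was taken.
    String addNode(Integer id, String name, Double value) {
        String existing = nodes.putIfAbsent(id, name);
        scores.put(id, value); // the score is updated in both cases
        return existing != null ? existing : name;
    }

    // Returns the score stored for the ID, or null if the ID is unknown.
    Double getValue(Integer id) {
        return scores.get(id);
    }
}
```

In this sketch, adding a second node under an ID that is already in use leaves the first node in place and only changes its score.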

addPartOf

public void addPartOf(Node node1,
                      Node node2)
Adds a "part-of" relationship between two nodes in the network.

Parameters:
node1 - A Node.
node2 - Another node.

addPartOf

public void addPartOf(java.lang.Integer id1,
                      java.lang.Integer id2)
Adds a "part-of" relationship between two nodes in the network.

Parameters:
id1 - A node ID.
id2 - Another node ID.

addRelated

public void addRelated(Node node1,
                       Node node2)
Adds an "is-related" relationship between two nodes in the network.

Parameters:
node1 - A Node.
node2 - Another node.

addRelated

public void addRelated(java.lang.Integer id1,
                       java.lang.Integer id2)
Adds an "is-related" relationship between two nodes in the network.

Parameters:
id1 - A node ID.
id2 - Another node ID.

addEquivalent

public void addEquivalent(Node node1,
                          Node node2)
Adds an "is-equivalent" relationship between two nodes in the network.

Parameters:
node1 - A Node.
node2 - Another node.

addEquivalent

public void addEquivalent(java.lang.Integer id1,
                          java.lang.Integer id2)
Adds an "is-equivalent" relationship between two nodes in the network.

Parameters:
id1 - A Node ID.
id2 - Another node ID.

getNode

public Node getNode(java.lang.Integer id)
Returns the node associated with a given ID.

Parameters:
id - The id of the Node to return.
Returns:
The Node corresponding to the given ID.

getNode

public Node[] getNode(java.lang.String name)
Returns the node(s) associated with a given name.

Parameters:
name - The name of the Node(s) to return.
Returns:
The Node(s) corresponding to the given name.

getValue

public java.lang.Double getValue(java.lang.Integer id)
Returns the score associated with a given node of the network.

Parameters:
id - A node ID.
Returns:
The score associated with the node.

getValue

public java.lang.Double getValue(Node node)
Returns the score associated with a given node of the network.

Parameters:
node - A node.
Returns:
The score associated with the node.

setValue

public java.lang.Double setValue(NamedEntity ne)
Updates the score associated with a given node of the network. The score is computed from the occurrence frequency of the named entity.

Parameters:
ne - The Named Entity corresponding to the node.
Returns:
The score associated with the node or null if the network does not contain the given node.

setValue

public java.lang.Double setValue(NamedEntity ne,
                                 java.lang.Double value)
Updates the score associated with a given node of the network.

Parameters:
ne - The Named Entity corresponding to the node.
value - The score to associate with the node.
Returns:
The score associated with the node or null if the network does not contain the given node.

setValue

public java.lang.Double setValue(Node node,
                                 java.lang.Double value)
Updates the score associated with a given node of the network.

Parameters:
node - A node.
value - The score to associate with the node.
Returns:
The score associated with the node or null if the network does not contain the given node.

setValue

public java.lang.Double setValue(java.lang.Integer id,
                                 java.lang.Double value)
Updates the score associated with a given node of the network.

Parameters:
id - A node ID.
value - The score to associate with the node.
Returns:
The score associated with the node or null if the network does not contain the given node.

setValue

public java.lang.Double setValue(java.lang.String name,
                                 java.lang.Double value)
Updates the score associated with all the nodes sharing a given name.

Parameters:
name - A node name.
value - The score to associate with the node.
Returns:
The score associated with the nodes.

setValue

public java.lang.Double setValue(java.lang.String name,
                                 java.lang.String type,
                                 java.lang.Double value)
Updates the score associated with all the nodes sharing a given name and/or a given type.

Parameters:
name - A node name (or null if names should be ignored).
type - A node type (or null if types should be ignored).
value - The score to associate with the node.
Returns:
The score associated with the nodes.
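
The matching rule described by the parameters above (a null name or a null type means that criterion is ignored) can be sketched as follows. This is a stand-alone illustration, not the library's code: NameTypeFilterSketch and SketchNode are hypothetical names, exact case-sensitive string comparison is an assumption, and the sketch returns the number of updated nodes rather than a Double, only to make the effect easy to check.

```java
import java.util.List;

// Hypothetical stand-in for the name/type matching rule; not the library's code.
class NameTypeFilterSketch {
    // Minimal node record: only the fields needed for matching and scoring.
    static final class SketchNode {
        final String name;
        final String type;
        double score;

        SketchNode(String name, String type) {
            this.name = name;
            this.type = type;
        }
    }

    // A null criterion is ignored; a non-null criterion must match exactly (assumed).
    static boolean matches(SketchNode n, String name, String type) {
        return (name == null || name.equals(n.name))
            && (type == null || type.equals(n.type));
    }

    // Assigns the score to every matching node and returns how many were updated.
    static int setValue(List<SketchNode> nodes, String name, String type, double value) {
        int updated = 0;
        for (SketchNode n : nodes) {
            if (matches(n, name, type)) {
                n.score = value;
                updated++;
            }
        }
        return updated;
    }
}
```

Passing null for both criteria matches every node in this sketch; whether ClassNetwork behaves the same way in that case is not stated in this documentation.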

getMaxNode

public Node getMaxNode()
Returns the currently highest scoring node in the network.

Returns:
The highest scoring node in the network.

getMaxNodeScore

public java.lang.Double getMaxNodeScore()
Returns the score of the currently highest scoring node in the network.

Returns:
The score of the highest scoring node in the network.

propagateScoresTextRank

public void propagateScoresTextRank()
Propagate scores across the network using a variation of the PageRank algorithm, similar to the proposal of Rada Mihalcea and Paul Tarau in "TextRank: Bringing Order into Texts". Weighting attributes for links with different semantic categories (e.g. partOf, equivalent) are also used, in a similar fashion to the variant of PageRank proposed by Ricardo Baeza-Yates and Emilio Davis in "Web Page Ranking using Link Attributes".
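
For orientation, the weighted update from the TextRank paper, in which each node receives a share of its neighbours' scores proportional to the link weight, can be sketched as below. This is a generic illustration and not the code behind propagateScoresTextRank: the class WeightedRankSketch, the weight matrix layout, the damping factor, and the fixed iteration count are all assumptions, and the actual weights ClassNetwork assigns to each link category are not documented here.

```java
// Hypothetical sketch of weighted PageRank-style propagation; not the library's code.
class WeightedRankSketch {
    // w[j][i] is the non-negative weight of the link from node j to node i (0 means no link).
    static double[] propagate(double[][] w, double[] initial, double damping, int iterations) {
        int n = w.length;
        // Total outgoing weight per node, used to normalise each link.
        double[] outSum = new double[n];
        for (int j = 0; j < n; j++) {
            for (int i = 0; i < n; i++) {
                outSum[j] += w[j][i];
            }
        }
        double[] score = initial.clone();
        for (int it = 0; it < iterations; it++) {
            double[] next = new double[n];
            for (int i = 0; i < n; i++) {
                double sum = 0.0;
                for (int j = 0; j < n; j++) {
                    // A positive weight implies outSum[j] > 0, so the division is safe.
                    if (w[j][i] > 0.0) {
                        sum += w[j][i] / outSum[j] * score[j];
                    }
                }
                next[i] = (1.0 - damping) + damping * sum;
            }
            score = next;
        }
        return score;
    }
}
```

In a setting like this one, a link category such as partOf could be given a larger weight than another category, so that scores flow more strongly along it; the values would have to be chosen by the caller.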


propagateScoresBayeasian

public void propagateScoresBayeasian()
Propagate scores across the network using Bayesian inference.


getDirectDescendants

public java.util.List getDirectDescendants(Node node)
Get the direct descendants of a given node.

Parameters:
node - A node
Returns:
The direct descendants of the given node.

getDirectDescendants

public java.util.List getDirectDescendants(java.lang.Integer id)
Get the direct descendants of a given node.

Parameters:
id - A node ID
Returns:
The direct descendants of the given node.

getDirectAncestors

public java.util.List getDirectAncestors(Node node)
Get the direct ancestors of a given node.

Parameters:
node - A node
Returns:
The direct ancestors of the given node.

getDirectAncestors

public java.util.List getDirectAncestors(java.lang.Integer id)
Get the direct ancestors of a given node.

Parameters:
id - A node ID
Returns:
The direct ancestors of the given node.
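
The difference between direct ancestors and direct descendants can be pictured with a stand-alone sketch of a part-of hierarchy. It is not the ClassNetwork implementation: PartOfSketch is a hypothetical class, plain integer IDs stand in for nodes, and the argument order (the first ID is the part, the second is the containing region) is an assumption, since this documentation does not say which argument of addPartOf is the container.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a part-of hierarchy; not the library's code.
class PartOfSketch {
    // For each node ID, the IDs of the regions it is directly part of.
    private final Map<Integer, List<Integer>> parents = new HashMap<>();
    // For each node ID, the IDs of the regions that are directly part of it.
    private final Map<Integer, List<Integer>> children = new HashMap<>();

    // Assumption: the first ID is the part and the second is the containing region.
    void addPartOf(Integer part, Integer whole) {
        parents.computeIfAbsent(part, k -> new ArrayList<Integer>()).add(whole);
        children.computeIfAbsent(whole, k -> new ArrayList<Integer>()).add(part);
    }

    // One step up the hierarchy only; ancestors of ancestors are not included.
    List<Integer> getDirectAncestors(Integer id) {
        List<Integer> result = parents.get(id);
        return result != null ? result : new ArrayList<Integer>();
    }

    // One step down the hierarchy only.
    List<Integer> getDirectDescendants(Integer id) {
        List<Integer> result = children.get(id);
        return result != null ? result : new ArrayList<Integer>();
    }
}
```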

getDirectRelated

public java.util.List getDirectRelated(Node node)
Get the related nodes for a given node. The "is-related" association is bi-directional, so that if A "is-related-to" B, then B "is-related-to" A.

Parameters:
node - A node
Returns:
The related nodes for a given node.

getDirectRelated

public java.util.List getDirectRelated(java.lang.Integer id)
Get the related nodes for a given node. The "is-related" association is bi-directional, so that if A "is-related-to" B, then B "is-related-to" A.

Parameters:
id - A node ID
Returns:
The related nodes for a given node.

getEquivalent

public java.util.List getEquivalent(Node node)
Get the equivalent nodes for a given node. The "is-equivalent" association is bi-directional and transitive, so that if A "is-equivalent-to" B, then B "is-equivalent-to" A, and if A "is-equivalent-to" B and B "is-equivalent-to" C, then A "is-equivalent-to" C and C "is-equivalent-to" A.

Parameters:
node - A node
Returns:
The equivalent nodes for a given node.
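
One way to obtain an association with the two properties described above is to store each link in both directions and to follow links breadth-first when answering a query. The sketch below illustrates that idea only. EquivalenceSketch is a hypothetical class working on plain integer IDs, and excluding the queried node from its own result is a choice made for the sketch, not documented behaviour of getEquivalent.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the equivalence relation; not the library's code.
class EquivalenceSketch {
    private final Map<Integer, Set<Integer>> links = new HashMap<>();

    // Stores the link in both directions, so the relation holds both ways.
    void addEquivalent(Integer a, Integer b) {
        links.computeIfAbsent(a, k -> new LinkedHashSet<Integer>()).add(b);
        links.computeIfAbsent(b, k -> new LinkedHashSet<Integer>()).add(a);
    }

    // Follows links breadth-first, which makes the relation transitive.
    Set<Integer> getEquivalent(Integer id) {
        Set<Integer> seen = new LinkedHashSet<>();
        Deque<Integer> queue = new ArrayDeque<>();
        seen.add(id);
        queue.add(id);
        while (!queue.isEmpty()) {
            Integer current = queue.remove();
            Set<Integer> neighbours = links.get(current);
            if (neighbours == null) {
                continue;
            }
            for (Integer next : neighbours) {
                if (seen.add(next)) {
                    queue.add(next);
                }
            }
        }
        seen.remove(id); // the queried node is not listed as its own equivalent here
        return seen;
    }
}
```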

resetScores

public void resetScores()
Sets the scores for all the nodes in the network to zero.


prettyPrint

public java.lang.String prettyPrint(Node netNode)
Returns a "human-readable" representation of a node from the network.

Parameters:
netNode - The node in the network.
Returns:
A String with the "human-readable" representation of the node.

prettyPrint

public void prettyPrint(java.io.PrintStream output)
Prints a "human-readable" representation of the network.

Parameters:
output - The PrintStream to which the "human-readable" representation of the network is printed.