[Reader-list] [Second posting] Implementation of a P2P news distribution network

Soumava Das soumava at vsnl.com
Mon Mar 29 00:06:12 IST 2004


Hello everybody,
	This is my second posting about the Sarai/CSDS sponsored
project -- Implementation of a P2P news distribution network.
	Though a lot of work is being done in the area of p2p
computing, most of the projects are concerned with file sharing or
implementing a distributed file system.
	The main challenges in implementing a p2p news distribution
network are content searching and trust management. In this posting I
have presented a survey of the search methodologies currently
implemented in p2p networks.
	The posting is structured as follows: first I have discussed what
p2p is and why we are using it. Then I have presented a survey of
the most popular search techniques and a critique of the same. Finally,
I have tried to outline a probable area of further study.
	The use of plain text for the posting has led to some formatting
problems. If you want a PDF/PS version of this posting, please mail me.

Regards,
Soumava Das
-----------------------------------

1 Peer-to-peer Systems
------------------------------------

1.1 What is peer-to-peer?

The term “peer-to-peer” is not of recent coinage. IBM’s Systems
Network Architecture documents on LU6.2 Transactions used the term
peer-to-peer computing over 25 years ago. Even the original ARPANET
was developed as a peer-to-peer system. However, the term gained
prominence and publicity with the rise and fall of Napster [30].

Peer-to-peer systems are distributed systems in which the resources of
a large number of autonomous participants or ‘peers’ are pooled
together to carry out some function. Peer-to-peer systems function by
forming overlay networks over existing networks like the Internet. These
overlay networks are self-organising and usually have no centralised
control.

Peer-to-peer systems are usually designed to support millions of peers
or users in an environment characterised by heterogeneous desktop
systems and low-bandwidth, intermittent connections to the Internet
[13]. However, if the resources of these comparatively less-powerful
PCs can be harnessed effectively, the amount of processing power or
storage that can be availed of is simply mind-boggling. Assuming only
100 million PCs among the net’s 300 million users, and only a
100 MHz chip and 100 MB drive on the average PC, which by any
measure is a conservative estimate, these PCs together possess ten
billion MHz of processing power and ten thousand terabytes of storage
[40]. As an example, Kazaa [26] alone, as of 24 March 2004, is reported
to have shared 5,175,808 GB of data [27].

According to Clay Shirky [40] “P2P is a class of applications that takes
advantage of resources – storage, cycles, content, human presence –
available at the edges of the Internet. Because accessing these
decentralized resources means operating in an environment of unstable
connectivity and unpredictable IP addresses, P2P nodes must operate
outside the DNS system and have significant or total autonomy
from central servers.”

According to Schollmeier [38] “A distributed network architecture may
be called a Peer-to-Peer (P-to-P, P2P, …) network, if the participants
share a part of their own hardware resources (processing power,
storage capacity, network link capacity, printers, …). These shared
resources are necessary to provide the Service and content offered by
the network (e.g. file sharing or shared workspaces for collaboration).
They are accessible by other peers directly, without passing
intermediary entities. The participants of such a network are thus
resource (Service and content) providers as well as resource (Service
and content) requesters (Servent-concept).”

Aberer and Hauswirth [2] characterise peer-to-peer as:
• peer acts as both a client and a server (‘servent’)
• peer ‘pays’ its participation by providing access to (some of) its
resources
• no central coordination
• no central database
• no peer has a global view of the system
• global behaviour emerges from local interactions
• all existing data and services are accessible from any peer
• peers are autonomous
• peers and connections are unreliable

Servent [38] is an artificial word derived from the terms server (‘Serv-’)
and client (‘-ent’).


1.2 Peer-to-peer System Models

Peer-to-peer systems can be categorised as
• Centralised or hybrid systems: The most famous example is Napster
[30]. The participating entities form a peer-to-peer network but depend
on some central server for certain vital services. The central server
constitutes a single point of failure, both technical and legal.

• Decentralised or pure systems: Here all entities participating in the
network cooperate to achieve the desired function without the need for
any centralised service. The failure of any peer does not lead to failure
of the network as a whole. Thus such systems are resilient to system
failures and are censorship-resistant to various degrees. However, they
may experience problems in the areas of resource location, searching,
network load and scalability [42]. Despite these problems almost all
existing peer-to-peer applications follow this model. Examples include
Freenet [20] and Gnutella [21].

Decentralised architectures can be further divided into two groups based
on the search technique employed: unstructured networks, which use
blind search (i.e. the search is independent of the query or its context),
and structured networks, where the search is routed.

• Hierarchical systems: The distinguishing feature of this type of
system is the presence of superpeers: peers that act as
representatives of a group of peers. Unlike the hybrid model, this model
is fault tolerant since any peer can become a superpeer as and
when the need arises. The hierarchy allows easy scaling and keeps the
network traffic under control.

This model is the most promising one but various problems need to be
overcome before it can realise its potential. The only example seems to
be FastTrack [18], but the Gnutella architecture is evolving in this
direction through the use of Ultrapeers [41].


1.3 Why is the Peer-to-peer Architecture Most Suitable?

Two non client-server paradigms that have been used to distribute
news are Push Systems [24] and Event-based or Publish/Subscribe
network [17]. However, both are subscription-based systems and are
directed more towards distribution of content on a pay-per-use basis.
Unlike peer-to-peer systems, prior subscription is required to access
news or other content. Besides, like client-server systems they are
vulnerable to censorship and denial of service attacks, though the use of
multiple broadcasters in Push systems can mitigate the risk to some
extent. These systems do not allow active discovery of content, which
should be a characteristic of a generalised news distribution system.
There is also an attempt to develop a middleware architecture based on
the peer-to-peer paradigm [34].

The peer-to-peer computing model is advantageous to both individuals
and corporations [3]. The main technical advantage is the ability to
make use of the underutilised resources of the participating peers. These
resources include processing power for large-scale computations and
enormous storage potential. Another major technical advantage is that
a peer-to-peer system is a truly distributed system and can distribute
data, control, network [33] and processing [39] load among the peers.
This results in better performance and the elimination of a single point of
failure.

The social and psychological factors driving peer-to-peer adoption are
anonymity, autonomy, empowerment, censorship resistance,
collaboration and participation. In the corporate sector, the
economic factor is a major driving force. The ability to utilise the not-
insignificant processing and storage capacities of the workstations
and PCs is conjuring visions of replacing costly servers and data
centers with peer-to-peer applications, resulting in cost reduction.
Collaboration between spatially distributed team members is another
application in both corporate and community development
environments.

The peer-to-peer system thus provides the most suitable paradigm for
developing the news distribution network, especially taking into account
the social requirements of such a network.


2 Content Search in Peer-to-peer Networks
------------------------------------------------------------

“Discovering information is the predominant problem.” [2] This is more
so in a news distribution network. The ability to search for news items
based on keywords is of prime importance in such networks.
Unfortunately, content search mechanisms in peer-to-peer networks are
not well developed. Content searching is in itself a very complicated
task. Even in centralised setups like web search engines, the reach and
accuracy of content or keyword-based search are not satisfactory [8].
Most of the current peer-to-peer networks like Freenet or Gnutella
concentrate on file sharing. Thus the search in these networks is usually
related to finding the location of files given the file name. This is more
of a resource location problem than a pure search problem. In
content-based search the problem is not only to locate the item but first to
decide which item is most relevant to the user.


2.1 Desired Features of Search Techniques

Search techniques should be simple and practical so as to ensure wide
acceptability. These should also be adaptive since the peer-to-peer
network itself is highly adaptive. In particular, the search techniques
should be able to perform in an unreliable network [4]. Two particularly
desirable search features are scope (ability to find infrequent items) and
support for partial-match queries (queries that contain typos or include
a subset of keywords) [11]. While centralized-index architectures (such
as Napster) can support both these features, existing decentralized
architectures seem to support at most one: prevailing protocols (such
as Gnutella and FastTrack) support partial-match queries, but since
the set of peers searched is unrelated to the query, they have limited scope.


2.2 Measuring Retrieval Effectiveness

Two of the most widely used measures of document retrieval
effectiveness are Recall and Precision [5]. Recall measures how well a
system retrieves all the relevant documents and can be interpreted as
the probability that a relevant document will be retrieved. Precision
measures how well the system retrieves only the relevant documents
and can be interpreted as the probability that a retrieved document will
be relevant.

Recall = Number of Relevant and Retrieved Documents / Total Number
of Relevant Documents

Precision = Number of Relevant and Retrieved Documents / Total
Number of Retrieved Documents
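
As a rough illustration, here is a small Python fragment computing both
measures for a hypothetical result set; the document counts are made up
purely for the example.

# Hypothetical collection: documents 0..99 are relevant to the query.
relevant = set(range(100))
# The system retrieves 60 documents: 40 relevant ones and 20 irrelevant ones.
retrieved = set(range(40)) | set(range(100, 120))

hits = len(relevant & retrieved)        # relevant AND retrieved = 40
recall = hits / len(relevant)           # 40 / 100 = 0.40
precision = hits / len(retrieved)       # 40 / 60 = 0.67 (approx.)
print(recall, precision)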


2.3 Collection Selection

Collection selection is a major challenge in any information retrieval
system. Page ranking deals with ranking the documents present in a
collection based on their relevance to the query criterion. One of the most
successful techniques is the TFxIDF algorithm suggested by Salton et
al. [37]. However, the algorithm cannot be used in a distributed
environment. A modification of this algorithm for peer-to-peer
networks is proposed by Cuenca-Acuna and Nguyen [14]. Another
interesting approach is the use of inference networks [7].
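
As a rough illustration, the Python sketch below shows the basic TFxIDF
weighting idea only; it is neither the exact formulation of Salton et al. [37]
nor its distributed adaptation in PlanetP [14]. A term contributes more to a
document's score when it is frequent in that document and rare across the
collection.

import math
from collections import Counter

def tfidf_scores(query_terms, documents):
    # documents: list of token lists, e.g. [["peer", "news"], ["cricket"], ...]
    n_docs = len(documents)
    # document frequency: in how many documents each term appears
    df = Counter(term for doc in documents for term in set(doc))
    scores = []
    for doc in documents:
        tf = Counter(doc)          # term frequency within this document
        scores.append(sum(tf[t] * math.log(n_docs / df[t])
                          for t in query_terms if df[t]))
    return scores

docs = [["peer", "news", "network"], ["news", "today"], ["cricket", "scores"]]
print(tfidf_scores(["news", "network"], docs))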


2.4 Search Techniques

2.4.1 Centralised Database

In hybrid peer-to-peer systems the meta-data is stored in a centralised
database server (or servers) while the data is stored in the peers. The
peers send their queries to the server, which responds with a list of
matches along with the list of peers storing each matching file.

The most famous example of this technique is Napster. The Napster
client sends a search request containing the “artist name” or “song”,
maximum number of results to return, link parameters and file
parameters that must be satisfied. The server returns a list of results
containing the filename, file parameters, IP address and ‘nick’ of the
peer sharing that file. It also sends a weight denoting how good the
reply is [15].
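
The sketch below illustrates the centralised-index technique in Python; it is
not the actual Napster protocol [15], and the keyword tokenisation is a
simplifying assumption. The server keeps a map from keywords to the peers
sharing matching files and answers every query from that map alone.

from collections import defaultdict

index = defaultdict(list)        # keyword -> [(filename, peer address), ...]

def publish(peer, filename):
    # called when a peer connects and announces a shared file
    for word in filename.lower().replace(".", " ").split():
        index[word].append((filename, peer))

def search(keyword, max_results=10):
    # the server answers from its own index; the download itself is peer-to-peer
    return index[keyword.lower()][:max_results]

publish("203.0.113.5:6699", "Artist - Song.mp3")
print(search("song"))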

The quality of search tends to be very high, but the central server
introduces a single point of failure, especially from a legal perspective.
The technique is resource intensive due to the need to maintain server
farms. Scalability is another area of concern. The future of this method
of searching appears doomed.


2.4.2 Flooding

Almost all unstructured pure peer-to-peer systems deploy this
technique. The search is blind and the technique itself is very simple.
The peer initiating the search sends the query message to
all the peers it is directly connected to. If the peer receiving the query
can satisfy it, it sends back a reply. Otherwise, it simply
forwards the query to the peers it is directly connected to. There is usually
some way to restrict the flooding to prevent the network from being
overloaded.

Gnutella uses this technique. A Gnutella servent receiving a search
query first tries to match the search criteria against the shared files
stored locally. If a match is found, a ‘Query Hit’ message is sent back
along the path the ‘Query’ message arrived on. Otherwise, it forwards the
‘Query’ message on all its links except the incoming one. A ‘Query’
message contains a Message ID to prevent the possibility of forwarding a
message twice. The messages also contain a TTL field to restrict the
flooding. The TTL field is decremented when the message is forwarded,
and the message is discarded when the TTL reaches zero [28].

This technique is fault tolerant but not very scalable because of the
network traffic generated by flooding [36, 42]. Even now, a
user connecting to the Internet over a modem does not have the
bandwidth to participate even in the query/reply process [42].
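
A minimal Python sketch of TTL-limited flooding of the kind described
above; the peer and link objects (with their 'seen' set, 'matches', 'links',
'send_back' and 'forward' members) are assumptions made for illustration
and do not correspond to any particular Gnutella implementation.

def handle_query(peer, query, msg_id, ttl, from_link):
    if msg_id in peer.seen:          # already seen: drop to avoid loops
        return
    peer.seen.add(msg_id)
    if peer.matches(query):          # local hit: reply along the reverse path
        from_link.send_back(("QUERY_HIT", msg_id, peer.address))
    if ttl > 1:                      # keep flooding, except backwards
        for link in peer.links:
            if link is not from_link:
                link.forward(("QUERY", msg_id, query, ttl - 1))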


2.4.3 Indexing

Peer-to-peer networks implementing this technique do not need to
forward query requests. Each peer stores an inverted (word-to-document)
index of the contents published by the other peers, so it can simply search
the local index to find the peer storing the desired item. This technique
has a major drawback: each peer needs to store the entire index of
all other peers. Also, the indexes need to be updated frequently, which
can lead to high network load.

PlanetP [14] uses a modified version of this technique together with a
version of TFxIDF [37]. Here each peer creates a summary of its
index using a Bloom filter [6]. These summaries are diffused to the other
peers and stored at each peer site. Searching is thus a two-step process.
First the peer searches its collection of summarised indexes. It then
queries the peers whose summarised indexes gave a positive result. If the
number of such peers is large, it chooses a subset of them based on
inverse peer frequency. Bloom filter summaries never give false negatives
(though they might give false positives [6]), so if a document is present
in the network, it will be found.
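
A simplified Python sketch of the Bloom filter summary: the filter size, the
number of hash functions and the use of SHA-1 are illustrative assumptions
and not the parameters used by PlanetP [14].

import hashlib

class BloomSummary:
    # A simplified Bloom filter over the words in a peer's local index.
    def __init__(self, size=8192, hashes=3):
        self.size, self.hashes = size, hashes
        self.bits = bytearray(size)

    def _positions(self, word):
        for i in range(self.hashes):
            digest = hashlib.sha1(f"{i}:{word}".encode()).digest()
            yield int.from_bytes(digest[:4], "big") % self.size

    def add(self, word):
        for pos in self._positions(word):
            self.bits[pos] = 1

    def might_contain(self, word):
        # never a false negative; false positives are possible
        return all(self.bits[pos] for pos in self._positions(word))

summary = BloomSummary()
for word in ("election", "cricket", "budget"):
    summary.add(word)
print(summary.might_contain("cricket"), summary.might_contain("opera"))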

The need to keep the summarised indexes frequently updated results in
a trade-off between network load and reliable search. Another trade-off
that must be dealt with is between storage space at each node and
accuracy of the search. Both the trade-offs become crucial as the
number of peers increases. Furthermore, the effectiveness of this
technique has not been proven in a peer-to-peer network where peers
connect intermittently.

All in all, this technique seems very promising, since it allows efficient
search in unstructured pure peer-to-peer networks, and we wish to
study it further.


2.4.4 Distributed Hash Tables

The main objective of a DHT is to provide the efficiency of hash tables on
an Internet scale, i.e. given a ‘key’ we should directly get the ‘value’.
Usually, this means that given an item to be searched for, we can directly
determine which node is storing that item. The main principle is as
follows: each peer is given a unique ID. Each item to be stored is also
assigned a unique ID (usually by hashing the item or some part of it).
The item is then stored at the node whose ID is closest to the item’s ID.
This closeness is determined either by treating the IDs as vectors in a
vector space and taking their cosine, or simply by comparing them
lexicographically or numerically.
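
A minimal Python sketch of this principle, using numeric closeness of
hashed IDs; this is a deliberate simplification, since real systems such as
CAN or Chord use more structured ID spaces and routing.

import hashlib

def make_id(name, bits=32):
    # hash a peer address or an item key into a numeric identifier
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** bits)

def responsible_node(item_key, node_ids):
    # the item is stored at the node whose ID is numerically closest to its own
    item_id = make_id(item_key)
    return min(node_ids, key=lambda node: abs(node - item_id))

nodes = [make_id(addr) for addr in ("peer-a", "peer-b", "peer-c")]
print(responsible_node("news/2004-03-29/article-17", nodes))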

Many systems like CAN [35], Chord [43] and Pasta [29] use such a
mechanism. For example, CAN uses a d-dimensional Cartesian
coordinate space on a d-torus. It partitions the entire coordinate space
among all the nodes present in the system. To store a (key, value) pair,
the key is mapped onto a point P in the coordinate space using a
uniform hash function. The pair is then stored at the node (peer)
‘owning’ the subspace in which P lies. To retrieve an item, one simply finds
its key and maps it onto the point P in the coordinate space. The peer
‘owning’ that part of the space holds the desired item, and the retrieval
request is routed to that peer over the CAN infrastructure.

DHTs have many drawbacks, including the fact that nodes need to
store data or indexes for the common good even if they are not interested
in the data. DHT networks are more suited for distributed file systems
like Pasta. They offer very efficient exact-match lookup: i.e. finding a file
given the exact file-name. However, they are not good for keyword
lookup. Implementing keyword lookup is possible but a non-trivial task
[9]. Though a lot of interest has been generated in DHT, we feel that
this is not an attractive option for implementing the news distribution
network.

A technique similar to DHT was adapted by P-Grid [1] where they used
a distributed search tree instead.


2.4.5 Key based

This technique is quite similar to DHT with one important difference –
peers need not have identifiers and the overlay structure is self-
organising and not static. Freenet stores items in such a way that items
having lexicographically similar keys tend to cluster together.

The main drawback is that there is no mechanism for content searching.
One has to know the key to retrieve an item. As such this mechanism is
more suited for data retrieval [10] than search.

2.4.6 Combination of two or more of the above techniques

In many cases a combination of two or more of the above techniques
is used; for example, the SuperPeer architecture of Kazaa/FastTrack
or the UltraPeer-based architecture in Gnutella.
In UltraPeer-based Gnutella, the network uses a combination of
indexing and flooding. The ultrapeers maintain an index of the items
shared by their leaf nodes using either Clip2’s Reflector or LimeWire’s
Query Routing Protocol. However, among themselves the ultrapeers
usually use flooding for searching.


2.4.7 Semantic Groups

The peers are grouped together based on the semantics of the material
stored by them. For example, all peers storing material related to peer-
to-peer networks will form a semantic group. Note that a peer might be
a member of more than one group. The two main methods of forming a
semantic group are using an overlay network [11, 12] and using a superpeer
architecture [31]. Search queries are classified using the same criterion
as that used to form the semantic groups and are then routed to the
groups which have a high probability of containing matching items.
Groups unrelated to the query topic are not forwarded the query. The
classification criteria may be fixed or automatically extracted [11].

In Semantic Overlay Networks (SONs) [12] a hierarchical classification is
used and the classification of the SONs is predetermined. Nodes
join the appropriate SON or SONs based on the classification of the
items shared by them. Queries are classified by hand and routed to the
appropriate SONs.

The result quality in networks using such semantic groups is expected
to be very high. The network load is lower since there is no flooding of
queries. However, proper classification of nodes and queries is a non-
trivial task. The number of groups and their semantic criteria need to
be extracted automatically; otherwise, the system will not scale and
there might be ambiguity among related groups. User-dependent
classification is not reliable and is highly subjective.
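
A toy Python sketch of such routing: the query is classified into topics and
forwarded only to peers belonging to the corresponding groups. The group
table and the classification function are assumptions made for illustration;
automatic classification is the hard part in practice.

group_members = {
    "p2p-networking": {"peer1", "peer4"},
    "politics":       {"peer2", "peer3"},
}

def route_query(query, classify):
    # classify() maps a query to one or more topic labels
    targets = set()
    for topic in classify(query):
        targets |= group_members.get(topic, set())
    return targets          # the query is forwarded only to these peers

print(route_query("gnutella ultrapeer scaling",
                  lambda q: ["p2p-networking"]))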


3 Conclusion
-------------------------

Peer-to-peer networking is a very active field of research and the depth
and breadth of this field is immense. It is not our intention to discuss all
the different search methods or the different implementations available.
It might not even be possible, given that new methods and
implementations are being developed continuously. Fortunately, most
of them are clustered around a few major paradigms. We have tried to
understand these major paradigms. Our survey was guided by the
special needs of the news distribution network we intend to implement
and we have not mentioned techniques which we felt are not related
to the work we intend to do.

We think that the three most promising areas of research are:
1. hierarchical peer-to-peer network
2. semantic grouping of peers
3. searching using indices.
We wish to further investigate the possibility of using a combination of
superpeers, semantic grouping and indices. Having a dual overlay
structure, one based on network topology and another based on
semantic grouping, might be an interesting area of research. Each peer
will connect to a superpeer and also be a member of zero or more
semantic groups. The superpeer will maintain not only an index of the
items or documents shared by its leaf nodes but also a list of
the semantic groups its leaf nodes subscribe to. On receiving a query it
will first try to find a match among the items shared by its leaf nodes. If
this is unsuccessful, it will try to find whether any of its leaf nodes
subscribe to a semantic group related to that query. If one or more
matches are found, the query will be forwarded to those nodes. As a
last resort it will forward the query to the other superpeers.
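
A rough Python sketch of this routing cascade at a superpeer; the data
structures and the classification function are assumptions made purely for
illustration.

def superpeer_route(query, local_index, leaf_groups, other_superpeers, classify):
    # local_index:      query term -> set of leaf nodes sharing a matching item
    # leaf_groups:      leaf node  -> set of semantic groups it subscribes to
    # classify:         assumed function mapping a query to semantic groups
    # other_superpeers: addresses of neighbouring superpeers
    hits = local_index.get(query, set())
    if hits:                                   # 1. match among the leaf nodes' items
        return "answer", hits
    topics = set(classify(query))
    subscribed = {leaf for leaf, groups in leaf_groups.items() if groups & topics}
    if subscribed:                             # 2. leaves in related semantic groups
        return "forward_to_leaves", subscribed
    return "forward_to_superpeers", set(other_superpeers)    # 3. last resort

print(superpeer_route("cricket", {}, {"leafA": {"sports"}}, ["sp2"],
                      lambda q: ["sports"]))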

We also wish to study the possibility of improving the query routing
using results of previous queries. For example, the best way to judge
the quality of a reply is by determining whether the user tried to
download that file.

We are interested in the possibility of allowing a peer to become an
‘expert’ on certain topics. This will allow other peers to first contact this
‘expert’ peer to determine the nodes to forward the query to. This can
have major implications for query routing. We hope to conduct extensive
simulation studies in these directions.


References
-------------------
[1] Karl Aberer, Philippe Cudré-Mauroux, Anwitaman Datta, Zoran
Despotovic, Manfred Hauswirth, Magdalena Punceva, and Roman
Schmidt. P-Grid: a self-organizing structured P2P system. SIGMOD Rec.,
32(3):29–33, 2003. ISSN 0163-5808.

[2] Karl Aberer and Manfred Hauswirth. Peer-to-peer information
systems: concepts and models, state-of-the-art, and future systems. In
Proceedings of the 8th European software engineering conference held
jointly with 9th ACM SIGSOFT international symposium on Foundations
of software engineering, pages 326–327. ACM Press, 2001. ISBN 1-
58113-390-1.

[3] David Barkai. An introduction to peer-to-peer computing. Intel
Developer UPDATE Magazine, February 2000.

[4] Mayank Bawa, Brian F. Cooper, Arturo Crespo, Neil Daswani,
Prasanna Ganesan, Hector Garcia-Molina, Sepandar Kamvar, Sergio
Marti, Mario Schlosser, Qi Sun, Patrick Vinograd, and Beverly Yang.
Peer-to-peer research at Stanford. SIGMOD Rec., 32(3):23–28, 2003.
ISSN 0163-5808.

[5] David C. Blair and M. E. Maron. An evaluation of retrieval
effectiveness for a full-text document-retrieval system. Commun. ACM,
28(3):289–299, 1985. ISSN 0001-0782.

[6] Burton H. Bloom. Space/time trade-offs in hash coding with
allowable errors. Commun. ACM, 13(7):422–426, 1970. ISSN 0001-
0782.

[7] James P. Callan, Zhihong Lu, and W. Bruce Croft. Searching
distributed collections with inference networks. In Proceedings of the
18th annual international ACM SIGIR conference on Research and
development in information retrieval, pages 21–28. ACM Press, 1995.
ISBN 0-89791-714-6.

[8] Ben Charny. The world wide $#%$ing web! ZDNet News, December
23, 2000.

[9] Yatin Chawathe, Sylvia Ratnasamy, Lee Breslau, Nick Lanham, and
Scott Shenker. Making Gnutella-like p2p systems scalable. In
Proceedings of the 2003 conference on Applications, technologies,
architectures, and protocols for computer communications, pages
407–418. ACM Press, 2003. ISBN 1-58113-735-4.

[10] Ian Clarke, Oskar Sandberg, Brandon Wiley, and Theodore W. Hong.
Freenet: A distributed anonymous information storage and retrieval
system. Lecture Notes in Computer Science, 2009 / 2001, 2001.

[11] Edith Cohen, Amos Fiat, and Haim Kaplan. A case for associative
peer to peer overlays. SIGCOMM Comput. Commun. Rev.,
33(1):95–100, 2003. ISSN 0146-4833.

[12] Arturo Crespo and Hector Garcia-Molina. Semantic overlay
networks for p2p systems. Technical report, Stanford University,
January 2003.

[13] Jon Crowcroft, Tim Moreton, Ian Pratt, and Andrew Twigg. Peer-to-
peer systems and the grid. University of Cambridge Computer
Laboratory, JJ Thomson Avenue, Cambridge, UK.

[14] Francisco Matias Cuenca-Acuna and Thu D. Nguyen. Text-based
content search and retrieval in ad-hoc p2p communities. In Revised
Papers from the NETWORKING 2002 Workshops on Web Engineering
and Peer-to-Peer Computing, pages 220–234. Springer-Verlag, 2002.
ISBN 3-540-44177-8.

[15] drscholl at users.sourceforge.net. Napster messages, April 7 2000.
URL http://opennap.sourceforge.net/napster.txt.

[16] Electronic frontiers foundation home page. URL
http://www.eff.org/Censorship/.

[17] Patrick Th. Eugster, Pascal A. Felber, Rachid Guerraoui, and
Anne-Marie Kermarrec. The many faces of publish/subscribe. ACM
Comput. Surv., 35 (2):114–131, 2003. ISSN 0360-0300.

[18] The Fasttrack web site. URL http://www.fasttrack.nu.

[19] Gary William Flake, Steve Lawrence, and C. Lee Giles. Efficient
identification of web communities. In Proceedings of the sixth ACM
SIGKDD international conference on Knowledge discovery and data
mining, pages 150–160. ACM Press, 2000. ISBN 1-58113-233-6.

[20] The Free Network Project. URL http://freenet.sourceforge.net/.

[21] The Gnutella web site. URL http://www.gnutella.com.

[22] Google information for webmasters. URL
http://www.google.com/webmasters/2.html.

[23] The Google search page. URL http://www.google.com.

[24] Manfred Hauswirth and Mehdi Jazayeri. A component and
communication model for push systems. In Proceedings of the 7th
European software engineering conference held jointly with the 7th ACM
SIGSOFT international symposium on Foundations of software
engineering, pages 20–38. Springer-Verlag, 1999. ISBN 3-540-66538-
2.

[25] Rosemary K Horton. Internet censorship. URL
http://library.trinity.wa.edu.au/subjects/te/it/censorship.htm.

[26] The Kazaa web-site. URL http://www.kazaa.com.

[27] Kazaa usage stats. URL http://tools.waglo.com/kazaa.

[28] Patrick Kirk. RFC-Gnutella, 2003. URL http://rfc-gnutella.sf.net

[29] Tim D. Moreton, Ian A. Pratt, and Timothy L. Harris. Storage,
mutability and naming in pasta. In Revised Papers from the
NETWORKING 2002 Workshops on Web Engineering and Peer-to-
Peer Computing, pages 215– 219. Springer-Verlag, 2002. ISBN 3-540-
44177-8.

[30] The Napster home page. URL http://www.napster.com.

[31] Wolfgang Nejdl, Martin Wolpers, Wolf Siberski, Christoph Schmitz,
Mario Schlosser, Ingo Brunkhorst, and Alexander Löser. Super-
peer-based routing and clustering strategies for rdf-based peer-to-peer
networks. In Proceedings of the twelfth international conference on
World Wide Web, pages 536–543. ACM Press, 2003. ISBN 1-58113-
680-3.

[32] Times News Network. Yahoo website blocked. Times of India,
September 22, 2003.

[33] The PeerCast home page. URL http://www.peercast.org/.

[34] Peter R. Pietzuch and Jean Bacon. Peer-to-peer overlay broker
networks in an event-based middleware. In Proceedings of the 2nd
international workshop on Distributed event-based systems, pages 1–8.
ACM Press, 2003. ISBN 1-58113-843-1.

[35] Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, and
Scott Schenker. A scalable content-addressable network. In
Proceedings of the 2001 conference on Applications, technologies,
architectures, and protocols for computer communications, pages
161–172. ACM Press, 2001. ISBN 1-58113-411-8.

[36] Jordan Ritter. Why Gnutella can’t scale. No, really. February 2001.

[37] G. Salton, A. Wong, and C. S. Yang. A vector space model for
automatic indexing. Commun. ACM, 18(11):613–620, 1975. ISSN 0001-
0782.

[38] Rüdiger Schollmeier. A definition of peer-to-peer networking for
the classification of peer-to-peer architectures and applications. In
Proceedings of the First International Conference on Peer-to-Peer
Computing (P2P’01), page 101. IEEE Computer Society, 2001. ISBN 0-
7695-1503-7.

[39] The SETI@home project. URL http://setiathome.ssl.berkeley.edu/.

[40] Clay Shirky. What is p2p... and what isn’t? O’Reilly Network,
November 24 2000.

[41] Anurag Singla and Christopher Rohrs. Ultrapeers: Another step
towards gnutella scalability. Gnutella Developer Forum, November
26 2002. URL
http://groups.yahoo.com/group/the_gdf/files/Proposals/Ultrapeer/Ultrapeers.html.

[42] Kunwadee Sripanidkulchai. The popularity of gnutella queries and
its implications on scalability. Technical report, Carnegie Mellon
University, 2004.

[43] Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, and
Hari Balakrishnan. Chord: A scalable peer-to-peer lookup service for
internet applications. In Proceedings of the 2001 conference on
Applications, technologies, architectures, and protocols for computer
communications, pages 149–160. ACM Press, 2001. ISBN 1-58113-
411-8.

[44] Stop Censoring Us – Internet censorship in Iran home page. URL
http://stop.censoring.us/.

[45] U.S. Department of Homeland Security – Federal Computer
Incident Response Center home page. URL
http://www.fedcirc.gov/incidentAnalysis/incidentStatistics.html.



