p2p.tex 5.3 KB

12345678910111213141516171819202122232425262728
  1. The distinctive feature of \ac{P2P} systems is that each participant has the role of both a server and a client. The participants are therefore equal and provide each other with services, what is reflected in the naming. \ac{P2P} networks are usually characterized as overlay networks over the Internet. Concerning the structure of the overlay network, a distinction is made between structured and unstructured networks. The \ac{P2P} principle became well-known in 1999 with the file-sharing application Napster. The software connected its users and allowed accessing (mainly copyrighted) songs among the participants without having to offer them from a central server. Popular applications of \ac{P2P} networks are file sharing (e.g., BitTorrent), instant messaging (e.g., Skype) and blockchain technology (e.g., Bitcoin).
  2. Their independence particularly characterizes \ac{P2P} networks: there are no control points and not necessarily a fixed infrastructure which leads to minimal operating costs. Besides, \ac{P2P} networks are self-organized and self-scaling, as each additional user contributes its resources. However, there are also some challenges in \ac{P2P} networks that need to be solved for successful operation. These include finding peers in the network (peer discovery) and finding resources (resource discovery). Especially in file sharing networks, solutions have to be found how to motivate users to upload data and not only use the download one-sidedly. The replication of data and the associated availability must also be taken into account in solutions. Another critical issue is the Internet connection of individual participants, which may not be powerful or permanent. \cite{tanenbaum2007distributed,Tanenbaum:2010:CN:1942194,kurose2005computer}
  3. \subsection{Unstructured \ac{P2P} Networks}
  4. \label{sec:unstructured-p2p}
  5. In unstructured \ac{P2P} networks there are no specifications for the overlay network, so the peers are only loosely connected. Due to the loose structures, the failure of one peer has no significant influence on the function of the rest of the network. Another advantage of the unstructured topology is the lower vulnerability.
  6. While such loose structures are easy to create, the performance of the entire network suffers. A multicast request is sent to all connected peers, who forward the request again and flood the entire network. If a peer can respond to the request, it responds by the same route that the request used to reach it. Each request has a validity time (\ac{TTL}) before it is discarded. Popular files with wide distribution can thus be found quickly. However, rare files are difficult to find because the \ac{TTL} may has expired before the request could be completed. A flooding search is not efficient and provides a large amount of signaling traffic. An example of this approach is Gnutella.
  7. To address the problem of inefficient and complicated searching for resources, central points are created to answer the search requests. Network implementations using this structure are Napster, FastTrack, and BitTorrent. Figure \ref{fig:unstructured-p2p} shows a comparison of the search process in the respective networks.
  8. \begin{figure}[h!]
  9. \centering
  10. \includegraphics[width=0.85\textwidth]{unstructured-p2p-networks}
  11. \caption{File retrieval process in unstructured \ac{P2P} systems exemplary shown for Napster, Gnutella v0.4, FastTrack/Gnutella v0.6 and BitTorrent \cite{moltchanov2014p2p-networks}}
  12. \label{fig:unstructured-p2p}
  13. \end{figure}
  14. \subsection{Structured \ac{P2P} Networks}
  15. \label{sec:structured-p2p}
  16. In structured networks, compliance with the structure is strictly controlled. By defining a particular structure, for example a circle or a tree, search processes in the overlay network can be designed efficiently and deterministically. Routing algorithms determine how a node is arranged in the overlay network. The performance of the entire network depends directly on the arrangement of the nodes and how quick changes (joining and leaving nodes) are detected.
  17. Usually, the routing algorithms are based on a \ac{DHT}. Hash tables are data structures in which key-value pairs are stored, whereby the key must be unique. The corresponding value can then be queried via the key. The keys are ids, which are generated with a hash function (e.g., SHA-1). For the addresses of the nodes and the files, ids are created equally, so that they lie in the same address space. For finding a file, it is searched at the node with the same or the next larger id. If it is not available there, it does not exist on the network.
  18. For joining a network, either one or more peers must be known as the entry point, or this information must be obtained from a bootstrap server. When entering a structured network, the joining node is assigned a unique id and thus positions itself in the structure. The routing tables of the nodes affected by the structural change must then be updated. When leaving a network, this happens either planned, and all affected nodes are informed to update their routing tables, or unexpected. Therefore, nodes must always check the correctness of their routing tables.
  19. Known routing algorithms that use \acp{DHT} include Chrod\cite{stoica2003chord}, CAN\cite{ratnasamy2001scalable}, Pastry\cite{rowstron2001pastry}, Tapestry\cite{zhao2004tapestry} and Kademlia\cite{maymounkov2002kademlia}. Among other things, they differ in their distinct structure and the hash functions used.