p2p.tex 5.2 KB

12345678910111213141516171819202122232425262728293031323334
  1. The special feature of peer to peer (P2P) System is that each participant has the role of both a server and a client. The participants are therefore equal with each other and provide each other with services, which is reflected in the naming. P2P networks are usually characterized as overlay networks over the Internet. With regard to the structure of the overlay network, a distinction is made between structured and unstructured networks. The P2P principle became particularly well known in 1999 with Napster. With the file-sharing application Napster, it was possible to exchange (mainly copyrighted) songs among the participants without having to offer them from a central server.
  2. Popular applications of P2P networks are file sharing (e.g. BitTorrent), instant messaging (e.g. Skype) and blockchain technology (e.g. Bitcoin).
  3. P2P networks are particularly characterized by their independence. There are no control points and not necessarily a fixed infrastructure. This also has a positive effect on operating costs. In addition, P2P networks are self-organized and self-scaling, as each additional user contributes its resources.
  4. However, there are also some challenges in P2P networks that need to be solved for successful operation. These include finding peers in the network (peer discovery) and finding resources (resource discovery). Especially in file sharing networks, solutions have to be found how to motivate users to upload data and not only use the download one-sidedly. The replication of data and the associated availability must also be taken into account in solutions. Another critical issue is the Internet connection of individual participants, which may not be powerful or permanent.
  5. \subsection{Unstructured P2P}
  6. \label{sec:unstructured-p2p}
  7. In unstructured P2P networks there are no specifications for the overlay network, so the peers are only loosely connected to each other. Due to the loose structures, the failure of one peer has no great influence on the function of the rest of the network. Another advantage of the unstructured topology is the lower vulnerability.
  8. While such loose structures are easy to create, the performance of the entire network suffers. A multicast request is sent to all connected peers, who forward the request again and flood the entire network. If a peer can respond to the request, it responds by the same route that the request used to reach it. Each request has a validity time (time to live, TTL) before it is discarded. Popular files with wide distribution can thus be found quickly. However, rare files are difficult to find because the TTL may have been reached before. A flooding search is not efficient and provides a large amount of signaling traffic. An example for this approach is Gnutella.
  9. In order to counteract the problem of inefficient and complicated searching for resources, central points are created in other network implementations where addresses of resources can be specifically requested. These include Napster, FastTrack and BitTorrent. Figure \ref{fig:unstructured-p2p} shows a comparison of the search process in the respective networks.
  10. \begin{figure}[h!]
  11. \centering
  12. \includegraphics[width=0.85\textwidth]{unstructured-p2p-networks}
  13. \caption{Search process in unstructured P2P systems \cite{moltchanov2014p2p-networks}}
  14. \label{fig:unstructured-p2p}
  15. \end{figure}
  16. \subsection{Structured P2P}
  17. \label{sec:structured-p2p}
  18. By defining a certain structure, for example a circle or a tree, search processes in the overlay network can be designed efficiently and deterministically. In such structured networks, compliance with the structure is strictly controlled. Routing algorithms determine how a node is arranged in the overlay network. The performance of the entire network depends directly on how the nodes are arranged and how quickly changes (joining and leaving nodes) are responded to.
  19. Usually the routing algorithms are based on a Distributed Hash Table (DHT). Hash tables are data structures in which key-value pairs are stored, whereby the key must be unique. The corresponding value can then be queried via the key. The keys are ids, which are generated with a hash function (e.g. SHA-1). For the addresses of the nodes and for the files, ids are created equally, so that they lie in the same address space. To find a file, it is searched at the node with the same or the next larger id. If it is not available there, it is not existing on the network.
  20. To join a network, either one or more peers must be known as the entry point or this information must be obtained from a bootstrap server. When entering a structured network, the joining node is assigned a unique id and thus positions itself in the structure. The routing tables of the nodes affected by the structural change must then be updated.
  21. When leaving a network, this happens either gracefully and all affected nodes are informed to update their routing tables, or unexpectedly. Therefore, nodes must always check the correctness of their routing tables.
  22. Known routing algorithms that use DHTs include Chrod\cite{stoica2003chord}, CAN\cite{ratnasamy2001scalable}, Pastry\cite{rowstron2001pastry}, Tapestry\cite{zhao2004tapestry} and Kademlia\cite{maymounkov2002kademlia}. Among other things, they differ in their distinct structure and the hash functions used.