SPIN
/
HOSN-app


			
				
					
						
						
							12345678910111213141516171819202122232425262728293031323334
							The distinctive feature of peer to peer (P2P) systems is that each participant has the role of both a server and a client. The participants are therefore equal with each other and provide each other with services, which is reflected in the naming. P2P networks are usually characterized as overlay networks over the Internet. Concerning the structure of the overlay network, a distinction is made between structured and unstructured networks. The P2P principle became mainly well known in 1999 with Napster. With the file-sharing application Napster, it was possible to exchange (mainly copyrighted) songs among the participants without having to offer them from a central server.

Popular applications of P2P networks are file sharing (e.g., BitTorrent), instant messaging (e.g., Skype) and blockchain technology (e.g., Bitcoin).

Their independence particularly characterizes P2P networks: there are no control points and not necessarily a fixed infrastructure. This also has a positive effect on operating costs. Besides, P2P networks are self-organized and self-scaling, as each additional user contributes its resources.

However, there are also some challenges in P2P networks that need to be solved for successful operation. These include finding peers in the network (peer discovery) and finding resources (resource discovery). Especially in file sharing networks, solutions have to be found how to motivate users to upload data and not only use the download one-sidedly. The replication of data and the associated availability must also be taken into account in solutions. Another critical issue is the Internet connection of individual participants, which may not be powerful or permanent.

\subsection{Unstructured P2P}
\label{sec:unstructured-p2p}
In unstructured P2P networks there are no specifications for the overlay network, so the peers are only loosely connected. Due to the loose structures, the failure of one peer has no significant influence on the function of the rest of the network. Another advantage of the unstructured topology is the lower vulnerability.

While such loose structures are easy to create, the performance of the entire network suffers. A multicast request is sent to all connected peers, who forward the request again and flood the entire network. If a peer can respond to the request, it responds by the same route that the request used to reach it. Each request has a validity time (time to live, TTL) before it is discarded. Popular files with wide distribution can thus be found quickly. However, rare files are difficult to find because the TTL may have been reached before. A flooding search is not efficient and provides a large amount of signaling traffic. An example of this approach is Gnutella.

In order to counteract the problem of inefficient and complicated searching for resources, in other network implementations, central points are created to answer the search requests. These include Napster, FastTrack, and BitTorrent. Figure \ref{fig:unstructured-p2p} shows a comparison of the search process in the respective networks.

\begin{figure}[h!]
	\centering
	\includegraphics[width=0.85\textwidth]{unstructured-p2p-networks}
	\caption{Search process in unstructured P2P systems \cite{moltchanov2014p2p-networks}}
	\label{fig:unstructured-p2p}
\end{figure}

\subsection{Structured P2P}
\label{sec:structured-p2p}
By defining a particular structure, for example a circle or a tree, search processes in the overlay network can be designed efficiently and deterministically. In such structured networks, compliance with the structure is strictly controlled. Routing algorithms determine how a node is arranged in the overlay network. The performance of the entire network depends directly on the arrangement of the nodes and how quick changes (joining and leaving nodes) are detected.

Usually, the routing algorithms are based on a Distributed Hash Table (DHT). Hash tables are data structures in which key-value pairs are stored, whereby the key must be unique. The corresponding value can then be queried via the key. The keys are ids, which are generated with a hash function (e.g., SHA-1). For the addresses of the nodes and the files, ids are created equally, so that they lie in the same address space. For finding a file, it is searched at the node with the same or the next larger id. If it is not available there, it does not exist on the network.

For joining a network, either one or more peers must be known as the entry point, or this information must be obtained from a bootstrap server. When entering a structured network, the joining node is assigned a unique id and thus positions itself in the structure. The routing tables of the nodes affected by the structural change must then be updated.

When leaving a network, this happens either gracefully, and all affected nodes are informed to update their routing tables, or unexpectedly. Therefore, nodes must always check the correctness of their routing tables.

Known routing algorithms that use DHTs include Chrod\cite{stoica2003chord}, CAN\cite{ratnasamy2001scalable}, Pastry\cite{rowstron2001pastry}, Tapestry\cite{zhao2004tapestry} and Kademlia\cite{maymounkov2002kademlia}. Among other things, they differ in their distinct structure and the hash functions used.