Trace analysis is a protocol reverse engineering technique that aims to determine the behavior of unknown network protocols by examining network messages. One possible step in trace analysis is to divide the traffic dump into groups according to the protocol stacks of the packets. In this article, we propose an unsupervised learning method that uses NLP approaches to obtain packet embeddings and then groups the packets by clustering. The method can be applied to raw packet data and does not require domain knowledge to extract relevant features. The results show that the obtained embeddings capture the semantic information underlying the protocols and allow the traffic dump to be divided into clusters of packets that share the same protocol stack. Grouping network packets in this way makes packet analysis more efficient, because packets belonging to the same unknown protocol can be analyzed jointly.
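The two-stage idea in the abstract (embed raw packets, then cluster the embeddings) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the embedding model or the clustering algorithm, so the sketch substitutes normalised byte-bigram count vectors for learned embeddings and a small k-means with deterministic farthest-point initialisation. All function names and the toy packets are hypothetical.

```python
import math
from collections import Counter


def packet_to_tokens(packet: bytes, n: int = 2) -> list:
    """Treat a raw packet as a 'sentence' whose 'words' are byte n-grams."""
    hex_bytes = [f"{b:02x}" for b in packet]
    return ["".join(hex_bytes[i:i + n]) for i in range(len(hex_bytes) - n + 1)]


def embed(packets, n: int = 2):
    """L2-normalised n-gram count vectors (a stand-in for learned embeddings)."""
    docs = [Counter(packet_to_tokens(p, n)) for p in packets]
    vocab = sorted({tok for doc in docs for tok in doc})
    vectors = []
    for doc in docs:
        vec = [float(doc[tok]) for tok in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k: int, iters: int = 20):
    """Plain k-means; returns one cluster label per input vector."""
    # Deterministic farthest-point initialisation: start from the first
    # vector, then repeatedly add the vector farthest from all centroids.
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: nearest centroid for every vector.
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c]))
                  for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy "traffic dump": two made-up header layouts with varying trailing bytes.
packets = [
    bytes.fromhex("4500003c0611aa01"),
    bytes.fromhex("4500003c0611aa02"),
    bytes.fromhex("4500003c0611bb03"),
    bytes.fromhex("600000001140cc01"),
    bytes.fromhex("600000001140cc02"),
    bytes.fromhex("600000001140dd03"),
]
labels = kmeans(embed(packets), k=2)
print(labels)
```

On this toy input the packets that share a header layout end up with the same label. With real captures, the number of clusters is usually unknown in advance, so a method that does not require fixing `k` may be a better fit than this k-means stand-in.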
Original language: English
Title of host publication: Proceedings - 2023 IEEE Ural-Siberian Conference on Biomedical Engineering, Radioelectronics and Information Technology, USBEREIT 2023
Subtitle of host publication: book
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 344-347
Number of pages: 4
ISBN (Electronic): 979-835033605-4
Publication status: Published - 15 May 2023
Event: 2023 IEEE Ural-Siberian Conference on Biomedical Engineering, Radioelectronics and Information Technology (USBEREIT) - IRIT-RTF, UrFU, Yekaterinburg, Russian Federation
Duration: 15 May 2023 to 17 May 2023

Conference

Conference: 2023 IEEE Ural-Siberian Conference on Biomedical Engineering, Radioelectronics and Information Technology (USBEREIT)
Country/Territory: Russian Federation
City: Yekaterinburg
Period: 15/05/2023 to 17/05/2023

ID: 41986195