Binary classification of malware by analyzing its behavior in the network using machine learning

Soto, Jean Carlo

dc.creator	Soto, Jean Carlo
dc.date	2024-04-29
dc.date.accessioned	2025-11-05T16:21:43Z
dc.date.available	2025-11-05T16:21:43Z
dc.description	Introduction. Browsing the internet exposes us every day to cyber-threats that compromise the confidentiality, integrity, and availability of our devices. Cyber-attacks have become more sophisticated, while attackers need less technical knowledge to execute them, so an automated and well-defined process to counter these attacks is urgently needed. This study aimed to address that need. Methods. A model was developed to analyze the information in Packet Capture (PCAP) files and classify network connections as either benign or malicious (malware-generated). The software used two approaches: traditional machine learning algorithms and neural networks. Experiments were carried out on the Intrusion Detection Evaluation Dataset (CICIDS2017), which contains labeled samples of PCAP files, using both raw and standardized data. Classification results were evaluated with recall, precision, F1-score, and accuracy. Results. Both approaches performed satisfactorily, scoring above 95% in F1-score and recall, which indicates a low number of false negatives. Conclusion. Data standardization had a favorable impact on all metrics and should be used carefully. Overall, the experiments showed that malicious network traffic can be detected successfully with automated methods, with the K-Nearest Neighbors (K-NN) classifier achieving an F1-score above 95%.	en-US
dc.description	Introducción. Cada día estamos expuestos a todo tipo de ciberamenazas cuando navegamos por internet, comprometiendo la confidencialidad, integridad y disponibilidad de nuestros dispositivos. Los ciberataques se han convertido más sofisticados y los ciberatacantes requieren menos conocimientos técnicos para ejecutar dichos ataques. Un proceso automatizado y bien definido para contrarrestar estos ataques se vuelve urgente. El objetivo del estudio fue resolver este problema. Métodos. Se desarrolló un modelo para analizar la información en los archivos de Captura de paquetes (PCAP) y clasificar las conexiones de red como benignas o maliciosas (generadas por malware). Este software utilizó dos métodos: algoritmos tradicionales de aprendizaje de maquina y redes neuronales. Nuestros experimentos se llevaron a cabo utilizando el conjunto de datos de evaluación de detección de intrusiones (CICIDS2017), que contiene muestras etiquetadas de archivos PCAP. Se utilizó datos tanto crudos como estandarizados. Los resultados de la clasificación se evaluaron utilizando métricas de exhaustividad, precisión, puntuación F1 y precisión. Resultados. Estos fueron satisfactorios para ambos métodos, obteniendo más del 95% en las métricas de puntuación F1 y exhaustividad, lo que indica un bajo número de falsos negativos. Conclusión. Se encontró que la estandarización de datos tuvo un impacto favorable en todas las métricas y debe usarse con cuidado. En general, nuestros experimentos mostraron que el tráfico de red malicioso se puede detectar con éxito utilizando métodos automatizados que alcanzan más del 95% de la puntuación F1 en el Clasificador del Algoritmo de Vecinos Más Cercanos (K-NN).	es-ES
dc.format	text/html
dc.format	application/pdf
dc.identifier	https://revistas.unitec.edu/innovare/article/view/251
dc.identifier.uri	https://repositorio.unitec.edu//handle/123456789/13969
dc.language	eng
dc.publisher	Universidad Tecnológica Centroamericana	es-ES
dc.relation	https://revistas.unitec.edu/innovare/article/view/251/282
dc.relation	https://revistas.unitec.edu/innovare/article/view/251/283
dc.rights	https://creativecommons.org/licenses/by-nc-nd/4.0	es-ES
dc.source	Innovare Revista de ciencia y tecnología; Vol. 12 No. 1 (2023); 30-36	en-US
dc.source	Innovare Revista de ciencia y tecnología; Vol. 12 Núm. 1 (2023); 30-36	es-ES
dc.source	2310-290X
dc.subject	Aprendizaje de maquina	es-ES
dc.subject	Aprendizaje profundo	es-ES
dc.subject	Malware	es-ES
dc.subject	Red	es-ES
dc.subject	Seguridad cibernética	es-ES
dc.subject	Cybersecurity	en-US
dc.subject	Deep learning	en-US
dc.subject	Machine learning	en-US
dc.subject	Malware	en-US
dc.subject	Network	en-US
dc.title	Binary classification of malware by analyzing its behavior in the network using machine learning	en-US
dc.title	Clasificación binaria de malware mediante el análisis de su comportamiento en la red mediante aprendizaje de maquina	es-ES
dc.type	info:eu-repo/semantics/article
dc.type	info:eu-repo/semantics/publishedVersion
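
The abstract evaluates classifiers with recall, precision, F1-score, and accuracy, and links high recall to a low number of false negatives. As a minimal sketch of how these four metrics follow from confusion-matrix counts, the snippet below uses a hypothetical helper `binary_metrics` and made-up labels (1 = malicious, 0 = benign); neither the function nor the data comes from the article.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, F1-score, and accuracy for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy, "false_negatives": fn}


# Hypothetical labels for illustration only (not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Because recall is TP / (TP + FN), each missed malicious connection lowers it directly, which is why a recall above 95% implies few false negatives.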

Collections

Innovare
