Article abstract, STUDIA UNIVERSITATIS BABEŞ-BOLYAI
STUDIA INFORMATICA - Issue no. 2, 2023
Article: DEOBFUSCATING JAVASCRIPT CODE USING CHARACTER-BASED TOKENIZATION. Author: ALEXANDRU-GABRIEL SÎRBU.
Abstract:
DOI: 10.24193/subbi.2023.2.01. Published online: 2023-12-22. pp. 5-21.

Deployed JavaScript code typically goes through minification, in which variables are renamed to single-character names and whitespace is removed so that files are smaller and load faster. As a result, the code becomes hard to read and harder to analyze manually. Since JavaScript experts can still understand minified code, machine learning approaches to deobfuscating it are plausible. We therefore propose a technique that finds a fitting name for each obfuscated variable, one that is both intuitive and meaningful given how the variable is used. The technique is based on a Sequence-to-Sequence model that generates the name character by character, so that it can cover all possible variable names. The proposed approach achieves an average exact name generation accuracy of 70.53%, outperforming the state-of-the-art by 12%.

Received by the editors: 31 July 2023.
2010 Mathematics Subject Classification: 68T05, 68T50.
1998 CR Categories and Descriptors: I.2.6 [Learning]: Subtopic – Connectionism and neural nets; I.2.7 [Natural Language Processing]: Subtopic – Language generation.
Keywords and phrases: JavaScript deobfuscation, variable name prediction, Deep Learning, Recurrent Neural Network, Abstract Syntax Tree.
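To illustrate what character-based tokenization of variable names might look like, here is a minimal Python sketch. It is not the author's implementation: the special tokens `<pad>`, `<sos>`, `<eos>`, the ASCII-only alphabet, and the helper names `encode_name` and `decode_name` are all assumptions made for illustration. The point is only that a small, fixed character vocabulary can represent any identifier, which is what lets a decoder emit names it has never seen as whole words.

```python
import string
from typing import List

# Hypothetical special tokens; the abstract does not specify the real vocabulary.
PAD, SOS, EOS = "<pad>", "<sos>", "<eos>"

# Assumption: names are restricted to ASCII identifier characters.
ALPHABET = string.ascii_letters + string.digits + "_$"
VOCAB = [PAD, SOS, EOS] + list(ALPHABET)
CHAR_TO_ID = {token: index for index, token in enumerate(VOCAB)}


def encode_name(name: str) -> List[int]:
    """Map a variable name to token ids, one per character, framed by SOS/EOS."""
    return [CHAR_TO_ID[SOS]] + [CHAR_TO_ID[ch] for ch in name] + [CHAR_TO_ID[EOS]]


def decode_name(ids: List[int]) -> str:
    """Rebuild a name from token ids, stopping at the first EOS token."""
    chars = []
    for token_id in ids:
        token = VOCAB[token_id]
        if token == EOS:
            break
        if token in (PAD, SOS):
            continue  # framing and padding tokens carry no characters
        chars.append(token)
    return "".join(chars)


ids = encode_name("userCount")
print(len(VOCAB))        # 67 tokens: 3 special + 64 identifier characters
print(decode_name(ids))  # userCount
```

With a word-level vocabulary, a name such as `userCount` could only be predicted if it appeared in the training set. With the character-level scheme above, every name is a sequence over the same 67 tokens, at the cost of longer output sequences.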