Latent semantic analysis of game models using LSTMs

Dan R. Ghica; Khulood Alyahya

doi:10.1016/j.jlamp.2019.04.003

Computer Science

Research output: Contribution to journal › Article › peer-review

Abstract

We are proposing a method for identifying whether the observed behaviour of a function at an interface is consistent with the typical behaviour of a particular programming language. This is a challenging problem with significant potential applications such as in security (intrusion detection) or compiler optimisation (profiling). To represent behaviour we use game semantics, a powerful method of semantic analysis for programming languages. It gives mathematically accurate models ('fully abstract') for a wide variety of programming languages. Game-semantic models are combinatorial characterisations of all possible interactions between a term and its syntactic context. Because such interactions can be concretely represented as sets of sequences, it is possible to ask whether they can be learned from examples. Concretely, we are using LSTM, a technique which proved effective in learning natural languages for automatic translation and text synthesis, to learn game-semantic models of sequential and concurrent versions of Idealised Algol (IA), which are algorithmically complex yet can be concisely described. We will measure how accurate the learned models are as a function of the degree of the term and the number of free variables involved. Finally, we will show how to use the learned model to perform latent semantic analysis between concurrent and sequential Idealised Algol.

Original language	English
Pages (from-to)	39-54
Journal	Journal of Logical and Algebraic Methods in Programming
Volume	106
Early online date	12 Apr 2019
DOIs	https://doi.org/10.1016/j.jlamp.2019.04.003
Publication status	Published - Aug 2019

Keywords

programming language semantics
game semantics
recurrent neural networks
machine learning

Access to Document

10.1016/j.jlamp.2019.04.003
Licence: None: All rights reserved

Dan_R_Ghica_et_al_Latent_semantic_analysis_of_game_models_using_LSTMs_Journal_of_Logical_and_Algebraic_Methods_in_Programming_2019
Accepted author manuscript, 1.1 MB
Licence: Creative Commons: Attribution-NonCommercial-NoDerivs (CC BY-NC-ND)

https://www.sciencedirect.com/science/article/pii/S2352220817302122
Licence: None: All rights reserved

Cite this

@article{31dc720847e04d3ba10f2d5b519d5107,

title = "Latent semantic analysis of game models using LSTMs",

abstract = "We are proposing a method for identifying whether the observed behaviour of a function at an interface is consistent with the typical behaviour of a particular programming language. This is a challenging problem with significant potential applications such as in security (intrusion detection) or compiler optimisation (profiling). To represent behaviour we use game semantics, a powerful method of semantic analysis for programming languages. It gives mathematically accurate models ('fully abstract') for a wide variety of programming languages. Game-semantic models are combinatorial characterisations of all possible interactions between a term and its syntactic context. Because such interactions can be concretely represented as sets of sequences, it is possible to ask whether they can be learned from examples. Concretely, we are using LSTM, a technique which proved effective in learning natural languages for automatic translation and text synthesis, to learn game-semantic models of sequential and concurrent versions of Idealised Algol (IA), which are algorithmically complex yet can be concisely described. We will measure how accurate the learned models are as a function of the degree of the term and the number of free variables involved. Finally, we will show how to use the learned model to perform latent semantic analysis between concurrent and sequential Idealised Algol.",

keywords = "programming language semantics, game semantics, recurrent neural networks, machine learning",

author = "Ghica, Dan R. and Alyahya, Khulood",

year = "2019",

month = aug,

doi = "10.1016/j.jlamp.2019.04.003",

language = "English",

volume = "106",

pages = "39--54",

journal = "Journal of Logical and Algebraic Methods in Programming",

issn = "2352-2208",

publisher = "Elsevier",

}

TY - JOUR

T1 - Latent semantic analysis of game models using LSTMs

AU - Ghica, Dan R.

AU - Alyahya, Khulood

PY - 2019/8

Y1 - 2019/8

N2 - We are proposing a method for identifying whether the observed behaviour of a function at an interface is consistent with the typical behaviour of a particular programming language. This is a challenging problem with significant potential applications such as in security (intrusion detection) or compiler optimisation (profiling). To represent behaviour we use game semantics, a powerful method of semantic analysis for programming languages. It gives mathematically accurate models ('fully abstract') for a wide variety of programming languages. Game-semantic models are combinatorial characterisations of all possible interactions between a term and its syntactic context. Because such interactions can be concretely represented as sets of sequences, it is possible to ask whether they can be learned from examples. Concretely, we are using LSTM, a technique which proved effective in learning natural languages for automatic translation and text synthesis, to learn game-semantic models of sequential and concurrent versions of Idealised Algol (IA), which are algorithmically complex yet can be concisely described. We will measure how accurate the learned models are as a function of the degree of the term and the number of free variables involved. Finally, we will show how to use the learned model to perform latent semantic analysis between concurrent and sequential Idealised Algol.

AB - We are proposing a method for identifying whether the observed behaviour of a function at an interface is consistent with the typical behaviour of a particular programming language. This is a challenging problem with significant potential applications such as in security (intrusion detection) or compiler optimisation (profiling). To represent behaviour we use game semantics, a powerful method of semantic analysis for programming languages. It gives mathematically accurate models ('fully abstract') for a wide variety of programming languages. Game-semantic models are combinatorial characterisations of all possible interactions between a term and its syntactic context. Because such interactions can be concretely represented as sets of sequences, it is possible to ask whether they can be learned from examples. Concretely, we are using LSTM, a technique which proved effective in learning natural languages for automatic translation and text synthesis, to learn game-semantic models of sequential and concurrent versions of Idealised Algol (IA), which are algorithmically complex yet can be concisely described. We will measure how accurate the learned models are as a function of the degree of the term and the number of free variables involved. Finally, we will show how to use the learned model to perform latent semantic analysis between concurrent and sequential Idealised Algol.

KW - programming language semantics

KW - game semantics

KW - recurrent neural networks

KW - machine learning

U2 - 10.1016/j.jlamp.2019.04.003

DO - 10.1016/j.jlamp.2019.04.003

M3 - Article

SN - 2352-2208

VL - 106

SP - 39

EP - 54

JO - Journal of Logical and Algebraic Methods in Programming

JF - Journal of Logical and Algebraic Methods in Programming

ER -
