Sequential Activity Profiling: Latent Dirichlet Allocation of Markov Chains

Ata Kaban, M Girolami

Research output: Contribution to journal (Article)

23 Citations (Scopus)


To provide a parsimonious generative representation of the sequential activity of a number of individuals within a population, there is a necessary tradeoff between the definition of individual-specific and global representations. A linear-time algorithm is proposed that defines a distributed predictive model for finite-state symbolic sequences which represent the traces of the activity of a number of individuals within a group. The algorithm is based on a straightforward generalization of latent Dirichlet allocation to time-invariant Markov chains of arbitrary order. The modelling assumption is that the possibly heterogeneous behavior of individuals may be represented by a relatively small number of simple, common behavioral traits which may interleave randomly according to an individual-specific distribution. The results of an empirical study on three different application domains indicate that this modelling approach provides an efficient, low-complexity, and intuitively interpretable representation scheme, which is reflected in improved prediction performance over comparable models.
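
The modelling assumption in the abstract can be illustrated with a minimal generative sketch. This is not the authors' algorithm or code: it assumes first-order chains only, and the function names, the two example traits, and the Dirichlet parameters are all hypothetical. Each individual draws their own mixing proportions over a small set of shared traits; at every step one trait is chosen from those proportions, and that trait's transition table generates the next symbol.

```python
import random


def sample_dirichlet(alpha, rng):
    # Normalizing independent Gamma(alpha_k, 1) draws yields a Dirichlet sample.
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_sequence(length, alpha, transitions, initial, rng):
    """Generate one individual's symbol sequence (first-order sketch, length >= 1).

    transitions[k][s] is trait k's distribution over the next symbol,
    given that the previous symbol was s. All names are illustrative.
    """
    n_symbols = len(initial)
    n_traits = len(alpha)
    # Individual-specific distribution over the shared behavioral traits.
    theta = sample_dirichlet(alpha, rng)
    seq = [rng.choices(range(n_symbols), weights=initial)[0]]
    for _ in range(length - 1):
        # Traits interleave randomly: pick one trait for this transition.
        k = rng.choices(range(n_traits), weights=theta)[0]
        row = transitions[k][seq[-1]]
        seq.append(rng.choices(range(n_symbols), weights=row)[0])
    return theta, seq


rng = random.Random(0)
# Two hypothetical traits over three symbols: "stay put" and "cycle forward".
transitions = [
    [[0.90, 0.05, 0.05], [0.05, 0.90, 0.05], [0.05, 0.05, 0.90]],
    [[0.05, 0.90, 0.05], [0.05, 0.05, 0.90], [0.90, 0.05, 0.05]],
]
theta, seq = sample_sequence(20, [0.5, 0.5], transitions, [1 / 3, 1 / 3, 1 / 3], rng)
print(theta)
print(seq)
```

In this sketch the transition tables are global, while only `theta` is specific to an individual, which is one way to read the tradeoff between individual-specific and global representations described above. Fitting such a model to observed sequences is a separate problem that the sketch does not address.
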
Original language: English
Pages (from-to): 175-196
Number of pages: 22
Journal: Data Mining and Knowledge Discovery
Issue number: 3
Publication status: Published - 1 May 2005


  • Markov chains
  • mixture models
  • user profiling

