Text as binary sequence: A case of characteristic constant of text

Petar Milin, Nada Ilic

Research output: Contribution to conference (unpublished) › Paper › peer-review

2 Citations (Scopus)

Abstract

The relation between vocabulary size (V(N)) and text size (N) has been reexamined, with the text represented as a binary sequence. Six different texts by three authors from different periods were taken from the Corpus of Serbian Language for analysis. Statistics included regression analysis, a randomness test for binary sequences, and stochastic models. The point of equivalence, where the number of new and old words is equal, has been proposed as a characteristic constant of the text. This constant is independent of N and could be used as an index of vocabulary richness.
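The abstract does not spell out the encoding, so the following is only a minimal sketch under stated assumptions: each token is coded 1 if it has not occurred earlier in the text (a "new" word) and 0 otherwise (an "old" word), V(N) is the running sum of that sequence, and the point of equivalence is taken to be the first N at which the cumulative counts of new and old words are equal. The function names are hypothetical and are not taken from the paper.

```python
from itertools import accumulate


def to_binary_sequence(tokens):
    """Code each token as 1 (first occurrence) or 0 (repeat). Assumed encoding."""
    seen = set()
    bits = []
    for tok in tokens:
        if tok in seen:
            bits.append(0)
        else:
            seen.add(tok)
            bits.append(1)
    return bits


def vocabulary_growth(bits):
    """Return V(N) for N = 1..len(bits) as the running count of new words."""
    return list(accumulate(bits))


def point_of_equivalence(bits):
    """Return the first N where cumulative new words equal cumulative old words.

    Returns None if the counts never become equal within the text.
    This reading of "point of equivalence" is an assumption.
    """
    new = 0
    for n, bit in enumerate(bits, start=1):
        new += bit
        old = n - new
        if new == old:
            return n
    return None


tokens = "a b a b a a".split()
bits = to_binary_sequence(tokens)
print(bits)
print(vocabulary_growth(bits))
print(point_of_equivalence(bits))
```

Under this reading, the point of equivalence is the text length at which V(N) = N/2. A short or lexically rich text may never reach it, which is why the sketch returns None in that case.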

Original language: English
Pages: 47-52
Number of pages: 6
Publication status: Published - 2003
Event: 4th International Workshop on Linguistically Interpreted Corpora at the 10th European Chapter of the Association for Computational Linguistics, LINC@EACL 2003 - Budapest, Hungary
Duration: 13 Apr 2003 to 14 Apr 2003

Conference

Conference: 4th International Workshop on Linguistically Interpreted Corpora at the 10th European Chapter of the Association for Computational Linguistics, LINC@EACL 2003
Country/Territory: Hungary
City: Budapest
Period: 13/04/03 to 14/04/03

Bibliographical note

Publisher Copyright:
© LINCEACL 2003. All rights reserved.

Keywords

  • Binary sequence
  • Characteristic constant of the text
  • Text size
  • Vocabulary size

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
