The part of the CSL that covers the contemporary Serbian written language consists of the following six subsamples: a. poetry, b. prose, c. daily press, d. scientific literature, e. political literature, and f. texts of the Belgrade surrealists.
With the exception of the surrealist texts, each subsample consists of
about 1,400,000 words. In total, the sample of the contemporary Serbian written
language consists of about 7,000,000 words.
Principles of sampling: The sampling procedure differed from one subsample to another. Poetry, political texts and surrealist texts, for example, are given in full (the whole book was grammatically tagged). Subsamples of novels
and essays, daily press and scientific texts were sampled by page: the
first ten pages were tagged, and then every fifth page until the end of a
book. Daily press material was sampled by date, page and position on a
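The page-based rule described above (first ten pages, then every fifth page) can be illustrated with a minimal Python sketch. The function name `sampled_pages` and its parameters are hypothetical, introduced only for illustration. The sketch also assumes that "every fifth page" is counted from the tenth page onward (pages 15, 20, 25, ...), which the description does not state explicitly.

```python
def sampled_pages(total_pages, lead=10, step=5):
    """Return the 1-based page numbers selected for tagging.

    Assumption (not stated in the corpus description): after the first
    `lead` pages, every `step`-th page is counted from page `lead`,
    so the next selected pages are lead + step, lead + 2 * step, ...
    """
    # The first `lead` pages are always tagged (or all pages of a short book).
    head = list(range(1, min(lead, total_pages) + 1))
    # Then every `step`-th page until the end of the book.
    tail = list(range(lead + step, total_pages + 1, step))
    return head + tail


if __name__ == "__main__":
    # Pages that would be tagged in a hypothetical 32-page book.
    print(sampled_pages(32))
```

Under these assumptions, a 32-page book contributes 14 tagged pages: the first ten, plus pages 15, 20, 25 and 30.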
State of the material: For the poetry, daily press and surrealist texts, frequency dictionaries were compiled in the late fifties, and the original grammatically tagged text is no longer available. The intention is to transfer these texts into electronic format. At present the following is available in electronic format: a. grammatically tagged prose, b. frequency dictionaries for poetry, daily press and surrealist texts, and c. a global frequency dictionary compiled from the subsamples of poetry and daily press. The remaining material will be transferred into electronic format in the near future.