HISTORY OF THE CORPUS
Đorđe Kostić (1909-1995)
Work on the Corpus of Serbian Language (CSL) started in 1956 at the Institute for Experimental Phonetics and Speech Pathology in Belgrade. The CSL project was initiated and led by Prof. Đorđe Kostić as part of a broader project whose initial goal was automatic speech and text recognition and machine translation. Work on the CSL continued until 1962, when it was suspended. About 400 collaborators (80 experts in linguistics and related fields, together with more than 300 technical staff) participated in the project. Given the technology available in the 1950s, all work on the CSL was done manually.
The first step was to create a system of grammatical coding, consisting of more than 2000 distinct codes, to capture all grammatical forms of the Serbian language. Each of the 11 000 000 words in the sample was then manually tagged for its grammatical status.
Once grammatical tagging was finished, multiple frequency dictionaries were compiled (more than 27 000 pages). In 1996, through the joint efforts of the Institute for Experimental Phonetics and Speech Pathology and the Laboratory for Experimental Psychology, University of Belgrade, that work was converted into an electronic format, and a partial update of the system of grammatical tagging was initiated. As of 2001, this phase of the project is still in progress.