Short version                                                                            CSL
In compiling a language corpus, three issues should be considered beforehand: the reliability of the corpus, its representativeness, and its validity. Reliability depends directly on the size of the corpus, representativeness on the type of material included, and validity is a byproduct of these two. It should also be noted that corpus reliability depends on the aspect of language under investigation.
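The dependence of reliability on both corpus size and the phenomenon studied can be illustrated with a rough sketch. It assumes a simple binomial model in which tokens are drawn independently, an assumption that real text does not satisfy, so the figures it produces are optimistic lower bounds at best. The function names and the example frequencies are hypothetical and are not taken from the CSL documentation.

```python
import math


def relative_standard_error(p: float, n_tokens: int) -> float:
    """Relative standard error of an estimated relative frequency p,
    assuming n_tokens independent draws (binomial model, an assumption)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if n_tokens <= 0:
        raise ValueError("n_tokens must be positive")
    # SE(p) = sqrt(p * (1 - p) / n); dividing by p gives the relative error.
    return math.sqrt((1.0 - p) / (p * n_tokens))


def tokens_needed(p: float, target_rse: float) -> int:
    """Smallest corpus size (in tokens) at which the relative standard
    error of the estimate of p falls to target_rse under the same model."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if target_rse <= 0.0:
        raise ValueError("target_rse must be positive")
    return math.ceil((1.0 - p) / (p * target_rse ** 2))


if __name__ == "__main__":
    # Hypothetical relative frequencies: a common item and a rare one.
    for label, p in [("common item", 1e-2), ("rare item", 1e-5)]:
        print(label, tokens_needed(p, target_rse=0.1))
```

Under this model a rare item requires a far larger corpus than a common one to reach the same relative precision, which is one way to read the claim that reliability depends on the aspect of language investigated.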
There were two principal sampling criteria in building the Corpus of Serbian Language. The first was that the corpus should include all relevant periods in the development of the Serbian language and encompass all relevant genres of written Serbian. The second concerned the overall size of the Corpus and the size of its subsamples. Inspection of the documentation suggests that sampling was an important part of the project and was approached with the utmost care and seriousness. The fact that several studies on sample size and sample reliability (i.e. corpus size and its reliability) were written by the most prominent statisticians of that time (B. Ivanović and B. Bajšanski) indicates that the sample segments and their sizes were not chosen arbitrarily. Thus far the original studies have not been found, although their titles are known. Likewise, inspection of the authors and books that constitute the subsamples of the Serbian language from the 12th to the 20th century suggests clear sampling criteria, which are elaborated in more detail in the following paragraphs.