Second Chapter: Sources of data and presentation of data




Primary and secondary sources; classification and tabulation of data; diagrammatic representation of data.

Introduction:

The basic problem of statistical enquiry is to collect facts and figures relating to a particular phenomenon under study, whether the enquiry is in business, economics or social science. The investigator is the person who conducts the statistical enquiry. The statistician counts or measures the characteristic under study for further statistical analysis. The respondents are the persons from whom the information is collected. The statistical units are the items on which the measurement is taken. Collection of data is the process of enumeration together with the proper recording of results. The success of an enquiry depends upon the proper collection of data.

Primary and secondary data:

Statistical data may be classified as primary and secondary. Primary data are those which are collected for the first time and are original in character. If an individual collects data to study a particular problem, the data are the raw material of the enquiry. They are primary data, collected by the investigator himself to study a particular problem.

Secondary data are those which have already been collected by someone for some purpose and are available for the present study. For instance, the data collected during census operations are primary data to the department of census, and the same data, if used by a research worker for some study, become secondary data.

For collection of primary data, the investigator may choose any one of the following methods:

1. Direct personal observation

2. Indirect oral interview

3. Information through agencies

4. Mailed questionnaires

5. Schedules sent through enumerators

1. Direct personal observation:

Under this method, the data are collected by the investigator personally. The investigator must be a keen observer, tactful and courteous in his behaviour. He asks or cross-examines the informant and collects the necessary information. The enquiry is intensive rather than extensive. For instance, if one wants to study the living conditions of the people in a village, the investigator has to go to the village, contact the people and get the needed information. Thus the information is original in character.

This method is adopted in the following areas:

1. Where greater accuracy is needed

2. Where the field of enquiry is not large

3. Where confidential data are to be collected

4. Where the field is a complex one.

5. Where intensive study is needed

6. Where sufficient time is available

Merits:

1. Original and first-hand information is collected.

2. True and reliable data can be had.

3. Response will be more encouraging because of personal approach.

4. A high degree of accuracy can be aimed.

5. The investigator can extract correct information.

6. Misinterpretations, if any, on the part of the informant can be avoided.

7. Uniformity and homogeneity can be maintained.

Demerits:

1. It is unsuitable where area is large.

2. It is expensive and time-consuming

3. The chances of bias are more.

4. An untrained investigator will not bring good results.

5. One has to collect information according to the convenience of the informant.

2. Indirect oral interview:

When the respondent is reluctant to supply information, the method of indirect oral investigation can be followed. Under this method, the investigator approaches witnesses or third parties who are in touch with the informant. The enumerator interviews people who are directly or indirectly connected with the problem under study. For instance, suppose we are asked to collect information relating to the gambling or drinking habits of the people. In such cases, the respondents will be reluctant to supply information relating to their own habits. On such occasions, we can approach the dealers of liquor shops, friends, neighbours etc. to get the needed information. Generally, this method is employed by different enquiry committees and commissions. The police department generally adopts this method to get clues about thefts, riots, murders etc.

Suitability:

This system is more suitable, where the area to be studied is large. It is used when direct information cannot be obtained. The system is generally adopted by government.

Merits:

1. It is simple and convenient

2. It saves time, money and labour

3. It can be used in the investigation of a large area.

4. The information is unbiased

5. Adequate information can be received.

6. As the information is collected from different parties, a true account can be expected. All aspects of the problem can be ascertained.

Demerits:

1. As there is no direct contact with the informant, the information cannot be fully relied upon.

2. Interview with an improper person will spoil the results.

3. In order to get the real position, a sufficient number of persons are to be interviewed.

4. The careless attitude of the informant will affect the degree of accuracy.

5. Witnesses may color the information according to their interests.

3. Information through agencies:

Under this method, local agents or correspondents are appointed. They collect the information and transmit it to the office or person concerned. They do this according to their own tastes and preferences. This system is adopted by newspapers, periodicals, agencies etc. when information is needed in different fields, for example riots, sports and politics. The informants are generally called correspondents.

Merits:

1. Extensive information can be received.

2. It is the cheapest and most economical method.

3. Speedy information is possible

4. It is useful where information is needed regularly.

Demerits:

1. The information may be biased

2. Degree of accuracy cannot be maintained

3. Uniformity cannot be maintained

4. Data may not be original

4. Mailed questionnaires:

In this method, a questionnaire consisting of a list of questions pertaining to the enquiry is prepared. The questionnaires are sent to the respondents. A covering letter is also sent along with the questionnaire, requesting the respondents to extend their full co-operation by giving correct replies and returning the questionnaire, duly filled in, on time. To get a quick and better response, the return postage expense is borne by the investigator, by sending a self-addressed and stamped envelope. This method is adopted by research workers, private individuals, non-government agencies, and state and central governments.

Merits:

1. Of all the methods, the mailed questionnaire method is the most economical.

2. It can be widely used, when the area of investigation is large.

3. It saves money, labour and time.

4. Error in investigation is very small because information is obtained directly from the respondents.

Demerits:

1. In this method, there is no direct contact between the investigator and the respondent. Therefore, we cannot be sure about the accuracy and reliability of the data.

2. This method is suitable only for literate respondents.

3. There is a long delay in receiving the questionnaires duly filled in.

4. People may not give the correct answer, and thus one is led to false conclusions.

5. The questionnaire is inelastic: asking supplementary questions is not possible.

6. Sometimes the informants may not be willing to give written answers.

5. Schedules sent through enumerators:

It is the most widely used method of collection of primary data. A number of enumerators are selected and trained. They are provided with standardized questionnaires, and specific training and instructions are given to them for filling up the schedules. Each enumerator is in charge of a certain area. The enumerator goes to the informants along with the schedule, gets replies to the questions in it and records their answers. He explains clearly the object and purpose of the enquiry. The difference between the mailed questionnaire method and this method is that in the former the questionnaire is sent to the informants, whereas in this method the enumerator carries the schedule to the informant. This method is used by public organizations and research institutions.

Merits:

1. This method is very useful in extensive enquiries.

2. It yields reliable and accurate results because the enumerators are educated and trained.

3. The scope of enquiry can also be greatly enlarged.

4. Even if the respondents are illiterate this technique can be widely used.

5. As the enumerators personally obtain the information, there is less chance of non-response.

Demerits:

1. This is a very costly method, as the enumerators are trained and paid for.

2. This method is time-consuming, because the enumerators go personally to obtain the information.

3. Personal bias of the enumerators may lead to false conclusion.

4. The quality of the collected data depends upon the personal qualities of the enumerator.

5. It is not suitable to all persons as it is very costly.

Drafting a questionnaire:

Before drafting the questionnaire, it is essential to set out in detail the data which one desires from the answers. It is wise to construct in advance the type of tables which one would like to obtain from the enquiry. It may not always be possible to set out all the desired data in advance, since many things may be learnt in the course of the enquiry, and one may find that what was believed to be ideal was not in fact ideal. For this reason, those who are likely to be concerned with analyzing the results should be called in at a very early stage.

The success of the questionnaire method of collecting information depends largely on the proper drafting of the questionnaire. Drafting a questionnaire is a highly specialized job and requires a great deal of skill and experience. It is difficult to lay down any hard and fast rules to be followed in this connection. However, the following general principles may be helpful in framing a questionnaire:

1. Covering letter:

The person conducting the survey must introduce himself and state the objective of the survey. It is desirable to:

i. Enclose a short letter that briefly states the purpose of the survey and how the informant would benefit from it.

ii. Enclose a self-addressed stamped envelope for the respondent’s convenience in returning the questionnaire.

iii. Assure the respondent that his answers will be kept in the strictest confidence.

iv. Promise the respondent that he will not be solicited after he fills up the questionnaire.

v. If possible, offer special inducements to return the questionnaire.

vi. If the respondent is interested, promise him a copy of the results of the survey.

2. The number of questions should be small:

The number of questions should be kept to the minimum. The precise number of questions to be included would naturally depend on the object and scope of the investigation, but fifteen to twenty-five may be regarded as a fair number. If a lengthy questionnaire is unavoidable, it should preferably be divided into two or more parts.

3. Questions should be arranged logically:

The questions must be arranged in a logical order so that a natural and spontaneous reply to each is induced. They should not skip back and forth from one topic to another. Thus, it is undesirable to ask a man how many children he has before asking whether he is married or not. Similarly, it would be illogical to ask a man his income before asking him whether he is employed or not. The sequence of the questions should therefore be considered carefully in terms of the purpose of the study and the persons who will supply the information. Questions applying to identification and description of the respondent should come first, followed by the major information questions. If opinions are requested, such questions should usually be placed at the end of the list. Two differently worded questions on the same subject may be included to provide a cross-check on important points.

4. Questions should be short and simple to understand:

The questions should be short and simple to understand. Unless the person being interviewed is technically trained, technical terms should be avoided. Words such as ‘capital’ or ‘income’ that have different meanings for different persons should not be used unless a clarification is included in the question.

5. Ambiguous questions ought to be avoided:

Ambiguous questions mean different things to different people. It is not possible to obtain comparable replies from respondents who take a question to mean different things. For example, a question put to those who visit a super bazaar is, ‘Do you think the salesmen at different counters need training?’ Such a question is ambiguous because of differences of opinion as to what constitutes training: does it mean a formal programme of classes extending over a period of a few days, weeks or months, or would a few days of working with a highly experienced salesman constitute training?

6. Personal questions should be avoided:

As far as possible, questions of personal nature should not be asked. For example, questions about income, sales tax paid etc may not be willingly answered in writing. Where such information is essential, it should be obtained by personal interviews. Even then, such questions should be asked only at the end of the interview, when the informants feel more at ease with the interviewer.

7. Instruction to the informants:

The questionnaire should provide necessary instructions to the informants.

8. Questions should be capable of objective answer:

Avoid questions of opinion and keep to questions of fact. It is highly desirable that questions are so designed that an objective answer may be forthcoming. As far as possible, the questions should be of such a nature that they can be answered easily with ‘yes’ or ‘no’.

9. Specific information questions and open-end questions:

A specific information question calls for a specific item of information, for example, ‘What is your age?’ or ‘How many children do you have?’ These questions are simple and direct and are well adapted to securing information of this type. Care should be taken to use this type of question only where the respondent can answer correctly. The open-ended question does not pose alternatives or request specific information. It leaves the respondent free to make whatever reply he chooses. For example, the question ‘Why do you use Colgate toothpaste?’ is an open-ended question. In many ways open-ended questions are superior to other types. However, their answers are difficult to tabulate.

10. Questionnaire should look attractive:

A questionnaire should look attractive. The printing, paper used etc. should be of good quality, and plenty of space should be left for answers, depending upon the type of questions.

11. Questions requiring calculations should be avoided:

Questions should not require calculations to be made. Questions necessitating calculations of ratios, percentages etc. should not be asked, as they take a lot of time and the informant may not send back the questionnaire.

12. Pre-testing the questionnaire:

The questionnaire should be pre-tested with a group before mailing it out.

13. Cross-checks:

If possible, one or more cross-checks should be incorporated into the questionnaire to determine whether the respondent is answering at least the important questions correctly.

14. Method of tabulation:

The method to be used for tabulating the results should be determined before the final draft of the questionnaire is made.

Sources of secondary data:

In most of the studies the investigator finds it impractical to collect first hand information on all related issues and as such he makes use of data collected by others. There is a vast amount of published information from which statistical studies may be made and fresh statistics are constantly in a state of production. The sources of secondary data can be broadly classified under two heads:

i. Published sources

ii. Unpublished sources.

i. Published sources:

Various governmental, international and local agencies publish statistical data, and chief among them are:

a. International publications:

International agencies and international bodies publish regular and occasional reports on economic and statistical matters. Examples are the I.M.F., the I.B.R.D. and the U.N.O.

b. Official publication of central and state governments:

Departments of the union and state governments regularly publish reports on a number of subjects and gather additional information. Some of the important publications are the Reserve Bank of India Bulletin, Census Reports, Statistical Abstracts of States etc.

c. Semi-official publications:

Semi-government institutions like municipal corporations, district boards, Panchayats etc. publish reports.

d. Publications of research institutions:

The Indian Statistical Institute, the Indian Council of Agricultural Research, the Indian Agricultural Statistics Research Institute etc. publish the findings of their research programmes.

e. Publications of commercial and financial institutions

f. Reports of various committees and commissions appointed by the government:

For example Wanchoo Commission Report on Taxation, Pay Commission Reports, Land Reforms Committee Reports etc are sources of secondary data.

g. Journals and newspapers:

Current and important materials on statistics and socio-economic problems can be obtained from journals and newspapers like Economic Times, Commerce, Capital, Indian Finance, Monthly Statistics of Trade etc.

ii. Unpublished sources:

There are various sources of unpublished data. They include the records maintained by various government and private offices, and the research carried out by individual research scholars in universities or research institutes.

Precautions in the use of secondary data:

One must take extra care when using secondary data. According to Simon Kuznets, “the degree of reliability of secondary source is to be assessed from the source, the compiler and his capacity to produce correct statistics and the users also, for the most part, tend to accept a series, particularly one issued by a government agency at its face value without enquiring its reliability”.

Prof. Bowley points out that “secondary data should not be accepted at their face value”. Therefore, before using the secondary data, the investigator should consider the following factors:

a. The suitability of data:

First, the investigator must satisfy himself that the data available are suitable for the purpose of the enquiry. This can be judged by comparing the nature and scope of the present enquiry with those of the original enquiry. For example, if the object of the present enquiry is to study the trend in retail prices, and the data provide only wholesale prices, such data are unsuitable.

b. Adequacy of data:

If the data are suitable for the purpose of investigation, then we must consider whether the data are adequate for the present analysis.

c. Reliability of data:

The reliability of data can be tested by finding out the agency that collected such data. If the agency has used proper methods in collecting data, statistics may be relied upon.

Without knowing their meaning and limitations, we cannot accept the secondary data. According to Prof. Bowley, “It is never safe to take published statistics at their face value without knowing their meaning and limitations, and is always necessary to criticize arguments that can be based on them.”

We can also divide the secondary sources into three parts. They are:

1. Regular data

2. Periodical data

3. Irregular data

1. Regular data:

Statistical data which are published regularly at fixed intervals are called regular data. Regular data are also called ‘continuous data’, for example, the weekly index number of wholesale prices, monthly figures of exports and imports etc.

2. Periodical data:

Periodical data are data which are regularly published at long intervals, such as the decennial census, the Statistical Abstract for India, Agricultural Statistics in India, trade statistics and the statistical tables relating to banks.

3. Irregular data:

Irregular data are data consisting of special studies of statistical information with no regular dates of publication, for example the reports of the National Income Committee, the Tariff Commission and the various commissions appointed by the government from time to time.

Classification and Tabulation of data

Introduction:

The data collected in any statistical investigation are known as raw data. They are huge and confusing. As such, they cannot be easily understood and are not fit for further analysis and interpretation. Prof. J.R. Hicks points out that “classified and arranged facts speak for themselves, unarranged, they are dead as mutton”. Hence, after having collected and edited the data, the next important step is to organize them in a systematic manner.

The first step in the analysis and interpretation of data is classification and tabulation. Classification is the first step in tabulation.

Meaning of classification:

Classification is the process of arranging the available facts into homogeneous groups or classes according to resemblances and similarities. The following are the definitions of classification:

“Classification is the process of arranging things in groups or classes according to their resemblances and affinities, and giving expression to the unity of attributes that may subsist amongst a diversity of individuals”.- R.L. Connor

“Classification is the process of arranging data into sequences and groups according to their common characteristics or separating them into different but related parts”. - Secrist

“The process of grouping a large number of individual facts or observation on the basis of similarity among the items is called classification”. - Stockton and Clark

Chief characteristics of classification are:

1. All the facts are classified into homogeneous groups by the process of classification.

2. The basis of classification is unity in diversity.

3. Classification may be either real or imaginary.

4. The Classification may be according to either similarities or dissimilarities.

5. It should be flexible to accommodate adjustments.

Objectives of classification:

The chief objectives of classification are:

1. To condense the mass of data

2. To present the facts in a simple form

3. To bring out clearly the points of similarity and dissimilarity

4. To facilitate comparison

5. To bring out the relationship

6. To prepare data for tabulation

7. To facilitate the statistical treatment of the data.

Rules of Classification:

It is important that classification should possess the following guiding principles

a. Exactness:

The classes should be rigidly defined. They should not lead to any ambiguity or confusion.

b. Mutually exclusive:

Each item of data must find its place in one class. The classes must not overlap.

c. Stability:

Only one principle must be maintained. That is, the same pattern of classification should be maintained throughout the analysis. Only then will it facilitate meaningful comparison and become an ideal classification.

d. Flexibility:

The classification should be flexible and easy to adjust to new situations and circumstances.

e. Suitability:

The classification should be suitable for the object of the enquiry.

f. Homogeneity:

The items included in each class must be homogeneous. For example, a classification of facts into employed and unemployed youth is not adequate to judge the effect of education; each of these classes may be further classified into literate and illiterate.

g. Mathematical accuracy:

The totals and subtotals of the items in each class and sub-class must agree. Therefore, mathematical accuracy is very important in the classification of data.
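
Two of the rules above, mutual exclusiveness (rule b) and mathematical accuracy (rule g), can be checked mechanically. The following is a minimal sketch only: the class limits and marks are made up for illustration, and the helper name `classes_containing` is hypothetical. Each class is assumed to hold values from its lower limit up to, but not including, its upper limit.

```python
# Hypothetical classes as (lower, upper) pairs; a value v belongs to a
# class when lower <= v < upper. These limits are illustrative only.
classes = [(0, 10), (10, 20), (20, 30)]
marks = [3, 10, 19, 20, 29, 7]  # made-up observations


def classes_containing(value, classes):
    """Return every class whose range covers the value."""
    return [c for c in classes if c[0] <= value < c[1]]


# Rule b (mutually exclusive): each item must fall in exactly one class.
for m in marks:
    assert len(classes_containing(m, classes)) == 1

# Rule g (mathematical accuracy): class counts must add up to the total.
counts = {c: 0 for c in classes}
for m in marks:
    counts[classes_containing(m, classes)[0]] += 1

assert sum(counts.values()) == len(marks)
print(counts)
```

If two classes overlapped, or if a value fell outside every class, the first check would fail and point to the offending item.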

Types of classification:

The classification of data primarily depends on the purpose and objectives of the enquiry. There are four important types of classification. They are:

1. Geographical, that is, area-wise, region-wise or district-wise.

2. Chronological or historical, that is, on the basis of time.

3. Qualitative, that is, by character or by attributes.

4. Quantitative or numerical or by magnitudes.

1. Geographical classification:

In geographical classification, the basis of classification is the geographical or locational differences between various items in the statistical data like states, districts, cities, taluks, regions, zone, area etc. Geographical classification is illustrated in the following table:

Sales data (of pressure cookers) for 1988 in Tamil Nadu

|Name of the town | Number of pressure cookers |

|Madras | 15,000 |

|Tiruchi | 13,000 |

|Madurai | 11,000 |

|Coimbatore | 8,000 |

|Kanyakumari | 4,000 |

2. Chronological classification:

This type of statistical data is classified according to the time of its occurrence, such as years, months, weeks, days, hours etc. For example census data is expressed in decades, national income is expressed every year, and departmental sales are expressed every month or week.

A time series is an example of chronological classification. Such data may relate either to a period of time or to a point of time. Statistical data regarding population, imports, exports, sales in a firm etc. also come under this classification.

Chronological classification is illustrated below:

Population of India from 1921 to 1981

| Year | Population(in millions) |

| 1921 | 248 |

| 1931 | 276 |

| 1941 | 313 |

| 1951 | 357 |

| 1961 | 438 |

| 1971 | 536 |

| 1981 | 684 |

3. Qualitative classification:

When the data are classified according to some quality or attribute such as sex, honesty, intelligence, literacy, blindness, colour, deafness, religion, marital status etc., the classification is termed qualitative or descriptive classification, or classification by attributes. In this type one can only find out the presence or absence of the attribute in the given units.

This again can be classified into two types:

a. Simple classification

b. Manifold classification

a. Simple classification:

If the data is classified into only two classes, such as literate and illiterate or honest and dishonest or skilled and unskilled, the classification is termed as simple classification. This classification is normally dichotomy or twofold, for example,

Population
    Male
    Female

Population
    Literate
    Illiterate

b. Manifold classification:

In manifold classification, the universe is classified on the basis of more than one attribute at a time, for example, we may first divide the population into males and females on the attribute of sex, then further divide them on the basis of literacy and so on.

Population
    Male
        Literate: married, unmarried
        Illiterate: married, unmarried
    Female
        Literate: married, unmarried
        Illiterate: married, unmarried
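
As a rough illustration of the difference between simple and manifold classification, the sketch below counts a handful of made-up records, first by sex alone and then by sex, literacy and marital status together. The records and their field order are assumptions for the example only.

```python
from collections import Counter

# Hypothetical records: (sex, literacy, marital status).
people = [
    ("male", "literate", "married"),
    ("male", "literate", "unmarried"),
    ("male", "illiterate", "married"),
    ("female", "literate", "married"),
    ("female", "literate", "married"),
    ("female", "illiterate", "unmarried"),
]

# Simple (twofold) classification: one attribute at a time.
by_sex = Counter(sex for sex, _, _ in people)

# Manifold classification: several attributes taken together.
by_all = Counter(people)

print(by_sex)
for group, n in sorted(by_all.items()):
    print(group, n)
```

Each additional attribute splits every existing class further, which is why the number of final classes grows quickly in manifold classification.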

4. Quantitative classification:

If the data are classified according to some characteristic which is capable of quantitative measurement, like age, income, height, weight, price, production, sales, profits etc., it is called quantitative classification or classification according to variables. A variable is the quantitative phenomenon under study.

| Marks | No. of students |

| 10 - 20 | 10 |

| 20 - 30 | 7 |

| 30 - 40 | 13 |

| 40 - 50 | 18 |

| 50 - 60 | 12 |

| 60 - 70 | 6 |

| 70 - 80 | 4 |

In the above classification, marks are termed as variables and the number of students in each class is the frequency.

Frequency distribution:

Erricker states that a frequency distribution is “a classification according to the number possessing the same values of the variables”. It is simply a table in which the data are grouped into classes and the number of cases which fall in each class is recorded. Data may be presented as individual observations, as a discrete frequency distribution or as a continuous frequency distribution.

a. Individual observations:

| Roll number | Marks |

| 1 | 40 |

| 2 | 33 |

| 3 | 27 |

| 4 | 38 |

| 5 | 41 |

| 6 | 48 |

| 7 | 44 |

| 8 | 51 |

| 9 | 39 |

| 10 | 55 |

b. Discrete or ungrouped frequency distribution:

One has to count the number of times each value of the variable is repeated in the data; this count is called the frequency of that value. Boddington says, “Discrete variable is one where the variables differ from each other by definite amounts”.

For example:

| Number of children | Number of families |

| 0 | 12 |

| 1 | 84 |

| 2 | 110 |

| 3 | 65 |

| 4 | 27 |

| Total | 298 |

Making a frequency table:

Data may be given in the form of individual observations. They are to be converted into a discrete frequency distribution.

Steps:

We should form a table with three headings, viz., variable, tally-marks and frequency.

In the first column we place all possible values of the variable. The second column is the tally sheet, where a tally-mark is put against a value each time it occurs. After a particular value has occurred four times, for the fifth occurrence we put a cross tally cutting the first four tally-marks, and this gives us a block of 5. For the sixth item we put another tally-mark, leaving some space. By putting cross tally-marks, and allowing a little space after each block of five, easy and correct counting is facilitated. Finally we count the number of tally-marks corresponding to each value of the variable and place it in the column entitled frequency.

Illustration 1:

Consider the marks scored by 30 students:

9, 7, 5, 3, 4, 8, 6, 0, 6, 5

9, 1, 7, 2, 3, 8, 6, 8, 7, 4

9, 4, 5, 10, 6, 5, 9, 6, 9, 5

We are unable to understand the significance of the marks scored by the 30 students, as they are given in raw form. We have to form a discrete series out of the above data. First, we note down the lowest and highest values. In the first column we place all possible values of the variable.

| Marks | Tally-sheet | Number of students (frequency) |

| 0 | / | 1 |

| 1 | / | 1 |

| 2 | / | 1 |

| 3 | // | 2 |

| 4 | /// | 3 |

| 5 | //// | 5 |

| 6 | //// | 5 |

| 7 | /// | 3 |

| 8 | /// | 3 |

| 9 | //// | 5 |

| 10 | / | 1 |

| Total | | 30 |

| Marks | Frequency |

| 0 | 1 |

| 1 | 1 |

| 2 | 1 |

| 3 | 2 |

| 4 | 3 |

| 5 | 5 |

| 6 | 5 |

| 7 | 3 |

| 8 | 3 |

| 9 | 5 |

| 10 | 1 |

| Total | 30 |
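
The tally procedure used in Illustration 1 can be sketched in a few lines. This is only an illustration: the helper `tally` and the plain-text notation `||||/` for a crossed block of five are assumptions made here, not notation from the text.

```python
from collections import Counter

# Marks of the 30 students from Illustration 1.
marks = [9, 7, 5, 3, 4, 8, 6, 0, 6, 5,
         9, 1, 7, 2, 3, 8, 6, 8, 7, 4,
         9, 4, 5, 10, 6, 5, 9, 6, 9, 5]

# Count how many times each value occurs.
freq = Counter(marks)


def tally(n):
    """Write n as tally marks: blocks of five ('||||/'), then the remainder."""
    blocks, rest = divmod(n, 5)
    return " ".join(["||||/"] * blocks + (["|" * rest] if rest else []))


# List every possible value from the lowest to the highest mark.
for value in range(min(marks), max(marks) + 1):
    print(f"{value:>5}  {tally(freq[value]):<8} {freq[value]:>3}")
print("Total", sum(freq.values()))
```

Counting with `Counter` replaces the manual tally column, while the `tally` helper reproduces it for display only.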

Continuous or grouped frequency distribution:

A collection of items which cannot be exactly measured, but are placed within certain limits, is called a continuous series. The following technical terms are important when a continuous frequency distribution is formed or data are classified according to class-intervals.

Class-limits:

The class-limits are the smallest (lowest) and the largest (highest) values in the class. For example, in the class 10-20, the lowest value is 10 and the highest value is 20. The two boundaries of the class are known as the lower limit and the upper limit of the class. Class-limits are also known as class boundaries.

Class-intervals:

The difference between the lower limit and the upper limit of a class is known as the class-interval. For example, in the class 10-20, the class-interval is 10 (that is, 20 - 10 = 10). The formula to find the class-interval for a given problem is:

$$i = \frac{l - s}{k}$$

Where,

l = Largest item

s = Smallest item

k = The number of classes

For example, if the marks of 50 students vary between 10 and 80, and we want to form 7 classes, then with l = 80, s = 10 and k = 7 the class-interval would be:

$$i = \frac{l - s}{k} = \frac{80 - 10}{7} = 10$$

Therefore, with a class-interval of 10, the classes would be 10-20, 20-30, 30-40, 40-50, 50-60, 60-70 and 70-80.
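
The same calculation can be sketched in code. The function name `class_interval` is a hypothetical helper introduced here; the values 80, 10 and 7 are those of the example above.

```python
def class_interval(largest, smallest, k):
    """Class-interval i = (l - s) / k."""
    return (largest - smallest) / k


l, s, k = 80, 10, 7
i = class_interval(l, s, k)

# Build the k classes as (lower limit, upper limit) pairs.
classes = [(s + n * i, s + (n + 1) * i) for n in range(k)]

print(i)
print(classes)
```

Here the range divides evenly into 7 classes; when it does not, the investigator has to choose a convenient width, and the sketch does not attempt that step.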

There are two methods of forming class-intervals

1. Exclusive method

2. Inclusive method

1. Exclusive method: (overlapping)

Under this method, the upper limit of one class-interval is the lower limit of the next class. For example:

| Marks | No. of students |

| 10 - 20 | 15 |

| 20 - 30 | 20 |

| 30 - 40 | 10 |

| Total | 45 |

This method ensures continuity of data. A student whose marks are between 10 and 19.9 would be included in the 10-20 class. A student whose mark is 20 would be included in the class 20-30. This method is the one most commonly followed. The classes can also be stated in words, which avoids confusion about where a boundary value belongs:

| Marks | No. of students |

| 10 or more but less than 20 (10 but under 20) | 10 |

| 20 or more but less than 30 (20 but under 30) | 15 |

| 30 or more but less than 40 (30 but under 40) | 17 |

| Total | 42 |

2. Inclusive method (non-overlapping):

In this method the upper limit of a class is included in that class itself. For example:

| Marks | No. of students |

| 10-19 | 17 |

| 20-29 | 15 |

| 30-39 | 12 |

| 40-49 | 10 |

| Total | 54 |

Here, a student getting 29 marks is included in the 20-29 class-interval and, similarly, a student getting 39 marks is included in the 30-39 class-interval. Thereby the confusion which was observed in the exclusive method is avoided, because the upper limit of a class here is not the lower limit of the next class.
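The difference between the two methods lies only in how the upper limit is treated. The following Python sketch illustrates this; the function names are assumptions made for the illustration, and the classes are those used in the examples above.

```python
def classify_exclusive(mark, classes):
    """Exclusive method: the lower limit is included, the upper limit is excluded."""
    for lower, upper in classes:
        if lower <= mark < upper:
            return f"{lower}-{upper}"
    return None  # the mark lies outside all classes

def classify_inclusive(mark, classes):
    """Inclusive method: both the lower limit and the upper limit are included."""
    for lower, upper in classes:
        if lower <= mark <= upper:
            return f"{lower}-{upper}"
    return None

exclusive_classes = [(10, 20), (20, 30), (30, 40)]
inclusive_classes = [(10, 19), (20, 29), (30, 39), (40, 49)]

print(classify_exclusive(19.9, exclusive_classes))  # 10-20
print(classify_exclusive(20, exclusive_classes))    # 20-30
print(classify_inclusive(29, inclusive_classes))    # 20-29
print(classify_inclusive(39, inclusive_classes))    # 30-39
```

Note that in the inclusive method a fractional mark such as 19.5 falls in no class, which is why the exclusive method is preferred when the data are continuous.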

Tabulation of data:

Meaning:

Tabulation means a systematic presentation of numerical data in columns and rows in accordance with some salient features or characteristics. Columns are vertical arrangements and rows are horizontal arrangements. Croxton and Cowden state that “either for one’s own use or for the use of others, the data must be presented in a suitable form”. Tabulation facilitates both comparison and analysis.

Definitions:

According to Prof. H. Secrist, “Tables are a means of recording in permanent form the analysis that is made through classification and of placing things that are similar and should be compared.”

In the words of Prof. Neiswanger, “a statistical table is a systematic organization of data in columns and rows”.

Tabulation is the process of presenting data in tables.

Objectives (purposes):

Tabulation helps in understanding complex numerical data and makes them simple and clear. Similar and dissimilar facts are separated. According to D.W. Paden and E.F. Lindquist, “The purpose of a table is to summarize a mass of numerical information and to present it in the simplest possible form, consistent with the purpose for which it is to be used”.

Tabulation is a medium of communication of great economy and effectiveness for which ordinary prose is inadequate. In addition to its function in simple presentation the statistical table is a useful tool of analysis.

The main objectives of tabulation are:

1. To clarify the object of investigation

2. To simplify complex data.

3. To clarify the characteristics of data

4. To present facts in the minimum of space

5. To facilitate comparison

6. To detect errors and omissions in the data

7. To depict trends and tendencies of the problem under consideration

8. To facilitate statistical processing

9. To help reference

Difference between classification and tabulation:

Classification and tabulation are important processes in statistical investigation. Through these processes, the collected data are summarized and put in a systematic order.

1. Both classification and tabulation are important for statistical investigation. First the data are classified; then they are presented in tables. Classification is the basis for tabulation.

2. Tabulation is a mechanical function of classification, because in tabulation classified data are placed in columns and rows.

3. Classification is a process of statistical analysis; tabulation is a process of presenting data in a suitable structure.

Parts of a table:

Constructing a good statistical table is an art. The following parts must be present in all tables:

1. Table number:

A table should always be numbered for identification and future reference. There is no specific place allotted for this number. It can be given at the centre, at the top or bottom of the table, or towards the top left side.

2. Title of the table:

Each table should be given a suitable title. It must be written at the top of the table. The title must briefly describe the contents of the table. A complete title should be able to answer questions of the following type:

i. What is the data in the table?

For example: marks scored by a set of 50 students.

ii. Where did the data occur?

For example: in a certain college in Bangalore.

iii. When did the data occur?

For example: during the March 2004 promotional examination.

The title should be self-explanatory.

3. Head note:

It is a statement given below the title and enclosed in brackets. For example, the unit of measurement is written as a head note, such as ‘in millions’ or ‘in crores’.

4. Captions:

These are headings for the vertical columns. They must be brief and self-explanatory. They explain what each column represents.

5. Stubs:

Stubs refer to row headings. These are written at the extreme left of the table. They explain what each row represents. Usually in a table, we find more rows than columns.

6. Body of the table:

It contains the numerical information. It is the most important part of the table. The arrangement in the body of the table is generally from left to right in rows and from top to bottom in columns.

7. Foot-note:

If any explanation or elaboration regarding any item is necessary, foot-notes should be given.

8. Source-note:

It refers to the source from which the information has been taken. It helps the reader to check the figures and gather additional information.

Specimen of a table

[pic]

Rules for tabulation:

The construction of a good statistical table is a specialized art and requires great skill, experience and common sense on the part of the tabulator. There is no hard and fast rule regarding it. According to Bowley, “in collection and tabulation, common sense is the chief requisite and experience the chief teacher.” There are certain general rules to be followed in the construction of a good statistical table. They are:

1. The table should be simple and compact. It should not be overloaded with details.

2. The captions and stubs in the table should be arranged in a systematic manner, so that the important items are easy to read. There are many types of arrangement: alphabetical, chronological, geographical, conventional, etc.

3. It should suit the purpose of the investigation.

4. The unit of measurements should be clearly defined and given in the tables. For example: height in meters, weight in kilograms, etc.

5. Figures may be rounded off to avoid unnecessary details in the table. But a foot-note must be given to this effect.

6. Suitable approximation may be adopted.

7. A miscellaneous column should be added to include unimportant items.

8. A table should be complete and self-explanatory.

9. A table should be attractive to draw the attention of the readers.

10. As it forms a basis for statistical analysis, it should be accurate and free from all sorts of errors.

11. Abbreviations should be avoided.

12. Ditto marks should not be used, as they may be mistaken.

13. Proper lettering will help to adjust the size of the table.

14. A very big table loses its simplicity and understandability; in such a case it should be broken into two or three tables.

Types of tables:

Statistical tables can be classified in a number of ways, depending upon:

1. The basis of coverage, which can be further classified into simple table and complex table. A complex table can be classified into two-fold, three-fold or manifold table.

2. The basis of objective or purpose, which can be further classified into general purpose table (or reference table) and special purpose table (or summary table).

3. The basis of nature of enquiry, which can be further classified into original (or primary) table and derived (or derivative) table.

1. On the basis of coverage:

Simple and complex:

In a simple table the data is classified according to only one characteristic. It is termed a one-way or single table, and it takes the form of a frequency table. In a complex table two or more characteristics are shown. It is more popular, because it helps to give appropriate consideration to all the facts.

Simple Table:

[pic]

Two-way table:

If the caption or stub is classified into two characteristics and the table gives information on two interrelated questions, then such a table is called a two-way table. For example:

[pic]
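A two-way table can also be built mechanically from individual records. The Python sketch below is only an illustration: the student records are hypothetical, and the function name `two_way_table` is an assumption made for this example.

```python
from collections import Counter

# Hypothetical records, one (faculty, sex) pair per student.
students = (
    [("Arts", "Male")] * 2 + [("Arts", "Female")] * 3
    + [("Commerce", "Male")] * 1 + [("Commerce", "Female")] * 2
    + [("Science", "Male")] * 4 + [("Science", "Female")] * 1
)

def two_way_table(records):
    """Count records by two characteristics: rows (stubs) and columns (captions)."""
    counts = Counter(records)
    stubs = sorted({row for row, _ in records})
    captions = sorted({col for _, col in records})
    body = {row: {col: counts[(row, col)] for col in captions} for row in stubs}
    return stubs, captions, body

stubs, captions, body = two_way_table(students)

# Print the body of the table with row totals and column totals.
print(f"{'Faculty':<10}" + "".join(f"{c:>8}" for c in captions) + f"{'Total':>8}")
for row in stubs:
    cells = [body[row][c] for c in captions]
    print(f"{row:<10}" + "".join(f"{n:>8}" for n in cells) + f"{sum(cells):>8}")
col_totals = [sum(body[row][c] for row in stubs) for c in captions]
print(f"{'Total':<10}" + "".join(f"{n:>8}" for n in col_totals) + f"{sum(col_totals):>8}")
```

Each cell of the body answers two questions at once (which faculty, and which sex), which is what distinguishes a two-way table from a simple table.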

Three-way table:

In this type of table three characteristics are shown. It gives information regarding three interrelated characteristics of a phenomenon. For example:

Distribution of population by age, sex and literacy

[pic]

Manifold or higher order table:

A large number of interrelated problems or characteristics are represented in the same table. For example, the distribution of students in a college according to faculty, class, sex and residence.

Number of students in Mysore University (according to faculty, age, sex and residence)

[pic]

2. On the basis of objectives (purposes):

a. General purpose table:

It is also known as an informative table and provides information for general use, usually in chronological order. The detailed tables in the census reports are of this kind. Government agencies prepare this type of table. These are used by research workers and statisticians, and are placed in the appendix of a report for reference.

b. Special purpose table:

It is also called a summary table, text table or analytical table. It presents the data relating to a particular or special purpose. Ratios, percentages, etc. are used to facilitate comparison.

3. On the basis of originality:

The statistical table may be classified into 1. Primary table and 2. Derived table.

In a primary (original) table, the statistical facts are expressed in their original form. It contains actual and absolute figures. In a derived table, figures and results are derived from the primary data. It presents totals, percentages, ratios, averages, dispersion, coefficient of correlation, etc. Both primary and derived tables are generally used in practice.
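The relation between a primary table and a derived table can be illustrated with a short Python sketch. The frequencies are those of the inclusive-method example given earlier; the function name `derive_percentages` is an assumption made for this illustration.

```python
# Primary table: actual, absolute figures (marks class -> number of students).
primary = {"10-19": 17, "20-29": 15, "30-39": 12, "40-49": 10}

def derive_percentages(table):
    """Derived table: each figure expressed as a percentage of the total."""
    total = sum(table.values())
    return {label: 100 * freq / total for label, freq in table.items()}

derived = derive_percentages(primary)

print(f"{'Marks':<8}{'Students':>10}{'Per cent':>10}")
for label, freq in primary.items():
    print(f"{label:<8}{freq:>10}{derived[label]:>10.1f}")
print(f"{'Total':<8}{sum(primary.values()):>10}{sum(derived.values()):>10.1f}")
```

The first two columns form the primary table; the third column, which does not exist in the collected data, is what makes the table a derived one.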

Diagrammatic presentation of data

Generally, statistical data are first classified and then tabulated, that is, presented in the form of a table. A table consists of numbers, which may not always be interesting. They can be confusing, especially when they are large and complex, and so may not be easily understood by a layman. Hence tabulated data can be represented through diagrams and graphs, which are more attractive and more easily understood than tables.

Diagrams

A diagram is a pictorial representation of data, that is, a visual form of presentation of statistical data. The term refers to various types of devices such as bars, circles, maps, pictorials, cartograms, etc. These devices can take many attractive forms. Strictly speaking, these are not graphic devices. Diagrams do not add any new meaning to the statistical facts, but they exhibit the results more clearly. The use of diagrams is becoming more and more popular. Diagrams occupy an important place because of the following merits:

1. They are attractive and impressive:

Diagrams are attractive and create interest in the mind of the reader. They are more appealing to the eye; even a layman can understand them very easily. Diagrams attract greater attention than mere figures.

2. They save time and labour:

Diagrams save much of the time and labour needed to understand the data, and enable one to draw meaningful inferences from it.

3. They have universal applicability:

Diagrammatic presentation of statistical data is followed universally. It is greatly used in almost all walks of life as a good guide in economics, business, social institutions, administration and other fields.

4. They make data simple:

Diagrams can be remembered easily, and they render the whole data readily intelligible. For example, the profit pattern of two firms may not be clear from the figures alone, but when the figures are put in the form of a diagram the trend can be very clear.

5. They make comparison easy:

Diagrams facilitate comparison between two or more sets of data. In absolute figures the comparison may not be clear, but diagrammatic presentation makes it easier and simpler.

6. They provide more information:

A diagram will reveal more information than the data in a table. Cold figures can speak in clear tones, if translated into diagrammatic language.

Limitations of a diagram

The presentation of a diagram without a careful study will be misleading. In brief, the following are the deficiencies or restricted uses:

1. Diagrams cannot be analyzed further.

2. Diagrams show only approximate values.

3. The uses of certain diagrams are limited to the experts (example: multi-dimensional ones).

4. It exposes only limited facts. All details cannot be presented diagrammatically.

5. To draw a table is easy but construction of a diagram is not so easy.

6. It is a supplement to the tabular presentation but not an alternative to it.

7. Minute readings cannot be made. Small differences in large measurements cannot be studied.

8. If there is a very wide gap between two measurements, the diagram will not give a meaningful picture. For example, 10 and 900 cannot be shown well in the same diagram, whatever scale is adopted.

9. Diagrams are drawn when comparisons are needed; otherwise, they are of little use.

10. Diagrams drawn on a false base are illusory.

Rules for making a diagram:

Diagrammatic presentation of a statistical table is simple and effective, as a visual impression lasts longer in the mind than any other form. The construction of a diagram is an art, which can be acquired through practice. However, the following guidelines will help in making diagrams more effective:

1. Heading: Every diagram must have a suitable title. The title, in bold letters, conveys the main facts depicted by the diagram. If needed, sub-headings can also be given. It must be brief, self-explanatory and clear.

2. Size: The size of the diagram should neither be too big nor too small. It must match with the size of the paper. It should be in the middle of the paper.

3. Length and Breadth: An appropriate proportion should be maintained between length and breadth. Lutz has suggested that proportions of length and breadth should be 2:1, 1:4 or 4:1. If it is so, the diagram looks attractive. Care should be taken to ensure that the diagram does not look ugly.

4. Drawing: Since a good impression is needed, the diagram should be drawn neatly and accurately with the help of drawing instruments. Each diagram should also be numbered for ready reference.

5. A proper scale: A proper scale must be chosen for the diagram to look attractive and create a visual impact on the reader. It must suit the space available. Accuracy should not be sacrificed for attractiveness.

6. Selection of appropriate diagram: The most important point is the selection of proper diagram to present a set of figures. All types of diagrams are not suitable for all types of data. A wrong selection of the diagram will distort the true characteristics of the phenomenon to be presented and might lead to very wrong and misleading interpretations.

7. Right method: C.W. Lowe writes, “The important point, which must be borne in mind at all times, is that the pictorial presentation, chosen for any situation, must depict the true relationship and point out the proper conclusion. Use of an inappropriate chart may distort the facts and mislead the reader.”

8. Index: When many items are shown in a diagram through different colours, dotting, crossings, etc., an index must be given for identifying and understanding them.

9. Sources: If the data presented have been acquired from some external source, the fact should be indicated at the bottom of the diagram.

10. Simplicity: A diagram should be very simple. It must be so simple that even a layman who does not have a mathematical or statistical background can understand it. If the data is very large, more diagrams can be used to represent it. Too much information presented in one diagram will be confusing. Therefore, it is suggested to draw several simple diagrams, which are more effective than a complex one.

Types of diagram

There are various diagrammatic devices by which statistical data can be presented. The following are some of the common types of diagrams.

1. One dimensional diagram (line and bar)

2. Two dimensional diagram (Rectangle, square, circles etc)

3. Three dimensional diagram (cube, sphere, cylinder etc)

4. Pictogram

5. Cartogram

1. One dimensional diagram:

In a one dimensional diagram, only the length of the lines or bars is considered; the width of the bars is not taken into consideration. The term ‘bar’ means a thick, wide line. The following are the main types:

A. Line diagram:

This is the simplest of all the diagrams. On the basis of size of the figures, heights of bars or lines are drawn. The distance between lines is kept uniform. It makes comparison easy. This diagram is not attractive and hence it is less important.

B. Simple Bar diagram:

A simple bar diagram is used to represent a single variable. A single variable may mean ‘population of various countries’, ‘production of sugar in different states of our country’, ‘number of deaths in some localities’, ‘number of employees in various branches of a bank’ and so on, for various years. However, a simple bar diagram can represent only one category of data.

C. Multiple Bar diagram:

Multiple bar diagrams are used to denote more than one phenomenon, for example, import and export trends. Multiple bars are useful for direct comparison between two values. The bars are drawn side by side. In order to distinguish the bars, different colours or shades may be used, and a key or index should be given to explain the different bars. These diagrams are easy to understand.

D. Sub-divided bar diagram:

The bar is subdivided into various parts in proportion to the values given in the data and may be drawn on absolute figures or percentages. Each component occupies a part of the bar proportional to its share in the total. Here also to differentiate different components from one another, different shades or colours may be used.

E. Percentage sub-divided bar diagram:

To make comparison on a relative basis, the various components are expressed as percentages of the total. The percentages are cumulated to divide the bars. Here the bars are all of equal height; each segment shows the percentage of the total.
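The arithmetic behind a percentage sub-divided bar can be sketched in Python as follows. The two family budgets are hypothetical figures, and the function name `percentage_segments` is an assumption made for this illustration.

```python
def percentage_segments(components):
    """Return (label, percentage, cumulative percentage) for one bar."""
    total = sum(components.values())
    segments = []
    cumulative = 0.0
    for label, value in components.items():
        pct = value * 100 / total
        cumulative += pct  # the cumulative figure marks where the segment ends
        segments.append((label, pct, cumulative))
    return segments

# Hypothetical monthly expenditure of two families (in rupees).
family_a = {"Food": 400, "Rent": 300, "Clothing": 200, "Education": 100}
family_b = {"Food": 600, "Rent": 600, "Clothing": 300, "Education": 500}

for name, budget in (("Family A", family_a), ("Family B", family_b)):
    print(name)
    for label, pct, cum in percentage_segments(budget):
        print(f"  {label:<10}{pct:>6.1f}%   bar divided at {cum:>5.1f}")
```

Although the two families spend different total amounts, both bars end at 100, so the segments can be compared directly.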

F. Other bar diagrams:

a. Deviation bars: A deviation bar diagram is used to depict net deviations in different values, that is, surplus or deficit, profit or loss, net import or export, etc., which have either positive or negative values. Positive values are shown above the base line and negative values below the base line.

b. Broken bars:

In certain cases we may come across data which contain very wide variations in values, some very small and some very large. In order to provide an adequate and reasonable shape to the smaller bars, the larger bars may be broken at the top. The value of each bar is written at the top of the bar.

2. Two-dimensional diagram:

Pie-chart:

A pie-chart is used to represent a total broken up into its various components. For example, a pie-chart may represent the budget of a family for a month, and the various sectors may represent the portions of the budget allotted to food, rent, clothing, education and so on.

The pie-chart is so called because the entire graph looks like a pie and the components resemble slices cut from the pie. Pie-charts are also called sector graphs or angular diagrams. They can be constructed for either a single set of data or for two or more sets of data.
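To draw a pie-chart, each component is converted into an angle at the centre, taking the total as 360 degrees. The Python sketch below shows this conversion; the family budget is a hypothetical example, and the function name `pie_angles` is an assumption made for this illustration.

```python
def pie_angles(components):
    """Angle of each sector in degrees: (component / total) x 360."""
    total = sum(components.values())
    return {label: value * 360 / total for label, value in components.items()}

# Hypothetical monthly budget of a family (in rupees).
budget = {"Food": 4500, "Rent": 2500, "Clothing": 1000, "Education": 1500, "Others": 500}

angles = pie_angles(budget)
for label, angle in angles.items():
    print(f"{label:<10}{budget[label]:>6}{angle:>8.0f} degrees")
print(f"{'Total':<10}{sum(budget.values()):>6}{sum(angles.values()):>8.0f} degrees")
```

The angles always add up to 360 degrees, which serves as a check on the calculation before the sectors are drawn.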

Differences between tables and diagrams:

1. A table consists of accurate numbers, whereas a diagram gives only an approximate idea.

2. A table contains more information when compared to a diagram.

3. A Table requires careful observation whereas a diagram is easy to follow and interpret.

4. Diagrams are readily understood and they can be retained in memory for a long time, whereas it is difficult and impractical to memorize the numbers present in tables.

5. Diagrams and graphs are attractive and they give good impression when compared to tables.

6. Diagrams and graphs are easy for comparison, when compared to tables.

Graphic representation:

Graphic presentation of statistical data gives a pictorial effect. The collected data will generally be complex, and it will be very difficult to understand their importance. Classification and tabulation reduce the complexity, yet the data are still not easily understood by common people. If the mass of data is depicted graphically, it becomes easy to understand and grasp. Statisticians have long recognized the importance of graphic presentation. It enables us to present the data in a simple, clear and effective manner. Boddington says, “The wandering of a line is more powerful in its effect on the mind than a tabulated statement; it shows what is happening and what is likely to take place just as quickly as the eye is capable of working”.

Graphic presentation of numerical data is becoming popular because of various merits. The advantages of graphic presentation are as follows:

1. It is attractive and impressive.

2. It simplifies complexity of data.

3. It provides easy comparison of two or more phenomena.

4. It needs no special knowledge of mathematics to understand a graph.

5. It provides the basis to locate the statistical measures, like median, mode, quartiles etc.

6. Apart from simplicity, it saves the time and energy of the statistician as well as the observer.

7. Graphic method is probably the simplest method of presenting statistical data.

8. It shows any trend that may be present and the direction in which the trend may change.

Procedure for the construction of a graph:

Graphs are prepared on graph paper on which two lines are drawn, intersecting each other at right angles. These lines are called axes; the point of intersection is ‘O’ (the point of origin). The horizontal line is called the x axis and the vertical line is called the y axis. Generally the independent variable is represented along the x axis and the dependent variable along the y axis. For each axis, a convenient scale representing the units of the variable is chosen in such a manner that it accommodates the entire data given in the table. The scales of x and y need not be the same.

After fixing the origin and scale, the given data is plotted by marking points or dots, corresponding to various x and y values. Then, these points are joined by straight lines to get the graph.

General Rules:

While graphing statistical data, the following guidelines may be observed:

1. Every graph must have a title, indicating the facts presented by the graph.

2. It is necessary to plot the independent variable on the horizontal axis and dependent variables on the vertical axis.

3. A problem arises regarding the choice of a suitable scale. The scale chosen must accommodate the whole data.

4. The principle of drawing a graph is that the vertical scale must start from zero. However, if the fluctuations are quite small compared to the size of the variable, there is no need to show the entire scale from the origin.

5. For showing proportional or relative changes in magnitude, the ratio or logarithmic scale should be used.

6. The graph must not be over-crowded with curves.

7. If more than one variable is plotted on the same graph, it is necessary to distinguish them.

8. Index should be given to show the scales and the meaning of different curves.

9. All lettering must be horizontal.

10. It should be remembered that for every value of independent variable there is a corresponding value of the dependent variable. It is these matched values that are to be plotted.

11. Source of information should be mentioned as foot note.
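The reason for using a ratio (logarithmic) scale for relative changes, mentioned in rule 5, can be seen from a small Python sketch. The sales figures are hypothetical and are assumed to double every year.

```python
import math

# Hypothetical yearly sales that double every year (a constant relative change).
sales = [100, 200, 400, 800]

# On an arithmetic scale the vertical steps between successive points keep growing.
arithmetic_steps = [b - a for a, b in zip(sales, sales[1:])]

# On a logarithmic scale equal relative changes give equal vertical steps.
log_steps = [math.log10(b) - math.log10(a) for a, b in zip(sales, sales[1:])]

print("Arithmetic scale steps:", arithmetic_steps)
print("Logarithmic scale steps:", [round(step, 4) for step in log_steps])
```

On the arithmetic scale the steps are 100, 200 and 400, which hides the fact that the rate of growth is constant; on the logarithmic scale every step is the same, so the graph becomes a straight line.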

Difference between Diagram and Graph (First define diagram and then graph then go to write the differences for 5 marks)

| Diagram | Graph |

| 1. Ordinary paper can be used. | 1. Graph paper is needed. |

| 2. It is attractive and is easily understandable. | 2. It needs some effort to understand. |

| 3. It is appropriate and effective to represent one or more variables. | 3. It creates problems. |

| 4. It cannot be used for interpolation and extrapolation techniques. | 4. It is helpful in interpolation and extrapolation techniques. |

| 5. Median and mode cannot be estimated. | 5. The value of median and mode can be estimated. |

| 6. It is used for comparison only. | 6. It represents a mathematical relationship between two variables. |

| 7. Data is represented by bars, rectangles. | 7. Data is represented by points or lines of different kinds: dots, dashes, etc. |

| 8. Diagrams are used for publicity as they are attractive. They give only approximate information. To a statistician or researcher diagrams are not helpful in analysis. | 8. Graphs are very useful to a statistician or researcher in analysis. |

Like diagrams, a large number of graphs are used in practice. They can be broadly divided under two heads:

1. Graphs of frequency distribution

2. Graphs of time series

Graphs of frequency distribution:

Frequency distribution can be represented graphically in one of the following ways:

1. Histogram

2. Frequency Polygon

3. Frequency curve

4. Cumulative frequency curve (ogive)

1. Histogram:

One of the most important and useful methods of presenting the frequency distribution of a continuous series is the histogram. A histogram is a set of vertical bars erected adjacent to each other. The area of each bar is proportional to the frequency represented. A histogram is constructed by taking ‘class-intervals’ along the x axis and ‘frequencies’ along the y axis. The class-intervals may have equal or unequal widths. A histogram is also known as a ‘block diagram’ or ‘staircase chart’.

b. Construction of histogram when class intervals have unequal widths:

Here a correction for the unequal widths of the class-intervals has to be made. The correction consists of finding the frequency density, that is, the frequency of the class divided by the width of that class:

$$\text{Frequency density} = \frac{\text{Frequency}}{\text{Width of the class-interval}}$$

The histogram is then constructed using these frequency densities for the unequal width class-intervals.
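The correction for unequal widths can be sketched in Python as below. The distribution is hypothetical, and the function name `frequency_densities` is an assumption made for this illustration.

```python
# Hypothetical distribution with unequal class widths: ((lower, upper), frequency).
distribution = [((0, 10), 8), ((10, 20), 12), ((20, 40), 20), ((40, 70), 15)]

def frequency_densities(dist):
    """Return (lower, upper, frequency, frequency density) for each class."""
    rows = []
    for (lower, upper), freq in dist:
        width = upper - lower
        rows.append((lower, upper, freq, freq / width))
    return rows

for lower, upper, freq, density in frequency_densities(distribution):
    print(f"{lower:>3}-{upper:<3} frequency {freq:>3}   density {density:.2f}")
```

The class 20-40 has the highest frequency (20), but because it is twice as wide as 10-20 its bar is lower (density 1.0 against 1.2). The heights of the bars are the densities, so that the areas remain proportional to the frequencies.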

2. Frequency Polygon:

A grouped frequency distribution can be represented by a histogram. A simple method of smoothing the histogram is to draw a frequency polygon. This is done by connecting the mid-point of the top of each rectangle with the mid-point of the top of each adjacent rectangle, by straight lines. This is done under the assumption that the frequencies in a class interval are evenly distributed throughout the class. The area of the polygon is equal to the area of the histogram, because the area left out is just equal to the area included in it. Mode can easily be found out.

3. Frequency curve:

A frequency curve is drawn by smoothing the frequency polygon in such a way that sharp turns are avoided. A frequency polygon, if smoothed further so as to minimize sudden changes, results in a continuous smooth curve known as a frequency curve or smooth frequency curve. The curve should begin and end at the base line.

Uses and limitations of frequency curves:

Frequency curves are useful for comparing two or more distributions graphically; several frequency curves can be plotted on the same axes, like frequency polygons.

However, it is difficult to draw a frequency curve, as it requires some imagination. Frequency curves drawn by different individuals may appear different, unlike frequency polygons.

Ogives (or cumulative frequency curves):

When cumulative frequencies are plotted on a graph, the curve obtained is called an ‘ogive’ or ‘cumulative frequency curve’. Ogives are used to determine the median, quartiles, percentiles, etc. The class-limits are shown along the x axis and the cumulative frequencies along the y axis. In drawing a less than ogive, the cumulative frequency is plotted at the upper limit of the class-interval, and the successive points are then joined to get the ogive.

There are two methods of constructing Ogives, viz:

1. Less than Ogive.

2. More than Ogive

1. Less than Ogive:

Less than Ogive is constructed as follows:

a. Take the upper-limits.

b. Find the less than cumulative frequencies. Here, we keep the first frequency as it is and then go on adding the other frequencies. The last less than cumulative frequency should be equal to the total (N).

c. Take the upper limits on the x axis and the less than cumulative frequencies on the y axis and plot the points.

d. Join the points by a smooth curve. We get the ‘less than ogive’, which is a rising curve.
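Steps a to c can be sketched in Python as follows. The classes and frequencies are those of the exclusive-method example given earlier; the function name `less_than_ogive_points` is an assumption made for this illustration.

```python
from itertools import accumulate

classes = [(10, 20), (20, 30), (30, 40)]
frequencies = [15, 20, 10]

def less_than_ogive_points(classes, frequencies):
    """Pair each upper limit with the running total of the frequencies."""
    upper_limits = [upper for _, upper in classes]
    cumulative = list(accumulate(frequencies))  # first frequency kept, rest added on
    return list(zip(upper_limits, cumulative))

points = less_than_ogive_points(classes, frequencies)
print(points)  # [(20, 15), (30, 35), (40, 45)]
```

The last cumulative frequency, 45, equals the total N, and the points rise from left to right as expected for a less than ogive.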

2. More than ogive:

More than Ogive is constructed as follows:

1. Take the lower limits.

2. Find the more than cumulative frequencies. Here we take the total frequency first and then go on subtracting the frequency of each class in turn. The last more than cumulative frequency should be equal to the last frequency.

3. Take the lower limits on the x axis and the more than cumulative frequencies on the y axis and plot the points.

4. Join the points by a smooth curve. We get the ‘more than ogive’, which is a declining curve.
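Steps 1 to 3 can be sketched in Python as follows. The classes and frequencies are those of the exclusive-method example given earlier; the function name `more_than_ogive_points` is an assumption made for this illustration.

```python
classes = [(10, 20), (20, 30), (30, 40)]
frequencies = [15, 20, 10]

def more_than_ogive_points(classes, frequencies):
    """Pair each lower limit with the number of items at or above that limit."""
    remaining = sum(frequencies)  # start from the total frequency
    points = []
    for (lower, _), freq in zip(classes, frequencies):
        points.append((lower, remaining))
        remaining -= freq  # subtract the frequency of the class just passed
    return points

points = more_than_ogive_points(classes, frequencies)
print(points)  # [(10, 45), (20, 30), (30, 10)]
```

The first value is the total frequency (45) and the last value equals the last class frequency (10), so the points decline from left to right.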

Stages of statistical investigation:

Statistical investigation means a survey or enquiry whose results can be expressed in quantitative terms. Here, information is obtained using statistical devices. A survey requires specialized knowledge and skill. It passes through several stages which can be broadly categorized under two heads:

I. Planning the survey and

II. Execution of the survey.

I. Planning the survey:

Planning a survey is done before we start collecting the data. It is very important, because the result of the survey depends to a large extent on planning. The different stages of planning are:

1. Purpose of the survey:

The purpose of the survey should be very clearly defined. This helps in determining the type of information required and the techniques for analyzing such information. So, this avoids collecting irrelevant information, thus avoiding wastage and confusion.

2. Scope of the survey:

After deciding the purpose of the survey, the next stage is to decide about the scope. Scope refers to the geographical area, the type of information and the subject matter. The scope fixes the limits within which the survey is to be conducted. This helps in deciding the quantity of data to be collected.

3. Units of data collection:

Statistical units must be defined clearly, before we start collecting the data. Units are also necessary for presentation and interpretation.

4. Sources of data:

After deciding the purpose and the scope, the sources of data will have to be determined. The sources of data can be primary or secondary.

5. Techniques of data collection:

There are two important techniques of data collection, called the ‘census technique’ and the ‘sampling technique’. The census technique refers to complete enumeration and recording of each and every item of the population. For example, while checking for polio immunization in a locality, each and every house will have to be checked. The sampling technique refers to the collection of data from a part of the population. For example, while buying a bag of rice, it is enough to inspect just a handful of rice to take a decision. The choice between the census method and the sample method depends on the purpose of the survey, nature of the data, time, cost, degree of accuracy and availability of resources.

6. Choice of frame:

A’ frame’ refers to listing of all the units in the population under study. For example: the voters list of a locality, the entire structure of the survey is determined by the nature and accuracy of the frame.

7. Degree of accuracy:

The degree of accuracy has to be decided by the investigator before starting the survey. It is practically not possible to achieve 100% accuracy. Even if it were possible, it may not be worth the time and money. The degree of accuracy depends on the purpose of the survey.

8. Miscellaneous considerations:

Apart from considering the above stages it is also important to consider matters such as whether the survey is:

i. Official, Semi-official or Non-Official:

An official survey is conducted by or on behalf of the central or a state government. For example: the Census.

A semi-official survey is conducted by organisations which enjoy Government patronage. For example: the Indian Statistical Institute (ISI).

A non-official survey is conducted by individuals or private organisations. For example: research scholars.

ii. Confidential or non-confidential:

A confidential survey is one in which the details are kept confidential. For example: Central Bureau of Investigation (CBI).

A non-confidential survey is one in which the details are kept open for the general public.

iii. Regular or ad hoc:

A regular survey is one which is conducted at regular intervals of time. For example: the census.

An ad hoc survey is conducted as and when required and not necessarily on a regular basis. For example: random checking by income-tax officials.

iv. Initial or repetitive:

An initial survey is conducted for the first time by an investigator, so the plan for the survey has to be formulated.

A repetitive survey is conducted in continuation of a survey previously conducted, so there is no need to formulate the plan afresh. But modifications can be incorporated whenever necessary.

v. Direct or indirect:

A direct survey is conducted for quantitative characteristics. For example: heights, incomes, ages etc.

An indirect survey is conducted for qualitative characteristics. For example: intelligence, efficiency etc.

II. Executing the survey:

After planning the survey, putting the plan into action is called executing the survey. Different stages of executing the survey are:

1. Setting up an administrative organisation:

An administrative organisation has to be established, depending upon the purpose and scope of the study.

2. Design of forms:

A standard form should be designed for systematic collection and analysis of the data.

3. Selection, training and supervision of field investigators:

The success of the survey depends largely on the field investigators. So, they have to be carefully selected, depending on their knowledge, efficiency and character. They should be honest, sincere, hard-working and tactful. The selected investigators should be thoroughly trained. They should be given clear instructions regarding the purpose and scope of the survey.

4. Editing the field work:

After the field work is completed, supervisors collect all the relevant material recorded by the field investigators. It has to be checked for errors, inconsistencies, illegible writing, omissions etc.

5. Follow up of non-response:

In spite of best efforts, some respondents may not give the necessary information for some reason or other. In such cases, a list of non-respondents should be made so that they can be followed up to obtain the information.

6. Processing of data:

After the edited data have been collected, they have to be processed. The data have to be coded and fed into a computer for classification, tabulation and analysis.

7. Preparation of report:

The final step in executing the survey is to prepare a report.

Sample survey

In a sample survey, only a part of the population is considered. The results obtained from the sample can be used to estimate the corresponding values for the population. Suppose we require overall information, or information ‘on an average’, such as the average monthly income of postmasters in India. Then we can collect information from a sample of areas, say 50, selected randomly from all over India, and calculate the average monthly income.
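The idea of estimating a population average from a random sample can be sketched in Python. This is only an illustration under stated assumptions: the population of incomes is synthetic (generated by the program, not real data), each sampling unit is taken to be one postmaster, and the names `population`, `census_mean` and `sample_mean` are hypothetical.

```python
import random
import statistics

# Hypothetical population: monthly incomes (in rupees) of 5,000 postmasters.
# These figures are synthetic and serve only to illustrate the method.
rng = random.Random(42)
population = [round(rng.gauss(30000, 5000)) for _ in range(5000)]

# Census technique: enumerate every unit in the population.
census_mean = statistics.mean(population)

# Sample technique: draw 50 units at random, without replacement.
sample = rng.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"Census mean: {census_mean:.2f}")
print(f"Sample mean: {sample_mean:.2f}")
print(f"Difference : {abs(census_mean - sample_mean):.2f}")
```

The sample mean will usually be close to, but not exactly equal to, the census mean. That gap is the price paid for examining 50 units instead of 5,000.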

Advantages (merits):

1. It is less time-consuming as compared to the census survey.

2. It is a relatively inexpensive method.

3. It is easy to use this method as it involves training very few people.

4. The results are accurate and generally reliable.

5. It is possible to collect detailed information.

6. The sample method can be applied wherever it is impractical to use the census method.

7. It can be used to check the accuracy of the census method.

Limitations (demerits):

1. The sample method must be carefully planned and executed. Otherwise, it leads to inaccuracy.

2. It must be carefully organised, so the cost per unit is greater in this method than in the census method.

3. It requires the services of experts. Otherwise the results will not be reliable.

4. It cannot be used whenever information on each and every unit of the population is required.

______________

What is a statistical series? (2 marks and 5 marks) (The underlined quote is for 2 marks and the entire answer is for 5 marks.)

Statistical series is closely associated with classification of data. According to L.R. Connor, “If two variable quantities can be arranged side by side so that the measurable differences in the one correspond to the measurable differences in the other, the result is said to form a statistical series.”

Normally in statistical calculation we come across three types of statistical series. They are:

1. Individual Series

2. Discrete series

3. Continuous series

1. Individual series:

When any particular phenomenon is observed and the values recorded, they are known as individual observations. In individual observations frequencies are not noted. As an example consider the following observations of marks of 10 students.

Students:  1   2   3   4   5   6   7   8   9   10
Marks:     50  55  60  65  70  75  80  85  90  95

2. Discrete series:

Discrete series relate to items which cannot be expressed in fractions. For example: number of schools in a town, number of students in a college, number of hospitals in a district etc. These data are tabulated with frequencies. Consider the following example:

Number of children in the family    Number of families
2                                   10
3                                   25
4                                   30
5                                   35
Total                               100
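A discrete series is obtained by counting how often each distinct value occurs in the raw observations. A minimal Python sketch follows. The list `observations` is constructed here to match the table above (an assumption made for illustration; in a real survey the values would arrive one family at a time, in no particular order).

```python
from collections import Counter

# Raw observations: number of children in each of 100 families.
# Built to reproduce the table above; the variable name is hypothetical.
observations = [2] * 10 + [3] * 25 + [4] * 30 + [5] * 35

# Count how many times each distinct value occurs.
frequencies = Counter(observations)

print("Children  Families")
for value in sorted(frequencies):
    print(f"{value:>8}  {frequencies[value]:>8}")
print(f"{'Total':>8}  {sum(frequencies.values()):>8}")
```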

3. Continuous series:

When the items can take decimal (fractional) values, the series is known as a continuous series. For example: data relating to the distance from one place to another, incomes of individuals etc. After tabulation, these data appear in the form of class intervals and frequencies.

Consider the following example:

Age of persons in years    No. of persons
(class intervals)          (frequencies)
0 - 10                     25
10 - 20                    30
20 - 30                    75
30 - 40                    30
40 - 50                    40
Total (Σf)                 200

Data relating to two hundred persons are tabulated in the above table. There are 25 persons in the age group 0 to 10. This class interval may include persons aged 9 years, 3 months and 10 days, or 8 years and 2 months, and so on. For the purpose of statistical analysis, continuous data are presented in the form of class intervals and frequencies.
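Grouping continuous values into class intervals can be sketched as below. The sketch assumes the exclusive convention (lower limit included, upper limit excluded), which the text does not state explicitly. The ages are hypothetical values chosen for illustration, and `class_interval` is a helper name introduced here.

```python
from collections import Counter

def class_interval(age, width=10):
    """Return the (lower, upper) class interval containing age.

    Assumption: the lower limit is included and the upper limit is
    excluded, so an age of exactly 10 falls in 10-20, not 0-10.
    """
    lower = int(age // width) * width
    return (lower, lower + width)

# Hypothetical ages in years (fractions allowed); not data from the text.
# 9.27 is roughly 9 years 3 months 10 days; 8.17 is roughly 8 years 2 months.
ages = [9.27, 8.17, 3.5, 10.0, 19.99, 24.0, 31.4, 45.8]

# Tally how many ages fall into each class interval.
table = Counter(class_interval(a) for a in ages)

for (lower, upper), freq in sorted(table.items()):
    print(f"{lower:>2} - {upper:<2}  {freq}")
```

Both 9.27 and 8.17 land in the class 0 - 10, which is why the exact fractional ages are no longer visible once the data are tabulated.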

(To answer the question "Define statistical series" for 5 marks: first define statistics, collection of data and classification; then define statistical series; then write about individual, discrete and continuous series.

If the question is asked for 10 marks, write the same answer in a more detailed manner.)

_______________________________
