Management for All: DATA ANALYSIS IN MARKETING RESEARCH

Sunday, December 8, 2013


MARKETING RESEARCH PROCEDURE

Marketing research is undertaken to improve the understanding of a marketing situation or problem and, consequently, the quality of decision-making related to it. The usefulness of the marketing research output depends upon the way the research has been designed and implemented at each stage of the process. There are five steps in every marketing research process:

A) PROBLEM DEFINITION

B) RESEARCH DESIGN

C) FIELD WORK

D) DATA ANALYSIS

E) REPORT PRESENTATION AND IMPLEMENTATION

D) DATA ANALYSIS:

After you have collected the data, you need to process, organise and arrange it in a format that makes it easy to understand and directly helps the decision-making process. Raw data has to be processed and analysed to obtain information. There are three phases for analysing the data:

a) Classifying the raw data in a more orderly manner;

b) Summarising the data;

c) Applying analytical methods to manipulate the data to highlight their inter-relationship and quantitative significance.

a) Classifying the raw data: The most commonly used classifications in marketing research are quantitative, qualitative, chronological and geographical.

Quantitative: In this classification, data is classified by a numerical measure such as number of respondents in each market segment, number of years employed, number of family members, number of units consumed, number of brands stocked or some such numerical characteristic.

Qualitative: In this classification, the data is classified by some non-numerical attribute such as type of occupation, type of family structure (nuclear or joint family), type of retail outlet (speciality, general merchant, department store, etc.).

Chronological classification is that in which data is classified according to the time when the event occurred.

In the geographical classification the data is classified by location which may either be a country, state, region, city, village, etc.

b) Summarising the data: The first step in summarising the data is tabulation. Individual observations are placed in the class in which they occur and then counted. Thus we know the number of times, or the frequency, with which a particular value occurs. Such tabulation leads to a frequency distribution as illustrated in Table 1.

The frequency distribution may involve a single variable, as in Table 1, or it may involve two or more variables, in which case it is known as cross-classification or cross-tabulation.
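As a minimal sketch of tabulation, the snippet below counts a single variable and then cross-tabulates two variables. The survey records are hypothetical and are not the data behind Table 1.

```python
from collections import Counter

# Hypothetical survey records: (occupation, preferred type of retail outlet).
responses = [
    ("salaried", "department store"),
    ("salaried", "general merchant"),
    ("self-employed", "general merchant"),
    ("salaried", "department store"),
    ("self-employed", "speciality"),
    ("self-employed", "general merchant"),
]

# Single-variable frequency distribution: how often each outlet type occurs.
outlet_freq = Counter(outlet for _, outlet in responses)

# Cross-tabulation: how often each (occupation, outlet) combination occurs.
cross_tab = Counter(responses)

for outlet, count in outlet_freq.most_common():
    print(f"{outlet:18s} {count}")

for (occupation, outlet), count in sorted(cross_tab.items()):
    print(f"{occupation:14s} {outlet:18s} {count}")
```

Counting pairs rather than single values is all that separates the cross-tabulation from the simple frequency distribution.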

The frequency distribution by itself may not yield any specific result or inference. What we want is a single, condensed, representative figure which will help us make useful inferences about the data and also provide a yardstick for comparing different sets of data. Measures of average or central tendency provide one such yardstick. The three types of averages are the mode, median and mean.

Mode: The mode is the value or item that occurs most frequently. When the data is organised as a frequency distribution, the mode is the category which has the maximum number of observations (the 121 - 140 category in Table 1). A shopkeeper ordering fresh stock of shoes for the season would make use of the mode to determine the size which is most frequently sold. The advantages of the mode are that it is easy to compute, is not affected by extreme values in the frequency distribution and is representative if the observations are clustered at one particular value or class.

Median: The median is the item which lies exactly half-way through the data when it is arranged in ascending or descending order. It depends on the position of the observations rather than on their magnitude, so extreme values do not affect it. Suppose you have data on the monthly income of households in a particular area. The median value would give you the monthly income which divides the households into two equal parts: fifty per cent of all households have a monthly income above the median and fifty per cent have a monthly income below it.
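The household income illustration can be sketched as follows. The income figures are hypothetical and chosen only to show that the median splits the households in half and does not move when the highest income becomes far more extreme.

```python
from statistics import median

# Hypothetical monthly household incomes (any currency unit).
incomes = [4000, 5500, 6000, 7200, 9000, 12000, 15000]

mid = median(incomes)  # middle item of the sorted data
below = sum(1 for x in incomes if x < mid)
above = sum(1 for x in incomes if x > mid)

# Replace the highest income with a much more extreme value.
with_outlier = incomes[:-1] + [150000]

print(mid, below, above)
print(median(with_outlier))  # same median as before
```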

Mean: The mean is the common arithmetic average. It is computed by dividing the sum of the values of the observations by the number of items observed. Suppose a firm wants to introduce a new packing of sliced bread aimed at the customer segment of small nuclear families of four members. It wishes to introduce the concept of a 'single-day pack', i.e., a pack which contains only the number of bread slices that is usually eaten in a single day. This strategy would help to keep the price of the pack well within the family's limited budget. The firm has many opinions on the ideal number of slices that the pack should contain, ranging from three to as high as twelve. The firm decides to hire a professional marketing agency to conduct market research and recommend the number of bread slices it should pack.

The research agency goes about the task in two steps. In the first step, it randomly chooses five families (who are consumers of bread) in each of the four colonies in the city. These families are asked to maintain, for one week, a record of the exact number of slices they consume each day. From this data, the agency calculates the average (or mean) number of bread slices eaten per family per day. There would be twenty such mean values (5 families in each of 4 colonies; sample size 20). In the second step, the modal value of these means provides the answer to the number of bread slices to be packed in each pack.

The mode in this frequency distribution is 8. Eight slices is the most commonly occurring consumption pattern. The agency's recommendation is to pack eight bread slices in the single-day pack.
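The two steps can be sketched in a few lines. The weekly totals below are hypothetical figures invented for illustration (the agency's actual records are not reproduced here), and rounding each mean to a whole slice is an assumption made so that a mode can be taken.

```python
from statistics import mode

DAYS = 7

# Hypothetical weekly totals of slices eaten by 5 families in each of 4 colonies.
weekly_slices = {
    "colony_1": [56, 55, 42, 57, 63],
    "colony_2": [49, 56, 35, 58, 70],
    "colony_3": [54, 21, 56, 48, 62],
    "colony_4": [56, 28, 57, 50, 84],
}

# Step 1: mean slices per family per day, rounded to a whole slice (assumption).
daily_means = [
    round(total / DAYS)
    for totals in weekly_slices.values()
    for total in totals
]

# Step 2: the modal value of the twenty means is the suggested pack size.
pack_size = mode(daily_means)

print(len(daily_means), pack_size)  # 20 families, pack size 8 for this data
```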

The mean, mode and median are measures of central tendency or average. They measure the most typical value around which most values in the distribution tend to converge. However, there are always extreme values in each distribution. These extreme values indicate the spread or the dispersion of the distribution. To make a valid marketing decision you need not only the measures of central tendency but also relevant measures of dispersion. Measures of dispersion tell you how widely the values are spread around the mean, median or mode. If the number of observations at the extreme values is large enough to form a substantial group, it indicates an opportunity for market segmentation. In the earlier example of bread, if in a larger sample you find that the number of households who consume three slices per day is also substantially large, the firm may find it worthwhile to introduce a 3-slice pack for light bread consumers. Such variations from the central tendency can be found by using measures of dispersion. The commonly used measures of dispersion are the range, the variance and the standard deviation.

Range: The range is the difference between the largest and smallest observed value. Using the data in step I in the bread illustration, the largest observed value is 6 and the smallest observed value is 2, therefore the range is 4. The smaller the range, the more compact and homogeneous is the distribution.

Variance and standard deviation: These two measures of dispersion are based on the deviations from the mean. The variance is the average of the squared deviations of the observed values from the mean of the distribution. The standard deviation is the square root of the variance. The standard deviation is used to compare two samples which have the same mean. The distribution with the smaller standard deviation is more homogeneous.
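A short sketch of these measures, using two hypothetical samples that share the same mean. It follows the definition given above (the average of the squared deviations), which Python's standard library calls the population variance; the sample variance, which divides by n - 1 instead of n, is a separate function and is not used here.

```python
from statistics import mean, pstdev, pvariance

def value_range(values):
    """Difference between the largest and smallest observed value."""
    return max(values) - min(values)

# Two hypothetical samples of daily slice consumption with the same mean.
sample_a = [6, 7, 8, 9, 10]
sample_b = [2, 5, 8, 11, 14]

for name, sample in (("A", sample_a), ("B", sample_b)):
    print(
        name,
        "mean:", mean(sample),
        "range:", value_range(sample),
        "variance:", pvariance(sample),   # average squared deviation from the mean
        "std dev:", round(pstdev(sample), 3),
    )
```

Both samples have a mean of 8, but sample A has the smaller standard deviation and is therefore the more homogeneous of the two.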

c) Selecting analytical methods: Besides having a summary of the data, the marketing manager would also like information on the inter-relationships between variables and the qualitative aspects of the variables.

Correlation: The correlation coefficient measures the degree to which change in one variable (the dependent variable) is associated with change in the other variable (the independent one). As a marketing manager, you would like to know if there is any relation between the amount of money you spend on advertising and the sales you achieve. Sales are the dependent variable and the advertising budget is the independent variable. The correlation coefficient, in this case, would tell you the extent of the relationship between these two variables: whether the relationship is direct (an increase or decrease in advertising is associated with an increase or decrease in sales), inverse (an increase in advertising is associated with a decrease in sales and vice versa), or absent. However, it is important to note that the correlation coefficient does not indicate a causal relationship. Sales are not a direct result of advertising alone; many other factors affect sales. Correlation only indicates that there is some kind of association - whether it is causal or merely coincidental can be determined only after further investigation. You may find a correlation between the height of your salesmen and the sales, but obviously it is of no significance.

Regression Analysis: For determining the causal relationship between two variables you may use regression analysis. Using this technique you can predict the dependent variable on the basis of the independent variables. In 1970, NCAER (National Council of Applied Economic Research) predicted the annual stock of scooters using a regression model in which real personal disposable income and the relative weighted price index of scooters were used as independent variables.
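To make the advertising and sales illustration concrete, here is a minimal sketch that computes the Pearson correlation coefficient and fits a least-squares line by hand. The figures are hypothetical, and the function names `pearson_r` and `fit_line` are chosen for this example only.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return slope, my - slope * mx

# Hypothetical data: advertising budget (independent) and sales (dependent).
advertising = [10, 20, 30, 40, 50]
sales = [120, 150, 200, 230, 300]

r = pearson_r(advertising, sales)
slope, intercept = fit_line(advertising, sales)
predicted_sales = intercept + slope * 60  # prediction for a budget of 60

print(round(r, 3), round(slope, 2), round(intercept, 2), round(predicted_sales, 1))
```

A coefficient close to +1 indicates a strong direct association in this made-up data; as noted above, it says nothing by itself about whether advertising causes the sales.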

So far we have considered the relationship between only two variables, for which correlation and regression analysis are suitable techniques. But in reality you would rarely find a one-to-one causal relationship; rather, you would find that the dependent variable is affected by a number of independent variables. Sales are affected by the advertising budget, the media plan, the content of the advertisements, the number of salesmen, the price of the product, the efficiency of the distribution network and a host of other variables. For determining causal relationships involving more than two variables, multivariate statistical techniques are applicable. The most important of these are multiple regression analysis, discriminant analysis and factor analysis.

Multiple regression analysis is a variation of the regression analysis technique discussed above. The difference is that instead of one independent variable, you consider two or more.
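The sketch below fits a multiple regression with two independent variables by solving the least-squares normal equations in plain Python. Everything in it is illustrative: the data are hypothetical and were constructed from a known relationship (sales = 50 + 3 x advertising + 10 x salesmen) so that the recovered coefficients can be checked, and the helper names are assumptions of this example. In practice a numerical library would normally be used instead of a hand-written solver.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x

def multiple_regression(rows, y):
    """Return [intercept, b1, b2, ...] minimising the squared prediction error."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)

def predict(coeffs, row):
    """Predicted dependent value for one row of independent values."""
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], row))

# Hypothetical observations: (advertising budget, number of salesmen) -> sales.
observations = [(10, 2), (20, 3), (30, 3), (40, 5), (50, 4), (60, 6)]
sales = [100, 140, 170, 220, 240, 290]

coeffs = multiple_regression(observations, sales)
print([round(c, 3) for c in coeffs])
print(round(predict(coeffs, (70, 5)), 1))
```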

Discriminant analysis: In our discussion of dependent and independent variables, we have so far taken sales as the dependent variable. Sales are expressed in numerical form. But not all dependent marketing variables can be expressed in numbers. Suppose you want to find out the reasons for customers' brand preference for Thums Up vs. Coca-Cola. In this case, the dependent variable, the brand, is not numerical in nature. Discriminant analysis is used when the dependent variable is categorical in this way. Similarly, a company planning to introduce a new brand of detergent bar in the market may want to find out the consumer traits associated with detergent bars as compared to detergent powder. This information would help the company focus its advertising strategy to exploit such associated traits. Several studies aiming to discriminate between users and non-users of a particular brand of a product have been carried out. In one such study for a popular brand of shirt, it was found that significant differences in personality traits could discriminate between users and non-users.

Factor analysis: The multiple regression technique is based on the idea that you use truly independent variables. These variables are neither influenced by the dependent variable nor by other independent variables. But in real-life situations, there are many independent variables which are influenced by other independent variables, i.e. these independent variables have a high inter-correlation. You may find such an inter-correlation between the dealer discount structure and the 'push' which the dealer provides to your product. Factor analysis is a statistical procedure which tries to determine a few basic factors that may underlie and explain the inter-correlation among a large number of variables.

Statistical inference: These procedures involve the use of sample data to make inferences about the population. The three approaches used here are: estimates of population values, hypotheses about population values and tests of association between values in the population. Statistical inference as an analytical tool for marketing decisions is gaining wide acceptance.
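As one example of the first approach (estimating a population value from a sample), the sketch below computes an approximate confidence interval for a population mean. The sample is hypothetical, the function name is an assumption of this example, and the normal approximation used here is only reasonable for fairly large samples; small samples are usually handled with the t-distribution instead.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for the population mean (normal approx.)."""
    n = len(sample)
    centre = mean(sample)
    std_error = stdev(sample) / sqrt(n)             # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95 per cent
    return centre - z * std_error, centre + z * std_error

# Hypothetical sample: daily bread slices eaten by 32 households.
sample = [8, 7, 9, 8, 6, 8, 10, 8, 7, 8, 9, 8, 5, 8, 7, 8,
          8, 9, 6, 8, 7, 8, 8, 10, 8, 7, 9, 8, 8, 6, 8, 7]

low, high = mean_confidence_interval(sample)
print(round(mean(sample), 2), round(low, 2), round(high, 2))
```

The interval is a statement about the population mean, not about individual households: a narrower interval means the sample pins the population value down more tightly.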
