SPSS: Descriptive Statistics and Charts

Last week in the RPD6380 class, we talked about how to enter data into an Excel spreadsheet and then open it in SPSS.

As a recap:

  • Create variable names at the top of each column in Excel to match your variables/questions.  Use these best practices for naming your variables:
    • Keep it short (SPSS accepts names up to 64 bytes, but shorter names are easier to work with)
    • Start with a letter – can contain numbers
    • NO funny characters – %, $, #, etc.
    • NO blank spaces – use an underscore (_) instead if you want a separator
  • Save your file in Excel
  • To open in SPSS
    • File
    • Open Data
    • Navigate to where you saved your file
    • Change File Type to Excel
    • Select your file and click Open
    • Answer the questions in the dialogue box
  • Make sure you save your data in SPSS
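
The naming guidelines in the recap can also be checked before the file ever reaches SPSS. The following is a minimal Python sketch, not an SPSS feature: `is_clean_name` is a hypothetical helper, and the length limit `MAX_LEN` is a value chosen for this example.

```python
import re

# Assumption for this sketch: a letter first, then letters, digits, or underscores.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
MAX_LEN = 32  # chosen limit for this sketch; adjust to suit your project


def is_clean_name(name, max_len=MAX_LEN):
    """Return True if `name` follows the naming guidelines in the recap."""
    return len(name) <= max_len and NAME_PATTERN.fullmatch(name) is not None


# Hypothetical column headers from an Excel sheet.
for candidate in ["job_sat", "2nd_income", "income $", "marital status"]:
    print(candidate, "->", is_clean_name(candidate))
```

Only the first header passes: the others start with a digit, contain a special character, or contain a blank space.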

Variable Labels

In the Variable View of SPSS, take the time to fill in the Labels for each variable.  This way you won’t have to remember what those shortened variable names are in a couple of months or years.

Value Labels

We worked through an exercise where we coded some of our data.  Males/Females were m/f or 1/2.  Be sure to add all these labels in the Values section of the Variable View in SPSS. The time you spend doing this at the start of your research will save you a LOT of time when you do your analysis.
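
Conceptually, a value label is just a lookup from a code to a readable name. A small Python sketch of that idea follows; the coded responses are invented, and the mapping 1 = Male, 2 = Female is an assumption based on the order used above.

```python
# Assumed coding for this sketch: 1 = Male, 2 = Female.
sex_labels = {1: "Male", 2: "Female"}

# Invented coded responses, as they might be typed into the spreadsheet.
coded = [1, 2, 2, 1, 2]

# Replace each code with its label, which is what SPSS shows in output tables.
labelled = [sex_labels[code] for code in coded]
print(labelled)
```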

Missing Values

Be sure to add any missing codes to the Missing column in the Variable View.
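
To see why declaring missing codes matters, here is a short Python sketch. The ages are invented, and the code 99 for "no answer" is an assumption made only for this illustration.

```python
# Hypothetical: 99 was typed into the spreadsheet to mean "no answer".
MISSING_CODES = {99}

ages = [23, 35, 99, 41, 99, 29]

# Keep only the real answers before computing anything.
valid_ages = [age for age in ages if age not in MISSING_CODES]

naive_mean = sum(ages) / len(ages)              # treats 99 as a real age
valid_mean = sum(valid_ages) / len(valid_ages)  # ignores the missing code

print(f"Mean with 99 counted as an age: {naive_mean:.1f}")
print(f"Mean with 99 treated as missing: {valid_mean:.1f}")
```

If the code is not declared as missing, it is treated as a real value and distorts the statistics.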

Descriptive Statistics

At the beginning of any statistical analysis, learning more about your data is a great place to start.  Descriptive statistics are essentially that – they describe your data, or they summarize your data to give you a good, solid base understanding of what you have collected.  The type of descriptive statistics you will conduct will depend on the type of variable you have.  Remember the 3 types of variables that SPSS distinguishes between?

  • Scale – a continuous piece of information, also referred to as Interval or Ratio.  Examples: age, weight, height
  • Nominal – a categorical piece of data – there is NO relationship between the categories.  Examples:  religion, colour, gender
  • Ordinal – a categorical piece of data – this time there is a relationship or order to the categories.  Examples:  year of study, age group, Likert scales

Each of these data types will use a different type of descriptive statistic.  For instance, calculating the mean of colour makes no sense at all, but a frequency count of colour does work.

Frequency

To calculate the frequency of a categorical variable (nominal OR ordinal) in SPSS:

  • Analyze
  • Descriptive Statistics
  • Frequencies
    • Select the variables in question and drag to the right hand side
      • As an example, select Income Category
    • Click OK to run

You should now have a frequency table of the variable, Income Category.

The table lists the categories of the variable, in this case: Below $25; $25-$49; $50-$74; $75+.  If you had not provided the value labels, you would see 1; 2; 3; 4 as the categories with no explanation as to what they represent.

The table has four columns: Frequency – the actual count of observations in each category; Percent – the count as a percentage of all observations, including missing ones; Valid Percent – the count as a percentage of the observations that have a value for Income Category, so it differs from Percent only when there are missing observations; Cumulative Percent – a running total of the Valid Percent column.
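
To make the columns of the frequency table concrete, here is a minimal Python sketch (not SPSS syntax) that builds the same kind of table by hand. The responses are invented for illustration, and `frequency_table` is a hypothetical helper.

```python
from collections import Counter

# Invented responses; None marks a missing observation.
responses = ["Below $25", "$25-$49", "$25-$49", "$50-$74", None,
             "$75+", "Below $25", "$25-$49", None, "$50-$74"]
categories = ["Below $25", "$25-$49", "$50-$74", "$75+"]


def frequency_table(values, ordered_categories):
    """Return rows of (category, frequency, percent, valid %, cumulative %)."""
    counts = Counter(v for v in values if v is not None)
    total = len(values)           # all observations, missing included
    valid = sum(counts.values())  # observations that have a value
    rows = []
    cumulative = 0.0
    for category in ordered_categories:
        n = counts[category]
        percent = 100 * n / total
        valid_percent = 100 * n / valid
        cumulative += valid_percent  # running total of the valid percent
        rows.append((category, n, percent, valid_percent, cumulative))
    return rows


rows = frequency_table(responses, categories)
for category, n, percent, valid_percent, cumulative in rows:
    print(f"{category:<10} {n:>3} {percent:>6.1f} {valid_percent:>6.1f} {cumulative:>6.1f}")
```

Because two of the ten invented responses are missing, Percent and Valid Percent differ in every row.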

Try:

  • Run the Frequency procedure on the variable called Internet
  • Can you describe what you see?

Mode

Mode is the value in the data that appears most often.  So let’s switch variables and run a frequency on the variable Job Satisfaction.  When you run the frequency you have a table that shows you how many people answered each of the 5 levels of this Likert scale:

  • Highly dissatisfied = 1109
  • Somewhat dissatisfied = 1268
  • Neutral = 1393
  • Somewhat satisfied = 1406
  • Highly satisfied = 1224

By looking at these results I can see that Somewhat satisfied appears to be the category that people selected the most.  But let’s get SPSS to do the hard work for us and confirm whether this is correct or not.

To obtain the MODE of a variable:

  • Analyze
  • Descriptive Statistics
  • Frequencies
    • Select the variables in question and drag to the right hand side
      • As an example, select Job Satisfaction
    • Click on the Statistics button on the right
      • Select Mode
      • Click Continue
      • Click OK

You should now see the Mode in the first table of the Frequency output.
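
The same check can be sketched in a few lines of Python, using the Job Satisfaction counts listed above. This is only an illustration of what "mode" means, not what SPSS runs internally.

```python
# Frequency counts for Job Satisfaction, copied from the table above.
counts = {
    "Highly dissatisfied": 1109,
    "Somewhat dissatisfied": 1268,
    "Neutral": 1393,
    "Somewhat satisfied": 1406,
    "Highly satisfied": 1224,
}

# The mode is the category with the largest count.
mode_category = max(counts, key=counts.get)
print(mode_category, counts[mode_category])
```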

Try:

  1. What is the mode?
  2. Would you calculate the MODE on a variable such as income?  Why or why not?

Median

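The median of a variable is the middle value once the observations are sorted from lowest to highest.  With an odd number of observations, it is the single middle value; with an even number, it is the average of the two middle values.  For an ordinal variable such as a Likert scale, the median is the category that contains the middle observation.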

To obtain the MEDIAN in SPSS, follow the same instructions as the MODE, but select the MEDIAN in the Statistics dialogue box.
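
As a small illustration of how a median is found for odd and even numbers of observations, here is a Python sketch using the standard `statistics` module. The coded responses (1 to 5) are invented.

```python
from statistics import median, median_low

# Invented ordinal codes (1 = lowest level, 5 = highest level).
odd_sample = [2, 4, 1, 5, 3]   # five observations
even_sample = [2, 4, 1, 3]     # four observations

# Odd count: the single middle value of the sorted data (1, 2, 3, 4, 5).
odd_median = median(odd_sample)

# Even count: the average of the two middle values of (1, 2, 3, 4).
even_median = median(even_sample)

# For coded categories, an in-between value may not correspond to any category,
# so one option is to report the lower of the two middle values instead.
even_median_low = median_low(even_sample)

print(odd_median, even_median, even_median_low)
```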

Try:

  1. What is the median for Job Satisfaction?
  2. What is the median value for Level of Education?

Mean

The mean or average is calculated on a scale variable or continuous variable.  It just doesn’t make sense to calculate the mean of a categorical variable.

To obtain the MEAN in SPSS:

  • Analyze
  • Descriptive Statistics
  • Descriptives
    • Select the variable in question and drag to the right hand side
      • Use income as an example
    • Click OK to run

You should now have a table with N, Minimum, Maximum, Mean, and Standard Deviation for the household income variable.  These are the default values you obtain when you run this analysis.  But, what happens if you want the Sum or the Standard Error of this variable?

  • Analyze
  • Descriptive Statistics
  • Descriptives
    • Select the variable in question and drag to the right hand side
    • Select the Options button – this will open another dialogue box that has a list of statistics to select from
      • Select Sum and S.E. mean (standard error of the mean)
    • Click Continue
    • Click OK to run

Your output table will now contain these added statistics.
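
To show how these statistics relate to one another, here is a minimal Python sketch with invented income values (in thousands). It uses the sample (n - 1) formulas for variance and standard deviation, and takes the standard error of the mean as the standard deviation divided by the square root of N.

```python
import math
from statistics import mean, stdev, variance

# Invented household incomes, in thousands.
incomes = [30.0, 45.0, 52.0, 61.0, 72.0]

n = len(incomes)
total = sum(incomes)          # Sum
average = mean(incomes)       # Mean
var = variance(incomes)       # sample variance (divides by n - 1)
sd = stdev(incomes)           # sample standard deviation
se_mean = sd / math.sqrt(n)   # standard error of the mean

print(f"N={n} Min={min(incomes)} Max={max(incomes)} Sum={total}")
print(f"Mean={average:.2f} Variance={var:.2f} SD={sd:.2f} SE={se_mean:.2f}")
```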

Try:

  1. Select another Scale variable from your dataset and calculate the mean, variance, and standard deviation.

Explore Function in SPSS

Sometimes you may want to determine the mean household income by marital status or by another categorical variable.  Until now, we’ve been looking at the entire dataset.  There are a few ways to do this, but the most direct way is to use the Explore function in SPSS.

  • Analyze
  • Descriptive Statistics
  • Explore
    • In the Dependent List box, add the variables for which you would like to calculate the means – for example:  household income
    • In the Factor List box, add the variable by which you would like to split the means – for example: marital status
    • Click OK to run.

You will now see a much larger table than we have seen to date.  SPSS provides you with a long list of descriptive statistics for household income by each level of marital status.

You will also see a Stem and Leaf plot along with a Boxplot to give you a sense of the distribution of the data.  This is more information to help you get a better feel for the data that you are working with.
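
The core idea behind Explore, splitting a scale variable by the levels of a categorical variable and summarizing each group, can be sketched in Python as follows. The records and the two marital status levels are invented, and only a few of the statistics from the Explore output are reproduced.

```python
from collections import defaultdict
from statistics import mean, median

# Invented (marital status, household income in thousands) records.
records = [
    ("Married", 60.0),
    ("Unmarried", 40.0),
    ("Married", 80.0),
    ("Unmarried", 50.0),
    ("Married", 70.0),
]

# Collect the incomes for each level of the factor variable.
groups = defaultdict(list)
for status, income in records:
    groups[status].append(income)

# Summarize the dependent variable separately within each group.
summary = {
    status: {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
    }
    for status, values in groups.items()
}

for status, stats in summary.items():
    print(status, stats)
```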

Summary

The common descriptive statistics include: frequency, median, mode, mean, and measures of variation (standard deviation, standard error, etc.).  Each of these statistics should be run on the appropriate type of data – keep in mind that a frequency on a variable such as age will give you a long table with little useful information.

Chart Builder in SPSS

Numbers and statistics can be fun, but putting those numbers into context with a chart or graph can help a broader audience understand them.  What do I mean by that?  How many of you will remember a number, versus how many of you will remember a graph that shows a trend?

Building charts in SPSS is quite straightforward and fun!  You’ll see!!

Let’s start by creating a barchart for our job satisfaction variable.  We want to see a bar for each level and we want to see the count.

In SPSS:

  • Graphs
  • Chart Builder – this will open a dialogue box
    • Notice on the bottom half – a gallery of all the different types of charts you can create in SPSS.
    • We want a simple barchart
      • Select bar
      • Then double-click on the first barchart listed
      • Once you do this you should see the skeleton of a bar chart appear in the top half of your dialogue box.
      • All you need to do now, is to drag and drop the variables where they are appropriate.
      • For this example:
        • Select Job Satisfaction and drag it to the x-axis
        • On the right, you may see an Element Properties dialogue box (if you do not see this – Click on the Element Properties button to open it).
        • Note that under Statistics, Count is selected – this is what we want.  But click on this to see what other statistics are available.
      • To create the graph Click OK

You should now see a very plain barchart that matches the Frequency counts we created earlier.

Let’s create a chart that shows the average income for each level of job satisfaction.  I’m curious to see whether the folks that are not satisfied with their job have a lower average income.

So, let’s start this again:

  • Graphs
  • Chart Builder – this will open a dialogue box
    • Select Barchart again
    • Drag and drop Job Satisfaction to the x-axis
    • Now drag and drop Household income to the y-axis
    • Notice how the Statistic changed to Mean.  This is what we want.
    • Let’s run it by clicking OK

Hmm…  now that’s an interesting graph!

One last piece is missing from this graph: error bars!  Whenever you have charts with means, you should ALWAYS provide some measure of variability.  So let’s add some error bars, and we’ll try standard error.

  • Graphs
  • Chart Builder – this will open a dialogue box
    • Select Barchart again
    • Drag and drop Job Satisfaction to the x-axis
    • Now drag and drop Household income to the y-axis
    • Ensure that the statistic is mean
    • Under the statistics box in the Element Properties box, check the Display Error Bars box
      • You now have a few options.  As stated above, let’s use standard error – select Standard Error
      • Click Apply
    • Click OK to run chart

Providing the error bars gives the reader a “fuller” picture of the data.  Although in this case it does not change the story!
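
For readers curious about what the bars and error bars represent numerically, here is a Python sketch with invented incomes for two satisfaction levels. `bar_with_error` is a hypothetical helper, and its `multiplier` argument (how many standard errors the bar extends on each side) is an assumption of this sketch, set to 1 here.

```python
import math
from statistics import mean, stdev

# Invented household incomes (in thousands) for two satisfaction levels.
income_by_satisfaction = {
    "Neutral": [40.0, 50.0, 60.0],
    "Somewhat satisfied": [55.0, 65.0, 75.0, 85.0],
}


def bar_with_error(values, multiplier=1.0):
    """Return (bar height, lower end, upper end) for a mean with SE error bars."""
    height = mean(values)
    std_error = stdev(values) / math.sqrt(len(values))
    return height, height - multiplier * std_error, height + multiplier * std_error


bars = {level: bar_with_error(values)
        for level, values in income_by_satisfaction.items()}

for level, (height, low, high) in bars.items():
    print(f"{level}: mean={height:.1f}, error bar from {low:.1f} to {high:.1f}")
```

The bar height is the group mean, and the error bar runs from the mean minus the standard error to the mean plus the standard error.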

Try:

  1. Create a barchart that shows the mean household income by job satisfaction for the 2 levels of marital status.  Be sure to include error bars.
  2. What question does this barchart answer?

More charts

I used the example of a barchart, but the more you use the Chart Builder, the more you will see how straightforward it is to create charts in SPSS.  Try playing around with a different chart and see what happens.

Summary

  • Barchart for counts
  • Barchart to show means of groups
  • Side-by-side barchart to show means of groups

 
