Computer application in Agricultural Economics Chapter 2
UNIT 2
2. Introduction to statistical package
2.1. What is a statistical package?
Before defining a statistical package, one needs to revisit the definition of
statistics and its functions. In this section, we highlight the kinds of problems that statistics
as a discipline addresses and the kind of data one gets for statistical applications.
The term “Statistics” is used to mean a “collection of numerical facts or data”. It is also used to
mean a “body of methods and techniques for analysing numerical data”. Statistical techniques
have many purposes, which include methods and procedures for summarising, simplifying,
reducing and presenting raw data. They are also used to make predictions, test hypotheses and infer
characteristics of a population from the characteristics of a sample. In other words, Statistics is
generally thought of as serving two functions. One is to describe sets of data; the other is to
help in drawing inferences. When you are studying only a sample, there is a possibility that your
assumptions may not be accurate, and you can never be certain that you have drawn the correct
inference. For this reason the inferential use of statistics may be thought of as helping you to
make decisions under conditions of uncertainty. It is different from guessing, because Statistics
also provides you with a method of estimating how reliable your conclusions are. With each
statistical statement that you make, you indicate the probability that findings like yours could
have been the result of chance factors.
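The two functions of statistics described above can be illustrated with a short Python sketch. The yield figures and group names are hypothetical, invented only for illustration, and the exact permutation test used here is just one simple way of estimating how likely a difference this large would be under chance alone; it is not a method prescribed by this chapter.

```python
from itertools import combinations
from statistics import mean, stdev

# Hypothetical maize yields (tonnes per hectare) for two small samples of farms.
fertilised = [3.1, 3.4, 3.6, 3.3, 3.8]
control = [2.5, 2.7, 2.9, 2.6, 3.0]

# Function 1: describe the data (mean and sample standard deviation per group).
summary = {
    "fertilised": (mean(fertilised), stdev(fertilised)),
    "control": (mean(control), stdev(control)),
}

# Function 2: draw an inference about the difference between the groups.
observed = mean(fertilised) - mean(control)


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means.

    Every way of splitting the pooled values into two groups of the
    original sizes is tried; the p-value is the share of splits whose
    absolute mean difference is at least as large as the observed one.
    """
    pooled = group_a + group_b
    indices = range(len(pooled))
    target = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    for chosen in combinations(indices, len(group_a)):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(a) - mean(b)) >= target - 1e-12:
            extreme += 1
    return extreme / total


p_value = permutation_p_value(fertilised, control)
reject_null = p_value < 0.05  # conventional 5% significance level

print(summary)
print(f"difference = {observed:.2f}, p = {p_value:.4f}, reject H0: {reject_null}")
```

The p-value is the "probability that findings like yours could have been the result of chance factors": the smaller it is, the less plausible it is that chance alone produced the observed difference.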
A statistical package is software for the collection, organisation, interpretation and
presentation of numerical information. The need for statistical packages has arisen because of
the complexity of the calculations involved in making inferences from data. Advances in
computing technology have made statistics a yet more powerful field, and the emergence of
statistical software has contributed enormously to the development of research
in the 21st century. The high premium placed on ICT by individuals, researchers and
organizations has made it a major driver of development in every nation. Statistical software (SS)
is a vital tool for research analysis, data validation and the reporting of findings. Over the course
of history, different methods of data analysis have been used. Initially, analysis was done with pen
and paper; later came punching machines, simple calculators and complex scientific calculators,
and eventually the computer. Statistical software is a program that makes the calculation and
presentation of statistics relatively easy. It allows researchers to avoid routine
mathematical mistakes and to produce accurate figures in their research, provided they enter all
the data correctly.
By kalkidan 2016 Page 1 of 9
The development of statistical software allows academic researchers to conduct quantitative
studies more easily. Many researchers, professionals, scientists and business managers can also
present predictions clearly and accurately using statistical software. Many proprietary
and free statistical software packages are available, suited to different statistical
analyses depending on the user's needs.
The emergence of statistical software in the twenty-first century has helped
researchers in the physical and social sciences to improve the quality of their research. Many
renowned researchers who have adopted this software for data analysis have recognised
its immense contribution to research findings. Quantitative research is difficult to carry out
effectively without statistical packages. Moreover, such software enables experts to group research
data for easy presentation. It helps professionals to interact with data, thereby paving the way for
creativity and innovation. Some packages have a user-friendly interface with drop-down tips for
beginners. Advances in technology have improved all our lives and have allowed experts to produce
results in the blink of an eye, where analysis once took a long time to finish. This
same technology has offered experts tremendous opportunities for research and keeps research
an interesting field of study.
2.2. Features of statistical software
Statistical software packages share some common characteristics that make them reliable and suitable
for data analysis:
1. The data editor is laid out in rows and columns, which makes it very easy to enter numeric data.
2. A menu bar provides drop-down menus, quick analysis options and a
brief user manual.
3. The statistical level of measurement is taken into account during data entry.
4. They follow the initial steps in a research project:
a. Getting your data ready to enter into the software.
b. Defining and labelling variables.
c. Entering data appropriately, with each row containing one case and each column one
variable.
d. Checking and cleaning the data.
All data should preferably be numeric; it is not desirable to
use letters or words (string variables) as data. This can be achieved by recoding the letters
or words (string data) into numeric codes and labelling them appropriately.
Data exploration can be done to check for errors and other inaccuracies.
By convention, the null hypothesis (H0) is rejected when the p-value is less than the chosen
significance level, commonly 0.05.
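The data-preparation steps listed above (one row per case, one column per variable, recoding string data into labelled numeric codes, and checking for errors) can be sketched in Python as follows. The records, the coding scheme and the helper names `recode_sex` and `find_out_of_range` are all hypothetical; a statistical package performs the same kind of work through its data editor and menus.

```python
# Hypothetical raw survey records: each dict is one case (row),
# and each key is one variable (column).
raw = [
    {"id": 1, "sex": "male", "age": 34},
    {"id": 2, "sex": "female", "age": 29},
    {"id": 3, "sex": "Female", "age": 290},  # likely a data-entry error
    {"id": 4, "sex": "male", "age": 41},
]

# Assumed coding scheme: numeric codes are stored, labels are kept separately.
SEX_CODES = {"male": 1, "female": 2}
SEX_LABELS = {1: "Male", 2: "Female"}


def recode_sex(value):
    """Recode a string value into its numeric code, ignoring case and spaces."""
    return SEX_CODES[value.strip().lower()]


def find_out_of_range(rows, variable, low, high):
    """Return the ids of cases whose value lies outside [low, high]."""
    return [row["id"] for row in rows if not low <= row[variable] <= high]


# Recode the string variable into numeric data.
coded = [{**row, "sex": recode_sex(row["sex"])} for row in raw]

# Data checking: flag implausible ages for review before any analysis.
suspect_ids = find_out_of_range(coded, "age", low=15, high=100)

for row in coded:
    print(row["id"], row["sex"], SEX_LABELS[row["sex"]], row["age"])
print("Cases to check:", suspect_ids)
```

Keeping the labels in a separate mapping mirrors the idea of value labels: the data stay numeric, while the output can still show readable category names.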
COMMON STATISTICAL SOFTWARE AND THEIR APPLICATION TO DATA ANALYSIS
Statistical Package for the Social Sciences (SPSS). SPSS (Statistical Package for the Social
Sciences, now Statistical Product and Service Solutions) is most widely used in social science
disciplines and courses. SPSS is one of the oldest statistical software programs: it was first made
available in the 1960s and has been redeveloped over the years. The latest version at the time of
writing is SPSS 28, which was released in 2021. Many sociologists, psychologists and social workers
use this program to enter their research data and produce results. Although SPSS is used more
widely in the social sciences than in other fields, many users find it easy to navigate, and
beginners in particular appreciate its ease of use. SPSS has a "point and click"
interface that allows you to use pull-down menus to select the commands that you wish to perform.
Odusina (2011) noted that working with SPSS demands some background knowledge of
statistics. There are slight variations between the different versions of SPSS (e.g. versions 10, 11,
12, 13, 14, 15, 16, etc.). SPSS assists the user in describing data, testing hypotheses and looking
for a correlation or relationship between two or more variables. SPSS is suitable for most
general statistical analyses (regression, logistic regression, survival analysis, analysis of
variance, factor analysis and multivariate analysis), but is generally considered less suitable for
time series analysis and multilevel regression analysis.
STATA:- The name Stata is a blend of the words "statistics" and "data"; it is not an acronym.
It should not be confused with the South Texas Art Therapy Association, an unrelated
organisation that shares the same abbreviation.
Stata is a powerful statistical package with smart data-management facilities, a wide array of
up-to-date statistical techniques, and an excellent system for producing publication-quality
graphs. The latest version of Stata at the time of writing was released in April 2021. It is a fast
and easy-to-use data management package. Stata is available for Windows, Unix, and Mac computers.
The standard
version is called Stata/IC (or Intercooled Stata) and can handle up to 2,047 variables. There is a
special edition called Stata/SE that can handle up to 32,766 variables (and also allows longer
string variables and larger matrices), and a version for multicore/multiprocessor computers
called Stata/MP, which has the same limits but is substantially faster. Stata performs most
general statistical analyses (regression, logistic regression, survival analysis, analysis of
variance, factor analysis, multivariate analysis and time series analysis).
Econometric Views (EViews):- EViews is a statistical package for Windows, used mainly for
time-series oriented econometric analysis. It was developed by Quantitative Micro Software
(QMS), which is now a part of IHS. Version 1.0 was released in March 1994 and replaced
MicroTSP. The current version at the time of writing is EViews 12, released in November 2020. EViews
can be used for general statistical analysis and econometric analyses, such as cross-section and
panel data analysis and time series estimation and forecasting. EViews relies heavily on a
proprietary and undocumented file format for data storage.
General description of the package
Advances in computing, especially the advent of the personal computer (PC), have made
computing accessible to ordinary users. Today one has computing power at hand, as one can easily
load software of one's choice or need onto a PC. There is an abundance of ready-made
computer packages available today, and one can find different statistical packages for
applications in different disciplines. We will describe those packages that are readily available,
popular and user friendly.
2.2.1. SPSS (Statistical Package for the Social Sciences)
SPSS (Statistical Package for the Social Sciences) has now been in development for more than
thirty years. Originally developed as a programming language for conducting statistical
analysis, it has grown into a complex and powerful application that now uses both a graphical
and a syntactical interface and provides dozens of functions for managing, analysing, and
presenting data.
Recall that when you open SPSS, a dialog box appears with the question "What would you
like to do?" This window allows the user to choose from a number of quick-start options, such
as loading an existing data file or opening a recently used file.
If you don't yet have an existing file, just click Cancel in the lower right of this start-up
dialog box. The box will close, leaving a blank Untitled - SPSS Data Editor window open.
When you use SPSS, you work in one of several windows: the data view, the variable view, the
output view and the draft output view. Eventually you’ll also use the syntax editor (think: code) to
save or refine your queries.
When you first open SPSS on your computer, you should see something that looks similar to
the following screenshot:
There are six different windows that can be opened when using SPSS. The following will give
a description of each of them.
The Data Editor:- The Data Editor is a spreadsheet in which you define your variables
and enter data. Each row corresponds to a case, while each column represents a variable.
The title bar displays the name of the open data file or "Untitled" if the file has not yet been
saved. This window opens automatically when SPSS is started.
The Output Navigator:- The Output Navigator window (also called the Output Viewer) displays the
statistical results, tables, and charts from the analysis you performed. An Output Navigator window
opens automatically when you run a procedure that generates output. In the Output Navigator
window, you can edit, move, delete and copy your results in a Windows Explorer-like
environment. Information from the Output Navigator is saved in a file with the extension .spo
(.spv in SPSS version 16 and later).
The Pivot Table Editor: - Output displayed in pivot tables can be modified in many ways
with the Pivot Table Editor. You can edit text, swap data in rows and columns, add color,
create multidimensional tables, and selectively hide and show results.
The Chart Editor: - You can modify and save high-resolution charts and plots by
invoking the Chart Editor for a certain chart (by double-clicking the chart) in an Output
Navigator window. You can change the colors, select different type fonts or sizes, switch
the horizontal and vertical axes, rotate 3-D scatterplots, and change the chart type.
The Text Output Editor: - Text output not displayed in pivot tables can be modified with
the Text Output Editor. You can edit the output and change font characteristics (type, style,
color, size).
The Syntax Editor: - SPSS has never lost its roots as a programming language. Although
most of your daily work will be done using the graphical interface, from time to time you’ll
want to make sure that you can exactly reproduce the steps involved in arriving at certain
conclusions. In other words, you’ll want to replicate your analysis. The best method of
preserving the exact steps of a particular analysis is the syntax view. You can paste your
dialog box selections into a Syntax Editor window, where your selections appear in the
form of command syntax.
SPSS Menus and Icons
File includes all of the options you typically use in other programs, such as open, save and exit.
Notice that you can open or create new files of multiple types from the submenus to the right of
the File tab.
Edit includes the typical cut, copy, and paste commands, and allows you to specify various
options for displaying data and output.
Click on Options, and you will see the dialog box to the left. You can use this to
format the data, output, charts, etc. These choices are rather overwhelming, and you
can simply take the default options for now.
View allows you to select which toolbars you want to show, select the font size, add or remove
the gridlines that separate each piece of data, and choose whether to display your
raw data or the data labels.
Data allows you to select several options ranging from displaying data that is sorted by a
specific variable to selecting certain cases for subsequent analyses.
Transform includes several options to change current variables. For example, you can
change continuous variables to categorical variables, change scores into rank scores, add a
constant to variables, etc.
Analyse includes all of the commands to carry out statistical analyses and to calculate
descriptive statistics. Much of this book will focus on using commands located in this
menu.
Graphs includes the commands to create various types of graphs including box plots,
histograms, line graphs, and bar charts.
Utilities allows you to list file information, which is a list of all variables with their labels,
values, locations in the data file, and types.
Add-ons are programs that can be added to the base SPSS package. You probably do not
have access to any of those.
Window can be used to select which window you want to view (i.e., Data Editor,
Output Viewer, or Syntax). Since we have a data file and an output file open, let’s try this.
Select Window/Data Editor. Then select Window/SPSS Viewer.
Help has many useful options including a link to the SPSS homepage, a statistics coach,
and a syntax guide. Using topics, you can use the index option to type in any key word and
get a list of options, or you can view the categories and subcategories available under
contents. This is an excellent tool and can be used to troubleshoot most problems.
The Icons directly under the Menu bar provide shortcuts to many common commands that are
available in specific menus. Take a moment to review these as well. Place your cursor over the
Icons for a few seconds, and a description of the underlying command will appear. For
example, this icon is the shortcut for Save. Review the others yourself.
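As a rough illustration of the three Transform operations mentioned above (changing a continuous variable into a categorical one, changing scores into rank scores, and adding a constant), here is a small Python sketch. The ages, the cut-off points and the function names are assumptions made for the example; SPSS carries out these operations through the Transform menu rather than through code like this.

```python
# Hypothetical continuous variable: ages of five respondents.
ages = [25, 47, 33, 61, 33]


def to_age_group(age):
    """Recode a continuous age into a categorical code.

    Assumed cut-offs: 1 = under 30, 2 = 30 to 49, 3 = 50 and over.
    """
    if age < 30:
        return 1
    if age < 50:
        return 2
    return 3


def rank_scores(values):
    """Return rank scores (1 = smallest); tied values share their mean rank."""
    positions = {}
    for position, value in enumerate(sorted(values), start=1):
        positions.setdefault(value, []).append(position)
    return [sum(positions[v]) / len(positions[v]) for v in values]


age_groups = [to_age_group(age) for age in ages]  # continuous -> categorical
age_ranks = rank_scores(ages)                     # scores -> rank scores
ages_plus_ten = [age + 10 for age in ages]        # add a constant

print(age_groups)
print(age_ranks)
print(ages_plus_ten)
```

In each case the original variable is left unchanged and the result is stored as a new variable, which is usually the safer habit when transforming data.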
Data editor window
This is the window where you can see your data, and information about the variables in your
dataset. It is also possible to change your data in this window, but I would strongly
recommend against ever changing your data that way because things can go terribly wrong and
you have no record of what you changed and how you changed it! There are two ‘views’ for
the Data Editor window:
i) Data view – you can see the actual data in your dataset for each record and each variable;
and
ii) Variable view – this gives a summary of each variable in your dataset, including the
variable name, type, various properties of the way in which the data are stored, any
label(s) for the variable itself and variable values (such as value labels for categories of
sex, which in the dataset may be represented as 1 and 2, relating to male and female).
It is important to note that in this view, the variables are listed in the rows (as
opposed to being shown in the columns when the Data View tab is active). In the
Variable View, instead of listing the variables in columns, characteristics of the
variables are indicated in these columns.
SPSS has rules for variable names (e.g., there can be no spaces, the variable name must
begin with a letter, and in older versions the maximum length is 8 characters; version 12 and
later allow longer names). Upper or lower case may be
used; SPSS doesn’t care which you use, so we have used lower case. Note that once a
name has been typed in, SPSS default options appear in the next three columns.
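The naming rules quoted above can be written as a small check. This Python sketch covers only the three rules stated in the text (no spaces, first character a letter, a maximum length); the function name `is_valid_name` is hypothetical, and the length limit is left as a parameter because it differs between SPSS versions. The full rules in SPSS are more detailed than this.

```python
def is_valid_name(name, max_length=8):
    """Check a variable name against the simplified rules given in the text."""
    if not name or len(name) > max_length:
        return False          # empty, or longer than the allowed length
    if " " in name:
        return False          # spaces are not allowed
    return name[0].isalpha()  # the first character must be a letter


# Hypothetical candidate names for a farm survey dataset.
for candidate in ["age", "farm size", "2yield", "household_income"]:
    print(candidate, is_valid_name(candidate))

# The same long name passes when a larger limit is assumed.
print(is_valid_name("household_income", max_length=64))
```

Checking names before data entry avoids having to rename variables later, when they may already appear in saved syntax.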
Introduction to Syntax
IBM SPSS Statistics has been easy to use with its point and click interface for several decades.
Before that, however, SPSS required users to write programs to conduct their analyses. In fact,
all the point and click interface does is create syntax for you that is run. Using syntax can be
very useful if you want to save an analysis and run it later. Understanding it is an important
part of being able to understand your output as well. This text will not cover how to create
SPSS syntax from scratch. Rather, it will simply cover how to get SPSS to create it for you,
and how to read the syntax it creates.
Having SPSS Create Syntax
Most SPSS Dialog Boxes will have a Paste option in them. Clicking Paste instead of OK will
put the relevant SPSS Syntax in a Syntax Window.
SPSS Syntax Basics
The syntax window is highlighted to help you understand what the commands are asking SPSS
to do. In the example above, the command that was run was the Frequencies command. It was
run on the variable TIME.
Many SPSS commands have options that are added to them but are not necessarily important at
the level of this text; the /ORDER=ANALYSIS option above is one example.
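To make clear what a Frequencies command run on a variable such as TIME reports, the following Python sketch builds a comparable table of counts, percentages and cumulative percentages. The values of TIME are hypothetical, and the function `frequency_table` is an illustration of the calculation only; it does not reproduce the layout of SPSS output.

```python
from collections import Counter

# Hypothetical values of the variable TIME for eight cases.
time = [1, 2, 2, 3, 3, 3, 2, 1]


def frequency_table(values):
    """Return (value, count, percent, cumulative percent) for each distinct value."""
    counts = Counter(values)
    n = len(values)
    rows = []
    cumulative = 0.0
    for value in sorted(counts):
        percent = 100.0 * counts[value] / n
        cumulative += percent
        rows.append((value, counts[value], percent, cumulative))
    return rows


table = frequency_table(time)
for value, count, percent, cumulative in table:
    print(f"{value:>5} {count:>5} {percent:>7.1f} {cumulative:>7.1f}")
```

Each row corresponds to one distinct value of the variable, so the counts add up to the number of cases and the last cumulative percentage is 100.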
SPSS Syntax Format and SPSS Help
SPSS syntax is written like sentences. In fact, each one ends with a period (“.”). The number of
lines it is on does not matter – so it is often best to use several lines to make it easier to read.
Rather than provide extensive coverage in this text, I suggest that if you are interested in SPSS
Syntax, you use the excellent help system provided within SPSS itself.
If you want to write a comment, or any sort of text that is NOT an SPSS command (telling
SPSS to do something), then you need to preface your text with an asterisk (*) and end it with a
full stop. When reading your syntax, SPSS will ignore any text between the asterisk and the full stop.
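The formatting rules just described (a command may run over several lines, each command ends with a period, and a comment starts with an asterisk) can be illustrated with a toy reader written in Python. This is a deliberately simplified sketch, not how SPSS itself parses syntax; the functions `split_commands` and `drop_comments` are hypothetical, and the sample syntax is the Frequencies command on TIME discussed earlier, preceded by a comment.

```python
# A small piece of SPSS syntax held as text: one comment and one command
# that is spread over two lines and ends with a period.
SYNTAX = """\
* Frequency table for the variable TIME.
FREQUENCIES VARIABLES=TIME
  /ORDER=ANALYSIS.
"""


def split_commands(text):
    """Join lines into commands, ending a command at a line that ends with '.'."""
    commands = []
    current = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # this toy reader simply skips blank lines
        current.append(stripped)
        if stripped.endswith("."):
            commands.append(" ".join(current))
            current = []
    return commands


def drop_comments(commands):
    """Remove commands that start with an asterisk, i.e. comments."""
    return [command for command in commands if not command.startswith("*")]


all_commands = split_commands(SYNTAX)
runnable = drop_comments(all_commands)

print(all_commands)
print(runnable)
```

The point of the sketch is that line breaks carry no meaning on their own: the two-line command is read as a single instruction because only its last line ends with a period.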
Running SPSS Syntax
Whether SPSS creates syntax for you, whether you create it yourself, or whether you modify
syntax that SPSS has created, at some point you are going to want to use that syntax. From a
Syntax Window, simply select Run and then All. Alternatively, you can select just some of the
text in the window with your mouse and then click Selection instead of All.
Saving and Loading SPSS Syntax
Syntax files have the extension “.sps” and can be saved and loaded just like other files. To load
a file, select File – Open – Syntax. To access the SPSS help system for syntax, click Help and then
select Topics. This will bring up a long list of SPSS Help Topics. Find the section labelled
Reference and then Command Syntax Reference for help with syntax.
While many people feel extremely uncomfortable using syntax, and would much rather use the
built-in menus (sound like you?!), in SPSS you can actually do both (use the menus and the
syntax file) for many procedures. Regardless of whether you write the syntax yourself, or paste
it into a syntax file from the menus, it really is best practice to use a syntax file to keep a record
of all your data management and analysis procedures.
Van den Berg (2013) lists six reasons you should use SPSS syntax:
1. Syntax is ideal project documentation;
2. Syntax can be corrected;
3. Syntax can be recycled;
4. Syntax gets things done fast;
5. Typing syntax saves time; and
6. Syntax has more options…