SAS - Basic Concepts
Powerpoint Templates
What is SAS?
Statistical Analysis System
The SAS system has a suite of products
Each product is associated with a set of functionalities
Capable of efficiently handling very large data sets
Some products in the SAS System
Core of the SAS System
The basic software to make SAS run
Base SAS Software
SAS language, DATA step and Basic Procedures
SAS/STAT
Procedures for various statistical analyses
SAS/GRAPH
Procedures and options to create graphs
SAS/ETS
Econometrics and Time Series: time series analysis
SAS/OR
Operations Research: optimization, linear programming (LP), etc.
SAS/EIS
Executive Information System for OLAP models
Enterprise Miner
Data mining package with various techniques
SAS/IntrNet
Web-based application and portal development
Analyst/AF/FSP/Other Front End Based Features
Program Editor Window
Write your code in this window
Explorer and Results Window
Log Window
View the log created by the program execution
Output Window
View the values of a data set in this window
Components of SAS Programs
DATA steps typically create or modify SAS data sets. You can use DATA steps to
put your data into a SAS data set
compute values
check for and correct errors in your data
produce new SAS data sets by subsetting, merging, and updating existing data sets.
PROC (procedure) steps are pre-written routines that enable you to analyze and process the data in a SAS data set. You can use PROC steps to
create a report that lists the data
produce descriptive statistics
create a summary report
produce plots and charts.
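As a minimal sketch of how the two kinds of steps fit together, the program below uses one DATA step to create a small data set and two PROC steps to report on it. The data set name work.patients, the variable names, and the values are made up for illustration.

```sas
/* DATA step: create a small data set from instream data (illustrative values) */
data work.patients;
   input Name $ Age Fee;
   datalines;
Ann 34 120.50
Raj 41 85.00
Lee 29 99.75
;
run;

/* PROC step: list the data */
proc print data=work.patients;
run;

/* PROC step: descriptive statistics for the numeric variables */
proc means data=work.patients;
   var Age Fee;
run;
```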
Characteristics of SAS Programs
SAS programs consist of SAS statements. A SAS statement has two important characteristics:
It usually begins with a SAS keyword.
It always ends with a semicolon.
SAS statements are in free format. This means that
they can begin and end anywhere on a line
one statement can continue over several lines
several statements can be on a line.
Blanks or special characters separate "words" in a SAS statement.
Overview of Data Sets
Descriptor Portion
The descriptor portion of a SAS data set contains information about the data set, including
the name of the data set
the date and time that the data set was created
the number of observations
the number of variables.
Data Portion
Contains the rows (observations), the columns (variables), and the actual data values
Variable Attributes
Name, Type, Length, Format, Informat, Label
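One way to look at the descriptor portion and the variable attributes is PROC CONTENTS, a Base SAS procedure that is not covered on these slides. The sketch below assumes a hypothetical data set work.demo with made-up values; the LENGTH, LABEL, and FORMAT statements are there only so that the attributes have something to show.

```sas
/* Build a small data set with explicit attributes (illustrative values) */
data work.demo;
   length Name $ 10;                    /* Length attribute for a character variable */
   input Name $ Height;
   label Height = 'Height in inches';   /* Label attribute */
   format Height 5.1;                   /* Format attribute */
   datalines;
Mary 42
Lucy 46
;
run;

/* Print the descriptor portion: data set name, creation date,
   number of observations and variables, and each variable's attributes */
proc contents data=work.demo;
run;
```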
SAS Libraries
Every SAS file is stored in a SAS library, which is a collection of SAS files. A SAS data library is the highest level of organization for information within SAS. General form, basic LIBNAME statement:
LIBNAME libref 'SAS-data-library';
where libref is 1 to 8 characters long, begins with a letter or underscore, and contains only letters, numbers, or underscores. SAS-data-library is the name of a SAS data library in which SAS data files are stored. The specification of the physical name of the library differs by operating environment.
Storing Files Temporarily or Permanently
Storing files temporarily:
If you don't specify a library name when you create a file (or if you specify the library name Work), the file is stored in the temporary SAS data library. When you end the session, the temporary library and all of its files are deleted. Temporary SAS libraries last only for the current SAS session
Storing files permanently:
To store files permanently in a SAS data library, you specify a library name other than the default library name Work. Permanent SAS libraries are available to you during subsequent SAS sessions
Referencing SAS Files
To reference a permanent SAS data set in your SAS programs, you use a two-level name:
libref.filename
In the two-level name, libref is the name of the SAS data library that contains the file, and filename is the name of the file itself. A period separates the libref and filename.
To reference temporary SAS files, you can specify the default libref Work, a period, and the filename.
For example, the two-level name Work.Test references the temporary data set Test. Alternatively, you can use a one-level name (the filename only) to reference a file in a temporary SAS library. When you specify a one-level name, the default libref Work is assumed.
If the User library is assigned, SAS uses the User library rather than the Work library for one-level names. User is a permanent library. In general, referencing a SAS file in any library except Work indicates that the SAS file is stored permanently.
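The naming rules above can be sketched as follows. The libref clinic and the path 'your-SAS-data-library' are placeholders, the data set names are hypothetical, and the sketch assumes that no User library has been assigned.

```sas
libname clinic 'your-SAS-data-library';   /* placeholder path */

/* One-level name: stored in the temporary Work library as Work.Test */
data test;
   x = 1;
run;

/* Two-level name with the Work libref: also temporary */
data work.test2;
   x = 2;
run;

/* Two-level name with another libref: stored permanently in clinic */
data clinic.test3;
   x = 3;
run;

/* Work.Test and Test refer to the same temporary data set */
proc print data=work.test;
run;
```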
Example of a SAS Data set
Variables: ID, NAME, HT, WT
Observations: 1 to 6
ID: 53 54 55 56 57 58
NAME: Mary, Lucy, Tom, Dan, Tim
HT: 42 46 43 45 42 48
WT: 41 54 . 56 48 43
ID, HT and WT are numeric variables.
NAME is a character variable.
A missing character value is represented by a blank.
A missing numeric value is represented by a period (.).
CREATING LIST REPORTS
Basic Report
You can easily list the contents of a SAS data set by using a simple program like the one shown below.
libname clinic 'your-SAS-data-library';
proc print data=clinic.admit;
run;
You can produce column totals for numeric variables within your report.
libname clinic 'your-SAS-data-library';
proc print data=clinic.admit;
   sum fee;
run;
Selected Observations and Variables
You can choose the observations and variables that appear in your report. In addition, you can remove the default Obs column that displays observation numbers.
libname clinic 'your-SAS-data-library';
proc print data=clinic.admit noobs;
   var age height weight fee;
   where age>30;
run;
Specifying WHERE Expressions
Symbol      Meaning                     Example
= or eq     equal to                    where name='Jones, C.';
^= or ne    not equal to                where temp ne 212;
> or gt     greater than                where income>20000;
< or lt     less than                   where partno lt "BG05";
>= or ge    greater than or equal to    where id>='1543';
<= or le    less than or equal to       where pulse le 85;
More Operators
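Using the CONTAINS Operator
The CONTAINS operator selects observations that include the specified substring. The symbol equivalent of the CONTAINS operator is the question mark (?).
where firstname contains 'Jon';
where firstname ? 'Jon';
IN Operator
The IN operator selects observations whose value matches one of the values in a list.
where actlevel in ('LOW','MOD');
where fee in (124.80,178.20);
BETWEEN-AND Operator
where date between '21Dec2010'd and '20Jan2011'd;
LIKE Operator
The LIKE operator selects observations by pattern: a percent sign (%) matches any number of characters, and an underscore (_) matches exactly one character.
where name like '_uj%';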
CREATING SAS DATA SETS FROM RAW DATA
Steps to Create a SAS Data Set
To Do This                        Use This SAS Statement    Example
Reference a SAS data library      LIBNAME statement         libname libref 'SAS-data-library';
Reference an external file        FILENAME statement        filename tests 'c:\users\tmill.dat';
Name a SAS data set               DATA statement            data clinic.stress;
Identify an external file         INFILE statement          infile tests obs=10;
Describe data                     INPUT statement           input ID 1-4 Age 6-7 ...;
Execute the DATA step             RUN statement             run;
List the data                     PROC PRINT statement      proc print data=clinic.stress;
Execute the final program step    RUN statement             run;
Steps to Create a SAS Data Set
Using a LIBNAME Statement
libname taxes 'c:\users\acct\qtr1\report';
Using a FILENAME Statement
filename exer 'c:\users\exer.dat';
Naming the Data Set
DATA SAS-data-set-1 <...SAS-data-set-n>;
Rules for SAS Names
SAS data set names
can be 1 to 32 characters long
must begin with a letter (A-Z, either uppercase or lowercase) or an underscore (_)
can continue with any combination of numbers, letters, or underscores.
Steps to Create a SAS Data Set
Specifying the Raw Data File
INFILE file-specification <options>;
Describing the Data
General form, INPUT statement using column input:
INPUT variable <$> startcol-endcol . . . ;
where
variable is the SAS name that you assign to the field
the dollar sign ($) identifies the variable type as character (if the variable is numeric, then nothing appears here)
startcol represents the starting column for this variable
endcol represents the ending column for this variable.
Steps to Create a SAS Data Set
filename exer 'c:\users\exer.dat';
data exercise;
   infile exer;
   input ID $ 1-4 Age 6-7 ActLevel $ 9-12 Sex $ 14;
run;
When you use column input, you can
read any or all fields from the raw data file
read the fields in any order
specify only the starting column for values that occupy only one column.
input ActLevel $ 9-12 Sex $ 14 Age 6-7;
Verifying the Data
Whenever you use the DATA step to read raw data, verify the result with the following steps:
1. Write the DATA step using the OBS= option in the INFILE statement.
2. Submit the DATA step.
3. Check the log for messages.
4. View the resulting data set.
5. Remove the OBS= option and re-submit the DATA step.
6. Check the log again.
7. View the resulting data set again.
Creating and Modifying Variables
General form, assignment statement:
variable=expression;
where
variable names a new or existing variable
expression is any valid SAS expression.
SAS Expressions
An expression is a sequence of operands and operators that form a set of instructions. The instructions are performed to produce a new value:
Operands are variable names or constants. They can be numeric, character, or both.
Operators are special-character operators, grouping parentheses, or functions.
Using Operators in SAS Expressions
Operator    Action             Example          Priority
-           negative prefix    negative=-x;     I
**          exponentiation     raise=x**y;      I
*           multiplication     mult=x*y;        II
/           division           divide=x/y;      II
+           addition           sum=x+y;         III
-           subtraction        diff=x-y;        III
When you use more than one arithmetic operator in an expression,
operations of priority I are performed before operations of priority II, and so on
consecutive operations that have the same priority are performed
from right to left within priority I
from left to right within priorities II and III
you can use parentheses to control the order of operations.
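A short sketch of assignment statements that exercise these priority rules. The variable names are arbitrary, and DATA _NULL_ with a PUT statement is used here only to write the results to the log.

```sas
data _null_;
   x = 2;
   a = x**3**2;      /* priority I, right to left: 2**(3**2) = 2**9 = 512 */
   b = 10 - 4 - 3;   /* priority III, left to right: (10-4)-3 = 3 */
   c = 2 + 3*4;      /* multiplication (II) before addition (III): 14 */
   d = (2 + 3)*4;    /* parentheses are evaluated first: 20 */
   put a= b= c= d=;  /* writes a=512 b=3 c=14 d=20 to the log */
run;
```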
Reading Instream Data
To read instream data, you use
a DATALINES statement as the last statement in the DATA step (except for the RUN statement) and immediately preceding the data lines
a null statement (a single semicolon) to indicate the end of the input data.
Using Data step for Internal raw data
Internal raw data
The DATALINES (or CARDS) statement indicates that the data is internal
data cities;
   input City $ Rank;
   datalines;
Mumbai 1
Delhi 2
Chennai 3
Calcutta 4
;
run;
Steps to Create a Raw Data File
data _null_;
   set clinic.stress;
   file 'c:\clinic\patients\stress.dat';
   put id 1-4 name 6-25 resthr 27-29 maxhr 31-33
       rechr 35-37 timemin 39-40 timesec 42-43
       tolerance 45 totaltime 47-49;
run;
Using the _NULL_ Keyword
The keyword _NULL_ enables you to use the DATA step without actually creating a SAS data set.
A SET statement specifies the SAS data set that you want to read from.
A FILE statement specifies the output file, and a PUT statement writes the values to it.
UNDERSTANDING DATA STEP PROCESSING
Compilation phase
A SAS DATA step is processed in two phases: the compilation phase and the execution phase.
During the compilation phase, each statement is scanned for syntax errors. Most syntax errors prevent further processing of the DATA step. When the compilation phase is complete, the descriptor portion of the new data set is created.
Execution phase
If the DATA step compiles successfully, then the execution phase begins.
During the execution phase, the DATA step reads and processes the input data. The DATA step executes once for each record in the input file
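Flowchart of DATA step processing: compile the program; then, in the execution phase, initialize variables to missing, execute the INPUT statement, execute other statements, and output to the SAS data set. The loop repeats until end of file, after which SAS continues with the next step.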
Compilation Phase In detail
Input Buffer
At the beginning of the compilation phase, the input buffer (an area of memory) is created to hold a record from the external file.
The input buffer is created only when raw data is read, not when a SAS data set is read.
The term input buffer refers to a logical concept; it is not a physical storage area.
Program Data Vector
The program data vector is the area of memory where SAS builds a data set, one observation at a time.
Like the term input buffer, the term program data vector refers to a logical concept.
The program data vector contains two automatic variables that can be used for processing but which are not written to the data set as part of an observation:
_N_ counts the number of times that the DATA step begins to execute.
_ERROR_ signals the occurrence of an error that is caused by the data during execution. The default value is 0, which means there is no error. When one or more errors occur, the value is set to 1.
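The automatic variables can be inspected by writing them to the log. The sketch below uses a hypothetical data set work.check whose second record deliberately contains invalid numeric data; for that record the PUT statement should report _ERROR_=1 and a missing Score, while _N_ counts the iterations.

```sas
data work.check;
   input Name $ Score;
   /* Write the automatic variables and the current values to the log */
   put _N_= _ERROR_= Name= Score=;
   datalines;
Ann 90
Bob abc
Cy 77
;
run;

/* _N_ and _ERROR_ are not written to the data set:
   work.check contains only Name and Score */
proc print data=work.check;
run;
```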
Syntax Checking
During the compilation phase, SAS scans each statement for syntax errors such as
missing or misspelled keywords
invalid variable names
missing or invalid punctuation
invalid options.
Data Set Variables
As the INPUT statement is compiled,
a slot is added to the program data vector for each variable in the new data set
variable attributes such as length and type are determined the first time a variable is encountered.
Any variables that are created with an assignment statement in the DATA step are also added to the program data vector.
Execution Phase In detail
Initializing Variables
At the beginning of the execution phase, the value of _N_ is 1. Because there are no data errors, the value of _ERROR_ is 0. The remaining variables are initialized to missing.
Missing numeric values are represented by periods (.), and missing character values are represented by blanks.
Input Data
The INFILE statement identifies the location of the raw data.
Input Pointer
The INPUT statement uses an input pointer to keep track of its position.
The input pointer starts at column 1 of the first record, unless otherwise directed.
Execution Phase - End of the DATA Step
1. The values in the program data vector are written to the output data set as the first observation.
2. The value of _N_ is set to 2 and control returns to the top of the DATA step.
3. The variable values in the program data vector are reset to missing.
4. Note that the automatic variables _N_ and _ERROR_ retain their values (they are not reset to missing).
5. The DATA step works like a loop, repetitively executing statements to read data values and create observations one by one.
End-of-File Marker The ultimate End !!
The execution phase continues the iterations until the end-of-file marker is reached in the raw data file.
The order in which variables are defined in the DATA step determines the order in which the variables are stored in the data set.
data perm.update;
   infile invent;
   input Item $ 1-13 IDnum $ 15-19 InStock 21-22 BackOrd 24-25;
   Total=instock+backord;
run;
Methods for getting your data into the SAS system
Entering data directly into a SAS data set
Creating SAS data sets from raw data files
Using the DATA step
Using the IMPORT procedure
Converting other software's data files into SAS data sets
Reading other software's data files directly
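As a rough sketch of the IMPORT procedure route, the step below reads a comma-separated file into a SAS data set. The path c:\training\sample.csv and the output name work.sample are hypothetical, and the file is assumed to have a header row with column names.

```sas
proc import datafile='c:\training\sample.csv'   /* hypothetical path */
            out=work.sample                     /* data set to create */
            dbms=csv                            /* file type: comma-separated values */
            replace;                            /* overwrite work.sample if it exists */
   getnames=yes;                                /* take variable names from the first row */
run;

proc print data=work.sample;
run;
```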
Different types of input
List input
Column input
Formatted input
Free-Format Data
What is free-format data?
Data that is not arranged in columns
The fields are often separated by blanks or by some other delimiter
Fixed-Field Data
What is fixed-field data?
Data is arranged in columns or fixed fields
You can specify a beginning and ending column for each field
Reading Free-Format Data
Using List Input
General form, INPUT statement using list input:
INPUT variable <$>;
where
variable specifies the variable whose value the INPUT statement is to read
$ specifies that the variable is a character variable.
Because list input, by default, does not specify column locations,
all fields must be separated by at least one blank or other delimiter
fields must be read in order from left to right
you cannot skip or re-read fields.
Using Data step for External raw data
External raw data
Infile statement to tell SAS the filename and the path
Text in the external file C:\training\sample1.txt:
Mumbai 1
Delhi 2
Chennai 3
Calcutta 4
data cities;
   infile "C:\training\sample1.txt";
   input City $ Rank;
run;
Log:
NOTE: The infile "C:\training\sample1.txt" is:
      File Name=C:\training\sample1.txt,
      RECFM=V, LRECL=256
NOTE: 4 records were read from the infile "C:\training\sample1.txt".
      The minimum record length was 8.
      The maximum record length was 10.
NOTE: The data set WORK.CITIES has 4 observations and 2 variables.
NOTE: DATA statement used:
      real time 0.25 seconds
      cpu time 0.11 seconds
Working with Delimiters
Use the DLM= option in the INFILE statement to specify a delimiter other than a blank (the default). Example:
data perm.survey;
   infile credit dlm=',';
   input Gender $ Age Bankcard FreqBank Deptcard FreqDept;
run;
Reading Raw data separated by spaces
List Input
data runners;
   input name $ surname $ age runtime1 runtime3;
   datalines;
Scott A 15 23.3 21.5
Mark . 13 25.2 24.1
;
run;
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.RUNNERS has 2 observations and 5 variables.
Rules for list input:
All missing data must be indicated by a period
All values are separated by at least one space
Character data are eight characters or fewer
Character data should not have embedded spaces
Reading Raw data separated by spaces
data runners;
   input name $ surname $ age runtime1 runtime2;
   datalines;
Scott A 15 22.0 21.9
Mark . 13 25.2 24.1
Jon K 13 25.1
Michael M 14 12 .
;
run;
Log:
NOTE: Invalid data for runtime2 in line 228 1-7.
RULE:  ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+
228    Michael M 14 12 .
name=Jon surname=K age=13 runtime1=25.1 runtime2=. _ERROR_=1 _N_=3
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.RUNNERS has 4 observations and 5 variables.
The same step without the incomplete record:
data runners;
   input name $ surname $ age runtime1 runtime2;
   datalines;
Scott A 15 22.0 21.9
Mark . 13 25.2 24.1
Michael M 14 12 .
;
run;
Reading Missing Values at the End of a Record
Missover option:
If the missing values occur at the end of the record, you can use the MISSOVER option in the INFILE statement to read the missing values at the end of the record.
The MISSOVER option prevents SAS from going to another record if, when using list input, it does not find values in the current line for all the INPUT statement variables.
At the end of the current record, values that are expected but not found are set to missing.
The MISSOVER option works only for missing values that occur at the end of the record.
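A minimal sketch of MISSOVER with instream data, reusing the runners example in which one record is short. INFILE DATALINES is used so that the option can be applied to data lines inside the program; with it, the short record should give a missing runtime2 instead of pulling values from the next line, so each data line becomes one observation.

```sas
data runners;
   infile datalines missover;   /* do not go to the next line for missing trailing values */
   input name $ surname $ age runtime1 runtime2;
   datalines;
Scott A 15 22.0 21.9
Mark . 13 25.2 24.1
Jon K 13 25.1
Michael M 14 12 .
;
run;

proc print data=runners;
run;
```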
Reading Missing Values at the Beginning or Middle of a Record
The DSD Option
You can use the DSD option in the INFILE statement to correctly read the raw data. The DSD option
sets the default delimiter to a comma
treats two consecutive delimiters as a missing value
removes quotation marks from values.
If the data uses multiple delimiters or a single delimiter other than a comma, then simply specify the delimiter value(s) with the DLM= option.
data perm.survey;
   infile credit dsd dlm='*';
   input Gender $ Age Bankcard FreqBank Deptcard FreqDept;
run;
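To see the three DSD behaviors on a small scale, the sketch below reads comma-delimited instream data. The data set name survey and the values are made up: the second record has two consecutive commas (a missing Age), and the third has a quoted value and a missing Bankcard.

```sas
data survey;
   infile datalines dsd;        /* comma delimiter, consecutive commas = missing, quotes removed */
   input Gender $ Age Bankcard FreqBank;
   datalines;
M,27,1,8
F,,0,5
"F",32,,4
;
run;

proc print data=survey;
run;
```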
The LENGTH Statement
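The variable attributes are defined when the variable is first encountered in the DATA step. To set the length of a variable, the LENGTH statement must therefore appear before the INPUT statement that reads the variable.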
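A small sketch of why the position of the LENGTH statement matters. With list input, a character variable gets a default length of 8, so a longer value would be truncated unless the length is declared first. The city names and ranks are illustrative.

```sas
data cities;
   length City $ 12;      /* declared before INPUT, so City can hold up to 12 characters */
   input City $ Rank;
   datalines;
Mumbai 1
Bhubaneswar 5
;
run;

proc print data=cities;
run;
```

Without the LENGTH statement, the second value would be stored as Bhubanes.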
Modifying List Input
The ampersand (&) modifier:
is used to read character values that contain embedded blanks.
The & indicates that a character value that is being read with list input might contain one or more single embedded blanks.
The value is read until two or more consecutive blanks are encountered.
The colon (:) modifier:
enables you to read nonstandard data values and character values that are longer than eight characters, but which contain no embedded blanks.
The colon (:) indicates that values are read until a blank (or other delimiter) is encountered, and then an informat is applied.
input Rank City & $12. Pop86 : comma.;
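The INPUT statement above can be tried on instream data as sketched below. The data set name cityrank and the values are illustrative. Note that each city name is followed by two blanks, which is what ends a value read with the & modifier.

```sas
data cityrank;
   /* & lets City contain single embedded blanks; : applies COMMA. to a delimited value */
   input Rank City & $12. Pop86 : comma.;
   datalines;
1 NEW YORK  7,262,700
2 LOS ANGELES  3,259,340
3 CHICAGO  3,009,530
;
run;

proc print data=cityrank;
run;
```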
Creating Free-Format Data
The PUT statement can also be used with list output to create free-format raw data files.
data _null_;
   set perm.finance;
   file 'c:\data\findat2' dlm=',';
   put ssn name salary date : date9.;
run;
The EXPORT procedure can also write a delimited raw data file. General form:
PROC EXPORT DATA=SAS-data-set OUTFILE=filename <DBMS=identifier>;
<DELIMITER='delimiter';>
RUN;
Reading Raw Data in Fixed Fields
Column Input Features
It can be used to read character variable values that contain embedded blanks.
input Name $ 1-25;
No placeholder is required for missing data. A blank field is read as missing and does not cause other fields to be read incorrectly.
Fields or parts of fields can be re-read.
Fields do not have to be separated by blanks or other delimiters.
Identifying Standard and Nonstandard Numeric Data
Standard numeric data values can contain only
numbers
decimal points
numbers in scientific, or E, notation (23E4)
minus signs and plus signs.
Some examples of standard numeric data are 15, -15, 15.4, +.05, 1.54E3, and -1.54E-3.
Nonstandard numeric data includes
values that contain special characters, such as percent signs (%), dollar signs ($), and commas (,)
date and time values
data in fraction, integer binary, real binary, and hexadecimal forms.
Using Formatted Input
Whenever you encounter raw data that is organized into fixed fields, you can use
column input to read standard data only
formatted input to read both standard and nonstandard data.
General Form of the INPUT Statement Using Formatted Input
INPUT <pointer-control> variable informat.;
where
pointer-control positions the input pointer on a specified column
variable is the name of the variable that is being created
informat is the special instruction that specifies how SAS reads raw data.
@n Column Pointer Control
The @n is an absolute pointer control that moves the input pointer to a specific column number.
The @ moves the pointer to column n, which is the first column of the field that is being read.
You can use the @n to move a pointer forward or backward when reading a record.
INPUT @n variable informat.;
input @9 FirstName $5. @1 LastName $7.;
The +n Pointer Control
The +n pointer control moves the input pointer forward to a column number that is relative to the current position.
The + moves the pointer forward n columns.
INPUT +n variable informat.;
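A sketch that combines the @n and +n pointer controls in one formatted INPUT statement. The data set name, the Score variable, and the values are assumptions for illustration. The records are laid out so that the last name is in columns 1-7, the first name in columns 9-13, and the score in columns 15-16.

```sas
data names;
   input @9 FirstName $5.     /* jump to column 9 and read 5 columns */
         @1 LastName $7.      /* move back to column 1 and read 7 columns; pointer is now at column 8 */
         +7 Score 2.;         /* skip 7 columns forward to column 15 and read 2 columns */
   datalines;
Jones   Maria 88
Patel   Arun  92
;
run;

proc print data=names;
run;
```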
Using Informats
An informat is an instruction that tells SAS how to read raw data. SAS provides many informats for reading standard and nonstandard data values, for example:
$BINARYw., $w., COMMAw.d, DATEw., DATETIMEw., HEXw., JULIANw., MMDDYYw., NENGOw., PDw.d, PERCENTw.d, TIMEw., w.d
Note that
each informat contains a w value to indicate the width of the raw data field
each informat also contains a period, which is a required delimiter
for some informats, the optional d value specifies the number of implied decimal places
informats for reading character data always begin with a dollar sign ($).
Reading Character Values
The $w. informat enables you to read character data.
The w represents the field width of the data value (the total number of columns that contain the raw data field).
Reading Standard Numeric Data
The informat for reading standard numeric data is the w.d informat.
The w specifies the field width of the raw data value, the period serves as a delimiter, and the d optionally specifies the number of implied decimal places for the value.
The w.d informat ignores any specified d value if the data already contains a decimal point.
Reading Nonstandard Numeric Data
The COMMAw.d informat is used to read numeric values and to remove embedded
blanks
commas
dashes
dollar signs
percent signs
right parentheses
left parentheses, which are converted to minus signs.
The COMMAw.d informat has three parts:
1. COMMA: the informat name
2. w.: a value that specifies the width of the field to be read (including dollar signs, decimal places, or other special characters), followed by a period
3. d: an optional value that specifies the number of implied decimal places for a value (not necessary if the value already contains decimal places).
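A small sketch of the COMMAw.d informat with formatted input. The data set name pay and the values are made up. The first value should be read as 1234567 and the second, written in parentheses, as -500.

```sas
data pay;
   input @1 Amount comma10.;   /* read columns 1-10, stripping $ , ( ) */
   datalines;
$1,234,567
(500)
;
run;

proc print data=pay;
run;
```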
Reading Date and Time Values
How SAS Stores Date Values ?
When you use a SAS informat to read a date, SAS converts it to a numeric date value. A SAS date value is the number of days from January 1, 1960, to the given date.
How SAS Stores Time Values ?
SAS stores time values similar to the way it stores date values. A SAS time value is stored as the number of seconds since midnight.
A SAS datetime is a special value that combines both date and time information. A SAS datetime value is stored as the number of seconds between midnight on January 1, 1960, and a given date and time.
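The stored numbers can be checked with date, time, and datetime constants, as in the sketch below. The variable names are arbitrary, and DATA _NULL_ with PUT is used only to write the unformatted values to the log.

```sas
data _null_;
   d0 = '01JAN1960'd;             /* day 0 */
   d1 = '01JAN1961'd;             /* 366 days later (1960 was a leap year) */
   t  = '00:01:00't;              /* 60 seconds after midnight */
   dt = '01JAN1960:00:01:00'dt;   /* 60 seconds after midnight on January 1, 1960 */
   put d0= d1= t= dt=;            /* writes d0=0 d1=366 t=60 dt=60 to the log */
run;
```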
Date and Time Informats
MMDDYYw. Informat
Reads mmddyy or mmddyyyy
Date Expression    SAS Date Informat
101599             MMDDYY6.
10/15/99           MMDDYY8.
10 15 99           MMDDYY8.
10-15-1999         MMDDYY10.
DATEw. Informat
Reads ddmmmyy or ddmmmyyyy
Date Expression    SAS Date Informat
30May00            DATE7.
30May2000          DATE9.
30-May-2000        DATE11.
Date and Time Informats
TIMEw. Informat
Reads hh:mm:ss.ss, where
hh is an integer from 00 to 23, representing the hour
mm is an integer from 00 to 59, representing the minute
ss.ss is an optional field that represents seconds and hundredths of seconds.
Time Expression    SAS Time Informat
17:00:01.34        TIME11.
17:00              TIME5.
2:34               TIME5.
Five is the minimum acceptable field width for the TIMEw. informat.
Date and Time Informats
The WEEKDATEw. Format
The WEEKDATEw. format writes date values in the form day-of-week, month-name dd, yy (or yyyy).
FORMAT Statement               Result
format datein weekdate3.;      Mon
format datein weekdate6.;      Monday
format datein weekdate17.;     Monday, Apr 5, 99
format datein weekdate21.;     Monday, April 5, 1999
The WORDDATEw. Format
The WORDDATEw. format is similar to the WEEKDATEw. format, but it does not display the day of the week or the two-digit year values.
FORMAT Statement               Result
format datein worddate3.;      Apr
format datein worddate5.;      April
format datein worddate14.;     April 15, 1999
Line Pointer Controls
When SAS reads raw data values, it keeps track of its position with an input pointer.
We have used column pointer controls and column specifications to determine the column placement of the input pointer.
We can also position the input pointer on a specific record by using a line pointer control.
There are two types of line pointer controls:
The forward slash (/) specifies a line location that is relative to the current one.
The #n specifies the absolute number of the line to which you want to move the pointer.
Reading Multiple Records Sequentially
The Forward Slash (/) Line Pointer Control
Note that the raw data file must contain the same number of records for each observation that is being created
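A minimal sketch of the forward slash (/) line pointer control, assuming made-up data in which every observation is spread over exactly two records. The data set and variable names are hypothetical.

```sas
data staff;
   input Lname $ Fname $     /* values from the first record */
       / Dept $ Years;       /* the / moves to the next record, then reads from it */
   datalines;
ABRAMS THOMAS
SALES 12
BARCLAY ROBERT
MKTG 8
;
run;

proc print data=staff;
run;
```

The four data lines should produce two observations.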
Reading Multiple Records Sequentially
The #n Line Pointer Control
The #n specifies the absolute number of the line to which you want to move the input pointer.
The #n pointer control can read records in any order.
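A minimal sketch of the #n line pointer control reading records out of order. The data are made up and assume two records per observation; the highest #n used (here 2) determines how many records belong to each observation.

```sas
data staff2;
   input #2 Dept $ Years      /* read the second record of the group first */
         #1 Lname $ Fname $;  /* then go back to the first record */
   datalines;
ABRAMS THOMAS
SALES 12
BARCLAY ROBERT
MKTG 8
;
run;

proc print data=staff2;
run;
```

Because Dept and Years are defined first, they are stored first in the data set.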
Combining Line Pointer Controls
The forward slash (/) line pointer control and the #n line pointer control can be used together
Creating a Single Observation from Multiple Records
The forward slash (/) specifies a line location that is relative to the current one.
The / advances the input pointer to the next record. The / line pointer control only moves the input pointer forward and must be specified after the instructions for reading the values in the current record. Note that the raw data file must contain the same number of records for each observation that is being created.
Reading Multiple Records Non-Sequentially
The #n Line Pointer Control
The #n specifies the absolute number of the line to which you want to move the input pointer.
The #n pointer control can read records in any order.
It must be specified before the instructions for reading values in a specific record.
Points to Remember
Because the / pointer control can only move forward, the pointer control is specified after the values in the current record are read. The #n pointer control can read records in any order and must be specified before the variable names are defined. A semicolon should be placed at the end of the complete INPUT statement.
Creating Multiple Observations from a Single Record
SAS provides two line-hold specifiers. A line-hold specifier holds the current record for the next INPUT statement.
The trailing at sign (@) holds the input record for the execution of the next INPUT statement.
The double trailing at sign (@@) holds the input record for the execution of the next INPUT statement, even across iterations of the DATA step.
The term trailing indicates that the @ or @@ must be the last item that is specified in the INPUT statement.
E.g. input Name $20. @; or input Name $20. @@;
Using the Double Trailing At Sign (@@) to Hold the Current Record
Typically, each time a DATA step executes, the INPUT statement reads a new record. When the trailing @@ is used, the INPUT statement holds the current record and reads the next value. The double trailing at sign (@@)
holds the data line in the input buffer across multiple executions of the DATA step
typically is used to read multiple SAS observations from a single data line
should not be used with the @ pointer control, with column input, nor with the MISSOVER option.
Using the Double Trailing At Sign (@@) to Hold the Current Record
A record that is being held by the double trailing at sign (@@) is not released until one of the following events occurs:
The input pointer moves past the end of the record. Then the input pointer moves down to the next record.
An INPUT statement that has no line-hold specifier executes.
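A minimal sketch of the double trailing at sign, assuming made-up data in which each line holds several Name/Score pairs. The data set and variable names are hypothetical.

```sas
data scores;
   input Name $ Score @@;   /* hold the line across iterations of the DATA step */
   datalines;
Ann 90 Bob 85 Cy 77
Dee 64 Eli 93
;
run;

proc print data=scores;
run;
```

The two data lines should produce five observations, one per Name/Score pair.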
Using the Single Trailing At Sign (@) to Hold the Current Record
Like the double trailing @@, the single trailing @
enables the next INPUT statement to read from the same record
releases the current record when a subsequent INPUT statement executes without a line-hold specifier.
It's easy to distinguish between the trailing @@ and the trailing @ by remembering that
the double trailing at sign (@@) holds a record across multiple iterations of the DATA step until the end of the record is reached
the single trailing at sign (@) releases a record when control returns to the top of the DATA step.
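A minimal sketch of the single trailing at sign, assuming a made-up layout in which the first field is a record type that decides what else is on the line. The first INPUT statement holds the record, and the second INPUT statement reads the rest of the same record and releases it.

```sas
data visits;
   input Type $ @;                  /* read the type code and hold the record */
   if Type = 'A' then
      input Name $ Age;             /* type A records carry a name and an age */
   else
      input Name $;                 /* other records carry only a name */
   datalines;
A Ann 34
B Bob
;
run;

proc print data=visits;
run;
```

Each data line should produce one observation, with Age missing for the second one.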
THANK YOU HAPPY LEARNING