HPH 562 DATA MANAGEMENT & INFORMATICS
Sept 5, 2012
Lecture 2: Getting your Data into SAS
By: Jamie Romeiser *Lectures are not to be redistributed without prior written consent
Organization Tips:
Create a folder for each class meeting.
Save all class documents in that folder:
  Lecture
  Lecture Code
  Lecture Databases
Use this folder as your SAS library folder for that class.
Lecture Outline
Two Parts of SAS Programming
  Data
  Proc
The Data Step
  How it works
  Conceptual Model
Temporary Vs Permanent Datasets:
  Temporary
    Method 1: Importing DBMS Files
    Method 2: Table Entry
    Method 3: Internal Raw Data (Code)
    Method 4: External Raw Data
  Permanent
    Method 1: Libname
    Method 2: Point and Click
Types of Variables in Brief:
  Character, Numeric
Set statement
Proc Contents
Lecture Code
*------------------------------------------------------------*
* HPH 562                                                    *
* Class 2                                                    *
*------------------------------------------------------------*
* Temporary Datasets                                         *
*   Method 1:  Importing data through the import wizard      *
*   Method 2:  Input data via SAS Table                      *
*   Method 3a: Internal Raw Data, List Input                 *
*   Method 3b: Internal Raw Data, Column Input               *
*   Method 4:  External Raw Data                             *
* Permanent Databases                                        *
*   Method 1:  Libname                                       *
*   Method 2:  Point and Click                               *
* Set Statement                                              *
* Proc Contents                                              *
*------------------------------------------------------------*;
Two Parts of SAS Programming Code
Data Steps
  Creating new datasets
  Reading and modifying data

*Example of Creating a New Dataset;
data examp1;
input ID BMI Gender $ VitDDeficient;
datalines;
4687 31 F 0
7542 17 M 1
9637 18 F 1
;
run;

*Example of Modifying a Dataset;
*Gender is a character variable, so it is compared to a quoted value;
data examp2;
set examp1;
if Gender = "F" or BMI >= 18.5 then Frailty = 1;
else Frailty = 0;
run;
Proc Steps
  Print reports
  Perform utility functions
  Analyze data

*Example of Report;
proc print data=examp1;
run;

*Example of Utility Function;
proc freq data=examp1;
table Gender*VitDDeficient;
run;

*Example of Analyzing Data;
proc means data=examp1;
var BMI;
run;
SAS Conceptual Model
Raw Data to be Analyzed
  -> DATA step
       Data statement;
       More SAS Statements;
  -> SAS Dataset
  -> PROC step
       SAS Procedure Statements;
  -> Results of analysis
  -> Data Analysis in a finished report
SAS Conceptual Model (Vitamin D Paper Example)
Raw Data: NHANES III Surveys
  -> DATA step
       Data statements;
       Bring data into SAS;
       Apply Inclusion Criteria;
       Create Outcome Frailty Variable;
       Create other predictor Variables;
  -> SAS Dataset
  -> PROC step
       Proc Statements;
       Frequency Tables;
       Odds Ratios (i.e. Logistic Regression);
  -> Results of analysis
  -> Data Analysis in a finished report
SAS Conceptual Model: Data Step
Data Statement applied to dataset A;
Is there data to read?
  Yes: Reads data and executes statements, writes the observation into dataset B, then checks again for more data.
  No: Is there another Data Statement?
    Yes: Continue with that Data Statement.
    No: Done. Modifications are now in dataset B.

Data DatasetB;   *Create a new dataset called DatasetB;
set DatasetA;    *From DatasetA, i.e. use DatasetA as the base;
Statement 1;     *I want you to make the following changes to DatasetA;
Statement 2;
Statement 3;
Statement 4;
Statement 5;
Statement 6;
Run;             *Execute my statements up to here;

Example Code:

Data DatasetB;
set DatasetA;
Frail1=0;
if BMI<=18.5 then Frail1 = 1;
Frail2=0;
if SLOWWALK=1 then Frail2 = 1;
FRAILTY=0;
if Frail1=1 or Frail2=1 then FRAILTY = 1;
Run;
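The example code can be tried end to end on a few rows. This is a minimal sketch, assuming a hypothetical DatasetA with the variables ID, BMI, and SLOWWALK; the three rows are invented for illustration and are not NHANES data.

```sas
* Hypothetical input data, invented for illustration only;
data DatasetA;
input ID BMI SLOWWALK;
datalines;
1 17.9 0
2 24.3 1
3 22.0 0
;
run;

* The statements below run once for each observation read from DatasetA;
data DatasetB;
set DatasetA;
Frail1 = 0;
if BMI <= 18.5 then Frail1 = 1;
Frail2 = 0;
if SLOWWALK = 1 then Frail2 = 1;
FRAILTY = 0;
if Frail1 = 1 or Frail2 = 1 then FRAILTY = 1;
run;

* Look at the result: DatasetA is unchanged, DatasetB has the new variables;
proc print data=DatasetB;
run;
```

One thing to watch: SAS treats a missing numeric value as smaller than any number, so an observation with a missing BMI would also satisfy BMI <= 18.5.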
SAS Conceptual Model (Vitamin D Paper Example)
Same model as before: Raw Data (NHANES III Surveys) -> DATA step -> SAS Dataset -> PROC step -> Results of analysis -> Data Analysis in a finished report.
Where is the SAS Dataset stored?
Temporary Vs. Permanent Datasets
Temporary Datasets:
  Stored in the Work folder within SAS Libraries.
  All files that SAS stores in the WORK library are deleted at the end of a session (i.e. temporary).
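A short sketch of what "stored in Work" means in code. The dataset name demo is hypothetical; the point is that a one-level dataset name goes into the WORK library.

```sas
* A one-level name such as demo is stored in the WORK library;
data demo;
x = 1;
run;

* WORK.demo and demo refer to the same temporary dataset;
proc print data=work.demo;
run;
```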
Temporary Vs. Permanent Datasets
Creating a Temporary Dataset:
  Method 1: Importing DBMS Files
    Using the Import Wizard to bring in datasets stored in Excel or Access format.
  Method 2: Table Entry
    Enter data into a SAS table.
  Method 3: Internal Raw Data (Code)
    Type it in your code.
  Method 4: External Raw Data
    Pulling in data stored in .txt format, but specifying variable names.
Temporary Vs. Permanent Datasets
Creating a Temporary Dataset:
Method 1: Importing Database Management System (DBMS) files
Excel/Access (1997-2003 format)

ID    Height  Gender  Intervention  Result  Year
46    67      F       0             Yes     2006
752   71      M       0             No      2006
9673  62      F       1             Yes     2006
969   69      M       1             Yes     2006

Steps for Import Wizard:
File -> Import Data -> Select Excel -> Find Dataset -> Select Sheet -> Name It -> Finish

PROC IMPORT OUT= WORK.datasetname
DATAFILE= "DRIVE:\Foldername\datasetname.xls"
DBMS=EXCEL REPLACE;
RANGE="Sheet1$";
GETNAMES=YES;
MIXED=NO;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=YES;
RUN;
Temporary Vs. Permanent Datasets
Creating a Temporary Dataset:
Method 2: Table Entry
Hand Entering Data into a SAS Table
Steps (must be in the Explorer Window):
File -> New -> Table -> Enter Data -> Label Variables
You will hardly ever use this method.
Temporary Vs. Permanent Datasets
Creating a Temporary Dataset:
Method 3: Internal Raw Data
Input raw data via code
(a) List Input: Each data value is separated by one space

ID    Height  Gender  Intervention  Result  Year
46    67      F       0             Yes     2005
752   71      M       0             No      2005
9673  62      F       1             Yes     2005
969   69      M       1             Yes     2005

1. input statement
2. Define the variables, character$ or numeric
3. Specify the location of the raw data (in this case, your location is datalines, meaning you're inputting the raw data in your code)

data example2;
input ID Height Gender $ Intervention Result $ Year;
datalines;
46 67 F 0 Yes 2005
752 71 M 0 No 2005
9673 62 F 1 Yes 2005
969 69 M 1 Yes 2005
;
run;
Temporary Vs. Permanent Datasets
Creating a Temporary Dataset:
Method 3: Internal Raw Data
Input raw data via code
(b) Column Input: Each data value is placed in a defined column

ID    Height  Gender  Intervention  Result  Year
46    67      F       0             Yes     2004
752   71      M       0             No      2004
9673  62      F       1             Yes     2004
969   69      M       1             Yes     2004

1. input statement
2. Define the variables, character$ or numeric
3. Specify the location of the raw data (in this case, your location is datalines, meaning you're inputting the raw data in your code)

Starting columns: ID 1, Height 9, Gender 17, Intervention 21, Result 25, Year 29. The column ruler is a guide only and must not be typed between datalines and the data.

data example3;
input ID 1-4 Height 9-10 Gender $ 17 Intervention 21 Result $ 25-27 Year 29-32;
datalines;
46      67      F   0   Yes 2004
752     71      M   0   No  2004
9673    62      F   1   Yes 2004
969     69      M   1   Yes 2004
;
run;
Temporary Vs. Permanent Datasets
Creating a Temporary Dataset:
Method 4: External Raw Data
Input raw data, usually a .txt file, via link

ID    Height  Gender  Intervention  Result  Year
46    67      F       0             Yes     2004
752   71      M       0             No      2004
9673  62      F       1             Yes     2004
969   69      M       1             Yes     2004

1. Specify the location of the raw data (in this case, your location is a browser location)
2. input statement
3. Define the variables, character$ or numeric

data example4;
infile "G:\HPH 562\Class 2 Final\Example4\ex4.txt";
input ID Height Gender $ Intervention Result $ Year @@;
run;
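The layout of ex4.txt is not shown here. The trailing @@ on the input statement holds the current line so that more than one observation can be read from it, which suggests the file packs several records per line. A minimal sketch of that behavior, using datalines in place of the external file (the dataset name example4_demo and the two-records-per-line layout are assumptions for illustration):

```sas
* Assumed layout: two observations per line, values separated by spaces;
data example4_demo;
input ID Height Gender $ Intervention Result $ Year @@;
datalines;
46 67 F 0 Yes 2004 752 71 M 0 No 2004
9673 62 F 1 Yes 2004 969 69 M 1 Yes 2004
;
run;

* Four observations are created from the two lines above;
proc print data=example4_demo;
run;
```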
Temporary Vs. Permanent Datasets
Permanent Datasets:
  Datasets have two names.
  Stored in a folder you create within SAS Libraries.
  Purposes:
    Store data as permanent SAS datasets
    Retrieve SAS datasets
  Example library: Class2a
Temporary Vs. Permanent Datasets
Creating a Library to create or access your Permanent databases

Method 1: Libname Statement

libname Class2a "J:\HPH 562 2011\Lecture 2\Class2Data"; run;

Class2a is the name of the library.
"J:\HPH 562 2011\Lecture 2\Class2Data" is the location of your library.

Method 2: Non-programming method
While the Explorer window is highlighted, click File/New.
Fill in the required fields (name of folder, location of folder).

Tip: If you're returning to the same code every time, it's easier to form your library through the Libname statement.
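One way to confirm that a library was assigned as intended is to ask SAS to list it. A brief sketch, reusing the Class2a library and path from the Libname example (the folder must already exist on your machine):

```sas
libname Class2a "J:\HPH 562 2011\Lecture 2\Class2Data";

* Write the library's attributes, including its physical location, to the log;
libname Class2a list;
```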
Temporary Vs. Permanent Datasets
Example: folder J:\CLASS2A, containing the SAS database Hmwk2

1. Make a folder called Class2a.
2. Download or save your SAS database into this folder.
3. Now, in SAS, create a permanent library referencing the folder where your database is saved.

libname PICKLE "J:\CLASS2A"; run;

If you did it correctly, you will see Hmwk2 appear in your PICKLE permanent library. The full name of this database is now PICKLE.Hmwk2.
NOTE: You will also see any other SAS databases that are stored in that folder (J:\CLASS2A).

data VitD;
set PICKLE.Hmwk2;
run;
Finally. What can you do?
Taking a Permanent Database and Making it Temporary:
*Permanent Databases have 2 names

Code:

Libname PermFoldername "BrowserAddress"; run;
data nameoftemporarydatabase;
set Permfoldername.databasename;
run;

PermFoldername: name of the library where my PERMANENT data is stored
BrowserAddress: location of the library where my PERMANENT data is stored

Example:

Libname BASIL "G:\HPH 562\DRAFT\Datasets\NAMCSIII"; run;
data Slide20;           *Name of new Temporary Database;
set BASIL.namcsedit;    *Name of my Permanent database;
run;
Taking a Temporary Database and Making it Permanent:
Code:

data PermFolderName.Permdatabase;
set tempdatabase;
run;

Example:

data BASIL.SLIDE;
set slide21;
run;
In Summary: Getting Data into SAS
Accessing a SAS database:
  It already is permanent! Permanent means that it is a SAS database.
  MUST CREATE A LIBRARY WHERE THE PERMANENT SAS DATABASE IS STORED in order to access the database.
  The only way to look at an Excel database is through the Excel program. Same thing with SAS: the only way to look at a SAS database is through the SAS program. The difference between the two is that SAS does not automatically open when you try to open a SAS database. You must open SAS first and create a permanent library where that database is stored. Then you may look at the data.

Excel/Access Databases:
  Use the Import Wizard! (point & click: File, Import, name, etc.)
  It will be temporary! (Work folder)

Inputting Internal Raw Data:
  Use code (Input & Datalines commands).
  It will be temporary! (Work folder)

Inputting External Raw Data (.TXT data):
  Use code and cite the browser location of the data (Input & INFILE commands).
  It will be temporary! (Work folder)
Proc Contents
Shows (in the output window) information about your dataset:
  Number of Observations
  Variables: Name, Type, Length

Code:

Proc Contents data=nameofdataset;
run;

Example:

Proc Contents data=example3;
run;
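Proc Contents can also summarize a whole library rather than one dataset. A small sketch, assuming you want a quick list of everything currently in the temporary WORK library:

```sas
* List the datasets in the WORK library without the per-dataset detail;
proc contents data=work._all_ nods;
run;
```

Replacing work with a permanent library name (for example, one created with a Libname statement) lists the SAS datasets stored in that folder.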