104 Cards in this Set

  • Front
  • Back

SAS Environment

- Program Editor


- Enhanced Editor


- Log


- Output


- Results


- Explorer


- Table Editor

Editors

Program Editor:


- Not as easy to use as enhanced editor




Enhanced Editor:


- Enhances use because text is color coded


- Code is written here


- Editor must be selected in order to run code

Log

- Shows you the results of execution


- Shows any notes made in blue


- Shows warning messages in green


- Shows error messages in red


- Shows submitted statements in black


- Used for debugging

Output

- Displays listing reports

- In SAS 9.3, output is not sent to the output window by default


- Shows pages

Results

- Default window for output
- Displays html report
- Shows one long page
- Used for navigating previously run results
- Has a bookmark for all of your previously executed code


Explorer

- Navigation tool
- Allows you to navigate between SAS libraries and SAS objects
- You can go into folders and look at different data sets
- Data sets will show up in table editor window

Table Editor

- Used to create new data sets
- Also a view table window
- Can be used to open an existing data set
- Can make modifications to the variable names, data, etc.

SAS File Types

- .sas
- .log
- .lst

SAS Statement

- Keywords


- Semicolon at the end of entire statement

Submitting SAS Programs

- Very easy using F8


- Can submit using the running man symbol, or the menu


- Selecting portion of code means you're submitting just that part


- No selection means submitting all the code

DATA Steps

- We are reading in the data set and manipulating the data set


- Creating new variables, keeping or dropping variables, calculating things, etc.

PROC Steps

- Utility options


- Creating a report, or a subset of data


- Any analysis is done here

SAS Libraries

- Essentially a nickname to a location on your disk drive


- Use libname statement to create a library and a path location to a folder


- Work is a temporary library


- Two-level file name : 'library.data'
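A minimal sketch of the library notes above. The libref mydata, the folder path, and the data set mydata.sales are hypothetical names used only for illustration.

```sas
/* Assign the libref "mydata" to a folder (hypothetical path) */
libname mydata 'C:\sasdata';

/* Two-level name: libref.dataset. Copy a permanent data set into WORK. */
data work.copy;
   set mydata.sales;   /* mydata.sales is an assumed data set */
run;

/* WORK is temporary: work.copy is deleted when the SAS session ends */
proc print data=work.copy;
run;
```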

SAS Data sets

Two Portions:

Descriptor Portion:
- Holds all the general info about your data set: name, creation date, if sorted, attributes, etc.

Data Portion:
- Contains all of the data
- Print it using the PROC PRINT procedure

PROC Contents

- Displays the descriptor portion of the data set

SAS Variables

- Numeric: the default variable type; missing values are displayed as periods


- Character: has to be specified with a $ sign, missing values displayed by blank spaces

Titles

- Used to enhance look of report
- 10 possible lines are available for titles and 10 for footnotes
- Set a specific line with a TITLEn statement (title1 through title10)
- If you change one specific title, all higher-numbered titles are erased
- Default title: The SAS System
- Clear out titles with a null statement: title;
- Can change color, height, justification, font, bold, italics, etc.
- These enhancements only show up in the results viewer (the title text shows in the output window, but the enhancements won't)

Footnotes

- Used to enhance look of report


- Show up at the bottom


- 10 possible lines allocated for titles and footnotes


- Default footnote: nothing


- Clear out footnotes with null statement: footnote;


- Can change color, height, justify, font, bold, italics, etc.


- These enhancements only show up in the results viewer and not the output window (the footnote text will, but the enhancements won't)
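A short sketch of how title and footnote lines behave, using the data1.empdata data set that appears in other cards. The title and footnote text is made up for illustration.

```sas
/* Styling options such as color= and bold affect ODS output only */
title1 color=blue height=14pt justify=left bold 'Employee Report';
title2 'All Departments';
title3 'Salary Details';
footnote1 'Internal use only';

proc print data=data1.empdata;
run;

/* Redefining title2 keeps title1 and cancels title3 */
title2 'Flight Crew Only';

proc print data=data1.empdata;
run;

/* Null statements clear all titles and all footnotes */
title;
footnote;
```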

SAS System Options

- Can change things in output window
- Can change pagesize, numbers, date, timestamp, etc.
- Options statement
- Options set with an OPTIONS statement are cumulative: everything you specified earlier stays in effect until you change it, and changing one option does not reset the others
- OPTIONS is a global statement (it can appear outside or inside a PROC PRINT step)
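As a rough illustration of options staying in effect, the sketch below sets several options and then changes only one. The specific values are arbitrary choices.

```sas
/* Suppress the date and page number, set page and line size */
options nodate nonumber pagesize=40 linesize=80;

proc print data=data1.empdata;
run;

/* Only the date setting changes here; NONUMBER, PAGESIZE= and
   LINESIZE= keep the values set above */
options date;

proc print data=data1.empdata;
run;
```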

WHERE statement

- Used to select observations


- Can be used with most sas procedures


- Ex:


proc print data=data1.stresstest noobs;


where MaxHR>=170;


run;




- General form: WHERE where-expression;



Comparison Operators

- Equal to: =


- Not equal to: ^=, ~=, ne (in a WHERE expression <> also means not equal; in a DATA step <> is the MAX operator)


- Greater than: >


- Less than: <


- Greater than or equal to: >=


- Less than or equal to: <=


- In: where state in ('NC','TX') (case sensitive)

Logical Operators

- And: & or and


- Or: | or or


- Not: ^ or ~ or not



Special Operators

- Like: % replaces any number of characters, _ replaces one character: where code like 'E_U%';
- Between-and
- Contains: ?
- Is missing
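The special operators can be sketched against data1.empdata, assuming it has the JobCode and Salary variables used in other cards. The cutoff values are arbitrary.

```sas
/* LIKE: JobCode values that start with FLT */
proc print data=data1.empdata;
   where JobCode like 'FLT%';
run;

/* BETWEEN-AND is inclusive at both ends */
proc print data=data1.empdata;
   where Salary between 30000 and 50000;
run;

/* CONTAINS (or ?) looks for a substring; IS MISSING finds missing values */
proc print data=data1.empdata;
   where JobCode ? 'LOT' or Salary is missing;
run;
```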

Column Totals

- Can calculate column totals for numeric variables using sum statement

By grouping

- Can be used in proc sort to sort the data set


- Can be used in proc print to group variable values and print them together


- If used in proc print, it must be used on a data set already sorted by that variable


- Using sum and by together gives us subtotals for each by group


- Ex:


proc sort data=data1.admit out=work.admit;


by ActLevel;


run;


proc print data=work.admit;


by ActLevel;


sum Fee;


run;

Page breaks

- PAGEBY is used to separate the by groups onto different pages


- You cannot use a page break on its own, it must be used together with a by statement and thus must be used on a sorted data set


- Ex:


proc print data=work.admit;


by ActLevel;


pageby ActLevel;


sum Fee;


run;

ID variables

- ID statement overwrites the observation column and replaces it with whatever variable you specify


- If you specify a by variable with the same id variable, you will get the by variable and its value on the upper left corner for each grouping (changes look of report)


- Ex:


proc print data=work.empdata;


by JobCode;


id JobCode;


sum Salary;


run;

Column Width

- Changes column width in output window


- If you want width to be consistent across all columns use width=uniform


- Ex:


proc print data=data1.empdata width=uniform;


run;

Number of Observations - N

- Displays number of observations in your report
- Also specifies descriptive text
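A minimal sketch of the N= option, assuming the data1.empdata data set from other cards. The label text is made up.

```sas
/* N= prints the observation count at the end of the report,
   preceded by the descriptive text in quotes */
proc print data=data1.empdata n='Number of employees: ';
run;
```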

PROC Sort

- Sorts data set
- proc sort data=libref.dataset;
- Replaces the original data set unless you specify otherwise
- You have to specify an OUT= data set for SAS to create a new data set and not replace the old one
- You have to have a BY statement
- By default, sorts in ascending order
- The DESCENDING keyword in front of a variable name sorts that variable in descending order
- Ex:
proc sort data=data1.empdata out=work.jobsal;
by Salary;
run;

SAS Syntax rules

- Not case sensitive except in case of strings


- You can make everything upper or lower case


- SAS is free format (statement can span multiple lines) except in the case of datalines

Data and Program Errors

Data Errors:


- Occurs when things like character values are read into numeric variables




Program Errors:


- Occurs when part of the code is incorrect


- Execution of the program is halted



Sources of SAS Data

- Data Entry


- Existing SAS Datasets


- Import


- Datalines


- Infile

Data Entry

- Opening table editor and typing in the data

Existing SAS Datasets

- Use the set command within the data step to read them in


- Ex:


data work.bonus;


set data1.fltattnd;


run;

Import

- Uses import wizard to bring in data from an external program


- Ex:


PROC IMPORT DATAFILE='X:\PStat130\data1\DallasLA.xls'


OUT=WORK.tdfwlax


DBMS=XLS REPLACE;


SHEET='DFWLAX';


GETNAMES=YES;


RUN;




- Notice there is no semicolon until after the DBMS= option and REPLACE

Datalines

- With raw data SAS doesn't know what to do, so you have to give it more info
- Uses data step for raw data
- Used for small data, certain structure and format
- Ex:
data work.sample;
input firstname $ gender $ age;
datalines;
John Male 22
Jane Female 19
;
run;

-input specifies variables

Infile

- Using data step
- Used for raw data (pointing sas to raw datafile)
- Ex:
data work.sample;
infile 'D:\UCSB\sample.txt';
input name $ gender $ age;
run;

Input Statement

- Used in conjunction with reading in raw data
- You need to tell sas the variable name,variable type, attributes
- 3 types: list, column, formatted

List input

- Data is possibly free format
- All data needs to be standard numeric or character input
- Every single variable has to be read in sequentially
- Variable name with $ or not
- Character values are limited to 8 characters by default
- Each value is separated by a space (the delimiter)
- Ex:
data work.students;
input Name $ Gender $ Age;
datalines;
David Male 19
Amelia Female 23
Ravi Male 17
Ashley Female 20
Jim Male 26
;
run;

Column Input

- Data is not free format


- Data is within fixed columns


- Tell sas what columns correspond with which variables


- In column you can choose to read in a subset of variables and set the column order to whatever you want


- Must be standard numeric or character


- lets you read in character values greater than 8


Ex:


data work.students;
input Name $ 1-6 Gender $ 9-14 Age 18-20;
datalines;
David   Male     19
Amelia  Female   23
Ravi    Male     17
Ashley  Female   20
Jim     Male     26
;
run;

Formatted Input

- Used for nonstandard numeric or character type


- Allowed to read in data with symbols, etc.


- Converts these values into standard numeric or character values


- When printed without a format, the stored value is shown (a date prints as a number)


- Ex:


data students;
input Name $ Gender $ Age Enroll mmddyy8.;
datalines;
David Male 19 06/18/10
Amelia Female 23 08/02/10
Ravi Male 17 07/22/10
Ashley Female . 09/14/10
Jim Male 26 08/26/10
;
run;




- The date is not in standard format, so the informat mmddyy8. is used



Relative and Absolute Pointer Control

Column:


- Relative: use + sign


- Absolute: use @ symbol




Line:


- Absolute: #n


- Relative: /
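A small sketch of the pointer controls above, under the assumption that the raw data is laid out in the fixed columns shown. The data values and data set names are illustrative only.

```sas
data work.students;
   /* @1 and @18 are absolute column pointers; +2 skips two columns */
   input @1 Name $6. +2 Gender $6. @18 Age 2.;
   datalines;
David   Male     19
Amelia  Female   23
;
run;

data work.twolines;
   /* / moves to the next input line within the same observation */
   input Name $ / Age;
   datalines;
David
19
Amelia
23
;
run;
```

Each step creates two observations. In the second step every observation is built from two raw data lines.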

Write a program that permanently changes the variable name EmpID to EID in the data set work.empdata. Do not use a DATA step.

proc datasets library=work;


modify empdata;


rename EmpID=EID;


run;


quit;

Write a program that creates a SAS data set named work.oscars from the worksheet oscars in the Excel file entertainment.xls. This file is located in the folder 'C:\Desktop\data'. Make sure to replace a similarly named file if one exists. Place the title “2014 Oscar Winners” on the top of each page.

title '2014 Oscar Winners';


proc import out=work.oscars datafile='C:\Desktop\data\entertainment.xls' dbms=xls replace;


sheet='oscars';


run;

What does the following SAS code output? proc print; run;

The last successfully created data set

Which of the following is NOT a valid SAS data set name?

4thquarter

PROC DATASETS

- Can be used to permanently modify attributes (name of variable (rename), labels, formats, etc.)

Drop and Keep statements

- For output datasets (what do you want to be output)

Drop= and Keep=

- Data set options applied to a specific data set (in a DATA, SET, or PROC statement)

Creating variables

- State name of variable = .....
- Ex:
Tax = salary * .05;

Arithmetic Operators

- Multiplication: *


- Division: /


- Addition: +


- Subtraction: -


- Exponent: **


- Negative: -

SAS Functions

- Month(SAS-date)



Selecting observations

-Delete
-Where
-If

Datetime values

-A SAS date value is stored as the number of days since January 1, 1960; a SAS datetime value is stored as the number of seconds since midnight, January 1, 1960
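A quick way to see the stored values is to write date and datetime constants to the log. The variable names here are arbitrary.

```sas
data _null_;
   d1 = '01JAN1960'd;            /* day zero */
   d2 = '02JAN1960'd;            /* one day later */
   dt = '01JAN1960:00:01:00'dt;  /* one minute after midnight */
   put d1= d2= dt=;              /* writes d1=0 d2=1 dt=60 to the log */
run;
```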

Suppressing ID column

-Add the NOOBS option to the PROC PRINT statement to suppress the Obs column

Format

- Put the FORMAT statement in a PROC PRINT step for a temporary format, or in a DATA step for a permanent one


- format variable formatw.d; where w is width and d is decimal places


-Ex:


proc print data=data1.empdata split=' ';


format Salary dollar11.2;


run;



User defined formats

Step 1 (Create format):


proc format;


value $codefmt 'FLTAT'='Flight Attendant' 'PILOT'='Pilot';


run;




Step 2 (apply format):


proc print data=data1.empdata;


format JobCode $codefmt.;


run;

Set

-Used to append data sets


-Whatever order the sets appear in the set statement is the order they appear in the data set


-Variable names and data types should be the same in both data sets


- Variables that exist in only one data set get missing values in observations from the other data sets




Data work.qtr1;


set work.jan work.feb work.mar;


run;




This combines the three data sets in the order of jan, feb, mar.





Merge-by

- Combining two data sets with at least one common variable and other unique variables


- Set A has m records and k unique variables


- Set B has n records and j unique variables


- Combined set has max(m,n) records and k+j+1 variables (if there is one common variable)


-Records from each set with the same value of the common BY variable are linked and output as one record


-If you omit the BY statement, the first record from each data set is output together as one record, then the second from each, and so on, without being linked by a common variable


-Both data sets must be sorted by the BY variable before merging

PROC MEANS

-Calculate and display simple summary statistics


-Summarizes numeric variables


-Count, mean, Standard deviation, min, max


-BY and CLASS statements can be used to create summaries for sub-groups


-Determine which statistics wanted using options in proc means line


-OUTPUT creates output data set containing summary stats
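A minimal sketch of CLASS and OUTPUT together, assuming data1.admit contains the ActLevel and Fee variables used in the BY-group cards. The output data set name and the AvgFee variable are hypothetical.

```sas
proc means data=data1.admit n mean maxdec=1;
   class ActLevel;                        /* one summary row per activity level */
   var Fee;                               /* numeric variable to summarize */
   output out=work.feestats mean=AvgFee;  /* save the means in a data set */
run;

proc print data=work.feestats;
run;
```

Unlike BY, the CLASS statement does not require the data set to be sorted first.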

PROC FREQ

-By default, analyzes every variable in the data set


-Displays each distinct data value


-Calculates the number of observations in which each value appears (and the corresponding percentage)


-Indicates missing values for each variable


-Use tables statement to select variables and options


-For a two way table do variable1*variable2 where 1 is row 2 is column
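A sketch of one-way and two-way tables, assuming data1.admit has an ActLevel variable as in other cards. The Sex variable is an assumption made only for illustration.

```sas
proc freq data=data1.admit;
   tables ActLevel;                      /* one-way frequency table */
   tables ActLevel*Sex / norow nocol;    /* rows = ActLevel, columns = Sex (assumed variable) */
run;
```

The NOROW and NOCOL options suppress the row and column percentages in the two-way table.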

PROC TABULATE

-Calculate and display multi-dimensional tables with summary statistics


-Able to group up to three dimensions


-Will generate frequency by default


-Use multiple table statements to create multiple tables


-Classification variables can be either character or numeric but analysis needs to be numeric


-Will print sum by default but can be set to mean, median, etc.


-ALL used after the analysis variable gives the sum; use all*mean for the mean

PROC REPORT

-Create listing and summary reports

-by default creates listing report
-Use column statement instead of var statement
-can specify format using 'format= ' and label
-Character variables used as Display variables, Numeric variables used as analysis variables
-Can add enhancements like headline or headskip
-subtotals and grand totals using break and rbreak statements

PROC GCHART

- HBAR, VBAR, or PIE


-Can graph character or numeric variables


-Displays frequency by default. If you want other than freq must apply analysis variable (sumvar=) and then specify type=(mean or sum)


-Use explode to pop out piece of pie chart

SYMBOLn:

-Used within GPLOT to define plotting symbols, draw lines through data points, specify symbol and line color

PROC GPLOT

-Used to produce scatterplots and graphs


-Use symbol and label to edit plot

ODS: Output Delivery System

-ODS HTML statement opens, closes, and manages HTML destination


-Works with output from PROC PRINT, MEANS, or FREQ, and with graphs if a GOPTIONS statement is used


- Steps are: open HTML destination for output, generate output, close destination


-Can use to convert sas data set into other file types

OUTPUT Statement

-If included in data step SAS writes a record immediately


-You can create multiple records from one observation, write to multiple output data sets, combine info from multiple obs into a single record when used with RETAIN

RETAIN Statement

-Allows you to use a variable from a previous iteration

First./Last. BY-variables

-When you read sorted data in a DATA step and use a BY statement, two temporary variables are created to identify the first record and the last record in each by group




First.BY-variable


Last.BY-variable

Accumulating totals for BY groups

-Set accumulator variable to zero at start of each by group


-Increment with a sum statement


-Output only the last observation of each BY group
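The three steps above can be sketched as follows, assuming data1.empdata has the JobCode and Salary variables used in other cards. The data set names work.sorted and work.totals and the variable TotalSalary are hypothetical.

```sas
/* The data must be sorted by the BY variable first */
proc sort data=data1.empdata out=work.sorted;
   by JobCode;
run;

data work.totals(keep=JobCode TotalSalary);
   set work.sorted;
   by JobCode;                              /* creates first.JobCode and last.JobCode */
   if first.JobCode then TotalSalary = 0;   /* reset at the start of each group */
   TotalSalary + Salary;                    /* sum statement: retained across iterations */
   if last.JobCode then output;             /* write one record per group */
run;
```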

DROP=, KEEP=

data army(keep=Code Airport);


set data2.military(drop=City State Country);


if Type eq 'Army' then output;


run;

SUM Statement

Variable + Expression;




-Creates the variable on the left side if it doesn't exist


-Initializes the variable to zero before the first iteration of the Data step


-Automatically retains variable

DO loops

DO-END - Executes statements as a unit, usually a part of If-then/else


Iterative DO - Executes a group of statements repetitively based on the value of an index variable


DO WHILE - Executes group of statements as long as the condition stays true, condition is checked before each loop iteration


DO UNTIL - Executes group of statements until condition is true, condition is checked after each loop iteration
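A small sketch contrasting the four forms. All variable names and values are made up for illustration, and the results are written to the log with PUT.

```sas
data _null_;
   /* DO-END: run several statements as a unit inside IF-THEN */
   Salary = 40000;
   if Salary > 30000 then do;
      Bonus = Salary * 0.05;
      put Bonus=;
   end;

   /* Iterative DO: i takes the values 1 through 5 */
   Total = 0;
   do i = 1 to 5;
      Total = Total + i;
   end;
   put Total=;

   /* DO WHILE: condition is tested before each iteration */
   n = 0;
   do while (n < 3);
      n = n + 1;
   end;

   /* DO UNTIL: condition is tested after each iteration,
      so the body runs once even though m already exceeds 3 */
   m = 10;
   do until (m >= 3);
      m = m + 1;
   end;
   put n= m=;
run;
```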

Create Custom Reports (DATA _NULL_)

-Uses data step but does not create data set


-Instead writes output to a specified file using put statements which control exact location and format of output file
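A minimal sketch of a custom report, assuming data1.empdata has the JobCode and Salary variables used in other cards. The output file name is hypothetical.

```sas
data _null_;                      /* no data set is created */
   set data1.empdata;
   file 'salary_report.txt';      /* hypothetical output file */
   /* Write a header line once, before the first record */
   if _n_ = 1 then put @1 'Job Code' @15 'Salary';
   /* @n positions each value at an exact column */
   put @1 JobCode @15 Salary dollar11.2;
run;
```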



COLON Modifier

-Use to read each value only as far as the next delimiter


-Allows you to use informats with list input to handle nonstandard data values



INFILE Statement Options

-Handle non-blank delimiters with DLM='delimiter'


infile 'students.txt' dlm=',';


-Missing data at end of row: by default SAS loads the next record to finish the observation


-The MISSOVER option stops that from happening; SAS sets the remaining variables to missing instead


-DSD option sets default delimiter to a comma and treats consecutive delimiters as missing values
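A sketch combining these options, assuming students.txt is a comma-separated file with a name, gender, and age on each line. The data set name is illustrative.

```sas
data work.students;
   /* DSD: comma delimiter, and two commas in a row mean a missing value.
      MISSOVER: a short line leaves the remaining variables missing
      instead of reading from the next line. */
   infile 'students.txt' dsd missover;
   input Name $ Gender $ Age;
run;
```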

Single trailing @ modifier

-Tells SAS to hold the current input line for further processing


-Holds until there is an input statement with no @ or the bottom of the data step

Double trailing @@ modifier

-Holds raw data record in the input buffer until sas reads past the end of the line
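A short sketch of both modifiers with made-up inline data. The data set and variable names are illustrative.

```sas
/* Double trailing @@: several observations on one raw data line */
data work.scores;
   input Name $ Score @@;
   datalines;
Ann 90 Bob 85 Cy 77
;
run;

/* Single trailing @: read part of the line, test it, then read the rest */
data work.army;
   input Type $ @;                 /* hold the line */
   if Type = 'Army' then do;
      input Code $ Airport $;      /* continue on the same line */
      output;
   end;
   datalines;
Army ABC Base1
Navy DEF Base2
;
run;
```

The first step creates three observations from one line. The second keeps only the Army record, because the held Navy line is released at the bottom of the DATA step without being read further.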

Variable Lists

-Numbered Range lists - variables start with same name and end with number (var week1-week52)


-Name Range lists - variables appear in consecutive order in the set (var mon--sun)


-Name Prefix lists- begin with specified character string (of SALES:)


-Special SAS name lists-_NUMERIC_ _CHARACTER_ _ALL_

Index(string,target);

returns position of specific character within string




INDEX('SMITH-JOHN','-') = 6




returns 0 if it isn't in string

SUBSTR(string,start <,length>);

-Extracts a portion of the character variable




SUBSTR('PSTAT130',6,3) = '130'


SUBSTR('PSTAT130',6) = '130'

SCAN(string, n <, delimiters>);

-Parses a character string into a set of "words" using a delimiter




SCAN('Smith, John', 1) = 'Smith'


SCAN('Smith, John', 2) = 'John'

||

-Joins two or more strings together




'John' || 'Smith' = 'JohnSmith'

TRIM()

-Removes any trailing blank spaces




TRIM ('JOHN ') = 'JOHN'

ROUND()

-Rounds to the nearest integer (up or down)


-Round(12.12) = 12

CEIL()

-Rounds up only


-CEIL(4.4) = 5

FLOOR()

-Rounds down only


-FLOOR(3.6) = 3

INT()

-Removes any decimals from a number


INT(4.8)=4

INPUT(source,informat):

-Uses a SAS informat to convert a character string into a number




input(CVar4,mmddyy6.);

PUT(source,format):

-Converts number into character string


AreaCode=805


Put(Areacode,3.) = '805'
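A small sketch showing INPUT and PUT side by side. The variable names and the phone number are made up for illustration.

```sas
data _null_;
   CVar = '061810';
   EnrollDate = input(CVar, mmddyy6.);   /* character to numeric SAS date */

   AreaCode = 805;
   AC = put(AreaCode, 3.);               /* numeric to character */
   Phone = '(' || AC || ') 555-0100';    /* concatenation needs character values */

   put EnrollDate= date9. Phone=;        /* write both results to the log */
run;
```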

MERGE-BY Code

data allscores;


merge midterm final;


by name;


run;

RENAME=

-When appending data sets use to create common variable names


-When merging data sets use to create unique variable names




data allsections;


set morning


afternoon(RENAME=(testscore=score));


run;

PROC MEANS Code

proc means data=data1.admit n mean stddev maxdec=2;


var age height weight;


run;

PROC FREQ Code

PROC FREQ data=sas-data-set;


TABLES variable-list / options;


run;

PROC TABULATE code

PROC TABULATE Data=sas-data-set ;


CLASS class-variables;


VAR analysis-variables;


TABLE <<page-expression,> row-expression,> column-expression </ options>;


run;




every variable used in the TABLE statement needs to be listed in either the CLASS or the VAR statement

PROC REPORT code

proc report data=data1.crew nowd;


column JobCode Location Salary;


define JobCode / order width=8 'Job Code';


define Location / 'Home Base';


define Salary / format=dollar10.;


run;




-Order keyword identifies variable used to order report

PROC GCHART Code

proc gchart data=data1.crew;


vbar JobCode / sumvar=Salary type=mean;


run;




displays average salary for each jobcode

PROC GPLOT Code

proc gplot data=data1.admit;


plot weight*height / regeqn;


symbol v=dot i=rlCLM95;


run;


quit;




regeqn gives regression equation, rlCLM95 gives regression line and 95% Confidence level

ODS Code

ods html file='Salary.xls';
proc print data=data1.empdata label noobs;
label Salary='Annual Salary';
title1 'Salary Report';
run;
ods html close;

writes HTML output to a file with an .xls extension, which Excel can open

Do Code

DATA odd;


do i= 1 to 100 by 2;


output;


end;


run;

Colon Modifier Code

data airplanes;


infile 'airdata.txt';


input ID $ InService : date9.


PassCap


CargoCap;


run;