MAKE CATEGORICAL OR 1/0 FIELDS FROM STRING VARIABLES
- encode party, gen(party_n)
-> encodes each distinct string as an integer (1, 2, 3, etc., in alphabetical order) and keeps the original strings as value labels; the result is categorical, not 1 or 0
- encode mv, gen(mv_no)
- gen x = party =="HDU"
-> creates x = 1 where party is exactly "HDU" and 0 otherwise. The comparison is case sensitive: use upper(party) if needed
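A minimal sketch of the two approaches side by side. The party values below are made up for illustration, and is_hdu is a hypothetical variable name.

```stata
* Toy data (values invented for this example)
clear
input str3 party
"HDU"
"abc"
"hdu"
"XYZ"
end

* encode: one integer code per distinct string, with value labels attached
encode party, gen(party_n)
tabulate party_n, nolabel

* 1/0 indicator; upper() makes the match case-insensitive
gen is_hdu = upper(party) == "HDU"
list party party_n is_hdu
```

Here "HDU" and "hdu" get different codes from encode, but both get is_hdu = 1 because of upper().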
TO FIND PART OF A STRING INSIDE A VARIABLE
- gen x = strpos(diagnosis,"RSV") >0
-> codes 1 if the string "RSV" appears anywhere in the field (at least once), otherwise 0. strpos() is case sensitive
- gen x = regexm(var1,"RSV")
-> codes 1 if the pattern "RSV" matches anywhere in the var1 string, otherwise 0
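A short sketch comparing the two searches on toy data. The diagnosis values and the has_rsv* variable names are assumptions made for this example.

```stata
* Toy data (values invented for this example)
clear
input str30 diagnosis
"RSV bronchiolitis"
"Bronchiolitis, rsv positive"
"Asthma"
end

* strpos() returns the position of the first match, or 0 if there is none
gen has_rsv    = strpos(diagnosis, "RSV") > 0

* Same search, ignoring case
gen has_rsv_ci = strpos(upper(diagnosis), "RSV") > 0

* regexm() returns 1 if the regular expression matches, 0 otherwise
gen has_rsv_re = regexm(diagnosis, "RSV")

list
```

For a plain word like "RSV" the strpos() and regexm() versions should agree; regexm() is mainly useful when the pattern needs wildcards or alternatives.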
STRING FUNCTIONS
- substr(name,1,comma-1)
-> extracts from the start of name up to the first comma (comma here is the position of the first comma, e.g. from strpos(name,","))
substr("abcdef",2,3) = "bcd"
substr("abcdef",-3,2) = "de"
substr("abcdef",2,.) = "bcdef"
substr("abcdef",-3,.) = "def"
substr("abcdef",2,0) = ""
substr("abcdef",15,2) = ""
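A sketch of the comma extraction, assuming comma is built with strpos(). The names below are invented, and surname is a hypothetical variable name.

```stata
* Toy data (values invented for this example)
clear
input str20 name
"Smith, John"
"Lee, Ann"
"Madonna"
end

* Position of the first comma; 0 if the name has no comma
gen comma = strpos(name, ",")

* Everything before the first comma; keep the whole name when there is no comma
gen surname = substr(name, 1, comma - 1) if comma > 0
replace surname = name if comma == 0

list
```

The "if comma > 0" guard avoids calling substr() with a length of -1 for names that contain no comma.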
SPLITTING WORDS (the egen functions below need egenmore: ssc install egenmore)
Word count
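- egen x = wordof(var1), word(1)
-> picks one word from a string: word(1) is the first word (needs egenmore)
- egen xx = wordof(dx1), word(2)
-> word(2) is the second word; use word(-1) for the last word (needs egenmore)
- split x, parse(@)
-> splits x into new variables (x1, x2, etc.) at each "@"
- egen Grade = ston(grade), to(1/5) from(Poor Fair Good "Very good" Excellent)
-> maps each string to a number: Poor = 1, Fair = 2, and so on up to Excellent = 5 (needs egenmore)
- egen yy = dayofyear(date_ad), m(1)
-> counts the day of the year starting from January; m(5) would start the year in May (needs egenmore)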
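A sketch putting wordof(), ston() and split together, assuming egenmore is installed from SSC. The data values and the variable names first_word, last_word and email are made up for illustration.

```stata
* One-time install of the egenmore package (provides wordof() and ston())
* ssc install egenmore

* Toy data (values invented for this example)
clear
input str20 dx1 str10 grade str20 email
"acute RSV infection" "Good" "ann@example.org"
"asthma"              "Poor" "bob@example.org"
end

* First and last word of each diagnosis
egen first_word = wordof(dx1), word(1)
egen last_word  = wordof(dx1), word(-1)

* Map ordered text categories to the numbers 1 to 5
egen Grade = ston(grade), from(Poor Fair Good "Very good" Excellent) to(1/5)

* split is built in: creates email1, email2 from the text either side of "@"
split email, parse(@)

list
```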