Generate variable containing number of characters in a string variable

Tags:

stata

In a survey dataset I have a string variable (type: str244) with qualitative responses. I want to count the number of characters in each response/string and generate a new variable containing this number.

Using the egenmore package, I have already counted the number of words with nwords(), but I cannot find a counterpart for counting characters.

EXAMPLE:

egen countvar = nwords(stringvar)

where countvar is the new variable name and stringvar is the string variable.

Does such an egen function exist for counting characters?

harre asked Aug 05 '15 17:08

1 Answer

There is no egen function because there has long been a function, in the strict sense, to do this. In recent versions of Stata the function is called strlen(), but the older name length() continues to work:

. sysuse auto
(1978 Automobile Data)

. gen l1 = length(make)

. gen l2 = strlen(make)

. su l?

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
          l1 |         74    11.77027    2.155257          6         17
          l2 |         74    11.77027    2.155257          6         17

See help functions and (e.g.) this tutorial column.
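Applied to the variable names in the question, the same idea can be sketched as follows. The three toy responses are invented for illustration and are not from the survey data. One assumption worth flagging: in Stata 14 and later, strlen() counts bytes, so responses containing non-ASCII characters may be better served by ustrlen(), which counts Unicode characters.

```stata
* Toy data standing in for the survey responses (illustrative values only)
clear
input str20 stringvar
"hello"
"two words"
"a"
end

* Number of characters (bytes) in each response
generate countvar = strlen(stringvar)

* Stata 14 or later only: count Unicode characters instead of bytes
generate countchars = ustrlen(stringvar)

* For plain ASCII text the two counts agree
assert countvar == countchars

list stringvar countvar countchars
```

No egen is needed here because strlen() is an ordinary function that works observation by observation inside generate.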

Nick Cox answered Oct 28 '22 09:10
