
When to use the terms "delimiter," "terminator," and "separator"

What are the semantics behind usage of the words "delimiter," "terminator," and "separator"? For example, I believe that a terminator would occur after each token and a separator between each token. Is a delimiter the same as either of these, or are they simply forms of a delimiter?

SO has all three as tags, yet they are not synonyms of each other. Is this because they are all truly different?

Tim Lehner asked Feb 02 '12 19:02



1 Answer

A delimiter denotes the limits of something, where it starts and where it ends. For example:

"this is a string" 

has two delimiters, both of which happen to be the double-quote character. The delimiters indicate what's part of the thing, and what is not.
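As a minimal Python sketch of that idea, assuming a hypothetical helper named strip_delimiters (it is not a standard library function), the delimiters can be checked and then dropped to leave only the content they enclose:

```python
def strip_delimiters(text, opening='"', closing='"'):
    """Return the content between an opening and a closing delimiter.

    Hypothetical helper for illustration; raises ValueError if the text
    is not wrapped in the expected delimiters.
    """
    if len(text) < len(opening) + len(closing):
        raise ValueError("text is too short to be delimited")
    if not (text.startswith(opening) and text.endswith(closing)):
        raise ValueError("text is not delimited as expected")
    # Everything between the two delimiters is part of the thing itself.
    return text[len(opening):len(text) - len(closing)]


content = strip_delimiters('"this is a string"')
```

The opening and closing delimiters are separate parameters because they need not be the same character, as with brackets or parentheses.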

A separator distinguishes two things in a sequence:

one, two
1\t2
code();  // comment

The role of a separator is to demarcate two distinct entities so that they can be distinguished. (Note that I say "two" because in computer science we're generally talking about processing a linear sequence of characters.)

A terminator indicates the end of a sequence. In a CSV, you could think of the newline as terminating the record on one line, or as separating one record from the next.
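The two readings of the newline give different results in practice. Here is a small Python sketch, using a made-up two-record sample, that contrasts them:

```python
text = "a,b\nc,d\n"

# Newline as a separator: splitting leaves an empty item after the last newline.
as_separator = text.split("\n")

# Newline as a terminator: each record ends with a newline, so nothing
# follows the last one.
as_terminator = text.splitlines()
```

Here as_separator has three items, the last of them empty, while as_terminator has two. Note that str.splitlines also recognises line boundaries other than "\n", so it is only a stand-in for "newline-terminated records" on simple input like this.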

Token boundaries are often denoted by a change in syntax classes:

foo() 

would likely be tokenized as word(foo), lparen, rparen. There aren't any explicit delimiters between the tokens, but a tokenizer would recognize the change in syntax class between alphabetic and punctuation characters.
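A rough Python sketch of such a tokenizer, assuming just three token classes named after the ones above (the pattern and the tokenize function are my own illustration, not any particular language's lexer):

```python
import re

# Hypothetical token classes for illustration: an identifier-like word,
# a left parenthesis, and a right parenthesis.
TOKEN_PATTERN = re.compile(r"(?P<word>[A-Za-z_]\w*)|(?P<lparen>\()|(?P<rparen>\))")


def tokenize(source):
    """Split source into (kind, text) pairs without any explicit separators."""
    tokens = []
    position = 0
    while position < len(source):
        match = TOKEN_PATTERN.match(source, position)
        if match is None:
            raise ValueError(f"unexpected character {source[position]!r}")
        # lastgroup is the name of the alternative that matched.
        tokens.append((match.lastgroup, match.group()))
        position = match.end()
    return tokens


tokens = tokenize("foo()")
```

The token boundary falls wherever one pattern stops matching and another starts. This sketch has no rule for whitespace, so it rejects any input containing spaces.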

The categories aren't completely distinct. For example:

[red, green, blue] 

could (depending on your syntax) be a list of three items: the brackets delimit the list, the commas separate the items, and the right bracket both terminates the list and marks the end of the blue token.
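To make the overlapping roles concrete, here is a minimal Python sketch that assumes the simplest possible syntax (no nesting, no quoting, no commas inside items); parse_list is a hypothetical name:

```python
def parse_list(text):
    """Parse text such as "[red, green, blue]" into a list of item strings."""
    text = text.strip()
    # The brackets are delimiters: they mark where the list starts and ends.
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("list must be delimited by [ and ]")
    inner = text[1:-1]
    if not inner.strip():
        return []
    # The commas are separators: they sit between items, not after each one.
    return [item.strip() for item in inner.split(",")]


items = parse_list("[red, green, blue]")
```

There is no separator after blue, so it is the closing bracket that ends the last item as well as the list.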

As for SO's use of those terms as tags, they're just that: tags to indicate the topic of a question. There isn't a single unified controlled vocabulary for tags; anyone with enough reputation can create a new tag. Enough differences in terminology exist that you could never have a single controlled tag vocabulary across all of the topics that SO covers.

Ian Dickinson answered Sep 27 '22 22:09