What are the semantics behind usage of the words "delimiter," "terminator," and "separator"? For example, I believe that a terminator would occur after each token and a separator between each token. Is a delimiter the same as either of these, or are they simply forms of a delimiter?
SO has all three as tags, yet they are not synonyms of each other. Is this because they are all truly different?
When designing a data file format, use delimiters that will not appear in the data or padding, or use CSV or SSV forms. When copying from a table into a file, you can insert delimiters independently of columns. For example, to insert a newline character at the end of a line, specify nl=d1 at the end of the column list.
A delimiter is a sequence of one or more characters for specifying the boundary between separate, independent regions in plain text, mathematical expressions or other data streams. An example of a delimiter is the comma character, which acts as a field delimiter in a sequence of comma-separated values.
This value is the separator. For example, DELIMITER=";" selects semicolon-separated values. The semicolon must be quoted as ";", otherwise it is treated as a comment. When decoding delimited values, leading and trailing blanks, and leading and trailing quotation marks (" and ') in each value field are ignored.
A delimiter is one or more characters that separate text strings. Common delimiters are commas (,), semicolon (;), quotes ( ", ' ), braces ({}), pipes (|), or slashes ( / \ ). When a program stores sequential or tabular data, it delimits each item of data with a predefined character.
A delimiter denotes the limits of something, where it starts and where it ends. For example:
"this is a string"
has two delimiters, both of which happen to be the double-quote character. The delimiters indicate what's part of the thing, and what is not.
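As a rough illustration of delimiters marking where something starts and ends, here is a minimal Python sketch. The function name extract_delimited is made up for this example, and it assumes the same character opens and closes each region and that there is no escaping inside it.

```python
import re

def extract_delimited(text, delimiter='"'):
    """Return the substrings enclosed by matching pairs of `delimiter`.

    Hypothetical helper: assumes the opening and closing delimiter are
    the same character and that the delimiter is never escaped inside.
    """
    d = re.escape(delimiter)
    # Non-greedy match so each pair of delimiters yields one region.
    return re.findall(f"{d}(.*?){d}", text)

print(extract_delimited('say "this is a string" and "another"'))
```

Everything between a pair of delimiters is part of the thing; everything outside is not.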
A separator distinguishes two things in a sequence:
one, two
1\t2
code(); // comment
The role of a separator is to demarcate two distinct entities so that they can be distinguished. (Note that I say "two" because in computer science we're generally talking about processing a linear sequence of characters.)
A terminator indicates the end of a sequence. In a CSV, you could think of the newline as terminating the record on one line, or as separating one record from the next.
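The practical difference between the two readings shows up at the end of the input. Here is a small Python sketch, assuming single-character markers; split_separated and split_terminated are hypothetical names, not from any library.

```python
def split_separated(text, sep=","):
    """Treat `sep` as a separator: it sits between items,
    so n items are written with n - 1 separators."""
    return text.split(sep)

def split_terminated(text, term="\n"):
    """Treat `term` as a terminator: it follows every item,
    including the last one, so n items carry n terminators."""
    if not text:
        return []
    if not text.endswith(term):
        raise ValueError("last item is not terminated")
    # split() leaves one empty string after the final terminator; drop it.
    return text.split(term)[:-1]

print(split_separated("a,b,c"))       # three items, two separators
print(split_separated("a,b,c,"))      # trailing separator yields an empty fourth item
print(split_terminated("a\nb\nc\n"))  # three items, three terminators
```

Under the separator reading a trailing marker implies one more (empty) item; under the terminator reading it just closes the last item.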
Token boundaries are often denoted by a change in syntax classes:
foo()
would likely be tokenised as word(foo), lparen, rparen. There aren't any explicit delimiters between the tokens, but a tokenizer would recognise the change in grammar classes between alpha and punctuation characters.
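To make the foo() case concrete, here is a minimal regex-based tokenizer sketch in Python. The token names mirror the ones used above; the grammar (identifiers, parentheses, whitespace) is an assumption chosen just for this example.

```python
import re

# One named alternative per syntax class; the names are illustrative.
TOKEN_RE = re.compile(
    r"(?P<word>[A-Za-z_]\w*)|(?P<lparen>\()|(?P<rparen>\))|(?P<space>\s+)"
)

def tokenize(text):
    """Split text into tokens wherever the syntax class changes."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        kind = m.lastgroup  # name of the alternative that matched
        if kind == "word":
            tokens.append(f"word({m.group()})")
        elif kind != "space":  # whitespace is skipped, not emitted
            tokens.append(kind)
        pos = m.end()
    return tokens

print(tokenize("foo()"))
```

No character in the input acts as a delimiter here; the boundaries come purely from which pattern matches next.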
The categories aren't completely distinct. For example:
[red, green, blue]
could (depending on your syntax) be a list of three items; the brackets delimit the list and the right-bracket terminates the list and marks the end of the blue token.
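A short Python sketch of how the roles overlap when parsing that example. It assumes a toy syntax (square brackets around the list, commas between items, no nesting or quoting), and parse_list is a hypothetical name.

```python
def parse_list(text):
    """Parse a bracketed, comma-separated list under a toy syntax
    (no nesting, no quoting, no escaped commas)."""
    text = text.strip()
    # The brackets delimit the list: they mark where it starts and ends.
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("list must be delimited by [ and ]")
    body = text[1:-1].strip()
    if not body:
        return []
    # The commas separate the items; the last item has no comma after it,
    # so its end is marked by the closing bracket instead.
    return [item.strip() for item in body.split(",")]

print(parse_list("[red, green, blue]"))
```

The closing bracket does double duty: it ends the list and, because no comma follows blue, it also ends the last item.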
As for SO's use of those terms as tags, they're just that: tags to indicate the topic of a question. There isn't a single unified controlled vocabulary for tags; anyone with enough reputation can add a new tag. Enough differences in terminology exist that you could never have a single controlled tag vocabulary across all of the topics that SO covers.