
Lex strings with single, double, or triple quotes

My objective is to lex string literals the way Python does.

Question: How do I write a lexer that supports the following:

  1. "string..."
  2. 'string...'
  3. """multi line string \n \n end"""
  4. '''multi line string \n \n end'''

Some code:

states = (
        ('string', 'exclusive'),
        )

# Strings
def t_begin_string(self, t):
    r'(\'|(\'{3})|\"|(\"{3}))'
    t.lexer.push_state('string')

def t_string_end(self, t):
    r'(\'|(\'{3})|\"|(\"{3}))'
    t.lexer.pop_state()

def t_string_newline(self, t):
    r'\n'
    t.lexer.lineno += 1

def t_string_error(self, t):
    print("Illegal character in string '%s'" % t.value[0])
    t.lexer.skip(1)


My current idea is to create 4 unique states that will match the 4 different string cases, but I'm wondering if there's a better approach.

Thanks for your help!

Steve Peak asked Dec 12 '13 12:12




1 Answer

Isolate what the four string forms have in common so that a single state can handle them all, and try to build an automaton with fewer states. If you don't mind depending on an external library, take a look at PLY (Python Lex-Yacc), which makes the job easier.
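
One way the single-state idea could look in PLY is sketched below. It is a minimal sketch, not a definitive implementation: the rule names (`t_open_quote`, `t_string_quote`, and so on) and the `quote` and `string_start` attributes stored on the lexer are made up for illustration. The opening rule remembers which delimiter started the string, and the state only closes when that same delimiter shows up again. The token value is the raw text between the delimiters (escapes are not decoded), and end of input inside a string is not handled.

```python
import ply.lex as lex

tokens = ('STRING',)

# One exclusive state covers all four string forms.
states = (
    ('string', 'exclusive'),
)

t_ignore = ' \t'        # INITIAL state: skip blanks between tokens
t_string_ignore = ''    # inside a string every character is significant

def t_newline(t):
    r'\n+'
    t.lexer.lineno += len(t.value)

def t_open_quote(t):
    r'\'{3}|\"{3}|\'|\"'
    # Triple quotes are listed first so they win over a single quote.
    t.lexer.quote = t.value                # delimiter that must close the string
    t.lexer.string_start = t.lexer.lexpos  # index just past the opening delimiter
    t.lexer.push_state('string')

def t_string_escape(t):
    r'\\[\s\S]'
    # A backslash protects the next character, including a quote or a newline.
    if t.value[1] == '\n':
        t.lexer.lineno += 1

def t_string_newline(t):
    r'\n'
    # Like Python, only triple-quoted strings may contain a bare newline.
    if len(t.lexer.quote) == 1:
        raise SyntaxError('unterminated string literal on line %d' % t.lexer.lineno)
    t.lexer.lineno += 1

def t_string_quote(t):
    r'[\'"]'
    lexer = t.lexer
    end = t.lexpos
    if not lexer.lexdata.startswith(lexer.quote, end):
        return None                        # some other quote: ordinary content
    t.type = 'STRING'
    t.value = lexer.lexdata[lexer.string_start:end]
    t.lexpos = lexer.string_start - len(lexer.quote)  # report at the opening delimiter
    lexer.lexpos = end + len(lexer.quote)  # consume the whole closing delimiter
    lexer.pop_state()
    return t

def t_string_text(t):
    r'[^\\\'"\n]+'
    # Plain content; nothing to do until a quote, backslash, or newline.

def t_error(t):
    print("Illegal character '%s'" % t.value[0])
    t.lexer.skip(1)

def t_string_error(t):
    print("Illegal character in string '%s'" % t.value[0])
    t.lexer.skip(1)

lexer = lex.lex()

# Usage: one single-quoted and one triple-quoted literal.
lexer.input("'it is' " + '"""a "quoted" word"""')
for tok in lexer:
    print(tok.type, repr(tok.value))
```

Because the closing rule compares against the remembered delimiter, a `"` inside a `'...'` string or a lone `"` inside a `"""..."""` string is treated as ordinary content, so four separate states are not needed.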

You will need the basics of lex and yacc, though. Sample code for a small calculator:

tokens = (
    'NAME','NUMBER',
    'PLUS','MINUS','TIMES','DIVIDE','EQUALS',
    'LPAREN','RPAREN',
    )

# Tokens

t_PLUS    = r'\+'
t_MINUS   = r'-'
t_TIMES   = r'\*'
t_DIVIDE  = r'/'
t_EQUALS  = r'='
t_LPAREN  = r'\('
t_RPAREN  = r'\)'
t_NAME    = r'[a-zA-Z_][a-zA-Z0-9_]*'

def t_NUMBER(t):
    r'\d+'
    try:
        t.value = int(t.value)
    except ValueError:
        print("Integer value too large: %s" % t.value)
        t.value = 0
    return t

# Ignored characters
t_ignore = " \t"

def t_newline(t):
    r'\n+'
    t.lexer.lineno += t.value.count("\n")

def t_error(t):
    print("Illegal character '%s'" % t.value[0])
    t.lexer.skip(1)

# Build the lexer
import ply.lex as lex
lex.lex()

# Parsing rules

precedence = (
    ('left','PLUS','MINUS'),
    ('left','TIMES','DIVIDE'),
    ('right','UMINUS'),
    )

# dictionary of names
names = { }

def p_statement_assign(t):
    'statement : NAME EQUALS expression'
    names[t[1]] = t[3]

def p_statement_expr(t):
    'statement : expression'
    print(t[1])

def p_expression_binop(t):
    '''expression : expression PLUS expression
                  | expression MINUS expression
                  | expression TIMES expression
                  | expression DIVIDE expression'''
    if t[2] == '+'  : t[0] = t[1] + t[3]
    elif t[2] == '-': t[0] = t[1] - t[3]
    elif t[2] == '*': t[0] = t[1] * t[3]
    elif t[2] == '/': t[0] = t[1] / t[3]

def p_expression_uminus(t):
    'expression : MINUS expression %prec UMINUS'
    t[0] = -t[2]

def p_expression_group(t):
    'expression : LPAREN expression RPAREN'
    t[0] = t[2]

def p_expression_number(t):
    'expression : NUMBER'
    t[0] = t[1]

def p_expression_name(t):
    'expression : NAME'
    try:
        t[0] = names[t[1]]
    except LookupError:
        print("Undefined name '%s'" % t[1])
        t[0] = 0

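def p_error(t):
    # PLY passes None when the error is at end of input
    if t is None:
        print("Syntax error at end of input")
    else:
        print("Syntax error at '%s'" % t.value)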

import ply.yacc as yacc
yacc.yacc()

while True:
    try:
        s = input('calc > ')   # Use raw_input on Python 2
    except EOFError:
        break
    yacc.parse(s)
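
Coming back to the four string forms in the question: they can also be matched without any lexer state, using one regular expression with four alternatives. The sketch below uses only the standard `re` module so it can be tried on its own; `STRING_RE` and `find_strings` are illustrative names. As an assumption that is not exercised here, the same pattern could be attached to a single PLY token rule (for example through the `TOKEN` decorator in `ply.lex`), with the rule adding `t.value.count('\n')` to `t.lexer.lineno` to keep line numbers right.

```python
import re

# Body of a triple-quoted string: anything, matched lazily, where a
# backslash always consumes the character after it.
TRIPLE_BODY = r"(?:[^\\]|\\[\s\S])*?"

STRING_RE = re.compile("|".join([
    r"'''" + TRIPLE_BODY + r"'''",       # '''...''' may span lines
    r'"""' + TRIPLE_BODY + r'"""',       # """...""" may span lines
    r"'(?:[^'\\\n]|\\[\s\S])*'",         # '...' stops at a bare newline
    r'"(?:[^"\\\n]|\\[\s\S])*"',         # "..." stops at a bare newline
]))

def find_strings(text):
    """Return every string literal in text, delimiters included."""
    return [m.group(0) for m in STRING_RE.finditer(text)]

if __name__ == "__main__":
    sample = "'it is' " + '"""a "quoted" word"""'
    for literal in find_strings(sample):
        print(literal)
```

The triple-quoted alternatives come first so that `'''` is never read as an empty `''` followed by a stray quote. The trade-off against the state-based lexer is that an unterminated string simply fails to match, so reporting a useful error takes extra work.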

IamSeekingAns answered Sep 21 '22 07:09