
C# advanced String.Split

Tags: arrays, c#, split

I have a string similar to this one:

The boy said to his mother, "Can I have some candy?"

If I do a normal String.Split on it, I get:

{ 'The', 'boy', 'said', 'to', 'his', 'mother', '"Can', 'I', 'have', 'some', 'candy?"' }

I want an array like so:

{ 'The', 'boy', 'said', 'to', 'his', 'mother', 'Can I have some candy?' }

Obviously, I could just loop through the string character by character and keep track of whether I'm inside a quoted string, but is there a better way? With regexes, perhaps?

Entity asked Jun 04 '11 22:06


2 Answers

How about finding all the matches of this regex:

"[^"]*"|\S+
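
One way this regex might be applied in C# is sketched below. The class and method names (`QuoteAwareSplitter`, `Split`) are hypothetical, and the trimming step is an assumption: the pattern itself keeps the surrounding quotes on a quoted match and leaves the trailing comma on `mother,`, so both are stripped afterwards to get the array the question asks for.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class QuoteAwareSplitter
{
    // Hypothetical helper; the pattern is the one suggested above.
    public static List<string> Split(string input)
    {
        var tokens = new List<string>();
        // Try a quoted run first ("[^"]*"), otherwise take a run of non-whitespace (\S+).
        foreach (Match m in Regex.Matches(input, "\"[^\"]*\"|\\S+"))
        {
            // Assumption: surrounding quotes and a trailing comma are not wanted in the result.
            tokens.Add(m.Value.Trim('"').TrimEnd(','));
        }
        return tokens;
    }
}
```

Calling `QuoteAwareSplitter.Split` on the sentence from the question should give seven tokens, the last being `Can I have some candy?`. Note that an input such as `AAA"BBB CCC"` is matched by `\S+` first, so the quote in the middle of a word does not start a quoted section.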
MRAB answered Oct 09 '22 01:10


That depends a bit on your requirements. For example, do you need to treat AAA"BBB (no spaces) as a single word or as two words? If AAA"BBB is a single word, and " only starts a quoted field after a delimiter, then this looks like a job for a CSV parser. Of course, CSV has other rules, such as doubled quotes meaning a literal quote, but you would need to define some similar rules too.

So you can adapt any open source CSV parser, or see whether e.g. Microsoft.VisualBasic.FileIO.TextFieldParser works for you:

        using System;
        using System.IO;
        using System.Linq;
        using Microsoft.VisualBasic.FileIO;

        string msg = "The boy said to his mother, \"Can I have some candy?\"";
        // HasFieldsEnclosedInQuotes defaults to true, so quoted text is read as one field.
        using var p = new TextFieldParser(new StringReader(msg));
        p.Delimiters = new string[] { " ", "," };
        foreach (var f in p.ReadFields().Where(f => f != ""))
            Console.WriteLine(f);
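
If neither a CSV library nor a regex fits, the rules discussed above can also be written out by hand. The following is only a sketch under assumed rules, not a full CSV parser: space and comma are delimiters, a quote opens a quoted field only at the start of a field, a doubled quote inside a quoted field is a literal quote, and empty fields are dropped. `FieldTokenizer` and `Tokenize` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Text;

public static class FieldTokenizer
{
    // Hypothetical helper implementing the assumed rules described above.
    public static List<string> Tokenize(string input)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (inQuotes)
            {
                if (c != '"')
                {
                    current.Append(c);
                }
                else if (i + 1 < input.Length && input[i + 1] == '"')
                {
                    current.Append('"'); // doubled quote: keep one, skip the other
                    i++;
                }
                else
                {
                    inQuotes = false; // closing quote
                }
            }
            else if (c == '"' && current.Length == 0)
            {
                inQuotes = true; // a quote only opens a field at the start of that field
            }
            else if (c == ' ' || c == ',')
            {
                // Delimiter: emit the field collected so far, skipping empty ones.
                if (current.Length > 0)
                    fields.Add(current.ToString());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        if (current.Length > 0)
            fields.Add(current.ToString());
        return fields;
    }
}
```

Under these rules `AAA"BBB` stays a single field, because the quote does not appear at the start of a field.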
Michael Entin answered Oct 09 '22 01:10