
Parse URLs using C-Strings in C++

Tags: c++, string

I'm learning C++ for one of my CS classes, and for our first project I need to parse some URLs using C strings (i.e., I can't use the C++ std::string class).

The only approach I can think of is iterating through the string (since it's a char[]) and using some switch statements. For those more experienced in C++: is there a better approach? Could you point me to a good online resource? I haven't found one yet.

Jarsen asked Apr 09 '26 11:04



2 Answers

Weird that you're not allowed to use C++ standard library features such as std::string!

There are some C string functions available in the standard C library.

e.g.

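strdup - duplicate a string into newly allocated memory. Note that this is a POSIX function and not part of the C++ standard library; the copy must be released with free().
strtok - break a string into tokens. Beware: this modifies the original string.
strcpy - copy a string
strstr - find a string within a string
strncpy - copy up to n bytes of a string. Beware: the result is not null-terminated if the source is n bytes or longer.
etc.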
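To make the strncpy caveat concrete, here is a minimal sketch of copying the first few characters of a C string into a fixed-size buffer. The helper name copy_prefix and its parameters are made up for illustration; they are not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not a library function): copy at most n characters
// of src into dst, which has room for dst_size bytes, and always leave dst
// null-terminated. Returns the number of characters actually stored.
std::size_t copy_prefix(char* dst, std::size_t dst_size,
                        const char* src, std::size_t n) {
    assert(dst != nullptr && src != nullptr);
    if (dst_size == 0) {
        return 0;  // no room even for the terminator
    }
    std::size_t limit = dst_size - 1;  // reserve one byte for '\0'
    if (n > limit) {
        n = limit;
    }
    std::strncpy(dst, src, n);  // may leave dst unterminated on its own
    dst[n] = '\0';              // so terminate explicitly
    return std::strlen(dst);
}
```

For example, calling copy_prefix(buf, sizeof buf, url, 4) with an 8-byte buf and a URL that starts with "http://" should leave "http" in buf.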

There is a good online reference with a full list of the available C string functions for copying, searching, and finding things:

http://www.cplusplus.com/reference/clibrary/cstring/

You can walk through strings by accessing them like an array if you need to.

e.g.

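#include <cstring>
#include <iostream>

int main() {
  // A string literal is read-only, so point to it with const char*.
  const char* url = "http://stackoverflow.com/questions/1370870/c-strings-in-c";
  std::size_t len = std::strlen(url);
  for (std::size_t i = 0; i < len; ++i) {
    std::cout << url[i];  // index the C string like an array
  }
  std::cout << std::endl;
}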

As for how to actually do the parsing, you'll have to work that out on your own. It is an assignment, after all.

hookenz answered Apr 11 '26 01:04



There are a number of C standard library functions that can help you.

First, look at the C standard library function strtok. This allows you to retrieve parts of a C string separated by certain delimiters. For example, you could tokenize with the delimiter / to get the protocol, the domain, and then each segment of the file path. You could tokenize the domain with the delimiter . to get the subdomain(s), second-level domain, and top-level domain, and so on. Keep in mind that strtok writes null terminators into the string it is given, so run it on a modifiable copy.
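As a rough sketch of that tokenizing loop (the helper name split_tokens and its parameters are invented here for illustration), the usual pattern is one call with the buffer followed by repeated calls with a null pointer:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not a library function): split buffer in place on
// any character in delims and store pointers to the pieces in tokens[].
// Returns the number of tokens stored (at most max_tokens).
// The buffer must be writable, because strtok overwrites delimiters with '\0'.
std::size_t split_tokens(char* buffer, const char* delims,
                         char* tokens[], std::size_t max_tokens) {
    assert(buffer != nullptr && delims != nullptr && tokens != nullptr);
    std::size_t count = 0;
    char* tok = std::strtok(buffer, delims);  // first call: pass the buffer
    while (tok != nullptr && count < max_tokens) {
        tokens[count++] = tok;
        tok = std::strtok(nullptr, delims);   // later calls: pass nullptr
    }
    return count;
}
```

Given a writable copy such as char url[] = "http://stackoverflow.com/questions/1370870/c-strings-in-c"; and the delimiter "/", this should yield the five tokens "http:", "stackoverflow.com", "questions", "1370870", and "c-strings-in-c". Note that strtok skips the empty piece between the two slashes and leaves the colon on the first token, so some extra cleanup is still needed.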

It's not nearly as powerful as a regular expression parser, which is what you would really want for parsing URLs, but it works on C strings, is part of the C standard library and is probably OK to use in your assignment.

Other C standard library functions that may help:

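  • strstr() Finds the first occurrence of a substring, similar to std::string::find(). It returns a pointer into the original string rather than a copy, so extracting the piece still requires copying it out (e.g. with strncpy()).
  • strchr() and strpbrk() Find a character, or any one of a set of characters, in a string, similar to std::string::find() and std::string::find_first_of(). strspn() returns the length of the leading run of characters drawn from a given set, similar to std::string::find_first_not_of().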
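Unlike strtok, these search functions leave the input untouched. Here is a minimal sketch that uses strstr and strpbrk to locate the host part of a URL without modifying it. The helper name find_host is hypothetical, and the sketch assumes a simple scheme://host/path layout with no user-info section.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not a library function): locate the host in a URL of
// the assumed form scheme://host[:port][/path]. Returns a pointer to the
// first host character inside url and writes the host length to *host_len,
// or returns nullptr (leaving *host_len untouched) if "://" is not found.
const char* find_host(const char* url, std::size_t* host_len) {
    assert(url != nullptr && host_len != nullptr);
    const char* sep = std::strstr(url, "://");  // end of the scheme
    if (sep == nullptr) {
        return nullptr;
    }
    const char* host = sep + 3;                 // skip past "://"
    // The host ends at the first ':', '/', '?' or '#', or at end of string.
    const char* end = std::strpbrk(host, ":/?#");
    *host_len = (end != nullptr) ? static_cast<std::size_t>(end - host)
                                 : std::strlen(host);
    return host;
}
```

The returned pointer is not null-terminated at the end of the host, so the caller would copy host_len characters into its own buffer before treating the host as a separate C string.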

Edit: A reminder that the proper way to use these functions in C++ is to include <cstring> and use them in the std:: namespace, e.g. std::strtok().

Tyler McHenry answered Apr 11 '26 00:04



