 

(array & string) Difference in Java vs. C [closed]

Tags:

java

c

I know C and am getting into Java, and I am confused about its approach to arrays and strings. It's totally different from arrays and strings in C. Please help me understand what the actual difference is between C and Java (for strings and arrays).

Subhransu Mishra asked Sep 27 '10 09:09




2 Answers

In C

Arrays

Arrays in C are simply syntactic sugar to access contiguous memory spaces, or - vulgarizing it shamelessly here - a variant of a pointer notation. To avoid allocating big chunks of contiguous memory, and to avoid reallocating memory yourself when manipulating data of variable size, you then resort to implementations of common computer science data structures (for instance, a linked list, which uses a pointer to indicate the memory address of the next element in a series).

You can substitute pointer arithmetic with array notations in C, and vice versa.

The following will print both elements of an array, each one using two different access methods:

#include <stdio.h>

int main(int ac, char **av) {
  char arr[2] = {'a', 'b'};

  printf("0:%c 0:%c 1:%c 1:%c\n", arr[0], *arr, arr[1], *(arr + 1));
  return (0);
}

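The same works with int variables. No adjustment is needed for the size of an integer: pointer arithmetic is scaled by the size of the pointed-to type, so arr + 1 already points to the second int (arr + 4 would read past the end of this array):

#include <stdio.h>

int main(int ac, char **av) {
  int arr[2] = {42, -42};

  printf("0:%d 0:%d 1:%d 1:%d\n", arr[0], *arr, arr[1], *(arr + 1));
  return (0);
}

(To obtain the size in bytes of a given data type, use sizeof. You need it when working with raw bytes, for instance when calling malloc(), not for pointer arithmetic.)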

Strings

Here I assume you want to know about the conventional C-string implementation, and not one provided by a 3rd-party library.

Strings in C are basically just arrays of characters. The main reason for this is obvious: as you often need to manipulate strings and print them to a stream, using a contiguous memory space makes sense and is an easy implementation. However, as you need to know where that contiguous memory space ends to avoid inadvertently accessing something forbidden, we rely on the concept of a "NUL-terminated string" (often written "null-terminated"), meaning a string of N characters is actually an array of N + 1 characters terminated by a trailing '\0' character, which is the de facto character to look for when you want to reach the end of a string.

A straightforward declaration would be:

char *test = "my test";

which holds the same characters as:

char test[8] = { 'm', 'y', ' ', 't', 'e', 's', 't', '\0' };

(Notice the trailing '\0')

The two are not equivalent, though. The array form gives you your own modifiable copy of the characters. With the pointer form, the string literal "my test" lives in static (often read-only) storage, and that's the memory you are directly pointing to. Which means you will encounter issues when trying to modify it through the pointer.

For instance, this would blow up in your face (following the previous char *test declaration), because modifying a string literal is undefined behavior:

test[4] = 'H'; /* expect a violent complaint here */

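So to have a string you can actually modify, you can create one simply as:

#include <stdio.h>
#include <stdlib.h> /* free() */
#include <string.h> /* strdup() */

int main(int ac, char **av) {
  char *test = strdup("my test"); /* heap-allocated copy of the literal */

  if (test == NULL) /* allocation can fail */
    return (1);
  test[3] = 'T'; /* fine: this memory belongs to us */
  printf("%s\n", test);
  free(test); /* release the copy when done */
  return (0);
}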

Where strdup() is a POSIX function (also added to the C standard in C23) that allocates memory for your string and copies the characters into it; you release that memory with free() when you are done. Or you can allocate memory yourself with malloc() and copy characters manually or with a function like strcpy().

This particular string is thus mutable, and you are free to modify its content (in the end it is just a dynamically allocated array of characters, allocated with malloc()).

If you need to change the length of this string (add/remove characters to/from it), you will need to be wary of the allocated memory every time. For instance, calling strcat() without first reallocating enough additional memory writes past the end of the buffer, which is undefined behavior (it may crash or silently corrupt memory). Some functions, however, will take care of this for you.

The C string does NOT support Unicode by default. You need to manage code points and encodings yourself, or consider using a 3rd-party library.


In Java

Arrays

Arrays in Java are very close to their C parent (to the point that we even have a method for efficient array-to-array-copy support using a bare-bone native implementation: System.arraycopy()). They represent contiguous memory spaces.

However, they wrap these bare-bone arrays within an object (which keeps track of the size/length of the array for you).

Java arrays can have their content modified but, like their C counterpart, they have a fixed length: to expand one you allocate a new, larger array and copy the elements over (you do it indirectly, replacing the complete array instead of calling realloc() like in C).
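
As a rough illustration of the points above, here is a minimal sketch. The class name ArrayGrowthSketch and the helper append() are made up for this example; Arrays.copyOf() and System.arraycopy() are the standard library calls.

```java
import java.util.Arrays;

public class ArrayGrowthSketch {
    // Hypothetical helper: returns a NEW array holding the old contents plus one value.
    static int[] append(int[] source, int value) {
        int[] grown = Arrays.copyOf(source, source.length + 1); // allocates and copies
        grown[source.length] = value;
        return grown;
    }

    public static void main(String[] args) {
        int[] numbers = {42, -42};
        System.out.println(numbers.length);        // 2: the array object tracks its length

        int[] more = append(numbers, 7);
        System.out.println(Arrays.toString(more)); // [42, -42, 7]
        System.out.println(numbers.length);        // 2: the original array is unchanged

        // System.arraycopy copies a range between two already allocated arrays.
        int[] target = new int[4];
        System.arraycopy(more, 0, target, 1, 3);
        System.out.println(Arrays.toString(target)); // [0, 42, -42, 7]

        try {
            numbers[2] = 0; // out of bounds: rejected at run time instead of corrupting memory
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("index rejected");
        }
    }
}
```

In practice you would usually reach for a collection such as ArrayList, which does this kind of reallocation for you.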

Strings

Strings in Java are immutable, meaning they cannot be changed, once initialized, and operations on String actually create new String instances. Look up StringBuilder and StringBuffer for efficient string manipulation with an existing instance, and beware of their internal implementation details (especially when it comes to pre-setting the capacity of your buffer efficiently, to avoid frequent re-allocations).

For instance, the following code produces a 3rd String instance out of someString and "another string":

String myNewStr = someString + "another string";
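
To make the contrast a bit more concrete, here is a small sketch of an immutable String next to a mutable StringBuilder. The class name and the join() helper are invented for illustration; the presized capacity is only one possible way to limit reallocations.

```java
public class StringImmutabilitySketch {
    // Hypothetical helper: joins words using a single StringBuilder
    // instead of creating a new String for every concatenation.
    static String join(String[] words, char separator) {
        int capacity = 0;
        for (String word : words) {
            capacity += word.length() + 1; // rough upper bound, avoids regrowing the buffer
        }
        StringBuilder sb = new StringBuilder(capacity);
        for (int i = 0; i < words.length; i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(words[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String someString = "my test";
        String upper = someString.toUpperCase(); // returns a new String
        System.out.println(someString);          // my test (the original is untouched)
        System.out.println(upper);               // MY TEST

        StringBuilder sb = new StringBuilder(someString);
        sb.setCharAt(3, 'T');                    // modifies the builder's buffer in place
        System.out.println(sb);                  // my Test

        System.out.println(join(new String[] {"a", "b", "c"}, '-')); // a-b-c
    }
}
```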

In the underlying implementation, the Java String* classes also use an array of characters, like their C parent.

This implies that they use more memory than the bare-bone C implementation, as you have the overhead of your instance.

Not only that, they can use a lot more memory because the Java String class provides Unicode support by default: a Java char is a 16-bit UTF-16 code unit, and a code point outside the Basic Multilingual Plane takes two of them (handling this is not a trivial thing to do in C, in comparison).

On the other hand, notice that unless performance is a concern, you don't need to worry about threading, memory management, or implementing functions that look for a trailing '\0' character.
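
For the Unicode point, a short sketch may help show the difference between counting chars and counting code points. The class and the codePoints() helper are names made up here; length(), codePointCount() and codePointAt() are standard String methods.

```java
public class CodePointSketch {
    // Hypothetical helper: number of Unicode code points in the whole string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String ascii = "test";
        // U+1F600, written as a UTF-16 surrogate pair so the source file stays ASCII.
        String emoji = "\uD83D\uDE00";

        System.out.println(ascii.length() + " " + codePoints(ascii)); // 4 4
        System.out.println(emoji.length() + " " + codePoints(emoji)); // 2 1
        System.out.println(Integer.toHexString(emoji.codePointAt(0))); // 1f600
    }
}
```

So length() counts 16-bit char units, not user-visible characters.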


What More?

A lot more could be said and researched. Your question is fairly broad at the moment, but I'll be glad to edit if you add sub-questions in your comments.

Also, maybe this could help:

  • Java Language Specification, 3rd Ed., Chap 10 - Arrays
  • Details of the JVM compilation for arrays
  • Java and C differences, with details for both strings and arrays.
  • The C-FAQ contains some details on the similarities and differences between arrays and pointers.
haylem answered Sep 24 '22 17:09


In C, a string is typically just an array of (or a pointer to) chars, terminated with a NUL (\0) character. You can process a string as you would process any array.

In Java, however, strings are not arrays. Java strings are instances (objects) of the java.lang.String class. They represent character data, but the internal implementation is not exposed to the programmer. You cannot treat them as arrays, although, if required, you can extract string data as an array of bytes or chars (methods getBytes and getChars). Note also that Java chars are always 16 bits, while chars in C are typically (not always) 8 bits.
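
A minimal sketch of what that looks like in practice, assuming nothing beyond the standard library (the class name and the withCharAt() helper are invented for this example; toCharArray() is used here as a shorter alternative to getChars()):

```java
import java.nio.charset.StandardCharsets;

public class StringAccessSketch {
    // Hypothetical helper: builds a new String with one character replaced,
    // by working on an extracted char[] copy.
    static String withCharAt(String s, int index, char c) {
        char[] chars = s.toCharArray(); // a copy, not the String's internal storage
        chars[index] = c;
        return new String(chars);
    }

    public static void main(String[] args) {
        String s = "my test";

        System.out.println(s.charAt(3));              // t (read access goes through a method)
        // s[3] = 'T';                                // would not compile: a String is not an array

        System.out.println(withCharAt(s, 3, 'T'));    // my Test
        System.out.println(s);                        // my test (original unchanged)

        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        System.out.println(utf8.length);              // 7 (one byte per ASCII character)
        System.out.println(Character.SIZE);           // 16 (bits in a Java char)
    }
}
```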

Grodriguez answered Sep 25 '22 17:09