Token string cuts off when inserting into 2d array in C

I can correctly tokenize single words from each line of a file; however, inserting them into a 2D array cuts off part of each token. I also have a problem with NULL: the code ends in a segfault.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>   // strtok

#define MAX_FILE_LENGTH 30
#define MAX_COURSE_LENGTH 30
#define MAX_LINE_LENGTH 1000

void trim(char* str) {
  int l = strlen(str);
  if (str[l - 1] == '\n') {
    str[l - 1] = 0;
  }
}

int main() {
  char filename[MAX_FILE_LENGTH]; 
  char arr[MAX_COURSE_LENGTH][MAX_COURSE_LENGTH];
  const char delim[] = " ";
  char* token;
  int course = 0;
  char c;
  FILE* fp;
  int N = 0;    // number of lines in file

  printf("This program will read, from a file, a list of courses and their prerequisites and will print the list in which to take courses.\n");
  printf("Enter filename: ");
  scanf("%s%c", filename, &c);

  fp = fopen(filename, "r");

  if (fp == NULL) {
    printf("Could not open file %s. Exit\n", filename);
    printf("\nFailed to read from file. Program will terminate.\n");
    return -1;
  }

  while (!feof(fp) && !ferror(fp)) {
    int i = 0;
    if (fgets(arr[N], MAX_LINE_LENGTH, fp)) {
      trim(arr[N]);
      printf("Full line: |%s|\n", arr[N]);
      token = strtok(arr[N], delim);
      arr[N][i] = *token;
      printf("N = %d, i = %d, token = %s arr[%d][%d]: %s\n", N, i, token, N, i, &arr[N][i]);
      while (token != NULL) {
        i++;
        token = strtok(NULL, " \n");
        printf("token at arr[%d][%i]: %s value at arr[%d][%d]: %s\n", N, i, token, N, i, &arr[N][i]);
        arr[N][i] = *token;
        printf("N = %d, i = %d, token = %s arr[%d][%d]: %s\n", N, i, token, N, i, &arr[N][i]);
      }
      N++;
    }
  }
  fclose(fp);
  return 0;
}

The output I'm getting reads:

Full line: |c100 c200|
N = 0, i = 0, token = c100 arr[0][0]: c100
token at arr[0][1]: c200 value at arr[0][1]: 100
N = 0, i = 1, token = c200 arr[0][1]: c00
token at arr[0][2]: (null) value at arr[0][2]: 00
zsh: segmentation fault  ./a.out

My file is a list of courses and I am to build an adjacency matrix with the list of prerequisite courses.

c100 c200
c300 c200 c100
c200 c100

I tried to reset each index to NULL or '\0' before inserting the tokens, but the same result occurred. Inserting the first word in the [N][0]th index of the inner array works, but there is something I'm missing when inserting into other indexes of the inner array.

asked May 02 '26 13:05 by Januar Soepangat


1 Answer

Much worse than the issues raised in my comments, which could lead to undefined behavior (UB), is the UB you always get by using %s to print a single character.

The assignment

arr[N][i] = *token;

assigns the single character in token[0] to the single character element arr[N][i].

You then use %s to print &arr[N][i], but the %s format expects a null-terminated string which you don't have.

To have a string you need an array of characters, and what you maybe should do is something like (but I'm just guessing):

char arr[MAX_COURSE_LENGTH][MAX_COURSE_LENGTH][MAX_COURSE_LENGTH];

and

strcpy(arr[N][i], token);

And as mentioned by Allan Wind, you're forgetting that strtok can (and will) return a NULL pointer.

Also note that all upper-case names are usually used for macros and symbolic constants. That makes it kind of confusing to see the name N being used as an index variable.

answered May 04 '26 02:05 by Some programmer dude


