
Checking a character to be a newline

Tags: c, char

How to check whether a character is a newline character in any encoding in C?

I have a task to write my own wc program. If I just use if (s[i] == '\n'), it gives a different answer than the original wc when I run it on its own executable.
Here is the code:

typedef struct
{
    int newline;
    int word;
    int byte;
} info;

info count(int descr)
{
    info kol;
    kol.newline = 0;
    kol.word = 0;
    kol.byte = 0;

    int len = 512;
    char s[512];
    int n;

    errno = 0;
    int flag1 = 1;
    int flag2 = 1;
    while(n = read(descr, s, len))
    {
        if(n == -1)
            error("Error while reading.", errno);

        errno = 0; 

        kol.byte+=n;
        for(int i=0; i<n; i++)
        {
            if(flag1)
            {
                kol.newline++;
                flag1 = 0;
            }

            if(isblank(s[i]) || s[i] == '\n')
                flag2 = 1;
            else
            {
                if(flag2)
                {
                    kol.word++;
                    flag2 = 0;
                }
            }
            if(s[i] == '\n')
                flag1 = 1;
        }
    }
    return kol;
}  

It works fine for all text files, but when I run it on the executable I get from compiling it, it doesn't give the answer wc gives.

asked Mar 31 '13 19:03 by Taygrim




1 Answer

The way to check whether a character s[i] is a newline character is simply:

if (s[i] == '\n')

If you're reading from a file that's been opened in text mode (including stdin), then whatever representation the underlying system uses to mark the end of a line will be translated to a single '\n' character.
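As a minimal sketch of that comparison on a stdio stream: the helper below counts '\n' characters read with fgetc. The name count_newlines is made up for illustration and does not appear in the question's code.

```c
#include <stdio.h>

/* Count '\n' characters on an already-open stream.
   count_newlines is a hypothetical helper name used only for this sketch. */
long count_newlines(FILE *fp)
{
    long lines = 0;
    int c;  /* int, not char, so EOF can be told apart from a valid byte */

    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n')
            lines++;
    }
    return lines;
}
```

Note that the code in the question calls read() on a file descriptor, which never translates line endings. On POSIX systems this makes no difference, because text mode and binary mode behave the same there.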

You say you're trying to write your own wc program, and by comparing to '\n' you're getting different results than the system's wc. You haven't told us enough to guess why that's happening. Show us your code and tell us exactly what's happening.

You might run into problems if you're reading a file that's encoded differently -- say, trying to read a Unix-format text file on a Windows system. But then wc would have the same problem.
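Looking at the code posted in the question, two things probably explain the difference on a binary file, although this is a guess and not a confirmed diagnosis. First, the code bumps the line counter on the first byte after each newline, so any input that does not end in '\n' gets one more line than wc reports; POSIX wc counts newline characters. Second, isblank only matches space and tab in the C locale, while wc treats any white space (including '\r', '\v' and '\f') as a word separator. Passing a negative char to isblank or isspace is also undefined behavior, which matters for bytes above 0x7F. The sketch below counts under those assumptions; wc_counts and wc_update are hypothetical names, and a real wc may differ further depending on the locale.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical counters; long rather than int to leave more room for large files. */
typedef struct {
    long lines;
    long words;
    long bytes;
} wc_counts;

/* Process one buffer. *in_word carries state across calls, so a word
   split over two read() buffers is counted only once. */
void wc_update(wc_counts *c, int *in_word, const char *buf, size_t n)
{
    c->bytes += (long)n;
    for (size_t i = 0; i < n; i++) {
        /* Convert to unsigned char before calling a <ctype.h> function. */
        unsigned char ch = (unsigned char)buf[i];

        if (ch == '\n')
            c->lines++;          /* count newline characters, not line starts */

        if (isspace(ch)) {
            *in_word = 0;        /* any white space ends the current word */
        } else if (!*in_word) {
            c->words++;          /* first non-space byte starts a new word */
            *in_word = 1;
        }
    }
}
```

To use it with the loop from the question, start with a zeroed wc_counts and in_word set to 0, then call wc_update(&counts, &in_word, s, n) after each successful read().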

answered Oct 13 '22 12:10 by Keith Thompson