
How many bytes is \n\r?

I have a .NET app that is trying to FTP a file, but I'm ending up with 1 extra byte per line. My line separator is Environment.NewLine, which I believe translates into \n\r. How many bytes is that?

donde asked May 26 '10 18:05



2 Answers

I know this is an old question, but for the sake of future readers: you can determine how many bytes a given string occupies in a particular encoding with the following:

Encoding.UTF8.GetByteCount("SomeString");

In this case:

Encoding.Unicode.GetByteCount(Environment.NewLine);
// OR, spelling out the Windows value of Environment.NewLine (CR then LF)
Encoding.Unicode.GetByteCount("\r\n");

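.NET strings are UTF-16 (Encoding.Unicode) in memory, but the bytes written to a file or stream depend on the encoding the writer uses; for example, StreamWriter defaults to UTF-8, and with an XmlSerializer you can specify the encoding.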
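
To make the point concrete, here is a minimal sketch that counts the bytes of a CR-LF pair in several encodings. The class name NewlineBytes and the helper Count are made up for this illustration; only Encoding.GetByteCount and Environment.NewLine come from .NET itself.

```csharp
using System;
using System.Text;

// Hypothetical helper class, named for this example only.
public static class NewlineBytes
{
    // Returns the number of bytes the text occupies in the given encoding
    // (no byte order mark is included in the count).
    public static int Count(string text, Encoding encoding) => encoding.GetByteCount(text);

    public static void Main()
    {
        const string crlf = "\r\n"; // CR (0x0D) followed by LF (0x0A)

        Console.WriteLine(Count(crlf, Encoding.ASCII));   // 2
        Console.WriteLine(Count(crlf, Encoding.UTF8));    // 2
        Console.WriteLine(Count(crlf, Encoding.Unicode)); // 4 (UTF-16)
        Console.WriteLine(Count(crlf, Encoding.UTF32));   // 8

        // Environment.NewLine is platform-dependent: "\r\n" on Windows,
        // "\n" on Unix-like systems, so this result varies by platform.
        Console.WriteLine(Count(Environment.NewLine, Encoding.UTF8));
    }
}
```

So if the file is written as ASCII or UTF-8, each CR-LF line ending is 2 bytes, and a 1-byte difference per line usually means one side is using a single-character line ending.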

Remember to count with the encoding you will actually use to write the data, since the number of bytes per character differs between encodings:

  • An ASCII character in 8-bit ASCII encoding is 8 bits (1 byte), though it can fit in 7 bits.
  • An ISO-8859-1 character in ISO-8859-1 encoding is 8 bits (1 byte).
  • A Unicode character in UTF-8 encoding is between 8 bits (1 byte) and 32 bits (4 bytes).
  • A Unicode character in UTF-16 encoding is either 16 bits (2 bytes) or 32 bits (4 bytes), though most of the common characters take 16 bits. This is the encoding used by Windows internally.
  • A Unicode character in UTF-32 encoding is always 32 bits (4 bytes).
  • An ASCII character in UTF-8 is 8 bits (1 byte), and in UTF-16 it is 16 bits (2 bytes).
  • The additional (non-ASCII) characters in ISO-8859-1 (0xA0-0xFF) would take 16 bits in UTF-8 and UTF-16.
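
The sizes in the list above can be checked with a short sketch like the one below. The class EncodingWidths and its Bytes helper are hypothetical names for this example; the sample characters are written as escapes so the result does not depend on how the source file itself is saved.

```csharp
using System;
using System.Text;

// Hypothetical helper class, named for this example only.
public static class EncodingWidths
{
    // Number of bytes needed to store the text in the given encoding.
    public static int Bytes(string text, Encoding encoding) => encoding.GetByteCount(text);

    public static void Main()
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        string a = "A";               // U+0041, ASCII
        string eAcute = "\u00E9";     // U+00E9, in the ISO-8859-1 upper range
        string euro = "\u20AC";       // U+20AC, outside ISO-8859-1
        string emoji = "\U0001F600";  // U+1F600, a surrogate pair in UTF-16

        Console.WriteLine(Bytes(a, Encoding.ASCII));        // 1
        Console.WriteLine(Bytes(eAcute, latin1));           // 1
        Console.WriteLine(Bytes(eAcute, Encoding.UTF8));    // 2
        Console.WriteLine(Bytes(eAcute, Encoding.Unicode)); // 2
        Console.WriteLine(Bytes(euro, Encoding.UTF8));      // 3
        Console.WriteLine(Bytes(emoji, Encoding.UTF8));     // 4
        Console.WriteLine(Bytes(emoji, Encoding.Unicode));  // 4
        Console.WriteLine(Bytes(emoji, Encoding.UTF32));    // 4
    }
}
```
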
Taco タコス answered Sep 30 '22 04:09


In ASCII encoding, \n is the Line Feed (newline) character 0x0A (decimal 10), and \r is the Carriage Return character 0x0D (decimal 13). Each is a single byte, so the pair takes 2 bytes in ASCII or UTF-8.

As Jack has said already, the correct sequence is CR-LF, not vice versa.

If you are transmitting the file as text, FTP is probably adding LF characters to your stream wherever the line endings are not in CR-LF order.
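
One way to check what the file actually contains before it is sent is to count the line-ending bytes directly. This is only a sketch for diagnosis: the class LineEndingCheck and its Count method are invented for this example, and it assumes a single-byte or UTF-8 encoded buffer, where CR and LF are the single bytes 0x0D and 0x0A.

```csharp
using System;
using System.Text;

// Hypothetical diagnostic helper, named for this example only.
public static class LineEndingCheck
{
    // Counts CR-LF pairs, LF bytes without a preceding CR,
    // and CR bytes without a following LF.
    public static (int CrLf, int BareLf, int BareCr) Count(byte[] data)
    {
        int crlf = 0, bareLf = 0, bareCr = 0;
        for (int i = 0; i < data.Length; i++)
        {
            if (data[i] == 0x0D) // CR
            {
                if (i + 1 < data.Length && data[i + 1] == 0x0A)
                {
                    crlf++;
                    i++; // skip the LF that belongs to this pair
                }
                else
                {
                    bareCr++;
                }
            }
            else if (data[i] == 0x0A) // LF not consumed as part of a CR-LF pair
            {
                bareLf++;
            }
        }
        return (crlf, bareLf, bareCr);
    }

    public static void Main()
    {
        // Sample with one CR-LF, one bare LF, and one reversed LF-CR sequence.
        byte[] sample = Encoding.ASCII.GetBytes("one\r\ntwo\nthree\n\rfour");
        var counts = Count(sample);
        // Prints: CRLF=1, bare LF=2, bare CR=1
        Console.WriteLine($"CRLF={counts.CrLf}, bare LF={counts.BareLf}, bare CR={counts.BareCr}");
    }
}
```

A non-zero count of bare LF or bare CR bytes suggests the line endings are not consistently CR-LF, which is the kind of input a text-mode transfer may rewrite.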

cdonner answered Sep 30 '22 05:09