New Lines in Unix vs. Windows

I had a moment of enlightenment today while working with text files. In short: Unix ends each line with the "\n" escape sequence (line feed), but Windows expects "\r\n" (carriage return followed by line feed).
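
To make the difference concrete, here is a small sketch that prints the bytes behind each style of line break. The class and method names (NewlineBytes, ToHex) are made up for illustration; only BitConverter and Encoding come from the framework.

```csharp
using System;
using System.Text;

static class NewlineBytes
{
    // Renders each byte of the ASCII-encoded text as two hex digits.
    public static string ToHex(string text)
    {
        return BitConverter.ToString(Encoding.ASCII.GetBytes(text));
    }

    static void Main()
    {
        Console.WriteLine(ToHex("a\nb"));   // Unix style:    61-0A-62
        Console.WriteLine(ToHex("a\r\nb")); // Windows style: 61-0D-0A-62
    }
}
```

The extra 0D byte (carriage return) is the only difference between the two.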

The files were MIME messages that came into a Unix-based mail server. These files were then moved to a Windows machine for processing. Opening the files in an advanced text editor like Notepad++ revealed no difference, but opening them in Notepad showed little boxes in place of the new lines.

In MIME messages, new lines separate the individual header fields (Content-Type, To, From, Subject). My application could not find all the headers because it could not find all the line breaks.
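
As an illustration of why the line breaks matter, here is a rough sketch of header reading that accepts either kind of break. This is not the code from my application; HeaderSketch and ReadHeaders are hypothetical names, and folded (continuation) header lines are simply skipped to keep it short.

```csharp
using System;
using System.Collections.Generic;

static class HeaderSketch
{
    // Hypothetical helper: collects "Name: value" pairs up to the first blank line.
    public static Dictionary<string, string> ReadHeaders(string message)
    {
        var headers = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        // Listing "\r\n" before "\n" makes Split treat either one as a single break.
        string[] lines = message.Split(new[] { "\r\n", "\n" }, StringSplitOptions.None);
        foreach (string line in lines)
        {
            if (line.Length == 0) break;               // blank line ends the header block
            if (char.IsWhiteSpace(line[0])) continue;  // folded line, ignored in this sketch
            int colon = line.IndexOf(':');
            if (colon <= 0) continue;                  // not a "Name: value" line
            headers[line.Substring(0, colon).Trim()] = line.Substring(colon + 1).Trim();
        }
        return headers;
    }

    static void Main()
    {
        var headers = ReadHeaders("From: a@example.com\nSubject: Hi\r\n\r\nBody");
        Console.WriteLine(headers["Subject"]); // Hi
    }
}
```

A parser that only splits on "\r\n" would see the first two headers of that sample as one long line, which is the kind of failure I ran into.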

A simple fix for this is to replace "\n" with "\r\n" if the file is from a Unix system.

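            // Read the entire message into a string
            FileStream fs = File.OpenRead(txtInputPath.Text);
            StreamReader sr = new StreamReader(fs);
            string msgString = sr.ReadToEnd();
            // Rewind the stream so it can be read again from the start
            fs.Position = 0;

            // Windows needs \r\n but Unix formats docs with only \n.
            // Collapse any existing \r\n first so it does not turn into \r\r\n.
            msgString = msgString.Replace("\r\n", "\n").Replace("\n", "\r\n");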
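
If it is not known in advance whether a file came from a Unix system, one option is to check for a "\n" that has no "\r" in front of it before converting. The sketch below assumes the whole message fits in a string; NewlineFix, HasBareLineFeed and ToCrLf are illustrative names, not part of any library.

```csharp
using System;

static class NewlineFix
{
    // True if the text contains a "\n" that is not preceded by "\r".
    public static bool HasBareLineFeed(string text)
    {
        for (int i = 0; i < text.Length; i++)
        {
            if (text[i] == '\n' && (i == 0 || text[i - 1] != '\r')) return true;
        }
        return false;
    }

    // Converts every line break to "\r\n" without doubling existing "\r\n" pairs.
    public static string ToCrLf(string text)
    {
        return text.Replace("\r\n", "\n").Replace("\n", "\r\n");
    }

    static void Main()
    {
        string mixed = "To: a\nFrom: b\r\nSubject: c\n";
        Console.WriteLine(HasBareLineFeed(mixed));                                // True
        Console.WriteLine(ToCrLf(mixed) == "To: a\r\nFrom: b\r\nSubject: c\r\n"); // True
    }
}
```

Running ToCrLf on text that already uses "\r\n" everywhere leaves it unchanged, so it should be safe to apply to files from either system.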

For a thorough explanation and history, see "The Absolute Minimum Every Software Developer Absolutely Positively Must Know About Unicode and Character Sets" by Joel Spolsky.

For a simple program that will do the conversion, see ToFroDos.

About

Martin is a .NET programmer in Western Pennsylvania.