
End the end-of-line hassle

Ways to deal with different OS treatment of the end of a text line

This tip is brought to you by the letters represented by the hex codes 0x0a, 0x0c and 0x0d, not to mention the occasional combination of them in an attempt to mark the end of a line in a file. (The combining part can be blamed on Bill Gates for creating MS-DOS, though the convention is older than that and goes back to CP/M and teletypes, and I promise no more Sesame Street references.)

Our company uses Cobalt servers for Web hosting. About one third of the domains hosted do their own Web design. They design on various flavors of Windows or Mac, with a few Linux geeks thrown in to raise the bar. You can probably guess that there are various levels of experience out there.

Luckily, HTTP servers and Web browsers don't really care what you choose to use for end-of-line characters in HTML pages; the browser collapses each run of white space to a single space, then decides how to render it.

The problems occur when users upload CGI programs (and server side includes) from "other" operating systems.
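
A common symptom is a script whose first line now ends in a carriage return, so the system looks for an interpreter named something like "perl^M" and fails. The sketch below is one rough way to test for that case. The function name shebang_has_cr is made up for illustration, and the check assumes the script's first line is terminated by a newline.

```shell
# Hypothetical helper (not from the original tip): succeed if the first
# line of a script is a "#!" line that ends in a carriage return.
shebang_has_cr() {
    cr=$(printf '\r')          # a literal carriage return
    first=$(head -n 1 "$1")    # $(...) strips the newline, not the CR
    case $first in
        '#!'*"$cr") return 0 ;;
        *)          return 1 ;;
    esac
}

# Possible use, with a made-up file name:
#   shebang_has_cr upload.cgi && echo "upload.cgi: CR on the #! line"
```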

A quick background break: MS-DOS files are known for using "carriage return/newline" combinations; UNIX uses just a newline (often called linefeed); some other operating systems (the classic Mac OS, for example) use just a carriage return. Many applications are fairly smart about this and will properly display any version, which can hide the problem when troubleshooting. My suggestion is to revert to good old vi to see exactly what is in a file.
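
The three conventions can be told apart, roughly, by counting carriage returns and newlines. The following is a minimal sketch under that assumption. The name eol_style and its output labels are invented here, and a file with equal numbers of both characters is simply assumed to be DOS-style.

```shell
# Hypothetical helper (not from the original tip): guess a file's
# line-ending convention by counting carriage returns (CR) and newlines (LF).
eol_style() {
    cr=$(tr -cd '\r' < "$1" | wc -c)
    lf=$(tr -cd '\n' < "$1" | wc -c)
    if [ "$cr" -eq 0 ] && [ "$lf" -eq 0 ]; then
        echo "none"
    elif [ "$cr" -eq 0 ]; then
        echo "unix"     # newlines only
    elif [ "$lf" -eq 0 ]; then
        echo "mac"      # carriage returns only
    elif [ "$cr" -eq "$lf" ]; then
        echo "dos"      # heuristic: assumes every CR is paired with an LF
    else
        echo "mixed"
    fi
}
```

A result of "mixed" usually means the file has passed through more than one editor or transfer tool.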

For example, I used the following Perl command to create a file named test-eol:

    perl -e 'print "a\012\n b\013\n c\014\n d\015\ne"' > test-eol

The purpose was to create a file with a, b, c and d on four lines, each letter followed by one of the newline, vertical tab, form feed and carriage return characters (in that order) and then a newline. A final e, with no line terminator, marks the end of the file.

Here is a list of these characters:

    decimal  octal  hex   name
       10     012   0x0a  newline (or linefeed)
       11     013   0x0b  vertical tab
       12     014   0x0c  form feed
       13     015   0x0d  carriage return

When this file is displayed with vi (or "cat -ve"), the vertical tab, form feed and carriage return show up as ^K, ^L and ^M.

Any of ^K, ^L or ^M indicates a possible problem with interpretation. Note that "more" only showed the form feed (and that your mileage may vary).
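
Where vi is not handy, od gives a similar byte-level view, and grep can count the affected lines. Both helpers below are illustrative sketches with made-up names (show_bytes and cr_line_count), not commands from the original tip.

```shell
# Hypothetical helper: dump a file (or stdin) byte by byte, printing
# control characters as escapes such as \n, \v, \f and \r.
show_bytes() {
    od -An -c "$@"
}

# Hypothetical helper: count the lines that end in a carriage return.
cr_line_count() {
    cr=$(printf '\r')                 # a literal carriage return
    grep -c "${cr}\$" "$1" || true    # grep exits 1 when the count is 0
}

# Possible use on the test file created earlier:
#   show_bytes test-eol
#   cr_line_count test-eol
```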

When a non-Linux (or UNIX) application is allowed to touch a script (or text) file without promising to return it to its original line feed characters, you often have a problem.

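Teaching users to use the ASCII mode of FTP can help, but this is not a guarantee (in the command-line client, the "cr" toggle controls whether carriage returns are stripped from ASCII-mode downloads, while ntrans only translates characters in file names; in any case, few of our users actually use the FTP command directly).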

When you have users uploading from applications like GoLive or CuteFTP, who knows what will happen.

So, to fix these files, either teach users to configure and use FTP, or provide commands or tools to convert "whatever" to single newlines.

Something like this should do it:

    perl -i.bak -pe 's/\r\n?/\n/g; s/\f+$//mg;' filenames-here
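
Where Perl is not available, roughly the same normalization can be sketched with awk. The filter name to_unix is made up for illustration. Unlike the Perl command, this version only handles carriage returns and leaves form feeds alone.

```shell
# Hypothetical filter (not from the original tip): read text from the
# named files or stdin and write it with UNIX newlines. A CR LF pair and
# a lone CR each become a single LF.
to_unix() {
    awk '{
        sub(/\r$/, "")        # DOS: drop the CR that precedes the newline
        gsub(/\r/, "\n")      # Mac: any remaining CR was a line ending
        print                 # print always ends the record with a newline
    }' "$@"
}

# Possible use, keeping a backup the way -i.bak does (made-up file name):
#   cp -p script.cgi script.cgi.bak && to_unix < script.cgi.bak > script.cgi
```

Writing back into the existing file, as in the usage comment, keeps its permissions, which matters for CGI scripts that need the execute bit.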

Good luck, and happy end-of-line character hunting!

Fred Mallett is founder of FAME Computer Education, which provides standup delivery of educational classes on a variety of UNIX and Win32 related subjects.
