Dropping the BOM

This one had me baffled for a while but it made me glad that I read up about character sets a while back…

The Problem
I was trying to use a header('Location: whatever.php') redirection and it had been working fine. I then moved part of the code (the database connection info) into an include file (settings.php). This broke my site in an interesting way. I started getting this message:

Warning: Cannot modify header information - headers already sent by (output started at C:\xampp\htdocs\settings.php:1) in C:\xampp\htdocs\section.php on line 8

About Header()
There are two parts to any response sent over HTTP: the header and the body. The header contains meta-information about the response, such as what type of file it is. My understanding of this part is a little hazy but I think it works something like this: PHP collects the headers, sends them to the client just before the first byte of the body goes out, and then sends the body. The header() function in PHP adds a raw HTTP header to that response. If you were to include the line header('Location: yomomma.php'); then PHP would effectively redirect the user to a picture of your mum (and you know we all want to see that). (Note: the use of header('Location: blah'); isn’t technically good practice as it should only really be used when a page has been moved but I won’t go into that now. However, it works and a lot of people use it… The choice is yours.)

About The Error Message
Ok, so what does the error mean? What it’s saying is that once PHP has started outputting the body of the page it is too late to go back and start re-doing the headers. This means that if there is ANY content (even white space!) output before the header() function then the header() function will fail.
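
PHP can tell you whether it is already too late, and where the output started, through headers_sent(). Here is a minimal sketch of a guarded redirect. The helper name try_redirect() is made up for illustration and was not part of my original code:

```php
<?php
// Hypothetical helper (not from the original site): only attempts the
// redirect if no body output has been sent yet.
function try_redirect(string $url): bool
{
    // headers_sent() fills $file and $line with the place where output began.
    if (headers_sent($file, $line)) {
        error_log("Cannot redirect: output started at $file:$line");
        return false;
    }

    // Queue the Location header; PHP sends it along with the other headers.
    header('Location: ' . $url);
    return true;
}
```

You would call it as if (try_redirect('section.php')) { exit; } so that nothing else runs after the header has been queued. It doesn't fix anything by itself, but the logged file and line point at the same place the warning does.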

So imagine my surprise when I open up settings.php and look at line 1 expecting to see an echo statement or maybe just a space before the opening PHP tag. But there was nothing there, just a load of variable assignments enclosed between the opening and closing PHP tags, with nothing outside them. So this should work without a problem… what the hell’s going on?!

A Closer Look At Settings.php
There is clearly something wrong with settings.php. If I navigate straight to it I should get a blank page but I don’t – I get something strange. This:

ï»¿

Three bizarre characters. It took me a while to find out that these characters are the Byte Order Mark (BOM) for UTF-8: the three bytes EF BB BF, displayed by a browser that isn’t reading them as UTF-8. The BOM is a tiny piece of data at the very start of a file which indicates the encoding form and, for UTF-16 and UTF-32, the byte order (basically so that the computer can tell if it’s UTF-8 or UTF-16 or whatever). For more info on the BOM take a look at the Unicode website. It turns out that settings.php was encoded using “UTF-8 including BOM” whereas all the other pages were encoded using “UTF-8 without BOM”. PHP knows nothing about BOMs: anything that comes before the opening PHP tag is treated as normal text and output, and in settings.php the BOM sits right before that tag. This then breaks the header() function and the page fails!
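
Since the UTF-8 BOM is always the same three bytes (EF BB BF) at the very start of the file, you can check for it without opening anything in an editor. A rough sketch, with function names I have made up for illustration:

```php
<?php
// The UTF-8 byte order mark: bytes EF BB BF.
const UTF8_BOM = "\xEF\xBB\xBF";

// Hypothetical helper: true if the given bytes begin with the UTF-8 BOM.
function starts_with_utf8_bom(string $bytes): bool
{
    return strncmp($bytes, UTF8_BOM, 3) === 0;
}

// Hypothetical helper: reads only the first three bytes of a file.
function file_has_utf8_bom(string $path): bool
{
    $head = file_get_contents($path, false, null, 0, 3);
    return $head !== false && starts_with_utf8_bom($head);
}
```

Running file_has_utf8_bom() over each PHP file in the site folder (for example with glob('*.php')) should single out the odd one, which in my case would have been settings.php.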

The Fix
To fix the problem I just made sure that all my PHP pages were encoded using the same encoding (UTF-8, without BOM). Simple enough when you know what the problem is!
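
I fixed mine by re-saving the file in my editor, but the same thing could be done in code. This is only a sketch under my own assumptions (the function names are invented, and it overwrites the file in place, so keep a backup):

```php
<?php
// Hypothetical helper: returns the bytes with a leading UTF-8 BOM removed.
function strip_utf8_bom(string $bytes): string
{
    $bom = "\xEF\xBB\xBF";
    return strncmp($bytes, $bom, 3) === 0 ? substr($bytes, 3) : $bytes;
}

// Hypothetical helper: rewrites the file without its BOM.
// Returns true only if a BOM was found and the file was rewritten.
function remove_bom_from_file(string $path): bool
{
    $bytes = file_get_contents($path);
    if ($bytes === false) {
        return false; // could not read the file
    }

    $clean = strip_utf8_bom($bytes);
    if ($clean === $bytes) {
        return false; // no BOM, nothing to do
    }

    return file_put_contents($path, $clean) !== false;
}
```

Calling remove_bom_from_file('settings.php') would then leave the opening PHP tag as the very first thing in the file, which is all header() needs.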


About Mr Chimp

I make music, draw pictures, browse the internet, programme, and make sweet, sweet cups of tea until the early hours.

6 Responses to Dropping the BOM

  1. Skilldrick says:

    Aha, very interesting. Never come across that before, but I’ll know what the problem is if I do!

  2. Mr Chimp says:

    Yeah, I’ve had the “headers already sent” message loads of times but it’s always been something obvious before. This was just bizarre.

  3. narissa71 says:

    I had a solitary full stop lurking on my page. Turned out to be the same thing. It appeared where I had an include to a function in a separate file.

  4. Mr Chimp says:

    Little tip: if possible set the default encoding in your text editor. That way any new files you make should be in the right format.

    I had the same problem a couple of weeks back because I forgot to set the encoding on a new file!

  5. Pingback: W3C Local CSS Validator Errors « Mr Chimp Learns to Write

  6. Pingback: Character Encodings « Mr Chimp Learns to Write
