Unexpected $_GET

During private beta testing of an application I encountered some unexpected discrepancies in the contents of PHP's global $_GET array on different servers. One of the arguments passed through the query string is a URL-encoded string that contains encoded ampersands (%26) and encoded equals signs (%3D). On some hosts those encoded ampersand/equals pairs were incorrectly being parsed out into additional indexes of the $_GET array. I crafted a simple reduction to illustrate the problem:

<?php print_r($_GET); ?>

The file containing just the above line of code is then accessed with the following query string:

?debug1=testing%26encoded%3Dquerystrings&debug2=done

The correct output should be:

Array
(
    [debug1] => testing&encoded=querystrings
    [debug2] => done
)

But on some hosts I found this:

Array
(
    [debug1] => testing
    [encoded] => querystrings
    [debug2] => done
)

So what was the problem? The hosts exhibiting this incorrect behavior had all enabled WebDAV on certain directories. In order for WebDAV to work with Windows XP authentication, mod_encoding needs to be installed and the following line added to your httpd.conf file:

EncodingEngine on

But this results in the intentionally URL-encoded entities being decoded prematurely, before PHP parses the query string, thus “corrupting” the $_GET array.
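To make the effect concrete, here is a minimal sketch that reproduces both outputs without Apache. It assumes the premature decoding behaves like a single urldecode() pass over the raw query string, which matches the symptom but is not confirmed mod_encoding behavior. The variable names are made up for illustration.

```php
<?php
// The query string exactly as the browser sends it.
$raw = 'debug1=testing%26encoded%3Dquerystrings&debug2=done';

// What PHP normally does: split on literal & and =, then decode each piece.
parse_str($raw, $expected);

// Assumption: the module acts like one urldecode() pass before PHP sees the string.
$mangled = urldecode($raw);
parse_str($mangled, $corrupted);

print_r($expected);   // two keys: debug1, debug2
print_r($corrupted);  // three keys: debug1, encoded, debug2
```

Because the decode happens first, the %26 and %3D are indistinguishable from real separators by the time anything splits the string.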

As you could probably tell from the link, mod_encoding is of Japanese origin and is difficult to find information on in English. From what I can piece together, unlike something like mod_rewrite, the EncodingEngine setting cannot be overridden on a per-directory basis with .htaccess. What I ended up doing was moving the WebDAV directories to their own isolated subdomain and disabling WebDAV on the primary domain. Not an ideal approach. Has anybody come across this and found a less restrictive solution?

Author
Shaun Inman
Posted
May 11th, 2005 at 11:38 pm
Categories
PHP
Mint
Web
Comments
018 (Now closed)

018 Comments

001

No, I haven’t noticed anything like that.

I guess that one could get around it by parsing the $_SERVER["QUERY_STRING"] variable. Not the best solution but it would probably work.

Author
Henrik Pejer
Posted
May 12th, 2005 1:22 am
002

Shaun, have you tested to see if this behaviour is occurring with $_POST as well? I would also check $GLOBALS["_GET"] and see if it is simply WebDAV mangling your _GET directly.

Good info to know though, as we were considering webDAV for a SVN repository. Gonna tuck that idea onto a back burner for now until I have more time to nose around and see what’s going on with it.

Author
Jakob Heuser
Posted
May 12th, 2005 2:28 am
003

Henrik, I had considered the option of parsing the query string directly but in this particular instance there are a number of unknowns in the query string so it wouldn’t really be possible.

Jakob, I haven’t tested $_POST yet because in this situation that’s not really an option but that would be a much more significant issue, wouldn’t it? I’ll give the $GLOBALS array a shot but I expect the same problem.

Author
Shaun Inman
Posted
May 12th, 2005 3:30 am
004

Update: $GLOBALS['_GET'] is also affected but $_POST is not.

Author
Shaun Inman
Posted
May 12th, 2005 3:37 am
005

Pretty interested in seeing what exactly the application in question is. As for a solution to the problem, maybe I am not analyzing it enough, but couldn’t this problem be solved thusly:

EncodingEngine On
SetServerEncoding UTF-8

Perhaps you need a different encoding than UTF-8. Since it is only occurring on certain servers, it could be a system default that conflicts with this. I’m not too familiar with WebDAV myself but I do recall there being SetServerEncoding, AddClientEncoding, and DefaultClientEncoding settings. Perhaps experimenting with the possible options for these settings will yield the results you anticipated.

Or I could just be totally off base; that is a strong possibility too. I may be a programmer, but I am an ASP/Visual Basic programmer, which makes me not a programmer at all. Or at least until I come home and do stuff with PHP. Nonetheless, good luck with your problem.

Author
Ryan Latham
Posted
May 12th, 2005 3:39 am
006

Ryan, that might be something worth digging into if you admin your own server but the hosted sites I’m dealing with don’t allow editing httpd.conf directly.

Author
Shaun Inman
Posted
May 12th, 2005 3:51 am
007

I would try using one or a combination of the following PHP functions to see if they will help ensure your URL gets encoded the way you want it encoded.

urlencode() urldecode() rawurlencode() rawurldecode() htmlentities()

Author
Brent O'Connor
Posted
May 12th, 2005 7:14 am
008

Brent, the elements have already been encoded properly before being used in the query string. The issue here is that mod_encoding is decoding that encoding prematurely. Thanks for the suggestion though.

Author
Shaun Inman
Posted
May 12th, 2005 7:24 am
009

Myself and Tim at nefariousdesigns.co.uk came up with this:

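function fryup()
{
  // Rebuild a $_GET-style array from the raw query string.
  $arr = array();
  $arr_values = explode('&', $_SERVER['QUERY_STRING']);

  foreach ($arr_values as $param)
  {
    // Skip empty segments (empty query string or a stray '&').
    if ($param === '') continue;

    // Split on the first '=' only; a bare key gets an empty value.
    $temp_arr = explode('=', $param, 2);
    $arr[urldecode($temp_arr[0])] = isset($temp_arr[1]) ? urldecode($temp_arr[1]) : '';
  }

  return $arr;
}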

A convoluted function, but it will give the same results as a $_GET array. And you don’t need to know anything about what’s in the query string.

Maybe that will help?

Author
Stuart
Posted
May 12th, 2005 8:06 am
010

Maybe I haven’t explained myself clearly. What’s happening is that correctly encoded ampersands (%26) are being decoded before they even reach PHP because of mod_encoding. That means that PHP’s internal processes and any user-defined function that breaks up the query string by looking for an & will stumble in the same place.

This is not a PHP problem. The problem occurs before PHP even enters the picture. The solution is going to lie on the Apache module side of the fence.

Author
Shaun Inman
Posted
May 12th, 2005 8:24 am
011

Use some other character than the ampersand as the GET request parameter separator. On some hosts you can change the PHP ini settings with an .htaccess file. Check out http://www.php.net/manual/en/ini.core.php#ini.arg-separator.input for more information.
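As a rough illustration of why a different separator might help: arg_separator.input cannot be changed at runtime, so the sketch below uses a hypothetical helper, parse_with_separator(), to mimic what PHP would put into $_GET if the separator were ";". It also assumes the query string has already been decoded once before PHP sees it.

```php
<?php
// Hypothetical helper: mimics how PHP would fill $_GET if
// arg_separator.input were set to ";" instead of "&".
function parse_with_separator($query, $separator)
{
    $result = array();
    foreach (explode($separator, $query) as $pair) {
        if ($pair === '') {
            continue;
        }
        // Split on the first "=" only, as PHP does.
        $parts = explode('=', $pair, 2);
        $result[urldecode($parts[0])] = isset($parts[1]) ? urldecode($parts[1]) : '';
    }
    return $result;
}

// Query string after an assumed premature decode, with ";" between arguments.
$mangled = 'debug1=testing&encoded=querystrings;debug2=done';
$args = parse_with_separator($mangled, ';');
print_r($args);
```

The decoded ampersand no longer splits the value, but the links would have to be generated with ";" as well, and an encoded semicolon (%3B) inside a value would run into the same problem.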

Author
Janne
Posted
May 12th, 2005 12:11 pm
012

It is always possible to re-encode the _GET information and replace all the encoded strings (such as %20) with specific characters. I think there is a comprehensive PHP script for this. It may slow down the script a bit, but it should be really secure. It is the only “internal” method that can resolve it with PHP only.

Author
Oliver
Posted
May 12th, 2005 3:01 pm
013

I know only the basics of Apache, but perhaps double-encoding it might help? That is, take the %26 and escape it again.
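A quick sketch of the idea, assuming the server module applies exactly one urldecode()-style pass before PHP parses the query string (an assumption, not verified against mod_encoding):

```php
<?php
$value = 'testing&encoded=querystrings';

// Encode twice so that one premature decode still leaves %26 and %3D intact.
$query = 'debug1=' . urlencode(urlencode($value)) . '&debug2=done';

// Assumption: the server module applies exactly one decode pass.
$after_module = urldecode($query);

// PHP's normal parsing then performs the second decode.
parse_str($after_module, $args);
print_r($args);
```

The catch is that on a host without the extra decode the value would arrive still encoded once, so the receiving script would need some way to tell the two cases apart.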

Author
jordan
Posted
May 12th, 2005 3:16 pm
014

Hmm, this sounds more like a major security issue with the mod_encoding module. If mod_encoding can only be applied globally it should not be used.

Imagine if I’d have both mod encoding and register globals enabled and use a few uninitialised variables…

Instead of trying to work around the issue I would put a note that it presents a major security issue and that it should be disabled if they wish to use Shortstat. It may not be an option though :-/

Author
Vincent Grouls
Posted
May 14th, 2005 3:06 am
015

I would just like to know what Media Temple’s new project is. The graphic looks sweet.

Author
Neil
Posted
May 14th, 2005 4:23 pm
016

I came up with this fix for Shaun’s little WebDav bug. Not sure if it’s a solution that will work for him. Thought I might add it as a comment so that whoever is interested might be able to use the code somehow if needed in a project.

Basically all I did was write a function that substitutes each encoded ampersand (%26) and encoded equals sign (%3D) with a placeholder before displaying the link. Then once it’s submitted, a function runs that loops through the $_GET array and decodes it back to the original values.

You can see an example by going to http://brent.epicserve.com/examples/get-encode-function.phps.
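For readers who cannot reach that link, the general shape of the approach might look like the sketch below. The placeholder tokens and function names here are hypothetical stand-ins, not the ones from the linked example, and the premature decode is again simulated with urldecode().

```php
<?php
// Hypothetical placeholder tokens; the ones used in the linked example are not known.
define('AMP_TOKEN', '__AMP__');
define('EQ_TOKEN', '__EQ__');

// Swap the troublesome characters for tokens before building the link.
function protect_value($value)
{
    $swapped = str_replace(array('&', '='), array(AMP_TOKEN, EQ_TOKEN), $value);
    return urlencode($swapped);
}

// Restore the original characters in every value of a $_GET-style array.
function restore_values($args)
{
    foreach ($args as $key => $value) {
        $args[$key] = str_replace(array(AMP_TOKEN, EQ_TOKEN), array('&', '='), $value);
    }
    return $args;
}

$query = 'debug1=' . protect_value('testing&encoded=querystrings') . '&debug2=done';

// Assumed premature decode, then PHP's normal parsing.
parse_str(urldecode($query), $received);
$restored = restore_values($received);
print_r($restored);
```

A value that legitimately contains one of the tokens would be altered on the way back, so the tokens need to be strings that never occur in real data.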

Hope this helps someone! :) If not… at least I had fun coming up with one solution that worked. Granted I didn’t know the entire scope of what Shaun was doing so I don’t know how practical it might be in his situation.

Author
Brent O'Connor
Posted
May 16th, 2005 5:45 am
017

So, what was the original problem? Are you working on two different servers and trying to get them both to work without changing the code, just the server params? I agree with some of the others. I’d just mess with urlencode() and urldecode()… or just run a regex on $_SERVER['QUERY_STRING'].

Author
Dustin Diaz
Posted
May 16th, 2005 7:39 am
018

Shaun, I would really like to beta test Mint. Would that be possible?

Author
Morgan Knutson
Posted
May 22nd, 2005 4:18 pm