Handling of $+ in URLs

There was an interesting discussion about URL handling in Zinc. Zinc is the fully rewritten HTTP client and server for Pharo, developed by Sven.

Zinc has an excellent design and Sven takes great care of it because he uses it in production on the servers of http://www.beta9.be . Zinc is also an important player in the Pharo Web stack, alongside Seaside (http://www.seaside.st) and other web frameworks (a nice newcomer around Amber is coming soon; stay tuned).

Pharo is really grateful for his involvement and contributions. Here is the summary of the discussion:

Hi,

Johan Brichau reported an issue a couple of days ago concerning the handling of $+ in ZnUrl (Pharo 3’s URL class) and in Seaside’s WAUrl. #bleedingEdge of Zinc HTTP Components fixes the issue, as far as I can see. I want to explain the problem and the solution.

Before October 24 of last year, ZnUrl used a single ‘better safe than sorry’ safe set when percent-encoding unsafe characters. However, the URL spec defines different allowed characters per URL part. This per-part behaviour was then added to Zinc-Resource-Meta-Core, ZnUrl’s package.
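To make the idea of per-part safe sets concrete, here is a minimal sketch in Python, using the standard library's `urllib.parse.quote` rather than Zinc itself. The two safe sets and both function names are illustrative assumptions and are not the sets ZnUrl actually uses.

```python
from urllib.parse import quote

# Hypothetical per-part safe sets, for illustration only (not ZnUrl's real sets).
# quote() never encodes letters, digits and "_.-~"; `safe` lists the extra
# characters that are left untouched.
PATH_SEGMENT_SAFE = "+:@!$&'()*,;="   # "/" and "?" must be encoded inside one segment
QUERY_SAFE = "+:@!$&'()*,;=/?"        # "/" and "?" may appear literally in a query


def encode_path_segment(text):
    """Percent-encode text for use as a single path segment."""
    return quote(text, safe=PATH_SEGMENT_SAFE)


def encode_query(text):
    """Percent-encode text for use as an uninterpreted query string."""
    return quote(text, safe=QUERY_SAFE)


print(encode_path_segment("a/b?c"))  # a%2Fb%3Fc
print(encode_query("a/b?c"))         # a/b?c
```

The same input is encoded differently depending on where it ends up in the URL, which a single conservative safe set cannot express.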

Soon after that, a discussion with Jan van de Sandt led to a first small change: since ZnUrl interprets the query part of a URL as key-value pairs, it is necessary to treat $= and $& as unsafe, even though they are not according to the URL spec (which doesn’t concern itself with how the query part is interpreted).
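The need to escape $= and $& inside keys and values can be shown with a small Python sketch based on `urllib.parse`. It is an analogy, not Zinc code: `build_query` and the two safe-set constants are hypothetical names introduced only for this example.

```python
from urllib.parse import quote, parse_qs

# Safe set for a query treated as opaque text: "=" and "&" may stay as they are.
OPAQUE_QUERY_SAFE = "=&"
# Safe set for the keys and values of a key-value query: "=" and "&" must be
# encoded, otherwise they are mistaken for separators when parsing.
KEY_VALUE_SAFE = ""


def build_query(pairs, safe):
    """Join (key, value) pairs into a query string, leaving `safe` characters unencoded."""
    return "&".join(quote(k, safe=safe) + "=" + quote(v, safe=safe) for k, v in pairs)


pairs = [("expr", "a=b&c")]

broken = build_query(pairs, OPAQUE_QUERY_SAFE)
print(broken, parse_qs(broken))    # expr=a=b&c {'expr': ['a=b']}

correct = build_query(pairs, KEY_VALUE_SAFE)
print(correct, parse_qs(correct))  # expr=a%3Db%26c {'expr': ['a=b&c']}
```

With the spec-level safe set the value is silently truncated at the `&`; with the stricter key-value safe set it survives the round trip.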

All that time, $+ kept on being interpreted as a space, independent of the safe set. As Johan reported, this conflicted with $+ being a safe character, which eventually led to the functional problem of not being able to enter a + in an input field in Seaside.

Why only in Seaside? Because ZnZincServerAdaptor>>#requestUrlFor: was implemented by printing the interpreted incoming ZnUrl and parsing it again. There, the escaping of $+ disappeared and it became an unintended space.
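The print-then-parse round trip can be reproduced outside Pharo. The following Python sketch uses `urllib.parse` as a stand-in for ZnUrl; `print_query` is a hypothetical helper, and passing `safe="+"` is an assumption that mimics a safe set in which $+ counts as safe.

```python
from urllib.parse import quote, parse_qs


def print_query(fields, safe):
    """Re-serialise interpreted key-value fields; `safe` characters stay unencoded."""
    return "&".join(quote(k, safe=safe) + "=" + quote(v, safe=safe)
                    for k, v in fields.items())


incoming = "q=1%2B1"  # the browser correctly encoded the input "1+1"
interpreted = {k: v[0] for k, v in parse_qs(incoming).items()}
print(interpreted)    # {'q': '1+1'}

# Assumed old behaviour: "+" is in the safe set, so it is printed unescaped ...
reprinted = print_query(interpreted, safe="+")
print(reprinted)      # q=1+1
# ... and the second parse reads that bare "+" as a space.
reparsed = parse_qs(reprinted)
print(reparsed)       # {'q': ['1 1']}

# With "+" no longer considered safe, the round trip is lossless.
fixed = parse_qs(print_query(interpreted, safe=""))
print(fixed)          # {'q': ['1+1']}
```

The first parse is correct; the information is lost only when the interpreted value is printed with an unescaped + and then parsed a second time.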

This situation is now fixed by:

Changes to ZnPercentEncoder:
– adding a #decodePlusAsSpace boolean option

Changes to ZnResourceMetaUtils:
– #decodePercent: no longer decodes plus as space
– #decodePercentForQuery: does plus as space decoding
– #queryKeyValueSafeSet no longer includes $+
– #parseQueryFrom: now uses #decodePercentForQuery:
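The split between a plain decoder and a query decoder can be sketched as one function with a boolean option. This is a Python illustration of the idea only: the names below merely echo the Smalltalk selectors, and the body is not taken from ZnPercentEncoder.

```python
def percent_decode(text, decode_plus_as_space=False):
    """Decode %XX escapes; optionally turn a literal "+" into a space.

    Sketch only: assumes well-formed %XX escapes and UTF-8, with no validation.
    """
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "%":
            # An escaped byte is appended as-is, so %2B always yields "+".
            out.append(int(text[i + 1:i + 3], 16))
            i += 3
        elif ch == "+" and decode_plus_as_space:
            out.append(0x20)
            i += 1
        else:
            out.extend(ch.encode("utf-8"))
            i += 1
    return out.decode("utf-8")


def decode_percent(text):
    """Generic decoding, e.g. for path segments: "+" stays "+"."""
    return percent_decode(text)


def decode_percent_for_query(text):
    """Decoding for key-value query fields: "+" means space."""
    return percent_decode(text, decode_plus_as_space=True)


print(decode_percent("foo+bar"))              # foo+bar
print(decode_percent_for_query("foo+bar"))    # foo bar
print(decode_percent_for_query("foo%2Bbar"))  # foo+bar
```

Keeping the plus handling behind an explicit option means only the code path that parses query fields opts in to the form-encoding convention.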

Added ZnDefaultServerDelegate>>#formTest1: to test simple form submit encoding handling

Modified ZnZincServerAdaptor>>#requestUrlFor: to build a WAUrl explicitly from the interpreted parts of the incoming ZnUrl instead of going via printing and parsing

Added new unit tests:
– ZnUrlTests>>#testPlusHandling
– ZnServerTests>>#testFormTest1

I think WAUrl should best be changed as well, but that is not my call.

In code, this summarises the implemented behaviour:

ZnUrlTests>>#testPlusHandling
    "While percent decoding, a + is translated as a space only in the context of
    application/x-www-form-urlencoded get/post requests:
    http://en.wikipedia.org/wiki/Percent-encoding#The_application.2Fx-www-form-urlencoded_type
    ZnUrl interprets its query part as key value pairs where this translation is applicable,
    even though strictly speaking + (and =, &) are plain unreserved characters in the query"

    "$+ is not special in the path part of the URL and it remains itself"
    self
        assert: 'http://localhost/foo+bar' asZnUrl firstPathSegment
        equals: 'foo+bar'.
    self
        assert: 'http://localhost/foo+bar' asZnUrl printString
        equals: 'http://localhost/foo+bar'.
    "$+ gets decoded to space in the interpreted query part of the URL,
    and becomes an encoded space if needed"
    self
        assert: ('http://localhost/test?q=foo+bar' asZnUrl queryAt: #q)
        equals: 'foo bar'.
    self
        assert: 'http://localhost/test?q=foo+bar' asZnUrl printString
        equals: 'http://localhost/test?q=foo%20bar'.
    "to pass $+ as $+ in a query, it has to be encoded"
    self
        assert: 'http://localhost/test?q=foo%2Bbar' asZnUrl printString
        equals: 'http://localhost/test?q=foo%2Bbar'

I hope this is a good and correct solution. In any case, it fixes the functional problem that $+ disappeared in WAUrlEncodingFunctionalTest – which I took over in ZnDefaultServerDelegate>>#formTest1:

Thanks, Johan, for the whole discussion!

Sven
