Article delegate-en/355 of [1-5169] on the server localhost:119
[Reference:<_A333@delegate-en.ML_>]
Newsgroups: mail-lists.delegate-en

[DeleGate-En] Re: problem with url translation?
17 Mar 1999 08:48:08 GMT ysato@etl.go.jp (Yutaka Sato)


On 03/08/99(23:35) you pjuaqbdyi-y27ap3p5sbfr.ml@ml.delegate.org wrote
in <_A333@delegate-en.ML_>
 | https://delegate.machine.edu/-_-http:/foo.edu/test.html
 |and suppose that test.html has images of the form
...
 |    <IMG SRC="../icons/picture2.gif">
...
 |However, picture2.gif fails.  In the log, I see:

A similar problem was recently reported on the delegate@etl.go.jp
mailing list, although that case involved an explicit MOUNT and
the problematic URL was in the SRC of a FRAME.
("http://wall.etl.go.jp/mail-lists/delegate/7985", 03Feb99)

In your case the URL for picture2.gif seems wrong because it
points outside the WWW data space of the server.  ".." means "parent
directory", of course, so a relative URL that points to the parent
directory from a file in the top directory of the WWW server is
abnormal.  It should be one of the following:
  "icons/picture2.gif"
  "/icons/picture2.gif"
  "./icons/picture2.gif"

In this case the client interprets ".." in the URL and
normalizes "a/b/../c" to "a/c", that is, it rewrites
"/-_-http://foo.edu/../icons/picture2.gif" to
"/-_-http://icons/picture2.gif", and sends the result to DeleGate.

This rewriting is a legal operation for a relative URL according to
RFC1808 (section 4, step 6c).  The RFC also mentions special
handling of "abnormal" URLs like the one above, from
"http://foo.edu/" + "../icons/picture2.gif" to
"http://foo.edu/" + "icons/picture2.gif".
Thus clients that implement the algorithm can access the target
resource even with such an abnormal, wrong URL.
But unfortunately, of course, an abnormal URL in the /-_-URL notation
of DeleGate is not handled as a special case, either in the
specification or in clients' implementations.
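
As a rough illustration of the rewriting described above, here is a
minimal C sketch of the "<segment>/../" removal from RFC1808 section 4,
step 6c, applied to an already merged path.  The function name
remove_updirs is made up for this example; it is not part of DeleGate
or of any particular client, and real clients may differ in details.

```c
#include <string.h>

/*
 * Remove every "<segment>/../" where <segment> is a complete,
 * non-empty path segment other than "..", restarting from the
 * left after each removal.  The path is edited in place.
 * (Hypothetical helper, for illustration only.)
 */
void remove_updirs(char *path)
{
	char *p = path;

	while ((p = strstr(p, "/../")) != NULL) {
		char *seg = p;	/* p is the '/' that ends <segment> */

		/* walk back to the first character of <segment> */
		while (seg > path && seg[-1] != '/')
			seg--;

		/* an empty segment or ".." is not removed: skip past it */
		if (seg == p ||
		    (p - seg == 2 && seg[0] == '.' && seg[1] == '.')) {
			p += 3;
			continue;
		}

		/* drop "<segment>/../" and rescan from the beginning */
		memmove(seg, p + 4, strlen(p + 4) + 1);
		p = path;
	}
}
```

Given "/-_-http://foo.edu/../icons/picture2.gif", the sketch leaves
"/-_-http://icons/picture2.gif": the host name foo.edu is consumed as
if it were a directory.  A leading "/../" has no preceding segment and
is left untouched.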

# Personally I don't favor the part of the specification that has
# the rewriting of ".." implemented on the client side because,
# as the document says,
# > this algorithm cannot guarantee that the resulting URL
# > will equal that intended by the original author, ... 

 |It seems that whenever a relative URL starts with ".." 
 |the ".." gets stripped out, and delegate interprets the
 |directory following it as machine name.
 |
 |Am I doing something wrong, or is this a bug in url translation?
 |I tried to track it down in the code, but I was unable
 |to figure out where it is happening.

DeleGate does nothing to relative URLs in response messages,
to keep relaying as lightweight and as transparent as possible.

You did nothing wrong, but encountered a page with an abnormal URL.
Although I'm not sure yet how to handle such abnormal URLs (by default),
I made a trial patch to avoid the problem, enclosed below.
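
The idea of the patch can be sketched roughly as follows, as a
simplified stand-alone illustration rather than the patch itself.  It
assumes that base is the path of the current page relative to the top
of the server (for example "test.html" or "dir/test.html"); the names
base_depth and excess_updir_len are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Count the directory levels in a base path: "a/b/page.html" -> 2. */
size_t base_depth(const char *base)
{
	size_t depth = 0;

	for (; *base; base++)
		if (*base == '/')
			depth++;
	return depth;
}

/*
 * If rel starts with more "../" segments than base has directory
 * levels, return the length of that leading run of "../" so the
 * caller can skip it and resolve the rest from the top of the
 * server.  Otherwise return 0 and leave the URL alone.
 */
size_t excess_updir_len(const char *base, const char *rel)
{
	size_t depth = base_depth(base);
	size_t ups = 0;
	const char *p = rel;

	while (strncmp(p, "../", 3) == 0) {
		ups++;
		p += 3;
	}
	return ups > depth ? (size_t)(p - rel) : 0;
}
```

With base "test.html" and the relative URL "../icons/picture2.gif" the
sketch returns 3, so the caller would skip the leading "../" and
resolve "icons/picture2.gif" from the top of the server.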

Cheers,
Yutaka
--
Yutaka Sato <ysato@etl.go.jp> http://www.etl.go.jp/~ysato/   @ @ 
Computer Science Division, Electrotechnical Laboratory      ( - )
1-1-4 Umezono, Tsukuba, Ibaraki, 305-8568 Japan            _<   >_


*** ../../delegate5.9.1/src/url.c	Thu Mar 11 16:00:06 1999
--- url.c	Wed Mar 17 16:50:14 1999
***************
*** 890,895 ****
--- 890,938 ----
  	strcpy(relurl,absurl);
  }
  
+ /*
+  * handle an abnormal URL that points outside the data space of the server ...
+  * handle only "../" at the top of the URL to keep the normalization lightweight
+  */
+ int URL_NORMALIZE = 1;
+ #define UPDIR(u) \
+ 	((u[0]=='.' && u[1]=='.' && \
+ 	 (u[2]=='/' || u[2]=='"' || u[2] == '>' || isspace(u[2]) || u[2]==0)) \
+ 	? &u[2] : 0)
+ 
+ url_normal(base,url)
+ 	char *base,*url;
+ {	char *up,*bp,bc,*xp,*rp;
+ 
+ 	if( !UPDIR(url) )
+ 		return 0;
+ 
+ 	if( *base )
+ 		bp = base + strlen(base) - 1;
+ 	else	bp = base - 1;
+ 
+ 	rp = url;
+ 	up = url;
+ 	while( xp = UPDIR(up) ){
+ 		while( base <= bp && (bc = *bp--) )
+ 			if( bc == '/' )
+ 				break;
+ 		rp = xp;
+ 		if( *xp != '/' )
+ 			break;
+ 		else	up = xp + 1;
+ 
+ 	}
+ 	if( bp < base ){
+ 		if( LOG_VERBOSE ){
+ 			char ub[32];
+ 			Strncpy(ub,url,16);
+ 			Verbose("ABNORMAL-URL: base<%s> url<%s>\n",base,ub);
+ 		}
+ 		return rp - url;
+ 	}else	return 0;
+ }
+ 
  url_absolute(myhp,proto,host,port,base,line,xline)
  	char *myhp,*proto,*host,*base,*line,*xline;
  {	Referer referer;
***************
*** 916,921 ****
--- 959,965 ----
  	char *sp,*np,*xp;
  	int ch;
  	char *tagp;
+ 	int uplen;
  
  	getBASE(referer,&myhp,&proto,&hp,&host,&port,&base);
  
***************
*** 970,975 ****
--- 1014,1026 ----
  			 * page's URL
  			 */
  			sp += 1;
+ 			sprintf(xp,"%s://%s/",proto,hp);
+ 		}
+ 		else
+ 		if( URL_NORMALIZE && (uplen = url_normal(base,sp)) ){
+ 			sp += uplen;
+ 			if( *sp == '/' )
+ 				sp++;
  			sprintf(xp,"%s://%s/",proto,hp);
  		}
  		else
