I have a problem that I really don't understand. I'm trying to upload files in a classic ASP app without using an external component. I also want to post some text that will be stored in a DB. The file uploads perfectly; I'm using this code: Upload Files Without COM v3 by Lewis E. Moten III.
The problem is the other form input fields. I'm using UTF-8, but they don't end up as UTF-8. For example, the Swedish characters å, ä and ö are displayed as question marks when I print them using Response.Write.
I have saved the files in UTF-8 (with BOM), I have added the meta tag to tell the page it is in UTF-8. I have set Response.CharSet = "UTF-8".
The function to convert from binary to string looks like this (this is the only place I can think of that might be wrong, since the comments say that it pulls ANSI characters, but I think it should pull Unicode characters):
Private Function CStrU(ByRef pstrANSI)
    ' Converts an ANSI string to Unicode
    ' Best used for small strings
    Dim llngLength ' Length of ANSI string
    Dim llngIndex ' Current position
    ' determine length
    llngLength = LenB(pstrANSI)
    ' Loop through each character
    For llngIndex = 1 To llngLength
        ' Pull out ANSI character
        ' Get Ascii value of ANSI character
        ' Get Unicode Character from Ascii
        ' Append character to results
        CStrU = CStrU & Chr(AscB(MidB(pstrANSI, llngIndex, 1)))
    Next
End Function
I have created a test ASP page (multiparttest.asp) to replicate this. The upload code from Lewis E. Moten is required to make it work (I have added his files in a subdirectory called upload).
<%Response.CharSet = "UTF-8" %>
<!--#INCLUDE FILE="upload/clsUpload.asp"-->
<html>
<head>
<title>Test</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>
<body>
<%
Set objUpload = New clsUpload
Response.Write( objUpload.Fields("testInput").Value )
%>
<form method="post" enctype="multipart/form-data" action="multiparttest.asp">
<input type="text" name="testInput" />
<input type="submit" value="submit" />
</form>
</body>
</html>
I have captured the request using LiveHTTP Headers in Firefox and saved it as a UTF-8 file. The Swedish characters look like they should (they didn't look OK in the LiveHTTP Headers GUI, but I'm guessing that the GUI itself doesn't use the correct encoding). This is what the POST request looks like:
http://localhost/testsite/multiparttest.asp
POST /testsite/multiparttest.asp HTTP/1.1
Host: localhost
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5 (.NET CLR 3.5.30729)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://localhost/testsite/multiparttest.asp
Cookie: ASPSESSIONIDASBBRBTT=GLDJDBJALAMJFBFBDCCIONHF; ASPSESSIONIDAQABQBTT=DIPHILKAIICKJOIAIMILAMGE; ASPSESSIONIDCSABTCQS=KMHBLBLABKHCBGPNLMCIPPNJ
Content-Type: multipart/form-data; boundary=---------------------------7391102023625
Content-Length: 150
-----------------------------7391102023625
Content-Disposition: form-data; name="testInput"
åäö
-----------------------------7391102023625--
HTTP/1.x 200 OK
Cache-Control: private
Content-Length: 548
Content-Type: text/html; Charset=UTF-8
Server: Microsoft-IIS/7.0
X-Powered-By: ASP.NET
Date: Tue, 10 Nov 2009 14:20:17 GMT
----------------------------------------------------------
Any help in this matter is appreciated!
I've tried adding all of these to the top of the ASP file, following different suggestions I've found on this problem elsewhere, but the result is the same.
<%@Language=VBScript codepage=65001 %>
<%Response.ContentType="text/html"%>
<%Response.Charset="UTF-8"%>
<%Session.CodePage=65001%>
This question seems related: UTF-8 text is garbled when form is posted as multipart/form-data. But it doesn't involve ASP or IIS. Is it possible to set up some kind of character encoding for multipart/form-data in IIS? I'm using IIS7. Maybe my request does have the wrong encoding after all? (I'm really lost in the character encoding world right now.)
Your analysis of CStrU is correct. It assumes that the client sends single-byte ANSI characters, and that the client and the locale the VBScript runs in use the same codepage.
When using UTF-8, the assumptions made by CStrU will always be incorrect. There isn't, to my knowledge, a locale that has 65001 as its codepage (I think there are one or two that use 65000, but that's different again).
Here is a replacement function that assumes the text is in UTF-8:
Private Function CStrU(ByRef pstrANSI)
    ' Converts a UTF-8 encoded byte string to a Unicode string
    Dim llngLength ' Length of the byte string
    Dim llngIndex ' Current position
    Dim bytVal ' Current byte
    Dim intChar ' Decoded character code
    ' Determine length
    llngLength = LenB(pstrANSI)
    ' Loop through each byte sequence
    llngIndex = 1
    Do While llngIndex <= llngLength
        bytVal = AscB(MidB(pstrANSI, llngIndex, 1))
        llngIndex = llngIndex + 1
        If bytVal < &h80 Then
            ' One byte: plain ASCII
            intChar = bytVal
        ElseIf bytVal < &hE0 Then
            ' Two bytes: 5 bits from the lead byte, 6 from the continuation byte
            intChar = (bytVal And &h1F) * &h40
            bytVal = AscB(MidB(pstrANSI, llngIndex, 1))
            llngIndex = llngIndex + 1
            intChar = intChar + (bytVal And &h3F)
        ElseIf bytVal < &hF0 Then
            ' Three bytes: 4 bits from the lead byte, 6 from each continuation byte
            intChar = (bytVal And &hF) * &h1000
            bytVal = AscB(MidB(pstrANSI, llngIndex, 1))
            llngIndex = llngIndex + 1
            intChar = intChar + (bytVal And &h3F) * &h40
            bytVal = AscB(MidB(pstrANSI, llngIndex, 1))
            llngIndex = llngIndex + 1
            intChar = intChar + (bytVal And &h3F)
        Else
            ' Four-byte sequences (above U+FFFF) are not decoded:
            ' emit a placeholder and skip the three continuation bytes
            intChar = &hBF
            llngIndex = llngIndex + 3
        End If
        CStrU = CStrU & ChrW(intChar)
    Loop
End Function
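As a quick check of the two-byte branch, consider å (U+00E5), which UTF-8 sends as the bytes &hC3 &hA5. The original function turns each of those bytes into its own character, so a single å arrives as two characters. The sketch below repeats the bit arithmetic by hand; it is only an illustration, assumed to be saved as a standalone .vbs file and run with cscript (hence WScript.Echo rather than Response.Write), and the variable names are made up for the example.

```vbscript
Option Explicit

' UTF-8 encodes å (U+00E5) as the two bytes &hC3 &hA5.
Dim bytLead, bytCont, intChar
bytLead = &hC3   ' binary 110 00011: lead byte of a two-byte sequence
bytCont = &hA5   ' binary 10 100101: continuation byte

' Keep the low 5 bits of the lead byte and shift them left by 6 bits
' (multiply by &h40), then add the low 6 bits of the continuation byte.
' (&hC3 And &h1F) = 3, 3 * 64 = 192, (&hA5 And &h3F) = 37, 192 + 37 = 229
intChar = (bytLead And &h1F) * &h40 + (bytCont And &h3F)

WScript.Echo Hex(intChar)   ' prints E5, the code point of å
```

The three-byte branch follows the same pattern, with 4 bits taken from the lead byte and 6 bits from each of the two continuation bytes.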
Note that with CStrU corrected for UTF-8, the output of your example page will now look wrong unless the page's codepage is also set, so the advice to set the codepage of the file to 65001 is a requirement too. Since you are setting the CharSet sent to the client to "UTF-8", you also need to tell ASP to use the UTF-8 codepage when encoding text written using Response.Write.
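Put together, the top of the test page might look like the fragment below. It only combines the directives already mentioned in the question, and it assumes the .asp file itself is saved as UTF-8; the @ directive has to be the first line of the file.

```vbscript
<%@ Language=VBScript CodePage=65001 %>
<%
' CodePage=65001 above makes ASP encode Response.Write output as UTF-8.
' CharSet only labels the response so the browser decodes it as UTF-8.
Response.CharSet = "UTF-8"
%>
<!--#INCLUDE FILE="upload/clsUpload.asp"-->
```

The rest of multiparttest.asp (the form and the Response.Write call) can stay as it is.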
I don't know if this will be any help, but I have worked with some classic ASP code for the SWFUpload utility (a Flash plugin that allows multiple file uploads in a batch).
The ASP sample code includes some comprehensive code that sorts out the Byte/Unicode decoding, and it looks similar to what you mention regarding chr(AscB(MidB(... Perhaps seeing a second example might shed light on your problem.