
How to get data off of a character

I am working on a project in Unity, which uses C#. I am trying to retrieve special characters such as é, but in the console they display as blank characters. For instance, translating "How are you?" should return "Cómo estás?", but it returns "Cmo ests". I put the returned string "Cmo ests" in a character array and found that each missing character is still present as a non-null character that displays as blank. I am using Encoding.UTF8, and when I do:

char ch = '\u00e9';
print (ch);

It will print "é". I have tried getting the bytes off of a given string using:

byte[] utf8bytes = System.Text.Encoding.UTF8.GetBytes(temp);

When translating "How are you?", this returns a byte array, but for special characters such as é I get the bytes 239, 191, 189, which is the UTF-8 encoding of U+FFFD, the replacement character.

What information do I need to retrieve from the characters in order to accurately determine which character each one is? Do I need to do something with the data that Google gives me, or is it something else? I need a general solution that I can place in my program and that will work for any input string. If anyone can help, it would be greatly appreciated.

Here is the code that is referenced:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using UnityEngine;
using System.Collections;
using System.Net;
using HtmlAgilityPack;


public class Dictionary{
string[] formatParams;
HtmlDocument doc;
string returnString;
char[] letters;
public char[] charString;
public Dictionary(){
    formatParams = new string[2];
    doc = new HtmlDocument();
    returnString = "";
}

public string Translate(String input, String languagePair, Encoding encoding)
    {
        formatParams[0]= input;
        formatParams[1]= languagePair;
        string url = String.Format("http://www.google.com/translate_t?hl=en&ie=UTF8&text={0}&langpair={1}", formatParams);

        string result = String.Empty;

        using (WebClient webClient = new WebClient())
        {
            webClient.Encoding = encoding;
            result = webClient.DownloadString(url);
        }       
        doc.LoadHtml(result);
        input = alter (input);
        string temp = doc.DocumentNode.SelectSingleNode("//span[@title='"+input+"']").InnerText;
        charString = temp.ToCharArray();
        return temp;
    }
// Use this for initialization
void Start () {

}
string alter(string inputString){
    returnString = "";
    letters = inputString.ToCharArray();
    for(int i=0; i<inputString.Length;i++){
        if(letters[i]=='\''){
            returnString = returnString + "&#39;";  
        }else{
            returnString = returnString + letters[i];   
        }
    }
    return returnString;
}
}
Cameron Barge asked Nov 09 '12 15:11

1 Answer

Maybe you should use another API/URL. The function below uses a different URL that returns JSON data and seems to work better:

    // Requires: using System; using System.Collections.Generic; using System.Net;
    // using System.Web.Script.Serialization; (reference System.Web.Extensions)
    public static string Translate(string input, string fromLanguage, string toLanguage)
    {
        using (WebClient webClient = new WebClient())
        {
            // EscapeDataString also escapes reserved characters such as '&' and '='
            // that would otherwise corrupt the query string
            string url = string.Format("http://translate.google.com/translate_a/t?client=j&text={0}&sl={1}&tl={2}", Uri.EscapeDataString(input), fromLanguage, toLanguage);
            string result = webClient.DownloadString(url);

            // I used JavaScriptSerializer but another JSON parser would work
            JavaScriptSerializer serializer = new JavaScriptSerializer();
            Dictionary<string, object> dic = (Dictionary<string, object>)serializer.DeserializeObject(result);
            // The first entry of "sentences" holds the translated text under "trans"
            Dictionary<string, object> sentences = (Dictionary<string, object>)((object[])dic["sentences"])[0];
            return (string)sentences["trans"];
        }
    }

If I run this in a Console App:

    Console.WriteLine(Translate("How are you?", "en", "es"));

It will display

¿Cómo estás?
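
As for the bytes in the question: 239, 191, 189 is the UTF-8 encoding of U+FFFD, so the original character was already lost when the response was decoded, before GetBytes was ever called. One possible cause, which I have not verified against Google's actual response, is that the page was served in a different charset (ISO-8859-1, for example) than the one WebClient decoded it with. The sketch below is only a diagnostic aid under that assumption; EncodingCheck, CodeUnits and Decode are hypothetical names of my own, not part of any library.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical diagnostic helpers (names are illustrative, not from the question).
public static class EncodingCheck
{
    // Lists each UTF-16 code unit of a string as U+XXXX, so blank-looking or
    // substituted characters such as U+FFFD become easy to spot.
    public static string CodeUnits(string s)
    {
        return string.Join(" ", s.Select(c => "U+" + ((int)c).ToString("X4")).ToArray());
    }

    // Decodes raw bytes with the named charset, e.g. "utf-8" or "iso-8859-1".
    // Raw bytes could come from WebClient.DownloadData instead of DownloadString.
    public static string Decode(byte[] raw, string charsetName)
    {
        return Encoding.GetEncoding(charsetName).GetString(raw);
    }
}
```

For example, the bytes 0x43 0xF3 0x6D 0x6F are "Cómo" in ISO-8859-1. Passing them to Decode with "utf-8" gives a string that contains U+FFFD, while "iso-8859-1" gives "Cómo" back. If CodeUnits shows U+FFFD in your translated string, the fix belongs in how the response is decoded, not in anything you do with the characters afterwards.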
Simon Mourier answered Oct 19 '22 01:10
