I run an educational website that teaches programming to kids (12-15 years old).
Since they don't all speak English, the source code of our solutions uses French variable and function names. However, we are planning to translate the content into other languages (German, Spanish, English). To do so, I would like to translate the source code as quickly as possible. We mostly have C/C++ code.
The solution I'm planning to use :
Is there already an open-source project that can do this? (For points 1, 2 and 4)
If there isn't, the most difficult point is the first one: using a C/C++ parser to build a syntax tree and then extracting the variables with their positions seems the way to go. Do you have other ideas?
Thank you for any advice.
Edit: As noted in a comment, I will also need to take care of the comments, but there are only a few of them: the complete solution is already explained in plain text, and then we show the source code with self-explanatory variable/function names. The source code is rarely more than 30-40 lines long, and good names should make it understandable without comments if you already know what the code is doing.
Additional info: for those interested, the website is a training platform for the International Olympiad in Informatics, and C/C++ (at least the minimum needed for programming contests) is not so difficult for a 12-year-old to learn.
Compilers convert one programming language into another. Usually, compilers are used to convert code so the machine can understand it. If we want the output to be human-readable, we need a subset of compilers called transpilers. Transpilers also convert code; however, their output is generally understandable by a human.
Yes, it is possible to translate programming languages. You can convert the source code from one language into a code in a different language. Interpreting a programming language, however, is unnecessary and not possible at present.
The program (source code) must be translated into machine language so that the computer can execute the program (as the computer only understands machine language). The way that this translation occurs depends on whether the programming language is a compiled language or an interpreted language.
Try putting the code directly into Google Translate. It does a pretty good job of translating only the words. The things it does "accidentally" translate could be dealt with by running the code through something that replaces them with known substitutes.
Translating from one language to another is definitely possible, and this is literally all a compiler is doing. The language that a compiler spits out as output is generally machine code or assembly, but this is just another language, and there are compilers (sometimes called transpilers or transcompilers) which translate between two languages.
When developing an application with only one language in mind, it's common practice to put the text directly in the source code as it will appear to the end user. Let's take an HTML element with the text "Confirm password" as an example. Even if you're using a templating language, it'll likely look like this in the source code:
If a language is Turing Complete, then you have: So to translate from language A to language B, you convert the A code into a Turing Machine, then convert that machine into B code. Of course, in practice, the practical bits get in the way, and this also requires you having the translations accessible to you.
Are you sure you need a full syntax tree for this? I think it would be enough to do lexical analysis to find the identifiers, which is much easier. Then exclude keywords and identifiers that also appear in the header files being included.
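A minimal sketch of that lexical pass, assuming ASCII identifiers. The function name collectIdentifiers, the abbreviated keyword list and the exclude parameter (standing in for the names found in the included headers) are all hypothetical. String literals and comments are not skipped here, so words inside them would be reported too.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical helper: returns every identifier in `src` that is neither a
// keyword (list abbreviated) nor listed in `exclude` (e.g. names declared in
// the included headers, gathered separately by the caller).
std::set<std::string> collectIdentifiers(const std::string &src,
                                         const std::set<std::string> &exclude)
{
    static const std::set<std::string> keywords = {
        "if", "else", "for", "while", "return", "int", "void", "char", "bool"
        /* rest of them */};

    auto isWordChar = [](char ch) {
        return std::isalnum(static_cast<unsigned char>(ch)) || ch == '_';
    };

    std::set<std::string> found;
    std::size_t i = 0;
    while (i < src.size())
    {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isalpha(c) || c == '_')
        {
            // Start of an identifier: consume letters, digits and underscores.
            std::size_t start = i;
            while (i < src.size() && isWordChar(src[i]))
                ++i;
            std::string word = src.substr(start, i - start);
            if (!keywords.count(word) && !exclude.count(word))
                found.insert(word);
        }
        else if (std::isdigit(c))
        {
            // Skip a whole number including any suffix, so that the "u" in
            // 10u is not mistaken for a name.
            while (i < src.size() && (isWordChar(src[i]) || src[i] == '.'))
                ++i;
        }
        else
        {
            ++i; // punctuation or whitespace
        }
    }
    return found;
}
```

The resulting set is what would be handed to the translators; it contains each name once, however often it occurs in the source.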
In principle it is possible that you want different variables with the same English name to be translated to different words in French/German -- but for educational use the risk of this arising is probably small enough to ignore at first. You could sidestep the issue by writing the original sources with some disambiguating quasi-Hungarian prefixes and then remove these with the same translation mechanism for display to English-speaking end users.
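One way to picture the prefix idea is a small lookup table keyed by the prefixed source name. The table layout, the prefixes qty_ and seq_, the language codes and the function displayName are all made up for this illustration; two source names can share one English display name while mapping to different French words.

```cpp
#include <map>
#include <string>

// Hypothetical table: prefixed source name -> language code -> displayed name.
using Table = std::map<std::string, std::map<std::string, std::string>>;

// Returns the name to display for `lang`, or the source name unchanged when
// no translation is known for it.
std::string displayName(const Table &table, const std::string &name,
                        const std::string &lang)
{
    auto entry = table.find(name);
    if (entry == table.end())
        return name;
    auto tr = entry->second.find(lang);
    return tr == entry->second.end() ? name : tr->second;
}

// Example data: both names are shown as "order" to English readers, but the
// prefix lets each one get its own French word.
Table exampleTable()
{
    return {
        {"qty_order", {{"en", "order"}, {"fr", "commande"}}},
        {"seq_order", {{"en", "order"}, {"fr", "ordre"}}},
    };
}
```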
Be sure to let translators see the name they are translating with full context before they choose a translation.
I really think you can use clang (libclang) to parse your sources and do what you want (see here for more information). The good news is that it has Python bindings, which will make your life easier if you want to access a translation service or something like that.
You don't really need a C/C++ parser, just a simple lexer that gives you the elements of the code one by one. Then you get a lot of {, [, 213, ) etc. that you simply ignore and write to the result file. You translate whatever consists of only letters (except keywords) and you put the result in the output.
Now that I think about it, it's as simple as this:
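#include <iostream>
#include <string>
using namespace std;

// Supplied elsewhere, e.g. a dictionary lookup.
string translate(const string &name);

// Only plain letters count, so a name containing digits or underscores
// is split into several pieces.
bool is_letter(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

bool is_keyword(const string &s)
{
    return s == "if" || s == "else" || s == "void" /* rest of them */;
}

void translateCode(istream &in, ostream &out)
{
    char c;
    // in.get(c) fails at end of input, so no stray character is written.
    while (in.get(c))
    {
        if (is_letter(c))
        {
            string name;
            do
            {
                name += c;
            } while (in.get(c) && is_letter(c));
            if (is_keyword(name))
                out << name;
            else
                out << translate(name);
            if (!in)
                break; // input ended right after the name
        }
        out << c; // even if is_letter(c) was true, there is a new c from the
                  // while inside that was read (which was not a letter), but
                  // not written, so it is written here.
    }
}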
I wrote the code in the editor, so there may be minor errors. Tell me if there are any and I'll fix it.
Edit: Explanation:
What the code does is simply read the input character by character, outputting whatever non-letter characters it reads (including spaces, tabs and new lines). If it does see a letter, though, it starts putting all the following letters into one string (until it reaches a non-letter). Then, if the string is a keyword, it outputs the keyword itself. If it is not, it translates the string and outputs the translation.
The output would have the exact same format as the input.
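A variant of the same idea that leaves string literals, character literals and comments untouched could look like the sketch below. This is not the code above: the function translateSource and its dictionary parameter are hypothetical, identifiers are assumed to be ASCII letters, digits and underscores, and raw string literals are not handled. Words that are not in the dictionary, keywords included, are copied unchanged, so no keyword list is needed.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical variant working on strings: words found in `dict` are
// replaced, everything else is copied as is.
std::string translateSource(const std::string &src,
                            const std::map<std::string, std::string> &dict)
{
    auto isWordChar = [](char ch) {
        return std::isalnum(static_cast<unsigned char>(ch)) || ch == '_';
    };

    std::string out;
    const std::size_t n = src.size();
    std::size_t i = 0;
    while (i < n)
    {
        const char c = src[i];
        std::size_t j = i; // one past the end of the piece starting at i
        bool isName = false;

        if (c == '"' || c == '\'')
        {
            // String or character literal: find the closing quote,
            // stepping over escaped characters.
            j = i + 1;
            while (j < n && src[j] != c)
                j += (src[j] == '\\') ? 2 : 1;
            j = (j < n) ? j + 1 : n;
        }
        else if (c == '/' && i + 1 < n && src[i + 1] == '/')
        {
            // Line comment: runs to the end of the line.
            j = src.find('\n', i);
            if (j == std::string::npos)
                j = n;
        }
        else if (c == '/' && i + 1 < n && src[i + 1] == '*')
        {
            // Block comment: runs to the closing "*/".
            j = src.find("*/", i + 2);
            j = (j == std::string::npos) ? n : j + 2;
        }
        else if (isWordChar(c))
        {
            // Identifier, keyword or number: take the whole run.
            while (j < n && isWordChar(src[j]))
                ++j;
            isName = !std::isdigit(static_cast<unsigned char>(c));
        }
        else
        {
            j = i + 1; // single punctuation or whitespace character
        }

        std::string piece = src.substr(i, j - i);
        if (isName)
        {
            auto it = dict.find(piece);
            if (it != dict.end())
                piece = it->second;
        }
        out += piece;
        i = j;
    }
    return out;
}
```

Comments are copied verbatim on purpose here; as the question notes, they would still need to be translated separately.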