Introduction
Google provides a Translation API that looks promising and useful, so I was enthusiastic when we decided to use it in our Localizer project.
Unfortunately, the API is not well documented, and there is no tutorial that describes how to use it from Delphi 2009, which is what we needed.
I spent some time searching the web and found a few articles on this task, but none of them gave me a working solution.
All the articles I found suggest using the http://google.com/translate URL to access the translation service. This is not quite correct. Firstly, this URL serves end-user requests made from a browser. Its parameters are not documented and can change at any time. Secondly, the response is a regular web page with a lot of unnecessary tags, text, etc., so extracting the result from it is a headache. And, like the URL, the response layout can change in the future (I'd even say it definitely will).
The Solution
Google describes the Translation API and gives another way to solve the task. The correct URL is the following: http://ajax.googleapis.com/ajax/services/language/translate. In this case, the response format is a JSON encoded result with embedded status codes.
All we need to do is build a URL with all the necessary CGI arguments, send an HTTP Referer header that accurately identifies our application (a requirement of Google's terms of use), and process the JSON encoded response.
So far so good. Let’s try to write the Delphi function that translates some input string. We will use the Indy TidHttp component to send the HTTP requests.
As I found after some investigation, the argument part of the URL must be converted to UTF-8 and then percent-encoded. As Google says, “the value of a CGI argument must be properly escaped (e.g., via the functional equivalent of JavaScript's encodeURIComponent() method)”. I tried several standard and third-party URL-encoding functions, but none of them produced what Google expects. The main problem is that the available functions encode the source string character by character, whereas Google expects the string to be encoded byte by byte. So I had to write the function myself.
function URLEncode(const S: RawByteString): RawByteString;
const
  // Bytes copied through unchanged; every other byte becomes %XX
  NoConversion = ['A'..'Z', 'a'..'z', '0'..'9', '*', '@', '.', '_', '-', '/', ':', '=', '?'];
var
  i, idx, len: Integer;

  function DigitToHex(Digit: Integer): AnsiChar;
  begin
    case Digit of
      0..9: Result := AnsiChar(Chr(Digit + Ord('0')));
      10..15: Result := AnsiChar(Chr(Digit - 10 + Ord('A')));
    else
      Result := '0';
    end;
  end; // DigitToHex

begin
  // First pass: compute the length of the encoded string
  len := 0;
  for i := 1 to Length(S) do
    if S[i] in NoConversion then
      len := len + 1
    else
      len := len + 3;
  SetLength(Result, len);
  // Second pass: copy or escape each byte
  idx := 1;
  for i := 1 to Length(S) do
    if S[i] in NoConversion then
    begin
      Result[idx] := S[i];
      idx := idx + 1;
    end
    else
    begin
      Result[idx] := '%';
      Result[idx + 1] := DigitToHex(Ord(S[i]) div 16);
      Result[idx + 2] := DigitToHex(Ord(S[i]) mod 16);
      idx := idx + 3;
    end;
end; // URLEncode
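To make the byte-by-byte point concrete, here is a minimal console sketch. It assumes Delphi 2009 or later, where string is a Unicode string; the program name and the sample character are chosen only for illustration. It converts a single Cyrillic character to UTF-8 and escapes each resulting byte.

```pascal
program Utf8BytesDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  source, encoded: string;
  utf8Text: UTF8String;
  i: Integer;
begin
  // U+0456 is a Cyrillic letter used in Ukrainian; it is given by code point
  // so the example does not depend on the encoding of the source file
  source := #$0456;
  // Same conversion as in the article: Unicode string -> UTF-8 bytes
  utf8Text := UTF8String(source);
  encoded := '';
  // Escape every byte, not every character
  for i := 1 to Length(utf8Text) do
    encoded := encoded + '%' + IntToHex(Ord(utf8Text[i]), 2);
  Writeln(Length(source), ' character, ', Length(utf8Text), ' bytes: ', encoded);
end.
```

Since the UTF-8 form of U+0456 is the two bytes D1 96, the single character should come out as %D1%96. A function that escapes the character code instead of the UTF-8 bytes would produce something different, and that is exactly the mismatch described above.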
The next question is how to extract the translation from the response we get. In our case, the response format is a simple JSON object similar to the snippet shown below:
{
  "responseData" : {
    "translatedText" : the-translated-text
  },
  "responseDetails" : null | string-on-error,
  "responseStatus" : 200 | error-code
}
The best way is to use some library that works with JSON structures. For example, you may download and use the uJson unit.
For demonstration purposes, it will be enough to process the response as a regular string. We need to extract the status of the response (200 = OK), the translation text and the error string if status != 200.
// source       - the string to be translated
// langpair     - the source and target languages in the form 'en|ru'; the list
//                of supported languages and their codes is given in the
//                Translation API description
// resultString - receives the translation
// result       - the error message, if any; an empty result means that
//                the function completed successfully
function googleTranslate(source : string; langpair : string;
  var resultString : string) : string;
var
  url, s, status : String;
  utfs : UTF8String;
  http : TidHttp;
begin
  result := '';
  resultString := '';
  http := TidHttp.Create;
  try
    // convert to UTF-8 first, then percent-encode the bytes
    utfs := UTF8String(source);
    utfs := URLEncode(utfs);
    // langpair is escaped too, because '|' is not a safe URL character
    url := 'http://ajax.googleapis.com/ajax/services/language/translate?v=1.0&q=' +
      String(utfs) + '&langpair=' + String(URLEncode(UTF8String(langpair)));
    // Google's terms of use require a referer that identifies the application
    http.Request.Referer := 'http://oursite.com';
    http.Request.UserAgent := 'Our Application';
    s := http.Get(url);
    // the fixed offsets skip the key and the separator that follows it
    status := Copy(s, pos('"responseStatus":', s)+18, length(s));
    status := Copy(status, 1, pos('}', status)-1);
    if (status = '200') then begin //status is OK
      s := Copy(s, pos('"translatedText":', s)+18, length(s));
      resultString := Copy(s, 1, pos('"}, "responseDetails"', s)-1);
    end
    else begin //an error occurred
      s := Copy(s, pos('"responseDetails":', s)+20, length(s));
      result := Copy(s, 1, pos('", "responseStatus"', s)-1);
    end;
  finally
    http.Free;
  end;
end;
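The fixed offsets in googleTranslate work only while the response keeps exactly the same spacing. As a slightly more tolerant alternative, still without a JSON library, the sketch below looks up a key and reads whatever value follows it. ExtractJsonValue is a hypothetical helper written for this illustration, not part of Indy or uJson, and the sample response is made up to match the layout shown earlier. It assumes a flat response in which the key name does not occur inside a value, and it does not decode \uXXXX or \n escapes.

```pascal
program JsonExtractDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: returns the value that follows "key": in the text.
// Quoted values are returned without the quotes; \" \\ and \/ are unescaped,
// any other escape sequence is copied through unchanged.
function ExtractJsonValue(const json, key: string): string;
var
  p: Integer;
begin
  Result := '';
  p := Pos('"' + key + '"', json);
  if p = 0 then
    Exit;
  p := p + Length(key) + 2; // first position after the closing quote of the key
  // skip the colon and any spaces between the key and its value
  while (p <= Length(json)) and ((json[p] = ' ') or (json[p] = ':')) do
    Inc(p);
  if p > Length(json) then
    Exit;
  if json[p] = '"' then
  begin
    Inc(p); // step over the opening quote
    while (p <= Length(json)) and (json[p] <> '"') do
    begin
      // drop the backslash of the three simple escapes
      if (json[p] = '\') and (p < Length(json)) and
         ((json[p + 1] = '"') or (json[p + 1] = '\') or (json[p + 1] = '/')) then
        Inc(p);
      Result := Result + json[p];
      Inc(p);
    end;
  end
  else
  begin
    // bare token such as 200 or null: read up to the next delimiter
    while (p <= Length(json)) and (json[p] <> ',') and (json[p] <> '}') do
    begin
      Result := Result + json[p];
      Inc(p);
    end;
    Result := Trim(Result);
  end;
end;

const
  // Made-up sample in the documented response layout
  Sample = '{"responseData": {"translatedText":"Hola mundo"}, ' +
           '"responseDetails": null, "responseStatus": 200}';
begin
  Writeln(ExtractJsonValue(Sample, 'responseStatus'));
  Writeln(ExtractJsonValue(Sample, 'translatedText'));
  Writeln(ExtractJsonValue(Sample, 'responseDetails'));
end.
```

With a helper like this, the status check in googleTranslate could compare ExtractJsonValue(s, 'responseStatus') with '200' instead of counting characters. For production code a real JSON parser remains the safer choice.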
Finally, we can try to translate something. Say we want to translate “Hello world!” from English to Ukrainian.
var
  res, strValue : string;
…
  res := googleTranslate('Hello world!', 'en|uk', strValue);
  if (res = '') then //translation is OK
    ShowMessage('Translation: ' + strValue)
  else //error
    ShowMessage('Error: ' + res);
History
- 11th May, 2010: Initial post