Wannacry ransom message was translated using Google translate but with a few changes

in #hacking, 8 years ago (edited)

The Wannacry malware included 28 translations of its ransom message.

You can download the language ransom messages here.

I wanted to figure out how they translated their message into 28 languages.

The answer (of course) is Google translate.

I took the English version of the message and plugged it into Google translate. Most of the translation packs in Wannacry were identical to the English --> [Language] translation that Google translate spit out.

I compared the Wannacry version with the Google translate version using diffchecker.
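The comparison above was done with diffchecker. A rough local equivalent, assuming both versions of a message are already available as plain-text strings, is Python's standard difflib module. The function name and the sample lines are made up for illustration and are not taken from the ransom note.

```python
import difflib

def diff_translations(malware_text: str, google_text: str) -> list:
    """Return unified-diff lines between the malware text and the Google Translate text."""
    return list(difflib.unified_diff(
        malware_text.splitlines(),
        google_text.splitlines(),
        fromfile="malware",            # label for the left-hand side
        tofile="google_translate",     # label for the right-hand side
        lineterm="",                   # inputs have no trailing newlines
    ))

# Invented placeholder text, only to show the shape of the output.
malware_version = "First line of the message\nSecond line, with a comma"
google_version = "First line of the message\nSecond line with a comma"

for line in diff_translations(malware_version, google_version):
    print(line)
```

An empty result means the two versions match line for line, which is what the "identical" languages would produce.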

In case you're wondering how I know it's Google translate and not another translator, I included an example of what happens when you try using Bing translate at the very bottom. It comes out very different.

The Chinese translations were completely different. Maybe this is because I screwed up and don't have the right language packs. Alternatively, it could be because they used a different translation program for Chinese or it could mean one of the people who wrote Wannacry is a Chinese speaker.

A subset included minor differences. (Some are functionally different translations, others are as simple as a comma or hyphen.)
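The three groups (completely different, minor changes, identical) were sorted by eye in this post. A minimal sketch of doing the same sorting automatically, assuming a character-level similarity score is good enough, is shown here. The 0.5 cutoff is an arbitrary assumption, not something measured from the samples.

```python
from difflib import SequenceMatcher

# Hypothetical cutoff: below this similarity the pair counts as "completely different".
DIFFERENT_BELOW = 0.5

def classify(malware_text: str, google_text: str) -> str:
    """Place a malware/Google Translate pair into one of the three groups."""
    if malware_text == google_text:
        return "identical"
    # ratio() is 1.0 for equal strings and 0.0 when nothing matches.
    ratio = SequenceMatcher(None, malware_text, google_text).ratio()
    if ratio < DIFFERENT_BELOW:
        return "completely different"
    return "minor changes"
```

A single comma or hyphen scores very close to 1.0, so it lands in "minor changes" along with the functionally different translations. Telling those two cases apart would still need a human reader.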

Here are the results, ordered from the biggest differences between the malware text and the English --> [Language] Google translation down to identical.
On the left side you will see the version from the malware. On the right side you will see the version from Google translate.
Note that the Wannacry author changed the text between the angle brackets back to English for all of the translations.
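Because the text between angle brackets was switched back to English in every translation, a raw diff flags those spans even when the rest of the message matches. One way to set them aside, sketched here under the assumption that the bracketed spans never nest, is to blank them out on both sides before comparing. The function names and sample strings are invented for illustration.

```python
import re

# Matches one <...> span with no nested angle brackets inside it.
ANGLE_SPAN = re.compile(r"<[^<>]*>")

def mask_angle_spans(text: str) -> str:
    """Replace the contents of every <...> span with an empty pair of brackets."""
    return ANGLE_SPAN.sub("<>", text)

def same_outside_brackets(malware_text: str, google_text: str) -> bool:
    """True when the two texts differ only inside angle brackets, or not at all."""
    return mask_angle_spans(malware_text) == mask_angle_spans(google_text)

# Invented example: only the bracketed label differs between the two versions.
print(same_outside_brackets("Press <Button> to continue.", "Press <Knopf> to continue."))
```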

Completely different

Chinese (simplified)
simp chinese.png

Chinese (traditional)
tradchinese.png

Minor changes
Russian
en russian.png

Portuguese
en portuguese.png

Korean
en korean.png

Czech
english to czech.png

Greek
en greek.png

Finnish
en finnish.png

Romanian
en romanian.png

Polish
en polish.png

Swedish
en swedish.png

Turkish!
en turkish.png

Vietnamese
en vietnamese.png

Indonesian
en indonesian.png

Identical to Google Translate | Images here

Bulgarian
Croatian
Dutch
Filipino
French
Italian
Japanese
Latvian
Norwegian
Slovak
Spanish


Also worth noting, Wannacry was missing translations for 3 of the 10 most common languages: Hindi, Arabic and Bengali.


Here's a sample of what happened when I tried to use Bing to translate English --> Czech.
en czech bing.png


Update: After a brief conversation with @noisy in the comments section, I think a possible explanation for the 'minor differences' is that Google translate may have made minor improvements to some of their translation algorithms between when the hackers translated the messages and when I translated the message.


If you have any questions or comments please post them in the comments section, e-mail me at wh1sks at keemail (dot) me, or DM me on twitter (@steemwh1sks)


Do you think they corrected everything by themselves... or maybe they used some tools/services to do that?

That's what I'm wondering too.

The answer is I don't know but I sent this to some proper linguists so hopefully they can chime in.

BTW, yesterday I found your post on Reddit... I discovered that it was really well received there. Do you think the WikiLeaks community is aware of the possibilities that the Steem blockchain gives to people (like... the censorship-resistant Steem network)?

If you follow the /r/wikileaks 'meta' closely, you'll see that a subset of users and mods (especially /u/kybarnet) are constantly posting about decentralization, ethereum and other blockchain solutions. So they're certainly aware.

The question I asked myself is, "What does it take to get a fat lazy redditor off their fat ass to spend 3 minutes making a new phone-verified Steemit account?"

I decided the best way for me would be to try and post high quality content and goad users into making accounts so that they can argue with me and each other in the comments section.

Then after arguing with me hopefully they'll learn about what distinguishes Steemit from centralized competitors.

I've gotta say, I'm very happy with writing content on steem. The editor is a lot better than Reddit's. Also draft saving is awesome.

I do wish Steemit had the "Big Editor" button that Reddit Enhancement Suite has.

I hope you are aware that you do not have to give Steemit Inc your phone number. There are alternatives (which some people may well appreciate). You can create an account on: https://anon.steem.network/

Why account creation has a cost is well described in this article:
http://bytemaster.github.io/article/2016/02/10/How-to-build-a-decentralized-application-without-fees/

@dantheman wrote it even before Steem and Steemit existed - there is no other blockchain which has no fees!

Didn't know about the anonymous steem accounts! Very useful and I'll definitely pass it along to people I know who may be hesitant to make a PVA!

Thanks for the link on cost of account creation. Sometime in the nearish future (once I'm confident I fully understand Steemit myself) I will write up a post targeted specifically at the /r/wikileaks crowd to try and explain why they should migrate some of their posts here.

P.S. It looks like you speak Polish. Is the minor change between Google translate and the malware sample a correction?

It seems kind of strange to me that they'd slip in a couple of minor corrections... Just something I wasn't quite able to figure out.

Is the minor change between Google translate and the malware sample a correction?

Really minor. Actually, as you can see, only 1 phrase is corrected; 3 weren't translated in the first place. I would not be surprised if those 3 phrases would also be translated by Google Translate, but individually, not together with < and >.

But it is correct in the malware version but not in the Google translate version?

ie: z plików is correct and pliki is incorrect?

If so, I think it kind of implies they knew how Polish nouns work, right? And made 1 correction, then left loads of other errors...

Very strange.