Goal of the project
I wanted to create an SEO-oriented page with reasonable articles. On the page I put some links to referral programs, which would be my source of income.
I want to make my page popular, so I plan to buy backlinks on the Black Hat SEO forum.
I crawled around 15_000 articles from some blog pages. I decided to translate these articles into 11 languages, and later to translate each article back from another language into English, to change the wording a little bit.
I tried to use ChatGPT (GPT-3.5) to reformulate some articles, but that day the OpenAI API worked really badly (slow, with a lot of 5xx errors).
In my case it was ~15_000 articles = 118_173_837 characters to translate. At 20 € per million characters, the overall price would be:
118 million characters * 12 languages * 20 € per million characters = 28_320 €
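The same arithmetic as a small Java sketch, in case you want to plug in your own numbers. The rate of 20 € per million characters is an assumption taken from the figures in this post, and the class and method names are made up for illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class TranslationCost {
    // Assumed price: 20 EUR per 1_000_000 characters (implied by the numbers in this post).
    static final BigDecimal EUR_PER_MILLION_CHARS = new BigDecimal("20");

    // Estimated cost in EUR of translating `characters` characters into `languages` languages.
    static BigDecimal estimateEur(long characters, int languages) {
        return BigDecimal.valueOf(characters)
                .multiply(BigDecimal.valueOf(languages))
                .multiply(EUR_PER_MILLION_CHARS)
                .divide(BigDecimal.valueOf(1_000_000), 2, RoundingMode.HALF_UP);
    }
}
```

Calling `TranslationCost.estimateEur(118_000_000L, 12)` reproduces the rounded figure of 28_320 €. With the exact character count the estimate comes out slightly higher.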
Of course, 28_000 euros was not an option for me. It was time for hacking.
My first attempt was to translate from my laptop using Selenium + Java. This idea failed because of rate limiting per IP.
AWS Lambda with headless Chrome
The second idea was to create an AWS Lambda function with headless Chrome. My local application created SQS messages with articles to translate. Each SQS message triggered an execution of the Lambda.
What went wrong for me:
Articles were too long for a single translation (there is a 5_000-character limit). I needed to divide each article into smaller chunks, translate them, and join them back together.
Translation of long text takes time. DeepL.com takes the context of the whole 5_000 characters into consideration. It was much faster to divide the text into 500-character chunks, but then I was losing that advantage of the DeepL.com solution.
Lambda runs on the same machines all the time. I needed to add some sleeping code so as not to use the same IP too quickly. But let's be honest, AWS bills for the time spent inside AWS Lambda, so this is not good from a financial point of view.
I didn't test my Lambda well enough. I ran it for hundreds of articles in the evening and went to sleep. In the morning I saw that my AWS bill had gone pretty high and most of the articles were not translated. 15 minutes (the maximum run time of AWS Lambda) was not enough to translate a whole article, so the Lambda timed out and the message kept invoking it again and again.
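The chunking step from the list above (divide an article, translate the pieces, join them back) could look roughly like the sketch below. It is not my original code. The class name and the preference for cutting at sentence ends are assumptions for illustration. Each chunk keeps its trailing whitespace, so concatenating the untranslated chunks gives back the original text exactly.

```java
import java.util.ArrayList;
import java.util.List;

class TextChunker {
    // Splits text into chunks of at most maxChars characters.
    static List<String> split(String text, int maxChars) {
        if (maxChars <= 0) {
            throw new IllegalArgumentException("maxChars must be positive");
        }
        List<String> chunks = new ArrayList<>();
        int start = 0;
        while (text.length() - start > maxChars) {
            int cut = findCut(text, start, start + maxChars);
            chunks.add(text.substring(start, cut));
            start = cut;
        }
        if (start < text.length()) {
            chunks.add(text.substring(start));
        }
        return chunks;
    }

    // Picks a cut position in (start, end], preferring sentence ends, then any whitespace.
    private static int findCut(String text, int start, int end) {
        // Prefer cutting right after '.', '!' or '?' followed by whitespace.
        for (int i = end - 2; i >= start; i--) {
            char c = text.charAt(i);
            if ((c == '.' || c == '!' || c == '?') && Character.isWhitespace(text.charAt(i + 1))) {
                return i + 2;
            }
        }
        // Otherwise cut right after the last whitespace in the window.
        for (int i = end - 1; i > start; i--) {
            if (Character.isWhitespace(text.charAt(i))) {
                return i + 1;
            }
        }
        // No whitespace at all: hard cut at the limit.
        return end;
    }
}
```

With `TextChunker.split(article, 5_000)` every chunk fits the limit. A translation service may trim the trailing whitespace of each chunk, so when joining translated chunks you may need to put a space back between them.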
Batch translate + VPN
After some attempts, I decided to find a different solution.
I realized that you can translate 3 documents per month from 1 IP address. Each document can be up to 100_000 characters long.
I used the Java Apache POI library to build DOCX documents. In each document, I was able to put 8 articles. I used Selenium to upload each document and download the translated version.
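Grouping articles so that each document stays under the 100_000-character limit can be sketched as below. This only covers the grouping. Writing the DOCX with Apache POI and the Selenium upload are left out, and the class name and the greedy in-order strategy are assumptions for illustration, not necessarily how my code did it.

```java
import java.util.ArrayList;
import java.util.List;

class DocumentPacker {
    // Greedily groups articles, in order, into documents of at most maxCharsPerDocument characters.
    static List<List<String>> pack(List<String> articles, int maxCharsPerDocument) {
        List<List<String>> documents = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int currentSize = 0;
        for (String article : articles) {
            if (article.length() > maxCharsPerDocument) {
                // Such an article would have to be split before packing.
                throw new IllegalArgumentException("article exceeds the per-document limit");
            }
            if (currentSize + article.length() > maxCharsPerDocument) {
                // The current document is full: close it and start a new one.
                documents.add(current);
                current = new ArrayList<>();
                currentSize = 0;
            }
            current.add(article);
            currentSize += article.length();
        }
        if (!current.isEmpty()) {
            documents.add(current);
        }
        return documents;
    }
}
```

`DocumentPacker.pack(articles, 100_000)` returns one list of articles per document. The sketch counts only article text, so any separators or headings added between articles in the DOCX would need some headroom below the limit.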
To change IP:
I bought Proton VPN for $10 per month.
I read the manual on how to connect to Proton VPN from the command line.
After every 3 document translations, I simply executed a shell script from Java to reconnect and get a new IP.
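The IP-switching steps above can be sketched like this: count the translated documents and run a script once the quota for the current IP is used up. The class name and the script path in the usage note are hypothetical, and the content of the VPN script itself is not shown here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

class IpRotator {
    private final int documentsPerIp;
    private final Runnable switchIp;
    private int translatedOnCurrentIp = 0;

    IpRotator(int documentsPerIp, Runnable switchIp) {
        if (documentsPerIp <= 0) {
            throw new IllegalArgumentException("documentsPerIp must be positive");
        }
        this.documentsPerIp = documentsPerIp;
        this.switchIp = switchIp;
    }

    // Call after each translated document; switches the IP once the quota is used up.
    void documentTranslated() {
        translatedOnCurrentIp++;
        if (translatedOnCurrentIp >= documentsPerIp) {
            switchIp.run();
            translatedOnCurrentIp = 0;
        }
    }

    // Returns an action that runs the given shell script with `sh` and waits for it to finish.
    static Runnable shellScript(String scriptPath) {
        return () -> {
            try {
                Process process = new ProcessBuilder("sh", scriptPath).inheritIO().start();
                int exitCode = process.waitFor();
                if (exitCode != 0) {
                    throw new IllegalStateException("VPN script failed with exit code " + exitCode);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while switching IP", e);
            }
        };
    }
}
```

Usage would be something like `new IpRotator(3, IpRotator.shellScript("./switch-vpn.sh"))`, with `documentTranslated()` called after each download. Passing the switch action as a `Runnable` keeps the counting logic testable without a real VPN.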
It was enough for me to translate around 118 million characters into 12 languages = 118 million * 12 ≈ 1.4 billion characters.
The speed was around 2 documents (~100_000 characters) per minute. It ran on my local computer at night for around 3 weeks. I could have deployed it to a server, but time was not important to me.
I found a pretty effective way of using deepl.com for free. I could continue to optimize it, but for me, it was good enough.
How to use it?
I have a hammer in my hand = I can translate texts on a pretty massive scale. I used it for SEO content generation.
The question is: how to earn money on text translation?
Let me know in the comments, what you think about this topic!
You can also write a direct message to me: email@example.com