An AI translation, for instance, may be nothing like a human-crafted translation. Instead of an “authoritative” but derivative literary text that encapsulates a single subjectivity’s experience of the original text, an AI translation may be more like an interactive experience through which the reader excavates her own reading by interrogating an infinite number of synthesized translator subjectivities to fully experience the potential inherent in the original. It will not replace a human translation; it will be something completely different.
—Ken Liu, “The cinematograph, the ‘noematograph,’ and the future of AI art”
A couple of posts ago, I speculated about the potential of AI-assisted translation, with a focus on the fact that monolingual readers can now engage with foreign texts in a way that has not previously been possible. The central limitation of older models like Google Translate is their determinism: every input must produce the same output.
But with LLMs, it is possible to inject context to get a better interpretation of a foreign text. Not only can the LLM use context to produce a higher-fidelity result, but the user themselves can also ask questions of the text, using context to understand the translation and its potential inaccuracies in a substantive way.
I proposed that this opens up the possibility of pre-translation: that is, using LLMs to excavate meaning from the foreign corpus, both for personal cultural understanding and as preparation for more substantial future translation efforts. If the activity of translation were made more accessible by pre-translation, this could allow for greater cultural understanding, more interest in foreign languages, and story collections with truly global scope. It would also be a great way to get more eyes on foreign corpuses in rarer languages that currently go unnoticed.
One comment on the article mentioned Ken Liu’s essay “The cinematograph, the ‘noematograph,’ and the future of AI art”, which was a great articulation of the future possibilities I see in this technology. It emphasizes that the potential of LLMs lies not necessarily in their output but even more so in the process of using them and how that might change the user’s perspective. Liu describes a reading experience that is almost museum-like, one in which readers can participate in a semantic “excavation” of the text. It reflected a realization I have been tossing around lately: learning a language could potentially be far more interesting than just the communication it affords.
While the post got generally positive feedback, one of our longtime readers had some pretty strong private objections to the post, which I thought were worth sharing here. Their central objection was:
A translation performed by someone who cannot read the source material is more likely to be low quality because such a person cannot assess the accuracy of the translation, only the quality of the output in their native language. An increase in the publishing of these translations will have a negative effect on readers by diluting the market with potentially inaccurate, low quality translations. This post is bad insofar as it praises the quality of translations performed by people who cannot read the source material.
They emphasized that the “translation gap” in literature is driven not just by an absence of capable bilingual readers but also by a utilitarian economy that prevents many who would like to spend their time translating literature from doing it full-time. They went on to describe how this dynamic might be exacerbated by LLMs, which could kill the demand for high-quality translations when mediocre automated ones would suffice. Ultimately, they argued that the post was irresponsible in not fully considering its potential adverse effects before endorsing this practice.
I will concede that my post was written in the heat of personal inspiration, and the article would have been stronger with more investigation of how LLMs have been affecting the industry. I regret certain choices that I made in the original post, and in particular I should have taken a less dismissive tone towards the current translation landscape.1 Still, I maintain that the post was not arguing for the replacement of human translators with AI. Rather, I was arguing that LLMs could create a valuable new step in the translation pipeline that could expand the canon for global audiences.
Perhaps the biggest substantive disagreement between the reader and me is that we have very different intuitions about what effect LLM translations, and mediocre translations in general, will have on the translation market. The reader argued that an influx of monolingual translations will be bad for world literature by reducing the incentive for humans to translate literature professionally:
“A bad translation is worse than no translation because it’s going to block the way to a good translation being produced,” said U.C. Berkeley French professor Liesl Yamaguchi, who translated Väinö Linna’s classic 1954 World War II novel “The Unknown Soldier” from Finnish to English. “It took the rights, it took the resources and people say, ‘Oh, it exists. It’s accessible. So we don’t need to do a good version.’”
“That effectively kills the work in the target language,” Yamaguchi added. “That’s an extremely cruel and unfortunate thing to do to lesser-translated languages and literatures.”
—Anne Li, The Markup
I can understand the sentiment behind this claim, and I can certainly imagine it being more true in the pre-digital era. However, I would argue that it is not so simple; while a bad translation may decrease the incentive for a good translation, nothing brings awareness to the importance of translation like a bad translation. LLMs might decrease the quality of the average foreign translation, but among serious readers of literature, I would expect a proliferation of mediocre translations to create much more vibrant discourse about the authoritative version of the text. I doubt serious readers would be satisfied with the top translation of a famous work being the product of an LLM.
I doubt that AI translation would, in the long run, lead to more situations where “high literature” is left as a mindless LLM translation when it would otherwise have been translated by a human. Obviously this is pure speculation, but my intuition is that LLM-assisted translation may in fact lower the average quality of translation while producing higher highs and greater accessibility in return.
Anecdotally, I didn’t think very much about translations until hearing about divisive translations. I remember the Three-Body Problem being a subject of debate on the UW campus, with Chinese students and bilingual Americans alike commenting on how it was a relatively activist translation (some good, some bad). And one of the things that piqued my interest in translation was reading about how Bett and Boyd’s translation of Breasts and Eggs by Mieko Kawakami is sometimes criticized for flattening out the Osaka-ben dialect that is used in the original text. If translations were less divisive in general, then I would expect to have a lot less exposure to the subject.
These experiences have led me to think that it would in fact be preferable if non-speakers could democratically object to choices made by bilingual translators as a result of being in conversation with the text. This could lead to more informed choices about which version of the text is most “true”, and perhaps even composite translations that suit the personal tastes of an individual reader.
The reader also took issue with my admittedly provocative statement “a professional-quality translation isn’t necessarily needed to curate foreign texts”:
While this is technically true, the burden of proof is on whether LLM translation is sufficient to curate foreign texts well, as in equal to or better than texts curated by readers of the original language, or to justify the time spent as opposed to curating in your native language. This case seems to me to prove the opposite, in that the text you curated was by your own admission (and in my opinion) not very good. You have not made a compelling case that the interim step of LLM-translation by monolingual readers would lead to a higher quality curation of foreign texts than, say, going onto Douban to find highly acclaimed works of science fiction and fantasy and commission professional translations directly. I would also argue that the Douban users are the ones really doing the curating in either case.
To clarify my position, the goal here would not be to produce a foreign collection that is “authoritative” (e.g. China’s Best Science Fiction); rather, it would be to curate texts with global scope guided by a single editorial vision. The intent is not objectivity but subjectivity: a noematograph that reflects personal tastes rather than critical acclaim in the texts’ respective countries. The benefits of this approach may be less evident when applied to relatively well-translated languages like Chinese and Japanese. But for smaller languages, even a few monolingual translations sprinkled into a wider collection intuitively feel like they would be very helpful for driving interest in the genre.
Ideally, yes, it would be preferable if there were always enough professionally translated raw material to accomplish such tasks, or if anyone could simply commission professional translations on a whim. But there are only a small handful of genres in which there is enough translated raw material to do any sort of substantial curation that isn’t just a simple rehash of the tastes of the most prominent translators. I don’t expect this central dynamic to change in the near future, so I don’t really see any alternative situation that would achieve a better outcome.
I would agree that there is more apparent value in going onto a discussion site like Douban and “pre-translating” criticism rather than the works themselves; in fact, enabling readers to browse online foreign literary discussion feels like an even more useful application of LLMs than literary translation. However, I would argue both that this still counts as “pre-translation” and that it is made richer by an ability to excavate the original text. It would be difficult to contextualize such conversations without having a rudimentary ability to access the subject of discussion.
No matter how deep the world of professional translation is, there will always be a bottleneck of taste in terms of who actually has the ability to translate. I went on to argue that:
In general, I do actually think it would be good if there were like 100x more people involved in translating literature, even if they don’t necessarily speak both languages, and even if this results in a canon that is in some ways lower-quality. I think it is ideal that readers now have the option to engage with the original text themselves rather than submit themselves to the translator’s vision, even if that vision is quite high-quality. There will be a lot more people reading lower-quality translations in the world where this becomes mainstream; I will concede that. But I ultimately think that this is a fair price to pay for a world that is much more culturally connected and aware of the barriers between cultures, rather than a world in which most of this knowledge is reserved solely for bilingual readers. But I also recognize that this is a more controversial view that may not be shared by most people, esp translators.
I would like to acknowledge here that I have been taking a fairly populist position on translation for most of this post. But in fairness to the reader, there are some respects in which gatekeeping translation is good. People who translate world literature are often scholars who have spent their entire lives immersing themselves in a foreign culture, and it is in some way beautiful that they are able to be the ones to present that culture to a global audience as a result. A full democratization of the translation process will make it so that is no longer the case.
When this slice of culture is no longer selected to optimize for representation of that culture, people abroad will likely get a bit of a perspective distorted towards what sells in America. There will probably be, for example, an increased availability of mediocre xianxia that reflect Western sensibilities but don’t necessarily contribute to a greater cultural understanding of China. It is somewhat sad that more artistic translators may not be able to enjoy the same kind of respect that they might have enjoyed in past eras, and they won’t be able to ensure that people reading the literature of another country automatically reach some baseline level of cultural understanding. But my position is that the increased accessibility and engagement will make this worth it.
The reader’s response highlights the importance that they place on cultural understanding:
I think we have similar values here in that I also want for the world whatever will lead to the most cross-cultural understanding. My opinion is that high quality translations lead to more people reading literature from other cultures, and that most people prefer reading a high quality translation to translating themselves with an LLM. I think that reading translated material from other cultures is a good way to learn about them, and this knowledge is not exclusive to bilingual readers.
I think that two different kinds of cultural knowledge are being discussed here. The first is the knowledge of cultures that can be gained from reading about them in a reader’s native language. But there is a deeper and qualitatively different understanding to be gained from engaging with foreign texts directly, just as reading different accounts of history is different from reading even the most high-quality interpretations. This knowledge is in fact exclusive to bilingual readers in the current landscape, and I do think we would be culturally better off if this weren’t the case.
I suspect that I am comparatively more optimistic about the ability of LLMs to drive interest in translation and comparatively more pessimistic about the ability of foreign texts to drive interest in a foreign language. Anecdotally, I feel that very well-regarded foreign literature in most cases isn’t treated all that differently from a high-quality text in English. Obviously the most motivated readers will be interested in the way in which the cultural context influenced the text, but for most of my literature-loving peers, reading these texts didn’t drive much of an interest in the language itself.
In the end, we are imagining very different translation futures, both of which are certainly possible. The LLM-assisted paradigm could be worse and more smoothed over than the existing one; we could simply get worse translations and there wouldn’t be enough interest to drive substantially greater activity. But I imagine a paradigm where it is actually quite common for fans of literature to excavate foreign texts.
I’d like to close with a brief vision of what I think the ideal future of AI translation looks like. In a similar vein to the sakugAI post, I want to see a future where human translation is rightly recognized as an artisan product. Most foreign texts (technical manuals, online forum discussions, most web novels) will be translated by LLMs, but human translators will be more respected as craftsmen and true artists. There will be a far more legible pipeline by which a work can rise to prominence and receive the “human translation treatment”, and this will be a result of the Internet becoming far more multilingual in general.
I am particularly drawn to a world in which the foreign Internet is more accessible, a world where it is quite normal to hop over to a French message board and get an embedded LLM to clarify the topics of discussion. You might be able to set custom flags as to how you want certain connotations to be translated in your browser, and maybe you could even specify preferences as to how you would want your messages to appear in other languages. And while there will obviously be nuance lost even in the best case, it would still be quite utopian if the full diversity of written thought was accessible from the comfort of your own home.
The most resonant criticism of the original post was that I claimed “Obviously AI-assisted translation doesn’t have much practical purpose for actual publishing”, but then in the same breath I also referenced an AI translation that I literally posted publicly in this very magazine. While I did disclose that the post was AI-assisted, I admit that the original post does not do enough to clarify the context or intent. This has since been amended.
Our reader also mentioned data pollution and the possibility of LLMs not just stagnating but even becoming worse at translation due to all the LLM-generated translated text entering the corpus. I don’t really think this is that much of a concern at this stage due to the more deterministic properties of language, but I also feel that the absolute accuracy of LLM translation is less important than the ability to interrogate it and recursively find potential inaccuracies. I also feel that benchmarking against Google Translate would be able to avoid hallucinations in the worst case, which seems to me to be the most important concern. In any case, the most direct harm seems to be the negative effect on LLM translation quality, which doesn’t seem to me to be such a bad thing.
Perhaps the canniest pronouncement by an author in my lifetime was when García Márquez said 100 Years of Solitude was better in the Rabassa translation into English. I wonder if he actually believed it.