Issue with format %s using a width in several languages #203

Open
CarlosJimenez opened this issue Jan 20, 2021 · 8 comments

CarlosJimenez commented Jan 20, 2021

Alexei, first of all, thank you very much for your impressive work. I've been using sprintf-js for years.
But I found an issue that I don't know how to fix. Depending on the UTF-8 characters used, the width modifier in %s does not keep the columns aligned.

I use sprintf-js instead of `${expression}` because I can specify the width of a %s.
I use UTF-8 characters. I attached a JSON file with the string "To view this app you should enable Javascript" translated into 100 languages.

I use the following line with a %-100s width modifier:

sprintf("case %-7s: return %-100s // %-11s %s\n",`'${lang}'`,'`'+json[lang]+'`;',xxxx,name);

The spacing is not respected in some of the languages, as you can see in the output below (the comment on each line is badly aligned).
translate.json.txt

Is there any way to keep the spacing correct regardless of the language?

case 'af'   : return `Om hierdie inligting te sien, moet u Javascript aktiveer`;                                          // af-ZA   afrikáans
case 'am'   : return `ይህንን መተግበሪያ ለመመልከት ጃቫ ስክሪፕትን ማንቃት አለብዎት`;                                                           // am-ET   amhárico
case 'ar'   : return `لعرض هذا التطبيق ، يجب تمكين Javascript`;                                                           // ar-EG   árabe
case 'az'   : return `Bu tətbiqə baxmaq üçün Javascript'i aktivləşdirməlisiniz`;                                          // az-AZ   azerí
case 'be'   : return `Для прагляду гэтага прыкладання вы павінны ўключыць Javascript`;                                    // be-BY   bielorruso
case 'bg'   : return `За да видите това приложение, трябва да активирате Javascript`;                                     // bg-BG   búlgaro
case 'bn'   : return `এই অ্যাপ্লিকেশনটি দেখতে আপনার জাভাস্ক্রিপ্ট সক্ষম করা উচিত`;                                        // bn-BD   bengalí
case 'bs'   : return `Za prikaz ove aplikacije trebali biste omogućiti Javascript`;                                       // bs-ES   bosnio
case 'ca'   : return `Per veure aquesta aplicació, heu d’habilitar Javascript`;                                           // ca-ES   catalán
case 'ceb'  : return `Aron matan-aw kini nga app kinahanglan nimo nga mapagana ang Javascript`;                           // ceb-latn-es cebuano
case 'co'   : return `Per vede sta app devi attivà Javascript`;                                                           // co-ES   corso
case 'cs'   : return `Chcete-li zobrazit tuto aplikaci, měli byste povolit Javascript`;                                   // cs-CZ   checo
case 'cy'   : return `I weld yr app hon dylech alluogi Javascript`;                                                       // cy-CY   galés
case 'da'   : return `For at se denne app skal du aktivere Javascript`;                                                   // da-DK   danés
case 'de'   : return `Um diese App anzuzeigen, sollten Sie Javascript aktivieren`;                                        // de-DE   alemán
case 'el'   : return `Για να δείτε αυτήν την εφαρμογή θα πρέπει να ενεργοποιήσετε τη Javascript`;                         // el-GR   griego
case 'en'   : return `To view this app you should enable Javascript`;                                                     // en-US   inglés
case 'eo'   : return `Por vidi ĉi tiun programon vi devas ebligi Ĝavaskripton`;                                           // eo-EO   esperanto
case 'es'   : return `Para ver esta aplicación, debe habilitar Javascript`;                                               // es-US   español
case 'et'   : return `Selle rakenduse vaatamiseks peaksite lubama Javascripti`;                                           // et-EE   estonio
case 'eu'   : return `Aplikazio hau ikusteko Javascript gaitu beharko zenuke`;                                            // eu-ES   euskera
case 'fa'   : return `برای مشاهده این برنامه باید Javascript را فعال کنید`;                                               // fa-IR   persa
case 'fi'   : return `Jos haluat tarkastella tätä sovellusta, ota Javascript käyttöön`;                                   // fi-FI   finlandés
case 'fr'   : return `Pour voir cette application, vous devez activer Javascript`;                                        // fr-FR   francés
case 'fy'   : return `Om dizze app te besjen moatte jo Javascript ynskeakelje`;                                           // undefined frisio
case 'ga'   : return `Chun an aip seo a fheiceáil ba cheart duit Javascript a chumasú`;                                   // ga-IE   irlandés
case 'gd'   : return `Gus an aplacaid seo fhaicinn bu chòir dhut Javascript a chomasachadh`;                              // gd-GD   gaélico escocés
case 'gl'   : return `Para ver esta aplicación debes habilitar Javascript`;                                               // gl-ES   gallego
case 'gu'   : return `આ એપ્લિકેશન જોવા માટે તમારે જાવાસ્ક્રિપ્ટને સક્ષમ કરવી જોઈએ`;                                       // gu-IN   gujarati
case 'ha'   : return `Don duba wannan app ya kamata a kunna Javascript`;                                                  // ha-ES   hausa
case 'haw'  : return `E ʻike ai i kēia polokalamu pono ʻoe e hoʻohana iā Javascript`;                                     // haw-ES  hawaiano
case 'he'   : return `כדי להציג אפליקציה זו עליך להפעיל את Javascript`;                                                   // undefined undefined
case 'hi'   : return `इस एप्लिकेशन को देखने के लिए आपको जावास्क्रिप्ट सक्षम करना चाहिए`;                                  // hi-IN   hindi
case 'hmn'  : return `Txhawm rau saib cov app no koj yuav tsum tau pab Javascript`;                                       // undefined hmong
case 'hr'   : return `Za prikaz ove aplikacije trebali biste omogućiti Javascript`;                                       // hr-HR   croata
case 'ht'   : return `Pou wè app sa a ou ta dwe pèmèt JavaScript`;                                                        // ht-ES   criollo haitiano
case 'hu'   : return `Az alkalmazás megtekintéséhez engedélyeznie kell a Javascript alkalmazást`;                         // hu-HU   húngaro
case 'hy'   : return `Այս ծրագիրը դիտելու համար անհրաժեշտ է միացնել Javascript- ը`;                                       // hy-AM   armenio
case 'id'   : return `Untuk melihat aplikasi ini, Anda harus mengaktifkan Javascript`;                                    // id-ID   indonesio
case 'ig'   : return `Iji lelee ngwa a ị kwesịrị ịkwalite Javascript`;                                                    // ig-ES   igbo
case 'is'   : return `Til að skoða þetta forrit ættirðu að virkja Javascript`;                                            // is-IS   islandés
case 'it'   : return `Per visualizzare questa app devi abilitare Javascript`;                                             // it-IT   italiano
case 'iw'   : return `כדי להציג אפליקציה זו עליך להפעיל את Javascript`;                                                   // he-IL   hebreo
case 'ja'   : return `このアプリを表示するには、Javascriptを有効にする必要があります`;                                                              // ja-JP   japonés
case 'jw'   : return `Kanggo ndeleng aplikasi iki, sampeyan kudu ngaktifake Javascript`;                                  // jv-ID   javanés
case 'ka'   : return `ამ აპლიკაციის სანახავად უნდა ჩართოთ Javascript`;                                                    // ka-GE   georgiano
case 'kk'   : return `Бұл қолданбаны көру үшін Javascript қосылу керек`;                                                  // kk-KK   kazajo
case 'km'   : return `ដើម្បីមើលកម្មវិធីនេះអ្នកគួរតែបើកដំណើរការ Javascript`;                                               // km-KH   camboyano
case 'kn'   : return `ಈ ಅಪ್ಲಿಕೇಶನ್ ವೀಕ್ಷಿಸಲು ನೀವು ಜಾವಾಸ್ಕ್ರಿಪ್ಟ್ ಅನ್ನು ಸಕ್ರಿಯಗೊಳಿಸಬೇಕು`;                                  // kn-IN   canarés
case 'ko'   : return `이 앱을 보려면 자바 스크립트를 활성화해야합니다.`;                                                                       // ko-KR   coreano
case 'ku'   : return `Ji bo dîtina vê sepanê divê hûn Javascript-ê çalak bikin`;                                          // undefined kurdo
case 'ky'   : return `Бул колдонмону көрүү үчүн Javascriptти күйгүзүшүңүз керек`;                                         // ky-KY   kirguís
case 'la'   : return `Prospicere`;                                                                                        // la-LA   latín
case 'lb'   : return `Fir dës App ze gesinn sollt Dir Javascript aktivéieren`;                                            // lb-ES   luxemburgués
case 'lo'   : return `ເພື່ອເບິ່ງແອັບ this ນີ້ທ່ານຄວນເປີດໃຊ້ Javascript`;                                                  // lo-LA   lao
case 'lt'   : return `Norėdami peržiūrėti šią programą, turėtumėte įgalinti „Javascript“`;                                // lt-LT   lituano
case 'lv'   : return `Lai skatītu šo lietotni, iespējojiet Javascript`;                                                   // lv-LV   letón
case 'mg'   : return `Raha hijery an'ity fampiharana ity dia tokony hampidirinao ny Javascript`;                          // mg-MG   malgache
case 'mi'   : return `Hei tiro i tenei taupānga me whakahohe koe i teJavascript`;                                         // mi-ES   maorí
case 'mk'   : return `За да ја видите оваа апликација, треба да овозможите Javascript`;                                   // mk-MK   macedonio
case 'ml'   : return `ഈ അപ്ലിക്കേഷൻ കാണുന്നതിന് നിങ്ങൾ ജാവാസ്ക്രിപ്റ്റ് പ്രവർത്തനക്ഷമമാക്കണം`;                            // ml-IN   malayalam
case 'mn'   : return `Энэ програмыг үзэхийн тулд Javascript-ийг идэвхжүүлэх хэрэгтэй`;                                    // mn-MN   mongol
case 'mr'   : return `हे अॅप पाहण्यासाठी आपण जावास्क्रिप्ट सक्षम करा`;                                                    // mr-IN   maratí
case 'ms'   : return `Untuk melihat aplikasi ini, anda harus mengaktifkan Javascript`;                                    // ms-MY   malayo
case 'mt'   : return `Biex tara din l-app għandek tippermetti Javascript`;                                                // mt-MT   maltés
case 'my'   : return `ဤ app ကိုကြည့်ရှုရန်သင်သည် Javascript ကိုအသုံးပြုသင့်သည်`;                                          // my-MY   birmano
case 'nb'   : return `For å se denne appen, bør du aktivere Javascript`;                                                  // undefined undefined
case 'ne'   : return `यो अनुप्रयोग हेर्नका लागि तपाईंले जाभास्क्रिप्ट सक्षम गर्नुपर्नेछ`;                                 // ne-NP   nepalí
case 'nl'   : return `Om deze app te bekijken, moet u Javascript inschakelen`;                                            // nl-NL   neerlandés
case 'ny'   : return `Kuti muwone pulogalamuyi muyenera kuloleza Javascript`;                                             // ny-ES   chichewa
case 'or'   : return `ଏହି ଆପ୍ ଦେଖିବା ପାଇଁ ଆପଣ ଜାଭାସ୍କ୍ରିପ୍ଟ ସକ୍ଷମ କରିବା ଉଚିତ୍ |`;                                         // undefined oriya
case 'pa'   : return `ਇਸ ਐਪ ਨੂੰ ਵੇਖਣ ਲਈ ਤੁਹਾਨੂੰ ਜਾਵਾਸਕ੍ਰਿਪਟ ਯੋਗ ਕਰਨੀ ਚਾਹੀਦੀ ਹੈ`;                                          // pa-PA   panyabí
case 'pl'   : return `Aby wyświetlić tę aplikację, należy włączyć Javascript`;                                            // pl-PL   polaco
case 'ps'   : return `د دې اپلیکیشن لیدو لپاره تاسو باید جاواسکریپټ وړ کړئ`;                                              // ps-PS   pastún
case 'pt'   : return `Para visualizar este aplicativo, você deve habilitar o Javascript`;                                 // pt-BR   portugués
case 'ro'   : return `Pentru a vizualiza această aplicație ar trebui să activați Javascript`;                             // ro-RO   rumano
case 'ru'   : return `Для просмотра этого приложения вы должны включить Javascript`;                                      // ru-RU   ruso
case 'rw'   : return `Kureba iyi porogaramu ugomba gukora Javascript`;                                                    // undefined kinyarwanda
case 'sd'   : return `هن ايپ کي ڏسڻ لاءِ توهان کي جاوا اسڪرپٽ کي فعال ڪرڻ گهرجي`;                                         // sd-SD   sindhi
case 'si'   : return `මෙම යෙදුම බැලීමට ඔබ ජාවාස්ක්‍රිප්ට් සක්‍රීය කළ යුතුය`;                                              // si-LK   cingalés
case 'sk'   : return `Ak chcete zobraziť túto aplikáciu, musíte povoliť Javascript`;                                      // sk-SK   eslovaco
case 'sl'   : return `Za ogled te aplikacije morate omogočiti Javascript`;                                                // sl-SI   esloveno
case 'sm'   : return `Ina ia vaʻai i lenei app e tatau ona faʻatagaina le Javascript`;                                    // sm-ES   samoano
case 'sn'   : return `Kuti utarise iyi app unofanirwa kugonesa JavaScript`;                                               // sn-ES   shona
case 'so'   : return `Si aad u daawato barnaamijkan waa inaad awood u siisaa Javascript`;                                 // so-SO   somalí
case 'sq'   : return `Për të parë këtë aplikacion duhet të aktivizoni Javascript`;                                        // sq-SQ   albanés
case 'sr'   : return `Да бисте прегледали ову апликацију, требало би да омогућите Јавасцрипт`;                            // sr-RS   serbio
case 'st'   : return `Ho sheba sesebelisoa sena u lokela ho thusa Javascript`;                                            // st-ES   sesoto
case 'su'   : return `Pikeun ningali aplikasi ieu anjeun kedah ngaktipkeun Javascript`;                                   // su-ID   sundanés
case 'sv'   : return `För att se den här appen bör du aktivera Javascript`;                                               // sv-SE   sueco
case 'sw'   : return `Kuangalia programu hii unapaswa kuwezesha Javascript`;                                              // sw      suajili
case 'ta'   : return `இந்த பயன்பாட்டைக் காண நீங்கள் ஜாவாஸ்கிரிப்டை இயக்க வேண்டும்`;                                       // ta-IN   tamil
case 'te'   : return `ఈ అనువర్తనాన్ని చూడటానికి మీరు జావాస్క్రిప్ట్‌ను ప్రారంభించాలి`;                                    // te-IN   telugu
case 'tg'   : return `Барои дидани ин барнома, шумо бояд Javascript -ро фаъол кунед`;                                     // tg-TG   tayiko
case 'th'   : return `ในการดูแอพนี้คุณควรเปิดใช้งาน Javascript`;                                                          // th-TH   tailandés
case 'tk'   : return `Bu programmany görmek üçin Javascript-i işletmeli`;                                                 // undefined turkmeno
case 'tl'   : return `Upang matingnan ang app na ito dapat mong paganahin ang Javascript`;                                // fil-PH  tagalo
case 'tr'   : return `Bu uygulamayı görüntülemek için Javascript'i etkinleştirmelisiniz`;                                 // tr-TR   turco
case 'tt'   : return `Бу кушымтаны карау өчен Javascript кушарга кирәк`;                                                  // undefined tártaro
case 'ug'   : return `بۇ ئەپنى كۆرۈش ئۈچۈن Javascript نى قوزغىتىشىڭىز كېرەك`;                                             // undefined uigur
case 'uk'   : return `Для перегляду цього додатка слід увімкнути Javascript`;                                             // uk-UA   ucraniano
case 'ur'   : return `اس ایپ کو دیکھنے کے ل you آپ کو جاوا اسکرپٹ کو چالو کرنا چاہئے`;                                    // ur-PK   urdu
case 'uz'   : return `Ushbu dasturni ko'rish uchun siz Javascript-ni yoqishingiz kerak`;                                  // uz-UZ   uzbeco
case 'vi'   : return `Để xem ứng dụng này, bạn nên bật Javascript`;                                                       // vi-VN   vietnamita
case 'xh'   : return `Ukujonga le app kuya kufuneka wenze iJavascript`;                                                   // xh-ES   xhosa
case 'yi'   : return `צו זען דעם אַפּ איר זאָל געבן דזשאַוואַסקריפּט`;                                                    // yi-YI   yidis
case 'yo'   : return `Lati wo ohun elo yii o yẹ ki o mu Javascript ṣiṣẹ`;                                                 // yo-ES   yoruba
case 'zh-cn': return `要查看此应用,您应该启用Javascript`;                                                                            // undefined undefined
case 'zh-tw': return `要查看此應用,您應該啟用Javascript`;                                                                            // undefined undefined
case 'zu'   : return `Ukuze ubuke lolu hlelo lokusebenza kufanele unike amandla i-Javascript`;                            // zu-ZA   zulú

How can I solve this?

Thank you very much again.
Carlos

alexei added the bug label Jan 20, 2021
Owner

alexei commented Jan 20, 2021

This is totally valid, but I don't know how to count characters properly.

For this particular purpose, though, I'd suggest changing the coding style, e.g.

case "pt": // pt-BR portugués
    return `Para visualizar este aplicativo, você deve habilitar o Javascript`
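
A minimal sketch of this suggestion, assuming the generator has the language code, the translation, the locale, and the language name at hand. The function and parameter names are hypothetical. With the comment placed before the string, no column depends on the rendered width of the translation:

```javascript
// Hypothetical helper: emits one `case` with the comment placed before the
// translated string, so nothing has to be padded to a fixed column.
// Note: it does not escape backticks that may occur inside `text`.
function renderCase(lang, text, locale, name) {
  return `case "${lang}": // ${locale} ${name}\n    return \`${text}\`\n`;
}

console.log(
  renderCase(
    "pt",
    "Para visualizar este aplicativo, você deve habilitar o Javascript",
    "pt-BR",
    "portugués"
  )
);
```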

Author

CarlosJimenez commented Jan 21, 2021

Just in case this helps:
https://stackoverflow.com/questions/4063146/getting-the-actual-length-of-a-utf-8-encoded-stdstring

int len = 0;
while (*s) len += (*s++ & 0xc0) != 0x80;

Owner

alexei commented Jan 21, 2021

That looks like C/C++

Owner

alexei commented Jan 21, 2021

I tried

s.length
s.split("").length
[...s].length
(new TextEncoder).encode(s).length

with no success.

If you know how to do it, I'm open to discussing it.
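
For reference, a small sketch of what each of these expressions measures. None of them counts terminal columns: the first two count UTF-16 code units, the spread counts Unicode code points, and `TextEncoder` counts UTF-8 bytes. The function name and the sample strings are illustrative only.

```javascript
// What each attempted measure returns for a given string.
function measures(s) {
  return {
    codeUnits: s.length,                           // UTF-16 code units
    splitUnits: s.split("").length,                // also UTF-16 code units
    codePoints: [...s].length,                     // Unicode code points
    utf8Bytes: new TextEncoder().encode(s).length, // UTF-8 bytes
  };
}

console.log(measures("abcdef"));
// { codeUnits: 6, splitUnits: 6, codePoints: 6, utf8Bytes: 6 }
console.log(measures("要查看此应用"));
// { codeUnits: 6, splitUnits: 6, codePoints: 6, utf8Bytes: 18 }
```

Both strings report six code points, yet the second one usually occupies about twelve columns in a terminal, so none of these numbers can drive column padding on its own.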

Author

CarlosJimenez commented Jan 22, 2021

I don't know how to fix it (yet), but I would like to help. I will try a couple of ideas. I have been using your module for years, and it would be an honor to help you.

The following (C routine ported to JavaScript) doesn't work. The original C loop was: while (*s) len += (*s++ & 0xc0) != 0x80;

function lengthStr(s) {
	console.log(s,s.length);
	var len = 0;
	for (var i=0; i<s.length; i++) {
		var c = s.charCodeAt(i);
		var c_and_xC0 = c & 0xC0;
		if (c_and_xC0 !== 0x80) len++;
		console.log(sprintf("%d %02X %02X %d",i,c,c_and_xC0,len));
	}
	if (len===0) len=s.length; 
	return len;
}

The following, using a buffer, gives information about the actual UTF-8 bytes, but it returns basically the same as s.length.

function dec2bin(dec){
    return (dec >>> 0).toString(2);
}

function lengthStr(s) {
	console.log(s,s.length);
	var buffer = Buffer.from(s, "utf-8");
	console.log(buffer.length);
	var len = 0;
	for (var i=0; i<buffer.length; i++) {
		var c = buffer[i];
		var c_and_xC0 = c & 0xC0;
		// console.log(i,c,dec2bin(c));
		if (c <      0x80)   len++;         else
		if (c <     0x800) { len++; i+=1; } else
		if (c <   0x10000) { len++; i+=2; } else
		if (c <  0x200000) { len++; i+=3; } else
		if (c < 0x4000000) { len++; i+=4; } 

		console.log(sprintf("%d %02X %s %02X %d",i,c,dec2bin(c),c_and_xC0,len));
	}
}
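
A likely reason the first port does not behave like the C loop: `charCodeAt` returns UTF-16 code units, whereas the C code walks UTF-8 bytes and skips continuation bytes (those of the form `10xxxxxx`). The second version does read bytes, but it compares a single byte against code-point thresholds, so every non-ASCII byte takes the `c < 0x800` branch. Below is a sketch that applies the C test to real UTF-8 bytes via `TextEncoder`; the function name is hypothetical. Note that it counts code points, so it matches `[...s].length` and still says nothing about display width.

```javascript
// Counts code points by skipping UTF-8 continuation bytes (10xxxxxx),
// mirroring the C loop: while (*s) len += (*s++ & 0xc0) != 0x80;
function utf8CodePointCount(s) {
  const bytes = new TextEncoder().encode(s);
  let len = 0;
  for (const b of bytes) {
    if ((b & 0xc0) !== 0x80) len++; // lead byte or ASCII byte
  }
  return len;
}

console.log(utf8CodePointCount("voc\u00ea")); // 4
console.log(utf8CodePointCount("要查看"));     // 3
```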

I am sharing my failed tests so that you can avoid repeating my mistakes.

I'll keep you informed.

Best
Carlos

Owner

alexei commented Jan 27, 2021

The following [...] doesn't work

The following [...] returns basically the same as s.length

That was my problem as well 🙂


lorenzos commented Jan 19, 2022

The problem is not in counting characters; Node is perfectly capable of counting characters natively:

> 'abcdef'.length
6
> '要查看此应用'.length
6

The problem is that fixed-width fonts can't really be fixed-width in some cases. It's a display issue, and I'm afraid it also depends on the font used. Here are, for example, screenshots from GitHub, my editor, and my terminal, where you can see that the Chinese string always has a different width from the abcdef one:

GitHub rendering

Editor rendering

Terminal rendering

So, I don't think it is possible to fix this.


XiaoHippo commented May 25, 2022

You can use wcwidth, which recognizes wide characters and counts the number of columns they take.
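
A rough sketch of the idea, not the actual wcwidth implementation: the range table below is a simplified assumption covering common East Asian wide characters, and `displayWidth` and `padEndColumns` are hypothetical names. Combining marks are treated as zero-width, wide characters as two columns, and everything else as one.

```javascript
// Simplified, illustrative wide-character ranges (NOT the full wcwidth table).
const WIDE_RANGES = [
  [0x1100, 0x115f],   // Hangul Jamo
  [0x2e80, 0x303e],   // CJK radicals and punctuation
  [0x3041, 0x33ff],   // Hiragana, Katakana, CJK symbols
  [0x3400, 0x4dbf],   // CJK Extension A
  [0x4e00, 0x9fff],   // CJK Unified Ideographs
  [0xac00, 0xd7a3],   // Hangul syllables
  [0xf900, 0xfaff],   // CJK compatibility ideographs
  [0xff01, 0xff60],   // Fullwidth forms
  [0x20000, 0x3fffd], // Supplementary ideographic planes
];

// Nonspacing/enclosing marks and zero-width space/joiners take no column.
const ZERO_WIDTH = /^[\p{Mn}\p{Me}\u200b-\u200d]$/u;

// Estimated number of terminal columns occupied by `s`.
function displayWidth(s) {
  let width = 0;
  for (const ch of s) { // iterates by code point
    if (ZERO_WIDTH.test(ch)) continue;
    const cp = ch.codePointAt(0);
    width += WIDE_RANGES.some(([lo, hi]) => cp >= lo && cp <= hi) ? 2 : 1;
  }
  return width;
}

// Left-justifies `s` in `columns` display columns (like %-Ns, by width).
function padEndColumns(s, columns) {
  return s + " ".repeat(Math.max(0, columns - displayWidth(s)));
}

console.log(displayWidth("abcdef"));       // 6
console.log(displayWidth("要查看此应用")); // 12
```

With something like this, the string could be padded before formatting and passed to a plain `%s`. As noted earlier in the thread, the final alignment still depends on the font and terminal, and scripts with complex shaping may remain misaligned.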
