Berdelaedjes forum index: Berdelaedjes
Discussion forum in Walloon, on any subject at all
 

Sûtisté éndjolike po fé tcheryî l' walon / AI for Walloon
lucyin



Joined: 2005-07-07
Posts: 3661
Location: Sidi Smayil, Marok

Posted: Sat 03 Jun, 2023 22:05:49    Subject: Sûtisté éndjolike po fé tcheryî l' walon / AI for Walloon

Quote:
Steve Lempereur

Lucien,

I've been playing around with the recent AI developments, mostly ChatGPT and Google Bard, just to learn what can be accomplished with them.

Has anyone in Belgium thought about trying to attract the tech/AI developers to pay attention to Walloon?

When I saw that Iceland had somehow done this, I was curious about the specifics of how they got the developers' attention and how the arrangement works.
https://grapevine.is/news/2023/03/15/icelandic-developed-as-a-language-model-within-chatgpt/

Google Bard (AI) does appear to have some semblance of being able to translate standard written Walloon. How that is being achieved, and who is working on it, are the mysteries.

I noticed that last year Google added 24 new languages to its Google Translate tool.

https://blog.google/products/translate/24-new-languages/

I think it sure would go a long way to see support there to translate standard written Walloon to English and other languages.

Here are some of my thoughts and observations. As I've mentioned, the Belgian linguists have done an impressive amount of work, and I cannot even imagine how they could do much more on their own.

https://wa.wiktionary.org/
and https://wa.wikipedia.org/
Even a pretty useful translation tool: https://dtw.walon.org/index.php

I sincerely hope that the sheer amount of open documentation that can be freely used to train and develop a computer model attracts some tech companies. Unfortunately, the size of Wallonia and its number of speakers don't make it an appealing economic proposition for a tech company to invest a lot of time or money in such a minority language. Unless maybe for marketing purposes? So if an AI developer has to enter into a contractual agreement to access the needed datasets, that's just a deterrent. It's important to provide developers with resources for developing AI systems that support the language. This could include datasets of text and code in minority languages, as well as tools and tutorials for developing AI systems that are able to process and understand the language.

There are local associations and social media groups promoting Walloon, but they are very grass-roots. In my opinion, the Walloon language in Belgium would benefit from a centralized voice/advocate.

This organization/foundation could act as a decision maker in terms of Walloon spelling for the community. Mostly the organization would act as an advocacy/preservation spokes-force.

It could do so by contacting AI developers directly, or by writing to government officials or industry leaders. It could possibly even fund research. But it could encompass many things. Since it seems the standard written Walloon (rifondou) isn't catching on, and that seems to be a key thing to its survival... I'd say you need to identify who would benefit the most from a standard spelling... It seems to me that anyone trying to publish books would be among the first people I'd approach for donations to try and fund the organization. Basically, I am thinking of a bigger central organization to work with all the smaller grass-roots groups.

This part is a bit more far-fetched and takes a lot more persuading on the tech side, but it could really help things, as I'll explain with an example:

ASR (Automatic Speech Recognition) is the technology that can take an audio sample, analyze it and convert it to text. Think: Siri, Hey Google, Alexa, etc. This takes a lot of training audio from a number of different speakers. (Perhaps you recognized how GOOG-411 was a training bed for the ASR of today.)

A lot of different apps use the ASR backends of the big tech companies (Google, Amazon, etc.) to do this. For an example app, think of the speaking exercises in Duolingo.

In my opinion, getting Walloon into this digital domain, first as a text-based translation tool (think Google Translate or the old AltaVista Babel Fish), and then possibly into speech recognition, is something that really should be pushed for. Without that, you are looking at a more old-fashioned kind of preservation (think of a lot of IPA transcriptions), and in reality that sort of documentation is really only well understood by linguists. If there can be an application space, then people can learn on their phones more conveniently, and maybe some sort of Walloon speaking game could be developed where they score points, and this could be the next craze in Wallonia instead of Candy Crush Saga, etc. If the technology and tools are readily accessible to make learning the language easier, who knows?

To reiterate: build relationships with AI developers. Get to know them, their work, and their interests. This will make them more likely to listen to your concerns. Write letters, mingle at conferences, and advertise! It appears LinkedIn would be a good place to find AI developers and linguists interested in the language. It's easier to identify people's skill sets there…

Since you’ll most likely be approaching an American company, if you feel it would be beneficial to write a joint letter co-signed by the Belgian Heritage Center of Wisconsin, please let us know!


Thank you for that long letter. It surely is a key solution for saving Walloon.

I have published it in "Berdelaedjes".

I hope someone can translate it into French so that everyone concerned can fully understand it.

I am rather busy now, but later on, in June and July, I will make a first trial for voice recognition by reading "Li ptit prince" in a "neutral accent".
https://wa.wikisource.org/wiki/Uzeu:Lucyin/djermons/Li_Ptit_Prince

The research is led by Mr Boula de Mareuil, the author of the "Atlas sonore", which has more than 70 survey points for the Walloon language.

https://atlas.limsi.fr/?tab=be

Maybe other "waloneus" will be able to follow your suggestions (I am not familiar enough with computer science to go far in such trials and contacts). My main work remains adding lexical data to the three websites you mentioned above, together with producing the Walloon-language magazine "Li Rantoele".

Thank you for suggesting working together with the Belgian Heritage Center of Wisconsin for possible contacts with US companies.

Sincerely yours.
_________________
Li ci ki n' a k' on toû n' vike k' on djoû.


Last edited by lucyin on Fri 30 Jun, 2023 16:01:43; edited 1 time in total
Èl-Gueuye-Noere



Joined: 2019-05-13
Posts: 184
Location: Tchålerwè

Posted: Wed 07 Jun, 2023 17:45:11

Text produced by Perplexity AI:

« Bonjour, c' est un grand plaisir de vous écrire en wallon. Li walon c' est on langue roman qui s' parle en Walonreye et a Brussele, et ossi ene p'tite parti d' la Franche, et in Wisconsin, USA. C' est on langue qui fait partie del famileye des langues d'oïl, et qui a des liens avec la littérature française. Si vous avez envie d'apprendre li walon, il y a des cours en ligne avec des tuteurs dédiés qui peuvent personnaliser les leçons pour vous. Merci d'avoir lu ce message en wallon. »
_________________
El sopoirtance va djonde on si télmint hôt livea k' les sûteyès djins n' pôront pus sondjî a n' nén birer les bezéns.
slampereur



Joined: 2023-06-05
Posts: 9
Location: Wisconsin

Posted: Thu 08 Jun, 2023 19:58:06

I'll be curious as to what voice recognition application Lucyin tries. Or really anything anyone has tried. I am familiar with CMUSphinx, having used it in the past, but only with the English language. I did try briefly to retrain the French model for Walloon. But that really didn't go anywhere, since I am not a linguist and the underpinnings of a French-to-Walloon dictionary are lacking.

I've heard good things about the Vosk Speech Recognition toolkit. Both are open source, which I am a fan of. Sadly the more closed models powered by the big tech companies have made further headway in the speech recognition areas in recent years.

As for learning Walloon myself, there are only a few speakers left to talk to, and I live half an hour away from them, too far to really interact with them regularly. From what I have observed, anyone around here who studied any foreign language never really had the chance to use it, and as a result forgot most of it. This is mostly a boring monolingual area. So investing a lot of time isn't on my agenda. I think the time would be better spent advocating for the language. I am not overly fond of some of the big tech companies and the privacy issues associated with it all. But if the partnership can help preserve a language and/or put tools in people's hands (phones) to keep fluent, then I'll look past those.

https://www.cbc.ca/news/canada/montreal/alexa-learns-quebec-french-1.5078881

https://grapevine.is/news/2023/03/15/icelandic-developed-as-a-language-model-within-chatgpt/

(Sorry for the English; after 170 years it was inevitable we'd become boring Americans.)
Pablo Saratxaga
Site Admin


Joined: 2005-07-01
Posts: 1148
Location: Oûpêye

Posted: Fri 09 Jun, 2023 19:33:15

Hello,

ChatGPT (and those much-hyped "AI" things) is not any "intelligence" (artificial or not) at all;
It is unable to make any decision, unable to truly learn, unable to truly interact, unable to recognise true from false...

However, if we set the hype aside and see it for what it really is (a shape recognition model, able to produce output matching the model) then we can indeed do quite some interesting things.
In the field of language those modeling tools are quite good, as human language itself is a shape model thing.

I was wondering if it was able to produce acceptable output in Walloon.
As I don't want to register an account and such, I never tested;
however, the extract posted by Èl-Gueuye-Noere shows that it hasn't been fed with Walloon text (quite interestingly, though, that output also shows that in its shape recognition model the word "wallon" has been associated with oïl varieties; it just is unable to pick the right patterns).

Feeding it properly (with properly tagged text) would surely improve that;
and the bad result just shows it hasn't been done.
Maybe the reason is that there isn't much data that can (legally) be fed to the machine (I don't think Wikipedia pages can be legally used in such way, that is to produce a closed non-free corpus; and the Walloon wikipedia is the only place I know where there is both a large amount of text in Walloon, and properly tagged (with lang=wa ))
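
To make the "properly tagged (with lang=wa)" point concrete, here is a minimal sketch of how text marked with a given `lang` attribute could be pulled out of an HTML page using only the Python standard library. The sample markup, the class name `LangTextExtractor` and the function `extract_lang_text` are hypothetical illustrations, not anything taken from Wikipedia's tooling, and the sketch assumes well-nested markup (omitted closing tags would confuse it). It says nothing about whether such reuse is legally allowed.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not touch the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class LangTextExtractor(HTMLParser):
    """Collect text whose nearest enclosing `lang` attribute is `wanted_lang`."""

    def __init__(self, wanted_lang="wa"):
        super().__init__(convert_charrefs=True)
        self.wanted_lang = wanted_lang
        self.lang_stack = [None]   # language in effect for each open element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        # An element inherits its parent's language unless it sets its own.
        lang = dict(attrs).get("lang") or self.lang_stack[-1]
        self.lang_stack.append(lang)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if len(self.lang_stack) > 1:
            self.lang_stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self.lang_stack[-1] == self.wanted_lang:
            self.chunks.append(text)


def extract_lang_text(html, wanted_lang="wa"):
    parser = LangTextExtractor(wanted_lang)
    parser.feed(html)
    parser.close()
    return parser.chunks


# Hypothetical sample page, written for this sketch.
SAMPLE = (
    '<html lang="en"><body><p>English intro</p>'
    '<div lang="wa"><p>Li walon est on lingaedje.</p><br>'
    '<span lang="fr">Bonjour</span></div></body></html>'
)

if __name__ == "__main__":
    print(extract_lang_text(SAMPLE))
```

Only the paragraph inside the `lang="wa"` block is kept; the English intro and the nested French span are skipped, which is the kind of filtering that language tagging makes possible.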

Note also that, while it may be interesting for the promotion of the language (it will surely generate some buzz), it won't be a good tool to "learn" it.
(there are tons of hilarious cases of bad teaching around there, where the "AI" boldly tells false things).

A much better thing would be to have Walloon included in a language learning platform such as Duolingo (sadly it doesn't seem to accept new languages right now). A lot of things could have been quite different if some steps in Walloon normalization had happened just 2 years earlier. For example, if the ISO language code "wa" had been obtained in 2000 (instead of 2002), then the language would have been listed from the beginning in Google, Netscape/Mozilla (and probably Microsoft some years later); instead, Walloon is still not a language choice in Google searching; it only appeared in Mozilla and Microsoft some 3-4 years ago, and in Chrome and the Google Android keyboard some 2-3 years ago.
The initial 2-year delay produced a 20-year delay in deployment...
slampereur



Joined: 2023-06-05
Posts: 9
Location: Wisconsin

Posted: Sat 10 Jun, 2023 2:05:49

Yes, AI is overhyped. One thing it's annoyingly bad about is making stuff up. (Think of being creative, as in writing a poem.) They refer to that as hallucination.

You'll never get an answer with real sources, even if you drill the AI as to the source. For example, I asked for a list of books on Norwegians in Waukesha, Wisconsin, and it came back with a title/author of one I had never heard of, nor could I find anything about it online. So then I asked what the book's OCLC number was. It gave me a number, but that too was bogus.

https://en.wikipedia.org/wiki/Hallucination_(artificial_intelligence)

Here is a Bard AI translation example. I took a fella's Walloon text from Discord and (well, you be the judge):

Quote:

translate this walloon text to English ""El bijhe eyet l’ solea si bretént po sawè kî çk’ esteut l’ pus foirt d’ inte zels deus, cwand tot d’ on côp, il ont veu on voyaedjeu k’ arivéve, tot ravôtyî dins s’ mantea. Il ont tcheu dacoird k’ el ci k’ arivreut l’ prumî a lyi fé oister s’ mantea sreut rwaitî come el pus foirt. Adon, el bijhe s’ a metou a shofler di tot s’ shofla, mins å dpus k' ele shofléve, å dpus k’ no voyaedjeu rseréve ès mantea åtoû d’ lu. Ça fwait ki, al fén, el bijhe a leyî tchaire. Adon, el solea a cmincî a rlure, et après on moumint, no voyaedjeu a fini pa s’ disbiyî. Ça fwait k’ el bijhe a bén dvou admete k’ el solea esteut pus foirt ki leye.""

Sure, here is the translation of the Walloon text to English:

The bee and the sun were arguing about who was the strongest, when suddenly they saw a traveler arriving, all wrapped up in his cloak. They agreed that the one who could first get him to open his cloak would be considered the strongest. So, the bee started to blow with all her might, but the harder she blew, the tighter the traveler held his cloak around him. So, in the end, the bee gave up. Then, the sun began to shine, and after a moment, the traveler finally opened his cloak. So, the bee had to admit that the sun was stronger than her.

The moral of the story is that sometimes the most powerful force is not the one that is the loudest or the most aggressive, but the one that is the most persistent.


(Sometimes I notice the AI seems to default to the closest-sounding English word)

Yes, in theory Wikipedia shouldn't be used to create a closed-source thing. But if one can never really prove which sources they used to train their model, then I guess one doesn't have a valid argument.

So I wonder how it has learned what it has or, more specifically, who trained it to interpret Walloon. Sure, there is a large amount of text on the Walloon Wikipedia, but there are no corresponding translated pages in other languages.
Pablo Saratxaga
Site Admin


Joined: 2005-07-01
Posts: 1148
Location: Oûpêye

Posted: Sat 10 Jun, 2023 2:26:11

Automated translation is probably the best application for those models;
and indeed this Walloon -> English one is very good!
The only miss is that "bijhe" is the name of a wind, and not a flying insect;
replace "bee" with "wind" and it will be ok.
(but while it is astonishing at producing automated translations, that also shows why it cannot be used to learn a language)
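
A slip like "bee" for "bijhe" is the kind of thing a small glossary check could flag automatically. The following is only a sketch under stated assumptions: the glossary holds the single entry grounded in this thread ("bijhe" means a wind), and the names `GLOSSARY`, `words` and `missing_glosses` are made up for the illustration. A real check would need a proper lexicon and some handling of inflected forms.

```python
import re

# Hypothetical glossary; the only entry taken from this thread is
# "bijhe", which names a wind and not a flying insect.
GLOSSARY = {"bijhe": {"wind"}}


def words(text):
    """Return the set of lowercase letter-only words in `text`."""
    return set(re.findall(r"[^\W\d_]+", text.lower()))


def missing_glosses(source, translation, glossary=GLOSSARY):
    """List glossary terms found in `source` whose accepted English
    glosses are all absent from `translation`."""
    src_words = words(source)
    out_words = words(translation)
    return [
        term
        for term, accepted in sorted(glossary.items())
        if term in src_words and not (accepted & out_words)
    ]


if __name__ == "__main__":
    src = "El bijhe eyet l' solea si bretént"
    print(missing_glosses(src, "The bee and the sun were arguing"))
    print(missing_glosses(src, "The wind and the sun were arguing"))
```

The first call reports "bijhe" as suspicious and the second reports nothing, so a reviewer would only need to look at the flagged sentences.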

Quote:

""El bijhe eyet l’ solea si bretént po sawè kî çk’ esteut l’ pus foirt d’ inte zels deus, cwand tot d’ on côp, il ont veu on voyaedjeu k’ arivéve, tot ravôtyî dins s’ mantea. Il ont tcheu dacoird k’ el ci k’ arivreut l’ prumî a lyi fé oister s’ mantea sreut rwaitî come el pus foirt. Adon, el bijhe s’ a metou a shofler di tot s’ shofla, mins å dpus k' ele shofléve, å dpus k’ no voyaedjeu rseréve ès mantea åtoû d’ lu. Ça fwait ki, al fén, el bijhe a leyî tchaire. Adon, el solea a cmincî a rlure, et après on moumint, no voyaedjeu a fini pa s’ disbiyî. Ça fwait k’ el bijhe a bén dvou admete k’ el solea esteut pus foirt ki leye.""

The wind and the sun were arguing about who was the strongest, when suddenly they saw a traveler arriving, all wrapped up in his cloak. They agreed that the one who could first get him to open his cloak would be considered the strongest. So, the wind started to blow with all her might, but the harder she blew, the tighter the traveler held his cloak around him. So, in the end, the wind gave up. Then, the sun began to shine, and after a moment, the traveler finally opened his cloak. So, the wind had to admit that the sun was stronger than her.


Could you ask for the reverse output (en->wa)? I would be curious about that.

(Interesting also how the moral has been added to the output, despite not being present in the Walloon text provided; but it commonly accompanies that fable.)


Thanks
slampereur



Joined: 2023-06-05
Posts: 9
Location: Wisconsin

Posted: Sat 10 Jun, 2023 3:44:04

Yes, that won't help anyone learn. Quite the opposite: it's a cheat, so they don't have to learn it.

However, that ability is what is needed for people like the ones behind Duolingo (or really anyone outside Wallonia) to create something for learning.

Speaking of Duolingo, in 2015 the learning fad was the Glossika method.
I believe it was at Kelly Bier's suggestion that we recorded 2000 sentences and about 600 words following their format. And there it sits, mostly because we don't know what to do with it. And honestly I don't think it's a very good method for beginners. But if anyone has suggestions for those recordings, I'd be more than happy to pass them along.

Quote:

translate this english text to walloon "Mary had a little lamb, little lamb, little lamb. Mary had a little lamb, its fleece was white
as snow. And everywhere that Mary went. Mary went. Mary went. And everywhere that Mary went, the lamb was sure to go.
It followed her to school one day, school one day, school one day. It followed her to school one day, which was against the rule. It made the children laugh and play, laugh and play, laugh and play. It made the children laugh and play to see the lamb at school. "

Sure, here is the translation of "Mary had a little lamb" to Walloon:

Mary aveut on ptit agneau, ptit agneau, ptit agneau.
Mary aveut on ptit agneau, si laine estot blanke comene nive.
Et par tos les endroys ki Mary aloait.
Mary aloait.
Mary aloait.
Et par tos les endroys ki Mary aloait, l'agneau esteut sier a y aler.
I l'a yeu a scole on djoû, scole on djoû, scole on djoû.
I l'a yeu a scole on djoû, ki c'esteut conte les règles.
I lyi a fwait rire et djouwer les efants, rire et djouwer, rire et djouwer.
I lyi a fwait rire et djouwer les efants a veu l'agneau a scole.

How did it do? Obviously, it would take a lot of manual reviewing for me to make that kind of assessment myself.
slampereur



Joined: 2023-06-05
Posts: 9
Location: Wisconsin

Posted: Sat 10 Jun, 2023 5:30:52

Here is a classic example of bogus answers (and ones that really squash my hopes)

Quote:

does a walloon text to speech engine exist?

Yes, there is a Walloon text-to-speech engine called "LangueWallonne". It was developed by the Walloon Region and is available for free download. The engine can be used to read aloud text in Walloon, and it also includes a number of features that allow users to customize the output, such as the speed and pitch of the voice.

Here are the steps on how to use LangueWallonne:

Download the LangueWallonne engine from the Walloon Region website.
Install the engine on your computer.
Open a text file that contains Walloon text.
Select the text that you want to read aloud.
Click on the "Speak" button in the LangueWallonne toolbar.
The text will be read aloud in Walloon.
Here are some of the features of LangueWallonne:

Supports a variety of Walloon dialects
Can read aloud text in a variety of speeds and pitches
Allows users to customize the sound of the voice
Can be used with a variety of text formats, including .txt, .doc, and .pdf files
LangueWallonne is a free and easy-to-use tool that can be used to read aloud Walloon text. It is a valuable tool for people who are learning Walloon or who want to improve their fluency in the language.


I guess I should have introduced myself when I joined. My name is Steve, I am 44 yrs old. I've always been the curious type; my other interests and hobbies are ham radio, electronics repair, and Linux. My oldest uncle spoke the language well and visited Belgium a number of times. I grew up outside the settlement, but spent a lot of time as a kid visiting relatives in the area. I recently was elected president of the Belgian-American club that my grand-uncle, Harry Chaudoir, ran for many years.
Alban Leloup



Joined: 2022-01-08
Posts: 26
Location: Louvie-Juzon (Pirenêyes-Atlantikes)

Posted: Sat 10 Jun, 2023 10:20:33

slampereur wrote:
I guess I should have introduced myself when I joined. My name is Steve, I am 44 yrs old. I've always been the curious type; my other interests and hobbies are ham radio, electronics repair, and Linux. My oldest uncle spoke the language well and visited Belgium a number of times. I grew up outside the settlement, but spent a lot of time as a kid visiting relatives in the area. I recently was elected president of the Belgian-American club that my grand-uncle, Harry Chaudoir, ran for many years.


Well met, dear Wisconsinite cousin! I'm Alban, and I'm the fella from the Discord who made that Walloon translation of "The North Wind and the Sun" you mentioned earlier.
lucyin



Joined: 2005-07-07
Posts: 3661
Location: Sidi Smayil, Marok

Posted: Thu 22 Jun, 2023 20:05:55

Binåjhe di vey ki ça berdele fer / Happy to see there is a lot of chatting going on

@ Alban: have you recorded your text and sent it to Flipe Boula?
slampereur



Joined: 2023-06-05
Posts: 9
Location: Wisconsin

Posted: Mon 07 Aug, 2023 0:09:38

I just noticed that Google Bard added text-to-speech output as of July.

Here is its synthesized output for "Li diâle tchît todi su L' pus gros moncia"

https://vocaroo.com/18gGhgHVbVea

Assuming this is fairly accurate (in most cases) this can be helpful.

Perhaps if the question was typed in written Walloon then it would reply that way. That is out of my league.

On that topic, I asked if a Walloon text-to-speech model even existed. It replied (typical AI hallucination?): "Yes, there is a Walloon text to speech model available. It is called the Narakeet Walloon Text to Speech model."

I don't see Walloon mentioned on their website. But I must say that if anyone would have such a thing, it sure seems like these would be the folks doing it.

https://join.slack.com/t/narakeet/shared_invite/zt-l5g3zcwz-L91emojnUB0Q2jLN9GCCrw
lucyin



Joined: 2005-07-07
Posts: 3661
Location: Sidi Smayil, Marok

Posted: Sun 10 Sep, 2023 12:33:21

Quote:
Steve, writing to an American woman who has tried to write Walloon

Kelly,

Could you please provide me copies of the attempts by those people you interviewed some years back to write Walloon in their own English-like way?
Or really anything you have on a proposed standard for the area, but their attempts at a minimum.

There are a number of people in Belgium interested in examining that
sort of thing.

Thank you.


Dear Steve
Sorry for the slow response these last days. I was editing the latest issue of "Li Rantoele": https://lucyin.walon.org/guerni/Rantoele107.pdf

Before that, I recorded the whole book "Li Ptit Prince", to be used by Philippe Boula for voice recognition and "speech creation" (speech synthesis): https://wa.wikisource.org/wiki/Li_Ptit_Prince_(Hendschel-Mahin,_2023)

Sure, it would be interesting to have a lexicon of Walloon words written in "Wisconsin orthography".

With kind regards
All times are GMT + 2 hours