[{"content":"","date":"11 May 2026","externalUrl":null,"permalink":"/ludo/en/tags/ai/","section":"Tags","summary":"","title":"Ai","type":"tags"},{"content":"","date":"11 May 2026","externalUrl":null,"permalink":"/ludo/en/categories/","section":"Categories","summary":"","title":"Categories","type":"categories"},{"content":"","date":"11 de May de 2026","externalUrl":null,"permalink":"/ludo/tags/gesti%C3%B3-coneixement/","section":"Tags","summary":"","title":"Gestió-Coneixement","type":"tags"},{"content":"","date":"11 de May de 2026","externalUrl":null,"permalink":"/ludo/es/tags/gesti%C3%B3n-conocimiento/","section":"Tags","summary":"","title":"Gestión-Conocimiento","type":"tags"},{"content":"","date":"11 de May de 2026","externalUrl":null,"permalink":"/ludo/tags/infraestructura/","section":"Tags","summary":"","title":"Infraestructura","type":"tags"},{"content":"","date":"11 May 2026","externalUrl":null,"permalink":"/ludo/en/tags/infrastructure/","section":"Tags","summary":"","title":"Infrastructure","type":"tags"},{"content":"","date":"11 May 2026","externalUrl":null,"permalink":"/ludo/en/tags/knowledge-management/","section":"Tags","summary":"","title":"Knowledge-Management","type":"tags"},{"content":"","date":"11 de May de 2026","externalUrl":null,"permalink":"/ludo/es/tags/llms-locales/","section":"Tags","summary":"","title":"Llms-Locales","type":"tags"},{"content":"","date":"11 de May de 2026","externalUrl":null,"permalink":"/ludo/tags/llms-locals/","section":"Tags","summary":"","title":"Llms-Locals","type":"tags"},{"content":"","date":"11 May 2026","externalUrl":null,"permalink":"/ludo/en/tags/local-llms/","section":"Tags","summary":"","title":"Local-Llms","type":"tags"},{"content":"","date":"11 May 2026","externalUrl":null,"permalink":"/ludo/en/","section":"Marc Alier (Ludo)","summary":"","title":"Marc Alier (Ludo)","type":"page"},{"content":"In one recent run I had local LLMs going for more than 28 hours, chewing through 437 markdown files. 
The model used was qwen3.5:122b, served via Ollama working two Mac Studios on a private LAN. The task was unglamorous: read a file, return a JSON with a two-sentence summary, the named persons mentioned, the topic tags, the key-ideas the file argues. Repeat. Aggregate across files. Write per-entry scaffolds.\nIf I tried to do this with Claude Code I would run out of tokens in a heartbeat, and even paying cheap Chinese cloud providers I would end up skipping this kind of content-repository optimisation because it gets expensive fast.\nI have since started replacing that pipeline with DS4 running DeepSeek V4 Flash Q4 on the same hardware. Per-call latency dropped from a hundred and fifty seconds to about twelve. Same task. Same prompt shape. Different engine, double the size in RAM. Quite an improvement in performance, and the Studio even feels like it runs cooler.\nThis post is the optimisation note. What I want from the local fleet is a quality LLM with variable cost approaching zero, for batch tasks — wiki indexing, references audit, transcription cleanup, bulk classification. The kind of work where the bottleneck is latency and reliability, not cleverness. Two engines, two postures, one tradeoff worth naming.\nThe use case # A wiki index over a personal writing and research repository. About four hundred markdown files: chapters, drafts, source material, memory notes (hell, even my full PhD dissertation is in there!). For each file I want a small structured artefact: a summary the index can show, named entities to populate person/topic/key-idea facets, cross-references between them.\nThis is bulk extraction. Reusable system prompt; variable user content; structured JSON output. The model does not need to be clever — it needs to be reliable, predictable, and available enough to grind through hundreds of calls without breaking.\nCalling an LLM in this register is a different sport from chatting with one. The output is the input to the next pipeline step. 
If twenty per cent of calls return empty, that is twenty per cent of the wiki missing. If you double the context length and quality silently degrades, the corpus that needs chunking is your largest file. The engineering surface that matters is latency, predictability, and graceful failure.\nKarpathy\u0026rsquo;s wiki pattern # In early April 2026 Andrej Karpathy posted a tweet that went viral: \u0026ldquo;a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating [knowledge].\u0026rdquo; He sketched a system where raw sources — articles, papers, repos, datasets, even images — drop into a raw/ directory, and an LLM agent incrementally compiles them into a structured wiki: interlinked markdown files with summaries, backlinks, concept pages. Two days later he followed up with an idea file — a GitHub gist that describes the pattern conceptually, no code attached, on the theory that in the agent era the idea is more useful to share than the implementation.\nThat tweet describes the project I am building almost word for word. The repository I am indexing is years of writing — chapters, drafts, source transcriptions, memory notes, the lot. The bootstrap I ran is the incremental compile step Karpathy names. The operations he sketches downstream — ingest, query, lint — are what comes next for me. He did not invent the pattern, but he made it legible: named it, gave it shape, made it public at the moment the local-hardware tooling caught up. By summer it will be everywhere.\nVersatility vs specialisation # Two postures sit at the ends of the local-LLM spectrum.\nOllama is the versatile end. It serves any model in its library — Qwen, Llama, Mistral, Gemma, Phi, anything you can grab as a GGUF. One install, GUI and CLI, one HTTP API, and you can swap models per task. 
If you change models every week, Ollama is the right tool: the friction of trying a new one is a single ollama pull command.\nDS4 is the specialised end, the new kid in town. It serves one model — DeepSeek V4 Flash quantized at q2 or q4 — and is dedicated to making that one model run as well as a Mac Studio will allow. Disk-resident KV cache, MoE-aware kernels, MTP speculative decoding, an OpenAI-compatible HTTP API on top. It cannot serve any other model. That is not a bug; that is the design and intent.\nThe tradeoff is direct. If you commit to one model and use it heavily — same prompt prefix, hundreds of calls a day, prefill cost amortised over thousands of inferences — specialisation pays for itself fast. If you flit between models, the specialised engine is sunk cost. The right answer depends on how much you actually use it.\nOllama + qwen3.5:122b — the versatile incumbent # qwen3.5:122b is a strong Mixture-of-Experts model, 125 billion total parameters, plenty of capability. Served via Ollama it is a one-line install and a permanent presence on the LAN. It has been my workshop\u0026rsquo;s \u0026ldquo;Tier 1\u0026rdquo; for months: anything mechanical and voluminous that I do not want to pay an API for goes there.\nFor the wiki bootstrap, two observations are worth recording.\nThe context cliff. I capped every call at 12k tokens of input with a hard ceiling of 16k via --num-ctx 16000. Past 16k, quality degrades visibly — truncated tails, missed named entities, dropped numerical qualifiers. The 30k chapter, the 70k transcription, the 50k worldview file all had to be chunked, summarised per chunk, and re-aggregated. Each chunk is another Ollama call.\nThe empty-output failure mode. Through the run, eighteen to twenty per cent of calls returned len=0 — an empty completion. Not malformed JSON, not refused content, just nothing. Across files of all sizes; not predicted by size, type, or content. Retrying the same call usually produces the same result. 
My mitigation was a retry pass on gpt-5-mini via OpenAI — about twenty cents per hundred files. Cheap, but not free, and not the point.\nDS4 + DeepSeek V4 Flash — the specialised stack # DS4 is an alpha inference engine built by antirez, dedicated to DeepSeek V4 Flash — a 284B total / 13B active MoE model. The model spec claims up to a one-million-token context; I run the server at --ctx 200000 and have tested files up to 70k tokens. It exposes an OpenAI-compatible HTTP API on port 8000 and keeps an on-disk SHA1 KV cache that persists across server restarts.\nThree of those properties matter for batch extraction:\nThe KV cache reuses prompt prefixes. When the system prompt is identical across every call — and in a structured-extraction pipeline it always is — the prefill cost is paid once, then cached on disk. The second call onward sees a dramatically smaller per-call cost. Over a long run this compounds into the biggest single saving.\n200k context (the model claims more, I have only tested this much). No more chunking the 70k files. The whole thing goes in, the whole thing comes out summarised. Less aggregation logic, less risk of losing a section.\nNoticeably better at structured JSON output in my early smoke tests. Zero empty-output failures in the first runs.\nHardware throughput on the M3 Ultra: ~38 tokens per second prefill, 32 generation — consistent with antirez\u0026rsquo;s documented numbers. Per-call latency in the live enrichment pipeline is 6.3 to 25.9 s, mean 11.6 s for a 2k-input, 500-token-output call (last 200 calls of a 1,241-call run). The same call shape on qwen3.5:122b ran at 90–290 s, mean 150–200 s. An order of magnitude on the wall-clock that mattered most.\n[Figure: vertical bar chart of mean per-call latency in seconds (lower is better). Ollama + Qwen3.5:122b: ~175 s; DS4 + DeepSeek V4 Flash Q4: 11.6 s. Measured on a Mac Studio M3 Ultra with the same call shape (2k-input, 500-token-output), a ≈15× difference.]\n
The reliability picture is sharper than the latency picture, and the one that decides whether the pipeline is usable at all. Zero silent failures in 1,241 calls, versus one in five on Qwen. The system prompt tells the model \u0026ldquo;if the context is too thin, output INSUFFICIENT_CONTEXT and nothing else\u0026rdquo; — DeepSeek V4 Flash took that exit in 15.6% of calls (194 of 1,241). The scaffold stays in place; nothing is invented. Qwen, when it failed, failed silently.\nMetric | DS4 + DeepSeek V4 Flash | Ollama + Qwen3.5:122b\nPer-call latency, mean | 11.6 s | 150–200 s\nPer-call latency, range | 6.3–25.9 s | ~90–290 s\nSilent empty-output failures | 0 / 1,241 | ~18–20 %\nHonest refusals (scaffold kept) | 194 / 1,241 (15.6 %) | n/a\nContext budget that holds quality | 200k tokens (tested) | ~16k effective\nCross-call cache reuse | on-disk SHA1, persistent | none\nOne unscientific observation worth recording: the Mac Studio runs cooler under DS4 at near-max GPU than under Ollama+Qwen3.5:122b at comparable load. No benchmark, just the case temperature.\nNot locked to DS4. The KV-cache-on-disk trick is not proprietary. llama.cpp can hold prefix cache in memory today and has persistent disk cache on the roadmap; Ollama sits on top of it and will inherit it. The pattern — specialised engine, prompt-prefix reuse — is portable, and the optimisation will spread to whichever model you happen to be serving.\nCatalan generation # I am not comparing the generation quality of these two engines in this post. That is a different question, and on long enough texts of consequence I would not trust either of them to do the writing for me. For Catalan in particular, my standing rule is that anything generated by a local model below the ~250B effective-parameter mark loses the idiom — the rhythm, the vocabulary that does not translate. 
Qwen3.5:122b fails that test. DeepSeek V4 Flash, with thirteen billion active params at inference, sits in the same effective-size class and I expect it to fail the same test until a real translation batch tells me otherwise.\nCatalan-target translation in my workshop goes to Claude or to a frontier OpenAI model. The local fleet covers ES/EN bulk only. The cost saving lives on the bulk-mechanical side; voice and idiom are not yours to save on.\nDemonetisation, in the Diamandis sense # In Bold (2015), Peter Diamandis and Steven Kotler describe the six Ds of exponential technology: once a technology is digitised, it goes deceptive, disruptive, demonetised, dematerialised and democratised. Demonetisation is the one that matters here. A year ago, indexing four hundred personal markdown files with an LLM was not economically obvious — at frontier-API rates the bill would have been real, the rate limits would have stretched it across weeks. Today, on hardware I already own, the same job is a single batch run at zero variable spend. The marginal call is free.\nWhen marginal cost approaches zero, the use cases that emerge are the ones that were not worth the bill before. Wiki indexing of a personal corpus is one. Transcription cleanup at scale, multilingual drafts for things you will rewrite anyway, semantic search across hundreds of papers, references audit, ontology extraction — each was \u0026ldquo;interesting but not interesting enough\u0026rdquo; at frontier-API prices, and each is a side project now.\nThat is the optimisation worth caring about. Not \u0026ldquo;Ollama vs DS4\u0026rdquo; — that decision is small, and the answer shifts as llama.cpp catches up. The optimisation is finding the workflows that only make sense once the per-call cost is gone.\nQuick technical reference # If you want to try the DS4 path on an Apple Silicon machine with enough memory (256–512 GB for the Q4 build, 96–192 GB for the Q2):\n# 1. 
Clone + build (~30 seconds) mkdir -p ~/Code \u0026amp;\u0026amp; cd ~/Code git clone https://github.com/antirez/ds4.git cd ds4 \u0026amp;\u0026amp; make # 2. Download the model (Q4: 153 GB, Q2: 81 GB) — antirez\u0026#39;s published GGUFs ./download_model.sh q4 # or `q2` for the smaller build # 3. Run the server mkdir -p ~/.ds4-kv ./ds4-server \\ --ctx 200000 \\ --kv-disk-dir ~/.ds4-kv \\ --kv-disk-space-mb 32768 # 4. Hit it like any OpenAI endpoint curl -s http://192.168.1.35:8000/v1/chat/completions \\ -H \u0026#34;Content-Type: application/json\u0026#34; \\ -d \u0026#39;{\u0026#34;model\u0026#34;: \u0026#34;deepseek-v4-flash\u0026#34;, \u0026#34;messages\u0026#34;: [{\u0026#34;role\u0026#34;:\u0026#34;user\u0026#34;,\u0026#34;content\u0026#34;:\u0026#34;Hola\u0026#34;}]}\u0026#39; By default the server should bind to 0.0.0.0:8000, though the version I built bound to 127.0.0.1:8000 and required an explicit reconfigure to reach across the LAN. The --kv-disk-dir is where the SHA1 KV cache lives — keep it on internal NVMe, not external storage. First call to a cold server takes ten to sixty seconds to load the weights; subsequent calls reuse them.\n","date":"11 May 2026","externalUrl":null,"permalink":"/ludo/en/posts/local-llms-as-bulk-tools/","section":"Posts","summary":"In one recent run I had local LLMs going for more than 28 hours, chewing through 437 markdown files. The model used was qwen3.5:122b, served via Ollama working two Mac Studios on a private LAN. The task was unglamorous: read a file, return a JSON with a two-sentence summary, the named persons mentioned, the topic tags, the key-ideas the file argues. Repeat. Aggregate across files. 
Write per-entry scaffolds.\n","title":"Optimizing My Local LLM Setup for Batch Tasks","type":"posts"},{"content":"","date":"11 May 2026","externalUrl":null,"permalink":"/ludo/en/categories/posts/","section":"Categories","summary":"","title":"Posts","type":"categories"},{"content":"","date":"11 May 2026","externalUrl":null,"permalink":"/ludo/en/posts/","section":"Posts","summary":"","title":"Posts","type":"posts"},{"content":"","date":"11 May 2026","externalUrl":null,"permalink":"/ludo/en/tags/","section":"Tags","summary":"","title":"Tags","type":"tags"},{"content":"The doctoral school of the University of Lleida has invited me to give a conference (keynote) at their PhD day on 6/3/2024. Here you will find the audio and video of the conference:\nhttps://mossegalapoma.cat/el-moment-sputnik-conferencia-den-ludo-marc-alier-a-la-jornada-de-lescola-de-doctorat-de-lleida/\n","date":"1 March 2024","externalUrl":null,"permalink":"/ludo/en/posts/conferencia-a-la-udl/","section":"Posts","summary":"The doctoral school of the University of Lleida has invited me to give a conference (keynote) at their PhD day on 6/3/2024. Here you will find the audio and video of the conference:\n","title":"'The Sputnik Moment' Conference by Marc Alier at the University of Lleida PhD Day","type":"posts"},{"content":"","date":"1 March 2024","externalUrl":null,"permalink":"/ludo/en/tags/educaci%C3%B3/","section":"Tags","summary":"","title":"Educació","type":"tags"},{"content":"","date":"1 March 2024","externalUrl":null,"permalink":"/ludo/en/tags/ia/","section":"Tags","summary":"","title":"Ia","type":"tags"},{"content":"ChatGPT summarizes the article in The Guardian for me: https://www.theguardian.com/commentisfree/2024/feb/13/software-student-cheated-combat-ai\nRobert Topinka, professor at Birkbeck, University of London, explores the dilemma professors face with students using AI to write essays. 
After detecting an essay marked as \u0026ldquo;100% AI-generated\u0026rdquo;, Topinka finds himself in a difficult situation when an exceptionally brilliant student challenges this accusation. The case highlights the challenges of AI detectors, like Turnitin, which can confuse students\u0026rsquo; legitimate use of technological support with cheating. Topinka argues for the need to adapt academic assessment to the AI era, proposing alternatives like presentations and podcasts to demonstrate students\u0026rsquo; critical and original thinking, while avoiding unfair accusations and promoting educational equality.\nOver the last year, I\u0026rsquo;ve taught courses to teachers and professors at more than 7 universities (You can see part of the content at https://wasabi.essi.upc.edu/ludo/cursos) and I\u0026rsquo;ve always received questions about how to detect if a paper was done by an AI instead of a person.\nCopied assignments are nothing new. When I was still a student, I gave tutoring classes for the subject Methodical Programming at an academy, and one day they asked me to solve an exercise that turned out to be the same one that appeared on the exam. Don\u0026rsquo;t ask me where that prompt came from, because I don\u0026rsquo;t know. I also didn\u0026rsquo;t continue working at that academy to find out.\nSince the web existed, we\u0026rsquo;ve had things like \u0026ldquo;El rincón del vago\u0026rdquo; where students share notes, exercises, and solved practices. And on Instagram, students receive ads from companies offering to do their bachelor\u0026rsquo;s or master\u0026rsquo;s thesis for them\u0026hellip; so they have time left to dedicate to their crush. I myself receive X (formerly Twitter) ads from companies offering to write research articles in my field and publish them in my name in prestigious journals.\nBut the problem is AI. Obviously. Because it allows doing things at scale and at ridiculous costs. 
And now we\u0026rsquo;re all rushing in.\nAnd what we want are easy solutions, like taking anti-inflammatories, but applied to teaching quality and the peace of mind a dashboard gives you. Systems like Urkund and Turnitin integrate into our Moodles and Sakais and tell us what percentage of our students\u0026rsquo; submissions are original or are copied from online sources or their classmates\u0026rsquo; papers.\nAs Bruce Willis said in \u0026ldquo;The Kid\u0026rdquo;: \u0026ldquo;sandwiches give security\u0026rdquo;. And the green color next to a submission gives peace of mind.\nBut for years now students have been inserting Cyrillic characters that look like letters in the Latin alphabet (see the table of confusable characters at https://github.com/mindcrypt/uriDeep/blob/master/data/deepDiccConfusables.txt) and that bypass anti-plagiarism systems without a problem.\nAnd AI-generated content detection systems?\nIn February 2023 I already said, and it\u0026rsquo;s on YouTube recorded in silicon, that for texts under a page it\u0026rsquo;s impossible to say with certainty if a text was generated by AI or by a person. Later, researchers from the University of Maryland published \u0026ldquo;Sadasivan, V. S., Kumar, A., Balasubramanian, S., Wang, W., \u0026amp; Feizi, S. (2023). Can AI-Generated Text be Reliably Detected? arXiv:2303.11156 [cs.CL]. https://doi.org/10.48550/arXiv.2303.11156\u0026rdquo;\nLet\u0026rsquo;s look at the abstract:\nNext, we provide an impossibility result indicating that as language models become more sophisticated and better at emulating human text, the performance of the best possible detector decreases. For a sufficiently advanced language model seeking to imitate human text, even the best possible detector could act only marginally better than a random classifier. 
Our result is general enough to capture specific scenarios such as particular writing styles, intelligent prompt design, or text paraphrasing.\nMeaning: NO, you cannot reliably detect if a text was generated by AI. The system will not be reliable and will generate errors: false negatives and false positives.\nNow, let\u0026rsquo;s talk about ethics. Real AI ethics, not the cheap kind. The ethics of a professor who decides to delegate their judgment to a machine that says a student \u0026ldquo;copied with AI\u0026rdquo;. The ethics of a company that sells the service of this machine that\u0026hellip; DOESN\u0026rsquo;T WORK.\nFalse positives can damage a student\u0026rsquo;s career, or land a researcher on a publisher\u0026rsquo;s blacklist on the strength of an AI discriminator.\n\u0026ldquo;AI told me you cheated with AI.\u0026rdquo; It\u0026rsquo;s an exhibition of cynicism and laziness, of not wanting to do things right. Of not thinking that if an AI can do the exercise you ask for, maybe you should ask for something else.\nMore than one professor has explained to me, to my horror, that their detector for AI-made content is ChatGPT. And they believe it. For a few months now ChatGPT has stopped taking a stance when asked if a text was generated by AI.\n","date":"12 February 2024","externalUrl":null,"permalink":"/ludo/en/posts/la-ia-diu-que-aquests-deures-els-ha-fet-la-ia/","section":"Posts","summary":"ChatGPT summarizes the article in The Guardian for me: https://www.theguardian.com/commentisfree/2024/feb/13/software-student-cheated-combat-ai\nRobert Topinka, professor at Birkbeck, University of London, explores the dilemma professors face with students using AI to write essays. After detecting an essay marked as “100% AI-generated”, Topinka finds himself in a difficult situation when an exceptionally brilliant student challenges this accusation. 
The case highlights the challenges of AI detectors, like Turnitin, which can confuse students’ legitimate use of technological support with cheating. Topinka argues for the need to adapt academic assessment to the AI era, proposing alternatives like presentations and podcasts to demonstrate students’ critical and original thinking, while avoiding unfair accusations and promoting educational equality.\n","title":"AI says the assignments were done by AI","type":"posts"},{"content":"ChatGPT gives me a summary of the article in The Guardian: https://www.theguardian.com/commentisfree/2024/feb/13/software-student-cheated-combat-ai\nRobert Topinka, a lecturer at Birkbeck, University of London, explores the dilemma professors face with students’ use of AI to write essays. After flagging an essay as “100% AI-generated,” Topinka lands in a tough spot when an exceptionally bright student challenges the accusation. The case highlights the problems with AI detectors like Turnitin, which can mistake legitimate technological support used by students for cheating. Topinka argues we must adapt academic assessment to the AI era, proposing alternatives such as presentations and podcasts to demonstrate students’ critical and original thinking, avoiding unfair accusations and promoting equal educational opportunities.\nOver the past year, I’ve given courses to lecturers from more than seven universities (you can see part of the content at https://wasabi.essi.upc.edu/ludo/cursos), and I always get questions about how to detect whether a piece of work was done by AI instead of a person.\nCopied work is nothing new. When I was still a student, I tutored a course called Methodical Programming at a cram school, and one day they asked me to solve an exercise that turned out to be the same one that appeared on the exam. Don’t ask me where that problem statement came from, because I don’t know. 
I didn’t keep working at that school to find out.\nSince the web has existed, we’ve had things like “El Rincón del Vago,” where students share notes, exercises, and solved assignments. And on Instagram, students see ads for companies that offer to do their bachelor’s or master’s theses… so they can spend more time on their crush. I myself get ads on X (formerly Twitter) from companies that offer to write research papers in my field and publish them under my name in prestigious journals.\nBut AI is the problem. Sure. Because it lets you do things at scale and at negligible cost. And now we’re all running to it.\nAnd what we want are easy fixes, like taking anti-inflammatories, but applied to teaching quality and to the peace of mind that a dashboard gives you. Systems like Urkund and Turnitin integrate with our Moodles and Sakais and tell us what percentage of our students’ submissions are original or are copies of things online or of their classmates’ work.\nAs Bruce Willis said in “The Kid”: “sandwiches give you security.” And the green color next to a submission gives you peace of mind.\nBut students have been inserting Cyrillic characters that look like Latin letters for years now (see the table of confusable characters at https://github.com/mindcrypt/uriDeep/blob/master/data/deepDiccConfusables.txt), and they get around anti-plagiarism systems without a problem.\nAnd what about systems that detect AI-generated content?\nBack in February 2023 I already said—and it’s on YouTube, etched in silicon—that for texts under a page it’s impossible to say with certainty whether a text was generated by AI or written by a person. Later, researchers at the University of Maryland published “Sadasivan, V. S., Kumar, A., Balasubramanian, S., Wang, W., \u0026amp; Feizi, S. (2023). Can AI-generated text be reliably detected? arXiv:2303.11156 [cs.CL]. 
https://doi.org/10.48550/arXiv.2303.11156”\nWe read in the abstract:\nNext, we provide a theoretical impossibility result indicating that as language models become more sophisticated and better at emulating human text, the performance of the best possible detector declines. For a sufficiently advanced language model aiming to imitate human text, even the best possible detector could perform only marginally better than a random classifier. Our result is general enough to capture specific scenarios such as particular writing styles, clever prompt design, or text paraphrasing.\nIn other words, NO, you cannot reliably detect whether a text was generated by AI. The system won’t be reliable and will produce errors: false negatives and false positives.\nNow, let’s talk ethics. Real AI ethics, not the sham kind. The ethics of a professor who decides to outsource their judgment to a machine that says a student “cheated with AI.” The ethics of a company that sells the service of that machine that… DOES NOT WORK.\nFalse positives can affect a student’s career, or get a researcher blacklisted by a publisher based on an AI discriminator.\n“AI has told me you cheated with AI.” That’s a display of cynicism and laziness, of not wanting to do things properly. Of not stopping to think that if an AI can do the exercise you’re asking for, maybe you should ask for something else.\nMore than one lecturer has told me, to my horror, that her detector for AI-generated content is ChatGPT. And that she believes it. 
For a few months now, ChatGPT has stopped sticking its neck out when asked whether a text was generated by AI.\n","date":"12 February 2024","externalUrl":null,"permalink":"/ludo/en/posts/la-ia-dice-que-has-hecho-los-deberes-con-ia./","section":"Posts","summary":"ChatGPT gives me a summary of the article in The Guardian: https://www.theguardian.com/commentisfree/2024/feb/13/software-student-cheated-combat-ai\nRobert Topinka, a lecturer at Birkbeck, University of London, explores the dilemma professors face with students’ use of AI to write essays. After flagging an essay as “100% AI-generated,” Topinka lands in a tough spot when an exceptionally bright student challenges the accusation. The case highlights the problems with AI detectors like Turnitin, which can mistake legitimate technological support used by students for cheating. Topinka argues we must adapt academic assessment to the AI era, proposing alternatives such as presentations and podcasts to demonstrate students’ critical and original thinking, avoiding unfair accusations and promoting equal educational opportunities.\n","title":"AI says you did your homework with AI.","type":"posts"},{"content":"","date":"12 February 2024","externalUrl":null,"permalink":"/ludo/en/tags/chatgpt/","section":"Tags","summary":"","title":"Chatgpt","type":"tags"},{"content":"In this post I introduce RepoSearch, a semantic search engine for Computer Engineering bachelor\u0026rsquo;s theses (TFGs), developed by Juanan Pereira of UPV/EHU https://github.com/juananpe . Using arcane magic (embeddings, vector databases, vector similarity search), this tool lets you search TFGs semantically (say what you are looking for and it returns results based on the content of the theses, not just the title and more or less well-tagged metadata). It is an impressive initiative. Kudos, Juanan! 
The tool is at: https://semantic.ikasten.io/\n","date":"5 de February de 2024","externalUrl":null,"permalink":"/ludo/posts/buscador-de-tfgs-de-ingenier%C3%ADa-inform%C3%A1tica/","section":"Posts","summary":"In this post I introduce RepoSearch, a semantic search engine for Computer Engineering bachelor’s theses (TFGs), developed by Juanan Pereira of UPV/EHU https://github.com/juananpe . Using arcane magic (embeddings, vector databases, vector similarity search), this tool lets you search TFGs semantically (say what you are looking for and it returns results based on the content of the theses, not just the title and more or less well-tagged metadata). It is an impressive initiative. Kudos, Juanan! The tool is at: https://semantic.ikasten.io/\n","title":"RepoSearch: Semantic Search Engine for Computer Engineering TFGs","type":"posts"},{"content":"","date":"29 January 2024","externalUrl":null,"permalink":"/ludo/en/tags/code/","section":"Tags","summary":"","title":"Code","type":"tags"},{"content":"In this post, I present an important document prepared by the UPC Doctoral School. It is the study “Data and applications: analysis of habits in data management and application use among doctoral students at the UPC North Campus”, based on a survey and a focus group carried out during the months of June and July.\nData and applications: analysis of habits in data management and application use among doctoral students at the UPC North Campus # The aim of the study is twofold:\nExplore how doctoral students at the UPC North Campus manage and use research data. Identify the applications most used in the preparation of doctoral theses. According to the study’s conclusions, highlighted on pages 29-31 of the document, the following points stand out:\n21% of respondents report being informed about specific procedures for data management within their research area. The applications most used by doctoral students are reference managers, Overleaf, and Grammarly. 
Around 35% of the survey participants use ChatGPT or other applications based on generative artificial intelligence. This study provides valuable information about usage patterns and the needs of doctoral students regarding data management and the use of technologies. Its importance lies in understanding how these elements are integrated into current academic practice.\n","date":"29 January 2024","externalUrl":null,"permalink":"/ludo/en/posts/survey-estudiants-doctorat/","section":"Posts","summary":"In this post, I present an important document prepared by the UPC Doctoral School. It is the study “Data and applications: analysis of habits in data management and application use among doctoral students at the UPC North Campus”, based on a survey and a focus group carried out during the months of June and July.\n","title":"Study on the Habits of UPC Doctoral Students in Data Management and Application Use","type":"posts"},{"content":"%%{init: { \u0026rsquo;logLevel\u0026rsquo;: \u0026lsquo;debug\u0026rsquo;, \u0026rsquo;theme\u0026rsquo;: \u0026lsquo;default\u0026rsquo; , \u0026rsquo;themeVariables\u0026rsquo;: { \u0026lsquo;cScale0\u0026rsquo;: \u0026lsquo;#FFD700\u0026rsquo;, \u0026lsquo;cScaleLabel0\u0026rsquo;: \u0026lsquo;#000000\u0026rsquo;, \u0026lsquo;cScale1\u0026rsquo;: \u0026lsquo;#228B22\u0026rsquo;, \u0026lsquo;cScaleLabel1\u0026rsquo;: \u0026lsquo;#FFFFFF\u0026rsquo;, \u0026lsquo;cScale2\u0026rsquo;: \u0026lsquo;#00008B\u0026rsquo;, \u0026lsquo;cScaleLabel2\u0026rsquo;: \u0026lsquo;#FFFFFF\u0026rsquo;, } } }%% timeline title Modern Education Timeline section XIX Century 1800s : Compulsory Education Laws : Schools and Classrooms reflect a factory-like structure : Goal of making the whole society literate : Numerical Grading introduced by William Farish : Standardized Testing emerges : Courses by mail section Early XX Century Early 1900s : Modern School Movement begins : Dewey emphasizes experiential learning : Montessori promotes child individuality : 
Pressey Teaching Machine (1926) 1940s and 50s : Piaget introduces developmental stage theory : B.F. Skinner develops behaviorism : B.F. Skinner Builds GLIDER teaching machine : Radio Broadcasting of Lectures : IBM teaching software section Late XX Century 1960s : First applications of computers in education : Seymour Papert, LOGO Programming Language : Computer-Aided Instruction (CAI) : IBM 1500 Instructional System 1970s : PLATO Instructional System : Distance Universities (OU , UNED, \u0026hellip; ) : VCRs : Microcomputers , BASIC : Using computers becomes a subject of study\n%%{init: { \u0026rsquo;logLevel\u0026rsquo;: \u0026lsquo;debug\u0026rsquo;, \u0026rsquo;theme\u0026rsquo;: \u0026lsquo;default\u0026rsquo; , \u0026rsquo;themeVariables\u0026rsquo;: { \u0026lsquo;cScale0\u0026rsquo;: \u0026lsquo;#FFD700\u0026rsquo;, \u0026lsquo;cScaleLabel0\u0026rsquo;: \u0026lsquo;#000000\u0026rsquo;, \u0026lsquo;cScale1\u0026rsquo;: \u0026lsquo;#228B22\u0026rsquo;, \u0026lsquo;cScaleLabel1\u0026rsquo;: \u0026lsquo;#FFFFFF\u0026rsquo;, \u0026lsquo;cScale2\u0026rsquo;: \u0026lsquo;#00008B\u0026rsquo;, \u0026lsquo;cScaleLabel2\u0026rsquo;: \u0026lsquo;#FFFFFF\u0026rsquo;, } } }%% timeline title Educational Technology Timeline section The Personal Computer Days 1982 : Computer-Aided Education : LOGO : Optical Mark Recognition in Paper Questionnaires 1991 : Multimedia Contents : CD-ROM : Content authoring tools 1996 : Early Web first online courses and campus section The Web goes Mainstream 2003 : Web 2.0 - Dynamic webpages : Modern Learning Management Systems (Moodle, Sakai, Blackboard) : Wikipedia, Blogs : Geographical Information Systems : Online Learning Tools (Forums, Quizzes, Tasks, workshop, Wikis, \u0026hellip;) : Smart whiteboards and Projectors : Clickers : Online Educational Content Standards (Scorm, IMS Cartridge, QTI) : First experiments Virtual Worlds in education (Second Life) 2007 : Social Networks : Educational Robotics (Lego Mindstorms, Arduino\u0026hellip;) : 
Online video contents and lessons (Youtube, Khan Academy \u0026hellip;) : Personal Learning Environments (PLE\u0026rsquo;s) : LMS provided as a Cloud Computing Service : LMS Interoperability Webservices and IMS-LTI section EdTech is mainstream 2010 : Mobile Learning : Very Affordable Laptops: Netbooks /Chromebooks/ iPads : Startups investing in EdTech (Kahoot, Duolingo, Classdojo, Quizlet) : MOOCs 2015 : Learning Analytics : Gamification : Serious Games : Adaptive Learning\n%%{init: { \u0026rsquo;logLevel\u0026rsquo;: \u0026lsquo;debug\u0026rsquo;, \u0026rsquo;theme\u0026rsquo;: \u0026lsquo;default\u0026rsquo; , \u0026rsquo;themeVariables\u0026rsquo;: { \u0026lsquo;cScale0\u0026rsquo;: \u0026lsquo;#FFD700\u0026rsquo;, \u0026lsquo;cScaleLabel0\u0026rsquo;: \u0026lsquo;#000000\u0026rsquo;, \u0026lsquo;cScale1\u0026rsquo;: \u0026lsquo;#228B22\u0026rsquo;, \u0026lsquo;cScaleLabel1\u0026rsquo;: \u0026lsquo;#FFFFFF\u0026rsquo;, \u0026lsquo;cScale2\u0026rsquo;: \u0026lsquo;#00008B\u0026rsquo;, \u0026lsquo;cScaleLabel2\u0026rsquo;: \u0026lsquo;#FFFFFF\u0026rsquo;, } } }%% timeline title Educational Technology new horizons Timeline Section The undiscovered country 2020 : Covid Pandemic : All studies go suddenly online : Massive adoption of Videoconferencing : LMS markets shifting to Big tech 2022 : Generative AI : First encounters with LLMs (GPT-3, Early RAG Systems) : Image Generation Models (Stable Diffusion, Dall-E, Midjourney) 2023 : AI Goes Mainstream : ChatGPT enters by storm in education : Khan Academy Launches Khanmigo Section Things get interesting 2024 : You are here\n","date":"1 January 2024","externalUrl":null,"permalink":"/ludo/en/posts/edtech-timeline/","section":"Posts","summary":"%%{init: { ’logLevel’: ‘debug’, ’theme’: ‘default’ , ’themeVariables’: { ‘cScale0’: ‘#FFD700’, ‘cScaleLabel0’: ‘#000000’, ‘cScale1’: ‘#228B22’, ‘cScaleLabel1’: ‘#FFFFFF’, ‘cScale2’: ‘#00008B’, ‘cScaleLabel2’: ‘#FFFFFF’, } } }%% timeline title Modern Education Timeline section XIX 
Century 1800s : Compulsory Education Laws : Schools and Classrooms reflect a factory-like structure : Goal of making the whole society literate : Numerical Grading introduced by William Farish : Standardized Testing emerges : Courses by mail section Early XX Century Early 1900s : Modern School Movement begins : Dewey emphasizes experiential learning : Montessori promotes child individuality : Pressey Teaching Machine (1926) 1940s and 50s : Piaget introduces developmental stage theory : B.F. Skinner develops behaviorism : B.F. Skinner Builds GLIDER teaching machine : Radio Broadcasting of Lectures : IBM teaching software section Late XX Century 1960s : First applications of computers in education : Seymour Papert, LOGO Programming Language : Computer-Aided Instruction (CAI) : IBM 1500 Instructional System 1970s : PLATO Instructional System : Distance Universities (OU , UNED, … ) : VCRs : Microcomputers , BASIC : Using computers becomes a subject of study\n","title":"EdTech Timelines","type":"posts"},{"content":"","date":"1 January 2024","externalUrl":null,"permalink":"/ludo/en/tags/education/","section":"Tags","summary":"","title":"Education","type":"tags"},{"content":" Table Of Contents # AI-Generated Content Search Engine or Database Content Dealing with Hallucinations What Are Hallucinations? Hallucination Mitigation Strategies Practical strategies and use cases of Chatbots like ChatGPT Language Considerations: for best results, use English Putting Context in context. Text Formatting: a neat trick in LLM Conversations Use text formats like a ninja Other uses of text formats and chatbots. Large Language Models (LLMs) like ChatGPT are a new family of technologies that offer a lot of potential but also come with their own set of challenges. We’re using these AI tools more and more for everything from simple questions to generating complex code. 
So, it’s important to ask: Do we really understand how to get the best out of them?\nIn this blog post, we’ll go over some key best practices for using LLMs. We’ll look at the difference between AI-generated content and information from databases or search engines. We’ll also talk about the issue of “hallucinations,” where the model gives you incorrect or misleading information. And we’ll cover why the language you use to ask questions matters for the quality of the answers you get.\nWhether you’re just curious or planning to integrate LLMs into your projects, this guide aims to help you use these tools more effectively.\nWhere does ChatGPT get its answers from # AI-Generated Content # When you ask a question to a Large Language Model like ChatGPT, it generates an answer based on a huge dataset it was trained on. This dataset includes a wide variety of sources, such as websites, books, and articles. But it’s important to know that these models can’t access real-time or updated information. What you get is based on what the model has learned up to a certain point in time.\nAdditionally, it’s worth noting that the information, while often accurate, is generated in a stochastic or random manner. This means that the possibility of failure or error is always part of the equation. It’s like a secret sauce that can sometimes add flavor but other times spoil the dish.\nSearch Engine or Database Content # In contrast to AI-generated content, information from search engines, databases, or specialized systems like Wolfram Alpha comes from real-time, factual sources. These platforms often provide peer-reviewed or verified information. Unlike Large Language Models, they can give you the most current data available, whether it’s the latest news, scientific research, or statistical data.\nImplications\nUnderstanding the difference between these two types of content is crucial. 
If you’re looking for up-to-date or highly factual information, relying solely on a Large Language Model might not be your best bet. While these models are incredibly versatile and can provide quick answers, they can’t replace the depth and reliability of information you’d get from a specialized database or search engine. However, it’s worth noting that accessing these specialized sources often requires technical skills, like understanding specific user interfaces or programming languages. In contrast, Large Language Models offer the convenience of obtaining information through simple, natural language queries. Knowing when to use each can help you get the most accurate and reliable information, while also considering the ease of access.\nDealing with Hallucinations # What Are Hallucinations? # In the context of Large Language Models (LLMs) like ChatGPT, hallucinations refer to the generation of inaccurate, exaggerated, or completely false information. For example, ChatGPT has been known to invent plausible-sounding answers backed with fake citations, leading users to double-check whether they had accidentally stumbled upon a real phenomenon. In some cases, it has even fabricated financial numbers in news articles or generated fake case precedents in legal briefs.\nHallucinations often occur due to limitations in the training data or the model’s understanding of context. The model doesn’t have the ability to verify the information it generates; it simply produces responses based on patterns in the data it was trained on. This stochastic nature of LLMs means that randomness and the possibility of absolute failure are always part of the equation.\nHallucination Mitigation Strategies # Awareness of Potential Fabrication: The first step is to be aware that LLMs can generate made-up content. This awareness can make you more critical of the answers you receive. 
Assess Your Own Expertise: Your ability to judge the quality of the information is tied to your level of expertise on the topic. If you’re knowledgeable in a subject, you’re better equipped to spot inaccuracies. Critical Evaluation and Validation Strategy: Always critically evaluate the information. Have a strategy for validation that could involve cross-referencing with reliable sources or consulting experts. If something seems too complex or too good to be true, it likely needs further verification. Test Against Reliable Sources: When unsure, test the information against trusted platforms like search engines or Wikipedia. For generated code, always run and test it to ensure it performs as expected. Human Review: In high-stakes areas like legal filings, human review is increasingly necessary. Some jurisdictions even require it to ensure the reliability of AI-generated content. Risk-Adjusted Verification\nThe steps for verifying information should be proportional to the risks and implications of using it. For critical applications like generated code, extensive testing and expert review are essential. For general knowledge questions, a quick cross-reference with a reliable source may suffice.\nBy being mindful of the stakes, you can tailor your verification process to take appropriate precautions. This balanced approach allows you to make the most of what Large Language Models offer, while also safeguarding against their limitations and risks.\nPractical strategies and use cases of Chatbots like ChatGPT # Language Considerations: for best results, use English # Large Language Models like ChatGPT can understand and generate text in multiple languages. However, for best results, use English. A significant portion of their training data is in English, making their responses in this language more nuanced and accurate. 
This is especially important for non-English native speakers.\nIf you need non-English content, a useful strategy is to first prompt the model in English to get the most accurate response. Once you have that, you can then ask ChatGPT to translate the answer into your desired language. This two-step process often yields more reliable results than directly prompting in a non-English language.\nPutting Context in context. # When you’re chatting with ChatGPT, the context includes all the previous conversation you’ve had—both your prompts and the chatbot’s responses. Context is the information that helps the model understand what you’re asking. In Large Language Models like ChatGPT, this context is not just limited to your current query. It also encompasses the entire conversation history, the specific way you phrase your questions, and even the model’s own previous responses. Because the model pays attention to this broader context along with your current prompt and its own training data, it can provide more nuanced and accurate answers. This means that the model not only understands your immediate question but also considers the broader conversation to generate a more context-aware response.\nExample 1: Recipe and Shopping List\nYour Prompt: “Can you give me a recipe for chocolate chip cookies?” ChatGPT’s Response: “Sure, here’s a simple recipe for chocolate chip cookies…” Your Follow-up Prompt: “Great, can you now create a shopping list based on that recipe?” ChatGPT’s Follow-up Response: “Of course, based on the chocolate chip cookie recipe, you’ll need…” In this example, ChatGPT uses its own generated recipe as context for creating a relevant shopping list. This shows how the model’s previous output informs its next action, making the interaction more fluid and personalized.\nExample 2: Educational Assessment\nYour Prompt: We are going to do the assessment of students’ exercises. 
The problem to solve is: x + 4 = 7 The proper response is x = 7 - 4 x = 3 Now I will produce several responses from students. You compare the student’s solution to the proposed solution, then say: A) whether the response is correct or not, B) if it’s not correct, explain where the student went wrong and how to go about fixing it. Do not produce the correct answer, just give orientation to the student. Now just say OK. ChatGPT’s Response: “OK” In this example, you’ve set up a context for ChatGPT to evaluate student responses against a given correct solution. The model will use this context to provide targeted feedback, helping students understand where they might have gone wrong without giving away the correct answer. This is a practical application of how understanding context can lead to more effective educational interactions.\nYour Follow-up Prompt 1: Student 1’s response is: x = 7 + 4 x = 11 ChatGPT’s Follow-up Response 1: A) The response is incorrect. B) The student added 4 instead of subtracting it. To correct this, the student should start by isolating x on one side of the equation by performing the opposite operation on both sides. Your Follow-up Prompt 2: Student 2’s response is: x = 7 / 4 x = 1.75 ChatGPT’s Follow-up Response 2: A) The response is incorrect. B) The student divided by 4 instead of subtracting it. The student should focus on isolating x by using the correct inverse operation. Your Follow-up Prompt 3: Student 3’s response is: x = 7 - 4 x = 3 ChatGPT’s Follow-up Response 3: A) The response is correct. B) The student correctly isolated x by subtracting 4 from both sides of the equation. In this extended example, ChatGPT uses the context provided—the correct solution to the equation x + 4 = 7—to evaluate the students’ responses. It then provides targeted feedback based on how each student’s answer compares to the correct solution. 
This demonstrates the model’s ability to use context for more nuanced and helpful interactions.\nMissing the context is missing the point\nUnderstanding context in your chats with Large Language Models like ChatGPT is crucial. It makes your exchanges more accurate and nuanced. Context helps the model understand the topic, tone, and your specific needs better. This is key for avoiding errors, especially in professional or educational settings. It also allows for more complex interactions, like detailed discussions or using the model for teaching, as shown in our examples.\nSource: Fishkin, Rand. “We Analyzed Millions of ChatGPT User Sessions: Visits are Down 29% since May, Programming Assistance is 30% of Use.” SparkToro. Accessed September 3, 2023.\nInterestingly, research shows (see figure) that almost half of ChatGPT sessions consist of just one (34.5%) or two (14.5%) prompts. This suggests that a large number of users are not fully utilizing the context-aware capabilities of these models. Only about 35% of sessions have five or more prompts. By not engaging in longer conversations, users miss out on the benefits of context, such as more accurate and nuanced responses, especially in complex queries or discussions.\nText Formatting: a neat trick in LLM Conversations # When interacting with ChatGPT, the way you format your text isn’t just a matter of aesthetics or readability; it also impacts semantics, the meaning behind words. 
Proper text formatting acts as a guide for the model, helping it better understand the structure and intent of your query. This, in turn, allows the model to generate responses that are more precise, organized, and tailored to what you’re actually asking for.\nExample 3: Using Bullet Points for Multiple Questions\nYour prompt:\n\u0026quot;Tell me about the benefits of: - Solar energy - Wind energy\u0026quot; ChatGPT’s response:\n\u0026quot;Certainly! - Solar energy is renewable, reduces electricity bills, and has low operating costs. - Wind energy is also renewable, cost-effective, and can be built on existing farms.\u0026quot; In this example, the use of bullet points in the prompt helps to clearly separate multiple questions, making it easier for ChatGPT to address each one individually.\nUse text formats like a ninja # Beyond the basics of bullet points and numbered lists, you can take advantage of more specialized text formats commonly used in computing to enhance your interactions with ChatGPT. Formats like CSV (Comma-Separated Values), JSON (JavaScript Object Notation), and XML (eXtensible Markup Language) can be employed in your prompts. These formats are particularly useful when you’re dealing with data-heavy queries or need the model to generate responses that can be easily integrated into other software or databases.\nFor instance, if you have data in an Excel file that you’d like to analyze, you can export it as a CSV and then use that CSV-formatted data in your prompt to ChatGPT. This makes it easier for the model to understand your query and improves the accuracy of its responses. You can even ask ChatGPT: “How do I export data from Excel to a CSV file?”\nExample 4:\nYour prompt:\n\u0026quot;Sort the following table by Age in ascending order. 
The table is in CSV format: Name,Age,Occupation Alice,34,Engineer Bob,29,Doctor Charlie,45,Teacher David,39,Lawyer Ella,28,Artist \u0026quot; ChatGPT’s response:\n´´´ Name,Age,Occupation Ella,28,Artist Bob,29,Doctor Alice,34,Engineer David,39,Lawyer Charlie,45,Teacher ´´´ A note about Markdown\nChatGPT understands Markdown text formatting, which allows it to display code snippets, tables, headers, links, and even embedded images in its output. However, if you’re using a Markdown-enabled editor for your documents—like Obsidian or Notion—you should be cautious with certain Markdown codes that could disrupt your presentation:\nTriple backticks (\u0026quot;```\u0026quot;): These are used for code blocks in Markdown. If you’re not careful, they can mess up the formatting in your Markdown editor. Triple dashes (\u0026quot;---\u0026quot;): These are often used for horizontal lines, which could unintentionally divide your text. Hash symbols (\u0026quot;#\u0026quot;, “##”, “###”): These are used for headers. Make sure you use them intentionally to structure your document. Asterisks (\u0026quot;*\u0026quot;): These are used for bullet points. Make sure they are aligned properly to maintain the list structure. Tables: Markdown tables can be useful but also tricky. Ensure the columns align correctly for readability.\nUsing Markdown in ChatGPT output\nLet’s go back to the last example: we can ask ChatGPT to output the results in a Markdown table.\nYour prompt:\nSort the following table by Age in ascending order. Format the result **in Markdown**. The table is in CSV format: Name,Age,Occupation Alice,34,Engineer Bob,29,Doctor Charlie,45,Teacher David,39,Lawyer Ella,28,Artist ChatGPT’s response:\n[TABLE]\nThis way, you get the sorted data in a Markdown table, which you can directly include in your Markdown documents. 
On the ChatGPT page you will see the table above, but if you copy and paste it to a plain text editor you will get something like:\n|Name|Age|Occupation| |---|---|---| |Ella|28|Artist| |Bob|29|Doctor| |Alice|34|Engineer| |David|39|Lawyer| |Charlie|45|Teacher| Which is a Markdown representation of a table.\nOther uses of text formats and chatbots. # Example: Getting ChatGPT to Work with Graphs\nYour prompt:\n“Is there any text-based language that can be used to represent graphs?”\nChatGPT Response:\n“Yes, there are several text-based languages and formats that can be used to represent graphs. These languages are often used in conjunction with graph visualization or analysis software. Here are some of them:\nDOT Language: The DOT language is a plain text graph description language that is widely used as part of the Graphviz project. It is straightforward and allows for describing graphs, digraphs, and their attributes. Example:\ndigraph G { A -\u0026gt; B; B -\u0026gt; C; C -\u0026gt; A; } (… response continues with several formats : YAML, GML, JSON GRAPH… )\n“\nYour Follow-Up Prompt:\n“Generate a .dot file representing a sample graph with the 10 most populated cities in Europe, their high-speed train connections, and the number of kilometers for each direct connection.”\nChatGPT Response:\n“Here is a sample .dot file representing a graph of the 10 most populated cities in Europe, along with their high-speed train connections and the distances between them.”\ndigraph EuropeCities { \u0026quot;London\u0026quot; -\u0026gt; \u0026quot;Paris\u0026quot; [label=\u0026quot;450 km\u0026quot;]; \u0026quot;Paris\u0026quot; -\u0026gt; \u0026quot;Madrid\u0026quot; [label=\u0026quot;1,050 km\u0026quot;]; \u0026quot;Madrid\u0026quot; -\u0026gt; \u0026quot;Rome\u0026quot; [label=\u0026quot;1,400 km\u0026quot;]; \u0026quot;Rome\u0026quot; -\u0026gt; \u0026quot;Berlin\u0026quot; [label=\u0026quot;1,100 km\u0026quot;]; \u0026quot;Berlin\u0026quot; -\u0026gt; 
\u0026quot;Moscow\u0026quot; [label=\u0026quot;1,800 km\u0026quot;]; \u0026quot;Moscow\u0026quot; -\u0026gt; \u0026quot;Saint Petersburg\u0026quot; [label=\u0026quot;700 km\u0026quot;]; \u0026quot;Saint Petersburg\u0026quot; -\u0026gt; \u0026quot;Kiev\u0026quot; [label=\u0026quot;1,100 km\u0026quot;]; \u0026quot;Kiev\u0026quot; -\u0026gt; \u0026quot;Istanbul\u0026quot; [label=\u0026quot;1,300 km\u0026quot;]; \u0026quot;Istanbul\u0026quot; -\u0026gt; \u0026quot;Bucharest\u0026quot; [label=\u0026quot;600 km\u0026quot;]; \u0026quot;Bucharest\u0026quot; -\u0026gt; \u0026quot;Berlin\u0026quot; [label=\u0026quot;1,400 km\u0026quot;]; \u0026quot;London\u0026quot; -\u0026gt; \u0026quot;Berlin\u0026quot; [label=\u0026quot;1,100 km\u0026quot;]; \u0026quot;Paris\u0026quot; -\u0026gt; \u0026quot;Berlin\u0026quot; [label=\u0026quot;1,050 km\u0026quot;]; \u0026quot;Paris\u0026quot; -\u0026gt; \u0026quot;Rome\u0026quot; [label=\u0026quot;1,100 km\u0026quot;]; \u0026quot;Madrid\u0026quot; -\u0026gt; \u0026quot;Berlin\u0026quot; [label=\u0026quot;1,900 km\u0026quot;]; \u0026quot;Berlin\u0026quot; -\u0026gt; \u0026quot;Kiev\u0026quot; [label=\u0026quot;1,300 km\u0026quot;]; \u0026quot;Moscow\u0026quot; -\u0026gt; \u0026quot;Kiev\u0026quot; [label=\u0026quot;870 km\u0026quot;]; \u0026quot;Kiev\u0026quot; -\u0026gt; \u0026quot;Bucharest\u0026quot; [label=\u0026quot;800 km\u0026quot;]; \u0026quot;Istanbul\u0026quot; -\u0026gt; \u0026quot;Moscow\u0026quot; [label=\u0026quot;2,200 km\u0026quot;]; \u0026quot;Rome\u0026quot; -\u0026gt; \u0026quot;Madrid\u0026quot; [label=\u0026quot;1,400 km\u0026quot;]; \u0026quot;Rome\u0026quot; -\u0026gt; \u0026quot;Paris\u0026quot; [label=\u0026quot;1,100 km\u0026quot;]; } Now you can do things like asking ChatGPT to find the optimal path from Madrid to Saint Petersburg and highlight that path in the .dot code. I got GPT-4 to create the code for this graph in two extra prompts. Interestingly, GPT-4 used Dijkstra’s algorithm to solve the problem. 
I will give you the resulting image and leave it as a challenge.\nIf you need to know how to generate the image from the .dot code… just ask ChatGPT.\nLast prompts # According to research, more than 23% of the use of ChatGPT is related to education. Both teachers and students are using ChatGPT, and the new advanced AI chatbots that are coming online (such as Google’s Bard, Open Assistant, Meta’s Llama 2, etc.) are finding good use cases to assist them in their daily tasks of teaching and learning. A good understanding of these tools, their capabilities, and limitations is crucial for responsible and ethical use, especially in educational settings.\nIn addition to everything covered so far in this article, there are a few final points worth considering:\nTools like ChatGPT are here to stay; they are neither a fad nor a passing trend. They represent a new era of internet and software technology, and their impact will be monumental. To draw a comparison with web technologies, we’re currently in the equivalent of the late ’90s, tinkering with static web pages. The technology is set to improve exponentially over the next five years, and its impact on our lives will be as transformative as the combined effects of PCs, the internet, and smartphones. The sophisticated use of ChatGPT and other Large Language Models (LLMs) is becoming increasingly difficult to detect. Current research suggests that it’s unlikely we’ll develop a reliable ‘AI-generated content’ detector anytime soon. Those claiming to offer such tools are essentially selling snake oil. As of June 2023, ChatGPT has introduced a useful feature: the ability to share a permalink to a conversation. This is incredibly valuable as it allows us to showcase the work done with the chatbot to produce specific content. For instance, this very document was developed using ChatGPT in a conversation that spanned several hours. 
I’m providing the link as a showcase of how effectively a chatbot can be used to produce quality work. Or at least as good a quality as this article. Thank you for reading to the end.\n","date":"1 September 2023","externalUrl":null,"permalink":"/ludo/en/posts/good_bad_alucinated/","section":"Posts","summary":"Table Of Contents # AI-Generated Content Search Engine or Database Content Dealing with Hallucinations What Are Hallucinations? Hallucination Mitigation Strategies Practical strategies and use cases of Chatbots like ChatGPT Language Considerations: for best results, use English Putting Context in context. Text Formatting: a neat trick in LLM Conversations Use text formats like a ninja Other uses of text formats and chatbots. Large Language Models (LLMs) like ChatGPT are a new family of technologies that offer a lot of potential but also come with their own set of challenges. We’re using these AI tools more and more for everything from simple questions to generating complex code. So, it’s important to ask: Do we really understand how to get the best out of them?\n","title":"ChatGPT in Practice: Strategies for Optimal Interactions","type":"posts"},{"content":"","date":"1 September 2023","externalUrl":null,"permalink":"/ludo/en/tags/chatgtp/","section":"Tags","summary":"","title":"Chatgtp","type":"tags"},{"content":"","date":"1 September 2023","externalUrl":null,"permalink":"/ludo/en/tags/course/","section":"Tags","summary":"","title":"Course","type":"tags"},{"content":"","date":"1 September 2023","externalUrl":null,"permalink":"/ludo/en/tags/llm/","section":"Tags","summary":"","title":"Llm","type":"tags"},{"content":" Table Of Contents # 1 Embeddings 1.2 Practical Applications of Embeddings 2 The context in LLMs based on transformers 2.1 Context as Ad Hoc Training 3 Self-Referential Context and Programmability in LLMs References Generative AI models, especially Large Language Models (LLMs) like GPT and its successors, have become a pivotal force in the 
advancement of artificial intelligence. While these models have gained prominence for their capabilities in natural language processing (NLP)—including tasks such as sentiment analysis, machine translation, and content generation—their applications extend beyond the realm of NLP [11].\nOne of the key objectives of this paper is to explore the specific features and architecture of generative AI models that make them a foundational technology for the development of Smart Learning Applications. This exploration serves as a part of a broader inquiry into the capabilities, limitations, and potential applications of these models. While the growing field of AI offers options for training and fine-tuning models, thanks to open-source initiatives like Llama 2 and platforms like OpenAssistant, this paper takes a different angle. We seek to understand how the existing functionalities of readily available models can be effectively utilized to develop Smart Learning Applications without requiring extensive modifications.\nBy examining critical components such as embeddings and the model’s ability to understand context, this paper aims to provide a nuanced understanding that can guide the development of educational technology and open up new avenues for innovation.\n1 Embeddings # At the heart of LLMs is the concept of “embedding.” In NLP, embeddings serve the purpose of transforming linguistic entities—whether they are words, phrases, or entire documents—into numerical vectors of fixed dimensions. Mikolov et al. introduced the Word2Vec model, a popular method for generating word embeddings, which has been foundational in the development of embeddings in LLMs [12]. This transformation is pivotal as it allows textual data to be represented in a manner that is both computationally efficient and semantically rich. 
Through the process of embedding, LLMs are equipped to discern intricate patterns and relationships in language.\n1.2 Practical Applications of Embeddings # LLMs utilize embeddings for a myriad of tasks. From sentiment analysis and machine translation to the generation of content, the capabilities of LLMs are vast. The inherent numerical nature of embeddings facilitates operations that can deduce relationships, draw analogies, and discern nuances in language. In the realm of machine translation, embeddings have been instrumental in identifying equivalent terms across languages, ensuring that translations maintain their intended meaning and accuracy [13].\nA notable example of the practical application of embeddings is provided by ChatPDF.com, a turnkey system that allows for the embedding of entire PDFs, spanning hundreds of pages. This system enables a chat interface with the document, offering an innovative approach to document interaction. ChatPDF.com also provides an API for developers, making it easier to integrate this embedding-friendly model into various applications [14].\nFig. 2: ChatPDF enables a chat conversation with the contents of a document. In the example, a conversation with the paper “Attention is all you need”.\n2 The context in LLMs based on transformers # In the realm of LLMs, particularly models like GPT-4 and BERT based on the transformer architecture, the term “context” denotes the immediate surrounding information that the model leverages to generate a response [15, 16]. For GPT-4, this context is derived from the preceding text in a conversation or document. Such context is indispensable as it provides the model with insights into the ongoing topic, the tone and style of the conversation, and any specific instructions or constraints.\nAn essential mechanism that enables this contextual understanding is the “attention” mechanism. In transformer architectures, attention allows the model to weigh different parts of the input text differently. 
This means that when generating a response, the model doesn’t treat all words or tokens in the context equally. Instead, it “attends” more to certain parts that are more relevant to the query or prompt at hand. For example, if the conversation is about climate change, words like “emissions,” “carbon,” and “temperature” might receive higher attention weights. This attention mechanism works in tandem with the context to produce more accurate and contextually appropriate responses.\nHowever, the understanding of context in LLMs extends beyond just the immediate preceding text. Given the vast datasets these models are trained on, they possess a comprehensive understanding of a multitude of topics. When presented with a specific context, the model delves into this extensive knowledge base, homing in on the relevant segments to craft an appropriate response.\n2.1 Context as Ad Hoc Training # The proposition of utilizing context as a form of ad hoc training presents a novel approach to interacting with LLMs. Howard and Ruder introduced the idea of fine-tuning pre-trained models for specific tasks, which is somewhat related to the idea of ad hoc training [16].\nBy furnishing an LLM with a distinct context or a set of instructions, users have the capability to “guide” the model’s responses in real-time. This method of providing contextual guidance effectively acts as instantaneous training, molding the model’s behavior without necessitating alterations to its foundational architecture or weights.\nIn our exploration of the role of context in language models, we present two contrasting figures to highlight the difference that context can make in the responses generated by a chatbot.\nFig. 3 depicts a scenario where a chatbot, trained on data up to 2021, is asked to write a two-paragraph essay about the 2023 FIFA Women’s World Cup from the perspective of a 10-year-old girl. 
In this case, the chatbot lacks the specific context of the event and therefore produces a response based on its pre-2021 training, lacking in specific details about the event.\nFig. 3: A screenshot of a chatbot interaction where the chatbot, trained on data up to 2021, is asked to write a two-paragraph essay about the 2023 FIFA Women’s World Cup from the perspective of a 10-year-old girl. The chatbot’s response lacks specific details about the event.\nOn the other hand, Fig. 4 shows the same chatbot, but this time it is provided with context in the form of a text snippet from the Wikipedia page about the 2023 FIFA Women’s World Cup. When asked the same question, the chatbot is able to generate a detailed response, discussing the sports event, the results, participating countries, and individuals involved, as if it had been specifically trained on that information.\nFig. 4: A screenshot of the same chatbot interaction, but this time the chatbot is provided with a text snippet from the Wikipedia page about the 2023 FIFA Women’s World Cup. When asked the same question, the chatbot generates a detailed response, discussing the sports event, the results, participating countries, and individuals involved, showcasing the power of contextual information.\n3 Self-Referential Context and Programmability in LLMs # An often-overlooked aspect of context in LLMs is the self-referential nature of their responses. As an LLM generates a response, that output becomes part of the ongoing context for subsequent interactions. This dynamic feature enables a unique form of programmability and direction-following in LLMs.\nOne aspect of context in advanced Large Language Models (LLMs) like GPT-4 and GPT-3.5 that warrants closer attention is the self-referential nature of their generated responses. While earlier models in the GPT series have some ability to use context within a single interaction, this capability has been significantly enhanced in more recent versions. 
Advanced LLMs can incorporate their own generated text into the context for future interactions, as represented in Fig. 5, meaning that as the model generates a response, that newly generated content doesn’t just serve as an answer to a query; it also becomes part of the evolving context that informs subsequent responses.\nFig. 5: The generated response is incorporated into the ongoing context, influencing the LLM’s attention mechanism as the answer continues to be generated in a recursive manner.\nThis feature introduces a level of dynamic context updating, allowing the model to follow directions or carry out tasks in a more nuanced manner. For example, if an advanced LLM like GPT-4 or GPT-3.5 is asked to generate a recipe and then create a shopping list based on that recipe, the model can use the ingredients listed in its own generated recipe as context for compiling the shopping list. This not only shows the model’s ability to understand and maintain context but also highlights its capability to be “programmable” within the scope of a single interaction. This self-referential context updating makes advanced LLMs versatile tools for more complex, multi-step tasks and interactions.\nTo illustrate this, in Fig. 6 we asked GPT-4 to generate a recipe for spaghetti bolognese and then create a shopping list based on that recipe. The model first listed the ingredients and steps for the dish and then used this information to compile a shopping list. This shows that the model can use its own generated text as context for a subsequent task within the same interaction.\nThis example highlights a straightforward but important feature: the model’s ability to update its context dynamically. The shopping list isn’t just a separate output; it’s directly related to the recipe the model itself provided. This demonstrates that language models can follow directions from their own generated text, allowing for more nuanced and context-aware interactions.\nFig. 
6: A screenshot of a chatbot interaction where the user asks GPT-4 for a spaghetti bolognese recipe followed by a shopping list. The chatbot first generates the recipe and then uses it as context to create a shopping list, demonstrating the concept of Dynamic Context Updating.\nIn an educational setting, the self-referential context can be leveraged to enable a chatbot to review, grade, or provide feedback on student exercises. Typically, when presented with the text of an exercise and a student’s solution, both GPT-3.5 and GPT-4 tend to perform poorly in grading and offering feedback. However, the performance improves significantly when the chatbot is instructed to first solve the problem itself, then compare the student’s solution with its own generated solution, and finally provide a grade and feedback. The results using this approach are markedly better.\nprompt = f\u0026quot;\u0026quot;\u0026quot; Determine if the student's solution is correct or not. Question: I'm building a solar power installation and I need \\\\ help working out the financials. - Land costs $100 / square foot - I can buy solar panels for $250 / square foot - I negotiated a contract for maintenance that will cost \\\\ me a flat $100k per year, and an additional $10 / square \\\\ foot What is the total cost for the first year of operations as a function of the number of square feet. Student's Solution: Let x be the size of the installation in square feet. Costs: 1. Land cost: 100x 2. Solar panel cost: 250x 3. Maintenance cost: 100,000 + 100x Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000 \u0026quot;\u0026quot;\u0026quot; response = get_completion(prompt) print(response) The code shown above presents an example of a prompt asking the chatbot to determine if a student’s solution is correct or not. The results will be poor using this approach. 
But in the next fragment of code we can see how carefully instructing the model in the prompt to first solve the problem and then consider the student’s solution gives much improved results (Source: OpenAI Cookbook [18]).\nprompt = f\u0026quot;\u0026quot;\u0026quot; Your task is to determine if the student's solution \\\\ is correct or not. To solve the problem do the following: - First, work out your own solution to the problem. - Then compare your solution to the student's solution \\\\ and evaluate if the student's solution is correct or not. Don't decide if the student's solution is correct until you have done the problem yourself. Use the following format: Question: --- question here --- Student's solution: --- student's solution here --- Actual solution: --- steps to work out the solution and your solution here --- Is the student's solution the same as actual solution \\\\ just calculated: --- yes or no --- Student grade: --- correct or incorrect --- Question: --- I'm building a solar power installation and I need help \\ working out the financials. - Land costs $100 / square foot - I can buy solar panels for $250 / square foot - I negotiated a contract for maintenance that will cost \\ me a flat $100k per year, and an additional $10 / square \\ foot What is the total cost for the first year of operations \\ as a function of the number of square feet. --- Student's solution: --- Let x be the size of the installation in square feet. Costs: 1. Land cost: 100x 2. Solar panel cost: 250x 3. Maintenance cost: 100,000 + 100x Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000 --- Actual solution: \u0026quot;\u0026quot;\u0026quot; response = get_completion(prompt) print(response) 4 The Evolution of Contextual Understanding: The System Message.\nIn March 2023, OpenAI introduced a groundbreaking feature in the API of their GPT-3.5 and GPT-4 models: the “system message” [18]. 
Prior to this update, the API allowed for the model to receive prompts and context through an array of messages categorized as either “user” (messages prompted by the user) or “assistant” (previous responses generated by the model). The system message, however, is designed to be the first message in this array.\nThe system message serves a critical function: it outlines the role and behavioral parameters expected of the LLM. This allows for a more nuanced and directed interaction with the model, as it provides the LLM with guidelines on how to respond to subsequent prompts.\nThe introduction of the system message significantly enhances the user’s ability to guide the model’s behavior in real-time, effectively serving as an extension of the ad hoc training concept discussed earlier. It offers users a more refined tool for customizing the model’s responses, thereby elevating the level of interaction to a more dynamic and tailored experience.\nThe following fragment shows Python code which demonstrates how to use the system message feature to set the behavior of the model as a Socratic mentor.\nimport openai ... # Define the system message to set the behavior of the model as a Socratic mentor system_message = { 'role': 'system', 'content': 'You are a Socratic mentor. Engage in thoughtful dialogue, ask probing questions, and guide the user to deeper understanding.' } # Define a user message user_message1 = { 'role': 'user', 'content': 'What is the meaning of life?' } # Define an assistant message (a previous response from the model, if any) agent_message1 = { 'role': 'assistant', 'content': 'The meaning of life is a deeply philosophical question. What do you think it is?' } # Define another user message user_message2 = { 'role': 'user', 'content': 'I think it is to find happiness.' 
} # Combine all messages into a list messages = [system_message, user_message1, agent_message1, user_message2] # Make an API call to GPT-4 (chat endpoint of the pre-1.0 openai SDK) response = openai.ChatCompletion.create( model=\u0026quot;gpt-4\u0026quot;, messages=messages ) This code snippet illustrates how the system message can be used to guide the model’s behavior, making the interaction more dynamic and tailored to individual needs.\nReferences # [12] Mikolov, T., et al. “Efficient estimation of word representations in vector space.” arXiv preprint arXiv:1301.3781 (2013).\n[13] Vaswani, A., et al. “Attention is all you need.” Advances in neural information processing systems. 2017.\n[14] “API Backend Documentation.” ChatPDF.com. Accessed August 25, 2023. https://www.chatpdf.com/docs/api/backend.\n[15] Devlin, J., et al. “BERT: Pre-training of deep bidirectional transformers for language understanding.” arXiv preprint arXiv:1810.04805 (2018).\n[16] Howard, J., and Ruder, S. “Universal language model fine-tuning for text classification.” arXiv preprint arXiv:1801.06146 (2018).\n[17] Suárez, Diego. “How to write ‘System’ Instructions for OpenAI’s GPT-4 Chat API.” Rootstrap Blog, April 25, 2023.\n[18] OpenAI. “OpenAI Cookbook: Examples and guides for using the OpenAI API.” GitHub repository. Last modified August 22, 2023. https://github.com/openai/openai-cookbook. Accessed August 23, 2023.\n","date":"31 August 2023","externalUrl":null,"permalink":"/ludo/en/posts/embeddings/","section":"Posts","summary":"Table Of Contents # 1 Embeddings 1.2 Practical Applications of Embeddings 2 The context in LLMs based on transformers 2.1 Context as Ad Hoc Training 3 Self-Referential Context and Programmability in LLMs References Generative AI models, especially Large Language Models (LLMs) like GPT and its successors, have become a pivotal force in the advancement of artificial intelligence. 
While these models have gained prominence for their capabilities in natural language processing (NLP)—including tasks such as sentiment analysis, machine translation, and content generation—their applications extend beyond the realm of NLP [11].\n","title":"Embeddings, Context uses and Self Referencing in LLMS","type":"posts"},{"content":"Oceans as tools for capturing carbon\nOceans are a massive carbon sponge, absorbing a quarter of global CO2 emissions. That makes them a powerful tool to fight climate change. A company called Equatic is testing a bold idea to remove carbon dioxide from the atmosphere: pull it directly out of the ocean. The company is piloting this idea with a barge in Los Angeles that removes 100 kg of CO2 from seawater every day.\nThe direct ocean capture process\nCaptura’s process works like a large desalination plant. It continuously draws in seawater, separating 1% of it to turn into concentrated brine, which the company runs through a patented electrolysis machine. Electrolysis, a process used across various industries, including desalination, sends water through a series of electrified membranes. Each membrane separates different molecules in the water, forming acidic and alkaline solutions. The acid is added to the remaining ocean water, where it triggers a reaction with bicarbonate ions, causing CO2 to come out as bubbles. The company then adds the alkali to the water to neutralize it before releasing it.\nThe X-Prize and its influence on carbon capture\nCompanies like Equatic and Captura have emerged in part thanks to the X-Prize competitions, international contests that reward innovative solutions to global problems. These prizes are effective because they incentivize research and development in key areas, such as carbon capture, and attract the attention of investors and the general public. 
The X-Prize has been a driving force behind many technological advances and has helped accelerate progress in areas such as carbon capture, global health, and space exploration.\nAdvantages and disadvantages of direct ocean capture\nCompared with air capture, direct ocean capture (DOC) is less explored territory. But Equatic, Captura, and others are trying to demonstrate the concept’s viability with support from the Chan Zuckerberg Initiative, the Gates Foundation, oil and gas companies, and the Advanced Research Projects Agency–Energy (ARPA-E). Pulling carbon from ocean water, where it is present at a concentration 150 times higher than in the atmosphere, is more efficient than air capture.\nAlternative direct ocean capture technologies\nMeanwhile, at MIT, chemical engineering professor T. Alan Hatton, mechanical engineering professor Kripa Varanasi, and their colleagues have devised a DOC technology that minimizes the use of membranes, making it even simpler and cheaper than electrolysis. The “reactive flow” system sends water through two electrochemical cells in tandem. In the first cell, a bismuth electrode releases protons that separate CO2 from the water’s carbonates and bicarbonates, while a silver chloride electrode releases chloride ions to balance the charge.\nEnvironmental considerations\nAs companies and investors push ahead with DOC projects, environmental organizations are urging caution. It will be important for ocean carbon removal technologies to advance responsibly and “not endanger the health of ocean ecosystems or those who depend on them.” The climate crisis is so clear and severe that we need ocean-based carbon removal. But there are natural, proven ways to do it, such as conserving and restoring seagrass meadows or growing seaweed. 
Electrochemical DOC technologies simply have too many unknowns to spend limited research budgets on, given the limited time we have.\nThis blog post was inspired by Prachi Patel’s article “Using The Oceans To Help Capture Carbon,” published in IEEE Spectrum. https://spectrum.ieee.org/direct-ocean-carbon-capture\nChatGPT helped me with the writing and with a good part of the summary. Here’s the conversation behind its creation. GPT-4 + Webpilot plugin https://chat.openai.com/share/5931adec-3d6b-4c91-add2-bdb045696cb2\n","date":"16 August 2023","externalUrl":null,"permalink":"/ludo/en/posts/post-1-1/","section":"Posts","summary":"Oceans as tools for capturing carbon\nOceans are a massive carbon sponge, absorbing a quarter of global CO2 emissions. That makes them a powerful tool to fight climate change. A company called Equatic is testing a bold idea to remove carbon dioxide from the atmosphere: pull it directly out of the ocean. The company is piloting this idea with a barge in Los Angeles that removes 100 kg of CO2 from seawater every day.\n","title":"Direct CO2 Capture in the Ocean: A New Tool to Fight Climate Change","type":"posts"},{"content":"","date":"16 August 2023","externalUrl":null,"permalink":"/ludo/en/tags/sostenibilitat/","section":"Tags","summary":"","title":"Sostenibilitat","type":"tags"},{"content":"","date":"2 August 2023","externalUrl":null,"permalink":"/ludo/en/tags/ethics/","section":"Tags","summary":"","title":"Ethics","type":"tags"},{"content":"Organizations linked to free software such as Creative Commons, GitHub, and Hugging Face have made public a document in which they offer suggestions on how to improve the AI Act to protect and foster the development of free software.\nThe document is at https://huggingface.co/blog/assets/eu_ai_act_oss/supporting_OS_in_the_AIAct.pdf and here is an executive summary*:\nSupport for Open Source and Open Science in the EU AI Act # Open source, non-profit and academic research and development play an essential 
role in the artificial intelligence (AI) ecosystem. Continuing to support and foster this open ecosystem will be key to ensuring that technology serves all EU citizens for two main points: ● First, the values of solid research, reproducibility and transparency fostered by open science are fundamental to the development of safe and responsible AI systems. ● Second, open source development can enable competition and innovation by new entrants and smaller players, including in the EU. The AI Act promises to set a global precedent in AI regulation to address its risks while encouraging innovation. By supporting the growing approach of the open ecosystem towards AI, the regulation has an important opportunity to deepen this objective through greater transparency and collaboration between stakeholders. Unfortunately, current proposals threaten to create impracticable barriers and disadvantages for contributors to this open ecosystem. The signatory organizations represent both commercial and non-profit stakeholders in the open source AI ecosystem. Below, we make 5 concrete suggestions to ensure that the AI Act works for open source:\nClearly define AI components Clarify that the collaborative development of open source AI components and making them available in public repositories does not subject developers to the requirements of the AI Act, improving and expanding Recitals 12a-c of the Parliament text and Article 2(5e). Support the coordination of the AI Office and inclusive governance with the open source ecosystem, based on the Parliament text. Ensure that the R\u0026amp;D exception is practical and effective, allowing limited trials in real conditions, combining aspects of the Council approach and a modified version of Parliament\u0026rsquo;s Article 2(5d). 
Establish proportional requirements for \u0026ldquo;base models\u0026rdquo;, recognizing and treating differently different uses and development modalities, including open source approaches, adapting Parliament\u0026rsquo;s Article 28b. (translated with AI) ","date":"2 August 2023","externalUrl":null,"permalink":"/ludo/en/posts/open-source-normativa-eu-document/","section":"Posts","summary":"Organizations linked to free software such as Creative Commons, Github, Huggingface have made public a document in which they offer suggestions on how to improve the AI Act to protect and foster the development of free software.\n","title":"Organizations linked to free software ask the EU to review the AI Act to protect free software","type":"posts"},{"content":"The Enigma machine, the cryptographic contraption that played a key role during the Second World War, is a powerful symbol of our technological past. Developed by Nazi Germany to encrypt its messages, this machine was considered inscrutable until the arrival of British mathematician Alan Turing, who managed to decipher its code in 1943.\nAn original Enigma model, specifically the K723, is the crown jewel of the IV/22 Signals Battalion, preserved in perfect working order since 2007. This unique piece is displayed alongside other historical relics of military telecommunications at the Bruc barracks in Pedralbes.\nThe origins of Enigma date back to 1923, when it was manufactured by a Berlin company. The machine was originally developed for commercial use, but was soon adopted by the German army to transmit messages with maximum security. In 1936, Adolf Hitler sent a batch of Enigma machines to Franco as a contribution to the coup-plotters’ side, turning the Spanish Civil War into the first wartime testing ground for Enigma machines.\nAfter the war, these machines were relegated to warehouses, forgotten until a few years ago when they were rediscovered and turned into museum pieces. 
The Bruc’s K723 is one of the last Enigmas that arrived in Spain, probably during the final stages of the Civil War.\nAlthough the Enigma machines sent to Franco were not as advanced as those used during the Second World War, they were still powerful enough to offer more than 1.8 million possible letter combinations. The Enigma machines used by the Nazis in the Second World War were even more sophisticated, with up to five interchangeable rotors that greatly expanded the encryption options.\nThe story of the Enigma machine and its decryption by Alan Turing is not only a fascinating part of the history of computing, but also a reminder of how cryptography has evolved over time. Today, in a world dominated by Google, Microsoft and other tech giants, the importance of information security and cryptography remains as relevant as it was in Enigma’s day.\nIn the era of artificial intelligence, cryptography is a vital tool to ensure the privacy and security of our data. And as historians of computing, it’s vital that we remember the lessons of the past in order to better understand the present and prepare for the future. As the writer and philosopher George Santayana said, “Those who cannot remember the past are condemned to repeat it.” This is especially true in computing and information security, where lessons from the past can help us navigate an uncertain future.\nA fun way to learn more is Neal Stephenson’s novel Cryptonomicon (Stephenson, N. (2003). Criptonomicón. (M. Barceló, prologue). Ediciones B - Nova.)\n","date":"1 August 2023","externalUrl":null,"permalink":"/ludo/en/posts/enigma-a-l-costat-del-campus-nord/","section":"Posts","summary":"The Enigma machine, the cryptographic contraption that played a key role during the Second World War, is a powerful symbol of our technological past. 
Developed by Nazi Germany to encrypt its messages, this machine was considered inscrutable until the arrival of British mathematician Alan Turing, who managed to decipher its code in 1943.\n","title":"An Enigma from the Civil War is preserved next to UPC’s North Campus","type":"posts"},{"content":"","date":"1 August 2023","externalUrl":null,"permalink":"/ludo/en/tags/enigma/","section":"Tags","summary":"","title":"Enigma","type":"tags"},{"content":"","date":"1 August 2023","externalUrl":null,"permalink":"/ludo/en/tags/hist%C3%B2ria-de-linform%C3%A0tica/","section":"Tags","summary":"","title":"Història-De-Linformàtica","type":"tags"},{"content":"Found that on the former Twitter and I thought it was on point.\n","date":"17 July 2023","externalUrl":null,"permalink":"/ludo/en/posts/ai-panic/","section":"Posts","summary":"Found that on the former Twitter and I thought it was on point.\n","title":"Ai panic","type":"posts"},{"content":"The article “The Tech Revolution: 10 Predictions about AI, Internet, and Possibly Our World in 3 Years” by David Poblador i Garcia presents a series of predictions about how AI and the internet could transform our world over the next three years. 
Here is a summary of his predictions, which I must say I share:\nLarge Language Models (LLMs) will become the definitive interface for interacting with knowledge, making conventional search engines almost obsolete.\nMost LLMs will run offline, freeing us from the need for constant internet connectivity.\nGenerative AI models will become specialists, providing deep knowledge in specific domains.\nThe traditional client-server computing model will give way to more powerful devices and more advanced LLMs.\nThe multibillion-dollar online advertising industry will have to find a way to move offline or risk becoming obsolete.\nComplex tasks such as hotel bookings, restaurant reservations, or arranging shipments will be carried out by interacting with improved APIs that collaborate with AI models running locally.\nAn extensive market for AI integrations will emerge.\nAs corporate websites decline in importance, the internet will turn into a more personal space for expression.\nDevice manufacturers and operating system builders will find themselves on the front line of this new battle.\nContent creators will find new avenues in the AI era, enriching LLMs with their specialized knowledge.\nThe author concludes that this transformation could happen faster than initially thought, possibly in less than three years.\nI have more links to articles and sources about AI here\n","date":"10 de June de 2023","externalUrl":null,"permalink":"/ludo/posts/10-prediccions-de-david-poblador/","section":"Posts","summary":"The article “The Tech Revolution: 10 Predictions about AI, Internet, and Possibly Our World in 3 Years” by David Poblador i Garcia presents a series of predictions about how AI and the internet could transform our world over the next three years. 
Here is a summary of his predictions, which I must say I share:\n","title":"10 predictions about AI and the Internet that I share","type":"posts"},{"content":"","date":"1 June 2023","externalUrl":null,"permalink":"/ludo/en/tags/research/","section":"Tags","summary":"","title":"Research","type":"tags"},{"content":"Engineering Education Journals\n","date":"1 June 2023","externalUrl":null,"permalink":"/ludo/en/posts/donde-publicar-investigaci%C3%B3n-en-engineering-education/","section":"Posts","summary":"Engineering Education Journals\n","title":"Where to publish Research in Engineering Education","type":"posts"},{"content":"After decades of using free online services, and seeing the consequences of being the product and not the customer (see Shoshana Zuboff’s “The Age of Surveillance Capitalism”, or the Netflix documentary “The Social Dilemma”), I’m getting less stingy and I try to pay for services that are truly useful to me, especially for my work.\nThat said.\nChatGPT gives completely different results if we use the free model (GPT-3.5-turbo) or the paid model (GPT-4 and GPT-4 plus plug-ins). Since I’ve had access to GPT-4 I use it daily for many tasks, and often my documents are the result of a dialogue with the machine. If I think of GPT-4 as an intern, at intern rates it pays for itself in a single day, or in a single morning.\nOn the other hand, our students will surely use the totally free version, as we would have at their age. And we need to understand how both versions work and what they can do.\nIn any case, using the GPT-4 version gives you a sense of the quality of what’s coming. 
My impression is that the jump from GPT-3.5 to GPT-4 is like going from nothing to GPT-3.5.\n","date":"19 May 2023","externalUrl":null,"permalink":"/ludo/en/posts/pagas-por-chatgpt/","section":"Posts","summary":"After decades of using free online services, and seeing the consequences of being the product and not the customer (see Shoshana Zuboff’s “The Age of Surveillance Capitalism”, or the Netflix documentary “The Social Dilemma”), I’m getting less stingy and I try to pay for services that are truly useful to me, especially for my work.\n","title":"Do you pay for ChatGPT?","type":"posts"},{"content":"","date":"19 May 2023","externalUrl":null,"permalink":"/ludo/en/tags/opini%C3%B3n/","section":"Tags","summary":"","title":"Opinión","type":"tags"},{"content":"","date":"19 May 2023","externalUrl":null,"permalink":"/ludo/en/tags/upc/","section":"Tags","summary":"","title":"Upc","type":"tags"},{"content":"","date":"17 May 2023","externalUrl":null,"permalink":"/ludo/en/tags/guia-instalaci%C3%B3n/","section":"Tags","summary":"","title":"Guia-Instalación","type":"tags"},{"content":"In this post I explain how to install and use a small Python script I wrote. This script connects to the OpenAI API to use GPT-3.5-turbo to translate files from one language to another, regardless of length. You need an OpenAI API key to use the program. This API has a cost, but it\u0026rsquo;s very low, around 1 euro cent per translation.\nMost of this post is written with ChatGPT and serves as an example of how easy it is to write documentation and notes with this tool. In any case, I\u0026rsquo;m the editor of the post, I choose what I want to be explained and therefore: I am the author. We can debate this point, which is not trivial.\nInstall GitHub Desktop # What is GitHub? # According to ChatGPT\nGitHub is like Google Drive for programmers, but with some very useful extra features. 
Imagine you’re working on a project and want to save different versions of it so you can go back to a previous one if something goes wrong. That’s what GitHub does. Also, if you work in a team, each member can work on different parts of the project at the same time, and then GitHub helps merge all the work in an orderly way. Finally, it’s also like a social network for programmers, because you can share your projects with others and even collaborate on other people’s projects.\nThat’s a good explanation.\nInstall GitHub Desktop # GitHub Desktop is a graphical interface that lets you interact with GitHub without using the command line. It’s useful for managing repositories, making commits, push and pull, resolving conflicts, among other things. Here are brief instructions to install it on Windows:\nGo to the official GitHub Desktop page at https://desktop.github.com/ Click the “Download for Windows (64bit)” button if your operating system is 64-bit. If you’re not sure, you can check by going to Windows Settings -\u0026gt; System -\u0026gt; About -\u0026gt; System type.\nOnce the .exe file has been downloaded, double-click it to start the installation.\nFollow the on-screen instructions to complete the installation.\nOnce the installation has completed, you can open GitHub Desktop from the Windows Start menu.\nThe first time you open the app, it will ask you to sign in with your GitHub account. If you don’t have one, you can create it at https://github.com/\nOnce you’ve signed in, you can start using GitHub Desktop to manage your GitHub repositories.\nInstall Python # ChatGPT says:\nPython is a programming language, which means it’s a way to give instructions to a computer. It’s like a cookbook for the machine: you tell it exactly what to do and how to do it. What makes Python special is that it’s designed to be easy to read and write. It’s like a recipe book that tries to be simple and clear, avoiding unnecessary complications. 
That’s why it’s often used to teach programming to beginners. But don’t be mistaken, Python is very powerful and is used in many places, from websites to scientific research and game development.\nWe should also bear in mind that it’s an interpreted language:\nAn interpreted language, like Python, is a type of programming language that runs line by line, that is, in real time. Imagine you’re in a play and the director tells you what to do scene by scene, instead of giving you the whole script at once. That’s how Python works: the computer reads and performs the instructions in the code you wrote one by one.\nThis has some advantages, such as being easier to learn and use, and you can test parts of your code as you go. However, it can be a bit slower than other languages that are compiled (that is, translated completely to machine language) before running.\nTo install Python 3.10 on Windows, follow these steps:\nGo to the official Python website at https://www.python.org/\nClick the “Downloads” tab at the top of the page.\nYou’ll see a button that says “Download Python 3.10.x” (x will be a number depending on the latest available version). Click that button to download the installer.\nOnce the .exe file has been downloaded, click it to open the installer.\nIn the installer window, make sure to check the box “Add Python 3.10 to PATH” at the bottom. This will let you use Python from the Windows command line.\nThen click “Install Now”.\nThe installer will start copying files and setting up Python on your system. This process can take a few minutes.\nOnce the installation is complete, you’ll see a screen that says “Setup was successful”. You now have Python 3.10 installed on your system!\nTo check that the installation was successful, open the command line (CMD or PowerShell) and enter the command python --version. This should show the version of Python you just installed.\nWhat ChatGPT doesn’t tell you is that you must make sure to check “add python.exe” to PATH. 
…and remove the path length limit (Disable path length limit).\nInstalling the OpenAI API for Python # To install the OpenAI API for Python, you’ll need to use pip, which is a tool that comes with Python and is used to install additional packages or libraries.\nFollow these steps:\nOpen your computer’s command line. On Windows, you can search for “CMD” or “Command Prompt” in the Start menu.\nOnce the command line is open, type the following command and press Enter:\npip install openai\nIf “pip” doesn’t work on your system, try “pip3” instead.\nWait for the installation to complete. You’ll see several messages in the command line as the package is downloaded and installed. Get an access key for the OpenAI API # To get an OpenAI API Key, follow these steps:\nGo to the OpenAI website at https://beta.openai.com/signup/ and create an account if you don’t already have one.\nOnce you’ve signed in, go to the Dashboard.\nIn the Dashboard, look for the “API Keys” section.\nClick “Create new API Key”.\nIt will ask you to enter a name for the key and select some options. Once you’ve done that, click “Create”.\nYou’ll be shown a new API Key. Be sure to copy it and save it in a safe place, as you won’t be able to see it again.\nYou can now use this API key in your code to interact with the OpenAI API.\nRemember to treat your API Key like a password, as it provides access to your OpenAI account and paid services. Don’t share it with anyone and don’t publish it in public places like code repositories on GitHub.\nYou may need to set up a payment method on the platform since API access is paid. For testing with models like GPT-3.5-turbo the cost is very low, on the order of cents or thousandths of a euro per call. Models like DALL-E or GPT-4 are more expensive.\nGDPR and the OpenAI API # OpenAI, based in the United States, should not receive personal data of European users via its API without GDPR compliance guarantees. 
This means that when you use the OpenAI API, you must ensure you don’t send data that can identify a person, such as names, email addresses, phone numbers, and any other data that could be used to identify someone.\nIt’s important to understand that the responsibility for ensuring that the data sent to the OpenAI API complies with the GDPR lies with the user who sends the data. Therefore, you should always be careful when handling and processing personal data and ensure that you are complying with all applicable laws and regulations.\n(If you want to use the OpenAI API with sensitive data in Europe you should use the API version in Microsoft Azure’s cloud https://azure.microsoft.com/en-us/products/cognitive-services/openai-service/)\nMy translation program # My translation program is in my GitHub repository https://github.com/granludo/gpt-test, in the file-translator folder. You should use GitHub Desktop to clone the folder to your computer.\nThese are the instructions from the Readme.md in Spanish:\nBy Marc Alier @granludo https://wasabi.essi.upc.edu/ludo\nLicensed under the GNU General Public License v3.0\nSimple Python 3 program that uses OpenAI’s GPT-3.5-turbo model to translate files from one language to another. It splits the file into chunks of 20 lines.\nRequirements # You need to have a key to use the OpenAI API; you can get one here: https://beta.openai.com/docs/developer-quickstart/api-key this program looks for the key in a file specified in the “mykeypath” variable, by default at the path ‘..//..//mykey.json’. 
(That is, inside the GitHub folder)\nExample contents of mykey.json:\n{ “key”: “sk-7QzXHpaajenpuakn1tgHT3Bxxxxxxxxxxx” }\nYou need to install the openai package (json is part of Python’s standard library and needs no installation)\npip install openai\nUsage # python3 gpt-traductor.py archivo_origen idioma_origen idioma_destino archivo_destino\nExample\npython3 gpt-translate.py sample.md catalan italian sample_translated.md\nI hope you find it useful.\n","date":"17 May 2023","externalUrl":null,"permalink":"/ludo/en/posts/mi-script-de-traducci%C3%B3n-de-ficheros-de-texto-usando-el-api-de-chatgpt.-guia-de-instalaci%C3%B3n/","section":"Posts","summary":"In this post I explain how to install and use a small Python script I wrote. This script connects to the OpenAI API to use GPT-3.5-turbo to translate files from one language to another, regardless of length. You need an OpenAI API key to use the program. This API has a cost, but it’s very low, around 1 euro cent per translation.\n","title":"My text file translation script using the ChatGPT API. Installation guide","type":"posts"},{"content":"","date":"17 May 2023","externalUrl":null,"permalink":"/ludo/en/tags/python/","section":"Tags","summary":"","title":"Python","type":"tags"},{"content":"","date":"17 May 2023","externalUrl":null,"permalink":"/ludo/en/tags/software/","section":"Tags","summary":"","title":"Software","type":"tags"},{"content":"","date":"17 May 2023","externalUrl":null,"permalink":"/ludo/en/tags/traductor/","section":"Tags","summary":"","title":"Traductor","type":"tags"},{"content":"Based on the courses I have designed and taught to over 300 university professors titled \u0026lsquo;ChatGPT, opportunity and challenge for teaching\u0026rsquo;, and at the request of the UPC Vice-Rectorate for University Policy, I have written this draft decalogue of recommendations. With additional contributions, this document has been presented to the community of the Universitat Politècnica de Catalunya. 
This is my version with some additional revisions.\nAI IN THE UNIVERSITY, DECALOGUE OF RECOMMENDATIONS # We are facing a technology with great transformative potential: the availability of generative AI models—like ChatGPT—that can generate content (text, code, images, sound, video\u0026hellip;) similar to what a person could produce. Generative models are already part of our work tools, and learning to use them effectively, responsibly, and understanding their limitations is a new competency to learn and integrate into everyone\u0026rsquo;s workflow, teachers and students alike. We are probably only at the beginning of a new paradigm that will lead us to rethink academic activity, not only because having these tools available has the potential to transform the learning experience, but also because they can change or evolve the competencies professionals will require in the future.\nON STUDY # We need to know and understand these tools. It is important to have a basic understanding of this new technology, its possibilities, opportunities, and challenges in the same work or study field. We need to identify how generative models affect each discipline, which tasks become automatable, and with what level of reliability and precision, and which do not. Generative models can be a consultation tool, to generate ideas and first drafts for many tasks. They can be used inside and outside the classroom. Research is already underway on how to make versions adaptable to specific knowledge domains, such as the subject matter of a course. Certain teaching methodologies where students have to put their knowledge into practice in the classroom (such as the flipped classroom) can be effective to take advantage of and/or coexist with these new tools in the academic sphere. We need to be very careful with the information we give to AI models offered as \u0026ldquo;software as a service\u0026rdquo; in the cloud. 
We must consider whether these services comply with GDPR and who will use the information we provide during a chat. ON EVALUATION # The possibility that work done outside the classroom is not the student\u0026rsquo;s original work has always been present. This is nothing new. Generative models simplify and lower the cost (in the case of academies or organizations that produce academic work for money). Plagiarism detection tools like URKUND-Ouriginal are not reliable, and possibly never will be again, especially for texts under 1000 characters. We must take into account the high rate of false positives; the consequences of acting in the case of a false positive are serious and must be avoided. Certain projects, exercises, and tasks can be automated. We need to consider alternatives such as proposing new types of assignments and evaluating the work process over the result. Some evaluation activities that can be proposed are formative portfolios, in-class presentations, oral assessment, simulations, etc. In-person exams, without internet access, are not affected. Just as some exams have been with notes or internet connection, it can be a challenge to propose exams with access to generative models. ","date":"1 May 2023","externalUrl":null,"permalink":"/ludo/en/posts/decalogo-sobre-el-uso-de-la-ia-en-el-aula/","section":"Posts","summary":"Based on the courses I have designed and taught to over 300 university professors titled ‘ChatGPT, opportunity and challenge for teaching’, and at the request of the UPC Vice-Rectorate for University Policy, I have written this draft decalogue of recommendations. With additional contributions, this document has been presented to the community of the Universitat Politècnica de Catalunya. 
This is my version with some additional revisions.\n","title":"A decalogue of recommendations on the use of AI in the classroom.","type":"posts"},{"content":"Legend has it that the wise Anand (a name suggested by ChatGPT, whom I call Skippy ^2) helped King Devendra (a name also proposed by Skippy) solve a difficult problem. In gratitude Devendra offered Anand the payment he desired. Anand asked that the king give him a grain of rice for the first square of a chessboard and that he double the amount in each successive square. The king accepted, but after a good while the king\u0026rsquo;s mathematicians informed him that there was not enough rice in the entire kingdom to pay what Anand asked for.\nGordon Moore, one of the founders of Intel, observed in 1965 that the capacity to integrate transistors on a chip doubles every 18 months. That is to say, following an exponential progression. This observation became a predictor of industry behavior that has held true until Moore\u0026rsquo;s death in early 2023.\nIt seems impossible that something doubles 64 times, but since Moore\u0026rsquo;s observation, processor power has doubled 43 times. But there\u0026rsquo;s more: 20 years ago inventor and futurist Ray Kurzweil calculated the computing capacity one could acquire for 1000 dollars adjusted for inflation throughout the 20th century. The result indicated exponential growth perhaps starting before the beginning of the 20th century, if we consider the work of Blaise Pascal, Charles Babbage and Ada Lovelace at the beginning of the 19th century. What we can say is that during the 20th century the computing capacity we can buy at a given price has doubled 66 times, regardless of the technology used. More than squares on a chessboard. And this behavior has not stopped in the 23 years we\u0026rsquo;ve been into the 21st century.\nBoth the wise Anand and King Devendra would be dumbfounded. 
But the question the wise Anand would ask is: what happens when something doubles so many times and what happens when this affects so many things?\nA possible answer is given by Peter Diamandis, Greek-American entrepreneur and futurist, who proposes in his book \u0026ldquo;Abundance: The Future Is Better Than You Think\u0026rdquo; (^1) that when something is digitized it becomes connected with this exponential improvement. This at first is not a big deal (it\u0026rsquo;s deceptive) because at first exponential curves are quite flat, but suddenly they make a kink upwards and grow suddenly. It is then that digitization becomes disruptive. And this disruption implies changes in the way things are done, in the artifacts we use (which become smaller or disappear), in the jobs we people have to do - the disappearance of some and the creation of others -, and in the value of things: some things cease to have value because they become abundant and other things become very valuable.\nYou can ask Skippy about Diamandis\u0026rsquo;s 6 D\u0026rsquo;s and he will explain it to you very well.\nAnother possible answer was given by Alvin Toffler, in an essay in the prestigious magazine Playboy - which is not Open Access nor JCR, but surely you\u0026rsquo;ve heard of it - in 1970 and subsequently in the book of the same title: \u0026ldquo;Future Shock\u0026rdquo; ^3. The shock of the future. Toffler makes the observation that the transformations caused by technological innovation - already in 1970 - are each time deeper and each succeed each other faster, they accelerate. So that the world changes radically several times during a person\u0026rsquo;s lifetime. And this can leave us perplexed, in the shock of the future.\nIf Toffler already observed the shock of the future 53 years ago, now it is something evident in our day to day. 
Let\u0026rsquo;s think that 25 years ago there was no web - it was there, but there were only 4 nerds -, 15 years ago there were no mobile phones connected to the internet - they were there, but only 4 nerds had them -, 6 months ago there were no generative models that seem like Artificial Intelligences. Well, they were there, but only 4 nerds were playing with them.\nOn November 30, 2022 ChatGPT was launched, another Chatbot on the internet like many others we had seen, which allowed talking to the generative model GPT-3 (Generative Pretrained Transformer) developed by the non-profit organization OpenAI (who have reconsidered and now they do want to make profits). In 5 days ChatGPT reached 1 million users, and in two months it had reached 100 million users. ChatGPT can already enter the Guinness Book of Records as the fastest expanding product to date. In January 2023 ChatGPT was as popular in Google searches as Shakira, now - in May 2023 - it is difficult to find a more popular term anywhere.\nFor some reason ChatGPT is good enough that all of us who use it see that this is something different. A Chatbot worth paying attention to. Just as in their day internet users who tried the new Google search engine realized it was something different; ten times better than anything else, Google\u0026rsquo;s creators like to say.\nChatGPT marks the beginning of the kink in the progression of a new tool: generative models based on neural networks, a technology in the research field of artificial intelligence (AI) that has been running for almost 70 years. Let\u0026rsquo;s not forget that, as John McCarthy - one of the fathers of the discipline - said, when a technology works we no longer call it AI, but something else. AI\u0026rsquo;s goals move further beyond, perhaps now they will aim at the unachieved goal of Artificial General Intelligences (AGI) equivalent to human intelligence. 
But don\u0026rsquo;t worry, we are not there yet, nor will we be soon.\nHowever, generative models have now become a technology good enough to become the basis of services for the general public. Computer systems now understand the language with which we humans communicate - any language or almost any - they begin to understand context and cultural subtleties, and are capable of responding giving information often accurate. More and more accurate.\nLately, I have seen archive photographs showing mathematics professors demonstrating in the street protesting against the use of calculators in the 70s. Denial or prohibition is one of the first responses we can expect, even if they serve little.\nCalculators were and are fantastic, they allow doing arithmetic calculations, accessing logarithm and trigonometric tables, and even executing algorithms. With a calculator we are capable of solving very complex problems, quickly and with precision. Surely problems we could not address without. But the availability of calculators can cause us not to learn to do these operations without them. Or that we forget how to do them. \u0026ldquo;Use it or lose it\u0026rdquo; they say.\nChatGPT and the large family of generative models that is coming - go to huggingface.co where you will find the main learning community about generative models and transformers and you will see what\u0026rsquo;s cooking there - can do many things for us: Help us translate texts, summarize documents, prepare presentations, review grammar and style, answer questions we would have previously asked an encyclopedia, an expert or the internet search engine\u0026hellip; And even more, if we are lucky and have a knack for giving instructions (what they call \u0026ldquo;prompt\u0026rdquo;, but they are \u0026ldquo;instructions\u0026rdquo;) they can help solve problems and connect with other programs to achieve even more things. 
They can generate code for our programs, help us improve style, find errors and document them. And many other things. Which like calculators will increase our capabilities, but we will have to be careful because using these tools we can lose the ability to do things without them.\nGenerative AI Models have a very interesting property: sometimes they hallucinate. In the context of AI, hallucinations are strange results - erroneous, for sure - that these systems give sometimes, in an unexpected and unpredictable way. So that whenever we use one of these models we have to think that maybe the result is accurate or maybe it is a hallucination.\nWhen a calculator gives a result, do we have the capacity to check if the result is correct? Certainly, we assume the risk and responsibility of accepting the result of a calculator - or a spreadsheet or SAP - every time we use them for our day to day. But calculators are reliable, they are built with deterministic systems that always give the same result. And if they fail, we can look for where the error is and correct the system.\nWith neural network models this is not the case. We know they work phenomenologically: they work because they work and we observe a behavior. Nothing guarantees us that they will continue to work like this or that they will hallucinate or will explain to us why they work as they do. Think about this when, very soon, more and more systems are controlled by these models.\nAs teachers and educators the dilemma arises for us. Should we use these tools or should we not use them in our subjects? And if we have to use them: how should we use them? Regardless of what we decide our students already have these tools, they know who Shakira is and also what ChatGPT is.\nExpert players of the real-time strategy video game StarCraft explain that to win one must play in Macro and in Micro. One must have a Macro strategy: master the whole map, have a strategy to win the match in the long term. 
But one must apply tactics and operations in Micro, pay attention and move your units where there are confrontations. And know when you have to pay attention to Macro or to Micro. Without mastering Macro-Micro one cannot win at StarCraft.\nIn the same way each subject has a Macro context: What are the objectives of the subject? The professionals we train, will they use generative AI tools in their day to day? How does this affect our objectives? And how will it affect in the coming months and years, given the shocking pace of technological advancement?\nBut there is also a Micro context: How do we teach? Do we ban AI in the classroom or assume that at any moment our students may have the ChatGPT window open or some other tool? And the projects? And the exercises they have to do at home? How do we evaluate them knowing they have AI tools that are impossible or very difficult to detect?\nEvery teacher, every team of teachers in each subject has the opportunity and the challenge to rethink our strategy and teaching practice. And we cannot ignore it.\n^1 Diamandis, P. H., \u0026amp; Kotler, S. (2012). Abundance: The Future Is Better Than You Think. Free Press. ^2 Skippy is the name of the alien AI that co-stars in the science fiction novel series Expeditionary Force by writer Craig Alanson. ^3 Toffler, Alvin. (1970). Future Shock. New York: Random House.\n","date":"26 April 2023","externalUrl":null,"permalink":"/ludo/en/posts/i-la-ia-va-venir-a-laula/","section":"Posts","summary":"Legend has it that the wise Anand (a name suggested by ChatGPT, whom I call Skippy ^2) helped King Devendra (a name also proposed by Skippy) solve a difficult problem. In gratitude Devendra offered Anand the payment he desired. Anand asked that the king give him a grain of rice for the first square of a chessboard and that he double the amount in each successive square. 
The king accepted, but after a good while the king’s mathematicians informed him that there was not enough rice in the entire kingdom to pay what Anand asked for.\n","title":"And AI Came to the Classroom","type":"posts"},{"content":"","date":"26 April 2023","externalUrl":null,"permalink":"/ludo/en/tags/cursos/","section":"Tags","summary":"","title":"Cursos","type":"tags"},{"content":"","date":"26 April 2023","externalUrl":null,"permalink":"/ludo/en/tags/ice-upc/","section":"Tags","summary":"","title":"Ice-Upc","type":"tags"},{"content":"","date":"26 April 2023","externalUrl":null,"permalink":"/ludo/en/tags/videos/","section":"Tags","summary":"","title":"Videos","type":"tags"},{"content":"[Note: this post has been translated with my own translator] [Note: using the API has costs; the cost of use depends on the size of the documents. As a rough guide during development, tests and various translations of 2-page documents I spent 0.03€ over 5 days]\nIn this post I present a small project I’ve published as open source. What would once have been a project of many years and several million euros is now a small experiment done on a Saturday morning. The keys are GitHub Copilot, ChatGPT 4 as programming assistants, and the GPT-3.5 Turbo API.\ngpt-translator # You can download it from https://github.com/granludo/gpt-test/tree/main/file-trasnlator\nLicensed under the GNU General Public License v3.0\nSimple Python 3 program that uses OpenAI’s GPT-3.5-turbo model to translate files from one language to another. It splits the file into 20-line chunks.\nRequirements # You need an API key to use OpenAI’s API; you can get one here: https://beta.openai.com/docs/developer-quickstart/api-key this program looks for the key in a file indicated by the “mykeypath” variable, by default at the path ‘..//..//mykey.json’\nYou need to have Python 3 installed on your system. 
I used Python 3.10\nYou need to install the openai package (json is part of Python’s standard library and needs no installation)\npip install openai\nOn your system you might need to use pip3 instead of pip.\nUsage # python3 gpt-traductor.py fitxer_origen idioma_origen idioma_destí fitxer_destí\nExample\npython3 gpt-translate.py sample.md català italià sample_translated.md\nHave fun.\n","date":"8 April 2023","externalUrl":null,"permalink":"/ludo/en/posts/traductor-cat/","section":"Posts","summary":"[Note: this post has been translated with my own translator] [Note: using the API has costs; the cost of use depends on the size of the documents. As a rough guide during development, tests and various translations of 2-page documents I spent 0.03€ over 5 days]\n","title":"Simple multi-language document translator in Python. A quick project.","type":"posts"},{"content":"[Note: using OpenAI’s API has its costs. The cost of using the translator will depend on the size of the documents you translate. As a guideline, so far with development, tests, and translating a few documents I’ve spent €0.03. ]\nIn this post I present a small project I’ve released as open source. What would once have been a project taking many years and several million euros is now a small experiment done on a Saturday morning. The keys are GitHub Copilot, ChatGPT 4 as programming assistants, and the GPT-3.5 Turbo API.\ngpt-translator # You can download it from https://github.com/granludo/gpt-test/tree/main/file-trasnlator\nLicensed under the GNU General Public License v3.0\nSimple Python 3 program that uses OpenAI’s GPT-3.5-turbo model to translate files from one language to another. It splits the file into 20-line chunks.\nRequirements # You need an API key to use OpenAI. 
You can get one here: https://beta.openai.com/docs/developer-quickstart/api-key this program looks for the key in a file indicated in the variable “mykeypath” by default at the path ‘..//..//mykey.json’\nYou need to have Python 3 installed on your system. 
I’ve used Python 3.10\nYou need to install the openai package (json is part of Python’s standard library and needs no installation)\npip install openai\nOn your system you may need to use pip3 instead of pip.\nUsage # python3 gpt-traductor.py archivo_origen idioma_origen idioma_destino archivo_destino\nExample\npython3 gpt-translate.py sample.md catalan italian sample_translated.md\nHave fun.\n","date":"8 April 2023","externalUrl":null,"permalink":"/ludo/en/posts/traductor-simple-de-documentos-multi-idioma-en-python.-un-proyecto-r%C3%A1pido/","section":"Posts","summary":"[Note: using OpenAI’s API has its costs. The cost of using the translator will depend on the size of the documents you translate. As a guideline, so far with development, tests, and translating a few documents I’ve spent €0.03. ]\n","title":"Simple multi-language document translator in Python. A quick project.","type":"posts"},{"content":"On March 2, 2023, I took part in the session on AI chatbots in higher education organized by the Universitat Internacional de Catalunya UIC. At https://www.linkedin.com/events/sessi-sobrexatbotsd-intel-lig-n7034187811851235329/comments/ you can find the video of the event. Yours truly and his talk appear starting at 1h:25 minutes.\n","date":"1 March 2023","externalUrl":null,"permalink":"/ludo/en/posts/participaci%C3%B3-a-jornada-sobre-ia-i-docencia-a-la-uic-1/","section":"Posts","summary":"On March 2, 2023, I took part in the session on AI chatbots in higher education organized by the Universitat Internacional de Catalunya UIC. At https://www.linkedin.com/events/sessi-sobrexatbotsd-intel-lig-n7034187811851235329/comments/ you can find the video of the event. 
Yours truly and his talk appear starting at 1h:25 minutes.\n","title":"Participation in a session on AI and teaching at UIC","type":"posts"},{"content":"","date":"1 March 2023","externalUrl":null,"permalink":"/ludo/en/tags/pon%C3%A9ncias/","section":"Tags","summary":"","title":"Ponéncias","type":"tags"},{"content":" Contents # What is ChatGPT? 
ChatGPT is Software As A Service (SaaS) Terms and conditions Authorship rights and responsibility for ChatGPT’s outputs Cost and access What’s ChatGPT for and why is it free? ChatGPT isn’t open source Ethical aspects Ethical reflections from the creators of GPT-3 The ChatGPT Hype What is ChatGPT ? # ChatGTP is a web application based on GPT-3, specifically the GPT-3.5 “text-davinci-003” model developed by OpenAi. The ChatGPT model is optimized to work in conversational form, responding to inputs that users provide as a text “prompt”. ChatGPT is a type of Generative AI based on a Machine Learning Model . We can consider ChatGPT a weak AI (see Types of Artificial Intelligence].\nGPT-3 is a massive natural language generative model. It has been trained with 175 billion parameters in multiple languages, Catalan among them. If the training process had been done on a single computer it would have taken 355 years. The cost to train it on a low-cost cloud platform is $4,600,000. https://lambdalabs.com/blog/demystifying-gpt-3\n(Image source: https://blog.accubits.com/getting-started-with-gpt-3-model-by-openai/)\nChatGPT is Software As A Service (SaaS) # ChatGPT is provided as a service; it’s “Software as a Service”, not software we can install on a server or a cloud computing service. Therefore, the owners of the service (OpenAi, 49% owned by Microsoft since January 2023) decide the service terms: who has access, for what purposes, and what quality of service and availability they offer.\nTerms and conditions # ChatGPT’s terms of service https://openai.com/terms/ state that\nYou agree and instruct us to use Content to develop and improve the Services. You can read more here about how Content may be used to improve model performance. We understand that in some cases you may not want your Content used to improve the Services. You can opt out from having Content used for improvement by contacting suport@openai.com with your organization ID. 
Please note that in some cases this may limit our Services’ ability to better address your specific use case.\nIn other words, the data we enter into ChatGPT in the form of a Prompt will be collected, analyzed, and used for research, service improvement, and very likely to train future versions of GPT-N.\nOpenAi clearly publishes its terms and conditions, doesn’t abuse legalese, and doesn’t use fine print. I recommend reading the terms of service https://openai.com/terms/ and the privacy policy https://openai.com/privacy/\nAuthorship rights and responsibility for ChatGPT’s outputs # A very important aspect of using generative AI tools like ChatGPT, DALL-E, Stable Diffusion, Whisper, or VALL-E is the implicitly or explicitly established agreement on the authorship rights of what is generated (text, image, audio, or video). This determines the copyright and usage rights that derive for the parties involved.\nOpenAi’s generic terms for its tools define two important concepts: Input and output. Input is the prompt generated by users, while Output is the content generated by the Machine Learning Model. Input and output together make up the content.\nLet’s see what the terms say as translated by ChatGPT itself.\n_(a) Your Content. You may provide input to the Services (“Input”) and receive output generated and returned by the Service based on the Input (“Output”). Input and Output are collectively “Content.” Between the parties and to the extent permitted by applicable law, you own all Input and, subject to your compliance with these Terms, OpenAI hereby assigns to you all its right, title and interest in and to the Output. OpenAI may use Content as necessary to provide and maintain the Services, comply with applicable law, and enforce our policies. 
You are responsible for Content, including ensuring that it does not violate any applicable law or these Terms.\nIn other words, the author of a prompt holds the rights to the outputs generated by an OpenAI AI as long as they hold the rights to the inputs that produced those contents. If they don’t hold those rights, they’d be contravening OpenAI’s terms of service; at that point I’m not the one to opine on the matter, and you should consult lawyers and legal services.\nOpenAI has things to say about the content, its nature, and the possibility that certain outputs repeat.\n(b) Similarity of Content. Due to the nature of machine learning, Output may not be unique among users and the Services may generate the same or similar output for OpenAI or a third party. For example, you could provide input to a model such as “What color is the sky?” and receive output such as “The sky is blue.” Other users may ask similar questions and receive the same answer. Responses that are requested by and generated for other users are not considered your Content.\nThat is, the originality of the output depends on the originality of the input. Lawyers are going to have a field day in the coming years. Or maybe a sibling of ChatGPT specialized in legal advice… but I digress.\nCost and access # Since December 2022, and still as of January 2023, ChatGPT has been available for free at https://chat.openai.com\nEarlier GPT-3 models were available for testing via a more complex interface and by API (Application Programming Interface). To access them, after a trial period you had to pay for usage by purchasing “token” packages. It’s not clear how long ChatGPT will be available for free use.\nThere’s already a waitlist open to access the “pro” mode of ChatGPT. It’s speculated that the price will be $42 per month, although the free tier remains.\nAPI access to ChatGPT for application developers will soon begin to be managed through Microsoft’s cloud service, Azure. 
https://indianexpress.com/article/technology/tech-news-technology/openais-chatgpt-api-microsoft-azure-api-businesses-8386106/ As a result, we’ll surely start seeing specialized tools making use of various GPT-3 models.\nOne example is the “codex” model, a GPT-3 variant trained with source code available on GitHub (the largest free software project repository on the web, acquired by Microsoft a couple of years ago), where for a few months they’ve been offering the Copilot system, a programming assistant capable of generating code, translating programs between programming languages, optimizing code, and commenting on programs, among other applications.\nWhat’s ChatGPT for and why is it free? # The goal of ChatGPT is to make it easy for a large number of people (on the order of hundreds of millions) to use this model, thereby obtaining data for OpenAi’s researchers. As OpenAi’s CEO points out in the Moonshots Podcast.\nMoonshots. Peter Diamandis. “The Rise of OpenAi with Sam Altman” https://www.diamandis.com/podcast/sam-altman ChatGPT isn’t open source # GPT-3 isn’t open source; other Machine Learning Models are, but GPT-3 and DALL-E aren’t. On this point ChatGPT says:\n_GPT-3 is a language processing model developed by OpenAI. The model and the data used to train it are owned by OpenAI, and therefore it is not open source. OpenAI has decided not to open it because it considers it a valuable technology and believes it is important to protect it in order to continue funding its development and improvement.\nIn addition, there are concerns about the possible risks associated with unintentional or malicious use of GPT-3 technology, such as the automated generation of fake news or the automation of social engineering. OpenAI has adopted a restrictive licensing strategy to avoid these concerns and ensure that GPT-3 is used responsibly.\nEthical aspects # The results of transformer-based generative models can be convincing enough to pass the Turing test. 
In June 2022 a Google engineer, who was later fired, stated he was convinced the LaMDA model was self-aware and therefore had rights analogous to human rights. https://www.livescience.com/google-sentient-ai-lamda-lemoine\nIn January 2023 social networks were circulating rumors about the number of parameters in GPT-4. However, OpenAI CEO Sam Altman denied them and emphasized that GPT-4 won’t be a strong AI (see Types of Artificial Intelligence).\nChatGPT is a project in continuous evolution. The service is often updated and many researchers have reported changes in its behavior. After a short while using ChatGPT we can observe that an effort has been made to make its responses politically correct. Some examples are:\nThe insistence that it is software, that it has no agency, and that it is not an Artificial Intelligence Ethical reflections from the creators of GPT-3 # The creators of GPT-3 presented the project in the paper “Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., … \u0026 Amodei, D. (2020). Language models are few-shot learners. Advances in neural information processing systems, 33.” In the paper they devote a fairly extensive section to discussing the possible misuses of text-generating systems like GPT-3.\nSummarized by GPT-3\n_The malicious uses of language models can be somewhat difficult to anticipate because they often involve repurposing language models in a very different setting or for a different purpose than researchers had in mind.\nTo help with this, we can think in terms of traditional security risk assessment frameworks, which outline key steps such as identifying threats and potential impacts, assessing likelihood, and determining risk as a combination of probability and impact.\nWe discuss three factors: potential misuse applications, risk actors, and external incentive structures. The potential for misuse of language models increases as the quality of text synthesis improves. 
GPT-3’s ability to generate multiple paragraphs of synthetic content that people find difficult to distinguish from human-written text is a point of concern in this regard._\nPotential misuse applications of language models include disinformation, spam, phishing, abuse of legal and governmental processes, fraudulent drafting of academic essays, and social engineering pretexting. The potential for misuse increases with improvements in the quality of text synthesis. The authors identify different potential types of “threat actors” based on their level of skill and resources. These range from actors with low or moderate skills and resources to highly skilled and well-equipped groups, such as state-sponsored ones, known as APTs (Advanced Persistent Threats).\nLow- and medium-skilled actors do not currently pose an immediate threat, but improvements in reliability could change this. APT actors do not discuss their operations in public, but no differences have been detected in these actors’ actions since the release of GPT-2. Threat actors’ behavior is influenced by external incentive structures such as scalability, reduced deployment cost, and ease of use, which shape their adoption of new tactics, techniques, and procedures (TTPs). AI researchers are expected to develop increasingly reliable and steerable language models, which would pose challenges for the scientific community and create the need to work on security solutions. The creators of GPT-3 continue their analysis with a discussion of the biases the system may have and how to mitigate them. And they conclude with a section on the system’s energy aspects.\nThe ChatGPT Hype # Generative AI tools capable of creating text from prompts have been available for almost two years now. However, their popularity was largely limited to communities interested in AI and technological innovation. 
In the following figure we see the popularity of GPT-3, the best model so far, compared with the popularity in Google searches of a relatively mainstream term like “ferrari”.\nBy contrast, when we add the term “ChatGPT” to the comparison we get a fright.\nThe term “ferrari” is a good benchmark because it maintains fairly stable interest and allows us to see how ChatGPT sparks very high global attention. But how does ChatGPT compare to a term with first-order hype during January 2023 like “Shakira”, who has been very trendy due to her latest song, which has become a phenomenon in music and popular culture?\nWell, the Google Trends result shows that the search attention “ChatGPT” is attracting is of the same order as that attracted by the Colombian singer who until recently lived in Barcelona.\nLet’s recall the meaning of “hype” according to ChatGPT:\nHype is a term used to describe a large amount of advertising and exaggerated enthusiasm for a product, idea, or trend. In general, it refers to an increase in popularity or attention that has been given to something without a real basis or justification. Thus, the term hype usually refers to a situation in which oversized expectations have been generated about a product or service, in a way that is not consistent with its reality or its capabilities.\nWhether or not there is a real basis or justification, ChatGPT is undoubtedly one of the most popular search terms on the internet.\n","date":"24 February 2023","externalUrl":null,"permalink":"/ludo/en/posts/que-es-chatgtp/","section":"Posts","summary":"Contents # What is ChatGPT? ChatGPT is Software As A Service (SaaS) Terms and conditions Authorship rights and responsibility for ChatGPT’s outputs Cost and access What’s ChatGPT for and why is it free? ChatGPT isn’t open source Ethical aspects Ethical reflections from the creators of GPT-3 The ChatGPT Hype What is ChatGPT? 
# ChatGPT is a web application based on GPT-3, specifically the GPT-3.5 “text-davinci-003” model developed by OpenAI. The ChatGPT model is optimized to work in conversational form, responding to inputs that users provide as a text “prompt”. ChatGPT is a type of Generative AI based on a Machine Learning Model. We can consider ChatGPT a weak AI (see Types of Artificial Intelligence).\n","title":"What ChatGPT Is and Isn't?","type":"posts"},{"content":"John McCarthy lecture https://www.youtube.com/watch?v=K13_sWm_gZw\u0026t=23s\n","date":"23 February 2023","externalUrl":null,"permalink":"/ludo/en/posts/recursos-hist%C3%B2ria-de-la-inform%C3%A1tica/","section":"Posts","summary":"John McCarthy lecture https://www.youtube.com/watch?v=K13_sWm_gZw\u0026t=23s\n","title":"Resources on the history of computing","type":"posts"},{"content":"If you’re an educator and you’re interested in generative AI and its applications in teaching, you’ll be glad to know you can now access the videos from the course “ChatGPT: Opportunity and Challenge for Teaching” on my YouTube channel. This course covers ChatGPT, a text-generating AI tool introduced at the end of 2022, and its implications for university teaching.\nPlaylist: ChatGPT Opportunity and Challenge for Teaching, and now what do we do?\nIn this course, we reflect on the opportunities and challenges this tool poses for productivity, task automation, and student assessment. We also analyze the applications of ChatGPT in university teaching, propose strategies to address the challenges and leverage the opportunities of ChatGPT, and identify other generative AIs to keep in mind.\nThe course content is very useful for educators, as it explains what generative AIs are, their present and future possibilities, and their possible impact on society and education. 
It also identifies the opportunities and challenges posed by using ChatGPT in teaching, such as task automation and improved productivity, but also challenges like plagiarism or assessment difficulties.\nFinally, the course proposes strategies to tackle the challenges and make the most of ChatGPT’s opportunities, applied to the participants’ courses. If you’re an educator and want to be at the forefront in the use of generative AI in teaching, this course is a unique opportunity to learn and share knowledge. Don’t hesitate to access the videos of the course “ChatGPT: Opportunity and Challenge for Teaching” on the YouTube channel.\n","date":"23 February 2023","externalUrl":null,"permalink":"/ludo/en/posts/post-1/","section":"Posts","summary":"If you’re an educator and you’re interested in generative AI and its applications in teaching, you’ll be glad to know you can now access the videos from the course “ChatGPT: Opportunity and Challenge for Teaching” on my YouTube channel. This course covers ChatGPT, a text-generating AI tool introduced at the end of 2022, and its implications for university teaching.\n","title":"Videos from the course ChatGPT: Opportunity and Challenge for Teaching v.1","type":"posts"},{"content":"","date":"17 February 2023","externalUrl":null,"permalink":"/ludo/en/tags/podcast/","section":"Tags","summary":"","title":"Podcast","type":"tags"},{"content":"Programar es una Mierda is a go-to podcast for talking about programming with love and humor. I think this is my third time on the show, and we always end up discussing interesting things. This time we talk about AI and programming.\n","date":"17 February 2023","externalUrl":null,"permalink":"/ludo/en/posts/invitado-en-el-podcast-programa-es-una-mierda/","section":"Posts","summary":"Programar es una Mierda is a go-to podcast for talking about programming with love and humor. I think this is my third time on the show, and we always end up discussing interesting things. 
This time we talk about AI and programming.\n","title":"Podcast: AI Eats It All, Programar es una Mierda","type":"posts"},{"content":"","date":"17 February 2023","externalUrl":null,"permalink":"/ludo/en/tags/programaci%C3%B3n/","section":"Tags","summary":"","title":"Programación","type":"tags"},{"content":"","date":"17 February 2023","externalUrl":null,"permalink":"/ludo/en/tags/video/","section":"Tags","summary":"","title":"Video","type":"tags"},{"content":"I keep seeing more signs that Google will have to get its house in order to stay competitive. Like this one [https://archive.ph/7BYDG]. When that happens, many free or very cheap services will simply be discontinued. Let’s not forget that “Google graveyard” grows every day.\nMicrosoft’s purchase of 49% of OpenAI (it already had a prior collaboration agreement, whose fruits we’ve seen with GitHub Copilot). If Microsoft effectively integrates OpenAI’s tools into its SaaS like Microsoft Office, many of Google’s “customers” will migrate to its services. That includes Bing (Bing+GPT-4) and Office, as paid SaaS services. Professional users will gladly pay.\nGoogle only knows how to make money by selling ads. It sells SaaS, but it’s not its core business nor part of its culture. The poor support they offer proves it. Has anyone in the room managed to talk to a human at a Google service? And a ChatGPT‑like assistant that substitutes for a search engine doesn’t show ads, and therefore doesn’t make money under Google’s current business model. How does Google solve that?\nWhere does that leave the continuity of Google Apps (Drive, Classroom, Meet, Calendar, even Gmail)? Very likely, over the next 5 years these tools will evolve more than in the last 25 as generative AI gets integrated.\nKeeping these tools state of the art and footing the compute bill (the GPUs that run AI don’t have zero marginal cost; they cost money) will be a business decision. 
And we have no certainty that Google will choose to keep playing.\nAnd even if the probability of discontinuing the service, or letting it go stale as a low value‑add free service for “free” users, is very low, that’s the risk every organization takes when it uses them for critical processes.\nOrganizations like the UPC are taking on significant risk and technical debt. When it would be easy for us to self‑provision with open‑source solutions, keep our IT services staff skilled up, and create business opportunities for UPCnet or inLabFIB.\nIn the article Privacy and E-learning a Pending Task (Alier M, Casañ Guerrero MJ, Amo D, Severance C, Fonseca D. Privacy and E-Learning: A Pending Task. Sustainability. 2021; 13(16):9206. https://doi.org/10.3390/su13169206) we already discussed the long-term effects of outsourcing core services at universities. In that case, the bet on Moodle and Open Source proved to be the right one back in 2003. What’s going to happen in 2023?\n","date":"17 January 2023","externalUrl":null,"permalink":"/ludo/en/posts/post-2/","section":"Posts","summary":"I keep seeing more signs that Google will have to get its house in order to stay competitive. Like this one [https://archive.ph/7BYDG]. When that happens, many free or very cheap services will simply be discontinued. Let’s not forget that “Google graveyard” grows every day.\n","title":"The risk of sticking with Google Apps for universities like the UPC","type":"posts"},{"content":"Martin Dougiamas is the creator of the Moodle open-source software project. I have collaborated with him extensively in the past. A few weeks ago, Martin was kind enough to come to visit me at LISA (Laboratori d’Inovació i Suport Audiovisual del ICE de la UPC) where we had a conversation for over an hour and a half about Education, Innovation, and many other things.\nMartin is going to edit and trim the audio for his podcast, the OpenEdTec Podcast. 
The conversation is in English, but the YouTube video has decent subtitles. It does a better job with Martin’s English than with my Maresme English. I hope you enjoy it.\n","date":"17 February 2020","externalUrl":null,"permalink":"/ludo/en/posts/post-3/","section":"Posts","summary":"Martin Dougiamas is the creator of the Moodle open-source software project. I have collaborated with him extensively in the past. A few weeks ago, Martin was kind enough to come to visit me at LISA (Laboratori d’Inovació i Suport Audiovisual del ICE de la UPC) where we had a conversation for over an hour and a half about Education, Innovation, and many other things.\n","title":"A conversation with Martin Dougiamas about ICT and Education","type":"posts"},{"content":"","date":"17 February 2020","externalUrl":null,"permalink":"/ludo/en/tags/conversation/","section":"Tags","summary":"","title":"Conversation","type":"tags"},{"content":" Build and display Mermaid.js diagrams in Hugo Export from Notion to Obsidian Github Repo ","externalUrl":null,"permalink":"/ludo/en/cursos/hugo-obsidian-resources/","section":"Cursos / Courses by Marc Alier","summary":" Build and display Mermaid.js diagrams in Hugo Export from Notion to Obsidian Github Repo ","title":"","type":"cursos"},{"content":"","externalUrl":null,"permalink":"/ludo/en/posts/10-prediccions-de-david-poblador/","section":"Posts","summary":"","title":"","type":"posts"},{"content":"","externalUrl":null,"permalink":"/ludo/en/posts/buscador-de-tfgs-de-ingenier%C3%ADa-inform%C3%A1tica/","section":"Posts","summary":"","title":"","type":"posts"},{"content":" Brief Bio # Marc Alier, an Associate Professor at the Polytechnic University of Catalonia (UPC), specializes in computer science, information systems, e-learning, IT Ethics and more. With a computer science engineering degree and a Ph.D. in Sustainability, he has contributed significantly to the Moodle community and has authored over 160 academic publications. 
Alier currently serves as the academic director of the Engineering Education Ph.D. program and teaches multiple courses at UPC’s Barcelona School of Informatics. Short Bio # Hi there! My name is Marc Alier and I’ve been an Associate Professor at the Polytechnic University of Catalonia (UPC) since 2002. My range of interests includes computer science and information systems, e-learning, history and ethics of computing, as well as podcasting and building guitars.\nI hold an engineering degree in computer science and a Ph.D. in Sustainability from UPC. I have over 25 years of experience in research and software development related to e-learning (in the 90s I designed and developed a couple of forgotten Learning Management Systems (LMS) and educational content authoring tools). I occasionally create online courses and MOOCs: My YouTube Channel.\nI’ve been developing and contributing to the Moodle community since 2004, designing and implementing core functionalities such as the Wiki module, the Webservices layer, and the IMS LTI consumer. Since 2021 my research interest has been in the intersection of education and AI. Over the years, I’ve authored more than 160 academic publications in journals and conferences. For a list of publications and other research information:\nUPC’s Research Portfolio- Futur http://orcid.org/0000-0003-3922-1516 https://scholar.google.com/citations?user=POVraJQAAAAJ https://publons.com/researcher/3163603/marc-alier/ I am a member of the EduSteam Research Group.\nCurrently Teaching # At the Doctoral School of UPC I am academic director of the Ph.D. program in Education in Engineering Sciences and Technology.\nAt the Barcelona School of Informatics I currently teach:\nSocial and Environmental Issues Of Information Technologies Bachelors Degree Course on History, Ethics and social aspects of IT. Development and Management of Information Systems Masters Degree course. Management # From 1997 to 2005 Founder, CEO and CTO of Ludoland Informàtica S.L. 
a small software development company specialized in back-end web development.\nFrom 2004 to 2015 Academic director of several Masters and Post-degree courses (Information management and Business Intelligence with SAP (2004-2012), Software development for PDAs and mobile phones (2005-2006), Cloud Computing (2009, 2012) and Community Management (2014-2015)) at https://www.talent.upc.edu\nFrom 2014 to 2015 I was Director of the Institute of Sciences of Education at UPC.\nFrom 2019 to March 2023 Deputy Director of Educational Innovation at the Institute of Sciences of Education at UPC, in the management team of Ariadna Llorens.\nFrom Sept 2021 Academic Coordinator of the Ph.D. Program in Science, Engineering and Technology Education\nOther Interests # I’m also a (really) prolific podcaster at\nmossegalapoma.cat, cabalgaelcometa.com zetatesters.com. creapodcast.com Contact Marc # marc.alier (at) upc.edu https://twitter.com/ludo\n","externalUrl":null,"permalink":"/ludo/en/about/","section":"About- Marc Alier- Asoc Prof at UPC, Luthier Apprentice And Podcaster","summary":"Brief Bio # Marc Alier, an Associate Professor at the Polytechnic University of Catalonia (UPC), specializes in computer science, information systems, e-learning, IT Ethics and more. With a computer science engineering degree and a Ph.D. in Sustainability, he has contributed significantly to the Moodle community and has authored over 160 academic publications. Alier currently serves as the academic director of the Engineering Education Ph.D. program and teaches multiple courses at UPC’s Barcelona School of Informatics. ","title":"About- Marc Alier- Asoc Prof at UPC, Luthier Apprentice And Podcaster","type":"about"},{"content":" Table Of Contents # Future Shock Prompts and interactions with ChatGpt used in this video 2. LLM Basics 3. 
Training ChatGPT Prompts and interactions with ChatGPT used in this video 4. Learning how to better craft prompts Future Shock # Slides\nPrompts and interactions with ChatGPT used in this video # Future shock Example of engineering problem with Wolfram Alpha https://www.wolframalpha.com 2. LLM Basics # Slides\n3. Training ChatGPT # Slides\nPrompts and interactions with ChatGPT used in this video # Training ChatGPT to write in our style 4. Learning how to better craft prompts # ChatGPT Understanding, Improving your prompting strategies Embeddings, Context uses and Self Referencing\n[TABLE]\n","externalUrl":null,"permalink":"/ludo/en/cursos/chatgpt-opportunity-and-challenge/","section":"Cursos / Courses by Marc Alier","summary":"Table Of Contents # Future Shock Prompts and interactions with ChatGPT used in this video 2. LLM Basics 3. Training ChatGPT Prompts and interactions with ChatGPT used in this video 4. Learning how to better craft prompts Future Shock # Slides\n","title":"Course: ChatGPT challenge and opportunity for teaching","type":"cursos"},{"content":" Table Of Contents # Introducció 6. Casos d’us de ChatGPT (I) 7. Guerres de Tonner 8. ChatGPT dins l’aula 9. Casos d’us de ChatGPT i II Video 10. El salt de GPT 3.5 a GPT 4 Com a contingut addicional us recomano el pòdcast que vam gravar amb Faraón Llorens la primera setmana de 2023. Cabalga El Cometa EP 31: Las alucinaciones de ChatGPT con Faraón LLorens\nIntroducció # Transparències Vídeo 1\n2. La Llei dels Rendiments Accelerats # Transparències Vídeo 2\n3. Fent Salsitxes amb IA # Transcripció Vídeo 3 Enllaços:\nEina IA de transcripció de veu a text, Whisper https://openai.com/research/whisper Open Broadcaster Studio https://obsproject.com Corrector ortogràfic i gramatical multilingüe https://languagetool.org NVidia Broadcast\n4. ChatGPT, un Model de Machine Learning Generatiu # Transparències Vídeo 4\n5. Usant ChatGPT Presa de contacte # Transparències Vídeo 5\n6. 
Casos d’us de ChatGPT (I) # Transparències Vídeo 6\n7. Guerres de Tonner # En aquest video pubicat el Març de 2023 jo ja parlava d’eines de detecció de contingut generat en IA en el context educatiu. Ja llavors deia que no disposarem mai d’eines fiables. Sobretot pel risc i dilema etic que comporta usar eines que donen falsos positius. A Setembre de 2023 encara mantinc la meva posició. I sembla ser que la premsa ja ho ha descobert.\nTransparències Vídeo 7\n8. ChatGPT dins l’aula # Transparencies vídeo 8\n9. Casos d’us de ChatGPT i II # Transparencies video 9\nVideo 10. El salt de GPT 3.5 a GPT 4 #\nCurs subjecte a Llicència Creative Commons Reconeixement-NoComercial-CompartirIgual 4.0 Internacional.\n","externalUrl":null,"permalink":"/ludo/cursos/chatgpt-oportunitat-i-repte/","section":"Cursos / Courses by Marc Alier","summary":"Table Of Contents # Introducció 6. Casos d’us de ChatGPT (I) 7. Guerres de Tonner 8. ChatGPT dins l’aula 9. Casos d’us de ChatGPT i II Video 10. El salt de GPT 3.5 a GPT 4 Com a contingut addicional us recomano el pòdcast que vam gravar amb Faraón Llorens la primera setmana de 2023. Cabalga El Cometa EP 31: Las alucinaciones de ChatGPT con Faraón LLorens\n","title":"Curs ChatGPT Oportunitat i Repte per a la docencia","type":"cursos"},{"content":" Table Of Contents # 1. El Shock del futuro. Curso ChatGPT Oportunidad y reto para la docencia 2. Puntos sobre las IAS 4. Las guerras de Tonner 5. GPT Personal Bullshit Generator 6. Casos de uso de ChatGPT (I) 7. ChatGPT Instrumento de consulta Herramientas usadas para crear estos contenidos Como contenido adicional os recomiendo el podcast que grabamos con Faraón Llorens la primera semana de 2023. Cabalga El Cometa EP 31: Las alucinaciones de ChatGPT con Faraón LLorens Un poco largo, pero vale la pena.\n1. El Shock del futuro. Curso ChatGPT Oportunidad y reto para la docencia # Transparencias Vídeo 1\n2. Puntos sobre las IAS # Transparencias Vídeo 2\n3. 
Entrenando a ChatGPT # Transcripción Vídeo 3\n4. Las guerras de Tonner # Transparencias Vídeo 4\n5. GPT Personal Bullshit Generator # Transparencias Vídeo 5\n6. Casos de uso de ChatGPT (I) # [Transparencias Vídeo 6](pages/open_courseware/es/6_chatgpt_es 1.pdf)\n7. ChatGPT Instrumento de consulta # Transparencias Vídeo 7\nHerramientas usadas para crear estos contenidos # Enlaces:\nHerramienta IA de transcripción de voz a texto, Whisper https://openai.com/research/whisper Open Broadcaster Studio https://obsproject.com Corrector ortográfico y gramatical multilingüe https://languagetool.org NVidia Broadcast [TABLE]\n","externalUrl":null,"permalink":"/ludo/es/cursos/chatgpt-oportunidad-y-reto/","section":"Cursos / Courses by Marc Alier","summary":"Table Of Contents # 1. El Shock del futuro. Curso ChatGPT Oportunidad y reto para la docencia 2. Puntos sobre las IAS ","title":"Curso ChatGPT: Oportunidad y Reto para la Docencia. Version en Castellano (Mayo-Junio 2023)","type":"cursos"},{"content":" Cursos en Català # ChatGPT: Oportunitat i Repte per a la Docència Cursos en Castellano # ChatGPT: Oportunidad y Reto para la Docencia Courses in English # ChatGPT: Opportunity and Challenge for Education Resources for Hugo + Obsidian Web Publishing ","externalUrl":null,"permalink":"/ludo/en/cursos/","section":"Cursos / Courses by Marc Alier","summary":"Cursos en Català # ChatGPT: Oportunitat i Repte per a la Docència Cursos en Castellano # ChatGPT: Oportunidad y Reto para la Docencia Courses in English # ChatGPT: Opportunity and Challenge for Education Resources for Hugo + Obsidian Web Publishing ","title":"Cursos / Courses by Marc Alier","type":"cursos"},{"content":" Models de Traducció i Conversió de Veu i Text # SeamlessM4T: Traducció Multilingüe: Model open source que tradueix text i parla entre diferents idiomes, suportant prop de 100 idiomes en text i 36 en parla. - Enllaç Speech Gen: Conversió Text a Veu: Eina de pagament que converteix text a veu amb alta qualitat. 
- Enllaç Whisper: Reconeixement de Veu: Model open source d’OpenAI per al reconeixement de veu amb algunes limitacions en converses bilíngües. - Enllaç Whisper-ctranslate2: Millora de Whisper: Millora del model Whisper, amb implementacions més ràpides per SoftCatalà. - Enllaç https://github.com/ggerganov/whisper.cpp versió high performance de Whisper.. oferint api com openai i tot https://github.com/ggerganov/whisper.cpp/pull/1380 Insanely Fast Whisper with Speaker Diarization (GitHub Repo) A somewhat opinionated CLI tool for transcribing audio files. It also now includes speaker segmentation and diarization (e.g., recognizing speaker 1 vs speaker 2). The claim is they can transcribe 2.5 hrs of audio in 100s, even on some of the largest models available. Text a veu # https://github.com/coqui-ai/TTS\nMillora d’audio # Resemble enhance # “Resemble Enhance” és una eina impulsada per la intel·ligència artificial que té com a objectiu millorar la qualitat general de la parla realitzant desenrenou i millora. Consisteix en dos mòduls: un desenrenoador, que separa la parla d’un àudio sorollós, i un millorador, que augmenta encara més la qualitat perceptiva de l’àudio restaurant distorsions d’àudio i ampliant l’amplada de banda de l’àudio. Els dos models estan entrenats amb dades de parla de 44.1kHz d’alta qualitat que garanteixen la millora de la teva parla amb alta qualitat. https://github.com/resemble-ai/resemble-enhance/tree/main\nProcessament d’Imatges # Segment-anything: Segmentació d’Objectes: Model open source de Meta per a la segmentació d’objectes en imatges. - Enllaç Generació d’Imatges: Eina que crea imatges a partir de prompts, com l’exemple de “Elven Lady Galadriel”. 
- Enllaç emu edit https://emu-edit.metademolab.com APPLE MGIE https://venturebeat.com/ai/apple-releases-mgie-a-revolutionary-ai-model-for-instruction-based-image-editing/?utm_source=tldrai https://github.com/apple/ml-mgie Anàlisi de Documents # Mapdeduce i Chatpdf: Anàlisi de PDF: Eines per analitzar contingut de fitxers PDF, possiblement usant l’API d’OpenAI (GPT-3.5-turbo). - Mapdeduce, Chatpdf Disseny de Logos # Logomaster.ai: Generació de Logos: Eina automàtica per a la generació de logos. - Enllaç Eines de Desenvolupament # Ollama.ai Entorn que permet l’execució de models open source en local. Projecte molt potent que ofereix interfície shell, API-REST, i projectes paral·lels com Ollama Web UI, o Instructor https://jxnl.github.io/instructor/blog/2024/02/08/ollama/ ChromaDB Ludwig: Aprenentatge Automàtic No Code: Eina per a la creació de pipelines d’aprenentatge automàtic sense necessitat de codificació. - Enllaç Haystack: Eina per al desenvolupament amb diversos models i pipelines. - Enllaç LLM Patterns: Avaluació de LLMs: Patrons per a l’avaluació de Models de Llenguatge de Màquina. - Enllaç PromptTools: Eines per a LLMs: Conjunt d’eines per a l’avaluació i experimentació amb Models de Llenguatge de Màquina. - Enllaç Streamlit https://streamlit.io Run llama RAGS https://github.com/run-llama/rags?utm_source=tldrnewsletter ingest and vectorize https://unstructured-io.github.io/unstructured/examples/chroma.html E-commerce # Marketsy.ai: Creació de Botigues en Línia: Eina per crear botigues en línia amb diverses plantilles i opcions de personalització. - Enllaç Eines de Disseny Gràfic # Eina de Microsoft per al disseny d’imatges, diapositives, rètols a partir de prompts. - Enllaç Models de Visió # CLIP: Model d’OpenAI que combina text i imatge per a diverses tasques visuals. - Enllaç LLaVa: Model que pot respondre preguntes sobre imatges, accessible en maquinari de consumidor. 
- Enllaç REKA Multimodal https://reka.ai/reka-flash-an-efficient-and-capable-multimodal-language-model/?utm_source=tldrai Generació de Música a partir de Prompt # Stable Audio: Sistema per generar música i efectes sonors a partir de descripcions de text. - Enllaç ChatGPT Plugins # Webpilot: Plugin per a consultes a pàgines web. Wolfram: Plugin per accedir a dades científiques i realitzar càlculs matemàtics. Smart Slides: Plugin que assisteix en la creació de diapositives. Sumarització # https://scisummary.com https://chatpdf.com\nDatasets # SCB-ST-Dataset4: Dataset per l’estudi del comportament espai-temporal en escenaris d’aula. - Enllaç Cada descripció proporciona una idea clara del que ofereix cada eina o tecnologia, acompanyada dels enllaços per a més detalls.\nLLMs # NexusRaven v2 # https://nexusflow.ai/blogs/ravenv2 NexusRaven-V2, un model de llenguatge de 13B, ha superat GPT-4 en la capacitat de cridar funcions sense exemples previs. Això permet convertir instruccions en llenguatge natural en codi executable. Destaca per la seva alta taxa d’èxit en casos d’ús humans, és de codi obert i fàcil d’integrar. Inclou un banc de proves de crida de funcions i un tauler de líders amb exemples humans, cobrint una àmplia gamma de casos i dificultats. És notable per la seva robustesa i precisió, superant les APIs de models de llenguatge propietaris.\nPer a més detalls, pots visitar la pàgina web original aquí.\narticles essencials # Attacks on machine learning models # Prompt Engineering # https://github.com/dair-ai/Prompt-Engineering-Guide\nInteresting engines # scholar.ai perplexity.ai consensus.ai\n","externalUrl":null,"permalink":"/ludo/en/eines_ia/","section":"Directori d'eines IA útils","summary":"Models de Traducció i Conversió de Veu i Text # SeamlessM4T: Traducció Multilingüe: Model open source que tradueix text i parla entre diferents idiomes, suportant prop de 100 idiomes en text i 36 en parla. 
- Enllaç Speech Gen: Conversió Text a Veu: Eina de pagament que converteix text a veu amb alta qualitat. - Enllaç Whisper: Reconeixement de Veu: Model open source d’OpenAI per al reconeixement de veu amb algunes limitacions en converses bilingües. - Enllaç Whisper-ctranslate2: Millora de Whisper: Millora del model Whisper, amb implementacions més ràpides per SoftCatalà. - Enllaç https://github.com/ggerganov/whisper.cpp versió high performance de Whisper, oferint API com OpenAI i tot https://github.com/ggerganov/whisper.cpp/pull/1380 Insanely Fast Whisper with Speaker Diarization (GitHub Repo) A somewhat opinionated CLI tool for transcribing audio files. It also now includes speaker segmentation and diarization (e.g., recognizing speaker 1 vs speaker 2). The claim is they can transcribe 2.5 hrs of audio in 100s, even on some of the largest models available. Text a veu # https://github.com/coqui-ai/TTS\n","title":"Directori d'eines IA útils","type":"eines_ia"},{"content":"","externalUrl":null,"permalink":"/ludo/en/series/","section":"Series","summary":"","title":"Series","type":"series"}]