LLMs Wildest Dreams

Over the past two weeks, I’ve been following a number of discussions around recent research by Anthropic — the company behind the Claude language model family. These discussions have played out not only in academic circles, but also across social media platforms and YouTube, with science communicators offering their own takes on what the research means.

I noticed that many of these interpretations diverge significantly, sometimes dramatically, from what the papers actually say (and — by the way — from each other). In particular, some claims struck me as exaggerated or speculative, even though they were presented with great confidence and strong rhetorical framing. Others, while more cautious, also made far-reaching conclusions that aren’t necessarily supported by the evidence.

So I decided to take a closer look at the primary sources—the papers themselves—and compare them to some of the circulating interpretations online. My aim here is to share how I read the papers, highlight where I believe common misunderstandings have occurred, and offer a technically grounded perspective that avoids hype, but also doesn’t dismiss the progress that has clearly been made.

More specifically, this post focuses on the following:

  • A structured recap of the two Anthropic papers and what they actually demonstrate.
  • A comparison of two popular YouTube interpretations (by Matthew Berman and Sabine Hossenfelder), which offer almost opposing takes.
  • A short excursion on LLMs and how they work (and why)
  • My own analysis of why certain assumptions — about self-awareness, internal reasoning, and “honesty” in LLMs — may not be warranted based on the research.

In writing this, I’m not claiming to offer the final word. But I do think it’s important to take a step back from the soundbites and really ask: What do these findings actually show? And what might they not show?

If you’ve found yourself wondering whether LLMs are secretly reasoning behind our backs, or whether it’s just a glorified autocomplete—this post might help you sort signal from noise.

Read More

Obsidian Plugin for Print

This blog post covers my experience building an Obsidian plugin using ChatGPT – including a minimal backend. The project is open source and available on GitHub:

Why that? To solve a real limitation I’ve hit repeatedly. Obsidian on iOS doesn’t support printing or PDF export. That’s an issue, if you want to print short notes, e.g. prep lists for talks, interview questions, or quick todos. To make it even more useful (to me), I don’t print to A4 – I use a thermal receipt printer connected to a Raspberry Pi 4B. It prints on 80 mm continuous paper, ideal for portable, foldable notes that fit in a jacket pocket. Of course, a nice addition would be to support A4, too 🤓.

The resulting notes look like this:

Read More

Strategic Thinking in Complex Situations

Do you know the feeling of standing helplessly in front of a technical problem, with modern devices leaving you no room to intervene in the solution because they are closed systems? It feels like you have no control and are at the mercy of the manufacturer. I work in IT myself and have carried out many software projects in various roles throughout my life. Actually, I am not willing— and I also tell all my friends and acquaintances— to help with technical issues on home devices like smartphones, computers, or laptops. My usual saying is: “I’d rather help you move than fix a computer problem.”

But what if it’s your own device to deal with? Then you have to take care of it. Many problems can be solved with extensive Googling or by asking an AI of your choice. I recently encountered an issue with my iPad that turned out to be tricky to fix.

Let me briefly explain the problem: Despite having automatic updates enabled, I hadn’t updated my iPad for a long time. The latest iPadOS version was 18.3.2, but I was still running 16.6.1 without realizing it. That really surprised me. When I tried to manually trigger the update, the iPad downloaded it for a long time, prepared the update even longer, restarted, but remained on the old operating system.

Spoiler Alert: The strategies I present in this article help with almost any issue 🤓.
A good reading advice that helped me a lot: The Logic of Failure: Strategic Thinking in Complex Situations (german)

TL;DR: The final solution was to reset the iPad, set it up as a new device, immediately install the update, and then restore it from the last backup. Often, an update fails to complete correctly in a specific system state. By resetting the device, updating to the latest iPadOS version, and then restoring the backup, I was able to resolve the issue.

Read More

Exploring langchain4j with Spring Boot: A Practical Journey

Many examples, tutorials, and blog posts convey the impression that Python is the only language to do AI. This blog post shows that for enterprise applications, Kotlin with Spring Boot is a very good choice in order to build robust AI and Agentic AI applications.

In today’s fast‑evolving landscape of AI‑driven applications, integrating language models into enterprise solutions is essential. This post explores LangChain4j within a Spring Boot environment using Kotlin. I will demonstrate how to declare AI services using annotations, integrate system prompts, dynamic user prompts, and tools — all via declarative interfaces.

Read More

The Evolution of Large Language Models in Programming: A Broader Perspective

In the previous two posts of this series, I shared my experiences with using ChatGPT to create a coding project, which involved some ups and downs. It took 78 prompts and a total of 350 lines of prompts to create a 118-line Typescript project. Moreover, the process took four times longer than coding it by hand. Nevertheless, this was only the beginning of my exploration of the potential of Large Language Models (LLMs) in programming.

Read More

The ChatGPT Programming Experience: A Second Attempt

Welcome back to my series on using ChatGPT for programming (written with help of ChatGPT)! In this second installment, I’ll share my experiences using ChatGPT to build a small software project.

Here you can find the chats: https://github.com/juangamnik/chatgpt-schedule/tree/main/chatgpt

I set out with the premise that I would not write more than a few lines of code, leaving most of the work to ChatGPT. Let’s dive into the numbers, the areas where ChatGPT slowed me down, and the moments where it truly shined.

Read More

Large Language Models, GPT, and I

This is the first blog post in a row that describes my first experiences with ChatGPT as a pair programmer and assistant for a developer. As you will see, these experiences had some ups and some downs, but all in all – spoiler alert – on the one hand is the advancement in A[G]I (Artificial [General] Intelligence) especially in regard of NLP (Natural Language Processing) using LLM (Large Language Models) and GPT (Generative Pre-trained Transformers) very impressive. On the other hand I had to revise some of the observations (and criticism) I made, just weeks later, since the pace of evolution in the AI scene is so freaking high.

So these are the blog posts of this series:

  1. This post
  2. The ChatGPT Programming Experience: A Second Attempt
  3. The Evolution of Large Language Models in Programming: A Broader Perspective

But let’s start at the beginning (the following text has been created with help of GPT-4)…

Read More

Incremental Backup of Image Files (or: How to Diff and Patch Big Binary Files)

More often than expected, there is a problem for which there should be an easy solution, but a short googling session lets you behind with the hollow feeling that the world let you down… again. But then you put out your unix skills to find a solution for the problem on your own.

Update: There is a Part II to this post, which explains the idea behind the solution shown here

Today is such a day… The problem is as follows: you backup a disk (e.g. the sdcard of a raspberry pi) with dd like this:

$ sudo dd if=/dev/mmcblk0 of=/media/backup/yyyymmdd-raspi-homebridge.img bs=1M

A backup with dd is a bitwise copy, which takes exactly the space of the disk, no matter how empty the block device is. I.e., the dd-image of an sdcard with nominally 16GB takes about 15GB (the usable space of the disk). If the device is more or less empty, the image consists of a lot of zeros and can be compressed with tools like bzip2 very well. In your (i.e., my) case 6 GB are used on the disk. After compressing the image it is less than 2 GB. Sounds great, right? Unfortunately, you are paranoid and want to store the last X backups. Even with a small X, this can get really hungry on your cloud storage. This is the time where your inner voice says: Wouldn’t it be great to store the delta of an old to a new backup, only?

That means, you store the complete (compressed) backup of the most current backup, as it is most likely, that you need it than older ones. The older backups are just deltas to the next-newer backup. Each time a new backup is created, the predecessor image is replaced by a diff/delta between it and the new backup.

There must be a solution for this, right? Meh, at least I couldn’t find that solution. If you found it, please comment below. So, I started some experiments…

Read More

Find Missing Files in a Backup 2nd Iteration

I already wrote a post about this topic. But as I do at work, I work in an agile manner at home. So here is an update to the post Find Missing Files in a Backup. The script there has been designed to be copied from the clipboard into the terminal. This time, I present you a script, which you may copy to a file, make it executable and reuse it easily. Further on, it fixes some minor issues with the original version (e.g. handling files with spaces and backup folders which are named differently, than the original folder).

For the more general idea of this script, please have a look at the former blog post (see link above). So here it is:

#!/bin/bash

src=$1
tgt=$2

(cd ${src} && ls) | while read file; do
  found=$(find $tgt -name "$file" | wc -l)
  if [ $found -eq 0 ]; then
    echo $file
  fi
done

Copy this into a file e.g. named backup-check.sh and give it exec rights:

$ chmod ugo+x backup-check.sh

Afterwards you can use it like this:

$ ./backup-check.sh original/ backup/

Exciting 🤓.