
Thursday, August 7, 2025

Value of AI knowledge: Does evolution call for deterministic or LLM-biased results?

I remember the first time I saw a diagram of a neuron (Img.1.). I was a school kid, and among other hobbies I was fascinated by the human brain. I borrowed a book from the bookstore to find out how it all worked, because these cells are responsible for more than just the functioning of the human species.

Img.1: Anatomy of multipolar neuron 

Simplification of the neuron: the perceptron

I took courses at university focused on neural networks and their applications (essentially research) in engineering and control systems. It was a lot of fun because the neuron was simplified into an abstraction presented as a perceptron (Img.2.).

Img.2: Schematic of a perceptron where weights are applied to the inputs, resulting in a weighted sum. The sum is passed to a step function that provides the output.
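A minimal sketch of such a perceptron in Java (the weights and bias below are hand-picked for illustration, here chosen so the unit realizes a logical AND of two binary inputs):

```java
// Perceptron: output = step(sum(w_i * x_i) + bias)
public class Perceptron {
    private final double[] weights;
    private final double bias;

    Perceptron(double[] weights, double bias) {
        this.weights = weights;
        this.bias = bias;
    }

    // Step activation: fires 1 when the weighted sum crosses zero.
    int predict(double[] inputs) {
        double sum = bias;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * inputs[i];
        }
        return sum >= 0 ? 1 : 0;
    }

    public static void main(String[] args) {
        // Weights hand-picked to realize logical AND.
        Perceptron and = new Perceptron(new double[]{1.0, 1.0}, -1.5);
        System.out.println(and.predict(new double[]{1, 1})); // 1
        System.out.println(and.predict(new double[]{1, 0})); // 0
    }
}
```

Training means adjusting the weights and bias from examples; here they are fixed only to keep the sketch short.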


Using multilayer perceptrons with feedforward learning or back-propagation algorithms, we were able to detect specific patterns that signaled the point of interest. With increasing complexity we faced difficulties in interpreting the results in a deterministic way. Nevertheless, the general result was astonishing.

A breakthrough multi-layer neural network system at massive scale 

OpenAI made a remarkable breakthrough when it released ChatGPT in the fall of 2022. It was impressive to observe its ability to predict the next upcoming word probabilistically. The technology, based on the transformer architecture [3], which may use multilayer perceptron networks behind the scenes, acted like a “real assistant”. It was, and still is, impressive, although it is just mathematical calculation on a large scale. This means the true gold of such a system is the calculated weights.

From weights to information and back

Weights need to be properly trained, but even well-trained weights may contain a noticeable level of entropy [4]. The concept of entropy is used across disciplines and is associated with the level of disorder or randomness in a system. A high level of entropy may manifest as unexpected behaviour, which some research suggests could be a property of the Large Language Model (LLM) itself.
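As a small illustration of the concept itself (not of LLM internals), Shannon entropy of a discrete probability distribution is H = -Σ p·log₂ p: a uniform distribution is maximally disordered, while a certain outcome carries zero entropy:

```java
// Shannon entropy H = -sum(p_i * log2(p_i)) of a probability distribution.
public class Entropy {
    static double shannon(double[] p) {
        double h = 0.0;
        for (double pi : p) {
            if (pi > 0) {
                h -= pi * (Math.log(pi) / Math.log(2)); // log base 2
            }
        }
        return h;
    }

    public static void main(String[] args) {
        // Uniform distribution over 4 outcomes: maximum disorder, H = 2 bits.
        System.out.println(shannon(new double[]{0.25, 0.25, 0.25, 0.25}));
        // Certain outcome: no disorder, H = 0 bits.
        System.out.println(shannon(new double[]{1.0}));
    }
}
```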


Why does entropy increase in LLM-based AI systems? LLMs excel at detecting patterns invisible to the naked eye thanks to layered neural networks and trained weights. The connections and correlations between weights, and how they are updated, may be considered non-deterministic; the process can appear empirical or stochastic/random. This means that once the weights reach a desired state, it may be challenging to reproduce the exact process that led there.

Is Boolean algebra still valid?

This reminds me of an operation I learned in my Boolean algebra classes at university: material implication (Img.3.)

Material implication applies to two conditions x and y: when condition x is false, the result is always true, independently of condition y. In other words, everything is correct.
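The truth table in Img.3. can be reproduced with a one-liner, since x → y is logically equivalent to (!x || y):

```java
// Material implication x -> y, equivalent to (!x || y).
public class Implication {
    static boolean implies(boolean x, boolean y) {
        return !x || y;
    }

    public static void main(String[] args) {
        boolean[] values = {false, true};
        for (boolean x : values) {
            for (boolean y : values) {
                // Whenever x is false, the result is true regardless of y.
                System.out.printf("%b -> %b = %b%n", x, y, implies(x, y));
            }
        }
    }
}
```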

Img.3.: Material implication table of 2 conditional states x and y


Perhaps such a state, where everything is correct because the assumption of determinism is false, could be a high-level explanation for the commonly observed LLM state called hallucination. And as some scientists have already pointed out, hallucination, as a property of the system, is difficult to correct.

Are LLMs a robust repository of knowledge?

The ability of LLMs to highlight hidden patterns in text, speech, images, and voice has already been mentioned as astonishing. An LLM works very well as a translator of written or spoken words into bytes, with a probabilistic level of accuracy judged by an external human observer, who evaluates the answer/result received from the LLM-based agentic AI system.

The architectural scale of agentic systems may remind someone of, for example, the precisely designed ancient pyramids.

The reason the pyramid is taken into consideration is the magnitude of the involvement of LLM-based systems across the industry. While the architecture may not look, from a higher perspective, as deterministic as that of the Great Pyramids of Giza, its magnitude is evident. These pyramids are considered among the masterpieces ever built by mankind. Is the use of LLM or deep learning techniques heading in the same direction?

 

The process of how, and with what type of tools, the pyramids were built seems to be forever lost to history. Something similar holds for how an LLM achieved its most accurate results: the answer lies in the history of the weights, or rather how the weights were calculated, which is also lost, only over a much shorter time frame.


Img.4.: Simplified information storage in an architecture pattern: each stone inside the pyramid has a deterministically defined structure and position, although the fluctuation of material particles may be considered non-deterministic. This is in contrast to a neural network and its storage/extraction of information on the right side of the image. Red color indicates the element to be corrected.

 

What is the knowledge self-repair factor?

Given the robustness and resiliency of the pyramids, the architecture seems to have been proven by time. Any repair can be made at the deepest layer of the system without compromising its stability (Img.4.). Since LLM AI agent systems are based on multilayer neural networks, which may imply a higher level of entropy (randomness), such a correction, or indeed any correction, may present a challenge with a not-entirely-deterministic outcome.


The question of the repairability of such a complex and highly scalable system may prompt new discussions. Over the years, mankind has developed a number of enterprise design patterns for business processes that allow work to be performed with minimal entropy, leading to the desired result with a defined probability, i.e., an error state is accounted for. Such a process is deterministic and, moreover, repeatable.

An example is the SAGA pattern (a long-running transaction [6]), in which the system has the ability to recover from specific states or take compensating action; such a state is repeatable.
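A hedged sketch of the idea in Java (the step names below are hypothetical): each completed step registers a compensating action, and on failure the saga undoes the completed steps in reverse order, returning the system to a known, repeatable state:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Minimal saga sketch: compensations are stacked as steps complete.
public class SagaSketch {
    static List<String> run(boolean failAtShipping) {
        List<String> log = new ArrayList<>();
        Deque<Runnable> compensations = new ArrayDeque<>();
        try {
            log.add("reserve inventory");
            compensations.push(() -> log.add("release inventory"));

            log.add("charge payment");
            compensations.push(() -> log.add("refund payment"));

            if (failAtShipping) {
                throw new IllegalStateException("shipping failed");
            }
            log.add("ship order");
        } catch (RuntimeException e) {
            // Undo completed steps in reverse order: a repeatable recovery path.
            while (!compensations.isEmpty()) {
                compensations.pop().run();
            }
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(true));
    }
}
```

The key property is that both the happy path and the compensation path are explicit and deterministic, unlike a correction applied to trained weights.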

Is biased information random?

The intensive development in the field of large language models reveals new insights every day and offers possible directions for improving or mitigating challenges observed in current implementations. It is possible that as humanity works with increasingly large models, deterministic information sinks deeper into the values of the probability weights. This means that we lose connected points in favor of computed weights. A side effect may be a reduced ability to effectively prompt the system with questions (Img.5.)

Img.5.: A possible consequence of intense prompting with LLM agents may be damage to cognitive connectivity within the brain due to the removal of connections between stored information.


The trigonometric functions sine and cosine are well known [7] (Img.6.). LLM transformers make extensive use of them to encode the correct word position. It is worth remembering that the entire LLM model still only predicts the next word.

Img.6. Sine and cosine functions plotted from 0 to 2π. 
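The transformer paper's positional encoding interleaves sine and cosine values at different frequencies, so each position gets a unique, smoothly varying vector. A simplified sketch (the method and variable names are my own, for illustration):

```java
// Sinusoidal positional encoding as in the Transformer architecture:
// PE(pos, 2i)   = sin(pos / 10000^(2i/d_model))
// PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))
public class PositionalEncoding {
    static double[] encode(int pos, int dModel) {
        double[] pe = new double[dModel];
        for (int i = 0; i < dModel; i += 2) {
            double angle = pos / Math.pow(10000.0, (double) i / dModel);
            pe[i] = Math.sin(angle);
            if (i + 1 < dModel) {
                pe[i + 1] = Math.cos(angle);
            }
        }
        return pe;
    }

    public static void main(String[] args) {
        // Position 0 encodes as [0, 1, 0, 1]; position 1 differs smoothly.
        System.out.println(java.util.Arrays.toString(encode(0, 4)));
        System.out.println(java.util.Arrays.toString(encode(1, 4)));
    }
}
```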


The use of trigonometric functions has an amazing effect on word order correction. Based on the calculated LLM weights, the correction of dramatically misspelled text is astounding. However, the overall outcome of the agentic system may still lead to random results, because the response is biased by the training data. Current development efforts seem willing to address such questions, as biased data may have a broader impact.

Prompting conclusion

History has shown the power of deterministically stored information in many different areas. Through evolution, humanity has trained the brain to increase neuronal connectivity and improve cognitive and other functions (Img.7.)

Img.7.: Three people, among many, dedicated to gathering knowledge and understanding through the process of learning, such as Leonardo da Vinci, Nikola Tesla and others

Img.8.: Combining knowledge, influenced by the fundamental concept of gravity, projected into the pyramid repair process


It is very exciting to watch and contribute to the entire process of evolution in the field of artificial intelligence, whether as an individual or a group. We are constantly discovering new directions from very different perspectives, knowing that our known world must follow a number of fundamental rules, such as gravity, which exists between all entities with mass or energy (Img.8). The changes in the field of agentic AI systems are very intense and rapid, but the process itself is narrow and centered around small groups of authors, while the scope of impact is very wide. The ability to understand, design, or cope with the current state already comes with prerequisites defined by the level of applied mathematics required.


The future is not yet written and every moment is important.


Out of a willingness to contribute to this development, I helped create, among other things, the JC-AI newsletter [8][9] as part of our Java Champion initiative. Do not hesitate to contact me about future collaboration.



[1] https://en.wikipedia.org/wiki/Neuron

[2] https://en.wikipedia.org/wiki/Perceptron

[3] https://en.wikipedia.org/wiki/Transformer_(deep_learning_architecture)

[4] https://en.wikipedia.org/wiki/Entropy

[5] https://en.wikipedia.org/wiki/Boolean_algebra

[6] https://en.wikipedia.org/wiki/Long-running_transaction

[7] https://en.wikipedia.org/wiki/Sine_and_cosine

[8] https://foojay.io/today/ai-newsletter-1

[9] https://foojay.io/today/jc-ai-newsletter-2


Sunday, August 3, 2025

Stay tuned: Artificial Intelligence, not only Java?

It may seem that nowadays, when an article does not use the abbreviation LLM (Large Language Model) or AI (Artificial Intelligence), the article is seen as having reduced value, even though its message may still be valuable.

It may be that the general information flow is overwhelmed with news about the use of artificial intelligence systems. I can offer another perspective: that of a public eager to understand. Yes, to understand what these artificial intelligence systems are and how they contribute, deterministically contribute, to everyday business or the development of humanity.

In recent years, I have worked on various types of machine learning and knowledge base systems. Some of them used probability theory and advanced mathematics to calculate the most likely outcomes. It all requires continually reading research articles and modifying current approaches based on newly gained understanding. I feel that the field of current Large Language Models, based on neural networks and weights, requires the same in order to keep understanding what has been achieved and to evaluate the current state.

Knowledge is language independent, but it also requires the right stimulation. I was pleased to see that in our JavaChampion group there is a great need to share knowledge and actively contribute to the process of spreading awareness about all the possibilities not only in the field of LLM, but also in all areas of artificial intelligence in general. Artificial intelligence has much more to offer and it is worth discovering.

Foojay.io - JC-AI-Newsletter Vol. 1

Friday, February 10, 2023

Launch Announcement: How I Became a Book Author

"One day, while reading a book, I thought about what it would be like to be an author and write a book myself. This day is here and it is real and here some details of my journey... #java #designpatterns #platform #effectiveness #maintenance #fun #rocks Packt"

  I started working on the book almost a year ago and now it's done! The book is published, all minor tasks are solved and the current state? I'm looking forward to my hardcopies! 

  Now I'm trying to figure out how to properly share how much thought I put into creating something that every developer will potentially appreciate in their daily job. Something that could kick them back onto the trail when they are stuck in a local minimum. Maybe not only them but also me, just to refresh some points and keep all the knowledge fresh. I was thinking that maybe even my kids will appreciate it when they grow up, haha.

  Design patterns are a very engaging topic, similar to math. Technologies may change, hardware may change, and we cannot stop evolution or time, but math remains, and so do design patterns. They may be adopted differently due to technology jumps, but they will be there.

  The book begins by introducing the Java platform in enough detail to shape the context to understand the value of using patterns. Insights are automatically revealed during usage of presented programming concepts while implementing patterns. 

  I have used neutral examples in the book, using vehicle manufacturing abstractions to drive the reader through the entire book, as we all love vehicles. This setup allowed me to connect all the dots between different design pattern approaches and implementations and to create a flow where readers may identify themselves with the chapter. With all the great Java APIs and all the newly added Java platform enhancements, I was inspired to not use additional frameworks: just pure Java and the command line. I hope the reader appreciates it as much as I do :). In my eyes this allows the reader to stay fully concentrated on the particular topic and apply it across different scenarios. I'll leave it to the reader to judge how successfully I did it.

  The book contains many standard terms used across the application design community, which makes it valuable reading material not only for developers but perhaps also for project managers, helping to establish a shared terminology across the different types of meetings in different stages of application development. Let's see; it was one of my secret wishes ;)

  Anyhow, after many years of working with multiple languages running on the Java Virtual Machine, my biggest pleasure was always the Java language as the most effective tool for creating byte-code.

  I want to thank my beloved wife, my beautiful kids for giving me energy to step over difficulties and continue my work on this book till the successful end.

It was a great pleasure to work with the Packt team and the reviewers who helped me through this amazing journey. My special thanks goes to Bruno Souza for writing such a beautiful foreword!

Thank you guys: Bruno Souza, Sonia Chauhan, Sathya Mohan, Prajakta Naik, Rohit Kumar Singh, Werner Keil and others.

It was my big pleasure to make this book happen.

Of course I can not forget my peers from the OpenValue family for having some nice discussions with me.



Monday, November 15, 2021

Java Flight Recorder: profiling Kotlin and Java apps with fun


Introduction

The goal of this article is to examine the possibilities of profiling Kotlin and a similar Java application with JMC/JFR to get a better understanding of their behavior compared to each other.
Nowadays the IT world often uses terms like Site Reliability Engineering (SRE) [5] or latencies, but how do we measure these accurately without relying on “random” samples? The answer to this question is out of scope, but we can show what possibilities exist for understanding the behaviour of an application, based on simple examples.

Starting slowly: what is the JVM? JVM stands for Java Virtual Machine. The JVM enables a computer to run Java programs, but not only them: it supports all languages that can be compiled into Java byte-code. All good so far, but what is JFR?

The letters JFR stand for the Java Flight Recorder, which is an event-based toolset built directly into the JVM.
Exciting, isn't it? The JFR can provide a view into the JVM internals through emitted events and more…
The purpose of this article is to compare a Kotlin and a Java app and touch on “hidden” compiled code compositions.

Let’s briefly introduce Kotlin, as Java has been known a bit longer. I bet a couple of million articles have already been written about Java, so it’s fair to give Kotlin a short introduction.

Kotlin belongs to the JVM language family. It was introduced in 2011 as a new language for the JVM and has been developed by JetBrains.
 

The communicated goal was to become a “better language” for the JVM. Kotlin has been successfully adopted by the Android community. In short, Kotlin is an object-oriented, statically typed language. It offers a set of quite handy features: data classes, conciseness, null safety, smart casting, functional capabilities, etc.
Aside from the often cited Kotlin "benefits", it seems that nowadays more companies are considering or already using a Kotlin stack for backend development. The reasons may seem obvious, but there are different perspectives; such a discussion can turn into a chicken-and-egg argument and is out of the scope of this article.

I have put the word "benefits" in quotes intentionally, as some nice concepts may look very neat in Kotlin but may not be as optimized as their Java counterparts (considering the latest Java builds). The Java ecosystem evolves pretty fast, and Java remains the first JVM language and also the most optimized one. Nonetheless, Kotlin has many nice constructs that can help teams move faster without causing unwanted issues. An example of such an issue is the well-known NullPointerException (NPE).


Let’s profile  

The introduction is done. Let’s now compare the characteristics of comparable Java and Kotlin apps using JMC/JFR!

The setup goes first. For each measurement we use the current development state of JMC/Java Flight Recorder - Early Access [1]. Each measurement is done in a 45-second time window. All examples use OpenJDK 17 [4].


We consider the following 2 examples [2] to get some measurable data: 

  • Hot-Methods
  • Latencies

Each example uses similar JMC/JFR events. These events wrap equivalent sections of the application (Java/Kotlin) to obtain comparable results. The number of application threads is also reduced in both cases for the same reason. Detailed platform configuration is avoided, as the goal is to compare basic platform configurations.
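To illustrate what such event-wrapping looks like in code (the event and field names below are made up for this sketch, not taken from the tutorial examples [2]), a custom JFR event surrounds a measured section with begin()/commit():

```java
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;

// Hypothetical custom JFR event wrapping a measured section of the app.
@Name("demo.WorkerIteration")
@Label("Worker Iteration")
class WorkerIterationEvent extends Event {
    @Label("Items processed")
    int itemsProcessed;
}

public class JfrDemo {
    public static void main(String[] args) {
        WorkerIterationEvent event = new WorkerIterationEvent();
        event.begin();              // start the event clock
        int processed = doWork();   // the section being measured
        event.itemsProcessed = processed;
        event.commit();             // emit the event to the recording
        System.out.println("processed " + processed);
    }

    static int doWork() {
        // Placeholder workload standing in for the real worker logic.
        int sum = 0;
        for (int i = 0; i < 1_000; i++) sum += i % 7;
        return sum;
    }
}
```

Run with a recording enabled, e.g. `-XX:StartFlightRecording`, and the emitted events become visible in JMC.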


1. Hot-Methods Example

The idea of this simple app is to have two “containers” held by an individual "worker". Each container holds its own collection of numbers. The worker tries to find the intersection of the paired containers (see Img.1., in Kotlin; similar in Java)


Img.1.: example worker code - Kotlin
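Since the original listing is an image, here is a rough Java sketch of the worker idea (the class and method names are illustrative, not the tutorial's actual code):

```java
import java.util.HashSet;
import java.util.Set;

// Worker sketch: find the intersection of two "containers" of numbers.
public class IntersectionWorker {
    static Set<Integer> intersect(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new HashSet<>(a);
        result.retainAll(b); // keep only elements present in both sets
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> left = Set.of(1, 2, 3, 4);
        Set<Integer> right = Set.of(3, 4, 5);
        // Prints the common elements, e.g. [3, 4] (order is unspecified).
        System.out.println(intersect(left, right));
    }
}
```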

After running the first, not yet fixed, version of the code for 45 seconds, we obtain the following results for the Java app (see Img.2.)

Img.2.: Hot-Methods example: Java app - 431 400 events emitted

We repeat this process for the Kotlin version of the similar application for 45 seconds and obtain the following results (see, Img.3.)
Img.3.: Hot-Methods example: Kotlin app - 408 403 events emitted

Let's fix the code in both apps and observe the throughput improvements on both sides. We publish only the number of JFR events emitted (Img.4.), as this is the indicator of the improvements achieved.

Img.4.: Hot-Methods example Java, Kotlin comparison results


2. Latency Example

The second example is based on getting insights into latency, which can be caused by many things: unnecessary garbage collection, network issues, or improper synchronization inside the application. In the current example we consider a problematic logger (Img.5.) with a corrupted log method that is synchronized, which forces each thread to wait for it.

Img.5.: Problematic Logger method
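A hedged Java sketch of what such a logger might look like (the names and the sleep are illustrative; the sleep simulates slow I/O performed while the lock is held):

```java
// All callers contend on the same monitor, so threads serialize here.
public class ProblematicLogger {
    static synchronized void log(String message) {
        try {
            Thread.sleep(10); // simulated slow I/O inside the critical section
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        System.out.println(message);
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> log(Thread.currentThread().getName());
        Thread t1 = new Thread(task, "worker-1");
        Thread t2 = new Thread(task, "worker-2");
        t1.start(); t2.start();
        t1.join(); t2.join(); // the second thread waits for the first
    }
}
```

This kind of contention shows up in JFR as Java Monitor Blocked events, which is exactly what the recording below highlights.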

The Kotlin application uses the concept of Mutex (Kotlin Interface, coroutines library) to enforce synchronization. Let's take a look at the measurements. We try to answer the question of how many events can be emitted from the corrupted code in a 45 seconds time window (Img.6.: Java, Img.7.: Kotlin)

Img.6.: Java Problematic logger results, 222 JFR events
When we run the Kotlin application we receive almost identical results (Img.7)

Img.7.: Kotlin Problematic logger result, 222 JFR events
After fixing the problematic logger we obtain the following results (Img.8.: Java, Img.9.: Kotlin).
The results show that the blocking issue has been removed.
Img.8.: Java fixed logger, 957 JFR events
How is Kotlin app doing? 
Img.9.: Kotlin fixed logger, 808 JFR events


Conclusion

Looking at the results, we can observe that they are very comparable. It is important to note that the time window for the examples was just 45 seconds, and the examples were set up from a research perspective to highlight specific issues. We have seen the possibilities the Java Platform provides for making our applications better and understanding their behavior in more detail, mainly by demonstrating what the Java Flight Recorder brings us. We have also explored, in a bit more detail, the composition of Kotlin itself and the coroutines framework (an asynchronous, non-blocking library).

 

Stay Tuned 

and Happy JVM coding !  



References:

  1. Java Mission Control Project
  2. JMC-JVM-Lang tutorial: profiling examples for different languages 
  3. JMC-tutorial examples by Markus Hirt
  4. OpenJDK 17
  5. Site Reliability Engineering - SRE