Hey everyone! I'm curious about how the design of programming languages impacts the effectiveness of LLMs (large language models) when it comes to generating and assisting with code. My gut feeling is that LLMs perform better with mainstream languages like Java, Python, and JavaScript. But does it also depend on the language design? For instance, do statically-typed languages have an advantage in LLM code generation, or do more verbose languages help? I'm also wondering if all of this ultimately boils down to the amount of training data available. Would love to hear your thoughts! Thanks!
7 Answers
I feel LLMs shine in generating code for less verbose languages (like Python), since you can do more with fewer characters. But statically typed languages offer better safety, because their compilers catch it when an LLM hallucinates an API that doesn't exist.
I’m curious how well LLMs would handle writing code in something like Brainfuck, given how little training data is available. The syntax is trivial, but the programs carry almost no structure (no names, no types) for the model to latch onto, which could be a challenge.
In my experience, LLMs do better with languages that have fewer rules, like Python. Sure, they can write valid code in Java, but the chance for mistakes seems higher. For example, I once saw Claude invent a vec5 type for GLSL, and that’s just wild!
I think statically typed languages should be easier for LLMs to handle. However, tools like Copilot sometimes generate code with type errors, which suggests it's not a certainty. Maybe this is an area where LLMs can improve in the future?
But doesn't that just mean you might not spot those errors easily in dynamically typed languages?
I’ve used Copilot for multiple languages and I find it works well with JavaScript and HTML, but I prefer IntelliSense for Java.
Absolutely, training data is critical! A language with more restrictions makes it easier for LLMs to avoid common errors. With a good design, issues like memory management and type mismatches can be minimized.
LLMs seem to be more effective with statically typed languages like Java. If an LLM makes a mistake and references a non-existent class member, Java won't compile the code, while languages like JavaScript or Python may let it run and then crash later on. That’s one reason why I switched from YAML to Kotlin for my projects; I want that safety net!
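To make the runtime-crash point concrete, here's a minimal Python sketch (the `User` class and the `full_name` attribute are made-up names for illustration). The interpreter happily accepts the function definition; the hallucinated attribute only fails when the line actually executes, whereas a Java compiler would reject the equivalent member access up front.

```python
class User:
    """Toy example class with a single real attribute."""
    def __init__(self, name):
        self.name = name

def greet(user):
    # An LLM might hallucinate a .full_name attribute that doesn't exist.
    # Python accepts this definition without complaint; nothing fails
    # until greet() is actually called.
    return "Hello, " + user.full_name

u = User("Ada")
try:
    greet(u)
except AttributeError as e:
    # The mistake only surfaces here, at runtime.
    print("caught at runtime:", e)
```

So the safety-net argument isn't that statically typed languages make LLMs hallucinate less, it's that the hallucination is caught before anything runs.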
Do you find Kotlin faster for personal development projects compared to Java?
Awesome question! A couple of years ago, I thought the object-oriented nature of Java would help LLMs understand bigger code structures better. But honestly, I don’t notice a significant difference since I mainly use LLMs for smaller tasks like plugins or scripts rather than full-scale code refactoring. I still think having a clear design, like Java interfaces, can make things easier in general.
I’m not sure I agree; having a simpler syntax doesn’t guarantee LLM outputs are sensible. Statically typed languages catch hallucinations at compile time, before the code is ever executed, unlike JS or Python where you might not find out until runtime.