Every time I have read a new post offering advice on how to work with an LLM, be it with prompts or context, I just couldn’t shake the feeling that there was some unifying “theory of language” that explained what made a prompt good or bad.
I had initially explored this by trying to describe the ideal prompt as a mixture of formalism and “maneuverability”. However, when I discussed these ideas with colleagues, it was politely pointed out that I was simply imposing a new system of superstition to “formalize” an existing list of superstitions.
But is there such a “theory of prompting”? What would such a theory look like? If we build a large model of language, is it safe to assume that we have built something that models language?
It turns out that if we shift our focus from Large Language *Model* to Large *Language* Model, then the field of linguistics is exactly what we want. There is no need for us to re-invent theories and notations - the research has already been done for us!
What follows are examples of good and bad prompts, and my attempt to explain why they are “good” or “bad” - not with my own vibism, but by relying on existing linguistic theories.
Working Memory Theory
Working Memory Theory is a crucial concept in linguistics, particularly in understanding how we process and learn language. Working memory acts as a mental workspace where we hold and process incoming linguistic information, like words and sentences, while integrating them with our existing knowledge. (Sound familiar?)
But working memory has limited capacity, meaning it can only hold a certain amount of information at once - famously, around seven items. Just as human working memory can be overwhelmed, LLM working memory can also be overwhelmed!
When you overwhelm an LLM with a kitchen-sink prompt, you’re creating the computational equivalent of “cognitive overload”. Just as with humans, breaking down complex information into “semantically coherent segments” significantly improves processing accuracy of an LLM.
Before (cognitive overload):
Create a complete web application with user authentication, database integration, real-time chat, file upload functionality, admin dashboard, and responsive design using React, Node.js, Express, MongoDB, and Socket.io with proper error handling, security measures, and performance optimization.
After (chunked for cognition):
Let's build a web application step by step:
1. First, create a basic React frontend with user registration/login forms
2. Then, set up a Node.js/Express backend with MongoDB for user management
3. Next, implement secure authentication with JWT tokens
4. Finally, add real-time chat using Socket.io
Focus on step 1 first - create the user registration component.
We already intuitively know, have experienced, and feel that the chunked version is better. But now we have a theory of why: working memory.
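The chunked pattern is mechanical enough to generate programmatically. Here is a sketch (the function name and output format are my own invention, not an established API):

```python
from typing import List

def build_chunked_prompt(goal: str, steps: List[str], focus: int = 0) -> str:
    """Render a kitchen-sink request as a numbered plan with a single
    focus step, keeping each chunk within working-memory-sized limits."""
    lines = [f"Let's {goal} step by step:"]
    for i, step in enumerate(steps, start=1):
        lines.append(f"{i}. {step}")
    # End by directing attention to exactly one step
    lines.append(f"Focus on step {focus + 1} first - {steps[focus]}.")
    return "\n".join(lines)

prompt = build_chunked_prompt(
    "build a web application",
    [
        "create a basic React frontend with registration/login forms",
        "set up a Node.js/Express backend with MongoDB",
        "implement secure authentication with JWT tokens",
        "add real-time chat using Socket.io",
    ],
)
```

Each call produces one coherent chunk structure, so the same task list can be re-focused step by step as the conversation progresses.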
Linguistic Anchoring
The “anchoring effect” describes the tendency for individuals to rely heavily on the first piece of information they receive (the “anchor”) when making decisions, even if that information is irrelevant or misleading.
“Selective Prompt Anchoring” is the application of anchoring to prompting, where we set specific tokens to be the “anchored text”. We are attempting to amplify attention towards these tokens, so as to better control the model’s focus.
Before (attention drift):
Write a function to sort a list.
After (linguistically anchored):
TASK: Sort a list of integers efficiently
FOCUS: Choose optimal algorithm for large datasets
CONSTRAINTS: Handle edge cases (empty lists, duplicates)
DELIVERABLE: Python function with time complexity analysis
from typing import List

def sort_large_list(nums: List[int]) -> List[int]:
"""Efficiently sort a large list of integers."""
# Your implementation focusing on the TASK above
Again, we have seen the second prompt perform better. But, instead of us offering advice based on heuristics, we can lean on the existing anchoring effect literature to explain why it works.
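For reference, one plausible answer to the anchored prompt (my sketch, not actual model output) would lean on Python's built-in Timsort:

```python
from typing import List

def sort_large_list(nums: List[int]) -> List[int]:
    """Efficiently sort a large list of integers.

    Time complexity: O(n log n) worst case via the built-in Timsort,
    approaching O(n) on lists with pre-sorted runs. Edge cases from the
    CONSTRAINTS: empty lists return [], duplicates are preserved
    (Timsort is stable), and the input list is not mutated.
    """
    if not nums:         # edge case: empty list
        return []
    return sorted(nums)  # returns a new list; input left untouched
```

Note how the anchored DELIVERABLE line shapes the answer: the docstring carries the requested complexity analysis, and the CONSTRAINTS line surfaces as explicit edge-case handling.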
Information Density
Information Density is a measure of how much information is packed into a linguistic unit. Information density is not an inherent property of a language, but rather context-dependent. For example, a word might be highly predictable in one sentence and less predictable in another.
Speakers and writers make choices about how to most efficiently encode and communicate their messages, and information density plays a role in these choices.
The best example of this theory in action is writing prompts that are clear and concise.
Before (low density, high noise):
Please help me write some code that can handle files and do some processing on them. I need it to work with different types of files and be able to process them efficiently. Can you make something that's robust and handles errors well?
After (optimized information density):
Create a Python file processor class that:
- Accepts .txt, .csv, .json file types
- Reads content with encoding detection
- Applies transformation function (passed as parameter)
- Writes to output directory with '_processed' suffix
- Handles FileNotFoundError, PermissionError, UnicodeDecodeError
- Logs progress for files > 1MB
In the optimized version, we’ve removed vague terms, removed redundancy, increased specificity, provided measurable success criteria, and made a specific choice about message encoding.
Information theory thus provides the framework for what we intuitively know already: precision beats verbosity.
But if you can’t be precise, be verbose, right? Well, this works because “pragmatics” and “discourse theory” suggest that redundancy and multiple attempts at explanation can help listeners (and LLMs!) triangulate meaning.
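The dense spec above is concrete enough to implement almost line by line. Here is a sketch (class and method names are my own, and “encoding detection” is approximated with a utf-8 → latin-1 fallback rather than a real detector like chardet):

```python
import logging
from pathlib import Path
from typing import Callable

logging.basicConfig(level=logging.INFO)

class FileProcessor:
    """Sketch of the optimized spec: accepts .txt/.csv/.json, applies a
    transformation, writes '<name>_processed<ext>' to an output directory."""

    ALLOWED = {".txt", ".csv", ".json"}

    def __init__(self, output_dir: str):
        self.output_dir = Path(output_dir)
        self.output_dir.mkdir(parents=True, exist_ok=True)

    def _read(self, path: Path) -> str:
        try:
            return path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            # crude fallback; latin-1 can decode any byte sequence
            return path.read_text(encoding="latin-1")

    def process(self, path: str, transform: Callable[[str], str]) -> Path:
        src = Path(path)
        if src.suffix not in self.ALLOWED:
            raise ValueError(f"unsupported file type: {src.suffix}")
        try:
            if src.stat().st_size > 1_000_000:  # log progress for files > 1MB
                logging.info("processing large file: %s", src)
            content = self._read(src)
        except (FileNotFoundError, PermissionError, UnicodeDecodeError) as exc:
            logging.error("cannot read %s: %s", src, exc)
            raise
        out = self.output_dir / f"{src.stem}_processed{src.suffix}"
        out.write_text(transform(content), encoding="utf-8")
        return out
```

That one-to-one mapping from bullet to code is the point: a high-density prompt leaves the model (or a human) very little room to guess.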
Embodied Cognition
Cognitive linguistics holds that humans understand abstract concepts through physical metaphors. It turns out that good LLM prompts sometimes also use this physical metaphor approach.
Instead of treating code as abstract logic, a good code generation prompt can leverage “embodied cognition” by grounding programming concepts in physical experience.
Before (abstract):
Implement caching functionality
After (embodied):
Create a memory system that works like a librarian's quick-access shelf—frequently requested books stay within arm's reach while rarely used volumes move to distant archives. Build this caching layer where hot data stays close and cold data migrates to deeper storage.
The “language” of embodied cognition happens through “image schemas” - recurring conceptual patterns abstracted from our physical interactions with the world. For example:
- CONTAINER schema: “Put validation logic inside a protective wrapper”
- PATH schema: “Guide data through transformation pipelines”
- BALANCE schema: “Maintain equilibrium between performance and memory”
Placing the abstract in the concrete through physical metaphor is not just how we speak to each other, but also describes what makes a prompt “good”.
Given that an LLM is trained on human text, should we actually be surprised that even without a physical experience, a good prompt to the LLM mirrors our physical metaphor?
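Fittingly, the librarian’s shelf from the prompt above corresponds to a classic, concrete data structure: a least-recently-used (LRU) cache. A minimal sketch (class and method names are mine, chosen to keep the metaphor visible):

```python
from collections import OrderedDict

class QuickAccessShelf:
    """The librarian metaphor as an LRU cache: the shelf holds `capacity`
    items, fetching a book moves it back within arm's reach, and the
    least-requested volume is archived when space runs out."""

    def __init__(self, capacity: int = 3):
        self.capacity = capacity
        self._shelf = OrderedDict()  # iteration order tracks recency

    def get(self, key):
        if key not in self._shelf:
            return None                  # cache miss: a trip to the archives
        self._shelf.move_to_end(key)     # back within arm's reach
        return self._shelf[key]

    def put(self, key, value):
        if key in self._shelf:
            self._shelf.move_to_end(key)
        self._shelf[key] = value
        if len(self._shelf) > self.capacity:
            self._shelf.popitem(last=False)  # archive the coldest item
```

The embodied prompt and the implementation share their structure almost one-for-one - “arm’s reach” becomes `move_to_end`, “distant archives” becomes eviction.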
Register and Politeness
Do you say “please” when asking an LLM to do something? The research says you should. In fact, the “register” you use for speaking (and prompting) affects the output quality. Think about the register you use when speaking to a colleague or client.
The optimal register often varies by task type and technical domain. For code generation, a professional technical register consistently outperforms both casual and overly formal approaches.
Before (inappropriate register):
hey can u plz write me some python code that does stuff with lists thx
After (professional technical register):
Generate a Python function that implements efficient list manipulation operations, including sorting, filtering, and transformation methods. Include docstrings and type hints following PEP 8 conventions.
This is something that sociolinguistic research has already demonstrated. Just like us, LLMs respond to register shifts. When you adopt a senior developer’s register, you activate patterns associated with expert code production.
This approach also manifests as “persona prompting” - “You are an expert python developer”. This technique leverages “accommodation theory”, which is the tendency to match communication styles with perceived expertise levels.
Discourse Markers
Discourse markers are words or phrases that help organize and connect ideas in speech and writing. They’re signposts that guide readers through a text, showing how different parts relate to each other.
Examples of such words include “first,” “next,” “specifically,” and “moreover”. One would think that to be “concise”, we need to leave out these words. But they create cognitive scaffolding that guides both human and AI reasoning.
Before (unstructured):
Make this code faster and add error handling and documentation
After (discourse-structured):
Let's improve this code systematically. First, analyze performance bottlenecks using profiling data. Next, implement targeted optimizations for the critical path. Then, add comprehensive error handling for edge cases. Finally, document the optimization strategy and performance gains.
The discourse markers create a “cognitive map” that prevents the LLM from conflating tasks or missing requirements. For code generation, you will probably get an additional lift in result quality if you mirror discourse markers like “first”, “then”, and “finally” that already parallel sequential coding constructs.
Frame Semantics
Frame semantics theory shows that words only make sense within structured knowledge frames. A “frame” is like a mental schema that includes all the background knowledge and expectations associated with a particular concept.
For code generation, this means activating an entire “conceptual framework” rather than just a single feature:
Before (isolated concepts):
Add authentication to the API
After (frame-activated):
Implement the AUTHENTICATION frame for our API:
- Authority: JWT token issuer
- Credentials: username/password pairs
- Validation: cryptographic verification
- Session: token lifecycle management
- Permissions: role-based access control
- Audit: authentication event logging
Build these frame components with security-first design.
How many times have you had an LLM go down the wrong path? Now you know it’s because you’ve not activated the correct framing.
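To see how much structure a frame pulls in, here is a minimal sketch of just the Credentials/Validation components - HMAC-signed tokens standing in for a real JWT library like PyJWT (all names and the toy secret are illustrative, not production code):

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

SECRET = b"demo-secret"  # illustration only; load from secure config in practice

def issue_token(payload: dict) -> str:
    """Authority: encode a payload and sign it with HMAC-SHA256."""
    body = base64.urlsafe_b64encode(json.dumps(payload).encode())
    sig = hmac.new(SECRET, body, hashlib.sha256).hexdigest().encode()
    return (body + b"." + sig).decode()

def validate_token(token: str) -> Optional[dict]:
    """Validation: recompute the signature and reject on any mismatch."""
    body, _, sig = token.encode().rpartition(b".")
    expected = hmac.new(SECRET, body, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(sig, expected):
        return None  # tampered or forged token
    return json.loads(base64.urlsafe_b64decode(body))
```

Even this toy version touches four of the six frame components (Authority, Credentials, Validation, Session); naming the frame in the prompt is what invites the model to fill in the rest.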
Construction Grammar
Construction grammar is a set of linguistic theories that treats grammatical constructions as the primary units of language, rather than focusing on words and rules as separate entities.
The sentence “As a senior Python developer, architect a data pipeline that handles real-time streaming”, when examined based on its grammar, can be thought of as a sentence with the structure “As a [EXPERT_ROLE], [ACTION_VERB] a [TARGET_OBJECT] that [CONSTRAINT]”. This “Role-Action-Object” pattern has proved rather effective when working with an LLM.
Other patterns that seem to work well include:
Conditional-Temporal Pattern
When [CONDITION] occurs, then [ACTION], ensuring [OUTCOME]
Example: "When user input arrives, validate and sanitize it, ensuring no code injection"
Analogical Pattern
[TASK] is like [FAMILIAR_DOMAIN] where [MAPPING]
Example: "Database normalization is like organizing a library where books are grouped by topic without duplication"
When thinking about what makes a good prompt statement, analyzing it in terms of construction grammar, and then testing out variations of the template, can help us produce more rigorous prompting advice.
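These patterns can be treated quite literally as templates. A sketch using plain format strings (the pattern names mirror the text above; none of this is an established library):

```python
PATTERNS = {
    # "Role-Action-Object" pattern
    "role_action_object":
        "As a {expert_role}, {action_verb} a {target_object} that {constraint}",
    # Conditional-Temporal pattern
    "conditional_temporal":
        "When {condition} occurs, then {action}, ensuring {outcome}",
    # Analogical pattern
    "analogical":
        "{task} is like {familiar_domain} where {mapping}",
}

def instantiate(pattern: str, **slots: str) -> str:
    """Fill a construction-grammar pattern; raises KeyError on a missing slot."""
    return PATTERNS[pattern].format(**slots)

prompt = instantiate(
    "role_action_object",
    expert_role="senior Python developer",
    action_verb="architect",
    target_object="data pipeline",
    constraint="handles real-time streaming",
)
```

Holding the construction fixed while varying the slot fillers is exactly the kind of controlled experiment that turns prompting folklore into testable advice.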
Formal Specification
The ideal description of a task would be via a “formal language”, where you have a defined set of strings constructed from a finite alphabet according to precise rules. But in a formal language, you end up with tasks described as
∀x (HasItemsInCart(x) → ◇(ProceedsToCheckout(x) → ◇(SeesPaymentOptions(x) ∧ CorrectTotal(x))))
What we can do instead is approximate a (strict) formal language with “temporal logic” using something like Gherkin, implementing what is often called “Behaviour Driven Development (BDD)”:
Given a user has items in their shopping cart
When they proceed to checkout
Then they should see the payment options
And the cart total should be calculated correctly
Here, “Given” establishes the initial state conditions (“existential quantification”), “When” the state transition actions (“temporal operators”), and “Then” the final state assertions (“modal logic”).
Another framing is to say that Gherkin allows us to maintain natural language comprehension while implementing “model-theoretic validation”:
Scenario: Valid login
Given a user with email "[email protected]" and password "secure123"
When they attempt to login
Then they should be redirected to the dashboard
Scenario: Invalid password
Given a user with email "[email protected]" and password "wrong"
When they attempt to login
Then they should see "Invalid credentials" error
Here, each scenario provides a “concrete model” that satisfies (or violates) the abstract specification. This is precisely how model theory works. We have abstract logical statements becoming concrete interpretations, through specific models.
Linguistic theory explains why BDD (via Gherkin and other “formal” specifications) works so well with LLMs. It provides:
- “compositional structure”: each scenario decomposes cleanly into semantic components
- “speech act clarity”: the “Given/When/Then” keywords form an explicit performative structure
- “discourse coherence”: the temporal sequence has a clear causal relationship
- “frame activation”: domain-specific vocabulary activates the relevant knowledge frames
- “model-theoretic validation”: multiple concrete examples constrain the space of interpretation
BDD is almost the perfect amalgamation of “LLM linguistics for code generation”.
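Because each scenario decomposes so cleanly, we can even parse it mechanically. A minimal sketch (not a real Gherkin implementation - tools like Cucumber or behave do far more) that splits a scenario into its semantic components:

```python
from typing import Dict, List

def parse_scenario(text: str) -> Dict[str, List[str]]:
    """Group Gherkin steps by keyword; 'And'/'But' extend the previous clause."""
    clauses: Dict[str, List[str]] = {"Given": [], "When": [], "Then": []}
    current = None
    for line in text.strip().splitlines():
        keyword, _, rest = line.strip().partition(" ")
        if keyword in clauses:
            current = keyword
            clauses[current].append(rest)
        elif keyword in ("And", "But") and current is not None:
            clauses[current].append(rest)
        # anything else ("Scenario:" headers, blank lines) is ignored
    return clauses
```

The fact that a dozen lines of code recover the full logical structure is the “compositional structure” property made tangible: the same regularity that helps a parser also helps the model.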
Conclusion
While I have not covered every piece of prompting advice or linguistic theory, I think I have done enough to demonstrate that we do not have to invent new ways of thinking for working with (a model of) language.
If we continue to improve our language models, thereby producing better models of language, I’m willing to bet that linguistic theories will become even more relevant.
While I am still trying to better conceptualize linguistic theory as “prompting advice”, I can, for now, offer some points for better prompting - and yes, I used an LLM to help me develop these:
Use Conceptual blending to create novel solutions by integrating multiple domains. Instead of requesting a “caching system,” prompt for a “library-memory hybrid where frequent books migrate to the reference desk.” This activates richer conceptual frameworks than technical specifications alone.
Use relevance theory to optimize context selection. Every piece of context should enable new inferences — if removing information doesn’t change potential outputs, it’s noise.
Use code-switching strategies to leverage the boundary between natural and formal languages. Strategic mixing (using natural language for logic and code syntax for structure) outperforms pure natural language or pure code examples.

Use specification languages that align with both formal semantics and natural language structure. The most successful approaches will be those that treat code generation as a translation task from linguistically well-formed specifications to executable implementations.