 | You have agent A, agent B, and the judge. They communicate through a tool we built on LangChain, which provides the interaction layer and infrastructure for the different agents. It collects each agent's response and folds it into the prompt for the next agent, so the next agent receives a refined prompt rather than raw output. When there are multiple inputs, a dedicated agent collects the results and improves the prompt for the next agent. Regarding calibration: when an agent conducts a code review, you first have to establish what counts as good and bad practice. We calibrate from the context we have. An LLM reviewing code does not inherently know what matters for your specific workflow. Different industries, such as healthcare, retail, and finance, may use the same Java framework in different ways, and different aspects are critical for each. To address this, we provide two options for guiding agents on their tasks. First, we index your entire pull request (PR) history, so that for the changes in the current PR we can find where similar issues came up before and compare them with the current version. This context is passed along twice: once to the sub-agents for their analysis and again to the judge agent. The judge agent evaluates the 15 recommendations for a code review against historical comments from reviewers and developers and decides which suggestions are worth presenting to your developers. This process is applied to every agent. As you noted, we do not share context between agents; each agent operates with its own specific context. We give each agent only the most relevant information, using a context engine to filter out unnecessary data. However, this raises the question of how to bridge gaps between agents. 
For example, if you have a code quality agent and a dedicated code review agent, each operates autonomously with limited information. This works for straightforward tasks like checking lint rules or whether tests were implemented, but it may fall short for more complex architectural decisions that require a broader perspective, such as security considerations. Historically, code reviews relied on senior engineers who understood the codebase and could offer insights based on their experience. Security experts would assess potential vulnerabilities, while compliance auditors would ensure adherence to standards like ISO or SOC 2. Specialized knowledge was crucial in these areas. Now that knowledge comes in through the context we provide, which includes specific security and architectural concerns. We can apply the same strategy with agents: architects and compliance personnel enter their guidelines, and the agents verify that the code complies with them. |
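
The flow described above (sub-agents that each see only filtered context, followed by a judge that weighs their recommendations against historical review comments) can be sketched roughly as follows. This is a minimal illustration and not the actual tool: it uses plain Python rather than LangChain, the "agents" are simple stand-in functions instead of LLM calls, and every name in it (`Recommendation`, `HistoricalComment`, `filter_context`, `run_sub_agent`, `judge`, the topic labels, and the 0.5 acceptance threshold) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HistoricalComment:
    topic: str       # what a past review comment was about
    accepted: bool   # True if developers acted on it (hypothetical signal)


@dataclass
class Recommendation:
    agent: str
    topic: str
    message: str


def filter_context(history, topics):
    """Stand-in for the context engine: keep only comments relevant to one agent."""
    return [c for c in history if c.topic in topics]


def run_sub_agent(name, topics, diff_topics, history):
    """Stand-in for an LLM sub-agent that sees only its own filtered context."""
    context = filter_context(history, topics)
    recs = []
    for topic in diff_topics:
        if topic in topics:
            similar = sum(1 for c in context if c.topic == topic)
            recs.append(Recommendation(name, topic, f"{topic}: {similar} similar past comment(s)"))
    return recs


def judge(recs, history, min_acceptance=0.5):
    """Keep recommendations whose topic was usually acted on in past PRs.

    Assumption: a topic with no precedent is dropped; a real judge might keep it.
    """
    kept = []
    for rec in recs:
        past = [c for c in history if c.topic == rec.topic]
        if not past:
            continue
        rate = sum(c.accepted for c in past) / len(past)
        if rate >= min_acceptance:
            kept.append(rec)
    return kept


# Hypothetical PR history and agents, for illustration only.
history = [
    HistoricalComment("security", True),
    HistoricalComment("security", True),
    HistoricalComment("naming", False),
    HistoricalComment("naming", False),
    HistoricalComment("naming", True),
]
agents = {"security_agent": {"security"}, "quality_agent": {"naming", "tests"}}
diff_topics = ["security", "naming"]  # what the current change touches

recs = []
for agent_name, agent_topics in agents.items():
    recs.extend(run_sub_agent(agent_name, agent_topics, diff_topics, history))

shown = judge(recs, history)
for rec in shown:
    print(rec.agent, "->", rec.message)
```

In this toy data the security recommendation survives because similar past comments were acted on, while the naming recommendation is filtered out because developers mostly ignored that kind of feedback. Guidelines entered by architects or compliance staff could be modeled the same way, as extra context passed to both the sub-agents and the judge.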