20 March 2026
ForgentFramework: How to Turn Strong Copilot Practices into a Portable Framework for Different Projects
In article 16, we talked about the principles of mature multi-agent work: plan-and-execute, a separate critic, rubric-based review, Reflexion, context control, step logging, and a clear stopping point.

In article 17, we looked at how these approaches can be assembled technically using an agent framework.

In article 18, we discussed why GitHub Copilot is especially useful inside the IDE: next to the code, files, terminal, and the real state of the project.

Introduction

ForgentFramework is not an SDK and not a library that needs to be linked into product code. It is a portable engineering layer for working with Copilot at the repository level. Its purpose is to collect successful practices in one place: instruction structure, templates, the bootstrap process, an agent role catalog, scoped instructions, skills, repo context, and basic observability.

The need for such a framework appeared after I had built the same working structure several times in a row. Again you create AGENTS.md, PROJECT.md, llms.txt, and copilot-instructions.md. Again you split roles into planner, executor, critic, and reviewer. Again you think about how to store context across multiple repositories, how to log steps, how to limit tool access, and where to stop endless agent attempts. After several such repetitions, it becomes clear: this is no longer a collection of ad hoc techniques. It is a separate engineering layer that should be described, standardized, and reused across projects.

What ForgentFramework is
Portable framework layer between host and project repositories

ForgentFramework is best understood not as a library and not as a runtime embedded into an application. It is a separate layer at the repository level that helps organize work with Copilot and agents in the same way across different projects. Simply put, it is not “yet another clever prompt.” It is a set of files, rules, roles, templates, and processes that can be moved into a new project so that you do not have to rebuild the whole agent setup from scratch. Such a framework contains:

  • specification;
  • templates;
  • bootstrap tools;
  • agent catalog;
  • project instructions;
  • scoped instructions;
  • prompts;
  • skills;
  • repo context;
  • session files;
  • trace artifacts.

The main idea is simple: if you managed to configure Copilot well in one project, with roles, critics, context, and logging, that setup should not live only in your head. It should be formalized as a portable engineering layer.

The framework has three main goals:

  • launch a new project faster, so that you do not have to recreate agents, bootstrap logic, context files, instructions, and review rules from scratch every time;
  • make the process repeatable: install, upgrade, remove, context population, the executor-critic loop, and review should follow predefined, documented rules rather than depend on a specific implementation in a specific project;
  • separate the common from the project-specific: the common parts are the framework lifecycle, roles, safety gates, trace logs, and file structure; the project-specific parts are the stack, repository structure, build and test commands, manual validation rules, and domain specifics.

The goal of ForgentFramework is to provide a clear way to assemble a multi-agent system in a new repository, distribute everything across the right files, and use it in Copilot inside VS Code.

ForgentFramework architecture

ForgentFramework is better viewed not as a directory tree, but as several working loops.

1. Rules and templates layer

First comes the normative part: the specification and working rules. This describes how the agent system should be organized, how roles work, how critics are structured, what lifecycle scenarios exist, how install, upgrade, and remove are performed, how tracing is maintained, and where the system must stop. Next to that are templates and bootstrap scripts. They are needed to quickly lay out the initial structure in a project: base files, instructions, agents, and configuration artifacts. Simply put: the specification describes the rules, and the templates help materialize them in a specific repository.

2. Bootstrap loop

The second important layer is bootstrap agents. They are needed to maintain the framework itself: installation, updates, removal, and initial population of context files. The process looks like this:

topology intent → PRE_DISCOVERY → inventory confirmation → dry-run → APPLY → context bootstrap → critic review

This makes framework installation controlled. The agent discovers the project topology, identifies which repositories exist in the workspace, shows the plan, and only after confirmation deploys the required files.
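
The loop above can be sketched as a small state machine that refuses to skip phases. The phase names follow the article; the `BootstrapSession` class itself is a hypothetical illustration, not part of the framework:

```python
# Minimal sketch of the bootstrap lifecycle as an ordered state machine.
# Phase names mirror the article; the class is illustrative only.

PHASES = [
    "TOPOLOGY_INTENT",
    "PRE_DISCOVERY",
    "INVENTORY_CONFIRMATION",
    "DRY_RUN",
    "APPLY",
    "CONTEXT_BOOTSTRAP",
    "CRITIC_REVIEW",
]

class BootstrapSession:
    def __init__(self):
        self.completed = []

    def advance(self, phase: str) -> None:
        expected = PHASES[len(self.completed)]
        if phase != expected:
            raise RuntimeError(f"lifecycle violation: expected {expected}, got {phase}")
        self.completed.append(phase)

    @property
    def may_write_files(self) -> bool:
        # No file writes are allowed until APPLY has been reached.
        return "APPLY" in self.completed

session = BootstrapSession()
for phase in PHASES[:4]:
    session.advance(phase)
assert not session.may_write_files   # still in dry-run: read-only
session.advance("APPLY")
assert session.may_write_files       # only now may files be written
```

The point of encoding the order explicitly is that "install wrote files before APPLY" becomes a detectable error rather than a convention.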

3. Project work loop

The third layer is project agents. These are agents that work with the project itself: backend, frontend, DevOps, QA, security, documentation, and so on. The central role here is the project orchestrator. It does not merely answer the user’s question. It runs your task as a process:

understand the goal → break the task into steps → select an executor → receive the result → send it to the critic → address the findings → perform a final check

This way, a task starts to be solved by a group of agents. It becomes a managed cycle: plan, execution, critique, correction, stopping, or escalation.
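
A minimal sketch of that managed cycle might look like this; `run_task`, `execute`, and `review` are hypothetical stand-ins for the real agents:

```python
# Sketch of the orchestrator cycle: plan, execute, critique, correct, stop.
# All function names here are illustrative, not the framework's API.

def run_task(goal, execute, review, max_iterations=3):
    """Run one task through the executor-critic loop with a hard stop."""
    findings = []
    for attempt in range(1, max_iterations + 1):
        result = execute(goal, findings)    # executor addresses prior findings
        findings = review(goal, result)     # isolated critic reviews the result
        if not findings:
            return {"status": "DONE", "result": result, "attempts": attempt}
    return {"status": "NEEDS_HUMAN", "findings": findings, "attempts": max_iterations}

# Toy agents: the executor "fixes" whatever the critic found last time.
def execute(goal, findings):
    return {"goal": goal, "fixed": list(findings)}

def review(goal, result):
    required = ["missing tests"]
    return [f for f in required if f not in result["fixed"]]

outcome = run_task("add health endpoint", execute, review)
assert outcome["status"] == "DONE" and outcome["attempts"] == 2
```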

4. Context, specialization, and observability

The fourth layer is everything that keeps the system from turning into chaos. It includes:

  • PROJECT.md;
  • AGENTS.md;
  • llms.txt;
  • .github/copilot-instructions.md;
  • scoped instructions;
  • prompts;
  • skills;
  • session files;
  • trace files;
  • compliance/eval artifacts;
  • rubrics.

These files are responsible for different things. PROJECT.md and AGENTS.md describe the project itself and its structure. llms.txt helps provide the model with a compact overview of the project. copilot-instructions.md defines general behavior rules. Scoped instructions and skills add rules for specific areas, technologies, or tasks. Session files store the memory of a specific run. Traces make it possible to later understand what exactly the system did.

In short: the specification defines the rules, bootstrap deploys them into the project, project agents work according to those rules, and context plus traces make the process verifiable.

This is easiest to see through one end-to-end scenario. Suppose there is a host repo with ForgentFramework and a product repo with a real application. The host repo stores:

  • specification;
  • templates;
  • agent catalog;
  • skills;
  • scoped instructions;
  • bootstrap processes;
  • traces.

The product repo stores:

  • application;
  • solution file;
  • Dockerfile;
  • pipeline YAML;
  • SQL scripts;
  • tests;
  • project instructions.

A typical process looks like this. First, the user describes the workspace topology: where the host repo is, where the product repo is, and which repositories participate in the work. Then the bootstrap orchestrator builds a plan; it must not write files immediately. Next, the bootstrap loop goes through discovery, confirms the inventory, performs a dry-run, and only after an explicit APPLY makes changes.

After that, context bootstrap runs: the framework fills or updates AGENTS.md, PROJECT.md, llms.txt, and other context files. Then the critic checks that the lifecycle has not been violated. Only after that does the project orchestrator start working with the real task in the product repo.

If the critic finds a problem, the findings go into session memory, and the next pass must take into account what was already found and what did not work. All important steps are written to traces. If the task gets stuck, the criteria are unclear, or the iteration limit is exhausted, the process must end not with an endless “I’ll try again,” but with the status NEEDS_HUMAN.

ForgentFramework architecture with four connected working loops

How the principles from article 16 are implemented in ForgentFramework

Now let’s look at how the principles from article 16 appear here.

Plan-and-Execute

Planning exists at two levels at once. First, the bootstrap orchestrator builds a plan for installing or updating the framework. Then the project orchestrator builds a plan for the applied task itself. Here, the plan is a mandatory part of the process. First, the system explains what it is going to do, and only then moves to execution.

ReAct

Here, ReAct looks like a normal engineering cycle:

inspect → formulate a hypothesis → perform an action → check the result → decide the next step

For example, during bootstrap, the agent first studies the workspace structure, then creates an inventory, then shows a dry-run, then applies changes, and only after that starts the review. The same applies to project work: the agent should not immediately make mass file changes. It should first understand the context, state a hypothesis, perform a narrow action, and check the result.
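
A toy version of that cycle might look as follows; the `tools` table and `react_loop` function are hypothetical, and in real work the actions would be IDE or terminal operations rather than lambdas:

```python
# Minimal ReAct-style loop: observe, hypothesize, act narrowly, check, decide.
# The tools here are toy stand-ins for real agent actions.

def react_loop(task, tools, max_steps=5):
    observations = []
    for _ in range(max_steps):
        action, args = tools["plan"](task, observations)  # formulate a hypothesis
        result = tools[action](*args)                     # perform one narrow action
        observations.append((action, result))             # check and record the result
        if tools["done"](observations):                   # decide the next step
            return observations
    return observations

tools = {
    # First inspect the target, then make one narrow edit.
    "plan": lambda task, obs: ("inspect", [task]) if not obs else ("edit", [task]),
    "inspect": lambda t: f"structure of {t}",
    "edit": lambda t: f"patched {t}",
    "done": lambda obs: obs[-1][0] == "edit",
}
trace = react_loop("Dockerfile", tools)
assert [step[0] for step in trace] == ["inspect", "edit"]
```

The key property is that each iteration performs exactly one narrow action and records its result before deciding anything further.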

Critic isolation

In ForgentFramework, the critic exists not only for one scenario, but for every main type of agent. There is a critic for bootstrap operations, a critic for context fill, a critic for project executors, and separate critic roles for different domain tasks. For example, DevOps changes should be reviewed not by the same agent that made them, but by the associated critic. The idea is simple: the executor does the work, and the critic checks the result separately. Isolation is achieved through the way the critic task is defined. The critic should receive not the executor’s internal logic and not the entire history of its reasoning, but specific inputs for review:

  • the original task;
  • acceptance criteria;
  • changed files or diff;
  • executor result;
  • validation commands and their output, if available.

With this setup, the critic does not continue thinking like the executor and does not defend the executor’s solution. It looks at the result from the outside: whether it matches the task, whether framework rules were violated, whether there are missing checks, risks, or blockers. In other words, critic isolation in ForgentFramework is implemented primarily at the instruction level: the critic receives a separate role, a separate task, and a separate review format. This makes review an independent phase of the process rather than self-assessment by the executor.
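
One way to make that contract concrete is a small data structure that holds only the reviewable inputs. The field names below are illustrative, not the framework's schema:

```python
# Sketch of the critic's input contract: the critic sees only these fields,
# never the executor's chain of reasoning. Names are illustrative.
from dataclasses import dataclass

@dataclass
class CriticInput:
    task: str                    # the original task
    acceptance_criteria: list    # what "done" means
    diff: str                    # changed files or diff
    executor_result: str         # the executor's stated outcome
    validation_output: str = ""  # command output, if available

def build_critic_input(executor_state: dict) -> CriticInput:
    # Deliberately drop the executor's internal reasoning before review.
    return CriticInput(
        task=executor_state["task"],
        acceptance_criteria=executor_state["criteria"],
        diff=executor_state["diff"],
        executor_result=executor_state["summary"],
        validation_output=executor_state.get("validation", ""),
    )

state = {
    "task": "add retry to HTTP client",
    "criteria": ["retries on 5xx", "tests pass"],
    "diff": "+ retry logic",
    "summary": "added retry with backoff",
    "reasoning": "step 1: I considered...",  # never reaches the critic
}
ci = build_critic_input(state)
assert not hasattr(ci, "reasoning")
```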

Rubric and formal critique

The critic should not simply write “I like it” or “looks bad.” A review needs a rubric. For example:

  • Blocker;
  • Warning;
  • Missing validation;
  • Rollback risk;
  • Security impact.

In ForgentFramework, such checks are moved into the rubric layer: critic prompts and separate review specifications. This makes the result more predictable. For example, if PRE_DISCOVERY was skipped, that is a blocker. If install wrote files before APPLY, that is a lifecycle violation.
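
Those two rules can be expressed as structured findings instead of free-form opinions. The severity names follow the article; the `review_lifecycle` function is an illustrative sketch:

```python
# Sketch of a rubric as structured findings. Severity names follow the
# article; the checker function is illustrative.
from enum import Enum

class Severity(Enum):
    BLOCKER = "Blocker"
    WARNING = "Warning"
    MISSING_VALIDATION = "Missing validation"
    ROLLBACK_RISK = "Rollback risk"
    SECURITY_IMPACT = "Security impact"

def review_lifecycle(events: list) -> list:
    """Apply the two lifecycle rules from the text to an ordered event list."""
    findings = []
    if "PRE_DISCOVERY" not in events:
        findings.append((Severity.BLOCKER, "PRE_DISCOVERY was skipped"))
    if "FILE_WRITE" in events:
        # A file write before (or without) APPLY violates the lifecycle.
        apply_at = events.index("APPLY") if "APPLY" in events else len(events)
        if events.index("FILE_WRITE") < apply_at:
            findings.append((Severity.BLOCKER, "install wrote files before APPLY"))
    return findings

findings = review_lifecycle(["FILE_WRITE", "APPLY"])
assert [f[0] for f in findings] == [Severity.BLOCKER, Severity.BLOCKER]
```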

Reflexion

If an attempt did not work, it is important not to start the next one from scratch. ForgentFramework uses session memory for this. Critic findings go into TASK_CONTEXT.md, and the next attempt must take this file into account. In other words, an error becomes a process artifact:

  • in the previous attempt, the critic found a problem;
  • it was recorded in session context;
  • the next executor must take it into account.
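
The mechanics can be sketched as two tiny helpers around the session file. TASK_CONTEXT.md is the framework's session file; the helper functions and the exact file format here are illustrative assumptions:

```python
# Sketch of Reflexion via session memory: critic findings are appended to
# the session file and read back before the next attempt. The markdown
# layout used here is an assumption, not the framework's real format.
from pathlib import Path
import tempfile

def record_findings(session_file: Path, attempt: int, findings: list) -> None:
    lines = [f"## Attempt {attempt}\n"] + [f"- {f}\n" for f in findings]
    with session_file.open("a", encoding="utf-8") as fh:
        fh.writelines(lines)

def load_known_issues(session_file: Path) -> list:
    if not session_file.exists():
        return []  # first attempt: nothing known yet
    return [line[2:].strip()
            for line in session_file.read_text(encoding="utf-8").splitlines()
            if line.startswith("- ")]

with tempfile.TemporaryDirectory() as tmp:
    ctx = Path(tmp) / "TASK_CONTEXT.md"
    record_findings(ctx, 1, ["missing rollback step", "no dry-run output"])
    # The next executor starts from what the critic already found:
    assert load_known_issues(ctx) == ["missing rollback step", "no dry-run output"]
```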

Context engineering

In ForgentFramework, context is not placed into one huge prompt. It is split into several layers, and each layer serves its own part of the work.

Framework context is the general rule set of the framework itself: lifecycle, roles, critic process, trace protocol, rubrics, install/upgrade/remove. This layer answers the question: how should the agent system work?

Project context is information about the specific project: stack, environments, build/test commands, CI/CD, constraints, and manual validation rules. This layer answers the question: which system are we working with now?

Repo context is the repository map: where backend, frontend, DevOps files, documentation, and generated files live, and which zones may be changed and which may not. This is usually described through AGENTS.md and llms.txt.

Scoped instructions and skills are local rules and playbooks that are loaded only when needed: for example, separate rules for Terraform, Dockerfile, backend code, or security review.

Session context is the memory of a specific run: what has already been tried, what the critic found, and which errors must be considered in the next iteration. This is not a permanent project map, but the history of the current task.

The idea is simple: different roles need different context. The bootstrap agent needs topology, discovery, and installation rules. The project orchestrator needs the repo map and task criteria. The executor needs a specific subtask and the required files. The critic needs the task, acceptance criteria, and result, but not the entire history of the executor’s reasoning. This turns context engineering into controlled information delivery: the right context, to the right role, at the right moment.
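
Controlled delivery can be sketched as a simple role-to-layers mapping. The layer and role names follow the article; the exact assignment table is an illustrative assumption:

```python
# Sketch of role-scoped context delivery: each role receives only the
# layers it needs. The mapping below is illustrative, not normative.

CONTEXT_LAYERS = {
    "framework": "lifecycle, roles, critic process, trace protocol, rubrics",
    "project": "stack, environments, build/test commands, constraints",
    "repo": "repository map from AGENTS.md and llms.txt",
    "scoped": "local rules and skills, loaded only when relevant",
    "session": "what was tried, what the critic found",
}

ROLE_CONTEXT = {
    "bootstrap_agent": ["framework", "project"],
    "project_orchestrator": ["framework", "project", "repo", "session"],
    "executor": ["repo", "scoped", "session"],
    "critic": ["framework", "session"],  # task, criteria, result arrive separately
}

def context_for(role: str) -> dict:
    """Assemble only the layers this role is allowed to see."""
    return {layer: CONTEXT_LAYERS[layer] for layer in ROLE_CONTEXT[role]}

# The critic never receives the repo-wide map or the executor's full history:
assert "repo" not in context_for("critic")
assert "session" in context_for("executor")
```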

Step logging and observability

For serious work, it is not enough to get the result. You also need to understand how the system arrived at it. ForgentFramework has traces for this. A trace should capture key events:

  • discovery;
  • dry-run;
  • apply;
  • critic review;
  • executor result;
  • findings;
  • final status.

This makes the agent process readable. You can go back and see exactly what happened, who did what, and at which step the problem appeared.
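
A trace can be as simple as an append-only JSON-lines file. The event names follow the article; the writer itself and the file name are illustrative assumptions:

```python
# Sketch of a trace log as append-only JSON lines: one timestamped record
# per key event, readable after the fact. The writer is illustrative.
import json, tempfile
from datetime import datetime, timezone
from pathlib import Path

def trace(path: Path, event: str, **details) -> None:
    record = {"ts": datetime.now(timezone.utc).isoformat(), "event": event, **details}
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")

with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "trace.jsonl"
    trace(log, "discovery", repos=["host", "product"])
    trace(log, "dry-run", planned_writes=7)
    trace(log, "critic_review", findings=0)
    trace(log, "final_status", status="DONE")
    events = [json.loads(line)["event"] for line in log.read_text().splitlines()]
    assert events == ["discovery", "dry-run", "critic_review", "final_status"]
```

Because every record is a self-contained line, the trace can be grepped or replayed step by step when something goes wrong.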

Stop and escalate

The system must not spin forever in a loop of “I’ll try one more time.” ForgentFramework needs explicit stopping rules for that:

  • iteration limit;
  • transition to NEEDS_HUMAN;
  • stop when acceptance criteria are unclear;
  • stop before dangerous changes;
  • complete install only after critic and context bootstrap.

This is especially important for DevOps and infrastructure tasks. Sometimes the correct action for the agent is not to continue, but to honestly say: “a human is needed here.”
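
The stopping rules above reduce to a small decision function. The statuses follow the article; the function and its parameters are an illustrative sketch:

```python
# Sketch of explicit stopping rules: unclear criteria, dangerous changes,
# and exhausted iteration limits all end in an explicit status instead of
# another silent retry. The checks themselves are illustrative.

def decide_next_step(attempt, max_attempts, criteria_clear, change_is_dangerous):
    if not criteria_clear:
        return "NEEDS_HUMAN"   # stop when acceptance criteria are unclear
    if change_is_dangerous:
        return "NEEDS_HUMAN"   # stop before dangerous changes
    if attempt >= max_attempts:
        return "NEEDS_HUMAN"   # iteration limit exhausted
    return "RETRY"

assert decide_next_step(1, 3, criteria_clear=True, change_is_dangerous=False) == "RETRY"
assert decide_next_step(3, 3, criteria_clear=True, change_is_dangerous=False) == "NEEDS_HUMAN"
assert decide_next_step(1, 3, criteria_clear=False, change_is_dangerous=False) == "NEEDS_HUMAN"
```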

Least privilege

Not every role needs full access. The planner can work read-only. The critic should almost always be read-only. The executor may have the right to edit files, but should not arbitrarily change framework artifacts. Bootstrap agents should handle the framework lifecycle, not application development. Project agents should work with the product repo, but not replace the bootstrap loop. The idea is simple: the fewer unnecessary permissions a role has, the safer and more predictable its behavior is.
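
A least-privilege table for these roles might be sketched like this; the role names follow the article, while the permission names and table are illustrative assumptions:

```python
# Sketch of least-privilege role permissions: read-only by default, write
# access only where the role actually needs it. The table is illustrative.

PERMISSIONS = {
    "planner": {"read"},
    "critic": {"read"},                        # critics stay read-only
    "executor": {"read", "write_project"},     # may edit project files...
    "bootstrap": {"read", "write_framework"},  # ...but not framework artifacts
}

def allowed(role: str, action: str) -> bool:
    # Unknown roles get no permissions at all.
    return action in PERMISSIONS.get(role, set())

assert allowed("executor", "write_project")
assert not allowed("executor", "write_framework")
assert not allowed("critic", "write_project")
```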

How to use ForgentFramework on a new project
Using ForgentFramework in a new project from bootstrap to project work

If we describe the usage flow from scratch, it looks like this. First, decide where the framework will live. There are two options. The first is to maintain ForgentFramework as a separate host repo and attach it to the workspace next to the product repo. The second is the vendored approach, where framework/ is placed directly inside the target repo. After that, the bootstrap loop needs to be started. If the project does not yet have bootstrap agents, the starting structure can be laid out through bootstrap scripts:

.\framework\tools\bootstrap.ps1

or:

bash framework/tools/bootstrap.sh

But it is important to understand: these scripts are only the initial technical step. They lay out the base files and agents. The full install lifecycle should then continue through bootstrap agents. The main entry point is the bootstrap orchestrator, which handles these operations:

  • Install;
  • Upgrade;
  • Remove;
  • context bootstrap.

Then the process goes like this:

topology intent → PRE_DISCOVERY → confirmed inventory → dry-run → APPLY → repo context bootstrap → critic review → project work

Install is considered complete not after the first file write, but only after the context layer has been populated and the critic has confirmed that the lifecycle was not violated. After that, you can move on to regular project work through Group 1 agents: project orchestrator, domain executors, and critics. As the project starts to live, the framework is gradually supplemented with project-specific layers:

  • scoped instructions;
  • domain skills;
  • repo-specific critics;
  • prompts;
  • model mapping;
  • manual validation rules.

Conclusion

ForgentFramework is my attempt to collect in one place the practices that started repeating in almost every project: agent roles, critic passes, bootstrap, repo context, session memory, traces, rules, skills, and clear stopping points.

The idea of the framework is simple: if the multi-agent approach with Copilot proved useful in one project, it should not be manually rebuilt every time in the next one. It is better to move the common part into a portable layer and then adapt only the project-specific details: stack, repo structure, validation commands, domain rules, and scoped instructions.

Right now, I actively use ForgentFramework myself. With it, I have added a multi-agent structure to all my projects and am gradually checking which parts really work in real development, and which ones need to be simplified or improved. For me, this is not a finished “ideal framework,” but a living tool. I will continue developing it: improving bootstrap, clarifying roles, adding new skills, strengthening the critic process, developing observability, and making installation into new projects easier.

I hope this approach will be useful to you as well. Even if you do not take ForgentFramework as a whole, the idea itself may be useful: do not keep successful Copilot practices only in your head, but formalize them as a repeatable structure that can be transferred, tested, and improved from project to project.