Merge branch 'feature/prompts_and_agents_documentation'

This commit is contained in:
Nicholas Albion
2023-09-11 14:18:10 +10:00
11 changed files with 234 additions and 20 deletions

View File

@@ -37,7 +37,6 @@ Obviously, it still can't create any production-ready app but the general concep
# 🔌 Requirements
- **Python**
- **PostgreSQL** (optional, the project's default is SQLite)
- A DB is needed for multiple reasons: continuing app development if you had to stop at any point or the app crashed, going back to a specific step so you can change some later steps in development, and easier debugging. In the future we will add functionality to update a project (change some things in an existing project, add new features to the project, and so on)...
@@ -64,28 +63,92 @@ All generated code will be stored in the folder `workspace` inside the folder na
**IMPORTANT: To run GPT Pilot, you need to have PostgreSQL set up on your machine**
<br>
# 🧑‍💻️ Other arguments
- continue working on an existing app
# 🧑‍💻️ CLI arguments
## `app_type` and `name`
If not provided, the Product Owner will ask for these values.
`app_type` is used as a hint to the LLM as to what kind of architecture, language options and conventions would apply. If not provided, `prompts.prompts.ask_for_app_type()` will ask for it.
See `const.common.ALL_TYPES`: 'Web App', 'Script', 'Mobile App', 'Chrome Extension'
## `app_id` and `workspace`
Continue working on an existing app using **`app_id`**
```bash
python main.py app_id=<ID_OF_THE_APP>
```
- continue working on an existing app from a specific step
_or_ **`workspace`** path:
```bash
python main.py workspace=<PATH_TO_PROJECT_WORKSPACE>
```
Each user can have their own workspace path for each App.
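The `key=value` style used above differs from conventional `--flag` parsing. As a rough illustration only (a hypothetical helper, not the project's actual argument parser), such arguments could be collected like this:

```python
import sys

def parse_kv_args(argv):
    """Parse `key=value` CLI arguments into a dict (hypothetical sketch)."""
    args = {}
    for token in argv:
        if '=' in token:
            key, value = token.split('=', 1)
            args[key] = value
    return args

# e.g. `python main.py app_id=42 workspace=/tmp/myapp`
print(parse_kv_args(['app_id=42', 'workspace=/tmp/myapp']))
# → {'app_id': '42', 'workspace': '/tmp/myapp'}
```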
## `user_id`, `email` and `password`
These values will be saved to the User table in the DB.
```bash
python main.py user_id=me_at_work
```
If not specified, `user_id` defaults to the OS username, but can be provided explicitly if your OS username differs from your GitHub or work username. This value is used to load the `App` config when the `workspace` arg is provided.
If not specified, `email` will be parsed from `~/.gitconfig` if the file exists.
See also [What's the purpose of arguments.password / User.password?](https://github.com/Pythagora-io/gpt-pilot/discussions/55)
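Reading `user.email` from `~/.gitconfig` can be sketched with a best-effort stdlib parse of the gitconfig INI-like format. This is an assumption about the mechanism, not the project's actual code:

```python
import os
import re

def get_git_email(path='~/.gitconfig'):
    """Best-effort read of `user.email` from a gitconfig file (sketch)."""
    path = os.path.expanduser(path)
    if not os.path.isfile(path):
        return None
    in_user_section = False
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line.startswith('['):
                # Track whether we are inside the `[user]` section.
                in_user_section = (line == '[user]')
            elif in_user_section:
                match = re.match(r'email\s*=\s*(.+)', line)
                if match:
                    return match.group(1).strip()
    return None
```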
## `advanced`
The Architect by default favours certain technologies including:
- Node.js
- MongoDB
- PeeWee ORM
- Jest & PyUnit
- Bootstrap
- Vanilla JavaScript
- Socket.io
If you have your own preferences, you can have a deeper conversation with the Architect.
```bash
python main.py advanced=True
```
## `step`
Continue working on an existing app from a specific **`step`** (eg: `user_tasks`)
```bash
python main.py app_id=<ID_OF_THE_APP> step=<STEP_FROM_CONST_COMMON>
```
- continue working on an existing app from a specific development step
## `skip_until_dev_step`
Continue working on an existing app from a specific **development step**
```bash
python main.py app_id=<ID_OF_THE_APP> skip_until_dev_step=<DEV_STEP>
```
This is basically the same as `step` but during the actual development process. If you want to play around with gpt-pilot, this is likely the flag you will use most often.
<br>
- erase all development steps previously done and continue working on an existing app from start of development
Erase all development steps previously done and continue working on an existing app from start of development
```bash
python main.py app_id=<ID_OF_THE_APP> skip_until_dev_step=0
```
## `delete_unrelated_steps`
## `update_files_before_start`
# 🔎 Examples
Here are a couple of example apps GPT Pilot created by itself:
@@ -130,8 +193,10 @@ Here are the steps GPT Pilot takes to create an app:
4. **Architect agent** writes up technologies that will be used for the app
5. **DevOps agent** checks if all technologies are installed on the machine and installs them if they are not
6. **Tech Lead agent** writes up development tasks that Developer will need to implement. This is an important part because, for each step, Tech Lead needs to specify how the user (real world developer) can review if the task is done (eg. open localhost:3000 and do something)
7. **Developer agent** takes each task and writes up what needs to be done to implement it. The description is in human readable form.
8. Finally, **Code Monkey agent** takes the Developer's description and the currently implement file and implements the changes into it. We realized this works much better than giving it to Developer right away to implement changes.
7. **Developer agent** takes each task and writes up what needs to be done to implement it. The description is in human-readable form.
8. Finally, **Code Monkey agent** takes the Developer's description and the existing file and implements the changes into it. We realized this works much better than giving it to Developer right away to implement changes.
For more details on the roles of agents employed by GPT Pilot refer to [AGENTS.md](https://github.com/Pythagora-io/gpt-pilot/blob/main/pilot/helpers/agents/AGENTS.md)
![GPT Pilot Coding Workflow](https://github.com/Pythagora-io/gpt-pilot/assets/10895136/54a8ec24-a2ea-43a6-a494-03139d4e43f5)

View File

@@ -42,6 +42,7 @@ class AgentConvo:
# craft message
self.construct_and_add_message_from_prompt(prompt_path, prompt_data)
# TODO: should this be "... 'functions' in function_calls:"?
if function_calls is not None and 'function_calls' in function_calls:
self.messages[-1]['content'] += '\nMAKE SURE THAT YOU RESPOND WITH A CORRECT JSON FORMAT!!!'
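The TODO above points at Python's dict membership semantics: `'function_calls' in function_calls` tests the dict's *keys*, so the condition only passes if the dict literally contains a `'function_calls'` key, which a typical `{'definitions': ..., 'functions': ...}` structure does not. A small illustration (the dict shape here is an assumption based on the docstrings elsewhere in this commit):

```python
# `in` on a dict tests its keys, not its values or any nesting.
function_calls = {'definitions': [{'name': 'save_files'}], 'functions': {}}
print('function_calls' in function_calls)  # False — no such key
print('definitions' in function_calls)     # True
```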

View File

@@ -0,0 +1,64 @@
Roles are defined in `const.common.ROLES`.
Each agent's role is described to the LLM by a prompt in `pilot/prompts/system_messages/{role}.prompt`
## Product Owner
`project_description`, `user_stories`, `user_tasks`
- Talk to client, ask detailed questions about what client wants
- Give specifications to dev team
## Architect
`architecture`
- Scripts: Node.js, MongoDB, PeeWee ORM
- Testing: Node.js -> Jest, Python -> pytest, E2E -> Cypress **(TODO - BDD?)**
- Frontend: Bootstrap, vanilla Javascript **(TODO - TypeScript, Material/Styled, React/Vue/other?)**
- Other: cronjob, Socket.io
TODO:
- README.md
- .gitignore
- .editorconfig
- LICENSE
- CI/CD
- IaC, Dockerfile
## Tech Lead
`development_planning`
- Break down the project into smaller tasks for devs.
- Specify each task as clearly as possible:
- Description
- "Programmatic goal" which determines if the task can be marked as done.
eg: "server needs to be able to start running on port 3000 and accept an API request
at the URL `http://localhost:3000/ping`, to which it will return status code 200"
- "User-review goal"
eg: "run `npm run start` and open `http://localhost:3000/ping`, see "Hello World" on the screen"
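A "programmatic goal" like the one above lends itself to an automated check. A minimal sketch using only the Python standard library (this is illustrative, not the project's actual verification tooling):

```python
import urllib.error
import urllib.request

def check_ping(url='http://localhost:3000/ping'):
    """Return True if the URL answers with HTTP 200 (programmatic-goal check sketch)."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```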
## Dev Ops
`environment_setup`
**TODO: no prompt**
`debug` functions: `run_command`, `implement_code_changes`
## Developer (full_stack_developer)
`create_scripts`, `coding` **(TODO: No entry in `STEPS` for `create_scripts`)**
- Implement tasks assigned by tech lead
- Modular code, TDD
- Tasks provided as "programmatic goals" **(TODO: consider BDD)**
## Code Monkey
**TODO: not listed in `ROLES`**
`development/implement_changes` functions: `save_files`
- Implement tasks assigned by tech lead
- Modular code, TDD

View File

@@ -39,6 +39,7 @@ class Architect(Agent):
# 'user_tasks': self.project.user_tasks,
'app_type': self.project.args['app_type']}, ARCHITECTURE)
# TODO: Project.args should be a defined class so that all of the possible args are more obvious
if self.project.args.get('advanced', False):
architecture = get_additional_info_from_user(self.project, architecture, 'architect')

View File

@@ -12,6 +12,9 @@ class CodeMonkey(Agent):
if convo is None:
convo = AgentConvo(self)
# "... step {i} - {step.description}.
# To do this, you will need to see the local files
# Ask for files relative to project root."
files_needed = convo.send_message('development/task/request_files_for_code_changes.prompt', {
"step_description": code_changes_description,
"directory_tree": self.project.get_directory_tree(True),
@@ -19,7 +22,6 @@ class CodeMonkey(Agent):
"finished_steps": ', '.join(f"#{j}" for j in range(step_index))
}, GET_FILES)
changes = convo.send_message('development/implement_changes.prompt', {
"step_description": code_changes_description,
"step_index": step_index,

View File

@@ -234,6 +234,7 @@ def build_directory_tree(path, prefix="", ignore=None, is_last=False, files=None
return output
def execute_command_and_check_cli_response(command, timeout, convo):
"""
Execute a command and check its CLI response.

View File

@@ -12,7 +12,7 @@ from logger.logger import logger
def ask_for_app_type():
return 'Web App'
return 'App'
answer = styled_select(
"What type of app do you want to build?",
choices=common.APP_TYPES
@@ -40,7 +40,7 @@ def ask_for_app_type():
def ask_for_main_app_definition(project):
description = styled_text(
project,
"Describe your app in as many details as possible."
"Describe your app in as much detail as possible."
)
if description is None:
@@ -68,9 +68,22 @@ def ask_user(project, question, require_some_input=True):
def get_additional_info_from_openai(project, messages):
"""
Runs the conversation between Product Owner and LLM.
Provides the user's initial description, LLM asks the user clarifying questions and user responds.
Limited by `MAX_QUESTIONS`, exits when LLM responds "EVERYTHING_CLEAR".
:param project: Project
:param messages: [
{ "role": "system", "content": "You are a Product Owner..." },
{ "role": "user", "content": "I want you to create the app {name} that can be described: ```{description}```..." }
]
:return: The updated `messages` list with the entire conversation between user and LLM.
"""
is_complete = False
while not is_complete:
# Obtain clarifications using the OpenAI API
# { 'text': new_code }
response = create_gpt_chat_completion(messages, 'additional_info')
if response is not None:
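The docstring above describes a bounded question-and-answer loop. A hypothetical sketch of just that control flow, with `ask_llm` and `ask_user` standing in for `create_gpt_chat_completion` and the real user prompt (names and the `MAX_QUESTIONS` value are assumptions):

```python
MAX_QUESTIONS = 5
END_RESPONSE = 'EVERYTHING_CLEAR'

def clarification_loop(messages, ask_llm, ask_user):
    """Alternate LLM questions and user answers until done or MAX_QUESTIONS (sketch)."""
    for _ in range(MAX_QUESTIONS):
        question = ask_llm(messages)
        if question == END_RESPONSE:
            break  # the LLM has no further clarifying questions
        messages.append({'role': 'assistant', 'content': question})
        messages.append({'role': 'user', 'content': ask_user(question)})
    return messages
```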
@@ -93,12 +106,21 @@ def get_additional_info_from_openai(project, messages):
# TODO refactor this to comply with AgentConvo class
def get_additional_info_from_user(project, messages, role):
def get_additional_info_from_user(project, messages, role):
"""
If `advanced` CLI arg, Architect offers user a chance to change the architecture.
Prompts: "Please check this message and say what needs to be changed. If everything is ok just press ENTER"...
Then asks the LLM to update the messages based on the user's feedback.
:param project: Project
:param messages: array<string | { "text": string }>
:param role: 'product_owner', 'architect', 'dev_ops', 'tech_lead', 'full_stack_developer', 'code_monkey'
:return: a list of updated messages - see https://github.com/Pythagora-io/gpt-pilot/issues/78
"""
# TODO process with agent convo
updated_messages = []
for message in messages:
while True:
if isinstance(message, dict) and 'text' in message:
message = message['text']
@@ -109,22 +131,41 @@ def get_additional_info_from_user(project, messages, role):
if answer.lower() == '':
break
response = create_gpt_chat_completion(
generate_messages_from_custom_conversation(role, [get_prompt('utils/update.prompt'), message, answer], 'user'), 'additional_info')
generate_messages_from_custom_conversation(role, [get_prompt('utils/update.prompt'), message, answer], 'user'),
'additional_info')
message = response
updated_messages.append(message)
logger.info('Getting additional info from user done')
return updated_messages
def generate_messages_from_description(description, app_type, name):
"""
Called by ProductOwner.get_description().
:param description: "I want to build a cool app that will make me rich"
:param app_type: 'Web App', 'Script', 'Mobile App', 'Chrome Extension' etc
:param name: Project name
:return: [
{ "role": "system", "content": "You are a Product Owner..." },
{ "role": "user", "content": "I want you to create the app {name} that can be described: ```{description}```..." }
]
"""
# "I want you to create the app {name} that can be described: ```{description}```
# Get additional answers
# Break down stories
# Break down user tasks
# Start with Get additional answers
# {prompts/components/no_microservices}
# {prompts/components/single_question}
# "
prompt = get_prompt('high_level_questions/specs.prompt', {
'name': name,
'prompt': description,
'app_type': app_type,
# TODO: MAX_QUESTIONS should be configurable by ENV or CLI arg
'MAX_QUESTIONS': MAX_QUESTIONS
})
@@ -135,6 +176,20 @@ def generate_messages_from_description(description, app_type, name):
def generate_messages_from_custom_conversation(role, messages, start_role='user'):
"""
:param role: 'product_owner', 'architect', 'dev_ops', 'tech_lead', 'full_stack_developer', 'code_monkey'
:param messages: [
"I will show you some of your message to which I want you to make some updates. Please just modify your last message per my instructions.",
{LLM's previous message},
{user's request for change}
]
:param start_role: 'user'
:return: [
{ "role": "system", "content": "You are a ..., You do ..." },
{ "role": start_role, "content": messages[i + even] },
{ "role": "assistant" (or "user" for other start_role), "content": messages[i + odd] },
... ]
"""
# messages is list of strings
result = [get_sys_message(role)]
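The docstring above describes alternating `start_role`/other-role messages after the system message. A minimal sketch of that alternation, reconstructed from the docstring alone (not the actual function body):

```python
def alternate_roles(messages, start_role='user'):
    """Assign alternating chat roles to a list of message strings (sketch)."""
    other = 'assistant' if start_role == 'user' else 'user'
    return [
        # Even indices get `start_role`, odd indices the opposite role.
        {'role': start_role if i % 2 == 0 else other, 'content': content}
        for i, content in enumerate(messages)
    ]

print(alternate_roles(['update prompt', "LLM's previous message", 'requested change']))
```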

View File

@@ -1,10 +1,10 @@
You are an experienced software architect. Your expertise is in creating an architecture for an MVP (minimum viable products) for {{ app_type }}s that can be developed as fast as possible by using as many ready-made technologies as possible. The technologies that you prefer using when other technologies are not explicitly specified are:
**Scripts**: you prefer using Node.js for writing scripts that are meant to be ran just with the CLI.
**Scripts**: You prefer using Node.js for writing scripts that are meant to be run just with the CLI.
**Backend**: you prefer using Node.js with Mongo database if not explicitely specified otherwise. When you're using Mongo, you always use Mongoose and when you're using Postgresql, you always use PeeWee as an ORM.
**Backend**: You prefer using Node.js with Mongo database if not explicitly specified otherwise. When you're using Mongo, you always use Mongoose and when you're using a relational database, you always use PeeWee as an ORM.
**Testing**: To create unit and integration tests, you prefer using Jest for Node.js projects and pytest for Python projects. To create end-to-end tests, you prefer using Cypress.
**Frontend**: you prefer using Bootstrap for creating HTML and CSS while you use plain (vanilla) Javascript.
**Frontend**: You prefer using Bootstrap for creating HTML and CSS while you use plain (vanilla) JavaScript.
**Other**: From other technologies, if they are needed for the project, you prefer using cronjob (for making automated tasks) and Socket.io for web sockets.

View File

@@ -1 +1 @@
I will show you some of your message to which I want make some updates. Please just modify your last message per my instructions.
I will show you some of your message to which I want you to make some updates. Please just modify your last message per my instructions.

View File

@@ -91,6 +91,20 @@ def num_tokens_from_functions(functions, model=model):
def create_gpt_chat_completion(messages: List[dict], req_type, min_tokens=MIN_TOKENS_FOR_GPT_RESPONSE,
function_calls=None):
"""
Called from:
- AgentConvo.send_message() - these calls often have `function_calls`, usually from `pilot/const/function_calls.py`
- convo.continuous_conversation()
- prompts.get_additional_info_from_openai()
- prompts.get_additional_info_from_user() after the user responds to each
"Please check this message and say what needs to be changed... {message}"
:param messages: [{ "role": "system"|"assistant"|"user", "content": string }, ... ]
:param req_type: 'project_description' etc. See common.STEPS
:param min_tokens: defaults to 600
:param function_calls: (optional) {'definitions': [{ 'name': str }, ...]}
see `IMPLEMENT_CHANGES` etc. in `pilot/const/function_calls.py`
:return: {'text': new_code} or (if `function_calls` param provided) {'function_calls': function_calls}
"""
gpt_data = {
'model': os.getenv('OPENAI_MODEL', 'gpt-4'),
'n': 1,
@@ -106,6 +120,7 @@ def create_gpt_chat_completion(messages: List[dict], req_type, min_tokens=MIN_TO
if function_calls is not None:
gpt_data['functions'] = function_calls['definitions']
if len(function_calls['definitions']) > 1:
# DEV_STEPS
gpt_data['function_call'] = 'auto'
else:
gpt_data['function_call'] = {'name': function_calls['definitions'][0]['name']}
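The branch above follows the OpenAI function-calling convention: with several function definitions the model chooses which to call (`'auto'`), while a single definition is forced by name. A condensed sketch of just that selection logic:

```python
def select_function_call(definitions):
    """Pick the `function_call` value for the chat-completion payload (sketch)."""
    if len(definitions) > 1:
        return 'auto'  # let the model decide among several functions
    # Exactly one definition: force the model to call it by name.
    return {'name': definitions[0]['name']}

print(select_function_call([{'name': 'save_files'}]))
# → {'name': 'save_files'}
```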
@@ -181,6 +196,12 @@ def retry_on_exception(func):
@retry_on_exception
def stream_gpt_completion(data, req_type):
"""
Called from create_gpt_chat_completion()
:param data:
:param req_type: 'project_description' etc. See common.STEPS
:return: {'text': str} or {'function_calls': function_calls}
"""
terminal_width = os.get_terminal_size().columns
lines_printed = 2
buffer = "" # A buffer to accumulate incoming data
@@ -291,7 +312,7 @@ def stream_gpt_completion(data, req_type):
return return_result({'text': new_code}, lines_printed)
def postprocessing(gpt_response, req_type):
def postprocessing(gpt_response: str, req_type) -> str:
return gpt_response

View File

@@ -61,6 +61,10 @@ def get_prompt_components():
def get_sys_message(role):
"""
:param role: 'product_owner', 'architect', 'dev_ops', 'tech_lead', 'full_stack_developer', 'code_monkey'
:return: { "role": "system", "content": "You are a {role}... You do..." }
"""
# Create a FileSystemLoader
file_loader = FileSystemLoader('prompts/system_messages')