Mirror of https://github.com/langgenius/dify.git (synced 2026-04-11 12:14:34 +08:00)

Compare commits: refactor/l ... refactor/c (48 commits)
Commits (SHA1):

0c29b67e22, c080c48aba, d13638f6e4, b4eef76c14, cbf7f646d9, c58647d39c, f6be9cd90d, 360f3bb32f,
8519b16cfc, f00d823f9f, e48419937b, 5eaf0c733a, f561656a89, f01f555146, 47d0e400ae, 8724ba04aa,
6fd001c660, e8e386a6b9, eba5eac3fa, 19008dce13, 92011d0a31, a51ced0a4f, dad8e408b0, d941201a3e,
dd988d42c2, a43d2ec4f0, 7c12e923b6, b9f1d65d4f, b4e2af96e2, 9d38af6d99, 0772d49257, 67eb8c052d,
5c4028d557, 55e6bca11c, 67657c2f48, e8f9d64651, 1f8c730259, 8d45755303, 6342d196e8, 5dc5709d58,
99d19cd3db, fa92548cf6, 41428432cc, b3a869b91b, f911199c8e, 056095238b, c8ae6e39d2, 61f8647f37
.github/workflows/autofix.yml (23 changes, vendored)

@@ -79,6 +79,29 @@ jobs:
          find . -name "*.py" -type f -exec sed -i.bak -E 's/"([^"]+)" \| None/Optional["\1"]/g; s/'"'"'([^'"'"']+)'"'"' \| None/Optional['"'"'\1'"'"']/g' {} \;
          find . -name "*.py.bak" -type f -delete

      - name: Install pnpm
        uses: pnpm/action-setup@v4
        with:
          package_json_file: web/package.json
          run_install: false

      - name: Setup Node.js
        uses: actions/setup-node@v6
        with:
          node-version: 24
          cache: pnpm
          cache-dependency-path: ./web/pnpm-lock.yaml

      - name: Install web dependencies
        run: |
          cd web
          pnpm install --frozen-lockfile

      - name: ESLint autofix
        run: |
          cd web
          pnpm lint:fix || true

      # mdformat breaks YAML front matter in markdown files. Add --exclude for directories containing YAML front matter.
      - name: mdformat
        run: |
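As an illustration of what the sed rewrite in the autofix step above does, here is a minimal sketch on a hypothetical annotation (the input line is invented; the substitution pattern is copied from the workflow):

```bash
# Convert a quoted forward-reference union `"X" | None` into `Optional["X"]`,
# matching the first substitution of the workflow's sed expression.
echo 'parent: "Account" | None = None' \
  | sed -E 's/"([^"]+)" \| None/Optional["\1"]/g'
# Output: parent: Optional["Account"] = None
```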
AGENTS.md (24 changes)

@@ -25,6 +25,30 @@ pnpm type-check:tsgo
pnpm test
```

### Frontend Linting

ESLint is used for frontend code quality. Available commands:

```bash
# Lint all files (report only)
pnpm lint

# Lint and auto-fix issues
pnpm lint:fix

# Lint specific files or directories
pnpm lint:fix app/components/base/button/
pnpm lint:fix app/components/base/button/index.tsx

# Lint quietly (errors only, no warnings)
pnpm lint:quiet

# Check code complexity
pnpm lint:complexity
```

**Important**: Always run `pnpm lint:fix` before committing. The pre-commit hook runs `lint-staged` which only lints staged files.

## Testing & Quality Practices

- Follow TDD: red → green → refactor.
api/AGENTS.md (120 changes)

@@ -1,97 +1,47 @@
# API Agent Guide

## Agent Notes (must-check)
## Notes for Agent (must-check)

Before you start work on any backend file under `api/`, you MUST check whether a related note exists under:
Before changing any backend code under `api/`, you MUST read the surrounding docstrings and comments. These notes contain required context (invariants, edge cases, trade-offs) and are treated as part of the spec.

- `agent-notes/<same-relative-path-as-target-file>.md`
Look for:

Rules:
- The module (file) docstring at the top of a source code file
- Docstrings on classes and functions/methods
- Paragraph/block comments for non-obvious logic

- **Path mapping**: for a target file `<path>/<name>.py`, the note must be `agent-notes/<path>/<name>.py.md` (same folder structure, same filename, plus `.md`).
- **Before working**:
  - If the note exists, read it first and follow any constraints/decisions recorded there.
  - If the note conflicts with the current code, or references an "origin" file/path that has been deleted, renamed, or migrated, treat the **code as the single source of truth** and update the note to match reality.
  - If the note does not exist, create it with a short architecture/intent summary and any relevant invariants/edge cases.
- **During working**:
  - Keep the note in sync as you discover constraints, make decisions, or change approach.
  - If you move/rename a file, migrate its note to the new mapped path (and fix any outdated references inside the note).
  - Record non-obvious edge cases, trade-offs, and the test/verification plan as you go (not just at the end).
  - Keep notes **coherent**: integrate new findings into the relevant sections and rewrite for clarity; avoid append-only “recent fix” / changelog-style additions unless the note is explicitly intended to be a changelog.
- **When finishing work**:
  - Update the related note(s) to reflect what changed, why, and any new edge cases/tests.
  - If a file is deleted, remove or clearly deprecate the corresponding note so it cannot be mistaken as current guidance.
  - Keep notes concise and accurate; they are meant to prevent repeated rediscovery.
### What to write where

## Skill Index
- Keep notes scoped: module notes cover module-wide context, class notes cover class-wide context, function/method notes cover behavioural contracts, and paragraph/block comments cover local “why”. Avoid duplicating the same content across scopes unless repetition prevents misuse.
- **Module (file) docstring**: purpose, boundaries, key invariants, and “gotchas” that a new reader must know before editing.
  - Include cross-links to the key collaborators (modules/services) when discovery is otherwise hard.
  - Prefer stable facts (invariants, contracts) over ephemeral “today we…” notes.
- **Class docstring**: responsibility, lifecycle, invariants, and how it should be used (or not used).
  - If the class is intentionally stateful, note what state exists and what methods mutate it.
  - If concurrency/async assumptions matter, state them explicitly.
- **Function/method docstring**: behavioural contract.
  - Document arguments, return shape, side effects (DB writes, external I/O, task dispatch), and raised domain exceptions.
  - Add examples only when they prevent misuse.
- **Paragraph/block comments**: explain *why* (trade-offs, historical constraints, surprising edge cases), not what the code already states.
  - Keep comments adjacent to the logic they justify; delete or rewrite comments that no longer match reality.

Start with the section that best matches your need. Each entry lists the problems it solves plus key files/concepts so you know what to expect before opening it.
### Rules (must follow)

### Platform Foundations
In this section, “notes” means module/class/function docstrings plus any relevant paragraph/block comments.

#### [Infrastructure Overview](agent_skills/infra.md)

- **When to read this**
  - You need to understand where a feature belongs in the architecture.
  - You’re wiring storage, Redis, vector stores, or OTEL.
  - You’re about to add CLI commands or async jobs.
- **What it covers**
  - Configuration stack (`configs/app_config.py`, remote settings)
  - Storage entry points (`extensions/ext_storage.py`, `core/file/file_manager.py`)
  - Redis conventions (`extensions/ext_redis.py`)
  - Plugin runtime topology
  - Vector-store factory (`core/rag/datasource/vdb/*`)
  - Observability hooks
  - SSRF proxy usage
  - Core CLI commands

### Plugin & Extension Development

#### [Plugin Systems](agent_skills/plugin.md)

- **When to read this**
  - You’re building or debugging a marketplace plugin.
  - You need to know how manifests, providers, daemons, and migrations fit together.
- **What it covers**
  - Plugin manifests (`core/plugin/entities/plugin.py`)
  - Installation/upgrade flows (`services/plugin/plugin_service.py`, CLI commands)
  - Runtime adapters (`core/plugin/impl/*` for tool/model/datasource/trigger/endpoint/agent)
  - Daemon coordination (`core/plugin/entities/plugin_daemon.py`)
  - How provider registries surface capabilities to the rest of the platform

#### [Plugin OAuth](agent_skills/plugin_oauth.md)

- **When to read this**
  - You must integrate OAuth for a plugin or datasource.
  - You’re handling credential encryption or refresh flows.
- **Topics**
  - Credential storage
  - Encryption helpers (`core/helper/provider_encryption.py`)
  - OAuth client bootstrap (`services/plugin/oauth_service.py`, `services/plugin/plugin_parameter_service.py`)
  - How console/API layers expose the flows

### Workflow Entry & Execution

#### [Trigger Concepts](agent_skills/trigger.md)

- **When to read this**
  - You’re debugging why a workflow didn’t start.
  - You’re adding a new trigger type or hook.
  - You need to trace async execution, draft debugging, or webhook/schedule pipelines.
- **Details**
  - Start-node taxonomy
  - Webhook & schedule internals (`core/workflow/nodes/trigger_*`, `services/trigger/*`)
  - Async orchestration (`services/async_workflow_service.py`, Celery queues)
  - Debug event bus
  - Storage/logging interactions

## General Reminders

- All skill docs assume you follow the coding style rules below—run the lint/type/test commands before submitting changes.
- When you cannot find an answer in these briefs, search the codebase using the paths referenced (e.g., `core/plugin/impl/tool.py`, `services/dataset_service.py`).
- If you run into cross-cutting concerns (tenancy, configuration, storage), check the infrastructure guide first; it links to most supporting modules.
- Keep multi-tenancy and configuration central: everything flows through `configs.dify_config` and `tenant_id`.
- When touching plugins or triggers, consult both the system overview and the specialised doc to ensure you adjust lifecycle, storage, and observability consistently.
- **Before working**
  - Read the notes in the area you’ll touch; treat them as part of the spec.
  - If a docstring or comment conflicts with the current code, treat the **code as the single source of truth** and update the docstring or comment to match reality.
  - If important intent/invariants/edge cases are missing, add them in the closest docstring or comment (module for overall scope, function for behaviour).
- **During working**
  - Keep the notes in sync as you discover constraints, make decisions, or change approach.
  - If you move/rename responsibilities across modules/classes, update the affected docstrings and comments so readers can still find the “why” and the invariants.
  - Record non-obvious edge cases, trade-offs, and the test/verification plan in the nearest docstring or comment that will stay correct.
  - Keep the notes **coherent**: integrate new findings into the relevant docstrings and comments; avoid append-only “recent fix” / changelog-style additions.
- **When finishing**
  - Update the notes to reflect what changed, why, and any new edge cases/tests.
  - Remove or rewrite any comments that could be mistaken as current guidance but no longer apply.
  - Keep docstrings and comments concise and accurate; they are meant to prevent repeated rediscovery.

## Coding Style

@@ -226,7 +176,7 @@ Before opening a PR / submitting:

- Controllers: parse input via Pydantic, invoke services, return serialised responses; no business logic.
- Services: coordinate repositories, providers, background tasks; keep side effects explicit.
- Document non-obvious behaviour with concise comments.
- Document non-obvious behaviour with concise docstrings and comments.

### Miscellaneous
api/README.md (171 changes)

@@ -1,6 +1,6 @@
# Dify Backend API

## Usage
## Setup and Run

> [!IMPORTANT]
>

@@ -8,48 +8,77 @@
> [`uv`](https://docs.astral.sh/uv/) as the package manager
> for Dify API backend service.

1. Start the docker-compose stack
`uv` and `pnpm` are required to run the setup and development commands below.

The backend require some middleware, including PostgreSQL, Redis, and Weaviate, which can be started together using `docker-compose`.
### Using scripts (recommended)

The scripts resolve paths relative to their location, so you can run them from anywhere.

1. Run setup (copies env files and installs dependencies).

```bash
cd ../docker
cp middleware.env.example middleware.env
# change the profile to mysql if you are not using postgres,change the profile to other vector database if you are not using weaviate
docker compose -f docker-compose.middleware.yaml --profile postgresql --profile weaviate -p dify up -d
cd ../api
./dev/setup
```

1. Copy `.env.example` to `.env`
1. Review `api/.env`, `web/.env.local`, and `docker/middleware.env` values (see the `SECRET_KEY` note below).

```cli
cp .env.example .env
1. Start middleware (PostgreSQL/Redis/Weaviate).

```bash
./dev/start-docker-compose
```

> [!IMPORTANT]
>
> When the frontend and backend run on different subdomains, set COOKIE_DOMAIN to the site’s top-level domain (e.g., `example.com`). The frontend and backend must be under the same top-level domain in order to share authentication cookies.
1. Start backend (runs migrations first).

1. Generate a `SECRET_KEY` in the `.env` file.

bash for Linux

```bash for Linux
sed -i "/^SECRET_KEY=/c\SECRET_KEY=$(openssl rand -base64 42)" .env
```bash
./dev/start-api
```

bash for Mac
1. Start Dify [web](../web) service.

```bash for Mac
secret_key=$(openssl rand -base64 42)
sed -i '' "/^SECRET_KEY=/c\\
SECRET_KEY=${secret_key}" .env
```bash
./dev/start-web
```

1. Create environment.
1. Set up your application by visiting `http://localhost:3000`.

Dify API service uses [UV](https://docs.astral.sh/uv/) to manage dependencies.
First, you need to add the uv package manager, if you don't have it already.
1. Optional: start the worker service (async tasks, runs from `api`).

```bash
./dev/start-worker
```

1. Optional: start Celery Beat (scheduled tasks).

```bash
./dev/start-beat
```

### Manual commands

<details>
<summary>Show manual setup and run steps</summary>

These commands assume you start from the repository root.

1. Start the docker-compose stack.

The backend requires middleware, including PostgreSQL, Redis, and Weaviate, which can be started together using `docker-compose`.

```bash
cp docker/middleware.env.example docker/middleware.env
# Use mysql or another vector database profile if you are not using postgres/weaviate.
docker compose -f docker/docker-compose.middleware.yaml --profile postgresql --profile weaviate -p dify up -d
```

1. Copy env files.

```bash
cp api/.env.example api/.env
cp web/.env.example web/.env.local
```

1. Install UV if needed.

```bash
pip install uv

@@ -57,60 +86,96 @@
brew install uv
```

1. Install dependencies
1. Install API dependencies.

```bash
uv sync --dev
cd api
uv sync --group dev
```

1. Run migrate

Before the first launch, migrate the database to the latest version.
1. Install web dependencies.

```bash
cd web
pnpm install
cd ..
```

1. Start backend (runs migrations first, in a new terminal).

```bash
cd api
uv run flask db upgrade
```

1. Start backend

```bash
uv run flask run --host 0.0.0.0 --port=5001 --debug
```

1. Start Dify [web](../web) service.
1. Start Dify [web](../web) service (in a new terminal).

1. Setup your application by visiting `http://localhost:3000`.
```bash
cd web
pnpm dev:inspect
```

1. If you need to handle and debug the async tasks (e.g. dataset importing and documents indexing), please start the worker service.
1. Set up your application by visiting `http://localhost:3000`.

```bash
uv run celery -A app.celery worker -P threads -c 2 --loglevel INFO -Q dataset,priority_dataset,priority_pipeline,pipeline,mail,ops_trace,app_deletion,plugin,workflow_storage,conversation,workflow,schedule_poller,schedule_executor,triggered_workflow_dispatcher,trigger_refresh_executor,retention
```
1. Optional: start the worker service (async tasks, in a new terminal).

Additionally, if you want to debug the celery scheduled tasks, you can run the following command in another terminal to start the beat service:
```bash
cd api
uv run celery -A app.celery worker -P threads -c 2 --loglevel INFO -Q dataset,priority_dataset,priority_pipeline,pipeline,mail,ops_trace,app_deletion,plugin,workflow_storage,conversation,workflow,schedule_poller,schedule_executor,triggered_workflow_dispatcher,trigger_refresh_executor,retention
```

```bash
uv run celery -A app.celery beat
```
1. Optional: start Celery Beat (scheduled tasks, in a new terminal).

```bash
cd api
uv run celery -A app.celery beat
```

</details>

### Environment notes

> [!IMPORTANT]
>
> When the frontend and backend run on different subdomains, set COOKIE_DOMAIN to the site’s top-level domain (e.g., `example.com`). The frontend and backend must be under the same top-level domain in order to share authentication cookies.

- Generate a `SECRET_KEY` in the `.env` file.

bash for Linux

```bash
sed -i "/^SECRET_KEY=/c\\SECRET_KEY=$(openssl rand -base64 42)" .env
```

bash for Mac

```bash
secret_key=$(openssl rand -base64 42)
sed -i '' "/^SECRET_KEY=/c\\
SECRET_KEY=${secret_key}" .env
```

## Testing

1. Install dependencies for both the backend and the test environment

```bash
uv sync --dev
cd api
uv sync --group dev
```

1. Run the tests locally with mocked system environment variables in `tool.pytest_env` section in `pyproject.toml`, more can check [Claude.md](../CLAUDE.md)

```bash
cd api
uv run pytest  # Run all tests
uv run pytest tests/unit_tests/  # Unit tests only
uv run pytest tests/integration_tests/  # Integration tests

# Code quality
../dev/reformat  # Run all formatters and linters
uv run ruff check --fix ./  # Fix linting issues
uv run ruff format ./  # Format code
uv run basedpyright .  # Type checking
./dev/reformat  # Run all formatters and linters
uv run ruff check --fix ./  # Fix linting issues
uv run ruff format ./  # Format code
uv run basedpyright .  # Type checking
```
@@ -81,6 +81,7 @@ def initialize_extensions(app: DifyApp):
        ext_commands,
        ext_compress,
        ext_database,
        ext_fastopenapi,
        ext_forward_refs,
        ext_hosting_provider,
        ext_import_modules,

@@ -128,6 +129,7 @@ def initialize_extensions(app: DifyApp):
        ext_proxy_fix,
        ext_blueprints,
        ext_commands,
        ext_fastopenapi,
        ext_otel,
        ext_request_logging,
        ext_session_factory,
api/commands.py (340 changes)

@@ -950,6 +950,346 @@ def clean_workflow_runs(
    )


@click.command(
    "archive-workflow-runs",
    help="Archive workflow runs for paid plan tenants to S3-compatible storage.",
)
@click.option("--tenant-ids", default=None, help="Optional comma-separated tenant IDs for grayscale rollout.")
@click.option("--before-days", default=90, show_default=True, help="Archive runs older than N days.")
@click.option(
    "--from-days-ago",
    default=None,
    type=click.IntRange(min=0),
    help="Lower bound in days ago (older). Must be paired with --to-days-ago.",
)
@click.option(
    "--to-days-ago",
    default=None,
    type=click.IntRange(min=0),
    help="Upper bound in days ago (newer). Must be paired with --from-days-ago.",
)
@click.option(
    "--start-from",
    type=click.DateTime(formats=["%Y-%m-%d", "%Y-%m-%dT%H:%M:%S"]),
    default=None,
    help="Archive runs created at or after this timestamp (UTC if no timezone).",
)
@click.option(
    "--end-before",
    type=click.DateTime(formats=["%Y-%m-%d", "%Y-%m-%dT%H:%M:%S"]),
    default=None,
    help="Archive runs created before this timestamp (UTC if no timezone).",
)
@click.option("--batch-size", default=100, show_default=True, help="Batch size for processing.")
@click.option("--workers", default=1, show_default=True, type=int, help="Concurrent workflow runs to archive.")
@click.option("--limit", default=None, type=int, help="Maximum number of runs to archive.")
@click.option("--dry-run", is_flag=True, help="Preview without archiving.")
@click.option("--delete-after-archive", is_flag=True, help="Delete runs and related data after archiving.")
def archive_workflow_runs(
    tenant_ids: str | None,
    before_days: int,
    from_days_ago: int | None,
    to_days_ago: int | None,
    start_from: datetime.datetime | None,
    end_before: datetime.datetime | None,
    batch_size: int,
    workers: int,
    limit: int | None,
    dry_run: bool,
    delete_after_archive: bool,
):
    """
    Archive workflow runs for paid plan tenants older than the specified days.

    This command archives the following tables to storage:
    - workflow_node_executions
    - workflow_node_execution_offload
    - workflow_pauses
    - workflow_pause_reasons
    - workflow_trigger_logs

    The workflow_runs and workflow_app_logs tables are preserved for UI listing.
    """
    from services.retention.workflow_run.archive_paid_plan_workflow_run import WorkflowRunArchiver

    run_started_at = datetime.datetime.now(datetime.UTC)
    click.echo(
        click.style(
            f"Starting workflow run archiving at {run_started_at.isoformat()}.",
            fg="white",
        )
    )

    if (start_from is None) ^ (end_before is None):
        click.echo(click.style("start-from and end-before must be provided together.", fg="red"))
        return

    if (from_days_ago is None) ^ (to_days_ago is None):
        click.echo(click.style("from-days-ago and to-days-ago must be provided together.", fg="red"))
        return

    if from_days_ago is not None and to_days_ago is not None:
        if start_from or end_before:
            click.echo(click.style("Choose either day offsets or explicit dates, not both.", fg="red"))
            return
        if from_days_ago <= to_days_ago:
            click.echo(click.style("from-days-ago must be greater than to-days-ago.", fg="red"))
            return
        now = datetime.datetime.now()
        start_from = now - datetime.timedelta(days=from_days_ago)
        end_before = now - datetime.timedelta(days=to_days_ago)
        before_days = 0

    if start_from and end_before and start_from >= end_before:
        click.echo(click.style("start-from must be earlier than end-before.", fg="red"))
        return
    if workers < 1:
        click.echo(click.style("workers must be at least 1.", fg="red"))
        return

    archiver = WorkflowRunArchiver(
        days=before_days,
        batch_size=batch_size,
        start_from=start_from,
        end_before=end_before,
        workers=workers,
        tenant_ids=[tid.strip() for tid in tenant_ids.split(",")] if tenant_ids else None,
        limit=limit,
        dry_run=dry_run,
        delete_after_archive=delete_after_archive,
    )
    summary = archiver.run()
    click.echo(
        click.style(
            f"Summary: processed={summary.total_runs_processed}, archived={summary.runs_archived}, "
            f"skipped={summary.runs_skipped}, failed={summary.runs_failed}, "
            f"time={summary.total_elapsed_time:.2f}s",
            fg="cyan",
        )
    )

    run_finished_at = datetime.datetime.now(datetime.UTC)
    elapsed = run_finished_at - run_started_at
    click.echo(
        click.style(
            f"Workflow run archiving completed. start={run_started_at.isoformat()} "
            f"end={run_finished_at.isoformat()} duration={elapsed}",
            fg="green",
        )
    )
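As a usage sketch (assuming the command is registered with the Flask CLI like the existing maintenance commands, so the exact entry point may differ; tenant IDs below are placeholders):

```bash
cd api
# Preview what would be archived for runs older than 90 days, without writing anything.
uv run flask archive-workflow-runs --before-days 90 --batch-size 100 --workers 2 --dry-run
# Grayscale rollout: archive two tenants only, then delete the archived rows from the database.
uv run flask archive-workflow-runs --tenant-ids tenant-a,tenant-b --before-days 180 --delete-after-archive
```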
@click.command(
    "restore-workflow-runs",
    help="Restore archived workflow runs from S3-compatible storage.",
)
@click.option(
    "--tenant-ids",
    required=False,
    help="Tenant IDs (comma-separated).",
)
@click.option("--run-id", required=False, help="Workflow run ID to restore.")
@click.option(
    "--start-from",
    type=click.DateTime(formats=["%Y-%m-%d", "%Y-%m-%dT%H:%M:%S"]),
    default=None,
    help="Optional lower bound (inclusive) for created_at; must be paired with --end-before.",
)
@click.option(
    "--end-before",
    type=click.DateTime(formats=["%Y-%m-%d", "%Y-%m-%dT%H:%M:%S"]),
    default=None,
    help="Optional upper bound (exclusive) for created_at; must be paired with --start-from.",
)
@click.option("--workers", default=1, show_default=True, type=int, help="Concurrent workflow runs to restore.")
@click.option("--limit", type=int, default=100, show_default=True, help="Maximum number of runs to restore.")
@click.option("--dry-run", is_flag=True, help="Preview without restoring.")
def restore_workflow_runs(
    tenant_ids: str | None,
    run_id: str | None,
    start_from: datetime.datetime | None,
    end_before: datetime.datetime | None,
    workers: int,
    limit: int,
    dry_run: bool,
):
    """
    Restore an archived workflow run from storage to the database.

    This restores the following tables:
    - workflow_node_executions
    - workflow_node_execution_offload
    - workflow_pauses
    - workflow_pause_reasons
    - workflow_trigger_logs
    """
    from services.retention.workflow_run.restore_archived_workflow_run import WorkflowRunRestore

    parsed_tenant_ids = None
    if tenant_ids:
        parsed_tenant_ids = [tid.strip() for tid in tenant_ids.split(",") if tid.strip()]
        if not parsed_tenant_ids:
            raise click.BadParameter("tenant-ids must not be empty")

    if (start_from is None) ^ (end_before is None):
        raise click.UsageError("--start-from and --end-before must be provided together.")
    if run_id is None and (start_from is None or end_before is None):
        raise click.UsageError("--start-from and --end-before are required for batch restore.")
    if workers < 1:
        raise click.BadParameter("workers must be at least 1")

    start_time = datetime.datetime.now(datetime.UTC)
    click.echo(
        click.style(
            f"Starting restore of workflow run {run_id} at {start_time.isoformat()}.",
            fg="white",
        )
    )

    restorer = WorkflowRunRestore(dry_run=dry_run, workers=workers)
    if run_id:
        results = [restorer.restore_by_run_id(run_id)]
    else:
        assert start_from is not None
        assert end_before is not None
        results = restorer.restore_batch(
            parsed_tenant_ids,
            start_date=start_from,
            end_date=end_before,
            limit=limit,
        )

    end_time = datetime.datetime.now(datetime.UTC)
    elapsed = end_time - start_time

    successes = sum(1 for result in results if result.success)
    failures = len(results) - successes

    if failures == 0:
        click.echo(
            click.style(
                f"Restore completed successfully. success={successes} duration={elapsed}",
                fg="green",
            )
        )
    else:
        click.echo(
            click.style(
                f"Restore completed with failures. success={successes} failed={failures} duration={elapsed}",
                fg="red",
            )
        )
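A corresponding usage sketch (same Flask CLI assumption; the run ID, tenant ID, and dates are placeholders):

```bash
cd api
# Restore a single archived run back into the database.
uv run flask restore-workflow-runs --run-id 1a2b3c4d-0000-0000-0000-000000000000
# Batch restore for one tenant within a created_at window, previewing first.
uv run flask restore-workflow-runs --tenant-ids tenant-a --start-from 2025-01-01 --end-before 2025-02-01 --dry-run
```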
@click.command(
    "delete-archived-workflow-runs",
    help="Delete archived workflow runs from the database.",
)
@click.option(
    "--tenant-ids",
    required=False,
    help="Tenant IDs (comma-separated).",
)
@click.option("--run-id", required=False, help="Workflow run ID to delete.")
@click.option(
    "--start-from",
    type=click.DateTime(formats=["%Y-%m-%d", "%Y-%m-%dT%H:%M:%S"]),
    default=None,
    help="Optional lower bound (inclusive) for created_at; must be paired with --end-before.",
)
@click.option(
    "--end-before",
    type=click.DateTime(formats=["%Y-%m-%d", "%Y-%m-%dT%H:%M:%S"]),
    default=None,
    help="Optional upper bound (exclusive) for created_at; must be paired with --start-from.",
)
@click.option("--limit", type=int, default=100, show_default=True, help="Maximum number of runs to delete.")
@click.option("--dry-run", is_flag=True, help="Preview without deleting.")
def delete_archived_workflow_runs(
    tenant_ids: str | None,
    run_id: str | None,
    start_from: datetime.datetime | None,
    end_before: datetime.datetime | None,
    limit: int,
    dry_run: bool,
):
    """
    Delete archived workflow runs from the database.
    """
    from services.retention.workflow_run.delete_archived_workflow_run import ArchivedWorkflowRunDeletion

    parsed_tenant_ids = None
    if tenant_ids:
        parsed_tenant_ids = [tid.strip() for tid in tenant_ids.split(",") if tid.strip()]
        if not parsed_tenant_ids:
            raise click.BadParameter("tenant-ids must not be empty")

    if (start_from is None) ^ (end_before is None):
        raise click.UsageError("--start-from and --end-before must be provided together.")
    if run_id is None and (start_from is None or end_before is None):
        raise click.UsageError("--start-from and --end-before are required for batch delete.")

    start_time = datetime.datetime.now(datetime.UTC)
    target_desc = f"workflow run {run_id}" if run_id else "workflow runs"
    click.echo(
        click.style(
            f"Starting delete of {target_desc} at {start_time.isoformat()}.",
            fg="white",
        )
    )

    deleter = ArchivedWorkflowRunDeletion(dry_run=dry_run)
    if run_id:
        results = [deleter.delete_by_run_id(run_id)]
    else:
        assert start_from is not None
        assert end_before is not None
        results = deleter.delete_batch(
            parsed_tenant_ids,
            start_date=start_from,
            end_date=end_before,
            limit=limit,
        )

    for result in results:
        if result.success:
            click.echo(
                click.style(
                    f"{'[DRY RUN] Would delete' if dry_run else 'Deleted'} "
                    f"workflow run {result.run_id} (tenant={result.tenant_id})",
                    fg="green",
                )
            )
        else:
            click.echo(
                click.style(
                    f"Failed to delete workflow run {result.run_id}: {result.error}",
                    fg="red",
                )
            )

    end_time = datetime.datetime.now(datetime.UTC)
    elapsed = end_time - start_time

    successes = sum(1 for result in results if result.success)
    failures = len(results) - successes

    if failures == 0:
        click.echo(
            click.style(
                f"Delete completed successfully. success={successes} duration={elapsed}",
                fg="green",
            )
        )
    else:
        click.echo(
            click.style(
                f"Delete completed with failures. success={successes} failed={failures} duration={elapsed}",
                fg="red",
            )
        )
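And a matching sketch for the cleanup step, typically run after a successful archive (same assumptions and placeholders as above):

```bash
cd api
# Preview which already-archived runs would be removed from the database.
uv run flask delete-archived-workflow-runs --tenant-ids tenant-a --start-from 2025-01-01 --end-before 2025-02-01 --dry-run
```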
@click.option("-f", "--force", is_flag=True, help="Skip user confirmation and force the command to execute.")
@click.command("clear-orphaned-file-records", help="Clear orphaned file records.")
def clear_orphaned_file_records(force: bool):
@@ -82,13 +82,13 @@ class ProviderNotSupportSpeechToTextError(BaseHTTPException):
class DraftWorkflowNotExist(BaseHTTPException):
    error_code = "draft_workflow_not_exist"
    description = "Draft workflow need to be initialized."
    code = 400
    code = 404


class DraftWorkflowNotSync(BaseHTTPException):
    error_code = "draft_workflow_not_sync"
    description = "Workflow graph might have been modified, please refresh and resubmit."
    code = 400
    code = 409


class TracingConfigNotExist(BaseHTTPException):

@@ -470,7 +470,7 @@ class AdvancedChatDraftRunLoopNodeApi(Resource):
        Run draft workflow loop node
        """
        current_user, _ = current_account_with_tenant()
        args = LoopNodeRunPayload.model_validate(console_ns.payload or {}).model_dump(exclude_none=True)
        args = LoopNodeRunPayload.model_validate(console_ns.payload or {})

        try:
            response = AppGenerateService.generate_single_loop(

@@ -508,7 +508,7 @@ class WorkflowDraftRunLoopNodeApi(Resource):
        Run draft workflow loop node
        """
        current_user, _ = current_account_with_tenant()
        args = LoopNodeRunPayload.model_validate(console_ns.payload or {}).model_dump(exclude_none=True)
        args = LoopNodeRunPayload.model_validate(console_ns.payload or {})

        try:
            response = AppGenerateService.generate_single_loop(

@@ -999,6 +999,7 @@ class DraftWorkflowTriggerRunApi(Resource):
        if not event:
            return jsonable_encoder({"status": "waiting", "retry_in": LISTENING_RETRY_IN})
        workflow_args = dict(event.workflow_args)

        workflow_args[SKIP_PREPARE_USER_INPUTS_KEY] = True
        return helper.compact_generate_response(
            AppGenerateService.generate(

@@ -1147,6 +1148,7 @@ class DraftWorkflowTriggerRunAllApi(Resource):

        try:
            workflow_args = dict(trigger_debug_event.workflow_args)

            workflow_args[SKIP_PREPARE_USER_INPUTS_KEY] = True
            response = AppGenerateService.generate(
                app_model=app_model,
@@ -11,7 +11,10 @@ from controllers.console.app.wraps import get_app_model
from controllers.console.wraps import account_initialization_required, setup_required
from core.workflow.enums import WorkflowExecutionStatus
from extensions.ext_database import db
from fields.workflow_app_log_fields import build_workflow_app_log_pagination_model
from fields.workflow_app_log_fields import (
    build_workflow_app_log_pagination_model,
    build_workflow_archived_log_pagination_model,
)
from libs.login import login_required
from models import App
from models.model import AppMode

@@ -61,6 +64,7 @@ console_ns.schema_model(

# Register model for flask_restx to avoid dict type issues in Swagger
workflow_app_log_pagination_model = build_workflow_app_log_pagination_model(console_ns)
workflow_archived_log_pagination_model = build_workflow_archived_log_pagination_model(console_ns)


@console_ns.route("/apps/<uuid:app_id>/workflow-app-logs")

@@ -99,3 +103,33 @@ class WorkflowAppLogApi(Resource):
        )

        return workflow_app_log_pagination


@console_ns.route("/apps/<uuid:app_id>/workflow-archived-logs")
class WorkflowArchivedLogApi(Resource):
    @console_ns.doc("get_workflow_archived_logs")
    @console_ns.doc(description="Get workflow archived execution logs")
    @console_ns.doc(params={"app_id": "Application ID"})
    @console_ns.expect(console_ns.models[WorkflowAppLogQuery.__name__])
    @console_ns.response(200, "Workflow archived logs retrieved successfully", workflow_archived_log_pagination_model)
    @setup_required
    @login_required
    @account_initialization_required
    @get_app_model(mode=[AppMode.WORKFLOW])
    @marshal_with(workflow_archived_log_pagination_model)
    def get(self, app_model: App):
        """
        Get workflow archived logs
        """
        args = WorkflowAppLogQuery.model_validate(request.args.to_dict(flat=True))  # type: ignore

        workflow_app_service = WorkflowAppService()
        with Session(db.engine) as session:
            workflow_app_log_pagination = workflow_app_service.get_paginate_workflow_archive_logs(
                session=session,
                app_model=app_model,
                page=args.page,
                limit=args.limit,
            )

        return workflow_app_log_pagination
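A rough usage sketch of the new archived-logs endpoint (the `/console/api` prefix, host/port, and the Authorization header are assumptions based on the development setup; `page` and `limit` mirror the query model used above; the app ID is a placeholder):

```bash
APP_ID=00000000-0000-0000-0000-000000000000
curl -s "http://localhost:5001/console/api/apps/${APP_ID}/workflow-archived-logs?page=1&limit=20" \
  -H "Authorization: Bearer $CONSOLE_ACCESS_TOKEN"
```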
@@ -1,12 +1,15 @@
from datetime import UTC, datetime, timedelta
from typing import Literal, cast

from flask import request
from flask_restx import Resource, fields, marshal_with
from pydantic import BaseModel, Field, field_validator
from sqlalchemy import select

from controllers.console import console_ns
from controllers.console.app.wraps import get_app_model
from controllers.console.wraps import account_initialization_required, setup_required
from extensions.ext_database import db
from fields.end_user_fields import simple_end_user_fields
from fields.member_fields import simple_account_fields
from fields.workflow_run_fields import (

@@ -19,14 +22,17 @@ from fields.workflow_run_fields import (
    workflow_run_node_execution_list_fields,
    workflow_run_pagination_fields,
)
from libs.archive_storage import ArchiveStorageNotConfiguredError, get_archive_storage
from libs.custom_inputs import time_duration
from libs.helper import uuid_value
from libs.login import current_user, login_required
from models import Account, App, AppMode, EndUser, WorkflowRunTriggeredFrom
from models import Account, App, AppMode, EndUser, WorkflowArchiveLog, WorkflowRunTriggeredFrom
from services.retention.workflow_run.constants import ARCHIVE_BUNDLE_NAME
from services.workflow_run_service import WorkflowRunService

# Workflow run status choices for filtering
WORKFLOW_RUN_STATUS_CHOICES = ["running", "succeeded", "failed", "stopped", "partial-succeeded"]
EXPORT_SIGNED_URL_EXPIRE_SECONDS = 3600

# Register models for flask_restx to avoid dict type issues in Swagger
# Register in dependency order: base models first, then dependent models

@@ -93,6 +99,15 @@ workflow_run_node_execution_list_model = console_ns.model(
    "WorkflowRunNodeExecutionList", workflow_run_node_execution_list_fields_copy
)

workflow_run_export_fields = console_ns.model(
    "WorkflowRunExport",
    {
        "status": fields.String(description="Export status: success/failed"),
        "presigned_url": fields.String(description="Pre-signed URL for download", required=False),
        "presigned_url_expires_at": fields.String(description="Pre-signed URL expiration time", required=False),
    },
)

DEFAULT_REF_TEMPLATE_SWAGGER_2_0 = "#/definitions/{model}"


@@ -181,6 +196,56 @@ class AdvancedChatAppWorkflowRunListApi(Resource):
        return result


@console_ns.route("/apps/<uuid:app_id>/workflow-runs/<uuid:run_id>/export")
class WorkflowRunExportApi(Resource):
    @console_ns.doc("get_workflow_run_export_url")
    @console_ns.doc(description="Generate a download URL for an archived workflow run.")
    @console_ns.doc(params={"app_id": "Application ID", "run_id": "Workflow run ID"})
    @console_ns.response(200, "Export URL generated", workflow_run_export_fields)
    @setup_required
    @login_required
    @account_initialization_required
    @get_app_model()
    def get(self, app_model: App, run_id: str):
        tenant_id = str(app_model.tenant_id)
        app_id = str(app_model.id)
        run_id_str = str(run_id)

        run_created_at = db.session.scalar(
            select(WorkflowArchiveLog.run_created_at)
            .where(
                WorkflowArchiveLog.tenant_id == tenant_id,
                WorkflowArchiveLog.app_id == app_id,
                WorkflowArchiveLog.workflow_run_id == run_id_str,
            )
            .limit(1)
        )
        if not run_created_at:
            return {"code": "archive_log_not_found", "message": "workflow run archive not found"}, 404

        prefix = (
            f"{tenant_id}/app_id={app_id}/year={run_created_at.strftime('%Y')}/"
            f"month={run_created_at.strftime('%m')}/workflow_run_id={run_id_str}"
        )
        archive_key = f"{prefix}/{ARCHIVE_BUNDLE_NAME}"

        try:
            archive_storage = get_archive_storage()
        except ArchiveStorageNotConfiguredError as e:
            return {"code": "archive_storage_not_configured", "message": str(e)}, 500

        presigned_url = archive_storage.generate_presigned_url(
            archive_key,
            expires_in=EXPORT_SIGNED_URL_EXPIRE_SECONDS,
        )
        expires_at = datetime.now(UTC) + timedelta(seconds=EXPORT_SIGNED_URL_EXPIRE_SECONDS)
        return {
            "status": "success",
            "presigned_url": presigned_url,
            "presigned_url_expires_at": expires_at.isoformat(),
        }, 200
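A usage sketch for the export endpoint (console API prefix, host, and auth header are assumptions; IDs are placeholders). On success it returns a pre-signed URL for the archive bundle stored under the `{tenant_id}/app_id=.../year=.../month=.../workflow_run_id=.../` prefix built above, valid for 3600 seconds:

```bash
APP_ID=00000000-0000-0000-0000-000000000000
RUN_ID=11111111-1111-1111-1111-111111111111
curl -s "http://localhost:5001/console/api/apps/${APP_ID}/workflow-runs/${RUN_ID}/export" \
  -H "Authorization: Bearer $CONSOLE_ACCESS_TOKEN"
# Expected shape on success:
# {"status": "success", "presigned_url": "https://...", "presigned_url_expires_at": "2025-..."}
```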
@console_ns.route("/apps/<uuid:app_id>/advanced-chat/workflow-runs/count")
class AdvancedChatAppWorkflowRunCountApi(Resource):
    @console_ns.doc("get_advanced_chat_workflow_runs_count")
@@ -36,6 +36,16 @@ class NotionEstimatePayload(BaseModel):
    doc_language: str = Field(default="English")


class DataSourceNotionListQuery(BaseModel):
    dataset_id: str | None = Field(default=None, description="Dataset ID")
    credential_id: str = Field(..., description="Credential ID", min_length=1)
    datasource_parameters: dict[str, Any] | None = Field(default=None, description="Datasource parameters JSON string")


class DataSourceNotionPreviewQuery(BaseModel):
    credential_id: str = Field(..., description="Credential ID", min_length=1)


register_schema_model(console_ns, NotionEstimatePayload)


@@ -136,26 +146,15 @@ class DataSourceNotionListApi(Resource):
    def get(self):
        current_user, current_tenant_id = current_account_with_tenant()

        dataset_id = request.args.get("dataset_id", default=None, type=str)
        credential_id = request.args.get("credential_id", default=None, type=str)
        if not credential_id:
            raise ValueError("Credential id is required.")
        query = DataSourceNotionListQuery.model_validate(request.args.to_dict())

        # Get datasource_parameters from query string (optional, for GitHub and other datasources)
        datasource_parameters_str = request.args.get("datasource_parameters", default=None, type=str)
        datasource_parameters = {}
        if datasource_parameters_str:
            try:
                datasource_parameters = json.loads(datasource_parameters_str)
                if not isinstance(datasource_parameters, dict):
                    raise ValueError("datasource_parameters must be a JSON object.")
            except json.JSONDecodeError:
                raise ValueError("Invalid datasource_parameters JSON format.")
        datasource_parameters = query.datasource_parameters or {}

        datasource_provider_service = DatasourceProviderService()
        credential = datasource_provider_service.get_datasource_credentials(
            tenant_id=current_tenant_id,
            credential_id=credential_id,
            credential_id=query.credential_id,
            provider="notion_datasource",
            plugin_id="langgenius/notion_datasource",
        )

@@ -164,8 +163,8 @@ class DataSourceNotionListApi(Resource):
        exist_page_ids = []
        with Session(db.engine) as session:
            # import notion in the exist dataset
            if dataset_id:
                dataset = DatasetService.get_dataset(dataset_id)
            if query.dataset_id:
                dataset = DatasetService.get_dataset(query.dataset_id)
                if not dataset:
                    raise NotFound("Dataset not found.")
                if dataset.data_source_type != "notion_import":

@@ -173,7 +172,7 @@ class DataSourceNotionListApi(Resource):

                documents = session.scalars(
                    select(Document).filter_by(
                        dataset_id=dataset_id,
                        dataset_id=query.dataset_id,
                        tenant_id=current_tenant_id,
                        data_source_type="notion_import",
                        enabled=True,

@@ -240,13 +239,12 @@ class DataSourceNotionApi(Resource):
    def get(self, page_id, page_type):
        _, current_tenant_id = current_account_with_tenant()

        credential_id = request.args.get("credential_id", default=None, type=str)
        if not credential_id:
            raise ValueError("Credential id is required.")
        query = DataSourceNotionPreviewQuery.model_validate(request.args.to_dict())

        datasource_provider_service = DatasourceProviderService()
        credential = datasource_provider_service.get_datasource_credentials(
            tenant_id=current_tenant_id,
            credential_id=credential_id,
            credential_id=query.credential_id,
            provider="notion_datasource",
            plugin_id="langgenius/notion_datasource",
        )
@@ -176,7 +176,18 @@ class IndexingEstimatePayload(BaseModel):
        return result


register_schema_models(console_ns, DatasetCreatePayload, DatasetUpdatePayload, IndexingEstimatePayload)
class ConsoleDatasetListQuery(BaseModel):
    page: int = Field(default=1, description="Page number")
    limit: int = Field(default=20, description="Number of items per page")
    keyword: str | None = Field(default=None, description="Search keyword")
    include_all: bool = Field(default=False, description="Include all datasets")
    ids: list[str] = Field(default_factory=list, description="Filter by dataset IDs")
    tag_ids: list[str] = Field(default_factory=list, description="Filter by tag IDs")


register_schema_models(
    console_ns, DatasetCreatePayload, DatasetUpdatePayload, IndexingEstimatePayload, ConsoleDatasetListQuery
)


def _get_retrieval_methods_by_vector_type(vector_type: str | None, is_mock: bool = False) -> dict[str, list[str]]:

@@ -275,18 +286,19 @@ class DatasetListApi(Resource):
    @enterprise_license_required
    def get(self):
        current_user, current_tenant_id = current_account_with_tenant()
        page = request.args.get("page", default=1, type=int)
        limit = request.args.get("limit", default=20, type=int)
        ids = request.args.getlist("ids")
        query = ConsoleDatasetListQuery.model_validate(request.args.to_dict(flat=False))
        # provider = request.args.get("provider", default="vendor")
        search = request.args.get("keyword", default=None, type=str)
        tag_ids = request.args.getlist("tag_ids")
        include_all = request.args.get("include_all", default="false").lower() == "true"
        if ids:
            datasets, total = DatasetService.get_datasets_by_ids(ids, current_tenant_id)
        if query.ids:
            datasets, total = DatasetService.get_datasets_by_ids(query.ids, current_tenant_id)
        else:
            datasets, total = DatasetService.get_datasets(
                page, limit, current_tenant_id, current_user, search, tag_ids, include_all
                query.page,
                query.limit,
                current_tenant_id,
                current_user,
                query.keyword,
                query.tag_ids,
                query.include_all,
            )

        # check embedding setting

@@ -318,7 +330,13 @@ class DatasetListApi(Resource):
            else:
                item.update({"partial_member_list": []})

        response = {"data": data, "has_more": len(datasets) == limit, "limit": limit, "total": total, "page": page}
        response = {
            "data": data,
            "has_more": len(datasets) == query.limit,
            "limit": query.limit,
            "total": total,
            "page": query.page,
        }
        return response, 200

    @console_ns.doc("create_dataset")
@@ -98,12 +98,19 @@ class BedrockRetrievalPayload(BaseModel):
    knowledge_id: str


class ExternalApiTemplateListQuery(BaseModel):
    page: int = Field(default=1, description="Page number")
    limit: int = Field(default=20, description="Number of items per page")
    keyword: str | None = Field(default=None, description="Search keyword")


register_schema_models(
    console_ns,
    ExternalKnowledgeApiPayload,
    ExternalDatasetCreatePayload,
    ExternalHitTestingPayload,
    BedrockRetrievalPayload,
    ExternalApiTemplateListQuery,
)


@@ -124,19 +131,17 @@ class ExternalApiTemplateListApi(Resource):
    @account_initialization_required
    def get(self):
        _, current_tenant_id = current_account_with_tenant()
        page = request.args.get("page", default=1, type=int)
        limit = request.args.get("limit", default=20, type=int)
        search = request.args.get("keyword", default=None, type=str)
        query = ExternalApiTemplateListQuery.model_validate(request.args.to_dict())

        external_knowledge_apis, total = ExternalDatasetService.get_external_knowledge_apis(
            page, limit, current_tenant_id, search
            query.page, query.limit, current_tenant_id, query.keyword
        )
        response = {
            "data": [item.to_dict() for item in external_knowledge_apis],
            "has_more": len(external_knowledge_apis) == limit,
            "limit": limit,
            "has_more": len(external_knowledge_apis) == query.limit,
            "limit": query.limit,
            "total": total,
            "page": page,
            "page": query.page,
        }
        return response, 200
@@ -3,7 +3,7 @@ from typing import Any

from flask import request
from flask_restx import Resource, marshal_with
from pydantic import BaseModel
from pydantic import BaseModel, Field
from sqlalchemy import and_, select
from werkzeug.exceptions import BadRequest, Forbidden, NotFound

@@ -28,6 +28,10 @@ class InstalledAppUpdatePayload(BaseModel):
    is_pinned: bool | None = None


class InstalledAppsListQuery(BaseModel):
    app_id: str | None = Field(default=None, description="App ID to filter by")


logger = logging.getLogger(__name__)


@@ -37,13 +41,13 @@ class InstalledAppsListApi(Resource):
    @account_initialization_required
    @marshal_with(installed_app_list_fields)
    def get(self):
        app_id = request.args.get("app_id", default=None, type=str)
        query = InstalledAppsListQuery.model_validate(request.args.to_dict())
        current_user, current_tenant_id = current_account_with_tenant()

        if app_id:
        if query.app_id:
            installed_apps = db.session.scalars(
                select(InstalledApp).where(
                    and_(InstalledApp.tenant_id == current_tenant_id, InstalledApp.app_id == app_id)
                    and_(InstalledApp.tenant_id == current_tenant_id, InstalledApp.app_id == query.app_id)
                )
            ).all()
        else:
@@ -1,6 +1,7 @@
from flask_restx import Resource, fields
from werkzeug.exceptions import Unauthorized

from libs.login import current_account_with_tenant, login_required
from libs.login import current_account_with_tenant, current_user, login_required
from services.feature_service import FeatureService

from . import console_ns

@@ -39,5 +40,21 @@ class SystemFeatureApi(Resource):
        ),
    )
    def get(self):
        """Get system-wide feature configuration"""
        return FeatureService.get_system_features().model_dump()
        """Get system-wide feature configuration

        NOTE: This endpoint is unauthenticated by design, as it provides system features
        data required for dashboard initialization.

        Authentication would create circular dependency (can't login without dashboard loading).

        Only non-sensitive configuration data should be returned by this endpoint.
        """
        # NOTE(QuantumGhost): ideally we should access `current_user.is_authenticated`
        # without a try-catch. However, due to the implementation of user loader (the `load_user_from_request`
        # in api/extensions/ext_login.py), accessing `current_user.is_authenticated` will
        # raise `Unauthorized` exception if authentication token is not provided.
        try:
            is_authenticated = current_user.is_authenticated
        except Unauthorized:
            is_authenticated = False
        return FeatureService.get_system_features(is_authenticated=is_authenticated).model_dump()
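A quick way to see the effect of the change above (the `/console/api/system-features` path and host are assumptions, not shown in this diff; with no Authorization header the endpoint now falls back to `is_authenticated=False` instead of failing):

```bash
curl -s http://localhost:5001/console/api/system-features | head -c 200
```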
@@ -1,17 +1,17 @@
from flask_restx import Resource, fields
from pydantic import BaseModel, Field

from . import console_ns
from controllers.fastopenapi import console_router


@console_ns.route("/ping")
class PingApi(Resource):
    @console_ns.doc("health_check")
    @console_ns.doc(description="Health check endpoint for connection testing")
    @console_ns.response(
        200,
        "Success",
        console_ns.model("PingResponse", {"result": fields.String(description="Health check result", example="pong")}),
    )
    def get(self):
        """Health check endpoint for connection testing"""
        return {"result": "pong"}
class PingResponse(BaseModel):
    result: str = Field(description="Health check result", examples=["pong"])


@console_router.get(
    "/ping",
    response_model=PingResponse,
    tags=["console"],
)
def ping() -> PingResponse:
    """Health check endpoint for connection testing."""
    return PingResponse(result="pong")
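For a quick check of the migrated route (host/port and the `/console/api` prefix are assumptions based on the development setup described in the README above):

```bash
curl -s http://localhost:5001/console/api/ping
# {"result": "pong"}
```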
@@ -1,20 +1,19 @@
from typing import Literal

from flask import request
from flask_restx import Resource, fields
from pydantic import BaseModel, Field, field_validator

from configs import dify_config
from controllers.fastopenapi import console_router
from libs.helper import EmailStr, extract_remote_ip
from libs.password import valid_password
from models.model import DifySetup, db
from services.account_service import RegisterService, TenantService

from . import console_ns
from .error import AlreadySetupError, NotInitValidateError
from .init_validate import get_init_validate_status
from .wraps import only_edition_self_hosted

DEFAULT_REF_TEMPLATE_SWAGGER_2_0 = "#/definitions/{model}"


class SetupRequestPayload(BaseModel):
email: EmailStr = Field(..., description="Admin email address")
@@ -28,78 +27,66 @@ class SetupRequestPayload(BaseModel):
return valid_password(value)


console_ns.schema_model(
SetupRequestPayload.__name__,
SetupRequestPayload.model_json_schema(ref_template=DEFAULT_REF_TEMPLATE_SWAGGER_2_0),
class SetupStatusResponse(BaseModel):
step: Literal["not_started", "finished"] = Field(description="Setup step status")
setup_at: str | None = Field(default=None, description="Setup completion time (ISO format)")


class SetupResponse(BaseModel):
result: str = Field(description="Setup result", examples=["success"])


@console_router.get(
"/setup",
response_model=SetupStatusResponse,
tags=["console"],
)
def get_setup_status_api() -> SetupStatusResponse:
"""Get system setup status."""
if dify_config.EDITION == "SELF_HOSTED":
setup_status = get_setup_status()
if setup_status and not isinstance(setup_status, bool):
return SetupStatusResponse(step="finished", setup_at=setup_status.setup_at.isoformat())
if setup_status:
return SetupStatusResponse(step="finished")
return SetupStatusResponse(step="not_started")
return SetupStatusResponse(step="finished")


@console_ns.route("/setup")
class SetupApi(Resource):
@console_ns.doc("get_setup_status")
@console_ns.doc(description="Get system setup status")
@console_ns.response(
200,
"Success",
console_ns.model(
"SetupStatusResponse",
{
"step": fields.String(description="Setup step status", enum=["not_started", "finished"]),
"setup_at": fields.String(description="Setup completion time (ISO format)", required=False),
},
),
@console_router.post(
"/setup",
response_model=SetupResponse,
tags=["console"],
status_code=201,
)
@only_edition_self_hosted
def setup_system(payload: SetupRequestPayload) -> SetupResponse:
"""Initialize system setup with admin account."""
if get_setup_status():
raise AlreadySetupError()

tenant_count = TenantService.get_tenant_count()
if tenant_count > 0:
raise AlreadySetupError()

if not get_init_validate_status():
raise NotInitValidateError()

normalized_email = payload.email.lower()

RegisterService.setup(
email=normalized_email,
name=payload.name,
password=payload.password,
ip_address=extract_remote_ip(request),
language=payload.language,
)
def get(self):
"""Get system setup status"""
if dify_config.EDITION == "SELF_HOSTED":
setup_status = get_setup_status()
# Check if setup_status is a DifySetup object rather than a bool
if setup_status and not isinstance(setup_status, bool):
return {"step": "finished", "setup_at": setup_status.setup_at.isoformat()}
elif setup_status:
return {"step": "finished"}
return {"step": "not_started"}
return {"step": "finished"}

@console_ns.doc("setup_system")
@console_ns.doc(description="Initialize system setup with admin account")
@console_ns.expect(console_ns.models[SetupRequestPayload.__name__])
@console_ns.response(
201, "Success", console_ns.model("SetupResponse", {"result": fields.String(description="Setup result")})
)
@console_ns.response(400, "Already setup or validation failed")
@only_edition_self_hosted
def post(self):
"""Initialize system setup with admin account"""
# is set up
if get_setup_status():
raise AlreadySetupError()

# is tenant created
tenant_count = TenantService.get_tenant_count()
if tenant_count > 0:
raise AlreadySetupError()

if not get_init_validate_status():
raise NotInitValidateError()

args = SetupRequestPayload.model_validate(console_ns.payload)
normalized_email = args.email.lower()

# setup
RegisterService.setup(
email=normalized_email,
name=args.name,
password=args.password,
ip_address=extract_remote_ip(request),
language=args.language,
)

return {"result": "success"}, 201
return SetupResponse(result="success")


def get_setup_status():
def get_setup_status() -> DifySetup | bool | None:
if dify_config.EDITION == "SELF_HOSTED":
return db.session.query(DifySetup).first()
else:
return True

return True

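Both the legacy flask_restx handler and the new FastOpenAPI handler above funnel the request body through the same `SetupRequestPayload` model. A minimal sketch of that validation, with made-up values (the password is assumed to satisfy `valid_password`):

```python
# Hypothetical values for illustration only.
payload = SetupRequestPayload.model_validate(
    {
        "email": "Admin@Example.com",  # EmailStr validates the format
        "name": "Admin",
        "password": "Passw0rd123",     # assumed to pass the valid_password rule
        "language": "en-US",
    }
)

# The handlers lower-case the email themselves before calling RegisterService.setup.
normalized_email = payload.email.lower()
```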
@@ -40,6 +40,7 @@ register_schema_models(
TagBasePayload,
TagBindingPayload,
TagBindingRemovePayload,
TagListQueryParam,
)

@@ -1,15 +1,11 @@
import json
import logging

import httpx
from flask import request
from flask_restx import Resource, fields
from packaging import version
from pydantic import BaseModel, Field

from configs import dify_config

from . import console_ns
from controllers.fastopenapi import console_router

logger = logging.getLogger(__name__)

@@ -18,69 +14,61 @@ class VersionQuery(BaseModel):
current_version: str = Field(..., description="Current application version")


console_ns.schema_model(
VersionQuery.__name__,
VersionQuery.model_json_schema(ref_template="#/definitions/{model}"),
class VersionFeatures(BaseModel):
can_replace_logo: bool = Field(description="Whether logo replacement is supported")
model_load_balancing_enabled: bool = Field(description="Whether model load balancing is enabled")


class VersionResponse(BaseModel):
version: str = Field(description="Latest version number")
release_date: str = Field(description="Release date of latest version")
release_notes: str = Field(description="Release notes for latest version")
can_auto_update: bool = Field(description="Whether auto-update is supported")
features: VersionFeatures = Field(description="Feature flags and capabilities")


@console_router.get(
"/version",
response_model=VersionResponse,
tags=["console"],
)
def check_version_update(query: VersionQuery) -> VersionResponse:
"""Check for application version updates."""
check_update_url = dify_config.CHECK_UPDATE_URL


@console_ns.route("/version")
class VersionApi(Resource):
@console_ns.doc("check_version_update")
@console_ns.doc(description="Check for application version updates")
@console_ns.expect(console_ns.models[VersionQuery.__name__])
@console_ns.response(
200,
"Success",
console_ns.model(
"VersionResponse",
{
"version": fields.String(description="Latest version number"),
"release_date": fields.String(description="Release date of latest version"),
"release_notes": fields.String(description="Release notes for latest version"),
"can_auto_update": fields.Boolean(description="Whether auto-update is supported"),
"features": fields.Raw(description="Feature flags and capabilities"),
},
result = VersionResponse(
version=dify_config.project.version,
release_date="",
release_notes="",
can_auto_update=False,
features=VersionFeatures(
can_replace_logo=dify_config.CAN_REPLACE_LOGO,
model_load_balancing_enabled=dify_config.MODEL_LB_ENABLED,
),
)
def get(self):
"""Check for application version updates"""
args = VersionQuery.model_validate(request.args.to_dict(flat=True))  # type: ignore
check_update_url = dify_config.CHECK_UPDATE_URL

result = {
"version": dify_config.project.version,
"release_date": "",
"release_notes": "",
"can_auto_update": False,
"features": {
"can_replace_logo": dify_config.CAN_REPLACE_LOGO,
"model_load_balancing_enabled": dify_config.MODEL_LB_ENABLED,
},
}

if not check_update_url:
return result

try:
response = httpx.get(
check_update_url,
params={"current_version": args.current_version},
timeout=httpx.Timeout(timeout=10.0, connect=3.0),
)
except Exception as error:
logger.warning("Check update version error: %s.", str(error))
result["version"] = args.current_version
return result

content = json.loads(response.content)
if _has_new_version(latest_version=content["version"], current_version=f"{args.current_version}"):
result["version"] = content["version"]
result["release_date"] = content["releaseDate"]
result["release_notes"] = content["releaseNotes"]
result["can_auto_update"] = content["canAutoUpdate"]
if not check_update_url:
return result

try:
response = httpx.get(
check_update_url,
params={"current_version": query.current_version},
timeout=httpx.Timeout(timeout=10.0, connect=3.0),
)
content = response.json()
except Exception as error:
logger.warning("Check update version error: %s.", str(error))
result.version = query.current_version
return result
latest_version = content.get("version", result.version)
if _has_new_version(latest_version=latest_version, current_version=f"{query.current_version}"):
result.version = latest_version
result.release_date = content.get("releaseDate", "")
result.release_notes = content.get("releaseNotes", "")
result.can_auto_update = content.get("canAutoUpdate", False)
return result


def _has_new_version(*, latest_version: str, current_version: str) -> bool:
try:

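The body of `_has_new_version` is truncated in this hunk. A plausible sketch of a `packaging`-based comparison, offered only as an assumption about what the helper does (the real implementation may differ, for example in how parse errors are handled):

```python
from packaging import version


def _has_new_version_sketch(*, latest_version: str, current_version: str) -> bool:
    # Hypothetical stand-in for the truncated helper above.
    try:
        return version.parse(latest_version) > version.parse(current_version)
    except version.InvalidVersion:
        # Treat unparsable version strings as "no update available".
        return False
```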
api/controllers/fastopenapi.py (new file, +3)
@@ -0,0 +1,3 @@
from fastopenapi.routers import FlaskRouter

console_router = FlaskRouter()

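The new module exposes a single shared `FlaskRouter` instance; controller modules import it and register plain functions on it, as the ping, setup, and version diffs above do. A minimal sketch of the pattern (the route path and response model here are illustrative, not from any commit):

```python
from pydantic import BaseModel

from controllers.fastopenapi import console_router


class ExampleResponse(BaseModel):
    ok: bool


@console_router.get("/example", response_model=ExampleResponse, tags=["console"])
def example() -> ExampleResponse:
    # Handlers return Pydantic models; the router takes care of serialization.
    return ExampleResponse(ok=True)
```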
@@ -87,6 +87,14 @@ class TagUnbindingPayload(BaseModel):
target_id: str


class DatasetListQuery(BaseModel):
page: int = Field(default=1, description="Page number")
limit: int = Field(default=20, description="Number of items per page")
keyword: str | None = Field(default=None, description="Search keyword")
include_all: bool = Field(default=False, description="Include all datasets")
tag_ids: list[str] = Field(default_factory=list, description="Filter by tag IDs")


register_schema_models(
service_api_ns,
DatasetCreatePayload,
@@ -96,6 +104,7 @@ register_schema_models(
TagDeletePayload,
TagBindingPayload,
TagUnbindingPayload,
DatasetListQuery,
)


@@ -113,15 +122,11 @@ class DatasetListApi(DatasetApiResource):
)
def get(self, tenant_id):
"""Resource for getting datasets."""
page = request.args.get("page", default=1, type=int)
limit = request.args.get("limit", default=20, type=int)
query = DatasetListQuery.model_validate(request.args.to_dict(flat=False))
# provider = request.args.get("provider", default="vendor")
search = request.args.get("keyword", default=None, type=str)
tag_ids = request.args.getlist("tag_ids")
include_all = request.args.get("include_all", default="false").lower() == "true"

datasets, total = DatasetService.get_datasets(
page, limit, tenant_id, current_user, search, tag_ids, include_all
query.page, query.limit, tenant_id, current_user, query.keyword, query.tag_ids, query.include_all
)
# check embedding setting
provider_manager = ProviderManager()
@@ -147,7 +152,13 @@ class DatasetListApi(DatasetApiResource):
item["embedding_available"] = False
else:
item["embedding_available"] = True
response = {"data": data, "has_more": len(datasets) == limit, "limit": limit, "total": total, "page": page}
response = {
"data": data,
"has_more": len(datasets) == query.limit,
"limit": query.limit,
"total": total,
"page": query.page,
}
return response, 200

@service_api_ns.expect(service_api_ns.models[DatasetCreatePayload.__name__])

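The dataset list handler now pushes the raw query string through a Pydantic model instead of reading each argument separately; `request.args.to_dict(flat=False)` keeps repeated `tag_ids` parameters as a list. A minimal sketch of the defaults the model applies (values are illustrative):

```python
query = DatasetListQuery.model_validate({"keyword": "sales", "tag_ids": ["tag-1", "tag-2"]})

# Unspecified fields fall back to the declared defaults.
assert (query.page, query.limit, query.include_all) == (1, 20, False)
assert query.keyword == "sales" and query.tag_ids == ["tag-1", "tag-2"]
```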
@@ -69,7 +69,14 @@ class DocumentTextUpdate(BaseModel):
|
||||
return self
|
||||
|
||||
|
||||
for m in [ProcessRule, RetrievalModel, DocumentTextCreatePayload, DocumentTextUpdate]:
|
||||
class DocumentListQuery(BaseModel):
|
||||
page: int = Field(default=1, description="Page number")
|
||||
limit: int = Field(default=20, description="Number of items per page")
|
||||
keyword: str | None = Field(default=None, description="Search keyword")
|
||||
status: str | None = Field(default=None, description="Document status filter")
|
||||
|
||||
|
||||
for m in [ProcessRule, RetrievalModel, DocumentTextCreatePayload, DocumentTextUpdate, DocumentListQuery]:
|
||||
service_api_ns.schema_model(m.__name__, m.model_json_schema(ref_template=DEFAULT_REF_TEMPLATE_SWAGGER_2_0)) # type: ignore
|
||||
|
||||
|
||||
@@ -261,17 +268,6 @@ class DocumentAddByFileApi(DatasetApiResource):
|
||||
@cloud_edition_billing_rate_limit_check("knowledge", "dataset")
|
||||
def post(self, tenant_id, dataset_id):
|
||||
"""Create document by upload file."""
|
||||
args = {}
|
||||
if "data" in request.form:
|
||||
args = json.loads(request.form["data"])
|
||||
if "doc_form" not in args:
|
||||
args["doc_form"] = "text_model"
|
||||
if "doc_language" not in args:
|
||||
args["doc_language"] = "English"
|
||||
|
||||
# get dataset info
|
||||
dataset_id = str(dataset_id)
|
||||
tenant_id = str(tenant_id)
|
||||
dataset = db.session.query(Dataset).where(Dataset.tenant_id == tenant_id, Dataset.id == dataset_id).first()
|
||||
|
||||
if not dataset:
|
||||
@@ -280,6 +276,18 @@ class DocumentAddByFileApi(DatasetApiResource):
|
||||
if dataset.provider == "external":
|
||||
raise ValueError("External datasets are not supported.")
|
||||
|
||||
args = {}
|
||||
if "data" in request.form:
|
||||
args = json.loads(request.form["data"])
|
||||
if "doc_form" not in args:
|
||||
args["doc_form"] = dataset.chunk_structure or "text_model"
|
||||
if "doc_language" not in args:
|
||||
args["doc_language"] = "English"
|
||||
|
||||
# get dataset info
|
||||
dataset_id = str(dataset_id)
|
||||
tenant_id = str(tenant_id)
|
||||
|
||||
indexing_technique = args.get("indexing_technique") or dataset.indexing_technique
|
||||
if not indexing_technique:
|
||||
raise ValueError("indexing_technique is required.")
|
||||
@@ -370,17 +378,6 @@ class DocumentUpdateByFileApi(DatasetApiResource):
|
||||
@cloud_edition_billing_rate_limit_check("knowledge", "dataset")
|
||||
def post(self, tenant_id, dataset_id, document_id):
|
||||
"""Update document by upload file."""
|
||||
args = {}
|
||||
if "data" in request.form:
|
||||
args = json.loads(request.form["data"])
|
||||
if "doc_form" not in args:
|
||||
args["doc_form"] = "text_model"
|
||||
if "doc_language" not in args:
|
||||
args["doc_language"] = "English"
|
||||
|
||||
# get dataset info
|
||||
dataset_id = str(dataset_id)
|
||||
tenant_id = str(tenant_id)
|
||||
dataset = db.session.query(Dataset).where(Dataset.tenant_id == tenant_id, Dataset.id == dataset_id).first()
|
||||
|
||||
if not dataset:
|
||||
@@ -389,6 +386,18 @@ class DocumentUpdateByFileApi(DatasetApiResource):
|
||||
if dataset.provider == "external":
|
||||
raise ValueError("External datasets are not supported.")
|
||||
|
||||
args = {}
|
||||
if "data" in request.form:
|
||||
args = json.loads(request.form["data"])
|
||||
if "doc_form" not in args:
|
||||
args["doc_form"] = dataset.chunk_structure or "text_model"
|
||||
if "doc_language" not in args:
|
||||
args["doc_language"] = "English"
|
||||
|
||||
# get dataset info
|
||||
dataset_id = str(dataset_id)
|
||||
tenant_id = str(tenant_id)
|
||||
|
||||
# indexing_technique is already set in dataset since this is an update
|
||||
args["indexing_technique"] = dataset.indexing_technique
|
||||
|
||||
@@ -458,34 +467,33 @@ class DocumentListApi(DatasetApiResource):
|
||||
def get(self, tenant_id, dataset_id):
|
||||
dataset_id = str(dataset_id)
|
||||
tenant_id = str(tenant_id)
|
||||
page = request.args.get("page", default=1, type=int)
|
||||
limit = request.args.get("limit", default=20, type=int)
|
||||
search = request.args.get("keyword", default=None, type=str)
|
||||
status = request.args.get("status", default=None, type=str)
|
||||
query_params = DocumentListQuery.model_validate(request.args.to_dict())
|
||||
dataset = db.session.query(Dataset).where(Dataset.tenant_id == tenant_id, Dataset.id == dataset_id).first()
|
||||
if not dataset:
|
||||
raise NotFound("Dataset not found.")
|
||||
|
||||
query = select(Document).filter_by(dataset_id=str(dataset_id), tenant_id=tenant_id)
|
||||
|
||||
if status:
|
||||
query = DocumentService.apply_display_status_filter(query, status)
|
||||
if query_params.status:
|
||||
query = DocumentService.apply_display_status_filter(query, query_params.status)
|
||||
|
||||
if search:
|
||||
search = f"%{search}%"
|
||||
if query_params.keyword:
|
||||
search = f"%{query_params.keyword}%"
|
||||
query = query.where(Document.name.like(search))
|
||||
|
||||
query = query.order_by(desc(Document.created_at), desc(Document.position))
|
||||
|
||||
paginated_documents = db.paginate(select=query, page=page, per_page=limit, max_per_page=100, error_out=False)
|
||||
paginated_documents = db.paginate(
|
||||
select=query, page=query_params.page, per_page=query_params.limit, max_per_page=100, error_out=False
|
||||
)
|
||||
documents = paginated_documents.items
|
||||
|
||||
response = {
|
||||
"data": marshal(documents, document_fields),
|
||||
"has_more": len(documents) == limit,
|
||||
"limit": limit,
|
||||
"has_more": len(documents) == query_params.limit,
|
||||
"limit": query_params.limit,
|
||||
"total": paginated_documents.total,
|
||||
"page": page,
|
||||
"page": query_params.page,
|
||||
}
|
||||
|
||||
return response
|
||||
|
||||
@@ -11,7 +11,9 @@ from controllers.service_api.wraps import DatasetApiResource, cloud_edition_bill
|
||||
from fields.dataset_fields import dataset_metadata_fields
|
||||
from services.dataset_service import DatasetService
|
||||
from services.entities.knowledge_entities.knowledge_entities import (
|
||||
DocumentMetadataOperation,
|
||||
MetadataArgs,
|
||||
MetadataDetail,
|
||||
MetadataOperationData,
|
||||
)
|
||||
from services.metadata_service import MetadataService
|
||||
@@ -22,7 +24,13 @@ class MetadataUpdatePayload(BaseModel):
|
||||
|
||||
|
||||
register_schema_model(service_api_ns, MetadataUpdatePayload)
|
||||
register_schema_models(service_api_ns, MetadataArgs, MetadataOperationData)
|
||||
register_schema_models(
|
||||
service_api_ns,
|
||||
MetadataArgs,
|
||||
MetadataDetail,
|
||||
DocumentMetadataOperation,
|
||||
MetadataOperationData,
|
||||
)
|
||||
|
||||
|
||||
@service_api_ns.route("/datasets/<uuid:dataset_id>/metadata")
|
||||
|
||||
@@ -17,5 +17,15 @@ class SystemFeatureApi(Resource):

Returns:
dict: System feature configuration object

This endpoint is akin to the `SystemFeatureApi` endpoint in api/controllers/console/feature.py,
except it is intended for use by the web app, instead of the console dashboard.

NOTE: This endpoint is unauthenticated by design, as it provides system features
data required for webapp initialization.

Authentication would create circular dependency (can't authenticate without webapp loading).

Only non-sensitive configuration data should be returned by this endpoint.
"""
return FeatureService.get_system_features().model_dump()

@@ -1,9 +1,11 @@
|
||||
from flask import make_response, request
|
||||
from flask_restx import Resource, reqparse
|
||||
from flask_restx import Resource
|
||||
from jwt import InvalidTokenError
|
||||
from pydantic import BaseModel, Field, field_validator
|
||||
|
||||
import services
|
||||
from configs import dify_config
|
||||
from controllers.common.schema import register_schema_models
|
||||
from controllers.console.auth.error import (
|
||||
AuthenticationFailedError,
|
||||
EmailCodeError,
|
||||
@@ -18,7 +20,7 @@ from controllers.console.wraps import (
|
||||
)
|
||||
from controllers.web import web_ns
|
||||
from controllers.web.wraps import decode_jwt_token
|
||||
from libs.helper import email
|
||||
from libs.helper import EmailStr
|
||||
from libs.passport import PassportService
|
||||
from libs.password import valid_password
|
||||
from libs.token import (
|
||||
@@ -30,10 +32,35 @@ from services.app_service import AppService
|
||||
from services.webapp_auth_service import WebAppAuthService
|
||||
|
||||
|
||||
class LoginPayload(BaseModel):
|
||||
email: EmailStr
|
||||
password: str
|
||||
|
||||
@field_validator("password")
|
||||
@classmethod
|
||||
def validate_password(cls, value: str) -> str:
|
||||
return valid_password(value)
|
||||
|
||||
|
||||
class EmailCodeLoginSendPayload(BaseModel):
|
||||
email: EmailStr
|
||||
language: str | None = None
|
||||
|
||||
|
||||
class EmailCodeLoginVerifyPayload(BaseModel):
|
||||
email: EmailStr
|
||||
code: str
|
||||
token: str = Field(min_length=1)
|
||||
|
||||
|
||||
register_schema_models(web_ns, LoginPayload, EmailCodeLoginSendPayload, EmailCodeLoginVerifyPayload)
|
||||
|
||||
|
||||
@web_ns.route("/login")
|
||||
class LoginApi(Resource):
|
||||
"""Resource for web app email/password login."""
|
||||
|
||||
@web_ns.expect(web_ns.models[LoginPayload.__name__])
|
||||
@setup_required
|
||||
@only_edition_enterprise
|
||||
@web_ns.doc("web_app_login")
|
||||
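The login payload models above replace the reqparse parsers removed in the hunks below; validation failures surface as field-level Pydantic errors rather than per-argument reqparse aborts. A minimal sketch (values are illustrative; the password is assumed to satisfy `valid_password`):

```python
from pydantic import ValidationError

# A well-formed body validates into typed fields.
LoginPayload.model_validate({"email": "user@example.com", "password": "Passw0rd123"})

# An empty body (web_ns.payload is None, hence the `or {}` guard) is rejected up front.
try:
    LoginPayload.model_validate({})
except ValidationError as exc:
    missing = {error["loc"][0] for error in exc.errors()}  # {"email", "password"}
```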
@@ -50,15 +77,10 @@ class LoginApi(Resource):
|
||||
@decrypt_password_field
|
||||
def post(self):
|
||||
"""Authenticate user and login."""
|
||||
parser = (
|
||||
reqparse.RequestParser()
|
||||
.add_argument("email", type=email, required=True, location="json")
|
||||
.add_argument("password", type=valid_password, required=True, location="json")
|
||||
)
|
||||
args = parser.parse_args()
|
||||
payload = LoginPayload.model_validate(web_ns.payload or {})
|
||||
|
||||
try:
|
||||
account = WebAppAuthService.authenticate(args["email"], args["password"])
|
||||
account = WebAppAuthService.authenticate(payload.email, payload.password)
|
||||
except services.errors.account.AccountLoginError:
|
||||
raise AccountBannedError()
|
||||
except services.errors.account.AccountPasswordError:
|
||||
@@ -145,6 +167,7 @@ class EmailCodeLoginSendEmailApi(Resource):
|
||||
@only_edition_enterprise
|
||||
@web_ns.doc("send_email_code_login")
|
||||
@web_ns.doc(description="Send email verification code for login")
|
||||
@web_ns.expect(web_ns.models[EmailCodeLoginSendPayload.__name__])
|
||||
@web_ns.doc(
|
||||
responses={
|
||||
200: "Email code sent successfully",
|
||||
@@ -153,19 +176,14 @@ class EmailCodeLoginSendEmailApi(Resource):
|
||||
}
|
||||
)
|
||||
def post(self):
|
||||
parser = (
|
||||
reqparse.RequestParser()
|
||||
.add_argument("email", type=email, required=True, location="json")
|
||||
.add_argument("language", type=str, required=False, location="json")
|
||||
)
|
||||
args = parser.parse_args()
|
||||
payload = EmailCodeLoginSendPayload.model_validate(web_ns.payload or {})
|
||||
|
||||
if args["language"] is not None and args["language"] == "zh-Hans":
|
||||
if payload.language == "zh-Hans":
|
||||
language = "zh-Hans"
|
||||
else:
|
||||
language = "en-US"
|
||||
|
||||
account = WebAppAuthService.get_user_through_email(args["email"])
|
||||
account = WebAppAuthService.get_user_through_email(payload.email)
|
||||
if account is None:
|
||||
raise AuthenticationFailedError()
|
||||
else:
|
||||
@@ -179,6 +197,7 @@ class EmailCodeLoginApi(Resource):
|
||||
@only_edition_enterprise
|
||||
@web_ns.doc("verify_email_code_login")
|
||||
@web_ns.doc(description="Verify email code and complete login")
|
||||
@web_ns.expect(web_ns.models[EmailCodeLoginVerifyPayload.__name__])
|
||||
@web_ns.doc(
|
||||
responses={
|
||||
200: "Email code verified and login successful",
|
||||
@@ -189,17 +208,11 @@ class EmailCodeLoginApi(Resource):
|
||||
)
|
||||
@decrypt_code_field
|
||||
def post(self):
|
||||
parser = (
|
||||
reqparse.RequestParser()
|
||||
.add_argument("email", type=str, required=True, location="json")
|
||||
.add_argument("code", type=str, required=True, location="json")
|
||||
.add_argument("token", type=str, required=True, location="json")
|
||||
)
|
||||
args = parser.parse_args()
|
||||
payload = EmailCodeLoginVerifyPayload.model_validate(web_ns.payload or {})
|
||||
|
||||
user_email = args["email"].lower()
|
||||
user_email = payload.email.lower()
|
||||
|
||||
token_data = WebAppAuthService.get_email_code_login_data(args["token"])
|
||||
token_data = WebAppAuthService.get_email_code_login_data(payload.token)
|
||||
if token_data is None:
|
||||
raise InvalidTokenError()
|
||||
|
||||
@@ -210,10 +223,10 @@ class EmailCodeLoginApi(Resource):
|
||||
if normalized_token_email != user_email:
|
||||
raise InvalidEmailError()
|
||||
|
||||
if token_data["code"] != args["code"]:
|
||||
if token_data["code"] != payload.code:
|
||||
raise EmailCodeError()
|
||||
|
||||
WebAppAuthService.revoke_email_code_login_token(args["token"])
|
||||
WebAppAuthService.revoke_email_code_login_token(payload.token)
|
||||
account = WebAppAuthService.get_user_through_email(token_email)
|
||||
if not account:
|
||||
raise AuthenticationFailedError()
|
||||
|
||||
@@ -1,8 +1,10 @@
|
||||
import logging
|
||||
from typing import Any
|
||||
|
||||
from flask_restx import reqparse
|
||||
from pydantic import BaseModel, Field
|
||||
from werkzeug.exceptions import InternalServerError
|
||||
|
||||
from controllers.common.schema import register_schema_models
|
||||
from controllers.web import web_ns
|
||||
from controllers.web.error import (
|
||||
CompletionRequestError,
|
||||
@@ -27,19 +29,22 @@ from models.model import App, AppMode, EndUser
|
||||
from services.app_generate_service import AppGenerateService
|
||||
from services.errors.llm import InvokeRateLimitError
|
||||
|
||||
|
||||
class WorkflowRunPayload(BaseModel):
|
||||
inputs: dict[str, Any] = Field(description="Input variables for the workflow")
|
||||
files: list[dict[str, Any]] | None = Field(default=None, description="Files to be processed by the workflow")
|
||||
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
register_schema_models(web_ns, WorkflowRunPayload)
|
||||
|
||||
|
||||
@web_ns.route("/workflows/run")
|
||||
class WorkflowRunApi(WebApiResource):
|
||||
@web_ns.doc("Run Workflow")
|
||||
@web_ns.doc(description="Execute a workflow with provided inputs and files.")
|
||||
@web_ns.doc(
|
||||
params={
|
||||
"inputs": {"description": "Input variables for the workflow", "type": "object", "required": True},
|
||||
"files": {"description": "Files to be processed by the workflow", "type": "array", "required": False},
|
||||
}
|
||||
)
|
||||
@web_ns.expect(web_ns.models[WorkflowRunPayload.__name__])
|
||||
@web_ns.doc(
|
||||
responses={
|
||||
200: "Success",
|
||||
@@ -58,12 +63,8 @@ class WorkflowRunApi(WebApiResource):
|
||||
if app_mode != AppMode.WORKFLOW:
|
||||
raise NotWorkflowAppError()
|
||||
|
||||
parser = (
|
||||
reqparse.RequestParser()
|
||||
.add_argument("inputs", type=dict, required=True, nullable=False, location="json")
|
||||
.add_argument("files", type=list, required=False, location="json")
|
||||
)
|
||||
args = parser.parse_args()
|
||||
payload = WorkflowRunPayload.model_validate(web_ns.payload or {})
|
||||
args = payload.model_dump(exclude_none=True)
|
||||
|
||||
try:
|
||||
response = AppGenerateService.generate(
|
||||
|
||||
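`WorkflowRunPayload` replaces the reqparse parser here; the handler dumps it back to a plain dict for `AppGenerateService.generate`. A minimal sketch of what `model_dump(exclude_none=True)` does when `files` is omitted (values are illustrative):

```python
payload = WorkflowRunPayload(inputs={"query": "hello"})

# `files` defaults to None, so exclude_none drops it from the dict handed to the service.
assert payload.model_dump(exclude_none=True) == {"inputs": {"query": "hello"}}
```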
@@ -1,9 +1,11 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import contextvars
|
||||
import logging
|
||||
import threading
|
||||
import uuid
|
||||
from collections.abc import Generator, Mapping
|
||||
from typing import Any, Literal, Union, overload
|
||||
from typing import TYPE_CHECKING, Any, Literal, Union, overload
|
||||
|
||||
from flask import Flask, current_app
|
||||
from pydantic import ValidationError
|
||||
@@ -13,6 +15,9 @@ from sqlalchemy.orm import Session, sessionmaker
|
||||
import contexts
|
||||
from configs import dify_config
|
||||
from constants import UUID_NIL
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from controllers.console.app.workflow import LoopNodeRunPayload
|
||||
from core.app.app_config.features.file_upload.manager import FileUploadConfigManager
|
||||
from core.app.apps.advanced_chat.app_config_manager import AdvancedChatAppConfigManager
|
||||
from core.app.apps.advanced_chat.app_runner import AdvancedChatAppRunner
|
||||
@@ -304,7 +309,7 @@ class AdvancedChatAppGenerator(MessageBasedAppGenerator):
|
||||
workflow: Workflow,
|
||||
node_id: str,
|
||||
user: Account | EndUser,
|
||||
args: Mapping,
|
||||
args: LoopNodeRunPayload,
|
||||
streaming: bool = True,
|
||||
) -> Mapping[str, Any] | Generator[str | Mapping[str, Any], Any, None]:
|
||||
"""
|
||||
@@ -320,7 +325,7 @@ class AdvancedChatAppGenerator(MessageBasedAppGenerator):
|
||||
if not node_id:
|
||||
raise ValueError("node_id is required")
|
||||
|
||||
if args.get("inputs") is None:
|
||||
if args.inputs is None:
|
||||
raise ValueError("inputs is required")
|
||||
|
||||
# convert to app config
|
||||
@@ -338,7 +343,7 @@ class AdvancedChatAppGenerator(MessageBasedAppGenerator):
|
||||
stream=streaming,
|
||||
invoke_from=InvokeFrom.DEBUGGER,
|
||||
extras={"auto_generate_conversation_name": False},
|
||||
single_loop_run=AdvancedChatAppGenerateEntity.SingleLoopRunEntity(node_id=node_id, inputs=args["inputs"]),
|
||||
single_loop_run=AdvancedChatAppGenerateEntity.SingleLoopRunEntity(node_id=node_id, inputs=args.inputs),
|
||||
)
|
||||
contexts.plugin_tool_providers.set({})
|
||||
contexts.plugin_tool_providers_lock.set(threading.Lock())
|
||||
|
||||
@@ -236,4 +236,7 @@ class AgentChatAppRunner(AppRunner):
|
||||
queue_manager=queue_manager,
|
||||
stream=application_generate_entity.stream,
|
||||
agent=True,
|
||||
message_id=message.id,
|
||||
user_id=application_generate_entity.user_id,
|
||||
tenant_id=app_config.tenant_id,
|
||||
)
|
||||
|
||||
@@ -1,6 +1,8 @@
|
||||
import base64
|
||||
import logging
|
||||
import time
|
||||
from collections.abc import Generator, Mapping, Sequence
|
||||
from mimetypes import guess_extension
|
||||
from typing import TYPE_CHECKING, Any, Union
|
||||
|
||||
from core.app.app_config.entities import ExternalDataVariableEntity, PromptTemplateEntity
|
||||
@@ -11,10 +13,16 @@ from core.app.entities.app_invoke_entities import (
|
||||
InvokeFrom,
|
||||
ModelConfigWithCredentialsEntity,
|
||||
)
|
||||
from core.app.entities.queue_entities import QueueAgentMessageEvent, QueueLLMChunkEvent, QueueMessageEndEvent
|
||||
from core.app.entities.queue_entities import (
|
||||
QueueAgentMessageEvent,
|
||||
QueueLLMChunkEvent,
|
||||
QueueMessageEndEvent,
|
||||
QueueMessageFileEvent,
|
||||
)
|
||||
from core.app.features.annotation_reply.annotation_reply import AnnotationReplyFeature
|
||||
from core.app.features.hosting_moderation.hosting_moderation import HostingModerationFeature
|
||||
from core.external_data_tool.external_data_fetch import ExternalDataFetch
|
||||
from core.file.enums import FileTransferMethod, FileType
|
||||
from core.memory.token_buffer_memory import TokenBufferMemory
|
||||
from core.model_manager import ModelInstance
|
||||
from core.model_runtime.entities.llm_entities import LLMResult, LLMResultChunk, LLMResultChunkDelta, LLMUsage
|
||||
@@ -22,6 +30,7 @@ from core.model_runtime.entities.message_entities import (
|
||||
AssistantPromptMessage,
|
||||
ImagePromptMessageContent,
|
||||
PromptMessage,
|
||||
TextPromptMessageContent,
|
||||
)
|
||||
from core.model_runtime.entities.model_entities import ModelPropertyKey
|
||||
from core.model_runtime.errors.invoke import InvokeBadRequestError
|
||||
@@ -29,7 +38,10 @@ from core.moderation.input_moderation import InputModeration
|
||||
from core.prompt.advanced_prompt_transform import AdvancedPromptTransform
|
||||
from core.prompt.entities.advanced_prompt_entities import ChatModelMessage, CompletionModelPromptTemplate, MemoryConfig
|
||||
from core.prompt.simple_prompt_transform import ModelMode, SimplePromptTransform
|
||||
from models.model import App, AppMode, Message, MessageAnnotation
|
||||
from core.tools.tool_file_manager import ToolFileManager
|
||||
from extensions.ext_database import db
|
||||
from models.enums import CreatorUserRole
|
||||
from models.model import App, AppMode, Message, MessageAnnotation, MessageFile
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from core.file.models import File
|
||||
@@ -203,6 +215,9 @@ class AppRunner:
|
||||
queue_manager: AppQueueManager,
|
||||
stream: bool,
|
||||
agent: bool = False,
|
||||
message_id: str | None = None,
|
||||
user_id: str | None = None,
|
||||
tenant_id: str | None = None,
|
||||
):
|
||||
"""
|
||||
Handle invoke result
|
||||
@@ -210,21 +225,41 @@ class AppRunner:
|
||||
:param queue_manager: application queue manager
|
||||
:param stream: stream
|
||||
:param agent: agent
|
||||
:param message_id: message id for multimodal output
|
||||
:param user_id: user id for multimodal output
|
||||
:param tenant_id: tenant id for multimodal output
|
||||
:return:
|
||||
"""
|
||||
if not stream and isinstance(invoke_result, LLMResult):
|
||||
self._handle_invoke_result_direct(invoke_result=invoke_result, queue_manager=queue_manager, agent=agent)
|
||||
self._handle_invoke_result_direct(
|
||||
invoke_result=invoke_result,
|
||||
queue_manager=queue_manager,
|
||||
)
|
||||
elif stream and isinstance(invoke_result, Generator):
|
||||
self._handle_invoke_result_stream(invoke_result=invoke_result, queue_manager=queue_manager, agent=agent)
|
||||
self._handle_invoke_result_stream(
|
||||
invoke_result=invoke_result,
|
||||
queue_manager=queue_manager,
|
||||
agent=agent,
|
||||
message_id=message_id,
|
||||
user_id=user_id,
|
||||
tenant_id=tenant_id,
|
||||
)
|
||||
else:
|
||||
raise NotImplementedError(f"unsupported invoke result type: {type(invoke_result)}")
|
||||
|
||||
def _handle_invoke_result_direct(self, invoke_result: LLMResult, queue_manager: AppQueueManager, agent: bool):
|
||||
def _handle_invoke_result_direct(
|
||||
self,
|
||||
invoke_result: LLMResult,
|
||||
queue_manager: AppQueueManager,
|
||||
):
|
||||
"""
|
||||
Handle invoke result direct
|
||||
:param invoke_result: invoke result
|
||||
:param queue_manager: application queue manager
|
||||
:param agent: agent
|
||||
:param message_id: message id for multimodal output
|
||||
:param user_id: user id for multimodal output
|
||||
:param tenant_id: tenant id for multimodal output
|
||||
:return:
|
||||
"""
|
||||
queue_manager.publish(
|
||||
@@ -235,13 +270,22 @@ class AppRunner:
|
||||
)
|
||||
|
||||
def _handle_invoke_result_stream(
|
||||
self, invoke_result: Generator[LLMResultChunk, None, None], queue_manager: AppQueueManager, agent: bool
|
||||
self,
|
||||
invoke_result: Generator[LLMResultChunk, None, None],
|
||||
queue_manager: AppQueueManager,
|
||||
agent: bool,
|
||||
message_id: str | None = None,
|
||||
user_id: str | None = None,
|
||||
tenant_id: str | None = None,
|
||||
):
|
||||
"""
|
||||
Handle invoke result
|
||||
:param invoke_result: invoke result
|
||||
:param queue_manager: application queue manager
|
||||
:param agent: agent
|
||||
:param message_id: message id for multimodal output
|
||||
:param user_id: user id for multimodal output
|
||||
:param tenant_id: tenant id for multimodal output
|
||||
:return:
|
||||
"""
|
||||
model: str = ""
|
||||
@@ -259,12 +303,26 @@ class AppRunner:
|
||||
text += message.content
|
||||
elif isinstance(message.content, list):
|
||||
for content in message.content:
|
||||
if not isinstance(content, str):
|
||||
# TODO(QuantumGhost): Add multimodal output support for easy ui.
|
||||
_logger.warning("received multimodal output, type=%s", type(content))
|
||||
if isinstance(content, str):
|
||||
text += content
|
||||
elif isinstance(content, TextPromptMessageContent):
|
||||
text += content.data
|
||||
elif isinstance(content, ImagePromptMessageContent):
|
||||
if message_id and user_id and tenant_id:
|
||||
try:
|
||||
self._handle_multimodal_image_content(
|
||||
content=content,
|
||||
message_id=message_id,
|
||||
user_id=user_id,
|
||||
tenant_id=tenant_id,
|
||||
queue_manager=queue_manager,
|
||||
)
|
||||
except Exception:
|
||||
_logger.exception("Failed to handle multimodal image output")
|
||||
else:
|
||||
_logger.warning("Received multimodal output but missing required parameters")
|
||||
else:
|
||||
text += content # failback to str
|
||||
text += content.data if hasattr(content, "data") else str(content)
|
||||
|
||||
if not model:
|
||||
model = result.model
|
||||
@@ -289,6 +347,101 @@ class AppRunner:
|
||||
PublishFrom.APPLICATION_MANAGER,
|
||||
)
|
||||
|
||||
def _handle_multimodal_image_content(
|
||||
self,
|
||||
content: ImagePromptMessageContent,
|
||||
message_id: str,
|
||||
user_id: str,
|
||||
tenant_id: str,
|
||||
queue_manager: AppQueueManager,
|
||||
):
|
||||
"""
|
||||
Handle multimodal image content from LLM response.
|
||||
Save the image and create a MessageFile record.
|
||||
|
||||
:param content: ImagePromptMessageContent instance
|
||||
:param message_id: message id
|
||||
:param user_id: user id
|
||||
:param tenant_id: tenant id
|
||||
:param queue_manager: queue manager
|
||||
:return:
|
||||
"""
|
||||
_logger.info("Handling multimodal image content for message %s", message_id)
|
||||
|
||||
image_url = content.url
|
||||
base64_data = content.base64_data
|
||||
|
||||
_logger.info("Image URL: %s, Base64 data present: %s", image_url, base64_data)
|
||||
|
||||
if not image_url and not base64_data:
|
||||
_logger.warning("Image content has neither URL nor base64 data")
|
||||
return
|
||||
|
||||
tool_file_manager = ToolFileManager()
|
||||
|
||||
# Save the image file
|
||||
try:
|
||||
if image_url:
|
||||
# Download image from URL
|
||||
_logger.info("Downloading image from URL: %s", image_url)
|
||||
tool_file = tool_file_manager.create_file_by_url(
|
||||
user_id=user_id,
|
||||
tenant_id=tenant_id,
|
||||
file_url=image_url,
|
||||
conversation_id=None,
|
||||
)
|
||||
_logger.info("Image saved successfully, tool_file_id: %s", tool_file.id)
|
||||
elif base64_data:
|
||||
if base64_data.startswith("data:"):
|
||||
base64_data = base64_data.split(",", 1)[1]
|
||||
|
||||
image_binary = base64.b64decode(base64_data)
|
||||
mimetype = content.mime_type or "image/png"
|
||||
extension = guess_extension(mimetype) or ".png"
|
||||
|
||||
tool_file = tool_file_manager.create_file_by_raw(
|
||||
user_id=user_id,
|
||||
tenant_id=tenant_id,
|
||||
conversation_id=None,
|
||||
file_binary=image_binary,
|
||||
mimetype=mimetype,
|
||||
filename=f"generated_image{extension}",
|
||||
)
|
||||
_logger.info("Image saved successfully, tool_file_id: %s", tool_file.id)
|
||||
else:
|
||||
return
|
||||
except Exception:
|
||||
_logger.exception("Failed to save image file")
|
||||
return
|
||||
|
||||
# Create MessageFile record
|
||||
message_file = MessageFile(
|
||||
message_id=message_id,
|
||||
type=FileType.IMAGE,
|
||||
transfer_method=FileTransferMethod.TOOL_FILE,
|
||||
belongs_to="assistant",
|
||||
url=f"/files/tools/{tool_file.id}",
|
||||
upload_file_id=tool_file.id,
|
||||
created_by_role=(
|
||||
CreatorUserRole.ACCOUNT
|
||||
if queue_manager.invoke_from in {InvokeFrom.DEBUGGER, InvokeFrom.EXPLORE}
|
||||
else CreatorUserRole.END_USER
|
||||
),
|
||||
created_by=user_id,
|
||||
)
|
||||
|
||||
db.session.add(message_file)
|
||||
db.session.commit()
|
||||
db.session.refresh(message_file)
|
||||
|
||||
# Publish QueueMessageFileEvent
|
||||
queue_manager.publish(
|
||||
QueueMessageFileEvent(message_file_id=message_file.id),
|
||||
PublishFrom.APPLICATION_MANAGER,
|
||||
)
|
||||
|
||||
_logger.info("QueueMessageFileEvent published for message_file_id: %s", message_file.id)
|
||||
|
||||
def moderation_for_inputs(
|
||||
self,
|
||||
*,
|
||||
|
||||
@@ -226,5 +226,10 @@ class ChatAppRunner(AppRunner):
|
||||
|
||||
# handle invoke result
|
||||
self._handle_invoke_result(
|
||||
invoke_result=invoke_result, queue_manager=queue_manager, stream=application_generate_entity.stream
|
||||
invoke_result=invoke_result,
|
||||
queue_manager=queue_manager,
|
||||
stream=application_generate_entity.stream,
|
||||
message_id=message.id,
|
||||
user_id=application_generate_entity.user_id,
|
||||
tenant_id=app_config.tenant_id,
|
||||
)
|
||||
|
||||
@@ -184,5 +184,10 @@ class CompletionAppRunner(AppRunner):
|
||||
|
||||
# handle invoke result
|
||||
self._handle_invoke_result(
|
||||
invoke_result=invoke_result, queue_manager=queue_manager, stream=application_generate_entity.stream
|
||||
invoke_result=invoke_result,
|
||||
queue_manager=queue_manager,
|
||||
stream=application_generate_entity.stream,
|
||||
message_id=message.id,
|
||||
user_id=application_generate_entity.user_id,
|
||||
tenant_id=app_config.tenant_id,
|
||||
)
|
||||
|
||||
@@ -1,9 +1,11 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import contextvars
|
||||
import logging
|
||||
import threading
|
||||
import uuid
|
||||
from collections.abc import Generator, Mapping, Sequence
|
||||
from typing import Any, Literal, Union, overload
|
||||
from typing import TYPE_CHECKING, Any, Literal, Union, overload
|
||||
|
||||
from flask import Flask, current_app
|
||||
from pydantic import ValidationError
|
||||
@@ -40,6 +42,9 @@ from models import Account, App, EndUser, Workflow, WorkflowNodeExecutionTrigger
|
||||
from models.enums import WorkflowRunTriggeredFrom
|
||||
from services.workflow_draft_variable_service import DraftVarLoader, WorkflowDraftVariableService
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from controllers.console.app.workflow import LoopNodeRunPayload
|
||||
|
||||
SKIP_PREPARE_USER_INPUTS_KEY = "_skip_prepare_user_inputs"
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
@@ -381,7 +386,7 @@ class WorkflowAppGenerator(BaseAppGenerator):
|
||||
workflow: Workflow,
|
||||
node_id: str,
|
||||
user: Account | EndUser,
|
||||
args: Mapping[str, Any],
|
||||
args: LoopNodeRunPayload,
|
||||
streaming: bool = True,
|
||||
) -> Mapping[str, Any] | Generator[str | Mapping[str, Any], None, None]:
|
||||
"""
|
||||
@@ -397,7 +402,7 @@ class WorkflowAppGenerator(BaseAppGenerator):
|
||||
if not node_id:
|
||||
raise ValueError("node_id is required")
|
||||
|
||||
if args.get("inputs") is None:
|
||||
if args.inputs is None:
|
||||
raise ValueError("inputs is required")
|
||||
|
||||
# convert to app config
|
||||
@@ -413,7 +418,7 @@ class WorkflowAppGenerator(BaseAppGenerator):
|
||||
stream=streaming,
|
||||
invoke_from=InvokeFrom.DEBUGGER,
|
||||
extras={"auto_generate_conversation_name": False},
|
||||
single_loop_run=WorkflowAppGenerateEntity.SingleLoopRunEntity(node_id=node_id, inputs=args["inputs"]),
|
||||
single_loop_run=WorkflowAppGenerateEntity.SingleLoopRunEntity(node_id=node_id, inputs=args.inputs or {}),
|
||||
workflow_execution_id=str(uuid.uuid4()),
|
||||
)
|
||||
contexts.plugin_tool_providers.set({})
|
||||
|
||||
@@ -166,18 +166,22 @@ class WorkflowBasedAppRunner:
|
||||
|
||||
# Determine which type of single node execution and get graph/variable_pool
|
||||
if single_iteration_run:
|
||||
graph, variable_pool = self._get_graph_and_variable_pool_of_single_iteration(
|
||||
graph, variable_pool = self._get_graph_and_variable_pool_for_single_node_run(
|
||||
workflow=workflow,
|
||||
node_id=single_iteration_run.node_id,
|
||||
user_inputs=dict(single_iteration_run.inputs),
|
||||
graph_runtime_state=graph_runtime_state,
|
||||
node_type_filter_key="iteration_id",
|
||||
node_type_label="iteration",
|
||||
)
|
||||
elif single_loop_run:
|
||||
graph, variable_pool = self._get_graph_and_variable_pool_of_single_loop(
|
||||
graph, variable_pool = self._get_graph_and_variable_pool_for_single_node_run(
|
||||
workflow=workflow,
|
||||
node_id=single_loop_run.node_id,
|
||||
user_inputs=dict(single_loop_run.inputs),
|
||||
graph_runtime_state=graph_runtime_state,
|
||||
node_type_filter_key="loop_id",
|
||||
node_type_label="loop",
|
||||
)
|
||||
else:
|
||||
raise ValueError("Neither single_iteration_run nor single_loop_run is specified")
|
||||
@@ -314,44 +318,6 @@ class WorkflowBasedAppRunner:
|
||||
|
||||
return graph, variable_pool
|
||||
|
||||
def _get_graph_and_variable_pool_of_single_iteration(
|
||||
self,
|
||||
workflow: Workflow,
|
||||
node_id: str,
|
||||
user_inputs: dict[str, Any],
|
||||
graph_runtime_state: GraphRuntimeState,
|
||||
) -> tuple[Graph, VariablePool]:
|
||||
"""
|
||||
Get variable pool of single iteration
|
||||
"""
|
||||
return self._get_graph_and_variable_pool_for_single_node_run(
|
||||
workflow=workflow,
|
||||
node_id=node_id,
|
||||
user_inputs=user_inputs,
|
||||
graph_runtime_state=graph_runtime_state,
|
||||
node_type_filter_key="iteration_id",
|
||||
node_type_label="iteration",
|
||||
)
|
||||
|
||||
def _get_graph_and_variable_pool_of_single_loop(
|
||||
self,
|
||||
workflow: Workflow,
|
||||
node_id: str,
|
||||
user_inputs: dict[str, Any],
|
||||
graph_runtime_state: GraphRuntimeState,
|
||||
) -> tuple[Graph, VariablePool]:
|
||||
"""
|
||||
Get variable pool of single loop
|
||||
"""
|
||||
return self._get_graph_and_variable_pool_for_single_node_run(
|
||||
workflow=workflow,
|
||||
node_id=node_id,
|
||||
user_inputs=user_inputs,
|
||||
graph_runtime_state=graph_runtime_state,
|
||||
node_type_filter_key="loop_id",
|
||||
node_type_label="loop",
|
||||
)
|
||||
|
||||
def _handle_event(self, workflow_entry: WorkflowEntry, event: GraphEngineEvent):
|
||||
"""
|
||||
Handle event
|
||||
|
||||
@@ -39,6 +39,7 @@ from core.app.entities.task_entities import (
|
||||
MessageAudioEndStreamResponse,
|
||||
MessageAudioStreamResponse,
|
||||
MessageEndStreamResponse,
|
||||
StreamEvent,
|
||||
StreamResponse,
|
||||
)
|
||||
from core.app.task_pipeline.based_generate_task_pipeline import BasedGenerateTaskPipeline
|
||||
@@ -70,6 +71,7 @@ class EasyUIBasedGenerateTaskPipeline(BasedGenerateTaskPipeline):
|
||||
|
||||
_task_state: EasyUITaskState
|
||||
_application_generate_entity: Union[ChatAppGenerateEntity, CompletionAppGenerateEntity, AgentChatAppGenerateEntity]
|
||||
_precomputed_event_type: StreamEvent | None = None
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
@@ -342,11 +344,15 @@ class EasyUIBasedGenerateTaskPipeline(BasedGenerateTaskPipeline):
|
||||
self._task_state.llm_result.message.content = current_content
|
||||
|
||||
if isinstance(event, QueueLLMChunkEvent):
|
||||
event_type = self._message_cycle_manager.get_message_event_type(message_id=self._message_id)
|
||||
# Determine the event type once, on first LLM chunk, and reuse for subsequent chunks
|
||||
if not hasattr(self, "_precomputed_event_type") or self._precomputed_event_type is None:
|
||||
self._precomputed_event_type = self._message_cycle_manager.get_message_event_type(
|
||||
message_id=self._message_id
|
||||
)
|
||||
yield self._message_cycle_manager.message_to_stream_response(
|
||||
answer=cast(str, delta_text),
|
||||
message_id=self._message_id,
|
||||
event_type=event_type,
|
||||
event_type=self._precomputed_event_type,
|
||||
)
|
||||
else:
|
||||
yield self._agent_message_to_stream_response(
|
||||
|
||||
@@ -5,7 +5,7 @@ from threading import Thread
|
||||
from typing import Union
|
||||
|
||||
from flask import Flask, current_app
|
||||
from sqlalchemy import exists, select
|
||||
from sqlalchemy import select
|
||||
from sqlalchemy.orm import Session
|
||||
|
||||
from configs import dify_config
|
||||
@@ -30,6 +30,7 @@ from core.app.entities.task_entities import (
|
||||
StreamEvent,
|
||||
WorkflowTaskState,
|
||||
)
|
||||
from core.db.session_factory import session_factory
|
||||
from core.llm_generator.llm_generator import LLMGenerator
|
||||
from core.tools.signature import sign_tool_file
|
||||
from extensions.ext_database import db
|
||||
@@ -57,13 +58,15 @@ class MessageCycleManager:
|
||||
self._message_has_file: set[str] = set()
|
||||
|
||||
def get_message_event_type(self, message_id: str) -> StreamEvent:
|
||||
# Fast path: cached determination from prior QueueMessageFileEvent
|
||||
if message_id in self._message_has_file:
|
||||
return StreamEvent.MESSAGE_FILE
|
||||
|
||||
with Session(db.engine, expire_on_commit=False) as session:
|
||||
has_file = session.query(exists().where(MessageFile.message_id == message_id)).scalar()
|
||||
# Use SQLAlchemy 2.x style session.scalar(select(...))
|
||||
with session_factory.create_session() as session:
|
||||
message_file = session.scalar(select(MessageFile).where(MessageFile.message_id == message_id))
|
||||
|
||||
if has_file:
|
||||
if message_file:
|
||||
self._message_has_file.add(message_id)
|
||||
return StreamEvent.MESSAGE_FILE
|
||||
|
||||
@@ -199,6 +202,8 @@ class MessageCycleManager:
|
||||
message_file = session.scalar(select(MessageFile).where(MessageFile.id == event.message_file_id))
|
||||
|
||||
if message_file and message_file.url is not None:
|
||||
self._message_has_file.add(message_file.message_id)
|
||||
|
||||
# get tool file id
|
||||
tool_file_id = message_file.url.split("/")[-1]
|
||||
# trim extension
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
from collections.abc import Generator, Mapping
|
||||
from collections.abc import Generator
|
||||
from typing import Any
|
||||
|
||||
from core.datasource.__base.datasource_plugin import DatasourcePlugin
|
||||
@@ -34,7 +34,7 @@ class OnlineDocumentDatasourcePlugin(DatasourcePlugin):
|
||||
def get_online_document_pages(
|
||||
self,
|
||||
user_id: str,
|
||||
datasource_parameters: Mapping[str, Any],
|
||||
datasource_parameters: dict[str, Any],
|
||||
provider_type: str,
|
||||
) -> Generator[OnlineDocumentPagesMessage, None, None]:
|
||||
manager = PluginDatasourceManager()
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
import logging
|
||||
import time
|
||||
import uuid
|
||||
from collections.abc import Generator, Sequence
|
||||
from collections.abc import Callable, Generator, Iterator, Sequence
|
||||
from typing import Union
|
||||
|
||||
from pydantic import ConfigDict
|
||||
@@ -30,6 +30,142 @@ def _gen_tool_call_id() -> str:
|
||||
return f"chatcmpl-tool-{str(uuid.uuid4().hex)}"
|
||||
|
||||
|
||||
def _run_callbacks(callbacks: Sequence[Callback] | None, *, event: str, invoke: Callable[[Callback], None]) -> None:
|
||||
if not callbacks:
|
||||
return
|
||||
|
||||
for callback in callbacks:
|
||||
try:
|
||||
invoke(callback)
|
||||
except Exception as e:
|
||||
if callback.raise_error:
|
||||
raise
|
||||
logger.warning("Callback %s %s failed with error %s", callback.__class__.__name__, event, e)
|
||||
|
||||
|
||||
def _get_or_create_tool_call(
|
||||
existing_tools_calls: list[AssistantPromptMessage.ToolCall],
|
||||
tool_call_id: str,
|
||||
) -> AssistantPromptMessage.ToolCall:
|
||||
"""
|
||||
Get or create a tool call by ID.
|
||||
|
||||
If `tool_call_id` is empty, returns the most recently created tool call.
|
||||
"""
|
||||
if not tool_call_id:
|
||||
if not existing_tools_calls:
|
||||
raise ValueError("tool_call_id is empty but no existing tool call is available to apply the delta")
|
||||
return existing_tools_calls[-1]
|
||||
|
||||
tool_call = next((tool_call for tool_call in existing_tools_calls if tool_call.id == tool_call_id), None)
|
||||
if tool_call is None:
|
||||
tool_call = AssistantPromptMessage.ToolCall(
|
||||
id=tool_call_id,
|
||||
type="function",
|
||||
function=AssistantPromptMessage.ToolCall.ToolCallFunction(name="", arguments=""),
|
||||
)
|
||||
existing_tools_calls.append(tool_call)
|
||||
|
||||
return tool_call
|
||||
|
||||
|
||||
def _merge_tool_call_delta(
|
||||
tool_call: AssistantPromptMessage.ToolCall,
|
||||
delta: AssistantPromptMessage.ToolCall,
|
||||
) -> None:
|
||||
if delta.id:
|
||||
tool_call.id = delta.id
|
||||
if delta.type:
|
||||
tool_call.type = delta.type
|
||||
if delta.function.name:
|
||||
tool_call.function.name = delta.function.name
|
||||
if delta.function.arguments:
|
||||
tool_call.function.arguments += delta.function.arguments
|
||||
|
||||
|
||||
def _build_llm_result_from_first_chunk(
|
||||
model: str,
|
||||
prompt_messages: Sequence[PromptMessage],
|
||||
chunks: Iterator[LLMResultChunk],
|
||||
) -> LLMResult:
|
||||
"""
|
||||
Build a single `LLMResult` from the first returned chunk.
|
||||
|
||||
This is used for `stream=False` because the plugin side may still implement the response via a chunked stream.
|
||||
"""
|
||||
content = ""
|
||||
content_list: list[PromptMessageContentUnionTypes] = []
|
||||
usage = LLMUsage.empty_usage()
|
||||
system_fingerprint: str | None = None
|
||||
tools_calls: list[AssistantPromptMessage.ToolCall] = []
|
||||
|
||||
first_chunk = next(chunks, None)
|
||||
if first_chunk is not None:
|
||||
if isinstance(first_chunk.delta.message.content, str):
|
||||
content += first_chunk.delta.message.content
|
||||
elif isinstance(first_chunk.delta.message.content, list):
|
||||
content_list.extend(first_chunk.delta.message.content)
|
||||
|
||||
if first_chunk.delta.message.tool_calls:
|
||||
_increase_tool_call(first_chunk.delta.message.tool_calls, tools_calls)
|
||||
|
||||
usage = first_chunk.delta.usage or LLMUsage.empty_usage()
|
||||
system_fingerprint = first_chunk.system_fingerprint
|
||||
|
||||
return LLMResult(
|
||||
model=model,
|
||||
prompt_messages=prompt_messages,
|
||||
message=AssistantPromptMessage(
|
||||
content=content or content_list,
|
||||
tool_calls=tools_calls,
|
||||
),
|
||||
usage=usage,
|
||||
system_fingerprint=system_fingerprint,
|
||||
)
|
||||
|
||||
|
||||
def _invoke_llm_via_plugin(
|
||||
*,
|
||||
tenant_id: str,
|
||||
user_id: str,
|
||||
plugin_id: str,
|
||||
provider: str,
|
||||
model: str,
|
||||
credentials: dict,
|
||||
model_parameters: dict,
|
||||
prompt_messages: Sequence[PromptMessage],
|
||||
tools: list[PromptMessageTool] | None,
|
||||
stop: Sequence[str] | None,
|
||||
stream: bool,
|
||||
) -> Union[LLMResult, Generator[LLMResultChunk, None, None]]:
|
||||
from core.plugin.impl.model import PluginModelClient
|
||||
|
||||
plugin_model_manager = PluginModelClient()
|
||||
return plugin_model_manager.invoke_llm(
|
||||
tenant_id=tenant_id,
|
||||
user_id=user_id,
|
||||
plugin_id=plugin_id,
|
||||
provider=provider,
|
||||
model=model,
|
||||
credentials=credentials,
|
||||
model_parameters=model_parameters,
|
||||
prompt_messages=list(prompt_messages),
|
||||
tools=tools,
|
||||
stop=list(stop) if stop else None,
|
||||
stream=stream,
|
||||
)
|
||||
|
||||
|
||||
def _normalize_non_stream_plugin_result(
|
||||
model: str,
|
||||
prompt_messages: Sequence[PromptMessage],
|
||||
result: Union[LLMResult, Iterator[LLMResultChunk]],
|
||||
) -> LLMResult:
|
||||
if isinstance(result, LLMResult):
|
||||
return result
|
||||
return _build_llm_result_from_first_chunk(model=model, prompt_messages=prompt_messages, chunks=result)
|
||||
|
||||
|
||||
def _increase_tool_call(
|
||||
new_tool_calls: list[AssistantPromptMessage.ToolCall], existing_tools_calls: list[AssistantPromptMessage.ToolCall]
|
||||
):
|
||||
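`_get_or_create_tool_call` and `_merge_tool_call_delta` replace the nested `get_tool_call` closure removed in the next hunk. A minimal sketch of how streamed tool-call deltas accumulate through them (the delta values are made up):

```python
from core.model_runtime.entities.message_entities import AssistantPromptMessage

ToolCall = AssistantPromptMessage.ToolCall
ToolCallFunction = AssistantPromptMessage.ToolCall.ToolCallFunction

calls: list[ToolCall] = []

# The first delta carries the id and function name; the follow-up delta has no id
# and only appends another fragment of the JSON arguments.
deltas = [
    ToolCall(id="call_1", type="function", function=ToolCallFunction(name="get_weather", arguments='{"city": ')),
    ToolCall(id="", type="", function=ToolCallFunction(name="", arguments='"Berlin"}')),
]

for delta in deltas:
    tool_call = _get_or_create_tool_call(calls, delta.id)
    _merge_tool_call_delta(tool_call, delta)

assert calls[0].function.arguments == '{"city": "Berlin"}'
```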
@@ -40,42 +176,13 @@ def _increase_tool_call(
|
||||
:param existing_tools_calls: List of existing tool calls to be modified IN-PLACE.
|
||||
"""
|
||||
|
||||
def get_tool_call(tool_call_id: str):
|
||||
"""
|
||||
Get or create a tool call by ID
|
||||
|
||||
:param tool_call_id: tool call ID
|
||||
:return: existing or new tool call
|
||||
"""
|
||||
if not tool_call_id:
|
||||
return existing_tools_calls[-1]
|
||||
|
||||
_tool_call = next((_tool_call for _tool_call in existing_tools_calls if _tool_call.id == tool_call_id), None)
|
||||
if _tool_call is None:
|
||||
_tool_call = AssistantPromptMessage.ToolCall(
|
||||
id=tool_call_id,
|
||||
type="function",
|
||||
function=AssistantPromptMessage.ToolCall.ToolCallFunction(name="", arguments=""),
|
||||
)
|
||||
existing_tools_calls.append(_tool_call)
|
||||
|
||||
return _tool_call
|
||||
|
||||
for new_tool_call in new_tool_calls:
|
||||
# generate ID for tool calls with function name but no ID to track them
|
||||
if new_tool_call.function.name and not new_tool_call.id:
|
||||
new_tool_call.id = _gen_tool_call_id()
|
||||
# get tool call
|
||||
tool_call = get_tool_call(new_tool_call.id)
|
||||
# update tool call
|
||||
if new_tool_call.id:
|
||||
tool_call.id = new_tool_call.id
|
||||
if new_tool_call.type:
|
||||
tool_call.type = new_tool_call.type
|
||||
if new_tool_call.function.name:
|
||||
tool_call.function.name = new_tool_call.function.name
|
||||
if new_tool_call.function.arguments:
|
||||
tool_call.function.arguments += new_tool_call.function.arguments
|
||||
|
||||
tool_call = _get_or_create_tool_call(existing_tools_calls, new_tool_call.id)
|
||||
_merge_tool_call_delta(tool_call, new_tool_call)
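The helpers `_get_or_create_tool_call` and `_merge_tool_call_delta` used above are not shown in this hunk; a minimal sketch of the merge step, assuming it mirrors the inlined logic it replaces, would be:

def _merge_tool_call_delta(
    tool_call: AssistantPromptMessage.ToolCall,
    delta: AssistantPromptMessage.ToolCall,
) -> None:
    # id/type/name are overwritten when present; arguments accumulate across chunks
    if delta.id:
        tool_call.id = delta.id
    if delta.type:
        tool_call.type = delta.type
    if delta.function.name:
        tool_call.function.name = delta.function.name
    if delta.function.arguments:
        tool_call.function.arguments += delta.function.arguments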
|
||||
|
||||
|
||||
class LargeLanguageModel(AIModel):
|
||||
@@ -141,10 +248,7 @@ class LargeLanguageModel(AIModel):
|
||||
result: Union[LLMResult, Generator[LLMResultChunk, None, None]]
|
||||
|
||||
try:
|
||||
from core.plugin.impl.model import PluginModelClient
|
||||
|
||||
plugin_model_manager = PluginModelClient()
|
||||
result = plugin_model_manager.invoke_llm(
|
||||
result = _invoke_llm_via_plugin(
|
||||
tenant_id=self.tenant_id,
|
||||
user_id=user or "unknown",
|
||||
plugin_id=self.plugin_id,
|
||||
@@ -154,38 +258,13 @@ class LargeLanguageModel(AIModel):
|
||||
model_parameters=model_parameters,
|
||||
prompt_messages=prompt_messages,
|
||||
tools=tools,
|
||||
stop=list(stop) if stop else None,
|
||||
stop=stop,
|
||||
stream=stream,
|
||||
)
|
||||
|
||||
if not stream:
|
||||
content = ""
|
||||
content_list = []
|
||||
usage = LLMUsage.empty_usage()
|
||||
system_fingerprint = None
|
||||
tools_calls: list[AssistantPromptMessage.ToolCall] = []
|
||||
|
||||
for chunk in result:
|
||||
if isinstance(chunk.delta.message.content, str):
|
||||
content += chunk.delta.message.content
|
||||
elif isinstance(chunk.delta.message.content, list):
|
||||
content_list.extend(chunk.delta.message.content)
|
||||
if chunk.delta.message.tool_calls:
|
||||
_increase_tool_call(chunk.delta.message.tool_calls, tools_calls)
|
||||
|
||||
usage = chunk.delta.usage or LLMUsage.empty_usage()
|
||||
system_fingerprint = chunk.system_fingerprint
|
||||
break
|
||||
|
||||
result = LLMResult(
|
||||
model=model,
|
||||
prompt_messages=prompt_messages,
|
||||
message=AssistantPromptMessage(
|
||||
content=content or content_list,
|
||||
tool_calls=tools_calls,
|
||||
),
|
||||
usage=usage,
|
||||
system_fingerprint=system_fingerprint,
|
||||
result = _normalize_non_stream_plugin_result(
|
||||
model=model, prompt_messages=prompt_messages, result=result
|
||||
)
|
||||
except Exception as e:
|
||||
self._trigger_invoke_error_callbacks(
|
||||
@@ -425,27 +504,21 @@ class LargeLanguageModel(AIModel):
|
||||
:param user: unique user id
|
||||
:param callbacks: callbacks
|
||||
"""
|
||||
if callbacks:
|
||||
for callback in callbacks:
|
||||
try:
|
||||
callback.on_before_invoke(
|
||||
llm_instance=self,
|
||||
model=model,
|
||||
credentials=credentials,
|
||||
prompt_messages=prompt_messages,
|
||||
model_parameters=model_parameters,
|
||||
tools=tools,
|
||||
stop=stop,
|
||||
stream=stream,
|
||||
user=user,
|
||||
)
|
||||
except Exception as e:
|
||||
if callback.raise_error:
|
||||
raise e
|
||||
else:
|
||||
logger.warning(
|
||||
"Callback %s on_before_invoke failed with error %s", callback.__class__.__name__, e
|
||||
)
|
||||
_run_callbacks(
|
||||
callbacks,
|
||||
event="on_before_invoke",
|
||||
invoke=lambda callback: callback.on_before_invoke(
|
||||
llm_instance=self,
|
||||
model=model,
|
||||
credentials=credentials,
|
||||
prompt_messages=prompt_messages,
|
||||
model_parameters=model_parameters,
|
||||
tools=tools,
|
||||
stop=stop,
|
||||
stream=stream,
|
||||
user=user,
|
||||
),
|
||||
)
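The `_run_callbacks` helper itself is defined outside this hunk; a hypothetical sketch consistent with the behaviour it replaces (respecting `callback.raise_error`, logging otherwise) might look like:

def _run_callbacks(callbacks, *, event: str, invoke) -> None:
    # Assumed shape only; the real definition is not part of this diff.
    for callback in callbacks or []:
        try:
            invoke(callback)
        except Exception as e:
            if callback.raise_error:
                raise
            logger.warning("Callback %s %s failed with error %s", callback.__class__.__name__, event, e)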
|
||||
|
||||
def _trigger_new_chunk_callbacks(
|
||||
self,
|
||||
@@ -473,26 +546,22 @@ class LargeLanguageModel(AIModel):
|
||||
:param stream: is stream response
|
||||
:param user: unique user id
|
||||
"""
|
||||
if callbacks:
|
||||
for callback in callbacks:
|
||||
try:
|
||||
callback.on_new_chunk(
|
||||
llm_instance=self,
|
||||
chunk=chunk,
|
||||
model=model,
|
||||
credentials=credentials,
|
||||
prompt_messages=prompt_messages,
|
||||
model_parameters=model_parameters,
|
||||
tools=tools,
|
||||
stop=stop,
|
||||
stream=stream,
|
||||
user=user,
|
||||
)
|
||||
except Exception as e:
|
||||
if callback.raise_error:
|
||||
raise e
|
||||
else:
|
||||
logger.warning("Callback %s on_new_chunk failed with error %s", callback.__class__.__name__, e)
|
||||
_run_callbacks(
|
||||
callbacks,
|
||||
event="on_new_chunk",
|
||||
invoke=lambda callback: callback.on_new_chunk(
|
||||
llm_instance=self,
|
||||
chunk=chunk,
|
||||
model=model,
|
||||
credentials=credentials,
|
||||
prompt_messages=prompt_messages,
|
||||
model_parameters=model_parameters,
|
||||
tools=tools,
|
||||
stop=stop,
|
||||
stream=stream,
|
||||
user=user,
|
||||
),
|
||||
)
|
||||
|
||||
def _trigger_after_invoke_callbacks(
|
||||
self,
|
||||
@@ -521,28 +590,22 @@ class LargeLanguageModel(AIModel):
|
||||
:param user: unique user id
|
||||
:param callbacks: callbacks
|
||||
"""
|
||||
if callbacks:
|
||||
for callback in callbacks:
|
||||
try:
|
||||
callback.on_after_invoke(
|
||||
llm_instance=self,
|
||||
result=result,
|
||||
model=model,
|
||||
credentials=credentials,
|
||||
prompt_messages=prompt_messages,
|
||||
model_parameters=model_parameters,
|
||||
tools=tools,
|
||||
stop=stop,
|
||||
stream=stream,
|
||||
user=user,
|
||||
)
|
||||
except Exception as e:
|
||||
if callback.raise_error:
|
||||
raise e
|
||||
else:
|
||||
logger.warning(
|
||||
"Callback %s on_after_invoke failed with error %s", callback.__class__.__name__, e
|
||||
)
|
||||
_run_callbacks(
|
||||
callbacks,
|
||||
event="on_after_invoke",
|
||||
invoke=lambda callback: callback.on_after_invoke(
|
||||
llm_instance=self,
|
||||
result=result,
|
||||
model=model,
|
||||
credentials=credentials,
|
||||
prompt_messages=prompt_messages,
|
||||
model_parameters=model_parameters,
|
||||
tools=tools,
|
||||
stop=stop,
|
||||
stream=stream,
|
||||
user=user,
|
||||
),
|
||||
)
|
||||
|
||||
def _trigger_invoke_error_callbacks(
|
||||
self,
|
||||
@@ -571,25 +634,19 @@ class LargeLanguageModel(AIModel):
|
||||
:param user: unique user id
|
||||
:param callbacks: callbacks
|
||||
"""
|
||||
if callbacks:
|
||||
for callback in callbacks:
|
||||
try:
|
||||
callback.on_invoke_error(
|
||||
llm_instance=self,
|
||||
ex=ex,
|
||||
model=model,
|
||||
credentials=credentials,
|
||||
prompt_messages=prompt_messages,
|
||||
model_parameters=model_parameters,
|
||||
tools=tools,
|
||||
stop=stop,
|
||||
stream=stream,
|
||||
user=user,
|
||||
)
|
||||
except Exception as e:
|
||||
if callback.raise_error:
|
||||
raise e
|
||||
else:
|
||||
logger.warning(
|
||||
"Callback %s on_invoke_error failed with error %s", callback.__class__.__name__, e
|
||||
)
|
||||
_run_callbacks(
|
||||
callbacks,
|
||||
event="on_invoke_error",
|
||||
invoke=lambda callback: callback.on_invoke_error(
|
||||
llm_instance=self,
|
||||
ex=ex,
|
||||
model=model,
|
||||
credentials=credentials,
|
||||
prompt_messages=prompt_messages,
|
||||
model_parameters=model_parameters,
|
||||
tools=tools,
|
||||
stop=stop,
|
||||
stream=stream,
|
||||
user=user,
|
||||
),
|
||||
)
|
||||
|
||||
@@ -154,7 +154,7 @@ class IrisConnectionPool:
|
||||
# Add to cache to skip future checks
|
||||
self._schemas_initialized.add(schema)
|
||||
|
||||
except Exception as e:
|
||||
except Exception:
|
||||
conn.rollback()
|
||||
logger.exception("Failed to ensure schema %s exists", schema)
|
||||
raise
|
||||
@@ -177,6 +177,9 @@ class IrisConnectionPool:
|
||||
class IrisVector(BaseVector):
|
||||
"""IRIS vector database implementation using native VECTOR type and HNSW indexing."""
|
||||
|
||||
# Fallback score for full-text search when Rank function unavailable or TEXT_INDEX disabled
|
||||
_FULL_TEXT_FALLBACK_SCORE = 0.5
|
||||
|
||||
def __init__(self, collection_name: str, config: IrisVectorConfig) -> None:
|
||||
super().__init__(collection_name)
|
||||
self.config = config
|
||||
@@ -272,41 +275,131 @@ class IrisVector(BaseVector):
|
||||
return docs
|
||||
|
||||
def search_by_full_text(self, query: str, **kwargs: Any) -> list[Document]:
|
||||
"""Search documents by full-text using iFind index or fallback to LIKE search."""
|
||||
"""Search documents by full-text using iFind index with BM25 relevance scoring.
|
||||
|
||||
When IRIS_TEXT_INDEX is enabled, this method uses the auto-generated Rank
|
||||
function from %iFind.Index.Basic to calculate BM25 relevance scores. The Rank
|
||||
function is automatically created with naming: {schema}.{table_name}_{index}Rank
|
||||
|
||||
Args:
|
||||
query: Search query string
|
||||
**kwargs: Optional parameters including top_k, document_ids_filter
|
||||
|
||||
Returns:
|
||||
List of Document objects with relevance scores in metadata["score"]
|
||||
"""
|
||||
top_k = kwargs.get("top_k", 5)
|
||||
document_ids_filter = kwargs.get("document_ids_filter")
|
||||
|
||||
with self._get_cursor() as cursor:
|
||||
if self.config.IRIS_TEXT_INDEX:
|
||||
# Use iFind full-text search with index
|
||||
# Use iFind full-text search with auto-generated Rank function
|
||||
text_index_name = f"idx_{self.table_name}_text"
|
||||
# IRIS removes underscores from function names
|
||||
table_no_underscore = self.table_name.replace("_", "")
|
||||
index_no_underscore = text_index_name.replace("_", "")
|
||||
rank_function = f"{self.schema}.{table_no_underscore}_{index_no_underscore}Rank"
|
||||
|
||||
# Build WHERE clause with document ID filter if provided
|
||||
where_clause = f"WHERE %ID %FIND search_index({text_index_name}, ?)"
|
||||
# First param for Rank function, second for FIND
|
||||
params = [query, query]
|
||||
|
||||
if document_ids_filter:
|
||||
# Add document ID filter
|
||||
placeholders = ",".join("?" * len(document_ids_filter))
|
||||
where_clause += f" AND JSON_VALUE(meta, '$.document_id') IN ({placeholders})"
|
||||
params.extend(document_ids_filter)
|
||||
|
||||
sql = f"""
|
||||
SELECT TOP {top_k} id, text, meta
|
||||
SELECT TOP {top_k}
|
||||
id,
|
||||
text,
|
||||
meta,
|
||||
{rank_function}(%ID, ?) AS score
|
||||
FROM {self.schema}.{self.table_name}
|
||||
WHERE %ID %FIND search_index({text_index_name}, ?)
|
||||
{where_clause}
|
||||
ORDER BY score DESC
|
||||
"""
|
||||
cursor.execute(sql, (query,))
|
||||
|
||||
logger.debug(
|
||||
"iFind search: query='%s', index='%s', rank='%s'",
|
||||
query,
|
||||
text_index_name,
|
||||
rank_function,
|
||||
)
|
||||
|
||||
try:
|
||||
cursor.execute(sql, params)
|
||||
except Exception: # pylint: disable=broad-exception-caught
|
||||
# Fallback to query without Rank function if it fails
|
||||
logger.warning(
|
||||
"Rank function '%s' failed, using fixed score",
|
||||
rank_function,
|
||||
exc_info=True,
|
||||
)
|
||||
sql_fallback = f"""
|
||||
SELECT TOP {top_k} id, text, meta, {self._FULL_TEXT_FALLBACK_SCORE} AS score
|
||||
FROM {self.schema}.{self.table_name}
|
||||
{where_clause}
|
||||
"""
|
||||
# Skip first param (for Rank function)
|
||||
cursor.execute(sql_fallback, params[1:])
|
||||
else:
|
||||
# Fallback to LIKE search (inefficient for large datasets)
|
||||
# Escape special characters for LIKE clause to prevent SQL injection
|
||||
from libs.helper import escape_like_pattern
|
||||
# Fallback to LIKE search (IRIS_TEXT_INDEX disabled)
|
||||
from libs.helper import ( # pylint: disable=import-outside-toplevel
|
||||
escape_like_pattern,
|
||||
)
|
||||
|
||||
escaped_query = escape_like_pattern(query)
|
||||
query_pattern = f"%{escaped_query}%"
|
||||
|
||||
# Build WHERE clause with document ID filter if provided
|
||||
where_clause = "WHERE text LIKE ? ESCAPE '\\\\'"
|
||||
params = [query_pattern]
|
||||
|
||||
if document_ids_filter:
|
||||
placeholders = ",".join("?" * len(document_ids_filter))
|
||||
where_clause += f" AND JSON_VALUE(meta, '$.document_id') IN ({placeholders})"
|
||||
params.extend(document_ids_filter)
|
||||
|
||||
sql = f"""
|
||||
SELECT TOP {top_k} id, text, meta
|
||||
SELECT TOP {top_k} id, text, meta, {self._FULL_TEXT_FALLBACK_SCORE} AS score
|
||||
FROM {self.schema}.{self.table_name}
|
||||
WHERE text LIKE ? ESCAPE '\\'
|
||||
{where_clause}
|
||||
ORDER BY LENGTH(text) ASC
|
||||
"""
|
||||
cursor.execute(sql, (query_pattern,))
|
||||
|
||||
logger.debug(
|
||||
"LIKE fallback (TEXT_INDEX disabled): query='%s'",
|
||||
query_pattern,
|
||||
)
|
||||
cursor.execute(sql, params)
|
||||
|
||||
docs = []
|
||||
for row in cursor.fetchall():
|
||||
if len(row) >= 3:
|
||||
metadata = json.loads(row[2]) if row[2] else {}
|
||||
docs.append(Document(page_content=row[1], metadata=metadata))
|
||||
# Expecting 4 columns: id, text, meta, score
|
||||
if len(row) >= 4:
|
||||
text_content = row[1]
|
||||
meta_str = row[2]
|
||||
score_value = row[3]
|
||||
|
||||
metadata = json.loads(meta_str) if meta_str else {}
|
||||
# Add score to metadata for hybrid search compatibility
|
||||
score = float(score_value) if score_value is not None else 0.0
|
||||
metadata["score"] = score
|
||||
|
||||
docs.append(Document(page_content=text_content, metadata=metadata))
|
||||
|
||||
logger.info(
|
||||
"Full-text search completed: query='%s', results=%d/%d",
|
||||
query,
|
||||
len(docs),
|
||||
top_k,
|
||||
)
|
||||
|
||||
if not docs:
|
||||
logger.info("Full-text search for '%s' returned no results", query)
|
||||
logger.warning("Full-text search for '%s' returned no results", query)
|
||||
|
||||
return docs
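As a worked example of the naming rule above (IRIS strips underscores before appending `Rank`), assuming a hypothetical collection table `embedding_abc123` in schema `SQLUser`:

table_name = "embedding_abc123"                 # hypothetical collection table
text_index_name = f"idx_{table_name}_text"
rank_function = f"SQLUser.{table_name.replace('_', '')}_{text_index_name.replace('_', '')}Rank"
# -> "SQLUser.embeddingabc123_idxembeddingabc123textRank"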
|
||||
|
||||
@@ -370,7 +463,11 @@ class IrisVector(BaseVector):
|
||||
AS %iFind.Index.Basic
|
||||
(LANGUAGE = '{language}', LOWER = 1, INDEXOPTION = 0)
|
||||
"""
|
||||
logger.info("Creating text index: %s with language: %s", text_index_name, language)
|
||||
logger.info(
|
||||
"Creating text index: %s with language: %s",
|
||||
text_index_name,
|
||||
language,
|
||||
)
|
||||
logger.info("SQL for text index: %s", sql_text_index)
|
||||
cursor.execute(sql_text_index)
|
||||
logger.info("Text index created successfully: %s", text_index_name)
|
||||
|
||||
@@ -130,7 +130,7 @@ class ToolInvokeMessage(BaseModel):
|
||||
text: str
|
||||
|
||||
class JsonMessage(BaseModel):
|
||||
json_object: dict
|
||||
json_object: dict | list
|
||||
suppress_output: bool = Field(default=False, description="Whether to suppress JSON output in result string")
|
||||
|
||||
class BlobMessage(BaseModel):
|
||||
@@ -144,7 +144,14 @@ class ToolInvokeMessage(BaseModel):
|
||||
end: bool = Field(..., description="Whether the chunk is the last chunk")
|
||||
|
||||
class FileMessage(BaseModel):
|
||||
pass
|
||||
file_marker: str = Field(default="file_marker")
|
||||
|
||||
@model_validator(mode="before")
|
||||
@classmethod
|
||||
def validate_file_message(cls, values):
|
||||
if isinstance(values, dict) and "file_marker" not in values:
|
||||
raise ValueError("Invalid FileMessage: missing file_marker")
|
||||
return values
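Without any field, an arbitrary dict could validate as a FileMessage inside a union; the sentinel plus validator make the match explicit. A sketch of the effect, assuming standard pydantic v2 semantics:

from pydantic import ValidationError

ToolInvokeMessage.FileMessage(file_marker="file_marker")      # accepted
try:
    ToolInvokeMessage.FileMessage.model_validate({"unexpected": 1})
except ValidationError:
    pass  # rejected: the before-validator raises because file_marker is missing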
|
||||
|
||||
class VariableMessage(BaseModel):
|
||||
variable_name: str = Field(..., description="The name of the variable")
|
||||
@@ -234,10 +241,22 @@ class ToolInvokeMessage(BaseModel):
|
||||
|
||||
@field_validator("message", mode="before")
|
||||
@classmethod
|
||||
def decode_blob_message(cls, v):
|
||||
def decode_blob_message(cls, v, info: ValidationInfo):
|
||||
# Handle blob decoding
|
||||
if isinstance(v, dict) and "blob" in v:
|
||||
with contextlib.suppress(Exception):
|
||||
v["blob"] = base64.b64decode(v["blob"])
|
||||
|
||||
# Force correct message type based on type field
|
||||
# Only wrap dict types to avoid wrapping already parsed Pydantic model objects
|
||||
if info.data and isinstance(info.data, dict) and isinstance(v, dict):
|
||||
msg_type = info.data.get("type")
|
||||
if msg_type == cls.MessageType.JSON:
|
||||
if "json_object" not in v:
|
||||
v = {"json_object": v}
|
||||
elif msg_type == cls.MessageType.FILE:
|
||||
v = {"file_marker": "file_marker"}
|
||||
|
||||
return v
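For illustration, the wrapping rule applied when the declared type is JSON can be shown standalone; `_coerce_json_payload` here is a hypothetical helper, not part of the diff:

def _coerce_json_payload(payload: dict) -> dict:
    # Same rule as decode_blob_message: wrap raw payloads that lack json_object.
    return payload if "json_object" in payload else {"json_object": payload}

assert _coerce_json_payload({"result": 1}) == {"json_object": {"result": 1}}
assert _coerce_json_payload({"json_object": [1, 2]}) == {"json_object": [1, 2]}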
|
||||
|
||||
@field_serializer("message")
|
||||
|
||||
@@ -494,7 +494,7 @@ class AgentNode(Node[AgentNodeData]):
|
||||
|
||||
text = ""
|
||||
files: list[File] = []
|
||||
json_list: list[dict] = []
|
||||
json_list: list[dict | list] = []
|
||||
|
||||
agent_logs: list[AgentLogEvent] = []
|
||||
agent_execution_metadata: Mapping[WorkflowNodeExecutionMetadataKey, Any] = {}
|
||||
@@ -568,13 +568,18 @@ class AgentNode(Node[AgentNodeData]):
|
||||
elif message.type == ToolInvokeMessage.MessageType.JSON:
|
||||
assert isinstance(message.message, ToolInvokeMessage.JsonMessage)
|
||||
if node_type == NodeType.AGENT:
|
||||
msg_metadata: dict[str, Any] = message.message.json_object.pop("execution_metadata", {})
|
||||
llm_usage = LLMUsage.from_metadata(cast(LLMUsageMetadata, msg_metadata))
|
||||
agent_execution_metadata = {
|
||||
WorkflowNodeExecutionMetadataKey(key): value
|
||||
for key, value in msg_metadata.items()
|
||||
if key in WorkflowNodeExecutionMetadataKey.__members__.values()
|
||||
}
|
||||
if isinstance(message.message.json_object, dict):
|
||||
msg_metadata: dict[str, Any] = message.message.json_object.pop("execution_metadata", {})
|
||||
llm_usage = LLMUsage.from_metadata(cast(LLMUsageMetadata, msg_metadata))
|
||||
agent_execution_metadata = {
|
||||
WorkflowNodeExecutionMetadataKey(key): value
|
||||
for key, value in msg_metadata.items()
|
||||
if key in WorkflowNodeExecutionMetadataKey.__members__.values()
|
||||
}
|
||||
else:
|
||||
msg_metadata = {}
|
||||
llm_usage = LLMUsage.empty_usage()
|
||||
agent_execution_metadata = {}
|
||||
if message.message.json_object:
|
||||
json_list.append(message.message.json_object)
|
||||
elif message.type == ToolInvokeMessage.MessageType.LINK:
|
||||
@@ -683,7 +688,7 @@ class AgentNode(Node[AgentNodeData]):
|
||||
yield agent_log
|
||||
|
||||
# Add agent_logs to outputs['json'] to ensure frontend can access thinking process
|
||||
json_output: list[dict[str, Any]] = []
|
||||
json_output: list[dict[str, Any] | list[Any]] = []
|
||||
|
||||
# Step 1: append each agent log as its own dict.
|
||||
if agent_logs:
|
||||
|
||||
@@ -301,7 +301,7 @@ class DatasourceNode(Node[DatasourceNodeData]):
|
||||
|
||||
text = ""
|
||||
files: list[File] = []
|
||||
json: list[dict] = []
|
||||
json: list[dict | list] = []
|
||||
|
||||
variables: dict[str, Any] = {}
|
||||
|
||||
|
||||
@@ -244,7 +244,7 @@ class ToolNode(Node[ToolNodeData]):
|
||||
|
||||
text = ""
|
||||
files: list[File] = []
|
||||
json: list[dict] = []
|
||||
json: list[dict | list] = []
|
||||
|
||||
variables: dict[str, Any] = {}
|
||||
|
||||
@@ -400,7 +400,7 @@ class ToolNode(Node[ToolNodeData]):
|
||||
message.message.metadata = dict_metadata
|
||||
|
||||
# Add agent_logs to outputs['json'] to ensure frontend can access thinking process
|
||||
json_output: list[dict[str, Any]] = []
|
||||
json_output: list[dict[str, Any] | list[Any]] = []
|
||||
|
||||
# Step 2: normalize JSON into {"data": [...]}.change json to list[dict]
|
||||
if json:
|
||||
|
||||
api/enums/hosted_provider.py (new file, 21 lines)
@@ -0,0 +1,21 @@
|
||||
from enum import StrEnum
|
||||
|
||||
|
||||
class HostedTrialProvider(StrEnum):
|
||||
"""
|
||||
Enum representing hosted model provider names for trial access.
|
||||
"""
|
||||
|
||||
OPENAI = "langgenius/openai/openai"
|
||||
ANTHROPIC = "langgenius/anthropic/anthropic"
|
||||
GEMINI = "langgenius/gemini/google"
|
||||
X = "langgenius/x/x"
|
||||
DEEPSEEK = "langgenius/deepseek/deepseek"
|
||||
TONGYI = "langgenius/tongyi/tongyi"
|
||||
|
||||
@property
|
||||
def config_key(self) -> str:
|
||||
"""Return the config key used in dify_config (e.g., HOSTED_{config_key}_PAID_ENABLED)."""
|
||||
if self == HostedTrialProvider.X:
|
||||
return "XAI"
|
||||
return self.name
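A usage sketch of the mapping, assuming the `HOSTED_{config_key}_PAID_ENABLED` convention noted in the docstring:

provider = HostedTrialProvider.X
setting_name = f"HOSTED_{provider.config_key}_PAID_ENABLED"
# -> "HOSTED_XAI_PAID_ENABLED"; other members map to their enum name,
#    e.g. HostedTrialProvider.OPENAI -> "HOSTED_OPENAI_PAID_ENABLED"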
|
||||
@@ -4,6 +4,7 @@ from dify_app import DifyApp
|
||||
def init_app(app: DifyApp):
|
||||
from commands import (
|
||||
add_qdrant_index,
|
||||
archive_workflow_runs,
|
||||
clean_expired_messages,
|
||||
clean_workflow_runs,
|
||||
cleanup_orphaned_draft_variables,
|
||||
@@ -11,6 +12,7 @@ def init_app(app: DifyApp):
|
||||
clear_orphaned_file_records,
|
||||
convert_to_agent_apps,
|
||||
create_tenant,
|
||||
delete_archived_workflow_runs,
|
||||
extract_plugins,
|
||||
extract_unique_plugins,
|
||||
file_usage,
|
||||
@@ -24,6 +26,7 @@ def init_app(app: DifyApp):
|
||||
reset_email,
|
||||
reset_encrypt_key_pair,
|
||||
reset_password,
|
||||
restore_workflow_runs,
|
||||
setup_datasource_oauth_client,
|
||||
setup_system_tool_oauth_client,
|
||||
setup_system_trigger_oauth_client,
|
||||
@@ -58,6 +61,9 @@ def init_app(app: DifyApp):
|
||||
setup_datasource_oauth_client,
|
||||
transform_datasource_credentials,
|
||||
install_rag_pipeline_plugins,
|
||||
archive_workflow_runs,
|
||||
delete_archived_workflow_runs,
|
||||
restore_workflow_runs,
|
||||
clean_workflow_runs,
|
||||
clean_expired_messages,
|
||||
]
|
||||
|
||||
api/extensions/ext_fastopenapi.py (new file, 45 lines)
@@ -0,0 +1,45 @@
|
||||
from fastopenapi.routers import FlaskRouter
|
||||
from flask_cors import CORS
|
||||
|
||||
from configs import dify_config
|
||||
from controllers.fastopenapi import console_router
|
||||
from dify_app import DifyApp
|
||||
from extensions.ext_blueprints import AUTHENTICATED_HEADERS, EXPOSED_HEADERS
|
||||
|
||||
DOCS_PREFIX = "/fastopenapi"
|
||||
|
||||
|
||||
def init_app(app: DifyApp) -> None:
|
||||
docs_enabled = dify_config.SWAGGER_UI_ENABLED
|
||||
docs_url = f"{DOCS_PREFIX}/docs" if docs_enabled else None
|
||||
redoc_url = f"{DOCS_PREFIX}/redoc" if docs_enabled else None
|
||||
openapi_url = f"{DOCS_PREFIX}/openapi.json" if docs_enabled else None
|
||||
|
||||
router = FlaskRouter(
|
||||
app=app,
|
||||
docs_url=docs_url,
|
||||
redoc_url=redoc_url,
|
||||
openapi_url=openapi_url,
|
||||
openapi_version="3.0.0",
|
||||
title="Dify API (FastOpenAPI PoC)",
|
||||
version="1.0",
|
||||
description="FastOpenAPI proof of concept for Dify API",
|
||||
)
|
||||
|
||||
# Ensure route decorators are evaluated.
|
||||
import controllers.console.ping as ping_module
|
||||
from controllers.console import setup
|
||||
|
||||
_ = ping_module
|
||||
_ = setup
|
||||
|
||||
router.include_router(console_router, prefix="/console/api")
|
||||
CORS(
|
||||
app,
|
||||
resources={r"/console/api/*": {"origins": dify_config.CONSOLE_CORS_ALLOW_ORIGINS}},
|
||||
supports_credentials=True,
|
||||
allow_headers=list(AUTHENTICATED_HEADERS),
|
||||
methods=["GET", "PUT", "POST", "DELETE", "OPTIONS", "PATCH"],
|
||||
expose_headers=list(EXPOSED_HEADERS),
|
||||
)
|
||||
app.extensions["fastopenapi"] = router
|
||||
@@ -2,7 +2,12 @@ from flask_restx import Namespace, fields
|
||||
|
||||
from fields.end_user_fields import build_simple_end_user_model, simple_end_user_fields
|
||||
from fields.member_fields import build_simple_account_model, simple_account_fields
|
||||
from fields.workflow_run_fields import build_workflow_run_for_log_model, workflow_run_for_log_fields
|
||||
from fields.workflow_run_fields import (
|
||||
build_workflow_run_for_archived_log_model,
|
||||
build_workflow_run_for_log_model,
|
||||
workflow_run_for_archived_log_fields,
|
||||
workflow_run_for_log_fields,
|
||||
)
|
||||
from libs.helper import TimestampField
|
||||
|
||||
workflow_app_log_partial_fields = {
|
||||
@@ -34,6 +39,33 @@ def build_workflow_app_log_partial_model(api_or_ns: Namespace):
|
||||
return api_or_ns.model("WorkflowAppLogPartial", copied_fields)
|
||||
|
||||
|
||||
workflow_archived_log_partial_fields = {
|
||||
"id": fields.String,
|
||||
"workflow_run": fields.Nested(workflow_run_for_archived_log_fields, allow_null=True),
|
||||
"trigger_metadata": fields.Raw,
|
||||
"created_by_account": fields.Nested(simple_account_fields, attribute="created_by_account", allow_null=True),
|
||||
"created_by_end_user": fields.Nested(simple_end_user_fields, attribute="created_by_end_user", allow_null=True),
|
||||
"created_at": TimestampField,
|
||||
}
|
||||
|
||||
|
||||
def build_workflow_archived_log_partial_model(api_or_ns: Namespace):
|
||||
"""Build the workflow archived log partial model for the API or Namespace."""
|
||||
workflow_run_model = build_workflow_run_for_archived_log_model(api_or_ns)
|
||||
simple_account_model = build_simple_account_model(api_or_ns)
|
||||
simple_end_user_model = build_simple_end_user_model(api_or_ns)
|
||||
|
||||
copied_fields = workflow_archived_log_partial_fields.copy()
|
||||
copied_fields["workflow_run"] = fields.Nested(workflow_run_model, allow_null=True)
|
||||
copied_fields["created_by_account"] = fields.Nested(
|
||||
simple_account_model, attribute="created_by_account", allow_null=True
|
||||
)
|
||||
copied_fields["created_by_end_user"] = fields.Nested(
|
||||
simple_end_user_model, attribute="created_by_end_user", allow_null=True
|
||||
)
|
||||
return api_or_ns.model("WorkflowArchivedLogPartial", copied_fields)
|
||||
|
||||
|
||||
workflow_app_log_pagination_fields = {
|
||||
"page": fields.Integer,
|
||||
"limit": fields.Integer,
|
||||
@@ -51,3 +83,21 @@ def build_workflow_app_log_pagination_model(api_or_ns: Namespace):
|
||||
copied_fields = workflow_app_log_pagination_fields.copy()
|
||||
copied_fields["data"] = fields.List(fields.Nested(workflow_app_log_partial_model))
|
||||
return api_or_ns.model("WorkflowAppLogPagination", copied_fields)
|
||||
|
||||
|
||||
workflow_archived_log_pagination_fields = {
|
||||
"page": fields.Integer,
|
||||
"limit": fields.Integer,
|
||||
"total": fields.Integer,
|
||||
"has_more": fields.Boolean,
|
||||
"data": fields.List(fields.Nested(workflow_archived_log_partial_fields)),
|
||||
}
|
||||
|
||||
|
||||
def build_workflow_archived_log_pagination_model(api_or_ns: Namespace):
|
||||
"""Build the workflow archived log pagination model for the API or Namespace."""
|
||||
workflow_archived_log_partial_model = build_workflow_archived_log_partial_model(api_or_ns)
|
||||
|
||||
copied_fields = workflow_archived_log_pagination_fields.copy()
|
||||
copied_fields["data"] = fields.List(fields.Nested(workflow_archived_log_partial_model))
|
||||
return api_or_ns.model("WorkflowArchivedLogPagination", copied_fields)
|
||||
|
||||
@@ -23,6 +23,19 @@ def build_workflow_run_for_log_model(api_or_ns: Namespace):
|
||||
return api_or_ns.model("WorkflowRunForLog", workflow_run_for_log_fields)
|
||||
|
||||
|
||||
workflow_run_for_archived_log_fields = {
|
||||
"id": fields.String,
|
||||
"status": fields.String,
|
||||
"triggered_from": fields.String,
|
||||
"elapsed_time": fields.Float,
|
||||
"total_tokens": fields.Integer,
|
||||
}
|
||||
|
||||
|
||||
def build_workflow_run_for_archived_log_model(api_or_ns: Namespace):
|
||||
return api_or_ns.model("WorkflowRunForArchivedLog", workflow_run_for_archived_log_fields)
|
||||
|
||||
|
||||
workflow_run_for_list_fields = {
|
||||
"id": fields.String,
|
||||
"version": fields.String,
|
||||
|
||||
@@ -7,7 +7,6 @@ to S3-compatible object storage.
|
||||
|
||||
import base64
|
||||
import datetime
|
||||
import gzip
|
||||
import hashlib
|
||||
import logging
|
||||
from collections.abc import Generator
|
||||
@@ -39,7 +38,7 @@ class ArchiveStorage:
|
||||
"""
|
||||
S3-compatible storage client for archiving or exporting.
|
||||
|
||||
This client provides methods for storing and retrieving archived data in JSONL+gzip format.
|
||||
This client provides methods for storing and retrieving archived data in JSONL format.
|
||||
"""
|
||||
|
||||
def __init__(self, bucket: str):
|
||||
@@ -69,7 +68,10 @@ class ArchiveStorage:
|
||||
aws_access_key_id=dify_config.ARCHIVE_STORAGE_ACCESS_KEY,
|
||||
aws_secret_access_key=dify_config.ARCHIVE_STORAGE_SECRET_KEY,
|
||||
region_name=dify_config.ARCHIVE_STORAGE_REGION,
|
||||
config=Config(s3={"addressing_style": "path"}),
|
||||
config=Config(
|
||||
s3={"addressing_style": "path"},
|
||||
max_pool_connections=64,
|
||||
),
|
||||
)
|
||||
|
||||
# Verify bucket accessibility
|
||||
@@ -100,12 +102,18 @@ class ArchiveStorage:
|
||||
"""
|
||||
checksum = hashlib.md5(data).hexdigest()
|
||||
try:
|
||||
self.client.put_object(
|
||||
response = self.client.put_object(
|
||||
Bucket=self.bucket,
|
||||
Key=key,
|
||||
Body=data,
|
||||
ContentMD5=self._content_md5(data),
|
||||
)
|
||||
etag = response.get("ETag")
|
||||
if not etag:
|
||||
raise ArchiveStorageError(f"Missing ETag for '{key}'")
|
||||
normalized_etag = etag.strip('"')
|
||||
if normalized_etag != checksum:
|
||||
raise ArchiveStorageError(f"ETag mismatch for '{key}': expected={checksum}, actual={normalized_etag}")
|
||||
logger.debug("Uploaded object: %s (size=%d, checksum=%s)", key, len(data), checksum)
|
||||
return checksum
|
||||
except ClientError as e:
|
||||
@@ -240,19 +248,18 @@ class ArchiveStorage:
|
||||
return base64.b64encode(hashlib.md5(data).digest()).decode()
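For reference, the two digests differ only in encoding; a short sketch of the comparison performed in `put_object` (hex checksum against the stripped ETag, base64 digest as the Content-MD5 header), with a hypothetical payload:

import base64
import hashlib

data = b"example payload"                         # hypothetical object body
checksum = hashlib.md5(data).hexdigest()          # compared against the ETag without quotes
content_md5 = base64.b64encode(hashlib.md5(data).digest()).decode()  # sent as ContentMD5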
|
||||
|
||||
@staticmethod
|
||||
def serialize_to_jsonl_gz(records: list[dict[str, Any]]) -> bytes:
|
||||
def serialize_to_jsonl(records: list[dict[str, Any]]) -> bytes:
|
||||
"""
|
||||
Serialize records to gzipped JSONL format.
|
||||
Serialize records to JSONL format.
|
||||
|
||||
Args:
|
||||
records: List of dictionaries to serialize
|
||||
|
||||
Returns:
|
||||
Gzipped JSONL bytes
|
||||
JSONL bytes
|
||||
"""
|
||||
lines = []
|
||||
for record in records:
|
||||
# Convert datetime objects to ISO format strings
|
||||
serialized = ArchiveStorage._serialize_record(record)
|
||||
lines.append(orjson.dumps(serialized))
|
||||
|
||||
@@ -260,23 +267,22 @@ class ArchiveStorage:
|
||||
if jsonl_content:
|
||||
jsonl_content += b"\n"
|
||||
|
||||
return gzip.compress(jsonl_content)
|
||||
return jsonl_content
|
||||
|
||||
@staticmethod
|
||||
def deserialize_from_jsonl_gz(data: bytes) -> list[dict[str, Any]]:
|
||||
def deserialize_from_jsonl(data: bytes) -> list[dict[str, Any]]:
|
||||
"""
|
||||
Deserialize gzipped JSONL data to records.
|
||||
Deserialize JSONL data to records.
|
||||
|
||||
Args:
|
||||
data: Gzipped JSONL bytes
|
||||
data: JSONL bytes
|
||||
|
||||
Returns:
|
||||
List of dictionaries
|
||||
"""
|
||||
jsonl_content = gzip.decompress(data)
|
||||
records = []
|
||||
|
||||
for line in jsonl_content.splitlines():
|
||||
for line in data.splitlines():
|
||||
if line:
|
||||
records.append(orjson.loads(line))
|
||||
|
||||
|
||||
@@ -0,0 +1,95 @@
|
||||
"""create workflow_archive_logs
|
||||
|
||||
Revision ID: 9d77545f524e
|
||||
Revises: f9f6d18a37f9
|
||||
Create Date: 2026-01-06 17:18:56.292479
|
||||
|
||||
"""
|
||||
from alembic import op
|
||||
import models as models
|
||||
import sqlalchemy as sa
|
||||
|
||||
|
||||
def _is_pg(conn):
|
||||
return conn.dialect.name == "postgresql"
|
||||
|
||||
# revision identifiers, used by Alembic.
|
||||
revision = '9d77545f524e'
|
||||
down_revision = 'f9f6d18a37f9'
|
||||
branch_labels = None
|
||||
depends_on = None
|
||||
|
||||
|
||||
def upgrade():
|
||||
# ### commands auto generated by Alembic - please adjust! ###
|
||||
conn = op.get_bind()
|
||||
if _is_pg(conn):
|
||||
op.create_table('workflow_archive_logs',
|
||||
sa.Column('id', models.types.StringUUID(), server_default=sa.text('uuidv7()'), nullable=False),
|
||||
sa.Column('log_id', models.types.StringUUID(), nullable=True),
|
||||
sa.Column('tenant_id', models.types.StringUUID(), nullable=False),
|
||||
sa.Column('app_id', models.types.StringUUID(), nullable=False),
|
||||
sa.Column('workflow_id', models.types.StringUUID(), nullable=False),
|
||||
sa.Column('workflow_run_id', models.types.StringUUID(), nullable=False),
|
||||
sa.Column('created_by_role', sa.String(length=255), nullable=False),
|
||||
sa.Column('created_by', models.types.StringUUID(), nullable=False),
|
||||
sa.Column('log_created_at', sa.DateTime(), nullable=True),
|
||||
sa.Column('log_created_from', sa.String(length=255), nullable=True),
|
||||
sa.Column('run_version', sa.String(length=255), nullable=False),
|
||||
sa.Column('run_status', sa.String(length=255), nullable=False),
|
||||
sa.Column('run_triggered_from', sa.String(length=255), nullable=False),
|
||||
sa.Column('run_error', models.types.LongText(), nullable=True),
|
||||
sa.Column('run_elapsed_time', sa.Float(), server_default=sa.text('0'), nullable=False),
|
||||
sa.Column('run_total_tokens', sa.BigInteger(), server_default=sa.text('0'), nullable=False),
|
||||
sa.Column('run_total_steps', sa.Integer(), server_default=sa.text('0'), nullable=True),
|
||||
sa.Column('run_created_at', sa.DateTime(), nullable=False),
|
||||
sa.Column('run_finished_at', sa.DateTime(), nullable=True),
|
||||
sa.Column('run_exceptions_count', sa.Integer(), server_default=sa.text('0'), nullable=True),
|
||||
sa.Column('trigger_metadata', models.types.LongText(), nullable=True),
|
||||
sa.Column('archived_at', sa.DateTime(), server_default=sa.text('CURRENT_TIMESTAMP'), nullable=False),
|
||||
sa.PrimaryKeyConstraint('id', name='workflow_archive_log_pkey')
|
||||
)
|
||||
else:
|
||||
op.create_table('workflow_archive_logs',
|
||||
sa.Column('id', models.types.StringUUID(), nullable=False),
|
||||
sa.Column('log_id', models.types.StringUUID(), nullable=True),
|
||||
sa.Column('tenant_id', models.types.StringUUID(), nullable=False),
|
||||
sa.Column('app_id', models.types.StringUUID(), nullable=False),
|
||||
sa.Column('workflow_id', models.types.StringUUID(), nullable=False),
|
||||
sa.Column('workflow_run_id', models.types.StringUUID(), nullable=False),
|
||||
sa.Column('created_by_role', sa.String(length=255), nullable=False),
|
||||
sa.Column('created_by', models.types.StringUUID(), nullable=False),
|
||||
sa.Column('log_created_at', sa.DateTime(), nullable=True),
|
||||
sa.Column('log_created_from', sa.String(length=255), nullable=True),
|
||||
sa.Column('run_version', sa.String(length=255), nullable=False),
|
||||
sa.Column('run_status', sa.String(length=255), nullable=False),
|
||||
sa.Column('run_triggered_from', sa.String(length=255), nullable=False),
|
||||
sa.Column('run_error', models.types.LongText(), nullable=True),
|
||||
sa.Column('run_elapsed_time', sa.Float(), server_default=sa.text('0'), nullable=False),
|
||||
sa.Column('run_total_tokens', sa.BigInteger(), server_default=sa.text('0'), nullable=False),
|
||||
sa.Column('run_total_steps', sa.Integer(), server_default=sa.text('0'), nullable=True),
|
||||
sa.Column('run_created_at', sa.DateTime(), nullable=False),
|
||||
sa.Column('run_finished_at', sa.DateTime(), nullable=True),
|
||||
sa.Column('run_exceptions_count', sa.Integer(), server_default=sa.text('0'), nullable=True),
|
||||
sa.Column('trigger_metadata', models.types.LongText(), nullable=True),
|
||||
sa.Column('archived_at', sa.DateTime(), server_default=sa.text('CURRENT_TIMESTAMP'), nullable=False),
|
||||
sa.PrimaryKeyConstraint('id', name='workflow_archive_log_pkey')
|
||||
)
|
||||
with op.batch_alter_table('workflow_archive_logs', schema=None) as batch_op:
|
||||
batch_op.create_index('workflow_archive_log_app_idx', ['tenant_id', 'app_id'], unique=False)
|
||||
batch_op.create_index('workflow_archive_log_run_created_at_idx', ['run_created_at'], unique=False)
|
||||
batch_op.create_index('workflow_archive_log_workflow_run_id_idx', ['workflow_run_id'], unique=False)
|
||||
|
||||
|
||||
# ### end Alembic commands ###
|
||||
|
||||
|
||||
def downgrade():
|
||||
# ### commands auto generated by Alembic - please adjust! ###
|
||||
with op.batch_alter_table('workflow_archive_logs', schema=None) as batch_op:
|
||||
batch_op.drop_index('workflow_archive_log_workflow_run_id_idx')
|
||||
batch_op.drop_index('workflow_archive_log_run_created_at_idx')
|
||||
batch_op.drop_index('workflow_archive_log_app_idx')
|
||||
|
||||
op.drop_table('workflow_archive_logs')
|
||||
# ### end Alembic commands ###
|
||||
@@ -103,6 +103,7 @@ from .workflow import (
|
||||
Workflow,
|
||||
WorkflowAppLog,
|
||||
WorkflowAppLogCreatedFrom,
|
||||
WorkflowArchiveLog,
|
||||
WorkflowNodeExecutionModel,
|
||||
WorkflowNodeExecutionOffload,
|
||||
WorkflowNodeExecutionTriggeredFrom,
|
||||
@@ -203,6 +204,7 @@ __all__ = [
|
||||
"Workflow",
|
||||
"WorkflowAppLog",
|
||||
"WorkflowAppLogCreatedFrom",
|
||||
"WorkflowArchiveLog",
|
||||
"WorkflowNodeExecutionModel",
|
||||
"WorkflowNodeExecutionOffload",
|
||||
"WorkflowNodeExecutionTriggeredFrom",
|
||||
|
||||
@@ -315,40 +315,48 @@ class App(Base):
|
||||
return None
|
||||
|
||||
|
||||
class AppModelConfig(Base):
|
||||
class AppModelConfig(TypeBase):
|
||||
__tablename__ = "app_model_configs"
|
||||
__table_args__ = (sa.PrimaryKeyConstraint("id", name="app_model_config_pkey"), sa.Index("app_app_id_idx", "app_id"))
|
||||
|
||||
id = mapped_column(StringUUID, default=lambda: str(uuid4()))
|
||||
app_id = mapped_column(StringUUID, nullable=False)
|
||||
provider = mapped_column(String(255), nullable=True)
|
||||
model_id = mapped_column(String(255), nullable=True)
|
||||
configs = mapped_column(sa.JSON, nullable=True)
|
||||
created_by = mapped_column(StringUUID, nullable=True)
|
||||
created_at = mapped_column(sa.DateTime, nullable=False, server_default=func.current_timestamp())
|
||||
updated_by = mapped_column(StringUUID, nullable=True)
|
||||
updated_at = mapped_column(
|
||||
sa.DateTime, nullable=False, server_default=func.current_timestamp(), onupdate=func.current_timestamp()
|
||||
id: Mapped[str] = mapped_column(StringUUID, default=lambda: str(uuid4()), init=False)
|
||||
app_id: Mapped[str] = mapped_column(StringUUID, nullable=False)
|
||||
provider: Mapped[str | None] = mapped_column(String(255), nullable=True, default=None)
|
||||
model_id: Mapped[str | None] = mapped_column(String(255), nullable=True, default=None)
|
||||
configs: Mapped[Any | None] = mapped_column(sa.JSON, nullable=True, default=None)
|
||||
created_by: Mapped[str | None] = mapped_column(StringUUID, nullable=True, default=None)
|
||||
created_at: Mapped[datetime] = mapped_column(
|
||||
sa.DateTime, nullable=False, server_default=func.current_timestamp(), init=False
|
||||
)
|
||||
opening_statement = mapped_column(LongText)
|
||||
suggested_questions = mapped_column(LongText)
|
||||
suggested_questions_after_answer = mapped_column(LongText)
|
||||
speech_to_text = mapped_column(LongText)
|
||||
text_to_speech = mapped_column(LongText)
|
||||
more_like_this = mapped_column(LongText)
|
||||
model = mapped_column(LongText)
|
||||
user_input_form = mapped_column(LongText)
|
||||
dataset_query_variable = mapped_column(String(255))
|
||||
pre_prompt = mapped_column(LongText)
|
||||
agent_mode = mapped_column(LongText)
|
||||
sensitive_word_avoidance = mapped_column(LongText)
|
||||
retriever_resource = mapped_column(LongText)
|
||||
prompt_type = mapped_column(String(255), nullable=False, server_default=sa.text("'simple'"))
|
||||
chat_prompt_config = mapped_column(LongText)
|
||||
completion_prompt_config = mapped_column(LongText)
|
||||
dataset_configs = mapped_column(LongText)
|
||||
external_data_tools = mapped_column(LongText)
|
||||
file_upload = mapped_column(LongText)
|
||||
updated_by: Mapped[str | None] = mapped_column(StringUUID, nullable=True, default=None)
|
||||
updated_at: Mapped[datetime] = mapped_column(
|
||||
sa.DateTime,
|
||||
nullable=False,
|
||||
server_default=func.current_timestamp(),
|
||||
onupdate=func.current_timestamp(),
|
||||
init=False,
|
||||
)
|
||||
opening_statement: Mapped[str | None] = mapped_column(LongText, default=None)
|
||||
suggested_questions: Mapped[str | None] = mapped_column(LongText, default=None)
|
||||
suggested_questions_after_answer: Mapped[str | None] = mapped_column(LongText, default=None)
|
||||
speech_to_text: Mapped[str | None] = mapped_column(LongText, default=None)
|
||||
text_to_speech: Mapped[str | None] = mapped_column(LongText, default=None)
|
||||
more_like_this: Mapped[str | None] = mapped_column(LongText, default=None)
|
||||
model: Mapped[str | None] = mapped_column(LongText, default=None)
|
||||
user_input_form: Mapped[str | None] = mapped_column(LongText, default=None)
|
||||
dataset_query_variable: Mapped[str | None] = mapped_column(String(255), default=None)
|
||||
pre_prompt: Mapped[str | None] = mapped_column(LongText, default=None)
|
||||
agent_mode: Mapped[str | None] = mapped_column(LongText, default=None)
|
||||
sensitive_word_avoidance: Mapped[str | None] = mapped_column(LongText, default=None)
|
||||
retriever_resource: Mapped[str | None] = mapped_column(LongText, default=None)
|
||||
prompt_type: Mapped[str] = mapped_column(
|
||||
String(255), nullable=False, server_default=sa.text("'simple'"), default="simple"
|
||||
)
|
||||
chat_prompt_config: Mapped[str | None] = mapped_column(LongText, default=None)
|
||||
completion_prompt_config: Mapped[str | None] = mapped_column(LongText, default=None)
|
||||
dataset_configs: Mapped[str | None] = mapped_column(LongText, default=None)
|
||||
external_data_tools: Mapped[str | None] = mapped_column(LongText, default=None)
|
||||
file_upload: Mapped[str | None] = mapped_column(LongText, default=None)
|
||||
|
||||
@property
|
||||
def app(self) -> App | None:
|
||||
@@ -810,8 +818,8 @@ class Conversation(Base):
|
||||
override_model_configs = json.loads(self.override_model_configs)
|
||||
|
||||
if "model" in override_model_configs:
|
||||
app_model_config = AppModelConfig()
|
||||
app_model_config = app_model_config.from_model_config_dict(override_model_configs)
|
||||
# where is app_id?
|
||||
app_model_config = AppModelConfig(app_id=self.app_id).from_model_config_dict(override_model_configs)
|
||||
model_config = app_model_config.to_dict()
|
||||
else:
|
||||
model_config["configs"] = override_model_configs
|
||||
|
||||
@@ -226,8 +226,7 @@ class Workflow(Base): # bug
|
||||
#
|
||||
# Currently, the following functions / methods would mutate the returned dict:
|
||||
#
|
||||
# - `_get_graph_and_variable_pool_of_single_iteration`.
|
||||
# - `_get_graph_and_variable_pool_of_single_loop`.
|
||||
# - `_get_graph_and_variable_pool_for_single_node_run`.
|
||||
return json.loads(self.graph) if self.graph else {}
|
||||
|
||||
def get_node_config_by_id(self, node_id: str) -> Mapping[str, Any]:
|
||||
@@ -1163,6 +1162,69 @@ class WorkflowAppLog(TypeBase):
|
||||
}
|
||||
|
||||
|
||||
class WorkflowArchiveLog(TypeBase):
|
||||
"""
|
||||
Workflow archive log.
|
||||
|
||||
Stores essential workflow run snapshot data for archived app logs.
|
||||
|
||||
Field sources:
|
||||
- Shared fields (tenant/app/workflow/run ids, created_by*): from WorkflowRun for consistency.
|
||||
- log_* fields: from WorkflowAppLog when present; null if the run has no app log.
|
||||
- run_* fields: workflow run snapshot fields from WorkflowRun.
|
||||
- trigger_metadata: snapshot from WorkflowTriggerLog when present.
|
||||
"""
|
||||
|
||||
__tablename__ = "workflow_archive_logs"
|
||||
__table_args__ = (
|
||||
sa.PrimaryKeyConstraint("id", name="workflow_archive_log_pkey"),
|
||||
sa.Index("workflow_archive_log_app_idx", "tenant_id", "app_id"),
|
||||
sa.Index("workflow_archive_log_workflow_run_id_idx", "workflow_run_id"),
|
||||
sa.Index("workflow_archive_log_run_created_at_idx", "run_created_at"),
|
||||
)
|
||||
|
||||
id: Mapped[str] = mapped_column(
|
||||
StringUUID, insert_default=lambda: str(uuidv7()), default_factory=lambda: str(uuidv7()), init=False
|
||||
)
|
||||
|
||||
tenant_id: Mapped[str] = mapped_column(StringUUID, nullable=False)
|
||||
app_id: Mapped[str] = mapped_column(StringUUID, nullable=False)
|
||||
workflow_id: Mapped[str] = mapped_column(StringUUID, nullable=False)
|
||||
workflow_run_id: Mapped[str] = mapped_column(StringUUID, nullable=False)
|
||||
created_by_role: Mapped[str] = mapped_column(String(255), nullable=False)
|
||||
created_by: Mapped[str] = mapped_column(StringUUID, nullable=False)
|
||||
|
||||
log_id: Mapped[str | None] = mapped_column(StringUUID, nullable=True)
|
||||
log_created_at: Mapped[datetime | None] = mapped_column(DateTime, nullable=True)
|
||||
log_created_from: Mapped[str | None] = mapped_column(String(255), nullable=True)
|
||||
|
||||
run_version: Mapped[str] = mapped_column(String(255), nullable=False)
|
||||
run_status: Mapped[str] = mapped_column(String(255), nullable=False)
|
||||
run_triggered_from: Mapped[str] = mapped_column(String(255), nullable=False)
|
||||
run_error: Mapped[str | None] = mapped_column(LongText, nullable=True)
|
||||
run_elapsed_time: Mapped[float] = mapped_column(sa.Float, nullable=False, server_default=sa.text("0"))
|
||||
run_total_tokens: Mapped[int] = mapped_column(sa.BigInteger, server_default=sa.text("0"))
|
||||
run_total_steps: Mapped[int] = mapped_column(sa.Integer, server_default=sa.text("0"), nullable=True)
|
||||
run_created_at: Mapped[datetime] = mapped_column(DateTime, nullable=False)
|
||||
run_finished_at: Mapped[datetime | None] = mapped_column(DateTime, nullable=True)
|
||||
run_exceptions_count: Mapped[int] = mapped_column(sa.Integer, server_default=sa.text("0"), nullable=True)
|
||||
|
||||
trigger_metadata: Mapped[str | None] = mapped_column(LongText, nullable=True)
|
||||
archived_at: Mapped[datetime] = mapped_column(
|
||||
DateTime, nullable=False, server_default=func.current_timestamp(), init=False
|
||||
)
|
||||
|
||||
@property
|
||||
def workflow_run_summary(self) -> dict[str, Any]:
|
||||
return {
|
||||
"id": self.workflow_run_id,
|
||||
"status": self.run_status,
|
||||
"triggered_from": self.run_triggered_from,
|
||||
"elapsed_time": self.run_elapsed_time,
|
||||
"total_tokens": self.run_total_tokens,
|
||||
}
|
||||
|
||||
|
||||
class ConversationVariable(TypeBase):
|
||||
__tablename__ = "workflow_conversation_variables"
|
||||
|
||||
|
||||
@@ -31,7 +31,7 @@ dependencies = [
|
||||
"gunicorn~=23.0.0",
|
||||
"httpx[socks]~=0.27.0",
|
||||
"jieba==0.42.1",
|
||||
"json-repair>=0.41.1",
|
||||
"json-repair>=0.55.1",
|
||||
"jsonschema>=4.25.1",
|
||||
"langfuse~=2.51.3",
|
||||
"langsmith~=0.1.77",
|
||||
@@ -64,7 +64,7 @@ dependencies = [
|
||||
"pandas[excel,output-formatting,performance]~=2.2.2",
|
||||
"psycogreen~=1.0.2",
|
||||
"psycopg2-binary~=2.9.6",
|
||||
"pycryptodome==3.19.1",
|
||||
"pycryptodome==3.23.0",
|
||||
"pydantic~=2.11.4",
|
||||
"pydantic-extra-types~=2.10.3",
|
||||
"pydantic-settings~=2.11.0",
|
||||
@@ -93,6 +93,7 @@ dependencies = [
|
||||
"weaviate-client==4.17.0",
|
||||
"apscheduler>=3.11.0",
|
||||
"weave>=0.52.16",
|
||||
"fastopenapi[flask]>=0.7.0",
|
||||
]
|
||||
# Before adding new dependency, consider place it in
|
||||
# alphabet order (a-z) and suitable group.
|
||||
|
||||
@@ -8,6 +8,7 @@
|
||||
],
|
||||
"typeCheckingMode": "strict",
|
||||
"allowedUntypedLibraries": [
|
||||
"fastopenapi",
|
||||
"flask_restx",
|
||||
"flask_login",
|
||||
"opentelemetry.instrumentation.celery",
|
||||
|
||||
@@ -16,7 +16,7 @@ from typing import Protocol
|
||||
from sqlalchemy.orm import Session
|
||||
|
||||
from core.workflow.repositories.workflow_node_execution_repository import WorkflowNodeExecutionRepository
|
||||
from models.workflow import WorkflowNodeExecutionModel
|
||||
from models.workflow import WorkflowNodeExecutionModel, WorkflowNodeExecutionOffload
|
||||
|
||||
|
||||
class DifyAPIWorkflowNodeExecutionRepository(WorkflowNodeExecutionRepository, Protocol):
|
||||
@@ -209,3 +209,23 @@ class DifyAPIWorkflowNodeExecutionRepository(WorkflowNodeExecutionRepository, Pr
|
||||
The number of executions deleted
|
||||
"""
|
||||
...
|
||||
|
||||
def get_offloads_by_execution_ids(
|
||||
self,
|
||||
session: Session,
|
||||
node_execution_ids: Sequence[str],
|
||||
) -> Sequence[WorkflowNodeExecutionOffload]:
|
||||
"""
|
||||
Get offload records by node execution IDs.
|
||||
|
||||
This method retrieves workflow node execution offload records
|
||||
that belong to the given node execution IDs.
|
||||
|
||||
Args:
|
||||
session: The database session to use
|
||||
node_execution_ids: List of node execution IDs to filter by
|
||||
|
||||
Returns:
|
||||
A sequence of WorkflowNodeExecutionOffload instances
|
||||
"""
|
||||
...
|
||||
|
||||
@@ -45,7 +45,7 @@ from core.workflow.enums import WorkflowType
|
||||
from core.workflow.repositories.workflow_execution_repository import WorkflowExecutionRepository
|
||||
from libs.infinite_scroll_pagination import InfiniteScrollPagination
|
||||
from models.enums import WorkflowRunTriggeredFrom
|
||||
from models.workflow import WorkflowRun
|
||||
from models.workflow import WorkflowAppLog, WorkflowArchiveLog, WorkflowPause, WorkflowPauseReason, WorkflowRun
|
||||
from repositories.entities.workflow_pause import WorkflowPauseEntity
|
||||
from repositories.types import (
|
||||
AverageInteractionStats,
|
||||
@@ -270,6 +270,58 @@ class APIWorkflowRunRepository(WorkflowExecutionRepository, Protocol):
|
||||
"""
|
||||
...
|
||||
|
||||
def get_archived_run_ids(
|
||||
self,
|
||||
session: Session,
|
||||
run_ids: Sequence[str],
|
||||
) -> set[str]:
|
||||
"""
|
||||
Fetch workflow run IDs that already have archive log records.
|
||||
"""
|
||||
...
|
||||
|
||||
def get_archived_logs_by_time_range(
|
||||
self,
|
||||
session: Session,
|
||||
tenant_ids: Sequence[str] | None,
|
||||
start_date: datetime,
|
||||
end_date: datetime,
|
||||
limit: int,
|
||||
) -> Sequence[WorkflowArchiveLog]:
|
||||
"""
|
||||
Fetch archived workflow logs by time range for restore.
|
||||
"""
|
||||
...
|
||||
|
||||
def get_archived_log_by_run_id(
|
||||
self,
|
||||
run_id: str,
|
||||
) -> WorkflowArchiveLog | None:
|
||||
"""
|
||||
Fetch a workflow archive log by workflow run ID.
|
||||
"""
|
||||
...
|
||||
|
||||
def delete_archive_log_by_run_id(
|
||||
self,
|
||||
session: Session,
|
||||
run_id: str,
|
||||
) -> int:
|
||||
"""
|
||||
Delete archive log by workflow run ID.
|
||||
|
||||
Used after restoring a workflow run to remove the archive log record,
|
||||
allowing the run to be archived again if needed.
|
||||
|
||||
Args:
|
||||
session: Database session
|
||||
run_id: Workflow run ID
|
||||
|
||||
Returns:
|
||||
Number of records deleted (0 or 1)
|
||||
"""
|
||||
...
|
||||
|
||||
def delete_runs_with_related(
|
||||
self,
|
||||
runs: Sequence[WorkflowRun],
|
||||
@@ -282,6 +334,61 @@ class APIWorkflowRunRepository(WorkflowExecutionRepository, Protocol):
|
||||
"""
|
||||
...
|
||||
|
||||
def get_pause_records_by_run_id(
|
||||
self,
|
||||
session: Session,
|
||||
run_id: str,
|
||||
) -> Sequence[WorkflowPause]:
|
||||
"""
|
||||
Fetch workflow pause records by workflow run ID.
|
||||
"""
|
||||
...
|
||||
|
||||
def get_pause_reason_records_by_run_id(
|
||||
self,
|
||||
session: Session,
|
||||
pause_ids: Sequence[str],
|
||||
) -> Sequence[WorkflowPauseReason]:
|
||||
"""
|
||||
Fetch workflow pause reason records by pause IDs.
|
||||
"""
|
||||
...
|
||||
|
||||
def get_app_logs_by_run_id(
|
||||
self,
|
||||
session: Session,
|
||||
run_id: str,
|
||||
) -> Sequence[WorkflowAppLog]:
|
||||
"""
|
||||
Fetch workflow app logs by workflow run ID.
|
||||
"""
|
||||
...
|
||||
|
||||
def create_archive_logs(
|
||||
self,
|
||||
session: Session,
|
||||
run: WorkflowRun,
|
||||
app_logs: Sequence[WorkflowAppLog],
|
||||
trigger_metadata: str | None,
|
||||
) -> int:
|
||||
"""
|
||||
Create archive log records for a workflow run.
|
||||
"""
|
||||
...
|
||||
|
||||
def get_archived_runs_by_time_range(
|
||||
self,
|
||||
session: Session,
|
||||
tenant_ids: Sequence[str] | None,
|
||||
start_date: datetime,
|
||||
end_date: datetime,
|
||||
limit: int,
|
||||
) -> Sequence[WorkflowRun]:
|
||||
"""
|
||||
Return workflow runs that already have archive logs, for cleanup of `workflow_runs`.
|
||||
"""
|
||||
...
|
||||
|
||||
def count_runs_with_related(
|
||||
self,
|
||||
runs: Sequence[WorkflowRun],
|
||||
|
||||
@@ -351,3 +351,27 @@ class DifyAPISQLAlchemyWorkflowNodeExecutionRepository(DifyAPIWorkflowNodeExecut
|
||||
)
|
||||
|
||||
return int(node_executions_count), int(offloads_count)
|
||||
|
||||
@staticmethod
|
||||
def get_by_run(
|
||||
session: Session,
|
||||
run_id: str,
|
||||
) -> Sequence[WorkflowNodeExecutionModel]:
|
||||
"""
|
||||
Fetch node executions for a run using workflow_run_id.
|
||||
"""
|
||||
stmt = select(WorkflowNodeExecutionModel).where(WorkflowNodeExecutionModel.workflow_run_id == run_id)
|
||||
return list(session.scalars(stmt))
|
||||
|
||||
def get_offloads_by_execution_ids(
|
||||
self,
|
||||
session: Session,
|
||||
node_execution_ids: Sequence[str],
|
||||
) -> Sequence[WorkflowNodeExecutionOffload]:
|
||||
if not node_execution_ids:
|
||||
return []
|
||||
|
||||
stmt = select(WorkflowNodeExecutionOffload).where(
|
||||
WorkflowNodeExecutionOffload.node_execution_id.in_(node_execution_ids)
|
||||
)
|
||||
return list(session.scalars(stmt))
|
||||
|
||||
@@ -40,14 +40,7 @@ from libs.infinite_scroll_pagination import InfiniteScrollPagination
|
||||
from libs.time_parser import get_time_threshold
|
||||
from libs.uuid_utils import uuidv7
|
||||
from models.enums import WorkflowRunTriggeredFrom
|
||||
from models.workflow import (
|
||||
WorkflowAppLog,
|
||||
WorkflowPauseReason,
|
||||
WorkflowRun,
|
||||
)
|
||||
from models.workflow import (
|
||||
WorkflowPause as WorkflowPauseModel,
|
||||
)
|
||||
from models.workflow import WorkflowAppLog, WorkflowArchiveLog, WorkflowPause, WorkflowPauseReason, WorkflowRun
|
||||
from repositories.api_workflow_run_repository import APIWorkflowRunRepository
|
||||
from repositories.entities.workflow_pause import WorkflowPauseEntity
|
||||
from repositories.types import (
|
||||
@@ -369,6 +362,53 @@ class DifyAPISQLAlchemyWorkflowRunRepository(APIWorkflowRunRepository):
|
||||
|
||||
return session.scalars(stmt).all()
|
||||
|
||||
def get_archived_run_ids(
|
||||
self,
|
||||
session: Session,
|
||||
run_ids: Sequence[str],
|
||||
) -> set[str]:
|
||||
if not run_ids:
|
||||
return set()
|
||||
|
||||
stmt = select(WorkflowArchiveLog.workflow_run_id).where(WorkflowArchiveLog.workflow_run_id.in_(run_ids))
|
||||
return set(session.scalars(stmt).all())
|
||||
|
||||
def get_archived_log_by_run_id(
|
||||
self,
|
||||
run_id: str,
|
||||
) -> WorkflowArchiveLog | None:
|
||||
with self._session_maker() as session:
|
||||
stmt = select(WorkflowArchiveLog).where(WorkflowArchiveLog.workflow_run_id == run_id).limit(1)
|
||||
return session.scalar(stmt)
|
||||
|
||||
def delete_archive_log_by_run_id(
|
||||
self,
|
||||
session: Session,
|
||||
run_id: str,
|
||||
) -> int:
|
||||
stmt = delete(WorkflowArchiveLog).where(WorkflowArchiveLog.workflow_run_id == run_id)
|
||||
result = session.execute(stmt)
|
||||
return cast(CursorResult, result).rowcount or 0
|
||||
|
||||
def get_pause_records_by_run_id(
|
||||
self,
|
||||
session: Session,
|
||||
run_id: str,
|
||||
) -> Sequence[WorkflowPause]:
|
||||
stmt = select(WorkflowPause).where(WorkflowPause.workflow_run_id == run_id)
|
||||
return list(session.scalars(stmt))
|
||||
|
||||
def get_pause_reason_records_by_run_id(
|
||||
self,
|
||||
session: Session,
|
||||
pause_ids: Sequence[str],
|
||||
) -> Sequence[WorkflowPauseReason]:
|
||||
if not pause_ids:
|
||||
return []
|
||||
|
||||
stmt = select(WorkflowPauseReason).where(WorkflowPauseReason.pause_id.in_(pause_ids))
|
||||
return list(session.scalars(stmt))
|
||||
|
||||
def delete_runs_with_related(
|
||||
self,
|
||||
runs: Sequence[WorkflowRun],
|
||||
@@ -396,9 +436,8 @@ class DifyAPISQLAlchemyWorkflowRunRepository(APIWorkflowRunRepository):
|
||||
app_logs_result = session.execute(delete(WorkflowAppLog).where(WorkflowAppLog.workflow_run_id.in_(run_ids)))
|
||||
app_logs_deleted = cast(CursorResult, app_logs_result).rowcount or 0
|
||||
|
||||
pause_ids = session.scalars(
|
||||
select(WorkflowPauseModel.id).where(WorkflowPauseModel.workflow_run_id.in_(run_ids))
|
||||
).all()
|
||||
pause_stmt = select(WorkflowPause.id).where(WorkflowPause.workflow_run_id.in_(run_ids))
|
||||
pause_ids = session.scalars(pause_stmt).all()
|
||||
pause_reasons_deleted = 0
|
||||
pauses_deleted = 0
|
||||
|
||||
@@ -407,7 +446,7 @@ class DifyAPISQLAlchemyWorkflowRunRepository(APIWorkflowRunRepository):
|
||||
delete(WorkflowPauseReason).where(WorkflowPauseReason.pause_id.in_(pause_ids))
|
||||
)
|
||||
pause_reasons_deleted = cast(CursorResult, pause_reasons_result).rowcount or 0
|
||||
pauses_result = session.execute(delete(WorkflowPauseModel).where(WorkflowPauseModel.id.in_(pause_ids)))
|
||||
pauses_result = session.execute(delete(WorkflowPause).where(WorkflowPause.id.in_(pause_ids)))
|
||||
pauses_deleted = cast(CursorResult, pauses_result).rowcount or 0
|
||||
|
||||
trigger_logs_deleted = delete_trigger_logs(session, run_ids) if delete_trigger_logs else 0
|
||||
@@ -427,6 +466,124 @@ class DifyAPISQLAlchemyWorkflowRunRepository(APIWorkflowRunRepository):
|
||||
"pause_reasons": pause_reasons_deleted,
|
||||
}
|
||||
|
||||
def get_app_logs_by_run_id(
|
||||
self,
|
||||
session: Session,
|
||||
run_id: str,
|
||||
) -> Sequence[WorkflowAppLog]:
|
||||
stmt = select(WorkflowAppLog).where(WorkflowAppLog.workflow_run_id == run_id)
|
||||
return list(session.scalars(stmt))
|
||||
|
||||
def create_archive_logs(
|
||||
self,
|
||||
session: Session,
|
||||
run: WorkflowRun,
|
||||
app_logs: Sequence[WorkflowAppLog],
|
||||
trigger_metadata: str | None,
|
||||
) -> int:
|
||||
if not app_logs:
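# No app logs exist for this run: still write a single archive row (with the log_* fields
# left empty) so the run itself is recorded as archived.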
|
||||
archive_log = WorkflowArchiveLog(
|
||||
log_id=None,
|
||||
log_created_at=None,
|
||||
log_created_from=None,
|
||||
tenant_id=run.tenant_id,
|
||||
app_id=run.app_id,
|
||||
workflow_id=run.workflow_id,
|
||||
workflow_run_id=run.id,
|
||||
created_by_role=run.created_by_role,
|
||||
created_by=run.created_by,
|
||||
run_version=run.version,
|
||||
run_status=run.status,
|
||||
run_triggered_from=run.triggered_from,
|
||||
run_error=run.error,
|
||||
run_elapsed_time=run.elapsed_time,
|
||||
run_total_tokens=run.total_tokens,
|
||||
run_total_steps=run.total_steps,
|
||||
run_created_at=run.created_at,
|
||||
run_finished_at=run.finished_at,
|
||||
run_exceptions_count=run.exceptions_count,
|
||||
trigger_metadata=trigger_metadata,
|
||||
)
|
||||
session.add(archive_log)
|
||||
return 1
|
||||
|
||||
archive_logs = [
|
||||
WorkflowArchiveLog(
|
||||
log_id=app_log.id,
|
||||
log_created_at=app_log.created_at,
|
||||
log_created_from=app_log.created_from,
|
||||
tenant_id=run.tenant_id,
|
||||
app_id=run.app_id,
|
||||
workflow_id=run.workflow_id,
|
||||
workflow_run_id=run.id,
|
||||
created_by_role=run.created_by_role,
|
||||
created_by=run.created_by,
|
||||
run_version=run.version,
|
||||
run_status=run.status,
|
||||
run_triggered_from=run.triggered_from,
|
||||
run_error=run.error,
|
||||
run_elapsed_time=run.elapsed_time,
|
||||
run_total_tokens=run.total_tokens,
|
||||
run_total_steps=run.total_steps,
|
||||
run_created_at=run.created_at,
|
||||
run_finished_at=run.finished_at,
|
||||
run_exceptions_count=run.exceptions_count,
|
||||
trigger_metadata=trigger_metadata,
|
||||
)
|
||||
for app_log in app_logs
|
||||
]
|
||||
session.add_all(archive_logs)
|
||||
return len(archive_logs)
|
||||
|
||||
def get_archived_runs_by_time_range(
|
||||
self,
|
||||
session: Session,
|
||||
tenant_ids: Sequence[str] | None,
|
||||
start_date: datetime,
|
||||
end_date: datetime,
|
||||
limit: int,
|
||||
) -> Sequence[WorkflowRun]:
|
||||
"""
|
||||
Retrieves WorkflowRun records by joining workflow_archive_logs.
|
||||
|
||||
Used to identify runs that are already archived and ready for deletion.
|
||||
"""
|
||||
stmt = (
|
||||
select(WorkflowRun)
|
||||
.join(WorkflowArchiveLog, WorkflowArchiveLog.workflow_run_id == WorkflowRun.id)
|
||||
.where(
|
||||
WorkflowArchiveLog.run_created_at >= start_date,
|
||||
WorkflowArchiveLog.run_created_at < end_date,
|
||||
)
|
||||
.order_by(WorkflowArchiveLog.run_created_at.asc(), WorkflowArchiveLog.workflow_run_id.asc())
|
||||
.limit(limit)
|
||||
)
|
||||
if tenant_ids:
|
||||
stmt = stmt.where(WorkflowArchiveLog.tenant_id.in_(tenant_ids))
|
||||
return list(session.scalars(stmt))
|
||||
|
||||
def get_archived_logs_by_time_range(
|
||||
self,
|
||||
session: Session,
|
||||
tenant_ids: Sequence[str] | None,
|
||||
start_date: datetime,
|
||||
end_date: datetime,
|
||||
limit: int,
|
||||
) -> Sequence[WorkflowArchiveLog]:
|
||||
# Returns WorkflowArchiveLog rows directly; use this when workflow_runs may be deleted.
|
||||
stmt = (
|
||||
select(WorkflowArchiveLog)
|
||||
.where(
|
||||
WorkflowArchiveLog.run_created_at >= start_date,
|
||||
WorkflowArchiveLog.run_created_at < end_date,
|
||||
)
|
||||
.order_by(WorkflowArchiveLog.run_created_at.asc(), WorkflowArchiveLog.workflow_run_id.asc())
|
||||
.limit(limit)
|
||||
)
|
||||
if tenant_ids:
|
||||
stmt = stmt.where(WorkflowArchiveLog.tenant_id.in_(tenant_ids))
|
||||
return list(session.scalars(stmt))
|
||||
|
||||
def count_runs_with_related(
|
||||
self,
|
||||
runs: Sequence[WorkflowRun],
|
||||
@@ -459,7 +616,7 @@ class DifyAPISQLAlchemyWorkflowRunRepository(APIWorkflowRunRepository):
|
||||
)
|
||||
|
||||
pause_ids = session.scalars(
|
||||
select(WorkflowPauseModel.id).where(WorkflowPauseModel.workflow_run_id.in_(run_ids))
|
||||
select(WorkflowPause.id).where(WorkflowPause.workflow_run_id.in_(run_ids))
|
||||
).all()
|
||||
pauses_count = len(pause_ids)
|
||||
pause_reasons_count = 0
|
||||
@@ -511,9 +668,7 @@ class DifyAPISQLAlchemyWorkflowRunRepository(APIWorkflowRunRepository):
|
||||
ValueError: If workflow_run_id is invalid or workflow run doesn't exist
|
||||
RuntimeError: If workflow is already paused or in invalid state
|
||||
"""
|
||||
previous_pause_model_query = select(WorkflowPauseModel).where(
|
||||
WorkflowPauseModel.workflow_run_id == workflow_run_id
|
||||
)
|
||||
previous_pause_model_query = select(WorkflowPause).where(WorkflowPause.workflow_run_id == workflow_run_id)
|
||||
with self._session_maker() as session, session.begin():
|
||||
# Get the workflow run
|
||||
workflow_run = session.get(WorkflowRun, workflow_run_id)
|
||||
@@ -538,7 +693,7 @@ class DifyAPISQLAlchemyWorkflowRunRepository(APIWorkflowRunRepository):
|
||||
# Upload the state file
|
||||
|
||||
# Create the pause record
|
||||
pause_model = WorkflowPauseModel()
|
||||
pause_model = WorkflowPause()
|
||||
pause_model.id = str(uuidv7())
|
||||
pause_model.workflow_id = workflow_run.workflow_id
|
||||
pause_model.workflow_run_id = workflow_run.id
|
||||
@@ -710,13 +865,13 @@ class DifyAPISQLAlchemyWorkflowRunRepository(APIWorkflowRunRepository):
|
||||
"""
|
||||
with self._session_maker() as session, session.begin():
|
||||
# Get the pause model by ID
|
||||
pause_model = session.get(WorkflowPauseModel, pause_entity.id)
|
||||
pause_model = session.get(WorkflowPause, pause_entity.id)
|
||||
if pause_model is None:
|
||||
raise _WorkflowRunError(f"WorkflowPause not found: {pause_entity.id}")
|
||||
self._delete_pause_model(session, pause_model)
|
||||
|
||||
@staticmethod
|
||||
def _delete_pause_model(session: Session, pause_model: WorkflowPauseModel):
|
||||
def _delete_pause_model(session: Session, pause_model: WorkflowPause):
|
||||
storage.delete(pause_model.state_object_key)
|
||||
|
||||
# Delete the pause record
|
||||
@@ -751,15 +906,15 @@ class DifyAPISQLAlchemyWorkflowRunRepository(APIWorkflowRunRepository):
|
||||
_limit: int = limit or 1000
|
||||
pruned_record_ids: list[str] = []
|
||||
cond = or_(
|
||||
WorkflowPauseModel.created_at < expiration,
|
||||
WorkflowPause.created_at < expiration,
|
||||
and_(
|
||||
WorkflowPauseModel.resumed_at.is_not(null()),
|
||||
WorkflowPauseModel.resumed_at < resumption_expiration,
|
||||
WorkflowPause.resumed_at.is_not(null()),
|
||||
WorkflowPause.resumed_at < resumption_expiration,
|
||||
),
|
||||
)
|
||||
# First, collect pause records to delete with their state files
|
||||
# Expired pauses (created before expiration time)
|
||||
stmt = select(WorkflowPauseModel).where(cond).limit(_limit)
|
||||
stmt = select(WorkflowPause).where(cond).limit(_limit)
|
||||
|
||||
with self._session_maker(expire_on_commit=False) as session:
|
||||
# Old resumed pauses (resumed more than resumption_duration ago)
|
||||
@@ -770,7 +925,7 @@ class DifyAPISQLAlchemyWorkflowRunRepository(APIWorkflowRunRepository):
|
||||
# Delete state files from storage
|
||||
for pause in pauses_to_delete:
|
||||
with self._session_maker(expire_on_commit=False) as session, session.begin():
|
||||
# todo: this issues a separate query for each WorkflowPauseModel record.
|
||||
# todo: this issues a separate query for each WorkflowPause record.
|
||||
# consider batching this lookup.
|
||||
try:
|
||||
storage.delete(pause.state_object_key)
|
||||
@@ -1022,7 +1177,7 @@ class _PrivateWorkflowPauseEntity(WorkflowPauseEntity):
|
||||
def __init__(
|
||||
self,
|
||||
*,
|
||||
pause_model: WorkflowPauseModel,
|
||||
pause_model: WorkflowPause,
|
||||
reason_models: Sequence[WorkflowPauseReason],
|
||||
human_input_form: Sequence = (),
|
||||
) -> None:
|
||||
|
||||
@@ -46,6 +46,11 @@ class SQLAlchemyWorkflowTriggerLogRepository(WorkflowTriggerLogRepository):
|
||||
|
||||
return self.session.scalar(query)
|
||||
|
||||
def list_by_run_id(self, run_id: str) -> Sequence[WorkflowTriggerLog]:
|
||||
"""List trigger logs for a workflow run."""
|
||||
query = select(WorkflowTriggerLog).where(WorkflowTriggerLog.workflow_run_id == run_id)
|
||||
return list(self.session.scalars(query).all())
|
||||
|
||||
def get_failed_for_retry(
|
||||
self, tenant_id: str, max_retry_count: int = 3, limit: int = 100
|
||||
) -> Sequence[WorkflowTriggerLog]:
|
||||
|
||||
@@ -521,12 +521,10 @@ class AppDslService:
|
||||
raise ValueError("Missing model_config for chat/agent-chat/completion app")
|
||||
# Initialize or update model config
|
||||
if not app.app_model_config:
|
||||
app_model_config = AppModelConfig().from_model_config_dict(model_config)
|
||||
app_model_config = AppModelConfig(
|
||||
app_id=app.id, created_by=account.id, updated_by=account.id
|
||||
).from_model_config_dict(model_config)
|
||||
app_model_config.id = str(uuid4())
|
||||
app_model_config.app_id = app.id
|
||||
app_model_config.created_by = account.id
|
||||
app_model_config.updated_by = account.id
|
||||
|
||||
app.app_model_config_id = app_model_config.id
|
||||
|
||||
self._session.add(app_model_config)
|
||||
@@ -783,15 +781,16 @@ class AppDslService:
|
||||
return dependencies
|
||||
|
||||
@classmethod
|
||||
def get_leaked_dependencies(cls, tenant_id: str, dsl_dependencies: list[dict]) -> list[PluginDependency]:
|
||||
def get_leaked_dependencies(
|
||||
cls, tenant_id: str, dsl_dependencies: list[PluginDependency]
|
||||
) -> list[PluginDependency]:
|
||||
"""
|
||||
Returns the leaked dependencies in the current workspace
|
||||
"""
|
||||
dependencies = [PluginDependency.model_validate(dep) for dep in dsl_dependencies]
|
||||
if not dependencies:
|
||||
if not dsl_dependencies:
|
||||
return []
|
||||
|
||||
return DependenciesAnalysisService.get_leaked_dependencies(tenant_id=tenant_id, dependencies=dependencies)
|
||||
return DependenciesAnalysisService.get_leaked_dependencies(tenant_id=tenant_id, dependencies=dsl_dependencies)
|
||||
|
||||
@staticmethod
|
||||
def _generate_aes_key(tenant_id: str) -> bytes:
|
||||
|
||||
@@ -1,6 +1,8 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import uuid
|
||||
from collections.abc import Generator, Mapping
|
||||
from typing import Any, Union
|
||||
from typing import TYPE_CHECKING, Any, Union
|
||||
|
||||
from configs import dify_config
|
||||
from core.app.apps.advanced_chat.app_generator import AdvancedChatAppGenerator
|
||||
@@ -18,6 +20,9 @@ from services.errors.app import QuotaExceededError, WorkflowIdFormatError, Workf
|
||||
from services.errors.llm import InvokeRateLimitError
|
||||
from services.workflow_service import WorkflowService
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from controllers.console.app.workflow import LoopNodeRunPayload
|
||||
|
||||
|
||||
class AppGenerateService:
|
||||
@classmethod
|
||||
@@ -165,7 +170,9 @@ class AppGenerateService:
|
||||
raise ValueError(f"Invalid app mode {app_model.mode}")
|
||||
|
||||
@classmethod
|
||||
def generate_single_loop(cls, app_model: App, user: Account, node_id: str, args: Any, streaming: bool = True):
|
||||
def generate_single_loop(
|
||||
cls, app_model: App, user: Account, node_id: str, args: LoopNodeRunPayload, streaming: bool = True
|
||||
):
|
||||
if app_model.mode == AppMode.ADVANCED_CHAT:
|
||||
workflow = cls._get_workflow(app_model, InvokeFrom.DEBUGGER)
|
||||
return AdvancedChatAppGenerator.convert_to_event_stream(
|
||||
|
||||
@@ -150,10 +150,9 @@ class AppService:
|
||||
db.session.flush()
|
||||
|
||||
if default_model_config:
|
||||
app_model_config = AppModelConfig(**default_model_config)
|
||||
app_model_config.app_id = app.id
|
||||
app_model_config.created_by = account.id
|
||||
app_model_config.updated_by = account.id
|
||||
app_model_config = AppModelConfig(
|
||||
**default_model_config, app_id=app.id, created_by=account.id, updated_by=account.id
|
||||
)
|
||||
db.session.add(app_model_config)
|
||||
db.session.flush()
|
||||
|
||||
|
||||
@@ -131,7 +131,7 @@ class BillingService:
|
||||
headers = {"Content-Type": "application/json", "Billing-Api-Secret-Key": cls.secret_key}
|
||||
|
||||
url = f"{cls.base_url}{endpoint}"
|
||||
response = httpx.request(method, url, json=json, params=params, headers=headers)
|
||||
response = httpx.request(method, url, json=json, params=params, headers=headers, follow_redirects=True)
|
||||
if method == "GET" and response.status_code != httpx.codes.OK:
|
||||
raise ValueError("Unable to retrieve billing information. Please try again later or contact support.")
|
||||
if method == "PUT":
|
||||
@@ -143,6 +143,9 @@ class BillingService:
|
||||
raise ValueError("Invalid arguments.")
|
||||
if method == "POST" and response.status_code != httpx.codes.OK:
|
||||
raise ValueError(f"Unable to send request to {url}. Please try again later or contact support.")
|
||||
if method == "DELETE" and response.status_code != httpx.codes.OK:
|
||||
logger.error("billing_service: DELETE response: %s %s", response.status_code, response.text)
|
||||
raise ValueError(f"Unable to process delete request {url}. Please try again later or contact support.")
|
||||
return response.json()
|
||||
|
||||
@staticmethod
|
||||
@@ -165,7 +168,7 @@ class BillingService:
|
||||
def delete_account(cls, account_id: str):
|
||||
"""Delete account."""
|
||||
params = {"account_id": account_id}
|
||||
return cls._send_request("DELETE", "/account/", params=params)
|
||||
return cls._send_request("DELETE", "/account", params=params)
|
||||
|
||||
@classmethod
|
||||
def is_email_in_freeze(cls, email: str) -> bool:
|
||||
|
||||
@@ -4,6 +4,7 @@ from pydantic import BaseModel, ConfigDict, Field
|
||||
|
||||
from configs import dify_config
|
||||
from enums.cloud_plan import CloudPlan
|
||||
from enums.hosted_provider import HostedTrialProvider
|
||||
from services.billing_service import BillingService
|
||||
from services.enterprise.enterprise_service import EnterpriseService
|
||||
|
||||
@@ -170,6 +171,7 @@ class SystemFeatureModel(BaseModel):
|
||||
plugin_installation_permission: PluginInstallationPermissionModel = PluginInstallationPermissionModel()
|
||||
enable_change_email: bool = True
|
||||
plugin_manager: PluginManagerModel = PluginManagerModel()
|
||||
trial_models: list[str] = []
|
||||
enable_trial_app: bool = False
|
||||
enable_explore_banner: bool = False
|
||||
|
||||
@@ -202,7 +204,7 @@ class FeatureService:
|
||||
return knowledge_rate_limit
|
||||
|
||||
@classmethod
|
||||
def get_system_features(cls) -> SystemFeatureModel:
|
||||
def get_system_features(cls, is_authenticated: bool = False) -> SystemFeatureModel:
|
||||
system_features = SystemFeatureModel()
|
||||
|
||||
cls._fulfill_system_params_from_env(system_features)
|
||||
@@ -212,7 +214,7 @@ class FeatureService:
|
||||
system_features.webapp_auth.enabled = True
|
||||
system_features.enable_change_email = False
|
||||
system_features.plugin_manager.enabled = True
|
||||
cls._fulfill_params_from_enterprise(system_features)
|
||||
cls._fulfill_params_from_enterprise(system_features, is_authenticated)
|
||||
|
||||
if dify_config.MARKETPLACE_ENABLED:
|
||||
system_features.enable_marketplace = True
|
||||
@@ -227,9 +229,21 @@ class FeatureService:
|
||||
system_features.is_allow_register = dify_config.ALLOW_REGISTER
|
||||
system_features.is_allow_create_workspace = dify_config.ALLOW_CREATE_WORKSPACE
|
||||
system_features.is_email_setup = dify_config.MAIL_TYPE is not None and dify_config.MAIL_TYPE != ""
|
||||
system_features.trial_models = cls._fulfill_trial_models_from_env()
|
||||
system_features.enable_trial_app = dify_config.ENABLE_TRIAL_APP
|
||||
system_features.enable_explore_banner = dify_config.ENABLE_EXPLORE_BANNER
|
||||
|
||||
@classmethod
|
||||
def _fulfill_trial_models_from_env(cls) -> list[str]:
|
||||
return [
|
||||
provider.value
|
||||
for provider in HostedTrialProvider
|
||||
if (
|
||||
getattr(dify_config, f"HOSTED_{provider.config_key}_PAID_ENABLED", False)
|
||||
and getattr(dify_config, f"HOSTED_{provider.config_key}_TRIAL_ENABLED", False)
|
||||
)
|
||||
]
|
||||
|
||||
@classmethod
|
||||
def _fulfill_params_from_env(cls, features: FeatureModel):
|
||||
features.can_replace_logo = dify_config.CAN_REPLACE_LOGO
|
||||
@@ -310,7 +324,7 @@ class FeatureService:
|
||||
features.next_credit_reset_date = billing_info["next_credit_reset_date"]
|
||||
|
||||
@classmethod
|
||||
def _fulfill_params_from_enterprise(cls, features: SystemFeatureModel):
|
||||
def _fulfill_params_from_enterprise(cls, features: SystemFeatureModel, is_authenticated: bool = False):
|
||||
enterprise_info = EnterpriseService.get_info()
|
||||
|
||||
if "SSOEnforcedForSignin" in enterprise_info:
|
||||
@@ -347,19 +361,14 @@ class FeatureService:
|
||||
)
|
||||
features.webapp_auth.sso_config.protocol = enterprise_info.get("SSOEnforcedForWebProtocol", "")
|
||||
|
||||
if "License" in enterprise_info:
|
||||
license_info = enterprise_info["License"]
|
||||
if is_authenticated and (license_info := enterprise_info.get("License")):
|
||||
features.license.status = LicenseStatus(license_info.get("status", LicenseStatus.INACTIVE))
|
||||
features.license.expired_at = license_info.get("expiredAt", "")
|
||||
|
||||
if "status" in license_info:
|
||||
features.license.status = LicenseStatus(license_info.get("status", LicenseStatus.INACTIVE))
|
||||
|
||||
if "expiredAt" in license_info:
|
||||
features.license.expired_at = license_info["expiredAt"]
|
||||
|
||||
if "workspaces" in license_info:
|
||||
features.license.workspaces.enabled = license_info["workspaces"]["enabled"]
|
||||
features.license.workspaces.limit = license_info["workspaces"]["limit"]
|
||||
features.license.workspaces.size = license_info["workspaces"]["used"]
|
||||
if workspaces_info := license_info.get("workspaces"):
|
||||
features.license.workspaces.enabled = workspaces_info.get("enabled", False)
|
||||
features.license.workspaces.limit = workspaces_info.get("limit", 0)
|
||||
features.license.workspaces.size = workspaces_info.get("used", 0)
|
||||
|
||||
if "PluginInstallationPermission" in enterprise_info:
|
||||
plugin_installation_info = enterprise_info["PluginInstallationPermission"]
|
||||
|
||||
@@ -261,10 +261,9 @@ class MessageService:
|
||||
else:
|
||||
conversation_override_model_configs = json.loads(conversation.override_model_configs)
|
||||
app_model_config = AppModelConfig(
|
||||
id=conversation.app_model_config_id,
|
||||
app_id=app_model.id,
|
||||
)
|
||||
|
||||
app_model_config.id = conversation.app_model_config_id
|
||||
app_model_config = app_model_config.from_model_config_dict(conversation_override_model_configs)
|
||||
if not app_model_config:
|
||||
raise ValueError("did not find app model config")
|
||||
|
||||
@@ -870,15 +870,16 @@ class RagPipelineDslService:
|
||||
return dependencies
|
||||
|
||||
@classmethod
|
||||
def get_leaked_dependencies(cls, tenant_id: str, dsl_dependencies: list[dict]) -> list[PluginDependency]:
|
||||
def get_leaked_dependencies(
|
||||
cls, tenant_id: str, dsl_dependencies: list[PluginDependency]
|
||||
) -> list[PluginDependency]:
|
||||
"""
|
||||
Returns the leaked dependencies in the current workspace
|
||||
"""
|
||||
dependencies = [PluginDependency.model_validate(dep) for dep in dsl_dependencies]
|
||||
if not dependencies:
|
||||
if not dsl_dependencies:
|
||||
return []
|
||||
|
||||
return DependenciesAnalysisService.get_leaked_dependencies(tenant_id=tenant_id, dependencies=dependencies)
|
||||
return DependenciesAnalysisService.get_leaked_dependencies(tenant_id=tenant_id, dependencies=dsl_dependencies)
|
||||
|
||||
def _generate_aes_key(self, tenant_id: str) -> bytes:
|
||||
"""Generate AES key based on tenant_id"""
|
||||
|
||||
@@ -44,7 +44,7 @@ class RagPipelineTransformService:
|
||||
doc_form = dataset.doc_form
|
||||
if not doc_form:
|
||||
return self._transform_to_empty_pipeline(dataset)
|
||||
retrieval_model = dataset.retrieval_model
|
||||
retrieval_model = RetrievalSetting.model_validate(dataset.retrieval_model) if dataset.retrieval_model else None
|
||||
pipeline_yaml = self._get_transform_yaml(doc_form, datasource_type, indexing_technique)
|
||||
# deal dependencies
|
||||
self._deal_dependencies(pipeline_yaml, dataset.tenant_id)
|
||||
@@ -154,7 +154,12 @@ class RagPipelineTransformService:
|
||||
return node
|
||||
|
||||
def _deal_knowledge_index(
|
||||
self, dataset: Dataset, doc_form: str, indexing_technique: str | None, retrieval_model: dict, node: dict
|
||||
self,
|
||||
dataset: Dataset,
|
||||
doc_form: str,
|
||||
indexing_technique: str | None,
|
||||
retrieval_model: RetrievalSetting | None,
|
||||
node: dict,
|
||||
):
|
||||
knowledge_configuration_dict = node.get("data", {})
|
||||
knowledge_configuration = KnowledgeConfiguration.model_validate(knowledge_configuration_dict)
|
||||
@@ -163,10 +168,9 @@ class RagPipelineTransformService:
|
||||
knowledge_configuration.embedding_model = dataset.embedding_model
|
||||
knowledge_configuration.embedding_model_provider = dataset.embedding_model_provider
|
||||
if retrieval_model:
|
||||
retrieval_setting = RetrievalSetting.model_validate(retrieval_model)
|
||||
if indexing_technique == "economy":
|
||||
retrieval_setting.search_method = RetrievalMethod.KEYWORD_SEARCH
|
||||
knowledge_configuration.retrieval_model = retrieval_setting
|
||||
retrieval_model.search_method = RetrievalMethod.KEYWORD_SEARCH
|
||||
knowledge_configuration.retrieval_model = retrieval_model
|
||||
else:
|
||||
dataset.retrieval_model = knowledge_configuration.retrieval_model.model_dump()
|
||||
|
||||
|
||||
@@ -0,0 +1 @@
|
||||
"""Workflow run retention services."""
|
||||
|
||||
@@ -0,0 +1,531 @@
|
||||
"""
|
||||
Archive Paid Plan Workflow Run Logs Service.
|
||||
|
||||
This service archives workflow run logs older than the configured retention
period (default: 90 days) to S3-compatible storage for paid-plan tenants.
|
||||
|
||||
Archived tables:
|
||||
- workflow_runs
|
||||
- workflow_app_logs
|
||||
- workflow_node_executions
|
||||
- workflow_node_execution_offload
|
||||
- workflow_pauses
|
||||
- workflow_pause_reasons
|
||||
- workflow_trigger_logs
|
||||
|
||||
"""
|
||||
|
||||
import datetime
|
||||
import io
|
||||
import json
|
||||
import logging
|
||||
import time
|
||||
import zipfile
|
||||
from collections.abc import Sequence
|
||||
from concurrent.futures import ThreadPoolExecutor
|
||||
from dataclasses import dataclass, field
|
||||
from typing import Any
|
||||
|
||||
import click
|
||||
from sqlalchemy import inspect
|
||||
from sqlalchemy.orm import Session, sessionmaker
|
||||
|
||||
from configs import dify_config
|
||||
from core.workflow.enums import WorkflowType
|
||||
from enums.cloud_plan import CloudPlan
|
||||
from extensions.ext_database import db
|
||||
from libs.archive_storage import (
|
||||
ArchiveStorage,
|
||||
ArchiveStorageNotConfiguredError,
|
||||
get_archive_storage,
|
||||
)
|
||||
from models.workflow import WorkflowAppLog, WorkflowRun
|
||||
from repositories.api_workflow_node_execution_repository import DifyAPIWorkflowNodeExecutionRepository
|
||||
from repositories.api_workflow_run_repository import APIWorkflowRunRepository
|
||||
from repositories.sqlalchemy_workflow_trigger_log_repository import SQLAlchemyWorkflowTriggerLogRepository
|
||||
from services.billing_service import BillingService
|
||||
from services.retention.workflow_run.constants import ARCHIVE_BUNDLE_NAME, ARCHIVE_SCHEMA_VERSION
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
@dataclass
|
||||
class TableStats:
|
||||
"""Statistics for a single archived table."""
|
||||
|
||||
table_name: str
|
||||
row_count: int
|
||||
checksum: str
|
||||
size_bytes: int
|
||||
|
||||
|
||||
@dataclass
|
||||
class ArchiveResult:
|
||||
"""Result of archiving a single workflow run."""
|
||||
|
||||
run_id: str
|
||||
tenant_id: str
|
||||
success: bool
|
||||
tables: list[TableStats] = field(default_factory=list)
|
||||
error: str | None = None
|
||||
elapsed_time: float = 0.0
|
||||
|
||||
|
||||
@dataclass
|
||||
class ArchiveSummary:
|
||||
"""Summary of the entire archive operation."""
|
||||
|
||||
total_runs_processed: int = 0
|
||||
runs_archived: int = 0
|
||||
runs_skipped: int = 0
|
||||
runs_failed: int = 0
|
||||
total_elapsed_time: float = 0.0
|
||||
|
||||
|
||||
class WorkflowRunArchiver:
|
||||
"""
|
||||
Archive workflow run logs for paid plan users.
|
||||
|
||||
Storage Layout:
|
||||
{tenant_id}/app_id={app_id}/year={YYYY}/month={MM}/workflow_run_id={run_id}/
|
||||
└── archive.v1.0.zip
|
||||
├── manifest.json
|
||||
├── workflow_runs.jsonl
|
||||
├── workflow_app_logs.jsonl
|
||||
├── workflow_node_executions.jsonl
|
||||
├── workflow_node_execution_offload.jsonl
|
||||
├── workflow_pauses.jsonl
|
||||
├── workflow_pause_reasons.jsonl
|
||||
└── workflow_trigger_logs.jsonl
|
||||
"""
|
||||
|
||||
ARCHIVED_TYPE = [
|
||||
WorkflowType.WORKFLOW,
|
||||
WorkflowType.RAG_PIPELINE,
|
||||
]
|
||||
ARCHIVED_TABLES = [
|
||||
"workflow_runs",
|
||||
"workflow_app_logs",
|
||||
"workflow_node_executions",
|
||||
"workflow_node_execution_offload",
|
||||
"workflow_pauses",
|
||||
"workflow_pause_reasons",
|
||||
"workflow_trigger_logs",
|
||||
]
|
||||
|
||||
start_from: datetime.datetime | None
|
||||
end_before: datetime.datetime
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
days: int = 90,
|
||||
batch_size: int = 100,
|
||||
start_from: datetime.datetime | None = None,
|
||||
end_before: datetime.datetime | None = None,
|
||||
workers: int = 1,
|
||||
tenant_ids: Sequence[str] | None = None,
|
||||
limit: int | None = None,
|
||||
dry_run: bool = False,
|
||||
delete_after_archive: bool = False,
|
||||
workflow_run_repo: APIWorkflowRunRepository | None = None,
|
||||
):
|
||||
"""
|
||||
Initialize the archiver.
|
||||
|
||||
Args:
|
||||
days: Archive runs older than this many days
|
||||
batch_size: Number of runs to process per batch
|
||||
start_from: Optional start time (inclusive) for archiving
|
||||
end_before: Optional end time (exclusive) for archiving
|
||||
workers: Number of concurrent workflow runs to archive
|
||||
tenant_ids: Optional tenant IDs for grayscale rollout
|
||||
limit: Maximum number of runs to archive (None for unlimited)
|
||||
dry_run: If True, only preview without making changes
|
||||
delete_after_archive: If True, delete runs and related data after archiving
|
||||
"""
|
||||
self.days = days
|
||||
self.batch_size = batch_size
|
||||
if start_from or end_before:
|
||||
if start_from is None or end_before is None:
|
||||
raise ValueError("start_from and end_before must be provided together")
|
||||
if start_from >= end_before:
|
||||
raise ValueError("start_from must be earlier than end_before")
|
||||
self.start_from = start_from.replace(tzinfo=datetime.UTC)
|
||||
self.end_before = end_before.replace(tzinfo=datetime.UTC)
|
||||
else:
|
||||
self.start_from = None
|
||||
self.end_before = datetime.datetime.now(datetime.UTC) - datetime.timedelta(days=days)
|
||||
if workers < 1:
|
||||
raise ValueError("workers must be at least 1")
|
||||
self.workers = workers
|
||||
self.tenant_ids = sorted(set(tenant_ids)) if tenant_ids else []
|
||||
self.limit = limit
|
||||
self.dry_run = dry_run
|
||||
self.delete_after_archive = delete_after_archive
|
||||
self.workflow_run_repo = workflow_run_repo
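# Illustrative sketch of driving the archiver defined above: a dry run reports what would
# be archived without uploading bundles or deleting rows.
example_archiver = WorkflowRunArchiver(
    days=90,                  # runs older than 90 days
    batch_size=100,
    workers=2,
    tenant_ids=["tenant-a"],  # limit the rollout to a single tenant
    dry_run=True,             # report only; nothing is written or deleted
)
example_summary = example_archiver.run()
# Passing start_from/end_before instead of days archives a fixed window; both bounds are
# required together and start_from must be earlier than end_before.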
|
||||
|
||||
def run(self) -> ArchiveSummary:
|
||||
"""
|
||||
Main archiving loop.
|
||||
|
||||
Returns:
|
||||
ArchiveSummary with statistics about the operation
|
||||
"""
|
||||
summary = ArchiveSummary()
|
||||
start_time = time.time()
|
||||
|
||||
click.echo(
|
||||
click.style(
|
||||
self._build_start_message(),
|
||||
fg="white",
|
||||
)
|
||||
)
|
||||
|
||||
# Initialize archive storage (will raise if not configured)
|
||||
try:
|
||||
if not self.dry_run:
|
||||
storage = get_archive_storage()
|
||||
else:
|
||||
storage = None
|
||||
except ArchiveStorageNotConfiguredError as e:
|
||||
click.echo(click.style(f"Archive storage not configured: {e}", fg="red"))
|
||||
return summary
|
||||
|
||||
session_maker = sessionmaker(bind=db.engine, expire_on_commit=False)
|
||||
repo = self._get_workflow_run_repo()
|
||||
|
||||
def _archive_with_session(run: WorkflowRun) -> ArchiveResult:
|
||||
with session_maker() as session:
|
||||
return self._archive_run(session, storage, run)
|
||||
|
||||
last_seen: tuple[datetime.datetime, str] | None = None
|
||||
archived_count = 0
|
||||
|
||||
with ThreadPoolExecutor(max_workers=self.workers) as executor:
|
||||
while True:
|
||||
# Check limit
|
||||
if self.limit and archived_count >= self.limit:
|
||||
click.echo(click.style(f"Reached limit of {self.limit} runs", fg="yellow"))
|
||||
break
|
||||
|
||||
# Fetch batch of runs
|
||||
runs = self._get_runs_batch(last_seen)
|
||||
|
||||
if not runs:
|
||||
break
|
||||
|
||||
run_ids = [run.id for run in runs]
|
||||
with session_maker() as session:
|
||||
archived_run_ids = repo.get_archived_run_ids(session, run_ids)
|
||||
|
||||
last_seen = (runs[-1].created_at, runs[-1].id)
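# Keyset-style cursor: remember the last (created_at, id) pair fetched so the next batch
# resumes after it, including runs that are skipped below (assumes
# get_runs_batch_by_time_range orders and filters by these two columns).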
|
||||
|
||||
# Filter to paid tenants only
|
||||
tenant_ids = {run.tenant_id for run in runs}
|
||||
paid_tenants = self._filter_paid_tenants(tenant_ids)
|
||||
|
||||
runs_to_process: list[WorkflowRun] = []
|
||||
for run in runs:
|
||||
summary.total_runs_processed += 1
|
||||
|
||||
# Skip non-paid tenants
|
||||
if run.tenant_id not in paid_tenants:
|
||||
summary.runs_skipped += 1
|
||||
continue
|
||||
|
||||
# Skip already archived runs
|
||||
if run.id in archived_run_ids:
|
||||
summary.runs_skipped += 1
|
||||
continue
|
||||
|
||||
# Check limit
|
||||
if self.limit and archived_count + len(runs_to_process) >= self.limit:
|
||||
break
|
||||
|
||||
runs_to_process.append(run)
|
||||
|
||||
if not runs_to_process:
|
||||
continue
|
||||
|
||||
results = list(executor.map(_archive_with_session, runs_to_process))
|
||||
|
||||
for run, result in zip(runs_to_process, results):
|
||||
if result.success:
|
||||
summary.runs_archived += 1
|
||||
archived_count += 1
|
||||
click.echo(
|
||||
click.style(
|
||||
f"{'[DRY RUN] Would archive' if self.dry_run else 'Archived'} "
|
||||
f"run {run.id} (tenant={run.tenant_id}, "
|
||||
f"tables={len(result.tables)}, time={result.elapsed_time:.2f}s)",
|
||||
fg="green",
|
||||
)
|
||||
)
|
||||
else:
|
||||
summary.runs_failed += 1
|
||||
click.echo(
|
||||
click.style(
|
||||
f"Failed to archive run {run.id}: {result.error}",
|
||||
fg="red",
|
||||
)
|
||||
)
|
||||
|
||||
summary.total_elapsed_time = time.time() - start_time
|
||||
click.echo(
|
||||
click.style(
|
||||
f"{'[DRY RUN] ' if self.dry_run else ''}Archive complete: "
|
||||
f"processed={summary.total_runs_processed}, archived={summary.runs_archived}, "
|
||||
f"skipped={summary.runs_skipped}, failed={summary.runs_failed}, "
|
||||
f"time={summary.total_elapsed_time:.2f}s",
|
||||
fg="white",
|
||||
)
|
||||
)
|
||||
|
||||
return summary
|
||||
|
||||
def _get_runs_batch(
|
||||
self,
|
||||
last_seen: tuple[datetime.datetime, str] | None,
|
||||
) -> Sequence[WorkflowRun]:
|
||||
"""Fetch a batch of workflow runs to archive."""
|
||||
repo = self._get_workflow_run_repo()
|
||||
return repo.get_runs_batch_by_time_range(
|
||||
start_from=self.start_from,
|
||||
end_before=self.end_before,
|
||||
last_seen=last_seen,
|
||||
batch_size=self.batch_size,
|
||||
run_types=self.ARCHIVED_TYPE,
|
||||
tenant_ids=self.tenant_ids or None,
|
||||
)
|
||||
|
||||
def _build_start_message(self) -> str:
|
||||
range_desc = f"before {self.end_before.isoformat()}"
|
||||
if self.start_from:
|
||||
range_desc = f"between {self.start_from.isoformat()} and {self.end_before.isoformat()}"
|
||||
return (
|
||||
f"{'[DRY RUN] ' if self.dry_run else ''}Starting workflow run archiving "
|
||||
f"for runs {range_desc} "
|
||||
f"(batch_size={self.batch_size}, tenant_ids={','.join(self.tenant_ids) or 'all'})"
|
||||
)
|
||||
|
||||
def _filter_paid_tenants(self, tenant_ids: set[str]) -> set[str]:
|
||||
"""Filter tenant IDs to only include paid tenants."""
|
||||
if not dify_config.BILLING_ENABLED:
|
||||
# If billing is not enabled, treat all tenants as paid
|
||||
return tenant_ids
|
||||
|
||||
if not tenant_ids:
|
||||
return set()
|
||||
|
||||
try:
|
||||
bulk_info = BillingService.get_plan_bulk_with_cache(list(tenant_ids))
|
||||
except Exception:
|
||||
logger.exception("Failed to fetch billing plans for tenants")
|
||||
# On error, skip all tenants in this batch
|
||||
return set()
|
||||
|
||||
# Filter to paid tenants (any plan except SANDBOX)
|
||||
paid = set()
|
||||
for tid, info in bulk_info.items():
|
||||
if info and info.get("plan") in (CloudPlan.PROFESSIONAL, CloudPlan.TEAM):
|
||||
paid.add(tid)
|
||||
|
||||
return paid
|
||||
|
||||
def _archive_run(
|
||||
self,
|
||||
session: Session,
|
||||
storage: ArchiveStorage | None,
|
||||
run: WorkflowRun,
|
||||
) -> ArchiveResult:
|
||||
"""Archive a single workflow run."""
|
||||
start_time = time.time()
|
||||
result = ArchiveResult(run_id=run.id, tenant_id=run.tenant_id, success=False)
|
||||
|
||||
try:
|
||||
# Extract data from all tables
|
||||
table_data, app_logs, trigger_metadata = self._extract_data(session, run)
|
||||
|
||||
if self.dry_run:
|
||||
# In dry run, just report what would be archived
|
||||
for table_name in self.ARCHIVED_TABLES:
|
||||
records = table_data.get(table_name, [])
|
||||
result.tables.append(
|
||||
TableStats(
|
||||
table_name=table_name,
|
||||
row_count=len(records),
|
||||
checksum="",
|
||||
size_bytes=0,
|
||||
)
|
||||
)
|
||||
result.success = True
|
||||
else:
|
||||
if storage is None:
|
||||
raise ArchiveStorageNotConfiguredError("Archive storage not configured")
|
||||
archive_key = self._get_archive_key(run)
|
||||
|
||||
# Serialize tables for the archive bundle
|
||||
table_stats: list[TableStats] = []
|
||||
table_payloads: dict[str, bytes] = {}
|
||||
for table_name in self.ARCHIVED_TABLES:
|
||||
records = table_data.get(table_name, [])
|
||||
data = ArchiveStorage.serialize_to_jsonl(records)
|
||||
table_payloads[table_name] = data
|
||||
checksum = ArchiveStorage.compute_checksum(data)
|
||||
|
||||
table_stats.append(
|
||||
TableStats(
|
||||
table_name=table_name,
|
||||
row_count=len(records),
|
||||
checksum=checksum,
|
||||
size_bytes=len(data),
|
||||
)
|
||||
)
|
||||
|
||||
# Generate and upload archive bundle
|
||||
manifest = self._generate_manifest(run, table_stats)
|
||||
manifest_data = json.dumps(manifest, indent=2, default=str).encode("utf-8")
|
||||
archive_data = self._build_archive_bundle(manifest_data, table_payloads)
|
||||
storage.put_object(archive_key, archive_data)
|
||||
|
||||
repo = self._get_workflow_run_repo()
|
||||
archived_log_count = repo.create_archive_logs(session, run, app_logs, trigger_metadata)
|
||||
session.commit()
|
||||
|
||||
deleted_counts = None
|
||||
if self.delete_after_archive:
|
||||
deleted_counts = repo.delete_runs_with_related(
|
||||
[run],
|
||||
delete_node_executions=self._delete_node_executions,
|
||||
delete_trigger_logs=self._delete_trigger_logs,
|
||||
)
|
||||
|
||||
logger.info(
|
||||
"Archived workflow run %s: tables=%s, archived_logs=%s, deleted=%s",
|
||||
run.id,
|
||||
{s.table_name: s.row_count for s in table_stats},
|
||||
archived_log_count,
|
||||
deleted_counts,
|
||||
)
|
||||
|
||||
result.tables = table_stats
|
||||
result.success = True
|
||||
|
||||
except Exception as e:
|
||||
logger.exception("Failed to archive workflow run %s", run.id)
|
||||
result.error = str(e)
|
||||
session.rollback()
|
||||
|
||||
result.elapsed_time = time.time() - start_time
|
||||
return result
|
||||
|
||||
def _extract_data(
|
||||
self,
|
||||
session: Session,
|
||||
run: WorkflowRun,
|
||||
) -> tuple[dict[str, list[dict[str, Any]]], Sequence[WorkflowAppLog], str | None]:
|
||||
table_data: dict[str, list[dict[str, Any]]] = {}
|
||||
table_data["workflow_runs"] = [self._row_to_dict(run)]
|
||||
repo = self._get_workflow_run_repo()
|
||||
app_logs = repo.get_app_logs_by_run_id(session, run.id)
|
||||
table_data["workflow_app_logs"] = [self._row_to_dict(row) for row in app_logs]
|
||||
node_exec_repo = self._get_workflow_node_execution_repo(session)
|
||||
node_exec_records = node_exec_repo.get_executions_by_workflow_run(
|
||||
tenant_id=run.tenant_id,
|
||||
app_id=run.app_id,
|
||||
workflow_run_id=run.id,
|
||||
)
|
||||
node_exec_ids = [record.id for record in node_exec_records]
|
||||
offload_records = node_exec_repo.get_offloads_by_execution_ids(session, node_exec_ids)
|
||||
table_data["workflow_node_executions"] = [self._row_to_dict(row) for row in node_exec_records]
|
||||
table_data["workflow_node_execution_offload"] = [self._row_to_dict(row) for row in offload_records]
|
||||
repo = self._get_workflow_run_repo()
|
||||
pause_records = repo.get_pause_records_by_run_id(session, run.id)
|
||||
pause_ids = [pause.id for pause in pause_records]
|
||||
pause_reason_records = repo.get_pause_reason_records_by_run_id(
|
||||
session,
|
||||
pause_ids,
|
||||
)
|
||||
table_data["workflow_pauses"] = [self._row_to_dict(row) for row in pause_records]
|
||||
table_data["workflow_pause_reasons"] = [self._row_to_dict(row) for row in pause_reason_records]
|
||||
trigger_repo = SQLAlchemyWorkflowTriggerLogRepository(session)
|
||||
trigger_records = trigger_repo.list_by_run_id(run.id)
|
||||
table_data["workflow_trigger_logs"] = [self._row_to_dict(row) for row in trigger_records]
|
||||
trigger_metadata = trigger_records[0].trigger_metadata if trigger_records else None
|
||||
return table_data, app_logs, trigger_metadata
|
||||
|
||||
@staticmethod
|
||||
def _row_to_dict(row: Any) -> dict[str, Any]:
|
||||
mapper = inspect(row).mapper
|
||||
return {str(column.name): getattr(row, mapper.get_property_by_column(column).key) for column in mapper.columns}
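# Round-trip sketch for the JSONL helpers applied to these row dicts (return types are
# assumed from their use in this module: serialize_to_jsonl -> bytes,
# deserialize_from_jsonl -> list of dicts, compute_checksum -> str):
example_rows = [{"id": "run-1", "status": "succeeded"}, {"id": "run-2", "status": "failed"}]
example_payload = ArchiveStorage.serialize_to_jsonl(example_rows)
example_checksum = ArchiveStorage.compute_checksum(example_payload)  # stored in the manifest
assert ArchiveStorage.deserialize_from_jsonl(example_payload) == example_rows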
|
||||
|
||||
def _get_archive_key(self, run: WorkflowRun) -> str:
|
||||
"""Get the storage key for the archive bundle."""
|
||||
created_at = run.created_at
|
||||
prefix = (
|
||||
f"{run.tenant_id}/app_id={run.app_id}/year={created_at.strftime('%Y')}/"
|
||||
f"month={created_at.strftime('%m')}/workflow_run_id={run.id}"
|
||||
)
|
||||
return f"{prefix}/{ARCHIVE_BUNDLE_NAME}"
|
||||
|
||||
def _generate_manifest(
|
||||
self,
|
||||
run: WorkflowRun,
|
||||
table_stats: list[TableStats],
|
||||
) -> dict[str, Any]:
|
||||
"""Generate a manifest for the archived workflow run."""
|
||||
return {
|
||||
"schema_version": ARCHIVE_SCHEMA_VERSION,
|
||||
"workflow_run_id": run.id,
|
||||
"tenant_id": run.tenant_id,
|
||||
"app_id": run.app_id,
|
||||
"workflow_id": run.workflow_id,
|
||||
"created_at": run.created_at.isoformat(),
|
||||
"archived_at": datetime.datetime.now(datetime.UTC).isoformat(),
|
||||
"tables": {
|
||||
stat.table_name: {
|
||||
"row_count": stat.row_count,
|
||||
"checksum": stat.checksum,
|
||||
"size_bytes": stat.size_bytes,
|
||||
}
|
||||
for stat in table_stats
|
||||
},
|
||||
}
|
||||
|
||||
def _build_archive_bundle(self, manifest_data: bytes, table_payloads: dict[str, bytes]) -> bytes:
|
||||
buffer = io.BytesIO()
|
||||
with zipfile.ZipFile(buffer, mode="w", compression=zipfile.ZIP_DEFLATED) as archive:
|
||||
archive.writestr("manifest.json", manifest_data)
|
||||
for table_name in self.ARCHIVED_TABLES:
|
||||
data = table_payloads.get(table_name)
|
||||
if data is None:
|
||||
raise ValueError(f"Missing archive payload for {table_name}")
|
||||
archive.writestr(f"{table_name}.jsonl", data)
|
||||
return buffer.getvalue()
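# Read-back sketch for a bundle produced above: open the zip, load manifest.json, and
# re-check each table payload against its manifest checksum (compute_checksum is the same
# helper used when writing, so the values should agree).
def _verify_archive_bundle(archive_data: bytes) -> dict[str, bool]:
    checks: dict[str, bool] = {}
    with zipfile.ZipFile(io.BytesIO(archive_data), mode="r") as archive:
        manifest = json.loads(archive.read("manifest.json").decode("utf-8"))
        for table_name, info in manifest.get("tables", {}).items():
            payload = archive.read(f"{table_name}.jsonl")
            checks[table_name] = ArchiveStorage.compute_checksum(payload) == info["checksum"]
    return checks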
|
||||
|
||||
def _delete_trigger_logs(self, session: Session, run_ids: Sequence[str]) -> int:
|
||||
trigger_repo = SQLAlchemyWorkflowTriggerLogRepository(session)
|
||||
return trigger_repo.delete_by_run_ids(run_ids)
|
||||
|
||||
def _delete_node_executions(self, session: Session, runs: Sequence[WorkflowRun]) -> tuple[int, int]:
|
||||
run_ids = [run.id for run in runs]
|
||||
return self._get_workflow_node_execution_repo(session).delete_by_runs(session, run_ids)
|
||||
|
||||
def _get_workflow_node_execution_repo(
|
||||
self,
|
||||
session: Session,
|
||||
) -> DifyAPIWorkflowNodeExecutionRepository:
|
||||
from repositories.factory import DifyAPIRepositoryFactory
|
||||
|
||||
session_maker = sessionmaker(bind=session.get_bind(), expire_on_commit=False)
|
||||
return DifyAPIRepositoryFactory.create_api_workflow_node_execution_repository(session_maker)
|
||||
|
||||
def _get_workflow_run_repo(self) -> APIWorkflowRunRepository:
|
||||
if self.workflow_run_repo is not None:
|
||||
return self.workflow_run_repo
|
||||
|
||||
from repositories.factory import DifyAPIRepositoryFactory
|
||||
|
||||
session_maker = sessionmaker(bind=db.engine, expire_on_commit=False)
|
||||
self.workflow_run_repo = DifyAPIRepositoryFactory.create_api_workflow_run_repository(session_maker)
|
||||
return self.workflow_run_repo
|
||||
api/services/retention/workflow_run/constants.py (new file)
@@ -0,0 +1,2 @@
|
||||
ARCHIVE_SCHEMA_VERSION = "1.0"
|
||||
ARCHIVE_BUNDLE_NAME = f"archive.v{ARCHIVE_SCHEMA_VERSION}.zip"
|
||||
@@ -0,0 +1,134 @@
|
||||
"""
|
||||
Delete Archived Workflow Run Service.
|
||||
|
||||
This service deletes archived workflow run data from the database while keeping
|
||||
archive logs intact.
|
||||
"""
|
||||
|
||||
import time
|
||||
from collections.abc import Sequence
|
||||
from dataclasses import dataclass, field
|
||||
from datetime import datetime
|
||||
|
||||
from sqlalchemy.orm import Session, sessionmaker
|
||||
|
||||
from extensions.ext_database import db
|
||||
from models.workflow import WorkflowRun
|
||||
from repositories.api_workflow_run_repository import APIWorkflowRunRepository
|
||||
from repositories.sqlalchemy_workflow_trigger_log_repository import SQLAlchemyWorkflowTriggerLogRepository
|
||||
|
||||
|
||||
@dataclass
|
||||
class DeleteResult:
|
||||
run_id: str
|
||||
tenant_id: str
|
||||
success: bool
|
||||
deleted_counts: dict[str, int] = field(default_factory=dict)
|
||||
error: str | None = None
|
||||
elapsed_time: float = 0.0
|
||||
|
||||
|
||||
class ArchivedWorkflowRunDeletion:
|
||||
def __init__(self, dry_run: bool = False):
|
||||
self.dry_run = dry_run
|
||||
self.workflow_run_repo: APIWorkflowRunRepository | None = None
|
||||
|
||||
def delete_by_run_id(self, run_id: str) -> DeleteResult:
|
||||
start_time = time.time()
|
||||
result = DeleteResult(run_id=run_id, tenant_id="", success=False)
|
||||
|
||||
repo = self._get_workflow_run_repo()
|
||||
session_maker = sessionmaker(bind=db.engine, expire_on_commit=False)
|
||||
with session_maker() as session:
|
||||
run = session.get(WorkflowRun, run_id)
|
||||
if not run:
|
||||
result.error = f"Workflow run {run_id} not found"
|
||||
result.elapsed_time = time.time() - start_time
|
||||
return result
|
||||
|
||||
result.tenant_id = run.tenant_id
|
||||
if not repo.get_archived_run_ids(session, [run.id]):
|
||||
result.error = f"Workflow run {run_id} is not archived"
|
||||
result.elapsed_time = time.time() - start_time
|
||||
return result
|
||||
|
||||
result = self._delete_run(run)
|
||||
result.elapsed_time = time.time() - start_time
|
||||
return result
|
||||
|
||||
def delete_batch(
|
||||
self,
|
||||
tenant_ids: list[str] | None,
|
||||
start_date: datetime,
|
||||
end_date: datetime,
|
||||
limit: int = 100,
|
||||
) -> list[DeleteResult]:
|
||||
session_maker = sessionmaker(bind=db.engine, expire_on_commit=False)
|
||||
results: list[DeleteResult] = []
|
||||
|
||||
repo = self._get_workflow_run_repo()
|
||||
with session_maker() as session:
|
||||
runs = list(
|
||||
repo.get_archived_runs_by_time_range(
|
||||
session=session,
|
||||
tenant_ids=tenant_ids,
|
||||
start_date=start_date,
|
||||
end_date=end_date,
|
||||
limit=limit,
|
||||
)
|
||||
)
|
||||
for run in runs:
|
||||
results.append(self._delete_run(run))
|
||||
|
||||
return results
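# Illustrative sketch: preview the deletion of archived runs from January 2024 across all
# tenants (dry_run reports the runs without deleting anything).
example_results = ArchivedWorkflowRunDeletion(dry_run=True).delete_batch(
    tenant_ids=None,
    start_date=datetime(2024, 1, 1),
    end_date=datetime(2024, 2, 1),
    limit=100,
)
example_failed = [r for r in example_results if not r.success]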
|
||||
|
||||
def _delete_run(self, run: WorkflowRun) -> DeleteResult:
|
||||
start_time = time.time()
|
||||
result = DeleteResult(run_id=run.id, tenant_id=run.tenant_id, success=False)
|
||||
if self.dry_run:
|
||||
result.success = True
|
||||
result.elapsed_time = time.time() - start_time
|
||||
return result
|
||||
|
||||
repo = self._get_workflow_run_repo()
|
||||
try:
|
||||
deleted_counts = repo.delete_runs_with_related(
|
||||
[run],
|
||||
delete_node_executions=self._delete_node_executions,
|
||||
delete_trigger_logs=self._delete_trigger_logs,
|
||||
)
|
||||
result.deleted_counts = deleted_counts
|
||||
result.success = True
|
||||
except Exception as e:
|
||||
result.error = str(e)
|
||||
result.elapsed_time = time.time() - start_time
|
||||
return result
|
||||
|
||||
@staticmethod
|
||||
def _delete_trigger_logs(session: Session, run_ids: Sequence[str]) -> int:
|
||||
trigger_repo = SQLAlchemyWorkflowTriggerLogRepository(session)
|
||||
return trigger_repo.delete_by_run_ids(run_ids)
|
||||
|
||||
@staticmethod
|
||||
def _delete_node_executions(
|
||||
session: Session,
|
||||
runs: Sequence[WorkflowRun],
|
||||
) -> tuple[int, int]:
|
||||
from repositories.factory import DifyAPIRepositoryFactory
|
||||
|
||||
run_ids = [run.id for run in runs]
|
||||
repo = DifyAPIRepositoryFactory.create_api_workflow_node_execution_repository(
|
||||
session_maker=sessionmaker(bind=session.get_bind(), expire_on_commit=False)
|
||||
)
|
||||
return repo.delete_by_runs(session, run_ids)
|
||||
|
||||
def _get_workflow_run_repo(self) -> APIWorkflowRunRepository:
|
||||
if self.workflow_run_repo is not None:
|
||||
return self.workflow_run_repo
|
||||
|
||||
from repositories.factory import DifyAPIRepositoryFactory
|
||||
|
||||
self.workflow_run_repo = DifyAPIRepositoryFactory.create_api_workflow_run_repository(
|
||||
sessionmaker(bind=db.engine, expire_on_commit=False)
|
||||
)
|
||||
return self.workflow_run_repo
|
||||
@@ -0,0 +1,481 @@
|
||||
"""
|
||||
Restore Archived Workflow Run Service.
|
||||
|
||||
This service restores archived workflow run data from S3-compatible storage
|
||||
back to the database.
|
||||
"""
|
||||
|
||||
import io
|
||||
import json
|
||||
import logging
|
||||
import time
|
||||
import zipfile
|
||||
from collections.abc import Callable
|
||||
from concurrent.futures import ThreadPoolExecutor
|
||||
from dataclasses import dataclass
|
||||
from datetime import datetime
|
||||
from typing import Any, cast
|
||||
|
||||
import click
|
||||
from sqlalchemy.dialects.postgresql import insert as pg_insert
|
||||
from sqlalchemy.engine import CursorResult
|
||||
from sqlalchemy.orm import DeclarativeBase, Session, sessionmaker
|
||||
|
||||
from extensions.ext_database import db
|
||||
from libs.archive_storage import (
|
||||
ArchiveStorage,
|
||||
ArchiveStorageNotConfiguredError,
|
||||
get_archive_storage,
|
||||
)
|
||||
from models.trigger import WorkflowTriggerLog
|
||||
from models.workflow import (
|
||||
WorkflowAppLog,
|
||||
WorkflowArchiveLog,
|
||||
WorkflowNodeExecutionModel,
|
||||
WorkflowNodeExecutionOffload,
|
||||
WorkflowPause,
|
||||
WorkflowPauseReason,
|
||||
WorkflowRun,
|
||||
)
|
||||
from repositories.api_workflow_run_repository import APIWorkflowRunRepository
|
||||
from repositories.factory import DifyAPIRepositoryFactory
|
||||
from services.retention.workflow_run.constants import ARCHIVE_BUNDLE_NAME
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
# Mapping of table names to SQLAlchemy models
|
||||
TABLE_MODELS = {
|
||||
"workflow_runs": WorkflowRun,
|
||||
"workflow_app_logs": WorkflowAppLog,
|
||||
"workflow_node_executions": WorkflowNodeExecutionModel,
|
||||
"workflow_node_execution_offload": WorkflowNodeExecutionOffload,
|
||||
"workflow_pauses": WorkflowPause,
|
||||
"workflow_pause_reasons": WorkflowPauseReason,
|
||||
"workflow_trigger_logs": WorkflowTriggerLog,
|
||||
}
|
||||
|
||||
SchemaMapper = Callable[[dict[str, Any]], dict[str, Any]]
|
||||
|
||||
SCHEMA_MAPPERS: dict[str, dict[str, SchemaMapper]] = {
|
||||
"1.0": {},
|
||||
}
|
||||
|
||||
|
||||
@dataclass
|
||||
class RestoreResult:
|
||||
"""Result of restoring a single workflow run."""
|
||||
|
||||
run_id: str
|
||||
tenant_id: str
|
||||
success: bool
|
||||
restored_counts: dict[str, int]
|
||||
error: str | None = None
|
||||
elapsed_time: float = 0.0
|
||||
|
||||
|
||||
class WorkflowRunRestore:
|
||||
"""
|
||||
Restore archived workflow run data from storage to database.
|
||||
|
||||
This service reads archived data from storage and restores it to the
|
||||
database tables. It handles idempotency by skipping records that already
|
||||
exist in the database.
|
||||
"""
|
||||
|
||||
def __init__(self, dry_run: bool = False, workers: int = 1):
|
||||
"""
|
||||
Initialize the restore service.
|
||||
|
||||
Args:
|
||||
dry_run: If True, only preview without making changes
|
||||
workers: Number of concurrent workflow runs to restore
|
||||
"""
|
||||
self.dry_run = dry_run
|
||||
if workers < 1:
|
||||
raise ValueError("workers must be at least 1")
|
||||
self.workers = workers
|
||||
self.workflow_run_repo: APIWorkflowRunRepository | None = None
|
||||
|
||||
def _restore_from_run(
|
||||
self,
|
||||
run: WorkflowRun | WorkflowArchiveLog,
|
||||
*,
|
||||
session_maker: sessionmaker,
|
||||
) -> RestoreResult:
|
||||
start_time = time.time()
|
||||
run_id = run.workflow_run_id if isinstance(run, WorkflowArchiveLog) else run.id
|
||||
created_at = run.run_created_at if isinstance(run, WorkflowArchiveLog) else run.created_at
|
||||
result = RestoreResult(
|
||||
run_id=run_id,
|
||||
tenant_id=run.tenant_id,
|
||||
success=False,
|
||||
restored_counts={},
|
||||
)
|
||||
|
||||
if not self.dry_run:
|
||||
click.echo(
|
||||
click.style(
|
||||
f"Starting restore for workflow run {run_id} (tenant={run.tenant_id})",
|
||||
fg="white",
|
||||
)
|
||||
)
|
||||
|
||||
try:
|
||||
storage = get_archive_storage()
|
||||
except ArchiveStorageNotConfiguredError as e:
|
||||
result.error = str(e)
|
||||
click.echo(click.style(f"Archive storage not configured: {e}", fg="red"))
|
||||
result.elapsed_time = time.time() - start_time
|
||||
return result
|
||||
|
||||
prefix = (
|
||||
f"{run.tenant_id}/app_id={run.app_id}/year={created_at.strftime('%Y')}/"
|
||||
f"month={created_at.strftime('%m')}/workflow_run_id={run_id}"
|
||||
)
|
||||
archive_key = f"{prefix}/{ARCHIVE_BUNDLE_NAME}"
|
||||
try:
|
||||
archive_data = storage.get_object(archive_key)
|
||||
except FileNotFoundError:
|
||||
result.error = f"Archive bundle not found: {archive_key}"
|
||||
click.echo(click.style(result.error, fg="red"))
|
||||
result.elapsed_time = time.time() - start_time
|
||||
return result
|
||||
|
||||
with session_maker() as session:
|
||||
try:
|
||||
with zipfile.ZipFile(io.BytesIO(archive_data), mode="r") as archive:
|
||||
try:
|
||||
manifest = self._load_manifest_from_zip(archive)
|
||||
except ValueError as e:
|
||||
result.error = f"Archive bundle invalid: {e}"
|
||||
click.echo(click.style(result.error, fg="red"))
|
||||
return result
|
||||
|
||||
tables = manifest.get("tables", {})
|
||||
schema_version = self._get_schema_version(manifest)
|
||||
for table_name, info in tables.items():
|
||||
row_count = info.get("row_count", 0)
|
||||
if row_count == 0:
|
||||
result.restored_counts[table_name] = 0
|
||||
continue
|
||||
|
||||
if self.dry_run:
|
||||
result.restored_counts[table_name] = row_count
|
||||
continue
|
||||
|
||||
member_path = f"{table_name}.jsonl"
|
||||
try:
|
||||
data = archive.read(member_path)
|
||||
except KeyError:
|
||||
click.echo(
|
||||
click.style(
|
||||
f" Warning: Table data not found in archive: {member_path}",
|
||||
fg="yellow",
|
||||
)
|
||||
)
|
||||
result.restored_counts[table_name] = 0
|
||||
continue
|
||||
|
||||
records = ArchiveStorage.deserialize_from_jsonl(data)
|
||||
restored = self._restore_table_records(
|
||||
session,
|
||||
table_name,
|
||||
records,
|
||||
schema_version=schema_version,
|
||||
)
|
||||
result.restored_counts[table_name] = restored
|
||||
if not self.dry_run:
|
||||
click.echo(
|
||||
click.style(
|
||||
f" Restored {restored}/{len(records)} records to {table_name}",
|
||||
fg="white",
|
||||
)
|
||||
)
|
||||
|
||||
# Verify row counts match manifest
|
||||
manifest_total = sum(info.get("row_count", 0) for info in tables.values())
|
||||
restored_total = sum(result.restored_counts.values())
|
||||
|
||||
if not self.dry_run:
|
||||
# Note: restored count might be less than manifest count if records already exist
|
||||
logger.info(
|
||||
"Restore verification: manifest_total=%d, restored_total=%d",
|
||||
manifest_total,
|
||||
restored_total,
|
||||
)
|
||||
|
||||
# Delete the archive log record after successful restore
|
||||
repo = self._get_workflow_run_repo()
|
||||
repo.delete_archive_log_by_run_id(session, run_id)
|
||||
|
||||
session.commit()
|
||||
|
||||
result.success = True
|
||||
if not self.dry_run:
|
||||
click.echo(
|
||||
click.style(
|
||||
f"Completed restore for workflow run {run_id}: restored={result.restored_counts}",
|
||||
fg="green",
|
||||
)
|
||||
)
|
||||
|
||||
except Exception as e:
|
||||
logger.exception("Failed to restore workflow run %s", run_id)
|
||||
result.error = str(e)
|
||||
session.rollback()
|
||||
click.echo(click.style(f"Restore failed: {e}", fg="red"))
|
||||
|
||||
result.elapsed_time = time.time() - start_time
|
||||
return result
|
||||
|
||||
def _get_workflow_run_repo(self) -> APIWorkflowRunRepository:
|
||||
if self.workflow_run_repo is not None:
|
||||
return self.workflow_run_repo
|
||||
|
||||
self.workflow_run_repo = DifyAPIRepositoryFactory.create_api_workflow_run_repository(
|
||||
sessionmaker(bind=db.engine, expire_on_commit=False)
|
||||
)
|
||||
return self.workflow_run_repo
|
||||
|
||||
@staticmethod
|
||||
def _load_manifest_from_zip(archive: zipfile.ZipFile) -> dict[str, Any]:
|
||||
try:
|
||||
data = archive.read("manifest.json")
|
||||
except KeyError as e:
|
||||
raise ValueError("manifest.json missing from archive bundle") from e
|
||||
return json.loads(data.decode("utf-8"))
|
||||
|
||||
def _restore_table_records(
|
||||
self,
|
||||
session: Session,
|
||||
table_name: str,
|
||||
records: list[dict[str, Any]],
|
||||
*,
|
||||
schema_version: str,
|
||||
) -> int:
|
||||
"""
|
||||
Restore records to a table.
|
||||
|
||||
Uses INSERT ... ON CONFLICT DO NOTHING for idempotency.
|
||||
|
||||
Args:
|
||||
session: Database session
|
||||
table_name: Name of the table
|
||||
records: List of record dictionaries
|
||||
schema_version: Archived schema version from manifest
|
||||
|
||||
Returns:
|
||||
Number of records actually inserted
|
||||
"""
|
||||
if not records:
|
||||
return 0
|
||||
|
||||
model = TABLE_MODELS.get(table_name)
|
||||
if not model:
|
||||
logger.warning("Unknown table: %s", table_name)
|
||||
return 0
|
||||
|
||||
column_names, required_columns, non_nullable_with_default = self._get_model_column_info(model)
|
||||
unknown_fields: set[str] = set()
|
||||
|
||||
# Apply schema mapping, filter to current columns, then convert datetimes
|
||||
converted_records = []
|
||||
for record in records:
|
||||
mapped = self._apply_schema_mapping(table_name, schema_version, record)
|
||||
unknown_fields.update(set(mapped.keys()) - column_names)
|
||||
filtered = {key: value for key, value in mapped.items() if key in column_names}
|
||||
for key in non_nullable_with_default:
|
||||
if key in filtered and filtered[key] is None:
|
||||
filtered.pop(key)
|
||||
missing_required = [key for key in required_columns if key not in filtered or filtered.get(key) is None]
|
||||
if missing_required:
|
||||
missing_cols = ", ".join(sorted(missing_required))
|
||||
raise ValueError(
|
||||
f"Missing required columns for {table_name} (schema_version={schema_version}): {missing_cols}"
|
||||
)
|
||||
converted = self._convert_datetime_fields(filtered, model)
|
||||
converted_records.append(converted)
|
||||
if unknown_fields:
|
||||
logger.warning(
|
||||
"Dropped unknown columns for %s (schema_version=%s): %s",
|
||||
table_name,
|
||||
schema_version,
|
||||
", ".join(sorted(unknown_fields)),
|
||||
)
|
||||
|
||||
# Use INSERT ... ON CONFLICT DO NOTHING for idempotency
|
||||
stmt = pg_insert(model).values(converted_records)
|
||||
stmt = stmt.on_conflict_do_nothing(index_elements=["id"])
|
||||
|
||||
result = session.execute(stmt)
|
||||
return cast(CursorResult, result).rowcount or 0
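# Standalone sketch of the idempotent insert used above: with ON CONFLICT ("id") DO NOTHING,
# replaying the same rows reports 0 affected rows on the second execution.
def _example_idempotent_insert(session: Session, rows: list[dict[str, Any]]) -> int:
    stmt = pg_insert(WorkflowAppLog).values(rows).on_conflict_do_nothing(index_elements=["id"])
    return session.execute(stmt).rowcount or 0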
|
||||

    def _convert_datetime_fields(
        self,
        record: dict[str, Any],
        model: type[DeclarativeBase] | Any,
    ) -> dict[str, Any]:
        """Convert ISO datetime strings to datetime objects."""
        from sqlalchemy import DateTime

        result = dict(record)

        for column in model.__table__.columns:
            if isinstance(column.type, DateTime):
                value = result.get(column.key)
                if isinstance(value, str):
                    try:
                        result[column.key] = datetime.fromisoformat(value)
                    except ValueError:
                        pass

        return result

    def _get_schema_version(self, manifest: dict[str, Any]) -> str:
        schema_version = manifest.get("schema_version")
        if not schema_version:
            logger.warning("Manifest missing schema_version; defaulting to 1.0")
            schema_version = "1.0"
        schema_version = str(schema_version)
        if schema_version not in SCHEMA_MAPPERS:
            raise ValueError(f"Unsupported schema_version {schema_version}. Add a mapping before restoring.")
        return schema_version

    def _apply_schema_mapping(
        self,
        table_name: str,
        schema_version: str,
        record: dict[str, Any],
    ) -> dict[str, Any]:
        # Keep hook for forward/backward compatibility when schema evolves.
        mapper = SCHEMA_MAPPERS.get(schema_version, {}).get(table_name)
        if mapper is None:
            return dict(record)
        return mapper(record)
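
SCHEMA_MAPPERS, validated in _get_schema_version and consumed in _apply_schema_mapping, is likewise defined elsewhere in this file. A minimal sketch of the shape it needs, with a hypothetical column rename for an assumed future schema version (all names below are illustrative):

# Illustrative sketch only: schema_version -> table_name -> mapper callable.
def _rename_legacy_created_at(record: dict[str, Any]) -> dict[str, Any]:
    mapped = dict(record)
    if "created_at_utc" in mapped:  # hypothetical legacy column name
        mapped["created_at"] = mapped.pop("created_at_utc")
    return mapped

SCHEMA_MAPPERS: dict[str, dict[str, Any]] = {
    "1.0": {},  # identity mapping for every table
    "1.1": {"workflow_runs": _rename_legacy_created_at},  # assumed future version
}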

    def _get_model_column_info(
        self,
        model: type[DeclarativeBase] | Any,
    ) -> tuple[set[str], set[str], set[str]]:
        columns = list(model.__table__.columns)
        column_names = {column.key for column in columns}
        required_columns = {
            column.key
            for column in columns
            if not column.nullable
            and column.default is None
            and column.server_default is None
            and not column.autoincrement
        }
        non_nullable_with_default = {
            column.key
            for column in columns
            if not column.nullable
            and (column.default is not None or column.server_default is not None or column.autoincrement)
        }
        return column_names, required_columns, non_nullable_with_default

    def restore_batch(
        self,
        tenant_ids: list[str] | None,
        start_date: datetime,
        end_date: datetime,
        limit: int = 100,
    ) -> list[RestoreResult]:
        """
        Restore multiple workflow runs by time range.

        Args:
            tenant_ids: Optional tenant IDs
            start_date: Start date filter
            end_date: End date filter
            limit: Maximum number of runs to restore (default: 100)

        Returns:
            List of RestoreResult objects
        """
        results: list[RestoreResult] = []
        if tenant_ids is not None and not tenant_ids:
            return results
        session_maker = sessionmaker(bind=db.engine, expire_on_commit=False)
        repo = self._get_workflow_run_repo()

        with session_maker() as session:
            archive_logs = repo.get_archived_logs_by_time_range(
                session=session,
                tenant_ids=tenant_ids,
                start_date=start_date,
                end_date=end_date,
                limit=limit,
            )

        click.echo(
            click.style(
                f"Found {len(archive_logs)} archived workflow runs to restore",
                fg="white",
            )
        )

        def _restore_with_session(archive_log: WorkflowArchiveLog) -> RestoreResult:
            return self._restore_from_run(
                archive_log,
                session_maker=session_maker,
            )

        with ThreadPoolExecutor(max_workers=self.workers) as executor:
            results = list(executor.map(_restore_with_session, archive_logs))

        total_counts: dict[str, int] = {}
        for result in results:
            for table_name, count in result.restored_counts.items():
                total_counts[table_name] = total_counts.get(table_name, 0) + count
        success_count = sum(1 for result in results if result.success)

        if self.dry_run:
            click.echo(
                click.style(
                    f"[DRY RUN] Would restore {len(results)} workflow runs: totals={total_counts}",
                    fg="yellow",
                )
            )
        else:
            click.echo(
                click.style(
                    f"Restored {success_count}/{len(results)} workflow runs: totals={total_counts}",
                    fg="green",
                )
            )

        return results

    def restore_by_run_id(
        self,
        run_id: str,
    ) -> RestoreResult:
        """
        Restore a single workflow run by run ID.
        """
        repo = self._get_workflow_run_repo()
        archive_log = repo.get_archived_log_by_run_id(run_id)

        if not archive_log:
            click.echo(click.style(f"Workflow run archive {run_id} not found", fg="red"))
            return RestoreResult(
                run_id=run_id,
                tenant_id="",
                success=False,
                restored_counts={},
                error=f"Workflow run archive {run_id} not found",
            )

        session_maker = sessionmaker(bind=db.engine, expire_on_commit=False)
        result = self._restore_from_run(archive_log, session_maker=session_maker)
        if self.dry_run and result.success:
            click.echo(
                click.style(
                    f"[DRY RUN] Would restore workflow run {run_id}: totals={result.restored_counts}",
                    fg="yellow",
                )
            )
        return result
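
A short usage sketch of the restore methods above. The enclosing service class name and its constructor arguments are assumptions inferred from the attributes the methods use (self.workers, self.dry_run); only restore_batch, restore_by_run_id, and the RestoreResult fields come from the code itself:

# Hypothetical driver, e.g. behind a Flask CLI command; names are assumptions.
from datetime import datetime, timedelta

service = WorkflowRunRestoreService(workers=4, dry_run=True)  # assumed class/constructor
results = service.restore_batch(
    tenant_ids=None,  # None means no tenant filter
    start_date=datetime.utcnow() - timedelta(days=30),
    end_date=datetime.utcnow(),
    limit=100,
)
failed_run_ids = [r.run_id for r in results if not r.success]
# Re-running is safe: ON CONFLICT DO NOTHING keeps each table restore idempotent.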
@@ -7,7 +7,7 @@ from sqlalchemy import and_, func, or_, select
from sqlalchemy.orm import Session

from core.workflow.enums import WorkflowExecutionStatus
from models import Account, App, EndUser, WorkflowAppLog, WorkflowRun
from models import Account, App, EndUser, WorkflowAppLog, WorkflowArchiveLog, WorkflowRun
from models.enums import AppTriggerType, CreatorUserRole
from models.trigger import WorkflowTriggerLog
from services.plugin.plugin_service import PluginService
@@ -173,7 +173,80 @@ class WorkflowAppService:
            "data": items,
        }

    def handle_trigger_metadata(self, tenant_id: str, meta_val: str) -> dict[str, Any]:
    def get_paginate_workflow_archive_logs(
        self,
        *,
        session: Session,
        app_model: App,
        page: int = 1,
        limit: int = 20,
    ):
        """
        Get paginate workflow archive logs using SQLAlchemy 2.0 style.
        """
        stmt = select(WorkflowArchiveLog).where(
            WorkflowArchiveLog.tenant_id == app_model.tenant_id,
            WorkflowArchiveLog.app_id == app_model.id,
            WorkflowArchiveLog.log_id.isnot(None),
        )

        stmt = stmt.order_by(WorkflowArchiveLog.run_created_at.desc())

        count_stmt = select(func.count()).select_from(stmt.subquery())
        total = session.scalar(count_stmt) or 0

        offset_stmt = stmt.offset((page - 1) * limit).limit(limit)

        logs = list(session.scalars(offset_stmt).all())
        account_ids = {log.created_by for log in logs if log.created_by_role == CreatorUserRole.ACCOUNT}
        end_user_ids = {log.created_by for log in logs if log.created_by_role == CreatorUserRole.END_USER}

        accounts_by_id = {}
        if account_ids:
            accounts_by_id = {
                account.id: account
                for account in session.scalars(select(Account).where(Account.id.in_(account_ids))).all()
            }

        end_users_by_id = {}
        if end_user_ids:
            end_users_by_id = {
                end_user.id: end_user
                for end_user in session.scalars(select(EndUser).where(EndUser.id.in_(end_user_ids))).all()
            }

        items = []
        for log in logs:
            if log.created_by_role == CreatorUserRole.ACCOUNT:
                created_by_account = accounts_by_id.get(log.created_by)
                created_by_end_user = None
            elif log.created_by_role == CreatorUserRole.END_USER:
                created_by_account = None
                created_by_end_user = end_users_by_id.get(log.created_by)
            else:
                created_by_account = None
                created_by_end_user = None

            items.append(
                {
                    "id": log.id,
                    "workflow_run": log.workflow_run_summary,
                    "trigger_metadata": self.handle_trigger_metadata(app_model.tenant_id, log.trigger_metadata),
                    "created_by_account": created_by_account,
                    "created_by_end_user": created_by_end_user,
                    "created_at": log.log_created_at,
                }
            )

        return {
            "page": page,
            "limit": limit,
            "total": total,
            "has_more": total > page * limit,
            "data": items,
        }

    def handle_trigger_metadata(self, tenant_id: str, meta_val: str | None) -> dict[str, Any]:
        metadata: dict[str, Any] | None = self._safe_json_loads(meta_val)
        if not metadata:
            return {}
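
A brief usage sketch for the new get_paginate_workflow_archive_logs helper, assuming the caller already holds an App row and opens a session the way other call sites in this diff do (the controller wiring shown is an assumption, not part of the change):

# Hypothetical call site; only the method signature comes from the diff above.
with sessionmaker(bind=db.engine, expire_on_commit=False)() as session:
    page_data = workflow_app_service.get_paginate_workflow_archive_logs(
        session=session,
        app_model=app_model,  # an App instance loaded by the caller
        page=1,
        limit=20,
    )
# page_data["has_more"] is True while total > page * limit.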
@@ -17,7 +17,7 @@ logger = logging.getLogger(__name__)

RETRY_TIMES_OF_ONE_PLUGIN_IN_ONE_TENANT = 3
CACHE_REDIS_KEY_PREFIX = "plugin_autoupgrade_check_task:cached_plugin_manifests:"
CACHE_REDIS_TTL = 60 * 15  # 15 minutes
CACHE_REDIS_TTL = 60 * 60  # 1 hour


def _get_redis_cache_key(plugin_id: str) -> str:
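
The write side of this cache is not part of the hunk; presumably the task stores each fetched manifest under the key helper above with the new one-hour TTL, along these lines (redis_client usage and variable names here are assumptions):

# Sketch only: cache a fetched plugin manifest for CACHE_REDIS_TTL seconds.
redis_client.setex(_get_redis_cache_key(plugin_id), CACHE_REDIS_TTL, json.dumps(manifest))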
@@ -11,8 +11,10 @@ from sqlalchemy.engine import CursorResult
from sqlalchemy.exc import SQLAlchemyError
from sqlalchemy.orm import sessionmaker

from configs import dify_config
from core.db.session_factory import session_factory
from extensions.ext_database import db
from libs.archive_storage import ArchiveStorageNotConfiguredError, get_archive_storage
from models import (
    ApiToken,
    AppAnnotationHitHistory,
@@ -43,6 +45,7 @@ from models.workflow import (
    ConversationVariable,
    Workflow,
    WorkflowAppLog,
    WorkflowArchiveLog,
)
from repositories.factory import DifyAPIRepositoryFactory

@@ -67,6 +70,9 @@ def remove_app_and_related_data_task(self, tenant_id: str, app_id: str):
    _delete_app_workflow_runs(tenant_id, app_id)
    _delete_app_workflow_node_executions(tenant_id, app_id)
    _delete_app_workflow_app_logs(tenant_id, app_id)
    if dify_config.BILLING_ENABLED and dify_config.ARCHIVE_STORAGE_ENABLED:
        _delete_app_workflow_archive_logs(tenant_id, app_id)
        _delete_archived_workflow_run_files(tenant_id, app_id)
    _delete_app_conversations(tenant_id, app_id)
    _delete_app_messages(tenant_id, app_id)
    _delete_workflow_tool_providers(tenant_id, app_id)
@@ -252,6 +258,45 @@ def _delete_app_workflow_app_logs(tenant_id: str, app_id: str):
    )


def _delete_app_workflow_archive_logs(tenant_id: str, app_id: str):
    def del_workflow_archive_log(workflow_archive_log_id: str):
        db.session.query(WorkflowArchiveLog).where(WorkflowArchiveLog.id == workflow_archive_log_id).delete(
            synchronize_session=False
        )

    _delete_records(
        """select id from workflow_archive_logs where tenant_id=:tenant_id and app_id=:app_id limit 1000""",
        {"tenant_id": tenant_id, "app_id": app_id},
        del_workflow_archive_log,
        "workflow archive log",
    )


def _delete_archived_workflow_run_files(tenant_id: str, app_id: str):
    prefix = f"{tenant_id}/app_id={app_id}/"
    try:
        archive_storage = get_archive_storage()
    except ArchiveStorageNotConfiguredError as e:
        logger.info("Archive storage not configured, skipping archive file cleanup: %s", e)
        return

    try:
        keys = archive_storage.list_objects(prefix)
    except Exception:
        logger.exception("Failed to list archive files for app %s", app_id)
        return

    deleted = 0
    for key in keys:
        try:
            archive_storage.delete_object(key)
            deleted += 1
        except Exception:
            logger.exception("Failed to delete archive object %s", key)

    logger.info("Deleted %s archive objects for app %s", deleted, app_id)


def _delete_app_conversations(tenant_id: str, app_id: str):
    def del_conversation(session, conversation_id: str):
        session.query(PinnedConversation).where(PinnedConversation.conversation_id == conversation_id).delete(
@@ -172,7 +172,6 @@ class TestAgentService:
|
||||
|
||||
# Create app model config
|
||||
app_model_config = AppModelConfig(
|
||||
id=fake.uuid4(),
|
||||
app_id=app.id,
|
||||
provider="openai",
|
||||
model_id="gpt-3.5-turbo",
|
||||
@@ -180,6 +179,7 @@ class TestAgentService:
|
||||
model="gpt-3.5-turbo",
|
||||
agent_mode=json.dumps({"enabled": True, "strategy": "react", "tools": []}),
|
||||
)
|
||||
app_model_config.id = fake.uuid4()
|
||||
db.session.add(app_model_config)
|
||||
db.session.commit()
|
||||
|
||||
@@ -413,7 +413,6 @@ class TestAgentService:
|
||||
|
||||
# Create app model config
|
||||
app_model_config = AppModelConfig(
|
||||
id=fake.uuid4(),
|
||||
app_id=app.id,
|
||||
provider="openai",
|
||||
model_id="gpt-3.5-turbo",
|
||||
@@ -421,6 +420,7 @@ class TestAgentService:
|
||||
model="gpt-3.5-turbo",
|
||||
agent_mode=json.dumps({"enabled": True, "strategy": "react", "tools": []}),
|
||||
)
|
||||
app_model_config.id = fake.uuid4()
|
||||
db.session.add(app_model_config)
|
||||
db.session.commit()
|
||||
|
||||
@@ -485,7 +485,6 @@ class TestAgentService:
|
||||
|
||||
# Create app model config
|
||||
app_model_config = AppModelConfig(
|
||||
id=fake.uuid4(),
|
||||
app_id=app.id,
|
||||
provider="openai",
|
||||
model_id="gpt-3.5-turbo",
|
||||
@@ -493,6 +492,7 @@ class TestAgentService:
|
||||
model="gpt-3.5-turbo",
|
||||
agent_mode=json.dumps({"enabled": True, "strategy": "react", "tools": []}),
|
||||
)
|
||||
app_model_config.id = fake.uuid4()
|
||||
db.session.add(app_model_config)
|
||||
db.session.commit()
|
||||
|
||||
|
||||
@@ -226,26 +226,27 @@ class TestAppDslService:
|
||||
app, account = self._create_test_app_and_account(db_session_with_containers, mock_external_service_dependencies)
|
||||
|
||||
# Create model config for the app
|
||||
model_config = AppModelConfig()
|
||||
model_config.id = fake.uuid4()
|
||||
model_config.app_id = app.id
|
||||
model_config.provider = "openai"
|
||||
model_config.model_id = "gpt-3.5-turbo"
|
||||
model_config.model = json.dumps(
|
||||
{
|
||||
"provider": "openai",
|
||||
"name": "gpt-3.5-turbo",
|
||||
"mode": "chat",
|
||||
"completion_params": {
|
||||
"max_tokens": 1000,
|
||||
"temperature": 0.7,
|
||||
},
|
||||
}
|
||||
model_config = AppModelConfig(
|
||||
app_id=app.id,
|
||||
provider="openai",
|
||||
model_id="gpt-3.5-turbo",
|
||||
model=json.dumps(
|
||||
{
|
||||
"provider": "openai",
|
||||
"name": "gpt-3.5-turbo",
|
||||
"mode": "chat",
|
||||
"completion_params": {
|
||||
"max_tokens": 1000,
|
||||
"temperature": 0.7,
|
||||
},
|
||||
}
|
||||
),
|
||||
pre_prompt="You are a helpful assistant.",
|
||||
prompt_type="simple",
|
||||
created_by=account.id,
|
||||
updated_by=account.id,
|
||||
)
|
||||
model_config.pre_prompt = "You are a helpful assistant."
|
||||
model_config.prompt_type = "simple"
|
||||
model_config.created_by = account.id
|
||||
model_config.updated_by = account.id
|
||||
model_config.id = fake.uuid4()
|
||||
|
||||
# Set the app_model_config_id to link the config
|
||||
app.app_model_config_id = model_config.id
|
||||
|
||||
@@ -4,7 +4,13 @@ import pytest
|
||||
from faker import Faker
|
||||
|
||||
from enums.cloud_plan import CloudPlan
|
||||
from services.feature_service import FeatureModel, FeatureService, KnowledgeRateLimitModel, SystemFeatureModel
|
||||
from services.feature_service import (
|
||||
FeatureModel,
|
||||
FeatureService,
|
||||
KnowledgeRateLimitModel,
|
||||
LicenseStatus,
|
||||
SystemFeatureModel,
|
||||
)
|
||||
|
||||
|
||||
class TestFeatureService:
|
||||
@@ -274,7 +280,7 @@ class TestFeatureService:
|
||||
mock_config.PLUGIN_MAX_PACKAGE_SIZE = 100
|
||||
|
||||
# Act: Execute the method under test
|
||||
result = FeatureService.get_system_features()
|
||||
result = FeatureService.get_system_features(is_authenticated=True)
|
||||
|
||||
# Assert: Verify the expected outcomes
|
||||
assert result is not None
|
||||
@@ -324,6 +330,61 @@ class TestFeatureService:
|
||||
# Verify mock interactions
|
||||
mock_external_service_dependencies["enterprise_service"].get_info.assert_called_once()
|
||||
|
||||
def test_get_system_features_unauthenticated(self, db_session_with_containers, mock_external_service_dependencies):
|
||||
"""
|
||||
Test system features retrieval for an unauthenticated user.
|
||||
|
||||
This test verifies that:
|
||||
- The response payload is minimized (e.g., verbose license details are excluded).
|
||||
- Essential UI configuration (Branding, SSO, Marketplace) remains available.
|
||||
- The response structure adheres to the public schema for unauthenticated clients.
|
||||
"""
|
||||
# Arrange: Setup test data with exact same config as success test
|
||||
with patch("services.feature_service.dify_config") as mock_config:
|
||||
mock_config.ENTERPRISE_ENABLED = True
|
||||
mock_config.MARKETPLACE_ENABLED = True
|
||||
mock_config.ENABLE_EMAIL_CODE_LOGIN = True
|
||||
mock_config.ENABLE_EMAIL_PASSWORD_LOGIN = True
|
||||
mock_config.ENABLE_SOCIAL_OAUTH_LOGIN = False
|
||||
mock_config.ALLOW_REGISTER = False
|
||||
mock_config.ALLOW_CREATE_WORKSPACE = False
|
||||
mock_config.MAIL_TYPE = "smtp"
|
||||
mock_config.PLUGIN_MAX_PACKAGE_SIZE = 100
|
||||
|
||||
# Act: Execute with is_authenticated=False
|
||||
result = FeatureService.get_system_features(is_authenticated=False)
|
||||
|
||||
# Assert: Basic structure
|
||||
assert result is not None
|
||||
assert isinstance(result, SystemFeatureModel)
|
||||
|
||||
# --- 1. Verify Response Payload Optimization (Data Minimization) ---
|
||||
# Ensure only essential UI flags are returned to unauthenticated clients
|
||||
# to keep the payload lightweight and adhere to architectural boundaries.
|
||||
assert result.license.status == LicenseStatus.NONE
|
||||
assert result.license.expired_at == ""
|
||||
assert result.license.workspaces.enabled is False
|
||||
assert result.license.workspaces.limit == 0
|
||||
assert result.license.workspaces.size == 0
|
||||
|
||||
# --- 2. Verify Public UI Configuration Availability ---
|
||||
# Ensure that data required for frontend rendering remains accessible.
|
||||
|
||||
# Branding should match the mock data
|
||||
assert result.branding.enabled is True
|
||||
assert result.branding.application_title == "Test Enterprise"
|
||||
assert result.branding.login_page_logo == "https://example.com/logo.png"
|
||||
|
||||
# SSO settings should be visible for login page rendering
|
||||
assert result.sso_enforced_for_signin is True
|
||||
assert result.sso_enforced_for_signin_protocol == "saml"
|
||||
|
||||
# General auth settings should be visible
|
||||
assert result.enable_email_code_login is True
|
||||
|
||||
# Marketplace should be visible
|
||||
assert result.enable_marketplace is True
|
||||
|
||||
def test_get_system_features_basic_config(self, db_session_with_containers, mock_external_service_dependencies):
|
||||
"""
|
||||
Test system features retrieval with basic configuration (no enterprise).
|
||||
@@ -1031,7 +1092,7 @@ class TestFeatureService:
|
||||
}
|
||||
|
||||
# Act: Execute the method under test
|
||||
result = FeatureService.get_system_features()
|
||||
result = FeatureService.get_system_features(is_authenticated=True)
|
||||
|
||||
# Assert: Verify the expected outcomes
|
||||
assert result is not None
|
||||
@@ -1400,7 +1461,7 @@ class TestFeatureService:
|
||||
}
|
||||
|
||||
# Act: Execute the method under test
|
||||
result = FeatureService.get_system_features()
|
||||
result = FeatureService.get_system_features(is_authenticated=True)
|
||||
|
||||
# Assert: Verify the expected outcomes
|
||||
assert result is not None
|
||||
|
||||
@@ -925,24 +925,24 @@ class TestWorkflowService:
|
||||
# Create app model config (required for conversion)
|
||||
from models.model import AppModelConfig
|
||||
|
||||
app_model_config = AppModelConfig()
|
||||
app_model_config.id = fake.uuid4()
|
||||
app_model_config.app_id = app.id
|
||||
app_model_config.tenant_id = app.tenant_id
|
||||
app_model_config.provider = "openai"
|
||||
app_model_config.model_id = "gpt-3.5-turbo"
|
||||
# Set the model field directly - this is what model_dict property returns
|
||||
app_model_config.model = json.dumps(
|
||||
{
|
||||
"provider": "openai",
|
||||
"name": "gpt-3.5-turbo",
|
||||
"completion_params": {"max_tokens": 1000, "temperature": 0.7},
|
||||
}
|
||||
app_model_config = AppModelConfig(
|
||||
app_id=app.id,
|
||||
provider="openai",
|
||||
model_id="gpt-3.5-turbo",
|
||||
# Set the model field directly - this is what model_dict property returns
|
||||
model=json.dumps(
|
||||
{
|
||||
"provider": "openai",
|
||||
"name": "gpt-3.5-turbo",
|
||||
"completion_params": {"max_tokens": 1000, "temperature": 0.7},
|
||||
}
|
||||
),
|
||||
# Set pre_prompt for PromptTemplateConfigManager
|
||||
pre_prompt="You are a helpful assistant.",
|
||||
created_by=account.id,
|
||||
updated_by=account.id,
|
||||
)
|
||||
# Set pre_prompt for PromptTemplateConfigManager
|
||||
app_model_config.pre_prompt = "You are a helpful assistant."
|
||||
app_model_config.created_by = account.id
|
||||
app_model_config.updated_by = account.id
|
||||
app_model_config.id = fake.uuid4()
|
||||
|
||||
from extensions.ext_database import db
|
||||
|
||||
@@ -987,24 +987,24 @@ class TestWorkflowService:
|
||||
# Create app model config (required for conversion)
|
||||
from models.model import AppModelConfig
|
||||
|
||||
app_model_config = AppModelConfig()
|
||||
app_model_config.id = fake.uuid4()
|
||||
app_model_config.app_id = app.id
|
||||
app_model_config.tenant_id = app.tenant_id
|
||||
app_model_config.provider = "openai"
|
||||
app_model_config.model_id = "gpt-3.5-turbo"
|
||||
# Set the model field directly - this is what model_dict property returns
|
||||
app_model_config.model = json.dumps(
|
||||
{
|
||||
"provider": "openai",
|
||||
"name": "gpt-3.5-turbo",
|
||||
"completion_params": {"max_tokens": 1000, "temperature": 0.7},
|
||||
}
|
||||
app_model_config = AppModelConfig(
|
||||
app_id=app.id,
|
||||
provider="openai",
|
||||
model_id="gpt-3.5-turbo",
|
||||
# Set the model field directly - this is what model_dict property returns
|
||||
model=json.dumps(
|
||||
{
|
||||
"provider": "openai",
|
||||
"name": "gpt-3.5-turbo",
|
||||
"completion_params": {"max_tokens": 1000, "temperature": 0.7},
|
||||
}
|
||||
),
|
||||
# Set pre_prompt for PromptTemplateConfigManager
|
||||
pre_prompt="Complete the following text:",
|
||||
created_by=account.id,
|
||||
updated_by=account.id,
|
||||
)
|
||||
# Set pre_prompt for PromptTemplateConfigManager
|
||||
app_model_config.pre_prompt = "Complete the following text:"
|
||||
app_model_config.created_by = account.id
|
||||
app_model_config.updated_by = account.id
|
||||
app_model_config.id = fake.uuid4()
|
||||
|
||||
from extensions.ext_database import db
|
||||
|
||||
|
||||
@@ -0,0 +1,27 @@
|
||||
import builtins
|
||||
|
||||
import pytest
|
||||
from flask import Flask
|
||||
from flask.views import MethodView
|
||||
|
||||
from extensions import ext_fastopenapi
|
||||
|
||||
if not hasattr(builtins, "MethodView"):
|
||||
builtins.MethodView = MethodView # type: ignore[attr-defined]
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def app() -> Flask:
|
||||
app = Flask(__name__)
|
||||
app.config["TESTING"] = True
|
||||
return app
|
||||
|
||||
|
||||
def test_console_ping_fastopenapi_returns_pong(app: Flask):
|
||||
ext_fastopenapi.init_app(app)
|
||||
|
||||
client = app.test_client()
|
||||
response = client.get("/console/api/ping")
|
||||
|
||||
assert response.status_code == 200
|
||||
assert response.get_json() == {"result": "pong"}
|
||||
@@ -0,0 +1,56 @@
|
||||
import builtins
|
||||
from unittest.mock import patch
|
||||
|
||||
import pytest
|
||||
from flask import Flask
|
||||
from flask.views import MethodView
|
||||
|
||||
from extensions import ext_fastopenapi
|
||||
|
||||
if not hasattr(builtins, "MethodView"):
|
||||
builtins.MethodView = MethodView # type: ignore[attr-defined]
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def app() -> Flask:
|
||||
app = Flask(__name__)
|
||||
app.config["TESTING"] = True
|
||||
return app
|
||||
|
||||
|
||||
def test_console_setup_fastopenapi_get_not_started(app: Flask):
|
||||
ext_fastopenapi.init_app(app)
|
||||
|
||||
with (
|
||||
patch("controllers.console.setup.dify_config.EDITION", "SELF_HOSTED"),
|
||||
patch("controllers.console.setup.get_setup_status", return_value=None),
|
||||
):
|
||||
client = app.test_client()
|
||||
response = client.get("/console/api/setup")
|
||||
|
||||
assert response.status_code == 200
|
||||
assert response.get_json() == {"step": "not_started", "setup_at": None}
|
||||
|
||||
|
||||
def test_console_setup_fastopenapi_post_success(app: Flask):
|
||||
ext_fastopenapi.init_app(app)
|
||||
|
||||
payload = {
|
||||
"email": "admin@example.com",
|
||||
"name": "Admin",
|
||||
"password": "Passw0rd1",
|
||||
"language": "en-US",
|
||||
}
|
||||
|
||||
with (
|
||||
patch("controllers.console.wraps.dify_config.EDITION", "SELF_HOSTED"),
|
||||
patch("controllers.console.setup.get_setup_status", return_value=None),
|
||||
patch("controllers.console.setup.TenantService.get_tenant_count", return_value=0),
|
||||
patch("controllers.console.setup.get_init_validate_status", return_value=True),
|
||||
patch("controllers.console.setup.RegisterService.setup"),
|
||||
):
|
||||
client = app.test_client()
|
||||
response = client.post("/console/api/setup", json=payload)
|
||||
|
||||
assert response.status_code == 201
|
||||
assert response.get_json() == {"result": "success"}
|
||||
@@ -0,0 +1,35 @@
|
||||
import builtins
|
||||
from unittest.mock import patch
|
||||
|
||||
import pytest
|
||||
from flask import Flask
|
||||
from flask.views import MethodView
|
||||
|
||||
from configs import dify_config
|
||||
from extensions import ext_fastopenapi
|
||||
|
||||
if not hasattr(builtins, "MethodView"):
|
||||
builtins.MethodView = MethodView # type: ignore[attr-defined]
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def app() -> Flask:
|
||||
app = Flask(__name__)
|
||||
app.config["TESTING"] = True
|
||||
return app
|
||||
|
||||
|
||||
def test_console_version_fastopenapi_returns_current_version(app: Flask):
|
||||
ext_fastopenapi.init_app(app)
|
||||
|
||||
with patch("controllers.console.version.dify_config.CHECK_UPDATE_URL", None):
|
||||
client = app.test_client()
|
||||
response = client.get("/console/api/version", query_string={"current_version": "0.0.0"})
|
||||
|
||||
assert response.status_code == 200
|
||||
data = response.get_json()
|
||||
assert data["version"] == dify_config.project.version
|
||||
assert data["release_date"] == ""
|
||||
assert data["release_notes"] == ""
|
||||
assert data["can_auto_update"] is False
|
||||
assert "features" in data
|
||||
@@ -1,39 +0,0 @@
|
||||
from types import SimpleNamespace
|
||||
from unittest.mock import patch
|
||||
|
||||
from controllers.console.setup import SetupApi
|
||||
|
||||
|
||||
class TestSetupApi:
|
||||
def test_post_lowercases_email_before_register(self):
|
||||
"""Ensure setup registration normalizes email casing."""
|
||||
payload = {
|
||||
"email": "Admin@Example.com",
|
||||
"name": "Admin User",
|
||||
"password": "ValidPass123!",
|
||||
"language": "en-US",
|
||||
}
|
||||
setup_api = SetupApi(api=None)
|
||||
|
||||
mock_console_ns = SimpleNamespace(payload=payload)
|
||||
|
||||
with (
|
||||
patch("controllers.console.setup.console_ns", mock_console_ns),
|
||||
patch("controllers.console.setup.get_setup_status", return_value=False),
|
||||
patch("controllers.console.setup.TenantService.get_tenant_count", return_value=0),
|
||||
patch("controllers.console.setup.get_init_validate_status", return_value=True),
|
||||
patch("controllers.console.setup.extract_remote_ip", return_value="127.0.0.1"),
|
||||
patch("controllers.console.setup.request", object()),
|
||||
patch("controllers.console.setup.RegisterService.setup") as mock_register,
|
||||
):
|
||||
response, status = setup_api.post()
|
||||
|
||||
assert response == {"result": "success"}
|
||||
assert status == 201
|
||||
mock_register.assert_called_once_with(
|
||||
email="admin@example.com",
|
||||
name=payload["name"],
|
||||
password=payload["password"],
|
||||
ip_address="127.0.0.1",
|
||||
language=payload["language"],
|
||||
)
|
||||
@@ -0,0 +1,454 @@
|
||||
"""Test multimodal image output handling in BaseAppRunner."""
|
||||
|
||||
from unittest.mock import MagicMock, patch
|
||||
from uuid import uuid4
|
||||
|
||||
import pytest
|
||||
|
||||
from core.app.apps.base_app_queue_manager import PublishFrom
|
||||
from core.app.apps.base_app_runner import AppRunner
|
||||
from core.app.entities.app_invoke_entities import InvokeFrom
|
||||
from core.app.entities.queue_entities import QueueMessageFileEvent
|
||||
from core.file.enums import FileTransferMethod, FileType
|
||||
from core.model_runtime.entities.message_entities import ImagePromptMessageContent
|
||||
from models.enums import CreatorUserRole
|
||||
|
||||
|
||||
class TestBaseAppRunnerMultimodal:
|
||||
"""Test that BaseAppRunner correctly handles multimodal image content."""
|
||||
|
||||
@pytest.fixture
|
||||
def mock_user_id(self):
|
||||
"""Mock user ID."""
|
||||
return str(uuid4())
|
||||
|
||||
@pytest.fixture
|
||||
def mock_tenant_id(self):
|
||||
"""Mock tenant ID."""
|
||||
return str(uuid4())
|
||||
|
||||
@pytest.fixture
|
||||
def mock_message_id(self):
|
||||
"""Mock message ID."""
|
||||
return str(uuid4())
|
||||
|
||||
@pytest.fixture
|
||||
def mock_queue_manager(self):
|
||||
"""Create a mock queue manager."""
|
||||
manager = MagicMock()
|
||||
manager.invoke_from = InvokeFrom.SERVICE_API
|
||||
return manager
|
||||
|
||||
@pytest.fixture
|
||||
def mock_tool_file(self):
|
||||
"""Create a mock tool file."""
|
||||
tool_file = MagicMock()
|
||||
tool_file.id = str(uuid4())
|
||||
return tool_file
|
||||
|
||||
@pytest.fixture
|
||||
def mock_message_file(self):
|
||||
"""Create a mock message file."""
|
||||
message_file = MagicMock()
|
||||
message_file.id = str(uuid4())
|
||||
return message_file
|
||||
|
||||
def test_handle_multimodal_image_content_with_url(
|
||||
self,
|
||||
mock_user_id,
|
||||
mock_tenant_id,
|
||||
mock_message_id,
|
||||
mock_queue_manager,
|
||||
mock_tool_file,
|
||||
mock_message_file,
|
||||
):
|
||||
"""Test handling image from URL."""
|
||||
# Arrange
|
||||
image_url = "http://example.com/image.png"
|
||||
content = ImagePromptMessageContent(
|
||||
url=image_url,
|
||||
format="png",
|
||||
mime_type="image/png",
|
||||
)
|
||||
|
||||
with patch("core.app.apps.base_app_runner.ToolFileManager") as mock_mgr_class:
|
||||
# Setup mock tool file manager
|
||||
mock_mgr = MagicMock()
|
||||
mock_mgr.create_file_by_url.return_value = mock_tool_file
|
||||
mock_mgr_class.return_value = mock_mgr
|
||||
|
||||
with patch("core.app.apps.base_app_runner.MessageFile") as mock_msg_file_class:
|
||||
# Setup mock message file
|
||||
mock_msg_file_class.return_value = mock_message_file
|
||||
|
||||
with patch("core.app.apps.base_app_runner.db.session") as mock_session:
|
||||
mock_session.add = MagicMock()
|
||||
mock_session.commit = MagicMock()
|
||||
mock_session.refresh = MagicMock()
|
||||
|
||||
# Act
|
||||
# Create a mock runner with the method bound
|
||||
runner = MagicMock()
|
||||
|
||||
method = AppRunner._handle_multimodal_image_content
|
||||
runner._handle_multimodal_image_content = lambda *args, **kwargs: method(runner, *args, **kwargs)
|
||||
|
||||
runner._handle_multimodal_image_content(
|
||||
content=content,
|
||||
message_id=mock_message_id,
|
||||
user_id=mock_user_id,
|
||||
tenant_id=mock_tenant_id,
|
||||
queue_manager=mock_queue_manager,
|
||||
)
|
||||
|
||||
# Assert
|
||||
# Verify tool file was created from URL
|
||||
mock_mgr.create_file_by_url.assert_called_once_with(
|
||||
user_id=mock_user_id,
|
||||
tenant_id=mock_tenant_id,
|
||||
file_url=image_url,
|
||||
conversation_id=None,
|
||||
)
|
||||
|
||||
# Verify message file was created with correct parameters
|
||||
mock_msg_file_class.assert_called_once()
|
||||
call_kwargs = mock_msg_file_class.call_args[1]
|
||||
assert call_kwargs["message_id"] == mock_message_id
|
||||
assert call_kwargs["type"] == FileType.IMAGE
|
||||
assert call_kwargs["transfer_method"] == FileTransferMethod.TOOL_FILE
|
||||
assert call_kwargs["belongs_to"] == "assistant"
|
||||
assert call_kwargs["created_by"] == mock_user_id
|
||||
|
||||
# Verify database operations
|
||||
mock_session.add.assert_called_once_with(mock_message_file)
|
||||
mock_session.commit.assert_called_once()
|
||||
mock_session.refresh.assert_called_once_with(mock_message_file)
|
||||
|
||||
# Verify event was published
|
||||
mock_queue_manager.publish.assert_called_once()
|
||||
publish_call = mock_queue_manager.publish.call_args
|
||||
assert isinstance(publish_call[0][0], QueueMessageFileEvent)
|
||||
assert publish_call[0][0].message_file_id == mock_message_file.id
|
||||
# publish_from might be passed as positional or keyword argument
|
||||
assert (
|
||||
publish_call[0][1] == PublishFrom.APPLICATION_MANAGER
|
||||
or publish_call.kwargs.get("publish_from") == PublishFrom.APPLICATION_MANAGER
|
||||
)
|
||||
|
||||
def test_handle_multimodal_image_content_with_base64(
|
||||
self,
|
||||
mock_user_id,
|
||||
mock_tenant_id,
|
||||
mock_message_id,
|
||||
mock_queue_manager,
|
||||
mock_tool_file,
|
||||
mock_message_file,
|
||||
):
|
||||
"""Test handling image from base64 data."""
|
||||
# Arrange
|
||||
import base64
|
||||
|
||||
# Create a small test image (1x1 PNG)
|
||||
test_image_data = base64.b64encode(
|
||||
b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x00\x01\x00\x00\x00\x01\x08\x02\x00\x00\x00\x90wS\xde"
|
||||
).decode()
|
||||
content = ImagePromptMessageContent(
|
||||
base64_data=test_image_data,
|
||||
format="png",
|
||||
mime_type="image/png",
|
||||
)
|
||||
|
||||
with patch("core.app.apps.base_app_runner.ToolFileManager") as mock_mgr_class:
|
||||
# Setup mock tool file manager
|
||||
mock_mgr = MagicMock()
|
||||
mock_mgr.create_file_by_raw.return_value = mock_tool_file
|
||||
mock_mgr_class.return_value = mock_mgr
|
||||
|
||||
with patch("core.app.apps.base_app_runner.MessageFile") as mock_msg_file_class:
|
||||
# Setup mock message file
|
||||
mock_msg_file_class.return_value = mock_message_file
|
||||
|
||||
with patch("core.app.apps.base_app_runner.db.session") as mock_session:
|
||||
mock_session.add = MagicMock()
|
||||
mock_session.commit = MagicMock()
|
||||
mock_session.refresh = MagicMock()
|
||||
|
||||
# Act
|
||||
# Create a mock runner with the method bound
|
||||
runner = MagicMock()
|
||||
method = AppRunner._handle_multimodal_image_content
|
||||
runner._handle_multimodal_image_content = lambda *args, **kwargs: method(runner, *args, **kwargs)
|
||||
|
||||
runner._handle_multimodal_image_content(
|
||||
content=content,
|
||||
message_id=mock_message_id,
|
||||
user_id=mock_user_id,
|
||||
tenant_id=mock_tenant_id,
|
||||
queue_manager=mock_queue_manager,
|
||||
)
|
||||
|
||||
# Assert
|
||||
# Verify tool file was created from base64
|
||||
mock_mgr.create_file_by_raw.assert_called_once()
|
||||
call_kwargs = mock_mgr.create_file_by_raw.call_args[1]
|
||||
assert call_kwargs["user_id"] == mock_user_id
|
||||
assert call_kwargs["tenant_id"] == mock_tenant_id
|
||||
assert call_kwargs["conversation_id"] is None
|
||||
assert "file_binary" in call_kwargs
|
||||
assert call_kwargs["mimetype"] == "image/png"
|
||||
assert call_kwargs["filename"].startswith("generated_image")
|
||||
assert call_kwargs["filename"].endswith(".png")
|
||||
|
||||
# Verify message file was created
|
||||
mock_msg_file_class.assert_called_once()
|
||||
|
||||
# Verify database operations
|
||||
mock_session.add.assert_called_once()
|
||||
mock_session.commit.assert_called_once()
|
||||
mock_session.refresh.assert_called_once()
|
||||
|
||||
# Verify event was published
|
||||
mock_queue_manager.publish.assert_called_once()
|
||||
|
||||
def test_handle_multimodal_image_content_with_base64_data_uri(
|
||||
self,
|
||||
mock_user_id,
|
||||
mock_tenant_id,
|
||||
mock_message_id,
|
||||
mock_queue_manager,
|
||||
mock_tool_file,
|
||||
mock_message_file,
|
||||
):
|
||||
"""Test handling image from base64 data with URI prefix."""
|
||||
# Arrange
|
||||
# Data URI format: data:image/png;base64,<base64_data>
|
||||
test_image_data = (
|
||||
"iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg=="
|
||||
)
|
||||
content = ImagePromptMessageContent(
|
||||
base64_data=f"data:image/png;base64,{test_image_data}",
|
||||
format="png",
|
||||
mime_type="image/png",
|
||||
)
|
||||
|
||||
with patch("core.app.apps.base_app_runner.ToolFileManager") as mock_mgr_class:
|
||||
# Setup mock tool file manager
|
||||
mock_mgr = MagicMock()
|
||||
mock_mgr.create_file_by_raw.return_value = mock_tool_file
|
||||
mock_mgr_class.return_value = mock_mgr
|
||||
|
||||
with patch("core.app.apps.base_app_runner.MessageFile") as mock_msg_file_class:
|
||||
# Setup mock message file
|
||||
mock_msg_file_class.return_value = mock_message_file
|
||||
|
||||
with patch("core.app.apps.base_app_runner.db.session") as mock_session:
|
||||
mock_session.add = MagicMock()
|
||||
mock_session.commit = MagicMock()
|
||||
mock_session.refresh = MagicMock()
|
||||
|
||||
# Act
|
||||
# Create a mock runner with the method bound
|
||||
runner = MagicMock()
|
||||
method = AppRunner._handle_multimodal_image_content
|
||||
runner._handle_multimodal_image_content = lambda *args, **kwargs: method(runner, *args, **kwargs)
|
||||
|
||||
runner._handle_multimodal_image_content(
|
||||
content=content,
|
||||
message_id=mock_message_id,
|
||||
user_id=mock_user_id,
|
||||
tenant_id=mock_tenant_id,
|
||||
queue_manager=mock_queue_manager,
|
||||
)
|
||||
|
||||
# Assert - verify that base64 data was extracted correctly (without prefix)
|
||||
mock_mgr.create_file_by_raw.assert_called_once()
|
||||
call_kwargs = mock_mgr.create_file_by_raw.call_args[1]
|
||||
# The base64 data should be decoded, so we check the binary was passed
|
||||
assert "file_binary" in call_kwargs
|
||||
|
||||
def test_handle_multimodal_image_content_without_url_or_base64(
|
||||
self,
|
||||
mock_user_id,
|
||||
mock_tenant_id,
|
||||
mock_message_id,
|
||||
mock_queue_manager,
|
||||
):
|
||||
"""Test handling image content without URL or base64 data."""
|
||||
# Arrange
|
||||
content = ImagePromptMessageContent(
|
||||
url="",
|
||||
base64_data="",
|
||||
format="png",
|
||||
mime_type="image/png",
|
||||
)
|
||||
|
||||
with patch("core.app.apps.base_app_runner.ToolFileManager") as mock_mgr_class:
|
||||
with patch("core.app.apps.base_app_runner.MessageFile") as mock_msg_file_class:
|
||||
with patch("core.app.apps.base_app_runner.db.session") as mock_session:
|
||||
# Act
|
||||
# Create a mock runner with the method bound
|
||||
runner = MagicMock()
|
||||
method = AppRunner._handle_multimodal_image_content
|
||||
runner._handle_multimodal_image_content = lambda *args, **kwargs: method(runner, *args, **kwargs)
|
||||
|
||||
runner._handle_multimodal_image_content(
|
||||
content=content,
|
||||
message_id=mock_message_id,
|
||||
user_id=mock_user_id,
|
||||
tenant_id=mock_tenant_id,
|
||||
queue_manager=mock_queue_manager,
|
||||
)
|
||||
|
||||
# Assert - should not create any files or publish events
|
||||
mock_mgr_class.assert_not_called()
|
||||
mock_msg_file_class.assert_not_called()
|
||||
mock_session.add.assert_not_called()
|
||||
mock_queue_manager.publish.assert_not_called()
|
||||
|
||||
def test_handle_multimodal_image_content_with_error(
|
||||
self,
|
||||
mock_user_id,
|
||||
mock_tenant_id,
|
||||
mock_message_id,
|
||||
mock_queue_manager,
|
||||
):
|
||||
"""Test handling image content when an error occurs."""
|
||||
# Arrange
|
||||
image_url = "http://example.com/image.png"
|
||||
content = ImagePromptMessageContent(
|
||||
url=image_url,
|
||||
format="png",
|
||||
mime_type="image/png",
|
||||
)
|
||||
|
||||
with patch("core.app.apps.base_app_runner.ToolFileManager") as mock_mgr_class:
|
||||
# Setup mock to raise exception
|
||||
mock_mgr = MagicMock()
|
||||
mock_mgr.create_file_by_url.side_effect = Exception("Network error")
|
||||
mock_mgr_class.return_value = mock_mgr
|
||||
|
||||
with patch("core.app.apps.base_app_runner.MessageFile") as mock_msg_file_class:
|
||||
with patch("core.app.apps.base_app_runner.db.session") as mock_session:
|
||||
# Act
|
||||
# Create a mock runner with the method bound
|
||||
runner = MagicMock()
|
||||
method = AppRunner._handle_multimodal_image_content
|
||||
runner._handle_multimodal_image_content = lambda *args, **kwargs: method(runner, *args, **kwargs)
|
||||
|
||||
# Should not raise exception, just log it
|
||||
runner._handle_multimodal_image_content(
|
||||
content=content,
|
||||
message_id=mock_message_id,
|
||||
user_id=mock_user_id,
|
||||
tenant_id=mock_tenant_id,
|
||||
queue_manager=mock_queue_manager,
|
||||
)
|
||||
|
||||
# Assert - should not create message file or publish event on error
|
||||
mock_msg_file_class.assert_not_called()
|
||||
mock_session.add.assert_not_called()
|
||||
mock_queue_manager.publish.assert_not_called()
|
||||
|
||||
def test_handle_multimodal_image_content_debugger_mode(
|
||||
self,
|
||||
mock_user_id,
|
||||
mock_tenant_id,
|
||||
mock_message_id,
|
||||
mock_queue_manager,
|
||||
mock_tool_file,
|
||||
mock_message_file,
|
||||
):
|
||||
"""Test that debugger mode sets correct created_by_role."""
|
||||
# Arrange
|
||||
image_url = "http://example.com/image.png"
|
||||
content = ImagePromptMessageContent(
|
||||
url=image_url,
|
||||
format="png",
|
||||
mime_type="image/png",
|
||||
)
|
||||
mock_queue_manager.invoke_from = InvokeFrom.DEBUGGER
|
||||
|
||||
with patch("core.app.apps.base_app_runner.ToolFileManager") as mock_mgr_class:
|
||||
# Setup mock tool file manager
|
||||
mock_mgr = MagicMock()
|
||||
mock_mgr.create_file_by_url.return_value = mock_tool_file
|
||||
mock_mgr_class.return_value = mock_mgr
|
||||
|
||||
with patch("core.app.apps.base_app_runner.MessageFile") as mock_msg_file_class:
|
||||
# Setup mock message file
|
||||
mock_msg_file_class.return_value = mock_message_file
|
||||
|
||||
with patch("core.app.apps.base_app_runner.db.session") as mock_session:
|
||||
mock_session.add = MagicMock()
|
||||
mock_session.commit = MagicMock()
|
||||
mock_session.refresh = MagicMock()
|
||||
|
||||
# Act
|
||||
# Create a mock runner with the method bound
|
||||
runner = MagicMock()
|
||||
method = AppRunner._handle_multimodal_image_content
|
||||
runner._handle_multimodal_image_content = lambda *args, **kwargs: method(runner, *args, **kwargs)
|
||||
|
||||
runner._handle_multimodal_image_content(
|
||||
content=content,
|
||||
message_id=mock_message_id,
|
||||
user_id=mock_user_id,
|
||||
tenant_id=mock_tenant_id,
|
||||
queue_manager=mock_queue_manager,
|
||||
)
|
||||
|
||||
# Assert - verify created_by_role is ACCOUNT for debugger mode
|
||||
call_kwargs = mock_msg_file_class.call_args[1]
|
||||
assert call_kwargs["created_by_role"] == CreatorUserRole.ACCOUNT
|
||||
|
||||
def test_handle_multimodal_image_content_service_api_mode(
|
||||
self,
|
||||
mock_user_id,
|
||||
mock_tenant_id,
|
||||
mock_message_id,
|
||||
mock_queue_manager,
|
||||
mock_tool_file,
|
||||
mock_message_file,
|
||||
):
|
||||
"""Test that service API mode sets correct created_by_role."""
|
||||
# Arrange
|
||||
image_url = "http://example.com/image.png"
|
||||
content = ImagePromptMessageContent(
|
||||
url=image_url,
|
||||
format="png",
|
||||
mime_type="image/png",
|
||||
)
|
||||
mock_queue_manager.invoke_from = InvokeFrom.SERVICE_API
|
||||
|
||||
with patch("core.app.apps.base_app_runner.ToolFileManager") as mock_mgr_class:
|
||||
# Setup mock tool file manager
|
||||
mock_mgr = MagicMock()
|
||||
mock_mgr.create_file_by_url.return_value = mock_tool_file
|
||||
mock_mgr_class.return_value = mock_mgr
|
||||
|
||||
with patch("core.app.apps.base_app_runner.MessageFile") as mock_msg_file_class:
|
||||
# Setup mock message file
|
||||
mock_msg_file_class.return_value = mock_message_file
|
||||
|
||||
with patch("core.app.apps.base_app_runner.db.session") as mock_session:
|
||||
mock_session.add = MagicMock()
|
||||
mock_session.commit = MagicMock()
|
||||
mock_session.refresh = MagicMock()
|
||||
|
||||
# Act
|
||||
# Create a mock runner with the method bound
|
||||
runner = MagicMock()
|
||||
method = AppRunner._handle_multimodal_image_content
|
||||
runner._handle_multimodal_image_content = lambda *args, **kwargs: method(runner, *args, **kwargs)
|
||||
|
||||
runner._handle_multimodal_image_content(
|
||||
content=content,
|
||||
message_id=mock_message_id,
|
||||
user_id=mock_user_id,
|
||||
tenant_id=mock_tenant_id,
|
||||
queue_manager=mock_queue_manager,
|
||||
)
|
||||
|
||||
# Assert - verify created_by_role is END_USER for service API
|
||||
call_kwargs = mock_msg_file_class.call_args[1]
|
||||
assert call_kwargs["created_by_role"] == CreatorUserRole.END_USER
|
||||
@@ -1,7 +1,6 @@
|
||||
"""Unit tests for the message cycle manager optimization."""
|
||||
|
||||
from types import SimpleNamespace
|
||||
from unittest.mock import ANY, Mock, patch
|
||||
from unittest.mock import Mock, patch
|
||||
|
||||
import pytest
|
||||
from flask import current_app
|
||||
@@ -28,17 +27,14 @@ class TestMessageCycleManagerOptimization:
|
||||
|
||||
def test_get_message_event_type_with_message_file(self, message_cycle_manager):
|
||||
"""Test get_message_event_type returns MESSAGE_FILE when message has files."""
|
||||
with (
|
||||
patch("core.app.task_pipeline.message_cycle_manager.Session") as mock_session_class,
|
||||
patch("core.app.task_pipeline.message_cycle_manager.db", new=SimpleNamespace(engine=Mock())),
|
||||
):
|
||||
with patch("core.app.task_pipeline.message_cycle_manager.session_factory") as mock_session_factory:
|
||||
# Setup mock session and message file
|
||||
mock_session = Mock()
|
||||
mock_session_class.return_value.__enter__.return_value = mock_session
|
||||
mock_session_factory.create_session.return_value.__enter__.return_value = mock_session
|
||||
|
||||
mock_message_file = Mock()
|
||||
# Current implementation uses session.query(...).scalar()
|
||||
mock_session.query.return_value.scalar.return_value = mock_message_file
|
||||
# Current implementation uses session.scalar(select(...))
|
||||
mock_session.scalar.return_value = mock_message_file
|
||||
|
||||
# Execute
|
||||
with current_app.app_context():
|
||||
@@ -46,19 +42,16 @@ class TestMessageCycleManagerOptimization:
|
||||
|
||||
# Assert
|
||||
assert result == StreamEvent.MESSAGE_FILE
|
||||
mock_session.query.return_value.scalar.assert_called_once()
|
||||
mock_session.scalar.assert_called_once()
|
||||
|
||||
def test_get_message_event_type_without_message_file(self, message_cycle_manager):
|
||||
"""Test get_message_event_type returns MESSAGE when message has no files."""
|
||||
with (
|
||||
patch("core.app.task_pipeline.message_cycle_manager.Session") as mock_session_class,
|
||||
patch("core.app.task_pipeline.message_cycle_manager.db", new=SimpleNamespace(engine=Mock())),
|
||||
):
|
||||
with patch("core.app.task_pipeline.message_cycle_manager.session_factory") as mock_session_factory:
|
||||
# Setup mock session and no message file
|
||||
mock_session = Mock()
|
||||
mock_session_class.return_value.__enter__.return_value = mock_session
|
||||
# Current implementation uses session.query(...).scalar()
|
||||
mock_session.query.return_value.scalar.return_value = None
|
||||
mock_session_factory.create_session.return_value.__enter__.return_value = mock_session
|
||||
# Current implementation uses session.scalar(select(...))
|
||||
mock_session.scalar.return_value = None
|
||||
|
||||
# Execute
|
||||
with current_app.app_context():
|
||||
@@ -66,21 +59,18 @@ class TestMessageCycleManagerOptimization:
|
||||
|
||||
# Assert
|
||||
assert result == StreamEvent.MESSAGE
|
||||
mock_session.query.return_value.scalar.assert_called_once()
|
||||
mock_session.scalar.assert_called_once()
|
||||
|
||||
def test_message_to_stream_response_with_precomputed_event_type(self, message_cycle_manager):
|
||||
"""MessageCycleManager.message_to_stream_response expects a valid event_type; callers should precompute it."""
|
||||
with (
|
||||
patch("core.app.task_pipeline.message_cycle_manager.Session") as mock_session_class,
|
||||
patch("core.app.task_pipeline.message_cycle_manager.db", new=SimpleNamespace(engine=Mock())),
|
||||
):
|
||||
with patch("core.app.task_pipeline.message_cycle_manager.session_factory") as mock_session_factory:
|
||||
# Setup mock session and message file
|
||||
mock_session = Mock()
|
||||
mock_session_class.return_value.__enter__.return_value = mock_session
|
||||
mock_session_factory.create_session.return_value.__enter__.return_value = mock_session
|
||||
|
||||
mock_message_file = Mock()
|
||||
# Current implementation uses session.query(...).scalar()
|
||||
mock_session.query.return_value.scalar.return_value = mock_message_file
|
||||
# Current implementation uses session.scalar(select(...))
|
||||
mock_session.scalar.return_value = mock_message_file
|
||||
|
||||
# Execute: compute event type once, then pass to message_to_stream_response
|
||||
with current_app.app_context():
|
||||
@@ -94,11 +84,11 @@ class TestMessageCycleManagerOptimization:
|
||||
assert result.answer == "Hello world"
|
||||
assert result.id == "test-message-id"
|
||||
assert result.event == StreamEvent.MESSAGE_FILE
|
||||
mock_session.query.return_value.scalar.assert_called_once()
|
||||
mock_session.scalar.assert_called_once()
|
||||
|
||||
def test_message_to_stream_response_with_event_type_skips_query(self, message_cycle_manager):
|
||||
"""Test that message_to_stream_response skips database query when event_type is provided."""
|
||||
with patch("core.app.task_pipeline.message_cycle_manager.Session") as mock_session_class:
|
||||
with patch("core.app.task_pipeline.message_cycle_manager.session_factory") as mock_session_factory:
|
||||
# Execute with event_type provided
|
||||
result = message_cycle_manager.message_to_stream_response(
|
||||
answer="Hello world", message_id="test-message-id", event_type=StreamEvent.MESSAGE
|
||||
@@ -109,8 +99,8 @@ class TestMessageCycleManagerOptimization:
|
||||
assert result.answer == "Hello world"
|
||||
assert result.id == "test-message-id"
|
||||
assert result.event == StreamEvent.MESSAGE
|
||||
# Should not query database when event_type is provided
|
||||
mock_session_class.assert_not_called()
|
||||
# Should not open a session when event_type is provided
|
||||
mock_session_factory.create_session.assert_not_called()
|
||||
|
||||
def test_message_to_stream_response_with_from_variable_selector(self, message_cycle_manager):
|
||||
"""Test message_to_stream_response with from_variable_selector parameter."""
|
||||
@@ -130,24 +120,21 @@ class TestMessageCycleManagerOptimization:
|
||||
def test_optimization_usage_example(self, message_cycle_manager):
|
||||
"""Test the optimization pattern that should be used by callers."""
|
||||
# Step 1: Get event type once (this queries database)
|
||||
with (
|
||||
patch("core.app.task_pipeline.message_cycle_manager.Session") as mock_session_class,
|
||||
patch("core.app.task_pipeline.message_cycle_manager.db", new=SimpleNamespace(engine=Mock())),
|
||||
):
|
||||
with patch("core.app.task_pipeline.message_cycle_manager.session_factory") as mock_session_factory:
|
||||
mock_session = Mock()
|
||||
mock_session_class.return_value.__enter__.return_value = mock_session
|
||||
# Current implementation uses session.query(...).scalar()
|
||||
mock_session.query.return_value.scalar.return_value = None # No files
|
||||
mock_session_factory.create_session.return_value.__enter__.return_value = mock_session
|
||||
# Current implementation uses session.scalar(select(...))
|
||||
mock_session.scalar.return_value = None # No files
|
||||
with current_app.app_context():
|
||||
event_type = message_cycle_manager.get_message_event_type("test-message-id")
|
||||
|
||||
# Should query database once
|
||||
mock_session_class.assert_called_once_with(ANY, expire_on_commit=False)
|
||||
# Should open session once
|
||||
mock_session_factory.create_session.assert_called_once()
|
||||
assert event_type == StreamEvent.MESSAGE
|
||||
|
||||
# Step 2: Use event_type for multiple calls (no additional queries)
|
||||
with patch("core.app.task_pipeline.message_cycle_manager.Session") as mock_session_class:
|
||||
mock_session_class.return_value.__enter__.return_value = Mock()
|
||||
with patch("core.app.task_pipeline.message_cycle_manager.session_factory") as mock_session_factory:
|
||||
mock_session_factory.create_session.return_value.__enter__.return_value = Mock()
|
||||
|
||||
chunk1_response = message_cycle_manager.message_to_stream_response(
|
||||
answer="Chunk 1", message_id="test-message-id", event_type=event_type
|
||||
@@ -157,8 +144,8 @@ class TestMessageCycleManagerOptimization:
|
||||
answer="Chunk 2", message_id="test-message-id", event_type=event_type
|
||||
)
|
||||
|
||||
# Should not query database again
|
||||
mock_session_class.assert_not_called()
|
||||
# Should not open session again when event_type provided
|
||||
mock_session_factory.create_session.assert_not_called()
|
||||
|
||||
assert chunk1_response.event == StreamEvent.MESSAGE
|
||||
assert chunk2_response.event == StreamEvent.MESSAGE
|
||||
|
||||
@@ -1,5 +1,7 @@
|
||||
from unittest.mock import MagicMock, patch
|
||||
|
||||
import pytest
|
||||
|
||||
from core.model_runtime.entities.message_entities import AssistantPromptMessage
|
||||
from core.model_runtime.model_providers.__base.large_language_model import _increase_tool_call
|
||||
|
||||
@@ -97,3 +99,14 @@ def test__increase_tool_call():
|
||||
mock_id_generator.side_effect = [_exp_case.id for _exp_case in EXPECTED_CASE_4]
|
||||
with patch("core.model_runtime.model_providers.__base.large_language_model._gen_tool_call_id", mock_id_generator):
|
||||
_run_case(INPUTS_CASE_4, EXPECTED_CASE_4)
|
||||
|
||||
|
||||
def test__increase_tool_call__no_id_no_name_first_delta_should_raise():
|
||||
inputs = [
|
||||
ToolCall(id="", type="function", function=ToolCall.ToolCallFunction(name="", arguments='{"arg1": ')),
|
||||
ToolCall(id="", type="function", function=ToolCall.ToolCallFunction(name="func_foo", arguments='"value"}')),
|
||||
]
|
||||
actual: list[ToolCall] = []
|
||||
with patch("core.model_runtime.model_providers.__base.large_language_model._gen_tool_call_id", MagicMock()):
|
||||
with pytest.raises(ValueError):
|
||||
_increase_tool_call(inputs, actual)
|
||||
|
||||
@@ -0,0 +1,103 @@
from core.model_runtime.entities.llm_entities import LLMResult, LLMResultChunk, LLMResultChunkDelta, LLMUsage
from core.model_runtime.entities.message_entities import (
    AssistantPromptMessage,
    TextPromptMessageContent,
    UserPromptMessage,
)
from core.model_runtime.model_providers.__base.large_language_model import _normalize_non_stream_plugin_result


def _make_chunk(
    *,
    model: str = "test-model",
    content: str | list[TextPromptMessageContent] | None,
    tool_calls: list[AssistantPromptMessage.ToolCall] | None = None,
    usage: LLMUsage | None = None,
    system_fingerprint: str | None = None,
) -> LLMResultChunk:
    message = AssistantPromptMessage(content=content, tool_calls=tool_calls or [])
    delta = LLMResultChunkDelta(index=0, message=message, usage=usage)
    return LLMResultChunk(model=model, delta=delta, system_fingerprint=system_fingerprint)


def test__normalize_non_stream_plugin_result__from_first_chunk_str_content_and_tool_calls():
    prompt_messages = [UserPromptMessage(content="hi")]

    tool_calls = [
        AssistantPromptMessage.ToolCall(
            id="1",
            type="function",
            function=AssistantPromptMessage.ToolCall.ToolCallFunction(name="func_foo", arguments=""),
        ),
        AssistantPromptMessage.ToolCall(
            id="",
            type="function",
            function=AssistantPromptMessage.ToolCall.ToolCallFunction(name="", arguments='{"arg1": '),
        ),
        AssistantPromptMessage.ToolCall(
            id="",
            type="function",
            function=AssistantPromptMessage.ToolCall.ToolCallFunction(name="", arguments='"value"}'),
        ),
    ]

    usage = LLMUsage.empty_usage().model_copy(update={"prompt_tokens": 1, "total_tokens": 1})
    chunk = _make_chunk(content="hello", tool_calls=tool_calls, usage=usage, system_fingerprint="fp-1")

    result = _normalize_non_stream_plugin_result(
        model="test-model", prompt_messages=prompt_messages, result=iter([chunk])
    )

    assert result.model == "test-model"
    assert result.prompt_messages == prompt_messages
    assert result.message.content == "hello"
    assert result.usage.prompt_tokens == 1
    assert result.system_fingerprint == "fp-1"
    assert result.message.tool_calls == [
        AssistantPromptMessage.ToolCall(
            id="1",
            type="function",
            function=AssistantPromptMessage.ToolCall.ToolCallFunction(name="func_foo", arguments='{"arg1": "value"}'),
        )
    ]


def test__normalize_non_stream_plugin_result__from_first_chunk_list_content():
    prompt_messages = [UserPromptMessage(content="hi")]

    content_list = [TextPromptMessageContent(data="a"), TextPromptMessageContent(data="b")]
    chunk = _make_chunk(content=content_list, usage=LLMUsage.empty_usage())

    result = _normalize_non_stream_plugin_result(
        model="test-model", prompt_messages=prompt_messages, result=iter([chunk])
    )

    assert result.message.content == content_list


def test__normalize_non_stream_plugin_result__passthrough_llm_result():
    prompt_messages = [UserPromptMessage(content="hi")]
    llm_result = LLMResult(
        model="test-model",
        prompt_messages=prompt_messages,
        message=AssistantPromptMessage(content="ok"),
        usage=LLMUsage.empty_usage(),
    )

    assert (
        _normalize_non_stream_plugin_result(model="test-model", prompt_messages=prompt_messages, result=llm_result)
        == llm_result
    )


def test__normalize_non_stream_plugin_result__empty_iterator_defaults():
    prompt_messages = [UserPromptMessage(content="hi")]

    result = _normalize_non_stream_plugin_result(model="test-model", prompt_messages=prompt_messages, result=iter([]))

    assert result.model == "test-model"
    assert result.prompt_messages == prompt_messages
    assert result.message.content == []
    assert result.message.tool_calls == []
    assert result.usage == LLMUsage.empty_usage()
    assert result.system_fingerprint is None
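The tests above pin down the expected behavior of `_normalize_non_stream_plugin_result` without showing its body: an `LLMResult` passes through unchanged, a chunk iterator is folded into one result, and an empty iterator yields empty defaults. A minimal sketch of that folding, assuming only the names imported in the test file (this is illustrative, not the actual Dify helper, and `normalize_plugin_result_sketch` is a hypothetical name):

```python
# Illustrative sketch only, mirroring the behavior the tests above assert.
from core.model_runtime.entities.llm_entities import LLMResult, LLMUsage
from core.model_runtime.entities.message_entities import AssistantPromptMessage
from core.model_runtime.model_providers.__base.large_language_model import _increase_tool_call


def normalize_plugin_result_sketch(model, prompt_messages, result):
    if isinstance(result, LLMResult):  # already a full result: pass through unchanged
        return result

    content, tool_calls, usage, fingerprint = [], [], LLMUsage.empty_usage(), None
    for chunk in result:  # fold streaming chunks into a single message
        delta = chunk.delta.message
        if isinstance(delta.content, str):
            content.append(delta.content)
        elif delta.content:
            content.extend(delta.content)
        _increase_tool_call(delta.tool_calls, tool_calls)  # merge partial tool-call deltas
        if chunk.delta.usage:
            usage = chunk.delta.usage
        if chunk.system_fingerprint:
            fingerprint = chunk.system_fingerprint

    joined = "".join(content) if content and all(isinstance(c, str) for c in content) else content
    return LLMResult(
        model=model,
        prompt_messages=prompt_messages,
        message=AssistantPromptMessage(content=joined, tool_calls=tool_calls),
        usage=usage,
        system_fingerprint=fingerprint,
    )
```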
@@ -475,3 +475,130 @@ def test_valid_api_key_works():
    headers = executor._assembling_headers()
    assert "Authorization" in headers
    assert headers["Authorization"] == "Bearer valid-api-key-123"


def test_executor_with_json_body_and_unquoted_uuid_variable():
    """Test that unquoted UUID variables are correctly handled in JSON body.

    This test verifies the fix for issue #31436 where json_repair would truncate
    certain UUID patterns (like 57eeeeb1-...) when they appeared as unquoted values.
    """
    # UUID that triggers the json_repair truncation bug
    test_uuid = "57eeeeb1-450b-482c-81b9-4be77e95dee2"

    variable_pool = VariablePool(
        system_variables=SystemVariable.empty(),
        user_inputs={},
    )
    variable_pool.add(["pre_node_id", "uuid"], test_uuid)

    node_data = HttpRequestNodeData(
        title="Test JSON Body with Unquoted UUID Variable",
        method="post",
        url="https://api.example.com/data",
        authorization=HttpRequestNodeAuthorization(type="no-auth"),
        headers="Content-Type: application/json",
        params="",
        body=HttpRequestNodeBody(
            type="json",
            data=[
                BodyData(
                    key="",
                    type="text",
                    # UUID variable without quotes - this is the problematic case
                    value='{"rowId": {{#pre_node_id.uuid#}}}',
                )
            ],
        ),
    )

    executor = Executor(
        node_data=node_data,
        timeout=HttpRequestNodeTimeout(connect=10, read=30, write=30),
        variable_pool=variable_pool,
    )

    # The UUID should be preserved in full, not truncated
    assert executor.json == {"rowId": test_uuid}
    assert len(executor.json["rowId"]) == len(test_uuid)


def test_executor_with_json_body_and_unquoted_uuid_with_newlines():
    """Test that unquoted UUID variables with newlines in JSON are handled correctly.

    This is a specific case from issue #31436 where the JSON body contains newlines.
    """
    test_uuid = "57eeeeb1-450b-482c-81b9-4be77e95dee2"

    variable_pool = VariablePool(
        system_variables=SystemVariable.empty(),
        user_inputs={},
    )
    variable_pool.add(["pre_node_id", "uuid"], test_uuid)

    node_data = HttpRequestNodeData(
        title="Test JSON Body with Unquoted UUID and Newlines",
        method="post",
        url="https://api.example.com/data",
        authorization=HttpRequestNodeAuthorization(type="no-auth"),
        headers="Content-Type: application/json",
        params="",
        body=HttpRequestNodeBody(
            type="json",
            data=[
                BodyData(
                    key="",
                    type="text",
                    # JSON with newlines and unquoted UUID variable
                    value='{\n"rowId": {{#pre_node_id.uuid#}}\n}',
                )
            ],
        ),
    )

    executor = Executor(
        node_data=node_data,
        timeout=HttpRequestNodeTimeout(connect=10, read=30, write=30),
        variable_pool=variable_pool,
    )

    # The UUID should be preserved in full
    assert executor.json == {"rowId": test_uuid}


def test_executor_with_json_body_preserves_numbers_and_strings():
    """Test that numbers are preserved and string values are properly quoted."""
    variable_pool = VariablePool(
        system_variables=SystemVariable.empty(),
        user_inputs={},
    )
    variable_pool.add(["node", "count"], 42)
    variable_pool.add(["node", "id"], "abc-123")

    node_data = HttpRequestNodeData(
        title="Test JSON Body with mixed types",
        method="post",
        url="https://api.example.com/data",
        authorization=HttpRequestNodeAuthorization(type="no-auth"),
        headers="",
        params="",
        body=HttpRequestNodeBody(
            type="json",
            data=[
                BodyData(
                    key="",
                    type="text",
                    value='{"count": {{#node.count#}}, "id": {{#node.id#}}}',
                )
            ],
        ),
    )

    executor = Executor(
        node_data=node_data,
        timeout=HttpRequestNodeTimeout(connect=10, read=30, write=30),
        variable_pool=variable_pool,
    )

    assert executor.json["count"] == 42
    assert executor.json["id"] == "abc-123"

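These tests encode the contract rather than the fix: after variable substitution the executor must produce valid JSON in which bare string values such as UUIDs end up quoted while numbers stay numeric. One way to satisfy that contract is to substitute each placeholder with a JSON-encoded value before parsing; the sketch below is only an illustration of that idea, not the actual Executor implementation, and `render_json_body_sketch` plus its regex are hypothetical:

```python
# Illustrative sketch: substitute {{#node.var#}} placeholders with JSON-encoded
# values so bare UUIDs stay quoted and numbers stay numeric, then parse.
import json
import re

PLACEHOLDER = re.compile(r"\{\{#([a-zA-Z0-9_.]+)#\}\}")


def render_json_body_sketch(template: str, variables: dict[str, object]) -> dict:
    def replace(match: re.Match) -> str:
        value = variables[match.group(1)]
        return json.dumps(value)  # strings become "quoted", ints stay bare

    return json.loads(PLACEHOLDER.sub(replace, template))


# Mirrors the assertions in the tests above.
body = render_json_body_sketch(
    '{"rowId": {{#pre_node_id.uuid#}}, "count": {{#node.count#}}}',
    {"pre_node_id.uuid": "57eeeeb1-450b-482c-81b9-4be77e95dee2", "node.count": 42},
)
assert body["rowId"] == "57eeeeb1-450b-482c-81b9-4be77e95dee2"
assert body["count"] == 42
```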
@@ -30,3 +30,12 @@ class TestWorkflowExecutionStatus:

        for status in non_ended_statuses:
            assert not status.is_ended(), f"{status} should not be considered ended"

    def test_ended_values(self):
        """Test ended_values returns the expected status values."""
        assert set(WorkflowExecutionStatus.ended_values()) == {
            WorkflowExecutionStatus.SUCCEEDED.value,
            WorkflowExecutionStatus.FAILED.value,
            WorkflowExecutionStatus.PARTIAL_SUCCEEDED.value,
            WorkflowExecutionStatus.STOPPED.value,
        }

@@ -37,6 +37,20 @@ def _client_error(code: str) -> ClientError:
def _mock_client(monkeypatch):
    client = MagicMock()
    client.head_bucket.return_value = None
    # Configure put_object to return a proper ETag that matches the MD5 hash
    # The ETag format is typically the MD5 hash wrapped in quotes

    def mock_put_object(**kwargs):
        md5_hash = kwargs.get("Body", b"")
        if isinstance(md5_hash, bytes):
            md5_hash = hashlib.md5(md5_hash).hexdigest()
        else:
            md5_hash = hashlib.md5(md5_hash.encode()).hexdigest()
        response = MagicMock()
        response.get.return_value = f'"{md5_hash}"'
        return response

    client.put_object.side_effect = mock_put_object
    boto_client = MagicMock(return_value=client)
    monkeypatch.setattr(storage_module.boto3, "client", boto_client)
    return client, boto_client
@@ -254,8 +268,8 @@ def test_serialization_roundtrip():
        {"id": "2", "value": 123},
    ]

-    data = ArchiveStorage.serialize_to_jsonl_gz(records)
-    decoded = ArchiveStorage.deserialize_from_jsonl_gz(data)
+    data = ArchiveStorage.serialize_to_jsonl(records)
+    decoded = ArchiveStorage.deserialize_from_jsonl(data)

    assert decoded[0]["id"] == "1"
    assert decoded[0]["payload"]["nested"] == "value"

@@ -0,0 +1,54 @@
"""
Unit tests for workflow run archiving functionality.

This module contains tests for:
- Archive service
- Rollback service
"""

from datetime import datetime
from unittest.mock import MagicMock, patch

from services.retention.workflow_run.constants import ARCHIVE_BUNDLE_NAME


class TestWorkflowRunArchiver:
    """Tests for the WorkflowRunArchiver class."""

    @patch("services.retention.workflow_run.archive_paid_plan_workflow_run.dify_config")
    @patch("services.retention.workflow_run.archive_paid_plan_workflow_run.get_archive_storage")
    def test_archiver_initialization(self, mock_get_storage, mock_config):
        """Test archiver can be initialized with various options."""
        from services.retention.workflow_run.archive_paid_plan_workflow_run import WorkflowRunArchiver

        mock_config.BILLING_ENABLED = False

        archiver = WorkflowRunArchiver(
            days=90,
            batch_size=100,
            tenant_ids=["test-tenant"],
            limit=50,
            dry_run=True,
        )

        assert archiver.days == 90
        assert archiver.batch_size == 100
        assert archiver.tenant_ids == ["test-tenant"]
        assert archiver.limit == 50
        assert archiver.dry_run is True

    def test_get_archive_key(self):
        """Test archive key generation."""
        from services.retention.workflow_run.archive_paid_plan_workflow_run import WorkflowRunArchiver

        archiver = WorkflowRunArchiver.__new__(WorkflowRunArchiver)

        mock_run = MagicMock()
        mock_run.tenant_id = "tenant-123"
        mock_run.app_id = "app-999"
        mock_run.id = "run-456"
        mock_run.created_at = datetime(2024, 1, 15, 12, 0, 0)

        key = archiver._get_archive_key(mock_run)

        assert key == f"tenant-123/app_id=app-999/year=2024/month=01/workflow_run_id=run-456/{ARCHIVE_BUNDLE_NAME}"
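The key layout asserted in `test_get_archive_key` partitions archive bundles by tenant, app, year and month. A small sketch of a key builder consistent with that assertion (the value of `ARCHIVE_BUNDLE_NAME` is not shown in this diff, so "bundle.jsonl" below is a stand-in, and the helper name is hypothetical):

```python
# Sketch of the key layout asserted above; the real WorkflowRunArchiver._get_archive_key
# may differ in details.
from datetime import datetime


def archive_key_sketch(tenant_id: str, app_id: str, run_id: str, created_at: datetime, bundle_name: str) -> str:
    return (
        f"{tenant_id}/app_id={app_id}/"
        f"year={created_at.year:04d}/month={created_at.month:02d}/"
        f"workflow_run_id={run_id}/{bundle_name}"
    )


assert (
    archive_key_sketch("tenant-123", "app-999", "run-456", datetime(2024, 1, 15, 12, 0, 0), "bundle.jsonl")
    == "tenant-123/app_id=app-999/year=2024/month=01/workflow_run_id=run-456/bundle.jsonl"
)
```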
@@ -171,22 +171,26 @@ class TestBillingServiceSendRequest:
        "status_code", [httpx.codes.BAD_REQUEST, httpx.codes.INTERNAL_SERVER_ERROR, httpx.codes.NOT_FOUND]
    )
    def test_delete_request_non_200_with_valid_json(self, mock_httpx_request, mock_billing_config, status_code):
-        """Test DELETE request with non-200 status code but valid JSON response.
+        """Test DELETE request with non-200 status code raises ValueError.

-        DELETE doesn't check status code, so it returns the error JSON.
+        DELETE now checks status code and raises ValueError for non-200 responses.
        """
        # Arrange
        error_response = {"detail": "Error message"}
        mock_response = MagicMock()
        mock_response.status_code = status_code
+        mock_response.text = "Error message"
        mock_response.json.return_value = error_response
        mock_httpx_request.return_value = mock_response

-        # Act
-        result = BillingService._send_request("DELETE", "/test", json={"key": "value"})

-        # Assert
-        assert result == error_response
+        # Act & Assert
+        with patch("services.billing_service.logger") as mock_logger:
+            with pytest.raises(ValueError) as exc_info:
+                BillingService._send_request("DELETE", "/test", json={"key": "value"})
+            assert "Unable to process delete request" in str(exc_info.value)
+            # Verify error logging
+            mock_logger.error.assert_called_once()
+            assert "DELETE response" in str(mock_logger.error.call_args)

    @pytest.mark.parametrize(
        "status_code", [httpx.codes.BAD_REQUEST, httpx.codes.INTERNAL_SERVER_ERROR, httpx.codes.NOT_FOUND]
@@ -210,9 +214,9 @@ class TestBillingServiceSendRequest:
        "status_code", [httpx.codes.BAD_REQUEST, httpx.codes.INTERNAL_SERVER_ERROR, httpx.codes.NOT_FOUND]
    )
    def test_delete_request_non_200_with_invalid_json(self, mock_httpx_request, mock_billing_config, status_code):
-        """Test DELETE request with non-200 status code and invalid JSON response raises exception.
+        """Test DELETE request with non-200 status code raises ValueError before JSON parsing.

-        DELETE doesn't check status code, so it calls response.json() which raises JSONDecodeError
+        DELETE now checks status code before calling response.json(), so ValueError is raised
        when the response cannot be parsed as JSON (e.g., empty response).
        """
        # Arrange
@@ -223,8 +227,13 @@ class TestBillingServiceSendRequest:
        mock_httpx_request.return_value = mock_response

        # Act & Assert
-        with pytest.raises(json.JSONDecodeError):
-            BillingService._send_request("DELETE", "/test", json={"key": "value"})
+        with patch("services.billing_service.logger") as mock_logger:
+            with pytest.raises(ValueError) as exc_info:
+                BillingService._send_request("DELETE", "/test", json={"key": "value"})
+            assert "Unable to process delete request" in str(exc_info.value)
+            # Verify error logging
+            mock_logger.error.assert_called_once()
+            assert "DELETE response" in str(mock_logger.error.call_args)

    def test_retry_on_request_error(self, mock_httpx_request, mock_billing_config):
        """Test that _send_request retries on httpx.RequestError."""
@@ -789,7 +798,7 @@ class TestBillingServiceAccountManagement:

        # Assert
        assert result == expected_response
-        mock_send_request.assert_called_once_with("DELETE", "/account/", params={"account_id": account_id})
+        mock_send_request.assert_called_once_with("DELETE", "/account", params={"account_id": account_id})

        def test_is_email_in_freeze_true(self, mock_send_request):
        """Test checking if email is frozen (returns True)."""
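The updated tests imply that `_send_request` now validates the status code on DELETE before parsing JSON, logs the raw response, and raises ValueError. A hedged sketch of that check follows; it is not the actual BillingService code (which also handles retries and other verbs), and the `logger` name is assumed from the patch target `services.billing_service.logger`:

```python
# Hedged sketch of the DELETE handling the updated tests describe.
import logging

import httpx

logger = logging.getLogger(__name__)


def send_delete_sketch(url: str, **kwargs) -> dict:
    response = httpx.request("DELETE", url, **kwargs)
    if response.status_code != httpx.codes.OK:
        # Log the raw body, then fail loudly instead of returning the error JSON.
        logger.error("DELETE response: status=%s body=%s", response.status_code, response.text)
        raise ValueError("Unable to process delete request")
    return response.json()
```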
@@ -0,0 +1,180 @@
"""
Unit tests for archived workflow run deletion service.
"""

from unittest.mock import MagicMock, patch


class TestArchivedWorkflowRunDeletion:
    def test_delete_by_run_id_returns_error_when_run_missing(self):
        from services.retention.workflow_run.delete_archived_workflow_run import ArchivedWorkflowRunDeletion

        deleter = ArchivedWorkflowRunDeletion()
        repo = MagicMock()
        session = MagicMock()
        session.get.return_value = None

        session_maker = MagicMock()
        session_maker.return_value.__enter__.return_value = session
        session_maker.return_value.__exit__.return_value = None
        mock_db = MagicMock()
        mock_db.engine = MagicMock()

        with (
            patch("services.retention.workflow_run.delete_archived_workflow_run.db", mock_db),
            patch(
                "services.retention.workflow_run.delete_archived_workflow_run.sessionmaker", return_value=session_maker
            ),
            patch.object(deleter, "_get_workflow_run_repo", return_value=repo),
        ):
            result = deleter.delete_by_run_id("run-1")

        assert result.success is False
        assert result.error == "Workflow run run-1 not found"
        repo.get_archived_run_ids.assert_not_called()

    def test_delete_by_run_id_returns_error_when_not_archived(self):
        from services.retention.workflow_run.delete_archived_workflow_run import ArchivedWorkflowRunDeletion

        deleter = ArchivedWorkflowRunDeletion()
        repo = MagicMock()
        repo.get_archived_run_ids.return_value = set()
        run = MagicMock()
        run.id = "run-1"
        run.tenant_id = "tenant-1"

        session = MagicMock()
        session.get.return_value = run

        session_maker = MagicMock()
        session_maker.return_value.__enter__.return_value = session
        session_maker.return_value.__exit__.return_value = None
        mock_db = MagicMock()
        mock_db.engine = MagicMock()

        with (
            patch("services.retention.workflow_run.delete_archived_workflow_run.db", mock_db),
            patch(
                "services.retention.workflow_run.delete_archived_workflow_run.sessionmaker", return_value=session_maker
            ),
            patch.object(deleter, "_get_workflow_run_repo", return_value=repo),
            patch.object(deleter, "_delete_run") as mock_delete_run,
        ):
            result = deleter.delete_by_run_id("run-1")

        assert result.success is False
        assert result.error == "Workflow run run-1 is not archived"
        mock_delete_run.assert_not_called()

    def test_delete_by_run_id_calls_delete_run(self):
        from services.retention.workflow_run.delete_archived_workflow_run import ArchivedWorkflowRunDeletion

        deleter = ArchivedWorkflowRunDeletion()
        repo = MagicMock()
        repo.get_archived_run_ids.return_value = {"run-1"}
        run = MagicMock()
        run.id = "run-1"
        run.tenant_id = "tenant-1"

        session = MagicMock()
        session.get.return_value = run

        session_maker = MagicMock()
        session_maker.return_value.__enter__.return_value = session
        session_maker.return_value.__exit__.return_value = None
        mock_db = MagicMock()
        mock_db.engine = MagicMock()

        with (
            patch("services.retention.workflow_run.delete_archived_workflow_run.db", mock_db),
            patch(
                "services.retention.workflow_run.delete_archived_workflow_run.sessionmaker", return_value=session_maker
            ),
            patch.object(deleter, "_get_workflow_run_repo", return_value=repo),
            patch.object(deleter, "_delete_run", return_value=MagicMock(success=True)) as mock_delete_run,
        ):
            result = deleter.delete_by_run_id("run-1")

        assert result.success is True
        mock_delete_run.assert_called_once_with(run)

    def test_delete_batch_uses_repo(self):
        from services.retention.workflow_run.delete_archived_workflow_run import ArchivedWorkflowRunDeletion

        deleter = ArchivedWorkflowRunDeletion()
        repo = MagicMock()
        run1 = MagicMock()
        run1.id = "run-1"
        run1.tenant_id = "tenant-1"
        run2 = MagicMock()
        run2.id = "run-2"
        run2.tenant_id = "tenant-1"
        repo.get_archived_runs_by_time_range.return_value = [run1, run2]

        session = MagicMock()
        session_maker = MagicMock()
        session_maker.return_value.__enter__.return_value = session
        session_maker.return_value.__exit__.return_value = None
        start_date = MagicMock()
        end_date = MagicMock()
        mock_db = MagicMock()
        mock_db.engine = MagicMock()

        with (
            patch("services.retention.workflow_run.delete_archived_workflow_run.db", mock_db),
            patch(
                "services.retention.workflow_run.delete_archived_workflow_run.sessionmaker", return_value=session_maker
            ),
            patch.object(deleter, "_get_workflow_run_repo", return_value=repo),
            patch.object(
                deleter, "_delete_run", side_effect=[MagicMock(success=True), MagicMock(success=True)]
            ) as mock_delete_run,
        ):
            results = deleter.delete_batch(
                tenant_ids=["tenant-1"],
                start_date=start_date,
                end_date=end_date,
                limit=2,
            )

        assert len(results) == 2
        repo.get_archived_runs_by_time_range.assert_called_once_with(
            session=session,
            tenant_ids=["tenant-1"],
            start_date=start_date,
            end_date=end_date,
            limit=2,
        )
        assert mock_delete_run.call_count == 2

    def test_delete_run_dry_run(self):
        from services.retention.workflow_run.delete_archived_workflow_run import ArchivedWorkflowRunDeletion

        deleter = ArchivedWorkflowRunDeletion(dry_run=True)
        run = MagicMock()
        run.id = "run-1"
        run.tenant_id = "tenant-1"

        with patch.object(deleter, "_get_workflow_run_repo") as mock_get_repo:
            result = deleter._delete_run(run)

        assert result.success is True
        mock_get_repo.assert_not_called()

    def test_delete_run_calls_repo(self):
        from services.retention.workflow_run.delete_archived_workflow_run import ArchivedWorkflowRunDeletion

        deleter = ArchivedWorkflowRunDeletion()
        run = MagicMock()
        run.id = "run-1"
        run.tenant_id = "tenant-1"

        repo = MagicMock()
        repo.delete_runs_with_related.return_value = {"runs": 1}

        with patch.object(deleter, "_get_workflow_run_repo", return_value=repo):
            result = deleter._delete_run(run)

        assert result.success is True
        assert result.deleted_counts == {"runs": 1}
        repo.delete_runs_with_related.assert_called_once()
@@ -0,0 +1,65 @@
"""
Unit tests for workflow run restore functionality.
"""

from datetime import datetime
from unittest.mock import MagicMock


class TestWorkflowRunRestore:
    """Tests for the WorkflowRunRestore class."""

    def test_restore_initialization(self):
        """Restore service should respect dry_run flag."""
        from services.retention.workflow_run.restore_archived_workflow_run import WorkflowRunRestore

        restore = WorkflowRunRestore(dry_run=True)

        assert restore.dry_run is True

    def test_convert_datetime_fields(self):
        """ISO datetime strings should be converted to datetime objects."""
        from models.workflow import WorkflowRun
        from services.retention.workflow_run.restore_archived_workflow_run import WorkflowRunRestore

        record = {
            "id": "test-id",
            "created_at": "2024-01-01T12:00:00",
            "finished_at": "2024-01-01T12:05:00",
            "name": "test",
        }

        restore = WorkflowRunRestore()
        result = restore._convert_datetime_fields(record, WorkflowRun)

        assert isinstance(result["created_at"], datetime)
        assert result["created_at"].year == 2024
        assert result["created_at"].month == 1
        assert result["name"] == "test"

    def test_restore_table_records_returns_rowcount(self):
        """Restore should return inserted rowcount."""
        from services.retention.workflow_run.restore_archived_workflow_run import WorkflowRunRestore

        session = MagicMock()
        session.execute.return_value = MagicMock(rowcount=2)

        restore = WorkflowRunRestore()
        records = [{"id": "p1", "workflow_run_id": "r1", "created_at": "2024-01-01T00:00:00"}]

        restored = restore._restore_table_records(session, "workflow_pauses", records, schema_version="1.0")

        assert restored == 2
        session.execute.assert_called_once()

    def test_restore_table_records_unknown_table(self):
        """Unknown table names should be ignored gracefully."""
        from services.retention.workflow_run.restore_archived_workflow_run import WorkflowRunRestore

        session = MagicMock()

        restore = WorkflowRunRestore()
        restored = restore._restore_table_records(session, "unknown_table", [{"id": "x1"}], schema_version="1.0")

        assert restored == 0
        session.execute.assert_not_called()
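`test_convert_datetime_fields` asserts that ISO-formatted strings in archived records come back as `datetime` objects while other values pass through untouched. A minimal sketch of such a conversion, assuming a hard-coded field list for brevity (the real `_convert_datetime_fields` receives the model class and may derive the fields from it; the helper name below is hypothetical):

```python
# Sketch of the datetime conversion the restore tests assert.
from datetime import datetime

DATETIME_FIELDS = ("created_at", "finished_at")  # assumption: the real code likely derives these from the model


def convert_datetime_fields_sketch(record: dict) -> dict:
    return {
        key: datetime.fromisoformat(value) if key in DATETIME_FIELDS and isinstance(value, str) else value
        for key, value in record.items()
    }


converted = convert_datetime_fields_sketch({"id": "test-id", "created_at": "2024-01-01T12:00:00", "name": "test"})
assert isinstance(converted["created_at"], datetime) and converted["name"] == "test"
```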
@@ -2,7 +2,11 @@ from unittest.mock import ANY, MagicMock, call, patch

import pytest

from libs.archive_storage import ArchiveStorageNotConfiguredError
from models.workflow import WorkflowArchiveLog
from tasks.remove_app_and_related_data_task import (
    _delete_app_workflow_archive_logs,
    _delete_archived_workflow_run_files,
    _delete_draft_variable_offload_data,
    _delete_draft_variables,
    delete_draft_variables_batch,
@@ -324,3 +328,68 @@ class TestDeleteDraftVariableOffloadData:

        # Verify error was logged
        mock_logging.exception.assert_called_once_with("Error deleting draft variable offload data:")


class TestDeleteWorkflowArchiveLogs:
    @patch("tasks.remove_app_and_related_data_task._delete_records")
    @patch("tasks.remove_app_and_related_data_task.db")
    def test_delete_app_workflow_archive_logs_calls_delete_records(self, mock_db, mock_delete_records):
        tenant_id = "tenant-1"
        app_id = "app-1"

        _delete_app_workflow_archive_logs(tenant_id, app_id)

        mock_delete_records.assert_called_once()
        query_sql, params, delete_func, name = mock_delete_records.call_args[0]
        assert "workflow_archive_logs" in query_sql
        assert params == {"tenant_id": tenant_id, "app_id": app_id}
        assert name == "workflow archive log"

        mock_query = MagicMock()
        mock_delete_query = MagicMock()
        mock_query.where.return_value = mock_delete_query
        mock_db.session.query.return_value = mock_query

        delete_func("log-1")

        mock_db.session.query.assert_called_once_with(WorkflowArchiveLog)
        mock_query.where.assert_called_once()
        mock_delete_query.delete.assert_called_once_with(synchronize_session=False)


class TestDeleteArchivedWorkflowRunFiles:
    @patch("tasks.remove_app_and_related_data_task.get_archive_storage")
    @patch("tasks.remove_app_and_related_data_task.logger")
    def test_delete_archived_workflow_run_files_not_configured(self, mock_logger, mock_get_storage):
        mock_get_storage.side_effect = ArchiveStorageNotConfiguredError("missing config")

        _delete_archived_workflow_run_files("tenant-1", "app-1")

        assert mock_logger.info.call_count == 1
        assert "Archive storage not configured" in mock_logger.info.call_args[0][0]

    @patch("tasks.remove_app_and_related_data_task.get_archive_storage")
    @patch("tasks.remove_app_and_related_data_task.logger")
    def test_delete_archived_workflow_run_files_list_failure(self, mock_logger, mock_get_storage):
        storage = MagicMock()
        storage.list_objects.side_effect = Exception("list failed")
        mock_get_storage.return_value = storage

        _delete_archived_workflow_run_files("tenant-1", "app-1")

        storage.list_objects.assert_called_once_with("tenant-1/app_id=app-1/")
        storage.delete_object.assert_not_called()
        mock_logger.exception.assert_called_once_with("Failed to list archive files for app %s", "app-1")

    @patch("tasks.remove_app_and_related_data_task.get_archive_storage")
    @patch("tasks.remove_app_and_related_data_task.logger")
    def test_delete_archived_workflow_run_files_success(self, mock_logger, mock_get_storage):
        storage = MagicMock()
        storage.list_objects.return_value = ["key-1", "key-2"]
        mock_get_storage.return_value = storage

        _delete_archived_workflow_run_files("tenant-1", "app-1")

        storage.list_objects.assert_called_once_with("tenant-1/app_id=app-1/")
        storage.delete_object.assert_has_calls([call("key-1"), call("key-2")], any_order=False)
        mock_logger.info.assert_called_with("Deleted %s archive objects for app %s", 2, "app-1")

api/uv.lock: 4678 changes (generated)
File diff suppressed because it is too large

dev/setup: 28 changes (new executable file)
@@ -0,0 +1,28 @@
#!/usr/bin/env bash
set -euo pipefail

SCRIPT_DIR="$(dirname "$(realpath "$0")")"
ROOT="$(dirname "$SCRIPT_DIR")"

API_ENV_EXAMPLE="$ROOT/api/.env.example"
API_ENV="$ROOT/api/.env"
WEB_ENV_EXAMPLE="$ROOT/web/.env.example"
WEB_ENV="$ROOT/web/.env.local"
MIDDLEWARE_ENV_EXAMPLE="$ROOT/docker/middleware.env.example"
MIDDLEWARE_ENV="$ROOT/docker/middleware.env"

# 1) Copy api/.env.example -> api/.env
cp "$API_ENV_EXAMPLE" "$API_ENV"

# 2) Copy web/.env.example -> web/.env.local
cp "$WEB_ENV_EXAMPLE" "$WEB_ENV"

# 3) Copy docker/middleware.env.example -> docker/middleware.env
cp "$MIDDLEWARE_ENV_EXAMPLE" "$MIDDLEWARE_ENV"

# 4) Install deps
cd "$ROOT/api"
uv sync --group dev

cd "$ROOT/web"
pnpm install
@@ -3,8 +3,9 @@
set -x

SCRIPT_DIR="$(dirname "$(realpath "$0")")"
-cd "$SCRIPT_DIR/.."
+cd "$SCRIPT_DIR/../api"

+uv run flask db upgrade

-uv --directory api run \
+uv run \
    flask run --host 0.0.0.0 --port=5001 --debug

dev/start-docker-compose: 8 changes (new executable file)
@@ -0,0 +1,8 @@
#!/usr/bin/env bash
set -euo pipefail

SCRIPT_DIR="$(dirname "$(realpath "$0")")"
ROOT="$(dirname "$SCRIPT_DIR")"

cd "$ROOT/docker"
docker compose -f docker-compose.middleware.yaml --profile postgresql --profile weaviate -p dify up -d
Some files were not shown because too many files have changed in this diff.