AGENTS.md
Purpose
This repository contains a BI and LiveOps system for a coloring game. Agents working in this repo should optimize for data correctness, operational safety, and low-risk incremental changes.
Use this file as the default project behavior guide for future coding, debugging, and optimization work.
Repository Overview
- oms/: Node.js + TypeScript backend, data services, cron jobs, admin APIs.
- omsapp/: Angular admin dashboard.
- oms/services/event-api-service.ts: receives analytics events and publishes to RabbitMQ.
- oms/services/log-service.ts: consumes RabbitMQ events and writes rotating logs.
- oms/services/ingestor-service.ts: consumes RabbitMQ events and writes to ClickHouse and MongoDB.
- oms/services/cron-jobs/done-rate.ts: daily completion-rate aggregation job.
- oms/src/services/clickhouseService.ts: ClickHouse table management and query wrapper.
- oms/scripts/: operational and migration scripts.
Working Principles
- Prefer small, reversible changes over broad rewrites.
- Preserve production behavior unless the task explicitly requires a behavior change.
- Treat data scripts, cron jobs, and ingestion paths as high-risk surfaces.
- When changing analytics logic, explain the data impact clearly.
- When changing operational scripts, favor idempotent and restart-safe behavior.
High-Risk Areas
Changes in these areas require extra care and focused validation:
- RabbitMQ publish and consume paths
- ClickHouse schema, partitioning, and migration scripts
- MongoDB aggregation and batch update logic
- Cron jobs that scan large datasets
- PM2-managed services and cutover scripts
Default Behavior Expectations
Code Changes
- Fix the narrowest correct slice first.
- Do not refactor unrelated code while touching a critical data path.
- Reuse existing service abstractions unless they are the root cause.
- Keep public API and payload formats stable unless explicitly asked to change them.
Data and Analytics Changes
- Prefer query-shape optimization before changing product semantics.
- For ClickHouse queries, always consider:
  - partition pruning
  - scanned time range
  - aggregation fanout
  - whether multiple scans can be merged into one
- For long-running jobs, add execution timing and result-size logs when useful.
- Use half-open time ranges: [start, nextStart).
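As an illustration of the half-open range rule, the sketch below builds a one-day range and a matching filter on the time column. The helper names (utcDayRange, toClickHouseDateTime, timeFilter) are hypothetical and not taken from this repo, and the sketch assumes time is a DateTime column compared in UTC.

```typescript
// Hypothetical helpers for illustration; none of these names exist in the repo.
interface TimeRange {
  start: Date; // inclusive lower bound
  nextStart: Date; // exclusive upper bound
}

// Returns the UTC day containing `day` as a half-open range [start, nextStart).
function utcDayRange(day: Date): TimeRange {
  const start = new Date(
    Date.UTC(day.getUTCFullYear(), day.getUTCMonth(), day.getUTCDate())
  );
  const nextStart = new Date(start.getTime() + 24 * 60 * 60 * 1000);
  return { start, nextStart };
}

// Formats a Date as 'YYYY-MM-DD HH:MM:SS' in UTC (assumed column timezone).
function toClickHouseDateTime(d: Date): string {
  return d.toISOString().slice(0, 19).replace("T", " ");
}

// Builds a filter with >= on the start and < on the next start, so adjacent
// ranges never double-count or skip an event on the boundary.
function timeFilter(range: TimeRange): string {
  return (
    `time >= '${toClickHouseDateTime(range.start)}'` +
    ` AND time < '${toClickHouseDateTime(range.nextStart)}'`
  );
}
```

Because the upper bound is exclusive, consecutive daily runs can reuse the previous nextStart as the next start without overlap.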
Operational Scripts
- Scripts should default to safe behavior.
- Prefer dry-run by default for destructive or cutover actions.
- Print the exact SQL or command plan before executing.
- Avoid relying on host-installed database tools when the repo already uses Dockerized services.
- When reconciling data, prefer idempotent logic keyed by stable identifiers such as log_id.
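One way to get dry-run-by-default behavior with a printed plan is sketched below. The --execute flag, the PlanStep shape, and the function names are assumptions made for illustration; existing scripts under oms/scripts/ may use different conventions.

```typescript
// Hypothetical plan step; `sql` could equally be a shell command.
interface PlanStep {
  description: string;
  sql: string;
}

// Dry-run unless the (assumed) --execute flag is passed explicitly.
function isDryRun(argv: string[]): boolean {
  return !argv.includes("--execute");
}

// Renders the exact statements so they can be reviewed before anything runs.
function renderPlan(steps: PlanStep[], dryRun: boolean): string {
  const header = dryRun
    ? "DRY RUN: nothing will be executed"
    : "EXECUTING the following plan";
  const lines = steps.map((s, i) => `${i + 1}. ${s.description}\n   ${s.sql}`);
  return [header, ...lines].join("\n");
}

// Prints the plan, then runs each step only when not in dry-run mode.
// Returns the number of steps actually executed.
async function runPlan(
  steps: PlanStep[],
  argv: string[],
  execute: (sql: string) => Promise<void>
): Promise<number> {
  const dryRun = isDryRun(argv);
  console.log(renderPlan(steps, dryRun));
  if (dryRun) return 0;
  let executed = 0;
  for (const step of steps) {
    await execute(step.sql);
    executed += 1;
  }
  return executed;
}
```

Making execution the opt-in path means a mistyped or forgotten flag results in a printed plan rather than a destructive action.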
Frontend and Generated Assets
- Do not manually edit built Angular assets under oms/public/app/ unless explicitly asked.
- Prefer editing source files under omsapp/src/ and rebuilding.
Validation Rules
After making changes, prefer the narrowest useful validation in this order:
- file-level type or syntax validation
- targeted script execution or query validation
- focused runtime check on the changed service or cron job
- broader build only if needed
For database or migration work:
- validate counts before and after
- validate per-month or per-partition distribution when relevant
- keep an explicit rollback path
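The before/after count check can be sketched as a small comparison over per-partition row counts. The types and function name are hypothetical; how the counts are collected (for example, a GROUP BY over the partition key) is left to the script.

```typescript
// Maps a partition ID such as "202401" to its row count.
type PartitionCounts = Record<string, number>;

interface CountMismatch {
  partition: string;
  before: number;
  after: number;
}

// Returns every partition whose row count differs between the two snapshots.
// A partition missing from one side is treated as having zero rows.
function diffPartitionCounts(
  before: PartitionCounts,
  after: PartitionCounts
): CountMismatch[] {
  const partitions = Array.from(
    new Set(Object.keys(before).concat(Object.keys(after)))
  ).sort();
  const mismatches: CountMismatch[] = [];
  for (const partition of partitions) {
    const b = before[partition] ?? 0;
    const a = after[partition] ?? 0;
    if (a !== b) mismatches.push({ partition, before: b, after: a });
  }
  return mismatches;
}
```

An empty result means the distribution is unchanged. A non-empty result pinpoints which partitions to inspect before deciding whether to roll back.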
Environment Conventions
- This project commonly runs ClickHouse in Docker.
- Prefer docker exec ... clickhouse-client ... over assuming clickhouse-client exists on the host.
- PM2 is used in production-like environments.
- Be careful not to restart unrelated services during operational changes.
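A minimal sketch of routing a query through the container rather than a host-installed client follows. The function name is hypothetical and the container name must come from the actual environment; only the docker exec subcommand and the clickhouse-client --query option are relied on.

```typescript
// Builds the argument list for:
//   docker exec <container> clickhouse-client --query <sql>
// The container name is an input because it differs per environment.
function clickhouseDockerArgs(container: string, sql: string): string[] {
  return ["exec", container, "clickhouse-client", "--query", sql];
}
```

Passing the result to execFile("docker", args) from Node's child_process module, instead of concatenating a shell string, avoids shell quoting problems when the SQL contains quotes.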
Current Project-Specific Guidance
Done Rate
oms/services/cron-jobs/done-rate.ts is a hotspot.
- Keep ClickHouse aggregation consolidated when possible.
- Watch both ClickHouse query time and MongoDB update time.
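To watch both phases separately, each one can be wrapped in a timing helper along these lines. This is a sketch under assumptions: the helper names, the log format, and the step labels are illustrative and do not reflect what done-rate.ts currently does.

```typescript
interface StepTiming {
  step: string;
  ms: number;
  rows: number | null; // null when the result size is not meaningful
}

// Formats one timing record as a single greppable log line.
function formatTiming(t: StepTiming): string {
  const rows = t.rows === null ? "" : ` rows=${t.rows}`;
  return `[done-rate] step=${t.step} ms=${t.ms}${rows}`;
}

// Runs `fn`, logs how long it took and (optionally) the result size,
// then returns the result unchanged.
async function timed<T>(
  step: string,
  fn: () => Promise<T>,
  countRows?: (result: T) => number
): Promise<T> {
  const startedAt = Date.now();
  const result = await fn();
  const ms = Date.now() - startedAt;
  const rows = countRows ? countRows(result) : null;
  console.log(formatTiming({ step, ms, rows }));
  return result;
}
```

Wrapping the ClickHouse aggregation and the MongoDB batch update in separate timed calls shows which side dominates a slow run.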
Ingestor Reliability
oms/services/ingestor-service.ts currently accepts a small amount of analytics loss as a tradeoff.
- Do not change acknowledgement semantics unless the task explicitly targets ingestion reliability.
ClickHouse Storage
- The events table is partitioned monthly by toYYYYMM(time).
- Future schema changes must preserve partition-aware query patterns.
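As a quick sanity check on partition-aware queries, the sketch below lists which monthly partitions a half-open range [start, nextStart) would touch. The function name is hypothetical, and it assumes toYYYYMM(time) is evaluated in UTC; if the column or server uses another timezone, the month boundaries shift accordingly.

```typescript
// Lists toYYYYMM-style partition IDs overlapped by [start, nextStart), in UTC.
function monthlyPartitions(start: Date, nextStart: Date): string[] {
  if (nextStart.getTime() <= start.getTime()) return [];
  // The upper bound is exclusive, so step back 1 ms to find the last month hit.
  const last = new Date(nextStart.getTime() - 1);
  const lastYear = last.getUTCFullYear();
  const lastMonth = last.getUTCMonth(); // 0-based
  const partitions: string[] = [];
  let year = start.getUTCFullYear();
  let month = start.getUTCMonth(); // 0-based
  while (year < lastYear || (year === lastYear && month <= lastMonth)) {
    const mm = month + 1 < 10 ? `0${month + 1}` : `${month + 1}`;
    partitions.push(`${year}${mm}`);
    month += 1;
    if (month === 12) {
      month = 0;
      year += 1;
    }
  }
  return partitions;
}
```

A daily job should normally resolve to a single partition. If a query's range maps to more partitions than expected, the time filter is probably too wide or missing.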
Tracking and Follow-Up
- Use oms/OPTIMIZATION_TRACKER.md as the source of truth for known completed and pending optimizations.
- When finishing a meaningful optimization, update that tracker.
When Unsure
- Choose the safer operational path.
- Prefer observability over speculation.
- Add logs and narrow validation before making a second larger change.