feat(telegram): attribute forwarded messages #37

Merged
weiwen merged 1 commit from sandcastle/issue-25 into main 2026-07-05 14:11:36 +08:00
Summary

When a Telegram message is a forward, prepend a compact attribution prefix to its text before it reaches the pi pipeline, so the model knows where the content came from.

Format: [Forwarded from <name>, YYYY-MM-DD], or [Forwarded] when the origin has no displayable name (e.g. a Chat or Channel with no title).

What changed

  • format_forward_prefix(origin: &MessageOrigin): maps all four MessageOrigin variants (User, HiddenUser, Chat, Channel) to a human-readable attribution string with an ISO date.
  • apply_forward_prefix(extraction, prefix): prepends the prefix to Extraction::Content text; leaves FileTooLarge and Unhandled untouched.
  • extract_message_content now wraps the renamed extract_message_content_raw, reading msg.forward_origin() and applying the prefix in one place before returning.
  • Five unit tests cover both functions: named user, hidden user, prefix applied, prefix absent (None), and Unhandled passthrough.
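
The three pieces above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `Origin` is a simplified stand-in for teloxide's `MessageOrigin` (it carries a pre-formatted date string and an optional title), the `Extraction` fields are guessed from the description, the `Option<&str>` prefix type and the newline separator between prefix and text are assumptions, and the wrapper is synchronous and takes the raw extraction directly instead of a Telegram message.

```rust
// Stand-in for teloxide's `MessageOrigin` (hypothetical, simplified shape).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Origin {
    User { name: String, date: String },
    HiddenUser { name: String, date: String },
    Chat { title: Option<String>, date: String },
    Channel { title: Option<String>, date: String },
}

// Stand-in for the crate's `Extraction` type; variant names come from the
// PR text, the field layout is an assumption.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Extraction {
    Content { text: String },
    FileTooLarge,
    Unhandled,
}

/// Build the attribution prefix, falling back to "[Forwarded]" when the
/// origin has no displayable name.
fn format_forward_prefix(origin: &Origin) -> String {
    let (name, date) = match origin {
        Origin::User { name, date } | Origin::HiddenUser { name, date } => {
            (Some(name.as_str()), date)
        }
        Origin::Chat { title, date } | Origin::Channel { title, date } => {
            (title.as_deref(), date)
        }
    };
    match name {
        Some(n) if !n.is_empty() => format!("[Forwarded from {}, {}]", n, date),
        _ => "[Forwarded]".to_string(),
    }
}

/// Prepend the prefix to `Content` text; every other variant, and a `None`
/// prefix, passes through unchanged.
fn apply_forward_prefix(extraction: Extraction, prefix: Option<&str>) -> Extraction {
    match (extraction, prefix) {
        (Extraction::Content { text }, Some(p)) => Extraction::Content {
            // Assumption: prefix and body are separated by a newline.
            text: format!("{}\n{}", p, text),
        },
        (other, _) => other,
    }
}

/// Wrapper: take the already-extracted content and apply the prefix in one
/// place, so every content branch gets the attribution.
fn extract_message_content(raw: Extraction, origin: Option<&Origin>) -> Extraction {
    let prefix = origin.map(format_forward_prefix);
    apply_forward_prefix(raw, prefix.as_deref())
}
```

Keeping `apply_forward_prefix` free of any Telegram types is what makes it testable without constructing a full message, which matches the "prefix applied", "prefix absent (None)", and "Unhandled passthrough" test cases listed above.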

Key decisions

  • Attribution is injected as plain text rather than a separate metadata field — keeps the pi integration simple with no protocol changes.
  • Date format is YYYY-MM-DD for unambiguous, locale-neutral display.
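
To make the YYYY-MM-DD choice concrete, here is a std-only sketch that turns a Unix timestamp (seconds, UTC) into that format using the well-known days-to-civil-date conversion. The function name `iso_date_from_unix` is hypothetical, and the PR does not show how the date is actually formatted; the real code may well rely on a date library instead.

```rust
/// Convert a Unix timestamp in seconds (UTC) to a `YYYY-MM-DD` string.
/// Uses the days-to-civil-date algorithm on the proleptic Gregorian calendar.
fn iso_date_from_unix(secs: i64) -> String {
    // Whole days since 1970-01-01; div_euclid floors for negative inputs.
    let days = secs.div_euclid(86_400);
    // Shift the epoch to 0000-03-01 so leap days fall at the end of a year.
    let z = days + 719_468;
    let era = z.div_euclid(146_097); // 400-year cycles
    let doe = z.rem_euclid(146_097); // day of era, 0..=146096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // 0..=399
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, 0..=365
    let mp = (5 * doy + 2) / 153; // month index starting at March, 0..=11
    let day = doy - (153 * mp + 2) / 5 + 1; // 1..=31
    let month = if mp < 10 { mp + 3 } else { mp - 9 }; // 1..=12
    // January and February belong to the following civil year.
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    format!("{:04}-{:02}-{:02}", year, month, day)
}
```

Zero-padded fields keep the string sortable and free of the day/month ordering ambiguity that locale-specific formats carry.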

Reviewer checklist

  • Prefix format reads naturally in actual pi conversations
  • All four MessageOrigin arms are handled (confirm no future variants are expected by teloxide)
  • Tests cover the edge cases you care about

Closes #25

RALPH: feat(telegram): attribute forwarded messages (closes #25)
Some checks failed
CI / check (pull_request) Failing after 56s
87513d3163
Task: Detect Telegram forward_origin on incoming messages and prepend a
compact attribution line so pi knows the content originated elsewhere.

Key decisions:
- format_forward_prefix(origin: &MessageOrigin) -> String: maps all four
  MessageOrigin variants (User, HiddenUser, Chat, Channel) to
  "[Forwarded from <name>, YYYY-MM-DD]"; falls back to "[Forwarded]" when
  no displayable name is available (e.g. Chat/Channel with no title).
- apply_forward_prefix(extraction, prefix) -> Extraction: pure helper that
  prepends the prefix string to Extraction::Content.text, leaving
  FileTooLarge and Unhandled untouched. Extracted for unit-testability.
- extract_message_content now delegates raw extraction to
  extract_message_content_raw (renamed from the previous body) then
  applies the prefix in one place, so text, photo, and document branches
  all benefit automatically.
- MessageOrigin added to the teloxide::types import list.

Files changed:
- src/telegram/mod.rs: format_forward_prefix, apply_forward_prefix,
  extract_message_content refactor, extract_message_content_raw,
  5 new unit tests (User prefix, HiddenUser prefix, prefix composition
  with content, None-prefix no-op, Unhandled pass-through).

Blockers/notes:
- No Rust toolchain in sandbox; CI must validate compilation and tests.
  Logic follows the same async extraction pattern used for documents and
  photos merged in prior commits.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
weiwen merged commit 6087d159e9 into main 2026-07-05 14:11:36 +08:00
Reference
weiwen/evie!37