Skip to content

Add a TurnDiffTracker to create a unified diff for an entire turn #1770

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 22 commits into
base: main
Choose a base branch
from

Conversation

gpeal
Copy link
Collaborator

@gpeal gpeal commented Jul 31, 2025

This lets us show an accumulating diff across all patches in a turn. Refer to the docs for TurnDiffTracker for implementation details.

There are multiple ways this could have been done and this felt like the right tradeoff between reliability and completeness:
Pros

  • It will pick up all changes to files that the model touched including if they prettier or another command that updates them.
  • It will not pick up changes made by the user or other agents to files it didn't modify.

Cons

  • It will pick up changes that the user made to a file that the model also touched
  • It will not pick up changes to codegen or files that were not modified with apply_patch

@gpeal gpeal added the codex-rust-review Perform a detailed review of Rust changes. label Jul 31, 2025
@github-actions github-actions bot added codex-rust-review-in-progress and removed codex-rust-review Perform a detailed review of Rust changes. labels Jul 31, 2025
Copy link

Summary

Adds TurnDiffTracker and a new TurnDiff event to aggregate and stream the unified diff for all file changes in a user turn; integrates tracker throughout the codex execution flow and moves tempfile to non-dev deps.

Notes

  • New turn_diff_tracker.rs (≈480 LoC) with tests.
  • TurnDiffEvent variant added to EventMsg; event emitted after every command / patch.
  • Codex flow updated (on_exec_command_* helpers) to drive the tracker.
  • tempfile promoted to [dependencies], ChatGPT-specific header removal in client.

Review

Nice feature – delivers running diffs and keeps core compile-clean. A few suggestions:

  1. PR body lacks motivation/UX demo; please add a brief rationale and, if possible, a GIF or screenshot of the streamed diff.
  2. Keep [dependencies] alpha-sorted: place tempfile before tree-sitter-bash.
  3. turn_diff_tracker.rs is large; consider splitting helpers/tests into a sub-module to aid navigation.
  4. Minor style: some if let chains in on_patch_begin could be flattened with pattern-matching for clarity.
  5. Please update CHANGELOG and any external protocol docs to mention the new TurnDiff event.

Otherwise, looks solid – great work!


View workflow run

@gpeal gpeal marked this pull request as ready for review July 31, 2025 23:51
@@ -525,6 +527,11 @@ pub struct PatchApplyEndEvent {
pub success: bool,
}

#[derive(Debug, Clone, Deserialize, Serialize)]
pub struct TurnDiffEvent {
pub unified_diff: String,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this would be easier to work with programmatically if this were keyed by path, more like changes in PatchApplyBeginEvent. Maybe for a full add or a full delete for an individual file, we still want the unified diff, but it's nice to have added/modified/removed metadata for each path so it's easy to build a compact summary for the diff (maybe with +/- line counts)?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What guarantees, if any, can we make about the paths in the unified_diff: will they all be absolute paths?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't currently need this because we parse the whole unified diff. Could we add it if/when we need it?

@@ -1471,6 +1528,7 @@ fn maybe_run_with_user_profile(params: ExecParams, sess: &Session) -> ExecParams
async fn handle_container_exec_with_params(
params: ExecParams,
sess: &Session,
turn_diff_tracker: &mut TurnDiffTracker,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, what would happen if we wanted to support parallel tool calls at one point. This would be a problem, no?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you think it would break?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because only one tool call could take ownership of TurnDiffTracker.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can solve this then. Maybe a Mutex or a channel or something, right?

Comment on lines 16 to 21
/// 1. Create a temp directory to store baseline snapshots of files when they are first seen.
/// 2. When a path is first observed, copy its current contents into the baseline dir if it exists on disk.
/// For new additions, do not create a baseline file so that diffs are shown as proper additions (using /dev/null).
/// 3. Keep a stable internal filename (uuid + same extension) per external path for path rewrite in diffs.
/// 4. To compute the aggregated unified diff, compare each baseline snapshot to the current file on disk using
/// `git diff --no-index` and rewrite paths to external paths.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like https://crates.io/crates/similar or something would be a cleaner approach than maintaining a tmp dir; everything could be in-memory and there would be no need to rewrite anything.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, gave it a shot and I think it's cleaner.

let id = Uuid::new_v4().to_string();
match path.extension().and_then(|e| e.to_str()) {
Some(ext) if !ext.is_empty() => format!("{id}.{ext}"),
_ => id,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_ => id,
None => id,

fn rewrite_diff_paths(&self, diff: &str) -> String {
let mut out = String::new();
for line in diff.lines() {
if let Some(rest) = line.strip_prefix("diff --git ") {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This gets indented pretty far, so maybe it's worth moving to a helper function that takes (&mut String, temp_name_to_current_external, temp_name_to_baseline_external)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fully removed in the new version

fs::write(&file, "foo\n").unwrap();
acc.update_and_get_unified_diff().unwrap();
let first = acc.unified_diff.clone().unwrap();
assert!(first.contains("+foo"));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of contains() checks, can these all be full assert_eq!() checks?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't trivial because it included the TempDir path (I originally tried to do it that way). If I switch to similar, I can change this though!


#[cfg(test)]
mod tests {
#![allow(clippy::unwrap_used)]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also test that paths with spaces work as intended?

Ok(())
}

/// Recompute the aggregated unified diff by comparing all baseline snapshots against
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you expand this comment? I don't have a great mental model of the structure you're trying to set up for the ultimate git diff call.

I want to understand why this isn't something simpler like diff -u backed-up-file current-file.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The version I'm about to push uses similar and does it all in-memory

@gpeal gpeal force-pushed the gpeal/patch-accumulator branch 2 times, most recently from b366916 to 89fba63 Compare August 3, 2025 14:22
@gpeal gpeal force-pushed the gpeal/patch-accumulator branch from 89fba63 to 7d354d7 Compare August 3, 2025 14:22
@gpeal gpeal requested a review from bolinfest August 3, 2025 14:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants