CS766 Final Project

Automatic Photo Augmentation from a Single Image

This project builds a fully automatic editing pipeline from a single photo. The system identifies the main subject, estimates depth, adjusts composition with an adaptive crop, and applies lightweight style enhancement, all without manual retouching.

Team: Jiapeng Zeng, Heyan Zhang, Lanxi Zhang
Course: CS766, University of Wisconsin-Madison
Project Focus: Subject emphasis, smart crop, and automatic photo augmentation

Overview

Many photos fail not because the scene is bad, but because the framing, focus, or style does not match the subject the user actually cares about. We frame the task as perceptual post-capture optimization: automatically detect the most visually important region, organize the composition around it, and render a stronger, more intentional image.

  • Input: one RGB photograph
  • Automatic subject emphasis: depth, contrast, saturation, and center priors
  • Composition adjustment: adaptive crop guided by subject placement
  • Output: a refocused and photo-enhanced image with clearer visual hierarchy

Pipeline

The demo pipeline below shows the full CS766 flow from the original photo to the final augmented result.

CS766 pipeline diagram

1. Depth Estimation

We estimate monocular depth with Depth Anything V2 when the environment supports it, and fall back to a heuristic pseudo-depth prior when local torch and torchvision versions are mismatched.
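A minimal sketch of this robust path is shown below. The `depth_anything_v2` module name and `DepthAnythingV2.infer` call are hypothetical stand-ins for the real model wrapper, and the pseudo-depth cues (vertical position plus brightness) are illustrative, not the project's exact heuristic:

```python
import numpy as np

def estimate_depth(image: np.ndarray) -> np.ndarray:
    """Return a depth map in [0, 1]; smaller values read as nearer.

    Tries Depth Anything V2 first; on any import or version mismatch,
    falls back to a heuristic pseudo-depth prior.
    """
    try:
        # Hypothetical wrapper around the real model; the exact API may differ.
        from depth_anything_v2 import DepthAnythingV2
        return DepthAnythingV2().infer(image)
    except Exception:
        return pseudo_depth(image)

def pseudo_depth(image: np.ndarray) -> np.ndarray:
    """Heuristic prior: lower rows and brighter regions read as nearer."""
    h, w = image.shape[:2]
    vertical = np.linspace(1.0, 0.0, h)[:, None]           # top rows -> far
    gray = image.mean(axis=2) if image.ndim == 3 else image
    brightness = gray.astype(np.float32) / 255.0
    depth = 0.7 * vertical + 0.3 * (1.0 - brightness)       # blend the two cues
    return (depth - depth.min()) / (np.ptp(depth) + 1e-8)   # normalize to [0, 1]
```

Because the model import sits inside the `try`, a broken torch/torchvision pairing degrades gracefully to the pseudo-depth path instead of crashing the demo.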

Robust demo path

2. Automatic Subject Prior

We combine near-depth preference, local contrast, saturation, edge strength, and a soft center bias to infer which region is most likely to be the intended subject.
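The cue combination can be sketched as a weighted sum of normalized maps. The weights below are illustrative, not the project's tuned values, and simple gradients stand in for the actual contrast and edge features:

```python
import numpy as np

def subject_prior(image: np.ndarray, depth: np.ndarray) -> np.ndarray:
    """Combine near-depth, saturation, edge strength, and a soft center bias."""
    img = image.astype(np.float32) / 255.0
    h, w = depth.shape

    near = 1.0 - depth                                # prefer near regions
    sat = img.max(axis=2) - img.min(axis=2)           # crude per-pixel saturation
    gy, gx = np.gradient(img.mean(axis=2))            # brightness gradients
    edges = np.hypot(gx, gy)
    edges /= edges.max() + 1e-8

    # Soft Gaussian-shaped center bias.
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    center = np.exp(-(((yy - cy) / h) ** 2 + ((xx - cx) / w) ** 2) / 0.08)

    prior = 0.35 * near + 0.2 * sat + 0.2 * edges + 0.25 * center
    return (prior - prior.min()) / (np.ptp(prior) + 1e-8)
```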

No manual click required

3. Adaptive Focus and Crop

The subject prior drives automatic focus distance estimation and a crop box that tries to keep the subject comfortably framed while improving balance.
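One way to derive such a crop box is sketched below, assuming the prior from the previous step. The threshold rule, `margin` padding, and default aspect ratio are illustrative parameters, not the project's actual values:

```python
import numpy as np

def adaptive_crop(prior: np.ndarray, aspect: float = 4 / 3, margin: float = 0.15):
    """Return a crop box (x0, y0, x1, y1) that keeps the subject padded and framed."""
    h, w = prior.shape
    mask = prior > prior.mean() + prior.std()       # keep the top of the prior
    if not mask.any():
        return 0, 0, w, h                           # nothing salient: keep full frame
    ys, xs = np.nonzero(mask)
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1

    # Pad the subject box, then clamp to the frame.
    pad_y, pad_x = int(margin * (y1 - y0)), int(margin * (x1 - x0))
    y0, y1 = max(0, y0 - pad_y), min(h, y1 + pad_y)
    x0, x1 = max(0, x0 - pad_x), min(w, x1 + pad_x)

    # Expand the shorter side toward the target aspect ratio.
    bw, bh = x1 - x0, y1 - y0
    if bw / bh < aspect:
        need = min(int(bh * aspect), w) - bw
        x0 = max(0, x0 - need // 2)
        x1 = min(w, x0 + bw + need)
    else:
        need = min(int(bw / aspect), h) - bh
        y0 = max(0, y0 - need // 2)
        y1 = min(h, y0 + bh + need)
    return x0, y0, x1, y1
```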

Composition aware

4. Style Enhancement

We apply lightweight local contrast, color, and detail enhancement, with stronger sharpening concentrated around the automatically selected subject.
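A minimal sketch of subject-weighted enhancement follows; the project's actual chain is richer, and the contrast gain and sharpening amounts here are made-up illustrative numbers:

```python
import numpy as np

def enhance(image: np.ndarray, subject: np.ndarray) -> np.ndarray:
    """Mild global contrast plus unsharp masking weighted by the subject prior."""
    img = image.astype(np.float32) / 255.0
    img = np.clip(0.5 + 1.15 * (img - 0.5), 0.0, 1.0)   # gentle contrast stretch

    def blur3(ch):
        # 3x3 box blur with edge padding (cheap stand-in for a Gaussian).
        p = np.pad(ch, 1, mode="edge")
        return (p[:-2, :-2] + p[:-2, 1:-1] + p[:-2, 2:]
                + p[1:-1, :-2] + p[1:-1, 1:-1] + p[1:-1, 2:]
                + p[2:, :-2] + p[2:, 1:-1] + p[2:, 2:]) / 9.0

    blurred = np.stack([blur3(img[..., c]) for c in range(img.shape[2])], axis=-1)
    amount = 0.4 + 0.8 * subject[..., None]             # sharpen more on the subject
    out = np.clip(img + amount * (img - blurred), 0.0, 1.0)
    return (out * 255).astype(np.uint8)
```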

Presentation ready

Method Details

Depth and Geometry

Dense depth estimation gives us a rough geometric layout. We smooth and remap the depth field before using it for focus simulation and subject ranking.
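The smooth-and-remap step might look like the following sketch, where the 3x3 smoothing kernel and the 2nd/98th percentile clipping are illustrative choices rather than the project's exact parameters:

```python
import numpy as np

def prepare_depth(raw_depth: np.ndarray) -> np.ndarray:
    """Smooth the raw depth field and remap it to [0, 1] via robust percentiles."""
    p = np.pad(raw_depth, 1, mode="edge")               # 3x3 box smoothing
    d = (p[:-2, :-2] + p[:-2, 1:-1] + p[:-2, 2:]
         + p[1:-1, :-2] + p[1:-1, 1:-1] + p[1:-1, 2:]
         + p[2:, :-2] + p[2:, 1:-1] + p[2:, 2:]) / 9.0
    lo, hi = np.percentile(d, [2, 98])                  # ignore depth outliers
    return np.clip((d - lo) / (hi - lo + 1e-8), 0.0, 1.0)
```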

Depth method visualization

Depth-Aware Rendering

We compute a Circle of Confusion map and blend multiple blur layers to produce depth-aware focus effects that strengthen subject separation before the final enhancement step.
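The layered-blur idea can be sketched as below. The simplified CoC formula (`aperture * |depth - focus|` in normalized depth units) and the repeated 3x3 blur used to build layers are assumptions standing in for the renderer's actual optics model:

```python
import numpy as np

def refocus(image: np.ndarray, depth: np.ndarray,
            focus: float = 0.3, aperture: float = 2.0, levels: int = 4) -> np.ndarray:
    """Blend progressively blurrier layers by a normalized Circle of Confusion."""
    img = image.astype(np.float32)
    coc = np.clip(aperture * np.abs(depth - focus), 0.0, 1.0)  # simplified CoC

    def blur3(ch):
        p = np.pad(ch, 1, mode="edge")
        return (p[:-2, :-2] + p[:-2, 1:-1] + p[:-2, 2:]
                + p[1:-1, :-2] + p[1:-1, 1:-1] + p[1:-1, 2:]
                + p[2:, :-2] + p[2:, 1:-1] + p[2:, 2:]) / 9.0

    # Layer k is the previous layer blurred once more.
    layers = [img]
    for _ in range(levels - 1):
        prev = layers[-1]
        layers.append(np.stack([blur3(prev[..., c]) for c in range(3)], axis=-1))

    # Linearly blend the two layers bracketing each pixel's CoC.
    idx = coc * (levels - 1)
    lo = np.floor(idx).astype(int)
    hi = np.minimum(lo + 1, levels - 1)
    t = idx - lo
    out = np.zeros_like(img)
    for k in range(levels):
        weight = np.where(lo == k, 1.0 - t, 0.0) + np.where(hi == k, t, 0.0)
        out += weight[..., None] * layers[k]
    return np.clip(out, 0.0, 255.0).astype(np.uint8)
```

Pixels exactly at the focus depth get CoC zero and pass through unblurred, which is what strengthens the subject separation.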

Refocus method visualization

Automatic Augmentation

The detected subject prior drives a crop and enhancement pass that directly improves framing, emphasis, and presentation quality in the final image.

Automatic augmentation summary

Interactive Results

These examples compare the input photo, depth-aware rendering outputs, and the final automatic augmentation stage.

Original vs. Depth Prior

Drag the handle to compare the input photograph against the estimated depth visualization.

Original image
Depth map

Original vs. Refocused

This component uses the estimated focus distance and depth map to synthesize a shallower depth of field.

Original image
Refocused image

CoC Visualization

The Circle of Confusion map highlights where the renderer assigns stronger blur.

CoC map

Automatic Augmentation Result

The left image is the adaptive crop before enhancement. The right image is the final augmented result.

Adaptive crop original
Automatic augmentation output

Subject Prior and Crop

The overlay shows the saliency prior, the detected subject box, and the crop selected for final enhancement.

Automatic subject overlay

Summary Strip

Automatic augmentation summary strip

Depth-of-Field Sweep Across Apertures

These samples show how the renderer responds when the aperture changes.

Automatic Focus Sweep

This animation shows two synchronized views: the current focus render on the left and the final augmented result on the right, so both the focus change and the enhancement pass are visible.

Focus sweep animation

Implementation

Core Components

  • Depth Anything V2 or pseudo-depth fallback for robust demo execution
  • Depth-aware Circle of Confusion renderer for controllable focus effects
  • Automatic subject saliency from depth, contrast, saturation, and center priors
  • Adaptive crop and subject-aware enhancement for final photo augmentation

Command Examples

python CS766_Project/CS766_AutoPhoto_Core/auto_augment.py Image_Folder/pic-24.jpg --output-dir Auto_Augment_Results
python CS766_Project/CS766_AutoPhoto_Webpage/prepare_webpage_images.py Image_Folder/pic-24.jpg --output-dir CS766_Project/CS766_AutoPhoto_Webpage/webpage_images
python CS766_Project/CS766_AutoPhoto_Webpage/update_html_paths.py webpage_images

Film Style Presets

Beyond depth-of-field simulation, the pipeline supports personalized color grading inspired by classic film stocks and cinematic looks. Each preset adjusts tone curves, saturation, shadow and highlight tints, spatially correlated film grain, and a subject-anchored vignette. All of these are applied after the subject-aware augmentation stage, so the look stays consistent regardless of composition.
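A single preset might be sketched as below. Every numeric value (shadow lift, tint gains, grain strength, vignette falloff) is illustrative, and the grain here is plain white noise rather than the spatially correlated grain the pipeline uses:

```python
import numpy as np

def apply_film_preset(image: np.ndarray,
                      subject_center=(0.5, 0.5),
                      grain: float = 0.02, seed: int = 0) -> np.ndarray:
    """One hypothetical preset: lifted shadows, warm tint, grain, subject vignette."""
    rng = np.random.default_rng(seed)
    img = image.astype(np.float32) / 255.0

    # Tone curve: lift shadows slightly and compress toward the highlights.
    img = np.clip(0.04 + 0.96 * img ** 0.9, 0.0, 1.0)

    # Warm tint: nudge red up, blue down.
    img[..., 0] *= 1.03
    img[..., 2] *= 0.97

    # Film grain (white noise stand-in for spatially correlated grain).
    img += grain * rng.standard_normal(img.shape[:2])[..., None]

    # Vignette anchored at the subject center (fractions of height/width).
    h, w = img.shape[:2]
    cy, cx = subject_center[0] * h, subject_center[1] * w
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.hypot((yy - cy) / h, (xx - cx) / w)
    img *= (1.0 - 0.35 * np.clip(r, 0.0, 1.0) ** 2)[..., None]

    return (np.clip(img, 0.0, 1.0) * 255).astype(np.uint8)
```

Anchoring the vignette at the detected subject center rather than the frame center is what keeps the look consistent after the adaptive crop.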

Challenges and Next Steps

Current Challenges

  • Monocular depth still introduces edge errors around thin structures and cluttered scenes.
  • The subject heuristic is effective for demos, but it is not yet learned end-to-end from user preference data.
  • Automatic cropping can fail on unusual compositions where the intended subject is not the most salient region.

Future Work

  • Train a learned importance predictor for subject emphasis rather than using heuristics alone.
  • Add multiple target aspect ratios and platform-specific crop presets.
  • Introduce stronger style controls so users can choose portrait, editorial, or cinematic output modes.
  • Connect the pipeline to a simple web upload interface for live demos.