• Ep. 246 - Part 2 - June 12, 2024

  • 2024/06/13
  • Duration: 43 min
  • Podcast

  • Summary

  • ArXiv Computer Vision research for Wednesday, June 12, 2024.


    00:21: From Sim-to-Real: Toward General Event-based Low-light Frame Interpolation with Per-scene Optimization

    01:44: Make Your Actor Talk: Generalizable and High-Fidelity Lip Sync with Motion and Appearance Disentanglement

    03:20: Adversarial Patch for 3D Local Feature Extractor

    04:00: Valeo4Cast: A Modular Approach to End-to-End Forecasting

    05:38: The impact of deep learning aid on the workload and interpretation accuracy of radiologists on chest computed tomography: a cross-over reader study

    08:50: Universal Scale Laws for Colors and Patterns in Imagery

    10:11: CT3D++: Improving 3D Object Detection with Keypoint-induced Channel-wise Transformer

    11:44: ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs

    13:25: Continuous fake media detection: adapting deepfake detectors to new generative techniques

    15:18: Category-level Neural Field for Reconstruction of Partially Observed Objects in Indoor Environment

    16:23: One-Step Effective Diffusion Network for Real-World Image Super-Resolution

    18:12: 2nd Place Solution for MOSE Track in CVPR 2024 PVUW workshop: Complex Video Object Segmentation

    19:22: Diffusion-Promoted HDR Video Reconstruction

    21:09: Runtime Freezing: Dynamic Class Loss for Multi-Organ 3D Segmentation

    21:52: A Sociotechnical Lens for Evaluating Computer Vision Models: A Case Study on Detecting and Reasoning about Gender and Emotion

    23:54: DistilDoc: Knowledge Distillation for Visually-Rich Document Applications

    25:28: Using Deep Convolutional Neural Networks to Detect Rendered Glitches in Video Games

    26:39: OpenCOLE: Towards Reproducible Automatic Graphic Design Generation

    27:23: Dataset Enhancement with Instance-Level Augmentations

    28:33: Interpretable Representation Learning of Cardiac MRI via Attribute Regularization

    29:33: A New Class Biorthogonal Spline Wavelet for Image Edge Detection

    30:48: Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata

    32:10: Vessel Re-identification and Activity Detection in Thermal Domain for Maritime Surveillance

    33:32: AdaNCA: Neural Cellular Automata As Adaptors For More Robust Vision Transformer

    35:09: From Chaos to Clarity: 3DGS in the Dark

    36:32: LaMOT: Language-Guided Multi-Object Tracking

    38:07: UDON: Universal Dynamic Online distillatioN for generic image representations

    39:49: WMAdapter: Adding WaterMark Control to Latent Diffusion Models

    40:48: Blind Image Deblurring using FFT-ReLU with Deep Learning Pipeline Integration

    42:06: DocSynthv2: A Practical Autoregressive Modeling for Document Generation
