Large Models 39
-
Paper Notes: "Attention Bootstrapping for Multi-Modal Test-Time Adaptation"
Paper - "Attention Bootstrapping for Multi-Modal Test-Time Adaptation" Keywords - multimodal (video + audio), TTA, principal entropy minimization Abstract Background: prior work on test-time adaptation (TTA) has focused mainly on a single modality,
-
Paper Notes: "Unsupervised Domain Adaptive Visual Question Answering in the era of MLLMs"
Paper notes - "Unsupervised Domain Adaptive Visual Question Answering in the era of Multi-modal Large Language Models" Keywords - question answering, feature alignment, multimodal, domain adaptation, WACV 2025 1 Introduction Research
-
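Paper Notes: "Test-Time Model Adaptation with Only Forward Passes"
Paper - "Test-Time Model Adaptation with Only Forward Passes" Code - GitHub Keywords - TTA, backpropagation-free, image classification, prompt tuning, ICML 2024 Oral Abstract Background: models are often deployed on resource-constrained devices (e.g., FPGA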
-
Paper Notes: "Test-Time Adaptation for Combating Missing Modalities in Egocentric Videos"
Paper - "Test-Time Adaptation for Combating Missing Modalities in Egocentric Videos" Code - GitHub Keywords - ICLR 2025, missing modalities, multimodal (video + audio), TTA Abstract Background: understanding video tasks that involve multiple modalities
-
Paper Notes: "Smoothing the Shift: Towards Stable Test-Time Adaptation under Complex Multimodal Noises"
Paper - "Smoothing the Shift: Towards Stable Test-Time Adaptation under Complex Multimodal Noises" Code - GitHub Keywords - multimodal, test-time adaptation, audio, video, ICLR2
-
Paper Notes: "Test-time Adaptation against Multi-modal Reliability Bias"
Paper - "Test-time Adaptation against Multi-modal Reliability Bias" Code - GitHub Keywords - multimodal, test-time adaptation, audio, video, ICLR 2024 Abstract Background: existing TTA methods focus mainly on unimodal tasks
-
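Paper Notes: "Video Test-Time Adaptation for Action Recognition"
Paper - "Video Test-Time Adaptation for Action Recognition" Code - GitHub Keywords - action recognition, test-time adaptation, video, CVPR 2023 Abstract Background: current action recognition systems, when faced with unexpected distribution shifts in the test data, perform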
-
Paper Notes: "Entropy is not Enough for Test-Time Adaptation: From the Perspective of Disentangled Factors"
Paper - "Entropy is not Enough for Test-Time Adaptation: From the Perspective of Disentangled Factors" Code - GitHub Keywords - test-time adaptation, multimodal, ICLR spot
-
Paper Notes: "Cross-Device Collaborative Test-Time Adaptation"
Paper - "Cross-Device Collaborative Test-Time Adaptation" Code - GitHub Keywords - test-time adaptation, open source, vision, NeurIPS 2024 Abstract This work - Test-time Collaborative Lifelong
-
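Paper Notes: "Cloud-Device Collaborative Learning for Multimodal Large Language Models"
Paper - "Cloud-Device Collaborative Learning for Multimodal Large Language Models" Keywords: cloud-device collaboration, multimodal, large models, CVPR 2024 Abstract Background: multimodal large language models (MLLMs), in image captioning, commonsense reasoning, visual scene understanding, and other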
-
Paper Notes: "An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Acceleration for VLLM Inference"
Paper - "An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Acceleration for VLLM Inference" Code - GitHub Keywords - multimodal, inference acceleration, vision, token pruning, open source Abstract Research problem - inefficient attention In
-
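Paper Notes: "Janus: Collaborative Vision Transformer Under Dynamic Network Environment"
Paper - "Janus: Collaborative Vision Transformer Under Dynamic Network Environment" Keywords - multimodal, cloud-edge collaboration, model partitioning, dynamic networks, INFOCOM 2025 Abstract Background: ViTs perform remarkably well on computer vision tasks, but their computational cost is high, and in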
-
Paper Notes: "Task-Oriented Feature Compression for Multimodal Understanding via Device-Edge Co-Inference"
Paper: "Task-Oriented Feature Compression for Multimodal Understanding via Device-Edge Co-Inference" Abstract Background: most inference requests for large multimodal models (LMMs) come from edge devices,
-
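Paper Notes: "Self-Adapting Large Visual-Language Models to Edge Devices..."
Paper - "Self-Adapting Large Visual-Language Models to Edge Devices across Visual Modalities" Code - GitHub Abstract Research problem: advances in vision-language (VL) models have sparked interest in deploying them on edge devices, but in handling diverse visual modalities,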
-
Paper Notes: "HPipe: Large Language Model Pipeline Parallelism for Long Context..."
Paper - "HPipe: Large Language Model Pipeline Parallelism for Long Context on Heterogeneous Cost-effective Devices" Abstract Background: micro-enterprises and individual developers, in using powerful large language models (LLMs) for long