Abseil Tip 3: String Concatenation and operator+ vs. StrCat()
title: "Tip of the Week #3: String Concatenation and operator+ vs. StrCat()" layout: tips sidenav: side-nav-tips.html published: true permalink: tips/3 type: markdown order: "003" ---
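A minimal sketch of the contrast this tip draws, assuming the usual `absl/strings/str_cat.h` header; the function and variable names are illustrative, not from the tip itself:

```cpp
#include <string>
#include "absl/strings/str_cat.h"

std::string MakeGreeting(const std::string& first, const std::string& last) {
  // Chained operator+ would materialize a temporary string at each '+':
  //   return "Hello, " + first + " " + last + "!";
  // absl::StrCat computes the final size once and appends in a single pass.
  return absl::StrCat("Hello, ", first, " ", last, "!");
}
```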
Abseil Tip 10: Splitting Strings Without the Headache
title: "Tip of the Week #10: Splitting Strings Without the Headache!" layout: tips sidenav: side-nav-tips.html published: true permalink: tips/10 type: markdown order: "010" ---
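A short sketch of the splitting API this tip is about, assuming `absl/strings/str_split.h`; the CSV framing is just an invented example:

```cpp
#include <string>
#include <vector>
#include "absl/strings/str_split.h"

std::vector<std::string> ParseCsvRow(const std::string& row) {
  // absl::StrSplit returns a lazy range that converts to most containers.
  return absl::StrSplit(row, ',');
}
```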
LoCoCo: Dropping In Convolutions for Long Context Compression
Loki: Low-rank Keys for Efficient Sparse Attention
PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling
MiniCache: KV Cache Compression in Depth Dimension for Large Language Models
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention
Abseil Tip 74: Delegating and Inheriting Constructors
Originally posted as totw/74 on 2014-04-21
By Bradley White (bww@google.com)
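A minimal sketch of the two C++11 features this tip covers; `Widget` and `LoggingWidget` are made-up types:

```cpp
#include <string>
#include <utility>

class Widget {
 public:
  // Delegating constructor: forwards to the two-argument overload instead
  // of duplicating its initialization logic.
  Widget() : Widget("unnamed", 0) {}
  Widget(std::string name, int size) : name_(std::move(name)), size_(size) {}

 private:
  std::string name_;
  int size_;
};

class LoggingWidget : public Widget {
 public:
  using Widget::Widget;  // Inheriting constructors: adopt Widget's set wholesale.
};
```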
Abseil Tip 42: Prefer Factory Functions over Initializer Methods
title: "Tip of the Week #42: Prefer Factory Functions over Initializer Methods" layout: tips sidenav: side-nav-tips.html published: true permalink: tips/42 type: markdown order: "042" ---
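A sketch of the factory-function pattern the tip recommends, with a hypothetical `Connection` class; the point is that a failed create returns null, so no caller ever sees a half-initialized object:

```cpp
#include <memory>
#include <string>
#include <utility>

class Connection {
 public:
  // Factory function: validation happens before any object exists, so no
  // caller can observe a Connection that still needs Init().
  static std::unique_ptr<Connection> Create(const std::string& address) {
    if (address.empty()) return nullptr;
    return std::unique_ptr<Connection>(new Connection(address));
  }

 private:
  explicit Connection(std::string address) : address_(std::move(address)) {}
  std::string address_;
};
```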
Abseil Tip 131: Special Member Functions and = default
title: "Tip of the Week #131: Special Member Functions and = default"
layout: tips
sidenav: side-nav-tips.html
published: true
permalink: tips/131
type: markdown
order: "131"
---
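A compact sketch of `= default` on special member functions; `Interval` is an invented example type:

```cpp
class Interval {
 public:
  Interval() = default;  // compiler-generated, stays trivial where possible
  Interval(const Interval&) = default;             // defaulted copy ctor
  Interval& operator=(const Interval&) = default;  // defaulted copy assign
  Interval(int lo, int hi) : lo_(lo), hi_(hi) {}

 private:
  int lo_ = 0;
  int hi_ = 0;
};
```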
Attention Score is not All You Need for Token Importance Indicator in KV Cache Reduction: Value Also Matters
CItruS: Chunked Instruction-aware State Eviction for Long Sequence Modeling
A Simple and Effective L2 Norm-Based Strategy for KV Cache Compression
MLKV: Multi-Layer Key-Value Heads for Memory Efficient Transformer Decoding
Effectively Compress KV Heads for LLM
Model Tells You Where to Merge: Adaptive KV Cache Merging for LLMs on Long-Context Tasks
Efficient Sparse Attention needs Adaptive Token Release
Benchmark of Long Context Capable Approaches
LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference
Dynamic Discriminative Operations (D2O) for Efficient Generative Inference of Large Language Models
Abseil Tip 130: Namespace Naming
Tip of the Week #130: Namespace Naming
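A tiny illustration of the naming idea, with placeholder names (`myproject`, `image`):

```cpp
// A project-scoped top-level namespace keeps symbols unambiguous across
// codebases; overly generic names like ::util or ::common invite collisions.
namespace myproject {
namespace image {
class Decoder { /* ... */ };
}  // namespace image
}  // namespace myproject
```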
Abseil Tip 123: absl::optional and std::unique_ptr
Tip of the Week #123: absl::optional and std::unique_ptr
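A sketch of the choice this tip discusses, with an invented `Config`/`Codec` pair:

```cpp
#include <memory>
#include "absl/types/optional.h"

class Codec { /* ... */ };

struct Config {
  // Inline, copyable, and "no value" is a legitimate state: optional fits.
  absl::optional<int> timeout_ms;
  // Owned heap object, possibly polymorphic or expensive: unique_ptr fits.
  std::unique_ptr<Codec> codec;
};
```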
Abseil Tip 119: Using Declarations and Namespace Aliases
Tip of the Week #119: Using Declarations and Namespace Aliases
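A minimal sketch of both constructs, using `std::filesystem` purely as a convenient example; the tip's actual guidance is about where such declarations may live:

```cpp
#include <filesystem>

namespace myproject {
// Namespace alias: a shorthand local to this namespace (keep these out of
// headers at global scope).
namespace fs = std::filesystem;

// Using declaration: pulls in one name, not the whole namespace.
using std::filesystem::path;

bool Exists(const path& p) { return fs::exists(p); }
}  // namespace myproject
```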
Pruning in Transformer Decoder
Keep the Cost Down: A Review on Methods to Optimize LLM's KV Cache Consumption
LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference
PQCache: Product Quantization-based KVCache for Long Context LLM Inference
Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference
Abseil Tip 99: Nonmember Interface Etiquette
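A sketch of the etiquette in question, with an invented `geometry::Point`: nonmember operators belong in the type's own namespace so argument-dependent lookup can find them:

```cpp
#include <ostream>

namespace geometry {
class Point {
 public:
  Point(int x, int y) : x_(x), y_(y) {}
  int x() const { return x_; }
  int y() const { return y_; }

 private:
  int x_, y_;
};

// Same namespace as Point, so ADL resolves unqualified uses.
inline bool operator==(const Point& a, const Point& b) {
  return a.x() == b.x() && a.y() == b.y();
}
inline std::ostream& operator<<(std::ostream& os, const Point& p) {
  return os << "(" << p.x() << ", " << p.y() << ")";
}
}  // namespace geometry
```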
Abseil Tip 126: make_unique is the new new
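The one-liner the tip's title refers to, in a minimal sketch:

```cpp
#include <memory>
#include <string>

std::unique_ptr<std::string> MakeLabel() {
  // Prefer make_unique to raw new: the allocation is bound to an owner at
  // the moment it happens, with no explicit new/delete in sight.
  return std::make_unique<std::string>("label");
}
```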
Abseil Tip 109: Meaningful const in Function Declarations
Korean translation
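A sketch of the distinction, with invented signatures:

```cpp
#include <string>

// Meaningful: const on a reference parameter is part of the contract.
std::string Repeat(const std::string& s, int n);

// Not meaningful: top-level const on a by-value parameter is ignored in a
// declaration and only adds noise.
std::string RepeatNoisy(std::string s, const int n);
```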
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
Post-Training Sparse Attention with Double Sparsity
NACL: A General and Effective KV Cache Eviction Framework for LLMs at Inference Time
Palu: Compressing KV-Cache with Low-Rank Projection
ThinK: Thinner Key Cache by Query-Driven Pruning
Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads
InfiniPot: Infinite Context Processing on Memory-Constrained LLMs
KV-COMPRESS: Paged KV-Cache Compression with Variable Compression Rates per Attention Head
Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction
TACO-RL: Task Aware Prompt Compression Optimization with Reinforcement Learning
Abseil Tip 65: Putting Things in Their Place
Korean translation
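A small sketch of in-place insertion; `try_emplace` is the C++17 refinement of the idea, and the map contents are invented:

```cpp
#include <map>
#include <string>

int main() {
  std::map<std::string, int> counts;
  // Constructs the mapped value in place only if the key is absent, with
  // no temporary pair and no second lookup.
  counts.try_emplace("apples", 1);
}
```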
Abseil Tip 49: Argument-Dependent Lookup
Korean translation
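A minimal ADL illustration with made-up names:

```cpp
#include <iostream>

namespace audio {
struct Sample { int value; };
// Callable without qualification because its argument's type lives here.
void Process(const Sample& s) { std::cout << s.value << "\n"; }
}  // namespace audio

int main() {
  audio::Sample s{7};
  Process(s);  // argument-dependent lookup finds audio::Process
}
```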
Abseil Tip 112: emplace vs. push_back
Korean translation
title: "Tip of the Week #112: emplace vs. push_back"
layout: tips
sidenav: side-nav-tips.html
published: true
permalink: tips/112
type: markdown
order: “112”
---
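A sketch of the comparison in the title; the guidance usually drawn from this tip is to keep `push_back` when you already have (or can brace-construct) a value, and save `emplace_back` for genuine in-place construction:

```cpp
#include <string>
#include <vector>

int main() {
  std::vector<std::string> v;
  v.push_back("copy-me");        // temporary string constructed, then moved in
  v.emplace_back(3, 'x');        // "xxx" built in place from ctor arguments
  v.push_back({'a', 'b', 'c'});  // braced initializers work with push_back only
}
```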
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
TorchTitan: One-stop PyTorch native solution for production ready LLM pre-training
TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention
SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference
SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation
LoRC: Low-Rank Compression for LLMs KV Cache with a Progressive Compression Strategy
A Systematic Study of Cross-Layer KV Sharing for Efficient LLM Inference
SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction
In-context KV-Cache Eviction for LLMs via Attention-Gate
Prompt Compression for Large Language Models: A Survey
Textbooks Are All You Need
Scaling Laws for Neural Language Models
Abseil Tip 135: Test the Contract, not the Implementation
Tip of the Week #135: Test the Contract, not the Implementation
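A sketch of the testing stance, assuming GoogleTest and an invented `SortedCopy` function: assert the observable contract, not which algorithm produced it:

```cpp
#include <algorithm>
#include <vector>
#include "gtest/gtest.h"

std::vector<int> SortedCopy(std::vector<int> v) {
  std::sort(v.begin(), v.end());
  return v;
}

// Checks the contract (sorted output, same size), not implementation
// details such as which sort ran or how many comparisons it made.
TEST(SortedCopyTest, ReturnsSortedPermutationOfInput) {
  std::vector<int> result = SortedCopy({3, 1, 2});
  EXPECT_TRUE(std::is_sorted(result.begin(), result.end()));
  EXPECT_EQ(result.size(), 3u);
}
```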
Abseil Tip 107: Reference Lifetime Extension
Below is the Korean translation of "Tip of the Week #107: Reference Lifetime Extension":
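A minimal sketch of the rule, with invented names:

```cpp
#include <string>

std::string MakeName() { return "temp"; }

int main() {
  // Binding a temporary directly to a const reference extends its lifetime
  // to the reference's scope.
  const std::string& name = MakeName();  // OK: lives until end of main
  // The extension applies only to the direct bind: a reference to the
  // result of a member function called on a temporary does not keep the
  // temporary alive.
  return name.empty() ? 1 : 0;
}
```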
Abseil Tip 101: Return Values, References, and Lifetimes
Tip of the Week #101: Return Values, References, and Lifetimes
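A sketch of the lifetime pitfall this tip covers, with invented functions; the second one shows the classic bug:

```cpp
#include <map>
#include <string>

// Fine: the returned reference points into the caller's map, which
// outlives the call.
const std::string& FirstValue(const std::map<int, std::string>& m) {
  return m.begin()->second;
}

// BUG: returns a reference to a local that dies at the closing brace.
const std::string& Dangling() {
  std::string local = "oops";
  return local;
}
```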
Squeezed Attention: Accelerating Long Context Length LLM Inference
Recycled Attention: Efficient inference for long-context language models
TokenSelect: Efficient Long-Context Inference and Length Extrapolation for LLMs via Dynamic Token-Level KV Cache Selection
VL-Cache: Sparsity and Modality-Aware KV Cache Compression for Vision-Language Model Inference Acceleration
MagicPIG: LSH Sampling for Efficient LLM Generation
Abseil Tip 86: Enumerating with Class (enum class)
title: "Tip of the Week #86: Enumerating with Class (enum class)" layout: tips sidenav: side-nav-tips.html published: true permalink: tips/86 type: markdown order: "086" ---
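A compact `enum class` sketch; the enumerator names are invented:

```cpp
#include <cstdint>

// Scoped enum: enumerators don't leak into the enclosing scope and don't
// implicitly convert to int, unlike a plain 'enum'.
enum class Color : std::uint8_t { kRed, kGreen, kBlue };

int main() {
  Color c = Color::kGreen;
  // int i = c;                // error: no implicit conversion
  return static_cast<int>(c);  // conversions must be spelled out
}
```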
Abseil Tip 77: Temporaries, Moves, and Copies
title: "Tip of the Week #77: Temporaries, Moves, and Copies" layout: tips sidenav: side-nav-tips.html published: true permalink: tips/77 type: markdown order: "077" ---
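A minimal sketch of the three cases in the title:

```cpp
#include <string>
#include <utility>
#include <vector>

int main() {
  std::string s = "payload";
  std::vector<std::string> v;
  v.push_back(s);                   // copy: s is still used afterwards
  v.push_back(std::move(s));        // move: this is s's last use
  v.push_back(std::string("tmp"));  // temporary: moved in, never copied
}
```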
Abseil Tip 64: Raw String Literals
title: "Tip of the Week #64: Raw String Literals" layout: tips sidenav: side-nav-tips.html published: true permalink: tips/64 type: markdown order: "064" ---
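A quick raw-string-literal sketch; the patterns are invented examples:

```cpp
#include <string>

// No escaping of backslashes or quotes, so regexes and embedded snippets
// stay readable. The custom delimiter ("re" here) is only needed when the
// body itself could contain the sequence )".
const std::string kPathPattern = R"re(C:\Users\[^\\]+\Documents)re";
const std::string kJson = R"({"key": "value"})";
```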
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Paper: https://arxiv.org/abs/2201.11903
Learning Transferable Visual Models From Natural Language Supervision
Paper: https://arxiv.org/abs/2103.00020
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
Paper: https://arxiv.org/abs/2410.10812
Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models
Paper: https://arxiv.org/abs/2410.10733v2
The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning
Paper: https://arxiv.org/abs/2305.14045
Abseil Tip 55: Name Counting and unique_ptr
title: "Tip of the Week #55: Name Counting and unique_ptr"
layout: tips
sidenav: side-nav-tips.html
published: true
permalink: tips/55
type: markdown
order: “055”
---
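A sketch of the name-counting heuristic: at any moment, a `unique_ptr` value may be reachable through at most one name:

```cpp
#include <memory>
#include <utility>

std::unique_ptr<int> Make() { return std::make_unique<int>(1); }

int main() {
  std::unique_ptr<int> a = Make();        // one name for the object: fine
  std::unique_ptr<int> b = std::move(a);  // transfer: 'a' gives up its name
  // std::unique_ptr<int> c = b;          // error: a copy would mean two
  //                                      // live names, i.e. two owners
}
```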
Abseil Tip 122: Test Fixtures, Clarity, and Dataflow
title: "Tip of the Week #122: Test Fixtures, Clarity, and Dataflow" layout: tips sidenav: side-nav-tips.html published: true permalink: tips/122 type: markdown order: "122" ---
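A minimal fixture sketch, assuming GoogleTest; the tip's caution is to keep each test's data flow visible rather than buried in a sprawling fixture:

```cpp
#include <vector>
#include "gtest/gtest.h"

class InventoryTest : public ::testing::Test {
 protected:
  std::vector<int> stock_{1, 2, 3};  // shared setup, kept deliberately small
};

TEST_F(InventoryTest, StartsWithThreeItems) {
  EXPECT_EQ(stock_.size(), 3u);
}
```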
VILA-U: A Unified Foundation Model Integrating Visual Understanding and Generation
Paper: https://arxiv.org/abs/2409.04429
Condition-Aware Neural Network for Controlled Image Generation
Paper: https://arxiv.org/abs/2404.01143
DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models
Paper: https://arxiv.org/abs/2402.19481
VILA: On Pre-training for Visual Language Models
Paper: https://arxiv.org/abs/2312.07533
FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention
Paper: https://arxiv.org/abs/2305.10431
Abseil Tip 1: How to Use string_view and Its Benefits
Abseil Tip #1: How to Use string_view and Its Benefits
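A minimal `string_view` sketch, using the Abseil spelling since these are Abseil tips:

```cpp
#include <iostream>
#include <string>
#include "absl/strings/string_view.h"

// Accepts std::string, string literals, and other views without copying.
void PrintTwice(absl::string_view s) { std::cout << s << s << "\n"; }

int main() {
  std::string owned = "abc";
  PrintTwice(owned);  // no copy of the bytes
  PrintTwice("lit");  // no temporary std::string created
}
```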
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
Paper: https://arxiv.org/abs/2410.21465
Query-Efficient Correlation Clustering with Noisy Oracle
Paper: https://arxiv.org/abs/2402.01400
LiteMoE: Customizing On-device LLM Serving via Proxy Submodel Tuning
Paper: https://dl.acm.org/doi/10.1145/3666025.3699355
LaRS: Latent Reasoning Skills for Chain-of-Thought Reasoning
Paper: https://aclanthology.org/2024.findings-emnlp.206/
Batch Calibration: Rethinking Calibration for In-Context Learning and Prompt Engineering
Paper: https://arxiv.org/abs/2309.17249
Scientific Beta Multi-Beta Multi-Strategy Indices: Implementing Multi-Factor Equity Portfolios with Smart Factor Indices
Paper: https://conferences.pionline.com/uploads/conference_admin/ERI_Scientific_Beta_Publication_Scientific_Beta_Multi-Beta_Multi-Strategy_Indices_Equity_Portfolios.pdf
Foundations of Factor Investing
Paper: https://www.msci.com/documents/1296102/1336482/Foundations_of_Factor_Investing.pdf
RAG4ITOps: A Supervised Fine-Tunable and Comprehensive RAG Framework for IT Operations and Maintenance
Paper: https://arxiv.org/abs/2410.15805v1
MagicPIG: LSH Sampling for Efficient LLM Generation
Paper: https://arxiv.org/abs/2410.16179
EPIC: Efficient Position-Independent Context Caching for Serving Large Language Models
Paper: https://arxiv.org/abs/2410.15332
ELICIT: LLM Augmentation via External In-Context Capability
Paper: https://arxiv.org/abs/2410.09343
COMET: Towards Practical W4A4KV4 LLMs Serving
Paper: https://arxiv.org/abs/2410.12168
The Cross-Section of Expected Stock Returns
Paper: https://www.jstor.org/stable/2329112
Portfolio Selection
Paper: https://www.jstor.org/stable/2975974
Capital Asset Prices: A Theory of Market Equilibrium under Conditions of Risk
Paper: https://www.jstor.org/stable/2977928
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention
Paper: https://arxiv.org/abs/2407.02490
HYSYNTH: Context-Free LLM Approximation for Guiding Program Synthesis
Paper: https://arxiv.org/abs/2405.15880v2
DynamoLLM: Designing LLM Inference Clusters for Performance and Energy Efficiency
Paper: https://arxiv.org/abs/2408.00741
Can Graph Learning Improve Planning in LLM-based Agents?
Paper: https://arxiv.org/abs/2405.19119