By Matt Kulukundis (kfm@google.com)
Originally posted: March 30, 2017
Last updated: November 25, 2019
Abseil Tip 108 Avoid std::bind
By Roman Perepelitsa (roman.perepelitsa@gmail.com)
Originally posted: January 7, 2016
Last updated: August 19, 2020
Benchmarks as Limits to Arbitrage: Understanding the Low-Volatility Anomaly
High Idiosyncratic Volatility and Low Returns: International and Further U.S. Evidence
CHAI: Clustered Head Attention for Efficient LLM Inference
QAQ: Quality Adaptive Quantization for LLM KV Cache
Transformers are Multi-State RNNs
Compressed Context Memory For Online Language Model Interaction
CacheGen: KV Cache Compression and Streaming for Fast Large Language Model Serving
LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression
Galactica: A Large Language Model for Science
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
Abseil Tip 182 Initialize Your Integer Variables!
Tip of the Week #182: Initialize Your Integer Variables!
Abseil Tip 180 Avoiding Dangling References
Tip of the Week #180: Avoiding Dangling References
Abseil Tip 158 Abseil Associative Containers and contains()
Tip of the Week #158: Abseil Associative Containers and contains()
Abseil Tip 147 Use Exhaustive switch Statements Responsibly
Tip of the Week #147: Use Exhaustive switch Statements Responsibly
Momentum Strategies
Mixed Precision Quantization
WKVQuant: Quantizing Weight and Key/Value Cache for Large Language Models Gains More
Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
DeepCache: Accelerating Diffusion Models for Free
Abseil Tip 90 Retired Flags
Tip of the Week #90: Retired Flags
Abseil Tip 45 Avoid Flags, Especially in Library Code
Tip of the Week #45: Avoid Flags, Especially in Library Code
Abseil Tip 103 Flags Are Globals
Tip of the Week #103: Flags Are Globals
Improving Language Understanding by Generative Pre-Training
QAQ: Quality Adaptive Quantization for LLM KV Cache
PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU
PaLM 2 Technical Report
Abseil Tip 153 Don't Use using-directives
title: "Tip of the Week #153: Don't Use using-directives"
layout: tips
sidenav: side-nav-tips.html
published: true
permalink: tips/153
type: markdown
order: "153"
---
Abseil Tip 152 AbslHashValue and You
title: "Tip of the Week #152: AbslHashValue and You"
layout: tips
sidenav: side-nav-tips.html
published: true
permalink: tips/152
type: markdown
order: "152"
---
Abseil Tip 144 Heterogeneous Lookup in Associative Containers
Tip of the Week #144: Heterogeneous Lookup in Associative Containers
Abseil Tip 136 Unordered Containers
Tip of the Week #136: Unordered Containers
ALISA: Accelerating Large Language Model Inference via Sparsity-Aware KV Caching
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression
FASTDECODE: High-Throughput GPU-Efficient LLM Serving using Heterogeneous Pipelines
Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference
Fast Inference from Transformers via Speculative Decoding
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Abseil Tip 24 Copies, Abbrv.
title: "Tip of the Week #24: Copies, Abbrv."
layout: tips
sidenav: side-nav-tips.html
published: true
permalink: tips/24
type: markdown
order: "024"
---
Abseil Tip 149 Object Lifetimes vs = delete
title: "Tip of the Week #149: Object Lifetimes vs. = delete"
layout: tips
sidenav: side-nav-tips.html
published: true
permalink: tips/149
type: markdown
order: "149"
---
Abseil Tip 148 Overload Sets
Originally posted as TotW #148 on May 3, 2018
Abseil Tip 117 Copy Elision and Pass-by-Value
Originally posted as TotW #117 on June 8, 2016
Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs
MuxServe: Flexible Spatial-Temporal Multiplexing for Multiple LLM Serving
Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs
ALISA: Accelerating Large Language Model Inference via Sparsity-Aware KV Caching
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression
MELTing point: Mobile Evaluation of Language Transformers
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Tree of Thoughts: Deliberate Problem Solving with Large Language Models
The Flan Collection: Designing Data and Methods for Effective Instruction Tuning
Abseil Tip 143 C++11 Deleted Functions (= delete)
Tip of the Week #143: C++11 Deleted Functions (= delete)
Abseil Tip 120 Return Values Are Untouchable
Tip of the Week #120: Return Values Are Untouchable
Abseil Tip 11 Return Policy
Tip of the Week #11: Return Policy
CORM: Cache Optimization with Recent Message for Large Language Model Inference
Retrieval Head Mechanistically Explains Long-Context Factuality
SnapKV: LLM Knows What You are Looking for Before Generation
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
SqueezeAttention: 2D Management of KV-Cache in LLM Inference via Layer-wise Optimal Budget
Toward Inference-optimal Mixture-of-Expert Large Language Models
Mistral 7B
Llama 2: Open Foundation and Fine-Tuned Chat Models
Abseil Tip 93 Using absl::Span
Abseil Tip 61 Default Member Initializers
Abseil Tip 141 Beware Implicit Conversions to bool
Tip of the Week #141: Beware Implicit Conversions to bool
Abseil Tip 134 make_unique and private Constructors
Tip of the Week #134: make_unique and private Constructors
MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases
PowerInfer-2: Fast Large Language Model Inference on a Smartphone
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
RAGCache: Efficient Knowledge Caching for Retrieval-Augmented Generation
Parallel Decoding via Hidden Transfer for Lossless Large Language Model Acceleration
Tree-based Speculative Inference and Verification
Fast Inference from Transformers via Speculative Decoding
Abseil Tip 88 Initialization: =, (), and {}
title: "Tip of the Week #88: Initialization: =, (), and {}"
layout: tips
sidenav: side-nav-tips.html
published: true
permalink: tips/88
type: markdown
order: "088"
---
Abseil Tip 59 Joining Tuples
title: "Tip of the Week #59: Joining Tuples"
layout: tips
sidenav: side-nav-tips.html
published: true
permalink: tips/59
type: markdown
order: "059"
---
Abseil Tip 142 Multi-parameter Constructors and explicit
title: "Tip of the Week #142: Multi-parameter Constructors and explicit"
layout: tips
sidenav: side-nav-tips.html
published: true
permalink: tips/142
type: markdown
order: "142"
---
KV Cache Compression
PyramidInfer: Pyramid KV Cache Compression for High-throughput LLM Inference
Layer-Condensed KV Cache for Efficient Inference of Large Language Models
SKVQ: Sliding-window Key and Value Cache Quantization for Large Language Models
You Only Cache Once: Decoder-Decoder Architectures for Language Models
Abseil Tip 36 New Join API
Abseil Tip 3 String Concatenation and operator+ vs. StrCat()
title: "Tip of the Week #3: String Concatenation and operator+ vs. StrCat()"
layout: tips
sidenav: side-nav-tips.html
published: true
permalink: tips/3
type: markdown
order: "003"
---
Abseil Tip 10 Splitting Strings, not Hairs!
title: "Tip of the Week #10: Splitting Strings, not Hairs!"
layout: tips
sidenav: side-nav-tips.html
published: true
permalink: tips/10
type: markdown
order: "010"
---
LoCoCo: Dropping In Convolutions for Long Context Compression
Loki: Low-rank Keys for Efficient Sparse Attention
PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling
MiniCache: KV Cache Compression in Depth Dimension for Large Language Models
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention
Abseil Tip 74 Delegating and Inheriting Constructors
Originally posted as totw/74 on 2014-04-21
By Bradley White (bww@google.com)
Abseil Tip 42 Prefer Factory Functions to Initializer Methods
title: "Tip of the Week #42: Prefer Factory Functions to Initializer Methods"
layout: tips
sidenav: side-nav-tips.html
published: true
permalink: tips/42
type: markdown
order: "042"
---
Abseil Tip 131 Special Member Functions and = default
title: "Tip of the Week #131: Special Member Functions and = default"
layout: tips
sidenav: side-nav-tips.html
published: true
permalink: tips/131
type: markdown
order: "131"
---