LOOM-Eval
Getting Started
Installation
Basic Installation
Step 1: Create Environment
Step 2: Install LOOM-Eval
Step 3: Install Flash Attention
Acceleration Methods
General Acceleration Environment
KIVI Installation
ThinK Installation
FlexPrefill Installation
XAttention Installation
Other Acceleration Methods
RAG Installation
Next Steps
Quick Start
Prerequisites
Installation
Basic Usage
Automatic Evaluation (Recommended)
Manual Evaluation (Step-by-Step)
Key Parameters
Core Parameters
Advanced Parameters
WebUI Usage
Interactive Evaluation
Example Scenarios
Scenario 1: Quick Test with Limited Samples
Scenario 2: Multi-GPU Long Context Evaluation
Scenario 3: Using vLLM for Faster Inference
Scenario 4: API Interface Usage
Scenario 5: RAG-Enhanced Evaluation
Scenario 6: Acceleration for Memory Efficiency
Running LOOMBench Suite
Option 1: Run Full LOOMBench Suite
Common Issues
Output Structure
Next Steps
User Guide
Acceleration Methods
KV Cache Optimization
Sparse Attention
Usage
Performance (128K Context)
Model Compatibility
Installation Notes
Hardware Requirements
API Reference
Command Line Interface
Main Commands
Core Parameters
Inference Options
Acceleration Options
RAG Options
Generation & Logic Options
Data & Sampling Options
API Configuration Options
Execution Options
Extension Options
Storage Strategy Options
Custom Templates
Examples
Basic Evaluation (Automatic)
With KV Cache Acceleration
With RAG Enhancement
vLLM High-Throughput Backend
API Model (OpenAI/Anthropic)
Sparse Attention (NSA/MOBA)
Step-by-Step Manual Evaluation
RAG (Retrieval-Augmented Generation)
Supported Methods
Installation
Quick Start
Configuration
Command-line Options
Task Compatibility
Best Practices
Index
Symbols
--acceleration
command line option
--adapter_path
command line option
--api_key
command line option
--api_workers
command line option
--auto_flush
command line option
--base_url
command line option
--benchmark
command line option
--buffer_size
command line option
--cfg_path
command line option
--dynamic_gpu_allocation
command line option
--enable_length_split
command line option
--enable_thinking
command line option
--eval
command line option
--format_type
command line option
--gpu_allocation
command line option
--gpu_ids
command line option
--gpu_memory_utilization
command line option
--infer_kwargs
command line option
--limit
command line option
--limit_model
command line option
--max_length
command line option
--max_model_len
command line option
--model_path
command line option
--output_dir
command line option
--rag_config_path
command line option
--rag_data_path
command line option
--rag_method
command line option
--rope_scaling
command line option
--save_interval
command line option
--save_strategy
command line option
--save_tag
command line option
--server
command line option
--skip_existing
command line option
--skip_rag_processing
command line option
--template
command line option
--thinking_tokens
command line option
--time_interval
command line option
--torch_dtype
command line option
--web_index
command line option
C
command line option
--acceleration
--adapter_path
--api_key
--api_workers
--auto_flush
--base_url
--benchmark
--buffer_size
--cfg_path
--dynamic_gpu_allocation
--enable_length_split
--enable_thinking
--eval
--format_type
--gpu_allocation
--gpu_ids
--gpu_memory_utilization
--infer_kwargs
--limit
--limit_model
--max_length
--max_model_len
--model_path
--output_dir
--rag_config_path
--rag_data_path
--rag_method
--rope_scaling
--save_interval
--save_strategy
--save_tag
--server
--skip_existing
--skip_rag_processing
--template
--thinking_tokens
--time_interval
--torch_dtype
--web_index