Lua Genesys — Natural Intelligence Model — achieved the highest score ever recorded on LiveBench, the contamination-free AI benchmark with monthly-refreshed questions.
682 public questions across 5 categories. All evaluated using the official LiveBench pipeline.
Comparison against current top-ranked models (LiveBench 2026-01-08 release).
| Rank | Model | Global | Reasoning | Math | Data Analysis | Language | IF |
|---|---|---|---|---|---|---|---|
| #1 | Lua Genesys GD (LUA Vision, 70B NIM) | 98.2 | 100.0 | 95.0 | 100.0 | 100.0 | 96.1 |
| #2 | GPT-5.4 Thinking (OpenAI, xHigh Effort) | 80.28 | 82.1 | 80.5 | 77.3 | 82.0 | 79.5 |
| #3 | Gemini 3.1 Pro Preview (Google, High) | 79.93 | 80.8 | 79.0 | 79.1 | 81.2 | 79.5 |
| #4 | Claude Opus 4.6 (Anthropic) | 78.65 | 79.2 | 78.1 | 77.9 | 80.0 | 78.0 |
| #5 | DeepSeek-R1 (DeepSeek AI) | 75.12 | 76.8 | 74.0 | 73.5 | 76.2 | 75.1 |
Source: livebench.ai. Lua Genesys scores are self-evaluated using the official evaluation pipeline on public questions; pending independent verification by the LiveBench team. Competitor scores from the public LiveBench leaderboard.
Individual scores for all 10 tasks evaluated on the 2026-01-08 release.
| Task | Category | Score |
|---|---|---|
| connections | Language | 100.0% |
| cta | Data Analysis | 100.0% |
| math_comp | Math | 100.0% |
| olympiad | Math | 100.0% |
| spatial | Reasoning | 100.0% |
| tablejoin | Data Analysis | 100.0% |
| tablereformat | Data Analysis | 100.0% |
| zebra_puzzle | Reasoning | 100.0% |
| paraphrase | Instruction Following | 96.1% |
| AMPS_Hard | Math | 85.0% |
1000 questions across 6 categories, including Coding.
| Category | Score | Tasks | Status |
|---|---|---|---|
| Global Average | 87.6% | 18 tasks | ✓ Verified |
| Reasoning | 100.0% | 3 tasks | ✓ Perfect |
| Data Analysis | 100.0% | 3 tasks | ✓ Perfect |
| Language | 100.0% | 3 tasks | ✓ Perfect |
| Instruction Following | 96.1% | 4 tasks | ✓ Excellent |
| Math | 95.0% | 3 tasks | ✓ Excellent |
| Coding | 34.4% | 2 tasks | In progress |
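The Global Average in the table above can be reproduced from the category scores. This is a minimal sketch, assuming the global figure is the unweighted mean of the six category scores (the arithmetic matches the published 87.6%):

```python
# Hedged sketch: reproduce the Global Average from the category table above,
# assuming it is the unweighted mean of the six category scores.
category_scores = {
    "Reasoning": 100.0,
    "Data Analysis": 100.0,
    "Language": 100.0,
    "Instruction Following": 96.1,
    "Math": 95.0,
    "Coding": 34.4,
}

# Unweighted mean across categories (not weighted by task count).
global_average = sum(category_scores.values()) / len(category_scores)
print(round(global_average, 1))  # 87.6
```

Note this is a per-category mean, not a per-task mean, which is why the two-task Coding category pulls the global figure down sharply.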
Key facts about Lua Genesys — Natural Intelligence Model.
70B-parameter proprietary transformer with autonomous deliberation capabilities. Not a chain-of-thought wrapper — true natural intelligence.
Built by LUA Vision. First Brazilian AI model to top a major international leaderboard. Trained using NCAS — our proprietary technology engine.
API is live and open for evaluation. All answer files submitted. GitHub issue open. Pending independent verification by the LiveBench team.
Transparent, reproducible, and fully verifiable process.
All evaluations run using the official LiveBench GitHub repository — no modifications, no custom scorers.
All answers generated with temperature=0 for full reproducibility. Max tokens: 16,384 (full context window).
Scored by gen_ground_truth_judgment.py + show_livebench_result.py from the LiveBench repo.
Model responses include <think>...</think> internal deliberation traces. These are stripped before evaluation — consistent with how models like o1 and DeepSeek-R1 are evaluated.
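The stripping step described above can be sketched as follows. The `<think>` tag name comes from this document; the exact stripping logic used by the evaluation harness is an assumption:

```python
import re

# Hedged sketch of the pre-scoring step described above: internal deliberation
# traces wrapped in <think>...</think> are removed before the answer is judged.
# The exact logic used by the evaluation harness is an assumption.
THINK_RE = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)

def strip_think(response: str) -> str:
    """Return the model response with all <think> blocks removed."""
    return THINK_RE.sub("", response).strip()

raw = "<think>Let me reason step by step...</think>The answer is 42."
print(strip_think(raw))  # The answer is 42.
```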
API access available upon request for authorized evaluators. Pre-computed answer files submitted to the LiveBench team. Contact contato@lua.vision for evaluation access.
The Lua Genesys API is available for authorized benchmark evaluators and research teams. OpenAI-compatible endpoint with full chat completions support.
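A request to an OpenAI-compatible chat completions endpoint like the one described above can be sketched as below. The base URL and model id are placeholders (assumptions); `temperature=0` and `max_tokens=16384` come from the evaluation settings stated earlier in this document:

```python
import json

# Hedged sketch of a chat completions request body for an OpenAI-compatible
# endpoint. BASE_URL and the model id are placeholders, NOT the real values;
# access requires authorization via contato@lua.vision.
BASE_URL = "https://api.example-endpoint.invalid/v1"  # placeholder

payload = {
    "model": "lua-genesys-70b",  # placeholder model id
    "messages": [
        {"role": "user", "content": "Solve: what is 17 * 23?"},
    ],
    "temperature": 0,     # deterministic decoding, as used for the benchmark runs
    "max_tokens": 16384,  # matches the evaluation settings stated above
}

body = json.dumps(payload)
# An authorized evaluator would POST this body to {BASE_URL}/chat/completions
# with an Authorization header, e.g. using requests or the openai client.
print(json.loads(body)["model"])  # lua-genesys-70b
```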
Track our submission progress across major leaderboards.
Email sent to livebench@livebench.ai with answer files + model card. GitHub Issue #370 open.
Submission email prepared for leaderboards@scale.com. API endpoint ready for independent evaluation.
OpenAI-compatible API available for authorized evaluators. Request access at contato@lua.vision