AlphaAgent: Regularized Exploration to Fight Alpha Decay

The previous AlphaGPT review left an open question: when everyone uses LLMs to mine factors, how long can those factors stay effective? AlphaAgent (paper, KDD 2025) tackles this head-on. Its core observation: LLM-generated factors lean too heavily on existing knowledge, producing homogeneous signals that crowd the same trades and accelerate alpha decay. The fix is three regularization constraints injected into the factor generation process, forcing the model to explore structurally novel, logically coherent, and complexity-controlled factors. ...

Posted on 2026-04-21 · In Quant · 7 min read

AlphaGPT: Mining Quantitative Factors with LLMs

One of the core tasks in quantitative investing is mining alpha factors: finding signals that predict asset returns. The traditional approach relies either on researchers manually constructing factor expressions, or on automated search methods such as Genetic Programming (GP) that brute-force combinations in the operator space. The former depends on human experience and intuition, which is inefficient but highly interpretable; the latter is efficient but produces deeply nested operator expressions that researchers can barely interpret. AlphaGPT (paper) brings large language models into the factor mining pipeline, using an LLM as the factor “generator.” The follow-up work, AlphaGPT 2.0 (paper), further introduces a human-in-the-loop feedback cycle. ...

Posted on 2026-04-10 · In Quant · 5 min read

Key Questions Before Starting an LLM Startup

Before diving into an LLM-based startup, you should think through these five questions carefully; skipping them is a recipe for trouble down the road. ...

Posted on 2023-12-21 · In NLP · 5 min read

Phi-2: The Surprising Power of Small Language Models

Microsoft released Phi-2, a 2.7 billion parameter language model that demonstrates outstanding reasoning and language understanding capabilities, achieving state-of-the-art performance among base language models with fewer than 13 billion parameters. On complex benchmarks, Phi-2 matches or outperforms models roughly 25 times its size, thanks to innovations in model scaling and training data curation. ...

Posted on 2023-12-14 · In NLP · 3 min read

Textbooks Are All You Need: Key Takeaways

Microsoft recently proposed an intriguing approach: training models on synthetic textbooks instead of the massive datasets typically used. Paper: https://arxiv.org/abs/2306.11644 ...

Posted on 2023-12-13 · In NLP · 2 min read

A ChatGPT-Written Hospital Appointment Bot

Anyone who’s tried booking an appointment at Peking University School of Stomatology knows how difficult it is. So let’s have ChatGPT write an appointment bot. Unfortunately, the booking logic it produced is hilariously superficial — basically a no-op: ...

Posted on 2023-12-06 · In Misc · 1 min read

Introduction to Retrieval-Augmented Generation (RAG)

Retrieval-Augmented Generation (RAG) is a natural language processing approach that combines pretrained parametric and non-parametric memory to improve performance on knowledge-intensive NLP tasks. This post covers the RAG framework and its potential applications. ...

Posted on 2023-12-06 · In NLP · 3 min read