Train a reinforcement-learning trading bot in the cloud

Train FreqAI-native reinforcement-learning agents visually with a reward-template builder, or directly in Python. Choose from PPO, A2C, DQN, QRDQN, TRPO and MaskablePPO, all training inline on your existing runners. No major competitor offers hosted RL training like this.

What it is

Train a reinforcement-learning trading bot in the cloud, explained.

Reinforcement learning trains an agent to make trading decisions by rewarding the outcomes you care about, rather than following fixed if-then rules. The agent learns a policy from your market data and the reward you define.

VolatiCloud's RL is FreqAI-native: it uses Freqtrade's built-in ReinforcementLearner, so there is no bolt-on machine-learning service to wire up. You can shape the reward visually with a reward-template builder or write it in code, and training runs inline on the same runners your bots already use — no separate GPU job and no database migration.

How it works

From idea to a running bot.

RL on VolatiCloud is built to feel like the rest of the platform: pick an algorithm, shape a reward, train, deploy.

Choose an algorithm

Select from PPO (the default), A2C, DQN, QRDQN, TRPO or MaskablePPO, or enter an advanced model type to use any other algorithm the engine image supports.
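As a rough sketch of what the algorithm choice amounts to underneath, the snippet below builds the reinforcement-learning part of a FreqAI-style configuration in Python. The key names (`rl_config`, `model_type`, `policy_type`, `train_cycles`, `model_reward_parameters`) are modelled on Freqtrade's FreqAI settings. The helper `build_rl_config` and every value in it are hypothetical and only illustrate the shape; they are not VolatiCloud's actual implementation.

```python
import json


def build_rl_config(model_type: str = "PPO", train_cycles: int = 25) -> dict:
    """Return a FreqAI-style config fragment for one RL algorithm.

    Hypothetical helper: the model type is passed through as a plain string,
    mirroring the "advanced model type" option. Whether a given name works
    depends on what the engine image supports.
    """
    if not model_type:
        raise ValueError("model_type must be a non-empty string")
    return {
        "freqai": {
            "enabled": True,
            "rl_config": {
                "model_type": model_type,
                "policy_type": "MlpPolicy",
                "train_cycles": train_cycles,
                # Illustrative reward parameters, not recommended values.
                "model_reward_parameters": {"rr": 1, "profit_aim": 0.025},
            },
        }
    }


# Example: pick QRDQN instead of the default PPO.
config = build_rl_config("QRDQN")
print(json.dumps(config, indent=2))
```

Keeping the algorithm as a single string is what makes a free-text "advanced" option possible: the curated list and the advanced field would both end up in the same place.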

Shape the reward

Start from a curated reward template — Balanced, Trend-follow, or Mean-revert — and tune it visually, or write the reward in code.
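To make the template idea concrete, here is a minimal sketch of how presets could map to reward weights. Everything in it is assumed for illustration: the `RewardWeights` fields, the numbers, and the `shaped_reward` formula are hypothetical and are not VolatiCloud's actual Balanced, Trend-follow, or Mean-revert definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    profit: float        # weight on the profit of the step
    trend: float         # bonus (or penalty) for trading with the recent move
    holding_cost: float  # penalty per candle a position stays open


# Hypothetical presets, invented for this example only.
TEMPLATES = {
    "balanced": RewardWeights(profit=1.0, trend=0.0, holding_cost=0.01),
    "trend_follow": RewardWeights(profit=1.0, trend=0.5, holding_cost=0.0),
    "mean_revert": RewardWeights(profit=1.0, trend=-0.5, holding_cost=0.02),
}


def shaped_reward(template: str, pnl: float, trend_alignment: float,
                  candles_held: int) -> float:
    """Combine step profit, trend alignment (-1..1) and holding time."""
    w = TEMPLATES[template]
    return (w.profit * pnl
            + w.trend * trend_alignment
            - w.holding_cost * candles_held)


# The same step scores differently under each preset.
for name in TEMPLATES:
    score = shaped_reward(name, pnl=0.02, trend_alignment=1.0, candles_held=5)
    print(name, round(score, 4))
```

Tuning a template visually would then correspond to adjusting a handful of weights like these, while code mode replaces the formula itself.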

Train inline

Training runs on your existing runner. The first run shows a "Training model…" status while the agent learns; longer runs consume more credits.

Backtest and deploy

Validate the trained agent in a backtest, then promote it to a live or dry-run bot like any other strategy.

Who it's for

Built for the way you trade.

RL is the platform's headline differentiator — and it is built so you don't need an ML team to use it.

ML-curious traders

Use the visual reward-template builder to train an agent without writing any reinforcement-learning code.

Quant developers

Drop into code mode to define a custom reward and configure the scaffolded ReinforcementLearner exactly the way you want.
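A custom reward in code mode might look something like the sketch below. In FreqAI, this kind of logic typically lives in a `calculate_reward` method on a custom environment attached to a `ReinforcementLearner` subclass; to keep the example runnable without Freqtrade installed, it is written here as a standalone function. The `Action` enum, the function signature, and all thresholds are stand-ins chosen for illustration, not the scaffold VolatiCloud generates.

```python
from enum import Enum


class Action(Enum):
    # Stand-in action set, modelled on a five-action long/short environment.
    NEUTRAL = 0
    LONG_ENTER = 1
    LONG_EXIT = 2
    SHORT_ENTER = 3
    SHORT_EXIT = 4


def calculate_reward(action: Action, in_position: bool, pnl: float,
                     candles_held: int, max_candles: int = 300) -> float:
    """Hypothetical reward: pay for profitable exits, discourage stalling."""
    exits = (Action.LONG_EXIT, Action.SHORT_EXIT)
    entries = (Action.LONG_ENTER, Action.SHORT_ENTER)

    # Penalise actions that make no sense in the current state.
    if action in exits and not in_position:
        return -2.0
    if action in entries and in_position:
        return -2.0

    # Closing a trade: reward tracks realised profit, doubled for wins.
    if action in exits:
        return pnl * 100 * (2.0 if pnl > 0 else 1.0)

    # Holding past the duration budget costs a fixed penalty per candle.
    if in_position and candles_held > max_candles:
        return -1.0

    # Small nudge to enter rather than sit flat forever.
    if action in entries:
        return 0.1
    return 0.0
```

The ordering of the checks matters: invalid actions are rejected first, so the agent cannot collect exit rewards without holding a position.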

Strategy researchers

Compare RL agents against rule-based strategies on the same backtests and metrics.
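As an illustration of what "the same backtests and metrics" can mean in practice, the sketch below scores two equity curves with identical functions. The curves are made-up numbers, and `total_return` and `max_drawdown` are generic definitions chosen for this example, not the platform's own metric implementations.

```python
def total_return(equity: list[float]) -> float:
    """Fractional gain from the first to the last equity value."""
    return equity[-1] / equity[0] - 1.0


def max_drawdown(equity: list[float]) -> float:
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


# Made-up equity curves covering the same backtest window.
curves = {
    "rule_based": [1000.0, 1020.0, 990.0, 1050.0],
    "rl_agent": [1000.0, 1010.0, 1005.0, 1060.0],
}

for name, equity in curves.items():
    print(f"{name}: return={total_return(equity):.2%}, "
          f"max_drawdown={max_drawdown(equity):.2%}")
```

Scoring both strategies with one set of functions over one window keeps the comparison fair: any difference comes from the strategy, not from how it was measured.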

FreqAI-native ReinforcementLearner — no bolt-on service
PPO, A2C, DQN, QRDQN, TRPO, MaskablePPO
Visual reward-template builder and code mode
Trains inline on your existing runners
Balanced / Trend-follow / Mean-revert reward presets
No GPU job, no database migration

FAQ

Frequently asked questions.

What makes this different from other bot platforms?

Hosted, in-product reinforcement-learning training is something no major competitor offers. VolatiCloud trains FreqAI-native RL agents inline on your existing runners, visually or in code, so you never have to stand up any ML infrastructure.

Which algorithms are supported?

The UI offers a curated list: PPO (the default), A2C, DQN, QRDQN, TRPO and MaskablePPO. An advanced free-text model type field lets you use any other algorithm the engine image supports.

Do I need to write code to train an RL agent?

No. A visual reward-template builder lets you start from a Balanced, Trend-follow, or Mean-revert template and tune it by hand. Code mode is there if you want full control of the reward.

How long does training take and what does it cost?

RL training runs inline on your runner, and the first run is slower — you'll see a "Training model…" status while the agent learns. Longer training runs consume more credits, so you control the trade-off between training depth and cost.

Ship your first live bot this afternoon.

Connect an exchange, build a strategy in the visual builder, backtest it on real data, and deploy. Start a 7-day Pro trial — no credit card required.

No credit card required · Cancel any time