Tom Basso’s “Coin-Flip” Experiment — Overview¶
Tom Basso’s coin-flip experiment showed that an edge in a trading system can come from risk management and exits, not clever entries. Instead of hunting for a perfect signal, Basso (working with Van K. Tharp in the 1990s) asked a sharper question: what happens if entries are random but exits and sizing are disciplined? Their tests across a diversified basket of futures suggested that combining a small, fixed risk per trade (≈1% of equity), volatility-based trailing stops (e.g., 3× ATR), and consistent rules can yield positive expectancy even when the entry is decided by chance.
Equally important, diversification across relatively uncorrelated markets was a major contributor to the results. By holding assets whose returns don’t move in lockstep (currencies, rates, energies, metals, grains, livestock), the portfolio benefits when one market draws down while another trends or mean-reverts. Uncorrelated streams:
- Dampen portfolio volatility
- Stabilize compounding (geometric returns) by avoiding large equity cliffs
- Often improve risk-adjusted performance
In short, profitability came not from prediction but from position sizing, risk management, and cross-market diversification. Sizing kept the system alive through losing streaks, the risk management rules cut losers while letting winners run, and diversification added uncorrelated returns that tempered drawdowns. This notebook mirrors that setup with a coin-flip bot, ~1% risk per trade, a 3× ATR trailing stop, and a diversified futures basket, so you can see the same principles play out in your own results.
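To make the volatility-dampening claim concrete, here is a minimal sketch using the standard formula for an equal-weight portfolio of streams that share the same volatility and the same pairwise correlation. The function name `portfolio_vol` and the 20% per-market volatility are illustrative assumptions, not values from Basso's test or from this notebook.

```python
import math

def portfolio_vol(sigma: float, n: int, rho: float) -> float:
    """Volatility of an equal-weight portfolio of n return streams.

    Assumes every stream has volatility `sigma` and every pair has
    correlation `rho` (a simplification chosen for illustration).
    """
    return sigma * math.sqrt((1.0 + (n - 1) * rho) / n)

# Eight markets at 20% volatility each (hypothetical numbers).
for rho in (0.0, 0.3, 1.0):
    print(f"rho={rho:.1f} -> portfolio vol = {portfolio_vol(0.20, 8, rho):.4f}")
```

With zero correlation the portfolio volatility falls to sigma divided by the square root of the number of markets, while perfectly correlated markets give no reduction at all. Real futures markets sit somewhere in between, and their correlations shift over time.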
How the experiment was conducted¶
- Universe: 10 liquid futures markets spanning commodities, currencies, rates, and energy (this notebook includes only 8 of them).
- Entry: entirely random (flip a coin to choose long/short) for each market.
- Always in: when a position closed, the system stood ready to re-enter on the next bar using the same random entry logic.
- Evaluation: results tracked across all markets to observe the combined equity curve and distribution of wins/losses.
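The random entry described above can be sketched in a few lines. This is not the `CoinFlipBot` implementation, whose internals are not shown in this notebook; `coin_flip_side` is a hypothetical helper, and the seeded generator only illustrates why a fixed seed makes a random-entry backtest repeatable.

```python
import random

def coin_flip_side(rng: random.Random) -> str:
    """Pick a trade direction with equal probability (hypothetical helper)."""
    return "long" if rng.random() < 0.5 else "short"

# A dedicated, seeded generator keeps the sequence of flips reproducible.
rng = random.Random(42)
sides = [coin_flip_side(rng) for _ in range(1000)]
print("longs:", sides.count("long"), "shorts:", sides.count("short"))
```

Using a separate `random.Random` instance instead of the module-level functions keeps the flips independent of any other code that draws random numbers.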
The rules that mattered (and that we reproduce here)¶
- Risk per trade: ~1% of equity (fixed-fractional position sizing).
- Volatility measure: 10-day EMA of ATR (a smoothed ATR).
- Initial stop: 3 × volatility (i.e., 3 × the 10-day EMA-ATR) from the entry price.
- Trailing exit: the same 3× ATR stop ratchets with price only in your favor (never loosening).
- Diversification: apply the identical rules across multiple weakly correlated futures markets to smooth outcomes.
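The sizing and stop rules above can be sketched as plain Python. This is an illustration under stated assumptions, not the engine's code: the helper names are hypothetical, the EMA uses the common `2 / (span + 1)` smoothing factor (the notebook's exact smoothing is not shown), and the point value of 100 dollars per point is used only as an example contract multiplier.

```python
import math

def true_ranges(highs, lows, closes):
    """True range per bar, starting from the second bar."""
    out = []
    for i in range(1, len(closes)):
        prev_close = closes[i - 1]
        out.append(max(highs[i] - lows[i],
                       abs(highs[i] - prev_close),
                       abs(lows[i] - prev_close)))
    return out

def ema(values, span):
    """Final value of an exponential moving average (alpha = 2 / (span + 1))."""
    alpha = 2.0 / (span + 1)
    smoothed = values[0]
    for v in values[1:]:
        smoothed = alpha * v + (1.0 - alpha) * smoothed
    return smoothed

def initial_stop(entry, vol, side, multiple=3.0):
    """Stop placed `multiple` x volatility away from the entry price."""
    return entry - multiple * vol if side == "long" else entry + multiple * vol

def contracts_for_risk(equity, risk_pct, stop_distance, point_value):
    """Fixed-fractional sizing: whole contracts whose stop-out loss fits the risk budget."""
    risk_dollars = equity * risk_pct
    loss_per_contract = stop_distance * point_value
    return math.floor(risk_dollars / loss_per_contract)

# Hypothetical example: $1,000,000 equity, 1% risk, smoothed ATR of 20 points.
vol = 20.0
stop = initial_stop(2000.0, vol, "long")             # 3 x 20 = 60 points below entry
qty = contracts_for_risk(1_000_000, 0.01, 3.0 * vol, 100.0)
print("stop:", stop, "contracts:", qty)
```

Because the stop distance scales with volatility, a more volatile market automatically gets a smaller position for the same dollar risk.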
Why it’s instructive: the outcome is the classic trend-following profile, with many small losses and a handful of outsized winners, and, crucially, it compounds over time when position sizing and exits are kept consistent.
What this notebook does: It recreates that setup end-to-end with a coin-flip bot, 1% risk per trade, and a 3× ATR trailing stop, run over a representative portfolio. We also model commissions, slippage, ticks, and margins, and surface clear diagnostics—per-asset stats, win rate, expectancy, drawdown, and a live equity curve—so you can see how systematic exits and sizing turn randomness into results.
Markets used in Basso’s test (10 futures)¶
- Gold
- Silver
- U.S. Bonds (long-term Treasury bond futures)
- Eurodollars (short-term interest rate futures)
- Crude Oil
- Soybeans
- Sugar
- Deutsche Mark (pre-euro currency future)
- British Pound (Pound sterling)
- Live Cattle
Notes: The Deutsche Mark was later replaced by the euro; Eurodollars have since largely been superseded by SOFR futures in modern markets.
Mapping to this notebook’s symbols (using 8 of the 10 contracts from Basso's Portfolio)¶
Basso market | Notebook symbol (example) | Comment |
---|---|---|
Gold | GC=F | COMEX Gold |
Silver | SI=F | COMEX Silver |
U.S. Bonds | ZB=F | 30-Year Treasury Bond |
Eurodollars | — | Not included here (SOFR analog) |
Crude Oil | CL=F | NYMEX WTI |
Soybeans | ZS=F | CBOT Soybeans |
Sugar | — | Not included here |
Deutsche Mark | 6E=F | Euro FX as successor to DEM |
British Pound | 6B=F | British Pound FX |
Live Cattle | LE=F | CME Live Cattle |
Setup¶
# jump to repo root (fallback: parent if in notebooks/)
ROOT = !git rev-parse --show-toplevel 2>/dev/null
%cd {ROOT[0] if ROOT else '..'}
/home/dennis/Algo-Trading-Stack
!./setup/fetch_sample_portfolio_futures_data.sh
========== Last 10 years ==========
⏭️ Gold: already exists, skipping.
⏭️ Silver: already exists, skipping.
⏭️ Crude_Oil: already exists, skipping.
⏭️ Soybeans: already exists, skipping.
⏭️ Sugar: already exists, skipping.
⏭️ US_Treasury_Bonds: already exists, skipping.
⏭️ Euro: already exists, skipping.
⏭️ British_Pound: already exists, skipping.
⏭️ Live_Cattle: already exists, skipping.
========== Last 20 years ==========
⏭️ Gold: already exists, skipping.
⏭️ Silver: already exists, skipping.
⏭️ Crude_Oil: already exists, skipping.
⏭️ Soybeans: already exists, skipping.
⏭️ Sugar: already exists, skipping.
⏭️ US_Treasury_Bonds: already exists, skipping.
⏭️ Euro: already exists, skipping.
⏭️ British_Pound: already exists, skipping.
⏭️ Live_Cattle: already exists, skipping.
========== 2000 to 2015 ==========
⏭️ Gold: already exists, skipping.
⏭️ Silver: already exists, skipping.
⏭️ Crude_Oil: already exists, skipping.
⏭️ Soybeans: already exists, skipping.
⏭️ Sugar: already exists, skipping.
⏭️ US_Treasury_Bonds: already exists, skipping.
⏭️ Euro: already exists, skipping.
⏭️ British_Pound: already exists, skipping.
⏭️ Live_Cattle: already exists, skipping.
✅ All downloads complete.
# Enable autoreload (useful while iterating), and hook Qt into Jupyter
%load_ext autoreload
%autoreload 2
%gui qt
Project root & imports¶
Set the project root if your notebook isn't at the repo root. By default, we assume the notebook lives in the root (where classes/ and bots/ exist).
import sys, os, pathlib
# Use the current directory as the project root and make it importable
PROJECT_ROOT = os.path.abspath('.')
if PROJECT_ROOT not in sys.path:
    sys.path.insert(0, PROJECT_ROOT)
print('PROJECT_ROOT =', PROJECT_ROOT)
os.chdir(PROJECT_ROOT)
print("Current working directory:", os.getcwd())
PROJECT_ROOT = /home/dennis/Algo-Trading-Stack
Current working directory: /home/dennis/Algo-Trading-Stack
from PyQt5 import QtWidgets
import gc
from classes.Backtester_Engine import BacktesterEngine
from classes.Trading_Environment import TradingEnvironment
from classes.ui_main_window import launch_gui
# Bots
from bots.coin_flip_bot.coin_flip_bot import CoinFlipBot
# Exits
from bots.exit_strategies import TrailingATRExit, FixedRatioExit
Build the exit strategy¶
exit_strategy = TrailingATRExit(atr_multiple=3.0)
Build the bot¶
bot = CoinFlipBot(
    exit_strategy=exit_strategy,
    base_risk_percent=0.01,
    enforce_sessions=False,
    flatten_before_maintenance=True,
    enable_online_learning=False,
    seed=42,
)
Initialize engine and environment¶
config_path = "backtest_configs/backtest_config_10_yrs.yaml"
api = BacktesterEngine(config_path=config_path)
api.connect()
env = TradingEnvironment()
env.set_api(api)
env.set_bot(bot)
# Initial indicator compute happens inside TradingEnvironment on connect.
print('Assets:', env.get_asset_list())
Assets: ['6B=F', 'CL=F', '6E=F', 'GC=F', 'LE=F', 'SI=F', 'ZS=F', 'ZB=F']
Launch GUI and Run Backtest¶
This starts the backtest control panel and charting UI. You can open charts, start/pause/restart, and view statistics.
If the window doesn't appear from within Jupyter, ensure you ran %gui qt above, or run this notebook locally (VS Code, JupyterLab).
launch_gui(env, api)
[FORCED LIQUIDATION] 6B=F: current qty=7, submitting side=sell, qty=7
[FORCED LIQUIDATION] CL=F: current qty=2, submitting side=sell, qty=2
[FORCED LIQUIDATION] 6E=F: current qty=3, submitting side=sell, qty=3
[FORCED LIQUIDATION] LE=F: current qty=3, submitting side=sell, qty=3
[FORCED LIQUIDATION] SI=F: current qty=1, submitting side=sell, qty=1
[FORCED LIQUIDATION] ZS=F: current qty=-5, submitting side=buy, qty=5
[FORCED LIQUIDATION] ZB=F: current qty=-3, submitting side=buy, qty=3
Backtesting Results¶
Show Statistics¶
# Minimal: pull stats from the running/backtested engine and show them inline
import pandas as pd
from IPython.display import display
stats = api.get_stats_snapshot() # live snapshot; safe to call anytime
# Portfolio (one row)
display(pd.DataFrame([{
"Initial Cash": stats["portfolio"].get("initial_cash", 0.0),
"Final Equity": stats["portfolio"].get("total_equity", 0.0),
"Used Margin": stats["portfolio"].get("used_margin", 0.0),
"Max Drawdown %": 100.0 * stats["portfolio"].get("max_drawdown", 0.0),
}]))
# Per-asset table
display(pd.DataFrame.from_dict(stats["per_asset"], orient="index").reset_index().rename(columns={"index":"Symbol"}))
| | Initial Cash | Final Equity | Used Margin | Max Drawdown % |
|---|---|---|---|---|
| 0 | 1000000.0 | 1.245048e+06 | 0.0 | 21.660606 |
| | Symbol | trades | wins | losses | long_trades | short_trades | win_rate | avg_win | avg_loss | profit_factor | expectancy | commission_total | fee_total | max_drawdown |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6B=F | 104 | 39 | 65 | 50 | 54 | 0.375000 | 9099.679486 | -5877.788461 | 0.928888 | -261.237981 | 2196.0 | 312.0 | 4.924990e+00 |
| 1 | CL=F | 91 | 34 | 57 | 53 | 38 | 0.373626 | 9423.823530 | -4432.280701 | 1.268247 | 744.725276 | 488.0 | 273.0 | 1.639998e+15 |
| 2 | 6E=F | 97 | 34 | 63 | 49 | 48 | 0.350515 | 9980.882352 | -5985.912698 | 0.899864 | -389.304124 | 1288.0 | 291.0 | 3.085518e+01 |
| 3 | GC=F | 121 | 47 | 74 | 61 | 60 | 0.388430 | 10840.851068 | -5277.027029 | 1.304789 | 983.636364 | 864.0 | 363.0 | 6.877922e+00 |
| 4 | LE=F | 118 | 40 | 78 | 55 | 63 | 0.338983 | 12299.000000 | -6518.333334 | 0.967606 | -139.576272 | 2016.0 | 354.0 | 6.331000e+16 |
| 5 | SI=F | 138 | 63 | 75 | 62 | 76 | 0.456522 | 8858.333333 | -6055.333335 | 1.228834 | 753.079709 | 1032.0 | 414.0 | 1.966030e+00 |
| 6 | ZS=F | 103 | 29 | 74 | 45 | 58 | 0.281553 | 13576.293103 | -5778.885136 | 0.920669 | -329.368933 | 1488.0 | 309.0 | 6.396226e+00 |
| 7 | ZB=F | 89 | 33 | 56 | 44 | 45 | 0.370787 | 11487.215909 | -5297.712053 | 1.277769 | 925.912921 | 792.0 | 267.0 | 2.812500e+14 |
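As a reading aid for the per-asset table, the sketch below recomputes `expectancy` and `profit_factor` for the 6B=F row from its other columns. It assumes the conventional definitions (expectancy as the probability-weighted average trade, profit factor as gross profit over gross loss). The engine's own formulas are not shown in this notebook, but these definitions reproduce the reported figures for that row to within rounding.

```python
def expectancy(win_rate: float, avg_win: float, avg_loss: float) -> float:
    """Average P&L per trade; avg_loss is negative, as in the table."""
    return win_rate * avg_win + (1.0 - win_rate) * avg_loss

def profit_factor(wins: int, avg_win: float, losses: int, avg_loss: float) -> float:
    """Gross profit divided by gross loss (absolute value)."""
    return (wins * avg_win) / (losses * abs(avg_loss))

# Values copied from the 6B=F row above.
wins, losses = 39, 65
avg_win, avg_loss = 9099.679486, -5877.788461

e = expectancy(wins / (wins + losses), avg_win, avg_loss)
pf = profit_factor(wins, avg_win, losses, avg_loss)
print(f"expectancy = {e:.2f}, profit factor = {pf:.4f}")
```

The row shows how a 37.5% win rate can still come close to breakeven: the average winner is roughly 1.5 times the average loser, which is the "many small losses, few large winners" profile described earlier.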
Show Equity Curve¶
# Build the portfolio equity Series `s` from the engine's history and plot it
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter
# Times + equity (portfolio). Safe to call anytime; uses the engine's live history.
times, equity = api.get_equity_series() # None -> portfolio; pass a symbol for per-asset
n = min(len(times), len(equity))
if n == 0:
    print("No equity data available yet.")
else:
    s = pd.Series(equity[:n], index=pd.to_datetime(times[:n])).dropna()
    # (Optional) smooth gaps like weekends/holidays:
    s = s.resample("h").last().ffill()
    fig, ax = plt.subplots(figsize=(10, 4))
    ax.plot(s.index, s.values)
    ax.set_title("Portfolio Equity")
    ax.set_xlabel("Time"); ax.set_ylabel("Equity ($)")
    ax.grid(True)
    # Turn off scientific notation/offset and format with commas
    ax.ticklabel_format(axis='y', style='plain', useOffset=False)
    ax.yaxis.set_major_formatter(FuncFormatter(lambda x, pos: f'${x:,.0f}'))
    fig.autofmt_xdate()
    plt.show()
But Wait! We can do better!¶
Adding Intelligence to the Exit: PPO-Selected Trailing Stops¶
The first run used a fixed 3× ATR trailing stop—faithful to the Basso demo and great for illustrating the importance of risk management. Next, we’ll upgrade the exit with a small dose of machine learning: a PPO-based policy that selects the ATR multiple dynamically (e.g., choose 1×, 2×, 3×, or 4× ATR) based on recent market context.
Why this can help¶
- Regime-aware stops: Quiet, mean-reverting markets often benefit from tighter stops; volatile/trending regimes often need looser stops to avoid whipsaw.
- More consistent expectancy: By adjusting the distance intelligently, the exit aims to reduce avoidable stop-outs without choking off winners.
- No look-ahead, still disciplined: The policy only sees previously closed-bar features; entries/exits remain rule-based and reproducible.
How it works here¶
- Training data: We generated offline data by scanning several ATR multiples and recording which one would have performed best at each eligible entry.
- Features (prev bar only): ATR, RSI(14), EMA(21), Close, and a position flag (+1 long / −1 short).
- Model: A PPO classifier maps features → a discrete ATR multiple ({1,2,3,4} by default).
- Execution: At entry and on each bar while trailing, the policy proposes K×ATR. The stop is ratcheted forward only when price has moved favorably past entry (never loosens).
- Safety: If a model is missing for a symbol (or SB3 isn’t available), we fall back to 3× ATR so the run remains consistent.
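The execution and safety rules above can be sketched as follows. This is not the `RLTrailingATRExit` implementation, which is not shown here: `choose_multiple` and `ratchet_stop` are hypothetical helpers, the policy is modeled as any callable (or `None` when no model is available), and using the bar close as the reference price is an assumption.

```python
from typing import Callable, Optional, Sequence

ALLOWED_MULTIPLES = (1.0, 2.0, 3.0, 4.0)  # discrete choices named in the text
FALLBACK_MULTIPLE = 3.0                   # used when no model is available

def choose_multiple(policy: Optional[Callable[[Sequence[float]], float]],
                    features: Sequence[float]) -> float:
    """Ask the policy for K; fall back to 3x if it is missing or out of range."""
    if policy is None:
        return FALLBACK_MULTIPLE
    k = float(policy(features))
    return k if k in ALLOWED_MULTIPLES else FALLBACK_MULTIPLE

def ratchet_stop(stop: float, entry: float, close: float,
                 atr: float, k: float, side: str) -> float:
    """Move the stop toward price only; never loosen it.

    The stop is only updated once price has moved favorably past entry.
    """
    if side == "long":
        if close <= entry:
            return stop
        return max(stop, close - k * atr)
    if close >= entry:
        return stop
    return min(stop, close + k * atr)

# Long entered at 100 with ATR 2 and an initial 3x stop at 94 (made-up bars).
stop = 94.0
for close, k in [(103.0, 2.0), (104.0, 4.0), (99.5, 1.0)]:
    stop = ratchet_stop(stop, 100.0, close, 2.0, k, "long")
    print(f"close={close}, K={k} -> stop={stop}")
```

In the second bar the policy proposes a wider 4× stop, but the ratchet keeps the tighter level already reached, so a change in K can tighten the stop and can never push it back.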
What to expect¶
- Similar win rate, improved profit factor in persistent regimes (fewer “too-tight” stop-outs).
- Smoother per-asset equity when trend strength/volatility shifts.
- Still Basso-style: random entries, disciplined exits—now context-sensitive.
Imports¶
from PyQt5 import QtWidgets
import gc
from classes.Backtester_Engine import BacktesterEngine
from classes.Trading_Environment import TradingEnvironment
from classes.ui_main_window import launch_gui
# Bots
from bots.coin_flip_bot.coin_flip_bot import CoinFlipBot
# Exits
from bots.exit_strategies import TrailingATRExit, FixedRatioExit, RLTrailingATRExit
Generate ML training data¶
!PYTHONPATH=. python3 bots/generate_ML_SL_Training_data.py \
--config bots/configs/ml_sl_config.yaml \
--output-dir bots/data/yahoo_finance/training_data
Processing 6B=F (yahoo_finance/data/Futures/British_Pound/1Day_timeframe/british_pound_2000-2015.csv)...
range [100, 3792)
6B=F: 100%|################################| 3692/3692 [00:24<00:00, 150.60it/s]
Saved to bots/data/yahoo_finance/training_data/rl_stop_loss_training_6B=F.csv
Processing CL=F (yahoo_finance/data/Futures/Crude_Oil/1Day_timeframe/crude_oil_2000-2015.csv)...
range [100, 3792)
CL=F: 100%|################################| 3692/3692 [00:28<00:00, 131.85it/s]
Saved to bots/data/yahoo_finance/training_data/rl_stop_loss_training_CL=F.csv
Processing 6E=F (yahoo_finance/data/Futures/Euro/1Day_timeframe/euro_2000-2015.csv)...
range [100, 3792)
6E=F: 100%|################################| 3692/3692 [00:34<00:00, 107.71it/s]
Saved to bots/data/yahoo_finance/training_data/rl_stop_loss_training_6E=F.csv
Processing GC=F (yahoo_finance/data/Futures/Gold/1Day_timeframe/gold_2000-2015.csv)...
range [100, 3792)
GC=F: 100%|#################################| 3692/3692 [00:37<00:00, 99.41it/s]
Saved to bots/data/yahoo_finance/training_data/rl_stop_loss_training_GC=F.csv
Processing LE=F (yahoo_finance/data/Futures/Live_Cattle/1Day_timeframe/live_cattle_2000-2015.csv)...
range [100, 3792)
LE=F: 100%|################################| 3692/3692 [00:18<00:00, 195.41it/s]
Saved to bots/data/yahoo_finance/training_data/rl_stop_loss_training_LE=F.csv
Processing SI=F (yahoo_finance/data/Futures/Silver/1Day_timeframe/silver_2000-2015.csv)...
range [100, 3792)
SI=F: 100%|################################| 3692/3692 [00:26<00:00, 138.07it/s]
Saved to bots/data/yahoo_finance/training_data/rl_stop_loss_training_SI=F.csv
Processing ZS=F (yahoo_finance/data/Futures/Soybeans/1Day_timeframe/soybeans_2000-2015.csv)...
range [100, 3792)
ZS=F: 100%|################################| 3692/3692 [00:22<00:00, 167.56it/s]
Saved to bots/data/yahoo_finance/training_data/rl_stop_loss_training_ZS=F.csv
Processing ZB=F (yahoo_finance/data/Futures/US_Treasury_Bonds/1Day_timeframe/us_treasury_bonds_2000-2015.csv)...
range [100, 3792)
ZB=F: 100%|################################| 3692/3692 [00:27<00:00, 132.80it/s]
Saved to bots/data/yahoo_finance/training_data/rl_stop_loss_training_ZB=F.csv
Train PPO stop‑loss selector¶
!PYTHONPATH=. python3 bots/train_ppo_stop_selector.py \
--input_dir bots/data/yahoo_finance/training_data \
--output_dir bots/models/PPO_Trailing_Stop_Loss \
--total_timesteps 300000
[TRAIN] 6B=F — rows: 7326, actions: [1.0, 2.0, 3.0, 4.0], envs: 8 Using cpu device ------------------------------ | time/ | | | fps | 20991 | | iterations | 1 | | time_elapsed | 0 | | total_timesteps | 8192 | ------------------------------ ----------------------------------------- | time/ | | | fps | 19595 | | iterations | 2 | | time_elapsed | 0 | | total_timesteps | 16384 | | train/ | | | approx_kl | 0.013843633 | | clip_fraction | 0.0984 | | clip_range | 0.2 | | entropy_loss | -1.38 | | explained_variance | -0.0171 | | learning_rate | 0.0003 | | loss | 0.0313 | | n_updates | 6 | | policy_gradient_loss | -0.0707 | | value_loss | 0.396 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 19240 | | iterations | 3 | | time_elapsed | 1 | | total_timesteps | 24576 | | train/ | | | approx_kl | 0.017927568 | | clip_fraction | 0.107 | | clip_range | 0.2 | | entropy_loss | -1.34 | | explained_variance | -0.0178 | | learning_rate | 0.0003 | | loss | -0.0203 | | n_updates | 12 | | policy_gradient_loss | -0.0817 | | value_loss | 0.296 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 19237 | | iterations | 4 | | time_elapsed | 1 | | total_timesteps | 32768 | | train/ | | | approx_kl | 0.025181714 | | clip_fraction | 0.15 | | clip_range | 0.2 | | entropy_loss | -1.24 | | explained_variance | -0.0181 | | learning_rate | 0.0003 | | loss | -0.0511 | | n_updates | 18 | | policy_gradient_loss | -0.0991 | | value_loss | 0.26 | ----------------------------------------- ---------------------------------------- | time/ | | | fps | 19217 | | iterations | 5 | | time_elapsed | 2 | | total_timesteps | 40960 | | train/ | | | approx_kl | 0.02499305 | | clip_fraction | 0.116 | | clip_range | 0.2 | | entropy_loss | -1.08 | | explained_variance | -0.0163 | | learning_rate | 0.0003 | | loss | -0.05 | | n_updates | 24 | | policy_gradient_loss | -0.0975 | | value_loss | 0.256 | 
---------------------------------------- ---------------------------------------- | time/ | | | fps | 19206 | | iterations | 6 | | time_elapsed | 2 | | total_timesteps | 49152 | | train/ | | | approx_kl | 0.02237932 | | clip_fraction | 0.124 | | clip_range | 0.2 | | entropy_loss | -0.862 | | explained_variance | -0.00746 | | learning_rate | 0.0003 | | loss | -0.0206 | | n_updates | 30 | | policy_gradient_loss | -0.0827 | | value_loss | 0.233 | ---------------------------------------- ----------------------------------------- | time/ | | | fps | 19144 | | iterations | 7 | | time_elapsed | 2 | | total_timesteps | 57344 | | train/ | | | approx_kl | 0.016101798 | | clip_fraction | 0.0856 | | clip_range | 0.2 | | entropy_loss | -0.649 | | explained_variance | -0.00338 | | learning_rate | 0.0003 | | loss | -0.00407 | | n_updates | 36 | | policy_gradient_loss | -0.0641 | | value_loss | 0.191 | ----------------------------------------- ---------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 3.95e+03 | | time/ | | | fps | 19148 | | iterations | 8 | | time_elapsed | 3 | | total_timesteps | 65536 | | train/ | | | approx_kl | 0.01037625 | | clip_fraction | 0.0673 | | clip_range | 0.2 | | entropy_loss | -0.469 | | explained_variance | 0.00516 | | learning_rate | 0.0003 | | loss | 0.0137 | | n_updates | 42 | | policy_gradient_loss | -0.0459 | | value_loss | 0.161 | ---------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 3.95e+03 | | time/ | | | fps | 19157 | | iterations | 9 | | time_elapsed | 3 | | total_timesteps | 73728 | | train/ | | | approx_kl | 0.0057062125 | | clip_fraction | 0.0388 | | clip_range | 0.2 | | entropy_loss | -0.34 | | explained_variance | 0.00639 | | learning_rate | 0.0003 | | loss | 0.0165 | | n_updates | 48 | | policy_gradient_loss | -0.0324 | | value_loss | 0.133 | ------------------------------------------ 
------------------------------------------ | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 3.95e+03 | | time/ | | | fps | 19165 | | iterations | 10 | | time_elapsed | 4 | | total_timesteps | 81920 | | train/ | | | approx_kl | 0.0030949104 | | clip_fraction | 0.024 | | clip_range | 0.2 | | entropy_loss | -0.252 | | explained_variance | 0.0131 | | learning_rate | 0.0003 | | loss | 0.0226 | | n_updates | 54 | | policy_gradient_loss | -0.0216 | | value_loss | 0.112 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 3.95e+03 | | time/ | | | fps | 19177 | | iterations | 11 | | time_elapsed | 4 | | total_timesteps | 90112 | | train/ | | | approx_kl | 0.0019354754 | | clip_fraction | 0.013 | | clip_range | 0.2 | | entropy_loss | -0.188 | | explained_variance | 0.00992 | | learning_rate | 0.0003 | | loss | 0.0237 | | n_updates | 60 | | policy_gradient_loss | -0.0152 | | value_loss | 0.0945 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 3.95e+03 | | time/ | | | fps | 19158 | | iterations | 12 | | time_elapsed | 5 | | total_timesteps | 98304 | | train/ | | | approx_kl | 0.0014332188 | | clip_fraction | 0.0111 | | clip_range | 0.2 | | entropy_loss | -0.14 | | explained_variance | 0.0165 | | learning_rate | 0.0003 | | loss | 0.0229 | | n_updates | 66 | | policy_gradient_loss | -0.0122 | | value_loss | 0.0836 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 3.95e+03 | | time/ | | | fps | 19090 | | iterations | 13 | | time_elapsed | 5 | | total_timesteps | 106496 | | train/ | | | approx_kl | 0.0010673881 | | clip_fraction | 0.00712 | | clip_range | 0.2 | | entropy_loss | -0.104 | | explained_variance | 0.0138 | | learning_rate | 0.0003 | | loss | 0.0282 | | n_updates | 72 | | 
policy_gradient_loss | -0.00886 | | value_loss | 0.0846 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 3.95e+03 | | time/ | | | fps | 19078 | | iterations | 14 | | time_elapsed | 6 | | total_timesteps | 114688 | | train/ | | | approx_kl | 0.0006308706 | | clip_fraction | 0.0049 | | clip_range | 0.2 | | entropy_loss | -0.0765 | | explained_variance | 0.013 | | learning_rate | 0.0003 | | loss | 0.0289 | | n_updates | 78 | | policy_gradient_loss | -0.00621 | | value_loss | 0.0761 | ------------------------------------------ ----------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.26e+03 | | time/ | | | fps | 19080 | | iterations | 15 | | time_elapsed | 6 | | total_timesteps | 122880 | | train/ | | | approx_kl | 0.000340859 | | clip_fraction | 0.00212 | | clip_range | 0.2 | | entropy_loss | -0.0597 | | explained_variance | 0.0157 | | learning_rate | 0.0003 | | loss | 0.0282 | | n_updates | 84 | | policy_gradient_loss | -0.0043 | | value_loss | 0.0708 | ----------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.26e+03 | | time/ | | | fps | 19085 | | iterations | 16 | | time_elapsed | 6 | | total_timesteps | 131072 | | train/ | | | approx_kl | 0.00017984342 | | clip_fraction | 0.00061 | | clip_range | 0.2 | | entropy_loss | -0.0472 | | explained_variance | 0.0146 | | learning_rate | 0.0003 | | loss | 0.0319 | | n_updates | 90 | | policy_gradient_loss | -0.00285 | | value_loss | 0.0704 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.26e+03 | | time/ | | | fps | 19096 | | iterations | 17 | | time_elapsed | 7 | | total_timesteps | 139264 | | train/ | | | approx_kl | 0.00011668514 | | clip_fraction | 0.000142 | | clip_range | 0.2 | | entropy_loss | 
-0.0383 | | explained_variance | 0.0156 | | learning_rate | 0.0003 | | loss | 0.0291 | | n_updates | 96 | | policy_gradient_loss | -0.00209 | | value_loss | 0.0699 | ------------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.26e+03 | | time/ | | | fps | 19107 | | iterations | 18 | | time_elapsed | 7 | | total_timesteps | 147456 | | train/ | | | approx_kl | 7.765376e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0318 | | explained_variance | 0.0168 | | learning_rate | 0.0003 | | loss | 0.0312 | | n_updates | 102 | | policy_gradient_loss | -0.00156 | | value_loss | 0.0711 | ------------------------------------------ ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.26e+03 | | time/ | | | fps | 19104 | | iterations | 19 | | time_elapsed | 8 | | total_timesteps | 155648 | | train/ | | | approx_kl | 3.0487085e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0271 | | explained_variance | 0.0203 | | learning_rate | 0.0003 | | loss | 0.0329 | | n_updates | 108 | | policy_gradient_loss | -0.000821 | | value_loss | 0.0673 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.26e+03 | | time/ | | | fps | 19107 | | iterations | 20 | | time_elapsed | 8 | | total_timesteps | 163840 | | train/ | | | approx_kl | 2.7386559e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0242 | | explained_variance | 0.0185 | | learning_rate | 0.0003 | | loss | 0.0301 | | n_updates | 114 | | policy_gradient_loss | -0.000805 | | value_loss | 0.0633 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.26e+03 | | time/ | | | fps | 19103 | | iterations | 21 | | time_elapsed | 9 | | total_timesteps | 172032 | | 
train/ | | | approx_kl | 3.1381976e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0206 | | explained_variance | 0.017 | | learning_rate | 0.0003 | | loss | 0.0305 | | n_updates | 120 | | policy_gradient_loss | -0.000866 | | value_loss | 0.0661 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.76e+03 | | time/ | | | fps | 19109 | | iterations | 22 | | time_elapsed | 9 | | total_timesteps | 180224 | | train/ | | | approx_kl | 1.7015293e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0187 | | explained_variance | 0.0239 | | learning_rate | 0.0003 | | loss | 0.0331 | | n_updates | 126 | | policy_gradient_loss | -0.000542 | | value_loss | 0.0674 | ------------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.76e+03 | | time/ | | | fps | 19116 | | iterations | 23 | | time_elapsed | 9 | | total_timesteps | 188416 | | train/ | | | approx_kl | 8.737858e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0167 | | explained_variance | 0.0223 | | learning_rate | 0.0003 | | loss | 0.0348 | | n_updates | 132 | | policy_gradient_loss | -0.000329 | | value_loss | 0.0712 | ------------------------------------------ ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.76e+03 | | time/ | | | fps | 19123 | | iterations | 24 | | time_elapsed | 10 | | total_timesteps | 196608 | | train/ | | | approx_kl | 9.8511955e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0153 | | explained_variance | 0.0209 | | learning_rate | 0.0003 | | loss | 0.0307 | | n_updates | 138 | | policy_gradient_loss | -0.000398 | | value_loss | 0.0638 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.76e+03 
| | time/ | | | fps | 19105 | | iterations | 25 | | time_elapsed | 10 | | total_timesteps | 204800 | | train/ | | | approx_kl | 5.1805473e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0142 | | explained_variance | 0.0157 | | learning_rate | 0.0003 | | loss | 0.0307 | | n_updates | 144 | | policy_gradient_loss | -0.000245 | | value_loss | 0.0644 | ------------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.76e+03 | | time/ | | | fps | 19110 | | iterations | 26 | | time_elapsed | 11 | | total_timesteps | 212992 | | train/ | | | approx_kl | 8.469215e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0135 | | explained_variance | 0.0251 | | learning_rate | 0.0003 | | loss | 0.0397 | | n_updates | 150 | | policy_gradient_loss | -0.000347 | | value_loss | 0.074 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.76e+03 | | time/ | | | fps | 19118 | | iterations | 27 | | time_elapsed | 11 | | total_timesteps | 221184 | | train/ | | | approx_kl | 4.426707e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0117 | | explained_variance | 0.0222 | | learning_rate | 0.0003 | | loss | 0.0281 | | n_updates | 156 | | policy_gradient_loss | -0.000209 | | value_loss | 0.0592 | ------------------------------------------ ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.76e+03 | | time/ | | | fps | 19122 | | iterations | 28 | | time_elapsed | 11 | | total_timesteps | 229376 | | train/ | | | approx_kl | 3.5331614e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0116 | | explained_variance | 0.0213 | | learning_rate | 0.0003 | | loss | 0.0349 | | n_updates | 162 | | policy_gradient_loss | -0.000168 | | value_loss | 0.0667 | ------------------------------------------- 
| time/iterations | time/total_timesteps | time/time_elapsed | time/fps | rollout/ep_len_mean | rollout/ep_rew_mean | train/approx_kl | train/clip_fraction | train/clip_range | train/entropy_loss | train/explained_variance | train/learning_rate | train/loss | train/n_updates | train/policy_gradient_loss | train/value_loss |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 29 | 237568 | 12 | 19128 | 7.33e+03 | 6.02e+03 | 2.787805e-06 | 0 | 0.2 | -0.0106 | 0.0187 | 0.0003 | 0.0327 | 168 | -0.000151 | 0.0667 |
| 30 | 245760 | 12 | 19131 | 7.33e+03 | 6.02e+03 | 3.0290394e-06 | 0 | 0.2 | -0.00984 | 0.0207 | 0.0003 | 0.0329 | 174 | -0.00017 | 0.0644 |
| 31 | 253952 | 13 | 19130 | 7.33e+03 | 6.02e+03 | 3.381596e-06 | 0 | 0.2 | -0.00949 | 0.0277 | 0.0003 | 0.0332 | 180 | -0.000176 | 0.0684 |
| 32 | 262144 | 13 | 19132 | 7.33e+03 | 6.02e+03 | 2.2856475e-06 | 0 | 0.2 | -0.00879 | 0.0234 | 0.0003 | 0.0344 | 186 | -0.000137 | 0.0658 |
| 33 | 270336 | 14 | 19132 | 7.33e+03 | 6.02e+03 | 1.4155812e-06 | 0 | 0.2 | -0.00821 | 0.0255 | 0.0003 | 0.0358 | 192 | -9.13e-05 | 0.0688 |
| 34 | 278528 | 14 | 19138 | 7.33e+03 | 6.02e+03 | 1.2561795e-06 | 0 | 0.2 | -0.00768 | 0.0206 | 0.0003 | 0.0298 | 198 | -7.76e-05 | 0.0623 |
| 35 | 286720 | 14 | 19146 | 7.33e+03 | 6.02e+03 | 2.1879678e-06 | 0 | 0.2 | -0.00754 | 0.0184 | 0.0003 | 0.0335 | 204 | -0.000133 | 0.0637 |
| 36 | 294912 | 15 | 19147 | 7.33e+03 | 6.18e+03 | 3.5027188e-07 | 0 | 0.2 | -0.00718 | 0.0229 | 0.0003 | 0.0287 | 210 | -3.23e-05 | 0.0613 |
| 37 | 303104 | 15 | 19147 | 7.33e+03 | 6.18e+03 | 2.319066e-07 | 0 | 0.2 | -0.007 | 0.0242 | 0.0003 | 0.0347 | 216 | -1.94e-05 | 0.0662 |

[SAVED] bots/models/PPO_Trailing_Stop_Loss/ppo_stop_loss_selector_rl_stop_loss_training_6B=F.zip
[TRAIN] 6E=F — rows: 7380, actions: [1.0, 2.0, 3.0, 4.0], envs: 8
Using cpu device

| time/iterations | time/total_timesteps | time/time_elapsed | time/fps | rollout/ep_len_mean | rollout/ep_rew_mean | train/approx_kl | train/clip_fraction | train/clip_range | train/entropy_loss | train/explained_variance | train/learning_rate | train/loss | train/n_updates | train/policy_gradient_loss | train/value_loss |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 8192 | 0 | 21350 | | | | | | | | | | | | |
| 2 | 16384 | 0 | 20132 | | | 0.0138778575 | 0.0997 | 0.2 | -1.38 | -0.0242 | 0.0003 | 0.0373 | 6 | -0.0671 | 0.4 |
| 3 | 24576 | 1 | 19726 | | | 0.017829828 | 0.107 | 0.2 | -1.34 | -0.0131 | 0.0003 | -0.0145 | 12 | -0.0765 | 0.29 |
| 4 | 32768 | 1 | 19538 | | | 0.025589835 | 0.151 | 0.2 | -1.24 | -0.013 | 0.0003 | -0.0431 | 18 | -0.093 | 0.255 |
| 5 | 40960 | 2 | 19407 | | | 0.0245471 | 0.11 | 0.2 | -1.08 | -0.0151 | 0.0003 | -0.0362 | 24 | -0.0896 | 0.256 |
| 6 | 49152 | 2 | 19327 | | | 0.02243957 | 0.124 | 0.2 | -0.863 | -0.00603 | 0.0003 | -0.0089 | 30 | -0.0766 | 0.238 |
| 7 | 57344 | 2 | 19264 | | | 0.01618424 | 0.0856 | 0.2 | -0.649 | -0.00363 | 0.0003 | 0.015 | 36 | -0.0581 | 0.206 |
| 8 | 65536 | 3 | 19202 | 7.38e+03 | 3.86e+03 | 0.010122934 | 0.068 | 0.2 | -0.469 | 0.00299 | 0.0003 | 0.0258 | 42 | -0.0417 | 0.176 |
| 9 | 73728 | 3 | 19176 | 7.38e+03 | 3.86e+03 | 0.0055768033 | 0.0392 | 0.2 | -0.343 | 0.00677 | 0.0003 | 0.0344 | 48 | -0.0282 | 0.154 |
| 10 | 81920 | 4 | 19155 | 7.38e+03 | 3.86e+03 | 0.0030899993 | 0.0227 | 0.2 | -0.253 | 0.00657 | 0.0003 | 0.0362 | 54 | -0.0184 | 0.133 |
| 11 | 90112 | 4 | 19131 | 7.38e+03 | 3.86e+03 | 0.0020436402 | 0.0143 | 0.2 | -0.19 | 0.0111 | 0.0003 | 0.0425 | 60 | -0.0127 | 0.128 |
| 12 | 98304 | 5 | 19117 | 7.38e+03 | 3.86e+03 | 0.0014379448 | 0.0104 | 0.2 | -0.14 | 0.00464 | 0.0003 | 0.0446 | 66 | -0.0094 | 0.116 |
| 13 | 106496 | 5 | 19104 | 7.38e+03 | 3.86e+03 | 0.0009949196 | 0.0072 | 0.2 | -0.104 | 0.00665 | 0.0003 | 0.0459 | 72 | -0.00716 | 0.114 |
| 14 | 114688 | 6 | 19092 | 7.38e+03 | 3.86e+03 | 0.0005190867 | 0.0036 | 0.2 | -0.0783 | 0.00889 | 0.0003 | 0.0486 | 78 | -0.00475 | 0.107 |
| 15 | 122880 | 6 | 19082 | 7.38e+03 | 5.1e+03 | 0.00027618528 | 0.00157 | 0.2 | -0.0616 | 0.0115 | 0.0003 | 0.0459 | 84 | -0.00322 | 0.103 |
| 16 | 131072 | 6 | 19074 | 7.38e+03 | 5.1e+03 | 0.00016038428 | 0 | 0.2 | -0.0501 | 0.0155 | 0.0003 | 0.0453 | 90 | -0.0021 | 0.106 |
| 17 | 139264 | 7 | 19069 | 7.38e+03 | 5.1e+03 | 0.000110134395 | 0 | 0.2 | -0.0416 | 0.0111 | 0.0003 | 0.0513 | 96 | -0.00162 | 0.104 |
| 18 | 147456 | 7 | 19065 | 7.38e+03 | 5.1e+03 | 6.149147e-05 | 0 | 0.2 | -0.0351 | 0.0147 | 0.0003 | 0.0522 | 102 | -0.00108 | 0.105 |
| 19 | 155648 | 8 | 19057 | 7.38e+03 | 5.1e+03 | 2.784554e-05 | 0 | 0.2 | -0.0303 | 0.0166 | 0.0003 | 0.0526 | 108 | -0.00065 | 0.101 |
| 20 | 163840 | 8 | 19052 | 7.38e+03 | 5.1e+03 | 2.4154047e-05 | 0 | 0.2 | -0.0271 | 0.0162 | 0.0003 | 0.0506 | 114 | -0.00059 | 0.102 |
| 21 | 172032 | 9 | 19047 | 7.38e+03 | 5.1e+03 | 3.0500298e-05 | 0 | 0.2 | -0.0241 | 0.0172 | 0.0003 | 0.0454 | 120 | -0.00072 | 0.0949 |
| 22 | 180224 | 9 | 19036 | 7.38e+03 | 5.57e+03 | 1.3894045e-05 | 0 | 0.2 | -0.0219 | 0.0195 | 0.0003 | 0.0495 | 126 | -0.000397 | 0.0983 |
| 23 | 188416 | 9 | 19003 | 7.38e+03 | 5.57e+03 | 1.1776217e-05 | 0 | 0.2 | -0.0201 | 0.0193 | 0.0003 | 0.0474 | 132 | -0.000347 | 0.0988 |
| 24 | 196608 | 10 | 18977 | 7.38e+03 | 5.57e+03 | 1.1170021e-05 | 0 | 0.2 | -0.0182 | 0.0223 | 0.0003 | 0.0457 | 138 | -0.00035 | 0.099 |
| 25 | 204800 | 10 | 18976 | 7.38e+03 | 5.57e+03 | 4.919908e-06 | 0 | 0.2 | -0.0165 | 0.0236 | 0.0003 | 0.0485 | 144 | -0.000194 | 0.0971 |
| 26 | 212992 | 11 | 18975 | 7.38e+03 | 5.57e+03 | 4.8175207e-06 | 0 | 0.2 | -0.0154 | 0.027 | 0.0003 | 0.0489 | 150 | -0.000174 | 0.0998 |
| 27 | 221184 | 11 | 18975 | 7.38e+03 | 5.57e+03 | 3.2801472e-06 | 0 | 0.2 | -0.0145 | 0.0272 | 0.0003 | 0.0476 | 156 | -0.000144 | 0.0991 |
| 28 | 229376 | 12 | 18972 | 7.38e+03 | 5.57e+03 | 2.847737e-06 | 0 | 0.2 | -0.0136 | 0.0292 | 0.0003 | 0.0465 | 162 | -0.00014 | 0.0966 |
| 29 | 237568 | 12 | 18969 | 7.38e+03 | 5.81e+03 | 2.3579269e-06 | 0 | 0.2 | -0.0131 | 0.0311 | 0.0003 | 0.0487 | 168 | -0.000111 | 0.1 |
| 30 | 245760 | 12 | 18965 | 7.38e+03 | 5.81e+03 | 3.214569e-06 | 0 | 0.2 | -0.0123 | 0.0333 | 0.0003 | 0.0471 | 174 | -0.000142 | 0.0935 |
| 31 | 253952 | 13 | 18962 | 7.38e+03 | 5.81e+03 | 3.4514305e-06 | 0 | 0.2 | -0.0113 | 0.0328 | 0.0003 | 0.0436 | 180 | -0.000154 | 0.0939 |
| 32 | 262144 | 13 | 18962 | 7.38e+03 | 5.81e+03 | 2.025692e-06 | 0 | 0.2 | -0.011 | 0.0387 | 0.0003 | 0.0488 | 186 | -0.000102 | 0.0976 |
| 33 | 270336 | 14 | 18959 | 7.38e+03 | 5.81e+03 | 1.1465818e-06 | 0 | 0.2 | -0.0102 | 0.0406 | 0.0003 | 0.0463 | 192 | -7.01e-05 | 0.0956 |
| 34 | 278528 | 14 | 18958 | 7.38e+03 | 5.81e+03 | 1.4040852e-06 | 0 | 0.2 | -0.00977 | 0.0461 | 0.0003 | 0.0487 | 198 | -8.27e-05 | 0.0976 |
| 35 | 286720 | 15 | 18958 | 7.38e+03 | 5.81e+03 | 2.6844136e-06 | 0 | 0.2 | -0.00944 | 0.0518 | 0.0003 | 0.0488 | 204 | -0.000127 | 0.0959 |
| 36 | 294912 | 15 | 18945 | 7.38e+03 | 5.81e+03 | 5.8821024e-07 | 0 | 0.2 | -0.0088 | 0.051 | 0.0003 | 0.0515 | 210 | -4.16e-05 | 0.0987 |
| 37 | 303104 | 16 | 18934 | 7.38e+03 | 5.96e+03 | 3.3714605e-07 | 0 | 0.2 | -0.00853 | 0.0556 | 0.0003 | 0.0473 | 216 | -2.62e-05 | 0.0922 |

[SAVED] bots/models/PPO_Trailing_Stop_Loss/ppo_stop_loss_selector_rl_stop_loss_training_6E=F.zip
[TRAIN] CL=F — rows: 7306, actions: [1.0, 2.0, 3.0, 4.0], envs: 8
Using cpu device

| time/iterations | time/total_timesteps | time/time_elapsed | time/fps | rollout/ep_len_mean | rollout/ep_rew_mean | train/approx_kl | train/clip_fraction | train/clip_range | train/entropy_loss | train/explained_variance | train/learning_rate | train/loss | train/n_updates | train/policy_gradient_loss | train/value_loss |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 8192 | 0 | 21294 | | | | | | | | | | | | |
| 2 | 16384 | 0 | 19947 | | | 0.013938623 | 0.0925 | 0.2 | -1.38 | -0.125 | 0.0003 | 0.00518 | 6 | -0.0645 | 0.291 |
| 3 | 24576 | 1 | 19578 | | | 0.018464828 | 0.104 | 0.2 | -1.34 | -0.0822 | 0.0003 | -0.0235 | 12 | -0.0792 | 0.249 |
| 4 | 32768 | 1 | 19325 | | | 0.022791272 | 0.136 | 0.2 | -1.24 | -0.0282 | 0.0003 | -0.0457 | 18 | -0.0936 | 0.248 |
| 5 | 40960 | 2 | 19154 | | | 0.021041423 | 0.086 | 0.2 | -1.08 | -0.0107 | 0.0003 | -0.0349 | 24 | -0.088 | 0.255 |
| 6 | 49152 | 2 | 18990 | | | 0.01860438 | 0.0951 | 0.2 | -0.892 | -0.000775 | 0.0003 | -0.0117 | 30 | -0.0758 | 0.238 |
| 7 | 57344 | 3 | 18925 | | | 0.014656736 | 0.087 | 0.2 | -0.695 | -0.000294 | 0.0003 | 0.00452 | 36 | -0.0613 | 0.207 |
| 8 | 65536 | 3 | 18880 | 7.31e+03 | 3.84e+03 | 0.010473039 | 0.0636 | 0.2 | -0.518 | -0.00106 | 0.0003 | 0.0162 | 42 | -0.0465 | 0.177 |
| 9 | 73728 | 3 | 18752 | 7.31e+03 | 3.84e+03 | 0.00641768 | 0.0449 | 0.2 | -0.383 | 0.00533 | 0.0003 | 0.029 | 48 | -0.0329 | 0.154 |
| 10 | 81920 | 4 | 18695 | 7.31e+03 | 3.84e+03 | 0.0035958125 | 0.0241 | 0.2 | -0.286 | 0.0205 | 0.0003 | 0.0313 | 54 | -0.0223 | 0.13 |
| 11 | 90112 | 4 | 18621 | 7.31e+03 | 3.84e+03 | 0.0022716466 | 0.0156 | 0.2 | -0.214 | 0.0388 | 0.0003 | 0.0306 | 60 | -0.0153 | 0.111 |
| 12 | 98304 | 5 | 18540 | 7.31e+03 | 3.84e+03 | 0.0016360199 | 0.0114 | 0.2 | -0.159 | 0.039 | 0.0003 | 0.0313 | 66 | -0.0121 | 0.103 |
| 13 | 106496 | 5 | 18461 | 7.31e+03 | 3.84e+03 | 0.0012078169 | 0.00877 | 0.2 | -0.119 | 0.0588 | 0.0003 | 0.0362 | 72 | -0.00902 | 0.1 |
| 14 | 114688 | 6 | 18423 | 7.31e+03 | 3.84e+03 | 0.0007647462 | 0.00606 | 0.2 | -0.0887 | 0.0691 | 0.0003 | 0.035 | 78 | -0.00655 | 0.0901 |
| 15 | 122880 | 6 | 18400 | 7.31e+03 | 5.09e+03 | 0.0003838509 | 0.0025 | 0.2 | -0.0682 | 0.0711 | 0.0003 | 0.0361 | 84 | -0.00419 | 0.086 |
| 16 | 131072 | 7 | 18387 | 7.31e+03 | 5.09e+03 | 0.00024317268 | 0.00118 | 0.2 | -0.0541 | 0.0885 | 0.0003 | 0.0341 | 90 | -0.00313 | 0.0821 |
| 17 | 139264 | 7 | 18350 | 7.31e+03 | 5.09e+03 | 0.00013239255 | 0.000244 | 0.2 | -0.0439 | 0.0779 | 0.0003 | 0.0377 | 96 | -0.00204 | 0.0825 |
| 18 | 147456 | 8 | 18352 | 7.31e+03 | 5.09e+03 | 7.5353804e-05 | 0 | 0.2 | -0.0366 | 0.0849 | 0.0003 | 0.0407 | 102 | -0.00141 | 0.083 |
| 19 | 155648 | 8 | 18356 | 7.31e+03 | 5.09e+03 | 3.7433987e-05 | 0 | 0.2 | -0.0316 | 0.106 | 0.0003 | 0.037 | 108 | -0.000836 | 0.079 |
| 20 | 163840 | 8 | 18381 | 7.31e+03 | 5.09e+03 | 3.05703e-05 | 0 | 0.2 | -0.0276 | 0.105 | 0.0003 | 0.0386 | 114 | -0.000782 | 0.0807 |
| 21 | 172032 | 9 | 18408 | 7.31e+03 | 5.09e+03 | 3.9773746e-05 | 0 | 0.2 | -0.0244 | 0.12 | 0.0003 | 0.0367 | 120 | -0.000863 | 0.0793 |
| 22 | 180224 | 9 | 18428 | 7.31e+03 | 5.59e+03 | 1.6443533e-05 | 0 | 0.2 | -0.0215 | 0.0976 | 0.0003 | 0.0374 | 126 | -0.000474 | 0.0805 |
| 23 | 188416 | 10 | 18452 | 7.31e+03 | 5.59e+03 | 1.4652971e-05 | 0 | 0.2 | -0.0195 | 0.107 | 0.0003 | 0.0392 | 132 | -0.000429 | 0.0807 |
| 24 | 196608 | 10 | 18475 | 7.31e+03 | 5.59e+03 | 1.21653575e-05 | 0 | 0.2 | -0.0176 | 0.112 | 0.0003 | 0.0386 | 138 | -0.000424 | 0.0762 |
| 25 | 204800 | 11 | 18496 | 7.31e+03 | 5.59e+03 | 3.9155493e-06 | 0 | 0.2 | -0.0161 | 0.119 | 0.0003 | 0.0385 | 144 | -0.000152 | 0.0781 |
| 26 | 212992 | 11 | 18517 | 7.31e+03 | 5.59e+03 | 5.5898927e-06 | 0 | 0.2 | -0.0151 | 0.136 | 0.0003 | 0.0394 | 150 | -0.000239 | 0.0786 |
| 27 | 221184 | 11 | 18534 | 7.31e+03 | 5.59e+03 | 4.934598e-06 | 0 | 0.2 | -0.014 | 0.123 | 0.0003 | 0.0359 | 156 | -0.000213 | 0.0773 |
| 28 | 229376 | 12 | 18551 | 7.31e+03 | 5.59e+03 | 4.1564417e-06 | 0 | 0.2 | -0.0131 | 0.117 | 0.0003 | 0.0386 | 162 | -0.000179 | 0.0768 |
| 29 | 237568 | 12 | 18564 | 7.31e+03 | 5.84e+03 | 3.2159878e-06 | 0 | 0.2 | -0.0122 | 0.122 | 0.0003 | 0.0355 | 168 | -0.000142 | 0.0754 |
| 30 | 245760 | 13 | 18580 | 7.31e+03 | 5.84e+03 | 4.0169834e-06 | 0 | 0.2 | -0.0115 | 0.126 | 0.0003 | 0.0366 | 174 | -0.000192 | 0.075 |
| 31 | 253952 | 13 | 18573 | 7.31e+03 | 5.84e+03 | 3.976922e-06 | 0 | 0.2 | -0.0108 | 0.135 | 0.0003 | 0.0365 | 180 | -0.00018 | 0.0747 |
| 32 | 262144 | 14 | 18578 | 7.31e+03 | 5.84e+03 | 1.768647e-06 | 0 | 0.2 | -0.0101 | 0.13 | 0.0003 | 0.0357 | 186 | -9e-05 | 0.0735 |

------------------------------------------- | rollout/ | | | ep_len_mean | 7.31e+03 | | ep_rew_mean | 5.84e+03 | | time/ | | | fps | 18464 | | iterations | 33 | | time_elapsed | 14 | | total_timesteps | 270336 | | train/ | 
| | approx_kl | 1.0735894e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.00954 | | explained_variance | 0.139 | | learning_rate | 0.0003 | | loss | 0.0385 | | n_updates | 192 | | policy_gradient_loss | -7.3e-05 | | value_loss | 0.0747 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.31e+03 | | ep_rew_mean | 5.84e+03 | | time/ | | | fps | 18475 | | iterations | 34 | | time_elapsed | 15 | | total_timesteps | 278528 | | train/ | | | approx_kl | 1.3901445e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.00913 | | explained_variance | 0.129 | | learning_rate | 0.0003 | | loss | 0.0352 | | n_updates | 198 | | policy_gradient_loss | -9.16e-05 | | value_loss | 0.0745 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.31e+03 | | ep_rew_mean | 5.84e+03 | | time/ | | | fps | 18483 | | iterations | 35 | | time_elapsed | 15 | | total_timesteps | 286720 | | train/ | | | approx_kl | 3.3095348e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.00865 | | explained_variance | 0.142 | | learning_rate | 0.0003 | | loss | 0.0359 | | n_updates | 204 | | policy_gradient_loss | -0.000159 | | value_loss | 0.0759 | ------------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.31e+03 | | ep_rew_mean | 5.99e+03 | | time/ | | | fps | 18483 | | iterations | 36 | | time_elapsed | 15 | | total_timesteps | 294912 | | train/ | | | approx_kl | 8.042334e-07 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.00816 | | explained_variance | 0.14 | | learning_rate | 0.0003 | | loss | 0.0379 | | n_updates | 210 | | policy_gradient_loss | -5.17e-05 | | value_loss | 0.0781 | ------------------------------------------ ------------------------------------------- | rollout/ | | | ep_len_mean | 7.31e+03 | | ep_rew_mean | 5.99e+03 | | 
time/ | | | fps | 18497 | | iterations | 37 | | time_elapsed | 16 | | total_timesteps | 303104 | | train/ | | | approx_kl | 3.0693627e-07 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.00782 | | explained_variance | 0.143 | | learning_rate | 0.0003 | | loss | 0.0366 | | n_updates | 216 | | policy_gradient_loss | -2.78e-05 | | value_loss | 0.0734 | -------------------------------------------
[SAVED] bots/models/PPO_Trailing_Stop_Loss/ppo_stop_loss_selector_rl_stop_loss_training_CL=F.zip
[TRAIN] GC=F — rows: 7300, actions: [1.0, 2.0, 3.0, 4.0], envs: 8
Using cpu device
------------------------------ | time/ | | | fps | 22047 | | iterations | 1 | | time_elapsed | 0 | | total_timesteps | 8192 | ------------------------------ ------------------------------------------ | time/ | | | fps | 20503 | | iterations | 2 | | time_elapsed | 0 | | total_timesteps | 16384 | | train/ | | | approx_kl | 0.0145595735 | | clip_fraction | 0.104 | | clip_range | 0.2 | | entropy_loss | -1.38 | | explained_variance | -0.0853 | | learning_rate | 0.0003 | | loss | 0.219 | | n_updates | 6 | | policy_gradient_loss | -0.0614 | | value_loss | 0.874 | ------------------------------------------ ----------------------------------------- | time/ | | | fps | 20100 | | iterations | 3 | | time_elapsed | 1 | | total_timesteps | 24576 | | train/ | | | approx_kl | 0.014348937 | | clip_fraction | 0.0834 | | clip_range | 0.2 | | entropy_loss | -1.35 | | explained_variance | -0.194 | | learning_rate | 0.0003 | | loss | 0.0729 | | n_updates | 12 | | policy_gradient_loss | -0.0606 | | value_loss | 0.526 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 19890 | | iterations | 4 | | time_elapsed | 1 | | total_timesteps | 32768 | | train/ | | | approx_kl | 0.025508685 | | clip_fraction | 0.154 | | clip_range | 0.2 | | entropy_loss | -1.26 | | explained_variance | -0.249 | | learning_rate | 0.0003 | | loss | -0.000587 | | n_updates | 18 | | 
policy_gradient_loss | -0.0756 | | value_loss | 0.319 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 19779 | | iterations | 5 | | time_elapsed | 2 | | total_timesteps | 40960 | | train/ | | | approx_kl | 0.033062644 | | clip_fraction | 0.245 | | clip_range | 0.2 | | entropy_loss | -1.07 | | explained_variance | -0.128 | | learning_rate | 0.0003 | | loss | -0.0112 | | n_updates | 24 | | policy_gradient_loss | -0.0874 | | value_loss | 0.275 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 19695 | | iterations | 6 | | time_elapsed | 2 | | total_timesteps | 49152 | | train/ | | | approx_kl | 0.021857377 | | clip_fraction | 0.11 | | clip_range | 0.2 | | entropy_loss | -0.854 | | explained_variance | -0.0403 | | learning_rate | 0.0003 | | loss | 0.0117 | | n_updates | 30 | | policy_gradient_loss | -0.0658 | | value_loss | 0.248 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 19653 | | iterations | 7 | | time_elapsed | 2 | | total_timesteps | 57344 | | train/ | | | approx_kl | 0.018235309 | | clip_fraction | 0.101 | | clip_range | 0.2 | | entropy_loss | -0.632 | | explained_variance | -0.0242 | | learning_rate | 0.0003 | | loss | 0.0321 | | n_updates | 36 | | policy_gradient_loss | -0.0523 | | value_loss | 0.225 | ----------------------------------------- ----------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 3.68e+03 | | time/ | | | fps | 19618 | | iterations | 8 | | time_elapsed | 3 | | total_timesteps | 65536 | | train/ | | | approx_kl | 0.011451498 | | clip_fraction | 0.0651 | | clip_range | 0.2 | | entropy_loss | -0.442 | | explained_variance | -0.0223 | | learning_rate | 0.0003 | | loss | 0.0463 | | n_updates | 42 | | policy_gradient_loss | -0.0357 | | value_loss | 0.196 | ----------------------------------------- 
------------------------------------------ | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 3.68e+03 | | time/ | | | fps | 19599 | | iterations | 9 | | time_elapsed | 3 | | total_timesteps | 73728 | | train/ | | | approx_kl | 0.0056556673 | | clip_fraction | 0.04 | | clip_range | 0.2 | | entropy_loss | -0.312 | | explained_variance | -0.0204 | | learning_rate | 0.0003 | | loss | 0.0588 | | n_updates | 48 | | policy_gradient_loss | -0.0225 | | value_loss | 0.178 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 3.68e+03 | | time/ | | | fps | 19573 | | iterations | 10 | | time_elapsed | 4 | | total_timesteps | 81920 | | train/ | | | approx_kl | 0.0031574985 | | clip_fraction | 0.0215 | | clip_range | 0.2 | | entropy_loss | -0.226 | | explained_variance | -0.0221 | | learning_rate | 0.0003 | | loss | 0.0608 | | n_updates | 54 | | policy_gradient_loss | -0.014 | | value_loss | 0.164 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 3.68e+03 | | time/ | | | fps | 19554 | | iterations | 11 | | time_elapsed | 4 | | total_timesteps | 90112 | | train/ | | | approx_kl | 0.0017940018 | | clip_fraction | 0.014 | | clip_range | 0.2 | | entropy_loss | -0.165 | | explained_variance | -0.0226 | | learning_rate | 0.0003 | | loss | 0.0656 | | n_updates | 60 | | policy_gradient_loss | -0.00908 | | value_loss | 0.155 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 3.68e+03 | | time/ | | | fps | 19553 | | iterations | 12 | | time_elapsed | 5 | | total_timesteps | 98304 | | train/ | | | approx_kl | 0.0012823453 | | clip_fraction | 0.0093 | | clip_range | 0.2 | | entropy_loss | -0.121 | | explained_variance | -0.0165 | | learning_rate | 0.0003 | | loss | 0.0634 | | n_updates | 66 | | 
policy_gradient_loss | -0.00695 | | value_loss | 0.151 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 3.68e+03 | | time/ | | | fps | 19506 | | iterations | 13 | | time_elapsed | 5 | | total_timesteps | 106496 | | train/ | | | approx_kl | 0.0007296989 | | clip_fraction | 0.00614 | | clip_range | 0.2 | | entropy_loss | -0.0903 | | explained_variance | -0.021 | | learning_rate | 0.0003 | | loss | 0.064 | | n_updates | 72 | | policy_gradient_loss | -0.00485 | | value_loss | 0.145 | ------------------------------------------ ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 3.68e+03 | | time/ | | | fps | 19503 | | iterations | 14 | | time_elapsed | 5 | | total_timesteps | 114688 | | train/ | | | approx_kl | 0.00038570867 | | clip_fraction | 0.00226 | | clip_range | 0.2 | | entropy_loss | -0.07 | | explained_variance | -0.0191 | | learning_rate | 0.0003 | | loss | 0.0644 | | n_updates | 78 | | policy_gradient_loss | -0.00317 | | value_loss | 0.143 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 4.82e+03 | | time/ | | | fps | 19491 | | iterations | 15 | | time_elapsed | 6 | | total_timesteps | 122880 | | train/ | | | approx_kl | 0.00022462045 | | clip_fraction | 0.00106 | | clip_range | 0.2 | | entropy_loss | -0.0558 | | explained_variance | -0.0214 | | learning_rate | 0.0003 | | loss | 0.0641 | | n_updates | 84 | | policy_gradient_loss | -0.00216 | | value_loss | 0.138 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 4.82e+03 | | time/ | | | fps | 19483 | | iterations | 16 | | time_elapsed | 6 | | total_timesteps | 131072 | | train/ | | | approx_kl | 0.00011899312 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | 
-0.0459 | | explained_variance | -0.0163 | | learning_rate | 0.0003 | | loss | 0.0661 | | n_updates | 90 | | policy_gradient_loss | -0.00141 | | value_loss | 0.136 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 4.82e+03 | | time/ | | | fps | 19441 | | iterations | 17 | | time_elapsed | 7 | | total_timesteps | 139264 | | train/ | | | approx_kl | 8.1258666e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0386 | | explained_variance | -0.0136 | | learning_rate | 0.0003 | | loss | 0.0672 | | n_updates | 96 | | policy_gradient_loss | -0.00103 | | value_loss | 0.14 | ------------------------------------------- ----------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 4.82e+03 | | time/ | | | fps | 19442 | | iterations | 18 | | time_elapsed | 7 | | total_timesteps | 147456 | | train/ | | | approx_kl | 4.85883e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0332 | | explained_variance | -0.0126 | | learning_rate | 0.0003 | | loss | 0.0726 | | n_updates | 102 | | policy_gradient_loss | -0.000727 | | value_loss | 0.143 | ----------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 4.82e+03 | | time/ | | | fps | 19415 | | iterations | 19 | | time_elapsed | 8 | | total_timesteps | 155648 | | train/ | | | approx_kl | 2.0459462e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0292 | | explained_variance | -0.0178 | | learning_rate | 0.0003 | | loss | 0.0668 | | n_updates | 108 | | policy_gradient_loss | -0.000421 | | value_loss | 0.135 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 4.82e+03 | | time/ | | | fps | 19389 | | iterations | 20 | | time_elapsed | 8 | | total_timesteps | 163840 | | train/ | | 
| approx_kl | 1.9647574e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0263 | | explained_variance | -0.013 | | learning_rate | 0.0003 | | loss | 0.0648 | | n_updates | 114 | | policy_gradient_loss | -0.000392 | | value_loss | 0.134 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 4.82e+03 | | time/ | | | fps | 19365 | | iterations | 21 | | time_elapsed | 8 | | total_timesteps | 172032 | | train/ | | | approx_kl | 2.1511172e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0238 | | explained_variance | -0.00799 | | learning_rate | 0.0003 | | loss | 0.0701 | | n_updates | 120 | | policy_gradient_loss | -0.000406 | | value_loss | 0.139 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.25e+03 | | time/ | | | fps | 19361 | | iterations | 22 | | time_elapsed | 9 | | total_timesteps | 180224 | | train/ | | | approx_kl | 1.0006443e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0216 | | explained_variance | -0.00143 | | learning_rate | 0.0003 | | loss | 0.069 | | n_updates | 126 | | policy_gradient_loss | -0.000229 | | value_loss | 0.136 | ------------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.25e+03 | | time/ | | | fps | 19363 | | iterations | 23 | | time_elapsed | 9 | | total_timesteps | 188416 | | train/ | | | approx_kl | 8.820854e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0199 | | explained_variance | 0.000704 | | learning_rate | 0.0003 | | loss | 0.0631 | | n_updates | 132 | | policy_gradient_loss | -0.000237 | | value_loss | 0.132 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.25e+03 | | time/ | | | 
fps | 19358 | | iterations | 24 | | time_elapsed | 10 | | total_timesteps | 196608 | | train/ | | | approx_kl | 8.881369e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0183 | | explained_variance | 0.000622 | | learning_rate | 0.0003 | | loss | 0.0678 | | n_updates | 138 | | policy_gradient_loss | -0.000233 | | value_loss | 0.139 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.25e+03 | | time/ | | | fps | 19358 | | iterations | 25 | | time_elapsed | 10 | | total_timesteps | 204800 | | train/ | | | approx_kl | 5.019261e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0169 | | explained_variance | 0.00109 | | learning_rate | 0.0003 | | loss | 0.0675 | | n_updates | 144 | | policy_gradient_loss | -0.000162 | | value_loss | 0.133 | ------------------------------------------ ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.25e+03 | | time/ | | | fps | 19354 | | iterations | 26 | | time_elapsed | 11 | | total_timesteps | 212992 | | train/ | | | approx_kl | 5.5480486e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0157 | | explained_variance | 9.97e-05 | | learning_rate | 0.0003 | | loss | 0.0645 | | n_updates | 150 | | policy_gradient_loss | -0.000174 | | value_loss | 0.134 | ------------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.25e+03 | | time/ | | | fps | 19354 | | iterations | 27 | | time_elapsed | 11 | | total_timesteps | 221184 | | train/ | | | approx_kl | 5.821632e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0146 | | explained_variance | 0.000478 | | learning_rate | 0.0003 | | loss | 0.0667 | | n_updates | 156 | | policy_gradient_loss | -0.000177 | | value_loss | 0.135 | ------------------------------------------ 
------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.25e+03 | | time/ | | | fps | 19340 | | iterations | 28 | | time_elapsed | 11 | | total_timesteps | 229376 | | train/ | | | approx_kl | 4.0204686e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0136 | | explained_variance | 0.0024 | | learning_rate | 0.0003 | | loss | 0.0661 | | n_updates | 162 | | policy_gradient_loss | -0.000127 | | value_loss | 0.134 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.47e+03 | | time/ | | | fps | 19337 | | iterations | 29 | | time_elapsed | 12 | | total_timesteps | 237568 | | train/ | | | approx_kl | 2.4277542e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0128 | | explained_variance | 0.00119 | | learning_rate | 0.0003 | | loss | 0.0687 | | n_updates | 168 | | policy_gradient_loss | -9.01e-05 | | value_loss | 0.134 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.47e+03 | | time/ | | | fps | 19339 | | iterations | 30 | | time_elapsed | 12 | | total_timesteps | 245760 | | train/ | | | approx_kl | 2.6972193e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0121 | | explained_variance | 0.00154 | | learning_rate | 0.0003 | | loss | 0.0647 | | n_updates | 174 | | policy_gradient_loss | -9.83e-05 | | value_loss | 0.133 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.47e+03 | | time/ | | | fps | 19338 | | iterations | 31 | | time_elapsed | 13 | | total_timesteps | 253952 | | train/ | | | approx_kl | 2.2189924e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0114 | | explained_variance | 0.0021 | | learning_rate | 0.0003 | | loss | 0.069 | | n_updates | 180 | | 
policy_gradient_loss | -7.99e-05 | | value_loss | 0.135 | ------------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.47e+03 | | time/ | | | fps | 19337 | | iterations | 32 | | time_elapsed | 13 | | total_timesteps | 262144 | | train/ | | | approx_kl | 1.028835e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0109 | | explained_variance | 0.00312 | | learning_rate | 0.0003 | | loss | 0.0688 | | n_updates | 186 | | policy_gradient_loss | -4.58e-05 | | value_loss | 0.135 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.47e+03 | | time/ | | | fps | 19339 | | iterations | 33 | | time_elapsed | 13 | | total_timesteps | 270336 | | train/ | | | approx_kl | 9.115174e-07 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0104 | | explained_variance | 0.00299 | | learning_rate | 0.0003 | | loss | 0.0658 | | n_updates | 192 | | policy_gradient_loss | -4.91e-05 | | value_loss | 0.132 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.47e+03 | | time/ | | | fps | 19338 | | iterations | 34 | | time_elapsed | 14 | | total_timesteps | 278528 | | train/ | | | approx_kl | 9.727883e-07 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.01 | | explained_variance | 0.0025 | | learning_rate | 0.0003 | | loss | 0.0669 | | n_updates | 198 | | policy_gradient_loss | -4.56e-05 | | value_loss | 0.136 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.47e+03 | | time/ | | | fps | 19332 | | iterations | 35 | | time_elapsed | 14 | | total_timesteps | 286720 | | train/ | | | approx_kl | 2.963512e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.00955 | | 
explained_variance | 0.00197 | | learning_rate | 0.0003 | | loss | 0.0652 | | n_updates | 204 | | policy_gradient_loss | -0.000108 | | value_loss | 0.133 | ------------------------------------------ ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.6e+03 | | time/ | | | fps | 19312 | | iterations | 36 | | time_elapsed | 15 | | total_timesteps | 294912 | | train/ | | | approx_kl | 5.6384306e-07 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.00903 | | explained_variance | 0.00313 | | learning_rate | 0.0003 | | loss | 0.0668 | | n_updates | 210 | | policy_gradient_loss | -2.65e-05 | | value_loss | 0.134 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.6e+03 | | time/ | | | fps | 19294 | | iterations | 37 | | time_elapsed | 15 | | total_timesteps | 303104 | | train/ | | | approx_kl | 2.8371142e-07 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.00871 | | explained_variance | 0.00409 | | learning_rate | 0.0003 | | loss | 0.0673 | | n_updates | 216 | | policy_gradient_loss | -2.04e-05 | | value_loss | 0.134 | -------------------------------------------
[SAVED] bots/models/PPO_Trailing_Stop_Loss/ppo_stop_loss_selector_rl_stop_loss_training_GC=F.zip
[TRAIN] LE=F — rows: 7176, actions: [1.0, 2.0, 3.0, 4.0], envs: 8
Using cpu device
------------------------------ | time/ | | | fps | 21947 | | iterations | 1 | | time_elapsed | 0 | | total_timesteps | 8192 | ------------------------------ ----------------------------------------- | time/ | | | fps | 20547 | | iterations | 2 | | time_elapsed | 0 | | total_timesteps | 16384 | | train/ | | | approx_kl | 0.015467179 | | clip_fraction | 0.103 | | clip_range | 0.2 | | entropy_loss | -1.38 | | explained_variance | -0.0724 | | learning_rate | 0.0003 | | loss | -0.00778 | | n_updates | 6 | | policy_gradient_loss | -0.0692 | | value_loss | 0.26 | 
----------------------------------------- ----------------------------------------- | time/ | | | fps | 20074 | | iterations | 3 | | time_elapsed | 1 | | total_timesteps | 24576 | | train/ | | | approx_kl | 0.020210564 | | clip_fraction | 0.115 | | clip_range | 0.2 | | entropy_loss | -1.34 | | explained_variance | -0.0411 | | learning_rate | 0.0003 | | loss | -0.0307 | | n_updates | 12 | | policy_gradient_loss | -0.0825 | | value_loss | 0.239 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 19852 | | iterations | 4 | | time_elapsed | 1 | | total_timesteps | 32768 | | train/ | | | approx_kl | 0.023914147 | | clip_fraction | 0.157 | | clip_range | 0.2 | | entropy_loss | -1.23 | | explained_variance | -0.0241 | | learning_rate | 0.0003 | | loss | -0.0489 | | n_updates | 18 | | policy_gradient_loss | -0.0959 | | value_loss | 0.25 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 19761 | | iterations | 5 | | time_elapsed | 2 | | total_timesteps | 40960 | | train/ | | | approx_kl | 0.019508064 | | clip_fraction | 0.0703 | | clip_range | 0.2 | | entropy_loss | -1.07 | | explained_variance | -0.0143 | | learning_rate | 0.0003 | | loss | -0.0293 | | n_updates | 24 | | policy_gradient_loss | -0.0838 | | value_loss | 0.256 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 19687 | | iterations | 6 | | time_elapsed | 2 | | total_timesteps | 49152 | | train/ | | | approx_kl | 0.018300926 | | clip_fraction | 0.1 | | clip_range | 0.2 | | entropy_loss | -0.877 | | explained_variance | -0.0121 | | learning_rate | 0.0003 | | loss | -0.0111 | | n_updates | 30 | | policy_gradient_loss | -0.0748 | | value_loss | 0.239 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 19641 | | iterations | 7 | | time_elapsed | 2 | | total_timesteps | 57344 | | train/ | | | approx_kl 
| 0.014512697 | | clip_fraction | 0.0883 | | clip_range | 0.2 | | entropy_loss | -0.68 | | explained_variance | -0.00972 | | learning_rate | 0.0003 | | loss | 0.0112 | | n_updates | 36 | | policy_gradient_loss | -0.0595 | | value_loss | 0.206 | ----------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 3.77e+03 | | time/ | | | fps | 19604 | | iterations | 8 | | time_elapsed | 3 | | total_timesteps | 65536 | | train/ | | | approx_kl | 0.0100190025 | | clip_fraction | 0.063 | | clip_range | 0.2 | | entropy_loss | -0.505 | | explained_variance | -0.00967 | | learning_rate | 0.0003 | | loss | 0.0199 | | n_updates | 42 | | policy_gradient_loss | -0.0452 | | value_loss | 0.177 | ------------------------------------------ ---------------------------------------- | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 3.77e+03 | | time/ | | | fps | 19563 | | iterations | 9 | | time_elapsed | 3 | | total_timesteps | 73728 | | train/ | | | approx_kl | 0.00595114 | | clip_fraction | 0.0443 | | clip_range | 0.2 | | entropy_loss | -0.373 | | explained_variance | -0.011 | | learning_rate | 0.0003 | | loss | 0.0283 | | n_updates | 48 | | policy_gradient_loss | -0.0319 | | value_loss | 0.155 | ---------------------------------------- ----------------------------------------- | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 3.77e+03 | | time/ | | | fps | 19536 | | iterations | 10 | | time_elapsed | 4 | | total_timesteps | 81920 | | train/ | | | approx_kl | 0.003399176 | | clip_fraction | 0.0231 | | clip_range | 0.2 | | entropy_loss | -0.28 | | explained_variance | -0.000722 | | learning_rate | 0.0003 | | loss | 0.0309 | | n_updates | 54 | | policy_gradient_loss | -0.0219 | | value_loss | 0.132 | ----------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 3.77e+03 | | time/ | | | fps | 19509 | | iterations | 
11 | | time_elapsed | 4 | | total_timesteps | 90112 | | train/ | | | approx_kl | 0.0021622744 | | clip_fraction | 0.0151 | | clip_range | 0.2 | | entropy_loss | -0.21 | | explained_variance | -0.00152 | | learning_rate | 0.0003 | | loss | 0.0348 | | n_updates | 60 | | policy_gradient_loss | -0.0149 | | value_loss | 0.116 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 3.77e+03 | | time/ | | | fps | 19476 | | iterations | 12 | | time_elapsed | 5 | | total_timesteps | 98304 | | train/ | | | approx_kl | 0.0015323983 | | clip_fraction | 0.0111 | | clip_range | 0.2 | | entropy_loss | -0.157 | | explained_variance | 0.00773 | | learning_rate | 0.0003 | | loss | 0.0349 | | n_updates | 66 | | policy_gradient_loss | -0.0111 | | value_loss | 0.111 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 3.77e+03 | | time/ | | | fps | 19417 | | iterations | 13 | | time_elapsed | 5 | | total_timesteps | 106496 | | train/ | | | approx_kl | 0.0011562344 | | clip_fraction | 0.00865 | | clip_range | 0.2 | | entropy_loss | -0.118 | | explained_variance | 0.00527 | | learning_rate | 0.0003 | | loss | 0.0403 | | n_updates | 72 | | policy_gradient_loss | -0.00895 | | value_loss | 0.104 | ------------------------------------------ ------------------------------------------- | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 3.77e+03 | | time/ | | | fps | 19410 | | iterations | 14 | | time_elapsed | 5 | | total_timesteps | 114688 | | train/ | | | approx_kl | 0.00061776116 | | clip_fraction | 0.00409 | | clip_range | 0.2 | | entropy_loss | -0.0891 | | explained_variance | 0.0157 | | learning_rate | 0.0003 | | loss | 0.0407 | | n_updates | 78 | | policy_gradient_loss | -0.00559 | | value_loss | 0.0977 | ------------------------------------------- ------------------------------------------- | 
rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 4.99e+03 | | time/ | | | fps | 19363 | | iterations | 15 | | time_elapsed | 6 | | total_timesteps | 122880 | | train/ | | | approx_kl | 0.00034573482 | | clip_fraction | 0.00216 | | clip_range | 0.2 | | entropy_loss | -0.0691 | | explained_variance | 0.0255 | | learning_rate | 0.0003 | | loss | 0.0373 | | n_updates | 84 | | policy_gradient_loss | -0.00397 | | value_loss | 0.0892 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 4.99e+03 | | time/ | | | fps | 19358 | | iterations | 16 | | time_elapsed | 6 | | total_timesteps | 131072 | | train/ | | | approx_kl | 0.00022995682 | | clip_fraction | 0.000651 | | clip_range | 0.2 | | entropy_loss | -0.0551 | | explained_variance | 0.0286 | | learning_rate | 0.0003 | | loss | 0.0381 | | n_updates | 90 | | policy_gradient_loss | -0.00296 | | value_loss | 0.0921 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 4.99e+03 | | time/ | | | fps | 19349 | | iterations | 17 | | time_elapsed | 7 | | total_timesteps | 139264 | | train/ | | | approx_kl | 0.00013406997 | | clip_fraction | 4.07e-05 | | clip_range | 0.2 | | entropy_loss | -0.0448 | | explained_variance | 0.0343 | | learning_rate | 0.0003 | | loss | 0.0414 | | n_updates | 96 | | policy_gradient_loss | -0.00203 | | value_loss | 0.0884 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 4.99e+03 | | time/ | | | fps | 19338 | | iterations | 18 | | time_elapsed | 7 | | total_timesteps | 147456 | | train/ | | | approx_kl | 7.4595744e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0374 | | explained_variance | 0.0324 | | learning_rate | 0.0003 | | loss | 0.0441 | | n_updates | 102 | | policy_gradient_loss | 
-0.00135 | | value_loss | 0.0879 | ------------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 4.99e+03 | | time/ | | | fps | 19309 | | iterations | 19 | | time_elapsed | 8 | | total_timesteps | 155648 | | train/ | | | approx_kl | 3.875704e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0321 | | explained_variance | 0.0401 | | learning_rate | 0.0003 | | loss | 0.0416 | | n_updates | 108 | | policy_gradient_loss | -0.000878 | | value_loss | 0.0866 | ------------------------------------------ ------------------------------------------- | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 4.99e+03 | | time/ | | | fps | 19289 | | iterations | 20 | | time_elapsed | 8 | | total_timesteps | 163840 | | train/ | | | approx_kl | 3.0018899e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0282 | | explained_variance | 0.0337 | | learning_rate | 0.0003 | | loss | 0.0436 | | n_updates | 114 | | policy_gradient_loss | -0.000737 | | value_loss | 0.0887 | ------------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 4.99e+03 | | time/ | | | fps | 19282 | | iterations | 21 | | time_elapsed | 8 | | total_timesteps | 172032 | | train/ | | | approx_kl | 3.455513e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.025 | | explained_variance | 0.0426 | | learning_rate | 0.0003 | | loss | 0.0435 | | n_updates | 120 | | policy_gradient_loss | -0.000847 | | value_loss | 0.0836 | ------------------------------------------ ------------------------------------------- | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 5.48e+03 | | time/ | | | fps | 19281 | | iterations | 22 | | time_elapsed | 9 | | total_timesteps | 180224 | | train/ | | | approx_kl | 1.8315484e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0221 | | explained_variance | 
0.042 | | learning_rate | 0.0003 | | loss | 0.0419 | | n_updates | 126 | | policy_gradient_loss | -0.000518 | | value_loss | 0.0823 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 5.48e+03 | | time/ | | | fps | 19281 | | iterations | 23 | | time_elapsed | 9 | | total_timesteps | 188416 | | train/ | | | approx_kl | 1.2273245e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.02 | | explained_variance | 0.0409 | | learning_rate | 0.0003 | | loss | 0.0398 | | n_updates | 132 | | policy_gradient_loss | -0.000405 | | value_loss | 0.0833 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 5.48e+03 | | time/ | | | fps | 19252 | | iterations | 24 | | time_elapsed | 10 | | total_timesteps | 196608 | | train/ | | | approx_kl | 1.0170828e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0182 | | explained_variance | 0.045 | | learning_rate | 0.0003 | | loss | 0.0394 | | n_updates | 138 | | policy_gradient_loss | -0.000344 | | value_loss | 0.0855 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 5.48e+03 | | time/ | | | fps | 19232 | | iterations | 25 | | time_elapsed | 10 | | total_timesteps | 204800 | | train/ | | | approx_kl | 5.8873047e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0168 | | explained_variance | 0.0435 | | learning_rate | 0.0003 | | loss | 0.0421 | | n_updates | 144 | | policy_gradient_loss | -0.000232 | | value_loss | 0.0819 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 5.48e+03 | | time/ | | | fps | 19224 | | iterations | 26 | | time_elapsed | 11 | | total_timesteps | 212992 | | train/ | | | approx_kl | 
6.1442624e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0155 | | explained_variance | 0.0425 | | learning_rate | 0.0003 | | loss | 0.0445 | | n_updates | 150 | | policy_gradient_loss | -0.000263 | | value_loss | 0.0854 | ------------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 5.48e+03 | | time/ | | | fps | 19203 | | iterations | 27 | | time_elapsed | 11 | | total_timesteps | 221184 | | train/ | | | approx_kl | 6.828108e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0144 | | explained_variance | 0.0426 | | learning_rate | 0.0003 | | loss | 0.0412 | | n_updates | 156 | | policy_gradient_loss | -0.000282 | | value_loss | 0.0846 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 5.48e+03 | | time/ | | | fps | 19174 | | iterations | 28 | | time_elapsed | 11 | | total_timesteps | 229376 | | train/ | | | approx_kl | 3.765097e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0133 | | explained_variance | 0.0581 | | learning_rate | 0.0003 | | loss | 0.0402 | | n_updates | 162 | | policy_gradient_loss | -0.000177 | | value_loss | 0.0812 | ------------------------------------------ ------------------------------------------- | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 5.73e+03 | | time/ | | | fps | 19178 | | iterations | 29 | | time_elapsed | 12 | | total_timesteps | 237568 | | train/ | | | approx_kl | 2.3752364e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0125 | | explained_variance | 0.053 | | learning_rate | 0.0003 | | loss | 0.0448 | | n_updates | 168 | | policy_gradient_loss | -0.00011 | | value_loss | 0.0869 | ------------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 5.73e+03 | | time/ | | | fps | 19184 
| | iterations | 30 | | time_elapsed | 12 | | total_timesteps | 245760 | | train/ | | | approx_kl | 2.428118e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0118 | | explained_variance | 0.0523 | | learning_rate | 0.0003 | | loss | 0.0394 | | n_updates | 174 | | policy_gradient_loss | -0.000117 | | value_loss | 0.0803 | ------------------------------------------ ------------------------------------------- | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 5.73e+03 | | time/ | | | fps | 19175 | | iterations | 31 | | time_elapsed | 13 | | total_timesteps | 253952 | | train/ | | | approx_kl | 2.7328788e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0112 | | explained_variance | 0.0536 | | learning_rate | 0.0003 | | loss | 0.0407 | | n_updates | 180 | | policy_gradient_loss | -0.000136 | | value_loss | 0.0809 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 5.73e+03 | | time/ | | | fps | 19180 | | iterations | 32 | | time_elapsed | 13 | | total_timesteps | 262144 | | train/ | | | approx_kl | 1.6658814e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0106 | | explained_variance | 0.0655 | | learning_rate | 0.0003 | | loss | 0.0403 | | n_updates | 186 | | policy_gradient_loss | -8.95e-05 | | value_loss | 0.0834 | ------------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 5.73e+03 | | time/ | | | fps | 19174 | | iterations | 33 | | time_elapsed | 14 | | total_timesteps | 270336 | | train/ | | | approx_kl | 1.401233e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0101 | | explained_variance | 0.0625 | | learning_rate | 0.0003 | | loss | 0.0404 | | n_updates | 192 | | policy_gradient_loss | -7.73e-05 | | value_loss | 0.0814 | ------------------------------------------ 
------------------------------------------- | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 5.73e+03 | | time/ | | | fps | 19160 | | iterations | 34 | | time_elapsed | 14 | | total_timesteps | 278528 | | train/ | | | approx_kl | 1.1300799e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.00964 | | explained_variance | 0.0737 | | learning_rate | 0.0003 | | loss | 0.041 | | n_updates | 198 | | policy_gradient_loss | -7.43e-05 | | value_loss | 0.0832 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 5.73e+03 | | time/ | | | fps | 19165 | | iterations | 35 | | time_elapsed | 14 | | total_timesteps | 286720 | | train/ | | | approx_kl | 2.2381573e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.00921 | | explained_variance | 0.0745 | | learning_rate | 0.0003 | | loss | 0.0399 | | n_updates | 204 | | policy_gradient_loss | -0.000105 | | value_loss | 0.0801 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 5.88e+03 | | time/ | | | fps | 19170 | | iterations | 36 | | time_elapsed | 15 | | total_timesteps | 294912 | | train/ | | | approx_kl | 5.2683754e-07 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.00877 | | explained_variance | 0.0777 | | learning_rate | 0.0003 | | loss | 0.0413 | | n_updates | 210 | | policy_gradient_loss | -4.58e-05 | | value_loss | 0.08 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.18e+03 | | ep_rew_mean | 5.88e+03 | | time/ | | | fps | 19170 | | iterations | 37 | | time_elapsed | 15 | | total_timesteps | 303104 | | train/ | | | approx_kl | 3.3327524e-07 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.00845 | | explained_variance | 0.0866 | | learning_rate | 0.0003 | | loss | 0.0401 | | n_updates | 
216 | | policy_gradient_loss | -3.14e-05 | | value_loss | 0.0824 | -------------------------------------------
[SAVED] bots/models/PPO_Trailing_Stop_Loss/ppo_stop_loss_selector_rl_stop_loss_training_LE=F.zip
[TRAIN] SI=F — rows: 7304, actions: [1.0, 2.0, 3.0, 4.0], envs: 8
Using cpu device
------------------------------ | time/ | | | fps | 22010 | | iterations | 1 | | time_elapsed | 0 | | total_timesteps | 8192 | ------------------------------ ----------------------------------------- | time/ | | | fps | 20364 | | iterations | 2 | | time_elapsed | 0 | | total_timesteps | 16384 | | train/ | | | approx_kl | 0.011680173 | | clip_fraction | 0.0786 | | clip_range | 0.2 | | entropy_loss | -1.38 | | explained_variance | -0.0463 | | learning_rate | 0.0003 | | loss | 0.0281 | | n_updates | 6 | | policy_gradient_loss | -0.0604 | | value_loss | 0.36 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 19837 | | iterations | 3 | | time_elapsed | 1 | | total_timesteps | 24576 | | train/ | | | approx_kl | 0.015800383 | | clip_fraction | 0.0911 | | clip_range | 0.2 | | entropy_loss | -1.35 | | explained_variance | -0.0382 | | learning_rate | 0.0003 | | loss | -0.0147 | | n_updates | 12 | | policy_gradient_loss | -0.0724 | | value_loss | 0.267 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 19687 | | iterations | 4 | | time_elapsed | 1 | | total_timesteps | 32768 | | train/ | | | approx_kl | 0.022618053 | | clip_fraction | 0.125 | | clip_range | 0.2 | | entropy_loss | -1.26 | | explained_variance | -0.0249 | | learning_rate | 0.0003 | | loss | -0.0365 | | n_updates | 18 | | policy_gradient_loss | -0.089 | | value_loss | 0.246 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 19528 | | iterations | 5 | | time_elapsed | 2 | | total_timesteps | 40960 | | train/ | | | approx_kl | 0.023785941 | | clip_fraction | 0.117 | |
clip_range | 0.2 | | entropy_loss | -1.11 | | explained_variance | -0.0076 | | learning_rate | 0.0003 | | loss | -0.0391 | | n_updates | 24 | | policy_gradient_loss | -0.0906 | | value_loss | 0.253 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 19537 | | iterations | 6 | | time_elapsed | 2 | | total_timesteps | 49152 | | train/ | | | approx_kl | 0.020738259 | | clip_fraction | 0.106 | | clip_range | 0.2 | | entropy_loss | -0.908 | | explained_variance | -0.000927 | | learning_rate | 0.0003 | | loss | -0.0108 | | n_updates | 30 | | policy_gradient_loss | -0.0767 | | value_loss | 0.244 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 19522 | | iterations | 7 | | time_elapsed | 2 | | total_timesteps | 57344 | | train/ | | | approx_kl | 0.016834186 | | clip_fraction | 0.0917 | | clip_range | 0.2 | | entropy_loss | -0.702 | | explained_variance | 0.00275 | | learning_rate | 0.0003 | | loss | 0.00941 | | n_updates | 36 | | policy_gradient_loss | -0.0602 | | value_loss | 0.216 | ----------------------------------------- ----------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 3.7e+03 | | time/ | | | fps | 19522 | | iterations | 8 | | time_elapsed | 3 | | total_timesteps | 65536 | | train/ | | | approx_kl | 0.011759612 | | clip_fraction | 0.0735 | | clip_range | 0.2 | | entropy_loss | -0.512 | | explained_variance | 0.00483 | | learning_rate | 0.0003 | | loss | 0.0281 | | n_updates | 42 | | policy_gradient_loss | -0.0447 | | value_loss | 0.186 | ----------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 3.7e+03 | | time/ | | | fps | 19527 | | iterations | 9 | | time_elapsed | 3 | | total_timesteps | 73728 | | train/ | | | approx_kl | 0.0068010013 | | clip_fraction | 0.0446 | | clip_range | 0.2 | | entropy_loss | -0.371 | | 
explained_variance | 0.0134 | | learning_rate | 0.0003 | | loss | 0.036 | | n_updates | 48 | | policy_gradient_loss | -0.031 | | value_loss | 0.161 | ------------------------------------------ ----------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 3.7e+03 | | time/ | | | fps | 19530 | | iterations | 10 | | time_elapsed | 4 | | total_timesteps | 81920 | | train/ | | | approx_kl | 0.003551888 | | clip_fraction | 0.0264 | | clip_range | 0.2 | | entropy_loss | -0.274 | | explained_variance | 0.0225 | | learning_rate | 0.0003 | | loss | 0.0381 | | n_updates | 54 | | policy_gradient_loss | -0.0201 | | value_loss | 0.135 | ----------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 3.7e+03 | | time/ | | | fps | 19511 | | iterations | 11 | | time_elapsed | 4 | | total_timesteps | 90112 | | train/ | | | approx_kl | 0.0022966224 | | clip_fraction | 0.016 | | clip_range | 0.2 | | entropy_loss | -0.203 | | explained_variance | 0.0262 | | learning_rate | 0.0003 | | loss | 0.042 | | n_updates | 60 | | policy_gradient_loss | -0.0134 | | value_loss | 0.125 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 3.7e+03 | | time/ | | | fps | 19493 | | iterations | 12 | | time_elapsed | 5 | | total_timesteps | 98304 | | train/ | | | approx_kl | 0.0016589619 | | clip_fraction | 0.012 | | clip_range | 0.2 | | entropy_loss | -0.149 | | explained_variance | 0.0423 | | learning_rate | 0.0003 | | loss | 0.0393 | | n_updates | 66 | | policy_gradient_loss | -0.0106 | | value_loss | 0.112 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 3.7e+03 | | time/ | | | fps | 19460 | | iterations | 13 | | time_elapsed | 5 | | total_timesteps | 106496 | | train/ | | | approx_kl | 0.0012054963 | | 
clip_fraction | 0.0085 | | clip_range | 0.2 | | entropy_loss | -0.11 | | explained_variance | 0.051 | | learning_rate | 0.0003 | | loss | 0.0439 | | n_updates | 72 | | policy_gradient_loss | -0.00789 | | value_loss | 0.112 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 3.7e+03 | | time/ | | | fps | 19417 | | iterations | 14 | | time_elapsed | 5 | | total_timesteps | 114688 | | train/ | | | approx_kl | 0.0005791058 | | clip_fraction | 0.00403 | | clip_range | 0.2 | | entropy_loss | -0.082 | | explained_variance | 0.0593 | | learning_rate | 0.0003 | | loss | 0.0454 | | n_updates | 78 | | policy_gradient_loss | -0.00492 | | value_loss | 0.106 | ------------------------------------------ ----------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 4.96e+03 | | time/ | | | fps | 19404 | | iterations | 15 | | time_elapsed | 6 | | total_timesteps | 122880 | | train/ | | | approx_kl | 0.000311509 | | clip_fraction | 0.00161 | | clip_range | 0.2 | | entropy_loss | -0.0642 | | explained_variance | 0.073 | | learning_rate | 0.0003 | | loss | 0.0437 | | n_updates | 84 | | policy_gradient_loss | -0.00317 | | value_loss | 0.101 | ----------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 4.96e+03 | | time/ | | | fps | 19354 | | iterations | 16 | | time_elapsed | 6 | | total_timesteps | 131072 | | train/ | | | approx_kl | 0.00018201688 | | clip_fraction | 0.000448 | | clip_range | 0.2 | | entropy_loss | -0.0518 | | explained_variance | 0.0933 | | learning_rate | 0.0003 | | loss | 0.043 | | n_updates | 90 | | policy_gradient_loss | -0.00229 | | value_loss | 0.098 | ------------------------------------------- -------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 4.96e+03 | | time/ | | | fps | 19320 | | iterations | 17 
| | time_elapsed | 7 | | total_timesteps | 139264 | | train/ | | | approx_kl | 0.000119664204 | | clip_fraction | 4.07e-05 | | clip_range | 0.2 | | entropy_loss | -0.0426 | | explained_variance | 0.107 | | learning_rate | 0.0003 | | loss | 0.0462 | | n_updates | 96 | | policy_gradient_loss | -0.00171 | | value_loss | 0.0961 | -------------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 4.96e+03 | | time/ | | | fps | 19306 | | iterations | 18 | | time_elapsed | 7 | | total_timesteps | 147456 | | train/ | | | approx_kl | 6.811674e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0357 | | explained_variance | 0.112 | | learning_rate | 0.0003 | | loss | 0.0456 | | n_updates | 102 | | policy_gradient_loss | -0.00116 | | value_loss | 0.0944 | ------------------------------------------ ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 4.96e+03 | | time/ | | | fps | 19298 | | iterations | 19 | | time_elapsed | 8 | | total_timesteps | 155648 | | train/ | | | approx_kl | 3.3057244e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0307 | | explained_variance | 0.106 | | learning_rate | 0.0003 | | loss | 0.0441 | | n_updates | 108 | | policy_gradient_loss | -0.00072 | | value_loss | 0.0911 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 4.96e+03 | | time/ | | | fps | 19295 | | iterations | 20 | | time_elapsed | 8 | | total_timesteps | 163840 | | train/ | | | approx_kl | 2.7266338e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0272 | | explained_variance | 0.109 | | learning_rate | 0.0003 | | loss | 0.046 | | n_updates | 114 | | policy_gradient_loss | -0.000644 | | value_loss | 0.094 | ------------------------------------------- ------------------------------------------- | rollout/ | | | 
ep_len_mean | 7.3e+03 | | ep_rew_mean | 4.96e+03 | | time/ | | | fps | 19296 | | iterations | 21 | | time_elapsed | 8 | | total_timesteps | 172032 | | train/ | | | approx_kl | 3.0745177e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0242 | | explained_variance | 0.128 | | learning_rate | 0.0003 | | loss | 0.047 | | n_updates | 120 | | policy_gradient_loss | -0.000681 | | value_loss | 0.0954 | ------------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.45e+03 | | time/ | | | fps | 19300 | | iterations | 22 | | time_elapsed | 9 | | total_timesteps | 180224 | | train/ | | | approx_kl | 1.634036e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0215 | | explained_variance | 0.114 | | learning_rate | 0.0003 | | loss | 0.0436 | | n_updates | 126 | | policy_gradient_loss | -0.000441 | | value_loss | 0.0902 | ------------------------------------------ ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.45e+03 | | time/ | | | fps | 19292 | | iterations | 23 | | time_elapsed | 9 | | total_timesteps | 188416 | | train/ | | | approx_kl | 1.0737844e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0195 | | explained_variance | 0.114 | | learning_rate | 0.0003 | | loss | 0.0453 | | n_updates | 132 | | policy_gradient_loss | -0.00035 | | value_loss | 0.0894 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.45e+03 | | time/ | | | fps | 19289 | | iterations | 24 | | time_elapsed | 10 | | total_timesteps | 196608 | | train/ | | | approx_kl | 1.4108438e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0177 | | explained_variance | 0.123 | | learning_rate | 0.0003 | | loss | 0.0442 | | n_updates | 138 | | policy_gradient_loss | -0.00039 | | value_loss | 0.0924 | 
------------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.45e+03 | | time/ | | | fps | 19292 | | iterations | 25 | | time_elapsed | 10 | | total_timesteps | 204800 | | train/ | | | approx_kl | 5.295122e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0162 | | explained_variance | 0.125 | | learning_rate | 0.0003 | | loss | 0.0405 | | n_updates | 144 | | policy_gradient_loss | -0.00021 | | value_loss | 0.0872 | ------------------------------------------ ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.45e+03 | | time/ | | | fps | 19272 | | iterations | 26 | | time_elapsed | 11 | | total_timesteps | 212992 | | train/ | | | approx_kl | 6.7665824e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.015 | | explained_variance | 0.136 | | learning_rate | 0.0003 | | loss | 0.0436 | | n_updates | 150 | | policy_gradient_loss | -0.000253 | | value_loss | 0.0915 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.45e+03 | | time/ | | | fps | 19272 | | iterations | 27 | | time_elapsed | 11 | | total_timesteps | 221184 | | train/ | | | approx_kl | 4.8166985e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0139 | | explained_variance | 0.13 | | learning_rate | 0.0003 | | loss | 0.0457 | | n_updates | 156 | | policy_gradient_loss | -0.000206 | | value_loss | 0.0892 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.45e+03 | | time/ | | | fps | 19276 | | iterations | 28 | | time_elapsed | 11 | | total_timesteps | 229376 | | train/ | | | approx_kl | 2.0192601e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0129 | | explained_variance | 0.128 | | learning_rate | 0.0003 | | 
loss | 0.0446 | | n_updates | 162 | | policy_gradient_loss | -8.96e-05 | | value_loss | 0.0886 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.7e+03 | | time/ | | | fps | 19274 | | iterations | 29 | | time_elapsed | 12 | | total_timesteps | 237568 | | train/ | | | approx_kl | 2.0791122e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0123 | | explained_variance | 0.14 | | learning_rate | 0.0003 | | loss | 0.0446 | | n_updates | 168 | | policy_gradient_loss | -0.000107 | | value_loss | 0.0895 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.7e+03 | | time/ | | | fps | 19279 | | iterations | 30 | | time_elapsed | 12 | | total_timesteps | 245760 | | train/ | | | approx_kl | 3.0586962e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0116 | | explained_variance | 0.14 | | learning_rate | 0.0003 | | loss | 0.0453 | | n_updates | 174 | | policy_gradient_loss | -0.000145 | | value_loss | 0.0917 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.7e+03 | | time/ | | | fps | 19280 | | iterations | 31 | | time_elapsed | 13 | | total_timesteps | 253952 | | train/ | | | approx_kl | 3.0131196e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0109 | | explained_variance | 0.143 | | learning_rate | 0.0003 | | loss | 0.044 | | n_updates | 180 | | policy_gradient_loss | -0.000143 | | value_loss | 0.09 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.7e+03 | | time/ | | | fps | 19283 | | iterations | 32 | | time_elapsed | 13 | | total_timesteps | 262144 | | train/ | | | approx_kl | 1.4135949e-06 | | clip_fraction | 0 | | clip_range | 
0.2 | | entropy_loss | -0.0103 | | explained_variance | 0.142 | | learning_rate | 0.0003 | | loss | 0.0434 | | n_updates | 186 | | policy_gradient_loss | -7.14e-05 | | value_loss | 0.087 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.7e+03 | | time/ | | | fps | 19281 | | iterations | 33 | | time_elapsed | 14 | | total_timesteps | 270336 | | train/ | | | approx_kl | 1.2583987e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0098 | | explained_variance | 0.14 | | learning_rate | 0.0003 | | loss | 0.0424 | | n_updates | 192 | | policy_gradient_loss | -5.73e-05 | | value_loss | 0.0863 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.7e+03 | | time/ | | | fps | 19266 | | iterations | 34 | | time_elapsed | 14 | | total_timesteps | 278528 | | train/ | | | approx_kl | 1.2177843e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.00939 | | explained_variance | 0.146 | | learning_rate | 0.0003 | | loss | 0.0462 | | n_updates | 198 | | policy_gradient_loss | -7.72e-05 | | value_loss | 0.0898 | ------------------------------------------- ----------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.7e+03 | | time/ | | | fps | 19252 | | iterations | 35 | | time_elapsed | 14 | | total_timesteps | 286720 | | train/ | | | approx_kl | 2.76247e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.00893 | | explained_variance | 0.148 | | learning_rate | 0.0003 | | loss | 0.0404 | | n_updates | 204 | | policy_gradient_loss | -0.000133 | | value_loss | 0.0834 | ----------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.85e+03 | | time/ | | | fps | 19227 | | iterations | 36 | | time_elapsed | 15 | | total_timesteps 
| 294912 | | train/ | | | approx_kl | 6.2915205e-07 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.00845 | | explained_variance | 0.146 | | learning_rate | 0.0003 | | loss | 0.0459 | | n_updates | 210 | | policy_gradient_loss | -4.4e-05 | | value_loss | 0.0906 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.3e+03 | | ep_rew_mean | 5.85e+03 | | time/ | | | fps | 19220 | | iterations | 37 | | time_elapsed | 15 | | total_timesteps | 303104 | | train/ | | | approx_kl | 2.8912473e-07 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.00813 | | explained_variance | 0.145 | | learning_rate | 0.0003 | | loss | 0.0442 | | n_updates | 216 | | policy_gradient_loss | -2.23e-05 | | value_loss | 0.0869 | -------------------------------------------
[SAVED] bots/models/PPO_Trailing_Stop_Loss/ppo_stop_loss_selector_rl_stop_loss_training_SI=F.zip
[TRAIN] ZB=F — rows: 7318, actions: [1.0, 2.0, 3.0, 4.0], envs: 8
Using cpu device
------------------------------ | time/ | | | fps | 22040 | | iterations | 1 | | time_elapsed | 0 | | total_timesteps | 8192 | ------------------------------ ---------------------------------------- | time/ | | | fps | 20451 | | iterations | 2 | | time_elapsed | 0 | | total_timesteps | 16384 | | train/ | | | approx_kl | 0.01522733 | | clip_fraction | 0.104 | | clip_range | 0.2 | | entropy_loss | -1.38 | | explained_variance | -0.0761 | | learning_rate | 0.0003 | | loss | 0.00755 | | n_updates | 6 | | policy_gradient_loss | -0.0634 | | value_loss | 0.279 | ---------------------------------------- ----------------------------------------- | time/ | | | fps | 20131 | | iterations | 3 | | time_elapsed | 1 | | total_timesteps | 24576 | | train/ | | | approx_kl | 0.020271031 | | clip_fraction | 0.115 | | clip_range | 0.2 | | entropy_loss | -1.34 | | explained_variance | -0.0343 | | learning_rate | 0.0003 | | loss | -0.0227 | | n_updates | 12 | |
policy_gradient_loss | -0.0778 | | value_loss | 0.239 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 19954 | | iterations | 4 | | time_elapsed | 1 | | total_timesteps | 32768 | | train/ | | | approx_kl | 0.025232496 | | clip_fraction | 0.161 | | clip_range | 0.2 | | entropy_loss | -1.23 | | explained_variance | -0.028 | | learning_rate | 0.0003 | | loss | -0.0329 | | n_updates | 18 | | policy_gradient_loss | -0.0899 | | value_loss | 0.248 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 19840 | | iterations | 5 | | time_elapsed | 2 | | total_timesteps | 40960 | | train/ | | | approx_kl | 0.019272462 | | clip_fraction | 0.0698 | | clip_range | 0.2 | | entropy_loss | -1.07 | | explained_variance | -0.0213 | | learning_rate | 0.0003 | | loss | -0.0195 | | n_updates | 24 | | policy_gradient_loss | -0.0776 | | value_loss | 0.257 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 19747 | | iterations | 6 | | time_elapsed | 2 | | total_timesteps | 49152 | | train/ | | | approx_kl | 0.018699085 | | clip_fraction | 0.102 | | clip_range | 0.2 | | entropy_loss | -0.873 | | explained_variance | -0.0155 | | learning_rate | 0.0003 | | loss | 0.00927 | | n_updates | 30 | | policy_gradient_loss | -0.0679 | | value_loss | 0.247 | ----------------------------------------- ------------------------------------------ | time/ | | | fps | 19697 | | iterations | 7 | | time_elapsed | 2 | | total_timesteps | 57344 | | train/ | | | approx_kl | 0.0145712085 | | clip_fraction | 0.0891 | | clip_range | 0.2 | | entropy_loss | -0.674 | | explained_variance | -0.0166 | | learning_rate | 0.0003 | | loss | 0.0243 | | n_updates | 36 | | policy_gradient_loss | -0.0529 | | value_loss | 0.22 | ------------------------------------------ ----------------------------------------- | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 
3.75e+03 | | time/ | | | fps | 19637 | | iterations | 8 | | time_elapsed | 3 | | total_timesteps | 65536 | | train/ | | | approx_kl | 0.009770615 | | clip_fraction | 0.0624 | | clip_range | 0.2 | | entropy_loss | -0.501 | | explained_variance | -0.0154 | | learning_rate | 0.0003 | | loss | 0.0378 | | n_updates | 42 | | policy_gradient_loss | -0.0391 | | value_loss | 0.195 | ----------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 3.75e+03 | | time/ | | | fps | 19597 | | iterations | 9 | | time_elapsed | 3 | | total_timesteps | 73728 | | train/ | | | approx_kl | 0.0058736634 | | clip_fraction | 0.042 | | clip_range | 0.2 | | entropy_loss | -0.372 | | explained_variance | -0.0152 | | learning_rate | 0.0003 | | loss | 0.0502 | | n_updates | 48 | | policy_gradient_loss | -0.027 | | value_loss | 0.184 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 3.75e+03 | | time/ | | | fps | 19567 | | iterations | 10 | | time_elapsed | 4 | | total_timesteps | 81920 | | train/ | | | approx_kl | 0.0033011725 | | clip_fraction | 0.0225 | | clip_range | 0.2 | | entropy_loss | -0.279 | | explained_variance | -0.0114 | | learning_rate | 0.0003 | | loss | 0.0524 | | n_updates | 54 | | policy_gradient_loss | -0.0183 | | value_loss | 0.159 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 3.75e+03 | | time/ | | | fps | 19544 | | iterations | 11 | | time_elapsed | 4 | | total_timesteps | 90112 | | train/ | | | approx_kl | 0.0021294355 | | clip_fraction | 0.0149 | | clip_range | 0.2 | | entropy_loss | -0.209 | | explained_variance | -0.0113 | | learning_rate | 0.0003 | | loss | 0.056 | | n_updates | 60 | | policy_gradient_loss | -0.012 | | value_loss | 0.149 | ------------------------------------------ 
------------------------------------------ | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 3.75e+03 | | time/ | | | fps | 19514 | | iterations | 12 | | time_elapsed | 5 | | total_timesteps | 98304 | | train/ | | | approx_kl | 0.0015405614 | | clip_fraction | 0.0111 | | clip_range | 0.2 | | entropy_loss | -0.157 | | explained_variance | -0.0077 | | learning_rate | 0.0003 | | loss | 0.0574 | | n_updates | 66 | | policy_gradient_loss | -0.00892 | | value_loss | 0.142 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 3.75e+03 | | time/ | | | fps | 19509 | | iterations | 13 | | time_elapsed | 5 | | total_timesteps | 106496 | | train/ | | | approx_kl | 0.0011109535 | | clip_fraction | 0.00863 | | clip_range | 0.2 | | entropy_loss | -0.118 | | explained_variance | -0.00729 | | learning_rate | 0.0003 | | loss | 0.0545 | | n_updates | 72 | | policy_gradient_loss | -0.00722 | | value_loss | 0.136 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 3.75e+03 | | time/ | | | fps | 19473 | | iterations | 14 | | time_elapsed | 5 | | total_timesteps | 114688 | | train/ | | | approx_kl | 0.0004946777 | | clip_fraction | 0.00315 | | clip_range | 0.2 | | entropy_loss | -0.0904 | | explained_variance | -0.00242 | | learning_rate | 0.0003 | | loss | 0.0573 | | n_updates | 78 | | policy_gradient_loss | -0.00423 | | value_loss | 0.127 | ------------------------------------------ ------------------------------------------- | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 4.89e+03 | | time/ | | | fps | 19462 | | iterations | 15 | | time_elapsed | 6 | | total_timesteps | 122880 | | train/ | | | approx_kl | 0.00027982748 | | clip_fraction | 0.000712 | | clip_range | 0.2 | | entropy_loss | -0.072 | | explained_variance | -0.00556 | | learning_rate | 0.0003 | | loss | 0.0572 | | n_updates 
| 84 | | policy_gradient_loss | -0.00288 | | value_loss | 0.13 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 4.89e+03 | | time/ | | | fps | 19464 | | iterations | 16 | | time_elapsed | 6 | | total_timesteps | 131072 | | train/ | | | approx_kl | 0.00023075184 | | clip_fraction | 0.000427 | | clip_range | 0.2 | | entropy_loss | -0.0585 | | explained_variance | -0.00351 | | learning_rate | 0.0003 | | loss | 0.0624 | | n_updates | 90 | | policy_gradient_loss | -0.00251 | | value_loss | 0.131 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 4.89e+03 | | time/ | | | fps | 19468 | | iterations | 17 | | time_elapsed | 7 | | total_timesteps | 139264 | | train/ | | | approx_kl | 0.00011884098 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0482 | | explained_variance | 0.00375 | | learning_rate | 0.0003 | | loss | 0.062 | | n_updates | 96 | | policy_gradient_loss | -0.00153 | | value_loss | 0.125 | ------------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 4.89e+03 | | time/ | | | fps | 19458 | | iterations | 18 | | time_elapsed | 7 | | total_timesteps | 147456 | | train/ | | | approx_kl | 6.558176e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.041 | | explained_variance | 0.0343 | | learning_rate | 0.0003 | | loss | 0.0574 | | n_updates | 102 | | policy_gradient_loss | -0.00106 | | value_loss | 0.12 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 4.89e+03 | | time/ | | | fps | 19445 | | iterations | 19 | | time_elapsed | 8 | | total_timesteps | 155648 | | train/ | | | approx_kl | 3.759037e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | 
-0.0357 | | explained_variance | 0.0481 | | learning_rate | 0.0003 | | loss | 0.0543 | | n_updates | 108 | | policy_gradient_loss | -0.000736 | | value_loss | 0.116 | ------------------------------------------ ------------------------------------------- | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 4.89e+03 | | time/ | | | fps | 19396 | | iterations | 20 | | time_elapsed | 8 | | total_timesteps | 163840 | | train/ | | | approx_kl | 2.6363923e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0318 | | explained_variance | 0.0569 | | learning_rate | 0.0003 | | loss | 0.0592 | | n_updates | 114 | | policy_gradient_loss | -0.000576 | | value_loss | 0.116 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 4.89e+03 | | time/ | | | fps | 19395 | | iterations | 21 | | time_elapsed | 8 | | total_timesteps | 172032 | | train/ | | | approx_kl | 2.6332033e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0285 | | explained_variance | 0.0623 | | learning_rate | 0.0003 | | loss | 0.0611 | | n_updates | 120 | | policy_gradient_loss | -0.000551 | | value_loss | 0.12 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 5.34e+03 | | time/ | | | fps | 19388 | | iterations | 22 | | time_elapsed | 9 | | total_timesteps | 180224 | | train/ | | | approx_kl | 1.4799829e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0258 | | explained_variance | 0.0652 | | learning_rate | 0.0003 | | loss | 0.057 | | n_updates | 126 | | policy_gradient_loss | -0.000354 | | value_loss | 0.117 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 5.34e+03 | | time/ | | | fps | 19359 | | iterations | 23 | | time_elapsed | 9 | | total_timesteps | 188416 | | 
train/ | | | approx_kl | 1.2226003e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0236 | | explained_variance | 0.0696 | | learning_rate | 0.0003 | | loss | 0.0591 | | n_updates | 132 | | policy_gradient_loss | -0.000342 | | value_loss | 0.115 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 5.34e+03 | | time/ | | | fps | 19350 | | iterations | 24 | | time_elapsed | 10 | | total_timesteps | 196608 | | train/ | | | approx_kl | 1.0287455e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0217 | | explained_variance | 0.0693 | | learning_rate | 0.0003 | | loss | 0.0586 | | n_updates | 138 | | policy_gradient_loss | -0.00031 | | value_loss | 0.117 | ------------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 5.34e+03 | | time/ | | | fps | 19348 | | iterations | 25 | | time_elapsed | 10 | | total_timesteps | 204800 | | train/ | | | approx_kl | 5.608439e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.02 | | explained_variance | 0.0797 | | learning_rate | 0.0003 | | loss | 0.0547 | | n_updates | 144 | | policy_gradient_loss | -0.000197 | | value_loss | 0.113 | ------------------------------------------ ------------------------------------------- | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 5.34e+03 | | time/ | | | fps | 19342 | | iterations | 26 | | time_elapsed | 11 | | total_timesteps | 212992 | | train/ | | | approx_kl | 6.1146056e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0187 | | explained_variance | 0.0833 | | learning_rate | 0.0003 | | loss | 0.0564 | | n_updates | 150 | | policy_gradient_loss | -0.00021 | | value_loss | 0.114 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 5.34e+03 | | 
time/ | | | fps | 19341 | | iterations | 27 | | time_elapsed | 11 | | total_timesteps | 221184 | | train/ | | | approx_kl | 7.0261594e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0174 | | explained_variance | 0.0822 | | learning_rate | 0.0003 | | loss | 0.0565 | | n_updates | 156 | | policy_gradient_loss | -0.000242 | | value_loss | 0.112 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 5.34e+03 | | time/ | | | fps | 19342 | | iterations | 28 | | time_elapsed | 11 | | total_timesteps | 229376 | | train/ | | | approx_kl | 5.0583258e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0161 | | explained_variance | 0.09 | | learning_rate | 0.0003 | | loss | 0.056 | | n_updates | 162 | | policy_gradient_loss | -0.000163 | | value_loss | 0.114 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 5.57e+03 | | time/ | | | fps | 19343 | | iterations | 29 | | time_elapsed | 12 | | total_timesteps | 237568 | | train/ | | | approx_kl | 2.6294947e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0151 | | explained_variance | 0.091 | | learning_rate | 0.0003 | | loss | 0.0557 | | n_updates | 168 | | policy_gradient_loss | -0.00011 | | value_loss | 0.112 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 5.57e+03 | | time/ | | | fps | 19341 | | iterations | 30 | | time_elapsed | 12 | | total_timesteps | 245760 | | train/ | | | approx_kl | 3.5281264e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0143 | | explained_variance | 0.0958 | | learning_rate | 0.0003 | | loss | 0.0547 | | n_updates | 174 | | policy_gradient_loss | -0.000149 | | value_loss | 0.11 | ------------------------------------------- 
----------------------------------------- | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 5.57e+03 | | time/ | | | fps | 19342 | | iterations | 31 | | time_elapsed | 13 | | total_timesteps | 253952 | | train/ | | | approx_kl | 4.21788e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0134 | | explained_variance | 0.0994 | | learning_rate | 0.0003 | | loss | 0.053 | | n_updates | 180 | | policy_gradient_loss | -0.000152 | | value_loss | 0.109 | ----------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 5.57e+03 | | time/ | | | fps | 19339 | | iterations | 32 | | time_elapsed | 13 | | total_timesteps | 262144 | | train/ | | | approx_kl | 1.4121251e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0127 | | explained_variance | 0.105 | | learning_rate | 0.0003 | | loss | 0.0567 | | n_updates | 186 | | policy_gradient_loss | -7.27e-05 | | value_loss | 0.113 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 5.57e+03 | | time/ | | | fps | 19314 | | iterations | 33 | | time_elapsed | 13 | | total_timesteps | 270336 | | train/ | | | approx_kl | 1.9317595e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0121 | | explained_variance | 0.101 | | learning_rate | 0.0003 | | loss | 0.0521 | | n_updates | 192 | | policy_gradient_loss | -9.67e-05 | | value_loss | 0.107 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 5.57e+03 | | time/ | | | fps | 19290 | | iterations | 34 | | time_elapsed | 14 | | total_timesteps | 278528 | | train/ | | | approx_kl | 1.5125042e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0115 | | explained_variance | 0.103 | | learning_rate | 0.0003 | | loss | 0.0553 | | n_updates | 198 | | 
policy_gradient_loss | -7.99e-05 | | value_loss | 0.113 | ------------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 5.57e+03 | | time/ | | | fps | 19284 | | iterations | 35 | | time_elapsed | 14 | | total_timesteps | 286720 | | train/ | | | approx_kl | 2.827801e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0109 | | explained_variance | 0.107 | | learning_rate | 0.0003 | | loss | 0.0552 | | n_updates | 204 | | policy_gradient_loss | -0.000127 | | value_loss | 0.11 | ------------------------------------------ ------------------------------------------- | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 5.71e+03 | | time/ | | | fps | 19269 | | iterations | 36 | | time_elapsed | 15 | | total_timesteps | 294912 | | train/ | | | approx_kl | 5.6750287e-07 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0104 | | explained_variance | 0.114 | | learning_rate | 0.0003 | | loss | 0.0529 | | n_updates | 210 | | policy_gradient_loss | -3.71e-05 | | value_loss | 0.11 | ------------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.32e+03 | | ep_rew_mean | 5.71e+03 | | time/ | | | fps | 19249 | | iterations | 37 | | time_elapsed | 15 | | total_timesteps | 303104 | | train/ | | | approx_kl | 4.063986e-07 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.00999 | | explained_variance | 0.11 | | learning_rate | 0.0003 | | loss | 0.0571 | | n_updates | 216 | | policy_gradient_loss | -3.3e-05 | | value_loss | 0.111 | ------------------------------------------ [SAVED] bots/models/PPO_Trailing_Stop_Loss/ppo_stop_loss_selector_rl_stop_loss_training_ZB=F.zip [TRAIN] ZS=F — rows: 7328, actions: [1.0, 2.0, 3.0, 4.0], envs: 8 Using cpu device ------------------------------ | time/ | | | fps | 21074 | | iterations | 1 | | time_elapsed | 0 | | total_timesteps | 8192 | 
------------------------------ ----------------------------------------- | time/ | | | fps | 19969 | | iterations | 2 | | time_elapsed | 0 | | total_timesteps | 16384 | | train/ | | | approx_kl | 0.014207177 | | clip_fraction | 0.104 | | clip_range | 0.2 | | entropy_loss | -1.38 | | explained_variance | -0.129 | | learning_rate | 0.0003 | | loss | 0.231 | | n_updates | 6 | | policy_gradient_loss | -0.068 | | value_loss | 0.911 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 19481 | | iterations | 3 | | time_elapsed | 1 | | total_timesteps | 24576 | | train/ | | | approx_kl | 0.014300158 | | clip_fraction | 0.0844 | | clip_range | 0.2 | | entropy_loss | -1.35 | | explained_variance | -0.375 | | learning_rate | 0.0003 | | loss | 0.0897 | | n_updates | 12 | | policy_gradient_loss | -0.0641 | | value_loss | 0.576 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 19367 | | iterations | 4 | | time_elapsed | 1 | | total_timesteps | 32768 | | train/ | | | approx_kl | 0.023877975 | | clip_fraction | 0.127 | | clip_range | 0.2 | | entropy_loss | -1.26 | | explained_variance | -0.392 | | learning_rate | 0.0003 | | loss | 0.00698 | | n_updates | 18 | | policy_gradient_loss | -0.0783 | | value_loss | 0.362 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 19353 | | iterations | 5 | | time_elapsed | 2 | | total_timesteps | 40960 | | train/ | | | approx_kl | 0.034484014 | | clip_fraction | 0.244 | | clip_range | 0.2 | | entropy_loss | -1.08 | | explained_variance | -0.212 | | learning_rate | 0.0003 | | loss | -0.0241 | | n_updates | 24 | | policy_gradient_loss | -0.0971 | | value_loss | 0.294 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 19324 | | iterations | 6 | | time_elapsed | 2 | | total_timesteps | 49152 | | train/ | | | approx_kl | 0.026748445 | | 
clip_fraction | 0.126 | | clip_range | 0.2 | | entropy_loss | -0.844 | | explained_variance | -0.0848 | | learning_rate | 0.0003 | | loss | -0.0154 | | n_updates | 30 | | policy_gradient_loss | -0.0809 | | value_loss | 0.244 | ----------------------------------------- ----------------------------------------- | time/ | | | fps | 19330 | | iterations | 7 | | time_elapsed | 2 | | total_timesteps | 57344 | | train/ | | | approx_kl | 0.020989873 | | clip_fraction | 0.107 | | clip_range | 0.2 | | entropy_loss | -0.598 | | explained_variance | -0.0682 | | learning_rate | 0.0003 | | loss | 0.00901 | | n_updates | 36 | | policy_gradient_loss | -0.0623 | | value_loss | 0.199 | ----------------------------------------- ----------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 3.94e+03 | | time/ | | | fps | 19346 | | iterations | 8 | | time_elapsed | 3 | | total_timesteps | 65536 | | train/ | | | approx_kl | 0.011593575 | | clip_fraction | 0.0672 | | clip_range | 0.2 | | entropy_loss | -0.405 | | explained_variance | -0.043 | | learning_rate | 0.0003 | | loss | 0.0209 | | n_updates | 42 | | policy_gradient_loss | -0.0423 | | value_loss | 0.159 | ----------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 3.94e+03 | | time/ | | | fps | 19259 | | iterations | 9 | | time_elapsed | 3 | | total_timesteps | 73728 | | train/ | | | approx_kl | 0.0055017713 | | clip_fraction | 0.0371 | | clip_range | 0.2 | | entropy_loss | -0.279 | | explained_variance | -0.0461 | | learning_rate | 0.0003 | | loss | 0.024 | | n_updates | 48 | | policy_gradient_loss | -0.0277 | | value_loss | 0.129 | ------------------------------------------ ----------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 3.94e+03 | | time/ | | | fps | 19251 | | iterations | 10 | | time_elapsed | 4 | | total_timesteps | 81920 | | train/ | | | approx_kl | 
0.002933262 | | clip_fraction | 0.0188 | | clip_range | 0.2 | | entropy_loss | -0.199 | | explained_variance | -0.0312 | | learning_rate | 0.0003 | | loss | 0.0287 | | n_updates | 54 | | policy_gradient_loss | -0.0172 | | value_loss | 0.107 | ----------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 3.94e+03 | | time/ | | | fps | 19253 | | iterations | 11 | | time_elapsed | 4 | | total_timesteps | 90112 | | train/ | | | approx_kl | 0.0017655796 | | clip_fraction | 0.0122 | | clip_range | 0.2 | | entropy_loss | -0.142 | | explained_variance | -0.0372 | | learning_rate | 0.0003 | | loss | 0.031 | | n_updates | 60 | | policy_gradient_loss | -0.0117 | | value_loss | 0.0968 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 3.94e+03 | | time/ | | | fps | 19246 | | iterations | 12 | | time_elapsed | 5 | | total_timesteps | 98304 | | train/ | | | approx_kl | 0.0011323919 | | clip_fraction | 0.00791 | | clip_range | 0.2 | | entropy_loss | -0.102 | | explained_variance | -0.0306 | | learning_rate | 0.0003 | | loss | 0.0334 | | n_updates | 66 | | policy_gradient_loss | -0.00803 | | value_loss | 0.0909 | ------------------------------------------ ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 3.94e+03 | | time/ | | | fps | 19204 | | iterations | 13 | | time_elapsed | 5 | | total_timesteps | 106496 | | train/ | | | approx_kl | 0.00066238345 | | clip_fraction | 0.00527 | | clip_range | 0.2 | | entropy_loss | -0.0745 | | explained_variance | -0.0263 | | learning_rate | 0.0003 | | loss | 0.0346 | | n_updates | 72 | | policy_gradient_loss | -0.00586 | | value_loss | 0.0865 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 3.94e+03 | | time/ | | | fps | 
19198 | | iterations | 14 | | time_elapsed | 5 | | total_timesteps | 114688 | | train/ | | | approx_kl | 0.00038536388 | | clip_fraction | 0.00264 | | clip_range | 0.2 | | entropy_loss | -0.0564 | | explained_variance | -0.0132 | | learning_rate | 0.0003 | | loss | 0.0342 | | n_updates | 78 | | policy_gradient_loss | -0.00408 | | value_loss | 0.0795 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.26e+03 | | time/ | | | fps | 19190 | | iterations | 15 | | time_elapsed | 6 | | total_timesteps | 122880 | | train/ | | | approx_kl | 0.00018762345 | | clip_fraction | 0.000997 | | clip_range | 0.2 | | entropy_loss | -0.044 | | explained_variance | -0.00609 | | learning_rate | 0.0003 | | loss | 0.0339 | | n_updates | 84 | | policy_gradient_loss | -0.00253 | | value_loss | 0.0758 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.26e+03 | | time/ | | | fps | 19200 | | iterations | 16 | | time_elapsed | 6 | | total_timesteps | 131072 | | train/ | | | approx_kl | 8.1062724e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0358 | | explained_variance | -0.0103 | | learning_rate | 0.0003 | | loss | 0.0331 | | n_updates | 90 | | policy_gradient_loss | -0.00142 | | value_loss | 0.0735 | ------------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.26e+03 | | time/ | | | fps | 19174 | | iterations | 17 | | time_elapsed | 7 | | total_timesteps | 139264 | | train/ | | | approx_kl | 8.631839e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0298 | | explained_variance | -0.00115 | | learning_rate | 0.0003 | | loss | 0.0335 | | n_updates | 96 | | policy_gradient_loss | -0.00155 | | value_loss | 0.0751 | ------------------------------------------ 
------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.26e+03 | | time/ | | | fps | 19169 | | iterations | 18 | | time_elapsed | 7 | | total_timesteps | 147456 | | train/ | | | approx_kl | 4.1890984e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.025 | | explained_variance | -0.00811 | | learning_rate | 0.0003 | | loss | 0.0365 | | n_updates | 102 | | policy_gradient_loss | -0.000898 | | value_loss | 0.0766 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.26e+03 | | time/ | | | fps | 19170 | | iterations | 19 | | time_elapsed | 8 | | total_timesteps | 155648 | | train/ | | | approx_kl | 1.7873477e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0216 | | explained_variance | 0.00054 | | learning_rate | 0.0003 | | loss | 0.0364 | | n_updates | 108 | | policy_gradient_loss | -0.000502 | | value_loss | 0.0746 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.26e+03 | | time/ | | | fps | 19154 | | iterations | 20 | | time_elapsed | 8 | | total_timesteps | 163840 | | train/ | | | approx_kl | 1.8387327e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0192 | | explained_variance | 0.00147 | | learning_rate | 0.0003 | | loss | 0.0364 | | n_updates | 114 | | policy_gradient_loss | -0.000533 | | value_loss | 0.0762 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.26e+03 | | time/ | | | fps | 19161 | | iterations | 21 | | time_elapsed | 8 | | total_timesteps | 172032 | | train/ | | | approx_kl | 2.0283813e-05 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0171 | | explained_variance | 0.00338 | | learning_rate | 0.0003 | | loss | 0.0325 | | n_updates | 
120 | | policy_gradient_loss | -0.00056 | | value_loss | 0.0684 | ------------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.75e+03 | | time/ | | | fps | 19168 | | iterations | 22 | | time_elapsed | 9 | | total_timesteps | 180224 | | train/ | | | approx_kl | 8.371084e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0153 | | explained_variance | 0.00137 | | learning_rate | 0.0003 | | loss | 0.0368 | | n_updates | 126 | | policy_gradient_loss | -0.000265 | | value_loss | 0.0756 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.75e+03 | | time/ | | | fps | 19144 | | iterations | 23 | | time_elapsed | 9 | | total_timesteps | 188416 | | train/ | | | approx_kl | 6.709728e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.014 | | explained_variance | 0.00267 | | learning_rate | 0.0003 | | loss | 0.0335 | | n_updates | 132 | | policy_gradient_loss | -0.000257 | | value_loss | 0.0697 | ------------------------------------------ ------------------------------------------ | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.75e+03 | | time/ | | | fps | 19137 | | iterations | 24 | | time_elapsed | 10 | | total_timesteps | 196608 | | train/ | | | approx_kl | 4.584166e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0129 | | explained_variance | 0.0025 | | learning_rate | 0.0003 | | loss | 0.037 | | n_updates | 138 | | policy_gradient_loss | -0.000172 | | value_loss | 0.073 | ------------------------------------------ ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.75e+03 | | time/ | | | fps | 19120 | | iterations | 25 | | time_elapsed | 10 | | total_timesteps | 204800 | | train/ | | | approx_kl | 2.7833157e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | 
-0.012 | | explained_variance | 1.98e-05 | | learning_rate | 0.0003 | | loss | 0.0354 | | n_updates | 144 | | policy_gradient_loss | -0.000146 | | value_loss | 0.0714 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.75e+03 | | time/ | | | fps | 19122 | | iterations | 26 | | time_elapsed | 11 | | total_timesteps | 212992 | | train/ | | | approx_kl | 5.9275335e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0112 | | explained_variance | 0.0014 | | learning_rate | 0.0003 | | loss | 0.0335 | | n_updates | 150 | | policy_gradient_loss | -0.000243 | | value_loss | 0.0737 | ------------------------------------------- ------------------------------------------ | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.75e+03 | | time/ | | | fps | 19126 | | iterations | 27 | | time_elapsed | 11 | | total_timesteps | 221184 | | train/ | | | approx_kl | 4.291527e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.0103 | | explained_variance | 0.00667 | | learning_rate | 0.0003 | | loss | 0.0339 | | n_updates | 156 | | policy_gradient_loss | -0.00019 | | value_loss | 0.0726 | ------------------------------------------ ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 5.75e+03 | | time/ | | | fps | 19115 | | iterations | 28 | | time_elapsed | 11 | | total_timesteps | 229376 | | train/ | | | approx_kl | 1.7408165e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.00959 | | explained_variance | 0.00327 | | learning_rate | 0.0003 | | loss | 0.0371 | | n_updates | 162 | | policy_gradient_loss | -9.85e-05 | | value_loss | 0.0749 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 6e+03 | | time/ | | | fps | 19102 | | iterations | 29 | | time_elapsed | 12 | | total_timesteps | 237568 
| | train/ | | | approx_kl | 1.7778948e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.00904 | | explained_variance | 0.0011 | | learning_rate | 0.0003 | | loss | 0.037 | | n_updates | 168 | | policy_gradient_loss | -9.94e-05 | | value_loss | 0.0748 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 6e+03 | | time/ | | | fps | 19072 | | iterations | 30 | | time_elapsed | 12 | | total_timesteps | 245760 | | train/ | | | approx_kl | 2.6665512e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.00853 | | explained_variance | 0.00209 | | learning_rate | 0.0003 | | loss | 0.0329 | | n_updates | 174 | | policy_gradient_loss | -0.000138 | | value_loss | 0.0698 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 6e+03 | | time/ | | | fps | 19055 | | iterations | 31 | | time_elapsed | 13 | | total_timesteps | 253952 | | train/ | | | approx_kl | 2.8003778e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.00797 | | explained_variance | 0.00326 | | learning_rate | 0.0003 | | loss | 0.0376 | | n_updates | 180 | | policy_gradient_loss | -0.000142 | | value_loss | 0.0735 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 6e+03 | | time/ | | | fps | 19053 | | iterations | 32 | | time_elapsed | 13 | | total_timesteps | 262144 | | train/ | | | approx_kl | 1.6040431e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.00745 | | explained_variance | 0.00265 | | learning_rate | 0.0003 | | loss | 0.038 | | n_updates | 186 | | policy_gradient_loss | -9.49e-05 | | value_loss | 0.0737 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 
6e+03 | | time/ | | | fps | 19014 | | iterations | 33 | | time_elapsed | 14 | | total_timesteps | 270336 | | train/ | | | approx_kl | 1.4998805e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.00702 | | explained_variance | 0.00355 | | learning_rate | 0.0003 | | loss | 0.0353 | | n_updates | 192 | | policy_gradient_loss | -8.87e-05 | | value_loss | 0.0714 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 6e+03 | | time/ | | | fps | 19023 | | iterations | 34 | | time_elapsed | 14 | | total_timesteps | 278528 | | train/ | | | approx_kl | 1.6422127e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.00664 | | explained_variance | 0.00246 | | learning_rate | 0.0003 | | loss | 0.0384 | | n_updates | 198 | | policy_gradient_loss | -8.98e-05 | | value_loss | 0.0732 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 6e+03 | | time/ | | | fps | 19024 | | iterations | 35 | | time_elapsed | 15 | | total_timesteps | 286720 | | train/ | | | approx_kl | 1.0164658e-06 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.00631 | | explained_variance | 0.00373 | | learning_rate | 0.0003 | | loss | 0.0363 | | n_updates | 204 | | policy_gradient_loss | -6.71e-05 | | value_loss | 0.0719 | ------------------------------------------- ------------------------------------------- | rollout/ | | | ep_len_mean | 7.33e+03 | | ep_rew_mean | 6.15e+03 | | time/ | | | fps | 19025 | | iterations | 36 | | time_elapsed | 15 | | total_timesteps | 294912 | | train/ | | | approx_kl | 2.5571353e-07 | | clip_fraction | 0 | | clip_range | 0.2 | | entropy_loss | -0.00602 | | explained_variance | 0.00436 | | learning_rate | 0.0003 | | loss | 0.0366 | | n_updates | 210 | | policy_gradient_loss | -2.64e-05 | | value_loss | 0.0706 | 
-------------------------------------------
------------------------------------------
| rollout/                |              |
|    ep_len_mean          | 7.33e+03     |
|    ep_rew_mean          | 6.15e+03     |
| time/                   |              |
|    fps                  | 19019        |
|    iterations           | 37           |
|    time_elapsed         | 15           |
|    total_timesteps      | 303104       |
| train/                  |              |
|    approx_kl            | 2.804154e-07 |
|    clip_fraction        | 0            |
|    clip_range           | 0.2          |
|    entropy_loss         | -0.00582     |
|    explained_variance   | 0.00292      |
|    learning_rate        | 0.0003       |
|    loss                 | 0.04         |
|    n_updates            | 216          |
|    policy_gradient_loss | -2.38e-05    |
|    value_loss           | 0.0768       |
------------------------------------------
[SAVED] bots/models/PPO_Trailing_Stop_Loss/ppo_stop_loss_selector_rl_stop_loss_training_ZS=F.zip
✅ Done: 8/8 models saved to bots/models/PPO_Trailing_Stop_Loss
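A note on reading the `entropy_loss` column in the training output above. Assuming Stable-Baselines3 reports `entropy_loss` as the negative mean entropy of the policy (an assumption about the logger, not something stated in this notebook), a value near -1.38 at the start is what a uniform policy over the four actions `[1.0, 2.0, 3.0, 4.0]` would give, and values near zero late in training suggest the policy has become almost deterministic. The sketch below only checks that arithmetic; `categorical_entropy` is a hypothetical helper, not part of the bot code.

```python
import math


def categorical_entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


n_actions = 4  # the training log lists actions [1.0, 2.0, 3.0, 4.0]
uniform = [1.0 / n_actions] * n_actions
max_entropy = categorical_entropy(uniform)  # equals ln(4)

# A policy that puts almost all of its mass on one action has entropy near zero.
peaked = [0.999, 0.001 / 3, 0.001 / 3, 0.001 / 3]
peaked_entropy = categorical_entropy(peaked)

print(f"uniform: {max_entropy:.3f} nats, peaked: {peaked_entropy:.4f} nats")
```

A policy that collapses this quickly may simply have settled on one ATR multiple per symbol, so it is worth checking which action each saved model actually picks before trusting the exits.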
Build the PPO RL exit strategy¶
# ML Trailing Stop Loss Exits
PPO_Models_Dir = "bots/models/PPO_Trailing_Stop_Loss"
# ML Trailing Stop Loss using PPO or LSTM models
exit_strategy = RLTrailingATRExit(
    model_dir=PPO_Models_Dir,
    fallback_multiple=3.0,  # used if a symbol has no model or SB3 isn't available
    ema_span=21,            # use 21 by default; you can sync this to your bot's EMA below
    debug=False,            # set True to print load/inference fallbacks
)
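For orientation, here is a minimal sketch of the plain ATR trailing stop that `fallback_multiple=3.0` presumably falls back to when no model is available. It is not the `RLTrailingATRExit` implementation; the function name `trail_stop`, the constant ATR value, and the toy prices are all assumptions made for illustration.

```python
def trail_stop(prev_stop, close, atr, side, multiple=3.0):
    """Ratchet an ATR trailing stop so it only moves in the trade's favor.

    Hypothetical helper: prev_stop is None on the first bar of a trade.
    """
    if side == "long":
        candidate = close - multiple * atr
        return candidate if prev_stop is None else max(prev_stop, candidate)
    if side == "short":
        candidate = close + multiple * atr
        return candidate if prev_stop is None else min(prev_stop, candidate)
    raise ValueError(f"unknown side: {side!r}")


# Toy long trade with a constant ATR of 2.0 (illustrative numbers only).
closes = [100.0, 104.0, 103.0, 108.0]
stop = None
stops = []
for close in closes:
    stop = trail_stop(stop, close, atr=2.0, side="long")
    stops.append(stop)

print(stops)  # [94.0, 98.0, 98.0, 102.0]
```

The stop holds at 98.0 on the third bar even though price dipped, which is the "cut losers, let winners run" behavior described in the overview. A learned exit would presumably vary the multiple per bar instead of fixing it at 3.0.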
Build the bot with the new exit strategy¶
bot = CoinFlipBot(
    exit_strategy=exit_strategy,
    base_risk_percent=0.01,
    enforce_sessions=False,
    flatten_before_maintenance=True,
    enable_online_learning=False,
    seed=42,
)
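To make `base_risk_percent=0.01` concrete, the following sketch shows one common way to turn a 1% risk budget into a contract count. It is an assumption about how fixed-fractional sizing typically works, not a copy of `CoinFlipBot`'s internals; `contracts_for_risk`, the ATR of 1.5, and the point value of 1000 are hypothetical.

```python
import math


def contracts_for_risk(equity, risk_fraction, stop_distance, point_value):
    """Largest whole number of contracts whose loss at the stop fits the risk budget.

    Hypothetical helper: stop_distance is in price units, point_value is the
    currency value of a one-point move per contract.
    """
    risk_budget = equity * risk_fraction
    risk_per_contract = stop_distance * point_value
    if risk_per_contract <= 0:
        raise ValueError("stop_distance and point_value must be positive")
    return math.floor(risk_budget / risk_per_contract)


# Illustrative inputs: $1,000,000 equity, 1% risk, stop 3 x ATR away with ATR = 1.5.
qty = contracts_for_risk(
    equity=1_000_000,
    risk_fraction=0.01,
    stop_distance=3.0 * 1.5,
    point_value=1000.0,
)
print(qty)  # 2
```

Because the stop distance scales with ATR, a more volatile market gets fewer contracts for the same risk budget, which keeps the dollar risk per trade roughly constant across the basket.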
Initialize engine and environment¶
config_path = "backtest_configs/backtest_config_10_yrs.yaml"
api = BacktesterEngine(config_path=config_path)
api.connect()
env = TradingEnvironment()
env.set_api(api)
env.set_bot(bot)
# Initial indicator compute happens inside TradingEnvironment on connect.
print('Assets:', env.get_asset_list())
Assets: ['6B=F', 'CL=F', '6E=F', 'GC=F', 'LE=F', 'SI=F', 'ZS=F', 'ZB=F']
Launch GUI and Run Backtest¶
launch_gui(env, api)
[FORCED LIQUIDATION] CL=F: current qty=-22, submitting side=buy, qty=22
[FORCED LIQUIDATION] 6E=F: current qty=33, submitting side=sell, qty=33
[FORCED LIQUIDATION] SI=F: current qty=-14, submitting side=buy, qty=14
[FORCED LIQUIDATION] ZS=F: current qty=55, submitting side=sell, qty=55
[FORCED LIQUIDATION] ZB=F: current qty=-37, submitting side=buy, qty=37
Backtesting Results¶
Show Statistics¶
# Minimal: pull stats from the running/backtested engine and show them inline
import pandas as pd
from IPython.display import display
stats = api.get_stats_snapshot() # live snapshot; safe to call anytime
# Portfolio (one row)
display(pd.DataFrame([{
    "Initial Cash": stats["portfolio"].get("initial_cash", 0.0),
    "Final Equity": stats["portfolio"].get("total_equity", 0.0),
    "Used Margin": stats["portfolio"].get("used_margin", 0.0),
    "Max Drawdown %": 100.0 * stats["portfolio"].get("max_drawdown", 0.0),
}]))
# Per-asset table
display(
    pd.DataFrame.from_dict(stats["per_asset"], orient="index")
    .reset_index()
    .rename(columns={"index": "Symbol"})
)
| | Initial Cash | Final Equity | Used Margin | Max Drawdown % |
|---|---|---|---|---|
| 0 | 1000000.0 | 4.547230e+06 | 0.0 | 34.52171 |
| | Symbol | trades | wins | losses | long_trades | short_trades | win_rate | avg_win | avg_loss | profit_factor | expectancy | commission_total | fee_total | max_drawdown |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6B=F | 514 | 203 | 311 | 256 | 258 | 0.394942 | 29654.341133 | -15740.233119 | 1.229738 | 2187.974222 | 74928.0 | 1542.0 | 3.551710e+00 |
| 1 | CL=F | 501 | 170 | 331 | 252 | 249 | 0.339321 | 34352.882353 | -18195.951662 | 0.969637 | -365.009980 | 23356.0 | 1503.0 | 4.919995e+15 |
| 2 | 6E=F | 509 | 184 | 325 | 261 | 248 | 0.361493 | 29110.495925 | -16057.019231 | 1.026406 | 270.726916 | 47880.0 | 1527.0 | 3.067676e+01 |
| 3 | GC=F | 523 | 196 | 327 | 281 | 242 | 0.374761 | 37225.204086 | -17359.021407 | 1.285346 | 3097.017210 | 24840.0 | 1569.0 | 3.230981e+00 |
| 4 | LE=F | 480 | 173 | 307 | 248 | 232 | 0.360417 | 32814.161850 | -17616.091205 | 1.049686 | 559.812500 | 57320.0 | 1440.0 | 1.395000e+16 |
| 5 | SI=F | 533 | 201 | 332 | 265 | 268 | 0.377111 | 43731.343284 | -20129.668675 | 1.315268 | 3953.001876 | 26788.0 | 1599.0 | 3.573250e+00 |
| 6 | ZS=F | 519 | 190 | 329 | 254 | 265 | 0.366089 | 31786.842105 | -17809.194529 | 1.030768 | 347.350674 | 51332.0 | 1557.0 | 1.966981e+00 |
| 7 | ZB=F | 525 | 178 | 347 | 259 | 266 | 0.339048 | 29250.702247 | -18901.161744 | 0.793850 | -2575.386905 | 37160.0 | 1575.0 | 1.125000e+15 |
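The per-asset columns can be cross-checked against each other. The sketch below assumes `expectancy` is the average profit or loss per trade and `profit_factor` is gross profit divided by gross loss; the engine's exact definitions are not shown in this notebook, but the 6B=F row is consistent with these formulas. The helper name `trade_stats` is hypothetical.

```python
def trade_stats(wins, losses, avg_win, avg_loss):
    """Recompute win rate, expectancy, and profit factor from summary columns.

    Hypothetical helper: avg_loss is expected to be negative, as in the table.
    """
    trades = wins + losses
    win_rate = wins / trades
    # Average P&L per trade: probability-weighted mix of the average win and loss.
    expectancy = win_rate * avg_win + (1.0 - win_rate) * avg_loss
    gross_profit = wins * avg_win
    gross_loss = losses * abs(avg_loss)
    profit_factor = gross_profit / gross_loss if gross_loss > 0 else float("inf")
    return {
        "win_rate": win_rate,
        "expectancy": expectancy,
        "profit_factor": profit_factor,
    }


# Inputs copied from the 6B=F row above.
row = trade_stats(wins=203, losses=311, avg_win=29654.341133, avg_loss=-15740.233119)
print(round(row["win_rate"], 6))       # 0.394942
print(round(row["expectancy"], 2))     # 2187.97
print(round(row["profit_factor"], 4))  # 1.2297
```

Note how a win rate below 40% still produces positive expectancy here because the average win is nearly twice the average loss. That asymmetry comes from the exits, not the coin-flip entries.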
Show Equity Curve¶
# Build the portfolio equity Series `s` from the engine's history and plot it
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter
# Times + equity (portfolio). Safe to call anytime; uses the engine's live history.
times, equity = api.get_equity_series()  # None -> portfolio; pass a symbol for per-asset
n = min(len(times), len(equity))
if n == 0:
    print("No equity data available yet.")
else:
    s = pd.Series(equity[:n], index=pd.to_datetime(times[:n])).dropna()
    # (Optional) fill gaps such as weekends/holidays on an hourly grid:
    s = s.resample("h").last().ffill()
    fig, ax = plt.subplots(figsize=(10, 4))
    ax.plot(s.index, s.values)
    ax.set_title("Portfolio Equity")
    ax.set_xlabel("Time")
    ax.set_ylabel("Equity ($)")
    ax.grid(True)
    # Turn off scientific notation/offset and format with commas
    ax.ticklabel_format(axis='y', style='plain', useOffset=False)
    ax.yaxis.set_major_formatter(FuncFormatter(lambda x, pos: f'${x:,.0f}'))
    fig.autofmt_xdate()
    plt.show()
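The "Max Drawdown %" figure in the statistics can be sanity-checked against the equity curve. The sketch below computes the largest peak-to-trough decline as a fraction of the running peak; this is the conventional definition and an assumption here, since the engine's own calculation is not shown. The function name and the toy curve are illustrative.

```python
def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    if not equity:
        return 0.0
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)  # highest equity seen so far
        if peak > 0:
            worst = max(worst, (peak - value) / peak)
    return worst


# Toy curve: the deepest decline is from 120 down to 90, i.e. 25% of the peak.
curve = [100.0, 120.0, 90.0, 110.0, 150.0, 135.0]
print(max_drawdown(curve))  # 0.25
```

With the Series built above, `max_drawdown(list(s.values))` would give the portfolio figure as a fraction. For a curve that stays positive, a drawdown defined this way always lies between 0 and 1, which gives a quick bound to compare reported values against.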