Trend

Trend features are backward-looking. They only use past and present data and are safe for live streaming (forward_period = 0).

Simple Moving Average (SMA)

Streaming <1µs/update Research

\[ \text{SMA}_t = \frac{1}{N} \sum_{i=0}^{N-1} x_{t-i} \]

The arithmetic mean of the last \(N\) bars, with equal weight on every observation. The simplest trend smoother: it reduces noise, but it lags by construction. The larger the window, the smoother and the slower.


Name	Type	Constraint	Description
`inputs`	`list[str]`	len = 1	Input column, e.g. `["close"]`
`window`	`int`	>= 1	Number of bars to average (\(N\))
`outputs`	`list[str]`	len = 1	Output column, e.g. `["close_sma_20"]`

Column	When valid	Description
`outputs[0]`	`t >= window - 1`, no `NaN` in buffer	Rolling arithmetic mean over the last `window` bars

Warm-up. The first window - 1 bars return NaN. The buffer must hold exactly window values before a mean can be computed.
NaN propagation. A single NaN input contaminates the buffer. The output stays NaN until that value is evicted, i.e. until window consecutive valid bars have been seen.
window = 1. No warm-up. Output equals the input at every bar.
reset(). Clears the buffer entirely. Call it between backtest folds (CPCV, walk-forward) to avoid state leaking across splits. After reset, the full window - 1 warm-up applies again.
Implementation. Recomputes the sum over the full buffer on every update() call (O(N) per bar, O(N) memory). For typical window sizes the overhead is negligible; see Benchmarks. A running-sum approach would bring this to O(1) per bar, with output identical up to floating-point rounding.

Situation	Output
`t < window - 1` (buffer not full)	`NaN`
Buffer full, all values valid	SMA value
Any value in the buffer is `NaN`	`NaN`
`window = 1`	Input value (immediate, no warm-up)
After `reset()`	`NaN` until buffer refills

Signal. Price above SMA suggests an uptrend; price below suggests a downtrend. A crossover of two SMAs with different windows is one of the most widely traded trend signals.
Lag. Equal weight on every bar in the window. The larger the window, the smoother the output but the more it lags behind the actual price. This lag is structural and by design.

import pandas as pd
from oryon.features import Sma
from oryon import FeaturePipeline, run_features_pipeline

sma = Sma(["close"], window=3, outputs=["close_sma_3"])
fp  = FeaturePipeline(features=[sma], input_columns=["close"])

df = pd.DataFrame({"close": [100.0, 101.0, 102.0, 103.0, None, 104.0, 105.0, 106.0]})
out = run_features_pipeline(fp, df)
print(out)
#    close_sma_3
# 0          NaN
# 1          NaN
# 2       101.00
# 3       102.00
# 4          NaN
# 5          NaN
# 6          NaN
# 7       105.00

Step-by-step with window = 3:

Bar	Input	Buffer	Output
0	100.0	`[100]`	`NaN`
1	101.0	`[100, 101]`	`NaN`
2	102.0	`[100, 101, 102]`	101.0
3	103.0	`[101, 102, 103]`	102.0
4	`NaN`	`[102, 103, NaN]`	`NaN`
5	104.0	`[103, NaN, 104]`	`NaN`
6	105.0	`[NaN, 104, 105]`	`NaN`
7	106.0	`[104, 105, 106]`	105.0

Bars 4-6 are NaN because the NaN at bar 4 remains in the buffer until bar 7 evicts it.
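
The table above can be reproduced without the library. The following is a minimal pure-Python sketch of the documented behavior (warm-up, NaN contamination, eviction); the class name `RollingSma` is hypothetical and this is not the oryon implementation.

```python
import math
from collections import deque


class RollingSma:
    """Illustrative streaming SMA that recomputes the sum on every update."""

    def __init__(self, window: int) -> None:
        if window < 1:
            raise ValueError("window must be >= 1")
        self.window = window
        # A bounded deque evicts the oldest value automatically once full.
        self.buffer = deque(maxlen=window)

    def update(self, x: float) -> float:
        self.buffer.append(x)
        if len(self.buffer) < self.window:
            return math.nan  # warm-up: buffer not full yet
        # sum() propagates NaN, so any NaN in the buffer yields NaN.
        return sum(self.buffer) / self.window

    def reset(self) -> None:
        self.buffer.clear()


closes = [100.0, 101.0, 102.0, 103.0, math.nan, 104.0, 105.0, 106.0]
sma = RollingSma(window=3)
sma_values = [sma.update(x) for x in closes]
print(sma_values)
```

Bars 2, 3 and 7 come out as 101.0, 102.0 and 105.0; every other bar is NaN, matching the step-by-step table.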

crates/oryon/src/features/sma.rs

O(N) to O(1) per update. update() currently recomputes the sum over the full buffer on every bar by delegating to average() in ops/stats.rs. A running-sum approach would maintain a single accumulator: add the incoming value, subtract the evicted one. This brings the per-update cost from O(N) to O(1). The output matches up to floating-point rounding, provided NaN values are tracked separately (for example with a counter) so that a single NaN does not poison the accumulator permanently.
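
One possible shape for the running-sum idea, sketched in Python rather than Rust for brevity. The class name `RunningSumSma` and the NaN counter are assumptions made for illustration, not part of the existing code base.

```python
import math
from collections import deque


class RunningSumSma:
    """Illustrative O(1)-per-update SMA using a running sum and a NaN counter."""

    def __init__(self, window: int) -> None:
        if window < 1:
            raise ValueError("window must be >= 1")
        self.window = window
        self.buffer = deque()
        self.total = 0.0    # sum of the non-NaN values currently in the buffer
        self.nan_count = 0  # number of NaN values currently in the buffer

    def update(self, x: float) -> float:
        # Add the incoming value (NaN is counted, never added to the sum).
        self.buffer.append(x)
        if math.isnan(x):
            self.nan_count += 1
        else:
            self.total += x
        # Subtract the evicted value once the buffer exceeds the window.
        if len(self.buffer) > self.window:
            old = self.buffer.popleft()
            if math.isnan(old):
                self.nan_count -= 1
            else:
                self.total -= old
        if len(self.buffer) < self.window or self.nan_count > 0:
            return math.nan
        return self.total / self.window


closes = [100.0, 101.0, 102.0, 103.0, math.nan, 104.0, 105.0, 106.0]
fast_sma = RunningSumSma(window=3)
fast_values = [fast_sma.update(x) for x in closes]
print(fast_values)
```

Keeping NaN out of the accumulator is the key design choice: adding and later subtracting a NaN would leave the sum NaN forever. On long streams the accumulator can drift by a few ulps relative to a full recomputation, so a periodic resync may be worth considering.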

Exponential Moving Average (EMA)

Streaming <1µs/update Research

\[ \text{EMA}_t = \alpha \cdot x_t + (1 - \alpha) \cdot \text{EMA}_{t-1}, \quad \alpha = \frac{2}{N+1} \]

Weights recent observations more heavily than older ones using an exponential decay factor. It reacts faster to price changes than the SMA and maintains only a single state value, making it O(1) per update after seeding.


Name	Type	Constraint	Description
`inputs`	`list[str]`	len = 1	Input column, e.g. `["close"]`
`window`	`int`	>= 1	Span for the smoothing factor (\(N\))
`outputs`	`list[str]`	len = 1	Output column, e.g. `["close_ema_20"]`

Column	When valid	Description
`outputs[0]`	`t >= window - 1`, no `NaN` since last reset	Exponentially weighted mean

Warm-up. The first window bars are used to seed the EMA with their SMA. The first valid output appears at bar window - 1.
NaN propagation. Behavior depends on the phase:
- During seeding: a NaN slides through the rolling seed buffer naturally. The seed is computed as soon as window consecutive valid values are available. No bars are wasted.
- During recursive phase: a NaN fully resets prev_ema and clears the buffer. The seeding phase restarts from scratch, requiring window consecutive valid bars before output resumes.
window = 1. Alpha equals 1.0. Output equals the input at every bar, no warm-up.
reset(). Clears prev_ema and the seed buffer entirely. Call it between backtest folds (CPCV, walk-forward) to avoid state leaking across splits. After reset, the full window - 1 warm-up applies again.
Implementation. After seeding, only prev_ema is maintained in memory (O(1) per update, O(1) memory in the recursive phase). The seed buffer is cleared once seeding completes.

Situation	Output
`t < window - 1` (seeding)	`NaN`
`t == window - 1` (seed bar)	SMA of first `window` bars
Recursive phase, valid input	EMA value
`NaN` during seeding	`NaN`, slides through buffer (no reset)
`NaN` during recursive phase	`NaN` + full state reset
`window = 1`	Input value (immediate, no warm-up)
After `reset()`	`NaN` until reseeded

Signal. Exponentially weighted smoother with recency bias. Recent bars have more influence than older ones, making EMA faster at tracking regime changes than equal-weight smoothers.
Production. Recursive update with a single state variable: O(1) time and memory after seeding. The cheapest smoother to run in a live pipeline.

import pandas as pd
from oryon.features import Ema
from oryon import FeaturePipeline, run_features_pipeline

ema = Ema(["close"], window=3, outputs=["close_ema_3"])
fp  = FeaturePipeline(features=[ema], input_columns=["close"])

df = pd.DataFrame({"close": [100.0, 101.0, 102.0, 103.0, None, 104.0, 105.0, 106.0]})
out = run_features_pipeline(fp, df)
print(out)
#    close_ema_3
# 0          NaN
# 1          NaN
# 2       101.00
# 3       102.00
# 4          NaN
# 5          NaN
# 6          NaN
# 7       105.00

Step-by-step with window = 3, alpha = 0.5:

Bar	Input	Phase	State	Output
0	100.0	Seeding	buffer=`[100]`	`NaN`
1	101.0	Seeding	buffer=`[100, 101]`	`NaN`
2	102.0	Seeding	seed = SMA(`[100, 101, 102]`) = 101.0	101.0
3	103.0	Recursive	0.5×103 + 0.5×101 = 102.0	102.0
4	`NaN`	Reset	state cleared	`NaN`
5	104.0	Seeding	buffer=`[104]`	`NaN`
6	105.0	Seeding	buffer=`[104, 105]`	`NaN`
7	106.0	Seeding	seed = SMA(`[104, 105, 106]`) = 105.0	105.0

At bar 4, NaN arrives in the recursive phase and triggers a full state reset. The EMA reseeds from scratch. If the NaN had arrived during the seeding phase instead, it would have slid through the buffer naturally without resetting.
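
The two-phase behavior described above can be sketched as a small state machine. This is an illustrative pure-Python model built from the documented rules, with the hypothetical class name `SeededEma`; it is not the code in ema.rs.

```python
import math
from collections import deque


class SeededEma:
    """Illustrative EMA with SMA seeding and a NaN-triggered reset."""

    def __init__(self, window: int) -> None:
        if window < 1:
            raise ValueError("window must be >= 1")
        self.window = window
        self.alpha = 2.0 / (window + 1)
        self.seed_buffer = deque(maxlen=window)
        self.prev_ema = None  # None means "still seeding"

    def update(self, x: float) -> float:
        if self.prev_ema is not None:
            # Recursive phase: a NaN clears all state and restarts seeding.
            if math.isnan(x):
                self.reset()
                return math.nan
            self.prev_ema = self.alpha * x + (1.0 - self.alpha) * self.prev_ema
            return self.prev_ema
        # Seeding phase: a NaN simply slides through the rolling buffer.
        self.seed_buffer.append(x)
        if len(self.seed_buffer) < self.window:
            return math.nan
        if any(math.isnan(v) for v in self.seed_buffer):
            return math.nan
        self.prev_ema = sum(self.seed_buffer) / self.window  # seed with the SMA
        self.seed_buffer.clear()
        return self.prev_ema

    def reset(self) -> None:
        self.prev_ema = None
        self.seed_buffer.clear()


closes = [100.0, 101.0, 102.0, 103.0, math.nan, 104.0, 105.0, 106.0]
ema = SeededEma(window=3)
ema_values = [ema.update(x) for x in closes]
print(ema_values)
```

Feeding `[100.0, nan, 101.0, 102.0, 103.0]` into a fresh instance shows the other branch: the NaN arrives during seeding, no reset happens, and the first valid output (102.0) appears as soon as three consecutive valid values are in the buffer.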

crates/oryon/src/features/ema.rs

Kaufman Adaptive Moving Average (KAMA)

Streaming <1µs/update Research

\[ \text{ER}_t = \frac{|P_t - P_{t-N}|}{\sum_{i=0}^{N-1} |P_{t-i} - P_{t-i-1}|} \]

\[ \text{SC}_t = \bigl(\text{ER}_t \cdot (\alpha_f - \alpha_s) + \alpha_s\bigr)^2 \]

\[ \text{KAMA}_t = \text{KAMA}_{t-1} + \text{SC}_t \cdot (P_t - \text{KAMA}_{t-1}) \]

Adapts its smoothing speed based on market efficiency. In trending markets the Efficiency Ratio (ER) approaches 1 and KAMA tracks price closely. In choppy markets ER approaches 0 and KAMA barely moves, suppressing noise.

Default parameters

fast=2, slow=30 match the settings Kaufman originally proposed in Smarter Trading (1995) and are the values most commonly used in practice. Change them only if you have a specific calibration reason.


Name	Type	Constraint	Description
`inputs`	`list[str]`	len = 1	Input column, e.g. `["close"]`
`window`	`int`	>= 1	ER lookback (\(N\)). Kaufman default: `10`
`outputs`	`list[str]`	len = 1	Output column, e.g. `["close_kama_10"]`
`fast`	`int`	>= 1	Fast smoothing period (\(\alpha_f = 2/(fast+1)\)). Default: `2`
`slow`	`int`	> `fast`	Slow smoothing period (\(\alpha_s = 2/(slow+1)\)). Default: `30`

Column	When valid	Description
`outputs[0]`	`t >= window`, no `NaN` in window	Adaptive smoothed value

Warm-up. Requires window + 1 bars to compute the first ER. The first valid output appears at bar window (one bar later than SMA/EMA with the same window).
NaN propagation. A NaN anywhere in the current window resets prev_kama and the output is NaN. Once the window no longer contains the NaN, KAMA re-seeds automatically using prices[window - 1] (the previous bar's price in the window + 1 buffer) as the starting value.
reset(). Clears prev_kama and the price buffer. Call it between backtest folds (CPCV, walk-forward) to avoid state leaking across splits. After reset, the full window warm-up applies again.
Implementation. Iterates over the full buffer on every update() to compute direction, volatility, and ER (O(N) per bar, O(N) memory).

Situation	Output
`t < window` (buffer not full)	`NaN`
Buffer full, all values valid	KAMA value
Any `NaN` in the window	`NaN` + `prev_kama` reset
After `reset()`	`NaN` until buffer refills

Signal. Adaptive smoother. Smoothing speed adjusts automatically based on the Efficiency Ratio (ER): fast during directional moves (ER near 1), slow during noisy/ranging markets (ER near 0). This built-in regime detection reduces false signals compared to fixed-speed smoothers.
The ER as a standalone feature. The Efficiency Ratio that KAMA uses internally is independently useful. It quantifies how directional the market is over a window, a direct measure of signal-to-noise ratio in price movement.
References. Kaufman, P.J. (1995), Smarter Trading. Expanded in Trading Systems and Methods (6th ed., 2020).
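
To make the Efficiency Ratio concrete, here is a small standalone sketch. The function name `efficiency_ratio` is hypothetical, and returning 0.0 for a perfectly flat window is an assumption made here for illustration; the source does not say how the library handles that case.

```python
def efficiency_ratio(prices: list[float]) -> float:
    """ER over N + 1 consecutive prices: net move divided by total path length."""
    direction = abs(prices[-1] - prices[0])
    volatility = sum(abs(b - a) for a, b in zip(prices, prices[1:]))
    # Assumption: a flat window (zero path length) is treated as ER = 0.
    return direction / volatility if volatility > 0 else 0.0


er_trend = efficiency_ratio([100.0, 101.0, 102.0, 103.0])        # straight line
er_chop = efficiency_ratio([100.0, 101.0, 100.0, 101.0, 100.0])  # pure oscillation
er_mixed = efficiency_ratio([100.0, 101.0, 103.0, 102.0])        # partial retrace
print(er_trend, er_chop, er_mixed)
```

The straight line gives 1.0, the oscillation gives 0.0, and the mixed path gives 0.5 (net move 2 over path length 4).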

import pandas as pd
from oryon.features import Kama
from oryon import FeaturePipeline, run_features_pipeline

kama = Kama(["close"], window=3, outputs=["close_kama_3"], fast=2, slow=5)
fp   = FeaturePipeline(features=[kama], input_columns=["close"])

df = pd.DataFrame({"close": [100.0, 101.0, 103.0, 102.0, 105.0, 107.0, 106.0]})
out = run_features_pipeline(fp, df)
print(out)
#    close_kama_3
# 0           NaN
# 1           NaN
# 2           NaN
# 3        102.75
# 4        103.44
# 5        104.54
# 6        104.99

With window=3, fast=2, slow=5, the smoothing constants are α_f=2/3 and α_s=1/3. Bars 0-2 are NaN because window + 1 = 4 bars are needed. At bar 3: ER = |102-100| / (1+2+1) = 0.5, SC = (0.5×(2/3-1/3) + 1/3)² = 0.25, seed = 103.0, so KAMA = 103.0 + 0.25×(102-103) = 102.75.
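
The same numbers can be checked with a short pure-Python model of the formulas and rules above. The class name `KamaSketch` is hypothetical, the zero-volatility fallback is an assumption, and this is a sketch of the documented behavior rather than the code in kama.rs.

```python
import math
from collections import deque


class KamaSketch:
    """Illustrative KAMA: ER over a window + 1 buffer, seeded from the prior price."""

    def __init__(self, window: int, fast: int = 2, slow: int = 30) -> None:
        self.window = window
        self.alpha_fast = 2.0 / (fast + 1)
        self.alpha_slow = 2.0 / (slow + 1)
        self.prices = deque(maxlen=window + 1)
        self.prev_kama = None

    def update(self, price: float) -> float:
        self.prices.append(price)
        if len(self.prices) < self.window + 1:
            return math.nan  # warm-up: ER needs window + 1 prices
        if any(math.isnan(p) for p in self.prices):
            self.prev_kama = None  # NaN in the window resets the state
            return math.nan
        p = list(self.prices)
        direction = abs(p[-1] - p[0])
        volatility = sum(abs(b - a) for a, b in zip(p, p[1:]))
        # Assumption: treat a flat window as ER = 0 to avoid dividing by zero.
        er = direction / volatility if volatility > 0 else 0.0
        sc = (er * (self.alpha_fast - self.alpha_slow) + self.alpha_slow) ** 2
        # Seed from prices[window - 1] when there is no previous KAMA.
        base = p[self.window - 1] if self.prev_kama is None else self.prev_kama
        self.prev_kama = base + sc * (price - base)
        return self.prev_kama


closes = [100.0, 101.0, 103.0, 102.0, 105.0, 107.0, 106.0]
kama = KamaSketch(window=3, fast=2, slow=5)
kama_values = [kama.update(x) for x in closes]
print([round(v, 2) for v in kama_values])
```

Rounded to two decimals, bars 3-6 give 102.75, 103.44, 104.54 and 104.99, in line with the printed output.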

crates/oryon/src/features/kama.rs

Linear Slope

Streaming <1µs/update Research

\[ \hat{\beta} = \frac{S_{xy}}{S_{xx}}, \quad R^2 = \frac{S_{xy}^2}{S_{xx} \cdot S_{yy}} \]

\[ S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}), \quad S_{xx} = \sum (x_i - \bar{x})^2, \quad S_{yy} = \sum (y_i - \bar{y})^2 \]

Fits an OLS regression of y on x over a rolling window and outputs the slope and R² at each bar. Useful for quantifying trend direction, strength, and linearity of price movement.

Choosing x

Pass a simple integer index [0, 1, 2, ...] as x to get slope in price-per-bar units. If you pass timestamps, the slope becomes price-per-nanosecond and is much harder to read.
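
A quick numerical illustration of the unit effect. The helper `ols_slope` is hypothetical, and the one-minute bar spacing is an assumption chosen only to make the timestamps concrete.

```python
def ols_slope(xs: list[float], ys: list[float]) -> float:
    """Plain OLS slope S_xy / S_xx over equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


closes = [100.0, 103.0, 106.0]

# x as a bar index: slope is in price units per bar.
bar_index = [0.0, 1.0, 2.0]
slope_per_bar = ols_slope(bar_index, closes)

# x as nanosecond timestamps (assumed one-minute bars): price units per nanosecond.
NS_PER_MINUTE = 60 * 1_000_000_000
timestamps_ns = [float(i * NS_PER_MINUTE) for i in range(3)]
slope_per_ns = ols_slope(timestamps_ns, closes)

print(slope_per_bar, slope_per_ns)
```

The same price series yields a slope of 3.0 per bar but roughly 5e-11 per nanosecond. R² is unaffected by the rescaling of x; only the slope's units change.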


Name	Type	Constraint	Description
`inputs`	`list[str]`	len = 2	Two columns in order: `[x_col, y_col]`
`window`	`int`	>= 2	Rolling window length
`outputs`	`list[str]`	len = 2	Output columns: `[slope_col, r2_col]`

Column	When valid	Description
`outputs[0]`	`t >= window - 1`, no `NaN`, `x` not constant	OLS slope over the last `window` bars
`outputs[1]`	Same as slope, and `y` not constant	Coefficient of determination R²

Warm-up. The first window - 1 bars return NaN for both outputs. A full window of x and y values is required before regression can be computed.
NaN propagation. A NaN in either x or y contaminates both outputs. They stay NaN until that bar is evicted, i.e. until window consecutive valid pairs have been seen.
Degenerate cases. If x is constant over the window (S_xx = 0), both outputs are NaN. If y is constant (S_yy = 0), slope is valid (returns 0.0) but R² is NaN.
reset(). Clears both the x and y buffers entirely. Call it between backtest folds (CPCV, walk-forward) to avoid state leaking across splits. After reset, the full window - 1 warm-up applies again.
Implementation. Two passes over the window on every update(): one to compute means, one for \(S_{xx}\), \(S_{xy}\), \(S_{yy}\) (O(N) per bar, O(N) memory).

Situation	Slope	R²
`t < window - 1`	`NaN`	`NaN`
Buffer full, all values valid	OLS slope	R² value
Any `NaN` in either buffer	`NaN`	`NaN`
`x` constant (`S_xx = 0`)	`NaN`	`NaN`
`y` constant (`S_yy = 0`)	`0.0`	`NaN`
After `reset()`	`NaN`	`NaN`

Signal. Slope and R² together give a richer picture than a moving average alone. Slope captures trend direction and magnitude. R² captures trend quality: how linear the movement is over the window. What counts as a "high" R² depends entirely on the inputs: time-vs-price regressions can reach 0.9, while volume-vs-price rarely exceeds 0.2.

import pandas as pd
from oryon.features import LinearSlope
from oryon import FeaturePipeline, run_features_pipeline

ls = LinearSlope(
    ["time_idx", "close"], window=3,
    outputs=["close_slope_3", "close_r2_3"],
)
fp = FeaturePipeline(features=[ls], input_columns=["time_idx", "close"])

df = pd.DataFrame({
    "time_idx": [0.0, 1.0, 2.0, 3.0, 4.0],
    "close":    [100.0, 103.0, 106.0, 109.0, 112.0],
})
out = run_features_pipeline(fp, df)
print(out)
#    close_slope_3  close_r2_3
# 0            NaN         NaN
# 1            NaN         NaN
# 2            3.0         1.0
# 3            3.0         1.0
# 4            3.0         1.0

Price increases by exactly 3.0 per bar, so slope=3.0 and R²=1.0 (perfect linear fit) at every valid bar.
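
For readers who want to see the two-pass computation and the degenerate cases spelled out, here is a minimal pure-Python sketch. The class name `RollingSlope` is hypothetical; it follows the documented rules but is not the code in linear_slope.rs.

```python
import math
from collections import deque


class RollingSlope:
    """Illustrative rolling OLS slope and R² using two passes per update."""

    def __init__(self, window: int) -> None:
        if window < 2:
            raise ValueError("window must be >= 2")
        self.window = window
        self.xs = deque(maxlen=window)
        self.ys = deque(maxlen=window)

    def update(self, x: float, y: float) -> tuple[float, float]:
        self.xs.append(x)
        self.ys.append(y)
        n = self.window
        if len(self.xs) < n:
            return math.nan, math.nan  # warm-up
        if any(math.isnan(v) for v in self.xs) or any(math.isnan(v) for v in self.ys):
            return math.nan, math.nan  # NaN in either buffer
        # Pass 1: means.
        mean_x = sum(self.xs) / n
        mean_y = sum(self.ys) / n
        # Pass 2: centered sums.
        sxx = sum((xi - mean_x) ** 2 for xi in self.xs)
        syy = sum((yi - mean_y) ** 2 for yi in self.ys)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(self.xs, self.ys))
        if sxx == 0.0:
            return math.nan, math.nan  # x constant: slope undefined
        slope = sxy / sxx
        r2 = math.nan if syy == 0.0 else (sxy * sxy) / (sxx * syy)
        return slope, r2


time_idx = [0.0, 1.0, 2.0, 3.0, 4.0]
closes = [100.0, 103.0, 106.0, 109.0, 112.0]
ls = RollingSlope(window=3)
slope_r2 = [ls.update(x, y) for x, y in zip(time_idx, closes)]
print(slope_r2)
```

From bar 2 onward each pair is (3.0, 1.0). Feeding a constant y series instead gives a slope of 0.0 with NaN for R², and a constant x series gives NaN for both.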

crates/oryon/src/features/linear_slope.rs

O(N) to O(1) per update. update() currently runs two full passes over the window to compute means then \(S_{xx}\), \(S_{xy}\), \(S_{yy}\). An incremental approach would maintain the count n and five running sums (sum_x, sum_y, sum_xx, sum_xy, sum_yy), updated in O(1) by adding the incoming pair and subtracting the evicted one. Raw running sums can lose precision through cancellation when values are large relative to their spread, so a Welford-style update of the means and centered sums is the numerically safer variant.
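
A rough Python sketch of the running-sums variant, to show the bookkeeping involved. The class name `IncrementalSlope` is hypothetical, and the sketch assumes NaN-free input: a real implementation would also need NaN tracking (for example a counter), since a NaN added to a running sum cannot be subtracted back out.

```python
import math
from collections import deque


class IncrementalSlope:
    """Illustrative O(1)-per-update rolling OLS using five running sums."""

    def __init__(self, window: int) -> None:
        if window < 2:
            raise ValueError("window must be >= 2")
        self.window = window
        self.pairs = deque()
        self.sum_x = self.sum_y = 0.0
        self.sum_xx = self.sum_xy = self.sum_yy = 0.0

    def _apply(self, x: float, y: float, sign: float) -> None:
        # sign = +1.0 adds a pair to the sums, sign = -1.0 removes it.
        self.sum_x += sign * x
        self.sum_y += sign * y
        self.sum_xx += sign * x * x
        self.sum_xy += sign * x * y
        self.sum_yy += sign * y * y

    def update(self, x: float, y: float) -> tuple[float, float]:
        self.pairs.append((x, y))
        self._apply(x, y, 1.0)
        if len(self.pairs) > self.window:
            old_x, old_y = self.pairs.popleft()
            self._apply(old_x, old_y, -1.0)
        n = len(self.pairs)
        if n < self.window:
            return math.nan, math.nan
        # Centered sums recovered from the raw sums.
        sxx = self.sum_xx - self.sum_x * self.sum_x / n
        sxy = self.sum_xy - self.sum_x * self.sum_y / n
        syy = self.sum_yy - self.sum_y * self.sum_y / n
        if sxx <= 0.0:
            return math.nan, math.nan
        slope = sxy / sxx
        r2 = math.nan if syy <= 0.0 else (sxy * sxy) / (sxx * syy)
        return slope, r2


inc = IncrementalSlope(window=3)
time_idx = [0.0, 1.0, 2.0, 3.0, 4.0]
closes = [100.0, 103.0, 106.0, 109.0, 112.0]
inc_results = [inc.update(x, y) for x, y in zip(time_idx, closes)]
print(inc_results)
```

On this input the results agree with the two-pass computation (slope 3.0, R² 1.0 from bar 2 onward). With rounding error the exact-zero degenerate checks become fragile, so a small tolerance on `sxx` and `syy` is probably needed in practice.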