
5. Score and Rank

In Chapter 4 you reacted to events and managed the match lifecycle. In this chapter you score completed matches by how surprising they are, using three scorers: SurpriseScorer (pattern-level), StuScorer (property-level), and SequentialScorer (transition-level). The separation is deliberate: the engine finds matches, and the scorers rank them.

Scorers

  • SurpriseScorer -- set a baseline probability per pattern, observe events on each tick, then score. The score is Shannon surprise in bits: positive = rarer than expected, negative = more common than expected.
  • StuScorer -- observe properties extracted from each completed match (categorical attributes, not entity IDs). Score by how rare those properties are. Two aggregation modes shown side by side: ArithmeticMean (lower = more surprising) and TfIdf (higher = more surprising).
  • SequentialScorer -- bigram model over pattern transitions. Records which pattern completed after which. Scores transition surprise.
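To make the sign convention for pattern-level surprise concrete, here is a minimal sketch. It assumes surprise is the base-2 log of the ratio between the baseline and the observed firing rate; the function `surprise_bits` is hypothetical, and the formula SurpriseScorer actually uses (including any smoothing) may differ.

```rust
/// Hypothetical helper, not part of the library.
/// Assumption: surprise = log2(baseline / observed).
fn surprise_bits(baseline: f64, observed: f64) -> f64 {
    (baseline / observed).log2()
}

fn main() {
    // Expected on 10% of ticks, seen on 5%: rarer than expected, so positive.
    println!("{:.2} bits", surprise_bits(0.10, 0.05)); // prints "1.00 bits"

    // Expected on 10% of ticks, seen on 20%: more common than expected, so negative.
    println!("{:.2} bits", surprise_bits(0.10, 0.20)); // prints "-1.00 bits"
}
```

Under this assumption, a pattern that fires exactly at its baseline rate scores 0 bits, and each halving of the observed rate adds one bit.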

Complete code

/// Extract categorical properties from a match's bindings.
/// Use roles and categories, never raw entity IDs.
fn extract_properties(m: &Match<String, MemValue, i64>) -> Vec<String> {
    let mut props = Vec::new();
    for (var, val) in &m.bindings {
        if let BoundValue::Node(id) = val {
            // Toy heuristic for this tutorial: IDs that start with an uppercase
            // letter (ACME, ZINC) are treated as stocks, everything else as a trader.
            if id.chars().next().is_some_and(|c| c.is_uppercase()) {
                props.push(format!("{}_sector=equity", var));
            } else {
                props.push(format!("{}_role=trader", var));
            }
        }
    }
    props
}

let mut graph = MemGraph::new();
let mut engine: SiftEngineFor<MemGraph> = SiftEngine::new();

let idx_insider = engine.register(insider_trading_pattern());
let idx_crash = engine.register(flash_crash_pattern());
let idx_pump = engine.register(pump_dump_pattern());

// -- set up scorers --
let mut surprise = SurpriseScorer::new();
surprise.set_baseline(idx_insider, 0.1);
surprise.set_baseline(idx_crash, 0.05);
surprise.set_baseline(idx_pump, 0.15);

let mut stu_mean = StuScorer::new();
let mut stu_tfidf = StuScorer::new().with_aggregation(StuAggregation::TfIdf);
let mut sequential = SequentialScorer::new();

// -- simulation: 10 ticks of market activity --
let schedule: Vec<Vec<(&str, &str, &str)>> = vec![
    vec![("insider_tip", "alice", "ACME"), ("trade", "carol", "ACME")],
    vec![
        ("promote", "carol", "ACME"),
        ("price_change", "market", "ACME"),
    ],
    vec![("trade", "alice", "ACME"), ("alert", "system", "ACME")],
    vec![
        ("sell", "carol", "ACME"),
        ("price_change", "market", "ZINC"),
    ],
    vec![("insider_tip", "bob", "ZINC"), ("alert", "system", "ZINC")],
    vec![("trade", "bob", "ZINC"), ("trade", "dan", "ACME")],
    vec![("promote", "dan", "ACME"), ("insider_tip", "eve", "BETA")],
    vec![("trade", "eve", "BETA"), ("sell", "dan", "ACME")],
    vec![
        ("price_change", "market", "BETA"),
        ("alert", "system", "BETA"),
    ],
    vec![("trade", "frank", "BETA")],
];

let mut all_completed: Vec<Match<String, MemValue, i64>> = Vec::new();
let mut last_pattern: Option<String> = None;
let mut event_id = 0;

for (tick_idx, tick_events) in schedule.iter().enumerate() {
    let tick = (tick_idx + 1) as i64;

    for &(action, actor, stock) in tick_events {
        let id = format!("ev{}", event_id);
        add_event(&mut graph, &id, action, actor, stock, tick);

        let events = engine.on_edge_added(
            &graph,
            &id,
            &"action".to_string(),
            &MemValue::Str(action.to_string()),
            &Interval::open(tick),
        );
        surprise.observe_events(&events, engine.patterns());
        event_id += 1;
    }

    surprise.tick();
    let (_delta, _expired) = engine.end_tick(50);

    let completed = engine.drain_completed();
    for m in &completed {
        let props = extract_properties(m);
        stu_mean.observe_one(&m.pattern, &props);
        stu_tfidf.observe_one(&m.pattern, &props);

        if let Some(ref prev) = last_pattern {
            sequential.observe_transition(prev, &m.pattern);
        }
        last_pattern = Some(m.pattern.clone());
    }
    all_completed.extend(completed);
}

// -- SurpriseScorer results --
println!("=== Pattern-Level Surprise (SurpriseScorer) ===");
let scored = surprise.score(&all_completed, engine.patterns());
for sm in &scored {
    println!(" {} -> {:.2} bits", sm.pattern, sm.surprise);
}

// -- StuScorer: ArithmeticMean vs TfIdf side by side --
println!("\n=== Property-Level Surprise (StuScorer) ===");
println!("{:<20} {:>12} {:>12}", "match", "ArithMean", "TfIdf");
println!("{}", "-".repeat(46));

let with_props: Vec<(Match<String, MemValue, i64>, Vec<String>)> = all_completed
    .iter()
    .map(|m| (m.clone(), extract_properties(m)))
    .collect();

let scored_mean = stu_mean.score(&with_props);
let scored_tfidf = stu_tfidf.score(&with_props);

for (sm, st) in scored_mean.iter().zip(scored_tfidf.iter()) {
    println!(
        "{:<20} {:>12.4} {:>12.4}",
        sm.pattern, sm.stu_score, st.stu_score
    );
    if !sm.property_frequencies.is_empty() {
        println!(
            " rarest: {} ({:.3})",
            sm.property_frequencies[0].0, sm.property_frequencies[0].1
        );
    }
}

// -- SequentialScorer results --
println!("\n=== Sequential Surprise ===");
let names: Vec<&str> = all_completed.iter().map(|m| m.pattern.as_str()).collect();
for pair in names.windows(2) {
    let score = sequential.score_transition(pair[0], pair[1]);
    println!(" {} -> {} : {:.2} bits", pair[0], pair[1], score);
}

println!("\ntotal completed: {}", all_completed.len());

Expected output

=== Pattern-Level Surprise (SurpriseScorer) ===
insider_trading -> ... bits
pump_and_dump -> ... bits
insider_trading -> ... bits
pump_and_dump -> ... bits
flash_crash -> ... bits

=== Property-Level Surprise (StuScorer) ===
match                   ArithMean        TfIdf
----------------------------------------------
insider_trading               ...          ...
 rarest: ...
pump_and_dump                 ...          ...
 rarest: ...
insider_trading               ...          ...
 rarest: ...
pump_and_dump                 ...          ...
 rarest: ...
flash_crash                   ...          ...
 rarest: ...

=== Sequential Surprise ===
insider_trading -> pump_and_dump : ... bits
pump_and_dump -> insider_trading : ... bits
insider_trading -> pump_and_dump : ... bits
pump_and_dump -> flash_crash : ... bits

total completed: 5

The values are shown as "..." because the exact numbers depend on Laplace smoothing and observation counts. What matters is the structure: pattern-level scores compare how often each pattern fires against its baseline; StU scores rank matches by property rarity under two different aggregation modes; sequential scores measure transition surprise between consecutive completions.
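To see why smoothing and observation counts move the numbers, here is a minimal sketch of a bigram transition model with add-one (Laplace) smoothing. `BigramSketch` and its methods are hypothetical stand-ins, not the library's SequentialScorer, and the smoothing constant and vocabulary handling are assumptions.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for a bigram transition model.
#[derive(Default)]
struct BigramSketch {
    counts: HashMap<(String, String), u32>, // (prev, next) -> times observed
    totals: HashMap<String, u32>,           // prev -> transitions out of prev
    vocab: HashSet<String>,                 // every pattern name seen so far
}

impl BigramSketch {
    fn observe(&mut self, prev: &str, next: &str) {
        *self
            .counts
            .entry((prev.to_string(), next.to_string()))
            .or_insert(0) += 1;
        *self.totals.entry(prev.to_string()).or_insert(0) += 1;
        self.vocab.insert(prev.to_string());
        self.vocab.insert(next.to_string());
    }

    /// Conditional surprise in bits: -log2 P(next | prev), with add-one smoothing
    /// so that unseen transitions get a small nonzero probability.
    fn surprise_bits(&self, prev: &str, next: &str) -> f64 {
        let count = self
            .counts
            .get(&(prev.to_string(), next.to_string()))
            .copied()
            .unwrap_or(0);
        let total = self.totals.get(prev).copied().unwrap_or(0);
        let v = self.vocab.len().max(1) as f64;
        let p = (count as f64 + 1.0) / (total as f64 + v);
        -p.log2()
    }
}

fn main() {
    let mut model = BigramSketch::default();
    // The same transition sequence as the expected output above.
    model.observe("insider_trading", "pump_and_dump");
    model.observe("pump_and_dump", "insider_trading");
    model.observe("insider_trading", "pump_and_dump");
    model.observe("pump_and_dump", "flash_crash");

    let seen = model.surprise_bits("insider_trading", "pump_and_dump");
    let unseen = model.surprise_bits("insider_trading", "flash_crash");
    println!("seen: {:.2} bits, unseen: {:.2} bits", seen, unseen);
}
```

In this sketch a transition that was never observed still gets a finite score, and that score shrinks or grows as more transitions are recorded, which is why the tutorial output is not pinned to fixed values.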

What you learned

  • SurpriseScorer -- set_baseline() per pattern index, observe_events() + tick() each step, score() to rank. Positive bits = rarer than expected.
  • StuScorer -- observe_one(pattern, properties) per completed match, score() with pre-extracted properties. Lower = more surprising for ArithmeticMean; higher = more surprising for TfIdf.
  • StuAggregation -- ArithmeticMean (original StU), TfIdf (log-weighted, reversed polarity), GeometricMean, Min. Same data, different theories of surprise.
  • SequentialScorer -- observe_transition(prev, current) records what followed what. score_transition() returns conditional surprise in bits.
  • Property extraction -- use categorical attributes (suspect_role=trader), not entity IDs. IDs produce uniform frequencies and make everything equally "surprising."
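The polarity difference between aggregation modes, and the reason entity IDs are a poor choice of property, can be illustrated with a small self-contained sketch. The functions `frequencies`, `arithmetic_mean`, and `log_weighted` are hypothetical, and `log_weighted` is only a simple stand-in for a TF-IDF-style score; StuScorer's actual formulas may differ.

```rust
use std::collections::HashMap;

/// Fraction of observed matches that carry each property.
fn frequencies(observed: &[Vec<&str>]) -> HashMap<String, f64> {
    let mut counts: HashMap<String, f64> = HashMap::new();
    for props in observed {
        for p in props {
            *counts.entry((*p).to_string()).or_insert(0.0) += 1.0;
        }
    }
    let n = observed.len() as f64;
    counts.into_iter().map(|(k, c)| (k, c / n)).collect()
}

/// Mean property frequency: lower = rarer properties = more surprising.
fn arithmetic_mean(props: &[&str], freq: &HashMap<String, f64>) -> f64 {
    props.iter().map(|p| freq[*p]).sum::<f64>() / props.len() as f64
}

/// Mean of -ln(frequency): higher = rarer properties = more surprising.
/// Assumption: a stand-in for a log-weighted mode, not the library's TfIdf.
fn log_weighted(props: &[&str], freq: &HashMap<String, f64>) -> f64 {
    props.iter().map(|p| -freq[*p].ln()).sum::<f64>() / props.len() as f64
}

fn main() {
    // Categorical properties: three equity matches and one bond match.
    let categorical = vec![
        vec!["role=trader", "sector=equity"],
        vec!["role=trader", "sector=equity"],
        vec!["role=trader", "sector=equity"],
        vec!["role=trader", "sector=bond"],
    ];
    let freq = frequencies(&categorical);
    for props in &categorical {
        println!(
            "{:?}: mean={:.3} log={:.3}",
            props,
            arithmetic_mean(props, &freq),
            log_weighted(props, &freq)
        );
    }

    // Entity IDs: every property is unique, so every match scores the same.
    let by_id = vec![vec!["id=alice"], vec!["id=bob"], vec!["id=carol"]];
    let id_freq = frequencies(&by_id);
    for props in &by_id {
        println!("{:?}: mean={:.3}", props, arithmetic_mean(props, &id_freq));
    }
}
```

In the categorical data the bond match gets the lowest mean and the highest log-weighted score, so both modes single it out with opposite polarity. With IDs, every match has the same frequency profile and no ranking is possible.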

Next: Speculate with MCTS ->