Component Gallery

A showcase of all available academic components in this project.

Information Design

Educational Callouts

Custom callouts designed for academic contexts with distinct semantic types.

Intuition
Intuitive Understanding
Intuition callouts help build mental models. They use warm colors to invite the reader to think conceptually before diving into formalisms.
Definition
Formal Definition
Definition callouts use professional blue tones for rigorous statements and terminology.
Theorem
Theorem 1.1 (Central Limit)
Theorem callouts use purple tones to highlight proven mathematical or scientific results.
Example
Example callouts (green) provide concrete applications of abstract concepts.
Common Pitfall
Warning callouts (red) alert readers to subtle errors or misconceptions.
Nice to know
"Nice to know" callouts (pink) provide interesting side notes that aren't critical to the main path.
Rigorous Notation

Math & Equations

Integrated LaTeX support via KaTeX with block and inline formatting.

Block Equations

f(x) = \int_{-\infty}^{\infty} \hat{f}(\xi) e^{2\pi i \xi x} \, d\xi

Inline Equations

The fundamental theorem of calculus states that \int_a^b f(x)\,dx = F(b) - F(a), where F' = f.
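As a quick numeric sanity check of the theorem as stated, here is a minimal sketch; the function f(x) = 3x² and antiderivative F(x) = x³ are chosen purely for illustration:

```python
def riemann_sum(f, a, b, n=100_000):
    """Left Riemann sum approximation of the integral of f on [a, b]."""
    h = (b - a) / n
    return sum(f(a + i * h) for i in range(n)) * h

f = lambda x: 3 * x**2   # integrand
F = lambda x: x**3       # antiderivative, so F' = f

approx = riemann_sum(f, 0.0, 2.0)
exact = F(2.0) - F(0.0)  # the theorem says these agree in the limit
```

With n = 100,000 subdivisions the two values agree to well under 10⁻³.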
Logical Structure

Proof Blocks

1. Assume √2 is rational. Then √2 = a/b where a, b are coprime integers.
2. Squaring both sides: 2 = a²/b², so a² = 2b². Thus a² is even, which implies a is even.
3. Let a = 2k. Then (2k)² = 2b², so 4k² = 2b², which means b² = 2k².
4. Thus b² is even, so b is even. But if a and b are both even, they aren't coprime, a contradiction.
Visual Learning

Interactive Graphs

Sigmoid Activation Function: typical non-linear function used in neural networks.
Normal Distribution: the probability density function for a Gaussian distribution.
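The two plotted functions can be reproduced directly; a minimal standard-library sketch (function names are our own, not part of the component API):

```python
import math

def sigmoid(x):
    """Logistic sigmoid: squashes any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Probability density of a Gaussian with mean mu and std sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
```

For example, sigmoid(0) = 0.5 and the standard normal peaks at 1/√(2π) ≈ 0.399.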
Media

Image Blocks

Use markdown image syntax or the explicit MDX image component for figures with optional captions.

Distributed systems example cover image
MdxImage supports responsive rendering with optional captions.
Data Analysis

Statistical Charts

Algorithm Performance Comparison (ms)

Ours: 42 ms · State of the art: 89 ms · Baseline: 156 ms

Energy Consumption Breakdown

Compute (45%)
Memory (25%)
I/O (20%)
Other (10%)

System Trade-offs

Axes: Consistency, Availability, Latency, Scalability, Partition.
Machine Learning

ML Plot Suite

Training Curves

Epoch-wise train/validation loss with an overfitting marker.

Train loss falls steadily from 1.10 (epoch 1) to 0.22 (epoch 15). Validation loss bottoms out at 0.47 around epoch 10, then climbs back to 0.71 by epoch 15; the chart marks this point as "Overfit starts". Axes: Epoch vs. Loss. Series: Train Loss, Validation Loss.
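The overfitting point can be located programmatically from the validation series shown in the chart; a minimal sketch (the patience value is an assumption, not something the chart specifies):

```python
# Validation losses per epoch, read off the chart above.
val_losses = [1.24, 1.00, 0.84, 0.71, 0.63, 0.57, 0.53,
              0.50, 0.48, 0.47, 0.49, 0.53, 0.58, 0.64, 0.71]

def best_epoch(losses):
    """1-indexed epoch with the lowest validation loss."""
    return min(range(len(losses)), key=lambda i: losses[i]) + 1

def early_stop_epoch(losses, patience=3):
    """Epoch at which training would stop: after `patience` consecutive
    epochs without improvement over the best loss seen so far."""
    best, bad = float("inf"), 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            best, bad = loss, 0
        else:
            bad += 1
            if bad >= patience:
                return epoch
    return len(losses)
```

On this series the minimum is at epoch 10, matching the chart's overfitting marker.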

Embedding Projection

Cluster separation in 2D projection.

Class A points: (-1.8, 1.5), (-1.2, 1.1), (-1.4, 0.7). Class B points: (1.1, -1.2), (1.6, -1.5), (1.9, -0.8). Axes: PC1 vs. PC2.
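A nearest-centroid rule cleanly separates the two clusters in this projection; a minimal sketch using the six plotted points (helper names are illustrative):

```python
class_a = [(-1.8, 1.5), (-1.2, 1.1), (-1.4, 0.7)]
class_b = [(1.1, -1.2), (1.6, -1.5), (1.9, -0.8)]

def centroid(points):
    """Mean position of a point cloud."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def nearest_class(p):
    """Assign p to whichever class centroid is closer (squared distance)."""
    def d2(q, c):
        return (q[0] - c[0]) ** 2 + (q[1] - c[1]) ** 2
    ca, cb = centroid(class_a), centroid(class_b)
    return "A" if d2(p, ca) < d2(p, cb) else "B"
```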

Confusion Matrix

Normalized class-level classification accuracy.

       Cat   Dog   Bird
Cat    0.92  0.07  0.01
Dog    0.08  0.85  0.07
Bird   0.03  0.11  0.86
Color scale: Low to High.
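Because the matrix is row-normalized, per-class recall can be read off the diagonal; a minimal sketch computing it and the balanced accuracy (names are our own):

```python
# Row-normalized confusion matrix from the table above.
conf = {
    "Cat":  {"Cat": 0.92, "Dog": 0.07, "Bird": 0.01},
    "Dog":  {"Cat": 0.08, "Dog": 0.85, "Bird": 0.07},
    "Bird": {"Cat": 0.03, "Dog": 0.11, "Bird": 0.86},
}

def recall(cls):
    """With row normalization, the diagonal entry is per-class recall."""
    return conf[cls][cls]

# Balanced accuracy: unweighted mean of per-class recalls.
balanced_accuracy = sum(recall(c) for c in conf) / len(conf)
```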
Computation

Algorithms & Code

Mini-batch Gradient Descent

Input: Dataset D, model parameters θ, learning rate η
Output: Updated parameters θ*
Complexity: O(E * |D| * fwd/bwd)
1. Initialize: initialize parameters θ and optimizer state. (Random init or pretrained)
2. Forward: for each mini-batch, compute predictions and loss. (Cross-entropy or MSE)
3. Backward: backpropagate gradients with respect to θ. (Autodiff graph)
4. Update: apply optimizer update rule using η and gradients. (SGD/Adam step)
5. Repeat: run for E epochs and monitor validation metrics. (Early stopping optional)
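The five steps can be sketched end to end for a toy 1-D linear model. This is an illustrative pure-Python version with hand-derived MSE gradients and a plain SGD update, not the project's implementation:

```python
import random

def minibatch_gd(data, lr=0.05, epochs=200, batch_size=4, seed=0):
    """Mini-batch gradient descent for y = w*x + b under MSE loss."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0                          # 1. Initialize
    for _ in range(epochs):                  # 5. Repeat for E epochs
        rng.shuffle(data)
        for i in range(0, len(data), batch_size):
            batch = data[i:i + batch_size]
            gw = gb = 0.0
            for x, y in batch:
                err = (w * x + b) - y        # 2. Forward: prediction error
                gw += 2 * err * x / len(batch)   # 3. Backward: d(MSE)/dw
                gb += 2 * err / len(batch)       #              d(MSE)/db
            w -= lr * gw                     # 4. Update: plain SGD step
            b -= lr * gb
    return w, b

# Noiseless data from y = 2x + 1 on x in [-4, 4].
data = [(x * 0.5, 2.0 * (x * 0.5) + 1.0) for x in range(-8, 9)]
w, b = minibatch_gd(data)
```

On this noiseless dataset the learned (w, b) converges to (2, 1).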

Training Step (PyTorch)

python
train_step.py
def train_step(model, batch, optimizer, criterion):
    model.train()
    x, y = batch
    optimizer.zero_grad()
    logits = model(x)
    loss = criterion(logits, y)
    loss.backward()
    optimizer.step()
    return loss.item()
Learning Loop

Exercise Blocks

Bias-Variance Check

medium
Training loss keeps decreasing, but validation loss starts increasing after epoch 12. What is happening and what should you change first?
Deep Learning Internals

Tensor Shapes


Layer | Operation | Shape | Note
Input Tokens | Embedding Lookup | [B, T, d_model] | Token + position embedding
Self-Attention | QK^T / sqrt(d_k) | [B, H, T, T] | Attention scores
Context | softmax(scores) · V | [B, T, d_model] | -
MLP | Linear -> GELU -> Linear | [B, T, d_model] | -
Logits | Projection to vocab | [B, T, V] | Pre-softmax output
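The shapes in the table can be checked mechanically; a minimal sketch with illustrative sizes (B, T, d_model, H, V here are assumptions, not values from the table):

```python
# Illustrative sizes: batch 2, sequence 8, model dim 64, 4 heads, vocab 1000.
B, T, d_model, H, V = 2, 8, 64, 4, 1000

def attn_score_shape(x_shape, n_heads):
    """Shape of the QK^T attention scores for an input of [B, T, d_model]."""
    b, t, d = x_shape
    assert d % n_heads == 0, "d_model must divide evenly across heads"
    return (b, n_heads, t, t)

embeddings = (B, T, d_model)              # embedding lookup
scores = attn_score_shape(embeddings, H)  # [B, H, T, T]
context = (B, T, d_model)                 # softmax(scores) @ V, heads merged
logits = (B, T, V)                        # projection to vocabulary
```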
Architecture

Model Diagrams

Encoder-Decoder Overview

Simple sequence-to-sequence model with cross-attention bridge.

Flow: Input tokens → Encoder 1 → Encoder 2 → hidden states (K, V) → Cross Attention → context → Decoder → logits → Output tokens, with a skip connection.
Infrastructure

System Architecture

Declarative diagrams for distributed systems and cloud topologies.

3-Tier Distributed Web Architecture

The Load Balancer routes HTTP requests to Server A and Server B; both servers issue SQL queries to the Primary DB.
Communication

Sequence Diagrams

Visualize message passing and distributed protocols over time.

Raft Leader Election (Successful)

Participants: Node A (Candidate), Node B, Node C. Node A sends RequestVote(T=1) to Nodes B and C; both reply VoteGranted; Node A, now leader, sends Heartbeat (Leader) to Nodes B and C.
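The election outcome in the diagram follows from Raft's strict-majority rule; a minimal sketch of the vote count (function name is our own):

```python
def wins_election(votes_granted, cluster_size):
    """A candidate becomes leader with a strict majority of the cluster,
    counting the vote it casts for itself."""
    return votes_granted + 1 > cluster_size // 2

# In the diagram: Node A receives VoteGranted from B and C in a 3-node cluster,
# so 2 granted votes + its own vote form a majority and it starts heartbeating.
```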
Structured Data

Math Tables

Operator | Meaning | LaTeX
∇ | Gradient / Nabla | \nabla
Σ | Summation | \sum
∏ | Product | \prod
∂ | Partial Derivative | \partial