[HITL Agentic Systems](https://medium.com/@saimudhiganti/human-in-the-loop-agentic-systems-a-practical-guide-for-engineers-who-want-smarter-safer-agents-e1becadfbbdd) - It makes sense to want a human in the loop for high-stakes decisions. This Medium article suggests that humans support AI agents, but I think it should be the other way around: the agent should support the human. After all, the AI spits out text tokens based on content that was chunked (semantically, syntactically, temporally, visually, auditorily, cognitively, statistically, lexically, task-oriented, spatially, hierarchically, emotionally/conceptually, linguistically), stored as vector embeddings, and found via cosine similarity calculations. The catch is that reviewing each processing step puts an intense mental load on the human.
Upon some digging, the other ways to incorporate feedback into an agent’s learning are:
Method | Key Algorithms/Architectures |
---|---|
Knowledge Graphs | Graph Neural Networks (GNNs), SPARQL, OWL Reasoners, BFS/DFS, Neo4j |
Rule-Based Systems | Forward/Backward Chaining, Expert Systems (CLIPS, Drools), Logic Programming (Prolog) |
Probabilistic Reasoning | Bayesian Inference, Dynamic Bayesian Networks, Markov Decision Processes |
Memory-Augmented NN | Neural Turing Machines (NTMs), Differentiable Neural Computers (DNCs), Transformers with Memory |
Incremental Learning | Online Learning (SGD), Continual Learning (EWC), Streaming Algorithms (Hoeffding Trees) |
Episodic Memory Systems | Case-Based Reasoning (CBR), Attention Mechanisms, Episodic Control, Memory-Augmented RL |
Policy Updates in RL | Q-Learning, Policy Gradients, Actor-Critic (PPO, A3C), Model-Based RL (MuZero) |
Context-Aware Retrieval | Elasticsearch, BM25, Dense Passage Retrieval (DPR), Query Expansion |
Hybrid Approaches | Neuro-Symbolic AI, Hybrid Memory (RAG), Multi-Agent Systems |
Human-in-the-Loop | Active Learning, RLHF, Crowdsourcing, Interactive Machine Learning |
1. Knowledge Graphs
Algorithms/Architectures
- Graph Databases:
- Examples: Neo4j, ArangoDB, Amazon Neptune.
- Graph Neural Networks (GNNs):
- Algorithms: Graph Convolutional Networks (GCNs), Graph Attention Networks (GATs), Message Passing Neural Networks (MPNNs).
- Use Case: Learning embeddings for nodes and edges to improve reasoning over relationships.
- Pathfinding Algorithms:
- Examples: Breadth-First Search (BFS), Depth-First Search (DFS), Dijkstra’s Algorithm.
- Reasoning Engines:
- Tools: OWL reasoners (e.g., Pellet, HermiT) for semantic reasoning, typically run from an ontology editor such as Protégé.
- Query Language: SPARQL for querying the graph.
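As a rough illustration of the pathfinding idea above, here is a minimal BFS sketch over a toy knowledge graph. The graph contents and the `bfs_path` helper are made up for this example; a real system would query a graph database rather than a Python dict.

```python
from collections import deque

# Hypothetical toy knowledge graph: entity -> list of (relation, entity) edges.
GRAPH = {
    "agent": [("uses", "tool")],
    "tool": [("calls", "api")],
    "api": [("returns", "result")],
    "human": [("reviews", "result")],
    "result": [],
}

def bfs_path(graph, start, goal):
    """Return the shortest list of nodes from start to goal, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for _relation, neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None

print(bfs_path(GRAPH, "agent", "result"))  # ['agent', 'tool', 'api', 'result']
```

Edges are directed here, so there is no path from "result" back to "agent".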
2. Rule-Based Systems
Algorithms/Architectures
- Expert Systems:
- Examples: MYCIN (a classic medical expert system), CLIPS and Drools (rule engines for building such systems).
- Rules are written as explicit “if-then” statements.
- Forward Chaining:
- Algorithm: Start from known facts and infer new facts until a goal is reached.
- Example: Production systems.
- Backward Chaining:
- Algorithm: Work backward from a goal to find supporting facts.
- Example: Prolog and logic programming.
- Hybrid Systems:
- Combine symbolic rule-based reasoning with neural networks for more complex tasks.
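Forward chaining as described above can be sketched in a few lines. The rule names below are hypothetical and only meant to echo the human-in-the-loop theme; a rule engine like CLIPS or Drools does the same thing with far more efficient matching.

```python
# Hypothetical rules: (set of premises, conclusion).
RULES = [
    ({"amount_over_limit"}, "high_stakes"),
    ({"high_stakes", "low_confidence"}, "needs_human_review"),
    ({"needs_human_review"}, "pause_agent"),
]

def forward_chain(facts, rules):
    """Apply rules repeatedly until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Fire a rule when all its premises are known and it adds something new.
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

derived = forward_chain({"amount_over_limit", "low_confidence"}, RULES)
print(sorted(derived))
```

Starting from two facts, the loop derives `high_stakes`, then `needs_human_review`, then `pause_agent`.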
3. Probabilistic Reasoning (Bayesian Networks)
Algorithms/Architectures
- Bayesian Inference:
- Algorithms: Variable Elimination, Junction Tree Algorithm, Belief Propagation.
- Frameworks: PyMC (formerly PyMC3), TensorFlow Probability, bnlearn.
- Dynamic Bayesian Networks:
- Extend Bayesian networks across time steps; Hidden Markov Models (HMMs) and Kalman Filters are special cases.
- Markov Decision Processes (MDPs):
- Algorithms: Value Iteration, Policy Iteration.
- Use Case: Decision-making under uncertainty.
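To make Bayesian inference from feedback concrete, here is a minimal sketch that uses a Beta-Bernoulli model, which is my own choice for illustration and not something the list above prescribes. It treats each human approve/reject signal as a coin flip and keeps a Beta distribution over the agent's approval rate; the function names are hypothetical.

```python
def update_beta(alpha, beta, approved):
    """Return the Beta posterior parameters after one approve/reject signal."""
    return (alpha + 1, beta) if approved else (alpha, beta + 1)

def posterior_mean(alpha, beta):
    """Expected approval rate under a Beta(alpha, beta) belief."""
    return alpha / (alpha + beta)

alpha, beta = 1, 1  # Beta(1, 1) is a uniform prior over the approval rate
for approved in [True, True, False, True]:
    alpha, beta = update_beta(alpha, beta, approved)

print(alpha, beta, posterior_mean(alpha, beta))  # 4 2 0.666...
```

Three approvals and one rejection move the belief from 0.5 to about 0.67, and each further signal shifts it less as evidence accumulates.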
4. Memory-Augmented Neural Networks
Algorithms/Architectures
- Neural Turing Machines (NTMs):
- Combines a neural network with an external memory module.
- Algorithm: Differentiable attention mechanisms for memory read/write.
- Differentiable Neural Computers (DNCs):
- Extension of NTMs with enhanced memory capabilities.
- Transformer Architectures with Memory:
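- Example: GPT-style models attend only over their context window; variants such as Transformer-XL or Memorizing Transformers add a longer-lived memory on top of attention.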
- Episodic Memory Networks:
- Algorithm: Memory slots updated with episodic information (e.g., Facebook AI’s Memory Networks).
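The differentiable read mentioned for NTMs boils down to content-based addressing: compare a key against every memory row, turn the similarities into weights with a softmax, and return the weighted sum. The sketch below shows only that read step in plain Python, with a hypothetical `sharpness` parameter standing in for the key strength; there is no learning or write head here.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def read_memory(key, memory, sharpness=5.0):
    """Content-based read: weight each memory row by its similarity to key."""
    weights = softmax([sharpness * cosine(key, row) for row in memory])
    dim = len(memory[0])
    read = [sum(w * row[i] for w, row in zip(weights, memory)) for i in range(dim)]
    return weights, read

MEMORY = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]  # hypothetical memory slots
weights, read = read_memory([1.0, 0.1], MEMORY)
print(weights, read)
```

Because every step is a smooth function, gradients can flow through the read, which is what makes this kind of memory trainable end to end.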
5. Incremental Learning
Algorithms/Architectures
- Online Learning Algorithms:
- Example: Stochastic Gradient Descent (SGD) for continuous updates.
- Perceptron Learning Algorithm for incremental classification.
- Continual Learning:
- Algorithms: Elastic Weight Consolidation (EWC), Synaptic Intelligence, Progressive Neural Networks.
- Streaming Algorithms:
- Examples: Hoeffding Trees, Online Random Forests.
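Online learning means updating the model one example at a time instead of retraining on a full dataset. A minimal sketch with the perceptron rule follows; the data stream (logical AND with labels +1/-1) is a made-up stand-in for feedback arriving over time.

```python
def predict(weights, bias, x):
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation >= 0 else -1

def perceptron_update(weights, bias, x, label, lr=1.0):
    """One online step; label is +1 or -1. Weights change only on a mistake."""
    if predict(weights, bias, x) != label:
        weights = [w + lr * label * xi for w, xi in zip(weights, x)]
        bias = bias + lr * label
    return weights, bias

# Hypothetical stream of (features, label) pairs.
STREAM = [([0, 0], -1), ([0, 1], -1), ([1, 0], -1), ([1, 1], 1)]

weights, bias = [0.0, 0.0], 0.0
for _ in range(20):  # replay the stream a few times
    for x, label in STREAM:
        weights, bias = perceptron_update(weights, bias, x, label)

print(weights, bias)
```

After a handful of passes the weights stop changing because every example in the stream is classified correctly.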
6. Episodic Memory Systems
Algorithms/Architectures
- Case-Based Reasoning (CBR):
- Algorithm: Retrieve, reuse, revise, and retain cases.
- Frameworks: OpenCBR, myCBR.
- Attention Mechanisms:
- Algorithm: Self-attention (e.g., Transformers) for retrieving relevant episodes from memory.
- Reinforcement Learning with Memory:
- Algorithm: Memory-augmented RL (e.g., LSTMs or Transformers in RL agents).
- Episodic Control:
- Stores the returns observed for specific state-action pairs so that good past decisions can be replayed quickly.
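The retrieve, reuse, revise, and retain cycle of case-based reasoning can be sketched as a nearest-neighbour lookup plus a feedback hook. The `CaseBase` class, feature vectors, and solution labels below are all hypothetical; this is not how myCBR or any specific framework is implemented.

```python
def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

class CaseBase:
    def __init__(self):
        self.cases = []  # list of (features, solution)

    def retain(self, features, solution):
        self.cases.append((features, solution))

    def retrieve(self, features):
        return min(self.cases, key=lambda case: distance(case[0], features))

    def solve(self, features, human_fix=None):
        """Reuse the nearest solution; revise and retain it if a human corrects it."""
        _, solution = self.retrieve(features)
        if human_fix is not None:
            solution = human_fix
            self.retain(features, solution)
        return solution

cb = CaseBase()
cb.retain([0.9, 0.1], "escalate")
cb.retain([0.1, 0.9], "auto_approve")

first = cb.solve([0.8, 0.2])                                # reuses "escalate"
second = cb.solve([0.5, 0.5], human_fix="ask_for_details")  # human revises
third = cb.solve([0.52, 0.5])                               # reuses the revised case
print(first, second, third)
```

The human correction is stored as a new case, so the next similar query picks it up without any retraining.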
7. Policy Updates in Reinforcement Learning
Algorithms/Architectures
- Value-Based Methods:
- Q-Learning: Updates Q-values based on feedback.
- Deep Q-Networks (DQN): Combines Q-Learning with neural networks.
- Policy-Based Methods:
- Policy Gradient Methods: Directly optimize the policy using feedback (e.g., REINFORCE).
- Actor-Critic Algorithms: Combine a learned value function (the critic) with a directly optimized policy (the actor), e.g., A3C and PPO.
- Model-Based RL:
- Uses a model of the environment to simulate feedback.
- Algorithms: Dyna-Q, MuZero.
- Hierarchical RL:
- Decomposes tasks into subtasks and updates policies for each subtask.
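For the value-based case, the Q-learning update is small enough to show directly. This is a tabular sketch with hypothetical state and action names; `alpha` is the learning rate and `gamma` the discount factor.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Move Q(state, action) toward reward + gamma * max over next actions."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]

ACTIONS = ["ask_human", "act"]  # hypothetical action set
q = defaultdict(float)          # unseen (state, action) pairs start at 0.0

first = q_update(q, "s0", "act", 1.0, "s1", ACTIONS)   # 0.0 + 0.5 * (1.0 - 0.0) = 0.5
second = q_update(q, "s0", "act", 1.0, "s1", ACTIONS)  # 0.5 + 0.5 * (1.0 - 0.5) = 0.75
print(first, second)
```

The reward could just as well come from a human rating, which is the link between this section and feedback-driven agents.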
8. Context-Aware Retrieval (Without Vectorstores)
Algorithms/Architectures
- Elasticsearch:
- A distributed search engine for keyword retrieval (BM25 by default); recent versions also support dense-vector (kNN) search.
- BM25 (Best Matching 25):
- Ranking function that builds on TF-IDF, adding term-frequency saturation and document-length normalization.
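- Dense Retrieval Models:
- Algorithm: Dense Passage Retrieval (DPR). Note that DPR encodes queries and passages as dense vectors, so it does rely on a vector index (e.g., FAISS).
- Retrieval-Augmented Generation (RAG) is a pipeline built on top of such a retriever rather than a retrieval algorithm itself.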
- Dynamic Query Expansion:
- Algorithm: Expands search queries based on feedback or context.
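To show what keyword ranking without a vector store looks like, here is a minimal BM25 sketch. It uses whitespace tokenization, the common defaults `k1=1.5` and `b=0.75`, and the non-negative IDF variant with `+ 1` inside the logarithm; those choices and the sample documents are assumptions for the example, not taken from any particular search engine's configuration.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            # Length normalization: long documents need more hits for the same score.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

DOCS = [
    "human review for high stakes decisions",
    "agent retrieves documents with keyword search",
    "keyword search ranks documents by term frequency",
]
scores = bm25_scores("human review", DOCS)
print(scores)
```

Only the first document contains the query terms, so it is the only one with a non-zero score.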
9. Hybrid Approaches
Algorithms/Architectures
- Neuro-Symbolic Systems:
- Combine neural networks with symbolic reasoning (e.g., DeepMind's AlphaGeometry, which pairs a language model with a symbolic deduction engine, and IBM's Neuro-Symbolic AI research).
- Hybrid Memory Architectures:
- Example: Retrieval-Augmented Generation (RAG) combines dense vector retrieval with generative models.
- Multi-Agent Systems:
- Algorithms: Game theory, auction mechanisms for collaborative reasoning.
- Frameworks: PettingZoo for multi-agent environments; OpenAI Gym (now maintained as Gymnasium) covers the single-agent case.
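As one small example of an auction mechanism among agents, the sketch below runs a sealed-bid second-price (Vickrey) auction to decide which agent takes a task. The agent names and bid values are hypothetical, and the choice of a second-price auction is mine; the list above does not name a specific mechanism.

```python
def second_price_auction(bids):
    """bids: dict of agent -> bid. Highest bidder wins and pays the second-highest bid."""
    if len(bids) < 2:
        raise ValueError("need at least two bidders")
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price

# Hypothetical bids: how confident each agent is that it can handle the task.
BIDS = {"planner": 0.6, "retriever": 0.9, "coder": 0.4}
winner, price = second_price_auction(BIDS)
print(winner, price)  # retriever 0.6
```

Charging the second-highest bid is what gives agents an incentive to bid their true estimate rather than inflate it.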
10. Human-in-the-Loop Updates
Algorithms/Architectures
- Active Learning:
- Algorithm: Queries humans for labels on uncertain data points.
- Examples: Uncertainty Sampling, Query-by-Committee.
- Reinforcement Learning from Human Feedback (RLHF):
- Algorithm: Trains a reward model on human preference comparisons, then fine-tunes the model against that reward with RL (commonly PPO).
- Example: Used in OpenAI’s GPT-4 and ChatGPT.
- Crowdsourcing and Annotation Tools:
- Platforms: Amazon Mechanical Turk, Label Studio.
- Interactive Machine Learning:
- Algorithm: Incrementally updates the model based on real-time human feedback.
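Uncertainty sampling, the simplest active learning strategy listed above, just asks the human about the items the model is least sure of. A minimal least-confidence sketch, with made-up class probabilities and a hypothetical `least_confident` helper:

```python
def least_confident(probabilities, k=1):
    """Return indices of the k items whose top class probability is lowest."""
    confidence = [max(p) for p in probabilities]
    order = sorted(range(len(probabilities)), key=lambda i: confidence[i])
    return order[:k]

# Hypothetical model outputs: one probability vector per unlabeled item.
PREDICTIONS = [[0.95, 0.05], [0.55, 0.45], [0.10, 0.90], [0.60, 0.40]]
to_label = least_confident(PREDICTIONS, k=2)
print(to_label)  # [1, 3]
```

This is one way to cut the reviewer's mental load: the human only sees the borderline cases instead of every output.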
In the ReAct architecture, ReAct stands for Reasoning + Acting: a framework that interleaves reasoning (the agent's ability to think, reflect, and plan based on its internal knowledge and what it observes in the environment) with acting (the agent's ability to take actions in the environment to achieve its goals).
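A ReAct-style loop alternates thought, action, and observation until the agent decides it is done. The sketch below shows only that control flow: `scripted_model` is a hard-coded stand-in for an LLM call, and the `lookup` tool and its contents are hypothetical.

```python
def lookup(term):
    """Hypothetical tool: a tiny in-memory lookup table."""
    facts = {"react": "Reasoning + Acting"}
    return facts.get(term.lower(), "unknown")

TOOLS = {"lookup": lookup}

def scripted_model(history):
    """Stand-in for an LLM call: returns (thought, action, argument)."""
    observations = [text for kind, text in history if kind == "observation"]
    if not observations:
        return "I should look the term up.", "lookup", "ReAct"
    return "I have what I need.", "finish", observations[-1]

def react_loop(question, model, tools, max_steps=5):
    """Alternate reasoning and acting until the model finishes or steps run out."""
    history = [("question", question)]
    for _ in range(max_steps):
        thought, action, argument = model(history)
        history.append(("thought", thought))
        if action == "finish":
            return argument, history
        observation = tools[action](argument)
        history.append(("action", f"{action}({argument})"))
        history.append(("observation", observation))
    return None, history

answer, trace = react_loop("What does ReAct stand for?", scripted_model, TOOLS)
print(answer)
for kind, text in trace:
    print(f"{kind}: {text}")
```

A human-in-the-loop checkpoint would fit naturally between choosing an action and executing it, for example by asking for approval before `tools[action]` runs.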
… to be continued… brain full.