In Hierarchical Reinforcement Learning (HRL), an agent breaks a complex task down into smaller subtasks. Q-learning is a common algorithm for learning the resulting policies, but it struggles with very large state spaces: in a continuous space, exploring every state is infeasible, and learning a policy is harder still. The key to success is whether you can create an abstract world (a dataset) that lets the agent learn good policies.
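To make the Q-learning update concrete, here is a minimal tabular sketch on an assumed toy chain environment (states 0..4, actions left/right, reward 1 for reaching the last state). This illustrates the update rule only, not the HRL setup; the environment and hyperparameters are illustrative assumptions.

```python
import numpy as np

# Minimal tabular Q-learning sketch on an assumed toy chain environment:
# states 0..4, actions 0 = left / 1 = right, reward 1 for reaching state 4.
def q_learning(n_states=5, episodes=2000, alpha=0.1, gamma=0.9,
               epsilon=0.5, max_steps=200, seed=0):
    rng = np.random.default_rng(seed)
    q = np.zeros((n_states, 2))
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            # epsilon-greedy exploration
            a = rng.integers(2) if rng.random() < epsilon else int(q[s].argmax())
            s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            r = 1.0 if s_next == n_states - 1 else 0.0
            # Core update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
            q[s, a] += alpha * (r + gamma * q[s_next].max() - q[s, a])
            s = s_next
            if s == n_states - 1:
                break
    return q

q = q_learning()
policy = q.argmax(axis=1)  # greedy policy; should point right toward the goal
```

The table `q` has one row per state, which is exactly why the method breaks down when the state space is huge or continuous: the table (and the experience needed to fill it) grows with the number of states.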
Add more data points, increase the diversity of information, introduce variability, handle class imbalance, cover different tasks, and inject noise. Make your data representative of the real world and more resistant to overfitting.
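Two of those steps can be sketched in a few lines. This is an illustrative assumption of how one might implement noise injection and random oversampling with NumPy, not a specific library's API:

```python
import numpy as np

# 1) Gaussian noise injection for variability / overfitting resistance.
def add_noise(X, scale=0.05, seed=0):
    rng = np.random.default_rng(seed)
    return X + rng.normal(0.0, scale, size=X.shape)

# 2) Random oversampling of minority classes to handle class imbalance.
def oversample_minority(X, y, seed=0):
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    Xs, ys = [X], [y]
    for c, n in zip(classes, counts):
        if n < target:
            # duplicate random minority examples until the class is balanced
            idx = rng.choice(np.where(y == c)[0], size=target - n, replace=True)
            Xs.append(X[idx])
            ys.append(y[idx])
    return np.concatenate(Xs), np.concatenate(ys)

X = np.random.default_rng(1).normal(size=(10, 3))
y = np.array([0] * 8 + [1] * 2)        # imbalanced: 8 vs 2
Xb, yb = oversample_minority(X, y)     # now 8 vs 8
Xn = add_noise(X)
```

Oversampling by duplication is the simplest option; noisier variants (e.g. adding jitter to the duplicated rows) combine both ideas.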
Generate new datasets to test your model. Use them when you face a cold-start problem, and to simulate how your data may evolve in the future so you can verify your model will keep working.
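A hedged sketch of the "how data may evolve" idea: generate a simulated future dataset with shifted statistics and check how a rule fitted on today's data behaves. The distributions and threshold rule here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(loc=0.0, scale=1.0, size=1000)    # today's data
future = rng.normal(loc=0.5, scale=1.2, size=1000)   # simulated drift

threshold = np.quantile(train, 0.95)        # rule fitted on today's data
alert_rate_today = (train > threshold).mean()
alert_rate_future = (future > threshold).mean()
# Drift pushes more points over the fixed threshold, revealing that the
# rule would need retraining or monitoring in production.
```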
From the blog
Generative AI and an RL agent can work together to build a robust system. A Variational Autoencoder (VAE) learns a latent space and then generates future trajectories; an RL agent can learn from these trajectories and build an optimal policy. Using knobs, you can keep the state space limited and focus on the abstractions required for learning.
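The generation step can be sketched as sampling latent codes from the prior and decoding them into trajectories. This is a toy, untrained stand-in (a random linear decoder) just to show the shapes and the "latent size" knob; a real VAE would use a trained neural decoder.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, traj_len = 4, 8   # "knobs": latent size controls the abstraction

# Assumed toy decoder: a random linear map from latent code to a
# trajectory of 1-D states (in practice, a trained network).
W = rng.normal(size=(latent_dim, traj_len))

def generate_trajectories(n, W, rng):
    z = rng.normal(size=(n, W.shape[0]))  # sample latent prior N(0, I)
    return z @ W                          # decode codes into trajectories

batch = generate_trajectories(16, W, rng)  # 16 synthetic trajectories
```

The RL agent then treats `batch` as experience, so the cost of exploration is paid in the compact latent space rather than in the full environment.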
Discovering abstractions that reduce the amount of experience or thinking time an RL agent requires to find a good solution is key to RL success. Knobs let you control how large or small a state space you use.
Using knobs, you can manage the trade-off between compressing the state space and faithfully representing good behavior.
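That trade-off can be made visible with a simple proxy. Here PCA (via SVD) stands in for the VAE bottleneck, which is an assumption for illustration: a smaller latent dimension means more compression but higher reconstruction error, i.e. a less faithful representation of the original behavior.

```python
import numpy as np

rng = np.random.default_rng(0)
# Correlated synthetic data standing in for logged trajectories.
X = rng.normal(size=(200, 10)) @ rng.normal(size=(10, 10))
X = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X, full_matrices=False)

def recon_error(k):
    # Rank-k reconstruction: keep only the top-k latent directions.
    Xk = (U[:, :k] * S[:k]) @ Vt[:k]
    return float(np.linalg.norm(X - Xk))

# Turning the "latent size" knob up reduces reconstruction error.
errors = [recon_error(k) for k in (2, 5, 10)]
```

The same monotone trade-off holds for the VAE's latent dimension: the knob setting that is "good enough" is the one where further compression starts destroying the behaviors the policy needs to learn.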