The functionality around llama_sampling_context is currently part of common and should move into llama. Nearly the entire API in common/sampling.h, except llama_sampling_params and llama_sampling_sample, can be integrated into the library.
This would probably also require merging the grammar parser into the llama library implementation.
llama_sampling_params and llama_sampling_sample will stay in common, since they are very example-specific and not general-purpose enough to be merged.
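
To make the proposed split concrete, here is a rough sketch of how the responsibilities could be divided. Every name prefixed with sketch_ is hypothetical and is not the actual API from common/sampling.h or llama.h; the greedy selection with a repetition penalty is only an illustrative stand-in for whatever sampling policy an example would implement. Grammar handling is left out entirely.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the library's token type.
using sketch_token = int32_t;

// --- Would live in the library: general-purpose sampling state ---
struct sketch_sampling_context {
    std::vector<sketch_token> prev; // most recently accepted tokens
    size_t n_prev;                  // maximum history length
};

sketch_sampling_context sketch_sampling_init(size_t n_prev) {
    return sketch_sampling_context{ {}, n_prev };
}

// Record an accepted token and trim the history to n_prev entries.
void sketch_sampling_accept(sketch_sampling_context & ctx, sketch_token tok) {
    ctx.prev.push_back(tok);
    if (ctx.prev.size() > ctx.n_prev) {
        ctx.prev.erase(ctx.prev.begin());
    }
}

void sketch_sampling_reset(sketch_sampling_context & ctx) {
    ctx.prev.clear();
}

// Last accepted token, or -1 if the history is empty.
sketch_token sketch_sampling_last(const sketch_sampling_context & ctx) {
    return ctx.prev.empty() ? -1 : ctx.prev.back();
}

// --- Would stay in common: example-specific parameters and policy ---
struct sketch_sampling_params {
    float repeat_penalty = 1.1f; // assumed default, for illustration only
};

// Greedy pick after penalizing tokens found in the history.
// Takes logits by value so the caller's buffer is not modified.
sketch_token sketch_sampling_sample(const sketch_sampling_context & ctx,
                                    const sketch_sampling_params & params,
                                    std::vector<float> logits) {
    if (logits.empty()) {
        return -1;
    }
    for (sketch_token tok : ctx.prev) {
        if (tok < 0 || static_cast<size_t>(tok) >= logits.size()) {
            continue;
        }
        float & l = logits[static_cast<size_t>(tok)];
        l = l > 0.0f ? l / params.repeat_penalty : l * params.repeat_penalty;
    }
    auto it = std::max_element(logits.begin(), logits.end());
    return static_cast<sketch_token>(it - logits.begin());
}
```

In this sketch the context only tracks state (init, accept, reset, last), which is the kind of general-purpose functionality that could move into the library, while the params struct and the sample function encode one particular policy and therefore remain on the example side.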