If you treat a sequence of actions as a string, then using features like "an action A was performed" is akin to unigram model. If you want to account for order of actions, you should add bigrams, trigrams, etc.
That will blow up your feature space, though. For example, if you have
M possible actions, then there are
M (M-1) / 2 bigrams. In general, there are
O(Mk) k-grams. This leads to the following issues:
- The more features you have — the harder it is to apply some methods. For example, many models suffer from curse of dimensionality
- The more features you have — the more data you need to capture meaningful relations.
This is just one possible approach to your problem. There may be others. For example, if you know that there's some set of parameters ϴ, that governs action-generating process in a known (at least approximately) way, you can build a separate model to infer these first, and then use ϴ as features.
The process of coming up with sensible numerical representation of your data is called feature engineering. Once you've done that, you can use any Machine Learning algorithm at your disposal.