https://github.com/machine-intelligence-laboratory/OptimalNumberOfTopics/blob/master/topnum/scores/dataset_utils.py