Feature selection using metaheuristics made easy: Open source MAFESE library in Python
Published in Future Generation Computer Systems, 2024
Artificial intelligence (AI) often relies on feature selection (FS) to identify the most relevant features in a dataset. Training and optimizing AI systems on these key features is decisive for their development and efficacy. To address this challenge, the present study introduces MAFESE, an open-source Python library that employs metaheuristic algorithms to select the optimal set of attributes, particularly when dealing with complex, high-dimensional data. MAFESE encompasses a wide range of feature selection techniques, including unsupervised-based, filter-based, embedded-based, and wrapper-based methods. Notably, within the wrapper-based category, MAFESE offers users access to over 200 metaheuristic algorithms, so they can choose the most suitable algorithm for their specific datasets and requirements. Additionally, MAFESE incorporates built-in evaluation metrics that enable efficient comparison among algorithms. The open-source design of MAFESE encourages cooperation within the data science community, allowing for continuous upgrades. This collaborative environment promotes the sharing of ideas, proposals, and changes, resulting in a stronger and more adaptive feature selection framework. MAFESE distinguishes itself with an easy-to-use Python interface that follows object-oriented programming concepts and supports both experienced researchers and practitioners. MAFESE offers many resources, including documentation, examples, and test cases, for a smooth user onboarding experience. The modular architecture enables users to extend its features and interface with other tools, such as scikit-learn. MAFESE can be a valuable tool for identifying meaningful features in complicated, high-dimensional datasets, making it a significant addition to the field of feature selection. By using metaheuristic algorithms, MAFESE helps users tackle complex feature selection problems more efficiently and accurately.
Through openness and teamwork, MAFESE may become a comprehensive resource that responds to the evolving demands of the data science community. The source code and materials of MAFESE are publicly available in the official GitHub repository at https://github.com/thieu1995/mafese.
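To make the wrapper-based idea in the abstract concrete, the following is a minimal sketch in plain Python and does not use MAFESE's actual API. It assumes a synthetic dataset, a nearest-centroid classifier standing in for the wrapped estimator, and a simple (1+1) evolutionary algorithm standing in for the library's metaheuristics. All function names (`make_dataset`, `fitness`, `select_features`) and the per-feature penalty are hypothetical choices made for illustration.

```python
import random

def make_dataset(n_samples=200, n_informative=3, n_noise=7, seed=0):
    """Synthetic binary data: informative columns shift with the label, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        label = rng.randint(0, 1)
        informative = [rng.gauss(2.0 * label, 1.0) for _ in range(n_informative)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(informative + noise)
        y.append(label)
    return X, y

def nearest_centroid_accuracy(X_train, y_train, X_test, y_test, mask):
    """Wrapped estimator: nearest-centroid classifier restricted to the selected columns."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    centroids = {}
    for label in set(y_train):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X_test, y_test):
        pred = min(
            centroids,
            key=lambda c: sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c])),
        )
        correct += pred == t
    return correct / len(y_test)

def fitness(mask, data, penalty=0.01):
    """Hold-out accuracy minus a small (assumed) penalty per selected feature."""
    X_train, y_train, X_test, y_test = data
    accuracy = nearest_centroid_accuracy(X_train, y_train, X_test, y_test, mask)
    return accuracy - penalty * sum(mask)

def select_features(data, n_features, n_iter=300, seed=1):
    """(1+1) evolutionary algorithm over binary feature masks."""
    rng = random.Random(seed)
    best = [rng.randint(0, 1) for _ in range(n_features)]
    best_fit = fitness(best, data)
    for _ in range(n_iter):
        # Flip each bit independently with probability 1/n_features.
        candidate = [1 - b if rng.random() < 1.0 / n_features else b for b in best]
        cand_fit = fitness(candidate, data)
        if cand_fit >= best_fit:  # keep the candidate if it is not worse
            best, best_fit = candidate, cand_fit
    return best, best_fit

X, y = make_dataset()
data = (X[:140], y[:140], X[140:], y[140:])  # simple hold-out split
mask, score = select_features(data, n_features=len(X[0]))
print("selected columns:", [j for j, keep in enumerate(mask) if keep])
print("fitness:", round(score, 3))
```

In a wrapper method the search algorithm only ever sees the fitness value, so swapping in a different metaheuristic or a different estimator leaves the rest of the loop unchanged. That separation is presumably what lets a library expose many interchangeable optimizers behind one interface.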
If you would like a copy of the paper, please send me an email.
Recommended citation: