diff --git a/joss.06310/10.21105.joss.06310.crossref.xml b/joss.06310/10.21105.joss.06310.crossref.xml
new file mode 100644
index 0000000000..5420b5c0d0
--- /dev/null
+++ b/joss.06310/10.21105.joss.06310.crossref.xml
@@ -0,0 +1,337 @@
+
+
+
+ 20240318T122733-baf0b0b00c8dd7f2a8dbe7df855935119f5fcd59
+ 20240318122733
+
+ JOSS Admin
+ admin@theoj.org
+
+ The Open Journal
+
+
+
+
+ Journal of Open Source Software
+ JOSS
+ 2475-9066
+
+ 10.21105/joss
+ https://joss.theoj.org
+
+
+
+
+ 03
+ 2024
+
+
+ 9
+
+ 95
+
+
+
+ Imbalance: A comprehensive multi-interface Julia
+toolbox to address class imbalance
+
+
+
+ Essam
+ Wisam
+ https://orcid.org/0009-0009-1198-7166
+
+
+ Anthony
+ Blaom
+ https://orcid.org/0000-0001-6689-886X
+
+
+
+ 03
+ 18
+ 2024
+
+
+ 6310
+
+
+ 10.21105/joss.06310
+
+
+ http://creativecommons.org/licenses/by/4.0/
+
+
+
+ Software archive
+ 10.5281/zenodo.10823254
+
+
+ GitHub review issue
+ https://github.com/openjournals/joss-reviews/issues/6310
+
+
+
+ 10.21105/joss.06310
+ https://joss.theoj.org/papers/10.21105/joss.06310
+
+
+ https://joss.theoj.org/papers/10.21105/joss.06310.pdf
+
+
+
+
+
+ Julia: A fresh approach to numerical
+computing
+ Bezanson
+ SIAM Review
+ 1
+ 59
+ 10.1137/141000671
+ 2017
+ Bezanson, J., Edelman, A., Karpinski,
+S., & Shah, V. B. (2017). Julia: A fresh approach to numerical
+computing. SIAM Review, 59(1), 65–98.
+https://doi.org/10.1137/141000671
+
+
+ Supervised learning
+ Cunningham
+ Machine learning techniques for multimedia:
+Case studies on organization and retrieval
+ 10.1007/978-3-540-75171-7_2
+ 978-3-540-75171-7
+ 2008
+ Cunningham, P., Cord, M., &
+Delany, S. J. (2008). Supervised learning. In M. Cord & P.
+Cunningham (Eds.), Machine learning techniques for multimedia: Case
+studies on organization and retrieval (pp. 21–49). Springer Berlin
+Heidelberg.
+https://doi.org/10.1007/978-3-540-75171-7_2
+
+
+ Classification with class imbalance problem:
+A review
+ Ali
+ Soft computing models in industrial and
+environmental applications
+ 2015
+ Ali, A., Shamsuddin, S. M. Hj., &
+Ralescu, A. L. (2015). Classification with class imbalance problem: A
+review. Soft Computing Models in Industrial and Environmental
+Applications.
+https://api.semanticscholar.org/CorpusID:26644563
+
+
+ Effective prediction of three common diseases
+by combining SMOTE with tomek links technique for imbalanced medical
+data
+ Zeng
+ 2016 IEEE International Conference of Online
+Analysis and Computing Science (ICOACS)
+ 10.1109/ICOACS.2016.7563084
+ 2016
+ Zeng, M., Zou, B., Wei, F., Liu, X.,
+& Wang, L. (2016). Effective prediction of three common diseases by
+combining SMOTE with tomek links technique for imbalanced medical data.
+2016 IEEE International Conference of Online Analysis and Computing
+Science (ICOACS), 225–228.
+https://doi.org/10.1109/ICOACS.2016.7563084
+
+
+ Exploratory undersampling for class-imbalance
+learning
+ Liu
+ IEEE Transactions on Systems, Man, and
+Cybernetics, Part B (Cybernetics)
+ 39
+ 10.1109/TSMCB.2008.2007853
+ 2009
+ Liu, X.-Y., Wu, J., & Zhou, Z.-H.
+(2009). Exploratory undersampling for class-imbalance learning. IEEE
+Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 39,
+539–550.
+https://doi.org/10.1109/TSMCB.2008.2007853
+
+
+ The curse of class imbalance and conflicting
+metrics with machine learning for side-channel
+evaluations
+ Picek
+ IACR Trans. Cryptogr. Hardw. Embed.
+Syst.
+ 2019
+ 10.13154/tches.v2019.i1.209-237
+ 2018
+ Picek, S., Heuser, A., Jović, A.,
+Bhasin, S., & Regazzoni, F. (2018). The curse of class imbalance and
+conflicting metrics with machine learning for side-channel evaluations.
+IACR Trans. Cryptogr. Hardw. Embed. Syst., 2019, 209–237.
+https://doi.org/10.13154/tches.v2019.i1.209-237
+
+
+ Addressing the curse of imbalanced training
+sets: One-sided selection
+ Kubát
+ International conference on machine
+learning
+ 1997
+ Kubát, M., & Matwin, S. (1997).
+Addressing the curse of imbalanced training sets: One-sided selection.
+International Conference on Machine Learning.
+https://api.semanticscholar.org/CorpusID:18370956
+
+
+ SMOTE: Synthetic minority over-sampling
+technique
+ Chawla
+ ArXiv
+ abs/1106.1813
+ 10.1613/jair.953
+ 2002
+ Chawla, N., Bowyer, K., Hall, L. O.,
+& Kegelmeyer, W. P. (2002). SMOTE: Synthetic minority over-sampling
+technique. ArXiv, abs/1106.1813.
+https://doi.org/10.1613/jair.953
+
+
+ Borderline-SMOTE: A new over-sampling method
+in imbalanced data sets learning
+ Han
+ International conference on intelligent
+computing
+ 10.1007/11538059_91
+ 2005
+ Han, H., Wang, W., & Mao, B.
+(2005). Borderline-SMOTE: A new over-sampling method in imbalanced data
+sets learning. International Conference on Intelligent Computing.
+https://doi.org/10.1007/11538059_91
+
+
+ RWO-sampling: A random walk over-sampling
+approach to imbalanced data classification
+ Zhang
+ Inf. Fusion
+ 20
+ 10.1016/j.inffus.2013.12.003
+ 2014
+ Zhang, H., & Li, M. (2014).
+RWO-sampling: A random walk over-sampling approach to imbalanced data
+classification. Inf. Fusion, 20, 99–116.
+https://doi.org/10.1016/j.inffus.2013.12.003
+
+
+ Training and assessing classification rules
+with imbalanced data
+ Menardi
+ Data Mining and Knowledge
+Discovery
+ 28
+ 10.1007/s10618-012-0295-5
+ 2012
+ Menardi, G., & Torelli, N.
+(2012). Training and assessing classification rules with imbalanced
+data. Data Mining and Knowledge Discovery, 28, 92–122.
+https://doi.org/10.1007/s10618-012-0295-5
+
+
+ Clustering-based undersampling in
+class-imbalanced data
+ Lin
+ Inf. Sci.
+ 409
+ 10.1016/j.ins.2017.05.008
+ 2016
+ Lin, W.-C., Tsai, C.-F., Hu, Y.-H.,
+& Jhang, J.-S. (2016). Clustering-based undersampling in
+class-imbalanced data. Inf. Sci., 409, 17–26.
+https://doi.org/10.1016/j.ins.2017.05.008
+
+
+ The condensed nearest neighbor rule
+(corresp.)
+ Hart
+ IEEE Trans. Inf. Theory
+ 14
+ 10.1109/TIT.1968.1054155
+ 1968
+ Hart, P. E. (1968). The condensed
+nearest neighbor rule (corresp.). IEEE Trans. Inf. Theory, 14, 515–516.
+https://doi.org/10.1109/TIT.1968.1054155
+
+
+ Imbalanced-learn: A Python toolbox to tackle
+the curse of imbalanced datasets in machine learning
+ Lemaître
+ ArXiv
+ abs/1609.06570
+ 2016
+ Lemaître, G., Nogueira, F., &
+Aridas, C. K. (2016). Imbalanced-learn: A Python toolbox to tackle the
+curse of imbalanced datasets in machine learning. ArXiv, abs/1609.06570.
+https://api.semanticscholar.org/CorpusID:1426815
+
+
+ Smote-variants: A Python implementation of 85
+minority oversampling techniques
+ Kovács
+ Neurocomputing
+ 366
+ 10.1016/j.neucom.2019.06.100
+ 0925-2312
+ 2019
+ Kovács, G. (2019). Smote-variants: A
+Python implementation of 85 minority oversampling techniques.
+Neurocomputing, 366, 352–354.
+https://doi.org/10.1016/j.neucom.2019.06.100
+
+
+ The rise of Julia
+ Tuychiev
+ 2023
+ Tuychiev, B. (2023). The rise of
+Julia.
+https://www.datacamp.com/blog/the-rise-of-julia-is-it-worth-learning-in-2022
+
+
+ Analysing the classification of imbalanced
+data-sets with multiple classes: Binarization techniques and ad-hoc
+approaches
+ Fernández
+ Knowl. Based Syst.
+ 42
+ 10.1016/J.KNOSYS.2013.01.018
+ 2013
+ Fernández, A., López, V., Galar, M.,
+Jesús, M. J. del, & Herrera, F. (2013). Analysing the classification
+of imbalanced data-sets with multiple classes: Binarization techniques
+and ad-hoc approaches. Knowl. Based Syst., 42, 97–110.
+https://doi.org/10.1016/J.KNOSYS.2013.01.018
+
+
+ MLJ: A julia package for composable machine
+learning
+ Blaom
+ J. Open Source Softw.
+ 5
+ 10.21105/joss.02704
+ 2020
+ Blaom, A. D., Király, F. J., Lienart,
+T., Simillides, Y., Arenas, D., & Vollmer, S. J. (2020). MLJ: A
+julia package for composable machine learning. J. Open Source Softw., 5,
+2704. https://doi.org/10.21105/joss.02704
+
+
+
+
+
+
diff --git a/joss.06310/10.21105.joss.06310.jats b/joss.06310/10.21105.joss.06310.jats
new file mode 100644
index 0000000000..ce837ab966
--- /dev/null
+++ b/joss.06310/10.21105.joss.06310.jats
@@ -0,0 +1,748 @@
+
+
+
+
+
+
+
+Journal of Open Source Software
+JOSS
+
+2475-9066
+
+Open Journals
+
+
+
+6310
+10.21105/joss.06310
+
+Imbalance: A comprehensive multi-interface Julia toolbox
+to address class imbalance
+
+
+
+https://orcid.org/0009-0009-1198-7166
+
+Wisam
+Essam
+
+
+
+
+https://orcid.org/0000-0001-6689-886X
+
+Blaom
+Anthony
+
+
+
+
+
+Cairo University, Egypt
+
+
+
+
+University of Auckland, New Zealand
+
+
+
+
+17
+10
+2023
+
+9
+95
+6310
+
+Authors of papers retain copyright and release the
+work under a Creative Commons Attribution 4.0 International License (CC
+BY 4.0)
+2022
+The article authors
+
+
+
+
+machine learning
+classification
+class imbalance
+resampling
+oversampling
+undersampling
+julia
+
+
+
+
+
+ Summary
+
Given a set of observations that each belong to a certain class,
+ supervised classification aims to learn a classification model that
+ can predict the class of a new, unlabeled observation
+ (Cunningham
+ et al., 2008). This modeling process finds extensive
+ application in real-life scenarios, including but not limited to
+ medical diagnostics, recommendation systems, credit scoring, and
+ sentiment analysis.
+
In various real-world scenarios where supervised classification is
+ employed, such as those pertaining to the detection of particular
+ conditions like fraud, faults, pollution, or rare diseases, a severe
+ discrepancy between the number of observations in each class can
occur. This is known as class imbalance. It poses a problem when, as is
commonly the case, assumptions inherent in the classification model lead
to degraded performance once the model is trained on imbalanced data
(Ali et al., 2015). Two prevalent strategies for mitigating class
imbalance involve either increasing the representation of less
frequently occurring classes through oversampling or reducing instances
of more frequently occurring classes through undersampling. It may also
be possible to achieve even greater performance by combining both
+ approaches in a sequential pipeline
+ (Zeng
+ et al., 2016) or by undersampling the data multiple times and
+ training the classification model on each resampled dataset to form an
+ ensemble model that aggregates results from different model instances
+ (Liu
et al., 2009). Unlike undersampling, oversampling, or their combination,
the ensemble approach can address class imbalance while making use of
the entire dataset and without generating synthetic data.
+
+
+ Statement of Need
+
A substantial body of literature in the field of machine learning
+ and statistics is devoted to addressing the class imbalance issue.
+ This predicament has often been aptly labeled the “curse of class
imbalance” (Picek et al., 2018; Kubát & Matwin, 1997), a name that
reflects the pervasive nature of the issue across diverse real-world
applications and its pronounced severity: a classifier may incur an
extraordinarily large performance penalty when trained on imbalanced
data.
+
The literature encompasses a myriad of oversampling and
+ undersampling techniques to approach the class imbalance issue. These
include SMOTE (Chawla et al., 2002), which operates by generating
synthetic examples along the lines joining existing ones, and its
variants SMOTE-N and SMOTE-NC (Chawla et al., 2002), which can handle
categorical data. The sheer number of SMOTE variants makes them a body
+ of literature on their own. Notably, the most widely cited variant of
+ SMOTE is BorderlineSMOTE
+ (Han
+ et al., 2005). Other well-established oversampling techniques
+ include RWO
+ (Zhang
+ & Li, 2014) and ROSE
+ (Menardi
+ & Torelli, 2012) which operate by estimating probability
+ densities and sampling from them to generate synthetic points. On the
+ other hand, the literature also encompasses many undersampling
+ techniques. Cluster undersampling
+ (Lin
+ et al., 2016) and condensed nearest neighbors
+ (Hart,
+ 1968) are two prominent examples that attempt to reduce the
+ number of points while preserving the structure or classification
+ boundary of the data. Furthermore, methods that combine oversampling
+ and undersampling such as SMOTETomek
+ (Zeng
+ et al., 2016) are also present. The motivation behind these
+ methods is that when undersampling is not random, it can filter out
+ noisy or irrelevant oversampled data. Lastly, resampling with ensemble
+ learning has also been presented in the literature with EasyEnsemble
+ being the most well-known approach of that type
+ (Liu
+ et al., 2009).
+
The existence of a toolbox with techniques that harness this wealth
+ of research is imperative to the development of novel approaches to
+ the class imbalance problem and for machine learning research broadly.
+ Aside from addressing class imbalance in a general machine learning
+ research setting, such a toolbox can help in class imbalance research
+ settings by making it possible to juxtapose different methods, compose
+ them together, or form variants of them without having to reimplement
+ them from scratch. In prevalent programming languages, such as Python,
+ a variety of such toolboxes already exist, such as imbalanced-learn
+ (Lemaître
+ et al., 2016) and SMOTE-variants
+ (Kovács,
+ 2019). Meanwhile, Julia
+ (Bezanson
+ et al., 2017), a well-known programming language with over 40M
+ downloads
+ (Tuychiev,
+ 2023), has been lacking a similar toolbox to address the class
+ imbalance issue in general multi-class and heterogeneous data
+ settings. This has served as the primary motivation for the creation
+ of the Imbalance.jl toolbox, which we introduce
+ in the subsequent section.
+
+
+ Imbalance.jl
+
In this work, we present Imbalance.jl, a
+ software toolbox implemented in the Julia programming language that
+ offers over 10 well-established techniques that help address the class
+ imbalance issue. Additionally, we present a companion package,
+ MLJBalancing.jl, which: (i) facilitates the
+ inclusion of resampling methods in pipelines with classification
+ models via the BalancedModel construct; and
+ (ii) implements a general version of the EasyEnsemble algorithm
presented by Liu et al. (2009).
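As a hedged sketch of point (ii): we assume here that MLJBalancing.jl
exposes its EasyEnsemble-style wrapper as BalancedBaggingClassifier
with a wrapped model and a number of bags T (names per our reading of
the package documentation; X and y denote features and target as
elsewhere in this paper, and Xnew denotes fresh observations):

```julia
using MLJ, MLJBalancing

# Arbitrary stand-in base classifier; any probabilistic MLJ classifier works.
LogisticClassifier = @load LogisticClassifier pkg=MLJLinearModels verbosity=0

# Train T models, each on the full minority class plus an independent
# random undersample of the majority class, then aggregate predictions.
ensemble = BalancedBaggingClassifier(model=LogisticClassifier(), T=10, rng=42)
mach = machine(ensemble, X, y)
fit!(mach)
y_pred = predict_mode(mach, Xnew)
```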
+
The toolbox offers a pure functional interface for each method
+ implemented. For example, SMOTE can be used in
+ the following fashion:
+ Xover, yover = smote(X, y)
+
Here Xover, yover are
+ X, y after oversampling.
+
A ratios hyperparameter or similar is always
+ present to control the degree of oversampling or undersampling to be
+ done for each class. All hyperparameters for a resampling method have
+ default values that can be overridden.
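As an illustration, the following is a hedged sketch of the functional
interface with overridden hyperparameters; the keyword names (k,
ratios, rng) and the generate_imbalanced_data utility follow our
reading of the package documentation and should be checked against it:

```julia
using Imbalance

# Toy imbalanced data: 500 rows, 3 continuous features; `probs` sets the
# relative class frequencies (data-generation utility assumed from the docs).
X, y = generate_imbalanced_data(500, 3; probs=[0.8, 0.2], rng=42)

# Oversample the minority class (assumed labeled 1 here) until it reaches
# 80% of the majority class's size; `k` is SMOTE's nearest-neighbor count.
Xover, yover = smote(X, y; k=5, ratios=Dict(1 => 0.8), rng=42)
```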
+
The set of resampling techniques implemented in either
Imbalance.jl or MLJBalancing.jl is shown in the table below. Note that
although no combination resampling techniques are explicitly listed,
they are easy to form using the BalancedModel wrapper found in
MLJBalancing.jl, which can chain an arbitrary number of resamplers in
sequence; a sketch follows the table.
+
Resampling techniques implemented in Imbalance.jl and MLJBalancing.jl:

| Technique                              | Type          | Supported Data Types      |
|----------------------------------------|---------------|---------------------------|
| BalancedBaggingClassifier              | Ensemble      | Continuous and/or nominal |
| Borderline SMOTE1                      | Oversampling  | Continuous                |
| Cluster Undersampler                   | Undersampling | Continuous                |
| Edited Nearest Neighbors Undersampler  | Undersampling | Continuous                |
| Random Oversampler                     | Oversampling  | Continuous and/or nominal |
| Random Undersampler                    | Undersampling | Continuous and/or nominal |
| Random Walk Oversampler                | Oversampling  | Continuous and/or nominal |
| ROSE                                   | Oversampling  | Continuous                |
| SMOTE                                  | Oversampling  | Continuous                |
| SMOTE-N                                | Oversampling  | Nominal                   |
| SMOTE-NC                               | Oversampling  | Continuous and nominal    |
| Tomek Links Undersampler               | Undersampling | Continuous                |
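As noted above the table, a SMOTETomek-style combination can be
assembled by chaining an oversampler and an undersampler inside
BalancedModel. The following is a hedged sketch only: the
balancer1/balancer2 keyword names and the @load model names follow our
reading of the MLJBalancing.jl and Imbalance.jl documentation, and
LogisticClassifier from MLJLinearModels is an arbitrary stand-in
classifier:

```julia
using MLJ, MLJBalancing, Imbalance

SMOTE              = @load SMOTE pkg=Imbalance verbosity=0
TomekUndersampler  = @load TomekUndersampler pkg=Imbalance verbosity=0
LogisticClassifier = @load LogisticClassifier pkg=MLJLinearModels verbosity=0

# Resamplers are applied in sequence to the training data only; the
# resamplers and the wrapped classifier behave as one unified MLJ model.
balanced = BalancedModel(model=LogisticClassifier(),
                         balancer1=SMOTE(k=5, rng=42),
                         balancer2=TomekUndersampler())
mach = machine(balanced, X, y)
fit!(mach)
```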
+
+ Imbalance.jl Design Principles
+
The toolbox implementation follows a specific set of design
+ principles in terms of the implemented techniques, interface
+ support, developer experience and testing, and user experience.
+
+ Implemented Techniques
+
+
+
Should support all four major types of resampling
+ approaches (oversampling, undersampling, combination,
+ ensemble)
+
+
+
Should be generally compatible with multi-class
+ settings
+
+
+
Should offer solutions to heterogeneous data settings
+ (continuous and nominal data)
+
+
+
When possible, preference should be given to techniques
+ that are more common in the literature or industry
+
+
+
Methods implemented in the Imbalance.jl
+ toolbox indeed meet all aforementioned design principles for the
implemented techniques. The one-vs-rest scheme proposed in
(Fernández et al., 2013) was used to generalize binary techniques to
the multi-class setting when needed.
+
+
+ Interface Support
+
+
+
Should support both matrix and table type inputs
+
+
+
Target variable may or may not be given as a separate
+ column
+
+
+
Should expose a pure functional implementation, but also
+ support popular Julia machine learning interfaces
+
+
+
Should be possible to wrap an arbitrary number of resampler
+ models with a classification model to behave as a unified
+ model
+
+
+
Methods implemented in the Imbalance.jl
toolbox meet all the interface design principles above. In
particular, the toolbox implements the MLJ
(Blaom et al., 2020) and TableTransforms
interfaces for each method. BalancedModel
+ from MLJBalancing.jl also allows fusing an
+ arbitrary number of resampling models and a classifier together to
+ behave as one unified model.
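For illustration, a hedged sketch of the MLJ route, assuming SMOTE is
exposed as an MLJ model by Imbalance.jl and behaves as an MLJ static
transformer (per our reading of the documentation):

```julia
using MLJ, Imbalance

SMOTE = @load SMOTE pkg=Imbalance verbosity=0

# MLJ treats resamplers as static transformers: the machine binds no
# data at construction and the inputs are passed to `transform` directly.
oversampler = SMOTE(k=5, rng=42)
mach = machine(oversampler)
Xover, yover = transform(mach, X, y)
```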
+
+
+ Developer Experience and Testing
+
+
+
There should exist a developer guide to encourage and guide
+ contribution
+
+
+
Functions should be implemented in smaller units to aid in
+ testing
+
+
+
Testing coverage should be maximized; even the most basic
+ functions should be tested
+
+
+
Features commonly used by multiple resampling techniques
+ should be implemented in a single function and reused
+
+
+
Should document all functions, including internal ones
+
+
+
Comments should be included to justify or simplify written
+ implementations when needed
+
+
+
This set of design principles is also satisfied by
Imbalance.jl. Implemented techniques are tested through the smaller
units that compose them. Aside from that, end-to-end tests are
performed for each technique, either by checking properties and
characteristics of the technique or by using the
+ imbalanced-learn toolbox
+ (Lemaître
+ et al., 2016) from Python and comparing outputs.
+
+
+ User Experience
+
+
+
Functional documentation should be comprehensive and
+ clear
+
+
+
Examples (with shown output) that work after copy-pasting
+ should accompany each method
+
+
+
An illustrative visual example that presents a plot or
+ animation should preferably accompany each method
+
+
+
A practical example that uses the method with real data
+ should preferably accompany each method
+
+
+
If an implemented method lacks an online explanation, an
+ article that explains the method after it is implemented
+ should be preferably written
+
+
+
The Imbalance.jl documentation indeed
+ satisfies this set of design principles. Methods are each
+ associated with an example that can be copy-pasted, a visual
example that demonstrates the operation of the technique, and, where
applicable, an example that applies it to a real-world dataset to
improve the performance of a classification model.
+
+
+
+ Author Contributions
+
Design: E. Wisam, A. Blaom. Implementation, tests and
+ documentation: E. Wisam. Code and documentation review: A. Blaom.
+ The authors would like to acknowledge the financial support provided
+ by the Google Summer of Code program, which made this project
+ possible.
+
+
+
+
+
+
+
+
Bezanson, Jeff
Edelman, Alan
Karpinski, Stefan
Shah, Viral B.
+
+ Julia: A fresh approach to numerical computing
+
+ Society for Industrial & Applied Mathematics (SIAM)
2017-01
+ 59
+ 1
+ https://doi.org/10.1137%2F141000671
+ 10.1137/141000671
+ 65
+ 98
+
+
+
+
+
Cunningham, Pádraig
Cord, Matthieu
Delany, Sarah Jane
+
+ Supervised learning
+
+
Cord, Matthieu
Cunningham, Pádraig
+
+ Springer Berlin Heidelberg
+ Berlin, Heidelberg
+ 2008
+ 978-3-540-75171-7
+ https://doi.org/10.1007/978-3-540-75171-7_2
+ 10.1007/978-3-540-75171-7_2
+ 21
+ 49
+
+
+
+
+
Ali, Aida
Shamsuddin, Siti Mariyam Hj.
Ralescu, Anca L.
+
+ Classification with class imbalance problem: A review
+
+ 2015
+ https://api.semanticscholar.org/CorpusID:26644563
+
+
+
+
+
Zeng, Min
Zou, Beiji
Wei, Faran
Liu, Xiyao
Wang, Lei
+
+ Effective prediction of three common diseases by combining SMOTE with tomek links technique for imbalanced medical data
+
+ 2016
+ https://api.semanticscholar.org/CorpusID:25184489
+ 10.1109/ICOACS.2016.7563084
+ 225
+ 228
+
+
+
+
+
Liu, Xu-Ying
Wu, Jianxin
Zhou, Zhi-Hua
+
+ Exploratory undersampling for class-imbalance learning
+
+ 2009
+ 39
+ https://api.semanticscholar.org/CorpusID:62808464
+ 10.1109/TSMCB.2008.2007853
+ 539
+ 550
+
+
+
+
+
Picek, Stjepan
Heuser, Annelie
Jović, Alan
Bhasin, Shivam
Regazzoni, Francesco
+
+ The curse of class imbalance and conflicting metrics with machine learning for side-channel evaluations
+
+ 2018
+ 2019
+ https://api.semanticscholar.org/CorpusID:44136202
+ 10.13154/tches.v2019.i1.209-237
+ 209
+ 237
+
+
+
+
+
Kubát, Miroslav
Matwin, Stan
+
+ Addressing the curse of imbalanced training sets: One-sided selection
+
+ 1997
+ https://api.semanticscholar.org/CorpusID:18370956
+
+
+
+
+
Chawla, N.
Bowyer, K.
Hall, Lawrence O.
Kegelmeyer, W. Philip
+
+ SMOTE: Synthetic minority over-sampling technique
+
+ 2002
+ abs/1106.1813
+ https://api.semanticscholar.org/CorpusID:1554582
+ 10.1613/jair.953
+
+
+
+
+
Han, Hui
Wang, Wenyuan
Mao, Binghuan
+
+ Borderline-SMOTE: A new over-sampling method in imbalanced data sets learning
+
+ 2005
+ https://api.semanticscholar.org/CorpusID:12126950
+ 10.1007/11538059_91
+
+
+
+
+
Zhang, Huaxiang
Li, Mingfang
+
+ RWO-sampling: A random walk over-sampling approach to imbalanced data classification
+
+ 2014
+ 20
+ https://api.semanticscholar.org/CorpusID:205432428
+ 10.1016/j.inffus.2013.12.003
+ 99
+ 116
+
+
+
+
+
Menardi, Giovanna
Torelli, Nicola
+
+ Training and assessing classification rules with imbalanced data
+
+ 2012
+ 28
+ https://api.semanticscholar.org/CorpusID:18164904
+ 10.1007/s10618-012-0295-5
+ 92
+ 122
+
+
+
+
+
Lin, Wei-Chao
Tsai, Chih-Fong
Hu, Ya-Han
Jhang, Jing-Shang
+
+ Clustering-based undersampling in class-imbalanced data
+
+ 2016
+ 409
+ https://api.semanticscholar.org/CorpusID:424467
+ 10.1016/j.ins.2017.05.008
+ 17
+ 26
+
+
+
+
+
Hart, Peter E.
+
+ The condensed nearest neighbor rule (corresp.)
+
+ 1968
+ 14
+ https://api.semanticscholar.org/CorpusID:206729609
+ 10.1109/TIT.1968.1054155
+ 515
+ 516
+
+
+
+
+
Lemaître, Guillaume
Nogueira, Fernando
Aridas, Christos K.
+
+ Imbalanced-learn: A Python toolbox to tackle the curse of imbalanced datasets in machine learning
+
+ 2016
+ abs/1609.06570
+ https://api.semanticscholar.org/CorpusID:1426815
+
+
+
+
+
Kovács, György
+
+ Smote-variants: A Python implementation of 85 minority oversampling techniques
+
+ 2019
+ 366
+ 0925-2312
+ https://www.sciencedirect.com/science/article/pii/S0925231219311622
+ 10.1016/j.neucom.2019.06.100
+ 352
+ 354
+
+
+
+
+
Tuychiev, Bekhruz
+
+ The rise of Julia
+ 2023
+ https://www.datacamp.com/blog/the-rise-of-julia-is-it-worth-learning-in-2022
+
+
+
+
+
Fernández, Alberto
López, Victoria
Galar, Mikel
Jesús, María José del
Herrera, Francisco
+
+ Analysing the classification of imbalanced data-sets with multiple classes: Binarization techniques and ad-hoc approaches
+
+ 2013
+ 42
+ https://api.semanticscholar.org/CorpusID:131286
+ 10.1016/J.KNOSYS.2013.01.018
+ 97
+ 110
+
+
+
+
+
Blaom, Anthony D.
Király, Franz J.
Lienart, Thibaut
Simillides, Yiannis
Arenas, Diego
Vollmer, Sebastian J.
+
+ MLJ: A julia package for composable machine learning
+
+ 2020
+ 5
+ https://api.semanticscholar.org/CorpusID:220768685
+ 10.21105/joss.02704
+ 2704
+
+
+
+
+
+
diff --git a/joss.06310/10.21105.joss.06310.pdf b/joss.06310/10.21105.joss.06310.pdf
new file mode 100644
index 0000000000..3776bcdee6
Binary files /dev/null and b/joss.06310/10.21105.joss.06310.pdf differ