Not OP. This question is being reposted to preserve technical content removed from elsewhere. Feel free to add your own answers/discussion.

Original question:

I got a data set from high-performance liquid chromatography (HPLC). Because HPLC is expensive, we only have about 39 data points. Each data point has 9 dimensions, representing the concentrations of 9 different substances. I tried several different networks, and the accuracy never got above 50% (we have four classes). However, KNN reaches an accuracy of more than 90%. I remember hearing that neural networks are not good on small data sets. Is this the reason? I have not tried SVM or other traditional machine learning models yet. Should I try them, and if so, which ones?

  • ShadowAetherOPM

    Original answer:

    Neural networks can require a lot of data to train. For smaller datasets, KNN or an SVM can be a better choice, especially if the classification boundary does not need to be very complex, or if each class is tightly clustered and far away from the other classes. Also keep in mind the Bayes error of your problem: the lowest error rate that any classifier could achieve on the data you want to generalize to. It is determined by how separable the classes are and by the measurement noise in your data.
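    With only about 39 samples, a single train/test split gives a very noisy accuracy estimate, so leave-one-out cross-validation is one reasonable way to compare simple classifiers. The sketch below is only an illustration under stated assumptions: the data is synthetic (one well-separated Gaussian blob per class, standing in for the real HPLC measurements, which are not available here), and the helper names `make_synthetic_data`, `standardize`, `knn_predict`, and `loo_accuracy` are hypothetical. It uses only the Python standard library.

```python
import math
import random
from collections import Counter

def make_synthetic_data(n_samples=39, n_features=9, n_classes=4, seed=0):
    """Hypothetical stand-in for the HPLC data: one Gaussian blob per class."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % n_classes
        center = 10.0 * label  # assumption: classes are well separated
        X.append([rng.gauss(center, 1.0) for _ in range(n_features)])
        y.append(label)
    return X, y

def standardize(train, point):
    """Scale features using the training fold's mean and std only (no leakage)."""
    n = len(train)
    cols = list(zip(*train))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) or 1.0
            for c, m in zip(cols, means)]

    def scale(row):
        return [(v - m) / s for v, m, s in zip(row, means, stds)]

    return [scale(r) for r in train], scale(point)

def knn_predict(train_X, train_y, point, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    dists = sorted((math.dist(row, point), label)
                   for row, label in zip(train_X, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

def loo_accuracy(X, y, k=3):
    """Leave-one-out: hold out each sample once and predict it from the rest."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        train_X, point = standardize(train_X, X[i])
        correct += knn_predict(train_X, train_y, point, k) == y[i]
    return correct / len(X)

X, y = make_synthetic_data()
print(f"leave-one-out KNN accuracy: {loo_accuracy(X, y):.2f}")
```

    On real data you would swap `make_synthetic_data` for your own measurements and run the same loop for each candidate model (KNN with a few values of `k`, an SVM, and so on), so that all of them are scored on exactly the same held-out points.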

  • omegastick@lemmy.ml

    "I remember hearing that neural networks are not good on small data sets."

    That’s almost definitely it. Neural networks are good for high-dimensional problems with lots of available training data.
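    One rough way to see the mismatch is to count trainable parameters against the number of samples. The snippet below is a back-of-the-envelope sketch, not the network from the question: the layer sizes (9 inputs, one hidden layer of 16 units, 4 outputs) are a hypothetical example, and `mlp_param_count` is an illustrative helper.

```python
def mlp_param_count(layer_sizes):
    """Weights plus biases of a fully connected network with these layer widths."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

n_samples = 39        # from the question
layers = [9, 16, 4]   # hypothetical: 9 inputs, 16 hidden units, 4 classes
print(mlp_param_count(layers), "parameters vs", n_samples, "samples")
```

    Even this small network has 228 parameters, several times the number of data points, which is one reason such models can be hard to fit reliably on a data set this size.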